Tiny Talk gives you access to models from multiple model providers (OpenAI, Anthropic, Google, Meta, xAI, DeepSeek, Mistral, Cohere, and Qwen) served through different inference providers (OpenAI, OpenRouter, Groq, and Azure). Each model has different strengths, speeds, and credit costs.

How credits work

On plans that include AI credits (Basic AI, Standard AI, Pro AI), each agent response consumes credits based on the model used. Credits reset monthly.
| Plan | Monthly Credits |
| --- | --- |
| Basic AI | 2,000 |
| Standard AI | 10,000 |
| Pro AI | 40,000 |
Most models cost 1 credit per response. Mid-tier models (GPT-4o, Claude Sonnet, Gemini Pro) cost 2 credits. Higher-tier models (GPT-4, o1) cost 5 credits, and top-tier models (Claude Opus 4, Claude 3 Opus) cost 10 credits.
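
To estimate how far a plan's credits go, divide the monthly allowance by the per-response cost of your chosen model. A minimal sketch, using the plan allowances and tier costs from the tables above (one representative model per tier):

```python
# Monthly credit allowances per plan (from the plan table above).
PLAN_CREDITS = {"Basic AI": 2_000, "Standard AI": 10_000, "Pro AI": 40_000}

# Credits consumed per agent response, one example model per tier.
MODEL_COST = {
    "GPT-4o Mini": 1,    # most models
    "GPT-4o": 2,         # mid-tier
    "o1": 5,             # higher-tier
    "Claude Opus 4": 10, # top-tier
}

def responses_per_month(plan: str, model: str) -> int:
    """How many agent responses the plan's credits cover for this model."""
    return PLAN_CREDITS[plan] // MODEL_COST[model]

print(responses_per_month("Standard AI", "GPT-4o"))      # 10,000 / 2 = 5,000
print(responses_per_month("Basic AI", "Claude Opus 4"))  # 2,000 / 10 = 200
```

So a Standard AI plan covers 5,000 GPT-4o responses per month, while a Basic AI plan covers only 200 Claude Opus 4 responses.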

Bring Your Own Key (BYOK)

On legacy plans (Basic, Team, Enterprise) and some other plans, you can use your own API key instead of credits. Go to Integrations → Hub and enter your inference provider API key. The agent will use your key for all API calls, with no credit consumption on Tiny Talk’s side. After entering a key, click Verify to test it. Tiny Talk will check the key and display the models available for it. Make sure the model you want to use for your agent appears in this list.
An OpenAI key is always required on BYOK plans — Tiny Talk uses OpenAI’s embedding model for knowledge base indexing and search, regardless of which model you choose for agent responses.
Using your own OpenAI key? See OpenAI’s pricing page for per-token costs of each model.
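
With BYOK you pay your provider per token rather than in Tiny Talk credits, so per-response cost depends on prompt and completion length. A rough sketch of the arithmetic; the prices below are placeholders, not OpenAI's actual rates, so substitute the current per-million-token figures from the pricing page:

```python
# PLACEHOLDER prices -- check OpenAI's pricing page for the actual
# per-million-token rates of the model you use.
INPUT_PRICE_PER_M = 2.50    # USD per 1M input tokens (assumed)
OUTPUT_PRICE_PER_M = 10.00  # USD per 1M output tokens (assumed)

def response_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Approximate USD cost of one agent response under BYOK."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a response with 1,500 prompt tokens and 400 completion tokens:
print(response_cost_usd(1_500, 400))
```

Remember that knowledge base indexing also consumes tokens on your OpenAI key via the embedding model, on top of agent responses.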

Available models

OpenAI

| Model | Credit Cost | Features |
| --- | --- | --- |
| GPT-3.5 Turbo | 1 | Fast, affordable general-purpose model |
| GPT-4 | 5 | High-intelligence model for complex tasks |
| GPT-4 Turbo | 5 | Faster, cheaper GPT-4 with vision support |
| GPT-4o | 2 | Versatile flagship model with vision |
| GPT-4o Mini | 1 | Fast, affordable with vision support |
| GPT-4.1 | 2 | Flagship for complex tasks |
| GPT-4.1 Mini | 1 | Balanced intelligence, speed, and cost |
| GPT-4.1 Nano | 1 | Fastest, most cost-effective GPT-4.1 |
| GPT-5 | 1 | Flagship for coding, reasoning, and agentic tasks |
| GPT-5 Mini | 1 | Faster, cost-efficient version of GPT-5 |
| GPT-5 Nano | 1 | Fastest, cheapest GPT-5 variant |
| GPT-5.1 | 2 | Advanced reasoning with vision |
| GPT-5.2 | 2 | Advanced reasoning with vision |
| GPT-5.2 Chat | 1 | Cost-efficient reasoning with vision |
| o1 | 5 | Reasoning model with chain-of-thought |
| o1 Mini | 1 | Fast reasoning model |
| o3-mini | 1 | Fast reasoning model |

Anthropic (via OpenRouter)

| Model | Credit Cost | Features |
| --- | --- | --- |
| Claude Opus 4 | 10 | Top coding model with sustained performance |
| Claude Sonnet 4.6 | 2 | Latest Sonnet with enhanced coding and reasoning |
| Claude Sonnet 4 | 2 | Enhanced coding and reasoning with precision |
| Claude 3.7 Sonnet | 2 | Improved reasoning and problem-solving |
| Claude 3.5 Sonnet | 2 | Great at coding, data science, visual processing |
| Claude 3 Opus | 10 | Powerful model for highly complex tasks |
| Claude 3 Haiku | 1 | Fastest Anthropic model, near-instant responses |

Google (via OpenRouter)

| Model | Credit Cost | Features |
| --- | --- | --- |
| Gemini 3.1 Pro Preview | 2 | Latest flagship with reasoning, 1M token context |
| Gemini 3 Pro Preview | 2 | Flagship frontier model, 1M token context |
| Gemini 3 Flash Preview | 1 | Optimized for speed, 1M token context |
| Gemini 2.5 Pro | 2 | Advanced reasoning, coding, and math |
| Gemini 2.5 Flash | 1 | Fast workhorse for reasoning tasks |
| Gemini 2.0 Flash | 1 | Fast multimodal understanding |
| Gemini 1.5 Pro | 1 | Mid-size model for wide-range reasoning |
| Gemini Flash 1.5 | 1 | Good at classification and summarization |
| Gemma 3 27B | 1 | Multilingual, 140+ languages |

Meta (via OpenRouter)

| Model | Credit Cost | Features |
| --- | --- | --- |
| Llama 4 Maverick | 1 | Multimodal, 12 languages |
| Llama 4 Scout | 1 | MoE model for multilingual tasks |
| Llama 3.3 70B Instruct | 1 | Multilingual dialogue, 8 languages |
| Llama 3.1 8B Instruct | 1 | Lightweight, low-latency |

xAI (via OpenRouter)

| Model | Credit Cost | Features |
| --- | --- | --- |
| Grok 4.1 Fast | 1 | Fast reasoning model, 256k context |
| Grok 4 | 2 | Reasoning model, 256k context |
| Grok 3 Beta | 2 | Enterprise use cases, deep domain knowledge |
| Grok 3 Mini Beta | 1 | Lightweight thinking model |

Other model providers (via OpenRouter)

| Model | Credit Cost | Model Provider |
| --- | --- | --- |
| DeepSeek-V3.2 | 1 | DeepSeek |
| DeepSeek-R1 | 1 | DeepSeek |
| DeepSeek V3 0324 | 1 | DeepSeek |
| Mistral Nemo | 1 | MistralAI |
| Mistral Small 3 | 1 | MistralAI |
| Command R | 1 | Cohere |
| Qwen3 235B A22B | 1 | Qwen |
| Qwen 2.5 | 1 | Qwen |

Groq (inference provider)

| Model | Credit Cost | Features |
| --- | --- | --- |
| Llama 3.3 70B | 1 | Ultra-fast inference on Groq hardware |
| Llama 3.1 8B Instant | 1 | Instant responses, low latency |
| Llama 4 Maverick | 1 | ~600 tokens/second on Groq |
| GPT-OSS 120B | 1 | Large reasoning model |
| GPT-OSS 20B | 1 | Smaller reasoning model |

Azure OpenAI (inference provider, EU data residency)

Available on Pro plan only. These models run on Azure’s European (Germany) infrastructure for data residency compliance.
| Model | Credit Cost | Features |
| --- | --- | --- |
| GPT-4o (Azure EU/DE) | 2 | European data residency |
| GPT-4o Mini (Azure EU/DE) | 1 | European data residency |
Need a different model on Azure EU? Pro plan users can contact support to request provisioning of additional models.

Selecting a model

Go to your agent’s Settings and choose a model from the dropdown. Consider:
  • Cost — 1-credit models are most efficient for high-volume agents
  • Quality — Premium models (GPT-5, Claude Sonnet/Opus) produce better responses for complex queries
  • Speed — Groq models offer the fastest inference; Mini/Nano variants are faster than full models
  • Data residency — Use Azure EU models if you need European data processing (Pro plan)
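
The trade-offs above can be sketched as a simple lookup. The mapping below is illustrative, drawn from the tables in this article rather than an official Tiny Talk recommendation, and the default fallback is an assumption:

```python
# Illustrative mapping of selection priorities to example models
# from the tables above -- suggestions, not official recommendations.
MODEL_BY_PRIORITY = {
    "cost": "GPT-4o Mini",               # 1-credit model for high volume
    "quality": "Claude Opus 4",          # premium model for complex queries
    "speed": "Llama 3.1 8B Instant",     # Groq-hosted, lowest latency
    "data_residency": "GPT-4o (Azure EU/DE)",  # Pro plan, EU processing
}

def suggest_model(priority: str) -> str:
    """Pick an example model for a priority; falls back to a
    mid-tier all-rounder (assumed default)."""
    return MODEL_BY_PRIORITY.get(priority, "GPT-4o")

print(suggest_model("speed"))  # Llama 3.1 8B Instant
```

In practice you can change the model in your agent's Settings at any time, so it is cheap to start with a 1-credit model and move up a tier only if response quality falls short.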