Documentation Index
Fetch the complete documentation index at: https://tinytalk.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Tiny Talk gives you access to models from multiple model providers (OpenAI, Anthropic, Google, Meta, xAI, DeepSeek, Mistral, Cohere, and Qwen) served through different inference providers (OpenAI, OpenRouter, Groq, and Azure). Each model has different strengths, speeds, and credit costs.
How credits work
On plans that include AI credits (Basic AI, Standard AI, Pro AI), each agent response consumes credits based on the model used. Credits reset monthly.
| Plan | Monthly Credits |
|---|---|
| Basic AI | 2,000 |
| Standard AI | 10,000 |
| Pro AI | 40,000 |
Most models cost 1 credit per response. Mid-tier models (GPT-4o, Claude Sonnet, Gemini Pro) cost 2 credits. Higher-tier models (GPT-4, o1) cost 5 credits, and top-tier models (Claude Opus 4) cost 10 credits.
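To see how the per-response costs add up against a plan's monthly allowance, here is a small illustrative calculation (the model-to-cost mapping below is a hand-picked subset of the tables in this doc, not an exhaustive list):

```python
# Illustrative only: per-response credit costs, taken from the model tables
# in this doc (subset for the example).
CREDIT_COST = {
    "GPT-4o Mini": 1,
    "GPT-4o": 2,
    "o1": 5,
    "Claude Opus 4": 10,
}

def monthly_credits(responses_per_model):
    """Sum the credits a month of agent responses would consume."""
    return sum(CREDIT_COST[model] * count
               for model, count in responses_per_model.items())

# e.g. 1,500 responses on a 1-credit model plus 200 on GPT-4o:
usage = {"GPT-4o Mini": 1500, "GPT-4o": 200}
print(monthly_credits(usage))  # 1900 -- fits within Basic AI's 2,000 credits
```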
Bring Your Own Key (BYOK)
On legacy plans (Basic, Team, Enterprise) and some other plans, you can use your own API key instead of credits. Go to Integrations → Hub and enter your inference provider API key. The agent will use your key for all API calls, with no credit consumption on Tiny Talk’s side.
After entering a key, click Verify to test it. Tiny Talk will check the key and display the models available for it. Make sure the model you want to use for your agent appears in this list.
An OpenAI key is always required on BYOK plans — Tiny Talk uses OpenAI’s embedding model for knowledge base indexing and search, regardless of which model you choose for agent responses.
Available models
OpenAI
| Model | Credit Cost | Reasoning | Features |
|---|---|---|---|
| GPT-3.5 Turbo | 1 | — | Fast, affordable general-purpose model |
| GPT-4 | 5 | — | High-intelligence model for complex tasks |
| GPT-4 Turbo | 5 | — | Faster, cheaper GPT-4 with vision support |
| GPT-4o | 2 | — | Versatile flagship model with vision |
| GPT-4o Mini | 1 | — | Fast, affordable with vision support |
| GPT-4.1 | 2 | — | Flagship for complex tasks |
| GPT-4.1 Mini | 1 | — | Balanced intelligence, speed, and cost |
| GPT-4.1 Nano | 1 | — | Fastest, most cost-effective GPT-4.1 |
| GPT-5 | 1 | Yes | Flagship for coding, reasoning, and agentic tasks |
| GPT-5 Mini | 1 | Yes | Faster, cost-efficient version of GPT-5 |
| GPT-5 Nano | 1 | Yes | Fastest, cheapest GPT-5 variant |
| GPT-5.1 | 2 | Yes | Advanced reasoning with vision |
| GPT-5.2 | 2 | Yes | Advanced reasoning with vision |
| GPT-5.2 Chat | 1 | Yes (medium only) | Cost-efficient reasoning with vision |
| o1 | 5 | Yes | Reasoning model with chain-of-thought |
| o3-mini | 1 | Yes | Fast reasoning model |
Anthropic (via OpenRouter)
| Model | Credit Cost | Reasoning | Features |
|---|---|---|---|
| Claude Opus 4 | 10 | — | Top coding model with sustained performance |
| Claude Sonnet 4.6 | 2 | — | Latest Sonnet with enhanced coding and reasoning |
| Claude Sonnet 4 | 2 | — | Enhanced coding and reasoning with precision |
| Claude 3.7 Sonnet | 2 | — | Improved reasoning and problem-solving |
| Claude 3 Haiku | 1 | — | Fastest Anthropic model, near-instant responses |
Google (via OpenRouter)
| Model | Credit Cost | Reasoning | Features |
|---|---|---|---|
| Gemini 3.1 Pro Preview | 2 | Yes | Latest flagship with reasoning, 1M token context |
| Gemini 3 Flash Preview | 1 | Yes | Optimized for speed, 1M token context |
| Gemini 2.5 Pro | 2 | — | Advanced reasoning, coding, and math |
| Gemini 2.5 Flash | 1 | — | Fast workhorse for reasoning tasks |
| Gemma 3 27B | 1 | — | Multilingual, 140+ languages |
Meta (via OpenRouter)
| Model | Credit Cost | Reasoning | Features |
|---|---|---|---|
| Llama 4 Maverick | 1 | — | Multimodal, 12 languages |
| Llama 4 Scout | 1 | — | MoE model for multilingual tasks |
| Llama 3.3 70B Instruct | 1 | — | Multilingual dialogue, 8 languages |
| Llama 3.1 8B Instruct | 1 | — | Lightweight, low-latency |
xAI (via OpenRouter)
| Model | Credit Cost | Reasoning | Features |
|---|---|---|---|
| Grok 4.1 Fast | 1 | Yes | Fast reasoning model, 256k context |
| Grok 4 | 2 | Yes | Latest reasoning model, 256k context |
| Grok 3 Beta | 2 | — | Enterprise use cases, deep domain knowledge |
| Grok 3 Mini Beta | 1 | — | Lightweight thinking model |
Other model providers (via OpenRouter)
| Model | Credit Cost | Reasoning | Model Provider |
|---|---|---|---|
| DeepSeek-V3.2 | 1 | Yes | DeepSeek |
| DeepSeek-R1 | 1 | Yes | DeepSeek |
| DeepSeek V3 0324 | 1 | — | DeepSeek |
| Mistral Nemo | 1 | — | MistralAI |
| Mistral Small 3 | 1 | — | MistralAI |
| Command R | 1 | — | Cohere |
| Qwen3 235B A22B | 1 | — | Qwen |
| Qwen 2.5 | 1 | — | Qwen |
Groq (inference provider)
| Model | Credit Cost | Reasoning | Features |
|---|---|---|---|
| Llama 3.3 70B | 1 | — | Ultra-fast inference on Groq hardware |
| Llama 3.1 8B Instant | 1 | — | Instant responses, low latency |
| GPT-OSS 120B | 1 | Yes (low/medium/high) | Large reasoning model |
| GPT-OSS 20B | 1 | Yes (low/medium/high) | Smaller reasoning model |
Azure OpenAI (inference provider, EU data residency)
Available on Pro plan only. These models run on Azure’s European (Germany) infrastructure for data residency compliance.
| Model | Credit Cost | Reasoning | Features |
|---|---|---|---|
| GPT-4o (Azure EU/DE) | 2 | — | European data residency |
| GPT-4o Mini (Azure EU/DE) | 1 | — | European data residency |
Need a different model on Azure EU? Pro plan users can contact support to request provisioning of additional models.
Reasoning-capable models
Reasoning-capable models spend additional compute “thinking” before they answer, which improves quality on complex, multi-step questions in exchange for higher latency and credit cost. When you select one of these models, the Reasoning Effort and Reasoning Summary controls appear in your agent’s model configuration (and the temperature slider is hidden). See Reasoning configuration for details.
Models marked Yes in the Reasoning column of the tables above support these controls. They span several providers:
- OpenAI — o1, o3-mini, GPT-5, GPT-5 Mini, GPT-5 Nano, GPT-5.1, GPT-5.2, GPT-5.2 Chat
- Google — Gemini 3.1 Pro Preview, Gemini 3 Flash Preview
- xAI — Grok 4.1 Fast, Grok 4
- DeepSeek — DeepSeek-V3.2, DeepSeek-R1
- Groq — GPT-OSS 120B, GPT-OSS 20B
Available effort levels vary by model — for example, GPT-5.2 Chat only supports Medium, and GPT-OSS 120B/20B only support Low, Medium, and High. The dashboard only shows values the selected model accepts.
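The per-model effort restrictions can be pictured as a lookup table, which is roughly how the dashboard decides what to show. This is a hypothetical sketch, not Tiny Talk's actual implementation; the mapping below only encodes the examples stated above, plus an assumed full range for other reasoning models:

```python
# Hypothetical sketch of per-model effort availability. Only the
# GPT-5.2 Chat and GPT-OSS restrictions are documented; the GPT-5
# entry's full range is an assumption for illustration.
EFFORT_LEVELS = {
    "GPT-5.2 Chat": ["medium"],
    "GPT-OSS 120B": ["low", "medium", "high"],
    "GPT-OSS 20B": ["low", "medium", "high"],
    "GPT-5": ["low", "medium", "high"],  # assumed, not documented above
}

def allowed_efforts(model):
    """Return the effort values the dashboard would offer for a model."""
    return EFFORT_LEVELS.get(model, [])
```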
Selecting a model
Go to your agent’s Settings and choose a model from the dropdown. Consider:
- Cost — 1-credit models are most efficient for high-volume agents
- Quality — Premium models (GPT-5, Claude Sonnet/Opus) produce better responses for complex queries
- Speed — Groq models offer the fastest inference; Mini/Nano variants are faster than full models
- Reasoning — Reasoning-capable models deliver better answers on complex questions at the cost of latency and extra credits
- Data residency — Use Azure EU models if you need European data processing (Pro plan)
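The criteria above can be condensed into a simple decision sketch. This helper is purely illustrative shorthand for the guidance in this doc, not a Tiny Talk feature, and the suggested models are examples from the tables above:

```python
# Hypothetical helper: map a single top priority to an example model
# from this doc's tables. Illustrative only.
def suggest_model(priority):
    suggestions = {
        "cost": "GPT-4o Mini",             # 1 credit, good for high volume
        "quality": "Claude Opus 4",        # premium, complex queries
        "speed": "Llama 3.1 8B Instant",   # Groq, lowest latency
        "reasoning": "GPT-5",              # reasoning-capable flagship
        "data_residency": "GPT-4o (Azure EU/DE)",  # EU processing, Pro plan
    }
    return suggestions.get(priority, "GPT-4o")  # versatile general default
```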