Agent Settings is where you define how your agent thinks, responds, and behaves. Access these settings by selecting an agent and clicking Settings in the sidebar. This page covers the System Prompt, AI model settings (temperature, match count, guardrail level, reasoning), Rate Limiting, and the Playground.

Documentation index
Fetch the complete documentation index at: https://tinytalk.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
System prompt
The system prompt is a set of instructions given to the AI model that shapes how your agent behaves and responds. It acts as a foundational directive — the model follows these instructions throughout every conversation. Use the system prompt to define:

- Personality and tone — formal, casual, friendly, concise
- Behavioral rules — what topics to cover, what to avoid, when to escalate
- Response language — which language(s) to respond in
- Constraints — word limits, formatting preferences, disclosure rules
Displaying images in responses
The chat widget renders Markdown images. You can instruct your agent to include images in its responses by adding image URLs in Markdown format to the system prompt. For images to render correctly, the URL must end with a supported extension (.png, .jpg, .jpeg, .gif, .webp).
Add a line like the following to your system prompt to display an image after each response:
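For example, an instruction like the one below tells the agent to close each answer with an image (the URL and alt text here are placeholders — substitute your own hosted image):

```markdown
After every answer, append this image on its own line:
![Thanks for chatting with us](https://example.com/images/footer.png)
```

Because the URL ends in `.png`, the widget renders it as an inline image rather than a link.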
Example system prompts
These examples follow prompting best practices: they define a clear role, set behavioral boundaries, specify tone and formatting, and include fallback instructions. Adapt them to your use case and Knowledge Base content.
- Customer support & e-commerce
- Internal knowledge base & HR
- Education & online courses
- Healthcare information & triage
- Real estate & property information
- Restaurant, travel & hospitality
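As a sketch of the structure these templates share — role, boundaries, tone, and a fallback — here is an illustrative customer-support prompt. The company name, word limit, and policies below are placeholders, not content from any of the templates above:

```text
You are the customer support assistant for Acme Outfitters, an online
clothing store.

- Answer only questions about Acme Outfitters products, orders, shipping,
  and returns, using the Knowledge Base as your source of truth.
- Keep answers under 120 words, in a friendly, professional tone.
- If the answer is not in the Knowledge Base, say so and offer to connect
  the customer with a human agent.
- Never invent order details, prices, or policies.
```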
Want to improve your prompting skills? These resources cover techniques and best practices for writing effective system prompts:
- Prompt Engineering Guide — comprehensive guide to prompting techniques
- DAIR.AI Prompt Engineering Guide — research-backed prompting methods and examples
AI model settings
These settings fine-tune how the AI model generates responses. You can find them below the system prompt in Settings.

Temperature
Temperature controls the randomness of the model’s output. Lower values produce more focused, deterministic responses. Higher values introduce more variety and creativity.

| Value | Behavior |
|---|---|
| 0 (default) | Most focused — the model produces nearly identical answers each time |
| 0.1–0.3 | Reliable and predictable with minor variation — ideal for support and knowledge base agents |
| 0.4–0.7 | A mix of precision and originality — good for conversational or exploratory use cases |
| 0.8–1.0 | Imaginative and diverse responses — better suited for brainstorming or creative writing |
Match count
Match count determines how many relevant passages from your Knowledge Base are included as context when the model generates a response. The default is 5. A higher match count gives the model more information to work with, which can improve answer accuracy for broad topics. However, too many matches may introduce irrelevant content that dilutes the response quality.

Both the match count and the system prompt consume tokens from the model’s context window. A high match count combined with a lengthy system prompt leaves less room for the conversation itself, which can cause the model to lose track of earlier messages. Balance these settings and experiment — a concise system prompt with a moderate match count often outperforms a large prompt with many matches.
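A back-of-the-envelope budget shows how these settings interact. All numbers below are illustrative assumptions (context window size, tokens per passage, and so on are not tinytalk.ai specifics):

```python
# Illustrative context-window budget. Every number here is an assumption
# for the sake of the example, not an actual tinytalk.ai value.
CONTEXT_WINDOW = 8192        # total tokens available to the model
system_prompt_tokens = 600   # a moderately long system prompt
match_count = 5              # the default match count
avg_passage_tokens = 400     # assumed size of one Knowledge Base passage
response_reserve = 1024      # tokens reserved for the model's answer

# Retrieved context grows linearly with match count.
retrieval_tokens = match_count * avg_passage_tokens

# Whatever is left is the budget for the conversation history itself.
conversation_budget = (CONTEXT_WINDOW - system_prompt_tokens
                       - retrieval_tokens - response_reserve)
print(conversation_budget)  # → 4568
```

Doubling the match count to 10 in this sketch would shrink the conversation budget from 4568 to 2568 tokens, which is why a large prompt plus many matches can crowd out earlier messages.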
Guardrail level
The guardrail level controls whether the agent restricts its answers to your Knowledge Base or can draw on its general knowledge. The default is High.

| Level | Behavior |
|---|---|
| High (default) | The agent only responds to queries within the context of its trained Knowledge Base and avoids off-topic responses |
| None | The agent responds freely to any query, leveraging its general knowledge even if the question falls outside its specific area of expertise |
Reasoning configuration
Some AI models — called reasoning-capable models — spend extra time “thinking” before they answer. When you select one of these models, two additional controls appear in Model Configuration. The temperature slider is hidden, since reasoning models don’t use it. Both settings are optional. Leave them unset to use the provider’s default.

Reasoning effort
Controls how much the model reasons before responding. Higher effort generally produces better answers on complex questions, but responses are slower and consume more credits.

| Value | Behavior |
|---|---|
| None | No reasoning — closest to a standard chat response |
| Minimal | Very light reasoning pass |
| Low | Short reasoning pass — good for simple lookups |
| Medium | Balanced reasoning for most use cases |
| High | Deeper reasoning for complex, multi-step questions |
| xHigh | Maximum reasoning — slowest and most expensive |
Reasoning summary
Controls whether the model returns a summary of its reasoning alongside the answer.

| Value | Behavior |
|---|---|
| Auto | The provider decides when to include a summary |
| Concise | Short summary of the reasoning steps |
| Detailed | Full breakdown of the reasoning steps |
Not every reasoning model exposes both controls, and some models restrict which effort levels are available. The dropdown shows only the values supported by the currently selected model.
Rate limiting
Rate limiting prevents individual users from sending too many messages in a short period, protecting your agent from spam and abuse. The limit is tracked per user and starts when they send their first message in a given window. To enable rate limiting, go to Settings → Rate Limits and toggle Enable Rate Limiting. Configure two values:

- Message — the maximum number of messages a user can send within the window
- Duration (minutes) — the time window in which messages are counted
Rate limit reached message
Customize the message users see when they hit the rate limit. This field supports per-language values when you have additional languages enabled.

Rate limits apply to each channel separately — activity on the web messenger does not affect WhatsApp rate limits. API requests authenticated with a valid API key are not rate limited.
Playground
The Playground is a live preview of your messenger where you can test your agent’s responses without embedding it on your website. Access it from Playground in the sidebar. Use the Playground to:

- Test system prompt changes and see how the agent responds
- Verify Knowledge Base content is being retrieved correctly
- Preview the messenger appearance (welcome messages, suggested messages, colors)