The Knowledge Base is where your agent finds answers. Each item you add — whether it’s a file, a web page, or a Notion page — becomes a resource. When a visitor asks a question, your agent searches across all resources using hybrid search (combining semantic similarity with keyword matching) to find the most relevant passages.

Adding resources

Navigate to Knowledge Base in the sidebar and open the Add Resource tab. You can add resources from multiple sources:

Document upload

Upload files directly from your computer:
Format                  Extension
PDF                     .pdf
Microsoft Word          .docx
Microsoft Excel         .xlsx
Microsoft PowerPoint    .pptx
CSV                     .csv
Markdown                .md
MDX                     .mdx
Plain text              .txt
Only text content is extracted. Images, charts, and scanned pages without selectable text are not processed.
Maximum file size depends on your plan: 25 MB (Free), 50 MB (Basic AI, Standard AI), or 100 MB (Pro AI).
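
The format and size rules above could be checked locally before uploading. The following is a minimal sketch, not part of Tiny Talk: the function name `can_upload` and the plan keys are hypothetical, and it assumes 1 MB means 1024 × 1024 bytes, which the text above does not specify.

```python
from pathlib import Path

# Extensions from the supported formats table above.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".pptx", ".csv", ".md", ".mdx", ".txt"}

# Per-plan limits from the text above; the dictionary keys are hypothetical labels.
MAX_FILE_SIZE_MB = {"free": 25, "basic_ai": 50, "standard_ai": 50, "pro_ai": 100}

# Assumption: the docs do not say whether MB means 10^6 or 2^20 bytes.
BYTES_PER_MB = 1024 * 1024


def can_upload(filename: str, size_bytes: int, plan: str) -> tuple[bool, str]:
    """Return (ok, reason) for a hypothetical pre-upload check."""
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return False, f"unsupported extension: {extension or '(none)'}"
    limit_mb = MAX_FILE_SIZE_MB[plan]
    if size_bytes > limit_mb * BYTES_PER_MB:
        return False, f"file exceeds the {limit_mb} MB limit for this plan"
    return True, "ok"
```

A check like this cannot detect scanned pages or password protection; it only covers the extension and the size limit.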

Website crawler

Enter a URL and Tiny Talk will crawl the page (and optionally follow links) to extract text content. You can scrape a single page, crawl an entire site recursively, or use a sitemap. See the full Website Crawler guide for configuration options, crawl modes, and troubleshooting.

Google Drive

Connect your Google Drive account to import documents:
  1. Go to Integrations → Hub → Google Drive and authorize access
  2. Return to Knowledge Base → Add Resource → Google Drive
  3. Browse and select files to import
In addition to the file formats listed above, Google Drive imports also support native Google Docs, Google Sheets, and Google Slides — no conversion needed.

Notion

Connect your Notion workspace to import pages:
  1. Go to Integrations → Hub → Notion and authorize access
  2. During authorization, Notion will ask you to select which pages Tiny Talk can access — only those pages will be available for import
  3. Return to Knowledge Base → Add Resource → Notion
  4. Select the pages to import
If you add new pages in Notion that you want to import later, disconnect the Notion integration and re-connect it. During the new authorization flow, expand the allowed page list to include the new pages.

How search works

When a visitor asks a question, your agent runs two searches in parallel across all resources:
  • Semantic search finds content that is conceptually similar to the question, even if the exact words don’t match. This is powered by vector embeddings.
  • Keyword search (BM25) finds content that contains the same words and phrases as the question. This catches exact matches that semantic search might rank lower.
Results from both searches are combined and deduplicated, with the most relevant passages included in the agent’s context. This hybrid approach means your agent can handle both precise lookups (“What is the refund policy?”) and broader conceptual questions (“How do returns work?”).
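
To make the "combined and deduplicated" step concrete, here is a minimal sketch using reciprocal rank fusion (RRF), a common way to merge ranked lists. This page does not say which fusion method Tiny Talk uses, so RRF, the constant `k = 60`, the function name, and the passage IDs are all assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60, top_n: int = 5) -> list[str]:
    """Merge several ranked lists of passage IDs into one deduplicated ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            # A passage found by both searches accumulates score from each list,
            # so it appears once in the output with a higher fused score.
            scores[passage_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    ordered = sorted(scores, key=lambda pid: (-scores[pid], pid))
    return ordered[:top_n]


# Hypothetical results for the question "What is the refund policy?"
semantic_hits = ["returns-overview", "refund-policy", "shipping-times"]
keyword_hits = ["refund-policy", "refund-form", "returns-overview"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

Passages returned by both searches ("refund-policy" and "returns-overview" here) rank above passages returned by only one, which is the behavior the hybrid approach is meant to give.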

Understanding characters

Every resource you add is measured in characters (the raw text length after extraction). This is effectively your Knowledge Base storage capacity — the more resources you add, the more characters are consumed. Delete a resource and the characters are freed up. Your plan has a total character limit across your account, and it does not reset monthly.
Plan           Character limit
Free           600,000
Basic AI       30,000,000
Standard AI    40,000,000
Pro AI         50,000,000
A typical web page is 5,000–15,000 characters. A 10-page PDF is roughly 30,000–50,000 characters.
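
As a rough planning aid, the limits and typical sizes above can be turned into a quick budget estimate. This is only a sketch: the function name is hypothetical, and the per-resource sizes are midpoints of the ranges quoted above, not measured values.

```python
# Plan limits from the table above.
PLAN_CHARACTER_LIMITS = {
    "Free": 600_000,
    "Basic AI": 30_000_000,
    "Standard AI": 40_000_000,
    "Pro AI": 50_000_000,
}


def remaining_characters(plan: str, resource_lengths: list[int]) -> int:
    """Characters left on a plan once the given resources are stored."""
    return PLAN_CHARACTER_LIMITS[plan] - sum(resource_lengths)


# Assumed sizes: 20 web pages at about 10,000 characters each
# and 5 ten-page PDFs at about 40,000 characters each.
used = [10_000] * 20 + [40_000] * 5
left = remaining_characters("Free", used)
print(f"{left:,} characters remaining")
```

With these assumed sizes, the resources consume 400,000 characters and leave 200,000 on the Free plan.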

Resource status

After you add a resource, it goes through several stages before it becomes searchable:
  1. Idle — Resource created, waiting for processing to begin
  2. Analyzing — Text is being extracted from the file
  3. Train (Processable) — Analysis complete, ready to be trained
  4. Pending — Queued for training after you click Train
  5. Training (Processing) — Content is being chunked and embedded as vectors
  6. Trained (Succeeded) — Resource is live and searchable by the agent
  7. Failed — An error occurred (check the error message and retry)
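
The stages above can be read as a small state machine. The sketch below is illustrative only: the enum, the helper names, and the assumption that any stage before Trained can move to Failed are not taken from Tiny Talk, and the retry path out of Failed is left out because the list above does not say which stage a retry returns to.

```python
from enum import Enum


class ResourceStatus(Enum):
    IDLE = "Idle"
    ANALYZING = "Analyzing"
    TRAIN = "Train"          # shown as Processable: waiting for you to click Train
    PENDING = "Pending"
    TRAINING = "Training"    # shown as Processing
    TRAINED = "Trained"      # shown as Succeeded
    FAILED = "Failed"


# Forward path in the order listed above.
NEXT_STATUS = {
    ResourceStatus.IDLE: ResourceStatus.ANALYZING,
    ResourceStatus.ANALYZING: ResourceStatus.TRAIN,
    ResourceStatus.TRAIN: ResourceStatus.PENDING,
    ResourceStatus.PENDING: ResourceStatus.TRAINING,
    ResourceStatus.TRAINING: ResourceStatus.TRAINED,
}


def can_transition(current: ResourceStatus, new: ResourceStatus) -> bool:
    """True if `new` is the next stage, or a failure from a stage still in progress."""
    if new is ResourceStatus.FAILED:
        # Assumption: an error can occur at any stage before Trained.
        return current not in (ResourceStatus.TRAINED, ResourceStatus.FAILED)
    return NEXT_STATUS.get(current) is new


def is_searchable(status: ResourceStatus) -> bool:
    """Only a trained resource is live and searchable by the agent."""
    return status is ResourceStatus.TRAINED


def needs_user_action(status: ResourceStatus) -> bool:
    """Train waits for a click; Failed waits for you to check the error and retry."""
    return status in (ResourceStatus.TRAIN, ResourceStatus.FAILED)
```

The practical point is that a resource stops at Train until you click Train, so a resource that never becomes searchable is often waiting at that stage rather than failing.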

Managing resources

From the Resources tab you can:
  • View all resources with their status, character count, and source
  • See total character usage and resource count
  • Delete resources you no longer need
  • Re-process failed resources
  • Filter and search across resources

File privacy

Uploaded files (PDF, Word, Excel, etc.) are stored privately and are not publicly accessible. Team members can view files from the dashboard via temporary signed links, but the files cannot be accessed by visitors or anyone outside your workspace. Website-crawled content is inherently public, since it comes from pages already available on the web.

Troubleshooting

If a file fails to upload or cannot be processed:
  • Ensure the file is under your plan’s size limit
  • The file must contain selectable text (not scanned images)
  • Password-protected files are not supported
Training requires a valid OpenAI API key with access to the text-embedding-ada-002 model. If training fails, the common causes are:
  • No API key added — If your plan requires a Bring Your Own Key (BYOK), you must add your OpenAI API key before training will work. Go to Integrations → Hub → OpenAI to add it.
  • Key is invalid or expired — Go to Integrations → Hub → OpenAI and click Verify to check for errors.
  • Insufficient credits or rate limit — Your OpenAI account may be out of credits or hitting rate limits. Check your OpenAI billing dashboard and review OpenAI’s error codes for details.
  • Missing model access — Your OpenAI key must have access to the text-embedding-ada-002 model. Verify this in your OpenAI account settings.
If the agent does not use a resource when answering:
  • Verify the resource status is Trained
  • The agent uses hybrid search (semantic + keyword), but it may not find a match if the visitor’s question doesn’t relate to the resource’s text
  • Try asking a question that directly relates to the resource to test
Website content is not re-crawled or synced automatically at the moment. Your agent only uses the content that was crawled and trained at the time you added the resource, so if your website content changes, you’ll need to re-crawl the URL and train again.
Your agent does not learn or improve from visitor conversations over time. It only uses the resources in your Knowledge Base to answer questions. To improve responses, add or refine your Knowledge Base resources.
At the moment, your agent cannot access external websites or URLs during a conversation. It only searches the resources you’ve added to the Knowledge Base. To include content from a website, use the Website Crawler to import it first.
Training time depends on the size and number of resources. A single document typically takes a few seconds to a minute. Large batches or website crawls with many pages may take several minutes. You can monitor progress from the Resources tab.