Moongraph

RAG & Retrieval

How the Agent finds context and generates cited answers.

This page explains how the Agent answers questions using your documents—what happens behind the scenes when you ask a question.

What is RAG?

RAG stands for Retrieval-Augmented Generation. Instead of relying only on what the AI model knows, RAG:

  1. Retrieves relevant information from your documents
  2. Provides that information as context to the AI
  3. Generates an answer grounded in your actual data

The result: answers based on your documents, with citations linking back to sources.
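The three steps above can be sketched as a single loop. This is a minimal illustration, not Moongraph's implementation; `search` and `llm_answer` are hypothetical stand-ins for the retrieval and generation components.

```python
# Minimal RAG loop. `search` and `llm_answer` are hypothetical callables
# standing in for the retrieval and generation components.
def answer(question: str, search, llm_answer) -> dict:
    # 1. Retrieve relevant chunks from your documents
    chunks = search(question)
    # 2. Provide that information as context to the AI
    context = "\n\n".join(c["text"] for c in chunks)
    # 3. Generate an answer grounded in that context
    text = llm_answer(question, context)
    # Keep chunk ids so the answer can cite its sources
    return {"answer": text, "sources": [c["id"] for c in chunks]}
```

The key property is the last line: the answer travels together with the ids of the chunks it was grounded in, which is what makes citations possible.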

The retrieval pipeline

When you ask the Agent a question:

  1. Query analysis — Your question is analyzed to determine what information is needed
  2. Tool selection — The Agent decides which tools to use (chunk search, graph queries, etc.)
  3. Retrieval — Relevant chunks and/or graph data are fetched
  4. Context assembly — Retrieved information is formatted as context
  5. Generation — The AI generates an answer using that context
  6. Citation linking — Claims are connected back to source chunks
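The six stages can be read as a straight-line data flow. The sketch below mirrors the numbered list; every function and field name in it is illustrative, not Moongraph's actual internals (the query analysis here is deliberately naive keyword splitting).

```python
# Illustrative six-stage pipeline. All names are hypothetical; query
# analysis is reduced to naive keyword splitting for brevity.
def run_pipeline(question, tools, generate):
    plan = {"question": question, "needs": question.lower().split()}   # 1. Query analysis
    selected = [t for t in tools if t["matches"](plan)]                # 2. Tool selection
    retrieved = [item for t in selected for item in t["run"](plan)]    # 3. Retrieval
    context = "\n".join(item["text"] for item in retrieved)            # 4. Context assembly
    answer = generate(question, context)                               # 5. Generation
    citations = [item["id"] for item in retrieved]                     # 6. Citation linking
    return {"answer": answer, "citations": citations}
```

Each tool exposes a `matches` predicate (should this tool run for this plan?) and a `run` function (fetch items), so chunk search and graph queries can plug into the same loop.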

What gets retrieved

The Agent can pull from multiple sources:

Document chunks — Text segments from your uploaded documents, found via semantic or lexical search

Graph entities — People, organizations, and other entities extracted into your knowledge graphs

Graph relationships — Connections between entities that help answer relationship questions

Document metadata — Titles, dates, and other document-level information
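The four source types can be pictured as simple record shapes. These dataclasses are illustrative only; the field names are assumptions, not Moongraph's schema.

```python
from dataclasses import dataclass

# Hypothetical record shapes for the four retrieval sources.
@dataclass
class Chunk:          # text segment from an uploaded document
    doc_id: str
    text: str

@dataclass
class Entity:         # person, organization, etc. from a knowledge graph
    name: str
    kind: str

@dataclass
class Relationship:   # connection between two entities
    source: str
    target: str
    label: str

@dataclass
class DocMeta:        # document-level information
    title: str
    date: str
```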

Search modes

The Agent's primary retrieval tool (retrieve_chunks) supports three modes:

Mode     | Best for                         | Example
Semantic | Conceptual queries               | "What are the key findings about oversight?"
Lexical  | Exact names, quotes, identifiers | "Project Blue Book", "Dr. Hynek"
Hybrid   | Mixed queries (default)          | "What did John Smith say about the budget?"

The Agent chooses modes automatically but can be directed by your phrasing.
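Hybrid mode is commonly implemented as a weighted blend of a semantic (vector) score and a lexical (keyword) score. The sketch below shows that idea with a toy keyword scorer; the weighting scheme and function names are assumptions for illustration, and `retrieve_chunks` may combine scores differently.

```python
# Toy lexical scorer: fraction of document words that match query terms.
def lexical_score(query: str, text: str) -> float:
    terms = set(query.lower().split())
    words = text.lower().split()
    return sum(w in terms for w in words) / max(len(words), 1)

# Hybrid blend: alpha=1.0 is purely semantic, alpha=0.0 purely lexical.
# The semantic score would normally come from embedding similarity.
def hybrid_score(semantic: float, lexical: float, alpha: float = 0.5) -> float:
    return alpha * semantic + (1 - alpha) * lexical
```

This is why lexical mode wins for exact strings like "Project Blue Book" (the terms match verbatim) while semantic mode wins for conceptual queries that share few literal words with the source text.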

GraphRAG

When a knowledge graph is available, retrieval can also traverse the graph structure:

  • Find entities related to your query
  • Pull in connected entities and relationships
  • Use graph context alongside chunk context

This helps with questions about relationships and connections:

  • "How is Person A connected to Organization B?"
  • "What organizations is this person associated with?"
  • "What events happened at this location?"

Why citations matter

Every claim in an Agent response should link back to a source chunk. Citations let you:

  • Verify the AI's interpretation
  • Read the original context
  • Navigate to the source document
  • Cite sources in your own work

If the Agent can't find supporting evidence, it should say so rather than make unsupported claims.
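That grounding rule can be checked mechanically: a claim counts as supported only if it cites a retrieved chunk. The sketch below assumes inline citation markers like `[c1]`; the marker format and function are hypothetical, purely to illustrate the behavior described above.

```python
import re

# Split an answer's sentences into supported (cites a retrieved chunk id)
# and unsupported. Assumes hypothetical inline markers like "[c1]".
def check_citations(sentences, chunk_ids):
    supported, unsupported = [], []
    known = set(chunk_ids)
    for s in sentences:
        cited = set(re.findall(r"\[(\w+)\]", s))
        (supported if cited & known else unsupported).append(s)
    return supported, unsupported
```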

What affects answer quality

Document quality — Clean text with good structure produces better chunks

Query specificity — More specific questions retrieve more relevant context

Corpus coverage — The Agent can only answer from what's in your documents

Chunk relevance — If the right chunks aren't retrieved, the answer may be incomplete

Limitations

  • No real-time data — The Agent knows only what's in your documents at query time
  • Context limits — Very long documents or many results may be truncated
  • Extraction quality — Answers depend on document and chunk quality
