By retrieving relevant passages, the model can cite evidence and reduce hallucinations compared to answering from memory alone.
Retrieval-Augmented Generation (RAG)
RAG combines search and generation so models can ground responses in up-to-date, organization-specific information. It is often used for support, knowledge bases, and research assistants where accuracy matters.
Category: Systems
Reading time: 6 min read
Last updated: Jan 31, 2025
Definition
A technique that augments model responses with retrieved context from trusted data sources.
A typical RAG pipeline includes document ingestion, chunking, embeddings, vector search, and a generator that uses the retrieved context.
- Embeddings and vector database for semantic search.
- Prompt templates to integrate retrieved context.
- Evaluation of retrieval quality and output accuracy.
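The pipeline and components above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for a vector database, and every name here (`chunk`, `build_index`, `retrieve`, `build_prompt`, `hit_rate`, the sample documents) is hypothetical. The generator call is left out; the resulting prompt would be passed to whichever model you use.

```python
import math
import re
from collections import Counter

def chunk(text, max_words=20):
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def embed(text):
    """Toy embedding: word counts. Assumption: a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Ingestion: chunk each document and store (chunk, vector) pairs in memory."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]

def retrieve(index, query, k=2):
    """Vector search: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

# Prompt template that places the retrieved passages ahead of the question.
PROMPT_TEMPLATE = (
    "Answer using only the context below. If the answer is not there, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}\nAnswer:"
)

def build_prompt(question, passages):
    """Number the passages so the generator can cite them."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return PROMPT_TEMPLATE.format(context=context, question=question)

def hit_rate(index, labeled_queries, k=1):
    """Retrieval quality: fraction of queries whose expected passage is in the top k."""
    hits = sum(expected in retrieve(index, q, k) for q, expected in labeled_queries)
    return hits / len(labeled_queries)

# Made-up sample data for illustration only.
documents = [
    "A refund is available within 30 days of purchase with a receipt.",
    "Support is open Monday to Friday from 9am to 5pm.",
]
index = build_index(documents)

question = "Can I get a refund after purchase?"
passages = retrieve(index, question, k=1)
prompt = build_prompt(question, passages)
print(prompt)

labeled = [(question, documents[0]), ("When is support open?", documents[1])]
score = hit_rate(index, labeled, k=1)
print(f"hit rate @1: {score:.2f}")
```

Keeping retrieval separate from prompt construction means retrieval quality can be measured on its own, as `hit_rate` does here, before judging the accuracy of generated answers.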
FAQ
Does RAG replace fine-tuning?
RAG and fine-tuning solve different problems. RAG grounds answers in external knowledge; fine-tuning changes model behavior or tone.