4.1 Why RAG?
Key Concepts: Hallucination · Knowledge cutoff · Grounding in facts
Official Docs: LangChain RAG Tutorial · Anthropic — Contextual Retrieval
The Two Core Problems RAG Solves
1. Hallucination
LLMs are trained to produce plausible text, not verified facts. When asked about information that is sparse or absent in their training data, they may confidently generate incorrect details.
User: What was Acme Corp's Q3 2025 revenue?
LLM (no context): "Acme Corp reported Q3 2025 revenue of $2.3 billion..." ❌ fabricated
2. Knowledge Cutoff
Every model has a training data cutoff. Events, documents, and updates after that date are unknown to the model unless you provide them in the prompt.
What is RAG?
Retrieval-Augmented Generation (RAG) grounds the model in real documents by retrieving relevant context at query time and injecting it into the prompt.
┌─────────────────────────────────────────┐
│ USER QUERY                              │
│ "What was Q3 2025 revenue?"             │
└─────────────────────────────────────────┘
                     ↓
┌─────────────────────────────────────────┐
│ RETRIEVER                               │
│ Embed query → search vector store       │
│ Return top-k relevant document chunks   │
└─────────────────────────────────────────┘
                     ↓
┌─────────────────────────────────────────┐
│ LLM (context injected into prompt)      │
│ "Based on the Q3 report: revenue = ..." │
└─────────────────────────────────────────┘
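The retriever step above can be sketched in a few lines of Python. This is a toy: `embed()` is a bag-of-words stand-in for a real embedding model, and the document chunks (including the $1.1 billion figure) are invented for illustration.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding". Real systems use a neural embedding
    # model, but scoring by cosine similarity works the same way.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Embed the query, score every chunk, return the top-k matches.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Invented document chunks for illustration (the figures are made up).
chunks = [
    "Q3 2025 report: Acme Corp revenue was $1.1 billion.",
    "Acme Corp was founded in 1987 in Ohio.",
    "The Q3 2025 report lists operating costs of $0.4 billion.",
]

top = retrieve("What was Acme Corp's Q3 2025 revenue?", chunks)
# The revenue chunk scores highest, so it is what gets injected into the prompt.
```

Swapping `embed()` for a real embedding model and the list scan for a vector store is all that separates this sketch from a production retriever.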
RAG vs Fine-Tuning
| | RAG | Fine-Tuning |
|---|---|---|
| Knowledge update | Real-time (re-index) | Requires re-training |
| Cost | Low | High |
| Citable sources | Yes | No |
| New facts | ✅ | ❌ |
| New style/format | ❌ | ✅ |
Use RAG for factual, up-to-date knowledge. Use fine-tuning to teach the model a new style or specialised reasoning pattern.
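Because retrieved chunks come from real documents, the prompt can ask the model to cite them, which is what makes RAG answers auditable ("Citable sources: Yes" above). A minimal sketch of that prompt assembly; the instruction wording and the `[n]` citation format are illustrative choices, not a standard:

```python
def build_prompt(question: str, retrieved: list[str]) -> str:
    # Number each chunk so the model can cite it as [1], [2], ...
    sources = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(retrieved, 1))
    return (
        "Answer using ONLY the sources below, citing them as [n]. "
        "If the answer is not in the sources, say you don't know.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )

prompt = build_prompt(
    "What was Acme Corp's Q3 2025 revenue?",
    ["Q3 2025 report: Acme Corp revenue was $1.1 billion."],
)
```

The explicit "say you don't know" instruction directly targets the hallucination failure mode from the start of this section.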