
4.1 Why RAG?


Key Concepts: Hallucination · Knowledge cutoff · Grounding in facts

Official Docs: LangChain RAG Tutorial · Anthropic — Contextual Retrieval


The Two Core Problems RAG Solves

1. Hallucination

LLMs are trained to produce plausible text, not true text. When asked about information not well represented in training data, they may confidently generate incorrect details.

User: What was Acme Corp's Q3 2025 revenue?
LLM (no context): "Acme Corp reported Q3 2025 revenue of $2.3 billion..." ❌ fabricated

2. Knowledge Cutoff

Every model has a training data cutoff. Events, documents, and updates after that date are unknown to the model unless you provide them in the prompt.
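The standard workaround is to supply post-cutoff information in the prompt yourself and instruct the model to answer only from it. A minimal sketch of that pattern follows; `build_grounded_prompt` is a hypothetical helper (not a library function), and the press-release text is made-up illustrative data.

```python
# A model cannot know about events after its training cutoff, but it can
# reason over text you supply at query time.
# build_grounded_prompt is a hypothetical helper name, not a real API.
def build_grounded_prompt(question: str, context: str) -> str:
    return (
        "Answer using ONLY the context below. If the answer is not "
        "in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# The context here is made-up illustrative text, not a real filing.
prompt = build_grounded_prompt(
    "When did Acme Corp ship Widget X?",
    "Press release (Oct 2025): Acme Corp shipped Widget X on 14 October 2025.",
)
```

Restricting the model to the supplied context is also what makes hallucinations detectable: if the answer is not in the context, the instructed behaviour is to refuse rather than to invent.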


What is RAG?

Retrieval-Augmented Generation (RAG) grounds the model in real documents by retrieving relevant context at query time and injecting it into the prompt.

┌─────────────────────────────────────────┐
│ USER QUERY                              │
│ "What was Q3 2025 revenue?"             │
└─────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────┐
│ RETRIEVER                               │
│ Embed query → search vector store       │
│ Return top-k relevant document chunks   │
└─────────────────────────────────────────┘
                    ↓
┌─────────────────────────────────────────┐
│ LLM (context injected into prompt)      │
│ "Based on the Q3 report: revenue = ..." │
└─────────────────────────────────────────┘
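The retrieve-then-inject flow above can be sketched end to end in a few lines. This is a toy, assumption-laden version: the "embedding" is just a bag-of-words term-frequency vector with cosine similarity standing in for a real neural embedding model, and the document chunks (including the revenue figure) are made up for illustration.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a term-frequency bag of words. A real RAG stack
    # would call a neural embedding model here instead.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(n * b[t] for t, n in a.items())
    norm = (math.sqrt(sum(n * n for n in a.values()))
            * math.sqrt(sum(n * n for n in b.values())))
    return dot / norm if norm else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank every chunk by similarity to the query; return the top k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Made-up document chunks; the revenue figure is illustrative, not real.
chunks = [
    "Q3 2025 report: Acme Corp revenue was $1.1 billion.",
    "Acme Corp was founded in 1999 in Toronto.",
    "The Q3 2025 report also notes headcount grew 4%.",
]

top = retrieve("What was Q3 2025 revenue?", chunks)
prompt = (
    "Answer using ONLY the context below.\n\n"
    "Context:\n" + "\n".join(top) + "\n\n"
    "Question: What was Q3 2025 revenue?"
)
```

In a production system the `retrieve` step would query a vector store of pre-embedded chunks rather than re-embedding every chunk per query, but the shape of the pipeline — embed, rank, take top-k, inject — is the same.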

RAG vs Fine-Tuning

                    RAG                     Fine-Tuning
Knowledge update    Real-time (re-index)    Requires re-training
Cost                Low                     High
Citable sources     Yes                     No
New facts           Yes                     No
New style/format    No                      Yes
Rule of Thumb

Use RAG for factual, up-to-date knowledge. Use fine-tuning to teach the model a new style or specialised reasoning pattern.



Next → 4.2 Document Loading & Chunking