6.1 Types of Hallucination

AI-Generated Content

AI-generated content may contain errors. Always verify against official sources.

Key Concepts: Intrinsic vs extrinsic · Fabrication vs distortion · Event shadowing

Key Paper: Survey of Hallucination in Natural Language Generation (Ji et al., 2023)


What is Hallucination?

Hallucination in LLMs occurs when the model generates plausible-sounding content that is factually incorrect or not supported by its source.

“LLMs don’t lie. They complete text patterns. The problem is: patterns that look like facts can be wrong.”


Taxonomy of Hallucination

Type 1 — Intrinsic Hallucination

The output contradicts the source document provided in context.

Source: "The Eiffel Tower is 330 metres tall."
Hallucination: "As stated in the text, the Eiffel Tower is 300 metres tall."
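A contradiction like this one can sometimes be caught with a very crude check. The sketch below is only a heuristic, assuming the contradiction is numeric: it flags numbers that appear in the output but not in the source. The function name `numeric_mismatches` is made up for this example, and a flagged number is not proof of a contradiction (it could be new, unverified information instead).

```python
import re


def numeric_mismatches(source: str, output: str) -> set[str]:
    """Return numbers that appear in the output but not in the source."""
    number = re.compile(r"\d+(?:\.\d+)?")
    return set(number.findall(output)) - set(number.findall(source))


source = "The Eiffel Tower is 330 metres tall."
output = "As stated in the text, the Eiffel Tower is 300 metres tall."

print(numeric_mismatches(source, output))  # {'300'}
```

Anything subtler than a changed number (a swapped name, a negated statement) passes straight through this check, which is why the detection section uses a model as the judge instead.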

Type 2 — Extrinsic Hallucination

The output adds information that is not present in the source and cannot be verified from it. The added detail may be true or false.

Source: "Einstein won the Nobel Prize in 1921."
Hallucination: "Einstein won the Nobel Prize in 1921 for his theory of relativity."
# WRONG: he won it for the photoelectric effect, not relativity

Type 3 — Fabrication

The model invents entirely fictional facts with no basis.

Prompt: "Tell me about Professor John Smith at Harvard."
Hallucination: "Prof. Smith has published 50 papers on quantum computing and won the Turing Award."
# The person, the papers, and the award may not exist at all

Type 4 — Event Shadowing

The model blends two real events into one inaccurate account.

Hallucination: "The 1969 moon landing used the Apollo 10 spacecraft."
# Apollo 11 landed; Apollo 10 orbited but didn't land

Why Hallucination Happens

Cause                 | Explanation
Token prediction bias | Model predicts the most likely next token, not the most accurate
Knowledge cutoff      | Training data ends at a date; post-cutoff questions have no ground truth
Sparse training data  | Rare topics have few training examples, so the model generalises poorly
Over-confidence       | Models don’t know what they don’t know
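The first cause, token prediction bias, can be illustrated with a toy example. The probabilities below are invented for illustration and do not come from any real model; `greedy_pick` is a hypothetical helper that mimics greedy decoding by always choosing the most likely continuation.

```python
# Hypothetical next-token probabilities for the prompt
# "Einstein won the 1921 Nobel Prize for ..." (numbers invented for illustration).
next_token_probs = {
    "relativity": 0.55,                # common association, but factually wrong here
    "the photoelectric effect": 0.35,  # the correct answer
    "quantum mechanics": 0.10,
}


def greedy_pick(probs: dict[str, float]) -> str:
    """Return the highest-probability continuation, as greedy decoding would."""
    return max(probs, key=probs.get)


print(greedy_pick(next_token_probs))  # relativity
```

Nothing in the selection step looks at whether the continuation is true. If the more frequent association is also the wrong one, the most likely output is a hallucination.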

Detecting Hallucination

import json

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def check_for_hallucination(claim: str, source_context: str) -> dict:
    """Use GPT-4o to check if a claim is supported by the source."""
    prompt = f"""Given the following source context, determine if the claim is:
- SUPPORTED: The claim is directly supported by the context
- CONTRADICTED: The claim contradicts the context
- UNVERIFIABLE: The claim is not addressed in the context

Source context:
{source_context}

Claim: {claim}

Respond with JSON: {{"verdict": "SUPPORTED|CONTRADICTED|UNVERIFIABLE", "reason": "<brief explanation>"}}"""

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
        response_format={"type": "json_object"},  # ask the API for a JSON object
    )
    return json.loads(resp.choices[0].message.content)


# Test it
result = check_for_hallucination(
    claim="The Eiffel Tower was built in 1889.",
    source_context="The Eiffel Tower, built for the 1889 World's Fair, stands 330m tall.",
)
# Example output (the exact wording depends on the model):
# {'verdict': 'SUPPORTED', 'reason': "Context explicitly states built for 1889 World's Fair"}
print(result)
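The judge model is itself an LLM, so its reply should be validated before anything downstream trusts it. The sketch below is one possible way to do that, not part of the original checker: `parse_verdict` and `ALLOWED_VERDICTS` are hypothetical names, and falling back to UNVERIFIABLE on malformed output is an assumed design choice.

```python
import json

ALLOWED_VERDICTS = {"SUPPORTED", "CONTRADICTED", "UNVERIFIABLE"}


def parse_verdict(raw: str) -> dict:
    """Parse the judge's JSON reply; fall back to UNVERIFIABLE if it is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"verdict": "UNVERIFIABLE", "reason": "judge returned invalid JSON"}
    if not isinstance(data, dict):
        return {"verdict": "UNVERIFIABLE", "reason": "judge did not return a JSON object"}

    # Normalise case and whitespace before comparing against the allowed labels.
    verdict = str(data.get("verdict", "")).strip().upper()
    if verdict not in ALLOWED_VERDICTS:
        return {"verdict": "UNVERIFIABLE", "reason": f"unexpected verdict: {verdict!r}"}
    return {"verdict": verdict, "reason": str(data.get("reason", ""))}


print(parse_verdict('{"verdict": "supported", "reason": "stated in context"}'))
print(parse_verdict("not json"))
```

Treating an unparseable reply as UNVERIFIABLE rather than SUPPORTED keeps a broken judge from silently approving claims.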

Common Mistakes

  1. Assuming low temperature = no hallucination: temperature=0 makes outputs (near-)deterministic but does not prevent the model from confidently generating wrong facts.
  2. Not providing source context: factual questions asked without grounding context are more likely to produce hallucinations. Ground answers in retrieved documents (RAG).
  3. Asking the model to evaluate its own accuracy: “Are you sure this is correct?” is unreliable. The model often doubles down on incorrect answers.
  4. Treating all hallucinations equally: a fabricated date in a creative story is harmless; a fabricated drug dosage in a medical app is dangerous.
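The first mistake can be made concrete with a small softmax calculation. This is a toy sketch with invented logits, not output from a real model. It shows that lowering the temperature only sharpens the distribution around whichever token already scores highest, whether or not that token is correct.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities, dividing by the temperature first."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


tokens = ["relativity", "the photoelectric effect"]
logits = [2.0, 1.5]  # invented values: the wrong answer has the higher logit

# temperature=0 itself would divide by zero; it corresponds to the limit
# in which the top-scoring token is always chosen.
for t in (1.0, 0.5, 0.1):
    probs = softmax_with_temperature(logits, t)
    top = tokens[probs.index(max(probs))]
    print(f"T={t}: top={top!r}, p={max(probs):.3f}")
```

The top token is the same at every temperature; only the model's apparent confidence in it grows as the temperature falls.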

Quick Quiz

Test Your Understanding

Q1. What is the difference between intrinsic and extrinsic hallucination?
A1. Intrinsic hallucination contradicts the provided source document. Extrinsic hallucination adds information not present in the source.

Q2. Does setting temperature=0 prevent hallucination?
A2. No. It makes outputs (near-)deterministic, so the same input usually gives the same output, but the model can still confidently generate incorrect facts.

Q3. Why do LLMs hallucinate on topics with sparse training data?
A3. With few training examples, the model has little signal about what is true and generalises from unrelated patterns, producing plausible-sounding but wrong content.


Student Exercise

Exercise 12.1 — Hallucination taxonomy
Ask GPT-4o-mini 10 questions about obscure historical events. For each answer, research the correct facts and classify each error as: Fabrication, Intrinsic, Extrinsic, or Event Shadowing. Report the distribution.
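Once each answer has been checked and labelled by hand, the distribution can be tallied in a few lines. The sketch below uses made-up labels for illustration, and the extra "Correct" label for answers without an error is an assumption of this example, not part of the exercise.

```python
from collections import Counter

# Hypothetical labels assigned by hand after fact-checking each of the 10 answers.
labels = [
    "Fabrication", "Extrinsic", "Fabrication", "Event Shadowing", "Correct",
    "Fabrication", "Extrinsic", "Correct", "Correct", "Intrinsic",
]


def error_distribution(labels: list[str]) -> dict[str, float]:
    """Return each hallucination type's share among the answers that had an error."""
    errors = [label for label in labels if label != "Correct"]
    if not errors:
        return {}
    counts = Counter(errors)
    return {kind: n / len(errors) for kind, n in counts.items()}


for kind, share in error_distribution(labels).items():
    print(f"{kind}: {share:.0%}")
```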

