Hallucination Testing
A hallucination is a confident, fluent, wrong answer — the hardest LLM failure to catch because it looks right. Testing for it means deliberately provoking it and checking claims against a source of truth. This sheet covers the types, detection, and provocation tactics; see RAG Testing and LLM Evaluation for neighbouring techniques (linked below).
Types
| Type | Example |
|---|---|
| Factual | Invents a wrong fact, date, or figure |
| Unfaithful (RAG) | Claim not supported by the provided context |
| Fabricated source | Cites a paper, API, or URL that doesn't exist |
| Instruction drift | Ignores a constraint it was given |
| Overconfidence | States a guess as certain fact |
How to detect
- Groundedness check: does each claim trace to provided context/source? (LLM-as-judge or rules.)
- Reference comparison: check against a known answer / knowledge base.
- Self-consistency: ask N times; inconsistent answers signal low confidence.
- Citation verification: confirm cited sources/URLs actually exist and actually support the claim attributed to them.
- Human spot-check on a sample for ground truth.
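The self-consistency check in the list above can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: `ask` stands for whatever function calls your model, `make_fake_model` is a hypothetical stand-in that replays canned replies so the example runs offline, and the normalization is deliberately crude (real answers usually need semantic comparison rather than string equality).

```python
from collections import Counter
from typing import Callable


def normalize(answer: str) -> str:
    """Crude normalization so trivially different phrasings compare equal."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def self_consistency(ask: Callable[[str], str], prompt: str, n: int = 5) -> tuple[str, float]:
    """Ask the same prompt n times; return the most common answer and its share."""
    answers = [normalize(ask(prompt)) for _ in range(n)]
    top_answer, count = Counter(answers).most_common(1)[0]
    return top_answer, count / n


def make_fake_model(replies: list[str]) -> Callable[[str], str]:
    """Hypothetical stand-in for a real model call: replays canned replies in order."""
    replies_iter = iter(replies)
    return lambda prompt: next(replies_iter)


# Five samples for one question; three of them agree after normalization.
fake = make_fake_model(["1969.", "1969", "1971", "1969", "1968"])
answer, agreement = self_consistency(fake, "In what year did the event happen?", n=5)
print(answer, agreement)  # prints: 1969 0.6
```

A low agreement share does not prove the majority answer is wrong, and a high one does not prove it is right; it is a signal for where to spend reference comparison or human review.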
Prompts that provoke hallucination
- Questions about things that don't exist ("the 2027 X API") — does it refuse or invent?
- Out-of-corpus questions in a RAG app.
- Requests for specific figures, citations, or quotes.
- Ambiguous / under-specified prompts.
- Niche, long-tail topics with little training data.
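The first tactic above (asking about things that don't exist) lends itself to a small "should refuse" suite. The sketch below assumes a simple keyword heuristic for spotting refusals; `REFUSAL_MARKERS`, the prompts, and `fake_model` are all illustrative placeholders, and in practice an LLM-as-judge or a stricter classifier would likely replace the keyword match.

```python
from typing import Callable

# Hypothetical phrases that count as declining to answer; tune these for your model.
REFUSAL_MARKERS = ("i don't know", "i'm not aware", "does not exist", "no record")

# Prompts about things that do not exist: a safe answer declines rather than invents.
SHOULD_REFUSE = [
    "Summarise the changelog of the 2027 X API.",
    "Quote the abstract of the paper 'Zebra Lattices for Quantum Toast'.",
]


def looks_like_refusal(answer: str) -> bool:
    """Keyword heuristic: True if the answer contains any refusal marker."""
    text = answer.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)


def run_refusal_suite(model: Callable[[str], str], prompts: list[str]) -> list[str]:
    """Return the prompts the model answered instead of declining."""
    return [p for p in prompts if not looks_like_refusal(model(p))]


def fake_model(prompt: str) -> str:
    """Stand-in for a real model: invents details for one prompt, declines the other."""
    if "2027" in prompt:
        return "The 2027 X API added streaming endpoints in v3."  # invented detail
    return "I don't know of any such paper."


failures = run_refusal_suite(fake_model, SHOULD_REFUSE)
print(failures)  # prints: ['Summarise the changelog of the 2027 X API.']
```

Each prompt left in `failures` is a candidate hallucination: the model produced content for something that, by construction, has no correct answer.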
Common mistakes
- Judging fluency as correctness.
- Testing only answerable questions (never the "should refuse" cases).
- Relying on one sample instead of checking self-consistency across several.
- Not verifying that citations are real.
- No source of truth to check claims against.
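The last two mistakes can be guarded against together by checking every cited URL against a source of truth. The following is a rough offline sketch under stated assumptions: `known_sources` is a hypothetical dictionary standing in for your corpus or a real fetch of each URL, the `example.com` addresses are placeholders, and the "supports the claim" test is a plain substring match where a real pipeline would more likely use a judge model.

```python
import re

# Matches http(s) URLs up to whitespace or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")


def extract_urls(answer: str) -> list[str]:
    """Pull URLs out of an answer, trimming trailing sentence punctuation."""
    return [url.rstrip(".,;") for url in URL_PATTERN.findall(answer)]


def check_citations(answer: str, known_sources: dict[str, str], must_mention: str) -> dict[str, str]:
    """Label each cited URL as 'ok', 'unsupported', or 'missing'."""
    verdicts = {}
    for url in extract_urls(answer):
        text = known_sources.get(url)
        if text is None:
            verdicts[url] = "missing"  # cited source is not in the source of truth
        elif must_mention.lower() not in text.lower():
            verdicts[url] = "unsupported"  # source exists but does not say this
        else:
            verdicts[url] = "ok"
    return verdicts


# Hypothetical source of truth: URL -> text of the document at that URL.
known_sources = {
    "https://example.com/docs/retries": "The client retries failed requests 3 times by default.",
    "https://example.com/docs/timeouts": "Timeouts are configurable per request.",
}

answer = (
    "Requests are retried 3 times (https://example.com/docs/retries), "
    "as confirmed by https://example.com/docs/timeouts, "
    "and see https://example.com/docs/backoff."
)

verdicts = check_citations(answer, known_sources, must_mention="3 times")
for url, verdict in verdicts.items():
    print(verdict, url)
```

Here the first citation is real and supports the claim, the second is real but says nothing about retries, and the third does not exist in the source of truth at all, which covers the "exists" and "says that" halves of citation verification separately.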
// Related resources