LLMs Predict Tokens, Not Facts — What That Means for Hallucinations

2026-06-06


You'll get better results from Claude — and fewer nasty surprises — once you understand what it's actually doing when it responds.

The jargon

Token — a chunk of text, roughly a word or part of a word. "Unbelievable" might be two tokens; "the" is one.
Next-token prediction — the core operation: given everything so far, what token is most likely to come next?
Hallucination — when a model produces text that sounds confident and correct but is factually wrong.
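
To make next-token prediction concrete, here is a minimal sketch in Python. The candidate tokens and their scores are invented for illustration and do not come from any real model; a real model would score tens of thousands of candidates and usually samples rather than always taking the top one.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

# Hypothetical scores for the token after "The capital of France is".
# These numbers are made up for this example.
logits = {" Paris": 6.0, " Lyon": 2.0, " a": 1.0}

probs = softmax(logits)
next_token = max(probs, key=probs.get)  # greedy choice: the most probable token
print(next_token, round(probs[next_token], 3))
```

Nothing in this loop checks whether the chosen token is true. It only checks which token scores highest.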

The lesson

Claude is not retrieving facts from a database. It was trained to predict what text should come next, given a vast sample of human writing. That training compressed patterns — which words follow which other words, in which contexts — into billions of numbers. At inference time, it runs those patterns forward.

This means fluency and accuracy are separate things. A response can read perfectly and be completely wrong. The model has no built-in alarm that reliably fires when it's about to state something false. It just produces a likely next token.

Hallucinations aren't bugs or laziness. They're the predictable output of a system that learned "what plausible text looks like" rather than "what is true."

How it works

Imagine asking: "What year did the Treaty of Utrecht extend to include Canada?" The question has a false premise, and Claude has no record of that treaty doing so. But "Treaty of Utrecht" and "Canada" appear near dates in its training data. So it may produce a plausible-sounding year, stated confidently, because that's what an answer to this question would look like in text.

The model doesn't know what it doesn't know. It only knows what token fits.
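
A deliberately crude stand-in shows the same failure. The sketch below is a bigram model that only looks at the last word of the prompt, which is far simpler than how Claude conditions on the whole context. The three-sentence corpus is invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny invented corpus standing in for training data (illustration only).
corpus = [
    "the treaty was signed in 1713",
    "the treaty was ratified in 1713",
    "the colony was founded in 1608",
]

# Count which word follows which: a bigram model, not a real LLM.
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1

def complete(prompt):
    """Return the word most often seen after the prompt's last word."""
    last_word = prompt.split()[-1]
    return follows[last_word].most_common(1)[0][0]

# A false-premise prompt still gets a confident-looking year,
# because the model only knows what usually follows "in".
answer = complete("the treaty was extended to canada in")
print(answer)
```

The toy model has never seen a sentence about Canada, yet it answers with a year anyway. It has no way to represent "this never happened", only "this is what usually comes next".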

Two patterns make hallucinations worse:

  1. Obscure specifics — exact dates, citations, version numbers, statistics. These were sparse in training data, so the model fills gaps with plausible-sounding values.
  2. Closed questions with a confident frame — asking "What is the capital of X?" pressures the model toward a definitive answer even when none exists.

Two patterns reduce them:

  1. Give it the facts, ask it to reason — paste in a document and ask Claude to summarise or analyse it. Now it's pattern-matching against your text, not reaching into training data.
  2. Ask for uncertainty: "If you're not sure, say so" actually works. It shifts the likely next token away from false confidence.
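
Both habits can be folded into one prompt template. This is a sketch under assumptions: the function name, the instruction wording, and the document tags are hypothetical choices for this example, not an official format, and the sample document is made up.

```python
def build_grounded_prompt(document: str, question: str) -> str:
    """Wrap a question with source text and an explicit uncertainty instruction."""
    return (
        "Answer using only the document below.\n"
        "If the document does not contain the answer, or you're not sure, say so.\n\n"
        f"<document>\n{document.strip()}\n</document>\n\n"
        f"Question: {question.strip()}"
    )

prompt = build_grounded_prompt(
    document="Release notes: version 2.4 adds CSV export.",  # made-up example text
    question="Which version added CSV export?",
)
print(prompt)
```

The resulting string can be pasted into a chat or sent as the user message through whatever client you already use.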

When to reach for this mental model / when not to

Use it whenever you're asking Claude for specifics you can't easily verify: citations, API version details, legal precedents, named individuals. Treat those outputs as first drafts to be checked, not final answers.
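
If you want a mechanical nudge to do that checking, a rough filter can highlight the specifics in a response. The patterns below are hypothetical and intentionally loose; they will miss some claims and flag some harmless ones, so treat the output as a to-do list, not a verdict.

```python
import re

# Rough, hypothetical patterns for the kinds of specifics worth double-checking.
PATTERNS = {
    "year": re.compile(r"\b(?:1[5-9]|20)\d{2}\b"),
    "version": re.compile(r"\bv?\d+\.\d+(?:\.\d+)?\b"),
    "percentage": re.compile(r"\b\d+(?:\.\d+)?%"),
}

def flag_specifics(text):
    """Return (kind, matched_text) pairs for claims a human should verify."""
    found = []
    for kind, pattern in PATTERNS.items():
        found.extend((kind, m.group()) for m in pattern.finditer(text))
    return found

# Made-up response text for illustration.
flags = flag_specifics("The API changed in v2.1, released in 2019, cutting latency by 40%.")
for kind, value in flags:
    print(f"check this {kind}: {value}")
```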

Don't overfit to it. For tasks grounded in the content you provide — drafting, editing, summarising, reasoning through a problem you've described — hallucination risk drops sharply. The model is working with your tokens, not inventing its own.

Try it

Take a factual claim from a recent Claude response that you accepted without checking. Look it up now. If it's right, notice that you got lucky. If it's wrong, notice that the prose gave you no warning. That feeling is the lesson.

