Context Windows Aren't Memory — What Actually Persists Between Turns


Context Windows Aren't Memory — What Actually Persists Between Turns

This matters every time you start a new conversation and expect Claude to "remember" something from the last one. It doesn't. Understanding why stops you from building systems on a foundation that isn't there.

The jargon

Context window — the block of text Claude can see at one moment. Everything inside it is "in context." Everything outside it is invisible.

Token — roughly one word, or three-to-four characters. Context windows are measured in tokens. Claude's limits range from around 100k to 200k tokens depending on the model.

Stateless — each API call is independent. The server doesn't keep a running tab on you between calls.

The lesson

Claude has no persistent memory of its own. What feels like memory — Claude "remembering" something you said two messages ago — is just the fact that the earlier message is still sitting in the same scrolling block of text.

The moment a conversation ends, that block is gone. A new conversation is a blank page. Claude doesn't have a database, a diary, or a background process keeping notes. It only knows what's written in the text that gets sent to it right now.

How it works

When you send a message, your client (Claude.ai, an API call, Claude Code) bundles up the entire conversation so far — every message, from the beginning — and sends it all to the model every single time. Claude reads the whole thing fresh. Then it replies. That reply gets added to the bundle. Next turn, the whole thing gets sent again.

This is why long conversations get expensive and slow. You're not sending one message. You're resending the entire transcript, plus one new message, on every turn.

Turn 1 sent to API:  [system prompt] + [user msg 1]
Turn 2 sent to API:  [system prompt] + [user msg 1] + [assistant reply 1] + [user msg 2]
Turn 3 sent to API:  [system prompt] + [user msg 1] + [assistant reply 1] + [user msg 2] + [assistant reply 2] + [user msg 3]

Nothing is stored on Anthropic's side between calls. The conversation history lives in your application — your code, your browser tab, your API client.

When this matters — and when it doesn't

It matters the moment you design something that needs to carry information across sessions. Summarising a meeting today and referencing it next week won't work unless you explicitly paste that summary back in. It also matters when debugging why Claude "forgot" an instruction — if it wasn't in the window, it was never seen.

It matters less in a single long conversation where everything stays in context. There, Claude really does have perfect recall — of that session only.

Try it

Open two separate Claude.ai conversations right now. Tell Claude something specific in the first — a made-up project name, say "Project Wombat." Close that tab. Open the second conversation and ask Claude what you told it earlier. It will have no idea. That blank stare is the context window boundary in action.

Once you've felt it, you'll stop trusting that Claude "knows" something just because you told it before.


Don't miss what's next. Subscribe to My Claude Daily Learning: