Claude re-reads your entire conversation every time it replies. A 20-message session burns ~105,000 tokens — and one measured session found 98.5% of that went to re-reading history, only 1.5% to new output. Once you see the leak, the moves are obvious. This is every move on one page.
| # | Move | Why it works |
|---|---|---|
| 1 | Restart, don't follow up | Fresh session every ~20 messages. Every follow-up re-reads the whole history again. |
| 2 | Batch into one message | 3 prompts = 3 full reloads. One message, three tasks = one reload. |
| 3 | Plan in Chat, build in Cowork | Nail it in Chat first. Revising inside Cowork re-reads the whole folder every time. |
| 4 | Match the model | Sonnet/Haiku for quick work. Frees 30–70% of budget. (Full table below.) |
| 5 | Keep context files small | Cowork reads them before every task. 22k → under 2k = same signal, 10× less noise. |
| 6 | Spread the 5-hr window | Rolling window. Split morning/afternoon/evening so earlier usage rolls off. |
The 6 moves are what to do. These are why it works — so you can invent your own.
Verified = confirmed in Anthropic's tooling/docs Translated = a power-user move in operator language ★ = non-negotiable
In Chrome, screenshots are sized image blocks that accumulate and re-send every turn. 18 of them burned 17% of a Max plan in 5 turns. Move: tell it to READ the page, not photograph it.
Making Claude FIND something burns tokens reading everything. Move: name the exact file, paste the exact section. "Look at the pricing below" beats "find the pricing."
Claude caches the stable front of your chat. Switching models OR adding a tool mid-session shatters it and spikes cost. Move: pick model + tools BEFORE you start.
Don't just "summarize." Direct what survives: "Keep the decisions and open questions, drop the back-and-forth." 10× more signal at the same cost.
Exploratory work pollutes a thread — Claude starts referencing its own dead ends. Move: run the messy part in a throwaway chat, bring back only the clean answer.
When tasks don't depend on each other, say so. Move: "Do these three — they're independent." It handles them together instead of in sequence.
New chat = total reset. Clear = wipe context, keep workspace. Compact = summarize + continue. Match the reset to the situation.
Free Chrome extensions (Claude Token Counter et al) show live token count, context %, and usage bars on claude.ai. You can't manage a leak you can't see.
"Score this 1–100 on X, and if under 90, tell me why and rewrite." The number forces honesty. "Is this good?" gets flattery. (Three flavors below.)
A raw PDF carries rendering noise Claude has to chew through. Clean markdown keeps only what matters — headings, tables, lists. Same document, up to 70% fewer tokens (a 70k-token PDF can drop to ~21k) and better output, because the structure survives. The tool is MarkItDown by Microsoft. Free.
The free "PDF to Markdown" app in the Microsoft Store (built on MarkItDown). Drop a file, get clean .md back. No terminal, no setup.
MarkItDown ships an MCP server. Wire it in as a connector and every file you upload auto-converts to markdown before Claude reads it.
There's also a command-line version (markitdown file.pdf -o file.md) for the technically inclined — but the Store app is the one for most people.
Paste a long output back and ask: "This is taking up too many tokens — how do I shorten it without losing the meaning?" It'll trim its own bloat. Works on outputs, context files, and prompts alike.
Anything that runs daily runs its token cost daily. Paste out your scheduled-task prompts and trim them the same way — a bloated prompt on a timer is a leak that bills you every morning.
| Failure mode | Looks like | The move |
|---|---|---|
| Drift | Stops following rules ~20 msgs in | Restart. Carry context via notes. |
| Context overload | Slows, thrashes, "autocompact" churn | Thread's too full. Fresh chat. Trim the input. |
| Template collapse | Every output sounds the same | New chat. Re-anchor with a fresh example. |
| Over-softening | Hedges, won't say the hard thing | Tighten Custom Instructions Part 2. |
| Hallucination | Confident, specific, wrong | Ask for sources. Web Search on. Cross-check. |
Claude Code users drop these rules into a CLAUDE.md file so every session follows them. Your equivalent: paste this into your Cowork Project instructions (or a Project's custom instructions on claude.ai). Then it runs these habits for you instead of you remembering them.