How to Stop Context Rot in Claude Code: Rewind, Compact, and Sub‑Agents

This guide explains why massive token windows can cause context rot in Claude Code, demonstrates how to monitor usage, and walks through the /rewind, /clear, and /compact commands plus sub‑agent techniques to keep the model’s context clean and efficient.


Developers using Claude Code often see sudden drops in model performance because the context window (up to one million tokens) fills with system prompts, prior dialogue, tool calls, file reads, and output results. When the window reaches its limit, the model automatically compresses the current task into a short summary and discards older material; the resulting degradation is called context rot.

The article first shows how the new /usage slash command reveals real resource consumption, highlighting that users manage sessions very differently: some keep long‑lived connections, others open a new window for each command. Without proper window‑capacity management, a million‑token context can quickly become polluted, scattering the model’s attention across irrelevant old tokens and degrading its ability to focus on the current task.
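To make the accounting concrete, here is a toy model of how a session fills a fixed window. This is purely illustrative: Claude Code's /usage command reports the real numbers for you, and the characters-per-token heuristic below is an assumption, not how tokens are actually counted.

```python
# Toy model of context-window accounting. Claude Code's real /usage command
# reports these numbers for you; the heuristic below is only illustrative.

def estimate_tokens(text: str) -> int:
    # Rough rule of thumb: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def context_usage(messages: list[str], window: int = 1_000_000) -> float:
    """Return the fraction of the window consumed by the session so far."""
    used = sum(estimate_tokens(m) for m in messages)
    return used / window

session = ["system prompt...", "user: fix the bug", "tool: read main.py (4 KB)"]
print(f"{context_usage(session):.4%} of the window used")
```

The point of the sketch is that every tool call and file read adds to `used` and never leaves on its own, which is why a million-token window still fills up over a long session.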

To combat this, the author outlines four possible actions after each model reply:

Continue the conversation: send a new message in the same session.

Rewind: use the /rewind command to jump back to an earlier message, discarding everything that came after it.

Clear: invoke /clear to start a fresh session, usually seeded with a brief summary of the previous dialogue.

Compact: ask the model to summarize the current session and continue from that concise memory.

The author emphasizes that rewind is often more efficient than manual correction. For example, when Claude reads five files, tries a solution, and fails, instead of typing "the previous method didn't work, try method B," you can rewind to the point right after the file reads and issue a new instruction such as "don't use method A; the foo module lacks the required interface, use method B instead." This preserves the useful file-read records while discarding the failed attempt, keeping the context clean.
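Conceptually, rewinding is just truncating the session's turn list at a checkpoint. The sketch below models that idea; the list-of-strings session and the `rewind` helper are hypothetical simplifications, not Claude Code internals.

```python
# Hypothetical model of what /rewind does conceptually: the session is an
# append-only list of turns, and rewinding truncates it at a checkpoint.

def rewind(history: list[str], checkpoint: int) -> list[str]:
    """Keep turns up to and including `checkpoint`; discard everything later."""
    return history[: checkpoint + 1]

history = [
    "user: fix the auth bug",
    "tool: read auth.py",
    "tool: read foo.py",          # checkpoint: file-read records stay useful
    "assistant: tried method A",  # failed attempt, worth discarding
    "tool: tests failed",
]
history = rewind(history, checkpoint=2)
history.append("user: don't use method A; foo lacks the interface, use method B")
```

After the truncation, the file reads survive while the failed attempt and its error output are gone, which is exactly the trade the article describes.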

When the session becomes overly long, the article compares compacting and clearing. Compacting lets the model generate a summary automatically, which can retain more detail but may drop nuances when the model's attention is already degraded. Clearing requires the user to write down the essential points manually, offering absolute precision at the cost of extra effort. The author notes that automatic compaction can misfire during long debugging sessions, summarizing hours of work into a brief note that omits a newly discovered warning, leading the model to ignore that warning later.

To avoid such pitfalls, the guide recommends summarizing manually before the automatic compaction boundary triggers, especially when context rot has already degraded the model's output.
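The contrast between the two strategies can be sketched as two ways of producing the seed for the next stretch of work. Both function names and the `summarize` callback are hypothetical; this is a conceptual model, not Claude Code's API.

```python
# Hedged sketch of the two long-session strategies: compacting (the model
# writes the summary) vs. clearing (the user writes the notes by hand).

def compact(history: list[str], summarize) -> list[str]:
    """Replace the whole session with a model-written summary."""
    return [f"summary: {summarize(history)}"]

def clear_with_notes(user_notes: str) -> list[str]:
    """Start a fresh session, seeded with hand-written notes."""
    return [f"user notes: {user_notes}"]

# Proactive manual summarization: clear before the automatic boundary fires,
# so a critical detail (e.g. a newly discovered warning) is not dropped.
fresh = clear_with_notes("bug traced to utf-8 decode; keep the deprecation warning in mind")
```

The asymmetry the article describes lives in `summarize`: whatever that step drops is gone for good, whereas the hand-written note contains exactly what the user chose to keep.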

Another powerful technique is launching sub-agents. When a task will generate a lot of intermediate data that is not needed afterwards, you can instruct Claude to spawn a sub-agent with a clean context window. The sub-agent performs the heavy lifting (e.g., reading another codebase to learn its authentication flow) and returns only a final report to the parent session. This isolates the noise and keeps the main context focused.
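The isolation property can be sketched as a worker that owns a private context and hands back only its conclusion. All names here (`run_subagent`, `explore_codebase`, the string report) are invented for illustration; the real mechanism is Claude Code's own sub-agent facility.

```python
# Illustrative model of sub-agent delegation: the worker gets its own empty
# context, does the noisy exploration there, and only the report crosses back.

def run_subagent(task: str, explore) -> str:
    """Run `explore` against a private context; return only its final report."""
    private_context: list[str] = []          # clean window, isolated from parent
    report = explore(task, private_context)  # intermediate noise stays in here
    return report                            # the parent session sees just this

def explore_codebase(task: str, ctx: list[str]) -> str:
    ctx.extend(f"read file {i}" for i in range(50))  # heavy intermediate data
    return "auth flow: token issued by /login, validated in middleware"

parent_history = ["user: learn the other repo's auth flow"]
parent_history.append(run_subagent("auth flow", explore_codebase))
```

The fifty simulated file reads never touch `parent_history`; only the one-line report does, which is the whole point of delegating noisy work.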

The article concludes with a decision checklist (illustrated in the original images) that helps engineers choose between continuing, rewinding, clearing, compressing, or delegating to a sub‑agent based on the current workload and the risk of context rot.

Reference: Anthropic’s Claude Code session‑management guide (https://claude.com/blog/using-claude-code-session-management-and-1m-context)

Written by SuanNi

A community for AI developers that aggregates large-model development services, models, and compute power.