Mastering Claude Code: A Proven Workflow to Keep AI Agents Stable
This article lays out a practical, step-by-step workflow for Claude Code: define acceptance criteria first, layer context correctly, select the right execution channel, enforce constraints at the system level, and actively manage long sessions. Followed in that order, it turns experimental AI agents into reliable engineering tools.
TL;DR – 10 Essential Rules
1. Claude Code fails not because teams lack features, but because they lack a default workflow.
2. Start by defining "what counts as done", not by dumping background.
3. **CLAUDE.md** should hold only persistent contracts, not a knowledge base.
4. memory, rules, skills, and hooks are distinct layers; mixing them destabilises the system.
5. Small tasks go straight to execution; complex tasks need exploration then planning; noisy investigations belong to subagents.
6. Long-session problems are less about window size and more about unmanaged context noise.
7. Prompt warnings are not constraints; real constraints belong in hooks, permissions, and sandboxes.
8. Local sessions, worktrees, and GitHub Actions each suit different stages and should be routed separately.
9. Stable usage = clear goals, explicit verification, layered control, clean sessions.
10. The essence of best practice is turning Claude's autonomous abilities into a reproducible engineering process.
Step 1 – Define Acceptance Before Adding Context
When a task arrives, write a clear definition of "what counts as finished" before providing any background information. This mirrors Anthropic’s verification focus and OpenAI’s harness engineering approach.
Without an acceptance standard, Claude Code guesses from vague language, behaving like a fast but error‑prone intern. The recommended four items are:
- **Goal**: e.g., "Identify and fix the root cause of login failures" instead of "Check this module".
- **Definition of Done**: e.g., "All related tests pass, the UI matches the design, API responses satisfy the three field constraints".
- **Verification Method**: specify a test command, screenshot, log, build output, or performance metric.
- **Failure Diagnosis**: e.g., "First inspect the auth middleware and token-refresh logic".
Claude excels when goals and verification are explicit; it then runs the workflow reliably.
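Put together, the four items form a short acceptance brief. A minimal sketch is shown below; the task, test command, and module names are all illustrative, not taken from a real project:

```markdown
## Task: fix intermittent login failures

**Goal**: identify and fix the root cause of login failures on token refresh.

**Done when**:
- `npm test -- auth` passes with no skipped tests
- login succeeds after an expired-token refresh in the staging UI

**Verify with**: the test output plus a screenshot of the successful login.

**If it fails**: inspect the auth middleware and token-refresh logic first.
```

Pasting a brief like this at the start of a task gives the agent both a target and a concrete way to check itself.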
Step 2 – Put Context in the Right Layer
Misplaced context is a common source of instability. Anthropic's documentation splits persistent contracts (CLAUDE.md), long-term facts (memory), path-specific rules (.claude/rules/), reusable low-frequency workflows (skills), and deterministic actions (hooks). The list below summarises what belongs where:
- **CLAUDE.md**: build commands, project contracts, directory boundaries, the NEVER list, compact instructions (not tutorials or low-frequency manuals).
- **memory**: stable preferences, project facts, long-term constraints (not one-off task state).
- **.claude/rules/**: path-level, language-level, and directory-level rules (not global explanations).
- **skills**: low-frequency but reusable task-oriented workflows (not constantly resident rules).
- **hooks**: deterministic actions that must run every time (not complex reasoning steps).
Treat CLAUDE.md as a collaboration contract, not a knowledge base. Keep it concise and focused on things that must be present in every session.
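A contract-style CLAUDE.md might look like the sketch below. The commands and paths are placeholders for whatever your project actually uses; the point is the shape: short, imperative, and limited to what every session must know:

```markdown
# CLAUDE.md

## Build & test
- Build: `make build`
- Test: `make test` (must pass before any commit)

## Boundaries
- Application code lives in `src/`; never edit files under `vendor/`

## NEVER
- NEVER run destructive git commands (`push --force`, `reset --hard`)
- NEVER commit secrets or `.env` files
```

Anything longer than a screen or two is a sign that tutorials or reference material have leaked into the contract layer.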
Step 3 – Choose the Correct Execution Channel
Claude Code can work directly in the local session, in plan mode, through subagents, in isolated worktrees, or via GitHub Actions. Without a default routing strategy, everything ends up in the main session, making the conversation noisy and hard to follow.
Typical routing:
- **Small changes**: execute directly in the local session.
- **Complex multi-step tasks**: plan first, then execute.
- **Noisy investigations**: delegate to a subagent and bring back a summary.
- **Parallel tasks**: isolate each in its own worktree.
- **Rule-based batch work**: run via GitHub Actions (PR review, issue triage, scheduled checks, standardized fixes).
Choosing the right channel keeps the main session focused and prevents state contamination.
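The routing table above is really a small decision function. A sketch in Python makes the precedence explicit; the task attributes and channel labels here are this article's vocabulary, not anything built into Claude Code:

```python
from dataclasses import dataclass


@dataclass
class Task:
    steps: int          # rough number of distinct steps involved
    noisy: bool         # produces lots of exploratory tool output
    parallel: bool      # runs alongside other active tasks
    rule_based: bool    # fully specified by rules (triage, standard fixes)


def route(task: Task) -> str:
    """Pick an execution channel using the routing heuristic above.

    Checks run from most isolated to least: batch automation first,
    then worktree isolation, then subagent delegation, then planning,
    and only simple tasks fall through to the main session.
    """
    if task.rule_based:
        return "github-actions"
    if task.parallel:
        return "worktree"
    if task.noisy:
        return "subagent"
    if task.steps > 2:
        return "plan-then-execute"
    return "local-session"
```

For example, a one-step rename routes to `"local-session"`, while a sprawling five-step refactor routes to `"plan-then-execute"`.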
Step 4 – Let the System Enforce High‑Risk Actions
Plain textual reminders ("be careful", "don't delete") are ineffective. Real safety comes from system‑level constraints:
- Put always-required contracts in CLAUDE.md.
- Put occasional needs in skills or rules.
- Put mandatory per-run actions in hooks.
- Use permissions and deny-lists to block disallowed capabilities.
Hooks are ideal for actions that must not rely on the model’s memory, such as formatting, linting, protecting sensitive files, audit logging, fixed notifications, and critical command blocking.
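Hook wiring lives in a settings file rather than in prose. The sketch below shows the shape of a hooks section in `.claude/settings.json`; `PreToolUse` and `PostToolUse` are hook events Claude Code exposes, while the two script paths are placeholders you would supply yourself:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "./scripts/format-changed-files.sh" }
        ]
      }
    ],
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "./scripts/block-dangerous-commands.sh" }
        ]
      }
    ]
  }
}
```

Because these commands run deterministically on every matching tool call, formatting and command blocking happen whether or not the model remembers to do them.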
Step 5 – Actively Manage Long Sessions
Long sessions degrade not only because of token limits but also due to accumulated noise, stale constraints, and lost failure-recovery points. Three mechanisms help: /compact, /clear, and a HANDOFF.md file.
When to use /compact
- The current phase isn't finished yet, but the session is filled with tool output.
- You need to continue the same task while trimming old details.
- You want to keep key state but discard fine-grained history.
Think of it as slimming the session while the task continues.
When to use /clear
- The task goal has changed.
- The previous context no longer helps the current problem.
- You have a concise handoff ready.
This is a deliberate phase switch, not just garbage collection.
When to write HANDOFF.md
- A large task needs a staged handover.
- You are starting a new session but want to preserve critical constraints.
- You need to pass current findings, unfinished items, and risks to the next Claude instance.
Hand‑off files capture essential state for the next round, complementing memory, rules, and CLAUDE.md.
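There is no fixed format for a hand-off file, but a structure along these lines covers the state, constraints, and risks mentioned above. Every detail in this sketch is illustrative:

```markdown
# HANDOFF.md

## Current state
- Reproduced the login failure; suspected root cause is in token refresh (unconfirmed)

## Constraints that must survive this handoff
- Do not touch the session-store schema
- All changes must keep `make test` green

## Unfinished items
- Write a regression test for the expired-token path

## Known risks
- Staging caches tokens; verify against a fresh session
```

A new session that starts from a file like this loses the noise of the previous one but none of its hard-won constraints.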
Default Daily Workflow
1. **Assess complexity**: small change → direct; multi-step → plan; noisy → subagent.
2. **Write acceptance**: define the goal, success criteria, verification, and failure diagnostics.
3. **Distribute context**: CLAUDE.md for contracts, memory for long-term facts, .claude/rules/ for path rules, skills for low-frequency workflows.
4. **Select the execution channel**: main session, subagent, worktree, or GitHub Actions as appropriate.
5. **Enforce high-risk actions** via hooks and permissions.
6. **Manage session health**: /compact during a phase, /clear on a phase change, HANDOFF.md for handovers.
This straightforward sequence avoids flashy tricks and aligns with Claude Code's underlying ReAct loop: think → act → observe → repeat until verification succeeds.
Why This Order Works
At its core, the ReAct loop is conceptually just a `while True` cycle. The six-layer model (task loop, persistent contract, workflow layer, action layer, control layer, isolation layer) maps onto the steps above:
- Define acceptance → gives the loop a clear termination condition.
- Layered context → ensures each iteration receives clean input.
- Choose execution channel → isolates noisy or parallel work.
- System constraints (hooks) → add deterministic safety checks.
- Active session management → compress or reset context as the loop progresses.
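The loop this mapping rests on can be sketched in a few lines of Python. This is a conceptual skeleton, not Claude Code's actual implementation; the function names and shapes of `llm`, `tools`, and `verify` are assumptions made for illustration:

```python
def react_loop(task, llm, tools, verify, max_steps=50):
    """Minimal ReAct-style agent loop: think, act, observe, repeat
    until the acceptance check passes. The `verify` callable is the
    termination condition that Step 1's definition of done provides."""
    context = [task]
    for _ in range(max_steps):
        thought, action = llm(context)                  # think: choose the next action
        observation = tools[action.name](action.args)   # act: run the chosen tool
        context.append((thought, action, observation))  # observe: grow the context
        if verify():                                    # explicit acceptance check
            return "done"
    return "gave-up"                                    # bounded, so it cannot loop forever
```

Seen this way, each best practice tunes one part of the loop: acceptance gives `verify()` teeth, context layering keeps `context` clean, and /compact trims `context` when it bloats.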
Understanding this mapping turns a collection of tips into a coherent engineering practice.
Additional Practical Details
Key numbers:
- Claude's 200K token window effectively offers 160–180K for user content; tool definitions and system prompts consume ~25K tokens.
- Prompt caching works best when static content (system prompts, tool definitions) comes first and dynamic content (timestamps, variables) stays in user messages.
- Skill descriptors should state *when* to use the skill, not *what* the skill is, to keep token usage low and matching accurate.
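The budget arithmetic behind the first number is simple enough to sketch. Note that the overhead figure is this article's estimate, not a guaranteed constant:

```python
WINDOW = 200_000   # Claude's advertised context window, in tokens
OVERHEAD = 25_000  # rough cost of system prompt + tool definitions (estimate)


def usable_budget(window: int = WINDOW, overhead: int = OVERHEAD) -> int:
    """Tokens realistically left for user content and conversation."""
    return window - overhead
```

With the defaults, `usable_budget()` gives 175,000 tokens, which sits inside the 160–180K band cited above; a heavier tool set or longer system prompt pushes it toward the bottom of that band.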
Typical skill types:
- **Checklist**: pre-release validation, PR standards.
- **Workflow**: migration, fault diagnosis (often with `disable-model-invocation: true` for high-risk steps).
- **Domain Expert**: specialised knowledge bases.
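A checklist-type skill following the "state when, not what" advice might look like the sketch below. The skill name and body are illustrative; `name` and `description` are standard skill frontmatter fields, and `disable-model-invocation` is the flag mentioned above:

```markdown
---
name: pre-release-checklist
description: Use when preparing a release PR, or when asked to validate a build before tagging.
disable-model-invocation: true
---

1. Run the full test suite and record the output.
2. Check that the changelog covers every merged PR since the last tag.
3. Verify version numbers match across all package manifests.
```

The description answers "when should this load?" in one sentence, which keeps matching accurate without spending tokens on content that only matters once the skill is active.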
Tools should be easy to use correctly; for example, replace markdown-based user prompts with an explicit `AskUserQuestion` tool.
Periodically audit your toolset because model capabilities evolve; what was once a necessary guard may become a bottleneck.
Health‑Check Tool
You can run the community claude-health tool (install via `npx skills add tw93/claude-health`) and invoke `/health` inside Claude Code to get a quick report on CLAUDE.md, rules, skills, and hooks.
Conclusion
Claude Code’s best practice is not a collection of clever prompts but a stable default workflow: define acceptance first, layer context correctly, route execution to the appropriate channel, enforce constraints at the system level, and actively manage long sessions. Following this order turns Claude from a powerful but fragile assistant into a reliable engineering partner.