Designing Agent Tools: Key Lessons from Claude Code’s Action Space
This article distills the Claude Code team's hard‑won insights on building effective AI agents: why action‑space design matters more than raw model capability, how structured questioning improves bandwidth, when to replace Todos with Tasks, and a repeatable seven‑step loop for evolving toolsets.
Background
The Claude Code team, led by Thariq, published a detailed post titled Lessons from Building Claude Code: Seeing like an Agent. It shares the pitfalls, trial‑and‑error paths, and the methodology they ultimately settled on for building agent‑centric tools.
Core Insight: Action Space Design
The hardest problem in agent development is not model intelligence but designing the action space: the set of tools the model can invoke. Too many tools overwhelm the model; too few limit its capability. The goal is to give the model tools it can use well, and from whose failures it can recover.
Thariq’s mantra is to learn to see like an agent: observe real dialogues, spot where the model gets stuck, and iterate on the tool design.
TL;DR – Key Takeaways
Action space design determines the agent’s behavior.
Tool strength matters less than the model’s ability to use it.
Structured questioning (AskUserQuestion) dramatically reduces bandwidth loss.
Over‑engineered tools become constraints as models improve.
Progressive disclosure (layered knowledge) is more stable than stuffing everything into prompts.
Keep tool count low; each new tool adds a failure point.
1️⃣ Action Space Is Your Product
Many teams pile on capabilities—web access, database queries, multiple models—only to discover two common failures after launch:
The model hesitates at dozens of entry points, trying the wrong tool.
Developers cannot trace why a particular action was chosen, making debugging hard.
In Claude’s API, tools are built from primitives such as bash, skills, and code execution. The design dilemma becomes whether to expose a single “universal” tool or many specialized tools.
You face a tough math problem. Which tool would you choose?
Paper & pencil: low‑tech, slow.
Calculator: faster, but requires skill.
Computer: most powerful, but requires coding.
For agents, give the model the tool it can use effectively, not the one you think is strongest.
Practical risk/ability matrix (simplified):
Low risk, low ability: read‑only retrieval; safe but limited.
Medium risk, composable: structured queries, task management; higher bandwidth, needs clear contracts.
High risk, high upside: bash, code execution, network access; powerful, but requires strict permissions, state handling, and recoverability.
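The matrix above can be turned into a simple permission gate. The sketch below is illustrative only: the tool names, tiers, and registry are assumptions for the example, not Claude Code's actual permission model.

```python
from enum import Enum

class RiskTier(Enum):
    LOW = "low"        # read-only retrieval: safe but limited
    MEDIUM = "medium"  # structured queries, task management
    HIGH = "high"      # bash, code execution, network access

# Hypothetical registry mapping tool names to risk tiers.
TOOL_RISK = {
    "read_file": RiskTier.LOW,
    "query_tasks": RiskTier.MEDIUM,
    "bash": RiskTier.HIGH,
}

_ORDER = [RiskTier.LOW, RiskTier.MEDIUM, RiskTier.HIGH]

def requires_approval(tool_name: str,
                      auto_approve_up_to: RiskTier = RiskTier.LOW) -> bool:
    """Return True if this tool call should be held for explicit user approval.

    Unknown tools default to HIGH risk: the safe failure mode is to ask."""
    tier = TOOL_RISK.get(tool_name, RiskTier.HIGH)
    return _ORDER.index(tier) > _ORDER.index(auto_approve_up_to)
```

The key design choice is the default: anything not explicitly classified is treated as high risk, so forgetting to register a tool fails toward asking the user rather than toward silent execution.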
2️⃣ Structured Questioning – AskUserQuestion
Agents often need clarification, but plain‑text questions suffer from low bandwidth. Claude Code’s goal was to lower this friction.
Attempt 1: Embed questions in ExitPlanTool
Adding a question parameter to the planning tool confused the model about whether it was planning or asking, leading to role clashes.
Attempt 2: Enforce strict Markdown format
Requiring a rigid markdown schema made the system brittle; any deviation broke parsing.
Attempt 3: Dedicated AskUserQuestion tool
This single‑purpose tool presents a structured UI, blocks the agent loop until the user answers, produces a fixed, parse‑able output, and can be reused across SDKs and skills.
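A hedged sketch of what such a tool's contract might look like, written as an Anthropic‑style tool definition. The field names, limits, and the answer normalizer are assumptions for illustration, not Claude Code's actual AskUserQuestion API.

```python
# Illustrative input schema for a single-purpose question tool: short question,
# a small set of mutually exclusive options, and an optional free-form escape hatch.
ASK_USER_QUESTION_SCHEMA = {
    "name": "AskUserQuestion",
    "description": "Ask the user one clarifying question and block until answered.",
    "input_schema": {
        "type": "object",
        "properties": {
            "question": {"type": "string", "maxLength": 200},
            "options": {
                "type": "array",
                "items": {"type": "string"},
                "minItems": 2,
                "maxItems": 4,
            },
            "allow_free_form": {"type": "boolean", "default": True},
        },
        "required": ["question", "options"],
    },
}

def parse_answer(options: list[str], raw: str) -> str:
    """Normalize the user's reply to one of the fixed options when possible,
    so downstream code always receives a parse-able value."""
    for opt in options:
        if raw.strip().lower() == opt.lower():
            return opt
    return raw.strip()  # free-form fallback
```

Because the output side is fixed and structured, the agent loop never has to guess what shape the answer will take, which is exactly what the two failed attempts above lacked.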
3️⃣ From Todos to Tasks
Early Claude Code used a TodoWrite tool to keep the model on track. As the model grew stronger, the todo list became a restrictive script, especially with sub‑agents.
Switching to a Task tool transformed the workflow into a collaborative protocol:
Todos: a linear list; good for single‑threaded, anti‑drift scenarios.
Tasks: a DAG with dependencies, state sync, output archiving, and rollback; essential for multi‑agent coordination.
Tools inevitably expire; when the model’s capabilities evolve, old tools must be revisited.
4️⃣ Progressive Context Building
Claude Code initially relied on RAG vector retrieval, which required heavy indexing and still fed the model static context. To let the model find information itself, a Grep‑style tool was added, later formalized as Skills under the principle of progressive disclosure:
Provide an entry point, let the model fetch files layer by layer.
Never load the entire knowledge base into the prompt.
Knowledge layering example:
Layer 0 – Index (200‑500 chars): list capabilities and entry points.
Layer 1 – Pattern cards (500‑1500 chars): checklists, examples, negative cases; must be executable.
Layer 2 – Full manual (>2000 chars): loaded only when needed.
Recursive reading should be bounded by two metrics: search depth and payoff. Stop when deeper searches no longer improve answer quality.
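The layering above can be sketched as a depth-bounded fetch over a linked knowledge base. The `KB` structure and its contents are hypothetical; a real implementation would also apply the payoff bound by scoring whether each deeper layer actually improved the answer.

```python
# Hypothetical layered knowledge base: the index points to pattern cards,
# which point to full manuals. Nothing below Layer 0 enters the prompt
# unless the agent walks to it.
KB = {
    "index":         {"text": "Capabilities: deploy, rollback",
                      "children": ["deploy_card"]},
    "deploy_card":   {"text": "Checklist: build, test, ship",
                      "children": ["deploy_manual"]},
    "deploy_manual": {"text": "Full deployment manual ...",
                      "children": []},
}

def load_context(entry: str, max_depth: int = 2) -> list[str]:
    """Breadth-first fetch, bounded by search depth. max_depth=0 loads
    only the index; each extra level pulls in the next layer of files."""
    texts, frontier = [], [entry]
    for _ in range(max_depth + 1):
        if not frontier:
            break
        texts.extend(KB[key]["text"] for key in frontier)
        frontier = [child for key in frontier for child in KB[key]["children"]]
    return texts
```

With `max_depth=0` the agent sees only the 200‑500 character index; raising the bound progressively discloses cards and manuals instead of front-loading the whole knowledge base.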
5️⃣ Seven‑Step Iterative Tool Design Loop
Find friction in real model outputs.
Pick the smallest lever (prompt tweak before adding a tool).
Make the interface narrow – one tool, one responsibility.
Give structure to the machine (schemas, enums, defaults).
Make failures recoverable (retries, rollbacks).
Persist outputs as reviewable files.
Periodically audit tools for obsolescence after model upgrades.
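Step 5 (recoverable failures) is the one most often skipped, so here is a minimal sketch of a cheap retry wrapper. The shape is one possible interpretation of "recoverable", assumed for illustration rather than taken from Claude Code.

```python
import time

def run_recoverably(action, max_retries: int = 2, backoff: float = 0.0):
    """Run a tool action with cheap retries and simple linear backoff.

    Transient failures are retried; once retries are spent the exception
    is re-raised, so a persistent failure stays visible to the agent loop
    instead of being silently swallowed."""
    for attempt in range(max_retries + 1):
        try:
            return action()
        except Exception:
            if attempt == max_retries:
                raise
            time.sleep(backoff * (attempt + 1))
```

Pairing this with step 6 (persisting outputs as files) means a retried action can also be replayed and inspected after the fact.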
6️⃣ Decision Table – When to Add a Tool
Unstable phrasing or occasional format drift → tweak prompts, low cost, no action‑space change.
Need stable, parseable output → add a tool with schema and enums.
Large but rarely used knowledge → layered files with progressive disclosure.
Complex lookup (docs, code) with a pattern → sub‑agent with search strategy.
7️⃣ Anti‑Patterns
One tool trying to plan, ask, and execute simultaneously.
Relying on the model to always output a strict text format.
Treating a Todo list as an immutable script.
Embedding an entire document in the system prompt.
Prioritizing raw capability over recoverability.
8️⃣ Tool Design Is an Art, Not a Science
Effective tool design requires continuous experimentation, output analysis, and incremental improvements, always returning to the principle of “seeing like an agent.”
Practical Checklist (12 items)
Provide a structured “Ask” entry point (AskUserQuestion or equivalent).
Design the UI contract: short questions, mutually exclusive options, defaults, optional free‑form input.
Replace static Todos with collaborative Tasks.
Ensure Tasks express dependencies, state, and output locations.
Give the model a self‑search tool (Grep) for code‑base context.
Organize knowledge in layered files (index, pattern cards, full manual).
Apply progressive disclosure – keep rarely used info out of the prompt.
Limit tool count – each new tool adds a failure point.
Make important actions replayable (logs, traces, parameters).
Optimize for recoverability (cheap retries, clear preconditions, observable state).
Persist deliverables as files for review.
After each model upgrade, reassess and prune outdated tools.
Source
Original post by Thariq ( @trq212 ) on X:
https://x.com/trq212/status/2027463795355095314