From Harness Design to Managed Agents: Anthropic’s Full‑Stack Agent Engineering

The article examines Anthropic’s evolution of AI agent infrastructure—from single‑agent loops and context compression to multi‑agent harnesses, managed sessions, sandbox isolation, and robust context engineering—highlighting design trade‑offs, performance gains, security guarantees, and practical principles for building production‑grade agents.


Harness = Container

Anthropic defines the harness as everything that wraps an LLM: prompts, tool connections, inter‑agent collaboration structures, and feedback loops. The core tension in harness design is that it must assume certain model limitations, even though model capabilities continuously expand.

From Single Agent to Meta‑Framework

Stage 1: Single Agent + Compression. A single model runs inside the context window and uses context compression for long tasks. Failure modes include the model attempting to finish the entire application in one go and exhausting the context mid‑task, or prematurely declaring completion when the window is near its limit. Even with compression, the next round of instructions may be unclear, requiring costly state reconstruction.

Stage 2: Dual‑Agent Framework (Initializer + Coder). Two roles solve the state‑transfer problem: the initializer decomposes the request, sets up the environment, and writes a progress file (claude-progress.txt) while preserving a clear git history; the coder reads the progress file in each subsequent session, performs one increment of work, updates the file, and exits. This mirrors human engineers writing hand‑off notes and yields a large performance boost over the baseline.
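The initializer/coder hand‑off can be sketched as follows. This is a minimal illustration, assuming an in‑memory map in place of a real file system; names like ProgressFile and coderSession are hypothetical, not Anthropic's actual implementation.

```typescript
type ProgressFile = { done: string[]; next: string[] };

const files = new Map<string, string>(); // stand-in for a real file system

// Initializer: decompose the request into tasks and write the progress file.
function initialize(tasks: string[]): void {
  const progress: ProgressFile = { done: [], next: tasks };
  files.set("claude-progress.txt", JSON.stringify(progress, null, 2));
}

// Coder: read progress, do one increment of work, update the file, exit.
function coderSession(doWork: (task: string) => void): string | null {
  const progress: ProgressFile = JSON.parse(files.get("claude-progress.txt")!);
  const task = progress.next.shift();
  if (!task) return null; // nothing left: the whole run is complete
  doWork(task);
  progress.done.push(task);
  files.set("claude-progress.txt", JSON.stringify(progress, null, 2));
  return task;
}

initialize(["scaffold project", "add CRUD routes", "write tests"]);
coderSession(t => console.log("working on:", t));
```

Each coder session starts with a clean context and only the progress file as shared state, which is exactly what makes the hand‑off cheap.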

Stage 3: Triple‑Agent Framework (Planner + Generator + Evaluator). Adding an evaluator separates execution from self‑assessment, addressing the tendency of models to over‑score their own output, especially in subjective domains like design.

Stage 4: Meta‑Framework (Managed Agents). Agent components are virtualized into stable interfaces, forming the basis of Managed Agents.

From Prompting to Information Resource Management

As agent tasks become more complex, the core challenge shifts from writing good prompts to managing the information that drives model output. Context engineering—an evolution of prompt engineering—curates the set of data that flows into the LLM at each inference step, including system prompts, tool definitions, external data, and message history.

The fundamental obstacle is "Context Rot": as token count grows, the model’s ability to recall early information degrades because each token attends to all others, diluting attention budget. Simply enlarging the window does not solve this; information density and relevance are key.

Anthropic groups context‑engineering techniques into three pillars:

Compaction. As the conversation approaches the window limit, the history is summarized at high fidelity and the summary becomes the start of a new session. Over‑aggressive compaction can discard subtle context needed later. The safest form is Tool Result Clearing—keeping the fact that a tool was called while dropping its full raw output.
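Tool Result Clearing is safe because it is purely mechanical—no summarization model is involved. A sketch, assuming a simple event shape (the field names here are illustrative, not Anthropic's wire format):

```typescript
type Event =
  | { kind: "message"; role: "user" | "assistant"; text: string }
  | { kind: "tool_call"; name: string; input: string }
  | { kind: "tool_result"; name: string; output: string };

// Keep the fact that each tool was called, but drop the raw output,
// which is usually the bulkiest and least compressible part of history.
function clearToolResults(history: Event[]): Event[] {
  return history.map(e =>
    e.kind === "tool_result"
      ? { ...e, output: `[cleared: result of ${e.name} omitted]` }
      : e
  );
}

const history: Event[] = [
  { kind: "message", role: "user", text: "summarize the repo" },
  { kind: "tool_call", name: "read_file", input: "README.md" },
  { kind: "tool_result", name: "read_file", output: "…thousands of tokens…" },
];
const compacted = clearToolResults(history);
```

The structure of the trajectory survives intact; only the token‑heavy payloads are dropped.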

Structured Note‑taking. Agents write key state, decisions, and progress to persistent storage (file system or database) and read it back when needed. Example file: claude-progress.txt. This gives agents an external working memory across sessions.

Multi‑Agent Architecture. Tasks are split among multiple agents, each with a clean context window. The main agent orchestrates and aggregates results, preventing cross‑task contamination.

Balancing between overly hard‑coded rules (fragile agents) and overly vague instructions (model drift) is essential.

Managed Agents

Anthropic builds Managed Agents to solve the classic systems‑design problem of supporting programs that have not yet been imagined. The design mirrors OS virtualization, where stable interfaces (e.g., read()) hide underlying implementation details.

Managed Agents expose three stable interfaces:

Session. An append‑only log of all events, persisted outside the framework.

Harness. The component that invokes the model and routes tool calls to the appropriate infrastructure.

Sandbox. An isolated execution environment where the model can run code and edit files.

Each interface makes minimal assumptions about the others, allowing independent replacement or failure.
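One possible TypeScript rendering of the three interfaces is sketched below. Method names follow the article's API mentions (execute, getEvents), but the exact signatures and the InMemorySession class are assumptions for illustration only.

```typescript
interface SessionEvent { seq: number; type: string; payload: unknown }

interface Session {
  append(event: Omit<SessionEvent, "seq">): SessionEvent; // append-only log
  getEvents(fromSeq?: number): SessionEvent[];
}

interface Sandbox {
  execute(name: string, input: string): string; // uniform tool-call API
}

interface Harness {
  step(session: Session, sandbox: Sandbox): void; // one model turn + tool routing
}

// Trivial in-memory Session, useful for tests; production would persist events.
class InMemorySession implements Session {
  private events: SessionEvent[] = [];
  append(e: Omit<SessionEvent, "seq">): SessionEvent {
    const ev = { seq: this.events.length, ...e };
    this.events.push(ev);
    return ev;
  }
  getEvents(fromSeq = 0): SessionEvent[] {
    return this.events.filter(ev => ev.seq >= fromSeq);
  }
}
```

Because the harness only sees the Session and Sandbox interfaces, either side can be re‑implemented (persisted differently, moved to another host) without touching the other.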

From "Pets" to "Cattle"

Initially all components shared a single container, a classic "pet" server pattern: container failure caused session loss, and debugging was opaque. Decoupling turned each component into "cattle" that can be swapped freely. The harness now calls the sandbox through a uniform execute(name, input) → string API; sandbox failures surface as tool‑call errors, letting the model decide whether to retry. New sandboxes are provisioned with provision({resources}). If the harness crashes, it can be revived with wake(sessionId), retrieve the session log via getSession(id), and resume from the last event.
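A hedged sketch of the recovery path: the store and replay logic below are illustrative stand‑ins for the article's wake(sessionId) and getSession(id), assuming the log is persisted outside the harness process.

```typescript
type Ev = { seq: number; note: string };
const sessionStore = new Map<string, Ev[]>(); // persisted outside the harness

function emit(sessionId: string, note: string): void {
  const log = sessionStore.get(sessionId) ?? [];
  log.push({ seq: log.length, note });
  sessionStore.set(sessionId, log);
}

// A revived harness reads the persisted log and resumes after the last event,
// rather than restarting the task from scratch.
function wake(sessionId: string): { lastSeq: number; resumeFrom: number } {
  const log = sessionStore.get(sessionId) ?? [];
  const lastSeq = log.length - 1;
  return { lastSeq, resumeFrom: lastSeq + 1 };
}

emit("s1", "planned tasks");
emit("s1", "cloned repo");
// …harness process crashes here…
const revived = wake("s1");
```

Because the harness holds no state the log does not, "crash and wake" is just "re-read the log"—the same mechanism used for normal start‑up.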

This change cut time‑to‑first‑token (TTFT) dramatically: p50 TTFT dropped ~60 % and p95 TTFT dropped >90 %, because sessions that do not need a sandbox can start inference immediately from the persisted log.

Session ≠ Context Window

Traditional frameworks make irreversible decisions—compression, trimming, dropping—when processing context, which can discard tokens that later become important. Managed Agents keep a persistent session log outside the LLM’s context window. The getEvents() API lets the harness slice the log, rewind, or replay events, and optionally transform them before feeding them back into the model. This cleanly separates a recoverable context store (handled by the session) from any mutable context‑management strategy (handled by the harness).
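The separation can be made concrete: the log is immutable, and each context strategy is just a different read‑only view over it. In the sketch below, buildContext is an illustrative harness‑side policy, not a real API.

```typescript
type LogEvent = { seq: number; role: string; text: string };

const log: LogEvent[] = [
  { seq: 0, role: "user", text: "refactor the auth module" },
  { seq: 1, role: "assistant", text: "plan: map call sites, extract helpers" },
  { seq: 2, role: "tool", text: "grep output: 400 matches…" },
  { seq: 3, role: "assistant", text: "extracted helpers, tests pass" },
];

// The log is never mutated; each strategy produces a fresh view over it.
function buildContext(events: LogEvent[], opts: { dropToolOutput: boolean }): string[] {
  return events
    .filter(e => !(opts.dropToolOutput && e.role === "tool"))
    .map(e => `${e.role}: ${e.text}`);
}

const lean = buildContext(log, { dropToolOutput: true });
const full = buildContext(log, { dropToolOutput: false });
```

A "wrong" compression decision is now harmless: the next turn can simply build a different view from the same untouched log.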

Multiple Brains, Multiple Hands

Decoupling the brain (harness) from the hands (sandbox) solves two scaling dimensions. Stateless harness instances can be run in parallel for many concurrent sessions, and they no longer assume all resources are co‑located, enabling private‑cloud deployments. Each sandbox is a tool with a uniform interface, allowing any custom tool, MCP server, or Anthropic‑provided tool to be swapped without the harness needing to know its implementation.

Credential Management and Sandbox Isolation

In tightly coupled designs, model‑generated code runs in the same container as credentials, allowing the model to read environment variables and obtain unrestricted tokens. Managed Agents address this with two patterns:

Bind credentials to resources but keep them hidden from the sandbox. For example, a Git token is used only during sandbox initialization to clone a repository; subsequent push/pull operations happen without the sandbox ever seeing the token.

Store credentials in an external vault. When a custom tool (via MCP) is invoked, a proxy holding a reference token fetches the real credential from the vault and performs the external call, while the framework never sees the credential.

The structural guarantee is that generated code never has a direct path to the credential store, a stronger safety measure than merely limiting token permissions.
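The vault pattern can be sketched as below. All names here (vault, proxyCall, callTool) are illustrative: the point is only that sandbox‑visible code handles an opaque reference, and the component that resolves it runs outside the sandbox.

```typescript
// Runs outside the sandbox; the only component that ever sees secrets.
const vault = new Map<string, string>([["ref-github-01", "real_secret_token"]]);

function proxyCall(refToken: string, doRequest: (secret: string) => string): string {
  const secret = vault.get(refToken);
  if (!secret) throw new Error("unknown credential reference");
  return doRequest(secret);
}

// What model-generated code sees: a tool that accepts only the reference.
function callTool(refToken: string): string {
  return proxyCall(refToken, secret =>
    `pushed with token ending …${secret.slice(-3)}`
  );
}

const result = callTool("ref-github-01");
```

Even if the model exfiltrates everything in its environment, all it holds is "ref-github-01"—a name that is meaningless outside the proxy.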

Agent‑Computer Interface (ACI)

Anthropic likens tool design for agents to human‑computer interaction (HCI): poorly designed tools systematically degrade agent performance, even with powerful models. Good tool design follows three principles:

Clarity. Descriptions must let the model understand when and how to use the tool, including examples, edge cases, input formats, and differentiation from similar tools.

Poka‑yoke (error‑proofing). Parameter design should make misuse structurally difficult, not merely warn against it in documentation.

Least Privilege. Tools expose only the minimal operations required for the task, reducing both security risk and cognitive load.
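The poka‑yoke and least‑privilege principles show up directly in tool signatures. A contrast, using a hypothetical searchIssues tool (not a real API):

```typescript
// Error-prone: free-form strings invite malformed input the model must guess at.
//   function searchIssues(query: string, state: string, limit: string): string

// Poka-yoke: constrained types make misuse hard to express at all.
type IssueState = "open" | "closed";

function searchIssues(query: string, state: IssueState, limit: 10 | 25 | 50): string {
  // Least privilege: read-only search; no mutation surface is exposed.
  return `searched "${query}" (${state}, limit ${limit})`;
}

const out = searchIssues("login bug", "open", 25);
```

With literal unions, an invalid state or limit is rejected before the tool ever runs, so the model never has to recover from a malformed call.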

In multi‑agent scenarios, sub‑agent interfaces should be treated like ordinary tools: the main agent calls a sub‑agent without needing to know its internal complexity.

The Model Context Protocol (MCP) standardizes tool integration, reducing fragmentation across frameworks and platforms.

From Single Agent to Collaborative Network

When tasks exceed a single context window, multi‑agent orchestration becomes necessary. Core value lies in parallelizing and specializing sub‑problems while giving each a clean context. Common patterns include:

Supervisor Pattern. A central agent decomposes work and delegates to specialized workers, then aggregates results. Clear control flow but a single point of failure.

Parallel Fan‑out. Independent sub‑tasks are processed concurrently by multiple agents, suitable for low‑dependency workloads.

Pipeline. Agents are arranged in a sequence where each output feeds the next, matching the planner → generator → evaluator pipeline.

Decentralized Coordination. Agents communicate peer‑to‑peer without a central coordinator, offering resilience at the cost of higher debugging complexity.
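The supervisor pattern combined with parallel fan‑out can be sketched as follows. The worker body is a trivial stand‑in for a model call, and the comma‑split decomposition is deliberately naive; only the shape—decompose, delegate with disjoint contexts, aggregate—is the point.

```typescript
type SubTask = { id: number; input: string };

// Each worker sees only its own sub-task: a clean context window.
function worker(task: SubTask): string {
  return `result(${task.id}): ${task.input.toUpperCase()}`;
}

function supervisor(request: string): string {
  // Decompose (here: naive split), delegate, then aggregate.
  const subtasks: SubTask[] = request
    .split(",")
    .map((part, i) => ({ id: i, input: part.trim() }));
  const results = subtasks.map(worker); // independent: could run concurrently
  return results.join("\n");
}

const report = supervisor("audit deps, scan licenses, check CVEs");
```

The supervisor is the single point of failure the article warns about—if it crashes, its session log (not the workers) is what lets the run resume.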

The main failure mode is agents acting on conflicting assumptions due to inconsistent or incomplete context. Anthropic recommends starting with a single‑agent system and only introducing multi‑agent architecture when the single agent repeatedly hits scope limits, latency bottlenecks, or accuracy problems, because multi‑agent systems increase runtime cost, error‑propagation paths, and debugging difficulty.

Observability and Long‑Running Task State Management

Long‑running agents (hours or days) need robust observability: every tool call, model response, error, and state change is emitted via emitEvent(id, event) to a persistent log. This enables precise replay of any moment without attaching a debugger to a live container, a major improvement over coupled designs where engineers had to shell into a container holding user data.
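Replay over a persisted event log can be sketched like this, assuming emitEvent simply appends to durable storage; the event shapes are illustrative.

```typescript
type TraceEvent = { seq: number; kind: "model" | "tool_call" | "error"; detail: string };

const trace: TraceEvent[] = []; // stand-in for a durable event store

function emitEvent(e: Omit<TraceEvent, "seq">): void {
  trace.push({ seq: trace.length, ...e });
}

// Reconstruct agent state as of a given event—no live container required.
function replayUntil(seq: number): string[] {
  return trace.filter(e => e.seq <= seq).map(e => `${e.kind}: ${e.detail}`);
}

emitEvent({ kind: "model", detail: "planned migration" });
emitEvent({ kind: "tool_call", detail: "ran tests" });
emitEvent({ kind: "error", detail: "flaky network" });

const stateBeforeError = replayUntil(1);
```

Debugging becomes a query over the log ("what did the agent know at seq 1?") instead of a shell session into a container holding user data.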

Idempotency is also critical; distributed systems must ensure that repeated tool calls (e.g., due to network retries) do not cause unintended side effects.
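A common way to get this property is a caller‑supplied idempotency key: a retried call returns the cached result instead of re‑running the side effect. The key scheme and cache below are illustrative.

```typescript
const completed = new Map<string, string>(); // completed calls, keyed by idempotency key
let sideEffects = 0;

function executeOnce(idempotencyKey: string, run: () => string): string {
  const cached = completed.get(idempotencyKey);
  if (cached !== undefined) return cached; // retry: no second side effect
  const result = run();
  completed.set(idempotencyKey, result);
  return result;
}

const doDeploy = () => { sideEffects += 1; return "deployed build 42"; };
const first = executeOnce("deploy-42", doDeploy);
const retried = executeOnce("deploy-42", doDeploy); // e.g. after a network timeout
```

The same key must mean "the same intended operation"—deriving it from the tool name plus a hash of the inputs is a common choice.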

Core Principles for Production‑Grade Agents

Principle 1: Continuously reassess framework assumptions as model capabilities evolve. Patches for older models may become dead weight or harmful after upgrades.

Principle 2: Design around stable interfaces, not concrete implementations. Session, sandbox, and tool‑call boundaries should be abstracted.

Principle 3: Treat context as a limited energy budget, not an infinite store. Only bring information into the window when it has a clear justification.

Principle 4: Separate evaluation from execution. Independent evaluators provide objective feedback, avoiding self‑scoring bias.

Principle 5: Enforce structural security boundaries. Credentials must never be reachable from the code‑execution environment.

Principle 6: Start with the simplest solution. Add complexity only when a concrete need is demonstrated.

Principle 7: Interfaces are permanent, frameworks are temporary. Future frameworks should be able to plug into the same stable interfaces, even for programs that do not yet exist.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: AI agents · Observability · Security · context engineering · Managed Agents · harness design
Written by

AI Engineer Programming

In the AI era, defining problems is often more important than solving them; here we explore AI's contradictions, boundaries, and possibilities.
