Multi‑Agent Collaboration: How AI Commands AI and the New Complexity in Harness Engineering

This article dissects Claude Code's multi‑agent architecture, explaining why single‑agent designs hit context, serial, and failure walls, comparing leading frameworks, and detailing Claude's AgentTool recursion safeguards, Coordinator control‑data separation, UDS‑based swarms, IterationBudget controls, and the three engineering guardrails that keep multi‑agent systems reliable.

James' Growth Diary

James introduces the most complex area of Harness Engineering: multi-agent collaboration. When an AI system evolves from a single agent with tools to multiple agents that schedule each other, engineering complexity rises sharply, because the harness must now handle coordination, communication, error handling, and budget control.

1. The Problem – Why One Agent Is Not Enough

Three walls limit a single‑agent approach:

Context window: processing many files or API calls quickly exceeds token limits.

Serial waiting: independent tasks (e.g., architecture analysis, security check, test verification, style review) must run sequentially, wasting time.

Failure without isolation: a sub-task crash halts the whole session.

Multi‑agent systems address these with:

Context isolation – each sub‑agent sees only its own task.

Parallel execution – total time becomes the longest task, not the sum.

Fault isolation – a sub‑agent crash does not affect the parent.
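
As a minimal sketch of parallel execution with fault isolation (not Claude Code's implementation; runIsolated, partition, and the Outcome type are hypothetical names), each sub-task can be wrapped so that a rejection becomes a value instead of propagating to the parent:

```typescript
// Hypothetical sketch: run independent sub-tasks in parallel and isolate failures.
type Outcome<T> =
  | { name: string; ok: true; value: T }
  | { name: string; ok: false; error: string };

type SubTask<T> = { name: string; run: () => Promise<T> };

// Wrap each task so a rejection is captured as data rather than thrown.
function runIsolated<T>(tasks: SubTask<T>[]): Promise<Outcome<T>[]> {
  return Promise.all(
    tasks.map((t) =>
      // Promise.resolve().then(...) also captures synchronous throws from run().
      Promise.resolve()
        .then(() => t.run())
        .then(
          (value): Outcome<T> => ({ name: t.name, ok: true, value }),
          (err): Outcome<T> => ({ name: t.name, ok: false, error: String(err) }),
        ),
    ),
  );
}

// Pure helper: split outcomes so the parent can decide what to retry.
function partition<T>(outcomes: Outcome<T>[]) {
  return {
    succeeded: outcomes.filter((o) => o.ok).map((o) => o.name),
    failed: outcomes.filter((o) => !o.ok).map((o) => o.name),
  };
}
```

Because all tasks start before any is awaited, wall-clock time is bounded by the slowest task rather than the sum, and the parent always receives one outcome per task.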

2. Industry Solutions

Before diving into Claude Code, the article surveys three mature frameworks:

LangGraph

Models agents as nodes in a directed graph (state‑machine + graph). It offers visual debugging but requires pre‑defining the graph, limiting dynamic task creation.

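# LangGraph multi-agent sketch
# AgentState (a TypedDict), the three agent functions, and should_revise are assumed to be defined elsewhere
from langgraph.graph import StateGraph, START, END
builder = StateGraph(AgentState)
builder.add_node("researcher", research_agent)
builder.add_node("writer", write_agent)
builder.add_node("reviewer", review_agent)
# START → researcher → writer → reviewer; the reviewer either loops back to the writer or ends
builder.add_edge(START, "researcher")
builder.add_edge("researcher", "writer")
builder.add_edge("writer", "reviewer")
builder.add_conditional_edges("reviewer", should_revise, {
    "revise": "writer",
    "done": END
})
graph = builder.compile()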

AutoGen

Uses a conversation model where each agent is a chat participant. This enables natural‑language coordination but makes precise control and budget enforcement difficult.

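# AutoGen multi-agent dialogue sketch (classic pyautogen 0.2-style API)
from autogen import AssistantAgent, UserProxyAgent, GroupChat, GroupChatManager
llm_config = {"model": "gpt-4"}
coder = AssistantAgent("coder", llm_config=llm_config)
tester = AssistantAgent("tester", llm_config=llm_config)
reviewer = AssistantAgent("reviewer", llm_config=llm_config)
user = UserProxyAgent("user", human_input_mode="NEVER", code_execution_config=False)
groupchat = GroupChat(agents=[user, coder, tester, reviewer], messages=[], max_round=10)
# A GroupChatManager is needed to actually drive the conversation
manager = GroupChatManager(groupchat=groupchat, llm_config=llm_config)
user.initiate_chat(manager, message="Implement and test a small utility function")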

CrewAI

Organises agents as a team with explicit roles, goals, and backstories. It is easy to start but hides the underlying scheduler, limiting fine‑grained control.

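# CrewAI role-based sketch
from crewai import Agent, Task, Crew
researcher = Agent(role="Researcher", goal="Research technical solutions in depth", backstory="10 years of architecture experience")
writer = Agent(role="Writer", goal="Turn technical content into readable articles", backstory="Technical writing expert")
# expected_output is required by recent CrewAI releases; the values here are placeholders
research_task = Task(description="Research multi-agent frameworks", expected_output="A short comparison of the frameworks", agent=researcher)
write_task = Task(description="Write a technical analysis article", expected_output="A finished article draft", agent=writer)
crew = Crew(agents=[researcher, writer], tasks=[research_task, write_task])
result = crew.kickoff()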

A comparison table highlights coordination mode, flexibility, controllability, debugging difficulty, and budget support for each framework.

3. Claude Code’s AgentTool – Recursive Sub‑Agent Protection

AgentTool launches a full sub‑agent via Agent(prompt="…"). The execution chain includes parsing the effective type, permission filtering, assembling an isolated tool pool, and running the sub‑agent’s query() loop with its own system prompt.

// AgentTool execution chain (src/tools/AgentTool/AgentTool.tsx)
AI calls Agent(prompt, subagent_type?)
  ↓
AgentTool.call()
  ├─ parse effectiveType (Agent / Fork / GP fallback)
  ├─ filterDeniedAgents()   // permission check
  ├─ assembleToolPool()      // independent tool pool
  └─ runAgent()
      ├─ getAgentSystemPrompt() // dedicated system prompt
      └─ query()                // standard agentic loop
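
The permission-filtering and tool-pool steps in this chain can be pictured roughly as follows. This is an illustration only: the article does not show the real signatures, so the AgentDef and Tool shapes, the deny set, and both function bodies are assumptions, and the functions carry a Sketch suffix to keep them distinct from the real filterDeniedAgents and assembleToolPool.

```typescript
// Illustrative only: the real signatures in AgentTool.tsx are not shown in the article.
interface AgentDef { type: string; allowedTools: string[] }
interface Tool { name: string }

// Permission check: drop agent types that the current policy denies.
function filterDeniedAgentsSketch(agents: AgentDef[], denied: Set<string>): AgentDef[] {
  return agents.filter((a) => !denied.has(a.type));
}

// Build a fresh tool list for the sub-agent so it never shares the parent's tool objects.
function assembleToolPoolSketch(parentTools: Tool[], agent: AgentDef): Tool[] {
  const allowed = new Set(agent.allowedTools);
  return parentTools
    .filter((t) => allowed.has(t.name))
    .map((t) => ({ ...t })); // shallow copy keeps the pools independent
}
```

Copying the tool entries rather than passing the parent's array through is one simple way to get the "independent tool pool" property: a sub-agent that mutates its pool cannot affect the parent.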

Two guardrails prevent infinite recursion:

Runtime check – querySource marks a fork; canFork() rejects further forks if the source starts with 'agent:builtin:fork'.

Persistent tag – a <fork-boilerplate> marker is injected into the sub-agent’s messages; the main loop scans message history for this tag to block additional forks.
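
A minimal sketch of how the two guards might combine is shown below. Only the 'agent:builtin:fork' prefix and the marker name come from the article; the Message shape and the function names are hypothetical, and the marker is written with an ASCII hyphen.

```typescript
// Hypothetical sketch of the two fork guards described above.
const FORK_SOURCE_PREFIX = "agent:builtin:fork"; // prefix quoted in the article
const FORK_TAG = "<fork-boilerplate>";           // marker injected into fork messages

interface Message { role: "user" | "assistant"; content: string }

// Guard 1 (runtime): a query that originates from a fork may not fork again.
function isForkSource(querySource: string): boolean {
  return querySource.startsWith(FORK_SOURCE_PREFIX);
}

// Guard 2 (persistent): the marker travels with the message history.
function historyHasForkTag(messages: Message[]): boolean {
  return messages.some((m) => m.content.includes(FORK_TAG));
}

// A fork is allowed only when both guards pass.
function canForkSketch(querySource: string, messages: Message[]): boolean {
  return !isForkSource(querySource) && !historyHasForkTag(messages);
}
```

The point of keeping both checks is redundancy: if the runtime source label is lost or reset, the marker in the message history still blocks a second fork.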

4. Coordinator Mode – Separating Control and Data Planes

In Coordinator mode, the Coordinator talks to the user and directs the work, while Workers execute the tasks and report back as structured XML. The Coordinator’s system prompt defines its role and constraints.

`You are Claude Code, an AI assistant that orchestrates software engineering tasks across multiple workers.
## Your Role
- Help the user achieve their goal
- Direct workers to research, implement and verify code changes
- Synthesize results and communicate with the user
- Answer questions directly when possible — don’t delegate work you can handle without tools`

Coordinator tools include AGENT_TOOL_NAME, SEND_MESSAGE_TOOL_NAME, TASK_STOP_TOOL_NAME, and a PR subscription tool. Workers have a broader tool set (Bash, FileRead, etc.) but lack scheduling tools.

Worker results are returned as XML, enabling precise parsing of task ID, status, summary, result, token usage, and duration.

<task-notification>
  <task-id>a1b2c3d4e</task-id>
  <status>completed</status>
  <summary>Agent "Fix auth bug" completed</summary>
  <result>Found null pointer in src/auth/validate.ts:42...</result>
  <usage>
    <total_tokens>8432</total_tokens>
    <tool_uses>12</tool_uses>
    <duration_ms>23400</duration_ms>
  </usage>
</task-notification>
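
To show why a fixed structure makes results easy to consume, here is a minimal parsing sketch. It assumes flat, well-formed tags exactly as in the sample above and uses regular expressions for brevity; the interface and function names are hypothetical, and a real XML parser would be the safer choice for untrusted content.

```typescript
// Minimal sketch: extract fields from a <task-notification> with regular expressions.
interface TaskNotification {
  taskId: string;
  status: string;
  summary: string;
  result: string;
  totalTokens: number;
  toolUses: number;
  durationMs: number;
}

// Return the trimmed text between <tag> and </tag>, or throw if the tag is absent.
function tagText(xml: string, tag: string): string {
  const match = new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`).exec(xml);
  if (!match) throw new Error(`missing <${tag}>`);
  return match[1].trim();
}

function parseTaskNotification(xml: string): TaskNotification {
  const body = tagText(xml, "task-notification");
  return {
    taskId: tagText(body, "task-id"),
    status: tagText(body, "status"),
    summary: tagText(body, "summary"),
    result: tagText(body, "result"),
    totalTokens: Number(tagText(body, "total_tokens")),
    toolUses: Number(tagText(body, "tool_uses")),
    durationMs: Number(tagText(body, "duration_ms")),
  };
}
```

With the fields typed, a coordinator can aggregate token usage or react to a status without re-reading free-form text.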

5. Agent Swarms – Real Parallelism with Unix Domain Sockets

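Claude Code runs agents in separate processes on the same machine and connects them over Unix Domain Sockets (UDS), which the article credits with sub-millisecond latency and no port conflicts. A three-stage address parser distinguishes uds: (point-to-point), bridge: (IDE bridge), and everything else. A bare absolute path is also treated as a UDS address, and any remaining target falls through to the "other" scheme, which is where mailbox delivery applies.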

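function parseMessageAddress(target: string): { scheme: "uds" | "bridge" | "other"; target: string } {
  // Explicit prefixes: strip the scheme and keep the rest as the target
  if (target.startsWith("uds:")) return { scheme: "uds", target: target.slice(4) };
  if (target.startsWith("bridge:")) return { scheme: "bridge", target: target.slice(7) };
  // A bare absolute path is treated as a socket path
  if (target.startsWith("/")) return { scheme: "uds", target };
  // Anything else is left for other delivery mechanisms
  return { scheme: "other", target };
}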

Mailbox files store JSONL messages; agents inject unread messages into the LLM context instead of using a dedicated receiver thread.

// Mailbox message format (JSONL line)
{ "from": "coordinator", "text": "Please analyze the structure of the auth module first", "timestamp": 1718323200000, "read": false }
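
The read-and-inject step can be sketched as below. This is an in-memory illustration under stated assumptions: real mailboxes are files, and the function names and the format of the injected context text are hypothetical. Only the four message fields come from the format shown above.

```typescript
// Hypothetical in-memory sketch of mailbox handling; real mailboxes are files on disk.
interface MailboxMessage { from: string; text: string; timestamp: number; read: boolean }

// Parse a JSONL mailbox: one JSON object per non-empty line.
function parseMailbox(jsonl: string): MailboxMessage[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as MailboxMessage);
}

// Collect unread messages as context text and return the mailbox with every message marked read.
function drainUnread(jsonl: string): { contextBlock: string; updatedJsonl: string } {
  const messages = parseMailbox(jsonl);
  const unread = messages.filter((m) => !m.read);
  const contextBlock = unread
    .map((m) => `[message from ${m.from}] ${m.text}`)
    .join("\n");
  const updatedJsonl = messages
    .map((m) => JSON.stringify({ ...m, read: true }))
    .join("\n");
  return { contextBlock, updatedJsonl };
}
```

Polling the mailbox at the start of each turn and prepending contextBlock to the prompt is what lets an agent receive messages without a dedicated receiver thread.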

6. IterationBudget – Preventing Runaway Agents

Each agent receives an IterationBudget limiting the number of tool‑call rounds (default 100) and optionally token usage. Task IDs are nine characters: a one‑letter prefix indicating TaskType (e.g., 'a' for local_agent) followed by eight base‑36 random characters, yielding ~2.8 trillion combinations.

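// TaskType union and ID generation (TypeScript)
import { randomBytes } from 'crypto';

export type TaskType =
  | 'local_bash'
  | 'local_agent'
  | 'remote_agent'
  | 'in_process_teammate'
  | 'local_workflow'
  | 'monitor_mcp'
  | 'dream';

// Base-36 alphabet (assumed order: digits, then lowercase letters)
const TASK_ID_ALPHABET = '0123456789abcdefghijklmnopqrstuvwxyz';
// The prefix map is defined elsewhere; the article only states 'a' for local_agent
declare const TASK_ID_PREFIXES: Record<TaskType, string>;

function generateTaskId(type: TaskType): string {
  const prefix = TASK_ID_PREFIXES[type];
  const bytes = randomBytes(8); // one random byte per ID character
  let id = prefix;
  for (let i = 0; i < 8; i++) {
    // 256 is not a multiple of 36, so this modulo is very slightly biased
    id += TASK_ID_ALPHABET[bytes[i] % TASK_ID_ALPHABET.length];
  }
  return id; // e.g. "a3x7k2m9p"
}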
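
The size of the ID space and the ID shape can be checked with a few lines. This is a sketch under the assumption that prefixes are single lowercase letters (the article only gives 'a'); looksLikeTaskId is a hypothetical helper.

```typescript
// Sketch: verify the ID space size and the ID shape described above.
const ALPHABET_SIZE = 36; // digits 0-9 plus letters a-z
const RANDOM_CHARS = 8;

// 36^8 possible random suffixes per prefix, roughly 2.8 trillion.
const idSpace = Math.pow(ALPHABET_SIZE, RANDOM_CHARS);

// One prefix letter followed by eight base-36 characters (nine characters total).
function looksLikeTaskId(id: string): boolean {
  return /^[a-z][0-9a-z]{8}$/.test(id);
}
```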

The budget loop checks budget.currentIterations >= budget.maxIterations and aborts gracefully, returning a budget_exceeded status with partial results.
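
A minimal sketch of such a loop follows. The currentIterations and maxIterations fields and the budget_exceeded status follow the text; the step callback, the result shape, and the function name are assumptions made for illustration.

```typescript
// Hypothetical sketch of an iteration-budget loop.
interface IterationBudget { currentIterations: number; maxIterations: number }

type LoopResult<T> =
  | { status: "completed"; results: T[] }
  | { status: "budget_exceeded"; results: T[] };

// step returns the next partial result, or null when the agent has finished.
function runWithBudget<T>(
  step: (iteration: number) => T | null,
  budget: IterationBudget,
): LoopResult<T> {
  const results: T[] = [];
  while (true) {
    if (budget.currentIterations >= budget.maxIterations) {
      // Stop gracefully and hand back whatever was produced so far.
      return { status: "budget_exceeded", results };
    }
    const out = step(budget.currentIterations);
    if (out === null) return { status: "completed", results };
    results.push(out);
    budget.currentIterations += 1;
  }
}
```

Checking the budget before each step, rather than after, guarantees that no tool-call round starts once the limit is reached, and returning the accumulated results keeps partial work usable by the caller.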

7. The Three Guardrails from a Harness Perspective

Depth Guard – prevents infinite agent trees using both the runtime querySource check and the persistent <fork-boilerplate> tag. The guard logic survives context compression.

// Depth guard example
function canSpawnSubagent(currentDepth: number, maxDepth = 3): boolean {
  if (currentDepth >= maxDepth) {
    console.warn(`[Depth Guard] Max depth ${maxDepth} reached, blocking spawn`);
    return false;
  }
  return true;
}

Budget Guard – enforces per-agent iteration limits, optional token caps, and terminal-state checks (isTerminalTaskStatus) to stop further processing.

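// Budget guard wrapper: enforces a wall-clock limit (budget.maxMs) on fn
async function withBudget<T>(
  fn: (signal: AbortSignal) => Promise<T>,
  budget: { maxMs?: number },
): Promise<{ result: T | null; budgetExceeded: boolean }> {
  const controller = new AbortController();
  // fn must honor the signal; aborting alone does not stop work that ignores it
  const timer = budget.maxMs
    ? setTimeout(() => controller.abort(), budget.maxMs)
    : undefined;
  try {
    const result = await fn(controller.signal);
    return { result, budgetExceeded: false };
  } catch (err) {
    // Treat any failure after the deadline as a budget overrun
    if (controller.signal.aborted) return { result: null, budgetExceeded: true };
    throw err;
  } finally {
    if (timer !== undefined) clearTimeout(timer); // avoid a dangling timer
  }
}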

Isolation Guard – each sub‑agent runs with its own context, tool pool, and optional separate process; failures are reported via <task-notification> without crashing the parent.
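
One way to picture the failure path is a wrapper that converts any exception into a notification string. This is a hypothetical sketch: the "failed" status value is an assumption (the article only shows "completed"), the usage block is omitted, and the function names are invented for illustration.

```typescript
// Hypothetical sketch: turn a sub-agent failure into a notification instead of an exception.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function toNotification(taskId: string, status: string, summary: string, result: string): string {
  return [
    "<task-notification>",
    `  <task-id>${escapeXml(taskId)}</task-id>`,
    `  <status>${escapeXml(status)}</status>`,
    `  <summary>${escapeXml(summary)}</summary>`,
    `  <result>${escapeXml(result)}</result>`,
    "</task-notification>",
  ].join("\n");
}

// Run a sub-agent body; any throw is caught and reported, never rethrown to the parent.
function runGuarded(taskId: string, name: string, body: () => string): string {
  try {
    return toNotification(taskId, "completed", `Agent "${name}" completed`, body());
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return toNotification(taskId, "failed", `Agent "${name}" failed`, reason);
  }
}
```

Because the parent only ever sees a notification, a crashing sub-agent looks like any other finished task with a different status, which keeps the parent's control flow uniform.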

Conclusion

The article walks through Claude Code’s source to reveal how multi‑agent systems overcome the three classic walls, how they differ from existing frameworks, and how Claude implements robust engineering mechanisms: recursive protection, clear control‑data separation, ultra‑low‑latency UDS communication, structured XML results, and layered budget/guardrails. The three guardrails—Depth, Budget, and Isolation—ensure that AI agents can be orchestrated safely and efficiently at scale.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: multi-agent, Unix Domain Socket, AI orchestration, Harness Engineering, Coordinator, AgentTool, IterationBudget
Written by

James' Growth Diary

I am James, focusing on AI Agent learning and growth. I continuously update two series: “AI Agent Mastery Path,” which systematically outlines core theories and practices of agents, and “Claude Code Design Philosophy,” which deeply analyzes the design thinking behind top AI tools. Helping you build a solid foundation in the AI era.
