Why Multi‑Agent Systems Need More Than Role‑Playing: 5 Coordination Patterns Explained
Anthropic’s recent analysis reveals five multi‑agent coordination patterns—Generator‑Verifier, Orchestrator‑Subagent, Agent Teams, Message Bus, and Shared State—highlighting that the real challenges lie in context boundaries, information flow, verification standards, and termination conditions rather than merely assigning roles.
TL;DR
Anthropic defines five multi‑agent coordination patterns: Generator‑Verifier, Orchestrator‑Subagent, Agent Teams, Message Bus, Shared State.
All patterns answer three engineering questions: where to draw context boundaries, how information should flow, and when the system should stop.
Start with the simplest pattern that satisfies the task; only adopt a more complex pattern when a concrete bottleneck appears.
Why the “virtual‑company” metaphor is misleading
Many tutorials illustrate a multi‑agent system as a virtual company (Product Manager, Architect, Developer, Tester) because the analogy is easy to grasp. Production systems at Anthropic, OpenAI, and Google do not use this model. The real difficulty is not assigning human‑like roles but splitting the task at appropriate context boundaries and ensuring that information flow, verification, and termination mechanisms reliably govern the workflow.
Fundamental challenges of multi‑agent systems
Critical context can be lost when it is not passed to the next agent.
Intermediate reasoning may be compressed into conclusions, causing distortion.
Task objectives can drift over multiple rounds of hand‑off.
Vague verification standards turn the verifier into a rubber stamp.
Agents may enter endless token‑consuming loops without converging.
Consequently, the focus should be on information architecture rather than on an organizational hierarchy.
Seven design questions to ask before building a multi‑agent system
Does the task exceed a single agent’s context window or search capability?
Are sub‑tasks independent or tightly coupled?
What context boundaries does each agent need?
Should intermediate findings be returned to an orchestrator or shared in real time?
Can the completion criteria be expressed as a checkable standard (rubric, test suite, policy)?
If loops occur, how does the system know when to stop?
When failures happen, should the system roll back, retry, degrade, or hand off to a human?
Answering these questions often reveals that a single well‑engineered agent with proper state files, tests, and permissions can solve most problems.
Pattern 1: Generator‑Verifier
A generator agent produces an output; a verifier agent checks it against an explicit rubric. The value lies in the rubric, not in the extra agent.
Typical verification checks include:
Does the code pass the designated test suite?
Are only allowed files modified?
Are all issue acceptance criteria covered?
Is any undocumented functionality used?
Are high‑risk operations present?
Is there a fallback path for failures?
To avoid infinite loops, set a maximum iteration count (e.g., 5–10) and define a fallback strategy such as escalating to a human or returning the best‑so‑far result, as in the sketch below.
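A minimal sketch of the loop in Python, assuming the rubric can be expressed as checkable items; generate and verify are hypothetical placeholders for real model or agent calls:

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    passed: bool
    score: int
    feedback: str

def generate(task: str, feedback: str | None = None) -> str:
    # Placeholder generator: swap in a real model/agent call.
    return f"draft for {task!r}" + (f", revised per: {feedback}" if feedback else "")

def verify(candidate: str, rubric: list[str]) -> Verdict:
    # A rubric of required substrings stands in for tests, policies, or acceptance criteria.
    missing = [item for item in rubric if item not in candidate]
    return Verdict(not missing, len(rubric) - len(missing),
                   f"missing: {missing}" if missing else "ok")

def generate_verify(task: str, rubric: list[str], max_iters: int = 5) -> str:
    best, best_score, feedback = "", -1, None
    for _ in range(max_iters):                   # hard cap prevents endless loops
        candidate = generate(task, feedback)
        verdict = verify(candidate, rubric)
        if verdict.passed:
            return candidate                     # converged within the budget
        if verdict.score > best_score:           # track best-so-far as the fallback
            best, best_score = candidate, verdict.score
        feedback = verdict.feedback              # critique feeds the next round
    return best  # fallback: best-so-far result (or escalate to a human here)
```

The design point is that verify returns structured feedback, not just pass/fail, so each retry is informed rather than a blind re‑roll.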
Pattern 2: Orchestrator‑Subagent
The orchestrator plans, decomposes, and aggregates results, while sub‑agents handle isolated, well‑defined sub‑tasks. This preserves continuous control over the overall goal and works well when sub‑tasks have clear boundaries, minimal dependencies, and distinct outputs.
Typical use case: automated code review. The orchestrator asks sub‑agents to evaluate style, security, and test coverage separately, then synthesizes a final report.
Beware of bottlenecks when sub‑agents need to share intermediate findings; excessive back‑and‑forth through the orchestrator can cause information loss.
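A minimal sketch of that review flow, assuming a Python harness; run_subagent is a hypothetical stand‑in for invoking an isolated sub‑agent:

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(aspect: str, diff: str) -> str:
    # Placeholder: each sub-agent sees only its own aspect-specific context.
    return f"[{aspect}] no blocking findings in {len(diff)} chars of diff"

def review(diff: str) -> str:
    aspects = ["style", "security", "test coverage"]  # clear, independent boundaries
    with ThreadPoolExecutor() as pool:                # independent subtasks run in parallel
        findings = pool.map(lambda a: run_subagent(a, diff), aspects)
    return "\n".join(findings)                        # orchestrator aggregates distinct outputs

print(review("def f(x): return x"))
```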
Pattern 3: Agent Teams
Similar to Orchestrator‑Subagent, but workers are persistent agents that retain context across multiple rounds. This is ideal for long‑running, independent tasks such as large‑scale code‑base migrations.
Key requirements:
Strong task partitioning (e.g., per‑service or per‑module).
Clear ownership and conflict‑resolution mechanisms (file‑level locks, isolated branches).
Integration testing and rollback strategies for each worker.
If these mechanisms are missing, the team can produce conflicting changes and nondeterministic results.
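One way to make ownership explicit, sketched as an in‑process registry; production teams would more likely use isolated branches or a distributed lock service, so treat this as illustrative:

```python
import threading

class OwnershipRegistry:
    """Maps modules to the worker that owns them; claims are exclusive."""
    def __init__(self):
        self._owners: dict[str, str] = {}
        self._lock = threading.Lock()

    def claim(self, module: str, worker: str) -> bool:
        with self._lock:
            if module in self._owners:
                return False                 # already owned: conflict avoided up front
            self._owners[module] = worker
            return True

registry = OwnershipRegistry()
assert registry.claim("billing-service", "worker-1")
assert not registry.claim("billing-service", "worker-2")  # second claim rejected
```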
Pattern 4: Message Bus
Agents communicate via publish/subscribe events, removing a single orchestrator. This mirrors event‑driven microservice architectures and suits scenarios with many event sources, evolving workflows, and extensible agent ecosystems.
Engineering safeguards required:
Reliable logging and tracing for each event.
Dead‑letter queues and retry policies to handle silent failures.
Correlation IDs to trace event cascades.
Debugging becomes harder than in a direct orchestrator model, so the added scalability must be justified.
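A toy in‑process bus showing those safeguards together; topic names and payloads are illustrative assumptions:

```python
import logging
import uuid
from collections import defaultdict

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("bus")

class MessageBus:
    def __init__(self):
        self.subscribers = defaultdict(list)
        self.dead_letters = []                     # failed events land here, not in silence

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, payload, correlation_id=None):
        cid = correlation_id or str(uuid.uuid4())  # one ID traces the whole cascade
        log.info("event topic=%s cid=%s", topic, cid)
        for handler in self.subscribers[topic]:
            try:
                handler(payload, cid)              # handlers may publish follow-up events
            except Exception as exc:
                self.dead_letters.append((topic, payload, cid, exc))

bus = MessageBus()
bus.subscribe("code.pushed", lambda p, cid: bus.publish("review.requested", p, cid))
bus.subscribe("review.requested", lambda p, cid: log.info("reviewing %s cid=%s", p, cid))
bus.publish("code.pushed", {"repo": "example"})
```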
Pattern 5: Shared State
All agents read/write a persistent store (database, file system, knowledge base). This enables real‑time sharing of discoveries and eliminates a central bottleneck.
Risks and mitigations:
Duplicate work and conflicting updates – enforce version control and idempotent writes.
Non‑convergent loops – impose explicit termination conditions such as a time budget, token budget, or a judge agent that evaluates convergence.
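A sketch of both mitigations against an in‑memory store, as an illustrative assumption: optimistic version checks reject conflicting writes, and an explicit token budget forces termination even without convergence:

```python
class SharedState:
    def __init__(self):
        self._data, self._version = {}, 0

    def read(self) -> tuple[dict, int]:
        return dict(self._data), self._version

    def write(self, updates: dict, expected_version: int) -> bool:
        if expected_version != self._version:
            return False                  # stale write rejected; caller must re-read
        self._data.update(updates)
        self._version += 1
        return True

class TokenBudget:
    def __init__(self, limit: int):
        self.limit, self.used = limit, 0

    def spend(self, tokens: int) -> bool:
        self.used += tokens
        return self.used <= self.limit    # False => stop, converged or not

state = SharedState()
_, v = state.read()
assert state.write({"finding": "duplicate index"}, expected_version=v)
assert not state.write({"finding": "stale"}, expected_version=v)  # conflict caught
```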
Choosing the right pattern
The five patterns form a complexity‑progression path rather than a hierarchy. Begin with the simplest pattern that satisfies the task, observe where it stalls, and then move to the next pattern that addresses the specific bottleneck (e.g., switch to Message Bus when the orchestrator’s conditional logic proliferates).
Practical insights from Anthropic’s production systems
Anthropic’s internal research shows that multi‑agent performance gains are largely due to increased token consumption (broader search space) rather than superior division of labor.
Critical state such as context files, logs, permissions, and session records is externalized into durable storage (e.g., claude-progress.txt, runbook Markdown files, persistent JSON logs) rather than left in the model's fleeting context window.
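A sketch of that externalization as append‑only JSON‑lines checkpoints; the claude-progress.txt name comes from the article, while the record layout is an assumption:

```python
import datetime
import json
import pathlib

PROGRESS_FILE = pathlib.Path("claude-progress.txt")

def checkpoint(step: str, status: str) -> None:
    record = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "step": step,
        "status": status,
    }
    with PROGRESS_FILE.open("a") as f:    # append-only log survives restarts and compaction
        f.write(json.dumps(record) + "\n")

checkpoint("migrate billing-service", "done")
```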
Conclusion
Multi‑agent design is fundamentally an information‑architecture problem: define clear context boundaries, establish robust verification standards, and implement reliable termination conditions. Treat agents as specialized capability nodes rather than human‑like roles, and let the engineering scaffolding—skills, pipelines, state files—carry the system’s stability.
References
Anthropic: Multi‑agent coordination patterns – https://claude.com/blog/multi-agent-coordination-patterns
Anthropic: Building multi‑agent systems – https://claude.com/blog/building-multi-agent-systems-when-and-how-to-use-them
Anthropic: Building effective agents – https://www.anthropic.com/engineering/building-effective-agents
Anthropic: Harness design for long‑running apps – https://www.anthropic.com/engineering/harness-design-long-running-apps
Anthropic: Managed Agents – https://www.anthropic.com/engineering/managed-agents
Anthropic: Harnessing Claude’s intelligence – https://claude.com/blog/harnessing-claudes-intelligence