Mastering LangGraph Multi‑Agent Collaboration: The Supervisor Pattern from Theory to Practice
This article explains why single‑agent LLM pipelines fail when many tools are attached, introduces the Supervisor pattern that separates routing and execution across specialized agents, compares Tool‑Calling and Handoff approaches, provides a complete TypeScript implementation—including hierarchical supervisors—and lists five common pitfalls with concrete fixes.
Why Multi‑Agent Collaboration Is Needed
When a single LLM agent is equipped with many tools, the probability of selecting the wrong tool rises sharply, the context window fills with all tool‑call histories, and debugging becomes nearly impossible.
Tool overload: error rates reportedly increase noticeably once an agent holds more than ten tools.
Context explosion: all sub‑task dialogues are compressed into one context window, diluting core instructions.
Non‑debuggable: failures cannot be traced to a specific tool call.
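Single-agent architecture problems, illustrated:
User request
↓
[Single Agent] ← 20 tools attached
↓
Tool selection hallucinates ← probability rises sharply with the number of tools
↓
Context explosion: every tool-call history sits in the same context window
↓
Muddled results / token limit exceeded / not debuggable
Supervisor Pattern Core Principle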
The Supervisor pattern enforces separation of responsibilities: a top-level Supervisor (the controlling agent) handles routing, coordination, and final aggregation, while each Worker agent focuses on a single area of expertise.
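                  User request
                        ↓
               ┌──────────────────┐
               │    Supervisor    │ ← handles only: routing + coordination + final integration
               │ (control agent)  │
               └────────┬─────────┘
                        │ tool_call / handoff
        ┌───────────────┼───────────────┐
        ↓               ↓               ↓
[Research Agent]  [Math Agent]    [Code Agent]
     search        calculation     generation
        ↓               ↓               ↓
    Result 1        Result 2        Result 3
        └───────────────┼───────────────┘
                        ↓
          Supervisor integrates results
                        ↓
                  Final answer
Understand: interpret user intent and decompose the task.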
Route: decide which Worker to invoke and in what order.
Integrate: combine each Worker’s output into the final answer.
Worker responsibility: execute its specialized toolset without handling routing or other Workers.
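The division of labor above can be sketched without any LLM at all. The following is a minimal, dependency-free illustration rather than LangGraph's API: `WorkerAgent`, `route`, and `runSupervisor` are hypothetical names, and the keyword-based router only stands in for the model's routing decision.

```typescript
// Hypothetical sketch: all names and the keyword routing are illustrative assumptions.
type WorkerName = "research" | "math";
type Route = WorkerName | "FINISH";

interface WorkerAgent {
  name: WorkerName;
  // A Worker only executes its own specialty; it never routes.
  run(task: string): string;
}

const workers: Record<WorkerName, WorkerAgent> = {
  research: { name: "research", run: (task) => `facts for "${task}"` },
  math: { name: "math", run: (task) => `computed "${task}"` },
};

// Route: pick the next Worker that is needed and has not run yet.
function route(request: string, done: WorkerName[]): Route {
  if (/search|find/i.test(request) && !done.includes("research")) return "research";
  if (/average|sum|calculate/i.test(request) && !done.includes("math")) return "math";
  return "FINISH";
}

// Supervisor loop: route, delegate, then integrate the collected results.
function runSupervisor(request: string, maxSteps = 5): { trace: Route[]; answer: string } {
  const trace: Route[] = [];
  const results: string[] = [];
  const done: WorkerName[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const next = route(request, done);
    trace.push(next);
    if (next === "FINISH") break;
    results.push(`${next}: ${workers[next].run(request)}`);
    done.push(next);
  }
  // Integrate: combine every Worker output into one answer.
  return { trace, answer: results.join(" | ") };
}

const demo = runSupervisor("Search headcounts, then calculate the average");
// demo.trace is ["research", "math", "FINISH"]
```

The `maxSteps` bound plays the same role as a recursion limit: even if the router never returned FINISH, the loop would still terminate.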
Two Implementation Paradigms: Tool Calling vs. Handoff
Control: Tool Calling – control stays with Supervisor; Handoff – control transfers to the Worker.
Suitable scenarios: Tool Calling – tasks are clear and workflow predictable; Handoff – tasks need flexible dialogue, Worker interacts directly with the user.
Context sharing: Tool Calling – Supervisor manages all context; Handoff – state is passed between nodes.
Debug difficulty: Tool Calling – low, traceable; Handoff – higher, flow is distributed.
Typical use cases: Tool Calling – report generation, data‑analysis pipelines; Handoff – customer‑service bots, multi‑turn negotiations.
Practical guidance: as a rule of thumb, about 90% of scenarios are well served by the Tool-Calling Supervisor mode; reserve Handoff for cases where a Worker must converse with the user directly (e.g., medical triage).
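The difference in control flow can be illustrated in a few lines of plain TypeScript. This is a conceptual sketch, not LangGraph code: `toolCallingFlow`, `handoffFlow`, and the node names are hypothetical. In the tool-calling flow the Supervisor calls the Worker like a function and keeps control; in the handoff flow each node returns the name of the node that should run next.

```typescript
// Conceptual sketch only; every name here is a hypothetical illustration.
type Log = string[];

// Tool Calling: the Supervisor invokes the Worker and control returns to it.
function toolCallingFlow(question: string): Log {
  const log: Log = [];
  const mathWorker = (q: string): string => {
    log.push("worker:math");
    return `result(${q})`;
  };
  log.push("supervisor:route");
  const result = mathWorker(question); // the call returns to the Supervisor
  log.push(`supervisor:integrate ${result}`);
  return log;
}

// Handoff: each node names its successor, so control moves along with the state.
interface HandoffState {
  question: string;
  log: Log;
}
type NodeName = "supervisor" | "triage" | "END";
type HandoffNode = (state: HandoffState) => NodeName;

const handoffNodes: Record<Exclude<NodeName, "END">, HandoffNode> = {
  supervisor: (s) => {
    s.log.push("supervisor:handoff");
    return "triage";
  },
  // The Worker now owns the conversation and decides when it ends.
  triage: (s) => {
    s.log.push(`triage:answers ${s.question}`);
    return "END";
  },
};

function handoffFlow(question: string): Log {
  const state: HandoffState = { question, log: [] };
  let current: NodeName = "supervisor";
  while (current !== "END") {
    current = handoffNodes[current](state);
  }
  return state.log;
}

const toolLog = toolCallingFlow("2+2");
const handoffLog = handoffFlow("headache");
```

In `toolLog` the Supervisor appears both before and after the Worker, which is what keeps tracing simple. In `handoffLog` the Supervisor never regains control unless a node explicitly hands it back.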
Step‑by‑Step TypeScript Implementation
1. Define Worker Agents
import { ChatOpenAI } from "@langchain/openai";
import { createReactAgent } from "@langchain/langgraph/prebuilt";
import { tool } from "@langchain/core/tools";
import { z } from "zod";
// One chat model shared by the Workers and the Supervisor
const model = new ChatOpenAI({ model: "gpt-4o-mini" });
// Math tools
const addTool = tool(async ({ a, b }) => `${a + b}`, {
  name: "add",
  description: "Add two numbers",
  schema: z.object({ a: z.number(), b: z.number() })
});
const multiplyTool = tool(async ({ a, b }) => `${a * b}`, {
  name: "multiply",
  description: "Multiply two numbers",
  schema: z.object({ a: z.number(), b: z.number() })
});
// Simplified web-search tool: a stub that returns placeholder text instead of calling a real search API
const webSearchTool = tool(async ({ query }) => `Search results: latest information about "${query}"...`, {
  name: "web_search",
  description: "Search the web for information",
  schema: z.object({ query: z.string() })
});
// Note: depending on the installed @langchain/langgraph version, messageModifier may be deprecated in favor of `prompt`
const mathAgent = createReactAgent({
  llm: model,
  tools: [addTool, multiplyTool],
  messageModifier: "You are a math expert. Handle math calculations only; when done, return the result directly with no extra explanation."
});
const researchAgent = createReactAgent({
  llm: model,
  tools: [webSearchTool],
  messageModifier: "You are a research expert. You only search for information and never do math; when done, return the search results directly."
});
2. Define Supervisor State and Routing Logic
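import { Annotation, StateGraph, END, START } from "@langchain/langgraph";
import { BaseMessage, HumanMessage } from "@langchain/core/messages";
// Supervisor state definition
const SupervisorState = Annotation.Root({
  // Conversation history; each node's messages are appended
  messages: Annotation<BaseMessage[]>({
    reducer: (x, y) => x.concat(y),
    default: () => []
  }),
  // Name of the next node to run; the latest write wins
  next: Annotation<string>({
    reducer: (x, y) => y ?? x,
    default: () => "FINISH"
  })
});
const members = ["math_agent", "research_agent"] as const;
type Member = typeof members[number] | "FINISH";
async function supervisorNode(state: typeof SupervisorState.State) {
  const systemPrompt = `You are a task-coordination supervisor managing these experts: ${members.join(", ")}
Based on the user's request and the progress so far, decide which expert to call next, or reply FINISH when the task is complete.
Rules:
- Information needs to be looked up → research_agent
- A math calculation is needed → math_agent
- All tasks are complete → FINISH
Reply with the expert name or FINISH only; do not add anything else.`;
  const response = await model.invoke([
    { role: "system", content: systemPrompt },
    ...state.messages
  ]);
  // Validate the reply instead of casting it blindly; unrecognized output ends the run
  const reply = String(response.content).trim();
  const next: Member = (members as readonly string[]).includes(reply) ? (reply as Member) : "FINISH";
  return { next };
}
3. Assemble the Graph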
// Run a Worker and hand only its final message back to the Supervisor
async function callAgent(agentGraph: typeof mathAgent, state: typeof SupervisorState.State) {
  const result = await agentGraph.invoke({ messages: state.messages });
  const lastMessage = result.messages[result.messages.length - 1];
  return { messages: [lastMessage] };
}
const workflow = new StateGraph(SupervisorState)
  .addNode("supervisor", supervisorNode)
  .addNode("math_agent", (state) => callAgent(mathAgent, state))
  .addNode("research_agent", (state) => callAgent(researchAgent, state))
  // The Supervisor's `next` value selects the outgoing edge
  .addConditionalEdges("supervisor", (state) => state.next, {
    math_agent: "math_agent",
    research_agent: "research_agent",
    FINISH: END
  })
  // Every Worker reports back to the Supervisor
  .addEdge("math_agent", "supervisor")
  .addEdge("research_agent", "supervisor")
  .addEdge(START, "supervisor");
const app = workflow.compile();
4. Run and Observe
async function main() {
  const result = await app.invoke({
    messages: [new HumanMessage("Search for the total 2024 headcount of the FAANG companies, then calculate the average headcount per company")]
  });
  console.log("Final answer:");
  console.log(result.messages[result.messages.length - 1].content);
  console.log("\nCall trace:");
  result.messages.forEach((msg, i) => {
    const name = (msg as any).name || msg.constructor.name;
    console.log(`${i}: [${name}]`);
  });
}
main().catch(console.error);
Execution shows the Supervisor first invoking research_agent to fetch the employee counts, then calling math_agent for the average, and finally returning FINISH, with no hand-written routing. Because webSearchTool is a stub, the figures themselves are placeholders.
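The routing step depends on the model replying with exactly one of the expected names. A more tolerant normalizer can absorb stray quotes, punctuation, or casing. The sketch below is an illustration under stated assumptions, not part of LangGraph: `parseRoute` is a hypothetical helper, and falling back to FINISH on unrecognized output is only one possible policy.

```typescript
// Hypothetical helper: normalizes a model reply into a known route name.
const ROUTES = ["math_agent", "research_agent", "FINISH"] as const;
type RouteName = (typeof ROUTES)[number];

function parseRoute(raw: string, fallback: RouteName = "FINISH"): RouteName {
  // Strip surrounding whitespace, leading quotes/backticks, and trailing quotes or punctuation.
  const cleaned = raw.trim().replace(/^["'`]+|["'`.!\s]+$/g, "");
  // Compare case-insensitively so "finish" still resolves to "FINISH".
  const match = ROUTES.find((r) => r.toLowerCase() === cleaned.toLowerCase());
  return match ?? fallback;
}

const routed = [
  parseRoute(" math_agent\n"),
  parseRoute('"research_agent".'),
  parseRoute("finish"),
  parseRoute("I think we should search first"),
];
// routed is ["math_agent", "research_agent", "FINISH", "FINISH"]
```

Inside supervisorNode, such a helper could be applied to the reply text, for example `parseRoute(String(response.content))`. Whether an unrecognized reply should end the run or trigger a retry is a design choice left to the application.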
Advanced: Hierarchical Supervisors and Sub‑Graphs
A single‑level Supervisor works well for 3‑5 Workers. For more complex tasks, a top‑level Supervisor can delegate to department‑level Supervisors, each with its own state namespace.
                                User request
                                      ↓
                          ┌──────────────────────┐
                          │ Top-level Supervisor │
                          └───────────┬──────────┘
                                      │
                  ┌───────────────────┴───────────────────┐
                  ↓                                       ↓
       ┌─────────────────────┐                 ┌─────────────────────┐
       │ Research Supervisor │                 │ Analysis Supervisor │
       └──────────┬──────────┘                 └──────────┬──────────┘
                  │                                       │
        ┌─────────┴─────────┐                   ┌─────────┴─────────┐
        ↓                   ↓                   ↓                   ↓
  [Web Search]      [Academic Search]    [Data Analysis]     [Visualization]
      Agent               Agent               Agent               Agent
In LangGraph, each department Supervisor is added as a sub-graph node via addNode:
import { StateGraph, END, START } from "@langchain/langgraph";
// buildResearchDeptGraph and buildAnalysisDeptGraph are assumed to return uncompiled department-level StateGraphs;
// TopLevelState, topSupervisorNode, and routeByDept are defined elsewhere, analogous to the single-level example
const researchDeptGraph = buildResearchDeptGraph();
const analysisDeptGraph = buildAnalysisDeptGraph();
const topLevelWorkflow = new StateGraph(TopLevelState)
  .addNode("top_supervisor", topSupervisorNode)
  // A compiled graph can be used directly as a node
  .addNode("research_dept", researchDeptGraph.compile())
  .addNode("analysis_dept", analysisDeptGraph.compile())
  .addConditionalEdges("top_supervisor", routeByDept, {
    research_dept: "research_dept",
    analysis_dept: "analysis_dept",
    FINISH: END
  })
  .addEdge("research_dept", "top_supervisor")
  .addEdge("analysis_dept", "top_supervisor")
  .addEdge(START, "top_supervisor");
Common Pitfalls and Fixes
Pitfall 1 – Supervisor infinite loop
Symptom: The Supervisor repeatedly calls the same Worker and never returns FINISH.
Root cause: The prompt lacks a clear termination rule.
Fix: Add explicit stop conditions to the prompt and set a recursion limit. In LangGraph.js the limit is passed in the run config at invocation time rather than to compile(), e.g.:
const app = workflow.compile();
// Throws a GraphRecursionError once the run exceeds 10 steps
const result = await app.invoke(input, { recursionLimit: 10 });
Pitfall 2 – Worker messages drown Supervisor context
Symptom: After several rounds, the Supervisor makes wrong routing decisions.
Root cause: The full Worker dialogue history (including all tool calls) is appended to the Supervisor's messages.
Fix: Return only the last message from a Worker and tag it for identification:
async function callAgent(agentGraph: typeof mathAgent, state: typeof SupervisorState.State) {
  const result = await agentGraph.invoke({ messages: state.messages });
  // Keep only the Worker's final message; intermediate tool calls stay out of the Supervisor's context
  const lastMessage = result.messages[result.messages.length - 1];
  lastMessage.name = "worker_result";
  return { messages: [lastMessage] };
}
Pitfall 3 – Workers call each other, causing deadlock
Symptom: Worker A invokes Worker B, which in turn invokes Worker A, leading to a hang.
Root cause: The Worker prompt does not forbid invoking other agents.
Fix: Explicitly state in the Worker prompt that it must only execute its own toolset and never call other agents.
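Prompt instructions are a soft constraint, so a structural check at startup can back them up. The sketch below is one hypothetical way to do that, assuming each Worker is described by a plain object listing its tool names: `WorkerSpec` and `assertNoAgentTools` are illustrative names, not LangGraph functions. The check throws if any Worker's toolset contains a tool named after an agent.

```typescript
// Hypothetical startup check; WorkerSpec and assertNoAgentTools are illustrative names.
interface WorkerSpec {
  agentName: string;
  toolNames: string[];
}

function assertNoAgentTools(specs: WorkerSpec[]): void {
  const agentNames = new Set(specs.map((s) => s.agentName));
  for (const spec of specs) {
    // A tool that shares a name with an agent would let Workers call each other.
    const offending = spec.toolNames.filter((t) => agentNames.has(t));
    if (offending.length > 0) {
      throw new Error(`${spec.agentName} exposes agent tools: ${offending.join(", ")}`);
    }
  }
}

const safeSpecs: WorkerSpec[] = [
  { agentName: "math_agent", toolNames: ["add", "multiply"] },
  { agentName: "research_agent", toolNames: ["web_search"] },
];
assertNoAgentTools(safeSpecs); // passes silently for the toolsets used in this article
```

This only catches agents exposed as same-named tools; it complements the prompt rule rather than replacing it.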
Pitfall 4 – Vague Supervisor routing prompt
Symptom: The Supervisor routes a math question to the research agent.
Root cause: The routing prompt lacks precise decision criteria.
Fix: Provide a detailed rule set, for example:
const routePrompt = `
Decision rules (in priority order):
1. The user asks to search or look up the latest information → research_agent
2. The user asks for calculation, statistics, or math operations → math_agent
3. Information has been gathered and calculations are done → FINISH
If a Worker has already returned a result, prefer FINISH.
`;
Pitfall 5 – Forgetting a checkpointer for Workers
Symptom: A long-running Worker crashes; on restart the task starts from scratch, wasting tokens.
Root cause: The Worker sub-graph lacks a checkpoint.
Fix: Attach the same MemorySaver checkpointer used by the main graph:
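import { MemorySaver } from "@langchain/langgraph";
// MemorySaver keeps checkpoints in process memory only; to survive a process restart, swap in a persistent checkpointer
const checkpointer = new MemorySaver();
const app = workflow.compile({ checkpointer });
// When invoking, pass a thread_id so the run can be resumed from its checkpoints
const result = await app.invoke(input, { configurable: { thread_id: "task-001" } });
Summary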
Multi-agent setups curb the sharp rise in tool-selection errors by separating responsibilities.
The Supervisor pattern delegates routing to a top‑level agent while Workers specialize in execution, keeping contexts clean.
Tool-Calling is suitable for roughly 90% of cases; Handoff is reserved for scenarios requiring direct Worker-user interaction.
A complete TypeScript example demonstrates defining Workers, designing Supervisor state, routing logic, graph assembly, and execution.
Hierarchical Supervisors enable complex organizational structures with independent state namespaces.
Five common pitfalls—infinite loops, context pollution, worker deadlocks, vague routing, missing checkpointers—are each paired with concrete remediation steps.
James' Growth Diary
I am James, focusing on AI Agent learning and growth. I continuously update two series: “AI Agent Mastery Path,” which systematically outlines core theories and practices of agents, and “Claude Code Design Philosophy,” which deeply analyzes the design thinking behind top AI tools. Helping you build a solid foundation in the AI era.
