How to Build a Multi‑Agent AI Research Assistant with LangGraph
This article shows how to build a multi-agent AI research assistant with the LangGraph framework. It covers the system's shared state design; individual agents for research, fact-checking, and report generation; workflow orchestration; advanced patterns such as dynamic routing and parallel execution; and performance considerations.
Introduction
Imagine a junior developer building, in just four hours, an AI research assistant that performs fact-checking, summarisation, sentiment analysis, and cross-referencing across multiple data sources. Not long ago this would have required a senior engineering team working for weeks; the LangGraph multi-agent framework now makes it feasible.
Traditional vs. Multi‑Agent AI
Conventional AI applications rely on a single large model to handle all tasks, akin to one person acting as researcher, writer, fact‑checker, and editor simultaneously. Multi‑agent systems distribute complex tasks to specialised agents, each excelling in its domain, and coordinate them precisely to achieve the overall goal.
Reported benchmarks suggest that multi-agent AI can improve performance on complex tasks by 40-60% compared with single-model approaches, while also offering better maintainability, debuggability, and scalability.
System Architecture
The core of the system is a shared state space that all agents can read and write. The following Python code defines the state schema using TypedDict and creates a StateGraph instance.
from langgraph.graph import StateGraph, START, END
from typing import TypedDict, Annotated, List
from langgraph.graph.message import add_messages

class ResearchState(TypedDict):
    topic: str
    research_queries: List[str]
    raw_information: List[str]
    validated_facts: List[str]
    final_report: str
    current_agent: str
    messages: Annotated[list, add_messages]

# Initialise the workflow
workflow = StateGraph(ResearchState)

This shared state acts like a team-wide information board, allowing each agent to access previous contributions and add its own results.
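One thing the article leaves implicit: every agent below calls a shared llm client that is never constructed. A minimal setup, assuming LangChain's OpenAI integration (the model choice is arbitrary), could be:

from langchain_openai import ChatOpenAI

# Hypothetical client: any chat model exposing .invoke(prompt) with a .content result works here
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)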
Agent Implementations
Researcher Agent
The researcher breaks a broad topic into concrete, searchable queries and gathers initial information.
def researcher_agent(state: ResearchState):
    """Break the research topic into specific queries and collect raw info."""
    topic = state["topic"]
    query_prompt = f"""
    Decompose the research topic into 3-5 specific, searchable queries:
    {topic}
    Make each query focused and actionable.
    """
    queries = llm.invoke(query_prompt).content.split("\n")
    queries = [q.strip() for q in queries if q.strip()]
    raw_info = []
    for query in queries:
        # Simulated research step: replace with a real search call in production
        research_result = llm.invoke(f"Research and provide information about: {query}")
        raw_info.append(research_result.content)
    return {
        "research_queries": queries,
        "raw_information": raw_info,
        "current_agent": "researcher",
        "messages": [f"Researcher completed queries: {', '.join(queries)}"]
    }

workflow.add_node("researcher", researcher_agent)

The agent turns a vague subject (e.g., "climate change") into concrete questions such as current CO₂ levels, renewable-energy adoption rates, and policy impact assessments.
Fact‑Checker Agent
The fact‑checker validates the raw information, extracts reliable facts, and discards dubious content.
def fact_checker_agent(state: ResearchState):
    """Validate and cross-reference the collected information."""
    raw_info = state["raw_information"]
    validated_facts = []
    for info_piece in raw_info:
        validation_prompt = f"""
        Analyse the accuracy and reliability of this information:
        {info_piece}
        Rate reliability (1-10) and identify statements needing extra verification.
        Extract only the most trustworthy facts.
        """
        validation_result = llm.invoke(validation_prompt)
        # Naive acceptance test: keep the piece if the model's reply mentions "reliable"
        if "reliable" in validation_result.content.lower():
            validated_facts.append(info_piece)
    return {
        "validated_facts": validated_facts,
        "current_agent": "fact_checker",
        "messages": [f"Fact-checker validated {len(validated_facts)} information pieces"]
    }

workflow.add_node("fact_checker", fact_checker_agent)

This verification step aims to ensure that only high-confidence data proceeds to the reporting stage.
Report Writer Agent
The report writer assembles the validated facts into a structured research report.
def report_writer_agent(state: ResearchState):
    """Create a comprehensive report from validated facts."""
    topic = state["topic"]
    validated_facts = state["validated_facts"]
    # chr(10) is "\n": f-strings before Python 3.12 cannot contain backslashes
    report_prompt = f"""
    Produce a comprehensive research report on the following topic:
    {topic}
    Using these validated facts:
    {chr(10).join(validated_facts)}
    Organise the report as:
    1. Executive Summary
    2. Key Findings
    3. Supporting Evidence
    4. Conclusion
    Keep it professional yet easy to understand.
    """
    final_report = llm.invoke(report_prompt).content
    return {
        "final_report": final_report,
        "current_agent": "report_writer",
        "messages": [f"Report writer completed final report ({len(final_report)} characters)"]
    }

workflow.add_node("report_writer", report_writer_agent)

The output is a logically coherent, well-structured document that balances depth and readability.
Workflow Orchestration
The agents are linked together in a directed graph. The START node triggers the researcher, whose output feeds the fact‑checker, which in turn feeds the report writer. The compiled workflow can be invoked with an initial state.
# Define the workflow sequence
workflow.add_edge(START, "researcher")
workflow.add_edge("researcher", "fact_checker")
workflow.add_edge("fact_checker", "report_writer")
workflow.add_edge("report_writer", END)

# Compile the workflow
app = workflow.compile()

# Helper to run the assistant
def run_research_assistant(topic: str):
    initial_state = {
        "topic": topic,
        "research_queries": [],
        "raw_information": [],
        "validated_facts": [],
        "final_report": "",
        "current_agent": "",
        "messages": []
    }
    result = app.invoke(initial_state)
    return result["final_report"]

This design guarantees ordered hand-offs: the researcher gathers data, the fact-checker validates it, and the report writer produces the final output.
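Running it end to end is then a one-liner; the topic string below is purely illustrative:

report = run_research_assistant("renewable energy adoption in urban areas")
print(report)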
Advanced Patterns
Dynamic Agent Selection
In some scenarios the system decides at runtime which researcher variant to use based on topic complexity or controversy.
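The supervisor below calls an analyze_complexity helper the article never defines. A placeholder, using a crude word-count heuristic where a real system might use an LLM or a classifier:

def analyze_complexity(topic: str) -> int:
    """Hypothetical stand-in: score topic complexity on a 1-10 scale."""
    # A real implementation might ask an LLM to rate the topic instead
    return min(10, max(1, len(topic.split())))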
def supervisor_agent(state: ResearchState):
    """Choose the next agent based on current state."""
    topic_complexity = analyze_complexity(state["topic"])
    if topic_complexity > 8:
        return "expert_researcher"
    elif "controversial" in state["topic"].lower():
        return "bias_checker"
    else:
        return "standard_researcher"

# Add conditional routing
workflow.add_conditional_edges(
    "supervisor",
    supervisor_agent,
    {
        "expert_researcher": "expert_researcher",
        "bias_checker": "bias_checker",
        "standard_researcher": "researcher"
    }
)

Parallel Processing
Independent tasks such as sentiment analysis, keyword extraction, and summary generation can run concurrently after the researcher finishes.
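The three node functions are registered below but never shown. A minimal sketch of one, reusing the same state schema and writing only to the messages key, whose add_messages reducer tolerates concurrent writes from parallel branches:

def analyze_sentiment(state: ResearchState):
    """Hypothetical parallel node: gauge the overall tone of the gathered material."""
    text = "\n".join(state["raw_information"])
    verdict = llm.invoke(f"In one sentence, describe the overall sentiment of:\n{text}")
    # Write only to a reducer-backed key so parallel branches do not clash
    return {"messages": [f"Sentiment analysis: {verdict.content}"]}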
# Register parallel agents (implementations omitted for brevity)
workflow.add_node("sentiment_analyzer", analyze_sentiment)
workflow.add_node("keyword_extractor", extract_keywords)
workflow.add_node("summary_generator", generate_summary)

# Fan out: multiple edges from the same node run in parallel within one superstep
workflow.add_edge("researcher", "sentiment_analyzer")
workflow.add_edge("researcher", "keyword_extractor")
workflow.add_edge("researcher", "summary_generator")

# Fan in: a list of source nodes merges results back into the report writer
workflow.add_edge(["sentiment_analyzer", "keyword_extractor", "summary_generator"], "report_writer")

Because parallel branches update state concurrently, any key they all write to should carry a reducer like add_messages; otherwise LangGraph cannot merge the conflicting updates.

Debugging & Monitoring
Because multi‑agent workflows can become complex, explicit logging nodes help trace execution.
def log_agent_transition(state):
    current_agent = state.get("current_agent", "unknown")
    print(f"Agent {current_agent} completed. State: {len(state.get('messages', []))} messages")
    # Log only; returning an empty update avoids re-applying the messages reducer
    return {}

# In a final graph, this pair of edges would replace the direct researcher -> fact_checker edge
workflow.add_node("log_researcher", log_agent_transition)
workflow.add_edge("researcher", "log_researcher")
workflow.add_edge("log_researcher", "fact_checker")

This provides visibility into each hand-off, which is essential for production-grade monitoring.
Performance Evaluation
Multi-agent architectures typically consume more API calls and compute than a single-model pipeline: in the workflow above, a run that generates n queries makes 2n + 2 LLM calls (one to generate queries, n to research, n to validate, one to write the report). The quality gains (task-specific optimisation, fault isolation, independent maintenance) often outweigh the added cost, and reported benchmarks indicate a 40-60% improvement in task performance.
Challenges include higher coordination overhead, more complex error handling, and the need for robust routing logic.
Conclusion
Multi‑agent AI systems represent a significant evolution in application architecture. By decomposing complex problems into specialised agents that collaborate, developers can achieve higher performance, better maintainability, and easier debugging. With frameworks like LangGraph maturing, now is the optimal moment to adopt multi‑agent designs for next‑generation AI solutions.