Multi-Agent Research Overview, Open-Source Implementations, and Design Considerations
This article reviews the background of multi‑agent systems, compares major open‑source frameworks such as AutoGen, MetaGPT, AgentVerse, and XAgent, discusses design principles, collaboration strategies, and offers conclusions on LLM‑driven versus SOP‑driven approaches for building multi‑agent applications.
Multi-Agent Research Background
Our project team is developing application‑layer services based on the Agent paradigm, leveraging large language model (LLM) capabilities to provide higher‑level services. Numerous Agent frameworks have emerged, with LangChain being the most well‑known.
Lilian Weng, who leads AI Safety research at OpenAI, published a widely cited post on agent design: https://lilianweng.github.io/posts/2023-06-23-agent/. Supporting papers: "MRKL Systems" (May 2022), a modular neuro-symbolic architecture that combines LLMs, external knowledge sources, and discrete reasoning; and "A Survey on Large Language Model based Autonomous Agents" (August 2023), the first comprehensive agent survey from China.
Single agents have inherent limitations; handling complex problems calls for multiple specialized agents. For example, delivering a high-quality data-analysis project requires at least a development agent and a testing agent; a single agent cannot efficiently play both roles.
Many recent papers propose Multi‑Agent designs, leading to frameworks like MetaGPT, AutoGen, and XAgent. While these frameworks enable handling of complex tasks, they also introduce additional overhead.
What Is a Multi-Agent System?
According to ChatGPT, a Multi‑Agent system (MAS) consists of multiple autonomous agents that can operate independently or collaboratively, featuring autonomy, local views, decentralization, and cooperation/competition.
Multi‑Agent systems are applied in automation control, social simulation, resource management, and e‑commerce, involving complex interaction and coordination mechanisms.
Currently, there is no industry‑standard implementation; each framework follows its own research‑driven design.
Open‑Source Multi‑Agent Implementations Comparison
Source: https://zhuanlan.zhihu.com/p/660045220
| Project (Producer) | Source Repository | Key Features |
| --- | --- | --- |
| AutoGen (Microsoft) | https://github.com/microsoft/autogen | Supports human-in-the-loop interaction and parallel multi-agent collaboration. |
| MetaGPT (DeepWisdom) | https://github.com/geekan/MetaGPT/blob/main/docs/README_CN.md | Ships built-in internet-company-style agent roles, customizable for product-level services; also supports human involvement. |
| AgentVerse (ModelBest & Tsinghua) | https://github.com/OpenBMB/AgentVerse/blob/main/README_zh.md | Customizable agents and collaboration workflows. |
| Agents (AIWaves) | https://github.com/aiwaves-cn/agents | Similar to AgentVerse, but explicitly supports human participation. |
| XAgent (ModelBest & Tsinghua) | https://github.com/OpenBMB/XAgent/blob/main/README_ZH.md | Built-in agent collaboration flow with a large set of tool servers. |
Design Principles for Open‑Source Implementations
Basic Requirements
- Agents must be able to exchange or share inputs and outputs.
- Collaboration among agents must be rule-driven.
- Agent termination must be condition-based (e.g., a maximum number of rounds or an explicit termination message).
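The three requirements above can be sketched as a minimal control loop. This is a simplified illustration, not code from any of the frameworks discussed; all names (`Agent`, `run`, `TERMINATE`) are invented, and the `step` method stands in for a real LLM call.

```python
from dataclasses import dataclass

MAX_ROUNDS = 6           # termination condition 1: hard cap on rounds
TERMINATE = "TERMINATE"  # termination condition 2: explicit termination message

@dataclass
class Agent:
    name: str

    def step(self, history: list[str]) -> str:
        # Placeholder for an LLM call; a real agent would read the shared
        # history and produce its next contribution.
        return f"{self.name}: handled round {len(history)}"

def run(agents: list[Agent]) -> list[str]:
    history: list[str] = []                  # requirement 1: shared inputs/outputs
    for round_no in range(MAX_ROUNDS):       # requirement 2: rule-driven order
        agent = agents[round_no % len(agents)]   # here: simple round-robin
        reply = agent.step(history)
        history.append(reply)
        if TERMINATE in reply:               # requirement 3: condition-based stop
            break
    return history

history = run([Agent("dev"), Agent("qa")])
print(len(history))  # 6: capped by MAX_ROUNDS, since no agent emits TERMINATE
```

Swapping the round-robin index for an LLM-chosen speaker turns this same loop into the "auto" mode AutoGen uses, discussed below.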
Implementation Directions
- Users can customize which agents participate; humans can also act as agents (a common practice).
- Each agent should have clearly defined responsibilities so it can be used directly.
Multi‑Agent Collaboration Approaches
Three main directions are identified: SOP‑driven (e.g., MetaGPT), LLM‑driven (e.g., AutoGen), and parallel execution with LLM summarization (e.g., XAgent).
1. SOP‑Driven Agents – MetaGPT
MetaGPT uses three core concepts:
- Environment: shared storage where agents observe and publish important messages.
- Observe: an agent reads the messages relevant to it from the environment and decides what action to take.
- Action: the agent performs the action and publishes the result back to the environment.
Reference: https://docs.deepwisdom.ai/zhcn/guide/tutorials/concepts.html#多智能体
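As a rough illustration of the Environment / Observe / Action cycle, here is a minimal publish-subscribe sketch. This is not MetaGPT's actual API; the class and topic names are invented for the example.

```python
from collections import defaultdict

class Environment:
    """Shared message store: agents publish results to it and observe from it."""
    def __init__(self):
        self.messages = defaultdict(list)   # messages grouped by topic

    def publish(self, topic: str, content: str) -> None:
        # Action: an agent writes its output back to the environment.
        self.messages[topic].append(content)

    def observe(self, topic: str) -> list[str]:
        # Observe: an agent reads only the messages relevant to it.
        return list(self.messages[topic])

env = Environment()
env.publish("requirements", "Build a CSV report tool")

# A hypothetical engineer agent observes the topic it subscribes to,
# acts on it, then publishes output for downstream agents (e.g. QA) to observe.
reqs = env.observe("requirements")
env.publish("code", f"code for: {reqs[0]}")
print(env.observe("code"))  # ['code for: Build a CSV report tool']
```

The SOP lives in which topics each role subscribes to: the workflow order is fixed by these subscriptions rather than decided by an LLM at runtime.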
2. LLM‑Driven Agents – AutoGen
AutoGen’s GroupChat supports four collaboration modes: auto, manual, random, and round‑robin, defaulting to auto. In auto mode, the GroupChat issues an LLM query to decide the next speaker based on conversation history and agent names.
Reference: https://microsoft.github.io/autogen/docs/reference/agentchat/groupchat#groupchat-objects
Example of the speaker‑selection prompt:
```python
# groupchat.py (abridged excerpt from AutoGen)
def select_speaker_msg(self, agents: List[Agent]) -> str:
    """Return the system message for selecting the next speaker."""
    return f"""You are in a role play game. The following roles are available:
{self._participant_roles(agents)}.

Read the following conversation.
Then select the next role from {[agent.name for agent in agents]} to play. Only return the role."""

# auto speaker selection
selector.update_system_message(self.select_speaker_msg(agents))
final, name = selector.generate_oai_reply(
    self.messages
    + [
        {
            "role": "system",
            "content": f"Read the above conversation. Then select the next role from {[agent.name for agent in agents]} to play. Only return the role.",
        }
    ]
)
```

3. Parallel Execution with LLM Summarization – XAgent
XAgent follows a pipeline of plan → dispatch → execute → submit → revise , using submit and revise agents for post‑processing and decision making.
Reference video: https://www.bilibili.com/video/BV1BN411W7VH
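The plan → dispatch → execute → submit → revise pipeline can be sketched as follows. Every function body is a stand-in for an LLM call, and none of the names below come from XAgent itself; the point is the shape of the flow: subtasks fan out in parallel, then a summarizing step merges them and a revise step makes the final decision.

```python
from concurrent.futures import ThreadPoolExecutor

def plan(goal: str) -> list[str]:
    # Stand-in for the planner agent: split the goal into subtasks.
    return [f"{goal} / subtask {i}" for i in range(3)]

def execute(subtask: str) -> str:
    # Stand-in for a worker agent executing one subtask.
    return f"done: {subtask}"

def submit(results: list[str]) -> str:
    # Stand-in for the summarizing agent: merge the parallel results.
    return "; ".join(sorted(results))

def revise(summary: str) -> str:
    # Stand-in for the revise agent's accept/retry decision.
    return summary

def run(goal: str) -> str:
    subtasks = plan(goal)                                # plan
    with ThreadPoolExecutor() as pool:                   # dispatch
        results = list(pool.map(execute, subtasks))      # execute in parallel
    return revise(submit(results))                       # submit -> revise

print(run("analyze sales data"))
```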
Research Conclusions
I favor the LLM-driven multi-agent design because it reduces engineering overhead and can evolve as LLM capabilities improve, whereas SOP-driven approaches are static:
- LLM capabilities are advancing rapidly; today's bottlenecks may be resolved by model upgrades alone, without framework changes.
- LLM-driven collaboration routes work dynamically based on the actual problem, rather than following a fixed pipeline.
Future Thoughts on Self‑Developed Multi‑Agent Frameworks
If we were to build our own framework, essential features include enhanced single‑agent abilities, team awareness, customizable collaboration flows (LLM or SOP), and shared memory. Optional enhancements involve autonomous agent awareness, concurrent execution, human‑in‑the‑loop adjustments, long‑term memory, and strict data security.
Essential Elements
- Agent – strengthen individual agent capabilities.
- Team – let agents discover their peers.
- Collaboration – customizable collaboration flows (LLM-driven or SOP-driven).
- Memory – shared storage for collaborative data.
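A minimal sketch of how the four essential elements might fit together. All names here are hypothetical; `collaborate` hard-codes an SOP-style order, while an LLM-driven variant would choose the next agent name dynamically at each step.

```python
from typing import Protocol

class Agent(Protocol):
    """Element 1: an individual capability unit."""
    name: str
    def act(self, context: str) -> str: ...

class Team:
    """Element 2: peer discovery - agents can look each other up by name."""
    def __init__(self, agents: list[Agent]):
        self.roster = {a.name: a for a in agents}

class Memory:
    """Element 4: shared storage for collaborative data."""
    def __init__(self):
        self.log: list[str] = []

def collaborate(team: Team, memory: Memory, order: list[str]) -> None:
    """Element 3: a pluggable flow. Here the order is a fixed SOP list;
    an LLM-driven variant would pick the next name based on the log."""
    for name in order:
        context = memory.log[-1] if memory.log else ""
        memory.log.append(team.roster[name].act(context))

class EchoAgent:
    """Toy agent that just records what it observed."""
    def __init__(self, name: str):
        self.name = name
    def act(self, context: str) -> str:
        return f"{self.name} saw [{context}]"

team = Team([EchoAgent("pm"), EchoAgent("dev")])
mem = Memory()
collaborate(team, mem, ["pm", "dev"])
print(mem.log)  # ['pm saw []', 'dev saw [pm saw []]']
```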
Optional Enhancements
- Agents autonomously drive surrounding agents.
- Concurrent agent execution.
- Human-machine iterative goal refinement.
- Persistent long-term memory.
- Strict data control and sanitization for outbound information.
Appendix
AutoGen Learning Resources
Theoretical and Introductory Materials
- Talk: PSU Assistant Professor Wu Qingyun – "AutoGen: Using Multi-Agent Dialogue to Enable Next-Gen LLM Applications" (Bilibili).
- Microsoft AutoGen demo and code walkthrough (Bilibili).
- Paper: "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework".
- Multi-Agent Conversation Framework overview (official documentation).
Collaboration Principle Demonstration
Agent LLM Selection Logic
See the code snippet above for the speaker‑selection prompt used by AutoGen.
Demo Learning
https://microsoft.github.io/autogen/docs/Examples/AgentChat
Originally published on the Rare Earth Juejin tech community.