
Understanding Multi‑Agent AI Systems: ReAct Architecture, MCP Protocol, and OpenManus Implementation

This article explains how ReAct's tightly coupled reasoning‑action loop, the Model Context Protocol (MCP), and the open‑source OpenManus implementation enable autonomous task planning, tool invocation, and memory management. It contrasts traditional "dialogue‑centered" chatbots with "delivery‑centered" agents and highlights current limitations and future optimization needs.

DeWu Technology

Since the release of ChatGPT in December 2022, users have become accustomed to interacting with large language models (LLMs) through simple dialogue and prompt engineering. However, this interaction model is limited: it cannot automatically decompose tasks, invoke external tools, or synthesize results without explicit user guidance.

ChatBot vs. Agent – A traditional ChatBot follows a "dialogue‑centered" approach, responding to a single user instruction. In contrast, a "delivery‑centered" multi‑agent system (Agent) receives a high‑level goal, automatically plans sub‑tasks, selects appropriate tools, executes them, and returns a polished solution.

From Prompt to Thought Chain – The "Chain‑of‑Thought" (CoT) technique encourages LLMs to reason step‑by‑step, akin to writing on a scratchpad. This improves mathematical and logical performance but still relies on static prompts.

ReAct Architecture – ReAct tightly couples Reasoning and Action. The model iteratively thinks, decides which tool to call (e.g., search engine, calculator), observes the result, and repeats until a final answer is produced. A minimal ReAct prompt example is:

You are an AI assistant. Use the following tools: {tools}

- Question: {input}
- Thought: ...
- Action: [tool_name] with arguments
- Observation: ...
(Repeat as needed)
- Final Answer: ...
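A ReAct loop following this template can be sketched in a few lines of Python. The `llm` and `calculator` functions below are hypothetical stand-ins (a canned model reply and a toy tool), not OpenManus code:

```python
import re

def calculator(expression: str) -> str:
    # Toy tool: evaluate a simple arithmetic expression with builtins disabled.
    return str(eval(expression, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def llm(transcript: str) -> str:
    # Placeholder for a real model call; returns a canned two-step trace.
    if "Observation" not in transcript:
        return "Thought: I need to compute 2+3.\nAction: calculator[2+3]"
    return "Final Answer: 5"

def react(question: str, max_steps: int = 5) -> str:
    # Think -> act -> observe, repeated until a final answer or the step budget.
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:")[1].strip()
        match = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if match:
            name, arg = match.group(1), match.group(2)
            observation = TOOLS[name](arg)
            transcript += f"\n{reply}\nObservation: {observation}"
    return "No answer within step budget"
```

The key structural point is that the tool's output is fed back into the transcript as an `Observation`, so the next model call reasons over fresh evidence rather than a static prompt.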

Agent Architecture – An Agent consists of four core components:

Planning: decomposes complex tasks into executable sub‑tasks.

Memory: stores short‑term and long‑term context (messages, files, embeddings).

Tools: external capabilities such as web browsing, code execution, or database queries.

Action: executes the selected tool and integrates the observation back into the reasoning loop.
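The four components can be sketched as a skeleton class. Names and method bodies here are illustrative, not OpenManus's actual API:

```python
from collections import deque

class Agent:
    """Skeleton of the four components: planning, memory, tools, action."""

    def __init__(self, tools, memory_capacity=50):
        self.tools = tools                            # Tools: name -> callable
        self.memory = deque(maxlen=memory_capacity)   # Memory: bounded history

    def plan(self, goal: str):
        # Planning: trivially split a goal into sub-tasks.
        # A real agent would ask the LLM to decompose the goal.
        return [step.strip() for step in goal.split(";") if step.strip()]

    def act(self, tool_name: str, *args):
        # Action: invoke the tool and feed the observation back into memory.
        observation = self.tools[tool_name](*args)
        self.memory.append({"tool": tool_name, "observation": observation})
        return observation
```

Bounding the memory (`maxlen`) matters in practice: without it, the message history grows with every step and eventually overflows the model's context window.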

OpenManus Case Study – OpenManus is an open‑source implementation of a multi‑agent system built on the Model Context Protocol (MCP). It demonstrates the full pipeline from prompt definition to tool execution. Repository: https://github.com/mannaandpoem/OpenManus/tree/main

Setup Instructions

Create a Python 3.12 virtual environment.

Install Playwright:

playwright install
# or
python -m playwright install
# install only Firefox
python -m playwright install firefox

Configure an LLM API key (e.g., DeepSeek, Tongyi Qianwen).
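A configuration sketch is shown below. The field names are assumptions based on a typical `config.toml` layout; verify them against the example config in the OpenManus repository before use:

```toml
[llm]
model = "deepseek-chat"
base_url = "https://api.deepseek.com/v1"
api_key = "sk-..."   # your provider key, never committed to version control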

Tool Definitions (JSON)

[
  {
    "type": "function",
    "function": {
      "name": "python_execute",
      "description": "Executes Python code string. Only print outputs are visible.",
      "parameters": {
        "type": "object",
        "properties": {"code": {"type": "string", "description": "The Python code to execute."}},
        "required": ["code"]
      }
    }
  },
  {
    "type": "function",
    "function": {
      "name": "google_search",
      "description": "Perform a Google search and return a list of relevant links.",
      "parameters": {
        "type": "object",
        "properties": {
          "query": {"type": "string", "description": "The search query."},
          "num_results": {"type": "integer", "description": "Number of results to return.", "default": 10}
        },
        "required": ["query"]
      }
    }
  }
]
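These definitions follow the OpenAI function-calling schema. Before dispatching a model-issued call, a client can check the arguments against each tool's `required` list; the sketch below is a minimal illustration, not OpenManus's actual validator:

```python
import json

# A compact copy of the google_search definition above, loaded for validation.
TOOL_DEFS = json.loads("""
[{"type": "function",
  "function": {"name": "google_search",
               "parameters": {"type": "object",
                              "properties": {"query": {"type": "string"},
                                             "num_results": {"type": "integer", "default": 10}},
                              "required": ["query"]}}}]
""")

def validate_call(name: str, arguments_json: str):
    """Return (ok, missing_params) for a tool call against the definitions."""
    args = json.loads(arguments_json)
    for tool in TOOL_DEFS:
        fn = tool["function"]
        if fn["name"] == name:
            missing = [r for r in fn["parameters"].get("required", []) if r not in args]
            return (not missing, missing)
    return (False, ["unknown tool"])
```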

Example LLM Response (Tool Call)

ChatCompletionMessage(
    role='assistant',
    content="It seems there was an issue retrieving Kobe Bryant's height and weight. I will use BrowserUseTool to navigate to a reliable source.",
    tool_calls=[
        ChatCompletionMessageToolCall(
            id='call_aez57ImfIEZrqjZdcW9sFNEJ',
            function=Function(
                name='browser_use',
                arguments='{"action":"navigate","url":"https://www.biography.com/athlete/kobe-bryant"}'
            ),
            type='function'
        )
    ]
)
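On receipt, the client parses the `arguments` field, which arrives as a JSON string, and dispatches to the matching local tool. A minimal dispatch sketch (the `browser_use` handler here is a stub, not the real Playwright-backed tool):

```python
import json

def browser_use(action: str, url: str) -> str:
    # Stub handler; the real tool drives a Playwright browser session.
    return f"{action}:{url}"

HANDLERS = {"browser_use": browser_use}

def dispatch(tool_call: dict) -> str:
    name = tool_call["function"]["name"]
    # arguments is a JSON *string*, not a dict, so it must be parsed first.
    kwargs = json.loads(tool_call["function"]["arguments"])
    return HANDLERS[name](**kwargs)

call = {"function": {"name": "browser_use",
                     "arguments": '{"action": "navigate", "url": "https://example.com"}'}}
```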

Model Context Protocol (MCP) – MCP standardizes the interaction between LLMs and external tools, similar to a USB‑C interface for devices. It defines hosts (LLM clients), servers (tool providers), and a lightweight client‑server communication model that decouples tool implementations from specific LLM APIs, enabling cross‑platform compatibility.
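Concretely, MCP messages are JSON-RPC 2.0 objects exchanged between the host/client and a tool server. The sketch below shows what a `tools/list` round trip might look like; it is hand-written for illustration, not captured traffic:

```python
import json

# Client -> server: ask the server which tools it provides.
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Server -> client: tool names plus JSON Schemas describing their inputs.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [{
            "name": "google_search",
            "description": "Perform a Google search.",
            "inputSchema": {"type": "object",
                            "properties": {"query": {"type": "string"}},
                            "required": ["query"]},
        }]
    },
}

wire = json.dumps(request)  # serialized form that crosses the transport (stdio/HTTP)
```

Because the tool's interface is described as data (a JSON Schema) rather than code, any MCP-capable host can discover and call it without a provider-specific SDK, which is the "USB-C" decoupling described above.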

Implementation Details

State management uses a context manager to switch between IDLE, RUNNING, and ERROR states.

Memory updates occur after each think() and act() step, with a fixed capacity to avoid unbounded growth.

The main loop repeatedly calls step() (ReAct step) until a maximum step count or a FINISHED state is reached.

Special tools such as Terminate trigger a graceful shutdown.
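Taken together, these details suggest a main loop roughly like the following. State names follow the article; `think()` and `act()` are stubs, not OpenManus's implementations:

```python
from collections import deque
from enum import Enum

class AgentState(Enum):
    IDLE = "idle"
    RUNNING = "running"
    FINISHED = "finished"
    ERROR = "error"

class LoopAgent:
    def __init__(self, max_steps=10, memory_capacity=100):
        self.state = AgentState.IDLE
        self.max_steps = max_steps
        self.memory = deque(maxlen=memory_capacity)  # fixed capacity, no unbounded growth

    def think(self) -> str:
        # Stub: a real agent asks the LLM whether to keep acting or terminate.
        self.memory.append("thought")
        return "terminate" if len(self.memory) >= 3 else "continue"

    def act(self, decision: str) -> None:
        self.memory.append(f"acted:{decision}")
        if decision == "terminate":              # special tool: graceful shutdown
            self.state = AgentState.FINISHED

    def run(self) -> int:
        self.state = AgentState.RUNNING
        steps = 0
        while self.state == AgentState.RUNNING and steps < self.max_steps:
            decision = self.think()              # one ReAct step: think, then act
            self.act(decision)
            steps += 1
        return steps
```

The `max_steps` guard is the safety net: if the model never emits Terminate, the loop still halts instead of burning tokens indefinitely.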

Conclusion – OpenManus showcases how a ReAct‑based agent can combine web search, code execution, and reasoning to solve multi‑step tasks (e.g., calculating Kobe Bryant’s BMI). Nevertheless, it remains a prototype: token consumption is high, context overflow can cause hallucinations, and complex queries still fail. Future work should focus on optimizing tool usage, reducing token overhead, and improving robust memory management to move such agents toward production‑grade AI assistants.

Tags: AI agents, MCP, OpenManus, prompt engineering, ReAct, Tool Integration
Written by DeWu Technology

A platform for sharing and discussing tech knowledge, guiding you toward the cloud of technology.