40+ Diagrams Uncover LLM Agents’ Core Components, Multi‑Agent Frameworks, and MCP Stack
This article breaks down the essential building blocks of LLM agents—including environment, sensors, effectors, short‑ and long‑term memory, tools, planning, and reasoning—while illustrating how Model Context Protocol (MCP), Toolformer, ReAct, Reflexion, and popular multi‑agent frameworks such as AutoGen, MetaGPT and CAMEL enable scalable, collaborative AI systems.
What Is an LLM Agent?
According to Russell and Norvig (2016), an AI agent is any entity that perceives its environment via sensors and acts upon it with effectors. LLM agents follow this definition, interacting with both physical and software environments.
Core Components of an Agent
Environment – the world the agent interacts with.
Sensor – observes the environment.
Effector – tools that act on the environment.
Effectuator – the “brain” or rules that decide how observations translate into actions.
This framework applies to robots, AI agents, and any system that needs to perceive and act.
Memory
LLMs are inherently forgetful; they do not retain conversation history across separate queries. Short‑term (working) memory is provided by the model’s context window, which can be expanded to hold full dialogue histories. When the context window is insufficient, a secondary LLM can summarize prior interactions.
Long‑term memory is required to track dozens or hundreds of steps. A common technique stores all past interactions, actions, and dialogues in an external vector database, which can be queried via Retrieval‑Augmented Generation (RAG).
Tools
Tools let an LLM interact with external services (e.g., databases, APIs) or run custom code. They are used for data retrieval or performing actions such as scheduling meetings. To invoke a tool, the LLM must generate a JSON‑compatible request that can be fed to a code interpreter.
Model Context Protocol (MCP)
Anthropic’s MCP standardizes API access for services like weather apps and GitHub. It consists of three components: MCP Host (e.g., Cursor) that manages connections, MCP Client that maintains a one‑to‑one link with the server, and MCP Server that provides context, tools, and capabilities to the LLM.
In a typical workflow, an LLM asks the MCP Host which tools are available, selects a tool, sends a request through the MCP Server, receives the result, and then formulates a response for the user.
Planning
Planning decomposes a task into executable steps. It requires the LLM to reason about the next step before acting. “Reasoning‑type” LLMs perform chain‑of‑thought before answering, and this reasoning can be achieved via fine‑tuning or prompt engineering. Providing exemplars in prompts guides the LLM’s reasoning process.
During training, models such as DeepSeek‑R1 use reward signals to encourage internal thinking.
Reasoning and Action (ReAct)
ReAct combines reasoning with action using a prompt that cycles through three stages: Think (reason about the current situation), Act (execute actions, often via tools), and Observe (reason about the outcome). This loop enables the LLM to iteratively solve problems.
Reflection (Reflexion)
Even ReAct‑enabled agents can fail. Reflexion adds a verbal reinforcement step that lets the agent learn from past failures by reflecting on its actions and the evaluator’s scores. The process involves three roles: an Actor that selects actions, an Evaluator that scores outputs, and a Self‑Reflector that reviews both.
Multi‑Agent Collaboration
Single‑agent systems face challenges such as tool overload, complex context, and lack of specialization. Multi‑Agent systems address these by assigning specialized agents, each with its own toolset, under a supervisory coordinator that manages communication and task allocation.
Key architectural questions are Agent Initialization (how specialized agents are created) and Agent Coordination (how they cooperate).
Human‑Behavior Simulation
The paper “Generative Agents: Interactive Simulacra of Human Behavior” introduced agents that simulate believable human actions using profiles, memory, planning, and reflection, demonstrating the power of modular agent design.
Modular Frameworks
Popular frameworks—AutoGen, MetaGPT, and CAMEL—differ mainly in how agents communicate, but all rely on collaborative communication where agents exchange state, goals, and next steps. These frameworks have seen explosive growth in recent weeks.
2025 is projected to be a pivotal year for AI agents, with continued real‑world deployments such as DeepSeek‑R1 combined with agents.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Smart Era Software Development
Committed to openness and connectivity, we build frontline engineering capabilities in software, requirements, and platform engineering. By integrating digitalization, cloud computing, blockchain, new media and other hot tech topics, we create an efficient, cutting‑edge tech exchange platform and a diversified engineering ecosystem. Provides frontline news, summit updates, and practical sharing.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
