Mastering Microsoft AutoGen 0.4: Build Async Multi‑Agent Apps from Scratch
This article provides a comprehensive, step‑by‑step guide to Microsoft AutoGen 0.4, explaining its layered architecture, core concepts such as Agent, Runtime, and Agent ID, and demonstrating both a simple Hello‑World multi‑agent example and an AI‑enabled agent with full Python code snippets.
Introduction to AutoGen 0.4
Microsoft AutoGen, once part of the early LLM‑application framework trio with LangChain and LlamaIndex, was completely redesigned and released as version 0.4 (stable 0.4.2 in early 2025). The new release is a non‑backward‑compatible, asynchronous, message‑driven multi‑agent framework.
Architecture Layers
AutoGen is organized into three reusable layers:
Core: Provides foundational components such as message communication, agents, tools, logging, and distributed runtime support.
AgentChat: A higher‑level API built on Core, offering pre‑constructed agent types, agent teams, and message abstractions, facilitating migration from older versions.
Extensions: Third‑party plug‑ins (LLM clients, code executors, additional tools) that extend the ecosystem; developers can also contribute their own extensions.
An additional example application, Magentic‑One, demonstrates a generic multi‑agent program for web browsing, code execution, and file handling.
Key Concepts
Agent: A software object that can respond to messages, maintain state, and execute custom logic (e.g., API calls, sending messages, running Python code). An LLM‑driven agent is an AI Agent.
Runtime: A server‑like container that manages the agent lifecycle (registration, creation, destruction) and provides message routing. It can run on a single machine or be distributed across multiple nodes.
Agent ID: The unique identifier for an agent instance, composed of an Agent Type and an Agent Key, serving as the address for message delivery.
Messages: Structured data exchanged between agents. AutoGen supports direct messages and publish/subscribe (topic‑based) broadcasting.
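To make the addressing idea concrete, here is a minimal stdlib‑only sketch of how an Agent ID combines a type and a key into a routable address. This is an illustration of the concept, not the autogen_core class itself (which, to the best of our knowledge, stringifies an ID the same way, as "type/key"):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentId:
    """Conceptual sketch of AutoGen's Agent ID: type + key = address."""

    type: str  # which kind of agent (maps to a registered factory)
    key: str   # which instance of that type

    def __str__(self) -> str:
        # An Agent ID is rendered as "type/key"
        return f"{self.type}/{self.key}"


worker = AgentId("my_worker_agent", "worker")
print(worker)  # my_worker_agent/worker
```

The split matters because the runtime instantiates agents lazily: the type selects a registered factory, and the key distinguishes instances of that type.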
Hello World Example (Pure Core)
The following Python code defines a simple two‑agent system (MyManagerAgent and MyWorkerAgent) that exchanges a "Hello World" message using the Core layer.
import asyncio
from dataclasses import dataclass

from autogen_core import (
    AgentId,
    MessageContext,
    RoutedAgent,
    SingleThreadedAgentRuntime,
    message_handler,
)


@dataclass
class MyTextMessage:
    content: str


class MyWorkerAgent(RoutedAgent):
    def __init__(self) -> None:
        super().__init__("MyWorkerAgent")

    @message_handler
    async def handle_my_message(self, message: MyTextMessage, ctx: MessageContext) -> MyTextMessage:
        print(f"{self.id.key} received from {ctx.sender}: {message.content}\n")
        return MyTextMessage(content="OK, Got it!")


class MyManagerAgent(RoutedAgent):
    def __init__(self) -> None:
        super().__init__("MyManagerAgent")
        # Address of the worker instance: agent type + agent key
        self.worker_agent_id = AgentId("my_worker_agent", "worker")

    @message_handler
    async def handle_my_message(self, message: MyTextMessage, ctx: MessageContext) -> None:
        print(f"{self.id.key} received: {message.content}\n")
        print(f"{self.id.key} sending to {self.worker_agent_id}...\n")
        response = await self.send_message(message, self.worker_agent_id)
        print(f"{self.id.key} got reply from {self.worker_agent_id}: {response.content}\n")


async def main():
    runtime = SingleThreadedAgentRuntime()
    # Register both agent types with factory functions
    await MyManagerAgent.register(runtime, "my_manager_agent", lambda: MyManagerAgent())
    await MyWorkerAgent.register(runtime, "my_worker_agent", lambda: MyWorkerAgent())
    runtime.start()
    agent_id = AgentId("my_manager_agent", "manager")
    await runtime.send_message(MyTextMessage(content="Hello World!"), agent_id)
    await runtime.stop_when_idle()


asyncio.run(main())

This example shows how to register agents, start the runtime, send a direct message, and shut down the system.
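The Hello‑World example uses only direct, point‑to‑point messages. AutoGen's Core layer also supports topic‑based publish/subscribe, where one message is broadcast to every agent subscribed to a topic. The pattern itself, independent of AutoGen's API, can be sketched with plain asyncio (a toy broker for illustration, not autogen_core code):

```python
import asyncio
from collections import defaultdict


class TopicBroker:
    """Toy pub/sub broker: illustrates topic-based broadcast, not AutoGen's API."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of async handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    async def publish(self, topic, message):
        # Broadcast: every handler subscribed to this topic gets the message
        await asyncio.gather(*(h(message) for h in self._subscribers[topic]))


received = []


async def worker_a(msg):
    received.append(f"worker_a got: {msg}")


async def worker_b(msg):
    received.append(f"worker_b got: {msg}")


async def main():
    broker = TopicBroker()
    broker.subscribe("greetings", worker_a)
    broker.subscribe("greetings", worker_b)
    await broker.publish("greetings", "Hello World!")


asyncio.run(main())
print(received)  # both workers received the broadcast
```

In AutoGen itself, the runtime plays the broker's role: agents declare subscriptions to topics, and publishing to a topic delivers the message to every subscriber rather than to a single Agent ID.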
First AI Agent Using AutoGen‑Core
Next, we introduce an AI‑enabled agent that leverages an LLM (e.g., gpt-4o-mini) for response generation and maintains a short conversation history.
from dataclasses import dataclass

from autogen_core import MessageContext, RoutedAgent, message_handler
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_core.models import AssistantMessage, SystemMessage, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class Message:
    content: str


class MyAgent(RoutedAgent):
    def __init__(self) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage(content="You are a helpful AI assistant. Please answer in Chinese.")]
        self._model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")
        # Keep only the most recent messages as conversation context
        self._model_context = BufferedChatCompletionContext(buffer_size=5)

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        user_message = UserMessage(content=message.content, source="user")
        await self._model_context.add_message(user_message)
        response = await self._model_client.create(
            self._system_messages + (await self._model_context.get_messages()),
            cancellation_token=ctx.cancellation_token,
        )
        await self._model_context.add_message(
            AssistantMessage(content=response.content, source=self.metadata["type"])
        )
        return Message(content=response.content)

After registering this agent in the same runtime as before, sending a message such as "Which city is the capital of China?" will produce an LLM‑generated answer, demonstrating how AutoGen‑Core can be extended with AI capabilities.
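The buffered context is what keeps the conversation history short: with buffer_size=5, only the five most recent messages are sent to the model, and older turns silently drop out of the prompt. The effect can be sketched with a stdlib deque (an illustration of the buffering idea, not the autogen_core class):

```python
from collections import deque


class BufferedContext:
    """Sketch of a buffered chat context: retains only the last N messages."""

    def __init__(self, buffer_size: int):
        # deque with maxlen evicts the oldest entry automatically
        self._messages = deque(maxlen=buffer_size)

    def add_message(self, message: str) -> None:
        self._messages.append(message)

    def get_messages(self) -> list:
        return list(self._messages)


ctx = BufferedContext(buffer_size=5)
for i in range(1, 8):          # add seven turns
    ctx.add_message(f"turn {i}")
print(ctx.get_messages())      # ['turn 3', 'turn 4', 'turn 5', 'turn 6', 'turn 7']
```

This trades recall for cost: the prompt stays bounded regardless of conversation length, at the price of the model forgetting anything older than the buffer.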
Running the Examples
Both examples produce console output that confirms agent registration, message flow, and (for the AI agent) LLM responses. The tutorial emphasizes that the Core layer is agnostic to AI; any agent can be built without LLMs, and AI functionality can be added later.
AI Large Model Application Practice
Focused on deep research and development of large-model applications. Authors of "RAG Application Development and Optimization Based on Large Models" and "MCP Principles Unveiled and Development Guide". Primarily B2B, with B2C as a supplement.
