Mastering Microsoft AutoGen 0.4: Build Async Multi‑Agent Apps from Scratch

This article provides a comprehensive, step‑by‑step guide to Microsoft AutoGen 0.4, explaining its layered architecture, core concepts such as Agent, Runtime, and Agent ID, and demonstrating both a simple Hello‑World multi‑agent example and an AI‑enabled agent with full Python code snippets.


Introduction to AutoGen 0.4

Microsoft AutoGen, once part of the early LLM‑application framework trio alongside LangChain and LlamaIndex, has been completely redesigned and released as version 0.4 (stable release 0.4.2 in early 2025). The new release is a non‑backward‑compatible, asynchronous, message‑driven multi‑agent framework that can interoperate with LangGraph and LlamaIndex workflows.

Architecture Layers

AutoGen is organized into three reusable layers:

Core : Provides foundational components such as message communication, agents, tools, logging, and distributed runtime support.

AgentChat : A higher‑level API built on Core, offering pre‑constructed agent types, agent teams, and message abstractions, facilitating migration from older versions.

Extensions : Third‑party plug‑ins (LLM clients, code executors, additional tools) that extend the ecosystem; developers can also contribute their own extensions.

An additional example application, Magnetic‑One , demonstrates a generic multi‑agent program for web browsing, code execution, and file handling.
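Assuming the 0.4 packaging conventions, each layer ships as its own pip package, so a minimal environment for the examples below might be set up like this (package names per the AutoGen 0.4 distribution; the `openai` extra pulls in the OpenAI model client used later):

```shell
# Core runtime and message primitives
pip install "autogen-core"
# Higher-level AgentChat API (optional for the pure-Core examples below)
pip install "autogen-agentchat"
# Extensions, including the OpenAI chat-completion client
pip install "autogen-ext[openai]"
```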

Key Concepts

Agent : A software object that can respond to messages, maintain state, and execute custom logic (e.g., API calls, sending messages, running Python code). An LLM‑driven agent is an AI Agent.

Runtime : A server‑like container that manages agent lifecycle (registration, creation, destruction) and provides message routing. It can run on a single machine or be distributed across multiple nodes.

Agent ID : The unique identifier for an agent instance, composed of an Agent Type and an Agent Key , serving as the address for message delivery.

Messages : Structured data exchanged between agents. AutoGen supports direct messages and publish/subscribe (topic‑based) broadcasting.
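To make the Agent ID concept concrete, here is a minimal, framework-free stand-in (a hypothetical `SimpleAgentId`, not AutoGen's actual class) showing how a type and a key together form one addressable identity:

```python
from dataclasses import dataclass

# Hypothetical illustration of the Agent ID concept: an agent's address is
# the pair (type, key). The type is fixed at registration time; the key
# distinguishes instances of that type.
@dataclass(frozen=True)
class SimpleAgentId:
    type: str  # e.g. "my_worker_agent", chosen when the class is registered
    key: str   # e.g. "worker" or "default", identifying one instance

    def __str__(self) -> str:
        return f"{self.type}/{self.key}"

aid = SimpleAgentId("my_worker_agent", "worker")
print(aid)  # my_worker_agent/worker
```

Two instances that share a type but differ in key are distinct delivery targets, which is what lets a runtime host many copies of the same agent class.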

Hello World Example (Pure Core)

The following Python code defines a simple two‑agent system (MyManagerAgent and MyWorkerAgent) whose agents exchange a "Hello World" message using only the Core layer.

import asyncio
from dataclasses import dataclass

from autogen_core import (
    AgentId,
    MessageContext,
    RoutedAgent,
    SingleThreadedAgentRuntime,
    message_handler,
)


@dataclass
class MyTextMessage:
    content: str

class MyWorkerAgent(RoutedAgent):
    def __init__(self) -> None:
        super().__init__("MyWorkerAgent")

    @message_handler
    async def handle_my_message(self, message: MyTextMessage, ctx: MessageContext) -> MyTextMessage:
        print(f"{self.id.key} received from {ctx.sender}: {message.content}\n")
        return MyTextMessage(content="OK, Got it!")

class MyManagerAgent(RoutedAgent):
    def __init__(self) -> None:
        super().__init__("MyManagerAgent")
        self.worker_agent_id = AgentId('my_worker_agent', 'worker')

    @message_handler
    async def handle_my_message(self, message: MyTextMessage, ctx: MessageContext) -> None:
        print(f"{self.id.key} received: {message.content}\n")
        print(f"{self.id.key} sending to {self.worker_agent_id}...\n")
        response = await self.send_message(message, self.worker_agent_id)
        print(f"{self.id.key} got reply from {self.worker_agent_id}: {response.content}\n")

async def main():
    runtime = SingleThreadedAgentRuntime()
    await MyManagerAgent.register(runtime, "my_manager_agent", lambda: MyManagerAgent())
    await MyWorkerAgent.register(runtime, "my_worker_agent", lambda: MyWorkerAgent())
    runtime.start()
    agent_id = AgentId("my_manager_agent", "manager")
    await runtime.send_message(MyTextMessage(content="Hello World!"), agent_id)
    await runtime.stop_when_idle()

asyncio.run(main())

This example shows how to register agents, start the runtime, send a direct message, and shut down the system.
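The Hello World example uses only direct messages; the Key Concepts section also mentions topic‑based broadcasting. The sketch below is a hypothetical, framework‑free `MiniRuntime` (not AutoGen's actual runtime) that contrasts the two delivery modes: direct delivery addresses one recipient and returns its reply, while publish/subscribe delivers a copy to every subscriber with no reply.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class TextMessage:
    content: str

# MiniRuntime is an illustration of the routing idea only.
class MiniRuntime:
    def __init__(self) -> None:
        self._agents = {}                # address -> async handler
        self._subs = defaultdict(list)   # topic -> subscribed addresses

    def register(self, address, handler, topics=()):
        self._agents[address] = handler
        for topic in topics:
            self._subs[topic].append(address)

    async def send_message(self, msg, address):
        # Direct delivery: exactly one recipient, reply returned to sender.
        return await self._agents[address](msg)

    async def publish_message(self, msg, topic):
        # Broadcast delivery: every subscriber receives a copy, no reply.
        for address in self._subs[topic]:
            await self._agents[address](msg)

received = []

async def main():
    rt = MiniRuntime()

    async def worker(msg: TextMessage) -> TextMessage:
        received.append(msg.content)
        return TextMessage("OK, Got it!")

    rt.register("worker/1", worker, topics=["news"])
    rt.register("worker/2", worker, topics=["news"])

    reply = await rt.send_message(TextMessage("Hello"), "worker/1")
    await rt.publish_message(TextMessage("Broadcast"), "news")
    print(reply.content)  # OK, Got it!
    print(received)       # ['Hello', 'Broadcast', 'Broadcast']

asyncio.run(main())
```

In AutoGen itself the same split appears as `runtime.send_message(...)` versus `runtime.publish_message(...)` with topic subscriptions; the toy version just makes the two flows visible side by side.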

First AI Agent Using AutoGen‑Core

Next, we introduce an AI‑enabled agent that leverages an LLM (e.g., gpt-4o-mini) for response generation and maintains a short conversation history.

from dataclasses import dataclass

from autogen_core import MessageContext, RoutedAgent, message_handler
from autogen_core.model_context import BufferedChatCompletionContext
from autogen_core.models import AssistantMessage, SystemMessage, UserMessage
from autogen_ext.models.openai import OpenAIChatCompletionClient


@dataclass
class Message:
    content: str

class MyAgent(RoutedAgent):
    def __init__(self) -> None:
        super().__init__("A simple agent")
        self._system_messages = [SystemMessage(content="You are a helpful AI assistant. Please answer in Chinese.")]
        self._model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")
        self._model_context = BufferedChatCompletionContext(buffer_size=5)

    @message_handler
    async def handle_user_message(self, message: Message, ctx: MessageContext) -> Message:
        user_message = UserMessage(content=message.content, source="user")
        await self._model_context.add_message(user_message)
        response = await self._model_client.create(
            self._system_messages + (await self._model_context.get_messages()),
            cancellation_token=ctx.cancellation_token,
        )
        await self._model_context.add_message(AssistantMessage(content=response.content, source=self.metadata["type"]))
        return Message(content=response.content)

After registering this agent in the same runtime as before, sending a message such as "中国的首都是哪个城市?" ("Which city is the capital of China?") will produce an LLM‑generated answer, demonstrating how AutoGen‑Core can be extended with AI capabilities.
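The `BufferedChatCompletionContext(buffer_size=5)` in the agent above caps how much history is replayed to the model on each turn. A hypothetical, dependency-free `BufferedContext` (not AutoGen's class) captures the idea: keep only the most recent N messages so the prompt stays bounded.

```python
from collections import deque

# Hypothetical stand-in for the buffered-context idea: a fixed-size window
# over the conversation. Older messages fall off the front automatically.
class BufferedContext:
    def __init__(self, buffer_size: int) -> None:
        self._messages = deque(maxlen=buffer_size)

    def add_message(self, message) -> None:
        self._messages.append(message)

    def get_messages(self) -> list:
        return list(self._messages)

ctx = BufferedContext(buffer_size=5)
for i in range(8):
    ctx.add_message(f"msg-{i}")
print(ctx.get_messages())  # only the last five survive: msg-3 .. msg-7
```

Note that in the real agent the system messages are prepended separately (`self._system_messages + ...`), so they are never evicted by the rolling window.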

Running the Examples

Both examples produce console output that confirms agent registration, message flow, and (for the AI agent) LLM responses. The tutorial emphasizes that the Core layer is agnostic to AI; any agent can be built without LLMs, and AI functionality can be added later.

[Figure: AutoGen architecture diagram]
[Figure: Layered component diagram]
[Figure: Core runtime diagram]
[Figure: Hello World output]
[Figure: AI agent response]
Written by: AI Large Model Application Practice

Focused on deep research and development of large-model applications. Authors of "RAG Application Development and Optimization Based on Large Models" and "MCP Principles Unveiled and Development Guide". Primarily B2B, with B2C as a supplement.