7 Essential Agent Design Patterns for Building Autonomous AI Systems
This article explains the fundamental differences between workflows and agents, introduces seven core design patterns—including three workflow patterns and four agent patterns—provides Python examples using Ollama, and shows how to combine these patterns to create robust, autonomous AI applications.
1. Workflow vs Agent
Workflows follow a fixed script, while agents make decisions based on goals. Use a workflow when steps are deterministic; use an agent when the next action depends on dynamic conditions.
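The contrast can be sketched in plain Python with stub functions standing in for model calls (the stubs and their names are assumptions for illustration, not real LLM calls): a workflow hard-codes the sequence of steps, while an agent picks its next action at runtime.

```python
# Stub "model calls" — assumptions for illustration, not real LLM calls.
def summarize(text: str) -> str:
    return text[:20]

def translate(text: str) -> str:
    return f"[zh] {text}"

def workflow(text: str) -> str:
    # Workflow: the sequence of steps is fixed in code.
    return translate(summarize(text))

def agent(goal: str, decide, tools: dict, max_steps: int = 5) -> str:
    # Agent: the next action is chosen at runtime by a decision function.
    state = goal
    for _ in range(max_steps):
        action = decide(state)
        if action == "done":
            break
        state = tools[action](state)
    return state
```

The workflow always produces the same shape of result; the agent's trajectory depends entirely on what `decide` returns at each step.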
2. Three Core Workflow Patterns
Prompt Chaining
Definition
Break a task into linear dependent steps where each step’s output feeds the next step.
Typical Scenarios
Structured document generation (outline → content → proofreading)
Multi‑step data processing (extract → transform → summarize)
Cross‑language content production (summarize → translate → polish)
Code Example
import ollama

MODEL = "llama3:8b"

def prompt_chaining(original_text: str) -> tuple[str, str]:
    """Chain workflow: summarize, then translate the summary."""
    # Step 1: summarize the input in one sentence.
    summary_prompt = f"""Please summarize the following text in one sentence:
{original_text}
Keep it concise and retain core information."""
    summary_response = ollama.generate(model=MODEL, prompt=summary_prompt, options={"temperature": 0.3})
    summary = summary_response["response"].strip()
    print(f"Step 1 - Summary:\n{summary}\n")
    # Step 2: translate, feeding Step 1's output into the prompt.
    translate_prompt = f"""Translate the following English summary into Chinese:
{summary}
Make the translation fluent and natural, without adding extra content."""
    translate_response = ollama.generate(model=MODEL, prompt=translate_prompt, options={"temperature": 0.1})
    translation = translate_response["response"].strip()
    print(f"Step 2 - Translation:\n{translation}\n")
    return summary, translation

if __name__ == "__main__":
    text = """A large language model (LLM) is a language model trained with self‑supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation..."""
    prompt_chaining(text)

Routing
Definition
Use a routing agent to classify inputs and direct them to specialized processing logic, avoiding unnecessary resource usage.
Typical Scenarios
Customer service (billing queries → billing specialist, technical issues → tech support)
Multi‑model dispatch (simple queries → lightweight model, complex reasoning → large model)
Task classification (weather lookup → weather API, math calculation → calculator tool)
Code Example
import ollama, json

MODEL = "llama3:8b"

def get_routing_decision(user_query: str) -> dict:
    # Ask the model to classify the query and return structured JSON.
    routing_prompt = f"""Classify the user query into one of the following categories:
- weather: weather‑related questions
- science: scientific questions
- unknown: anything else
Return JSON with fields 'category' and 'reasoning'.
User query: {user_query}"""
    response = ollama.generate(model=MODEL, prompt=routing_prompt, options={"temperature": 0.1})
    try:
        return json.loads(response["response"].strip())
    except json.JSONDecodeError:
        # Fall back to 'unknown' when the model's output is not valid JSON.
        return {"category": "unknown", "reasoning": "Unable to parse"}
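The per-category handlers are omitted in the original; the dispatch step can be sketched as a plain lookup table over hypothetical handler functions (the handler names and bodies are assumptions for illustration):

```python
# Hypothetical handlers — assumptions for illustration; the original omits them.
def handle_weather(query: str) -> str:
    return f"[weather handler] {query}"

def handle_science(query: str) -> str:
    return f"[science handler] {query}"

def handle_unknown(query: str) -> str:
    return f"[fallback] {query}"

HANDLERS = {"weather": handle_weather, "science": handle_science}

def route(decision: dict, user_query: str) -> str:
    # Dispatch on the classifier's category; unrecognized categories hit the fallback.
    handler = HANDLERS.get(decision.get("category"), handle_unknown)
    return handler(user_query)
```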
# Additional functions for handling each category omitted for brevity

Parallelization
Definition
Split a task into independent subtasks that run concurrently, then aggregate the results to improve efficiency.
Typical Scenarios
Multi‑style content generation (humorous, formal, technical)
RAG query decomposition (parallel retrieval of sub‑queries)
Batch document processing (summarize many docs simultaneously)
Code Example
import ollama, asyncio, time

MODEL = "llama3:8b"

async def async_generate(prompt: str, task_name: str) -> tuple[str, str]:
    print(f"Starting task: {task_name}")
    start = time.time()
    client = ollama.AsyncClient()
    resp = await client.generate(model=MODEL, prompt=prompt, options={"temperature": 0.6})
    elapsed = time.time() - start
    print(f"Task {task_name} finished in {elapsed:.2f}s")
    return task_name, resp["response"].strip()

async def parallel_workflow(destination: str) -> str:
    # Three independent subtasks that can run concurrently.
    tasks = [
        {"name": "attractions", "prompt": f"Recommend 3 must-see attractions in {destination}, with a one-sentence description each"},
        {"name": "food", "prompt": f"Recommend 2 local specialties of {destination}, describing the taste and the type of restaurant to try"},
        {"name": "transport", "prompt": f"Give 3 transport tips for traveling in {destination}"}
    ]
    async_tasks = [async_generate(t["prompt"], t["name"]) for t in tasks]
    results = await asyncio.gather(*async_tasks)
    result_dict = dict(results)
    # Aggregate the parallel results into a single answer.
    aggregate_prompt = f"""Combine the following results into a coherent short guide for {destination}:
1. Attractions: {result_dict['attractions']}
2. Food: {result_dict['food']}
3. Transport tips: {result_dict['transport']}
Keep the language natural and retain all key information."""
    final = ollama.generate(model=MODEL, prompt=aggregate_prompt)
    return final["response"].strip()

if __name__ == "__main__":
    print(asyncio.run(parallel_workflow("Hangzhou")))

3. Four Core Agent Patterns
Reflection
Definition
Iteratively generate, evaluate, and improve output, allowing the agent to self‑correct until it meets the criteria.
Typical Scenarios
Copywriting refinement
Code generation with syntax checking
Report writing with data verification
Code Example
import ollama
from typing import Tuple

MODEL = "llama3:8b"
MAX_ITER = 3

def generate_product_report(product_name: str, feedback: str = "") -> str:
    base = f"""Generate a feature report for {product_name} including:
1. Positioning (1 sentence)
2. Core features (≥3, each 1 sentence)
3. Target users (1 sentence)."""
    if feedback:
        base += f"\nIncorporate the following feedback: {feedback}"
    resp = ollama.generate(model=MODEL, prompt=base)
    return resp["response"].strip()

def evaluate_report(report: str) -> Tuple[bool, str]:
    eval_prompt = f"""Assess whether the following report meets the requirements:
- Contains Positioning, Core features, Target users sections
- At least 3 core features
Report:
{report}
Respond with 'PASS' or 'FAIL' and, if FAIL, provide specific feedback prefixed with 'Feedback:'."""
    resp = ollama.generate(model=MODEL, prompt=eval_prompt)
    txt = resp["response"].strip()
    if "PASS" in txt:
        return True, "Report meets requirements."
    feedback = txt.split("Feedback:")[-1].strip() if "Feedback:" in txt else "Missing required sections."
    return False, feedback

def reflection_workflow(product_name: str) -> str:
    # Generate, evaluate, and regenerate with feedback until the report passes.
    report = ""
    feedback = ""
    for _ in range(MAX_ITER):
        report = generate_product_report(product_name, feedback)
        passed, feedback = evaluate_report(report)
        if passed:
            return report
    return report

if __name__ == "__main__":
    print(reflection_workflow("Smart Notes App"))

Tool Use
Definition
The agent decides whether to invoke external tools (APIs, databases, calculators) and incorporates the tool’s result into its final answer.
Typical Scenarios
Real‑time information queries (weather, stock prices)
Data calculations
External system interactions (calendar booking, payment refunds)
Code Example
import ollama, json
from typing import Optional, Dict

MODEL = "llama3:8b"

def get_stock_price(symbol: str) -> Dict[str, str]:
    # Mock data standing in for a real market-data API.
    mock = {
        "AAPL": {"name": "Apple Inc.", "price": "189.56 USD", "time": "2025-10-01 14:30"},
        "MSFT": {"name": "Microsoft Corp.", "price": "412.89 USD", "time": "2025-10-01 14:30"}
    }
    return mock.get(symbol.upper(), {"name": "Unknown stock", "price": "No data", "time": "No data"})

def decide_tool_use(user_query: str) -> tuple[bool, Optional[Dict]]:
    prompt = f"""Analyze the query and decide if the stock price tool is needed.
If needed, output JSON with fields 'need_tool' (true/false) and 'tool_params' ({{'tool_name': 'get_stock_price', 'symbol': '...'}}).
User query: {user_query}"""
    resp = ollama.generate(model=MODEL, prompt=prompt)
    try:
        data = json.loads(resp["response"].strip())
        return data["need_tool"], data.get("tool_params")
    except (json.JSONDecodeError, KeyError):
        return False, None

def tool_use_workflow(user_query: str) -> str:
    need_tool, params = decide_tool_use(user_query)
    if need_tool and params and params.get("tool_name") == "get_stock_price":
        symbol = params.get("symbol")
        if not symbol:
            return "Please provide a specific ticker symbol."
        stock = get_stock_price(symbol)
        # Feed the tool's result back into the model to compose the answer.
        answer_prompt = f"""Based on the following stock data, answer the user query concisely.
Query: {user_query}
Data:
- Name: {stock['name']}
- Price: {stock['price']}
- Time: {stock['time']}"""
        resp = ollama.generate(model=MODEL, prompt=answer_prompt)
        return resp["response"].strip()
    else:
        direct_prompt = f"""Answer the user query directly. If it involves stock prices, ask for a specific ticker symbol.
Query: {user_query}"""
        resp = ollama.generate(model=MODEL, prompt=direct_prompt)
        return resp["response"].strip()

if __name__ == "__main__":
    print(tool_use_workflow("What is the real-time stock price of AAPL (Apple)?"))

Planning (Planner‑Worker)
Definition
A planner breaks a complex goal into ordered sub‑tasks; workers execute each sub‑task, and the results are aggregated into the final output.
Typical Scenarios
Complex project management (requirements → design → implementation → testing)
Multi‑step content creation (topic → research → outline → draft)
Travel itinerary planning
Code Example
import ollama, json
from typing import List

MODEL = "llama3:8b"

class Task:
    def __init__(self, task_id: int, description: str, worker: str):
        self.task_id = task_id
        self.description = description
        self.worker = worker

def planner(user_goal: str) -> List[Task]:
    prompt = f"""Decompose the goal '{user_goal}' into 3‑4 sequential tasks. Each task should include a description and a worker role (Researcher, Writer, Editor). Return JSON with a 'tasks' array containing 'task_id', 'description', 'worker'."""
    resp = ollama.generate(model=MODEL, prompt=prompt)
    try:
        data = json.loads(resp["response"].strip())
        return [Task(t["task_id"], t["description"], t["worker"]) for t in data["tasks"]]
    except (json.JSONDecodeError, KeyError):
        # Fall back to a default plan if the model's JSON is unusable.
        return [
            Task(1, "Collect core concepts of AI agents", "Researcher"),
            Task(2, "Write tutorial outline", "Writer"),
            Task(3, "Proofread and add examples", "Editor")
        ]

def worker_execute(task: Task, previous: str = "") -> str:
    base = f"You are a {task.worker}. Execute the following task: {task.description}."
    if previous:
        base += f"\nReference previous result:\n{previous}"
    resp = ollama.generate(model=MODEL, prompt=base)
    return resp["response"].strip()

def planning_workflow(user_goal: str) -> str:
    tasks = planner(user_goal)
    prev = ""
    results = []
    # Execute the plan sequentially, passing each result to the next worker.
    for t in tasks:
        res = worker_execute(t, prev)
        results.append(res)
        prev = res
    aggregate = f"Combine the following results into a complete tutorial for '{user_goal}':\n" + "\n---\n".join(results)
    final = ollama.generate(model=MODEL, prompt=aggregate)
    return final["response"].strip()

if __name__ == "__main__":
    print(planning_workflow("Write a beginner's tutorial on AI agents (about 300 words)"))

Multi‑Agent
Definition
Multiple specialized agents collaborate under a coordinator, passing context and results to achieve a complex goal.
Typical Scenarios
Travel planning (hotel, restaurant, transport agents)
Software development (product manager, developer, tester agents)
Content creation teams (topic, writer, editor agents)
Code Example
import ollama
from typing import Tuple, Optional

MODEL = "llama3:8b"

AGENT_ROLES = {
    "Coordinator": "Understands user intent and delegates to other agents.",
    "HotelAgent": "Handles hotel selection and booking.",
    "RestaurantAgent": "Handles restaurant recommendation and reservation."
}

def run_agent(agent_name: str, system_prompt: str, user_query: str, context: str = "") -> Tuple[str, Optional[str]]:
    full_prompt = f"System prompt: {system_prompt}\n"
    if context:
        full_prompt += f"Context: {context}\n"
    full_prompt += f"User query: {user_query}\n"
    full_prompt += "Provide your response and, if another agent should be called, end with a line 'NextAgent: <AgentName>' or 'NextAgent: None'."
    resp = ollama.generate(model=MODEL, prompt=full_prompt)
    text = resp["response"].strip()
    next_agent = None
    # Parse the hand-off directive out of the agent's reply.
    for line in text.split("\n"):
        if line.startswith("NextAgent:"):
            next_agent = line.split(":", 1)[1].strip()
            text = text.replace(line, "").strip()
            break
    return text, next_agent

def multi_agent_workflow(user_goal: str) -> str:
    current_agent = "Coordinator"
    context = ""
    results = []
    # Hand control from agent to agent until no further delegation is requested.
    while current_agent and current_agent != "None":
        system = AGENT_ROLES.get(current_agent, "")
        reply, next_agent = run_agent(current_agent, system, user_goal, context)
        results.append(f"[{current_agent}]: {reply}")
        context += f"\n{current_agent} result: {reply}"
        current_agent = next_agent
    return "\n".join(results) + "\nFinal aggregated result:\n" + context.strip()

if __name__ == "__main__":
    print(multi_agent_workflow("Plan a 3-day trip to Shanghai for 2 people, September 10-12; hotel budget under 800 CNY per night, and book 1 restaurant near the Bund."))

4. Pattern Combination
Real‑world systems often combine multiple patterns, e.g., a travel planner may use Multi‑Agent coordination, Planning for each sub‑task, Tool Use for external APIs, and Reflection to verify results. Start with simple patterns and layer additional ones as needed.
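Such layering can be sketched with stub components (every function here is an assumption standing in for a model or API call, not an implementation): routing picks a processing path, tool use fetches external data, and a reflection check gates the final output.

```python
# Stub components — each would be backed by a model or API in a real system.
def route(query: str) -> str:
    # Routing: choose a processing path.
    return "tool" if "price" in query else "direct"

def use_tool(query: str) -> str:
    # Tool Use: fetch external data.
    return f"tool result for: {query}"

def answer(query: str) -> str:
    # Direct generation, no tool needed.
    return f"answer: {query}"

def check(output: str) -> bool:
    # Reflection: accept or reject a draft.
    return bool(output.strip())

def combined(query: str, max_retries: int = 2) -> str:
    draft = ""
    for _ in range(max_retries):
        draft = use_tool(query) if route(query) == "tool" else answer(query)
        if check(draft):
            return draft
    return draft
```

Each stub can be swapped for a full pattern implementation independently, which is what makes starting simple and layering patterns practical.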
Conclusion
Agent patterns are not a silver bullet but a toolbox. Whether you need a simple chain workflow or a sophisticated multi‑agent system, selecting the right pattern(s) based on task characteristics enables you to turn AI autonomy into practical solutions.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
360 Zhihui Cloud Developer
360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.