7 Essential Agent Design Patterns for Building Autonomous AI Systems
This article explains the fundamental differences between workflows and agents, introduces seven core design patterns—including three workflow patterns and four agent patterns—provides Python examples using Ollama, and shows how to combine these patterns to create robust, autonomous AI applications.
1. Workflow vs Agent
Workflows follow a fixed script, while agents make decisions based on goals. Use a workflow when steps are deterministic; use an agent when the next action depends on dynamic conditions.
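The contrast can be sketched in plain Python with stub functions standing in for model calls (the stubs and their names are assumptions for illustration, not real LLM calls): a workflow hard-codes the sequence of steps, while an agent picks its next action at runtime.

```python
# Stub "model calls" — assumptions for illustration, not real LLM calls.
def summarize(text: str) -> str:
    return text[:20]

def translate(text: str) -> str:
    return f"[zh] {text}"

def workflow(text: str) -> str:
    # Workflow: the sequence of steps is fixed in code.
    return translate(summarize(text))

def agent(goal: str, decide, tools: dict, max_steps: int = 5) -> str:
    # Agent: the next action is chosen at runtime by a decision function.
    state = goal
    for _ in range(max_steps):
        action = decide(state)
        if action == "done":
            break
        state = tools[action](state)
    return state
```

The workflow always produces the same shape of result; the agent's trajectory depends entirely on what `decide` returns at each step.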
2. Three Core Workflow Patterns
Prompt Chaining
Definition
Break a task into linear dependent steps where each step’s output feeds the next step.
Typical Scenarios
Structured document generation (outline → content → proofreading)
Multi‑step data processing (extract → transform → summarize)
Cross‑language content production (summarize → translate → polish)
Code Example
import ollama

MODEL = "llama3:8b"

def prompt_chaining(original_text: str) -> tuple[str, str]:
    """Chain workflow: summarize, then translate the summary."""
    # Step 1: summarize the input in one sentence.
    summary_prompt = f"""Please summarize the following text in one sentence:
{original_text}
Keep it concise and retain core information."""
    summary_response = ollama.generate(model=MODEL, prompt=summary_prompt, options={"temperature": 0.3})
    summary = summary_response["response"].strip()
    print(f"Step 1 - Summary:\n{summary}\n")
    # Step 2: translate, feeding Step 1's output into the prompt.
    translate_prompt = f"""Translate the following English summary into Chinese:
{summary}
Make the translation fluent and natural, without adding extra content."""
    translate_response = ollama.generate(model=MODEL, prompt=translate_prompt, options={"temperature": 0.1})
    translation = translate_response["response"].strip()
    print(f"Step 2 - Translation:\n{translation}\n")
    return summary, translation

if __name__ == "__main__":
    text = """A large language model (LLM) is a language model trained with self‑supervised machine learning on a vast amount of text, designed for natural language processing tasks, especially language generation..."""
    prompt_chaining(text)

Routing
Definition
Use a routing agent to classify inputs and direct them to specialized processing logic, avoiding unnecessary resource usage.
Typical Scenarios
Customer service (billing queries → billing specialist, technical issues → tech support)
Multi‑model dispatch (simple queries → lightweight model, complex reasoning → large model)
Task classification (weather lookup → weather API, math calculation → calculator tool)
Code Example
import ollama, json

MODEL = "llama3:8b"

def get_routing_decision(user_query: str) -> dict:
    # Ask the model to classify the query and return structured JSON.
    routing_prompt = f"""Classify the user query into one of the following categories:
- weather: weather‑related questions
- science: scientific questions
- unknown: anything else
Return JSON with fields 'category' and 'reasoning'.
User query: {user_query}"""
    response = ollama.generate(model=MODEL, prompt=routing_prompt, options={"temperature": 0.1})
    try:
        return json.loads(response["response"].strip())
    except json.JSONDecodeError:
        # Fall back to 'unknown' when the model's output is not valid JSON.
        return {"category": "unknown", "reasoning": "Unable to parse"}
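The per-category handlers are omitted in the original; the dispatch step can be sketched as a plain lookup table over hypothetical handler functions (the handler names and bodies are assumptions for illustration):

```python
# Hypothetical handlers — assumptions for illustration; the original omits them.
def handle_weather(query: str) -> str:
    return f"[weather handler] {query}"

def handle_science(query: str) -> str:
    return f"[science handler] {query}"

def handle_unknown(query: str) -> str:
    return f"[fallback] {query}"

HANDLERS = {"weather": handle_weather, "science": handle_science}

def route(decision: dict, user_query: str) -> str:
    # Dispatch on the classifier's category; unrecognized categories hit the fallback.
    handler = HANDLERS.get(decision.get("category"), handle_unknown)
    return handler(user_query)
```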
# Additional functions for handling each category omitted for brevity

Parallelization
Definition
Split a task into independent subtasks that run concurrently, then aggregate the results to improve efficiency.
Typical Scenarios
Multi‑style content generation (humorous, formal, technical)
RAG query decomposition (parallel retrieval of sub‑queries)
Batch document processing (summarize many docs simultaneously)
Code Example
import ollama, asyncio, time

MODEL = "llama3:8b"

async def async_generate(prompt: str, task_name: str) -> tuple[str, str]:
    print(f"Starting task: {task_name}")
    start = time.time()
    client = ollama.AsyncClient()
    resp = await client.generate(model=MODEL, prompt=prompt, options={"temperature": 0.6})
    elapsed = time.time() - start
    print(f"Task {task_name} finished in {elapsed:.2f}s")
    return task_name, resp["response"].strip()

async def parallel_workflow(destination: str) -> str:
    # Three independent subtasks that can run concurrently.
    tasks = [
        {"name": "attractions", "prompt": f"Recommend 3 must-see attractions in {destination}, with a one-sentence description each"},
        {"name": "food", "prompt": f"Recommend 2 local specialties of {destination}, describing the taste and the type of restaurant to try"},
        {"name": "transport", "prompt": f"Give 3 transport tips for traveling in {destination}"}
    ]
    async_tasks = [async_generate(t["prompt"], t["name"]) for t in tasks]
    results = await asyncio.gather(*async_tasks)
    result_dict = dict(results)
    # Aggregate the parallel results into a single answer.
    aggregate_prompt = f"""Combine the following results into a coherent short guide for {destination}:
1. Attractions: {result_dict['attractions']}
2. Food: {result_dict['food']}
3. Transport tips: {result_dict['transport']}
Keep the language natural and retain all key information."""
    final = ollama.generate(model=MODEL, prompt=aggregate_prompt)
    return final["response"].strip()

if __name__ == "__main__":
    print(asyncio.run(parallel_workflow("Hangzhou")))

3. Four Core Agent Patterns
Reflection
Definition
Iteratively generate, evaluate, and improve output, allowing the agent to self‑correct until it meets the criteria.
Typical Scenarios
Copywriting refinement
Code generation with syntax checking
Report writing with data verification
Code Example
import ollama
from typing import Tuple

MODEL = "llama3:8b"
MAX_ITER = 3

def generate_product_report(product_name: str, feedback: str = "") -> str:
    base = f"""Generate a feature report for {product_name} including:
1. Positioning (1 sentence)
2. Core features (≥3, each 1 sentence)
3. Target users (1 sentence)."""
    if feedback:
        base += f"\nIncorporate the following feedback: {feedback}"
    resp = ollama.generate(model=MODEL, prompt=base)
    return resp["response"].strip()

def evaluate_report(report: str) -> Tuple[bool, str]:
    eval_prompt = f"""Assess whether the following report meets the requirements:
- Contains Positioning, Core features, Target users sections
- At least 3 core features
Report:
{report}
Respond with 'PASS' or 'FAIL' and, if FAIL, provide specific feedback prefixed with 'Feedback:'."""
    resp = ollama.generate(model=MODEL, prompt=eval_prompt)
    txt = resp["response"].strip()
    if "PASS" in txt:
        return True, "Report meets requirements."
    feedback = txt.split("Feedback:")[-1].strip() if "Feedback:" in txt else "Missing required sections."
    return False, feedback

def reflection_workflow(product_name: str) -> str:
    # Generate, evaluate, and regenerate with feedback until the report passes.
    report = ""
    feedback = ""
    for _ in range(MAX_ITER):
        report = generate_product_report(product_name, feedback)
        passed, feedback = evaluate_report(report)
        if passed:
            return report
    return report

if __name__ == "__main__":
    print(reflection_workflow("Smart Notes App"))

Tool Use
Definition
The agent decides whether to invoke external tools (APIs, databases, calculators) and incorporates the tool’s result into its final answer.
Typical Scenarios
Real‑time information queries (weather, stock prices)
Data calculations
External system interactions (calendar booking, payment refunds)
Code Example
import ollama, json
from typing import Optional, Dict

MODEL = "llama3:8b"

def get_stock_price(symbol: str) -> Dict[str, str]:
    # Mock data standing in for a real market-data API.
    mock = {
        "AAPL": {"name": "Apple Inc.", "price": "189.56 USD", "time": "2025-10-01 14:30"},
        "MSFT": {"name": "Microsoft Corp.", "price": "412.89 USD", "time": "2025-10-01 14:30"}
    }
    return mock.get(symbol.upper(), {"name": "Unknown stock", "price": "No data", "time": "No data"})

def decide_tool_use(user_query: str) -> tuple[bool, Optional[Dict]]:
    prompt = f"""Analyze the query and decide if the stock price tool is needed.
If needed, output JSON with fields 'need_tool' (true/false) and 'tool_params' ({{'tool_name': 'get_stock_price', 'symbol': '...'}}).
User query: {user_query}"""
    resp = ollama.generate(model=MODEL, prompt=prompt)
    try:
        data = json.loads(resp["response"].strip())
        return data["need_tool"], data.get("tool_params")
    except (json.JSONDecodeError, KeyError):
        return False, None

def tool_use_workflow(user_query: str) -> str:
    need_tool, params = decide_tool_use(user_query)
    if need_tool and params and params.get("tool_name") == "get_stock_price":
        symbol = params.get("symbol")
        if not symbol:
            return "Please provide a specific ticker symbol."
        stock = get_stock_price(symbol)
        # Feed the tool's result back into the model to compose the answer.
        answer_prompt = f"""Based on the following stock data, answer the user query concisely.
Query: {user_query}
Data:
- Name: {stock['name']}
- Price: {stock['price']}
- Time: {stock['time']}"""
        resp = ollama.generate(model=MODEL, prompt=answer_prompt)
        return resp["response"].strip()
    else:
        direct_prompt = f"""Answer the user query directly. If it involves stock prices, ask for a specific ticker symbol.
Query: {user_query}"""
        resp = ollama.generate(model=MODEL, prompt=direct_prompt)
        return resp["response"].strip()

if __name__ == "__main__":
    print(tool_use_workflow("What is the real-time stock price of AAPL (Apple)?"))

Planning (Planner‑Worker)
Definition
A planner breaks a complex goal into ordered sub‑tasks; workers execute each sub‑task, and the results are aggregated into the final output.
Typical Scenarios
Complex project management (requirements → design → implementation → testing)
Multi‑step content creation (topic → research → outline → draft)
Travel itinerary planning
Code Example
import ollama, json
from typing import List

MODEL = "llama3:8b"

class Task:
    def __init__(self, task_id: int, description: str, worker: str):
        self.task_id = task_id
        self.description = description
        self.worker = worker

def planner(user_goal: str) -> List[Task]:
    prompt = f"""Decompose the goal '{user_goal}' into 3‑4 sequential tasks. Each task should include a description and a worker role (Researcher, Writer, Editor). Return JSON with a 'tasks' array containing 'task_id', 'description', 'worker'."""
    resp = ollama.generate(model=MODEL, prompt=prompt)
    try:
        data = json.loads(resp["response"].strip())
        return [Task(t["task_id"], t["description"], t["worker"]) for t in data["tasks"]]
    except (json.JSONDecodeError, KeyError):
        # Fall back to a default plan if the model's JSON is unusable.
        return [
            Task(1, "Collect core concepts of AI agents", "Researcher"),
            Task(2, "Write tutorial outline", "Writer"),
            Task(3, "Proofread and add examples", "Editor")
        ]

def worker_execute(task: Task, previous: str = "") -> str:
    base = f"You are a {task.worker}. Execute the following task: {task.description}."
    if previous:
        base += f"\nReference previous result:\n{previous}"
    resp = ollama.generate(model=MODEL, prompt=base)
    return resp["response"].strip()

def planning_workflow(user_goal: str) -> str:
    tasks = planner(user_goal)
    prev = ""
    results = []
    # Execute the plan sequentially, passing each result to the next worker.
    for t in tasks:
        res = worker_execute(t, prev)
        results.append(res)
        prev = res
    aggregate = f"Combine the following results into a complete tutorial for '{user_goal}':\n" + "\n---\n".join(results)
    final = ollama.generate(model=MODEL, prompt=aggregate)
    return final["response"].strip()

if __name__ == "__main__":
    print(planning_workflow("Write a beginner's tutorial on AI agents (about 300 words)"))

Multi‑Agent
Definition
Multiple specialized agents collaborate under a coordinator, passing context and results to achieve a complex goal.
Typical Scenarios
Travel planning (hotel, restaurant, transport agents)
Software development (product manager, developer, tester agents)
Content creation teams (topic, writer, editor agents)
Code Example
import ollama
from typing import Tuple, Optional

MODEL = "llama3:8b"

AGENT_ROLES = {
    "Coordinator": "Understands user intent and delegates to other agents.",
    "HotelAgent": "Handles hotel selection and booking.",
    "RestaurantAgent": "Handles restaurant recommendation and reservation."
}

def run_agent(agent_name: str, system_prompt: str, user_query: str, context: str = "") -> Tuple[str, Optional[str]]:
    full_prompt = f"System prompt: {system_prompt}\n"
    if context:
        full_prompt += f"Context: {context}\n"
    full_prompt += f"User query: {user_query}\n"
    full_prompt += "Provide your response and, if another agent should be called, end with a line 'NextAgent: <AgentName>' or 'NextAgent: None'."
    resp = ollama.generate(model=MODEL, prompt=full_prompt)
    text = resp["response"].strip()
    next_agent = None
    # Parse the hand-off directive out of the agent's reply.
    for line in text.split("\n"):
        if line.startswith("NextAgent:"):
            next_agent = line.split(":", 1)[1].strip()
            text = text.replace(line, "").strip()
            break
    return text, next_agent

def multi_agent_workflow(user_goal: str) -> str:
    current_agent = "Coordinator"
    context = ""
    results = []
    # Hand control from agent to agent until no further delegation is requested.
    while current_agent and current_agent != "None":
        system = AGENT_ROLES.get(current_agent, "")
        reply, next_agent = run_agent(current_agent, system, user_goal, context)
        results.append(f"[{current_agent}]: {reply}")
        context += f"\n{current_agent} result: {reply}"
        current_agent = next_agent
    return "\n".join(results) + "\nFinal aggregated result:\n" + context.strip()

if __name__ == "__main__":
    print(multi_agent_workflow("Plan a 3-day trip to Shanghai for 2 people, September 10-12; hotel budget under 800 CNY per night, and book 1 restaurant near the Bund."))

4. Pattern Combination
Real‑world systems often combine multiple patterns, e.g., a travel planner may use Multi‑Agent coordination, Planning for each sub‑task, Tool Use for external APIs, and Reflection to verify results. Start with simple patterns and layer additional ones as needed.
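Such layering can be sketched with stub components (every function here is an assumption standing in for a model or API call, not an implementation): routing picks a processing path, tool use fetches external data, and a reflection check gates the final output.

```python
# Stub components — each would be backed by a model or API in a real system.
def route(query: str) -> str:
    # Routing: choose a processing path.
    return "tool" if "price" in query else "direct"

def use_tool(query: str) -> str:
    # Tool Use: fetch external data.
    return f"tool result for: {query}"

def answer(query: str) -> str:
    # Direct generation, no tool needed.
    return f"answer: {query}"

def check(output: str) -> bool:
    # Reflection: accept or reject a draft.
    return bool(output.strip())

def combined(query: str, max_retries: int = 2) -> str:
    draft = ""
    for _ in range(max_retries):
        draft = use_tool(query) if route(query) == "tool" else answer(query)
        if check(draft):
            return draft
    return draft
```

Each stub can be swapped for a full pattern implementation independently, which is what makes starting simple and layering patterns practical.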
Conclusion
Agent patterns are not a silver bullet but a toolbox. Whether you need a simple chain workflow or a sophisticated multi‑agent system, selecting the right pattern(s) based on task characteristics enables you to turn AI autonomy into practical solutions.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
360 Zhihui Cloud Developer
360 Zhihui Cloud is an enterprise open service platform that aims to "aggregate data value and empower an intelligent future," leveraging 360's extensive product and technology resources to deliver platform services to customers.