How to Build Effective LLM Agents: Design Principles and Practical Workflows
This article summarizes Anthropic's insights on building large‑language‑model agents: how they define agents, when (and when not) to use them, framework trade‑offs, modular building blocks, common workflow patterns, and real‑world application scenarios for developers.
1. What Is an Agent
Anthropic distinguishes two related concepts: Workflows, which are predefined code paths that orchestrate LLM calls and tools, and Agents, which let the LLM dynamically direct its own process and tool usage, deciding how to accomplish a task.
2. When to Use (and Not Use) Agents
Prefer the simplest solution; only add complexity when necessary. Use Workflows for well‑defined, predictable tasks, and choose Agents when flexibility and model‑driven decisions are required. Often a single LLM call with retrieval or examples suffices.
3. When & How to Use Frameworks
Several frameworks simplify agent development:
LangChain’s LangGraph
Amazon Bedrock’s AI Agent framework
Rivet – a drag‑and‑drop visual workflow builder
Vellum – another visual tool for complex workflows
While they reduce boilerplate, they add abstraction layers that can hide prompts and responses, making debugging harder. Start with the raw LLM API whenever possible.
4. Building Modules, Workflows, and Agents
1. Building Block: The Augmented LLM
The core building block is an LLM enhanced with retrieval, tool use, and memory. It can generate its own queries, select appropriate tools, and decide what information to retain.
Customize these capabilities for each use case and provide a clean, well‑documented interface.
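The augmented-LLM idea can be sketched as a thin wrapper that lets the model request a tool and keeps the result in memory. This is a minimal illustration, not Anthropic's implementation: `call_llm` and the `TOOL:`/`FINAL:` reply convention are hypothetical stand-ins for a real model API and a real tool-calling protocol.

```python
# Sketch of an "augmented LLM": a model that can call tools and retain
# retrieved information in memory. `call_llm` is a hypothetical stub --
# swap in your provider's client and tool-use format.

def call_llm(prompt: str) -> str:
    # Placeholder model: requests a tool, then answers once context arrives.
    if "Context:" in prompt:
        return "FINAL:It is 22C and sunny in Paris."
    if "weather" in prompt:
        return "TOOL:get_weather:Paris"
    return "FINAL:Done."

TOOLS = {"get_weather": lambda city: f"22C and sunny in {city}"}

def augmented_llm(user_input: str, memory: list[str]) -> str:
    """One step: the model may request a tool; the result is stored in memory."""
    reply = call_llm(user_input)
    if reply.startswith("TOOL:"):
        _, name, arg = reply.split(":", 2)
        result = TOOLS[name](arg)
        memory.append(result)  # decide what information to retain
        reply = call_llm(f"{user_input}\nContext: {result}")
    return reply.removeprefix("FINAL:")

memory: list[str] = []
answer = augmented_llm("What's the weather in Paris?", memory)
```

The clean interface is the point: the workflow patterns below all compose instances of this one building block.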
2. Workflow Patterns
(1) Prompt Chaining
Break a task into sequential steps, each LLM call processing the previous output. Optional programmatic checks (“gates”) can enforce correctness.
Applicable scenario: Tasks that can be cleanly divided into fixed subtasks, trading latency for higher accuracy.
Generating marketing copy and translating it.
Creating a document outline, validating it, then writing the document.
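The outline-then-write example above can be sketched as a two-step chain with a programmatic gate between the calls. `call_llm` is a hypothetical stub; the gate is ordinary code, not a model call.

```python
# Prompt chaining sketch: each LLM call consumes the previous output, with a
# programmatic "gate" enforcing correctness between steps. `call_llm` is a
# hypothetical placeholder for a real model API.

def call_llm(prompt: str) -> str:
    # Canned responses standing in for real model output.
    if prompt.startswith("Outline:"):
        return "1. Intro\n2. Body\n3. Conclusion"
    return f"Draft based on -> {prompt}"

def gate_outline_ok(outline: str) -> bool:
    """Gate: require at least three numbered sections before writing."""
    return sum(line[:1].isdigit() for line in outline.splitlines()) >= 3

def chain(topic: str) -> str:
    outline = call_llm(f"Outline: {topic}")
    if not gate_outline_ok(outline):  # stop early on a bad outline
        raise ValueError("outline failed validation")
    return call_llm(f"Write the document for this outline:\n{outline}")

doc = chain("LLM agents")
```

Each extra step adds a model call (latency) but lets every call focus on one fixed subtask (accuracy), which is exactly the trade this pattern makes.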
(2) Routing
Classify input and direct it to specialized downstream prompts or tools, preserving focus and performance.
Applicable scenario: When a task contains clearly separable categories that can be accurately classified.
Routing different customer‑service queries (FAQ, refund, technical) to distinct flows.
Sending simple queries to a small model and complex ones to a larger model.
(3) Parallelization
Run independent sub‑tasks concurrently (partition) or run the same task multiple times and aggregate results (voting).
Applicable scenario: When sub‑tasks can be processed in parallel to speed up execution or when multiple perspectives improve confidence.
Partition: One model filters unsafe content while another generates the response.
Voting: Multiple prompts review code for vulnerabilities and vote on findings.
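The voting variant can be sketched with a thread pool issuing the same review concurrently and taking the majority verdict. The `reviewer` function is a hypothetical stub; real code would fan out concurrent API calls the same way.

```python
# Parallelization (voting) sketch: run the same review several times in
# parallel and aggregate by majority vote. `reviewer` stubs an LLM call.
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def reviewer(name: str, code: str) -> str:
    # Hypothetical reviewer: flags an obviously dangerous pattern.
    return "vulnerable" if "eval(" in code else "safe"

def vote(code: str, reviewers: list[str]) -> str:
    with ThreadPoolExecutor() as pool:
        votes = list(pool.map(lambda n: reviewer(n, code), reviewers))
    return Counter(votes).most_common(1)[0][0]  # majority verdict wins

verdict = vote("eval(user_input)", ["r1", "r2", "r3"])
```

The partition variant uses the same fan-out machinery, but each parallel call runs a different subtask (e.g. one guardrail check, one response generation) instead of the same one.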
(4) Orchestrator‑Workers
A central LLM dynamically decomposes a task, delegates subtasks to worker LLMs, and synthesizes their outputs.
Applicable scenario: Complex tasks where the number and nature of subtasks cannot be predetermined, such as large‑scale code changes.
Complex code refactoring across many files.
Gathering and analyzing information from multiple sources.
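The key difference from parallelization is that here the subtask list is produced at runtime by an orchestrator call, not fixed in code. A minimal sketch, with all three roles stubbed:

```python
# Orchestrator-workers sketch: an orchestrator call decomposes the task
# dynamically, workers handle each subtask, a final step synthesizes results.
# All LLM calls below are hypothetical stubs.

def orchestrate(task: str) -> list[str]:
    # Placeholder: a real orchestrator LLM would emit subtasks it discovers,
    # e.g. which files a refactor actually touches.
    return [f"edit {f}" for f in ("api.py", "models.py", "tests.py")]

def worker(subtask: str) -> str:
    return f"done: {subtask}"

def synthesize(results: list[str]) -> str:
    return "; ".join(results)

def run(task: str) -> str:
    subtasks = orchestrate(task)  # count and content are not predetermined
    return synthesize(worker(s) for s in subtasks)

summary = run("rename User to Account across the codebase")
```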
(5) Evaluator‑Optimizer
One LLM generates a response while another evaluates and provides feedback in a loop.
Applicable scenario: When clear evaluation criteria exist and iterative improvement yields measurable value.
Literary translation refined by a second LLM providing critique.
Complex search tasks where the evaluator decides whether further searching is needed.
3. Agents
As LLMs mature in reasoning, tool use, and error recovery, agents emerge as autonomous systems that start from user commands, plan, act, and optionally request human feedback. They continuously observe the environment (tool results, code execution) to assess progress and stop based on predefined conditions.
Applicable scenario: Open‑ended problems where steps cannot be hard‑coded and the model must make multiple decisions.
A coding agent that edits many files to solve SWE‑bench tasks.
Claude using a computer to complete tasks in the “computer‑use” reference implementation.
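The agent loop described above (plan, act via tools, observe results, stop on a predefined condition) can be sketched in a few lines. This is an illustrative toy, not the SWE-bench or computer-use implementation: the model policy and tools are stubs, and the stopping conditions are a "finish" decision plus a step budget.

```python
# Minimal agent loop sketch: the model observes tool results, decides the
# next action, and stops on a predefined condition. `call_llm` is a
# hypothetical stub policy; `run_tool` stubs the environment.

def call_llm(observation: str) -> str:
    # Placeholder policy: run tests, fix on failure, finish on success.
    if "FAILED" in observation:
        return "act:fix_bug"
    if "PASSED" in observation:
        return "finish"
    return "act:run_tests"

STATE = {"fixed": False}

def run_tool(action: str) -> str:
    if action == "act:fix_bug":
        STATE["fixed"] = True
        return "patch applied"
    return "PASSED" if STATE["fixed"] else "FAILED"  # act:run_tests

def agent(task: str, max_steps: int = 5) -> str:
    observation = task
    for _ in range(max_steps):  # predefined stopping condition: step budget
        decision = call_llm(observation)
        if decision == "finish":
            return "task complete"
        observation = run_tool(decision)  # observe environment feedback
    return "step budget exhausted"

outcome = agent("make the test suite pass")
```

Note that, unlike the workflow patterns, no code here dictates the sequence run-fix-rerun; it emerges from the model's decisions against the observations.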
5. Composition and Customization
These patterns are not rigid; developers can mix and match them to fit specific use cases. Add complexity only when it demonstrably improves results.
6. Summary
Success with LLMs comes from building the right system, not the most complex one. Start with simple prompts, evaluate thoroughly, and only introduce multi‑step agents when necessary. Follow three core principles: keep designs simple, make planning steps explicit for transparency, and invest in thorough tool documentation and testing.
Original link: https://www.anthropic.com/engineering/building-effective-agents
Data Thinking Notes
Sharing insights on data architecture, governance, and middle platforms, exploring AI in data, and linking data with business scenarios.