20 Loop Design Patterns Every AI Engineer Should Know
The article presents twenty essential loop design patterns for industrial AI systems, explains how they differ from single‑call prompts, provides concrete examples, code snippets, and use‑case scenarios, and shows how these loops enable self‑improvement, memory, planning, exploration, and system optimization for AI agents.
Agent vs Loop
An agent is a worker that performs a single task.
A loop is the mechanism that continuously improves the worker by repeating generate‑evaluate‑learn‑improve cycles.
Production‑grade AI systems are built from loops rather than single model calls.
Generate → Evaluate → Learn → Improve
Category 1: Quality‑Improvement Loops
1. Generate → Critique → Rewrite
The generator produces a draft, an independent critic reviews it, and the generator rewrites based on the feedback. The cycle repeats until a quality threshold is reached.
[Generator] → initial draft
[Critic] → "Paragraph 3 is vague, lacks evidence, tone is unprofessional."
[Generator] → rewrite based on critique
[Critic] → "Improved, but conclusion still weak."
[Generator] → final rewrite
Applicable scenarios : copywriting, code review, report drafting, strategic plans, sales outreach.
Key insight : the generation model is rarely the best judge of its own output; an independent critic exposes blind spots.
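The cycle above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a definitive implementation: `critique_rewrite_loop`, `toy_generate`, and `toy_critique` are hypothetical names, and the two toy functions stand in for what would be separate model calls in a real system.

```python
from typing import Callable, Optional

def critique_rewrite_loop(
    task: str,
    generate: Callable[[str, Optional[str], Optional[str]], str],
    critique: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """Draft, then alternate critique and rewrite until the critic has no objections."""
    draft = generate(task, None, None)           # initial draft, no feedback yet
    for _ in range(max_rounds):
        feedback = critique(draft)               # None means "good enough"
        if feedback is None:
            break
        draft = generate(task, draft, feedback)  # rewrite using the critique
    return draft


# Hypothetical stand-ins for model calls (illustration only).
def toy_generate(task, previous, feedback):
    if previous is None:
        return f"Draft for: {task}"
    return previous + " [revised: " + feedback + "]"

def toy_critique(draft):
    if "evidence" not in draft:
        return "add evidence"
    return None

result = critique_rewrite_loop("quarterly report", toy_generate, toy_critique)
```

Keeping the critic as a separate callable mirrors the key insight: the generator never grades itself, and `max_rounds` bounds the cost if the critic is never satisfied.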
2. Score‑and‑Retry Loop
Generate an output, score it, and retry while the score is below a threshold.
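def score_and_retry(prompt, threshold, max_retries):
    best_output, best_score = None, float("-inf")
    for _ in range(max_retries):
        output = generate(prompt)
        score = evaluate(output)
        if score > best_score:
            best_output, best_score = output, score  # track the best attempt so far
        if score >= threshold:
            break  # good enough, stop retrying
    return best_output  # best result, even if max_retries ran out
Applicable scenarios : any task with a quantifiable quality metric (e.g., extraction accuracy, format compliance, factual correctness).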
Core design : the generator is unaware of the evaluation; the evaluator holds the scoring criteria, isolating roles.
3. Multi‑Critic Loop
Four independent critics evaluate the same output on different dimensions:
Correctness critic – factual accuracy.
Style critic – clarity and fluency.
Safety critic – compliance and safety.
Domain critic – adherence to expert standards.
The output is released only after passing all four checks.
Applicable scenarios : medical AI, legal document review, financial analysis, regulated content generation.
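A release gate of this kind might look like the sketch below. The critic names follow the four dimensions listed above, but the checks themselves are toy lambdas invented for illustration; in practice each would be a model-backed or rule-backed reviewer, and `failed_critics` is a hypothetical helper name.

```python
from typing import Callable, Dict, List

Critic = Callable[[str], bool]  # returns True when the output passes

def failed_critics(output: str, critics: Dict[str, Critic]) -> List[str]:
    """Return the names of critics that reject the output (empty list = release)."""
    return [name for name, check in critics.items() if not check(output)]

# Toy critics standing in for real reviewers (assumptions, illustration only).
critics = {
    "correctness": lambda text: "2 + 2 = 5" not in text,
    "style": lambda text: len(text.split()) <= 50,
    "safety": lambda text: "password" not in text.lower(),
    "domain": lambda text: text.strip().endswith("."),
}

ok = failed_critics("The dosage is 5 mg per day.", critics)
bad = failed_critics("Your Password is 1234", critics)
```

Returning the names of the failing critics, rather than a single boolean, tells the generator which dimension to fix on the next pass.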
4. Adversarial Critique Loop
The critic’s sole role is to break the generator’s answer with questions such as:
"Which assumptions are invalid?"
"What key evidence is missing?"
"How would a skeptic rebut this?"
"Which confident conclusions are actually wrong?"
The generator must defend or rewrite; only answers that survive repeated attacks are accepted.
Applicable scenarios : cutting‑edge research reviews, investment logic checks, strategic planning, risk assessment.
5. Judge Ensemble Loop
Multiple (e.g., five) independent judges score the same output; the average is taken and the system proceeds only when consensus is high, reducing noise.
Applicable scenarios : tasks with unstable single‑model evaluations, ultra‑low‑tolerance requirements, critical edge‑case handling.
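One way to express "average plus consensus" is to require both a high mean and a low spread across judges. The sketch below assumes scores on a 0 to 1 scale; the thresholds `min_mean` and `max_spread` and the function name `ensemble_verdict` are illustrative assumptions, not values from the article.

```python
from statistics import mean, pstdev
from typing import List, Tuple

def ensemble_verdict(
    scores: List[float],
    min_mean: float = 0.8,
    max_spread: float = 0.1,
) -> Tuple[bool, float, float]:
    """Accept only when the judges agree (low spread) and the mean is high."""
    avg = mean(scores)
    spread = pstdev(scores)  # population standard deviation as a disagreement measure
    return (avg >= min_mean and spread <= max_spread, avg, spread)

# Scores from five hypothetical judges.
accepted, avg, spread = ensemble_verdict([0.9, 0.85, 0.9, 0.95, 0.9])
split, split_avg, split_spread = ensemble_verdict([1.0, 1.0, 1.0, 0.4, 1.0])
```

The second call shows why the mean alone is not enough: the average is still high, but one dissenting judge pushes the spread over the limit, so the output is held back.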
Category 2: Memory Loops
6. Reflexion Loop
When an agent fails, it analyzes the cause, stores the lesson in memory, injects the lesson into the next context, and retries. Each retry therefore starts with more context than the last.
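The fail, reflect, retry cycle can be sketched as follows. All names here (`reflexion_loop`, `toy_attempt`, `toy_reflect`) are hypothetical, and the toy agent is rigged to succeed only after it has collected two specific lessons; a real agent would produce reflections with a model call.

```python
from typing import Callable, List, Optional, Tuple

def reflexion_loop(
    task: str,
    attempt: Callable[[str, List[str]], Tuple[bool, str]],
    reflect: Callable[[str], str],
    max_attempts: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Retry a task, feeding lessons from earlier failures into each new attempt."""
    lessons: List[str] = []
    for _ in range(max_attempts):
        success, result = attempt(task, lessons)   # lessons are injected as context
        if success:
            return result, lessons
        lessons.append(reflect(result))            # store what went wrong
    return None, lessons


# Toy agent (assumption): succeeds only once told to verify X and to check Y.
def toy_attempt(task, lessons):
    if "verify X" not in lessons:
        return False, "assumed X"
    if "check Y" not in lessons:
        return False, "missed Y"
    return True, "done"

def toy_reflect(failure):
    return {"assumed X": "verify X", "missed Y": "check Y"}[failure]

result, lessons = reflexion_loop("demo task", toy_attempt, toy_reflect)
```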
Attempt 1: failed
Reflection: "I assumed X, but X was wrong. Verify X next time."
Attempt 2: inject lesson → partial success
Reflection: "Improved, but missed step Y. Add Y check."
Attempt 3: success
7. Memory Update Loop
After each task, record:
What decision was made?
What result did it produce?
If you could redo it, what would you change?
Future runs automatically inherit this knowledge base, leading to performance gains over time.
8. Error Library Loop
Every failure (wrong answer, bad output, edge case) is stored in an error library. Before a new task, the system checks the library; if a similar failure exists, it injects the known fix into the execution plan, preventing repeat mistakes.
Applicable scenarios : any production system where avoiding repeat failures is critical.
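A minimal sketch of such a library is shown below. It assumes failures are stored as plain-text descriptions and matched by string similarity using the standard library's `difflib.SequenceMatcher`; that matching strategy, the `ErrorLibrary` class, and the `min_similarity` cutoff are illustrative assumptions (a production system might match on embeddings instead).

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple

class ErrorLibrary:
    """Stores (failure description, fix) pairs and looks up fixes for similar tasks."""

    def __init__(self, min_similarity: float = 0.6) -> None:
        self.entries: List[Tuple[str, str]] = []
        self.min_similarity = min_similarity  # assumed cutoff, tune per system

    def record(self, failure: str, fix: str) -> None:
        self.entries.append((failure, fix))

    def known_fix(self, task: str) -> Optional[str]:
        """Return the fix of the most similar past failure, if it is similar enough."""
        best_fix, best_ratio = None, 0.0
        for failure, fix in self.entries:
            ratio = SequenceMatcher(None, task.lower(), failure.lower()).ratio()
            if ratio > best_ratio:
                best_fix, best_ratio = fix, ratio
        return best_fix if best_ratio >= self.min_similarity else None

library = ErrorLibrary()
library.record("parse invoice dates", "normalise dates to ISO 8601 before parsing")
fix = library.known_fix("parse invoice date")   # close match: fix is returned
miss = library.known_fix("translate a poem")    # unrelated task: no fix
```

The returned fix would then be injected into the execution plan before the new task starts.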
9. Success Pattern Loop
Successful executions are also recorded: save the execution path, context, and key success factors, then reuse them for similar future tasks.
10. Memory Compression Loop
When the number of memory items reaches N, the system compresses them into higher‑level abstractions.
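The compression step can be sketched as follows, assuming each memory item is a `(task, cause)` pair and that "abstraction" simply means merging items that share a cause. The `compress` function and the `max_items` trigger are hypothetical; a real system would more likely ask a model to write the higher-level rule.

```python
from collections import Counter
from typing import List, Tuple

Memory = Tuple[str, str]  # (task name, failure cause)

def compress(memories: List[Memory], max_items: int = 3) -> List[str]:
    """Once max_items is reached, collapse repeated causes into one general rule."""
    if len(memories) < max_items:
        return [f"{task} failed because of {cause}." for task, cause in memories]
    counts = Counter(cause for _, cause in memories)
    compressed: List[str] = []
    for task, cause in memories:
        if counts[cause] == 1:
            compressed.append(f"{task} failed because of {cause}.")  # keep one-off items
    for cause, n in counts.items():
        if n > 1:
            compressed.append(
                f"Rule: {cause} caused {n} failures. Check {cause} before any task."
            )
    return compressed

notes = compress([("Task A", "X"), ("Task B", "X"), ("Task C", "X"), ("Task D", "Z")])
```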
Before compression:
"Task A failed because of X."
"Task B failed because of X."
"Task C failed because of X."
After compression:
"Underlying rule: X always leads to failure. Check X before any task."
This keeps the context window clean and ensures critical rules remain readable.
Category 3: Planning Loops
11. Plan → Execute → Replan
The initial plan is not immutable. After each step, the system observes the result, updates the plan, and continues, forming a converging spiral rather than a linear waterfall.
Make plan → execute step → observe result → update plan → continue
Applicable scenarios : environments with dynamic changes, tasks with strong dependencies, long‑horizon complex missions.
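The plan, execute, observe, update cycle might be sketched like this. The function `plan_execute_replan` and the toy environment are hypothetical; the planner, executor, and replanner are passed in as callables because in a real agent each would be a model or tool call.

```python
from typing import Callable, List

def plan_execute_replan(
    goal: str,
    make_plan: Callable[[str], List[str]],
    execute: Callable[[str], str],
    replan: Callable[[List[str], str, str], List[str]],
    max_steps: int = 10,
) -> List[str]:
    """Run the plan one step at a time, letting each observation reshape what remains."""
    plan = make_plan(goal)
    history: List[str] = []
    while plan and len(history) < max_steps:
        step, remaining = plan[0], plan[1:]
        observation = execute(step)
        history.append(f"{step} -> {observation}")
        plan = replan(remaining, step, observation)  # may insert, drop, or reorder steps
    return history


# Toy environment (assumption): the build fails once, so a fix step is inserted.
def toy_plan(goal):
    return ["build", "deploy"]

state = {"fixed": False}

def toy_execute(step):
    if step == "build" and not state["fixed"]:
        return "failed"
    if step == "fix":
        state["fixed"] = True
    return "ok"

def toy_replan(remaining, step, observation):
    if observation == "failed":
        return ["fix", step] + remaining  # repair, then retry the failed step
    return remaining

history = plan_execute_replan("ship release", toy_plan, toy_execute, toy_replan)
```

The `max_steps` cap is a safety net: without it, a replanner that keeps re-inserting a failing step would loop forever.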
12. Dynamic Workflow Loop
The pipeline shape is decided at runtime based on intermediate results:
If Step 1 outputs A → take branch X.
If output is B → take branch Y.
If output is C → skip Step 2 and jump to Step 5.
Applicable scenarios : deep document research, multi‑channel customer routing, adaptive content generation.
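Runtime routing of this kind can be sketched with a step table and a router function. Everything named here (`run_dynamic`, `classify`, `route`, the step names) is a hypothetical illustration of the A/B/C branching described above, with keyword matching standing in for a real classifier.

```python
from typing import Callable, Dict, List, Optional

Step = Callable[[str], str]

def run_dynamic(
    start: str,
    steps: Dict[str, Step],
    route: Callable[[str, str], Optional[str]],
    payload: str,
    max_hops: int = 20,
) -> List[str]:
    """Execute steps, asking the router for the next step after each result."""
    visited: List[str] = []
    current: Optional[str] = start
    while current is not None and len(visited) < max_hops:
        visited.append(current)
        result = steps[current](payload)
        current = route(current, result)  # the pipeline shape is decided here, at runtime
    return visited


def classify(doc: str) -> str:
    # Toy classifier (assumption): keyword rules instead of a model call.
    if "urgent" in doc:
        return "C"
    if "invoice" in doc:
        return "B"
    return "A"

steps = {
    "classify": classify,
    "branch_x": lambda doc: "done",
    "branch_y": lambda doc: "done",
    "step5": lambda doc: "done",
}

def route(step: str, result: str) -> Optional[str]:
    if step == "classify":
        return {"A": "branch_x", "B": "branch_y", "C": "step5"}[result]
    return None  # every branch ends the run in this toy example

urgent_path = run_dynamic("classify", steps, route, "urgent refund request")
normal_path = run_dynamic("classify", steps, route, "general question")
```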
13. Goal Decomposition Loop
A large, vague goal is recursively broken down into sub‑goals, tasks, and atomic steps until each unit can be solved with a single model call.
Big goal: "Write a detailed competitor analysis report"
├─ Subgoal 1: "Identify top‑5 direct competitors"
├─ Subgoal 2: "Analyze core features of each"
├─ Subgoal 3: "Compare pricing models"
└─ Subgoal 4: "Find market gaps"
Each subgoal → tasks → single‑call API steps
Applicable scenarios : long‑term research, multi‑stage planning, complex codebase refactoring.
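The recursion behind this pattern is short. In the sketch below, `decompose`, `is_atomic`, and `split` are hypothetical names, and a hard-coded lookup table stands in for the model call that would normally propose sub-goals.

```python
from typing import Callable, List

def decompose(
    goal: str,
    is_atomic: Callable[[str], bool],
    split: Callable[[str], List[str]],
    max_depth: int = 5,
) -> List[str]:
    """Recursively split a goal until every leaf is small enough for one model call."""
    if is_atomic(goal) or max_depth == 0:
        return [goal]
    leaves: List[str] = []
    for sub_goal in split(goal):
        leaves.extend(decompose(sub_goal, is_atomic, split, max_depth - 1))
    return leaves


# Toy decomposition table (assumption); a real system would ask a model instead.
TREE = {
    "competitor report": ["identify competitors", "compare pricing"],
    "compare pricing": ["collect price pages", "build pricing table"],
}

atomic_steps = decompose(
    "competitor report",
    is_atomic=lambda g: g not in TREE,
    split=lambda g: TREE[g],
)
```

The `max_depth` guard matters in practice: a model-driven `split` can keep producing sub-goals indefinitely unless the recursion is capped.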
14. Progress Evaluation Loop
Every N steps the system asks, "Is the current action truly moving us toward the ultimate goal?" If yes, continue; if no, switch strategy or re‑plan.
Applicable scenarios : long‑running research agents, multi‑day automation, self‑debugging coding agents.
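A periodic progress check can be sketched as below. The example assumes progress is measurable as a numeric gap to the goal, which is a simplification: in an agent, the question "are we moving toward the goal?" would usually be answered by a model. The names `run_with_progress_checks`, `wander`, and `walk` are hypothetical.

```python
from typing import Callable, Dict, List

def run_with_progress_checks(
    strategies: Dict[str, Callable[[int], int]],
    order: List[str],
    start: int,
    goal: int,
    check_every: int = 3,
    max_steps: int = 30,
) -> List[str]:
    """Advance toward goal; every check_every steps, switch strategy if the gap did not shrink."""
    position, active = start, 0
    gap_at_last_check = abs(goal - position)
    log: List[str] = []
    for step in range(1, max_steps + 1):
        position = strategies[order[active]](position)
        if position == goal:
            log.append(f"step {step}: reached goal with {order[active]}")
            break
        if step % check_every == 0:
            gap = abs(goal - position)
            if gap >= gap_at_last_check and active + 1 < len(order):
                active += 1  # no progress since the last check: switch strategy
                log.append(f"step {step}: switching to {order[active]}")
            gap_at_last_check = gap
    return log

log = run_with_progress_checks(
    strategies={"wander": lambda x: x, "walk": lambda x: x + 1},
    order=["wander", "walk"],
    start=0,
    goal=2,
)
```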
15. Constraint Satisfaction Loop
The loop runs until all hard constraints (budget, quality, latency, brand tone, no hallucinations) are satisfied. If any constraint fails, the system iteratively improves the output.
while not all_constraints_satisfied(output):
    output = improve(output, unsatisfied_constraints)
Applicable scenarios : commercial production systems where any business‑rule violation marks the output as incomplete.
Category 4: Exploration Loops
16. Branch‑and‑Explore Loop
Generate several candidate approaches (conservative, aggressive, creative), evaluate each, and keep only the best.
paths = [
    generate(approach="conservative"),
    generate(approach="aggressive"),
    generate(approach="creative"),
]
# evaluate all paths
scores = [evaluate(p) for p in paths]
best = paths[scores.index(max(scores))]
Applicable scenarios : multi‑version copy testing, architecture decision evaluation, hypothesis validation, A/B generation.
17. Tree Search Loop
Extends branch‑and‑explore in depth: expand promising nodes, prune weak ones, and continue until the best leaf is found.
Root → expand [A, B, C]
├─ A → expand [A1, A2] (promising)
│  └─ A1 → expand [A1a, A1b]
│     └─ A1a → optimal solution ✓
└─ B → prune (poor)
Applicable scenarios : high‑complexity reasoning, long‑horizon planning, code‑base‑scale refactoring.
Cost : computationally expensive but solves tasks a single API call cannot.
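The article does not name a specific search algorithm, so the sketch below uses beam search as one simple way to "expand promising nodes, prune weak ones". The functions `beam_search`, `expand`, and `score` and the toy string-building problem are assumptions for illustration; in an agent, `expand` would generate candidate next steps and `score` would be an evaluator call.

```python
from typing import Callable, List, Tuple

def beam_search(
    root: str,
    expand: Callable[[str], List[str]],
    score: Callable[[str], float],
    beam_width: int = 2,
    depth: int = 3,
) -> Tuple[str, float]:
    """Expand the frontier level by level, keeping only the beam_width best nodes."""
    frontier = [root]
    best = (root, score(root))
    for _ in range(depth):
        children = [child for node in frontier for child in expand(node)]
        if not children:
            break
        children.sort(key=score, reverse=True)
        frontier = children[:beam_width]  # prune everything below the beam
        top = (frontier[0], score(frontier[0]))
        if top[1] > best[1]:
            best = top
    return best


# Toy problem (assumption): build the 3-letter string matching "cab" from a, b, c.
TARGET = "cab"

def expand(node: str) -> List[str]:
    return [node + ch for ch in "abc"] if len(node) < len(TARGET) else []

def score(node: str) -> float:
    return sum(1 for got, want in zip(node, TARGET) if got == want)

best_node, best_score = beam_search("", expand, score)
```

The cost noted above is visible here: each level calls `score` on every child of every surviving node, so evaluator calls grow with `beam_width` times the branching factor times `depth`.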
18. Debate Loop
Two agents argue opposite sides of the same issue, constantly challenging each other’s assumptions and demanding evidence. The final answer emerges from this adversarial tension.
Applicable scenarios : investment decisions, strategic planning, major risk assessment, deep academic or industry critique.
Category 5: System‑Optimization Loops
19. Prompt Optimization Loop
The system scores each task output, identifies failure cases, rewrites the prompt to fix them, and repeats until the target score is reached.
from statistics import mean

current_prompt = "Summarize this document."
for iteration in range(max_iterations):
    outputs = [run(current_prompt, doc) for doc in test_set]
    scores = [evaluate(o) for o in outputs]
    avg_score = mean(scores)
    if avg_score >= target:
        break  # target reached, keep the current prompt
    # collect the outputs that scored below the per-item threshold
    failures = [o for o, s in zip(outputs, scores) if s < threshold]
    current_prompt = improve_prompt(current_prompt, failures)
20. Workflow Optimization Loop
The system measures latency, cost, and quality of each atomic step, then rewrites its own workflow:
If latency exceeds target → parallelize slow steps.
If cost exceeds budget → replace expensive GPT‑4 nodes with cheaper models while preserving quality.
If quality drops → insert a critic before the final output.
metrics = measure_workflow(outputs, latency, cost)
# each rewrite takes the current workflow, so later fixes build on earlier ones
if metrics.latency > target_latency:
    workflow = parallelize(workflow, slow_steps)
if metrics.cost > budget:
    workflow = replace_with_cheaper_model(workflow, high_cost_steps)
if metrics.quality < threshold:
    workflow = add_critic_before(workflow, final_output_step)
When latency, cost, and quality are all within bounds, the system has achieved true self‑evolution.
Unified Formula Behind All Patterns
Despite varied forms, every pattern follows the same skeleton:
Action → Observe → Evaluate → Adjust
This loop refines a rough start into an industrial‑grade product.
TonyBai
Tony Bai's tech world (tonybai.com). Not satisfied with just "knowing how", we strive for mastery. Focused on Go language internals, high-quality engineering practices, and cloud‑native architecture, exploring cutting‑edge intersections of Go and AI. Gophers who pursue technology are welcome—follow me and evolve with Go.
