
Loop Engineering Guide: Build the Brakes Before the Loop

This article explains how to design reliable AI‑agent loops by first defining clear stop conditions, evidence collection, and hand‑off points, then detailing the minimal components, loop types, cost controls, and practical CI and verification examples to avoid runaway automation.

Architect

Jun 15, 2026


Running an agent continuously can produce many commits, files, and summaries while the original bug remains. Loop Engineering focuses on stopping the loop at the right time, handing over to a human when needed, and leaving sufficient evidence for the next person.

Loop definition

A loop is a small system that performs trigger → execute → verify → record → continue or stop. The most dangerous loops are the ones that run smoothly while no one understands why they keep going.
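The five steps can be sketched as a small driver function. This is a minimal illustration, not a reference implementation: `run_loop`, `RoundRecord`, `LoopRun`, and the default of three rounds are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RoundRecord:
    """One ledger entry: what a round produced and whether it verified."""
    round_no: int
    output: str
    verified: bool


@dataclass
class LoopRun:
    records: List[RoundRecord] = field(default_factory=list)
    stop_reason: str = ""


def run_loop(execute: Callable[[int], str],
             verify: Callable[[str], bool],
             max_rounds: int = 3) -> LoopRun:
    """Called by the trigger; runs execute -> verify -> record -> continue or stop."""
    run = LoopRun()
    for round_no in range(1, max_rounds + 1):
        output = execute(round_no)                              # execute
        ok = verify(output)                                     # verify
        run.records.append(RoundRecord(round_no, output, ok))   # record
        if ok:
            run.stop_reason = "verified"                        # stop: goal reached
            return run
    run.stop_reason = "max_rounds_reached"                      # stop: brake engaged
    return run
```

Every exit path sets `stop_reason`, so whoever reads the records can see why the loop ended as well as what it did.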

Design checklist

Define the allowed actions before expanding capabilities.

Start with low‑risk loops such as CI triage, fact‑checking, documentation drift detection, dependency pre‑checks, or duplicate‑failure classification.

A loop needs at least four explicit boundaries: input, permissions, verification, and stop conditions.

Reusable value comes from Skills, Runbooks, tests, state ledgers, and acceptance criteria, not from a single magic prompt.

One‑sentence rule: Write the brake before you write the loop.

Prompt is still needed

Prompts still provide goals, context, constraints, and acceptance criteria, but they now live inside a maintainable engineering asset instead of a one‑off chat input.

Goal → Experience → Task → Result → When to stop

When a goal becomes a runtime object it must include scope, invariants, verification evidence, budget limits, and a closure output.
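One way to picture a goal as a runtime object is a small record type that can report which required parts are still empty. The field names mirror the sentence above; the `Goal` and `Budget` classes and the `missing_parts` helper are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Budget:
    max_rounds: int
    max_minutes: int
    max_tokens: int


@dataclass
class Goal:
    description: str
    scope: List[str] = field(default_factory=list)                  # paths or systems in play
    invariants: List[str] = field(default_factory=list)             # must stay true throughout
    verification_evidence: List[str] = field(default_factory=list)  # what proves completion
    budget: Optional[Budget] = None                                 # hard spending limits
    closure_output: str = ""                                        # artifact left at the end

    def missing_parts(self) -> List[str]:
        """Return the names of required parts that are still empty."""
        missing = []
        if not self.scope:
            missing.append("scope")
        if not self.invariants:
            missing.append("invariants")
        if not self.verification_evidence:
            missing.append("verification_evidence")
        if self.budget is None:
            missing.append("budget")
        if not self.closure_output.strip():
            missing.append("closure_output")
        return missing
```

A loop runner could refuse to start while `missing_parts()` is non-empty, which turns the checklist into an enforced precondition.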

Minimal loop structure

A usable loop can be broken into six parts: trigger, input, allowed actions, prohibited actions, verification, and stop. Missing any part makes the loop prone to failure. The Evaluator and State components are often overlooked; without an Evaluator the loop self‑audits optimistically, and without State each iteration forgets previous decisions.
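To make the State point concrete, here is a minimal sketch of a ledger that persists decisions between iterations. The JSON Lines file format, the `StateLedger` class, and the `action` key are illustrative assumptions, not something the article prescribes.

```python
import json
from pathlib import Path
from typing import Dict, List


class StateLedger:
    """Append-only JSON Lines file that survives between iterations."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)

    def append(self, entry: Dict[str, str]) -> None:
        # One JSON object per line keeps earlier rounds untouched.
        with self.path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(entry, ensure_ascii=False) + "\n")

    def load(self) -> List[Dict[str, str]]:
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]

    def already_tried(self, action: str) -> bool:
        """True if a previous round already recorded this action."""
        return any(entry.get("action") == action for entry in self.load())
```

Checking `already_tried` before each attempt keeps a loop from repeating a fix that an earlier round already recorded as failed.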

Three loop types

Reminder loop

Periodically discovers problems and generates a checklist. Typical scenarios:

Daily scan for new issues.

Hourly CI failure check.

Nightly aggregation of production errors.

Morning generation of technical topic candidates.

Best as a first loop, because it only reads and outputs a to-do list.

Fix loop

Attempts fixes in an isolated workspace. Typical actions:

Auto‑fix flaky tests.

Try minimal dependency upgrades.

Repair broken documentation links.

Add missing tests.

Apply small review‑comment changes.

Requires a worktree, iteration limit, tests, and human review before merging.
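The requirements above can be sketched as two small helpers: one that builds the `git worktree` commands for an isolated attempt, and one that gates merging. The branch and path naming scheme, `worktree_commands`, and `may_merge` are hypothetical; the commands are only constructed here, not executed.

```python
from typing import Dict, List


def worktree_commands(attempt_id: str, base_ref: str = "main") -> Dict[str, List[str]]:
    """Build (but do not run) the git commands for one isolated fix attempt."""
    branch = f"loop/fix-{attempt_id}"         # hypothetical naming scheme
    path = f"../worktrees/fix-{attempt_id}"   # hypothetical location outside the main checkout
    return {
        "create": ["git", "worktree", "add", "-b", branch, path, base_ref],
        "cleanup": ["git", "worktree", "remove", "--force", path],
    }


def may_merge(rounds_used: int, max_rounds: int,
              tests_passed: bool, human_approved: bool) -> bool:
    """A fix is mergeable only within the iteration limit, with green tests and human sign-off."""
    return rounds_used <= max_rounds and tests_passed and human_approved
```

Passing the returned lists to `subprocess.run(cmd, check=True)` would execute them. Keeping construction separate from execution makes the isolation step easy to review and test.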

Evolve loop

Continuously discovers new tasks, plans, assigns, verifies, and writes state back. It can read GitHub, Slack, user feedback, and decide what to build. This type is the most imaginative but also the riskiest because it touches product decisions, permissions, budgets, and organizational debt.

Typical adoption path: Reminder → Fix → Evolve.

CI split loop example

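Goal: Every morning at 9:00, check the CI runs that failed in the past 24 hours.

Input:
- Name of the failed workflow;
- Job logs;
- The 5 most recent commits;
- Related test files;
- Whether the same kind of failure occurred in the past 7 days.

Allowed actions:
- Read-only analysis;
- Classify the failure;
- For low-risk problems, create a separate worktree and attempt a fix;
- Generate a candidate PR.

Prohibited actions:
- Do not skip tests;
- Do not modify production configuration;
- Do not change database migrations;
- Do not merge directly.

Verification:
- Target tests pass;
- Full lint/type-check passes;
- Output the failure classification, evidence links, and a change summary.

Stop:
- At most 3 rounds;
- At most 30 minutes per problem;
- Stop if the failure reason is unchanged for two consecutive rounds;
- Hand back to a human when permissions, data, billing, or security are involved.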
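The stop section of this template translates almost directly into code. The sketch below assumes each round reports a failure reason and the areas it touched; `RoundResult`, `stop_reason`, and the returned labels are hypothetical names, while the limits (3 rounds, 30 minutes, two identical failures, sensitive areas) come from the template.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Areas the template says must go back to a human.
SENSITIVE_AREAS = {"permissions", "data", "billing", "security"}


@dataclass
class RoundResult:
    failure_reason: Optional[str]                      # None means the target tests passed
    touched_areas: Set[str] = field(default_factory=set)


def stop_reason(history: List[RoundResult], elapsed_minutes: float,
                max_rounds: int = 3, max_minutes: float = 30.0) -> Optional[str]:
    """Return why the loop must stop, or None if another round is allowed."""
    if not history:
        return None
    last = history[-1]
    if last.touched_areas & SENSITIVE_AREAS:
        return "hand_off_to_human"
    if last.failure_reason is None:
        return "fixed"
    if elapsed_minutes >= max_minutes:
        return "time_budget_exhausted"
    if len(history) >= 2 and history[-2].failure_reason == last.failure_reason:
        return "no_progress"                           # same failure two rounds in a row
    if len(history) >= max_rounds:
        return "round_limit_reached"
    return None
```

The sensitive-area check runs first on purpose: a round that touches billing or security is handed off whatever its test result.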

This template is simple, yet close to a production-ready first version. Its value lies in handling the first round of triage: classifying failures and presenting actionable evidence to humans.

Writing verification loop example

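Goal: Check the factual claims in a technical draft and output a pre-publication risk list.

Input:
- Current draft;
- List of public sources;
- Official documentation;
- Links to public discussions;
- Related code repositories or READMEs.

Allowed actions:
- Extract factual claims;
- Label credibility by source type;
- Consult official documentation, code, papers, and interview transcripts;
- Output suggested revisions.

Prohibited actions:
- Do not directly replace key conclusions;
- Do not present secondhand sources as official confirmation;
- Do not invent a citation for a number that lacks a source.

Verification:
- Every fact has at least a source type;
- High-risk claims are flagged in red;
- Time-sensitive information carries a date;
- Content that cannot be confirmed is downgraded or deleted.

Stop:
- Stop when no credible source can be found;
- Hand over to the author when sources conflict;
- Do not auto-finalize conclusions involving legal, financial, or security matters.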
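The verification rules in this template can be sketched as a function that turns one extracted claim into a list of risk notes. The `Claim` fields, the `"secondhand"` label, and the wording of the notes are assumptions for illustration. The function only reports and never rewrites the draft, matching the prohibited actions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Claim:
    text: str
    source_type: Optional[str] = None   # e.g. "official_docs", "code", "secondhand"
    high_risk: bool = False
    time_sensitive: bool = False
    checked_on: Optional[str] = None    # date the claim was last confirmed


def review_claim(claim: Claim) -> List[str]:
    """Return risk notes for one claim; an empty list means nothing to flag."""
    notes = []
    if claim.source_type is None:
        notes.append("no source: downgrade or delete")
    elif claim.source_type == "secondhand":
        notes.append("secondhand source: do not present as officially confirmed")
    if claim.high_risk:
        notes.append("high risk: flag for the author")
    if claim.time_sensitive and not claim.checked_on:
        notes.append("time-sensitive: add the date it was checked")
    return notes
```

Running `review_claim` over every extracted claim and keeping only the non-empty results yields the pre-publication risk list.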

This loop reads a lot, writes little, and forces a clear separation between observed statements and the author’s judgments.

Cost pre‑allocation

Loop cost is multiplicative:

Total cost ≈ triggers × context per round × number of agents × tool calls × retries

The design must include explicit budget fields such as maximum runtime, iteration count, token or monetary caps, and a maximum number of no-progress rounds. A simple stop rule: if two consecutive rounds neither add new evidence, nor shrink the failure scope, nor pass any new verification, stop and hand over to a human.
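Both ideas fit in a few lines. The sketch below treats the cost formula as a relative estimate (a proxy for comparison, not a bill) and implements the no-progress rule; `estimate_cost`, `RoundProgress`, `should_hand_over`, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass
from typing import List


def estimate_cost(triggers: int, context_per_round: int, agents: int,
                  tool_calls: int, retries: int) -> int:
    """Relative cost units: the product of the five factors."""
    return triggers * context_per_round * agents * tool_calls * retries


@dataclass
class RoundProgress:
    new_evidence: bool = False
    scope_shrunk: bool = False
    new_check_passed: bool = False

    @property
    def made_progress(self) -> bool:
        return self.new_evidence or self.scope_shrunk or self.new_check_passed


def should_hand_over(rounds: List[RoundProgress], max_no_progress: int = 2) -> bool:
    """True when the last `max_no_progress` rounds all made no progress."""
    if max_no_progress < 1:
        raise ValueError("max_no_progress must be at least 1")
    recent = rounds[-max_no_progress:]
    return len(recent) == max_no_progress and not any(r.made_progress for r in recent)
```

With the sample values 24 triggers, 20,000 context tokens, 2 agents, 5 tool calls, and 3 retries, the estimate is 14,400,000 units. Doubling any single factor doubles it, so each factor needs its own cap.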

Avoid self‑audit

Split responsibilities between deterministic validators (tests, lint, type‑check, schema validation, screenshots, SQL read‑only audit, static scan) and a reviewer agent that only checks a checklist:

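You are the verifier, not the implementer.
Check only the following:
1. Whether the SPEC is satisfied;
2. Whether there are unverified claims;
3. Whether permissions or the scope of changes were expanded;
4. Whether tests were skipped;
5. Whether irreversible changes were introduced;
6. Whether a human decision is needed.

Output:
- pass/fail;
- Evidence;
- Questions that need human confirmation;
- Direct fixes are not allowed.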
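A sketch of how the two kinds of checks might be combined into the reviewer's output. It assumes deterministic validator results and checklist answers arrive as name-to-boolean maps; `Verdict` and `build_verdict` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Verdict:
    passed: bool
    evidence: List[str]
    needs_human: List[str]
    # Deliberately no "patch" field: the reviewer reports, it does not fix.


def build_verdict(validators: Dict[str, bool],
                  checklist: Dict[str, bool],
                  needs_human: List[str]) -> Verdict:
    """Pass only if every deterministic validator and every checklist item passed."""
    results = list(validators.items()) + list(checklist.items())
    evidence = [f"{name}: {'pass' if ok else 'fail'}" for name, ok in results]
    passed = all(ok for _, ok in results)
    return Verdict(passed=passed, evidence=evidence, needs_human=list(needs_human))
```

Keeping the verdict free of any field that could carry a change enforces the last output rule structurally instead of relying on the prompt alone.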

When not to use a loop

Goal changes daily – the loop chases noise.

Verification relies on “feeling right” – hard to know if truly completed.

Requires production write permission – high‑impact error risk.

Depends on oral background – state ledger can’t capture context.

Budget has no ceiling – risk of endless spending.

Team never reads results – loop creates orphaned output.

One‑off task – automation cost outweighs benefit.

Before building a loop, ask: “Who will read the output tomorrow morning? Who has the authority to decide the next step?” If there is no answer, don’t build it.

Loop design checklist

Loop Name: One‑sentence purpose.

Business Goal: Cost reduction or quality improvement target.

Trigger: Schedule, event, manual, unmet goal.

Input Sources: Logs, issues, PRs, docs, code, monitoring.

Trust Level: Which sources are authoritative.

Readable Scope: Allowed repositories, directories, systems.

Writable Scope: Read‑only or writable locations.

Isolation: Worktree, temporary branch, sandbox, snapshot.

Process Assets: Skills, Runbooks, SPEC, rules.

Actions: Classify, fix, open PR, write report, notify.

Verification: Tests, lint, screenshots, logs, reviewer agent.

State Ledger: Where each round writes results.

Cost Limits: Time, iterations, token/amount caps.

Stop Conditions: Failure, no progress, over budget, forbidden zone.

Human Escalation: When to hand over to a person.

Rollback: How to revert changes and trace them.

Retrospective: How failures feed back into Skills/Runbooks.

If any field cannot be filled, the system isn’t ready yet.
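The readiness rule can be checked mechanically. The sketch below assumes a loop design is stored as a dictionary keyed by snake_case versions of the checklist fields; the key names and both helper functions are hypothetical.

```python
from typing import Dict, List, Optional

# Snake_case versions of the checklist fields above, in the same order.
REQUIRED_FIELDS = [
    "loop_name", "business_goal", "trigger", "input_sources", "trust_level",
    "readable_scope", "writable_scope", "isolation", "process_assets",
    "actions", "verification", "state_ledger", "cost_limits",
    "stop_conditions", "human_escalation", "rollback", "retrospective",
]


def unfilled_fields(design: Dict[str, Optional[str]]) -> List[str]:
    """Return checklist fields that are missing, None, or blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = design.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def is_ready(design: Dict[str, Optional[str]]) -> bool:
    """The loop is ready only when every field has been filled in."""
    return not unfilled_fields(design)
```

Running `unfilled_fields` in a design review shows which decisions are still open before any agent is given permissions.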

Final insight

Loop Engineering may be a passing buzzword, but the underlying engineering problems persist. When agents run continuously we must consider why they wake, where they act, which experience they follow, how they prove correctness, when they stop, and how failure evidence is retained.

Think of a loop as a tiny runtime design, not just a prompt trick. It frees humans from repetitive steps but forces clearer goals, boundaries, evidence, and budgets.

