Loop Engineering: Designing Self‑Running Agent Loops
Addy Osmani introduces Loop Engineering, a shift from writing prompts for coding agents to building autonomous loops composed of automations, worktrees, skills, plugins, sub‑agents, and state, while highlighting benefits, trade‑offs, and the new leverage point for engineers.
Addy Osmani published a detailed article introducing the term "Loop Engineering," describing a shift from manually writing prompts for coding agents to designing autonomous loops that prompt the agents for you.
"You should no longer hand‑craft prompts for programming agents. Instead, design a loop that prompts the agent for you."
Boris Cherny, the creator of Claude Code, had earlier expressed a similar view: "I no longer write prompts; I write loops."
Before and Now
Two years ago, using a programming agent meant writing a good prompt, providing context, waiting for output, and then repeating the process manually. The agent was a tool you held at every step.
Today, you build a small system that finds work, assigns tasks, checks results, records progress, and decides the next step, allowing the system to interact with the agent without manual prompting.
Osmani breaks this system into five building blocks plus a memory layer, all of which are supported by both Codex and Claude Code, albeit under different names.
Five Building Blocks
1. Automations – Heartbeat
Automations keep the loop running on a schedule rather than a single execution. In Codex’s Automations tab you select a project, write a prompt, set a frequency, and route results to a triage inbox, with automatic archiving of empty results. OpenAI uses this internally for issue classification, CI failure aggregation, commit summaries, and bug capture.
Claude Code achieves the same via /loop, cron jobs, hooks, and GitHub Actions. It also offers /goal, which runs until a specified condition (e.g., all tests under test/auth pass and lint is clean) is met, with a separate lightweight model judging each iteration.
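The run-until-condition idea can be sketched in a few lines. This is only an illustration of the control flow, not how /goal is implemented: run_until_goal, step, and goal_met are hypothetical names, and the iteration cap is an added assumption so an unattended loop cannot run forever.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoalResult:
    met: bool
    iterations: int


def run_until_goal(
    step: Callable[[int], None],
    goal_met: Callable[[], bool],
    max_iterations: int = 10,
) -> GoalResult:
    """Call step() until goal_met() returns True or the budget runs out."""
    for i in range(1, max_iterations + 1):
        step(i)          # hypothetical: ask the agent for another attempt
        if goal_met():   # hypothetical: a separate, cheaper judge checks tests and lint
            return GoalResult(met=True, iterations=i)
    return GoalResult(met=False, iterations=max_iterations)
```

Keeping the judge as a separate callable mirrors the point above: the component that decides "done" is not the one that did the work.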
2. Worktrees – Parallel without Conflict
Running multiple agents concurrently can cause file conflicts, analogous to two engineers editing the same line. Git worktrees give each agent its own independent working directory while sharing repository history, preventing interference. Codex has built‑in support; Claude Code uses the --worktree flag or isolation: worktree configuration.
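For a hand-rolled loop, giving each task its own worktree might look like the sketch below. It relies only on the standard git worktree add -b command; the agent/<task_id> branch name and the sibling-directory layout are assumptions made for illustration, not conventions of Codex or Claude Code.

```python
import subprocess
from pathlib import Path


def worktree_add_command(repo: Path, task_id: str) -> list[str]:
    """Build the `git worktree add` command for one agent task."""
    repo = repo.resolve()
    branch = f"agent/{task_id}"                        # hypothetical naming scheme
    path = repo.parent / f"{repo.name}-wt-{task_id}"   # one sibling directory per task
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path)]


def create_worktree(repo: Path, task_id: str) -> Path:
    """Create the worktree on a new branch and return its directory."""
    cmd = worktree_add_command(repo, task_id)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if git fails
    return Path(cmd[-1])
```

Each agent then works only inside the directory returned by create_worktree, so two agents never write to the same checkout even though they share one repository history.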
Osmani warns that tools solve mechanical conflicts, but the real bottleneck is the reviewer’s bandwidth, which determines how many agents can run in parallel.
3. Skills – Avoid Re‑explaining the Project
Each new session traditionally requires re‑explaining project context. A Skill is a folder containing a SKILL.md file with instructions and metadata, plus optional scripts, reference files, and resources. Codex invokes it with $ or /skills; Claude Code uses the same mechanism.
Osmani previously coined the term "intent debt": agents start cold, and any gaps in intent lead to confident guesses. Skills externalize intent, allowing it to be written once and read on every run, preventing the loop from re‑deriving the entire project each time.
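The "written once, read on every run" idea can be illustrated with a small sketch. Codex and Claude Code discover and load skills themselves, so this is not their mechanism; it only shows what a hand-rolled loop might do. The skills/<name>/SKILL.md layout and the load_skills and build_prompt helpers are assumptions.

```python
from pathlib import Path


def load_skills(skills_dir: Path) -> dict[str, str]:
    """Return {skill_name: SKILL.md text} for every skill folder found."""
    skills = {}
    for skill_file in sorted(skills_dir.glob("*/SKILL.md")):
        skills[skill_file.parent.name] = skill_file.read_text(encoding="utf-8")
    return skills


def build_prompt(task: str, skills: dict[str, str]) -> str:
    """Prepend the written-down project intent to a task description."""
    sections = [f"## Skill: {name}\n{text.strip()}" for name, text in skills.items()]
    return "\n\n".join(sections + [f"## Task\n{task}"])
```

Because the intent lives in files, every run starts from the same written context instead of whatever the previous session happened to remember.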
4. Plugins & Connectors – Access Real Tools
Loops that only see the file system are limited. MCP‑based connectors let agents read issue trackers, query databases, call staging APIs, or post to Slack. Both Codex and Claude Code support MCP, so a connector written for one often works for the other. Plugins package connectors and skills for easy sharing.
This capability determines whether an agent merely suggests a fix or actually opens a PR, links a Linear ticket, and notifies a channel when CI passes.
5. Sub‑agents – Separate Writing and Checking
The most valuable loop design separates code generation from verification. The first model writes code; a second, possibly different model, evaluates it. Codex defines sub‑agents via TOML files under .codex/agents/; Claude Code uses .claude/agents/ and agent teams.
Typical division: one agent explores, another implements, a third validates against specifications. In an unattended loop, a trusted validator is the only reason you can safely step away.
+1. State – The Sixth Piece
State is any Markdown file or Linear board that persists beyond a single conversation, recording what has been done and what to do next. Because models forget between runs, state must live on disk, not just in context; the repository remembers, the model does not.
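A minimal sketch of state on disk, assuming a Markdown checklist as the format. The file name, the checkbox convention, and the helper names are illustrative choices, not something the article prescribes.

```python
from pathlib import Path

OPEN, DONE = "- [ ] ", "- [x] "


def add_todo(state_file: Path, item: str) -> None:
    """Append an open item for a later run to pick up."""
    with state_file.open("a", encoding="utf-8") as f:
        f.write(f"{OPEN}{item}\n")


def mark_done(state_file: Path, item: str) -> None:
    """Flip a matching open item to done, leaving other lines untouched."""
    lines = state_file.read_text(encoding="utf-8").splitlines()
    updated = [DONE + item if line == OPEN + item else line for line in lines]
    state_file.write_text("\n".join(updated) + "\n", encoding="utf-8")


def pending(state_file: Path) -> list[str]:
    """Return the open items; a fresh run starts here, not from model memory."""
    if not state_file.exists():
        return []
    lines = state_file.read_text(encoding="utf-8").splitlines()
    return [line[len(OPEN):] for line in lines if line.startswith(OPEN)]
```

A new run calls pending() first, so what happens next is decided by the repository rather than by anything the model is expected to recall.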
What a Loop Looks Like
Osmani shares a common pattern: each morning an automation runs, invoking a triage skill that reads yesterday’s CI failures, open issues, and recent commits, then writes findings to Markdown or Linear. For each actionable finding, a new worktree is created, a sub‑agent drafts a fix, and another sub‑agent validates it against project skills and tests. Connectors let the loop open PRs and update tickets automatically; anything the loop cannot handle lands in the triage inbox for manual review.
You design the loop once; you never manually prompt any step thereafter.
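The morning pattern described above can be sketched as a single orchestration function. All four callables (triage, draft_fix, validate, open_pr) are hypothetical stand-ins for the skill, the sub-agents, and the connector; a real loop would wire them to whatever tool it runs on.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class LoopReport:
    opened_prs: list[str] = field(default_factory=list)
    inbox: list[str] = field(default_factory=list)


def run_morning_loop(
    triage: Callable[[], list[str]],
    draft_fix: Callable[[str], str],
    validate: Callable[[str, str], bool],
    open_pr: Callable[[str, str], str],
) -> LoopReport:
    """Triage findings, draft and validate a fix for each, escalate the rest."""
    report = LoopReport()
    for finding in triage():
        try:
            patch = draft_fix(finding)     # writer sub-agent, in its own worktree
            ok = validate(finding, patch)  # separate validator sub-agent
        except Exception:
            ok = False                     # anything the loop cannot handle
        if ok:
            report.opened_prs.append(open_pr(finding, patch))
        else:
            report.inbox.append(finding)   # left for manual review
    return report
```

The inbox list is the important part of the design: a finding that fails validation or raises an error is surfaced to a person instead of being retried silently.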
Things Loops Can’t Solve
Osmani lists three problems that loops sharpen rather than solve:
Verification remains your responsibility. An unattended loop can still make mistakes; separating the writer from the validator only makes the claim of "done" slightly stronger, but it is still a claim, not proof.
Your understanding will degrade. Faster loop‑generated code widens the gap between what you understand and the codebase, creating "comprehension debt" unless you actively read the loop’s output.
Cognitive surrender. When the loop runs autonomously, it’s easy to stop exercising judgment. Designing the loop with active judgment makes it a remedy; using it to avoid thinking makes it a catalyst for loss of insight.
Key Insight
The most valuable part of the article is not the five building blocks, which are documented elsewhere, but the shift in leverage point. As Boris Cherny says, "I no longer prompt Claude; I write loops." The work isn’t lighter; the point of effort has moved. Two people can build identical loops that produce opposite outcomes: one uses the loop to deepen understanding, another to evade it. The loop itself is indifferent; the engineer’s intent decides the result.
Thus, loop design is harder than prompt engineering, not easier.
Build the loop. But build it like someone who intends to stay the engineer, not just the person who presses go.
Note on token cost: the loop’s reach is limited by your token budget, a constraint set by how model providers price usage.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
