Loop Engineering: Designing Self‑Running Agent Loops

Addy Osmani introduces Loop Engineering, a shift from writing prompts for coding agents to building autonomous loops composed of automations, worktrees, skills, plugins, sub‑agents, and state, while highlighting benefits, trade‑offs, and the new leverage point for engineers.

AI Engineering

Addy Osmani published a detailed article introducing the term "Loop Engineering," which describes a shift from manually writing prompts for coding agents to designing autonomous loops that prompt the agents for you.

"You should no longer hand‑craft prompts for programming agents. Instead, design a loop that prompts the agent for you."

Boris Cherny, the creator of Claude Code, had earlier expressed a similar view: "I no longer write prompts; I write loops."

Before and Now

Two years ago, using a coding agent meant writing a good prompt, providing context, waiting for output, and then repeating the process by hand. The agent was a tool you held at every step.

Today, you build a small system that finds work, assigns tasks, checks results, records progress, and decides the next step, allowing the system to interact with the agent without manual prompting.

Osmani breaks this system into five building blocks plus a memory layer, all of which are supported by both Codex and Claude Code, albeit under different names.

Five Building Blocks

1. Automations – Heartbeat

Automations keep the loop running on a schedule rather than a single execution. In Codex’s Automations tab you select a project, write a prompt, set a frequency, and route results to a triage inbox, with automatic archiving of empty results. OpenAI uses this internally for issue classification, CI failure aggregation, commit summaries, and bug capture.

Claude Code achieves the same via /loop, cron jobs, hooks, and GitHub Actions. It also offers /goal, which runs until a specified condition (e.g., all tests under test/auth pass and lint is clean) is met, with a separate lightweight model judging each iteration.
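The "run until a condition is met" idea can be sketched in a few lines of Python. This is only an illustration of the control flow, not how /goal or Codex Automations are implemented: `step` and `goal_met` are hypothetical callables standing in for "prompt the agent once" and "ask a separate, cheaper judge whether the goal holds", and `max_iterations` is an assumed safety cap.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoalResult:
    met: bool
    iterations: int


def run_until_goal(
    step: Callable[[int], str],
    goal_met: Callable[[str], bool],
    max_iterations: int = 10,
) -> GoalResult:
    """Call step() repeatedly until goal_met() accepts its output.

    Both callables are hypothetical stand-ins: step would prompt the agent,
    goal_met would be the separate judge that checks each iteration.
    """
    for i in range(1, max_iterations + 1):
        output = step(i)
        if goal_met(output):
            return GoalResult(met=True, iterations=i)
    # Hard cap so an unattended loop cannot run forever.
    return GoalResult(met=False, iterations=max_iterations)
```

In practice `goal_met` might wrap something like "the test suite and the linter both exit cleanly"; the cap matters because a judge that never says yes would otherwise burn tokens indefinitely.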

2. Worktrees – Parallel without Conflict

Running multiple agents concurrently can cause file conflicts, analogous to two engineers editing the same line. Git worktrees give each agent its own independent working directory while sharing repository history, preventing interference. Codex has built‑in support; Claude Code uses the --worktree flag or isolation: worktree configuration.

Osmani warns that tools solve mechanical conflicts, but the real bottleneck is the reviewer’s bandwidth, which determines how many agents can run in parallel.
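For loops that manage isolation themselves, one worktree per task can be created with plain git. The sketch below assumes a hypothetical naming scheme (branch `agent/<task_id>`, directory `<worktrees_dir>/<task_id>`); only the `git worktree add -b <branch> <path> <commit-ish>` command itself is standard git.

```python
import subprocess
from pathlib import Path


def worktree_add_command(worktrees_dir: Path, task_id: str, base: str = "HEAD") -> list[str]:
    """Build `git worktree add -b <branch> <path> <base>` for one agent task."""
    branch = f"agent/{task_id}"      # hypothetical branch naming scheme
    path = worktrees_dir / task_id   # one directory per task
    return ["git", "worktree", "add", "-b", branch, str(path), base]


def create_worktree(repo: Path, worktrees_dir: Path, task_id: str) -> Path:
    """Create the worktree inside `repo` and return its directory.

    Raises subprocess.CalledProcessError if git fails, for example when the
    branch already exists.
    """
    cmd = worktree_add_command(worktrees_dir, task_id)
    subprocess.run(cmd, cwd=repo, check=True)
    return worktrees_dir / task_id
```

Each agent then runs with its own directory as the working directory, so two agents never write to the same checkout even though they share one repository history.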

3. Skills – Avoid Re‑explaining the Project

Each new session traditionally requires re‑explaining project context. A Skill is a folder containing a SKILL.md file (instructions and metadata) plus optional scripts, reference files, and resources. Codex invokes it with $ or /skills; Claude Code uses the same mechanism.

Osmani previously coined the term "intent debt": agents start cold, and any gap in stated intent gets filled with a confident guess. Skills externalize intent, so it is written once and read on every run, and the loop does not have to re‑derive the entire project each time.
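To make the folder layout concrete, here is a small Python sketch that scaffolds a skill directory. The front matter fields (`name`, `description`) and the "Commands" section are assumptions chosen for illustration; check each tool's documentation for the exact fields it reads. The function name `write_skill` and the example values are hypothetical.

```python
from pathlib import Path

# Assumed layout: YAML-style front matter followed by free-form instructions.
SKILL_TEMPLATE = """\
---
name: {name}
description: {description}
---

# {name}

## Commands
{commands}
"""


def write_skill(skills_root: Path, name: str, description: str, commands: dict[str, str]) -> Path:
    """Create <skills_root>/<name>/SKILL.md and return its path."""
    skill_dir = skills_root / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    # One bullet per project command, so the agent does not have to guess them.
    lines = "\n".join(f"- {label}: `{cmd}`" for label, cmd in commands.items())
    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(
        SKILL_TEMPLATE.format(name=name, description=description, commands=lines),
        encoding="utf-8",
    )
    return skill_file
```

The point is less the scaffolding than the habit: whatever you would otherwise retype at the start of a session (how to run tests, which directories are off limits) belongs in this file.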

4. Plugins & Connectors – Access Real Tools

Loops that only see the file system are limited. MCP‑based connectors let agents read issue trackers, query databases, call staging APIs, or post to Slack. Both Codex and Claude Code support MCP, so a connector written for one often works for the other. Plugins package connectors and skills for easy sharing.

This capability determines whether an agent merely suggests a fix or actually opens a PR, links a Linear ticket, and notifies a channel when CI passes.

5. Sub‑agents – Separate Writing and Checking

The most valuable loop design separates code generation from verification. The first model writes code; a second, possibly different model, evaluates it. Codex defines sub‑agents via TOML files under .codex/agents/; Claude Code uses .claude/agents/ and agent teams.

Typical division: one agent explores, another implements, a third validates against specifications. In an unattended loop, a trusted validator is the only reason you can safely step away.
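The writer/validator split can be sketched as a retry loop in which the validator's feedback is the only thing passed back to the writer. `writer`, `validator`, and `Verdict` are hypothetical names; in a real loop each callable would wrap a sub‑agent call, ideally to different models.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Verdict:
    ok: bool
    feedback: str = ""


def write_then_validate(
    task: str,
    writer: Callable[[str, str], str],
    validator: Callable[[str, str], Verdict],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return a patch the validator accepted, or None to escalate to a human."""
    feedback = ""
    for _ in range(max_attempts):
        patch = writer(task, feedback)    # writer sees the task and the last feedback
        verdict = validator(task, patch)  # validator judges the patch on its own
        if verdict.ok:
            return patch
        feedback = verdict.feedback
    return None
```

Returning `None` after a bounded number of attempts is a deliberate choice in this sketch: a loop that cannot convince its validator should hand the task to a person rather than keep trying.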

+1. State – The Sixth Piece

State is any Markdown file or Linear board that persists beyond a single conversation, recording what has been done and what to do next. Because models forget between runs, state must live on disk, not just in context; the repository remembers, the model does not.
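A minimal sketch of state on disk, assuming the state file is a Markdown checklist (`- [ ]` for open items, `- [x]` for finished ones). The file format and the function names are illustrative choices, not something the article prescribes.

```python
from pathlib import Path
from typing import Optional

DONE, TODO = "- [x] ", "- [ ] "


def load_state(path: Path) -> dict[str, bool]:
    """Parse a Markdown checklist into {task: done}; a missing file means no state."""
    if not path.exists():
        return {}
    state: dict[str, bool] = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith(DONE):
            state[line[len(DONE):]] = True
        elif line.startswith(TODO):
            state[line[len(TODO):]] = False
    return state


def save_state(path: Path, state: dict[str, bool]) -> None:
    """Write the checklist back so the next run starts from it."""
    lines = [(DONE if done else TODO) + task for task, done in state.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


def next_task(state: dict[str, bool]) -> Optional[str]:
    """Return the first unfinished task, or None when everything is done."""
    return next((task for task, done in state.items() if not done), None)
```

Because the file is plain Markdown committed to the repository, both the next run of the loop and a human reviewer can read exactly what was done and what remains.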

What a Loop Looks Like

Osmani shares a common pattern: each morning an automation runs, invoking a triage skill that reads yesterday’s CI failures, open issues, and recent commits, then writes findings to Markdown or Linear. For each actionable finding, a new worktree is created, a sub‑agent drafts a fix, and another sub‑agent validates it against project skills and tests. Connectors let the loop open PRs and update tickets automatically; anything the loop cannot handle lands in the triage inbox for manual review.
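The pattern above can be sketched as a single orchestration pass. All four callables (`triage`, `draft_fix`, `validate`, `open_pr`) are hypothetical placeholders for the skill, sub‑agents, and connector described in the text; the sketch shows only how the pieces are wired and where unhandled work goes.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class LoopReport:
    opened_prs: list[str] = field(default_factory=list)
    inbox: list[str] = field(default_factory=list)


def run_daily_loop(
    triage: Callable[[], list[str]],
    draft_fix: Callable[[str], Optional[str]],
    validate: Callable[[str, str], bool],
    open_pr: Callable[[str, str], str],
) -> LoopReport:
    """One pass: triage, then draft, validate, and open a PR, else send to the inbox."""
    report = LoopReport()
    for finding in triage():
        patch = draft_fix(finding)  # in a real loop this would run in its own worktree
        if patch is not None and validate(finding, patch):
            report.opened_prs.append(open_pr(finding, patch))
        else:
            report.inbox.append(finding)  # left for manual review
    return report
```

A scheduler (cron, a CI workflow, or the tool's own automation feature) would call `run_daily_loop` each morning and persist the returned report to the state file or ticket board.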

You design the loop once; you never manually prompt any step thereafter.

Things Loops Can’t Solve

Osmani lists three problems that loops sharpen rather than solve:

Verification remains your responsibility. An unattended loop can still make mistakes; separating the writer from the validator only makes the claim of "done" slightly stronger, but it is still a claim, not proof.

Your understanding will degrade. Faster loop‑generated code widens the gap between what you understand and the codebase, creating "comprehension debt" unless you actively read the loop’s output.

Cognitive surrender. When the loop runs autonomously, it’s easy to stop exercising judgment. Designing the loop with active judgment makes it a remedy; using it to avoid thinking makes it a catalyst for loss of insight.

Key Insight

The most valuable part of the article is not the five building blocks, which are documented elsewhere, but the shift in leverage point. As Boris Cherny says, "I no longer prompt Claude; I write loops." The work isn’t lighter; the point of effort has moved. Two people can build identical loops that produce opposite outcomes: one uses the loop to deepen understanding, the other to evade it. The loop itself is indifferent; the engineer’s intent decides the result.

Thus, loop design is harder than prompt engineering, not easier.

Build the loop. But build it like someone who intends to stay the engineer, not just the person who presses go.

Note on token cost: how far a loop can reach is bounded by your token budget, which in turn depends on model providers’ pricing.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

AI Engineering

Focused on cutting‑edge product and technology information and practical experience sharing in the AI field (large models, MLOps/LLMOps, AI application development, AI infrastructure).
