Why You Should Stop Hand‑Writing Prompts: Loop Engineering Lets AI Run Itself
The article explains Loop Engineering—a three‑layered approach that moves AI from manual prompt writing to autonomous loops, detailing its core components, practical implementations in Codex and Claude Code, and the trade‑offs such as token cost, comprehension debt, and design complexity.
From Hand‑Writing Prompts to Autonomous Loops
AI agents are becoming increasingly capable, planning and completing tasks without constant human input. Google engineer Addy Osmani posted that we should stop hand‑crafting prompts and instead design a looping system that lets the AI run itself.
Loop Engineering is described as a shift from manually feeding an AI to letting the AI find work, check results, and advance autonomously. Claude Code’s creator echoed this, saying he no longer writes prompts for Claude but runs loops that handle prompting and decision‑making.
Three Hierarchical Layers
Osmani outlines three layers: Prompt Engineering (optimising a single prompt’s wording), Context Engineering (optimising the documents, history, and tool definitions the model sees), and Loop Engineering (optimising an automated system that decides when to prompt, what to prompt, and whether the result is acceptable). Each layer wraps the one below it, so each step outward gives the engineer more leverage.
Agent Harness Engineering
Osmani also references an earlier concept, Agent Harness Engineering, which designs the runtime environment for a single agent. Harnesses run on timers, spawning small helpers that feed themselves work.
Five Essential Building Blocks + a Notebook
A functional loop requires five components plus a persistent notebook:
Automations – the heartbeat that repeatedly triggers the loop (e.g., a scheduled task in Codex or Claude Code).
Skills – reusable, formatted intent files (SKILL.md) that describe commands, metadata, and optional scripts.
Connectors – MCP‑based adapters that let agents read issue trackers, query databases, call staging APIs, or post to Slack.
Sub‑agents – separate agents for distinct roles such as exploration, implementation, and verification.
State file – a Markdown file or Linear board that records completed and pending items, providing memory across loop iterations.
Plugins are the distribution mechanism for Skills, allowing reuse across repositories.
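As a rough sketch of how these pieces fit together, the Python below models a single loop iteration. Every name in it (Task, run_iteration, and the three role callables) is hypothetical and stands in for real agent calls; it is not the API of Codex or Claude Code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    title: str
    status: str = "pending"  # "pending", "done", or "failed"
    notes: str = ""


def run_iteration(
    state: List[Task],
    explore: Callable[[Task], str],
    implement: Callable[[Task, str], str],
    verify: Callable[[Task, str], bool],
) -> Optional[Task]:
    """Run one iteration: pick a pending task, work it, record the outcome."""
    task = next((t for t in state if t.status == "pending"), None)
    if task is None:
        return None  # nothing left to do, so the automation can go idle
    findings = explore(task)            # explorer sub-agent: read-only research
    change = implement(task, findings)  # implementer sub-agent: proposes a change
    ok = verify(task, change)           # verifier sub-agent: accepts or rejects
    task.status = "done" if ok else "failed"
    task.notes = change
    return task


# Stub sub-agents (assumptions) so the sketch runs without any model calls.
state = [Task("fix flaky test"), Task("update changelog")]
result = run_iteration(
    state,
    explore=lambda t: f"context for {t.title}",
    implement=lambda t, ctx: f"patch using {ctx}",
    verify=lambda t, patch: "patch" in patch,
)
print(result.title, result.status)
```

In a real loop the automation would call run_iteration on a schedule, and the state list would be loaded from and written back to the state file rather than held in memory.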
Automation Details
In Codex, an Automation is created on the Automations tab, where you select a project, prompt, frequency, and runtime environment. Results are sent to a Triage inbox and auto‑archived if no issues are found. Claude Code offers similar functionality with /goal, /loop, and pause/resume controls; at the end of each round, a lightweight model judges whether the goal has been met.
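The round-then-judge pattern can be illustrated with a small Python sketch. The function run_until_done and its work and judge parameters are hypothetical; in practice work would call the main agent and judge would call a cheaper model, and neither is meant to mirror how /goal is actually implemented.

```python
from typing import Callable, List, Tuple


def run_until_done(
    goal: str,
    work: Callable[[str, List[str]], str],
    judge: Callable[[str, List[str]], bool],
    max_rounds: int = 5,
) -> Tuple[bool, List[str]]:
    """Repeat work rounds until the judge accepts or the round cap is hit."""
    history: List[str] = []
    for _ in range(max_rounds):
        history.append(work(goal, history))  # one round of agent work
        if judge(goal, history):             # lightweight completion check
            return True, history
    return False, history  # cap reached: hand the goal back to a human


# Stubs (assumptions): each round yields a numbered step, and the judge
# accepts once three steps exist.
done, steps = run_until_done(
    "make tests pass",
    work=lambda goal, history: f"step {len(history) + 1}",
    judge=lambda goal, history: len(history) >= 3,
)
print(done, steps)  # True ['step 1', 'step 2', 'step 3']
```

The max_rounds cap is a design choice added here so that a judge that never accepts cannot keep the loop running indefinitely.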
Worktrees and Parallelism
Git worktrees provide isolated working directories for parallel agents, preventing file‑write conflicts that would occur if two agents edited the same file simultaneously.
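One way to give each agent its own directory is to script git worktree add. The sketch below assumes a naming scheme (a sibling directory named after the repository and agent, plus an agent/<name> branch) that is purely illustrative; only the git worktree add -b command itself is standard Git.

```python
import subprocess
from pathlib import Path
from typing import List


def worktree_add_command(repo: Path, agent: str) -> List[str]:
    """Build a `git worktree add` command giving `agent` its own directory and branch."""
    path = repo.parent / f"{repo.name}-{agent}"  # e.g. ../myrepo-explorer (assumed layout)
    return [
        "git", "-C", str(repo),
        "worktree", "add",
        "-b", f"agent/{agent}",  # new branch for this agent (assumed naming)
        str(path),
    ]


def create_worktree(repo: Path, agent: str) -> None:
    """Run the command; raises CalledProcessError if git reports a failure."""
    subprocess.run(worktree_add_command(repo, agent), check=True)


# Print the command rather than running it, so the sketch has no side effects.
cmd = worktree_add_command(Path("/tmp/myrepo"), "explorer")
print(" ".join(cmd))
```

Because each worktree has its own checkout and branch, two agents can edit a file with the same name at the same time without overwriting each other; their changes only meet when the branches are merged.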
Sub‑agents Structure
Codex defines sub‑agents in TOML files, assigning high‑capacity models to reviewers and fast, read‑only models to explorers. Claude Code uses a .claude/agents/ directory. Each sub‑agent incurs its own model and tool‑call overhead, so they are employed only when the added insight justifies the token cost.
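The role split can be sketched in Python without relying on either tool's actual configuration schema, which is not reproduced here. SubAgentSpec, the tier labels, and the tool names are hypothetical, and treating the reviewer as read-only is an assumption beyond what the article states.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SubAgentSpec:
    name: str
    model: str       # placeholder tier label, not a real model ID
    read_only: bool  # read-only agents never get write tools


ROLES: Dict[str, SubAgentSpec] = {
    "explorer": SubAgentSpec("explorer", model="fast", read_only=True),
    "reviewer": SubAgentSpec("reviewer", model="high-capacity", read_only=True),
    "implementer": SubAgentSpec("implementer", model="high-capacity", read_only=False),
}


def allowed_tools(spec: SubAgentSpec) -> List[str]:
    """Return the (hypothetical) tool names a sub-agent may call."""
    tools = ["read_file", "search"]
    if not spec.read_only:
        tools += ["write_file", "run_tests"]
    return tools


for role in ROLES.values():
    print(role.name, role.model, allowed_tools(role))
```

Keeping the fast, read-only role cheap is what makes it affordable to run often, while the high-capacity roles are reserved for steps where a wrong answer is costly.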
State Persistence
The state file is the backbone of a long‑running loop. Because models forget between runs, persistent storage (a Markdown file or Linear board) is required to remember what has been tried, what succeeded, and what remains open.
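A Markdown state file can be as simple as a checklist. The sketch below assumes a format of one "- [ ] item" or "- [x] item" line per task; that format and the helper names load_state and save_state are illustrative choices, not something the article specifies.

```python
import re
import tempfile
from pathlib import Path
from typing import Dict

_ITEM = re.compile(r"^- \[( |x)\] (.+)$")


def load_state(path: Path) -> Dict[str, bool]:
    """Read checklist lines into {item: done}; a missing file means empty state."""
    state: Dict[str, bool] = {}
    if not path.exists():
        return state
    for line in path.read_text(encoding="utf-8").splitlines():
        match = _ITEM.match(line.strip())
        if match:
            state[match.group(2)] = match.group(1) == "x"
    return state


def save_state(path: Path, state: Dict[str, bool]) -> None:
    """Write the state back as a Markdown checklist, one item per line."""
    lines = [f"- [{'x' if done else ' '}] {item}" for item, done in state.items()]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


# Simulate two runs of the loop sharing one file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    state_path = Path(tmp) / "STATE.md"
    save_state(state_path, {"fix flaky test": False, "update changelog": False})
    state = load_state(state_path)   # the next run starts by reading the file
    state["fix flaky test"] = True   # record what this run finished
    save_state(state_path, state)
    reloaded = load_state(state_path)
    print(reloaded)
```

Since the file is plain Markdown, an engineer can read or edit it by hand between runs, which also helps keep a human view of what the loop has done.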
Limitations and Risks
Three major concerns arise as loops grow stronger:
Comprehension debt – rapid, autonomous code generation can widen the gap between the repository’s state and the engineer’s mental model.
Cognitive surrender – relying entirely on the loop can erode personal judgment, leading to blind acceptance of outputs.
Token cost – each sub‑agent and each /goal evaluation consumes tokens; a daily loop with several sub‑agents can use 5–10× the tokens of manual prompting, and runaway loops can consume unbounded tokens.
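A hard token budget is one simple guard against the runaway case. The sketch below is a minimal illustration: TokenBudget, BudgetExceeded, and run_loop are hypothetical names, and the token figures are made-up numbers rather than measurements.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push usage past the limit."""


class TokenBudget:
    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def charge(self, tokens: int) -> None:
        """Record token usage, refusing any charge that would exceed the limit."""
        if self.used + tokens > self.limit:
            raise BudgetExceeded(
                f"would use {self.used + tokens} of {self.limit} tokens"
            )
        self.used += tokens


def run_loop(budget: TokenBudget, cost_per_round: int, max_rounds: int) -> int:
    """Run rounds until the cap or the budget stops the loop; return rounds completed."""
    rounds = 0
    try:
        for _ in range(max_rounds):
            budget.charge(cost_per_round)  # charge before doing the round's work
            rounds += 1
    except BudgetExceeded:
        pass  # stop cleanly; a real loop would also notify a human
    return rounds


budget = TokenBudget(limit=100_000)
completed = run_loop(budget, cost_per_round=30_000, max_rounds=10)
print(completed, budget.used)  # 3 90000
```

Charging before each round, rather than after, means the loop stops before spending tokens it does not have.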
Reddit commenters liken loops to “cron jobs with a hat,” emphasizing that the technology itself is not mysterious; the challenge lies in preventing failures and balancing cost versus benefit.
Conclusion
Osmani’s takeaway is not to abandon prompting entirely but to find a balance: design loops that automate repetitive work while retaining human oversight, avoid unnecessary token burn, and mitigate comprehension debt.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
SuanNi
A community for AI developers that aggregates large-model development services, models, and compute power.
