From Task Cycles to a Maintainable, Observable, Replayable Agent Loop

This article explains how Loop Engineering turns multi‑round Agent execution into a maintainable, observable, and replayable closed loop. It defines six core components, reuses traditional development patterns, walks through a CI‑failure triage demo, and highlights architectural and practical pitfalls.

Architect

TL;DR

A Loop behaves like a task runtime rather than a longer prompt.

A usable Loop must expose State, Intent, Action, Verify, Commit, and Trace.

State machines, job runners, CI pipelines, and front‑end state flows clarify Loop design.

The first version works well as a small CI‑failure triage loop: read more, write less, keep evidence explicit.

Six components of a Loop

State: stores the current facts of the task. Common mistake: relying only on chat‑context memory.

Intent: decides the next step. Common mistake: letting the model make changes while it is still thinking.

Action: accesses external systems. Common mistake: granting overly broad tool permissions or omitting a whitelist.

Verify: checks the credibility of a result. Common mistake: letting the executor approve its own output.

Commit: writes the final result to real systems. Common mistake: mixing candidate and final results.

Trace: records what happened in each round. Common mistake: keeping only the final summary.

Figure 1: Loop’s six components


Applying traditional development experience

Job runner – explicit state

Async jobs usually track more than running and done. A minimal state machine can be:

pending -> running -> retrying -> succeeded
running -> failed -> retrying
running -> blocked -> needs_human
retrying -> failed_permanently

Agent Loops need a similar state machine; otherwise the system only knows that the Agent is still running, not what it has accomplished.
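
One way to make these transitions checkable is a lookup table of allowed edges. The sketch below reads the four transition lines above literally; the `JobState` type, the `TRANSITIONS` table, and the `transition` helper are illustrative names and not part of any particular job runner.

```typescript
// Job states taken from the transition lines above.
type JobState =
  | "pending" | "running" | "retrying" | "succeeded"
  | "failed" | "blocked" | "needs_human" | "failed_permanently";

// Allowed edges, read literally from the four transition lines.
const TRANSITIONS: Record<JobState, readonly JobState[]> = {
  pending: ["running"],
  running: ["retrying", "failed", "blocked"],
  retrying: ["succeeded", "failed_permanently"],
  failed: ["retrying"],
  blocked: ["needs_human"],
  succeeded: [],
  needs_human: [],
  failed_permanently: [],
};

// Hypothetical helper: returns the next state, or throws on an edge that is not listed.
function transition(from: JobState, to: JobState): JobState {
  if (TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`Illegal transition: ${from} -> ${to}`);
  }
  return to;
}
```

With a table like this, an illegal jump such as `pending -> succeeded` fails loudly instead of silently corrupting the run's status.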

CI pipeline – artifact per step

Each CI step leaves an artifact (commit, job, log, report). A Loop must record evidence for every round: which files were read, which commands ran, what errors occurred, and why a decision was made.
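
A per-round artifact could be as simple as a typed record appended to a log. The following is a minimal sketch; `RoundTrace`, `TraceLog`, and every field name are assumptions chosen to match the four kinds of evidence listed above.

```typescript
// Hypothetical shape of one round's artifact; all field names are assumptions.
interface RoundTrace {
  round: number;
  filesRead: string[];
  commandsRun: string[];
  errors: string[];
  decision: string;
  rationale: string;
}

class TraceLog {
  private readonly rounds: RoundTrace[] = [];

  // Appends one round (copying its arrays) and returns its 1-based round number.
  record(entry: Omit<RoundTrace, "round">): number {
    const round = this.rounds.length + 1;
    this.rounds.push({
      round,
      filesRead: entry.filesRead.slice(),
      commandsRun: entry.commandsRun.slice(),
      errors: entry.errors.slice(),
      decision: entry.decision,
      rationale: entry.rationale,
    });
    return round;
  }

  // Round numbers whose decision has no rationale or no recorded inputs behind it.
  unsupported(): number[] {
    return this.rounds
      .filter(r => r.rationale.trim() === "" || (r.filesRead.length === 0 && r.commandsRun.length === 0))
      .map(r => r.round);
  }
}
```

A check like `unsupported()` gives reviewers a quick list of rounds where the Agent decided something without leaving evidence.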

Front‑end state flow – separate candidate actions from commits

In a complex UI, the view is derived from state, and actions are batched and then submitted. The same pattern applies here: let the model generate candidate actions (plan, diff, comment draft) and let a controlled executor perform side‑effects such as writing files or opening PRs.
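
The split between candidates and commits can be sketched with plain data types. In this example the `Candidate` union, the `SideEffects` interface, and the `commit` function are hypothetical; a real executor would call a version-control or CI API behind `SideEffects`.

```typescript
// Candidate actions are plain data: the model proposes, it does not execute.
type Candidate =
  | { kind: "comment_draft"; body: string }
  | { kind: "diff"; path: string; patch: string };

// Hypothetical side-effect interface, kept behind the executor.
interface SideEffects {
  postComment(body: string): void;
  applyPatch(path: string, patch: string): void;
}

// The controlled executor is the only code that touches SideEffects.
// It applies approved candidates and returns how many were applied.
function commit(
  candidates: Candidate[],
  approved: (candidate: Candidate) => boolean,
  fx: SideEffects,
): number {
  let applied = 0;
  for (const candidate of candidates) {
    if (!approved(candidate)) continue;
    if (candidate.kind === "comment_draft") {
      fx.postComment(candidate.body);
    } else {
      fx.applyPatch(candidate.path, candidate.patch);
    }
    applied++;
  }
  return applied;
}
```

Because candidates are inert values, they can be logged, diffed, and reviewed before anything reaches a real system.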

Minimal demo: CI failure triage Loop

The demo reads a failing CI job, its logs, the related PR, and the most recent commit, classifies the failure, and produces an evidence‑backed suggestion. If evidence is insufficient or the failure is permission‑related, it hands off to a human. The loop starts from an explicit state record:

{
  "runId": "ci-triage-20260626-001",
  "goal": "triage failing CI jobs",
  "phase": "collecting",
  "attempt": 0,
  "maxAttempts": 2,
  "evidence": [],
  "classification": null,
  "proposal": null,
  "handoffReason": null
}

The Phase type lists every phase the loop can be in:

type Phase =
  | "collecting"
  | "classifying"
  | "drafting"
  | "verifying"
  | "ready_to_commit"
  | "done"
  | "needs_human";
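
The demo code refers to `LoopState`, `Event`, `Intent`, and `Env` without defining them. One plausible set of definitions, inferred from how the reducer and main loop use them, is sketched below. The `Evidence` shape and the `string` types for classification and proposal are assumptions, and the event union is named `LoopEvent` here only to avoid clashing with the DOM's global `Event` type.

```typescript
type Phase =
  | "collecting" | "classifying" | "drafting" | "verifying"
  | "ready_to_commit" | "done" | "needs_human";

// Assumed shape of one evidence item; the article does not specify it.
interface Evidence {
  source: string;
  excerpt: string;
}

interface LoopState {
  runId: string;
  goal: string;
  phase: Phase;
  attempt: number;
  maxAttempts: number;
  evidence: Evidence[];
  classification: string | null; // "permission_failure" is the only value the article names
  proposal: string | null;
  handoffReason: string | null;
}

// Corresponds to the article's `Event`.
type LoopEvent =
  | { type: "EVIDENCE_COLLECTED"; evidence: Evidence[] }
  | { type: "CLASSIFIED"; classification: string }
  | { type: "PROPOSAL_DRAFTED"; proposal: string }
  | { type: "VERIFIED" }
  | { type: "VERIFICATION_FAILED"; reason: string }
  | { type: "COMMITTED" }
  | { type: "HANDOFF"; reason: string };

interface Intent {
  type: "COLLECT_EVIDENCE" | "CLASSIFY" | "DRAFT_PROPOSAL" | "VERIFY" | "COMMIT" | "STOP";
}

interface Env {
  store: { append(state: LoopState): Promise<void> };
  effects: { perform(intent: Intent, state: LoopState): Promise<LoopEvent> };
}

// Hypothetical factory that mirrors the JSON state record shown earlier.
function initialState(runId: string, goal: string): LoopState {
  return {
    runId,
    goal,
    phase: "collecting",
    attempt: 0,
    maxAttempts: 2,
    evidence: [],
    classification: null,
    proposal: null,
    handoffReason: null,
  };
}
```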

Reducer drives state transitions based on events:

function reduce(state: LoopState, event: Event): LoopState {
  switch (event.type) {
    case "EVIDENCE_COLLECTED":
      return { ...state, phase: "classifying", evidence: [...state.evidence, ...event.evidence] };
    case "CLASSIFIED":
      if (event.classification === "permission_failure") {
        return { ...state, phase: "needs_human", classification: event.classification, handoffReason: "Permission failure requires human review" };
      }
      return { ...state, phase: "drafting", classification: event.classification };
    case "PROPOSAL_DRAFTED":
      return { ...state, phase: "verifying", proposal: event.proposal };
    case "VERIFIED":
      return { ...state, phase: "ready_to_commit" };
    case "VERIFICATION_FAILED":
      if (state.attempt + 1 >= state.maxAttempts) {
        return { ...state, phase: "needs_human", handoffReason: event.reason };
      }
      return { ...state, phase: "collecting", attempt: state.attempt + 1 };
    case "COMMITTED":
      return { ...state, phase: "done" };
    case "HANDOFF":
      return { ...state, phase: "needs_human", handoffReason: event.reason };
    default:
      return state;
  }
}
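
Because the reducer is a pure function of state and event, a recorded event list can be folded back through it to reconstruct every intermediate state. The sketch below shows that idea with a generic `replay` helper; `DemoState`, `DemoEvent`, and `demoReduce` are cut-down stand-ins for the article's types, used only to keep the example self-contained.

```typescript
// Folds a recorded event list through any pure reducer and returns every state along the way.
function replay<S, E>(initial: S, events: readonly E[], reduce: (state: S, event: E) => S): S[] {
  const states: S[] = [initial];
  for (const event of events) {
    states.push(reduce(states[states.length - 1], event));
  }
  return states;
}

// Cut-down stand-ins for the article's state and events, for demonstration only.
type DemoState = { phase: string; attempt: number };
type DemoEvent = { type: "VERIFIED" } | { type: "VERIFICATION_FAILED" };

function demoReduce(state: DemoState, event: DemoEvent): DemoState {
  switch (event.type) {
    case "VERIFIED":
      return { ...state, phase: "ready_to_commit" };
    case "VERIFICATION_FAILED":
      return { ...state, phase: "collecting", attempt: state.attempt + 1 };
    default:
      return state;
  }
}

const startState: DemoState = { phase: "verifying", attempt: 0 };
const recordedEvents: DemoEvent[] = [{ type: "VERIFICATION_FAILED" }, { type: "VERIFIED" }];
const replayedStates = replay(startState, recordedEvents, demoReduce);
```

Replaying the same events always yields the same states, which is what makes a stored run debuggable after the fact.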

Intent selection maps the current phase to the next intent:

function selectIntent(state: LoopState): Intent {
  switch (state.phase) {
    case "collecting": return { type: "COLLECT_EVIDENCE" };
    case "classifying": return { type: "CLASSIFY" };
    case "drafting": return { type: "DRAFT_PROPOSAL" };
    case "verifying": return { type: "VERIFY" };
    case "ready_to_commit": return { type: "COMMIT" };
    case "done":
    case "needs_human":
      return { type: "STOP" };
  }
}

The main loop runs a bounded number of steps. On each step it stores the current state, selects an intent, performs the effect, and reduces the resulting event back into state:

async function runLoop(env: Env, initialState: LoopState): Promise<LoopState> {
  const MAX_STEPS = 12; // hard bound so a misbehaving loop cannot run forever
  let state = initialState;
  for (let step = 0; step < MAX_STEPS; step++) {
    await env.store.append(state); // persist every state, including terminal ones
    const intent = selectIntent(state);
    if (intent.type === "STOP") return state;
    const event = await env.effects.perform(intent, state);
    state = reduce(state, event);
  }
  // Step budget exhausted: hand off to a human and persist that final state too.
  state = reduce(state, { type: "HANDOFF", reason: "Loop exceeded step limit" });
  await env.store.append(state);
  return state;
}

The effects.perform function is the only place that accesses external tools; it can read CI logs, classify failures, generate suggestions, and write comments, but it never bypasses the state machine to modify final results directly.

Side‑effect management (front‑end view)

Read files, logs, issues: allowed by default; the source must be recorded.

Generate plan/diff/comment draft: candidate results only, no direct commit.

Write docs, open candidate PR: low‑risk writes; require a reviewable diff.

Change permissions, deploy, delete data: human confirmation required.

Retry external API calls: apply rate limits, back‑off, and a maximum attempt count.
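
The five rules above can be encoded as a single gate that every action request passes through. This is a sketch under stated assumptions: the `ActionKind` names, the `gate` function, the retry limit of 3, and the back-off parameters are all illustrative values, not figures from the article.

```typescript
type ActionKind = "read" | "draft" | "low_risk_write" | "high_risk_write" | "external_retry";

interface ActionRequest {
  kind: ActionKind;
  source?: string;         // where a read came from
  diff?: string;           // reviewable diff for a low-risk write
  humanApproved?: boolean; // explicit confirmation for a high-risk write
  attempt?: number;        // retries already made
}

interface Decision {
  allowed: boolean;
  reason: string;
}

const MAX_RETRY_ATTEMPTS = 3; // assumed limit

function gate(req: ActionRequest): Decision {
  switch (req.kind) {
    case "read":
      return req.source
        ? { allowed: true, reason: "read with recorded source" }
        : { allowed: false, reason: "read is missing its source" };
    case "draft":
      return { allowed: true, reason: "candidate only, nothing is committed" };
    case "low_risk_write":
      return req.diff
        ? { allowed: true, reason: "write has a reviewable diff" }
        : { allowed: false, reason: "write is missing a reviewable diff" };
    case "high_risk_write":
      return req.humanApproved === true
        ? { allowed: true, reason: "human confirmed" }
        : { allowed: false, reason: "human confirmation required" };
    case "external_retry":
      return (req.attempt || 0) < MAX_RETRY_ATTEMPTS
        ? { allowed: true, reason: "within retry budget" }
        : { allowed: false, reason: "retry budget exhausted" };
  }
}

// Exponential back-off with a cap; the base and cap are assumed defaults.
function backoffMs(attempt: number, baseMs = 500, capMs = 30000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}
```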

Architecture: separate control plane and execution plane

The control plane should be stable, predictable, and auditable; the execution plane can be flexible and plug in different tools per task. This mirrors platform designs where a scheduler does not perform business logic directly.
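
A small sketch can make the split concrete: the control plane only routes and audits, while executors registered per intent type do the actual work. `ControlPlane`, `Executor`, `PlaneIntent`, and `PlaneEvent` are hypothetical names, and the executors are synchronous here purely to keep the example short.

```typescript
type PlaneIntent = { type: string; payload?: unknown };
type PlaneEvent = { type: string; detail?: string };

// Execution-plane plug-in: flexible, task-specific, free to call tools.
type Executor = (intent: PlaneIntent) => PlaneEvent;

// Control plane: fixed routing and auditing, no business logic of its own.
class ControlPlane {
  private readonly executors = new Map<string, Executor>();
  readonly audit: string[] = [];

  register(intentType: string, executor: Executor): void {
    this.executors.set(intentType, executor);
  }

  dispatch(intent: PlaneIntent): PlaneEvent {
    const executor = this.executors.get(intent.type);
    if (!executor) {
      this.audit.push(`rejected ${intent.type}: no executor registered`);
      return { type: "HANDOFF", detail: `No executor for ${intent.type}` };
    }
    this.audit.push(`dispatched ${intent.type}`);
    try {
      return executor(intent);
    } catch (err) {
      // A failing plug-in becomes a handoff event instead of crashing the loop.
      this.audit.push(`failed ${intent.type}`);
      return { type: "HANDOFF", detail: String(err) };
    }
  }
}
```

Swapping a tool then means registering a different executor, while routing and the audit trail stay unchanged.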

Figure: Control plane and execution plane

Choosing the first Loop scenario

CI failure triage: clear input, log references, easy human review.

Document command validation: readable files, runnable commands, clear failure evidence.

PR risk pre‑check: explicit diff, can output a candidate risk list.

Dependency upgrade impact scan: limited to directories/packages, suitable for report generation.

Changelog candidate generation: read‑heavy, write‑light, result editable by humans.

All share the principle: read more, write less, keep evidence clear, and hand off on failure.

Common pitfalls

Using chat history as the sole state store

Chat context gets compressed over long tasks and loses details. Store structured state in Markdown files, issues, databases, or event logs to enable replay and reconciliation.
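
As a minimal illustration of state kept outside the chat context, the sketch below stores serialized snapshots in an append-only list. `SnapshotStore` is a hypothetical in-memory stand-in; a real deployment would back it with a file, database, or event log.

```typescript
// Append-only snapshot store; an in-memory stand-in for durable storage.
class SnapshotStore<S> {
  private readonly lines: string[] = [];

  // Serializes on append so later mutation of the live object cannot alter history.
  append(state: S): void {
    this.lines.push(JSON.stringify(state));
  }

  // Most recent snapshot, or undefined if nothing has been stored yet.
  latest(): S | undefined {
    if (this.lines.length === 0) return undefined;
    return JSON.parse(this.lines[this.lines.length - 1]) as S;
  }

  // Every snapshot in the order it was appended.
  history(): S[] {
    return this.lines.map(line => JSON.parse(line) as S);
  }
}
```

Each snapshot is an independent copy, so the history can be compared against the live state during reconciliation.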

Letting the Agent decide all permissions

The model can suggest actions, but permission boundaries must be encoded in the system (read‑only vs write‑allowed paths, API‑only calls, actions requiring human approval).
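
A system-side boundary check might look like the following sketch. The `PermissionPolicy` shape, the action names `read` and `write`, and the `authorize` function are assumptions for illustration; the point is that the verdict comes from code and configuration, not from the model.

```typescript
interface PermissionPolicy {
  writablePrefixes: string[]; // relative path prefixes where writes are allowed
  approvalRequired: string[]; // action names that always need a human
}

type Verdict = "allow" | "needs_human" | "deny";

function authorize(policy: PermissionPolicy, action: string, path?: string): Verdict {
  if (policy.approvalRequired.indexOf(action) !== -1) return "needs_human";
  if (action === "read") return "allow";
  if (action === "write" && path !== undefined) {
    const target: string = path;
    // Reject absolute paths and ".." segments so a prefix match cannot be escaped.
    const escapes = target.startsWith("/") || target.split("/").indexOf("..") !== -1;
    const inScope = policy.writablePrefixes.some(prefix => target.startsWith(prefix));
    return !escapes && inScope ? "allow" : "deny";
  }
  return "deny"; // anything not explicitly permitted is refused
}
```

Defaulting to `deny` means a new or misspelled action name is refused rather than silently allowed.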

Lack of independent verification

Code tasks have tests and builds; document tasks need link checks; operational tasks need source citations and approval flows. Without external verification the Loop may treat a claim of completion as actual completion.
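
Independent verification can be sketched as a list of checks that run outside the executor and return concrete failure reasons. The `Proposal` shape, the `verify` function, and the `citesKnownEvidence` check are hypothetical; they show one possible check for the CI triage case, where a proposal should only cite evidence the loop actually collected.

```typescript
interface Proposal {
  summary: string;
  citedEvidenceIds: string[];
}

interface VerifyResult {
  verified: boolean;
  failures: string[];
}

// A check returns null on success or a human-readable failure reason.
type Check = (proposal: Proposal) => string | null;

function verify(proposal: Proposal, checks: Check[]): VerifyResult {
  const failures: string[] = [];
  for (const check of checks) {
    const failure = check(proposal);
    if (failure !== null) failures.push(failure);
  }
  return { verified: failures.length === 0, failures };
}

// Example check: every cited id must exist in the evidence that was collected.
function citesKnownEvidence(knownIds: string[]): Check {
  return proposal => {
    if (proposal.citedEvidenceIds.length === 0) return "proposal cites no evidence";
    const unknown = proposal.citedEvidenceIds.filter(id => knownIds.indexOf(id) === -1);
    return unknown.length > 0 ? `unknown evidence ids: ${unknown.join(", ")}` : null;
  };
}
```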

Ignoring the process trace

Trace data reveals mis‑classifications, flaky tools, and overly permissive rules, guiding future improvements to skills, memory, tools, and prompts.
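
As one small example of mining a trace, the sketch below computes a failure rate per tool, which is a simple way to spot flaky tools. The `ToolCall` record and the `failureRates` function are assumptions about how trace entries might be shaped.

```typescript
interface ToolCall {
  tool: string;
  ok: boolean;
}

// Failure rate per tool, as a fraction between 0 and 1.
function failureRates(calls: ToolCall[]): Map<string, number> {
  const totals = new Map<string, { failed: number; total: number }>();
  for (const call of calls) {
    const tally = totals.get(call.tool) || { failed: 0, total: 0 };
    tally.total += 1;
    if (!call.ok) tally.failed += 1;
    totals.set(call.tool, tally);
  }
  const rates = new Map<string, number>();
  totals.forEach((tally, tool) => rates.set(tool, tally.failed / tally.total));
  return rates;
}
```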

Relation to previous articles

Earlier pieces introduced Harness (the shell that runs the Agent) and Environment (the world the Agent sees). Loop Engineering sits inside this environment as the task runtime that structures state, actions, verification, and commits.

Conclusion

Loop Engineering adds an Agent to familiar loop concepts. Once it has an explicit skeleton of state, intent, action, verification, commit, and trace, a Loop becomes maintainable, observable, and replayable, and it solves many real‑world problems even before any advanced AI tricks are added.
