Inside Claude Code’s Query Loop: From a Simple While Loop to an Industrial‑Grade Agent Engine

This article dissects Claude Code’s 1729‑line queryLoop, explaining its four‑layer call chain (ask → QueryEngine → query → queryLoop), the async‑generator core that streams model output, how tool calls are handled in parallel, the explicit state object, and the many error‑recovery paths that make the loop production‑ready.

IT Services Circle

Jun 12, 2026

When you type a command into Claude Code and hit Enter, a sophisticated agent engine runs for about 30 seconds before returning a solution. The article walks through every step of that engine, showing that the "while (true) { call model; maybe run tool; }" skeleton hides a massive amount of production‑grade logic.

1. What actually happens after you press Enter?

The model often replies with a request to read a file or run a test before it can answer. Claude Code follows that request, reads the file, runs the test, and only after several rounds does the model finally say it has fixed the bug. This repeated "model → tool → model" cycle is called the Query Loop, the heart of the agent.

2. From ask to queryLoop

The call chain is split into four layers:

ask – SDK entry point (e.g., await ask({prompt}))

QueryEngine.submitMessage – manages session state

query – an async generator that streams results

queryLoop – the core while(true) loop

Each layer yields to the next with yield*, allowing events produced deep inside queryLoop to bubble up to the outermost caller.
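The bubbling behavior can be illustrated with a minimal sketch. The event type and the function bodies below are hypothetical and only reuse the layer names from the article; they show how yield* forwards inner events to the outer caller and hands the inner return value back to the delegating layer.

```typescript
// Hypothetical event shape; the article does not specify the real one.
type LoopEvent =
  | { type: "text"; text: string }
  | { type: "tool_use"; name: string };

type LoopResult = { reason: string };

// Innermost layer: emits events, then returns a terminal reason.
async function* queryLoop(): AsyncGenerator<LoopEvent, LoopResult> {
  yield { type: "text", text: "Reading file..." };
  yield { type: "tool_use", name: "Read" };
  return { reason: "completed" };
}

// Middle layer: yield* re-emits every inner event unchanged and
// evaluates to the inner generator's return value.
async function* query(): AsyncGenerator<LoopEvent, LoopResult> {
  const result = yield* queryLoop();
  return result;
}

// Outer caller: observes events produced two layers down, in order.
async function collect(): Promise<string[]> {
  const seen: string[] = [];
  for await (const event of query()) {
    seen.push(event.type);
  }
  return seen;
}
```

Note that for await consumes only the yielded events; a caller that needs the final return value has to drive the generator with next() or delegate with yield* as query does here.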

3. The five‑step skeleton of queryLoop

async function* queryLoop(params) {
  // Cross-turn state; the initial messages and remaining fields are elided
  let state = {messages: [/* ... */], turnCount: 1 /* , ... */}
  while (true) {
    // 1️⃣ Prepare messages (compress if needed)
    const messagesForQuery = maybeCompact(state.messages)
    // 2️⃣ Stream model output, collect tool_use blocks
    const assistantMessages = []        // simplified: every chunk is kept as part of the assistant turn
    const toolUseBlocks = []
    let needsFollowUp = false
    for await (const chunk of callModel(messagesForQuery)) {
      yield chunk                       // stream output to the caller
      assistantMessages.push(chunk)     // remember it for the next round's history
      if (chunk.type === 'tool_use') {  // collect tool request
        toolUseBlocks.push(chunk)
        needsFollowUp = true
      }
    }
    // 3️⃣ Decide whether to continue
    if (!needsFollowUp) return {reason: 'completed'}
    // 4️⃣ Run all tools (parallel for read‑only, serial for mutating)
    const toolResults = await runTools(toolUseBlocks)
    // 5️⃣ Append results and start next round
    state = {...state, messages: [...messagesForQuery, ...assistantMessages, ...toolResults], turnCount: state.turnCount + 1}
  }
}

In short, each iteration prepares the messages, streams the model output while collecting tool requests, decides whether to continue, executes the tools, and updates the state for the next round.

4. Decision logic – when does the loop stop?

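On the normal path, the deciding condition is whether the model emitted any tool_use blocks. If none appear, the loop returns with reason: 'completed'. If at least one is present, needsFollowUp becomes true, the tools run, and the loop continues. The abnormal exits (turn limits, user aborts, unrecoverable errors) are covered in section 7.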

5. Parallel tool execution

Claude Code launches each tool as soon as its tool_use block is seen, using a StreamingToolExecutor. Read‑only tools (e.g., Read, Grep) run in parallel, while mutating tools (e.g., Edit, Write) are forced to run serially. If a tool’s metadata omits the read‑only flag, the framework defaults to fail‑closed (treat as mutating) to avoid unsafe concurrency.
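The scheduling rule can be sketched as follows. This is a simplified, batch-based illustration and not the StreamingToolExecutor itself, which starts tools while the model is still streaming. The ToolCall and ToolMeta shapes, the registry, and the isReadOnly flag name are all assumptions made for the example.

```typescript
// Hypothetical shapes; the article does not give the real metadata format.
type ToolCall = { id: string; name: string };
type ToolMeta = { isReadOnly?: boolean };

// Assumed registry, for illustration only.
const registry: Record<string, ToolMeta> = {
  Read: { isReadOnly: true },
  Grep: { isReadOnly: true },
  Edit: { isReadOnly: false },
  CustomTool: {}, // read-only flag omitted
};

// Fail-closed: only an explicit `true` allows concurrency.
// A missing flag or an unknown tool is treated as mutating.
function isConcurrencySafe(call: ToolCall): boolean {
  return registry[call.name]?.isReadOnly === true;
}

// Consecutive read-only calls share one batch; each mutating call
// gets a batch of its own, so it never overlaps with anything else.
function partition(calls: ToolCall[]): ToolCall[][] {
  const batches: ToolCall[][] = [];
  let current: ToolCall[] = [];
  for (const call of calls) {
    if (isConcurrencySafe(call)) {
      current.push(call);
    } else {
      if (current.length > 0) batches.push(current);
      batches.push([call]);
      current = [];
    }
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Batches run in order; calls inside a batch run concurrently.
async function runTools(
  calls: ToolCall[],
  run: (call: ToolCall) => Promise<string>,
): Promise<string[]> {
  const results: string[] = [];
  for (const batch of partition(calls)) {
    results.push(...(await Promise.all(batch.map(run))));
  }
  return results;
}
```

Preserving the original order of the calls while batching keeps a later Read from running before an earlier Edit has finished, which is the hazard the serial rule is meant to prevent.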

6. Explicit state object

All cross‑turn information lives in a State object rather than hidden closure variables. Example fields include:

type State = {
  messages: Message[];          // accumulated conversation
  turnCount: number;            // current iteration
  maxOutputTokensRecoveryCount: number; // how many times output truncation was recovered
  hasAttemptedReactiveCompact: boolean; // whether this round already tried compression
}

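Because every counter and flag is visible in one place, the loop's behavior is easier to trace, and guards such as hasAttemptedReactiveCompact keep a recovery path from repeating indefinitely.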
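As a hedged illustration of how such a flag can bound a recovery path, the sketch below models one transition as a pure function over State. The article only names the field; the compact helper, the Step type, and the way the flag is consulted here are assumptions.

```typescript
type Message = { role: "user" | "assistant"; content: string };

type State = {
  messages: Message[];
  turnCount: number;
  maxOutputTokensRecoveryCount: number;
  hasAttemptedReactiveCompact: boolean;
};

// Hypothetical outcome of handling an over-long prompt.
type Step =
  | { kind: "retry"; state: State }
  | { kind: "exit"; reason: "prompt_too_long" };

// Hypothetical compaction: keep only the two most recent messages.
function compact(messages: Message[]): Message[] {
  return messages.slice(-2);
}

// Compaction is tried at most once: the flag is set on the first
// attempt, so a second failure exits instead of looping.
function onPromptTooLong(state: State): Step {
  if (state.hasAttemptedReactiveCompact) {
    return { kind: "exit", reason: "prompt_too_long" };
  }
  return {
    kind: "retry",
    state: {
      ...state,
      messages: compact(state.messages),
      hasAttemptedReactiveCompact: true,
    },
  };
}
```

Since the function returns a new State instead of mutating a closure variable, each decision can be logged or replayed from the state value alone.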

7. Robust error‑recovery paths

Claude Code defines more than a dozen exit reason codes, such as:

completed – normal finish

max_turns – user‑specified turn limit

aborted_streaming – user pressed Ctrl+C while model was streaming

aborted_tools – user cancelled a running tool

prompt_too_long – message history exceeds context window

max_output_tokens_recovery – output was truncated and could not be recovered

stop_hook_prevented – custom stop‑hook blocked continuation
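One way to keep such codes manageable is a string-literal union with an exhaustive switch, sketched below. Only the seven codes named in the article are included, and the isUserAbort helper is hypothetical.

```typescript
// The seven codes listed in the article; the real set is larger.
const EXIT_REASONS = [
  "completed",
  "max_turns",
  "aborted_streaming",
  "aborted_tools",
  "prompt_too_long",
  "max_output_tokens_recovery",
  "stop_hook_prevented",
] as const;

type ExitReason = (typeof EXIT_REASONS)[number];

// Hypothetical helper: did the user cancel the run?
function isUserAbort(reason: ExitReason): boolean {
  switch (reason) {
    case "aborted_streaming":
    case "aborted_tools":
      return true;
    case "completed":
    case "max_turns":
    case "prompt_too_long":
    case "max_output_tokens_recovery":
    case "stop_hook_prevented":
      return false;
    default: {
      // Compile-time exhaustiveness check: adding a new code to
      // EXIT_REASONS without handling it here is a type error.
      const unreachable: never = reason;
      return unreachable;
    }
  }
}
```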

Two particularly clever mechanisms are highlighted:

Missing tool results: If a tool_use block never gets a result (network loss, user interrupt, model downgrade), the engine synthesizes a fake tool_result with is_error: true and the original tool_use_id. This satisfies the Anthropic API’s requirement that every tool call be paired with a result, allowing the conversation to continue.
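A minimal sketch of this pairing repair is shown below. The block shapes follow the tool_use and tool_result content blocks of the Anthropic Messages API; the function name and the placeholder error text are assumptions.

```typescript
type ToolUseBlock = {
  type: "tool_use";
  id: string;
  name: string;
  input: unknown;
};

type ToolResultBlock = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// For every tool_use without a matching tool_result, append a
// synthetic error result that carries the original tool_use_id.
function fillMissingResults(
  toolUses: ToolUseBlock[],
  results: ToolResultBlock[],
  message = "Tool execution was interrupted before a result was produced.",
): ToolResultBlock[] {
  const answered = new Set(results.map((r) => r.tool_use_id));
  const synthesized = toolUses
    .filter((t) => !answered.has(t.id))
    .map(
      (t): ToolResultBlock => ({
        type: "tool_result",
        tool_use_id: t.id,
        content: message,
        is_error: true,
      }),
    );
  return [...results, ...synthesized];
}
```

Marking the synthetic result with is_error: true tells the model that the call did not succeed, so it can retry or change approach instead of assuming the tool ran.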

Output truncation: When the model hits the default 8 k token limit, the loop silently raises the limit to 64 k and retries. If truncation persists, the engine injects a nudge message in the next turn (e.g., "Output token limit hit. Resume directly — no apology, no recap…") and retries up to three times before finally exiting with max_output_tokens_recovery.
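The escalation described above can be sketched as a small decision function. The numeric limits are taken as 8,000 and 64,000 for illustration (the article only says 8 k and 64 k), the nudge string is abbreviated from the one quoted above, and the RecoveryState and Action types are hypothetical.

```typescript
const DEFAULT_LIMIT = 8000; // assumed exact value for "8 k"
const RAISED_LIMIT = 64000; // assumed exact value for "64 k"
const MAX_NUDGES = 3;
// Abbreviated from the nudge quoted in the article.
const NUDGE = "Output token limit hit. Resume directly.";

type RecoveryState = {
  maxOutputTokens: number;
  maxOutputTokensRecoveryCount: number;
};

type Action =
  | { kind: "raise_limit"; state: RecoveryState }
  | { kind: "nudge"; message: string; state: RecoveryState }
  | { kind: "exit"; reason: "max_output_tokens_recovery" };

// Called each time a model response ends because of truncation.
function onTruncated(state: RecoveryState): Action {
  // Step 1: raise the limit once, without telling the user.
  if (state.maxOutputTokens < RAISED_LIMIT) {
    return {
      kind: "raise_limit",
      state: { ...state, maxOutputTokens: RAISED_LIMIT },
    };
  }
  // Step 2: ask the model to resume, at most MAX_NUDGES times.
  if (state.maxOutputTokensRecoveryCount < MAX_NUDGES) {
    return {
      kind: "nudge",
      message: NUDGE,
      state: {
        ...state,
        maxOutputTokensRecoveryCount: state.maxOutputTokensRecoveryCount + 1,
      },
    };
  }
  // Step 3: give up with the dedicated exit reason.
  return { kind: "exit", reason: "max_output_tokens_recovery" };
}
```

Under these assumptions, a response that keeps getting truncated goes through one limit increase, three nudges, and then the exit, so the recovery path is bounded by the counter stored in state.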

8. Design philosophy

The article concludes with four takeaways that make Claude Code’s loop production‑ready:

Stream‑while‑working: Async generators expose intermediate events instantly, giving users a responsive feel.

Explicit state management: All counters and flags live in a visible State object.

Engine‑level isolation: The core loop knows nothing about the semantics of individual tools, enabling zero‑intrusion extensions.

Failure‑first recovery: Instead of throwing errors, the engine fabricates safe fallbacks (fake tool results, silent token‑limit upgrades) so the user never sees a crash.

Armed with this knowledge, a candidate can answer interview questions about Claude Code’s query process by describing the four‑layer call chain, the async‑generator loop, parallel tool execution, state handling, and the extensive error‑recovery strategies that keep the agent robust in production.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
