Inside Claude Code’s Query Loop: From a Simple While Loop to an Industrial‑Grade Agent Engine
This article dissects Claude Code’s 1729‑line queryLoop, explaining its four‑layer call chain (ask → QueryEngine → query → queryLoop), the async‑generator core that streams model output, how tool calls are handled in parallel, the explicit state object, and the many error‑recovery paths that make the loop production‑ready.
When you type a command into Claude Code and hit Enter, a sophisticated agent engine runs for about 30 seconds before returning a solution. The article walks through every step of that engine, showing that the "while (true) { call model; maybe run tool; }" skeleton hides a massive amount of production‑grade logic.
1. What actually happens after you press Enter?
The model often replies with a request to read a file or run a test before it can answer. Claude Code follows that request, reads the file, runs the test, and only after several rounds does the model finally say it has fixed the bug. This repeated "model → tool → model" cycle is called the Query Loop, the heart of the agent.
2. From ask to queryLoop
The call chain is split into four layers:
ask – SDK entry point (e.g., await ask({prompt}))
QueryEngine.submitMessage – manages session state
query – an async generator that streams results
queryLoop – the core while(true) loop
Each layer delegates to the next with yield*, so events produced deep inside queryLoop bubble up to the outermost caller.
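The delegation described above can be sketched with nested async generators. This is a minimal illustration, not Claude Code's actual source: the LoopEvent and Terminal types, the collect helper, and the assumption that ask is itself an async generator are all hypothetical.

```typescript
// Hypothetical event and terminal shapes; the real ones are not shown in the article.
type LoopEvent = { type: "text"; text: string } | { type: "tool_use"; name: string };
type Terminal = { reason: string };

// Innermost layer: yields events and returns a terminal value.
async function* queryLoop(prompt: string): AsyncGenerator<LoopEvent, Terminal> {
  yield { type: "text", text: `working on: ${prompt}` };
  yield { type: "tool_use", name: "Read" };
  return { reason: "completed" };
}

// yield* forwards every inner event and evaluates to the inner return value.
async function* query(prompt: string): AsyncGenerator<LoopEvent, Terminal> {
  return yield* queryLoop(prompt);
}

class QueryEngine {
  async *submitMessage(prompt: string): AsyncGenerator<LoopEvent, Terminal> {
    return yield* query(prompt);
  }
}

// Assumed here to be a generator too, so that events reach the SDK caller.
async function* ask(opts: { prompt: string }): AsyncGenerator<LoopEvent, Terminal> {
  return yield* new QueryEngine().submitMessage(opts.prompt);
}

// Outermost caller: drains the events and keeps the terminal value.
async function collect(prompt: string): Promise<{ events: LoopEvent[]; terminal: Terminal }> {
  const events: LoopEvent[] = [];
  const gen = ask({ prompt });
  let step = await gen.next();
  while (!step.done) {
    events.push(step.value);
    step = await gen.next();
  }
  return { events, terminal: step.value };
}
```

Because every layer uses return yield*, the caller of ask sees both events in order and also receives the { reason: 'completed' } value returned three layers down.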
3. The five‑step skeleton of queryLoop
async function* queryLoop(params) {
  let state = {messages: [...], turnCount: 1, ...}
  while (true) {
    // 1️⃣ Prepare messages (compress if needed)
    const messagesForQuery = maybeCompact(state.messages)
    // 2️⃣ Stream model output, collect tool_use blocks
    const assistantMessages = [] // assistant output kept for the next round
    const toolUseBlocks = []
    let needsFollowUp = false
    for await (const chunk of callModel(messagesForQuery)) {
      yield chunk // stream text to user
      assistantMessages.push(chunk)
      if (chunk.type === 'tool_use') { // collect tool request
        toolUseBlocks.push(chunk)
        needsFollowUp = true
      }
    }
    // 3️⃣ Decide whether to continue
    if (!needsFollowUp) return {reason: 'completed'}
    // 4️⃣ Run all tools (parallel for read‑only, serial for mutating)
    const toolResults = await runTools(toolUseBlocks)
    // 5️⃣ Append results and start next round
    state = {...state, messages: [...messagesForQuery, ...assistantMessages, ...toolResults], turnCount: state.turnCount + 1}
  }
}
The loop handles message preparation, streaming, tool collection, decision making, tool execution, and state update.
4. Decision logic – when does the loop stop?
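On the normal path, the deciding condition is whether the model emitted any tool_use blocks. If none appear, the loop returns with reason: 'completed'. If a tool_use is present, needsFollowUp becomes true, the tools run, and the loop continues. The other exits (turn limits, user aborts, unrecoverable errors) are covered in section 7.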
5. Parallel tool execution
Claude Code launches each tool as soon as its tool_use block is seen, using a StreamingToolExecutor. Read‑only tools (e.g., Read, Grep) run in parallel, while mutating tools (e.g., Edit, Write) are forced to run serially. If a tool’s metadata omits the read‑only flag, the framework defaults to fail‑closed (treat as mutating) to avoid unsafe concurrency.
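The scheduling rule can be sketched as a simple batch planner. This is a simplified illustration under stated assumptions: it plans batches after all tool_use blocks are known instead of starting tools mid-stream as the StreamingToolExecutor is described to do, and the Tool interface, the isReadOnly field name, and the planBatches and runTools helpers are hypothetical.

```typescript
// Hypothetical tool shape; `isReadOnly` is an assumed name for the read-only flag.
interface Tool {
  name: string;
  isReadOnly?: boolean; // omitted => treated as mutating (fail-closed)
  run: () => Promise<string>;
}

// Fail-closed: only an explicit `true` permits concurrent execution.
function canRunConcurrently(tool: Tool): boolean {
  return tool.isReadOnly === true;
}

// Consecutive read-only tools share a batch; every other tool gets its own batch.
function planBatches(tools: Tool[]): Tool[][] {
  const batches: Tool[][] = [];
  for (const tool of tools) {
    const last = batches[batches.length - 1];
    if (canRunConcurrently(tool) && last !== undefined && last.every(canRunConcurrently)) {
      last.push(tool);
    } else {
      batches.push([tool]);
    }
  }
  return batches;
}

// Batches run one after another; tools inside a batch run in parallel.
async function runTools(tools: Tool[]): Promise<string[]> {
  const results: string[] = [];
  for (const batch of planBatches(tools)) {
    const batchResults = await Promise.all(batch.map((t) => t.run()));
    results.push(...batchResults);
  }
  return results;
}
```

With this rule, a request sequence of Read, Grep, Edit, a tool with no flag, then Read yields four batches: the first two reads run together, and each remaining tool runs alone. Results keep the order of the original requests.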
6. Explicit state object
All cross‑turn information lives in a State object rather than hidden closure variables. Example fields include:
type State = {
  messages: Message[]; // accumulated conversation
  turnCount: number; // current iteration
  maxOutputTokensRecoveryCount: number; // how many times output truncation was recovered
  hasAttemptedReactiveCompact: boolean; // whether this round already tried compression
}
This makes debugging deterministic and prevents infinite loops caused by hidden flags.
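As an illustration of how an explicit flag can bound retries, the following sketch treats a "prompt too long" failure as a pure state transition. Only the State fields come from the article; the Message shape, the Decision type, the onPromptTooLong function, and the compact callback are hypothetical.

```typescript
// Assumed message shape, for illustration only.
type Message = { role: "user" | "assistant"; content: string };

type State = {
  messages: Message[];
  turnCount: number;
  maxOutputTokensRecoveryCount: number;
  hasAttemptedReactiveCompact: boolean;
};

// Hypothetical outcome of handling a "prompt too long" error.
type Decision =
  | { kind: "retry"; state: State }
  | { kind: "exit"; reason: "prompt_too_long" };

// Pure transition: compress once, then give up instead of looping forever.
function onPromptTooLong(state: State, compact: (m: Message[]) => Message[]): Decision {
  if (state.hasAttemptedReactiveCompact) {
    return { kind: "exit", reason: "prompt_too_long" };
  }
  return {
    kind: "retry",
    state: {
      ...state,
      messages: compact(state.messages),
      hasAttemptedReactiveCompact: true,
    },
  };
}
```

Because the flag travels inside the state value, a second failure in the same round is visible to the transition function and ends the loop rather than triggering another compression attempt.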
7. Robust error‑recovery paths
Claude Code defines more than a dozen exit reason codes, such as:
completed – normal finish
max_turns – user‑specified turn limit
aborted_streaming – user pressed Ctrl+C while model was streaming
aborted_tools – user cancelled a running tool
prompt_too_long – message history exceeds context window
max_output_tokens_recovery – output was truncated and could not be recovered
stop_hook_prevented – custom stop‑hook blocked continuation
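One way a caller might consume these codes is a string-literal union with an exhaustive switch. This sketch covers only the seven reasons named above, not the full set, and the ExitReason type and describeExit function are hypothetical names.

```typescript
// Only the reasons listed in the article; the real set is described as larger.
type ExitReason =
  | "completed"
  | "max_turns"
  | "aborted_streaming"
  | "aborted_tools"
  | "prompt_too_long"
  | "max_output_tokens_recovery"
  | "stop_hook_prevented";

// Exhaustive mapping: the compiler flags any reason left unhandled.
function describeExit(reason: ExitReason): string {
  switch (reason) {
    case "completed":
      return "normal finish";
    case "max_turns":
      return "turn limit reached";
    case "aborted_streaming":
    case "aborted_tools":
      return "cancelled by user";
    case "prompt_too_long":
      return "history exceeds context window";
    case "max_output_tokens_recovery":
      return "output truncated and not recovered";
    case "stop_hook_prevented":
      return "blocked by stop hook";
    default: {
      const unreachable: never = reason;
      return unreachable;
    }
  }
}
```

The never assignment in the default branch is a compile-time check: adding a new reason to the union without handling it makes the code fail to type-check.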
Two particularly clever mechanisms are highlighted:
Missing tool results: If a tool_use block never gets a result (network loss, user interrupt, model downgrade), the engine synthesizes a fake tool_result with is_error: true and the original tool_use_id. This satisfies the Anthropic API’s requirement that every tool call be paired with a result, allowing the conversation to continue.
Output truncation: When the model hits the default 8 k token limit, the loop silently raises the limit to 64 k and retries. If truncation persists, the engine injects a nudge message in the next turn (e.g., "Output token limit hit. Resume directly — no apology, no recap…") and retries up to three times before finally exiting with max_output_tokens_recovery.
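Both mechanisms can be sketched as small pure functions. These are illustrations under stated assumptions, not the engine's code: the function names, the placeholder error text, and the exact numeric limits (8000 and 64000 standing in for "8 k" and "64 k") are hypothetical. The tool_result fields follow the block shape used by the Anthropic Messages API.

```typescript
type ToolUse = { type: "tool_use"; id: string; name: string };
type ToolResult = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// Pair every unanswered tool_use with a synthetic error result.
function fillMissingResults(toolUses: ToolUse[], results: ToolResult[]): ToolResult[] {
  const answered = new Set(results.map((r) => r.tool_use_id));
  const synthetic = toolUses
    .filter((u) => !answered.has(u.id))
    .map((u): ToolResult => ({
      type: "tool_result",
      tool_use_id: u.id,
      content: "Tool execution was interrupted.", // placeholder wording
      is_error: true,
    }));
  return [...results, ...synthetic];
}

// Assumed constants; the article gives only approximate values.
const ESCALATED_MAX_TOKENS = 64000;
const MAX_RECOVERY_ATTEMPTS = 3;

type TruncationAction =
  | { kind: "retry_with_limit"; maxTokens: number }
  | { kind: "nudge"; recoveryCount: number }
  | { kind: "exit"; reason: "max_output_tokens_recovery" };

// Escalate the limit first, then nudge up to three times, then give up.
function onTruncated(currentLimit: number, recoveryCount: number): TruncationAction {
  if (currentLimit < ESCALATED_MAX_TOKENS) {
    return { kind: "retry_with_limit", maxTokens: ESCALATED_MAX_TOKENS };
  }
  if (recoveryCount < MAX_RECOVERY_ATTEMPTS) {
    return { kind: "nudge", recoveryCount: recoveryCount + 1 };
  }
  return { kind: "exit", reason: "max_output_tokens_recovery" };
}
```

In this sketch the recoveryCount argument plays the role of maxOutputTokensRecoveryCount from the State object, which is what keeps the retry budget bounded across turns.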
8. Design philosophy
The article concludes with four takeaways that make Claude Code’s loop production‑ready:
Stream‑while‑working: Async generators expose intermediate events instantly, giving users a responsive feel.
Explicit state management: All counters and flags live in a visible State object.
Engine‑level isolation: The core loop knows nothing about the semantics of individual tools, enabling zero‑intrusion extensions.
Failure‑first recovery: Instead of throwing errors, the engine fabricates safe fallbacks (fake tool results, silent token‑limit upgrades) so the user never sees a crash.
Armed with this knowledge, a candidate can answer interview questions about Claude Code’s query process by describing the four‑layer call chain, the async‑generator loop, parallel tool execution, state handling, and the extensive error‑recovery strategies that keep the agent robust in production.
IT Services Circle
