QueryEngine: One Instance Equals One Session – Full Breakdown of Claude Code’s Session Lifecycle

This article dissects Claude Code’s QueryEngine class: how each instance represents a single conversation thread, how it is configured, how state is managed across turns, the submitMessage workflow, SDK vs REPL modes, persistence mechanisms, four key engineering decisions, and three technical debts.

James' Growth Diary

May 1, 2026

01 Architecture Positioning: One Instance = One Session

Each QueryEngine instance represents a complete dialogue thread, providing an explicit architectural boundary that isolates state between REPL, sub‑agents, and MCP sessions.

export class QueryEngine {
  // Session state (persisted across turns)
  private config: QueryEngineConfig;
  private mutableMessages: Message[]; // accumulated history
  private abortController: AbortController;
  private permissionDenials: SDKPermissionDenial[];
  private totalUsage: NonNullableUsage; // token consumption
  private readFileState: FileStateCache;
  private discoveredSkillNames = new Set<string>();

  // Main entry
  async *submitMessage(prompt: string | ContentBlockParam[], options?: SubmitMessageOptions): AsyncGenerator<QueryEvent> {}
}
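
The isolation boundary can be illustrated with a minimal sketch. MiniEngine and its record method are hypothetical stand-ins, not part of Claude Code; only the idea that each instance owns its own history comes from the class above.

```typescript
// Hypothetical stand-in for QueryEngine: only the per-instance history is modeled.
type Message = { role: "user" | "assistant"; content: string };

class MiniEngine {
  // Each instance owns its own array, so sessions cannot see each other's history.
  private mutableMessages: Message[] = [];

  record(message: Message): void {
    this.mutableMessages.push(message);
  }

  get history(): readonly Message[] {
    return this.mutableMessages;
  }
}

const replSession = new MiniEngine();
const subAgentSession = new MiniEngine();

replSession.record({ role: "user", content: "refactor utils.ts" });
subAgentSession.record({ role: "user", content: "search for TODOs" });
subAgentSession.record({ role: "assistant", content: "found 3" });

console.log(replSession.history.length, subAgentSession.history.length); // 1 2
```

Because nothing is shared at module level in this sketch, discarding an instance discards the whole session.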

02 QueryEngineConfig: 34 Injected Fields

The engine receives its environment via dependency injection, so application-level data is never stored as internal engine state. The listing below shows only a subset of the fields.

export type QueryEngineConfig = {
  cwd: string;                     // working directory
  tools: Tools;                    // available tools
  commands: Command[];             // slash commands
  mcpClients: MCPServerConnection[]; // MCP connections
  canUseTool: CanUseToolFn;        // permission check (injected)
  getAppState: () => AppState;     // read external state
  setAppState: (f: StateUpdater) => void; // write external state
  initialMessages?: Message[];    // restore history on --continue
  readFileCache: FileStateCache;
  customSystemPrompt?: string;
  maxTurns?: number;               // sub‑agent limit
  maxBudgetUsd?: number;           // budget in USD
};

Note: getAppState and setAppState are external injections, not owned by the engine.
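
A minimal sketch of what this injection might look like from the application side. The createAppStore helper, the simplified AppState shape, and the MiniEngine class are assumptions made for illustration; only the getAppState/setAppState accessor pair comes from the config above.

```typescript
// Simplified stand-ins: the real AppState and StateUpdater types are not shown in the article.
type AppState = { verbose: boolean };
type StateUpdater = (prev: AppState) => AppState;

type EngineDeps = {
  getAppState: () => AppState;
  setAppState: (f: StateUpdater) => void;
};

// The application owns the state; engines only receive accessors to it.
function createAppStore(initial: AppState): EngineDeps {
  let state = initial;
  return {
    getAppState: () => state,
    setAppState: (f) => {
      state = f(state);
    },
  };
}

class MiniEngine {
  private deps: EngineDeps;

  constructor(deps: EngineDeps) {
    this.deps = deps;
  }

  enableVerbose(): void {
    this.deps.setAppState((prev) => ({ ...prev, verbose: true }));
  }

  isVerbose(): boolean {
    return this.deps.getAppState().verbose;
  }
}

const store = createAppStore({ verbose: false });
const mainEngine = new MiniEngine(store);
const subAgentEngine = new MiniEngine(store);

mainEngine.enableVerbose();
console.log(subAgentEngine.isVerbose()); // true
```

Both engines observe the same application state without either one owning it, which is what makes handing the same accessors to a sub-agent safe in this sketch.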

03 submitMessage: A Turn’s Complete Journey

The method performs budget checking, processes user input, rebuilds the system prompt each turn, loads memory files, appends to the mutable message list, runs the core query() AsyncGenerator loop, yields events to the REPL, updates usage, and finally persists the transcript.

async *submitMessage(prompt, options): AsyncGenerator<QueryEvent> {
  // 1. Budget guard
  if (this.config.maxTurns && this.getTurnCount() >= this.config.maxTurns) {
    yield { type: 'budget_exceeded', reason: 'max_turns' };
    return;
  }

  // 2. Process user input
  const { messages: userMessages } = await processUserInput(prompt, ...);

  // 3. Rebuild system prompt (no cache)
  const systemPromptParts = await fetchSystemPromptParts({
    cwd: getCwd(),
    tools: this.config.tools,
    commands: this.config.commands,
    mcpClients: this.config.mcpClients,
  });

  // 4. Load memory files
  const memoryPrompt = await loadMemoryPrompt();

  // 5. Append to history
  this.mutableMessages.push(...userMessages);

  // 6. Core query loop
  for await (const event of query({
    messages: this.mutableMessages,
    system: [...systemPromptParts, memoryPrompt],
    tools: this.config.tools,
    abortSignal: this.abortController.signal,
  })) {
    yield event; // streamed to REPL
    if (event.type === 'assistant') {
      this.mutableMessages.push(event.message);
      this.totalUsage = accumulateUsage(this.totalUsage, event.usage);
    }
  }

  // 7. Persist transcript
  recordTranscript(this.mutableMessages);
  flushSessionStorage();
}

Key details:

System prompts are rebuilt every turn, ensuring immediate effect of changes to CLAUDE.md or tool definitions, at the cost of I/O.

mutableMessages accumulates across turns, directly causing token growth in long sessions.

Events are yielded before being added to history, enabling streaming output.
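
The yield-then-append ordering can be observed with a small sketch. fakeQuery and the text_delta event type are hypothetical; only the "yield first, then push to history" pattern is taken from the listing above.

```typescript
type QueryEvent =
  | { type: "text_delta"; text: string }
  | { type: "assistant"; message: string };

// Hypothetical stand-in for the query() loop: partial text first, then the final message.
async function* fakeQuery(): AsyncGenerator<QueryEvent> {
  yield { type: "text_delta", text: "Hel" };
  yield { type: "text_delta", text: "lo" };
  yield { type: "assistant", message: "Hello" };
}

const history: string[] = [];
const order: string[] = [];

async function* submitMessage(): AsyncGenerator<QueryEvent> {
  for await (const event of fakeQuery()) {
    yield event; // the consumer sees the event first...
    if (event.type === "assistant") {
      history.push(event.message); // ...and only then is it committed to history
      order.push("stored");
    }
  }
}

async function consume(): Promise<void> {
  for await (const event of submitMessage()) {
    order.push(`seen:${event.type}`);
  }
}

const finished = consume().then(() => console.log(order));
// logs ["seen:text_delta", "seen:text_delta", "seen:assistant", "stored"]
```

One consequence worth keeping in mind: the code after a yield runs only when the consumer asks for the next event, so a consumer that stops iterating right after the assistant event would leave that message unrecorded in this sketch.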

04 Session State Comparison: Cross‑Turn vs Non‑Persistent

Fields that survive across turns:

mutableMessages: full conversation record (persisted)
totalUsage: cumulative token usage for budgeting (persisted)
readFileState: file-read cache to avoid repeated reads (persisted)
permissionDenials: SDK-rejected operations (persisted)

Fields that are reset or recreated within a session:

discoveredSkillNames: reset on each submitMessage call (single-turn)
abortController: recreated after a user abort (single-turn)

Values that are not persisted:

systemPromptParts: rebuilt every turn (not persisted)
prompt: the user input for a single call (not persisted)

The principle is to persist internally generated state (history, usage, cache) while leaving externally supplied data (system prompt, tool definitions) uncached.

05 SDK Mode vs REPL Mode: Two Faces of the Same Engine

Both interactive REPL and programmatic SDK usage share the same QueryEngine codebase. In SDK mode the engine returns a structured result after the query loop finishes; in REPL mode it streams events.

// SDK mode
if (options.sdkMode) {
  return {
    messages: this.mutableMessages,
    usage: this.totalUsage,
    cost: this.getTotalCost(),
  };
}

// Fast mode example (lightweight model override)
const fastModeState = getFastModeState();
if (fastModeState.enabled) {
  options.model = fastModeState.fastModel; // affects only this call
}

AgentTool (sub‑agent) invokes the engine with sdkMode: true, receives the structured messages, and continues its own workflow.
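
A caller that wants the structured result from an async generator has to drive it by hand, because a for await loop discards the generator's return value. The sketch below shows one way to do that. runToCompletion, the Result shape, and the stand-in submitMessage are hypothetical; how AgentTool actually consumes the engine is not shown in the source.

```typescript
type Result = { messages: string[]; usage: number };

// Hypothetical stand-in: yields events, then returns a structured result.
async function* submitMessage(prompt: string): AsyncGenerator<string, Result> {
  yield `thinking about: ${prompt}`;
  yield "done";
  return { messages: [prompt, "answer"], usage: 42 };
}

// Drive the generator manually so the value attached to { done: true } is not lost.
async function runToCompletion<E, R>(
  gen: AsyncGenerator<E, R>,
  onEvent: (event: E) => void = () => {},
): Promise<R> {
  while (true) {
    const step = await gen.next();
    if (step.done) return step.value;
    onEvent(step.value);
  }
}

const events: string[] = [];
const resultPromise = runToCompletion(submitMessage("hi"), (e) => events.push(e));

resultPromise.then((result) => {
  console.log(events.length, result.usage); // 2 42
});
```

The same generator can therefore serve a streaming consumer (which only cares about yielded events) and a programmatic one (which only cares about the final value).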

06 Session Persistence: What recordTranscript Does

After each submitMessage, recordTranscript serialises mutableMessages to a JSONL file under ~/.claude/projects/<project_hash>/. The --continue and --resume flags read this file back into initialMessages, allowing a new engine instance to pick up where the previous session left off.

// Persist after each turn
recordTranscript(this.mutableMessages);
flushSessionStorage();

// Restore on construction
constructor(config: QueryEngineConfig) {
  this.mutableMessages = config.initialMessages ?? [];
}

The /clear command calls resetMessages(), clearing mutableMessages and writing a fresh empty transcript.
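
The JSONL round trip described above can be sketched as follows. writeTranscript and readTranscript are hypothetical helpers, the Message shape is simplified, and the file is written to a temporary directory rather than the real ~/.claude location; the actual recordTranscript implementation is not shown in the source.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

type Message = { role: "user" | "assistant"; content: string };

// JSONL: one JSON document per line. JSON.stringify escapes embedded newlines,
// so each message is guaranteed to occupy exactly one line.
function writeTranscript(file: string, messages: Message[]): void {
  const body = messages.map((m) => JSON.stringify(m)).join("\n");
  writeFileSync(file, body.length > 0 ? body + "\n" : "", "utf8");
}

function readTranscript(file: string): Message[] {
  return readFileSync(file, "utf8")
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as Message);
}

const dir = mkdtempSync(join(tmpdir(), "transcript-"));
const file = join(dir, "session.jsonl");

const original: Message[] = [
  { role: "user", content: "line one\nline two" },
  { role: "assistant", content: "ok" },
];

writeTranscript(file, original);
const restored = readTranscript(file); // would feed initialMessages on resume
console.log(restored.length); // 2
```

Writing an empty array produces an empty file, which mirrors the "fresh empty transcript" behavior described for /clear.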

07 Reusing the Pattern in Your Project

A minimal implementation mirrors the same "one instance = one session" contract, with explicit turn counting, budget checks at entry, per‑turn system‑prompt reconstruction, and a reset() method that clears state without destroying the instance.

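class QueryEngine {
  private messages: Message[] = [];
  private totalTokens: TokenUsage = new TokenUsage();

  // Configuration is injected; buildSystemPrompt, callLLM, and persistTranscript
  // are left for your project to implement.
  constructor(private config: { maxTurns?: number }) {}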

  async *submitMessage(prompt: string): AsyncGenerator<Event> {
    // 1. Budget guard
    if (this.config.maxTurns && this.getTurnCount() >= this.config.maxTurns) {
      yield { type: 'budget_exceeded' };
      return;
    }
    // 2. Rebuild system prompt each turn
    const system = await this.buildSystemPrompt();
    // 3. Append user message
    this.messages.push({ role: 'user', content: prompt });
    // 4. LLM loop
    for await (const event of this.callLLM(system, this.messages)) {
      yield event;
      if (event.type === 'assistant') {
        this.messages.push(event.message);
        this.totalTokens.add(event.usage);
      }
    }
    // 5. Persist
    await this.persistTranscript();
  }

  reset(): void {
    this.messages = [];
    this.totalTokens = new TokenUsage();
  }

  getTurnCount(): number {
    return this.messages.filter(m => m.role === 'user').length;
  }
}

08 Design Insights: Four Engineering Decisions

Dependency injection, not centralized state: The engine only holds session‑level state; application‑level state is accessed via injected getAppState/setAppState, allowing safe reuse by sub‑agents.

System prompts are never cached; message history always is: Prompt content derived from external files (e.g., CLAUDE.md) is re-read and rebuilt each turn, while internally generated data is kept across turns.

Single codebase serves both SDK and REPL modes: No duplicated implementations; AgentTool benefits from this shared engine.

Budget check as an entry guard: Per‑turn validation avoids background monitoring or timers.
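
The entry-guard idea can be sketched as a pure function. The guard shown earlier checks only maxTurns; extending it to maxBudgetUsd as below is an assumption, and checkBudget, the Spent shape, and the max_budget_usd reason string are hypothetical names.

```typescript
type Limits = { maxTurns?: number; maxBudgetUsd?: number };
type Spent = { turns: number; costUsd: number };
type BudgetVerdict =
  | { ok: true }
  | { ok: false; reason: "max_turns" | "max_budget_usd" };

// Evaluated once at the start of each turn; no timers or background monitors.
function checkBudget(limits: Limits, spent: Spent): BudgetVerdict {
  if (limits.maxTurns !== undefined && spent.turns >= limits.maxTurns) {
    return { ok: false, reason: "max_turns" };
  }
  if (limits.maxBudgetUsd !== undefined && spent.costUsd >= limits.maxBudgetUsd) {
    return { ok: false, reason: "max_budget_usd" };
  }
  return { ok: true };
}

console.log(checkBudget({ maxTurns: 3 }, { turns: 3, costUsd: 0 }));
// { ok: false, reason: "max_turns" }
console.log(checkBudget({ maxBudgetUsd: 1.5 }, { turns: 10, costUsd: 0.4 }));
// { ok: true }
```

This sketch compares against undefined, so a limit of 0 blocks the first turn. A truthiness check such as limits.maxTurns && ... would instead treat 0 as "no limit"; which behavior is intended is a design choice.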

09 Critical View: Three Technical Debts

Unbounded growth of mutableMessages: Long sessions can cause memory pressure; auto‑compaction will be covered in a future article.

Per‑turn system‑prompt reconstruction incurs I/O: Large projects may suffer noticeable latency; incremental updates or file watching could mitigate this.

Abort controller recreation can race: If a previous turn’s async work hasn’t fully cleaned up, a new controller may coexist, leading to race conditions under rapid user aborts.
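
One possible mitigation for the abort race, sketched under the assumption that each turn captures its own signal at the start instead of reading a shared controller later. TurnRunner and commit are hypothetical names; this is not how Claude Code is known to handle it.

```typescript
class TurnRunner {
  private current: AbortController | null = null;

  // Start a turn with its own controller and abort any turn that is still running.
  begin(): AbortSignal {
    this.current?.abort();
    this.current = new AbortController();
    return this.current.signal;
  }

  abort(): void {
    this.current?.abort();
  }
}

const committed: string[] = [];

// Late output is checked against the signal of the turn that produced it.
function commit(signal: AbortSignal, text: string): void {
  if (signal.aborted) return; // output from a superseded turn is dropped
  committed.push(text);
}

const runner = new TurnRunner();
const firstSignal = runner.begin();
const secondSignal = runner.begin(); // supersedes the first turn

commit(firstSignal, "stale output");
commit(secondSignal, "fresh output");

console.log(firstSignal.aborted, secondSignal.aborted); // true false
console.log(committed); // ["fresh output"]
```

Since every turn holds a signal of its own, cleanup work left over from an aborted turn cannot mistake the new controller for its own.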

10 Summary

The core contract is that a QueryEngine instance equals a conversation thread. State that the engine itself creates (history, token usage, file cache) persists across turns, while externally supplied state (system prompts, tool definitions) is rebuilt each turn. The same engine powers both interactive REPL and programmatic SDK usage, enabling lightweight sub‑agent sandboxes. Budget enforcement is performed as an entry guard, keeping overhead minimal. The most pressing technical debt is the unlimited growth of mutableMessages, which will be addressed in the next installment.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
