The 12 ms Secret: How Claude Code Optimizes CLI Startup Performance
This article dissects Claude Code’s 12 ms --version response and its 1.1 MB lazy‑loaded footprint, explaining a four‑layer startup chain that uses fast‑path handling, parallel prefetch, selective lazy imports, memoized global initialization, and compile‑time dead‑code elimination to achieve near‑instant CLI launch.
CLI startup latency threshold
When a user presses Enter, any pause longer than 200 ms feels like a hang, causing anxiety and abandonment. An experiment on GitHub showed claude --version returning in 12 ms , while node --version took 50 ms . This demonstrates that sub‑200 ms response is a make‑or‑break requirement for developer‑focused CLIs.
Typical CLI startup approaches
Node.js + Commander.js with all modules imported at the top – simple but costly for large codebases.
Electron with lazy loading – still too heavy for CLI latency.
Compiled Go/Rust binaries – near‑zero start‑up cost but slower iteration.
Bun with aggressive optimizations – balances development speed and binary‑like launch times.
Claude Code adopts the fourth approach and treats startup speed as non‑negotiable.
Four‑layer startup chain
cli.tsx → Entry dispatch layer
main.tsx → CLI initialization layer
init.ts → Global initialization layer
setup.ts → Session setup layerThe guiding principle is “load only what you need”.
Layer 1 – Fast Path (12 ms)
High‑frequency commands such as --version and --dump-system-prompt are handled before any heavy imports, exiting immediately after printing the result. No React, Commander.js, or MCP modules are loaded.
// Fast path handling before any module load
if (args[0] === '--version' || args[0] === '-v') {
console.log(`${MACRO.VERSION} (Claude Code)`);
return; // zero extra module load
}
if (args[0] === '--dump-system-prompt') {
const { getSystemPrompt } = await import('./systemPrompt.js');
console.log(getSystemPrompt());
return;
}Layer 2 – Parallel prefetch (Speculative Execution)
Immediately after entering main.tsx, three I/O‑bound operations are launched in parallel: MDM config read, Keychain credential read, and an additional task. These run while the import‑resolution phase (~135 ms) is occurring, hiding the ~65 ms extra latency inside the import time.
// Parallel prefetch at the top of main.tsx
profileCheckpoint('main_tsx_entry');
startMdmRawRead(); // launch MDM config read (subprocess)
startKeychainPrefetch(); // launch Keychain credential read (dual‑path)Layer 3 – Lazy loading (≈1.1 MB on‑demand)
Modules that are rarely used are never imported until needed. Examples:
OpenTelemetry (~400 KB) – loaded only during telemetry initialization.
gRPC (grpc‑js) (~700 KB) – loaded only when the OTel exporter is required.
Insights command (113 KB) – loaded only when the user runs /insights.
Sentry – feature‑gated and stripped from the build.
// Lazy loading of the insights command
const usageReport = {
type: 'prompt',
name: 'insights',
async getPromptForCommand(args, context) {
const real = (await import('./commands/insights.js')).default;
return real.getPromptForCommand(args, context);
},
};By avoiding the import of these modules at startup, roughly 1.1 MB of JavaScript code is not parsed, equivalent to skipping over 200+ function definitions.
Layer 4 – Global initialization with memoization
The init.ts layer runs only once, wrapped with a memoize helper. It validates configs, injects environment variables, sets up TLS certificates, registers graceful shutdown, and pre‑connects to the Anthropic API, completing the TCP handshake ahead of the first request and shaving ~50 ms from the initial API call.
export const init = memoize(async () => {
enableConfigs();
applySafeConfigEnvironmentVariables();
applyExtraCACertsFromConfig();
setupGracefulShutdown();
// ... other init steps
preconnectAnthropicApi(); // TCP handshake only
});Compile‑time feature gating with Bun DCE
Bun’s bundler supports dead‑code elimination (DCE). Optional features are wrapped in a feature() call; when the flag is false, the bundler strips the entire block, removing both runtime overhead and the code from the final artifact.
// Compile‑time DCE example
const SleepTool = feature('PROACTIVE')
? require('./tools/SleepTool.js').SleepTool
: null;Transferable principles derived from the four‑layer chain
Fast Path – handle high‑frequency commands ( --version, --help) at the very top of the entry file with zero dependencies.
Parallel I/O – launch independent I/O operations (e.g., Keychain read, MDM config read) during the import phase to hide latency.
Lazy loading beyond the web – defer heavy modules until they are actually invoked, using dynamic import().
Memoization of heavyweight initialization – wrap global init functions with a memoization helper to guarantee single execution.
Compile‑time gating – combine a feature‑flag function with Bun’s DCE to eliminate dead code and achieve both performance and security isolation.
Key quantitative outcomes
Fast Path handling of --version achieves a 12 ms response.
Parallel prefetch hides ~ 65 ms of I/O inside the ~ 135 ms import period.
Lazy loading avoids parsing ~ 1.1 MB of code.
Pre‑connecting to the Anthropic API saves ~ 50 ms on the first API call.
Compile‑time feature gating removes unused modules entirely from the bundle.
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
James' Growth Diary
I am James, focusing on AI Agent learning and growth. I continuously update two series: “AI Agent Mastery Path,” which systematically outlines core theories and practices of agents, and “Claude Code Design Philosophy,” which deeply analyzes the design thinking behind top AI tools. Helping you build a solid foundation in the AI era.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
