Inside Claude Code: Deep Dive into Prompt, Context & Harness Engineering for AI Coding Agents

This article analyzes Claude Code, an AI coding agent, exploring its sophisticated Prompt Engineering, dynamic Context assembly, multi‑layered compression strategies, and Harness Engineering mechanisms, while comparing its design to OpenClaw and highlighting unique features such as memory systems, sandbox isolation, and playful Easter eggs.

Alibaba Cloud Developer

Background

The author revisits an earlier deep analysis of OpenClaw and uses it as a reference point to examine Claude Code, an AI coding agent built on Anthropic's Claude Opus 4.6 model. The goal is to understand how Claude Code designs its Prompt Engineering, Context Engineering, and Harness Engineering to achieve reliable, long‑running tasks.

Prompt Engineering – Static & Dynamic Prompt Assembly

Claude Code treats the system prompt as a modular, multi‑stage construction. Static sections (identity, system rules, task guidelines, safety, tool usage, tone, output efficiency) are defined in constants/prompts.ts and assembled into an array of strings. Dynamic sections (session guidance, memory, environment info, language, output style, MCP instructions, scratchpad, function‑result clearing, token budget) are injected at runtime.

QueryEngine.ask()
  → fetchSystemPromptParts()
  → buildEffectiveSystemPrompt()
  → query()

The fetchSystemPromptParts() function gathers three components:

defaultSystemPrompt – from constants/prompts.ts#getSystemPrompt()

systemContext – from context.ts#getSystemContext() (git status, environment)

userContext – from context.ts#getUserContext() (CLAUDE.md content, current date)
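A minimal sketch of how these three components might be gathered and concatenated into the array-of-strings form described above. The function names mirror the article; the types and placeholder contents are illustrative assumptions, not the real implementation:

```typescript
// Hypothetical shapes for the three prompt components the article names.
interface SystemPromptParts {
  defaultSystemPrompt: string; // static rules from constants/prompts.ts
  systemContext: string;       // git status, environment info
  userContext: string;         // CLAUDE.md content, current date
}

// Stand-in for fetchSystemPromptParts(): the real version reads these
// from constants/prompts.ts and context.ts.
function fetchSystemPromptParts(): SystemPromptParts {
  return {
    defaultSystemPrompt: "You are Claude Code, an AI coding agent...",
    systemContext: "gitStatus: clean\nbranch: main",
    userContext: "CLAUDE.md: <project description>\ndate: 2025-01-01",
  };
}

// Assemble the parts into an array of strings, static sections first,
// so static prefixes stay cache-friendly.
function assemblePromptArray(parts: SystemPromptParts): string[] {
  return [parts.defaultSystemPrompt, parts.systemContext, parts.userContext];
}
```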

Priority‑based selection (override, coordinator, agent, custom, default) is performed in utils/systemPrompt.ts#buildEffectiveSystemPrompt():

1. overrideSystemPrompt  // forced replacement
2. Coordinator prompt   // mode‑specific
3. Agent prompt        // user‑defined
4. customSystemPrompt  // --system-prompt flag
5. defaultSystemPrompt // fallback
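The five-level fallback above can be sketched as a nullish-coalescing chain. Option names follow the article's list; the real utils/systemPrompt.ts may structure this differently:

```typescript
// Hypothetical options bag for prompt selection; only the fallback is required.
interface PromptOptions {
  overrideSystemPrompt?: string; // 1. forced replacement
  coordinatorPrompt?: string;    // 2. mode-specific
  agentPrompt?: string;          // 3. user-defined agent
  customSystemPrompt?: string;   // 4. --system-prompt flag
  defaultSystemPrompt: string;   // 5. fallback
}

// First defined prompt in priority order wins.
function buildEffectiveSystemPrompt(opts: PromptOptions): string {
  return (
    opts.overrideSystemPrompt ??
    opts.coordinatorPrompt ??
    opts.agentPrompt ??
    opts.customSystemPrompt ??
    opts.defaultSystemPrompt
  );
}
```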

After the prompt array is built, static parts are cached (KV cache) while dynamic parts are appended per session.

Context Engineering – System & User Context Injection

Two injection points enrich the prompt:

appendSystemContext – adds git status, branch, model version, etc., to the end of the system prompt.

prependUserContext – inserts a <system‑reminder> block before the first user message, containing the CLAUDE.md project description, current date, and a disclaimer about relevance.

These blocks are wrapped with a custom tag to keep the model aware of what is system‑generated versus user‑generated.
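A hedged sketch of the prependUserContext injection point: wrapping injected context in a <system-reminder> block so the model can tell system-generated content from user input. The tag name and block contents follow the article; the exact formatting is an assumption:

```typescript
// Build a <system-reminder> block and place it before the first user
// message. The relevance disclaimer mirrors the one the article describes.
function prependUserContext(firstUserMessage: string, claudeMd: string): string {
  const reminder = [
    "<system-reminder>",
    `Project notes (CLAUDE.md): ${claudeMd}`,
    `Current date: ${new Date().toISOString().slice(0, 10)}`,
    "Note: this context may or may not be relevant to the request.",
    "</system-reminder>",
  ].join("\n");
  return `${reminder}\n\n${firstUserMessage}`;
}
```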

Harness Engineering – Controlling the Agent

Harness Engineering provides the outer execution environment: interfaces, hooks, and guardrails that constrain the model’s actions. Claude Code defines explicit guardrails (e.g., disallowing destructive file operations) and uses hooks to inject additional logic before/after tool calls.

Three‑Layer Context Compression

To keep the token window within limits, Claude Code employs a progressive compression pipeline:

MicroCompact – rule‑driven truncation of tool outputs (Bash, Read, Grep, etc.) without invoking an LLM.

Session Memory Compact (SM Compact) – replaces old messages with pre‑computed session memory summaries when token count ≥ 10 000 and message count ≥ 5.

Full LLM Compact – calls the LLM to produce a structured 9‑section summary (primary request, key concepts, files, errors, problem solving, user messages, pending tasks, current work, next step) when earlier layers are insufficient.

Compression is triggered automatically when the remaining buffer falls below AUTOCOMPACT_BUFFER_TOKENS = 13 000. The system first attempts SM Compact; if that fails, it falls back to Full LLM Compact.
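The escalation logic above (try the cheap layer first, fall back to the expensive one) can be sketched as follows. The thresholds come from the article (AUTOCOMPACT_BUFFER_TOKENS = 13 000; SM Compact requires ≥ 10 000 tokens and ≥ 5 messages); the function shape is an illustrative assumption:

```typescript
const AUTOCOMPACT_BUFFER_TOKENS = 13_000;

type CompactLayer = "none" | "sm-compact" | "full-llm-compact";

// Decide which compaction layer to run, preferring the cheaper
// session-memory compact over a full LLM summarization pass.
function chooseCompaction(
  remainingBufferTokens: number,
  historyTokens: number,
  messageCount: number,
  smCompactAvailable: boolean,
): CompactLayer {
  if (remainingBufferTokens >= AUTOCOMPACT_BUFFER_TOKENS) return "none";
  const smEligible =
    smCompactAvailable && historyTokens >= 10_000 && messageCount >= 5;
  return smEligible ? "sm-compact" : "full-llm-compact";
}
```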

Memdir Structured Memory System

Claude Code stores four categories of memory:

User – personal preferences and custom instructions.

Feedback – correction logs and anti‑pattern records.

Project – architecture decisions, tool configs, and constraints.

Reference – reusable code snippets and documentation.

Memory files are loaded via memdir/memdir.ts#loadMemoryPrompt(), filtered by type, and trimmed according to a token budget. For large memory banks, Claude Code uses an LLM‑in‑the‑loop retriever (memdir/findRelevantMemories.ts) that returns at most five semantically relevant entries.
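A minimal sketch of type-filtered, budget-trimmed memory loading. The four category names come from the article; the rough 4-characters-per-token heuristic and the function signature are assumptions, not the real memdir code:

```typescript
type MemoryType = "user" | "feedback" | "project" | "reference";

interface MemoryEntry {
  type: MemoryType;
  text: string;
}

// Keep only the requested memory types, stopping once the token budget
// is exhausted (entries are assumed to be pre-sorted by importance).
function loadMemoryPrompt(
  entries: MemoryEntry[],
  wanted: MemoryType[],
  tokenBudget: number,
): string {
  const approxTokens = (s: string) => Math.ceil(s.length / 4);
  const kept: string[] = [];
  let used = 0;
  for (const e of entries) {
    if (!wanted.includes(e.type)) continue;
    const cost = approxTokens(e.text);
    if (used + cost > tokenBudget) break; // trim: budget spent
    kept.push(e.text);
    used += cost;
  }
  return kept.join("\n");
}
```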

Built‑in Agent Suite

Claude Code ships with eight agents, each with a dedicated system prompt:

General‑Purpose Agent – full tool access, minimal prompt, “complete the task fully”.

Explore Agent – read‑only, fast Haiku model, parallel tool calls, minimum three queries.

Plan Agent – architecture design, outputs a list of critical files.

Verification Agent – attempts to break the generated code, enforces strict safety, no file writes.

Claude Code Guide Agent – answers usage questions, runs on Haiku, never asks for permission.

Statusline Setup Agent – configures terminal status bars using only Read and Edit.

Fork Sub Agent – clones the current session with shared prompt cache, limited output, prevents recursive forking.

Buddy System (Pet) – optional fun feature that spawns a deterministic ASCII pet based on user ID.

Each agent’s prompt is stored in constants/prompts.ts and follows the same static/dynamic assembly pipeline.

Security & Sandbox Isolation

The permission engine (permissions.ts, ~61 KB) classifies actions as Allow, Deny, or Ask. Critical operations (file deletion, git force‑push) are denied unless explicitly permitted.
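A toy sketch of the Allow/Deny/Ask classification. The three verdicts and the deny-unless-permitted rule for critical operations come from the article; the rule patterns here are illustrative, and the real permissions.ts is far richer:

```typescript
type Verdict = "allow" | "deny" | "ask";

// Classify a shell command: critical operations are denied unless the
// user explicitly permitted them; known-safe reads are allowed; anything
// else falls through to asking the user.
function classifyAction(command: string, explicitlyPermitted: Set<string>): Verdict {
  const critical = [/\brm\s+-rf\b/, /git\s+push\s+--force/];
  if (critical.some((re) => re.test(command))) {
    return explicitlyPermitted.has(command) ? "ask" : "deny";
  }
  const readOnly = [/^git\s+status/, /^ls\b/, /^cat\b/];
  if (readOnly.some((re) => re.test(command))) return "allow";
  return "ask"; // default: ask the user
}
```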

When a tool is allowed, Claude Code may still run it inside a lightweight bwrap sandbox (sandbox‑adapter.ts) that isolates the file system, network, and process namespaces, and drops privileges to a non‑root user.

Async Generator Main Loop

The core execution loop is an async function* generator (queryLoop) that yields at every meaningful step, enabling real‑time streaming of status, tool execution, and partial results. The loop consists of six stages:

1. Message preprocessing (system‑reminder injection).
2. LLM API call.
3. Response parsing (final answer vs. tool call).
4. Tool execution with security checks.
5. Result yielding back to the caller.
6. Termination‑condition check (max rounds, success, unrecoverable error).

Cancellation is handled via return(), and output truncation triggers up to three automatic continue retries.
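The six-stage loop can be sketched as an async generator, matching the streaming style the article describes. The event shapes, stand-in "LLM response", and helper structure are invented for illustration:

```typescript
type LoopEvent =
  | { kind: "status"; message: string }
  | { kind: "tool"; name: string }
  | { kind: "answer"; text: string };

// Yield at each stage so the caller can stream progress; stop on a final
// answer or when maxRounds is reached.
async function* queryLoop(prompt: string, maxRounds = 3): AsyncGenerator<LoopEvent> {
  for (let round = 0; round < maxRounds; round++) {
    yield { kind: "status", message: `round ${round + 1}: calling LLM` };
    // Stand-in for the real LLM call + response parsing: first round
    // requests a tool, second round produces the final answer.
    const response: { toolCall?: string; final?: string } =
      round === 0 ? { toolCall: "Read" } : { final: `done: ${prompt}` };
    if (response.toolCall) {
      yield { kind: "tool", name: response.toolCall }; // runs with security checks
      continue;
    }
    yield { kind: "answer", text: response.final ?? "" };
    return; // termination: final answer produced
  }
}
```

Cancellation maps naturally onto this shape: calling return() on the generator unwinds the loop at the next yield point.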

Programmable Hook System

Claude Code defines over 20 hook events (e.g., PreToolUse, PostToolUse, ToolError, SessionStart, UserPromptSubmit, PreFileEdit). Hooks receive a JSON payload and can modify inputs/outputs, block execution ({"blocked":true,"reason":"..."}), or inject messages. Execution timeouts are enforced by TOOL_HOOK_EXECUTION_TIMEOUT_MS = 600000 (10 minutes).
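A hedged sketch of dispatching one such hook under a timeout. The blocked/reason payload shape and the 10-minute constant come from the article; the dispatcher itself is an illustrative assumption:

```typescript
const TOOL_HOOK_EXECUTION_TIMEOUT_MS = 600_000; // 10 minutes

interface HookResult {
  blocked?: boolean;
  reason?: string;
  modifiedInput?: unknown;
}

type Hook = (payload: { tool: string; input: unknown }) => Promise<HookResult>;

// Race the hook against a timeout; clear the timer either way so it
// doesn't keep the process alive after the hook resolves.
async function runPreToolUseHook(
  hook: Hook,
  payload: { tool: string; input: unknown },
  timeoutMs: number = TOOL_HOOK_EXECUTION_TIMEOUT_MS,
): Promise<HookResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<HookResult>((_, reject) => {
    timer = setTimeout(() => reject(new Error("hook timed out")), timeoutMs);
  });
  try {
    return await Promise.race([hook(payload), timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}
```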

Easter Eggs & Fun Features

Claude Code includes several whimsical yet functional additions:

Caffeinate – on macOS it runs caffeinate -u -t 300 to prevent idle sleep.

Anti‑Distillation – injects fake tool definitions when anti_distillation:['fake_tools'] is set, poisoning potential model‑stealing datasets.

Undercover Mode – for internal Anthropic users, blocks commit messages containing “Claude Code” or model identifiers.

Dogfooding Flag – process.env.USER_TYPE === 'ant' enables internal‑only features.

Profanity Detection – regexes catch negative keywords and trigger a feedback survey.

Loading Verbs – a rotating list of 100+ whimsical verbs (e.g., “Boondoggling…”, “Lollygagging…”) keeps the CLI lively.

Buddy System – deterministic ASCII pets (cats, ducks, “chonk”, etc.) with rarity tiers, hats, attributes, and optional “shiny” variants. The pet is selected via a Mulberry32 PRNG seeded by the user ID, ensuring the same pet per user.
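The deterministic selection can be sketched with the standard public-domain Mulberry32 routine: hash the user ID into a 32-bit seed, then index into a pet list. The hash function and pet list here are illustrative assumptions:

```typescript
// Standard Mulberry32: a tiny, fast 32-bit PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Simple illustrative string hash to derive a 32-bit seed from a user ID.
function hashUserId(userId: string): number {
  let h = 0;
  for (const ch of userId) h = (Math.imul(h, 31) + ch.charCodeAt(0)) >>> 0;
  return h;
}

const PETS = ["cat", "duck", "chonk", "capybara"]; // illustrative subset

// Same user ID → same seed → same pet, every session.
function pickPet(userId: string): string {
  const rand = mulberry32(hashUserId(userId));
  return PETS[Math.floor(rand() * PETS.length)];
}
```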


Conclusion

The article demonstrates that Claude Code embodies a mature, production‑grade approach to building AI agents: modular prompt construction, rigorous context management, layered compression, fine‑grained permission and sandboxing, an extensible hook framework, and even delightful user‑facing details. These practices serve as a valuable reference for anyone designing reliable, long‑running AI‑driven workflows.
