How Claude Code Implements Skills: Architecture, Loading, and Execution

This article dissects Claude Code's skill system, tracing its evolution from early prompt engineering to the modern skill framework, detailing the loading pipeline, SKILL.md structure, lazy compilation, command routing, and the system's strengths and limitations.

Goodme Frontend Team

The piece begins with a historical overview of large‑language‑model (LLM) development, highlighting three phases: the explosive popularity of ChatGPT in late 2022, the rise of prompt‑engineering repositories in early 2023, and the emergence of standardized tool‑calling interfaces, from OpenAI Function Calling in mid‑2023 to Anthropic's Model Context Protocol (MCP) in late 2024.

What Are Skills?

In October 2025 Anthropic released Claude Skills, which package reusable, documented capability units. A skill bundles best‑practice instructions (e.g., how to generate a DOCX or read a PDF) so the model can reference them instead of relying on ad‑hoc prompts. The main benefits are:

Maintainable knowledge: best practices live in SKILL.md and related files, making updates straightforward.

On‑demand loading: the model only reads a skill when needed, keeping the context clean.

Human‑AI collaboration: humans maintain the skill documentation; the model executes it.

Reusability: any user can invoke a skill and obtain consistent results.

A skill can be thought of as a combination of company policies and a toolbox. Each skill resides in its own directory containing three elements:

SKILL.md: a natural‑language instruction file describing the skill's purpose, usage conditions, and any caveats.

Script: code written in Python, JavaScript, or another language that performs the actual work when the model calls the skill.

Resource files: auxiliary documents, templates, or configuration files referenced by the script.

Thus, skills are essentially high‑level prompts combined with tool‑calling capabilities, published via platforms like clawhub for versioning and discovery.

How Skills Are Implemented in Claude Code

Claude Code loads skills in two stages: a fast loading phase that parses only the front‑matter of each SKILL.md, and a lazy injection phase that compiles the full markdown into prompt blocks at call time.

1. Startup Entry Point

Running the claude command triggers the entry point in src/main.tsx (lines 1918‑1932). The code registers built‑in plugins and bundled skills, then creates a setupPromise and a commandsPromise (if the worktree is enabled). The ordering matters: initBundledSkills() must finish populating its in‑memory array (bundledSkills.push(skill)), a sub‑millisecond operation, before getCommands() runs; otherwise bundled skills would be missing from the command list.

2. Skill Loading Pipeline

When a .claude/skills/ directory is detected at the user or project level, the system reads skills in parallel using Promise.all. The merge order, from high to low priority, is:

[
  ...bundledSkills,          // embedded skills
  ...builtinPluginSkills,    // built‑in plugin skills
  ...skillDirCommands,       // user / project / managed skills
  ...workflowCommands,
  ...pluginCommands,
  ...pluginSkills,
  ...COMMANDS()              // non‑skill commands
]

The core loader loadSkillsFromSkillsDir performs:

Read directory entries with fs.readdir.

Skip non‑directories; expect skill-name/SKILL.md structure.

Read SKILL.md content.

Parse YAML front‑matter into a structured object.

Extract fields (description, allowedTools, model, hooks, etc.).

Create a Command object that captures the markdown via a closure for lazy compilation.

Duplicate paths are de‑duplicated using realpath, and the priority is determined by the merge order (managed > user > project > additional > legacy).
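The front‑matter parsing in steps 4 and 5 can be sketched as a small pure function. The name parseFrontMatter is hypothetical, and this naive parser handles only flat key: value pairs rather than full YAML, which the real loader would need:

```typescript
// Minimal front-matter extractor: splits a SKILL.md string into a
// metadata object and the remaining markdown body. A real loader would
// use a YAML library; this sketch handles flat `key: value` pairs only.
interface SkillMeta {
  [key: string]: string;
}

function parseFrontMatter(source: string): { meta: SkillMeta; body: string } {
  const match = source.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return { meta: {}, body: source };

  const meta: SkillMeta = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  // Everything after the closing `---` is the markdown body that gets
  // captured in a closure for lazy compilation.
  return { meta, body: source.slice(match[0].length) };
}

const { meta, body } = parseFrontMatter(
  "---\nname: pdf-reader\nmodel: claude-sonnet-4-6\n---\n# Reading PDFs\n"
);
// meta.name === "pdf-reader"; body starts with "# Reading PDFs"
```

Keeping the body as an opaque string at this stage is what makes the fast loading phase cheap: only the few metadata lines are interpreted at startup.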

3. Conditional Skills

Skills with a paths field are stored in a conditionalSkills map and only become active when a file operation matches the glob pattern. Activation uses a gitignore‑style matcher.
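The matching itself can be illustrated with a toy glob matcher. This is a sketch, not Claude Code's implementation; a production system would use a gitignore‑compatible library such as picomatch or ignore:

```typescript
// Toy glob matcher for conditional-skill paths. Supports `*` (within a
// path segment) and `**` (across segments).
function globToRegExp(glob: string): RegExp {
  let re = "";
  let i = 0;
  while (i < glob.length) {
    const ch = glob[i];
    if (ch === "*") {
      if (glob[i + 1] === "*") {
        re += ".*";               // `**` matches across directory separators
        i += 2;
        if (glob[i] === "/") i++; // swallow the slash in `**/`
      } else {
        re += "[^/]*";            // `*` stops at a directory separator
        i++;
      }
    } else {
      re += ch.replace(/[.+^${}()|[\]\\?]/g, "\\$&"); // escape regex chars
      i++;
    }
  }
  return new RegExp("^" + re + "$");
}

function pathMatches(glob: string, filePath: string): boolean {
  return globToRegExp(glob).test(filePath);
}

pathMatches("src/**/*.ts", "src/a/b/util.ts"); // true
pathMatches("src/**/*.ts", "docs/readme.md");  // false
```

When a file operation's path matches, the corresponding entry in the conditionalSkills map is promoted to an active skill.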

4. SKILL.md Front‑Matter

Key front‑matter fields include:

name

description

model (e.g., claude-sonnet-4-6)

effort (low | medium | high, or a numeric value)

context (fork for an isolated subprocess, inline otherwise)

allowed-tools (a list of tool names)

paths (conditional activation globs)

hooks (pre/post tool hooks)

shell (execution environment)

version

Together, these fields answer the question "What runtime guarantees does this skill need?"
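Taken together, a SKILL.md header using these fields might look like the following (the field values are illustrative, not taken from a shipped skill):

```yaml
---
name: pdf-reader
description: Extract text and tables from PDF files. Use when the user asks to read or summarize a PDF.
model: claude-sonnet-4-6
effort: medium
context: fork
allowed-tools:
  - Read
  - Bash
paths:
  - "**/*.pdf"
version: 1.0.0
---
```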

5. Lazy Compilation via getPromptForCommand

The Command object stores only the front‑matter initially. When a user invokes /skill-name, the closure runs getPromptForCommand, which:

Injects the base directory path if needed.

Substitutes argument placeholders ($ARGUMENTS, ${CLAUDE_SKILL_DIR}, ${CLAUDE_SESSION_ID}).

Executes any embedded shell commands (prefixed with !) to fetch live data.

Returns a ContentBlockParam[] containing the final text block.
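The substitution step can be sketched as follows. The name expandPlaceholders is hypothetical, and the real compilation pipeline also executes !-prefixed shell commands, which is omitted here:

```typescript
// Replace skill placeholders with runtime values before the compiled
// prompt is sent to the model. $ARGUMENTS carries the text typed after
// the slash command; ${CLAUDE_SKILL_DIR} and ${CLAUDE_SESSION_ID} are
// variables injected by the runtime.
function expandPlaceholders(
  template: string,
  args: string,
  vars: Record<string, string>
): string {
  let out = template.split("$ARGUMENTS").join(args);
  for (const [key, value] of Object.entries(vars)) {
    out = out.split("${" + key + "}").join(value);
  }
  return out;
}

const prompt = expandPlaceholders(
  "Review ${CLAUDE_SKILL_DIR}/checklist.md, then apply it to: $ARGUMENTS",
  "src/app.ts",
  { CLAUDE_SKILL_DIR: "/home/user/.claude/skills/review" }
);
// "Review /home/user/.claude/skills/review/checklist.md, then apply it to: src/app.ts"
```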

This design provides a compilation pipeline that progressively reduces the LLM’s decision space, turning high‑entropy intent into deterministic instructions.

6. Command Routing

Claude Code defines nine entry points for invoking skills; the article focuses on the first: user slash commands (/skill-name).

The flow is:

REPL.onSubmit detects a leading / and extracts the command name.

handlePromptSubmit decides whether to queue the input (if the model is busy) or execute immediately.

processUserInput routes slash commands to processSlashCommand.

processSlashCommand parses the command, looks up its registration, and dispatches based on command type (prompt, local, local-jsx).

For prompt skills, getMessagesForPromptSlashCommand loads the skill content via command.getPromptForCommand, builds a message list (including metadata, skill content, and permission attachments), and marks shouldQuery as true.

onQuery appends these messages to the conversation history and calls the Claude API, sending the compiled skill as a user message.
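The dispatch in the middle of this flow amounts to a registry lookup plus a branch on command type. The shape below is a simplification; the type and function names are illustrative, not Claude Code's actual identifiers:

```typescript
// Simplified slash-command dispatch: parse "/name args", look up the
// registered command, and branch on its type.
type CommandType = "prompt" | "local" | "local-jsx";

interface Command {
  name: string;
  type: CommandType;
  getPromptForCommand?: (args: string) => string;
}

function routeSlashCommand(
  input: string,
  registry: Map<string, Command>
): { kind: "query"; prompt: string } | { kind: "local" } | { kind: "unknown" } {
  const [name, ...rest] = input.slice(1).split(" "); // strip the leading "/"
  const command = registry.get(name);
  if (!command) return { kind: "unknown" };

  if (command.type === "prompt" && command.getPromptForCommand) {
    // Prompt skills compile to a message that is appended to the
    // conversation and sent to the API.
    return { kind: "query", prompt: command.getPromptForCommand(rest.join(" ")) };
  }
  // local / local-jsx commands run entirely client-side.
  return { kind: "local" };
}
```

The key property is that compilation happens only on the "query" branch, which is what makes the lazy-injection phase lazy.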

7. Design Trade‑offs

Front‑matter is parsed at startup for fast skill discovery; full markdown compilation is deferred until invocation, keeping startup latency low.

Shell commands run on each call, guaranteeing fresh data but adding latency to every invocation.

All skill content resides in a single in‑memory copy, and each call creates a new compiled prompt. This is memory‑efficient, but output can vary between invocations if the underlying shell commands return different data.

Critical Reflections

What Problems Do Skills Solve?

LLMs suffer from output inconsistency, structural drift, and hallucination. Skills address these by:

Encapsulating best‑practice prompts to fix output behavior.

Using front‑matter constraints and hooks to enforce structural guards.

Conditionally activating based on file paths to reduce hallucination domains.

Three Orthogonal Sub‑systems

Declaration layer : SKILL.md as a domain‑specific language (DSL) describing name, description, when‑to‑use, paths, allowed tools, model, effort, context, hooks, shell, version.

Compilation layer : getPromptForCommand performs lazy compilation, argument substitution, environment injection, and shell execution.

Runtime layer : The resulting Command object is a closure that holds the markdown and metadata, executing only when invoked.

Limitations

Composition explosion : Skills can call other skills via SkillTool, but there is no declarative composition field (e.g., depends_on), no token‑budget accounting for composed skills, and no DAG scheduling, leading to potential conflicts and uncontrolled token usage.

Lack of feedback loop : Execution is one‑way ( SKILL.md → compile → LLM → output). Hooks can validate tool calls but cannot verify pure LLM text, so deviations from expected behavior go unchecked.

Versioning gaps : Although front‑matter includes a version, the loader never uses it for compatibility checks, so upgrading a skill can silently break dependent skills.

Potential Enhancements

Introduce a declarative composes field to build a prompt‑computation graph, enabling token budgeting, conflict detection, and DAG execution.

Add output_schema or validation metadata so the system can automatically verify LLM responses against expected structures.

Leverage the version field to enforce compatibility constraints or automatic migration scripts.
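As a sketch of what such enhancements might look like in front‑matter (composes and output_schema are hypothetical fields, not part of the current format):

```yaml
---
name: release-notes
version: 2.1.0
composes:
  - skill: git-log-summary   # hypothetical dependency declaration
    version: ">=1.3 <2"      # compatibility range the loader could enforce
output_schema:               # contract the runtime could validate against
  type: object
  required: [summary, breaking_changes]
---
```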

Skill System from a Product‑R&D Perspective

Currently, skills automate atomic actions such as /commit, /review, or /test. Mapping the full product development lifecycle reveals many stages where skills could add value—requirements analysis, architectural suggestions, automated coding, test generation, impact analysis, and rollout strategies—provided the necessary infrastructure (structured input, dependency graphs, CI/CD hooks) exists.

The core promise of skills is not raw automation but the standardization of AI‑human interaction: by codifying expert knowledge into reusable, versioned units, teams can reduce randomness, propagate best practices, and achieve auditability. However, this assumes the underlying methodology encoded in a skill is sound; otherwise, flawed practices will be industrialized at scale.

Conclusion

Claude Code’s skill system offers a sophisticated prompt‑compilation framework that balances fast discovery with lazy, context‑aware execution. Its design addresses key LLM shortcomings but leaves open challenges around composability, validation, and version management. From a product‑R&D standpoint, extending skills beyond atomic tasks toward higher‑level workflow orchestration could dramatically improve development quality, provided the system evolves to include explicit dependency declarations, output contracts, and robust versioning.
