Why Anthropic Skips Function Calling: Inside the 5 Skill Execution Modes

This article dissects Anthropic's Skill framework, revealing how it drives AI agents through five distinct execution modes—pure prompt injection, script execution, library calls, progressive document loading, and workflow orchestration—while avoiding function‑calling registration and optimizing token usage.

Tencent Cloud Developer

Background

Anthropic’s Skill system lets AI agents perform tasks such as PDF handling, PPT generation, and data extraction without registering individual function calls. A SKILL.md file provides natural‑language documentation that the model reads and follows.

Core Architecture

The framework is built around these core Python modules:

_types.py
_repository.py
_tools.py
_toolset.py
_dynamic_toolset.py
_run_tool.py
_skill_processor.py

Key runtime components are:

SkillToolSet – registers built‑in tools (e.g., skill_load, skill_list, skill_run, skill_select_*).

DynamicSkillToolSet – lazily loads skill‑specific tools based on a temporary state_delta.

FsSkillRepository – scans the skills/ directory, parses SKILL.md, and extracts tool declarations.

WorkspaceRuntime – provides two isolated execution back‑ends (local sandbox or Docker container) with read‑only protection for skill files.
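To make the repository's role concrete, here is a minimal, hypothetical sketch of the kind of scan FsSkillRepository performs. The function name scan_skills and the description heuristic are assumptions, not Anthropic's actual code; the real module also extracts tool declarations.

```python
# Hypothetical sketch of an FsSkillRepository-style scan: walk skills/,
# find each SKILL.md, and map the skill name (its directory name) to a
# short description taken from the file.
from pathlib import Path

def scan_skills(root: str) -> dict[str, str]:
    """Map skill name -> first non-empty line of its SKILL.md."""
    skills = {}
    for skill_md in Path(root).glob("*/SKILL.md"):
        text = skill_md.read_text(encoding="utf-8")
        first_line = next((ln for ln in text.splitlines() if ln.strip()), "")
        skills[skill_md.parent.name] = first_line.strip()
    return skills
```

With a map like this, skill_list can be answered without loading any SKILL.md body into the prompt.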

Execution Flow

1. LLM calls skill_list() → receives skill names. 
2. LLM calls skill_load("pdf") → loads SKILL.md into the system prompt. 
3. LLM calls skill_select_docs(docs=["forms.md"]) → loads additional documentation. 
4. LLM calls skill_run(command="python3 scripts/xxx.py") → runs the command in an isolated workspace. 
5. The runtime injects environment variables ($WORK_DIR, $OUTPUT_DIR, …). 
6. skill_run stages the skill directory, computes a hash for incremental caching, and executes the command. 
7. stdout, stderr, exit code, and any output files are returned to the LLM.
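Steps 4–7 can be sketched as a single function. This is an illustrative reconstruction, not the framework's actual skill_run, and it omits the sandbox/Docker isolation layer entirely.

```python
# Illustrative sketch of skill_run's core loop: inject environment
# variables, execute the command inside the workspace, and return
# stdout, stderr, exit code, and any files written to $OUTPUT_DIR.
import os
import subprocess

def skill_run(command: str, work_dir: str, output_dir: str) -> dict:
    env = dict(os.environ, WORK_DIR=work_dir, OUTPUT_DIR=output_dir)
    proc = subprocess.run(
        command, shell=True, cwd=work_dir, env=env,
        capture_output=True, text=True,
    )
    return {
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "exit_code": proc.returncode,
        "output_files": sorted(os.listdir(output_dir)),
    }
```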

Key Design Elements

CQRS pattern: write operations (tool registration) and read operations (skill processing) are decoupled via a temporary state_delta.

Incremental hashing (compute_dir_digest()) avoids re‑copying unchanged skill directories, saving I/O.
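A hedged sketch of what compute_dir_digest() might look like: the article does not show the real function's inputs, so this version hashes paths, sizes, and modification times as a plausible stand-in (the actual implementation may hash file contents instead).

```python
# Plausible stand-in for compute_dir_digest(): fingerprint a directory
# by file paths, sizes, and mtimes so unchanged skill directories can
# be detected without re-copying them.
import hashlib
import os

def compute_dir_digest(root: str) -> str:
    h = hashlib.sha256()
    for dirpath, _, filenames in sorted(os.walk(root)):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            h.update(f"{path}|{st.st_size}|{st.st_mtime_ns}".encode())
    return h.hexdigest()
```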

Read‑only protection: skill files are read‑only; only $OUTPUT_DIR and $WORK_DIR are writable.

Environment injection: variables such as $WORKSPACE_DIR, $SKILLS_DIR, and $OUTPUT_DIR let scripts run unchanged locally or inside containers.

Token‑efficient tool schema: a single generic skill_run tool (~20 tokens) replaces dozens of function‑calling schemas (1,600–4,000 tokens per round).

Five Execution Modes

Mode 1 – Pure Prompt Injection

Skills that consist only of a SKILL.md file (e.g., frontend‑design) are injected as a system prompt. The LLM generates the final output directly using built‑in tools such as write_to_file. No external scripts are executed.
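For a prompt-injection-only skill, the entire artifact is the markdown file. A hypothetical minimal SKILL.md might look like the following; the frontmatter fields shown are assumptions, not a documented schema.

```markdown
---
name: frontend-design
description: Style guidance for generating frontend code
---

# Frontend Design

When asked for UI code:
1. Prefer semantic HTML elements over generic <div> wrappers.
2. Write the result directly with write_to_file; no scripts are required.
```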

Mode 2 – Script Execution

Skills like pdf, pptx, and xlsx contain a scripts/ directory and detailed documentation. After the LLM reads SKILL.md, it calls skill_run("python3 scripts/…") to execute the script inside an isolated workspace.

Mode 3 – Library Call

Skills such as slack‑gif‑creator ship a Python package under core/. SKILL.md serves as API documentation; the LLM writes a short script that imports the library (e.g., GIFBuilder) and calls its functions. Execution still goes through skill_run.
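The driver script the LLM writes in this mode is typically only a few lines. Since GIFBuilder's real API is not documented in the article, the class below is a self-contained stub; only the pattern — import the skill's core/ package, call its API, write to $OUTPUT_DIR — reflects the mode.

```python
# Illustrative driver script for a library-call skill. GIFBuilder here
# is a stub standing in for the skill's core/ package; a real builder
# would encode image frames instead of recording labels.
import os

class GIFBuilder:
    def __init__(self, width: int, height: int):
        self.width, self.height = width, height
        self.frames = []

    def add_frame(self, label: str) -> None:
        self.frames.append(label)

    def save(self, path: str) -> str:
        with open(path, "w") as f:
            f.write("\n".join(self.frames))
        return path

# The script writes its result into $OUTPUT_DIR, matching the
# environment-injection convention described earlier.
output_dir = os.environ.get("OUTPUT_DIR", ".")
builder = GIFBuilder(320, 240)
for label in ("frame-1", "frame-2"):
    builder.add_frame(label)
result = builder.save(os.path.join(output_dir, "demo.gif"))
```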

Mode 4 – Progressive Document Loading

Skills like pptx provide a lightweight routing table in SKILL.md and separate detailed docs (e.g., editing.md, pptxgenjs.md). The LLM first loads the overview, then lazily loads the required document via skill_select_docs, keeping token usage low.
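Progressive loading can be sketched as a function that returns only the requested documents' text. The name skill_select_docs comes from the article; the signature and behavior here are assumptions.

```python
# Sketch of progressive document loading: SKILL.md's overview is loaded
# first, and heavier docs such as editing.md are pulled into context
# only when explicitly requested.
from pathlib import Path

def skill_select_docs(skill_dir: str, docs: list[str]) -> str:
    """Return only the requested docs' text, keeping token usage low."""
    parts = [Path(skill_dir, name).read_text(encoding="utf-8") for name in docs]
    return "\n\n".join(parts)
```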

Mode 5 – Orchestration (Workflow)

The skill‑creator skill defines a multi‑stage pipeline (interview → write SKILL.md → spawn sub‑agents → run evaluation scripts → iterate). It combines prompt injection, document loading, and skill_run to orchestrate complex, multi‑step workflows.

Token Savings

Using a single generic skill_run reduces the per‑round tool schema from ~1,600–4,000 tokens to ~20 tokens. For a skill with eight scripts, the savings over ten dialogue rounds amount to roughly 16,000–40,000 tokens. Progressive loading cuts token consumption further (e.g., loading only editing.md adds ~1,700 tokens instead of the full 28 KB payload).
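A back-of-the-envelope check of these figures, using the per-round schema sizes quoted above:

```python
# Savings from replacing per-round function-calling schemas
# (~1,600-4,000 tokens) with one generic skill_run schema (~20 tokens)
# over ten dialogue rounds.
SCHEMA_LOW, SCHEMA_HIGH = 1_600, 4_000  # function-calling tokens per round
SKILL_RUN = 20                          # generic tool tokens per round
ROUNDS = 10

savings_low = (SCHEMA_LOW - SKILL_RUN) * ROUNDS    # 15,800 tokens
savings_high = (SCHEMA_HIGH - SKILL_RUN) * ROUNDS  # 39,800 tokens
```

which lands close to the 16,000–40,000-token range quoted above.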

Why Anthropic Avoids Function Calling

LLM as code writer: the model can generate arbitrary code from natural‑language examples, making a fixed function schema unnecessary.

Universal sandbox: skill_run can execute any command, providing a catch‑all capability.

Narrow use case for function calls: only operations that cannot be expressed as code (e.g., internal API calls) would benefit from function calling.

Practical Recommendations for Skill Authors

Start with a pure prompt‑injection skill (only SKILL.md) to obtain a working prototype.

If file manipulation is required, add a scripts/ directory and document the usage in SKILL.md.

For reusable libraries, place code under core/ and treat SKILL.md as API documentation.

Use a routing table and progressive loading for large skill sets to keep token budgets low.

For complex multi‑step processes, define the workflow in SKILL.md and orchestrate sub‑agents via skill_select_docs and skill_run.
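Putting these recommendations together, a full-featured skill might be laid out as follows (a hypothetical structure following the conventions described above):

```
skills/
└── pptx/
    ├── SKILL.md          # overview + routing table (always loaded)
    ├── editing.md        # loaded on demand via skill_select_docs
    ├── pptxgenjs.md      # loaded on demand
    ├── scripts/          # entry points executed via skill_run
    │   └── build_deck.py
    └── core/             # reusable library, documented in SKILL.md
```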

Conclusion

A Skill's execution power is the product of two things: the quality of its SKILL.md body and the generic sandbox tool skill_run. By treating the markdown as both documentation and prompt, Anthropic achieves flexible, token‑efficient AI agents without the overhead of per‑function registration.

Tags: AI · LLM · Agent · Function Calling · Skill · Token Optimization
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
