Why Your Claude Code Skills Aren’t Working: It’s More Than Just a Markdown File
The article explains that Claude Code skills are full‑folder toolkits, not simple markdown steps, and shows how description length, progressive disclosure, classification, memory, hooks, and usage metrics determine whether a skill is triggered and useful.
Many users install a handful of Claude Code skills and wonder why the assistant never calls them. The problem is usually not the skill mechanism itself but how the skill is built. According to Anthropic’s internal blog, a skill is a folder containing a mandatory SKILL.md and optional references/, scripts/, and assets/ directories. The folder can also hold data files, configuration, and temporary hooks.
Q1 – A skill is not just a markdown file
A skill’s folder structure looks like this:
deploy-service/
├── SKILL.md # name, description, step‑by‑step guide, gotchas
├── references/ # extra docs, API specs, troubleshooting
├── scripts/ # executable helpers (e.g., smoke_test.sh)
└── assets/ # output templates (e.g., release_note.md)
Only SKILL.md is required; the other directories are optional and can be added as needed. The assistant does not load all files at once: it first reads the description, then lazily loads the full markdown and any referenced files when the skill is actually invoked.
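The two-stage loading described above can be sketched in Python. This is only an illustration, not Claude Code's actual loader: it assumes SKILL.md starts with a front-matter block delimited by `---` lines holding simple `key: value` pairs, and the function names `read_listing_entry` and `load_full_skill` are hypothetical.

```python
import tempfile
from pathlib import Path


def read_listing_entry(skill_dir: Path) -> dict:
    """Return only name and description from SKILL.md's front matter.

    Assumption: the front matter is delimited by '---' lines and holds
    simple 'key: value' pairs. The body below it is deliberately not read.
    """
    entry = {}
    with (skill_dir / "SKILL.md").open(encoding="utf-8") as handle:
        if handle.readline().strip() != "---":
            return entry  # no front matter, nothing to list
        for line in handle:
            if line.strip() == "---":
                break  # stop before the step-by-step body
            key, sep, value = line.partition(":")
            if sep and key.strip() in {"name", "description"}:
                entry[key.strip()] = value.strip()
    return entry


def load_full_skill(skill_dir: Path) -> str:
    """Read the whole SKILL.md, as would happen once the skill is invoked."""
    return (skill_dir / "SKILL.md").read_text(encoding="utf-8")


# Demo with a throwaway skill folder.
demo_dir = Path(tempfile.mkdtemp()) / "deploy-service"
demo_dir.mkdir()
(demo_dir / "SKILL.md").write_text(
    "---\n"
    "name: deploy-service\n"
    "description: Deploy a service to staging. Use when asked to ship or release.\n"
    "---\n"
    "# Steps\n"
    "1. Run scripts/smoke_test.sh\n",
    encoding="utf-8",
)
entry = read_listing_entry(demo_dir)  # cheap: front matter only
body = load_full_skill(demo_dir)      # full text, loaded on invocation
```

The listing entry carries nothing from the body, which is why the description alone has to tell the model when the skill applies.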
Q2 – Nine skill categories
Anthropic grouped hundreds of internal skills into nine functional categories: “Library and API reference”, “Product verification”, “Data query & analysis”, “Business process automation”, “Code scaffolding”, “Code quality & review”, “CI/CD & deployment”, “Runbook troubleshooting”, and “Infrastructure operations”. The most effective skills stay within a single category; mixing categories confuses the model.
Q3 – Why a skill may never fire
When a conversation starts, Claude builds a skill list that includes only each skill’s name and description. This is called Progressive Disclosure. The model decides whether to call a skill solely from the description. A vague description gives the model nothing to match against, and a description longer than the MAX_LISTING_DESC_CHARS = 250 limit is truncated with an ellipsis, so a trigger condition placed near the end becomes invisible.
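A small Python sketch shows why the position of the trigger condition matters. The constant comes from the article; the exact truncation rule (here, the ellipsis counts toward the limit) is an assumption, and `listing_description` is a hypothetical helper rather than Claude Code's own function.

```python
MAX_LISTING_DESC_CHARS = 250  # limit quoted in the article


def listing_description(description: str, limit: int = MAX_LISTING_DESC_CHARS) -> str:
    """Approximate how a description might appear in the skill list."""
    if len(description) <= limit:
        return description
    # Assumption: the ellipsis counts toward the limit; the real rule may differ.
    return description[: limit - 1] + "…"


# A description that buries its trigger phrase after a lot of background.
description = (
    "Handles deployments. "
    + "Background detail. " * 15
    + "Use when the user says 'ship it'."
)
shown = listing_description(description)
trigger_visible = "ship it" in shown
```

Here the trigger phrase starts after character 300, so it does not survive truncation. Putting the "use when ..." clause first keeps it visible.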
The list is limited to SKILL_BUDGET_CONTEXT_PERCENT = 0.01 (1 % of the context window). Adding too many skills forces the system to compress descriptions or fall back to showing only names, causing “silent” skills.
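The budget arithmetic can be sketched the same way. The percentage is the one quoted in the article; everything else is a simplifying assumption: sizes are measured in one abstract unit (characters here), only the names-only fallback is modelled rather than description compression, and `listing_budget` and `plan_listing` are hypothetical names.

```python
SKILL_BUDGET_CONTEXT_PERCENT = 0.01  # value quoted in the article


def listing_budget(context_window: int) -> int:
    """Share of the context window available to the skill list."""
    return round(context_window * SKILL_BUDGET_CONTEXT_PERCENT)


def plan_listing(skills: dict, context_window: int) -> list:
    """Return 'name: description' entries if they fit, otherwise names only.

    Assumption: entry size is its character count, in the same unit as
    context_window. The real accounting may differ.
    """
    budget = listing_budget(context_window)
    full = [f"{name}: {desc}" for name, desc in skills.items()]
    if sum(len(entry) for entry in full) <= budget:
        return full
    return list(skills)  # fallback: names only, descriptions go silent


few = {"deploy-service": "Deploy to staging."}
many = {f"skill-{i}": "d" * 250 for i in range(20)}

few_listing = plan_listing(few, 200_000)
many_listing = plan_listing(many, 200_000)
```

With a window of 200,000 units the budget is 2,000, so twenty skills with full-length descriptions no longer fit and the sketch falls back to names alone.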
Q4 – The most valuable part of a skill: the Gotchas list
Instead of restating obvious steps, the skill should contain concrete “gotchas” that only a human who has hit the edge case would know. Examples include precise database column mappings or environment‑specific quirks. These high‑signal items let Claude avoid pitfalls that it cannot infer from code alone.
Q5 – Advanced skill features
Memory: a skill can write its own logs or JSON files (or even a SQLite DB) in its folder, enabling incremental reporting (e.g., daily stand‑up notes). The persistent directory is exposed via the CLAUDE_PLUGIN_DATA environment variable.
Scripts: bundling reusable functions lets Claude focus on orchestration rather than reinventing boilerplate code.
Temporary hooks: a skill can register a hook that lives only for the duration of the skill’s execution, providing safety nets (e.g., blocking dangerous commands) without permanently altering the global environment.
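As a rough illustration of the memory and hook ideas, the Python sketch below appends stand-up notes to a JSON Lines file and checks a command against a blocklist. Only the CLAUDE_PLUGIN_DATA variable comes from the article. The temp-directory fallback, the file layout, the patterns, and all function names are assumptions, and the way a hook is registered with Claude Code is not shown.

```python
import json
import os
import re
import tempfile
from datetime import date
from pathlib import Path
from typing import Optional


def data_dir() -> Path:
    """Persistent directory for this skill's data.

    CLAUDE_PLUGIN_DATA is the variable named in the article; the temp-dir
    fallback is an assumption so the sketch also runs elsewhere.
    """
    root = os.environ.get("CLAUDE_PLUGIN_DATA") or tempfile.gettempdir()
    path = Path(root) / "standup"
    path.mkdir(parents=True, exist_ok=True)
    return path


def append_note(note: str, day: Optional[date] = None) -> Path:
    """Append one note to a JSON Lines log and return the log's path."""
    log = data_dir() / "notes.jsonl"
    record = {"date": (day or date.today()).isoformat(), "note": note}
    with log.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return log


def notes_since(day: date) -> list:
    """Return notes dated on or after `day` (ISO dates sort as strings)."""
    log = data_dir() / "notes.jsonl"
    if not log.exists():
        return []
    with log.open(encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    return [r["note"] for r in records if r["date"] >= day.isoformat()]


# Hypothetical blocklist a temporary hook might consult before a command runs.
BLOCKED_PATTERNS = [r"\brm\s+-rf\b", r"\bDROP\s+TABLE\b", r"\bgit\s+push\s+--force\b"]


def is_dangerous(command: str) -> bool:
    """True if the command matches any blocked pattern (case-insensitive)."""
    return any(re.search(p, command, re.IGNORECASE) for p in BLOCKED_PATTERNS)
```

Because the notes accumulate across runs, a later invocation can report only what changed since the previous day instead of starting from scratch.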
Q6 – Distributing skills within a team
Two paths are recommended: (1) commit the skill into a shared code repository under .claude/skills, or (2) package it as a plugin and host it in an internal marketplace. The marketplace approach avoids the 1 % context budget issue because only the installed skills consume budget.
Skill approval is left to natural evolution: developers share a skill in a sandbox folder, gather usage, then promote it via a pull request to the official marketplace.
Q7 – Measuring skill usage
Anthropic instruments each skill with a PreToolUse hook that logs “who used which skill and when”. This data reveals (a) popular skills that deserve extra polishing (especially verification skills) and (b) “under‑triggered” skills whose descriptions need improvement.
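The kind of analysis this enables can be sketched in Python. The record shape (`user`, `skill`, `at`), the JSON Lines log, and the function names are all hypothetical; the article does not describe Anthropic's actual logging format or how the PreToolUse hook passes data.

```python
import json
import tempfile
from collections import Counter
from datetime import datetime, timezone
from pathlib import Path


def record_usage(log_path: Path, user: str, skill: str) -> None:
    """Append one 'who used which skill and when' record (hypothetical shape)."""
    record = {
        "user": user,
        "skill": skill,
        "at": datetime.now(timezone.utc).isoformat(),
    }
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")


def usage_counts(log_path: Path) -> Counter:
    """Count invocations per skill from the log."""
    counts = Counter()
    with log_path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                counts[json.loads(line)["skill"]] += 1
    return counts


def under_triggered(counts: Counter, installed: list, threshold: int = 1) -> list:
    """Installed skills used fewer than `threshold` times."""
    return sorted(name for name in installed if counts[name] < threshold)


# Demo: two skills are used, one installed skill never fires.
usage_log = Path(tempfile.mkdtemp()) / "skill_usage.jsonl"
record_usage(usage_log, "alice", "verify-checkout")
record_usage(usage_log, "bob", "verify-checkout")
record_usage(usage_log, "alice", "deploy-service")

counts = usage_counts(usage_log)
quiet = under_triggered(counts, ["verify-checkout", "deploy-service", "db-runbook"])
```

Skills that show up in `quiet` are candidates for a rewritten description, while the most-used ones are where extra polish pays off.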
Finally, the article condenses the seven questions into three takeaways: treat a skill as a full folder‑based system, craft a concise description under 250 characters that serves as a model‑oriented trigger, and prioritize verification‑type skills for the biggest impact.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
