How Agent Skills Solve LLM Development Pain Points and Gain Standard Status

The article analyses the emergence of Agent Skills as an open LLM standard, explains the technical shortcomings of current prompt‑centric workflows, describes the three‑layer skill architecture and its benefits for reuse, versioning and organization‑wide deployment, and discusses current limitations and future evolution paths.

Huawei Cloud Developer Alliance

Background

In December 2025, Anthropic announced Agent Skills as an open standard for large‑language‑model (LLM) agent systems. Microsoft quickly integrated the standard into VS Code and GitHub, and other leading development tools such as OpenCode, Cursor and Letta followed suit. Agent Skills is the second major LLM‑system standard to gain broad adoption, after the Model Context Protocol (MCP).

Current LLM Development Pain Points

Prompt length explosion: Real‑world projects continuously extend prompts with workflows, good/bad cases and detailed instructions, often exceeding the model’s token limit and causing attention dilution.

Ability reuse and version management: Abstracting a capability into a shared public API forces all consumers to upgrade together whenever that API changes, creating heavy testing burdens and instability.

Fragmented workflow logic: Complex tasks require a mix of deterministic program calls and probabilistic LLM reasoning. The logic ends up split between code and prompts, making hand‑over and review difficult.

Agent Skills Overview

A Skill is essentially a folder containing a SKILL.md file. The file holds two essential parts:

Metadata (name and description) that tells the LLM system when and how to invoke the skill.

Concrete instructions that define the step‑by‑step workflow, required parameters and output format.
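Since a skill is just a directory, its layout is easy to picture. The sketch below shows a typical structure; every file name other than SKILL.md is illustrative, and the auxiliary files are optional:

```text
customer-suggestion-extractor/
├── SKILL.md          # metadata + instructions (the only required file)
├── scripts/
│   └── extract.py    # optional helper script, referenced from SKILL.md
└── reference/
    └── schema.json   # optional data schema, opened only when needed
```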

Example metadata for a "Customer Suggestion Extractor" skill:

```yaml
name: "Customer Suggestion Extractor"
description: "Extract structured product improvement suggestions from user comments or descriptions. Use when a user submits a review, test report, experience share or forum post."
```

The skill’s instruction section then specifies the task, output format (JSON) and detailed rules.

Three‑Layer Architecture

Agent Skills adopts a progressive‑disclosure design with three layers:

Metadata layer: Loaded at startup, contains only the names and descriptions of all available skills (under 100 tokens). It is used for tool selection.

Skill body layer: Loaded on demand when a skill is relevant. It includes the core instructions, workflow and any small auxiliary files (typically under 5,000 tokens).

Additional files layer: Arbitrary files such as code scripts, data schemas or reference documents are referenced but not loaded into the prompt, removing token limits for large assets.

This design mirrors a “table of contents → chapter → appendix” reading pattern, allowing the LLM to load only what is needed.
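The three layers can be sketched as a simple loader: only names and descriptions stay resident after startup, and a skill's full body is read from disk the first time it is selected. The class and method names below are illustrative, not part of the standard:

```python
from pathlib import Path

class SkillRegistry:
    """Progressive disclosure: keep only metadata resident, load bodies on demand."""

    def __init__(self, skills_dir):
        self.skills_dir = Path(skills_dir)
        self.metadata = {}   # name -> description (layer 1, always in the prompt)
        self._bodies = {}    # name -> full SKILL.md text (layer 2, lazily loaded)

    def register(self, name, description):
        # Layer 1: a few dozen tokens per skill, loaded at startup for tool selection.
        self.metadata[name] = description

    def body(self, name):
        # Layer 2: read the full instructions only once the skill is selected.
        if name not in self._bodies:
            path = self.skills_dir / name / "SKILL.md"
            self._bodies[name] = path.read_text(encoding="utf-8")
        return self._bodies[name]
        # Layer 3: files referenced inside SKILL.md (scripts, schemas) are
        # executed or opened by tools, never inlined into the prompt.
```

The key design point is that the registry's memory footprint grows with the number of skills only at the cheap metadata layer; a rarely used skill costs almost nothing until it is actually invoked.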

Advantages Over Traditional Approaches

Reduced prompt length through on‑demand loading.

Fine‑grained reuse: skills can be stacked like LEGO blocks to build complex pipelines.

Organization‑level deployment: a single skill folder can be shared across projects, with independent version control.

Private customization: users can fork a public skill and modify only the parts that need adaptation.

Limitations and Open Issues

Despite its strengths, Agent Skills has several drawbacks:

No built‑in asynchronous state tracking, making it unsuitable for high‑concurrency or multi‑subtask scenarios.

Highly branching logic can lead to excessively long skill files, re‑introducing token‑limit problems.

Version stability and permission control are weak; changes to a skill are often made directly in the file without a clear audit trail.

Embedding executable scripts raises code‑injection and security concerns.

Current implementations lack robust integration with version‑control systems, making dependency tracking and change history difficult.

Future Directions

Proposed evolutions include:

Adopting Git‑style governance for skill publishing, review and permission management.

Enhancing stability by separating development and usage states, so that when a run fails, environment issues are ruled out before the skill code itself is modified.

Extending the architecture to support asynchronous task IDs and status callbacks.

Combining Skills with other paradigms such as Deep Research or agent‑based planning to handle more complex, non‑SOP tasks.

Typical Application Scenarios

Agent Skills shines in situations that require standardized, repeatable procedures:

SOP‑type tasks: Any workflow that can be expressed as a fixed sequence of steps (e.g., compliance checks, data extraction).

Highly reusable capabilities: Functions like OCR, safety detection or domain‑specific data synthesis that benefit from organization‑wide sharing.

Multi‑person collaboration: Teams can enforce a common skill set for code style checks, report generation or security auditing, ensuring consistency across contributors.

Practical Example

Two examples round out the picture: the SKILL.md body of the Customer Suggestion Extractor introduced earlier, followed by a skill for generating synthetic tax‑scenario data that demonstrates the end‑to‑end development workflow:

```markdown
# Customer Suggestion Extractor

## Task
Extract structured suggestions from product comments.

## Output Format
JSON object

## Rules
- If a product is mentioned, target that product; otherwise use the page context.
- Output as {"product_name": "...", "suggestion": "..."}
- Limit each suggestion to 1000 characters.
- Use a list structure for multiple products.

## Notes
- Avoid vague or malicious suggestions.
```
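The rules above are mechanically checkable, which makes the skill easy to test. A small validator for the extractor's JSON output might look like this; the function name and error messages are illustrative:

```python
import json

def validate_suggestions(raw: str) -> list[dict]:
    """Check extractor output against the skill's rules and return the records."""
    data = json.loads(raw)
    # Rule: a single product is one object; multiple products use a list.
    records = data if isinstance(data, list) else [data]
    for rec in records:
        if set(rec) != {"product_name", "suggestion"}:
            raise ValueError("each record needs exactly product_name and suggestion")
        if len(rec["suggestion"]) > 1000:
            raise ValueError("suggestion exceeds the 1000-character limit")
    return records
```

Running the validator after each skill invocation turns the output-format rules from prose the model may drift away from into hard constraints the pipeline enforces.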

The development steps for the tax‑scenario generator are:

Set up a Claude‑compatible LLM as the development assistant (e.g., doubao‑seed‑code‑preview‑251028).

Iteratively generate and refine the prompt, then store it as prompt.txt.

Write a small Python script to read a list of Huawei subsidiaries from an Excel file.

Implement the data‑generation API call (e.g., using doubao‑seed‑1‑8‑251228).

Use the /skill‑creator command to assemble the skill folder, linking the prompt, script and data files.

Run automated tests, fix bugs, and finally invoke the skill via a natural‑language command such as “Generate 5 tax‑scenario records with the tax‑scenedata‑generator skill”.

The skill folder is placed under .claude/, recognized by the Claude runtime, and consumes only 71 tokens for its description, demonstrating the compactness of the metadata layer.
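The data‑generation core of steps 3–4 can be sketched as follows. The `call_model` parameter is a stand‑in for the real API client (e.g., a doubao‑seed chat‑completion call), and the company list is passed in directly rather than read from Excel so the sketch stays self‑contained; in the actual skill a library such as openpyxl would read the spreadsheet:

```python
import json

def build_prompt(template: str, company: str) -> str:
    # Step 2's refined prompt (prompt.txt) with the company substituted in.
    # The {company} placeholder is an assumption about the template's format.
    return template.replace("{company}", company)

def generate_records(companies, template, call_model, n_per_company=5):
    """Step 4: call the generation model once per subsidiary and collect records."""
    records = []
    for company in companies:
        prompt = build_prompt(template, company)
        # call_model is a stand-in for the real API client; it is expected
        # to return a JSON array of synthetic records.
        raw = call_model(prompt, n=n_per_company)
        records.extend(json.loads(raw))
    return records

# Stubbed model, just to show the data flow without a live API:
def fake_model(prompt, n):
    return json.dumps([{"scenario": prompt, "amount": 100.0}] * n)
```

With the stub, `generate_records(["Subsidiary A"], "Tax data for {company}", fake_model, n_per_company=5)` yields five records, mirroring the natural‑language command "Generate 5 tax‑scenario records" above.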
