Perplexity’s Skill Design Secrets: Why Writing Skills Differs from Coding
The article dissects Perplexity’s internal best‑practice guide for building Agent Skills, showing how Skill design flips conventional coding wisdom, introduces a three‑tier context‑cost model, and provides a step‑by‑step workflow, maintenance tips, and real‑world examples.
Zen of Python vs. Zen of Skills
Perplexity jokes that the "Zen of Python" (PEP 20) is inverted for Skill creation. Roughly half of the Python aphorisms become opposite principles when writing a Skill.
Simple is better than complex → A Skill is a folder, not a single file; complexity itself is a feature.
Explicit is better than implicit → Activation relies on implicit pattern matching and progressive disclosure.
Sparse is better than dense → Context is expensive; each token must carry maximal signal.
Special cases aren't special enough to break the rules → Gotchas are the highest‑value content.
If the implementation is easy to explain, it may be a good idea → If something is easy to explain, the model already knows it; the Skill entry can be removed.
Bottom line: Writing a Skill is not writing software; it is constructing context for the model, with completely different constraints.
What Is a Skill?
Perplexity defines a Skill as a directory with a specific structure:
SKILL.md: frontmatter plus the main instructions.
scripts/: code the agent runs directly, so the model does not have to write it.
references/: heavy documentation, loaded on demand.
assets/: templates, schemas, data.
config.json: initial user configuration.
This hub‑and‑spoke layout lets a Skill stay compact while holding complex content.
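The layout above can be sketched as a small scaffolding helper. This is a minimal illustration, not Perplexity's tooling: the function name `scaffold_skill`, the frontmatter fields `name` and `description`, and the empty `config.json` are assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def scaffold_skill(root: Path, name: str, description: str) -> Path:
    """Create the hub-and-spoke layout for one Skill under root (hypothetical helper)."""
    skill_dir = root / name
    # Spokes: deterministic code, heavy docs, and templates/schemas.
    for sub in ("scripts", "references", "assets"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    # Hub: frontmatter plus the main instructions. Field names are assumed.
    skill_md = (
        f"---\nname: {name}\ndescription: {description}\n---\n\n"
        "# Instructions\n"
    )
    (skill_dir / "SKILL.md").write_text(skill_md, encoding="utf-8")
    # Initial user configuration starts empty in this sketch.
    (skill_dir / "config.json").write_text(json.dumps({}, indent=2), encoding="utf-8")
    return skill_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = scaffold_skill(
            Path(tmp), "tax-law", "Load when the user asks about tax-code rules."
        )
        print(sorted(p.name for p in path.iterdir()))
```

Keeping SKILL.md short and pushing bulk material into the subdirectories is what lets the hub stay cheap to load.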
Real‑World Example
When building a tax‑law Skill for the "Computer" agent, Perplexity initially tried to load 1,945 tax‑code entries in a single folder, which performed worse than not loading the Skill at all. By reorganising into three‑level topic nesting (≈300 topics → 20 areas → ~15 internal topics) and adding a custom search tool with progressive disclosure, the tax‑related capability became reliable.
“Each additional layer requires manual information‑architecture work, but once refined, the model’s lookup accuracy improves exponentially.”
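The article does not show the custom search tool, so the following is only a rough sketch of what progressive disclosure over a nested topic tree might look like. The tree contents, the entry ids, and the function `list_children` are all hypothetical; the point is that the agent sees one level at a time instead of every entry at once.

```python
from typing import Any

# Hypothetical three-level index: area -> topic -> entry ids.
TOPIC_TREE: dict[str, Any] = {
    "deductions": {
        "home-office": ["entry-101", "entry-102"],
        "charitable-gifts": ["entry-110"],
    },
    "credits": {
        "child-tax-credit": ["entry-201"],
    },
}


def list_children(path: tuple[str, ...] = ()) -> list[str]:
    """Return only the names one level below path, never the whole tree."""
    node: Any = TOPIC_TREE
    for key in path:
        node = node[key]  # a KeyError signals an unknown path
    if isinstance(node, dict):
        return sorted(node)
    return list(node)


if __name__ == "__main__":
    print(list_children())                               # top-level areas
    print(list_children(("deductions",)))                # topics in one area
    print(list_children(("deductions", "home-office")))  # entries in one topic
```

Each call returns a handful of names, so the cost of narrowing down to one entry stays small even when the full tree holds thousands of entries.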
Three‑Tier Context Cost
The core concept is a three‑tier cost model for loading context:
Index: Lists all visible Skills as name: description. Roughly 100 tokens per Skill; paid per session per user (global tax).
Load: Loads the full SKILL.md content (~5,000 tokens). Paid for the duration of the load (task tax); costs multiply if multiple Skills load simultaneously.
Runtime: Executes scripts/, references/, assets/, or sub‑Skills. No token limit; paid only when the agent actually reads the content.
Why the tiers matter:
Index tokens are a global tax; descriptions must be ultra‑concise.
Load tokens are a task tax; every loaded Skill adds cost, so each sentence must be useful.
Runtime tokens are unrestricted but only charged on actual use.
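To make the three tiers concrete, here is a back-of-the-envelope calculation using the approximate figures quoted above (about 100 tokens per indexed Skill, about 5,000 tokens per loaded SKILL.md). The function name and the example numbers of Skills are assumptions for illustration only.

```python
INDEX_TOKENS_PER_SKILL = 100   # approximate per-Skill index cost from the article
LOAD_TOKENS_PER_SKILL = 5_000  # approximate SKILL.md size from the article


def session_context_cost(
    visible_skills: int, loaded_skills: int, runtime_tokens: int = 0
) -> int:
    """Estimate the context tokens one session spends on Skills."""
    if loaded_skills > visible_skills:
        raise ValueError("cannot load more Skills than are visible")
    index_cost = visible_skills * INDEX_TOKENS_PER_SKILL  # global tax, always paid
    load_cost = loaded_skills * LOAD_TOKENS_PER_SKILL     # task tax, per loaded Skill
    return index_cost + load_cost + runtime_tokens        # runtime: only what is read


if __name__ == "__main__":
    # 40 visible Skills, 2 loaded, 1,200 tokens of references actually read:
    # 40 * 100 + 2 * 5,000 + 1,200 = 15,200
    print(session_context_cost(40, 2, 1_200))
```

Under these assumed numbers, the index alone costs 4,000 tokens before any task starts, which is why descriptions have to stay short.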
When Do You Really Need a Skill?
“If a hero query runs successfully without a Skill, you don’t need one.”
Scenarios that warrant a Skill:
The agent makes mistakes without special context.
Cross‑run consistency is critical.
Knowledge is stable but absent from the model’s training data (e.g., post‑cutoff or private processes).
Fine‑grained judgments that the model cannot learn from data, such as font‑selection preferences.
Scenarios where a Skill is unnecessary:
Simple git command sequences the model already knows.
Redundant content already present in the system prompt.
Rapidly changing endpoints that outpace maintenance speed.
Five‑Step Skill Development Process
Step 0 – Write evals: Gather real user queries, known failure cases, and near‑misses.
Step 1 – Write description: Start with “Load when…”, keep it under 50 words, describe user intent (preferably with a real query), and avoid describing the workflow.
Step 2 – Write body: Treat the Skill as a conversation with the LLM, not a human. Provide clear, flexible instructions; avoid overly deterministic command scripts.
Step 3 – Organise directory: Use scripts/ for deterministic logic, references/ for heavy documentation, assets/ for templates/schemas, and config.json for initial config.
Step 4 – Iterate: Run evaluations on a branch, then merge the full changeset together with its evaluation suite.
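The two mechanical rules from Step 1 (start with “Load when”, stay under 50 words) are easy to check automatically. The linter below is a hypothetical sketch, not part of Perplexity's guide; it cannot judge whether a description captures user intent, only whether it meets the form.

```python
def lint_description(description: str, max_words: int = 50) -> list[str]:
    """Return a list of problems with a Skill description (empty means it passes)."""
    problems: list[str] = []
    text = description.strip()
    if not text.lower().startswith("load when"):
        problems.append('should start with "Load when"')
    word_count = len(text.split())
    if word_count >= max_words:
        problems.append(f"has {word_count} words; keep it under {max_words}")
    return problems


if __name__ == "__main__":
    print(lint_description("Load when the user asks about quarterly tax filing deadlines."))
    print(lint_description("This skill runs a three-step workflow."))
```

A check like this could run alongside the Step 0 evals so that a description edit is rejected before it ever reaches the index.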
Maintenance: Gotcha Flywheel
After release, maintenance becomes the main effort:
If a task fails → add a Gotcha.
If an unwanted Skill loads → tighten description and add negative samples.
If a needed Skill fails to load → add keywords and positive samples.
If the system prompt changes → check for conflicts or duplication.
Skills are mostly append‑only; most updates are new Gotchas rather than description rewrites, because changing a description can ripple across all other Skills.
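The two routing failures in the list above (an unwanted load and a missed load) can be tracked with positive and negative sample queries. The sketch below assumes a `route` callable that reports whether a Skill would load for a query; in practice that decision comes from the agent itself, so `toy_route` is only a keyword stand-in invented for this example.

```python
from typing import Callable


def routing_report(
    route: Callable[[str], bool],
    positives: list[str],
    negatives: list[str],
) -> dict[str, list[str]]:
    """Collect queries where the Skill failed to load or loaded when it should not."""
    return {
        "missed_loads": [q for q in positives if not route(q)],
        "unwanted_loads": [q for q in negatives if route(q)],
    }


def toy_route(query: str) -> bool:
    """Stand-in router: loads the Skill whenever the query contains 'tax'."""
    return "tax" in query.lower()


if __name__ == "__main__":
    report = routing_report(
        toy_route,
        positives=[
            "How do I deduct my home office on my taxes?",
            "What is the filing deadline for Form 1040?",
        ],
        negatives=["Write a haiku about taxis"],
    )
    print(report)
```

Here the Form 1040 query is a missed load (a cue to add keywords and positive samples) and the taxi haiku is an unwanted load (a cue to tighten the description and add negative samples).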
Multi‑Model Evaluation
Perplexity’s Computer supports at least three model families (GPT, Claude Opus, Claude Sonnet). Since Sonnet and GPT exhibit notable behavioral differences on Skills, each Skill must be evaluated across models to avoid coupling to a single backend.
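One simple way to spot coupling to a single backend is to compare pass rates for the same eval cases across models. The sketch below is an assumption-laden illustration: the model labels, the pass/fail outcomes, and the 0.2 gap threshold are all made up, and the article does not describe how Perplexity scores its evals.

```python
def pass_rates(results: dict[str, list[bool]]) -> dict[str, float]:
    """Compute the fraction of eval cases each model passed."""
    rates: dict[str, float] = {}
    for model, outcomes in results.items():
        if not outcomes:
            raise ValueError(f"no eval outcomes recorded for {model}")
        rates[model] = sum(outcomes) / len(outcomes)
    return rates


def is_model_coupled(results: dict[str, list[bool]], max_gap: float = 0.2) -> bool:
    """Flag a Skill whose best and worst models differ by more than max_gap."""
    rates = pass_rates(results).values()
    return max(rates) - min(rates) > max_gap


if __name__ == "__main__":
    # Illustrative outcomes only; not real measurements.
    results = {
        "model-a": [True, True, True, True],
        "model-b": [True, True, True, False],
        "model-c": [True, False, False, True],
    }
    print(pass_rates(results))
    print(is_model_coupled(results))
```

A large gap suggests the Skill's wording leans on one model's habits and needs rephrasing before release.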
“Very few domestic providers do this.”
Key Takeaways
Skills are not just documentation; they are context bundles.
The description line is the hardest but most critical—it drives routing.
Gotchas are priceless; each failure should be captured as a Gotcha.
Every Skill incurs a token tax; ask whether the agent would fail without it before adding.
Always test across multiple models to ensure robustness.
Adding a Skill can unintentionally degrade unrelated Skills—watch for “action at a distance.”
For teams building Claude Skills or agents on Computer/Codex, Perplexity’s design guide is a valuable reference.
Old Zhang's AI Learning
AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.
