How to Write Workflow Skills: Patterns and Best Practices from 7 Top Projects

This article analyzes seven production‑grade workflow Skills from OpenAI, Google Labs, and others, extracting five reusable design patterns, essential front‑matter fields, and practical writing techniques to help you craft effective Skills that run reliably in LLM agents.

ITPUB

Jul 5, 2026

Skill definition

A Skill is a folder whose core file is SKILL.md. The file combines YAML front-matter with a Markdown body. When an LLM decides a Skill is needed, the skill tool loads the folder and injects the entire content of SKILL.md into the conversation context.

my-skill/
├── SKILL.md          # required
├── scripts/          # optional executables
├── references/       # optional docs (on‑demand)
├── resources/        # optional templates, checklists
└── examples/         # optional examples

Key mechanism: knowledge injection. The Skill does not create new tools; it provides instruction text that the LLM carries out with existing tools such as bash, read, and edit.
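To make the injection step concrete, here is a minimal sketch of how a loader might separate the front-matter from the body that gets injected. The function name `split_skill_file` and the sample text are hypothetical, and the parser only handles flat `key: value` lines; a real loader would presumably use a full YAML parser.

```python
from typing import Dict, Tuple


def split_skill_file(text: str) -> Tuple[Dict[str, str], str]:
    """Split SKILL.md text into (front-matter fields, Markdown body).

    Only flat `key: value` lines are handled (an assumption for this sketch).
    """
    lines = text.splitlines()
    stripped = [line.strip() for line in lines]
    if not stripped or stripped[0] != "---":
        return {}, text  # no front-matter block at all
    try:
        end = stripped.index("---", 1)  # position of the closing delimiter
    except ValueError:
        return {}, text  # unterminated front-matter: treat everything as body
    fields: Dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line and not line.lstrip().startswith("#"):
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return fields, body


SAMPLE = """---
name: my-skill
description: Use when the user asks to "deploy my app".
---

# My Skill

Always deploy as preview, not production.
"""

fields, body = split_skill_file(SAMPLE)
print(fields["name"])        # the identifier the agent would list
print(body.splitlines()[0])  # first line of the text that would be injected
```

The split mirrors the two roles of the file: the fields are what the model scans when deciding whether to load the Skill, and the body is the instruction text it then follows.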

Front‑matter

Required fields

name: unique identifier, lower-case hyphenated (e.g. test-driven-development).

description: most critical; the LLM scans this to decide whether to load the Skill. Good descriptions list trigger phrases, define timing, and include product keywords.

# Good description – includes trigger phrases and keywords
description: Deploy applications and websites to Vercel. Use when the user requests deployment actions like "deploy my app", "push this live", or "create a preview deployment".

# Good description – defines timing
description: Use when implementing any feature or bugfix, before writing implementation code.

# Bad description – too vague
description: Helps with deployment stuff
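The good and bad descriptions above can be told apart with a few crude heuristics. The sketch below is only an illustration: the function `lint_front_matter`, the eight-word minimum, and the "Use when" / quoted-phrase checks are assumptions chosen to match the examples, not rules from any Skill specification.

```python
import re
from typing import List

# Lower-case words joined by single hyphens, e.g. "test-driven-development".
NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")


def lint_front_matter(name: str, description: str) -> List[str]:
    """Return a list of human-readable problems (empty means it looks fine)."""
    problems = []
    if not NAME_RE.match(name):
        problems.append("name should be lower-case and hyphenated")
    if len(description.split()) < 8:  # hypothetical minimum length
        problems.append("description is too short to carry trigger information")
    has_timing = re.search(r"\buse when\b", description, re.IGNORECASE) is not None
    has_quoted_trigger = re.search(r'"[^"]+"', description) is not None
    if not (has_timing or has_quoted_trigger):
        problems.append("description has no 'Use when' clause or quoted trigger phrase")
    return problems


print(lint_front_matter("vercel-deploy", "Helps with deployment stuff"))
print(lint_front_matter(
    "test-driven-development",
    "Use when implementing any feature or bugfix, before writing implementation code.",
))
```

The vague description trips both the length check and the trigger check, while the timing-based one passes cleanly.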

Optional extension fields observed

references: declares the most important reference documents

allowed-tools: declares required tool permissions

type: workflow or component

best_for: list of ideal scenarios

scenarios: concrete trigger-scenario examples

estimated_time: estimated execution time

Five core design patterns

Pattern 1 – Linear Flow

When to use: deployment, installation, migration – any process with a clear step‑by‑step sequence.

Representative Skill: openai/skills – vercel-deploy (77 lines).

# Title
## Prerequisites
## Quick Start (Step 1 → 2 → 3)
## Fallback
## Troubleshooting

Key tricks:

Safe defaults (e.g. "Always deploy as preview, not production") to prevent dangerous actions.

Provide concrete bash commands for each step so the LLM does not have to guess.

Explicit timeout (e.g. "Use a 10‑minute (600000 ms) timeout").

Fallback scripts for CLI failures.

Negative directives (e.g. "Do not curl the deployed URL to verify").
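Two of the tricks above, the explicit timeout and the fallback for CLI failures, can be sketched as a small helper. This is an illustration under assumptions: `run_step` is a hypothetical name, and the demo commands are harmless Python one-liners standing in for a real deployment CLI and its fallback script.

```python
import subprocess
import sys
from typing import Optional, Sequence


def run_step(
    primary: Sequence[str],
    fallback: Optional[Sequence[str]] = None,
    timeout_s: float = 600.0,  # 10 minutes, matching the quoted 600000 ms
) -> str:
    """Run `primary`; if it fails or times out, try `fallback` once. Returns stdout."""
    for command in filter(None, (primary, fallback)):
        try:
            done = subprocess.run(
                list(command),
                capture_output=True,
                text=True,
                timeout=timeout_s,
                check=True,  # a non-zero exit code raises CalledProcessError
            )
            return done.stdout.strip()
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired, FileNotFoundError):
            continue  # fall through to the next command, if any
    raise RuntimeError("primary command and fallback both failed")


# Stand-ins for a failing CLI call and a working fallback script.
failing = [sys.executable, "-c", "import sys; sys.exit(1)"]
working = [sys.executable, "-c", "print('preview deployed')"]
print(run_step(failing, fallback=working, timeout_s=30))
```

In a Skill the same idea is written as prose plus concrete commands; the helper only shows the control flow the instructions are asking the agent to follow.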

Pattern 2 – Decision Tree + On‑Demand Loading

When to use: large platforms, product navigation, diagnostic flows.

Representative Skill: openai/skills – cloudflare-deploy (224 lines).

# Title
## Authentication
## Quick Decision Trees
### "I need to run code"
### "I need to store data"
### "I need AI/ML"
## Product Index

Key tricks:

User‑intent classification using natural‑language phrases.

Tree navigation, e.g. ├─ edge‑less‑function → workers/, for quick LLM location.

Progressive disclosure: main file ~7 KB, large references/ loaded only when needed.

Product index table for fast lookup.
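The combination of intent classification and progressive disclosure can be pictured as a routing table that points to reference files, which are read only when a branch is taken. The sketch below is an assumption-laden illustration: the phrases come from the outline above, but the file names under references/ and the helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional

# Hypothetical routing table: intent phrase -> reference file inside the Skill folder.
ROUTES: Dict[str, str] = {
    "run code": "references/workers.md",
    "store data": "references/storage.md",
    "ai/ml": "references/ai.md",
}


def route(request: str) -> Optional[str]:
    """Return the reference file for the first matching intent phrase, if any."""
    text = request.lower()
    for phrase, reference in ROUTES.items():
        if phrase in text:
            return reference
    return None


def load_reference(skill_dir: Path, request: str) -> str:
    """Read the matching reference on demand; return '' when nothing applies."""
    reference = route(request)
    if reference is None:
        return ""
    path = skill_dir / reference
    return path.read_text(encoding="utf-8") if path.exists() else ""


print(route("I need to store data"))
print(route("What is the weather?"))
```

Only the small routing table lives in the main file; the larger documents stay on disk until a request actually needs them.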

Pattern 3 – Iterative Loop (TDD‑style)

When to use: test‑driven development, code review, design review – any workflow that repeats "do → verify → improve".

Representative Skill: obra/superpowers – test-driven-development (371 lines).

# Title
## The Iron Law (non‑negotiable core principle)
## Red‑Green‑Refactor (loop body)
### RED – write a failing test
### Verify RED – confirm it fails
### GREEN – write minimal code
### Verify GREEN – confirm it passes
### REFACTOR – clean up
## Common Rationalizations (excuse rebuttal table)
## Verification Checklist (8‑item exit condition)

Key tricks:

Strong imperative tone (e.g. "Delete it. Start over.") to increase compliance.

Good/Bad comparison tags (<Good>, <Bad>) for teaching effect.

Excuse rebuttal table (12 common LLM excuses + counter‑arguments).

Checklist to ensure quality before exiting the loop.

Human fallback ("ask your human partner").
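The loop structure, with the checklist as the exit condition and a human fallback when it never passes, can be sketched as follows. The names `iterate_until_verified`, `attempt`, and `checks` are hypothetical, and the toy attempt function merely stands in for "write code, run tests".

```python
from typing import Callable, List


def iterate_until_verified(
    attempt: Callable[[int], str],
    checks: List[Callable[[str], bool]],
    max_rounds: int = 5,
) -> str:
    """Repeat do -> verify -> improve until every check passes."""
    for round_no in range(1, max_rounds + 1):
        candidate = attempt(round_no)  # "do" (and, on later rounds, "improve")
        if all(check(candidate) for check in checks):  # "verify" against the checklist
            return candidate
    # Human fallback: stop looping and escalate instead of guessing.
    raise RuntimeError("checklist still failing; ask your human partner")


# Toy stand-ins: each round produces a slightly longer result.
def toy_attempt(round_no: int) -> str:
    return "x" * round_no


checklist = [lambda result: len(result) >= 3, lambda result: result.isalpha()]
print(iterate_until_verified(toy_attempt, checklist))
```

The important property is that the loop cannot exit successfully unless every checklist item holds, which is the role the verification checklist plays in the Skill.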

Pattern 4 – Baton (Cross‑Session Persistence)

When to use: long‑term projects that span multiple LLM sessions.

Representative Skill: google-labs-code/stitch-skills – stitch-loop (203 lines).

# Title
## Overview (baton mode overview)
## The Baton System (file‑protocol)
## Execution Protocol (6‑step protocol)
### Step 1: Read the Baton
### Step 2: Consult Context Files
### Step 3: Generate
### Step 4: Integrate
### Step 5: Update Documentation
### Step 6: Prepare the Next Baton (critical!)
## File Structure Reference
## Orchestration Options

Key tricks:

File as state: next-prompt.md acts as the baton, so the LLM never has to remember the previous step.

Critical‑step marker (e.g. "Step 6: Critical + MUST") to avoid dead‑ends.

Clear file protocol: each file has a single responsibility.

Orchestration‑agnostic: works with CI/CD, human‑in‑the‑loop, or chained agents.
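The file-as-state idea can be illustrated with a few lines that read the baton at the start of a session and refuse to finish without writing the next one. Only the file name next-prompt.md comes from the Skill described above; the helper names and the demo task text are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable

BATON = "next-prompt.md"  # the baton file named in the Skill


def read_baton(project: Path) -> str:
    """Return the pending task, or '' when no baton exists yet."""
    path = project / BATON
    return path.read_text(encoding="utf-8").strip() if path.exists() else ""


def pass_baton(project: Path, next_task: str) -> None:
    """Write the task for the next session; an empty baton would be a dead end."""
    if not next_task.strip():
        raise ValueError("the next baton must not be empty")
    (project / BATON).write_text(next_task.strip() + "\n", encoding="utf-8")


def run_session(project: Path, work: Callable[[str], str]) -> str:
    """One session: read the baton, do the work, prepare the next baton."""
    task = read_baton(project)
    next_task = work(task)  # the work decides what should happen next
    pass_baton(project, next_task)
    return task


with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp)
    pass_baton(project, "Build the landing page")
    done = run_session(project, lambda task: "Build the pricing page")
    print(done)                 # task handled in this session
    print(read_baton(project))  # task waiting for the next session
```

Because the state lives in the file, any orchestrator (a CI job, a person, or another agent) can start the next session without knowing what happened in the previous one.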

Pattern 5 – Multi‑Stage + Checkpoints + Skill Orchestration

When to use: multi‑week processes that need Go/No‑Go decisions at key milestones.

Representative Skill: deanpeters/Product-Manager-Skills – discovery-process (502 lines).

# Title
## Key Concepts (core ideas + anti‑patterns)
## Phase 1: Frame the Problem
### Activities (which sub‑Skills to call)
### Outputs (phase deliverables)
### Decision Point 1 (YES/NO + time impact)
## Phase 2‑6… (repeat structure)
## Complete Workflow (end‑to‑end timeline)
## Common Pitfalls
## References (list of sub‑Skills)

Key tricks:

Unified phase template (Activities → Outputs → Decision Point) for quick LLM comprehension.

Decision checkpoints (e.g. "Is the problem saturated? YES → next phase, NO → +1 week").

Skill orchestration: schedule 10+ sub‑Skills per phase.

Time impact annotations on NO paths (e.g. "+2‑3 days", "+1 week").

Separate interaction protocol (reference workshop-facilitation for UI handling).
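The unified phase template with decision checkpoints and time-impact annotations maps naturally onto a small data structure. The sketch below is illustrative only: the `Phase` class, the base durations, and the second phase's name are hypothetical, while the "+1 week" penalty on a NO answer follows the example above.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Phase:
    name: str
    base_days: int                   # planned duration (hypothetical numbers)
    decision: Callable[[int], bool]  # Decision Point: attempt number -> YES/NO
    retry_days: int                  # time impact of a NO answer


def run_phases(phases: List[Phase], max_retries: int = 3) -> Tuple[int, List[str]]:
    """Walk the phases in order and return (total days, decision log)."""
    total_days, log = 0, []
    for phase in phases:
        total_days += phase.base_days
        attempt = 1
        while not phase.decision(attempt):
            if attempt > max_retries:
                raise RuntimeError(f"{phase.name}: still NO after {max_retries} retries")
            total_days += phase.retry_days
            log.append(f"{phase.name}: NO (+{phase.retry_days} days)")
            attempt += 1
        log.append(f"{phase.name}: YES")
    return total_days, log


plan = [
    Phase("Frame the Problem", 5, lambda attempt: attempt >= 2, 7),  # NO once, then YES
    Phase("Second phase (hypothetical)", 10, lambda attempt: True, 3),
]
total, decisions = run_phases(plan)
print(total)
for line in decisions:
    print(line)
```

Writing the NO cost next to each checkpoint is what lets the agent report a realistic end-to-end timeline instead of assuming every decision passes on the first try.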

Special Pattern – Thinking Framework

When to use: security audits, code reviews, architecture analysis – scenarios that require deep reasoning rather than quick execution.

Representative Skill: trailofbits/skills – audit-context-building (302 lines).

# Title
## Purpose (control thinking, not actions)
## When to Use / When NOT to Use
## Rationalizations (excuse rebuttal table)
## Phase 1: Initial Orientation
## Phase 2: Ultra‑Granular Function Analysis
### Per‑Function Checklist
### Cross‑Function Flow Analysis
### Output Requirements (format + quantitative thresholds)
### Completeness Checklist
## Phase 3: Global System Understanding
## Stability Rules (anti‑hallucination rules)
## Non‑Goals (explicit prohibitions)

Key tricks:

Thinking tools: first‑principles, 5 Why, 5 How to give the LLM a reasoning scaffold.

Quantitative thresholds (e.g. "at least 3 invariants, 5 assumptions per function").

Non‑goal constraints (e.g. "do not identify vulnerabilities, do not propose fixes").

Anti‑hallucination rule: "Never reshape evidence to fit earlier assumptions".

Sub‑Agent guidance for when and how to invoke a function‑analyzer agent.
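Quantitative thresholds are easy to check mechanically, which is part of why they work as a guard against shallow output. A minimal sketch, assuming each function's analysis is stored as a dictionary of lists; the structure and the name `missing_items` are hypothetical, while the minimums of 3 invariants and 5 assumptions follow the example above.

```python
from typing import Dict, List

# Minimums per analyzed function, as quoted in the threshold example.
MINIMUMS: Dict[str, int] = {"invariants": 3, "assumptions": 5}


def missing_items(analysis: Dict[str, List[str]]) -> Dict[str, int]:
    """Return how many entries are still missing per category (empty means complete)."""
    shortfall = {}
    for category, minimum in MINIMUMS.items():
        found = len(analysis.get(category, []))
        if found < minimum:
            shortfall[category] = minimum - found
    return shortfall


# Hypothetical analysis of one function.
transfer_analysis = {
    "invariants": ["balance never negative", "total supply constant", "caller is owner"],
    "assumptions": ["token is ERC-20 compliant", "no reentrancy from callee"],
}
print(missing_items(transfer_analysis))
```

An empty result is the signal that the per-function output requirements have been met and the analysis can move on.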

General writing techniques

Four weapons to stop LLM laziness

Strong imperative tone – LLMs tend to follow direct commands more reliably than soft suggestions.

Excuse rebuttal table – anticipate and block LLM self‑justifications.

Quantitative thresholds – set hard minimum standards.

Negative directives – explicitly forbid undesired actions.

Three effective teaching methods

Good/Bad comparison – contrast correct vs incorrect code.

Concrete commands – LLM excels at executing explicit shell snippets.

Full examples – show the exact expected output format.

Security and boundary principles

Safe defaults – choose the safest option by default (e.g., preview deployments).

Least‑privilege – only elevate permissions when absolutely necessary.

Human fallback – hand over to a human when the LLM is uncertain.

Three‑layer knowledge organization

Layer 1: Front‑matter (~100 tokens) – LLM scans description to decide loading.

Layer 2: SKILL.md body (2K-5K tokens) – core instructions, decision trees, steps.

Layer 3: references/ & resources/ (on‑demand) – detailed docs, examples, checklists read via the read tool.

Token budget guidelines

Front‑matter: ~100 tokens (name + description).

Main file: 2K-5K tokens.

Single reference doc: 1K-3K tokens.

Total context: <10K tokens (main file + 1-2 references).
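These budgets can be checked roughly before publishing a Skill. The sketch below is an approximation under stated assumptions: it estimates tokens as about 4 characters each (real tokenizers differ), treats the "~100" front-matter figure as a cap, and uses hypothetical names throughout.

```python
from typing import List

# Upper bounds taken from the guidelines above.
LIMITS = {"front-matter": 100, "main file": 5000, "reference": 3000, "total": 10000}


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return (len(text) + 3) // 4


def check_budget(front_matter: str, body: str, references: List[str]) -> List[str]:
    """Return warnings for every part that exceeds its budget."""
    sizes = [
        ("front-matter", estimate_tokens(front_matter)),
        ("main file", estimate_tokens(body)),
    ]
    sizes += [("reference", estimate_tokens(reference)) for reference in references]
    warnings = []
    for label, tokens in sizes:
        if tokens > LIMITS[label]:
            warnings.append(f"{label}: {tokens} tokens exceeds {LIMITS[label]}")
    total = sum(tokens for _, tokens in sizes)
    if total >= LIMITS["total"]:
        warnings.append(f"total: {total} tokens is not under {LIMITS['total']}")
    return warnings


# Placeholder texts sized to roughly 50, 6000, and 2000 estimated tokens.
print(check_budget("x" * 200, "x" * 24000, ["x" * 8000]))
```

Here only the main file is flagged, which suggests moving part of its content into a reference document that is loaded on demand.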

Decision‑tree for selecting a pattern

What does your Skill need to do?
│
├─ Execute a clearly‑step‑by‑step operation → Pattern 1: Linear Flow
│
├─ Choose the right direction among many options → Pattern 2: Decision Tree + On‑Demand Loading
│
├─ Repeatedly do → verify → improve in a single session → Pattern 3: Iterative Loop
│
├─ Persist progress across multiple sessions → Pattern 4: Baton
│
├─ Span multiple days/weeks with phases and Go/No‑Go decisions → Pattern 5: Multi‑Stage + Checkpoints
│
└─ Need deep analysis rather than quick execution → Special Pattern: Thinking Framework
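The same tree can be written as a lookup table, which is handy when a tool needs to suggest a pattern automatically. The `Need` enum and its member names are hypothetical labels for the six branches; the pattern names are those used in this article.

```python
from enum import Enum


class Need(Enum):
    """What the Skill has to do (one member per branch of the tree)."""
    STEP_BY_STEP = "execute a clear step-by-step operation"
    CHOOSE_DIRECTION = "choose the right direction among many options"
    REPEAT_AND_VERIFY = "repeat do -> verify -> improve in one session"
    CROSS_SESSION = "persist progress across multiple sessions"
    PHASED_WITH_GATES = "span days or weeks with Go/No-Go decisions"
    DEEP_ANALYSIS = "analyze deeply rather than execute quickly"


PATTERNS = {
    Need.STEP_BY_STEP: "Pattern 1: Linear Flow",
    Need.CHOOSE_DIRECTION: "Pattern 2: Decision Tree + On-Demand Loading",
    Need.REPEAT_AND_VERIFY: "Pattern 3: Iterative Loop",
    Need.CROSS_SESSION: "Pattern 4: Baton (Cross-Session Persistence)",
    Need.PHASED_WITH_GATES: "Pattern 5: Multi-Stage + Checkpoints",
    Need.DEEP_ANALYSIS: "Special Pattern: Thinking Framework",
}


def select_pattern(need: Need) -> str:
    """Return the pattern suggested for the given need."""
    return PATTERNS[need]


print(select_pattern(Need.CROSS_SESSION))
```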

Minimal linear‑pattern Skill template

---
name: my-skill
description: [One‑sentence purpose + when to trigger]
---

# Skill name

[One‑sentence core principle + safe default]

## Prerequisites
- [Precondition 1]
- [Precondition 2]

## Steps
### Step 1: [Action]
```bash
[Concrete command]
```
### Step 2: [Action]
```bash
[Concrete command]
```
### Step 3: [Action]
```bash
[Concrete command]
```

Reference URLs

Agent Skills open standard – https://agentskills.io/

Anthropic Skills template – https://github.com/anthropics/skills/tree/main/template

Anthropic Skills specification – https://github.com/anthropics/skills/tree/main/spec

OpenAI Skills repository – https://github.com/openai/skills

Obra Superpowers – https://github.com/obra/superpowers

Google Labs Stitch‑Skills – https://github.com/google-labs-code/stitch-skills

Dean Peters Product‑Manager‑Skills – https://github.com/deanpeters/Product-Manager-Skills

Trail of Bits Skills – https://github.com/trailofbits/skills

OpenClaw ClawHub – https://github.com/openclaw/clawhub

VoltAgent Awesome‑Agent‑Skills – https://github.com/VoltAgent/awesome-agent-skills

Travisvn Awesome‑Claude‑Skills – https://github.com/travisvn/awesome-claude-skills
