OpenSpec Best Practices: Three Labs Validate Five Quality Upgrades with Clear Results

The article walks through three hands-on labs (a bare run; adding Rules, Explore, and Validate; and a custom schema with a Review artifact) to test five quality-upgrade directions for OpenSpec. It compares outputs, task granularity, rollback plans, and test coverage across the labs, and closes with practical recommendations.

Shuge Unlimited

May 10, 2026

What You’ll See

✅ Bare‑run propose → apply → archive – real output

✅ Add Rules + Validate + Explore – low‑cost improvement

✅ Custom Schema + Review artifact – medium‑cost solution

✅ Five‑direction upgrade comparison table

Lab 1: Bare‑run propose → apply → archive (baseline)

This round runs the workflow without any quality measures, simply executing /opsx:propose → /opsx:apply → /opsx:archive to see what the AI produces by default.

Propose: one‑click artifact generation

/opsx:propose

The command creates a todo-priority change directory and generates four artifacts:

**Change created: `todo-priority`**
Location: `openspec/changes/todo-priority/`
Artifacts created:
- **proposal.md** — defines the problem and motivation
- **design.md** — technical decisions, default priority, enum validation, sort order, timestamp format
- **specs/todo-priority/spec.md** — detailed requirements with normal, null, empty, invalid scenarios
- **tasks.md** — 6 implementation task groups with paired test tasks
All artifacts complete! Ready for implementation.

⚠️ Note: propose may skip the requirements-clarification step when the change name is unambiguous; to force clarification, run /opsx:explore before proposing.

Apply: execute tasks

/opsx:apply

The AI reads the artifacts and runs each task in order, marking each one completed. The first test run fails on one case ("should reject undefined priority type"); the AI fixes the code on its own, and the second run passes:

# First run – 1 failure
FAIL src/store.test.ts > priority validation > should reject undefined priority type
Test Files 1 failed (1)
Tests 1 failed | 17 passed (18)

# After fix – all pass
Test Files 1 passed (1)
Tests 18 passed (18)
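
The failing case is an input-validation check. The article does not show the lab's src/store.ts, so the following is only a rough illustration of the kind of guard that would satisfy such a test. The names PRIORITIES, isPriority, and parsePriority are hypothetical, and the three levels are an assumption borrowed from the Lab 2 discussion:

```typescript
// Hypothetical sketch: none of these names come from the lab's actual code.
const PRIORITIES = ["low", "medium", "high"] as const;
type Priority = (typeof PRIORITIES)[number];

// Type guard: true only for one of the known priority strings.
function isPriority(value: unknown): value is Priority {
  return (
    typeof value === "string" &&
    (PRIORITIES as readonly string[]).indexOf(value) !== -1
  );
}

// Rejects anything outside the enum, including undefined, null, and numbers.
function parsePriority(value: unknown): Priority {
  if (!isPriority(value)) {
    throw new TypeError(`invalid priority: ${String(value)}`);
  }
  return value;
}
```

Applying a default for a missing priority is deliberately left to the caller here, so that "missing" and "invalid" stay separate cases.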

Lab 1 Diagnosis

Spec coverage: ✅ (null, empty, invalid values covered)
Task granularity: ⚠️ (6 groups, 15 tasks; the AI added an ISO-8601 timestamp format change on its own, which is a breaking change)
Test tasks: ✅ (each group has a paired test task)
Rollback plan: ⚠️ (design mentions migration but no concrete SQL)

Conclusion: the four‑step method guarantees process completeness but does not ensure that the AI‑inferred requirements match the user’s intent.

Lab 2: Add Rules + Validate + Explore (low‑cost)

Lab 1 exposed issues that can be mitigated without changing OpenSpec’s schema. The workflow adds an Explore step to clarify requirements, writes a config.yaml with rules, and validates the output.

Step 1 — Explore to clarify requirements

/opsx:explore

The AI asks four questions: (1) Should the enum be extensible? (2) How should null values be handled? (3) What sorting weight should each level have? (4) What is the migration strategy?

Once these are answered, the AI proposes a concrete solution:

## Recommended solution
- Type: Enum (Low / Medium / High)
- DB: NOT NULL, default 'medium'
- Sort order: HIGH → MEDIUM → LOW
- Migration: one‑time UPDATE priority='medium'
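
To make these decisions concrete, here is a minimal TypeScript sketch of the default value and the sort order. The Todo shape and every identifier are hypothetical; only the three levels, the 'medium' default, and the HIGH → MEDIUM → LOW ordering come from the recommendation above:

```typescript
// Hypothetical model: TodoPriority, Todo, and the helpers are illustrative only.
type TodoPriority = "low" | "medium" | "high";

interface Todo {
  title: string;
  priority: TodoPriority;
}

// Mirrors "NOT NULL, default 'medium'" at the application level.
const DEFAULT_PRIORITY: TodoPriority = "medium";

// A higher weight sorts first, which gives HIGH → MEDIUM → LOW.
const PRIORITY_WEIGHT: Record<TodoPriority, number> = {
  high: 3,
  medium: 2,
  low: 1,
};

function createTodo(title: string, priority: TodoPriority = DEFAULT_PRIORITY): Todo {
  return { title, priority };
}

function sortByPriority(todos: readonly Todo[]): Todo[] {
  // Copy first so the caller's array is not mutated.
  return todos
    .slice()
    .sort((a, b) => PRIORITY_WEIGHT[b.priority] - PRIORITY_WEIGHT[a.priority]);
}
```

Keeping the weights in one lookup table means a new level only needs one new entry, which is one way to address the "extensible enum" question.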

Step 2 — Create config.yaml with Rules

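rules:
  specs:
    - "Every change to a data field must cover three exception scenarios: null, empty, and out-of-range"
    - "Scenarios must use a level-4 heading (####); otherwise they are silently ignored"
  design:
    - "Any design that involves a database migration must include a rollback plan"
  tasks:
    - "Every implementation task must be paired with a test task, written directly below the implementation task"
    - "A single task must not exceed 30 minutes of work"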

Key points: rules are standalone constraint lists keyed by artifact ID (specs, design, tasks), and the file lives at openspec/config.yaml under the project root.
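
The heading rule can also be checked mechanically rather than left to the AI. The sketch below is not part of OpenSpec: findMisleveledScenarios is a hypothetical helper, and it assumes scenario headings start with the text "Scenario:". It flags any such heading that is not at level 4:

```typescript
// Hypothetical lint helper, not an OpenSpec API.
interface ScenarioIssue {
  line: number; // 1-based line number in the spec file
  level: number; // heading level that was actually used
  text: string; // the offending line
}

function findMisleveledScenarios(specMarkdown: string): ScenarioIssue[] {
  const issues: ScenarioIssue[] = [];
  specMarkdown.split(/\r?\n/).forEach((text, index) => {
    // Capture the run of '#' characters in front of "Scenario:".
    const hashes = /^(#{1,6})\s+Scenario:/.exec(text)?.[1];
    if (hashes !== undefined && hashes.length !== 4) {
      issues.push({ line: index + 1, level: hashes.length, text });
    }
  });
  return issues;
}
```

A check like this could run before propose output is accepted, so a scenario written at the wrong level is reported instead of being dropped without notice.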

Step 3 — Propose with rules applied

/opsx:propose

The output is similar to Lab 1 but now respects the rules (e.g., test tasks are paired, task count is 19).
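
Whether the pairing rule was actually followed can be spot-checked with a few lines of code. This sketch rests on two assumptions that the article does not confirm: tasks are Markdown checkboxes ("- [ ] 1.1 ..."), and a test task is any task whose description begins with "Test". findUnpairedTasks is a hypothetical name:

```typescript
// Hypothetical checker; the tasks.md conventions below are assumptions.
function findUnpairedTasks(tasksMarkdown: string): string[] {
  // Collect the text of every checkbox item, checked or not.
  const tasks: string[] = [];
  for (const line of tasksMarkdown.split(/\r?\n/)) {
    const text = /^\s*- \[[ xX]\]\s+(.+)$/.exec(line)?.[1];
    if (text !== undefined) tasks.push(text.trim());
  }

  // A test task starts with "Test", optionally after a number such as "1.2".
  const isTest = (task: string): boolean =>
    /^(\d+(\.\d+)*\s+)?test\b/i.test(task);

  // An implementation task is unpaired when the next task is not a test task.
  const unpaired: string[] = [];
  tasks.forEach((task, index) => {
    if (isTest(task)) return;
    const next = tasks[index + 1];
    if (next === undefined || !isTest(next)) unpaired.push(task);
  });
  return unpaired;
}
```

An empty result means every implementation task is immediately followed by a test task, which is what the tasks rule asks for.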

Lab 2 Outcome

Exception coverage: ✅ (more complete; a case for null input returning 400 was added)
Rollback plan: ✅ (full plan present)
Test tasks: ✅ (paired)
Task granularity: 6 groups, 19 tasks (finer)
Requirement source: ✅ (user-confirmed via explore)

Key insight: the quality boost mainly comes from the multi‑round requirement clarification in the Explore stage, not from the rules themselves.

Lab 3: Custom Schema + Review artifact (medium‑cost)

Lab 2 patches the baseline without altering OpenSpec's structure. Lab 3 forks the spec-driven schema and inserts a Review artifact between design and tasks, creating a structural quality gate.

Step 1 — List available schemas

openspec schemas
Available schemas:
  spec-driven

Step 2 — Fork schema

openspec schema fork spec-driven with-review
✔ Forked 'spec-driven' to 'with-review'

Step 3 — Edit schema to insert Review

artifacts:
- id: proposal
  generates: proposal.md
  template: proposal.md
- id: specs
  generates: specs/**/*.md
  template: spec.md
- id: design
  generates: design.md
  template: design.md
- id: review
  generates: review.md
  template: review.md
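  description: Five-dimension review
  instruction: |
    Review the completeness of all artifacts along five dimensions:
    1. Boundary conditions
    2. Rollback plan
    3. Test coverage
    4. Backward compatibility
    5. Task granularity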
  requires: [proposal, specs, design]
- id: tasks
  generates: tasks.md
  template: tasks.md
  requires: [review]

The new review artifact must have a review.md template in openspec/schemas/with-review/templates/.
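
The requires lists form a dependency graph, and the generation order follows from it. The sketch below is not OpenSpec's own resolver; it is an ordinary depth-first topological sort over the ids and requires values from the schema above, with ArtifactDef and generationOrder as hypothetical names:

```typescript
// Illustrative only: a generic topological sort, not OpenSpec internals.
interface ArtifactDef {
  id: string;
  requires?: string[];
}

// Ids and requires lists copied from the with-review schema.
const withReviewArtifacts: ArtifactDef[] = [
  { id: "proposal" },
  { id: "specs" },
  { id: "design" },
  { id: "review", requires: ["proposal", "specs", "design"] },
  { id: "tasks", requires: ["review"] },
];

function generationOrder(defs: readonly ArtifactDef[]): string[] {
  const byId = new Map<string, ArtifactDef>();
  for (const def of defs) byId.set(def.id, def);

  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (id: string): void => {
    const def = byId.get(id);
    if (def === undefined) throw new Error(`unknown artifact: ${id}`);
    const current = state.get(id);
    if (current === "done") return;
    if (current === "visiting") throw new Error(`dependency cycle at: ${id}`);
    state.set(id, "visiting");
    // Every dependency must be emitted before the artifact itself.
    for (const dep of def.requires ?? []) visit(dep);
    state.set(id, "done");
    order.push(id);
  };

  for (const def of defs) visit(def.id);
  return order;
}
```

For the schema above, generationOrder(withReviewArtifacts) returns proposal, specs, design, review, tasks: tasks can only come after review, which is the structural gate this lab is after.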

Step 4 — Create change with new schema

openspec new change todo-priority-v3 --schema with-review

Step 5 — Propose and observe five artifacts

**Change created: `todo-priority-v3`**
Artifacts created:
- proposal.md …
- design.md …
- specs/todo-priority/spec.md …
- review.md … (five‑dimensional quality review)
- tasks.md … (22 tasks, includes review recommendations)
All artifacts complete! Ready for implementation.

Review.md Sample

# Quality Review
## 1. Boundary Conditions
**Status**: ✅ Pass
**Findings**: Specs cover null, empty, invalid, 404, etc.

## 2. Rollback Plan
**Status**: ✅ Pass
**Findings**: Design states no rollback needed; migration is idempotent.

## 3. Test Coverage
**Status**: ⚠️ Warning
**Findings**: Test tasks are paired but need explicit verification.

## 4. Backward Compatibility
**Status**: ✅ Pass
**Findings**: Priority is additive; existing API unchanged.

## 5. Task Granularity
**Status**: ✅ Pass
**Findings**: 22 tasks, respects 30‑minute rule.

## Overall Assessment
All five dimensions pass. The change is well‑scoped with clear requirements and migration plan.
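
Because the per-dimension statuses are plain text, a script can read them independently of the review's own summary. The following sketch assumes the exact "## N. Title" headings and "**Status**:" lines of the sample; parseReview, classifyStatus, and allDimensionsPass are hypothetical helpers, not OpenSpec features:

```typescript
// Hypothetical parser for the review.md layout shown in the sample.
type ReviewStatus = "pass" | "warning" | "fail" | "unknown";

interface DimensionResult {
  dimension: string;
  status: ReviewStatus;
}

function classifyStatus(raw: string): ReviewStatus {
  const text = raw.toLowerCase();
  if (text.indexOf("pass") !== -1) return "pass";
  if (text.indexOf("warn") !== -1) return "warning";
  if (text.indexOf("fail") !== -1) return "fail";
  return "unknown";
}

function parseReview(markdown: string): DimensionResult[] {
  const results: DimensionResult[] = [];
  let dimension: string | undefined;
  for (const line of markdown.split(/\r?\n/)) {
    // Numbered level-2 headings open a new dimension.
    const heading = /^##\s+\d+\.\s+(.+)$/.exec(line);
    if (heading !== null) {
      dimension = heading[1];
      continue;
    }
    // The first Status line after a heading belongs to that dimension.
    const statusMatch = /^\*\*Status\*\*:\s*(.+)$/.exec(line);
    if (statusMatch !== null && dimension !== undefined) {
      results.push({ dimension, status: classifyStatus(statusMatch[1] ?? "") });
      dimension = undefined;
    }
  }
  return results;
}

// Strict gate: anything other than "pass" blocks.
function allDimensionsPass(results: readonly DimensionResult[]): boolean {
  return results.length > 0 && results.every((r) => r.status === "pass");
}
```

Run against the sample above, this yields four pass results and one warning (Test Coverage), so a strict gate would not agree with the summary line that all five dimensions pass.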

Lab 3 Outcome

Exception coverage: ✅ (review confirms)
Rollback plan: ✅ (review confirms)
Test coverage: ✅ (review suggestions applied)
Task granularity: 22 tasks (more detailed)
Review checkpoints: ✅ (design→review→tasks enforced)
Requirement source: ✅ (user-confirmed)

Core advantage: the Review artifact creates a mandatory quality gate; tasks will not be generated until the review passes.

Five Upgrade Directions – Verdict

#1 Spec Review – ✅ effective (validate + review)
#2 Atomic tasks + checkpoints – ✅ effective (requires chain)
#3 Runtime verification – ⚠️ partially effective (needs external CI)
#4 TDD mindset – ✅ effective (rules enforce test pairing)
#5 Quality gate before archive – ✅ effective (review as gate)

Practical Recommendations

Start with Lab 1 to become familiar with the OpenSpec workflow.

For personal or fast-iteration projects, adopt Lab 2 (Rules + Explore + Validate): setup is simple and takes about 5 minutes.

For team projects requiring stronger quality control, use Lab 3 (custom schema with Review) – structural gate ensures alignment.

For production environments, combine Lab 3 with an external CI pipeline to achieve runtime verification.

Honest Assessment of OpenSpec Limits

OpenSpec can enforce documentation alignment, structural validation, and schema‑driven quality gates, covering four of the five upgrade goals. It cannot execute runtime tests; that requires separate test frameworks and CI pipelines.

The Review artifact is generated by the same AI that wrote the other artifacts, so its assessment tends to be lenient. Treat it as a preliminary check rather than a final audit.
