Boost OpenSpec Code Quality to 80% with a Single Config Change

The article analyzes three rounds of lab experiments that reveal task‑granularity as the key lever for AI‑generated code quality, introduces the 2/8 rule, and details a three‑step configuration (config.yaml, schema fork with upgraded instruction, and review artifact) that raises quality scores to about 80% without modifying OpenSpec source code.

Shuge Unlimited

May 11, 2026

1. Why Task Granularity Matters

Three lab rounds repeatedly showed that when tasks are defined coarsely, the AI adds unintended changes during the apply phase. For example, a change named todo‑priority caused the AI to introduce an ISO‑8601 timestamp format change that was never requested.

Coarse tasks look like:

- [ ] 1.1 Implement user registration API
- [ ] 1.2 Add input validation
- [ ] 1.3 Handle exception cases

The AI then guesses details (email format, validation rules, etc.) and may produce incorrect implementations.

Fine‑grained tasks look like:

### Task 1: Email format validation
- [ ] Step 1: Write failing test
<code>test('invalid email returns 400', () => {
  const result = register({ email: 'abc' });
  expect(result.status).toBe(400);
});</code>
- [ ] Step 2: Run test – expect FAIL
- [ ] Step 3: Write minimal implementation
- [ ] Step 4: Run test – expect PASS
- [ ] Step 5: Commit (git command)

With such detailed steps the AI has no room to improvise; it simply follows the instructions.
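As an illustration of Step 3, the sketch below shows what a minimal implementation for the failing test might look like. The `RegisterInput` and `RegisterResult` shapes, the email pattern, and the 201 status for valid input are assumptions made for this example; the article only shows `register({ email })` and `result.status`.

```typescript
// Hypothetical request/response shapes; the article only shows
// register({ email }) and result.status.
interface RegisterInput {
  email: string;
}

interface RegisterResult {
  status: number;
}

// Deliberately simple pattern (assumption): one "@", no whitespace,
// and a dot in the domain part. The real rule belongs in the spec.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Minimal implementation for Step 3: just enough to turn the test green.
function register(input: RegisterInput): RegisterResult {
  if (!EMAIL_PATTERN.test(input.email)) {
    return { status: 400 };
  }
  // 201 for a valid address is an assumption; the test only pins the 400 path.
  return { status: 201 };
}
```

Because the task already fixes the test and the expected status, the implementation has little left to decide, which is the point of fine-grained steps.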

2. The 2/8 Rule – 20% of Changes Deliver 80% of Quality

The three labs point to five improvement directions. The table below rates each one by cost‑benefit, with the most effective direction listed first:

| Improvement Direction | What to Do | Effect | Cost‑Benefit |
|----------------------|------------|--------|--------------|
| **Upgrade tasks instruction** | Edit a single config line | AI generates fine‑grained tasks, eliminating most quality issues | **Very high** |
| Add code review | Separate sub‑agent or manual review | Detects generated problems | Medium |
| Add pre‑archive validation | Enable expanded workflow | Catch missed items | Medium |
| Write finer rules | Tune <code>config.yaml</code> rules | AI compliance varies | Low |
| Demand clarification | Multi‑round dialogue (Explore) | Good effect but time‑consuming | Medium‑high |

Only the first direction, changing the tasks instruction, requires editing a single field, leaves the OpenSpec source untouched, and accounts for roughly 80% of the quality improvement.

3. Three‑Step Configuration

Step 1 – Create config.yaml

The file supplies global context and per‑artifact rules. Example excerpt:

schema: with-review

context: |
  Tech stack: TypeScript, Express, Vitest
  Test command: npx vitest run
  All new features follow TDD – write failing test first

rules:
  specs:
    - Every data‑field change must cover null, empty, and out‑of‑range cases
    - "Scenarios must use #### level‑4 headings"
  design:
    - Database migrations must include rollback plans
  tasks:
    - Each task must contain full test code and implementation code
    - "First step: write failing test; last step: verify pass"
  review:
    - Check task granularity (2‑5 min per step)
    - Flag placeholders (TBD, TODO, implement later)

Step 2 – Fork Schema and Upgrade Tasks Instruction

Run the experimental command:

openspec schema fork spec-driven with-review

Then, in openspec/schemas/with-review/schema.yaml, replace the instruction of the tasks artifact with:

instruction: |
  Create fine‑grained implementation plans. Each task should take 2‑5 minutes.

  Every task must follow this format:
  - File path (exact)
  - Step 1: Write failing test (full code)
  - Step 2: Run test – expect failure (command + output)
  - Step 3: Write minimal implementation (full code)
  - Step 4: Run test – expect pass (command + output)
  - Step 5: Commit (git command)

  Prohibited items (plan invalid):
  - TBD, TODO, implement later
  - "Add appropriate error handling" (must provide concrete code)
  - "Write tests for the above code" (must provide concrete test code)
  - Describe what to do without showing how

This gives the AI a strict template and a list of forbidden shortcuts, forcing it to produce concrete, test‑driven steps.
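The prohibited-items list can also be checked mechanically after generation. The following is a minimal sketch, assuming the content of tasks.md has already been read into a string; `findPlaceholders` and `PlaceholderHit` are hypothetical names and not part of OpenSpec.

```typescript
// Phrases taken from the "Prohibited items" list in the instruction above.
const PROHIBITED_PHRASES: string[] = [
  "TBD",
  "TODO",
  "implement later",
  "add appropriate error handling",
  "write tests for the above code",
];

interface PlaceholderHit {
  line: number;   // 1-based line number in tasks.md
  phrase: string; // prohibited phrase that matched
}

// Hypothetical helper: scans tasks.md content line by line and reports
// every prohibited phrase (case-insensitive substring match).
function findPlaceholders(tasksMarkdown: string): PlaceholderHit[] {
  const hits: PlaceholderHit[] = [];
  tasksMarkdown.split(/\r?\n/).forEach((text, index) => {
    const lowered = text.toLowerCase();
    PROHIBITED_PHRASES.forEach((phrase) => {
      if (lowered.indexOf(phrase.toLowerCase()) !== -1) {
        hits.push({ line: index + 1, phrase });
      }
    });
  });
  return hits;
}

const sample = [
  "### Task 1: Email format validation",
  "- [ ] Step 1: Write failing test",
  "- [ ] Step 3: Add appropriate error handling",
  "- [ ] Step 4: TODO decide on status code",
].join("\n");

console.log(findPlaceholders(sample));
```

A plain substring match is crude: it would also flag a legitimate identifier such as the change name todo‑priority, so a real check would probably need an allowlist or stricter matching.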

Step 3 – Insert a Review Artifact

Add a new review artifact between design and tasks:

- id: review
  generates: review.md
  template: review.md
  description: Five‑dimensional review
  instruction: |
    Review all artifacts for completeness and quality.
    Review dimensions:
    1. Boundary conditions  2. Rollback plan  3. Test coverage
    4. Backward compatibility  5. Task granularity (most important)
  requires: [proposal, specs, design]

Change the requires field of tasks from [specs, design] to [review], making the dependency chain:

proposal → specs → design → review → tasks → APPLY

The review step acts as a gate; tasks are generated only after a successful review.
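To see why changing the requires field places review in front of tasks, the dependency chain can be modelled as a small topological sort. This is only a sketch of the idea, not OpenSpec's actual resolver. The requires values for review and tasks come from the article, while those for proposal, specs, and design are assumptions for illustration.

```typescript
interface Artifact {
  id: string;
  requires: string[];
}

// requires for review and tasks come from the article; the values for
// proposal, specs, and design are assumptions for illustration.
const artifacts: Artifact[] = [
  { id: "proposal", requires: [] },
  { id: "specs", requires: ["proposal"] },
  { id: "design", requires: ["specs"] },
  { id: "review", requires: ["proposal", "specs", "design"] },
  { id: "tasks", requires: ["review"] },
];

// Repeatedly emits every artifact whose requirements are all done
// (a simple topological sort). Throws on cycles or unknown IDs.
function generationOrder(items: Artifact[]): string[] {
  const done: string[] = [];
  let pending = items.slice();
  while (pending.length > 0) {
    const ready = pending.filter((a) =>
      a.requires.every((dep) => done.indexOf(dep) !== -1)
    );
    if (ready.length === 0) {
      throw new Error(
        "Cyclic or unknown dependency in: " +
          pending.map((a) => a.id).join(", ")
      );
    }
    ready.forEach((a) => done.push(a.id));
    pending = pending.filter((a) => done.indexOf(a.id) === -1);
  }
  return done;
}

console.log(generationOrder(artifacts).join(" → "));
// prints "proposal → specs → design → review → tasks"
```

Since tasks now requires only review, and review requires everything before it, tasks cannot be generated until review.md exists.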

4. Full Daily Workflow (6 Phases)

Phase 1 – Demand Clarification (/opsx:explore) – optional but recommended to resolve ambiguous requirements before proposing.

Phase 2 – Generate Artifacts (/opsx:propose) – with the with-review schema this creates proposal.md, specs/, design.md, review.md, and tasks.md. The upgraded instruction ensures each task takes 2‑5 minutes and includes full code.

Phase 3 – Manual Check – quickly scan review.md for task‑granularity status and overall suggestions.

Phase 4 – Execute Tasks (/opsx:apply) – the AI follows the step‑by‑step tasks, leaving no room for improvisation.

Phase 5 – Consistency Verification (/opsx:verify) – a text‑level check that the implementation matches the spec's intent (it does not run tests).

Phase 6 – Archive (/opsx:archive) – bundle the change directory after confirming all tasks are completed.
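Confirming that all tasks are completed before archiving can be done by counting the checkboxes in tasks.md. The sketch below assumes the file content is already available as a string. `taskProgress` and `readyToArchive` are hypothetical helpers and say nothing about what /opsx:archive itself checks.

```typescript
interface TaskProgress {
  total: number;
  done: number;
  open: string[]; // text of every unchecked item
}

// Matches Markdown checkboxes such as "- [ ] Step 1" or "- [x] Step 1".
const CHECKBOX = /^\s*- \[( |x|X)\]\s+(.*)$/;

function taskProgress(tasksMarkdown: string): TaskProgress {
  const progress: TaskProgress = { total: 0, done: 0, open: [] };
  tasksMarkdown.split(/\r?\n/).forEach((line) => {
    const match = CHECKBOX.exec(line);
    if (match === null) {
      return; // headings, code, and prose are ignored
    }
    progress.total += 1;
    if (match[1] === " ") {
      progress.open.push(match[2]);
    } else {
      progress.done += 1;
    }
  });
  return progress;
}

// Archive only when at least one task exists and none are open.
function readyToArchive(tasksMarkdown: string): boolean {
  const p = taskProgress(tasksMarkdown);
  return p.total > 0 && p.open.length === 0;
}

const inProgress =
  "- [x] Step 1: Write failing test\n- [ ] Step 2: Run test";
console.log(readyToArchive(inProgress), taskProgress(inProgress).open);
```

Listing the open items, rather than returning only a boolean, makes it easier to see which step was skipped during the apply phase.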

5. Three Layers of Defense

Layer 1 – Control at the Source (tasks instruction upgrade) contributes ~80% of the quality improvement by eliminating problems at generation time.

Layer 2 – Process Checks (review + verify) act as a safety net for missed boundary conditions, rollback plans, or spec‑code mismatches.

Layer 3 – Final Confirmation (pre‑archive manual review) is a quick human check for any remaining issues.

The philosophy is “heavy on the first layer, light on the second, fast on the third.”

6. Practical Tips & Pitfalls

Artifact IDs in rules must exactly match the schema IDs; otherwise the rules are silently ignored.

openspec schema fork is experimental; future versions may change the command syntax.

/opsx:verify only checks textual consistency; actual runtime validation relies on your test framework.
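Since mismatched rule keys fail without any warning, a small check can surface them early. This sketch assumes config.yaml and the schema have already been parsed into plain objects; YAML loading is left out. `unmatchedRuleIds` is a hypothetical helper, not an OpenSpec command.

```typescript
// Returns rule keys that do not match any artifact ID in the schema.
function unmatchedRuleIds(
  rules: Record<string, string[]>,
  schemaArtifactIds: string[]
): string[] {
  return Object.keys(rules).filter(
    (ruleId) => schemaArtifactIds.indexOf(ruleId) === -1
  );
}

// IDs from the with-review schema described in this article.
const withReviewIds = ["proposal", "specs", "design", "review", "tasks"];

// "task" (singular) and "Review" (capitalized) are deliberate typos.
const rulesWithTypos: Record<string, string[]> = {
  specs: ["Scenarios must use #### level-4 headings"],
  task: ["Each task must contain full test code and implementation code"],
  Review: ["Flag placeholders (TBD, TODO, implement later)"],
};

console.log(unmatchedRuleIds(rulesWithTypos, withReviewIds));
// prints [ 'task', 'Review' ]
```

The comparison is intentionally exact and case-sensitive, mirroring the requirement that IDs match exactly.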

Self‑review tends to be lenient; for strict quality use an external reviewer.

Large changes (>5 files) should be split into multiple independent changes to keep context size manageable.

Additional tricks: edit tasks.md directly if the generated granularity is insufficient, add more prohibited items to the instruction for higher compliance, and clear the AI context (/clear) before applying very large changes.

7. Next Issue Preview

The next article will apply the methodology to build a real project – the shuge AI Toolbox – covering demand clarification, artifact generation, and verification of task granularity.

Figure: Comparison of coarse vs fine task granularity
