Why Changing the Instruction Field Fails: Root Cause Uncovered in OpenSpec Source

Five rounds of OpenSpec experiments show that content rules placed in the instruction field work, while structural rules do not. Source‑code inspection reveals why: the template field is the authoritative structure, which explains the failure and points to a corrective redesign.

Shuge Unlimited

Experimental Findings

Five iterations were run on the shuge AI Toolbox project. Content‑related rules in the instruction field succeeded: code completeness rose from 0 % to ~90 %, TBD/TODO placeholders disappeared after the second round, the “Involved Files” field appeared consistently, and the quality score climbed from 28 to 55.

Structural rules almost never took effect: the required ### 任务 N ("Task N") heading was absent in rounds 1 through 4, no task completed the full TDD five‑step loop, task granularity often exceeded ten minutes, and only in round 5 did a partial ### 任务 1 heading appear.

Content rule outcomes:

Code completeness: 0 % → ~90 % (almost no code blocks in round 1, basic code blocks from round 3).

TBD/TODO placeholders: eliminated from round 2 onward.

“Involved Files” field: stable from round 2.

Overall quality score: 28 → 55.
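The content outcomes above can be checked mechanically. The following is a minimal sketch, not the author's scoring method: `checkContent` and `ContentReport` are hypothetical names, and the default label `涉及文件` ("Involved Files") is taken from the revised template shown later in the article. It counts TBD/TODO placeholders and fenced code blocks, and reports whether the "Involved Files" label is present.

```typescript
// Hypothetical helper: measures three content rules on a generated tasks file.
interface ContentReport {
  placeholderCount: number;  // occurrences of TBD or TODO
  codeBlockCount: number;    // number of fenced code blocks (fence lines / 2)
  hasInvolvedFiles: boolean; // whether the "Involved Files" label appears
}

function checkContent(markdown: string, involvedFilesLabel: string = "涉及文件"): ContentReport {
  const placeholders = markdown.match(/\b(?:TBD|TODO)\b/g) ?? [];
  // A fence line is any line that starts with three backticks (after optional indentation).
  const fenceLines = markdown.split("\n").filter((line) => /^\s*`{3}/.test(line));
  return {
    placeholderCount: placeholders.length,
    codeBlockCount: Math.floor(fenceLines.length / 2),
    hasInvolvedFiles: markdown.indexOf(involvedFilesLabel) !== -1,
  };
}
```

The article's overall quality score (28 → 55) is not reproduced here, because the scoring rubric is not given.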

Structural rule outcomes:

Rounds 1‑4 kept the default format (## 1. headings with - [ ] 1.1 checklist items); no ### 任务 N headings appeared.

TDD five‑step loop: no task completed all five steps.

Task granularity: generally >10 minutes, with multiple features packed into a single task.

From round 5 a partial ### 任务 1: heading appeared, but the TDD steps were incomplete.
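One way to observe the structural outcomes is to count which heading style an output file actually uses. This is a minimal sketch under assumptions: `checkStructure` is a hypothetical helper, and the two regular expressions simply encode the `## N.` and `### 任务 N` formats named above.

```typescript
// Hypothetical helper: counts default-style and task-style headings in a tasks file.
interface StructureReport {
  taskHeadings: number;    // headings of the form "### 任务 N"
  defaultHeadings: number; // headings of the form "## N."
}

function checkStructure(markdown: string): StructureReport {
  let taskHeadings = 0;
  let defaultHeadings = 0;
  for (const line of markdown.split("\n")) {
    if (/^### 任务 \d+/.test(line)) {
      taskHeadings += 1;
    } else if (/^## \d+\./.test(line)) {
      defaultHeadings += 1;
    }
  }
  return { taskHeadings, defaultHeadings };
}
```

An output in the style of rounds 1‑4 would report only default headings; an output that follows the structural rule would report only task headings.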

Source‑Code Investigation

In instruction-loader.ts, the ArtifactInstructions interface defines four independent fields:

instruction: string | undefined – optional guidance text.

template: string – mandatory structural skeleton.

context: string | undefined – project context.

rules: string[] | undefined – artifact‑specific rules.

The type signature shows template is required (no undefined), while instruction is optional.
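Written out as a type, the four fields look roughly like this. The sketch is reconstructed from the description above rather than copied from instruction-loader.ts, so the real interface may declare additional fields or different modifiers; the `minimal` constant is purely illustrative.

```typescript
// Reconstructed from the article's field list; not a verbatim copy of the OpenSpec source.
interface ArtifactInstructions {
  instruction: string | undefined; // optional guidance text
  template: string;                // required structural skeleton
  context: string | undefined;     // project context
  rules: string[] | undefined;     // artifact-specific rules
}

// Illustrative value: every optional field may be undefined, but template must be a string.
const minimal: ArtifactInstructions = {
  instruction: undefined,
  template: "## 1. <!-- Task Group Name -->\n- [ ] 1.1 <!-- Task description -->",
  context: undefined,
  rules: undefined,
};
```

Assigning `undefined` to `template` in the object above would be rejected by the compiler, which is the asymmetry the type signature expresses.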

In propose.ts the workflow passes the instruction to the LLM with the comment:

Use `template` as the structure for your output file - fill in its sections

This makes template the “structural authority” and instruction merely supplemental guidance. When the two conflict, the LLM has been told to build its output on the template, so the template wins.
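To make the asymmetry concrete, here is a minimal sketch of how a prompt could be assembled from the two fields. It is not OpenSpec's actual code: `buildPrompt`, `PromptParts`, and the `<instruction>` / `<template>` wrapper tags are assumptions made for illustration. Only the final directive string is quoted from the article.

```typescript
// Hypothetical prompt assembly; tag names and function name are illustrative only.
interface PromptParts {
  instruction?: string; // supplemental guidance, may be absent
  template: string;     // always present
}

function buildPrompt(parts: PromptParts): string {
  const sections: string[] = [];
  if (parts.instruction !== undefined) {
    sections.push("<instruction>\n" + parts.instruction + "\n</instruction>");
  }
  sections.push("<template>\n" + parts.template + "\n</template>");
  // The directive quoted in the article: the template is named as the output structure.
  sections.push("Use `template` as the structure for your output file - fill in its sections");
  return sections.join("\n\n");
}
```

In a prompt built this way, a structural request inside the instruction text competes with an explicit directive to follow the template.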

Conflict Example

If the instruction asks for the ### 任务 N heading but the default template provides ## 1. plus a checklist, the LLM receives two incompatible structures. The source comment already settles which one it should use: the template. The conflicting instruction is ignored.
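A conflict of this kind can be detected before running an experiment. The sketch below is an assumption-laden illustration: `templateSupportsHeading` is a hypothetical helper, and the heading requirement is passed in as a regular expression rather than parsed out of the instruction text.

```typescript
// Hypothetical check: does any line of the template match the heading the instruction demands?
function templateSupportsHeading(template: string, requiredHeading: RegExp): boolean {
  return template.split("\n").some((line) => requiredHeading.test(line));
}

const defaultTemplate = "## 1. <!-- Task Group Name -->\n- [ ] 1.1 <!-- Task description -->";
const wantsTaskHeading = /^### 任务 \d+/; // the structure the instruction asks for

// true: the instruction demands a heading style the template does not contain
const hasConflict = !templateSupportsHeading(defaultTemplate, wantsTaskHeading);
```

When `hasConflict` is true, the structural rule belongs in the template rather than in the instruction.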

Fix: Move Structural Rules into the Template

The solution is to place all structural constraints inside the template and keep only content constraints in instruction. The original nine‑line template:

## 1. <!-- Task Group Name -->
- [ ] 1.1 <!-- Task description -->
- [ ] 1.2 <!-- Task description -->
## 2. <!-- Task Group Name -->
- [ ] 2.1 <!-- Task description -->
- [ ] 2.2 <!-- Task description -->

was replaced with a 37‑line template that uses ### 任务 N:[名称] headings, adds an “Involved Files” section, and embeds the full TDD five‑step loop for each sub‑task while preserving the checklist syntax required by the apply stage.

### 任务 1:<!-- Task Group Name -->

**涉及文件:**
- 新建/修改:exact/path/to/file
- 测试:tests/exact/path/to/test

- [ ] 1.1 **写失败测试**
```typescript
// test code here
```
- [ ] 1.2 **运行测试 - 确认失败**
命令:`npx vitest run tests/path/test.ts`
预期:FAIL — <!-- expected error -->
- [ ] 1.3 **写最小实现**
```typescript
// implementation code here
```
- [ ] 1.4 **运行测试 - 确认通过**
命令:`npx vitest run tests/path/test.ts`
预期:PASS
- [ ] 1.5 **提交**
```bash
git add path/to/files
git commit -m "feat: description"
```

### 任务 2:<!-- Task Group Name -->

**涉及文件:**
- 新建/修改:exact/path/to/file
- 测试:tests/exact/path/to/test

- [ ] 2.1 <!-- Continue with the same pattern -->

Heading format changed from ## 1. to ### 任务 N:[名称] ("Task N: [name]").

Each task now lists its involved files (涉及文件).

Each task's five sub‑tasks follow the TDD loop: write a failing test, confirm the failure, write the minimal implementation, confirm the pass, commit.

The - [ ] checkbox format is retained for progress tracking.

Because the template is the structure the LLM follows, this change dramatically improves format compliance.
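Compliance with the revised template can be verified after generation. The following is a minimal sketch, assuming the `### 任务 N` heading and `- [ ] N.M` checklist formats shown above; `checkTddTasks` and `TaskCheck` are hypothetical names. It checks only that each task carries sub‑tasks N.1 through N.5 in order, not that their content is a valid TDD step, and it does not skip lines inside fenced code blocks.

```typescript
// Hypothetical validator for the revised template's task/sub-task numbering.
interface TaskCheck {
  task: number;      // task number taken from the heading
  steps: string[];   // sub-task ids found under the heading, e.g. ["1.1", "1.2"]
  complete: boolean; // true when steps are exactly N.1 .. N.5 in order
}

function checkTddTasks(markdown: string): TaskCheck[] {
  const results: TaskCheck[] = [];
  let current: TaskCheck | undefined;
  for (const line of markdown.split("\n")) {
    const heading = /^### 任务 (\d+)/.exec(line);
    if (heading) {
      current = { task: Number(heading[1]), steps: [], complete: false };
      results.push(current);
      continue;
    }
    // Checked or unchecked checkbox followed by an N.M id.
    const item = /^- \[[ xX]\] (\d+\.\d+)\b/.exec(line);
    const id = item ? item[1] : undefined;
    if (id !== undefined && current !== undefined) {
      current.steps.push(id);
    }
  }
  for (const r of results) {
    const expected = [1, 2, 3, 4, 5].map((n) => r.task + "." + n);
    r.complete = r.steps.join(",") === expected.join(",");
  }
  return results;
}
```

Because the validator relies on the checkbox lines, it also confirms that the syntax the apply stage depends on survived the template change.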

Revised 80/20 ("2/8") Rule 2.0

template: controls structural rules (format, grouping, section content). Effort 20 %; format score improves from 0 % to 80 %.

instruction: controls content rules (code completeness, no placeholders, TDD rhythm). Effort 20 %; content score improves from 0 % to 80 %.

config.yaml rules: project constraints (language, tech stack). Provides additional polish.

Together, the template and instruction layers take about 20 % of the effort each and bring the potential quality score to 95; the remaining 5 points depend on verification processes.
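The division of labour above can be expressed as a small routing table. This is a sketch for illustration only: `RuleKind`, `Layer`, `LAYER_FOR_KIND`, and `groupByLayer` are hypothetical names, and nothing here is part of OpenSpec itself.

```typescript
// Hypothetical routing of rules to the layer the article assigns them to.
type RuleKind = "structure" | "content" | "project";
type Layer = "template" | "instruction" | "config.yaml rules";

const LAYER_FOR_KIND: Record<RuleKind, Layer> = {
  structure: "template",        // format, grouping, section content
  content: "instruction",       // code completeness, no placeholders, TDD rhythm
  project: "config.yaml rules", // language, tech stack
};

interface Rule {
  kind: RuleKind;
  text: string;
}

function groupByLayer(rules: Rule[]): Record<Layer, string[]> {
  const grouped: Record<Layer, string[]> = {
    template: [],
    instruction: [],
    "config.yaml rules": [],
  };
  for (const rule of rules) {
    grouped[LAYER_FOR_KIND[rule.kind]].push(rule.text);
  }
  return grouped;
}
```

Classifying each rule by kind before writing it down makes it harder to put a heading requirement into the instruction field by accident.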

Takeaway

Separate “content” and “structure”: write content rules in instruction and structural rules in template. The template is the definitive format reference for the LLM, enabling up to 95 points of quality when both layers are applied.

[Figure: template and instruction relationship diagram]
[Figure: Revised 2/8 rule 2.0 matrix]

OpenSpec repository: https://github.com/Fission-AI/OpenSpec

