When 30 Rules Aren’t Enough: Why CLAUDE.md Ignores Overwritten Rules

The article explains why stuffing CLAUDE.md with many rules makes them ineffective, detailing its always‑resident loading, token cost, rule dilution, proper layering, import mechanisms, and verification techniques to keep essential guidelines enforced in LLM‑driven workflows.

Wu Shixiong's Large Model Academy

1. Persistence and token cost

CLAUDE.md is injected at session start and stays in the context for every turn, unlike a skill, which is lazily loaded. Every line is re-read on every turn, so file size translates directly into token cost. In the AlgoMooc project the file grew to over 600 lines, roughly 20 KB of text re-sent on every turn, while only a few dozen lines were actually used day to day.
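The per-turn cost can be put into rough numbers. The sketch below assumes about 4 characters per token, which is a common rule of thumb rather than a tokenizer measurement; the function name and the 50-turn session length are hypothetical.

```python
def resident_cost(file_bytes: int, turns: int, chars_per_token: float = 4.0) -> dict:
    """Estimate tokens spent re-reading an always-resident file.

    chars_per_token is an assumed heuristic for mostly-ASCII text,
    not a real tokenizer count.
    """
    tokens_per_turn = round(file_bytes / chars_per_token)
    return {
        "tokens_per_turn": tokens_per_turn,
        "tokens_per_session": tokens_per_turn * turns,
    }


# A ~20 KB file over a hypothetical 50-turn session.
estimate = resident_cost(file_bytes=20_000, turns=50)
print(estimate)
```

Under these assumptions the file costs the same amount on every turn whether or not any of its rules are relevant to that turn, which is why trimming it pays off repeatedly.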

The four layers are concatenated, not overridden:

~/.claude/CLAUDE.md – user-wide

./CLAUDE.md – project-level, version-controlled

./CLAUDE.local.md – local, not version-controlled

Sub-directory CLAUDE.md – lazy-loaded only when Claude reads that directory

Only sub‑directory files are lazy; the others are always resident.
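To see how much text is always resident in a given setup, the three non-lazy layers can be collected and joined. This is a minimal sketch, not Claude Code's actual loader: the function names are hypothetical, and the concatenation order is an assumption.

```python
from pathlib import Path


def resident_layers(home: Path, project: Path) -> list[Path]:
    """Return the always-resident files that exist on disk."""
    candidates = [
        home / ".claude" / "CLAUDE.md",   # user-wide
        project / "CLAUDE.md",            # project-level, version-controlled
        project / "CLAUDE.local.md",      # local, not version-controlled
    ]
    return [p for p in candidates if p.is_file()]


def resident_text(home: Path, project: Path) -> str:
    """Concatenate the layers; no layer overrides another."""
    return "\n\n".join(
        p.read_text(encoding="utf-8") for p in resident_layers(home, project)
    )
```

Calling `len(resident_text(Path.home(), Path.cwd()).splitlines())` from a project root gives a quick line count for everything that is loaded on every turn.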

CLAUDE.md is ‘persistent’, skill is ‘on‑demand’: cost per turn

2. Why rules are ignored

CLAUDE.md is treated as a normal user message, so it competes for attention with the current query, code snippets, and tool results. When the file is short and specific, the model usually follows the rules; once it reaches a few hundred lines, the rules dilute one another and those buried in the middle receive far less attention. Official guidance recommends keeping CLAUDE.md under 200 lines.

Example: a rule “HTML for solution animation must be locally rendered and verified before submission” placed around line 80 was ignored twice, leading to blank animations. Moving it to its own line, adding an imperative phrase, and placing it near the top raised compliance from ~30 % to ~90 %.

Placement matters: rules at the beginning or end get higher attention than those buried in the middle.

Common causes of ignored rules:

File too long – dilution

Embedding rules inside long paragraphs instead of separate lines or lists

Writing rules as soft descriptions rather than imperative commands (e.g., “we tend to run tests” vs “must run npm test”)

Combining multiple requirements in one sentence – only the first is likely to be obeyed
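Some of these causes can be caught with a crude lint pass before a rule ever reaches Claude. The sketch below is illustrative only: the soft-phrase list, the 160-character threshold, and the "two or more 'and'" heuristic are all assumptions, and only the 200-line budget comes from the guidance cited above.

```python
import re

SOFT_PHRASES = ("tend to", "usually", "try to", "prefer", "ideally")  # assumed word list
MAX_LINES = 200  # line budget from the official guidance cited above


def lint_rules(text: str) -> list[str]:
    """Return human-readable warnings for a CLAUDE.md-style file."""
    warnings = []
    lines = text.splitlines()
    if len(lines) > MAX_LINES:
        warnings.append(f"file has {len(lines)} lines (budget {MAX_LINES})")
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        if any(phrase in lowered for phrase in SOFT_PHRASES):
            warnings.append(f"line {number}: soft wording, use an imperative")
        # Repeated 'and' or a very long line hints at bundled requirements
        # or a rule buried inside a paragraph.
        if len(re.findall(r"\band\b", lowered)) >= 2 or len(line) > 160:
            warnings.append(f"line {number}: may bundle several requirements")
    return warnings
```

A lint pass like this only flags candidates for rewriting; whether a reworded rule is actually obeyed still has to be checked against real sessions.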

If a rule still fails, prepend IMPORTANT or YOU MUST, or convert the rule into a PreToolUse Hook, which is enforced mechanically.
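As an illustration of the Hook route, the sketch below shows the decision logic for a PreToolUse hook script that blocks `git commit` until the animation has been verified. It assumes the hook receives the pending tool call as JSON on stdin with `tool_name` and `tool_input` fields and that exit code 2 blocks the call and returns stderr to Claude; check both against the current hooks documentation. The marker file `.animation-verified` is hypothetical.

```python
import json
import sys
from pathlib import Path

MARKER = Path(".animation-verified")  # hypothetical file written by a local render check


def decide(payload: dict, verified: bool) -> tuple[int, str]:
    """Return (exit_code, message) for one pending tool call."""
    command = payload.get("tool_input", {}).get("command", "")
    is_commit = payload.get("tool_name") == "Bash" and "git commit" in command
    if is_commit and not verified:
        # Exit code 2 is assumed to block the tool call.
        return 2, "Render and verify the animation HTML locally before committing."
    return 0, ""


def main() -> int:
    payload = json.load(sys.stdin)  # assumed: the tool call arrives as JSON on stdin
    code, message = decide(payload, verified=MARKER.exists())
    if message:
        print(message, file=sys.stderr)
    return code
```

To use it as a script, end the file with `sys.exit(main())` and register it as a PreToolUse hook. Unlike a line in CLAUDE.md, the check runs regardless of how much attention the model pays to the rule.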

Long files dilute key rules

3. What belongs in CLAUDE.md

Include only items that satisfy three conditions at once: they are used frequently, they are stable, and they are needed across sessions but cannot be inferred from the code.

High frequency: information used in almost every task (e.g., package manager, core commands, strict code-style conventions).

Stable: long-term agreements that rarely change.

Cross-session and not inferable: context the model cannot deduce from the code alone, such as historical decisions or rationale.

Exclude:

One‑off temporary instructions (write directly in the conversation)

Details already expressed in the code (e.g., naming conventions visible in source)

Low-frequency processes, which are better expressed as lazily loaded skills

Large background knowledge blocks that belong in @import references or a references file

Applying these filters to the AlgoMooc CLAUDE.md reduced it from 600+ lines to ~130, dramatically improving compliance for the remaining core rules.
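The include and exclude criteria amount to a small decision procedure. The sketch below is one possible encoding; the `Item` fields, the order of the checks, and the destination labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    high_frequency: bool
    stable: bool
    inferable_from_code: bool
    large_background: bool = False


def destination(item: Item) -> str:
    """Pick where an item should live, following the include/exclude criteria."""
    if item.inferable_from_code:
        return "drop (already in the code)"
    if not item.stable:
        return "conversation"       # one-off or temporary instruction
    if item.large_background:
        return "reference file"     # large background knowledge block
    if not item.high_frequency:
        return "skill"              # low-frequency process, lazily loaded
    return "CLAUDE.md"
```

For example, `destination(Item("Use pnpm, never npm", True, True, False))` stays in CLAUDE.md, while a stable but rarely used release procedure is routed to a skill.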

What belongs in CLAUDE.md: high‑frequency + stable + cross‑session

4. Layering and @import

Complex projects should split CLAUDE.md using the four layers and @import statements.

Global preferences → ~/.claude/CLAUDE.md

Project-wide rules → ./CLAUDE.md

Local, non-versioned items → ./CLAUDE.local.md

When a section becomes large (e.g., a detailed Git workflow), move it to a separate file and import it by writing @docs/git-workflow.md in CLAUDE.md. Imported files are still fully resident; they do not reduce token cost.

Imports can nest, but only to a limited depth (the official documentation gives a maximum of five hops); imports beyond the limit are not loaded.
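The effect of a depth limit is easy to see in a toy resolver. This is a simplified sketch, not Claude Code's implementation: it assumes an import is any whitespace-delimited `@path` token, resolves paths relative to the importing file, and takes the limit as a `max_depth` parameter.

```python
import re
from pathlib import Path

# Assumed syntax: an import is a whitespace-delimited @path token.
IMPORT_PATTERN = re.compile(r"(?<!\S)@(\S+)")


def expand(path: Path, max_depth: int, depth: int = 0) -> str:
    """Inline imports recursively; stop following them at max_depth."""
    text = path.read_text(encoding="utf-8")
    if depth >= max_depth:
        return text  # deeper imports are left unexpanded

    def replace(match: re.Match) -> str:
        target = path.parent / match.group(1)
        if not target.is_file():
            return match.group(0)  # not a file reference, keep the token as-is
        return expand(target, max_depth, depth + 1)

    return IMPORT_PATTERN.sub(replace, text)
```

Because every expanded file ends up inline in the result, the sketch also makes the earlier point concrete: importing reorganizes the text but does not shrink what is resident.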

Sub‑directory CLAUDE.md files are lazy‑loaded and can hold module‑specific conventions, keeping the main file clean while still providing context when that module is accessed.

Finer control is possible with .claude/rules/ files that include a paths field, making the rule active only when Claude reads files matching the specified path pattern.
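Conceptually, path-scoped rules behave like a lookup from the file being read to the rule files whose patterns match it. The sketch below uses Python's `fnmatch` as a stand-in; the real glob semantics may differ, and the rule file names and patterns are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical rule files mapped to the path patterns they declare.
RULES = {
    "api-conventions.md": ["src/api/*"],
    "animation-checks.md": ["animations/*.html"],
}


def active_rules(file_being_read: str, rules: dict[str, list[str]] = RULES) -> list[str]:
    """Return the rule files whose patterns match the file being read."""
    return [
        name
        for name, patterns in rules.items()
        if any(fnmatch(file_being_read, pattern) for pattern in patterns)
    ]
```

With this mapping, reading `src/api/users.ts` activates only `api-conventions.md`, and reading `README.md` activates nothing, so module-specific rules cost tokens only when they are relevant.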

A utility command /memory lists the currently loaded CLAUDE.md, CLAUDE.local.md, and rule files, helping verify what is in context.

Layering and @import: keep files tidy

5. Verifying rule compliance

Because CLAUDE.md is advisory, compliance must be probed. Create test scenarios that trigger each critical rule and observe whether Claude obeys them.

Example: a rule “must update the homepage index after adding a new animation” was ignored when placed deep in the file. After moving it near the top and adding “must”, Claude started updating the index.

Failed probes usually fall into one of two categories:

Dilution: the rule is fine but hidden among many lines. Fix by repositioning it, using imperative language, or adding emphasis.

Inherent conflict: the rule contradicts Claude’s default behavior. Such rules cannot be forced through CLAUDE.md and should be removed or replaced with a Hook.

In a systematic probe of ~20 rules, only 7–8 remained reliably obeyed after trimming and refactoring.
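Probe results are easier to act on when tallied per rule. The sketch below assumes each probe run is recorded by hand as obeyed or not; the 0.8 threshold, the function name, and the sample outcomes are hypothetical.

```python
def compliance_report(results: dict[str, list[bool]], threshold: float = 0.8) -> dict[str, dict]:
    """Summarize probe outcomes per rule; True means Claude obeyed in that run."""
    report = {}
    for rule, outcomes in results.items():
        rate = sum(outcomes) / len(outcomes) if outcomes else 0.0
        report[rule] = {"rate": rate, "needs_work": rate < threshold}
    return report


# Hypothetical outcomes recorded after running each probe scenario five times.
probes = {
    "update homepage index after adding an animation": [True, True, False, True, True],
    "render animation HTML locally before submission": [False, True, False, False, False],
}
for rule, summary in compliance_report(probes).items():
    status = "FIX" if summary["needs_work"] else "ok"
    print(f"{summary['rate']:.0%}  {status}  {rule}")
```

Rules flagged for work are the candidates for repositioning, rewording, or conversion to a Hook; re-running the same probes after each change shows whether the fix held.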

Verification loop: probe tests + periodic pruning

6. Maintenance checklist

State the persistent cost: each line is re‑read every turn; official guidance caps size around 200 lines.

Explain why rules are ignored: they compete for attention; long files dilute rules, so move core rules forward, make them imperative, and optionally prepend IMPORTANT.

Describe the division of labor: keep only high-frequency, stable, cross-session items in CLAUDE.md; low-frequency flows become skills; hard constraints become PreToolUse Hooks.

Emphasize verification: run probe tests that deliberately trigger each rule and adjust placement or wording until compliance is high.
