Taming Claude Code: A Simple Skill Slashes Unnecessary Code Bloat

The author evaluates a community‑crafted “Karpathy Skills” plugin for Claude Code, applying four concise coding principles, and shows through a controlled experiment that the skill‑guided model produces far fewer superfluous changes—38 lines versus 95—while still fixing the targeted bug and improving code quality.

Old Zhang's AI Learning

Skill repository

Repository https://github.com/forrestchang/andrej-karpathy-skills provides a set of executable behavioral constraints derived from Andrej Karpathy’s tweets about LLM‑generated code. When installed, Claude Code follows these constraints.

Karpathy’s complaints

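Models make wrong assumptions on your behalf and then run with them without checking. They do not manage their own confusion, do not seek clarification, do not surface inconsistencies, do not present trade‑offs, and do not push back when they should.
They really like to overcomplicate code and APIs, pile up abstractions, and leave dead code behind; something that could be done in 100 lines ends up as a bloated 1000‑line architecture.
They still sometimes change or delete code and comments they do not sufficiently understand, even when those are unrelated to the task.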

Four principles in CLAUDE.md

Think Before Coding – avoid hidden assumptions, expose trade‑offs, ask when uncertain, list alternatives, and stop to clarify unclear points.

Simplicity First – solve the problem with minimal code, omit unnecessary features, avoid premature abstraction, and keep the implementation concise.

Surgical Changes – modify only what is required, do not refactor unrelated code, and ensure each change can be traced back to a user request.

Goal‑Driven Execution – translate instructions into verifiable goals and iterate until they are achieved.

Controlled experiment

The test uses a 32‑line Python script, buggy_user.py, that contains a subtle bug: validate_user does not raise when the email field is a whitespace‑only string such as " ". A non‑empty string is truthy in Python, so the condition not " " evaluates to False and the check is skipped.
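The truthiness issue can be reproduced in a few lines. This is only a sketch: buggy_user.py is not reproduced in full in the article, so the function body, the ValueError type, and the error message below are assumptions made for illustration.

```python
def validate_user(user_data):
    # Hypothetical reconstruction of the buggy check (error type and
    # message are assumptions, not taken from buggy_user.py).
    if not user_data.get('email'):
        raise ValueError("email is required")
    return True


print(not "")    # True: an empty string is falsy, so the check fires
print(not " ")   # False: a whitespace-only string is truthy
print(validate_user({"email": " "}))  # True: the blank email slips through
```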

Setup

Model: Claude Haiku 4.5

Identical prompt and source file for both runs

Variable: experimental group reads CLAUDE.md and applies the four principles as hard constraints; control group does not.

Results (line count)

Original file: 32 lines

Without skill: 95 lines (+63 lines, +197 %)

With skill: 38 lines (+6 lines, +19 %)
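The percentages follow directly from the line counts. A small check, where the helper name growth is chosen here for illustration:

```python
from typing import Tuple


def growth(original: int, new: int) -> Tuple[int, int]:
    """Return (added lines, growth in percent rounded to the nearest integer)."""
    added = new - original
    return added, round(100 * added / original)


print(growth(32, 95))  # (63, 197)
print(growth(32, 38))  # (6, 19)
```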

Diff without skill (excerpt)

+ import re
+ from typing import Dict, Any
- def validate_user(user_data):
+ def validate_user(user_data: Dict[str, Any]) -> bool:
+     """Validate user data with strict email and username checks.
+     ...
+     raise TypeError("user_data must be a dictionary")
+ ...
- def upload_avatar(file_path, destination):
+ def upload_avatar(file_path: str, destination: str, max_size: int = 5 * 1024 * 1024) -> bool:
+     # file size check
+     if len(data) > max_size:
+         raise ValueError(f"File size exceeds maximum allowed size of {max_size} bytes")
+ ...

Diff with skill (full)

+ import logging
+ logger = logging.getLogger(__name__)
-     if not user_data.get('email'):
+     email = user_data.get('email', '').strip()
+     if not email:
-     if '@' not in user_data['email']:
+     if '@' not in email:
-     if not user_data.get('username'):
+     username = user_data.get('username', '').strip()
+     if not username:
+         logger.warning(f"Upload failed with status {response.status_code}")
-         print(f"Error: {e}")
+         logger.error(f"Error uploading avatar: {e}")
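Applied to a function, the validation part of this diff might look like the sketch below. The statements under each if are not shown in the diff, so the ValueError type and messages are assumptions; only the .strip() handling and the three conditions come from the diff itself.

```python
def validate_user(user_data):
    # Strip first so a whitespace-only value becomes an empty (falsy) string.
    email = user_data.get('email', '').strip()
    if not email:
        raise ValueError("email is required")  # assumed error type/message
    if '@' not in email:
        raise ValueError("email must contain '@'")  # assumed
    username = user_data.get('username', '').strip()
    if not username:
        raise ValueError("username is required")  # assumed
    return True


def raises_value_error(user_data):
    """Helper for this example: report whether validation rejects the input."""
    try:
        validate_user(user_data)
    except ValueError:
        return True
    return False


print(raises_value_error({"email": " ", "username": "zhang"}))      # True
print(raises_value_error({"email": "a@b.c", "username": "   "}))    # True
print(raises_value_error({"email": "a@b.c", "username": "zhang"}))  # False
```

This pattern assumes the values are strings; a key that is present with the value None would fail on .strip() rather than reach the check.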

Observations

Without constraints, the model adds every best practice it can think of, growing the file from 32 to 95 lines; with constraints, it stops after the minimal required changes.

Constraints cause the model to explicitly list “deliberate‑no‑change” items, directly reflecting the Surgical Changes principle.

The unconstrained output includes decorative emojis and affirmations, while the constrained output remains plain and focused.

Line count alone does not capture quality: many of the additions in the larger diff are reasonable in themselves but unnecessary for the specific task, which is why the constraints have to be in place before the model starts writing.
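To compare two runs by what actually changed rather than by final file length, one option is to count added and removed lines with Python's standard difflib module. This is a sketch of one possible measurement, not the method used in the experiment; the function name diff_stats and the sample strings are made up for illustration.

```python
import difflib


def diff_stats(before: str, after: str):
    """Return (added, removed) line counts between two versions of a file."""
    a = before.splitlines()
    b = after.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines dropped from the original
        if tag in ("replace", "insert"):
            added += j2 - j1    # lines introduced in the new version
    return added, removed


before = "a\nb\nc"
after = "a\nB\nc\nd"
print(diff_stats(before, after))  # (2, 1)
```

A count like this still says nothing about whether each changed line was requested, so it complements a manual review of the diff rather than replacing it.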

Installation

Two ways to add the skill:

Plugin (recommended)

/plugin marketplace add forrestchang/andrej-karpathy-skills
/plugin install andrej-karpathy-skills@karpathy-skills

Project‑level CLAUDE.md

curl -o CLAUDE.md https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md
# or append to an existing file
echo "" >> CLAUDE.md
curl https://raw.githubusercontent.com/forrestchang/andrej-karpathy-skills/main/CLAUDE.md >> CLAUDE.md

When not to use

The guidelines note that for trivial tasks—simple spelling fixes or obvious one‑line changes—the full process may be overkill. In the experiment, when the request was limited to adding .strip() on line 5, both groups produced almost identical diffs; the skill’s benefit appears mainly for ambiguous, “by‑the‑way” requests, cross‑module changes, or when the user’s intent is unclear.

Additional insight

All of Karpathy’s complaints stem from the same root: LLMs tend to write more code when uncertain because “more is safer.” The 65‑line CLAUDE.md flips this intuition, encouraging the model to pause, ask, or refrain from unnecessary modifications, and to report deliberate‑no‑change items.

Repository: https://github.com/forrestchang/andrej-karpathy-skills

Tags: LLM, prompt engineering, software engineering, code quality, Claude Code
Written by

Old Zhang's AI Learning

AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.
