Thin Harness, Fat Skills: The Essence of AI Agent Architecture
The article explains how Garry Tan's three AI‑agent engineering principles—Thin Harness, Fat Skills, and Fat Code—replace raw model size arguments with a disciplined architecture that yields 10‑100× productivity gains, illustrated through concrete skill files, case studies, and community insights.
The Essence of Architecture Layers
Garry Tan’s three AI‑agent engineering principles, promoted inside Y Combinator, reinterpret the Unix philosophy of "do one thing well" for modern agents. While the industry debates model parameter counts, the real 100× engineers apply a layered architecture that delivers massive productivity gains.
Steve Yegge notes that engineers using AI‑coding agents are 10‑100× more productive than those using traditional editors, and about 1,000× more productive than Google engineers in 2005. The author emphasizes that the difference is not model quality but architectural design.
Layered Architecture
Fat Skills layer: skill files written in Markdown act as a programming language, encoding fuzzy judgment and domain knowledge. Each skill file functions like a method call: it accepts parameters and yields different capabilities. For example, an /investigate skill with steps (define dataset, build timeline, journalize each document, synthesize, argue, cite) takes three parameters (TARGET, QUESTION, DATASET) and can serve as a medical-research analyst or a forensic investigator depending on the inputs.
Fat Code layer: deterministic logic such as SQL queries, compiled code, and arithmetic stays in this layer to guarantee reliable execution.
Thin Harness layer: a lightweight framework of roughly 200 lines of code that handles the model loop, context management, tool invocation, and safety. It provides only the basic connection between the model and external tools.
The anti-pattern is a thick harness with thin skills: for example, defining 40+ tools that consume half the context window, or relying on general-purpose tools with 2-5 s round-trip latency. The preferred approach is narrow, fast tools, such as a Playwright CLI that performs a browser action in 100 ms, 75× faster than a 15-second Chrome MCP sequence.
Evolution of Skill Files
A case study from YC’s founder activity feedback system shows the impact of skill‑file automation:
Initial version received a 12% "okay" rating.
After enabling automatic analysis, the rating dropped to 4%.
Key improvements were solidified into the Markdown skill files.
This design makes each skill file permanently upgradable; it continues to run at 3 am and automatically benefits from new model releases.
In a 2026 Chase Center event with 6,000 founders, traditional manual review of applications and spreadsheets failed at scale, whereas an AI‑agent pipeline using enriched skill files could process all profiles nightly, extracting gaps between founders' statements and actual builds.
The enrichment skill (/enrich-founde) pulls data from GitHub, URLs, and social signals, and runs deterministic steps (SQL lookups, browser tests) to produce structured outputs such as:
FOUNDER: Maria Santos
COMPANY: Contrail (contrail.dev)
SAYS: "AI agent's Datadog"
ACTUALLY BUILDING: 80% of commits in billing module

This gap analysis requires reading full profiles, Git history, and advisor notes, something keyword search or similarity search cannot achieve.
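The structured output above lends itself to a deterministic rendering step, so every nightly run emits the same shape. A minimal sketch, assuming a hypothetical `EnrichedProfile` record (the field names mirror the example output; they are not from the actual skill):

```python
from dataclasses import dataclass

@dataclass
class EnrichedProfile:
    """Hypothetical structured result of the enrichment skill."""
    founder: str
    company: str
    says: str               # the founder's stated positioning
    actually_building: str  # inferred from commits, URLs, social signals

def render(p: EnrichedProfile) -> str:
    """Deterministic formatting: identical input yields identical output."""
    return (f"FOUNDER: {p.founder}\n"
            f"COMPANY: {p.company}\n"
            f"SAYS: {p.says!r}\n"
            f"ACTUALLY BUILDING: {p.actually_building}")

profile = EnrichedProfile(
    founder="Maria Santos",
    company="Contrail (contrail.dev)",
    says="AI agent's Datadog",
    actually_building="80% of commits in billing module",
)
print(render(profile))
```

The judgment-heavy part, deciding what the founder is "actually building" from commits and notes, stays in the latent skill layer; only the final formatting and storage are Fat Code.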
Community Discussion
Developer ByteCrafter observed that complex frameworks often clash with agents; simplifying to a context loader dramatically improves efficiency.
Sam Ward’s team built a system where all "intelligence" resides in Markdown loaded at startup, while the framework only connects models and tools. Updating the Markdown updates the agent without touching code.
Commenter forgedynamicsai summarized the insight: a thin LangGraph harness, fat scripts, and a clear split between latent (potential) and deterministic spaces unlock continuous improvement.
Boundary Migration Phenomenon
Claudia noted that when deterministic code needs contextual judgment, it naturally migrates to the skill layer. Accepting this migration stabilizes the architecture.
Garry Tan’s open‑source project gstack demonstrates a 200‑line CLI framework supporting 23 professional roles, illustrating the principle of "doing the right thing at the right layer".
The two categories of steps are:
Latent space: where the model reads, interprets, decides, and synthesizes.
Deterministic space: where trust resides; identical input yields identical output (SQL, compilation, arithmetic).
For example, an LLM can seat 8 people while weighing their personalities, but asked to seat 800 it hallucinates a plausible yet wrong arrangement, an illustration of forcing a combinatorial optimization problem into the latent space.
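The 800-person case is exactly what belongs in the deterministic layer. A simplistic sketch (the greedy strategy and the `conflicts` input are illustrative assumptions; in the article's framing, the latent layer could supply the pairwise conflicts, while code does the combinatorics):

```python
def seat(guests: list[str], conflicts: set[frozenset], table_size: int = 8) -> list[list[str]]:
    """Deterministic greedy seating: place each guest at the first table
    with space and no conflicting tablemate. Same input -> same output."""
    tables: list[list[str]] = []
    for g in guests:
        for t in tables:
            if len(t) < table_size and all(
                frozenset((g, other)) not in conflicts for other in t
            ):
                t.append(g)
                break
        else:
            tables.append([g])  # no compatible table: open a new one
    return tables

guests = [f"guest{i}" for i in range(800)]
conflicts = {frozenset(("guest0", "guest1"))}  # latent judgment could feed this
tables = seat(guests, conflicts)
print(len(tables), "tables for", sum(len(t) for t in tables), "guests")
```

Unlike the latent model, this never "forgets" a guest or invents one: every run over the same inputs yields the same arrangement, which is the trust property the deterministic space exists to provide.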
Engineering Discipline
Garry’s iron rules, shown below, enforce disciplined development:
Asking the agent the same thing twice counts as a failure.
Work that will be repeated must first be run manually on 3-10 samples.
Once approved, it is solidified into a skill file or cron task.

This discipline yields compounding returns that outweigh chasing larger models. As ChaiBytesAI observed, the journaling step creates a self-reinforcing loop: each run's output becomes context for the next, allowing the agent to improve its judgments over time.
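The journaling loop can be sketched in a few lines. This is an assumed mechanism, not the article's implementation: `run_with_journal`, the JSONL format, and the stand-in agent function are all illustrative.

```python
import json
import tempfile
from pathlib import Path

def run_with_journal(journal: Path, task: str, run_fn) -> str:
    """Each run reads prior journal entries as context, runs, and appends
    its own output so the next run can build on earlier judgments."""
    prior = []
    if journal.exists():
        prior = [json.loads(line) for line in journal.read_text().splitlines()]
    output = run_fn(task, prior)
    with journal.open("a") as f:
        f.write(json.dumps({"task": task, "output": output}) + "\n")
    return output

# Demo with a stand-in agent: its output reflects how much context it saw.
journal = Path(tempfile.mkdtemp()) / "journal.jsonl"
agent = lambda task, prior: f"{task}: saw {len(prior)} prior entries"
first = run_with_journal(journal, "nightly review", agent)
second = run_with_journal(journal, "nightly review", agent)
print(first)
print(second)
```

Because the journal accumulates regardless of which model is plugged in, a new model release immediately inherits all prior context, which is why each skill file "never degrades."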
Garry Tan concludes that each skill file is a permanent upgrade that never degrades, runs overnight, and instantly benefits from the next model release, keeping deterministic steps reliable while latent steps improve.
AI Engineering
Focused on cutting‑edge product and technology information and practical experience sharing in the AI field (large models, MLOps/LLMOps, AI application development, AI infrastructure).
