Why Large Language Models Are Plateauing: Insights from Veteran Architect Steve Yegge
Steve Yegge argues that the exponential growth of large language models is flattening due to dual physical and safety limits, prompting a shift toward AI literacy, token hygiene, and a resurgence of SaaS as developers adapt to the new "Flat Curve" era.
In the past two years developers have lived with "technology stack anxiety", fearing that each new Claude or GPT release could render their work obsolete. Steve Yegge, a veteran programmer from Amazon, Google, and other tech giants, published a long‑form essay titled "The Flat Curve Society" asserting that large‑model growth is rapidly slowing as it hits both physical and safety walls, ushering in a "Flat Curve Era".
Why the curve is flattening: the Double Horizons model
Demand Horizon: For about 90% of everyday tasks, mid‑size models like Claude Sonnet already hit a performance ceiling. Users cannot tell models apart by quality because their queries are not challenging enough to push against that ceiling.
When truly hard engineering problems—such as Steve’s own React client refactor—are used as benchmarks, even top‑tier models still make frequent mistakes.
Discernment Horizon (the ultimate physical barrier): The limit is set not by the hardest question a user asks but by the hardest answer humans can verify. Once a model's output exceeds what humans can check, the model is "superhuman" and, in practice, "unverifiable". For example, a model could generate a highly obscure chip‑scheduling algorithm tens of thousands of lines long that no human on Earth could validate, which would make it unsafe to deploy. Yegge likens such unsupervisable super‑models to "nuclear weapons" and expects labs and governments to keep them tightly sealed, fixing publicly available model capabilities at the current plateau.
Industry reshuffle: SaaS returns, Vibe‑Coding collapses
The era of the one‑click weekend AI rewrite ends: Without generational breakthroughs in model capability, using AI agents to rewrite complex monolithic codebases becomes prohibitively risky and costly.
SaaS makes a strong comeback: Companies now prefer mature SaaS products with predictable token costs over building custom AI tools that incur open‑ended token and maintenance expenses.
Netflix case study: building AI literacy tiers
Netflix measured employee token consumption and usage patterns, defining three AI‑literacy tiers:
Tier 1 – Beginners/Users: Recently escaped "AI illiteracy"; use single‑point prompts; still need constant human supervision.
Tier 2 – Baseline AI Literacy: Can orchestrate multiple agents asynchronously; consume 12–15 million tokens daily; trust 2–4 agents to work independently while auditing outcomes.
Tier 3 – Power Users/Advanced: Integrate AI into complex system‑level development, automated bug hunting, and CI/CD pipelines; consume over 50 million tokens daily.
Netflix demonstrated that moving a completely AI‑naïve engineer to Tier 2 requires only five hours of intensive training, and another five hours to reach Tier 3. Six weeks later, 96% of trainees still showed strong AI‑collaboration habits.
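The token figures in the tiers above can be read as a rough classification rule. The sketch below is only an illustration of that reading, not Netflix's actual method: the tiers are also defined by behavior (how many agents someone supervises, where AI is integrated), so token volume alone is at best a proxy. The function name and the treatment of the unspecified 15–50 million band are assumptions.

```python
# Thresholds taken from the tier descriptions; everything else is assumed.
TIER2_MIN_TOKENS = 12_000_000  # lower end of the 12-15 million range quoted for Tier 2
TIER3_MIN_TOKENS = 50_000_000  # Tier 3 is described as "over 50 million" per day


def literacy_tier(daily_tokens: int) -> int:
    """Return 1, 2, or 3 as a rough proxy tier from daily token usage alone."""
    if daily_tokens < 0:
        raise ValueError("daily_tokens must be non-negative")
    if daily_tokens > TIER3_MIN_TOKENS:
        return 3
    if daily_tokens >= TIER2_MIN_TOKENS:
        # Assumption: the article does not say how to classify usage between
        # 15 and 50 million tokens, so this sketch treats that band as Tier 2.
        return 2
    return 1
```

Calling `literacy_tier(13_000_000)` places an engineer in Tier 2 under these assumptions; a real assessment would combine this signal with the usage patterns the tiers describe.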
New game in the second half: from "Token burning" to "Token hygiene"
In the flat‑curve era, indiscriminate token consumption is unsustainable. Steve introduces the concept of "Token Hygiene"—the practice of minimizing unnecessary context overhead.
Example of wasteful automation: Prompting an agent to run git status forces the entire directory tree to be uploaded as context, wasting roughly 100,000 tokens. The recommendation is to run trivial commands by hand when they take only a second, which saves a few cents on the API bill each time.
Smart routing: High‑level AI systems should route the simple, cheap queries (about 90% of the total) to inexpensive or free models, escalating only complex reasoning tasks to costly top‑tier models. The ultimate goal is to turn AI literacy into a discipline of achieving maximal business outcomes with minimal token expenditure.
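The routing idea and the cost arithmetic behind token hygiene can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method from Yegge's essay: the model names, the per‑million‑token prices, and the keyword heuristic are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    usd_per_million_tokens: float  # placeholder price, not a real quote


# Hypothetical models and prices, chosen only to make the arithmetic visible.
CHEAP = Model("cheap-model", 0.25)
PREMIUM = Model("premium-model", 15.0)

# Assumed markers of a "complex reasoning" request. A real router would
# likely use a classifier or task metadata instead of keywords.
COMPLEX_HINTS = ("refactor", "architecture", "debug", "prove")


def route(prompt: str) -> Model:
    """Send a prompt to the premium model only if it looks complex."""
    text = prompt.lower()
    if any(hint in text for hint in COMPLEX_HINTS):
        return PREMIUM
    return CHEAP


def cost_usd(tokens: int, model: Model) -> float:
    """Cost of processing `tokens` tokens at the model's per-million price."""
    return tokens / 1_000_000 * model.usd_per_million_tokens
```

With these placeholder prices, `route("Summarize this changelog")` picks the cheap model and `route("Refactor the React client")` escalates to the premium one. The same `cost_usd` function shows why context size matters: 100,000 tokens of unnecessary context costs a few cents on the cheap model and considerably more on the premium one, and that overhead repeats on every call.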
Conclusion: the flat curve is a gift to pragmatic builders
Steve illustrates this with a "Campground Craft" metaphor: the flattening of model capabilities stabilizes the playing field. Models like Claude Sonnet and Opus will dominate mainstream use for years, allowing engineers to focus on solid system design, multi‑agent routing, database optimization, and long‑lasting, craft‑oriented software.
The era of speculative, token‑wasting hype is ending; the golden age belongs to developers who master AI literacy and token hygiene.
TonyBai
Tony Bai's tech world (tonybai.com). Not satisfied with just "knowing how", we strive for mastery. Focused on Go language internals, high-quality engineering practices, and cloud‑native architecture, exploring cutting‑edge intersections of Go and AI. Gophers who pursue technology are welcome—follow me and evolve with Go.
