From Vibe Coding to Agentic Engineering: Why Karpathy Says He’s Falling Behind
In a December 2025 interview, Andrej Karpathy explains how Vibe Coding lowered the software‑creation barrier, why Agentic Engineering shifts responsibility from models to humans, and what engineers must master to manage AI agents safely and effectively.
01 – A Real Turning Point
Andrej Karpathy, a founding member of OpenAI and former head of Tesla's Autopilot AI team, coined the term “Vibe Coding” and recently admitted he feels “more behind than ever” as a programmer. The interview summarized below expands on that feeling.
02 – Vibe Coding: Redefining the Program
Vibe Coding describes a workflow in which a developer looks at a result, states an intent in natural language, lets an LLM modify the code, and copy‑pastes the outcome back. It lowers the entry barrier so that anyone can build tools, while experienced programmers can prototype faster. Karpathy splits software history into three layers:
Software 1.0: humans write code, machines execute it.
Software 2.0 (2017): humans design datasets and loss functions; models learn the rules.
Software 3.0: no hand‑written code or weight training; the prompt is the program, and the model acts as a natural‑language interpreter.
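The three layers above can be contrasted on a toy task (“is this string a greeting?”). This is an illustrative sketch, not Karpathy's code; `fake_llm` is a stand‑in stub for a real model call.

```python
# Toy contrast of the three software layers on one task.
# fake_llm is a stand-in stub, not a real model call.

# Software 1.0: a human writes the rule explicitly.
def is_greeting_v1(text: str) -> bool:
    return text.lower().strip(" !.") in {"hello", "hi", "hey"}

# Software 2.0: a human curates labeled data; an optimizer fits the rule
# (sketched here as a trivial label lookup standing in for training).
DATASET = {"hello": True, "hi": True, "goodbye": False, "thanks": False}

def is_greeting_v2(text: str) -> bool:
    return DATASET.get(text.lower().strip(" !."), False)

# Software 3.0: the "program" is a natural-language prompt; the model
# interprets it directly.
PROMPT = "Answer yes or no: is the following text a greeting? {text}"

def fake_llm(prompt: str) -> str:
    # Stub: a real LLM would read the whole prompt; we just parse the tail.
    candidate = prompt.rsplit("?", 1)[-1].lower().strip(" !.")
    return "yes" if candidate in {"hello", "hi", "hey"} else "no"

def is_greeting_v3(text: str) -> bool:
    return fake_llm(PROMPT.format(text=text)) == "yes"
```

The point of the sketch is where the human effort sits: in 1.0 it is the rule itself, in 2.0 it is the dataset, and in 3.0 it is only the prompt.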
He illustrates the shift with two examples:
Anthropic’s Claude installation: instead of maintaining an ever‑growing shell script, the user hands a textual description to an agent, which reads the environment, runs the steps, and debugs itself.
MenuGen app: originally a pipeline of OCR → image generation → layout, later replaced by a single prompt to Gemini that returns a fully rendered menu, showing that the intermediate engineering layer becomes unnecessary.
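The MenuGen shift can be sketched as follows. Every model call here (`ocr`, `generate_image`, `fake_multimodal_model`) is a hypothetical stub standing in for real OCR and multimodal APIs; the point is that the single‑prompt path collapses the hand‑built pipeline into one call.

```python
# Sketch of the MenuGen shift: a hand-built OCR -> image -> layout pipeline
# versus one multimodal prompt. All model calls are stubs.

def ocr(menu_photo: bytes) -> list[str]:
    return ["Margherita Pizza", "Caesar Salad"]      # stub OCR output

def generate_image(dish: str) -> str:
    return f"<image of {dish}>"                      # stub image generation

def layout(items: list[tuple[str, str]]) -> str:
    return "\n".join(f"{name}: {img}" for name, img in items)

def menugen_pipeline(menu_photo: bytes) -> str:
    """The original three-stage engineering pipeline."""
    dishes = ocr(menu_photo)
    return layout([(d, generate_image(d)) for d in dishes])

def fake_multimodal_model(prompt: str, image: bytes) -> str:
    # Stub: a capable model absorbs all three stages behind one call.
    return menugen_pipeline(image)

def menugen_single_prompt(menu_photo: bytes) -> str:
    """The replacement: one prompt to a multimodal model."""
    prompt = "Render this menu photo as an illustrated menu page."
    return fake_multimodal_model(prompt, menu_photo)
```

Both paths produce the same artifact, but in the second the intermediate engineering layer no longer exists as code the team must maintain.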
Vibe Coding solves the “can it run?” problem, raising the floor for software creation.
03 – Agentic Engineering: Engineers as Managers
Agentic Engineering tackles the next layer: when agents can write code, run workflows, and deliver results, who guarantees quality? The focus moves from execution to questions of error detection, verification, rollback, and accountability. Karpathy proposes treating agents like interns—providing clear tasks, documentation, boundaries, checklists, tests, and reviews—while humans remain responsible for specifications and judgments.
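The “agent as intern” discipline can be sketched as a review gate in which the human owns the spec as explicit checks. This is a minimal illustration, not a real framework; `run_agent` is a hypothetical stub standing in for a coding agent.

```python
# "Agent as intern": the human owns the spec as explicit checks; agent
# output is accepted only when every check passes. run_agent is a stub.

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Task:
    description: str
    checks: list[Callable[[str], bool]] = field(default_factory=list)

def run_agent(task: Task) -> str:
    # Stub standing in for a real coding agent.
    return "def add(a, b):\n    return a + b"

def review(task: Task, output: str) -> bool:
    """Human-owned gate: reject unless all spec checks pass."""
    return all(check(output) for check in task.checks)

task = Task(
    description="Write an add(a, b) function.",
    checks=[
        lambda src: "def add" in src,   # boundary: the required interface
        lambda src: "return" in src,    # checklist item: must return a value
    ],
)
accepted = review(task, run_agent(task))
```

The design choice is that the checks live outside the agent: the agent may change, but the specification and the accountability for it stay with the human.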
He critiques traditional hiring (algorithmic coding tests) and suggests future assessments that require candidates to build a large‑scale, secure agent‑driven system and defend it against multiple attacking coding agents.
04 – Understanding the “Ghost” LLM
Karpathy warns against anthropomorphizing LLMs: they lack bodies, survival pressure, and continuous memory. Their intelligence is uneven—excellent at code generation or vulnerability finding, but clueless about simple commonsense queries. Verification is the true bottleneck; tasks with clear test signals amplify AI’s usefulness, while unverifiable tasks become error‑prone.
He notes that performance jumps (e.g., GPT‑4’s chess skill) often stem from data selection rather than emergent intelligence, emphasizing that model strengths depend on both verifiability and research focus.
05 – Spec Responsibility and Real‑World Pitfalls
In the MenuGen case, the agent mistakenly merged Google‑login emails with Stripe‑payment emails, assuming the two addresses always belong to the same user. The correct design binds all financial flows to an internal unique user ID. The episode shows that agents can follow superficial patterns while lacking deep understanding of domain constraints; humans must define the spec and enforce such invariants.
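The identity invariant can be sketched as follows; the class and field names are illustrative, not the real MenuGen schema.

```python
# The invariant: bind every financial record to an internal user ID,
# never to an email address. Names and fields are illustrative.

import uuid

class UserStore:
    def __init__(self) -> None:
        self.users: dict[str, dict] = {}  # internal_id -> record

    def create_user(self, login_email: str) -> str:
        internal_id = str(uuid.uuid4())
        self.users[internal_id] = {"login_email": login_email, "charges": []}
        return internal_id

    # WRONG (the agent's pattern): assume payment email == login email.
    def record_charge_by_email(self, payment_email: str, cents: int) -> bool:
        for record in self.users.values():
            if record["login_email"] == payment_email:
                record["charges"].append(cents)
                return True
        return False  # silently drops the charge when the emails differ

    # RIGHT: charges attach to the internal ID, whatever emails are in play.
    def record_charge(self, internal_id: str, cents: int) -> None:
        self.users[internal_id]["charges"].append(cents)

store = UserStore()
uid = store.create_user("alice@gmail.com")                       # Google login
matched = store.record_charge_by_email("alice@icloud.com", 500)  # Stripe email
store.record_charge(uid, 500)                                    # correct binding
```

The email-matching path fails exactly when the user pays with a different address than they log in with, which is the failure mode described above.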
Karpathy stresses that engineers must know the underlying concepts (e.g., tensor storage, API differences) that agents cannot reliably infer.
06 – Knowledge Management with LLMs
Karpathy uses a personal wiki: each article he reads is added to the wiki, then an LLM is prompted to ask questions and reorganize the information, generating new insights. The goal is not to let the LLM replace understanding, but to force the human to view the material from many angles, strengthening comprehension.
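The wiki workflow can be sketched as a question‑generation pass over each saved note. `ask_llm` is a stub for a real LLM call, and the prompt text is illustrative, not Karpathy's actual prompt.

```python
# Wiki workflow: each saved article is paired with LLM-generated questions
# that force re-engagement with the material. ask_llm is a stub.

QUESTION_PROMPT = (
    "Read the note below and ask three questions that probe its claims,\n"
    "its assumptions, and how it connects to my other notes.\n\nNOTE:\n{note}"
)

def ask_llm(prompt: str) -> list[str]:
    # Stub standing in for a real LLM call.
    return [
        "What evidence supports the main claim?",
        "Which assumption would break this argument?",
        "Which earlier note does this contradict or extend?",
    ]

def process_article(wiki: dict[str, dict], title: str, text: str) -> None:
    """Store the note together with questions generated from it."""
    questions = ask_llm(QUESTION_PROMPT.format(note=text))
    wiki[title] = {"text": text, "questions": questions}

wiki: dict[str, dict] = {}
process_article(wiki, "Software 3.0", "Prompts are the new programs...")
```

The questions are stored alongside the note rather than answered by the model, keeping the human, not the LLM, as the one doing the understanding.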
This contrasts with many “AI second brain” products that merely store summaries, risking superficial knowledge.
Conclusion
Karpathy’s sense of falling behind stems not from an inability to code—AI already handles that—but from the challenge of keeping pace with rapidly evolving tools, mapping model capabilities, building verifiable workflows, and preserving human understanding. The evolution from Vibe Coding to Agentic Engineering shifts engineers from code authors to system owners who manage cognition, specifications, and accountability.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
