Notes — From Vibe Coding to Agentic Engineering
Source: raw/llm-vibe-coding-script.md | Author: Andrej Karpathy | 2025
Four questions [Adler frame]
Q1 — What is it about as a whole? A conversation-format talk tracking the shift from experimental AI-assisted coding to Software 3.0: a new programming paradigm in which natural language is the programming language and the LLM is the interpreter. The central claim: this shift is not incremental acceleration but a categorical change.
Q2 — How is it argued? First-person accounts (Karpathy’s December experience), concrete product examples (MenuGen, Claude Code installer), and a framework (Software 1.0/2.0/3.0) that makes the claim falsifiable. The verifiability analysis explains the jaggedness of AI capability.
Q3 — Is it true, in whole or part? The Software 1.0/2.0/3.0 framework is a useful organising lens, not a strict empirical claim. The verifiability explanation for capability clustering is consistent with the published RL literature. The “neural computer” endpoint is speculative; Karpathy acknowledges this. The vibe coding / agentic engineering distinction is practically useful for setting expectations.
Q4 — What of it? Three actionable implications: (1) invest in agentic tooling as a core engineering skill; (2) when choosing where to build, look for verifiable domains; (3) update hiring practices to test agentic engineering capability.
Glossary
- Software 1.0 — explicit code; deterministic; engineer-authored.
- Software 2.0 — neural network weights; the program is learned from the training data rather than written by hand.
- Software 3.0 — prompt as program; LLM as interpreter; natural language is the programming language.
- Vibe coding — narrating intent to an agent and deferring code-writing. Raises the floor.
- Agentic engineering — coordinating agents to maintain professional quality. Raises the ceiling.
- Verifiability — the property that makes a domain tractable for RL; enables automatic reward signals.
- Jaggedness — uneven capability profile; gaps reflect where labs chose to invest in RL environments and training data.
- Neural computer — speculative endpoint: device that processes raw A/V through a neural network; no traditional OS.
Key claims by section
The December shift [§ Feeling Behind as a Coder]
- Shift happened December 2024. Karpathy had used agentic tools for ~1 year. [§ Feeling Behind as a Coder]
- Previous experience: model produced chunks, he corrected them. December: no corrections needed. [§ Feeling Behind as a Coder]
- Key: “a different relationship with the machine.” Not just faster — fundamentally different. [§ Feeling Behind as a Coder]
Software 3.0 [§ Software 3.0 Explained]
- LLM = programmable computer of a new kind; multitasks because trained on the internet. [§ Software 3.0 Explained]
- The way you program it: natural language. [§ Software 3.0 Explained]
- What lives in the context window is your lever over the LLM. [§ Software 3.0 Explained]
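The "prompt as program" idea above can be sketched in a few lines. This is an illustration only, not something shown in the talk: `build_context`, `run_program`, and the `Interpreter` type are hypothetical names, and `echo_interpreter` is a plain Python stand-in rather than a real model call.

```python
from typing import Callable, Sequence

# Hypothetical type: any function mapping a full context string to a reply.
# A real setup would wrap an LLM API call here; none is assumed.
Interpreter = Callable[[str], str]


def build_context(instructions: str, documents: Sequence[str], request: str) -> str:
    """Assemble the context window. In Software 3.0 this text is the program."""
    parts = ["# Instructions", instructions]
    for i, doc in enumerate(documents, start=1):
        parts += [f"# Document {i}", doc]
    parts += ["# Request", request]
    return "\n".join(parts)


def run_program(interpreter: Interpreter, context: str) -> str:
    """'Execute' the natural-language program by handing it to the interpreter."""
    return interpreter(context)


def echo_interpreter(context: str) -> str:
    """Stand-in for a model: reports how many lines of context it received."""
    return f"received {len(context.splitlines())} lines of context"


if __name__ == "__main__":
    ctx = build_context(
        instructions="Answer using only the documents provided.",
        documents=["Menu: pho, banh mi, goi cuon."],
        request="Which dishes are on the menu?",
    )
    print(run_program(echo_interpreter, ctx))
```

The only thing the caller controls is what goes into `ctx`, which is the sense in which the context window is the lever.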
Agent as installer [§ Agents as the Installer]
- Claude Code installation: a block of text you paste to the agent. [§ Agents as the Installer]
- Agent reads environment, executes steps, debugs in a loop. No explicit conditionals needed. [§ Agents as the Installer]
- The programming artefact is no longer a script — it’s a prompt. [§ Agents as the Installer]
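The read-execute-debug loop described above might look roughly like this. It is a sketch under stated assumptions: `run_agent`, `Policy`, `Executor`, and the fake environment are all hypothetical, and `scripted_policy` stands in for the model. The branching inside `scripted_policy` represents judgment the model would supply; the goal prompt itself contains no conditionals.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    command: str
    ok: bool
    output: str


# Hypothetical interfaces (assumptions, not a real agent API):
# a policy picks the next command from the goal and history, or None when done;
# an executor runs a command and returns (ok, output).
Policy = Callable[[str, List[StepResult]], Optional[str]]
Executor = Callable[[str], Tuple[bool, str]]


def run_agent(goal: str, policy: Policy, execute: Executor,
              max_steps: int = 10) -> List[StepResult]:
    """Loop: ask the policy for a step, run it, feed the result back."""
    history: List[StepResult] = []
    for _ in range(max_steps):
        command = policy(goal, history)
        if command is None:  # policy judges the goal complete
            break
        ok, output = execute(command)
        history.append(StepResult(command, ok, output))
    return history


def make_fake_executor() -> Executor:
    """Fake environment: 'tool' cannot be installed before 'dep'."""
    installed = set()

    def execute(command: str) -> Tuple[bool, str]:
        if command == "install dep":
            installed.add("dep")
            return True, "dep installed"
        if command == "install tool":
            if "dep" not in installed:
                return False, "error: dep missing"
            installed.add("tool")
            return True, "tool installed"
        return False, f"unknown command: {command}"

    return execute


def scripted_policy(goal: str, history: List[StepResult]) -> Optional[str]:
    """Stand-in for the model: reacts to the last result instead of a fixed script."""
    if not history:
        return "install tool"
    last = history[-1]
    if last.ok and last.command == "install tool":
        return None
    if not last.ok and "dep missing" in last.output:
        return "install dep"
    return "install tool"


if __name__ == "__main__":
    for step in run_agent("Install tool.", scripted_policy, make_fake_executor()):
        print(step)
```

The first attempt fails, the error text is fed back, and the policy repairs the environment before retrying. A conventional install script would have needed that branch written in advance.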
MenuGen [§ Menu Gen vs Raw Prompts]
- MenuGen: OCR pipeline + image generation + Vercel app. Software 1.0. [§ Menu Gen vs Raw Prompts]
- Software 3.0 version: photograph menu, give to multimodal model with one prompt → annotated image. No pipeline. [§ Menu Gen vs Raw Prompts]
- “All of my MenuGen is spurious. The app shouldn’t exist.” [§ Menu Gen vs Raw Prompts]
- New things now possible that couldn’t exist before: not faster but categorically new. [§ Menu Gen vs Raw Prompts]
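The contrast between the two MenuGen shapes can be sketched as below. Everything here is hypothetical: the `Ocr`, `ImageGen`, and `Multimodal` signatures, the prompt wording, and the fake components are assumptions made for illustration, not the actual MenuGen code.

```python
from typing import Callable, Dict, List

# Hypothetical component signatures (assumptions for illustration).
Ocr = Callable[[bytes], List[str]]          # menu photo -> dish names
ImageGen = Callable[[str], bytes]           # dish name -> generated picture
Multimodal = Callable[[bytes, str], bytes]  # (photo, prompt) -> annotated photo

PROMPT = "Add a picture of each dish next to its name on this menu photo."


def menugen_pipeline(photo: bytes, ocr: Ocr, generate_image: ImageGen) -> Dict[str, bytes]:
    """Software 1.0 shape: explicit stages glued together by code."""
    dishes = ocr(photo)
    return {dish: generate_image(dish) for dish in dishes}


def menugen_prompt(photo: bytes, model: Multimodal) -> bytes:
    """Software 3.0 shape: one multimodal call, no intermediate stages."""
    return model(photo, PROMPT)


# Fake components so the sketch runs without any external service.
def fake_ocr(photo: bytes) -> List[str]:
    return ["pho", "banh mi"]


def fake_image(dish: str) -> bytes:
    return f"<image of {dish}>".encode()


def fake_model(photo: bytes, prompt: str) -> bytes:
    return photo + b"+annotations"


if __name__ == "__main__":
    print(menugen_pipeline(b"menu", fake_ocr, fake_image))
    print(menugen_prompt(b"menu", fake_model))
```

In the second shape the glue code disappears entirely, which is the sense of the remark that the app "shouldn't exist".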
Neural computer [§ What’s Obvious by 2026]
- Endpoint: device takes raw A/V → neural network → diffusion renders UI. No OS. CPUs as co-processors. [§ What’s Obvious by 2026]
- “Extremely foreign” — Karpathy’s own descriptor. [§ What’s Obvious by 2026]
- Intelligence compute already the dominant FLOP share. [§ What’s Obvious by 2026]
- 1950s analogy: calculator-style and neural-network-style computing were both live options; the calculator path won. Now the diagram may flip. [§ What’s Obvious by 2026]
Verifiability [§ Verifiability and Jagged Skills]
- RL requires reward signal, which requires verifier. Labs build RL environments where verifiers are cheap + valuable. [§ Verifiability and Jagged Skills]
- Maths and code: verifiable → fast improvement. [§ Verifiability and Jagged Skills]
- Car-wash example: “I want to go to a car wash 50 metres away. Should I drive or walk?” → models say walk. Wrong: the car itself has to get to the car wash, so you must drive. [§ Verifiability and Jagged Skills]
- Chess spike GPT-3.5 → GPT-4: someone at OpenAI added a large chess corpus. Capability follows data decisions. [§ Verifiability and Jagged Skills]
- “You are somewhat at the mercy of what the labs put in the mix.” [§ Verifiability and Jagged Skills]
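To make the verifier-to-reward link concrete, here is a minimal sketch of a cheap verifier for code: run the candidate against test cases and return the pass rate as the reward. `code_reward` and the test-case format are hypothetical, and real RL environments sandbox execution; calling `exec` directly is for demonstration only.

```python
from typing import Sequence, Tuple

# Hypothetical test-case format: (positional args, expected return value).
TestCase = Tuple[tuple, object]


def code_reward(candidate_source: str, func_name: str,
                tests: Sequence[TestCase]) -> float:
    """Return the fraction of tests passed; 0.0 if the code fails to load."""
    namespace: dict = {}
    try:
        # Demo only: never exec untrusted code outside a sandbox.
        exec(candidate_source, namespace)
        func = namespace[func_name]
    except Exception:
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case simply earns no credit
    return passed / len(tests) if tests else 0.0


TESTS = [((1, 2), 3), ((0, 0), 0), ((2, 2), 4)]
GOOD = "def add(a, b):\n    return a + b\n"
BAD = "def add(a, b):\n    return a - b\n"
BROKEN = "def add(a, b) return"

if __name__ == "__main__":
    for name, src in [("good", GOOD), ("bad", BAD), ("broken", BROKEN)]:
        print(name, code_reward(src, "add", TESTS))
```

The reward is computed without human judgment, which is what makes such a domain cheap to train on. No comparably cheap check exists for a question like the car-wash one.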
Vibe coding vs agentic engineering [§ From Vibe Coding to Agent Engineering]
- Vibe coding: raises the floor. Anyone can build. [§ From Vibe Coding to Agent Engineering]
- Agentic engineering: preserves the quality bar. Vulnerabilities introduced by careless agent use remain your responsibility. [§ From Vibe Coding to Agent Engineering]
- 10× engineer framing obsolete. Fully AI-native people operate at “multiples that dwarf 10×.” [§ From Vibe Coding to Agent Engineering]
- Hiring test: have the candidate build a large project, deploy it, then attack it with agents. Does it hold? [§ From Vibe Coding to Agent Engineering]