Nathan Lambert and Sebastian Raschka on State of AI in 2026
Lex Fridman Podcast #490. Two ML researchers — Nathan Lambert (post-training lead, Allen Institute for AI) and Sebastian Raschka (ML educator, author) — review the state of the art in AI as of 2026: scaling axes, the open-weight explosion, training pipeline anatomy, and where the field is headed.
Source: Lex Fridman Podcast #490
Speakers: Nathan Lambert, Sebastian Raschka
Date: 2026
Key ideas
- Three scaling axes, all still active. Scaling Laws have not plateaued; they have multiplied. Pre-training (model and data size), RL with verifiable rewards, and inference-time compute are three independent axes. Frontier labs now balance ROI across all three based on deployment economics rather than pursuing a single lever.
- Data quality, not quantity, drove OLMo 3. AI2’s model outperformed competitors while using less total data. The differentiator was careful data curation: blending GitHub, arXiv, Stack Exchange, and Reddit at empirically optimised ratios, chosen by downstream eval performance rather than raw token count.
- Open-weight explosion is structurally durable. Chinese labs (DeepSeek, Qwen, MiniMax, Z.ai) use open-weight releases as a market-access strategy in regions where US companies face payment or security barriers. Government incentives are expected to sustain this for years. There is no single winner, and the leapfrogging continues.
- Built-in tool use changes hallucination economics. gpt-oss, described as the first open-weight model trained with tool use as a first-class objective, illustrates Nathan and Sebastian’s shared view: a model that can search or calculate doesn’t need to memorise, which reduces hallucinations structurally rather than through training alone.
- AGI is a completion-rate question. Sebastian’s operational reframe: today’s models complete ~30–40% of complex multi-step tasks reliably. At 90–95%, the distinction between “AGI” and “very powerful AI” becomes practically meaningless. Nathan adds: systems dramatically more capable than today will exist within years regardless of definitional debates.
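The data-curation idea above (choosing source ratios by downstream eval performance rather than token count) can be sketched as a small grid search over mixture weights. This is an illustration only, not AI2’s actual procedure: the function names, the grid step, and the `toy_downstream_score` function with its made-up target are all assumptions. In practice the score would come from training a small proxy model on each mixture and evaluating it.

```python
from itertools import product

# Source names taken from the notes; everything else here is hypothetical.
SOURCES = ("github", "arxiv", "stackexchange", "reddit")


def candidate_mixtures(step=0.25):
    """Yield every weight tuple on a grid of size `step` whose weights sum to 1."""
    ticks = round(1 / step)
    for counts in product(range(ticks + 1), repeat=len(SOURCES)):
        if sum(counts) == ticks:
            yield tuple(c / ticks for c in counts)


def toy_downstream_score(weights):
    """Stand-in for 'train a proxy model on this mixture, then run evals'.

    The target below is invented purely so the example has a known optimum.
    """
    target = (0.5, 0.25, 0.25, 0.0)
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


def best_mixture(score_fn, step=0.25):
    """Return the candidate mixture with the highest downstream score."""
    return max(candidate_mixtures(step), key=score_fn)


if __name__ == "__main__":
    weights = best_mixture(toy_downstream_score)
    print(dict(zip(SOURCES, weights)))
```

The point of the sketch is the selection criterion: mixtures are ranked by an evaluation score, and the amount of data available from each source never enters the decision.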
Speakers
| Name | Role |
|---|---|
| Nathan Lambert | Post-training lead, Allen Institute for AI; author of a definitive RLHF book |
| Sebastian Raschka | ML educator; author of Build a Large Language Model from Scratch and Build a Reasoning Model from Scratch |
Topics covered
- China vs US AI race — no winner, leapfrogging pattern
- Open-weight landscape: Chinese (large MoE) vs Western (smaller, transparent)
- Transformer architecture evolution from GPT-2 to MoE
- Three scaling axes: pre-training, RL, inference-time compute
- Pre-training vs mid-training vs post-training pipeline anatomy
- Data quality as OLMo 3’s competitive advantage
- Synthetic data and OCR-extracted PDFs as high-signal sources
- Tool use as a structural hallucination remedy
- AI coding tools: Codeium, Claude Code, Cursor
- AGI framing via task completion rates
- Autonomous agents as 2026’s key development
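The "tool use as a structural hallucination remedy" topic can be illustrated with a minimal tool-dispatch sketch: instead of recalling an arithmetic result from its weights, the model emits a structured call and the runtime computes the answer. The call format (`name`, `arguments`), the `TOOLS` registry, and `run_tool_call` are hypothetical and do not reflect the interface of gpt-oss or any particular model.

```python
import ast
import operator

# Arithmetic operators the hypothetical calculator tool will accept.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression: str):
    """Evaluate a plain arithmetic expression without using eval()."""

    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")

    return ev(ast.parse(expression, mode="eval").body)


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"calculator": calculator}


def run_tool_call(call: dict):
    """Dispatch a model-emitted call such as {'name': ..., 'arguments': ...}."""
    return TOOLS[call["name"]](call["arguments"])


if __name__ == "__main__":
    # The model delegates the multiplication rather than guessing the product.
    print(run_tool_call({"name": "calculator", "arguments": "1234 * 5678"}))
```

The result is computed exactly at call time, so correctness for this class of question no longer depends on what the model memorised during training.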
Cross-references
Concepts: Scaling Laws · Reinforcement Learning from Human Feedback · Tool Use · Pretraining · Large Language Models · Transformers · Hallucination
Speakers: Nathan Lambert · Sebastian Raschka
Notes: Nathan Lambert and Sebastian Raschka on State of AI in 2026