Nathan Lambert and Sebastian Raschka on State of AI in 2026

Lex Fridman Podcast #490. Two ML researchers — Nathan Lambert (post-training lead, Allen Institute for AI) and Sebastian Raschka (ML educator, author) — review the state of the art in AI as of 2026: scaling axes, the open-weight explosion, training pipeline anatomy, and where the field is headed.

Source: Lex Fridman Podcast #490
Speakers: Nathan Lambert, Sebastian Raschka
Date: 2026

Key ideas

Three scaling axes, all still active. Scaling Laws have not plateaued; they have multiplied. Pre-training (model/data size), RL with verifiable rewards, and inference-time compute are three independent axes. Frontier labs now balance ROI across all three based on deployment economics rather than pursuing a single lever.
Data quality, not quantity, drove OLMo 3. AI2’s model outperformed competitors while using less total data. The differentiator was careful data curation: blending GitHub, arXiv, Stack Exchange, and Reddit at empirically optimised ratios, chosen by downstream eval performance rather than raw token count.
The open-weight explosion is structurally durable. Chinese labs (DeepSeek, Qwen, MiniMax, Z.ai) use open-weight releases as a market-access strategy in regions where US companies face payment or security barriers. Government incentives are expected to sustain this for years. No single winner emerges; leapfrogging continues.
Built-in tool use changes hallucination economics. gpt-oss, described as the first open-weight model trained with tool use as a first-class objective, illustrates Nathan and Sebastian’s shared view: a model that can search or calculate doesn’t need to memorise, which reduces hallucinations structurally rather than through training alone.
AGI is a completion-rate question. Sebastian’s operational reframe: today’s models complete ~30–40% of complex multi-step tasks reliably. At 90–95%, the distinction between “AGI” and “very powerful AI” becomes practically meaningless. Nathan adds: systems dramatically more capable than today will exist within years regardless of definitional debates.
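
The data-curation idea above (choosing source ratios by downstream eval performance rather than token count) can be illustrated with a minimal sketch. This is not AI2’s actual procedure, which these notes do not describe. The grid search, the function names, and especially `toy_eval_score` with its arbitrary target ratios are assumptions made for illustration; in practice the score would presumably come from training a small proxy model on each mixture and running real evals.

```python
from itertools import product

# Source names taken from the notes; everything else here is illustrative.
SOURCES = ("github", "arxiv", "stack_exchange", "reddit")


def candidate_mixtures(step=0.25):
    """Yield every weight assignment on a coarse grid whose weights sum to 1."""
    n = round(1 / step)
    for counts in product(range(n + 1), repeat=len(SOURCES)):
        if sum(counts) == n:
            yield {src: c / n for src, c in zip(SOURCES, counts)}


def toy_eval_score(mix):
    """HYPOTHETICAL stand-in for a downstream eval.

    Scores a mixture by its closeness to an arbitrary, made-up target.
    A real pipeline would train a proxy model on the mixture and
    measure benchmark performance instead.
    """
    target = {"github": 0.5, "arxiv": 0.25, "stack_exchange": 0.25, "reddit": 0.0}
    return -sum((mix[src] - target[src]) ** 2 for src in SOURCES)


def best_mixture(score_fn, step=0.25):
    """Pick the candidate mixture with the highest eval score."""
    return max(candidate_mixtures(step), key=score_fn)


if __name__ == "__main__":
    print(best_mixture(toy_eval_score))
```

The point of the sketch is only the selection criterion: the mixture is ranked by an eval score, and the total number of tokens never enters the decision.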

Speakers

Name	Role
Nathan Lambert	Post-training lead, Allen Institute for AI; author of a definitive RLHF book
Sebastian Raschka	ML educator; author of Build a Large Language Model from Scratch and Build a Reasoning Model from Scratch

Topics covered

China vs US AI race — no winner, leapfrogging pattern
Open-weight landscape: Chinese (large MoE) vs Western (smaller, transparent)
Transformer architecture evolution from GPT-2 to MoE
Three scaling axes: pre-training, RL, inference-time compute
Pre-training vs mid-training vs post-training pipeline anatomy
Data quality as OLMo 3’s competitive advantage
Synthetic data and OCR-extracted PDFs as high-signal sources
Tool use as a structural hallucination remedy
AI coding tools: Codeium, Claude Code, Cursor
AGI framing via task completion rates
Autonomous agents as 2026’s key development
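
The "tool use as a structural hallucination remedy" topic can be made concrete with a small sketch: instead of recalling an arithmetic result from its weights, a model emits a tool call and the host program computes the answer exactly. The call format `{"tool": ..., "input": ...}`, the `TOOLS` registry, and the function names are hypothetical and do not reflect gpt-oss’s actual tool-calling interface, which these notes do not specify.

```python
import ast
import operator

# Arithmetic operators the toy calculator supports.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression):
    """Evaluate a plain arithmetic expression without using eval()."""

    def _eval(node):
        # Numeric literals only (bool is excluded even though it subclasses int).
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# Hypothetical tool registry the host exposes to the model.
TOOLS = {"calculator": calculator}


def run_tool_call(call):
    """Dispatch a tool call of the assumed form {"tool": name, "input": text}."""
    return TOOLS[call["tool"]](call["input"])


if __name__ == "__main__":
    # A model that emits this call does not need to have memorised the product.
    print(run_tool_call({"tool": "calculator", "input": "1234 * 5678"}))
```

The design choice this illustrates is that correctness moves from the model’s memory to the tool: the model only has to decide when to call and what to pass.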

Cross-references

Concepts: Scaling Laws · Reinforcement Learning from Human Feedback · Tool Use · Pretraining · Large Language Models · Transformers · Hallucination
Speakers: Nathan Lambert · Sebastian Raschka
Notes: Nathan Lambert and Sebastian Raschka on State of AI in 2026