Nathan Lambert and Sebastian Raschka on State of AI in 2026

Tags: transcript · lex-fridman · scaling · training · open-source · ai-race

Lex Fridman Podcast #490. Two ML researchers — Nathan Lambert (post-training lead, Allen Institute for AI) and Sebastian Raschka (ML educator, author) — review the state of the art in AI as of 2026: scaling axes, the open-weight explosion, training pipeline anatomy, and where the field is headed.

Source: Lex Fridman Podcast #490
Speakers: Nathan Lambert, Sebastian Raschka
Date: 2026


Key ideas

  • Three scaling axes, all still active. Scaling Laws have not plateaued; they have multiplied. Pre-training (model and data size), RL with verifiable rewards, and inference-time compute are three independent axes. Frontier labs now balance ROI across all three based on deployment economics rather than pulling a single lever.
  • Data quality, not quantity, drove OLMo 3. AI2’s model outperformed competitors while training on less total data. The differentiator was careful data curation: blending GitHub, arXiv, Stack Exchange, and Reddit at empirically optimised ratios, chosen by downstream eval performance rather than raw token count.
  • The open-weight explosion is structurally durable. Chinese labs (DeepSeek, Qwen, MiniMax, Z.ai) use open-weight releases as a market-access strategy in regions where US companies face payment or security barriers. Government incentives are expected to sustain this for years. There is no single winner, and the leapfrogging continues.
  • Built-in tool use changes hallucination economics. gpt-oss, described as the first open-weight model trained with tool use as a first-class objective, illustrates Nathan and Sebastian’s shared view: a model that can search or calculate does not need to memorise every fact, so hallucinations are reduced structurally rather than through training alone.
  • AGI is a completion-rate question. Sebastian’s operational reframe: today’s models complete ~30–40% of complex multi-step tasks reliably. At 90–95%, the distinction between “AGI” and “very powerful AI” becomes practically meaningless. Nathan adds: systems dramatically more capable than today will exist within years regardless of definitional debates.
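
The data-blending idea in the OLMo 3 point above can be sketched roughly as follows. This is a minimal illustration, not AI2's actual pipeline: the mixture weights, the function names (sample_sources, pick_best_mixture), and the eval_fn callback are all hypothetical, and the source does not state the real ratios. The sketch shows two steps: drawing training documents from each source in proportion to a mixture, and choosing between candidate mixtures by a downstream eval score rather than by token count.

```python
import random
from collections import Counter

# Hypothetical mixture weights for illustration only;
# the source does not give the ratios actually used.
MIXTURE = {"github": 0.4, "arxiv": 0.3, "stack_exchange": 0.2, "reddit": 0.1}


def sample_sources(mixture, n, seed=0):
    """Draw n source labels in proportion to the mixture weights."""
    rng = random.Random(seed)
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


def pick_best_mixture(candidates, eval_fn):
    """Return the candidate mixture with the highest downstream eval score.

    eval_fn is an assumed callback: in practice it would train a small
    model on the mixture and score it on held-out benchmarks.
    """
    return max(candidates, key=eval_fn)


if __name__ == "__main__":
    n = 10_000
    counts = Counter(sample_sources(MIXTURE, n))
    for name, count in counts.most_common():
        print(f"{name}: {count / n:.1%}")
```

In a real setting the expensive part is eval_fn, since each candidate mixture needs a training run; the selection logic itself stays this simple.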

Speakers

Name | Role
Nathan Lambert | Post-training lead, Allen Institute for AI; author of a definitive RLHF book
Sebastian Raschka | ML educator; author of Build a Large Language Model from Scratch and Build a Reasoning Model from Scratch

Topics covered

  • China vs US AI race — no winner, leapfrogging pattern
  • Open-weight landscape: Chinese (large MoE) vs Western (smaller, transparent)
  • Transformer architecture evolution from GPT-2 to MoE
  • Three scaling axes: pre-training, RL, inference-time compute
  • Pre-training vs mid-training vs post-training pipeline anatomy
  • Data quality as OLMo 3’s competitive advantage
  • Synthetic data and OCR-extracted PDFs as high-signal sources
  • Tool use as a structural hallucination remedy
  • AI coding tools: Codeium, Claude Code, Cursor
  • AGI framing via task completion rates
  • Autonomous agents as 2026’s key development

Cross-references

Concepts: Scaling Laws · Reinforcement Learning from Human Feedback · Tool Use · Pretraining · Large Language Models · Transformers · Hallucination
Speakers: Nathan Lambert · Sebastian Raschka
Notes: Nathan Lambert and Sebastian Raschka on State of AI in 2026