Speaker

Nathan Lambert

Post-training lead at the Allen Institute for AI (AI2). Author of the definitive book on Reinforcement Learning from Human Feedback. Works on OLMo, AI2’s fully open-weight language model series. One of the most technically credible public voices on post-training methodology, RL scaling, and the open-weight ecosystem.


Background

Research scientist focused on post-training: supervised fine-tuning, RLHF, RLVR, and the emerging RL environments frontier. Led the post-training work on OLMo 3. Writes publicly about AI training methodology and the state of open-weight models. His RLHF book is the most comprehensive technical account of post-training available outside closed lab research.

Known for: rigorous practitioner perspective on scaling economics; balanced takes on the China/US AI race; critical view of AGI definitions as operationally unhelpful.


Appearances in this wiki

Episode | Source | Date
Dylan Patel and Nathan Lambert on DeepSeek and China AI | Lex Fridman Podcast | 2025
Nathan Lambert and Sebastian Raschka on State of AI in 2026 | Lex Fridman Podcast | 2026

Key positions

  • Three independent scaling axes all remain active: pre-training, RL with verifiable rewards, inference-time compute
  • Data quality drove OLMo 3’s competitiveness despite using fewer tokens than some rivals
  • Chinese open-weight releases are a durable market-access strategy, not a temporary anomaly
  • AGI framing is semantically awkward; track task completion rates instead
  • Autonomous agents operating with minimal oversight are 2026’s most consequential development