Sebastian Raschka
ML educator and researcher. Author of Build a Large Language Model from Scratch and Build a Reasoning Model from Scratch, two widely read practitioner books on constructing LLMs from first principles. Maintains mlxtend, a long-running open-source Python library for machine learning algorithms. Previously a statistics professor at the University of Wisconsin-Madison and a research engineer at Lightning AI.
Background
Known for translating frontier ML research into accessible practitioner knowledge. His LLM book builds modern architectures incrementally, starting from GPT-2, illustrating that the core transformer architecture has remained largely stable since 2019: later models add modifications (mixture-of-experts layers, RMSNorm, grouped-query attention) rather than fundamental reinventions.
Approaches AI tools empirically: uses Codeium + VS Code for coding assistance, preferring control over full agentic autonomy. Maintains one of the oldest actively developed Python ML libraries (mlxtend), giving him direct experience with the flood of LLM-generated pull requests facing open-source maintainers.
Appearances in this wiki
| Episode | Source | Date |
|---|---|---|
| Nathan Lambert and Sebastian Raschka on State of AI in 2026 | Lex Fridman Podcast | 2026 |
Key positions
- Transformer architecture from GPT-2 to today: modifications (MoE, RMSNorm, GQA), not fundamental reinventions
- Pre-training scaling is not dead, merely less immediately attractive; inference-time scaling currently offers the better ROI
- Tool use built into model training addresses hallucinations structurally
- AGI is best measured as task completion rate; at 90–95% reliable completion, philosophical definitions become irrelevant
- Human verification, however minimal, distinguishes verified LLM-generated data from raw AI output
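
To make the first position above concrete, here is a minimal sketch of one such modification: the switch from LayerNorm (used in GPT-2) to RMSNorm (used in many later models). This is an illustration written for this page, not code from Raschka's books. The function names `layer_norm` and `rms_norm` are hypothetical, and the learned scale and shift parameters are left out to keep the comparison short.

```python
import math

def layer_norm(x, eps=1e-5):
    """LayerNorm without learned parameters: subtract the mean, divide by the std."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def rms_norm(x, eps=1e-5):
    """RMSNorm without learned parameters: divide by the root mean square only."""
    mean_sq = sum(v * v for v in x) / len(x)
    return [v / math.sqrt(mean_sq + eps) for v in x]

# Hypothetical activation vector for a single token.
activations = [1.0, 2.0, 3.0, 4.0]
ln_out = layer_norm(activations)
rms_out = rms_norm(activations)

print("LayerNorm:", [round(v, 3) for v in ln_out])
print("RMSNorm:  ", [round(v, 3) for v in rms_out])
```

The two functions differ only in whether the mean is subtracted first, which is the sense in which the change is a modification of the same block rather than a new architecture.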