Brendan Foody on Evals, the Expert Labour Market, and Mercor’s Rise

Source: Lenny’s Podcast
Speaker: Brendan Foody
Date: ~2025
Link: https://www.lennysnewsletter.com/p/experts-writing-ai-evals-brendan-foody

Key ideas

Era of evals as AI’s core bottleneck. “If the model is the product, the eval is the PRD.” Evals simultaneously function as product requirements, verifier rewards in reinforcement learning pipelines, and sales collateral. The bottleneck to model improvement is not compute or pre-training data but the ability to define success across every domain humans want models to master. Academic benchmarks (GPQA, Olympiad Maths) are saturating; the frontier has moved to practical domain evals — how do we measure a model redlining a contract the way a lawyer would? Models are only as good as their evals.
Expert labour transition. AI training shifted from crowdsourcing low-skill annotators (early LLMs needed grammatically correct sentences) to sourcing high-calibre domain experts — Goldman analysts, McKinsey consultants, FAANG engineers, lawyers, doctors, Emmy-winning screenwriters. The top 10% of experts drive the majority of model improvement (power law applies). Mercor sits at the intersection of labour marketplace and data company: labs need the people, not just the data. Median pay $95/hr; up to $500/hr for deep domain expertise.
Elastic demand as job-survival heuristic. Jobs are resilient when AI-driven productivity gains expand demand by 10x or more; jobs with inelastic demand may shrink. Software engineering, product management, and consulting have near-unlimited demand: making engineers 10x more productive means 10x more software gets built. The skill to cultivate is leveraging AI within whatever domain you already operate in: rather than fighting it, use whatever tools are available. Assess talent by what they can build in an hour with AI, not without.
PMF signal: surprisingly easy to sell. Mercor found product-market fit by looking for the customer who was “surprisingly easy to sell into” rather than forcing a fit. The pivot: after bootstrapping to $1M ARR in general tech hiring, the team noticed that AI labs needed expert professionals and that incumbents were asleep. There was no sales or marketing for the first 18 months, only customer obsession; net revenue retention (NRR) exceeded 1,600%. In fast-moving markets, leading indicators matter more than the traditional “why now” framing.
Culture tenets: can-do, high standards, intensity. Set radically ambitious revenue goals before the trajectory exists (called $50M ARR at $1.5M; hit it on schedule). First-10 hire patience: initial talent density shapes the entire org permanently; speed matters more once product-market fit is confirmed. Intensity as a byproduct of ownership, not mandated hours. Focus on amplifying strengths rather than patching weaknesses — Brendan’s own dyslexia as a frame for differentiated thinking.
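
To make the first key idea concrete, the sketch below shows one way an expert-written rubric could double as a scalar reward. It is a minimal illustration under stated assumptions, not Mercor’s or any lab’s actual pipeline: the `Criterion` structure, the `rubric_score` function, the weights, and the keyword checks (standing in for an expert or model-based judge) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    """One expert-written rubric item (hypothetical structure)."""
    description: str
    weight: float
    check: Callable[[str], bool]  # stand-in for an expert or model-based judge


def rubric_score(response: str, rubric: List[Criterion]) -> float:
    """Return the weighted fraction of criteria the response satisfies, in [0, 1]."""
    total = sum(c.weight for c in rubric)
    if total <= 0:
        raise ValueError("rubric weights must sum to a positive value")
    earned = sum(c.weight for c in rubric if c.check(response))
    return earned / total


# Toy rubric for a contract-redlining task; criteria and weights are invented.
contract_rubric = [
    Criterion("Flags the uncapped liability clause", 3.0,
              lambda r: "liability" in r.lower()),
    Criterion("Proposes a mutual termination right", 2.0,
              lambda r: "termination" in r.lower()),
    Criterion("Notes the governing-law clause", 1.0,
              lambda r: "governing law" in r.lower()),
]

draft = "Cap liability at 12 months of fees and add a mutual termination right."
print(rubric_score(draft, contract_rubric))  # 5 of 6 weighted points earned
```

The same number can serve as a pass/fail requirement (the “PRD” role), as a reward signal during reinforcement learning, or as a headline figure when comparing models. In practice the keyword checks would be replaced by expert or model-based judgments, which is where the expert labour described above comes in.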

Overview

Brendan Foody, CEO and co-founder of Mercor and the youngest unicorn founder at the time of recording, describes how Mercor grew from $1M to $400M ARR in 16 months (described as the fastest in history) by building the expert labour marketplace that AI labs use for post-training data and evals. The episode covers: why evals are the central bottleneck and opportunity in AI; the shift from crowdsourced annotation to expert sourcing; elastic vs inelastic demand as a frame for future-proofing careers; the PMF discovery story (the xAI meeting and the failure of the incumbent crowdsourcing player); Mercor’s three culture tenets; and Brendan’s belief that AGI/superintelligence timelines are longer than public forecasts suggest.

Related

Asha Sharma on Product as Organism, Post-Training, and the Agentic Society: post-training data and the agentic era from a Microsoft perspective
Archie Abrams on Shopify’s Growth Model, Long-Term Holdouts, and the 100-Year Vision: hyper-growth company building with long-horizon thinking
Scaling Laws: Brendan’s contrarian view that scaling laws are plateauing; post-training data and better evals, not more pre-training compute, will drive the next frontier
