Brendan Foody on Evals, the Expert Labour Market, and Mercor's Rise

Tags: transcript, lennys-podcast, ai, evals, labour-markets, mercor, future-of-work, startups

Source: Lenny’s Podcast
Speaker: Brendan Foody
Date: ~2025
Link: https://www.lennysnewsletter.com/p/experts-writing-ai-evals-brendan-foody

Key ideas

  • Era of evals as AI’s core bottleneck. “If the model is the product, the eval is the PRD.” Evals simultaneously function as product requirements, verifier rewards in reinforcement learning pipelines, and sales collateral. The bottleneck to model improvement is not compute or pre-training data but the ability to define success across every domain humans want models to master. Academic benchmarks (GPQA, Olympiad Maths) are saturating; the frontier has moved to practical domain evals — how do we measure a model redlining a contract the way a lawyer would? Models are only as good as their evals.
  • Expert labour transition. AI training shifted from crowdsourcing low-skill annotators (early LLMs needed grammatically correct sentences) to sourcing high-calibre domain experts — Goldman analysts, McKinsey consultants, FAANG engineers, lawyers, doctors, Emmy-winning screenwriters. The top 10% of experts drive the majority of model improvement (power law applies). Mercor sits at the intersection of labour marketplace and data company: labs need the people, not just the data. Median pay $95/hr; up to $500/hr for deep domain expertise.
  • Elastic demand as job-survival heuristic. Jobs where an AI-driven productivity increase expands demand by 10x or more are resilient; jobs with inelastic demand may shrink. Software engineering, product management, and consulting have near-unlimited demand: making engineers 10x more productive means 10x more software gets built. The skill to cultivate is leveraging AI within whatever domain you already operate in: don’t fight it, use whatever tools are available. Assess talent by what they can build in an hour with AI, not without.
  • PMF signal: surprisingly easy to sell. Mercor found product-market fit by looking for the customer who was “surprisingly easy to sell into” rather than forcing it. The pivot: bootstrapped to $1M ARR doing general tech hiring, then spotted that AI labs needed expert professionals and incumbents were asleep. No sales/marketing for the first 18 months — customer obsession only; NRR >1,600%. Leading indicators in fast-moving markets matter more than traditional “why now” framing.
  • Culture tenets: can-do, high standards, intensity. Set radically ambitious revenue goals before the trajectory exists (called $50M ARR at $1.5M; hit it on schedule). First-10 hire patience: initial talent density shapes the entire org permanently; speed matters more once product-market fit is confirmed. Intensity as a byproduct of ownership, not mandated hours. Focus on amplifying strengths rather than patching weaknesses — Brendan’s own dyslexia as a frame for differentiated thinking.
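
The first key idea treats an eval as both a requirements document and a verifier reward. As a rough illustration only, and not a description of how Mercor or any lab actually builds evals, a rubric-style eval could be sketched as a weighted checklist whose score doubles as a reward signal. The `Criterion` class, the `score` function, and the toy keyword checks below are all hypothetical; in practice each check would be an expert's judgment or a grader model rather than a string match.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Criterion:
    """One expert-written rubric item (hypothetical structure)."""
    description: str
    weight: float
    check: Callable[[str], bool]  # stand-in for an expert or model grader


def score(output: str, rubric: List[Criterion]) -> float:
    """Return the weighted fraction of rubric criteria satisfied (0.0 to 1.0)."""
    total = sum(c.weight for c in rubric)
    if total <= 0:
        raise ValueError("rubric must have positive total weight")
    earned = sum(c.weight for c in rubric if c.check(output))
    return earned / total


# Toy rubric for the contract-redlining example; the checks are illustrative only.
redline_rubric = [
    Criterion("Flags the uncapped liability clause", 3.0,
              lambda text: "liability" in text.lower()),
    Criterion("Proposes a specific cap", 2.0,
              lambda text: "cap" in text.lower()),
    Criterion("Notes the governing-law clause", 1.0,
              lambda text: "governing law" in text.lower()),
]

sample = "Liability is uncapped; propose a cap at 12 months of fees."
reward = score(sample, redline_rubric)
print(f"{reward:.2f}")  # prints 0.83 (5 of 6 weighted points)
```

The same number can be read three ways, matching the three roles in the notes: as a pass/fail bar for a requirement, as a scalar reward during training, or as a headline figure when comparing models.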

Overview

Brendan Foody, CEO and co-founder of Mercor and the youngest unicorn founder at the time of recording, describes how Mercor grew from $1M to $400M ARR in 16 months (fastest in history) by building the expert labour marketplace that AI labs use for post-training data and evals. The episode covers: why evals are the central bottleneck and opportunity in AI; the shift from crowdsourced annotation to expert sourcing; elastic vs inelastic demand as a frame for future-proofing careers; the PMF discovery story (the xAI meeting and the failure of the incumbent crowdsourcing player); Mercor’s three culture tenets; and Brendan’s belief that AGI/superintelligence timelines are longer than public forecasts suggest.
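
The elastic vs inelastic demand frame can be made concrete with a small calculation. This is a textbook-style sketch, not a model from the episode. It assumes that a productivity gain lowers unit cost proportionally, that the saving is fully passed on as a lower price, and that demand has a constant price elasticity. The function name and the example elasticity values are hypothetical.

```python
def labour_demand_multiplier(productivity_gain: float, elasticity: float) -> float:
    """Relative change in labour hours needed after a productivity gain.

    Assumptions (illustrative only):
      * price falls by a factor of 1 / productivity_gain
      * demand has constant price elasticity `elasticity` (given as a positive number)
    """
    if productivity_gain <= 0:
        raise ValueError("productivity_gain must be positive")
    # A price cut to 1/gain raises quantity demanded by gain ** elasticity.
    quantity_multiplier = productivity_gain ** elasticity
    # Each worker now produces `gain` times more, so fewer hours are needed per unit.
    return quantity_multiplier / productivity_gain


for e in (0.5, 1.0, 1.5):
    m = labour_demand_multiplier(10.0, e)
    print(f"elasticity={e}: labour demand x{m:.2f}")
```

Under these assumptions, a 10x productivity gain leaves labour demand unchanged when demand also expands 10x (elasticity of 1), grows it when demand expands by more, and shrinks it when demand is inelastic. That is the same threshold the heuristic in the notes points to.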