Reading Notes

Asha Sharma on Product as Organism, Post-Training, and the Agentic Society

Source: Asha Sharma on Product as Organism, Post-Training, and the Agentic Society

Four questions [Adler frame]

Q1 — What is it about? A platform executive’s thesis on how AI changes what a product is, how it is built, and how organisations structure around it. Central claims: (1) products are becoming living organisms (not artefacts) whose competitive advantage is a data-metabolism loop; (2) post-training economics will surpass pre-training; (3) successful builders need to own the full loop rather than a function; (4) GUI-centric product thinking is being superseded by composability and code-native interfaces; (5) traditional roadmapping fails in AI — replace with seasons, loose OKRs, and explicit slope investment.

Q2 — How is it argued? Platform-level empiricism: Sharma oversees the AI workloads of 80,000+ companies and can observe patterns across that population. Arguments are inductive from direct observation (e.g. Microsoft Dragon physician tool: 30–60% character acceptance → 83% after expert annotation of 600K interactions). Historical analogy used for the GUI → code-native thesis: databases → SQL, cloud consoles → Terraform. Nathan Lambert’s leaderboard study cited for the 30B-parameter post-training threshold. References throughout to Microsoft internal products and customers (GitHub Copilot, Spark prototyping, Dragon, Azure agents).

Q3 — Is it true? The post-training thesis is well-grounded — the ~30B parameter threshold is empirically supported, fine-tuning adoption is observable at scale, and the Dragon example gives strong specific evidence. The product-as-organism framing is persuasive as a conceptual shift even if contested in its universality (not all software benefits equally from continuous model optimisation). The GUI → code-native prediction is historically grounded but the timeline is speculative. The “loop not lane” builder thesis matches what is observable in AI-native startups but may overstate how quickly enterprise organisations restructure. The seasons planning model is clearly practical but untested as a durable management framework.

Q4 — What of it? For product builders: reframe the product’s KPI from “what did we ship?” to “what is the metabolism of the loop?” For platform/infra investors: post-training infrastructure is the growth vector. For PMs and PMs managing PMs: the polymath builder is the archetype to hire and develop toward. For planners: abandon six-month roadmaps in favour of season-grounded OKRs. For leaders: optimism is an active skill that can be cultivated and deployed, not just a personality trait.


Glossary

Product as organism: Sharma’s framing for the shift from software as a static shipped artefact to software as a continuously learning system. The product’s IP is its fine-tuning loop (proprietary data + reward design + observability + A/B testing), not its feature set. Metabolic rate of the product team = speed of ingestion, digestion, and output improvement.

Post-training: The phase of model development after pre-training (the initial large-scale training on vast token corpora). Includes supervised fine-tuning (SFT), reinforcement learning from human feedback (RLHF), reinforcement learning from AI feedback (RLAIF), and other reward-based optimisation techniques. Cheaper per compute unit than pre-training; increasingly the primary locus of product differentiation.

Pre-training: Initial foundation model training on massive token datasets at enormous CapEx. Produces general-purpose capabilities. Sharma’s argument: once a model exceeds ~30B parameters, additional pre-training runs face diminishing economic returns relative to post-training investment on domain-specific outcomes.

Reward design: The process of specifying what outcomes the fine-tuning loop should optimise for (e.g. price, performance, quality, character acceptance rate). Requires domain expertise and product judgment — analogous to selecting KPIs, but at the model level.

Seasons (planning framework): Sharma’s planning unit above the quarterly OKR. A season is defined by a secular market shift rather than a calendar period. Duration is variable (months). Examples: Season 1 — early GPT prototyping; Season 2 — model and reasoning model explosion; Season 3 — advent of agents. All planning is grounded by the current season’s thesis.

Loop, not the lane: Sharma’s description of the new builder orientation: fluency across the entire product-model loop (data, model behaviour, UX, observability, re-training) rather than depth in a single function. Analogous to Agentic Engineering’s full-stack builder archetype. See also Anuj Rathi’s full-stack PM concept.

Model system (ensemble): Sharma’s preferred architecture: multiple specialised, fine-tuned models orchestrated together rather than a single general model. Different models are optimised for latency, cost, reasoning depth, or domain expertise. Contrasts with the “one model to rule them all” approach.

Slope (vs. snapshot): Sharma’s planning distinction between building for where the technology is headed (slope) and optimising for current capabilities (snapshot). Explicit slack in planning is reserved for investments that disrupt the platform’s own current thinking.


Product as organism

Products are shifting from artefacts (static, shipped, feature-driven) to organisms (living, learning, data-metabolising). The implications:

Old paradigm (artefact) | New paradigm (organism)
Build → ship → monitor dashboard | Ingest data → digest reward signal → output improved model → repeat
KPI: features shipped, conversion rate | KPI: loop metabolism rate, acceptance rate, LTV improvement
IP = feature set, proprietary code | IP = fine-tuning loop, reward design, proprietary training data
Product team as executor | Product team as loop operator
Value frozen at launch | Value compounds over time

The threshold for this shift is the moment a product team can close a feedback loop: deploy → observe → label/annotate → fine-tune → re-deploy. The Dragon example: after experts annotated 600K physician interactions, the character acceptance rate rose from 30–60% to 83%.
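
The notes do not define how character acceptance rate is computed. As a minimal sketch, assume it is the share of suggested characters that users keep. The `Interaction` record, its field names, and the sample numbers below are hypothetical toy values, not Dragon data.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One logged suggestion; field names are hypothetical."""
    suggested_chars: int  # characters the model proposed
    accepted_chars: int   # characters the user kept


def character_acceptance_rate(interactions: list[Interaction]) -> float:
    """Share of suggested characters that users kept (assumed definition)."""
    suggested = sum(i.suggested_chars for i in interactions)
    if suggested == 0:
        return 0.0  # no suggestions logged yet
    accepted = sum(i.accepted_chars for i in interactions)
    return accepted / suggested


# Toy logs before and after a fine-tuning round (made-up values).
before = [Interaction(100, 45), Interaction(200, 90)]
after = [Interaction(100, 83), Interaction(200, 166)]
print(f"before: {character_acceptance_rate(before):.0%}")
print(f"after:  {character_acceptance_rate(after):.0%}")
```

Tracking one such number per deployment is what lets a team tell whether a pass through the loop actually improved the product.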


Post-training economics

Nathan Lambert’s study (referenced): Once a model crosses ~30B parameters, pre-training economics (CapEx per incremental capability) deteriorate relative to post-training (fine-tuning plus reward optimisation on an existing model). Roughly 50% of developers are already fine-tuning.

Post-training stack:

  1. Reward design — specify what the model should optimise for
  2. Data sourcing — proprietary, purchased, or synthetic
  3. Fine-tuning — SFT, RLHF, RLAIF, or other
  4. Evals — measure against target outcomes
  5. A/B testing — validate impact in production
  6. Observability — continuous monitoring for model drift or degradation
  7. Iteration — repeat
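
The stack above can be read as a loop with a promotion gate. The sketch below is a toy under stated assumptions: the "model" is a plain function, `fine_tune` merely memorises labelled examples (a stand-in for SFT or RL), and the eval gate stands in for the A/B test. Observability is not modelled. All names are hypothetical.

```python
from typing import Callable

Model = Callable[[str], str]
Example = tuple[str, str]  # (input text, expert-approved output)


def reward(prediction: str, target: str) -> float:
    """Step 1, reward design: 1.0 for an exact match, else 0.0 (toy choice)."""
    return 1.0 if prediction == target else 0.0


def evaluate(model: Model, data: list[Example]) -> float:
    """Step 4, evals: mean reward over an evaluation set."""
    return sum(reward(model(x), y) for x, y in data) / len(data)


def fine_tune(model: Model, data: list[Example]) -> Model:
    """Step 3 stand-in: memorise labelled pairs, else defer to the old model."""
    table = dict(data)
    return lambda x: table.get(x, model(x))


def run_iteration(model: Model, train: list[Example],
                  eval_set: list[Example],
                  min_gain: float = 0.0) -> tuple[Model, float]:
    """One pass: tune a candidate and promote it only if evals improve."""
    candidate = fine_tune(model, train)
    base, new = evaluate(model, eval_set), evaluate(candidate, eval_set)
    return (candidate, new) if new - base > min_gain else (model, base)


def baseline(text: str) -> str:
    return text.upper()


# Step 2, data sourcing: hand-written toy pairs. Because the toy tuner only
# memorises, the eval set overlaps the training inputs here.
train = [("bp", "blood pressure"), ("hr", "heart rate")]
eval_set = [("bp", "blood pressure"), ("OK", "OK")]
model, score = run_iteration(baseline, train, eval_set)
print(score)
```

Step 7 is then a matter of calling `run_iteration` again with freshly annotated data and the currently promoted model.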

Key implications:

  • Platform infrastructure for post-training (not just inference) is the growth vector
  • Domain-specific models (fine-tuned on vertical data) can outperform general foundation models at a fraction of the cost
  • Proprietary interaction data becomes a compounding asset, not a transient signal

Loop, not the lane

AI compresses a typical product launch from ~500 touch-points (spec, security review, UXR, front-end, back-end, QA, localisation, legal, etc.) spanning 5–7 functions and 6–7 layers to something manageable by a small full-stack team. Sharma’s pattern from AI-native startups:

  • Full-stack builders operate across: cost/efficiency of model → reward/system design → UI/UX for agents or humans → iteration loop
  • Functions blur: PM, engineer, designer, data scientist roles converge around loop ownership
  • Observable in enterprises too: Microsoft Dragon iteration was done by a “small group of individuals, not a large organization”

Historical parallel: each computing paradigm shift (mainframe → PC, server → cloud/mobile) invented new specialist roles. This one is inverting that trend — creating generalists, not specialists.


GUI → code-native interfaces

Historical pattern:

  • Databases: desktop GUIs → SQL (structured text commands)
  • Cloud infrastructure: management consoles → Terraform (declarative code)
  • AI products: (prediction) consumer GUIs → composable text-stream interfaces

Why text/code is better for AI products:

  • LLMs interact natively with text streams — lower latency, better parsing
  • Composability: agents need to read, route, and chain artefacts — text and code compose better than pixels
  • Scale: infinite scale via composable API primitives, not hand-crafted UI components
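
To make the composability point concrete, here is a minimal sketch in which every step reads and writes one JSON string, so steps chain without any shared UI. The step names and the keyword rule are hypothetical; in practice `classify` would be a model call.

```python
import json
from functools import reduce
from typing import Callable

Step = Callable[[str], str]  # every step reads and writes one JSON string


def extract(raw: str) -> str:
    """Wrap free text as a JSON artefact."""
    return json.dumps({"text": raw.strip()})


def classify(doc: str) -> str:
    """Add a routing label with a keyword rule (stand-in for a model call)."""
    data = json.loads(doc)
    data["route"] = "billing" if "invoice" in data["text"].lower() else "general"
    return json.dumps(data)


def summarise(doc: str) -> str:
    """Add a crude summary: the first 40 characters of the text."""
    data = json.loads(doc)
    data["summary"] = data["text"][:40]
    return json.dumps(data)


def pipeline(*steps: Step) -> Step:
    """Compose steps left to right into a single text-in, text-out step."""
    return lambda text: reduce(lambda acc, step: step(acc), steps, text)


handle = pipeline(extract, classify, summarise)
result = json.loads(handle("  Invoice 1042 is overdue  "))
print(result["route"], "|", result["summary"])
```

Because a composed pipeline is itself a `Step`, an agent can nest or reorder pipelines freely, which is harder to do with screens built for human eyes.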

Caveat: Sharma explicitly hedges — CLI/terminal and GUI can coexist; agents and humans may use different interfaces; chat is powerful and will persist. The shift is from GUI-as-default to composability-as-default in design thinking.


Seasons planning framework

Structure:

  1. Season (variable duration; weeks to months): Defined by secular market shift (e.g. “the advent of agents”). Everyone aligned on: what secular changes are happening? What customer problems must we solve? What does winning look like?
  2. Quarterly OKRs (loose): What do we need to do next quarter to put ourselves on a path to winning this season?
  3. Squad goals (4–6 weeks): Problem area targets that ladder up to the quarterly OKRs. Teams operate in squads against specific problem areas.
  4. Explicit slack: Reserved for the slope, meaning investment in disrupting the platform’s own current thinking, not just executing the current plan.

Why traditional six-month planning fails in AI:

  • Technology and the competitive landscape shift faster than the planning cycle
  • 70,000+ enterprise AI tools launched in one year alone
  • Platform bets (model, tool, framework) made on a six-month horizon may already be wrong within three months

Enterprise AI adoption pattern

Sharma’s three-stage model observed across 80,000+ companies:

Stage 1 — AI fluency: Everyone using AI in daily workflows. No fear; awareness of ceiling-raising and floor-lowering effects across all skill levels.

Stage 2 — Process AI: Map an existing process (e.g. customer support, fraud remediation), apply AI, measure P&L impact, feel the intrinsic benefit. Build the muscle for the full loop.

Stage 3 — Growth inflection: Use AI to inflect growth — improve LTV/retention, co-create new categories, deploy embodied agents for exponential task throughput.

Where companies fail: AI for AI’s sake; too many simultaneous projects without a blueprint; no measurement/observability/evals; beholden to a single technology or tool rather than a swappable platform layer. Build for the slope, not the snapshot.