Notes — Stewart Butterfield on Product Craft, Utility Curves, and Why We Don’t Sell Saddles

Four questions [Adler frame]

Q1 — What is it about? A product masterclass from the co-founder of Flickr and Slack. Three large topics: (1) a framework for thinking about product value as a function of user segment and investment (utility curves); (2) the distinction between comprehension and friction as the actual challenge in product design; (3) the organisational diagnosis of fake work (HRWLA) and its cause in Parkinson’s Law. Woven through: the emotional difficulty of pivoting, the generosity principle at Slack, and the ‘We Don’t Sell Saddles Here’ memo as a statement of market-creation obligation.

Q2 — How is it argued? Anecdote-driven. Each principle is grounded in a specific Slack decision: the DND rollout, the @channel redesign debate, the magic link authentication invention. Butterfield moves from case to principle rather than principle to case. The comprehension/friction distinction is built through analogy — Taylor Swift tickets (high intent, specific knowledge) vs. discovering a new product (low intent, no context). HRWLA is diagnosed by applying Parkinson’s Law at organisational scale.

Q3 — Is it true? The comprehension/friction distinction is a genuine and useful reframe. Design literature has long known that cognitive load is a cost — Don’t Make Me Think — but Butterfield’s framing clarifies why reducing clicks often fails: it addresses the wrong constraint. Utility curves are descriptively accurate as an S-curve model; the practical challenge is measuring where on the curve your users actually sit. HRWLA is a plausible but unfalsifiable diagnosis: it depends on distinguishing real from fake work in a way that is hard from outside the organisation.

Q4 — What of it? Comprehension/friction changes how you design onboarding: the question shifts from “how do we reduce steps?” to “how do we build the right understanding?” Utility curves change product investment logic: not all improvements are worth equal investment, and where a segment sits on the curve determines what returns are available. HRWLA explains why adding headcount so often fails to increase output.

Glossary

Utility curve. S-shaped function mapping effort or investment in a product to value extracted from it. Features a long flat entry region (most users extract little), a steep middle section (a core segment gets enormous value), and a plateau. Understanding which segment sits on the steep part determines investment priorities.

Comprehension. The challenge of enabling users to understand what a product does and what to do next. Distinct from friction — friction matters when intent is high and users know exactly what they want; comprehension matters when they arrive with low intent and no existing model of the product.

Friction. Obstacles between a user and their goal. Removing friction is correct when intent is high and specific (Taylor Swift concert tickets). It is the wrong problem when users do not yet understand what they are trying to do.

HRWLA (Hyper-Realistic Work-Like Activities). Work-shaped activities — presentations, alignment meetings, pre-briefings, deck reviews — that consume time and appear identical to real work but produce nothing. Arise when the supply of known-valuable work falls below the headcount available to do it.

Parkinson’s Law. Work expands to fill the time available. Organisational corollary: the number of administrators grows independently of the work they administer. As organisations scale, the ratio of known-valuable work to total headcount falls, producing HRWLA.

Owner’s Delusion. The inability to see one’s product from the customer’s perspective because familiarity has made invisible the context a new user lacks. Example: a restaurant owner who builds a website with no address because they already know where the restaurant is.

We Don’t Sell Saddles Here. Pre-launch Slack memo (eight people, circa 2013) arguing that the company is responsible not just for building the product but for creating the market and the aspiration. Selling a saddle is not selling horseback riding — you must create the desire for the outcome, not just the object.

Magic link authentication. Email-based login in which clicking a link in an email authenticates the user without a password. Slack implemented this for mobile at a time when it required custom engineering; it became a product differentiator before becoming a standard.

Utility curves

The S-curve maps investment (effort, cost, feature depth) on the x-axis against utility extracted by users on the y-axis. Most products have a structure: junk → junk → good → great → plateau.

The key insight is that this curve is different for different user segments. A segment that uses Slack for complex asynchronous work across time zones sits on the steep part of the curve; adding more features yields large value. A segment using Slack as a glorified IRC sits on the flat part; improvements yield little. These two segments are in the same product at the same time.

Strategic implication: the question to ask before any product investment is “which users does this serve and where do they sit on the utility curve?” If you are serving the segment already on the steep part, modest investment produces large returns. If you are serving the flat-part segment, you may need to accept that no amount of investment will significantly increase their value extraction — or you need to move them further along the curve, which is a different kind of work.

Butterfield uses this to explain why not all product improvements are worth equal effort, and why a company should sometimes explicitly not invest in serving certain users rather than spreading investment to cover everyone.
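A minimal sketch of the segment argument, assuming the S-curve can be approximated by a logistic function. The logistic form, the segment names, and every number below are illustrative assumptions, not figures from Butterfield. Each segment gets its own curve (here, a different midpoint), and the same unit of extra investment is evaluated against both.

```python
import math


def utility(investment: float, midpoint: float, steepness: float = 1.0) -> float:
    """Logistic S-curve: flat at low investment, steep near `midpoint`, then a plateau at 1.0."""
    return 1.0 / (1.0 + math.exp(-steepness * (investment - midpoint)))


def marginal_gain(current: float, extra: float, midpoint: float) -> float:
    """Utility added for one segment by `extra` units of investment on top of `current`."""
    return utility(current + extra, midpoint) - utility(current, midpoint)


# Hypothetical segments sharing the same product (same current investment level),
# but with curves centred at different points.
current_investment = 5.0
segment_midpoints = {
    "async_teams": 5.0,   # assumed to sit on the steep part today
    "chat_only": 12.0,    # assumed to sit on the flat entry region today
}

gains = {
    name: marginal_gain(current_investment, 1.0, midpoint)
    for name, midpoint in segment_midpoints.items()
}

for name, gain in gains.items():
    print(f"{name}: +{gain:.4f} utility per extra unit of investment")
```

Under these assumed parameters the steep-part segment gains far more from the same investment than the flat-part segment, which is the shape of the argument for choosing whom to serve. The hard part in practice, as noted in Q3, is estimating the real curve for each segment.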

Comprehension vs. friction

Friction reduction works when: the user knows exactly what they want, has high intent, and is blocked by a step. Taylor Swift tickets on Ticketmaster: the user is certain, motivated, and any extra step hurts. Shave every step.

Comprehension matters when: the user has low intent, does not know what the product does, cannot identify what to do next, and will not persist through confusion. Most users arriving at a new product for the first time are in this state. For them, reducing the step count from four to two does nothing if the remaining two steps are incomprehensible.

The design question for most products is therefore not “how do we reduce clicks?” but “can someone look at this screen, understand what the product does, and understand what to do next?” These are different questions that produce different solutions.

The emotional dimension is critical. When a user cannot understand a step, the cognitive cost of a decision they cannot parse falls on them as a real burden. Butterfield frames this in biological terms: unnecessary thinking is an unnecessary expenditure of metabolic resources. Beyond the metabolic cost, confusion produces an emotional response: users feel stupid, and they associate that feeling with the product. That association persists and is hard to undo.

Implication for the @channel redesign: the product team proposed restoring a pre-populated @mention in thread reply boxes, citing a statistically marginal increase in thread length (2.17 vs. 2.14 messages, a difference of roughly 1.4%). Butterfield rejected this as guaranteed negative expected value regardless of the statistical result: the analysis required feature flags, A/B infrastructure, dashboards, and multiple rescheduled meetings, all to optimise a number with no strategic significance.

The DND rollout: layered defaults

Do Not Disturb required a system of defaults because most users never voluntarily configure notification settings. Without a default, the feature sits unused; with a wrong default, it breaks organisational expectations.

Slack’s approach: four layers. Organisation-level default (set by the company). Admin override (can change the default for their workspace). User override (can change their own setting). Admin-can-re-override (admin can lock the setting if the organisation requires availability windows, e.g. hospitals).

The logic: defaults determine behaviour at scale because most users accept them. A feature whose value depends on mass adoption — like DND, whose value increases when most of your colleagues also have it on — requires a default designed for the average user, not the power user who would configure it manually anyway.
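The four layers above can be read as a precedence chain. The sketch below is a hypothetical illustration of that reading, not Slack's actual implementation: the class, field names, and setting strings are all invented for the example. The assumed order is that an admin lock wins over everything, then the user's own choice, then the workspace override, then the organisation default.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DndLayers:
    """Hypothetical container for the four layers; None means 'not set at this layer'."""
    org_default: str                          # always present: the company-wide default
    workspace_override: Optional[str] = None  # admin changes the default for a workspace
    user_override: Optional[str] = None       # user changes their own setting
    admin_lock: Optional[str] = None          # admin re-override that users cannot change


def effective_dnd(layers: DndLayers) -> str:
    """Resolve the setting that applies, checking the highest-precedence layer first."""
    if layers.admin_lock is not None:
        return layers.admin_lock
    if layers.user_override is not None:
        return layers.user_override
    if layers.workspace_override is not None:
        return layers.workspace_override
    return layers.org_default


# Most users never configure anything, so they receive the default.
typical_user = DndLayers(org_default="22:00-08:00")
# A power user picks their own window.
power_user = DndLayers(org_default="22:00-08:00", user_override="off")
# A hospital locks availability, overriding the user's choice.
on_call_user = DndLayers(org_default="22:00-08:00", user_override="22:00-08:00", admin_lock="off")

print(effective_dnd(typical_user))
print(effective_dnd(power_user))
print(effective_dnd(on_call_user))
```

The first case is the one that matters at scale: because most users stop at the default, the choice of `org_default` effectively decides adoption.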

HRWLA and Parkinson’s Law

Butterfield’s 27-product-manager example: at scale, each PM wanted to hire a junior. The impulse is rational from each PM’s perspective — headcount signals seniority, authority, and pay. But the aggregate result is headcount growth that outpaces the supply of known-valuable work.

When headcount exceeds the supply of real work, HRWLA fills the gap. Presentations replace decisions. Alignment meetings replace aligned action. Pre-briefing calls replace the briefed call. These activities are performed in good faith by intelligent people who believe they are working. They produce no output.

The solution Butterfield proposes is not a cultural fix (telling people to work differently) but a supply-side fix: explicitly define what constitutes known-valuable work and decline to fund activities that fall outside it. This requires willingness to be specific about what is and is not valuable — a willingness most large organisations resist because specificity produces conflict.

The saddles memo

Written when Slack had eight people, the memo’s argument: you are not just responsible for building the product but for building the market. A saddle manufacturer is not a horse-riding company; it does not create the desire to ride, only the object used in riding. Slack had to create the aspiration for organised, searchable, asynchronous communication — not just build the app.

This is a market-creation obligation. It means investing in articulating the problem the product solves for users who have not yet articulated that problem themselves. It means showing people what they are missing before showing them the solution. The memo predates contemporary thinking about category design but anticipates it: define the category before defining the product.

Pivoting: cold rationality

Glitch (the massively multiplayer online game that became Slack) was shut down only after Butterfield had exhausted every non-ludicrous long-shot idea for making it commercially viable. He distinguishes this from a quick pivot: “It’s not something I take lightly. I think it’s very different to be like, ‘There’s three of us and we started making this app and then we pivoted to a different app.’ That doesn’t even really count.”

The emotional obstacle to rational pivoting: the humiliation of admitting failure to investors, employees, and early users causes founders to extend losing bets far beyond what the expected-value calculation warrants. The attachment to sunk cost, to the team, to the identity of the company — all of these distort the calculation.

Distance from the emotional investment is required. The rational question is: given everything we now know, what is the expected value of continuing vs. stopping? Most founders ask a different question: what would it mean about me if I stop?

Generosity as strategy

Three Slack examples illustrate a consistent principle: in the long run, the measure of success is value created for customers.

Employee cash machine: $500 on the spot for a front-line employee who needed it, no questions.
IPO lockup: no lockup on IPO for employees — standard lockups can cost millions if the stock moves.
Downtime credits: 100× credit for downtime, even when this cost ~$8M in a single outage.

The logic: generosity is not charity but strategy. Each of these actions created disproportionate goodwill relative to cost. The $8M downtime credit bought more customer trust than any equivalent marketing spend could. The non-standard action (100×, not industry-standard 1×) made the statement; standard practice would have been invisible.
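To see why the multiplier carries the message, here is a small arithmetic sketch. It assumes the credit is the multiplier times the fee prorated over the downtime; that formula, the function name, and the fee and outage figures are illustrative assumptions, not Slack's contract terms.

```python
def downtime_credit(period_fee: float, period_hours: float,
                    downtime_hours: float, multiplier: float) -> float:
    """Credit = multiplier x (share of the period's fee covering the downtime)."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    prorated_fee = period_fee * (downtime_hours / period_hours)
    return multiplier * prorated_fee


# Hypothetical customer: 1,000 per 30-day month (720 hours), 2-hour outage.
standard = downtime_credit(1000.0, 720.0, 2.0, multiplier=1.0)
generous = downtime_credit(1000.0, 720.0, 2.0, multiplier=100.0)

print(f"1x credit:   {standard:.2f}")
print(f"100x credit: {generous:.2f}")
```

With these assumed numbers the 1× credit is under three units of currency, too small for a customer to notice, while the 100× credit is a visible line on the invoice. That gap is the sense in which the non-standard action makes the statement.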
