GIST Framework
GIST (Goals, Ideas, Steps, Tasks) is Itamar Gilad’s meta-framework for evidence-guided product development, introduced in Evidence-Guided: Creating High-Impact Products in the Face of Uncertainty (2023). It integrates lean startup, design thinking, product discovery, agile delivery, and OKRs into a single coherent model. Each layer addresses a characteristic failure mode in conventional plan-and-execute product organisations.
The core problem GIST addresses
Opinion-based development: teams decide what to build based primarily on conviction, strategic narrative, or seniority, without empirical evidence of user behaviour or business impact. Gilad contrasts two Google products:
- Google+ (~1,000 people, 2011–2019): plan-and-execute mode; the hypothesis went untested at scale until years of investment were sunk. Shut down in 2019 with no measurable impact on competitors.
- Gmail tabbed inbox (2012–present): zero-code validation first (Wizard of Oz test); discovered that ~85–88% of passive inbox users loved it — a finding internal “power user” bias would have suppressed. Now used by 1.8B users.
The difference was operating mode, not talent. Both projects involved the same organisation.
The four layers
G — Goals
What goes wrong: goals treated as planning (“what do we build by when?”); siloed functional goals that pull teams in different directions.
What GIST proposes:
- North Star metric — measures value delivered to users (WhatsApp: messages sent; Airbnb: nights booked; Amplitude: weekly active learning users).
- Top business KPI — measures value captured (revenue, profit, market share).
Both together define the value exchange loop: the organisation creates value for users and captures value back; growing both in balance drives compounding growth.
Build a metrics tree from each:
- Decompose each top metric into sub-drivers hierarchically.
- The two trees overlap in the middle — these are the highest-leverage sub-metrics (moving them shifts both trees).
- Teams own sub-metrics as their area of responsibility.
- Trees expose the quantitative impact of sub-metric changes on top-level metrics.
- Team topology can be rationalised around tree structure rather than functional hierarchy.
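The overlap between the two trees can be sketched in a few lines. This is a minimal illustration, assuming each tree is a plain mapping from a metric to its direct sub-drivers; the metric names and the `descendants` and `shared_sub_metrics` helpers are hypothetical and not taken from the book.

```python
from typing import Dict, List, Set

# Hypothetical trees: each key is a metric, each value lists its direct sub-drivers.
MetricsTree = Dict[str, List[str]]

north_star_tree: MetricsTree = {
    "nights_booked": ["active_guests", "bookings_per_guest"],
    "active_guests": ["new_guest_signups", "guest_retention"],
    "bookings_per_guest": ["search_to_booking_conversion"],
}

business_kpi_tree: MetricsTree = {
    "revenue": ["paid_bookings", "average_booking_value"],
    "paid_bookings": ["search_to_booking_conversion", "guest_retention"],
}


def descendants(tree: MetricsTree, root: str) -> Set[str]:
    """Return every sub-metric reachable from root (root itself excluded)."""
    found: Set[str] = set()
    stack = list(tree.get(root, []))
    while stack:
        metric = stack.pop()
        if metric not in found:
            found.add(metric)
            stack.extend(tree.get(metric, []))
    return found


def shared_sub_metrics(
    tree_a: MetricsTree, root_a: str, tree_b: MetricsTree, root_b: str
) -> Set[str]:
    """Sub-metrics that appear under both top-level metrics."""
    return descendants(tree_a, root_a) & descendants(tree_b, root_b)


overlap = shared_sub_metrics(
    north_star_tree, "nights_booked", business_kpi_tree, "revenue"
)
print(sorted(overlap))  # ['guest_retention', 'search_to_booking_conversion']
```

In this toy example the shared sub-metrics are the candidates a team would most want to own, since moving them affects both the North Star metric and the business KPI.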
OKR integration: metrics trees + mission populate OKRs. Max four key results per team.
I — Ideas
What goes wrong: ideas evaluated by conviction, strategic theme (“it’s about AI”), or HiPPO (highest-paid person’s opinion). Confidence is self-assigned and almost always inflated.
What GIST proposes: ICE scoring + confidence meter.
ICE (Impact, Confidence, Ease; created by Sean Ellis):
- Impact: effect on stated goals. Hard to estimate; ideally backed by test data; at minimum, a structured estimate.
- Confidence: how sure should we be about impact and ease? The hardest and most often inflated dimension.
- Ease: inverse of effort — also a guesstimate, but asking the question improves discussion.
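As a rough sketch of how ICE can rank competing ideas, the snippet below assumes each dimension is scored 0–10 and that the three values are multiplied, which is one common convention (some teams average them instead). The `Idea` class, the `rank` helper, and the example ideas and scores are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Idea:
    name: str
    impact: float      # 0-10: expected effect on the stated goals
    confidence: float  # 0-10: strength of the supporting evidence
    ease: float        # 0-10: inverse of effort

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 0-10 scale.
        for field_name in ("impact", "confidence", "ease"):
            value = getattr(self, field_name)
            if not 0 <= value <= 10:
                raise ValueError(f"{field_name} must be between 0 and 10, got {value}")

    @property
    def ice(self) -> float:
        # Assumption: ICE is the product of the three dimensions.
        return self.impact * self.confidence * self.ease


def rank(ideas: List[Idea]) -> List[Idea]:
    """Return ideas sorted from highest to lowest ICE score."""
    return sorted(ideas, key=lambda idea: idea.ice, reverse=True)


ideas = [
    Idea("AI assistant", impact=9, confidence=0.5, ease=2),
    Idea("Simpler onboarding", impact=5, confidence=4, ease=7),
    Idea("Referral bonus", impact=6, confidence=2, ease=8),
]
ranked = rank(ideas)
for idea in ranked:
    print(f"{idea.name}: {idea.ice}")
```

With multiplication, a high-impact idea backed only by opinion (confidence 0.5) drops to the bottom of the list, which is the behaviour the framework is after.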
Confidence meter — a tiered calibration model (0–10):
| Score | Evidence class | Examples |
|---|---|---|
| 0–1 | Opinion | Self-conviction, pitch decks, strategic themes |
| 1–2 | Social | Stakeholder review, colleague feedback |
| 2–3 | Estimates | Back-of-envelope modelling, business case |
| 3–4 | Anecdotal data | A few interviews, competitor has the feature |
| 4–5 | Market data | Surveys, competitive analysis, field research |
| 5–7 | Low-fidelity tests | Fake door, Wizard of Oz, usability study |
| 6–8 | Rough builds | Early adopter programme, fish food |
| 8–10 | Experiments | A/B test, multivariate, staged rollout |
Key principle: investment in an idea should scale with confidence level. Start cheap; earn the right to invest more.
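One way to apply the confidence meter is to cap a self-assigned confidence score at the top of the range for the strongest evidence collected so far. The sketch below uses the score ranges from the table; the capping rule itself, the dictionary keys, and the function names are assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# Score ranges copied from the confidence meter table (low, high).
CONFIDENCE_RANGES: Dict[str, Tuple[float, float]] = {
    "opinion": (0, 1),
    "social": (1, 2),
    "estimates": (2, 3),
    "anecdotal_data": (3, 4),
    "market_data": (4, 5),
    "low_fidelity_tests": (5, 7),
    "rough_builds": (6, 8),
    "experiments": (8, 10),
}


def confidence_ceiling(evidence: Iterable[str]) -> float:
    """Upper bound implied by the strongest evidence class gathered so far."""
    ceilings = [CONFIDENCE_RANGES[kind][1] for kind in evidence]
    return max(ceilings, default=0)


def calibrate(self_assigned: float, evidence: Iterable[str]) -> float:
    """Assumed rule: never report more confidence than the evidence supports."""
    return min(self_assigned, confidence_ceiling(evidence))


print(calibrate(9, ["opinion", "social"]))                  # 2
print(calibrate(9, ["market_data", "low_fidelity_tests"]))  # 7
print(calibrate(3, ["experiments"]))                        # 3
```

An unknown evidence class raises `KeyError`, which keeps the vocabulary aligned with the table.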
S — Steps
What goes wrong: teams equate “build an MVP” with “run an experiment.” The MVP is usually a near-complete beta, expensive and too late to course-correct cheaply.
What GIST proposes: a full spectrum of validation methods, each a learning milestone:
| Level | Methods | Cost |
|---|---|---|
| Assessment | ICE, assumption mapping, stakeholder 1:1s, business modelling | Negligible |
| Data | User interviews, surveys, competitive analysis, observation | Low–medium |
| Fake/low-fidelity | Fake door, smoke test, Wizard of Oz, concierge | Low |
| Rough build | Fish food, early adopter programme, longitudinal study | Medium |
| Near-complete build | Dog fooding, preview, beta, labs | Medium–high |
| Experiments | A/B test, multivariate, hold-backs, staged rollout | High |
Each step generates evidence. After each: continue, pivot the idea, or kill it and promote the next ICE idea. “You don’t have to start at the right-hand side, which is expensive.”
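The continue, pivot, or kill loop can be sketched as a small state transition. The decision is supplied by the team after reviewing the evidence; nothing here computes it. The `Decision` enum, the `next_state` function, the rule that "continue" moves one level up the table, and the rule that a pivot restarts at the first level are all assumptions for illustration.

```python
from enum import Enum
from typing import List, Optional, Tuple


class Decision(Enum):
    CONTINUE = "continue"
    PIVOT = "pivot"
    KILL = "kill"


# Validation levels in the order the table lists them.
LEVELS = [
    "assessment",
    "data",
    "fake_low_fidelity",
    "rough_build",
    "near_complete_build",
    "experiments",
]


def next_state(
    idea: str,
    level_index: int,
    decision: Decision,
    backlog: List[Tuple[str, float]],
) -> Tuple[Optional[str], int]:
    """Return (active idea, level index) after a step review.

    backlog holds (idea name, ICE score) pairs and is mutated on KILL.
    """
    if decision is Decision.CONTINUE:
        # Assumption: continuing means moving to the next level, if any.
        return idea, min(level_index + 1, len(LEVELS) - 1)
    if decision is Decision.PIVOT:
        # Assumption: a pivoted idea restarts at the first level.
        return idea, 0
    # KILL: promote the highest-scoring idea left in the backlog, if any.
    if not backlog:
        return None, 0
    best = max(backlog, key=lambda item: item[1])
    backlog.remove(best)
    return best[0], 0


backlog = [("Referral bonus", 96.0), ("Dark mode", 30.0)]
state = next_state("Simpler onboarding", 1, Decision.CONTINUE, backlog)
print(state, LEVELS[state[1]])
state = next_state("Simpler onboarding", 2, Decision.KILL, backlog)
print(state, backlog)
```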
T — Tasks
What goes wrong: two disconnected worlds. Managers in roadmap/strategy mode. Developers in Jira/story-points mode. PMs exhausted as translators. Developers disengaged from users and outcomes.
What GIST proposes: the GIST board.
GIST board (per team):
- Goals: max 4 key results for the quarter
- Ideas: active ideas with ICE scores
- Steps: next learning milestones for each idea
Reviewed every two weeks. Discussion: Are we on the right ideas? How are we tracking against goals? What’s blocking the most important steps?
“This middle layer discussion — what are we actually trying to achieve and how well are we doing — is the one that doesn’t happen. Most discussion is at the roadmap level or the task level.”
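A GIST board can be modelled as a small data structure that enforces the four-key-result limit and surfaces ideas with no planned learning milestone. This is a sketch only; the `GistBoard` and `BoardIdea` classes, their fields, and the example content are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

MAX_KEY_RESULTS = 4  # limit stated for the Goals column


@dataclass
class BoardIdea:
    name: str
    ice_score: float
    next_steps: List[str] = field(default_factory=list)  # learning milestones


@dataclass
class GistBoard:
    team: str
    key_results: List[str] = field(default_factory=list)
    ideas: List[BoardIdea] = field(default_factory=list)

    def add_key_result(self, key_result: str) -> None:
        """Add a key result, refusing to exceed the quarterly limit."""
        if len(self.key_results) >= MAX_KEY_RESULTS:
            raise ValueError(f"a board holds at most {MAX_KEY_RESULTS} key results")
        self.key_results.append(key_result)

    def ideas_without_steps(self) -> List[str]:
        """Names of active ideas that have no next learning milestone."""
        return [idea.name for idea in self.ideas if not idea.next_steps]


board = GistBoard(team="Onboarding")
board.add_key_result("Raise week-1 retention from 40% to 45%")
board.ideas.append(BoardIdea("Simpler onboarding", 140, ["Fake door test"]))
board.ideas.append(BoardIdea("Referral bonus", 96))
print(board.ideas_without_steps())  # ['Referral bonus']
```

A check like `ideas_without_steps` supports the biweekly review question about what is blocking the most important steps.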
Outcome vs. release roadmaps
| | Release roadmap | Outcome roadmap |
|---|---|---|
| Commits to | Features + dates | Goals + dates |
| Solution | Pre-determined (low confidence) | Left open until confidence is gained |
| Effect on discovery | Suppresses it (race to ship) | Enables it |
| When features appear | Upfront | After validation (promoted to dated milestone) |
The shift to outcome roadmaps is culturally disruptive — it requires executive alignment because it removes the certainty that feature roadmaps provide to stakeholders.
Stage calibration
| Stage | Recommended approach |
|---|---|
| Pre-PMF startup | Focus on finding PMF. Metrics trees and heavy OKRs are overkill. Iterate fast; goals = find PMF. |
| Series A–B | Start building North Star metric and business KPI. Lightweight GIST board and ICE useful. |
| Scale-up | Full GIST warranted. Cost of opinion-based development is highest here. |
Relationship to other frameworks
- GEM Model (Gibson Biddle): GEM force-ranks Growth/Engagement/Monetisation. GIST’s North Star metric + business KPI is roughly equivalent; GIST adds the full tree decomposition and the decision framework.
- DHM Model (Gibson Biddle): DHM evaluates product strategy quality; GIST evaluates the process for generating and validating that strategy.
- OKRs: GIST is OKR-compatible. The key results in OKRs map directly to GIST goals; the metrics tree populates them.