Notes — Zevi Arnovitz on Vibe Coding as a PM, Multi-Model Review, and Building with Claude Code

Four questions [Adler frame]

Q1. What is it about? A non-technical PM’s systematic workflow for building real products using Cursor and Claude Code, with no coding background. The episode is both a how-to (the /commands system) and a statement about what vibe coding now makes possible for anyone.

Q2. How is it argued? Through a live screen-share demo: Zevi builds a feature (fill-in-the-blank questions) for his paid side project StudyMate end-to-end during the episode. The argument is empirical — here is the workflow in action, here is the output, here is what the code review found. No theory, all practice.

Q3. Is it true? Credible. The underlying claim, that a non-technical person can build and ship a paying product using agentic coding tools, is well-corroborated by 2025–26 evidence. The specific workflow innovations (multi-model peer review, iterative /command refinement) are practical contributions rather than bold claims. The caveat: Zevi notes this works well for contained UI projects and single-developer codebases; he does not recommend that PMs ship database migrations or large projects at multi-team companies.

Q4. What of it? The /commands workflow is portable: anyone can copy it. The multi-model peer review technique is the most novel contribution: models review each other’s code via a structured prompt that attributes the reviews to ‘other dev leads’ while asserting that the primary agent has more context. The meta-lesson: iterating on your prompt system (postmortems on AI failures, asking the AI what in its system prompt caused the error) compounds faster than iterating on the code.

Glossary

/commands — Reusable prompts stored as markdown files in the codebase, invoked by typing /command-name in Claude Code or Cursor. Each encodes a phase of the development workflow.

Exploration phase — A /command that pulls a Linear ticket and instructs Claude to read the codebase, understand the problem, and ask clarifying questions before any code is written. Claude’s job in this phase: understand, not execute.

Create plan — A /command that produces a markdown file with a TLDR, critical decisions, and status-tracked task list. The plan can be handed to any model (Cursor Composer, Gemini) for execution.

Peer review — A /command that frames Claude as ‘dev lead.’ Reviews from other models (Codex, Gemini via Cursor Composer) are pasted in and attributed to ‘other team leads.’ Claude must either explain why their findings are wrong or fix them — not blindly accept.

Learning opportunity — A /command that instructs Claude to explain the current technical context using the 80/20 rule, calibrated to a ‘technical PM in the making with mid-level engineering knowledge.’

CLAUDE.md — The system prompt file loaded into Claude’s context for every session. Encodes team norms, workflow rules, and accumulated lessons from past failures.
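
Sketch of the plan file from the ‘Create plan’ entry above. The notes only say the file has a TLDR, critical decisions, and a status-tracked task list; the section headings, the status markers, and the sample decision and tasks below are illustrative assumptions, not Zevi’s actual template.

```python
from dataclasses import dataclass, field

# Hypothetical status markers; the notes only say tasks are "status-tracked".
STATUS_MARKS = {"todo": "[ ]", "in_progress": "[~]", "done": "[x]"}


@dataclass
class Task:
    title: str
    status: str = "todo"  # must be a key of STATUS_MARKS


@dataclass
class Plan:
    title: str
    tldr: str
    critical_decisions: list[str] = field(default_factory=list)
    tasks: list[Task] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the plan as markdown that any model can read and update."""
        lines = [f"# {self.title}", "", "## TLDR", self.tldr, ""]
        lines += ["## Critical decisions"]
        lines += [f"- {decision}" for decision in self.critical_decisions]
        lines += ["", "## Tasks"]
        lines += [f"- {STATUS_MARKS[t.status]} {t.title}" for t in self.tasks]
        return "\n".join(lines) + "\n"


# Placeholder content for illustration only.
plan = Plan(
    title="Fill-in-the-blank questions",
    tldr="Add a new question type to the quiz flow.",
    critical_decisions=["Reuse the existing question model"],
    tasks=[Task("Add question type", "done"), Task("Build answer input UI")],
)
print(plan.to_markdown())
```

Keeping the plan as plain markdown is what lets it be handed to a different model for execution: the status markers live in the file, not in any one agent’s session.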

Notes

The gradual on-ramp

Zevi’s path: GPT project (chatbot with custom persona) → Bolt/Lovable (vibe coding with guardrails) → Cursor with light mode → dark mode + Claude Code. The progression works like exposure therapy for code aversion. The key insight: you do not need to understand the code; you need to understand the concepts well enough to review the AI’s decisions.

The /commands system

Seven phases in Zevi’s workflow:

/create-issue — capture quickly, push to Linear via MCP
/exploration-phase — fetch Linear ticket, read codebase, ask clarifying questions
/create-plan — produce markdown plan with status trackers
/execute-plan — execute using Cursor Composer (fast) or Claude Code (thorough)
/review — Claude reviews its own output for bugs (critical, high, medium tiers)
/peer-review — frame other models’ reviews as team lead feedback; primary agent defends or fixes
/update-docs — update documentation so future agents have better context
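
Sketch of how the seven commands above map to files. Assumptions: the commands live in a `.claude/commands` directory, one markdown file per command; the notes only say they are markdown files stored in the codebase. The helper names `command_path` and `next_phase` are hypothetical.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the command files (not stated in the notes).
COMMANDS_DIR = Path(".claude/commands")

# The seven phases, in the order listed above.
WORKFLOW = [
    "create-issue",
    "exploration-phase",
    "create-plan",
    "execute-plan",
    "review",
    "peer-review",
    "update-docs",
]


def command_path(invocation: str, commands_dir: Path = COMMANDS_DIR) -> Path:
    """Map an invocation such as '/create-plan' to its markdown prompt file."""
    name = invocation.removeprefix("/")
    if name not in WORKFLOW:
        raise ValueError(f"unknown command: {invocation}")
    return commands_dir / f"{name}.md"


def next_phase(invocation: str) -> Optional[str]:
    """Return the invocation that follows the given one, or None at the end."""
    index = WORKFLOW.index(invocation.removeprefix("/"))
    if index + 1 < len(WORKFLOW):
        return f"/{WORKFLOW[index + 1]}"
    return None
```

For example, `command_path("/create-plan")` resolves to `create-plan.md` inside the commands directory, and `next_phase("/review")` returns `"/peer-review"`.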

The system is self-improving: when a mistake recurs, Zevi asks Claude what in its system prompt caused it and updates the prompt. ‘Going back to your prompts, understanding what was not good enough, iterating on them and then seeing how AI’s responses get better. I think that’s one of the most important things.’

Multi-model peer review

Zevi runs three models side by side: Claude Code as primary dev lead, Codex (OpenAI) as specialist reviewer, and Gemini (via Cursor Composer) as a third reviewer. He characterises them as people:

Claude: communicative, opinionated, collaborative — ‘dream CTO’
Codex: dark-room expert, not communicative, solves hard bugs silently
Gemini: artsy, strong at UI, terrifying to watch work (deletes things, brings them back), produces beautiful results

The peer review /command instructs the primary agent not to defer to reviewers: ‘You have more context than them. Explain why their issues are not real, or fix them.’ Claude sometimes pushes back after the third identical issue: ‘For the third time, this is by design.’
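
Sketch of how the peer review prompt could be assembled. The wording paraphrases the instruction quoted above and is not Zevi’s actual /command text. The function name `build_peer_review_prompt` and the numbered ‘team lead’ headings are assumptions made for illustration.

```python
def build_peer_review_prompt(reviews: list[str]) -> str:
    """Present other models' reviews to the primary agent as team lead feedback."""
    sections = [
        "You are the dev lead on this codebase.",
        "Other team leads reviewed your change. You have more context than them.",
        "For each issue, either explain why it is not a real issue or fix it.",
        "Do not accept a finding without checking it against the code.",
        "",
    ]
    for number, review in enumerate(reviews, start=1):
        # Attribute each pasted review to a numbered team lead, not to a model.
        sections += [f"## Review from team lead {number}", review.strip(), ""]
    return "\n".join(sections)


# Placeholder review text for illustration only.
prompt = build_peer_review_prompt(
    ["Null check missing in parser.", "Unused import."]
)
print(prompt)
```

Putting the ‘more context’ line before the reviews matters for the design: the primary agent reads its mandate to push back before it reads any finding.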

Using AI for interview prep

Zevi built a custom Claude project as an interview coach for his Meta PM process, fed it top frameworks and question banks, and used it for mock interviews, explicitly requesting harsh feedback. His conclusion: AI mocks are valuable for early reps and feedback; cold outreach to humans is irreplaceable for final preparation.