Overview
The Four Phases
Every non-trivial feature follows the same path. Each phase produces a document, gets approved through a dashboard, and builds on the previous phase. Claude never improvises architecture — it works from specs.
1. Requirements: WHAT are we building? User stories, acceptance criteria, non-functional requirements.
2. Design: HOW will it work? Architecture, components, data models, error handling.
3. Tasks: What are the atomic steps? Each task has a _Prompt, _Leverage, and file touch map.
4. Implementation: Build it. Subagent dispatch, code review, implementation logging, then ship.
Phase 1: Requirements ── APPROVAL ──▶ Phase 2: Design
        │                                    │
  codebase-search                         APPROVAL
  Impact Report                              │
        ┌────────────────────────────────────┘
        ▼
Phase 3: Tasks ── APPROVAL ──▶ Phase 4: Implementation
                                      │
                               For each task:
                               ├── Implement
                               ├── Review (spec + quality)
                               ├── log-implementation
                               └── Mark [x]
Phase 1
Requirements
Before writing requirements, Claude must run a codebase search to understand what already exists. This produces an Impact Report showing files likely to be touched, existing patterns to reuse, and potential ripple effects.
Codebase search is a hard gate
No requirements get written without the Impact Report. This prevents Claude from designing features that duplicate existing code or conflict with established patterns. The search uses structural (graph DB) and semantic (vector DB) tools to find related code.
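The gate can be sketched as a small check. This is a hypothetical sketch, not the actual tool API: the `structural_search` and `semantic_search` helpers and the report shape are assumptions standing in for the real graph-DB and vector-DB tools.

```python
def build_impact_report(feature_description, structural_search, semantic_search):
    """Combine graph-DB and vector-DB hits into an Impact Report (sketch)."""
    structural_hits = structural_search(feature_description)  # e.g. call-graph neighbors
    semantic_hits = semantic_search(feature_description)      # e.g. embedding matches
    files = sorted({hit["file"] for hit in structural_hits + semantic_hits})
    return {
        "files_likely_touched": files,
        "patterns_to_reuse": [h for h in semantic_hits if h.get("kind") == "pattern"],
        "ripple_effects": [h for h in structural_hits if h.get("kind") == "caller"],
    }

def write_requirements(spec, impact_report):
    # Hard gate: no requirements document without an Impact Report.
    if not impact_report or not impact_report["files_likely_touched"]:
        raise RuntimeError("codebase search must run before requirements")
    ...
```

The point of the gate is the `raise`: requirements generation refuses to start until the search has produced something to build on.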
Requirements document structure
.spec-workflow/specs/STAK-498-price-scraper-fix/requirements.md
## References
- Issue: [[STAK-498]]
- Spec path: .spec-workflow/specs/STAK-498-price-scraper-fix/
## Requirements
### Requirement 1: eCheck Price Accuracy
**User Story:** As a user viewing JM Bullion prices, I want to see
the eCheck/Wire price, so that I can compare actual purchase costs.
**Acceptance Criteria:**
1. WHEN displaying JM Bullion prices THEN the eCheck/Wire column
SHALL be read, not Card/PayPal
2. WHEN other vendors are displayed THEN their prices SHALL be
unaffected by this change
## Non-Functional Requirements
- Performance: Price extraction under 200ms per vendor
- Reliability: Graceful fallback if column structure changes
Acceptance criteria must be testable
Write criteria as "WHEN X THEN Y SHALL happen" — behavioral statements that can be verified in tests. Avoid vague criteria like "prices should be correct."
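A WHEN/THEN/SHALL criterion maps directly onto a unit test. A minimal sketch, assuming a hypothetical `extract_price` function with the column-preference behavior from Requirement 1 (the real parser lives in price-extract.js and may look nothing like this):

```python
def extract_price(vendor_row, preferred_columns=("eCheck/Wire", "Card/PayPal")):
    """Return the first price found, checking preferred columns in order."""
    for column in preferred_columns:
        if column in vendor_row:
            return vendor_row[column]
    return None  # graceful fallback if the column structure changes

# WHEN displaying JM Bullion prices THEN the eCheck/Wire column SHALL be read
def test_echeck_column_preferred():
    row = {"Card/PayPal": 2199.99, "eCheck/Wire": 2129.99}
    assert extract_price(row) == 2129.99
```

"Prices should be correct" gives a reviewer nothing to check; this criterion fails or passes mechanically.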
Phase 2
Design
The design phase proposes 2-3 implementation approaches, references patterns found in the codebase search, and gets user feedback section by section. The output is a design document covering architecture, components, data models, and error handling.
Key sections: architecture overview, component interfaces (props/state/events for frontend, endpoints/schemas for backend), code reuse analysis (what existing utilities to leverage), and a testing strategy (framework, test directory, what to test).
Phase 3
Tasks
Tasks break the design into atomic, implementable units. Each task touches 1-3 files and includes structured guidance that Claude uses during implementation.
The _Prompt pattern
Every task includes a _Prompt field with four components: Role, Task, Restrictions, and Success criteria. This is the instruction block that a subagent receives when implementing the task.
## Task 1.1: Fix Price Column Selection
- [ ] Update column-aware parser to check eCheck first
- [ ] Add regression test for JM Bullion parsing
_Prompt: Role: Backend Developer specializing in web scraping |
Task: Fix price-extract.js to read eCheck/Wire column before
Card/PayPal column in JM Bullion parser |
Restrictions: Don't change other vendor parsers. Don't modify
the DOM query selectors — only change the column selection
order. |
Success: eCheck prices within $5 of website values for all
JM Bullion products.
_Leverage: devops/pollers/shared/price-extract.js (existing parser)
_Leverage: devops/pollers/shared/vendor-config.js (column mappings)
_Requirements: Requirement #1, Acceptance Criteria #1, #2
File touch map
Each task lists which files it creates, modifies, or tests. The orchestrator uses this to determine which tasks can run in parallel (no overlapping files) and which must run sequentially.
# Tasks with no file overlap → dispatch in parallel
Task 1: MODIFY price-extract.js, TEST price-extract.test.js
Task 2: MODIFY retail-display.js, TEST retail-display.test.js
↑ No overlap → parallel safe
# Tasks sharing files → run sequentially
Task 3: MODIFY price-extract.js ← conflicts with Task 1
↑ Overlap → must wait for Task 1
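The overlap check itself is set intersection. A sketch of how an orchestrator might group tasks into waves, assuming each task carries the files from its touch map (the grouping strategy is illustrative, not the actual dispatcher):

```python
def plan_waves(tasks):
    """Group (task_id, files) pairs into sequential waves; tasks in a wave share no files."""
    waves = []
    for task_id, files in tasks:
        files = set(files)
        for wave in waves:
            # A wave stays parallel-safe only if this task touches none of its files.
            if not files & wave["files"]:
                wave["tasks"].append(task_id)
                wave["files"] |= files
                break
        else:
            waves.append({"tasks": [task_id], "files": set(files)})
    return [w["tasks"] for w in waves]

tasks = [
    ("1", ["price-extract.js", "price-extract.test.js"]),
    ("2", ["retail-display.js", "retail-display.test.js"]),
    ("3", ["price-extract.js"]),  # conflicts with Task 1
]
# Tasks 1 and 2 share no files, so they land in the same wave;
# Task 3 overlaps with Task 1 and is pushed to the next wave.
```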
Phase 4
Implementation
Implementation dispatches subagents with the task's _Prompt guidance. Each task goes through implementation, two rounds of code review, and mandatory logging before being marked complete.
The implementation loop
For each task:
1. Mark in-progress → [ ] becomes [-] in tasks.md
2. Check existing logs → avoid duplicating prior work
3. Dispatch subagent → full _Prompt + _Leverage context
4. Spec compliance review → does code match the task spec?
└─ FAIL? → fix and re-review
5. Code quality review → architecture, error handling, tests
└─ FAIL? → fix critical/important issues
6. Log implementation → MANDATORY before marking done
7. Mark complete → [-] becomes [x] in tasks.md
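The loop above can be sketched in Python. Every helper here is hypothetical, standing in for the real dispatch, review, and logging tools; only the ordering of the steps is taken from the workflow:

```python
def run_task(task, dispatch_subagent, review_spec, review_quality,
             log_implementation, mark):
    mark(task, "[-]")                          # 1. mark in-progress in tasks.md
    if task.get("logged"):                     # 2. check existing logs first
        mark(task, "[x]")
        return
    result = dispatch_subagent(task["prompt"], task["leverage"])   # 3. implement
    while not review_spec(result):             # 4. spec compliance review
        result = dispatch_subagent(task["prompt"], task["leverage"])
    while not review_quality(result):          # 5. code quality review
        result = dispatch_subagent(task["prompt"], task["leverage"])
    log_implementation(task, result)           # 6. MANDATORY before marking done
    mark(task, "[x]")                          # 7. mark complete
```

Note the ordering: logging sits before the final `mark`, so a task cannot reach `[x]` without a log entry.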
Implementation logging
The log-implementation tool creates a searchable record of what was built: files modified, API endpoints created, components added, test results. Future specs check these logs to avoid creating duplicate endpoints or reimplementing existing components.
log-implementation
specName: "STAK-498-price-scraper-fix"
taskId: "1.1"
summary: "Fixed column selection order in JM Bullion parser"
filesModified: ["devops/pollers/shared/price-extract.js"]
artifacts:
functions:
- name: jmPriceFromProseTable
location: "price-extract.js:142"
tests:
- name: "JM Bullion eCheck prices"
status: "passed"
count: "4 passed, 0 failed"
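Checking the logs before building is then a simple lookup. A sketch, assuming logs are loaded as a list of dicts shaped like the record above (the query helper is hypothetical):

```python
def find_existing_artifacts(logs, name):
    """Search implementation logs for an already-built function or component."""
    hits = []
    for log in logs:
        for fn in log.get("artifacts", {}).get("functions", []):
            if fn["name"] == name:
                hits.append((log["specName"], log["taskId"], fn["location"]))
    return hits

logs = [{
    "specName": "STAK-498-price-scraper-fix",
    "taskId": "1.1",
    "artifacts": {"functions": [
        {"name": "jmPriceFromProseTable", "location": "price-extract.js:142"}
    ]},
}]
# A later spec asks: does jmPriceFromProseTable already exist?
find_existing_artifacts(logs, "jmPriceFromProseTable")
# → [("STAK-498-price-scraper-fix", "1.1", "price-extract.js:142")]
```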
This is a hard gate
Never mark a task [x] without calling log-implementation first. The logs are what make the system self-aware — without them, the next spec might create duplicate API endpoints or reimplement a utility that already exists.
When to Skip
Not Everything Needs a Spec
| Situation | Approach | What to use |
| --- | --- | --- |
| Simple bug fix | Bug fast path — debug, issue, fix, PR | /systematic-debugging → /issue → fix |
| Trivial chore | No issue, no spec — just a PR | /gsd (Get Stuff Done) |
| CSS tweak, typo fix | Direct fix in a worktree | /gsd |
| New feature | Full 4-phase spec | /spec ISSUE-ID |
| Complex refactor | Full spec — need design review | /spec ISSUE-ID |
| Multi-component feature | Full spec with parallel task dispatch | /spec + /dispatching-parallel-agents |
Prompt for Claude: Build a spec workflow
I want to implement spec-driven development for my projects. Help me set up:
1. A spec directory structure at .spec-workflow/specs/ with templates for requirements, design, and tasks
2. A requirements template with user stories, acceptance criteria (WHEN/THEN/SHALL format), and non-functional requirements
3. A tasks template with _Prompt (Role/Task/Restrictions/Success), _Leverage, _Requirements, and file touch maps
4. A /spec skill that orchestrates the 4-phase workflow with approval checkpoints at each phase
5. An implementation logging system that records what each task built (files, endpoints, components) so future specs can check for duplicates
Start simple — I can add the dashboard approval system later. For now, just ask me "approve?" before proceeding to the next phase.