AI agents lose context between sessions. They forget decisions, repeat mistakes, and drift from reality. SpecFlow gives them a structured lifecycle, persistent memory, and semantic code intelligence — self-hosted core with optional cloud integrations.
Every session starts from zero. Three failure modes compound over time.
The agent has no memory of yesterday. It doesn't know what it built, what broke, or what conventions you agreed on. You re-explain everything, every time.
You stuff everything into CLAUDE.md — architecture, conventions, decisions, infrastructure. The file balloons. Token waste goes up. Attention quality goes down.
Agents jump straight to code. No requirements phase. No design review. No approval gates. The result: rework, scope creep, and features that don't match intent.
SpecFlow structures how work gets done. DocVault structures what the agent knows. Code Context gives it semantic understanding of your code. Together, they replace the bloated CLAUDE.md + ad-hoc prompting pattern.
Requirements → Design → Tasks → Implementation with dashboard approvals at every gate. Works with Claude Code, Gemini CLI, and Codex CLI — any MCP-compatible agent.
One Obsidian vault serves 8+ repos. Architecture, infrastructure, decisions, issues — all in structured markdown with graph visualization and wikilinks.
Self-hosted Milvus vector database indexes every codebase. Agents search code by meaning, not just keywords. The Code Context MCP server is forked from Zilliz's, hardened with timeouts and stability fixes.
CLAUDE.md stays tiny — a routing table to skills. Each skill encodes a full workflow: debugging, deployment, PR resolution, infrastructure management.
SpecFlow speaks MCP — the open protocol for agent-tool communication. Any agent that supports MCP gets the full spec lifecycle, shared knowledge base, and code intelligence.
One-click install via the Claude Code marketplace. Full skill system with 60+ workflows, session lifecycle, and parallel subagent dispatch.
Add SpecFlow as an MCP server in Gemini’s settings. Full access to spec tools, DocVault, and the approval dashboard. Uses GEMINI.md for agent-specific behavior.
Configure in Codex’s TOML config. Same MCP tools, same workflow, same dashboard. Uses CODEX.md for agent-specific behavior.
All three agents share the same DocVault knowledge base, spec workflow state, and dashboard — work started in one agent can be continued in another.
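As a rough sketch of what registering the server could look like, the Python below prints a config fragment for each CLI. It assumes Gemini CLI reads MCP servers from an `mcpServers` object in its `settings.json` and that Codex reads them from `[mcp_servers.<name>]` tables in its `config.toml`. The server name `specflow` and the launch command `npx -y specflow-mcp` are hypothetical placeholders, not SpecFlow's confirmed package name.

```python
import json

# Hypothetical launch command: substitute whatever the SpecFlow install docs specify.
SERVER_NAME = "specflow"
COMMAND = "npx"
ARGS = ["-y", "specflow-mcp"]  # placeholder package name, not confirmed


def gemini_settings_fragment() -> str:
    """JSON fragment for the `mcpServers` section of Gemini CLI's settings.json."""
    return json.dumps(
        {"mcpServers": {SERVER_NAME: {"command": COMMAND, "args": ARGS}}},
        indent=2,
    )


def codex_config_fragment() -> str:
    """TOML fragment for an `[mcp_servers.<name>]` table in Codex's config.toml."""
    # json.dumps yields double-quoted strings, which are valid TOML for these simple values.
    args = ", ".join(json.dumps(a) for a in ARGS)
    return (
        f"[mcp_servers.{SERVER_NAME}]\n"
        f"command = {json.dumps(COMMAND)}\n"
        f"args = [{args}]\n"
    )


if __name__ == "__main__":
    print(gemini_settings_fragment())
    print(codex_config_fragment())
```

Because both fragments point at the same server process, the agents end up talking to the same spec state and dashboard, which is what makes the hand-off between tools possible.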
Not everything belongs in one file. Each tier has a purpose, a source of truth ranking, and a different retrieval cost.
DocVault: human-curated Obsidian vault. Architecture, infrastructure topology, data models, deployment procedures. Version-controlled. Wins all tie-breaking conflicts. Read via Grep, Glob, or semantic search.
File memory: project-scoped markdown files at ~/.claude/projects/*/memory/. Agent-curated, human-reviewable. Loaded at session start. Stores user preferences, feedback, project state, and external references.
mem0: cloud-based semantic memory. Automatically populated from session digests. Probabilistic retrieval via embedding similarity. Never authoritative: DocVault always wins. Best for past decisions, operational context, and gap-filling.
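The source-of-truth ranking can be pictured as a simple conflict resolver. This is a minimal sketch, not SpecFlow's implementation: the `Fact` type, the tier identifiers, and the example keys are all hypothetical, and the relative order of the two lower tiers is an assumption (the text only guarantees that DocVault always wins).

```python
from dataclasses import dataclass
from typing import Optional

# Lower rank = more authoritative. DocVault first is stated in the text;
# the ordering of the other two tiers is an assumption for illustration.
TIER_RANK = {"docvault": 0, "file_memory": 1, "mem0": 2}


@dataclass(frozen=True)
class Fact:
    key: str    # what the fact is about, e.g. "deploy.target"
    value: str  # the claimed answer
    tier: str   # which memory tier returned it


def resolve(key: str, candidates: list[Fact]) -> Optional[Fact]:
    """Return the most authoritative fact for `key`, or None if no tier has one."""
    matches = [f for f in candidates if f.key == key]
    if not matches:
        return None
    return min(matches, key=lambda f: TIER_RANK[f.tier])


# Hypothetical retrieval results: two tiers disagree about the deploy target.
facts = [
    Fact("deploy.target", "docker-host-a", "mem0"),
    Fact("deploy.target", "docker-host-b", "docvault"),
    Fact("editor.style", "tabs", "file_memory"),
]

if __name__ == "__main__":
    print(resolve("deploy.target", facts))
```

Lower tiers still matter in this model: when DocVault has nothing for a key, the resolver falls through to whatever the cheaper, less authoritative tiers returned.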
Agents shouldn't grep blindly through your codebase. Code Context gives them semantic understanding: search by meaning, not string matching. Backed by a self-hosted Milvus vector database. Embedding generation uses a cloud API or a local model via Ollama.
Ask "find the authentication middleware" and get results even if the code never uses the word "auth." Vector embeddings capture semantic relationships across your entire codebase.
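To make "search by meaning" concrete, here is a toy ranking by cosine similarity. The three-dimensional vectors and file paths are invented stand-ins for real embeddings, which have hundreds of dimensions and come from an embedding model; a production setup would query Milvus rather than loop in Python.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hand-made vectors standing in for real embeddings (hypothetical values).
# Note that no path contains the string "auth".
INDEX = {
    "middleware/verify_session.py": [0.9, 0.1, 0.0],
    "billing/invoice_pdf.py": [0.0, 0.2, 0.9],
    "utils/string_helpers.py": [0.1, 0.9, 0.1],
}

# Pretend embedding of the query "find the authentication middleware".
QUERY = [0.8, 0.2, 0.1]


def search(query_vec: list[float], index: dict[str, list[float]], top_k: int = 2) -> list[str]:
    """Return the `top_k` paths whose vectors are most similar to the query."""
    scored = sorted(((cosine(query_vec, vec), path) for path, vec in index.items()), reverse=True)
    return [path for _, path in scored[:top_k]]


if __name__ == "__main__":
    print(search(QUERY, INDEX))
```

The session-verification file ranks first because its vector points in nearly the same direction as the query, even though a keyword search for "auth" would miss it.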
Cloud vector databases cap you at 4 collections on free tiers. Self-hosted Milvus on your own infrastructure gives unlimited project indices with full data sovereignty.
Forked from Zilliz's MCP server with critical fixes: 30s fetch timeouts, gRPC connection guards, and pinned npm versions. No more silent upstream breakage pulling broken builds into your sessions.
Code indices persist across sessions in Milvus. No re-indexing on every cold start. Incremental updates catch only what changed since the last index run.
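One common way to implement incremental updates is to compare content hashes against a stored manifest. The sketch below assumes that approach; the text does not say how Code Context actually detects changes, and the function names and sample files are hypothetical.

```python
import hashlib


def fingerprint(files: dict[str, str]) -> dict[str, str]:
    """Map each path to a SHA-256 digest of its content."""
    return {path: hashlib.sha256(text.encode("utf-8")).hexdigest() for path, text in files.items()}


def plan_update(previous: dict[str, str], current: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (paths to re-embed, paths to delete from the index)."""
    changed = sorted(p for p, digest in current.items() if previous.get(p) != digest)
    removed = sorted(p for p in previous if p not in current)
    return changed, removed


# Manifest saved at the end of the last index run (hypothetical contents).
last_run = fingerprint({"app.py": "print('v1')", "util.py": "x = 1", "old.py": "pass"})
# Working tree now: app.py edited, old.py deleted, new.py added, util.py untouched.
now = fingerprint({"app.py": "print('v2')", "util.py": "x = 1", "new.py": "y = 2"})

if __name__ == "__main__":
    to_embed, to_delete = plan_update(last_run, now)
    print("re-embed:", to_embed)
    print("delete:", to_delete)
```

Only the changed and new files need fresh embeddings, which is where the savings over a full re-index come from.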
Every non-trivial feature follows the same path. Approvals required at each gate. No phase-skipping.
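The no-phase-skipping rule amounts to a small state machine. This is an illustrative sketch only: the phase names come from the lifecycle described above, while the `SpecLifecycle` class and `GateError` exception are hypothetical names, not part of SpecFlow's API.

```python
from typing import Optional

PHASES = ["requirements", "design", "tasks", "implementation"]


class GateError(RuntimeError):
    """Raised when an approval is attempted out of order."""


class SpecLifecycle:
    def __init__(self) -> None:
        self.approved: set[str] = set()

    @property
    def current_phase(self) -> Optional[str]:
        """The earliest phase not yet approved, or None when all gates are passed."""
        for phase in PHASES:
            if phase not in self.approved:
                return phase
        return None

    def approve(self, phase: str) -> None:
        """Approve `phase`, but only if it is the one currently awaiting approval."""
        if phase != self.current_phase:
            raise GateError(f"cannot approve {phase!r}; waiting on {self.current_phase!r}")
        self.approved.add(phase)


if __name__ == "__main__":
    spec = SpecLifecycle()
    spec.approve("requirements")
    print(spec.current_phase)
```

Any attempt to approve a later phase while an earlier gate is still open raises `GateError`, so the only way forward is through each gate in order.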
Most AI tools treat each session as isolated. SpecFlow creates a continuous learning loop: sessions end with knowledge extraction, and new sessions start pre-loaded with everything that matters.
/prime boots in ~15 seconds: it indexes code, reads digests, pulls mem0 memories, and checks issues and git status. An optional --deep mode runs a more thorough codebase analysis. The agent starts every session knowing what happened yesterday.
/wrap is one command to close a session cleanly: clean up stale branches, sync documentation via /vault-update, extract prescriptive lessons via /retro, then process session logs via /digest-session. Replaces the deprecated /goodnight and /digest-session skills.
/audit scans code quality, security posture, documentation drift, and issue staleness. Produces an actionable remediation report. Run it anytime: before a release, after a sprint, or when things feel off.
/retro asks not "what did we do" but "what should we do differently." It extracts actionable rules from the conversation and saves them as structured memories. Future sessions apply these lessons automatically.
DocVault is a single Obsidian vault that serves every project. No per-repo scaffolding. No duplicate context files. Infrastructure docs sit alongside application architecture.
Different approaches to the same problem: giving AI agents persistent project context.
| Dimension | SpecKit | BMAD | GSD | Taskmaster | mex | Pimzino | SpecFlow |
|---|---|---|---|---|---|---|---|
| Approval gates | None | Advisory | UAT phase | None | None | Dashboard (blocking) | Dashboard + skill enforcement |
| Memory | constitution.md | Git docs | STATE.md | tasks.json | Scaffold files | Steering docs | 3-tier: DocVault + file + mem0 |
| Session learning | None | None | None | None | GROW loop | None | /prime → /audit → /wrap |
| Code search | None | None | None | None | None | None | Semantic + structural |
| Multi-project | Per-repo | Per-repo | Per-repo | Per-repo | Per-repo | Per-repo | One vault, all repos |
| Infrastructure | Code only | Code only | Code only | Code only | Code only | Code only | Docker, DNS, VMs, proxies |
| Drift detection | None | None | None | None | 8 checkers | None | /vault-update gate |
| Multi-tool | Any AI tool | Any AI tool | Claude focused | Any AI tool | 4 tools | Claude + MCP | Claude + MCP ecosystem |
| Self-hosted | Files only | Files only | Files only | Files only | Files only | Node.js | Milvus, Neo4j, Ollama |
| Best for | Quick adoption | Enterprise teams | Solo context eng. | PRD pipelines | Per-repo memory | Structured workflow | Multi-project, high-governance |
SpecFlow is an MCP server that works with any MCP-compatible agent — Claude Code, Gemini CLI, and Codex CLI verified. DocVault is an Obsidian vault. Code Context is a semantic search engine. Together, they replace the bloated instruction file + ad-hoc prompting pattern.
The biggest architectural difference from other spec workflows: tasks don't execute sequentially in one session. Each task runs through an isolated three-stage pipeline, and tasks with zero file overlap execute concurrently.
23 tasks, 30 parallel subagent dispatches, 8 dashboard approval gates, zero file conflicts. Work that would take a full day sequentially was completed in one afternoon.
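The "zero file overlap" rule can be sketched as a batching pass: tasks in the same batch touch disjoint files and may run concurrently, while a task that shares a file with an earlier one waits for a later batch. This is an assumed scheduling strategy for illustration; the task names and file lists are hypothetical, and the three-stage pipeline each task goes through is not modeled.

```python
def plan_batches(tasks: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into ordered batches whose members touch disjoint files.

    A task is placed right after the last batch it conflicts with, so two
    tasks that share a file always run in their original relative order.
    """
    batches: list[list[str]] = []
    batch_files: list[set[str]] = []
    for name, files in tasks.items():
        last_conflict = -1
        for i, used in enumerate(batch_files):
            if used & files:
                last_conflict = i
        target = last_conflict + 1
        if target == len(batches):
            batches.append([])
            batch_files.append(set())
        batches[target].append(name)
        batch_files[target] |= files
    return batches


# Hypothetical task list: two tasks both edit api/auth.py.
TASKS = {
    "add-login-route": {"api/routes.py", "api/auth.py"},
    "style-dashboard": {"web/dashboard.css"},
    "rate-limit-login": {"api/auth.py"},
    "update-readme": {"README.md"},
}

if __name__ == "__main__":
    for number, batch in enumerate(plan_batches(TASKS), start=1):
        print(f"batch {number}: {batch}")
```

Here three of the four tasks land in the first batch and can be dispatched in parallel; only the second task that edits `api/auth.py` has to wait.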
Each component is independently useful today. The roadmap is consolidation: one install, one Docker container, zero required cloud dependencies.
The end state: a single installable system that gives any AI agent persistent memory, semantic code search, structural code intelligence, and spec-driven workflow — self-hosted core, cloud optional.
SpecFlow is open source. The core workflow runs entirely on your machine. Cloud integrations (mem0, embeddings, session digests) are optional and configurable — use local models or cloud APIs based on your preference and hardware.