> This document is the single authoritative briefing of what Vestige *is today* and what *ships next*. Everything in Part 1 is verifiable against the source tree; everything in Part 3 is the committed roadmap agreed 2026-04-19.
Vestige is a Rust-based MCP (Model Context Protocol) cognitive memory server that gives any AI agent persistent, structured, scientifically-grounded memory. It ships three binaries (`vestige-mcp`, `vestige`, `vestige-restore`), a 3D SvelteKit dashboard embedded into the binary, and is distributed via GitHub releases + npm. As of v2.0.7 "Visible" (tagged 2026-04-19), it has **24 MCP tools**, **29 cognitive modules** implementing real neuroscience (FSRS-6 spaced repetition, synaptic tagging, hippocampal indexing, spreading activation, reconsolidation, Anderson 2025 suppression-induced forgetting, Rac1 cascade decay), **1,292 Rust tests**, **251 dashboard tests** (80 just added for v2.0.8 colour-mode), and **402 GitHub stars**. AGPL-3.0.
**The branch `feat/v2.0.8-memory-state-colors` was fast-forwarded into `main` tonight** adding the FSRS memory-state colour mode, a floating legend, ruthless unit coverage, the Rust 1.95 clippy-compat fix (12 sites), and the dark-glass-pill label redesign. CI on main: all 4 jobs ✅.
**The next six releases are scoped:** v2.1 "Decide" (Qwen3 embeddings, in-flight on `feat/v2.1.0-qwen3-embed`), v2.2 "Pulse" (subconscious cross-pollination — **the viral moment**), v2.3 "Rewind" (temporal slider + pin), v2.4 "Empathy" (emotional/frustration tagging, **first Pro-tier gate candidate**), v2.5 "Grip" (neuro-feedback cluster gestures), v2.6 "Remote" (`vestige-cloud` upgrade from 5→24 MCP tools + Streamable HTTP). v3.0 "Branch" reserves CoW memory branching and multi-tenant SaaS.
**Sam's context** (load-bearing for any strategic advice): no steady income since March 2026, Mays Business School deadline May 1 ($400K+ prizes), Orbit Wars Kaggle deadline June 23 ($5K × top 10), graduation June 13. Viral OSS growth comes first; paid tier gates second.
---
## 1. What Vestige Is
### 1.1 Mission
Give any AI agent that speaks MCP a long-term memory and a reasoning co-processor that survives session boundaries, with retrieval ranked by scientifically-validated decay and strengthening rules — not a vector database with a nice coat of paint.
### 1.2 Positioning vs. the competitive landscape
| System | Vestige's angle |
|---|---|
| Zep, Cognee, Letta, claude-mem, MemPalace, HippoRAG | Vestige is **local-first + MCP-native + neuroscience-grounded**. The others are cloud-first (Zep/Cognee), RAG-wrappers (HippoRAG), or toys (claude-mem). Vestige is the only one that implements 29 stateful cognitive modules. |
| ChatGPT memory, Cursor memory | Both are opaque key-value caches owned by their vendor. Vestige is open source and the memory is yours. |
| Plain vector DBs (Chroma, Qdrant) | They retrieve by similarity. Vestige *rewires* the graph on access (testing effect), decays with FSRS-6, competes retrievals, and dreams between sessions. |
### 1.3 The "Oh My God" surface
1. The 3D graph that **animates in real-time** when memories are created, promoted, suppressed, or cascade-decayed.
2. The `dream()` tool that runs a 5-stage consolidation cycle and generates insights from cross-cluster replay.
3. `deep_reference` — an 8-stage cognitive reasoning pipeline with FSRS trust scoring, intent classification, contradiction analysis, and a pre-built reasoning chain. Not just retrieval — actual reasoning.
4. Active forgetting (v2.0.5 "Intentional Amnesia") — top-down inhibitory control with Rac1 cascade that spreads over 72h, reversible within a 24h labile window.
5. Cross-IDE persistence. Fix a bug in VS Code, open the project in Xcode, the agent remembers.
### 1.4 Stats (as of 2026-04-19 post-merge)
| Metric | Value |
|---|---|
| GitHub stars | 402 |
| Total commits (main) | 139 |
| Rust source LOC | ~42,000 (vestige-core) + ~vestige-mcp |
| CI on HEAD | All 4 jobs ✅ (Test macos, Test ubuntu, Release aarch64-darwin, Release x86_64-linux) |
### 1.5 License
**AGPL-3.0-only** (copyleft). If you run a modified Vestige as a network service, you must open-source your modifications. This is intentional — it protects against extract-and-host competitors while allowing a future commercial-license path for SaaS (Part 9.7).
---
## 2. Workspace Architecture
### 2.1 Repo layout
```
vestige/
├── Cargo.toml # Workspace root
├── Cargo.lock
├── pnpm-workspace.yaml # pnpm monorepo marker
├── package.json # Root (v2.0.1, private)
├── .mcp.json # Self-registering MCP config
├── README.md # 22.5 KB marketing + quick-start
├── CHANGELOG.md # 31 KB, v1.0 → v2.0.7 Keep-a-Changelog format
│ (SvelteKit 2) │ embedded via include_dir! into vestige-mcp binary
└──────────┬──────────┘
│ HTTP / WebSocket
▼
┌─────────────────────┐ ┌──────────────────────┐
│ vestige-mcp │ ────► │ vestige-core │
│ (binary + dash BE) │ │ (cognitive engine) │
│ Axum + JSON-RPC │ │ FSRS-6, search, │
│ MCP stdio + HTTP │ │ embeddings, 29 │
│ │ │ cognitive modules │
└─────────────────────┘ └──────────────────────┘
▲
│ path dep
┌────────┴──────────┐
│ vestige-cloud │ (separate repo, Feb 12
│ vestige-http │ skeleton, not yet
│ (Axum + SSE) │ shipped)
└───────────────────┘
```
### 2.3 Build profile
```toml
[profile.release]
lto = true
codegen-units = 1
panic = "abort"
strip = true
opt-level = "z" # Size-optimized; binary is ~22 MB with dashboard
```
### 2.4 Workspace Cargo.toml pinned version
Workspace `version = "2.0.5"`. Crate-level `Cargo.toml` files pin `2.0.7`. Version files are bumped together on each release (5 files: `crates/vestige-core/Cargo.toml`, `crates/vestige-mcp/Cargo.toml`, `apps/dashboard/package.json`, `packages/vestige-init/package.json`, `packages/vestige-mcp-npm/package.json`).
### 2.5 MSRV & editions
- **Rust MSRV:** 1.91 (enforced in `rust-version`).
- **CI Rust:** stable (currently 1.95, which introduced the `unnecessary_sort_by` and `collapsible_match` lints that tonight's fixes addressed).
- **Edition:** 2024 across the entire workspace.
- **Node:** 18+ for npm packages, 22+ for dashboard dev.
- **pnpm:** 10+ for workspace.
---
## 3. `vestige-core` — Cognitive Engine
### 3.1 Purpose
Library crate. Owns the entire cognitive engine: storage, FTS5, vector search, FSRS-6, embeddings, and the 29 cognitive modules. Has no knowledge of MCP, HTTP, or the dashboard — those live one crate up.
| Feature | Default | Purpose | Cost |
|---|---|---|---|
| `qwen3-embed` | **no (v2.1 scaffolding)** | Qwen3 embed backend via Candle (Metal device + CPU fallback) | +candle-core, +~500MB Qwen3 model |
| `metal` | no | Metal GPU acceleration on Apple | macOS only |
| `nomic-v2` | no | Nomic Embed v2 MoE variant | +~200MB model |
| `ort-dynamic` | no | Runtime-load ORT instead of static prebuilt | required on glibc <2.38 |
**Default feature set ships with embeddings + vector-search.** `qwen3-embed` is the v2.1 "Decide" scaffolding — dual-index with feature-gated `DEFAULT_DIMENSIONS` (1024 for Qwen3 vs 256 for Matryoshka-truncated Nomic).
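The feature-gated `DEFAULT_DIMENSIONS` split can be sketched with `cfg` attributes (a minimal illustration; the exact location and attribute shape inside `vestige-core` are assumptions):

```rust
// Sketch: compile-time dimension selection via Cargo features.
// 1024 for Qwen3 embeddings, 256 for Matryoshka-truncated Nomic.
#[cfg(feature = "qwen3-embed")]
pub const DEFAULT_DIMENSIONS: usize = 1024;

#[cfg(not(feature = "qwen3-embed"))]
pub const DEFAULT_DIMENSIONS: usize = 256;
```

Because the constant is resolved at compile time, every index-sizing decision downstream picks up the right dimensionality without runtime branching.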
### 3.4 Module tree (`src/lib.rs`)
```
src/
├── lib.rs # Module tree + prelude re-exports
├── prelude.rs # KnowledgeNode, IngestInput, SearchResult, etc.
```
- Default: **Nomic Embed Text v1.5** via fastembed (ONNX, 768D).
- Matryoshka truncation to 256D for fast HNSW lookups (20× faster than full 768D).
- HyDE query expansion (generate a hypothetical document, embed it, search by its embedding).
- **v2.1 scaffolding:** Qwen3 embedding backend via Candle behind `qwen3-embed` feature. `qwen3_format_query()` helper prepends the instruction prefix ("Given a web search query, retrieve relevant passages that answer the query").
- Embedding cache: in-memory LRU; disk-warm on first run (~130MB for Nomic, ~500MB for Qwen3).
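Matryoshka truncation itself is simple: keep a prefix of the full vector and re-normalize so cosine distances stay meaningful. A hedged sketch (illustrative, not Vestige's actual code):

```rust
/// Keep the first `dims` components of a Matryoshka-trained embedding
/// and re-normalize to unit length so cosine similarity still behaves.
/// (Sketch; the real truncation lives in vestige-core's embedding path.)
fn matryoshka_truncate(full: &[f32], dims: usize) -> Vec<f32> {
    let mut v: Vec<f32> = full[..dims.min(full.len())].to_vec();
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut v {
            *x /= norm;
        }
    }
    v
}
```

This is why the 768D → 256D cut is nearly free in quality terms: Matryoshka-trained models front-load information into the leading dimensions.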
### 3.9 Vector search
- USearch HNSW (pinned 2.23.0; 2.24.0 regressed on MSVC per usearch#746). Int8 quantization.
## 4. `vestige-mcp` — MCP Server + Dashboard Backend
### 4.1 Purpose
Binary crate. Wraps `vestige-core` behind an MCP JSON-RPC 2.0 server, plus an embedded Axum HTTP server that hosts the dashboard, WebSocket event bus, and REST API.
### 4.2 Binaries
| Binary | Source | Purpose |
|---|---|---|
| `vestige-mcp` | `src/main.rs` | **Primary.** MCP JSON-RPC over stdio + optional HTTP transport. Hosts dashboard at `/dashboard/`. |
8. **`explore_connections`** — Graph traversal. Actions: `associations` (spreading activation), `chain` (A*-like path), `bridges` (connecting memories between two concepts).
9. **`predict`** — Proactive retrieval via SpeculativeRetriever. Param: `context{codebase, current_file, current_topics[]}`. Returns predictions with confidence + reasoning. Has a `predict_degraded` flag (v2.0.7) that surfaces warnings instead of silent empty responses.
- **`build_instructions()`** — constructs the `instructions` string returned by `initialize`. Gated on `VESTIGE_SYSTEM_PROMPT_MODE=full`. Full mode emits an extended cognitive-protocol system prompt; default is concise.
- **CognitiveEngine** (`src/cognitive/mod.rs`) — async wrapper around `Arc<Storage>` + broadcast channel. Holds the WebSocket event sender.
- **Tool dispatch** — every `tools/call` invocation is routed to an `execute_*` function by tool name.
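The dispatch pattern is a plain match on the tool name. A sketch with two illustrative tools (the real server routes all 24 this way; the function bodies here are stubs):

```rust
// Sketch of name-based tool dispatch. Tool names and stub bodies are
// illustrative; the real `execute_*` functions take parsed JSON args.
fn dispatch(name: &str, args: &str) -> Result<String, String> {
    match name {
        "remember" => execute_remember(args),
        "recall" => execute_recall(args),
        _ => Err(format!("unknown tool: {name}")),
    }
}

fn execute_remember(_args: &str) -> Result<String, String> {
    Ok("stored".into())
}

fn execute_recall(_args: &str) -> Result<String, String> {
    Ok("results".into())
}
```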
### 4.6 Dashboard HTTP backend (`src/dashboard/`)
- `src/dashboard/mod.rs` — Axum `Router` assembly.
- `src/dashboard/handlers.rs` — all REST handlers (~30 routes).
- `src/dashboard/static_files.rs` — embeds `apps/dashboard/build/` via `include_dir!` at compile time.
| Event | Trigger | Visual effect |
|---|---|---|
| `ConsolidationCompleted` | Consolidation finishes | Green confirmation pulse |
| `RetentionDecayed` | Node's retention drops below threshold during consolidation | Red decay pulse |
| `ConnectionDiscovered` | Dream or spreading activation finds new edge | **Cyan flash on edge (already fires — NOT yet surfaced as a toast; see v2.2 "Pulse")** |
| `ActivationSpread` | Spreading activation from a memory | Turquoise ripple (v2.0.6) |
Interactive 3D graph + CRUD + analytics dashboard. Built with SvelteKit 2 + Svelte 5 runes, embedded into the Rust binary via `include_dir!` and served at `/dashboard/`.
| File | Role |
|---|---|
| `Graph3D.svelte` | **The 3D canvas.** Props: `nodes[]`, `edges[]`, `centerId`, `events[]`, `isDreaming`, `colorMode` (v2.0.8), `onSelect`, `onGraphMutation`. Owns the Three.js scene and all module init. |
| `MemoryStateLegend.svelte` (v2.0.8) | Floating overlay explaining 4 FSRS buckets — only renders when `colorMode === 'state'`. |
| `events.ts` | `mapEventToEffects()` — maps every one of the 19 VestigeEvent variants to a visual effect. Live-spawn mechanics: new nodes spawn near semantically related existing nodes (tag + type scoring), FIFO eviction at 50 nodes. |
| `scene.ts` | Scene factory. Camera 60° FOV at (0, 30, 80). ACESFilmic tone mapping, exposure 1.25, pixel ratio clamped ≤2×. **UnrealBloomPass:** strength 0.55, radius 0.6, threshold 0.2 (retuned v2.0.8 for radial-gradient sprites). OrbitControls with auto-rotate 0.3°/frame. |
| `dream-mode.ts` | Smooth 2s lerp between NORMAL (bloom 0.8, rotate 0.3, fog dense) and DREAM (bloom 1.8, rotate 0.08, nebula 1.0, chromatic 0.005). Aurora lights cycle hue in dream. |
| `temporal.ts` | `filterByDate(nodes, edges, cutoff)`, `retentionAtDate(current, stability, created, target)` using FSRS decay formula. Enables the TimeSlider preview. |
| `shaders/nebula.frag.ts` | Nebula background fragment shader (purple → cyan → magenta cycle with turbulence). |
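The `retentionAtDate` projection in `temporal.ts` rests on the FSRS forgetting curve. A minimal sketch using the published FSRS-4.5 constants (FSRS-6 makes the decay exponent trainable, so treat these numbers as illustrative rather than Vestige's exact parameters):

```rust
/// Retrievability after `elapsed_days` for a memory with the given
/// stability, per the FSRS power-law forgetting curve:
/// R(t) = (1 + (19/81) * t / S) ^ -0.5
/// (FSRS-4.5 constants; FSRS-6 learns the decay exponent per user.)
fn retention_at(stability_days: f64, elapsed_days: f64) -> f64 {
    let factor = 19.0 / 81.0;
    (1.0 + factor * elapsed_days / stability_days).powf(-0.5)
}
```

A nice property for the TimeSlider: when elapsed time equals stability, retention is exactly 0.9, which is how FSRS defines stability in the first place.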
| File | Tests | LOC | Covers |
|---|---|---|---|
| `events.test.ts` | 48 | 864 | Every one of the 19 event handlers + live-spawn + eviction |
| `ui-fixes.test.ts` | 21 | 236 | Bloom retuning, glow-texture gradient, fog density, regression tests for issue #31 |
| **Total** | **251** | **3,291** | |
Infrastructure: `three-mock.ts` (Scene / Mesh / Sprite / Material mocks), `setup.ts` (canvas context mocks including `beginPath`/`closePath`/`quadraticCurveTo` added tonight for the pill redesign), `helpers.ts` (node/edge/event factories).
### 5.10 Build
- `pnpm run build` → static SPA in `apps/dashboard/build/`.
- Precompressed `.br` + `.gz` per asset (adapter-static).
- **Embedded into `vestige-mcp` binary** at compile time via `include_dir!("$CARGO_MANIFEST_DIR/../../apps/dashboard/build")`. Every Rust build rebakes the dashboard snapshot.
---
## 6. Integrations & Packaging
### 6.1 IDE integration matrix (`docs/integrations/*.md`)
All 8 IDEs documented. The common install flow: (a) download `vestige-mcp` binary, (b) point IDE's MCP config at its absolute path, (c) restart IDE, (d) verify with `/context` or equivalent.
| IDE | Config path | Notable |
|---|---|---|
| Claude Code | `~/.claude.json` or project `.mcp.json` | Inline in `CONFIGURATION.md`; one-liner install |
| Claude Desktop | `~/Library/Application Support/Claude/claude_desktop_config.json` | Inline in `CONFIGURATION.md` |
| VS Code (Copilot) | `.vscode/mcp.json` OR User via command | **Uses `"servers"` key, NOT `"mcpServers"`** — Copilot-specific schema. Requires agent mode enabled. |
- **Path A (v2.6.0 "Remote"):** Upgrade the Feb skeleton to match v2.0.7 surface (5 → 24 tools), implement Streamable HTTP, ship Dockerfile + fly.toml. **Keep single-tenant.** Ship as "deploy your own Vestige on a VPS."
- **Path B (v3.0.0 "Cloud"):** Multi-tenant SaaS. Weeks of work on billing, per-tenant DB, ops. Not viable until v2.6 has traction + cashflow.
The recommendation in Part 9 is **A only** for now. B is gated on demand signal + runway.
---
## 8. Version History (v1.0 → v2.0.8)
### 8.1 Shipped releases
| Version | Tag | Date | Theme | Headline |
|---|---|---|---|---|
| v1.0.0 | v1.0.0 | 2026-01-25 | Initial | First MCP server with FSRS-6 memory |
| v1.1.x | v1.1.0/1/2 | — | CLI separation | stats/health moved out of MCP to CLI |
**Shipping cadence:** weekly minor bumps (v2.1 → v2.2 → v2.3 ...) until v3.0 which gates on multi-tenancy + CoW storage. Ships ~Monday each week with content post same day + follow-up Wednesday + YouTube Friday.
**ETA:** ~1 week after M3 Max arrival (FedEx hold at Walgreens, pickup 2026-04-20).
**What's in:** `qwen3-embed` feature flag gates a Candle-based Qwen3 embed backend. `qwen3_format_query()` helper for the query-instruction prefix. Metal device selection with CPU fallback. `DEFAULT_DIMENSIONS` feature-gated 256/1024. Dual-index routing scaffolded.
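`qwen3_format_query()` is a one-liner. A sketch assuming the common Qwen-embedding `Instruct:`/`Query:` wrapping (the exact formatting Vestige uses may differ):

```rust
/// Sketch of the query-instruction prefix helper. Qwen3-style embedding
/// models expect queries (but not documents) wrapped with a task
/// instruction; the precise template here is an assumption.
fn qwen3_format_query(query: &str) -> String {
    format!(
        "Instruct: Given a web search query, retrieve relevant passages that answer the query\nQuery: {query}"
    )
}
```

Documents are embedded bare; only the query side gets the instruction, which is why the helper exists at the search entry point rather than the ingest path.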
**What's left (Day 3):**
- Storage write-path records `embedding_model` per node.
- `semantic_search_raw` uses `qwen3_format_query` when the feature is active.
- Dual-index routing: old Nomic-256 nodes stay on their HNSW, new Qwen3-1024 nodes go on a new HNSW. Search merges with trust weighting.
- End-to-end test: ingest on Qwen3 → retrieve on Qwen3 at higher accuracy than Nomic.
**Test gate:** `cargo test --workspace --features qwen3-embed --release` green. Current baseline: 366 core + 425 mcp passing.
**What it does:** While the user is doing anything else (typing a blog post, looking at a different tab, doing nothing), Vestige is running `dream()` in the background. When dream completes with `insights_generated > 0` or a `ConnectionDiscovered` event fires from spreading activation, **the dashboard pulses a toast** on the side: *"Vestige found a connection between X and Y. Here's the synthesis."* The bridging edge in the 3D graph flashes cyan and briefly thickens.
**Why viral:** This is the single most tweet/YouTube-friendly demo in the entire roadmap. It is the "my 3D brain is thinking for itself" moment.
**Backend (≈2 days):**
1. `ConsolidationScheduler` gains a "pulse" hook: after each cycle, if `insights_generated > 0` emit a new `InsightSurfaced` event with `{source_memory_id, target_memory_id, synthesis_text, confidence}`.
2. The existing `ConnectionDiscovered` event gets a richer payload: include both endpoint IDs + a templated synthesis string derived from the two memories' content.
3. Rate-limit pulses: max 1 per 15 min unless user is actively using the dashboard.
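The rate limit reduces to a last-fired timestamp plus an active-dashboard bypass. A sketch (struct and method names are invented for illustration):

```rust
use std::time::{Duration, Instant};

/// Sketch of the pulse rate limit: at most one pulse per 15 minutes,
/// unless the user is actively viewing the dashboard.
struct PulseLimiter {
    last: Option<Instant>,
    min_gap: Duration,
}

impl PulseLimiter {
    fn new() -> Self {
        Self { last: None, min_gap: Duration::from_secs(15 * 60) }
    }

    /// Decide at `now` whether a pulse may fire; records the pulse if so.
    fn allow_at(&mut self, now: Instant, dashboard_active: bool) -> bool {
        let ok = dashboard_active
            || self.last.map_or(true, |t| now.duration_since(t) >= self.min_gap);
        if ok {
            self.last = Some(now);
        }
        ok
    }
}
```

Taking `now` as a parameter keeps the limiter deterministic and testable without sleeping.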
**Frontend (≈5 days):**
1. New Svelte component `InsightToast.svelte` — slides in from right, shows synthesis text + "View connection" button, auto-dismisses after 10s.
2. `events.ts` mapping: `InsightSurfaced` → locate bridging edge in graph, pulse it cyan for 2s, thicken to 2× for 500ms, play a soft chime (optional, muted by default).
3. Toast queue so rapid dreams don't flood.
4. Preference: user can toggle pulse sound / toast / edge animation independently in `/settings`.
**Already exists (nothing to build):**
- `dream()` 5-stage cycle — YES
- `DreamCompleted` event with `insights_generated` — YES
- 3D edge animation system in `events.ts` — YES (handler exists, just doesn't emit toast)
- ConsolidationScheduler running on `VESTIGE_CONSOLIDATION_INTERVAL_HOURS` — YES
**Never-composed alarm:** Four existing components, zero lines of composition. This feature is **~90% latent in v2.0.7**. All we do is press the button.
**Acceptance criteria:**
- Start Vestige, idle for 10 min, verify a pulse fires from scheduled dream cycle.
- Ingest 3 semantically adjacent memories from completely different domains (e.g., F1 aerodynamics, memory leak, fluid dynamics), trigger dream, verify connection pulse fires with synthesis text mentioning both source + target.
- Dashboard test coverage: add `pulse.test.ts` with 15+ cases covering toast queue, rate limit, event shape, edge animation.
**Launch day:** Film a 90-second screen recording. Post to Twitter + Hacker News + LinkedIn + YouTube same day.
### 9.3 v2.3.0 "Rewind" — Time Machine
**ETA:** 2-3 weeks after v2.2 ships.
**What it does:** The graph page gets a horizontal time slider. Drag back in time → nodes dim based on retroactive FSRS retention, edges that were created after the slider's timestamp dissolve visibly, suppressed memories un-dim to their pre-suppression state. A "Pin" button snapshots the current slider state into a named checkpoint the user can return to.
**Backend (≈4 days):**
1. New core API: `Storage::memory_state_at(memory_id, timestamp) -> MemorySnapshot` — reconstructs a node's FSRS state at an arbitrary past timestamp by replaying `state_transitions` forward OR applying FSRS decay backward from the current state.
2. New MCP tool: `memory_graph_at(query, depth, max_nodes, timestamp)` — the existing graph call with a time parameter.
3. New MCP tool: `pin_state(name, timestamp)` — persists a named snapshot (just a row in a new `pins` table: name, timestamp, created_at).
4. New core API: `list_pins()` + `delete_pin(name)`.
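The `pins` table is deliberately tiny. A sketch of the row type and schema (names from the roadmap above; the DDL itself is an assumption):

```rust
/// Row type for the proposed v2.3 `pins` table (sketch).
#[derive(Debug, Clone, PartialEq)]
struct Pin {
    name: String,    // user-chosen checkpoint name (primary key)
    timestamp: i64,  // the slider position being pinned (unix seconds)
    created_at: i64, // when the pin itself was made
}

/// Assumed DDL for the new table.
const CREATE_PINS_SQL: &str = "CREATE TABLE IF NOT EXISTS pins (
    name TEXT PRIMARY KEY,
    timestamp INTEGER NOT NULL,
    created_at INTEGER NOT NULL
)";
```

Because a pin stores only a timestamp, jumping to one is just re-running the `memory_graph_at` reconstruction; no graph state is duplicated.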
**Frontend (≈7 days):**
1. `TimeSlider.svelte` already exists as a scaffold (listed in §5.5) — upgrade it to an HTML5 range input + play/pause + speed control.
2. Graph3D consumes a new `asOfTimestamp` prop. When set, uses `temporal.ts::retentionAtDate()` to re-project every node's opacity + size.
3. Edges: hide those with `created_at > slider`. Animate the dissolution so sliding feels organic.
4. Pin sidebar: list pinned states, click to jump, rename/delete.
**Cut from scope: branching.** Git-like "what if I forgot my Python biases" requires CoW storage = full schema migration = v3.0 territory. Scope it out explicitly.
**Acceptance criteria:**
- Slide back 30 days, verify node count drops to whatever existed 30 days ago.
- Slide back through a suppression event, verify node un-dims.
**What it does:** Vestige's MCP middleware watches tool call metadata for frustration signals — repeated retries of the same query, CAPS LOCK content, explicit correction phrases ("no that's wrong", "actually..."), rapid-fire consecutive calls. When detected, the current active memory gets an automatic `ArousalSignal` boost and a `frustration_detected_at` timestamp. Next session, when the user returns to a similar topic, the agent proactively surfaces: *"Last time we worked on this, you were frustrated with the API docs. I've pre-read them."*
**Why Pro-tier:** Invisible to demo (so doesn't hurt OSS growth), creates deep lock-in, quantifiable value ("Vestige saved you X minutes of re-frustration this month"), clear paid-hook rationale.
**Backend (≈4 days):**
1. New middleware layer in `vestige-mcp` between JSON-RPC dispatch and tool execution: `FrustrationDetector`. Analyzes tool args for: (a) retry pattern (same `query` field within 60s), (b) content ≥70% caps after lowercase comparison, (c) correction regex (`no\s+that|actually|wrong|fix this|try again`).
2. On detection, fire a synthesized `ArousalSignal` to `ImportanceTracker` for the most-recently-accessed memory.
3. New core API: `find_frustration_hotspots(topic, limit)` → returns memories with `arousal_score > threshold` + their `frustration_detected_at` timestamps.
4. `session_context` tool gains a new field: `frustration_warnings[]` — "Topic X had previous frustration; here's what we know."
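Heuristics (b) and (c) are a few lines each. Hedged sketches (thresholds from the list above; the caps check is simplified, and the correction regex is approximated with substrings):

```rust
/// (b) Fraction of alphabetic characters that are uppercase.
/// Detection fires at >= 0.70 per the roadmap threshold. (Sketch.)
fn caps_ratio(text: &str) -> f64 {
    let letters: Vec<char> = text.chars().filter(|c| c.is_alphabetic()).collect();
    if letters.is_empty() {
        return 0.0;
    }
    let upper = letters.iter().filter(|c| c.is_uppercase()).count();
    upper as f64 / letters.len() as f64
}

/// (c) Simplified substring version of the correction regex
/// `no\s+that|actually|wrong|fix this|try again`. (Sketch.)
fn looks_like_correction(text: &str) -> bool {
    let t = text.to_lowercase();
    ["no that", "actually", "wrong", "fix this", "try again"]
        .iter()
        .any(|p| t.contains(p))
}
```

Per the false-positive mitigation in §11.2, neither signal should fire an `ArousalSignal` on its own; require two.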
**Frontend (≈3 days):**
1. Memory detail pane shows an orange "Frustration" badge for high-arousal memories.
2. `/stats` adds a "Frustration hotspots" section.
**Acceptance criteria:**
- Simulate 3 rapid retries of the same query, verify ArousalSignal boosts the active memory.
- Simulate caps-lock content, verify detection.
- Return to same topic next session, verify `session_context` surfaces warning.
**What it does:** In the 3D graph, drag a memory sphere to "grab" it — its cluster highlights. Squeeze (pinch gesture or modifier key + drag inward) → promotes the whole cluster. Flick away (throw gesture) → triggers decay on the cluster.
**Backend (≈2 days):**
1. New MCP tool: `promote_cluster(memory_ids[])` — applies promote to each.
2. New MCP tool: `demote_cluster(memory_ids[])` — inverse.
**ETA:** 3 weeks after v2.5 ships. First paid-tier candidate if empathy doesn't convert first.
**What it does:** Turns the Feb `vestige-cloud` skeleton into a shippable self-host product. One-liner install → Docker container or fly.io deploy → point Claude Desktop/Cursor/Codex at the remote URL → cloud-persistent memory across all your devices.
**Scope:**
1. Upgrade MCP handler from 5 → 24 tools (port each tool from `crates/vestige-mcp/src/tools/`).
- v2.6 adoption signal (≥500 self-host deployments)
- Sam's runway (needs pre-revenue or funding)
- Either Mays, Orbit Wars, or another cash injection
**What it does:**
1. **Memory branching** — git-like CoW over SQLite. Branch a memory state, diverge freely, merge or discard. "What if I forgot all my Python biases and approached this memory as a Rust expert" becomes a one-button operation.
2. **Multi-tenant SaaS** at `vestige.dev` / `app.vestige.dev`. Per-user DB shards, JWT auth + OAuth providers, Stripe subscriptions with entitlement gates, org membership, team shared memory with role-based access.
**Major subsystems required:**
- Storage layer rewrite for CoW semantics (or adopt Dolt/sqlcipher with branching).
**Key insight:** v2.2-v2.6 are all ≥60% latent in existing primitives. v3.0 is the first release that requires significant greenfield work. This is why sequencing matters: ride the existing primitives to revenue, then greenfield.
---
## 11. Risks & Known Gaps
### 11.1 Technical
| Risk | Impact | Mitigation |
|---|---|---|
| `ort-sys 2.0.0-rc.11` prebuilt gaps (Intel Mac dropped, Windows MSVC with usearch 2.24 broken) | Fewer platforms ship | Wait for ort-sys 2.1; or migrate to Candle throughout (v2.1 Qwen3 already uses Candle) |
| `usearch` pinned to 2.23.0 (2.24 regression on MSVC) | Windows build fragility | Monitor usearch#746 |
| fastembed model download (~130MB for Nomic, ~500MB for Qwen3) on first run blocks sandboxed Xcode | UX friction | Cache at `~/Library/Caches/com.vestige.core/fastembed` — documented in Xcode guide; pre-download from terminal once |
| Tool count drift (23 vs 24 across docs) | User trust | Reconciled in v2.0.7 (`docs: tool-count reconciliation`) |
| Large build times (cargo release 2-3 min incremental, 6+ min clean) | Slow iteration | M3 Max arriving Apr 20 will halve this |
| `include_dir!` bakes dashboard build into binary at compile time | Have to rebuild Rust to update dashboard | Accept as design; HMR via `pnpm dev` for iteration |
### 11.2 Product
| Risk | Impact | Mitigation |
|---|---|---|
| OSS-growth-before-revenue means months of zero cash | Sam can't pay rent | Mays May 1 ($400K+), Orbit Wars June 23 ($5K × top 10), part-time Wrigley Field during Cubs season |
| `deep_reference` is the crown jewel but rarely invoked | Users don't discover it | `CLAUDE.md` flags it; v2.2 Pulse farms the viral moment to drive awareness |
| Subconscious Pulse may fire too often or too rarely | User annoyance or missed value | Rate limit: max 1 pulse per 15 min; user-adjustable in settings |
| Emotional tagging may over-fire (every caps lock = frustration?) | False positives | Require ≥2 signals (retry + caps, or retry + correction) before boost |
| v3.0 SaaS burns runway if started too early | Business-ending | Gated on v2.6 adoption + cash injection |
| Copycat risk (Zep, Cognee, etc.) cloning Vestige's features | Eroded differentiation | AGPL-3.0 protects network use; neuroscience depth is hard to fake; time slider + subconscious pulse are visible moats |
| Cross-IDE MCP standard changes (Streamable HTTP spec moved 2024-11-05 → 2025-06-18) | Breaking transport changes | v2.6 implements the newer spec; keep 2024-11-05 as backward-compat alias |
### 11.3 Known UI gaps (`docs/launch/UI_ROADMAP_v2.1_v2.2.md`)
- **26% of MCP tools have zero UI surface** (e.g., `codebase`, `find_duplicates`, `backup`, `export`, `gc`, `restore` — all power-user only).
- **28% of cognitive modules have no visualization** (SynapticTagging, HippocampalIndex, ContextMatcher, CrossProjectLearner, etc.).
- The rainbow-bursted Rac1 cascade in the graph has no numeric "how many neighbours did it touch" display.
- `intention` shows but doesn't let you edit/snooze from the UI.
- `deep_reference` is unreachable from the dashboard (it only surfaces via MCP tool calls).
3. **Wednesday:** Follow-up tweet thread (deep-dive on one specific feature).
4. **Thursday:** Engage with feedback; close issues; publish patch if needed.
5. **Friday:** YouTube long-form (15-25 min walkthrough). Next week's release work continues.
### 12.3 Viral load-bearing moments
- **v2.2 "Pulse" launch:** The single biggest viral bet. Subconscious cross-pollination demo → HN front page → Twitter thread → YouTube 10-min walkthrough.
- **v2.3 "Rewind" time slider:** Highly tweet-friendly. Screen recording of sliding back through memory decay.
- **Jarrett Ye (FSRS creator, user L-M-Sherlock) outreach:** Already a stargazer. Email him Sunday night (US time) = Monday AM Beijing with the v2.2 Pulse demo. If he retweets → FSRS community (Anki, maimemo) amplifies.
### 12.4 Issue #36 (hooks-for-automatic-memory)
Outstanding from desaiuditd. Response plan:
1. Thank him publicly in the issue.
2. Acknowledge the feature as valid and scoped for v2.2/v2.3.
3. Open a linked sub-issue: "v2.2: Auto-memory hooks" tied to Pulse work.
## 15. POST-v2.0.8 ADDENDUM — The Autonomic Turn (added 2026-04-23)
> This section supersedes portions of sections 9.1-9.8. The April 19 roadmap (v2.1 Decide → v2.2 Pulse → v2.3 Rewind → v2.4 Empathy → v2.5 Grip → v2.6 Remote → v3.0 Branch) remains the long-arc plan but has been RESEQUENCED post-v2.0.8 ship following a three-agent audit on 2026-04-23 (web research on 2026 SOTA, Vestige code audit for active-vs-passive paths, competitor landscape). Updated sequence reflects what got absorbed into v2.0.8 and the new v2.0.9 / v2.5 / v2.6 architecture tier that replaces the old placeholder numbering.
### 15.1 What v2.0.8 "Pulse" absorbed
v2.0.8 shipped (commit `6a80769`, tag `v2.0.8`, 2026-04-23 07:21Z) bundled:
- **v2.2 "Pulse" InsightToast** (from April 19 roadmap) — real-time toast stack over the WebSocket event bus; DreamCompleted / ConsolidationCompleted / ConnectionDiscovered / MemoryPromoted/Demoted/Suppressed surface automatically.
- **v2.3 "Terrarium" Memory Birth Ritual** — 60-frame elastic materialization on every `MemoryCreated` event.
- **8 new dashboard surfaces** exposing the cognitive engine: `/reasoning`, `/duplicates`, `/dreams`, `/schedule`, `/importance`, `/activation`, `/contradictions`, `/patterns`.
- **Reasoning Theater** wired to the 8-stage `deep_reference` cognitive pipeline with Cmd+K Ask palette.
- **3D graph brightness** auto-compensation + user slider (0.5×–2.5×, localStorage-persisted).
- **Intel Mac restored** via `ort-dynamic` + Homebrew onnxruntime (closes #41, sidesteps Microsoft's upstream deprecation of x86_64 macOS ONNX Runtime prebuilts).
- **Cross-reference hardening** — contradiction-detection false positives from 12→0 on an FSRS-6 query; primary-selection topic-term filter (50% relevance + 20% trust + 30% term_presence) fixes off-topic-high-trust-wins-query bug.
Post-v2.0.8 hygiene commit `0e9b260` removed 3,091 LOC of orphan code (9 superseded tool modules + ghost env-var docs + one dead fn).
### 15.2 The audit finding — "decorative memory" at system scale
Three agents ran in parallel on 2026-04-23. Core diagnosis: **Vestige has 30 cognitive modules but only 2 autonomic mechanisms** (6h auto-consolidation loop + per-tool-call scheduler at `server.rs:884`). The 20-event WebSocket bus at `dashboard/events.rs` has **zero backend subscribers** — all 14 live event types flow to the dashboard and terminate. Fully-built trigger methods exist but nothing calls them:
- `ProspectiveMemory::check_triggers()` at `prospective_memory.rs:1260` — 9h intention window, never polled.
- `SpeculativeRetriever::prefetch()` at `advanced/speculative.rs` (606 LOC) — never awaited.
- `MemoryDreamer::run_consolidation_cycle()` — instantiated on CognitiveEngine but the 6h timer at `main.rs:258` calls only `storage.run_consolidation()` (FSRS decay), never the dreamer.
Three completely dead modules: `MemoryCompressor`, `AdaptiveEmbedder`, `EmotionalMemory` (constructed in `CognitiveEngine::new()` at `cognitive.rs:145-160`, zero call sites in vestige-mcp). `Rac1CascadeSwept`, `ActivationSpread`, `RetentionDecayed` events declared but never emitted.
**This is the ARC-AGI-3 pattern at system scale:** storage exists, retrieval exists, memory never self-triggers during the agent's decision path because no subscriber is listening. Sam's paraphrased thesis: *"the bottleneck won't be how much the agent knows — it will be how efficiently it MANAGES what it knows."*
### 15.3 The 2026 SOTA convergence — "retrieval is solved, management is not"
Web-research agent surfaced the consensus. Load-bearing papers + their unshipped primitives:
- **Titans** (arXiv 2501.00663, Google NeurIPS 2025) — test-time weight updates via surprise gradient. Active IN-MODEL.
- **A-Mem** (arXiv 2502.12110) — Zettelkasten dynamic re-linking on write.
- **Memory-R1** (arXiv 2508.19828) — RL-trained Manager with ADD/UPDATE/DELETE/NOOP on 152 QA pairs; beats baselines on LoCoMo + MSC + LongMemEval.
- **Mem-α** (arXiv 2509.25911) — RL over tripartite core/episodic/semantic memory, trained on 30k tokens, generalizes to 400k.
- **StageMem** (arXiv 2604.16774) + **Evidence for Limited Metacognition in LLMs** (arXiv 2509.21545) — item-level confidence separated from retention, validity-screened selective abstention.
- **Memory in the Age of AI Agents** survey (arXiv 2512.13564) — taxonomy (Forms/Functions/Dynamics); all open problems live in Dynamics.
**Three unshipped-by-anyone concepts define the 2026 frontier:** meta-memory / confidence-gated generation (refuse to answer when load-bearing memory is cold), autonomous consolidation on surprise/drift (not on timer), write-time contradiction detection with agent-facing alerts.
### 15.4 Competitive landscape — the white-space lanes
Nobody ships: **confidence-gated generation, proactive contradiction flagging without query, predictive pre-warm at UserPromptSubmit, autonomic working-memory capacity enforcement.**
- Anthropic Claude Code: 95%-context auto-compaction. No trust-scored memories, no scheduled dream, no confidence gating.
- Google Titans: surprise-gated memory IN-MODEL; not a server-level primitive.
Every one of those four white-space primitives has raw material **already built** in Vestige (FSRS-6 trust scores, `deep_reference`, `predict`, `SpeculativeRetriever`, WebSocket event bus, Sanhedrin POC from April 20). The bottleneck is wiring, not features.
**Single architectural change**: add a backend event-subscriber task in `main.rs` (~50-100 LOC `tokio::spawn`) that consumes the existing WebSocket bus and routes events into the cognitive modules that already have trigger methods. This one commit flips 14 dormant primitives into active ones simultaneously.
**Concrete wiring:**
| Event | Currently emits to | Add backend routing |
|---|---|---|
| `ImportanceScored > 0.85` | dashboard only | auto-`promote` |
| `DeepReferenceCompleted` with contradictions | dashboard only | queue a `dream()` cycle for contradiction resolution |
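The routing table above reduces to a pure dispatch function behind a subscriber loop. A std-only sketch of the shape (the real task would be a `tokio::spawn` consuming the WebSocket broadcast bus; the `Event` and `Action` enums here are hypothetical stand-ins for Vestige's real types):

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-ins for Vestige's real event and action types.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    ImportanceScored(f64),
    DeepReferenceCompleted { contradictions: usize },
}

#[derive(Debug, PartialEq)]
enum Action {
    Promote,
    QueueDreamCycle,
    None,
}

// Pure dispatch: the event keeps flowing to the dashboard as before;
// this function only decides the *additional* backend routing.
fn route(event: &Event) -> Action {
    match event {
        Event::ImportanceScored(score) if *score > 0.85 => Action::Promote,
        Event::DeepReferenceCompleted { contradictions } if *contradictions > 0 => {
            Action::QueueDreamCycle
        }
        _ => Action::None,
    }
}

fn main() {
    // Std channel as a stand-in for the WebSocket broadcast bus.
    let (tx, rx) = mpsc::channel::<Event>();
    let subscriber = thread::spawn(move || {
        for event in rx {
            match route(&event) {
                Action::Promote => println!("auto-promote triggered"),
                Action::QueueDreamCycle => println!("dream cycle queued"),
                Action::None => {}
            }
        }
    });

    tx.send(Event::ImportanceScored(0.9)).unwrap();
    tx.send(Event::DeepReferenceCompleted { contradictions: 2 }).unwrap();
    drop(tx); // closes the channel so the subscriber loop ends
    subscriber.join().unwrap();
}
```

Keeping `route` pure means each new event-to-module wiring is one match arm plus one unit test, which is what makes "14 dormant primitives in one commit" plausible.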
**Three additional changes:**
1. A new 60s `tokio::interval` in `main.rs` calls `cog.prospective_memory.check_triggers(current_session_context)`. On a hit, emit a new `IntentionFired` event plus an MCP sampling/createMessage notification to the client.
2. Add a `cognitive.dreamer.run_consolidation_cycle()` call inside the existing 6h auto-consolidation loop at `main.rs:258` (alongside, not replacing, `storage.run_consolidation()`).
3. `find_duplicates` auto-runs when `Heartbeat.total_memories > 700`.
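Changes 1 and 3 both reduce to threshold checks over periodically polled state. A minimal sketch of that decision logic, with illustrative types (`Intention` and the exact signatures are assumptions, not Vestige's real API; only the `check_triggers` name and the 700-memory threshold come from the list above):

```rust
// Hypothetical stand-in for a stored prospective-memory intention.
struct Intention {
    trigger_phrase: String,
    fired: bool,
}

// Change 1: collect trigger phrases matching the current session
// context; the real 60s tick would emit IntentionFired per hit.
fn check_triggers<'a>(intentions: &'a [Intention], context: &str) -> Vec<&'a str> {
    intentions
        .iter()
        .filter(|i| !i.fired && context.contains(&i.trigger_phrase))
        .map(|i| i.trigger_phrase.as_str())
        .collect()
}

// Change 3: auto-run find_duplicates past the heartbeat threshold.
fn should_run_find_duplicates(total_memories: u64) -> bool {
    total_memories > 700
}

fn main() {
    let intentions = vec![
        Intention { trigger_phrase: "deploy".to_string(), fired: false },
        Intention { trigger_phrase: "migrate db".to_string(), fired: true },
    ];
    // In the real server this runs inside a 60s tokio::interval tick.
    let fired = check_triggers(&intentions, "about to deploy the release");
    println!("fired: {fired:?}, dedupe: {}", should_run_find_duplicates(812));
}
```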
**Launch narrative:** *"Vestige now acts on your memories while you sleep — 14 cognitive modules that used to wait for a query now fire autonomously on every memory event."*
### 15.6 v2.5.0 "Autonomic" — 1 Week After v2.0.9
Three unshipped-by-anyone primitives land in one release. This is the category-defining drop.
**(A) Hallucination Guillotine — Confidence-Gated Generation at the Stop Hook**
Stop hook runs `deep_reference` on the agent's draft response and checks FSRS retention on load-bearing claims. If any required fact has retention < 0.4, the hook exits 2 with a `VESTIGE VETO: cold memory on claim X, retrieve fresh evidence or explicitly mark uncertain` block. The Sanhedrin POC from 2026-04-20 already proves the mechanism works in real dogfooding: three consecutive drafts were vetoed by the POC. Package as a formal `vestige-guillotine` Claude Code plugin.
Files: new `crates/vestige-mcp/src/hooks/guillotine.rs`, plugin manifest in `packages/claude-plugin/`. Composes existing `deep_reference` trust-score pipeline + the Sanhedrin dogfooding script.
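The veto decision itself is a small pure function over (claim, retention) pairs. A hypothetical sketch of the gate (the `Claim` type and function names are illustrative; the 0.4 threshold follows the description above, and the real hook gets its retention scores from the `deep_reference` pipeline):

```rust
// A load-bearing claim in a draft response, paired with the FSRS
// retention of the memory backing it (names are illustrative).
struct Claim<'a> {
    text: &'a str,
    retention: f64,
}

// Return the first claim whose backing memory is colder than the gate.
fn coldest_veto<'a>(claims: &'a [Claim<'a>], threshold: f64) -> Option<&'a Claim<'a>> {
    claims.iter().find(|c| c.retention < threshold)
}

fn main() {
    let claims = [
        Claim { text: "the API returns JSON", retention: 0.91 },
        Claim { text: "the rate limit is 100/min", retention: 0.22 },
    ];
    match coldest_veto(&claims, 0.4) {
        // The real hook would std::process::exit(2) here so the host
        // treats the message as a blocking error the agent must handle.
        Some(claim) => eprintln!(
            "VESTIGE VETO: cold memory on claim {:?}, retrieve fresh evidence or explicitly mark uncertain",
            claim.text
        ),
        None => println!("draft passes the retention gate"),
    }
}
```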
**(B) Write-Time Contradiction Flagging — Agent-Facing Alerts on `smart_ingest`**
On every `smart_ingest` write, a fast `deep_reference` runs against the existing graph. If the new memory contradicts an existing memory with trust > 0.6, the server fires an MCP sampling/createMessage notification to the agent *in the same conversation*: *"this contradicts memory Y from \[date\]. Supersede Y, discard X, or mark both as time-bounded?"* The agent resolves the conflict in real time instead of waking up to it three sessions later.
Files: `crates/vestige-mcp/src/tools/smart_ingest.rs` (post-write hook), `crates/vestige-mcp/src/protocol/sampling.rs` (new — MCP sampling/createMessage support). Composes existing `deep_reference` + contradiction-detection hardening from v2.0.8.
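A sketch of the post-write flagging step, with the contradiction predicate abstracted as a closure (the real check runs through `deep_reference`; the `Memory` type and `flag_contradictions` name are illustrative, and only the trust > 0.6 gate comes from the description above):

```rust
// Illustrative stand-in for a stored memory with its trust score.
struct Memory {
    id: u64,
    text: String,
    trust: f64,
}

// Flag existing high-trust memories that the new write contradicts.
// `contradicts` stands in for the deep_reference contradiction check.
fn flag_contradictions<'a, F>(new_text: &str, existing: &'a [Memory], contradicts: F) -> Vec<&'a Memory>
where
    F: Fn(&str, &str) -> bool,
{
    existing
        .iter()
        .filter(|m| m.trust > 0.6 && contradicts(new_text, &m.text))
        .collect()
}

fn main() {
    let existing = vec![
        Memory { id: 1, text: "deploys happen on Fridays".into(), trust: 0.8 },
        Memory { id: 2, text: "deploys happen on Mondays".into(), trust: 0.3 },
    ];
    // Toy predicate; real detection runs through deep_reference.
    let flagged = flag_contradictions(
        "deploys are frozen on Fridays",
        &existing,
        |a, b| a.contains("Fridays") && b.contains("Fridays"),
    );
    for m in &flagged {
        // In the server this becomes a sampling/createMessage notification.
        println!("this contradicts memory {} ({})", m.id, m.text);
    }
}
```

Note the low-trust memory (id 2) is never surfaced even if it conflicts: the gate exists so the agent is only interrupted for contradictions against memories it actually relies on.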
**(C) Pulse Prefetch — Predictive Pre-Warm at UserPromptSubmit**
UserPromptSubmit hook fires `predict(query)`, top-k results injected into agent context before the first token. The agent never has to ask; the memory is already there. Nemori did predict-calibrate; Letta does sleep-time; nobody fires at query-arrival.
Files: `crates/vestige-mcp/src/hooks/pulse_prefetch.rs` (new), extend `SpeculativeRetriever::prefetch()`. Composes existing `predict` tool + `speculative.rs` (606 LOC, never awaited until v2.0.9 wiring).
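The hook's core is a top-k selection over scored predictions injected into context before generation. A sketch under assumed types (`Prediction` is an illustrative stand-in for whatever the `predict` tool actually returns; the injection format is invented for the example):

```rust
// Hypothetical scored prediction from the `predict` tool.
struct Prediction {
    memory: String,
    score: f64,
}

// Take the top-k predictions to inject into the agent's context
// before the first token is generated.
fn top_k(mut predictions: Vec<Prediction>, k: usize) -> Vec<Prediction> {
    predictions.sort_by(|a, b| b.score.partial_cmp(&a.score).unwrap());
    predictions.truncate(k);
    predictions
}

fn main() {
    let preds = vec![
        Prediction { memory: "prefers tabs".into(), score: 0.42 },
        Prediction { memory: "CI runs on push".into(), score: 0.91 },
        Prediction { memory: "uses pnpm".into(), score: 0.77 },
    ];
    // Invented injection format, purely for illustration.
    for p in top_k(preds, 2) {
        println!("<vestige-prefetch score={:.2}>{}</vestige-prefetch>", p.score, p.memory);
    }
}
```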
**Launch narrative:** *"The first MCP memory that VETOes hallucinations before the user sees them, FLAGS contradictions at write-time, and PREDICTS what the agent will need before the agent knows it needs it. Zero-shot proactive memory management."*
### 15.7 v2.6.0 "Sleepwalking" — 2 Weeks After v2.5.0
Dream cycle detects high-value cross-project patterns → auto-generates and opens pull requests against the user's codebase. Zep writes text summaries; Vestige writes code. The `cross_project.find_universal_patterns()` fn already exists. Wire it via a new `sleepwalk` subcommand that invokes `gh pr create` with generated diffs.
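A dry-run sketch of the `gh` invocation the `sleepwalk` subcommand might build (argument construction only, nothing is spawned; `gh pr create` with `--title`, `--head`, and `--body` are real GitHub CLI flags, while every other name here is illustrative and the actual diff generation lives in the dream cycle):

```rust
use std::process::Command;

// Build (but don't run) the gh invocation for one consolidated pattern.
fn sleepwalk_pr_args(title: &str, branch: &str) -> Vec<String> {
    vec![
        "pr".into(),
        "create".into(),
        "--title".into(),
        title.into(),
        "--head".into(),
        branch.into(),
        "--body".into(),
        "Auto-generated by Vestige dream cycle from a cross-project pattern.".into(),
    ]
}

fn main() {
    let args = sleepwalk_pr_args(
        "refactor: extract shared retry helper",
        "vestige/sleepwalk-001",
    );
    let mut cmd = Command::new("gh");
    cmd.args(&args);
    println!("{cmd:?}"); // dry-run: print the command instead of spawning it
}
```

Keeping argument construction separate from spawning makes the subcommand testable without a network, and a `--dry-run`-style flag on `sleepwalk` is the obvious safety valve before letting a dream cycle open real PRs.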
Target: v2.0.9 + v2.5.0 + v2.6.0 all ship within 30 days of v2.0.8.
Stars trajectory: 484 current baseline growing at +12/day → +600 from v2.0.9, +1,500 from v2.5.0, +2,000 from v2.6.0, plus ~360 organic over the 30-day window = **~5,000 stars by end of May 2026.** First paid commercial license lands during v2.5.0 launch week (the Hallucination Guillotine clip is exactly the artifact that makes enterprise DevRel reshare). MCP engineer role offer inbound during the same window.
CCN 2027 poster abstract gets written around the v2.5 primitives; the RustConf 2026 (Sep 8-11) talk submission writes itself around the event-bus-subscriber architecture pattern.
### 15.10 The one-line architectural thesis
**Vestige's bottleneck is not feature count, not capacity, not module depth. It is one missing architectural pattern — a backend event-subscriber task that routes the 14 live WebSocket events into the cognitive modules that already have the trigger methods implemented.** Closing that single gap flips Vestige from "memory library" to "cognitive agent that acts on the host LLM." Every v2.5+ feature composes on top of that one change.
---
**End of document.** Length-check: ~19,000 words / ~130 KB markdown. This is the single-page briefing that lets any AI agent plan the next phase of Vestige without having to re-read the repository.