mirror of https://github.com/samvallad33/vestige.git synced 2026-07-02 22:01:01 +02:00

Sam Valladares b91a7e0bb7 docs: rewrite README in a human first-person voice + lead with the pitch

Opens with the pitch's pattern-interrupt first sentence ('Your bug was born
days before it crashed'), then tells the story in Sam's own voice — why he built
it, the soccer/causal-gap framing, the DeepMind theorem + CauseBench receipts
(0% vs 60%, kept honestly separate as theorem-vs-measurement). Links the
60-second spoken pitch (demo/PITCH-v2-causebench.md). Real citations only
(arXiv:2508.21038, Nature DOI 10.1038/s41586-024-08168-4).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-29 15:29:43 -05:00


Vestige

Your bug was born days before it crashed — you just can't remember where.

Vestige is a local-first memory for AI agents that reaches backward through time to find the quiet change that caused today's failure — the cause that looks nothing like the bug. One 23MB Rust binary. No cloud. Your data never leaves your machine.

⚡ Quick Start · 🧠 The Idea · 🔬 The Science · 🛠 13 Tools · 📊 Dashboard

👋 Why I built this

Hi — I'm Sam. I built Vestige from a tiny apartment in Chicago because I kept losing days to the same thing, and I bet you have too.

Production breaks. You start hunting. And the cause is almost never near the error — it's some quiet change you made days ago that looks nothing like the crash it eventually caused. A flipped env var. A swapped service. A config tweak you'd already forgotten.

Here's the part that took me a while to see: almost every AI memory tool is built on vector search, and vector search hunts for what looks like your problem. But a root cause rarely looks like the bug it creates. So those tools search the goal line while the real failure was a quiet midfield turnover fifteen minutes earlier.

I wanted a memory that traces the match backward.

So that's what Vestige is. Everyone else built a memory that remembers. I tried to build the first one that realizes — it gates what's worth keeping, lets the noise fade like your own memory does, and when a failure hits, it reaches back through time to the change that actually caused it.

It's one Rust binary. It runs entirely on your machine. It never phones home. And there's a 60-second start right below.

🎙️ The 60-second version of this whole story — the one I give in person — lives in demo/PITCH-v2-causebench.md. If you've got a minute, read that first. It's the clearest way to get why this matters.

⚡ Get it running in 60 seconds

npm install -g vestige-mcp-server@latest      # one binary — no Docker, no API key, no signup
claude mcp add vestige vestige-mcp -s user    # connect it to Claude Code

That's the whole install. Now talk to your agent like it has a memory — because now it does:

You:  "Remember: we always disable SimSIMD on release builds, it breaks old x86 CPUs."
        ...days later, fresh session, zero context...
You:  "Should I enable SimSIMD for the release?"
AI:   ⚠️ Hold on — this contradicts a decision you stored: you chose to DISABLE it
        because it breaks old x86 CPUs.

That last line isn't me being cute. It's a real status the engine returns, called `claim_contradicts_memory`. Most memory tools would have happily handed you the wrong answer. Vestige tells you when you're about to walk back into a mistake you already learned from.

(Works with Codex, Cursor, VS Code, Claude Desktop, Windsurf, JetBrains, Zed — anything that speaks MCP. Full setup is here ↓.)

🧠 It's not RAG with a nicer haircut

RAG is a bucket: throw everything in, hope nearest-neighbor finds it later. Vestige behaves more like an actual memory — it decides what's worth keeping, forgets what isn't, and reasons across what's left.

	🪣 RAG / Vector Store	🧠 Vestige
What it stores	Everything you hand it	Only what's surprising or new — the rest gets merged or skipped
What it forgets	Nothing — it just bloats	Unused memories fade on a real forgetting curve, so your context stays lean
Finding a root cause	Can't — the cause isn't similar to the bug	Reaches backward in time to the change that caused it (the whole point ↓)
Catching contradictions	Silent — serves the stale answer with a straight face	Tells you: "this contradicts what you decided"
Duplicates	You clean them up by hand	Self-heals — "likes dark mode" + "prefers dark themes" quietly become one
Forgetting on demand	DELETE and it's gone	`suppress` — gently inhibits a memory (and its neighbors), reversible for 24h
Where it lives	Usually someone else's cloud	Your machine. One binary. No telemetry.
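To make the "only what's surprising or new" row concrete, here is a minimal sketch of a gating decision. It is an illustration, not Vestige's source: the `GateDecision` enum, the `gate` function, and both thresholds are assumptions made up for this example.

```rust
/// What to do with an incoming memory. Variant names are illustrative only.
#[derive(Debug, PartialEq)]
enum GateDecision {
    Skip,      // near-duplicate of something already stored
    Merge,     // overlaps an existing memory; fold the new detail in
    Supersede, // conflicts with an existing memory; the newer claim wins
    Create,    // novel enough to store on its own
}

/// `similarity` is the cosine similarity (0.0..=1.0) to the closest stored memory.
/// `contradicts` says whether the new claim conflicts with that memory.
/// Both thresholds are assumed values, chosen only for illustration.
fn gate(similarity: f32, contradicts: bool) -> GateDecision {
    const DUPLICATE: f32 = 0.95;
    const RELATED: f32 = 0.75;
    if similarity >= RELATED && contradicts {
        GateDecision::Supersede
    } else if similarity >= DUPLICATE {
        GateDecision::Skip
    } else if similarity >= RELATED {
        GateDecision::Merge
    } else {
        GateDecision::Create
    }
}
```

With these assumed thresholds, `gate(0.80, true)` returns `Supersede`, while `gate(0.30, false)` returns `Create`. The point is that storage is a decision made per memory rather than an unconditional append.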

🔥 The thing nothing else does: memory with hindsight

This is the part I'm proudest of, and it's worth one honest paragraph.

A bug shows up today. The cause was a quiet decision from three weeks ago: a changed env var, a swapped service. That cause shares no words with the error it created, so a vector search, which only knows how to find things that look alike, will never connect them. This isn't a tuning problem. Google DeepMind researchers proved (arXiv:2508.21038, posted August 2025; ICLR 2026) that single-vector retrieval has hard mathematical limits on which combinations of documents it can return, no matter how it is trained or tuned.

So Vestige doesn't do it with similarity. Its Retroactive Salience Backfill is ported from Zaki/Cai et al., 2024, Nature 637:145–155 (DOI 10.1038/s41586-024-08168-4), a study of how the brain links a shock to a quiet memory from days earlier. It reaches backward through time and promotes the dormant memory that's causally upstream: the one that shares an entity (the same file, env var, or service), not the same words.
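Here is a minimal sketch of that entity-based backward reach. It is a toy under stated assumptions, not Vestige's implementation: the `Memory` struct, the `entities` helper, and `causal_candidates` are hypothetical names, and "shares at least one entity, recorded before the failure, most recent first" is a simplified stand-in for whatever ranking the real engine uses.

```rust
use std::collections::HashSet;

/// A stored memory: when it was recorded and which entities it mentions.
struct Memory {
    id: u32,
    timestamp: u64, // seconds since some epoch
    entities: HashSet<String>,
}

/// Small helper for building entity sets in examples.
fn entities(names: &[&str]) -> HashSet<String> {
    names.iter().map(|n| n.to_string()).collect()
}

/// Return ids of memories recorded before the failure that share at least
/// one entity with it, most recent first. No text similarity is involved.
fn causal_candidates(
    memories: &[Memory],
    failure_time: u64,
    failure_entities: &HashSet<String>,
) -> Vec<u32> {
    let mut hits: Vec<&Memory> = memories
        .iter()
        .filter(|m| m.timestamp < failure_time)
        .filter(|m| !m.entities.is_disjoint(failure_entities))
        .collect();
    hits.sort_by(|a, b| b.timestamp.cmp(&a.timestamp));
    hits.into_iter().map(|m| m.id).collect()
}
```

A memory that mentions `DATABASE_URL` three weeks before a crash in a service that reads `DATABASE_URL` is returned here even if its text has nothing in common with the error message.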

I also built a benchmark, CauseBench, to keep myself honest about it. Every pure vector retriever I tested scored 0% recall@1 on the causal-gap task; Vestige scored 60%. (To be precise: the impossibility is DeepMind's theorem; the 0%-vs-60% is my measurement. Those are two different claims, and I keep them separate.)

vestige backfill --contrast      # show the root cause a vector search would have missed

The nice part: it compounds. Every failure your agent records makes the next session diagnose faster — run two is smarter than run one — and it happens automatically during consolidation, so you don't have to babysit it.

All of this shipped in v2.2.0, along with a 34→13 tool consolidation and a rebuilt retrieval engine. Full release notes →

🔬 This is real neuroscience, not a metaphor

I get skeptical when projects wave the word "neuroscience" around, so here's my receipt: every mechanism below is a real, cited paper, implemented in Rust, running locally on your machine. None of it phones a model in the cloud to sound smart.

Mechanism	What it does for you	Grounded in
Prediction-Error Gating	Redundant info gets merged, contradictory gets superseded, only the novel gets stored	The hippocampal novelty signal
FSRS-6 Spaced Repetition	21 parameters of the mathematics of forgetting — used memories stay, unused fade	Modern spaced-repetition research
Retroactive Salience Backfill	Backward causal reach to the root cause of a failure	Zaki/Cai et al. 2024, Nature 637:145–155
Synaptic Tagging	A memory that looked trivial this morning can be tagged critical tonight	Frey & Morris 1997
Spreading Activation	Search "auth bug," surface last week's JWT update — memory is a graph, not a list	Collins & Loftus 1975
Dual-Strength Model	Storage strength vs. retrieval strength — deeply stored ≠ instantly recalled, just like you	Bjork & Bjork 1992
Memory Dreaming	Sleep-like consolidation: replays, connects, synthesizes insights to a graph	Active-dreaming consolidation
Active Forgetting (`suppress`)	Top-down inhibition that compounds and cascades to neighbors — reversible for 24h	Anderson 2025 · Davis 2020
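The Spreading Activation row can be sketched in a few lines. This is an illustration of the general technique, not Vestige's code: the `spread` function, the adjacency-list graph shape, and the `decay` and `max_hops` parameters are assumptions for the example.

```rust
use std::collections::HashMap;

/// Spread activation outward from `seed` through an association graph
/// given as an adjacency list. Each hop multiplies activation by `decay`,
/// and every node keeps the strongest activation it has received.
fn spread(
    graph: &HashMap<&str, Vec<&str>>,
    seed: &str,
    decay: f64,
    max_hops: usize,
) -> HashMap<String, f64> {
    let mut activation: HashMap<String, f64> = HashMap::new();
    activation.insert(seed.to_string(), 1.0);
    let mut frontier = vec![seed.to_string()];
    for _ in 0..max_hops {
        let mut next = Vec::new();
        for node in &frontier {
            let energy = activation[node] * decay;
            if let Some(neighbors) = graph.get(node.as_str()) {
                for neighbor in neighbors {
                    let current = activation.get(*neighbor).copied().unwrap_or(0.0);
                    // Only propagate when this path delivers more activation.
                    if energy > current {
                        activation.insert(neighbor.to_string(), energy);
                        next.push(neighbor.to_string());
                    }
                }
            }
        }
        frontier = next;
    }
    activation
}
```

Seeding the graph at "auth bug" with an edge to "JWT update" lights up the JWT memory at reduced strength, and anything linked to that at weaker strength still. That is how a graph can surface memories a flat nearest-neighbor list would rank nowhere.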

Read the full science doc → (every feature, every paper).
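For a feel of what "unused memories fade" means numerically, here is a sketch of the power-law forgetting curve as I understand it from recent FSRS versions. Treat it as an assumption-laden illustration rather than Vestige's code: the function name is made up, and the `decay` value you pass in is yours to choose (FSRS-6 fits it as one of its parameters).

```rust
/// Probability of recalling a memory `elapsed_days` after its last use.
/// `stability` is the number of days after which recall probability is 90%.
/// `decay` is a positive curve-shape parameter (assumed to be supplied by the caller).
fn retrievability(elapsed_days: f64, stability: f64, decay: f64) -> f64 {
    // `factor` is chosen so the result is exactly 0.9 when elapsed_days == stability.
    let factor = 0.9_f64.powf(-1.0 / decay) - 1.0;
    (1.0 + factor * elapsed_days / stability).powf(-decay)
}
```

Recall probability starts at 1.0, passes through 0.9 when the elapsed time equals the stability, and keeps falling after that. Each successful use raises stability, so memories that get used flatten their own curve.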

🛠 13 tools, one brain

v2.2.0 consolidated a sprawling 34-tool surface into 13 sharp ones your agent actually reaches for. Old names still work as hidden aliases — nothing breaks.

Tool	What it does
🔍 `recall`	The retrieval engine — folds search + deep reasoning + contradiction detection into one call. F32 embeddings, Reciprocal Rank Fusion, claim-vs-memory checks.
🧠 `backfill`	Memory with hindsight: backward causal reach to a failure's root cause (Zaki/Cai et al. 2024).
💾 `smart_ingest`	Stores with CREATE / UPDATE / SUPERSEDE via Prediction-Error Gating. Batch session-end saves.
🗂 `memory`	Get, edit, promote 👍, demote 👎, check state, purge content + embeddings.
🧩 `graph`	Reasoning chains, associations, bridges, predictions, force-directed export.
🌙 `maintain`	Consolidate, dream, GC, importance-score, backup, export, restore — one maintenance verb.
🧹 `dedup`	Self-healing duplicate detection + merge (8 old tools → 1).
🚫 `suppress`	Top-down active forgetting — compounds, cascades, reversible 24h. The memory is inhibited, not erased.
📟 `memory_status`	Health + stats + trends + recommendations in one packet.
🧬 `codebase` · `intention` · `source_sync` · `session_start`	Per-project code memory · "remind me when X" · external-source connectors · one-call session init.
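Reciprocal Rank Fusion, mentioned in the `recall` row, is simple enough to show whole. The sketch below is the textbook formula, not Vestige's internals: the function name, the input shape, and the choice of `k` are assumptions for the example (60 is the value commonly used in the literature).

```rust
use std::collections::HashMap;

/// Fuse several ranked lists into one. Each list is ordered best-first.
/// A document earns 1 / (k + rank) from every list it appears in (rank starts at 1).
fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (index, doc) in ranking.iter().enumerate() {
            *scores.entry(doc.to_string()).or_insert(0.0) += 1.0 / (k + (index + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties broken by name so the order is deterministic.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Because only ranks are used, a keyword ranking and a vector ranking can be combined without putting their raw scores on a common scale. A document that sits near the top of both lists beats one that tops a single list.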

📊 Watch your AI think in 3D

vestige dashboard      # → http://localhost:3927/dashboard

Every memory is a glowing node in a real-time, force-directed 3D graph. Connections form as you work. Nodes pulse when accessed, burst on creation, fade on decay. Kick off a consolidation and the whole graph slides into purple dream mode, replaying memories that light up in sequence.

Built with SvelteKit 2 · Svelte 5 · Three.js · WebGL bloom · live WebSocket events. 1000+ nodes at 60fps. Installable as a PWA.

🧩 Works in every editor you use

Vestige speaks MCP, so any client that can register a stdio MCP server can use it.

Editor	One-liner
Claude Code	`claude mcp add vestige vestige-mcp -s user`
Codex	`codex mcp add vestige -- vestige-mcp`
Cursor / VS Code / Windsurf / JetBrains / Xcode / OpenCode	Integration guides →
Claude Desktop	2-minute setup →

Other install methods (Intel Mac, Windows, build-from-source)

Update an existing install:

vestige update                          # binaries only
vestige update --sandwich-companion     # also refresh optional Claude Code companion files

macOS (Intel): Microsoft is dropping x86_64 macOS ONNX Runtime prebuilts after v1.23.0, so the Intel Mac build links dynamically against a Homebrew ONNX Runtime:

brew install onnxruntime
npm install -g vestige-mcp-server@latest
echo 'export ORT_DYLIB_PATH="'"$(brew --prefix onnxruntime)"'/lib/libonnxruntime.dylib"' >> ~/.zshrc && source ~/.zshrc
claude mcp add vestige vestige-mcp -s user

Full guide: docs/INSTALL-INTEL-MAC.md.

Windows + Claude Desktop: quit Claude Desktop from the tray, then in PowerShell:

npm install -g vestige-mcp-server@latest
vestige-mcp --version

Point %APPDATA%\Claude\claude_desktop_config.json at it:

{ "mcpServers": { "vestige": { "command": "vestige-mcp" } } }

If it can't find the command, run where.exe vestige-mcp (in PowerShell, plain where is an alias for Where-Object) and use the exact .cmd path.

Build from source (Rust 1.91+):

git clone https://github.com/samvallad33/vestige && cd vestige
cargo build --release -p vestige-mcp
# Apple Silicon GPU: --features metal   ·   NVIDIA: --features qwen3-embeddings,cuda

🚀 Make your AI use memory automatically

Registering the server exposes the tools; a short instruction tells the agent when to call them. Drop in the protocol and your agent saves and recalls on its own:

You say	Vestige does
"Remember this"	Saves immediately
"I always..." / "I prefer..."	Saves as a durable preference
"Remind me when..."	Creates a future trigger (`intention`)
"This is important"	Saves and promotes it

Agent memory protocol → · Claude Code template →

🏗 Under the hood

┌──────────────────────────────────────────────────────────┐
│  SvelteKit Dashboard — Three.js 3D graph · WebGL bloom    │
├──────────────────────────────────────────────────────────┤
│  Axum HTTP + WebSocket (:3927) — REST + live event stream │
├──────────────────────────────────────────────────────────┤
│  MCP Server (stdio JSON-RPC) — 13 tools · 30 modules      │
├──────────────────────────────────────────────────────────┤
│  Cognitive Engine                                          │
│   FSRS-6 · Spreading Activation · Prediction-Error Gating │
│   Retroactive Salience Backfill · Synaptic Tagging        │
│   Memory Dreamer · Hippocampal Index · Active Forgetting  │
├──────────────────────────────────────────────────────────┤
│  Storage — SQLite + FTS5 · USearch HNSW · Nomic Embed v1.5│
│   Optional: Qwen3 reranker · SQLCipher · Metal/CUDA       │
└──────────────────────────────────────────────────────────┘


Language	Rust 2024 (MSRV 1.91) — 86,000+ lines
Binary	~23MB, single file
Embeddings	Nomic Embed Text v1.5 (768d→256d Matryoshka, 8192 ctx); Qwen3 optional
Vector search	USearch HNSW (≈20× faster than FAISS)
Storage	SQLite + FTS5, optional SQLCipher encryption
Tests	1,550 passing · clippy `-D warnings` clean
First run	Downloads ~130MB embedding model once, then fully offline forever
Platforms	macOS (ARM + Intel) · Linux x86_64 · Windows x86_64 — all prebuilt
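The "768d→256d Matryoshka" entry means the first 256 components of an embedding are still a usable embedding on their own. A minimal sketch of that truncation follows. It is the generic technique, not Vestige's code: the function name is hypothetical, and the embedding model's own recipe may add preprocessing steps that are not shown here.

```rust
/// Keep the first `dims` components of an embedding and rescale to unit length,
/// so cosine similarity and dot product agree on the shortened vector.
fn truncate_embedding(full: &[f32], dims: usize) -> Vec<f32> {
    let head = &full[..dims.min(full.len())];
    let norm = head.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return head.to_vec(); // all zeros: nothing to normalize
    }
    head.iter().map(|x| x / norm).collect()
}
```

Cutting 768 floats to 256 shrinks every stored vector to a third of its size, which is where most of the index's memory and search cost goes.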

📚 Go deeper


FAQ	30+ real questions answered
The Science	Every feature, every paper
Storage Modes	Global · per-project · multi-instance
Configuration	CLI, env vars, every knob
Changelog	The full story, version by version

If your agent should remember what you taught it yesterday — star it. ⭐

86,000+ lines of Rust · 13 tools · 30 cognitive modules · 130 years of memory research · one 23MB binary that never phones home.

Built by @samvallad33 · AGPL-3.0 · 100% local, 100% yours
