# Vestige
### Your bug was born days before it crashed. You just can't remember where.
Vestige is a local-first memory for AI agents that reaches backward through time to find the quiet change that caused today's failure: the cause that looks nothing like the bug. One 23MB Rust binary. No cloud. Your data never leaves your machine.
[**⚡ Quick Start**](#-get-it-running-in-60-seconds) · [**🧠 The Idea**](#why-i-built-this) · [**🔬 The Science**](#-this-is-real-neuroscience-not-a-metaphor) · [**13 Tools**](#13-tools-one-brain) · [**Dashboard**](#watch-your-ai-think-in-3d)
---
## Why I built this
Hi, I'm [Sam](https://github.com/samvallad33). I built Vestige from a tiny apartment in Chicago because I kept losing days to the same thing, and I bet you have too.
Production breaks. You start hunting. And the cause is almost never *near* the error: it's some quiet change you made days ago that looks **nothing** like the crash it eventually caused. A flipped env var. A swapped service. A config tweak you'd already forgotten.
Here's the part that took me a while to see: **every AI memory tool is built on vector search, and vector search hunts for what *looks like* your problem.** But a root cause never looks like the bug it creates. So they all search the goal line, while the real failure was a quiet midfield turnover fifteen minutes earlier.
I wanted a memory that traces the match *backward.*
So that's what Vestige is. Everyone else built a memory that **remembers**. I tried to build the first one that **realizes**: it gates what's worth keeping, lets the noise fade like your own memory does, and when a failure hits, it reaches back through time to the change that actually caused it.
It's one Rust binary. It runs entirely on your machine. It never phones home. And there's a 60-second start right below.
> **The 60-second version** of this whole story, the one I give in person, lives in [`demo/PITCH-v2-causebench.md`](demo/PITCH-v2-causebench.md). If you've got a minute, read that first. It's the clearest way to *get* why this matters.
---
## ⚡ Get it running in 60 seconds
```bash
npm install -g vestige-mcp-server@latest # one binary: no Docker, no API key, no signup
claude mcp add vestige vestige-mcp -s user # connect it to Claude Code
```
That's the whole install. Now talk to your agent like it has a memory, because now it does:
```
You: "Remember: we always disable SimSIMD on release builds, it breaks old x86 CPUs."
...days later, fresh session, zero context...
You: "Should I enable SimSIMD for the release?"
AI: ⚠️ Hold on. This contradicts a decision you stored: you chose to DISABLE it
because it breaks old x86 CPUs.
```
That last line isn't me being cute. It's a real status the engine returns, called `claim_contradicts_memory`. Most memory tools would have happily handed you the wrong answer. Vestige tells you when you're about to walk back into a mistake you already learned from.
*(Works with Codex, Cursor, VS Code, Claude Desktop, Windsurf, JetBrains, Zed: anything that speaks MCP. [Full setup is here →](#-works-in-every-editor-you-use).)*
---
## 🧠 It's not RAG with a nicer haircut
RAG is a bucket: throw everything in, hope nearest-neighbor finds it later. Vestige behaves more like an actual memory: it decides what's worth keeping, forgets what isn't, and reasons across what's left.
| | 🪣 RAG / Vector Store | 🧠 Vestige |
|---|---|---|
| **What it stores** | Everything you hand it | Only what's **surprising or new**; the rest gets merged or skipped |
| **What it forgets** | Nothing. It just bloats | Unused memories **fade** on a real forgetting curve, so your context stays lean |
| **Finding a root cause** | Can't: the cause isn't *similar* to the bug | **Reaches backward in time** to the change that caused it (the whole point) |
| **Catching contradictions** | Silent: serves the stale answer with a straight face | Tells you: *"this contradicts what you decided"* |
| **Duplicates** | You clean them up by hand | Self-heals: *"likes dark mode"* + *"prefers dark themes"* quietly become one |
| **Forgetting on demand** | DELETE and it's gone | **`suppress`** gently inhibits a memory (and its neighbors), reversible for 24h |
| **Where it lives** | Usually someone else's cloud | **Your machine. One binary. No telemetry.** |
---
## 🔥 The thing nothing else does: memory with hindsight
This is the part I'm proudest of, and it's worth one honest paragraph.
A bug shows up today. The cause was a quiet decision from three weeks ago: a changed env var, a swapped service. That cause **shares no words with the error it created.** A vector search will never connect them, because it only knows how to find things that *look alike*, and this is a case where the cause and the symptom look nothing alike. This isn't a tuning problem; in 2026 Google DeepMind published a proof ([arXiv:2508.21038](https://arxiv.org/abs/2508.21038), ICLR 2026) that single-vector retrieval is *mathematically* incapable of bridging gaps like this.
So Vestige doesn't do it with similarity. Its **Retroactive Salience Backfill** is ported from **Zaki/Cai et al., 2024, *Nature* 637:145–155** ([DOI](https://doi.org/10.1038/s41586-024-08168-4)), a study of how the brain links a shock to the quiet memory that caused it. It reaches *backward through time* and promotes the dormant memory that's **causally upstream**: one that shares an *entity* (the same file, env var, or service), not the same words.
I also built a benchmark to keep myself honest about it. Every pure vector retriever scored **0% recall@1** on the causal-gap task; Vestige scored **60%**. (To be precise: the impossibility is DeepMind's *theorem*; the 0%-vs-60% is *my measurement*. Those are two different claims, and I keep them separate.)
```bash
vestige backfill --contrast # show the root cause a vector search would have missed
```
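If it helps to see the shape of the idea, here is a minimal sketch of entity-based backward reach. It is not Vestige's actual code: the `Memory` struct, the `backfill` function, the time window, and the salience boost are all hypothetical names and values chosen for illustration. The only thing taken from the description above is the rule itself: look at memories recorded *before* the failure and promote the ones that share an entity with it, regardless of wording.

```rust
use std::collections::HashSet;

/// A stored memory, reduced to the fields this sketch needs (hypothetical shape).
#[derive(Debug, Clone)]
struct Memory {
    id: u32,
    /// Unix timestamp in seconds.
    recorded_at: u64,
    /// Entities the memory mentions: files, env vars, services.
    entities: HashSet<String>,
    /// Salience score; backfill raises it for upstream candidates.
    salience: f32,
}

/// Promote memories recorded before the failure (within `window_secs`)
/// that share at least one entity with it. Returns promoted ids, newest first.
fn backfill(memories: &mut [Memory], failure: &Memory, window_secs: u64, boost: f32) -> Vec<u32> {
    let earliest = failure.recorded_at.saturating_sub(window_secs);
    let mut promoted: Vec<(u64, u32)> = Vec::new();
    for m in memories.iter_mut() {
        let is_earlier = m.recorded_at < failure.recorded_at && m.recorded_at >= earliest;
        let shares_entity = !m.entities.is_disjoint(&failure.entities);
        if is_earlier && shares_entity {
            m.salience += boost;
            promoted.push((m.recorded_at, m.id));
        }
    }
    promoted.sort_by(|a, b| b.0.cmp(&a.0));
    promoted.into_iter().map(|(_, id)| id).collect()
}

/// Small constructor to keep the example readable.
fn mem(id: u32, recorded_at: u64, entities: &[&str]) -> Memory {
    Memory {
        id,
        recorded_at,
        entities: entities.iter().map(|e| e.to_string()).collect(),
        salience: 0.1,
    }
}

fn main() {
    let day = 86_400;
    let mut memories = vec![
        mem(1, day, &["DATABASE_URL"]),      // quiet env-var change
        mem(2, 5 * day, &["README.md"]),     // unrelated edit
        mem(3, 30 * day, &["DATABASE_URL"]), // recorded after the failure
    ];
    let failure = mem(99, 20 * day, &["DATABASE_URL", "checkout-service"]);
    let promoted = backfill(&mut memories, &failure, 21 * day, 0.5);
    println!("promoted: {:?}", promoted); // prints "promoted: [1]"
}
```

Note that nothing here compares text. Memory 1 is promoted only because it touches the same env var and came first, which is exactly the case a similarity search cannot see.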
The nice part: it compounds. Every failure your agent records makes the *next* session diagnose faster (run two is smarter than run one), and it happens automatically during consolidation, so you don't have to babysit it.
All of this shipped in **v2.2.0**, along with a 34→13 tool consolidation and a rebuilt retrieval engine. [Full release notes →](https://github.com/samvallad33/vestige/releases/tag/v2.2.0)
---
## 🔬 This is real neuroscience, not a metaphor
I get skeptical when projects wave the word "neuroscience" around, so here's my receipt: every mechanism below is a real, cited paper, implemented in Rust, running locally on your machine. None of it phones a model in the cloud to sound smart.
| Mechanism | What it does for you | Grounded in |
|---|---|---|
| **Prediction-Error Gating** | Redundant info gets merged, contradictory gets superseded, only the novel gets stored | The hippocampal novelty signal |
| **FSRS-6 Spaced Repetition** | 21 parameters of the mathematics of forgetting: used memories stay, unused fade | Modern spaced-repetition research |
| **Retroactive Salience Backfill** | Backward causal reach to the root cause of a failure | Zaki/Cai et al. 2024, *Nature* 637:145–155 |
| **Synaptic Tagging** | A memory that looked trivial this morning can be tagged critical tonight | [Frey & Morris 1997](https://doi.org/10.1038/385533a0) |
| **Spreading Activation** | Search "auth bug," surface last week's JWT update: memory is a graph, not a list | [Collins & Loftus 1975](https://doi.org/10.1037/0033-295X.82.6.407) |
| **Dual-Strength Model** | Storage strength vs. retrieval strength: deeply stored ≠ instantly recalled, just like you | [Bjork & Bjork 1992](https://doi.org/10.1016/S0079-7421(08)60016-9) |
| **Memory Dreaming** | Sleep-like consolidation: replays, connects, synthesizes insights to a graph | Active-dreaming consolidation |
| **Active Forgetting (`suppress`)** | Top-down inhibition that *compounds* and cascades to neighbors, reversible for 24h | [Anderson 2025](https://www.nature.com/articles/s41583-025-00929-y) · [Davis 2020](https://pmc.ncbi.nlm.nih.gov/articles/PMC7477079/) |

[**Read the full science doc →**](docs/SCIENCE.md) for every feature and every paper.
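To make "fade on a forgetting curve" concrete, here is a small sketch of the power-law retrievability curve as I understand it from the published FSRS-6 description. Treat it as an illustration under assumptions, not Vestige's implementation: the function name is hypothetical, and the `decay` value of `0.2` is a placeholder rather than a trained parameter. The curve is built so that retrievability is exactly 90% when the elapsed time equals the memory's stability.

```rust
/// Probability of recall after `elapsed_days`, for a memory whose
/// stability is `stability_days`. `decay` must be positive.
/// The scaling factor is chosen so that R(stability, stability) = 0.9.
fn retrievability(elapsed_days: f64, stability_days: f64, decay: f64) -> f64 {
    let factor = 0.9_f64.powf(-1.0 / decay) - 1.0;
    (1.0 + factor * elapsed_days / stability_days).powf(-decay)
}

fn main() {
    let decay = 0.2; // placeholder value, assumed for illustration
    for days in [0.0, 1.0, 10.0, 30.0, 90.0] {
        // A memory with 10-day stability: unused, it fades as days pass.
        let r = retrievability(days, 10.0, decay);
        println!("day {:>4}: retrievability {:.3}", days, r);
    }
}
```

A memory that gets used has its stability raised, which stretches the curve out; one that is never touched keeps sliding down it.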
---
## 13 tools, one brain
v2.2.0 consolidated a sprawling 34-tool surface into **13 sharp ones** your agent actually reaches for. Old names still work as hidden aliases, so nothing breaks.
| Tool | What it does |
|---|---|
| `recall` | The retrieval engine: folds search + deep reasoning + contradiction detection into one call. F32 embeddings, Reciprocal Rank Fusion, claim-vs-memory checks. |
| 🧠 `backfill` | **Memory with hindsight**: backward causal reach to a failure's root cause (Zaki/Cai et al. 2024). |
| 💾 `smart_ingest` | Stores with CREATE / UPDATE / SUPERSEDE via Prediction-Error Gating. Batch session-end saves. |
| `memory` | Get, edit, promote, demote, check state, purge content + embeddings. |
| 🧩 `graph` | Reasoning chains, associations, bridges, predictions, force-directed export. |
| `maintain` | Consolidate, dream, GC, importance-score, backup, export, restore: one maintenance verb. |
| 🧹 `dedup` | Self-healing duplicate detection + merge (8 old tools → 1). |
| 🚫 `suppress` | Top-down active forgetting: compounds, cascades, reversible 24h. The memory is *inhibited, not erased.* |
| `memory_status` | Health + stats + trends + recommendations in one packet. |
| 🧬 `codebase` · `intention` · `source_sync` · `session_start` | Per-project code memory · "remind me when X" · external-source connectors · one-call session init. |
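
Since `recall` leans on Reciprocal Rank Fusion, here is a short sketch of the standard technique. It assumes the two inputs are a keyword ranking and a vector ranking, which is my guess at the pairing rather than something stated above; the function name, the memory ids, and `k = 60` (the constant commonly used with RRF) are illustrative, not Vestige's internals.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists into one. Each item scores 1 / (k + rank)
/// per list it appears in (rank starts at 1), and scores are summed.
fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64;
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; break ties by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // Hypothetical memory ids ranked by two different retrievers.
    let keyword = vec!["jwt-update", "auth-bug", "readme"];
    let vector = vec!["auth-bug", "login-flow", "jwt-update"];
    for (id, score) in reciprocal_rank_fusion(&[keyword, vector], 60.0) {
        println!("{id}: {score:.5}");
    }
}
```

The appeal of RRF is that it only needs ranks, so scores from very different retrievers never have to be put on a common scale.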
---
## Watch your AI think in 3D
```bash
vestige dashboard # → http://localhost:3927/dashboard
```
Every memory is a glowing node in a real-time, force-directed 3D graph. Connections form as you work. Nodes **pulse** when accessed, **burst** on creation, **fade** on decay. Kick off a consolidation and the whole graph slides into **purple dream mode**, replaying memories that light up in sequence.
Built with SvelteKit 2 · Svelte 5 · Three.js · WebGL bloom · live WebSocket events. 1000+ nodes at 60fps. Installable as a PWA.
---
## 🧩 Works in every editor you use
Vestige speaks MCP, so any client that can register a stdio MCP server can use it.
| Editor | One-liner |
|---|---|
| **Claude Code** | `claude mcp add vestige vestige-mcp -s user` |
| **Codex** | `codex mcp add vestige -- vestige-mcp` |
| **Cursor / VS Code / Windsurf / JetBrains / Xcode / OpenCode** | [Integration guides →](docs/integrations/) |
| **Claude Desktop** | [2-minute setup →](docs/CONFIGURATION.md#claude-desktop-macos) |

### Other install methods (Intel Mac, Windows, build-from-source)
**Update an existing install:**
```bash
vestige update # binaries only
vestige update --sandwich-companion # also refresh optional Claude Code companion files
```
**macOS (Intel):** Microsoft is dropping x86_64 macOS ONNX Runtime prebuilts after v1.23.0, so the Intel Mac build links dynamically against a Homebrew ONNX Runtime:
```bash
brew install onnxruntime
npm install -g vestige-mcp-server@latest
echo 'export ORT_DYLIB_PATH="'"$(brew --prefix onnxruntime)"'/lib/libonnxruntime.dylib"' >> ~/.zshrc && source ~/.zshrc
claude mcp add vestige vestige-mcp -s user
```
Full guide: [`docs/INSTALL-INTEL-MAC.md`](docs/INSTALL-INTEL-MAC.md).
**Windows + Claude Desktop:** quit Claude Desktop from the tray, then in PowerShell:
```powershell
npm install -g vestige-mcp-server@latest
vestige-mcp --version
```
Point `%APPDATA%\Claude\claude_desktop_config.json` at it:
```json
{ "mcpServers": { "vestige": { "command": "vestige-mcp" } } }
```
If it can't find the command, run `where vestige-mcp` and use the exact `.cmd` path.
**Build from source (Rust 1.91+):**
```bash
git clone https://github.com/samvallad33/vestige && cd vestige
cargo build --release -p vestige-mcp
# Apple Silicon GPU: --features metal · NVIDIA: --features qwen3-embeddings,cuda
```
---
## Make your AI use memory automatically
Registering the server exposes the tools; a short instruction tells the agent *when* to call them. Drop in the protocol and your agent saves and recalls on its own:
| You say | Vestige does |
|---|---|
| *"Remember this"* | Saves immediately |
| *"I always..."* / *"I prefer..."* | Saves as a durable preference |
| *"Remind me when..."* | Creates a future trigger (`intention`) |
| *"This is important"* | Saves **and** promotes it |
[Agent memory protocol →](docs/AGENT-MEMORY-PROTOCOL.md) · [Claude Code template →](docs/CLAUDE-SETUP.md)
---
## Under the hood
```
┌──────────────────────────────────────────────────────────┐
│ SvelteKit Dashboard: Three.js 3D graph · WebGL bloom     │
├──────────────────────────────────────────────────────────┤
│ Axum HTTP + WebSocket (:3927): REST + live event stream  │
├──────────────────────────────────────────────────────────┤
│ MCP Server (stdio JSON-RPC): 13 tools · 30 modules       │
├──────────────────────────────────────────────────────────┤
│ Cognitive Engine                                         │
│ FSRS-6 · Spreading Activation · Prediction-Error Gating  │
│ Retroactive Salience Backfill · Synaptic Tagging         │
│ Memory Dreamer · Hippocampal Index · Active Forgetting   │
├──────────────────────────────────────────────────────────┤
│ Storage: SQLite + FTS5 · USearch HNSW · Nomic Embed v1.5 │
│ Optional: Qwen3 reranker · SQLCipher · Metal/CUDA        │
└──────────────────────────────────────────────────────────┘
```
| | |
|---|---|
| **Language** | Rust 2024 (MSRV 1.91), **86,000+ lines** |
| **Binary** | ~23MB, single file |
| **Embeddings** | Nomic Embed Text v1.5 (768d→256d Matryoshka, 8192 ctx); Qwen3 optional |
| **Vector search** | USearch HNSW (≈20× faster than FAISS) |
| **Storage** | SQLite + FTS5, optional SQLCipher encryption |
| **Tests** | **1,550 passing** · clippy `-D warnings` clean |
| **First run** | Downloads ~130MB embedding model once, then **fully offline forever** |
| **Platforms** | macOS (ARM + Intel) · Linux x86_64 · Windows x86_64, all prebuilt |
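
A quick note on "768d→256d Matryoshka": Matryoshka-trained embeddings are designed so that the leading dimensions still work as a smaller embedding on their own. The sketch below shows the general idea, keeping the first 256 values and re-normalizing to unit length so cosine similarity still behaves. The function name is hypothetical, and the model's own recommended preprocessing may include extra steps that are not shown here.

```rust
/// Keep the first `dims` values of an embedding and rescale to unit length.
/// A zero vector is returned unchanged to avoid dividing by zero.
fn truncate_and_normalize(embedding: &[f32], dims: usize) -> Vec<f32> {
    let head = &embedding[..dims.min(embedding.len())];
    let norm = head.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm == 0.0 {
        return head.to_vec();
    }
    head.iter().map(|x| x / norm).collect()
}

fn main() {
    // Stand-in for a 768-dimensional model output (synthetic values).
    let full: Vec<f32> = (0..768).map(|i| (i % 7) as f32 - 3.0).collect();
    let small = truncate_and_normalize(&full, 256);
    let norm: f32 = small.iter().map(|x| x * x).sum::<f32>().sqrt();
    println!("dims: {}, norm: {:.4}", small.len(), norm);
}
```

The payoff is storage and speed: a 256-dimensional vector takes a third of the space of the full 768-dimensional one in the index.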
---
## Go deeper
| | |
|---|---|
| [**FAQ**](docs/FAQ.md) | 30+ real questions answered |
| [**The Science**](docs/SCIENCE.md) | Every feature, every paper |
| [**Storage Modes**](docs/STORAGE.md) | Global · per-project · multi-instance |
| [**Configuration**](docs/CONFIGURATION.md) | CLI, env vars, every knob |
| [**Changelog**](CHANGELOG.md) | The full story, version by version |
---
### If your agent should remember what you taught it yesterday, star it. ⭐
86,000+ lines of Rust · 13 tools · 30 cognitive modules · 130 years of memory research · one 23MB binary that never phones home.
Built by @samvallad33 · AGPL-3.0 · 100% local, 100% yours