Prepare agent-neutral hardening release

2026-06-24 21:38:07 +02:00 · 2026-05-24 16:09:44 -05:00 · 7eba0b1e97
commit 7eba0b1e97
parent 9936928be9
117 changed files with 3679 additions and 513 deletions
--- a/docs/AGENT-MEMORY-PROTOCOL.md
+++ b/docs/AGENT-MEMORY-PROTOCOL.md
@ -0,0 +1,81 @@
+# Agent Memory Protocol
+
+> Minimal instructions for any MCP-compatible agent using Vestige.
+
+Vestige is an MCP server, not a Claude-specific workflow. Register `vestige-mcp`
+with your client, then give the agent a short instruction that makes memory part
+of its normal reasoning loop.
+
+## Register Vestige
+
+Use your client's MCP server configuration format. The command is the same in every client:
+
+```json
+{
+  "mcpServers": {
+    "vestige": {
+      "command": "vestige-mcp"
+    }
+  }
+}
+```
+
+Examples:
+
+```bash
+claude mcp add vestige vestige-mcp -s user
+codex mcp add vestige -- vestige-mcp
+```
+
+## Agent Instruction
+
+Add this to the agent's global or project instruction file:
+
+```text
+Use Vestige as durable local memory.
+
+At the start of a new session, call `session_context` with the current user,
+project, and task context. If `session_context` is unavailable or too broad, call
+`search` with a concrete query matching the current task.
+
+When accuracy or prior decisions matter, call `deep_reference`. When memories may
+conflict, call `contradictions` before answering. Compose retrieved evidence into
+the answer; do not merely paste memory summaries.
+
+Save durable preferences, project decisions, recurring corrections, stable facts,
+and reusable code patterns with `smart_ingest`. Do not store secrets, credentials,
+one-off logs, speculation, or transient command output.
+
+When the user says a memory was useful, call `memory` with `action="promote"`.
+When the user says a memory was wrong or unhelpful, call `memory` with
+`action="demote"`. When the user explicitly asks to erase a memory permanently,
+call `memory` with `action="purge"` and `confirm=true`.
+```
+
+## Practical Tool Choices
+
+| Situation | Tool |
+|-----------|------|
+| Start of session | `session_context` |
+| Find exact identifiers, paths, env vars, names | `search` |
+| Answer from prior decisions or evolving facts | `deep_reference` |
+| Inspect disagreements before answering | `contradictions` |
+| Save a preference, decision, correction, or code pattern | `smart_ingest` |
+| Retrieve, promote, demote, edit, or purge one memory | `memory` |
+| Create a future reminder | `intention` |
+| Check health or maintenance state | `system_status` |
+
+## What Not To Store
+
+- API keys, tokens, passwords, private keys, or session cookies.
+- Raw logs or command output unless the durable lesson is extracted first.
+- Guesswork the agent has not verified.
+- Temporary plans that will be obsolete after the current session.
+- User data the user asked not to retain.
+
+## Portability Notes
+
+The same protocol applies to Claude Code, Codex, Cursor, VS Code, Xcode,
+JetBrains, Windsurf, and any other client that can run a stdio MCP server. Claude
+Code's Cognitive Sandwich hooks are optional companion files; they are not
+required for normal Vestige memory.
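
The "What Not To Store" list in the new protocol file could be backed by a client-side check that runs before an agent calls `smart_ingest`. The following is a minimal Python sketch and is not part of this commit: the function names `looks_like_secret` and `safe_to_ingest` and every regex pattern are illustrative assumptions rather than Vestige behavior, and a real filter would need a much broader pattern set.

```python
import re

# Hypothetical heuristics; none of these patterns come from Vestige itself.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),   # PEM private keys
    re.compile(r"\bAKIA[0-9A-Z]{16}\b"),                 # AWS access key ID shape
    re.compile(r"(?i)\b(password|passwd|api[_-]?key|token|secret)\b\s*[:=]\s*\S+"),
    re.compile(r"(?i)\bbearer\s+[a-z0-9._~+/-]{20,}"),   # bearer tokens
]


def looks_like_secret(text: str) -> bool:
    """Return True if any heuristic pattern matches the candidate memory."""
    return any(pattern.search(text) for pattern in SECRET_PATTERNS)


def safe_to_ingest(text: str) -> bool:
    """Gate to run before handing text to a memory-saving tool call."""
    return bool(text.strip()) and not looks_like_secret(text)
```

A check like this only catches obvious shapes; the instruction text itself remains the primary guard against storing secrets or transient output.
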
--- a/docs/COGNITIVE_SANDWICH.md
+++ b/docs/COGNITIVE_SANDWICH.md
@ -37,7 +37,7 @@ Sanhedrin, preflight, and all Vestige Claude Code hooks are optional. The defaul
 3. **Claude reads the assembled context and generates a draft.**
 4. **By default, no Vestige Stop hooks are installed.** If explicitly enabled, Stop hooks fire serially (any can VETO with `exit 2`, forcing a rewrite):
   - `veto-detector.sh` — fast regex against `veto`-tagged Vestige memories (~50ms)
-   - `sanhedrin.sh` → `sanhedrin-local.py` — optional single-shot semantic verdict
+   - `sanhedrin.sh` → `sanhedrin-local.py` — optional Sanhedrin verifier
   - `synthesis-stop-validator.sh` — regex against forbidden patterns (hedging, summary-instead-of-composition)
 5. **If all enabled Stop hooks return `exit 0`, the response is delivered.**

@ -45,19 +45,44 @@ Sanhedrin, preflight, and all Vestige Claude Code hooks are optional. The defaul

 ## The Sanhedrin Executioner protocol

-The Executioner extracts atomic claims from Claude's draft across 10 classes:
+Sanhedrin has two execution modes:
+
+- **Legacy mode** (`VESTIGE_SANHEDRIN_CLAIM_MODE=0`) keeps the original broad draft-level semantic check for technical-looking responses.
+- **Claim mode** (`VESTIGE_SANHEDRIN_CLAIM_MODE=1`) extracts check-worthy claims, retrieves Vestige evidence per claim, and aggregates structured verdicts before the Stop hook allows delivery.
+
+The claim-mode Executioner extracts atomic claims from Claude's draft across these classes:

 `TECHNICAL` · `BIOGRAPHICAL` · `FINANCIAL` · `ACHIEVEMENT` · `TIMELINE` · `QUANTITATIVE` · `ATTRIBUTION` · `CAUSAL` · `COMPARATIVE` · `EXISTENTIAL` · plus v2.1.0 additions: `VAGUE-QUANTIFIER` · `UNVERIFIED-POSITIVE`

-For each claim, it checks Vestige's `deep_reference` for high-trust contradicting memories. Decision rules:
+For each check-worthy claim, claim mode calls Vestige's `/api/deep_reference` and judges the claim against high-trust durable evidence plus any optional staged evidence overlay. Decision rules:

 | Class | Rule |
 |---|---|
-| TECHNICAL / EXISTENTIAL / TIMELINE | VETO if memory trust > 0.55 directly contradicts |
-| BIOGRAPHICAL / FINANCIAL / ACHIEVEMENT / ATTRIBUTION | VETO if contradicted OR if factual-shaped with zero supporting evidence (fail-closed) |
-| **VAGUE-QUANTIFIER** | VETO on vague achievement or financial claims without enumeration |
+| TECHNICAL / EXISTENTIAL / CAUSAL / COMPARATIVE | VETO only on same-subject durable contradiction; missing memory is `NEI` |
+| BIOGRAPHICAL / FINANCIAL / ACHIEVEMENT / TIMELINE / QUANTITATIVE / ATTRIBUTION / VAGUE-QUANTIFIER about the user | zero high-trust durable evidence is `REFUTED_BY_ABSENCE` and blocks |
+| **VAGUE-QUANTIFIER** | VETO on vague achievement or financial claims without durable enumeration |
 | **UNVERIFIED-POSITIVE** | VETO on specific named institutions/dates/employers not in evidence |

+Structured verdicts:
+
+| Verdict | Meaning |
+|---|---|
+| `SUPPORTED` | High-trust evidence supports or does not contradict the claim |
+| `REFUTED` | High-trust durable evidence directly contradicts the same-subject claim |
+| `REFUTED_BY_ABSENCE` | User-critical claim has no high-trust durable Vestige evidence |
+| `NEI` | Not enough information; allow unless another claim blocks |
+
+By default, the bridge still prints the legacy one-line `yes` / `no - ...` verdict for Stop-hook compatibility. With `VESTIGE_SANHEDRIN_OUTPUT=json`, it emits structured JSON containing `decision`, `reason`, and per-claim verdicts. `sanhedrin.sh` can parse either format.
+
+### Staged evidence overlay
+
+`VESTIGE_SANHEDRIN_STAGE_FILE` may point to a JSON array of current-turn evidence candidates. Sanhedrin can read this staged evidence as context, but staged evidence is deliberately non-durable:
+
+- it never calls `smart_ingest`
+- it cannot promote, demote, merge, suppress, or supersede durable memories
+- it does not satisfy the durable-evidence requirement for `REFUTED_BY_ABSENCE`
+- durable memory writes remain a separate commit-after-pass step
+
 False-positive guards (added v2.1.0 after dogfood):
 - Subject-equality gate (memory about Vestige codebase ≠ contradiction with external tools)
 - Version-discriminator rule (M3 Max ≠ M5 Max; Qwen3.5 ≠ Qwen3.6)
@ -75,10 +100,12 @@ False-positive guards (added v2.1.0 after dogfood):
 vestige sandwich install
 ```

-`vestige update` also refreshes these companion files by default after it updates
-the binaries. The default command does not activate any Claude Code hook. It
-removes old v2.1.0 Vestige hook wiring from `~/.claude/settings.json` while
-preserving unrelated user hooks.
+`vestige update` updates binaries only by default. To refresh these optional
+Claude Code companion files during an update, run
+`vestige update --sandwich-companion`. The companion installer does not activate
+any Claude Code hook unless you pass an explicit opt-in flag. It removes old
+v2.1.0 Vestige hook wiring from `~/.claude/settings.json` while preserving
+unrelated user hooks.

 ### From a checkout

@ -122,7 +149,7 @@ vestige sandwich install \
 |---|---|
 | Python 3.10+ | typically preinstalled |
 | `jq` | `brew install jq` |
-| `vestige-mcp` | `cargo install vestige-mcp` |
+| `vestige-mcp` | `npm install -g vestige-mcp-server` |
 | Claude Code | https://claude.ai/code |

 Optional Apple Silicon local Sanhedrin backend:
@ -158,7 +185,8 @@ cp ~/.claude/settings.json.bak.pre-sandwich ~/.claude/settings.json
 ## Performance notes

 Optional local MLX backend on M3 Max 16-core (400 GB/s memory bandwidth):
- Sanhedrin verdict: 5–15 seconds end-to-end (single deep_reference + single Qwen call)
+- Legacy Sanhedrin verdict: 5–15 seconds end-to-end (single deep_reference + single Qwen call)
+- Claim mode: one `/api/deep_reference` call per extracted check-worthy claim, capped by `VESTIGE_SANHEDRIN_MAX_CLAIMS`
 - mlx_lm.server token generation: ~82 tok/s
 - mlx_lm.server peak resident memory: ~19.7 GB
 - Cold model load: ~5 seconds
@ -176,6 +204,11 @@ On M3 Max 14-core or M2/M1 Max: closer to 3–7s prompt processing, ~50–60 tok
 | `VESTIGE_DASHBOARD_PORT` | `3927` | Vestige MCP HTTP API port used by hooks |
 | `VESTIGE_SANHEDRIN_ENDPOINT` | `http://127.0.0.1:8080/v1/chat/completions` | OpenAI-compatible chat completions endpoint for Sanhedrin |
 | `VESTIGE_SANHEDRIN_MODEL` | `mlx-community/Qwen3.6-35B-A3B-4bit` | Model name sent to the Sanhedrin endpoint |
+| `VESTIGE_SANHEDRIN_CLAIM_MODE` | `1` when installed with `--enable-sanhedrin` | Enables per-claim retrieval and fail-closed user-critical lanes |
+| `VESTIGE_SANHEDRIN_OUTPUT` | `json` when installed with `--enable-sanhedrin` | Emits structured JSON from the bridge; shell hook also accepts legacy text |
+| `VESTIGE_SANHEDRIN_STAGE_FILE` | unset | Optional JSON-array staged evidence overlay, read-only and non-durable |
+| `VESTIGE_SANHEDRIN_MAX_CLAIMS` | `8` | Max check-worthy claims adjudicated per draft |
+| `VESTIGE_SANHEDRIN_PYTHON` | `python3` from `PATH` | Optional Python interpreter override for the Stop hook bridge |
 | `MLX_ENDPOINT` / `VESTIGE_SANDWICH_MODEL` | legacy aliases | Backward-compatible names still read by the bridge |
 | `VESTIGE_MEMORY_DIR` | (auto) | Override per-user Claude memory dir |
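
To make the claim-mode rules in this file concrete, here is a minimal Python sketch (not part of this commit). The verdict names, the user-critical class list, and the default cap of 8 claims come from the documentation changes above; the `ClaimVerdict` type, the function names, and the `allow` / `veto` decision strings are assumptions for illustration and may not match `sanhedrin-local.py`.

```python
from dataclasses import dataclass

# Verdicts that stop delivery, per the structured-verdict table.
BLOCKING = {"REFUTED", "REFUTED_BY_ABSENCE"}

# Classes that fail closed when the claim is about the user.
# UNVERIFIED-POSITIVE has its own rule and is not modeled here.
USER_CRITICAL = {
    "BIOGRAPHICAL", "FINANCIAL", "ACHIEVEMENT", "TIMELINE",
    "QUANTITATIVE", "ATTRIBUTION", "VAGUE-QUANTIFIER",
}


@dataclass
class ClaimVerdict:  # hypothetical shape
    claim: str
    verdict: str  # SUPPORTED | REFUTED | REFUTED_BY_ABSENCE | NEI


def verdict_without_evidence(claim_class: str, about_user: bool) -> str:
    """Verdict for a claim with zero high-trust durable evidence.

    Staged evidence is deliberately not an input: it never satisfies the
    durable-evidence requirement.
    """
    if about_user and claim_class in USER_CRITICAL:
        return "REFUTED_BY_ABSENCE"
    return "NEI"


def aggregate(verdicts: list[ClaimVerdict], max_claims: int = 8) -> dict:
    """Veto if any adjudicated claim blocks; NEI and SUPPORTED pass."""
    blockers = [v for v in verdicts[:max_claims] if v.verdict in BLOCKING]
    if blockers:
        reason = "; ".join(f"{v.verdict}: {v.claim}" for v in blockers)
        return {"decision": "veto", "reason": reason}
    return {"decision": "allow", "reason": "no blocking verdicts"}
```

The point of the sketch is the asymmetry: a missing memory is harmless (`NEI`) for technical claims but blocks user-critical claims, and a single blocking verdict is enough to veto the whole draft.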

--- a/docs/CONFIGURATION.md
+++ b/docs/CONFIGURATION.md
@ -16,7 +16,7 @@ The embedding model is cached in platform-specific directories:

 | Platform | Cache Location |
 |----------|----------------|
-| macOS | `~/Library/Caches/com.vestige.core/fastembed` |
+| macOS | `~/Library/Caches/vestige/fastembed` |
 | Linux | `~/.cache/vestige/fastembed` |
 | Windows | `%LOCALAPPDATA%\vestige\cache\fastembed` |
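
The lookup order implied by this table and the `FASTEMBED_CACHE_PATH` row in the next hunk can be sketched in Python as follows (an illustration, not part of this commit). The function name and parameters are hypothetical, and the condition for using `./.fastembed_cache` (no platform directory can be determined) is an assumption; the real resolver may also honor variables such as `XDG_CACHE_HOME`, which this sketch does not model.

```python
import os
import sys
from pathlib import Path


def resolve_fastembed_cache(env=None, platform=None, home=None) -> Path:
    """Pick the embedding-model cache directory.

    Order: explicit override, then the per-platform cache directory,
    then the relative fallback.
    """
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    home = Path.home() if home is None else Path(home)

    override = env.get("FASTEMBED_CACHE_PATH")
    if override:
        return Path(override)
    if platform == "darwin":
        return home / "Library" / "Caches" / "vestige" / "fastembed"
    if platform.startswith("linux"):
        return home / ".cache" / "vestige" / "fastembed"
    if platform == "win32":
        local_app_data = env.get("LOCALAPPDATA")
        if local_app_data:
            return Path(local_app_data) / "vestige" / "cache" / "fastembed"
    # Assumed fallback when no platform cache directory is known.
    return Path(".fastembed_cache")
```

Passing `env`, `platform`, and `home` explicitly keeps the function testable without touching the real environment.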

@ -36,10 +36,12 @@ Qwen3 currently uses Hugging Face Hub's Candle loader directly, so use the stand
 | `VESTIGE_DATA_DIR` | OS per-user data directory | Storage directory fallback; overridden by `--data-dir`; database lives at `<dir>/vestige.db` |
 | `VESTIGE_EMBEDDING_MODEL` | `nomic-v1.5` | Embedding backend selector. Use `qwen3-0.6b` with a build that enables `qwen3-embeddings` |
 | `RUST_LOG` | `info` (via tracing-subscriber) | Log verbosity + per-module filtering |
-| `FASTEMBED_CACHE_PATH` | `./.fastembed_cache` | Embedding model cache location |
+| `FASTEMBED_CACHE_PATH` | Platform cache directory; `./.fastembed_cache` fallback | Embedding model cache location |
 | `VESTIGE_DASHBOARD_PORT` | `3927` | Dashboard HTTP + WebSocket port |
-| `VESTIGE_HTTP_PORT` | `3928` | Optional MCP-over-HTTP port |
+| `VESTIGE_HTTP_ENABLED` | `false` | Set `true` or `1` to enable optional MCP-over-HTTP |
+| `VESTIGE_HTTP_PORT` | `3928` | Optional MCP-over-HTTP port; `--http-port` also enables HTTP |
 | `VESTIGE_HTTP_BIND` | `127.0.0.1` | HTTP bind address |
+| `VESTIGE_HTTP_ALLOWED_ORIGINS` | localhost origins for the HTTP port | Comma-separated browser origins allowed to call MCP-over-HTTP |
 | `VESTIGE_AUTH_TOKEN` | auto-generated | Dashboard + MCP HTTP bearer auth |
 | `VESTIGE_DASHBOARD_ENABLED` | `false` | Set `true` or `1` to enable the web dashboard |
 | `VESTIGE_CONSOLIDATION_INTERVAL_HOURS` | `6` | FSRS-6 decay cycle cadence |
@ -175,18 +177,17 @@ See [Storage Modes](STORAGE.md) for more options.
 vestige update
 ```

-This updates `vestige`, `vestige-mcp`, `vestige-restore`, and the Cognitive
-Sandwich companion files. The companion refresh keeps hooks disabled by default
-and cleans up old mandatory v2.1.0 hook wiring.
+This updates `vestige`, `vestige-mcp`, and `vestige-restore`. It does not mutate
+Claude Code Cognitive Sandwich companion files unless you explicitly request it.

-**Binaries only:**
+**Also refresh optional Claude Code companion files:**
 ```bash
-vestige update --no-sandwich
+vestige update --sandwich-companion
 ```

 **Pin to specific version:**
 ```bash
-vestige update --version v2.1.1
+vestige update --version v2.1.21
 ```

 **Manage the optional Cognitive Sandwich layer without updating binaries:**
--- a/docs/FAQ.md
+++ b/docs/FAQ.md
@ -22,13 +22,13 @@
 ## Getting Started

 <details>
-<summary><b>"Can Vestige support a two-Claude household?"</b></summary>
+<summary><b>"Can Vestige support multiple agents or MCP clients?"</b></summary>

-**Yes!** See [Storage Modes](STORAGE.md#option-3-multi-claude-household). You can either:
- **Share memories**: Both Claudes point to the same `--data-dir`
- **Separate identities**: Each Claude gets its own data directory
+**Yes.** See [Storage Modes](STORAGE.md#option-3-multi-agent-household). You can either:
+- **Share memories**: Multiple agents point to the same `--data-dir`
+- **Separate identities**: Each agent gets its own data directory

-For two Claudes with distinct personas (e.g., "Domovoi" and "Storm") sharing the same human, use separate directories but consider a shared "household" memory for common knowledge.
+For two agents with distinct roles sharing the same human, use separate directories but consider a shared "household" memory for common knowledge.
 </details>

 <details>
@ -38,28 +38,28 @@ For two Claudes with distinct personas (e.g., "Domovoi" and "Storm") sharing the

 **For non-technical users:**
 1. Have a technical friend do the 5-minute install
-2. Add the CLAUDE.md instructions
-3. Just talk to Claude normally—it handles the memory calls
+2. Add the [agent memory protocol](AGENT-MEMORY-PROTOCOL.md) to your MCP client's instruction file
+3. Just talk normally; the agent handles the memory calls

-**The magic**: Once set up, you never think about it. Claude just... remembers.
+**The magic**: Once set up, you never think about it. Your agent just remembers.
 </details>

 <details>
 <summary><b>"What input do you feed it? How does it create memories?"</b></summary>

-Claude creates memories via MCP tool calls. Three ways:
+Your agent creates memories via MCP tool calls. Three ways:

-1. **Explicit**: You say "Remember that I prefer dark mode" → Claude calls `smart_ingest`
-2. **Automatic**: Claude notices something important → calls `smart_ingest` proactively
-3. **Codebase**: Claude detects patterns/decisions → calls `remember_pattern` or `remember_decision`
+1. **Explicit**: You say "Remember that I prefer dark mode" -> the agent calls `smart_ingest`
+2. **Automatic**: The agent notices something important -> calls `smart_ingest` proactively
+3. **Codebase**: The agent detects patterns/decisions -> calls `codebase(action="remember_pattern")` or `codebase(action="remember_decision")`

-The CLAUDE.md instructions tell Claude when to create memories proactively.
+The agent memory protocol tells the client when to create memories proactively.
 </details>

 <details>
 <summary><b>"Can it be filled with a conversation stream in realtime?"</b></summary>

-Not currently. Vestige is **tool-based**, not stream-based. Claude decides what's worth remembering, not everything gets saved.
+Not currently. Vestige is **tool-based**, not stream-based. The agent decides what's worth remembering; not everything gets saved.

 This is intentional—saving everything would:
 - Bloat the knowledge base
@ -211,11 +211,9 @@ In Vestige's current implementation:

 In Vestige's implementation:
 ```
-importance(
-  memory_id="the-important-one",
-  event_type="user_flag",  # or "emotional", "novelty", "repeated_access", "cross_reference"
-  hours_back=9,   # Look back 9 hours (configurable)
-  hours_forward=2  # Capture next 2 hours too
+importance_score(
+  content="the-important content",
+  context_topics=["release", "memory"]
 )
 ```

@ -330,9 +328,9 @@ The unified `search` always uses hybrid, which gives you the best of both worlds

 Three approaches:

-1. **Mark as important**: `importance(memory_id="xxx", event_type="user_flag")`
+1. **Mark as important**: `importance_score(content="...", event_type="user_flag")`
 2. **Access regularly**: The Testing Effect strengthens memories each time you retrieve them
-3. **Promote explicitly**: `promote_memory(id="xxx")` after it proves valuable
+3. **Promote explicitly**: `memory(action="promote", id="xxx")` after it proves valuable

 For truly critical information, consider also:
 - Using specific tags like `["critical", "never-forget"]`
@ -549,13 +547,13 @@ Common issues:

 | Feature | Notes App | Vestige |
 |---------|-----------|---------|
-| Retrieval | You search manually | Claude searches contextually |
+| Retrieval | You search manually | The agent searches contextually |
 | Decay | Everything stays forever | Unused knowledge fades naturally |
 | Duplicates | You manage manually | Prediction Error Gating auto-merges |
 | Context | Static text | Active part of AI reasoning |
 | Strengthening | Manual review | Automatic via Testing Effect |

-The key difference: **Vestige is part of Claude's cognitive loop.** Notes are external reference—Vestige is internal memory.
+The key difference: **Vestige is part of the agent's cognitive loop.** Notes are external reference; Vestige is active working memory.
 </details>

 <details>
@ -619,7 +617,7 @@ Why Nomic:
 - No API costs or rate limits
 - Fast enough for real-time search

-The model is cached at `~/.cache/huggingface/` after first run.
+The model is cached in the platform user cache directory first, with `./.fastembed_cache` as a fallback. Set `FASTEMBED_CACHE_PATH` to choose a specific cache path.
 </details>

 <details>
@ -817,11 +815,11 @@ See [CLAUDE-SETUP.md](CLAUDE-SETUP.md) for the full template. The key elements:
 **During Work**:
 - Notice a pattern? `codebase(action="remember_pattern")`
 - Made a decision? `codebase(action="remember_decision")` with rationale
- Something important? `importance()` to strengthen recent memories
+- Something important? `importance_score(content="...")` to score it before saving or promoting

 **Memory Hygiene**:
- When a memory helps: `promote_memory`
- When a memory misleads: `demote_memory`
+- When a memory helps: `memory(action="promote", id="...")`
+- When a memory misleads: `memory(action="demote", id="...")`
 </details>

 ---
--- a/docs/SCIENCE.md
+++ b/docs/SCIENCE.md
@ -126,11 +126,9 @@ In Vestige's implementation:

 In Vestige:
 ```
-importance(
-  memory_id="the-important-one",
-  event_type="user_flag",
-  hours_back=9,
-  hours_forward=2
+importance_score(
+  content="the-important content",
+  context_topics=["release", "memory"]
 )
 ```

@ -183,7 +181,7 @@ This gives you exact keyword matching AND semantic understanding in one search.
 - Runs 100% local (after first download)
 - Competitive with OpenAI's ada-002

-The model is cached at `~/.cache/huggingface/` after first run.
+The model is cached in the platform user cache directory after first run, with `./.fastembed_cache` as a fallback. Set `FASTEMBED_CACHE_PATH` to choose a specific cache path.

 ---

--- a/docs/STORAGE.md
+++ b/docs/STORAGE.md
@ -1,6 +1,6 @@
 # Storage Configuration

-> Global, per-project, and multi-Claude setups
+> Global, per-project, and multi-agent setups

 ---

@ -89,9 +89,9 @@ Separate memory per codebase. Good for:
 - Different coding styles per project
 - Team environments

-**Claude Code Setup:**
+**MCP Client Setup:**

-Add to your project's `.claude/settings.local.json`:
+Add an MCP server entry to your client or project config:
 ```json
 {
  "mcpServers": {
@ -131,11 +131,11 @@ For power users who want both global AND project memory:
 }
 ```

-### Option 3: Multi-Claude Household
+### Option 3: Multi-Agent Household

-For setups with multiple Claude instances (e.g., Claude Desktop + Claude Code, or two personas):
+For setups with multiple MCP clients or agent personas:

-**Shared Memory (Both Claudes share memories):**
+**Shared Memory (all clients share memories):**
 ```json
 {
  "mcpServers": {
@ -147,27 +147,27 @@ For setups with multiple Claude instances (e.g., Claude Desktop + Claude Code, o
 }
 ```

-**Separate Identities (Each Claude has own memory):**
+**Separate Identities (each agent has its own memory):**

-Claude Desktop config - for "Domovoi":
+Client config for "Research":
 ```json
 {
  "mcpServers": {
    "vestige": {
      "command": "vestige-mcp",
-      "args": ["--data-dir", "~/vestige-domovoi"]
+      "args": ["--data-dir", "~/vestige-research"]
    }
  }
 }
 ```

-Claude Code config - for "Storm":
+Client config for "Builder":
 ```json
 {
  "mcpServers": {
    "vestige": {
      "command": "vestige-mcp",
-      "args": ["--data-dir", "~/vestige-storm"]
+      "args": ["--data-dir", "~/vestige-builder"]
    }
  }
 }
@ -263,9 +263,9 @@ Internally the `Storage` type holds **separate reader and writer connections**,

 | Pattern | Status | Notes |
 |---------|--------|-------|
-| One `vestige-mcp` + one Claude client | **Supported** | The default case. Zero contention. |
-| Multiple Claude clients, separate `--data-dir` | **Supported** | Each process owns its own DB file. No shared state. |
-| Multiple Claude clients, **shared** `--data-dir`, **one** `vestige-mcp` | **Supported** | Clients talk to a single MCP process that owns the DB. Recommended for multi-agent setups. |
+| One `vestige-mcp` + one MCP client | **Supported** | The default case. Zero contention. |
+| Multiple MCP clients, separate `--data-dir` | **Supported** | Each process owns its own DB file. No shared state. |
+| Multiple MCP clients, **shared** `--data-dir`, **one** `vestige-mcp` | **Supported** | Clients talk to a single MCP process that owns the DB. Recommended for multi-agent setups. |
 | CLI (`vestige` binary) reading while `vestige-mcp` runs | **Supported** | WAL makes this safe — queries see a consistent snapshot. |
 | Time Machine / `rsync` backup during writes | **Supported** | WAL journal gets copied with the main file; recovery handles it. |

--- a/docs/VESTIGE_STATE_AND_PLAN.md
+++ b/docs/VESTIGE_STATE_AND_PLAN.md
@ -10,20 +10,20 @@ For current user-facing release information, use:
 - `CHANGELOG.md`
 - `docs/STORAGE.md`
 - `docs/COGNITIVE_SANDWICH.md`
+- `docs/AGENT-MEMORY-PROTOCOL.md`
 - `docs/CLAUDE-SETUP.md`

 ## Current Release Shape

-Vestige v2.1.2 is the "Honest Memory" release. Its public scope is:
+Vestige v2.1.21 is the "Agent-Neutral Hardening" release. Its public scope is:

- concrete literal search for quoted strings, env vars, UUIDs, paths, and code
-  identifiers
- irreversible purge semantics with content-free deletion tombstones
- first-class contradiction inspection through the MCP `contradictions` tool
- the `vestige update` CLI flow for binary and Cognitive Sandwich updates
- dense dream connection persistence fixes
- embedding-model upgrade repair during consolidation
- an opt-in `/dashboard/waitlist` preview for Vestige Pro early access
+- stdio MCP as the default agent transport, with HTTP MCP opt-in only
+- binary-only `vestige update` by default
+- delete and purge confirmation parity for destructive memory removal
+- portable sync fixes for purge tombstones, UPSERT merge, and vector index
+  reloads
+- safer release packaging with dashboard freshness checks and checksums
+- agent-neutral memory instructions for any MCP-compatible client

 The release keeps the local-first baseline intact. Heavy model hooks, local
 verifier models, and preflight automation remain optional.
@ -69,23 +69,25 @@ Vestige is organized as:
 - `packages/vestige-init`: installer helper
 - `docs`: user and integration documentation

-## v2.1.2 Implementation Notes
+## v2.1.21 Implementation Notes

-Concrete search is implemented in the MCP `search` tool and core SQLite
-storage. Literal-looking queries use a keyword path instead of HyDE expansion,
-semantic fusion, FSRS reweighting, retrieval competition, and spreading
-activation.
+HTTP MCP is disabled unless the user passes `--http`, passes `--http-port`, or
+sets `VESTIGE_HTTP_ENABLED=1`. The stdio MCP server remains the portable default
+for Claude Code, Codex, Cursor, VS Code, Xcode, JetBrains, Windsurf, and other
+clients.
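
The opt-in rule for HTTP MCP can be sketched in Python as follows (an illustration, not part of this commit). The flag names, the `VESTIGE_HTTP_ENABLED` and `VESTIGE_HTTP_PORT` variables, and the default port `3928` come from the documentation in this diff; the function name, the `--http-port=N` spelling, and the precedence of flags over environment variables are assumptions.

```python
import os

DEFAULT_HTTP_PORT = 3928


def http_settings(argv, env=None):
    """Return (enabled, port) for the optional MCP-over-HTTP transport."""
    env = os.environ if env is None else env
    # Documented values are `true` or `1`; anything else leaves HTTP off.
    enabled = env.get("VESTIGE_HTTP_ENABLED", "") in {"1", "true"}
    # Setting only the port variable does not enable HTTP in this sketch.
    port = int(env.get("VESTIGE_HTTP_PORT", DEFAULT_HTTP_PORT))

    args = list(argv)
    for i, arg in enumerate(args):
        if arg == "--http":
            enabled = True
        elif arg == "--http-port" and i + 1 < len(args):
            enabled = True  # passing a port also enables HTTP
            port = int(args[i + 1])
        elif arg.startswith("--http-port="):
            enabled = True
            port = int(arg.split("=", 1)[1])
    return enabled, port
```

With no flags and no environment variables the function reports HTTP as disabled, which mirrors the stdio-first default described above.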

 Purge is implemented transactionally in storage and surfaced through the MCP
 `memory` tool. `memory(action="purge", confirm=true)` is the explicit hard
-delete path. `delete` remains a backwards-compatible alias.
+delete path. `delete` remains a backwards-compatible alias but also requires
+`confirm=true`.

-Contradictions are exposed as a first-class MCP tool and reuse the same trust
-and topic-overlap logic used by the deeper reference pipeline.
+Portable merge imports preserve both sync tombstones and non-content deletion
+tombstones. Keyed table writes use UPSERT rather than `INSERT OR REPLACE` so
+related rows are not accidentally cascaded away.
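
The difference between the two write styles is easy to reproduce with Python's built-in `sqlite3` module. This is a standalone sketch, not part of this commit, and the `memories` / `tags` tables are hypothetical stand-ins rather than Vestige's schema. In SQLite, `INSERT OR REPLACE` resolves a key conflict by deleting the existing row and inserting a new one, so `ON DELETE CASCADE` children are removed; an UPSERT updates the row in place.

```python
import sqlite3

REPLACE_SQL = "INSERT OR REPLACE INTO memories (id, content) VALUES (?, ?)"
UPSERT_SQL = (
    "INSERT INTO memories (id, content) VALUES (?, ?) "
    "ON CONFLICT(id) DO UPDATE SET content = excluded.content"
)


def child_rows_after(write_sql: str) -> int:
    """Rewrite an existing parent row and count the surviving child rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # cascades need this per connection
    conn.executescript(
        """
        CREATE TABLE memories (id TEXT PRIMARY KEY, content TEXT);
        CREATE TABLE tags (
            memory_id TEXT REFERENCES memories(id) ON DELETE CASCADE,
            tag TEXT
        );
        INSERT INTO memories VALUES ('m1', 'old');
        INSERT INTO tags VALUES ('m1', 'release');
        """
    )
    conn.execute(write_sql, ("m1", "new"))
    count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    print("INSERT OR REPLACE leaves", child_rows_after(REPLACE_SQL), "tag rows")
    print("UPSERT leaves", child_rows_after(UPSERT_SQL), "tag rows")
```

The UPSERT form requires SQLite 3.24 or newer, which current Python builds normally bundle.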

-The waitlist preview is a dashboard route. Its capture and support endpoints
-are controlled by opt-in public dashboard environment variables. If unset, the
-page does not silently capture private signup data.
+Claude Code Cognitive Sandwich files are optional companion files, not the
+default Vestige setup path. Use `vestige update --sandwich-companion` or
+`vestige sandwich install` only when that hook layer is wanted.

 ## 15. Autopilot Rationale

--- a/docs/integrations/codex-intelligent-memory.md
+++ b/docs/integrations/codex-intelligent-memory.md
@ -0,0 +1,72 @@
+# Codex Intelligent Memory Protocol
+
+Codex can connect to Vestige through MCP, but MCP registration alone only makes
+the tools available. It does not make Codex automatically reason with memory.
+
+Use this protocol when configuring a Codex workspace that should behave like it
+has long-term cognitive memory.
+
+## 1. Register Vestige MCP
+
+```toml
+[mcp_servers.vestige]
+command = "/absolute/path/to/vestige-mcp"
+```
+
+Restart Codex after changing MCP configuration.
+
+## 2. Add An `AGENTS.md` Trigger
+
+Codex reads `AGENTS.md` files as workspace instructions. Put a file at the repo
+root, or a higher workspace root, with a rule like:
+
+```markdown
+Before answering substantive prompts, consult Vestige using the current prompt
+plus project and user context. Use `session_context` for broad context, `search`
+for quick memory checks, and `deep_reference` for decisions, contradictions, or
+accuracy-sensitive questions. Compose memories into actions; do not summarize
+retrievals.
+```
+
+This is the Codex equivalent of the lightweight top-bread memory trigger.
+
+## 3. Use A Query Router
+
+Use the smallest call that can change the answer:
+
+- `session_context`: start of a topic or project switch.
+- `search`: identity, preference, exact memory, or quick project context.
+- `deep_reference` / `cross_reference`: decision history, contradictions,
+  timelines, or root-cause analysis.
+- `memory(get_batch)`: expand specific load-bearing memories.
+- `smart_ingest`: save durable corrections, decisions, or new preferences.
+
+## 4. Compose, Do Not Summarize
+
+Retrieved memory is evidence, not the final answer.
+
+Use this mental transform:
+
+```text
+memory fact -> implication -> action
+```
+
+If memory does not change the action, do not mention it. If it does, make the
+changed recommendation clear.
+
+## 5. Know The Limit
+
+Claude Code's Cognitive Sandwich can use `UserPromptSubmit` and `Stop` hooks to
+wrap every response. Codex may expose different hook events depending on version.
+Do not assume Claude's hook chain is active in Codex just because Vestige MCP is
+registered.
+
+For Codex, the reliable portable layer is:
+
+1. MCP server configured.
+2. `AGENTS.md` instruction trigger.
+3. Local Codex rule docs.
+4. Explicit agent discipline: call Vestige before substantive answers.
+
+If a future Codex version supports a stable pre-prompt hook, wire that hook to
+inject a short Vestige reminder or context packet before the model answers.
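
The query-router idea in section 3 of the new Codex protocol file can be pictured as a small dispatch function. This Python sketch is not part of this commit: the tool names come from the document, but the keyword cues, their ordering, and the `route` function are hypothetical. In practice the agent makes this choice from the instruction text rather than from string matching.

```python
# Ordered from most specific to broadest; the first matching route wins.
# All cue phrases are illustrative guesses, not part of Vestige.
ROUTES = [
    ("smart_ingest", ("remember that", "from now on", "correction:")),
    ("deep_reference", ("why did we", "decision", "contradict", "timeline", "root cause")),
    ("search", ("prefer", "where is", "what is my")),
]


def route(prompt: str, new_topic: bool = False) -> str:
    """Pick the smallest Vestige call likely to change the answer."""
    if new_topic:
        # Start of a topic or a project switch: load broad context first.
        return "session_context"
    lowered = prompt.lower()
    for tool, cues in ROUTES:
        if any(cue in lowered for cue in cues):
            return tool
    # A quick memory check is the cheapest default.
    return "search"
```

For example, `route("Why did we pick SQLite?")` selects `deep_reference`, while an unmatched prompt falls through to `search`.
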
--- a/docs/integrations/codex.md
+++ b/docs/integrations/codex.md
@ -89,6 +89,27 @@ args = ["--data-dir", "/Users/you/projects/my-app/.vestige"]

 ---

+## Intelligent Memory Protocol
+
+MCP registration makes Vestige tools available to Codex. It does not, by itself,
+force Codex to call those tools before answering.
+
+For workspaces where Codex should behave like it has persistent cognitive
+memory, add an `AGENTS.md` file at the workspace or repo root:
+
+```markdown
+Before answering substantive prompts, consult Vestige using the current prompt
+plus project and user context. Use `session_context` for broad context, `search`
+for quick memory checks, and `deep_reference` for decisions, contradictions, or
+accuracy-sensitive questions. Compose memories into actions; do not summarize
+retrievals.
+```
+
+Then use the full protocol in
+[`codex-intelligent-memory.md`](./codex-intelligent-memory.md).
+
+---
+
 ## Troubleshooting

 <details>
--- a/docs/integrations/xcode.md
+++ b/docs/integrations/xcode.md
@ -165,7 +165,7 @@ See [CLAUDE.md templates](../CLAUDE-SETUP.md) for a full setup.
 The first time Vestige runs, it downloads the embedding model (~130MB). In Xcode's sandboxed environment, the cache location is:

 ```
-~/Library/Caches/com.vestige.core/fastembed
+~/Library/Caches/vestige/fastembed
 ```

 If the download fails behind a corporate proxy, pre-download by running `vestige-mcp` once from your terminal.
@ -230,7 +230,7 @@ Xcode 26.3 has a feature gate (`claudeai-mcp`) that may block custom MCP servers
 The first run downloads ~130MB. If Xcode's sandbox blocks the download:

 1. Run `vestige-mcp` once from your terminal to cache the model
-2. The cache at `~/Library/Caches/com.vestige.core/fastembed` will be available to the sandboxed instance
+2. The cache at `~/Library/Caches/vestige/fastembed` will be available to the sandboxed instance

 Behind a proxy:
 ```bash
--- a/docs/launch/blog-post.md
+++ b/docs/launch/blog-post.md
@ -318,7 +318,7 @@ SQLite is the most deployed database in the world for a reason. WAL mode gives u

 ### fastembed (Nomic Embed v1.5)

-All embeddings run locally. The Nomic Embed v1.5 model produces 768-dimensional vectors, runs via ONNX Runtime, and is competitive with OpenAI's ada-002. The model is cached at `~/.cache/huggingface/` after first download (~130MB). No API keys. No network calls during operation. Your memories never leave your machine.
+All embeddings run locally. The Nomic Embed v1.5 model produces 768-dimensional vectors, runs via ONNX Runtime, and is competitive with OpenAI's ada-002. The model is cached in the platform user cache directory after first download (~130MB), with `./.fastembed_cache` as a fallback. No API keys. No network calls during operation. Your memories never leave your machine.

 ### Performance