vestige/CLAUDE.md
Sam Valladares 04781a95e2 feat: v2.0.4 "Deep Reference" — cognitive reasoning engine + 10 bug fixes
New features:
- deep_reference tool (#22): 8-stage cognitive reasoning pipeline with FSRS-6
  trust scoring, intent classification (FactCheck/Timeline/RootCause/Comparison/
  Synthesis), spreading activation expansion, temporal supersession, trust-weighted
  contradiction analysis, relation assessment, dream insight integration, and
  algorithmic reasoning chain generation — all without calling an LLM
- cross_reference (#23): backward-compatible alias for deep_reference
- retrieval_mode parameter on search (precise/balanced/exhaustive)
- get_batch action on memory tool (up to 20 IDs per call)
- Token budget raised from 10K to 100K on search + session_context
- Dates (createdAt/updatedAt) on all search results and session_context lines

Bug fixes (GitHub Issue #25 — all 10 resolved):
- state_transitions empty: wired record_memory_access into strengthen_batch
- chain/bridges no storage fallback: added with edge deduplication
- knowledge_edges dead schema: documented as deprecated
- insights not persisted from dream: wired save_insight after generation
- find_duplicates threshold dropped: serde alias fix
- search min_retention ignored: serde aliases for snake_case params
- intention time triggers null: removed dead trigger_at embedding
- changelog missing dreams: added get_dream_history + event integration
- phantom Related IDs: clarified message text
- fsrs_cards empty: documented as harmless dead schema

Security hardening:
- HTTP transport CORS: permissive() → localhost-only
- Auth token panic guard: &token[..8] → safe min(8) slice
- UTF-8 boundary fix: floor_char_boundary on content truncation
- All unwrap() removed from HTTP transport (unwrap_or_else fallback)
- Dream memory_count capped at 500 (prevents O(N²) hang)
- Dormant state threshold aligned (0.3 → 0.4)

Stats: 23 tools, 758 tests, 0 failures, 0 warnings, 0 unwraps in production

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:15:26 -05:00

# Vestige v2.0.4 — Cognitive Memory & Reasoning System
Vestige is your long-term memory AND reasoning engine. 29 stateful cognitive modules implement real neuroscience: FSRS-6 spaced repetition, synaptic tagging, prediction error gating, hippocampal indexing, spreading activation, reconsolidation, and dual-strength memory theory. **Use it automatically. Use it aggressively.**
**NEW: `deep_reference` — call this for ALL factual questions.** It doesn't just retrieve — it REASONS across memories with FSRS-6 trust scoring, intent classification, contradiction analysis, and generates a pre-built reasoning chain. Read the `reasoning` field FIRST.
---
## Session Start Protocol
Every conversation, before responding to the user:
```
session_context({
queries: ["user preferences", "[current project] context"],
context: { codebase: "[project]", topics: ["[current topics]"] },
token_budget: 2000
})
```
Then check `automationTriggers` in the response:
- `needsDream` → call `dream` (consolidates memories, discovers hidden connections)
- `needsBackup` → call `backup`
- `needsGc` → call `gc(dry_run: true)` then review
- `totalMemories` > 700 → call `find_duplicates`
Say "Remembering..." then retrieve context before answering.
> **Fallback:** If `session_context` unavailable: `search` × 2 → `intention` check → `system_status` → `predict`.
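
The trigger checks above can be sketched as a small dispatcher. This is a minimal illustration in Python, not part of Vestige: `plan_followups` is a hypothetical helper, and it assumes `automationTriggers` arrives as a flat object of booleans with `totalMemories` available alongside it.

```python
def plan_followups(triggers: dict, total_memories: int) -> list[str]:
    """Return the follow-up tool calls implied by a session_context response.

    Assumption: `triggers` is a flat dict of booleans keyed by the
    trigger names listed above; missing keys count as False.
    """
    calls = []
    if triggers.get("needsDream"):
        calls.append("dream")
    if triggers.get("needsBackup"):
        calls.append("backup")
    if triggers.get("needsGc"):
        # Always a dry run first; the result is reviewed before a real gc.
        calls.append("gc(dry_run: true)")
    if total_memories > 700:
        calls.append("find_duplicates")
    return calls


if __name__ == "__main__":
    print(plan_followups({"needsDream": True, "needsGc": True}, total_memories=812))
```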
---
## Complete Tool Reference (23 Tools)
### session_context — One-Call Initialization
```
session_context({
queries: ["user preferences", "project context"], // search queries
context: { codebase: "project-name", topics: ["svelte", "rust"], file: "src/main.rs" },
token_budget: 2000, // 100-100000, controls response size
include_status: true, // system health
include_intentions: true, // triggered reminders
include_predictions: true // proactive memory predictions
})
```
Returns: markdown context + `automationTriggers` + `expandable` IDs for on-demand retrieval.
### smart_ingest — Save Anything
**Single mode** — auto-decides CREATE/UPDATE/SUPERSEDE via Prediction Error Gating:
```
smart_ingest({
content: "What to remember",
tags: ["tag1", "tag2"],
node_type: "fact", // fact|concept|event|person|place|note|pattern|decision
source: "optional reference",
forceCreate: false // bypass dedup when needed
})
```
**Batch mode** — save up to 20 items in one call (session end, pre-compaction):
```
smart_ingest({
items: [
{ content: "Item 1", tags: ["session-end"], node_type: "fact" },
{ content: "Item 2", tags: ["bug-fix"], node_type: "fact" }
]
})
```
Each item runs the full cognitive pipeline: importance scoring → intent detection → synaptic tagging → hippocampal indexing → PE gating → cross-project recording.
### search — 7-Stage Cognitive Search
```
search({
query: "search query",
limit: 10, // 1-100
min_retention: 0.0, // filter by retention strength
min_similarity: 0.5, // minimum cosine similarity
detail_level: "summary", // brief|summary|full
context_topics: ["rust", "debugging"], // boost topic-matching memories
token_budget: 3000, // 100-100000, truncate to fit
retrieval_mode: "balanced" // precise|balanced|exhaustive (v2.0.4)
})
```
Retrieval modes: `precise` (fast, no activation/competition), `balanced` (default 7-stage pipeline), `exhaustive` (5x overfetch, deep graph traversal, no competition suppression).
Pipeline: Overfetch → Rerank (cross-encoder) → Temporal boost → Accessibility filter (FSRS-6) → Context match (Tulving 1973) → Competition (Anderson 1994) → Spreading activation. **Every search strengthens the memories it finds (Testing Effect).**
### memory — Read, Edit, Delete, Promote, Demote
```
memory({ action: "get", id: "uuid" }) // full node with all FSRS state
memory({ action: "edit", id: "uuid", content: "updated text" }) // preserves FSRS state, regenerates embedding
memory({ action: "delete", id: "uuid" })
memory({ action: "promote", id: "uuid", reason: "was helpful" }) // +0.20 retrieval, +0.10 retention, 1.5x stability
memory({ action: "demote", id: "uuid", reason: "was wrong" }) // -0.30 retrieval, -0.15 retention, 0.5x stability
memory({ action: "state", id: "uuid" }) // Active/Dormant/Silent/Unavailable + accessibility score
memory({ action: "get_batch", ids: ["uuid1", "uuid2", "uuid3"] }) // retrieve up to 20 full memories at once (v2.0.4)
```
Promote/demote does NOT delete — it adjusts ranking. Demoted memories rank lower; alternatives surface instead.
`get_batch` is designed for batch retrieval of expandable overflow IDs from search/session_context.
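
As a rough illustration of the promote/demote adjustments listed in the comments above, here is a Python sketch. The `Strengths` type and its field names are hypothetical, and clamping retrieval and retention to [0, 1] is an assumption; the actual Rust implementation may behave differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strengths:
    """Hypothetical container for the three values that promote/demote adjust."""
    retrieval: float
    retention: float
    stability: float  # e.g. in days


def _clamp01(x: float) -> float:
    # Assumption: retrieval and retention are kept within [0, 1].
    return max(0.0, min(1.0, x))


def promote(s: Strengths) -> Strengths:
    # +0.20 retrieval, +0.10 retention, 1.5x stability
    return Strengths(_clamp01(s.retrieval + 0.20),
                     _clamp01(s.retention + 0.10),
                     s.stability * 1.5)


def demote(s: Strengths) -> Strengths:
    # -0.30 retrieval, -0.15 retention, 0.5x stability
    return Strengths(_clamp01(s.retrieval - 0.30),
                     _clamp01(s.retention - 0.15),
                     s.stability * 0.5)
```

Neither function removes anything: a demoted memory keeps its content and simply ranks lower.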
### codebase — Code Patterns & Architectural Decisions
```
codebase({ action: "remember_pattern", name: "Pattern Name",
description: "How it works and when to use it",
files: ["src/file.rs"], codebase: "project-name" })
codebase({ action: "remember_decision", decision: "What was decided",
rationale: "Why", alternatives: ["Option A", "Option B"],
files: ["src/file.rs"], codebase: "project-name" })
codebase({ action: "get_context", codebase: "project-name", limit: 10 })
// Returns: patterns, decisions, cross-project insights
```
### intention — Prospective Memory (Reminders)
```
intention({ action: "set", description: "What to do",
trigger: { type: "context", topic: "authentication" }, // fires when discussing auth
priority: "high" })
intention({ action: "set", description: "Deploy by Friday",
trigger: { type: "time", at: "2026-03-06T17:00:00Z" }, // 2026-03-06 is a Friday
deadline: "2026-03-06T17:00:00Z" })
intention({ action: "set", description: "Check test coverage",
trigger: { type: "context", codebase: "vestige", file_pattern: "*.test.*" } })
intention({ action: "check", context: { codebase: "vestige", topics: ["testing"] } })
intention({ action: "update", id: "uuid", status: "complete" })
intention({ action: "list", filter_status: "active" })
```
### dream — Memory Consolidation
```
dream({ memory_count: 50 })
```
5-stage cycle: Replay → Cross-reference → Strengthen → Prune → Transfer. Uses Waking SWR tagging (70% tagged + 30% random for diversity). Discovers hidden connections, generates insights, persists new edges to the activation network.
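
The 70% tagged + 30% random selection can be pictured with the Python sketch below. It is an illustration only: `sample_for_dream` and its arguments are hypothetical names, topping up from the random pool when too few tagged memories exist is an assumption, and the cap of 500 follows the v2.0.4 release notes rather than anything stated in this section.

```python
import random

MAX_MEMORY_COUNT = 500  # cap taken from the v2.0.4 release notes


def sample_for_dream(tagged_ids, other_ids, memory_count=50, seed=None):
    """Pick memories to replay: roughly 70% tagged, the rest random.

    Assumption: if there are too few tagged memories, the remainder is
    filled from the untagged pool so the batch size stays the same.
    """
    rng = random.Random(seed)
    memory_count = min(memory_count, MAX_MEMORY_COUNT)
    n_tagged = min(len(tagged_ids), round(memory_count * 0.7))
    n_random = min(len(other_ids), memory_count - n_tagged)
    return rng.sample(tagged_ids, n_tagged) + rng.sample(other_ids, n_random)
```

The random share is what keeps a dream from replaying only memories that are already marked as important.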
### explore_connections — Graph Traversal
```
explore_connections({ action: "associations", from: "uuid", limit: 10 })
// Spreading activation from a memory — find related memories via graph traversal
explore_connections({ action: "chain", from: "uuid-A", to: "uuid-B" })
// Build reasoning path between two memories (A*-like pathfinding)
explore_connections({ action: "bridges", from: "uuid-A", to: "uuid-B" })
// Find connecting memories that bridge two concepts
```
### predict — Proactive Retrieval
```
predict({ context: { codebase: "vestige", current_file: "src/main.rs",
current_topics: ["error handling", "rust"] } })
```
Returns: predictions with confidence, suggestions, speculative retrievals, top interests. Uses SpeculativeRetriever's learned patterns from access history.
### importance_score — Should I Save This?
```
importance_score({ content: "Content to evaluate",
context_topics: ["debugging"], project: "vestige" })
```
4-channel model: novelty (0.25), arousal (0.30), reward (0.25), attention (0.20). Composite > 0.6 = save it.
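
A minimal Python sketch of the weighted composite described above. The function names are hypothetical, and it assumes each channel score is already normalized to [0, 1].

```python
# Channel weights as listed above; they sum to 1.0.
WEIGHTS = {"novelty": 0.25, "arousal": 0.30, "reward": 0.25, "attention": 0.20}


def composite_importance(channels: dict) -> float:
    """Weighted sum of the four channel scores (each assumed to be in [0, 1])."""
    return sum(weight * channels[name] for name, weight in WEIGHTS.items())


def should_save(channels: dict, threshold: float = 0.6) -> bool:
    """Apply the 'composite > 0.6 = save it' rule."""
    return composite_importance(channels) > threshold
```

For example, scores of novelty 0.9, arousal 0.8, reward 0.5, and attention 0.4 give 0.225 + 0.24 + 0.125 + 0.08 = 0.67, which clears the threshold.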
### find_duplicates — Dedup Memory
```
find_duplicates({ similarity_threshold: 0.80, limit: 20, tags: ["bug-fix"] })
```
Cosine similarity clustering. Returns merge/review suggestions.
### memory_timeline — Chronological Browse
```
memory_timeline({ start: "2026-02-01", end: "2026-03-01",
node_type: "decision", tags: ["vestige"], limit: 50, detail_level: "summary" })
```
### memory_changelog — Audit Trail
```
memory_changelog({ memory_id: "uuid", limit: 20 }) // per-memory history
memory_changelog({ start: "2026-03-01", limit: 20 }) // system-wide
```
### memory_health — Retention Dashboard
```
memory_health()
```
Returns: avg retention, distribution buckets (0-20%, 20-40%, etc.), trend (improving/declining/stable), recommendation.
### memory_graph — Visualization Export
```
memory_graph({ query: "search term", depth: 2, max_nodes: 50 })
memory_graph({ center_id: "uuid", depth: 3, max_nodes: 100 })
```
Returns nodes with force-directed positions + edges with weights.
### deep_reference — Cognitive Reasoning Engine (v2.0.4) ★ USE THIS FOR ALL FACTUAL QUESTIONS
```
deep_reference({ query: "What port does the dev server use?" })
deep_reference({ query: "Should I use prefix caching with vLLM?", depth: 30 })
```
**THE killer tool.** 8-stage cognitive reasoning pipeline:
1. Broad retrieval + cross-encoder reranking
2. Spreading activation expansion (finds connected memories search misses)
3. FSRS-6 trust scoring (retention × stability × reps ÷ lapses)
4. Intent classification (FactCheck / Timeline / RootCause / Comparison / Synthesis)
5. Temporal supersession (newer high-trust replaces older)
6. Trust-weighted contradiction analysis (only flags conflicts between strong memories)
7. Relation assessment (Supports / Contradicts / Supersedes / Irrelevant per pair)
8. **Template reasoning chain** — pre-built natural language reasoning the AI validates
Parameters: `query` (required), `depth` (5-50, default 20).
Returns: `intent`, `reasoning` (THE KEY FIELD — read this first), `recommended` (highest-trust answer), `evidence` (trust-sorted), `contradictions`, `superseded`, `evolution`, `related_insights`, `confidence`.
`cross_reference` is a backward-compatible alias that calls `deep_reference`.
### Maintenance Tools
```
system_status() // health + stats + warnings + recommendations
consolidate() // FSRS-6 decay cycle + embedding generation
backup() // SQLite backup → ~/.vestige/backups/
export({ format: "json", tags: ["bug-fix"], since: "2026-01-01" })
gc({ min_retention: 0.1, dry_run: true }) // garbage collect (dry_run first!)
restore({ path: "/path/to/backup.json" })
```
---
## Mandatory Save Gates
**You MUST NOT proceed past a save gate without executing the save.**
| Gate | Trigger | Action |
|------|---------|--------|
| **BUG_FIX** | After any error is resolved | `smart_ingest({ content: "BUG FIX: [error]\nRoot cause: [why]\nSolution: [fix]\nFiles: [paths]", tags: ["bug-fix", "project"], node_type: "fact" })` |
| **DECISION** | After any architectural/design choice | `codebase({ action: "remember_decision", decision, rationale, alternatives, files, codebase })` |
| **CODE_CHANGE** | After >20 lines or new pattern | `codebase({ action: "remember_pattern", name, description, files, codebase })` |
| **SESSION_END** | Before stopping or compaction | `smart_ingest({ items: [{ content: "SESSION: [summary]", tags: ["session-end"] }] })` |
---
## Trigger Words — Auto-Save
| User Says | Action |
|-----------|--------|
| "Remember this" / "Don't forget" | `smart_ingest` immediately |
| "I always..." / "I never..." / "I prefer..." | Save as preference |
| "This is important" | `smart_ingest` + `memory(action="promote")` |
| "Remind me..." / "Next time..." | `intention({ action: "set" })` |
---
## Cognitive Architecture
### Search Pipeline (7 stages)
1. **Overfetch** — 3x results from hybrid search (0.3 BM25 + 0.7 semantic, nomic-embed-text-v1.5 768D)
2. **Rerank** — Cross-encoder rescoring (Jina Reranker v1 Turbo, 38M params)
3. **Temporal** — Recency + validity window boosting (85% relevance + 15% temporal)
4. **Accessibility** — FSRS-6 retention filter (Active ≥0.7, Dormant ≥0.4, Silent ≥0.1)
5. **Context** — Tulving 1973 encoding specificity (topic overlap → +30% boost)
6. **Competition** — Anderson 1994 retrieval-induced forgetting (winners strengthen, competitors weaken)
7. **Activation** — Spreading activation side effects + predictive model + reconsolidation marking
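
The numeric weights in stages 1, 3, and 5 can be sketched in Python as follows. This is a simplified illustration under stated assumptions: all scores are normalized to [0, 1], the "+30% boost" is applied as a 1.30 multiplier when at least one topic overlaps, and reranking, filtering, competition, and activation are left out. The function names are hypothetical.

```python
def hybrid_score(bm25: float, semantic: float) -> float:
    """Stage 1: 0.3 BM25 + 0.7 semantic similarity."""
    return 0.3 * bm25 + 0.7 * semantic


def temporal_blend(relevance: float, temporal: float) -> float:
    """Stage 3: 85% relevance + 15% temporal signal."""
    return 0.85 * relevance + 0.15 * temporal


def context_boost(score: float, memory_topics, context_topics) -> float:
    """Stage 5: +30% when the memory shares a topic with the current context.

    Assumption: any overlap triggers the full boost.
    """
    if set(memory_topics) & set(context_topics):
        return score * 1.30
    return score


if __name__ == "__main__":
    s = hybrid_score(bm25=1.0, semantic=0.5)
    s = temporal_blend(s, temporal=1.0)
    print(context_boost(s, ["rust"], ["rust", "debugging"]))
```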
### Ingest Pipeline
**Pre:** 4-channel importance scoring (novelty/arousal/reward/attention) + intent detection → auto-tag
**Store:** Prediction Error Gating: similarity >0.92 → UPDATE, 0.75-0.92 → UPDATE/SUPERSEDE, <0.75 → CREATE
**Post:** Synaptic tagging (Frey & Morris 1997, 9h backward + 2h forward) + hippocampal indexing + cross-project recording
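
The Prediction Error Gating thresholds can be read as a three-way decision on the similarity to the closest existing memory. A minimal Python sketch, assuming the boundaries are handled as shown (the text does not say which side 0.75 and 0.92 fall on) and using a hypothetical `gate` function:

```python
def gate(similarity: float) -> str:
    """Map similarity to the nearest existing memory onto an ingest decision.

    Assumption: exactly 0.92 and exactly 0.75 fall in the middle band.
    """
    if similarity > 0.92:
        return "UPDATE"
    if similarity >= 0.75:
        # How the middle band chooses between the two is not described here.
        return "UPDATE_OR_SUPERSEDE"
    return "CREATE"


if __name__ == "__main__":
    for sim in (0.97, 0.80, 0.40):
        print(sim, gate(sim))
```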
### FSRS-6 (State-of-the-Art Spaced Repetition)
- Retrievability: `R = (1 + factor × t / S)^(-w20)`, with 21 trained parameters (`w0`-`w20`)
- Dual-strength model (Bjork & Bjork 1992): storage strength (grows) + retrieval strength (decays)
- Accessibility = retention×0.5 + retrieval×0.3 + storage×0.2
- Reported to need 20-30% fewer reviews than SM-2 (Anki's legacy scheduler) for the same retention
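
The two formulas above can be evaluated with a short Python sketch. `factor` is not defined in this file; the sketch assumes the published FSRS definition, `factor = 0.9^(-1/w20) - 1`, which makes `R` equal 0.9 when elapsed time equals stability. The `w20` value used in the example is a placeholder, not a trained parameter.

```python
def retrievability(t_days: float, stability: float, w20: float) -> float:
    """R = (1 + factor * t / S)^(-w20).

    Assumption: factor = 0.9^(-1/w20) - 1, so that R(t=S) = 0.9.
    """
    factor = 0.9 ** (-1.0 / w20) - 1.0
    return (1.0 + factor * t_days / stability) ** (-w20)


def accessibility(retention: float, retrieval: float, storage: float) -> float:
    """Accessibility = retention*0.5 + retrieval*0.3 + storage*0.2."""
    return 0.5 * retention + 0.3 * retrieval + 0.2 * storage


if __name__ == "__main__":
    W20 = 0.2  # placeholder decay parameter for illustration only
    for t in (0, 10, 30, 90):
        print(t, round(retrievability(t, stability=10.0, w20=W20), 3))
```

With stability 10 days, retrievability starts at 1.0, reaches 0.9 on day 10, and decays along a power-law curve after that.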
### 29 Cognitive Modules (stateful, persist across calls)
**Neuroscience (16):**
ActivationNetwork (Collins & Loftus 1975), SynapticTaggingSystem (Frey & Morris 1997), HippocampalIndex (Teyler & Rudy 2007), ContextMatcher (Tulving 1973), AccessibilityCalculator, CompetitionManager (Anderson 1994), StateUpdateService, ImportanceSignals, NoveltySignal, ArousalSignal, RewardSignal, AttentionSignal, EmotionalMemory (Brown & Kulik 1977), PredictiveMemory, ProspectiveMemory, IntentionParser
**Advanced (11):**
ImportanceTracker, ReconsolidationManager (Nader 5min labile window), IntentDetector (9 intent types), ActivityTracker, MemoryDreamer (5-stage consolidation), MemoryChainBuilder (A*-like), MemoryCompressor (30-day min age), CrossProjectLearner (6 pattern types), AdaptiveEmbedder, SpeculativeRetriever (6 trigger types), ConsolidationScheduler
**Search (2):** Reranker, TemporalSearcher
### Memory States
- **Active** (retention ≥ 0.7): easily retrievable
- **Dormant** (≥ 0.4): retrievable with effort
- **Silent** (≥ 0.1): difficult, needs cues
- **Unavailable** (< 0.1): needs reinforcement
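
The thresholds above amount to a simple classification. A minimal Python sketch, with `memory_state` as a hypothetical helper that takes a score in [0, 1]:

```python
def memory_state(score: float) -> str:
    """Classify a memory by the thresholds listed above."""
    if score >= 0.7:
        return "Active"
    if score >= 0.4:
        return "Dormant"
    if score >= 0.1:
        return "Silent"
    return "Unavailable"


if __name__ == "__main__":
    for score in (0.85, 0.40, 0.25, 0.05):
        print(score, memory_state(score))
```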
### Connection Types
semantic, temporal, causal, spatial, part_of, user_defined; each has a strength (0-1), an activation_count, and timestamps
---
## Advanced Techniques
### Cross-Project Intelligence
The CrossProjectLearner tracks patterns across ALL projects (ErrorHandling, AsyncConcurrency, Testing, Architecture, Performance, Security). When you learn a pattern in one project that works, it becomes available in all projects. Use `codebase({ action: "get_context" })` without a codebase filter to get universal patterns.
### Reconsolidation Window
After any memory is accessed (via search, get, or promote), it enters a 5-minute "labile" state where modifications are enhanced. This is the optimal time to edit memories with new context. The system handles this automatically.
### Synaptic Tagging (Retroactive Importance)
Memories encoded in the last 9 hours can be retroactively promoted when something important happens. If you fix a critical bug, not only does the fix get saved; related memories from the past 9 hours also get importance boosts. The SynapticTaggingSystem handles this automatically.
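
The capture window can be expressed as a time-range check. The Python sketch below uses the 9-hour backward window described here together with the 2-hour forward window listed under the ingest pipeline. `in_capture_window` is a hypothetical helper, and treating the boundaries as inclusive is an assumption.

```python
from datetime import datetime, timedelta, timezone

BACKWARD = timedelta(hours=9)  # memories encoded before the important event
FORWARD = timedelta(hours=2)   # memories encoded shortly after it


def in_capture_window(encoded_at: datetime, event_at: datetime) -> bool:
    """True if a memory encoded at `encoded_at` can be boosted by an
    important event at `event_at` (boundaries inclusive by assumption)."""
    return event_at - BACKWARD <= encoded_at <= event_at + FORWARD


if __name__ == "__main__":
    event = datetime(2026, 4, 9, 18, 0, tzinfo=timezone.utc)
    print(in_capture_window(event - timedelta(hours=8), event))
    print(in_capture_window(event - timedelta(hours=10), event))
```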
### Dream Insights
Dreams don't just consolidate; they generate new insights by cross-referencing recent memories with older knowledge. The insights can reveal contradictions between memories, previously unseen patterns, and connections across different projects. Always check dream results for `insights_generated`.
### Token Budget Strategy
Use `token_budget` on `search` and `session_context` to control response size. For quick lookups: 500. For deep context: 3000-5000. Results that don't fit go to `expandable`; retrieve them with `memory({ action: "get", id: "..." })`, or with `get_batch` for up to 20 IDs at once.
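
One way to picture the budget behavior is a greedy pack over ranked results, with anything that does not fit moved to an overflow list of IDs. The Python sketch below is an illustration only: the 4-characters-per-token estimate, the greedy strategy, and the names `estimate_tokens` and `fit_to_budget` are all assumptions, not Vestige's actual truncation logic.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(text) // 4)


def fit_to_budget(results: list, token_budget: int):
    """Split ranked results into those that fit and overflow IDs.

    `results` is a list of dicts with "id" and "content" keys, best first.
    Returns (included_results, expandable_ids).
    """
    included, expandable, used = [], [], 0
    for result in results:
        cost = estimate_tokens(result["content"])
        if used + cost <= token_budget:
            included.append(result)
            used += cost
        else:
            expandable.append(result["id"])
    return included, expandable


if __name__ == "__main__":
    ranked = [
        {"id": "a", "content": "x" * 400},
        {"id": "b", "content": "y" * 800},
        {"id": "c", "content": "z" * 40},
    ]
    kept, overflow = fit_to_budget(ranked, token_budget=150)
    print([r["id"] for r in kept], overflow)
```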
### Detail Levels
- `brief`: id/type/tags/score only (1-2 tokens per result, good for scanning)
- `summary`: 8 fields including content preview (default, balanced)
- `full`: all FSRS state, timestamps, embedding info (for debugging/analysis)
---
## Memory Hygiene
**Promote** when user confirms helpful, solution worked, info was accurate.
**Demote** when user corrects mistake, info was wrong, led to bad outcome.
**Never save:** secrets, API keys, passwords, temporary debugging state, trivial info.
---
## The One Rule
**When in doubt, save. The cost of a duplicate is near zero (Prediction Error Gating handles dedup). The cost of lost knowledge is permanent.**
Memory is retrieval. Searching strengthens memory. Search liberally, save aggressively.
---
## Development
- **Crate:** `vestige-mcp` v2.0.4, Rust 2024 edition, MSRV 1.91
- **Tests:** 758 (406 mcp + 352 core), zero warnings
- **Build:** `cargo build --release -p vestige-mcp` (features: `embeddings` + `vector-search`)
- **Build (no embeddings):** `cargo build --release -p vestige-mcp --no-default-features`
- **Bench:** `cargo bench -p vestige-core`
- **Architecture:** `McpServer` holds `Arc<Storage>` + `Arc<Mutex<CognitiveEngine>>`
- **Storage:** SQLite WAL mode, `Mutex<Connection>` reader/writer split, FTS5 full-text search
- **Embeddings:** nomic-embed-text-v1.5 (768D, 8K context) via fastembed (local ONNX, no API)
- **Vector index:** USearch HNSW (20x faster than FAISS)
- **Binaries:** `vestige-mcp` (MCP server), `vestige` (CLI), `vestige-restore`
- **Dashboard:** SvelteKit 2 + Svelte 5 + Three.js + Tailwind 4, embedded at `/dashboard`
- **Env vars:** `VESTIGE_DASHBOARD_PORT` (default 3927), `VESTIGE_HTTP_PORT` (default 3928), `VESTIGE_HTTP_BIND` (default 127.0.0.1), `VESTIGE_AUTH_TOKEN` (auto-generated), `VESTIGE_CONSOLIDATION_INTERVAL_HOURS` (default 6), `RUST_LOG`
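
For scripts that sit alongside the server, the documented defaults can be mirrored as in the Python sketch below. It is a hypothetical helper, not Vestige's own configuration code: `VestigeEnv` and `load_env` are illustrative names, and an unset `VESTIGE_AUTH_TOKEN` is represented as `None` because the server generates one itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VestigeEnv:
    """Hypothetical view of the environment variables listed above."""
    dashboard_port: int
    http_port: int
    http_bind: str
    auth_token: Optional[str]  # None means the server auto-generates one
    consolidation_interval_hours: int


def load_env(env: Optional[Mapping[str, str]] = None) -> VestigeEnv:
    """Read the variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return VestigeEnv(
        dashboard_port=int(env.get("VESTIGE_DASHBOARD_PORT", "3927")),
        http_port=int(env.get("VESTIGE_HTTP_PORT", "3928")),
        http_bind=env.get("VESTIGE_HTTP_BIND", "127.0.0.1"),
        auth_token=env.get("VESTIGE_AUTH_TOKEN"),
        consolidation_interval_hours=int(
            env.get("VESTIGE_CONSOLIDATION_INTERVAL_HOURS", "6")
        ),
    )


if __name__ == "__main__":
    print(load_env({"VESTIGE_HTTP_PORT": "8080"}))
```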