mirror of
https://github.com/samvallad33/vestige.git
synced 2026-04-25 00:36:22 +02:00
New features:
- deep_reference tool (#22): 8-stage cognitive reasoning pipeline with FSRS-6 trust scoring, intent classification (FactCheck/Timeline/RootCause/Comparison/Synthesis), spreading activation expansion, temporal supersession, trust-weighted contradiction analysis, relation assessment, dream insight integration, and algorithmic reasoning chain generation — all without calling an LLM
- cross_reference (#23): backward-compatible alias for deep_reference
- retrieval_mode parameter on search (precise/balanced/exhaustive)
- get_batch action on memory tool (up to 20 IDs per call)
- Token budget raised from 10K to 100K on search + session_context
- Dates (createdAt/updatedAt) on all search results and session_context lines

Bug fixes (GitHub Issue #25 — all 10 resolved):
- state_transitions empty: wired record_memory_access into strengthen_batch
- chain/bridges no storage fallback: added with edge deduplication
- knowledge_edges dead schema: documented as deprecated
- insights not persisted from dream: wired save_insight after generation
- find_duplicates threshold dropped: serde alias fix
- search min_retention ignored: serde aliases for snake_case params
- intention time triggers null: removed dead trigger_at embedding
- changelog missing dreams: added get_dream_history + event integration
- phantom Related IDs: clarified message text
- fsrs_cards empty: documented as harmless dead schema

Security hardening:
- HTTP transport CORS: permissive() → localhost-only
- Auth token panic guard: &token[..8] → safe min(8) slice
- UTF-8 boundary fix: floor_char_boundary on content truncation
- All unwrap() removed from HTTP transport (unwrap_or_else fallback)
- Dream memory_count capped at 500 (prevents O(N²) hang)
- Dormant state threshold aligned (0.3 → 0.4)

Stats: 23 tools, 758 tests, 0 failures, 0 warnings, 0 unwraps in production

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
348 lines
17 KiB
Markdown
# Vestige v2.0.4 — Cognitive Memory & Reasoning System

Vestige is your long-term memory AND reasoning engine. 29 stateful cognitive modules implement real neuroscience: FSRS-6 spaced repetition, synaptic tagging, prediction error gating, hippocampal indexing, spreading activation, reconsolidation, and dual-strength memory theory. **Use it automatically. Use it aggressively.**

**NEW: `deep_reference` — call this for ALL factual questions.** It doesn't just retrieve; it REASONS across memories using FSRS-6 trust scoring, intent classification, and contradiction analysis, then generates a pre-built reasoning chain. Read the `reasoning` field FIRST.

---

## Session Start Protocol

Every conversation, before responding to the user:

```
session_context({
  queries: ["user preferences", "[current project] context"],
  context: { codebase: "[project]", topics: ["[current topics]"] },
  token_budget: 2000
})
```

Then check `automationTriggers` in the response:
- `needsDream` → call `dream` (consolidates memories, discovers hidden connections)
- `needsBackup` → call `backup`
- `needsGc` → call `gc({ dry_run: true })`, then review the dry-run output
- `totalMemories` > 700 → call `find_duplicates`

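The trigger checks above can be sketched as a small dispatcher. This is an illustrative Python sketch, not Vestige code: it assumes `automationTriggers` arrives as a dict of booleans and that `totalMemories` is available as an integer, and the helper name `plan_maintenance` is hypothetical.

```python
def plan_maintenance(triggers, total_memories):
    """Return the follow-up tool calls implied by a session_context response.

    `triggers` is assumed to be a dict of booleans keyed by trigger name.
    """
    calls = []
    if triggers.get("needsDream"):
        calls.append(("dream", {}))
    if triggers.get("needsBackup"):
        calls.append(("backup", {}))
    if triggers.get("needsGc"):
        # Start with a dry run so the candidates can be reviewed first.
        calls.append(("gc", {"dry_run": True}))
    if total_memories > 700:
        calls.append(("find_duplicates", {}))
    return calls
```

With only `needsGc` set and 120 memories, the plan is a single `gc` call with `dry_run` enabled, which keeps the review step ahead of any deletion.
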
Say "Remembering..." then retrieve context before answering.

> **Fallback:** If `session_context` is unavailable: `search` × 2 → `intention` check → `system_status` → `predict`.

---

## Complete Tool Reference (23 Tools)

### session_context — One-Call Initialization
```
session_context({
  queries: ["user preferences", "project context"],  // search queries
  context: { codebase: "project-name", topics: ["svelte", "rust"], file: "src/main.rs" },
  token_budget: 2000,           // 100-100000, controls response size
  include_status: true,         // system health
  include_intentions: true,     // triggered reminders
  include_predictions: true     // proactive memory predictions
})
```
Returns: markdown context + `automationTriggers` + `expandable` IDs for on-demand retrieval.

### smart_ingest — Save Anything
**Single mode** — auto-decides CREATE/UPDATE/SUPERSEDE via Prediction Error Gating:
```
smart_ingest({
  content: "What to remember",
  tags: ["tag1", "tag2"],
  node_type: "fact",            // fact|concept|event|person|place|note|pattern|decision
  source: "optional reference",
  forceCreate: false            // bypass dedup when needed
})
```
**Batch mode** — save up to 20 items in one call (session end, pre-compaction):
```
smart_ingest({
  items: [
    { content: "Item 1", tags: ["session-end"], node_type: "fact" },
    { content: "Item 2", tags: ["bug-fix"], node_type: "fact" }
  ]
})
```
Each item runs the full cognitive pipeline: importance scoring → intent detection → synaptic tagging → hippocampal indexing → PE gating → cross-project recording.

### search — 7-Stage Cognitive Search
```
search({
  query: "search query",
  limit: 10,                              // 1-100
  min_retention: 0.0,                     // filter by retention strength
  min_similarity: 0.5,                    // minimum cosine similarity
  detail_level: "summary",                // brief|summary|full
  context_topics: ["rust", "debugging"],  // boost topic-matching memories
  token_budget: 3000,                     // 100-100000, truncate to fit
  retrieval_mode: "balanced"              // precise|balanced|exhaustive (v2.1)
})
```
Retrieval modes: `precise` (fast, no activation/competition), `balanced` (default 7-stage pipeline), `exhaustive` (5x overfetch, deep graph traversal, no competition suppression).

Pipeline: Overfetch → Rerank (cross-encoder) → Temporal boost → Accessibility filter (FSRS-6) → Context match (Tulving 1973) → Competition (Anderson 1994) → Spreading activation. **Every search strengthens the memories it finds (Testing Effect).**

### memory — Read, Edit, Delete, Promote, Demote
```
memory({ action: "get", id: "uuid" })                                // full node with all FSRS state
memory({ action: "edit", id: "uuid", content: "updated text" })      // preserves FSRS state, regenerates embedding
memory({ action: "delete", id: "uuid" })
memory({ action: "promote", id: "uuid", reason: "was helpful" })     // +0.20 retrieval, +0.10 retention, 1.5x stability
memory({ action: "demote", id: "uuid", reason: "was wrong" })        // -0.30 retrieval, -0.15 retention, 0.5x stability
memory({ action: "state", id: "uuid" })                              // Active/Dormant/Silent/Unavailable + accessibility score
memory({ action: "get_batch", ids: ["uuid1", "uuid2", "uuid3"] })    // retrieve up to 20 full memories at once (v2.1)
```
Promote/demote does NOT delete — it adjusts ranking. Demoted memories rank lower; alternatives surface instead.
`get_batch` is designed for batch retrieval of expandable overflow IDs from search/session_context.

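A minimal sketch of the promote/demote arithmetic listed above, in Python for illustration. The `MemoryStrength` type and the clamping of scores to [0, 1] are assumptions; the source only states the deltas and the stability multipliers.

```python
from dataclasses import dataclass


@dataclass
class MemoryStrength:
    """Hypothetical container for the three values promote/demote adjust."""
    retrieval: float
    retention: float
    stability: float


def _clamp(value):
    # Assumption: retrieval and retention scores stay within [0, 1].
    return max(0.0, min(1.0, value))


def promote(m):
    """+0.20 retrieval, +0.10 retention, 1.5x stability."""
    return MemoryStrength(_clamp(m.retrieval + 0.20), _clamp(m.retention + 0.10), m.stability * 1.5)


def demote(m):
    """-0.30 retrieval, -0.15 retention, 0.5x stability."""
    return MemoryStrength(_clamp(m.retrieval - 0.30), _clamp(m.retention - 0.15), m.stability * 0.5)
```

Neither function removes anything: a demoted memory keeps its content and simply scores lower, which matches the ranking-only behavior described above.
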
### codebase — Code Patterns & Architectural Decisions
```
codebase({ action: "remember_pattern", name: "Pattern Name",
           description: "How it works and when to use it",
           files: ["src/file.rs"], codebase: "project-name" })

codebase({ action: "remember_decision", decision: "What was decided",
           rationale: "Why", alternatives: ["Option A", "Option B"],
           files: ["src/file.rs"], codebase: "project-name" })

codebase({ action: "get_context", codebase: "project-name", limit: 10 })
// Returns: patterns, decisions, cross-project insights
```

### intention — Prospective Memory (Reminders)
```
intention({ action: "set", description: "What to do",
            trigger: { type: "context", topic: "authentication" },  // fires when discussing auth
            priority: "high" })

intention({ action: "set", description: "Deploy by Friday",
            trigger: { type: "time", at: "2026-03-07T17:00:00Z" },
            deadline: "2026-03-07T17:00:00Z" })

intention({ action: "set", description: "Check test coverage",
            trigger: { type: "context", codebase: "vestige", file_pattern: "*.test.*" } })

intention({ action: "check", context: { codebase: "vestige", topics: ["testing"] } })
intention({ action: "update", id: "uuid", status: "complete" })
intention({ action: "list", filter_status: "active" })
```

### dream — Memory Consolidation
```
dream({ memory_count: 50 })
```
5-stage cycle: Replay → Cross-reference → Strengthen → Prune → Transfer. Uses Waking SWR tagging (70% tagged + 30% random for diversity). Discovers hidden connections, generates insights, persists new edges to the activation network.

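The 70/30 replay mix can be sketched as follows. This is a hedged illustration in Python, not the MemoryDreamer implementation: it assumes the random 30% is drawn from untagged memories, and the function name `pick_replay_set` is hypothetical. The cap of 500 mirrors the `memory_count` limit stated in the release notes for this version.

```python
import random

MAX_MEMORY_COUNT = 500  # cap stated in the release notes for this version


def pick_replay_set(tagged_ids, untagged_ids, memory_count, rng=None):
    """Select memories to replay: about 70% tagged, the rest random for diversity."""
    rng = rng or random.Random()
    memory_count = min(memory_count, MAX_MEMORY_COUNT)
    # Take up to 70% from the tagged pool, then fill the remainder at random.
    n_tagged = min(len(tagged_ids), round(memory_count * 0.7))
    n_random = min(len(untagged_ids), memory_count - n_tagged)
    return rng.sample(tagged_ids, n_tagged) + rng.sample(untagged_ids, n_random)
```

If the tagged pool is smaller than its 70% share, the sketch fills the gap from the untagged pool so the replay set still reaches `memory_count` where possible.
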
### explore_connections — Graph Traversal
```
explore_connections({ action: "associations", from: "uuid", limit: 10 })
// Spreading activation from a memory — find related memories via graph traversal

explore_connections({ action: "chain", from: "uuid-A", to: "uuid-B" })
// Build reasoning path between two memories (A*-like pathfinding)

explore_connections({ action: "bridges", from: "uuid-A", to: "uuid-B" })
// Find connecting memories that bridge two concepts
```

### predict — Proactive Retrieval
```
predict({ context: { codebase: "vestige", current_file: "src/main.rs",
                     current_topics: ["error handling", "rust"] } })
```
Returns: predictions with confidence, suggestions, speculative retrievals, top interests. Uses SpeculativeRetriever's learned patterns from access history.

### importance_score — Should I Save This?
```
importance_score({ content: "Content to evaluate",
                   context_topics: ["debugging"], project: "vestige" })
```
4-channel model: novelty (0.25), arousal (0.30), reward (0.25), attention (0.20). Composite > 0.6 = save it.

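The composite can be read as a weighted sum of the four channels. A minimal Python sketch under that assumption, with each channel score taken to lie in [0, 1]; the function names are hypothetical:

```python
# Channel weights as listed above; they sum to 1.0.
WEIGHTS = {"novelty": 0.25, "arousal": 0.30, "reward": 0.25, "attention": 0.20}
SAVE_THRESHOLD = 0.6


def composite_importance(channels):
    """Weighted sum of the four channel scores (each assumed to be in [0, 1])."""
    return sum(weight * channels[name] for name, weight in WEIGHTS.items())


def should_save(channels):
    """Apply the 'composite > 0.6 = save it' rule."""
    return composite_importance(channels) > SAVE_THRESHOLD
```

For example, novelty 0.8, arousal 0.9, reward 0.4, attention 0.5 gives 0.20 + 0.27 + 0.10 + 0.10 = 0.67, which clears the threshold.
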
### find_duplicates — Dedup Memory
```
find_duplicates({ similarity_threshold: 0.80, limit: 20, tags: ["bug-fix"] })
```
Cosine similarity clustering. Returns merge/review suggestions.

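To make "cosine similarity clustering" concrete, here is a small Python sketch. It uses a greedy single-pass scheme chosen for brevity, which is an assumption and not necessarily how Vestige groups duplicates; `cluster_duplicates` is a hypothetical name.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_duplicates(embeddings, threshold=0.80):
    """Greedy clustering: each memory joins the first cluster whose seed is similar enough.

    `embeddings` maps memory ID to vector. Only clusters with 2+ members are returned.
    """
    clusters = []  # each cluster is a list of IDs; the first ID is the seed
    for mem_id, vec in embeddings.items():
        for cluster in clusters:
            if cosine(embeddings[cluster[0]], vec) >= threshold:
                cluster.append(mem_id)
                break
        else:
            clusters.append([mem_id])
    return [c for c in clusters if len(c) > 1]
```

Raising `threshold` toward 1.0 returns only near-identical memories; lowering it widens each cluster and produces more review suggestions.
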
### memory_timeline — Chronological Browse
```
memory_timeline({ start: "2026-02-01", end: "2026-03-01",
                  node_type: "decision", tags: ["vestige"], limit: 50, detail_level: "summary" })
```

### memory_changelog — Audit Trail
```
memory_changelog({ memory_id: "uuid", limit: 20 })    // per-memory history
memory_changelog({ start: "2026-03-01", limit: 20 })  // system-wide
```

### memory_health — Retention Dashboard
```
memory_health()
```
Returns: avg retention, distribution buckets (0-20%, 20-40%, etc.), trend (improving/declining/stable), recommendation.

### memory_graph — Visualization Export
```
memory_graph({ query: "search term", depth: 2, max_nodes: 50 })
memory_graph({ center_id: "uuid", depth: 3, max_nodes: 100 })
```
Returns nodes with force-directed positions + edges with weights.

### deep_reference — Cognitive Reasoning Engine (v2.0.4) ★ USE THIS FOR ALL FACTUAL QUESTIONS
```
deep_reference({ query: "What port does the dev server use?" })
deep_reference({ query: "Should I use prefix caching with vLLM?", depth: 30 })
```
**THE killer tool.** 8-stage cognitive reasoning pipeline:
1. Broad retrieval + cross-encoder reranking
2. Spreading activation expansion (finds connected memories search misses)
3. FSRS-6 trust scoring (retention × stability × reps ÷ lapses)
4. Intent classification (FactCheck / Timeline / RootCause / Comparison / Synthesis)
5. Temporal supersession (newer high-trust replaces older)
6. Trust-weighted contradiction analysis (only flags conflicts between strong memories)
7. Relation assessment (Supports / Contradicts / Supersedes / Irrelevant per pair)
8. **Template reasoning chain** — pre-built natural language reasoning the AI validates

Parameters: `query` (required), `depth` (5-50, default 20).

Returns: `intent`, `reasoning` (THE KEY FIELD — read this first), `recommended` (highest-trust answer), `evidence` (trust-sorted), `contradictions`, `superseded`, `evolution`, `related_insights`, `confidence`.

`cross_reference` is a backward-compatible alias that calls `deep_reference`.

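Stage 3 above gives the trust score as retention × stability × reps ÷ lapses. A hedged Python sketch of that arithmetic follows. Dividing by `max(lapses, 1)` to avoid division by zero and the absence of any normalization are assumptions, and the dict keys are illustrative, so the real scoring may differ.

```python
def trust_score(retention, stability, reps, lapses):
    """retention x stability x reps / lapses, guarding against zero lapses (assumption)."""
    return retention * stability * reps / max(lapses, 1)


def rank_evidence(memories):
    """Sort memories from most to least trusted, as the `evidence` field is trust-sorted."""
    return sorted(
        memories,
        key=lambda m: trust_score(m["retention"], m["stability"], m["reps"], m["lapses"]),
        reverse=True,
    )
```

Under this reading, a memory that has been reviewed often and rarely forgotten outranks a newer or frequently lapsed one, which is what lets later stages prefer strong memories when resolving conflicts.
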
### Maintenance Tools
```
system_status()                                 // health + stats + warnings + recommendations
consolidate()                                   // FSRS-6 decay cycle + embedding generation
backup()                                        // SQLite backup → ~/.vestige/backups/
export({ format: "json", tags: ["bug-fix"], since: "2026-01-01" })
gc({ min_retention: 0.1, dry_run: true })       // garbage collect (dry_run first!)
restore({ path: "/path/to/backup.json" })
```

---

## Mandatory Save Gates

**You MUST NOT proceed past a save gate without executing the save.**

| Gate | Trigger | Action |
|------|---------|--------|
| **BUG_FIX** | After any error is resolved | `smart_ingest({ content: "BUG FIX: [error]\nRoot cause: [why]\nSolution: [fix]\nFiles: [paths]", tags: ["bug-fix", "project"], node_type: "fact" })` |
| **DECISION** | After any architectural/design choice | `codebase({ action: "remember_decision", decision, rationale, alternatives, files, codebase })` |
| **CODE_CHANGE** | After >20 lines or new pattern | `codebase({ action: "remember_pattern", name, description, files, codebase })` |
| **SESSION_END** | Before stopping or compaction | `smart_ingest({ items: [{ content: "SESSION: [summary]", tags: ["session-end"] }] })` |

---

## Trigger Words — Auto-Save

| User Says | Action |
|-----------|--------|
| "Remember this" / "Don't forget" | `smart_ingest` immediately |
| "I always..." / "I never..." / "I prefer..." | Save as preference |
| "This is important" | `smart_ingest` + `memory({ action: "promote" })` |
| "Remind me..." / "Next time..." | `intention({ action: "set" })` |

---

## Cognitive Architecture

### Search Pipeline (7 stages)
1. **Overfetch** — 3x results from hybrid search (0.3 BM25 + 0.7 semantic, nomic-embed-text-v1.5 768D)
2. **Rerank** — Cross-encoder rescoring (Jina Reranker v1 Turbo, 38M params)
3. **Temporal** — Recency + validity window boosting (85% relevance + 15% temporal)
4. **Accessibility** — FSRS-6 retention filter (Active ≥0.7, Dormant ≥0.4, Silent ≥0.1)
5. **Context** — Tulving 1973 encoding specificity (topic overlap → +30% boost)
6. **Competition** — Anderson 1994 retrieval-induced forgetting (winners strengthen, competitors weaken)
7. **Activation** — Spreading activation side effects + predictive model + reconsolidation marking

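The numeric weights in stages 1, 3, and 5 can be sketched as three small scoring functions. This Python sketch assumes all input scores are already normalized to [0, 1] and that the +30% context boost is applied multiplicatively on any topic overlap; both points are assumptions, and the function names are hypothetical.

```python
def hybrid_score(bm25, semantic):
    """Stage 1: 0.3 BM25 + 0.7 semantic similarity."""
    return 0.3 * bm25 + 0.7 * semantic


def temporal_blend(relevance, temporal):
    """Stage 3: 85% relevance + 15% temporal signal."""
    return 0.85 * relevance + 0.15 * temporal


def context_boost(score, memory_topics, context_topics):
    """Stage 5: +30% when the memory shares at least one topic with the current context."""
    return score * 1.30 if set(memory_topics) & set(context_topics) else score
```

Chaining them for one candidate: `context_boost(temporal_blend(hybrid_score(bm25, sem), recency), tags, context_topics)`. The reranking, accessibility, and competition stages are omitted here because the text gives no weights for them.
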
### Ingest Pipeline
- **Pre:** 4-channel importance scoring (novelty/arousal/reward/attention) + intent detection → auto-tag
- **Store:** Prediction Error Gating: similarity >0.92 → UPDATE, 0.75-0.92 → UPDATE/SUPERSEDE, <0.75 → CREATE
- **Post:** Synaptic tagging (Frey & Morris 1997, 9h backward + 2h forward) + hippocampal indexing + cross-project recording

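The Store thresholds map directly onto a three-way decision. A minimal Python sketch, assuming the similarity is a cosine score against the closest existing memory and that the 0.75 and 0.92 boundaries fall in the middle band; the boundary handling and the function name `gate` are assumptions:

```python
def gate(similarity):
    """Decide how to store new content given its similarity to the closest existing memory."""
    if similarity > 0.92:
        return "UPDATE"               # near-duplicate: refresh the existing memory
    if similarity >= 0.75:
        return "UPDATE_OR_SUPERSEDE"  # related: update in place or supersede it
    return "CREATE"                   # novel enough to stand on its own
```

This is why saving liberally is cheap: near-duplicates land in the first branch instead of creating new nodes, unless `forceCreate` is used to bypass dedup.
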
### FSRS-6 (State-of-the-Art Spaced Repetition)
- Retrievability: `R = (1 + factor × t / S)^(-w20)` — 21 trained parameters
- Dual-strength model (Bjork & Bjork 1992): storage strength (grows) + retrieval strength (decays)
- Accessibility = retention×0.5 + retrieval×0.3 + storage×0.2
- 20-30% more efficient than SM-2 (Anki)

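Both formulas above are short enough to write out. In the published FSRS-6 formulation, `factor` is defined as `0.9^(-1/w20) - 1` so that R equals 0.9 when `t = S`. The Python sketch below follows that definition; the default `w20 = 0.1542` is an illustrative value, not necessarily the parameter Vestige ships with.

```python
def retrievability(t_days, stability, w20=0.1542):
    """R = (1 + factor * t / S)^(-w20), with factor chosen so that R(S, S) = 0.9."""
    factor = 0.9 ** (-1.0 / w20) - 1.0
    return (1.0 + factor * t_days / stability) ** (-w20)


def accessibility(retention, retrieval, storage):
    """Accessibility = retention*0.5 + retrieval*0.3 + storage*0.2."""
    return 0.5 * retention + 0.3 * retrieval + 0.2 * storage
```

A memory with stability 10 days has retrievability 1.0 immediately after review and 0.9 after 10 days, then keeps decaying along a power-law curve.
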
### 29 Cognitive Modules (stateful, persist across calls)

**Neuroscience (16):**
ActivationNetwork (Collins & Loftus 1975), SynapticTaggingSystem (Frey & Morris 1997), HippocampalIndex (Teyler & Rudy 2007), ContextMatcher (Tulving 1973), AccessibilityCalculator, CompetitionManager (Anderson 1994), StateUpdateService, ImportanceSignals, NoveltySignal, ArousalSignal, RewardSignal, AttentionSignal, EmotionalMemory (Brown & Kulik 1977), PredictiveMemory, ProspectiveMemory, IntentionParser

**Advanced (11):**
ImportanceTracker, ReconsolidationManager (Nader — 5min labile window), IntentDetector (9 intent types), ActivityTracker, MemoryDreamer (5-stage consolidation), MemoryChainBuilder (A*-like), MemoryCompressor (30-day min age), CrossProjectLearner (6 pattern types), AdaptiveEmbedder, SpeculativeRetriever (6 trigger types), ConsolidationScheduler

**Search (2):** Reranker, TemporalSearcher

### Memory States
- **Active** (retention ≥ 0.7) — easily retrievable
- **Dormant** (≥ 0.4) — retrievable with effort
- **Silent** (≥ 0.1) — difficult, needs cues
- **Unavailable** (< 0.1) — needs reinforcement

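The four thresholds above translate into a simple classifier. A minimal Python sketch, assuming the input is the retention value in [0, 1]; the function name is hypothetical:

```python
def memory_state(retention):
    """Map a retention value to one of the four memory states."""
    if retention >= 0.7:
        return "Active"
    if retention >= 0.4:
        return "Dormant"
    if retention >= 0.1:
        return "Silent"
    return "Unavailable"
```

The same labels are what `memory({ action: "state", id: "uuid" })` reports alongside the accessibility score.
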
### Connection Types
semantic, temporal, causal, spatial, part_of, user_defined — each with strength (0-1), activation_count, timestamps

---

## Advanced Techniques

### Cross-Project Intelligence
The CrossProjectLearner tracks patterns across ALL projects (ErrorHandling, AsyncConcurrency, Testing, Architecture, Performance, Security). When you learn a pattern in one project that works, it becomes available in all projects. Use `codebase({ action: "get_context" })` without a codebase filter to get universal patterns.

### Reconsolidation Window
After any memory is accessed (via search, get, or promote), it enters a 5-minute "labile" state where modifications are enhanced. This is the optimal time to edit memories with new context. The system handles this automatically.

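The labile window is a plain time comparison. A small Python sketch, assuming the last access time is tracked as a timezone-aware timestamp; the names are hypothetical and the real ReconsolidationManager may track more state:

```python
from datetime import datetime, timedelta, timezone

LABILE_WINDOW = timedelta(minutes=5)


def is_labile(last_accessed, now):
    """True while a memory is inside the 5-minute window after its last access."""
    elapsed = now - last_accessed
    return timedelta(0) <= elapsed <= LABILE_WINDOW
```

An `edit` issued while `is_labile(...)` holds falls inside the window described above; an edit after it still works but no longer benefits from the labile state.
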
### Synaptic Tagging (Retroactive Importance)
Memories encoded in the last 9 hours can be retroactively promoted when something important happens. If you fix a critical bug, not only does the fix get saved — related memories from the past 9 hours also get importance boosts. The SynapticTaggingSystem handles this automatically.

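The capture window (9 hours backward, plus the 2 hours forward noted in the ingest pipeline) can be expressed as an interval check. A hedged Python sketch; the function name is hypothetical, and how the boost is sized or how "related" is decided is not modeled here:

```python
from datetime import datetime, timedelta, timezone

BACKWARD = timedelta(hours=9)
FORWARD = timedelta(hours=2)


def in_capture_window(encoded_at, event_at):
    """True if a memory was encoded within 9h before or 2h after an important event."""
    return event_at - BACKWARD <= encoded_at <= event_at + FORWARD
```

For a bug fix saved at 17:00, memories encoded from 08:00 that day through 19:00 are candidates for a boost.
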
### Dream Insights
Dreams don't just consolidate: they generate new insights by cross-referencing recent memories with older knowledge. Insights can reveal contradictions between memories, previously unseen patterns, and connections across different projects. Always check dream results for `insights_generated`.

### Token Budget Strategy
Use `token_budget` on `search` and `session_context` to control response size. For quick lookups: 500. For deep context: 3000-5000. Results that don't fit go to `expandable`. Retrieve them one at a time with `memory({ action: "get", id: "..." })` or up to 20 per call with `memory({ action: "get_batch", ids: [...] })`.

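How results spill into `expandable` can be sketched as a budgeted cut over ranked results. This Python illustration assumes each result carries a precomputed token count and that everything after the first overflow is deferred; the actual truncation logic may differ, and both helper names are hypothetical.

```python
def fit_to_budget(results, token_budget):
    """Split ranked results into IDs returned inline and overflow (expandable) IDs."""
    included, expandable, used = [], [], 0
    for result in results:  # results are assumed to be sorted best-first
        if not expandable and used + result["tokens"] <= token_budget:
            included.append(result["id"])
            used += result["tokens"]
        else:
            # Once one result overflows, later ones are deferred too, preserving rank order.
            expandable.append(result["id"])
    return included, expandable


def batches(ids, size=20):
    """Chunk overflow IDs for get_batch, which accepts up to 20 IDs per call."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]
```

With 45 overflow IDs, `batches` yields three `get_batch` calls of 20, 20, and 5 IDs.
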
### Detail Levels
- `brief` — id/type/tags/score only (1-2 tokens per result, good for scanning)
- `summary` — 8 fields including content preview (default, balanced)
- `full` — all FSRS state, timestamps, embedding info (for debugging/analysis)

---

## Memory Hygiene

- **Promote** when the user confirms a memory was helpful, the solution worked, or the information was accurate.
- **Demote** when the user corrects a mistake, the information was wrong, or it led to a bad outcome.
- **Never save:** secrets, API keys, passwords, temporary debugging state, trivial info.

---

## The One Rule

**When in doubt, save. The cost of a duplicate is near zero (Prediction Error Gating handles dedup). The cost of lost knowledge is permanent.**

Memory is retrieval. Searching strengthens memory. Search liberally, save aggressively.

---

## Development

- **Crate:** `vestige-mcp` v2.0.4, Rust 2024 edition, MSRV 1.91
- **Tests:** 758 (406 mcp + 352 core), zero warnings
- **Build:** `cargo build --release -p vestige-mcp` (features: `embeddings` + `vector-search`)
- **Build (no embeddings):** `cargo build --release -p vestige-mcp --no-default-features`
- **Bench:** `cargo bench -p vestige-core`
- **Architecture:** `McpServer` → `Arc<Storage>` + `Arc<Mutex<CognitiveEngine>>`
- **Storage:** SQLite WAL mode, `Mutex<Connection>` reader/writer split, FTS5 full-text search
- **Embeddings:** nomic-embed-text-v1.5 (768D, 8K context) via fastembed (local ONNX, no API)
- **Vector index:** USearch HNSW (20x faster than FAISS)
- **Binaries:** `vestige-mcp` (MCP server), `vestige` (CLI), `vestige-restore`
- **Dashboard:** SvelteKit 2 + Svelte 5 + Three.js + Tailwind 4, embedded at `/dashboard`
- **Env vars:** `VESTIGE_DASHBOARD_PORT` (default 3927), `VESTIGE_HTTP_PORT` (default 3928), `VESTIGE_HTTP_BIND` (default 127.0.0.1), `VESTIGE_AUTH_TOKEN` (auto-generated), `VESTIGE_CONSOLIDATION_INTERVAL_HOURS` (default 6), `RUST_LOG`
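For scripts that need to locate a running server, the environment variables and defaults above can be resolved as follows. This is a client-side Python sketch with a hypothetical helper name; it is not the server's own configuration code, which lives in the Rust crate.

```python
import os


def load_vestige_env(env=None):
    """Resolve Vestige settings from the environment, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "dashboard_port": int(env.get("VESTIGE_DASHBOARD_PORT", "3927")),
        "http_port": int(env.get("VESTIGE_HTTP_PORT", "3928")),
        "http_bind": env.get("VESTIGE_HTTP_BIND", "127.0.0.1"),
        # None means no token was supplied; the server auto-generates one in that case.
        "auth_token": env.get("VESTIGE_AUTH_TOKEN"),
        "consolidation_interval_hours": int(env.get("VESTIGE_CONSOLIDATION_INTERVAL_HOURS", "6")),
    }
```

Passing a plain dict instead of relying on `os.environ` makes the helper easy to exercise in tests.
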