diff --git a/CLAUDE.md b/CLAUDE.md
index a6acaf8..f7d4901 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -1,6 +1,6 @@
-# Vestige v2.0.0 — Cognitive Memory System
+# Vestige v2.0.1 — Cognitive Memory System
 
-Vestige is your long-term memory. It implements real neuroscience: FSRS-6 spaced repetition, synaptic tagging, prediction error gating, hippocampal indexing, spreading activation, and 29 stateful cognitive modules. **Use it automatically.**
+Vestige is your long-term memory. 29 stateful cognitive modules implement real neuroscience: FSRS-6 spaced repetition, synaptic tagging, prediction error gating, hippocampal indexing, spreading activation, reconsolidation, and dual-strength memory theory. **Use it automatically. Use it aggressively.**
 
 ---
 
@@ -9,135 +9,204 @@ Vestige is your long-term memory. It implements real neuroscience: FSRS-6 spaced
 Every conversation, before responding to the user:
 
 ```
-1. session_context({                  → ONE CALL replaces steps 1-5
-     queries: ["user preferences", "[project] context"],
-     context: { codebase: "[project]", topics: ["[current topics]"] },
-     token_budget: 1000
-   })
-2. Check automationTriggers from response:
-   - needsDream == true → call dream
-   - needsBackup == true → call backup
-   - needsGc == true → call gc(dry_run: true)
-   - totalMemories > 700 → call find_duplicates
+session_context({
+  queries: ["user preferences", "[current project] context"],
+  context: { codebase: "[project]", topics: ["[current topics]"] },
+  token_budget: 2000
+})
 ```
 
+Then check `automationTriggers` from response:
+- `needsDream` → call `dream` (consolidates memories, discovers hidden connections)
+- `needsBackup` → call `backup`
+- `needsGc` → call `gc(dry_run: true)` then review
+- totalMemories > 700 → call `find_duplicates`
+
 Say "Remembering..." then retrieve context before answering.
 
-> **Fallback:** If `session_context` is unavailable, use the 5-call sequence: `search` × 2 → `intention` check → `system_status` → `predict`.
+> **Fallback:** If `session_context` unavailable: `search` × 2 → `intention` check → `system_status` → `predict`.
 
 ---
 
-## The 21 Tools
+## Complete Tool Reference (21 Tools)
 
-### Context Packets (1 tool)
-| Tool | When to Use |
-|------|-------------|
-| `session_context` | **One-call session initialization.** Replaces 5 separate calls (search × 2, intention check, system_status, predict) with a single token-budgeted response. Returns markdown context + `automationTriggers` (needsDream/needsBackup/needsGc) + `expandable` IDs for on-demand full retrieval. Params: `queries` (string[]), `token_budget` (100-10000, default 1000), `context` ({codebase, topics, file}), `include_status/include_intentions/include_predictions` (bool). |
+### session_context — One-Call Initialization
+```
+session_context({
+  queries: ["user preferences", "project context"],   // search queries
+  context: { codebase: "project-name", topics: ["svelte", "rust"], file: "src/main.rs" },
+  token_budget: 2000,          // 100-10000, controls response size
+  include_status: true,        // system health
+  include_intentions: true,    // triggered reminders
+  include_predictions: true    // proactive memory predictions
+})
+```
+Returns: markdown context + `automationTriggers` + `expandable` IDs for on-demand retrieval.
 
-### Core Memory (1 tool)
-| Tool | When to Use |
-|------|-------------|
-| `smart_ingest` | **Default for all saves.** Single mode: provide `content` for auto-decide CREATE/UPDATE/SUPERSEDE via Prediction Error Gating. Batch mode: provide `items` array (max 20) for session-end saves — each item runs full cognitive pipeline (importance scoring, intent detection, synaptic tagging, hippocampal indexing). |
+### smart_ingest — Save Anything
+**Single mode** — auto-decides CREATE/UPDATE/SUPERSEDE via Prediction Error Gating:
+```
+smart_ingest({
+  content: "What to remember",
+  tags: ["tag1", "tag2"],
+  node_type: "fact",           // fact|concept|event|person|place|note|pattern|decision
+  source: "optional reference",
+  forceCreate: false           // bypass dedup when needed
+})
+```
+**Batch mode** — save up to 20 items in one call (session end, pre-compaction):
+```
+smart_ingest({
+  items: [
+    { content: "Item 1", tags: ["session-end"], node_type: "fact" },
+    { content: "Item 2", tags: ["bug-fix"], node_type: "fact" }
+  ]
+})
+```
+Each item runs the full cognitive pipeline: importance scoring → intent detection → synaptic tagging → hippocampal indexing → PE gating → cross-project recording.
 
-### Unified Tools (4 tools)
-| Tool | Actions | When to Use |
-|------|---------|-------------|
-| `search` | query + filters | **Every time you need to recall anything.** Hybrid search (BM25 + semantic + convex combination fusion). 7-stage pipeline: overfetch → rerank → temporal boost → accessibility filter → context match → competition → spreading activation. Searching strengthens memory (Testing Effect). **v1.8.0:** optional `token_budget` param (100-10000) limits response size; results exceeding budget moved to `expandable` array. |
-| `memory` | get, delete, state, promote, demote | Retrieve a full memory by ID, delete a memory, check its cognitive state (Active/Dormant/Silent/Unavailable), promote (thumbs up — increases retrieval strength), or demote (thumbs down — decreases retrieval strength, does NOT delete). |
-| `codebase` | remember_pattern, remember_decision, get_context | Store and recall code patterns, architectural decisions, and project context. The killer differentiator. |
-| `intention` | set, check, update, list | Prospective memory — "remember to do X when Y happens". Supports time, context, and event triggers. |
+### search — 7-Stage Cognitive Search
+```
+search({
+  query: "search query",
+  limit: 10,                   // 1-100
+  min_retention: 0.0,          // filter by retention strength
+  min_similarity: 0.5,         // minimum cosine similarity
+  detail_level: "summary",     // brief|summary|full
+  context_topics: ["rust", "debugging"],   // boost topic-matching memories
+  token_budget: 3000           // 100-10000, truncate to fit
+})
+```
+Pipeline: Overfetch (3x, BM25+semantic) → Rerank (cross-encoder) → Temporal boost → Accessibility filter (FSRS-6) → Context match (Tulving 1973) → Competition (Anderson 1994) → Spreading activation. **Every search strengthens the memories it finds (Testing Effect).**
 
-### Temporal (2 tools)
-| Tool | When to Use |
-|------|-------------|
-| `memory_timeline` | Browse memories chronologically. Grouped by day. Filter by type, tags, date range. When user references a time period ("last week", "yesterday"). |
-| `memory_changelog` | Audit trail. Per-memory: state transitions. System-wide: consolidations + recent changes. When debugging memory issues. |
+### memory — Read, Edit, Delete, Promote, Demote
+```
+memory({ action: "get", id: "uuid" })        // full node with all FSRS state
+memory({ action: "edit", id: "uuid", content: "updated text" })   // preserves FSRS state, regenerates embedding
+memory({ action: "delete", id: "uuid" })
+memory({ action: "promote", id: "uuid", reason: "was helpful" })  // +0.20 retrieval, +0.10 retention, 1.5x stability
+memory({ action: "demote", id: "uuid", reason: "was wrong" })     // -0.30 retrieval, -0.15 retention, 0.5x stability
+memory({ action: "state", id: "uuid" })      // Active/Dormant/Silent/Unavailable + accessibility score
+```
+Promote/demote does NOT delete — it adjusts ranking. Demoted memories rank lower; alternatives surface instead.
 
-### Cognitive (3 tools)
-| Tool | When to Use |
-|------|-------------|
-| `dream` | Trigger memory consolidation — replays recent memories to discover hidden connections and synthesize insights. At session start if >24h since last dream, after every 50 saves. |
-| `explore_connections` | Graph exploration. Actions: `chain` (reasoning path A→B), `associations` (spreading activation from a node), `bridges` (connecting memories between two nodes). When search returns 3+ related results. |
-| `predict` | Proactive retrieval — predicts what memories you'll need next based on context, activity patterns, and learned behavior. At session start, when switching projects. |
+### codebase — Code Patterns & Architectural Decisions
+```
+codebase({ action: "remember_pattern", name: "Pattern Name",
+           description: "How it works and when to use it",
+           files: ["src/file.rs"], codebase: "project-name" })
 
-### Auto-Save & Dedup (2 tools)
-| Tool | When to Use |
-|------|-------------|
-| `importance_score` | Score content importance before deciding whether to save. 4-channel model: novelty, arousal, reward, attention. Composite > 0.6 = worth saving. |
-| `find_duplicates` | Find near-duplicate memory clusters via cosine similarity. Returns merge/review suggestions. Run when memory count > 700 or on user request. |
+codebase({ action: "remember_decision", decision: "What was decided",
+           rationale: "Why", alternatives: ["Option A", "Option B"],
+           files: ["src/file.rs"], codebase: "project-name" })
 
-### Autonomic (2 tools)
-| Tool | When to Use |
-|------|-------------|
-| `memory_health` | Retention dashboard — avg retention, distribution buckets, trend (improving/declining/stable), recommendation. Lightweight alternative to system_status focused on memory quality. |
-| `memory_graph` | Subgraph export for visualization. Input: center_id or query, depth (1-3), max_nodes. Returns nodes with force-directed layout positions and edges with weights. |
+codebase({ action: "get_context", codebase: "project-name", limit: 10 })
+// Returns: patterns, decisions, cross-project insights
+```
 
-### Maintenance (5 tools)
-| Tool | When to Use |
-|------|-------------|
-| `system_status` | **Combined health + stats.** Returns status (healthy/degraded/critical/empty), full statistics, FSRS preview, cognitive module health, state distribution, warnings, and recommendations. At session start (or use `session_context` which includes this). |
-| `consolidate` | Run FSRS-6 consolidation cycle. Applies decay, generates embeddings, maintenance. At session end, when retention drops. |
-| `backup` | Create SQLite database backup. Before major upgrades, weekly. |
-| `export` | Export memories as JSON/JSONL with tag and date filters. |
-| `gc` | Garbage collect low-retention memories. When system_status shows degraded + high count. Defaults to dry_run=true. |
+### intention — Prospective Memory (Reminders)
+```
+intention({ action: "set", description: "What to do",
+            trigger: { type: "context", topic: "authentication" },   // fires when discussing auth
+            priority: "high" })
 
-### Restore (1 tool)
-| Tool | When to Use |
-|------|-------------|
-| `restore` | Restore memories from a JSON backup file. Supports MCP wrapper, RecallResult, and direct array formats. |
 
-### Deprecated (still work via redirects)
-| Old Tool | Redirects To |
-|----------|-------------|
-| `ingest` | `smart_ingest` |
-| `session_checkpoint` | `smart_ingest` (batch mode) |
-| `promote_memory` | `memory(action="promote")` |
-| `demote_memory` | `memory(action="demote")` |
-| `health_check` | `system_status` |
-| `stats` | `system_status` |
+intention({ action: "set", description: "Deploy by Friday",
+            trigger: { type: "time", at: "2026-03-07T17:00:00Z" },
+            deadline: "2026-03-07T17:00:00Z" })
+
+intention({ action: "set", description: "Check test coverage",
+            trigger: { type: "context", codebase: "vestige", file_pattern: "*.test.*" } })
+
+intention({ action: "check", context: { codebase: "vestige", topics: ["testing"] } })
+intention({ action: "update", id: "uuid", status: "complete" })
+intention({ action: "list", filter_status: "active" })
+```
+
+### dream — Memory Consolidation
+```
+dream({ memory_count: 50 })
+```
+5-stage cycle: Replay → Cross-reference → Strengthen → Prune → Transfer. Uses Waking SWR tagging (70% tagged + 30% random for diversity). Discovers hidden connections, generates insights, persists new edges to the activation network.
+
+### explore_connections — Graph Traversal
+```
+explore_connections({ action: "associations", from: "uuid", limit: 10 })
+// Spreading activation from a memory — find related memories via graph traversal
+
+explore_connections({ action: "chain", from: "uuid-A", to: "uuid-B" })
+// Build reasoning path between two memories (A*-like pathfinding)
+
+explore_connections({ action: "bridges", from: "uuid-A", to: "uuid-B" })
+// Find connecting memories that bridge two concepts
+```
+
+### predict — Proactive Retrieval
+```
+predict({ context: { codebase: "vestige", current_file: "src/main.rs",
+                     current_topics: ["error handling", "rust"] } })
+```
+Returns: predictions with confidence, suggestions, speculative retrievals, top interests. Uses SpeculativeRetriever's learned patterns from access history.
+
+### importance_score — Should I Save This?
+```
+importance_score({ content: "Content to evaluate",
+                   context_topics: ["debugging"], project: "vestige" })
+```
+4-channel model: novelty (0.25), arousal (0.30), reward (0.25), attention (0.20). Composite > 0.6 = save it.
+
+### find_duplicates — Dedup Memory
+```
+find_duplicates({ similarity_threshold: 0.80, limit: 20, tags: ["bug-fix"] })
+```
+Cosine similarity clustering. Returns merge/review suggestions.
+
+### memory_timeline — Chronological Browse
+```
+memory_timeline({ start: "2026-02-01", end: "2026-03-01",
+                  node_type: "decision", tags: ["vestige"], limit: 50, detail_level: "summary" })
+```
+
+### memory_changelog — Audit Trail
+```
+memory_changelog({ memory_id: "uuid", limit: 20 })   // per-memory history
+memory_changelog({ start: "2026-03-01", limit: 20 }) // system-wide
+```
+
+### memory_health — Retention Dashboard
+```
+memory_health()
+```
+Returns: avg retention, distribution buckets (0-20%, 20-40%, etc.), trend (improving/declining/stable), recommendation.
+
+### memory_graph — Visualization Export
+```
+memory_graph({ query: "search term", depth: 2, max_nodes: 50 })
+memory_graph({ center_id: "uuid", depth: 3, max_nodes: 100 })
+```
+Returns nodes with force-directed positions + edges with weights.
+
+### Maintenance Tools
+```
+system_status()   // health + stats + warnings + recommendations
+consolidate()     // FSRS-6 decay cycle + embedding generation
+backup()          // SQLite backup → ~/.vestige/backups/
+export({ format: "json", tags: ["bug-fix"], since: "2026-01-01" })
+gc({ min_retention: 0.1, dry_run: true })   // garbage collect (dry_run first!)
+restore({ path: "/path/to/backup.json" })
+```
 
 ---
 
 ## Mandatory Save Gates
 
-**RULE: You MUST NOT proceed past a save gate without executing the save.**
+**You MUST NOT proceed past a save gate without executing the save.**
 
-### BUG_FIX — After any error is resolved
-Your next tool call after confirming a fix MUST be `smart_ingest`:
-```
-smart_ingest({
-  content: "BUG FIX: [exact error]\nRoot cause: [why]\nSolution: [what fixed it]\nFiles: [paths]",
-  tags: ["bug-fix", "[project]"], node_type: "fact"
-})
-```
-
-### DECISION — After any architectural or design choice
-```
-codebase({
-  action: "remember_decision",
-  decision: "[what]", rationale: "[why]",
-  alternatives: ["[A]", "[B]"], files: ["[affected]"], codebase: "[project]"
-})
-```
-
-### CODE_CHANGE — After writing significant code (>20 lines or new pattern)
-```
-codebase({
-  action: "remember_pattern",
-  name: "[pattern]", description: "[how/when to use]",
-  files: ["[files]"], codebase: "[project]"
-})
-```
-
-### SESSION_END — Before stopping or compaction
-```
-smart_ingest({
-  items: [
-    { content: "SESSION: [work done]\nFixes: [list]\nDecisions: [list]", tags: ["session-end", "[project]"] },
-    // ... any unsaved fixes, decisions, patterns
-  ]
-})
-```
+| Gate | Trigger | Action |
+|------|---------|--------|
+| **BUG_FIX** | After any error is resolved | `smart_ingest({ content: "BUG FIX: [error]\nRoot cause: [why]\nSolution: [fix]\nFiles: [paths]", tags: ["bug-fix", "project"], node_type: "fact" })` |
+| **DECISION** | After any architectural/design choice | `codebase({ action: "remember_decision", decision, rationale, alternatives, files, codebase })` |
+| **CODE_CHANGE** | After >20 lines or new pattern | `codebase({ action: "remember_pattern", name, description, files, codebase })` |
+| **SESSION_END** | Before stopping or compaction | `smart_ingest({ items: [{ content: "SESSION: [summary]", tags: ["session-end"] }] })` |
 
 ---
 
@@ -148,60 +217,82 @@ smart_ingest({
 | "Remember this" / "Don't forget" | `smart_ingest` immediately |
 | "I always..." / "I never..." / "I prefer..." | Save as preference |
 | "This is important" | `smart_ingest` + `memory(action="promote")` |
-| "Remind me..." / "Next time..." | `intention` → set |
+| "Remind me..." / "Next time..." | `intention({ action: "set" })` |
 
 ---
 
-## Under the Hood — Cognitive Pipelines
+## Cognitive Architecture
 
 ### Search Pipeline (7 stages)
-1. **Overfetch** — Pull 3x results from hybrid search (BM25 + semantic)
-2. **Reranker** — Re-score by relevance quality (cross-encoder)
-3. **Temporal boost** — Recent memories get recency bonus
-4. **Accessibility filter** — FSRS-6 retention threshold (Ebbinghaus curve)
-5. **Context match** — Tulving 1973 encoding specificity (match current context to encoding context)
+1. **Overfetch** — 3x results from hybrid search (0.3 BM25 + 0.7 semantic, nomic-embed-text-v1.5 768D)
+2. **Rerank** — Cross-encoder rescoring (Jina Reranker v1 Turbo, 38M params)
+3. **Temporal** — Recency + validity window boosting (85% relevance + 15% temporal)
+4. **Accessibility** — FSRS-6 retention filter (Active ≥0.7, Dormant ≥0.4, Silent ≥0.1)
+5. **Context** — Tulving 1973 encoding specificity (topic overlap → +30% boost)
 6. **Competition** — Anderson 1994 retrieval-induced forgetting (winners strengthen, competitors weaken)
-7. **Spreading activation** — Side effects: activate related memories, update predictive model, record reconsolidation opportunity
+7. **Activation** — Spreading activation side effects + predictive model + reconsolidation marking
 
-### Ingest Pipeline (cognitive pre/post)
-**Pre-ingest:** 4-channel importance scoring (novelty/arousal/reward/attention) + intent detection → auto-tag
-**Storage:** Prediction Error Gating decides create/update/reinforce/supersede
-**Post-ingest:** Synaptic tagging (Frey & Morris 1997) + novelty model update + hippocampal indexing + cross-project recording
+### Ingest Pipeline
+**Pre:** 4-channel importance scoring (novelty/arousal/reward/attention) + intent detection → auto-tag
+**Store:** Prediction Error Gating: similarity >0.92 → UPDATE, 0.75-0.92 → UPDATE/SUPERSEDE, <0.75 → CREATE
+**Post:** Synaptic tagging (Frey & Morris 1997, 9h backward + 2h forward) + hippocampal indexing + cross-project recording
 
-### Feedback Pipeline (via memory promote/demote)
-**Promote:** Reward signal + importance boost + reconsolidation (memory becomes modifiable for 24-48h) + activation spread
-**Demote:** Competition suppression + retrieval strength decrease (does NOT delete — alternatives surface instead)
+### FSRS-6 (State-of-the-Art Spaced Repetition)
+- Retrievability: `R = (1 + factor × t / S)^(-w20)` — 21 trained parameters
+- Dual-strength model (Bjork & Bjork 1992): storage strength (grows) + retrieval strength (decays)
+- Accessibility = retention×0.5 + retrieval×0.3 + storage×0.2
+- 20-30% more efficient than SM-2 (Anki)
+
+### 29 Cognitive Modules (stateful, persist across calls)
+
+**Neuroscience (16):**
+ActivationNetwork (Collins & Loftus 1975), SynapticTaggingSystem (Frey & Morris 1997), HippocampalIndex (Teyler & Rudy 2007), ContextMatcher (Tulving 1973), AccessibilityCalculator, CompetitionManager (Anderson 1994), StateUpdateService, ImportanceSignals, NoveltySignal, ArousalSignal, RewardSignal, AttentionSignal, EmotionalMemory (Brown & Kulik 1977), PredictiveMemory, ProspectiveMemory, IntentionParser
+
+**Advanced (11):**
+ImportanceTracker, ReconsolidationManager (Nader — 5min labile window), IntentDetector (9 intent types), ActivityTracker, MemoryDreamer (5-stage consolidation), MemoryChainBuilder (A*-like), MemoryCompressor (30-day min age), CrossProjectLearner (6 pattern types), AdaptiveEmbedder, SpeculativeRetriever (6 trigger types), ConsolidationScheduler
+
+**Search (2):** Reranker, TemporalSearcher
+
+### Memory States
+- **Active** (retention ≥ 0.7) — easily retrievable
+- **Dormant** (≥ 0.4) — retrievable with effort
+- **Silent** (≥ 0.1) — difficult, needs cues
+- **Unavailable** (< 0.1) — needs reinforcement
+
+### Connection Types
+semantic, temporal, causal, spatial, part_of, user_defined — each with strength (0-1), activation_count, timestamps
 
 ---
 
-## CognitiveEngine — 29 Modules
+## Advanced Techniques
 
-All modules persist across tool calls as stateful instances:
+### Cross-Project Intelligence
+The CrossProjectLearner tracks patterns across ALL projects (ErrorHandling, AsyncConcurrency, Testing, Architecture, Performance, Security). When you learn a pattern in one project that works, it becomes available in all projects. Use `codebase({ action: "get_context" })` without a codebase filter to get universal patterns.
 
-**Neuroscience (16):** ActivationNetwork, SynapticTaggingSystem, HippocampalIndex, ContextMatcher, AccessibilityCalculator, CompetitionManager, StateUpdateService, ImportanceSignals, NoveltySignal, ArousalSignal, RewardSignal, AttentionSignal, EmotionalMemory, PredictiveMemory, ProspectiveMemory, IntentionParser
+### Reconsolidation Window
+After any memory is accessed (via search, get, or promote), it enters a 5-minute "labile" state where modifications are enhanced. This is the optimal time to edit memories with new context. The system handles this automatically.
 
-**Advanced (11):** ImportanceTracker, ReconsolidationManager, IntentDetector, ActivityTracker, MemoryDreamer, MemoryChainBuilder, MemoryCompressor, CrossProjectLearner, AdaptiveEmbedder, SpeculativeRetriever, ConsolidationScheduler
+### Synaptic Tagging (Retroactive Importance)
+Memories encoded in the last 9 hours can be retroactively promoted when something important happens. If you fix a critical bug, not only does the fix get saved — related memories from the past 9 hours also get importance boosts. The SynapticTaggingSystem handles this automatically.
 
-**Search (2):** Reranker, TemporalSearcher
+### Dream Insights
+Dreams don't just consolidate — they generate new insights by cross-referencing recent memories with older knowledge. The insights can reveal: contradictions between memories, previously unseen patterns, connections across different projects. Always check dream results for `insights_generated`.
+
+### Token Budget Strategy
+Use `token_budget` on search and session_context to control response size. For quick lookups: 500. For deep context: 3000-5000. Results that don't fit go to `expandable` — retrieve them with `memory({ action: "get", id: "..." })`.
+
+### Detail Levels
+- `brief` — id/type/tags/score only (1-2 tokens per result, good for scanning)
+- `summary` — 8 fields including content preview (default, balanced)
+- `full` — all FSRS state, timestamps, embedding info (for debugging/analysis)
 
 ---
 
 ## Memory Hygiene
 
-### Promote when:
-- User confirms memory was helpful → `memory(action="promote")`
-- Solution worked correctly
-- Information was accurate
-
-### Demote when:
-- User corrects a mistake → `memory(action="demote")`
-- Information was wrong
-- Memory led to bad outcome
-
-### Never save:
-- Secrets, API keys, passwords
-- Temporary debugging state
-- Obvious/trivial information
+**Promote** when user confirms helpful, solution worked, info was accurate.
+**Demote** when user corrects mistake, info was wrong, led to bad outcome.
+**Never save:** secrets, API keys, passwords, temporary debugging state, trivial info.
 
 ---
 
@@ -216,11 +307,14 @@ Memory is retrieval. Searching strengthens memory. Search liberally, save aggres
 ## Development
 
 - **Crate:** `vestige-mcp` v2.0.1, Rust 2024 edition, MSRV 1.91
-- **Tests:** 1,238 tests, zero warnings
-- **Build:** `cargo build --release -p vestige-mcp`
-- **Features:** `embeddings` + `vector-search` (default on)
-- **Architecture:** `McpServer` holds `Arc<Storage>` + `Arc<Mutex<CognitiveEngine>>`
-- **Storage:** Interior mutability — `Storage` uses `Mutex` for reader/writer split, all methods take `&self`. WAL mode for concurrent reads + writes.
-- **Entry:** `src/main.rs` → stdio JSON-RPC server
-- **Tools:** `src/tools/` — one file per tool, each exports `schema()` + `execute()`
-- **Cognitive:** `src/cognitive.rs` — 29-field struct, initialized once at startup
+- **Tests:** 1,238 (352 unit + 192 E2E + cognitive + journey + extreme), zero warnings
+- **Build:** `cargo build --release -p vestige-mcp` (features: `embeddings` + `vector-search`)
+- **Build (no embeddings):** `cargo build --release -p vestige-mcp --no-default-features`
+- **Bench:** `cargo bench -p vestige-core`
+- **Architecture:** `McpServer` → `Arc<Storage>` + `Arc<Mutex<CognitiveEngine>>`
+- **Storage:** SQLite WAL mode, `Mutex` reader/writer split, FTS5 full-text search
+- **Embeddings:** nomic-embed-text-v1.5 (768D, 8K context) via fastembed (local ONNX, no API)
+- **Vector index:** USearch HNSW (20x faster than FAISS)
+- **Binaries:** `vestige-mcp` (MCP server), `vestige` (CLI), `vestige-restore`
+- **Dashboard:** SvelteKit 2 + Svelte 5 + Three.js + Tailwind 4, embedded at `/dashboard`
+- **Env vars:** `VESTIGE_DASHBOARD_PORT` (default 3927), `VESTIGE_CONSOLIDATION_INTERVAL_HOURS` (default 6), `RUST_LOG`
diff --git a/Cargo.lock b/Cargo.lock
index dc7f5eb..1879e26 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -4494,7 +4494,7 @@ checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a"
 
 [[package]]
 name = "vestige-core"
-version = "2.0.1"
+version = "2.0.2"
 dependencies = [
  "chrono",
  "criterion",
@@ -4529,7 +4529,7 @@ dependencies = [
 
 [[package]]
 name = "vestige-mcp"
-version = "2.0.1"
+version = "2.0.2"
 dependencies = [
  "anyhow",
  "axum",
@@ -4544,6 +4544,7 @@ dependencies = [
  "rusqlite",
  "serde",
  "serde_json",
+ "subtle",
 "tempfile",
 "thiserror 2.0.18",
 "tokio",
diff --git a/crates/vestige-core/Cargo.toml b/crates/vestige-core/Cargo.toml
index b8aeab6..9cfafc1 100644
--- a/crates/vestige-core/Cargo.toml
+++ b/crates/vestige-core/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "vestige-core"
-version = "2.0.1"
+version = "2.0.2"
 edition = "2024"
rust-version = "1.91" authors = ["Vestige Team"] diff --git a/crates/vestige-mcp/Cargo.toml b/crates/vestige-mcp/Cargo.toml index 86d8d74..5536e0e 100644 --- a/crates/vestige-mcp/Cargo.toml +++ b/crates/vestige-mcp/Cargo.toml @@ -1,6 +1,6 @@ [package] name = "vestige-mcp" -version = "2.0.1" +version = "2.0.2" edition = "2024" description = "Cognitive memory MCP server for Claude - FSRS-6, spreading activation, synaptic tagging, 3D dashboard, and 130 years of memory research" authors = ["samvallad33"] @@ -32,7 +32,7 @@ path = "src/bin/cli.rs" # ============================================================================ # Includes: FSRS-6, spreading activation, synaptic tagging, hippocampal indexing, # memory states, context memory, importance signals, dreams, and more -vestige-core = { version = "2.0.1", path = "../vestige-core", default-features = false, features = ["bundled-sqlite"] } +vestige-core = { version = "2.0.2", path = "../vestige-core", default-features = false, features = ["bundled-sqlite"] } # ============================================================================ # MCP Server Dependencies @@ -50,6 +50,9 @@ chrono = { version = "0.4", features = ["serde"] } # UUID uuid = { version = "1", features = ["v4", "serde"] } +# Constant-time comparison for auth tokens (prevents timing side-channels) +subtle = "2" + # Error handling thiserror = "2" anyhow = "1" diff --git a/crates/vestige-mcp/src/bin/cli.rs b/crates/vestige-mcp/src/bin/cli.rs index cebacf3..ecb278e 100644 --- a/crates/vestige-mcp/src/bin/cli.rs +++ b/crates/vestige-mcp/src/bin/cli.rs @@ -4,6 +4,7 @@ use std::io::{BufWriter, Write}; use std::path::PathBuf; +use std::sync::Arc; use chrono::{NaiveDate, Utc}; use clap::{Parser, Subcommand}; @@ -109,6 +110,19 @@ enum Commands { #[arg(long)] source: Option, }, + + /// Start standalone HTTP MCP server (no stdio, for remote access) + Serve { + /// HTTP transport port + #[arg(long, default_value = "3928")] + port: u16, + /// Also start the 
dashboard + #[arg(long)] + dashboard: bool, + /// Dashboard port + #[arg(long, default_value = "3927")] + dashboard_port: u16, + }, } fn main() -> anyhow::Result<()> { @@ -139,6 +153,11 @@ fn main() -> anyhow::Result<()> { node_type, source, } => run_ingest(content, tags, node_type, source), + Commands::Serve { + port, + dashboard, + dashboard_port, + } => run_serve(port, dashboard, dashboard_port), } } @@ -967,6 +986,82 @@ fn run_dashboard(port: u16, open_browser: bool) -> anyhow::Result<()> { }) } +/// Start standalone HTTP MCP server (no stdio transport) +fn run_serve(port: u16, with_dashboard: bool, dashboard_port: u16) -> anyhow::Result<()> { + use vestige_mcp::cognitive::CognitiveEngine; + + println!("{}", "=== Vestige HTTP Server ===".cyan().bold()); + println!(); + + let storage = Storage::new(None)?; + + #[cfg(feature = "embeddings")] + { + if let Err(e) = storage.init_embeddings() { + println!( + " {} Embeddings unavailable: {} (search will use keyword-only)", + "!".yellow(), + e + ); + } + } + + let storage = Arc::new(storage); + + let rt = tokio::runtime::Runtime::new()?; + rt.block_on(async move { + let cognitive = Arc::new(tokio::sync::Mutex::new(CognitiveEngine::new())); + { + let mut cog = cognitive.lock().await; + cog.hydrate(&storage); + } + + let (event_tx, _) = + tokio::sync::broadcast::channel::(1024); + + // Optionally start dashboard + if with_dashboard { + let ds = Arc::clone(&storage); + let dc = Arc::clone(&cognitive); + let dtx = event_tx.clone(); + tokio::spawn(async move { + match vestige_mcp::dashboard::start_background_with_event_tx(ds, Some(dc), dtx, dashboard_port).await { + Ok(_) => println!(" {} Dashboard: http://127.0.0.1:{}", ">".cyan(), dashboard_port), + Err(e) => eprintln!(" {} Dashboard failed: {}", "!".yellow(), e), + } + }); + } + + // Get auth token + let token = vestige_mcp::protocol::auth::get_or_create_auth_token() + .map_err(|e| anyhow::anyhow!("Failed to create auth token: {}", e))?; + + let bind = 
std::env::var("VESTIGE_HTTP_BIND").unwrap_or_else(|_| "127.0.0.1".to_string()); + println!(" {} HTTP transport: http://{}:{}/mcp", ">".cyan(), bind, port); + println!(" {} Auth token: {}...", ">".cyan(), &token[..8]); + println!(); + println!("{}", "Press Ctrl+C to stop.".dimmed()); + + // Start HTTP transport (blocks on the server, no stdio) + vestige_mcp::protocol::http::start_http_transport( + Arc::clone(&storage), + Arc::clone(&cognitive), + event_tx, + token, + port, + ) + .await + .map_err(|e| anyhow::anyhow!("HTTP transport failed: {}", e))?; + + // Keep the process alive (the HTTP server runs in a spawned task) + tokio::signal::ctrl_c().await.ok(); + println!(); + println!("{}", "Shutting down...".dimmed()); + + Ok(()) + }) +} + /// Truncate a string for display (UTF-8 safe) fn truncate(s: &str, max_chars: usize) -> String { let s = s.replace('\n', " "); diff --git a/crates/vestige-mcp/src/lib.rs b/crates/vestige-mcp/src/lib.rs index 994caff..b5a2c3e 100644 --- a/crates/vestige-mcp/src/lib.rs +++ b/crates/vestige-mcp/src/lib.rs @@ -4,3 +4,7 @@ pub mod cognitive; pub mod dashboard; +pub mod protocol; +pub mod resources; +pub mod server; +pub mod tools; diff --git a/crates/vestige-mcp/src/main.rs b/crates/vestige-mcp/src/main.rs index c3391f5..b594b0e 100644 --- a/crates/vestige-mcp/src/main.rs +++ b/crates/vestige-mcp/src/main.rs @@ -27,12 +27,9 @@ //! - Reconsolidation (memories editable on retrieval) //! 
- Memory Chains (reasoning paths)

-// cognitive is exported from lib.rs for dashboard access
use vestige_mcp::cognitive;
-mod protocol;
-mod resources;
-mod server;
-mod tools;
+use vestige_mcp::protocol;
+use vestige_mcp::server;

use std::io;
use std::path::PathBuf;
@@ -44,15 +41,24 @@ use tracing_subscriber::EnvFilter;
// Use vestige-core for the cognitive science engine
use vestige_core::Storage;

-use crate::protocol::stdio::StdioTransport;
-use crate::server::McpServer;
+use protocol::stdio::StdioTransport;
+use server::McpServer;

-/// Parse command-line arguments and return the optional data directory path.
-/// Returns `None` for the path if no `--data-dir` was specified.
+/// Parsed CLI configuration.
+struct Config {
+    data_dir: Option<PathBuf>,
+    http_port: u16,
+}
+
+/// Parse command-line arguments into a `Config`.
/// Exits the process if `--help` or `--version` is requested.
-fn parse_args() -> Option<PathBuf> {
+fn parse_args() -> Config {
    let args: Vec<String> = std::env::args().collect();
    let mut data_dir: Option<PathBuf> = None;
+    let mut http_port: u16 = std::env::var("VESTIGE_HTTP_PORT")
+        .ok()
+        .and_then(|s| s.parse().ok())
+        .unwrap_or(3928);

    let mut i = 1;
    while i < args.len() {
@@ -69,13 +75,18 @@ fn parse_args() -> Option<PathBuf> {
            println!("    -h, --help       Print help information");
            println!("    -V, --version    Print version information");
            println!("    --data-dir       Custom data directory");
+            println!("    --http-port      HTTP transport port (default: 3928)");
            println!();
            println!("ENVIRONMENT:");
-            println!("    RUST_LOG    Log level filter (e.g., debug, info, warn, error)");
+            println!("    RUST_LOG                Log level filter (e.g., debug, info, warn, error)");
+            println!("    VESTIGE_AUTH_TOKEN      Override the bearer token for HTTP transport");
+            println!("    VESTIGE_HTTP_PORT       HTTP transport port (default: 3928)");
+            println!("    VESTIGE_DASHBOARD_PORT  Dashboard port (default: 3927)");
            println!();
            println!("EXAMPLES:");
            println!("    vestige-mcp");
            println!("    vestige-mcp --data-dir /custom/path");
+            println!("    vestige-mcp --http-port 8080");
            println!("    RUST_LOG=debug vestige-mcp");
            std::process::exit(0);
        }
@@ -102,6 +113,31 @@ fn parse_args() -> Option<PathBuf> {
                }
                data_dir = Some(PathBuf::from(path));
            }
+            "--http-port" => {
+                i += 1;
+                if i >= args.len() {
+                    eprintln!("error: --http-port requires a port number");
+                    eprintln!("Usage: vestige-mcp --http-port <PORT>");
+                    std::process::exit(1);
+                }
+                http_port = match args[i].parse() {
+                    Ok(p) => p,
+                    Err(_) => {
+                        eprintln!("error: invalid port number '{}'", args[i]);
+                        std::process::exit(1);
+                    }
+                };
+            }
+            arg if arg.starts_with("--http-port=") => {
+                let val = arg.strip_prefix("--http-port=").unwrap_or("");
+                http_port = match val.parse() {
+                    Ok(p) => p,
+                    Err(_) => {
+                        eprintln!("error: invalid port number '{}'", val);
+                        std::process::exit(1);
+                    }
+                };
+            }
            arg => {
                eprintln!("error: unknown argument '{}'", arg);
                eprintln!("Usage: vestige-mcp [OPTIONS]");
@@ -112,13 +148,13 @@ fn parse_args() -> Option<PathBuf> {
        }
        i += 1;
    }

-    data_dir
+    Config { data_dir, http_port }
}

#[tokio::main]
async fn main() {
    // Parse CLI arguments first (before logging init, so --help/--version work cleanly)
-    let data_dir = parse_args();
+    let config = parse_args();

    // Initialize logging to stderr (stdout is for JSON-RPC)
    tracing_subscriber::fmt()
@@ -134,7 +170,7 @@ async fn main() {
    info!("Vestige MCP Server v{} starting...", env!("CARGO_PKG_VERSION"));

    // Initialize storage with optional custom data directory
-    let storage = match Storage::new(data_dir) {
+    let storage = match Storage::new(config.data_dir) {
        Ok(s) => {
            info!("Storage initialized successfully");

@@ -260,6 +296,38 @@ async fn main() {
        });
    }

+    // Start HTTP MCP transport (Streamable HTTP for Claude.ai / remote clients)
+    {
+        let http_storage = Arc::clone(&storage);
+        let http_cognitive = Arc::clone(&cognitive);
+        let http_event_tx = event_tx.clone();
+        let http_port = config.http_port;
+
+        match protocol::auth::get_or_create_auth_token() {
+            Ok(token) => {
+                let bind = 
std::env::var("VESTIGE_HTTP_BIND").unwrap_or_else(|_| "127.0.0.1".to_string());
+                eprintln!("Vestige HTTP transport: http://{}:{}/mcp", bind, http_port);
+                eprintln!("Auth token: {}...", &token[..8]);
+                tokio::spawn(async move {
+                    if let Err(e) = protocol::http::start_http_transport(
+                        http_storage,
+                        http_cognitive,
+                        http_event_tx,
+                        token,
+                        http_port,
+                    )
+                    .await
+                    {
+                        warn!("HTTP transport failed to start: {}", e);
+                    }
+                });
+            }
+            Err(e) => {
+                warn!("Could not create auth token, HTTP transport disabled: {}", e);
+            }
+        }
+    }
+
    // Load cross-encoder reranker in the background (downloads ~150MB on first run)
    #[cfg(feature = "embeddings")]
    {
diff --git a/crates/vestige-mcp/src/protocol/auth.rs b/crates/vestige-mcp/src/protocol/auth.rs
new file mode 100644
index 0000000..dce821b
--- /dev/null
+++ b/crates/vestige-mcp/src/protocol/auth.rs
@@ -0,0 +1,104 @@
+//! Bearer token authentication for the HTTP transport.
+//!
+//! Token priority:
+//! 1. `VESTIGE_AUTH_TOKEN` env var (override)
+//! 2. Read from `<data-dir>/auth_token` file
+//! 3. Generate `uuid::Uuid::new_v4()`, write to file with 0o600 permissions
+//!
+//! Security: The token file is created with restricted permissions from the
+//! start (via OpenOptionsExt on Unix) to prevent a TOCTOU race where another
+//! process could read the token before permissions are set.
+
+use std::fs;
+use std::path::PathBuf;
+
+use directories::ProjectDirs;
+use tracing::{info, warn};
+
+/// Minimum recommended token length when provided via env var.
+const MIN_TOKEN_LENGTH: usize = 32;
+
+/// Return the auth token file path inside the Vestige data directory.
+fn token_path() -> Result<PathBuf, Box<dyn std::error::Error>> {
+    let dirs = ProjectDirs::from("com", "vestige", "core")
+        .ok_or("could not determine project directories")?;
+    Ok(dirs.data_dir().join("auth_token"))
+}
+
+/// Get (or create) the bearer token used for HTTP transport authentication.
+///
+/// Priority:
+/// 1. `VESTIGE_AUTH_TOKEN` environment variable
+/// 2. Existing `auth_token` file in the data directory
+/// 3. Newly generated UUID v4, persisted to file
+pub fn get_or_create_auth_token() -> Result<String, Box<dyn std::error::Error>> {
+    // 1. Env var override
+    if let Ok(token) = std::env::var("VESTIGE_AUTH_TOKEN") {
+        let token = token.trim().to_string();
+        if !token.is_empty() {
+            if token.len() < MIN_TOKEN_LENGTH {
+                warn!(
+                    "VESTIGE_AUTH_TOKEN is only {} chars (recommended >= {}). \
+                     Short tokens are vulnerable to brute-force attacks.",
+                    token.len(),
+                    MIN_TOKEN_LENGTH
+                );
+            }
+            info!("Using auth token from VESTIGE_AUTH_TOKEN env var");
+            return Ok(token);
+        }
+    }
+
+    let path = token_path()?;
+
+    // 2. Read existing file
+    if path.exists() {
+        let token = fs::read_to_string(&path)?.trim().to_string();
+        if !token.is_empty() {
+            info!("Using auth token from {}", path.display());
+            return Ok(token);
+        }
+    }
+
+    // 3. Generate new token and persist
+    let token = uuid::Uuid::new_v4().to_string();
+
+    // Ensure parent directory exists with restricted permissions
+    if let Some(parent) = path.parent() {
+        fs::create_dir_all(parent)?;
+
+        // Restrict parent directory permissions on Unix (owner only)
+        #[cfg(unix)]
+        {
+            use std::os::unix::fs::PermissionsExt;
+            let _ = fs::set_permissions(parent, fs::Permissions::from_mode(0o700));
+        }
+    }
+
+    // Write token file with restricted permissions from the start.
+    // On Unix, we use OpenOptionsExt to set mode 0o600 at creation time,
+    // avoiding the TOCTOU race of write-then-chmod. 
+ #[cfg(unix)] + { + use std::io::Write; + use std::os::unix::fs::OpenOptionsExt; + + let mut file = fs::OpenOptions::new() + .write(true) + .create(true) + .truncate(true) + .mode(0o600) // Owner read/write only — set at creation, no race window + .open(&path)?; + file.write_all(token.as_bytes())?; + file.sync_all()?; + } + + // On non-Unix (Windows), fall back to regular write (Windows ACLs are different) + #[cfg(not(unix))] + { + fs::write(&path, &token)?; + } + + info!("Generated new auth token at {}", path.display()); + Ok(token) +} diff --git a/crates/vestige-mcp/src/protocol/http.rs b/crates/vestige-mcp/src/protocol/http.rs new file mode 100644 index 0000000..9e8987a --- /dev/null +++ b/crates/vestige-mcp/src/protocol/http.rs @@ -0,0 +1,308 @@ +//! Streamable HTTP transport for MCP. +//! +//! Implements the MCP Streamable HTTP transport specification: +//! - `POST /mcp` — JSON-RPC endpoint (initialize, tools/call, etc.) +//! - `DELETE /mcp` — session cleanup +//! +//! Each client gets a per-session `McpServer` instance (owns `initialized` state). +//! Shared state (Storage, CognitiveEngine, event bus) is shared across sessions. + +use std::collections::HashMap; +use std::sync::Arc; +use std::time::{Duration, Instant}; + +use axum::extract::{DefaultBodyLimit, State}; +use axum::http::{HeaderMap, StatusCode}; +use axum::response::IntoResponse; +use axum::routing::{delete, post}; +use axum::{Json, Router}; +use subtle::ConstantTimeEq; +use tokio::sync::{broadcast, Mutex, RwLock}; +use tower::ServiceBuilder; +use tower::limit::ConcurrencyLimitLayer; +use tower_http::cors::CorsLayer; +use tracing::{info, warn}; + +use crate::cognitive::CognitiveEngine; +use crate::protocol::types::JsonRpcRequest; +use crate::server::McpServer; +use vestige_core::Storage; +use crate::dashboard::events::VestigeEvent; + +/// Maximum concurrent sessions. +const MAX_SESSIONS: usize = 100; + +/// Sessions idle longer than this are reaped. 
+const SESSION_TIMEOUT: Duration = Duration::from_secs(30 * 60);
+
+/// How often the reaper task runs.
+const REAPER_INTERVAL: Duration = Duration::from_secs(5 * 60);
+
+/// Concurrency limit for the tower middleware.
+const CONCURRENCY_LIMIT: usize = 50;
+
+/// Maximum request body size (256 KB — JSON-RPC requests should be small).
+const MAX_BODY_SIZE: usize = 256 * 1024;
+
+/// A per-client session holding its own McpServer instance.
+struct Session {
+    server: McpServer,
+    last_active: Instant,
+}
+
+/// Shared state cloned into every axum handler.
+#[derive(Clone)]
+pub struct HttpTransportState {
+    sessions: Arc<RwLock<HashMap<String, Arc<Mutex<Session>>>>>,
+    storage: Arc<Storage>,
+    cognitive: Arc<Mutex<CognitiveEngine>>,
+    event_tx: broadcast::Sender<VestigeEvent>,
+    auth_token: String,
+}
+
+/// Start the HTTP MCP transport on `127.0.0.1:<port>`.
+///
+/// This function spawns a background tokio task and returns immediately.
+pub async fn start_http_transport(
+    storage: Arc<Storage>,
+    cognitive: Arc<Mutex<CognitiveEngine>>,
+    event_tx: broadcast::Sender<VestigeEvent>,
+    auth_token: String,
+    port: u16,
+) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
+    let state = HttpTransportState {
+        sessions: Arc::new(RwLock::new(HashMap::new())),
+        storage,
+        cognitive,
+        event_tx,
+        auth_token,
+    };
+
+    // Spawn session reaper
+    {
+        let sessions = Arc::clone(&state.sessions);
+        tokio::spawn(async move {
+            loop {
+                tokio::time::sleep(REAPER_INTERVAL).await;
+                let mut map = sessions.write().await;
+                let before = map.len();
+                map.retain(|_id, session| {
+                    // Try to check last_active without blocking; skip if locked
+                    match session.try_lock() {
+                        Ok(s) => s.last_active.elapsed() < SESSION_TIMEOUT,
+                        Err(_) => true, // in-use, keep
+                    }
+                });
+                let removed = before - map.len();
+                if removed > 0 {
+                    info!("Session reaper: removed {} idle sessions ({} active)", removed, map.len());
+                }
+            }
+        });
+    }
+
+    let app = Router::new()
+        .route("/mcp", post(post_mcp))
+        .route("/mcp", delete(delete_mcp))
+        .layer(
+            ServiceBuilder::new()
+                .layer(DefaultBodyLimit::max(MAX_BODY_SIZE))
+                .layer(ConcurrencyLimitLayer::new(CONCURRENCY_LIMIT))
+                
.layer(CorsLayer::permissive()), + ) + .with_state(state); + + // Bind to localhost only — use VESTIGE_HTTP_BIND=0.0.0.0 for remote access + let bind_addr: std::net::IpAddr = std::env::var("VESTIGE_HTTP_BIND") + .ok() + .and_then(|s| s.parse().ok()) + .unwrap_or_else(|| std::net::IpAddr::V4(std::net::Ipv4Addr::LOCALHOST)); + + let addr = std::net::SocketAddr::from((bind_addr, port)); + let listener = tokio::net::TcpListener::bind(addr).await?; + info!("HTTP MCP transport listening on http://{}/mcp", addr); + + tokio::spawn(async move { + if let Err(e) = axum::serve(listener, app).await { + warn!("HTTP transport error: {}", e); + } + }); + + Ok(()) +} + +// --------------------------------------------------------------------------- +// Helpers +// --------------------------------------------------------------------------- + +/// Validate the `Authorization: Bearer ` header using constant-time +/// comparison to prevent timing side-channel attacks. +fn validate_auth(headers: &HeaderMap, expected: &str) -> Result<(), (StatusCode, &'static str)> { + let header = headers + .get("authorization") + .and_then(|v| v.to_str().ok()) + .ok_or((StatusCode::UNAUTHORIZED, "Missing Authorization header"))?; + + let token = header + .strip_prefix("Bearer ") + .ok_or((StatusCode::UNAUTHORIZED, "Invalid Authorization scheme (expected Bearer)"))?; + + // Constant-time comparison: prevents timing side-channel attacks. + // We first check lengths match (length itself is not secret since UUIDs + // have a fixed public format), then compare bytes in constant time. + let token_bytes = token.as_bytes(); + let expected_bytes = expected.as_bytes(); + + if token_bytes.len() != expected_bytes.len() + || token_bytes.ct_eq(expected_bytes).unwrap_u8() != 1 + { + return Err((StatusCode::FORBIDDEN, "Invalid auth token")); + } + + Ok(()) +} + +/// Extract and validate the `Mcp-Session-Id` header value. 
+///
+/// Only accepts values that parse as a UUID, so session IDs match the
+/// server-generated format and arbitrary header values are rejected.
+fn session_id_from_headers(headers: &HeaderMap) -> Option<String> {
+    headers
+        .get("mcp-session-id")
+        .and_then(|v| v.to_str().ok())
+        .filter(|s| uuid::Uuid::parse_str(s).is_ok())
+        .map(|s| s.to_string())
+}
+
+// ---------------------------------------------------------------------------
+// Handlers
+// ---------------------------------------------------------------------------
+
+/// `POST /mcp` — main JSON-RPC handler.
+async fn post_mcp(
+    State(state): State<HttpTransportState>,
+    headers: HeaderMap,
+    Json(request): Json<JsonRpcRequest>,
+) -> impl IntoResponse {
+    // Auth check
+    if let Err((status, msg)) = validate_auth(&headers, &state.auth_token) {
+        return (status, HeaderMap::new(), msg.to_string()).into_response();
+    }
+
+    let is_initialize = request.method == "initialize";
+
+    if is_initialize {
+        // ── New session ──
+        // Take write lock immediately to avoid TOCTOU race on MAX_SESSIONS check. 
+ let mut sessions = state.sessions.write().await; + if sessions.len() >= MAX_SESSIONS { + return ( + StatusCode::SERVICE_UNAVAILABLE, + "Too many active sessions", + ) + .into_response(); + } + + let server = McpServer::new_with_events( + Arc::clone(&state.storage), + Arc::clone(&state.cognitive), + state.event_tx.clone(), + ); + + let session_id = uuid::Uuid::new_v4().to_string(); + + let session = Arc::new(Mutex::new(Session { + server, + last_active: Instant::now(), + })); + + // Handle the initialize request + let response = { + let mut sess = session.lock().await; + sess.server.handle_request(request).await + }; + + // Insert session while still holding write lock — atomic check-and-insert + sessions.insert(session_id.clone(), session); + drop(sessions); + + match response { + Some(resp) => { + let mut resp_headers = HeaderMap::new(); + resp_headers.insert("mcp-session-id", session_id.parse().unwrap()); + (StatusCode::OK, resp_headers, Json(resp)).into_response() + } + None => { + // Notifications return 202 + let mut resp_headers = HeaderMap::new(); + resp_headers.insert("mcp-session-id", session_id.parse().unwrap()); + (StatusCode::ACCEPTED, resp_headers).into_response() + } + } + } else { + // ── Existing session ── + let session_id = match session_id_from_headers(&headers) { + Some(id) => id, + None => { + return ( + StatusCode::BAD_REQUEST, + "Missing or invalid Mcp-Session-Id header", + ) + .into_response(); + } + }; + + let session = { + let sessions = state.sessions.read().await; + sessions.get(&session_id).cloned() + }; + + let session = match session { + Some(s) => s, + None => { + return ( + StatusCode::NOT_FOUND, + "Session not found or expired", + ) + .into_response(); + } + }; + + let response = { + let mut sess = session.lock().await; + sess.last_active = Instant::now(); + sess.server.handle_request(request).await + }; + + let mut resp_headers = HeaderMap::new(); + resp_headers.insert("mcp-session-id", session_id.parse().unwrap()); + + match 
response {
+            Some(resp) => (StatusCode::OK, resp_headers, Json(resp)).into_response(),
+            None => (StatusCode::ACCEPTED, resp_headers).into_response(),
+        }
+    }
+}
+
+/// `DELETE /mcp` — explicit session cleanup.
+async fn delete_mcp(
+    State(state): State<HttpTransportState>,
+    headers: HeaderMap,
+) -> impl IntoResponse {
+    if let Err((status, msg)) = validate_auth(&headers, &state.auth_token) {
+        return (status, msg).into_response();
+    }
+
+    let session_id = match session_id_from_headers(&headers) {
+        Some(id) => id,
+        None => return (StatusCode::BAD_REQUEST, "Missing or invalid Mcp-Session-Id header").into_response(),
+    };
+
+    let mut sessions = state.sessions.write().await;
+    if sessions.remove(&session_id).is_some() {
+        info!("Session {} deleted via DELETE /mcp", &session_id[..8]);
+        (StatusCode::OK, "Session deleted").into_response()
+    } else {
+        (StatusCode::NOT_FOUND, "Session not found").into_response()
+    }
+}
diff --git a/crates/vestige-mcp/src/protocol/mod.rs b/crates/vestige-mcp/src/protocol/mod.rs
index 9e0793c..9e3bc0d 100644
--- a/crates/vestige-mcp/src/protocol/mod.rs
+++ b/crates/vestige-mcp/src/protocol/mod.rs
@@ -2,6 +2,8 @@
//!
//! JSON-RPC 2.0 over stdio for the Model Context Protocol.

+pub mod auth;
+pub mod http;
pub mod messages;
pub mod stdio;
pub mod types;
diff --git a/crates/vestige-mcp/src/server.rs b/crates/vestige-mcp/src/server.rs
index 95a0b88..10a233a 100644
--- a/crates/vestige-mcp/src/server.rs
+++ b/crates/vestige-mcp/src/server.rs
@@ -11,7 +11,7 @@ use tokio::sync::{broadcast, Mutex};
use tracing::{debug, info, warn};

use crate::cognitive::CognitiveEngine;
-use vestige_mcp::dashboard::events::VestigeEvent;
+use crate::dashboard::events::VestigeEvent;
use crate::protocol::messages::{
    CallToolRequest, CallToolResult, InitializeRequest, InitializeResult,
    ListResourcesResult, ListToolsResult, ReadResourceRequest, ReadResourceResult,
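The `validate_auth` helper in http.rs leans on `subtle::ConstantTimeEq` so that comparison time does not depend on how many leading bytes of a guessed token happen to match. A stdlib-only sketch of the same idea (the function name is ours; the transport itself keeps using `subtle`):

```rust
/// Constant-time byte comparison: XOR each byte pair and OR the results,
/// so the loop always runs to completion regardless of where bytes differ.
fn ct_eq_sketch(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // a UUID token's length is public, not secret
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences; never exit early
    }
    diff == 0
}

fn main() {
    assert!(ct_eq_sketch(b"correct-token", b"correct-token"));
    assert!(!ct_eq_sketch(b"correct-token", b"correct-tokeX"));
    assert!(!ct_eq_sketch(b"short", b"much-longer-token"));
}
```

An early-returning `==` on byte slices would fail on the first mismatch, leaking match length through response timing; the accumulate-then-check shape is what makes `ct_eq`-style comparison safe for bearer tokens.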
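`session_id_from_headers` gates the `Mcp-Session-Id` header through `uuid::Uuid::parse_str` before it is used as a map key. As a rough stdlib-only illustration of that gate (the helper name and the manual 8-4-4-4-12 check are ours, not part of the diff, and are stricter than `parse_str`, which also accepts other UUID encodings):

```rust
/// Accept only the hyphenated 8-4-4-4-12 hex layout that
/// `uuid::Uuid::new_v4().to_string()` produces for session IDs.
fn looks_like_session_id(s: &str) -> bool {
    let parts: Vec<&str> = s.split('-').collect();
    let lens = [8, 4, 4, 4, 12];
    parts.len() == 5
        && parts
            .iter()
            .zip(lens.iter())
            .all(|(p, &n)| p.len() == n && p.chars().all(|c| c.is_ascii_hexdigit()))
}

fn main() {
    assert!(looks_like_session_id("550e8400-e29b-41d4-a716-446655440000"));
    assert!(!looks_like_session_id("not-a-uuid"));
    assert!(!looks_like_session_id("550e8400e29b41d4a716446655440000")); // missing hyphens
}
```

Rejecting anything that does not match the server-generated format means a client can never smuggle arbitrary bytes into log lines or header echoes via the session ID.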
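auth.rs sets `mode(0o600)` at `open` time rather than chmod-ing after the write, closing the window in which another local user could read the token. A self-contained Unix sketch of that pattern (the temp-file path and the `write_token_0600` helper are hypothetical; the real code derives its path from `ProjectDirs` and writes a generated UUID):

```rust
use std::fs;
use std::io::Write;
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

/// Create `path` with owner-only permissions from the moment it exists,
/// then write the token into it (no write-then-chmod race).
fn write_token_0600(path: &Path, token: &[u8]) -> std::io::Result<()> {
    let mut file = fs::OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o600) // applied atomically at creation
        .open(path)?;
    file.write_all(token)?;
    file.sync_all()
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join(format!("vestige_token_demo_{}", std::process::id()));
    write_token_0600(&path, b"example-token")?;

    // Group and other bits must be clear, whatever the umask did to owner bits.
    let mode = fs::metadata(&path)?.permissions().mode() & 0o777;
    assert_eq!(mode & 0o077, 0);
    assert_eq!(fs::read_to_string(&path)?, "example-token");
    fs::remove_file(&path)
}
```

Note that the effective mode is still masked by the process umask, which is why the check above only insists that group/other access is absent.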