Mirror of https://github.com/samvallad33/vestige.git (synced 2026-04-25 00:36:22 +02:00)
feat: add MCP Streamable HTTP transport with Bearer auth
Adds a second transport layer alongside stdio — Streamable HTTP on port 3928. Enables Claude.ai, remote clients, and web integrations to connect to Vestige over HTTP with per-session McpServer instances.

- POST /mcp (JSON-RPC) + DELETE /mcp (session cleanup)
- Bearer token auth with constant-time comparison (`subtle` crate)
- Auto-generated UUID v4 token persisted with 0o600 permissions
- Per-session McpServer instances with 30-min idle reaper
- 100 max sessions, 50 concurrency limit, 256KB body limit
- `--http-port` flag + `VESTIGE_HTTP_PORT` / `VESTIGE_HTTP_BIND` env vars
- Module exports moved from binary to lib.rs for reusability
- `vestige` CLI gains `serve` subcommand via shared lib

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
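The constant-time comparison called out above matters because a naive `==` on byte slices can return at the first mismatched byte, leaking through response timing how much of the token an attacker has guessed. A minimal hand-rolled sketch of the idea follows; the actual implementation uses the `subtle` crate's `ConstantTimeEq`, which is hardened against compiler optimizations, and `ct_eq` here is an illustrative name, not Vestige's API.

```rust
/// Compare two byte slices in time independent of where they differ.
/// Illustrative only; production code should prefer subtle::ConstantTimeEq.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        // Length is not secret here: the token is a fixed-size UUID v4.
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences instead of returning early
    }
    diff == 0
}

fn main() {
    assert!(ct_eq(b"bearer-token-123", b"bearer-token-123"));
    assert!(!ct_eq(b"bearer-token-123", b"bearer-token-124"));
    assert!(!ct_eq(b"short", b"longer"));
    println!("constant-time comparison ok");
}
```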
parent: 070889ef26
commit: 816b577f69
11 changed files with 849 additions and 170 deletions
CLAUDE.md (394 changes)
@ -1,6 +1,6 @@
# Vestige v2.0.1 — Cognitive Memory System

Vestige is your long-term memory. 29 stateful cognitive modules implement real neuroscience: FSRS-6 spaced repetition, synaptic tagging, prediction error gating, hippocampal indexing, spreading activation, reconsolidation, and dual-strength memory theory. **Use it automatically. Use it aggressively.**

---
@ -9,135 +9,204 @@ Vestige is your long-term memory. It implements real neuroscience: FSRS-6 spaced
Every conversation, before responding to the user:

```
session_context({
  queries: ["user preferences", "[current project] context"],
  context: { codebase: "[project]", topics: ["[current topics]"] },
  token_budget: 2000
})
```

Then check `automationTriggers` from response:
- `needsDream` → call `dream` (consolidates memories, discovers hidden connections)
- `needsBackup` → call `backup`
- `needsGc` → call `gc(dry_run: true)` then review
- `totalMemories > 700` → call `find_duplicates`

Say "Remembering..." then retrieve context before answering.

> **Fallback:** If `session_context` unavailable: `search` × 2 → `intention` check → `system_status` → `predict`.

---

## Complete Tool Reference (21 Tools)

### session_context — One-Call Initialization
```
session_context({
  queries: ["user preferences", "project context"],  // search queries
  context: { codebase: "project-name", topics: ["svelte", "rust"], file: "src/main.rs" },
  token_budget: 2000,        // 100-10000, controls response size
  include_status: true,      // system health
  include_intentions: true,  // triggered reminders
  include_predictions: true  // proactive memory predictions
})
```
Returns: markdown context + `automationTriggers` + `expandable` IDs for on-demand retrieval.
### smart_ingest — Save Anything
**Single mode** — auto-decides CREATE/UPDATE/SUPERSEDE via Prediction Error Gating:
```
smart_ingest({
  content: "What to remember",
  tags: ["tag1", "tag2"],
  node_type: "fact",         // fact|concept|event|person|place|note|pattern|decision
  source: "optional reference",
  forceCreate: false         // bypass dedup when needed
})
```
**Batch mode** — save up to 20 items in one call (session end, pre-compaction):
```
smart_ingest({
  items: [
    { content: "Item 1", tags: ["session-end"], node_type: "fact" },
    { content: "Item 2", tags: ["bug-fix"], node_type: "fact" }
  ]
})
```
Each item runs the full cognitive pipeline: importance scoring → intent detection → synaptic tagging → hippocampal indexing → PE gating → cross-project recording.
### search — 7-Stage Cognitive Search
```
search({
  query: "search query",
  limit: 10,                // 1-100
  min_retention: 0.0,       // filter by retention strength
  min_similarity: 0.5,      // minimum cosine similarity
  detail_level: "summary",  // brief|summary|full
  context_topics: ["rust", "debugging"],  // boost topic-matching memories
  token_budget: 3000        // 100-10000, truncate to fit
})
```
Pipeline: Overfetch (3x, BM25+semantic) → Rerank (cross-encoder) → Temporal boost → Accessibility filter (FSRS-6) → Context match (Tulving 1973) → Competition (Anderson 1994) → Spreading activation. **Every search strengthens the memories it finds (Testing Effect).**

### memory — Read, Edit, Delete, Promote, Demote
```
memory({ action: "get", id: "uuid" })                            // full node with all FSRS state
memory({ action: "edit", id: "uuid", content: "updated text" })  // preserves FSRS state, regenerates embedding
memory({ action: "delete", id: "uuid" })
memory({ action: "promote", id: "uuid", reason: "was helpful" }) // +0.20 retrieval, +0.10 retention, 1.5x stability
memory({ action: "demote", id: "uuid", reason: "was wrong" })    // -0.30 retrieval, -0.15 retention, 0.5x stability
memory({ action: "state", id: "uuid" })                          // Active/Dormant/Silent/Unavailable + accessibility score
```
Promote/demote does NOT delete — it adjusts ranking. Demoted memories rank lower; alternatives surface instead.
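The promote/demote deltas above can be read as a small state update. A hypothetical sketch follows; the struct, field names, and the clamping of strengths to [0, 1] are assumptions for illustration, not Vestige's actual types.

```rust
/// Hypothetical sketch of the promote/demote adjustments listed above.
/// Field names and the [0, 1] clamping are illustrative assumptions.
struct Strengths {
    retrieval: f64,
    retention: f64,
    stability: f64,
}

impl Strengths {
    fn promote(&mut self) {
        self.retrieval = (self.retrieval + 0.20).min(1.0); // +0.20 retrieval
        self.retention = (self.retention + 0.10).min(1.0); // +0.10 retention
        self.stability *= 1.5;                             // 1.5x stability
    }

    fn demote(&mut self) {
        self.retrieval = (self.retrieval - 0.30).max(0.0); // -0.30 retrieval
        self.retention = (self.retention - 0.15).max(0.0); // -0.15 retention
        self.stability *= 0.5;                             // 0.5x stability
    }
}

fn main() {
    let mut s = Strengths { retrieval: 0.5, retention: 0.5, stability: 2.0 };
    s.promote();
    assert!((s.retrieval - 0.70).abs() < 1e-12);
    assert!((s.stability - 3.0).abs() < 1e-12);
    s.demote();
    assert!((s.retrieval - 0.40).abs() < 1e-12);
    println!("promote/demote sketch ok");
}
```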

### codebase — Code Patterns & Architectural Decisions
```
codebase({ action: "remember_pattern", name: "Pattern Name",
  description: "How it works and when to use it",
  files: ["src/file.rs"], codebase: "project-name" })

codebase({ action: "remember_decision", decision: "What was decided",
  rationale: "Why", alternatives: ["Option A", "Option B"],
  files: ["src/file.rs"], codebase: "project-name" })

codebase({ action: "get_context", codebase: "project-name", limit: 10 })
// Returns: patterns, decisions, cross-project insights
```

### intention — Prospective Memory (Reminders)
```
intention({ action: "set", description: "What to do",
  trigger: { type: "context", topic: "authentication" },  // fires when discussing auth
  priority: "high" })

intention({ action: "set", description: "Deploy by Friday",
  trigger: { type: "time", at: "2026-03-07T17:00:00Z" },
  deadline: "2026-03-07T17:00:00Z" })

intention({ action: "set", description: "Check test coverage",
  trigger: { type: "context", codebase: "vestige", file_pattern: "*.test.*" } })

intention({ action: "check", context: { codebase: "vestige", topics: ["testing"] } })
intention({ action: "update", id: "uuid", status: "complete" })
intention({ action: "list", filter_status: "active" })
```

### dream — Memory Consolidation
```
dream({ memory_count: 50 })
```
5-stage cycle: Replay → Cross-reference → Strengthen → Prune → Transfer. Uses Waking SWR tagging (70% tagged + 30% random for diversity). Discovers hidden connections, generates insights, persists new edges to the activation network.

### explore_connections — Graph Traversal
```
explore_connections({ action: "associations", from: "uuid", limit: 10 })
// Spreading activation from a memory — find related memories via graph traversal

explore_connections({ action: "chain", from: "uuid-A", to: "uuid-B" })
// Build reasoning path between two memories (A*-like pathfinding)

explore_connections({ action: "bridges", from: "uuid-A", to: "uuid-B" })
// Find connecting memories that bridge two concepts
```

### predict — Proactive Retrieval
```
predict({ context: { codebase: "vestige", current_file: "src/main.rs",
  current_topics: ["error handling", "rust"] } })
```
Returns: predictions with confidence, suggestions, speculative retrievals, top interests. Uses SpeculativeRetriever's learned patterns from access history.

### importance_score — Should I Save This?
```
importance_score({ content: "Content to evaluate",
  context_topics: ["debugging"], project: "vestige" })
```
4-channel model: novelty (0.25), arousal (0.30), reward (0.25), attention (0.20). Composite > 0.6 = save it.
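The weighted blend above can be written out directly. The function names below are illustrative, not Vestige's API; only the four weights and the 0.6 threshold come from the document.

```rust
/// Composite importance from the 4-channel model described above.
/// Weights: novelty 0.25, arousal 0.30, reward 0.25, attention 0.20.
fn composite_importance(novelty: f64, arousal: f64, reward: f64, attention: f64) -> f64 {
    0.25 * novelty + 0.30 * arousal + 0.25 * reward + 0.20 * attention
}

/// Save threshold from the doc: composite > 0.6 is worth saving.
fn worth_saving(score: f64) -> bool {
    score > 0.6
}

fn main() {
    let high = composite_importance(0.9, 0.8, 0.7, 0.6); // 0.76
    let low = composite_importance(0.1, 0.1, 0.2, 0.1);  // 0.125
    assert!(worth_saving(high));
    assert!(!worth_saving(low));
    println!("high = {high:.3}, low = {low:.3}");
}
```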

### find_duplicates — Dedup Memory
```
find_duplicates({ similarity_threshold: 0.80, limit: 20, tags: ["bug-fix"] })
```
Cosine similarity clustering. Returns merge/review suggestions.

### memory_timeline — Chronological Browse
```
memory_timeline({ start: "2026-02-01", end: "2026-03-01",
  node_type: "decision", tags: ["vestige"], limit: 50, detail_level: "summary" })
```

### memory_changelog — Audit Trail
```
memory_changelog({ memory_id: "uuid", limit: 20 })   // per-memory history
memory_changelog({ start: "2026-03-01", limit: 20 }) // system-wide
```

### memory_health — Retention Dashboard
```
memory_health()
```
Returns: avg retention, distribution buckets (0-20%, 20-40%, etc.), trend (improving/declining/stable), recommendation.

### memory_graph — Visualization Export
```
memory_graph({ query: "search term", depth: 2, max_nodes: 50 })
memory_graph({ center_id: "uuid", depth: 3, max_nodes: 100 })
```
Returns nodes with force-directed positions + edges with weights.

### Maintenance Tools
```
system_status()  // health + stats + warnings + recommendations
consolidate()    // FSRS-6 decay cycle + embedding generation
backup()         // SQLite backup → ~/.vestige/backups/
export({ format: "json", tags: ["bug-fix"], since: "2026-01-01" })
gc({ min_retention: 0.1, dry_run: true })  // garbage collect (dry_run first!)
restore({ path: "/path/to/backup.json" })
```

---

## Mandatory Save Gates

**You MUST NOT proceed past a save gate without executing the save.**

| Gate | Trigger | Action |
|------|---------|--------|
| **BUG_FIX** | After any error is resolved | `smart_ingest({ content: "BUG FIX: [error]\nRoot cause: [why]\nSolution: [fix]\nFiles: [paths]", tags: ["bug-fix", "project"], node_type: "fact" })` |
| **DECISION** | After any architectural/design choice | `codebase({ action: "remember_decision", decision, rationale, alternatives, files, codebase })` |
| **CODE_CHANGE** | After >20 lines or new pattern | `codebase({ action: "remember_pattern", name, description, files, codebase })` |
| **SESSION_END** | Before stopping or compaction | `smart_ingest({ items: [{ content: "SESSION: [summary]", tags: ["session-end"] }] })` |

---
@ -148,60 +217,82 @@ smart_ingest({
| "Remember this" / "Don't forget" | `smart_ingest` immediately |
| "I always..." / "I never..." / "I prefer..." | Save as preference |
| "This is important" | `smart_ingest` + `memory(action="promote")` |
| "Remind me..." / "Next time..." | `intention({ action: "set" })` |

---

## Cognitive Architecture

### Search Pipeline (7 stages)
1. **Overfetch** — 3x results from hybrid search (0.3 BM25 + 0.7 semantic, nomic-embed-text-v1.5 768D)
2. **Rerank** — Cross-encoder rescoring (Jina Reranker v1 Turbo, 38M params)
3. **Temporal** — Recency + validity window boosting (85% relevance + 15% temporal)
4. **Accessibility** — FSRS-6 retention filter (Active ≥0.7, Dormant ≥0.4, Silent ≥0.1)
5. **Context** — Tulving 1973 encoding specificity (topic overlap → +30% boost)
6. **Competition** — Anderson 1994 retrieval-induced forgetting (winners strengthen, competitors weaken)
7. **Activation** — Spreading activation side effects + predictive model + reconsolidation marking
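Stage 1's convex-combination fusion can be sketched as follows. The per-retriever min-max normalization is an assumption about how the two score scales are made comparable before blending; only the 0.3/0.7 weights come from the document.

```rust
/// Min-max normalize a score list to [0, 1] (assumed preprocessing step).
fn normalize(scores: &[f64]) -> Vec<f64> {
    let min = scores.iter().cloned().fold(f64::INFINITY, f64::min);
    let max = scores.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let range = (max - min).max(f64::EPSILON); // avoid division by zero
    scores.iter().map(|s| (s - min) / range).collect()
}

/// Convex combination from stage 1: 0.3 * BM25 + 0.7 * semantic.
fn fuse(bm25: &[f64], semantic: &[f64]) -> Vec<f64> {
    let (b, s) = (normalize(bm25), normalize(semantic));
    b.iter().zip(s.iter()).map(|(b, s)| 0.3 * b + 0.7 * s).collect()
}

fn main() {
    // Doc 0: weak keyword match, strong semantic match; doc 1: the reverse.
    let fused = fuse(&[2.0, 8.0], &[0.9, 0.1]);
    assert!((fused[0] - 0.7).abs() < 1e-12);
    assert!((fused[1] - 0.3).abs() < 1e-12);
    println!("fused = {fused:?}");
}
```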

### Ingest Pipeline
**Pre:** 4-channel importance scoring (novelty/arousal/reward/attention) + intent detection → auto-tag
**Store:** Prediction Error Gating: similarity >0.92 → UPDATE, 0.75-0.92 → UPDATE/SUPERSEDE, <0.75 → CREATE
**Post:** Synaptic tagging (Frey & Morris 1997, 9h backward + 2h forward) + hippocampal indexing + cross-project recording
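The Store step's thresholds can be sketched as a simple classifier. The enum and function names here are hypothetical, not the crate's actual types; only the similarity cutoffs come from the document.

```rust
/// Hypothetical sketch of the Prediction Error Gating thresholds above.
#[derive(Debug, PartialEq)]
enum GateDecision {
    Update,            // similarity > 0.92: near-duplicate, update in place
    UpdateOrSupersede, // 0.75..=0.92: related, update or supersede the old node
    Create,            // < 0.75: novel content, create a new memory
}

fn gate(similarity: f64) -> GateDecision {
    if similarity > 0.92 {
        GateDecision::Update
    } else if similarity >= 0.75 {
        GateDecision::UpdateOrSupersede
    } else {
        GateDecision::Create
    }
}

fn main() {
    assert_eq!(gate(0.95), GateDecision::Update);
    assert_eq!(gate(0.80), GateDecision::UpdateOrSupersede);
    assert_eq!(gate(0.50), GateDecision::Create);
    println!("gating sketch ok");
}
```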

### Feedback Pipeline (via memory promote/demote)
**Promote:** Reward signal + importance boost + reconsolidation (memory becomes modifiable for 24-48h) + activation spread
**Demote:** Competition suppression + retrieval strength decrease (does NOT delete — alternatives surface instead)

### FSRS-6 (State-of-the-Art Spaced Repetition)
- Retrievability: `R = (1 + factor × t / S)^(-w20)` — 21 trained parameters
- Dual-strength model (Bjork & Bjork 1992): storage strength (grows) + retrieval strength (decays)
- Accessibility = retention×0.5 + retrieval×0.3 + storage×0.2
- 20-30% more efficient than SM-2 (Anki)
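The retrievability curve and the accessibility blend above can be written out as a sketch. `w20` is a trained FSRS parameter (the 0.2 below is a placeholder, not a value from this document), and `factor` is derived, as in FSRS, so that retrievability is exactly 0.9 when elapsed time equals stability.

```rust
/// R = (1 + factor * t / S)^(-w20), with factor chosen so R(t = S) = 0.9.
/// w20 is a trained FSRS parameter; the value used in main() is illustrative.
fn retrievability(t_days: f64, stability: f64, w20: f64) -> f64 {
    let factor = 0.9_f64.powf(-1.0 / w20) - 1.0;
    (1.0 + factor * t_days / stability).powf(-w20)
}

/// Accessibility blend from the list above.
fn accessibility(retention: f64, retrieval: f64, storage: f64) -> f64 {
    retention * 0.5 + retrieval * 0.3 + storage * 0.2
}

fn main() {
    let w20 = 0.2; // placeholder parameter
    assert!((retrievability(0.0, 10.0, w20) - 1.0).abs() < 1e-12);  // fresh memory
    assert!((retrievability(10.0, 10.0, w20) - 0.9).abs() < 1e-12); // at t = S
    assert!(retrievability(100.0, 10.0, w20) < 0.9);                // keeps decaying
    assert!((accessibility(1.0, 1.0, 1.0) - 1.0).abs() < 1e-12);    // weights sum to 1
    println!("FSRS-6 retrievability sketch ok");
}
```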

### 29 Cognitive Modules (stateful, persist across calls)

**Neuroscience (16):**
ActivationNetwork (Collins & Loftus 1975), SynapticTaggingSystem (Frey & Morris 1997), HippocampalIndex (Teyler & Rudy 2007), ContextMatcher (Tulving 1973), AccessibilityCalculator, CompetitionManager (Anderson 1994), StateUpdateService, ImportanceSignals, NoveltySignal, ArousalSignal, RewardSignal, AttentionSignal, EmotionalMemory (Brown & Kulik 1977), PredictiveMemory, ProspectiveMemory, IntentionParser

**Advanced (11):**
ImportanceTracker, ReconsolidationManager (Nader — 5min labile window), IntentDetector (9 intent types), ActivityTracker, MemoryDreamer (5-stage consolidation), MemoryChainBuilder (A*-like), MemoryCompressor (30-day min age), CrossProjectLearner (6 pattern types), AdaptiveEmbedder, SpeculativeRetriever (6 trigger types), ConsolidationScheduler

**Search (2):** Reranker, TemporalSearcher

### Memory States
- **Active** (retention ≥ 0.7) — easily retrievable
- **Dormant** (≥ 0.4) — retrievable with effort
- **Silent** (≥ 0.1) — difficult, needs cues
- **Unavailable** (< 0.1) — needs reinforcement
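The thresholds above map to a simple classifier. This enum is a sketch for illustration, not the crate's actual type; only the retention cutoffs come from the document.

```rust
/// Sketch of the retention → state mapping listed above.
#[derive(Debug, PartialEq)]
enum MemoryState {
    Active,      // retention >= 0.7
    Dormant,     // retention >= 0.4
    Silent,      // retention >= 0.1
    Unavailable, // retention < 0.1
}

fn classify(retention: f64) -> MemoryState {
    if retention >= 0.7 {
        MemoryState::Active
    } else if retention >= 0.4 {
        MemoryState::Dormant
    } else if retention >= 0.1 {
        MemoryState::Silent
    } else {
        MemoryState::Unavailable
    }
}

fn main() {
    assert_eq!(classify(0.85), MemoryState::Active);
    assert_eq!(classify(0.5), MemoryState::Dormant);
    assert_eq!(classify(0.2), MemoryState::Silent);
    assert_eq!(classify(0.05), MemoryState::Unavailable);
    println!("state classification ok");
}
```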

### Connection Types
semantic, temporal, causal, spatial, part_of, user_defined — each with strength (0-1), activation_count, timestamps

---

## Advanced Techniques

### Cross-Project Intelligence
The CrossProjectLearner tracks patterns across ALL projects (ErrorHandling, AsyncConcurrency, Testing, Architecture, Performance, Security). When you learn a pattern in one project that works, it becomes available in all projects. Use `codebase({ action: "get_context" })` without a codebase filter to get universal patterns.

### Reconsolidation Window
After any memory is accessed (via search, get, or promote), it enters a 5-minute "labile" state where modifications are enhanced. This is the optimal time to edit memories with new context. The system handles this automatically.

### Synaptic Tagging (Retroactive Importance)
Memories encoded in the last 9 hours can be retroactively promoted when something important happens. If you fix a critical bug, not only does the fix get saved — related memories from the past 9 hours also get importance boosts. The SynapticTaggingSystem handles this automatically.

### Dream Insights
Dreams don't just consolidate — they generate new insights by cross-referencing recent memories with older knowledge. Insights can reveal contradictions between memories, previously unseen patterns, and connections across different projects. Always check dream results for `insights_generated`.

### Token Budget Strategy
Use `token_budget` on search and session_context to control response size. For quick lookups: 500. For deep context: 3000-5000. Results that don't fit go to `expandable` — retrieve them with `memory({ action: "get", id: "..." })`.

### Detail Levels
- `brief` — id/type/tags/score only (1-2 tokens per result, good for scanning)
- `summary` — 8 fields including content preview (default, balanced)
- `full` — all FSRS state, timestamps, embedding info (for debugging/analysis)

---

## Memory Hygiene

**Promote** when user confirms helpful, solution worked, info was accurate.
**Demote** when user corrects mistake, info was wrong, led to bad outcome.
**Never save:** secrets, API keys, passwords, temporary debugging state, trivial info.

---
@ -216,11 +307,14 @@ Memory is retrieval. Searching strengthens memory. Search liberally, save aggres

## Development

- **Crate:** `vestige-mcp` v2.0.1, Rust 2024 edition, MSRV 1.91
- **Tests:** 1,238 (352 unit + 192 E2E + cognitive + journey + extreme), zero warnings
- **Build:** `cargo build --release -p vestige-mcp` (features: `embeddings` + `vector-search`)
- **Build (no embeddings):** `cargo build --release -p vestige-mcp --no-default-features`
- **Bench:** `cargo bench -p vestige-core`
- **Architecture:** `McpServer` → `Arc<Storage>` + `Arc<Mutex<CognitiveEngine>>`
- **Storage:** SQLite WAL mode, `Mutex<Connection>` reader/writer split, FTS5 full-text search
- **Embeddings:** nomic-embed-text-v1.5 (768D, 8K context) via fastembed (local ONNX, no API)
- **Vector index:** USearch HNSW (20x faster than FAISS)
- **Binaries:** `vestige-mcp` (MCP server), `vestige` (CLI), `vestige-restore`
- **Dashboard:** SvelteKit 2 + Svelte 5 + Three.js + Tailwind 4, embedded at `/dashboard`
- **Env vars:** `VESTIGE_DASHBOARD_PORT` (default 3927), `VESTIGE_CONSOLIDATION_INTERVAL_HOURS` (default 6), `RUST_LOG`

Cargo.lock (generated, 5 changes)
@ -4494,7 +4494,7 @@ checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a"
[[package]]
name = "vestige-core"
version = "2.0.2"
dependencies = [
 "chrono",
 "criterion",
@ -4529,7 +4529,7 @@ dependencies = [
[[package]]
name = "vestige-mcp"
version = "2.0.2"
dependencies = [
 "anyhow",
 "axum",
@ -4544,6 +4544,7 @@ dependencies = [
 "rusqlite",
 "serde",
 "serde_json",
 "subtle",
 "tempfile",
 "thiserror 2.0.18",
 "tokio",
@ -1,6 +1,6 @@
[package]
name = "vestige-core"
version = "2.0.2"
edition = "2024"
rust-version = "1.91"
authors = ["Vestige Team"]
@ -1,6 +1,6 @@
[package]
name = "vestige-mcp"
version = "2.0.2"
edition = "2024"
description = "Cognitive memory MCP server for Claude - FSRS-6, spreading activation, synaptic tagging, 3D dashboard, and 130 years of memory research"
authors = ["samvallad33"]
@ -32,7 +32,7 @@ path = "src/bin/cli.rs"
# ============================================================================
# Includes: FSRS-6, spreading activation, synaptic tagging, hippocampal indexing,
# memory states, context memory, importance signals, dreams, and more
vestige-core = { version = "2.0.2", path = "../vestige-core", default-features = false, features = ["bundled-sqlite"] }

# ============================================================================
# MCP Server Dependencies
@ -50,6 +50,9 @@ chrono = { version = "0.4", features = ["serde"] }
# UUID
uuid = { version = "1", features = ["v4", "serde"] }

# Constant-time comparison for auth tokens (prevents timing side-channels)
subtle = "2"

# Error handling
thiserror = "2"
anyhow = "1"
@ -4,6 +4,7 @@
use std::io::{BufWriter, Write};
use std::path::PathBuf;
use std::sync::Arc;

use chrono::{NaiveDate, Utc};
use clap::{Parser, Subcommand};
@ -109,6 +110,19 @@ enum Commands {
        #[arg(long)]
        source: Option<String>,
    },

    /// Start standalone HTTP MCP server (no stdio, for remote access)
    Serve {
        /// HTTP transport port
        #[arg(long, default_value = "3928")]
        port: u16,
        /// Also start the dashboard
        #[arg(long)]
        dashboard: bool,
        /// Dashboard port
        #[arg(long, default_value = "3927")]
        dashboard_port: u16,
    },
}

fn main() -> anyhow::Result<()> {
@@ -139,6 +153,11 @@ fn main() -> anyhow::Result<()> {
             node_type,
             source,
         } => run_ingest(content, tags, node_type, source),
+        Commands::Serve {
+            port,
+            dashboard,
+            dashboard_port,
+        } => run_serve(port, dashboard, dashboard_port),
     }
 }
@@ -967,6 +986,82 @@ fn run_dashboard(port: u16, open_browser: bool) -> anyhow::Result<()> {
     })
 }

+/// Start standalone HTTP MCP server (no stdio transport)
+fn run_serve(port: u16, with_dashboard: bool, dashboard_port: u16) -> anyhow::Result<()> {
+    use vestige_mcp::cognitive::CognitiveEngine;
+
+    println!("{}", "=== Vestige HTTP Server ===".cyan().bold());
+    println!();
+
+    let storage = Storage::new(None)?;
+
+    #[cfg(feature = "embeddings")]
+    {
+        if let Err(e) = storage.init_embeddings() {
+            println!(
+                "  {} Embeddings unavailable: {} (search will use keyword-only)",
+                "!".yellow(),
+                e
+            );
+        }
+    }
+
+    let storage = Arc::new(storage);
+
+    let rt = tokio::runtime::Runtime::new()?;
+    rt.block_on(async move {
+        let cognitive = Arc::new(tokio::sync::Mutex::new(CognitiveEngine::new()));
+        {
+            let mut cog = cognitive.lock().await;
+            cog.hydrate(&storage);
+        }
+
+        let (event_tx, _) =
+            tokio::sync::broadcast::channel::<vestige_mcp::dashboard::events::VestigeEvent>(1024);
+
+        // Optionally start dashboard
+        if with_dashboard {
+            let ds = Arc::clone(&storage);
+            let dc = Arc::clone(&cognitive);
+            let dtx = event_tx.clone();
+            tokio::spawn(async move {
+                match vestige_mcp::dashboard::start_background_with_event_tx(ds, Some(dc), dtx, dashboard_port).await {
+                    Ok(_) => println!("  {} Dashboard: http://127.0.0.1:{}", ">".cyan(), dashboard_port),
+                    Err(e) => eprintln!("  {} Dashboard failed: {}", "!".yellow(), e),
+                }
+            });
+        }
+
+        // Get auth token
+        let token = vestige_mcp::protocol::auth::get_or_create_auth_token()
+            .map_err(|e| anyhow::anyhow!("Failed to create auth token: {}", e))?;
+
+        let bind = std::env::var("VESTIGE_HTTP_BIND").unwrap_or_else(|_| "127.0.0.1".to_string());
+        println!("  {} HTTP transport: http://{}:{}/mcp", ">".cyan(), bind, port);
+        println!("  {} Auth token: {}...", ">".cyan(), &token[..8]);
+        println!();
+        println!("{}", "Press Ctrl+C to stop.".dimmed());
+
+        // Start HTTP transport (spawns the server task; no stdio)
+        vestige_mcp::protocol::http::start_http_transport(
+            Arc::clone(&storage),
+            Arc::clone(&cognitive),
+            event_tx,
+            token,
+            port,
+        )
+        .await
+        .map_err(|e| anyhow::anyhow!("HTTP transport failed: {}", e))?;
+
+        // Keep the process alive (the HTTP server runs in a spawned task)
+        tokio::signal::ctrl_c().await.ok();
+        println!();
+        println!("{}", "Shutting down...".dimmed());
+
+        Ok(())
+    })
+}
+
 /// Truncate a string for display (UTF-8 safe)
 fn truncate(s: &str, max_chars: usize) -> String {
     let s = s.replace('\n', " ");
@@ -4,3 +4,7 @@
 pub mod cognitive;
 pub mod dashboard;
+pub mod protocol;
+pub mod resources;
+pub mod server;
+pub mod tools;
@@ -27,12 +27,9 @@
 //! - Reconsolidation (memories editable on retrieval)
 //! - Memory Chains (reasoning paths)

 // cognitive is exported from lib.rs for dashboard access
 use vestige_mcp::cognitive;
-mod protocol;
-mod resources;
-mod server;
-mod tools;
+use vestige_mcp::protocol;
+use vestige_mcp::server;

 use std::io;
 use std::path::PathBuf;
@@ -44,15 +41,24 @@ use tracing_subscriber::EnvFilter;
 // Use vestige-core for the cognitive science engine
 use vestige_core::Storage;

-use crate::protocol::stdio::StdioTransport;
-use crate::server::McpServer;
+use protocol::stdio::StdioTransport;
+use server::McpServer;

-/// Parse command-line arguments and return the optional data directory path.
-/// Returns `None` for the path if no `--data-dir` was specified.
-fn parse_args() -> Option<PathBuf> {
+/// Parsed CLI configuration.
+struct Config {
+    data_dir: Option<PathBuf>,
+    http_port: u16,
+}
+
+/// Parse command-line arguments into a `Config`.
+/// Exits the process if `--help` or `--version` is requested.
+fn parse_args() -> Config {
     let args: Vec<String> = std::env::args().collect();
     let mut data_dir: Option<PathBuf> = None;
+    let mut http_port: u16 = std::env::var("VESTIGE_HTTP_PORT")
+        .ok()
+        .and_then(|s| s.parse().ok())
+        .unwrap_or(3928);
     let mut i = 1;

     while i < args.len() {
@@ -69,13 +75,18 @@ fn parse_args() -> Option<PathBuf> {
         println!("  -h, --help         Print help information");
         println!("  -V, --version      Print version information");
         println!("  --data-dir <PATH>  Custom data directory");
+        println!("  --http-port <PORT> HTTP transport port (default: 3928)");
         println!();
         println!("ENVIRONMENT:");
-        println!("  RUST_LOG  Log level filter (e.g., debug, info, warn, error)");
+        println!("  RUST_LOG                Log level filter (e.g., debug, info, warn, error)");
+        println!("  VESTIGE_AUTH_TOKEN      Override the bearer token for HTTP transport");
+        println!("  VESTIGE_HTTP_PORT       HTTP transport port (default: 3928)");
+        println!("  VESTIGE_DASHBOARD_PORT  Dashboard port (default: 3927)");
         println!();
         println!("EXAMPLES:");
         println!("  vestige-mcp");
         println!("  vestige-mcp --data-dir /custom/path");
+        println!("  vestige-mcp --http-port 8080");
         println!("  RUST_LOG=debug vestige-mcp");
         std::process::exit(0);
     }
@@ -102,6 +113,31 @@ fn parse_args() -> Option<PathBuf> {
                 }
                 data_dir = Some(PathBuf::from(path));
             }
+            "--http-port" => {
+                i += 1;
+                if i >= args.len() {
+                    eprintln!("error: --http-port requires a port number");
+                    eprintln!("Usage: vestige-mcp --http-port <PORT>");
+                    std::process::exit(1);
+                }
+                http_port = match args[i].parse() {
+                    Ok(p) => p,
+                    Err(_) => {
+                        eprintln!("error: invalid port number '{}'", args[i]);
+                        std::process::exit(1);
+                    }
+                };
+            }
+            arg if arg.starts_with("--http-port=") => {
+                let val = arg.strip_prefix("--http-port=").unwrap_or("");
+                http_port = match val.parse() {
+                    Ok(p) => p,
+                    Err(_) => {
+                        eprintln!("error: invalid port number '{}'", val);
+                        std::process::exit(1);
+                    }
+                };
+            }
             arg => {
                 eprintln!("error: unknown argument '{}'", arg);
                 eprintln!("Usage: vestige-mcp [OPTIONS]");
@@ -112,13 +148,13 @@ fn parse_args() -> Option<PathBuf> {
         i += 1;
     }

-    data_dir
+    Config { data_dir, http_port }
 }

 #[tokio::main]
 async fn main() {
     // Parse CLI arguments first (before logging init, so --help/--version work cleanly)
-    let data_dir = parse_args();
+    let config = parse_args();

     // Initialize logging to stderr (stdout is for JSON-RPC)
     tracing_subscriber::fmt()
@@ -134,7 +170,7 @@ async fn main() {
     info!("Vestige MCP Server v{} starting...", env!("CARGO_PKG_VERSION"));

     // Initialize storage with optional custom data directory
-    let storage = match Storage::new(data_dir) {
+    let storage = match Storage::new(config.data_dir) {
         Ok(s) => {
             info!("Storage initialized successfully");
@@ -260,6 +296,38 @@ async fn main() {
         });
     }

+    // Start HTTP MCP transport (Streamable HTTP for Claude.ai / remote clients)
+    {
+        let http_storage = Arc::clone(&storage);
+        let http_cognitive = Arc::clone(&cognitive);
+        let http_event_tx = event_tx.clone();
+        let http_port = config.http_port;
+
+        match protocol::auth::get_or_create_auth_token() {
+            Ok(token) => {
+                let bind = std::env::var("VESTIGE_HTTP_BIND").unwrap_or_else(|_| "127.0.0.1".to_string());
+                eprintln!("Vestige HTTP transport: http://{}:{}/mcp", bind, http_port);
+                eprintln!("Auth token: {}...", &token[..8]);
+                tokio::spawn(async move {
+                    if let Err(e) = protocol::http::start_http_transport(
+                        http_storage,
+                        http_cognitive,
+                        http_event_tx,
+                        token,
+                        http_port,
+                    )
+                    .await
+                    {
+                        warn!("HTTP transport failed to start: {}", e);
+                    }
+                });
+            }
+            Err(e) => {
+                warn!("Could not create auth token, HTTP transport disabled: {}", e);
+            }
+        }
+    }
+
     // Load cross-encoder reranker in the background (downloads ~150MB on first run)
     #[cfg(feature = "embeddings")]
     {
crates/vestige-mcp/src/protocol/auth.rs (new file, 104 lines)
@@ -0,0 +1,104 @@
+//! Bearer token authentication for the HTTP transport.
+//!
+//! Token priority:
+//! 1. `VESTIGE_AUTH_TOKEN` env var (override)
+//! 2. Read from `<data_dir>/auth_token` file
+//! 3. Generate `uuid::Uuid::new_v4()`, write to file with 0o600 permissions
+//!
+//! Security: The token file is created with restricted permissions from the
+//! start (via OpenOptionsExt on Unix) to prevent a TOCTOU race where another
+//! process could read the token before permissions are set.
+
+use std::fs;
+use std::path::PathBuf;
+
+use directories::ProjectDirs;
+use tracing::{info, warn};
+
+/// Minimum recommended token length when provided via env var.
+const MIN_TOKEN_LENGTH: usize = 32;
+
+/// Return the auth token file path inside the Vestige data directory.
+fn token_path() -> Result<PathBuf, Box<dyn std::error::Error>> {
+    let dirs = ProjectDirs::from("com", "vestige", "core")
+        .ok_or("could not determine project directories")?;
+    Ok(dirs.data_dir().join("auth_token"))
+}
+
+/// Get (or create) the bearer token used for HTTP transport authentication.
+///
+/// Priority:
+/// 1. `VESTIGE_AUTH_TOKEN` environment variable
+/// 2. Existing `auth_token` file in the data directory
+/// 3. Newly generated UUID v4, persisted to file
+pub fn get_or_create_auth_token() -> Result<String, Box<dyn std::error::Error>> {
+    // 1. Env var override
+    if let Ok(token) = std::env::var("VESTIGE_AUTH_TOKEN") {
+        let token = token.trim().to_string();
+        if !token.is_empty() {
+            if token.len() < MIN_TOKEN_LENGTH {
+                warn!(
+                    "VESTIGE_AUTH_TOKEN is only {} chars (recommended >= {}). \
+                     Short tokens are vulnerable to brute-force attacks.",
+                    token.len(),
+                    MIN_TOKEN_LENGTH
+                );
+            }
+            info!("Using auth token from VESTIGE_AUTH_TOKEN env var");
+            return Ok(token);
+        }
+    }
+
+    let path = token_path()?;
+
+    // 2. Read existing file
+    if path.exists() {
+        let token = fs::read_to_string(&path)?.trim().to_string();
+        if !token.is_empty() {
+            info!("Using auth token from {}", path.display());
+            return Ok(token);
+        }
+    }
+
+    // 3. Generate new token and persist
+    let token = uuid::Uuid::new_v4().to_string();
+
+    // Ensure parent directory exists with restricted permissions
+    if let Some(parent) = path.parent() {
+        fs::create_dir_all(parent)?;
+
+        // Restrict parent directory permissions on Unix (owner only)
+        #[cfg(unix)]
+        {
+            use std::os::unix::fs::PermissionsExt;
+            let _ = fs::set_permissions(parent, fs::Permissions::from_mode(0o700));
+        }
+    }
+
+    // Write token file with restricted permissions from the start.
+    // On Unix, we use OpenOptionsExt to set mode 0o600 at creation time,
+    // avoiding the TOCTOU race of write-then-chmod.
+    #[cfg(unix)]
+    {
+        use std::io::Write;
+        use std::os::unix::fs::OpenOptionsExt;
+
+        let mut file = fs::OpenOptions::new()
+            .write(true)
+            .create(true)
+            .truncate(true)
+            .mode(0o600) // Owner read/write only — set at creation, no race window
+            .open(&path)?;
+        file.write_all(token.as_bytes())?;
+        file.sync_all()?;
+    }
+
+    // On non-Unix (Windows), fall back to regular write (Windows ACLs are different)
+    #[cfg(not(unix))]
+    {
+        fs::write(&path, &token)?;
+    }
+
+    info!("Generated new auth token at {}", path.display());
+    Ok(token)
+}
crates/vestige-mcp/src/protocol/http.rs (new file, 308 lines)
@@ -0,0 +1,308 @@
+//! Streamable HTTP transport for MCP.
+//!
+//! Implements the MCP Streamable HTTP transport specification:
+//! - `POST /mcp` — JSON-RPC endpoint (initialize, tools/call, etc.)
+//! - `DELETE /mcp` — session cleanup
+//!
+//! Each client gets a per-session `McpServer` instance (owns `initialized` state).
+//! Shared state (Storage, CognitiveEngine, event bus) is shared across sessions.

+use std::collections::HashMap;
+use std::sync::Arc;
+use std::time::{Duration, Instant};
+
+use axum::extract::{DefaultBodyLimit, State};
+use axum::http::{HeaderMap, StatusCode};
+use axum::response::IntoResponse;
+use axum::routing::{delete, post};
+use axum::{Json, Router};
+use subtle::ConstantTimeEq;
+use tokio::sync::{broadcast, Mutex, RwLock};
+use tower::ServiceBuilder;
+use tower::limit::ConcurrencyLimitLayer;
+use tower_http::cors::CorsLayer;
+use tracing::{info, warn};
+
+use crate::cognitive::CognitiveEngine;
+use crate::protocol::types::JsonRpcRequest;
+use crate::server::McpServer;
+use vestige_core::Storage;
+use crate::dashboard::events::VestigeEvent;
+
+/// Maximum concurrent sessions.
+const MAX_SESSIONS: usize = 100;
+
+/// Sessions idle longer than this are reaped.
+const SESSION_TIMEOUT: Duration = Duration::from_secs(30 * 60);
+
+/// How often the reaper task runs.
+const REAPER_INTERVAL: Duration = Duration::from_secs(5 * 60);
+
+/// Concurrency limit for the tower middleware.
+const CONCURRENCY_LIMIT: usize = 50;
+
+/// Maximum request body size (256 KB — JSON-RPC requests should be small).
+const MAX_BODY_SIZE: usize = 256 * 1024;
+
+/// A per-client session holding its own McpServer instance.
+struct Session {
+    server: McpServer,
+    last_active: Instant,
+}
+
+/// Shared state cloned into every axum handler.
+#[derive(Clone)]
+pub struct HttpTransportState {
+    sessions: Arc<RwLock<HashMap<String, Arc<Mutex<Session>>>>>,
+    storage: Arc<Storage>,
+    cognitive: Arc<Mutex<CognitiveEngine>>,
+    event_tx: broadcast::Sender<VestigeEvent>,
+    auth_token: String,
+}
+
+/// Start the HTTP MCP transport on `127.0.0.1:<port>`.
+///
+/// This function spawns a background tokio task and returns immediately.
+pub async fn start_http_transport(
+    storage: Arc<Storage>,
+    cognitive: Arc<Mutex<CognitiveEngine>>,
+    event_tx: broadcast::Sender<VestigeEvent>,
+    auth_token: String,
+    port: u16,
+) -> Result<(), Box<dyn std::error::Error>> {
+    let state = HttpTransportState {
+        sessions: Arc::new(RwLock::new(HashMap::new())),
+        storage,
+        cognitive,
+        event_tx,
+        auth_token,
+    };
+
+    // Spawn session reaper
+    {
+        let sessions = Arc::clone(&state.sessions);
+        tokio::spawn(async move {
+            loop {
+                tokio::time::sleep(REAPER_INTERVAL).await;
+                let mut map = sessions.write().await;
+                let before = map.len();
+                map.retain(|_id, session| {
+                    // Try to check last_active without blocking; skip if locked
+                    match session.try_lock() {
+                        Ok(s) => s.last_active.elapsed() < SESSION_TIMEOUT,
+                        Err(_) => true, // in-use, keep
+                    }
+                });
+                let removed = before - map.len();
+                if removed > 0 {
+                    info!("Session reaper: removed {} idle sessions ({} active)", removed, map.len());
+                }
+            }
+        });
+    }
+
+    let app = Router::new()
+        .route("/mcp", post(post_mcp))
+        .route("/mcp", delete(delete_mcp))
+        .layer(
+            ServiceBuilder::new()
+                .layer(DefaultBodyLimit::max(MAX_BODY_SIZE))
+                .layer(ConcurrencyLimitLayer::new(CONCURRENCY_LIMIT))
+                .layer(CorsLayer::permissive()),
+        )
+        .with_state(state);
+
+    // Bind to localhost only — use VESTIGE_HTTP_BIND=0.0.0.0 for remote access
+    let bind_addr: std::net::IpAddr = std::env::var("VESTIGE_HTTP_BIND")
+        .ok()
+        .and_then(|s| s.parse().ok())
+        .unwrap_or_else(|| std::net::IpAddr::V4(std::net::Ipv4Addr::LOCALHOST));
+
+    let addr = std::net::SocketAddr::from((bind_addr, port));
+    let listener = tokio::net::TcpListener::bind(addr).await?;
+    info!("HTTP MCP transport listening on http://{}/mcp", addr);
+
+    tokio::spawn(async move {
+        if let Err(e) = axum::serve(listener, app).await {
+            warn!("HTTP transport error: {}", e);
+        }
+    });
+
+    Ok(())
+}
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+
+/// Validate the `Authorization: Bearer <token>` header using constant-time
+/// comparison to prevent timing side-channel attacks.
+fn validate_auth(headers: &HeaderMap, expected: &str) -> Result<(), (StatusCode, &'static str)> {
+    let header = headers
+        .get("authorization")
+        .and_then(|v| v.to_str().ok())
+        .ok_or((StatusCode::UNAUTHORIZED, "Missing Authorization header"))?;
+
+    let token = header
+        .strip_prefix("Bearer ")
+        .ok_or((StatusCode::UNAUTHORIZED, "Invalid Authorization scheme (expected Bearer)"))?;
+
+    // Constant-time comparison: prevents timing side-channel attacks.
+    // We first check lengths match (length itself is not secret since UUIDs
+    // have a fixed public format), then compare bytes in constant time.
+    let token_bytes = token.as_bytes();
+    let expected_bytes = expected.as_bytes();
+
+    if token_bytes.len() != expected_bytes.len()
+        || token_bytes.ct_eq(expected_bytes).unwrap_u8() != 1
+    {
+        return Err((StatusCode::FORBIDDEN, "Invalid auth token"));
+    }
+
+    Ok(())
+}
+
+/// Extract and validate the `Mcp-Session-Id` header value.
+///
+/// Only accepts valid UUID v4 format (8-4-4-4-12 hex) to prevent header
+/// injection and ensure session IDs match server-generated format.
+fn session_id_from_headers(headers: &HeaderMap) -> Option<String> {
+    headers
+        .get("mcp-session-id")
+        .and_then(|v| v.to_str().ok())
+        .filter(|s| uuid::Uuid::parse_str(s).is_ok())
+        .map(|s| s.to_string())
+}
+
+// ---------------------------------------------------------------------------
+// Handlers
+// ---------------------------------------------------------------------------
+
+/// `POST /mcp` — main JSON-RPC handler.
+async fn post_mcp(
+    State(state): State<HttpTransportState>,
+    headers: HeaderMap,
+    Json(request): Json<JsonRpcRequest>,
+) -> impl IntoResponse {
+    // Auth check
+    if let Err((status, msg)) = validate_auth(&headers, &state.auth_token) {
+        return (status, HeaderMap::new(), msg.to_string()).into_response();
+    }
+
+    let is_initialize = request.method == "initialize";
+
+    if is_initialize {
+        // ── New session ──
+        // Take write lock immediately to avoid TOCTOU race on MAX_SESSIONS check.
+        let mut sessions = state.sessions.write().await;
+        if sessions.len() >= MAX_SESSIONS {
+            return (
+                StatusCode::SERVICE_UNAVAILABLE,
+                "Too many active sessions",
+            )
+                .into_response();
+        }
+
+        let server = McpServer::new_with_events(
+            Arc::clone(&state.storage),
+            Arc::clone(&state.cognitive),
+            state.event_tx.clone(),
+        );
+
+        let session_id = uuid::Uuid::new_v4().to_string();
+
+        let session = Arc::new(Mutex::new(Session {
+            server,
+            last_active: Instant::now(),
+        }));
+
+        // Handle the initialize request
+        let response = {
+            let mut sess = session.lock().await;
+            sess.server.handle_request(request).await
+        };
+
+        // Insert session while still holding write lock — atomic check-and-insert
+        sessions.insert(session_id.clone(), session);
+        drop(sessions);
+
+        match response {
+            Some(resp) => {
+                let mut resp_headers = HeaderMap::new();
+                resp_headers.insert("mcp-session-id", session_id.parse().unwrap());
+                (StatusCode::OK, resp_headers, Json(resp)).into_response()
+            }
+            None => {
+                // Notifications return 202
+                let mut resp_headers = HeaderMap::new();
+                resp_headers.insert("mcp-session-id", session_id.parse().unwrap());
+                (StatusCode::ACCEPTED, resp_headers).into_response()
+            }
+        }
+    } else {
+        // ── Existing session ──
+        let session_id = match session_id_from_headers(&headers) {
+            Some(id) => id,
+            None => {
+                return (
+                    StatusCode::BAD_REQUEST,
+                    "Missing or invalid Mcp-Session-Id header",
+                )
+                    .into_response();
+            }
+        };
+
+        let session = {
+            let sessions = state.sessions.read().await;
+            sessions.get(&session_id).cloned()
+        };
+
+        let session = match session {
+            Some(s) => s,
+            None => {
+                return (
+                    StatusCode::NOT_FOUND,
+                    "Session not found or expired",
+                )
+                    .into_response();
+            }
+        };
+
+        let response = {
+            let mut sess = session.lock().await;
+            sess.last_active = Instant::now();
+            sess.server.handle_request(request).await
+        };
+
+        let mut resp_headers = HeaderMap::new();
+        resp_headers.insert("mcp-session-id", session_id.parse().unwrap());
+
+        match response {
+            Some(resp) => (StatusCode::OK, resp_headers, Json(resp)).into_response(),
+            None => (StatusCode::ACCEPTED, resp_headers).into_response(),
+        }
+    }
+}
+
+/// `DELETE /mcp` — explicit session cleanup.
+async fn delete_mcp(
+    State(state): State<HttpTransportState>,
+    headers: HeaderMap,
+) -> impl IntoResponse {
+    if let Err((status, msg)) = validate_auth(&headers, &state.auth_token) {
+        return (status, msg).into_response();
+    }
+
+    let session_id = match session_id_from_headers(&headers) {
+        Some(id) => id,
+        None => return (StatusCode::BAD_REQUEST, "Missing or invalid Mcp-Session-Id header").into_response(),
+    };
+
+    let mut sessions = state.sessions.write().await;
+    if sessions.remove(&session_id).is_some() {
+        info!("Session {} deleted via DELETE /mcp", &session_id[..8]);
+        (StatusCode::OK, "Session deleted").into_response()
+    } else {
+        (StatusCode::NOT_FOUND, "Session not found").into_response()
+    }
+}
@@ -2,6 +2,8 @@
 //!
 //! JSON-RPC 2.0 over stdio for the Model Context Protocol.

+pub mod auth;
+pub mod http;
 pub mod messages;
 pub mod stdio;
 pub mod types;
@@ -11,7 +11,7 @@ use tokio::sync::{broadcast, Mutex}
 use tracing::{debug, info, warn};

 use crate::cognitive::CognitiveEngine;
-use vestige_mcp::dashboard::events::VestigeEvent;
+use crate::dashboard::events::VestigeEvent;
 use crate::protocol::messages::{
     CallToolRequest, CallToolResult, InitializeRequest, InitializeResult,
     ListResourcesResult, ListToolsResult, ReadResourceRequest, ReadResourceResult,