The Stage 8 `recommended` selector and the evidence sort both rank by
FSRS-6 trust only, discarding the `combined_score` signal that the
upstream hybrid_search + cross-encoder reranker just computed. Confidence
is then derived from `recommended.trust + evidence_count`, neither of
which moves with the query — so any query against the same corpus
returns the same primary memory and the same confidence score.
Empirical reproduction (15 deep_reference probes against an 11-memory
corpus, 9 with a unique correct answer + 6 with no relevant memories):
- Distinct primary memories returned : 1 / 15
- Confidence values returned : 1 distinct (0.82 for all)
- Ground-truth accuracy on specific queries : 1 / 9 (11.1%)
The single hit is coincidental: the always-returned memory happened to
be the correct answer for one query. Random guessing across the 11-memory
corpus has a baseline of about 9% (1/11), so the tool is performing at
chance level.
Fix
---
Replace trust-only ranking at three sites with a 50/50 composite of
combined_score (query relevance) and FSRS-6 trust:
    let composite = |s: &ScoredMemory| s.combined_score as f64 * 0.5 + s.trust * 0.5;
Used in:
- cross_reference.rs:573 — `recommended` max_by
- cross_reference.rs:589 — `non_superseded` evidence sort_by
- cross_reference.rs:622 — `base_confidence` formula
The 50/50 weighting is a design choice — see PR body for the knob to
tweak if a different blend is preferred. The pre-existing updated_at
tiebreaker is preserved.
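
A minimal sketch of how the composite could drive the `recommended`
selection. The `ScoredMemory` struct here is a hypothetical stand-in: its
field types and the `updated_at` timestamp representation are assumptions,
not the crate's actual definitions.

```rust
use std::cmp::Ordering;

// Hypothetical stand-in for the crate's ScoredMemory; field types are assumed.
#[derive(Debug, Clone)]
struct ScoredMemory {
    id: &'static str,
    combined_score: f32, // query relevance from hybrid search + reranker
    trust: f64,          // FSRS-6 trust
    updated_at: i64,     // assumed: Unix timestamp used as the tiebreaker
}

// 50/50 blend of query relevance and trust, as described above.
fn composite(s: &ScoredMemory) -> f64 {
    s.combined_score as f64 * 0.5 + s.trust * 0.5
}

// Highest composite wins; exact ties fall back to the most recent update.
fn recommended(memories: &[ScoredMemory]) -> Option<&ScoredMemory> {
    memories.iter().max_by(|a, b| {
        composite(a)
            .partial_cmp(&composite(b))
            .unwrap_or(Ordering::Equal)
            .then_with(|| a.updated_at.cmp(&b.updated_at))
    })
}
```

With trust-only ranking, a high-trust off-topic memory would always win;
under the blend, a memory with lower trust but much higher relevance is
selected instead.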
Tests
-----
Two regression tests, both verified to FAIL on `main` and PASS with the
fix via negative control (temporarily set the composite weights to
1.0 trust + 0.0 relevance and confirmed both tests fail again):
- test_recommended_uses_query_relevance_not_just_trust
Two-memory corpus, ingested in order so the off-topic memory wins
the trust tiebreaker. Query targets the on-topic memory. The fix
ensures `recommended` is the on-topic one.
- test_confidence_varies_with_query_relevance
Single-memory corpus. Identical execute() calls with a relevant
query and an irrelevant query. The fix ensures the relevant
query produces higher confidence.
Full crate suite: 410 / 410 passing (was 408 + 2 new).
Out of scope
------------
While running the live MCP probes I observed two further inconsistencies
in `cross_reference.rs` that I cannot reproduce in cargo test (the
synthetic test environment with mock embeddings does not trigger the
required combined_score > 0.2 floor condition):
- The `effective_sim` floor at line 551 fabricates contradictions
between memories with no real topical overlap when one contains a
CORRECTION_SIGNALS keyword.
- The Stage 5 `contradictions` field (strict) and the Stage 7
`pair_relations` feeding the reasoning text (loose, post-floor)
disagree, producing responses where `reasoning` claims N
contradictions while `contradictions` is empty and `status` is
"resolved".
I have empirical data for both from live MCP usage but no reproducible
cargo test, so they are intentionally not addressed in this PR. Happy to
file them as a separate issue with the raw probe data if useful.
Collapsed nested if statements into single conditions using
let-chains (if a && let Ok(b) = ...). Fixes CI clippy failures
on both macOS and Ubuntu.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The public JSON schema in schema() declares `in_minutes` and `file_pattern`
in snake_case, but TriggerSpec uses `#[serde(rename_all = "camelCase")]`
which makes serde expect `inMinutes` / `filePattern`. Snake_case inputs are
silently dropped to None, so time-based intentions with `in_minutes` never
fire (triggerAt becomes null) and file_pattern-only context intentions
never match.
Added `#[serde(alias = ...)]` so both naming conventions deserialize
correctly — purely additive, existing camelCase callers unaffected.
Two regression tests added, verified to FAIL without the aliases
(negative control confirmed the snake_case duration test sees
`triggerAt: null` and the file_pattern test sees an empty `triggered`
array). Both pass with the fix. Full crate suite: 408/408 passing.
Related to #25 (Bug #8 was half-fixed — check-side re-derivation works,
but the set-side was still dropping the value before it could be persisted).
- Removed vestige-agent and vestige-agent-py from workspace members
(ARC-AGI-3 code, not part of Vestige release — caused CI failure)
- Improved deep_reference reasoning chain: fuller output with arrows on
supersession reasoning, longer primary finding preview, fallback message
when no relations found, boosted relation detection for search results
with high combined_score
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Handle 'vestige/memory://' and 'vestige/codebase://' URIs by stripping the
provider prefix before scheme matching. This fixes compatibility with
MCP clients like OpenCode that prepend the provider name to resource URIs.
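
The prefix stripping might look roughly like the sketch below.
`strip_provider_prefix` is a hypothetical name, and the rule "drop
everything up to the last slash before `://`" is an assumption about how
the prefix is recognized.

```rust
/// Strips a provider prefix such as `vestige/` that precedes the URI scheme.
/// Hypothetical helper; not the crate's actual function.
fn strip_provider_prefix(uri: &str) -> &str {
    match uri.find("://") {
        // Look for a '/' inside the part that should be a bare scheme.
        Some(scheme_end) => match uri[..scheme_end].rfind('/') {
            Some(slash) => &uri[slash + 1..],
            None => uri, // already a plain `scheme://...` URI
        },
        None => uri, // no scheme separator: leave untouched
    }
}
```

Unprefixed URIs pass through unchanged, so existing clients are not
affected by the extra step.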
Fixes #19
When memories are created, promoted, deleted, or dreamed via MCP tools,
the 3D graph now shows spectacular live animations:
- Rainbow particle burst + elastic scale-up on MemoryCreated
- Ripple wave cascading to nearby nodes
- Green pulse + node growth on MemoryPromoted
- Implosion + dissolution on MemoryDeleted
- Edge growth animation on ConnectionDiscovered
- Purple cascade on DreamStarted/DreamProgress/DreamCompleted
- FIFO eviction at 50 live nodes to guard performance
Also: graph center defaults to most-connected node, legacy HTML
redirects to SvelteKit dashboard, CSS height chain fix in layout.
Testing: 150 unit tests (vitest), 11 e2e tests (Playwright with
MCP Streamable HTTP client), 22 proof screenshots.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a second transport layer alongside stdio — Streamable HTTP on port
3928. Enables Claude.ai, remote clients, and web integrations to connect
to Vestige over HTTP with per-session McpServer instances.
- POST /mcp (JSON-RPC) + DELETE /mcp (session cleanup)
- Bearer token auth with constant-time comparison (subtle crate)
- Auto-generated UUID v4 token persisted with 0o600 permissions
- Per-session McpServer instances with 30-min idle reaper
- 100 max sessions, 50 concurrency limit, 256KB body limit
- --http-port flag + VESTIGE_HTTP_PORT / VESTIGE_HTTP_BIND env vars
- Module exports moved from binary to lib.rs for reusability
- vestige CLI gains `serve` subcommand via shared lib
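
As an illustration of the idle reaper, here is a minimal sketch. The
session map shape (session id to last-activity `Instant`) and the
`reap_idle` name are assumptions; the real server stores per-session
McpServer instances.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// 30-minute idle limit, as listed above.
const MAX_IDLE: Duration = Duration::from_secs(30 * 60);

/// Removes sessions idle longer than `max_idle` and returns how many were dropped.
/// `now` is a parameter so the logic can be exercised without sleeping.
fn reap_idle(sessions: &mut HashMap<String, Instant>, now: Instant, max_idle: Duration) -> usize {
    let before = sessions.len();
    sessions.retain(|_, last_seen| now.saturating_duration_since(*last_seen) <= max_idle);
    before - sessions.len()
}
```

A background task could call `reap_idle(&mut sessions, Instant::now(), MAX_IDLE)`
on a fixed interval.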
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ort-sys v2.0.0-rc.11 has no prebuilt ONNX Runtime binaries for
x86_64-apple-darwin, and vestige-mcp requires embeddings to compile.
- Remove x86_64-apple-darwin from CI release matrix (Intel Macs; Apple
  began moving to Apple silicon in 2020)
- Fix vestige-mcp Cargo.toml: add default-features=false to vestige-core dep
- Extract sanitize_fts5_query to always-available fts.rs module
- Gate embeddings-only imports in storage/sqlite.rs behind #[cfg]
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Dashboard v2.1 "Nuclear" upgrade:
- Dark glassmorphism UI system (4-tier glass utilities, ambient orbs, nav glow)
- Graph3D decomposed from 806-line monolith into 10 focused modules
- Custom GLSL shaders (nebula FBM background, chromatic aberration, film grain, vignette)
- Enhanced dream mode with smooth 2s lerped transitions and aurora cycling
- Cognitive pipeline visualizer (7-stage search cascade animation)
- Temporal playback slider (scrub through memory evolution over time)
- Bioluminescent color palette for node types and events
Fix flaky CI test on macOS:
- vector::tests::test_add_and_search used near-identical test vectors (additive phase shift)
- Changed to multiplicative frequency so each seed produces a distinct vector
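
To illustrate why the additive variant was flaky, here is a sketch of both
generator styles. The formulas and constants are hypothetical
reconstructions, not the helpers used in `vector::tests`.

```rust
/// Additive phase shift: every seed yields almost the same waveform.
fn additive_vector(seed: usize, dim: usize) -> Vec<f64> {
    (0..dim)
        .map(|i| (i as f64 * 0.1 + seed as f64 * 0.01).sin())
        .collect()
}

/// Multiplicative frequency: each seed oscillates at its own rate.
fn multiplicative_vector(seed: usize, dim: usize) -> Vec<f64> {
    (0..dim)
        .map(|i| (i as f64 * 0.1 * (seed + 1) as f64).sin())
        .collect()
}

/// Cosine similarity between two equal-length vectors.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f64]| v.iter().map(|x| x * x).sum::<f64>().sqrt();
    dot / (norm(a) * norm(b))
}
```

With the additive form, neighbouring seeds are nearly parallel, so a
nearest-neighbour assertion can flip on small floating-point differences
between platforms. The multiplicative form keeps seeds well separated.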
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
explore_connections and memory_graph returned empty results because
the in-memory cognitive modules were never loaded from the database.
Connections were persisting to SQLite correctly (795 in production),
but the query path only checked the in-memory ActivationNetwork, which
was empty.
- Add CognitiveEngine::hydrate() to load connections at startup
- Add storage fallback in explore_connections associations
- Hydrate live engine after dream persists new connections
- Add error logging for save_connection failures
- Add 7 integration tests for the full round-trip
Closes #14
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Triggers 2 (force after 6h stale) and 3 (mini-consolidation after 2h)
fired immediately on fresh schedulers even when the user was active,
because they didn't check activity state. Added a MIN_BRIEF_IDLE_MINS
(5 min) guard so both triggers require a brief idle period before firing.
Fixes test_consolidation_idle_trigger in CI.
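
The guard can be pictured with the sketch below. Only the thresholds come
from this message; the `Trigger` enum, `pick_trigger`, and the two extra
constants are hypothetical names for illustration.

```rust
use std::time::Duration;

const MIN_BRIEF_IDLE_MINS: u64 = 5;
const FORCE_STALE_HOURS: u64 = 6; // assumed constant name for trigger 2
const MINI_STALE_HOURS: u64 = 2;  // assumed constant name for trigger 3

#[derive(Debug, PartialEq)]
enum Trigger {
    ForceStale,        // trigger 2
    MiniConsolidation, // trigger 3
}

fn pick_trigger(since_last_consolidation: Duration, idle: Duration) -> Option<Trigger> {
    // Guard: neither trigger fires while the user is still active.
    if idle < Duration::from_secs(MIN_BRIEF_IDLE_MINS * 60) {
        return None;
    }
    if since_last_consolidation >= Duration::from_secs(FORCE_STALE_HOURS * 3600) {
        Some(Trigger::ForceStale)
    } else if since_last_consolidation >= Duration::from_secs(MINI_STALE_HOURS * 3600) {
        Some(Trigger::MiniConsolidation)
    } else {
        None
    }
}
```

Without the idle check, a scheduler that is 7 hours stale would fire
trigger 2 even with zero idle time, which is the behaviour the test caught.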
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
3 new MCP tools (16 → 19 total):
- importance_score: 4-channel neuroscience importance scoring (novelty/arousal/reward/attention)
- session_checkpoint: batch smart_ingest up to 20 items with PE Gating
- find_duplicates: cosine similarity clustering with union-find for dedup
CLI: vestige ingest command for memory ingestion via command line
Core: made get_node_embedding public, added get_all_embeddings for dedup scanning
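
For readers unfamiliar with the technique, a minimal sketch of cosine
similarity clustering with union-find follows. All names and the pairwise
O(n^2) scan are assumptions for illustration; the find_duplicates tool may
be structured differently.

```rust
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 { 0.0 } else { dot / (norm_a * norm_b) }
}

struct UnionFind {
    parent: Vec<usize>,
}

impl UnionFind {
    fn new(n: usize) -> Self {
        Self { parent: (0..n).collect() }
    }

    fn find(&mut self, x: usize) -> usize {
        let p = self.parent[x];
        if p != x {
            let root = self.find(p);
            self.parent[x] = root; // path compression
        }
        self.parent[x]
    }

    fn union(&mut self, a: usize, b: usize) {
        let (ra, rb) = (self.find(a), self.find(b));
        if ra != rb {
            self.parent[rb] = ra;
        }
    }
}

/// Returns one cluster root per embedding; equal roots mean the same duplicate group.
fn cluster_duplicates(embeddings: &[Vec<f32>], threshold: f32) -> Vec<usize> {
    let mut uf = UnionFind::new(embeddings.len());
    for i in 0..embeddings.len() {
        for j in (i + 1)..embeddings.len() {
            if cosine_similarity(&embeddings[i], &embeddings[j]) >= threshold {
                uf.union(i, j);
            }
        }
    }
    (0..embeddings.len()).map(|i| uf.find(i)).collect()
}
```

Union-find makes the grouping transitive: if A matches B and B matches C,
all three land in one cluster even when A and C fall below the threshold.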
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add web dashboard (axum) on port 3927 with memory browser, search, and
system stats. New MCP tools: memory_timeline, memory_changelog,
health_check, consolidate, stats, backup, export, gc. Search now supports
detail_level (brief/summary/full) to control token usage. Add backup_to()
and get_recent_state_transitions() to storage layer. Bump to v1.2.0.
P0 fixes:
- Add `vestige backup <path>` — full DB copy with WAL checkpoint flush
- Add `vestige export --format json|jsonl [--tags] [--since] <path>` —
paginated memory export with tag/date filtering
- Add `vestige gc --min-retention 0.1 [--max-age-days] [--dry-run] [--yes]`
— bulk cleanup of stale memories with safety prompts
- Fix apply_decay() scaling: batched pagination (500 rows/batch) with
explicit transactions instead of loading all nodes into memory
- Fix hidden MCP resources: memory://insights and memory://consolidation-log
now listed in resources/list (were implemented but undiscoverable)
P1 fixes:
- Add auto-consolidation on server startup: FSRS-6 decay runs in background
after 2s delay, only if last consolidation was >6 hours ago
- Add encryption at rest via SQLCipher feature flag: use --features encryption
with VESTIGE_ENCRYPTION_KEY env var (bundled-sqlite and encryption are
mutually exclusive)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Route ingest tool through smart_ingest (Prediction Error Gating) to
prevent duplicate memories when content is similar to existing entries
- Fix Intel Mac release build: use macos-13 runner for x86_64-apple-darwin
(macos-latest is now ARM64, causing silent cross-compile failures)
- Sync npm package version to 1.1.2 (was 1.0.0 in package.json, 1.1.0
in postinstall.js BINARY_VERSION)
- Add vestige-restore to npm makeExecutable list
- Remove abandoned packages/core/ TypeScript package (pre-Rust implementation
referencing FSRS-5, chromadb, ollama — 32K lines of dead code)
- Sync workspace Cargo.toml version to 1.1.2
Closes #5
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously, fastembed created .fastembed_cache in the current working
directory, polluting project folders with symlinks.
Now uses platform-appropriate cache directories:
- macOS: ~/Library/Caches/com.vestige.core/fastembed
- Linux: ~/.cache/vestige/fastembed
- Windows: %LOCALAPPDATA%\vestige\cache\fastembed
Can still be overridden with FASTEMBED_CACHE_PATH env var.
Fixes user feedback about .fastembed_cache appearing in random folders.
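
The resolution order can be sketched as below. The function name and its
signature are hypothetical: `os` and `base` are passed in only to keep the
logic testable, with `base` standing for the home directory on macOS and
Linux and for %LOCALAPPDATA% on Windows.

```rust
use std::path::{Path, PathBuf};

/// Resolves the fastembed cache directory (illustrative helper, not the crate's API).
fn fastembed_cache_dir(os: &str, base: &Path, override_path: Option<&str>) -> PathBuf {
    // FASTEMBED_CACHE_PATH, when set, wins over the platform default.
    if let Some(p) = override_path {
        return PathBuf::from(p);
    }
    match os {
        "macos" => base.join("Library/Caches/com.vestige.core/fastembed"),
        "windows" => base.join("vestige").join("cache").join("fastembed"),
        _ => base.join(".cache/vestige/fastembed"),
    }
}
```

A caller could pass `std::env::consts::OS` for `os` and
`std::env::var("FASTEMBED_CACHE_PATH").ok().as_deref()` for the override.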
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add #![allow(dead_code)] to deprecated tool modules (kept for
backwards compatibility but not exposed in MCP tool list)
- Mark unused functions with #[allow(dead_code)] annotations
- Fix unused variable warnings (prefix with _)
- Apply clippy auto-fixes for redundant closures and derives
- Fix test to account for protocol version negotiation
- Reorganize tools/mod.rs to clarify active vs deprecated tools
Security review: LOW RISK - no critical vulnerabilities found
Dead code review: deprecated tools properly annotated
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix silent errors in stdio.rs: clients now receive fallback error
responses instead of hanging when JSON serialization fails
- Fix UTF-8 panics in keyword.rs: use char-aware slicing instead of
byte offsets for query sanitization and term highlighting
- Fix UTF-8 panics in prospective_memory.rs: replace hard-coded byte
offsets with char-aware slicing for natural language parsing
- Fix UTF-8 panics in git.rs: convert byte positions to char positions
before slicing commit messages
- Fix feature flag bug in vestige-mcp: add proper [features] section
to forward embeddings and vector-search features from vestige-core,
enabling the #[cfg(feature = "embeddings")] initialization code
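
The UTF-8 fixes share one pattern: find a byte index that sits on a
character boundary before slicing. A minimal sketch, with `truncate_chars`
as an illustrative name rather than a function from the crate:

```rust
/// Returns the first `max_chars` characters of `s` without splitting a code point.
fn truncate_chars(s: &str, max_chars: usize) -> &str {
    match s.char_indices().nth(max_chars) {
        // byte_idx is the start of the first character to drop, so it is a valid boundary.
        Some((byte_idx, _)) => &s[..byte_idx],
        // Fewer than max_chars characters: keep everything.
        None => s,
    }
}
```

By contrast, `&"héllo"[..2]` panics because byte 2 falls inside the
two-byte encoding of `é`.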
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When embeddings fail to initialize in the MCP server context (e.g., due
to working directory issues), the error was silently swallowed and
smart_ingest would fall back to regular ingest without explanation.
Changes:
- Add init_embeddings() method to Storage for explicit initialization
- Initialize embeddings at MCP server startup with error logging
- Add check_ready() method to EmbeddingService for error access
- Log warning when is_ready() returns false
Now users will see clear error messages like:
"Failed to initialize embedding service: ..."
"Hint: Check FASTEMBED_CACHE_PATH or ensure ~/.fastembed_cache exists"
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Claude Desktop requests protocol version 2025-06-18 but Vestige was
responding with 2025-11-25, causing Claude Desktop to disconnect.
Now the server negotiates: if the client requests an older version,
use it. This maintains backward compatibility with older clients.
Fixes: Claude Desktop "Server transport closed unexpectedly" error
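
The negotiation rule can be sketched as follows. `negotiate_version` and
the constant name are hypothetical, and the sketch assumes every older
requested version is acceptable; a real server would likely also check
the request against a list of supported versions.

```rust
// Latest protocol version the server speaks.
const SERVER_VERSION: &str = "2025-11-25";

/// Picks the older of the two versions. Protocol versions are ISO dates,
/// so plain string comparison orders them chronologically.
fn negotiate_version(requested: &str) -> &str {
    if requested < SERVER_VERSION {
        requested
    } else {
        SERVER_VERSION
    }
}
```

A client asking for 2025-06-18 gets 2025-06-18 back, while a client asking
for a newer version than the server knows is answered with 2025-11-25.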
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Users were seeing 30 tools in /mcp, which was overwhelming. Now only 8
essential tools are listed:
1. search - unified hybrid search
2. memory - get/delete/state operations
3. codebase - patterns and decisions
4. intention - prospective memory
5. ingest - add memories
6. smart_ingest - intelligent ingestion
7. promote_memory - thumbs up
8. demote_memory - thumbs down
Deprecated tools (recall, semantic_search, etc.) still work internally
for backward compatibility but are no longer listed.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Two-pronged fix for cross-compilation:
1. git2 with vendored-openssl feature - compiles OpenSSL from source,
eliminating system dependency issues across all platforms
2. houseabsolute/actions-rust-cross@v1 - dedicated GitHub Action that
properly handles cross-compilation with Docker containers
Sources:
- https://github.com/rust-lang/git2-rs
- https://github.com/houseabsolute/actions-rust-cross
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Replace BGE-base-en-v1.5 with nomic-embed-text-v1.5
- 8192 token context window (vs 512 for BGE)
- Matryoshka representation learning support
- Fully open source with training data released
- Same 768 dimensions, no schema changes required
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
FSRS-6 spaced repetition, spreading activation, synaptic tagging,
hippocampal indexing, and 130 years of memory research.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>