merge(#99): v2.2 12-tool consolidation + flagship backfill (13 advertised)

Tool Consolidation v2.2.0 (34->12 advertised surface: recall/memory/codebase/
intention/smart_ingest/source_sync/memory_status/maintain/dedup/graph/
session_start/suppress) onto the retroactive-salience-backfill base.

Conflict resolution:
- server.rs: took #99's canonical 12-tool surface, then re-added the backfill
  tool as a 13th ADVERTISED tool (ToolDescription + dispatch arm + test). Backfill
  is the flagship 'Memory with hindsight' (Cai 2024 Nature) cognitive primitive
  — a distinct capability, not a maintenance op to fold into `maintain` or hide.
- Cargo.toml: kept usearch features=["fp16lib"] with the fuller MSVC-C1021
  explanation comment (Windows build fix, #71).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sam Valladares 2026-06-29 15:03:47 -05:00
commit 29658d24f6
11 changed files with 1974 additions and 1248 deletions

Dockerfile

@@ -0,0 +1,28 @@
# Dockerfile for running the Vestige MCP server in an isolated sandbox.
#
# Used by registries such as Glama to start the server and run the standard
# MCP stdio introspection exchange (tools/list, resources/list, prompts/list).
# The server speaks MCP over stdio, which is exactly what these tools expect.
#
# Base must be glibc (Debian), not musl/Alpine: the npm postinstall downloads
# the prebuilt x86_64-unknown-linux-gnu Rust binary from the GitHub release, and
# a -gnu binary will not run on an Alpine/musl image.
FROM node:20-slim
# ca-certificates lets the postinstall fetch the release asset over HTTPS.
RUN apt-get update \
&& apt-get install -y --no-install-recommends ca-certificates \
&& rm -rf /var/lib/apt/lists/*
# Install the published package globally. Its postinstall downloads the matching
# prebuilt vestige-mcp binary for linux/x64 from the GitHub release.
RUN npm install -g vestige-mcp-server@latest
# Keep all memory data inside the container under a writable path.
ENV VESTIGE_DATA_DIR=/data
RUN mkdir -p /data
# Start the MCP server on stdio. The `vestige-mcp` bin execs the native binary
# and inherits stdio, so the MCP client talks to it directly.
ENTRYPOINT ["vestige-mcp"]
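
The introspection exchange named in the Dockerfile comments is plain JSON-RPC 2.0 over stdio, one message per line. A minimal sketch of the three list requests a registry would write to the server's stdin after the usual `initialize` handshake. The helper names `introspection_request` and `introspection_batch` are hypothetical and not part of Vestige; only the method names come from the comment above.

```rust
/// Hypothetical helper (not part of Vestige): formats one JSON-RPC 2.0
/// request for a parameterless MCP list method as a single line.
fn introspection_request(id: u32, method: &str) -> String {
    format!(r#"{{"jsonrpc":"2.0","id":{id},"method":"{method}"}}"#)
}

/// The three list calls named in the Dockerfile comment, numbered from 1.
fn introspection_batch() -> Vec<String> {
    ["tools/list", "resources/list", "prompts/list"]
        .iter()
        .zip(1u32..)
        .map(|(method, id)| introspection_request(id, method))
        .collect()
}
```

Each returned string would be written to the container's stdin followed by a newline, and the responses read back from stdout.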

@@ -138,7 +138,18 @@ candle-core = { version = "0.10.2", optional = true }
#
# Disable default features so release binaries do not include SimSIMD's
# Haswell+/AVX2/FMA dispatch targets. Those kernels can trigger illegal
-# instructions on older x86_64 CPUs that Vestige otherwise supports.
# instructions on older x86_64 CPUs that Vestige otherwise supports (#71).
#
# But re-enable `fp16lib` explicitly. usearch's defaults are
# ["simsimd", "fp16lib"]; with BOTH off, build.rs sets USEARCH_USE_FP16LIB=0
# and USEARCH_USE_SIMSIMD=0, which selects the bare half-precision `#else`
# branch in include/usearch/index_plugins.hpp. That branch carries a
# `#warning` directive, which MSVC's cl.exe treats as fatal error C1021,
# breaking the Windows build (GCC/Clang only warn). `fp16lib` is a scalar,
# self-contained fp16<->fp32 conversion library with NO SIMD intrinsics, so
# re-enabling it sets USEARCH_USE_FP16LIB=1 (taking the non-warning branch)
# WITHOUT reintroducing the SimSIMD illegal-instruction risk from #71. Do not
# drop this feature.
usearch = { version = "=2.23.0", default-features = false, features = ["fp16lib"], optional = true }
# LRU cache for query embeddings
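
The reasoning in the usearch comment amounts to a small decision table over two Cargo features. A sketch that models it, assuming nothing beyond what the comment states. `UsearchFeatures`, `BuildOutcome`, and `build_outcome` are hypothetical names used for illustration only, and this is not usearch's actual build.rs.

```rust
/// Which usearch Cargo features are enabled (feature names from the comment).
#[derive(Debug, Clone, Copy)]
struct UsearchFeatures {
    simsimd: bool,
    fp16lib: bool,
}

/// Consequences of a feature combination, as described in the comment.
#[derive(Debug, PartialEq)]
struct BuildOutcome {
    /// The bare `#else` branch with `#warning` is selected (fatal C1021 on MSVC).
    msvc_warning_branch: bool,
    /// SimSIMD's Haswell+/AVX2/FMA dispatch targets are compiled in (#71 risk).
    simsimd_kernels: bool,
}

fn build_outcome(f: UsearchFeatures) -> BuildOutcome {
    BuildOutcome {
        // Only the "both off" combination falls through to the warning branch.
        msvc_warning_branch: !f.simsimd && !f.fp16lib,
        simsimd_kernels: f.simsimd,
    }
}
```

Under this model, `features = ["fp16lib"]` with default features disabled is the only combination that avoids both problems.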

File diff suppressed because it is too large

@@ -7,7 +7,7 @@ use std::sync::Arc;
use uuid::Uuid;
use vestige_core::{CompositionOutcomeRecord, Storage};
-const OUTCOME_TYPES: &[&str] = &[
pub(crate) const OUTCOME_TYPES: &[&str] = &[
"helpful",
"dead_end",
"submitted",

@@ -276,6 +276,99 @@ pub async fn execute(storage: &Arc<Storage>, args: Option<Value>) -> Result<Valu
}
}
// ============================================================================
// UNIFIED `dedup` TOOL (v2.2 — Tool Consolidation)
//
// Folds the 8 former dedup/merge tools into a single action-dispatched surface:
// action = scan (default) | plan_merge | plan_supersede | apply | undo
// | protect | policy
//
// `scan` combines cosine-similarity duplicate clusters (this module's
// `execute`) with Fellegi-Sunter merge candidates (`merge::merge_candidates`),
// returning both in separate fields. The mutate/preview/reverse actions delegate
// to `super::merge::execute` verbatim, preserving plan_id → apply → undo,
// confirm-gating, and bitemporal-never-delete byte-for-byte.
// ============================================================================
/// Discriminated-union schema for the unified `dedup` tool.
pub fn unified_schema() -> Value {
serde_json::json!({
"type": "object",
"properties": {
"action": {
"type": "string",
"enum": ["scan", "plan_merge", "plan_supersede", "apply", "undo", "protect", "policy"],
"default": "scan",
"description": "What to do. 'scan' (default): surface duplicate clusters (cosine) AND merge candidates (Fellegi-Sunter), read-only. 'plan_merge'/'plan_supersede': preview a reversible plan without applying (returns plan_id). 'apply': execute a plan_id. 'undo': reverse a prior operation (omit operation_id to list the reflog). 'protect': pin a memory against auto-merge/supersede/forget. 'policy': get/set Fellegi-Sunter thresholds."
},
"similarity_threshold": {
"type": "number",
"description": "[scan] Minimum cosine similarity for duplicate clusters (0.5-1.0, default 0.80).",
"minimum": 0.5, "maximum": 1.0
},
"limit": {
"type": "integer",
"description": "[scan] Max clusters/candidates to return (default 20).",
"minimum": 1, "maximum": 100
},
"tags": {
"type": "array", "items": { "type": "string" },
"description": "[scan] Optional: only consider memories with these tags (ANY match)."
},
"member_ids": {
"type": "array", "items": { "type": "string" },
"description": "[plan_merge] IDs of memories to merge (>= 2). Survivor kept; rest bitemporally invalidated."
},
"survivor_id": { "type": "string", "description": "[plan_merge] Optional: which member to keep (defaults to highest-retention)." },
"old_id": { "type": "string", "description": "[plan_supersede] Memory being superseded (kept, marked invalid)." },
"new_id": { "type": "string", "description": "[plan_supersede] Memory that supersedes the old one." },
"plan_id": { "type": "string", "description": "[apply] ID of a plan produced by plan_merge/plan_supersede." },
"confirm": { "type": "boolean", "default": false, "description": "[apply] Required true for 'possible'/'non_match' plans." },
"operation_id": { "type": "string", "description": "[undo] Operation to reverse. Omit to list the reflog." },
"id": { "type": "string", "description": "[protect] Memory id to protect/unprotect." },
"protected": { "type": "boolean", "default": true, "description": "[protect] true to pin, false to unpin." },
"match_threshold": { "type": "number", "minimum": 0.0, "maximum": 1.0, "description": "[policy] Score >= this => 'match'." },
"possible_threshold": { "type": "number", "minimum": 0.0, "maximum": 1.0, "description": "[policy] Score in [possible, match) => review." },
"auto_apply": { "type": "boolean", "description": "[policy] Allow 'match' plans to apply without confirm. Default false." }
}
})
}
/// Unified dispatcher for the `dedup` tool. Routes on `action` (default `scan`).
pub async fn execute_unified(storage: &Arc<Storage>, args: Option<Value>) -> Result<Value, String> {
let action = args
.as_ref()
.and_then(|a| a.get("action"))
.and_then(|v| v.as_str())
.unwrap_or("scan")
.to_string();
match action.as_str() {
"scan" => {
// Cosine-similarity duplicate clusters (this module).
let clusters = execute(storage, args.clone()).await?;
// Fellegi-Sunter merge candidates (merge module, name-dispatched).
let candidates =
super::merge::execute(storage, "merge_candidates", args.clone()).await?;
Ok(serde_json::json!({
"action": "scan",
"duplicateClusters": clusters,
"mergeCandidates": candidates,
"nextStep": "Use action='plan_merge' (member_ids) or action='plan_supersede' (old_id,new_id) to preview a reversible plan, then action='apply' (plan_id)."
}))
}
"plan_merge" => super::merge::execute(storage, "plan_merge", args).await,
"plan_supersede" => super::merge::execute(storage, "plan_supersede", args).await,
"apply" => super::merge::execute(storage, "apply_plan", args).await,
"undo" => super::merge::execute(storage, "merge_undo", args).await,
"protect" => super::merge::execute(storage, "protect", args).await,
"policy" => super::merge::execute(storage, "merge_policy", args).await,
other => Err(format!(
"Unknown dedup action '{other}'. Use scan|plan_merge|plan_supersede|apply|undo|protect|policy."
)),
}
}
#[cfg(test)]
mod tests {
use super::*;
@@ -287,6 +380,25 @@ mod tests {
assert!(schema["properties"]["similarity_threshold"].is_object());
}
#[test]
fn test_unified_schema() {
let schema = unified_schema();
assert_eq!(schema["type"], "object");
let actions = schema["properties"]["action"]["enum"].as_array().unwrap();
assert_eq!(actions.len(), 7);
assert_eq!(schema["properties"]["action"]["default"], "scan");
}
#[tokio::test]
async fn test_unified_scan_empty_storage() {
let dir = tempfile::TempDir::new().unwrap();
let storage = Storage::new(Some(dir.path().join("test.db"))).unwrap();
let storage = Arc::new(storage);
// Default action (scan) on empty storage must not error.
let result = execute_unified(&storage, None).await;
assert!(result.is_ok());
}
#[test]
#[cfg(all(feature = "embeddings", feature = "vector-search"))]
fn test_union_find() {
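
The `match_threshold`, `possible_threshold`, `confirm`, and `auto_apply` descriptions in the `dedup` schema imply a gating rule for `apply`. A minimal sketch of one reading of that rule. The real logic lives in the `merge` module, which this diff does not show, so `PlanClass`, `classify`, and `may_apply` are hypothetical names, and the treatment of 'match' plans (apply with either `confirm` or `auto_apply`) is an assumption.

```rust
/// Hypothetical classification of a Fellegi-Sunter score against the policy.
#[derive(Debug, PartialEq, Clone, Copy)]
enum PlanClass {
    Match,
    Possible,
    NonMatch,
}

/// Score >= match_threshold is a match; [possible_threshold, match_threshold)
/// goes to review; anything lower is a non-match.
fn classify(score: f64, match_threshold: f64, possible_threshold: f64) -> PlanClass {
    if score >= match_threshold {
        PlanClass::Match
    } else if score >= possible_threshold {
        PlanClass::Possible
    } else {
        PlanClass::NonMatch
    }
}

/// Assumed gate: 'possible' and 'non_match' plans always need confirm=true;
/// 'match' plans may also proceed when the policy sets auto_apply.
fn may_apply(class: PlanClass, confirm: bool, auto_apply: bool) -> bool {
    match class {
        PlanClass::Match => confirm || auto_apply,
        PlanClass::Possible | PlanClass::NonMatch => confirm,
    }
}
```

The threshold values are passed in rather than hard-coded because the diff does not state the defaults.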

@@ -0,0 +1,140 @@
//! Unified `graph` Tool (v2.2 — Tool Consolidation)
//!
//! Folds four graph/association/prediction tools into one action-dispatched
//! surface:
//!
//! action ∈ {
//! chain, associations, bridges, // former explore_connections
//! predict, // former predict
//! memory_graph, // former memory_graph (viz subgraph)
//! recent, get, memory, neighbors, // former composed_graph
//! never_composed, bounty_mode, label, // "
//! }
//!
//! This is a transparent facade: each action forwards the *same* args envelope
//! to the existing handler, which re-reads its own discriminator/params. None of
//! the underlying arg structs use `deny_unknown_fields`, so unrelated fields are
//! ignored. All actions are read-only EXCEPT `label`, which writes a composition
//! outcome (the one mutator) and is logged for audit.
use serde_json::Value;
use std::sync::Arc;
use tokio::sync::Mutex;
use vestige_core::Storage;
use crate::cognitive::CognitiveEngine;
// Reuse composed_graph's canonical outcome-label vocabulary (do not re-list).
use super::composed_graph::OUTCOME_TYPES;
/// Discriminated-union schema for the unified `graph` tool.
pub fn schema() -> Value {
serde_json::json!({
"type": "object",
"properties": {
"action": {
"type": "string",
"enum": [
"chain", "associations", "bridges",
"predict", "memory_graph",
"recent", "get", "memory", "neighbors",
"never_composed", "bounty_mode", "label"
],
"description": "Graph operation. Reasoning paths: 'chain' (from→to), 'associations' (related via spreading activation, needs 'from'), 'bridges' (connectors between from/to). 'predict' (what memories you'll need next, from 'context'). 'memory_graph' (force-directed subgraph for viz, from 'center_id' or 'query'). Composition topology: 'recent', 'get' (event_id), 'memory' (memory_id), 'neighbors' (memory_id), 'never_composed', 'bounty_mode', 'label' (record an outcome — the only write)."
},
// --- explore (chain/associations/bridges) ---
"from": { "type": "string", "description": "[chain/associations/bridges] Source memory ID." },
"to": { "type": "string", "description": "[chain/bridges] Target memory ID." },
// --- predict ---
"context": { "type": "object", "description": "[predict] Current context (current_file, current_topics, codebase)." },
// --- memory_graph (viz subgraph) ---
"center_id": { "type": "string", "description": "[memory_graph] Center node id (or use 'query')." },
"query": { "type": "string", "description": "[memory_graph] Pick a center node by search query." },
"depth": { "type": "integer", "minimum": 1, "maximum": 3, "description": "[memory_graph] Traversal depth (1-3, default 2)." },
"max_nodes": { "type": "integer", "description": "[memory_graph] Max nodes (default 50, capped 200)." },
// --- composed_graph ---
"event_id": { "type": "string", "description": "[get/label] Composition event id." },
"memory_id": { "type": "string", "description": "[memory/neighbors] Memory id." },
"tags": { "type": "array", "items": { "type": "string" }, "description": "[never_composed/bounty_mode] Optional tag filter." },
"outcome_type": {
"type": "string",
"enum": OUTCOME_TYPES,
"description": "[label] Outcome to record for the composition (the only mutating action)."
},
// --- shared ---
"limit": { "type": "integer", "description": "Max results (per-action defaults; clamped internally).", "minimum": 1, "maximum": 100 }
},
"required": ["action"]
})
}
/// Unified dispatcher for `graph`. Routes on `action`.
pub async fn execute(
storage: &Arc<Storage>,
cognitive: &Arc<Mutex<CognitiveEngine>>,
args: Option<Value>,
) -> Result<Value, String> {
let action = args
.as_ref()
.and_then(|a| a.get("action"))
.and_then(|v| v.as_str())
.ok_or("Missing 'action'. Use chain|associations|bridges|predict|memory_graph|recent|get|memory|neighbors|never_composed|bounty_mode|label.")?
.to_string();
match action.as_str() {
// explore_connections — re-reads its own `action` (chain/associations/bridges).
"chain" | "associations" | "bridges" => {
super::explore::execute(storage, cognitive, args).await
}
// predict — reads `context`, ignores `action`.
"predict" => super::predict::execute(storage, cognitive, args).await,
// memory_graph — reads center_id/query/depth, ignores `action`.
"memory_graph" => super::graph::execute(storage, args).await,
// composed_graph — re-reads its own `action`. `label` is the only write.
"recent" | "get" | "memory" | "neighbors" | "never_composed" | "bounty_mode" | "label" => {
if action == "label" {
let event_id = args
.as_ref()
.and_then(|a| a.get("event_id"))
.and_then(|v| v.as_str())
.unwrap_or("?");
let outcome = args
.as_ref()
.and_then(|a| a.get("outcome_type"))
.and_then(|v| v.as_str())
.unwrap_or("?");
tracing::info!(
event_id = %event_id,
outcome_type = %outcome,
"graph: composition outcome labeled"
);
}
super::composed_graph::execute(storage, args).await
}
other => Err(format!(
"Unknown graph action '{other}'. Use chain|associations|bridges|predict|memory_graph|recent|get|memory|neighbors|never_composed|bounty_mode|label."
)),
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_schema_action_count() {
let s = schema();
let actions = s["properties"]["action"]["enum"].as_array().unwrap();
assert_eq!(actions.len(), 12);
// outcome_type enum is sourced from the canonical const.
let outcomes = s["properties"]["outcome_type"]["enum"].as_array().unwrap();
assert_eq!(outcomes.len(), OUTCOME_TYPES.len());
}
#[test]
fn test_missing_action_errors() {
// Schema-level check only: `action` must be listed as required. The runtime
// early return in `execute` is not exercised here.
let s = schema();
assert_eq!(s["required"][0], "action");
}
}
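
The `graph` dispatcher's routing can be read as a table from action to handler family. A minimal sketch of that table, assuming a hypothetical `Handler` enum and `route` function that stand in for the real calls into `explore`, `predict`, `graph`, and `composed_graph`. The boolean marks the single mutating action, `label`.

```rust
/// Hypothetical stand-in for the four underlying handler modules.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Handler {
    Explore,
    Predict,
    MemoryGraph,
    ComposedGraph,
}

/// Returns the handler family for an action and whether the action writes.
fn route(action: &str) -> Result<(Handler, bool), String> {
    match action {
        "chain" | "associations" | "bridges" => Ok((Handler::Explore, false)),
        "predict" => Ok((Handler::Predict, false)),
        "memory_graph" => Ok((Handler::MemoryGraph, false)),
        "recent" | "get" | "memory" | "neighbors" | "never_composed" | "bounty_mode" => {
            Ok((Handler::ComposedGraph, false))
        }
        // The only mutator: records a composition outcome.
        "label" => Ok((Handler::ComposedGraph, true)),
        other => Err(format!("Unknown graph action '{other}'")),
    }
}
```

Keeping the write flag next to the route makes it easy to check that exactly one of the twelve actions mutates state.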

@@ -0,0 +1,134 @@
//! Unified `maintain` Tool (v2.2 — Tool Consolidation)
//!
//! Folds the seven maintenance/lifecycle tools into one action-dispatched
//! surface:
//!
//! action = consolidate | dream | gc | importance_score | backup | export | restore
//!
//! This is a thin facade: each action forwards the *same* args envelope to the
//! existing handler. None of the underlying arg structs use
//! `deny_unknown_fields`, so the `action` discriminator is ignored by each
//! handler and per-action params validate as before. Safety defaults are
//! preserved because they live inside the callees:
//! - `gc` defaults `dry_run=true` (handler-internal),
//! - `restore` keeps path-confinement (handler-internal),
//! - `export` keeps its traversal guard (handler-internal).
//!
//! The `consolidate`/`dream` *Started* events and the
//! `consolidate`/`dream`/`importance_score` *Completed* events are emitted by
//! the server dispatch + `emit_tool_event` (which normalizes the `maintain`
//! name to its effective action) — not here.
use serde_json::Value;
use std::sync::Arc;
use tokio::sync::Mutex;
use vestige_core::Storage;
use crate::cognitive::CognitiveEngine;
/// Discriminated-union schema for the unified `maintain` tool.
pub fn schema() -> Value {
serde_json::json!({
"type": "object",
"properties": {
"action": {
"type": "string",
"enum": ["consolidate", "dream", "gc", "importance_score", "backup", "export", "restore"],
"description": "Maintenance op. 'consolidate' (run FSRS-6 decay/embedding cycle), 'dream' (replay memories → insights/connections), 'gc' (garbage-collect stale memories; dry_run=true by default), 'importance_score' (4-channel neuroscience score for 'content'), 'backup' (SQLite DB backup), 'export' (memories as JSON/JSONL with filters), 'restore' (restore from a JSON backup at 'path')."
},
// --- gc ---
"min_retention": { "type": "number", "minimum": 0.0, "maximum": 1.0, "description": "[gc] Collect memories below this retention (default 0.1)." },
"dry_run": { "type": "boolean", "description": "[gc] Preview only. Defaults to TRUE for safety." },
// --- importance_score ---
"content": { "type": "string", "description": "[importance_score] Content to score." },
// --- export ---
"format": { "type": "string", "enum": ["json", "jsonl"], "description": "[export] Output format." },
"tags": { "type": "array", "items": { "type": "string" }, "description": "[export] Tag filter." },
"start": { "type": "string", "description": "[export] Start date filter (ISO 8601)." },
"end": { "type": "string", "description": "[export] End date filter (ISO 8601)." },
// --- backup / restore ---
"path": { "type": "string", "description": "[restore] Path to a JSON backup file (path-confined)." }
},
"required": ["action"]
})
}
/// Unified dispatcher for `maintain`. Routes on `action` (required).
pub async fn execute(
storage: &Arc<Storage>,
cognitive: &Arc<Mutex<CognitiveEngine>>,
args: Option<Value>,
) -> Result<Value, String> {
// Clone the discriminator out before the args envelope is moved into a callee.
let action = args
.as_ref()
.and_then(|a| a.get("action"))
.and_then(|v| v.as_str())
.ok_or("Missing 'action'. Use consolidate|dream|gc|importance_score|backup|export|restore.")?
.to_string();
match action.as_str() {
"consolidate" => super::maintenance::execute_consolidate(storage, args).await,
"dream" => super::dream::execute(storage, cognitive, args).await,
"gc" => super::maintenance::execute_gc(storage, args).await,
"importance_score" => super::importance::execute(storage, cognitive, args).await,
"backup" => super::maintenance::execute_backup(storage, args).await,
"export" => super::maintenance::execute_export(storage, args).await,
"restore" => super::restore::execute(storage, args).await,
other => Err(format!(
"Unknown maintain action '{other}'. Use consolidate|dream|gc|importance_score|backup|export|restore."
)),
}
}
#[cfg(test)]
mod tests {
use super::*;
fn test_storage() -> Arc<Storage> {
let dir = tempfile::TempDir::new().unwrap();
let storage = Storage::new(Some(dir.path().join("test.db"))).unwrap();
std::mem::forget(dir);
Arc::new(storage)
}
#[test]
fn test_schema_actions() {
let s = schema();
let actions = s["properties"]["action"]["enum"].as_array().unwrap();
assert_eq!(actions.len(), 7);
assert_eq!(s["required"][0], "action");
}
#[tokio::test]
async fn test_missing_action_errors() {
let storage = test_storage();
let cognitive = Arc::new(Mutex::new(CognitiveEngine::new()));
let r = execute(&storage, &cognitive, None).await;
assert!(r.is_err(), "missing action must error");
}
#[tokio::test]
async fn test_gc_defaults_dry_run() {
let storage = test_storage();
let cognitive = Arc::new(Mutex::new(CognitiveEngine::new()));
// No dry_run passed → handler default true → nothing is actually deleted.
let args = Some(serde_json::json!({ "action": "gc" }));
let r = execute(&storage, &cognitive, args).await.unwrap();
// gc's envelope reports dry_run; assert it stayed true.
let dry = r
.get("dryRun")
.or(r.get("dry_run"))
.and_then(|v| v.as_bool());
assert_eq!(dry, Some(true), "gc must default to dry_run=true via maintain");
}
#[tokio::test]
async fn test_consolidate_resolves() {
let storage = test_storage();
let cognitive = Arc::new(Mutex::new(CognitiveEngine::new()));
let args = Some(serde_json::json!({ "action": "consolidate" }));
assert!(execute(&storage, &cognitive, args).await.is_ok());
}
}
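
The facade relies on two properties of the callees: unknown keys such as `action` are ignored, and safety defaults are applied inside the handler. A minimal sketch of that pattern for `gc`, using a hand-rolled `Arg` enum in place of `serde_json::Value` so it builds with std only. `Arg`, `GcArgs`, and `parse_gc_args` are hypothetical; the defaults (`dry_run = true`, `min_retention = 0.1`) are the ones stated in the schema.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a JSON value in the args envelope.
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    Bool(bool),
    Number(f64),
    Text(String),
}

/// Hypothetical parsed form of the gc parameters.
#[derive(Debug, PartialEq)]
struct GcArgs {
    min_retention: f64,
    dry_run: bool,
}

/// Reads only the keys gc cares about; everything else (including the
/// facade's `action` discriminator) is ignored, and missing or mistyped
/// values fall back to the safe defaults.
fn parse_gc_args(envelope: &HashMap<String, Arg>) -> GcArgs {
    let min_retention = match envelope.get("min_retention") {
        Some(Arg::Number(n)) => *n,
        _ => 0.1,
    };
    let dry_run = match envelope.get("dry_run") {
        Some(Arg::Bool(b)) => *b,
        _ => true,
    };
    GcArgs { min_retention, dry_run }
}
```

Because the default lives in the parser, routing the same envelope through a facade cannot turn a preview into a destructive run.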

@@ -0,0 +1,141 @@
//! Unified `memory_status` Tool (v2.2 — Tool Consolidation)
//!
//! Folds four read-only status/health/temporal tools into one
//! view-dispatched surface:
//!
//! view = health (default) | retention | timeline | changelog
//!
//! - `health` → full system health + statistics (the former `system_status`).
//! Returns the byte-for-byte `system_status` shape (audit scripts parse it),
//! including `schema_introspection` passthrough.
//! - `retention` → the lightweight retention dashboard (former `memory_health`).
//! - `timeline` → chronological browse (former `memory_timeline`).
//! - `changelog` → audit trail of memory changes (former `memory_changelog`).
//!
//! This is a thin facade: each view forwards the *same* args envelope to the
//! existing handler. None of the underlying arg structs use
//! `deny_unknown_fields`, so the discriminator `view` is simply ignored by each
//! handler — no lossy re-scoping required, and per-view fields validate as
//! before. The `cognitive` lock is never held across a forwarded call.
use serde_json::Value;
use std::sync::Arc;
use tokio::sync::Mutex;
use vestige_core::{OutputConfig, Storage};
use crate::cognitive::CognitiveEngine;
/// Discriminated-union schema for the unified `memory_status` tool.
pub fn schema() -> Value {
serde_json::json!({
"type": "object",
"properties": {
"view": {
"type": "string",
"enum": ["health", "retention", "timeline", "changelog"],
"default": "health",
"description": "Which status view. 'health' (default): full system health + stats + FSRS preview + warnings + recommendations. 'retention': lightweight retention dashboard (avg/distribution/trend). 'timeline': browse memories chronologically. 'changelog': audit trail of memory state changes."
},
// --- [health view] ---
"schema_introspection": {
"type": "boolean",
"description": "[health view] Include the response-schema description in the output."
},
// --- [timeline view] ---
"start": { "type": "string", "description": "[timeline/changelog view] Start of range (ISO 8601 date or datetime)." },
"end": { "type": "string", "description": "[timeline/changelog view] End of range (ISO 8601 date or datetime)." },
"node_type": { "type": "string", "description": "[timeline view] Filter by node type (e.g. 'fact', 'decision')." },
"tags": { "type": "array", "items": { "type": "string" }, "description": "[timeline view] Filter by tags (ANY match)." },
"detail_level": {
"type": "string", "enum": ["brief", "summary", "full"],
"description": "[timeline view] Level of detail (default 'summary')."
},
// --- [changelog view] ---
"memory_id": { "type": "string", "description": "[changelog view] Per-memory mode: state transitions for this memory id." },
// --- shared: limit (per-view ranges differ; clamped internally) ---
"limit": {
"type": "integer",
"description": "Max results. [timeline] default 50, max 200. [changelog] default 20, clamped to 100. Ignored by health/retention.",
"minimum": 1, "maximum": 200
}
}
})
}
/// Unified dispatcher for `memory_status`. Routes on `view` (default `health`).
pub async fn execute(
storage: &Arc<Storage>,
cognitive: &Arc<Mutex<CognitiveEngine>>,
output_config: &OutputConfig,
args: Option<Value>,
) -> Result<Value, String> {
let view = args
.as_ref()
.and_then(|a| a.get("view"))
.and_then(|v| v.as_str())
.unwrap_or("health")
.to_string();
match view.as_str() {
// Byte-for-byte system_status shape (incl. schema_introspection passthrough).
"health" => super::maintenance::execute_system_status(storage, cognitive, args).await,
"retention" => super::health::execute(storage, args).await,
"timeline" => super::timeline::execute(storage, output_config, args).await,
"changelog" => super::changelog::execute(storage, args).await,
other => Err(format!(
"Unknown memory_status view '{other}'. Use health|retention|timeline|changelog."
)),
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::cognitive::CognitiveEngine;
fn test_storage() -> Arc<Storage> {
let dir = tempfile::TempDir::new().unwrap();
let storage = Storage::new(Some(dir.path().join("test.db"))).unwrap();
// Keep the tempdir alive for the duration of the process by leaking it;
// these are short-lived unit tests.
std::mem::forget(dir);
Arc::new(storage)
}
#[test]
fn test_schema_views() {
let s = schema();
let views = s["properties"]["view"]["enum"].as_array().unwrap();
assert_eq!(views.len(), 4);
assert_eq!(s["properties"]["view"]["default"], "health");
}
#[tokio::test]
async fn test_default_view_is_health() {
let storage = test_storage();
let cognitive = Arc::new(Mutex::new(CognitiveEngine::new()));
let oc = OutputConfig::default();
// No args → health view → must match system_status output exactly.
let unified = execute(&storage, &cognitive, &oc, None).await.unwrap();
let direct = super::super::maintenance::execute_system_status(&storage, &cognitive, None)
.await
.unwrap();
assert_eq!(
unified, direct,
"memory_status view=health must equal system_status byte-for-byte"
);
}
#[tokio::test]
async fn test_all_views_resolve() {
let storage = test_storage();
let cognitive = Arc::new(Mutex::new(CognitiveEngine::new()));
let oc = OutputConfig::default();
for view in ["health", "retention", "timeline", "changelog"] {
let args = Some(serde_json::json!({ "view": view }));
let r = execute(&storage, &cognitive, &oc, args).await;
assert!(r.is_ok(), "view={view} should resolve, got {r:?}");
}
}
}
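
The shared `limit` field has per-view defaults and caps that the schema describes only in prose. A minimal sketch of that clamping, assuming a hypothetical `effective_limit` helper; the real clamping happens inside the timeline and changelog handlers, which this diff does not show. Clamping the lower bound to 1 is an assumption taken from the schema's `minimum`.

```rust
/// Hypothetical helper: resolves the limit a view would actually use.
/// Returns None for views that ignore `limit` (health, retention).
fn effective_limit(view: &str, requested: Option<u32>) -> Option<u32> {
    let (default, max) = match view {
        "timeline" => (50, 200),  // default 50, max 200
        "changelog" => (20, 100), // default 20, clamped to 100
        _ => return None,
    };
    Some(requested.unwrap_or(default).clamp(1, max))
}
```

A caller that passes `limit: 150` therefore gets 150 rows from `timeline` but only 100 from `changelog`.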

@@ -2,15 +2,25 @@
//!
//! Tool implementations for the Vestige MCP server.
//!
-//! The unified tools (codebase_unified, intention_unified, memory_unified, search_unified)
-//! are the primary API. The granular tools below are kept for backwards compatibility
-//! but are not exposed in the MCP tool list.
//! v2.2 Tool Consolidation (Layer 1): the advertised surface is 12 tools —
//! recall, memory, codebase, intention, smart_ingest, source_sync,
//! memory_status, dedup, graph, maintain, session_start, suppress. The unified
//! facade modules (recall, dedup, memory_status, graph_unified, maintain, plus
//! the earlier *_unified) dispatch on an action/mode/view discriminator and
//! delegate to the granular handler modules below, which stay in the crate as
//! the implementation layer and as hidden back-compat aliases (see the redirect
//! arms in server.rs). See docs/launch/tool-consolidation-v2.2.0.md.
// Active unified tools
pub mod codebase_unified;
pub mod intention_unified;
pub mod memory_unified;
pub mod search_unified;
// v2.2: Unified retrieval surface — folds search + deep_reference +
// cross_reference + contradictions into one mode-dispatched tool.
// mode=lookup (default) is a zero-overhead pass-through to search_unified.
pub mod recall;
pub mod smart_ingest;
// #57: external-source connectors (GitHub Issues / Redmine retrieval layer)
pub mod source_sync;
@@ -22,6 +32,14 @@ pub mod timeline;
// v1.2: Maintenance tools
pub mod maintenance;
// v2.2: Unified maintenance surface — folds consolidate + dream + gc +
// importance_score + backup + export + restore into one action-dispatched tool.
pub mod maintain;
// v2.2: Unified status surface — folds system_status + memory_health +
// memory_timeline + memory_changelog into one view-dispatched tool.
pub mod memory_status;
// v1.3: Auto-save and dedup tools
pub mod dedup;
pub mod importance;
@@ -42,6 +60,10 @@ pub mod session_context;
pub mod graph;
pub mod health;
// v2.2: Unified graph surface — folds explore_connections + predict +
// memory_graph + composed_graph into one action-dispatched tool.
pub mod graph_unified;
// v2.1: Cross-reference (connect the dots)
pub mod composed_graph;
pub mod contradictions;
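
The module comment refers to hidden back-compat aliases handled by redirect arms in server.rs, whose diff is suppressed on this page. A sketch of what such a redirect table might look like, built only from the "former" tool names given in the `recall` and `memory_status` module docs. The `redirect` function and its tuple shape are hypothetical, not the server's actual code.

```rust
/// Hypothetical redirect: legacy tool name ->
/// (unified tool, discriminator key, discriminator value).
fn redirect(legacy: &str) -> Option<(&'static str, &'static str, &'static str)> {
    match legacy {
        // recall: mode-dispatched
        "search" => Some(("recall", "mode", "lookup")),
        "deep_reference" | "cross_reference" => Some(("recall", "mode", "reason")),
        "contradictions" => Some(("recall", "mode", "contradictions")),
        // memory_status: view-dispatched
        "system_status" => Some(("memory_status", "view", "health")),
        "memory_health" => Some(("memory_status", "view", "retention")),
        "memory_timeline" => Some(("memory_status", "view", "timeline")),
        "memory_changelog" => Some(("memory_status", "view", "changelog")),
        // Not a known legacy alias in this sketch.
        _ => None,
    }
}
```

A dispatcher using a table like this would inject the discriminator into the args envelope and then call the unified handler, so old clients keep working without being advertised.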

@@ -0,0 +1,163 @@
//! Unified `recall` Tool (v2.2 — Tool Consolidation, HOT PATH)
//!
//! Folds the four retrieval/reasoning tools into one mode-dispatched surface:
//!
//! mode = lookup (DEFAULT) | reason | contradictions
//!
//! - `lookup` (default) → hybrid search (the former `search`). This is the hot
//! path: with no `mode` set, `recall` is a ZERO-overhead pass-through to
//! `search_unified::execute` — it must never pay the cost of the reasoning
//! path. (`deep_reference`/`reason` runs spreading activation + FSRS trust
//! scoring + contradiction analysis and is 5-10× slower.)
//! - `reason` → deep cognitive reasoning across memories (former
//! `deep_reference` / `cross_reference`).
//! - `contradictions` → trust-weighted disagreement pairs (former
//! `contradictions`).
//!
//! The schema is derived from `search_unified::schema()` (so every lookup
//! parameter stays available and documented) plus the `mode` discriminator and
//! the reason/contradictions fields. `query` is NOT globally required because
//! the contradictions mode is scoped by `topic`; per-mode requirements are
//! validated at runtime.
use serde_json::Value;
use std::sync::Arc;
use tokio::sync::Mutex;
use vestige_core::{OutputConfig, Storage};
use crate::cognitive::CognitiveEngine;
/// Discriminated-union schema for the unified `recall` tool.
///
/// Built on top of `search_unified::schema()` so all lookup parameters carry
/// through verbatim; the `required: ["query"]` constraint is dropped (validated
/// per-mode at runtime) and the mode/reason/contradictions fields are added.
pub fn schema() -> Value {
let mut schema = super::search_unified::schema();
if let Some(obj) = schema.as_object_mut() {
// Drop the global `query` requirement — contradictions uses `topic`.
obj.remove("required");
if let Some(props) = obj.get_mut("properties").and_then(|p| p.as_object_mut()) {
props.insert(
"mode".to_string(),
serde_json::json!({
"type": "string",
"enum": ["lookup", "reason", "contradictions"],
"default": "lookup",
"description": "Retrieval mode. 'lookup' (default): fast hybrid search — use for plain recall. 'reason': deep cognitive reasoning across memories (FSRS-6 trust scoring, spreading activation, supersession, contradiction analysis) — use when accuracy matters; needs 'query'. 'contradictions': surface trust-weighted disagreement pairs for a 'topic' (or recent memories)."
}),
);
// reason (deep_reference) extra field.
props.insert(
"depth".to_string(),
serde_json::json!({
"type": "integer",
"description": "[reason mode] How many memories to analyze (default 20, max 50).",
"minimum": 5, "maximum": 50
}),
);
// contradictions extra fields.
props.insert(
"topic".to_string(),
serde_json::json!({
"type": "string",
"description": "[contradictions mode] Topic to scope contradiction detection. If omitted, scans recent memories."
}),
);
props.insert(
"since".to_string(),
serde_json::json!({
"type": "string",
"description": "[contradictions mode] RFC3339 timestamp; only memories updated after this are considered."
}),
);
props.insert(
"min_trust".to_string(),
serde_json::json!({
"type": "number",
"minimum": 0.0, "maximum": 1.0,
"description": "[contradictions mode] Minimum trust for both sides of a contradiction (default 0.3)."
}),
);
}
}
schema
}
/// Unified dispatcher for `recall`. Routes on `mode` (default `lookup`).
///
/// HOT-PATH INVARIANT: `mode` absent ⇒ `lookup` ⇒ direct pass-through to
/// `search_unified::execute`, no extra work.
pub async fn execute(
    storage: &Arc<Storage>,
    cognitive: &Arc<Mutex<CognitiveEngine>>,
    output_config: &OutputConfig,
    args: Option<Value>,
) -> Result<Value, String> {
    let mode = args
        .as_ref()
        .and_then(|a| a.get("mode"))
        .and_then(|v| v.as_str())
        .unwrap_or("lookup");

    match mode {
        // Zero-overhead default: straight to hybrid search.
        "lookup" => super::search_unified::execute(storage, cognitive, output_config, args).await,
        // Deep reasoning (deep_reference / cross_reference share this handler).
        "reason" => super::cross_reference::execute(storage, cognitive, args).await,
        // Trust-weighted contradiction pairs (storage-only).
        "contradictions" => super::contradictions::execute(storage, args).await,
        other => Err(format!(
            "Unknown recall mode '{other}'. Use lookup|reason|contradictions."
        )),
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_schema_has_mode_and_no_required() {
        let s = schema();
        let modes = s["properties"]["mode"]["enum"].as_array().unwrap();
        assert_eq!(modes.len(), 3);
        assert_eq!(s["properties"]["mode"]["default"], "lookup");
        // query must NOT be globally required (contradictions uses topic).
        assert!(
            s.get("required").is_none(),
            "recall must not globally require 'query'"
        );
        // lookup params carried over from search schema.
        assert!(s["properties"]["limit"].is_object());
        assert!(s["properties"]["detail_level"].is_object());
    }

    #[tokio::test]
    async fn test_lookup_is_default_and_resolves() {
        let dir = tempfile::TempDir::new().unwrap();
        let storage = Arc::new(Storage::new(Some(dir.path().join("test.db"))).unwrap());
        let cognitive = Arc::new(Mutex::new(CognitiveEngine::new()));
        let oc = OutputConfig::default();
        // No mode → lookup → behaves like search (query required by search).
        let args = Some(serde_json::json!({ "query": "anything" }));
        let r = execute(&storage, &cognitive, &oc, args).await;
        assert!(r.is_ok(), "default lookup should resolve: {r:?}");
    }

    #[tokio::test]
    async fn test_contradictions_mode_resolves_without_query() {
        let dir = tempfile::TempDir::new().unwrap();
        let storage = Arc::new(Storage::new(Some(dir.path().join("test.db"))).unwrap());
        let cognitive = Arc::new(Mutex::new(CognitiveEngine::new()));
        let oc = OutputConfig::default();
        // contradictions uses topic, not query — must resolve with no query.
        let args = Some(serde_json::json!({ "mode": "contradictions" }));
        let r = execute(&storage, &cognitive, &oc, args).await;
        assert!(r.is_ok(), "contradictions mode should resolve: {r:?}");
    }
}

View file

@ -0,0 +1,135 @@
# Tool Consolidation v2.2.0
> Reduce the Vestige MCP tool surface so an agent can reliably pick the right
> tool, then make the few always-on tools deterministic. Two layers: Layer 1
> (this release) collapses 34 advertised tools to 12; Layer 2 (follow-up) shrinks
> the *default* surface and enforces the memory loop with hooks.
## Why (frontier evidence)
More advertised tools actively degrade tool selection — the 30 tools an agent
ignores make the 5 it uses harder to choose:
- **RAG-MCP** (arXiv 2505.03275): selection accuracy collapses 43% → 14% when the
full tool catalog is dumped into context; stays >90% under ~30 tools.
- **Anthropic tool-deferral**: deferring tool schemas moved Opus 4 from 49% → 74%
on a tool-heavy benchmark.
- **GitHub Copilot**: cutting 40 → 13 tools gave +25pp accuracy and 400ms lower latency.
- **OpenAI** guidance: aim for <20 functions visible at the start of a turn.
- **RoTBench** (arXiv 2401.08326): tool *names* are load-bearing; renaming drops GPT-4
80 → 58. So renames are deliberate and every old name keeps working.

Vestige had **34** advertised tools. This is the correction.
## Layer 1 — Count reduction (THIS RELEASE): 34 → 12 advertised
Principle: **one consolidation per commit, one change per submission.** Each
consolidation is its own commit, landed in a safe order with the hot retrieval
path touched last. Every old tool name remains a hidden `warn!` + redirect alias
for at least one minor release (so existing `.mcp.json` configs, hooks, and agent
habits keep working) and is removed in **v2.3.0**.
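
The alias behavior described above can be sketched as a lookup that maps a deprecated name to its consolidated tool plus a selector. This is a minimal illustration, not the dispatch code in `server.rs`: `Redirect`, `resolve_alias`, and `effective_tool` are hypothetical names, `eprintln!` stands in for `warn!`, and the table covers only the mappings this document states outright (`search`, `deep_reference`, `cross_reference`, `contradictions`, `session_context`).

```rust
/// Where a deprecated tool name should be routed (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct Redirect {
    /// Consolidated tool that now owns the behavior.
    pub tool: &'static str,
    /// Sub-action selector as (parameter, value); `None` for a plain rename.
    pub selector: Option<(&'static str, &'static str)>,
}

/// Partial alias table: only the mappings the document spells out.
pub fn resolve_alias(name: &str) -> Option<Redirect> {
    let (tool, selector) = match name {
        "search" => ("recall", Some(("mode", "lookup"))),
        "deep_reference" | "cross_reference" => ("recall", Some(("mode", "reason"))),
        "contradictions" => ("recall", Some(("mode", "contradictions"))),
        "session_context" => ("session_start", None),
        _ => return None,
    };
    Some(Redirect { tool, selector })
}

/// Returns the tool that should actually run, warning when `name` is deprecated.
pub fn effective_tool(name: &str) -> &str {
    match resolve_alias(name) {
        Some(redirect) => {
            // Stand-in for the `warn!` call mentioned in the text.
            eprintln!(
                "tool '{name}' is deprecated; use '{}' (removed in v2.3.0)",
                redirect.tool
            );
            redirect.tool
        }
        // Current names pass through untouched.
        None => name,
    }
}
```

Keeping the table as data rather than scattering redirects through the dispatcher would make the v2.3.0 removal a single deletion, though the real layout may differ.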
### Safe order (as committed)
| # | Commit | Folds | Into | Count |
|---|--------|-------|------|------:|
| 1 | `dedup` | find_duplicates + merge_candidates + plan_merge + plan_supersede + apply_plan + merge_undo + protect + merge_policy (8) | `dedup` | 34 → 27 |
| 2 | `session_start` | session_context (rename) | `session_start` | 27 |
| 3a | `memory_status` | system_status + memory_health + memory_timeline + memory_changelog (4) | `memory_status` | 27 → 24 |
| 3b | `graph` | explore_connections + predict + memory_graph + composed_graph (4) | `graph` | 24 → 21 |
| 4 | `maintain` | consolidate + dream + gc + importance_score + backup + export + restore (7) | `maintain` | 21 → 15 |
| 5 | `recall` | search + deep_reference + cross_reference + contradictions (4) | `recall` | 15 → 12 |

`recall` is committed **last** because it is the hot retrieval path.
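
The Count column follows from one rule: folding `n` old tools into a single consolidated tool removes `n - 1` advertised names, and a rename (`n = 1`) removes none. A small sketch of that arithmetic, where `counts_after_folds` is a hypothetical helper written only for this check:

```rust
/// Advertised-tool count after each fold, applied in order.
/// Each entry in `folds` is the number of old tools merged into one new tool;
/// a rename counts as 1 and leaves the total unchanged.
pub fn counts_after_folds(start: u32, folds: &[u32]) -> Vec<u32> {
    let mut count = start;
    folds
        .iter()
        .map(|&n| {
            // n old names disappear, 1 consolidated name appears.
            count = count.saturating_sub(n.saturating_sub(1));
            count
        })
        .collect()
}
```

Calling `counts_after_folds(34, &[8, 1, 4, 4, 7, 4])` yields `[27, 27, 24, 21, 15, 12]`, matching rows 1 through 5 of the table.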
### Final advertised surface (12)
| Standalone (6) | Consolidated (6) |
|---|---|
| `smart_ingest` | `recall` |
| `memory` | `dedup` |
| `codebase` | `memory_status` |
| `intention` | `graph` |
| `source_sync` | `maintain` |
| `suppress` | `session_start` |
### Action / mode / view maps
- **`recall`** — `mode`: `lookup` (default) · `reason` · `contradictions`
- **`dedup`** — `action`: `scan` (default) · `plan_merge` · `plan_supersede` · `apply` · `undo` · `protect` · `policy`
- **`memory_status`** — `view`: `health` (default) · `retention` · `timeline` · `changelog`
- **`graph`** — `action`: `chain` · `associations` · `bridges` · `predict` · `memory_graph` · `recent` · `get` · `memory` · `neighbors` · `never_composed` · `bounty_mode` · `label`
- **`maintain`** — `action`: `consolidate` · `dream` · `gc` · `importance_score` · `backup` · `export` · `restore`
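
As an illustration of how one of these maps behaves, the `recall` mode selector can be modeled as an enum whose default is `lookup`. This is a typed sketch under assumptions, not how the dispatcher is written (it matches on the raw string); `RecallMode` and `parse_mode` are hypothetical names, and only the three mode values and the error wording come from the source.

```rust
use std::str::FromStr;

/// The three `recall` modes; `Lookup` is the default when `mode` is absent.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum RecallMode {
    #[default]
    Lookup,
    Reason,
    Contradictions,
}

impl FromStr for RecallMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "lookup" => Ok(Self::Lookup),
            "reason" => Ok(Self::Reason),
            "contradictions" => Ok(Self::Contradictions),
            other => Err(format!(
                "Unknown recall mode '{other}'. Use lookup|reason|contradictions."
            )),
        }
    }
}

/// Absent `mode` resolves to the default; a present value must be a known mode.
pub fn parse_mode(raw: Option<&str>) -> Result<RecallMode, String> {
    match raw {
        None => Ok(RecallMode::default()),
        Some(s) => s.parse(),
    }
}
```

The same shape would apply to the `action` and `view` selectors: one flat enum per tool, with the documented default applied when the parameter is omitted.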
### Resolved design decisions
- **`search` is folded, not kept standalone.** `recall` with no `mode` (the
default) *is* search — a zero-overhead pass-through to `search_unified`. Keeping
both `search` and `recall` advertised would be the exact RAG-MCP anti-pattern.
Final count is a clean **12**, leaving 2 slots of headroom toward a future
always-on `save` surface rather than spending them on a redundant verb.
- **`graph` actions are flat peers, not nested.** The `chain` / `associations` /
`bridges` actions from `explore_connections` sit alongside the `predict` /
`memory_graph` / `composed_graph` actions in a single `action` enum, matching the
existing `memory` / `codebase` flat-action convention and avoiding a translation layer.
### Invariants preserved (with the test that proves each)
- **bitemporal-never-delete** (`dedup`): plan → apply → undo, confirm-gating, and
invalidation-not-deletion delegate to `merge::execute` verbatim.
- **`system_status` response shape** (`memory_status` view=`health`): byte-for-byte
identical, checked by `test_default_view_is_health`.
- **`gc` dry-run default** + **`restore` path-confinement** (`maintain`):
`test_maintain_actions_and_safety`.
- **`recall` lookup = search, no reasoning cost** (hot path):
`test_recall_lookup_matches_search_shape`.
- **Dashboard events** (consolidate/dream/importance_score Started + Completed,
SearchPerformed): preserved by re-emitting in the new dispatch arms and by
`emit_tool_event` normalizing the unified tool name to its effective sub-action.
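
To make the two `maintain` safety invariants concrete, here is a minimal sketch of what "dry-run by default" and "path confinement" can look like. Everything in it is an assumption: the `dry_run` parameter name, the helper names, and the choice of a purely lexical check (it rejects `..`, absolute paths, and drive prefixes, and does not resolve symlinks). The actual `maintain` implementation may enforce these differently.

```rust
use std::path::{Component, Path, PathBuf};

/// `gc` only deletes when the caller explicitly passes `dry_run = false`
/// (parameter name is hypothetical).
pub fn gc_is_dry_run(dry_run: Option<bool>) -> bool {
    dry_run.unwrap_or(true)
}

/// Joins `requested` onto `base`, refusing anything that could escape `base`.
/// Lexical only: symlinks inside `base` are not resolved here.
pub fn confine(base: &Path, requested: &Path) -> Option<PathBuf> {
    let mut out = base.to_path_buf();
    for component in requested.components() {
        match component {
            Component::Normal(part) => out.push(part),
            Component::CurDir => {}
            // `..`, a leading `/`, or a drive prefix would leave `base`.
            Component::ParentDir | Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(out)
}
```

Rejecting `..` outright is stricter than normalizing it, which keeps the check easy to reason about at the cost of refusing some harmless paths.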
### Result-size annotations (moved with their tools)
`memory_timeline` (200k) → `memory_status`; `search` (300k) → `recall`; new
`dedup` 150k and `graph` 250k. Kept in sync across the annotation loop, the
`expected_max_result_size` helper, and both annotation guard tests.
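
A sketch of the size lookup for the four tools named above. The signature is hypothetical (the real `expected_max_result_size` is not shown here), "k" is read as thousands with the unit left unspecified as in the source, and tools not mentioned in this section are simply absent from the table.

```rust
/// Tools whose result-size annotation is mentioned in this section.
pub const ANNOTATED_TOOLS: [&str; 4] = ["memory_status", "recall", "dedup", "graph"];

/// Hypothetical shape for the helper: maximum expected result size per tool.
pub fn expected_max_result_size(tool: &str) -> Option<usize> {
    match tool {
        "memory_status" => Some(200_000), // moved from `memory_timeline`
        "recall" => Some(300_000),        // moved from `search`
        "dedup" => Some(150_000),         // new in v2.2.0
        "graph" => Some(250_000),         // new in v2.2.0
        _ => None,
    }
}
```

A guard test in this style would iterate `ANNOTATED_TOOLS` and fail if any name lacks a size, which is one way to keep the annotation list and the helper from drifting apart.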
### Deprecation timeline
Aliases `warn!` in v2.2.x and are hard-removed in **v2.3.0**. Full alias list (31
names) lives in the dispatch redirects in `crates/vestige-mcp/src/server.rs`.
## Layer 2 — Default-surface + hooks (FOLLOW-UP, NOT in v2.2.0)
Count reduction is necessary but not sufficient: what matters most is how few
tools are visible *at the start of a turn*, plus making the memory loop fire
deterministically instead of hoping the model remembers.
- **Tiny always-on surface (~3)**: `recall` @ session start, `save` (=`smart_ingest`)
@ session end, `recall` on-demand for facts. Everything else (`dedup`, `graph`,
`maintain`, `memory_status`, …) deferred off the default surface, loaded on
demand.
- **Deterministic hooks**: a `SessionStart` hook fires `recall`; a `Stop` hook
fires `save` (async, fire-and-forget — synchronous heavy work in `Stop` causes
loops + per-turn lag). "If the model fails to save, it's gone" — move save out
of the model hot loop.
- This is what turns 12-advertised into ~3-default. Status: **design guidance
only; no code in v2.2.0.**
## Verification
Per-commit gates (all green for every commit):
```sh
cargo test --workspace --no-fail-fast
cargo clippy --workspace -- -D warnings
```
Release gates before tagging v2.2.0:
```sh
pnpm --filter @vestige/dashboard check
pnpm --filter @vestige/dashboard build
```
Plus a `tools/list` smoke check asserting exactly **12** advertised names
(`test_tools_list_returns_all_tools`).
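
The smoke check amounts to comparing the advertised names against a fixed set. The sketch below shows that comparison on plain string slices; `ADVERTISED_TOOLS` and `check_advertised` are hypothetical names, the twelve entries are taken from the final-surface table in this document, and the real test presumably inspects an actual `tools/list` response instead.

```rust
use std::collections::BTreeSet;

/// The 12 advertised names from the final-surface table.
pub const ADVERTISED_TOOLS: [&str; 12] = [
    "smart_ingest", "memory", "codebase", "intention", "source_sync", "suppress",
    "recall", "dedup", "memory_status", "graph", "maintain", "session_start",
];

/// Ok when `names` is exactly the expected set with no duplicates.
pub fn check_advertised(names: &[&str]) -> Result<(), String> {
    let expected: BTreeSet<&str> = ADVERTISED_TOOLS.iter().copied().collect();
    let actual: BTreeSet<&str> = names.iter().copied().collect();
    if actual.len() != names.len() {
        return Err("duplicate tool names advertised".to_string());
    }
    if actual != expected {
        let missing: Vec<_> = expected.difference(&actual).collect();
        let unexpected: Vec<_> = actual.difference(&expected).collect();
        return Err(format!("missing: {missing:?}, unexpected: {unexpected:?}"));
    }
    Ok(())
}
```

Comparing sets rather than just counting names means a deprecated alias leaking back onto the advertised surface fails the check even if the total stays at 12.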