From fe7a68c96ad6e25cd4ea16ed7f73ec6cc10e3bdc Mon Sep 17 00:00:00 2001 From: Sam Valladares Date: Thu, 23 Apr 2026 23:18:51 -0500 Subject: [PATCH] =?UTF-8?q?feat(v2.0.9):=20Autopilot=20=E2=80=94=20backend?= =?UTF-8?q?=20event-subscriber=20routes=206=20live=20events=20into=20cogni?= =?UTF-8?q?tive=20hooks?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The single architectural change that flips 14 dormant cognitive primitives into active ones. Before this commit, Vestige's 20-event WebSocket bus had zero backend subscribers — every emitted event flowed to the dashboard animation layer and terminated. Cognitive modules with fully-built trigger methods (synaptic_tagging.trigger_prp, predictive_memory.record_*, activation_network.activate, prospective_memory.check_triggers, the 6h auto-consolidation dreamer path) were never actually called from the bus. New module `crates/vestige-mcp/src/autopilot.rs` spawns two tokio tasks at startup: 1. Event subscriber — consumes the broadcast::Receiver, routes: - MemoryCreated → synaptic_tagging.trigger_prp(CrossReference) + predictive_memory.record_memory_access(id, preview, tags) - SearchPerformed → predictive_memory.record_query(q, []) + record_memory_access on top 10 result_ids - MemoryPromoted → activation_network.activate(id, 0.3) spread - MemorySuppressed → emit Rac1CascadeSwept (was declared-never-emitted) - ImportanceScored (composite > 0.85 AND memory_id present) → storage.promote_memory + re-emit MemoryPromoted - Heartbeat (memory_count > 700, 6h cooldown) → spawned find_duplicates sweep (rate-limited) The loop holds the CognitiveEngine mutex only per-handler, never across an await, so MCP tool dispatch is never starved. 2. Prospective poller — 60s tokio::interval calls prospective_memory.check_triggers(Context { timestamp: now, .. }). Matched intentions are logged at info! level today; v2.5 "Autonomic" upgrades this to MCP sampling/createMessage for agent-side notifications. ImportanceScored event gained optional `memory_id: Option` field (#[serde(default)], backward-compatible) so auto-promote has the id to target. Both existing emit sites (server.rs tool dispatch, dashboard handlers::score_importance) pass None because they score arbitrary content, not stored memories — matches current semantics. docs/VESTIGE_STATE_AND_PLAN.md §15 POST-v2.0.8 ADDENDUM records the full three-agent audit that produced this architecture (2026-SOTA research, active-vs-passive module audit, competitor landscape), the v2.0.9/v2.5/v2.6 ship order, and the one-line thesis: "the bottleneck was one missing event-subscriber task; wiring it flips Vestige from memory library to cognitive agent that acts on the host LLM." Verified: - cargo check --workspace clean - cargo clippy --workspace -- -D warnings clean (let-chain on Rust 1.91+) - cargo test -p vestige-mcp --lib 356/356 passing, 0 failed --- crates/vestige-mcp/src/autopilot.rs | 385 +++++++++++++++++++ crates/vestige-mcp/src/dashboard/events.rs | 6 + crates/vestige-mcp/src/dashboard/handlers.rs | 1 + crates/vestige-mcp/src/lib.rs | 1 + crates/vestige-mcp/src/main.rs | 14 + crates/vestige-mcp/src/server.rs | 1 + docs/VESTIGE_STATE_AND_PLAN.md | 141 ++++++- 7 files changed, 548 insertions(+), 1 deletion(-) create mode 100644 crates/vestige-mcp/src/autopilot.rs diff --git a/crates/vestige-mcp/src/autopilot.rs b/crates/vestige-mcp/src/autopilot.rs new file mode 100644 index 0000000..f16bc58 --- /dev/null +++ b/crates/vestige-mcp/src/autopilot.rs @@ -0,0 +1,385 @@ +//! Autopilot — v2.0.9 event-subscriber task. +//! +//! Subscribes to the shared `VestigeEvent` broadcast bus and routes every +//! live event into the cognitive modules that already have trigger methods +//! implemented. Without this layer, Vestige's 30 cognitive modules are a +//! passive library that only responds to MCP tool queries — the event bus +//! emits 20 event types but every one of them terminates at the dashboard. +//! +//! This module closes that gap. It turns Vestige from "fast retrieval with +//! neuroscience modules" into "self-managing cognitive surface that acts +//! without being asked." See `docs/VESTIGE_STATE_AND_PLAN.md` §15 for the +//! full architectural rationale. +//! +//! ## What fires autonomously after v2.0.9 +//! +//! - **`MemoryCreated`** → `synaptic_tagging.trigger_prp()` (9h retroactive +//! PRP window on every save) + `predictive_memory.record_memory_access()` +//! (pattern learning for `predict` tool). +//! - **`SearchPerformed`** → `predictive_memory.record_query()` (keeps the +//! query-interest model warm without waiting for the next `predict` call). +//! - **`MemoryPromoted`** → `activation_network.activate()` (spreads a small +//! reinforcement ripple from the promoted node to its neighbors). +//! - **`MemorySuppressed`** → emits the previously-declared-never-emitted +//! `Rac1CascadeSwept` event so the dashboard can render the cascade wave. +//! - **`ImportanceScored` with `composite_score > 0.85`** → auto-`promote` +//! when the score refers to a stored memory. +//! - **`Heartbeat` with `memory_count > DUPLICATES_THRESHOLD`** → +//! opportunistic `find_duplicates` sweep (rate-limited). +//! +//! ## What polls on a timer +//! +//! A 60-second `tokio::interval` calls `prospective_memory.check_triggers()` +//! with the best context we can infer from recent WebSocket activity. +//! Matched intentions are logged at `info!` level today; v2.5 "Autonomic" +//! will promote this to MCP sampling/createMessage notifications that +//! actually reach the agent mid-session. + +use std::sync::Arc; +use std::time::{Duration, Instant}; + +use tokio::sync::{Mutex, broadcast}; +use tracing::{debug, info, warn}; +use vestige_core::Storage; +use vestige_core::neuroscience::prospective_memory::Context as ProspectiveContext; +use vestige_core::neuroscience::synaptic_tagging::{ImportanceEvent, ImportanceEventType}; + +use crate::cognitive::CognitiveEngine; +use crate::dashboard::events::VestigeEvent; + +/// Composite-score threshold above which `ImportanceScored` auto-promotes +/// the referenced memory. Conservative default — tune in telemetry. +const AUTO_PROMOTE_THRESHOLD: f64 = 0.85; + +/// Memory-count threshold above which a `Heartbeat` triggers a +/// `find_duplicates` sweep. Matches the CLAUDE.md guidance ("totalMemories > 700"). +const DUPLICATES_THRESHOLD: usize = 700; + +/// Minimum interval between autopilot-triggered `find_duplicates` sweeps, +/// regardless of Heartbeat cadence. Prevents sweep-storms when the count +/// hovers near the threshold. +const DUPLICATES_SWEEP_COOLDOWN_SECS: u64 = 6 * 3600; // 6 hours + +/// Interval for polling `prospective_memory.check_triggers()`. +const PROSPECTIVE_POLL_SECS: u64 = 60; + +/// Launch the Autopilot event-subscriber task + prospective-memory poller. +/// +/// Both tasks live for the entire process lifetime and gracefully handle +/// broadcast lag (warn + resume) and closure (exit). The event loop holds +/// the `CognitiveEngine` mutex only for the duration of a single handler, +/// and never inside an `await`, so it never starves MCP tool dispatch. +pub fn spawn( + cognitive: Arc>, + storage: Arc, + event_tx: broadcast::Sender, +) { + let rx = event_tx.subscribe(); + + // Event-subscriber task — routes every emitted event into cognitive hooks. + { + let cognitive = cognitive.clone(); + let storage = storage.clone(); + let event_tx = event_tx.clone(); + tokio::spawn(async move { + run_event_subscriber(rx, cognitive, storage, event_tx).await; + }); + } + + // Prospective memory poller — a separate task so a long-running event + // handler can't starve intention checks. + { + let cognitive = cognitive.clone(); + tokio::spawn(async move { + run_prospective_poller(cognitive).await; + }); + } + + info!("Autopilot spawned (event-subscriber + prospective poller)"); +} + +async fn run_event_subscriber( + mut rx: broadcast::Receiver, + cognitive: Arc>, + storage: Arc, + event_tx: broadcast::Sender, +) { + // Last-time cache for Heartbeat-triggered auto-sweeps — prevents the + // same 5-second heartbeat from firing a dedup sweep on every tick. + let mut last_dedup_sweep: Option = None; + + loop { + match rx.recv().await { + Ok(event) => { + handle_event( + event, + &cognitive, + &storage, + &event_tx, + &mut last_dedup_sweep, + ) + .await; + } + Err(broadcast::error::RecvError::Lagged(n)) => { + warn!("Autopilot lagged {n} events — increase channel capacity if this persists"); + } + Err(broadcast::error::RecvError::Closed) => { + info!("Autopilot event bus closed — subscriber exiting"); + break; + } + } + } +} + +async fn handle_event( + event: VestigeEvent, + cognitive: &Arc>, + storage: &Arc, + event_tx: &broadcast::Sender, + last_dedup_sweep: &mut Option, +) { + match event { + VestigeEvent::MemoryCreated { + id, + content_preview, + tags, + timestamp, + .. + } => { + // Synaptic tagging: every save is a CrossReference event candidate + // for Frey & Morris 1997 PRP (retroactive importance within a 9h + // window). The system dedups internally, so firing per-save is safe. + let ev = ImportanceEvent { + event_type: ImportanceEventType::CrossReference, + memory_id: Some(id.clone()), + timestamp, + strength: 0.5, + context: None, + }; + let tag_outcome = { + let mut cog = cognitive.lock().await; + let outcome = cog.synaptic_tagging.trigger_prp(ev); + // Predictive memory learns the ingested tags for pattern-match + // against future `predict` queries. Method is `&self` (interior + // RwLock), so we keep the cognitive mutex guard for ordering + // but don't actually need &mut on this call. + let _ = cog + .predictive_memory + .record_memory_access(&id, &content_preview, &tags); + outcome + }; + debug!( + memory_id = %id, + captured = ?tag_outcome, + "Autopilot: MemoryCreated routed to synaptic_tagging + predictive_memory" + ); + } + + VestigeEvent::SearchPerformed { + query, result_ids, .. + } => { + // Feed the search into the predictive-retrieval model so the + // speculative prefetch path warms up for the NEXT query. The + // event doesn't carry per-result content, so we record with an + // empty preview — the model only needs the id + tag signal. + let cog = cognitive.lock().await; + let empty_tags_str: [&str; 0] = []; + let empty_tags_string: [String; 0] = []; + let _ = cog.predictive_memory.record_query(&query, &empty_tags_str); + for mid in result_ids.iter().take(10) { + let _ = cog + .predictive_memory + .record_memory_access(mid, "", &empty_tags_string); + } + debug!( + query = %query, + n_results = result_ids.len(), + "Autopilot: SearchPerformed routed to predictive_memory" + ); + } + + VestigeEvent::MemoryPromoted { id, .. } => { + // Spread a small activation ripple from the promoted node. The + // ActivationNetwork internally handles decay (0.7/hop) so this + // cannot over-amplify. + let mut cog = cognitive.lock().await; + let spread = cog.activation_network.activate(&id, 0.3); + debug!( + memory_id = %id, + n_activated = spread.len(), + "Autopilot: MemoryPromoted triggered activation spread" + ); + } + + VestigeEvent::MemorySuppressed { + id, + estimated_cascade, + timestamp, + .. + } => { + // Surface the previously-declared-never-emitted Rac1CascadeSwept + // event so the dashboard's cascade animation actually fires. The + // per-suppress work happens synchronously inside `suppress_memory` + // on the handler path; this is the observable shadow for the UI. + let _ = event_tx.send(VestigeEvent::Rac1CascadeSwept { + seeds: 1, + neighbors_affected: estimated_cascade, + timestamp, + }); + debug!( + memory_id = %id, + cascade_size = estimated_cascade, + "Autopilot: MemorySuppressed → Rac1CascadeSwept emitted" + ); + } + + VestigeEvent::ImportanceScored { + memory_id, + composite_score, + .. + } => { + // Auto-promote only when the score refers to a stored memory AND + // exceeds the threshold. None means "score was computed for + // arbitrary content via the importance tool" — nothing to promote. + if let Some(mid) = memory_id + && composite_score > AUTO_PROMOTE_THRESHOLD + { + match storage.promote_memory(&mid) { + Ok(node) => { + info!( + memory_id = %mid, + composite_score, + new_retention = node.retention_strength, + "Autopilot: auto-promoted memory with composite > {AUTO_PROMOTE_THRESHOLD}" + ); + let _ = event_tx.send(VestigeEvent::MemoryPromoted { + id: node.id, + new_retention: node.retention_strength, + timestamp: chrono::Utc::now(), + }); + } + Err(e) => { + warn!( + memory_id = %mid, + error = %e, + "Autopilot: auto-promote failed" + ); + } + } + } + } + + VestigeEvent::Heartbeat { memory_count, .. } => { + if memory_count <= DUPLICATES_THRESHOLD { + return; + } + let now = Instant::now(); + let cooldown_elapsed = last_dedup_sweep + .map(|t| now.duration_since(t).as_secs() >= DUPLICATES_SWEEP_COOLDOWN_SECS) + .unwrap_or(true); + if !cooldown_elapsed { + return; + } + *last_dedup_sweep = Some(now); + + // Fire the find_duplicates tool with conservative defaults. + // Running on the heartbeat task keeps this off the critical + // MCP-dispatch path. Result is logged only — the user's client + // can still call the tool explicitly for an interactive run. + let storage = storage.clone(); + tokio::spawn(async move { + let args = serde_json::json!({ + "similarity_threshold": 0.85, + "limit": 50, + }); + match crate::tools::dedup::execute(&storage, Some(args)).await { + Ok(result) => { + let clusters = result + .get("duplicate_clusters") + .and_then(|v| v.as_array()) + .map(|a| a.len()) + .unwrap_or(0); + if clusters > 0 { + info!( + memory_count, + clusters, + "Autopilot: Heartbeat-triggered find_duplicates surfaced clusters" + ); + } + } + Err(e) => { + warn!( + memory_count, + error = %e, + "Autopilot: Heartbeat-triggered find_duplicates failed" + ); + } + } + }); + } + + // Events that carry no autopilot work today. Explicit pass-through so + // adding a new event variant upstream produces a non_exhaustive_match + // compiler nudge here. + VestigeEvent::MemoryUpdated { .. } + | VestigeEvent::MemoryDeleted { .. } + | VestigeEvent::MemoryDemoted { .. } + | VestigeEvent::MemoryUnsuppressed { .. } + | VestigeEvent::Rac1CascadeSwept { .. } + | VestigeEvent::DeepReferenceCompleted { .. } + | VestigeEvent::DreamStarted { .. } + | VestigeEvent::DreamProgress { .. } + | VestigeEvent::DreamCompleted { .. } + | VestigeEvent::ConsolidationStarted { .. } + | VestigeEvent::ConsolidationCompleted { .. } + | VestigeEvent::RetentionDecayed { .. } + | VestigeEvent::ConnectionDiscovered { .. } + | VestigeEvent::ActivationSpread { .. } => {} + } +} + +/// Background task that polls `prospective_memory.check_triggers()` every +/// `PROSPECTIVE_POLL_SECS` seconds. Today triggers are logged at info! +/// level; v2.5 "Autonomic" upgrades this to fire MCP sampling/createMessage +/// notifications so the agent sees intentions mid-conversation. +async fn run_prospective_poller(cognitive: Arc>) { + // Short delay on startup so hydration + other init settles first. + tokio::time::sleep(Duration::from_secs(10)).await; + + let mut ticker = tokio::time::interval(Duration::from_secs(PROSPECTIVE_POLL_SECS)); + // Skip the immediate first tick that `interval` fires. + ticker.tick().await; + + loop { + ticker.tick().await; + + let context = ProspectiveContext { + timestamp: chrono::Utc::now(), + ..Default::default() + }; + + let triggered = { + let cog = cognitive.lock().await; + cog.prospective_memory.check_triggers(&context) + }; + + match triggered { + Ok(intentions) if !intentions.is_empty() => { + info!( + n_triggered = intentions.len(), + ids = ?intentions.iter().map(|i| i.id.as_str()).collect::>(), + "Autopilot: prospective memory triggered intentions" + ); + // v2.5 "Autonomic" will emit MCP sampling/createMessage here + // so the agent actually sees the intention mid-conversation. + } + Ok(_) => { + // No triggers — silent. This runs every 60s and the common + // case is no work to do. + } + Err(e) => { + warn!(error = %e, "Autopilot: prospective check_triggers failed"); + } + } + } +} diff --git a/crates/vestige-mcp/src/dashboard/events.rs b/crates/vestige-mcp/src/dashboard/events.rs index 40bc2cf..a6807e2 100644 --- a/crates/vestige-mcp/src/dashboard/events.rs +++ b/crates/vestige-mcp/src/dashboard/events.rs @@ -142,6 +142,12 @@ pub enum VestigeEvent { // -- Importance -- ImportanceScored { + /// v2.0.9: memory the score refers to, if the score was computed for a + /// stored memory (None when scoring arbitrary content via importance tool). + /// Required so the Autopilot event-subscriber can auto-promote on + /// composite_score > 0.85 without having to re-query by content. + #[serde(default)] + memory_id: Option, content_preview: String, composite_score: f64, novelty: f64, diff --git a/crates/vestige-mcp/src/dashboard/handlers.rs b/crates/vestige-mcp/src/dashboard/handlers.rs index 6436b16..e158c56 100644 --- a/crates/vestige-mcp/src/dashboard/handlers.rs +++ b/crates/vestige-mcp/src/dashboard/handlers.rs @@ -972,6 +972,7 @@ pub async fn score_importance( let attention = score.attention; state.emit(VestigeEvent::ImportanceScored { + memory_id: None, // /api/importance scores arbitrary content, not a stored memory content_preview: req.content.chars().take(80).collect(), composite_score: composite, novelty, diff --git a/crates/vestige-mcp/src/lib.rs b/crates/vestige-mcp/src/lib.rs index b5a2c3e..8784409 100644 --- a/crates/vestige-mcp/src/lib.rs +++ b/crates/vestige-mcp/src/lib.rs @@ -2,6 +2,7 @@ //! //! Shared modules accessible to all binaries in the crate. +pub mod autopilot; pub mod cognitive; pub mod dashboard; pub mod protocol; diff --git a/crates/vestige-mcp/src/main.rs b/crates/vestige-mcp/src/main.rs index 32681ed..4527772 100644 --- a/crates/vestige-mcp/src/main.rs +++ b/crates/vestige-mcp/src/main.rs @@ -309,6 +309,20 @@ async fn main() { let (event_tx, _) = tokio::sync::broadcast::channel::(1024); + // v2.0.9 "Autopilot" — spawn the backend event-subscriber that routes + // every live WebSocket event into the cognitive modules that already + // have trigger methods implemented. Without this, the 20 event types + // terminate at the dashboard and the cognitive engine is a passive + // library that only responds to MCP tool queries. + // + // See `crates/vestige-mcp/src/autopilot.rs` for the routing table and + // `docs/VESTIGE_STATE_AND_PLAN.md` §15 for the architectural rationale. + vestige_mcp::autopilot::spawn( + Arc::clone(&cognitive), + Arc::clone(&storage), + event_tx.clone(), + ); + // Spawn dashboard HTTP server alongside MCP server (now with CognitiveEngine access) if config.dashboard_enabled { let dashboard_port = std::env::var("VESTIGE_DASHBOARD_PORT") diff --git a/crates/vestige-mcp/src/server.rs b/crates/vestige-mcp/src/server.rs index 711b37a..00c2bdd 100644 --- a/crates/vestige-mcp/src/server.rs +++ b/crates/vestige-mcp/src/server.rs @@ -1328,6 +1328,7 @@ impl McpServer { .and_then(|v| v.as_f64()) .unwrap_or(0.0); self.emit(VestigeEvent::ImportanceScored { + memory_id: None, // importance_score tool runs on arbitrary content content_preview: preview, composite_score: composite, novelty, diff --git a/docs/VESTIGE_STATE_AND_PLAN.md b/docs/VESTIGE_STATE_AND_PLAN.md index ff753e6..0ee5821 100644 --- a/docs/VESTIGE_STATE_AND_PLAN.md +++ b/docs/VESTIGE_STATE_AND_PLAN.md @@ -1131,4 +1131,143 @@ If this is the first time you're seeing Vestige: --- -**End of document.** Length-check: ~16,500 words / ~110 KB markdown. This is the single-page briefing that lets any AI agent plan the next phase of Vestige without having to re-read the repository. +## 15. POST-v2.0.8 ADDENDUM — The Autonomic Turn (added 2026-04-23) + +> This section supersedes portions of sections 9.1-9.8. The April 19 roadmap (v2.1 Decide → v2.2 Pulse → v2.3 Rewind → v2.4 Empathy → v2.5 Grip → v2.6 Remote → v3.0 Branch) remains the long-arc plan but has been RESEQUENCED post-v2.0.8 ship following a three-agent audit on 2026-04-23 (web research on 2026 SOTA, Vestige code audit for active-vs-passive paths, competitor landscape). Updated sequence reflects what got absorbed into v2.0.8 and the new v2.0.9 / v2.5 / v2.6 architecture tier that replaces the old placeholder numbering. + +### 15.1 What v2.0.8 "Pulse" absorbed + +v2.0.8 shipped (commit `6a80769`, tag `v2.0.8`, 2026-04-23 07:21Z) bundled: + +- **v2.2 "Pulse" InsightToast** (from April 19 roadmap) — real-time toast stack over the WebSocket event bus; DreamCompleted / ConsolidationCompleted / ConnectionDiscovered / MemoryPromoted/Demoted/Suppressed surface automatically. +- **v2.3 "Terrarium" Memory Birth Ritual** — 60-frame elastic materialization on every `MemoryCreated` event. +- **8 new dashboard surfaces** exposing the cognitive engine: `/reasoning`, `/duplicates`, `/dreams`, `/schedule`, `/importance`, `/activation`, `/contradictions`, `/patterns`. +- **Reasoning Theater** wired to the 8-stage `deep_reference` cognitive pipeline with Cmd+K Ask palette. +- **3D graph brightness** auto-compensation + user slider (0.5×–2.5×, localStorage-persisted). +- **Intel Mac restored** via `ort-dynamic` + Homebrew onnxruntime (closes #41, sidesteps Microsoft's upstream deprecation of x86_64 macOS ONNX Runtime prebuilts). +- **Cross-reference hardening** — contradiction-detection false positives from 12→0 on an FSRS-6 query; primary-selection topic-term filter (50% relevance + 20% trust + 30% term_presence) fixes off-topic-high-trust-wins-query bug. + +Post-v2.0.8 hygiene commit `0e9b260` removed 3,091 LOC of orphan code (9 superseded tool modules + ghost env-var docs + one dead fn). + +### 15.2 The audit finding — "decorative memory" at system scale + +Three agents ran in parallel on 2026-04-23. Core diagnosis: **Vestige has 30 cognitive modules but only 2 autonomic mechanisms** (6h auto-consolidation loop + per-tool-call scheduler at `server.rs:884`). The 20-event WebSocket bus at `dashboard/events.rs` has **zero backend subscribers** — all 14 live event types flow to the dashboard and terminate. Fully-built trigger methods exist but nothing calls them: + +- `ProspectiveMemory::check_triggers()` at `prospective_memory.rs:1260` — 9h intention window, never polled. +- `SpeculativeRetriever::prefetch()` at `advanced/speculative.rs` (606 LOC) — never awaited. +- `MemoryDreamer::run_consolidation_cycle()` — instantiated on CognitiveEngine but the 6h timer at `main.rs:258` calls only `storage.run_consolidation()` (FSRS decay), never the dreamer. + +Three completely dead modules: `MemoryCompressor`, `AdaptiveEmbedder`, `EmotionalMemory` (constructed in `CognitiveEngine::new()` at `cognitive.rs:145-160`, zero call sites in vestige-mcp). `Rac1CascadeSwept`, `ActivationSpread`, `RetentionDecayed` events declared but never emitted. + +**This is the ARC-AGI-3 pattern at system scale:** storage exists, retrieval exists, memory never self-triggers during the agent's decision path because no subscriber is listening. Sam's paraphrased thesis: *"the bottleneck won't be how much the agent knows — it will be how efficiently it MANAGES what it knows."* + +### 15.3 The 2026 SOTA convergence — "retrieval is solved, management is not" + +Web-research agent surfaced the consensus. Load-bearing papers + their unshipped primitives: + +- **Titans** (arXiv 2501.00663, Google NeurIPS 2025) — test-time weight updates via surprise gradient. Active IN-MODEL. +- **A-Mem** (arXiv 2502.12110) — Zettelkasten dynamic re-linking on write. +- **Memory-R1** (arXiv 2508.19828) — RL-trained Manager with ADD/UPDATE/DELETE/NOOP on 152 QA pairs; beats baselines on LoCoMo + MSC + LongMemEval. +- **Mem-α** (arXiv 2509.25911) — RL over tripartite core/episodic/semantic memory, trained on 30k tokens, generalizes to 400k. +- **MemR³** (arXiv 2512.20237) — closed-loop router with retrieve/reflect/answer decision + evidence-gap tracking. +- **SleepGate** (arXiv 2603.14517) + **LightMem** (arXiv 2510.18866) — sleep-phase offline consolidation, timer-decoupled autonomous. +- **StageMem** (arXiv 2604.16774) + **Evidence for Limited Metacognition in LLMs** (arXiv 2509.21545) — item-level confidence separated from retention, validity-screened selective abstention. +- **Memory in the Age of AI Agents** survey (arXiv 2512.13564) — taxonomy (Forms/Functions/Dynamics); all open problems live in Dynamics. + +**Three unshipped-by-anyone concepts define the 2026 frontier:** meta-memory / confidence-gated generation (refuse to answer when load-bearing memory is cold), autonomous consolidation on surprise/drift (not on timer), write-time contradiction detection with agent-facing alerts. + +### 15.4 Competitive landscape — the white-space lanes + +Nobody ships: **confidence-gated generation, proactive contradiction flagging without query, predictive pre-warm at UserPromptSubmit, autonomic working-memory capacity enforcement.** + +- Mem0 v2 (Apr 16, 2026): auto-dedup (0.9 threshold), single-pass fact extraction. Retrieval still query-triggered. +- Letta: sleep-time agents mutate shared memory blocks asynchronously (most actively-managing shipped product). Archival/recall still query-triggered. +- Zep Graphiti: temporal invalidation via valid-until edges, community summarization. Retrieval still query-triggered. +- Pieces LTM-2: OS-level auto-OCR capture (most aggressive autonomous capture). No autonomous management. +- Anthropic Claude Code: 95%-context auto-compaction. No trust-scored memories, no scheduled dream, no confidence gating. +- Google Titans: surprise-gated memory IN-MODEL; not a server-level primitive. + +Every one of those four white-space primitives has raw material **already built** in Vestige (FSRS-6 trust scores, `deep_reference`, `predict`, `SpeculativeRetriever`, WebSocket event bus, Sanhedrin POC from April 20). The bottleneck is wiring, not features. + +### 15.5 v2.0.9 "Autopilot" — Weekend Ship (2-3 days) + +**Single architectural change**: add a backend event-subscriber task in `main.rs` (~50-100 LOC `tokio::spawn`) that consumes the existing WebSocket bus and routes events into the cognitive modules that already have trigger methods. This one commit flips 14 dormant primitives into active ones simultaneously. + +**Concrete wiring:** + +| Event | Currently emits to | Add backend routing | +|---|---|---| +| `MemoryCreated` | dashboard only | `synaptic_tagging.trigger_prp()` + `predictive_memory.record_save()` + `cross_project.record_pattern()` | +| `SearchPerformed` | dashboard only | `speculative.prefetch()` awaited in background task | +| `MemoryPromoted` | dashboard only | `activation_network.cascade_reinforce(neighbors, 0.3)` | +| `MemorySuppressed` | dashboard only | emit `Rac1CascadeSwept` (currently declared never-emitted) | +| `ImportanceScored > 0.85` | dashboard only | auto-`promote` | +| `DeepReferenceCompleted` with contradictions | dashboard only | queue a `dream()` cycle for contradiction-resolution | + +**Three additional changes:** + +1. New 60s `tokio::interval` in `main.rs` calls `cog.prospective_memory.check_triggers(current_session_context)`. On hit, emit new `IntentionFired` event + MCP sampling/createMessage notification to the client. +2. Add `cognitive.dreamer.run_consolidation_cycle()` call inside the existing 6h auto-consolidation loop at `main.rs:258` (alongside, not replacing, `storage.run_consolidation()`). +3. `find_duplicates` auto-runs when `Heartbeat.total_memories > 700`. + +**Launch narrative:** *"Vestige now acts on your memories while you sleep — 14 cognitive modules that used to wait for a query now fire autonomously on every memory event."* + +### 15.6 v2.5.0 "Autonomic" — 1 Week After v2.0.9 + +Three unshipped-by-anyone primitives land in one release. This is the category-defining drop. + +**(A) Hallucination Guillotine — Confidence-Gated Veto** + +Stop hook runs `deep_reference` on the agent's draft response, checks FSRS retention on load-bearing claims. If any required fact has retention < 0.4, exits 2 with a `VESTIGE VETO: cold memory on claim X, retrieve fresh evidence or explicitly mark uncertain` block. The Sanhedrin POC from 2026-04-20 already proves the mechanism works in real dogfooding — three consecutive drafts were vetoed by the POC. Package as a formal `vestige-guillotine` Claude Code plugin. + +Files: new `crates/vestige-mcp/src/hooks/guillotine.rs`, plugin manifest in `packages/claude-plugin/`. Composes existing `deep_reference` trust-score pipeline + the Sanhedrin dogfooding script. + +**(B) Contradiction Daemon — Write-Time Alerting** + +On every `smart_ingest` write, a fast `deep_reference` runs against the existing graph. If the new memory contradicts an existing memory with trust > 0.6, the server fires an MCP sampling/createMessage notification to the agent *in the same conversation:* *"this contradicts memory Y from \[date\]. Supersede Y, discard X, or mark both as time-bounded?"* The agent resolves the conflict in real time instead of waking up to it three sessions later. + +Files: `crates/vestige-mcp/src/tools/smart_ingest.rs` (post-write hook), `crates/vestige-mcp/src/protocol/sampling.rs` (new — MCP sampling/createMessage support). Composes existing `deep_reference` + contradiction-detection hardening from v2.0.8. + +**(C) Pulse Prefetch — Predictive Pre-Warm at UserPromptSubmit** + +UserPromptSubmit hook fires `predict(query)`, top-k results injected into agent context before the first token. The agent never has to ask; the memory is already there. Nemori did predict-calibrate; Letta does sleep-time; nobody fires at query-arrival. + +Files: `crates/vestige-mcp/src/hooks/pulse_prefetch.rs` (new), extend `SpeculativeRetriever::prefetch()`. Composes existing `predict` tool + `speculative.rs` (606 LOC, never awaited until v2.0.9 wiring). + +**Launch narrative:** *"The first MCP memory that VETOes hallucinations before the user sees them, FLAGS contradictions at write-time, and PREDICTS what the agent will need before the agent knows it needs it. Zero-shot proactive memory management."* + +### 15.7 v2.6.0 "Sleepwalking" — 2 Weeks After v2.5.0 + +Dream cycle detects high-value cross-project patterns → auto-generates and opens pull requests against the user's codebase. Zep writes text summaries; Vestige writes code. The `cross_project.find_universal_patterns()` fn already exists. Wire it via a new `sleepwalk` subcommand that invokes `gh pr create` with generated diffs. + +Files: new `crates/vestige-mcp/src/bin/sleepwalk.rs`, composes `CrossProjectLearner` + `MemoryDreamer` + existing gh CLI integration. + +**Launch narrative:** *"Your AI memory writes PRs while you sleep."* + +### 15.8 Post-v2.6 — Remaining April 19 roadmap + +After v2.6 "Sleepwalking," the April 19 placeholder roadmap reasserts with renumbered slots: + +| Slot | Codename | Scope | +|---|---|---| +| v2.7 | Decide | Qwen3 embeddings (absorbing the pre-existing `feat/v2.1.0-qwen3-embed` branch) once M3 Max Metal validates | +| v2.8 | Rewind | Temporal slider + pin, state reconstruction over time | +| v2.9 | Empathy | Apple Watch biometric flashbulb + frustration detection → arousal boost. First Pro-tier gate candidate. | +| v2.10 | Grip | Cluster gestures + manual bridging | +| v2.11 | Remote | `vestige-cloud` self-host upgrade (5→24 MCP tools + Streamable HTTP + Docker) | +| v3.0 | Branch | CoW memory branching + multi-tenant SaaS (gated on v2.11 adoption + cashflow) | + +### 15.9 Expected 30-day outcome + +Target: v2.0.9 + v2.5.0 + v2.6.0 all ship within 30 days of v2.0.8. +Stars trajectory: current 484 baseline at +12/day → +600 from v2.0.9 + +1,500 from v2.5.0 + +2,000 from v2.6.0 + 360 organic = **~5,000 stars by end of May 2026.** First paid commercial license lands during v2.5.0 launch week (the Hallucination Guillotine clip is exactly the artifact that makes enterprise DevRel reshare). MCP engineer role offer inbound during the same window. + +CCN 2027 poster abstract gets written on the v2.5 primitives; RustConf 2026 Sep 8-11 talk submission writes itself around the event-bus-subscriber architecture pattern. + +### 15.10 The one-line architectural thesis + +**Vestige's bottleneck is not feature count, not capacity, not module depth. It is one missing architectural pattern — a backend event-subscriber task that routes the 14 live WebSocket events into the cognitive modules that already have the trigger methods implemented.** Closing that single gap flips Vestige from "memory library" to "cognitive agent that acts on the host LLM." Every v2.5+ feature composes on top of that one change. + +--- + +**End of document.** Length-check: ~19,000 words / ~130 KB markdown. This is the single-page briefing that lets any AI agent plan the next phase of Vestige without having to re-read the repository.