diff --git a/docs/adr/0001-pluggable-storage-and-network-access.md b/docs/adr/0001-pluggable-storage-and-network-access.md new file mode 100644 index 0000000..c150c70 --- /dev/null +++ b/docs/adr/0001-pluggable-storage-and-network-access.md @@ -0,0 +1,303 @@ +# ADR 0001: Pluggable Storage Backend, Network Access, and Emergent Domains + +**Status**: Accepted +**Date**: 2026-04-21 +**Related**: [docs/prd/001-getting-centralized-vestige.md](../prd/001-getting-centralized-vestige.md) + +--- + +## Context + +Vestige v2.x runs as a per-machine local process: stdio MCP transport, SQLite + +FTS5 + USearch HNSW in `~/.vestige/`, fastembed locally for embeddings. This is +ideal for single-machine single-agent use but blocks three real needs: + +- **Multi-machine access** -- same memory brain from laptop, desktop, server +- **Multi-agent access** -- multiple AI clients against one store concurrently +- **Future federation** -- syncing memory between decentralized nodes (MOS / + Threefold grid) + +SQLite's single-writer model and lack of a native network protocol make it +unsuitable as a centralized server. PostgreSQL + pgvector collapses our three +storage layers (SQLite, FTS5, USearch) into one engine with MVCC concurrency, +auth, and replication. + +Separately, Vestige today has no notion of domain or project scope -- all memories +share one namespace. For a multi-machine brain, users want soft topical +boundaries ("dev", "infra", "home") without manual tenanting. HDBSCAN clustering +on embeddings produces these boundaries from the data itself. + +The PRD at `docs/prd/001-getting-centralized-vestige.md` sketches the full design. +This ADR records the architectural decisions and resolves the open questions from +that document. + +--- + +## Decision + +Introduce two new trait boundaries, a network transport layer, and a domain +classification module. All four changes ship in parallel phases. + +**Trait boundaries:** + +1. 
`MemoryStore` -- single trait covering CRUD, hybrid search, FSRS scheduling,
+   graph edges, and domains. One big trait, not four.
+2. `Embedder` -- separate trait for text-to-vector encoding. Storage never calls
+   fastembed directly. Callers (cognitive engine locally, HTTP server remotely)
+   compute embeddings and pass them into the store.
+
+**Backends:**
+
+- `SqliteMemoryStore` -- existing code refactored behind the trait, no behavior
+  change.
+- `PgMemoryStore` -- new, using sqlx + pgvector + tsvector. Selectable at runtime
+  via `vestige.toml`.
+
+**Network:**
+
+- MCP over Streamable HTTP on the existing Axum server.
+- API key auth middleware (blake3-hashed, stored in `api_keys` table).
+- Dashboard uses the same API keys for login, then signed session cookies for
+  subsequent requests.
+
+**Domain classification:**
+
+- HDBSCAN clustering over embeddings to discover domains automatically.
+- Soft multi-domain assignment -- raw similarity scores stored per memory, every
+  domain above a threshold is assigned.
+- Conservative drift handling -- propose splits/merges, never auto-apply.
+
+---
+
+## Architecture Overview
+
+### Component Breakdown
+
+1. **`Embedder` trait** (new module `crates/vestige-core/src/embedder/`)
+   - `async fn embed(&self, text: &str) -> Result<Vec<f32>>`
+   - `fn model_name(&self) -> &str`
+   - `fn dimension(&self) -> usize`
+   - Impls: `FastembedEmbedder` (local ONNX, today), future `JinaEmbedder`,
+     `OpenAiEmbedder`, etc.
+   - Stays pluggable forever -- no lock-in to fastembed or to nomic-embed-text.
+
+2. **`MemoryStore` trait** (new module `crates/vestige-core/src/storage/trait.rs`)
+   - One trait, ~25 methods across CRUD, search, FSRS, graph, domain sections.
+   - Uses `trait_variant::make` to generate a `Send`-bound variant for
+     `Arc<dyn MemoryStore>` in Axum/tokio contexts.
+   - The 29 cognitive modules operate exclusively through this trait. No direct
+     SQLite or Postgres access from the modules.
+
+3. 
**`SqliteMemoryStore`** (refactor of existing `crates/vestige-core/src/storage/sqlite.rs`)
+   - Existing rusqlite + FTS5 + USearch code, wrapped behind the trait.
+   - Add `domains TEXT[]` equivalent (JSON-encoded array column in SQLite).
+   - Add `domain_scores` JSON column.
+   - No behavioral change for current users.
+
+4. **`PgMemoryStore`** (new `crates/vestige-core/src/storage/postgres.rs`)
+   - `sqlx::PgPool` with compile-time checked queries.
+   - pgvector HNSW index for vector search, tsvector + GIN for FTS.
+   - Native array columns for `domains`, JSONB for `domain_scores` and `metadata`.
+   - Hybrid search via RRF (Reciprocal Rank Fusion) in a single SQL query.
+
+5. **Model registry**
+   - Per-database table `embedding_model` with `(name, dimension, hash, created_at)`.
+   - Both backends refuse writes from an embedder whose signature doesn't match
+     the registered row.
+   - Model swap = `vestige migrate --reembed --model=<name>`, O(n) cost, explicit.
+
+6. **`DomainClassifier` cognitive module** (new `crates/vestige-core/src/neuroscience/domain_classifier.rs`)
+   - Owns the HDBSCAN discovery pass (using the `hdbscan` crate).
+   - Computes soft-assignment scores for every memory against every centroid.
+   - Stores raw `domain_scores: HashMap<String, f32>` per memory; thresholds into
+     the `domains` array using `assign_threshold` (default 0.65).
+   - Runs discovery on demand (`vestige domains discover`) or during dream
+     consolidation passes.
+
+7. **HTTP MCP transport** (extension of existing Axum server in `crates/vestige-mcp/src/`)
+   - New route `POST /mcp` for Streamable HTTP JSON-RPC.
+   - New route `GET /mcp` for SSE (for long-running operations).
+   - REST API under `/api/v1/` for direct HTTP clients (non-MCP integrations).
+   - Auth middleware validates `Authorization: Bearer ...` or `X-API-Key`, plus
+     signed session cookies for dashboard.
+
+8. 
**Key management** (new `crates/vestige-mcp/src/auth/`) + - `api_keys` table -- blake3-hashed keys, scopes, optional domain filter, + last-used timestamp. + - CLI: `vestige keys create|list|revoke`. + +9. **FSRS review event log** (future-proofing for federation) + - New table `review_events` -- append-only `(memory_id, timestamp, rating, + prior_state, new_state)`. + - Current `scheduling` table becomes a materialized view over the event log + (reconstructible from events). + - Phase 5 federation merges event logs, not derived state. Zero cost today, + avoids lock-in tomorrow. + +### Data Flow + +**Local mode (stdio MCP, unchanged UX):** +``` +stdio client -> McpServer -> CognitiveEngine -> FastembedEmbedder -> MemoryStore (SQLite) +``` + +**Server mode (HTTP MCP, new):** +``` +Remote client -> Axum HTTP -> auth middleware -> CognitiveEngine + -> FastembedEmbedder (server-side) -> MemoryStore (Postgres) +``` + +The cognitive engine is backend-agnostic. The embedder and the store are both +swappable. The 7-stage search pipeline (overfetch -> cross-encoder rerank -> +temporal -> accessibility -> context match -> competition -> spreading activation) +sits *above* the `MemoryStore` trait and works identically against either backend. + +### Orthogonality of HDBSCAN and Reranking + +HDBSCAN and the cross-encoder reranker solve different problems and both stay: + +- **HDBSCAN** discovers domains by clustering embeddings. Runs once per discovery + pass. Produces centroids. Used to *filter* search candidates, not to rank them. +- **Cross-encoder reranker** (Jina Reranker v1 Turbo) scores query-document pairs + at search time. Runs on every search. Produces ranked results. + +Domain membership is a filter applied before or during overfetch; reranking runs +on whatever candidate set survives the filter. 
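The filter-then-rank ordering described above can be sketched in plain Rust. Everything here is hypothetical scaffolding (the `Candidate` type, the function names, and a toy word-overlap closure standing in for the cross-encoder); the real pipeline sits above the `MemoryStore` trait.

```rust
// Sketch only: domain membership filters the overfetched candidate set,
// then a reranker scores whatever survives. Types and names are invented
// for illustration; the real system calls a cross-encoder per pair.
#[derive(Debug, Clone)]
struct Candidate {
    content: &'static str,
    domains: Vec<&'static str>,
}

/// Step 1: domain membership is a filter, not a ranking signal.
fn filter_by_domain(candidates: Vec<Candidate>, domain: &str) -> Vec<Candidate> {
    candidates
        .into_iter()
        .filter(|c| c.domains.iter().any(|d| *d == domain))
        .collect()
}

/// Step 2: rerank the survivors with a query-document scoring function.
fn rerank<F: Fn(&str, &str) -> f64>(query: &str, mut survivors: Vec<Candidate>, score: F) -> Vec<Candidate> {
    survivors.sort_by(|a, b| {
        score(query, b.content)
            .partial_cmp(&score(query, a.content))
            .unwrap()
    });
    survivors
}

fn main() {
    let cands = vec![
        Candidate { content: "rustc build flags", domains: vec!["dev"] },
        Candidate { content: "thermostat schedule", domains: vec!["home"] },
    ];
    // Domain filter runs first; only "dev" memories reach the reranker.
    let dev_only = filter_by_domain(cands, "dev");
    assert_eq!(dev_only.len(), 1);
    // Toy "cross-encoder": count query words appearing in the document.
    let ranked = rerank("rustc flags", dev_only, |q, d| {
        q.split_whitespace().filter(|w| d.contains(w)).count() as f64
    });
    assert_eq!(ranked[0].content, "rustc build flags");
}
```

The point the sketch makes is the orthogonality claim itself: swapping the reranker never changes which candidates the domain filter admits, and re-clustering domains never changes how survivors are ordered.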
+ +--- + +## Alternatives Considered + +| Alternative | Pros | Cons | Why Not | +|-------------|------|------|---------| +| Split into 4 traits (`MemoryStore + SchedulingStore + GraphStore + DomainStore`) | Cleaner interface segregation | Every module holds 4 trait objects, coordinates transactions across them | One trait is fine in Rust; extract sub-traits later if a genuine need appears | +| Embedding computed inside the backend | Simpler call sites for callers | Backend becomes aware of embedding models; can't support remote clients without local fastembed | Keep storage pure; separate `Embedder` trait handles pluggability | +| Unconstrained pgvector `vector` (no dimension) | Flexible for model swaps | HNSW still needs fixed dims at index creation; hides a meaningful migration as "silent" | Fixed dimension per install, explicit `--reembed` migration | +| Dashboard separate auth (cookies only, no keys) | Simpler dashboard UX | Two auth systems to maintain | Shared API keys with session cookie layer on top | +| Auto-tuned `assign_threshold` targeting an unclassified ratio | Adapts to corpus | Hard to debug ("why did this memory change domain?"); magical | Static 0.65 default, config-tunable, dashboard shows `domain_scores` for manual retuning | +| Aggressive drift (auto-reassign memories whose scores drifted) | Always up-to-date domains | Breaks user muscle memory; silent reshuffling | Conservative: always propose, user accepts | +| CRDTs for federation state | Mathematically clean merges | Massive complexity, performance cost, overkill | Defer; design FSRS as event log now so any future sync model works | + +--- + +## Consequences + +### Positive + +- Single memory brain accessible from every machine. +- Multi-agent concurrent access via Postgres MVCC. +- Natural topical scoping emerges from data, not manual tenants. +- Future embedding model swaps are a config + migration, not a rewrite. +- Federation has a clean on-ramp (event log merge) without committing now. 
+- The `Embedder` / `MemoryStore` split unlocks other storage backends later + (Redis, Qdrant, Iroh-backed blob store, etc.) with minimal work. + +### Negative + +- Operating a Postgres instance is more work than managing a SQLite file. +- Users who stay on SQLite gain nothing from this ADR (but lose nothing either). +- Migration (`vestige migrate --from sqlite --to postgres`) is a sensitive + operation for users with months of memories -- needs strong testing. +- HDBSCAN + re-soft-assignment runs in O(n) over all embeddings. At 100k+ + memories this starts to matter; manageable but not free. + +### Risks + +- **Trait abstraction leaks**: a cognitive module might need backend-specific + behavior (e.g., Postgres triggers for tsvector). Mitigation: keep such logic + inside the backend impl; the trait stays pure. + Escalation: if a module genuinely cannot express what it needs through the + trait, the trait grows, not the module bypasses. +- **Embedding model drift**: users on older fastembed versions silently + producing slightly different vectors after a fastembed upgrade. Mitigation: + model hash in the registry, refuse mismatched writes, surface a clear error. +- **Auth misconfiguration**: a user binds to `0.0.0.0` without setting + `auth.enabled = true`. Mitigation: refuse to start with non-localhost bind + and auth disabled. Hard error, not a warning. +- **Re-clustering feedback loop**: dream consolidation proposes re-clusters, + which the user accepts, which changes classifications, which affects future + retrievals, which affect future dreams. Mitigation: cap re-cluster frequency + (every 5th dream by default), require explicit user acceptance of proposals. +- **Cross-domain spreading activation weight (0.5 default)**: arbitrary choice; + could be too aggressive or too lax. Mitigation: config-tunable; instrument + retrieval quality metrics in the dashboard so the user sees impact. 
+ +--- + +## Resolved Decisions (from Q&A) + +| # | Question | Resolution | +|---|----------|------------| +| 1 | Trait granularity | Single `MemoryStore` trait | +| 2 | Embedding on insert | Caller provides; separate `Embedder` trait for pluggability | +| 3 | pgvector dimension | Fixed per install, derived from `Embedder::dimension()` at schema init | +| 4 | Federation sync | Defer algorithm; store FSRS reviews as append-only event log now | +| 5 | Dashboard auth | Shared API keys + signed session cookie | +| 6 | HDBSCAN `min_cluster_size` | Default 10; user reruns with `--min-cluster-size N`; no auto-sweep | +| 7 | Domain drift | Conservative -- always propose splits/merges, never auto-apply | +| 8 | Cross-domain spreading activation | Follow with decay factor 0.5 (tunable) | +| 9 | Assignment threshold | Static 0.65 default, config-tunable, raw `domain_scores` stored for introspection | + +--- + +## Implementation Plan + +Five phases, each independently shippable. + +### Phase 1: Storage trait extraction +- Define `MemoryStore` and `Embedder` traits in `vestige-core`. +- Refactor `SqliteMemoryStore` to implement `MemoryStore`; no behavior change. +- Refactor `FastembedEmbedder` to implement `Embedder`. +- Add `embedding_model` registry table; enforce consistency on write. +- Add `domains TEXT[]`-equivalent and `domain_scores` JSON columns to SQLite + (empty for all existing rows). +- Convert all 29 cognitive modules to operate via the traits. +- **Acceptance**: existing test suite passes unchanged. Zero warnings. + +### Phase 2: PostgreSQL backend +- `PgMemoryStore` with sqlx, pgvector, tsvector. +- sqlx migrations (`crates/vestige-core/migrations/postgres/`). +- Backend selection via `vestige.toml` `[storage]` section. +- `vestige migrate --from sqlite --to postgres` command. +- `vestige migrate --reembed` command for model swaps. +- **Acceptance**: full test suite runs green against Postgres with a testcontainer. 
+ +### Phase 3: Network access +- Streamable HTTP MCP route on Axum (`POST /mcp`, `GET /mcp` for SSE). +- REST API under `/api/v1/`. +- API key table + blake3 hashing + `vestige keys create|list|revoke`. +- Auth middleware (Bearer, X-API-Key, session cookie). +- Refuse non-localhost bind without auth enabled. +- **Acceptance**: MCP client over HTTP works from a second machine; dashboard + login flow works; unauth requests return 401. + +### Phase 4: Emergent domain classification +- `DomainClassifier` module using the `hdbscan` crate. +- `vestige domains discover|list|rename|merge` CLI. +- Automatic soft-assignment pipeline (compute `domain_scores` on ingest, threshold + into `domains`). +- Re-cluster every Nth dream consolidation (default 5); surface proposals in the + dashboard. +- Context signals (git repo, IDE) as soft priors on classification. +- Cross-domain spreading activation with 0.5 decay. +- **Acceptance**: on a corpus of 500+ mixed memories, discover produces sensible + clusters; search scoped to a domain returns tightly relevant results. + +### Phase 5: Federation (future, explicitly out of scope for this ADR's +acceptance) +- Node discovery (Mycelium / mDNS). +- Memory sync protocol over append-only review events and LWW-per-UUID for + memory records. +- Explicit follow-up ADR before any code. + +--- + +## Open Questions + +None at ADR acceptance time. 
Follow-up items that are *implementation choices*, +not architectural: + +- Precise cross-domain decay weight (start at 0.5, instrument, tune) +- Dashboard histogram of `domain_scores` (UX design detail) +- Whether to gate Postgres behind a Cargo feature flag (`postgres-backend`) or + always compile it in (lean toward feature flag to keep SQLite-only builds small) diff --git a/docs/plans/0001-phase-1-storage-trait-extraction.md b/docs/plans/0001-phase-1-storage-trait-extraction.md new file mode 100644 index 0000000..9960462 --- /dev/null +++ b/docs/plans/0001-phase-1-storage-trait-extraction.md @@ -0,0 +1,1026 @@ +# Phase 1 Plan: Storage Trait Extraction + +**Status**: Draft +**Depends on**: none +**Related**: docs/adr/0001-pluggable-storage-and-network-access.md (Phase 1) + +--- + +## Scope + +### In scope + +- Introduce a new module `crates/vestige-core/src/storage/memory_store.rs` defining: + - `LocalMemoryStore` base trait (Sync + 'static) + - `MemoryStore` Send-bound alias generated via `#[trait_variant::make(MemoryStore: Send)]` + - Supporting data types referenced by the trait: `MemoryRecord`, `SchedulingState`, `SearchQuery`, `SearchResult`, `MemoryEdge`, `Domain`, `ClassificationResult`, `StoreStats`, `HealthStatus`, `MemoryStoreError`. +- Introduce a new module `crates/vestige-core/src/embedder/` defining: + - `Embedder` async trait with `embed`, `model_name`, `dimension` plus `model_hash` (for the registry) and optional `embed_batch` with a default implementation. + - Move/adapt the existing `EmbeddingService` impl into a new struct `FastembedEmbedder` that implements `Embedder`. +- Refactor `Storage` (existing `crates/vestige-core/src/storage/sqlite.rs`) into `SqliteMemoryStore`: + - Keep the struct, the `writer`/`reader` `Mutex` pair, the `FSRSScheduler`, and the USearch `VectorIndex`. + - Rename the type alias `Storage` to `SqliteMemoryStore` with a `pub type Storage = SqliteMemoryStore;` alias for backward source compatibility during the transition. 
(The trait method surface is the new public contract.)
+  - Implement `LocalMemoryStore` by wrapping existing synchronous `rusqlite` methods inside `async fn` bodies that call a small `spawn_blocking`-or-inline adapter. Bodies MAY block; the `async fn` signature exists because `LocalMemoryStore` is async.
+- Add a `schema_version = 12` migration that introduces two schema additions:
+  1. `embedding_model` registry table (one-row constraint enforced in code).
+  2. Two new TEXT columns on `knowledge_nodes`: `domains TEXT NOT NULL DEFAULT '[]'` and `domain_scores TEXT NOT NULL DEFAULT '{}'` (both JSON-encoded).
+- Enforce model registry on every write path: on the first non-empty embedding write the model signature is recorded; subsequent writes whose `Embedder::model_name()` / `dimension()` / `model_hash()` disagree must fail with `MemoryStoreError::ModelMismatch` before touching the DB.
+- Audit all 29 cognitive modules under `crates/vestige-core/src/neuroscience/` and `crates/vestige-core/src/advanced/` to confirm they hold no direct `rusqlite::Connection` references, no `Storage` struct field, and no SQL strings. Any that do get refactored to take `&dyn LocalMemoryStore` (local-only modules) or `&Arc<dyn MemoryStore>` (modules crossing `await` points).
+- Add unit tests alongside each new trait method and integration tests in `tests/phase_1/`.
+
+### Out of scope
+
+- Implementing `PgMemoryStore` on sqlx + pgvector -- that is Phase 2.
+- `vestige migrate --from sqlite --to postgres` and `vestige migrate --reembed` -- Phase 2.
+- MCP over Streamable HTTP, API key middleware, `api_keys` table, `vestige keys create|list|revoke` -- Phase 3.
+- `DomainClassifier` module, HDBSCAN clustering, `vestige domains discover|list|rename|merge` CLI, incremental soft-assignment, cross-domain spreading activation decay -- Phase 4.
+- Federation, Mycelium/mDNS node discovery, review event log table -- Phase 5.
- Removing the `pub type Storage = SqliteMemoryStore;` compatibility alias -- that cleanup happens at the end of Phase 4 when no consumers still spell the old name.
+
+## Prerequisites
+
+### Current code state
+
+- Single concrete type `Storage` in `crates/vestige-core/src/storage/sqlite.rs` (4592 lines, 216 public symbols on the impl blocks, approximately 85 public methods) is the only storage surface the crate exposes.
+- `EmbeddingService` in `crates/vestige-core/src/embeddings/local.rs` holds the fastembed singleton. No trait exists; callers type-erase via `&EmbeddingService`.
+- Migrations live in `crates/vestige-core/src/storage/migrations.rs`; the current head is v11.
+- All cognitive modules in `neuroscience/` and `advanced/` are pure (verified by `grep rusqlite|Connection::|execute\(|prepare\(` returning no matches in those trees). They operate on `KnowledgeNode`, `Vec<f32>`, `ConnectionRecord`, etc. passed in by the caller.
+- `vestige-mcp` consumes `Arc<Storage>` in `crates/vestige-mcp/src/server.rs` and every tool under `crates/vestige-mcp/src/tools/`. These call sites will type-check unchanged after the alias is introduced because the trait methods preserve the exact signatures of the existing `pub fn` on `Storage`.
+- Test count reported in `CLAUDE.md`: 758 tests (406 mcp + 352 core). This is the no-regression target.
+
+### Required crates (add via `cargo add` under `crates/vestige-core`)
+
+| Crate | Version | Why |
+|-------|---------|-----|
+| `trait-variant` | `0.1` | Generates the `Send`-bound `MemoryStore` alias from `LocalMemoryStore` so `Arc<dyn MemoryStore>` works under tokio/axum without hand-writing two traits. Listed in PRD section "Crate Dependencies (new)" under Phase 1. |
+| `blake3` | `1` | `Embedder::model_hash() -> [u8; 32]` uses blake3 to stabilise the "model signature" stored in the `embedding_model` registry. Already slated for Phase 3 auth; pulling it forward costs nothing and avoids a second migration to add a hash column.
| +| `async-trait` | `0.1` | Not strictly required with `trait-variant` on MSRV 1.91 (RPITIT is stable), but used for one utility trait (`EmbedderExt`) that carries a default `embed_batch` body. OPTIONAL; see Open Implementation Questions below. | + +No changes to `vestige-mcp/Cargo.toml` are required for Phase 1 -- the new trait lives in `vestige-core` and the mcp crate continues to depend on the `SqliteMemoryStore` concrete type (via the `Storage` alias) until Phase 2 introduces backend selection. + +## Deliverables + +1. `crates/vestige-core/src/storage/memory_store.rs` -- `LocalMemoryStore` + `MemoryStore` traits and supporting types. +2. `crates/vestige-core/src/storage/mod.rs` -- updated exports and module wiring. +3. `crates/vestige-core/src/storage/sqlite.rs` -- `Storage` renamed to `SqliteMemoryStore`, `impl LocalMemoryStore for SqliteMemoryStore` block, enforcement hooks for the model registry, serde of `domains` / `domain_scores` columns. +4. `crates/vestige-core/src/storage/migrations.rs` -- `MIGRATION_V12_UP` adding `embedding_model` table and `domains`, `domain_scores` columns. +5. `crates/vestige-core/src/embedder/mod.rs` -- `Embedder` trait and re-exports. +6. `crates/vestige-core/src/embedder/fastembed.rs` -- `FastembedEmbedder` implementation. +7. `crates/vestige-core/src/embeddings/local.rs` -- retained; `EmbeddingService` kept as the underlying fastembed holder; `FastembedEmbedder` wraps it. +8. `crates/vestige-core/src/lib.rs` -- new `pub mod embedder;` + re-exports for `MemoryStore`, `LocalMemoryStore`, `Embedder`, `FastembedEmbedder`, and the data types. +9. `tests/phase_1/trait_round_trip.rs` -- integration test: round-trip of every trait method through `SqliteMemoryStore`. +10. `tests/phase_1/embedding_model_registry.rs` -- integration test: first-write registers, mismatch refuses, dimension mismatch refuses. +11. 
`tests/phase_1/domain_column_migration.rs` -- integration test: a v11 DB upgraded to v12 reads `domains=[]` and `domain_scores={}` for all existing rows.
+12. `tests/phase_1/cognitive_module_isolation.rs` -- integration test: every cognitive module compiles and executes against an `Arc<dyn MemoryStore>` without touching `SqliteMemoryStore` concretely.
+13. `tests/phase_1/send_bound_variant.rs` -- integration test: an `Arc<dyn MemoryStore>` can be moved across `tokio::spawn`.
+14. Updated `tests/phase_1/mod.rs` (if the dir already uses a module layout) or individual `[[test]]` entries in `tests/e2e/Cargo.toml` as needed -- see "Test Plan" for the exact layout.
+
+## Detailed Task Breakdown
+
+### D1. Trait + supporting types (`memory_store.rs`)
+
+- **File**: `crates/vestige-core/src/storage/memory_store.rs` (new).
+- **Depends on**: `trait-variant` crate added under vestige-core, `chrono`, `serde_json`, `uuid`, `thiserror` (all already in Cargo.toml).
+- **Signatures**:
+
+```rust
+//! Backend-agnostic memory store trait.
+//!
+//! This is the single abstraction every cognitive module sits above. It is
+//! intentionally flat: one trait, ~25 methods, no sub-traits.
+
+use std::collections::HashMap;
+
+use chrono::{DateTime, Utc};
+use serde::{Deserialize, Serialize};
+use uuid::Uuid;
+
+// ----------------------------------------------------------------------------
+// ERROR
+// ----------------------------------------------------------------------------
+
+/// Error returned by every `LocalMemoryStore` / `MemoryStore` method.
+#[non_exhaustive]
+#[derive(Debug, thiserror::Error)]
+pub enum MemoryStoreError {
+    #[error("not found: {0}")]
+    NotFound(String),
+
+    #[error("backend error: {0}")]
+    Backend(String),
+
+    #[error(
+        "embedding model mismatch: store registered {registered_name} (dim {registered_dim}, \
+         hash {registered_hash}), embedder is {actual_name} (dim {actual_dim}, hash {actual_hash})"
+    )]
+    ModelMismatch {
+        registered_name: String,
+        registered_dim: usize,
+        registered_hash: String,
+        actual_name: String,
+        actual_dim: usize,
+        actual_hash: String,
+    },
+
+    #[error("invalid input: {0}")]
+    InvalidInput(String),
+
+    #[error("initialization error: {0}")]
+    Init(String),
+}
+
+impl From<crate::storage::StorageError> for MemoryStoreError {
+    fn from(e: crate::storage::StorageError) -> Self {
+        use crate::storage::StorageError as S;
+        match e {
+            S::NotFound(s) => MemoryStoreError::NotFound(s),
+            S::Database(e) => MemoryStoreError::Backend(e.to_string()),
+            S::Io(e) => MemoryStoreError::Backend(e.to_string()),
+            S::InvalidTimestamp(s) => MemoryStoreError::Backend(format!("invalid timestamp: {s}")),
+            S::Init(s) => MemoryStoreError::Init(s),
+        }
+    }
+}
+
+pub type MemoryStoreResult<T> = std::result::Result<T, MemoryStoreError>;
+
+// ----------------------------------------------------------------------------
+// DATA TYPES
+// ----------------------------------------------------------------------------
+
+/// Backend-agnostic memory record.
+///
+/// Phase 1 intentionally keeps this type independent of `KnowledgeNode` to
+/// avoid dragging 30+ legacy fields through the trait surface. The SQLite
+/// backend converts between `MemoryRecord` and `KnowledgeNode` at the
+/// boundary.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct MemoryRecord {
+    pub id: Uuid,
+    /// Empty = unclassified. Populated in Phase 4.
+    pub domains: Vec<String>,
+    /// Raw similarity per domain centroid. Empty until Phase 4 runs clustering.
+    pub domain_scores: HashMap<String, f32>,
+    pub content: String,
+    pub node_type: String,
+    pub tags: Vec<String>,
+    pub embedding: Option<Vec<f32>>,
+    pub created_at: DateTime<Utc>,
+    pub updated_at: DateTime<Utc>,
+    pub metadata: serde_json::Value,
+}
+
+/// FSRS-6 scheduling state, one row per memory.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct SchedulingState {
+    pub memory_id: Uuid,
+    pub stability: f64,
+    pub difficulty: f64,
+    pub retrievability: f64,
+    pub last_review: Option<DateTime<Utc>>,
+    pub next_review: Option<DateTime<Utc>>,
+    pub reps: u32,
+    pub lapses: u32,
+}
+
+/// Hybrid search request.
+#[derive(Debug, Clone, Default)]
+pub struct SearchQuery {
+    pub domains: Option<Vec<String>>,
+    pub text: Option<String>,
+    pub embedding: Option<Vec<f32>>,
+    pub tags: Option<Vec<String>>,
+    pub node_types: Option<Vec<String>>,
+    pub limit: usize,
+    pub min_retrievability: Option<f64>,
+}
+
+#[derive(Debug, Clone)]
+pub struct SearchResult {
+    pub record: MemoryRecord,
+    pub score: f64,
+    pub fts_score: Option<f64>,
+    pub vector_score: Option<f64>,
+}
+
+/// Edge in the spreading-activation graph.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct MemoryEdge {
+    pub source_id: Uuid,
+    pub target_id: Uuid,
+    pub edge_type: String,
+    pub weight: f64,
+    pub created_at: DateTime<Utc>,
+}
+
+/// A topical domain (populated in Phase 4). Phase 1 only needs the type to
+/// shape the trait surface; discover/classify are Phase 4 work.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct Domain {
+    pub id: String,
+    pub label: String,
+    pub centroid: Vec<f32>,
+    pub top_terms: Vec<String>,
+    pub memory_count: usize,
+    pub created_at: DateTime<Utc>,
+}
+
+/// Result of classifying one vector against all known domains.
+#[derive(Debug, Clone)]
+pub struct ClassificationResult {
+    pub scores: HashMap<String, f32>,
+    pub domains: Vec<String>,
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize, Default)]
+pub struct StoreStats {
+    pub total_memories: usize,
+    pub memories_with_embeddings: usize,
+    pub total_edges: usize,
+    pub total_domains: usize,
+    pub registered_model_name: Option<String>,
+    pub registered_model_dim: Option<usize>,
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub enum HealthStatus {
+    Healthy,
+    Degraded { reason: String },
+    Unavailable { reason: String },
+}
+
+// ----------------------------------------------------------------------------
+// EMBEDDING MODEL SIGNATURE
+// ----------------------------------------------------------------------------
+
+/// Snapshot of the embedding model that was used to write vectors into the
+/// store. Persisted in the `embedding_model` table; compared on every write
+/// before the vector is accepted.
+#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
+pub struct ModelSignature {
+    pub name: String,
+    pub dimension: usize,
+    /// Lowercase hex-encoded blake3 hash, 64 chars.
+    pub hash: String,
+}
+
+// ----------------------------------------------------------------------------
+// TRAIT
+// ----------------------------------------------------------------------------
+
+/// The single storage abstraction. `trait_variant::make` auto-generates a
+/// `MemoryStore` alias with `Send`-bound return futures so `Arc<dyn MemoryStore>`
+/// works in tokio/axum contexts.
+#[trait_variant::make(MemoryStore: Send)]
+pub trait LocalMemoryStore: Sync + 'static {
+    // --- Lifecycle ---
+    async fn init(&self) -> MemoryStoreResult<()>;
+    async fn health_check(&self) -> MemoryStoreResult<HealthStatus>;
+
+    // --- Embedding model registry ---
+    async fn registered_model(&self) -> MemoryStoreResult<Option<ModelSignature>>;
+    async fn register_model(&self, sig: &ModelSignature) -> MemoryStoreResult<()>;
+
+    // --- CRUD ---
+    async fn insert(&self, record: &MemoryRecord) -> MemoryStoreResult<Uuid>;
+    async fn get(&self, id: Uuid) -> MemoryStoreResult<Option<MemoryRecord>>;
+    async fn update(&self, record: &MemoryRecord) -> MemoryStoreResult<()>;
+    async fn delete(&self, id: Uuid) -> MemoryStoreResult<()>;
+
+    // --- Search ---
+    async fn search(&self, query: &SearchQuery) -> MemoryStoreResult<Vec<SearchResult>>;
+    async fn fts_search(&self, text: &str, limit: usize) -> MemoryStoreResult<Vec<SearchResult>>;
+    async fn vector_search(
+        &self,
+        embedding: &[f32],
+        limit: usize,
+    ) -> MemoryStoreResult<Vec<SearchResult>>;
+
+    // --- FSRS Scheduling ---
+    async fn get_scheduling(
+        &self,
+        memory_id: Uuid,
+    ) -> MemoryStoreResult<Option<SchedulingState>>;
+    async fn update_scheduling(&self, state: &SchedulingState) -> MemoryStoreResult<()>;
+    async fn get_due_memories(
+        &self,
+        before: DateTime<Utc>,
+        limit: usize,
+    ) -> MemoryStoreResult<Vec<MemoryRecord>>;
+
+    // --- Graph (spreading activation) ---
+    async fn add_edge(&self, edge: &MemoryEdge) -> MemoryStoreResult<()>;
+    async fn get_edges(
+        &self,
+        node_id: Uuid,
+        edge_type: Option<&str>,
+    ) -> MemoryStoreResult<Vec<MemoryEdge>>;
+    async fn remove_edge(&self, source: Uuid, target: Uuid) -> MemoryStoreResult<()>;
+    async fn get_neighbors(
+        &self,
+        node_id: Uuid,
+        depth: usize,
+    ) -> MemoryStoreResult<Vec<(MemoryRecord, f64)>>;
+
+    // --- Domains (Phase 1: stubs return empty; full impl in Phase 4) ---
+    async fn list_domains(&self) -> MemoryStoreResult<Vec<Domain>>;
+    async fn get_domain(&self, id: &str) -> MemoryStoreResult<Option<Domain>>;
+    async fn upsert_domain(&self, domain: &Domain) -> MemoryStoreResult<()>;
+    async fn delete_domain(&self, id: &str) -> MemoryStoreResult<()>;
+    /// Phase 1: returns 
`Ok(vec![])` since no centroids exist. Phase 4 wires
+    /// the full soft-assignment pass.
+    async fn classify(&self, embedding: &[f32]) -> MemoryStoreResult<Vec<(String, f32)>>;
+
+    // --- Bulk / Maintenance ---
+    async fn count(&self) -> MemoryStoreResult<usize>;
+    async fn get_stats(&self) -> MemoryStoreResult<StoreStats>;
+    async fn vacuum(&self) -> MemoryStoreResult<()>;
+}
+```
+
+- **Behavior notes**:
+  - Every method returns `MemoryStoreResult`; the trait never exposes `rusqlite::Error`.
+  - `LocalMemoryStore` requires `Sync + 'static` so `Arc<dyn MemoryStore>` is usable. The auto-generated `MemoryStore` alias adds `Send` bounds on the returned `impl Future`.
+  - `register_model` is idempotent: writing the same signature twice is `Ok(())`. Writing a different signature after one is registered returns `MemoryStoreError::ModelMismatch`.
+  - `classify` on Phase 1 returns `Ok(vec![])` and MUST NOT error; cognitive modules call it and Phase 4 will flesh it out without changing the signature.
+  - `upsert_domain` / `delete_domain` / `list_domains` / `get_domain` operate against a `domains` table that is empty until Phase 4 populates it. Phase 1 still exposes the methods so Phase 2 can implement them against Postgres in one shot.
+  - `get_neighbors(node_id, depth)` with `depth == 0` returns just `(node, 1.0)` if the node exists, otherwise `NotFound`. `depth > 0` performs breadth-first expansion over edges, weight = product of edge weights along the shortest path discovered, capped at `max_neighbors = 256` to prevent runaway expansion.
+
+---
+
+### D2. Storage module wiring (`storage/mod.rs`)
+
+- **File**: `crates/vestige-core/src/storage/mod.rs`.
+- **Depends on**: D1.
+- **Signatures / diff**:
+
+```rust
+//! Storage Module
+//!
+//! Backend-agnostic memory store abstraction plus SQLite reference impl.
+
+mod memory_store;
+mod migrations;
+mod sqlite;
+
+pub use memory_store::{
+    ClassificationResult, Domain, HealthStatus, LocalMemoryStore, MemoryEdge, MemoryRecord,
+    MemoryStore, MemoryStoreError, MemoryStoreResult, ModelSignature, SchedulingState,
+    SearchQuery, SearchResult, StoreStats,
+};
+pub use migrations::MIGRATIONS;
+pub use sqlite::{
+    ConnectionRecord, ConsolidationHistoryRecord, DreamHistoryRecord, InsightRecord,
+    IntentionRecord, Result, SmartIngestResult, SqliteMemoryStore, StateTransitionRecord,
+    StorageError,
+};
+
+/// Backwards-compatibility alias. Retained until Phase 4 completes so every
+/// existing `Arc<Storage>` call site keeps compiling. Scheduled for removal
+/// once no downstream source file references it.
+pub type Storage = SqliteMemoryStore;
+```
+
+- **Behavior notes**:
+  - The alias MUST be a `pub type` (not a re-export), because several tool files refer to `vestige_core::Storage` through `use` statements and we want to keep them compiling verbatim. This has zero runtime cost.
+  - `StorageError` stays exported for the 29 existing inherent-method callers; the trait exposes `MemoryStoreError` and provides `From<StorageError>`.
+
+---
+
+### D3. Rename + trait impl in `sqlite.rs`
+
+- **File**: `crates/vestige-core/src/storage/sqlite.rs`.
+- **Depends on**: D1, D2, D4 (for schema columns), D5/D6 (for the `Embedder` whose vectors `insert` accepts).
+- **Signatures (key excerpts)**:
+
+```rust
+pub struct SqliteMemoryStore {
+    writer: Mutex<Connection>,
+    reader: Mutex<Connection>,
+    scheduler: Mutex<FSRS>,
+    #[cfg(feature = "embeddings")]
+    embedding_service: EmbeddingService,
+    #[cfg(feature = "vector-search")]
+    vector_index: Mutex<VectorIndex>,
+    #[cfg(feature = "embeddings")]
+    query_cache: Mutex<HashMap<String, Vec<f32>>>,
+    /// Cached model signature. `None` until the first embedding is written.
+    registered_model: std::sync::RwLock<Option<ModelSignature>>,
+}
+
+impl SqliteMemoryStore {
+    pub fn new(db_path: Option<PathBuf>) -> MemoryStoreResult<Self> { /* existing body, Result converted */ }
+
+    /// Internal: convert a row into a `MemoryRecord` (new mapping reading
+    /// `domains` / `domain_scores` JSON columns).
+    fn row_to_record(row: &rusqlite::Row) -> rusqlite::Result<MemoryRecord> { /* ... */ }
+
+    /// Internal: given a `MemoryRecord` plus an optional embedding, enforce
+    /// the registered model signature and return a `MemoryStoreError` if
+    /// the embedder would produce a mismatched vector.
+    fn enforce_model(
+        &self,
+        incoming: Option<&ModelSignature>,
+    ) -> MemoryStoreResult<()> { /* ... */ }
+}
+
+impl crate::storage::memory_store::LocalMemoryStore for SqliteMemoryStore {
+    async fn init(&self) -> MemoryStoreResult<()> { /* no-op; migrations run in `new` */ Ok(()) }
+
+    async fn health_check(&self) -> MemoryStoreResult<HealthStatus> {
+        // SELECT 1; check vector index loaded; check embedding_model presence.
+    }
+
+    async fn registered_model(&self) -> MemoryStoreResult<Option<ModelSignature>> {
+        let cached = self.registered_model.read().map_err(|_| MemoryStoreError::Init("registered_model rwlock poisoned".into()))?.clone();
+        if cached.is_some() {
+            return Ok(cached);
+        }
+        // Fall through to DB read...
+    }
+
+    async fn register_model(&self, sig: &ModelSignature) -> MemoryStoreResult<()> {
+        // INSERT OR IGNORE; if a row exists and differs, return ModelMismatch.
+    }
+
+    async fn insert(&self, record: &MemoryRecord) -> MemoryStoreResult<Uuid> {
+        if let Some(vec) = &record.embedding {
+            // Caller is REQUIRED to have called register_model first (or the
+            // store auto-registers on the first embedded write -- see
+            // "embedding_model_registry.rs" test).
+            let derived = ModelSignature { /* from cache or from record.metadata */ };
+            self.enforce_model(Some(&derived))?;
+            if vec.len() != derived.dimension {
+                return Err(MemoryStoreError::InvalidInput(
+                    format!("embedding length {} != registered dimension {}", vec.len(), derived.dimension),
+                ));
+            }
+        }
+        // Delegate to a private `insert_record_blocking` helper that is the
+        // current `ingest`/`update_node_content` body, rewritten to accept a
+        // `MemoryRecord` and to also write `domains` / `domain_scores` JSON.
+    }
+
+    // ... remaining ~24 methods follow the same pattern: convert inputs,
+    // call the existing synchronous body, convert outputs.
+}
+```
+
+- **SQL** (covered in full in D4 below).
+- **Behavior notes**:
+  - The `async fn` bodies are allowed to be synchronous under the hood (rusqlite is blocking). We do NOT wrap in `spawn_blocking` for Phase 1 -- the current `Storage` is already used from synchronous code paths (CLI, MCP stdio handler) and forcing the tokio runtime is a Phase 2 concern when we also add sqlx. The trait simply lifts the synchronous body into an `async fn` so the signatures match the trait. MSRV 1.91 supports `async fn` in trait natively; `trait_variant::make` only adds the `Send`-bounded alias.
+  - `insert` preserves the current FSRS initialization logic (stability, difficulty, next_review, etc.) -- the new code path converts `MemoryRecord.metadata` back into `IngestInput`-equivalent fields when needed. All existing inherent methods (`ingest`, `smart_ingest`, `mark_reviewed`, ...) remain on `SqliteMemoryStore` untouched; the trait impl calls into them.
+  - `registered_model` cache is an `RwLock<Option<ModelSignature>>`. Invalidated on schema reset. Never mutated after first population until an explicit `--reembed` migration (Phase 2) takes the RwLock exclusively and writes a new row.
+  - `enforce_model` returns `Ok(())` if no model is registered yet AND `incoming.is_none()` (no-embedding write). If no model is registered and `incoming.is_some()`, it first registers the signature via `register_model` and then returns `Ok(())`. Returns `Err(ModelMismatch)` if a model is registered and the signatures disagree.
+  - `domains` / `domain_scores` serialization uses `serde_json::to_string` on write and `serde_json::from_str` on read. Empty vec -> `"[]"`, empty map -> `"{}"`. `NULL` in the DB is treated as the empty value for pre-migration rows.
+  - Every existing inherent method is kept verbatim. The trait impl dispatches to them. This is the "no behavior change" guarantee.
+
+---
+
+### D4. Schema migration V12
+
+- **File**: `crates/vestige-core/src/storage/migrations.rs`.
+- **Depends on**: D2.
+- **SQL**:
+
+```sql
+-- Migration V12: embedding model registry + per-memory domain columns.
+
+-- 1. Embedding model registry. Single logical row: the CHECK (id = 1)
+--    constraint keeps the table at one row, and `register_model` layers the
+--    mismatch detection on top in Rust (INSERT OR IGNORE, then compare).
+CREATE TABLE IF NOT EXISTS embedding_model (
+    id INTEGER PRIMARY KEY CHECK (id = 1),
+    name TEXT NOT NULL,
+    dimension INTEGER NOT NULL,
+    hash TEXT NOT NULL, -- lowercase hex blake3
+    created_at TEXT NOT NULL
+);
+
+-- 2. Per-memory domain columns (JSON TEXT; SQLite has no native arrays).
+ALTER TABLE knowledge_nodes ADD COLUMN domains TEXT NOT NULL DEFAULT '[]';
+ALTER TABLE knowledge_nodes ADD COLUMN domain_scores TEXT NOT NULL DEFAULT '{}';
+
+-- 3. Index on the domains JSON column to enable `LIKE '%"dev"%'`-style
+--    filter in Phase 4. Kept lightweight here; Postgres will use GIN.
+CREATE INDEX IF NOT EXISTS idx_nodes_domains ON knowledge_nodes(domains);
+CREATE INDEX IF NOT EXISTS idx_nodes_domain_scores ON knowledge_nodes(domain_scores);
+
+-- 4. Domains catalogue (empty until Phase 4 populates).
+CREATE TABLE IF NOT EXISTS domains ( + id TEXT PRIMARY KEY, + label TEXT NOT NULL, + centroid BLOB, -- f32 vector, raw bytes + top_terms TEXT NOT NULL DEFAULT '[]', + memory_count INTEGER NOT NULL DEFAULT 0, + created_at TEXT NOT NULL +); + +CREATE INDEX IF NOT EXISTS idx_domains_created_at ON domains(created_at); + +UPDATE schema_version SET version = 12, applied_at = datetime('now'); +``` + +- **Rust changes** to `migrations.rs`: + +```rust +pub const MIGRATIONS: &[Migration] = &[ + // ... V1..V11 unchanged ... + Migration { + version: 12, + description: "Phase 1: embedding_model registry, domains/domain_scores columns, domains table", + up: MIGRATION_V12_UP, + }, +]; + +const MIGRATION_V12_UP: &str = r#"...SQL above..."#; +``` + +- **Behavior notes**: + - Idempotent: `ALTER TABLE ... ADD COLUMN` on SQLite is not idempotent by default, but the `apply_migrations` driver only applies migrations whose version > current. A user who has already applied V12 never sees the SQL again. + - The `CHECK (id = 1)` on `embedding_model` is the only one-row guardrail -- all inserts go through `register_model` which uses `INSERT OR IGNORE INTO embedding_model (id, ...) VALUES (1, ...)` followed by a `SELECT` to detect mismatch. + - `centroid BLOB` stores the f32 vector using the same `Embedding::to_bytes()` format used in `node_embeddings`, for consistency. + +--- + +### D5. Embedder trait (`embedder/mod.rs`) + +- **File**: `crates/vestige-core/src/embedder/mod.rs` (new). +- **Depends on**: `blake3` crate added to vestige-core. +- **Signatures**: + +```rust +//! Text-to-vector encoding trait. Pluggable per-install. + +use std::fmt::Debug; + +mod fastembed; + +pub use fastembed::FastembedEmbedder; + +/// Error returned by every `Embedder` method. 
+#[non_exhaustive]
+#[derive(Debug, thiserror::Error)]
+pub enum EmbedderError {
+    #[error("embedder initialization failed: {0}")]
+    Init(String),
+    #[error("embedding generation failed: {0}")]
+    EmbedFailed(String),
+    #[error("invalid input: {0}")]
+    InvalidInput(String),
+}
+
+pub type EmbedderResult<T> = std::result::Result<T, EmbedderError>;
+
+/// Pluggable embedder. The storage layer NEVER calls fastembed directly;
+/// callers compute vectors via this trait and pass them into `MemoryStore`.
+#[trait_variant::make(Embedder: Send)]
+pub trait LocalEmbedder: Sync + 'static {
+    async fn embed(&self, text: &str) -> EmbedderResult<Vec<f32>>;
+
+    fn model_name(&self) -> &str;
+
+    fn dimension(&self) -> usize;
+
+    /// Stable blake3 hash of (model_name || dimension || optional weights
+    /// digest if available). Lowercase hex, 64 chars.
+    ///
+    /// Used by `MemoryStore::register_model` to detect silent model drift
+    /// (e.g. a fastembed minor upgrade that changes vector output).
+    fn model_hash(&self) -> String;
+
+    async fn embed_batch(&self, texts: &[&str]) -> EmbedderResult<Vec<Vec<f32>>> {
+        // Default: sequential. Backends with native batching override this.
+        let mut out = Vec::with_capacity(texts.len());
+        for t in texts {
+            out.push(self.embed(t).await?);
+        }
+        Ok(out)
+    }
+
+    /// Returns the `ModelSignature` describing this embedder. Convenience
+    /// wrapper over the three accessors above.
+    fn signature(&self) -> crate::storage::ModelSignature {
+        crate::storage::ModelSignature {
+            name: self.model_name().to_string(),
+            dimension: self.dimension(),
+            hash: self.model_hash(),
+        }
+    }
+}
+```
+
+- **Behavior notes**:
+  - The `embed_batch` default implementation is plain sequential; backends with genuine batching override it. `FastembedEmbedder` overrides it to call `EmbeddingService::embed_batch`.
+  - `model_hash()` is intentionally a function, not a constant, so backends with configurable weights (a future `OnnxEmbedder` that loads an arbitrary file) can hash the file bytes into the signature.
+  - `Embedder` (the `Send` variant) is what cognitive modules bind against when they hold `Arc<dyn Embedder>`. `LocalEmbedder` is available for single-threaded callers (CLI, tests).
+
+---
+
+### D6. FastembedEmbedder impl (`embedder/fastembed.rs`)
+
+- **File**: `crates/vestige-core/src/embedder/fastembed.rs` (new).
+- **Depends on**: D5, existing `crate::embeddings::local::EmbeddingService`.
+- **Signatures**:
+
+```rust
+use super::{EmbedderError, EmbedderResult, LocalEmbedder};
+use crate::embeddings::{EMBEDDING_DIMENSIONS, EmbeddingService, matryoshka_truncate};
+
+pub struct FastembedEmbedder {
+    inner: EmbeddingService,
+    cached_hash: std::sync::OnceLock<String>,
+}
+
+impl FastembedEmbedder {
+    pub fn new() -> Self {
+        Self {
+            inner: EmbeddingService::new(),
+            cached_hash: std::sync::OnceLock::new(),
+        }
+    }
+
+    fn compute_hash(name: &str, dim: usize) -> String {
+        let mut hasher = blake3::Hasher::new();
+        hasher.update(name.as_bytes());
+        hasher.update(&(dim as u64).to_le_bytes());
+        // fastembed's ONNX bytes are not directly accessible at runtime; we
+        // use `(name, dim, static fastembed crate version)` as the
+        // signature. If fastembed ever changes its output deterministically
+        // between minor versions, bumping the crate version triggers a
+        // mismatch -- which is exactly the drift we want to detect.
+        hasher.update(env!("CARGO_PKG_VERSION").as_bytes());
+        hasher.finalize().to_hex().to_string()
+    }
+}
+
+impl Default for FastembedEmbedder {
+    fn default() -> Self { Self::new() }
+}
+
+impl LocalEmbedder for FastembedEmbedder {
+    async fn embed(&self, text: &str) -> EmbedderResult<Vec<f32>> {
+        let emb = self
+            .inner
+            .embed(text)
+            .map_err(|e| EmbedderError::EmbedFailed(e.to_string()))?;
+        Ok(emb.vector)
+    }
+
+    fn model_name(&self) -> &str { self.inner.model_name() }
+
+    fn dimension(&self) -> usize { EMBEDDING_DIMENSIONS }
+
+    fn model_hash(&self) -> String {
+        self.cached_hash
+            .get_or_init(|| Self::compute_hash(self.inner.model_name(), EMBEDDING_DIMENSIONS))
+            .clone()
+    }
+
+    async fn embed_batch(&self, texts: &[&str]) -> EmbedderResult<Vec<Vec<f32>>> {
+        let embs = self
+            .inner
+            .embed_batch(texts)
+            .map_err(|e| EmbedderError::EmbedFailed(e.to_string()))?;
+        Ok(embs.into_iter().map(|e| e.vector).collect())
+    }
+}
+```
+
+- **Behavior notes**:
+  - `EmbeddingService` is kept as the fastembed singleton holder; `FastembedEmbedder` is a thin trait adapter. Existing callers of `EmbeddingService` continue to work during the transition.
+  - `model_hash` is deterministic for a given `(model_name, EMBEDDING_DIMENSIONS, vestige-core version)` triple. This is the drift detector the ADR calls out under "Risks: Embedding model drift".
+  - `matryoshka_truncate` is already applied inside `EmbeddingService::embed`, so the vectors returned here are the 256-dim Matryoshka-truncated L2-normalized vectors that the rest of the stack expects.
+
+---
+
+### D7. `lib.rs` re-exports
+
+- **File**: `crates/vestige-core/src/lib.rs`.
+- **Depends on**: D1, D2, D5, D6.
+- **Diff** (inserted alongside the existing `pub mod storage;` re-exports):
+
+```rust
+pub mod embedder;
+
+pub use embedder::{Embedder, EmbedderError, EmbedderResult, FastembedEmbedder, LocalEmbedder};
+
+pub use storage::{
+    ClassificationResult, Domain, HealthStatus, LocalMemoryStore, MemoryEdge, MemoryRecord,
+    MemoryStore, MemoryStoreError, MemoryStoreResult, ModelSignature, SchedulingState,
+    SearchQuery, SearchResult, SqliteMemoryStore, Storage, StoreStats,
+    // Existing re-exports retained:
+    ConnectionRecord, ConsolidationHistoryRecord, DreamHistoryRecord, InsightRecord,
+    IntentionRecord, Result, SmartIngestResult, StateTransitionRecord, StorageError,
+};
+```
+
+- **Behavior notes**:
+  - `Storage` remains a top-level re-export so `use vestige_core::Storage;` keeps working in `vestige-mcp` without changes. Post-Phase-4 cleanup will grep the downstream crates and replace.
+
+---
+
+### D8. Cognitive module audit
+
+- **Files**: all under `crates/vestige-core/src/neuroscience/*.rs` and `crates/vestige-core/src/advanced/*.rs` -- 21 source files.
+- **Depends on**: D1..D7.
+- **Work**: perform the following grep-gate BEFORE and AFTER the refactor:
+
+```
+Grep pattern: "rusqlite|Connection::|execute\\(|prepare\\(|&Storage|SqliteMemoryStore"
+Expected in neuroscience/ and advanced/ BEFORE: only a single comment-only hit in `neuroscience/active_forgetting.rs:54` referencing `Storage::suppress_memory` in a doc comment.
+Expected AFTER: zero hits that reference `SqliteMemoryStore` concretely. References through `&dyn LocalMemoryStore` or `&Arc<dyn MemoryStore>` are acceptable.
+```
+
+- **Behavior notes**:
+  - Current state: the 29 cognitive modules are already pure (they take nodes/vectors/connections as arguments, not a `&Storage`). No refactor is required for their bodies.
+  - The only work is the `consolidation/sleep.rs` and `consolidation/phases.rs` path, which in the current codebase accepts `&Storage`. These get rewritten to accept `&dyn LocalMemoryStore` (callable from sync contexts) or `&Arc<dyn MemoryStore>` (callable from async contexts). See file inventory below.
+  - Actual rewrites (expected number): 3-5 functions across `consolidation/sleep.rs` and `consolidation/mod.rs`. All trait-object refactors; no logic changes.
+  - `cognitive.rs` in `vestige-mcp` uses `storage.get_all_connections()`. Because `SqliteMemoryStore` keeps `get_all_connections` as an inherent method AND implements `MemoryStore::get_edges`, both call styles keep compiling. `cognitive.rs` does not need to change in Phase 1.
+
+---
+
+### D9. Backwards-compatible inherent methods on `SqliteMemoryStore`
+
+- **File**: `crates/vestige-core/src/storage/sqlite.rs`.
+- **Depends on**: D3.
+- **Behavior notes**:
+  - Every one of the 85 existing `pub fn` on `Storage` (e.g. `ingest`, `smart_ingest`, `mark_reviewed`, `hybrid_search_filtered`, `save_intention`, `save_insight`, `save_connection`, `apply_rac1_cascade`, ...) stays as an inherent method on `SqliteMemoryStore`. The Phase 1 refactor ONLY adds the trait impl; it does NOT remove any method, rename any field, or change any SQL.
+  - Internal writes that previously embedded `INSERT INTO knowledge_nodes (...)` statements gain two more columns (`domains`, `domain_scores`) in the INSERT list. After migration V12 these columns are `NOT NULL` with defaults of `'[]'` / `'{}'`, so the ALTER backfills pre-existing rows correctly; new INSERT statements could rely on the DB default, but the plan is to write `'[]'` and `'{}'` explicitly in every `INSERT INTO knowledge_nodes` statement to avoid surprises if a future migration drops the DEFAULT.
+
+---
+
+## Test Plan
+
+### Unit tests (colocated, `#[cfg(test)] mod tests` at end of each source file)
+
+Every public trait method on `LocalMemoryStore` gets at least one unit test, exercised through the `SqliteMemoryStore` impl. The unit test file is `crates/vestige-core/src/storage/sqlite.rs` (inside the existing `mod tests`).
+
+- `vestige_core::storage::sqlite::tests::trait_init_is_idempotent` -- calling `LocalMemoryStore::init` twice returns `Ok(())` both times.
+- `vestige_core::storage::sqlite::tests::trait_health_check_reports_healthy_on_fresh_db` -- asserts `HealthStatus::Healthy` on a fresh in-memory DB.
+- `vestige_core::storage::sqlite::tests::trait_register_model_first_write_succeeds` -- after registering a signature, `registered_model()` returns it.
+- `vestige_core::storage::sqlite::tests::trait_register_model_mismatched_write_refused` -- registering a second, different signature returns `MemoryStoreError::ModelMismatch`.
+- `vestige_core::storage::sqlite::tests::trait_register_model_same_signature_idempotent` -- registering the same signature twice returns `Ok(())` both times.
+- `vestige_core::storage::sqlite::tests::trait_insert_returns_uuid` -- `insert(record)` returns the UUID from the record.
+- `vestige_core::storage::sqlite::tests::trait_insert_refuses_dimension_mismatch` -- inserting a record with a 512-dim vector into a store registered for 256 dims returns `MemoryStoreError::InvalidInput`.
+- `vestige_core::storage::sqlite::tests::trait_get_missing_returns_none` -- `get(non_existent_uuid)` returns `Ok(None)`.
+- `vestige_core::storage::sqlite::tests::trait_get_after_insert_round_trip` -- insert then get returns a record equal (by content/tags/type) to the input; `domains == []`, `domain_scores == {}`.
+- `vestige_core::storage::sqlite::tests::trait_update_modifies_content` -- update with new content reflects in subsequent `get`.
+- `vestige_core::storage::sqlite::tests::trait_delete_removes_record` -- `delete` then `get` returns `Ok(None)`.
+- `vestige_core::storage::sqlite::tests::trait_search_combines_fts_and_vector` -- with one memory whose content matches by FTS and another by vector, `search` returns both, higher score for the exact content match.
+- `vestige_core::storage::sqlite::tests::trait_fts_search_returns_tokens_match` -- verifies FTS path. +- `vestige_core::storage::sqlite::tests::trait_vector_search_returns_cosine_order` -- verifies ordering. +- `vestige_core::storage::sqlite::tests::trait_scheduling_round_trip` -- `update_scheduling` then `get_scheduling` returns equivalent state. +- `vestige_core::storage::sqlite::tests::trait_get_scheduling_missing_returns_none`. +- `vestige_core::storage::sqlite::tests::trait_get_due_memories_returns_in_order` -- inserts 3 records with different `next_review`, asserts older-due listed first. +- `vestige_core::storage::sqlite::tests::trait_add_edge_is_idempotent` -- adding the same edge twice does not duplicate. +- `vestige_core::storage::sqlite::tests::trait_get_edges_filters_by_type`. +- `vestige_core::storage::sqlite::tests::trait_remove_edge_deletes_single`. +- `vestige_core::storage::sqlite::tests::trait_get_neighbors_bfs_depth_zero_returns_self_only`. +- `vestige_core::storage::sqlite::tests::trait_get_neighbors_bfs_depth_two_expands` -- build A->B->C, get_neighbors(A, 2) returns {A, B, C}. +- `vestige_core::storage::sqlite::tests::trait_list_domains_empty_in_phase_1` -- Phase 1 has no clustering, so `list_domains()` returns `[]`. +- `vestige_core::storage::sqlite::tests::trait_upsert_then_get_domain_round_trip`. +- `vestige_core::storage::sqlite::tests::trait_delete_domain_idempotent`. +- `vestige_core::storage::sqlite::tests::trait_classify_with_no_domains_returns_empty` -- verifies Phase 1 stub behavior. +- `vestige_core::storage::sqlite::tests::trait_count_matches_insert_count`. +- `vestige_core::storage::sqlite::tests::trait_get_stats_reports_registered_model`. +- `vestige_core::storage::sqlite::tests::trait_vacuum_succeeds` -- runs and asserts no error. 
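+The BFS contract that the two `get_neighbors` tests above pin down (self at weight 1.0, multiplicative path weights, bounded expansion) is small enough to state in isolation. A hedged, std-only sketch -- `neighbors_bfs`, the `u32` node ids, and the adjacency map are illustrative stand-ins for the real store's UUID and edge types, and the `NotFound` existence check is omitted:
+
+```rust
+use std::collections::{HashMap, HashSet, VecDeque};
+
+/// Illustrative sketch: breadth-first expansion up to `depth`, weight =
+/// product of edge weights along the first (shortest) path discovered,
+/// capped at 256 results, start node included at weight 1.0.
+fn neighbors_bfs(
+    edges: &HashMap<u32, Vec<(u32, f64)>>,
+    start: u32,
+    depth: usize,
+) -> Vec<(u32, f64)> {
+    const MAX_NEIGHBORS: usize = 256;
+    let mut out = vec![(start, 1.0)];
+    let mut seen: HashSet<u32> = HashSet::from([start]);
+    let mut queue = VecDeque::from([(start, 1.0, 0usize)]);
+    while let Some((node, weight, d)) = queue.pop_front() {
+        if d == depth {
+            continue; // depth budget exhausted along this path
+        }
+        for &(next, w) in edges.get(&node).into_iter().flatten() {
+            if seen.insert(next) {
+                let acc = weight * w;
+                out.push((next, acc));
+                if out.len() >= MAX_NEIGHBORS {
+                    return out; // runaway-expansion cap
+                }
+                queue.push_back((next, acc, d + 1));
+            }
+        }
+    }
+    out
+}
+
+fn main() {
+    // A(1) -> B(2) at 0.5, B -> C(3) at 0.4: depth 2 from A reaches all three.
+    let mut edges = HashMap::new();
+    edges.insert(1, vec![(2, 0.5)]);
+    edges.insert(2, vec![(3, 0.4)]);
+    assert_eq!(neighbors_bfs(&edges, 1, 0), vec![(1, 1.0)]);
+    assert_eq!(neighbors_bfs(&edges, 1, 2), vec![(1, 1.0), (2, 0.5), (3, 0.2)]);
+}
+```
+
+Depth 0 returning only the start node and the `{A, B, C}` depth-2 expansion match the two test expectations above; the real implementation additionally returns `NotFound` for a missing start node.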
+
+Every public method on `LocalEmbedder` gets at least one unit test under `crates/vestige-core/src/embedder/fastembed.rs`:
+
+- `vestige_core::embedder::fastembed::tests::embedder_reports_correct_name` -- `model_name()` contains "nomic".
+- `vestige_core::embedder::fastembed::tests::embedder_reports_256_dimension`.
+- `vestige_core::embedder::fastembed::tests::embedder_hash_is_stable` -- `model_hash()` called twice returns an identical string.
+- `vestige_core::embedder::fastembed::tests::embedder_hash_includes_crate_version` -- a synthetic test that asserts the hash equals the blake3 hex of `(name, 256, VERSION)`.
+- `vestige_core::embedder::fastembed::tests::embedder_embed_smoke` -- gated on `#[cfg(feature = "embeddings")]`; asserts output length == 256.
+- `vestige_core::embedder::fastembed::tests::embedder_embed_batch_matches_sequential` -- gated; assert batch result equals sequential result.
+- `vestige_core::embedder::fastembed::tests::embedder_signature_matches_accessors`.
+
+Migration V12 unit tests under `crates/vestige-core/src/storage/migrations.rs`:
+
+- `vestige_core::storage::migrations::tests::v12_adds_embedding_model_table` -- apply V12 then assert `SELECT count(*) FROM sqlite_master WHERE name='embedding_model'` == 1.
+- `vestige_core::storage::migrations::tests::v12_adds_domains_columns` -- assert `PRAGMA table_info(knowledge_nodes)` includes `domains` and `domain_scores`.
+- `vestige_core::storage::migrations::tests::v12_default_values_empty_json` -- insert a row via raw SQL, read back, assert `domains == '[]'` and `domain_scores == '{}'`.
+- `vestige_core::storage::migrations::tests::v12_is_replayable` -- rewind `schema_version` to 11, re-apply migrations, does not error (MUST use `CREATE TABLE IF NOT EXISTS`; `ALTER TABLE ADD COLUMN` will be skipped because the driver only re-runs migrations whose version > current -- already covered by `apply_migrations`).
+- `vestige_core::storage::migrations::tests::v12_preserves_existing_rows` -- insert rows under V11 schema, upgrade to V12, assert `domains='[]'` on those rows. + +Supporting-type unit tests under `crates/vestige-core/src/storage/memory_store.rs`: + +- `vestige_core::storage::memory_store::tests::memory_store_error_from_storage_error` -- converts `StorageError::NotFound` to `MemoryStoreError::NotFound`. +- `vestige_core::storage::memory_store::tests::model_signature_serde_round_trip`. +- `vestige_core::storage::memory_store::tests::memory_record_serde_round_trip`. + +### Integration tests (`tests/phase_1/`) + +Each file is a standalone `[[test]]` target. The Cargo layout: + +- `tests/phase_1/Cargo.toml` with: + +```toml +[package] +name = "vestige-phase-1-tests" +version = "0.0.1" +edition = "2024" +publish = false + +[dependencies] +vestige-core = { path = "../../crates/vestige-core" } +tokio = { version = "1", features = ["macros", "rt-multi-thread"] } +tempfile = "3" +uuid = { version = "1", features = ["v4"] } +chrono = "0.4" +serde_json = "1" +rusqlite = { version = "0.38", features = ["bundled"] } +``` + +And added to the workspace `Cargo.toml` members. Each `.rs` file below is a `#[tokio::test]`-using integration test. + +#### `tests/phase_1/trait_round_trip.rs` + +- `round_trip::insert_get_update_delete` -- exercises CRUD via the trait. Inserts a record with `domains=[]`, gets it, asserts equality, updates content, deletes, asserts not found. +- `round_trip::scheduling_upsert_and_due_scan` -- upserts FSRS state for three memories with different `next_review`, calls `get_due_memories(Utc::now(), 10)`, asserts only past-due ones appear. +- `round_trip::edge_crud` -- add edge, list edges, remove edge, assert gone. 
+- `round_trip::search_hybrid_returns_results` -- insert three memories: one that matches the query by content only, one by embedding similarity only, and one by both; search with both `text` and `embedding`; assert all three appear with `fts_score`/`vector_score` correctly populated.
+- `round_trip::count_and_stats_track_inserts` -- after 10 inserts, `count()` == 10 and `get_stats().total_memories` == 10.
+- `round_trip::vacuum_after_deletes_reclaims` -- insert 50, delete 40, call `vacuum`, assert disk file size decreased (informational; test is lenient if VACUUM was a no-op).
+- `round_trip::list_domains_empty_then_upsert_then_delete` -- Phase 1 has no discovery, but manual upsert/delete must work so Phase 2's Postgres impl can share the test.
+- `round_trip::classify_with_no_domains_returns_empty` -- calls `classify(embedding)` on a fresh store, asserts `Vec<(String, f64)>` is empty.
+
+#### `tests/phase_1/embedding_model_registry.rs`
+
+- `model_registry::first_embedded_insert_auto_registers` -- fresh store; insert a record with a 256-dim vector using a `FastembedEmbedder`; subsequent `registered_model()` returns a `Some(ModelSignature)` with dim=256.
+- `model_registry::second_insert_with_same_signature_succeeds`.
+- `model_registry::second_insert_with_different_dimension_refused` -- register a 256-dim signature, try to insert a 512-dim vector, expect `MemoryStoreError::InvalidInput` (because dimension does not match registered).
+- `model_registry::second_insert_with_different_model_name_refused` -- register signature A, call `register_model` with signature B (same dim, different name), expect `MemoryStoreError::ModelMismatch`.
+- `model_registry::second_insert_with_different_hash_refused` -- register signature A, try to register signature A' with the same name and dim but a different hash, expect `MemoryStoreError::ModelMismatch`.
+- `model_registry::no_embedding_insert_allowed_before_registration` -- a plain text memory without an embedding must insert successfully even when `registered_model()` is `None`.
+- `model_registry::stats_reports_registered_model_after_first_write`.
+
+#### `tests/phase_1/domain_column_migration.rs`
+
+- `domain_columns::fresh_db_has_v12_schema` -- open a fresh store, query `PRAGMA table_info(knowledge_nodes)`, assert `domains` and `domain_scores` columns are present with the correct defaults.
+- `domain_columns::v11_db_upgrades_cleanly` -- programmatically create a DB at V11 by running migrations up to V11 only, insert 5 rows, then invoke the V12 migration, assert all 5 rows now report `domains=='[]'` and `domain_scores=='{}'`.
+- `domain_columns::empty_domains_serialize_as_brackets` -- insert a `MemoryRecord { domains: vec![], .. }`, then read the underlying SQLite row via a raw query, assert the stored value is `"[]"`, not `NULL`.
+- `domain_columns::populated_domains_round_trip` -- insert a record with `domains=["dev","infra"]` and `domain_scores={"dev":0.82,"infra":0.71}`, read back via the trait, assert equality.
+- `domain_columns::domains_table_exists` -- `SELECT name FROM sqlite_master WHERE name='domains'` returns one row.
+
+#### `tests/phase_1/cognitive_module_isolation.rs`
+
+- `cognitive_isolation::all_modules_compile_against_dyn_store` -- a test function that allocates a `let store: Arc<dyn MemoryStore> = Arc::new(SqliteMemoryStore::new(...)?);`, then invokes a representative method from every cognitive module, passing in records/vectors/edges it reads through the trait. The point is a compile-time gate: if any module were still typed against `SqliteMemoryStore`, this would fail to compile.
+- `cognitive_isolation::spreading_activation_traverses_via_trait` -- exercise `ActivationNetwork` seeded from `store.get_edges(...)` results.
+- `cognitive_isolation::synaptic_tagging_consumes_records_via_trait` -- build `CapturedMemory` from `store.get(uuid)` and let the tagger compute retroactive importance.
+- `cognitive_isolation::hippocampal_index_built_from_store` -- load memories via `store.fts_search`, build `HippocampalIndex`, assert queries against the index work.
+
+#### `tests/phase_1/send_bound_variant.rs`
+
+- `send_bound::arc_dyn_memory_store_moves_across_tokio_tasks` -- wrap `SqliteMemoryStore` in `Arc<dyn MemoryStore>`, spawn 16 tokio tasks each inserting 10 memories, join all tasks, assert final `count() == 160`. This verifies the `#[trait_variant::make(MemoryStore: Send)]` emission actually produces a `Send`-bound future.
+- `send_bound::concurrent_readers_one_writer` -- 32 concurrent readers calling `search` while one writer loops inserting; asserts no panics, no deadlocks, eventual consistency on `count`.
+
+#### `tests/phase_1/embedder_trait.rs`
+
+- `embedder::fastembed_implements_embedder_trait` -- `let e: Box<dyn Embedder> = Box::new(FastembedEmbedder::new());` compiles and `e.dimension()` == 256.
+- `embedder::signature_matches_memory_store_registry` -- take the signature from `Embedder::signature()`, register it via `MemoryStore::register_model`, assert `registered_model()` returns the same.
+
+### Regression verification
+
+- `cargo build -p vestige-core` -- zero warnings.
+- `cargo build -p vestige-mcp` -- zero warnings.
+- `cargo clippy --workspace --all-targets -- -D warnings` -- green.
+- `cargo test -p vestige-core --lib` -- existing 352 core lib tests remain green.
+- `cargo test -p vestige-mcp --lib` -- existing 406 mcp tests remain green.
+- `cargo test -p vestige-core --lib storage::migrations::tests` -- explicitly invokes the migration tests added in Phase 1.
+- `cargo test -p vestige-core --lib storage::sqlite::tests` -- invokes the trait-method unit tests added in Phase 1.
+- `cargo test -p vestige-core --lib embedder::fastembed::tests` -- invokes embedder unit tests.
+- `cargo test -p vestige-phase-1-tests --test trait_round_trip` -- Phase 1 integration test file 1. +- `cargo test -p vestige-phase-1-tests --test embedding_model_registry` -- Phase 1 integration test file 2. +- `cargo test -p vestige-phase-1-tests --test domain_column_migration` -- Phase 1 integration test file 3. +- `cargo test -p vestige-phase-1-tests --test cognitive_module_isolation` -- Phase 1 integration test file 4. +- `cargo test -p vestige-phase-1-tests --test send_bound_variant` -- Phase 1 integration test file 5. +- `cargo test -p vestige-phase-1-tests --test embedder_trait` -- Phase 1 integration test file 6. +- `cargo test -p vestige-phase-1-tests` -- convenience: runs all integration test binaries in the Phase 1 crate. +- `cargo test -p vestige-e2e` -- existing e2e harness runs unchanged; no new tests here but existing ones must pass. + +## Acceptance Criteria + +- [ ] `cargo build -p vestige-core` -- zero warnings. +- [ ] `cargo build -p vestige-mcp` -- zero warnings. +- [ ] `cargo build --workspace --all-targets` -- zero warnings. +- [ ] `cargo clippy --workspace --all-targets -- -D warnings` -- exits 0. +- [ ] `cargo test -p vestige-core` -- all 352 existing core tests plus new Phase 1 unit tests pass. +- [ ] `cargo test -p vestige-mcp` -- all 406 existing mcp tests pass, unchanged. +- [ ] `cargo test -p vestige-phase-1-tests` -- all Phase 1 integration tests pass. +- [ ] `cargo test -p vestige-e2e` -- existing e2e journey suite passes unchanged. +- [ ] Cumulative test count >= 758 (the pre-Phase-1 baseline) plus the new unit and integration additions. +- [ ] `git grep -n 'rusqlite::' crates/vestige-core/src/neuroscience/ crates/vestige-core/src/advanced/` -- zero hits (the single pre-existing doc-comment reference in `active_forgetting.rs` is acceptable and does not introduce SQL dependency; code references must be zero). 
+- [ ] `git grep -n 'SqliteMemoryStore' crates/vestige-core/src/neuroscience/ crates/vestige-core/src/advanced/` -- zero hits.
+- [ ] `git grep -n 'fastembed::' crates/vestige-core/src/storage/sqlite.rs` -- zero hits (storage must never call fastembed directly; embedding goes through the `Embedder` trait held on the caller side).
+- [ ] `SqliteMemoryStore::insert` refuses a vector whose dimension disagrees with the registered model (returns `MemoryStoreError::InvalidInput`).
+- [ ] `SqliteMemoryStore::register_model` returns `MemoryStoreError::ModelMismatch` when a second, different signature is provided after a first was already registered.
+- [ ] After upgrading a V11 database to V12, every pre-existing row has `domains == "[]"` and `domain_scores == "{}"` with no NULLs.
+- [ ] `#[trait_variant::make(MemoryStore: Send)]` compiles; `Arc<dyn MemoryStore>` is movable across `tokio::spawn`.
+- [ ] Migration V12 is idempotent on replay: `apply_migrations` rewound to V11, re-applied, succeeds without error.
+- [ ] `vestige-core::storage::Storage` continues to resolve (via the `pub type` alias) at every current call site in `vestige-mcp`.
+- [ ] The `embedding_model` table can only hold a single row (programmatic invariant -- verified by an integration test that attempts to insert a second row and observes the `CHECK (id = 1)` and primary-key constraints reject it).
+- [ ] `registered_model()` is cached on first read; no SELECT is issued against `embedding_model` after the first hit within the same process (verified by wrapping the reader in a counting proxy in a dedicated test).
+
+## Rollback Notes
+
+If Phase 1 fails midway, rollback granularity is per-deliverable, and the DB can be downgraded by SQL.
+
+- **D1 (`memory_store.rs`)**: revert the new file. The trait has zero non-test consumers in Phase 1, so deletion is safe.
+- **D2 (`storage/mod.rs`)**: revert to the prior export list.
The only forward-facing identifier is the `pub type Storage = SqliteMemoryStore;` alias, which becomes `pub use sqlite::Storage;` again once `SqliteMemoryStore` is renamed back to `Storage`. +- **D3 (`sqlite.rs` rename + trait impl)**: revert the struct rename (`SqliteMemoryStore` -> `Storage`). The trait impl is a separate `impl` block and can be deleted wholesale. Inherent methods are unchanged and do not need to be touched. Net diff on revert: delete one `impl LocalMemoryStore for ...` block plus the two helper functions (`row_to_record`, `enforce_model`). +- **D4 (Migration V12)**: DOWN migration script: + +```sql +-- Phase 1 rollback: drop Phase 1 schema additions. +-- WARNING: this deletes any `domains` / `domain_scores` values stored under V12. +-- Execute ONLY when downgrading from V12 to V11 on a database where no Phase 4 +-- work has happened yet (otherwise you lose domain classifications). + +DROP TABLE IF EXISTS domains; +DROP INDEX IF EXISTS idx_nodes_domains; +DROP INDEX IF EXISTS idx_nodes_domain_scores; + +-- SQLite does not support DROP COLUMN before 3.35; the project's bundled +-- rusqlite uses 3.45+ (see `bundled-sqlite` feature). So the DROP COLUMN +-- form below is safe on every target platform. +ALTER TABLE knowledge_nodes DROP COLUMN domains; +ALTER TABLE knowledge_nodes DROP COLUMN domain_scores; + +DROP TABLE IF EXISTS embedding_model; + +UPDATE schema_version SET version = 11, applied_at = datetime('now'); +``` + + Operationally: the DOWN script is NOT included in the source migrations list (migrations are forward-only). If a rollback is required, it is applied manually via `sqlite3 vestige.db < rollback_v12.sql`. A backup via `storage.backup_to(...)` MUST be taken before the Phase 1 migration runs in production -- the `Storage::backup_to` method already exists (line 3903) and does not need changes. + +- **D5/D6 (`embedder/`)**: delete the module. `EmbeddingService` is untouched, so callers that still use it continue to work. 
The new `Embedder` trait has no pre-Phase-2 consumers.
+- **D7 (`lib.rs`)**: revert the re-export additions. Zero downstream impact since the new symbols have no pre-Phase-2 consumers.
+- **D8 (cognitive module audit)**: audit-only, no code changes. Nothing to roll back unless `consolidation/sleep.rs` was changed; if so, revert.
+- **Crate-level considerations**:
+ - `trait-variant` must remain in `Cargo.toml` until every consumer of the trait alias has been reverted. Safe to leave in `[dependencies]` indefinitely; it has no runtime cost.
+ - `blake3` was going to be added in Phase 3 anyway; leaving it in on rollback is harmless.
+ - `rusqlite` version stays pinned; no bump required for Phase 1.
+
+## Open Implementation Questions
+
+These are implementation choices only; architectural questions are resolved in ADR 0001.
+
+1. **`MemoryRecord` vs `KnowledgeNode` as the trait currency.**
+ - Candidate A: `MemoryRecord` (new, lean type matching the PRD) -- chosen.
+ - Candidate B: use existing `KnowledgeNode` directly.
+ - **Recommendation: A.** `KnowledgeNode` carries 30+ FSRS / dual-strength / sentiment / temporal fields that bind callers to the SQLite columns. `MemoryRecord` is what `PgMemoryStore` and future backends will want. The SQLite impl converts between the two at the boundary, which is a ~40-line `impl From<KnowledgeNode> for MemoryRecord` (and back) shim. Pays for itself in Phase 2.
+
+2. **`async fn` in traits vs boxed futures via `async-trait`.**
+ - Candidate A: use `trait-variant` (RPITIT-based, MSRV 1.75+, our MSRV is 1.91).
+ - Candidate B: use `async-trait` (allocates one `Box` per call).
+ - **Recommendation: A.** `trait-variant` generates both the base `LocalMemoryStore` and the `Send`-bound `MemoryStore` from one definition, matches what the PRD explicitly calls out, and avoids the allocation overhead of boxed futures on every CRUD call.
+
+3. 
**Blocking SQLite under async signatures: spawn_blocking vs inline.**
+ - Candidate A: bodies call the existing sync `self.writer.lock()...` inline inside the `async fn`.
+ - Candidate B: bodies wrap in `tokio::task::spawn_blocking`.
+ - **Recommendation: A for Phase 1.** The current call sites are a mix of sync (CLI, bin/restore.rs) and async (MCP handlers). Introducing `spawn_blocking` would force a tokio runtime even for CLI use. Inline blocking under `async fn` is a documented pattern that compiles and works; in Phase 2 the Postgres impl uses `sqlx`, which is natively async, and we can revisit the SQLite blocking policy at that point. Phase 1's priority is "no behavior change".
+
+4. **Where does `register_model` get called from: storage-side auto-register, or caller-side explicit?**
+ - Candidate A: caller explicitly calls `store.register_model(embedder.signature())` once after `MemoryStore::init`.
+ - Candidate B: first `insert` with a vector auto-registers.
+ - **Recommendation: B.** The current code path (`Storage::ingest` -> `generate_embedding_for_node` -> INSERT into `node_embeddings`) has no explicit registration step, and we want "no behavior change". Auto-register on first embedded write preserves the exact current UX. Callers who care (migration tooling, Phase 2 `--reembed`) can still call `register_model` explicitly; re-registering an identical signature is an idempotent no-op.
+
+5. **`model_hash` content: fastembed ONNX bytes vs `(name, dim, crate_version)`.**
+ - Candidate A: hash the ONNX file bytes on disk (after model download).
+ - Candidate B: hash `(name, dim, vestige-core CARGO_PKG_VERSION)`.
+ - **Recommendation: B.** Fastembed caches ONNX files under `FASTEMBED_CACHE_PATH`; reading them from inside `FastembedEmbedder::new()` couples the embedder to fastembed's caching behavior and adds slow startup. Hashing `(name, dim, our crate version)` catches the "silent model drift between vestige versions" case the ADR calls out under Risks.
Phase 2 can add a content-hashed `OnnxEmbedder` that loads any file and genuinely hashes it; the trait method signature stays the same.
+
+6. **`LocalMemoryStore`: `Sync + 'static` or just `Sync`?**
+ - Candidate A: `Sync + 'static`.
+ - Candidate B: `Sync`.
+ - **Recommendation: A.** `'static` is required for `Arc<dyn MemoryStore>`, which is the target call pattern (Axum, MCP server, cognitive engine). Every impl we have in mind -- `SqliteMemoryStore`, `PgMemoryStore` -- holds owned state (connection pool, vector index), so `'static` is free.
+
+7. **Should trait methods appear on the SQLite impl instead of being separate?**
+ - Candidate A: keep the current ~85 inherent methods on `SqliteMemoryStore` AND add the `impl LocalMemoryStore` block.
+ - Candidate B: move every inherent method into the trait.
+ - **Recommendation: A.** Many inherent methods (e.g. `run_rac1_cascade_sweep`, `apply_rac1_cascade`, `save_insight`, `save_connection`, `preview_review`, `get_memory_subgraph`) have SQLite-specific semantics, transactional behavior, and call patterns that do not belong in a backend-agnostic trait. They will stay SQLite-only or be extracted into new traits in a post-Phase-4 cleanup. Phase 1's job is to expose the ~25-method contract the ADR specifies, not to retrofit the entire API.
+
+8. **Where do `Domain` centroid bytes live?**
+ - Candidate A: `BLOB` column on the `domains` table.
+ - Candidate B: JSON-encoded array of f32 in a `TEXT` column.
+ - **Recommendation: A.** Consistent with how `node_embeddings.embedding` already stores vectors (little-endian f32 bytes via `Embedding::to_bytes`). JSON would triple the storage size and slow deserialization. The `Domain::centroid: Vec<f32>` field round-trips through the same codec.
+
+9. **Migration numbering when Phase 2 also wants to add a migration.**
+ - Candidate A: Phase 1 takes V12, Phase 2 takes V13.
+ - Candidate B: Phase 1 takes V12, Phase 2 re-shapes V12 to include its changes.
+ - **Recommendation: A.** Migrations are forward-only and append-only in this project. Phase 2 adds V13 (for the `review_events` append-only table, if that lands in Phase 2 -- otherwise it is Phase 5 work).
+
+10. **Integration test crate location: sibling to `tests/e2e/` or inside `crates/vestige-core/tests/`.**
+ - Candidate A: new workspace member at `tests/phase_1/` (sibling to `tests/e2e/`).
+ - Candidate B: under `crates/vestige-core/tests/` (standard cargo integration-test layout).
+ - **Recommendation: A.** Matches the existing pattern of `tests/e2e/`, which is already a workspace member with its own `Cargo.toml`. Keeps the Phase 1 test binary outputs in a predictable location (`target/debug/deps/trait_round_trip-*`). Also avoids the build-graph churn where `crates/vestige-core/tests/` would re-link everything under `vestige-core` on each edit.
+
+### Critical Files for Implementation
+
+- /home/delandtj/prppl/vestige/crates/vestige-core/src/storage/memory_store.rs (new; contains the `LocalMemoryStore` / `MemoryStore` traits plus `MemoryRecord`, `SchedulingState`, `SearchQuery`, `SearchResult`, `MemoryEdge`, `Domain`, `ClassificationResult`, `StoreStats`, `HealthStatus`, `MemoryStoreError`, `ModelSignature`)
+- /home/delandtj/prppl/vestige/crates/vestige-core/src/storage/sqlite.rs (rename `Storage` -> `SqliteMemoryStore`, add the `impl LocalMemoryStore` block and the `enforce_model` / `row_to_record` helpers; ~200 line diff on a 4592-line file)
+- /home/delandtj/prppl/vestige/crates/vestige-core/src/storage/migrations.rs (append `Migration { version: 12, ... 
}` + `MIGRATION_V12_UP` constant; ~80 new lines) +- /home/delandtj/prppl/vestige/crates/vestige-core/src/embedder/mod.rs (new; `Embedder` + `LocalEmbedder` traits, `EmbedderError`, default `embed_batch`) +- /home/delandtj/prppl/vestige/crates/vestige-core/src/embedder/fastembed.rs (new; `FastembedEmbedder` implementation adapting the existing `EmbeddingService`) diff --git a/docs/plans/0002-phase-2-postgres-backend.md b/docs/plans/0002-phase-2-postgres-backend.md new file mode 100644 index 0000000..a372e27 --- /dev/null +++ b/docs/plans/0002-phase-2-postgres-backend.md @@ -0,0 +1,1269 @@ +# Phase 2 Plan: PostgreSQL Backend + +**Status**: Draft +**Depends on**: Phase 1 (MemoryStore + Embedder traits, embedding_model registry, domain columns) +**Related**: docs/adr/0001-pluggable-storage-and-network-access.md (Phase 2), docs/prd/001-getting-centralized-vestige.md + +--- + +## Scope + +### In scope + +- `PgMemoryStore` struct implementing the Phase 1 `MemoryStore` trait against `sqlx::PgPool`, including compile-time checked queries via `sqlx::query!` / `sqlx::query_as!`. +- First-class `pgvector` integration: typed `Vector` columns, HNSW index (`vector_cosine_ops`, `m = 16`, `ef_construction = 64`), and use of the cosine-distance operator `<=>`. +- First-class Postgres FTS: GENERATED `tsvector` column (`search_vec`) with `setweight` (A=content, B=node_type, C=tags), GIN index, and `websearch_to_tsquery` at query time. +- Hybrid search via Reciprocal Rank Fusion (RRF) expressed as a single SQL statement with CTEs for FTS and vector subqueries, with optional domain filter through array overlap (`&&`). +- sqlx migrations directory at `crates/vestige-core/migrations/postgres/`, numbered `{NNNN}_{name}.up.sql` / `{NNNN}_{name}.down.sql`, runnable by `sqlx::migrate!` at startup and by `sqlx-cli`. +- Offline query cache committed under `crates/vestige-core/.sqlx/` so a DATABASE_URL is not required at build time. 
+- Backend selection via `vestige.toml`: `[storage]` section with `backend = "sqlite" | "postgres"` plus the per-backend subsection (`[storage.sqlite]`, `[storage.postgres]`). Exclusive at compile time via the `postgres-backend` feature, exclusive at runtime via the enum.
+- CLI: `vestige migrate --from sqlite --to postgres --sqlite-path <path> --postgres-url <url>` -- streaming copy with progress output.
+- CLI: `vestige migrate --reembed --model=<name>` -- O(n) re-embed under a new `Embedder`, registry update, HNSW rebuild.
+- Testcontainer-based integration tests using the `pgvector/pgvector:pg16` image, behind the `postgres-backend` feature so SQLite-only builds remain untouched.
+- `PgMemoryStore` parity with `SqliteMemoryStore` across every public `MemoryStore` method defined in Phase 1.
+
+### Out of scope
+
+- Phase 3 (network access): HTTP MCP transport, API key auth, `vestige keys` CLI. The `api_keys` DDL is declared by Phase 3; Phase 2 does not create it.
+- Phase 4 (emergent domain classification): `DomainClassifier`, HDBSCAN, discover / rename / merge CLI. Phase 2 provisions the `domains` and `domain_scores` columns and the `domains` table structure so Phase 4 slots in without further migration, but does not compute or classify.
+- Phase 5 (federation): cross-node sync. The `review_events` table is declared in Phase 1; Phase 2 only references it where FSRS writes happen.
+- Changes to the cognitive engine, Phase 1 traits, or the embedding pipeline itself. Phase 2 only adds a backend.
+- SQLCipher parity for Postgres. Operator responsibility (TLS to Postgres, pgcrypto, disk-level encryption) is out of scope for this phase.
+
+---
+
+## Prerequisites
+
+### Expected Phase 1 artifacts (consumed, not produced)
+
+Phase 2 treats all of the following as fixed interfaces. Each path is the expected Phase 1 location.
+
+- `crates/vestige-core/src/storage/mod.rs` -- re-exports the trait and the two concrete backends.
+- `crates/vestige-core/src/storage/memory_store.rs` -- defines the `MemoryStore` trait (generated by `trait_variant::make` from `LocalMemoryStore`) with the full CRUD, search, FSRS, graph, and domain surface from the PRD. Phase 2 implements every method here.
+- `crates/vestige-core/src/storage/types.rs` -- shared value types: `MemoryRecord`, `SchedulingState`, `SearchQuery`, `SearchResult`, `MemoryEdge`, `Domain`, `StoreStats`, `HealthStatus`.
+- `crates/vestige-core/src/storage/error.rs` -- `StoreError` enum plus `pub type StoreResult<T> = Result<T, StoreError>`. Phase 2 extends this with `StoreError::Postgres(sqlx::Error)` and `StoreError::Migrate(sqlx::migrate::MigrateError)` via `From` impls (the variants themselves MUST live behind `#[cfg(feature = "postgres-backend")]`).
+- `crates/vestige-core/src/embedder/mod.rs` -- `Embedder` trait with `embed`, `model_name`, `dimension`, `model_hash`. Phase 2 calls `model_name()`, `dimension()`, and `model_hash()` for the registry.
+- `crates/vestige-core/src/storage/sqlite.rs` -- `SqliteMemoryStore: MemoryStore`. Phase 2's `migrate --from sqlite --to postgres` uses this as the source.
+- `crates/vestige-core/src/storage/registry.rs` -- `EmbeddingModelRegistry` abstraction that both backends implement. Phase 2 supplies a Postgres version writing to `embedding_model`.
+- `crates/vestige-core/migrations/sqlite/` -- V12 (Phase 1) adds `domains TEXT` (JSON-encoded array), `domain_scores TEXT` (JSON), `embedding_model(name, dimension, hash, created_at)`, and `review_events(id, memory_id, timestamp, rating, prior_state, new_state)`. Phase 2 mirrors every column and table in Postgres.
+
+If any of the above is missing when Phase 2 starts, the first action is to surface the gap back to Phase 1 -- do NOT backfill a partial trait in Phase 2.
+
+### Required crates (declared in Phase 2, not installed by this doc)
+
+The agent running Phase 2 uses `cargo add` in `crates/vestige-core/` for each dependency below. Exact versions and feature flags:
+
+- `sqlx@0.8` with features `runtime-tokio`, `tls-rustls`, `postgres`, `uuid`, `chrono`, `json`, `migrate`, `macros`. Optional (gated by `postgres-backend`).
+- `pgvector@0.4` with features `sqlx`. Optional (gated by `postgres-backend`).
+- `deadpool` is NOT needed; `sqlx::PgPool` is the pool. +- `toml@0.8` (no features) for `vestige.toml` parsing. Moved to non-optional because both backends share the config surface. +- `figment@0.10` with features `toml`, `env` -- optional, only if Phase 1 has not already picked a config loader. If Phase 1 ships a loader, skip `figment` and reuse. +- `dirs@6` -- already a transitive `directories` dependency; reuse existing. +- `tokio-stream@0.1` (no features). Used by migrate commands for streamed iteration. +- `indicatif@0.17` (no features). Progress bars for the migrate CLI. +- `futures@0.3` with features `std`. Consumed by sqlx stream combinators. + +Dev-only (under `[dev-dependencies]` in `crates/vestige-core/Cargo.toml`, gated by `postgres-backend`): + +- `testcontainers@0.22` with features `blocking` off, `async` on (default). +- `testcontainers-modules@0.10` with features `postgres`. +- `tokio@1` features `macros`, `rt-multi-thread` (already present for core tests). +- `criterion@0.5` already present; add a new `[[bench]]` entry. + +Feature additions in `crates/vestige-core/Cargo.toml`: + +``` +[features] +postgres-backend = ["dep:sqlx", "dep:pgvector", "dep:tokio-stream", "dep:futures"] +``` + +`postgres-backend` is OFF by default. `default = ["embeddings", "vector-search", "bundled-sqlite"]` stays unchanged. `vestige-mcp` forwards a new feature `postgres-backend = ["vestige-core/postgres-backend"]`. + +### External tooling + +- PostgreSQL 16 or newer (uses `gen_random_uuid()` from `pgcrypto` bundled via `CREATE EXTENSION pgcrypto` in migration 0001; pgvector HNSW indexes require pgvector 0.5+). +- The `pgvector` extension installed in the target database (our migration issues `CREATE EXTENSION IF NOT EXISTS vector`). +- `sqlx-cli@0.8` installed on the developer machine for `cargo sqlx prepare --workspace` and `cargo sqlx migrate add` (not a build-time requirement once `.sqlx/` is committed). 
+- Docker or Podman reachable by the test harness for `testcontainers-modules::postgres` to spin up `pgvector/pgvector:pg16`. + +### Assumed Rust toolchain + +- Rust 2024 edition. +- MSRV 1.91 (per `CLAUDE.md`). `sqlx 0.8` is compatible. +- `rustflags` unchanged. No `nightly`-only features. + +--- + +## Deliverables + +1. Feature gate `postgres-backend` in `crates/vestige-core/Cargo.toml` and `crates/vestige-mcp/Cargo.toml` that cleanly disables all Postgres code paths when off. +2. `crates/vestige-core/src/storage/postgres/mod.rs` -- `PgMemoryStore` struct and `MemoryStore` trait impl (public entry point). +3. `crates/vestige-core/src/storage/postgres/pool.rs` -- `PgMemoryStore::connect(config)` and pool configuration. +4. `crates/vestige-core/src/storage/postgres/search.rs` -- RRF hybrid search query builder and row -> `SearchResult` mapping. +5. `crates/vestige-core/src/storage/postgres/migrations.rs` -- wraps `sqlx::migrate!("./migrations/postgres")` and surfaces typed errors. +6. `crates/vestige-core/src/storage/postgres/registry.rs` -- Postgres `EmbeddingModelRegistry` implementation writing `embedding_model`. +7. `crates/vestige-core/migrations/postgres/0001_init.up.sql` + `0001_init.down.sql` -- extensions, `memories`, `scheduling`, `edges`, `domains`, `embedding_model`, `review_events`, all indexes. +8. `crates/vestige-core/migrations/postgres/0002_hnsw.up.sql` + `0002_hnsw.down.sql` -- HNSW index creation separated so it can be `CREATE INDEX CONCURRENTLY` during reembed. +9. `crates/vestige-core/src/config.rs` -- `VestigeConfig`, `StorageConfig`, `SqliteConfig`, `PostgresConfig`, `EmbeddingsConfig`, plus a single `VestigeConfig::load(path: Option<&Path>)` returning `Result`. +10. `crates/vestige-core/src/storage/postgres/migrate_cli.rs` -- streaming SQLite-to-Postgres copy, domain-aware, with `indicatif` progress. +11. 
`crates/vestige-core/src/storage/postgres/reembed.rs` -- `ReembedPlan` and its driver; re-encodes all memories via a supplied `Embedder`, updates `embedding_model`, rebuilds HNSW. +12. `crates/vestige-mcp/src/bin/cli.rs` -- two new `clap` subcommands `Migrate` (union of `--from/--to` and `--reembed` variants, one subcommand or two, see Open Questions) wired to deliverables 10 and 11. +13. `crates/vestige-core/.sqlx/` -- offline query cache, committed. +14. `tests/phase_2/` -- six integration test files listed in the Test Plan. +15. `crates/vestige-core/benches/pg_hybrid_search.rs` -- Criterion benches for RRF search at 1k and 100k memories, gated by `postgres-backend`. +16. `docs/runbook/postgres.md` -- brief ops note covering extension install, `max_connections`, backup discipline, and rollback caveats. (Short; only required for the "rollback of migrate" deliverable.) + +--- + +## Detailed Task Breakdown + +### D1. `postgres-backend` feature gate + +- **File**: `crates/vestige-core/Cargo.toml`, `crates/vestige-mcp/Cargo.toml` +- **Depends on**: nothing; this is the first change. +- **Rust snippets**: + +```toml +# crates/vestige-core/Cargo.toml +[features] +default = ["embeddings", "vector-search", "bundled-sqlite"] +bundled-sqlite = ["rusqlite/bundled"] +encryption = ["rusqlite/bundled-sqlcipher"] +postgres-backend = [ + "dep:sqlx", + "dep:pgvector", + "dep:tokio-stream", + "dep:futures", +] + +[dependencies] +sqlx = { version = "0.8", default-features = false, features = [ + "runtime-tokio", "tls-rustls", "postgres", "uuid", "chrono", + "json", "migrate", "macros", +], optional = true } +pgvector = { version = "0.4", features = ["sqlx"], optional = true } +tokio-stream = { version = "0.1", optional = true } +futures = { version = "0.3", optional = true } +toml = "0.8" +indicatif = "0.17" +``` + +- **Behavior notes**: keep the two backends mutually compilable per `CLAUDE.md`. Every `use sqlx::...` sits under `#[cfg(feature = "postgres-backend")]`. 
Every module under `crates/vestige-core/src/storage/postgres/` carries `#![cfg(feature = "postgres-backend")]` as its file-level inner attribute.
+
+### D2. `PgMemoryStore` core struct
+
+- **File**: `crates/vestige-core/src/storage/postgres/mod.rs`
+- **Depends on**: D1, Phase 1 `MemoryStore` trait and value types.
+- **Signatures**:
+
+```rust
+#![cfg(feature = "postgres-backend")]
+
+use std::sync::Arc;
+use std::time::Duration;
+
+use chrono::{DateTime, Utc};
+use pgvector::Vector;
+use sqlx::postgres::{PgConnectOptions, PgPoolOptions};
+use sqlx::PgPool;
+use uuid::Uuid;
+
+use crate::embedder::Embedder;
+use crate::storage::error::{StoreError, StoreResult};
+use crate::storage::types::{
+    Domain, HealthStatus, MemoryEdge, MemoryRecord, SchedulingState,
+    SearchQuery, SearchResult, StoreStats,
+};
+use crate::storage::memory_store::LocalMemoryStore;
+
+pub mod migrations;
+pub mod pool;
+pub mod registry;
+pub mod search;
+pub mod migrate_cli;
+pub mod reembed;
+
+/// Postgres-backed implementation of `MemoryStore`.
+///
+/// Cheaply cloneable. Methods take `&self`; interior state lives inside
+/// the `PgPool` (which already provides `Sync` via `Arc` internally).
+#[derive(Clone)]
+pub struct PgMemoryStore {
+    pool: PgPool,
+    embedding_dim: i32,
+    embedding_model: Arc<EmbeddingModelDescriptor>,
+}
+
+#[derive(Debug, Clone)]
+pub struct EmbeddingModelDescriptor {
+    pub name: String,
+    pub dimension: i32,
+    pub hash: String,
+}
+
+impl PgMemoryStore {
+    /// Construct a new store. Runs migrations, reads the registry, validates
+    /// that the embedder matches the registered model.
+    pub async fn connect(
+        url: &str,
+        max_connections: u32,
+        embedder: &dyn Embedder,
+    ) -> StoreResult<Self>;
+
+    /// Low-level constructor for tests: supply an existing pool, skip migrate.
+    pub async fn from_pool(
+        pool: PgPool,
+        embedder: &dyn Embedder,
+    ) -> StoreResult<Self>;
+
+    /// Accessor used by migrate/reembed CLI.
+    pub fn pool(&self) -> &PgPool { &self.pool }
+
+    pub fn embedding_dim(&self) -> i32 { self.embedding_dim }
+}
+
+// `MemoryStore` is the `Send`-bound variant that Phase 1 generates from
+// `LocalMemoryStore` via `#[trait_variant::make(MemoryStore: Send)]`;
+// `PgMemoryStore` implements the variant directly (sqlx futures are
+// already `Send`). Signature listing only -- bodies elided.
+impl crate::storage::memory_store::MemoryStore for PgMemoryStore {
+    async fn init(&self) -> StoreResult<()>;
+    async fn health_check(&self) -> StoreResult<HealthStatus>;
+
+    async fn insert(&self, record: &MemoryRecord) -> StoreResult<Uuid>;
+    async fn get(&self, id: Uuid) -> StoreResult<Option<MemoryRecord>>;
+    async fn update(&self, record: &MemoryRecord) -> StoreResult<()>;
+    async fn delete(&self, id: Uuid) -> StoreResult<()>;
+
+    async fn search(&self, query: &SearchQuery) -> StoreResult<Vec<SearchResult>>;
+    async fn fts_search(&self, text: &str, limit: usize) -> StoreResult<Vec<SearchResult>>;
+    async fn vector_search(&self, embedding: &[f32], limit: usize) -> StoreResult<Vec<SearchResult>>;
+
+    async fn get_scheduling(&self, memory_id: Uuid) -> StoreResult<Option<SchedulingState>>;
+    async fn update_scheduling(&self, state: &SchedulingState) -> StoreResult<()>;
+    async fn get_due_memories(
+        &self,
+        before: DateTime<Utc>,
+        limit: usize,
+    ) -> StoreResult<Vec<MemoryRecord>>;
+
+    async fn add_edge(&self, edge: &MemoryEdge) -> StoreResult<()>;
+    async fn get_edges(&self, node_id: Uuid, edge_type: Option<&str>) -> StoreResult<Vec<MemoryEdge>>;
+    async fn remove_edge(&self, source: Uuid, target: Uuid, edge_type: &str) -> StoreResult<()>;
+    async fn get_neighbors(&self, node_id: Uuid, depth: usize) -> StoreResult<Vec<MemoryRecord>>;
+
+    async fn list_domains(&self) -> StoreResult<Vec<Domain>>;
+    async fn get_domain(&self, id: &str) -> StoreResult<Option<Domain>>;
+    async fn upsert_domain(&self, domain: &Domain) -> StoreResult<()>;
+    async fn delete_domain(&self, id: &str) -> StoreResult<()>;
+    async fn classify(&self, embedding: &[f32]) -> StoreResult<Vec<ClassificationResult>>;
+
+    async fn count(&self) -> StoreResult<u64>;
+    async fn get_stats(&self) -> StoreResult<StoreStats>;
+    async fn vacuum(&self) -> StoreResult<()>;
+}
+```
+
+- **SQL (inline within impl methods)**: every call uses `sqlx::query!` or `sqlx::query_as!` for compile-time validation.
Examples:
+
+```rust
+// insert
+sqlx::query!(
+    r#"
+    INSERT INTO memories (
+        id, domains, domain_scores, content, node_type, tags,
+        embedding, metadata, created_at, updated_at
+    ) VALUES ($1, $2, $3, $4, $5, $6, $7::vector, $8, $9, $10)
+    "#,
+    record.id,
+    &record.domains as &[String],
+    serde_json::to_value(&record.domain_scores)?,
+    record.content,
+    record.node_type,
+    &record.tags as &[String],
+    record.embedding.as_ref().map(|v| Vector::from(v.clone())) as Option<Vector>,
+    record.metadata,
+    record.created_at,
+    record.updated_at,
+)
+.execute(&self.pool)
+.await?;
+```
+
+- **Behavior notes**:
+  - `StoreError` gets two new variants behind the feature:
+
+```rust
+#[cfg(feature = "postgres-backend")]
+#[error("postgres error: {0}")]
+Postgres(#[from] sqlx::Error),
+
+#[cfg(feature = "postgres-backend")]
+#[error("postgres migration error: {0}")]
+Migrate(#[from] sqlx::migrate::MigrateError),
+```
+
+  - `classify()` on Postgres implements the PRD's cosine-similarity-to-centroid computation inside SQL using `1 - (centroid <=> $1::vector)` over the `domains` table and returns rows sorted descending. This mirrors the behavior a `DomainClassifier` in Phase 4 uses; Phase 2 ships the backend capability but does not call it.
+  - Connection pool defaults (see D3): `max_connections = 10`, `acquire_timeout = 30s`, `idle_timeout = 600s`, `test_before_acquire = false` (cheap queries; avoid a per-acquire roundtrip).
+  - All methods are `async fn` and use sqlx's `tokio` runtime feature; no blocking `block_on`.
+
+### D3. Pool construction and config wiring
+
+- **File**: `crates/vestige-core/src/storage/postgres/pool.rs`
+- **Depends on**: D1, D2, D9.
+- **Signatures**: + +```rust +#![cfg(feature = "postgres-backend")] + +use sqlx::postgres::{PgConnectOptions, PgPoolOptions}; +use sqlx::{ConnectOptions, PgPool}; +use std::str::FromStr; +use std::time::Duration; + +use crate::config::PostgresConfig; +use crate::storage::error::{StoreError, StoreResult}; + +pub async fn build_pool(cfg: &PostgresConfig) -> StoreResult { + let mut opts = PgConnectOptions::from_str(&cfg.url)?; + opts = opts + .application_name("vestige") + .statement_cache_capacity(256) + .log_statements(tracing::log::LevelFilter::Debug); + + let pool = PgPoolOptions::new() + .max_connections(cfg.max_connections.unwrap_or(10)) + .min_connections(0) + .acquire_timeout(Duration::from_secs(cfg.acquire_timeout_secs.unwrap_or(30))) + .idle_timeout(Some(Duration::from_secs(600))) + .max_lifetime(Some(Duration::from_secs(1800))) + .test_before_acquire(false) + .connect_with(opts) + .await?; + + Ok(pool) +} +``` + +- **Behavior notes**: acquire timeout chosen to exceed the 30-second testcontainer spin-up requirement. `application_name = "vestige"` makes `pg_stat_activity` readable from `psql` during debugging. + +### D4. sqlx migrations directory + +- **File**: `crates/vestige-core/migrations/postgres/0001_init.up.sql`, `0001_init.down.sql`, `0002_hnsw.up.sql`, `0002_hnsw.down.sql`. +- **Depends on**: none (pure SQL). + +`0001_init.up.sql`: + +```sql +-- Extensions +CREATE EXTENSION IF NOT EXISTS pgcrypto; +CREATE EXTENSION IF NOT EXISTS vector; + +-- Embedding model registry +-- Mirrors the SQLite table created in Phase 1. +CREATE TABLE embedding_model ( + id SMALLINT PRIMARY KEY DEFAULT 1 CHECK (id = 1), + name TEXT NOT NULL, + dimension INTEGER NOT NULL CHECK (dimension > 0), + hash TEXT NOT NULL, + created_at TIMESTAMPTZ NOT NULL DEFAULT now() +); + +-- Domains table (populated by Phase 4 DomainClassifier; Phase 2 only creates +-- the empty table so list/get/upsert/delete work against both backends). 
+CREATE TABLE domains ( + id TEXT PRIMARY KEY, + label TEXT NOT NULL, + centroid vector, + top_terms TEXT[] NOT NULL DEFAULT '{}', + memory_count INTEGER NOT NULL DEFAULT 0, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + metadata JSONB NOT NULL DEFAULT '{}'::jsonb +); + +-- Core memories table +CREATE TABLE memories ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + domains TEXT[] NOT NULL DEFAULT '{}', + domain_scores JSONB NOT NULL DEFAULT '{}'::jsonb, + content TEXT NOT NULL, + node_type TEXT NOT NULL DEFAULT 'general', + tags TEXT[] NOT NULL DEFAULT '{}', + embedding vector, + metadata JSONB NOT NULL DEFAULT '{}'::jsonb, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT now(), + search_vec TSVECTOR GENERATED ALWAYS AS ( + setweight(to_tsvector('english', coalesce(content, '')), 'A') || + setweight(to_tsvector('english', coalesce(node_type, '')), 'B') || + setweight(to_tsvector('english', coalesce(array_to_string(tags, ' '), '')), 'C') + ) STORED +); + +-- FSRS scheduling state (1:1 with memories) +CREATE TABLE scheduling ( + memory_id UUID PRIMARY KEY REFERENCES memories(id) ON DELETE CASCADE, + stability DOUBLE PRECISION NOT NULL DEFAULT 0.0, + difficulty DOUBLE PRECISION NOT NULL DEFAULT 0.0, + retrievability DOUBLE PRECISION NOT NULL DEFAULT 1.0, + last_review TIMESTAMPTZ, + next_review TIMESTAMPTZ, + reps INTEGER NOT NULL DEFAULT 0, + lapses INTEGER NOT NULL DEFAULT 0 +); + +-- Graph edges (spreading activation) +CREATE TABLE edges ( + source_id UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE, + target_id UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE, + edge_type TEXT NOT NULL DEFAULT 'related', + weight DOUBLE PRECISION NOT NULL DEFAULT 1.0, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + PRIMARY KEY (source_id, target_id, edge_type) +); + +-- FSRS review event log (Phase 1 creates this; Phase 2 mirrors it for Postgres). +-- Append-only. Used for future federation (Phase 5). 
+CREATE TABLE review_events ( + id BIGSERIAL PRIMARY KEY, + memory_id UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE, + timestamp TIMESTAMPTZ NOT NULL DEFAULT now(), + rating SMALLINT NOT NULL, + prior_state JSONB NOT NULL, + new_state JSONB NOT NULL +); + +-- Indexes on memories (vector index is declared separately in 0002_hnsw.up.sql) +CREATE INDEX idx_memories_fts ON memories USING GIN (search_vec); +CREATE INDEX idx_memories_domains ON memories USING GIN (domains); +CREATE INDEX idx_memories_tags ON memories USING GIN (tags); +CREATE INDEX idx_memories_node_type ON memories (node_type); +CREATE INDEX idx_memories_created ON memories (created_at); +CREATE INDEX idx_memories_updated ON memories (updated_at); + +-- Indexes on scheduling +CREATE INDEX idx_scheduling_next_review ON scheduling (next_review); +CREATE INDEX idx_scheduling_last_review ON scheduling (last_review); + +-- Indexes on edges +CREATE INDEX idx_edges_target ON edges (target_id); +CREATE INDEX idx_edges_source ON edges (source_id); +CREATE INDEX idx_edges_type ON edges (edge_type); + +-- Indexes on review_events +CREATE INDEX idx_review_events_memory ON review_events (memory_id); +CREATE INDEX idx_review_events_ts ON review_events (timestamp); + +-- Update trigger on memories.updated_at +CREATE OR REPLACE FUNCTION memories_set_updated_at() RETURNS TRIGGER AS $$ +BEGIN + NEW.updated_at := now(); + RETURN NEW; +END; +$$ LANGUAGE plpgsql; + +CREATE TRIGGER trg_memories_updated_at +BEFORE UPDATE ON memories +FOR EACH ROW EXECUTE FUNCTION memories_set_updated_at(); +``` + +`0001_init.down.sql`: + +```sql +DROP TRIGGER IF EXISTS trg_memories_updated_at ON memories; +DROP FUNCTION IF EXISTS memories_set_updated_at(); + +DROP INDEX IF EXISTS idx_review_events_ts; +DROP INDEX IF EXISTS idx_review_events_memory; +DROP INDEX IF EXISTS idx_edges_type; +DROP INDEX IF EXISTS idx_edges_source; +DROP INDEX IF EXISTS idx_edges_target; +DROP INDEX IF EXISTS idx_scheduling_last_review; +DROP INDEX IF 
EXISTS idx_scheduling_next_review; +DROP INDEX IF EXISTS idx_memories_updated; +DROP INDEX IF EXISTS idx_memories_created; +DROP INDEX IF EXISTS idx_memories_node_type; +DROP INDEX IF EXISTS idx_memories_tags; +DROP INDEX IF EXISTS idx_memories_domains; +DROP INDEX IF EXISTS idx_memories_fts; + +DROP TABLE IF EXISTS review_events; +DROP TABLE IF EXISTS edges; +DROP TABLE IF EXISTS scheduling; +DROP TABLE IF EXISTS memories; +DROP TABLE IF EXISTS domains; +DROP TABLE IF EXISTS embedding_model; +``` + +`0002_hnsw.up.sql` (separated so reembed can drop-and-recreate without touching the rest of the schema): + +```sql +-- HNSW index on memories.embedding. +-- pgvector requires the column to have a typmod (fixed dimension) for HNSW. +-- The dimension is stamped by the application at startup via ALTER TABLE +-- using the embedder's dimension() method (see PgMemoryStore::connect). +-- We express the index with the generic vector_cosine_ops operator class. +CREATE INDEX idx_memories_embedding_hnsw + ON memories USING hnsw (embedding vector_cosine_ops) + WITH (m = 16, ef_construction = 64); +``` + +`0002_hnsw.down.sql`: + +```sql +DROP INDEX IF EXISTS idx_memories_embedding_hnsw; +``` + +- **Behavior notes**: + - pgvector HNSW requires a typmod. `PgMemoryStore::connect` runs `ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)` with `$N = embedder.dimension()` exactly once, guarded by a check against `embedding_model` (first startup ever) or validated against it on subsequent starts. If `embedder.dimension()` differs from the stored one and `embedding_model` is non-empty, return `StoreError::EmbeddingDimensionMismatch` -- the user must run `vestige migrate --reembed`. + - `ALTER COLUMN ... TYPE vector($N)` on a populated column fails unless the data fits; that is the desired safety net. 
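+  - A minimal SQL sketch of that connect-time sequence (the dimension 768, model name, and hash are illustrative; the real values come from the `Embedder`):
+
+    ```sql
+    -- First startup: embedding_model is empty, so stamp the typmod once
+    -- and record the model.
+    ALTER TABLE memories ALTER COLUMN embedding TYPE vector(768);
+    INSERT INTO embedding_model (id, name, dimension, hash)
+    VALUES (1, 'BAAI/bge-base-en-v1.5', 768, 'sha256:...');
+
+    -- Subsequent startups: validate instead of altering; a mismatch with
+    -- embedder.dimension() maps to StoreError::EmbeddingDimensionMismatch.
+    SELECT name, dimension, hash FROM embedding_model WHERE id = 1;
+    ```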
+  - The `tsvector` GENERATED column uses `array_to_string(tags, ' ')` rather than `array_to_tsvector` from the PRD sketch, because `array_to_tsvector` treats its input as already-normalized lexemes -- it applies none of the `english` configuration's stemming or stop-word handling -- so tag matches would diverge from how queries are parsed. Running `to_tsvector` over the joined string keeps weight C consistent with the other weights.
+  - `gen_random_uuid()` comes from `pgcrypto`. In Postgres 13+ it is also available from core; we keep the extension for older compatibility paths.
+  - MVCC: all table writes are transactional; no explicit locks. `INSERT ... ON CONFLICT DO UPDATE` is used in `upsert_domain`, `update_scheduling`, and edge idempotency.
+
+### D5. Hybrid search via RRF
+
+- **File**: `crates/vestige-core/src/storage/postgres/search.rs`
+- **Depends on**: D2, D4.
+- **Signatures**:
+
+```rust
+#![cfg(feature = "postgres-backend")]
+
+use pgvector::Vector;
+use sqlx::PgPool;
+use uuid::Uuid;
+
+use crate::storage::error::StoreResult;
+use crate::storage::types::{SearchQuery, SearchResult};
+
+const RRF_K: i32 = 60;         // constant from Cormack et al. 2009
+const OVERFETCH_MULT: i64 = 3; // matches Phase 1 SQLite overfetch
+
+pub(crate) async fn rrf_search(
+    pool: &PgPool,
+    query: &SearchQuery,
+) -> StoreResult<Vec<SearchResult>>;
+```
+
+SQL for the full hybrid RRF query.
Placeholders:
+- `$1` = text query (string, may be empty)
+- `$2` = embedding (vector)
+- `$3` = overfetch limit per branch (int)
+- `$4` = final limit (int)
+- `$5` = domain filter (text[] or NULL)
+- `$6` = node_type filter (text[] or NULL)
+- `$7` = tag filter (text[] or NULL)
+
+```sql
+WITH params AS (
+    SELECT
+        $1::text   AS q_text,
+        $2::vector AS q_vec,
+        $3::int    AS overfetch,
+        $4::int    AS final_limit,
+        $5::text[] AS dom_filter,
+        $6::text[] AS nt_filter,
+        $7::text[] AS tag_filter
+),
+fts AS (
+    SELECT m.id,
+           ts_rank_cd(m.search_vec, websearch_to_tsquery('english', p.q_text)) AS score,
+           ROW_NUMBER() OVER (
+               ORDER BY ts_rank_cd(m.search_vec, websearch_to_tsquery('english', p.q_text)) DESC
+           ) AS rank
+    FROM memories m, params p
+    WHERE p.q_text <> ''
+      AND m.search_vec @@ websearch_to_tsquery('english', p.q_text)
+      AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
+      AND (p.nt_filter IS NULL OR m.node_type = ANY(p.nt_filter))
+      AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
+    ORDER BY rank
+    LIMIT (SELECT overfetch FROM params)
+),
+vec AS (
+    SELECT m.id,
+           1 - (m.embedding <=> p.q_vec) AS score,
+           ROW_NUMBER() OVER (
+               ORDER BY m.embedding <=> p.q_vec
+           ) AS rank
+    FROM memories m, params p
+    WHERE m.embedding IS NOT NULL
+      AND p.q_vec IS NOT NULL
+      AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
+      AND (p.nt_filter IS NULL OR m.node_type = ANY(p.nt_filter))
+      AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
+    ORDER BY rank
+    LIMIT (SELECT overfetch FROM params)
+),
+fused AS (
+    SELECT COALESCE(f.id, v.id) AS id,
+           COALESCE(1.0 / (60 + f.rank), 0.0)  -- 60 = RRF_K; keep in sync with search.rs
+         + COALESCE(1.0 / (60 + v.rank), 0.0) AS rrf_score,
+           f.score AS fts_score,
+           v.score AS vector_score
+    FROM fts f FULL OUTER JOIN vec v ON f.id = v.id
+)
+SELECT m.id AS "id!: Uuid",
+       m.domains AS "domains!: Vec<String>",
+       m.domain_scores AS "domain_scores!: serde_json::Value",
+       m.content AS "content!",
+       m.node_type AS "node_type!",
+       m.tags AS "tags!: Vec<String>",
+       m.embedding AS "embedding?: Vector",
+       m.metadata AS
"metadata!: serde_json::Value",
+       m.created_at AS "created_at!: chrono::DateTime<chrono::Utc>",
+       m.updated_at AS "updated_at!: chrono::DateTime<chrono::Utc>",
+       fused.rrf_score::float8 AS "rrf_score!: f64",
+       fused.fts_score::float8 AS "fts_score?: f64",
+       fused.vector_score AS "vector_score?: f64"
+FROM fused
+JOIN memories m ON m.id = fused.id
+ORDER BY fused.rrf_score DESC
+LIMIT (SELECT final_limit FROM params);
+```
+
+- **Behavior notes**:
+  - `OVERFETCH_MULT * query.limit` is passed as `$3`. Final `$4` is `query.limit`.
+  - Empty text query is allowed; the `fts` CTE returns zero rows (`p.q_text <> ''`) and the result degrades to pure vector search, which matches `vector_search` behavior.
+  - Null embedding is allowed; the `vec` CTE returns zero rows and the result degrades to pure FTS, which matches `fts_search` behavior.
+  - `fts_search` and `vector_search` are separate public methods on the trait. Each uses a simpler single-CTE query derived from the above by removing the other branch. Implementing them as thin wrappers over `rrf_search` with nullified inputs is acceptable but adds one extra plan per call; the explicit implementations win on latency.
+  - `min_retrievability` in `SearchQuery` is applied as a final filter by joining on `scheduling` in the outer `SELECT`. Adding that join unconditionally regresses simple searches; add it only when `query.min_retrievability.is_some()`.
+
+### D6. `embedding_model` registry impl
+
+- **File**: `crates/vestige-core/src/storage/postgres/registry.rs`
+- **Depends on**: D1, D4 (table exists), Phase 1 `EmbeddingModelRegistry` trait.
+- **Signatures**: + +```rust +#![cfg(feature = "postgres-backend")] + +use sqlx::PgPool; + +use crate::embedder::Embedder; +use crate::storage::error::{StoreError, StoreResult}; + +pub(crate) async fn ensure_registry( + pool: &PgPool, + embedder: &dyn Embedder, +) -> StoreResult<()> { + let row = sqlx::query!( + r#"SELECT name, dimension, hash FROM embedding_model WHERE id = 1"# + ) + .fetch_optional(pool) + .await?; + + match row { + None => { + sqlx::query!( + r#" + INSERT INTO embedding_model (id, name, dimension, hash) + VALUES (1, $1, $2, $3) + "#, + embedder.model_name(), + embedder.dimension() as i32, + embedder.model_hash(), + ) + .execute(pool) + .await?; + + // First-ever run: stamp the vector column typmod. + let ddl = format!( + "ALTER TABLE memories ALTER COLUMN embedding TYPE vector({})", + embedder.dimension() + ); + sqlx::query(&ddl).execute(pool).await?; + Ok(()) + } + Some(r) if r.name == embedder.model_name() + && r.dimension == embedder.dimension() as i32 + && r.hash == embedder.model_hash() => Ok(()), + Some(r) => Err(StoreError::EmbeddingMismatch { + expected: format!("{} ({}d, {})", r.name, r.dimension, r.hash), + got: format!( + "{} ({}d, {})", + embedder.model_name(), + embedder.dimension(), + embedder.model_hash() + ), + }), + } +} + +pub(crate) async fn update_registry( + pool: &PgPool, + embedder: &dyn Embedder, +) -> StoreResult<()> { + // Used only by `vestige migrate --reembed` after a full re-encode. + sqlx::query!( + r#" + UPDATE embedding_model + SET name = $1, dimension = $2, hash = $3, created_at = now() + WHERE id = 1 + "#, + embedder.model_name(), + embedder.dimension() as i32, + embedder.model_hash(), + ) + .execute(pool) + .await?; + Ok(()) +} +``` + +- **Behavior notes**: + - `StoreError::EmbeddingMismatch { expected, got }` already exists in Phase 1; Phase 2 just constructs it. + - The `ALTER TABLE ... TYPE vector(N)` DDL is only issued on first init. On subsequent inits the existing typmod already matches. 
+  - Re-embed flow also uses this module, but the DDL path is different -- see D11.
+
+### D7. `VestigeConfig`: `vestige.toml` backend selection
+
+- **File**: `crates/vestige-core/src/config.rs` (Phase 1 may already own this file; Phase 2 extends, not replaces)
+- **Depends on**: D1.
+- **Signatures**:
+
+```rust
+use std::path::{Path, PathBuf};
+
+use serde::Deserialize;
+
+#[derive(Debug, Clone, Deserialize)]
+pub struct VestigeConfig {
+    #[serde(default)]
+    pub embeddings: EmbeddingsConfig,
+    #[serde(default)]
+    pub storage: StorageConfig,
+    #[serde(default)]
+    pub server: ServerConfig,
+    #[serde(default)]
+    pub auth: AuthConfig,
+}
+
+#[derive(Debug, Clone, Deserialize)]
+pub struct EmbeddingsConfig {
+    pub provider: String, // "fastembed"
+    pub model: String,    // "BAAI/bge-base-en-v1.5"
+}
+
+#[derive(Debug, Clone, Deserialize)]
+#[serde(tag = "backend", rename_all = "lowercase")]
+pub enum StorageConfig {
+    Sqlite(SqliteConfig),
+    #[cfg(feature = "postgres-backend")]
+    Postgres(PostgresConfig),
+}
+
+#[derive(Debug, Clone, Deserialize)]
+pub struct SqliteConfig {
+    pub path: PathBuf,
+}
+
+#[cfg(feature = "postgres-backend")]
+#[derive(Debug, Clone, Deserialize)]
+pub struct PostgresConfig {
+    pub url: String,
+    #[serde(default)]
+    pub max_connections: Option<u32>,
+    #[serde(default)]
+    pub acquire_timeout_secs: Option<u64>,
+}
+
+#[derive(Debug, Clone, Default, Deserialize)]
+pub struct ServerConfig { /* Phase 3 fills this in */ }
+
+#[derive(Debug, Clone, Default, Deserialize)]
+pub struct AuthConfig { /* Phase 3 fills this in */ }
+
+impl VestigeConfig {
+    pub fn load(path: Option<&Path>) -> Result<Self, ConfigError>;
+    pub fn default_path() -> PathBuf; // ~/.vestige/vestige.toml
+}
+
+#[derive(Debug, thiserror::Error)]
+pub enum ConfigError {
+    #[error("io: {0}")]
+    Io(#[from] std::io::Error),
+    #[error("toml: {0}")]
+    Toml(#[from] toml::de::Error),
+    #[error("invalid config: {0}")]
+    Invalid(String),
+}
+```
+
+- **Behavior notes**:
+  - The serde representation matches the PRD:
`[storage]` with `backend = "sqlite"` and a matching `[storage.sqlite]` or `[storage.postgres]` subsection.
+  - Because `StorageConfig` is `#[serde(tag = "backend")]`, an unknown backend string returns a clear error.
+  - If `postgres-backend` is compiled off and the user writes `backend = "postgres"`, deserialization returns "unknown variant `postgres`" -- loud failure. Phase 2 wraps this into `ConfigError::Invalid("postgres-backend feature not compiled in")`.
+  - `env`-override hooks (e.g., `VESTIGE_POSTGRES_URL`) are a Phase 3 concern; not added here.
+
+### D8. `vestige migrate --from sqlite --to postgres`
+
+- **File**: `crates/vestige-core/src/storage/postgres/migrate_cli.rs`
+- **Depends on**: D2, D6, D7, Phase 1 `SqliteMemoryStore`.
+- **Signatures**:
+
+```rust
+#![cfg(feature = "postgres-backend")]
+
+use std::path::Path;
+use std::sync::Arc;
+
+use futures::{StreamExt, TryStreamExt};
+use indicatif::{ProgressBar, ProgressStyle};
+use uuid::Uuid;
+
+use crate::embedder::Embedder;
+use crate::storage::error::{StoreError, StoreResult};
+use crate::storage::postgres::PgMemoryStore;
+use crate::storage::sqlite::SqliteMemoryStore;
+
+#[derive(Debug, Clone)]
+pub struct SqliteToPostgresPlan {
+    pub sqlite_path: std::path::PathBuf,
+    pub postgres_url: String,
+    pub max_connections: u32,
+    pub batch_size: usize, // default 500
+}
+
+pub struct MigrationReport {
+    pub memories_copied: u64,
+    pub scheduling_rows: u64,
+    pub edges_copied: u64,
+    pub review_events_copied: u64,
+    pub domains_copied: u64,
+    pub errors: Vec<(Uuid, StoreError)>,
+}
+
+pub async fn run_sqlite_to_postgres(
+    plan: SqliteToPostgresPlan,
+    embedder: Arc<dyn Embedder>,
+) -> StoreResult<MigrationReport>;
+```
+
+Algorithm:
+
+1. Open source `SqliteMemoryStore` in read-only mode (`?mode=ro`).
+2. Check source `embedding_model` registry; refuse if it disagrees with the supplied embedder unless the user also passed `--reembed`.
+3. Open destination `PgMemoryStore` via `connect` (runs migrations, stamps dim).
+4.
Stream source rows in batches of `plan.batch_size` via a windowed query ordered by `created_at, id` (stable cursor; survives resume).
+5. For each batch: begin a Postgres transaction, `INSERT INTO memories ... ON CONFLICT (id) DO NOTHING` for all rows, `INSERT INTO scheduling` likewise, commit. Copy domain assignments (`domains`, `domain_scores`) verbatim -- they are `[]` and `{}` for pre-Phase-4 SQLite data.
+6. After memories finish, stream edges and review_events the same way.
+7. Emit progress via `indicatif::ProgressBar` (one bar per table, multi-bar). Every 1,000 rows, log to tracing at INFO.
+8. Return `MigrationReport` for the caller to print.
+
+- **Behavior notes**:
+  - Memory-bounded: batch size 500 and sqlx streams mean memory usage stays O(batch * row_size), not O(total_rows).
+  - Idempotent: re-running replays only the rows not already present; `ON CONFLICT DO NOTHING` means partial runs recover.
+  - UUID strings from SQLite are parsed via `Uuid::parse_str` -- any mangled ID pushes to `errors` instead of aborting.
+  - The FTS `search_vec` is regenerated by Postgres via the GENERATED column; no data to copy.
+  - `review_events` may not exist in Phase 1 SQLite for pre-V12 databases. The migrator detects missing tables via `SELECT name FROM sqlite_master` and skips gracefully.
+  - A separate `--dry-run` flag prints the counts per table without writing.
+
+### D9. `vestige migrate --reembed --model=<model>`
+
+- **File**: `crates/vestige-core/src/storage/postgres/reembed.rs`
+- **Depends on**: D2, D6, Phase 1 `Embedder`.
+- **Signatures**:
+
+```rust
+#![cfg(feature = "postgres-backend")]
+
+use std::sync::Arc;
+use std::time::Instant;
+
+use futures::TryStreamExt;
+use indicatif::{ProgressBar, ProgressStyle};
+use sqlx::PgPool;
+use uuid::Uuid;
+
+use crate::embedder::Embedder;
+use crate::storage::error::{StoreError, StoreResult};
+use crate::storage::postgres::PgMemoryStore;
+
+#[derive(Debug, Clone)]
+pub struct ReembedPlan {
+    pub batch_size: usize,      // default 128 (embedder batch)
+    pub drop_hnsw_first: bool,  // default true
+    pub concurrent_index: bool, // default false; use CREATE INDEX (not CONCURRENTLY)
+}
+
+pub struct ReembedReport {
+    pub rows_updated: u64,
+    pub duration_secs: f64,
+    pub index_rebuild_secs: f64,
+}
+
+pub async fn run_reembed(
+    store: &PgMemoryStore,
+    new_embedder: Arc<dyn Embedder>,
+    plan: ReembedPlan,
+) -> StoreResult<ReembedReport>;
+```
+
+Algorithm:
+
+1. Verify `new_embedder.dimension()` != stored dimension OR `new_embedder.model_hash()` != stored hash -- otherwise no-op and return `rows_updated = 0`.
+2. If the dimension changes, relax the typmod first: `ALTER TABLE memories ALTER COLUMN embedding TYPE vector` (no fixed dimension), so rows of both sizes are writable during the update. No `DROP NOT NULL` is needed -- the column is already nullable.
+3. If `plan.drop_hnsw_first`, execute `DROP INDEX IF EXISTS idx_memories_embedding_hnsw;` so updates are not slowed by index maintenance. This is the recommended path; `REINDEX` is kept in the Open Questions as an alternative.
+4. Stream all `id, content` from `memories` ordered by `id`.
+5. For each batch of `plan.batch_size`: call `new_embedder.embed_batch(&texts)` (Phase 1 trait exposes batched embedding when available; otherwise loop single `embed`). Then:
+
+```sql
+-- Vectors are bound as their text form ('[0.1,0.2,...]'); binding real[][]
+-- would not work, because UNNEST fully flattens multidimensional arrays.
+UPDATE memories
+SET embedding = v.embedding::vector
+FROM UNNEST($1::uuid[], $2::text[]) AS v(id, embedding)
+WHERE memories.id = v.id;
+```
+
+6. After all rows updated: run `ALTER TABLE memories ALTER COLUMN embedding TYPE vector($NEW_DIM)` if dimension changed.
+7. Rebuild HNSW.
If `plan.concurrent_index`, execute `CREATE INDEX CONCURRENTLY idx_memories_embedding_hnsw ...`; else `CREATE INDEX idx_memories_embedding_hnsw ...`. +8. `update_registry` with the new embedder. +9. Return `ReembedReport`. + +- **Behavior notes**: + - Memory-bounded: batch_size * 2 (old + new texts) vectors in RAM at any time. + - The dimension change must happen AFTER all rows are updated (pgvector validates typmod on write when a typmod is present; we relax-then-tighten). + - `CONCURRENTLY` builds do not hold `AccessExclusiveLock`, but fail inside a transaction. That's why the outer driver runs index DDL as an autocommit statement (sqlx `execute` outside a pool transaction). + - For `--dry-run`, emit what *would* happen (row count, estimated embedder calls, estimated time using `rows / 50`-per-second baseline for local fastembed) and exit. + +### D10. CLI wiring in `vestige-mcp` + +- **File**: `crates/vestige-mcp/src/bin/cli.rs` +- **Depends on**: D8, D9, D7. Requires `vestige-mcp` Cargo feature `postgres-backend`. +- **Signatures**: + +```rust +#[derive(Subcommand)] +enum Commands { + // existing variants: Stats, Health, Consolidate, Restore, Backup, + // Export, Gc, Dashboard, Ingest, Serve ... + + /// Migrate between backends or re-embed memories. + #[cfg(feature = "postgres-backend")] + Migrate(MigrateArgs), +} + +#[derive(clap::Args)] +#[cfg(feature = "postgres-backend")] +struct MigrateArgs { + #[command(subcommand)] + action: MigrateAction, +} + +#[derive(Subcommand)] +#[cfg(feature = "postgres-backend")] +enum MigrateAction { + /// Copy all memories from SQLite to Postgres. + #[command(name = "copy")] + Copy { + #[arg(long)] + from: String, // "sqlite" + #[arg(long)] + to: String, // "postgres" + #[arg(long)] + sqlite_path: PathBuf, + #[arg(long)] + postgres_url: String, + #[arg(long, default_value = "500")] + batch_size: usize, + #[arg(long)] + dry_run: bool, + }, + /// Re-embed all memories with a new embedder. 
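+    ///
+    /// By default the HNSW index is dropped before the update and rebuilt
+    /// afterwards (see D9); `--concurrent-index` opts into
+    /// `CREATE INDEX CONCURRENTLY` for large production tables.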
+
+    #[command(name = "reembed")]
+    Reembed {
+        #[arg(long)]
+        model: String,
+        #[arg(long, default_value = "128")]
+        batch_size: usize,
+        #[arg(long, action = clap::ArgAction::Set, default_value_t = true)]
+        drop_hnsw_first: bool,
+        #[arg(long)]
+        concurrent_index: bool,
+        #[arg(long)]
+        dry_run: bool,
+    },
+}
+```
+
+The user-facing invocation stays close to the canonical string quoted in the D8 heading, adding only the `copy` subcommand (see Open Question 1):
+
+```
+vestige migrate copy --from sqlite --to postgres \
+    --sqlite-path ~/.vestige/vestige.db \
+    --postgres-url postgresql://localhost/vestige
+
+vestige migrate reembed --model=BAAI/bge-large-en-v1.5
+```
+
+An alternate top-level layout (single `vestige migrate` with flags `--from`, `--to`, `--reembed`) is equivalent; the subcommand split is preferred because the two flag sets are disjoint (see Open Question 1).
+
+- **Behavior notes**:
+  - `--from`/`--to` values are validated; the current Phase 2 build accepts only `sqlite` and `postgres`.
+  - For `reembed`, the `--model` string resolves to an `Embedder` via a factory already provided by Phase 1 (`Embedder::from_name(&str)`); Phase 2 does not invent new embedder constructors.
+  - Progress output on `stderr`; machine-readable summary on `stdout` as one-line JSON when `--json` is set (skipped for Phase 2 unless trivial).
+
+### D11. Offline query cache (`.sqlx/`)
+
+- **File**: `crates/vestige-core/.sqlx/` (committed directory of `query-*.json`)
+- **Depends on**: all `sqlx::query!` call sites being final.
+- **Procedure**: the developer runs `cargo sqlx prepare --workspace` with a live Postgres having the schema applied. Output goes into `crates/vestige-core/.sqlx/`. This directory is committed. CI enforces freshness by running `cargo sqlx prepare --workspace --check` against the same live Postgres (or failing that, any dev can reproduce by setting `SQLX_OFFLINE=true`).
+- **Behavior notes**: `SQLX_OFFLINE=true` in `build.rs` or env is the default on CI and for downstream consumers.
The `vestige-core` docs add a one-liner in README for contributors: "if you change any SQL in Phase 2 modules, rerun `cargo sqlx prepare` with a live DB."
+
+### D12. Testcontainer harness (integration)
+
+- **File**: `tests/phase_2/common/mod.rs` (the `common` convention used in `tests/phase_2/` crates)
+- **Depends on**: D2 through D11.
+- **Signatures**:
+
+```rust
+#![cfg(feature = "postgres-backend")]
+
+use std::sync::Arc;
+
+use testcontainers_modules::postgres::Postgres;
+use testcontainers::{runners::AsyncRunner, ContainerAsync};
+
+use vestige_core::embedder::Embedder;
+use vestige_core::storage::postgres::PgMemoryStore;
+
+pub struct PgHarness {
+    pub container: ContainerAsync<Postgres>,
+    pub store: PgMemoryStore,
+}
+
+impl PgHarness {
+    pub async fn start(embedder: Arc<dyn Embedder>) -> anyhow::Result<Self> {
+        let container = Postgres::default()
+            .with_tag("pg16")
+            .with_name("pgvector/pgvector")
+            .start()
+            .await?;
+        let port = container.get_host_port_ipv4(5432).await?;
+        let url = format!(
+            "postgresql://postgres:postgres@127.0.0.1:{}/postgres", port
+        );
+        let store = PgMemoryStore::connect(&url, 4, embedder.as_ref()).await?;
+        Ok(Self { container, store })
+    }
+}
+```
+
+- **Behavior notes**:
+  - Image `pgvector/pgvector:pg16` bundles pgvector into the official postgres:16 image.
+  - Pool size 4 is enough for tests without starving the container's default `max_connections = 100`.
+  - `ContainerAsync` is held for the whole test scope; drop tears down the container.
+  - A fake `TestEmbedder` in `common/test_embedder.rs` provides a deterministic hash-based embedding (no ONNX dependency in CI).
+
+---
+
+## Test Plan
+
+### Unit tests (colocated in `src/`)
+
+Under `crates/vestige-core/src/storage/postgres/`:
+
+- `pool.rs` -- one test per `build_pool` branch: defaults, explicit `max_connections`, invalid URL returns `StoreError::Postgres`.
+- `registry.rs` -- three tests: first-init writes row and alters typmod, reopen with same embedder returns Ok, reopen with different dimension returns `EmbeddingMismatch`.
+- `search.rs` -- query-builder unit tests for parameter packing: empty text, null embedding, all three filters null, all three filters populated.
+- `migrate_cli.rs` -- `SqliteToPostgresPlan::default` returns sane defaults; plan validation rejects empty URL.
+- `reembed.rs` -- `ReembedPlan::no_change` returns `rows_updated == 0` when embedder matches registry (no network call).
+- `config.rs` -- five tests covering: valid postgres config, valid sqlite config, unknown backend string, missing subsection, feature-gated postgres without feature compiled in.
+
+### Integration tests (in `tests/phase_2/`)
+
+Each file is a full integration test crate (`[[test]]` in workspace root Cargo).
+
+**`tests/phase_2/pg_trait_parity.rs`**
+
+- Declares the same test matrix as Phase 1's SQLite trait tests, parameterized over `impl MemoryStore`.
+- Runs every method: `insert`, `get`, `update`, `delete`, `search`, `fts_search`, `vector_search`, `get_scheduling`, `update_scheduling`, `get_due_memories`, `add_edge`, `get_edges`, `remove_edge`, `get_neighbors`, `list_domains`, `get_domain`, `upsert_domain`, `delete_domain`, `classify`, `count`, `get_stats`, `vacuum`, `health_check`.
+- Each test is written once as `async fn roundtrip_<method>(store: &dyn MemoryStore)` and invoked from two wrappers, one for SQLite and one for Postgres.
+- Acceptance: every method returns equal results (except for `Uuid` ordering in `list_domains` where the test sorts before comparing).
+
+**`tests/phase_2/pg_hybrid_search_rrf.rs`**
+
+- Inserts 20 memories with known content ("rust async trait", "postgres hnsw vector", "fastembed onnx model", ...).
+- Case 1: pure FTS. `SearchQuery { text: Some("rust trait"), embedding: None, ... }` returns the three Rust-related rows in order; `fts_score` populated, `vector_score` null.
+- Case 2: pure vector. `SearchQuery { text: None, embedding: Some(embed("rust trait")), ... }` returns the same three rows via cosine; `vector_score` populated, `fts_score` null.
+- Case 3: hybrid. Both set -- top hit has both scores; `rrf_score >= 1/(60+1) + 1/(60+1) = 0.0328`.
+- Case 4: domain filter. 10 memories tagged with `domains = ["dev"]`, 10 with `["home"]`. Query with `domains: Some(vec!["dev"])` returns only dev memories.
+- Case 5: edge case -- empty FTS query plus an embedding behaves identically to `vector_search`; empty embedding plus FTS query behaves identically to `fts_search`.
+
+**`tests/phase_2/pg_migration_sqlite_to_postgres.rs`**
+
+- Populate a fresh SQLite with 10,000 memories (seeded RNG, deterministic content), 4,000 scheduling rows, 2,000 edges.
+- Run `run_sqlite_to_postgres` with a test embedder.
+- Assert: `count() == 10_000` on destination; spot-check 25 memories byte-for-byte (content, tags, metadata, domains, domain_scores).
+- Assert: FSRS fields (`stability`, `difficulty`, `next_review`) preserved per memory.
+- Assert: edges preserved by `(source_id, target_id, edge_type)`.
+- Assert: re-running the migration is a no-op (`ON CONFLICT DO NOTHING` path); row count unchanged.
+
+**`tests/phase_2/pg_migration_reembed.rs`**
+
+- Start with a fresh store using `TestEmbedder768` (768-dim, hash `h1`). Insert 500 memories.
+- Swap to `TestEmbedder1024` (1024-dim, hash `h2`). Run `run_reembed(store, Arc::new(TestEmbedder1024), ReembedPlan::default())`.
+- Assert: `rows_updated == 500`; `embedding_model` now has `(name=TestEmbedder1024, dimension=1024, hash=h2)`.
+- Assert: `SELECT DISTINCT vector_dims(embedding) FROM memories` returns only `1024`.
+- Assert: HNSW index exists after reembed (`SELECT indexname FROM pg_indexes WHERE indexname = 'idx_memories_embedding_hnsw'` returns one row).
+- Assert: memory IDs unchanged (compare pre/post id sets).
+- Assert: a hybrid search using `TestEmbedder1024` returns results (post-reembed vectors are queryable). + +**`tests/phase_2/pg_config_parsing.rs`** + +- Parse six `vestige.toml` snippets: + - sqlite + fastembed -> `StorageConfig::Sqlite`. + - postgres + fastembed -> `StorageConfig::Postgres` with `max_connections = 10`. + - postgres with custom `max_connections = 25` and `acquire_timeout_secs = 60`. + - unknown backend `"mysql"` -> `ConfigError`. + - missing subsection `[storage.postgres]` while `backend = "postgres"` -> `ConfigError`. + - malformed URL (empty) -> `ConfigError::Invalid`. + +**`tests/phase_2/pg_concurrency.rs`** + +- Spawn 16 tasks, each inserting 100 memories in parallel for 1,600 total. +- Spawn 4 tasks concurrently running `search` queries; none should fail. +- Spawn 2 tasks concurrently running `update_scheduling` on overlapping IDs -- last write wins (MVCC), neither errors. +- Assert: all 1,600 rows present, no deadlocks, every task returns `Ok`. +- Run time < 10 seconds on a cold container. + +### Compile-time query verification + +- CI step: `cargo sqlx prepare --workspace --check` against a CI-provisioned Postgres (GitHub Actions / Forgejo Actions services block). Fails CI if any `query!` macro goes stale. +- Alternative offline run for contributors: `SQLX_OFFLINE=true cargo check -p vestige-core --features postgres-backend`. CI runs both forms to ensure `.sqlx/` is up to date. +- `.sqlx/` is committed to the repo. A `.gitattributes` entry marks it as `linguist-generated=true` so it doesn't inflate language stats. + +### Benchmarks + +Under `crates/vestige-core/benches/pg_hybrid_search.rs` (Criterion), gated by `postgres-backend`. + +- `pg_search_1k` -- populate 1,000 memories once per bench suite, measure `rrf_search` p50/p99 over 500 iterations. Target: p50 < 10ms, p99 < 30ms on a local container. +- `pg_search_100k` -- 100,000 memories. Target: p50 < 50ms, p99 < 150ms. Validates HNSW scaling. 
+- Testcontainer shared across both benches via `once_cell`. +- Bench entry in `vestige-core/Cargo.toml`: + +``` +[[bench]] +name = "pg_hybrid_search" +harness = false +required-features = ["postgres-backend"] +``` + +--- + +## Acceptance Criteria + +- [ ] `cargo build -p vestige-core --features postgres-backend` -- zero warnings. +- [ ] `cargo build -p vestige-core` (SQLite-only, default features) -- zero warnings; no Postgres symbols referenced. +- [ ] `cargo build -p vestige-mcp --features postgres-backend` -- zero warnings; `vestige` binary exposes the `migrate` subcommand. +- [ ] `cargo clippy --workspace --all-targets --all-features -- -D warnings` -- clean. +- [ ] `cargo sqlx prepare --workspace --check` -- returns success; `.sqlx/` is current. +- [ ] `cargo test -p vestige-core --features postgres-backend --test pg_trait_parity --test pg_hybrid_search_rrf --test pg_migration_sqlite_to_postgres --test pg_migration_reembed --test pg_config_parsing --test pg_concurrency` -- all green. +- [ ] Testcontainer spin-up p50 under 30 seconds on a developer laptop with a warm Docker daemon. +- [ ] `pg_search_100k` Criterion bench reports p50 < 50ms on reference hardware (logged in the ADR comment trail). +- [ ] `vestige migrate copy --from sqlite --to postgres` on a 10,000-memory corpus completes without data loss: row count parity, content byte-parity on a 1 percent sample, FSRS state preserved (stability, difficulty, reps, lapses, next_review), edge count parity. +- [ ] `vestige migrate reembed` with a dimension-changing embedder returns to a fully queryable state: HNSW present, `embedding_model` updated, no stale vectors, memory IDs untouched. +- [ ] Trait parity: every method on `MemoryStore` has at least one passing test against `PgMemoryStore`. +- [ ] Phase 1's existing SQLite suite continues to pass with zero changes required (Phase 2 is additive). 
+- [ ] The `postgres-backend` feature does not compile in SQLCipher (`encryption`) simultaneously (mutually exclusive at compile time, per project rule). + +--- + +## Rollback Notes + +- Every `*.up.sql` has a matching `*.down.sql` in `crates/vestige-core/migrations/postgres/`. `sqlx migrate revert` walks them in reverse order. Manual operator procedure: `sqlx migrate revert --database-url $URL --source crates/vestige-core/migrations/postgres`. +- `vestige migrate copy` is a one-way operation. The source SQLite DB is read-only during the run and untouched afterward; users retain their original file indefinitely. Recommended discipline: copy the SQLite file aside before starting, retain for 30 days. +- `vestige migrate reembed` is destructive to the `embedding` column. Recommended discipline: take a logical backup (`pg_dump --table=memories --table=embedding_model --table=scheduling`) before a reembed run. The tool prints that recommendation before starting and exits non-zero unless `--yes` is passed or the user is on a TTY that confirms. +- Feature-gate strategy: the default build remains SQLite-only. Downstream users pull `postgres-backend` explicitly: `cargo install --features postgres-backend vestige-mcp`. If the Postgres implementation fails in the field, users fall back to SQLite simply by flipping `vestige.toml`'s `[storage] backend = "sqlite"` and restarting. No data re-migration is needed if they retained their SQLite file. +- The `docs/runbook/postgres.md` deliverable (D16) captures this discipline as a one-page ops note. + +--- + +## Open Implementation Questions + +Each item has a recommendation. Ship that unless a reviewer objects. + +### Q1. 
CLI shape: subcommand split vs flag union + +- **Options**: (a) `vestige migrate copy --from sqlite --to postgres ...` and `vestige migrate reembed --model=...` (subcommand split); (b) `vestige migrate --from sqlite --to postgres ...` and `vestige migrate --reembed --model=...` under one `clap` command with disjoint flag groups (flag union). +- **RECOMMENDATION**: (a) subcommand split. The flag sets do not overlap and clap expresses the constraint more cleanly. The ADR string `vestige migrate --from sqlite --to postgres` can still be documented as a canonical alias by having `copy` accept it verbatim when `--from` is present. + +### Q2. Feature flag name + +- **Options**: `postgres-backend`, `postgres`, `backend-postgres`, `pg`. +- **RECOMMENDATION**: `postgres-backend`. Matches the ADR text and is explicit in `Cargo.toml` feature listings. + +### Q3. sqlx offline mode strategy + +- **Options**: (a) commit `.sqlx/` so downstream builds never need DATABASE_URL; (b) require `DATABASE_URL` at build time. +- **RECOMMENDATION**: (a). The repo already ships as a library; many downstream users will build from crates.io with no Postgres available. Committing `.sqlx/` costs ~100 kB. + +### Q4. HNSW rebuild strategy during reembed + +- **Options**: (a) `DROP INDEX; CREATE INDEX`; (b) `REINDEX INDEX CONCURRENTLY`; (c) `CREATE INDEX CONCURRENTLY` on a new name then swap. +- **RECOMMENDATION**: (a) by default for speed on empty / near-empty tables; expose `--concurrent-index` for large production corpora where locking the table is unacceptable. `REINDEX CONCURRENTLY` on pgvector HNSW is supported in pgvector 0.6+ but the community still reports edge cases with `maintenance_work_mem` -- skip unless a user explicitly opts in. + +### Q5. Connection pool sizing default + +- **Options**: 4, 10, 20, `cpus() * 2`. +- **RECOMMENDATION**: 10. Matches the PRD example, covers a single-operator load, and does not exhaust the default Postgres `max_connections = 100`. 
Configurable via `vestige.toml`. + +### Q6. Testcontainer image pinning + +- **Options**: (a) `pgvector/pgvector:pg16`; (b) `pgvector/pgvector:pg16.2-0.7.4` (exact tag); (c) maintain local Dockerfile. +- **RECOMMENDATION**: (b) pin exact. The float tag `pg16` has shipped breaking changes in the past (e.g., pg 16.0 to 16.1 interop). Pin to a specific pgvector minor and Postgres patch. CI bumps the tag via a single-line change. + +### Q7. Empty-text and null-embedding behavior in `search` + +- **Options**: (a) return an error if both are missing; (b) return an empty result; (c) return all memories sorted by `created_at DESC`. +- **RECOMMENDATION**: (a). A `search` call with no query is a bug in the caller; returning empty silently would hide the bug. The existing Phase 1 SQLite behavior (TBD but likely errors) is the tiebreaker. + +### Q8. `classify()` SQL vs Rust + +- **Options**: (a) compute cosine to all centroids in SQL (`SELECT id, 1 - (centroid <=> $1::vector) FROM domains ORDER BY ...`); (b) load centroids, compute in Rust. +- **RECOMMENDATION**: (a). Leverages pgvector's SIMD paths and avoids round-tripping centroid vectors. At Phase 4 scale (tens of centroids) the difference is marginal, but the SQL path is simpler and matches the rest of the backend. + +### Q9. FSRS `review_events` writes: trait method vs implicit on `update_scheduling` + +- **Options**: (a) add an explicit `record_review(memory_id, rating, prior, new)` method to the Phase 1 trait; (b) have `update_scheduling` write the event atomically. +- **RECOMMENDATION**: this is a Phase 1 question, not Phase 2. Phase 2 implements whichever Phase 1 chose. If Phase 1 missed it, Phase 2 raises a blocker rather than deciding alone. + +### Q10. 
`tsvector` weight for tags -- PRD used `array_to_tsvector`, we used `array_to_string`
+
+- **Options**: (a) `array_to_tsvector(tags)` (core Postgres, but inserts each
+  tag as a raw lexeme -- no stemming, stopword removal, or positions); (b)
+  `to_tsvector('english', array_to_string(tags, ' '))` (normalized lexemes,
+  consistent with the rest of the indexed text).
+- **RECOMMENDATION**: (b). Consistent ranking, zero surprises. If a future tag
+  matches a stopword (`"the"`), it gets dropped, but that is correct behavior
+  for ranking.
+
+### Q11. `PgMemoryStore::connect` runs migrations automatically?
+
+- **Options**: (a) always run `sqlx::migrate!` on connect; (b) require the user
+  to run `vestige migrate-schema` explicitly before starting the server.
+- **RECOMMENDATION**: (a) during Phase 2; revisit in Phase 3 when the server
+  binary exists. Developer ergonomics win now, and the migrations are
+  idempotent.
+
+### Q12. Offline query cache freshness vs `sqlx-cli` version skew
+
+- **Options**: (a) pin the `sqlx-cli` version in the CI `actions/cache` step;
+  (b) let CI install whatever version `sqlx` depends on.
+- **RECOMMENDATION**: (a) pin to the same 0.8.x as the crate. `sqlx prepare`
+  output changes between 0.7 and 0.8 and must match the runtime.
+
+---
+
+## Sequencing
+
+The Phase 2 agent executes deliverables in this order; deliverables not listed
+can run in any order relative to each other.
+
+1. D1 (feature gate + Cargo deps) -- unblocks everything.
+2. D7 (config) -- required to construct `PgMemoryStore`.
+3. D4 (migrations SQL) -- required before any `query!` compiles.
+4. D3 (pool) + D6 (registry) -- small, used by D2.
+5. D2 (`PgMemoryStore` core + trait impl) -- the bulk of Phase 2.
+6. D5 (RRF search) -- after D2; requires the trait to exist.
+7. D12 (test harness) + parity and search tests -- validates D2 and D5 in
+   isolation.
+8. D8 (sqlite->pg migrate) + its integration test.
+9. D9 (reembed) + its integration test.
+10. D10 (CLI wiring).
+11. D11 (`.sqlx/` offline cache) -- last, after SQL is frozen.
+12. 
D15 (benches) + D16 (runbook) -- after acceptance tests pass.
+
+Each deliverable PR includes its own tests; the final Phase 2 PR stacks them
+(or lands as a single branch if the Phase 1 trait is stable enough to avoid
+rebase churn).
+
+### Critical Files for Implementation
+
+- crates/vestige-core/src/storage/postgres/mod.rs
+- crates/vestige-core/migrations/postgres/0001_init.up.sql
+- crates/vestige-core/src/storage/postgres/search.rs
+- crates/vestige-core/src/storage/postgres/migrate_cli.rs
+- crates/vestige-mcp/src/bin/cli.rs
diff --git a/docs/plans/0003-phase-3-network-access.md b/docs/plans/0003-phase-3-network-access.md
new file mode 100644
index 0000000..500fd5a
--- /dev/null
+++ b/docs/plans/0003-phase-3-network-access.md
@@ -0,0 +1,1435 @@
+# Phase 3 Plan: Network Access and Authentication
+
+**Status**: Draft
+**Depends on**: Phase 1 (MemoryStore trait), Phase 2 (PgMemoryStore, backend config)
+**Related**: docs/adr/0001-pluggable-storage-and-network-access.md (Phase 3)
+
+---
+
+## Scope
+
+### In scope
+
+- HTTP MCP Streamable endpoint at `POST /mcp` (JSON-RPC body, keep existing
+  session semantics) and `GET /mcp` (Server-Sent Events for long-running
+  operations: dream, consolidate, discover, reassign).
+- REST API under `/api/v1/` for direct HTTP clients that do not speak MCP
+  (memories CRUD, search, consolidate trigger, stats, domains
+  list/rename/merge/discover).
+- `api_keys` table + enforcement (blake3-hashed, scopes `read`/`write`, optional
+  `domain_filter` TEXT[], `last_used` timestamp, `active` flag, revocation).
+- Auth middleware with three resolution paths in priority order:
+  `Authorization: Bearer <key>` then `X-API-Key: <key>` then signed session
+  cookie. All three resolve to the same `ApiKeyIdentity`. 
+- Signed session cookie: `vestige_session`, SameSite=Strict, HttpOnly,
+  Secure-when-TLS, Path=/, Max-Age 8 hours. Signed with HMAC-SHA256 using a
+  key derived from `VESTIGE_SESSION_SECRET` (env) or generated + persisted to
+  `<data_dir>/session_secret` on first boot.
+- `vestige keys create|list|revoke` CLI subcommand (plus `keys rotate` as a
+  convenience alias of `revoke` + `create`).
+- Startup-time refusal to bind non-loopback with `auth.enabled = false` (hard
+  error, non-zero exit, stderr message, no fallback).
+- Dashboard login flow: `POST /dashboard/login` with `{"api_key":"vst_..."}`
+  JSON body, `X-API-Key` header, or form body; sets signed cookie; returns 200
+  JSON `{"ok":true}` for XHR or 303 to `/` if form. Logout at
+  `POST /dashboard/logout` clears cookie.
+- Per-key `domain_filter` enforced inside the auth layer: if the key has
+  `domain_filter = ["dev","infra"]`, every handler that searches or lists sees
+  the filter pre-applied via a request extension. Optional
+  `X-Vestige-Domain: home` header may narrow further but may never escape the
+  key's filter.
+- `[server]` and `[auth]` sections in `vestige.toml`, plus backward-compatible
+  env var bridges.
+- `VESTIGE_AUTH_TOKEN` continues to work for one minor release as a synthetic
+  single-key fallback, but logs a deprecation warning.
+- Per-request request IDs and structured tracing; `last_used` write-back on
+  successful auth.
+
+### Out of scope
+
+- Phase 4 HDBSCAN domain classifier itself. The REST surface exposes domain
+  endpoints but they may stub to empty results until Phase 4 lands.
+- Real TLS termination. Assumed handled by a reverse proxy (nginx, Caddy,
+  Mycelium). An optional `tls_cert` / `tls_key` pair is documented but its
+  implementation may be deferred behind a `tls` Cargo feature.
+- OAuth / OIDC / SSO. Future work.
+- Rate limiting per key (documented in Open Questions, not implemented here).
+- WebAuthn / passkey dashboard login. Future work.
+- Fine-grained RBAC beyond `read` / `write` scopes.
+
+## Prerequisites
+
+Phase 1 artifacts:
+
+- `vestige_core::storage::MemoryStore` trait (with `Send` variant via
+  `trait_variant::make`).
+- `Embedder` trait.
+- `SqliteMemoryStore` implementing `MemoryStore`.
+
+Phase 2 artifacts:
+
+- `PgMemoryStore` implementing `MemoryStore`.
+- `crates/vestige-core/migrations/postgres/` sqlx migrations; `api_keys` table
+  schema present but enforcement path is Phase 3's job.
+- Runtime backend selection via `vestige.toml` `[storage]` section returning
+  an `Arc<dyn MemoryStore>`.
+
+Assumed already available in workspace:
+
+- `axum = 0.8` (currently pinned in `crates/vestige-mcp/Cargo.toml`).
+- `tower = 0.5`, `tower-http = 0.6` (`cors`, `set-header` features already on).
+- `tokio`, `serde`, `serde_json`, `uuid`, `chrono`, `tracing`,
+  `tracing-subscriber`, `thiserror`, `anyhow`, `subtle`, `clap`, `directories`.
+
+New crates required (add via `cargo add -p vestige-mcp`):
+
+- `blake3 = "1"` -- key hashing.
+- `rand = "0.9"` with `std_rng` (for key bytes; prefer `rand::rngs::OsRng`).
+- `axum-extra = { version = "0.10", features = ["cookie-signed", "typed-header"] }`
+  -- `SignedCookieJar`, `Cookie`, `Key`.
+- `hmac = "0.12"` + `sha2 = "0.10"` -- HMAC-SHA256 for the session secret
+  derivation (not required if `axum-extra`'s `SignedCookieJar` is used, but
+  retained for the pure-token-signing path). RECOMMENDATION: rely solely on
+  `axum-extra::extract::cookie::{Key, SignedCookieJar}`.
+- `tower-http` features bump: add `trace` and `request-id`.
+- `async-stream = "0.3"` -- emitting SSE events from async closures.
+- `futures-util` already present -- for `Stream` adapters.
+- `base64 = "0.22"` -- emitting / parsing the random bytes in the `vst_...`
+  prefix. Use the `URL_SAFE_NO_PAD` alphabet.
+- `zeroize = "1"` (optional, recommended) -- scrub the plaintext key in RAM
+  after hashing.
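+
+The `subtle` crate in the workspace list above is what key verification later
+relies on for timing-safe hash comparison. As a dependency-free illustration
+only (not the project's code -- the hypothetical `ct_eq_demo` stands in for
+`subtle::ConstantTimeEq`), constant-time equality reduces to accumulating byte
+differences so the loop never exits early:
+
```rust
/// Illustration of constant-time byte comparison: XOR every byte pair and
/// fold the differences together with OR, so the loop always runs to the
/// end regardless of where the first mismatch occurs.
fn ct_eq_demo(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        // Length is not secret in the key-verification path: both sides
        // are fixed-width blake3 hex strings.
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```
+
+A naive early-return `==` exits at the first mismatching byte, which lets an
+attacker probe the stored hash one byte at a time; production code should still
+prefer `subtle`, which also guards against compiler optimizations this sketch
+does not.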
+
+`cargo add` commands (do not execute here, leave to implementation):
+
+    cargo add -p vestige-mcp blake3 rand base64 zeroize async-stream
+    cargo add -p vestige-mcp axum-extra --features cookie-signed,typed-header
+    cargo add -p vestige-mcp tower-http --features trace,request-id,cors,set-header
+
+JSON-RPC library: the project uses a hand-rolled `JsonRpcRequest` /
+`JsonRpcResponse` pair in `crates/vestige-mcp/src/protocol/types.rs`. Keep it
+in Phase 3 (no jsonrpsee migration). Streamable HTTP remains implemented as
+`POST /mcp` + session header + `GET /mcp` SSE. See Open Questions for rationale.
+
+## Deliverables
+
+1. `crates/vestige-mcp/src/auth/` module (new). Houses key generation, key
+   verification, identity resolution, scopes, domain-filter extractor, session
+   key type, and error types.
+
+2. `crates/vestige-mcp/src/auth/keys.rs` -- key format, generation,
+   blake3 hashing, store-facing trait methods for list / create / revoke /
+   verify.
+
+3. `crates/vestige-mcp/src/auth/middleware.rs` -- axum `from_fn` middleware
+   that populates `Extension<Identity>` on the request, rejects unauthenticated
+   requests with 401, insufficient scope with 403.
+
+4. `crates/vestige-mcp/src/auth/session.rs` -- `SignedCookieJar` integration,
+   `session_key()` loader (env or persisted file), `issue_session()` and
+   `revoke_session()` helpers.
+
+5. `crates/vestige-mcp/src/http/` module split out of `protocol/http.rs`:
+   - `http/mcp.rs` -- MCP JSON-RPC endpoint (adapted from the current
+     `post_mcp` / `delete_mcp`, with auth middleware now gating).
+   - `http/mcp_sse.rs` -- SSE handler for `GET /mcp` long-running ops.
+   - `http/rest.rs` -- `/api/v1/*` handlers.
+   - `http/mod.rs` -- `build_router()`, `start_server()`, bind-safety check,
+     layer stack assembly.
+
+6. `crates/vestige-mcp/src/http/errors.rs` -- uniform `ApiError` enum and
+   `IntoResponse` implementation. Maps to RFC 7807 problem+json for REST and
+   plain JSON for `/mcp`.
+
+7. 
Dashboard patch: `crates/vestige-mcp/src/dashboard/mod.rs` -- add the auth + middleware to the dashboard router, add `/dashboard/login` + `/dashboard/logout` + endpoints, keep `/api/health` unauthenticated. + +8. `crates/vestige-mcp/src/bin/cli.rs` -- new `Keys` subcommand group (`create`, + `list`, `revoke`, `rotate`). + +9. `crates/vestige-mcp/src/config.rs` (new file) -- typed `ServerConfig`, + `AuthConfig`, `StorageConfig` loader from `vestige.toml`, merging env var + overrides, validating the non-loopback + auth-disabled combination. + +10. SQL migration `crates/vestige-core/migrations/postgres/0300_api_keys_enforcement.sql` + and SQLite equivalent `crates/vestige-core/migrations/sqlite/0300_api_keys.sql`: + - `api_keys` table (if not already created in Phase 2), with `key_hash` + UNIQUE, `label` NOT NULL, `scopes` TEXT[] default `{read,write}`, + `domain_filter` TEXT[] default `{}`, `created_at`, `last_used`, + `active BOOLEAN DEFAULT true`. + - Index on `key_hash` (unique already), and on `active WHERE active`. + +11. `MemoryStore` trait extension (Phase 2 may already cover this; if not, + finalize in Phase 3): `list_api_keys`, `create_api_key`, + `revoke_api_key`, `find_api_key_by_hash`, `touch_api_key_last_used`. + +12. Docs updates: + - `docs/env-vars.md` (new) -- one sheet for all runtime env vars. + - `README.md` server-mode section. + - `docs/adr/0001-*.md` -- mark Phase 3 as Implemented when merged. + +## Detailed Task Breakdown + +### D1. 
Auth module skeleton + +Files: + +- `crates/vestige-mcp/src/auth/mod.rs` +- `crates/vestige-mcp/src/auth/keys.rs` +- `crates/vestige-mcp/src/auth/session.rs` +- `crates/vestige-mcp/src/auth/middleware.rs` +- `crates/vestige-mcp/src/auth/errors.rs` + +`auth/mod.rs`: + + pub mod errors; + pub mod keys; + pub mod middleware; + pub mod session; + + pub use errors::AuthError; + pub use keys::{ApiKey, ApiKeyPlaintext, ApiKeyRecord, Scope}; + pub use middleware::{Identity, auth_layer}; + pub use session::{SessionConfig, session_key}; + +`auth/errors.rs`: + + use axum::http::StatusCode; + use axum::response::{IntoResponse, Response}; + use serde::Serialize; + use thiserror::Error; + + #[derive(Debug, Error)] + pub enum AuthError { + #[error("missing credentials")] + MissingCredentials, + #[error("invalid credentials")] + InvalidCredentials, + #[error("key revoked")] + Revoked, + #[error("insufficient scope: required {required}")] + InsufficientScope { required: &'static str }, + #[error("domain not permitted for this key: {domain}")] + DomainNotAllowed { domain: String }, + #[error("internal auth error")] + Internal, + } + + #[derive(Serialize)] + struct Problem<'a> { + #[serde(rename = "type")] + kind: &'a str, + title: &'a str, + status: u16, + detail: &'a str, + } + + impl IntoResponse for AuthError { + fn into_response(self) -> Response { + let (status, title) = match self { + AuthError::MissingCredentials => (StatusCode::UNAUTHORIZED, "unauthorized"), + AuthError::InvalidCredentials => (StatusCode::UNAUTHORIZED, "unauthorized"), + AuthError::Revoked => (StatusCode::UNAUTHORIZED, "unauthorized"), + AuthError::InsufficientScope { .. } => (StatusCode::FORBIDDEN, "forbidden"), + AuthError::DomainNotAllowed { .. 
} => (StatusCode::FORBIDDEN, "forbidden"),
+                AuthError::Internal => (StatusCode::INTERNAL_SERVER_ERROR, "internal"),
+            };
+            let detail = self.to_string();
+            let body = axum::Json(Problem {
+                kind: "about:blank",
+                title,
+                status: status.as_u16(),
+                detail: &detail,
+            });
+            let mut r = (status, body).into_response();
+            r.headers_mut().insert(
+                axum::http::header::CONTENT_TYPE,
+                axum::http::HeaderValue::from_static("application/problem+json"),
+            );
+            r
+        }
+    }
+
+### D2. Key format and generation
+
+File: `crates/vestige-mcp/src/auth/keys.rs`
+
+- Key on wire: `vst_<22-byte base64url-no-pad>`. 22 bytes = 176 bits entropy.
+  Encoded length ~30 chars. Full string ~34 chars including the `vst_` prefix.
+- Hash stored in DB: `blake3(key_plaintext)` hex lowercase (32 bytes -> 64
+  hex chars).
+- Hash prefix on list: first 12 hex characters, e.g. `key_hash[..12]` for
+  human display.
+
+Signatures:
+
+    use blake3::Hasher;
+    use rand::rngs::OsRng;
+    use rand::TryRngCore;
+    use base64::engine::general_purpose::URL_SAFE_NO_PAD;
+    use base64::Engine;
+    use zeroize::Zeroize;
+
+    const KEY_PREFIX: &str = "vst_";
+    const KEY_RANDOM_BYTES: usize = 22;
+
+    #[derive(Clone, Debug, PartialEq, Eq)]
+    pub enum Scope {
+        Read,
+        Write,
+    }
+
+    impl Scope {
+        pub fn as_str(&self) -> &'static str {
+            match self {
+                Scope::Read => "read",
+                Scope::Write => "write",
+            }
+        }
+        pub fn from_str(s: &str) -> Option<Scope> {
+            match s {
+                "read" => Some(Scope::Read),
+                "write" => Some(Scope::Write),
+                _ => None,
+            }
+        }
+    }
+
+    /// The plaintext key. Shown to the user exactly once.
+    /// Zeroed on drop.
+    pub struct ApiKeyPlaintext(String);
+
+    impl ApiKeyPlaintext {
+        pub fn as_str(&self) -> &str { &self.0 }
+        pub fn into_inner(mut self) -> String {
+            std::mem::take(&mut self.0)
+        }
+    }
+
+    impl Drop for ApiKeyPlaintext {
+        fn drop(&mut self) { self.0.zeroize(); }
+    }
+
+    #[derive(Clone, Debug)]
+    pub struct ApiKeyRecord {
+        pub id: uuid::Uuid,
+        pub key_hash: String, // hex-encoded blake3(plaintext)
+        pub label: String,
+        pub scopes: Vec<Scope>,
+        pub domain_filter: Vec<String>,
+        pub created_at: chrono::DateTime<chrono::Utc>,
+        pub last_used: Option<chrono::DateTime<chrono::Utc>>,
+        pub active: bool,
+    }
+
+    pub fn generate_key() -> ApiKeyPlaintext {
+        let mut bytes = [0u8; KEY_RANDOM_BYTES];
+        OsRng.try_fill_bytes(&mut bytes).expect("OsRng");
+        let encoded = URL_SAFE_NO_PAD.encode(&bytes);
+        bytes.zeroize();
+        ApiKeyPlaintext(format!("{}{}", KEY_PREFIX, encoded))
+    }
+
+    pub fn hash_key(plaintext: &str) -> String {
+        let mut hasher = Hasher::new();
+        hasher.update(plaintext.as_bytes());
+        hasher.finalize().to_hex().to_string()
+    }
+
+    pub fn verify_key(plaintext: &str, stored_hash_hex: &str) -> bool {
+        use subtle::ConstantTimeEq;
+        let computed = hash_key(plaintext);
+        computed.as_bytes().ct_eq(stored_hash_hex.as_bytes()).unwrap_u8() == 1
+    }
+
+Helpers on a thin repository trait that both backends implement through
+`MemoryStore` (Phase 2 already adds the required columns; Phase 3 wires the
+methods):
+
+    #[async_trait::async_trait]
+    pub trait ApiKeyStore: Send + Sync + 'static {
+        async fn create_api_key(&self, rec: &ApiKeyRecord) -> anyhow::Result<()>;
+        async fn find_api_key_by_hash(&self, hash: &str) -> anyhow::Result<Option<ApiKeyRecord>>;
+        async fn list_api_keys(&self) -> anyhow::Result<Vec<ApiKeyRecord>>;
+        // true if a key with this id existed and was revoked
+        async fn revoke_api_key(&self, id: uuid::Uuid) -> anyhow::Result<bool>;
+        async fn touch_api_key_last_used(&self, id: uuid::Uuid) -> anyhow::Result<()>;
+    }
+
+(If Phase 2 already bolted these onto `MemoryStore`, `ApiKeyStore` is simply a
+re-export of the relevant subset.)
+
+### D3. 
Session cookie
+
+File: `crates/vestige-mcp/src/auth/session.rs`
+
+- Cookie name: `vestige_session`.
+- Cookie attributes: `HttpOnly`, `SameSite=Strict`, `Path=/`, `Max-Age=28800`
+  (8h), `Secure` when the server is running behind TLS (detected from
+  `config.server.tls_cert.is_some()` or the `X-Forwarded-Proto` trusted header;
+  default: set `Secure` whenever `config.server.bind` is non-loopback).
+- Payload: serialized `SessionClaims { key_id: Uuid, iat: i64, exp: i64 }`
+  encoded as `serde_json` then base64url. The signing is handled by
+  `axum-extra::extract::cookie::SignedCookieJar` (HMAC via a 64-byte
+  `Key`). Any tampering or truncation is rejected by the jar automatically.
+- Key material: 64 random bytes, stored at `<data_dir>/session_secret` (mode
+  0600) or overridden by `VESTIGE_SESSION_SECRET` (base64url-encoded 64 bytes,
+  reject if shorter).
+
+Signatures:
+
+    use axum_extra::extract::cookie::{Cookie, Key, SameSite, SignedCookieJar};
+    use chrono::{Duration, Utc};
+    use serde::{Deserialize, Serialize};
+
+    const COOKIE_NAME: &str = "vestige_session";
+    const DEFAULT_TTL: Duration = Duration::hours(8);
+
+    #[derive(Clone, Serialize, Deserialize)]
+    pub struct SessionClaims {
+        pub key_id: uuid::Uuid,
+        pub iat: i64,
+        pub exp: i64,
+    }
+
+    pub fn session_key(data_dir: &std::path::Path) -> anyhow::Result<Key> {
+        // 1) env override
+        if let Ok(env_val) = std::env::var("VESTIGE_SESSION_SECRET") {
+            let raw = base64::engine::general_purpose::URL_SAFE_NO_PAD
+                .decode(env_val.trim())?;
+            anyhow::ensure!(raw.len() >= 64, "VESTIGE_SESSION_SECRET must be >= 64 bytes");
+            return Ok(Key::from(&raw));
+        }
+        // 2) persisted file
+        let path = data_dir.join("session_secret");
+        if path.exists() {
+            let bytes = std::fs::read(&path)?;
+            return Ok(Key::from(&bytes));
+        }
+        // 3) generate
+        use rand::TryRngCore;
+        let mut bytes = [0u8; 64];
+        rand::rngs::OsRng.try_fill_bytes(&mut bytes)?;
+        #[cfg(unix)]
+        {
+            use std::io::Write;
+            use 
std::os::unix::fs::OpenOptionsExt; + std::fs::create_dir_all(data_dir).ok(); + let mut f = std::fs::OpenOptions::new() + .create_new(true).write(true).mode(0o600).open(&path)?; + f.write_all(&bytes)?; + f.sync_all()?; + } + #[cfg(not(unix))] + std::fs::write(&path, &bytes)?; + Ok(Key::from(&bytes)) + } + + pub fn issue_session( + jar: SignedCookieJar, + key_id: uuid::Uuid, + secure: bool, + ) -> SignedCookieJar { + let now = Utc::now(); + let claims = SessionClaims { + key_id, + iat: now.timestamp(), + exp: (now + DEFAULT_TTL).timestamp(), + }; + let value = serde_json::to_string(&claims).expect("serialize claims"); + let mut cookie = Cookie::new(COOKIE_NAME, value); + cookie.set_http_only(true); + cookie.set_same_site(SameSite::Strict); + cookie.set_path("/"); + cookie.set_max_age(cookie::time::Duration::seconds(DEFAULT_TTL.num_seconds())); + cookie.set_secure(secure); + jar.add(cookie) + } + + pub fn revoke_session(jar: SignedCookieJar) -> SignedCookieJar { + jar.remove(Cookie::from(COOKIE_NAME)) + } + + pub fn claims_from(jar: &SignedCookieJar) -> Option { + let c = jar.get(COOKIE_NAME)?; + let claims: SessionClaims = serde_json::from_str(c.value()).ok()?; + if claims.exp < Utc::now().timestamp() { return None; } + Some(claims) + } + +### D4. 
Auth middleware
+
+File: `crates/vestige-mcp/src/auth/middleware.rs`
+
+Identity carried through the request:
+
+    #[derive(Clone, Debug)]
+    pub struct Identity {
+        pub key_id: uuid::Uuid,
+        pub label: String,
+        pub scopes: Vec<Scope>,
+        pub domain_filter: Vec<String>,
+        pub via: AuthVia,
+    }
+
+    #[derive(Clone, Copy, Debug)]
+    pub enum AuthVia {
+        Bearer,
+        ApiKeyHeader,
+        SessionCookie,
+    }
+
+Middleware (axum 0.8):
+
+    use axum::extract::{Request, State};
+    use axum::http::{header, StatusCode};
+    use axum::middleware::Next;
+    use axum::response::{IntoResponse, Response};
+    use axum_extra::extract::cookie::SignedCookieJar;
+    use std::sync::Arc;
+
+    pub async fn auth_layer(
+        State(state): State<Arc<AppCtx>>,
+        jar: SignedCookieJar,
+        mut request: Request,
+        next: Next,
+    ) -> Response {
+        // Allowlist endpoints that never require auth:
+        let path = request.uri().path();
+        if path == "/api/health" || path == "/api/v1/health" ||
+           path == "/dashboard/login" {
+            return next.run(request).await;
+        }
+
+        // `?` is unavailable here (the function returns Response, not
+        // Result), so every branch produces a Result explicitly.
+        let outcome = match extract_credentials(request.headers(), &jar) {
+            Some((AuthVia::Bearer, key)) => resolve_by_plaintext(&state, &key)
+                .await
+                .map(|id| Identity { via: AuthVia::Bearer, ..id }),
+            Some((AuthVia::ApiKeyHeader, key)) => resolve_by_plaintext(&state, &key)
+                .await
+                .map(|id| Identity { via: AuthVia::ApiKeyHeader, ..id }),
+            Some((AuthVia::SessionCookie, key_id_str)) => {
+                match uuid::Uuid::parse_str(&key_id_str) {
+                    Ok(kid) => resolve_by_key_id(&state, kid).await,
+                    Err(_) => Err(AuthError::InvalidCredentials),
+                }
+            }
+            None => Err(AuthError::MissingCredentials),
+        };
+
+        let identity = match outcome {
+            Ok(id) => id,
+            Err(e) => return e.into_response(),
+        };
+
+        // touch last_used asynchronously; do not block the request path
+        let st2 = state.clone();
+        let kid = identity.key_id;
+        tokio::spawn(async move { let _ = st2.store.touch_api_key_last_used(kid).await; });
+
+        request.extensions_mut().insert(identity);
+        next.run(request).await
+    }
+
+Credential extraction (priority: Bearer > X-API-Key > cookie):
+
+    fn 
extract_credentials(
+        headers: &axum::http::HeaderMap,
+        jar: &SignedCookieJar,
+    ) -> Option<(AuthVia, String)> {
+        if let Some(v) = headers.get(header::AUTHORIZATION).and_then(|h| h.to_str().ok()) {
+            if let Some(rest) = v.strip_prefix("Bearer ") {
+                return Some((AuthVia::Bearer, rest.trim().to_string()));
+            }
+        }
+        if let Some(v) = headers.get("x-api-key").and_then(|h| h.to_str().ok()) {
+            return Some((AuthVia::ApiKeyHeader, v.trim().to_string()));
+        }
+        if let Some(claims) = crate::auth::session::claims_from(jar) {
+            return Some((AuthVia::SessionCookie, claims.key_id.to_string()));
+        }
+        None
+    }
+
+Resolution helpers:
+
+    async fn resolve_by_plaintext(st: &AppCtx, key: &str) -> Result<Identity, AuthError> {
+        let hash = crate::auth::keys::hash_key(key);
+        let rec = st.store.find_api_key_by_hash(&hash).await
+            .map_err(|_| AuthError::Internal)?
+            .ok_or(AuthError::InvalidCredentials)?;
+        if !rec.active { return Err(AuthError::Revoked); }
+        Ok(Identity {
+            key_id: rec.id, label: rec.label, scopes: rec.scopes,
+            domain_filter: rec.domain_filter, via: AuthVia::Bearer,
+        })
+    }
+
+    async fn resolve_by_key_id(st: &AppCtx, id: uuid::Uuid) -> Result<Identity, AuthError> {
+        let rec = st.store.find_api_key_by_id(id).await
+            .map_err(|_| AuthError::Internal)?
+            .ok_or(AuthError::InvalidCredentials)?;
+        if !rec.active { return Err(AuthError::Revoked); }
+        Ok(Identity {
+            key_id: rec.id, label: rec.label, scopes: rec.scopes,
+            domain_filter: rec.domain_filter, via: AuthVia::SessionCookie,
+        })
+    }
+
+Scope guard extractor (per-handler opt-in):
+
+    /// `RequireScope<false>` guards read endpoints, `RequireScope<true>` writes.
+    pub struct RequireScope<const WRITE: bool>;
+
+    impl<S, const WRITE: bool> axum::extract::FromRequestParts<S> for RequireScope<WRITE>
+    where S: Send + Sync,
+    {
+        type Rejection = AuthError;
+        async fn from_request_parts(
+            parts: &mut axum::http::request::Parts, _state: &S,
+        ) -> Result<Self, Self::Rejection> {
+            let id = parts.extensions.get::<Identity>().ok_or(AuthError::MissingCredentials)?;
+            let need = if WRITE { Scope::Write } else { Scope::Read };
+            if !id.scopes.contains(&need) {
+                return Err(AuthError::InsufficientScope {
+                    required: if WRITE { "write" } else { "read" },
+                });
+            }
+            Ok(RequireScope)
+        }
+    }
+
+Domain scoping:
+
+    /// Returns the effective domain filter for the request:
+    /// - Intersect the key's domain_filter with any X-Vestige-Domain header.
+    /// - Empty key filter means "all domains", so the header is authoritative.
+    /// - A header that names a domain outside the key filter returns
+    ///   `Err(DomainNotAllowed)`.
+    pub fn effective_domain_filter(
+        id: &Identity, header: Option<&str>,
+    ) -> Result<Option<Vec<String>>, AuthError> {
+        let header_dom = header.map(|s| s.trim().to_string()).filter(|s| !s.is_empty());
+        match (id.domain_filter.as_slice(), header_dom) {
+            ([], None) => Ok(None),
+            ([], Some(h)) => Ok(Some(vec![h])),
+            (filter, None) => Ok(Some(filter.to_vec())),
+            (filter, Some(h)) => {
+                if filter.iter().any(|d| d == &h) {
+                    Ok(Some(vec![h]))
+                } else {
+                    Err(AuthError::DomainNotAllowed { domain: h })
+                }
+            }
+        }
+    }
+
+### D5. 
Layer ordering
+
+Router assembly in `http/mod.rs::build_router`:
+
+    let trace = tower_http::trace::TraceLayer::new_for_http();
+    let request_id = tower_http::request_id::SetRequestIdLayer::x_request_id(
+        tower_http::request_id::MakeRequestUuid);
+    let propagate_id = tower_http::request_id::PropagateRequestIdLayer::x_request_id();
+
+    let cors = CorsLayer::new()
+        .allow_origin(cfg.server.allowed_origins())
+        .allow_methods([Method::GET, Method::POST, Method::PUT, Method::DELETE, Method::OPTIONS])
+        .allow_headers([header::CONTENT_TYPE, header::AUTHORIZATION,
+            HeaderName::from_static("x-api-key"),
+            HeaderName::from_static("x-vestige-domain"),
+            HeaderName::from_static("mcp-session-id")])
+        .allow_credentials(true);
+
+    let app = Router::new()
+        // Unauth routes first (not subjected to auth_layer by path allowlist)
+        .route("/api/health", get(health))
+        .route("/dashboard/login", post(login))
+        .route("/dashboard/logout", post(logout))
+        // MCP + REST + dashboard
+        .route("/mcp", post(http::mcp::post_mcp).get(http::mcp_sse::get_mcp_sse)
+            .delete(http::mcp::delete_mcp))
+        .nest("/api/v1", http::rest::router())
+        .merge(dashboard::router())
+        // Auth middleware applied via from_fn_with_state (allowlist inside)
+        .layer(axum::middleware::from_fn_with_state(ctx.clone(), auth_layer))
+        // Outermost: tracing, request-id, cors, body limit, concurrency
+        .layer(
+            ServiceBuilder::new()
+                .layer(trace)
+                .layer(request_id)
+                .layer(propagate_id)
+                .layer(cors)
+                .layer(DefaultBodyLimit::max(MAX_BODY_SIZE))
+                .layer(ConcurrencyLimitLayer::new(CONCURRENCY_LIMIT))
+        )
+        .with_state(ctx);
+
+Each `Router::layer()` call wraps everything registered before it, so the
+later-added `ServiceBuilder` stack sits outside the auth middleware; within a
+`ServiceBuilder`, layers apply top-to-bottom. The result here: request ->
+trace -> request-id -> CORS -> body-limit -> concurrency -> auth -> handler.
+Auth must wrap the handlers but be inside tracing so its spans can log auth
+outcomes.
+
+### D6. 
MCP endpoints
+
+File: `crates/vestige-mcp/src/http/mcp.rs`
+
+`POST /mcp` -- keep the session-based structure already in `protocol/http.rs`
+but use the `Identity` injected by the auth layer instead of a shared
+`auth_token`:
+
+    pub async fn post_mcp(
+        State(ctx): State<Arc<AppCtx>>,
+        Extension(id): Extension<Identity>,
+        headers: HeaderMap,
+        Json(request): Json<JsonRpcRequest>,
+    ) -> Response { ... }
+
+Auth happens in the layer, so this handler cannot be reached without a valid
+`Identity`. Scope check: all MCP writes (tools that mutate) require
+`RequireScope<true>`. Use an enum of MCP methods or a method -> required-scope
+map. `tools/list`, `resources/list`, `initialize`, `ping` are read-only.
+`tools/call` is conservatively classified as write; the per-tool dispatch
+inside `McpServer::handle_tools_call` may further reject writes when the tool
+name is read-only and the key lacks write.
+
+`DELETE /mcp` -- unchanged semantics, drops the session.
+
+`GET /mcp` -- SSE. Implementation in `http/mcp_sse.rs`:
+
+    use axum::response::sse::{Event, KeepAlive, Sse};
+    use axum::extract::Query;
+    use futures_util::stream::Stream;
+    use async_stream::stream;
+    use std::time::Duration;
+
+    #[derive(serde::Deserialize)]
+    pub struct SseParams {
+        pub op: String,              // "dream" | "consolidate" | "discover" | "reassign"
+        pub session: Option<String>, // optional operation correlation id
+    }
+
+    pub async fn get_mcp_sse(
+        State(ctx): State<Arc<AppCtx>>,
+        Extension(_id): Extension<Identity>,
+        Query(params): Query<SseParams>,
+    ) -> Result<Sse<impl Stream<Item = Result<Event, axum::Error>>>, AuthError> {
+        let op = params.op.clone();
+        let ctx2 = ctx.clone();
+        let s = stream! 
{
+            yield Ok(Event::default().event("start").data(format!("{{\"op\":\"{}\"}}", op)));
+            match op.as_str() {
+                "dream" => {
+                    let mut rx = ctx2.cognitive.lock().await.begin_dream_stream().await;
+                    while let Some(ev) = rx.recv().await {
+                        // `json_data` can fail to serialize; `?` does not
+                        // compile inside `stream!`, so surface the error as
+                        // the final stream item instead.
+                        match Event::default().event("progress").json_data(ev) {
+                            Ok(e) => yield Ok(e),
+                            Err(e) => { yield Err(e); return; }
+                        }
+                    }
+                    yield Ok(Event::default().event("done").data("{}"));
+                }
+                "consolidate" => { /* same pattern over Storage::run_consolidation_stream */ }
+                "discover" => { /* Phase 4 */ }
+                "reassign" => { /* Phase 4 */ }
+                other => {
+                    yield Ok(Event::default().event("error")
+                        .data(format!("{{\"message\":\"unknown op {}\"}}", other)));
+                }
+            }
+        };
+        Ok(Sse::new(s).keep_alive(KeepAlive::new().interval(Duration::from_secs(15))))
+    }
+
+SSE event shape (stable contract, document in `docs/http-api.md`):
+
+    event: start
+    data: {"op":"dream"}
+
+    event: progress
+    data: {"stage":"replay","processed":12,"total":50}
+
+    event: progress
+    data: {"stage":"cross_reference","processed":25,"total":50}
+
+    event: done
+    data: {"nodes_processed":50,"duration_ms":14320}
+
+The `keep-alive` hint is 15s to survive most proxy timeouts.
+
+### D7. 
REST API
+
+File: `crates/vestige-mcp/src/http/rest.rs`
+
+Routes:
+
+    pub fn router() -> Router<Arc<AppCtx>> {
+        Router::new()
+            .route("/health", get(health))
+            .route("/memories", post(create_memory).get(list_memories))
+            .route("/memories/{id}", get(get_memory).put(update_memory).delete(delete_memory))
+            .route("/memories/{id}/promote", post(promote_memory))
+            .route("/memories/{id}/demote", post(demote_memory))
+            .route("/search", post(search_memories))
+            .route("/consolidate", post(trigger_consolidation))
+            .route("/stats", get(get_stats))
+            .route("/domains", get(list_domains))
+            .route("/domains/discover", post(trigger_discovery))
+            .route("/domains/{id}", put(rename_domain).delete(delete_domain))
+            .route("/domains/{id}/merge", post(merge_domain))
+            .route("/keys", post(create_key).get(list_keys))
+            .route("/keys/{id}", delete(revoke_key))
+    }
+
+Representative signatures:
+
+    #[derive(serde::Deserialize)]
+    pub struct CreateMemoryReq {
+        pub content: String,
+        pub node_type: Option<String>,
+        pub tags: Option<Vec<String>>,
+        pub source: Option<String>,
+        pub metadata: Option<serde_json::Value>,
+    }
+
+    #[derive(serde::Serialize)]
+    pub struct MemoryView { /* flat projection of MemoryRecord */ }
+
+    pub async fn create_memory(
+        State(ctx): State<Arc<AppCtx>>,
+        Extension(id): Extension<Identity>,
+        _: RequireScope<Write>,
+        Json(req): Json<CreateMemoryReq>,
+    ) -> Result<(StatusCode, Json<MemoryView>), ApiError> {
+        let effective = effective_domain_filter(&id, None)?;
+        let rec = ctx.store.insert_from_rest(req, effective).await?;
+        Ok((StatusCode::CREATED, Json(MemoryView::from(rec))))
+    }
+
+    pub async fn search_memories(
+        State(ctx): State<Arc<AppCtx>>,
+        Extension(id): Extension<Identity>,
+        _: RequireScope<Read>,
+        headers: HeaderMap,
+        Json(req): Json<SearchReq>,
+    ) -> Result<Json<SearchResp>, ApiError> {
+        let dom_header = headers.get("x-vestige-domain").and_then(|h| h.to_str().ok());
+        let effective = effective_domain_filter(&id, dom_header)?;
+        let q = SearchQuery { domains: effective, ..req.into() };
+        let res = ctx.store.search(&q).await?;
+        Ok(Json(SearchResp::from(res)))
+    }
+
+`trigger_consolidation` returns
202 Accepted + a JSON body with a `session_id` +the client may pass to `GET /mcp?op=consolidate&session=...` to stream +progress. + +### D8. Error mapping + +File: `crates/vestige-mcp/src/http/errors.rs` + + #[derive(Debug, thiserror::Error)] + pub enum ApiError { + #[error(transparent)] Auth(#[from] AuthError), + #[error("bad request: {0}")] BadRequest(String), + #[error("not found")] NotFound, + #[error("conflict: {0}")] Conflict(String), + #[error(transparent)] Store(#[from] anyhow::Error), + } + + impl IntoResponse for ApiError { + fn into_response(self) -> Response { + match self { + ApiError::Auth(a) => a.into_response(), + ApiError::BadRequest(m) => (StatusCode::BAD_REQUEST, problem(400, "bad_request", &m)).into_response(), + ApiError::NotFound => (StatusCode::NOT_FOUND, problem(404, "not_found", "")).into_response(), + ApiError::Conflict(m) => (StatusCode::CONFLICT, problem(409, "conflict", &m)).into_response(), + ApiError::Store(e) => { + tracing::error!(err = %e, "store error"); + (StatusCode::INTERNAL_SERVER_ERROR, problem(500, "internal", "internal error")).into_response() + } + } + } + } + +All MCP JSON-RPC error mapping is unchanged (done in `McpServer`); only +transport-level errors (401/403) leave that path. + +### D9. 
Config loader and bind-safety check
+
+File: `crates/vestige-mcp/src/config.rs`
+
+    #[derive(Debug, Clone, serde::Deserialize)]
+    pub struct ServerConfig {
+        #[serde(default = "default_bind")]
+        pub bind: String, // "127.0.0.1:3928"
+        #[serde(default = "default_dashboard_port")]
+        pub dashboard_port: u16,
+        #[serde(default)] pub tls_cert: Option<std::path::PathBuf>,
+        #[serde(default)] pub tls_key: Option<std::path::PathBuf>,
+        #[serde(default)] pub allowed_origins: Vec<String>,
+    }
+
+    #[derive(Debug, Clone, serde::Deserialize)]
+    pub struct AuthConfig {
+        #[serde(default = "default_true")]
+        pub enabled: bool,
+        #[serde(default)] pub session_secret_file: Option<std::path::PathBuf>,
+    }
+
+    impl ServerConfig {
+        pub fn parsed_bind(&self) -> anyhow::Result<std::net::SocketAddr> {
+            self.bind.parse().map_err(|e: std::net::AddrParseError|
+                anyhow::anyhow!("invalid bind {}: {}", self.bind, e))
+        }
+    }
+
+Bind-safety check (called during `start_server`):
+
+    pub fn enforce_bind_safety(server: &ServerConfig, auth: &AuthConfig) -> anyhow::Result<()> {
+        let addr = server.parsed_bind()?;
+        let is_loopback = match addr.ip() {
+            std::net::IpAddr::V4(v) => v.is_loopback(),
+            std::net::IpAddr::V6(v) => v.is_loopback(),
+        };
+        if !is_loopback && !auth.enabled {
+            anyhow::bail!(
+                "refusing to bind {} with auth disabled; \
+                 set [auth] enabled = true in vestige.toml or \
+                 change [server] bind to a loopback address",
+                addr
+            );
+        }
+        Ok(())
+    }
+
+`main.rs` and the `serve` CLI both call `enforce_bind_safety` before
+`TcpListener::bind`. On failure: `eprintln!` the error, `std::process::exit(2)`.
+
+Env bridge:
+
+- `VESTIGE_HTTP_BIND` (existing) -> `server.bind` host part.
+- `VESTIGE_HTTP_PORT` (existing) -> `server.bind` port part.
+- `VESTIGE_DASHBOARD_PORT` (existing) -> `server.dashboard_port`.
+- `VESTIGE_AUTH_TOKEN` (deprecated) -- when set, synthesize a virtual
+  `ApiKeyRecord` with `id = all-zero UUID`, `scopes = [read, write]`,
+  `domain_filter = []`, `active = true`, hash stored in memory only.
Log a
+  warning on every startup: `VESTIGE_AUTH_TOKEN is deprecated; use 'vestige
+  keys create' and set auth.enabled=true instead. Will be removed in v2.2.0.`
+- `VESTIGE_SESSION_SECRET` -- see D3.
+
+### D10. Dashboard login + logout
+
+File: `crates/vestige-mcp/src/dashboard/handlers.rs` (additions).
+
+    #[derive(serde::Deserialize)]
+    pub struct LoginBody {
+        pub api_key: String,
+    }
+
+    pub async fn login(
+        State(state): State<AppState>,
+        jar: SignedCookieJar,
+        headers: HeaderMap,
+        body: Option<Json<LoginBody>>,
+    ) -> Result<(SignedCookieJar, Json<serde_json::Value>), AuthError> {
+        // Accept key in either JSON body or X-API-Key header
+        let plaintext = body.map(|b| b.0.api_key)
+            .or_else(|| headers.get("x-api-key").and_then(|h| h.to_str().ok()).map(String::from))
+            .ok_or(AuthError::MissingCredentials)?;
+
+        let hash = crate::auth::keys::hash_key(&plaintext);
+        let rec = state.store.find_api_key_by_hash(&hash).await
+            .map_err(|_| AuthError::Internal)?
+            .ok_or(AuthError::InvalidCredentials)?;
+        if !rec.active { return Err(AuthError::Revoked); }
+
+        let secure = state.config.server.tls_cert.is_some();
+        let jar = crate::auth::session::issue_session(jar, rec.id, secure);
+
+        Ok((jar, Json(serde_json::json!({
+            "ok": true, "key_id": rec.id, "label": rec.label,
+            "scopes": rec.scopes.iter().map(|s| s.as_str()).collect::<Vec<_>>(),
+            "domains": rec.domain_filter,
+        }))))
+    }
+
+    pub async fn logout(jar: SignedCookieJar)
+        -> (SignedCookieJar, Json<serde_json::Value>)
+    {
+        (crate::auth::session::revoke_session(jar),
+         Json(serde_json::json!({"ok": true})))
+    }
+
+Dashboard router integration: login/logout are appended before `auth_layer`
+is applied, so they are reachable unauthenticated. The dashboard SPA asset
+routes (`/dashboard`, `/dashboard/{*path}`) remain publicly readable so the
+login page can load; the `/api/*` dashboard endpoints are gated by
+`auth_layer`. (The existing health endpoint keeps its current behaviour.)
+
+### D11. `vestige keys` CLI
+
+File: `crates/vestige-mcp/src/bin/cli.rs` additions.
+
+    #[derive(Subcommand)]
+    enum Commands {
+        // ... existing
+        /// Manage API keys
+        Keys {
+            #[command(subcommand)]
+            sub: KeyCmd,
+        },
+    }
+
+    #[derive(Subcommand)]
+    enum KeyCmd {
+        /// Create a new API key
+        Create {
+            #[arg(long)] label: String,
+            #[arg(long, value_delimiter = ',', default_values_t = ["read".to_string(), "write".to_string()])]
+            scopes: Vec<String>,
+            /// Restrict the key to listed domains (comma-separated). Empty = all domains.
+            #[arg(long, value_delimiter = ',')]
+            domains: Vec<String>,
+        },
+        /// List existing keys (never shows plaintext)
+        List {
+            /// Include revoked keys in the output
+            #[arg(long)] all: bool,
+        },
+        /// Revoke a key by id or by hash prefix
+        Revoke {
+            /// Id (UUID) or hash prefix (first 12 hex chars)
+            id_or_prefix: String,
+        },
+        /// Revoke and re-create with the same scopes/label
+        Rotate {
+            id_or_prefix: String,
+        },
+    }
+
+`Create` outputs the plaintext exactly once on stdout (for piping into env
+files) and a confirmation on stderr. Use colored output only on stderr to keep
+stdout machine-readable.
+
+    fn run_keys_create(...) -> anyhow::Result<()> {
+        let store = open_store()?; // Arc<dyn MemoryStore>
+        let plaintext = crate::auth::keys::generate_key();
+        let hash = crate::auth::keys::hash_key(plaintext.as_str());
+        let rec = ApiKeyRecord {
+            id: uuid::Uuid::new_v4(),
+            key_hash: hash, label, scopes, domain_filter: domains,
+            created_at: chrono::Utc::now(),
+            last_used: None, active: true,
+        };
+        block_on(store.create_api_key(&rec))?;
+
+        // stderr: human-readable
+        eprintln!("{} {}", "Created key:".green().bold(), rec.label);
+        eprintln!("  id:      {}", rec.id);
+        eprintln!("  scopes:  {}", rec.scopes.iter().map(|s| s.as_str()).collect::<Vec<_>>().join(","));
+        eprintln!("  domains: {}", if rec.domain_filter.is_empty() { "all".to_string() } else { rec.domain_filter.join(",") });
+        eprintln!();
+        eprintln!("{}", "Store the plaintext key now. 
It will not be shown again.".yellow());
+        // stdout: ONLY the plaintext, for scripting
+        println!("{}", plaintext.as_str());
+        Ok(())
+    }
+
+`List`:
+
+    kid      label    scopes      domains  last_used         hash
+    d3a8...  macbook  read,write  all      2026-04-20 11:02  a1b2c3d4e5f6
+    ...
+
+Never print the plaintext. Show only `hash[..12]`.
+
+### D12. Migrations
+
+Postgres `0300_api_keys.sql` (idempotent; Phase 2 may have already created the
+table, in which case this migration is a no-op `CREATE TABLE IF NOT EXISTS`):
+
+    CREATE TABLE IF NOT EXISTS api_keys (
+        id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+        key_hash TEXT NOT NULL UNIQUE,
+        label TEXT NOT NULL,
+        scopes TEXT[] NOT NULL DEFAULT ARRAY['read','write'],
+        domain_filter TEXT[] NOT NULL DEFAULT ARRAY[]::TEXT[],
+        created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
+        last_used TIMESTAMPTZ,
+        active BOOLEAN NOT NULL DEFAULT true
+    );
+
+    CREATE INDEX IF NOT EXISTS idx_api_keys_active
+        ON api_keys (active) WHERE active;
+
+SQLite `0300_api_keys.sql`:
+
+    CREATE TABLE IF NOT EXISTS api_keys (
+        id TEXT PRIMARY KEY,
+        key_hash TEXT NOT NULL UNIQUE,
+        label TEXT NOT NULL,
+        scopes TEXT NOT NULL DEFAULT 'read,write',  -- comma-joined
+        domain_filter TEXT NOT NULL DEFAULT '',     -- comma-joined, '' = all
+        created_at TEXT NOT NULL DEFAULT (datetime('now')),
+        last_used TEXT,
+        active INTEGER NOT NULL DEFAULT 1
+    );
+
+    CREATE INDEX IF NOT EXISTS idx_api_keys_active
+        ON api_keys (active) WHERE active = 1;
+
+Both backends' trait impls convert to/from `ApiKeyRecord`.
+
+### D13. Wiring main.rs and the `serve` CLI path
+
+`main.rs` refactor:
+
+1. `Config::load()` reads `vestige.toml` (if present) and overlays env vars.
+2. Run `enforce_bind_safety(&cfg.server, &cfg.auth)` before spawning any
+   listener. On failure, print to stderr and exit 2.
+3. Build `AppCtx` with `Arc<dyn MemoryStore>`, `CognitiveEngine`,
+   event bus, `session_key`, `config`.
+4. `build_router(ctx)` returns a single Axum `Router` that covers MCP, REST,
+   and dashboard.
+5. 
`axum::serve(listener, app).await`.
+6. The stdio MCP transport continues to run in parallel (unchanged) for
+   desktop / Claude Code single-user scenarios.
+
+`serve` CLI subcommand: identical flow minus stdio.
+
+### D14. Docs
+
+- `docs/env-vars.md` new: table of every supported env var, default, purpose,
+  deprecation status.
+- Section in `README.md`: "Running Vestige as a network server".
+- Cheat-sheet section in `CLAUDE.md` for: create a key, start the server,
+  curl smoke test.
+
+## Test Plan
+
+### Unit tests (colocated under `#[cfg(test)]`)
+
+- `auth/keys.rs`:
+  - `generate_key_has_prefix_and_length()` -- asserts `vst_` prefix and 34
+    chars total, regex `^vst_[A-Za-z0-9_-]{30}$` (22 bytes base64url = 30
+    chars, plus the 4-char prefix).
+  - `hash_key_blake3_is_stable_and_hex()` -- fixed vector test.
+  - `verify_key_accepts_same_input()` / `verify_key_rejects_tampered()` /
+    `verify_key_rejects_length_mismatch()`.
+  - `keys_are_unique_in_a_loop()` -- 10_000 iterations, no collisions.
+  - `plaintext_zeroed_on_drop()` -- unsafe peek into the backing buffer
+    through a wrapper that exposes bytes for the test only.
+
+- `auth/session.rs`:
+  - `round_trip_claims_through_signed_jar()`.
+  - `expired_cookie_is_rejected()` -- mint a cookie with `exp = iat - 60` and
+    confirm `claims_from` returns `None`.
+  - `tampered_cookie_is_rejected()` -- flip one byte in the signed segment,
+    confirm the jar drops it.
+  - `session_key_env_overrides_file()`.
+  - `session_key_generated_file_has_mode_0600_on_unix()`.
+
+- `auth/middleware.rs`:
+  - `extract_credentials_prefers_bearer_over_api_key_header()`.
+  - `extract_credentials_falls_back_to_cookie()`.
+  - `effective_domain_filter_empty_means_all()`.
+  - `effective_domain_filter_header_narrows_within_key_filter()`.
+  - `effective_domain_filter_rejects_header_outside_key_filter()`.
+  - `missing_credentials_returns_401()`.
+  - `revoked_key_returns_401()`.
+  - `insufficient_scope_returns_403()`.
+
+- `config.rs`:
+  - `parse_vestige_toml_with_server_and_auth_sections()`.
+ - `env_vars_override_toml_bind()`. + - `enforce_bind_safety_rejects_0_0_0_0_with_auth_disabled()`. + - `enforce_bind_safety_allows_0_0_0_0_with_auth_enabled()`. + - `enforce_bind_safety_allows_loopback_with_auth_disabled()`. + +- `http/errors.rs`: + - `not_found_emits_problem_json_with_correct_content_type()`. + - `bad_request_includes_detail_field()`. + +- `http/mcp.rs`: + - `post_mcp_unauth_returns_401()` (this would normally be caught by the + middleware; kept as a unit test by constructing the Router minus the + middleware to exercise the handler's own error paths). + +### Integration tests (`tests/phase_3/`) + +All tests spin up the full Axum stack in-process on a random port via +`tokio::net::TcpListener::bind("127.0.0.1:0")`, wire a `SqliteMemoryStore` in +a `TempDir`, and issue HTTP calls with `reqwest`. + +Files (each one a standalone binary test file): + +- `phase_3/common/mod.rs` -- shared harness (`spawn_server()`, + `create_test_key()`, `client()`). + +- `phase_3/http_mcp_round_trip.rs` -- boot server, mint a key, send + `initialize` over `POST /mcp` with `Authorization: Bearer vst_...`, follow + with `tools/list`, assert we see the expected tool count (greater than 20). + +- `phase_3/http_sse_stream.rs` -- `POST /api/v1/consolidate` returns 202 + + `session_id`. `GET /mcp?op=consolidate&session=...` streams at least one + `progress` and one `done` event. Use `eventsource-client` dev dep, or parse + the stream manually. + +- `phase_3/rest_api_crud.rs` -- exercises each REST endpoint in turn: + - `POST /api/v1/memories` -> 201 + body. + - `GET /api/v1/memories/{id}` -> 200. + - `PUT /api/v1/memories/{id}` -> 200. + - `POST /api/v1/search` -> 200 with the new memory in results. + - `POST /api/v1/memories/{id}/promote` -> 200. + - `GET /api/v1/stats` -> 200. + - `GET /api/v1/domains` -> 200 (likely empty). + - `DELETE /api/v1/memories/{id}` -> 204. 
+
+- `phase_3/auth_bearer_token.rs`:
+  - unauth: `GET /api/v1/stats` returns 401 and `Content-Type:
+    application/problem+json`.
+  - valid Bearer: same call returns 200.
+  - revoked key: `DELETE /api/v1/keys/{id}`, then reuse the key -> 401.
+  - tampered Bearer (last char flipped) -> 401.
+
+- `phase_3/auth_api_key_header.rs`:
+  - `X-API-Key: vst_...` alone -> 200.
+  - Both Bearer and X-API-Key with different values -> Bearer wins (asserted
+    via a key that is read-only in Bearer + full-scope X-API-Key, then
+    confirming a write 403s).
+
+- `phase_3/auth_session_cookie.rs`:
+  - `POST /dashboard/login` with valid key -> 200 + `Set-Cookie:
+    vestige_session=...; HttpOnly; SameSite=Strict; Path=/`.
+  - reuse cookie: `GET /api/v1/stats` returns 200.
+  - tampered cookie (change one char) -> 401.
+  - `POST /dashboard/logout` -> `Set-Cookie: vestige_session=; Max-Age=0`.
+
+- `phase_3/auth_domain_filter.rs`:
+  - Key with `domain_filter = ["dev"]`:
+    - `POST /api/v1/search` without header -> search is scoped to `["dev"]`
+      (insert fixtures with two domains, assert only `dev` rows returned).
+    - `X-Vestige-Domain: dev` -> same.
+    - `X-Vestige-Domain: home` -> 403 with detail `domain not permitted`.
+  - Key with empty filter + `X-Vestige-Domain: dev` -> scoped to `["dev"]`.
+  - Key with empty filter + no header -> no scoping.
+
+- `phase_3/auth_scope_enforcement.rs`:
+  - read-only key cannot call `POST /api/v1/memories` -> 403.
+  - read-only key CAN call `POST /api/v1/search` -> 200.
+
+- `phase_3/bind_safety_nonlocalhost_without_auth.rs`:
+  - Spawn `vestige serve --bind 0.0.0.0:0` as a subprocess with `auth.enabled
+    = false` via a temp `vestige.toml`.
+  - Assert: non-zero exit, stderr contains `refusing to bind`, no listener
+    ever opens (confirm by trying to connect to the configured port and
+    expecting connection refused after a short timeout).
+
+- `phase_3/cli_keys_create_list_revoke.rs`:
+  - Spawn the `vestige` CLI binary with `--data-dir <tempdir>`.
+  - Run `vestige keys create --label test --scopes read,write`; capture
+    stdout (the plaintext) and stderr (the human summary). Assert `vst_`
+    prefix in stdout.
+  - Run `vestige keys list`; assert no plaintext, label `test` present.
+  - Run `vestige keys revoke <id>`; confirm exit 0.
+  - Run `vestige keys list`; assert label no longer visible without `--all`.
+
+- `phase_3/dashboard_login_flow.rs`:
+  - Full loop: login -> fetch `/dashboard` (gets SPA index, unauthed ok) ->
+    fetch `/api/memories` (authed via cookie) -> logout -> fetch `/api/memories`
+    (401).
+
+- `phase_3/deprecation_auth_token.rs`:
+  - Start the server with `VESTIGE_AUTH_TOKEN=test12345...` and no created
+    keys. Send a Bearer request with that token -> 200. Assert stderr log
+    contains `deprecated`.
+
+### Smoke test (`tests/phase_3/smoke/`)
+
+- `remote_mcp_client.sh`:
+
+      #!/usr/bin/env bash
+      set -euo pipefail
+      KEY="${VESTIGE_TEST_KEY:?set me}"
+      HOST="${VESTIGE_HOST:-http://127.0.0.1:3928}"
+      # Initialize a session
+      RESP=$(curl -sS -D /tmp/h -H "Authorization: Bearer $KEY" \
+        -H "Content-Type: application/json" \
+        -d '{"jsonrpc":"2.0","id":1,"method":"initialize",
+             "params":{"protocolVersion":"2025-11-25",
+                       "clientInfo":{"name":"smoke","version":"0"},
+                       "capabilities":{}}}' \
+        "$HOST/mcp")
+      SID=$(grep -i 'mcp-session-id:' /tmp/h | awk '{print $2}' | tr -d '\r')
+      # tools/list
+      curl -sS -H "Authorization: Bearer $KEY" \
+        -H "Mcp-Session-Id: $SID" \
+        -H "Content-Type: application/json" \
+        -d '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
+        "$HOST/mcp" | jq '.result.tools | length'
+      echo "smoke ok"
+
+## Acceptance Criteria
+
+- [ ] `cargo build -p vestige-mcp` -- zero warnings, all feature combinations
+      (`--no-default-features`, default, `--features ort-dynamic`).
+- [ ] `cargo clippy --workspace --all-targets --all-features -- -D warnings`.
+- [ ] `cargo fmt --all --check`.
+- [ ] All `tests/phase_3/*.rs` pass, plus phase_1 and phase_2 remain green.
+- [ ] Unauth request to `POST /mcp` returns 401 with
+      `Content-Type: application/problem+json` and a body containing `status`,
+      `title`, `detail`.
+- [ ] Binding `0.0.0.0:<port>` with `[auth].enabled = false` makes the
+      process exit with code 2 and print `refusing to bind` to stderr.
+- [ ] `vestige keys create --label X` prints exactly one line on stdout
+      matching `^vst_[A-Za-z0-9_-]+$`; `vestige keys list` never prints that
+      line back.
+- [ ] Dashboard login from a browser-like client (tested via the reqwest
+      `Client::cookie_store(true)` harness) yields a `Set-Cookie` with
+      `HttpOnly`, `SameSite=Strict`, `Path=/`, and Max-Age present.
+- [ ] A second machine can run a curl-based MCP client against the server
+      (smoke test) and receive successful `tools/list` responses.
+- [ ] `VESTIGE_AUTH_TOKEN` still works and emits the deprecation warning.
+- [ ] `tests/phase_3/auth_domain_filter.rs` demonstrates that a key scoped to
+      `dev` cannot read `home`-domain memories via any of the three auth modes
+      and cannot escape with `X-Vestige-Domain`.
+
+## Rollback Notes
+
+- Ship behind an on-by-default Cargo feature `http-server` on
+  `vestige-mcp`. Disabling it reverts to stdio + existing localhost HTTP
+  (`protocol/http.rs` in its current form) with zero behaviour change.
+- SQL: migration `0300_api_keys.sql` is additive only; rollback is a single
+  `DROP TABLE api_keys;` in `0300_api_keys.down.sql` for both backends. Keep a
+  row count safety check in the down migration and log the deletion.
+- Session secret file: deleting `<data-dir>/session_secret` invalidates every
+  outstanding cookie; users simply log in again. Safe to rotate.
+- Env var sunset schedule:
+  - v2.1.x: `VESTIGE_AUTH_TOKEN` emits a warning, still works.
+  - v2.2.0: `VESTIGE_AUTH_TOKEN` refused with an error pointing at
+    `vestige keys create`.
+- Downgrade procedure: `git revert` the Phase 3 merge, then run the down
+  migration. No data loss; plaintext keys were only ever in user-side
+  secret managers.
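The domain-narrowing rules exercised by the acceptance criteria and the `auth_domain_filter` tests (empty key filter means all domains; an `X-Vestige-Domain` header may only narrow within the key's filter, never widen it) can be sketched in a few lines. This is an illustrative model of the behaviour, not the real middleware function; the signature and error type are assumptions.

```rust
// Illustrative model of effective_domain_filter: the key's domain_filter
// sets the outer bound, the optional X-Vestige-Domain header may only
// narrow within it. An empty result vec means "no scoping".
fn effective_domain_filter(
    key_filter: &[String],
    header: Option<&str>,
) -> Result<Vec<String>, String> {
    match header {
        // No header: fall back to the key's own filter (empty = all).
        None => Ok(key_filter.to_vec()),
        Some(d) => {
            // A header domain is allowed if the key is unrestricted, or if
            // the requested domain is inside the key's filter.
            if key_filter.is_empty() || key_filter.iter().any(|k| k == d) {
                Ok(vec![d.to_string()])
            } else {
                Err("domain not permitted".to_string()) // maps to 403
            }
        }
    }
}

fn main() {
    let dev_only = vec!["dev".to_string()];
    assert_eq!(effective_domain_filter(&dev_only, None).unwrap(), dev_only);
    assert_eq!(effective_domain_filter(&dev_only, Some("dev")).unwrap(), dev_only);
    assert!(effective_domain_filter(&dev_only, Some("home")).is_err());
    assert!(effective_domain_filter(&[], None).unwrap().is_empty());
    println!("domain narrowing rules hold");
}
```

The key point the tests rely on: the header is a narrowing hint, so a `dev`-scoped key can never reach `home` data regardless of what it sends.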
+
+## Open Implementation Questions
+
+1. JSON-RPC library: hand-rolled vs jsonrpsee?
+
+   - Candidate A: keep the hand-rolled types in `protocol/types.rs` plus the
+     session-aware `post_mcp` handler already in `protocol/http.rs`.
+   - Candidate B: switch to `jsonrpsee = "0.24"` with the `server` feature
+     and adapt it to Axum via `jsonrpsee::server::Server`.
+
+   RECOMMENDATION: A. Phase 3 is about auth and transport surfaces, not
+   library rewrites. The existing types are already correct, tested, and
+   compatible with Streamable HTTP; the 29 cognitive modules depend on
+   `McpServer::handle_request`, which does not map 1:1 to jsonrpsee's
+   `RpcModule` registration model. Re-evaluate in a future phase only if we
+   need subscription notifications beyond SSE.
+
+2. Streamable HTTP vs plain POST-with-JSON?
+
+   - The MCP spec titled "Streamable HTTP" defines: `POST /mcp` for
+     request/response, `GET /mcp` for SSE where the client subscribes to
+     server-initiated messages, and an `Mcp-Session-Id` header for session
+     correlation. The current implementation already covers POST + session
+     header + DELETE; Phase 3 adds the GET/SSE half.
+
+   RECOMMENDATION: implement the full Streamable HTTP transport. Long-running
+   tools (dream, consolidate, discover) benefit from SSE progress events, and
+   Claude Desktop / Claude Code both speak Streamable HTTP natively. Keeping
+   POST-only would work for short calls but block the UX we want for
+   background jobs.
+
+3. Session cookie crate?
+
+   - Candidate A: `axum-extra::extract::cookie::SignedCookieJar` with a 64-byte
+     `Key`.
+   - Candidate B: `tower-sessions = "0.13"` with the `MemoryStore` or
+     `PostgresStore` session backend.
+   - Candidate C: stateless JWT via `jsonwebtoken`.
+
+   RECOMMENDATION: A. We do not need server-side session state (the `api_keys`
+   row is the state; the cookie is merely a signed pointer to it). B adds a
+   whole storage backend we do not need.
C adds signing-algorithm surface area
+   and revocation becomes awkward ("revoked key" with a long-lived JWT).
+   `SignedCookieJar` gives us HMAC-signed cookies for free, integrates with
+   axum extractors, and the payload is tiny.
+
+4. Key format and length?
+
+   - 22 random bytes base64url-no-pad = 176 bits entropy, encoded ~30 chars,
+     full key ~34 chars with the `vst_` prefix. Long enough to make
+     brute-force infeasible, short enough to paste into config files.
+   - Alternatives: 32 bytes (~43 chars encoded, overkill), 16 bytes (128 bits,
+     marginal for secret material shared over networks).
+
+   RECOMMENDATION: 22 bytes. Prefix `vst_` is already documented in the PRD
+   and gives grep-ability.
+
+5. Rate limiting: in scope for Phase 3?
+
+   - Useful: mitigates slow brute force, runaway agents.
+   - Expensive to design well (per-key, per-IP, per-endpoint).
+
+   RECOMMENDATION: OUT of scope. Track as `docs/adr/0002-rate-limiting.md`
+   follow-up. Axum + `tower` has `ConcurrencyLimitLayer` (already used); a
+   follow-up can add `governor` or `tower_governor` behind the auth layer so
+   identity is available.
+
+6. CORS policy defaults for dashboard in server mode?
+
+   - Candidate A: allow only origins derived from `server.bind` host + the
+     dashboard port.
+   - Candidate B: allow user-listed origins via `server.allowed_origins`
+     config, with A as fallback.
+   - Candidate C: open CORS to `*` when TLS is configured.
+
+   RECOMMENDATION: B. Auto-populate `allowed_origins` from the bind address
+   and dashboard port at start time; if the operator sets the config list,
+   use that list verbatim. Never `*` (`allow_credentials = true` is
+   incompatible with `*` anyway).
+
+7. Dashboard session lifetime?
+
+   - 8 hours by default; configurable via `auth.session_ttl_hours`.
+   - Rotate on each write? (Rolling sessions.)
+
+   RECOMMENDATION: 8 hours fixed, non-rolling. Revisit if users complain.
+
+8. Handling `tools/call` scope granularity?
+
+   - Today, `tools/call` is a single MCP method.
Read-only tools like
+     `search`, `deep_reference`, `predict` should be callable with a
+     read-only key.
+
+   RECOMMENDATION: map tool names to scopes in `McpServer::handle_tools_call`.
+   Read-only names: `search`, `session_context`, `memory` with action in
+   `{get, state, get_batch}`, `deep_reference`, `cross_reference`, `predict`,
+   `explore_connections`, `find_duplicates`, `memory_timeline`,
+   `memory_changelog`, `memory_health`, `memory_graph`, `importance_score`,
+   `system_status`. Everything else requires `write`. If a read-only key
+   calls a write tool, return a JSON-RPC error. `-32003` ("server not
+   initialized") is close but wrong; either reuse `-32603 internal` with a
+   descriptive message or add a new `-32004 UnauthorizedTool` code.
+   RECOMMEND adding `-32004`.
+
+9. How to bridge `MemoryStore` trait with dashboard state (`AppState`)?
+
+   - Today `AppState.storage` holds the concrete storage type directly.
+   - Phase 2 introduces `Arc<dyn MemoryStore>`.
+
+   RECOMMENDATION: in Phase 3, introduce `AppCtx { store: Arc<dyn MemoryStore>,
+   cognitive, config, event_tx }` as the single state type for the unified
+   router. Keep `AppState` as a thin wrapper (or alias) if the dashboard
+   handlers need to stay untouched in this phase. Migrate the dashboard
+   handlers to the trait in a follow-up refactor to contain the blast radius.
+
+10. Windows support for `session_secret` and `auth_token` file modes?
+
+    - Unix gets `0600` via `OpenOptionsExt`.
+    - Windows has no direct equivalent; ACLs differ.
+
+    RECOMMENDATION: document the limitation; use default permissions on
+    Windows. Add a `#[cfg(windows)]` placeholder to set owner-only ACLs via
+    `windows-acl` in a follow-up, not Phase 3.
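The tool-name -> scope mapping recommended in Q8 reduces to a plain match over tool names. The sketch below is illustrative (the `Scope` enum and function name are assumptions, and the per-action check for the `memory` tool is omitted); unknown tools default to `Write`, which fails closed for read-only keys.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Scope { Read, Write }

// Minimum scope a tool needs, following the read-only list in Q8.
// The `memory` tool is read-only only for actions {get, state, get_batch},
// so it needs a per-action check inside handle_tools_call and is left out.
fn required_scope(tool: &str) -> Scope {
    const READ_ONLY: &[&str] = &[
        "search", "session_context", "deep_reference", "cross_reference",
        "predict", "explore_connections", "find_duplicates", "memory_timeline",
        "memory_changelog", "memory_health", "memory_graph",
        "importance_score", "system_status",
    ];
    if READ_ONLY.contains(&tool) { Scope::Read } else { Scope::Write }
}

fn main() {
    assert_eq!(required_scope("search"), Scope::Read);
    assert_eq!(required_scope("system_status"), Scope::Read);
    // Anything unlisted (including future tools) requires write.
    assert_eq!(required_scope("smart_ingest"), Scope::Write);
    println!("scope map ok");
}
```

A static slice keeps the mapping auditable in one place; the dispatch can then return the proposed `-32004` error when `required_scope(name) == Scope::Write` and the key lacks the write scope.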
+ +### Critical Files for Implementation + +- /home/delandtj/prppl/vestige/crates/vestige-mcp/src/protocol/http.rs +- /home/delandtj/prppl/vestige/crates/vestige-mcp/src/dashboard/mod.rs +- /home/delandtj/prppl/vestige/crates/vestige-mcp/src/main.rs +- /home/delandtj/prppl/vestige/crates/vestige-mcp/src/bin/cli.rs +- /home/delandtj/prppl/vestige/crates/vestige-mcp/Cargo.toml diff --git a/docs/plans/0004-phase-4-emergent-domain-classification.md b/docs/plans/0004-phase-4-emergent-domain-classification.md new file mode 100644 index 0000000..d9f2355 --- /dev/null +++ b/docs/plans/0004-phase-4-emergent-domain-classification.md @@ -0,0 +1,883 @@ +# Phase 4 Plan: Emergent Domain Classification + +**Status**: Draft +**Depends on**: Phase 1 (domain columns on memories, `Domain` struct + `DomainStore` methods on `MemoryStore`, `Embedder` trait), Phase 2 (Postgres JSONB + TEXT[] support for domain fields, `embedding_model` registry parity), Phase 3 (Axum HTTP server, REST `/api/v1/` scaffolding, API key auth middleware, signed dashboard session cookies) +**Related**: docs/adr/0001-pluggable-storage-and-network-access.md (Phase 4), docs/prd/001-getting-centralized-vestige.md (Emergent Domain Model) + +--- + +## Scope + +### In scope + +- `DomainClassifier` cognitive module under `crates/vestige-core/src/neuroscience/domain_classifier.rs`, alongside existing neuroscience modules (spreading_activation, synaptic_tagging, ...). +- HDBSCAN discovery pipeline using the `hdbscan` crate (v0.10): load all embeddings, cluster, extract centroids, extract top-terms via TF-IDF over cluster members, persist via the trait's `DomainStore` methods. +- Soft-assignment pipeline: for each memory, compute `cosine_similarity(memory.embedding, domain.centroid)` for every domain, store raw scores in `domain_scores` JSONB, threshold into `domains[]` using `assign_threshold` (default 0.65). 
+- Automatic classification on ingest: run through `CognitiveEngine` / `smart_ingest` so new memories get classified against existing centroids immediately; skip when `domain_count == 0` (Phase 0 accumulation).
+- Re-cluster hook in dream consolidation: every Nth four-phase dream cycle (N=5 default) triggers a discovery pass and generates proposals (split / merge / none). Proposals land in a new `domain_proposals` table, surface in the dashboard, and are never auto-applied (conservative drift, ADR Q7).
+- Context signals: `SignalSource` trait with `GitRepoSignal` (detects `.git` in CWD or `metadata.cwd`) and `IdeHintSignal` (reads `metadata.editor` / `metadata.ide`). Each returns a `boost_map` of `domain_id -> additive delta` (typical +0.05). Injected as a `signal_boost: Option<HashMap<String, f64>>` parameter into `DomainClassifier::classify`.
+- Cross-domain spreading activation decay: `ActivationNetwork` traversal multiplies the edge's effective weight by `cross_domain_decay` (default 0.5) when `target.domains` and `source.domains` are disjoint. Strict "no overlap" policy, not graded.
+- CLI subcommands (in `crates/vestige-mcp/src/bin/cli.rs`, under a new `Domains` command group): `list`, `discover [--min-cluster-size N] [--force]`, `rename <id> <new-label>`, `merge <source-id> [--into <target-id>]`. Human-readable tables on stdout; JSON via `--json`.
+- Dashboard UI additions (`apps/dashboard/src/routes/(app)/domains/`): list page, per-domain detail (memories, centroid top_terms, score histogram, proposal review controls).
+- REST endpoints under `/api/v1/domains` (introduced by Phase 3 skeleton, implemented in Phase 4): list, discover, rename, merge, proposal list / accept / reject.
+- Config additions: `[domains]` section in `vestige.toml` covering `assign_threshold`, `recluster_interval`, `min_cluster_size`, `cross_domain_decay`, `discovery_threshold`, `merge_threshold`, `signal_boost` (per-signal toggle).
+
+### Out of scope
+
+- Phase 5 federation (explicit separate ADR).
Domain centroids are installation-local; no sync.
+- Learned re-weighting of domain scores (future, only if retrieval-quality metrics show a need).
+- Interactive cluster-membership editing in the UI (drag-and-drop reassign) -- future enhancement.
+- Multi-user domain namespaces. One domain set per installation; API keys that carry `domain_filter` just restrict access, they do not create namespaces.
+- Auto-sweep of `min_cluster_size` / auto-tuned `assign_threshold` (ADR resolution Q6 + Q9: static defaults, user tunes).
+- Graded cross-domain decay (`|A intersect B| / max(|A|,|B|)`) -- strict "no overlap" is the Phase 4 rule.
+
+---
+
+## Prerequisites
+
+Artifacts that Phases 1-3 are expected to have landed:
+
+- In `vestige-core`:
+  - `Embedder` trait (`crates/vestige-core/src/embedder/`).
+  - `MemoryStore` trait (`crates/vestige-core/src/storage/trait.rs` or similar) including `DomainStore` methods: `list_domains`, `get_domain`, `upsert_domain`, `delete_domain`, `classify(&[f32]) -> Vec<(String, f64)>`, plus a bulk accessor such as `all_embeddings()` (already present in sqlite.rs as `get_all_embeddings`) and a `get_all_memories_with_embeddings()` iterator for discovery. The trait must expose a method to batch-update `(domains, domain_scores)` for a memory id.
+  - `Domain` struct: `{ id: String, label: String, centroid: Vec<f32>, top_terms: Vec<String>, memory_count: usize, created_at: DateTime<Utc> }`.
+  - Columns on memories in both SQLite and Postgres: `domains TEXT[]` (or JSON array on SQLite) and `domain_scores JSONB` (or TEXT JSON on SQLite).
+  - The `domains` table in both backends (see PRD schema sketch).
+- In `vestige-mcp`:
+  - Axum `/api/v1/` router prefix with auth middleware.
+  - CLI skeleton (`bin/cli.rs`) using `clap`; Phase 4 adds a `Domains` subcommand tree.
+  - REST handlers file structure ready under `crates/vestige-mcp/src/dashboard/handlers.rs` (legacy) and a dedicated REST handler under `/api/v1/`; Phase 4 adds a `domains.rs` handler module.
+  - SvelteKit dashboard (`apps/dashboard/`) with existing `(app)/memories`, `(app)/timeline`, `(app)/stats`, etc. Phase 4 adds `(app)/domains/`.
+
+New workspace crate additions required (added manually to `Cargo.toml`, since `cargo add` is not run from the plan):
+
+- `hdbscan = "0.10"` in `crates/vestige-core/Cargo.toml` (feature-gated behind `domain-classification`).
+- Optional: a lightweight stop-word constant inline; no external stop-word crate -- the neuroscience modules already tokenize on whitespace + length > 3 (see `dreams.rs::content_similarity`). Reuse that style; no `ndarray` needed because `hdbscan` v0.10 accepts `&[Vec<f64>]` directly (verified from the PRD snippet).
+- No new deps in `vestige-mcp` for Phase 4 -- the CLI reuses `clap` / `colored` / `comfy-table` if already present, otherwise a hand-rolled padded print. We pick hand-rolled to avoid adding a table crate; this matches the existing style of `run_stats` in `cli.rs`.
+
+Test fixtures:
+
+- A JSON seed corpus checked into `tests/phase_4/fixtures/seed_500.json` containing >= 500 memories drawn from three plausible clusters. A builder function `tests/phase_4/support/fixtures.rs::build_seed_corpus()` deterministically generates or loads this corpus. Each record has `content`, `tags`, `embedding` (768D bge-base-en-v1.5; use a committed vector or a deterministic mock embedder in tests). For deterministic tests we fake embeddings by hashing content -- acceptable as long as the fake preserves cluster separability (prefix-based: "DEV-...", "INFRA-...", "HOME-..." seeds three Gaussian blobs).
+- Reuse the `Embedder` mock from Phase 1 tests (`MockEmbedder`) for discovery tests that need real cosine similarity.
+- A minimal git-repo fixture created in a tempdir (`tempfile::tempdir` + `std::process::Command::new("git").arg("init")`) for context-signal tests.
+
+---
+
+## Deliverables
+
+1. `DomainClassifier` cognitive module: struct, defaults, `classify`, `classify_with_boost`, `reassign_all`, `discover`.
+2. 
`domain_terms` helper (TF-IDF over cluster members, returning `top_k` terms). +3. `cli domains discover` subcommand. +4. `cli domains list` / `rename` / `merge` subcommands. +5. Auto-classify hook on ingest (wired into the cognitive engine's ingest pipeline before persistence). +6. Re-cluster hook in dream consolidation (`DreamEngine::run` orchestrator gets an optional `DomainReClusterHook`; triggers every Nth dream). +7. Context signal extractor module (`crates/vestige-core/src/neuroscience/context_signals.rs`) with `SignalSource` trait + `GitRepoSignal` + `IdeHintSignal`. +8. Cross-domain spreading activation decay in `ActivationNetwork::activate` (config-driven). +9. `vestige.toml` `[domains]` section + defaults loader. +10. Dashboard UI: SvelteKit routes `(app)/domains/+page.svelte` (list), `(app)/domains/[id]/+page.svelte` (detail), `(app)/domains/proposals/+page.svelte` (review). +11. REST endpoints under `/api/v1/domains` + `/api/v1/domains/proposals`. +12. `domain_proposals` table + migration + `DomainProposal` trait methods on `MemoryStore`. +13. WebSocket event `VestigeEvent::DomainProposalCreated` so the dashboard gets a live notification after a re-cluster fires. + +--- + +## Detailed Task Breakdown + +### 1. `DomainClassifier` cognitive module + +**File**: `crates/vestige-core/src/neuroscience/domain_classifier.rs` +**Export**: in `crates/vestige-core/src/neuroscience/mod.rs`, add `pub mod domain_classifier;` and re-export `pub use domain_classifier::{DomainClassifier, ClassificationResult, DomainProposal, ProposalKind};` +**Deps**: `hdbscan = "0.10"`, `serde`, `serde_json`, `chrono`, `tracing`, existing `crate::storage::Domain`, `crate::storage::MemoryStore` trait. 
+
+Struct and defaults (match PRD exactly):
+
+```rust
+pub struct DomainClassifier {
+    pub assign_threshold: f64,      // default 0.65
+    pub discovery_threshold: usize, // default 150
+    pub recluster_interval: usize,  // default 5 (every 5th dream)
+    pub min_cluster_size: usize,    // default 10
+    pub min_samples: usize,         // default 5 (HDBSCAN)
+    pub cross_domain_decay: f64,    // default 0.5
+    pub merge_threshold: f64,       // default 0.90 (centroid cosine)
+    pub top_terms_k: usize,         // default 10
+}
+
+impl Default for DomainClassifier { ... }
+```
+
+Result types:
+
+```rust
+#[derive(Debug, Clone)]
+pub struct ClassificationResult {
+    pub scores: HashMap<String, f64>, // raw per-domain similarities
+    pub domains: Vec<String>,         // above assign_threshold
+}
+
+#[derive(Debug, Clone, PartialEq, Eq)]
+pub enum ProposalKind {
+    Split { parent: String, children: Vec<String> },
+    Merge { targets: Vec<String>, suggested_label: String },
+    NewCluster { top_terms: Vec<String> },
+}
+
+#[derive(Debug, Clone)]
+pub struct DomainProposal {
+    pub id: String, // uuid v4
+    pub kind: ProposalKind,
+    pub rationale: String,
+    pub confidence: f64,
+    pub created_at: DateTime<Utc>,
+    pub status: ProposalStatus, // Pending | Accepted | Rejected
+}
+```
+
+Key methods (all pure where possible; all pub):
+
+```rust
+impl DomainClassifier {
+    pub fn classify(&self, embedding: &[f32], domains: &[Domain]) -> ClassificationResult;
+
+    pub fn classify_with_boost(
+        &self,
+        embedding: &[f32],
+        domains: &[Domain],
+        boost: Option<&HashMap<String, f64>>,
+    ) -> ClassificationResult;
+
+    pub async fn reassign_all(
+        &self,
+        store: &dyn MemoryStore,
+        domains: &[Domain],
+    ) -> Result<usize, StorageError>;
+
+    pub async fn discover(
+        &self,
+        store: &dyn MemoryStore,
+    ) -> Result<Vec<Domain>, StorageError>;
+
+    pub async fn propose_changes(
+        &self,
+        store: &dyn MemoryStore,
+        existing: &[Domain],
+        newly_discovered: &[Domain],
+    ) -> Result<Vec<DomainProposal>, StorageError>;
+
+    pub async fn apply_proposal(
+        &self,
+        store: &dyn MemoryStore,
+        proposal: &DomainProposal,
+    ) -> Result<(), StorageError>;
+}
+```
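For implementers, `classify_with_boost` reduces to one cosine pass plus boosting, clamping, and thresholding. A minimal free-standing sketch (the `Domain` stub and the function signature here are simplified stand-ins for the types above, not the final API):

```rust
use std::collections::HashMap;

// Simplified stand-in for `crate::storage::Domain`.
pub struct Domain {
    pub id: String,
    pub centroid: Vec<f32>,
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0; // dimension mismatch is a soft zero, never a panic
    }
    let (mut dot, mut na, mut nb) = (0.0f64, 0.0f64, 0.0f64);
    for (x, y) in a.iter().zip(b) {
        dot += f64::from(*x) * f64::from(*y);
        na += f64::from(*x).powi(2);
        nb += f64::from(*y).powi(2);
    }
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na.sqrt() * nb.sqrt()) }
}

/// Cosine against every centroid, add optional signal boosts, clamp to
/// [0.0, 1.0], and assign every domain at or above the threshold.
/// An empty `domains` slice yields `({}, [])` -- the accumulation phase.
pub fn classify_with_boost(
    embedding: &[f32],
    domains: &[Domain],
    boost: Option<&HashMap<String, f64>>,
    assign_threshold: f64,
) -> (HashMap<String, f64>, Vec<String>) {
    let mut scores = HashMap::new();
    let mut assigned = Vec::new();
    for d in domains {
        let mut s = cosine_similarity(embedding, &d.centroid);
        if let Some(b) = boost {
            s += b.get(&d.id).copied().unwrap_or(0.0);
        }
        let s = s.clamp(0.0, 1.0);
        if s >= assign_threshold {
            assigned.push(d.id.clone());
        }
        scores.insert(d.id.clone(), s);
    }
    (scores, assigned)
}
```

The real method would live on `DomainClassifier` and return `ClassificationResult`; the clamp is what the `classify_boost_clamps_to_unit` unit test below exercises.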
+
+Behavior notes:
+
+- `classify` returns empty `{ scores: {}, domains: [] }` iff `domains.is_empty()` (accumulation phase). This matches the PRD snippet verbatim.
+- `classify_with_boost` adds the boost delta to each score AFTER cosine, before thresholding. It clamps to `[0.0, 1.0]`. Boost keys not present in `domains` are ignored.
+- `reassign_all` streams memories in batches of 500 (iterator on the store) to keep memory bounded; for each memory it issues a single `UPDATE memories SET domains = ?, domain_scores = ? WHERE id = ?` call. Returns the count of memories whose `domains` vector actually changed.
+- `discover` loads all `(id, embedding)` pairs via an `all_embeddings()` method on the store (exists under `#[cfg(all(feature = "embeddings", feature = "vector-search"))]` in `sqlite.rs::get_all_embeddings`; Phase 1 should promote this onto the trait -- if not yet promoted, add the method). Then:
+  1. Build `Vec<Vec<f64>>` and an index -> id map.
+  2. `Hdbscan::default_hyper_params(&embeddings).min_cluster_size(self.min_cluster_size).min_samples(self.min_samples).build()` (exact builder depends on the hdbscan 0.10 surface; see Open Question).
+  3. `let labels = clusterer.cluster()?;`
+  4. `let centers = clusterer.calc_centers(Center::Centroid, &labels)?;`
+  5. Group indices by label, ignoring -1 (noise). For each cluster compute `top_terms` via `compute_top_terms`.
+  6. Preserve stable IDs where possible: match each new cluster centroid to the closest existing domain by cosine; if similarity > 0.85, reuse the existing domain id + label. Otherwise generate a fresh id `cluster_{n}` with a label derived from the first 2 terms.
+  7. Upsert all resulting `Domain`s via the store.
+- `propose_changes` compares old vs new clusters:
+  - **Split**: an old domain that best-matches two or more new domains each with >= `min_cluster_size` members. Rationale: "domain `dev` is now 2 clusters of >=10 memories: `systems` and `networking`". 
+  - **Merge**: two old domains whose centroids now satisfy `cosine > merge_threshold` get a merge proposal.
+  - **NewCluster**: a new cluster that doesn't match any old domain above 0.85 similarity.
+- `apply_proposal` runs the split or merge against the store (reassign memberships via `reassign_all`), then marks the proposal `Accepted`. It never runs automatically -- only via the CLI or dashboard.
+
+Helper:
+
+```rust
+fn compute_top_terms(documents: &[&str], k: usize) -> Vec<String>;
+```
+
+Uses TF-IDF with IDF computed over the entire passed-in corpus (the `documents` slice), tokenization = whitespace split, lowercase, strip non-alphanumeric, drop tokens shorter than 4 chars and a small built-in stop-word list (`the`, `and`, `for`, `that`, `with`, ...). Matches the tokenizer used in `dreams.rs::content_similarity` and `dreams.rs::extract_patterns` so behavior is predictable.
+
+Cosine similarity helper:
+
+```rust
+fn cosine_similarity(a: &[f32], b: &[f32]) -> f64;
+```
+
+Keep the existing crate-level `cosine_similarity` if already present (check `embeddings::` or `search::`); otherwise add a private one. It returns 0.0 on dimension mismatch; a panic would be a bug.
+
+### 2. Top-terms computation helper
+
+**File**: same module, private section.
+
+- `fn tokenize(text: &str) -> Vec<String>`: lowercase, split on non-alphanumeric, filter len >= 4, drop stop-words.
+- `fn tfidf_top_k(docs: &[&str], k: usize) -> Vec<String>`:
+  1. `tf[doc_idx][term] = count / total_terms`.
+  2. `df[term] = docs containing term`.
+  3. `idf[term] = log((N + 1) / (df[term] + 1)) + 1` (smoothed).
+  4. For each term, average `tf` across docs in the cluster; multiply by `idf`; sort desc; return the top `k`.
+
+Cluster top-terms are computed over cluster members only, with IDF over the **whole corpus** (all memory contents), not the cluster, so common words get penalized globally. Recompute global IDF once per `discover` call.
+
+### 3. 
CLI subcommand: `vestige domains discover`
+
+**File**: `crates/vestige-mcp/src/bin/cli.rs`
+
+Add to `enum Commands`:
+
+```rust
+/// Emergent domain management
+Domains {
+    #[command(subcommand)]
+    action: DomainAction,
+},
+```
+
+```rust
+#[derive(clap::Subcommand)]
+enum DomainAction {
+    /// List all discovered domains
+    List {
+        #[arg(long)] json: bool,
+    },
+    /// Run HDBSCAN discovery on all embeddings and propose domains
+    Discover {
+        #[arg(long, default_value_t = 10)] min_cluster_size: usize,
+        /// Skip the proposal flow and write new domains directly (first-time use)
+        #[arg(long)] force: bool,
+        #[arg(long)] json: bool,
+    },
+    /// Rename a domain (by id)
+    Rename {
+        id: String,
+        new_label: String,
+    },
+    /// Merge two domains
+    Merge {
+        a: String,
+        b: String,
+        #[arg(long)] into: Option<String>, // default: `a`
+    },
+}
+```
+
+Handler plumbing lives in `run_domains(action)`, dispatching to `run_domains_list`, `run_domains_discover`, `run_domains_rename`, `run_domains_merge`. Each opens the default `Storage`, constructs a `DomainClassifier::default()`, and invokes the appropriate method.
+
+Output format for `list`:
+
+```
+ID              LABEL            MEMORIES  TOP TERMS
+dev             Development      87        rust, trait, async, tokio, zinit
+infra           Infrastructure   47        bgp, sonic, vlan, frr, peering
+home            Home             31        solar, kwh, battery, pool, esphome
+(unclassified)                   12
+```
+
+Produced via plain `print!` with fixed-width `{:<16}{:<17}{:<10}{}` padding. `--json` emits `serde_json::to_string_pretty(&domains)`.
+
+Output format for `discover` with `--force`:
+
+```
+HDBSCAN: 500 embeddings, min_cluster_size=10, min_samples=5
+Found 3 clusters (ignoring 14 noise points)
+  cluster_0 (N=47)  top: bgp, sonic, vlan, frr, peering
+  cluster_1 (N=31)  top: solar, kwh, battery, pool, esphome
+  cluster_2 (N=22)  top: rust, trait, async, tokio, zinit
+
+Writing 3 domains to the store...
+Soft-assigning 500 memories against centroids... 
+ multi-domain: 43 + single-domain: 412 + unclassified (below threshold 0.65): 45 +Done in 7.4s. +``` + +Output format for `discover` without `--force` (post-Phase-0): + +``` +HDBSCAN: 623 embeddings, min_cluster_size=10 +Comparing to existing 3 domains... + +Proposals (pending, accept via dashboard or `vestige domains proposals`): + [split] dev -> (systems:34, networking:28) confidence 0.82 + [new] cluster_5 (books, novels, reading) confidence 0.71 + +Run `vestige domains proposals` to review, or open the dashboard. +``` + +### 4. CLI: `list`, `rename`, `merge` + +- `list`: calls `store.list_domains()`, fetches unclassified count via `store.count_memories_without_domains()` (Phase 1 should have provided this; if not, Phase 4 adds it to the trait and both backends). +- `rename`: `store.get_domain(id)` -> mutate `label` -> `store.upsert_domain`. No memory touch. +- `merge`: load both, compute blended centroid (weighted by `memory_count`), merge `top_terms` (union, recompute TF-IDF rank if both sides share the corpus), delete the non-`into` domain, call `reassign_all`. Wrapped in a transaction on Postgres; on SQLite rely on the existing writer-lock pattern. + +### 5. Auto-classify on ingest + +**File**: `crates/vestige-core/src/cognitive.rs` (or equivalent ingest entry in `vestige-mcp/src/tools/smart_ingest.rs`). + +Integration point: just before the record is persisted in the smart-ingest path, after the embedder has produced `embedding` and before `storage.insert(...)`. Trace the current call site -- today `Storage::ingest(IngestInput)` computes embedding inside storage; in Phase 1 the embedder becomes external (ADR decision Q2), so classification can hook right there in the cognitive engine. 
+ +Pseudocode: + +```rust +let embedding = embedder.embed(&input.content).await?; +let domains = store.list_domains().await?; + +let (domains_assigned, domain_scores) = if domains.is_empty() { + (Vec::new(), HashMap::new()) +} else { + let boost = context_signals.gather_boost(&input.metadata, &domains); + let result = classifier.classify_with_boost(&embedding, &domains, boost.as_ref()); + (result.domains, result.scores) +}; + +record.embedding = Some(embedding); +record.domains = domains_assigned; +record.domain_scores = domain_scores; +store.insert(&record).await?; +``` + +Edge cases: + +- Accumulation phase (`domains.is_empty()`): skip classification entirely. Zero overhead. +- Embedding failed / skipped: leave `domains = []`, `domain_scores = {}`. Never fail ingest because of classification. +- Metric: emit `VestigeEvent::MemoryClassified { id, domains, top_score }` on the WebSocket bus so the dashboard sees it live. + +### 6. Re-cluster hook in dream consolidation + +**File**: `crates/vestige-core/src/advanced/dreams.rs` (long file, 1131-line `dream()` entry on the `MemoryDreamer` impl) plus `crates/vestige-core/src/consolidation/phases.rs` (the `DreamEngine::run` orchestrator). + +Design: the `DreamEngine::run(...)` returns `FourPhaseDreamResult`. It does not currently know how many times it has run. Phase 4 introduces a persistent counter on disk (column `dream_cycle_count` on a new singleton `system_state` table, or a simple row in the existing `metadata` / `embedding_model` registry). 
After the Integration phase finishes, the cognitive engine increments the counter and, if `counter % recluster_interval == 0`, launches discovery asynchronously:
+
+Extension struct in `phases.rs`:
+
+```rust
+pub struct DreamReClusterHook<'a> {
+    pub classifier: &'a DomainClassifier,
+    pub store: &'a dyn MemoryStore,
+    pub event_tx: Option<&'a tokio::sync::mpsc::UnboundedSender<VestigeEvent>>,
+}
+
+impl<'a> DreamReClusterHook<'a> {
+    pub async fn tick(&self, cycle_count: usize) -> Result<Vec<DomainProposal>, StorageError> {
+        if cycle_count == 0 || cycle_count % self.classifier.recluster_interval != 0 {
+            return Ok(vec![]);
+        }
+        let existing = self.store.list_domains().await?;
+        let rediscovered = self.classifier.discover(self.store).await?;
+        let proposals = self
+            .classifier
+            .propose_changes(self.store, &existing, &rediscovered)
+            .await?;
+        for p in &proposals {
+            self.store.insert_domain_proposal(p).await?;
+            if let Some(tx) = self.event_tx {
+                let _ = tx.send(VestigeEvent::DomainProposalCreated {
+                    id: p.id.clone(),
+                    kind: format!("{:?}", p.kind),
+                    confidence: p.confidence,
+                    timestamp: Utc::now(),
+                });
+            }
+        }
+        Ok(proposals)
+    }
+}
+```
+
+Caller wires `tick()` after `DreamEngine::run()` returns, at the ingest/consolidation orchestrator level. The hook never mutates existing domains -- it only writes proposals. The acceptance path is manual (CLI or dashboard).
+
+Counter storage: add method `store.bump_dream_cycle_count() -> Result<usize, StorageError>` returning the new count. Single-row table:
+
+```sql
+CREATE TABLE IF NOT EXISTS system_state (
+    key   TEXT PRIMARY KEY,
+    value TEXT NOT NULL
+);
+-- seed: ('dream_cycle_count', '0')
+```
+
+### 7. Context signal extractor
+
+**File**: `crates/vestige-core/src/neuroscience/context_signals.rs`
+
+```rust
+pub trait SignalSource: Send + Sync {
+    /// Returns domain_id -> additive boost (positive or negative, typically in [-0.1, +0.1]). 
+    fn boost_map(
+        &self,
+        input_metadata: &serde_json::Value,
+        domains: &[Domain],
+    ) -> HashMap<String, f64>;
+
+    fn name(&self) -> &'static str;
+}
+
+pub struct GitRepoSignal {
+    pub boost: f64, // default +0.05
+}
+
+pub struct IdeHintSignal {
+    pub boost: f64,
+}
+
+pub struct ContextSignals {
+    sources: Vec<Box<dyn SignalSource>>,
+}
+
+impl ContextSignals {
+    pub fn gather_boost(
+        &self,
+        input_metadata: &serde_json::Value,
+        domains: &[Domain],
+    ) -> Option<HashMap<String, f64>>;
+}
+```
+
+Signal encoding convention (document in the module header):
+
+- A signal is a **soft prior**. It nudges the post-cosine score by a small additive delta, clamped to `[-0.10, +0.10]` per signal.
+- Multiple signals sum, then the final boost per domain is clamped to `[-0.15, +0.15]` so signals cannot by themselves push a memory into or out of a domain; the embedding similarity dominates.
+- Signals target domains by heuristic: `GitRepoSignal` boosts any domain whose `top_terms` overlaps `{"rust","async","trait","function","class","def","git","commit","fn","code"}`. `IdeHintSignal` does the same for `{"file","line","editor","vscode","neovim","rust-analyzer","lsp"}`.
+- All signal boosts are logged via `tracing::debug!` so users can audit why a memory picked up a domain.
+
+`GitRepoSignal::boost_map` implementation:
+
+```rust
+fn boost_map(&self, meta: &Value, domains: &[Domain]) -> HashMap<String, f64> {
+    let is_git = meta.get("cwd")
+        .and_then(|v| v.as_str())
+        .map(|cwd| std::path::Path::new(cwd).join(".git").exists())
+        .unwrap_or(false)
+        || meta.get("git_repo").is_some();
+    if !is_git { return HashMap::new(); }
+    let mut out = HashMap::new();
+    for d in domains {
+        let code_hits = d.top_terms.iter()
+            .filter(|t| CODE_TERMS.contains(t.as_str()))
+            .count();
+        if code_hits > 0 { out.insert(d.id.clone(), self.boost); }
+    }
+    out
+}
+```
+
+Config knob in `[domains.signals]`: `git = true`, `ide = true`, `git_boost = 0.05`, `ide_boost = 0.05`.
+
+### 8. 
Cross-domain spreading activation decay
+
+**File**: `crates/vestige-core/src/neuroscience/spreading_activation.rs`
+
+Modify `ActivationConfig`:
+
+```rust
+pub struct ActivationConfig {
+    pub decay_factor: f64,
+    pub max_hops: u32,
+    pub min_threshold: f64,
+    pub allow_cycles: bool,
+    pub cross_domain_decay: f64, // NEW, default 0.5
+}
+```
+
+Domain metadata on nodes: the current `ActivationNode` has `id`, `activation`, `last_activated`, and an `edges` vector. Phase 4 adds `pub domains: Vec<String>`. Populated when nodes get added (propagated from the memory's `domains` field). The network is rebuilt on each search from the store; if the in-memory network is persisted (check `ActivationNetwork` lifetime in `CognitiveEngine`), the population happens in the engine at boot and on insert.
+
+Traversal change, in the `ActivationNetwork::activate` loop, replacing the single line `let propagated = current_activation * edge.strength * self.config.decay_factor;`:
+
+```rust
+let cross_penalty = {
+    let src_doms = self.nodes.get(&current_id).map(|n| &n.domains);
+    let tgt_doms = self.nodes.get(&target_id).map(|n| &n.domains);
+    match (src_doms, tgt_doms) {
+        (Some(s), Some(t)) if !s.is_empty() && !t.is_empty() => {
+            let overlap = s.iter().any(|d| t.contains(d));
+            if overlap { 1.0 } else { self.config.cross_domain_decay }
+        }
+        _ => 1.0, // unclassified on either side: no penalty
+    }
+};
+let propagated = current_activation * edge.strength * self.config.decay_factor * cross_penalty;
+```
+
+Rationale for "unclassified -> no penalty": unclassified memories are Phase-0 or low-confidence corpus members; penalizing them would block useful cross-pollination during the accumulation ramp.
+
+API to update a node's domains after reclassification:
+
+```rust
+pub fn set_node_domains(&mut self, id: &str, domains: Vec<String>);
+```
+
+Called by the reassignment pipeline after `reassign_all`.
+
+### 9. 
`vestige.toml` `[domains]` section + +**File**: wherever `vestige.toml` is loaded (search for `[storage]` / `[server]` loaders). Add: + +```toml +[domains] +assign_threshold = 0.65 +discovery_threshold = 150 +recluster_interval = 5 +min_cluster_size = 10 +min_samples = 5 +cross_domain_decay = 0.5 +merge_threshold = 0.90 +top_terms_k = 10 + +[domains.signals] +git = true +ide = true +git_boost = 0.05 +ide_boost = 0.05 +``` + +Rust-side: `DomainsConfig { ... }` struct with `serde(default)` so `vestige.toml` without a `[domains]` section falls back to hard-coded defaults. `DomainClassifier::from_config(cfg: &DomainsConfig) -> Self`. + +### 10. Dashboard UI additions + +**SvelteKit routes** (`apps/dashboard/src/routes/(app)/domains/`): + +- `+page.svelte` (list): fetches `GET /api/v1/domains` and `GET /api/v1/domains/unclassified-count`. Renders a table: `label`, `memories`, `top_terms` chips, `created_at`. Each row links to `/domains/[id]`. A "Discover" button posts `POST /api/v1/domains/discover`. +- `[id]/+page.svelte` (detail): fetches `GET /api/v1/domains/:id`, `GET /api/v1/domains/:id/memories?limit=100`, `GET /api/v1/domains/:id/score-histogram`. Renders: + - Header: label (editable, triggers `PUT /api/v1/domains/:id`), top-terms chips, memory count, created_at. + - Histogram: a vertical bar chart of `domain_scores[:id]` buckets 0-0.1, 0.1-0.2, ..., 0.9-1.0 across all memories. Data source: server precomputes buckets so the client does not need to fetch all scores. + - Memory list: paginated, each row shows the raw score for this domain. +- `proposals/+page.svelte`: fetches `GET /api/v1/domains/proposals?status=pending`. Each pending proposal card shows `kind`, `rationale`, `confidence`, `created_at`, buttons "Accept" (posts `POST /api/v1/domains/proposals/:id/accept`) and "Reject" (`POST .../reject`). Live updates via the existing WebSocket channel (`/ws`) reacting to `DomainProposalCreated` events. 
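The detail page's score histogram relies on server-side precomputation; the bucketing itself is small enough to pin down here. A sketch (the function name and the edge policy for a score of exactly 1.0 are assumptions; the ten bins match the 0-0.1 through 0.9-1.0 ranges described above):

```rust
/// Bucket per-domain scores into ten fixed-width bins:
/// [0.0,0.1), [0.1,0.2), ..., [0.9,1.0]. A score of exactly 1.0
/// is clamped into the last bin instead of overflowing.
fn score_histogram(scores: &[f64]) -> [u32; 10] {
    let mut buckets = [0u32; 10];
    for &s in scores {
        let idx = ((s * 10.0).floor() as usize).min(9);
        buckets[idx] += 1;
    }
    buckets
}
```

Because `classify` clamps scores to `[0.0, 1.0]`, no further range checking is needed on the server side; the handler only has to map each bucket array into the JSON shape the chart expects.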
+ +Styling reuses the existing Tailwind + shadcn-svelte conventions in `apps/dashboard/src/lib/components/`. + +Existing `(app)/stats` and `(app)/feed` pages get a small "Domains" summary panel that links to `/domains`. + +### 11. REST endpoints + +**File**: `crates/vestige-mcp/src/protocol/http.rs` or a new `crates/vestige-mcp/src/api/domains.rs` module, wired into the `/api/v1/` router. + +| Method | Path | Handler | +|--------|------|---------| +| GET | `/api/v1/domains` | `list_domains` -- returns `[Domain...]` + unclassified count | +| POST | `/api/v1/domains/discover` | `trigger_discover` -- body `{ min_cluster_size?: usize, force?: bool }`, returns proposals or applied domains | +| GET | `/api/v1/domains/:id` | `get_domain` | +| PUT | `/api/v1/domains/:id` | `update_domain` -- rename | +| DELETE | `/api/v1/domains/:id` | `delete_domain` -- with `?merge_into=other_id` | +| GET | `/api/v1/domains/:id/memories` | paginated memories in this domain | +| GET | `/api/v1/domains/:id/score-histogram` | precomputed buckets | +| GET | `/api/v1/domains/proposals` | `list_proposals?status=pending` | +| POST | `/api/v1/domains/proposals/:id/accept` | `accept_proposal` | +| POST | `/api/v1/domains/proposals/:id/reject` | `reject_proposal` | + +All handlers go through the Phase 3 auth middleware (Bearer / X-API-Key / session cookie). Responses are JSON; error paths use `StatusCode::*` with a small `{"error": "..."}` body. + +### 12. 
`domain_proposals` table + trait methods
+
+Postgres migration (`crates/vestige-core/migrations/postgres/00XX_domain_proposals.sql`):
+
+```sql
+CREATE TABLE domain_proposals (
+    id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    kind        TEXT NOT NULL,                   -- 'split' | 'merge' | 'new_cluster'
+    payload     JSONB NOT NULL,                  -- serialized ProposalKind body
+    rationale   TEXT NOT NULL,
+    confidence  DOUBLE PRECISION NOT NULL,
+    status      TEXT NOT NULL DEFAULT 'pending', -- pending|accepted|rejected
+    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
+    resolved_at TIMESTAMPTZ
+);
+CREATE INDEX idx_domain_proposals_status ON domain_proposals (status, created_at DESC);
+```
+
+SQLite migration: same table, `UUID` -> `TEXT`, `JSONB` -> `TEXT` with JSON-encoded bodies, `TIMESTAMPTZ` -> `TEXT` ISO-8601.
+
+`MemoryStore` trait additions:
+
+```rust
+async fn insert_domain_proposal(&self, p: &DomainProposal) -> Result<()>;
+async fn list_domain_proposals(&self, status: Option<&str>) -> Result<Vec<DomainProposal>>;
+async fn get_domain_proposal(&self, id: &str) -> Result<Option<DomainProposal>>;
+async fn set_proposal_status(&self, id: &str, status: &str) -> Result<()>;
+```
+
+### 13. WebSocket event for proposals
+
+**File**: `crates/vestige-mcp/src/dashboard/events.rs`
+
+Add variants:
+
+```rust
+pub enum VestigeEvent {
+    // ... existing ...
+    DomainProposalCreated {
+        id: String,
+        kind: String,
+        confidence: f64,
+        timestamp: DateTime<Utc>,
+    },
+    MemoryClassified {
+        id: String,
+        domains: Vec<String>,
+        top_score: f64,
+        timestamp: DateTime<Utc>,
+    },
+}
+```
+
+The SvelteKit dashboard's WS client reacts to both events: classified events refresh any open domain-detail page; proposal events push a toast and a badge on the navbar.
+
+---
+
+## Test Plan
+
+Test root: `tests/phase_4/` (a new member of the workspace; mirror the `tests/e2e` layout). 
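The deterministic fake embedder the fixtures call for (prefix-separable, hash-jittered) can be sketched with std only. The per-prefix axis assignment and all names here are hypothetical; the only property the tests rely on is that same-prefix vectors stay far more similar than cross-prefix ones:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic fake embedding: one fixed dominant axis per corpus prefix
/// ("DEV-", "INFRA-", "HOME-") plus a small content-hashed jitter component,
/// then L2 normalization. Same-prefix vectors stay near-parallel; the
/// cross-prefix cosine stays low, so HDBSCAN sees three clean blobs.
fn fake_embed(content: &str, dim: usize) -> Vec<f32> {
    let axis = match content.split('-').next().unwrap_or("") {
        "DEV" => 0,
        "INFRA" => 1,
        "HOME" => 2,
        other => {
            // Unknown prefixes get a stable hash-derived axis outside 0..3.
            let mut h = DefaultHasher::new();
            other.hash(&mut h);
            3 + (h.finish() as usize) % (dim - 3)
        }
    };
    let mut v = vec![0.0f32; dim];
    v[axis] = 1.0; // shared cluster direction
    let mut h = DefaultHasher::new();
    content.hash(&mut h);
    v[(h.finish() as usize) % dim] += 0.1; // per-item jitter
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    v.iter_mut().for_each(|x| *x /= norm);
    v
}
```

In `build_seed_corpus()` this would replace the real `Embedder` behind the Phase 1 `MockEmbedder` interface; because the jitter amplitude is 0.1 against a unit cluster axis, intra-cluster cosine stays near 1.0 while cross-cluster cosine stays near 0.2 at worst.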
+ +`tests/phase_4/Cargo.toml`: + +```toml +[package] +name = "vestige-phase4-tests" +version = "0.0.0" +edition = "2024" +publish = false + +[dependencies] +vestige-core = { path = "../../crates/vestige-core", features = ["embeddings", "vector-search", "domain-classification"] } +vestige-mcp = { path = "../../crates/vestige-mcp" } +tokio = { workspace = true } +anyhow = "1" +tempfile = "3" +serde_json = { workspace = true } +uuid = { workspace = true } +``` + +### Unit tests (colocated in `domain_classifier.rs::tests`, `context_signals.rs::tests`, `spreading_activation.rs::tests`) + +Each public function must have at least one test: + +- `classify_empty_domains_returns_empty`: `classify(&[0.0; 768], &[])` returns `ClassificationResult { scores: {}, domains: [] }`. +- `classify_single_domain_scores`: one `Domain` with a known centroid; input embedding equal to centroid; expect score 1.0 and `domains == [id]`. +- `classify_multi_domain_overlap`: two domains A, B; input halfway between centroids; expect both scores >= `assign_threshold`; expect `domains == [A, B]` (order not guaranteed). +- `classify_below_threshold_returns_empty_domains_but_scores_filled`: input orthogonal to all centroids; expect `scores` populated, `domains` empty. +- `classify_with_boost_adds_delta`: same input as above, with `boost = {A: 0.4}`; expect A now above threshold, B unchanged. +- `classify_boost_clamps_to_unit`: `boost = {A: 5.0}`; resulting `scores[A]` must be <= 1.0. +- `tfidf_top_k_returns_distinct_terms`: given three fake docs, `top_k=3` returns three non-duplicate strings, in descending TF-IDF order. +- `tfidf_top_k_drops_stopwords`: `["the and for"]` + real content -> stop-words absent. +- `compute_top_terms_handles_empty_cluster`: returns `vec![]` (no panic). +- `signal_git_present_vs_absent`: `GitRepoSignal` given metadata with `.git` in cwd returns non-empty map; without it returns empty. 
+- `signal_ide_present_vs_absent`: `IdeHintSignal` ditto for `metadata.editor == "vscode"`. +- `signal_combined_clamped`: two signals both firing each at +0.10 -> combined map values <= +0.15. +- `cross_domain_decay_full_weight_on_overlap`: graph with node A in domain `dev`, node B in domain `dev`, edge A->B strength 1.0; after `activate`, B's activation equals the standard `initial * strength * decay_factor` (no extra penalty). +- `cross_domain_decay_half_weight_no_overlap`: A in `dev`, B in `infra`, same edge -> B's activation is 0.5x that of the overlap case. +- `cross_domain_decay_unclassified_no_penalty`: A classified, B unclassified -> full weight. +- `propose_changes_detects_split`: existing domain `dev`; new discovery returns two clusters whose centroids both sit close to old `dev` centroid, each >= min_cluster_size members -> proposal of kind `Split { parent: "dev", children: [a, b] }`. +- `propose_changes_detects_merge`: two existing domains whose new centroids now have cosine > `merge_threshold` -> proposal of kind `Merge`. +- `propose_changes_detects_new_cluster`: a new cluster with no match >= 0.85 to any existing -> `NewCluster`. +- `apply_proposal_split_updates_memberships`: after accept, memories previously in `dev` get reassigned (some to child a, some to child b) via `reassign_all`. + +### Integration tests (`tests/phase_4/tests/`) + +One file per behavior listed in the Phase 4 acceptance sheet. + +- `discover_seed_corpus.rs` -- loads the 500-memory fixture, runs `classifier.discover(&store).await`, asserts at least 3 clusters, asserts per-cluster intra-similarity mean > 0.6, asserts discovery wall time < 10s in release. Also asserts `top_terms` for each cluster contains at least one expected keyword per cluster (dev: contains any of `rust/trait/async`; infra: `bgp/vlan/network`; home: `solar/battery/pool`). 
+- `soft_assign_multi_domain.rs` -- inserts a memory "deploy zinit containers over BGP network"; after classify, `domains` contains both `dev` and `infra` (from a known centroid setup).
+- `auto_classify_on_ingest.rs` -- with three existing domains, a fresh `smart_ingest` of a dev-ish sentence ends up with `domains == ["dev"]` and non-empty `domain_scores`.
+- `reembed_triggers_recluster.rs` -- after `vestige migrate --reembed`, centroids must be recomputed; verify `list_domains()` returns fresh `centroid` values (different from pre-reembed).
+- `dream_consolidation_recluster_hook.rs` -- run 5 dream cycles with heavy synthetic memory insertion; after the 5th, assert `list_domain_proposals("pending")` has at least one proposal.
+- `proposal_accept_applies_changes.rs` -- accept a split proposal via `apply_proposal`; verify that memories in `dev` are now distributed across the new children and that the old `dev` domain is removed.
+- `proposal_reject_leaves_state.rs` -- reject a proposal; verify all domains and memberships unchanged.
+- `drift_is_proposal_only.rs` -- over 5 dream cycles with new inserts, never call accept; verify every memory's `domains` field equals its initial post-discovery value. No auto-apply.
+- `cross_domain_activation_decay.rs` -- build an `ActivationNetwork` with two memories linked by a strength-1.0 edge, one in `dev`, one in `infra`; activate the `dev` memory with 1.0; assert the `infra` memory's activation == `0.5 * decay_factor` (0.35 with the default decay_factor of 0.7). Then set both to `dev` and reassert activation == `0.7`.
+- `cli_domains_discover.rs` -- spawn `cargo run -- domains discover --force --json`, parse stdout, assert at least 3 clusters and a valid JSON shape.
+- `cli_domains_rename_merge.rs` -- happy-path rename then merge, with stdout assertions. 
+- `context_signal_git_repo.rs` -- ingest the same sentence from inside a tempdir with `.git` vs outside; assert the git-run produces slightly higher `domain_scores` for the code-related domain (diff >= 0.04, consistent with `git_boost = 0.05`).
+- `threshold_tunable.rs` -- same memory, two runs with `assign_threshold = 0.40` vs `0.85`; the low-threshold run assigns more domains than the high-threshold run for the same content.
+- `signal_boost_clamped.rs` -- artificially configure `git_boost = 5.0` and assert the resulting per-domain score is still <= 1.0.
+- `discover_preserves_stable_ids.rs` -- run discover twice with no new memories; the second run's domain ids match the first's (via centroid-similarity stable-ID matching above 0.85).
+
+### Dashboard UI tests (`tests/phase_4/ui/`)
+
+Use curl-driven smoke tests (avoids adding Playwright as a new hard dep; Playwright already exists at `apps/dashboard/playwright.config.ts` and can be extended later).
+
+- `domains_list_renders.sh` -- `curl -H "X-API-Key: $KEY" http://localhost:3927/api/v1/domains` returns 200 + a JSON array with the expected keys.
+- `domain_detail_histogram.sh` -- `curl .../api/v1/domains/dev/score-histogram` returns 10 buckets.
+- `proposal_review_flow.sh` -- create a pending proposal via SQL insert; `curl -X POST .../api/v1/domains/proposals/<id>/accept`; `curl .../proposals?status=accepted` shows it.
+- `unauth_domain_list_rejected.sh` -- no auth header -> 401.
+
+### Benchmarks (`tests/phase_4/benches/`)
+
+Criterion benches:
+
+- `bench_discover_10k.rs` -- synthetic 10k x 768D embeddings drawn from 5 blobs; assert `discover` wall p95 < 30s on a warm release build.
+- `bench_auto_classify_single.rs` -- 20 domains in memory, classify one 768D vector; assert p99 < 5ms.
+- `bench_reassign_all.rs` -- 10k memories, 5 domains; assert full `reassign_all` wall time < 90s (a floor of roughly 110 rows/s).
+
+---
+
+## Acceptance Criteria
+
+- [ ] `cargo build -p vestige-core --features domain-classification` zero warnings. 
+- [ ] `cargo build -p vestige-mcp` zero warnings.
+- [ ] `cargo clippy --workspace --all-targets --all-features -- -D warnings` clean.
+- [ ] `cargo test -p vestige-phase4-tests` -- all tests in `tests/phase_4/` pass.
+- [ ] On a 500+ memory seed corpus covering three natural clusters (dev / infra / home), `vestige domains discover --force` produces sensible top-terms matching the expected keyword sets and labels are stable on a second run.
+- [ ] `vestige search` with domain filter `["dev"]` excludes any memory whose `domains` array does not include `dev`.
+- [ ] After 5 dream cycles with ongoing inserts, no existing memory's `domains` has silently changed; proposals exist in the `domain_proposals` table; accepting a proposal reassigns as described.
+- [ ] Cross-domain spreading activation: a query in `dev` that crosses a single edge into an `infra`-only memory still returns the memory but with activation `cross_domain_decay * in-domain_activation`.
+- [ ] `vestige domains discover --min-cluster-size 20` produces no more clusters than the default, with larger per-cluster membership.
+- [ ] Dashboard `/dashboard/domains` route renders all domains within 2 seconds on the seed corpus.
+- [ ] Proposal UI flow (open pending, accept, confirmed in store) works end-to-end.
+- [ ] Benchmarks meet targets (discover 10k p95 < 30s, auto-classify p99 < 5ms).
+
+---
+
+## Rollback Notes
+
+- **Feature gate**: add `domain-classification` to `crates/vestige-core/Cargo.toml`'s `[features]`. When disabled, the `DomainClassifier` module is not compiled, the classification call in the ingest path is a no-op (`#[cfg]`-guarded), and cross-domain decay collapses to `1.0`. The CLI `domains` subcommand emits "domain classification is disabled in this build".
+- **Revert strategy**: drop the two new tables: `domains` (retained if it was already created in Phase 1) and `domain_proposals` (Phase 4). A DOWN migration clears `memories.domains` and `memories.domain_scores`.
Existing memories simply lose their domain assignments; all search and retrieval paths work unchanged because `domains = []` is the documented "unclassified" state.
+- **Idempotency**: rerunning `discover` is always safe. Cluster numeric IDs may differ between runs, but the stable-ID match by centroid similarity preserves user-assigned labels. Do not persist cluster ids in client-side bookmarks; link via the user-assigned label.
+- **Data-loss risk**: `apply_proposal` is a destructive operation (it deletes the old parent domain in a split or merges two). The dashboard's accept button double-confirms with a modal that shows the number of affected memories.
+
+---
+
+## Open Implementation Questions
+
+Each question + candidates + RECOMMENDATION.
+
+### OQ1. Top-terms extraction: TF-IDF vs BM25 vs frequency?
+- TF-IDF with smoothed IDF -- standard, cheap, good-enough.
+- BM25 -- better for long-document discrimination, overkill for short memory contents.
+- Raw frequency -- noisy; stop-words dominate.
+**RECOMMENDATION**: TF-IDF with global IDF over the entire memory corpus (not just cluster members), recomputed once per `discover` call. Same tokenizer as the `dreams.rs::content_similarity` Jaccard for consistency.
+
+### OQ2. Proposal persistence: DB table vs in-memory with dashboard notification?
+- DB table (`domain_proposals`) -- durable, surfaces across restarts, enables audit.
+- In-memory only -- simpler, but loses proposals on server restart.
+**RECOMMENDATION**: DB table. Proposals are rare (every 5th dream) and valuable user-facing artifacts; durability is mandatory.
+
+### OQ3. `hdbscan` crate: f32 vs f64 input, exact API surface?
+- v0.10 historically takes `&[Vec<f64>]`; embeddings are `Vec<f32>`.
+- Cost of converting f32 -> f64 at discovery time: `10k * 768 = 7.68M` f64 values, ~60MB transient, acceptable.
+**RECOMMENDATION**: verify v0.10's type signature at implementation time; if it requires f64, perform the conversion in `discover()` behind a single allocation. Document in module header. If the crate API diverged from the PRD snippet, fall back to the manual builder style (`HdbscanHyperParams::builder().min_cluster_size(n).min_samples(s).build()`).
+
+### OQ4. Stable domain IDs across discover re-runs?
+- Option A: numeric IDs from HDBSCAN labels -- unstable, re-runs shuffle them.
+- Option B: hash(top_terms) -- stable if top-terms stable, but top-terms drift.
+- Option C (recommended): after computing new centroids, match each to the closest existing domain by centroid cosine; if similarity > 0.85, reuse the existing domain's `id` and `label`. Otherwise mint a fresh `id = "cluster_<n>"`.
+**RECOMMENDATION**: Option C. Preserves user-assigned labels across drift. Threshold 0.85 is config-tunable via `stable_id_threshold` if needed later.
+
+### OQ5. Context signal injection site: ingest handler vs embedder vs classifier?
+- Embedder -- would alter the embedding; signals are not about embedding quality.
+- Ingest handler -- signals are known there, but then `DomainClassifier` cannot be tested in isolation.
+- Classifier as a `classify_with_boost(boost: Option<&HashMap<String, f64>>)` parameter -- pure, testable, composable.
+**RECOMMENDATION**: classifier parameter. The cognitive engine constructs the boost map via `ContextSignals::gather_boost(&metadata, &domains)` and hands it to the classifier. Keeps the classifier stateless w.r.t. signals.
+
+### OQ6. Re-cluster proposal cadence: event-based (every Nth dream) vs time-based (weekly)?
+- ADR resolution Q7: every Nth dream (N=5 default).
+- Alternative: once per week regardless of dream cadence.
+**RECOMMENDATION**: stick with every Nth dream. Users who dream rarely re-cluster rarely -- that matches the philosophy ("memory work triggers memory bookkeeping").
Note the alternative as future consideration; if users complain about never seeing proposals, add a time-based fallback. + +### OQ7. Minimum corpus size for first discover? +- PRD default: 150. +- Too low -> noisy initial clusters, proposals every dream. +- Too high -> user waits forever for domains to appear. +**RECOMMENDATION**: 150 as the default discovery gate; HDBSCAN's `min_cluster_size=10` will produce 0 clusters for < 100 memories, so the system gracefully produces no domains until the corpus is large enough. Test with `N=80, 150, 500` in `threshold_tunable.rs` to confirm sensible behavior. + +### OQ8. Cross-domain decay: strict no-overlap vs graded? +- Strict: `1.0` if any overlap, `cross_domain_decay` otherwise. +- Graded: `max(cross_domain_decay, |A intersect B| / max(|A|, |B|))`. +**RECOMMENDATION**: strict for Phase 4. Easier to reason about, easier to tune, easier to test. Graded is a marked future enhancement; file an issue if retrieval-quality metrics justify it. + +### OQ9. Classifier invocation from remote HTTP clients? +- In server mode, an agent posts `smart_ingest` -> server embeds -> server classifies. +- All the work stays server-side; MCP clients never do classification. +**RECOMMENDATION**: confirmed server-side-only. Document in the MCP tool schema that `smart_ingest` now returns `domains` and `domain_scores` in its response so clients can display the classification to the user. + +### OQ10. Where to store the dream-cycle counter? +- In-memory on `CognitiveEngine` -- lost on restart, miscounts cadence. +- New `system_state` singleton table. +**RECOMMENDATION**: `system_state` table. Survives restarts. Also useful for future metrics (total memories ever, total dreams ever). + +### OQ11. Scope of `reassign_all` after a proposal accept vs a normal discover? +- On discover --force (first-time), run `reassign_all` against all memories. 
+- On proposal accept (split / merge), run `reassign_all` only on affected memories (parent's members for split; both parents' members for merge) to avoid touching unrelated records. +**RECOMMENDATION**: scoped reassignment where possible; fall back to full `reassign_all` only on `discover --force` or when the set of domains has fundamentally changed. Reduces write amplification on large corpora. + +### OQ12. Proposal freshness? +- Multiple re-clusters could stack up pending proposals. +**RECOMMENDATION**: before inserting a new proposal, check for existing pending proposals with the same `kind + targets`; if present, bump `created_at` and `confidence` instead of creating a duplicate. Add a `confidence_history` array in the `payload` JSONB for audit. + +--- + +## Implementation Sequencing (suggested order) + +1. Land the `DomainClassifier` struct, `classify` / `classify_with_boost`, unit tests. (Day 1) +2. Add `compute_top_terms` + TF-IDF helper, tests. (Day 1) +3. Wire `discover` end-to-end against SQLite; `discover_seed_corpus` integration test. (Day 2) +4. Add `domain_proposals` table migrations + trait methods; both backends. (Day 2) +5. Implement `propose_changes` + `apply_proposal`; proposal unit tests. (Day 3) +6. Context signals module + tests. (Day 3) +7. Hook classifier into ingest path; `auto_classify_on_ingest` integration test. (Day 4) +8. Cross-domain decay in spreading activation; unit + integration tests. (Day 4) +9. Dream re-cluster hook + `system_state` counter; integration tests for drift-only behavior. (Day 5) +10. CLI subcommands. (Day 6) +11. REST endpoints. (Day 6) +12. SvelteKit dashboard routes + WebSocket event wiring. (Day 7-8) +13. Benchmarks + acceptance sweep on the 500-memory seed. 
(Day 9) + +--- + +## File Map (everything Phase 4 touches or creates) + +Creates: + +- `crates/vestige-core/src/neuroscience/domain_classifier.rs` +- `crates/vestige-core/src/neuroscience/context_signals.rs` +- `crates/vestige-core/migrations/postgres/00XX_domain_proposals.sql` +- `crates/vestige-core/migrations/sqlite/00XX_domain_proposals.sql` (or inline in `storage/migrations.rs`) +- `crates/vestige-mcp/src/api/domains.rs` (REST handlers) +- `apps/dashboard/src/routes/(app)/domains/+page.svelte` +- `apps/dashboard/src/routes/(app)/domains/[id]/+page.svelte` +- `apps/dashboard/src/routes/(app)/domains/proposals/+page.svelte` +- `apps/dashboard/src/lib/api/domains.ts` +- `tests/phase_4/Cargo.toml` +- `tests/phase_4/tests/*.rs` (per the Integration test list) +- `tests/phase_4/fixtures/seed_500.json` +- `tests/phase_4/support/fixtures.rs` + +Modifies: + +- `crates/vestige-core/Cargo.toml` -- add `hdbscan = "0.10"` under a new `domain-classification` feature. +- `crates/vestige-core/src/neuroscience/mod.rs` -- register new modules, re-exports. +- `crates/vestige-core/src/neuroscience/spreading_activation.rs` -- `cross_domain_decay` field in `ActivationConfig`, `domains` field on `ActivationNode`, decay math in `activate`. +- `crates/vestige-core/src/consolidation/phases.rs` -- `DreamReClusterHook`. +- `crates/vestige-core/src/advanced/dreams.rs` -- accept a hook callback from the orchestrator (if the orchestration is done at this level). +- `crates/vestige-core/src/storage/trait.rs` -- add proposal + system_state methods. +- `crates/vestige-core/src/storage/sqlite.rs` -- implement proposal + system_state methods + `all_embeddings_with_meta` if not already on the trait. +- `crates/vestige-core/src/storage/postgres.rs` (Phase 2) -- same. +- `crates/vestige-core/src/lib.rs` -- re-exports. +- `crates/vestige-core/src/cognitive.rs` (or equivalent ingest orchestrator) -- auto-classify injection. +- `crates/vestige-mcp/src/bin/cli.rs` -- `Domains` subcommand + dispatch. 
+- `crates/vestige-mcp/src/dashboard/mod.rs` -- wire new REST routes. +- `crates/vestige-mcp/src/dashboard/events.rs` -- new event variants. +- `crates/vestige-mcp/src/dashboard/handlers.rs` -- if legacy dashboard gets a domains panel (optional). +- `vestige.toml` config loader -- `[domains]` section + struct + defaults. +- Root `Cargo.toml` workspace members -- add `tests/phase_4`. + +--- + +## Risks + +- **HDBSCAN determinism**: HDBSCAN is deterministic given input order; sorting embeddings by memory id before feeding the clusterer guarantees reproducibility across runs -- do this in `discover()` and document it. +- **Embedding dimension drift**: Phase 1's `embedding_model` registry blocks writes from mismatched embedders. If `discover()` ever sees two dimensions, it bails with a clear error and points at `vestige migrate --reembed`. +- **Classification latency on ingest**: for users with thousands of domains (unlikely but possible), `classify` is O(n_domains * dim). 20 domains * 768 f32 = 15k flops per classification, trivial. Still, expose a `classify_budget_ms` config knob for paranoia. +- **Re-cluster proposal storms**: if the corpus is borderline-stable, small changes can produce conflicting proposals on consecutive dreams. Mitigation: OQ12 (dedup by target set, bump confidence instead of stacking). +- **Dashboard feature gap**: if the SvelteKit app lands with the domains route but the REST endpoints are not yet deployed, the route 404s. Mitigation: ship the REST endpoints in the same release; a feature flag on the client toggles the nav entry. + +--- + +## Non-Goals Reminder + +- No Phase 5 federation concerns in this plan. +- No cross-installation domain sync. +- No automatic accept of proposals, ever. +- No graded cross-domain decay; strict only. +- No ML-based domain label suggestion (top-terms are enough for v1). +- No editing individual memory memberships from the UI in this phase. 
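
As a concrete illustration of the strict cross-domain decay rule (OQ8, reiterated in the non-goals above), here is a minimal sketch. `cross_domain_factor` and the constants are illustrative stand-ins, not actual Vestige APIs; the numbers mirror the `cross_domain_activation_decay.rs` test:

```rust
/// Strict rule from OQ8: full strength when the two memories share
/// any domain, otherwise a flat `cross_domain_decay` penalty.
/// Illustrative sketch only -- not the Vestige implementation.
fn cross_domain_factor(a: &[&str], b: &[&str], cross_domain_decay: f64) -> f64 {
    if a.iter().any(|d| b.contains(d)) { 1.0 } else { cross_domain_decay }
}

fn main() {
    let decay_factor = 0.7;      // per-hop spreading decay (default used in the tests)
    let cross_domain_decay = 0.5;

    // `dev` -> `infra` across one strength-1.0 edge: 1.0 * 0.7 * 0.5 = 0.35
    let cross = 1.0 * decay_factor * cross_domain_factor(&["dev"], &["infra"], cross_domain_decay);
    // Same edge with both memories in `dev`: 1.0 * 0.7 * 1.0 = 0.7
    let within = 1.0 * decay_factor * cross_domain_factor(&["dev"], &["dev"], cross_domain_decay);
    println!("cross = {cross}, within = {within}");
}
```

Because the factor is a flat multiplier, it composes with the existing per-hop decay without touching the spreading-activation traversal itself.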
diff --git a/docs/prd/001-getting-centralized-vestige.md b/docs/prd/001-getting-centralized-vestige.md
new file mode 100644
index 0000000..9d86087
--- /dev/null
+++ b/docs/prd/001-getting-centralized-vestige.md
@@ -0,0 +1,751 @@
+# RFC: Pluggable Storage Backend + Network Access for Vestige
+
+**Status**: Draft / Discussion
+**Author**: Jan
+**Date**: 2026-02-26
+**Vestige version**: v2.x (current main)
+
+## Summary
+
+Add a pluggable storage backend trait to Vestige, enabling PostgreSQL (+pgvector) as an alternative to the current SQLite+FTS5+USearch stack. Simultaneously add HTTP MCP transport with API key authentication to enable centralized/remote deployment.
+
+This keeps the existing local-first SQLite mode fully intact while opening up a server deployment model.
+
+## Motivation
+
+Vestige currently runs as a local process per machine (MCP via stdio, SQLite in `~/.vestige/`). This works great for single-machine use but doesn't support:
+
+- **Multi-machine access**: Same memory brain from laptop, desktop, and server
+- **Multi-agent access**: Multiple AI clients hitting one memory store concurrently
+- **Future federation**: Syncing memory between decentralized nodes (e.g., MOS/Threefold grid)
+
+SQLite's single-writer model and lack of a native network protocol make it unsuitable as a centralized server. PostgreSQL is a natural fit: built-in concurrency (MVCC), authentication, replication, and with `pgvector` + built-in FTS it collapses three separate storage layers into one.
+
+## Design
+
+### Storage Trait
+
+The core abstraction. All 29 cognitive modules interact with storage exclusively through this trait (or a small family of traits).
+
+```rust
+use std::collections::HashMap;
+use uuid::Uuid;
+
+/// Core memory record, backend-agnostic
+#[derive(Debug, Clone)]
+pub struct MemoryRecord {
+    pub id: Uuid,
+    pub domains: Vec<String>,        // [] = unclassified, ["dev"], ["dev", "infra"], etc.
+    pub domain_scores: HashMap<String, f64>, // raw similarities: {"dev": 0.82, "infra": 0.71}
+    pub content: String,
+    pub node_type: String,
+    pub tags: Vec<String>,
+    pub embedding: Option<Vec<f32>>, // dimensionality is runtime config
+    pub created_at: chrono::DateTime<chrono::Utc>,
+    pub updated_at: chrono::DateTime<chrono::Utc>,
+    pub metadata: serde_json::Value,
+}
+
+/// FSRS scheduling state, stored alongside each memory
+#[derive(Debug, Clone)]
+pub struct SchedulingState {
+    pub memory_id: Uuid,
+    pub stability: f64,
+    pub difficulty: f64,
+    pub retrievability: f64,
+    pub last_review: Option<chrono::DateTime<chrono::Utc>>,
+    pub next_review: Option<chrono::DateTime<chrono::Utc>>,
+    pub reps: u32,
+    pub lapses: u32,
+}
+
+/// Hybrid search request
+#[derive(Debug, Clone)]
+pub struct SearchQuery {
+    pub domains: Option<Vec<String>>,    // None = search all domains
+    pub text: Option<String>,            // FTS query
+    pub embedding: Option<Vec<f32>>,     // vector similarity
+    pub tags: Option<Vec<String>>,       // tag filter
+    pub node_types: Option<Vec<String>>,
+    pub limit: usize,
+    pub min_retrievability: Option<f64>, // filter by FSRS state
+}
+
+#[derive(Debug, Clone)]
+pub struct SearchResult {
+    pub record: MemoryRecord,
+    pub score: f64,                      // combined/fused score
+    pub fts_score: Option<f64>,
+    pub vector_score: Option<f64>,
+}
+
+/// Connection/edge between memories (for spreading activation)
+#[derive(Debug, Clone)]
+pub struct MemoryEdge {
+    pub source_id: Uuid,
+    pub target_id: Uuid,
+    pub edge_type: String,
+    pub weight: f64,
+    pub created_at: chrono::DateTime<chrono::Utc>,
+}
+
+/// Main storage trait — one impl per backend
+/// trait_variant generates a Send-bound `MemoryStore` alias,
+/// enabling Arc<dyn MemoryStore> without manual boxing.
+#[trait_variant::make(MemoryStore: Send)]
+pub trait LocalMemoryStore: Sync + 'static {
+    // --- Lifecycle ---
+    async fn init(&self) -> Result<()>;
+    async fn health_check(&self) -> Result<bool>;
+
+    // --- CRUD ---
+    async fn insert(&self, record: &MemoryRecord) -> Result<Uuid>;
+    async fn get(&self, id: Uuid) -> Result<Option<MemoryRecord>>;
+    async fn update(&self, record: &MemoryRecord) -> Result<()>;
+    async fn delete(&self, id: Uuid) -> Result<()>;
+
+    // --- Search ---
+    async fn search(&self, query: &SearchQuery) -> Result<Vec<SearchResult>>;
+    async fn fts_search(&self, text: &str, limit: usize) -> Result<Vec<SearchResult>>;
+    async fn vector_search(&self, embedding: &[f32], limit: usize) -> Result<Vec<SearchResult>>;
+
+    // --- FSRS Scheduling ---
+    async fn get_scheduling(&self, memory_id: Uuid) -> Result<Option<SchedulingState>>;
+    async fn update_scheduling(&self, state: &SchedulingState) -> Result<()>;
+    async fn get_due_memories(&self, before: chrono::DateTime<chrono::Utc>, limit: usize) -> Result<Vec<MemoryRecord>>;
+
+    // --- Graph (spreading activation) ---
+    async fn add_edge(&self, edge: &MemoryEdge) -> Result<()>;
+    async fn get_edges(&self, node_id: Uuid, edge_type: Option<&str>) -> Result<Vec<MemoryEdge>>;
+    async fn remove_edge(&self, source: Uuid, target: Uuid) -> Result<()>;
+    async fn get_neighbors(&self, node_id: Uuid, depth: usize) -> Result<Vec<Uuid>>;
+
+    // --- Bulk / Maintenance ---
+    async fn count(&self) -> Result<u64>;
+    async fn get_stats(&self) -> Result<serde_json::Value>;
+    async fn vacuum(&self) -> Result<()>;
+}
+```
+
+**Design notes:**
+
+- `trait_variant::make` generates a `MemoryStore` trait alias with `Send`-bound futures, allowing `Arc<dyn MemoryStore>` for runtime backend selection. `LocalMemoryStore` is the base (usable in single-threaded contexts), `MemoryStore` is the Send variant for Axum/tokio.
+- `embedding: Option<Vec<f32>>` — dimensions determined at runtime by the configured fastembed model. The backend stores whatever it gets.
+- The trait is intentionally flat. The cognitive modules (FSRS-6, spreading activation, synaptic tagging, prediction error gating, etc.)
sit *above* this trait and don't need to know about the backend. +- `search()` does hybrid RRF fusion at the backend level — both SQLite and Postgres implementations handle this internally. + +### Backend: SQLite (existing, refactored) + +Wraps the current implementation behind the trait: + +``` +SqliteMemoryStore +├── rusqlite connection pool (r2d2 or deadpool) +├── FTS5 virtual table (keyword search) +├── USearch HNSW index (vector search, behind RwLock) +└── WAL mode + busy timeout for concurrent readers +``` + +No behavioral changes — just the trait boundary. + +### Backend: PostgreSQL (new) + +``` +PgMemoryStore +├── sqlx::PgPool (connection pool, compile-time checked queries) +├── tsvector + GIN index (keyword search) +├── pgvector + HNSW index (vector search) +└── Standard PostgreSQL MVCC concurrency +``` + +**Schema sketch:** + +```sql +CREATE EXTENSION IF NOT EXISTS vector; + +-- Domain registry — populated by clustering, not by user +CREATE TABLE domains ( + id TEXT PRIMARY KEY, -- auto-generated or user-named + label TEXT NOT NULL, -- human label (suggested or user-provided) + centroid vector, -- mean embedding of domain members + top_terms TEXT[] NOT NULL DEFAULT '{}', -- top keywords for display + memory_count INTEGER NOT NULL DEFAULT 0, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + metadata JSONB NOT NULL DEFAULT '{}' +); + +CREATE TABLE memories ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + domains TEXT[] NOT NULL DEFAULT '{}', -- [] = unclassified + domain_scores JSONB NOT NULL DEFAULT '{}', -- {"dev": 0.82, "infra": 0.71} raw similarities + content TEXT NOT NULL, + node_type TEXT NOT NULL DEFAULT 'general', + tags TEXT[] NOT NULL DEFAULT '{}', + embedding vector, -- dimension set at table creation or unconstrained + metadata JSONB NOT NULL DEFAULT '{}', + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + updated_at TIMESTAMPTZ NOT NULL DEFAULT now(), + + -- FTS: auto-maintained tsvector column + search_vec TSVECTOR GENERATED ALWAYS AS ( 
+ setweight(to_tsvector('english', content), 'A') || + setweight(to_tsvector('english', coalesce(node_type, '')), 'B') || + setweight(array_to_tsvector(tags), 'C') + ) STORED +); + +-- FTS index +CREATE INDEX idx_memories_fts ON memories USING GIN (search_vec); + +-- Vector similarity (HNSW) +CREATE INDEX idx_memories_embedding ON memories + USING hnsw (embedding vector_cosine_ops) + WITH (m = 16, ef_construction = 64); + +-- Common filters +CREATE INDEX idx_memories_domains ON memories USING GIN (domains); +CREATE INDEX idx_memories_node_type ON memories (node_type); +CREATE INDEX idx_memories_tags ON memories USING GIN (tags); +CREATE INDEX idx_memories_created ON memories (created_at); + +-- FSRS scheduling state +CREATE TABLE scheduling ( + memory_id UUID PRIMARY KEY REFERENCES memories(id) ON DELETE CASCADE, + stability DOUBLE PRECISION NOT NULL DEFAULT 0.0, + difficulty DOUBLE PRECISION NOT NULL DEFAULT 0.0, + retrievability DOUBLE PRECISION NOT NULL DEFAULT 1.0, + last_review TIMESTAMPTZ, + next_review TIMESTAMPTZ, + reps INTEGER NOT NULL DEFAULT 0, + lapses INTEGER NOT NULL DEFAULT 0 +); + +CREATE INDEX idx_scheduling_next ON scheduling (next_review); + +-- Graph edges (spreading activation) +-- Edges can cross domain boundaries — spreading activation respects +-- domain filters when provided, traverses freely when searching all domains. 
+CREATE TABLE edges ( + source_id UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE, + target_id UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE, + edge_type TEXT NOT NULL DEFAULT 'related', + weight DOUBLE PRECISION NOT NULL DEFAULT 1.0, + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + PRIMARY KEY (source_id, target_id, edge_type) +); + +CREATE INDEX idx_edges_target ON edges (target_id); + +-- API keys +CREATE TABLE api_keys ( + id UUID PRIMARY KEY DEFAULT gen_random_uuid(), + key_hash TEXT NOT NULL UNIQUE, -- blake3 + label TEXT NOT NULL, + scopes TEXT[] NOT NULL DEFAULT '{read,write}', + domain_filter TEXT[] NOT NULL DEFAULT '{}', -- {} = access all domains + created_at TIMESTAMPTZ NOT NULL DEFAULT now(), + last_used TIMESTAMPTZ, + active BOOLEAN NOT NULL DEFAULT true +); +``` + +**Hybrid search in SQL:** + +```sql +-- RRF (Reciprocal Rank Fusion) combining FTS + vector +-- $1 = query text, $2 = embedding, $3 = limit, $4 = domain filter (NULL for all) +WITH fts AS ( + SELECT id, ts_rank_cd(search_vec, websearch_to_tsquery('english', $1)) AS score, + ROW_NUMBER() OVER (ORDER BY ts_rank_cd(search_vec, websearch_to_tsquery('english', $1)) DESC) AS rank + FROM memories + WHERE search_vec @@ websearch_to_tsquery('english', $1) + AND ($4::text[] IS NULL OR domains && $4) -- array overlap: any match + LIMIT 50 +), +vec AS ( + SELECT id, 1 - (embedding <=> $2::vector) AS score, + ROW_NUMBER() OVER (ORDER BY embedding <=> $2::vector) AS rank + FROM memories + WHERE embedding IS NOT NULL + AND ($4::text[] IS NULL OR domains && $4) + LIMIT 50 +) +SELECT COALESCE(f.id, v.id) AS id, + COALESCE(1.0 / (60 + f.rank), 0) + COALESCE(1.0 / (60 + v.rank), 0) AS rrf_score, + f.score AS fts_score, + v.score AS vector_score +FROM fts f FULL OUTER JOIN vec v ON f.id = v.id +ORDER BY rrf_score DESC +LIMIT $3; +``` + +### Embedding Configuration + +The embedding layer stays external to the storage backend. 
fastembed runs locally and produces vectors that get passed into `MemoryRecord.embedding`. + +```toml +# vestige.toml +[embeddings] +provider = "fastembed" # only local for now +model = "BAAI/bge-base-en-v1.5" # 768 dimensions +# model = "BAAI/bge-large-en-v1.5" # 1024 dimensions +# model = "BAAI/bge-small-en-v1.5" # 384 dimensions + +[storage] +backend = "postgres" # or "sqlite" + +[storage.sqlite] +path = "~/.vestige/vestige.db" + +[storage.postgres] +url = "postgresql://vestige:secret@localhost:5432/vestige" +max_connections = 10 +``` + +On init, the backend reads the embedding dimension from the first stored vector (or from config) and validates consistency. + +For pgvector: you can either create the column as `vector(768)` (fixed, faster) or unconstrained `vector` (flexible, slightly slower). Recommendation: fixed dimension derived from config, with a migration path if the model changes. + +### Emergent Domain Model + +Instead of user-defined tenants, domains emerge automatically from the data via clustering. The user never has to decide where a memory belongs — the system figures it out. + +#### Pipeline + +``` +Phase 1: Accumulate (cold start, 0 → N memories) +│ All memories stored with domains = [] (unclassified) +│ No classification overhead, just embed and store +│ Threshold N is configurable, default ~150 memories +│ +Phase 2: Discover (triggered once at threshold, or manually) +│ Run HDBSCAN on all embeddings: +│ - min_cluster_size: ~10 +│ - min_samples: ~5 +│ - No eps parameter needed (unlike DBSCAN) +│ - Automatically determines number of clusters +│ - Handles variable-density clusters +│ - Border points between clusters flagged naturally +│ +│ For each cluster, extract: +│ - Centroid (mean embedding) +│ - Top terms (TF-IDF or frequency over cluster members) +│ - Suggested label from top terms +│ +│ Present to user (via dashboard or CLI): +│ "I found 3 natural groupings in your memories: +│ ● cluster_0 (47 memories): BGP, SONiC, VLAN, FRR, peering... 
+│ ● cluster_1 (31 memories): solar, kWh, battery, pool, ESPHome... +│ ● cluster_2 (22 memories): Rust, trait, async, zinit, tokio..." +│ +│ User can: +│ - Name them: cluster_0 → "infra", cluster_1 → "home", cluster_2 → "dev" +│ - Accept suggested names +│ - Merge clusters +│ - Do nothing (auto-names stick) +│ +Phase 3: Soft-assign all existing memories +│ Now that centroids exist, re-score every memory (including +│ those from discovery) against all centroids. +│ This replaces HDBSCAN's hard labels with continuous scores: +│ +│ For each memory: +│ similarities = [(domain, cosine_sim(embedding, centroid)) for each domain] +│ domains = [id for (id, score) in similarities if score >= threshold] +│ +│ Memories in overlap zones get multiple domains. +│ Memories far from all centroids stay unclassified. +│ +Phase 4: Classify (ongoing, after discovery) +│ New memory ingested: +│ 1. Compute embedding +│ 2. Compute similarity to ALL domain centroids +│ 3. Store raw scores in domain_scores JSONB +│ 4. Threshold into domains[] array +│ 5. Update domain centroids incrementally (running mean) +│ +│ Context signals as soft priors: +│ - Git repo / IDE metadata → boost similarity to code-related domains +│ - No workspace context → slight boost toward non-technical domains +│ - These shift the score, never override the embedding distance +│ +Phase 5: Re-cluster (periodic, during dream consolidation) + Re-run HDBSCAN on all embeddings including new ones + Detect: + - New clusters forming from previously unclassified memories + - Existing clusters splitting (domain grew too broad) + - Clusters merging (domains that were artificially separate) + Propose changes to user: + "Your 'dev' domain may have split into two groups: + - systems (zinit, MOS, containers, VMs) — 34 memories + - networking (BGP, SONiC, VLANs, MLAG) — 28 memories + Split them? 
[yes / no / later]"
+    Re-run soft assignment on all memories after structural changes
+    Centroid vectors are updated regardless
+```
+
+#### Domain Storage
+
+```rust
+#[derive(Debug, Clone)]
+pub struct Domain {
+    pub id: String,
+    pub label: String,
+    pub centroid: Vec<f32>,
+    pub top_terms: Vec<String>,
+    pub memory_count: usize,
+    pub created_at: chrono::DateTime<chrono::Utc>,
+}
+```
+
+Added to the `MemoryStore` trait:
+
+```rust
+    // --- Domains ---
+    async fn list_domains(&self) -> Result<Vec<Domain>>;
+    async fn get_domain(&self, id: &str) -> Result<Option<Domain>>;
+    async fn upsert_domain(&self, domain: &Domain) -> Result<()>;
+    async fn delete_domain(&self, id: &str) -> Result<()>;
+    async fn classify(&self, embedding: &[f32]) -> Result<Vec<(String, f64)>>;
+    // Returns [(domain_id, similarity)] sorted by similarity desc.
+    // Caller decides threshold for assignment.
+```
+
+#### Classification Module
+
+A new cognitive module alongside FSRS, spreading activation, etc.:
+
+```rust
+pub struct DomainClassifier {
+    /// Similarity threshold — domains scoring above this are assigned
+    pub assign_threshold: f64,      // default: 0.65
+    /// Minimum memories before running initial discovery
+    pub discovery_threshold: usize, // default: 150
+    /// How often to re-cluster (in dream consolidation passes)
+    pub recluster_interval: usize,  // default: every 5th consolidation
+    /// HDBSCAN min_cluster_size
+    pub min_cluster_size: usize,    // default: 10
+}
+
+/// Raw classification result — all scores, before thresholding
+#[derive(Debug, Clone)]
+pub struct ClassificationResult {
+    /// Similarity to every known domain centroid
+    pub scores: HashMap<String, f64>, // {"dev": 0.82, "infra": 0.71, "home": 0.34}
+    /// Domains above assign_threshold
+    pub domains: Vec<String>,         // ["dev", "infra"]
+}
+
+impl DomainClassifier {
+    /// Score a memory against all domain centroids.
+    /// Returns raw scores AND thresholded domain list.
+    pub fn classify(
+        &self,
+        embedding: &[f32],
+        domains: &[Domain],
+    ) -> ClassificationResult {
+        if domains.is_empty() {
+            return ClassificationResult {
+                scores: HashMap::new(),
+                domains: vec![], // still in accumulation phase
+            };
+        }
+
+        let scores: HashMap<String, f64> = domains.iter()
+            .map(|d| (d.id.clone(), cosine_similarity(embedding, &d.centroid)))
+            .collect();
+
+        let assigned: Vec<String> = scores.iter()
+            .filter(|&(_, &s)| s >= self.assign_threshold)
+            .map(|(id, _)| id.clone())
+            .collect();
+
+        ClassificationResult { scores, domains: assigned }
+    }
+
+    /// Soft-assign all existing memories after discovery or re-clustering.
+    /// Returns number of memories whose domains changed.
+    pub async fn reassign_all(
+        &self,
+        store: &dyn MemoryStore,
+        domains: &[Domain],
+    ) -> Result<usize> {
+        // Load all memories, re-score, update domains + domain_scores
+        // Batched to avoid loading everything into memory at once
+        todo!()
+    }
+}
+```
+
+**Key distinction from the previous design:** there's no "closest wins" or "margin" logic. Every domain gets a score, and *all* domains above threshold are assigned. A memory about "deploying zinit containers via BGP-routed network" might score 0.78 on "dev" and 0.72 on "infra" — it gets both. A memory about "solar panel output today" scores 0.85 on "home" and 0.31 on everything else — it only gets "home".
+
+The raw `domain_scores` are always stored, so you (or the dashboard) can see *why* a memory was classified the way it was, and the threshold can be adjusted retroactively without re-computing embeddings.
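
A minimal, self-contained sketch of this thresholding rule, using the toy scores from the paragraph above (illustrative names, not the actual `DomainClassifier` implementation):

```rust
use std::collections::HashMap;

/// Toy version of the assignment step: every domain whose raw
/// similarity clears the threshold is kept. Illustrative only.
fn assign_domains(scores: &HashMap<String, f64>, assign_threshold: f64) -> Vec<String> {
    let mut assigned: Vec<String> = scores
        .iter()
        .filter(|&(_, &s)| s >= assign_threshold)
        .map(|(id, _)| id.clone())
        .collect();
    assigned.sort(); // deterministic order for display
    assigned
}

fn main() {
    // Scores like the "zinit containers via BGP" example above.
    let scores = HashMap::from([
        ("dev".to_string(), 0.78),
        ("infra".to_string(), 0.72),
        ("home".to_string(), 0.31),
    ]);
    assert_eq!(assign_domains(&scores, 0.65), vec!["dev", "infra"]);
    // Lowering the threshold retroactively assigns more domains
    // without re-computing any embeddings.
    assert_eq!(assign_domains(&scores, 0.30), vec!["dev", "home", "infra"]);
}
```

The second assertion is the retroactive-tuning property: because raw `domain_scores` are persisted, changing `assign_threshold` is a cheap re-filter, not a re-embedding.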
+
+#### Search Behavior
+
+- **Default (no domain filter)**: searches all memories across all domains
+- **Domain-scoped**: `domains: Some(vec!["dev"])` — only memories tagged with `dev`
+- **Multi-domain**: `domains: Some(vec!["dev", "infra"])` — memories in either
+- **MCP clients can set `X-Vestige-Domain` header** for default scoping, but the system works fine without it
+
+#### HDBSCAN Implementation
+
+HDBSCAN (Hierarchical DBSCAN) over the embedding vectors. Advantages over plain DBSCAN:
+
+- **No `eps` parameter** — the hardest thing to tune in DBSCAN. HDBSCAN determines density thresholds from the data hierarchy.
+- **Variable-density clusters** — a tight cluster of networking memories and a spread-out cluster of personal memories are both detected correctly.
+- **Border points** — memories between clusters are identified as low-confidence members, which aligns perfectly with soft assignment.
+
+Implementation: the `hdbscan` crate in Rust. Load all embeddings into memory (at 768d × f32 × 10k memories ≈ 30MB — fine), cluster, compute centroids, soft-assign all memories against the centroids.
+
+```rust
+use std::collections::HashMap;
+
+use hdbscan::{Center, Hdbscan};
+
+fn discover_domains(
+    embeddings: &[Vec<f32>],
+    min_cluster_size: usize,
+) -> (Vec<Vec<usize>>, Vec<Vec<f32>>) { // (cluster → member indices, centroids)
+    let clusterer = Hdbscan::default(embeddings);
+    let labels = clusterer.cluster().unwrap();
+    let centroids = clusterer.calc_centers(Center::Centroid, &labels).unwrap();
+
+    // Group indices by label, ignoring noise (-1)
+    let mut clusters: HashMap<i32, Vec<usize>> = HashMap::new();
+    for (i, &label) in labels.iter().enumerate() {
+        if label >= 0 {
+            clusters.entry(label).or_default().push(i);
+        }
+    }
+    (clusters.into_values().collect(), centroids)
+}
+```
+
+After HDBSCAN produces hard clusters, the soft-assignment pass (Phase 3) immediately re-scores all memories — including the ones HDBSCAN assigned — against the computed centroids. So HDBSCAN's hard labels are only used to *define* the centroids.
The actual domain assignments always come from the continuous similarity scores.

This works identically for both SQLite and Postgres backends — clustering runs in Rust application code, results are written back to the storage layer.

### Network Transport

#### MCP over Streamable HTTP

Extend the existing Axum server:

```rust
// Alongside existing dashboard routes
let app = Router::new()
    // Existing dashboard
    .route("/api/health", get(health_handler))
    .route("/dashboard/*path", get(dashboard_handler))
    // New: MCP over HTTP
    .route("/mcp", post(mcp_handler).get(mcp_sse_handler))
    // New: REST API
    // X-Vestige-Domain header optionally scopes to a domain
    .route("/api/v1/memories", post(create_memory).get(list_memories))
    .route("/api/v1/memories/:id", get(get_memory).put(update_memory).delete(delete_memory))
    .route("/api/v1/search", post(search_memories))
    .route("/api/v1/consolidate", post(trigger_consolidation))
    .route("/api/v1/stats", get(get_stats))
    .route("/api/v1/domains", get(list_domains))
    .route("/api/v1/domains/discover", post(trigger_discovery))
    .route("/api/v1/domains/:id", put(rename_domain).delete(merge_domain))
    // Auth on everything except health (the middleware itself skips /api/health).
    // `from_fn_with_state` because `api_key_auth` extracts `State`; `store` is the app state.
    .layer(middleware::from_fn_with_state(store.clone(), api_key_auth));
```

#### Auth Middleware

```rust
async fn api_key_auth(
    State(store): State<Arc<dyn MemoryStore>>,
    request: axum::extract::Request,
    next: middleware::Next,
) -> Result<Response, StatusCode> {
    // Skip auth for health endpoint
    if request.uri().path() == "/api/health" {
        return Ok(next.run(request).await);
    }

    let key = request.headers()
        .get("Authorization")
        .and_then(|v| v.to_str().ok())
        .and_then(|v| v.strip_prefix("Bearer "))
        .or_else(|| request.headers()
            .get("X-API-Key")
            .and_then(|v| v.to_str().ok()));

    // `.await` is not allowed in a match guard, so branch explicitly
    match key {
        Some(k) => {
            if verify_api_key(store.as_ref(), k).await {
                Ok(next.run(request).await)
            } else {
                Err(StatusCode::UNAUTHORIZED)
            }
        }
        None => Err(StatusCode::UNAUTHORIZED),
    }
}
```

#### Client Configuration

```json
// Claude Desktop / Claude Code — single key, all domains
{
  "mcpServers": {
    "vestige": {
      "url": "http://vestige.local:3927/mcp",
      "headers": {
        "Authorization": "Bearer vst_a1b2c3..."
      }
    }
  }
}
```

No domain header needed — searches all domains by default. The MCP tools include an optional `domain` parameter for scoped queries if the LLM or user wants to narrow down.

Alternatively, scope a connection to a specific domain:

```json
// Domain-scoped connection (e.g., for a home automation agent)
{
  "mcpServers": {
    "vestige-home": {
      "url": "http://vestige.local:3927/mcp",
      "headers": {
        "Authorization": "Bearer vst_e5f6g7...",
        "X-Vestige-Domain": "home"
      }
    }
  }
}
```

### Server Configuration

```toml
# vestige.toml — full example for server mode
[server]
bind = "0.0.0.0:3927"            # or mycelium IPv6 address
# tls_cert = "/path/to/cert.pem" # optional
# tls_key = "/path/to/key.pem"

[auth]
enabled = true
# If false, no key required (local-only mode)

[storage]
backend = "postgres"

[storage.postgres]
url = "postgresql://vestige:secret@localhost:5432/vestige"
max_connections = 10

[embeddings]
provider = "fastembed"
model = "BAAI/bge-base-en-v1.5"
```

### CLI Extensions

```bash
# Domain management (mostly automatic, but user can inspect/rename)
vestige domains list
# → dev    Development (auto)     memories: 87   top: Rust, trait, async, tokio
# → infra  Infrastructure (auto)  memories: 47   top: BGP, SONiC, VLAN, FRR
# → home   Home (auto)            memories: 31   top: solar, kWh, pool, ESPHome
# → (unclassified)                memories: 12

vestige domains rename cluster_0 infra --label "Infrastructure"
vestige domains merge home personal --into home
vestige domains discover --force  # re-run HDBSCAN now

# Key management
vestige keys create --label "macbook"
# → Created key: vst_a1b2c3d4... (store this, shown once)

vestige keys create --label "home-assistant" --scopes read --domains home
# → Created key: vst_e5f6g7h8... (read-only, home domain only)

vestige keys list
# → macbook         vst_a1b2...  scopes: [read,write]  domains: [all]
# → home-assistant  vst_e5f6...  scopes: [read]        domains: [home]

vestige keys revoke vst_a1b2c3d4...

# Migration
vestige migrate --from sqlite --to postgres \
  --sqlite-path ~/.vestige/vestige.db \
  --postgres-url postgresql://localhost/vestige
```

## Implementation Plan

### Phase 1: Storage Trait Extraction
- Define the `MemoryStore` trait (including domain methods)
- Refactor the current SQLite code to implement it
- Add a `domains` column to the existing SQLite schema (a JSON array in a TEXT column; SQLite has no native array type)
- Verify all 29 modules work through the trait (no direct SQLite access)
- **No behavioral changes** — all memories start as unclassified

### Phase 2: PostgreSQL Backend
- Implement `PgMemoryStore`
- Schema migrations (sqlx or refinery)
- `vestige migrate` command for SQLite → Postgres
- Config file support for backend selection

### Phase 3: Network Access
- MCP Streamable HTTP endpoint on the existing Axum server
- API key auth middleware + CLI management
- REST API endpoints
- Feature flags for stdio vs. HTTP mode

### Phase 4: Emergent Domain Classification
- `DomainClassifier` cognitive module
- HDBSCAN clustering via the `hdbscan` crate (runs on both backends)
- Soft assignment pass: score all memories against centroids, threshold into domains
- `domain_scores` stored per memory (JSONB on Postgres, JSON text on SQLite) for transparency / retroactive re-thresholding
- Domain discovery CLI and dashboard UI
- Auto-classification on ingest (once domains exist)
- Re-clustering during dream consolidation passes
- Domain management CLI (rename, merge, inspect)

### Phase 5: Federation (future)
- Node discovery via Mycelium / mDNS
- Memory sync protocol (UUID-based, last-write-wins)
- Possibly Iroh for content-addressed replication
- FSRS state merge (review history append, not overwrite)

## Crate Dependencies (new)

```toml
# Phase 1 — trait abstraction
trait-variant = "0.1"

# Phase 2 — Postgres
sqlx = { version = "0.8", features = ["runtime-tokio", "postgres", "uuid", "chrono", "json"] }
pgvector = { version = "0.4", features = ["sqlx"] } # vector type integration for sqlx

# Phase 3 — Auth
blake3 = "1"  # key hashing
rand = "0.8"  # key generation

# Phase 4 — Domain clustering
hdbscan = "0.10"  # no eps tuning, variable density, built-in centroid calc
```

## Open Questions

1. **Trait granularity**: One big `MemoryStore` trait or split into `MemoryStore + SchedulingStore + GraphStore + DomainStore`? Splitting is cleaner but means more `dyn` parameters threading through handlers. (Settled in the Decision above: one big trait.)

2. **Embedding on insert**: Should the storage backend call fastembed, or should the caller always provide the embedding? The current design says the caller provides it, keeping the backend pure storage. But this means every client needs fastembed locally even if the DB is remote. For the server model, having the server compute embeddings makes more sense.

3. **pgvector dimension**: Fixed (e.g., `vector(768)`) or unconstrained (`vector`)? Fixed is faster for HNSW but requires a migration if the model changes.

4. **Sync conflict resolution for federation**: LWW per-UUID is simple but lossy. CRDTs would be more correct but massively more complex. For FSRS state specifically, merging review event logs would be ideal.

5. **Dashboard auth**: The 3D dashboard currently runs unauthenticated on localhost. With remote access, it needs the same auth. Should it use the same API keys or a separate session/cookie mechanism? (Settled in the Decision above: API keys for login, then signed session cookies.)

6. **HDBSCAN `min_cluster_size`**: The main tuning knob. Too small → noisy micro-clusters. Too large → distinct topics get merged. A default of 10 should work for most cases, but may need a manual override or an auto-sweep (run with several values, pick the one with the best silhouette score).

7. **Domain drift**: Over time, the character of a domain changes. How aggressively should re-clustering reshape existing domains?
Conservative (only propose splits/merges, never auto-apply) vs. aggressive (auto-reassign memories whose scores drifted below threshold)? (Settled in the Decision above: conservative.)

8. **Spreading activation across domains**: When searching within a single domain, should graph edges that cross into other domains be followed? Probably yes for recall quality, but with decaying weight as you cross boundaries.

9. **Threshold tuning**: The `assign_threshold` (0.65 default) determines how many memories are multi-domain vs. single-domain vs. unclassified. Too low → everything is multi-domain (useless). Too high → too many unclassified. Could be auto-tuned per dataset by targeting a specific unclassified ratio (e.g., "keep fewer than 10% unclassified").
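The auto-tuning idea in question 9 can be sketched concretely. This is a hypothetical helper, not part of the design above; the candidate grid and the per-memory max-score input are assumptions:

```rust
/// Hypothetical sketch for auto-tuning `assign_threshold`: given each memory's
/// highest domain similarity, pick the strictest candidate threshold that
/// leaves at most `target` of memories unclassified (no domain >= threshold).
fn tune_threshold(max_scores: &[f32], target: f32) -> f32 {
    let n = max_scores.len().max(1) as f32;
    // Try strict thresholds first; accept the first within the budget.
    for &t in &[0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.50] {
        let unclassified = max_scores.iter().filter(|&&s| s < t).count() as f32 / n;
        if unclassified <= target {
            return t;
        }
    }
    0.50 // fall back to the loosest candidate
}
```

With max scores `[0.9, 0.85, 0.6, 0.55]` and a 25% budget this returns 0.60; tightening the budget to 10% pushes it down to 0.55.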