Jan De Landtsheer 0d273c5641
docs: ADR 0001 + Phase 1-4 implementation plans
Pluggable storage backend, network access, and emergent domain
classification. Introduces MemoryStore + Embedder traits, PgMemoryStore
alongside SqliteMemoryStore, HTTP MCP + API key auth, and HDBSCAN-based
domain clustering. Phase 5 federation deferred to a follow-up ADR.

- docs/adr/0001-pluggable-storage-and-network-access.md -- Accepted
- docs/plans/0001-phase-1-storage-trait-extraction.md
- docs/plans/0002-phase-2-postgres-backend.md
- docs/plans/0003-phase-3-network-access.md
- docs/plans/0004-phase-4-emergent-domain-classification.md
- docs/prd/001-getting-centralized-vestige.md -- source RFC
2026-04-22 12:10:24 +02:00


Phase 2 Plan: PostgreSQL Backend

Status: Draft
Depends on: Phase 1 (MemoryStore + Embedder traits, embedding_model registry, domain columns)
Related: docs/adr/0001-pluggable-storage-and-network-access.md (Phase 2), docs/prd/001-getting-centralized-vestige.md


Scope

In scope

  • PgMemoryStore struct implementing the Phase 1 MemoryStore trait against sqlx::PgPool, including compile-time checked queries via sqlx::query! / sqlx::query_as!.
  • First-class pgvector integration: typed Vector columns, HNSW index (vector_cosine_ops, m = 16, ef_construction = 64), and use of the cosine-distance operator <=>.
  • First-class Postgres FTS: GENERATED tsvector column (search_vec) with setweight (A=content, B=node_type, C=tags), GIN index, and websearch_to_tsquery at query time.
  • Hybrid search via Reciprocal Rank Fusion (RRF) expressed as a single SQL statement with CTEs for FTS and vector subqueries, with optional domain filter through array overlap (&&).
  • sqlx migrations directory at crates/vestige-core/migrations/postgres/, numbered {NNNN}_{name}.up.sql / {NNNN}_{name}.down.sql, runnable by sqlx::migrate! at startup and by sqlx-cli.
  • Offline query cache committed under crates/vestige-core/.sqlx/ so a DATABASE_URL is not required at build time.
  • Backend selection via vestige.toml: [storage] section with backend = "sqlite" | "postgres" plus the per-backend subsection ([storage.sqlite], [storage.postgres]). Exclusive at compile time via postgres-backend feature, exclusive at runtime via the enum.
  • CLI: vestige migrate --from sqlite --to postgres --sqlite-path <p> --postgres-url <u> -- streaming copy with progress output.
  • CLI: vestige migrate --reembed --model=<new> -- O(n) re-embed under a new Embedder, registry update, HNSW rebuild.
  • Testcontainer-based integration tests using the pgvector/pgvector:pg16 image, behind the postgres-backend feature so SQLite-only builds remain untouched.
  • PgMemoryStore parity with SqliteMemoryStore across every public MemoryStore method defined in Phase 1.
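
A minimal vestige.toml sketch for the backend selection described above (values illustrative; key names follow the PostgresConfig fields defined in D7, and the final nesting depends on the serde representation chosen there):

```toml
[storage]
backend = "postgres"

[storage.postgres]
url = "postgres://vestige@localhost:5432/vestige"
max_connections = 10
acquire_timeout_secs = 30
```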

Out of scope

  • Phase 3 (network access): HTTP MCP transport, API key auth, vestige keys CLI. The api_keys DDL is declared by Phase 3; Phase 2 does not create it.
  • Phase 4 (emergent domain classification): DomainClassifier, HDBSCAN, discover / rename / merge CLI. Phase 2 provisions the domains and domain_scores columns and the domains table structure so Phase 4 slots in without further migration, but does not compute or classify.
  • Phase 5 (federation): cross-node sync. The review_events table is declared in Phase 1; Phase 2 only references it where FSRS writes happen.
  • Changes to the cognitive engine, Phase 1 traits, or the embedding pipeline itself. Phase 2 only adds a backend.
  • SQLCipher parity for Postgres. Operator responsibility (TLS to Postgres, pgcrypto, disk-level encryption) is out of scope for this phase.

Prerequisites

Expected Phase 1 artifacts (consumed, not produced)

Phase 2 treats all of the following as fixed interfaces. Each path is the expected Phase 1 location.

  • crates/vestige-core/src/storage/mod.rs -- re-exports the trait and the two concrete backends.
  • crates/vestige-core/src/storage/memory_store.rs -- defines the MemoryStore trait (generated by trait_variant::make from LocalMemoryStore) with the full CRUD, search, FSRS, graph, and domain surface from the PRD. Phase 2 implements every method here.
  • crates/vestige-core/src/storage/types.rs -- shared value types: MemoryRecord, SchedulingState, SearchQuery, SearchResult, MemoryEdge, Domain, StoreStats, HealthStatus.
  • crates/vestige-core/src/storage/error.rs -- StoreError enum plus pub type StoreResult<T> = Result<T, StoreError>. Phase 2 extends this with StoreError::Postgres(sqlx::Error) and StoreError::Migrate(sqlx::migrate::MigrateError) via From impls (the variants themselves MUST live behind #[cfg(feature = "postgres-backend")]).
  • crates/vestige-core/src/embedder/mod.rs -- Embedder trait with embed, model_name, dimension, model_hash. Phase 2 calls model_name(), dimension(), and model_hash() for the registry.
  • crates/vestige-core/src/storage/sqlite.rs -- SqliteMemoryStore: MemoryStore. Phase 2's migrate --from sqlite --to postgres uses this as the source.
  • crates/vestige-core/src/storage/registry.rs -- EmbeddingModelRegistry abstraction that both backends implement. Phase 2 supplies a Postgres version writing to embedding_model.
  • crates/vestige-core/migrations/sqlite/ -- V12 (Phase 1) adds domains TEXT (JSON-encoded array), domain_scores TEXT (JSON), embedding_model(name, dimension, hash, created_at), and review_events(id, memory_id, timestamp, rating, prior_state, new_state). Phase 2 mirrors every column and table in Postgres.

If any of the above is missing when Phase 2 starts, the first action is to surface the gap back to Phase 1 -- do NOT backfill a partial trait in Phase 2.

Required crates (declared in Phase 2, not installed by this doc)

The agent running Phase 2 uses cargo add in crates/vestige-core/ for each dependency below. Exact versions and feature flags:

  • sqlx@0.8 with features runtime-tokio, tls-rustls, postgres, uuid, chrono, json, migrate, macros. Optional (gated by postgres-backend).
  • pgvector@0.4 with features sqlx. Optional (gated by postgres-backend).
  • deadpool is NOT needed; sqlx::PgPool is the pool.
  • toml@0.8 (no features) for vestige.toml parsing. Moved to non-optional because both backends share the config surface.
  • figment@0.10 with features toml, env -- optional, only if Phase 1 has not already picked a config loader. If Phase 1 ships a loader, skip figment and reuse.
  • dirs@6 -- already pulled in transitively by directories; reuse the existing version rather than adding a new requirement.
  • tokio-stream@0.1 (no features). Used by migrate commands for streamed iteration.
  • indicatif@0.17 (no features). Progress bars for the migrate CLI.
  • futures@0.3 with features std. Consumed by sqlx stream combinators.

Dev-only (under [dev-dependencies] in crates/vestige-core/Cargo.toml, gated by postgres-backend):

  • testcontainers@0.22 with default features (async runtime on, blocking off).
  • testcontainers-modules@0.10 with features postgres.
  • tokio@1 features macros, rt-multi-thread (already present for core tests).
  • criterion@0.5 already present; add a new [[bench]] entry.

Feature additions in crates/vestige-core/Cargo.toml:

[features]
postgres-backend = ["dep:sqlx", "dep:pgvector", "dep:tokio-stream", "dep:futures"]

postgres-backend is OFF by default. default = ["embeddings", "vector-search", "bundled-sqlite"] stays unchanged. vestige-mcp forwards a new feature postgres-backend = ["vestige-core/postgres-backend"].
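
The vestige-mcp forwarding named above is a one-line feature entry (sketch):

```toml
# crates/vestige-mcp/Cargo.toml
[features]
postgres-backend = ["vestige-core/postgres-backend"]
```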

External tooling

  • PostgreSQL 16 or newer. gen_random_uuid() has been a core function since Postgres 13, so pgcrypto is not strictly required for it; migration 0001 still issues CREATE EXTENSION pgcrypto defensively. pgvector HNSW indexes require pgvector 0.5 or newer.
  • The pgvector extension installed in the target database (our migration issues CREATE EXTENSION IF NOT EXISTS vector).
  • sqlx-cli@0.8 installed on the developer machine for cargo sqlx prepare --workspace and cargo sqlx migrate add (not a build-time requirement once .sqlx/ is committed).
  • Docker or Podman reachable by the test harness for testcontainers-modules::postgres to spin up pgvector/pgvector:pg16.

Assumed Rust toolchain

  • Rust 2024 edition.
  • MSRV 1.91 (per CLAUDE.md). sqlx 0.8 is compatible.
  • rustflags unchanged. No nightly-only features.

Deliverables

  1. Feature gate postgres-backend in crates/vestige-core/Cargo.toml and crates/vestige-mcp/Cargo.toml that cleanly disables all Postgres code paths when off.
  2. crates/vestige-core/src/storage/postgres/mod.rs -- PgMemoryStore struct and MemoryStore trait impl (public entry point).
  3. crates/vestige-core/src/storage/postgres/pool.rs -- PgMemoryStore::connect(config) and pool configuration.
  4. crates/vestige-core/src/storage/postgres/search.rs -- RRF hybrid search query builder and row -> SearchResult mapping.
  5. crates/vestige-core/src/storage/postgres/migrations.rs -- wraps sqlx::migrate!("./migrations/postgres") and surfaces typed errors.
  6. crates/vestige-core/src/storage/postgres/registry.rs -- Postgres EmbeddingModelRegistry implementation writing embedding_model.
  7. crates/vestige-core/migrations/postgres/0001_init.up.sql + 0001_init.down.sql -- extensions, memories, scheduling, edges, domains, embedding_model, review_events, all indexes.
  8. crates/vestige-core/migrations/postgres/0002_hnsw.up.sql + 0002_hnsw.down.sql -- HNSW index creation separated so it can be CREATE INDEX CONCURRENTLY during reembed.
  9. crates/vestige-core/src/config.rs -- VestigeConfig, StorageConfig, SqliteConfig, PostgresConfig, EmbeddingsConfig, plus a single VestigeConfig::load(path: Option<&Path>) returning Result<Self, ConfigError>.
  10. crates/vestige-core/src/storage/postgres/migrate_cli.rs -- streaming SQLite-to-Postgres copy, domain-aware, with indicatif progress.
  11. crates/vestige-core/src/storage/postgres/reembed.rs -- ReembedPlan and its driver; re-encodes all memories via a supplied Embedder, updates embedding_model, rebuilds HNSW.
  12. crates/vestige-mcp/src/bin/cli.rs -- two new clap subcommands Migrate (union of --from/--to and --reembed variants, one subcommand or two, see Open Questions) wired to deliverables 10 and 11.
  13. crates/vestige-core/.sqlx/ -- offline query cache, committed.
  14. tests/phase_2/ -- six integration test files listed in the Test Plan.
  15. crates/vestige-core/benches/pg_hybrid_search.rs -- Criterion benches for RRF search at 1k and 100k memories, gated by postgres-backend.
  16. docs/runbook/postgres.md -- brief ops note covering extension install, max_connections, backup discipline, and rollback caveats. (Short; only required for the "rollback of migrate" deliverable.)

Detailed Task Breakdown

D1. postgres-backend feature gate

  • File: crates/vestige-core/Cargo.toml, crates/vestige-mcp/Cargo.toml
  • Depends on: nothing; this is the first change.
  • Rust snippets:
# crates/vestige-core/Cargo.toml
[features]
default = ["embeddings", "vector-search", "bundled-sqlite"]
bundled-sqlite = ["rusqlite/bundled"]
encryption = ["rusqlite/bundled-sqlcipher"]
postgres-backend = [
    "dep:sqlx",
    "dep:pgvector",
    "dep:tokio-stream",
    "dep:futures",
]

[dependencies]
sqlx = { version = "0.8", default-features = false, features = [
    "runtime-tokio", "tls-rustls", "postgres", "uuid", "chrono",
    "json", "migrate", "macros",
], optional = true }
pgvector = { version = "0.4", features = ["sqlx"], optional = true }
tokio-stream = { version = "0.1", optional = true }
futures = { version = "0.3", optional = true }
toml = "0.8"
indicatif = "0.17"
  • Behavior notes: keep the two backends mutually compilable per CLAUDE.md. Every use sqlx::... sits under #[cfg(feature = "postgres-backend")]. Every module under crates/vestige-core/src/storage/postgres/ carries #![cfg(feature = "postgres-backend")] as its file-level attribute.

D2. PgMemoryStore core struct

  • File: crates/vestige-core/src/storage/postgres/mod.rs
  • Depends on: D1, Phase 1 MemoryStore trait and value types.
  • Signatures:
#![cfg(feature = "postgres-backend")]

use std::sync::Arc;
use std::time::Duration;

use chrono::{DateTime, Utc};
use pgvector::Vector;
use sqlx::postgres::{PgConnectOptions, PgPoolOptions};
use sqlx::PgPool;
use uuid::Uuid;

use crate::embedder::Embedder;
use crate::storage::error::{StoreError, StoreResult};
use crate::storage::types::{
    Domain, HealthStatus, MemoryEdge, MemoryRecord, SchedulingState,
    SearchQuery, SearchResult, StoreStats,
};
use crate::storage::memory_store::LocalMemoryStore;

pub mod migrations;
pub mod pool;
pub mod registry;
pub mod search;
pub mod migrate_cli;
pub mod reembed;

/// Postgres-backed implementation of `MemoryStore`.
///
/// Cheaply cloneable. Methods take `&self`; interior state lives inside
/// the `PgPool` (which already provides `Sync` via `Arc` internally).
#[derive(Clone)]
pub struct PgMemoryStore {
    pool: PgPool,
    embedding_dim: i32,
    embedding_model: Arc<EmbeddingModelDescriptor>,
}

#[derive(Debug, Clone)]
pub struct EmbeddingModelDescriptor {
    pub name: String,
    pub dimension: i32,
    pub hash: String,
}

impl PgMemoryStore {
    /// Construct a new store. Runs migrations, reads the registry, validates
    /// that the embedder matches the registered model.
    pub async fn connect(
        url: &str,
        max_connections: u32,
        embedder: &dyn Embedder,
    ) -> StoreResult<Self>;

    /// Low-level constructor for tests: supply an existing pool, skip migrate.
    pub async fn from_pool(
        pool: PgPool,
        embedder: &dyn Embedder,
    ) -> StoreResult<Self>;

    /// Accessor used by migrate/reembed CLI.
    pub fn pool(&self) -> &PgPool { &self.pool }

    pub fn embedding_dim(&self) -> i32 { self.embedding_dim }
}

// `MemoryStore` is the Send-bounded variant that Phase 1 generates from
// `LocalMemoryStore` via `trait_variant::make`; the attribute belongs on the
// trait definition, so this impl targets the generated trait directly.
impl MemoryStore for PgMemoryStore {
    async fn init(&self) -> StoreResult<()>;
    async fn health_check(&self) -> StoreResult<HealthStatus>;

    async fn insert(&self, record: &MemoryRecord) -> StoreResult<Uuid>;
    async fn get(&self, id: Uuid) -> StoreResult<Option<MemoryRecord>>;
    async fn update(&self, record: &MemoryRecord) -> StoreResult<()>;
    async fn delete(&self, id: Uuid) -> StoreResult<()>;

    async fn search(&self, query: &SearchQuery) -> StoreResult<Vec<SearchResult>>;
    async fn fts_search(&self, text: &str, limit: usize) -> StoreResult<Vec<SearchResult>>;
    async fn vector_search(&self, embedding: &[f32], limit: usize) -> StoreResult<Vec<SearchResult>>;

    async fn get_scheduling(&self, memory_id: Uuid) -> StoreResult<Option<SchedulingState>>;
    async fn update_scheduling(&self, state: &SchedulingState) -> StoreResult<()>;
    async fn get_due_memories(
        &self,
        before: DateTime<Utc>,
        limit: usize,
    ) -> StoreResult<Vec<(MemoryRecord, SchedulingState)>>;

    async fn add_edge(&self, edge: &MemoryEdge) -> StoreResult<()>;
    async fn get_edges(&self, node_id: Uuid, edge_type: Option<&str>) -> StoreResult<Vec<MemoryEdge>>;
    async fn remove_edge(&self, source: Uuid, target: Uuid, edge_type: &str) -> StoreResult<()>;
    async fn get_neighbors(&self, node_id: Uuid, depth: usize) -> StoreResult<Vec<(MemoryRecord, f64)>>;

    async fn list_domains(&self) -> StoreResult<Vec<Domain>>;
    async fn get_domain(&self, id: &str) -> StoreResult<Option<Domain>>;
    async fn upsert_domain(&self, domain: &Domain) -> StoreResult<()>;
    async fn delete_domain(&self, id: &str) -> StoreResult<()>;
    async fn classify(&self, embedding: &[f32]) -> StoreResult<Vec<(String, f64)>>;

    async fn count(&self) -> StoreResult<usize>;
    async fn get_stats(&self) -> StoreResult<StoreStats>;
    async fn vacuum(&self) -> StoreResult<()>;
}
  • SQL (inline within impl methods): every call uses sqlx::query! or sqlx::query_as! for compile-time validation. Examples:
// insert
sqlx::query!(
    r#"
    INSERT INTO memories (
        id, domains, domain_scores, content, node_type, tags,
        embedding, metadata, created_at, updated_at
    ) VALUES ($1, $2, $3, $4, $5, $6, $7::vector, $8, $9, $10)
    "#,
    record.id,
    &record.domains as &[String],
    serde_json::to_value(&record.domain_scores)?,
    record.content,
    record.node_type,
    &record.tags as &[String],
    record.embedding.as_ref().map(|v| Vector::from(v.clone())) as Option<Vector>,
    record.metadata,
    record.created_at,
    record.updated_at,
)
.execute(&self.pool)
.await?;
  • Behavior notes:
    • StoreError gets two new variants behind the feature:
#[cfg(feature = "postgres-backend")]
#[error("postgres error: {0}")]
Postgres(#[from] sqlx::Error),

#[cfg(feature = "postgres-backend")]
#[error("postgres migration error: {0}")]
Migrate(#[from] sqlx::migrate::MigrateError),
  • classify() on Postgres implements the PRD's cosine-similarity-to-centroid computation inside SQL using 1 - (centroid <=> $1::vector) over the domains table and returns rows sorted descending. This mirrors the behavior a DomainClassifier in Phase 4 uses; Phase 2 ships the backend capability but does not call it.
  • Connection pool defaults (see D3): max_connections = 10, acquire_timeout = 30s, idle_timeout = 600s, test_before_acquire = false (cheap queries; avoid per-acquire roundtrip).
  • All methods are async fn and use sqlx's tokio runtime feature; no blocking block_on.
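
As a sanity reference for the classify() note above: pgvector's <=> operator returns cosine distance, so 1 - (centroid <=> $1::vector) is plain cosine similarity. A self-contained Rust mirror of that arithmetic (illustrative only; the shipped path computes it in SQL over the domains table):

```rust
/// Plain cosine similarity; pgvector's `<=>` is cosine *distance*,
/// so `1 - (a <=> b)` equals this value.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| f64::from(*x) * f64::from(*y)).sum();
    let norm = |v: &[f32]| v.iter().map(|x| f64::from(*x).powi(2)).sum::<f64>().sqrt();
    dot / (norm(a) * norm(b))
}

fn main() {
    // Same direction -> 1.0; orthogonal -> 0.0.
    assert!((cosine_similarity(&[1.0, 0.0], &[2.0, 0.0]) - 1.0).abs() < 1e-12);
    assert!(cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]).abs() < 1e-12);
    // Hypothetical centroids: the closer one wins the descending sort.
    let q = [0.6f32, 0.8];
    assert!(cosine_similarity(&q, &[0.6, 0.8]) > cosine_similarity(&q, &[1.0, 0.0]));
}
```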

D3. Pool construction and config wiring

  • File: crates/vestige-core/src/storage/postgres/pool.rs
  • Depends on: D1, D2, D9.
  • Signatures:
#![cfg(feature = "postgres-backend")]

use sqlx::postgres::{PgConnectOptions, PgPoolOptions};
use sqlx::{ConnectOptions, PgPool};
use std::str::FromStr;
use std::time::Duration;

use crate::config::PostgresConfig;
use crate::storage::error::{StoreError, StoreResult};

pub async fn build_pool(cfg: &PostgresConfig) -> StoreResult<PgPool> {
    let mut opts = PgConnectOptions::from_str(&cfg.url)?;
    opts = opts
        .application_name("vestige")
        .statement_cache_capacity(256)
        .log_statements(tracing::log::LevelFilter::Debug);

    let pool = PgPoolOptions::new()
        .max_connections(cfg.max_connections.unwrap_or(10))
        .min_connections(0)
        .acquire_timeout(Duration::from_secs(cfg.acquire_timeout_secs.unwrap_or(30)))
        .idle_timeout(Some(Duration::from_secs(600)))
        .max_lifetime(Some(Duration::from_secs(1800)))
        .test_before_acquire(false)
        .connect_with(opts)
        .await?;

    Ok(pool)
}
  • Behavior notes: the default acquire timeout (30 s) accommodates the 30-second testcontainer spin-up budget. application_name = "vestige" makes pg_stat_activity readable from psql during debugging.

D4. sqlx migrations directory

  • File: crates/vestige-core/migrations/postgres/0001_init.up.sql, 0001_init.down.sql, 0002_hnsw.up.sql, 0002_hnsw.down.sql.
  • Depends on: none (pure SQL).

0001_init.up.sql:

-- Extensions
CREATE EXTENSION IF NOT EXISTS pgcrypto;
CREATE EXTENSION IF NOT EXISTS vector;

-- Embedding model registry
-- Mirrors the SQLite table created in Phase 1.
CREATE TABLE embedding_model (
    id          SMALLINT PRIMARY KEY DEFAULT 1 CHECK (id = 1),
    name        TEXT NOT NULL,
    dimension   INTEGER NOT NULL CHECK (dimension > 0),
    hash        TEXT NOT NULL,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now()
);

-- Domains table (populated by Phase 4 DomainClassifier; Phase 2 only creates
-- the empty table so list/get/upsert/delete work against both backends).
CREATE TABLE domains (
    id           TEXT PRIMARY KEY,
    label        TEXT NOT NULL,
    centroid     vector,
    top_terms    TEXT[] NOT NULL DEFAULT '{}',
    memory_count INTEGER NOT NULL DEFAULT 0,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    metadata     JSONB NOT NULL DEFAULT '{}'::jsonb
);

-- Core memories table
CREATE TABLE memories (
    id              UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    domains         TEXT[] NOT NULL DEFAULT '{}',
    domain_scores   JSONB NOT NULL DEFAULT '{}'::jsonb,
    content         TEXT NOT NULL,
    node_type       TEXT NOT NULL DEFAULT 'general',
    tags            TEXT[] NOT NULL DEFAULT '{}',
    embedding       vector,
    metadata        JSONB NOT NULL DEFAULT '{}'::jsonb,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
    search_vec      TSVECTOR GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(content, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(node_type, '')), 'B') ||
        setweight(array_to_tsvector(tags), 'C')
    ) STORED
);

-- FSRS scheduling state (1:1 with memories)
CREATE TABLE scheduling (
    memory_id       UUID PRIMARY KEY REFERENCES memories(id) ON DELETE CASCADE,
    stability       DOUBLE PRECISION NOT NULL DEFAULT 0.0,
    difficulty      DOUBLE PRECISION NOT NULL DEFAULT 0.0,
    retrievability  DOUBLE PRECISION NOT NULL DEFAULT 1.0,
    last_review     TIMESTAMPTZ,
    next_review     TIMESTAMPTZ,
    reps            INTEGER NOT NULL DEFAULT 0,
    lapses          INTEGER NOT NULL DEFAULT 0
);

-- Graph edges (spreading activation)
CREATE TABLE edges (
    source_id   UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    target_id   UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    edge_type   TEXT NOT NULL DEFAULT 'related',
    weight      DOUBLE PRECISION NOT NULL DEFAULT 1.0,
    created_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    PRIMARY KEY (source_id, target_id, edge_type)
);

-- FSRS review event log (Phase 1 creates this; Phase 2 mirrors it for Postgres).
-- Append-only. Used for future federation (Phase 5).
CREATE TABLE review_events (
    id              BIGSERIAL PRIMARY KEY,
    memory_id       UUID NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    timestamp       TIMESTAMPTZ NOT NULL DEFAULT now(),
    rating          SMALLINT NOT NULL,
    prior_state     JSONB NOT NULL,
    new_state       JSONB NOT NULL
);

-- Indexes on memories (vector index is declared separately in 0002_hnsw.up.sql)
CREATE INDEX idx_memories_fts ON memories USING GIN (search_vec);
CREATE INDEX idx_memories_domains ON memories USING GIN (domains);
CREATE INDEX idx_memories_tags ON memories USING GIN (tags);
CREATE INDEX idx_memories_node_type ON memories (node_type);
CREATE INDEX idx_memories_created ON memories (created_at);
CREATE INDEX idx_memories_updated ON memories (updated_at);

-- Indexes on scheduling
CREATE INDEX idx_scheduling_next_review ON scheduling (next_review);
CREATE INDEX idx_scheduling_last_review ON scheduling (last_review);

-- Indexes on edges
CREATE INDEX idx_edges_target ON edges (target_id);
CREATE INDEX idx_edges_source ON edges (source_id);
CREATE INDEX idx_edges_type ON edges (edge_type);

-- Indexes on review_events
CREATE INDEX idx_review_events_memory ON review_events (memory_id);
CREATE INDEX idx_review_events_ts ON review_events (timestamp);

-- Update trigger on memories.updated_at
CREATE OR REPLACE FUNCTION memories_set_updated_at() RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at := now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_memories_updated_at
BEFORE UPDATE ON memories
FOR EACH ROW EXECUTE FUNCTION memories_set_updated_at();

0001_init.down.sql:

DROP TRIGGER IF EXISTS trg_memories_updated_at ON memories;
DROP FUNCTION IF EXISTS memories_set_updated_at();

DROP INDEX IF EXISTS idx_review_events_ts;
DROP INDEX IF EXISTS idx_review_events_memory;
DROP INDEX IF EXISTS idx_edges_type;
DROP INDEX IF EXISTS idx_edges_source;
DROP INDEX IF EXISTS idx_edges_target;
DROP INDEX IF EXISTS idx_scheduling_last_review;
DROP INDEX IF EXISTS idx_scheduling_next_review;
DROP INDEX IF EXISTS idx_memories_updated;
DROP INDEX IF EXISTS idx_memories_created;
DROP INDEX IF EXISTS idx_memories_node_type;
DROP INDEX IF EXISTS idx_memories_tags;
DROP INDEX IF EXISTS idx_memories_domains;
DROP INDEX IF EXISTS idx_memories_fts;

DROP TABLE IF EXISTS review_events;
DROP TABLE IF EXISTS edges;
DROP TABLE IF EXISTS scheduling;
DROP TABLE IF EXISTS memories;
DROP TABLE IF EXISTS domains;
DROP TABLE IF EXISTS embedding_model;

0002_hnsw.up.sql (separated so reembed can drop-and-recreate without touching the rest of the schema):

-- HNSW index on memories.embedding.
-- pgvector requires the column to have a typmod (fixed dimension) for HNSW.
-- The dimension is stamped by the application at startup via ALTER TABLE
-- using the embedder's dimension() method (see PgMemoryStore::connect).
-- We express the index with the generic vector_cosine_ops operator class.
CREATE INDEX idx_memories_embedding_hnsw
    ON memories USING hnsw (embedding vector_cosine_ops)
    WITH (m = 16, ef_construction = 64);

0002_hnsw.down.sql:

DROP INDEX IF EXISTS idx_memories_embedding_hnsw;
  • Behavior notes:
    • pgvector HNSW requires a typmod. PgMemoryStore::connect runs ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N) with $N = embedder.dimension() exactly once, guarded by a check against embedding_model (first startup ever) or validated against it on subsequent starts. If embedder.dimension() differs from the stored one and embedding_model is non-empty, return StoreError::EmbeddingDimensionMismatch -- the user must run vestige migrate --reembed.
    • ALTER COLUMN ... TYPE vector($N) on a populated column fails unless the data fits; that is the desired safety net.
    • The tsvector GENERATED column must use array_to_tsvector(tags) for weight C, as in the PRD sketch. Generation expressions are required to be IMMUTABLE; array_to_string is only STABLE and is rejected by Postgres in a generated column, while array_to_tsvector is a core immutable function (since 9.6). Tags therefore skip stemming, which is acceptable for weight C.
    • gen_random_uuid() has been a core function since Postgres 13, so on our Postgres 16 floor the pgcrypto extension is not strictly required for it; migration 0001 keeps CREATE EXTENSION IF NOT EXISTS pgcrypto as a defensive measure.
    • MVCC: all table writes are transactional; no explicit locks. INSERT ... ON CONFLICT DO UPDATE is used in upsert_domain, update_scheduling, and edge idempotency.
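
A hedged sketch of the ON CONFLICT shape mentioned above, written for update_scheduling against the 0001 DDL (column list from the scheduling table; not the final query text):

```sql
INSERT INTO scheduling (memory_id, stability, difficulty, retrievability,
                        last_review, next_review, reps, lapses)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
ON CONFLICT (memory_id) DO UPDATE SET
    stability      = EXCLUDED.stability,
    difficulty     = EXCLUDED.difficulty,
    retrievability = EXCLUDED.retrievability,
    last_review    = EXCLUDED.last_review,
    next_review    = EXCLUDED.next_review,
    reps           = EXCLUDED.reps,
    lapses         = EXCLUDED.lapses;
```

upsert_domain and edge idempotency follow the same pattern keyed on their primary keys.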

D5. Hybrid search via RRF

  • File: crates/vestige-core/src/storage/postgres/search.rs
  • Depends on: D2, D4.
  • Signatures:
#![cfg(feature = "postgres-backend")]

use pgvector::Vector;
use sqlx::PgPool;
use uuid::Uuid;

use crate::storage::error::StoreResult;
use crate::storage::types::{SearchQuery, SearchResult};

const RRF_K: i32 = 60;              // constant from Cormack et al. 2009
const OVERFETCH_MULT: i64 = 3;      // matches Phase 1 SQLite overfetch

pub(crate) async fn rrf_search(
    pool: &PgPool,
    query: &SearchQuery,
) -> StoreResult<Vec<SearchResult>>;

SQL for the full hybrid RRF query. Placeholders:

  • $1 = text query (string, may be empty)
  • $2 = embedding (vector)
  • $3 = overfetch limit per branch (int)
  • $4 = final limit (int)
  • $5 = domain filter (text[] or NULL)
  • $6 = node_type filter (text[] or NULL)
  • $7 = tag filter (text[] or NULL)
WITH params AS (
    SELECT
        $1::text  AS q_text,
        $2::vector AS q_vec,
        $3::int    AS overfetch,
        $4::int    AS final_limit,
        $5::text[] AS dom_filter,
        $6::text[] AS nt_filter,
        $7::text[] AS tag_filter
),
fts AS (
    SELECT m.id,
           ts_rank_cd(m.search_vec, websearch_to_tsquery('english', p.q_text)) AS score,
           ROW_NUMBER() OVER (
               ORDER BY ts_rank_cd(m.search_vec, websearch_to_tsquery('english', p.q_text)) DESC
           ) AS rank
    FROM memories m, params p
    WHERE p.q_text <> ''
      AND m.search_vec @@ websearch_to_tsquery('english', p.q_text)
      AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
      AND (p.nt_filter  IS NULL OR m.node_type = ANY(p.nt_filter))
      AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
    -- ORDER BY is required: LIMIT without it keeps arbitrary rows, not the
    -- top-ranked ones.
    ORDER BY rank
    LIMIT (SELECT overfetch FROM params)
),
vec AS (
    SELECT m.id,
           1 - (m.embedding <=> p.q_vec) AS score,
           ROW_NUMBER() OVER (
               ORDER BY m.embedding <=> p.q_vec
           ) AS rank
    FROM memories m, params p
    WHERE m.embedding IS NOT NULL
      AND p.q_vec IS NOT NULL
      AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
      AND (p.nt_filter  IS NULL OR m.node_type = ANY(p.nt_filter))
      AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
    ORDER BY rank
    LIMIT (SELECT overfetch FROM params)
),
fused AS (
    SELECT COALESCE(f.id, v.id) AS id,
           COALESCE(1.0 / (60 + f.rank), 0.0)       -- 60 = RRF_K
         + COALESCE(1.0 / (60 + v.rank), 0.0) AS rrf_score,
           f.score AS fts_score,
           v.score AS vector_score
    FROM fts f FULL OUTER JOIN vec v ON f.id = v.id
)
SELECT m.id                  AS "id!: Uuid",
       m.domains             AS "domains!: Vec<String>",
       m.domain_scores       AS "domain_scores!: serde_json::Value",
       m.content             AS "content!",
       m.node_type           AS "node_type!",
       m.tags                AS "tags!: Vec<String>",
       m.embedding           AS "embedding?: Vector",
       m.metadata            AS "metadata!: serde_json::Value",
       m.created_at          AS "created_at!: chrono::DateTime<chrono::Utc>",
       m.updated_at          AS "updated_at!: chrono::DateTime<chrono::Utc>",
       fused.rrf_score       AS "rrf_score!: f64",
       fused.fts_score       AS "fts_score?: f64",
       fused.vector_score    AS "vector_score?: f64"
FROM fused
JOIN memories m ON m.id = fused.id
ORDER BY fused.rrf_score DESC
LIMIT (SELECT final_limit FROM params);
  • Behavior notes:
    • OVERFETCH_MULT * query.limit is passed as $3. Final $4 is query.limit.
    • Empty text query is allowed; the fts CTE returns zero rows (p.q_text <> '') and the result degrades to pure vector search, which matches vector_search behavior.
    • Null embedding is allowed; the vec CTE returns zero rows and the result degrades to pure FTS, which matches fts_search behavior.
    • fts_search and vector_search are separate public methods on the trait. Each uses a simpler single-CTE query derived from the above by removing the other branch. Implementing them as thin wrappers over rrf_search with nullified inputs is acceptable but adds one extra plan per call; the explicit implementations win on latency.
    • min_retrievability in SearchQuery is applied as a final filter by joining on scheduling in the outer SELECT. Adding that join unconditionally regresses simple searches; add it only when query.min_retrievability.is_some().
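
The fused CTE's arithmetic can be checked in isolation. A pure-Rust mirror of the RRF fusion step (illustrative; doc ids and ranks are made up, and the real implementation does this inside the SQL statement above):

```rust
use std::collections::HashMap;

/// RRF constant from Cormack et al. 2009, matching the SQL literal `60`.
const RRF_K: f64 = 60.0;

/// Mirror of the `fused` CTE: each branch contributes 1 / (RRF_K + rank);
/// ranks are 1-based, as ROW_NUMBER() yields them.
fn rrf_fuse(fts_ranks: &[(u32, i64)], vec_ranks: &[(u32, i64)]) -> Vec<(u32, f64)> {
    let mut scores: HashMap<u32, f64> = HashMap::new();
    for &(id, rank) in fts_ranks.iter().chain(vec_ranks.iter()) {
        *scores.entry(id).or_insert(0.0) += 1.0 / (RRF_K + rank as f64);
    }
    let mut out: Vec<(u32, f64)> = scores.into_iter().collect();
    // Matches `ORDER BY fused.rrf_score DESC` in the outer SELECT.
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    // Doc 3 ranks second in FTS and first in vector search; appearing in
    // both branches beats doc 7's single first place: 1/62 + 1/61 > 1/61.
    let fused = rrf_fuse(&[(7, 1), (3, 2)], &[(3, 1), (9, 2)]);
    let order: Vec<u32> = fused.iter().map(|(id, _)| *id).collect();
    assert_eq!(order, vec![3, 7, 9]);
    println!("{fused:?}");
}
```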

D6. embedding_model registry impl

  • File: crates/vestige-core/src/storage/postgres/registry.rs
  • Depends on: D1, D4 (table exists), Phase 1 EmbeddingModelRegistry trait.
  • Signatures:
#![cfg(feature = "postgres-backend")]

use sqlx::PgPool;

use crate::embedder::Embedder;
use crate::storage::error::{StoreError, StoreResult};

pub(crate) async fn ensure_registry(
    pool: &PgPool,
    embedder: &dyn Embedder,
) -> StoreResult<()> {
    let row = sqlx::query!(
        r#"SELECT name, dimension, hash FROM embedding_model WHERE id = 1"#
    )
    .fetch_optional(pool)
    .await?;

    match row {
        None => {
            sqlx::query!(
                r#"
                INSERT INTO embedding_model (id, name, dimension, hash)
                VALUES (1, $1, $2, $3)
                "#,
                embedder.model_name(),
                embedder.dimension() as i32,
                embedder.model_hash(),
            )
            .execute(pool)
            .await?;

            // First-ever run: stamp the vector column typmod.
            let ddl = format!(
                "ALTER TABLE memories ALTER COLUMN embedding TYPE vector({})",
                embedder.dimension()
            );
            sqlx::query(&ddl).execute(pool).await?;
            Ok(())
        }
        Some(r) if r.name == embedder.model_name()
            && r.dimension == embedder.dimension() as i32
            && r.hash == embedder.model_hash() => Ok(()),
        Some(r) => Err(StoreError::EmbeddingMismatch {
            expected: format!("{} ({}d, {})", r.name, r.dimension, r.hash),
            got: format!(
                "{} ({}d, {})",
                embedder.model_name(),
                embedder.dimension(),
                embedder.model_hash()
            ),
        }),
    }
}

pub(crate) async fn update_registry(
    pool: &PgPool,
    embedder: &dyn Embedder,
) -> StoreResult<()> {
    // Used only by `vestige migrate --reembed` after a full re-encode.
    sqlx::query!(
        r#"
        UPDATE embedding_model
        SET name = $1, dimension = $2, hash = $3, created_at = now()
        WHERE id = 1
        "#,
        embedder.model_name(),
        embedder.dimension() as i32,
        embedder.model_hash(),
    )
    .execute(pool)
    .await?;
    Ok(())
}
  • Behavior notes:
    • StoreError::EmbeddingMismatch { expected, got } already exists in Phase 1; Phase 2 just constructs it.
    • The ALTER TABLE ... TYPE vector(N) DDL is only issued on first init. On subsequent inits the existing typmod already matches.
    • Re-embed flow also uses this module, but the DDL path is different -- see D9.

D7. VestigeConfig: vestige.toml backend selection

  • File: crates/vestige-core/src/config.rs (Phase 1 may already own this file; Phase 2 extends, not replaces)
  • Depends on: D1.
  • Signatures:
use std::path::{Path, PathBuf};

use serde::Deserialize;

#[derive(Debug, Clone, Deserialize)]
pub struct VestigeConfig {
    #[serde(default)]
    pub embeddings: EmbeddingsConfig,
    #[serde(default)]
    pub storage: StorageConfig,
    #[serde(default)]
    pub server: ServerConfig,
    #[serde(default)]
    pub auth: AuthConfig,
}

#[derive(Debug, Clone, Deserialize)]
pub struct EmbeddingsConfig {
    pub provider: String,   // "fastembed"
    pub model: String,      // "BAAI/bge-base-en-v1.5"
}

#[derive(Debug, Clone, Deserialize)]
#[serde(tag = "backend", rename_all = "lowercase")]
pub enum StorageConfig {
    Sqlite(SqliteConfig),
    #[cfg(feature = "postgres-backend")]
    Postgres(PostgresConfig),
}

#[derive(Debug, Clone, Deserialize)]
pub struct SqliteConfig {
    pub path: PathBuf,
}

#[cfg(feature = "postgres-backend")]
#[derive(Debug, Clone, Deserialize)]
pub struct PostgresConfig {
    pub url: String,
    #[serde(default)]
    pub max_connections: Option<u32>,
    #[serde(default)]
    pub acquire_timeout_secs: Option<u64>,
}

#[derive(Debug, Clone, Default, Deserialize)]
pub struct ServerConfig { /* Phase 3 fills this in */ }

#[derive(Debug, Clone, Default, Deserialize)]
pub struct AuthConfig { /* Phase 3 fills this in */ }

impl VestigeConfig {
    pub fn load(path: Option<&Path>) -> Result<Self, ConfigError>;
    pub fn default_path() -> PathBuf;  // ~/.vestige/vestige.toml
}

#[derive(Debug, thiserror::Error)]
pub enum ConfigError {
    #[error("io: {0}")]
    Io(#[from] std::io::Error),
    #[error("toml: {0}")]
    Toml(#[from] toml::de::Error),
    #[error("invalid config: {0}")]
    Invalid(String),
}
  • Behavior notes:
    • With #[serde(tag = "backend")], the variant's fields deserialize inline under [storage] (backend = "postgres" followed by url = ..., max_connections = ...). The PRD's nested [storage.postgres] subsection layout is not what an internally tagged enum produces; supporting it would require a custom Deserialize impl, so Phase 2 standardizes on the inline form.
    • Because StorageConfig is #[serde(tag = "backend")], an unknown backend string returns a clear "unknown variant" error.
    • If postgres-backend is compiled off and the user writes backend = "postgres", deserialization returns "unknown variant postgres" -- loud failure. Phase 2 wraps this into ConfigError::Invalid("postgres-backend feature not compiled in").
    • env-override hooks (e.g., VESTIGE_POSTGRES_URL) are a Phase 3 concern; not added here.
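A matching vestige.toml might read as follows, using the inline field placement that the internally tagged enum deserializes. All values are illustrative placeholders, not defaults mandated by this plan:

```toml
[embeddings]
provider = "fastembed"
model = "BAAI/bge-base-en-v1.5"

[storage]
backend = "postgres"
url = "postgresql://localhost/vestige"
max_connections = 10
acquire_timeout_secs = 30
```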

D8. vestige migrate --from sqlite --to postgres

  • File: crates/vestige-core/src/storage/postgres/migrate_cli.rs
  • Depends on: D2, D6, D7, Phase 1 SqliteMemoryStore.
  • Signatures:
#![cfg(feature = "postgres-backend")]

use std::path::Path;
use std::sync::Arc;

use futures::{StreamExt, TryStreamExt};
use indicatif::{ProgressBar, ProgressStyle};
use uuid::Uuid;

use crate::embedder::Embedder;
use crate::storage::error::{StoreError, StoreResult};
use crate::storage::postgres::PgMemoryStore;
use crate::storage::sqlite::SqliteMemoryStore;

#[derive(Debug, Clone)]
pub struct SqliteToPostgresPlan {
    pub sqlite_path: std::path::PathBuf,
    pub postgres_url: String,
    pub max_connections: u32,
    pub batch_size: usize,  // default 500
}

pub struct MigrationReport {
    pub memories_copied: u64,
    pub scheduling_rows: u64,
    pub edges_copied: u64,
    pub review_events_copied: u64,
    pub domains_copied: u64,
    pub errors: Vec<(Uuid, StoreError)>,
}

pub async fn run_sqlite_to_postgres(
    plan: SqliteToPostgresPlan,
    embedder: Arc<dyn Embedder>,
) -> StoreResult<MigrationReport>;

Algorithm:

  1. Open source SqliteMemoryStore in read-only mode (?mode=ro).
  2. Check source embedding_model registry; refuse if it disagrees with the supplied embedder unless the user also passed --reembed.
  3. Open destination PgMemoryStore via connect (runs migrations, stamps dim).
  4. Stream source rows in batches of plan.batch_size via a windowed query ordered by created_at, id (stable cursor; survives resume).
  5. For each batch: begin a Postgres transaction, INSERT INTO memories ... ON CONFLICT (id) DO NOTHING for all rows, INSERT INTO scheduling likewise, commit. Copy domain assignments (domains, domain_scores) verbatim -- they are [] and {} for pre-Phase-4 SQLite data.
  6. After memories finish, stream edges and review_events the same way.
  7. Emit progress via indicatif::ProgressBar (one bar per table, multi-bar). Each 1000 rows log to tracing at INFO.
  8. Return MigrationReport for the caller to print.
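Step 5's idempotent batch write can be sketched as a single multi-row statement. The column list here is abbreviated for illustration; the real statement carries the full memories column list:

```sql
-- Hypothetical step-5 batch copy; ON CONFLICT makes partial runs resumable.
INSERT INTO memories (id, content, created_at)
SELECT * FROM UNNEST($1::uuid[], $2::text[], $3::timestamptz[])
    AS t(id, content, created_at)
ON CONFLICT (id) DO NOTHING;
```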
  • Behavior notes:
    • Memory-bounded: batch size 500 and sqlx streams mean memory usage stays O(batch * row_size), not O(total_rows).
    • Idempotent: re-running replays only the rows not already present; ON CONFLICT DO NOTHING means partial runs recover.
    • UUID strings from SQLite are parsed via Uuid::parse_str -- any mangled ID pushes to errors instead of aborting.
    • The FTS search_vec is regenerated by Postgres via the GENERATED column; no data to copy.
    • review_events may not exist in Phase 1 SQLite for pre-V12 databases. The migrator detects missing tables via SELECT name FROM sqlite_master and skips gracefully.
    • A separate --dry-run flag prints the counts per table without writing.
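The stable cursor from step 4 can lean on SQLite's row-value comparisons (available since SQLite 3.15). Column list is illustrative:

```sql
-- Keyset pagination over the source SQLite DB; resuming after a crash
-- just replays from the last committed (?1, ?2) cursor position.
SELECT id, content, created_at
FROM memories
WHERE (created_at, id) > (?1, ?2)
ORDER BY created_at, id
LIMIT ?3;
```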

D9. vestige migrate --reembed --model=<new>

  • File: crates/vestige-core/src/storage/postgres/reembed.rs
  • Depends on: D2, D6, Phase 1 Embedder.
  • Signatures:
#![cfg(feature = "postgres-backend")]

use std::sync::Arc;
use std::time::Instant;

use futures::TryStreamExt;
use indicatif::{ProgressBar, ProgressStyle};
use sqlx::PgPool;
use uuid::Uuid;

use crate::embedder::Embedder;
use crate::storage::error::{StoreError, StoreResult};
use crate::storage::postgres::PgMemoryStore;

#[derive(Debug, Clone)]
pub struct ReembedPlan {
    pub batch_size: usize,           // default 128 (embedder batch)
    pub drop_hnsw_first: bool,       // default true
    pub concurrent_index: bool,      // default false; use CREATE INDEX (not CONCURRENTLY)
}

pub struct ReembedReport {
    pub rows_updated: u64,
    pub duration_secs: f64,
    pub index_rebuild_secs: f64,
}

pub async fn run_reembed(
    store: &PgMemoryStore,
    new_embedder: Arc<dyn Embedder>,
    plan: ReembedPlan,
) -> StoreResult<ReembedReport>;

Algorithm:

  1. Verify new_embedder.dimension() != stored dimension OR new_embedder.model_hash() != stored hash -- otherwise no-op and return rows_updated = 0.
  2. If the dimension is changing, relax the column's typmod before any updates: ALTER TABLE memories ALTER COLUMN embedding TYPE vector; (typmod-less), so new-dimension writes succeed while old-dimension rows remain. No DROP NOT NULL is required; the column is already nullable.
  3. If plan.drop_hnsw_first, execute DROP INDEX IF EXISTS idx_memories_embedding_hnsw; so updates are not slowed by index maintenance. This is the recommended path; REINDEX is kept in the Open Questions as an alternative.
  4. Stream all id, content from memories ordered by id.
  5. For each batch of plan.batch_size: call new_embedder.embed_batch(&texts) (Phase 1 trait exposes batched embedding when available; otherwise loop single embed). Then:
UPDATE memories
-- NB: $2 must be vector[], not real[][]: unnest() flattens multidimensional
-- arrays to scalars, which breaks the row pairing. vector is a base type, so
-- unnest($2::vector[]) yields one vector per row.
SET embedding = v.embedding
FROM UNNEST($1::uuid[], $2::vector[]) AS v(id, embedding)
WHERE memories.id = v.id;
  6. After all rows are updated: run ALTER TABLE memories ALTER COLUMN embedding TYPE vector($NEW_DIM) if the dimension changed.
  7. Rebuild HNSW. If plan.concurrent_index, execute CREATE INDEX CONCURRENTLY idx_memories_embedding_hnsw ...; else CREATE INDEX idx_memories_embedding_hnsw ....
  8. update_registry with the new embedder.
  9. Return ReembedReport.
  • Behavior notes:
    • Memory-bounded: batch_size * 2 (old + new texts) vectors in RAM at any time.
    • The dimension change must happen AFTER all rows are updated (pgvector validates typmod on write when a typmod is present; we relax-then-tighten).
    • CONCURRENTLY builds do not hold AccessExclusiveLock, but fail inside a transaction. That's why the outer driver runs index DDL as an autocommit statement (sqlx execute outside a pool transaction).
    • For --dry-run, emit what would happen (row count, estimated embedder calls, estimated time using rows / 50-per-second baseline for local fastembed) and exit.
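The relax-then-tighten DDL sequence can be sketched as follows. 1024 is an example target dimension; index parameters follow the Scope section:

```sql
-- Before the batch UPDATEs: drop the typmod so mixed-dimension writes succeed.
ALTER TABLE memories ALTER COLUMN embedding TYPE vector;

-- ... batch UPDATEs write the new-dimension vectors here ...

-- After all rows are updated: tighten to the new dimension, then rebuild HNSW.
ALTER TABLE memories ALTER COLUMN embedding TYPE vector(1024);
CREATE INDEX idx_memories_embedding_hnsw ON memories
    USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
```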

D10. CLI wiring in vestige-mcp

  • File: crates/vestige-mcp/src/bin/cli.rs
  • Depends on: D8, D9, D7. Requires vestige-mcp Cargo feature postgres-backend.
  • Signatures:
#[derive(Subcommand)]
enum Commands {
    // existing variants: Stats, Health, Consolidate, Restore, Backup,
    // Export, Gc, Dashboard, Ingest, Serve ...

    /// Migrate between backends or re-embed memories.
    #[cfg(feature = "postgres-backend")]
    Migrate(MigrateArgs),
}

#[derive(clap::Args)]
#[cfg(feature = "postgres-backend")]
struct MigrateArgs {
    #[command(subcommand)]
    action: MigrateAction,
}

#[derive(Subcommand)]
#[cfg(feature = "postgres-backend")]
enum MigrateAction {
    /// Copy all memories from SQLite to Postgres.
    #[command(name = "copy")]
    Copy {
        #[arg(long)]
        from: String,            // "sqlite"
        #[arg(long)]
        to: String,              // "postgres"
        #[arg(long)]
        sqlite_path: PathBuf,
        #[arg(long)]
        postgres_url: String,
        #[arg(long, default_value = "500")]
        batch_size: usize,
        #[arg(long)]
        dry_run: bool,
    },
    /// Re-embed all memories with a new embedder.
    #[command(name = "reembed")]
    Reembed {
        #[arg(long)]
        model: String,
        #[arg(long, default_value = "128")]
        batch_size: usize,
        #[arg(long, default_value_t = true)]
        drop_hnsw_first: bool,
        #[arg(long)]
        concurrent_index: bool,
        #[arg(long)]
        dry_run: bool,
    },
}

The user-facing invocation collapses to the exact string requested by the ADR:

vestige migrate copy --from sqlite --to postgres \
    --sqlite-path ~/.vestige/vestige.db \
    --postgres-url postgresql://localhost/vestige

vestige migrate reembed --model=BAAI/bge-large-en-v1.5

An alternate top-level layout (single vestige migrate with flags --from, --to, --reembed) is equivalent; the subcommand split is preferred because the two flag sets are disjoint (see Open Question 1).

  • Behavior notes:
    • --from/--to values are validated; the current Phase 2 build accepts only sqlite and postgres.
    • For reembed, the --model string resolves to an Embedder via a factory already provided by Phase 1 (Embedder::from_name(&str)); Phase 2 does not invent new embedder constructors.
    • Progress output on stderr; machine-readable summary on stdout as one-line JSON when --json is set (skipped for Phase 2 unless trivial).

D11. Offline query cache (.sqlx/)

  • File: crates/vestige-core/.sqlx/ (committed directory of query-*.json)
  • Depends on: all sqlx::query! call sites being final.
  • Procedure: the developer runs cargo sqlx prepare --workspace against a live Postgres with the schema applied. Output goes into crates/vestige-core/.sqlx/, which is committed. CI enforces freshness by running cargo sqlx prepare --workspace --check against the same live Postgres; contributors without a database can still verify the cache with SQLX_OFFLINE=true cargo check.
  • Behavior notes: SQLX_OFFLINE=true (set in the environment or via the [env] table in .cargo/config.toml) is the default on CI and for downstream consumers. The vestige-core docs add a one-liner in README for contributors: "if you change any SQL in Phase 2 modules, rerun cargo sqlx prepare with a live DB."

D12. Testcontainer harness (integration)

  • File: tests/phase_2/common/mod.rs (the common convention used in tests/phase_2/ crates)
  • Depends on: D2 through D11.
  • Signatures:
#![cfg(feature = "postgres-backend")]

use std::sync::Arc;

use testcontainers::{runners::AsyncRunner, ContainerAsync, ImageExt};
use testcontainers_modules::postgres::Postgres;

use vestige_core::embedder::Embedder;
use vestige_core::storage::postgres::PgMemoryStore;

pub struct PgHarness {
    pub container: ContainerAsync<Postgres>,
    pub store: PgMemoryStore,
}

impl PgHarness {
    pub async fn start(embedder: Arc<dyn Embedder>) -> anyhow::Result<Self> {
        let container = Postgres::default()
            .with_tag("pg16")
            .with_name("pgvector/pgvector")
            .start()
            .await?;
        let port = container.get_host_port_ipv4(5432).await?;
        let url = format!(
            "postgresql://postgres:postgres@127.0.0.1:{}/postgres", port
        );
        let store = PgMemoryStore::connect(&url, 4, embedder.as_ref()).await?;
        Ok(Self { container, store })
    }
}
  • Behavior notes:
    • Image pgvector/pgvector:pg16 bundles pgvector into the official postgres:16 image.
    • Pool size 4 is enough for tests without starving the container's default max_connections = 100.
    • ContainerAsync is held for the whole test scope; drop tears down the container.
    • A fake TestEmbedder in common/test_embedder.rs provides a deterministic hash-based embedding (no ONNX dependency in CI).
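The deterministic hash-based embedding could be as simple as the following sketch (the function name and bucketing scheme are assumptions, not the actual TestEmbedder contract): hash each whitespace token into a fixed-dimension bucket, then L2-normalize. DefaultHasher uses fixed keys, so output is stable across runs.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministic bag-of-hashed-tokens embedding; no ONNX, reproducible in CI.
fn hash_embed(text: &str, dim: usize) -> Vec<f32> {
    let mut v = vec![0.0f32; dim];
    for token in text.split_whitespace() {
        let mut h = DefaultHasher::new();
        token.hash(&mut h);
        // Each token increments one hash-selected bucket.
        v[(h.finish() as usize) % dim] += 1.0;
    }
    // L2-normalize so cosine distance behaves like a real embedder's output.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut v {
            *x /= norm;
        }
    }
    v
}

fn main() {
    let a = hash_embed("rust async trait", 768);
    assert_eq!(a, hash_embed("rust async trait", 768)); // deterministic
    assert!((a.iter().map(|x| x * x).sum::<f32>().sqrt() - 1.0).abs() < 1e-5);
}
```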

Test Plan

Unit tests (colocated in src/)

Under crates/vestige-core/src/storage/postgres/:

  • pool.rs -- one test per build_pool branch: defaults, explicit max_connections, invalid URL returns StoreError::Postgres.
  • registry.rs -- three tests: first-init writes row and alters typmod, reopen with same embedder returns Ok, reopen with different dimension returns EmbeddingMismatch.
  • search.rs -- query-builder unit tests for parameter packing: empty text, null embedding, all three filters null, all three filters populated.
  • migrate_cli.rs -- SqliteToPostgresPlan::default returns sane defaults; plan validation rejects empty URL.
  • reembed.rs -- ReembedPlan::no_change returns rows_updated == 0 when embedder matches registry (no network call).
  • config.rs -- five tests covering: valid postgres config, valid sqlite config, unknown backend string, missing subsection, feature-gated postgres without feature compiled in.

Integration tests (in tests/phase_2/)

Each file is a full integration test target (declared as a [[test]] entry in the owning crate's Cargo.toml).

tests/phase_2/pg_trait_parity.rs

  • Declares the same test matrix as Phase 1's SQLite trait tests, parameterized over impl MemoryStore.
  • Runs every method: insert, get, update, delete, search, fts_search, vector_search, get_scheduling, update_scheduling, get_due_memories, add_edge, get_edges, remove_edge, get_neighbors, list_domains, get_domain, upsert_domain, delete_domain, classify, count, get_stats, vacuum, health_check.
  • Each test is written once as async fn roundtrip_<method>(store: &dyn MemoryStore) and invoked from two wrappers, one for SQLite and one for Postgres.
  • Acceptance: every method returns equal results (except for Uuid ordering in list_domains where the test sorts before comparing).

tests/phase_2/pg_hybrid_search_rrf.rs

  • Inserts 20 memories with known content ("rust async trait", "postgres hnsw vector", "fastembed onnx model", ...).
  • Case 1: pure FTS. SearchQuery { text: Some("rust trait"), embedding: None, ... } returns the three Rust-related rows in order; fts_score populated, vector_score null.
  • Case 2: pure vector. SearchQuery { text: None, embedding: Some(embed("rust trait")), ... } returns the same three rows via cosine; vector_score populated, fts_score null.
  • Case 3: hybrid. Both set -- top hit has both scores; rrf_score >= 1/(60+1) + 1/(60+1) = 0.0328.
  • Case 4: domain filter. 10 memories tagged with domains = ["dev"], 10 with ["home"]. Query with domains: Some(vec!["dev"]) returns only dev memories.
  • Case 5: edge case -- empty FTS query plus an embedding behaves identically to vector_search; empty embedding plus FTS query behaves identically to fts_search.
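The fusion arithmetic these cases assert can be sketched in plain Rust (k = 60, 1-based ranks, matching the 1/(60+rank) terms in Case 3). This is a standalone reference sketch of RRF, not the SQL implementation:

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: score(d) = sum over result lists of 1/(k + rank(d)),
/// where rank is 1-based and a document absent from a list contributes nothing.
fn rrf_fuse(lists: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for list in lists {
        for (i, id) in list.iter().enumerate() {
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + i as f64 + 1.0);
        }
    }
    let mut out: Vec<(String, f64)> = scores.into_iter().collect();
    out.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    out
}

fn main() {
    let fts = vec!["m1", "m2", "m3"];
    let vec_hits = vec!["m2", "m1", "m4"];
    let fused = rrf_fuse(&[fts, vec_hits], 60.0);
    // m1 and m2 each collect two terms (ranks 1 and 2): 1/61 + 1/62.
    assert!((fused[0].1 - (1.0 / 61.0 + 1.0 / 62.0)).abs() < 1e-12);
    assert_eq!(fused.len(), 4);
}
```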

tests/phase_2/pg_migration_sqlite_to_postgres.rs

  • Populate a fresh SQLite with 10,000 memories (seeded RNG, deterministic content), 4,000 scheduling rows, 2,000 edges.
  • Run run_sqlite_to_postgres with a test embedder.
  • Assert: count() == 10_000 on destination; spot-check 25 memories byte-for-byte (content, tags, metadata, domains, domain_scores).
  • Assert: FSRS fields (stability, difficulty, next_review) preserved per memory.
  • Assert: edges preserved by (source_id, target_id, edge_type).
  • Assert: re-running the migration is a no-op (ON CONFLICT DO NOTHING path); row count unchanged.

tests/phase_2/pg_migration_reembed.rs

  • Start with a fresh store using TestEmbedder768 (768-dim, hash h1). Insert 500 memories.
  • Swap to TestEmbedder1024 (1024-dim, hash h2). Run run_reembed(store, Arc::new(TestEmbedder1024), ReembedPlan::default()).
  • Assert: rows_updated == 500; embedding_model now has (name=TestEmbedder1024, dimension=1024, hash=h2).
  • Assert: SELECT DISTINCT vector_dims(embedding) FROM memories returns only 1024.
  • Assert: HNSW index exists after reembed (SELECT indexname FROM pg_indexes WHERE indexname = 'idx_memories_embedding_hnsw').
  • Assert: memory IDs unchanged (compare pre/post id sets).
  • Assert: a hybrid search using TestEmbedder1024 returns results (post-reembed vectors are queryable).

tests/phase_2/pg_config_parsing.rs

  • Parse six vestige.toml snippets:
    • sqlite + fastembed -> StorageConfig::Sqlite.
    • postgres + fastembed -> StorageConfig::Postgres with max_connections = 10.
    • postgres with custom max_connections = 25 and acquire_timeout_secs = 60.
    • unknown backend "mysql" -> ConfigError.
    • missing subsection [storage.postgres] while backend = "postgres" -> ConfigError.
    • malformed URL (empty) -> ConfigError::Invalid.

tests/phase_2/pg_concurrency.rs

  • Spawn 16 tasks, each inserting 100 memories in parallel for 1,600 total.
  • Spawn 4 tasks concurrently running search queries; none should fail.
  • Spawn 2 tasks concurrently running update_scheduling on overlapping IDs -- last write wins (MVCC), neither errors.
  • Assert: all 1,600 rows present, no deadlocks, every task returns Ok.
  • Run time < 10 seconds on a cold container.

Compile-time query verification

  • CI step: cargo sqlx prepare --workspace --check against a CI-provisioned Postgres (GitHub Actions / Forgejo Actions services block). Fails CI if any query! macro goes stale.
  • Alternative offline run for contributors: SQLX_OFFLINE=true cargo check -p vestige-core --features postgres-backend. CI runs both forms to ensure .sqlx/ is up to date.
  • .sqlx/ is committed to the repo. A .gitattributes entry marks it as linguist-generated=true so it doesn't inflate language stats.

Benchmarks

Under crates/vestige-core/benches/pg_hybrid_search.rs (Criterion), gated by postgres-backend.

  • pg_search_1k -- populate 1,000 memories once per bench suite, measure rrf_search p50/p99 over 500 iterations. Target: p50 < 10ms, p99 < 30ms on a local container.
  • pg_search_100k -- 100,000 memories. Target: p50 < 50ms, p99 < 150ms. Validates HNSW scaling.
  • Testcontainer shared across both benches via once_cell.
  • Bench entry in vestige-core/Cargo.toml:
[[bench]]
name = "pg_hybrid_search"
harness = false
required-features = ["postgres-backend"]

Acceptance Criteria

  • cargo build -p vestige-core --features postgres-backend -- zero warnings.
  • cargo build -p vestige-core (SQLite-only, default features) -- zero warnings; no Postgres symbols referenced.
  • cargo build -p vestige-mcp --features postgres-backend -- zero warnings; vestige binary exposes the migrate subcommand.
  • cargo clippy --workspace --all-targets --all-features -- -D warnings -- clean.
  • cargo sqlx prepare --workspace --check -- returns success; .sqlx/ is current.
  • cargo test -p vestige-core --features postgres-backend --test pg_trait_parity --test pg_hybrid_search_rrf --test pg_migration_sqlite_to_postgres --test pg_migration_reembed --test pg_config_parsing --test pg_concurrency -- all green.
  • Testcontainer spin-up p50 under 30 seconds on a developer laptop with a warm Docker daemon.
  • pg_search_100k Criterion bench reports p50 < 50ms on reference hardware (logged in the ADR comment trail).
  • vestige migrate copy --from sqlite --to postgres on a 10,000-memory corpus completes without data loss: row count parity, content byte-parity on a 1 percent sample, FSRS state preserved (stability, difficulty, reps, lapses, next_review), edge count parity.
  • vestige migrate reembed with a dimension-changing embedder returns to a fully queryable state: HNSW present, embedding_model updated, no stale vectors, memory IDs untouched.
  • Trait parity: every method on MemoryStore has at least one passing test against PgMemoryStore.
  • Phase 1's existing SQLite suite continues to pass with zero changes required (Phase 2 is additive).
  • The postgres-backend feature does not compile in SQLCipher (encryption) simultaneously (mutually exclusive at compile time, per project rule).

Rollback Notes

  • Every *.up.sql has a matching *.down.sql in crates/vestige-core/migrations/postgres/. sqlx migrate revert undoes the most recently applied migration; repeated invocations walk back further. Manual operator procedure: sqlx migrate revert --database-url $URL --source crates/vestige-core/migrations/postgres.
  • vestige migrate copy is a one-way operation. The source SQLite DB is read-only during the run and untouched afterward; users retain their original file indefinitely. Recommended discipline: copy the SQLite file aside before starting, retain for 30 days.
  • vestige migrate reembed is destructive to the embedding column. Recommended discipline: take a logical backup (pg_dump --table=memories --table=embedding_model --table=scheduling) before a reembed run. The tool prints that recommendation before starting and exits non-zero unless --yes is passed or the user is on a TTY that confirms.
  • Feature-gate strategy: the default build remains SQLite-only. Downstream users pull postgres-backend explicitly: cargo install --features postgres-backend vestige-mcp. If the Postgres implementation fails in the field, users fall back to SQLite simply by flipping vestige.toml's [storage] backend = "sqlite" and restarting. No data re-migration is needed if they retained their SQLite file.
  • The docs/runbook/postgres.md deliverable (D16) captures this discipline as a one-page ops note.

Open Implementation Questions

Each item has a recommendation. Ship that unless a reviewer objects.

Q1. CLI shape: subcommand split vs flag union

  • Options: (a) vestige migrate copy --from sqlite --to postgres ... and vestige migrate reembed --model=... (subcommand split); (b) vestige migrate --from sqlite --to postgres ... and vestige migrate --reembed --model=... under one clap command with disjoint flag groups (flag union).
  • RECOMMENDATION: (a) subcommand split. The flag sets do not overlap and clap expresses the constraint more cleanly. The ADR string vestige migrate --from sqlite --to postgres can still be documented as a canonical alias by having copy accept it verbatim when --from is present.

Q2. Feature flag name

  • Options: postgres-backend, postgres, backend-postgres, pg.
  • RECOMMENDATION: postgres-backend. Matches the ADR text and is explicit in Cargo.toml feature listings.

Q3. sqlx offline mode strategy

  • Options: (a) commit .sqlx/ so downstream builds never need DATABASE_URL; (b) require DATABASE_URL at build time.
  • RECOMMENDATION: (a). The repo already ships as a library; many downstream users will build from crates.io with no Postgres available. Committing .sqlx/ costs ~100 kB.

Q4. HNSW rebuild strategy during reembed

  • Options: (a) DROP INDEX; CREATE INDEX; (b) REINDEX INDEX CONCURRENTLY; (c) CREATE INDEX CONCURRENTLY on a new name then swap.
  • RECOMMENDATION: (a) by default for speed on empty / near-empty tables; expose --concurrent-index for large production corpora where locking the table is unacceptable. REINDEX CONCURRENTLY on pgvector HNSW is supported in pgvector 0.6+ but the community still reports edge cases with maintenance_work_mem -- skip unless a user explicitly opts in.

Q5. Connection pool sizing default

  • Options: 4, 10, 20, cpus() * 2.
  • RECOMMENDATION: 10. Matches the PRD example, covers a single-operator load, and does not exhaust the default Postgres max_connections = 100. Configurable via vestige.toml.

Q6. Testcontainer image pinning

  • Options: (a) pgvector/pgvector:pg16; (b) pgvector/pgvector:pg16.2-0.7.4 (exact tag); (c) maintain local Dockerfile.
  • RECOMMENDATION: (b) pin exact. The float tag pg16 has shipped breaking changes in the past (e.g., pg 16.0 to 16.1 interop). Pin to a specific pgvector minor and Postgres patch. CI bumps the tag via a single-line change.

Q7. Search with neither text nor embedding

  • Options: (a) return an error if both are missing; (b) return an empty result; (c) return all memories sorted by created_at DESC.
  • RECOMMENDATION: (a). A search call with no query is a bug in the caller; returning empty silently would hide the bug. The existing Phase 1 SQLite behavior (TBD but likely errors) is the tiebreaker.

Q8. classify() SQL vs Rust

  • Options: (a) compute cosine to all centroids in SQL (SELECT id, 1 - (centroid <=> $1::vector) FROM domains ORDER BY ...); (b) load centroids, compute in Rust.
  • RECOMMENDATION: (a). Leverages pgvector's SIMD paths and avoids round-tripping centroid vectors. At Phase 4 scale (tens of centroids) the difference is marginal, but the SQL path is simpler and matches the rest of the backend.
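Option (a) might look like the following. The domains table shape (centroid column, name) is assumed from the Phase 1/Phase 4 schema, and the LIMIT parameter is illustrative:

```sql
-- Cosine similarity of a memory's embedding against every domain centroid;
-- <=> is pgvector's cosine-distance operator, so 1 - distance = similarity.
SELECT id, name, 1 - (centroid <=> $1::vector) AS similarity
FROM domains
ORDER BY centroid <=> $1::vector
LIMIT $2;
```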

Q9. FSRS review_events writes: trait method vs implicit on update_scheduling

  • Options: (a) add an explicit record_review(memory_id, rating, prior, new) method to the Phase 1 trait; (b) have update_scheduling write the event atomically.
  • RECOMMENDATION: this is a Phase 1 question, not Phase 2. Phase 2 implements whichever Phase 1 chose. If Phase 1 missed it, Phase 2 raises a blocker rather than deciding alone.

Q10. tsvector weight for tags -- PRD used array_to_tsvector, we used array_to_string

  • Options: (a) array_to_tsvector(tags) (core Postgres since 9.6, but it inserts each tag as a raw lexeme -- no stemming, stopword handling, or positions); (b) to_tsvector('english', array_to_string(tags, ' ')) (normalized the same way as the content column).
  • RECOMMENDATION: (b). Consistent normalization across all of search_vec, zero special cases. If a future tag matches a stopword ("the"), it gets dropped, but that is correct behavior for ranking.
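A sketch of the option-(b) generated column, with weights per the Scope section. One wrinkle this sketch assumes is handled in the real migration: array_to_string() is only STABLE, and generated columns require IMMUTABLE expressions, so it needs a wrapper:

```sql
-- Generated columns reject STABLE functions; wrap array_to_string as IMMUTABLE.
CREATE FUNCTION immutable_array_to_string(text[], text) RETURNS text
    LANGUAGE sql IMMUTABLE AS $$ SELECT array_to_string($1, $2) $$;

ALTER TABLE memories ADD COLUMN search_vec tsvector
    GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(content, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(node_type, '')), 'B') ||
        setweight(to_tsvector('english', immutable_array_to_string(tags, ' ')), 'C')
    ) STORED;

CREATE INDEX idx_memories_search_vec ON memories USING gin (search_vec);
```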

Q11. PgMemoryStore::connect runs migrations automatically?

  • Options: (a) always run sqlx::migrate! on connect; (b) require the user to run vestige migrate-schema explicitly before starting the server.
  • RECOMMENDATION: (a) during Phase 2; revisit in Phase 3 when the server binary exists. Developer ergonomics win now, and the migrations are idempotent.

Q12. Offline query cache freshness vs sqlx-cli version skew

  • Options: (a) pin sqlx-cli version in CI actions/cache step; (b) let CI install whatever version sqlx depends on.
  • RECOMMENDATION: (a) pin to the same 0.8.x as the crate. sqlx prepare output changes between 0.7 and 0.8 and must match the runtime.

Sequencing

The Phase 2 agent executes deliverables in this order; deliverables not listed can run in any order relative to each other.

  1. D1 (feature gate + Cargo deps) -- unblocks everything.
  2. D7 (config) -- required to construct PgMemoryStore.
  3. D4 (migrations SQL) -- required before any query! compiles.
  4. D3 (pool) + D6 (registry) -- small, used by D2.
  5. D2 (PgMemoryStore core + trait impl) -- the bulk of Phase 2.
  6. D5 (RRF search) -- after D2; requires the trait to exist.
  7. D12 (test harness) + parity and search tests -- validates D2 and D5 in isolation.
  8. D8 (sqlite->pg migrate) + its integration test.
  9. D9 (reembed) + its integration test.
  10. D10 (CLI wiring).
  11. D11 (.sqlx/ offline cache) -- last, after SQL is frozen.
  12. D15 (benches) + D16 (runbook) -- after acceptance tests pass.

Each deliverable PR includes its own tests; the final Phase 2 PR stacks them (or lands as a single branch if the Phase 1 trait is stable enough to avoid rebase churn).

Critical Files for Implementation

  • /home/delandtj/prppl/vestige/crates/vestige-core/src/storage/postgres/mod.rs
  • /home/delandtj/prppl/vestige/crates/vestige-core/migrations/postgres/0001_init.up.sql
  • /home/delandtj/prppl/vestige/crates/vestige-core/src/storage/postgres/search.rs
  • /home/delandtj/prppl/vestige/crates/vestige-core/src/storage/postgres/migrate_cli.rs
  • /home/delandtj/prppl/vestige/crates/vestige-mcp/src/bin/cli.rs