vestige/docs/plans/0002a-skeleton-and-feature-gate.md
Jan De Landtsheer 9ef8afdb20
docs(plans): add Phase 2 sub-plans 0002a-0002i + supersession notice
Nine Phase 2 sub-plans operationalising ADR 0002 against the Phase 2
master plan, each sized to fit a focused implementation session and
handed to Claude Code as a /goal brief without requiring the agent to
load the master plan.

Order of execution (each depends on the previous unless noted):
- 0002a-skeleton-and-feature-gate.md -- postgres-backend Cargo feature
  + PgMemoryStore skeleton with todo!() bodies. D1+D2.
- 0002b-pool-and-config.md -- PgPool builder, VestigeConfig/
  PostgresConfig, vestige.toml loader wired into vestige-mcp. D3+D7
  (master plan numbering).
- 0002c-migrations.md -- sqlx migrations 0001_init/0002_hnsw including
  D7 (users/groups/memberships, owner/visibility/shared_with_groups)
  and D8 (codebase column). SQLite V15 parity migration. D4.
- 0002d-store-impl-bodies.md -- real CRUD + registry bodies; trivial
  fts_search/vector_search bodies. D2+D6.
- 0002e-hybrid-search.md -- one-statement RRF query. D5.
- 0002f-migrate-cli.md -- vestige migrate copy (SQLite -> Postgres),
  --dry-run, idempotent re-runs, --allow-source-upgrade for pre-V15
  sources. D8+D10.
- 0002g-reembed.md -- vestige migrate reembed (offline rebuild).
  D9 + D10 reembed arm. Ships resolve_embedder helper as a workaround
  for the missing Embedder::from_name(&str) constructor.
- 0002h-testing-and-benches.md -- testcontainers harness, six
  integration test files, Criterion bench at 1k/100k. D14+D15.
- 0002i-runbook.md -- operator-facing deployment + day-2 runbook. D16.

Supersession notice added to the master plan (0002-phase-2-postgres-
backend.md) pointing at ADR 0002; body retained as archival reference.

PR B carries this commit plus the previous two (ADR 0002 + Phase 1
amendment sub-plans); no code change.
2026-05-27 09:35:58 +02:00

Phase 2 Sub-Plan 0002a -- Skeleton and Feature Gate

Status: Ready

Depends on: Phase 1 amendment (sub-plans 0001a-trait-rewrite.md and 0001b-sqlite-split.md) merged. Specifically:

  • MemoryStore trait declared with #[trait_variant::make(MemoryStore: Send)], generating a non-Send LocalMemoryStore companion trait. The pub use MemoryStore as LocalMemoryStore alias from Phase 1 is gone.
  • crates/vestige-core/src/storage/sqlite.rs has been split into crates/vestige-core/src/storage/sqlite/ with the same public surface.

This sub-plan covers Phase 2 master-plan deliverables D1 and D2 only: the postgres-backend Cargo feature gate and a compilable PgMemoryStore skeleton whose trait method bodies are todo!(). No real Postgres code, no migrations, no SQL. Later sub-plans (0002b-pool-and-config.md, 0002c-migrations.md, 0002d-store-impl-bodies.md, ...) fill the bodies in.

The success criterion is a clean build under both feature-flag configurations, nothing more.


Context

ADR 0002 D4 commits Phase 2 to a crates/vestige-core/src/storage/postgres/ directory from day one. The six other files in that directory (pool.rs, migrations.rs, registry.rs, search.rs, migrate_cli.rs, reembed.rs) belong to subsequent sub-plans. This sub-plan creates only crates/vestige-core/src/storage/postgres/mod.rs so the rest can be added incrementally without breaking the build.

Per ADR 0002 D2, PgMemoryStore::connect mirrors SqliteMemoryStore::new: no Embedder argument. The pgvector typmod DDL (ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)) lives inside the trait method register_model, invoked by the caller after construction. In this sub-plan register_model is a todo!() body; 0002c-migrations.md and 0002d-store-impl-bodies.md provide the real implementation.

The trait surface in crates/vestige-core/src/storage/memory_store.rs is the source of truth for method signatures. Do NOT copy signatures from the master plan -- they are stale in places (for example, master plan 0002 D2 lists remove_edge as three-arg (source, target, edge_type); the live trait has two args (source, target)).


Cargo manifest changes

Two optional crates and one new feature flag. Add the dependencies with cargo add per the global CLAUDE.md preference; do not hand-edit the dependency entries in Cargo.toml. The feature line is the only manual edit.

cd crates/vestige-core

cargo add sqlx@0.8 --optional --no-default-features \
    --features runtime-tokio,tls-rustls,postgres,uuid,chrono,json,migrate,macros

cargo add pgvector@0.4 --optional --features sqlx

After both commands, open crates/vestige-core/Cargo.toml and add the postgres-backend feature line in the [features] block. Place it after the metal feature, before [dependencies]:

# Postgres backend (mutually compilable with the SQLite backend; default OFF).
# Compile with: --features postgres-backend
postgres-backend = ["dep:sqlx", "dep:pgvector"]

Do NOT add tokio-stream, futures, indicatif, or toml in this sub-plan. The master plan D1 lists them in the postgres-backend feature for convenience, but their consumers (streaming migrate, progress bar, config parsing) land in later sub-plans. Adding them here pulls dead weight into the feature gate.

Do NOT add the vestige-mcp pass-through feature in this sub-plan either. The MCP crate gets its postgres-backend feature in 0002b-pool-and-config.md when MemoryStoreConfig lands and the binary needs a knob to pick a backend.

Verify the diff to crates/vestige-core/Cargo.toml looks like this and only this:

[features]
# ...existing features unchanged...
postgres-backend = ["dep:sqlx", "dep:pgvector"]

[dependencies]
# ...existing deps unchanged...
sqlx = { version = "0.8", default-features = false, features = [
    "runtime-tokio", "tls-rustls", "postgres", "uuid", "chrono",
    "json", "migrate", "macros",
], optional = true }
pgvector = { version = "0.4", features = ["sqlx"], optional = true }

The exact ordering of the two new lines inside [dependencies] is not significant; cargo add keeps an already-sorted table sorted and otherwise appends at the end. Leave the placement that cargo add produces.


Storage module export

Edit crates/vestige-core/src/storage/mod.rs to expose the new module behind the feature flag. Two feature-gated items are added.

Add to the module-declaration block (after mod sqlite;):

#[cfg(feature = "postgres-backend")]
mod postgres;

Add to the re-export block (after the pub use sqlite::{ ... } block):

#[cfg(feature = "postgres-backend")]
pub use postgres::PgMemoryStore;

Nothing else in storage/mod.rs changes. The Storage alias still points at SqliteMemoryStore; the SQLite re-export block is untouched.


Postgres module skeleton

Create crates/vestige-core/src/storage/postgres/mod.rs with the full content below. This is the only new file in this sub-plan.

#![cfg(feature = "postgres-backend")]
//! Postgres-backed implementation of `MemoryStore`.
//!
//! Skeleton only. Every trait method is `todo!()`. Real bodies land in
//! subsequent Phase 2 sub-plans:
//! - `0002b-pool-and-config.md`: pool construction and config wiring
//! - `0002c-migrations.md`:      sqlx migration files and `init`
//! - `0002d-store-impl-bodies.md`: CRUD, registry (`register_model`), scheduling, edges, domains
//! - `0002e-hybrid-search.md`:   RRF query and search bodies
//!
//! The directory grows companion files (`pool.rs`, `migrations.rs`,
//! `registry.rs`, `search.rs`, `migrate_cli.rs`, `reembed.rs`) in those
//! sub-plans; this skeleton only creates `mod.rs`.

use chrono::{DateTime, Utc};
use sqlx::PgPool;
use uuid::Uuid;

use crate::storage::memory_store::{
    Domain, HealthStatus, LocalMemoryStore, MemoryEdge, MemoryRecord, MemoryStoreResult,
    ModelSignature, SchedulingState, SearchQuery, SearchResult, StoreStats,
};

/// Postgres-backed implementation of `MemoryStore`.
///
/// Cheaply cloneable. Methods take `&self`; interior state lives inside the
/// `PgPool`, which is `Send + Sync` and clones cheaply because it wraps an
/// `Arc` internally.
#[derive(Clone)]
pub struct PgMemoryStore {
    pool: PgPool,
    /// Embedding vector dimension. Set to 0 in the skeleton; populated by
    /// `register_model` in `0002d-store-impl-bodies.md` once the pgvector
    /// `ALTER COLUMN TYPE vector(N)` DDL lands.
    embedding_dim: i32,
}

impl PgMemoryStore {
    /// Construct a new store from a connection URL.
    ///
    /// Mirrors `SqliteMemoryStore::new`: no `Embedder` argument. The pgvector
    /// `ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)` DDL lives
    /// inside `register_model`, not here. The caller (cognitive engine
    /// bootstrap, migrate CLI, tests) invokes `register_model` after this
    /// returns, identically to the SQLite path.
    ///
    /// Real body lands in `0002b-pool-and-config.md` (pool construction) and
    /// `0002c-migrations.md` (initial migration run).
    pub async fn connect(_url: &str, _max_connections: u32) -> MemoryStoreResult<Self> {
        todo!("PgMemoryStore::connect lands in 0002b-pool-and-config.md")
    }

    /// Low-level constructor for tests: supply an existing pool, skip migrate.
    ///
    /// Real body lands in `0002b-pool-and-config.md`.
    pub async fn from_pool(_pool: PgPool) -> MemoryStoreResult<Self> {
        todo!("PgMemoryStore::from_pool lands in 0002b-pool-and-config.md")
    }

    /// Accessor used by migrate/reembed CLI.
    pub fn pool(&self) -> &PgPool {
        &self.pool
    }

    /// Currently-registered vector dimension. Returns 0 until `register_model`
    /// has been called (real body: `0002d-store-impl-bodies.md`).
    pub fn embedding_dim(&self) -> i32 {
        self.embedding_dim
    }
}

// trait_variant::make on the trait declaration generates two traits:
//   - `MemoryStore` (Send-bound)
//   - `LocalMemoryStore` (non-Send companion)
// The implementer writes `impl LocalMemoryStore for ...` plainly; the Send
// variant is generated automatically from the non-Send impl.
impl LocalMemoryStore for PgMemoryStore {
    // --- Lifecycle ---
    async fn init(&self) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::init lands in 0002c-migrations.md")
    }

    async fn health_check(&self) -> MemoryStoreResult<HealthStatus> {
        todo!("PgMemoryStore::health_check lands in 0002d-store-impl-bodies.md")
    }

    // --- Embedding model registry ---
    async fn registered_model(&self) -> MemoryStoreResult<Option<ModelSignature>> {
        todo!("PgMemoryStore::registered_model lands in 0002d-store-impl-bodies.md")
    }

    async fn register_model(&self, _sig: &ModelSignature) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::register_model lands in 0002d-store-impl-bodies.md")
    }

    // --- CRUD ---
    async fn insert(&self, _record: &MemoryRecord) -> MemoryStoreResult<Uuid> {
        todo!("PgMemoryStore::insert lands in 0002d-store-impl-bodies.md")
    }

    async fn get(&self, _id: Uuid) -> MemoryStoreResult<Option<MemoryRecord>> {
        todo!("PgMemoryStore::get lands in 0002d-store-impl-bodies.md")
    }

    async fn update(&self, _record: &MemoryRecord) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::update lands in 0002d-store-impl-bodies.md")
    }

    async fn delete(&self, _id: Uuid) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::delete lands in 0002d-store-impl-bodies.md")
    }

    // --- Search ---
    async fn search(&self, _query: &SearchQuery) -> MemoryStoreResult<Vec<SearchResult>> {
        todo!("PgMemoryStore::search lands in 0002e-hybrid-search.md")
    }

    async fn fts_search(
        &self,
        _text: &str,
        _limit: usize,
    ) -> MemoryStoreResult<Vec<SearchResult>> {
        todo!("PgMemoryStore::fts_search lands in 0002e-hybrid-search.md")
    }

    async fn vector_search(
        &self,
        _embedding: &[f32],
        _limit: usize,
    ) -> MemoryStoreResult<Vec<SearchResult>> {
        todo!("PgMemoryStore::vector_search lands in 0002e-hybrid-search.md")
    }

    // --- FSRS Scheduling ---
    async fn get_scheduling(
        &self,
        _memory_id: Uuid,
    ) -> MemoryStoreResult<Option<SchedulingState>> {
        todo!("PgMemoryStore::get_scheduling lands in 0002d-store-impl-bodies.md")
    }

    async fn update_scheduling(&self, _state: &SchedulingState) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::update_scheduling lands in 0002d-store-impl-bodies.md")
    }

    async fn get_due_memories(
        &self,
        _before: DateTime<Utc>,
        _limit: usize,
    ) -> MemoryStoreResult<Vec<(MemoryRecord, SchedulingState)>> {
        todo!("PgMemoryStore::get_due_memories lands in 0002d-store-impl-bodies.md")
    }

    // --- Graph (spreading activation) ---
    async fn add_edge(&self, _edge: &MemoryEdge) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::add_edge lands in 0002d-store-impl-bodies.md")
    }

    async fn get_edges(
        &self,
        _node_id: Uuid,
        _edge_type: Option<&str>,
    ) -> MemoryStoreResult<Vec<MemoryEdge>> {
        todo!("PgMemoryStore::get_edges lands in 0002d-store-impl-bodies.md")
    }

    async fn remove_edge(&self, _source: Uuid, _target: Uuid) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::remove_edge lands in 0002d-store-impl-bodies.md")
    }

    async fn get_neighbors(
        &self,
        _node_id: Uuid,
        _depth: usize,
    ) -> MemoryStoreResult<Vec<(MemoryRecord, f64)>> {
        todo!("PgMemoryStore::get_neighbors lands in 0002d-store-impl-bodies.md")
    }

    // --- Domains (Phase 1: stubs return empty; full impl in Phase 4) ---
    async fn list_domains(&self) -> MemoryStoreResult<Vec<Domain>> {
        todo!("PgMemoryStore::list_domains lands in 0002d-store-impl-bodies.md")
    }

    async fn get_domain(&self, _id: &str) -> MemoryStoreResult<Option<Domain>> {
        todo!("PgMemoryStore::get_domain lands in 0002d-store-impl-bodies.md")
    }

    async fn upsert_domain(&self, _domain: &Domain) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::upsert_domain lands in 0002d-store-impl-bodies.md")
    }

    async fn delete_domain(&self, _id: &str) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::delete_domain lands in 0002d-store-impl-bodies.md")
    }

    async fn classify(&self, _embedding: &[f32]) -> MemoryStoreResult<Vec<(String, f64)>> {
        todo!("PgMemoryStore::classify lands in 0002d-store-impl-bodies.md")
    }

    // --- Bulk / Maintenance ---
    async fn count(&self) -> MemoryStoreResult<usize> {
        todo!("PgMemoryStore::count lands in 0002d-store-impl-bodies.md")
    }

    async fn get_stats(&self) -> MemoryStoreResult<StoreStats> {
        todo!("PgMemoryStore::get_stats lands in 0002d-store-impl-bodies.md")
    }

    async fn vacuum(&self) -> MemoryStoreResult<()> {
        todo!("PgMemoryStore::vacuum lands in 0002d-store-impl-bodies.md")
    }
}

Notes on the skeleton:

  • The file-level #![cfg(feature = "postgres-backend")] means the whole file vanishes when the feature is off. The mod postgres; line in storage/mod.rs is itself feature-gated, and that gate alone already keeps the file out of the build, so the inner attribute is belt-and-braces: it guarantees that the unconditional use sqlx::PgPool inside the file can never be compiled without the feature, even if the mod line is later un-gated by mistake.
  • EmbeddingModelDescriptor (a separate Postgres-internal type that the master plan sketched on the struct) is dropped. The trait surface already carries ModelSignature for the registry round-trip; the registry storage layout is a private concern of registry.rs, which is added later. Keep PgMemoryStore minimal until a real consumer needs the extra type.
  • The struct only carries pool and embedding_dim. The model descriptor field from the master plan sketch goes away with EmbeddingModelDescriptor. If register_model later needs to cache the descriptor on the struct, it can be added then; the skeleton does not speculate.
  • The two trivial accessors (pool, embedding_dim) get real bodies. Every other method body is todo!(), whose type is the never type !; ! coerces to any declared return type, which is what lets the build pass with no new error variants and no SQL.
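
The last note can be seen in isolation with a minimal sketch. The names below (Store, SkeletonStore, StoreError) are hypothetical stand-ins, and the methods are synchronous to keep the example dependency-free; the real trait is async and lives in memory_store.rs.

```rust
use std::panic;

// Hypothetical stand-ins for illustration; not the project's real types.
#[derive(Debug)]
struct StoreError;
type StoreResult<T> = Result<T, StoreError>;

trait Store {
    fn count(&self) -> StoreResult<usize>;
    fn classify(&self, embedding: &[f32]) -> StoreResult<Vec<(String, f64)>>;
}

struct SkeletonStore {
    embedding_dim: i32,
}

impl SkeletonStore {
    // Trivial accessor with a real body, like `embedding_dim` in the skeleton.
    fn embedding_dim(&self) -> i32 {
        self.embedding_dim
    }
}

impl Store for SkeletonStore {
    // `todo!()` has type `!`, so both bodies type-check without building an
    // `Ok` value or naming any `StoreError` variant.
    fn count(&self) -> StoreResult<usize> {
        todo!("count lands in a later sub-plan")
    }

    fn classify(&self, _embedding: &[f32]) -> StoreResult<Vec<(String, f64)>> {
        todo!("classify lands in a later sub-plan")
    }
}

// Calling a stub panics at runtime; catch_unwind turns the panic into an Err.
fn stub_panics() -> bool {
    let store = SkeletonStore { embedding_dim: 0 };
    panic::catch_unwind(|| store.count()).is_err()
}

fn main() {
    let store = SkeletonStore { embedding_dim: 0 };
    assert_eq!(store.embedding_dim(), 0);
    assert!(stub_panics());
}
```

The sketch compiles because of the coercion and panics only when a stub is actually called, which is why the contract for this sub-plan is a build check rather than a test run.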

Connect signature

Per ADR 0002 D2:

pub async fn connect(url: &str, max_connections: u32) -> MemoryStoreResult<Self>;
pub async fn from_pool(pool: PgPool) -> MemoryStoreResult<Self>;

No &dyn Embedder argument. This deliberately differs from master plan 0002, which predates the Phase 1 freeze. The pgvector-specific DDL (ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)) does not run inside connect; it runs inside register_model(&ModelSignature), which the caller invokes after connect returns.

In this sub-plan register_model is todo!(). The real body lands in 0002d-store-impl-bodies.md, after 0002c-migrations.md ships the 0001_init.up.sql migration. That migration creates the memories table with a placeholder embedding vector column (no typmod); register_model later stamps the typmod onto that column.
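
The call order described above can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: StoreSketch, typmod_ddl, and the dimension field on ModelSignature are hypothetical, the methods are synchronous, and register_model takes &mut self here only so the sketch can record what it would execute. It also assumes the dimension is formatted into the statement text, since Postgres does not generally accept bind parameters in DDL.

```rust
// Hypothetical stand-in; the real ModelSignature lives in memory_store.rs
// and its field names may differ.
struct ModelSignature {
    name: String,
    dimension: u32,
}

// Build the typmod DDL as text. The dimension is a u32, so formatting it
// into the statement cannot introduce arbitrary SQL.
fn typmod_ddl(sig: &ModelSignature) -> String {
    format!(
        "ALTER TABLE memories ALTER COLUMN embedding TYPE vector({})",
        sig.dimension
    )
}

// Records statements instead of talking to a database.
struct StoreSketch {
    executed: Vec<String>,
}

impl StoreSketch {
    // Like `connect`: no embedder argument and no pgvector DDL.
    fn connect(_url: &str, _max_connections: u32) -> Self {
        StoreSketch { executed: Vec::new() }
    }

    // The typmod DDL runs here, after construction.
    fn register_model(&mut self, sig: &ModelSignature) {
        self.executed.push(typmod_ddl(sig));
    }
}

fn main() {
    let mut store = StoreSketch::connect("postgres://localhost/vestige", 4);
    // Nothing has run yet: construction alone issues no DDL.
    assert!(store.executed.is_empty());

    let sig = ModelSignature {
        name: "example-model".to_string(),
        dimension: 768,
    };
    store.register_model(&sig);
    assert_eq!(
        store.executed[0],
        "ALTER TABLE memories ALTER COLUMN embedding TYPE vector(768)"
    );
    println!("{}: {}", sig.name, store.executed[0]);
}
```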


Error variant additions: deferred

MemoryStoreError does NOT gain Postgres(sqlx::Error) or Migrate(sqlx::migrate::MigrateError) in this sub-plan.

The reason is mechanical: todo!() evaluates to the never type !, which coerces to any MemoryStoreResult<T> regardless of the error variants available. With every method body a todo!(), the skeleton has no expression that needs to convert a sqlx::Error or sqlx::migrate::MigrateError into MemoryStoreError. Adding the variants here would mean adding the #[cfg(feature = "postgres-backend")] and #[from] plumbing to memory_store.rs with no consumer yet -- dead code at every level except the enum definition itself.

0002d-store-impl-bodies.md introduces both variants in the same commit that turns the first todo!() into a real sqlx::query! call. That keeps the diff to memory_store.rs next to the first usage site, which is easier to review than adding variants ahead of need.

For reference, the variants that will be added in 0002d-store-impl-bodies.md look like this:

#[cfg(feature = "postgres-backend")]
#[error("postgres error: {0}")]
Postgres(#[from] sqlx::Error),

#[cfg(feature = "postgres-backend")]
#[error("postgres migration error: {0}")]
Migrate(#[from] sqlx::migrate::MigrateError),

Do not pre-add them here.
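
To make the deferral concrete, here is a dependency-free sketch of what the #[from] attribute on those variants will eventually buy. FakeSqlxError is a hypothetical stand-in for sqlx::Error, and the Display and From impls are written by hand to approximate what the derive attributes shown above generate.

```rust
use std::fmt;

// Hypothetical stand-in for sqlx::Error so the sketch needs no dependency.
#[derive(Debug)]
struct FakeSqlxError(String);

#[derive(Debug)]
enum MemoryStoreError {
    Postgres(FakeSqlxError),
}

// Hand-written equivalent of #[error("postgres error: {0}")].
impl fmt::Display for MemoryStoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MemoryStoreError::Postgres(e) => write!(f, "postgres error: {}", e.0),
        }
    }
}

// Hand-written equivalent of #[from] on the variant.
impl From<FakeSqlxError> for MemoryStoreError {
    fn from(e: FakeSqlxError) -> Self {
        MemoryStoreError::Postgres(e)
    }
}

type MemoryStoreResult<T> = Result<T, MemoryStoreError>;

// Pretend driver call that fails on demand.
fn run_query(fail: bool) -> Result<usize, FakeSqlxError> {
    if fail {
        Err(FakeSqlxError("connection refused".to_string()))
    } else {
        Ok(3)
    }
}

fn count(fail: bool) -> MemoryStoreResult<usize> {
    // `?` is the first expression that needs the From impl. A `todo!()`
    // body never contains such an expression.
    let n = run_query(fail)?;
    Ok(n)
}

fn main() {
    assert_eq!(count(false).unwrap(), 3);
    assert_eq!(
        count(true).unwrap_err().to_string(),
        "postgres error: connection refused"
    );
}
```

Until a method body contains a `?` on a driver call like the one in count, the conversion has no caller, which is the case for every method in this sub-plan.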


Verification

Run these commands from the workspace root. All four must succeed with zero warnings on the diff-affected files and no test changes.

# 1. Default features (SQLite backend, postgres-backend OFF). Must build.
cargo build --workspace --all-targets

# 2. Workspace clippy with default features. Must be clean.
cargo clippy --workspace --all-targets -- -D warnings

# 3. Postgres feature enabled. Must build.
cargo build -p vestige-core --features postgres-backend

# 4. Clippy with postgres feature enabled. Must be clean.
cargo clippy -p vestige-core --features postgres-backend --all-targets -- -D warnings

Expected outcomes:

  • cargo build --workspace --all-targets finishes without compiling sqlx or pgvector (both are optional, and nothing enables them under default features). The postgres module is excluded entirely via #[cfg].
  • cargo build -p vestige-core --features postgres-backend compiles sqlx, pgvector, and storage/postgres/mod.rs. The build succeeds because every trait method is todo!(); nothing actually runs SQL.
  • Both clippy invocations pass with -D warnings. The todo!() macro does not emit a dead_code lint by itself, and the trivial accessors are used by later sub-plans (clippy on the postgres feature alone may flag them as unused if you run with --lib only; the --all-targets form keeps tests and benches in scope so this does not fire).
  • If clippy flags unused_variables on the underscore-prefixed parameters in the todo!() bodies, the underscore prefix is already the standard suppression; if a future clippy version disagrees, add #[allow(unused_variables)] to the impl block, not to each method.

Tests are not modified in this sub-plan. The unit tests in memory_store.rs (memory_store_error_from_storage_error, model_signature_serde_round_trip, memory_record_serde_round_trip) keep passing because no type they touch changes.

Do NOT run cargo test against the postgres feature -- there is no Postgres running and no test exercises PgMemoryStore yet. The build check is the contract.


Acceptance criteria

  1. crates/vestige-core/Cargo.toml declares sqlx = "0.8" and pgvector = "0.4" as optional dependencies with the exact feature sets specified above.
  2. crates/vestige-core/Cargo.toml declares postgres-backend = ["dep:sqlx", "dep:pgvector"] and nothing else inside that feature.
  3. crates/vestige-mcp/Cargo.toml is unchanged.
  4. crates/vestige-core/src/storage/mod.rs adds exactly two feature-gated lines: mod postgres; and pub use postgres::PgMemoryStore;. No other change.
  5. crates/vestige-core/src/storage/postgres/mod.rs exists and contains the PgMemoryStore struct, impl PgMemoryStore block with real pool and embedding_dim accessors and todo!() bodies for connect and from_pool, and the full impl LocalMemoryStore for PgMemoryStore block with todo!() for every trait method.
  6. The trait impl method signatures match memory_store.rs byte-for-byte (including remove_edge(&self, source: Uuid, target: Uuid) two-arg form, not the three-arg form from the master plan).
  7. MemoryStoreError is unchanged.
  8. No other files in the crate are touched. No new files in storage/postgres/ besides mod.rs.
  9. The four verification commands above all succeed.

Commit sequence

One commit is recommended. The two changes (Cargo manifest + module skeleton) are not useful in isolation: the manifest change without the skeleton adds optional dependencies that nothing consumes, and the skeleton without the manifest change cannot build with the feature on, because neither the postgres-backend feature nor the sqlx dependency exists yet. Splitting them adds no review value, since the second commit is the one that has to compile cleanly.

git add crates/vestige-core/Cargo.toml \
        Cargo.lock \
        crates/vestige-core/src/storage/mod.rs \
        crates/vestige-core/src/storage/postgres/mod.rs

git commit -m "feat(storage): scaffold postgres-backend feature and PgMemoryStore skeleton

Adds the postgres-backend Cargo feature gating sqlx 0.8 and pgvector 0.4.
Introduces crates/vestige-core/src/storage/postgres/mod.rs with the
PgMemoryStore struct, connect/from_pool/pool/embedding_dim, and a trait impl
whose method bodies are todo!() pending later Phase 2 sub-plans.

Builds clean with default features (SQLite only) and with --features
postgres-backend. No runtime behaviour change.

Refs ADR 0002 D1, D2, D4."

If for any reason the manifest change must be reviewed separately (for example, a security review of the sqlx version pin), split as:

  1. cargo add for sqlx and pgvector + manual feature line in Cargo.toml. Both builds pass, but --features postgres-backend compiles sqlx and pgvector with no module consuming them yet. This is acceptable for a short-lived intermediate commit.
  2. storage/mod.rs edits + storage/postgres/mod.rs creation. Both builds pass.

Default to the single-commit form unless asked to split.


Followups

  • 0002b-pool-and-config.md adds pool.rs, PostgresConfig, and the vestige-mcp postgres-backend pass-through feature.
  • 0002c-migrations.md adds crates/vestige-core/migrations/postgres/ with 0001_init.{up,down}.sql and 0002_hnsw.{up,down}.sql, plus postgres/migrations.rs invoking sqlx::migrate!. init() body lands here.
  • 0002d-store-impl-bodies.md introduces the two MemoryStoreError variants and replaces every todo!() in CRUD / scheduling / edges / domains / registry with real sqlx::query! bodies.
  • 0002e-hybrid-search.md fills the three search bodies via the RRF query.