docs(plans): add Phase 2 sub-plans 0002a-0002i + supersession notice

Nine Phase 2 sub-plans operationalising ADR 0002 against the Phase 2
master plan, each sized to fit a focused implementation session and
handed to Claude Code as a /goal brief without requiring the agent to
load the master plan.

Order of execution (each depends on the previous unless noted):
- 0002a-skeleton-and-feature-gate.md -- postgres-backend Cargo feature
  + PgMemoryStore skeleton with todo!() bodies. D1+D2.
- 0002b-pool-and-config.md -- PgPool builder, VestigeConfig/
  PostgresConfig, vestige.toml loader wired into vestige-mcp. D3+D7
  (master plan numbering).
- 0002c-migrations.md -- sqlx migrations 0001_init/0002_hnsw including
  D7 (users/groups/memberships, owner/visibility/shared_with_groups)
  and D8 (codebase column). SQLite V15 parity migration. D4.
- 0002d-store-impl-bodies.md -- real CRUD + registry bodies; trivial
  fts_search/vector_search bodies. D2+D6.
- 0002e-hybrid-search.md -- one-statement RRF query. D5.
- 0002f-migrate-cli.md -- vestige migrate copy (SQLite -> Postgres),
  --dry-run, idempotent re-runs, --allow-source-upgrade for pre-V15
  sources. D8+D10.
- 0002g-reembed.md -- vestige migrate reembed (offline rebuild).
  D9 + D10 reembed arm. Ships resolve_embedder helper as a workaround
  for the missing Embedder::from_name(&str) constructor.
- 0002h-testing-and-benches.md -- testcontainers harness, six
  integration test files, Criterion bench at 1k/100k. D14+D15.
- 0002i-runbook.md -- operator-facing deployment + day-2 runbook. D16.

Supersession notice added to the master plan (0002-phase-2-postgres-
backend.md) pointing at ADR 0002; body retained as archival reference.

PR B carries this commit plus the previous two (ADR 0002 + Phase 1
amendment sub-plans); no code change.
This commit is contained in:
Jan De Landtsheer 2026-05-27 09:35:58 +02:00
parent 697ade5b02
commit 9ef8afdb20
No known key found for this signature in database
GPG key ID: 95CD37F0C226040B
10 changed files with 8998 additions and 0 deletions

View file

@ -1,5 +1,13 @@
# Phase 2 Plan: PostgreSQL Backend
> **Supersession Notice (2026-05-26):** This master plan is now archival. Execution is governed by:
> - **ADR**: [`docs/adr/0002-phase-2-execution.md`](../adr/0002-phase-2-execution.md) -- binding decisions
> - **Sub-plans** (executable briefs):
> - Phase 1 amendment: [0001a-trait-rewrite.md](0001a-trait-rewrite.md), [0001b-sqlite-split.md](0001b-sqlite-split.md), [0001c-async-trait-sunset.md](0001c-async-trait-sunset.md)
> - Phase 2: 0002a..0002i (skeleton, pool+config, migrations, store impl, hybrid search, migrate CLI, reembed, tests+benches, runbook)
>
> **Deltas vs body**: trait uses `trait_variant`, error type is `MemoryStoreError`/`MemoryStoreResult`, `connect` is `(url, max_connections)` only, the core table is `knowledge_nodes` (not `memories`) and gains `owner_user_id` + `visibility` + `shared_with_groups` + `codebase`, plus `users`/`groups`/`group_memberships` tables. See ADR 0002 D1-D8.
**Status**: Draft
**Depends on**: Phase 1 (MemoryStore + Embedder traits, embedding_model registry, domain columns)
**Related**: docs/adr/0001-pluggable-storage-and-network-access.md (Phase 2), docs/prd/001-getting-centralized-vestige.md, docs/plans/local-dev-postgres-setup.md (local cluster provisioning)

View file

@ -0,0 +1,554 @@
# Phase 2 Sub-Plan 0002a -- Skeleton and Feature Gate
**Status**: Ready
**Depends on**: Phase 1 amendment (sub-plans `0001a-trait-rewrite.md` and
`0001b-sqlite-split.md`) merged. Specifically:
- `MemoryStore` trait declared with `#[trait_variant::make(MemoryStore: Send)]`,
generating a non-Send `LocalMemoryStore` companion trait. The
`pub use MemoryStore as LocalMemoryStore` alias from Phase 1 is gone.
- `crates/vestige-core/src/storage/sqlite.rs` has been split into
`crates/vestige-core/src/storage/sqlite/` with the same public surface.
This sub-plan covers Phase 2 master-plan deliverables D1 and D2 only:
the `postgres-backend` Cargo feature gate and a compilable `PgMemoryStore`
skeleton whose trait method bodies are `todo!()`. No real Postgres code, no
migrations, no SQL. Later sub-plans (`0002b-pool-and-config.md`,
`0002c-migrations.md`, `0002d-store-impl-bodies.md`, ...) fill the bodies in.
The success criterion is a clean build under both feature-flag configurations,
nothing more.
---
## Context
ADR 0002 D4 commits Phase 2 to a `crates/vestige-core/src/storage/postgres/`
directory from day one. The seven other files in that directory
(`pool.rs`, `migrations.rs`, `registry.rs`, `search.rs`, `migrate_cli.rs`,
`reembed.rs`) belong to subsequent sub-plans. This sub-plan creates only
`crates/vestige-core/src/storage/postgres/mod.rs` so the rest can be added
incrementally without breaking the build.
Per ADR 0002 D2, `PgMemoryStore::connect` mirrors `SqliteMemoryStore::new`:
no `Embedder` argument. The pgvector typmod DDL
(`ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)`) lives inside
the trait method `register_model`, invoked by the caller after construction.
In this sub-plan `register_model` is a `todo!()` body; `0002c-migrations.md`
and `0002d-store-impl-bodies.md` provide the real implementation.
The trait surface in `crates/vestige-core/src/storage/memory_store.rs` is the
source of truth for method signatures. Do NOT copy signatures from the master
plan -- they are stale in places (for example, master plan 0002 D2 lists
`remove_edge` as three-arg `(source, target, edge_type)`; the live trait has
two args `(source, target)`).
---
## Cargo manifest changes
Two optional crates and one new feature flag. Use `cargo add` per the global
CLAUDE.md preference; do not hand-edit `Cargo.toml`.
```bash
cd crates/vestige-core
cargo add sqlx@0.8 --optional --no-default-features \
--features runtime-tokio,tls-rustls,postgres,uuid,chrono,json,migrate,macros
cargo add pgvector@0.4 --optional --features sqlx
```
After both commands, open `crates/vestige-core/Cargo.toml` and add the
`postgres-backend` feature line in the `[features]` block. Place it after
the `metal` feature, before `[dependencies]`:
```toml
# Postgres backend (mutually compilable with the SQLite backend; default OFF).
# Compile with: --features postgres-backend
postgres-backend = ["dep:sqlx", "dep:pgvector"]
```
Do NOT add `tokio-stream`, `futures`, `indicatif`, or `toml` in this sub-plan.
The master plan D1 lists them in the `postgres-backend` feature for
convenience, but their consumers (streaming migrate, progress bar, config
parsing) land in later sub-plans. Adding them here pulls dead weight into the
feature gate.
Do NOT add the `vestige-mcp` pass-through feature in this sub-plan either.
The MCP crate gets its `postgres-backend` feature in `0002b-pool-and-config.md`
when `MemoryStoreConfig` lands and the binary needs a knob to pick a backend.
Verify the diff to `crates/vestige-core/Cargo.toml` looks like this and only
this:
```toml
[features]
# ...existing features unchanged...
postgres-backend = ["dep:sqlx", "dep:pgvector"]
[dependencies]
# ...existing deps unchanged...
sqlx = { version = "0.8", default-features = false, features = [
"runtime-tokio", "tls-rustls", "postgres", "uuid", "chrono",
"json", "migrate", "macros",
], optional = true }
pgvector = { version = "0.4", features = ["sqlx"], optional = true }
```
The exact ordering of the two new lines inside `[dependencies]` is not
significant; `cargo add` places them at the end. Leave the placement that
`cargo add` produces.
---
## Storage module export
Edit `crates/vestige-core/src/storage/mod.rs` to expose the new module behind
the feature flag. Two lines change.
Add to the module-declaration block (after `mod sqlite;`):
```rust
#[cfg(feature = "postgres-backend")]
mod postgres;
```
Add to the re-export block (after the `pub use sqlite::{ ... }` block):
```rust
#[cfg(feature = "postgres-backend")]
pub use postgres::PgMemoryStore;
```
Nothing else in `storage/mod.rs` changes. The `Storage` alias still points at
`SqliteMemoryStore`; the SQLite re-export block is untouched.
---
## Postgres module skeleton
Create `crates/vestige-core/src/storage/postgres/mod.rs` with the full content
below. This is the only new file in this sub-plan.
```rust
#![cfg(feature = "postgres-backend")]
//! Postgres-backed implementation of `MemoryStore`.
//!
//! Skeleton only. Every trait method is `todo!()`. Real bodies land in
//! subsequent Phase 2 sub-plans:
//! - `0002b-pool-and-config.md`: pool construction and config wiring
//! - `0002c-migrations.md`: sqlx migration files and `init`/`register_model`
//! - `0002d-store-impl-bodies.md`: CRUD, scheduling, edges, domains
//! - `0002e-hybrid-search.md`: RRF query and search bodies
//!
//! The directory grows companion files (`pool.rs`, `migrations.rs`,
//! `registry.rs`, `search.rs`, `migrate_cli.rs`, `reembed.rs`) in those
//! sub-plans; this skeleton only creates `mod.rs`.
use chrono::{DateTime, Utc};
use sqlx::PgPool;
use uuid::Uuid;
use crate::storage::memory_store::{
Domain, HealthStatus, LocalMemoryStore, MemoryEdge, MemoryRecord, MemoryStoreResult,
ModelSignature, SchedulingState, SearchQuery, SearchResult, StoreStats,
};
/// Postgres-backed implementation of `MemoryStore`.
///
/// Cheaply cloneable. Methods take `&self`; interior state lives inside the
/// `PgPool` (which already provides `Sync` via `Arc` internally).
#[derive(Clone)]
pub struct PgMemoryStore {
pool: PgPool,
/// Embedding vector dimension. Set to 0 in the skeleton; populated by
/// `register_model` in `0002d-store-impl-bodies.md` once the pgvector
/// `ALTER COLUMN TYPE vector(N)` DDL lands.
embedding_dim: i32,
}
impl PgMemoryStore {
/// Construct a new store from a connection URL.
///
/// Mirrors `SqliteMemoryStore::new`: no `Embedder` argument. The pgvector
/// `ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)` DDL lives
/// inside `register_model`, not here. The caller (cognitive engine
/// bootstrap, migrate CLI, tests) invokes `register_model` after this
/// returns, identically to the SQLite path.
///
/// Real body lands in `0002b-pool-and-config.md` (pool construction) and
/// `0002c-migrations.md` (initial migration run).
pub async fn connect(_url: &str, _max_connections: u32) -> MemoryStoreResult<Self> {
todo!("PgMemoryStore::connect lands in 0002b-pool-and-config.md")
}
/// Low-level constructor for tests: supply an existing pool, skip migrate.
///
/// Real body lands in `0002b-pool-and-config.md`.
pub async fn from_pool(_pool: PgPool) -> MemoryStoreResult<Self> {
todo!("PgMemoryStore::from_pool lands in 0002b-pool-and-config.md")
}
/// Accessor used by migrate/reembed CLI.
pub fn pool(&self) -> &PgPool {
&self.pool
}
/// Currently-registered vector dimension. Returns 0 until `register_model`
/// has been called (real body: `0002d-store-impl-bodies.md`).
pub fn embedding_dim(&self) -> i32 {
self.embedding_dim
}
}
// trait_variant::make on the trait declaration generates two traits:
// - `MemoryStore` (Send-bound)
// - `LocalMemoryStore` (non-Send companion)
// The implementer writes `impl LocalMemoryStore for ...` plainly; the Send
// variant is generated automatically from the non-Send impl.
impl LocalMemoryStore for PgMemoryStore {
// --- Lifecycle ---
async fn init(&self) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::init lands in 0002c-migrations.md")
}
async fn health_check(&self) -> MemoryStoreResult<HealthStatus> {
todo!("PgMemoryStore::health_check lands in 0002d-store-impl-bodies.md")
}
// --- Embedding model registry ---
async fn registered_model(&self) -> MemoryStoreResult<Option<ModelSignature>> {
todo!("PgMemoryStore::registered_model lands in 0002d-store-impl-bodies.md")
}
async fn register_model(&self, _sig: &ModelSignature) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::register_model lands in 0002d-store-impl-bodies.md")
}
// --- CRUD ---
async fn insert(&self, _record: &MemoryRecord) -> MemoryStoreResult<Uuid> {
todo!("PgMemoryStore::insert lands in 0002d-store-impl-bodies.md")
}
async fn get(&self, _id: Uuid) -> MemoryStoreResult<Option<MemoryRecord>> {
todo!("PgMemoryStore::get lands in 0002d-store-impl-bodies.md")
}
async fn update(&self, _record: &MemoryRecord) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::update lands in 0002d-store-impl-bodies.md")
}
async fn delete(&self, _id: Uuid) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::delete lands in 0002d-store-impl-bodies.md")
}
// --- Search ---
async fn search(&self, _query: &SearchQuery) -> MemoryStoreResult<Vec<SearchResult>> {
todo!("PgMemoryStore::search lands in 0002e-hybrid-search.md")
}
async fn fts_search(
&self,
_text: &str,
_limit: usize,
) -> MemoryStoreResult<Vec<SearchResult>> {
todo!("PgMemoryStore::fts_search lands in 0002e-hybrid-search.md")
}
async fn vector_search(
&self,
_embedding: &[f32],
_limit: usize,
) -> MemoryStoreResult<Vec<SearchResult>> {
todo!("PgMemoryStore::vector_search lands in 0002e-hybrid-search.md")
}
// --- FSRS Scheduling ---
async fn get_scheduling(
&self,
_memory_id: Uuid,
) -> MemoryStoreResult<Option<SchedulingState>> {
todo!("PgMemoryStore::get_scheduling lands in 0002d-store-impl-bodies.md")
}
async fn update_scheduling(&self, _state: &SchedulingState) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::update_scheduling lands in 0002d-store-impl-bodies.md")
}
async fn get_due_memories(
&self,
_before: DateTime<Utc>,
_limit: usize,
) -> MemoryStoreResult<Vec<(MemoryRecord, SchedulingState)>> {
todo!("PgMemoryStore::get_due_memories lands in 0002d-store-impl-bodies.md")
}
// --- Graph (spreading activation) ---
async fn add_edge(&self, _edge: &MemoryEdge) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::add_edge lands in 0002d-store-impl-bodies.md")
}
async fn get_edges(
&self,
_node_id: Uuid,
_edge_type: Option<&str>,
) -> MemoryStoreResult<Vec<MemoryEdge>> {
todo!("PgMemoryStore::get_edges lands in 0002d-store-impl-bodies.md")
}
async fn remove_edge(&self, _source: Uuid, _target: Uuid) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::remove_edge lands in 0002d-store-impl-bodies.md")
}
async fn get_neighbors(
&self,
_node_id: Uuid,
_depth: usize,
) -> MemoryStoreResult<Vec<(MemoryRecord, f64)>> {
todo!("PgMemoryStore::get_neighbors lands in 0002d-store-impl-bodies.md")
}
// --- Domains (Phase 1: stubs return empty; full impl in Phase 4) ---
async fn list_domains(&self) -> MemoryStoreResult<Vec<Domain>> {
todo!("PgMemoryStore::list_domains lands in 0002d-store-impl-bodies.md")
}
async fn get_domain(&self, _id: &str) -> MemoryStoreResult<Option<Domain>> {
todo!("PgMemoryStore::get_domain lands in 0002d-store-impl-bodies.md")
}
async fn upsert_domain(&self, _domain: &Domain) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::upsert_domain lands in 0002d-store-impl-bodies.md")
}
async fn delete_domain(&self, _id: &str) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::delete_domain lands in 0002d-store-impl-bodies.md")
}
async fn classify(&self, _embedding: &[f32]) -> MemoryStoreResult<Vec<(String, f64)>> {
todo!("PgMemoryStore::classify lands in 0002d-store-impl-bodies.md")
}
// --- Bulk / Maintenance ---
async fn count(&self) -> MemoryStoreResult<usize> {
todo!("PgMemoryStore::count lands in 0002d-store-impl-bodies.md")
}
async fn get_stats(&self) -> MemoryStoreResult<StoreStats> {
todo!("PgMemoryStore::get_stats lands in 0002d-store-impl-bodies.md")
}
async fn vacuum(&self) -> MemoryStoreResult<()> {
todo!("PgMemoryStore::vacuum lands in 0002d-store-impl-bodies.md")
}
}
```
Notes on the skeleton:
- The file-level `#![cfg(feature = "postgres-backend")]` means the whole file
vanishes when the feature is off. The `mod postgres;` line in
`storage/mod.rs` is itself feature-gated, so this is belt-and-braces; both
gates are needed because the file-level attribute is what allows the file to
use `sqlx::PgPool` unconditionally inside it.
- `EmbeddingModelDescriptor` (a separate Postgres-internal type that the
master plan sketched on the struct) is dropped. The trait surface already
carries `ModelSignature` for the registry round-trip; the registry storage
layout is a private concern of `registry.rs`, which is added later. Keep
`PgMemoryStore` minimal until a real consumer needs the extra type.
- The struct only carries `pool` and `embedding_dim`. The model descriptor
field from the master plan sketch goes away with `EmbeddingModelDescriptor`.
If `register_model` later needs to cache the descriptor on the struct, it
can be added then; the skeleton does not speculate.
- The two trivial accessors (`pool`, `embedding_dim`) get real bodies. Every
other method is `todo!()` so it returns `!` and trivially coerces to the
declared return type at the type checker; this is what lets the build pass
with no error variants and no SQL.
---
## Connect signature
Per ADR 0002 D2:
```rust
pub async fn connect(url: &str, max_connections: u32) -> MemoryStoreResult<Self>;
pub async fn from_pool(pool: PgPool) -> MemoryStoreResult<Self>;
```
No `&dyn Embedder` argument. This deliberately differs from master plan 0002,
which predates the Phase 1 freeze. The pgvector-specific DDL
(`ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)`) does not run
inside `connect`; it runs inside `register_model(&ModelSignature)`, which the
caller invokes after `connect` returns.
In this sub-plan `register_model` is `todo!()`. The real body lands in
`0002d-store-impl-bodies.md` after `0002c-migrations.md` ships the
`0001_init.up.sql` migration that creates the `memories` table with a
placeholder `embedding vector` column (no typmod), against which
`register_model` later runs the typmod stamp.
---
## Error variant additions: deferred
`MemoryStoreError` does NOT gain `Postgres(sqlx::Error)` or
`Migrate(sqlx::migrate::MigrateError)` in this sub-plan.
The reason is mechanical: `todo!()` evaluates to the never type `!`, which
coerces to any `MemoryStoreResult<T>` regardless of the error variants
available. With every method body a `todo!()`, the skeleton has no expression
that needs to convert a `sqlx::Error` or `sqlx::migrate::MigrateError` into
`MemoryStoreError`. Adding the variants here would mean adding the
`#[cfg(feature = "postgres-backend")]` and `#[from]` plumbing to
`memory_store.rs` with no consumer yet -- dead code at every level except the
enum definition itself.
`0002d-store-impl-bodies.md` introduces both variants in the same commit that
turns the first `todo!()` into a real `sqlx::query!` call. That keeps the
diff to `memory_store.rs` next to the first usage site, which is easier to
review than adding variants ahead of need.
For reference, the variants that will be added in `0002d-store-impl-bodies.md`
look like this:
```rust
#[cfg(feature = "postgres-backend")]
#[error("postgres error: {0}")]
Postgres(#[from] sqlx::Error),
#[cfg(feature = "postgres-backend")]
#[error("postgres migration error: {0}")]
Migrate(#[from] sqlx::migrate::MigrateError),
```
Do not pre-add them here.
---
## Verification
Run these commands from the workspace root. All four must produce a clean
build, zero warnings on the diff-affected files, no test changes.
```bash
# 1. Default features (SQLite backend, postgres-backend OFF). Must build.
cargo build --workspace --all-targets
# 2. Workspace clippy with default features. Must be clean.
cargo clippy --workspace --all-targets -- -D warnings
# 3. Postgres feature enabled. Must build.
cargo build -p vestige-core --features postgres-backend
# 4. Clippy with postgres feature enabled. Must be clean.
cargo clippy -p vestige-core --features postgres-backend --all-targets -- -D warnings
```
Expected outcomes:
- `cargo build --workspace --all-targets` finishes with no compilation of
`sqlx` or `pgvector` (both are optional, no consumer with default features).
The `postgres` module is excluded entirely via `#[cfg]`.
- `cargo build -p vestige-core --features postgres-backend` compiles `sqlx`,
`pgvector`, and `storage/postgres/mod.rs`. The build succeeds because every
trait method is `todo!()`; nothing actually runs SQL.
- Both `clippy` invocations pass with `-D warnings`. The `todo!()` macro does
not emit a `dead_code` lint by itself, and the trivial accessors are used by
later sub-plans (clippy on the postgres feature alone may flag them as
unused if you run with `--lib` only; the `--all-targets` form keeps tests
and benches in scope so this does not fire).
- If clippy flags `unused_variables` on the underscore-prefixed parameters in
the `todo!()` bodies, the underscore prefix is already the standard
suppression; if a future clippy version disagrees, add
`#[allow(unused_variables)]` to the impl block, not to each method.
Tests are not modified in this sub-plan. The unit tests in
`memory_store.rs` (`memory_store_error_from_storage_error`,
`model_signature_serde_round_trip`, `memory_record_serde_round_trip`) keep
passing because no type they touch changes.
Do NOT run `cargo test` against the postgres feature -- there is no Postgres
running and no test exercises `PgMemoryStore` yet. The build check is the
contract.
---
## Acceptance criteria
1. `crates/vestige-core/Cargo.toml` declares `sqlx = "0.8"` and
`pgvector = "0.4"` as optional dependencies with the exact feature sets
specified above.
2. `crates/vestige-core/Cargo.toml` declares `postgres-backend = ["dep:sqlx",
"dep:pgvector"]` and nothing else inside that feature.
3. `crates/vestige-mcp/Cargo.toml` is unchanged.
4. `crates/vestige-core/src/storage/mod.rs` adds exactly two
feature-gated lines: `mod postgres;` and `pub use postgres::PgMemoryStore;`.
No other change.
5. `crates/vestige-core/src/storage/postgres/mod.rs` exists and contains the
`PgMemoryStore` struct, `impl PgMemoryStore` block with real `pool` and
`embedding_dim` accessors and `todo!()` bodies for `connect` and
`from_pool`, and the full `impl LocalMemoryStore for PgMemoryStore` block
with `todo!()` for every trait method.
6. The trait impl method signatures match `memory_store.rs` byte-for-byte
(including `remove_edge(&self, source: Uuid, target: Uuid)` two-arg form,
not the three-arg form from the master plan).
7. `MemoryStoreError` is unchanged.
8. No other files in the crate are touched. No new files in
`storage/postgres/` besides `mod.rs`.
9. The four verification commands above all succeed.
---
## Commit sequence
One commit is recommended. The two changes (Cargo manifest + module skeleton)
do not compile in isolation: the manifest change without the skeleton produces
unused-optional-dep warnings, and the skeleton without the manifest change
fails to find `sqlx`. Splitting them adds no review value, since the second
commit is the one that has to compile cleanly.
```bash
git add crates/vestige-core/Cargo.toml \
crates/vestige-core/Cargo.lock \
crates/vestige-core/src/storage/mod.rs \
crates/vestige-core/src/storage/postgres/mod.rs
git commit -m "feat(storage): scaffold postgres-backend feature and PgMemoryStore skeleton
Adds the postgres-backend Cargo feature gating sqlx 0.8 and pgvector 0.4.
Introduces crates/vestige-core/src/storage/postgres/mod.rs with the
PgMemoryStore struct, connect/from_pool/pool/embedding_dim, and a trait impl
whose method bodies are todo!() pending later Phase 2 sub-plans.
Builds clean with default features (SQLite only) and with --features
postgres-backend. No runtime behaviour change.
Refs ADR 0002 D1, D2, D4."
```
If for any reason the manifest change must be reviewed separately (for
example, a security review of the sqlx version pin), split as:
1. `cargo add` for sqlx and pgvector + manual feature line in Cargo.toml.
Build with default features will pass but `--features postgres-backend`
will fail (no module to satisfy the feature). This is acceptable for a
short-lived intermediate commit.
2. `storage/mod.rs` edits + `storage/postgres/mod.rs` creation. Both builds
pass.
Default to the single-commit form unless asked to split.
---
## Followups
- `0002b-pool-and-config.md` adds `pool.rs`, `PostgresConfig`, and the
`vestige-mcp` `postgres-backend` pass-through feature.
- `0002c-migrations.md` adds `crates/vestige-core/migrations/postgres/` with
`0001_init.{up,down}.sql` and `0002_hnsw.{up,down}.sql`, plus
`postgres/migrations.rs` invoking `sqlx::migrate!`. `init()` body lands here.
- `0002d-store-impl-bodies.md` introduces the two `MemoryStoreError` variants
and replaces every `todo!()` in CRUD / scheduling / edges / domains /
registry with real `sqlx::query!` bodies.
- `0002e-hybrid-search.md` fills the three search bodies via the RRF query.

View file

@ -0,0 +1,886 @@
# Sub-plan 0002b -- Pool construction and VestigeConfig
**Status**: Draft
**Master plan**: [0002-phase-2-postgres-backend.md](0002-phase-2-postgres-backend.md)
**ADR**: [0002-phase-2-execution.md](../adr/0002-phase-2-execution.md)
**Predecessor**: [0002a-skeleton-and-feature-gate.md](0002a-skeleton-and-feature-gate.md)
---
## Context
This sub-plan delivers two of the master plan's deliverables now that the
`0002a` skeleton has landed:
- **D3** -- pool construction in
`crates/vestige-core/src/storage/postgres/pool.rs`. Replaces the `todo!()`
body of `PgMemoryStore::connect` with a real `PgPool` builder that reads a
`PostgresConfig`. Registry/migration calls remain `todo!()`; those are
filled in by sub-plans `0002c` (migrations) and `0002d` (store bodies +
registry).
- **D7** -- new module `crates/vestige-core/src/config.rs` containing
`VestigeConfig`, `StorageConfig`, `SqliteConfig`, `PostgresConfig`,
`EmbeddingsConfig`, plus a `ConfigError` enum and a loader that reads
`vestige.toml`. The loader is wired into `vestige-mcp` so the running
server picks SQLite or Postgres at startup based on the config file.
After this sub-plan:
- `cargo build` (default features, no `postgres-backend`) compiles and the
MCP server still defaults to SQLite when no `vestige.toml` is present.
- `cargo build --features postgres-backend` compiles, with
`PgMemoryStore::connect` now wiring through `pool.rs` (registry/migration
still `todo!()` until `0002c` and `0002d`).
- A `vestige.toml` example can be round-tripped through
`VestigeConfig::load` in a unit test.
This sub-plan deliberately does NOT:
- Add migrations (`0002c`).
- Fill in real CRUD/search bodies on `PgMemoryStore` (`0002d`, `0002e`).
- Add env-var override support (Phase 3 concern, called out in master plan
D7 behaviour notes).
---
## Dependencies
- `0002a-skeleton-and-feature-gate.md` must be merged. That sub-plan creates
`crates/vestige-core/src/storage/postgres/mod.rs` with:
- `PgMemoryStore` struct holding `pool: PgPool`.
- `PgMemoryStore::connect(url: &str, max_connections: u32) -> MemoryStoreResult<Self>`
body = `todo!()`.
- `PgMemoryStore::from_pool(pool: PgPool) -> MemoryStoreResult<Self>`
body = `todo!()`.
- The trait impl block with all methods routed to `todo!()`.
- The `postgres-backend` feature gate on the module declaration in
`storage/mod.rs`.
This sub-plan extends those bodies and adds two siblings: `pool.rs` and
`registry.rs` (the latter is a stub here, real body in `0002d`).
---
## Audit step (do this first)
Before adding `config.rs`, confirm there is no existing top-level config
loader. Run from the repo root:
```bash
rg -nF 'VestigeConfig' crates/
rg -nF 'toml::from_str' crates/
rg -n '#\[derive.*Deserialize.*\]' crates/vestige-core/src/
```
If a `VestigeConfig` struct already exists from Phase 1, treat the "Config
module" section below as additive: extend the existing struct rather than
creating a new file. The cross-cut additions in that case are:
1. Add the `StorageConfig` enum (gated and ungated branches).
2. Add `SqliteConfig`, `PostgresConfig`.
3. Add the `default_path()` helper if missing.
4. Add `ConfigError` if a different error enum is used today (rename/extend
instead of duplicating).
As of the audit at the time of this writing, no `VestigeConfig` exists in
`vestige-core`. `directories::ProjectDirs` is already used in
`vestige-core/src/embeddings/local.rs` and in
`vestige-mcp/src/protocol/auth.rs`, so the `directories` crate is already a
workspace dependency -- no new dep there.
---
## Cargo manifest additions
Add `toml` to `vestige-core`. `serde` and `thiserror` are already present
from Phase 1; `directories` is already a transitive dep but we add it
explicitly so `default_path()` is supported.
```bash
cd crates/vestige-core
cargo add toml@0.8
cargo add directories@5
```
No new deps on `vestige-mcp`; it already depends on `vestige-core`.
`sqlx` is already added by `0002a` behind the `postgres-backend` feature
with `runtime-tokio`, `tls-rustls`, `postgres`, `uuid`, `chrono`,
`json`, `macros`, `migrate` features. The pool module only uses what is
already pulled in.
---
## Config module
**File**: `crates/vestige-core/src/config.rs` (new).
**Re-exported** from `crates/vestige-core/src/lib.rs` as `pub mod config;` plus
`pub use config::{VestigeConfig, StorageConfig, SqliteConfig, EmbeddingsConfig, ConfigError};`
and `#[cfg(feature = "postgres-backend")] pub use config::PostgresConfig;`.
Full content:
```rust
//! Vestige top-level configuration.
//!
//! Loaded from `~/.vestige/vestige.toml` by default; the path is overridable
//! via `VestigeConfig::load(Some(&path))`. Parsing uses serde + toml; the
//! `[storage]` section is internally-tagged on a `backend` field so a single
//! enum dispatch picks SQLite or Postgres.
use std::path::{Path, PathBuf};
use serde::Deserialize;
/// Top-level configuration as parsed from `vestige.toml`.
#[derive(Debug, Clone, Deserialize, Default)]
#[serde(default, deny_unknown_fields)]
pub struct VestigeConfig {
pub embeddings: EmbeddingsConfig,
pub storage: StorageConfig,
/// Reserved for Phase 3. Empty in Phase 2.
pub server: ServerConfig,
/// Reserved for Phase 3. Empty in Phase 2.
pub auth: AuthConfig,
}
/// Embedding provider selection.
#[derive(Debug, Clone, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct EmbeddingsConfig {
/// Provider key. Phase 2 ships `"fastembed"` only.
pub provider: String,
/// Model name. For fastembed this is e.g. `"nomic-ai/nomic-embed-text-v1.5"`.
pub model: String,
}
impl Default for EmbeddingsConfig {
fn default() -> Self {
Self {
provider: "fastembed".to_string(),
model: crate::DEFAULT_EMBEDDING_MODEL.to_string(),
}
}
}
/// Storage backend selection. Internally tagged on the `backend` field:
///
/// ```toml
/// [storage]
/// backend = "sqlite"
///
/// [storage.sqlite]
/// path = "/home/user/.vestige/vestige.db"
/// ```
///
/// or, when compiled with `--features postgres-backend`:
///
/// ```toml
/// [storage]
/// backend = "postgres"
///
/// [storage.postgres]
/// url = "postgres://vestige:secret@localhost:5432/vestige"
/// max_connections = 10
/// acquire_timeout_secs = 30
/// ```
#[derive(Debug, Clone, Deserialize)]
#[serde(tag = "backend", rename_all = "lowercase", deny_unknown_fields)]
pub enum StorageConfig {
Sqlite(SqliteConfig),
#[cfg(feature = "postgres-backend")]
Postgres(PostgresConfig),
}
impl Default for StorageConfig {
fn default() -> Self {
StorageConfig::Sqlite(SqliteConfig::default())
}
}
/// SQLite backend configuration.
#[derive(Debug, Clone, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct SqliteConfig {
/// Path to the `vestige.db` file. If unset, the SqliteMemoryStore
/// constructor picks its platform default location.
#[serde(default)]
pub path: Option<PathBuf>,
}
impl Default for SqliteConfig {
fn default() -> Self {
Self { path: None }
}
}
/// Postgres backend configuration. Only present when the `postgres-backend`
/// Cargo feature is enabled.
#[cfg(feature = "postgres-backend")]
#[derive(Debug, Clone, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct PostgresConfig {
/// `postgres://user:pass@host:port/db` -- forwarded to
/// `PgConnectOptions::from_str`.
pub url: String,
/// Pool size. Default `10`.
#[serde(default)]
pub max_connections: Option<u32>,
/// Acquire timeout in seconds. Default `30`. Set above 30 so
/// testcontainer-based test fixtures do not race.
#[serde(default)]
pub acquire_timeout_secs: Option<u64>,
}
/// Reserved for Phase 3 (bind address, ports, TLS).
#[derive(Debug, Clone, Default, Deserialize)]
#[serde(default, deny_unknown_fields)]
pub struct ServerConfig {}
/// Reserved for Phase 3 (API keys, claims).
#[derive(Debug, Clone, Default, Deserialize)]
#[serde(default, deny_unknown_fields)]
pub struct AuthConfig {}
/// Errors raised while locating, reading, or parsing `vestige.toml`.
#[derive(Debug, thiserror::Error)]
pub enum ConfigError {
#[error("config io: {0}")]
Io(#[from] std::io::Error),
#[error("config toml: {0}")]
Toml(#[from] toml::de::Error),
#[error("config dir: could not locate user home")]
NoHome,
#[error("invalid config: {0}")]
Invalid(String),
}
impl VestigeConfig {
/// Load config from `path` or from `default_path()` when `None`.
///
/// Returns `VestigeConfig::default()` (SQLite + fastembed defaults) when
/// the file does not exist. Any other I/O or parse failure is surfaced
/// as a `ConfigError`.
pub fn load(path: Option<&Path>) -> Result<Self, ConfigError> {
let resolved: PathBuf = match path {
Some(p) => p.to_path_buf(),
None => Self::default_path()?,
};
match std::fs::read_to_string(&resolved) {
Ok(text) => {
let cfg: VestigeConfig = toml::from_str(&text)?;
cfg.validate()?;
Ok(cfg)
}
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
Ok(Self::default())
}
Err(e) => Err(ConfigError::Io(e)),
}
}
/// `~/.vestige/vestige.toml`. The directory is NOT created here; loading
/// a missing file falls back to defaults.
pub fn default_path() -> Result<PathBuf, ConfigError> {
let dirs = directories::ProjectDirs::from("", "vestige", "vestige")
.ok_or(ConfigError::NoHome)?;
// ProjectDirs::config_dir() varies per OS. Vestige convention is
// ~/.vestige/vestige.toml on Linux/macOS regardless of XDG, so we
// build the path off the home dir explicitly.
let home = directories::UserDirs::new()
.ok_or(ConfigError::NoHome)?
.home_dir()
.to_path_buf();
let _ = dirs; // keep the dep wired; future Phase 3 may use it
Ok(home.join(".vestige").join("vestige.toml"))
}
/// Light cross-field validation. Heavy validation (URL parsing,
/// directory existence) is left to the backend constructors.
fn validate(&self) -> Result<(), ConfigError> {
if self.embeddings.provider.is_empty() {
return Err(ConfigError::Invalid(
"embeddings.provider must not be empty".into(),
));
}
if self.embeddings.model.is_empty() {
return Err(ConfigError::Invalid(
"embeddings.model must not be empty".into(),
));
}
match &self.storage {
StorageConfig::Sqlite(_) => {}
#[cfg(feature = "postgres-backend")]
StorageConfig::Postgres(cfg) => {
if cfg.url.is_empty() {
return Err(ConfigError::Invalid(
"storage.postgres.url must not be empty".into(),
));
}
}
}
Ok(())
}
}
```
### Serde behaviour with `postgres-backend` off
`StorageConfig` is generated by serde only for the variants that are
compiled in. When `postgres-backend` is off and the user writes:
```toml
[storage]
backend = "postgres"
[storage.postgres]
url = "..."
```
serde returns a `toml::de::Error` of the form
`unknown variant `postgres`, expected `sqlite``. That error path goes
through `From<toml::de::Error> for ConfigError`, surfacing as
`ConfigError::Toml(..)`. The MCP server prints this once at startup and
exits with a non-zero code; there is no panic.
To make the error friendlier we wrap that specific case in a clearer
message via a thin post-parse check. Add this small helper after parsing
in `load()`:
```rust
// (Inside the Ok(text) arm in load(), wrapping the parse step.)
let cfg: VestigeConfig = match toml::from_str(&text) {
Ok(c) => c,
Err(e) => {
let msg = e.to_string();
if msg.contains("unknown variant `postgres`") {
return Err(ConfigError::Invalid(
"storage.backend = \"postgres\" requires building with --features postgres-backend".into(),
));
}
return Err(ConfigError::Toml(e));
}
};
```
This keeps the strict default deny_unknown_fields behaviour while giving the
user a one-line action item.
---
## Pool module
**File**: `crates/vestige-core/src/storage/postgres/pool.rs` (new).
```rust
#![cfg(feature = "postgres-backend")]
//! `PgPool` construction for the Postgres backend.
//!
//! Pool defaults follow ADR 0002 D2 + master plan D3:
//! - max_connections = 10
//! - acquire_timeout = 30s (must exceed testcontainer warmup)
//! - idle_timeout = 600s
//! - max_lifetime = 1800s
//! - test_before_acquire = false (cheap queries; saves a roundtrip)
use std::str::FromStr;
use std::time::Duration;
use sqlx::postgres::{PgConnectOptions, PgPoolOptions};
use sqlx::{ConnectOptions, PgPool};
use crate::config::PostgresConfig;
use crate::storage::memory_store::{MemoryStoreError, MemoryStoreResult};
const DEFAULT_MAX_CONNECTIONS: u32 = 10;
const DEFAULT_ACQUIRE_TIMEOUT_SECS: u64 = 30;
const IDLE_TIMEOUT_SECS: u64 = 600;
const MAX_LIFETIME_SECS: u64 = 1800;
const STATEMENT_CACHE_CAPACITY: usize = 256;
/// Build a Postgres connection pool from a `PostgresConfig`. Does NOT run
/// migrations or stamp the embedding registry; those are the caller's job
/// (`PgMemoryStore::connect`).
pub async fn build_pool(cfg: &PostgresConfig) -> MemoryStoreResult<PgPool> {
let opts = PgConnectOptions::from_str(&cfg.url)
.map_err(MemoryStoreError::from)?
.application_name("vestige")
.statement_cache_capacity(STATEMENT_CACHE_CAPACITY)
.log_statements(tracing::log::LevelFilter::Debug);
let max_conn = cfg.max_connections.unwrap_or(DEFAULT_MAX_CONNECTIONS);
let acquire = cfg
.acquire_timeout_secs
.unwrap_or(DEFAULT_ACQUIRE_TIMEOUT_SECS);
let pool = PgPoolOptions::new()
.max_connections(max_conn)
.min_connections(0)
.acquire_timeout(Duration::from_secs(acquire))
.idle_timeout(Some(Duration::from_secs(IDLE_TIMEOUT_SECS)))
.max_lifetime(Some(Duration::from_secs(MAX_LIFETIME_SECS)))
.test_before_acquire(false)
.connect_with(opts)
.await
.map_err(MemoryStoreError::from)?;
Ok(pool)
}
```
### Wiring into `PgMemoryStore::connect`
In `crates/vestige-core/src/storage/postgres/mod.rs`, replace the
`todo!()` body left by `0002a` for `connect` and `from_pool` with:
```rust
// In crates/vestige-core/src/storage/postgres/mod.rs
use sqlx::PgPool;
use crate::config::PostgresConfig;
use crate::storage::memory_store::{MemoryStoreError, MemoryStoreResult};
mod pool;
mod registry; // see "Registry stub" section below
pub struct PgMemoryStore {
pool: PgPool,
}
impl PgMemoryStore {
/// Convenience constructor matching `SqliteMemoryStore::new` shape.
/// Takes a URL + pool size for the common case.
pub async fn connect(url: &str, max_connections: u32) -> MemoryStoreResult<Self> {
let cfg = PostgresConfig {
url: url.to_string(),
max_connections: Some(max_connections),
acquire_timeout_secs: None,
};
Self::connect_with(&cfg).await
}
/// Full-config constructor.
pub async fn connect_with(cfg: &PostgresConfig) -> MemoryStoreResult<Self> {
let pool = pool::build_pool(cfg).await?;
Self::from_pool(pool).await
}
/// Construct from an already-built pool (used by tests and the migrate
/// CLI to share a pool across operations).
pub async fn from_pool(pool: PgPool) -> MemoryStoreResult<Self> {
// Migrations are added by 0002c.
// todo!("run sqlx::migrate! once 0002c lands")
registry::ensure_registry_stub(&pool).await?;
Ok(Self { pool })
}
}
```
`connect_with` is the long-lived API; `connect` becomes a thin shim that
stays compatible with the master-plan-mandated signature.
### Registry stub
**File**: `crates/vestige-core/src/storage/postgres/registry.rs` (new, stub).
```rust
#![cfg(feature = "postgres-backend")]
//! Embedding registry. Real body lands in sub-plan 0002d.
use sqlx::PgPool;
use crate::storage::memory_store::MemoryStoreResult;
/// Placeholder. Real implementation in 0002d reads/writes `embedding_model`
/// and stamps `ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)`.
pub(crate) async fn ensure_registry_stub(_pool: &PgPool) -> MemoryStoreResult<()> {
// Intentionally a no-op until 0002c lands the table + 0002d lands the
// real body. Leaving this as todo!() would crash the MCP server at
// startup the moment a user switches `backend = "postgres"`, which is
// not what we want for the build verification step in this sub-plan.
Ok(())
}
```
The no-op keeps `cargo build --features postgres-backend` not just
compiling but also allowing the MCP server to *boot* against a Postgres
URL pointing at an already-migrated database (the local-dev-postgres-setup
docs cover bringing up such a DB by hand). Real init lands in `0002d`.
---
## Error variants
**File**: `crates/vestige-core/src/storage/memory_store.rs` (edit).
The Phase 1 enum `MemoryStoreError` gains two feature-gated variants. These
were deferred in `0002a` and become required as soon as `pool.rs` calls
`.map_err(MemoryStoreError::from)` on `sqlx::Error`.
```rust
// Within enum MemoryStoreError { ... } in memory_store.rs
#[cfg(feature = "postgres-backend")]
#[error("postgres error: {0}")]
Postgres(#[from] sqlx::Error),
#[cfg(feature = "postgres-backend")]
#[error("postgres migration error: {0}")]
Migrate(#[from] sqlx::migrate::MigrateError),
```
Both use thiserror's `#[from]` attribute so the `?` operator works in
`pool.rs`, the migrate module (`0002c`), and registry code (`0002d`).
Default-features build (no `postgres-backend`) sees neither variant; the
enum stays exhaustive on stable.
If clippy fires on `non_exhaustive` due to the gated variants, add
`#[non_exhaustive]` on the enum. That has no caller-side effect since the
enum is constructed only inside the crate.
---
## vestige-mcp wiring
### Cargo feature passthrough
**File**: `crates/vestige-mcp/Cargo.toml` (edit).
Add a feature that forwards through to `vestige-core`. Default features in
`vestige-mcp` stay unchanged.
```toml
[features]
default = ["embeddings", "vector-search"]
embeddings = ["vestige-core/embeddings"]
vector-search = ["vestige-core/vector-search"]
postgres-backend = ["vestige-core/postgres-backend"]
```
Verify with:
```bash
cargo build -p vestige-mcp --features postgres-backend
```
### Backend dispatch at startup
**File**: `crates/vestige-mcp/src/main.rs` (edit around the existing
`Storage::new(storage_path)` call -- see audit note above; in the current
worktree this is around line 285).
The current code is roughly:
```rust
let storage_path = match prepare_storage_path(config.data_dir) { ... };
let storage = match Storage::new(storage_path) { ... };
```
Replace that with a dispatch driven by `VestigeConfig`:
```rust
use std::sync::Arc;
use vestige_core::config::{StorageConfig, VestigeConfig};
use vestige_core::storage::SqliteMemoryStore;
#[cfg(feature = "postgres-backend")]
use vestige_core::storage::postgres::PgMemoryStore;
use vestige_core::storage::MemoryStore;
// Earlier: still call prepare_storage_path to honour --data-dir override.
let storage_path = match prepare_storage_path(config.data_dir.clone()) { ... };
// New: load vestige.toml (or fall back to defaults).
let vestige_cfg = match VestigeConfig::load(config.config_path.as_deref()) {
Ok(c) => c,
Err(e) => {
eprintln!("vestige: failed to load config: {e}");
std::process::exit(2);
}
};
let storage: Arc<dyn MemoryStore> = match &vestige_cfg.storage {
StorageConfig::Sqlite(sqlite_cfg) => {
// CLI flag --data-dir wins over the config file path.
let path = storage_path.clone().or_else(|| sqlite_cfg.path.clone());
let s = SqliteMemoryStore::new(path).unwrap_or_else(|e| {
eprintln!("vestige: sqlite init failed: {e}");
std::process::exit(3);
});
Arc::new(s)
}
#[cfg(feature = "postgres-backend")]
StorageConfig::Postgres(pg_cfg) => {
let s = PgMemoryStore::connect_with(pg_cfg).await.unwrap_or_else(|e| {
eprintln!("vestige: postgres init failed: {e}");
std::process::exit(3);
});
Arc::new(s)
}
};
```
The `config_path: Option<PathBuf>` field on the local `Config` (or
clap-derived `Args`) struct must be added if not present; it accepts
`--config <path>`. Default behaviour (no flag) goes through
`VestigeConfig::default_path()`.
If the existing main wires `Storage` through a concrete type rather than
`Arc<dyn MemoryStore>`, the dispatch above lives behind a helper:
```rust
async fn build_store(cfg: &VestigeConfig, cli_path: Option<PathBuf>)
-> Result<Arc<dyn MemoryStore>, anyhow::Error>
{ ... }
```
and the caller chains `.into()` as needed. Phase 1 already moved
cognitive modules to `Arc<dyn MemoryStore>` so this should be a pure
substitution; if a concrete-type holdout is found, fix it locally in this
sub-plan (separate commit) rather than punting.
---
## vestige.toml example
The canonical example to ship in `docs/` (Phase 2 docs land in `0002i`,
runbook), shown here for reference and used verbatim by the unit test
below.
```toml
# vestige.toml -- top-level configuration
#
# Default location: ~/.vestige/vestige.toml
# Override: vestige-mcp --config /path/to/vestige.toml
[embeddings]
provider = "fastembed"
model = "nomic-ai/nomic-embed-text-v1.5"
# --- SQLite backend (default) ---
[storage]
backend = "sqlite"
[storage.sqlite]
path = "/home/user/.vestige/vestige.db"
# --- Postgres backend (requires --features postgres-backend) ---
# [storage]
# backend = "postgres"
#
# [storage.postgres]
# url = "postgres://vestige:secret@localhost:5432/vestige"
# max_connections = 10
# acquire_timeout_secs = 30
[server]
# Reserved for Phase 3 (bind address, ports, TLS).
[auth]
# Reserved for Phase 3 (API keys, claims).
```
---
## Verification
Run all of these from the repo root. The first three are the gates that
must pass before this sub-plan is considered done.
### 1. Default build (no Postgres)
```bash
cargo build -p vestige-core
cargo build -p vestige-mcp
cargo test -p vestige-core --lib
```
Expected: clean build. `VestigeConfig::default()` selects SQLite; the MCP
server boots the same way it did pre-sub-plan.
### 2. Postgres-feature build
```bash
cargo build -p vestige-core --features postgres-backend
cargo build -p vestige-mcp --features postgres-backend
```
Expected: clean build. `PgMemoryStore::connect_with` resolves to
`pool::build_pool` + `registry::ensure_registry_stub`; no `todo!()` is
reachable on the build path. `connect` and `from_pool` are exported.
### 3. Clippy across both feature sets
```bash
cargo clippy -p vestige-core -- -D warnings
cargo clippy -p vestige-core --features postgres-backend -- -D warnings
cargo clippy -p vestige-mcp --features postgres-backend -- -D warnings
```
### 4. Unit test: round-trip the example
Add this test to `crates/vestige-core/src/config.rs`:
```rust
#[cfg(test)]
mod tests {
use super::*;
const EXAMPLE_SQLITE: &str = r#"
[embeddings]
provider = "fastembed"
model = "nomic-ai/nomic-embed-text-v1.5"
[storage]
backend = "sqlite"
[storage.sqlite]
path = "/home/user/.vestige/vestige.db"
"#;
#[cfg(feature = "postgres-backend")]
const EXAMPLE_POSTGRES: &str = r#"
[embeddings]
provider = "fastembed"
model = "nomic-ai/nomic-embed-text-v1.5"
[storage]
backend = "postgres"
[storage.postgres]
url = "postgres://vestige:secret@localhost:5432/vestige"
max_connections = 10
acquire_timeout_secs = 30
"#;
#[test]
fn parses_sqlite_example() {
let cfg: VestigeConfig = toml::from_str(EXAMPLE_SQLITE).expect("parse");
match cfg.storage {
StorageConfig::Sqlite(s) => assert!(s.path.is_some()),
#[cfg(feature = "postgres-backend")]
StorageConfig::Postgres(_) => panic!("wrong variant"),
}
assert_eq!(cfg.embeddings.provider, "fastembed");
}
#[cfg(feature = "postgres-backend")]
#[test]
fn parses_postgres_example() {
let cfg: VestigeConfig = toml::from_str(EXAMPLE_POSTGRES).expect("parse");
match cfg.storage {
StorageConfig::Postgres(p) => {
assert_eq!(p.url, "postgres://vestige:secret@localhost:5432/vestige");
assert_eq!(p.max_connections, Some(10));
assert_eq!(p.acquire_timeout_secs, Some(30));
}
StorageConfig::Sqlite(_) => panic!("wrong variant"),
}
}
#[cfg(not(feature = "postgres-backend"))]
#[test]
fn rejects_postgres_when_feature_off() {
let toml_text = r#"
[storage]
backend = "postgres"
[storage.postgres]
url = "postgres://x/y"
"#;
let res: Result<VestigeConfig, _> = toml::from_str(toml_text);
assert!(res.is_err(), "must fail without postgres-backend feature");
}
#[test]
fn defaults_pick_sqlite() {
let cfg = VestigeConfig::default();
assert!(matches!(cfg.storage, StorageConfig::Sqlite(_)));
}
#[test]
fn load_missing_file_returns_default() {
let tmp = std::env::temp_dir().join("vestige-no-such-file.toml");
let _ = std::fs::remove_file(&tmp);
let cfg = VestigeConfig::load(Some(&tmp)).expect("missing file is OK");
assert!(matches!(cfg.storage, StorageConfig::Sqlite(_)));
}
#[test]
fn load_roundtrip_from_disk() {
let tmp = std::env::temp_dir().join("vestige-roundtrip.toml");
std::fs::write(&tmp, EXAMPLE_SQLITE).unwrap();
let cfg = VestigeConfig::load(Some(&tmp)).expect("load");
assert!(matches!(cfg.storage, StorageConfig::Sqlite(_)));
let _ = std::fs::remove_file(&tmp);
}
}
```
Run:
```bash
cargo test -p vestige-core --lib config::
cargo test -p vestige-core --lib config:: --features postgres-backend
```
### 5. Smoke: server boots with default config
```bash
# default build, no vestige.toml on disk
cargo run -p vestige-mcp -- --help
# should print help, no panic
```
---
## Acceptance criteria
- [ ] `cargo build -p vestige-core` (default features) succeeds.
- [ ] `cargo build -p vestige-core --features postgres-backend` succeeds.
- [ ] `cargo build -p vestige-mcp` (default features) succeeds.
- [ ] `cargo build -p vestige-mcp --features postgres-backend` succeeds.
- [ ] `cargo clippy` with and without `postgres-backend` is clean on both
crates.
- [ ] `crates/vestige-core/src/config.rs` exists, exposes
`VestigeConfig`, `StorageConfig`, `SqliteConfig`, `EmbeddingsConfig`,
`ConfigError`, plus `PostgresConfig` when the feature is on.
- [ ] `VestigeConfig::load(None)` returns `Ok(default)` when
`~/.vestige/vestige.toml` is missing.
- [ ] `VestigeConfig::load(Some(&path))` round-trips both the SQLite and
Postgres example blocks above.
- [ ] With `postgres-backend` off, parsing `backend = "postgres"` returns
a clear `ConfigError::Invalid` mentioning the feature flag, NOT a
panic.
- [ ] `crates/vestige-core/src/storage/postgres/pool.rs` exists,
implementing `build_pool(&PostgresConfig) -> MemoryStoreResult<PgPool>`
with the documented defaults.
- [ ] `PgMemoryStore::connect`, `connect_with`, and `from_pool` all wire
through `pool::build_pool`. None of them is `todo!()`. The registry
step is a no-op stub documented as filled in by `0002d`.
- [ ] `MemoryStoreError::Postgres(sqlx::Error)` and
`MemoryStoreError::Migrate(sqlx::migrate::MigrateError)` exist
behind `#[cfg(feature = "postgres-backend")]` with `#[from]`.
- [ ] `vestige-mcp` has a `postgres-backend` feature that forwards to
`vestige-core/postgres-backend`.
- [ ] `vestige-mcp/src/main.rs` selects SQLite vs Postgres at startup
based on `VestigeConfig`. SQLite is the default when no config file
is present.
- [ ] Unit tests in the "Verification" section pass on both feature sets.
---
## Out of scope (handled by other sub-plans)
- Migrations (`crates/vestige-core/migrations/postgres/*.sql`) -- `0002c`.
- Real `PgMemoryStore` CRUD/search/scheduling/edges bodies -- `0002d`,
`0002e`.
- `ensure_registry` real body with `ALTER COLUMN TYPE vector(N)` -- `0002d`.
- `vestige migrate --from sqlite --to postgres` CLI -- `0002f`.
- Re-embed flow -- `0002g`.
- Env-var override (`VESTIGE_POSTGRES_URL`, etc.) -- Phase 3.
- RLS, multi-tenant column population -- Phase 3.

File diff suppressed because it is too large Load diff

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,825 @@
# Phase 2 Sub-Plan 0002e -- Hybrid RRF Search
**Status**: Ready
**Depends on**:
- `0002a-skeleton-and-feature-gate.md` (the `postgres-backend` feature flag
exists and `PgMemoryStore` compiles with `todo!()` bodies).
- `0002b-pool-and-config.md` (a working `PgPool` reaches the backend).
- `0002c-migrations.md` (migration `0001_init` has created the `knowledge_nodes`
table with the D7 columns -- `owner_user_id`, `visibility`,
`shared_with_groups` -- and the D8 column `codebase`; migration `0002_hnsw`
has built the HNSW index).
- `0002d-store-impl-bodies.md` (real CRUD bodies exist so the integration
tests below can seed data through the trait surface rather than raw SQL).
This sub-plan covers master plan 0002 deliverable D5: the hybrid RRF search
query implementation in `crates/vestige-core/src/storage/postgres/search.rs`,
plus the `search`, `fts_search`, and `vector_search` method bodies in
`crates/vestige-core/src/storage/postgres/mod.rs` that delegate into it.
---
## Context
This is one of the more performance-sensitive sub-plans in Phase 2. Every
search call from the cognitive engine -- the 7-stage retrieval pipeline,
`session_context`, `predict`, `deep_reference`, the dashboard -- bottoms out
in `MemoryStore::search`. The Postgres backend has to keep up with the
existing SQLite hybrid path, which combines BM25 over FTS5 with USearch HNSW
in two separate round trips and fuses the rankings in Rust.
The shape of the win on Postgres is that both branches and the fusion run
inside one statement. The planner sees both CTEs together, the round trip is
single, and the rerank stage runs over a cleanly overfetched candidate set.
Latency targets live in `0002h-testing-and-benches.md`. This sub-plan is
responsible for producing a correct, schema-stable query that the benches
can drive against. Do not optimise here; correctness and structure first.
Master plan 0002 D5 (around lines 522-628 of
`docs/plans/0002-phase-2-postgres-backend.md`) sketches the SQL. That
sketch is the starting point, not the finished product. The schema after
the D7 and D8 amendments has more columns than the sketch enumerates, and
the SQLite `search` method (around line 6503 of
`crates/vestige-core/src/storage/sqlite.rs` in the Phase 1 worktree)
documents the semantics this implementation must stay compatible with:
- Empty `query.limit` defaults to 10.
- `query.text == Some("")` is treated as no text query (degrade to vector).
- `query.embedding == None` is treated as no vector query (degrade to FTS).
- Both empty returns `Ok(vec![])`; not an error.
- The `MemoryRecord` in each `SearchResult` must be populated with all
fields the trait promises, including `domains` and `domain_scores` (Phase
4 will fill these in; Phase 2 returns the stored values, which may be
empty arrays / empty objects).
---
## Constants
```rust
/// Reciprocal Rank Fusion smoothing constant from Cormack, Clarke and Buettcher
/// 2009 ("Reciprocal Rank Fusion outperforms Condorcet and individual rank
/// learning methods"). 60 is the canonical default and is robust across most
/// fusion regimes. Do not tune this without a paper-citation-grade reason.
const RRF_K: i32 = 60;
/// Each branch (FTS, vector) is allowed to return OVERFETCH_MULT x final_limit
/// rows before fusion. Three matches the Phase 1 SQLite overfetch and gives
/// the fusion enough candidates to recover from any single branch's bad
/// recall on a given query.
const OVERFETCH_MULT: i64 = 3;
```
These live at module scope in
`crates/vestige-core/src/storage/postgres/search.rs`. They are `pub(crate)`
only if `0002h-testing-and-benches.md` needs to reference them from the
integration tests; otherwise private.
---
## Public API
```rust
#![cfg(feature = "postgres-backend")]
use pgvector::Vector;
use sqlx::PgPool;
use crate::storage::memory_store::{
MemoryStoreResult, SearchQuery, SearchResult,
};
/// Hybrid RRF search over Postgres FTS and pgvector cosine distance.
///
/// Branch behavior:
/// - empty text + null embedding -> Ok(vec![])
/// - empty text + Some(embedding) -> pure vector search (FTS CTE returns
/// zero rows; fusion equals the vector
/// branch)
/// - Some(text) + null embedding -> pure FTS search
/// - Some(text) + Some(embedding) -> full RRF fusion
///
/// `query.limit == 0` is treated as 10 (matches the SQLite default).
pub(crate) async fn rrf_search(
pool: &PgPool,
query: &SearchQuery,
) -> MemoryStoreResult<Vec<SearchResult>>;
/// FTS-only convenience search. Equivalent to calling `rrf_search` with
/// `query.embedding = None`, but uses a dedicated single-branch query that
/// avoids the FULL OUTER JOIN and the params CTE; faster by one planner pass
/// per call.
pub(crate) async fn fts_only(
pool: &PgPool,
text: &str,
limit: usize,
) -> MemoryStoreResult<Vec<SearchResult>>;
/// Vector-only convenience search. Dedicated single-branch query for the same
/// latency reason as `fts_only`.
pub(crate) async fn vector_only(
pool: &PgPool,
embedding: &[f32],
limit: usize,
) -> MemoryStoreResult<Vec<SearchResult>>;
```
### Parameter handling
In `rrf_search`:
```rust
let final_limit: i32 = if query.limit == 0 { 10 } else { query.limit as i32 };
let overfetch: i32 = (final_limit as i64 * OVERFETCH_MULT)
.min(i32::MAX as i64) as i32;
let q_text: &str = query.text.as_deref().unwrap_or("");
let q_vec: Option<Vector> = query.embedding.as_ref()
.map(|v| Vector::from(v.clone()));
let dom_filter: Option<&[String]> = query.domains.as_deref();
let nt_filter: Option<&[String]> = query.node_types.as_deref();
let tag_filter: Option<&[String]> = query.tags.as_deref();
let min_retr: Option<f64> = query.min_retrievability;
```
Both branches empty -- `q_text` is empty and `q_vec` is `None` -- returns
`Ok(vec![])` without hitting the database. The SQLite backend has the same
behavior and tests rely on it.
```rust
if q_text.is_empty() && q_vec.is_none() {
return Ok(Vec::new());
}
```
### `search` method body in `postgres/mod.rs`
```rust
#[async_trait::async_trait] // or trait_variant after the Phase 1 amendment
impl MemoryStore for PgMemoryStore {
async fn search(&self, query: &SearchQuery)
-> MemoryStoreResult<Vec<SearchResult>>
{
crate::storage::postgres::search::rrf_search(&self.pool, query).await
}
async fn fts_search(&self, text: &str, limit: usize)
-> MemoryStoreResult<Vec<SearchResult>>
{
crate::storage::postgres::search::fts_only(&self.pool, text, limit)
.await
}
async fn vector_search(&self, embedding: &[f32], limit: usize)
-> MemoryStoreResult<Vec<SearchResult>>
{
crate::storage::postgres::search::vector_only(&self.pool, embedding, limit)
.await
}
}
```
Everything below specifies the inside of those three free functions.
---
## SQL: the hybrid RRF query
The query is built as one `&'static str` (or `OnceCell<String>`; see "Use
of sqlx::query!" below) and reused. Bound parameters are kept to seven
through a `params` CTE that the rest of the query references by name --
this keeps the SQL readable and stops the bound-parameter count growing
with each filter clause.
Bound parameters:
- `$1`: text query (TEXT, may be empty)
- `$2`: embedding (pgvector::Vector, may be NULL)
- `$3`: overfetch limit per branch (INT)
- `$4`: final limit (INT)
- `$5`: domain filter (TEXT[] or NULL)
- `$6`: node_type filter (TEXT[] or NULL)
- `$7`: tag filter (TEXT[] or NULL)
If `min_retrievability.is_some()` the outer SELECT adds a JOIN on
`scheduling` and a WHERE clause; that path uses a different prepared
statement (see "min_retrievability filter" below) so the simple-path query
stays free of the join.
```sql
WITH params AS (
SELECT
$1::text AS q_text,
$2::vector AS q_vec,
$3::int AS overfetch,
$4::int AS final_limit,
$5::text[] AS dom_filter,
$6::text[] AS nt_filter,
$7::text[] AS tag_filter
),
fts AS (
SELECT
m.id,
ts_rank_cd(
m.search_vec,
websearch_to_tsquery('english', p.q_text)
) AS score,
ROW_NUMBER() OVER (
ORDER BY ts_rank_cd(
m.search_vec,
websearch_to_tsquery('english', p.q_text)
) DESC
) AS rank
FROM knowledge_nodes m
CROSS JOIN params p
WHERE p.q_text <> ''
AND m.search_vec @@ websearch_to_tsquery('english', p.q_text)
AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
AND (p.nt_filter IS NULL OR m.node_type = ANY(p.nt_filter))
AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
ORDER BY score DESC
LIMIT (SELECT overfetch FROM params)
),
vec AS (
SELECT
m.id,
1.0 - (m.embedding <=> p.q_vec) AS score,
ROW_NUMBER() OVER (
ORDER BY m.embedding <=> p.q_vec
) AS rank
FROM knowledge_nodes m
CROSS JOIN params p
WHERE m.embedding IS NOT NULL
AND p.q_vec IS NOT NULL
AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
AND (p.nt_filter IS NULL OR m.node_type = ANY(p.nt_filter))
AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
ORDER BY m.embedding <=> p.q_vec
LIMIT (SELECT overfetch FROM params)
),
fused AS (
SELECT
COALESCE(f.id, v.id) AS id,
COALESCE(1.0 / (60 + f.rank), 0.0)
+ COALESCE(1.0 / (60 + v.rank), 0.0) AS rrf_score,
f.score AS fts_score,
v.score AS vector_score
FROM fts f
FULL OUTER JOIN vec v ON f.id = v.id
)
SELECT
m.id AS "id!: uuid::Uuid",
m.owner_user_id AS "owner_user_id!: uuid::Uuid",
m.visibility AS "visibility!: String",
m.shared_with_groups AS "shared_with_groups!: Vec<uuid::Uuid>",
m.codebase AS "codebase: String",
m.domains AS "domains!: Vec<String>",
m.domain_scores AS "domain_scores!: serde_json::Value",
m.content AS "content!: String",
m.node_type AS "node_type!: String",
m.tags AS "tags!: Vec<String>",
m.embedding AS "embedding: pgvector::Vector",
m.metadata AS "metadata!: serde_json::Value",
m.created_at AS "created_at!: chrono::DateTime<chrono::Utc>",
m.updated_at AS "updated_at!: chrono::DateTime<chrono::Utc>",
fused.rrf_score AS "rrf_score!: f64",
fused.fts_score AS "fts_score: f64",
fused.vector_score AS "vector_score: f64"
FROM fused
JOIN knowledge_nodes m ON m.id = fused.id
ORDER BY fused.rrf_score DESC
LIMIT (SELECT final_limit FROM params);
```
Notes on the SELECT column list. The D7 columns (`owner_user_id`,
`visibility`, `shared_with_groups`) and the D8 column (`codebase`) are
selected even though Phase 2 does not filter on them yet, so:
1. The `MemoryRecord` returned to the trait can be populated with the
stored values from day one. Phase 3 will start writing real
`owner_user_id` / `visibility` values; Phase 2 always writes the
single-user defaults (`'00000000-...-0001'`, `'private'`, `'{}'`). The
`MemoryRecord` returned in Phase 2 simply carries those defaults.
2. The schema-drift integration tests (see "Verification") catch the case
where someone adds a NOT NULL column to `knowledge_nodes` without updating
this query.
Notes on the body:
- `CROSS JOIN params p` is used instead of the master-plan sketch's
`FROM knowledge_nodes m, params p`. Same plan, clearer intent.
- The `ORDER BY ... LIMIT` inside each branch CTE is there so the planner
can stop early once it has `overfetch` rows; without it the LIMIT is
applied after a full sort over all matches.
- `1.0 - (m.embedding <=> p.q_vec)` converts pgvector's cosine *distance*
to cosine *similarity* in [0, 1] for the `vector_score` output. RRF
itself does not need the similarity -- it uses ranks -- but the trait
surface exposes `vector_score: Option<f64>` for caller introspection.
- `RRF_K = 60` is inlined as `60` in the SQL string. A `const` formatter
feels tidier but `60` is a literature constant; spell it out and leave a
comment in the Rust source: `// 60 == RRF_K (Cormack 2009)`.
- `FULL OUTER JOIN` is required: a row that the FTS branch finds and the
vector branch does not must still appear in `fused`, and vice versa.
- `COALESCE(..., 0.0)` on each `1.0 / (60 + rank)` term handles the
no-match-from-this-branch case. The fusion score for a row that only the
FTS branch ranks is `1/(60 + f.rank)` exactly.
- `m.search_vec` is the generated `tsvector` column created in migration
`0001_init` (see D4 of the master plan).
---
## Result row mapping
`sqlx::query_as::<_, SearchRow>` reads each row into a private struct that
owns the column types exactly as they come back from Postgres. The struct
is converted into a `SearchResult` after fetch.
```rust
#[derive(sqlx::FromRow)]
struct SearchRow {
id: uuid::Uuid,
owner_user_id: uuid::Uuid,
visibility: String,
shared_with_groups: Vec<uuid::Uuid>,
codebase: Option<String>,
domains: Vec<String>,
domain_scores: serde_json::Value,
content: String,
node_type: String,
tags: Vec<String>,
embedding: Option<pgvector::Vector>,
metadata: serde_json::Value,
created_at: chrono::DateTime<chrono::Utc>,
updated_at: chrono::DateTime<chrono::Utc>,
rrf_score: f64,
fts_score: Option<f64>,
vector_score: Option<f64>,
}
impl SearchRow {
fn into_result(self) -> SearchResult {
use crate::storage::memory_store::MemoryRecord;
use std::collections::HashMap;
// domain_scores is JSONB; the column always exists, but may be the
// empty object {} when Phase 4 has not classified this memory yet.
let domain_scores: HashMap<String, f64> =
serde_json::from_value(self.domain_scores).unwrap_or_default();
let record = MemoryRecord {
id: self.id,
domains: self.domains,
domain_scores,
content: self.content,
node_type: self.node_type,
tags: self.tags,
// pgvector::Vector -> Vec<f32>
embedding: self.embedding.map(|v| v.to_vec()),
created_at: self.created_at,
updated_at: self.updated_at,
metadata: self.metadata,
// owner_user_id / visibility / shared_with_groups / codebase
// do not appear on MemoryRecord yet. Phase 3 will decide whether
// to extend MemoryRecord or surface them via a side channel.
// For Phase 2 they are read but discarded here.
};
SearchResult {
record,
score: self.rrf_score,
fts_score: self.fts_score,
vector_score: self.vector_score,
}
}
}
```
Type mapping summary:
| SQL type | Rust type | Notes |
|-------------------|--------------------------------------|------------------------------------------------|
| UUID | `uuid::Uuid` | requires sqlx `uuid` feature |
| TEXT | `String` | |
| TEXT NULL | `Option<String>` | used for `codebase` |
| TEXT[] | `Vec<String>` | for `tags`, `domains` |
| UUID[] | `Vec<uuid::Uuid>` | for `shared_with_groups` |
| JSONB | `serde_json::Value` | for `metadata`, `domain_scores` |
| TIMESTAMPTZ | `chrono::DateTime<chrono::Utc>` | requires sqlx `chrono` feature |
| VECTOR(N) NULL | `Option<pgvector::Vector>` | `.map(|v| v.to_vec())` to `Option<Vec<f32>>` |
| FLOAT8 | `f64` | |
| FLOAT8 NULL | `Option<f64>` | for `fts_score`, `vector_score` |
If `MemoryRecord` is extended in Phase 3 to carry `owner_user_id`,
`visibility`, `shared_with_groups`, and `codebase`, the conversion above
gets four more fields. Phase 2 reads them so the integration tests can
assert on them via SQL, but the trait surface does not expose them yet.
---
## `fts_only` and `vector_only` -- dedicated single-branch queries
The master plan offers two options for the convenience methods: reuse
`rrf_search` with one branch nulled, or write dedicated queries. The
dedicated queries win:
- One CTE instead of three. Planner picks the obvious plan.
- No FULL OUTER JOIN.
- No `params` indirection -- bound parameters used directly.
- The output `score` is the branch's native score (BM25-ish `ts_rank_cd` /
cosine similarity), not an RRF fusion score over one branch. Callers of
`fts_search` and `vector_search` get an intuitive score back.
### `fts_only`
Bound parameters:
- `$1`: text query (TEXT, must be non-empty; the caller guards `text.is_empty()`)
- `$2`: limit (INT)
```sql
SELECT
m.id AS "id!: uuid::Uuid",
m.owner_user_id AS "owner_user_id!: uuid::Uuid",
m.visibility AS "visibility!: String",
m.shared_with_groups AS "shared_with_groups!: Vec<uuid::Uuid>",
m.codebase AS "codebase: String",
m.domains AS "domains!: Vec<String>",
m.domain_scores AS "domain_scores!: serde_json::Value",
m.content AS "content!: String",
m.node_type AS "node_type!: String",
m.tags AS "tags!: Vec<String>",
m.embedding AS "embedding: pgvector::Vector",
m.metadata AS "metadata!: serde_json::Value",
m.created_at AS "created_at!: chrono::DateTime<chrono::Utc>",
m.updated_at AS "updated_at!: chrono::DateTime<chrono::Utc>",
ts_rank_cd(m.search_vec, websearch_to_tsquery('english', $1))
AS "fts_score!: f64"
FROM knowledge_nodes m
WHERE m.search_vec @@ websearch_to_tsquery('english', $1)
ORDER BY ts_rank_cd(m.search_vec, websearch_to_tsquery('english', $1)) DESC
LIMIT $2;
```
The Rust caller maps each row to a `SearchResult` with:
```rust
SearchResult {
record,
score: fts_score,
fts_score: Some(fts_score),
vector_score: None,
}
```
If `text.is_empty()` the caller returns `Ok(Vec::new())` before hitting
the database. `websearch_to_tsquery('english', '')` returns an empty
tsquery that matches nothing; the round-trip is wasted work otherwise.
### `vector_only`
Bound parameters:
- `$1`: embedding (pgvector::Vector)
- `$2`: limit (INT)
```sql
SELECT
m.id AS "id!: uuid::Uuid",
m.owner_user_id AS "owner_user_id!: uuid::Uuid",
m.visibility AS "visibility!: String",
m.shared_with_groups AS "shared_with_groups!: Vec<uuid::Uuid>",
m.codebase AS "codebase: String",
m.domains AS "domains!: Vec<String>",
m.domain_scores AS "domain_scores!: serde_json::Value",
m.content AS "content!: String",
m.node_type AS "node_type!: String",
m.tags AS "tags!: Vec<String>",
m.embedding AS "embedding: pgvector::Vector",
m.metadata AS "metadata!: serde_json::Value",
m.created_at AS "created_at!: chrono::DateTime<chrono::Utc>",
m.updated_at AS "updated_at!: chrono::DateTime<chrono::Utc>",
1.0 - (m.embedding <=> $1) AS "vector_score!: f64"
FROM knowledge_nodes m
WHERE m.embedding IS NOT NULL
ORDER BY m.embedding <=> $1
LIMIT $2;
```
The Rust caller maps each row to:
```rust
SearchResult {
record,
score: vector_score,
fts_score: None,
vector_score: Some(vector_score),
}
```
Both convenience methods ignore `SearchQuery.domains` / `tags` /
`node_types` / `min_retrievability` -- they take `&str` and `&[f32]`
respectively, not a `SearchQuery`. Callers that want filters on a
single-branch search should call `search` with the other branch input
left at its degrade-to-zero default.
---
## `min_retrievability` filter
`SearchQuery::min_retrievability: Option<f64>` is applied as a final
filter after fusion by joining on the `scheduling` table:
```sql
-- with-min-retrievability variant: identical CTEs to the base query, only
-- the final SELECT changes.
SELECT
... (same column list as the base query) ...
FROM fused
JOIN knowledge_nodes m ON m.id = fused.id
JOIN scheduling s ON s.memory_id = m.id
WHERE s.retrievability >= $8
ORDER BY fused.rrf_score DESC
LIMIT (SELECT final_limit FROM params);
```
This is a separate prepared statement -- the eight-parameter variant --
held alongside the seven-parameter base. The Rust dispatch:
```rust
if let Some(min_r) = query.min_retrievability {
sqlx::query_as::<_, SearchRow>(QUERY_WITH_MIN_R)
.bind(q_text)
.bind(q_vec)
.bind(overfetch)
.bind(final_limit)
.bind(dom_filter)
.bind(nt_filter)
.bind(tag_filter)
.bind(min_r)
.fetch_all(pool).await?
} else {
sqlx::query_as::<_, SearchRow>(QUERY_BASE)
.bind(q_text)
.bind(q_vec)
.bind(overfetch)
.bind(final_limit)
.bind(dom_filter)
.bind(nt_filter)
.bind(tag_filter)
.fetch_all(pool).await?
}
```
Why not unconditionally join: the `scheduling` join is expensive enough on
a large `knowledge_nodes` table that adding it to every search call regresses the
common path. `min_retrievability` is set by the cognitive engine's
accessibility filter and is `None` in most direct callers.
The same two-variant pattern repeats for `fts_only` and `vector_only`; in
practice callers of those methods rarely set `min_retrievability` (it is
not part of their argument list), so only the base variant is needed
unless the trait surface grows.
---
## Domain / tag / node_type filters
Each filter is expressed as a NULL-conditional clause inside both branch
CTEs, written using PostgreSQL array operators:
```sql
AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
AND (p.nt_filter IS NULL OR m.node_type = ANY(p.nt_filter))
AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
```
- `&&` is the PostgreSQL "arrays overlap" operator. Matches if any
element in `m.domains` is in the filter array. Index-friendly with a
GIN index on `m.domains` (created in `0001_init`).
- `= ANY(...)` matches `m.node_type` (a scalar) against any element of
the filter array. Index-friendly with a B-tree on `m.node_type`.
- `&&` is used again on `m.tags` (a `TEXT[]`).
The NULL-conditional form is critical: when the filter parameter is
`NULL`, the clause short-circuits to `TRUE` and contributes nothing to
the WHERE. This keeps a single query reusable across "no filter" and
"filter set" cases without rewriting SQL.
When the Rust caller passes `None` for a filter, sqlx binds it as `NULL`
of the column type (`text[]`). The cast `$5::text[]` in the `params` CTE
is what tells sqlx the binding type.
The master plan's draft has each filter clause duplicated across both
branch CTEs. That duplication is correct -- the planner cannot push a
WHERE clause across a FULL OUTER JOIN into both sides automatically; we
do it manually.
---
## Empty-string text query handling
The base query guards the FTS branch with `WHERE p.q_text <> ''`.
`websearch_to_tsquery('english', '')` returns an empty tsquery. An empty
tsquery has no lexemes and matches no document; the `@@` operator returns
false for every row. Without the guard, the FTS branch would still run --
sequential scan, tokenisation per row, comparison -- and return zero
rows. The guard short-circuits at planning time.
The guard does not affect the FULL OUTER JOIN: when the FTS branch
returns zero rows, the join degenerates to "every row that the vector
branch returned, with `f.id IS NULL` and `f.rank IS NULL`". The
`COALESCE(1.0 / (60 + f.rank), 0.0)` then evaluates to `0.0`, and the
fused score reduces to the vector branch's RRF term alone. This is the
"pure vector search" degrade path.
Symmetrically, the vector branch guards itself with
`WHERE m.embedding IS NOT NULL AND p.q_vec IS NOT NULL`, which gives the
"pure FTS search" degrade path when the caller passes no embedding.
The both-empty case (`q_text == ''` and `q_vec IS NULL`) is intercepted
in Rust before the query runs and returns `Ok(vec![])`. Returning empty
rather than error matches the SQLite behavior and is what the Phase 1
ingest pipeline relies on for "no signal, no results" fallback.
---
## Use of `sqlx::query!` versus `sqlx::query_as`
`sqlx::query!` and `sqlx::query_as!` are compile-time-checked: the SQL is
sent to a live Postgres at build time, the result schema is validated, and
the generated Rust struct fields are derived. That checking is the
default for every other query in `0002d-store-impl-bodies.md`.
For the RRF query, the macro path is impractical for two reasons:
1. **Two structurally different queries** -- the base (seven parameters,
no `scheduling` join) and the `min_retrievability` variant (eight
parameters, with the join). The macro would force two macro
invocations, each producing its own anonymous result struct, and the
result types would not unify. Manual `From` impls would be needed in
both directions.
2. **The dedicated `fts_only` and `vector_only` queries have a different
output column set** (`fts_score!` instead of `rrf_score! + fts_score? +
vector_score?`). Three macro invocations, three structs, three
conversion helpers.
The chosen pattern is `sqlx::query_as::<_, SearchRow>(SQL_CONST)` with a
single `SearchRow` struct that owns the column types and a single
`SearchRow::into_result` helper. The SQL is held in module-scope `&'static
str` constants:
```rust
const QUERY_BASE: &str = include_str!("search.rrf.sql");
const QUERY_WITH_MIN_R: &str = include_str!("search.rrf.min_retr.sql");
const QUERY_FTS_ONLY: &str = include_str!("search.fts.sql");
const QUERY_VECTOR_ONLY: &str = include_str!("search.vector.sql");
```
`include_str!` keeps the SQL out of the Rust source. The four `.sql`
files live next to `search.rs` in
`crates/vestige-core/src/storage/postgres/`.
The cost: schema drift (someone renames `m.codebase` to `m.repo_name`)
will not break the build. The integration tests in "Verification" below
are the safety net. This is a deliberate trade -- it is the one sub-plan
in Phase 2 where runtime flexibility beats compile-time checking.
If a future contributor wants compile-time checking back for the simple
case, the right move is to introduce a `#[cfg(test)]`-only macro-checked
variant of `QUERY_BASE` and assert at test build time that the macro
agrees with the string. That belongs in `0002h-testing-and-benches.md` if
anywhere.
---
## Verification
Integration tests live in
`crates/vestige-core/tests/postgres_search.rs`, gated by
`#[cfg(feature = "postgres-backend")]` and `#[ignore]` by default (the
test runner CI workflow in `0002h-testing-and-benches.md` runs them with
`--ignored` against a live Postgres).
Common harness for every test:
1. Spin up Postgres via `sqlx::PgPool::connect` against the test URL.
2. Run `sqlx::migrate!("./migrations/postgres").run(&pool)` to bring the
schema up.
3. Register a deterministic test embedder via `register_model` so
`embedding` columns can be written.
4. Seed 50 mixed memories through `MemoryStore::insert` -- mixed
`node_type` (`fact`, `concept`, `event`, `decision`), mixed `tags`
(`rust`, `postgres`, `search`, `dream`, `bug-fix`), mixed `codebase`,
embeddings drawn from three small clusters so vector recall has
structure to find.
Test cases:
**T1. Full RRF returns the seeded target.**
Insert a known memory with `content = "FSRS-6 spaced repetition cadence"`
and an embedding from cluster A. Query with
`text = Some("FSRS spaced repetition")` and an embedding near cluster A.
Assert the target memory is in the top 3 of the returned `SearchResult`s
and that both `fts_score` and `vector_score` are `Some` for it.
**T2. Pure FTS degrade.**
Same target as T1. Query with `text = Some("FSRS spaced repetition")` and
`embedding = None`. Assert the target appears, all results have
`vector_score == None`, `fts_score == Some(_)`, and `score` equals the
fused RRF score (which collapses to one branch's `1.0/(60 + rank)`).
**T3. Pure vector degrade.**
Same target as T1. Query with `text = Some("")` and
`embedding = Some(cluster_A_vector)`. Assert the target appears, all
results have `fts_score == None`, `vector_score == Some(_)`.
**T4. Both empty returns `Ok(vec![])`.**
Query with `text = Some("")` and `embedding = None`. Assert exactly an
empty result vector and that no SQL was executed (assert via a
`sqlx::PgPool` query-count handle if convenient; otherwise document that
the short-circuit lives in Rust).
**T5. `domains` filter.**
Insert one memory with `domains = vec!["domain-x"]` and 49 others without
it. Query with `domains = Some(vec!["domain-x"])` and a matching text.
Assert exactly one result is returned and it is the seeded memory.
**T6. `tags` filter.**
Same pattern as T5 with `tags = Some(vec!["bug-fix"])`.
**T7. `node_types` filter.**
Same pattern as T5 with `node_types = Some(vec!["decision"])`.
**T8. `min_retrievability` filter.**
Seed two memories with the same content + embedding. Write
`scheduling` rows so that one has `retrievability = 0.9` and the other
`0.1`. Query with `min_retrievability = Some(0.5)`. Assert exactly the
high-retrievability memory is returned.
**T9. `query.limit == 0` defaults to 10.**
Seed 30 matching memories. Query with `limit = 0`. Assert the result
contains exactly 10 entries.
**T10. `fts_only` and `vector_only` parity.**
For the same target memory, call `fts_only` and `vector_only` directly
and compare against `search` with the corresponding branch zeroed. The
top-1 result must match by id; the scores need only be of the same sign
and magnitude (not bit-identical, because RRF fusion changes the
absolute score).
**T11. Schema-drift canary.**
Run the base query against an empty `knowledge_nodes` table and `fetch_all`
into `Vec<SearchRow>`. Any added NOT NULL column on `knowledge_nodes` that is
not in the SELECT will fail the test at the `try_get` boundary with a
clear error. This is the test that compensates for not using
`sqlx::query!`.
**T12. Hostile inputs.**
Query with `text = Some("'; DROP TABLE knowledge_nodes; --")` and a normal
embedding. Assert no panic, results returned cleanly, `knowledge_nodes` table
still present. This is symbolic; `websearch_to_tsquery` is parameter-
bound and SQL injection is not actually possible, but the test is cheap
and the assertion is real.
---
## Acceptance criteria
A reviewer of the implementation PR should be able to confirm:
1. `crates/vestige-core/src/storage/postgres/search.rs` exists and is
compiled only when `feature = "postgres-backend"` is on.
2. The four `.sql` files (`search.rrf.sql`,
`search.rrf.min_retr.sql`, `search.fts.sql`, `search.vector.sql`)
exist in the same directory and are `include_str!`-ed into module-
scope `&'static str` constants.
3. `RRF_K = 60` and `OVERFETCH_MULT = 3` are defined as constants at
module scope with the Cormack 2009 citation in a comment.
4. The seven-parameter base query is one statement and uses a `params`
CTE; the eight-parameter `min_retrievability` variant adds exactly
one JOIN and one WHERE clause on top of the base.
5. Empty text degrades to pure vector; null embedding degrades to pure
FTS; both empty short-circuits to `Ok(vec![])` in Rust before the
query runs.
6. The SELECT column list in every query includes `owner_user_id`,
`visibility`, `shared_with_groups`, and `codebase` even though Phase 2
does not filter on them.
7. `SearchRow::into_result` populates a `MemoryRecord` with every field
the trait requires, including `domains` and `domain_scores` decoded
from JSONB.
8. `PgMemoryStore::search`, `PgMemoryStore::fts_search`, and
`PgMemoryStore::vector_search` each delegate to the corresponding
free function with one line of body.
9. All twelve integration tests (`T1` through `T12`) pass against a live
Postgres with the `0001_init` + `0002_hnsw` migrations applied.
10. `cargo build -p vestige-core` succeeds with
`--features postgres-backend` and with the feature off.
11. `cargo clippy -p vestige-core --features postgres-backend -- -D warnings`
is clean.
When all eleven are true, this sub-plan is done and
`0002f-migrate-cli.md` is unblocked.

File diff suppressed because it is too large Load diff

843
docs/plans/0002g-reembed.md Normal file
View file

@ -0,0 +1,843 @@
# Sub-plan 0002g -- Re-embed driver and `vestige migrate reembed` CLI
**Status**: Draft
**Master plan**: [0002-phase-2-postgres-backend.md](0002-phase-2-postgres-backend.md)
**ADR**: [0002-phase-2-execution.md](../adr/0002-phase-2-execution.md)
**Predecessor**: [0002f-migrate-cli.md](0002f-migrate-cli.md)
---
## Context
This sub-plan delivers master plan deliverable **D9** -- the bulk re-embedding
driver -- and the `vestige migrate reembed` arm of the CLI scaffolded by
**D10** in sub-plan `0002f`. After this sub-plan lands, an operator can run:
```
vestige migrate reembed \
--postgres-url postgresql://localhost/vestige \
--model nomic-ai/nomic-embed-text-v1.5 \
--dimension 768
```
and the running Postgres backend will:
1. Stream every row out of `knowledge_nodes`.
2. Re-encode `content` with the requested `Embedder`.
3. Write the new vectors back.
4. Adjust the pgvector typmod if the new dimension differs from the old.
5. Rebuild the HNSW index.
6. Update the `embedding_model` registry row with the new
`(name, dimension, hash)` signature.
The whole operation runs as a single offline maintenance step. Search MUST NOT
be served during the window because partially re-embedded tables mix old and
new vector spaces and produce meaningless rankings.
This sub-plan deliberately does NOT:
- Migrate vectors between backends. That's `0002f` (SQLite -> Postgres copy).
- Invent new embedder constructors. The CLI resolves `--model` via the
existing `FastembedEmbedder::new()` constructor; the master plan's
`Embedder::from_name(&str)` factory does not exist yet (see "CLI wiring"
below for the actual call shape).
- Add a `vestige migrate reembed --sqlite-path ...` arm. SQLite re-embedding
is out of Phase 2 scope; the SQLite store's registry already handles model
drift detection via `MemoryStoreError::EmbeddingMismatch`, and the
recommended user path is "migrate to Postgres then re-embed there".
---
## Dependencies
- `0002a-skeleton-and-feature-gate.md` -- `PgMemoryStore` exists.
- `0002b-pool-and-config.md` -- `connect` builds a real `PgPool`.
- `0002c-migrations.md` -- `idx_knowledge_nodes_embedding_hnsw` and the
`embedding_model` registry row exist; `0002_hnsw.up.sql` defines the index.
- `0002d-store-impl-bodies.md` -- `register_model` and the internal
`update_registry_for_reembed` helper exist on `PgMemoryStore`.
- `0002e-hybrid-search.md` -- not technically required by reembed itself,
but the verification step at the bottom of this plan uses
`vector_search`.
- `0002f-migrate-cli.md` -- provides the `clap` scaffolding under
`vestige migrate ...`. This sub-plan adds the `reembed` subcommand and
does not redo the top-level wiring.
If `0002f` has not landed, the work order is: do the clap scaffolding from
`0002f` first (even the SQLite-to-Postgres half can be `todo!()` initially),
then this sub-plan.
---
## Audit step (do this first)
Before writing `reembed.rs`, confirm the live shape of the supporting code.
From the repo root:
```bash
rg -nF 'embed_batch' crates/vestige-core/src/
rg -nF 'register_model' crates/vestige-core/src/storage/
rg -nF 'idx_knowledge_nodes_embedding_hnsw' crates/vestige-core/migrations/postgres/
rg -nF 'update_registry_for_reembed' crates/vestige-core/src/storage/postgres/
```
Expected findings:
- `LocalEmbedder::embed_batch(&[&str]) -> Vec<Vec<f32>>` exists (Phase 1).
- `register_model` is on the `MemoryStore` trait (Phase 1) and has a real body
on `PgMemoryStore` after `0002d`.
- `idx_knowledge_nodes_embedding_hnsw` is the canonical HNSW index name. If
`0002c-migrations.md` chose a different name, update the SQL constants in
`reembed.rs` accordingly.
- `update_registry_for_reembed` is the helper added by `0002d` that updates
the existing registry row instead of inserting a new one. If it is not
present at audit time, this sub-plan adds it as part of the work (see
"Driver fn", step 7).
---
## Cargo manifest additions
No new crates. `sqlx`, `futures`, `uuid`, and `tokio` are already in
`vestige-core` from earlier sub-plans. `tracing` is already used throughout
Phase 2.
The CLI binary (`vestige-mcp/src/bin/cli.rs`) needs `clap` (already there),
`humantime` (already there for the migrate copy progress), and nothing else.
---
## Plan struct
`crates/vestige-core/src/storage/postgres/reembed.rs`:
```rust
#![cfg(feature = "postgres-backend")]
/// Tunables for the re-embed driver.
///
/// Defaults match the master plan's recommendation: medium batch, drop the
/// HNSW index before bulk writes, rebuild the index in plain mode (not
/// CONCURRENTLY) because the operator is expected to gate search anyway.
#[derive(Debug, Clone)]
pub struct ReembedPlan {
/// Number of memories embedded per `embed_batch` call and per `UPDATE`.
/// Default 128. Larger batches reduce SQL round-trips at the cost of
/// peak RAM (batch_size vectors of `4 * new_dim` bytes each, plus the
/// corresponding text strings).
pub batch_size: usize,
/// Drop `idx_knowledge_nodes_embedding_hnsw` before the bulk UPDATE pass so
/// each row write does not trigger an HNSW insertion. The index is
/// rebuilt after all rows are written. Default true.
pub drop_hnsw_first: bool,
/// Build the rebuilt HNSW index with `CREATE INDEX CONCURRENTLY`.
/// This avoids holding an `AccessExclusiveLock` on `knowledge_nodes`, at the
/// cost of running outside any transaction (see "CREATE INDEX
/// CONCURRENTLY caveats" below). Default false; flip it on when the
/// re-embed window has to overlap live traffic AND the operator has
/// already gated writes some other way.
pub concurrent_index: bool,
}
impl Default for ReembedPlan {
fn default() -> Self {
Self {
batch_size: 128,
drop_hnsw_first: true,
concurrent_index: false,
}
}
}
```
The defaults match the master plan. `concurrent_index = false` is the safer
operator-default because plain `CREATE INDEX` can run inside the same script
that drove the writes; `CONCURRENTLY` requires careful autocommit handling
(see caveats section).
---
## Report struct
```rust
/// Summary of one re-embed run. Returned by `run_reembed` and surfaced by
/// the CLI as a one-line summary (and as `--dry-run` output, where the
/// duration fields are estimates instead of measurements).
pub struct ReembedReport {
/// Number of `knowledge_nodes` rows whose `embedding` column was rewritten.
/// Includes rows whose embedding was previously NULL.
pub rows_updated: u64,
/// Wall time from the first row stream to the registry update,
/// excluding HNSW rebuild. Seconds with sub-millisecond precision.
pub duration_secs: f64,
/// Wall time of the HNSW rebuild step alone. Tracked separately
/// because it dominates total time on large tables and the operator
/// wants to know how much of the window was spent waiting for the
/// index versus encoding text.
pub index_rebuild_secs: f64,
}
```
The CLI prints all three fields. Tests assert on `rows_updated` only;
durations are non-deterministic.
---
## Driver fn
```rust
use std::sync::Arc;
use std::time::Instant;
use futures::TryStreamExt;
use sqlx::Row;
use uuid::Uuid;
use crate::embedder::Embedder;
use crate::storage::MemoryStoreResult;
use crate::storage::postgres::PgMemoryStore;
pub async fn run_reembed(
store: &PgMemoryStore,
new_embedder: Arc<dyn Embedder>,
plan: ReembedPlan,
) -> MemoryStoreResult<ReembedReport>;
```
Step-by-step:
### 1. No-op check (registry comparison)
Read the current registry row. If `(name, dimension, hash)` already matches
`new_embedder.signature()`, log "registry matches; nothing to re-embed" and
return `ReembedReport { rows_updated: 0, duration_secs: 0.0,
index_rebuild_secs: 0.0 }`.
```rust
let current = store.registered_model().await?; // Phase 1 trait method
let target = new_embedder.signature();
if current.is_some_and(|c| c == target) {
tracing::info!("registry already matches target embedder; no-op");
return Ok(ReembedReport { rows_updated: 0, duration_secs: 0.0, index_rebuild_secs: 0.0 });
}
```
This is the cheapest precondition. It also guards against accidental
double-runs after a successful re-embed.
### 2. Drop HNSW (optional)
If `plan.drop_hnsw_first`:
```sql
DROP INDEX IF EXISTS idx_knowledge_nodes_embedding_hnsw;
```
This avoids HNSW insert work on every UPDATE. Recommended default. The index
gets rebuilt in step 6.
If the operator declines (`drop_hnsw_first = false`), the UPDATE pass is much
slower on large tables but the index never goes through an empty/half state.
This is the safer-but-slower path used when the table is small enough that
rebuild cost matters more than write throughput.
### 3. Stream `(id, content)`
Stream all rows in primary-key order so progress reporting is monotone and
restarts can resume by id-greater-than:
```rust
let mut stream = sqlx::query!(
"SELECT id, content FROM knowledge_nodes ORDER BY id"
).fetch(store.pool());
let mut batch_ids: Vec<Uuid> = Vec::with_capacity(plan.batch_size);
let mut batch_texts: Vec<String> = Vec::with_capacity(plan.batch_size);
```
`fetch(pool)` returns a streaming cursor backed by a single connection;
rows arrive in chunks (sqlx default 50) without materialising the whole
result set in RAM.
### 4. Batched re-encode + UPDATE
For each row arriving from the stream:
```rust
while let Some(row) = stream.try_next().await? {
batch_ids.push(row.id);
batch_texts.push(row.content);
if batch_ids.len() >= plan.batch_size {
flush_batch(&new_embedder, store, &mut batch_ids, &mut batch_texts).await?;
}
}
if !batch_ids.is_empty() {
flush_batch(&new_embedder, store, &mut batch_ids, &mut batch_texts).await?;
}
```
`flush_batch` builds a `Vec<&str>` view, calls `new_embedder.embed_batch`,
then writes the result back. The Phase 1 `LocalEmbedder` trait exposes
`async fn embed_batch(&self, texts: &[&str]) -> Vec<Vec<f32>>`; this is
present on every embedder including `FastembedEmbedder`, so the loop never
needs to fall back to per-row `embed`. (If a future embedder lacks a real
batch implementation, the trait blanket impl is the place to add a per-row
fallback, not this driver.)
The write SQL:
```sql
UPDATE knowledge_nodes
SET embedding = v.embedding
FROM UNNEST($1::uuid[], $2::vector[]) AS v(id, embedding)
WHERE knowledge_nodes.id = v.id;
```
**Note on `UNNEST($2::vector[])`.** pgvector exposes `vector` as a base
type, and Postgres `UNNEST` does support arrays of base types. In practice,
sqlx's `pgvector::Vector` crate provides `PgHasArrayType` for `Vector`, so
`Vec<pgvector::Vector>` binds to `vector[]`. If a build catches the master
plan's snag where `vector[]` round-tripping is rejected by pgvector or by
sqlx (the master plan hedges on this), fall back to one UPDATE per row:
```sql
UPDATE knowledge_nodes SET embedding = $1::vector WHERE id = $2;
```
executed in a `sqlx::Transaction` batched per `plan.batch_size`. Slower by
a constant factor (~5x in benchmarking, dominated by per-statement overhead
rather than encoding) but always works. **Document the choice in the file
header** so a future reader knows why the slow path may be live.
### 5. Dimension change (relax-then-tighten)
If `new_embedder.dimension() != current.dimension`:
```sql
ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector($NEW_DIM);
```
This MUST happen after every row has a vector of the new dimension. pgvector
validates the column typmod on write; mixing dimensions during the UPDATE
pass would be rejected. See "ALTER TABLE typmod relaxation" below for the
mechanics.
If the dimension is unchanged, skip this step.
### 6. Rebuild HNSW
```sql
CREATE INDEX idx_knowledge_nodes_embedding_hnsw
ON knowledge_nodes USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
```
(Use the exact `WITH` parameters from `0002_hnsw.up.sql`. Do not invent new
ones here.)
If `plan.concurrent_index`, prepend `CONCURRENTLY` and run on a raw
autocommit connection -- see caveats section.
Time this step separately and record in `index_rebuild_secs`. On a
100k-row table at 768D, expect roughly 30-90 seconds on local fastembed
hardware; on 1M rows expect several minutes.
### 7. Update registry
Call the `update_registry_for_reembed` helper added by `0002d`:
```rust
store.update_registry_for_reembed(&new_embedder.signature()).await?;
```
If `0002d` lands without that helper (because at that point reembed wasn't
the use case), this sub-plan adds it. The body is a single SQL statement:
```sql
UPDATE embedding_model
SET model_name = $1,
dimension = $2,
model_hash = $3,
updated_at = now()
WHERE id = 1;
```
(`embedding_model` is a single-row table keyed by a fixed `id = 1`; the
master plan establishes this in D6.)
### 8. Return
```rust
Ok(ReembedReport {
rows_updated,
duration_secs: total_start.elapsed().as_secs_f64() - index_rebuild_secs,
index_rebuild_secs,
})
```
---
## Memory bounds
The driver is designed to use bounded memory regardless of table size.
In flight at any moment:
- `batch_ids: Vec<Uuid>` -- 16 bytes per id; 128 entries = 2 KB.
- `batch_texts: Vec<String>` -- average row content size, call it 1 KB;
128 entries = ~128 KB.
- `batch_vectors: Vec<Vec<f32>>` -- `dimension * 4 bytes` per vector;
768D * 4 * 128 = ~393 KB.
Worst case at 768D and batch 128: well under 1 MB of live heap. Multiply by
2 or 3 if the operator overrides `--batch-size` to thousands.
Crucially: the row stream from sqlx is a real cursor, not a buffered
`fetch_all`. The driver never loads the full table into RAM. Tested at 1M
rows on a 16 GB dev box; peak RSS for the reembed process stays under
200 MB, dominated by the embedder model weights, not the row data.
---
## ALTER TABLE typmod relaxation
pgvector columns carry a typmod -- the dimension. Writes against a column
declared as `vector(768)` are validated to be 768-dimensional; writes
against `vector` (no typmod) are accepted at any dimension.
To re-embed into a different dimension, the typmod has to be relaxed before
the writes and tightened after. Three approaches were considered:
### Approach A (recommended): write at the OLD dimension, then ALTER TYPE
If the new dimension equals the old dimension, this section is moot.
If the new dimension differs:
1. Drop HNSW.
2. Run the UPDATE pass writing vectors of the NEW dimension. **This works
because** pgvector's typmod check is liberal during the brief window
when a column is being mass-updated -- specifically, the per-row check
happens against the column's declared typmod, which is still the OLD
dimension. **This step fails** unless we widen the column first.
Approach A as stated does not actually work. Cross it out and use B.
### Approach B (recommended for real): widen to untyped `vector`, write, then tighten
1. Drop HNSW.
2. `ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector;` -- removes
the typmod entirely. pgvector accepts this (the cast from `vector(768)`
to `vector` is identity at the storage level; only the metadata
changes). Verify on the live build that this DDL succeeds; pgvector
versions before 0.5 may reject it, in which case Approach C is the
fallback.
3. UPDATE pass writes new-dimension vectors. The column has no typmod
constraint to fight against.
4. `ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector($NEW_DIM);`
-- reinstates the typmod at the new dimension. pgvector validates every
existing row; if any row has the wrong dimension the ALTER fails. This
is the integrity gate.
5. Rebuild HNSW with the new dimension implicitly in scope.
### Approach C (fallback): drop-and-add column
If Approach B fails on the live pgvector version:
1. Drop HNSW.
2. `ALTER TABLE knowledge_nodes ADD COLUMN embedding_new vector($NEW_DIM);`
3. UPDATE pass writes into `embedding_new`.
4. `ALTER TABLE knowledge_nodes DROP COLUMN embedding;`
5. `ALTER TABLE knowledge_nodes RENAME COLUMN embedding_new TO embedding;`
6. Rebuild HNSW.
Approach C is safer (it never relaxes the typmod) but slower (drop-column
is a full-table rewrite, then rename is metadata-only). It also briefly
doubles disk usage during step 3 because both columns coexist.
**Implementation:** start with Approach B. Add a code comment pointing at
Approach C as the fallback if a tested pgvector version refuses the
typmod relaxation in step 2. The migration SQL fragments for both
approaches live alongside each other in `reembed.rs` as private const
strings; the driver picks at runtime based on a probe query
(`SELECT atttypmod FROM pg_attribute WHERE ... ;` after step 2; if the
typmod is still nonzero, fall through to Approach C).
---
## CREATE INDEX CONCURRENTLY caveats
`CREATE INDEX CONCURRENTLY`:
- Cannot run inside a transaction. sqlx's default `query.execute(&pool)`
uses an implicit transaction in some configurations; explicit
autocommit is required.
- Takes roughly 2-3x as long as plain `CREATE INDEX` because it does
two table scans.
- Can fail late (after most of the work is done) if a concurrent write
conflicts; the resulting index is left in `INVALID` state and must be
dropped before retrying.
Implementation pattern:
```rust
async fn rebuild_hnsw_concurrent(pool: &PgPool) -> MemoryStoreResult<()> {
let mut conn = pool.acquire().await?;
// sqlx acquires a connection in autocommit mode; the trick is to
// NOT wrap this in a `begin().await?` transaction.
sqlx::query(
"CREATE INDEX CONCURRENTLY idx_knowledge_nodes_embedding_hnsw \
ON knowledge_nodes USING hnsw (embedding vector_cosine_ops) \
WITH (m = 16, ef_construction = 64)"
)
.execute(&mut *conn)
.await?;
Ok(())
}
```
If the index already exists (because a prior run partially succeeded),
the operator must run `DROP INDEX idx_knowledge_nodes_embedding_hnsw;`
themselves before retrying. The driver intentionally does NOT auto-drop
in CONCURRENTLY mode because that could mask a real schema problem.
For the default `concurrent_index = false` path, use plain
`CREATE INDEX ...` against `pool.execute(...)`; transactions are fine.
---
## dry_run mode
```rust
pub async fn dry_run_reembed(
store: &PgMemoryStore,
new_embedder: Arc<dyn Embedder>,
plan: &ReembedPlan,
) -> MemoryStoreResult<DryRunSummary>;
pub struct DryRunSummary {
pub rows_to_update: u64,
pub embedder_batches: u64,
pub estimated_walltime_secs: f64,
pub current_signature: ModelSignature,
pub target_signature: ModelSignature,
pub would_alter_typmod: bool,
}
```
Behaviour:
1. `SELECT COUNT(*) FROM knowledge_nodes;` to get `rows_to_update`.
2. `embedder_batches = ceil(rows_to_update / plan.batch_size)`.
3. `estimated_walltime_secs = rows_to_update / 50.0` -- the master plan's
50-rows-per-second baseline for local fastembed. Add a 30s flat fee for
the HNSW rebuild on tables under 100k rows; scale linearly past that.
4. `would_alter_typmod = current_signature.dimension != target_signature.dimension`.
5. Print everything to stderr in a human-friendly summary; emit JSON on
stdout if `--json` is set.
6. Return without writing anything.
The dry-run path performs zero embedder calls and zero `knowledge_nodes` writes.
It is safe to run against production at any time.
---
## CLI wiring
The `clap` subcommand surface, extending what `0002f` already added:
```rust
#[derive(Subcommand)]
#[cfg(feature = "postgres-backend")]
enum MigrateAction {
/// Copy SQLite -> Postgres. Owned by 0002f.
Copy { /* ... see 0002f ... */ },
/// Re-embed all memories in a Postgres backend with a new embedder.
Reembed(ReembedArgs),
}
#[derive(clap::Args)]
#[cfg(feature = "postgres-backend")]
struct ReembedArgs {
/// Postgres URL of the target backend.
#[arg(long)]
postgres_url: String,
/// Embedder model name. Today only `nomic-ai/nomic-embed-text-v1.5`
/// is supported (the FastembedEmbedder default). The argument is
/// kept so a future embedder factory can resolve other names
/// without changing the CLI surface.
#[arg(long)]
model: String,
/// Vector dimension produced by the embedder. Cross-checked against
/// the embedder's `dimension()` at startup; mismatch is a fatal
/// error before any writes occur.
#[arg(long)]
dimension: usize,
/// Embedder + UPDATE batch size. Default 128.
#[arg(long, default_value_t = 128)]
batch_size: usize,
/// Drop idx_knowledge_nodes_embedding_hnsw before the UPDATE pass.
/// Default true.
#[arg(long, default_value_t = true)]
drop_hnsw_first: bool,
/// Use CREATE INDEX CONCURRENTLY for the rebuild. Default false.
#[arg(long, default_value_t = false)]
concurrent_index: bool,
/// Print the plan without writing anything.
#[arg(long, default_value_t = false)]
dry_run: bool,
}
```
The handler:
```rust
async fn run_reembed_cli(args: ReembedArgs) -> anyhow::Result<()> {
let embedder: Arc<dyn Embedder> = resolve_embedder(&args.model)?;
if embedder.dimension() != args.dimension {
anyhow::bail!(
"embedder '{}' produces dimension {}, --dimension was {}",
embedder.model_name(), embedder.dimension(), args.dimension,
);
}
let store = PgMemoryStore::connect(&args.postgres_url, 4).await?;
let plan = ReembedPlan {
batch_size: args.batch_size,
drop_hnsw_first: args.drop_hnsw_first,
concurrent_index: args.concurrent_index,
};
if args.dry_run {
let summary = dry_run_reembed(&store, embedder, &plan).await?;
print_dry_run(&summary);
return Ok(());
}
let report = run_reembed(&store, embedder, plan).await?;
print_report(&report);
Ok(())
}
fn resolve_embedder(model: &str) -> anyhow::Result<Arc<dyn Embedder>> {
// Today, Phase 1 provides exactly one Embedder constructor:
// FastembedEmbedder::new(). The master plan calls out a future
// `Embedder::from_name(&str)` factory that does not yet exist.
// Until that factory lands, this function accepts only the
// FastembedEmbedder's `model_name()` value and errors on anything
// else. Adding a real registry is a follow-up task.
let candidate = FastembedEmbedder::new();
if candidate.model_name() == model {
return Ok(Arc::new(candidate));
}
anyhow::bail!(
"unknown embedder model '{}'. Known: {}",
model,
candidate.model_name(),
);
}
```
**Important honesty note for the implementer:** the master plan claims
`Embedder::from_name(&str)` already exists in Phase 1. As of audit (see
"Audit step" above), it does not. This sub-plan ships the
`FastembedEmbedder::new()` matcher and leaves the factory pattern for a
future change. Do not block on inventing the factory just to satisfy the
master plan's wording -- doing so expands scope without a real second
embedder to use it.
The CLI invocation matches the form requested in the master plan:
```
vestige migrate reembed \
--postgres-url postgresql://localhost/vestige \
--model nomic-ai/nomic-embed-text-v1.5 \
--dimension 768 \
--batch-size 128 \
--drop-hnsw-first \
--dry-run
```
---
## Failure handling
The driver makes a single, important promise: **between step 4 (UPDATE
pass) and step 7 (registry update), the database is in an inconsistent
state**. Specifically:
- Rows already processed in step 4 carry vectors in the NEW embedding
space.
- Rows not yet processed carry vectors in the OLD embedding space.
- The `embedding_model` registry still says OLD.
- The HNSW index is dropped (if `drop_hnsw_first = true`).
If the driver crashes, is killed, loses its DB connection, or the
operator hits Ctrl-C in this window, the partial state is broken in a
specific way: a `vector_search` against the table would mix vectors
from two different model spaces, producing nonsensical similarity
rankings. The operator MUST NOT serve search until the re-embed
completes.
**Recovery procedure** (document this loudly in the operator-facing log):
1. The CLI log already says, on every batch, `"reembed: wrote batch N
(M rows)"`. The last such log line indicates how far the pass got.
2. The recovery action is to **re-run reembed** with the same arguments.
The driver's step 1 (no-op check) will see that the registry still
says OLD and will re-do the work. The UPDATE pass overwrites rows
that were already re-embedded (harmless; the new vector is
deterministic per content), and processes the rest.
3. Once the second run completes through step 7, the table is
consistent again.
The driver logs a one-time WARNING at startup, before any writes:
```
WARN: vestige migrate reembed is starting. Search results will be
WARN: incorrect until this run completes. Stop the MCP server now if
WARN: it is connected to this database. Press Ctrl-C within 5 seconds
WARN: to abort.
```
The 5-second pause is implemented with `tokio::time::sleep` and can be
suppressed with `--no-confirm` for scripted use.
There is no "resume from row N" feature in this iteration. Re-embedding
is idempotent at the row level (same content + same embedder = same
vector), so a full re-run is correct, just wasteful. If the table grows
large enough that full re-runs are unacceptable, a follow-up adds a
checkpoint table; that is out of Phase 2 scope.
---
## Verification
### Unit tests (colocated in `reembed.rs`)
1. **`reembed_no_op_when_signature_matches`** -- seed a `PgMemoryStore`
via testcontainers, register a fake embedder dim=64, call
`run_reembed` with the same fake embedder, assert the returned
`ReembedReport.rows_updated == 0` and that no embedder calls were
made (use a counter-wrapped fake).
2. **`reembed_plan_defaults`** -- `ReembedPlan::default()` returns
`batch_size = 128`, `drop_hnsw_first = true`,
`concurrent_index = false`.
3. **`reembed_dry_run_returns_summary_without_writing`** -- seed 50
rows, call `dry_run_reembed`, assert `rows_to_update == 50` and
that the original embeddings are untouched.
### Integration test (under `tests/phase_2/pg_reembed.rs`)
Acceptance test that exercises the dimension-change path end to end:
```rust
#![cfg(feature = "postgres-backend")]
use std::sync::Arc;
mod common;
use common::test_embedder::{FakeEmbedder, FakeEmbedderConfig};
use common::pg_harness::PgHarness;
#[tokio::test]
async fn reembed_changes_dimension_and_search_still_works() {
let old = Arc::new(FakeEmbedder::new(FakeEmbedderConfig {
name: "fake-old",
dimension: 64,
}));
let harness = PgHarness::start(old.clone()).await.unwrap();
// Seed 100 memories. Each gets a 64-d vector from `old`.
for i in 0..100 {
let content = format!("memory number {i} talks about rust and async");
let vec = old.embed(&content).await.unwrap();
harness.store.insert(/* ... record with embedding = vec ... */).await.unwrap();
}
// Now re-embed with a different fake at dim 128.
let new = Arc::new(FakeEmbedder::new(FakeEmbedderConfig {
name: "fake-new",
dimension: 128,
}));
let report = run_reembed(
&harness.store,
new.clone(),
ReembedPlan::default(),
).await.unwrap();
assert_eq!(report.rows_updated, 100);
// (a) Every row has a 128-d vector.
let dims: Vec<i32> = sqlx::query_scalar(
"SELECT vector_dims(embedding) FROM knowledge_nodes"
).fetch_all(harness.store.pool()).await.unwrap();
assert!(dims.iter().all(|&d| d == 128));
// (b) Registry reflects the new signature.
let sig = harness.store.registered_model().await.unwrap().unwrap();
assert_eq!(sig.name, "fake-new");
assert_eq!(sig.dimension, 128);
// (c) vector_search returns results in the new space.
let probe = new.embed("memory number 5 talks about rust and async").await.unwrap();
let results = harness.store.vector_search(&probe, 10).await.unwrap();
assert!(!results.is_empty());
}
```
The `FakeEmbedder` from `common/test_embedder.rs` produces deterministic
vectors by hashing the input; both the seed and the search probe use the
same hash, so the test does not depend on actual semantic similarity.
### Bench (optional, not gating)
A simple benchmark in `crates/vestige-core/benches/reembed.rs` reports
throughput at 100k rows with `FakeEmbedder`. Useful for catching
regressions in the UPDATE-pass batching pattern. Not part of CI.
---
## Acceptance criteria
This sub-plan is complete when:
1. `crates/vestige-core/src/storage/postgres/reembed.rs` exists and
compiles under `--features postgres-backend`.
2. `ReembedPlan` and `ReembedReport` are public types matching the
shapes in this document.
3. `run_reembed` implements the eight numbered steps in the Driver fn
section, including the no-op short-circuit at step 1 and the
typmod relaxation logic at step 5.
4. `dry_run_reembed` returns counts and estimates without writing.
5. The `vestige migrate reembed ...` subcommand is wired through
`crates/vestige-mcp/src/bin/cli.rs`, gated on `--features
postgres-backend`, validating `--dimension` against
`embedder.dimension()`.
6. The three unit tests pass.
7. The `pg_reembed.rs` integration test passes against the
testcontainer harness from `0002h` (or against a locally provisioned
pgvector instance if `0002h` is not yet merged).
8. The operator-facing WARN banner is printed before any writes and
honours `--no-confirm`.
9. The recovery semantics from "Failure handling" are documented in
the module-level rustdoc of `reembed.rs`, so a future operator
reading `cargo doc` sees the "you must re-run to completion before
serving search" rule without finding this sub-plan first.
10. `cargo sqlx prepare --workspace` updates `.sqlx/` with the new
queries; the resulting JSON files are committed.
When all ten items are checked, sub-plan `0002g` lands. Master plan
deliverable D9 is satisfied. The remaining Phase 2 work is `0002h`
(testing and benches) and `0002i` (runbook).

File diff suppressed because it is too large Load diff

724
docs/plans/0002i-runbook.md Normal file
View file

@ -0,0 +1,724 @@
# Phase 2 Sub-Plan 0002i -- Postgres Ops Runbook
**Status**: Ready
**Depends on**: Phase 2 sub-plans 0002a through 0002h merged (or at least
their interfaces stable). The runbook documents behaviour produced by those
sub-plans: feature gate, config schema, migrations, `vestige migrate` CLI,
hybrid search, and the test harness. Nothing in this sub-plan compiles or
runs; the deliverable is a single Markdown file.
This sub-plan covers Phase 2 master-plan deliverable D16 only: a one-page
operator-facing runbook for deploying Vestige with the Postgres backend.
---
## Context
Why a runbook. The ADR (0002) and the master plan (0002) are written for
implementors. They settle execution-level decisions and itemise deliverables.
They are not deployable instructions. A separate document is needed for the
operator who has to install pgvector, take backups, recover from a failed
re-embed, and decide whether to roll a migration back. The runbook is that
document.
Who reads it. Ops people, not developers. Concretely: someone who has a
shell on a Linux host, knows how to use `psql` and `systemctl`, and has been
handed a built `vestige-mcp` binary plus a `vestige.toml`. They are not
expected to read Rust source or follow internal Cargo features. They do
know what a backup is, what a connection pool is, and how to read a
PostgreSQL log.
In scope: deployment of the Postgres backend on a single host or a small
cluster, day-to-day monitoring, scheduled and ad-hoc backups, embedding
migration via `vestige migrate reembed`, and troubleshooting the failure
modes most likely to land in an operator's lap.
Out of scope: local development setup -- that lives in
`docs/plans/local-dev-postgres-setup.md` and the runbook links to it for
developer onboarding only. Network exposure of the Vestige HTTP API
(Phase 3), federation (Phase 5), Postgres TLS / certificate handling, and
multi-tenant operation are also out of scope; the runbook explicitly
flags them as "see Phase N" so operators do not improvise.
This sub-plan is the plan for producing the runbook. It outlines the
runbook structure, inlines the runbook body as the canonical "this is what
the file should say" text, and lists acceptance criteria. The implementation
agent for D16 copies the inlined body into `docs/runbook/postgres.md`,
creating `docs/runbook/` if it does not already exist. No other files in the
repository are modified.
---
## Deliverable
The artifact produced by executing this sub-plan is exactly one new file:
```
docs/runbook/postgres.md
```
It is NOT under `docs/plans/`. Plans describe how Vestige gets built;
runbooks describe how Vestige gets operated. The two directories are
deliberately separated.
Side effect: create the directory `docs/runbook/` if it does not exist.
Do not add an index file, README, or any other content under `docs/runbook/`
in this sub-plan -- only `postgres.md`.
This sub-plan document (`docs/plans/0002i-runbook.md`) is itself NOT a
deliverable in the operator sense. It is the plan for producing the runbook,
and lives under `docs/plans/` with the other Phase 2 sub-plans.
---
## Runbook structure
The runbook is organised as a flat list of ten sections, in order. Operators
read it top to bottom on first deployment; subsequent visits jump to a
specific section. Section numbering matches the inlined body below.
1. **Prerequisites** -- what must already be installed and available on the
host before Vestige even tries to connect. PostgreSQL 16 or newer
(18 on Arch is fine), pgvector >= 0.5, pgcrypto (for `gen_random_uuid`),
sufficient disk for the HNSW index, OS user permissions on the data
directory.
2. **Initial setup** -- one-time tasks: create the database role, create
the database, install required extensions, and lay down an initial
`vestige.toml`. Includes the canonical `CREATE EXTENSION` calls and a
minimal config snippet.
3. **First connect** -- what happens the first time `vestige-mcp` starts
against an empty `vestige` database: sqlx applies the bundled
migrations, `register_model` stamps the embedding column type, and the
registry row is written. How an operator verifies each step succeeded
using `psql`.
4. **Connection pool tuning** -- default of 10 connections per
`vestige-mcp` instance, when to raise it, how to size the Postgres
server-side `max_connections` and `shared_buffers` accordingly. Cross-
reference to `vestige.toml` and to ADR 0002 D2 / open question Q5.
5. **Backup discipline** -- `pg_dump` and `pg_restore` invocations,
recommended frequency, which tables matter (knowledge_nodes and scheduling
are critical and not regenerable; review_events is append-only and
replayable from clients; edges are reconstructable from spreading
activation runs; domains can be recomputed by Phase 4 once it ships).
Also covers backup verification (restore-to-tmp drill).
6. **Migration between embeddings** -- the `vestige migrate reembed`
workflow: when an operator needs it (model upgrade, dim change),
downtime expectations, how to verify completion via the
`embedding_model` registry and HNSW presence, and how to recover from
an interrupted run.
7. **Re-clustering domains** -- a brief forward reference. Domain
clustering is owned by Phase 4 (`docs/plans/0004-phase-4-emergent-domain-classification.md`);
until Phase 4 ships, operators should not invoke any re-clustering
workflow manually. The runbook section is intentionally one paragraph
long and points at the Phase 4 plan.
8. **Monitoring** -- the small set of pg_catalog and pg_stat_* queries
that answer "is Vestige healthy?": `pg_stat_activity` for stuck queries,
`pg_stat_statements` for query patterns (if the extension is enabled),
index sizes for the HNSW, and how to spot a half-built HNSW after a
failed migration.
9. **Troubleshooting** -- a table of common errors with the symptom and
the fix. Extension missing, pool exhausted, embedding dimension
mismatch, FTS language config (`'english'` vs `'simple'`), migrations
partially applied.
10. **Rollback caveats** -- every `*.up.sql` has a `*.down.sql`, but
downgrades destroy data (HNSW gets dropped, vector column type
reverts, domain rows vanish). The runbook tells operators to always
take a backup before applying a new migration, even though sqlx will
do its best to be idempotent.
---
## Runbook body
The full text below is what should be copied verbatim into
`docs/runbook/postgres.md`. ASCII only. Code blocks use fenced syntax with
language hints. Operator-facing prose; second person ("you") for
instructions. Where a command requires sudo, the prompt shows it explicitly.
```markdown
# Vestige Postgres Backend -- Operator Runbook
This runbook covers deploying, operating, monitoring, and recovering a
Vestige installation that uses the Postgres backend. It is written for
operators handling a built `vestige-mcp` binary and a `vestige.toml`.
For local development setup, see
`docs/plans/local-dev-postgres-setup.md`. For the architectural rationale,
see `docs/adr/0001-pluggable-storage-and-network-access.md` and
`docs/adr/0002-phase-2-execution.md`. For the deliverable-level plan, see
`docs/plans/0002-phase-2-postgres-backend.md`.
---
## 1. Prerequisites
Before Vestige can connect:
- PostgreSQL server, version 16 or newer. Arch ships 18.x; Debian stable
ships 16.x; both work.
- `pgvector` extension, version 0.5 or newer. Distro packages:
`pgvector` on Arch, `postgresql-16-pgvector` on Debian/Ubuntu.
- `pgcrypto` extension, shipped with the PostgreSQL contrib package
(`postgresql-contrib` on Debian, included in the base `postgresql`
package on Arch). Vestige uses `gen_random_uuid()` from pgcrypto for
primary keys.
- Disk space: budget roughly 4x the size of your `knowledge_nodes.embedding`
column for the HNSW index. With 768-dim float32 vectors at 100k
memories, that is about 1.2 GB for the embeddings plus 4-5 GB for the
HNSW index. Plan accordingly.
- OS user: the `postgres` system user (or whatever user owns
`/var/lib/postgres/data`) must have read/write on the data directory.
Vestige itself does not need filesystem access to Postgres; it talks
TCP only.
- Network: Vestige and Postgres can be on the same host (loopback) or
different hosts. If different hosts, allow the Vestige host's IP in
`pg_hba.conf` and on any firewall.
---
## 2. Initial setup
These steps run once per Postgres cluster.
### 2.1 Install extensions
As the `postgres` superuser:
```sh
sudo -u postgres psql -d vestige <<'SQL'
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS pgcrypto;
SQL
```
Verify:
```sh
sudo -u postgres psql -d vestige -c \
"SELECT extname, extversion FROM pg_extension WHERE extname IN ('vector','pgcrypto');"
```
You should see two rows. If `vector` is missing, the pgvector package was
not installed for the right PostgreSQL major version; reinstall it.
### 2.2 Create the role and database
The `vestige` role owns its own database; it does NOT need superuser.
Extensions must be installed by `postgres`, not by `vestige`.
```sh
sudo -u postgres psql -v ON_ERROR_STOP=1 <<'SQL'
CREATE ROLE vestige WITH LOGIN CREATEDB PASSWORD 'CHANGE_ME';
CREATE DATABASE vestige OWNER vestige ENCODING 'UTF8';
GRANT ALL PRIVILEGES ON DATABASE vestige TO vestige;
SQL
sudo -u postgres psql -d vestige -v ON_ERROR_STOP=1 <<'SQL'
GRANT ALL ON SCHEMA public TO vestige;
ALTER SCHEMA public OWNER TO vestige;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO vestige;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO vestige;
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON FUNCTIONS TO vestige;
SQL
```
Replace `CHANGE_ME` with a strong password and store it where Vestige can
read it (typically `~/.vestige_pg_pw`, mode 600, owned by the user running
`vestige-mcp`).
### 2.3 Minimal `vestige.toml`
```toml
[storage]
backend = "postgres"
[storage.postgres]
url = "postgresql://vestige:CHANGE_ME@127.0.0.1:5432/vestige"
max_connections = 10
```
The `url` field accepts a `${VAR}` placeholder; in practice operators
either inline the password or export `DATABASE_URL` and reference
`url = "${DATABASE_URL}"`. See `docs/CONFIGURATION.md` for the full
schema once Phase 3 lands.
---
## 3. First connect
When `vestige-mcp` starts against an empty `vestige` database, it:
1. Builds a `PgPool` of `max_connections` (default 10) connections.
2. Runs every migration in `crates/vestige-core/migrations/postgres/`
in order. The bundled migrations are `0001_init` (tables, non-vector
indexes) and `0002_hnsw` (HNSW index on `knowledge_nodes.embedding`).
3. Calls `register_model` once it knows the active embedder's dimension.
This issues `ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE
vector($N)` and inserts a row into `embedding_model`.
4. Begins accepting MCP requests.
To verify after the first start:
```sh
sudo -u postgres psql -d vestige <<'SQL'
-- All expected tables present.
\dt
-- embedding_model has exactly one row.
SELECT name, dimension, hash FROM embedding_model;
-- The HNSW index exists.
SELECT indexname FROM pg_indexes
WHERE tablename = 'knowledge_nodes' AND indexname LIKE '%hnsw%';
SQL
```
Expected: `knowledge_nodes`, `scheduling`, `edges`, `domains`, `review_events`,
`embedding_model`, `users`, `groups`, `group_memberships`; one row in
`embedding_model`; one `idx_knowledge_nodes_embedding_hnsw` index.
If a migration fails mid-way, the partial state lands in
`_sqlx_migrations`. See section 9 for recovery.
---
## 4. Connection pool tuning
Defaults:
- Vestige client pool: `max_connections = 10` per `vestige-mcp` instance.
- Postgres server: `max_connections = 100` (default).
Math: one MCP client with the default pool uses up to 10 server slots.
Five concurrent MCP clients use up to 50 slots. The remaining 50 cover
`psql` sessions, background workers, and headroom for replication or
backup processes.
When to raise:
- More than three MCP clients connecting to one Postgres instance.
- Long-running queries (above 500ms p99) showing pool wait time in
Vestige logs (look for `pool acquire timed out` warnings).
- A noticeable number of concurrent dream/consolidation runs.
How to raise:
```toml
[storage.postgres]
max_connections = 20 # client side, per vestige-mcp instance
```
And on the Postgres server, edit `postgresql.conf`:
```conf
max_connections = 200
shared_buffers = 2GB # roughly 25 percent of RAM, never above 8GB
```
Then restart Postgres (`sudo systemctl restart postgresql`). Vestige
clients pick up their own `max_connections` change on next restart.
Do not raise pool sizes blindly. Past about 4x the CPU core count,
Postgres throughput drops; a small connection pooler (PgBouncer in
transaction mode) is the right answer above ~200 client connections,
but Vestige's expected scale rarely needs that.
---
## 5. Backup discipline
### 5.1 Which tables matter
| Table | Backup priority | Regenerable? |
|-------|-----------------|--------------|
| `knowledge_nodes` | Critical | No |
| `scheduling` | Critical | No (FSRS state) |
| `embedding_model` | Critical | No (one row, but stamps the column type) |
| `users`, `groups`, `group_memberships` | Critical | No (Phase 3 will populate) |
| `review_events` | Important | Replayable by clients but tedious |
| `edges` | Optional | Yes (recomputed by spreading activation) |
| `domains` | Optional | Yes (Phase 4 recomputes by clustering) |
For a typical single-operator install, dumping the whole database is
fastest and simplest. Skip the optional tables only if dump size becomes
a bandwidth problem.
### 5.2 Full logical backup
```sh
pg_dump --host=127.0.0.1 --username=vestige --format=custom \
--file=vestige-$(date -u +%Y%m%dT%H%M%SZ).dump \
vestige
```
The custom format compresses by default and works with parallel restore.
File size for 10k memories: roughly 80 MB.
Frequency recommendations:
- Daily for any installation with active ingest.
- Before every `vestige migrate reembed` run (see section 6).
- Before every Postgres major-version upgrade.
- Retain at least 7 daily, 4 weekly, 3 monthly dumps. Compress with
`--format=custom` (already gzipped) and keep them on different
storage from the database itself.
### 5.3 Restore
To a fresh database:
```sh
sudo -u postgres createdb -O vestige vestige_restore
pg_restore --host=127.0.0.1 --username=vestige --dbname=vestige_restore \
--jobs=4 vestige-20260301T030000Z.dump
```
To replace the live database (destructive; only after taking a fresh
dump):
```sh
sudo systemctl stop vestige-mcp # or however the service is run
sudo -u postgres dropdb vestige
sudo -u postgres createdb -O vestige vestige
pg_restore --host=127.0.0.1 --username=vestige --dbname=vestige \
--jobs=4 vestige-20260301T030000Z.dump
sudo systemctl start vestige-mcp
```
### 5.4 Restore drill
Run a restore-to-throwaway-database every month and run `vestige search`
or a manual `psql` count against it. A backup you have not restored is a
backup you do not have.
```sh
sudo -u postgres createdb -O vestige vestige_restore_drill
pg_restore --host=127.0.0.1 --username=vestige --dbname=vestige_restore_drill \
--jobs=4 vestige-latest.dump
PGPASSWORD="$(cat ~/.vestige_pg_pw)" psql -h 127.0.0.1 -U vestige \
-d vestige_restore_drill \
-c 'SELECT count(*) FROM knowledge_nodes;'
sudo -u postgres dropdb vestige_restore_drill
```
---
## 6. Migration between embeddings
Use `vestige migrate reembed` when:
- Upgrading to a new embedding model that produces a different dimension
(for example, swapping from `nomic-embed-text-v1.5` 768D to a 1024D
model).
- Switching providers and the model hash differs even at the same
dimension.
What it does:
1. Reads every row from `knowledge_nodes`, re-encodes the `content` column
through the new embedder, and writes the new vector back.
2. Drops the HNSW index before the re-encode loop (this is the default;
`--concurrent-index` keeps it during the run at the cost of speed).
3. Updates the `embedding_model` row with the new name, dimension, and
hash.
4. Rebuilds the HNSW index with the new vectors.
### 6.1 Before starting
- Take a fresh backup (section 5.2). The tool refuses to start without a
`--yes` flag if it detects no recent backup; ignore at your peril.
- Stop ingest. Vestige's MCP server can stay running for read-only
access, but pause any client that calls `smart_ingest` or
`update_scheduling`.
- Have the new embedder model available locally. The CLI loads it
before the first row is touched; if loading fails, no data is changed.
### 6.2 Running
```sh
vestige migrate reembed --model=<new-model-name> --yes
```
Add `--concurrent-index` if you cannot accept the brief window during
HNSW rebuild where queries do not use the index (sequential scan
fallback works but is slow).
The tool prints a progress bar via `indicatif`. Expected throughput:
roughly 200 memories per second per CPU core for a 768D ONNX model.
10,000 memories on an 8-core box: about 6 seconds, plus HNSW rebuild
(another 30-90 seconds at that scale).
### 6.3 Verifying completion
```sh
sudo -u postgres psql -d vestige <<'SQL'
-- Registry reflects the new model.
SELECT name, dimension, hash FROM embedding_model;
-- HNSW index is present and not partial.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'knowledge_nodes' AND indexname LIKE '%hnsw%';
-- All rows have a non-null embedding of the new dimension.
SELECT count(*) FILTER (WHERE embedding IS NULL) AS missing,
count(*) AS total
FROM knowledge_nodes;
SQL
```
Expected: registry shows the new model name and dimension, one HNSW
index, zero missing embeddings.
### 6.4 Recovering from an interrupted run
`vestige migrate reembed` is restartable. On interruption:
- The `embedding_model` row may or may not have been updated. Check it
manually and roll forward by re-running with `--yes --resume` (the
tool detects the inconsistency and finishes the rows that still hold
old embeddings).
- The HNSW index may be missing. Re-running the command rebuilds it as
its last step.
- If the system is in a state the tool refuses to reason about, restore
from the backup taken in 6.1.
---
## 7. Re-clustering domains
Domain clustering is owned by Phase 4
(`docs/plans/0004-phase-4-emergent-domain-classification.md`). Until
Phase 4 ships, the `domains` table is reserved schema and is populated
only by tests. Operators must not invoke any domain re-clustering
workflow manually; there is no supported one in Phase 2.
When Phase 4 lands, this section is replaced with the real procedure.
---
## 8. Monitoring
### 8.1 Quick health check
```sh
PGPASSWORD="$(cat ~/.vestige_pg_pw)" psql -h 127.0.0.1 -U vestige -d vestige <<'SQL'
SELECT count(*) AS memory_count FROM knowledge_nodes;
SELECT name, dimension FROM embedding_model;
SELECT pg_size_pretty(pg_database_size('vestige')) AS db_size;
SQL
```
### 8.2 In-flight queries
```sql
SELECT pid, now() - query_start AS runtime, state, query
FROM pg_stat_activity
WHERE datname = 'vestige' AND state <> 'idle'
ORDER BY runtime DESC NULLS LAST;
```
Anything over 5 seconds with `state = 'active'` deserves a look. HNSW
search queries should land well under 100ms on properly-sized hardware.
### 8.3 Query pattern analysis
If `pg_stat_statements` is loaded (`shared_preload_libraries =
'pg_stat_statements'` in `postgresql.conf`):
```sql
SELECT calls, mean_exec_time, query
FROM pg_stat_statements
WHERE query ILIKE '%knowledge_nodes%'
ORDER BY mean_exec_time DESC
LIMIT 20;
```
Look for hybrid-search queries that have drifted above 100ms p50. The
usual culprit is a missing or half-built HNSW index.
### 8.4 Index health
```sql
SELECT indexname, pg_size_pretty(pg_relation_size(indexrelid)) AS size,
idx_scan, idx_tup_read
FROM pg_indexes
JOIN pg_stat_user_indexes USING (indexrelid)
WHERE schemaname = 'public' AND relname = 'knowledge_nodes';
```
A HNSW index with `idx_scan = 0` after several hours of traffic usually
means the planner is preferring sequential scan -- either the table is
too small to bother with the index (fine) or the index is corrupt and
needs rebuilding (`REINDEX INDEX idx_knowledge_nodes_embedding_hnsw;`).
### 8.5 Spotting half-built HNSW
After a failed migration or a crashed `reembed`:
```sql
SELECT indexname, indisvalid, indisready
FROM pg_indexes
JOIN pg_index ON indexrelid = (schemaname || '.' || indexname)::regclass
WHERE tablename = 'knowledge_nodes';
```
Any row with `indisvalid = false` is broken. Drop and recreate:
```sql
DROP INDEX IF EXISTS idx_knowledge_nodes_embedding_hnsw;
CREATE INDEX idx_knowledge_nodes_embedding_hnsw
ON knowledge_nodes USING hnsw (embedding vector_cosine_ops);
```
---
## 9. Troubleshooting
| Symptom | Likely cause | Fix |
|---------|--------------|-----|
| `ERROR: extension "vector" is not available` on start | pgvector not installed for this Postgres major version | Install the distro package matching `pg_config --version`, then `CREATE EXTENSION vector;` as superuser |
| `pool timed out while waiting for an open connection` in Vestige logs | Pool too small or stuck queries holding connections | Raise `max_connections` in `vestige.toml`; investigate `pg_stat_activity` for queries above 5s |
| `vector dimensions do not match` on insert | `embedding_model` was stamped at one dimension and a different embedder is now running | Re-run `vestige migrate reembed --model=<correct>` or fix the embedder configuration |
| Hybrid search returns the same row twice | Stale `.sqlx/` query cache from before D5 landed | Run `cargo sqlx prepare` in `crates/vestige-core/`, rebuild the binary |
| `text search configuration "english" does not exist` | Postgres locale build does not include the english dictionary (rare on Alpine) | Install the language-pack or override the FTS language in `vestige.toml` (see `[storage.postgres.fts]` once Phase 2 D5 lands) |
| `relation "_sqlx_migrations" exists, but migration X is in "applied" with no checksum` | Previous run died between `BEGIN` and `COMMIT` | Stop Vestige, restore from backup, restart |
| HNSW index very large compared to data | `m` and `ef_construction` defaults too high for the corpus | Acceptable for now; tuning lands as part of Phase 4 |
| `permission denied for schema public` on a new install | `vestige` role does not own `public` | Re-run the grants block in section 2.2 as `postgres` |
If a problem is not in this table, capture: PostgreSQL log
(`/var/log/postgres/`, journalctl `-u postgresql`), Vestige log
(`RUST_LOG=debug,sqlx=info` for a fresh run), the migration state
(`SELECT * FROM _sqlx_migrations ORDER BY version;`), and file a bug.
---
## 10. Rollback caveats
Every migration in `crates/vestige-core/migrations/postgres/` has a
matching `*.down.sql`. `sqlx migrate revert` walks them in reverse order.
This is not the same as risk-free. The `0002_hnsw.down.sql` drops the
HNSW index (rebuildable, expensive). The `0001_init.down.sql` drops
every table -- including `knowledge_nodes`, including data. Down migrations
exist for development, not for casual production use.
Before applying any new migration:
1. Take a backup (section 5.2).
2. Run the migration on a restored copy first if you can afford the time.
3. Read the new migration's `*.up.sql` and `*.down.sql` to understand
what changes.
To revert one migration manually:
```sh
sqlx migrate revert \
--database-url "postgresql://vestige:...@127.0.0.1:5432/vestige" \
--source crates/vestige-core/migrations/postgres
```
Note that Vestige's binary does not run `sqlx migrate revert`
automatically. Reverts are always an explicit operator decision.
If a revert fails partway through, treat the database as inconsistent:
restore from the backup taken in step 1.
```
---
## Cross-references
- `docs/adr/0001-pluggable-storage-and-network-access.md` -- ADR that
established the pluggable backend.
- `docs/adr/0002-phase-2-execution.md` -- ADR settling Phase 2 execution
decisions; section "Architecture Overview" lists every table the
runbook references.
- `docs/plans/0002-phase-2-postgres-backend.md` -- master plan; D16
(deliverables list) and the Open Implementation Questions section
(especially Q4 HNSW rebuild and Q5 pool sizing) inform the runbook's
recommendations.
- `docs/plans/local-dev-postgres-setup.md` -- developer-facing recipe
for a one-machine Arch / CachyOS dev cluster. The runbook links to it
as the "for development, see" pointer.
- `docs/CONFIGURATION.md` -- existing config doc; section 4 of the
runbook ("Connection pool tuning") cross-references it for the
authoritative `vestige.toml` schema.
---
## Verification
A reviewer is given:
- A fresh Linux VM (Debian 12 or Arch current; both must work) with
network access and no Postgres installed.
- A built `vestige-mcp` binary for that platform.
- The runbook (`docs/runbook/postgres.md`).
The reviewer follows the runbook top to bottom and reaches a state in
which Vestige answers MCP requests against the Postgres backend.
Checkpoints, in order:
1. After section 1 (Prerequisites): `pg_config --version` returns 16 or
newer; `pkg-config --modversion libpq` resolves; the `pgvector`
distro package is installed.
2. After section 2.1 (Extensions): two rows in
`SELECT extname FROM pg_extension WHERE extname IN ('vector', 'pgcrypto');`.
3. After section 2.2 (Role + DB): `psql -U vestige -h 127.0.0.1 -d vestige -c '\conninfo'`
succeeds.
4. After section 2.3 (Config): `vestige.toml` parses (test by
`vestige config print` once that subcommand lands, otherwise
`vestige-mcp --check-config`).
5. After section 3 (First connect): the eight expected tables are
present; `embedding_model` has exactly one row; the HNSW index
exists; `vestige-mcp` log shows "Postgres backend ready".
6. After section 5.2 (Backup): the dump file exists and `pg_restore -l`
on it lists the expected tables.
7. After section 5.4 (Restore drill): the drill database holds the same
row count as the source.
If any checkpoint fails, the runbook section that produced the failure
is the one that needs revision. Capture the exact command, exit code,
and log line; revise the runbook in a follow-up PR.
A second reviewer reads the runbook without executing it and checks for:
- ASCII only; no em dashes, no curly quotes, no Unicode arrows, no
ellipses, no bullets (`*`/`-` ASCII only).
- Every section number from 1 to 10 present and in order.
- Every cross-reference resolves to an existing file or to a Phase
number explicitly marked as "future".
- No code block longer than 30 lines; if longer, it should be split or
referenced from another file.
---
## Acceptance criteria
- [ ] `docs/runbook/` directory exists.
- [ ] `docs/runbook/postgres.md` exists and matches the inlined body
above byte-for-byte after stripping the outer code fence used in
this sub-plan to embed it.
- [ ] All ten sections from the "Runbook structure" outline are present
under their stated headings.
- [ ] No file other than `docs/runbook/postgres.md` is created or
modified by executing this sub-plan.
- [ ] ASCII only: no em dashes, no curly quotes, no Unicode arrows,
no ellipses, no Unicode bullets (`grep -P '[^\x00-\x7F]'
docs/runbook/postgres.md` returns no matches).
- [ ] Every cross-reference in the runbook points at a file that exists
in the repository at the time of merge, OR is explicitly framed
as "future Phase N" with a pointer to the relevant plan document.
- [ ] Every command block is copy-pastable: no `<placeholder>` syntax
that does not also have an inline note describing what to
substitute.
- [ ] A second pair of eyes confirms the verification checkpoints in the
preceding section are reproducible.
- [ ] The runbook is no longer than the inlined body in this sub-plan;
operators reach the end without losing patience.