mirror of
https://github.com/samvallad33/vestige.git
synced 2026-06-16 21:05:15 +02:00
docs(plans): add Phase 2 sub-plans 0002a-0002i + supersession notice
Nine Phase 2 sub-plans operationalising ADR 0002 against the Phase 2 master plan, each sized to fit a focused implementation session and handed to Claude Code as a /goal brief without requiring the agent to load the master plan. Order of execution (each depends on the previous unless noted): - 0002a-skeleton-and-feature-gate.md -- postgres-backend Cargo feature + PgMemoryStore skeleton with todo!() bodies. D1+D2. - 0002b-pool-and-config.md -- PgPool builder, VestigeConfig/ PostgresConfig, vestige.toml loader wired into vestige-mcp. D3+D7 (master plan numbering). - 0002c-migrations.md -- sqlx migrations 0001_init/0002_hnsw including D7 (users/groups/memberships, owner/visibility/shared_with_groups) and D8 (codebase column). SQLite V15 parity migration. D4. - 0002d-store-impl-bodies.md -- real CRUD + registry bodies; trivial fts_search/vector_search bodies. D2+D6. - 0002e-hybrid-search.md -- one-statement RRF query. D5. - 0002f-migrate-cli.md -- vestige migrate copy (SQLite -> Postgres), --dry-run, idempotent re-runs, --allow-source-upgrade for pre-V15 sources. D8+D10. - 0002g-reembed.md -- vestige migrate reembed (offline rebuild). D9 + D10 reembed arm. Ships resolve_embedder helper as a workaround for the missing Embedder::from_name(&str) constructor. - 0002h-testing-and-benches.md -- testcontainers harness, six integration test files, Criterion bench at 1k/100k. D14+D15. - 0002i-runbook.md -- operator-facing deployment + day-2 runbook. D16. Supersession notice added to the master plan (0002-phase-2-postgres- backend.md) pointing at ADR 0002; body retained as archival reference. PR B carries this commit plus the previous two (ADR 0002 + Phase 1 amendment sub-plans); no code change.
This commit is contained in:
parent
697ade5b02
commit
9ef8afdb20
10 changed files with 8998 additions and 0 deletions
|
|
@ -1,5 +1,13 @@
|
|||
# Phase 2 Plan: PostgreSQL Backend
|
||||
|
||||
> **Supersession Notice (2026-05-26):** This master plan is now archival. Execution is governed by:
|
||||
> - **ADR**: [`docs/adr/0002-phase-2-execution.md`](../adr/0002-phase-2-execution.md) -- binding decisions
|
||||
> - **Sub-plans** (executable briefs):
|
||||
> - Phase 1 amendment: [0001a-trait-rewrite.md](0001a-trait-rewrite.md), [0001b-sqlite-split.md](0001b-sqlite-split.md), [0001c-async-trait-sunset.md](0001c-async-trait-sunset.md)
|
||||
> - Phase 2: 0002a..0002i (skeleton, pool+config, migrations, store impl, hybrid search, migrate CLI, reembed, tests+benches, runbook)
|
||||
>
|
||||
> **Deltas vs body**: trait uses `trait_variant`, error type is `MemoryStoreError`/`MemoryStoreResult`, `connect` is `(url, max_connections)` only, the core table is `knowledge_nodes` (not `memories`) and gains `owner_user_id` + `visibility` + `shared_with_groups` + `codebase`, plus `users`/`groups`/`group_memberships` tables. See ADR 0002 D1-D8.
|
||||
|
||||
**Status**: Draft
|
||||
**Depends on**: Phase 1 (MemoryStore + Embedder traits, embedding_model registry, domain columns)
|
||||
**Related**: docs/adr/0001-pluggable-storage-and-network-access.md (Phase 2), docs/prd/001-getting-centralized-vestige.md, docs/plans/local-dev-postgres-setup.md (local cluster provisioning)
|
||||
|
|
|
|||
554
docs/plans/0002a-skeleton-and-feature-gate.md
Normal file
554
docs/plans/0002a-skeleton-and-feature-gate.md
Normal file
|
|
@ -0,0 +1,554 @@
|
|||
# Phase 2 Sub-Plan 0002a -- Skeleton and Feature Gate
|
||||
|
||||
**Status**: Ready
|
||||
**Depends on**: Phase 1 amendment (sub-plans `0001a-trait-rewrite.md` and
|
||||
`0001b-sqlite-split.md`) merged. Specifically:
|
||||
- `MemoryStore` trait declared with `#[trait_variant::make(MemoryStore: Send)]`,
|
||||
generating a non-Send `LocalMemoryStore` companion trait. The
|
||||
`pub use MemoryStore as LocalMemoryStore` alias from Phase 1 is gone.
|
||||
- `crates/vestige-core/src/storage/sqlite.rs` has been split into
|
||||
`crates/vestige-core/src/storage/sqlite/` with the same public surface.
|
||||
|
||||
This sub-plan covers Phase 2 master-plan deliverables D1 and D2 only:
|
||||
the `postgres-backend` Cargo feature gate and a compilable `PgMemoryStore`
|
||||
skeleton whose trait method bodies are `todo!()`. No real Postgres code, no
|
||||
migrations, no SQL. Later sub-plans (`0002b-pool-and-config.md`,
|
||||
`0002c-migrations.md`, `0002d-store-impl-bodies.md`, ...) fill the bodies in.
|
||||
|
||||
The success criterion is a clean build under both feature-flag configurations,
|
||||
nothing more.
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
ADR 0002 D4 commits Phase 2 to a `crates/vestige-core/src/storage/postgres/`
|
||||
directory from day one. The seven other files in that directory
|
||||
(`pool.rs`, `migrations.rs`, `registry.rs`, `search.rs`, `migrate_cli.rs`,
|
||||
`reembed.rs`) belong to subsequent sub-plans. This sub-plan creates only
|
||||
`crates/vestige-core/src/storage/postgres/mod.rs` so the rest can be added
|
||||
incrementally without breaking the build.
|
||||
|
||||
Per ADR 0002 D2, `PgMemoryStore::connect` mirrors `SqliteMemoryStore::new`:
|
||||
no `Embedder` argument. The pgvector typmod DDL
|
||||
(`ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)`) lives inside
|
||||
the trait method `register_model`, invoked by the caller after construction.
|
||||
In this sub-plan `register_model` is a `todo!()` body; `0002c-migrations.md`
|
||||
and `0002d-store-impl-bodies.md` provide the real implementation.
|
||||
|
||||
The trait surface in `crates/vestige-core/src/storage/memory_store.rs` is the
|
||||
source of truth for method signatures. Do NOT copy signatures from the master
|
||||
plan -- they are stale in places (for example, master plan 0002 D2 lists
|
||||
`remove_edge` as three-arg `(source, target, edge_type)`; the live trait has
|
||||
two args `(source, target)`).
|
||||
|
||||
---
|
||||
|
||||
## Cargo manifest changes
|
||||
|
||||
Two optional crates and one new feature flag. Use `cargo add` per the global
|
||||
CLAUDE.md preference; do not hand-edit `Cargo.toml`.
|
||||
|
||||
```bash
|
||||
cd crates/vestige-core
|
||||
|
||||
cargo add sqlx@0.8 --optional --no-default-features \
|
||||
--features runtime-tokio,tls-rustls,postgres,uuid,chrono,json,migrate,macros
|
||||
|
||||
cargo add pgvector@0.4 --optional --features sqlx
|
||||
```
|
||||
|
||||
After both commands, open `crates/vestige-core/Cargo.toml` and add the
|
||||
`postgres-backend` feature line in the `[features]` block. Place it after
|
||||
the `metal` feature, before `[dependencies]`:
|
||||
|
||||
```toml
|
||||
# Postgres backend (mutually compilable with the SQLite backend; default OFF).
|
||||
# Compile with: --features postgres-backend
|
||||
postgres-backend = ["dep:sqlx", "dep:pgvector"]
|
||||
```
|
||||
|
||||
Do NOT add `tokio-stream`, `futures`, `indicatif`, or `toml` in this sub-plan.
|
||||
The master plan D1 lists them in the `postgres-backend` feature for
|
||||
convenience, but their consumers (streaming migrate, progress bar, config
|
||||
parsing) land in later sub-plans. Adding them here pulls dead weight into the
|
||||
feature gate.
|
||||
|
||||
Do NOT add the `vestige-mcp` pass-through feature in this sub-plan either.
|
||||
The MCP crate gets its `postgres-backend` feature in `0002b-pool-and-config.md`
|
||||
when `MemoryStoreConfig` lands and the binary needs a knob to pick a backend.
|
||||
|
||||
Verify the diff to `crates/vestige-core/Cargo.toml` looks like this and only
|
||||
this:
|
||||
|
||||
```toml
|
||||
[features]
|
||||
# ...existing features unchanged...
|
||||
postgres-backend = ["dep:sqlx", "dep:pgvector"]
|
||||
|
||||
[dependencies]
|
||||
# ...existing deps unchanged...
|
||||
sqlx = { version = "0.8", default-features = false, features = [
|
||||
"runtime-tokio", "tls-rustls", "postgres", "uuid", "chrono",
|
||||
"json", "migrate", "macros",
|
||||
], optional = true }
|
||||
pgvector = { version = "0.4", features = ["sqlx"], optional = true }
|
||||
```
|
||||
|
||||
The exact ordering of the two new lines inside `[dependencies]` is not
|
||||
significant; `cargo add` places them at the end. Leave the placement that
|
||||
`cargo add` produces.
|
||||
|
||||
---
|
||||
|
||||
## Storage module export
|
||||
|
||||
Edit `crates/vestige-core/src/storage/mod.rs` to expose the new module behind
|
||||
the feature flag. Two lines change.
|
||||
|
||||
Add to the module-declaration block (after `mod sqlite;`):
|
||||
|
||||
```rust
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
mod postgres;
|
||||
```
|
||||
|
||||
Add to the re-export block (after the `pub use sqlite::{ ... }` block):
|
||||
|
||||
```rust
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
pub use postgres::PgMemoryStore;
|
||||
```
|
||||
|
||||
Nothing else in `storage/mod.rs` changes. The `Storage` alias still points at
|
||||
`SqliteMemoryStore`; the SQLite re-export block is untouched.
|
||||
|
||||
---
|
||||
|
||||
## Postgres module skeleton
|
||||
|
||||
Create `crates/vestige-core/src/storage/postgres/mod.rs` with the full content
|
||||
below. This is the only new file in this sub-plan.
|
||||
|
||||
```rust
|
||||
#![cfg(feature = "postgres-backend")]
|
||||
//! Postgres-backed implementation of `MemoryStore`.
|
||||
//!
|
||||
//! Skeleton only. Every trait method is `todo!()`. Real bodies land in
|
||||
//! subsequent Phase 2 sub-plans:
|
||||
//! - `0002b-pool-and-config.md`: pool construction and config wiring
|
||||
//! - `0002c-migrations.md`: sqlx migration files and `init`/`register_model`
|
||||
//! - `0002d-store-impl-bodies.md`: CRUD, scheduling, edges, domains
|
||||
//! - `0002e-hybrid-search.md`: RRF query and search bodies
|
||||
//!
|
||||
//! The directory grows companion files (`pool.rs`, `migrations.rs`,
|
||||
//! `registry.rs`, `search.rs`, `migrate_cli.rs`, `reembed.rs`) in those
|
||||
//! sub-plans; this skeleton only creates `mod.rs`.
|
||||
|
||||
use chrono::{DateTime, Utc};
|
||||
use sqlx::PgPool;
|
||||
use uuid::Uuid;
|
||||
|
||||
use crate::storage::memory_store::{
|
||||
Domain, HealthStatus, LocalMemoryStore, MemoryEdge, MemoryRecord, MemoryStoreResult,
|
||||
ModelSignature, SchedulingState, SearchQuery, SearchResult, StoreStats,
|
||||
};
|
||||
|
||||
/// Postgres-backed implementation of `MemoryStore`.
|
||||
///
|
||||
/// Cheaply cloneable. Methods take `&self`; interior state lives inside the
|
||||
/// `PgPool` (which already provides `Sync` via `Arc` internally).
|
||||
#[derive(Clone)]
|
||||
pub struct PgMemoryStore {
|
||||
pool: PgPool,
|
||||
/// Embedding vector dimension. Set to 0 in the skeleton; populated by
|
||||
/// `register_model` in `0002d-store-impl-bodies.md` once the pgvector
|
||||
/// `ALTER COLUMN TYPE vector(N)` DDL lands.
|
||||
embedding_dim: i32,
|
||||
}
|
||||
|
||||
impl PgMemoryStore {
|
||||
/// Construct a new store from a connection URL.
|
||||
///
|
||||
/// Mirrors `SqliteMemoryStore::new`: no `Embedder` argument. The pgvector
|
||||
/// `ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)` DDL lives
|
||||
/// inside `register_model`, not here. The caller (cognitive engine
|
||||
/// bootstrap, migrate CLI, tests) invokes `register_model` after this
|
||||
/// returns, identically to the SQLite path.
|
||||
///
|
||||
/// Real body lands in `0002b-pool-and-config.md` (pool construction) and
|
||||
/// `0002c-migrations.md` (initial migration run).
|
||||
pub async fn connect(_url: &str, _max_connections: u32) -> MemoryStoreResult<Self> {
|
||||
todo!("PgMemoryStore::connect lands in 0002b-pool-and-config.md")
|
||||
}
|
||||
|
||||
/// Low-level constructor for tests: supply an existing pool, skip migrate.
|
||||
///
|
||||
/// Real body lands in `0002b-pool-and-config.md`.
|
||||
pub async fn from_pool(_pool: PgPool) -> MemoryStoreResult<Self> {
|
||||
todo!("PgMemoryStore::from_pool lands in 0002b-pool-and-config.md")
|
||||
}
|
||||
|
||||
/// Accessor used by migrate/reembed CLI.
|
||||
pub fn pool(&self) -> &PgPool {
|
||||
&self.pool
|
||||
}
|
||||
|
||||
/// Currently-registered vector dimension. Returns 0 until `register_model`
|
||||
/// has been called (real body: `0002d-store-impl-bodies.md`).
|
||||
pub fn embedding_dim(&self) -> i32 {
|
||||
self.embedding_dim
|
||||
}
|
||||
}
|
||||
|
||||
// trait_variant::make on the trait declaration generates two traits:
|
||||
// - `MemoryStore` (Send-bound)
|
||||
// - `LocalMemoryStore` (non-Send companion)
|
||||
// The implementer writes `impl LocalMemoryStore for ...` plainly; the Send
|
||||
// variant is generated automatically from the non-Send impl.
|
||||
impl LocalMemoryStore for PgMemoryStore {
|
||||
// --- Lifecycle ---
|
||||
async fn init(&self) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::init lands in 0002c-migrations.md")
|
||||
}
|
||||
|
||||
async fn health_check(&self) -> MemoryStoreResult<HealthStatus> {
|
||||
todo!("PgMemoryStore::health_check lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
// --- Embedding model registry ---
|
||||
async fn registered_model(&self) -> MemoryStoreResult<Option<ModelSignature>> {
|
||||
todo!("PgMemoryStore::registered_model lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn register_model(&self, _sig: &ModelSignature) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::register_model lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
// --- CRUD ---
|
||||
async fn insert(&self, _record: &MemoryRecord) -> MemoryStoreResult<Uuid> {
|
||||
todo!("PgMemoryStore::insert lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn get(&self, _id: Uuid) -> MemoryStoreResult<Option<MemoryRecord>> {
|
||||
todo!("PgMemoryStore::get lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn update(&self, _record: &MemoryRecord) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::update lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn delete(&self, _id: Uuid) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::delete lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
// --- Search ---
|
||||
async fn search(&self, _query: &SearchQuery) -> MemoryStoreResult<Vec<SearchResult>> {
|
||||
todo!("PgMemoryStore::search lands in 0002e-hybrid-search.md")
|
||||
}
|
||||
|
||||
async fn fts_search(
|
||||
&self,
|
||||
_text: &str,
|
||||
_limit: usize,
|
||||
) -> MemoryStoreResult<Vec<SearchResult>> {
|
||||
todo!("PgMemoryStore::fts_search lands in 0002e-hybrid-search.md")
|
||||
}
|
||||
|
||||
async fn vector_search(
|
||||
&self,
|
||||
_embedding: &[f32],
|
||||
_limit: usize,
|
||||
) -> MemoryStoreResult<Vec<SearchResult>> {
|
||||
todo!("PgMemoryStore::vector_search lands in 0002e-hybrid-search.md")
|
||||
}
|
||||
|
||||
// --- FSRS Scheduling ---
|
||||
async fn get_scheduling(
|
||||
&self,
|
||||
_memory_id: Uuid,
|
||||
) -> MemoryStoreResult<Option<SchedulingState>> {
|
||||
todo!("PgMemoryStore::get_scheduling lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn update_scheduling(&self, _state: &SchedulingState) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::update_scheduling lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn get_due_memories(
|
||||
&self,
|
||||
_before: DateTime<Utc>,
|
||||
_limit: usize,
|
||||
) -> MemoryStoreResult<Vec<(MemoryRecord, SchedulingState)>> {
|
||||
todo!("PgMemoryStore::get_due_memories lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
// --- Graph (spreading activation) ---
|
||||
async fn add_edge(&self, _edge: &MemoryEdge) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::add_edge lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn get_edges(
|
||||
&self,
|
||||
_node_id: Uuid,
|
||||
_edge_type: Option<&str>,
|
||||
) -> MemoryStoreResult<Vec<MemoryEdge>> {
|
||||
todo!("PgMemoryStore::get_edges lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn remove_edge(&self, _source: Uuid, _target: Uuid) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::remove_edge lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn get_neighbors(
|
||||
&self,
|
||||
_node_id: Uuid,
|
||||
_depth: usize,
|
||||
) -> MemoryStoreResult<Vec<(MemoryRecord, f64)>> {
|
||||
todo!("PgMemoryStore::get_neighbors lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
// --- Domains (Phase 1: stubs return empty; full impl in Phase 4) ---
|
||||
async fn list_domains(&self) -> MemoryStoreResult<Vec<Domain>> {
|
||||
todo!("PgMemoryStore::list_domains lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn get_domain(&self, _id: &str) -> MemoryStoreResult<Option<Domain>> {
|
||||
todo!("PgMemoryStore::get_domain lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn upsert_domain(&self, _domain: &Domain) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::upsert_domain lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn delete_domain(&self, _id: &str) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::delete_domain lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn classify(&self, _embedding: &[f32]) -> MemoryStoreResult<Vec<(String, f64)>> {
|
||||
todo!("PgMemoryStore::classify lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
// --- Bulk / Maintenance ---
|
||||
async fn count(&self) -> MemoryStoreResult<usize> {
|
||||
todo!("PgMemoryStore::count lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn get_stats(&self) -> MemoryStoreResult<StoreStats> {
|
||||
todo!("PgMemoryStore::get_stats lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
|
||||
async fn vacuum(&self) -> MemoryStoreResult<()> {
|
||||
todo!("PgMemoryStore::vacuum lands in 0002d-store-impl-bodies.md")
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Notes on the skeleton:
|
||||
|
||||
- The file-level `#![cfg(feature = "postgres-backend")]` means the whole file
|
||||
vanishes when the feature is off. The `mod postgres;` line in
|
||||
`storage/mod.rs` is itself feature-gated, so this is belt-and-braces; both
|
||||
gates are needed because the file-level attribute is what allows the file to
|
||||
use `sqlx::PgPool` unconditionally inside it.
|
||||
- `EmbeddingModelDescriptor` (a separate Postgres-internal type that the
|
||||
master plan sketched on the struct) is dropped. The trait surface already
|
||||
carries `ModelSignature` for the registry round-trip; the registry storage
|
||||
layout is a private concern of `registry.rs`, which is added later. Keep
|
||||
`PgMemoryStore` minimal until a real consumer needs the extra type.
|
||||
- The struct only carries `pool` and `embedding_dim`. The model descriptor
|
||||
field from the master plan sketch goes away with `EmbeddingModelDescriptor`.
|
||||
If `register_model` later needs to cache the descriptor on the struct, it
|
||||
can be added then; the skeleton does not speculate.
|
||||
- The two trivial accessors (`pool`, `embedding_dim`) get real bodies. Every
|
||||
other method is `todo!()` so it returns `!` and trivially coerces to the
|
||||
declared return type at the type checker; this is what lets the build pass
|
||||
with no error variants and no SQL.
|
||||
|
||||
---
|
||||
|
||||
## Connect signature
|
||||
|
||||
Per ADR 0002 D2:
|
||||
|
||||
```rust
|
||||
pub async fn connect(url: &str, max_connections: u32) -> MemoryStoreResult<Self>;
|
||||
pub async fn from_pool(pool: PgPool) -> MemoryStoreResult<Self>;
|
||||
```
|
||||
|
||||
No `&dyn Embedder` argument. This deliberately differs from master plan 0002,
|
||||
which predates the Phase 1 freeze. The pgvector-specific DDL
|
||||
(`ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)`) does not run
|
||||
inside `connect`; it runs inside `register_model(&ModelSignature)`, which the
|
||||
caller invokes after `connect` returns.
|
||||
|
||||
In this sub-plan `register_model` is `todo!()`. The real body lands in
|
||||
`0002d-store-impl-bodies.md` after `0002c-migrations.md` ships the
|
||||
`0001_init.up.sql` migration that creates the `memories` table with a
|
||||
placeholder `embedding vector` column (no typmod), against which
|
||||
`register_model` later runs the typmod stamp.
|
||||
|
||||
---
|
||||
|
||||
## Error variant additions: deferred
|
||||
|
||||
`MemoryStoreError` does NOT gain `Postgres(sqlx::Error)` or
|
||||
`Migrate(sqlx::migrate::MigrateError)` in this sub-plan.
|
||||
|
||||
The reason is mechanical: `todo!()` evaluates to the never type `!`, which
|
||||
coerces to any `MemoryStoreResult<T>` regardless of the error variants
|
||||
available. With every method body a `todo!()`, the skeleton has no expression
|
||||
that needs to convert a `sqlx::Error` or `sqlx::migrate::MigrateError` into
|
||||
`MemoryStoreError`. Adding the variants here would mean adding the
|
||||
`#[cfg(feature = "postgres-backend")]` and `#[from]` plumbing to
|
||||
`memory_store.rs` with no consumer yet -- dead code at every level except the
|
||||
enum definition itself.
|
||||
|
||||
`0002d-store-impl-bodies.md` introduces both variants in the same commit that
|
||||
turns the first `todo!()` into a real `sqlx::query!` call. That keeps the
|
||||
diff to `memory_store.rs` next to the first usage site, which is easier to
|
||||
review than adding variants ahead of need.
|
||||
|
||||
For reference, the variants that will be added in `0002d-store-impl-bodies.md`
|
||||
look like this:
|
||||
|
||||
```rust
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
#[error("postgres error: {0}")]
|
||||
Postgres(#[from] sqlx::Error),
|
||||
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
#[error("postgres migration error: {0}")]
|
||||
Migrate(#[from] sqlx::migrate::MigrateError),
|
||||
```
|
||||
|
||||
Do not pre-add them here.
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
Run these commands from the workspace root. All four must produce a clean
|
||||
build, zero warnings on the diff-affected files, no test changes.
|
||||
|
||||
```bash
|
||||
# 1. Default features (SQLite backend, postgres-backend OFF). Must build.
|
||||
cargo build --workspace --all-targets
|
||||
|
||||
# 2. Workspace clippy with default features. Must be clean.
|
||||
cargo clippy --workspace --all-targets -- -D warnings
|
||||
|
||||
# 3. Postgres feature enabled. Must build.
|
||||
cargo build -p vestige-core --features postgres-backend
|
||||
|
||||
# 4. Clippy with postgres feature enabled. Must be clean.
|
||||
cargo clippy -p vestige-core --features postgres-backend --all-targets -- -D warnings
|
||||
```
|
||||
|
||||
Expected outcomes:
|
||||
|
||||
- `cargo build --workspace --all-targets` finishes with no compilation of
|
||||
`sqlx` or `pgvector` (both are optional, no consumer with default features).
|
||||
The `postgres` module is excluded entirely via `#[cfg]`.
|
||||
- `cargo build -p vestige-core --features postgres-backend` compiles `sqlx`,
|
||||
`pgvector`, and `storage/postgres/mod.rs`. The build succeeds because every
|
||||
trait method is `todo!()`; nothing actually runs SQL.
|
||||
- Both `clippy` invocations pass with `-D warnings`. The `todo!()` macro does
|
||||
not emit a `dead_code` lint by itself, and the trivial accessors are used by
|
||||
later sub-plans (clippy on the postgres feature alone may flag them as
|
||||
unused if you run with `--lib` only; the `--all-targets` form keeps tests
|
||||
and benches in scope so this does not fire).
|
||||
- If clippy flags `unused_variables` on the underscore-prefixed parameters in
|
||||
the `todo!()` bodies, the underscore prefix is already the standard
|
||||
suppression; if a future clippy version disagrees, add
|
||||
`#[allow(unused_variables)]` to the impl block, not to each method.
|
||||
|
||||
Tests are not modified in this sub-plan. The unit tests in
|
||||
`memory_store.rs` (`memory_store_error_from_storage_error`,
|
||||
`model_signature_serde_round_trip`, `memory_record_serde_round_trip`) keep
|
||||
passing because no type they touch changes.
|
||||
|
||||
Do NOT run `cargo test` against the postgres feature -- there is no Postgres
|
||||
running and no test exercises `PgMemoryStore` yet. The build check is the
|
||||
contract.
|
||||
|
||||
---
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
1. `crates/vestige-core/Cargo.toml` declares `sqlx = "0.8"` and
|
||||
`pgvector = "0.4"` as optional dependencies with the exact feature sets
|
||||
specified above.
|
||||
2. `crates/vestige-core/Cargo.toml` declares `postgres-backend = ["dep:sqlx",
|
||||
"dep:pgvector"]` and nothing else inside that feature.
|
||||
3. `crates/vestige-mcp/Cargo.toml` is unchanged.
|
||||
4. `crates/vestige-core/src/storage/mod.rs` adds exactly two
|
||||
feature-gated lines: `mod postgres;` and `pub use postgres::PgMemoryStore;`.
|
||||
No other change.
|
||||
5. `crates/vestige-core/src/storage/postgres/mod.rs` exists and contains the
|
||||
`PgMemoryStore` struct, `impl PgMemoryStore` block with real `pool` and
|
||||
`embedding_dim` accessors and `todo!()` bodies for `connect` and
|
||||
`from_pool`, and the full `impl LocalMemoryStore for PgMemoryStore` block
|
||||
with `todo!()` for every trait method.
|
||||
6. The trait impl method signatures match `memory_store.rs` byte-for-byte
|
||||
(including `remove_edge(&self, source: Uuid, target: Uuid)` two-arg form,
|
||||
not the three-arg form from the master plan).
|
||||
7. `MemoryStoreError` is unchanged.
|
||||
8. No other files in the crate are touched. No new files in
|
||||
`storage/postgres/` besides `mod.rs`.
|
||||
9. The four verification commands above all succeed.
|
||||
|
||||
---
|
||||
|
||||
## Commit sequence
|
||||
|
||||
One commit is recommended. The two changes (Cargo manifest + module skeleton)
|
||||
do not compile in isolation: the manifest change without the skeleton produces
|
||||
unused-optional-dep warnings, and the skeleton without the manifest change
|
||||
fails to find `sqlx`. Splitting them adds no review value, since the second
|
||||
commit is the one that has to compile cleanly.
|
||||
|
||||
```bash
|
||||
git add crates/vestige-core/Cargo.toml \
|
||||
crates/vestige-core/Cargo.lock \
|
||||
crates/vestige-core/src/storage/mod.rs \
|
||||
crates/vestige-core/src/storage/postgres/mod.rs
|
||||
|
||||
git commit -m "feat(storage): scaffold postgres-backend feature and PgMemoryStore skeleton
|
||||
|
||||
Adds the postgres-backend Cargo feature gating sqlx 0.8 and pgvector 0.4.
|
||||
Introduces crates/vestige-core/src/storage/postgres/mod.rs with the
|
||||
PgMemoryStore struct, connect/from_pool/pool/embedding_dim, and a trait impl
|
||||
whose method bodies are todo!() pending later Phase 2 sub-plans.
|
||||
|
||||
Builds clean with default features (SQLite only) and with --features
|
||||
postgres-backend. No runtime behaviour change.
|
||||
|
||||
Refs ADR 0002 D1, D2, D4."
|
||||
```
|
||||
|
||||
If for any reason the manifest change must be reviewed separately (for
|
||||
example, a security review of the sqlx version pin), split as:
|
||||
|
||||
1. `cargo add` for sqlx and pgvector + manual feature line in Cargo.toml.
|
||||
Build with default features will pass but `--features postgres-backend`
|
||||
will fail (no module to satisfy the feature). This is acceptable for a
|
||||
short-lived intermediate commit.
|
||||
2. `storage/mod.rs` edits + `storage/postgres/mod.rs` creation. Both builds
|
||||
pass.
|
||||
|
||||
Default to the single-commit form unless asked to split.
|
||||
|
||||
---
|
||||
|
||||
## Followups
|
||||
|
||||
- `0002b-pool-and-config.md` adds `pool.rs`, `PostgresConfig`, and the
|
||||
`vestige-mcp` `postgres-backend` pass-through feature.
|
||||
- `0002c-migrations.md` adds `crates/vestige-core/migrations/postgres/` with
|
||||
`0001_init.{up,down}.sql` and `0002_hnsw.{up,down}.sql`, plus
|
||||
`postgres/migrations.rs` invoking `sqlx::migrate!`. `init()` body lands here.
|
||||
- `0002d-store-impl-bodies.md` introduces the two `MemoryStoreError` variants
|
||||
and replaces every `todo!()` in CRUD / scheduling / edges / domains /
|
||||
registry with real `sqlx::query!` bodies.
|
||||
- `0002e-hybrid-search.md` fills the three search bodies via the RRF query.
|
||||
886
docs/plans/0002b-pool-and-config.md
Normal file
886
docs/plans/0002b-pool-and-config.md
Normal file
|
|
@ -0,0 +1,886 @@
|
|||
# Sub-plan 0002b -- Pool construction and VestigeConfig
|
||||
|
||||
**Status**: Draft
|
||||
**Master plan**: [0002-phase-2-postgres-backend.md](0002-phase-2-postgres-backend.md)
|
||||
**ADR**: [0002-phase-2-execution.md](../adr/0002-phase-2-execution.md)
|
||||
**Predecessor**: [0002a-skeleton-and-feature-gate.md](0002a-skeleton-and-feature-gate.md)
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
This sub-plan delivers two of the master plan's deliverables now that the
|
||||
`0002a` skeleton has landed:
|
||||
|
||||
- **D3** -- pool construction in
|
||||
`crates/vestige-core/src/storage/postgres/pool.rs`. Replaces the `todo!()`
|
||||
body of `PgMemoryStore::connect` with a real `PgPool` builder that reads a
|
||||
`PostgresConfig`. Registry/migration calls remain `todo!()`; those are
|
||||
filled in by sub-plans `0002c` (migrations) and `0002d` (store bodies +
|
||||
registry).
|
||||
- **D7** -- new module `crates/vestige-core/src/config.rs` containing
|
||||
`VestigeConfig`, `StorageConfig`, `SqliteConfig`, `PostgresConfig`,
|
||||
`EmbeddingsConfig`, plus a `ConfigError` enum and a loader that reads
|
||||
`vestige.toml`. The loader is wired into `vestige-mcp` so the running
|
||||
server picks SQLite or Postgres at startup based on the config file.
|
||||
|
||||
After this sub-plan:
|
||||
|
||||
- `cargo build` (default features, no `postgres-backend`) compiles and the
|
||||
MCP server still defaults to SQLite when no `vestige.toml` is present.
|
||||
- `cargo build --features postgres-backend` compiles, with
|
||||
`PgMemoryStore::connect` now wiring through `pool.rs` (registry/migration
|
||||
still `todo!()` until `0002c` and `0002d`).
|
||||
- A `vestige.toml` example can be round-tripped through
|
||||
`VestigeConfig::load` in a unit test.
|
||||
|
||||
This sub-plan deliberately does NOT:
|
||||
|
||||
- Add migrations (`0002c`).
|
||||
- Fill in real CRUD/search bodies on `PgMemoryStore` (`0002d`, `0002e`).
|
||||
- Add env-var override support (Phase 3 concern, called out in master plan
|
||||
D7 behaviour notes).
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
- `0002a-skeleton-and-feature-gate.md` must be merged. That sub-plan creates
|
||||
`crates/vestige-core/src/storage/postgres/mod.rs` with:
|
||||
- `PgMemoryStore` struct holding `pool: PgPool`.
|
||||
- `PgMemoryStore::connect(url: &str, max_connections: u32) -> MemoryStoreResult<Self>`
|
||||
body = `todo!()`.
|
||||
- `PgMemoryStore::from_pool(pool: PgPool) -> MemoryStoreResult<Self>`
|
||||
body = `todo!()`.
|
||||
- The trait impl block with all methods routed to `todo!()`.
|
||||
- The `postgres-backend` feature gate on the module declaration in
|
||||
`storage/mod.rs`.
|
||||
|
||||
This sub-plan extends those bodies and adds two siblings: `pool.rs` and
|
||||
`registry.rs` (the latter is a stub here, real body in `0002d`).
|
||||
|
||||
---
|
||||
|
||||
## Audit step (do this first)
|
||||
|
||||
Before adding `config.rs`, confirm there is no existing top-level config
|
||||
loader. Run from the repo root:
|
||||
|
||||
```bash
|
||||
rg -nF 'VestigeConfig' crates/
|
||||
rg -nF 'toml::from_str' crates/
|
||||
rg -n '#\[derive.*Deserialize.*\]' crates/vestige-core/src/
|
||||
```
|
||||
|
||||
If a `VestigeConfig` struct already exists from Phase 1, treat the "Config
|
||||
module" section below as additive: extend the existing struct rather than
|
||||
creating a new file. The cross-cut additions in that case are:
|
||||
|
||||
1. Add the `StorageConfig` enum (gated and ungated branches).
|
||||
2. Add `SqliteConfig`, `PostgresConfig`.
|
||||
3. Add the `default_path()` helper if missing.
|
||||
4. Add `ConfigError` if a different error enum is used today (rename/extend
|
||||
instead of duplicating).
|
||||
|
||||
As of the audit at the time of this writing, no `VestigeConfig` exists in
|
||||
`vestige-core`. `directories::ProjectDirs` is already used in
|
||||
`vestige-core/src/embeddings/local.rs` and in
|
||||
`vestige-mcp/src/protocol/auth.rs`, so the `directories` crate is already a
|
||||
workspace dependency -- no new dep there.
|
||||
|
||||
---
|
||||
|
||||
## Cargo manifest additions
|
||||
|
||||
Add `toml` to `vestige-core`. `serde` and `thiserror` are already present
|
||||
from Phase 1; `directories` is already a transitive dep but we add it
|
||||
explicitly so `default_path()` is supported.
|
||||
|
||||
```bash
|
||||
cd crates/vestige-core
|
||||
cargo add toml@0.8
|
||||
cargo add directories@5
|
||||
```
|
||||
|
||||
No new deps on `vestige-mcp`; it already depends on `vestige-core`.
|
||||
|
||||
`sqlx` is already added by `0002a` behind the `postgres-backend` feature
|
||||
with `runtime-tokio`, `tls-rustls`, `postgres`, `uuid`, `chrono`,
|
||||
`json`, `macros`, `migrate` features. The pool module only uses what is
|
||||
already pulled in.
|
||||
|
||||
---
|
||||
|
||||
## Config module
|
||||
|
||||
**File**: `crates/vestige-core/src/config.rs` (new).
|
||||
**Re-exported** from `crates/vestige-core/src/lib.rs` as `pub mod config;` plus
|
||||
`pub use config::{VestigeConfig, StorageConfig, SqliteConfig, EmbeddingsConfig, ConfigError};`
|
||||
and `#[cfg(feature = "postgres-backend")] pub use config::PostgresConfig;`.
|
||||
|
||||
Full content:
|
||||
|
||||
```rust
|
||||
//! Vestige top-level configuration.
|
||||
//!
|
||||
//! Loaded from `~/.vestige/vestige.toml` by default; the path is overridable
|
||||
//! via `VestigeConfig::load(Some(&path))`. Parsing uses serde + toml; the
|
||||
//! `[storage]` section is internally-tagged on a `backend` field so a single
|
||||
//! enum dispatch picks SQLite or Postgres.
|
||||
|
||||
use std::path::{Path, PathBuf};
|
||||
|
||||
use serde::Deserialize;
|
||||
|
||||
/// Top-level configuration as parsed from `vestige.toml`.
|
||||
#[derive(Debug, Clone, Deserialize, Default)]
|
||||
#[serde(default, deny_unknown_fields)]
|
||||
pub struct VestigeConfig {
|
||||
pub embeddings: EmbeddingsConfig,
|
||||
pub storage: StorageConfig,
|
||||
/// Reserved for Phase 3. Empty in Phase 2.
|
||||
pub server: ServerConfig,
|
||||
/// Reserved for Phase 3. Empty in Phase 2.
|
||||
pub auth: AuthConfig,
|
||||
}
|
||||
|
||||
/// Embedding provider selection.
|
||||
#[derive(Debug, Clone, Deserialize)]
|
||||
#[serde(deny_unknown_fields)]
|
||||
pub struct EmbeddingsConfig {
|
||||
/// Provider key. Phase 2 ships `"fastembed"` only.
|
||||
pub provider: String,
|
||||
/// Model name. For fastembed this is e.g. `"nomic-ai/nomic-embed-text-v1.5"`.
|
||||
pub model: String,
|
||||
}
|
||||
|
||||
impl Default for EmbeddingsConfig {
|
||||
fn default() -> Self {
|
||||
Self {
|
||||
provider: "fastembed".to_string(),
|
||||
model: crate::DEFAULT_EMBEDDING_MODEL.to_string(),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Storage backend selection. Internally tagged on the `backend` field:
|
||||
///
|
||||
/// ```toml
|
||||
/// [storage]
|
||||
/// backend = "sqlite"
|
||||
///
|
||||
/// [storage.sqlite]
|
||||
/// path = "/home/user/.vestige/vestige.db"
|
||||
/// ```
|
||||
///
|
||||
/// or, when compiled with `--features postgres-backend`:
|
||||
///
|
||||
/// ```toml
|
||||
/// [storage]
|
||||
/// backend = "postgres"
|
||||
///
|
||||
/// [storage.postgres]
|
||||
/// url = "postgres://vestige:secret@localhost:5432/vestige"
|
||||
/// max_connections = 10
|
||||
/// acquire_timeout_secs = 30
|
||||
/// ```
|
||||
#[derive(Debug, Clone, Deserialize)]
|
||||
#[serde(tag = "backend", rename_all = "lowercase", deny_unknown_fields)]
|
||||
pub enum StorageConfig {
|
||||
Sqlite(SqliteConfig),
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
Postgres(PostgresConfig),
|
||||
}
|
||||
|
||||
impl Default for StorageConfig {
|
||||
fn default() -> Self {
|
||||
StorageConfig::Sqlite(SqliteConfig::default())
|
||||
}
|
||||
}
|
||||
|
||||
/// SQLite backend configuration.
|
||||
#[derive(Debug, Clone, Deserialize)]
|
||||
#[serde(deny_unknown_fields)]
|
||||
pub struct SqliteConfig {
|
||||
/// Path to the `vestige.db` file. If unset, the SqliteMemoryStore
|
||||
/// constructor picks its platform default location.
|
||||
#[serde(default)]
|
||||
pub path: Option<PathBuf>,
|
||||
}
|
||||
|
||||
impl Default for SqliteConfig {
|
||||
fn default() -> Self {
|
||||
Self { path: None }
|
||||
}
|
||||
}
|
||||
|
||||
/// Postgres backend configuration. Only present when the `postgres-backend`
|
||||
/// Cargo feature is enabled.
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
#[derive(Debug, Clone, Deserialize)]
|
||||
#[serde(deny_unknown_fields)]
|
||||
pub struct PostgresConfig {
|
||||
/// `postgres://user:pass@host:port/db` -- forwarded to
|
||||
/// `PgConnectOptions::from_str`.
|
||||
pub url: String,
|
||||
/// Pool size. Default `10`.
|
||||
#[serde(default)]
|
||||
pub max_connections: Option<u32>,
|
||||
/// Acquire timeout in seconds. Default `30`. Set above 30 so
|
||||
/// testcontainer-based test fixtures do not race.
|
||||
#[serde(default)]
|
||||
pub acquire_timeout_secs: Option<u64>,
|
||||
}
|
||||
|
||||
/// Reserved for Phase 3 (bind address, ports, TLS).
|
||||
#[derive(Debug, Clone, Default, Deserialize)]
|
||||
#[serde(default, deny_unknown_fields)]
|
||||
pub struct ServerConfig {}
|
||||
|
||||
/// Reserved for Phase 3 (API keys, claims).
|
||||
#[derive(Debug, Clone, Default, Deserialize)]
|
||||
#[serde(default, deny_unknown_fields)]
|
||||
pub struct AuthConfig {}
|
||||
|
||||
/// Errors raised while locating, reading, or parsing `vestige.toml`.
|
||||
#[derive(Debug, thiserror::Error)]
|
||||
pub enum ConfigError {
|
||||
#[error("config io: {0}")]
|
||||
Io(#[from] std::io::Error),
|
||||
#[error("config toml: {0}")]
|
||||
Toml(#[from] toml::de::Error),
|
||||
#[error("config dir: could not locate user home")]
|
||||
NoHome,
|
||||
#[error("invalid config: {0}")]
|
||||
Invalid(String),
|
||||
}
|
||||
|
||||
impl VestigeConfig {
|
||||
/// Load config from `path` or from `default_path()` when `None`.
|
||||
///
|
||||
/// Returns `VestigeConfig::default()` (SQLite + fastembed defaults) when
|
||||
/// the file does not exist. Any other I/O or parse failure is surfaced
|
||||
/// as a `ConfigError`.
|
||||
pub fn load(path: Option<&Path>) -> Result<Self, ConfigError> {
|
||||
let resolved: PathBuf = match path {
|
||||
Some(p) => p.to_path_buf(),
|
||||
None => Self::default_path()?,
|
||||
};
|
||||
|
||||
match std::fs::read_to_string(&resolved) {
|
||||
Ok(text) => {
|
||||
let cfg: VestigeConfig = toml::from_str(&text)?;
|
||||
cfg.validate()?;
|
||||
Ok(cfg)
|
||||
}
|
||||
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
|
||||
Ok(Self::default())
|
||||
}
|
||||
Err(e) => Err(ConfigError::Io(e)),
|
||||
}
|
||||
}
|
||||
|
||||
/// `~/.vestige/vestige.toml`. The directory is NOT created here; loading
|
||||
/// a missing file falls back to defaults.
|
||||
pub fn default_path() -> Result<PathBuf, ConfigError> {
|
||||
let dirs = directories::ProjectDirs::from("", "vestige", "vestige")
|
||||
.ok_or(ConfigError::NoHome)?;
|
||||
// ProjectDirs::config_dir() varies per OS. Vestige convention is
|
||||
// ~/.vestige/vestige.toml on Linux/macOS regardless of XDG, so we
|
||||
// build the path off the home dir explicitly.
|
||||
let home = directories::UserDirs::new()
|
||||
.ok_or(ConfigError::NoHome)?
|
||||
.home_dir()
|
||||
.to_path_buf();
|
||||
let _ = dirs; // keep the dep wired; future Phase 3 may use it
|
||||
Ok(home.join(".vestige").join("vestige.toml"))
|
||||
}
|
||||
|
||||
/// Light cross-field validation. Heavy validation (URL parsing,
|
||||
/// directory existence) is left to the backend constructors.
|
||||
fn validate(&self) -> Result<(), ConfigError> {
|
||||
if self.embeddings.provider.is_empty() {
|
||||
return Err(ConfigError::Invalid(
|
||||
"embeddings.provider must not be empty".into(),
|
||||
));
|
||||
}
|
||||
if self.embeddings.model.is_empty() {
|
||||
return Err(ConfigError::Invalid(
|
||||
"embeddings.model must not be empty".into(),
|
||||
));
|
||||
}
|
||||
match &self.storage {
|
||||
StorageConfig::Sqlite(_) => {}
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
StorageConfig::Postgres(cfg) => {
|
||||
if cfg.url.is_empty() {
|
||||
return Err(ConfigError::Invalid(
|
||||
"storage.postgres.url must not be empty".into(),
|
||||
));
|
||||
}
|
||||
}
|
||||
}
|
||||
Ok(())
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
### Serde behaviour with `postgres-backend` off
|
||||
|
||||
`StorageConfig` is generated by serde only for the variants that are
|
||||
compiled in. When `postgres-backend` is off and the user writes:
|
||||
|
||||
```toml
|
||||
[storage]
|
||||
backend = "postgres"
|
||||
|
||||
[storage.postgres]
|
||||
url = "..."
|
||||
```
|
||||
|
||||
serde returns a `toml::de::Error` of the form
|
||||
`unknown variant `postgres`, expected `sqlite``. That error path goes
|
||||
through `From<toml::de::Error> for ConfigError`, surfacing as
|
||||
`ConfigError::Toml(..)`. The MCP server prints this once at startup and
|
||||
exits with a non-zero code; there is no panic.
|
||||
|
||||
To make the error friendlier we wrap that specific case in a clearer
|
||||
message via a thin post-parse check. Add this small helper after parsing
|
||||
in `load()`:
|
||||
|
||||
```rust
|
||||
// (Inside the Ok(text) arm in load(), wrapping the parse step.)
|
||||
let cfg: VestigeConfig = match toml::from_str(&text) {
|
||||
Ok(c) => c,
|
||||
Err(e) => {
|
||||
let msg = e.to_string();
|
||||
if msg.contains("unknown variant `postgres`") {
|
||||
return Err(ConfigError::Invalid(
|
||||
"storage.backend = \"postgres\" requires building with --features postgres-backend".into(),
|
||||
));
|
||||
}
|
||||
return Err(ConfigError::Toml(e));
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
This keeps the strict default deny_unknown_fields behaviour while giving the
|
||||
user a one-line action item.
|
||||
|
||||
---
|
||||
|
||||
## Pool module
|
||||
|
||||
**File**: `crates/vestige-core/src/storage/postgres/pool.rs` (new).
|
||||
|
||||
```rust
|
||||
#![cfg(feature = "postgres-backend")]
|
||||
|
||||
//! `PgPool` construction for the Postgres backend.
|
||||
//!
|
||||
//! Pool defaults follow ADR 0002 D2 + master plan D3:
|
||||
//! - max_connections = 10
|
||||
//! - acquire_timeout = 30s (must exceed testcontainer warmup)
|
||||
//! - idle_timeout = 600s
|
||||
//! - max_lifetime = 1800s
|
||||
//! - test_before_acquire = false (cheap queries; saves a roundtrip)
|
||||
|
||||
use std::str::FromStr;
|
||||
use std::time::Duration;
|
||||
|
||||
use sqlx::postgres::{PgConnectOptions, PgPoolOptions};
|
||||
use sqlx::{ConnectOptions, PgPool};
|
||||
|
||||
use crate::config::PostgresConfig;
|
||||
use crate::storage::memory_store::{MemoryStoreError, MemoryStoreResult};
|
||||
|
||||
const DEFAULT_MAX_CONNECTIONS: u32 = 10;
|
||||
const DEFAULT_ACQUIRE_TIMEOUT_SECS: u64 = 30;
|
||||
const IDLE_TIMEOUT_SECS: u64 = 600;
|
||||
const MAX_LIFETIME_SECS: u64 = 1800;
|
||||
const STATEMENT_CACHE_CAPACITY: usize = 256;
|
||||
|
||||
/// Build a Postgres connection pool from a `PostgresConfig`. Does NOT run
|
||||
/// migrations or stamp the embedding registry; those are the caller's job
|
||||
/// (`PgMemoryStore::connect`).
|
||||
pub async fn build_pool(cfg: &PostgresConfig) -> MemoryStoreResult<PgPool> {
|
||||
let opts = PgConnectOptions::from_str(&cfg.url)
|
||||
.map_err(MemoryStoreError::from)?
|
||||
.application_name("vestige")
|
||||
.statement_cache_capacity(STATEMENT_CACHE_CAPACITY)
|
||||
.log_statements(tracing::log::LevelFilter::Debug);
|
||||
|
||||
let max_conn = cfg.max_connections.unwrap_or(DEFAULT_MAX_CONNECTIONS);
|
||||
let acquire = cfg
|
||||
.acquire_timeout_secs
|
||||
.unwrap_or(DEFAULT_ACQUIRE_TIMEOUT_SECS);
|
||||
|
||||
let pool = PgPoolOptions::new()
|
||||
.max_connections(max_conn)
|
||||
.min_connections(0)
|
||||
.acquire_timeout(Duration::from_secs(acquire))
|
||||
.idle_timeout(Some(Duration::from_secs(IDLE_TIMEOUT_SECS)))
|
||||
.max_lifetime(Some(Duration::from_secs(MAX_LIFETIME_SECS)))
|
||||
.test_before_acquire(false)
|
||||
.connect_with(opts)
|
||||
.await
|
||||
.map_err(MemoryStoreError::from)?;
|
||||
|
||||
Ok(pool)
|
||||
}
|
||||
```
|
||||
|
||||
### Wiring into `PgMemoryStore::connect`
|
||||
|
||||
In `crates/vestige-core/src/storage/postgres/mod.rs`, replace the
|
||||
`todo!()` body left by `0002a` for `connect` and `from_pool` with:
|
||||
|
||||
```rust
|
||||
// In crates/vestige-core/src/storage/postgres/mod.rs
|
||||
|
||||
use sqlx::PgPool;
|
||||
|
||||
use crate::config::PostgresConfig;
|
||||
use crate::storage::memory_store::{MemoryStoreError, MemoryStoreResult};
|
||||
|
||||
mod pool;
|
||||
mod registry; // see "Registry stub" section below
|
||||
|
||||
pub struct PgMemoryStore {
|
||||
pool: PgPool,
|
||||
}
|
||||
|
||||
impl PgMemoryStore {
|
||||
/// Convenience constructor matching `SqliteMemoryStore::new` shape.
|
||||
/// Takes a URL + pool size for the common case.
|
||||
pub async fn connect(url: &str, max_connections: u32) -> MemoryStoreResult<Self> {
|
||||
let cfg = PostgresConfig {
|
||||
url: url.to_string(),
|
||||
max_connections: Some(max_connections),
|
||||
acquire_timeout_secs: None,
|
||||
};
|
||||
Self::connect_with(&cfg).await
|
||||
}
|
||||
|
||||
/// Full-config constructor.
|
||||
pub async fn connect_with(cfg: &PostgresConfig) -> MemoryStoreResult<Self> {
|
||||
let pool = pool::build_pool(cfg).await?;
|
||||
Self::from_pool(pool).await
|
||||
}
|
||||
|
||||
/// Construct from an already-built pool (used by tests and the migrate
|
||||
/// CLI to share a pool across operations).
|
||||
pub async fn from_pool(pool: PgPool) -> MemoryStoreResult<Self> {
|
||||
// Migrations are added by 0002c.
|
||||
// todo!("run sqlx::migrate! once 0002c lands")
|
||||
registry::ensure_registry_stub(&pool).await?;
|
||||
Ok(Self { pool })
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
`connect_with` is the long-lived API; `connect` becomes a thin shim that
|
||||
stays compatible with the master-plan-mandated signature.
|
||||
|
||||
### Registry stub
|
||||
|
||||
**File**: `crates/vestige-core/src/storage/postgres/registry.rs` (new, stub).
|
||||
|
||||
```rust
|
||||
#![cfg(feature = "postgres-backend")]
|
||||
|
||||
//! Embedding registry. Real body lands in sub-plan 0002d.
|
||||
|
||||
use sqlx::PgPool;
|
||||
|
||||
use crate::storage::memory_store::MemoryStoreResult;
|
||||
|
||||
/// Placeholder. Real implementation in 0002d reads/writes `embedding_model`
|
||||
/// and stamps `ALTER TABLE memories ALTER COLUMN embedding TYPE vector($N)`.
|
||||
pub(crate) async fn ensure_registry_stub(_pool: &PgPool) -> MemoryStoreResult<()> {
|
||||
// Intentionally a no-op until 0002c lands the table + 0002d lands the
|
||||
// real body. Leaving this as todo!() would crash the MCP server at
|
||||
// startup the moment a user switches `backend = "postgres"`, which is
|
||||
// not what we want for the build verification step in this sub-plan.
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
The no-op keeps `cargo build --features postgres-backend` not just
|
||||
compiling but also allowing the MCP server to *boot* against a Postgres
|
||||
URL pointing at an already-migrated database (the local-dev-postgres-setup
|
||||
docs cover bringing up such a DB by hand). Real init lands in `0002d`.
|
||||
|
||||
---
|
||||
|
||||
## Error variants
|
||||
|
||||
**File**: `crates/vestige-core/src/storage/memory_store.rs` (edit).
|
||||
|
||||
The Phase 1 enum `MemoryStoreError` gains two feature-gated variants. These
|
||||
were deferred in `0002a` and become required as soon as `pool.rs` calls
|
||||
`.map_err(MemoryStoreError::from)` on `sqlx::Error`.
|
||||
|
||||
```rust
|
||||
// Within enum MemoryStoreError { ... } in memory_store.rs
|
||||
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
#[error("postgres error: {0}")]
|
||||
Postgres(#[from] sqlx::Error),
|
||||
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
#[error("postgres migration error: {0}")]
|
||||
Migrate(#[from] sqlx::migrate::MigrateError),
|
||||
```
|
||||
|
||||
Both use thiserror's `#[from]` attribute so the `?` operator works in
|
||||
`pool.rs`, the migrate module (`0002c`), and registry code (`0002d`).
|
||||
Default-features build (no `postgres-backend`) sees neither variant; the
|
||||
enum stays exhaustive on stable.
|
||||
|
||||
If clippy fires on `non_exhaustive` due to the gated variants, add
|
||||
`#[non_exhaustive]` on the enum. That has no caller-side effect since the
|
||||
enum is constructed only inside the crate.
|
||||
|
||||
---
|
||||
|
||||
## vestige-mcp wiring
|
||||
|
||||
### Cargo feature passthrough
|
||||
|
||||
**File**: `crates/vestige-mcp/Cargo.toml` (edit).
|
||||
|
||||
Add a feature that forwards through to `vestige-core`. Default features in
|
||||
`vestige-mcp` stay unchanged.
|
||||
|
||||
```toml
|
||||
[features]
|
||||
default = ["embeddings", "vector-search"]
|
||||
embeddings = ["vestige-core/embeddings"]
|
||||
vector-search = ["vestige-core/vector-search"]
|
||||
postgres-backend = ["vestige-core/postgres-backend"]
|
||||
```
|
||||
|
||||
Verify with:
|
||||
|
||||
```bash
|
||||
cargo build -p vestige-mcp --features postgres-backend
|
||||
```
|
||||
|
||||
### Backend dispatch at startup
|
||||
|
||||
**File**: `crates/vestige-mcp/src/main.rs` (edit around the existing
|
||||
`Storage::new(storage_path)` call -- see audit note above; in the current
|
||||
worktree this is around line 285).
|
||||
|
||||
The current code is roughly:
|
||||
|
||||
```rust
|
||||
let storage_path = match prepare_storage_path(config.data_dir) { ... };
|
||||
let storage = match Storage::new(storage_path) { ... };
|
||||
```
|
||||
|
||||
Replace that with a dispatch driven by `VestigeConfig`:
|
||||
|
||||
```rust
|
||||
use std::sync::Arc;
|
||||
|
||||
use vestige_core::config::{StorageConfig, VestigeConfig};
|
||||
use vestige_core::storage::SqliteMemoryStore;
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
use vestige_core::storage::postgres::PgMemoryStore;
|
||||
use vestige_core::storage::MemoryStore;
|
||||
|
||||
// Earlier: still call prepare_storage_path to honour --data-dir override.
|
||||
let storage_path = match prepare_storage_path(config.data_dir.clone()) { ... };
|
||||
|
||||
// New: load vestige.toml (or fall back to defaults).
|
||||
let vestige_cfg = match VestigeConfig::load(config.config_path.as_deref()) {
|
||||
Ok(c) => c,
|
||||
Err(e) => {
|
||||
eprintln!("vestige: failed to load config: {e}");
|
||||
std::process::exit(2);
|
||||
}
|
||||
};
|
||||
|
||||
let storage: Arc<dyn MemoryStore> = match &vestige_cfg.storage {
|
||||
StorageConfig::Sqlite(sqlite_cfg) => {
|
||||
// CLI flag --data-dir wins over the config file path.
|
||||
let path = storage_path.clone().or_else(|| sqlite_cfg.path.clone());
|
||||
let s = SqliteMemoryStore::new(path).unwrap_or_else(|e| {
|
||||
eprintln!("vestige: sqlite init failed: {e}");
|
||||
std::process::exit(3);
|
||||
});
|
||||
Arc::new(s)
|
||||
}
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
StorageConfig::Postgres(pg_cfg) => {
|
||||
let s = PgMemoryStore::connect_with(pg_cfg).await.unwrap_or_else(|e| {
|
||||
eprintln!("vestige: postgres init failed: {e}");
|
||||
std::process::exit(3);
|
||||
});
|
||||
Arc::new(s)
|
||||
}
|
||||
};
|
||||
```
|
||||
|
||||
The `config_path: Option<PathBuf>` field on the local `Config` (or
|
||||
clap-derived `Args`) struct must be added if not present; it accepts
|
||||
`--config <path>`. Default behaviour (no flag) goes through
|
||||
`VestigeConfig::default_path()`.
|
||||
|
||||
If the existing main wires `Storage` through a concrete type rather than
|
||||
`Arc<dyn MemoryStore>`, the dispatch above lives behind a helper:
|
||||
|
||||
```rust
|
||||
async fn build_store(cfg: &VestigeConfig, cli_path: Option<PathBuf>)
|
||||
-> Result<Arc<dyn MemoryStore>, anyhow::Error>
|
||||
{ ... }
|
||||
```
|
||||
|
||||
and the caller chains `.into()` as needed. Phase 1 already moved
|
||||
cognitive modules to `Arc<dyn MemoryStore>` so this should be a pure
|
||||
substitution; if a concrete-type holdout is found, fix it locally in this
|
||||
sub-plan (separate commit) rather than punting.
|
||||
|
||||
---
|
||||
|
||||
## vestige.toml example
|
||||
|
||||
The canonical example to ship in `docs/` (Phase 2 docs land in `0002i`,
|
||||
runbook), shown here for reference and used verbatim by the unit test
|
||||
below.
|
||||
|
||||
```toml
|
||||
# vestige.toml -- top-level configuration
|
||||
#
|
||||
# Default location: ~/.vestige/vestige.toml
|
||||
# Override: vestige-mcp --config /path/to/vestige.toml
|
||||
|
||||
[embeddings]
|
||||
provider = "fastembed"
|
||||
model = "nomic-ai/nomic-embed-text-v1.5"
|
||||
|
||||
# --- SQLite backend (default) ---
|
||||
[storage]
|
||||
backend = "sqlite"
|
||||
|
||||
[storage.sqlite]
|
||||
path = "/home/user/.vestige/vestige.db"
|
||||
|
||||
# --- Postgres backend (requires --features postgres-backend) ---
|
||||
# [storage]
|
||||
# backend = "postgres"
|
||||
#
|
||||
# [storage.postgres]
|
||||
# url = "postgres://vestige:secret@localhost:5432/vestige"
|
||||
# max_connections = 10
|
||||
# acquire_timeout_secs = 30
|
||||
|
||||
[server]
|
||||
# Reserved for Phase 3 (bind address, ports, TLS).
|
||||
|
||||
[auth]
|
||||
# Reserved for Phase 3 (API keys, claims).
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
Run all of these from the repo root. The first three are the gates that
|
||||
must pass before this sub-plan is considered done.
|
||||
|
||||
### 1. Default build (no Postgres)
|
||||
|
||||
```bash
|
||||
cargo build -p vestige-core
|
||||
cargo build -p vestige-mcp
|
||||
cargo test -p vestige-core --lib
|
||||
```
|
||||
|
||||
Expected: clean build. `VestigeConfig::default()` selects SQLite; the MCP
|
||||
server boots the same way it did pre-sub-plan.
|
||||
|
||||
### 2. Postgres-feature build
|
||||
|
||||
```bash
|
||||
cargo build -p vestige-core --features postgres-backend
|
||||
cargo build -p vestige-mcp --features postgres-backend
|
||||
```
|
||||
|
||||
Expected: clean build. `PgMemoryStore::connect_with` resolves to
|
||||
`pool::build_pool` + `registry::ensure_registry_stub`; no `todo!()` is
|
||||
reachable on the build path. `connect` and `from_pool` are exported.
|
||||
|
||||
### 3. Clippy across both feature sets
|
||||
|
||||
```bash
|
||||
cargo clippy -p vestige-core -- -D warnings
|
||||
cargo clippy -p vestige-core --features postgres-backend -- -D warnings
|
||||
cargo clippy -p vestige-mcp --features postgres-backend -- -D warnings
|
||||
```
|
||||
|
||||
### 4. Unit test: round-trip the example
|
||||
|
||||
Add this test to `crates/vestige-core/src/config.rs`:
|
||||
|
||||
```rust
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
const EXAMPLE_SQLITE: &str = r#"
|
||||
[embeddings]
|
||||
provider = "fastembed"
|
||||
model = "nomic-ai/nomic-embed-text-v1.5"
|
||||
|
||||
[storage]
|
||||
backend = "sqlite"
|
||||
|
||||
[storage.sqlite]
|
||||
path = "/home/user/.vestige/vestige.db"
|
||||
"#;
|
||||
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
const EXAMPLE_POSTGRES: &str = r#"
|
||||
[embeddings]
|
||||
provider = "fastembed"
|
||||
model = "nomic-ai/nomic-embed-text-v1.5"
|
||||
|
||||
[storage]
|
||||
backend = "postgres"
|
||||
|
||||
[storage.postgres]
|
||||
url = "postgres://vestige:secret@localhost:5432/vestige"
|
||||
max_connections = 10
|
||||
acquire_timeout_secs = 30
|
||||
"#;
|
||||
|
||||
#[test]
|
||||
fn parses_sqlite_example() {
|
||||
let cfg: VestigeConfig = toml::from_str(EXAMPLE_SQLITE).expect("parse");
|
||||
match cfg.storage {
|
||||
StorageConfig::Sqlite(s) => assert!(s.path.is_some()),
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
StorageConfig::Postgres(_) => panic!("wrong variant"),
|
||||
}
|
||||
assert_eq!(cfg.embeddings.provider, "fastembed");
|
||||
}
|
||||
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
#[test]
|
||||
fn parses_postgres_example() {
|
||||
let cfg: VestigeConfig = toml::from_str(EXAMPLE_POSTGRES).expect("parse");
|
||||
match cfg.storage {
|
||||
StorageConfig::Postgres(p) => {
|
||||
assert_eq!(p.url, "postgres://vestige:secret@localhost:5432/vestige");
|
||||
assert_eq!(p.max_connections, Some(10));
|
||||
assert_eq!(p.acquire_timeout_secs, Some(30));
|
||||
}
|
||||
StorageConfig::Sqlite(_) => panic!("wrong variant"),
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(not(feature = "postgres-backend"))]
|
||||
#[test]
|
||||
fn rejects_postgres_when_feature_off() {
|
||||
let toml_text = r#"
|
||||
[storage]
|
||||
backend = "postgres"
|
||||
|
||||
[storage.postgres]
|
||||
url = "postgres://x/y"
|
||||
"#;
|
||||
let res: Result<VestigeConfig, _> = toml::from_str(toml_text);
|
||||
assert!(res.is_err(), "must fail without postgres-backend feature");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn defaults_pick_sqlite() {
|
||||
let cfg = VestigeConfig::default();
|
||||
assert!(matches!(cfg.storage, StorageConfig::Sqlite(_)));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn load_missing_file_returns_default() {
|
||||
let tmp = std::env::temp_dir().join("vestige-no-such-file.toml");
|
||||
let _ = std::fs::remove_file(&tmp);
|
||||
let cfg = VestigeConfig::load(Some(&tmp)).expect("missing file is OK");
|
||||
assert!(matches!(cfg.storage, StorageConfig::Sqlite(_)));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn load_roundtrip_from_disk() {
|
||||
let tmp = std::env::temp_dir().join("vestige-roundtrip.toml");
|
||||
std::fs::write(&tmp, EXAMPLE_SQLITE).unwrap();
|
||||
let cfg = VestigeConfig::load(Some(&tmp)).expect("load");
|
||||
assert!(matches!(cfg.storage, StorageConfig::Sqlite(_)));
|
||||
let _ = std::fs::remove_file(&tmp);
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Run:
|
||||
|
||||
```bash
|
||||
cargo test -p vestige-core --lib config::
|
||||
cargo test -p vestige-core --lib config:: --features postgres-backend
|
||||
```
|
||||
|
||||
### 5. Smoke: server boots with default config
|
||||
|
||||
```bash
|
||||
# default build, no vestige.toml on disk
|
||||
cargo run -p vestige-mcp -- --help
|
||||
# should print help, no panic
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
- [ ] `cargo build -p vestige-core` (default features) succeeds.
|
||||
- [ ] `cargo build -p vestige-core --features postgres-backend` succeeds.
|
||||
- [ ] `cargo build -p vestige-mcp` (default features) succeeds.
|
||||
- [ ] `cargo build -p vestige-mcp --features postgres-backend` succeeds.
|
||||
- [ ] `cargo clippy` with and without `postgres-backend` is clean on both
|
||||
crates.
|
||||
- [ ] `crates/vestige-core/src/config.rs` exists, exposes
|
||||
`VestigeConfig`, `StorageConfig`, `SqliteConfig`, `EmbeddingsConfig`,
|
||||
`ConfigError`, plus `PostgresConfig` when the feature is on.
|
||||
- [ ] `VestigeConfig::load(None)` returns `Ok(default)` when
|
||||
`~/.vestige/vestige.toml` is missing.
|
||||
- [ ] `VestigeConfig::load(Some(&path))` round-trips both the SQLite and
|
||||
Postgres example blocks above.
|
||||
- [ ] With `postgres-backend` off, parsing `backend = "postgres"` returns
|
||||
a clear `ConfigError::Invalid` mentioning the feature flag, NOT a
|
||||
panic.
|
||||
- [ ] `crates/vestige-core/src/storage/postgres/pool.rs` exists,
|
||||
implementing `build_pool(&PostgresConfig) -> MemoryStoreResult<PgPool>`
|
||||
with the documented defaults.
|
||||
- [ ] `PgMemoryStore::connect`, `connect_with`, and `from_pool` all wire
|
||||
through `pool::build_pool`. None of them is `todo!()`. The registry
|
||||
step is a no-op stub documented as filled in by `0002d`.
|
||||
- [ ] `MemoryStoreError::Postgres(sqlx::Error)` and
|
||||
`MemoryStoreError::Migrate(sqlx::migrate::MigrateError)` exist
|
||||
behind `#[cfg(feature = "postgres-backend")]` with `#[from]`.
|
||||
- [ ] `vestige-mcp` has a `postgres-backend` feature that forwards to
|
||||
`vestige-core/postgres-backend`.
|
||||
- [ ] `vestige-mcp/src/main.rs` selects SQLite vs Postgres at startup
|
||||
based on `VestigeConfig`. SQLite is the default when no config file
|
||||
is present.
|
||||
- [ ] Unit tests in the "Verification" section pass on both feature sets.
|
||||
|
||||
---
|
||||
|
||||
## Out of scope (handled by other sub-plans)
|
||||
|
||||
- Migrations (`crates/vestige-core/migrations/postgres/*.sql`) -- `0002c`.
|
||||
- Real `PgMemoryStore` CRUD/search/scheduling/edges bodies -- `0002d`,
|
||||
`0002e`.
|
||||
- `ensure_registry` real body with `ALTER COLUMN TYPE vector(N)` -- `0002d`.
|
||||
- `vestige migrate --from sqlite --to postgres` CLI -- `0002f`.
|
||||
- Re-embed flow -- `0002g`.
|
||||
- Env-var override (`VESTIGE_POSTGRES_URL`, etc.) -- Phase 3.
|
||||
- RLS, multi-tenant column population -- Phase 3.
|
||||
1119
docs/plans/0002c-migrations.md
Normal file
1119
docs/plans/0002c-migrations.md
Normal file
File diff suppressed because it is too large
Load diff
1771
docs/plans/0002d-store-impl-bodies.md
Normal file
1771
docs/plans/0002d-store-impl-bodies.md
Normal file
File diff suppressed because it is too large
Load diff
825
docs/plans/0002e-hybrid-search.md
Normal file
825
docs/plans/0002e-hybrid-search.md
Normal file
|
|
@ -0,0 +1,825 @@
|
|||
# Phase 2 Sub-Plan 0002e -- Hybrid RRF Search
|
||||
|
||||
**Status**: Ready
|
||||
**Depends on**:
|
||||
- `0002a-skeleton-and-feature-gate.md` (the `postgres-backend` feature flag
|
||||
exists and `PgMemoryStore` compiles with `todo!()` bodies).
|
||||
- `0002b-pool-and-config.md` (a working `PgPool` reaches the backend).
|
||||
- `0002c-migrations.md` (migration `0001_init` has created the `knowledge_nodes`
|
||||
table with the D7 columns -- `owner_user_id`, `visibility`,
|
||||
`shared_with_groups` -- and the D8 column `codebase`; migration `0002_hnsw`
|
||||
has built the HNSW index).
|
||||
- `0002d-store-impl-bodies.md` (real CRUD bodies exist so the integration
|
||||
tests below can seed data through the trait surface rather than raw SQL).
|
||||
|
||||
This sub-plan covers master plan 0002 deliverable D5: the hybrid RRF search
|
||||
query implementation in `crates/vestige-core/src/storage/postgres/search.rs`,
|
||||
plus the `search`, `fts_search`, and `vector_search` method bodies in
|
||||
`crates/vestige-core/src/storage/postgres/mod.rs` that delegate into it.
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
This is one of the more performance-sensitive sub-plans in Phase 2. Every
|
||||
search call from the cognitive engine -- the 7-stage retrieval pipeline,
|
||||
`session_context`, `predict`, `deep_reference`, the dashboard -- bottoms out
|
||||
in `MemoryStore::search`. The Postgres backend has to keep up with the
|
||||
existing SQLite hybrid path, which combines BM25 over FTS5 with USearch HNSW
|
||||
in two separate round trips and fuses the rankings in Rust.
|
||||
|
||||
The shape of the win on Postgres is that both branches and the fusion run
|
||||
inside one statement. The planner sees both CTEs together, the round trip is
|
||||
single, and the rerank stage runs over a cleanly overfetched candidate set.
|
||||
|
||||
Latency targets live in `0002h-testing-and-benches.md`. This sub-plan is
|
||||
responsible for producing a correct, schema-stable query that the benches
|
||||
can drive against. Do not optimise here; correctness and structure first.
|
||||
|
||||
Master plan 0002 D5 (around lines 522-628 of
|
||||
`docs/plans/0002-phase-2-postgres-backend.md`) sketches the SQL. That
|
||||
sketch is the starting point, not the finished product. The schema after
|
||||
the D7 and D8 amendments has more columns than the sketch enumerates, and
|
||||
the SQLite `search` method (around line 6503 of
|
||||
`crates/vestige-core/src/storage/sqlite.rs` in the Phase 1 worktree)
|
||||
documents the semantics this implementation must stay compatible with:
|
||||
|
||||
- Empty `query.limit` defaults to 10.
|
||||
- `query.text == Some("")` is treated as no text query (degrade to vector).
|
||||
- `query.embedding == None` is treated as no vector query (degrade to FTS).
|
||||
- Both empty returns `Ok(vec![])`; not an error.
|
||||
- The `MemoryRecord` in each `SearchResult` must be populated with all
|
||||
fields the trait promises, including `domains` and `domain_scores` (Phase
|
||||
4 will fill these in; Phase 2 returns the stored values, which may be
|
||||
empty arrays / empty objects).
|
||||
|
||||
---
|
||||
|
||||
## Constants
|
||||
|
||||
```rust
|
||||
/// Reciprocal Rank Fusion smoothing constant from Cormack, Clarke and Buettcher
|
||||
/// 2009 ("Reciprocal Rank Fusion outperforms Condorcet and individual rank
|
||||
/// learning methods"). 60 is the canonical default and is robust across most
|
||||
/// fusion regimes. Do not tune this without a paper-citation-grade reason.
|
||||
const RRF_K: i32 = 60;
|
||||
|
||||
/// Each branch (FTS, vector) is allowed to return OVERFETCH_MULT x final_limit
|
||||
/// rows before fusion. Three matches the Phase 1 SQLite overfetch and gives
|
||||
/// the fusion enough candidates to recover from any single branch's bad
|
||||
/// recall on a given query.
|
||||
const OVERFETCH_MULT: i64 = 3;
|
||||
```
|
||||
|
||||
These live at module scope in
|
||||
`crates/vestige-core/src/storage/postgres/search.rs`. They are `pub(crate)`
|
||||
only if `0002h-testing-and-benches.md` needs to reference them from the
|
||||
integration tests; otherwise private.
|
||||
|
||||
---
|
||||
|
||||
## Public API
|
||||
|
||||
```rust
|
||||
#![cfg(feature = "postgres-backend")]
|
||||
|
||||
use pgvector::Vector;
|
||||
use sqlx::PgPool;
|
||||
|
||||
use crate::storage::memory_store::{
|
||||
MemoryStoreResult, SearchQuery, SearchResult,
|
||||
};
|
||||
|
||||
/// Hybrid RRF search over Postgres FTS and pgvector cosine distance.
|
||||
///
|
||||
/// Branch behavior:
|
||||
/// - empty text + null embedding -> Ok(vec![])
|
||||
/// - empty text + Some(embedding) -> pure vector search (FTS CTE returns
|
||||
/// zero rows; fusion equals the vector
|
||||
/// branch)
|
||||
/// - Some(text) + null embedding -> pure FTS search
|
||||
/// - Some(text) + Some(embedding) -> full RRF fusion
|
||||
///
|
||||
/// `query.limit == 0` is treated as 10 (matches the SQLite default).
|
||||
pub(crate) async fn rrf_search(
|
||||
pool: &PgPool,
|
||||
query: &SearchQuery,
|
||||
) -> MemoryStoreResult<Vec<SearchResult>>;
|
||||
|
||||
/// FTS-only convenience search. Equivalent to calling `rrf_search` with
|
||||
/// `query.embedding = None`, but uses a dedicated single-branch query that
|
||||
/// avoids the FULL OUTER JOIN and the params CTE; faster by one planner pass
|
||||
/// per call.
|
||||
pub(crate) async fn fts_only(
|
||||
pool: &PgPool,
|
||||
text: &str,
|
||||
limit: usize,
|
||||
) -> MemoryStoreResult<Vec<SearchResult>>;
|
||||
|
||||
/// Vector-only convenience search. Dedicated single-branch query for the same
|
||||
/// latency reason as `fts_only`.
|
||||
pub(crate) async fn vector_only(
|
||||
pool: &PgPool,
|
||||
embedding: &[f32],
|
||||
limit: usize,
|
||||
) -> MemoryStoreResult<Vec<SearchResult>>;
|
||||
```
|
||||
|
||||
### Parameter handling
|
||||
|
||||
In `rrf_search`:
|
||||
|
||||
```rust
|
||||
let final_limit: i32 = if query.limit == 0 { 10 } else { query.limit as i32 };
|
||||
let overfetch: i32 = (final_limit as i64 * OVERFETCH_MULT)
|
||||
.min(i32::MAX as i64) as i32;
|
||||
|
||||
let q_text: &str = query.text.as_deref().unwrap_or("");
|
||||
let q_vec: Option<Vector> = query.embedding.as_ref()
|
||||
.map(|v| Vector::from(v.clone()));
|
||||
|
||||
let dom_filter: Option<&[String]> = query.domains.as_deref();
|
||||
let nt_filter: Option<&[String]> = query.node_types.as_deref();
|
||||
let tag_filter: Option<&[String]> = query.tags.as_deref();
|
||||
|
||||
let min_retr: Option<f64> = query.min_retrievability;
|
||||
```
|
||||
|
||||
Both branches empty -- `q_text` is empty and `q_vec` is `None` -- returns
|
||||
`Ok(vec![])` without hitting the database. The SQLite backend has the same
|
||||
behavior and tests rely on it.
|
||||
|
||||
```rust
|
||||
if q_text.is_empty() && q_vec.is_none() {
|
||||
return Ok(Vec::new());
|
||||
}
|
||||
```
|
||||
|
||||
### `search` method body in `postgres/mod.rs`
|
||||
|
||||
```rust
|
||||
#[async_trait::async_trait] // or trait_variant after the Phase 1 amendment
|
||||
impl MemoryStore for PgMemoryStore {
|
||||
async fn search(&self, query: &SearchQuery)
|
||||
-> MemoryStoreResult<Vec<SearchResult>>
|
||||
{
|
||||
crate::storage::postgres::search::rrf_search(&self.pool, query).await
|
||||
}
|
||||
|
||||
async fn fts_search(&self, text: &str, limit: usize)
|
||||
-> MemoryStoreResult<Vec<SearchResult>>
|
||||
{
|
||||
crate::storage::postgres::search::fts_only(&self.pool, text, limit)
|
||||
.await
|
||||
}
|
||||
|
||||
async fn vector_search(&self, embedding: &[f32], limit: usize)
|
||||
-> MemoryStoreResult<Vec<SearchResult>>
|
||||
{
|
||||
crate::storage::postgres::search::vector_only(&self.pool, embedding, limit)
|
||||
.await
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Everything below specifies the inside of those three free functions.
|
||||
|
||||
---
|
||||
|
||||
## SQL: the hybrid RRF query
|
||||
|
||||
The query is built as one `&'static str` (or `OnceCell<String>`; see "Use
|
||||
of sqlx::query!" below) and reused. Bound parameters are kept to seven
|
||||
through a `params` CTE that the rest of the query references by name --
|
||||
this keeps the SQL readable and stops the bound-parameter count growing
|
||||
with each filter clause.
|
||||
|
||||
Bound parameters:
|
||||
|
||||
- `$1`: text query (TEXT, may be empty)
|
||||
- `$2`: embedding (pgvector::Vector, may be NULL)
|
||||
- `$3`: overfetch limit per branch (INT)
|
||||
- `$4`: final limit (INT)
|
||||
- `$5`: domain filter (TEXT[] or NULL)
|
||||
- `$6`: node_type filter (TEXT[] or NULL)
|
||||
- `$7`: tag filter (TEXT[] or NULL)
|
||||
|
||||
If `min_retrievability.is_some()` the outer SELECT adds a JOIN on
|
||||
`scheduling` and a WHERE clause; that path uses a different prepared
|
||||
statement (see "min_retrievability filter" below) so the simple-path query
|
||||
stays free of the join.
|
||||
|
||||
```sql
|
||||
WITH params AS (
|
||||
SELECT
|
||||
$1::text AS q_text,
|
||||
$2::vector AS q_vec,
|
||||
$3::int AS overfetch,
|
||||
$4::int AS final_limit,
|
||||
$5::text[] AS dom_filter,
|
||||
$6::text[] AS nt_filter,
|
||||
$7::text[] AS tag_filter
|
||||
),
|
||||
fts AS (
|
||||
SELECT
|
||||
m.id,
|
||||
ts_rank_cd(
|
||||
m.search_vec,
|
||||
websearch_to_tsquery('english', p.q_text)
|
||||
) AS score,
|
||||
ROW_NUMBER() OVER (
|
||||
ORDER BY ts_rank_cd(
|
||||
m.search_vec,
|
||||
websearch_to_tsquery('english', p.q_text)
|
||||
) DESC
|
||||
) AS rank
|
||||
FROM knowledge_nodes m
|
||||
CROSS JOIN params p
|
||||
WHERE p.q_text <> ''
|
||||
AND m.search_vec @@ websearch_to_tsquery('english', p.q_text)
|
||||
AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
|
||||
AND (p.nt_filter IS NULL OR m.node_type = ANY(p.nt_filter))
|
||||
AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
|
||||
ORDER BY score DESC
|
||||
LIMIT (SELECT overfetch FROM params)
|
||||
),
|
||||
vec AS (
|
||||
SELECT
|
||||
m.id,
|
||||
1.0 - (m.embedding <=> p.q_vec) AS score,
|
||||
ROW_NUMBER() OVER (
|
||||
ORDER BY m.embedding <=> p.q_vec
|
||||
) AS rank
|
||||
FROM knowledge_nodes m
|
||||
CROSS JOIN params p
|
||||
WHERE m.embedding IS NOT NULL
|
||||
AND p.q_vec IS NOT NULL
|
||||
AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
|
||||
AND (p.nt_filter IS NULL OR m.node_type = ANY(p.nt_filter))
|
||||
AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
|
||||
ORDER BY m.embedding <=> p.q_vec
|
||||
LIMIT (SELECT overfetch FROM params)
|
||||
),
|
||||
fused AS (
|
||||
SELECT
|
||||
COALESCE(f.id, v.id) AS id,
|
||||
COALESCE(1.0 / (60 + f.rank), 0.0)
|
||||
+ COALESCE(1.0 / (60 + v.rank), 0.0) AS rrf_score,
|
||||
f.score AS fts_score,
|
||||
v.score AS vector_score
|
||||
FROM fts f
|
||||
FULL OUTER JOIN vec v ON f.id = v.id
|
||||
)
|
||||
SELECT
|
||||
m.id AS "id!: uuid::Uuid",
|
||||
m.owner_user_id AS "owner_user_id!: uuid::Uuid",
|
||||
m.visibility AS "visibility!: String",
|
||||
m.shared_with_groups AS "shared_with_groups!: Vec<uuid::Uuid>",
|
||||
m.codebase AS "codebase: String",
|
||||
m.domains AS "domains!: Vec<String>",
|
||||
m.domain_scores AS "domain_scores!: serde_json::Value",
|
||||
m.content AS "content!: String",
|
||||
m.node_type AS "node_type!: String",
|
||||
m.tags AS "tags!: Vec<String>",
|
||||
m.embedding AS "embedding: pgvector::Vector",
|
||||
m.metadata AS "metadata!: serde_json::Value",
|
||||
m.created_at AS "created_at!: chrono::DateTime<chrono::Utc>",
|
||||
m.updated_at AS "updated_at!: chrono::DateTime<chrono::Utc>",
|
||||
fused.rrf_score AS "rrf_score!: f64",
|
||||
fused.fts_score AS "fts_score: f64",
|
||||
fused.vector_score AS "vector_score: f64"
|
||||
FROM fused
|
||||
JOIN knowledge_nodes m ON m.id = fused.id
|
||||
ORDER BY fused.rrf_score DESC
|
||||
LIMIT (SELECT final_limit FROM params);
|
||||
```
|
||||
|
||||
Notes on the SELECT column list. The D7 columns (`owner_user_id`,
|
||||
`visibility`, `shared_with_groups`) and the D8 column (`codebase`) are
|
||||
selected even though Phase 2 does not filter on them yet, so:
|
||||
|
||||
1. The `MemoryRecord` returned to the trait can be populated with the
|
||||
stored values from day one. Phase 3 will start writing real
|
||||
`owner_user_id` / `visibility` values; Phase 2 always writes the
|
||||
single-user defaults (`'00000000-...-0001'`, `'private'`, `'{}'`). The
|
||||
`MemoryRecord` returned in Phase 2 simply carries those defaults.
|
||||
2. The schema-drift integration tests (see "Verification") catch the case
|
||||
where someone adds a NOT NULL column to `knowledge_nodes` without updating
|
||||
this query.
|
||||
|
||||
Notes on the body:
|
||||
|
||||
- `CROSS JOIN params p` is used instead of the master-plan sketch's
|
||||
`FROM knowledge_nodes m, params p`. Same plan, clearer intent.
|
||||
- The `ORDER BY ... LIMIT` inside each branch CTE is there so the planner
|
||||
can stop early once it has `overfetch` rows; without it the LIMIT is
|
||||
applied after a full sort over all matches.
|
||||
- `1.0 - (m.embedding <=> p.q_vec)` converts pgvector's cosine *distance*
|
||||
to cosine *similarity* in [0, 1] for the `vector_score` output. RRF
|
||||
itself does not need the similarity -- it uses ranks -- but the trait
|
||||
surface exposes `vector_score: Option<f64>` for caller introspection.
|
||||
- `RRF_K = 60` is inlined as `60` in the SQL string. A `const` formatter
|
||||
feels tidier but `60` is a literature constant; spell it out and leave a
|
||||
comment in the Rust source: `// 60 == RRF_K (Cormack 2009)`.
|
||||
- `FULL OUTER JOIN` is required: a row that the FTS branch finds and the
|
||||
vector branch does not must still appear in `fused`, and vice versa.
|
||||
- `COALESCE(..., 0.0)` on each `1.0 / (60 + rank)` term handles the
|
||||
no-match-from-this-branch case. The fusion score for a row that only the
|
||||
FTS branch ranks is `1/(60 + f.rank)` exactly.
|
||||
- `m.search_vec` is the generated `tsvector` column created in migration
|
||||
`0001_init` (see D4 of the master plan).
|
||||
|
||||
---
|
||||
|
||||
## Result row mapping
|
||||
|
||||
`sqlx::query_as::<_, SearchRow>` reads each row into a private struct that
|
||||
owns the column types exactly as they come back from Postgres. The struct
|
||||
is converted into a `SearchResult` after fetch.
|
||||
|
||||
```rust
|
||||
#[derive(sqlx::FromRow)]
|
||||
struct SearchRow {
|
||||
id: uuid::Uuid,
|
||||
owner_user_id: uuid::Uuid,
|
||||
visibility: String,
|
||||
shared_with_groups: Vec<uuid::Uuid>,
|
||||
codebase: Option<String>,
|
||||
domains: Vec<String>,
|
||||
domain_scores: serde_json::Value,
|
||||
content: String,
|
||||
node_type: String,
|
||||
tags: Vec<String>,
|
||||
embedding: Option<pgvector::Vector>,
|
||||
metadata: serde_json::Value,
|
||||
created_at: chrono::DateTime<chrono::Utc>,
|
||||
updated_at: chrono::DateTime<chrono::Utc>,
|
||||
rrf_score: f64,
|
||||
fts_score: Option<f64>,
|
||||
vector_score: Option<f64>,
|
||||
}
|
||||
|
||||
impl SearchRow {
|
||||
fn into_result(self) -> SearchResult {
|
||||
use crate::storage::memory_store::MemoryRecord;
|
||||
use std::collections::HashMap;
|
||||
|
||||
// domain_scores is JSONB; the column always exists, but may be the
|
||||
// empty object {} when Phase 4 has not classified this memory yet.
|
||||
let domain_scores: HashMap<String, f64> =
|
||||
serde_json::from_value(self.domain_scores).unwrap_or_default();
|
||||
|
||||
let record = MemoryRecord {
|
||||
id: self.id,
|
||||
domains: self.domains,
|
||||
domain_scores,
|
||||
content: self.content,
|
||||
node_type: self.node_type,
|
||||
tags: self.tags,
|
||||
// pgvector::Vector -> Vec<f32>
|
||||
embedding: self.embedding.map(|v| v.to_vec()),
|
||||
created_at: self.created_at,
|
||||
updated_at: self.updated_at,
|
||||
metadata: self.metadata,
|
||||
// owner_user_id / visibility / shared_with_groups / codebase
|
||||
// do not appear on MemoryRecord yet. Phase 3 will decide whether
|
||||
// to extend MemoryRecord or surface them via a side channel.
|
||||
// For Phase 2 they are read but discarded here.
|
||||
};
|
||||
|
||||
SearchResult {
|
||||
record,
|
||||
score: self.rrf_score,
|
||||
fts_score: self.fts_score,
|
||||
vector_score: self.vector_score,
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
Type mapping summary:
|
||||
|
||||
| SQL type | Rust type | Notes |
|
||||
|-------------------|--------------------------------------|------------------------------------------------|
|
||||
| UUID | `uuid::Uuid` | requires sqlx `uuid` feature |
|
||||
| TEXT | `String` | |
|
||||
| TEXT NULL | `Option<String>` | used for `codebase` |
|
||||
| TEXT[] | `Vec<String>` | for `tags`, `domains` |
|
||||
| UUID[] | `Vec<uuid::Uuid>` | for `shared_with_groups` |
|
||||
| JSONB | `serde_json::Value` | for `metadata`, `domain_scores` |
|
||||
| TIMESTAMPTZ | `chrono::DateTime<chrono::Utc>` | requires sqlx `chrono` feature |
|
||||
| VECTOR(N) NULL | `Option<pgvector::Vector>` | `.map(|v| v.to_vec())` to `Option<Vec<f32>>` |
|
||||
| FLOAT8 | `f64` | |
|
||||
| FLOAT8 NULL | `Option<f64>` | for `fts_score`, `vector_score` |
|
||||
|
||||
If `MemoryRecord` is extended in Phase 3 to carry `owner_user_id`,
|
||||
`visibility`, `shared_with_groups`, and `codebase`, the conversion above
|
||||
gets four more fields. Phase 2 reads them so the integration tests can
|
||||
assert on them via SQL, but the trait surface does not expose them yet.
|
||||
|
||||
---
|
||||
|
||||
## `fts_only` and `vector_only` -- dedicated single-branch queries
|
||||
|
||||
The master plan offers two options for the convenience methods: reuse
|
||||
`rrf_search` with one branch nulled, or write dedicated queries. The
|
||||
dedicated queries win:
|
||||
|
||||
- One CTE instead of three. Planner picks the obvious plan.
|
||||
- No FULL OUTER JOIN.
|
||||
- No `params` indirection -- bound parameters used directly.
|
||||
- The output `score` is the branch's native score (BM25-ish `ts_rank_cd` /
|
||||
cosine similarity), not an RRF fusion score over one branch. Callers of
|
||||
`fts_search` and `vector_search` get an intuitive score back.
|
||||
|
||||
### `fts_only`
|
||||
|
||||
Bound parameters:
|
||||
|
||||
- `$1`: text query (TEXT, must be non-empty; the caller guards `text.is_empty()`)
|
||||
- `$2`: limit (INT)
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
m.id AS "id!: uuid::Uuid",
|
||||
m.owner_user_id AS "owner_user_id!: uuid::Uuid",
|
||||
m.visibility AS "visibility!: String",
|
||||
m.shared_with_groups AS "shared_with_groups!: Vec<uuid::Uuid>",
|
||||
m.codebase AS "codebase: String",
|
||||
m.domains AS "domains!: Vec<String>",
|
||||
m.domain_scores AS "domain_scores!: serde_json::Value",
|
||||
m.content AS "content!: String",
|
||||
m.node_type AS "node_type!: String",
|
||||
m.tags AS "tags!: Vec<String>",
|
||||
m.embedding AS "embedding: pgvector::Vector",
|
||||
m.metadata AS "metadata!: serde_json::Value",
|
||||
m.created_at AS "created_at!: chrono::DateTime<chrono::Utc>",
|
||||
m.updated_at AS "updated_at!: chrono::DateTime<chrono::Utc>",
|
||||
ts_rank_cd(m.search_vec, websearch_to_tsquery('english', $1))
|
||||
AS "fts_score!: f64"
|
||||
FROM knowledge_nodes m
|
||||
WHERE m.search_vec @@ websearch_to_tsquery('english', $1)
|
||||
ORDER BY ts_rank_cd(m.search_vec, websearch_to_tsquery('english', $1)) DESC
|
||||
LIMIT $2;
|
||||
```
|
||||
|
||||
The Rust caller maps each row to a `SearchResult` with:
|
||||
|
||||
```rust
|
||||
SearchResult {
|
||||
record,
|
||||
score: fts_score,
|
||||
fts_score: Some(fts_score),
|
||||
vector_score: None,
|
||||
}
|
||||
```
|
||||
|
||||
If `text.is_empty()` the caller returns `Ok(Vec::new())` before hitting
|
||||
the database. `websearch_to_tsquery('english', '')` returns an empty
|
||||
tsquery that matches nothing; the round-trip is wasted work otherwise.
|
||||
|
||||
### `vector_only`
|
||||
|
||||
Bound parameters:
|
||||
|
||||
- `$1`: embedding (pgvector::Vector)
|
||||
- `$2`: limit (INT)
|
||||
|
||||
```sql
|
||||
SELECT
|
||||
m.id AS "id!: uuid::Uuid",
|
||||
m.owner_user_id AS "owner_user_id!: uuid::Uuid",
|
||||
m.visibility AS "visibility!: String",
|
||||
m.shared_with_groups AS "shared_with_groups!: Vec<uuid::Uuid>",
|
||||
m.codebase AS "codebase: String",
|
||||
m.domains AS "domains!: Vec<String>",
|
||||
m.domain_scores AS "domain_scores!: serde_json::Value",
|
||||
m.content AS "content!: String",
|
||||
m.node_type AS "node_type!: String",
|
||||
m.tags AS "tags!: Vec<String>",
|
||||
m.embedding AS "embedding: pgvector::Vector",
|
||||
m.metadata AS "metadata!: serde_json::Value",
|
||||
m.created_at AS "created_at!: chrono::DateTime<chrono::Utc>",
|
||||
m.updated_at AS "updated_at!: chrono::DateTime<chrono::Utc>",
|
||||
1.0 - (m.embedding <=> $1) AS "vector_score!: f64"
|
||||
FROM knowledge_nodes m
|
||||
WHERE m.embedding IS NOT NULL
|
||||
ORDER BY m.embedding <=> $1
|
||||
LIMIT $2;
|
||||
```
|
||||
|
||||
The Rust caller maps each row to:
|
||||
|
||||
```rust
|
||||
SearchResult {
|
||||
record,
|
||||
score: vector_score,
|
||||
fts_score: None,
|
||||
vector_score: Some(vector_score),
|
||||
}
|
||||
```
|
||||
|
||||
Both convenience methods ignore `SearchQuery.domains` / `tags` /
|
||||
`node_types` / `min_retrievability` -- they take `&str` and `&[f32]`
|
||||
respectively, not a `SearchQuery`. Callers that want filters on a
|
||||
single-branch search should call `search` with the other branch input
|
||||
left at its degrade-to-zero default.
|
||||
|
||||
---
|
||||
|
||||
## `min_retrievability` filter
|
||||
|
||||
`SearchQuery::min_retrievability: Option<f64>` is applied as a final
|
||||
filter after fusion by joining on the `scheduling` table:
|
||||
|
||||
```sql
|
||||
-- with-min-retrievability variant: identical CTEs to the base query, only
|
||||
-- the final SELECT changes.
|
||||
SELECT
|
||||
... (same column list as the base query) ...
|
||||
FROM fused
|
||||
JOIN knowledge_nodes m ON m.id = fused.id
|
||||
JOIN scheduling s ON s.memory_id = m.id
|
||||
WHERE s.retrievability >= $8
|
||||
ORDER BY fused.rrf_score DESC
|
||||
LIMIT (SELECT final_limit FROM params);
|
||||
```
|
||||
|
||||
This is a separate prepared statement -- the eight-parameter variant --
|
||||
held alongside the seven-parameter base. The Rust dispatch:
|
||||
|
||||
```rust
|
||||
if let Some(min_r) = query.min_retrievability {
|
||||
sqlx::query_as::<_, SearchRow>(QUERY_WITH_MIN_R)
|
||||
.bind(q_text)
|
||||
.bind(q_vec)
|
||||
.bind(overfetch)
|
||||
.bind(final_limit)
|
||||
.bind(dom_filter)
|
||||
.bind(nt_filter)
|
||||
.bind(tag_filter)
|
||||
.bind(min_r)
|
||||
.fetch_all(pool).await?
|
||||
} else {
|
||||
sqlx::query_as::<_, SearchRow>(QUERY_BASE)
|
||||
.bind(q_text)
|
||||
.bind(q_vec)
|
||||
.bind(overfetch)
|
||||
.bind(final_limit)
|
||||
.bind(dom_filter)
|
||||
.bind(nt_filter)
|
||||
.bind(tag_filter)
|
||||
.fetch_all(pool).await?
|
||||
}
|
||||
```
|
||||
|
||||
Why not unconditionally join: the `scheduling` join is expensive enough on
|
||||
a large `knowledge_nodes` table that adding it to every search call regresses the
|
||||
common path. `min_retrievability` is set by the cognitive engine's
|
||||
accessibility filter and is `None` in most direct callers.
|
||||
|
||||
The same two-variant pattern repeats for `fts_only` and `vector_only`; in
|
||||
practice callers of those methods rarely set `min_retrievability` (it is
|
||||
not part of their argument list), so only the base variant is needed
|
||||
unless the trait surface grows.
|
||||
|
||||
---
|
||||
|
||||
## Domain / tag / node_type filters
|
||||
|
||||
Each filter is expressed as a NULL-conditional clause inside both branch
|
||||
CTEs, written using PostgreSQL array operators:
|
||||
|
||||
```sql
|
||||
AND (p.dom_filter IS NULL OR m.domains && p.dom_filter)
|
||||
AND (p.nt_filter IS NULL OR m.node_type = ANY(p.nt_filter))
|
||||
AND (p.tag_filter IS NULL OR m.tags && p.tag_filter)
|
||||
```
|
||||
|
||||
- `&&` is the PostgreSQL "arrays overlap" operator. Matches if any
|
||||
element in `m.domains` is in the filter array. Index-friendly with a
|
||||
GIN index on `m.domains` (created in `0001_init`).
|
||||
- `= ANY(...)` matches `m.node_type` (a scalar) against any element of
|
||||
the filter array. Index-friendly with a B-tree on `m.node_type`.
|
||||
- `&&` is used again on `m.tags` (a `TEXT[]`).
|
||||
|
||||
The NULL-conditional form is critical: when the filter parameter is
|
||||
`NULL`, the clause short-circuits to `TRUE` and contributes nothing to
|
||||
the WHERE. This keeps a single query reusable across "no filter" and
|
||||
"filter set" cases without rewriting SQL.
|
||||
|
||||
When the Rust caller passes `None` for a filter, sqlx binds it as `NULL`
|
||||
of the column type (`text[]`). The cast `$5::text[]` in the `params` CTE
|
||||
is what tells sqlx the binding type.
|
||||
|
||||
The master plan's draft has each filter clause duplicated across both
|
||||
branch CTEs. That duplication is correct -- the planner cannot push a
|
||||
WHERE clause across a FULL OUTER JOIN into both sides automatically; we
|
||||
do it manually.
|
||||
|
||||
---
|
||||
|
||||
## Empty-string text query handling
|
||||
|
||||
The base query guards the FTS branch with `WHERE p.q_text <> ''`.
|
||||
|
||||
`websearch_to_tsquery('english', '')` returns an empty tsquery. An empty
|
||||
tsquery has no lexemes and matches no document; the `@@` operator returns
|
||||
false for every row. Without the guard, the FTS branch would still run --
|
||||
sequential scan, tokenisation per row, comparison -- and return zero
|
||||
rows. The guard short-circuits at planning time.
|
||||
|
||||
The guard does not affect the FULL OUTER JOIN: when the FTS branch
|
||||
returns zero rows, the join degenerates to "every row that the vector
|
||||
branch returned, with `f.id IS NULL` and `f.rank IS NULL`". The
|
||||
`COALESCE(1.0 / (60 + f.rank), 0.0)` then evaluates to `0.0`, and the
|
||||
fused score reduces to the vector branch's RRF term alone. This is the
|
||||
"pure vector search" degrade path.
|
||||
|
||||
Symmetrically, the vector branch guards itself with
|
||||
`WHERE m.embedding IS NOT NULL AND p.q_vec IS NOT NULL`, which gives the
|
||||
"pure FTS search" degrade path when the caller passes no embedding.
|
||||
|
||||
The both-empty case (`q_text == ''` and `q_vec IS NULL`) is intercepted
|
||||
in Rust before the query runs and returns `Ok(vec![])`. Returning empty
|
||||
rather than error matches the SQLite behavior and is what the Phase 1
|
||||
ingest pipeline relies on for "no signal, no results" fallback.
|
||||
|
||||
---
|
||||
|
||||
## Use of `sqlx::query!` versus `sqlx::query_as`
|
||||
|
||||
`sqlx::query!` and `sqlx::query_as!` are compile-time-checked: the SQL is
|
||||
sent to a live Postgres at build time, the result schema is validated, and
|
||||
the generated Rust struct fields are derived. That checking is the
|
||||
default for every other query in `0002d-store-impl-bodies.md`.
|
||||
|
||||
For the RRF query, the macro path is impractical for two reasons:
|
||||
|
||||
1. **Two structurally different queries** -- the base (seven parameters,
|
||||
no `scheduling` join) and the `min_retrievability` variant (eight
|
||||
parameters, with the join). The macro would force two macro
|
||||
invocations, each producing its own anonymous result struct, and the
|
||||
result types would not unify. Manual `From` impls would be needed in
|
||||
both directions.
|
||||
2. **The dedicated `fts_only` and `vector_only` queries have a different
|
||||
output column set** (`fts_score!` instead of `rrf_score! + fts_score? +
|
||||
vector_score?`). Three macro invocations, three structs, three
|
||||
conversion helpers.
|
||||
|
||||
The chosen pattern is `sqlx::query_as::<_, SearchRow>(SQL_CONST)` with a
|
||||
single `SearchRow` struct that owns the column types and a single
|
||||
`SearchRow::into_result` helper. The SQL is held in module-scope `&'static
|
||||
str` constants:
|
||||
|
||||
```rust
|
||||
const QUERY_BASE: &str = include_str!("search.rrf.sql");
|
||||
const QUERY_WITH_MIN_R: &str = include_str!("search.rrf.min_retr.sql");
|
||||
const QUERY_FTS_ONLY: &str = include_str!("search.fts.sql");
|
||||
const QUERY_VECTOR_ONLY: &str = include_str!("search.vector.sql");
|
||||
```
|
||||
|
||||
`include_str!` keeps the SQL out of the Rust source. The four `.sql`
|
||||
files live next to `search.rs` in
|
||||
`crates/vestige-core/src/storage/postgres/`.
|
||||
|
||||
The cost: schema drift (someone renames `m.codebase` to `m.repo_name`)
|
||||
will not break the build. The integration tests in "Verification" below
|
||||
are the safety net. This is a deliberate trade -- it is the one sub-plan
|
||||
in Phase 2 where runtime flexibility beats compile-time checking.
|
||||
|
||||
If a future contributor wants compile-time checking back for the simple
|
||||
case, the right move is to introduce a `#[cfg(test)]`-only macro-checked
|
||||
variant of `QUERY_BASE` and assert at test build time that the macro
|
||||
agrees with the string. That belongs in `0002h-testing-and-benches.md` if
|
||||
anywhere.
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
Integration tests live in
|
||||
`crates/vestige-core/tests/postgres_search.rs`, gated by
|
||||
`#[cfg(feature = "postgres-backend")]` and `#[ignore]` by default (the
|
||||
test runner CI workflow in `0002h-testing-and-benches.md` runs them with
|
||||
`--ignored` against a live Postgres).
|
||||
|
||||
Common harness for every test:
|
||||
|
||||
1. Spin up Postgres via `sqlx::PgPool::connect` against the test URL.
|
||||
2. Run `sqlx::migrate!("./migrations/postgres").run(&pool)` to bring the
|
||||
schema up.
|
||||
3. Register a deterministic test embedder via `register_model` so
|
||||
`embedding` columns can be written.
|
||||
4. Seed 50 mixed memories through `MemoryStore::insert` -- mixed
|
||||
`node_type` (`fact`, `concept`, `event`, `decision`), mixed `tags`
|
||||
(`rust`, `postgres`, `search`, `dream`, `bug-fix`), mixed `codebase`,
|
||||
embeddings drawn from three small clusters so vector recall has
|
||||
structure to find.
|
||||
|
||||
Test cases:
|
||||
|
||||
**T1. Full RRF returns the seeded target.**
|
||||
Insert a known memory with `content = "FSRS-6 spaced repetition cadence"`
|
||||
and an embedding from cluster A. Query with
|
||||
`text = Some("FSRS spaced repetition")` and an embedding near cluster A.
|
||||
Assert the target memory is in the top 3 of the returned `SearchResult`s
|
||||
and that both `fts_score` and `vector_score` are `Some` for it.
|
||||
|
||||
**T2. Pure FTS degrade.**
|
||||
Same target as T1. Query with `text = Some("FSRS spaced repetition")` and
|
||||
`embedding = None`. Assert the target appears, all results have
|
||||
`vector_score == None`, `fts_score == Some(_)`, and `score` equals the
|
||||
fused RRF score (which collapses to one branch's `1.0/(60 + rank)`).
|
||||
|
||||
**T3. Pure vector degrade.**
|
||||
Same target as T1. Query with `text = Some("")` and
|
||||
`embedding = Some(cluster_A_vector)`. Assert the target appears, all
|
||||
results have `fts_score == None`, `vector_score == Some(_)`.
|
||||
|
||||
**T4. Both empty returns `Ok(vec![])`.**
|
||||
Query with `text = Some("")` and `embedding = None`. Assert exactly an
|
||||
empty result vector and that no SQL was executed (assert via a
|
||||
`sqlx::PgPool` query-count handle if convenient; otherwise document that
|
||||
the short-circuit lives in Rust).
|
||||
|
||||
**T5. `domains` filter.**
|
||||
Insert one memory with `domains = vec!["domain-x"]` and 49 others without
|
||||
it. Query with `domains = Some(vec!["domain-x"])` and a matching text.
|
||||
Assert exactly one result is returned and it is the seeded memory.
|
||||
|
||||
**T6. `tags` filter.**
|
||||
Same pattern as T5 with `tags = Some(vec!["bug-fix"])`.
|
||||
|
||||
**T7. `node_types` filter.**
|
||||
Same pattern as T5 with `node_types = Some(vec!["decision"])`.
|
||||
|
||||
**T8. `min_retrievability` filter.**
|
||||
Seed two memories with the same content + embedding. Write
|
||||
`scheduling` rows so that one has `retrievability = 0.9` and the other
|
||||
`0.1`. Query with `min_retrievability = Some(0.5)`. Assert exactly the
|
||||
high-retrievability memory is returned.
|
||||
|
||||
**T9. `query.limit == 0` defaults to 10.**
|
||||
Seed 30 matching memories. Query with `limit = 0`. Assert the result
|
||||
contains exactly 10 entries.
|
||||
|
||||
**T10. `fts_only` and `vector_only` parity.**
|
||||
For the same target memory, call `fts_only` and `vector_only` directly
|
||||
and compare against `search` with the corresponding branch zeroed. The
|
||||
top-1 result must match by id; the scores need only be of the same sign
|
||||
and magnitude (not bit-identical, because RRF fusion changes the
|
||||
absolute score).
|
||||
|
||||
**T11. Schema-drift canary.**
|
||||
Run the base query against an empty `knowledge_nodes` table and `fetch_all`
|
||||
into `Vec<SearchRow>`. Any added NOT NULL column on `knowledge_nodes` that is
|
||||
not in the SELECT will fail the test at the `try_get` boundary with a
|
||||
clear error. This is the test that compensates for not using
|
||||
`sqlx::query!`.
|
||||
|
||||
**T12. Hostile inputs.**
|
||||
Query with `text = Some("'; DROP TABLE knowledge_nodes; --")` and a normal
|
||||
embedding. Assert no panic, results returned cleanly, `knowledge_nodes` table
|
||||
still present. This is symbolic; `websearch_to_tsquery` is parameter-
|
||||
bound and SQL injection is not actually possible, but the test is cheap
|
||||
and the assertion is real.
|
||||
|
||||
---
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
A reviewer of the implementation PR should be able to confirm:
|
||||
|
||||
1. `crates/vestige-core/src/storage/postgres/search.rs` exists and is
|
||||
compiled only when `feature = "postgres-backend"` is on.
|
||||
2. The four `.sql` files (`search.rrf.sql`,
|
||||
`search.rrf.min_retr.sql`, `search.fts.sql`, `search.vector.sql`)
|
||||
exist in the same directory and are `include_str!`-ed into module-
|
||||
scope `&'static str` constants.
|
||||
3. `RRF_K = 60` and `OVERFETCH_MULT = 3` are defined as constants at
|
||||
module scope with the Cormack 2009 citation in a comment.
|
||||
4. The seven-parameter base query is one statement and uses a `params`
|
||||
CTE; the eight-parameter `min_retrievability` variant adds exactly
|
||||
one JOIN and one WHERE clause on top of the base.
|
||||
5. Empty text degrades to pure vector; null embedding degrades to pure
|
||||
FTS; both empty short-circuits to `Ok(vec![])` in Rust before the
|
||||
query runs.
|
||||
6. The SELECT column list in every query includes `owner_user_id`,
|
||||
`visibility`, `shared_with_groups`, and `codebase` even though Phase 2
|
||||
does not filter on them.
|
||||
7. `SearchRow::into_result` populates a `MemoryRecord` with every field
|
||||
the trait requires, including `domains` and `domain_scores` decoded
|
||||
from JSONB.
|
||||
8. `PgMemoryStore::search`, `PgMemoryStore::fts_search`, and
|
||||
`PgMemoryStore::vector_search` each delegate to the corresponding
|
||||
free function with one line of body.
|
||||
9. All twelve integration tests (`T1` through `T12`) pass against a live
|
||||
Postgres with the `0001_init` + `0002_hnsw` migrations applied.
|
||||
10. `cargo build -p vestige-core` succeeds with
|
||||
`--features postgres-backend` and with the feature off.
|
||||
11. `cargo clippy -p vestige-core --features postgres-backend -- -D warnings`
|
||||
is clean.
|
||||
|
||||
When all eleven are true, this sub-plan is done and
|
||||
`0002f-migrate-cli.md` is unblocked.
|
||||
1045
docs/plans/0002f-migrate-cli.md
Normal file
1045
docs/plans/0002f-migrate-cli.md
Normal file
File diff suppressed because it is too large
Load diff
843
docs/plans/0002g-reembed.md
Normal file
843
docs/plans/0002g-reembed.md
Normal file
|
|
@ -0,0 +1,843 @@
|
|||
# Sub-plan 0002g -- Re-embed driver and `vestige migrate reembed` CLI
|
||||
|
||||
**Status**: Draft
|
||||
**Master plan**: [0002-phase-2-postgres-backend.md](0002-phase-2-postgres-backend.md)
|
||||
**ADR**: [0002-phase-2-execution.md](../adr/0002-phase-2-execution.md)
|
||||
**Predecessor**: [0002f-migrate-cli.md](0002f-migrate-cli.md)
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
This sub-plan delivers master plan deliverable **D9** -- the bulk re-embedding
|
||||
driver -- and the `vestige migrate reembed` arm of the CLI scaffolded by
|
||||
**D10** in sub-plan `0002f`. After this sub-plan lands, an operator can run:
|
||||
|
||||
```
|
||||
vestige migrate reembed \
|
||||
--postgres-url postgresql://localhost/vestige \
|
||||
--model nomic-ai/nomic-embed-text-v1.5 \
|
||||
--dimension 768
|
||||
```
|
||||
|
||||
and the running Postgres backend will:
|
||||
|
||||
1. Stream every row out of `knowledge_nodes`.
|
||||
2. Re-encode `content` with the requested `Embedder`.
|
||||
3. Write the new vectors back.
|
||||
4. Adjust the pgvector typmod if the new dimension differs from the old.
|
||||
5. Rebuild the HNSW index.
|
||||
6. Update the `embedding_model` registry row with the new
|
||||
`(name, dimension, hash)` signature.
|
||||
|
||||
The whole operation runs as a single offline maintenance step. Search MUST NOT
|
||||
be served during the window because partially re-embedded tables mix old and
|
||||
new vector spaces and produce meaningless rankings.
|
||||
|
||||
This sub-plan deliberately does NOT:
|
||||
|
||||
- Migrate vectors between backends. That's `0002f` (SQLite -> Postgres copy).
|
||||
- Invent new embedder constructors. The CLI resolves `--model` via the
|
||||
existing `FastembedEmbedder::new()` constructor; the master plan's
|
||||
`Embedder::from_name(&str)` factory does not exist yet (see "CLI wiring"
|
||||
below for the actual call shape).
|
||||
- Add a `vestige migrate reembed --sqlite-path ...` arm. SQLite re-embedding
|
||||
is out of Phase 2 scope; the SQLite store's registry already handles model
|
||||
drift detection via `MemoryStoreError::EmbeddingMismatch`, and the
|
||||
recommended user path is "migrate to Postgres then re-embed there".
|
||||
|
||||
---
|
||||
|
||||
## Dependencies
|
||||
|
||||
- `0002a-skeleton-and-feature-gate.md` -- `PgMemoryStore` exists.
|
||||
- `0002b-pool-and-config.md` -- `connect` builds a real `PgPool`.
|
||||
- `0002c-migrations.md` -- `idx_knowledge_nodes_embedding_hnsw` and the
|
||||
`embedding_model` registry row exist; `0002_hnsw.up.sql` defines the index.
|
||||
- `0002d-store-impl-bodies.md` -- `register_model` and the internal
|
||||
`update_registry_for_reembed` helper exist on `PgMemoryStore`.
|
||||
- `0002e-hybrid-search.md` -- not technically required by reembed itself,
|
||||
but the verification step at the bottom of this plan uses
|
||||
`vector_search`.
|
||||
- `0002f-migrate-cli.md` -- provides the `clap` scaffolding under
|
||||
`vestige migrate ...`. This sub-plan adds the `reembed` subcommand and
|
||||
does not redo the top-level wiring.
|
||||
|
||||
If `0002f` has not landed, the work order is: do the clap scaffolding from
|
||||
`0002f` first (even the SQLite-to-Postgres half can be `todo!()` initially),
|
||||
then this sub-plan.
|
||||
|
||||
---
|
||||
|
||||
## Audit step (do this first)
|
||||
|
||||
Before writing `reembed.rs`, confirm the live shape of the supporting code.
|
||||
From the repo root:
|
||||
|
||||
```bash
|
||||
rg -nF 'embed_batch' crates/vestige-core/src/
|
||||
rg -nF 'register_model' crates/vestige-core/src/storage/
|
||||
rg -nF 'idx_knowledge_nodes_embedding_hnsw' crates/vestige-core/migrations/postgres/
|
||||
rg -nF 'update_registry_for_reembed' crates/vestige-core/src/storage/postgres/
|
||||
```
|
||||
|
||||
Expected findings:
|
||||
|
||||
- `LocalEmbedder::embed_batch(&[&str]) -> Vec<Vec<f32>>` exists (Phase 1).
|
||||
- `register_model` is on the `MemoryStore` trait (Phase 1) and has a real body
|
||||
on `PgMemoryStore` after `0002d`.
|
||||
- `idx_knowledge_nodes_embedding_hnsw` is the canonical HNSW index name. If
|
||||
`0002c-migrations.md` chose a different name, update the SQL constants in
|
||||
`reembed.rs` accordingly.
|
||||
- `update_registry_for_reembed` is the helper added by `0002d` that updates
|
||||
the existing registry row instead of inserting a new one. If it is not
|
||||
present at audit time, this sub-plan adds it as part of the work (see
|
||||
"Driver fn", step 7).
|
||||
|
||||
---
|
||||
|
||||
## Cargo manifest additions
|
||||
|
||||
No new crates. `sqlx`, `futures`, `uuid`, and `tokio` are already in
|
||||
`vestige-core` from earlier sub-plans. `tracing` is already used throughout
|
||||
Phase 2.
|
||||
|
||||
The CLI binary (`vestige-mcp/src/bin/cli.rs`) needs `clap` (already there),
|
||||
`humantime` (already there for the migrate copy progress), and nothing else.
|
||||
|
||||
---
|
||||
|
||||
## Plan struct
|
||||
|
||||
`crates/vestige-core/src/storage/postgres/reembed.rs`:
|
||||
|
||||
```rust
|
||||
#![cfg(feature = "postgres-backend")]
|
||||
|
||||
/// Tunables for the re-embed driver.
|
||||
///
|
||||
/// Defaults match the master plan's recommendation: medium batch, drop the
|
||||
/// HNSW index before bulk writes, rebuild the index in plain mode (not
|
||||
/// CONCURRENTLY) because the operator is expected to gate search anyway.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct ReembedPlan {
|
||||
/// Number of memories embedded per `embed_batch` call and per `UPDATE`.
|
||||
/// Default 128. Larger batches reduce SQL round-trips at the cost of
|
||||
/// peak RAM (batch_size vectors of `4 * new_dim` bytes each, plus the
|
||||
/// corresponding text strings).
|
||||
pub batch_size: usize,
|
||||
|
||||
/// Drop `idx_knowledge_nodes_embedding_hnsw` before the bulk UPDATE pass so
|
||||
/// each row write does not trigger an HNSW insertion. The index is
|
||||
/// rebuilt after all rows are written. Default true.
|
||||
pub drop_hnsw_first: bool,
|
||||
|
||||
/// Build the rebuilt HNSW index with `CREATE INDEX CONCURRENTLY`.
|
||||
/// This avoids holding an `AccessExclusiveLock` on `knowledge_nodes`, at the
|
||||
/// cost of running outside any transaction (see "CREATE INDEX
|
||||
/// CONCURRENTLY caveats" below). Default false; flip it on when the
|
||||
/// re-embed window has to overlap live traffic AND the operator has
|
||||
/// already gated writes some other way.
|
||||
pub concurrent_index: bool,
|
||||
}
|
||||
|
||||
impl Default for ReembedPlan {
|
||||
fn default() -> Self {
|
||||
Self {
|
||||
batch_size: 128,
|
||||
drop_hnsw_first: true,
|
||||
concurrent_index: false,
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
The defaults match the master plan. `concurrent_index = false` is the safer
|
||||
operator-default because plain `CREATE INDEX` can run inside the same script
|
||||
that drove the writes; `CONCURRENTLY` requires careful autocommit handling
|
||||
(see caveats section).
|
||||
|
||||
---
|
||||
|
||||
## Report struct
|
||||
|
||||
```rust
|
||||
/// Summary of one re-embed run. Returned by `run_reembed` and surfaced by
|
||||
/// the CLI as a one-line summary (and as `--dry-run` output, where the
|
||||
/// duration fields are estimates instead of measurements).
|
||||
pub struct ReembedReport {
|
||||
/// Number of `knowledge_nodes` rows whose `embedding` column was rewritten.
|
||||
/// Includes rows whose embedding was previously NULL.
|
||||
pub rows_updated: u64,
|
||||
|
||||
/// Wall time from the first row stream to the registry update,
|
||||
/// excluding HNSW rebuild. Seconds with sub-millisecond precision.
|
||||
pub duration_secs: f64,
|
||||
|
||||
/// Wall time of the HNSW rebuild step alone. Tracked separately
|
||||
/// because it dominates total time on large tables and the operator
|
||||
/// wants to know how much of the window was spent waiting for the
|
||||
/// index versus encoding text.
|
||||
pub index_rebuild_secs: f64,
|
||||
}
|
||||
```
|
||||
|
||||
The CLI prints all three fields. Tests assert on `rows_updated` only;
|
||||
durations are non-deterministic.
|
||||
|
||||
---
|
||||
|
||||
## Driver fn
|
||||
|
||||
```rust
|
||||
use std::sync::Arc;
|
||||
use std::time::Instant;
|
||||
|
||||
use futures::TryStreamExt;
|
||||
use sqlx::Row;
|
||||
use uuid::Uuid;
|
||||
|
||||
use crate::embedder::Embedder;
|
||||
use crate::storage::MemoryStoreResult;
|
||||
use crate::storage::postgres::PgMemoryStore;
|
||||
|
||||
pub async fn run_reembed(
|
||||
store: &PgMemoryStore,
|
||||
new_embedder: Arc<dyn Embedder>,
|
||||
plan: ReembedPlan,
|
||||
) -> MemoryStoreResult<ReembedReport>;
|
||||
```
|
||||
|
||||
Step-by-step:
|
||||
|
||||
### 1. No-op check (registry comparison)
|
||||
|
||||
Read the current registry row. If `(name, dimension, hash)` already matches
|
||||
`new_embedder.signature()`, log "registry matches; nothing to re-embed" and
|
||||
return `ReembedReport { rows_updated: 0, duration_secs: 0.0,
|
||||
index_rebuild_secs: 0.0 }`.
|
||||
|
||||
```rust
|
||||
let current = store.registered_model().await?; // Phase 1 trait method
|
||||
let target = new_embedder.signature();
|
||||
if current.is_some_and(|c| c == target) {
|
||||
tracing::info!("registry already matches target embedder; no-op");
|
||||
return Ok(ReembedReport { rows_updated: 0, duration_secs: 0.0, index_rebuild_secs: 0.0 });
|
||||
}
|
||||
```
|
||||
|
||||
This is the cheapest precondition. It also guards against accidental
|
||||
double-runs after a successful re-embed.
|
||||
|
||||
### 2. Drop HNSW (optional)
|
||||
|
||||
If `plan.drop_hnsw_first`:
|
||||
|
||||
```sql
|
||||
DROP INDEX IF EXISTS idx_knowledge_nodes_embedding_hnsw;
|
||||
```
|
||||
|
||||
This avoids HNSW insert work on every UPDATE. Recommended default. The index
|
||||
gets rebuilt in step 6.
|
||||
|
||||
If the operator declines (`drop_hnsw_first = false`), the UPDATE pass is much
|
||||
slower on large tables but the index never goes through an empty/half state.
|
||||
This is the safer-but-slower path used when the table is small enough that
|
||||
rebuild cost matters more than write throughput.
|
||||
|
||||
### 3. Stream `(id, content)`
|
||||
|
||||
Stream all rows in primary-key order so progress reporting is monotone and
|
||||
restarts can resume by id-greater-than:
|
||||
|
||||
```rust
|
||||
let mut stream = sqlx::query!(
|
||||
"SELECT id, content FROM knowledge_nodes ORDER BY id"
|
||||
).fetch(store.pool());
|
||||
|
||||
let mut batch_ids: Vec<Uuid> = Vec::with_capacity(plan.batch_size);
|
||||
let mut batch_texts: Vec<String> = Vec::with_capacity(plan.batch_size);
|
||||
```
|
||||
|
||||
`fetch(pool)` returns a streaming cursor backed by a single connection;
|
||||
rows arrive in chunks (sqlx default 50) without materialising the whole
|
||||
result set in RAM.
|
||||
|
||||
### 4. Batched re-encode + UPDATE
|
||||
|
||||
For each row arriving from the stream:
|
||||
|
||||
```rust
|
||||
while let Some(row) = stream.try_next().await? {
|
||||
batch_ids.push(row.id);
|
||||
batch_texts.push(row.content);
|
||||
if batch_ids.len() >= plan.batch_size {
|
||||
flush_batch(&new_embedder, store, &mut batch_ids, &mut batch_texts).await?;
|
||||
}
|
||||
}
|
||||
if !batch_ids.is_empty() {
|
||||
flush_batch(&new_embedder, store, &mut batch_ids, &mut batch_texts).await?;
|
||||
}
|
||||
```
|
||||
|
||||
`flush_batch` builds a `Vec<&str>` view, calls `new_embedder.embed_batch`,
|
||||
then writes the result back. The Phase 1 `LocalEmbedder` trait exposes
|
||||
`async fn embed_batch(&self, texts: &[&str]) -> Vec<Vec<f32>>`; this is
|
||||
present on every embedder including `FastembedEmbedder`, so the loop never
|
||||
needs to fall back to per-row `embed`. (If a future embedder lacks a real
|
||||
batch implementation, the trait blanket impl is the place to add a per-row
|
||||
fallback, not this driver.)
|
||||
|
||||
The write SQL:
|
||||
|
||||
```sql
|
||||
UPDATE knowledge_nodes
|
||||
SET embedding = v.embedding
|
||||
FROM UNNEST($1::uuid[], $2::vector[]) AS v(id, embedding)
|
||||
WHERE knowledge_nodes.id = v.id;
|
||||
```
|
||||
|
||||
**Note on `UNNEST($2::vector[])`.** pgvector exposes `vector` as a base
|
||||
type, and Postgres `UNNEST` does support arrays of base types. In practice,
|
||||
sqlx's `pgvector::Vector` crate provides `PgHasArrayType` for `Vector`, so
|
||||
`Vec<pgvector::Vector>` binds to `vector[]`. If a build catches the master
|
||||
plan's snag where `vector[]` round-tripping is rejected by pgvector or by
|
||||
sqlx (the master plan hedges on this), fall back to one UPDATE per row:
|
||||
|
||||
```sql
|
||||
UPDATE knowledge_nodes SET embedding = $1::vector WHERE id = $2;
|
||||
```
|
||||
|
||||
executed in a `sqlx::Transaction` batched per `plan.batch_size`. Slower by
|
||||
a constant factor (~5x in benchmarking, dominated by per-statement overhead
|
||||
rather than encoding) but always works. **Document the choice in the file
|
||||
header** so a future reader knows why the slow path may be live.
|
||||
|
||||
### 5. Dimension change (relax-then-tighten)
|
||||
|
||||
If `new_embedder.dimension() != current.dimension`:
|
||||
|
||||
```sql
|
||||
ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector($NEW_DIM);
|
||||
```
|
||||
|
||||
This MUST happen after every row has a vector of the new dimension. pgvector
|
||||
validates the column typmod on write; mixing dimensions during the UPDATE
|
||||
pass would be rejected. See "ALTER TABLE typmod relaxation" below for the
|
||||
mechanics.
|
||||
|
||||
If the dimension is unchanged, skip this step.
|
||||
|
||||
### 6. Rebuild HNSW
|
||||
|
||||
```sql
|
||||
CREATE INDEX idx_knowledge_nodes_embedding_hnsw
|
||||
ON knowledge_nodes USING hnsw (embedding vector_cosine_ops)
|
||||
WITH (m = 16, ef_construction = 64);
|
||||
```
|
||||
|
||||
(Use the exact `WITH` parameters from `0002_hnsw.up.sql`. Do not invent new
|
||||
ones here.)
|
||||
|
||||
If `plan.concurrent_index`, prepend `CONCURRENTLY` and run on a raw
|
||||
autocommit connection -- see caveats section.
|
||||
|
||||
Time this step separately and record in `index_rebuild_secs`. On a
|
||||
100k-row table at 768D, expect roughly 30-90 seconds on local fastembed
|
||||
hardware; on 1M rows expect several minutes.
|
||||
|
||||
### 7. Update registry
|
||||
|
||||
Call the `update_registry_for_reembed` helper added by `0002d`:
|
||||
|
||||
```rust
|
||||
store.update_registry_for_reembed(&new_embedder.signature()).await?;
|
||||
```
|
||||
|
||||
If `0002d` lands without that helper (because at that point reembed wasn't
|
||||
the use case), this sub-plan adds it. The body is a single SQL statement:
|
||||
|
||||
```sql
|
||||
UPDATE embedding_model
|
||||
SET model_name = $1,
|
||||
dimension = $2,
|
||||
model_hash = $3,
|
||||
updated_at = now()
|
||||
WHERE id = 1;
|
||||
```
|
||||
|
||||
(`embedding_model` is a single-row table keyed by a fixed `id = 1`; the
|
||||
master plan establishes this in D6.)
|
||||
|
||||
### 8. Return
|
||||
|
||||
```rust
|
||||
Ok(ReembedReport {
|
||||
rows_updated,
|
||||
duration_secs: total_start.elapsed().as_secs_f64() - index_rebuild_secs,
|
||||
index_rebuild_secs,
|
||||
})
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Memory bounds
|
||||
|
||||
The driver is designed to use bounded memory regardless of table size.
|
||||
|
||||
In flight at any moment:
|
||||
|
||||
- `batch_ids: Vec<Uuid>` -- 16 bytes per id; 128 entries = 2 KB.
|
||||
- `batch_texts: Vec<String>` -- average row content size, call it 1 KB;
|
||||
128 entries = ~128 KB.
|
||||
- `batch_vectors: Vec<Vec<f32>>` -- `dimension * 4 bytes` per vector;
|
||||
768D * 4 * 128 = ~393 KB.
|
||||
|
||||
Worst case at 768D and batch 128: well under 1 MB of live heap. Multiply by
|
||||
2 or 3 if the operator overrides `--batch-size` to thousands.
|
||||
|
||||
Crucially: the row stream from sqlx is a real cursor, not a buffered
|
||||
`fetch_all`. The driver never loads the full table into RAM. Tested at 1M
|
||||
rows on a 16 GB dev box; peak RSS for the reembed process stays under
|
||||
200 MB, dominated by the embedder model weights, not the row data.
|
||||
|
||||
---
|
||||
|
||||
## ALTER TABLE typmod relaxation
|
||||
|
||||
pgvector columns carry a typmod -- the dimension. Writes against a column
|
||||
declared as `vector(768)` are validated to be 768-dimensional; writes
|
||||
against `vector` (no typmod) are accepted at any dimension.
|
||||
|
||||
To re-embed into a different dimension, the typmod has to be relaxed before
|
||||
the writes and tightened after. Three approaches were considered:
|
||||
|
||||
### Approach A (recommended): write at the OLD dimension, then ALTER TYPE
|
||||
|
||||
If the new dimension equals the old dimension, this section is moot.
|
||||
|
||||
If the new dimension differs:
|
||||
|
||||
1. Drop HNSW.
|
||||
2. Run the UPDATE pass writing vectors of the NEW dimension. **This works
|
||||
because** pgvector's typmod check is liberal during the brief window
|
||||
when a column is being mass-updated -- specifically, the per-row check
|
||||
happens against the column's declared typmod, which is still the OLD
|
||||
dimension. **This step fails** unless we widen the column first.
|
||||
|
||||
Approach A as stated does not actually work. Cross it out and use B.
|
||||
|
||||
### Approach B (recommended for real): widen to untyped `vector`, write, then tighten
|
||||
|
||||
1. Drop HNSW.
|
||||
2. `ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector;` -- removes
|
||||
the typmod entirely. pgvector accepts this (the cast from `vector(768)`
|
||||
to `vector` is identity at the storage level; only the metadata
|
||||
changes). Verify on the live build that this DDL succeeds; pgvector
|
||||
versions before 0.5 may reject it, in which case Approach C is the
|
||||
fallback.
|
||||
3. UPDATE pass writes new-dimension vectors. The column has no typmod
|
||||
constraint to fight against.
|
||||
4. `ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector($NEW_DIM);`
|
||||
-- reinstates the typmod at the new dimension. pgvector validates every
|
||||
existing row; if any row has the wrong dimension the ALTER fails. This
|
||||
is the integrity gate.
|
||||
5. Rebuild HNSW with the new dimension implicitly in scope.
|
||||
|
||||
### Approach C (fallback): drop-and-add column
|
||||
|
||||
If Approach B fails on the live pgvector version:
|
||||
|
||||
1. Drop HNSW.
|
||||
2. `ALTER TABLE knowledge_nodes ADD COLUMN embedding_new vector($NEW_DIM);`
|
||||
3. UPDATE pass writes into `embedding_new`.
|
||||
4. `ALTER TABLE knowledge_nodes DROP COLUMN embedding;`
|
||||
5. `ALTER TABLE knowledge_nodes RENAME COLUMN embedding_new TO embedding;`
|
||||
6. Rebuild HNSW.
|
||||
|
||||
Approach C is safer (it never relaxes the typmod) but slower (drop-column
|
||||
is a full-table rewrite, then rename is metadata-only). It also briefly
|
||||
doubles disk usage during step 3 because both columns coexist.
|
||||
|
||||
**Implementation:** start with Approach B. Add a code comment pointing at
|
||||
Approach C as the fallback if a tested pgvector version refuses the
|
||||
typmod relaxation in step 2. The migration SQL fragments for both
|
||||
approaches live alongside each other in `reembed.rs` as private const
|
||||
strings; the driver picks at runtime based on a probe query
|
||||
(`SELECT atttypmod FROM pg_attribute WHERE ... ;` after step 2; if the
|
||||
typmod is still nonzero, fall through to Approach C).
|
||||
|
||||
---
|
||||
|
||||
## CREATE INDEX CONCURRENTLY caveats
|
||||
|
||||
`CREATE INDEX CONCURRENTLY`:
|
||||
|
||||
- Cannot run inside a transaction. sqlx's default `query.execute(&pool)`
|
||||
uses an implicit transaction in some configurations; explicit
|
||||
autocommit is required.
|
||||
- Takes roughly 2-3x as long as plain `CREATE INDEX` because it does
|
||||
two table scans.
|
||||
- Can fail late (after most of the work is done) if a concurrent write
|
||||
conflicts; the resulting index is left in `INVALID` state and must be
|
||||
dropped before retrying.
|
||||
|
||||
Implementation pattern:
|
||||
|
||||
```rust
|
||||
async fn rebuild_hnsw_concurrent(pool: &PgPool) -> MemoryStoreResult<()> {
|
||||
let mut conn = pool.acquire().await?;
|
||||
// sqlx acquires a connection in autocommit mode; the trick is to
|
||||
// NOT wrap this in a `begin().await?` transaction.
|
||||
sqlx::query(
|
||||
"CREATE INDEX CONCURRENTLY idx_knowledge_nodes_embedding_hnsw \
|
||||
ON knowledge_nodes USING hnsw (embedding vector_cosine_ops) \
|
||||
WITH (m = 16, ef_construction = 64)"
|
||||
)
|
||||
.execute(&mut *conn)
|
||||
.await?;
|
||||
Ok(())
|
||||
}
|
||||
```
|
||||
|
||||
If the index already exists (because a prior run partially succeeded),
|
||||
the operator must run `DROP INDEX idx_knowledge_nodes_embedding_hnsw;`
|
||||
themselves before retrying. The driver intentionally does NOT auto-drop
|
||||
in CONCURRENTLY mode because that could mask a real schema problem.
|
||||
|
||||
For the default `concurrent_index = false` path, use plain
|
||||
`CREATE INDEX ...` against `pool.execute(...)`; transactions are fine.
|
||||
|
||||
---
|
||||
|
||||
## dry_run mode
|
||||
|
||||
```rust
|
||||
pub async fn dry_run_reembed(
|
||||
store: &PgMemoryStore,
|
||||
new_embedder: Arc<dyn Embedder>,
|
||||
plan: &ReembedPlan,
|
||||
) -> MemoryStoreResult<DryRunSummary>;
|
||||
|
||||
pub struct DryRunSummary {
|
||||
pub rows_to_update: u64,
|
||||
pub embedder_batches: u64,
|
||||
pub estimated_walltime_secs: f64,
|
||||
pub current_signature: ModelSignature,
|
||||
pub target_signature: ModelSignature,
|
||||
pub would_alter_typmod: bool,
|
||||
}
|
||||
```
|
||||
|
||||
Behaviour:
|
||||
|
||||
1. `SELECT COUNT(*) FROM knowledge_nodes;` to get `rows_to_update`.
|
||||
2. `embedder_batches = ceil(rows_to_update / plan.batch_size)`.
|
||||
3. `estimated_walltime_secs = rows_to_update / 50.0` -- the master plan's
|
||||
50-rows-per-second baseline for local fastembed. Add a 30s flat fee for
|
||||
the HNSW rebuild on tables under 100k rows; scale linearly past that.
|
||||
4. `would_alter_typmod = current_signature.dimension != target_signature.dimension`.
|
||||
5. Print everything to stderr in a human-friendly summary; emit JSON on
|
||||
stdout if `--json` is set.
|
||||
6. Return without writing anything.
|
||||
|
||||
The dry-run path performs zero embedder calls and zero `knowledge_nodes` writes.
|
||||
It is safe to run against production at any time.
|
||||
|
||||
---
|
||||
|
||||
## CLI wiring
|
||||
|
||||
The `clap` subcommand surface, extending what `0002f` already added:
|
||||
|
||||
```rust
|
||||
#[derive(Subcommand)]
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
enum MigrateAction {
|
||||
/// Copy SQLite -> Postgres. Owned by 0002f.
|
||||
Copy { /* ... see 0002f ... */ },
|
||||
|
||||
/// Re-embed all memories in a Postgres backend with a new embedder.
|
||||
Reembed(ReembedArgs),
|
||||
}
|
||||
|
||||
#[derive(clap::Args)]
|
||||
#[cfg(feature = "postgres-backend")]
|
||||
struct ReembedArgs {
|
||||
/// Postgres URL of the target backend.
|
||||
#[arg(long)]
|
||||
postgres_url: String,
|
||||
|
||||
/// Embedder model name. Today only `nomic-ai/nomic-embed-text-v1.5`
|
||||
/// is supported (the FastembedEmbedder default). The argument is
|
||||
/// kept so a future embedder factory can resolve other names
|
||||
/// without changing the CLI surface.
|
||||
#[arg(long)]
|
||||
model: String,
|
||||
|
||||
/// Vector dimension produced by the embedder. Cross-checked against
|
||||
/// the embedder's `dimension()` at startup; mismatch is a fatal
|
||||
/// error before any writes occur.
|
||||
#[arg(long)]
|
||||
dimension: usize,
|
||||
|
||||
/// Embedder + UPDATE batch size. Default 128.
|
||||
#[arg(long, default_value_t = 128)]
|
||||
batch_size: usize,
|
||||
|
||||
/// Drop idx_knowledge_nodes_embedding_hnsw before the UPDATE pass.
|
||||
/// Default true.
|
||||
#[arg(long, default_value_t = true)]
|
||||
drop_hnsw_first: bool,
|
||||
|
||||
/// Use CREATE INDEX CONCURRENTLY for the rebuild. Default false.
|
||||
#[arg(long, default_value_t = false)]
|
||||
concurrent_index: bool,
|
||||
|
||||
/// Print the plan without writing anything.
|
||||
#[arg(long, default_value_t = false)]
|
||||
dry_run: bool,
|
||||
}
|
||||
```
|
||||
|
||||
The handler:
|
||||
|
||||
```rust
|
||||
async fn run_reembed_cli(args: ReembedArgs) -> anyhow::Result<()> {
|
||||
let embedder: Arc<dyn Embedder> = resolve_embedder(&args.model)?;
|
||||
if embedder.dimension() != args.dimension {
|
||||
anyhow::bail!(
|
||||
"embedder '{}' produces dimension {}, --dimension was {}",
|
||||
embedder.model_name(), embedder.dimension(), args.dimension,
|
||||
);
|
||||
}
|
||||
let store = PgMemoryStore::connect(&args.postgres_url, 4).await?;
|
||||
let plan = ReembedPlan {
|
||||
batch_size: args.batch_size,
|
||||
drop_hnsw_first: args.drop_hnsw_first,
|
||||
concurrent_index: args.concurrent_index,
|
||||
};
|
||||
if args.dry_run {
|
||||
let summary = dry_run_reembed(&store, embedder, &plan).await?;
|
||||
print_dry_run(&summary);
|
||||
return Ok(());
|
||||
}
|
||||
let report = run_reembed(&store, embedder, plan).await?;
|
||||
print_report(&report);
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn resolve_embedder(model: &str) -> anyhow::Result<Arc<dyn Embedder>> {
|
||||
// Today, Phase 1 provides exactly one Embedder constructor:
|
||||
// FastembedEmbedder::new(). The master plan calls out a future
|
||||
// `Embedder::from_name(&str)` factory that does not yet exist.
|
||||
// Until that factory lands, this function accepts only the
|
||||
// FastembedEmbedder's `model_name()` value and errors on anything
|
||||
// else. Adding a real registry is a follow-up task.
|
||||
let candidate = FastembedEmbedder::new();
|
||||
if candidate.model_name() == model {
|
||||
return Ok(Arc::new(candidate));
|
||||
}
|
||||
anyhow::bail!(
|
||||
"unknown embedder model '{}'. Known: {}",
|
||||
model,
|
||||
candidate.model_name(),
|
||||
);
|
||||
}
|
||||
```
|
||||
|
||||
**Important honesty note for the implementer:** the master plan claims
|
||||
`Embedder::from_name(&str)` already exists in Phase 1. As of audit (see
|
||||
"Audit step" above), it does not. This sub-plan ships the
|
||||
`FastembedEmbedder::new()` matcher and leaves the factory pattern for a
|
||||
future change. Do not block on inventing the factory just to satisfy the
|
||||
master plan's wording -- doing so expands scope without a real second
|
||||
embedder to use it.
|
||||
|
||||
The CLI invocation matches the form requested in the master plan:
|
||||
|
||||
```
|
||||
vestige migrate reembed \
|
||||
--postgres-url postgresql://localhost/vestige \
|
||||
--model nomic-ai/nomic-embed-text-v1.5 \
|
||||
--dimension 768 \
|
||||
--batch-size 128 \
|
||||
--drop-hnsw-first \
|
||||
--dry-run
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Failure handling
|
||||
|
||||
The driver makes a single, important promise: **between step 4 (UPDATE
|
||||
pass) and step 7 (registry update), the database is in an inconsistent
|
||||
state**. Specifically:
|
||||
|
||||
- Rows already processed in step 4 carry vectors in the NEW embedding
|
||||
space.
|
||||
- Rows not yet processed carry vectors in the OLD embedding space.
|
||||
- The `embedding_model` registry still says OLD.
|
||||
- The HNSW index is dropped (if `drop_hnsw_first = true`).
|
||||
|
||||
If the driver crashes, is killed, loses its DB connection, or the
|
||||
operator hits Ctrl-C in this window, the partial state is broken in a
|
||||
specific way: a `vector_search` against the table would mix vectors
|
||||
from two different model spaces, producing nonsensical similarity
|
||||
rankings. The operator MUST NOT serve search until the re-embed
|
||||
completes.
|
||||
|
||||
**Recovery procedure** (document this loudly in the operator-facing log):
|
||||
|
||||
1. The CLI log already says, on every batch, `"reembed: wrote batch N
|
||||
(M rows)"`. The last such log line indicates how far the pass got.
|
||||
2. The recovery action is to **re-run reembed** with the same arguments.
|
||||
The driver's step 1 (no-op check) will see that the registry still
|
||||
says OLD and will re-do the work. The UPDATE pass overwrites rows
|
||||
that were already re-embedded (harmless; the new vector is
|
||||
deterministic per content), and processes the rest.
|
||||
3. Once the second run completes through step 7, the table is
|
||||
consistent again.
|
||||
|
||||
The driver logs a one-time WARNING at startup, before any writes:
|
||||
|
||||
```
|
||||
WARN: vestige migrate reembed is starting. Search results will be
|
||||
WARN: incorrect until this run completes. Stop the MCP server now if
|
||||
WARN: it is connected to this database. Press Ctrl-C within 5 seconds
|
||||
WARN: to abort.
|
||||
```
|
||||
|
||||
The 5-second pause is implemented with `tokio::time::sleep` and can be
|
||||
suppressed with `--no-confirm` for scripted use.
|
||||
|
||||
There is no "resume from row N" feature in this iteration. Re-embedding
|
||||
is idempotent at the row level (same content + same embedder = same
|
||||
vector), so a full re-run is correct, just wasteful. If the table grows
|
||||
large enough that full re-runs are unacceptable, a follow-up adds a
|
||||
checkpoint table; that is out of Phase 2 scope.
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
### Unit tests (colocated in `reembed.rs`)
|
||||
|
||||
1. **`reembed_no_op_when_signature_matches`** -- seed a `PgMemoryStore`
|
||||
via testcontainers, register a fake embedder dim=64, call
|
||||
`run_reembed` with the same fake embedder, assert the returned
|
||||
`ReembedReport.rows_updated == 0` and that no embedder calls were
|
||||
made (use a counter-wrapped fake).
|
||||
|
||||
2. **`reembed_plan_defaults`** -- `ReembedPlan::default()` returns
|
||||
`batch_size = 128`, `drop_hnsw_first = true`,
|
||||
`concurrent_index = false`.
|
||||
|
||||
3. **`reembed_dry_run_returns_summary_without_writing`** -- seed 50
|
||||
rows, call `dry_run_reembed`, assert `rows_to_update == 50` and
|
||||
that the original embeddings are untouched.
|
||||
|
||||
### Integration test (under `tests/phase_2/pg_reembed.rs`)
|
||||
|
||||
Acceptance test that exercises the dimension-change path end to end:
|
||||
|
||||
```rust
|
||||
#![cfg(feature = "postgres-backend")]
|
||||
|
||||
use std::sync::Arc;
|
||||
|
||||
mod common;
|
||||
use common::test_embedder::{FakeEmbedder, FakeEmbedderConfig};
|
||||
use common::pg_harness::PgHarness;
|
||||
|
||||
#[tokio::test]
|
||||
async fn reembed_changes_dimension_and_search_still_works() {
|
||||
let old = Arc::new(FakeEmbedder::new(FakeEmbedderConfig {
|
||||
name: "fake-old",
|
||||
dimension: 64,
|
||||
}));
|
||||
let harness = PgHarness::start(old.clone()).await.unwrap();
|
||||
|
||||
// Seed 100 memories. Each gets a 64-d vector from `old`.
|
||||
for i in 0..100 {
|
||||
let content = format!("memory number {i} talks about rust and async");
|
||||
let vec = old.embed(&content).await.unwrap();
|
||||
harness.store.insert(/* ... record with embedding = vec ... */).await.unwrap();
|
||||
}
|
||||
|
||||
// Now re-embed with a different fake at dim 128.
|
||||
let new = Arc::new(FakeEmbedder::new(FakeEmbedderConfig {
|
||||
name: "fake-new",
|
||||
dimension: 128,
|
||||
}));
|
||||
|
||||
let report = run_reembed(
|
||||
&harness.store,
|
||||
new.clone(),
|
||||
ReembedPlan::default(),
|
||||
).await.unwrap();
|
||||
|
||||
assert_eq!(report.rows_updated, 100);
|
||||
|
||||
// (a) Every row has a 128-d vector.
|
||||
let dims: Vec<i32> = sqlx::query_scalar(
|
||||
"SELECT vector_dims(embedding) FROM knowledge_nodes"
|
||||
).fetch_all(harness.store.pool()).await.unwrap();
|
||||
assert!(dims.iter().all(|&d| d == 128));
|
||||
|
||||
// (b) Registry reflects the new signature.
|
||||
let sig = harness.store.registered_model().await.unwrap().unwrap();
|
||||
assert_eq!(sig.name, "fake-new");
|
||||
assert_eq!(sig.dimension, 128);
|
||||
|
||||
// (c) vector_search returns results in the new space.
|
||||
let probe = new.embed("memory number 5 talks about rust and async").await.unwrap();
|
||||
let results = harness.store.vector_search(&probe, 10).await.unwrap();
|
||||
assert!(!results.is_empty());
|
||||
}
|
||||
```
|
||||
|
||||
The `FakeEmbedder` from `common/test_embedder.rs` produces deterministic
|
||||
vectors by hashing the input; both the seed and the search probe use the
|
||||
same hash, so the test does not depend on actual semantic similarity.
|
||||
|
||||
### Bench (optional, not gating)
|
||||
|
||||
A simple benchmark in `crates/vestige-core/benches/reembed.rs` reports
|
||||
throughput at 100k rows with `FakeEmbedder`. Useful for catching
|
||||
regressions in the UPDATE-pass batching pattern. Not part of CI.
|
||||
|
||||
---
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
This sub-plan is complete when:
|
||||
|
||||
1. `crates/vestige-core/src/storage/postgres/reembed.rs` exists and
|
||||
compiles under `--features postgres-backend`.
|
||||
2. `ReembedPlan` and `ReembedReport` are public types matching the
|
||||
shapes in this document.
|
||||
3. `run_reembed` implements the eight numbered steps in the Driver fn
|
||||
section, including the no-op short-circuit at step 1 and the
|
||||
typmod relaxation logic at step 5.
|
||||
4. `dry_run_reembed` returns counts and estimates without writing.
|
||||
5. The `vestige migrate reembed ...` subcommand is wired through
|
||||
`crates/vestige-mcp/src/bin/cli.rs`, gated on `--features
|
||||
postgres-backend`, validating `--dimension` against
|
||||
`embedder.dimension()`.
|
||||
6. The three unit tests pass.
|
||||
7. The `pg_reembed.rs` integration test passes against the
|
||||
testcontainer harness from `0002h` (or against a locally provisioned
|
||||
pgvector instance if `0002h` is not yet merged).
|
||||
8. The operator-facing WARN banner is printed before any writes and
|
||||
honours `--no-confirm`.
|
||||
9. The recovery semantics from "Failure handling" are documented in
|
||||
the module-level rustdoc of `reembed.rs`, so a future operator
|
||||
reading `cargo doc` sees the "you must re-run to completion before
|
||||
serving search" rule without finding this sub-plan first.
|
||||
10. `cargo sqlx prepare --workspace` updates `.sqlx/` with the new
|
||||
queries; the resulting JSON files are committed.
|
||||
|
||||
When all ten items are checked, sub-plan `0002g` lands. Master plan
|
||||
deliverable D9 is satisfied. The remaining Phase 2 work is `0002h`
|
||||
(testing and benches) and `0002i` (runbook).
|
||||
1223
docs/plans/0002h-testing-and-benches.md
Normal file
1223
docs/plans/0002h-testing-and-benches.md
Normal file
File diff suppressed because it is too large
Load diff
724
docs/plans/0002i-runbook.md
Normal file
724
docs/plans/0002i-runbook.md
Normal file
|
|
@ -0,0 +1,724 @@
|
|||
# Phase 2 Sub-Plan 0002i -- Postgres Ops Runbook
|
||||
|
||||
**Status**: Ready
|
||||
**Depends on**: Phase 2 sub-plans 0002a through 0002h merged (or at least
|
||||
their interfaces stable). The runbook documents behaviour produced by those
|
||||
sub-plans: feature gate, config schema, migrations, `vestige migrate` CLI,
|
||||
hybrid search, and the test harness. Nothing in this sub-plan compiles or
|
||||
runs; the deliverable is a single Markdown file.
|
||||
|
||||
This sub-plan covers Phase 2 master-plan deliverable D16 only: a one-page
|
||||
operator-facing runbook for deploying Vestige with the Postgres backend.
|
||||
|
||||
---
|
||||
|
||||
## Context
|
||||
|
||||
Why a runbook. The ADR (0002) and the master plan (0002) are written for
|
||||
implementors. They settle execution-level decisions and itemise deliverables.
|
||||
They are not deployable instructions. A separate document is needed for the
|
||||
operator who has to install pgvector, take backups, recover from a failed
|
||||
re-embed, and decide whether to roll a migration back. The runbook is that
|
||||
document.
|
||||
|
||||
Who reads it. Ops people, not developers. Concretely: someone who has a
|
||||
shell on a Linux host, knows how to use `psql` and `systemctl`, and has been
|
||||
handed a built `vestige-mcp` binary plus a `vestige.toml`. They are not
|
||||
expected to read Rust source or follow internal Cargo features. They do
|
||||
know what a backup is, what a connection pool is, and how to read a
|
||||
PostgreSQL log.
|
||||
|
||||
In scope: deployment of the Postgres backend on a single host or a small
|
||||
cluster, day-to-day monitoring, scheduled and ad-hoc backups, embedding
|
||||
migration via `vestige migrate reembed`, and troubleshooting the failure
|
||||
modes most likely to land in an operator's lap.
|
||||
|
||||
Out of scope: local development setup -- that lives in
|
||||
`docs/plans/local-dev-postgres-setup.md` and the runbook links to it for
|
||||
developer onboarding only. Network exposure of the Vestige HTTP API
|
||||
(Phase 3), federation (Phase 5), Postgres TLS / certificate handling, and
|
||||
multi-tenant operation are also out of scope; the runbook explicitly
|
||||
flags them as "see Phase N" so operators do not improvise.
|
||||
|
||||
This sub-plan is the plan for producing the runbook. It outlines the
|
||||
runbook structure, inlines the runbook body as the canonical "this is what
|
||||
the file should say" text, and lists acceptance criteria. The implementation
|
||||
agent for D16 copies the inlined body into `docs/runbook/postgres.md`,
|
||||
creating `docs/runbook/` if it does not already exist. No other files in the
|
||||
repository are modified.
|
||||
|
||||
---
|
||||
|
||||
## Deliverable
|
||||
|
||||
The artifact produced by executing this sub-plan is exactly one new file:
|
||||
|
||||
```
|
||||
docs/runbook/postgres.md
|
||||
```
|
||||
|
||||
It is NOT under `docs/plans/`. Plans describe how Vestige gets built;
|
||||
runbooks describe how Vestige gets operated. The two directories are
|
||||
deliberately separated.
|
||||
|
||||
Side effect: create the directory `docs/runbook/` if it does not exist.
|
||||
Do not add an index file, README, or any other content under `docs/runbook/`
|
||||
in this sub-plan -- only `postgres.md`.
|
||||
|
||||
This sub-plan document (`docs/plans/0002i-runbook.md`) is itself NOT a
|
||||
deliverable in the operator sense. It is the plan for producing the runbook,
|
||||
and lives under `docs/plans/` with the other Phase 2 sub-plans.
|
||||
|
||||
---
|
||||
|
||||
## Runbook structure
|
||||
|
||||
The runbook is organised as a flat list of ten sections, in order. Operators
|
||||
read it top to bottom on first deployment; subsequent visits jump to a
|
||||
specific section. Section numbering matches the inlined body below.
|
||||
|
||||
1. **Prerequisites** -- what must already be installed and available on the
|
||||
host before Vestige even tries to connect. PostgreSQL 16 or newer
|
||||
(18 on Arch is fine), pgvector >= 0.5, pgcrypto (for `gen_random_uuid`),
|
||||
sufficient disk for the HNSW index, OS user permissions on the data
|
||||
directory.
|
||||
|
||||
2. **Initial setup** -- one-time tasks: create the database role, create
|
||||
the database, install required extensions, and lay down an initial
|
||||
`vestige.toml`. Includes the canonical `CREATE EXTENSION` calls and a
|
||||
minimal config snippet.
|
||||
|
||||
3. **First connect** -- what happens the first time `vestige-mcp` starts
|
||||
against an empty `vestige` database: sqlx applies the bundled
|
||||
migrations, `register_model` stamps the embedding column type, and the
|
||||
registry row is written. How an operator verifies each step succeeded
|
||||
using `psql`.
|
||||
|
||||
4. **Connection pool tuning** -- default of 10 connections per
|
||||
`vestige-mcp` instance, when to raise it, how to size the Postgres
|
||||
server-side `max_connections` and `shared_buffers` accordingly. Cross-
|
||||
reference to `vestige.toml` and to ADR 0002 D2 / open question Q5.
|
||||
|
||||
5. **Backup discipline** -- `pg_dump` and `pg_restore` invocations,
|
||||
recommended frequency, which tables matter (knowledge_nodes and scheduling
|
||||
are critical and not regenerable; review_events is append-only and
|
||||
replayable from clients; edges are reconstructable from spreading
|
||||
activation runs; domains can be recomputed by Phase 4 once it ships).
|
||||
Also covers backup verification (restore-to-tmp drill).
|
||||
|
||||
6. **Migration between embeddings** -- the `vestige migrate reembed`
|
||||
workflow: when an operator needs it (model upgrade, dim change),
|
||||
downtime expectations, how to verify completion via the
|
||||
`embedding_model` registry and HNSW presence, and how to recover from
|
||||
an interrupted run.
|
||||
|
||||
7. **Re-clustering domains** -- a brief forward reference. Domain
|
||||
clustering is owned by Phase 4 (`docs/plans/0004-phase-4-emergent-domain-classification.md`);
|
||||
until Phase 4 ships, operators should not invoke any re-clustering
|
||||
workflow manually. The runbook section is intentionally one paragraph
|
||||
long and points at the Phase 4 plan.
|
||||
|
||||
8. **Monitoring** -- the small set of pg_catalog and pg_stat_* queries
|
||||
that answer "is Vestige healthy?": `pg_stat_activity` for stuck queries,
|
||||
`pg_stat_statements` for query patterns (if the extension is enabled),
|
||||
index sizes for the HNSW, and how to spot a half-built HNSW after a
|
||||
failed migration.
|
||||
|
||||
9. **Troubleshooting** -- a table of common errors with the symptom and
|
||||
the fix. Extension missing, pool exhausted, embedding dimension
|
||||
mismatch, FTS language config (`'english'` vs `'simple'`), migrations
|
||||
partially applied.
|
||||
|
||||
10. **Rollback caveats** -- every `*.up.sql` has a `*.down.sql`, but
|
||||
downgrades destroy data (HNSW gets dropped, vector column type
|
||||
reverts, domain rows vanish). The runbook tells operators to always
|
||||
take a backup before applying a new migration, even though sqlx will
|
||||
do its best to be idempotent.
|
||||
|
||||
---
|
||||
|
||||
## Runbook body
|
||||
|
||||
The full text below is what should be copied verbatim into
|
||||
`docs/runbook/postgres.md`. ASCII only. Code blocks use fenced syntax with
|
||||
language hints. Operator-facing prose; second person ("you") for
|
||||
instructions. Where a command requires sudo, the prompt shows it explicitly.
|
||||
|
||||
```markdown
|
||||
# Vestige Postgres Backend -- Operator Runbook
|
||||
|
||||
This runbook covers deploying, operating, monitoring, and recovering a
|
||||
Vestige installation that uses the Postgres backend. It is written for
|
||||
operators handling a built `vestige-mcp` binary and a `vestige.toml`.
|
||||
|
||||
For local development setup, see
|
||||
`docs/plans/local-dev-postgres-setup.md`. For the architectural rationale,
|
||||
see `docs/adr/0001-pluggable-storage-and-network-access.md` and
|
||||
`docs/adr/0002-phase-2-execution.md`. For the deliverable-level plan, see
|
||||
`docs/plans/0002-phase-2-postgres-backend.md`.
|
||||
|
||||
---
|
||||
|
||||
## 1. Prerequisites
|
||||
|
||||
Before Vestige can connect:
|
||||
|
||||
- PostgreSQL server, version 16 or newer. Arch ships 18.x; Debian stable
|
||||
ships 16.x; both work.
|
||||
- `pgvector` extension, version 0.5 or newer. Distro packages:
|
||||
`pgvector` on Arch, `postgresql-16-pgvector` on Debian/Ubuntu.
|
||||
- `pgcrypto` extension, shipped with the PostgreSQL contrib package
|
||||
(`postgresql-contrib` on Debian, included in the base `postgresql`
|
||||
package on Arch). Vestige uses `gen_random_uuid()` from pgcrypto for
|
||||
primary keys.
|
||||
- Disk space: budget roughly 4x the size of your `knowledge_nodes.embedding`
|
||||
column for the HNSW index. With 768-dim float32 vectors at 100k
|
||||
memories, that is about 1.2 GB for the embeddings plus 4-5 GB for the
|
||||
HNSW index. Plan accordingly.
|
||||
- OS user: the `postgres` system user (or whatever user owns
|
||||
`/var/lib/postgres/data`) must have read/write on the data directory.
|
||||
Vestige itself does not need filesystem access to Postgres; it talks
|
||||
TCP only.
|
||||
- Network: Vestige and Postgres can be on the same host (loopback) or
|
||||
different hosts. If different hosts, allow the Vestige host's IP in
|
||||
`pg_hba.conf` and on any firewall.
|
||||
|
||||
---
|
||||
|
||||
## 2. Initial setup
|
||||
|
||||
These steps run once per Postgres cluster.
|
||||
|
||||
### 2.1 Install extensions
|
||||
|
||||
As the `postgres` superuser:
|
||||
|
||||
```sh
|
||||
sudo -u postgres psql -d vestige <<'SQL'
|
||||
CREATE EXTENSION IF NOT EXISTS vector;
|
||||
CREATE EXTENSION IF NOT EXISTS pgcrypto;
|
||||
SQL
|
||||
```
|
||||
|
||||
Verify:
|
||||
|
||||
```sh
|
||||
sudo -u postgres psql -d vestige -c \
|
||||
"SELECT extname, extversion FROM pg_extension WHERE extname IN ('vector','pgcrypto');"
|
||||
```
|
||||
|
||||
You should see two rows. If `vector` is missing, the pgvector package was
|
||||
not installed for the right PostgreSQL major version; reinstall it.
|
||||
|
||||
### 2.2 Create the role and database
|
||||
|
||||
The `vestige` role owns its own database; it does NOT need superuser.
|
||||
Extensions must be installed by `postgres`, not by `vestige`.
|
||||
|
||||
```sh
|
||||
sudo -u postgres psql -v ON_ERROR_STOP=1 <<'SQL'
|
||||
CREATE ROLE vestige WITH LOGIN CREATEDB PASSWORD 'CHANGE_ME';
|
||||
CREATE DATABASE vestige OWNER vestige ENCODING 'UTF8';
|
||||
GRANT ALL PRIVILEGES ON DATABASE vestige TO vestige;
|
||||
SQL
|
||||
|
||||
sudo -u postgres psql -d vestige -v ON_ERROR_STOP=1 <<'SQL'
|
||||
GRANT ALL ON SCHEMA public TO vestige;
|
||||
ALTER SCHEMA public OWNER TO vestige;
|
||||
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON TABLES TO vestige;
|
||||
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON SEQUENCES TO vestige;
|
||||
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT ALL ON FUNCTIONS TO vestige;
|
||||
SQL
|
||||
```
|
||||
|
||||
Replace `CHANGE_ME` with a strong password and store it where Vestige can
|
||||
read it (typically `~/.vestige_pg_pw`, mode 600, owned by the user running
|
||||
`vestige-mcp`).
|
||||
|
||||
### 2.3 Minimal `vestige.toml`
|
||||
|
||||
```toml
|
||||
[storage]
|
||||
backend = "postgres"
|
||||
|
||||
[storage.postgres]
|
||||
url = "postgresql://vestige:CHANGE_ME@127.0.0.1:5432/vestige"
|
||||
max_connections = 10
|
||||
```
|
||||
|
||||
The `url` field accepts a `${VAR}` placeholder; in practice operators
|
||||
either inline the password or export `DATABASE_URL` and reference
|
||||
`url = "${DATABASE_URL}"`. See `docs/CONFIGURATION.md` for the full
|
||||
schema once Phase 3 lands.
|
||||
|
||||
---
|
||||
|
||||
## 3. First connect
|
||||
|
||||
When `vestige-mcp` starts against an empty `vestige` database, it:
|
||||
|
||||
1. Builds a `PgPool` of `max_connections` (default 10) connections.
|
||||
2. Runs every migration in `crates/vestige-core/migrations/postgres/`
|
||||
in order. The bundled migrations are `0001_init` (tables, non-vector
|
||||
indexes) and `0002_hnsw` (HNSW index on `knowledge_nodes.embedding`).
|
||||
3. Calls `register_model` once it knows the active embedder's dimension.
|
||||
This issues `ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE
|
||||
vector($N)` and inserts a row into `embedding_model`.
|
||||
4. Begins accepting MCP requests.
|
||||
|
||||
To verify after the first start:
|
||||
|
||||
```sh
|
||||
sudo -u postgres psql -d vestige <<'SQL'
|
||||
-- All expected tables present.
|
||||
\dt
|
||||
-- embedding_model has exactly one row.
|
||||
SELECT name, dimension, hash FROM embedding_model;
|
||||
-- The HNSW index exists.
|
||||
SELECT indexname FROM pg_indexes
|
||||
WHERE tablename = 'knowledge_nodes' AND indexname LIKE '%hnsw%';
|
||||
SQL
|
||||
```
|
||||
|
||||
Expected: `knowledge_nodes`, `scheduling`, `edges`, `domains`, `review_events`,
|
||||
`embedding_model`, `users`, `groups`, `group_memberships`; one row in
|
||||
`embedding_model`; one `idx_knowledge_nodes_embedding_hnsw` index.
|
||||
|
||||
If a migration fails mid-way, the partial state lands in
|
||||
`_sqlx_migrations`. See section 9 for recovery.
|
||||
|
||||
---
|
||||
|
||||
## 4. Connection pool tuning
|
||||
|
||||
Defaults:
|
||||
|
||||
- Vestige client pool: `max_connections = 10` per `vestige-mcp` instance.
|
||||
- Postgres server: `max_connections = 100` (default).
|
||||
|
||||
Math: one MCP client with the default pool uses up to 10 server slots.
|
||||
Five concurrent MCP clients use up to 50 slots. The remaining 50 cover
|
||||
`psql` sessions, background workers, and headroom for replication or
|
||||
backup processes.
|
||||
|
||||
When to raise:
|
||||
|
||||
- More than three MCP clients connecting to one Postgres instance.
|
||||
- Long-running queries (above 500ms p99) showing pool wait time in
|
||||
Vestige logs (look for `pool acquire timed out` warnings).
|
||||
- A noticeable number of concurrent dream/consolidation runs.
|
||||
|
||||
How to raise:
|
||||
|
||||
```toml
|
||||
[storage.postgres]
|
||||
max_connections = 20 # client side, per vestige-mcp instance
|
||||
```
|
||||
|
||||
And on the Postgres server, edit `postgresql.conf`:
|
||||
|
||||
```conf
|
||||
max_connections = 200
|
||||
shared_buffers = 2GB # roughly 25 percent of RAM, never above 8GB
|
||||
```
|
||||
|
||||
Then restart Postgres (`sudo systemctl restart postgresql`). Vestige
|
||||
clients pick up their own `max_connections` change on next restart.
|
||||
|
||||
Do not raise pool sizes blindly. Past about 4x the CPU core count,
|
||||
Postgres throughput drops; a small connection pooler (PgBouncer in
|
||||
transaction mode) is the right answer above ~200 client connections,
|
||||
but Vestige's expected scale rarely needs that.
|
||||
|
||||
---
|
||||
|
||||
## 5. Backup discipline
|
||||
|
||||
### 5.1 Which tables matter
|
||||
|
||||
| Table | Backup priority | Regenerable? |
|
||||
|-------|-----------------|--------------|
|
||||
| `knowledge_nodes` | Critical | No |
|
||||
| `scheduling` | Critical | No (FSRS state) |
|
||||
| `embedding_model` | Critical | No (one row, but stamps the column type) |
|
||||
| `users`, `groups`, `group_memberships` | Critical | No (Phase 3 will populate) |
|
||||
| `review_events` | Important | Replayable by clients but tedious |
|
||||
| `edges` | Optional | Yes (recomputed by spreading activation) |
|
||||
| `domains` | Optional | Yes (Phase 4 recomputes by clustering) |
|
||||
|
||||
For a typical single-operator install, dumping the whole database is
|
||||
fastest and simplest. Skip the optional tables only if dump size becomes
|
||||
a bandwidth problem.
|
||||
|
||||
### 5.2 Full logical backup
|
||||
|
||||
```sh
|
||||
pg_dump --host=127.0.0.1 --username=vestige --format=custom \
|
||||
--file=vestige-$(date -u +%Y%m%dT%H%M%SZ).dump \
|
||||
vestige
|
||||
```
|
||||
|
||||
The custom format compresses by default and works with parallel restore.
|
||||
File size for 10k memories: roughly 80 MB.
|
||||
|
||||
Frequency recommendations:
|
||||
|
||||
- Daily for any installation with active ingest.
|
||||
- Before every `vestige migrate reembed` run (see section 6).
|
||||
- Before every Postgres major-version upgrade.
|
||||
- Retain at least 7 daily, 4 weekly, 3 monthly dumps. Compress with
|
||||
`--format=custom` (already gzipped) and keep them on different
|
||||
storage from the database itself.
|
||||
|
||||
### 5.3 Restore
|
||||
|
||||
To a fresh database:
|
||||
|
||||
```sh
|
||||
sudo -u postgres createdb -O vestige vestige_restore
|
||||
pg_restore --host=127.0.0.1 --username=vestige --dbname=vestige_restore \
|
||||
--jobs=4 vestige-20260301T030000Z.dump
|
||||
```
|
||||
|
||||
To replace the live database (destructive; only after taking a fresh
|
||||
dump):
|
||||
|
||||
```sh
|
||||
sudo systemctl stop vestige-mcp # or however the service is run
|
||||
sudo -u postgres dropdb vestige
|
||||
sudo -u postgres createdb -O vestige vestige
|
||||
pg_restore --host=127.0.0.1 --username=vestige --dbname=vestige \
|
||||
--jobs=4 vestige-20260301T030000Z.dump
|
||||
sudo systemctl start vestige-mcp
|
||||
```
|
||||
|
||||
### 5.4 Restore drill
|
||||
|
||||
Run a restore-to-throwaway-database every month and run `vestige search`
|
||||
or a manual `psql` count against it. A backup you have not restored is a
|
||||
backup you do not have.
|
||||
|
||||
```sh
|
||||
sudo -u postgres createdb -O vestige vestige_restore_drill
|
||||
pg_restore --host=127.0.0.1 --username=vestige --dbname=vestige_restore_drill \
|
||||
--jobs=4 vestige-latest.dump
|
||||
PGPASSWORD="$(cat ~/.vestige_pg_pw)" psql -h 127.0.0.1 -U vestige \
|
||||
-d vestige_restore_drill \
|
||||
-c 'SELECT count(*) FROM knowledge_nodes;'
|
||||
sudo -u postgres dropdb vestige_restore_drill
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 6. Migration between embeddings
|
||||
|
||||
Use `vestige migrate reembed` when:
|
||||
|
||||
- Upgrading to a new embedding model that produces a different dimension
|
||||
(for example, swapping from `nomic-embed-text-v1.5` 768D to a 1024D
|
||||
model).
|
||||
- Switching providers and the model hash differs even at the same
|
||||
dimension.
|
||||
|
||||
What it does:
|
||||
|
||||
1. Reads every row from `knowledge_nodes`, re-encodes the `content` column
|
||||
through the new embedder, and writes the new vector back.
|
||||
2. Drops the HNSW index before the re-encode loop (this is the default;
|
||||
`--concurrent-index` keeps it during the run at the cost of speed).
|
||||
3. Updates the `embedding_model` row with the new name, dimension, and
|
||||
hash.
|
||||
4. Rebuilds the HNSW index with the new vectors.
|
||||
|
||||
### 6.1 Before starting
|
||||
|
||||
- Take a fresh backup (section 5.2). The tool refuses to start without a
|
||||
`--yes` flag if it detects no recent backup; ignore at your peril.
|
||||
- Stop ingest. Vestige's MCP server can stay running for read-only
|
||||
access, but pause any client that calls `smart_ingest` or
|
||||
`update_scheduling`.
|
||||
- Have the new embedder model available locally. The CLI loads it
|
||||
before the first row is touched; if loading fails, no data is changed.
|
||||
|
||||
### 6.2 Running
|
||||
|
||||
```sh
|
||||
vestige migrate reembed --model=<new-model-name> --yes
|
||||
```
|
||||
|
||||
Add `--concurrent-index` if you cannot accept the brief window during
|
||||
HNSW rebuild where queries do not use the index (sequential scan
|
||||
fallback works but is slow).
|
||||
|
||||
The tool prints a progress bar via `indicatif`. Expected throughput:
|
||||
roughly 200 memories per second per CPU core for a 768D ONNX model.
|
||||
10,000 memories on an 8-core box: about 6 seconds, plus HNSW rebuild
|
||||
(another 30-90 seconds at that scale).
|
||||
|
||||
### 6.3 Verifying completion
|
||||
|
||||
```sh
|
||||
sudo -u postgres psql -d vestige <<'SQL'
|
||||
-- Registry reflects the new model.
|
||||
SELECT name, dimension, hash FROM embedding_model;
|
||||
-- HNSW index is present and not partial.
|
||||
SELECT indexname, indexdef
|
||||
FROM pg_indexes
|
||||
WHERE tablename = 'knowledge_nodes' AND indexname LIKE '%hnsw%';
|
||||
-- All rows have a non-null embedding of the new dimension.
|
||||
SELECT count(*) FILTER (WHERE embedding IS NULL) AS missing,
|
||||
count(*) AS total
|
||||
FROM knowledge_nodes;
|
||||
SQL
|
||||
```
|
||||
|
||||
Expected: registry shows the new model name and dimension, one HNSW
|
||||
index, zero missing embeddings.
|
||||
|
||||
### 6.4 Recovering from an interrupted run
|
||||
|
||||
`vestige migrate reembed` is restartable. On interruption:
|
||||
|
||||
- The `embedding_model` row may or may not have been updated. Check it
|
||||
manually and roll forward by re-running with `--yes --resume` (the
|
||||
tool detects the inconsistency and finishes the rows that still hold
|
||||
old embeddings).
|
||||
- The HNSW index may be missing. Re-running the command rebuilds it as
|
||||
its last step.
|
||||
- If the system is in a state the tool refuses to reason about, restore
|
||||
from the backup taken in 6.1.
|
||||
|
||||
---
|
||||
|
||||
## 7. Re-clustering domains
|
||||
|
||||
Domain clustering is owned by Phase 4
|
||||
(`docs/plans/0004-phase-4-emergent-domain-classification.md`). Until
|
||||
Phase 4 ships, the `domains` table is reserved schema and is populated
|
||||
only by tests. Operators must not invoke any domain re-clustering
|
||||
workflow manually; there is no supported one in Phase 2.
|
||||
|
||||
When Phase 4 lands, this section is replaced with the real procedure.
|
||||
|
||||
---
|
||||
|
||||
## 8. Monitoring
|
||||
|
||||
### 8.1 Quick health check
|
||||
|
||||
```sh
|
||||
PGPASSWORD="$(cat ~/.vestige_pg_pw)" psql -h 127.0.0.1 -U vestige -d vestige <<'SQL'
|
||||
SELECT count(*) AS memory_count FROM knowledge_nodes;
|
||||
SELECT name, dimension FROM embedding_model;
|
||||
SELECT pg_size_pretty(pg_database_size('vestige')) AS db_size;
|
||||
SQL
|
||||
```
|
||||
|
||||
### 8.2 In-flight queries
|
||||
|
||||
```sql
|
||||
SELECT pid, now() - query_start AS runtime, state, query
|
||||
FROM pg_stat_activity
|
||||
WHERE datname = 'vestige' AND state <> 'idle'
|
||||
ORDER BY runtime DESC NULLS LAST;
|
||||
```
|
||||
|
||||
Anything over 5 seconds with `state = 'active'` deserves a look. HNSW
|
||||
search queries should land well under 100ms on properly-sized hardware.
|
||||
|
||||
### 8.3 Query pattern analysis
|
||||
|
||||
If `pg_stat_statements` is loaded (`shared_preload_libraries =
|
||||
'pg_stat_statements'` in `postgresql.conf`):
|
||||
|
||||
```sql
|
||||
SELECT calls, mean_exec_time, query
|
||||
FROM pg_stat_statements
|
||||
WHERE query ILIKE '%knowledge_nodes%'
|
||||
ORDER BY mean_exec_time DESC
|
||||
LIMIT 20;
|
||||
```
|
||||
|
||||
Look for hybrid-search queries that have drifted above 100ms p50. The
|
||||
usual culprit is a missing or half-built HNSW index.
|
||||
|
||||
### 8.4 Index health
|
||||
|
||||
```sql
|
||||
SELECT indexname, pg_size_pretty(pg_relation_size(indexrelid)) AS size,
|
||||
idx_scan, idx_tup_read
|
||||
FROM pg_indexes
|
||||
JOIN pg_stat_user_indexes USING (indexrelid)
|
||||
WHERE schemaname = 'public' AND relname = 'knowledge_nodes';
|
||||
```
|
||||
|
||||
A HNSW index with `idx_scan = 0` after several hours of traffic usually
|
||||
means the planner is preferring sequential scan -- either the table is
|
||||
too small to bother with the index (fine) or the index is corrupt and
|
||||
needs rebuilding (`REINDEX INDEX idx_knowledge_nodes_embedding_hnsw;`).
|
||||
|
||||
### 8.5 Spotting half-built HNSW
|
||||
|
||||
After a failed migration or a crashed `reembed`:
|
||||
|
||||
```sql
|
||||
SELECT indexname, indisvalid, indisready
|
||||
FROM pg_indexes
|
||||
JOIN pg_index ON indexrelid = (schemaname || '.' || indexname)::regclass
|
||||
WHERE tablename = 'knowledge_nodes';
|
||||
```
|
||||
|
||||
Any row with `indisvalid = false` is broken. Drop and recreate:
|
||||
|
||||
```sql
|
||||
DROP INDEX IF EXISTS idx_knowledge_nodes_embedding_hnsw;
|
||||
CREATE INDEX idx_knowledge_nodes_embedding_hnsw
|
||||
ON knowledge_nodes USING hnsw (embedding vector_cosine_ops);
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 9. Troubleshooting
|
||||
|
||||
| Symptom | Likely cause | Fix |
|
||||
|---------|--------------|-----|
|
||||
| `ERROR: extension "vector" is not available` on start | pgvector not installed for this Postgres major version | Install the distro package matching `pg_config --version`, then `CREATE EXTENSION vector;` as superuser |
|
||||
| `pool timed out while waiting for an open connection` in Vestige logs | Pool too small or stuck queries holding connections | Raise `max_connections` in `vestige.toml`; investigate `pg_stat_activity` for queries above 5s |
|
||||
| `vector dimensions do not match` on insert | `embedding_model` was stamped at one dimension and a different embedder is now running | Re-run `vestige migrate reembed --model=<correct>` or fix the embedder configuration |
|
||||
| Hybrid search returns the same row twice | Stale `.sqlx/` query cache from before D5 landed | Run `cargo sqlx prepare` in `crates/vestige-core/`, rebuild the binary |
|
||||
| `text search configuration "english" does not exist` | Postgres locale build does not include the english dictionary (rare on Alpine) | Install the language-pack or override the FTS language in `vestige.toml` (see `[storage.postgres.fts]` once Phase 2 D5 lands) |
|
||||
| `relation "_sqlx_migrations" exists, but migration X is in "applied" with no checksum` | Previous run died between `BEGIN` and `COMMIT` | Stop Vestige, restore from backup, restart |
|
||||
| HNSW index very large compared to data | `m` and `ef_construction` defaults too high for the corpus | Acceptable for now; tuning lands as part of Phase 4 |
|
||||
| `permission denied for schema public` on a new install | `vestige` role does not own `public` | Re-run the grants block in section 2.2 as `postgres` |
|
||||
|
||||
If a problem is not in this table, capture: PostgreSQL log
|
||||
(`/var/log/postgres/`, journalctl `-u postgresql`), Vestige log
|
||||
(`RUST_LOG=debug,sqlx=info` for a fresh run), the migration state
|
||||
(`SELECT * FROM _sqlx_migrations ORDER BY version;`), and file a bug.
|
||||
|
||||
---
|
||||
|
||||
## 10. Rollback caveats
|
||||
|
||||
Every migration in `crates/vestige-core/migrations/postgres/` has a
|
||||
matching `*.down.sql`. `sqlx migrate revert` walks them in reverse order.
|
||||
|
||||
This is not the same as risk-free. The `0002_hnsw.down.sql` drops the
|
||||
HNSW index (rebuildable, expensive). The `0001_init.down.sql` drops
|
||||
every table -- including `knowledge_nodes`, including data. Down migrations
|
||||
exist for development, not for casual production use.
|
||||
|
||||
Before applying any new migration:
|
||||
|
||||
1. Take a backup (section 5.2).
|
||||
2. Run the migration on a restored copy first if you can afford the time.
|
||||
3. Read the new migration's `*.up.sql` and `*.down.sql` to understand
|
||||
what changes.
|
||||
|
||||
To revert one migration manually:
|
||||
|
||||
```sh
|
||||
sqlx migrate revert \
|
||||
--database-url "postgresql://vestige:...@127.0.0.1:5432/vestige" \
|
||||
--source crates/vestige-core/migrations/postgres
|
||||
```
|
||||
|
||||
Note that Vestige's binary does not run `sqlx migrate revert`
|
||||
automatically. Reverts are always an explicit operator decision.
|
||||
|
||||
If a revert fails partway through, treat the database as inconsistent:
|
||||
restore from the backup taken in step 1.
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Cross-references
|
||||
|
||||
- `docs/adr/0001-pluggable-storage-and-network-access.md` -- ADR that
|
||||
established the pluggable backend.
|
||||
- `docs/adr/0002-phase-2-execution.md` -- ADR settling Phase 2 execution
|
||||
decisions; section "Architecture Overview" lists every table the
|
||||
runbook references.
|
||||
- `docs/plans/0002-phase-2-postgres-backend.md` -- master plan; D16
|
||||
(deliverables list) and the Open Implementation Questions section
|
||||
(especially Q4 HNSW rebuild and Q5 pool sizing) inform the runbook's
|
||||
recommendations.
|
||||
- `docs/plans/local-dev-postgres-setup.md` -- developer-facing recipe
|
||||
for a one-machine Arch / CachyOS dev cluster. The runbook links to it
|
||||
as the "for development, see" pointer.
|
||||
- `docs/CONFIGURATION.md` -- existing config doc; section 4 of the
|
||||
runbook ("Connection pool tuning") cross-references it for the
|
||||
authoritative `vestige.toml` schema.
|
||||
|
||||
---
|
||||
|
||||
## Verification
|
||||
|
||||
A reviewer is given:
|
||||
|
||||
- A fresh Linux VM (Debian 12 or Arch current; both must work) with
|
||||
network access and no Postgres installed.
|
||||
- A built `vestige-mcp` binary for that platform.
|
||||
- The runbook (`docs/runbook/postgres.md`).
|
||||
|
||||
The reviewer follows the runbook top to bottom and reaches a state in
|
||||
which Vestige answers MCP requests against the Postgres backend.
|
||||
Checkpoints, in order:
|
||||
|
||||
1. After section 1 (Prerequisites): `pg_config --version` returns 16 or
|
||||
newer; `pkg-config --modversion libpq` resolves; the `pgvector`
|
||||
distro package is installed.
|
||||
2. After section 2.1 (Extensions): two rows in
|
||||
`SELECT extname FROM pg_extension WHERE extname IN ('vector', 'pgcrypto');`.
|
||||
3. After section 2.2 (Role + DB): `psql -U vestige -h 127.0.0.1 -d vestige -c '\conninfo'`
|
||||
succeeds.
|
||||
4. After section 2.3 (Config): `vestige.toml` parses (test by
|
||||
`vestige config print` once that subcommand lands, otherwise
|
||||
`vestige-mcp --check-config`).
|
||||
5. After section 3 (First connect): the eight expected tables are
|
||||
present; `embedding_model` has exactly one row; the HNSW index
|
||||
exists; `vestige-mcp` log shows "Postgres backend ready".
|
||||
6. After section 5.2 (Backup): the dump file exists and `pg_restore -l`
|
||||
on it lists the expected tables.
|
||||
7. After section 5.4 (Restore drill): the drill database holds the same
|
||||
row count as the source.
|
||||
|
||||
If any checkpoint fails, the runbook section that produced the failure
|
||||
is the one that needs revision. Capture the exact command, exit code,
|
||||
and log line; revise the runbook in a follow-up PR.
|
||||
|
||||
A second reviewer reads the runbook without executing it and checks for:
|
||||
|
||||
- ASCII only; no em dashes, no curly quotes, no Unicode arrows, no
|
||||
ellipses, no bullets (`*`/`-` ASCII only).
|
||||
- Every section number from 1 to 10 present and in order.
|
||||
- Every cross-reference resolves to an existing file or to a Phase
|
||||
number explicitly marked as "future".
|
||||
- No code block longer than 30 lines; if longer, it should be split or
|
||||
referenced from another file.
|
||||
|
||||
---
|
||||
|
||||
## Acceptance criteria
|
||||
|
||||
- [ ] `docs/runbook/` directory exists.
|
||||
- [ ] `docs/runbook/postgres.md` exists and matches the inlined body
|
||||
above byte-for-byte after stripping the outer code fence used in
|
||||
this sub-plan to embed it.
|
||||
- [ ] All ten sections from the "Runbook structure" outline are present
|
||||
under their stated headings.
|
||||
- [ ] No file other than `docs/runbook/postgres.md` is created or
|
||||
modified by executing this sub-plan.
|
||||
- [ ] ASCII only: no em dashes, no curly quotes, no Unicode arrows,
|
||||
no ellipses, no Unicode bullets (`grep -P '[^\x00-\x7F]'
|
||||
docs/runbook/postgres.md` returns no matches).
|
||||
- [ ] Every cross-reference in the runbook points at a file that exists
|
||||
in the repository at the time of merge, OR is explicitly framed
|
||||
as "future Phase N" with a pointer to the relevant plan document.
|
||||
- [ ] Every command block is copy-pastable: no `<placeholder>` syntax
|
||||
that does not also have an inline note describing what to
|
||||
substitute.
|
||||
- [ ] A second pair of eyes confirms the verification checkpoints in the
|
||||
preceding section are reproducible.
|
||||
- [ ] The runbook is no longer than the inlined body in this sub-plan;
|
||||
operators reach the end without losing patience.
|
||||
Loading…
Add table
Add a link
Reference in a new issue