Nine Phase 2 sub-plans operationalising ADR 0002 against the Phase 2 master plan, each sized to fit a focused implementation session and handed to Claude Code as a /goal brief without requiring the agent to load the master plan. Order of execution (each depends on the previous unless noted): - 0002a-skeleton-and-feature-gate.md -- postgres-backend Cargo feature + PgMemoryStore skeleton with todo!() bodies. D1+D2. - 0002b-pool-and-config.md -- PgPool builder, VestigeConfig/ PostgresConfig, vestige.toml loader wired into vestige-mcp. D3+D7 (master plan numbering). - 0002c-migrations.md -- sqlx migrations 0001_init/0002_hnsw including D7 (users/groups/memberships, owner/visibility/shared_with_groups) and D8 (codebase column). SQLite V15 parity migration. D4. - 0002d-store-impl-bodies.md -- real CRUD + registry bodies; trivial fts_search/vector_search bodies. D2+D6. - 0002e-hybrid-search.md -- one-statement RRF query. D5. - 0002f-migrate-cli.md -- vestige migrate copy (SQLite -> Postgres), --dry-run, idempotent re-runs, --allow-source-upgrade for pre-V15 sources. D8+D10. - 0002g-reembed.md -- vestige migrate reembed (offline rebuild). D9 + D10 reembed arm. Ships resolve_embedder helper as a workaround for the missing Embedder::from_name(&str) constructor. - 0002h-testing-and-benches.md -- testcontainers harness, six integration test files, Criterion bench at 1k/100k. D14+D15. - 0002i-runbook.md -- operator-facing deployment + day-2 runbook. D16. Supersession notice added to the master plan (0002-phase-2-postgres- backend.md) pointing at ADR 0002; body retained as archival reference. PR B carries this commit plus the previous two (ADR 0002 + Phase 1 amendment sub-plans); no code change.
30 KiB
Sub-plan 0002g -- Re-embed driver and vestige migrate reembed CLI
Status: Draft Master plan: 0002-phase-2-postgres-backend.md ADR: 0002-phase-2-execution.md Predecessor: 0002f-migrate-cli.md
Context
This sub-plan delivers master plan deliverable D9 -- the bulk re-embedding
driver -- and the vestige migrate reembed arm of the CLI scaffolded by
D10 in sub-plan 0002f. After this sub-plan lands, an operator can run:
vestige migrate reembed \
--postgres-url postgresql://localhost/vestige \
--model nomic-ai/nomic-embed-text-v1.5 \
--dimension 768
and the running Postgres backend will:
- Stream every row out of
knowledge_nodes. - Re-encode
contentwith the requestedEmbedder. - Write the new vectors back.
- Adjust the pgvector typmod if the new dimension differs from the old.
- Rebuild the HNSW index.
- Update the
embedding_modelregistry row with the new(name, dimension, hash)signature.
The whole operation runs as a single offline maintenance step. Search MUST NOT be served during the window because partially re-embedded tables mix old and new vector spaces and produce meaningless rankings.
This sub-plan deliberately does NOT:
- Migrate vectors between backends. That's
0002f(SQLite -> Postgres copy). - Invent new embedder constructors. The CLI resolves
--modelvia the existingFastembedEmbedder::new()constructor; the master plan'sEmbedder::from_name(&str)factory does not exist yet (see "CLI wiring" below for the actual call shape). - Add a
vestige migrate reembed --sqlite-path ...arm. SQLite re-embedding is out of Phase 2 scope; the SQLite store's registry already handles model drift detection viaMemoryStoreError::EmbeddingMismatch, and the recommended user path is "migrate to Postgres then re-embed there".
Dependencies
0002a-skeleton-and-feature-gate.md--PgMemoryStoreexists.0002b-pool-and-config.md--connectbuilds a realPgPool.0002c-migrations.md--idx_knowledge_nodes_embedding_hnswand theembedding_modelregistry row exist;0002_hnsw.up.sqldefines the index.0002d-store-impl-bodies.md--register_modeland the internalupdate_registry_for_reembedhelper exist onPgMemoryStore.0002e-hybrid-search.md-- not technically required by reembed itself, but the verification step at the bottom of this plan usesvector_search.0002f-migrate-cli.md-- provides theclapscaffolding undervestige migrate .... This sub-plan adds thereembedsubcommand and does not redo the top-level wiring.
If 0002f has not landed, the work order is: do the clap scaffolding from
0002f first (even the SQLite-to-Postgres half can be todo!() initially),
then this sub-plan.
Audit step (do this first)
Before writing reembed.rs, confirm the live shape of the supporting code.
From the repo root:
rg -nF 'embed_batch' crates/vestige-core/src/
rg -nF 'register_model' crates/vestige-core/src/storage/
rg -nF 'idx_knowledge_nodes_embedding_hnsw' crates/vestige-core/migrations/postgres/
rg -nF 'update_registry_for_reembed' crates/vestige-core/src/storage/postgres/
Expected findings:
LocalEmbedder::embed_batch(&[&str]) -> Vec<Vec<f32>>exists (Phase 1).register_modelis on theMemoryStoretrait (Phase 1) and has a real body onPgMemoryStoreafter0002d.idx_knowledge_nodes_embedding_hnswis the canonical HNSW index name. If0002c-migrations.mdchose a different name, update the SQL constants inreembed.rsaccordingly.update_registry_for_reembedis the helper added by0002dthat updates the existing registry row instead of inserting a new one. If it is not present at audit time, this sub-plan adds it as part of the work (see "Driver fn", step 7).
Cargo manifest additions
No new crates. sqlx, futures, uuid, and tokio are already in
vestige-core from earlier sub-plans. tracing is already used throughout
Phase 2.
The CLI binary (vestige-mcp/src/bin/cli.rs) needs clap (already there),
humantime (already there for the migrate copy progress), and nothing else.
Plan struct
crates/vestige-core/src/storage/postgres/reembed.rs:
#![cfg(feature = "postgres-backend")]
/// Tunables for the re-embed driver.
///
/// Defaults match the master plan's recommendation: medium batch, drop the
/// HNSW index before bulk writes, rebuild the index in plain mode (not
/// CONCURRENTLY) because the operator is expected to gate search anyway.
#[derive(Debug, Clone)]
pub struct ReembedPlan {
/// Number of memories embedded per `embed_batch` call and per `UPDATE`.
/// Default 128. Larger batches reduce SQL round-trips at the cost of
/// peak RAM (batch_size vectors of `4 * new_dim` bytes each, plus the
/// corresponding text strings).
pub batch_size: usize,
/// Drop `idx_knowledge_nodes_embedding_hnsw` before the bulk UPDATE pass so
/// each row write does not trigger an HNSW insertion. The index is
/// rebuilt after all rows are written. Default true.
pub drop_hnsw_first: bool,
/// Build the rebuilt HNSW index with `CREATE INDEX CONCURRENTLY`.
/// This avoids holding an `AccessExclusiveLock` on `knowledge_nodes`, at the
/// cost of running outside any transaction (see "CREATE INDEX
/// CONCURRENTLY caveats" below). Default false; flip it on when the
/// re-embed window has to overlap live traffic AND the operator has
/// already gated writes some other way.
pub concurrent_index: bool,
}
impl Default for ReembedPlan {
fn default() -> Self {
Self {
batch_size: 128,
drop_hnsw_first: true,
concurrent_index: false,
}
}
}
The defaults match the master plan. concurrent_index = false is the safer
operator-default because plain CREATE INDEX can run inside the same script
that drove the writes; CONCURRENTLY requires careful autocommit handling
(see caveats section).
Report struct
/// Summary of one re-embed run. Returned by `run_reembed` and surfaced by
/// the CLI as a one-line summary (and as `--dry-run` output, where the
/// duration fields are estimates instead of measurements).
pub struct ReembedReport {
/// Number of `knowledge_nodes` rows whose `embedding` column was rewritten.
/// Includes rows whose embedding was previously NULL.
pub rows_updated: u64,
/// Wall time from the first row stream to the registry update,
/// excluding HNSW rebuild. Seconds with sub-millisecond precision.
pub duration_secs: f64,
/// Wall time of the HNSW rebuild step alone. Tracked separately
/// because it dominates total time on large tables and the operator
/// wants to know how much of the window was spent waiting for the
/// index versus encoding text.
pub index_rebuild_secs: f64,
}
The CLI prints all three fields. Tests assert on rows_updated only;
durations are non-deterministic.
Driver fn
use std::sync::Arc;
use std::time::Instant;
use futures::TryStreamExt;
use sqlx::Row;
use uuid::Uuid;
use crate::embedder::Embedder;
use crate::storage::MemoryStoreResult;
use crate::storage::postgres::PgMemoryStore;
pub async fn run_reembed(
store: &PgMemoryStore,
new_embedder: Arc<dyn Embedder>,
plan: ReembedPlan,
) -> MemoryStoreResult<ReembedReport>;
Step-by-step:
1. No-op check (registry comparison)
Read the current registry row. If (name, dimension, hash) already matches
new_embedder.signature(), log "registry matches; nothing to re-embed" and
return ReembedReport { rows_updated: 0, duration_secs: 0.0, index_rebuild_secs: 0.0 }.
let current = store.registered_model().await?; // Phase 1 trait method
let target = new_embedder.signature();
if current.is_some_and(|c| c == target) {
tracing::info!("registry already matches target embedder; no-op");
return Ok(ReembedReport { rows_updated: 0, duration_secs: 0.0, index_rebuild_secs: 0.0 });
}
This is the cheapest precondition. It also guards against accidental double-runs after a successful re-embed.
2. Drop HNSW (optional)
If plan.drop_hnsw_first:
DROP INDEX IF EXISTS idx_knowledge_nodes_embedding_hnsw;
This avoids HNSW insert work on every UPDATE. Recommended default. The index gets rebuilt in step 6.
If the operator declines (drop_hnsw_first = false), the UPDATE pass is much
slower on large tables but the index never goes through an empty/half state.
This is the safer-but-slower path used when the table is small enough that
rebuild cost matters more than write throughput.
3. Stream (id, content)
Stream all rows in primary-key order so progress reporting is monotone and restarts can resume by id-greater-than:
let mut stream = sqlx::query!(
"SELECT id, content FROM knowledge_nodes ORDER BY id"
).fetch(store.pool());
let mut batch_ids: Vec<Uuid> = Vec::with_capacity(plan.batch_size);
let mut batch_texts: Vec<String> = Vec::with_capacity(plan.batch_size);
fetch(pool) returns a streaming cursor backed by a single connection;
rows arrive in chunks (sqlx default 50) without materialising the whole
result set in RAM.
4. Batched re-encode + UPDATE
For each row arriving from the stream:
while let Some(row) = stream.try_next().await? {
batch_ids.push(row.id);
batch_texts.push(row.content);
if batch_ids.len() >= plan.batch_size {
flush_batch(&new_embedder, store, &mut batch_ids, &mut batch_texts).await?;
}
}
if !batch_ids.is_empty() {
flush_batch(&new_embedder, store, &mut batch_ids, &mut batch_texts).await?;
}
flush_batch builds a Vec<&str> view, calls new_embedder.embed_batch,
then writes the result back. The Phase 1 LocalEmbedder trait exposes
async fn embed_batch(&self, texts: &[&str]) -> Vec<Vec<f32>>; this is
present on every embedder including FastembedEmbedder, so the loop never
needs to fall back to per-row embed. (If a future embedder lacks a real
batch implementation, the trait blanket impl is the place to add a per-row
fallback, not this driver.)
The write SQL:
UPDATE knowledge_nodes
SET embedding = v.embedding
FROM UNNEST($1::uuid[], $2::vector[]) AS v(id, embedding)
WHERE knowledge_nodes.id = v.id;
Note on UNNEST($2::vector[]). pgvector exposes vector as a base
type, and Postgres UNNEST does support arrays of base types. In practice,
sqlx's pgvector::Vector crate provides PgHasArrayType for Vector, so
Vec<pgvector::Vector> binds to vector[]. If a build catches the master
plan's snag where vector[] round-tripping is rejected by pgvector or by
sqlx (the master plan hedges on this), fall back to one UPDATE per row:
UPDATE knowledge_nodes SET embedding = $1::vector WHERE id = $2;
executed in a sqlx::Transaction batched per plan.batch_size. Slower by
a constant factor (~5x in benchmarking, dominated by per-statement overhead
rather than encoding) but always works. Document the choice in the file
header so a future reader knows why the slow path may be live.
5. Dimension change (relax-then-tighten)
If new_embedder.dimension() != current.dimension:
ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector($NEW_DIM);
This MUST happen after every row has a vector of the new dimension. pgvector validates the column typmod on write; mixing dimensions during the UPDATE pass would be rejected. See "ALTER TABLE typmod relaxation" below for the mechanics.
If the dimension is unchanged, skip this step.
6. Rebuild HNSW
CREATE INDEX idx_knowledge_nodes_embedding_hnsw
ON knowledge_nodes USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
(Use the exact WITH parameters from 0002_hnsw.up.sql. Do not invent new
ones here.)
If plan.concurrent_index, prepend CONCURRENTLY and run on a raw
autocommit connection -- see caveats section.
Time this step separately and record in index_rebuild_secs. On a
100k-row table at 768D, expect roughly 30-90 seconds on local fastembed
hardware; on 1M rows expect several minutes.
7. Update registry
Call the update_registry_for_reembed helper added by 0002d:
store.update_registry_for_reembed(&new_embedder.signature()).await?;
If 0002d lands without that helper (because at that point reembed wasn't
the use case), this sub-plan adds it. The body is a single SQL statement:
UPDATE embedding_model
SET model_name = $1,
dimension = $2,
model_hash = $3,
updated_at = now()
WHERE id = 1;
(embedding_model is a single-row table keyed by a fixed id = 1; the
master plan establishes this in D6.)
8. Return
Ok(ReembedReport {
rows_updated,
duration_secs: total_start.elapsed().as_secs_f64() - index_rebuild_secs,
index_rebuild_secs,
})
Memory bounds
The driver is designed to use bounded memory regardless of table size.
In flight at any moment:
batch_ids: Vec<Uuid>-- 16 bytes per id; 128 entries = 2 KB.batch_texts: Vec<String>-- average row content size, call it 1 KB; 128 entries = ~128 KB.batch_vectors: Vec<Vec<f32>>--dimension * 4 bytesper vector; 768D * 4 * 128 = ~393 KB.
Worst case at 768D and batch 128: well under 1 MB of live heap. Multiply by
2 or 3 if the operator overrides --batch-size to thousands.
Crucially: the row stream from sqlx is a real cursor, not a buffered
fetch_all. The driver never loads the full table into RAM. Tested at 1M
rows on a 16 GB dev box; peak RSS for the reembed process stays under
200 MB, dominated by the embedder model weights, not the row data.
ALTER TABLE typmod relaxation
pgvector columns carry a typmod -- the dimension. Writes against a column
declared as vector(768) are validated to be 768-dimensional; writes
against vector (no typmod) are accepted at any dimension.
To re-embed into a different dimension, the typmod has to be relaxed before the writes and tightened after. Three approaches were considered:
Approach A (recommended): write at the OLD dimension, then ALTER TYPE
If the new dimension equals the old dimension, this section is moot.
If the new dimension differs:
- Drop HNSW.
- Run the UPDATE pass writing vectors of the NEW dimension. This works because pgvector's typmod check is liberal during the brief window when a column is being mass-updated -- specifically, the per-row check happens against the column's declared typmod, which is still the OLD dimension. This step fails unless we widen the column first.
Approach A as stated does not actually work. Cross it out and use B.
Approach B (recommended for real): widen to untyped vector, write, then tighten
- Drop HNSW.
ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector;-- removes the typmod entirely. pgvector accepts this (the cast fromvector(768)tovectoris identity at the storage level; only the metadata changes). Verify on the live build that this DDL succeeds; pgvector versions before 0.5 may reject it, in which case Approach C is the fallback.- UPDATE pass writes new-dimension vectors. The column has no typmod constraint to fight against.
ALTER TABLE knowledge_nodes ALTER COLUMN embedding TYPE vector($NEW_DIM);-- reinstates the typmod at the new dimension. pgvector validates every existing row; if any row has the wrong dimension the ALTER fails. This is the integrity gate.- Rebuild HNSW with the new dimension implicitly in scope.
Approach C (fallback): drop-and-add column
If Approach B fails on the live pgvector version:
- Drop HNSW.
ALTER TABLE knowledge_nodes ADD COLUMN embedding_new vector($NEW_DIM);- UPDATE pass writes into
embedding_new. ALTER TABLE knowledge_nodes DROP COLUMN embedding;ALTER TABLE knowledge_nodes RENAME COLUMN embedding_new TO embedding;- Rebuild HNSW.
Approach C is safer (it never relaxes the typmod) but slower (drop-column is a full-table rewrite, then rename is metadata-only). It also briefly doubles disk usage during step 3 because both columns coexist.
Implementation: start with Approach B. Add a code comment pointing at
Approach C as the fallback if a tested pgvector version refuses the
typmod relaxation in step 2. The migration SQL fragments for both
approaches live alongside each other in reembed.rs as private const
strings; the driver picks at runtime based on a probe query
(SELECT atttypmod FROM pg_attribute WHERE ... ; after step 2; if the
typmod is still nonzero, fall through to Approach C).
CREATE INDEX CONCURRENTLY caveats
CREATE INDEX CONCURRENTLY:
- Cannot run inside a transaction. sqlx's default
query.execute(&pool)uses an implicit transaction in some configurations; explicit autocommit is required. - Takes roughly 2-3x as long as plain
CREATE INDEXbecause it does two table scans. - Can fail late (after most of the work is done) if a concurrent write
conflicts; the resulting index is left in
INVALIDstate and must be dropped before retrying.
Implementation pattern:
async fn rebuild_hnsw_concurrent(pool: &PgPool) -> MemoryStoreResult<()> {
let mut conn = pool.acquire().await?;
// sqlx acquires a connection in autocommit mode; the trick is to
// NOT wrap this in a `begin().await?` transaction.
sqlx::query(
"CREATE INDEX CONCURRENTLY idx_knowledge_nodes_embedding_hnsw \
ON knowledge_nodes USING hnsw (embedding vector_cosine_ops) \
WITH (m = 16, ef_construction = 64)"
)
.execute(&mut *conn)
.await?;
Ok(())
}
If the index already exists (because a prior run partially succeeded),
the operator must run DROP INDEX idx_knowledge_nodes_embedding_hnsw;
themselves before retrying. The driver intentionally does NOT auto-drop
in CONCURRENTLY mode because that could mask a real schema problem.
For the default concurrent_index = false path, use plain
CREATE INDEX ... against pool.execute(...); transactions are fine.
dry_run mode
pub async fn dry_run_reembed(
store: &PgMemoryStore,
new_embedder: Arc<dyn Embedder>,
plan: &ReembedPlan,
) -> MemoryStoreResult<DryRunSummary>;
pub struct DryRunSummary {
pub rows_to_update: u64,
pub embedder_batches: u64,
pub estimated_walltime_secs: f64,
pub current_signature: ModelSignature,
pub target_signature: ModelSignature,
pub would_alter_typmod: bool,
}
Behaviour:
SELECT COUNT(*) FROM knowledge_nodes;to getrows_to_update.embedder_batches = ceil(rows_to_update / plan.batch_size).estimated_walltime_secs = rows_to_update / 50.0-- the master plan's 50-rows-per-second baseline for local fastembed. Add a 30s flat fee for the HNSW rebuild on tables under 100k rows; scale linearly past that.would_alter_typmod = current_signature.dimension != target_signature.dimension.- Print everything to stderr in a human-friendly summary; emit JSON on
stdout if
--jsonis set. - Return without writing anything.
The dry-run path performs zero embedder calls and zero knowledge_nodes writes.
It is safe to run against production at any time.
CLI wiring
The clap subcommand surface, extending what 0002f already added:
#[derive(Subcommand)]
#[cfg(feature = "postgres-backend")]
enum MigrateAction {
/// Copy SQLite -> Postgres. Owned by 0002f.
Copy { /* ... see 0002f ... */ },
/// Re-embed all memories in a Postgres backend with a new embedder.
Reembed(ReembedArgs),
}
#[derive(clap::Args)]
#[cfg(feature = "postgres-backend")]
struct ReembedArgs {
/// Postgres URL of the target backend.
#[arg(long)]
postgres_url: String,
/// Embedder model name. Today only `nomic-ai/nomic-embed-text-v1.5`
/// is supported (the FastembedEmbedder default). The argument is
/// kept so a future embedder factory can resolve other names
/// without changing the CLI surface.
#[arg(long)]
model: String,
/// Vector dimension produced by the embedder. Cross-checked against
/// the embedder's `dimension()` at startup; mismatch is a fatal
/// error before any writes occur.
#[arg(long)]
dimension: usize,
/// Embedder + UPDATE batch size. Default 128.
#[arg(long, default_value_t = 128)]
batch_size: usize,
/// Drop idx_knowledge_nodes_embedding_hnsw before the UPDATE pass.
/// Default true.
#[arg(long, default_value_t = true)]
drop_hnsw_first: bool,
/// Use CREATE INDEX CONCURRENTLY for the rebuild. Default false.
#[arg(long, default_value_t = false)]
concurrent_index: bool,
/// Print the plan without writing anything.
#[arg(long, default_value_t = false)]
dry_run: bool,
}
The handler:
async fn run_reembed_cli(args: ReembedArgs) -> anyhow::Result<()> {
let embedder: Arc<dyn Embedder> = resolve_embedder(&args.model)?;
if embedder.dimension() != args.dimension {
anyhow::bail!(
"embedder '{}' produces dimension {}, --dimension was {}",
embedder.model_name(), embedder.dimension(), args.dimension,
);
}
let store = PgMemoryStore::connect(&args.postgres_url, 4).await?;
let plan = ReembedPlan {
batch_size: args.batch_size,
drop_hnsw_first: args.drop_hnsw_first,
concurrent_index: args.concurrent_index,
};
if args.dry_run {
let summary = dry_run_reembed(&store, embedder, &plan).await?;
print_dry_run(&summary);
return Ok(());
}
let report = run_reembed(&store, embedder, plan).await?;
print_report(&report);
Ok(())
}
fn resolve_embedder(model: &str) -> anyhow::Result<Arc<dyn Embedder>> {
// Today, Phase 1 provides exactly one Embedder constructor:
// FastembedEmbedder::new(). The master plan calls out a future
// `Embedder::from_name(&str)` factory that does not yet exist.
// Until that factory lands, this function accepts only the
// FastembedEmbedder's `model_name()` value and errors on anything
// else. Adding a real registry is a follow-up task.
let candidate = FastembedEmbedder::new();
if candidate.model_name() == model {
return Ok(Arc::new(candidate));
}
anyhow::bail!(
"unknown embedder model '{}'. Known: {}",
model,
candidate.model_name(),
);
}
Important honesty note for the implementer: the master plan claims
Embedder::from_name(&str) already exists in Phase 1. As of audit (see
"Audit step" above), it does not. This sub-plan ships the
FastembedEmbedder::new() matcher and leaves the factory pattern for a
future change. Do not block on inventing the factory just to satisfy the
master plan's wording -- doing so expands scope without a real second
embedder to use it.
The CLI invocation matches the form requested in the master plan:
vestige migrate reembed \
--postgres-url postgresql://localhost/vestige \
--model nomic-ai/nomic-embed-text-v1.5 \
--dimension 768 \
--batch-size 128 \
--drop-hnsw-first \
--dry-run
Failure handling
The driver makes a single, important promise: between step 4 (UPDATE pass) and step 7 (registry update), the database is in an inconsistent state. Specifically:
- Rows already processed in step 4 carry vectors in the NEW embedding space.
- Rows not yet processed carry vectors in the OLD embedding space.
- The
embedding_modelregistry still says OLD. - The HNSW index is dropped (if
drop_hnsw_first = true).
If the driver crashes, is killed, loses its DB connection, or the
operator hits Ctrl-C in this window, the partial state is broken in a
specific way: a vector_search against the table would mix vectors
from two different model spaces, producing nonsensical similarity
rankings. The operator MUST NOT serve search until the re-embed
completes.
Recovery procedure (document this loudly in the operator-facing log):
- The CLI log already says, on every batch,
"reembed: wrote batch N (M rows)". The last such log line indicates how far the pass got. - The recovery action is to re-run reembed with the same arguments. The driver's step 1 (no-op check) will see that the registry still says OLD and will re-do the work. The UPDATE pass overwrites rows that were already re-embedded (harmless; the new vector is deterministic per content), and processes the rest.
- Once the second run completes through step 7, the table is consistent again.
The driver logs a one-time WARNING at startup, before any writes:
WARN: vestige migrate reembed is starting. Search results will be
WARN: incorrect until this run completes. Stop the MCP server now if
WARN: it is connected to this database. Press Ctrl-C within 5 seconds
WARN: to abort.
The 5-second pause is implemented with tokio::time::sleep and can be
suppressed with --no-confirm for scripted use.
There is no "resume from row N" feature in this iteration. Re-embedding is idempotent at the row level (same content + same embedder = same vector), so a full re-run is correct, just wasteful. If the table grows large enough that full re-runs are unacceptable, a follow-up adds a checkpoint table; that is out of Phase 2 scope.
Verification
Unit tests (colocated in reembed.rs)
-
reembed_no_op_when_signature_matches-- seed aPgMemoryStorevia testcontainers, register a fake embedder dim=64, callrun_reembedwith the same fake embedder, assert the returnedReembedReport.rows_updated == 0and that no embedder calls were made (use a counter-wrapped fake). -
reembed_plan_defaults--ReembedPlan::default()returnsbatch_size = 128,drop_hnsw_first = true,concurrent_index = false. -
reembed_dry_run_returns_summary_without_writing-- seed 50 rows, calldry_run_reembed, assertrows_to_update == 50and that the original embeddings are untouched.
Integration test (under tests/phase_2/pg_reembed.rs)
Acceptance test that exercises the dimension-change path end to end:
#![cfg(feature = "postgres-backend")]
use std::sync::Arc;
mod common;
use common::test_embedder::{FakeEmbedder, FakeEmbedderConfig};
use common::pg_harness::PgHarness;
#[tokio::test]
async fn reembed_changes_dimension_and_search_still_works() {
let old = Arc::new(FakeEmbedder::new(FakeEmbedderConfig {
name: "fake-old",
dimension: 64,
}));
let harness = PgHarness::start(old.clone()).await.unwrap();
// Seed 100 memories. Each gets a 64-d vector from `old`.
for i in 0..100 {
let content = format!("memory number {i} talks about rust and async");
let vec = old.embed(&content).await.unwrap();
harness.store.insert(/* ... record with embedding = vec ... */).await.unwrap();
}
// Now re-embed with a different fake at dim 128.
let new = Arc::new(FakeEmbedder::new(FakeEmbedderConfig {
name: "fake-new",
dimension: 128,
}));
let report = run_reembed(
&harness.store,
new.clone(),
ReembedPlan::default(),
).await.unwrap();
assert_eq!(report.rows_updated, 100);
// (a) Every row has a 128-d vector.
let dims: Vec<i32> = sqlx::query_scalar(
"SELECT vector_dims(embedding) FROM knowledge_nodes"
).fetch_all(harness.store.pool()).await.unwrap();
assert!(dims.iter().all(|&d| d == 128));
// (b) Registry reflects the new signature.
let sig = harness.store.registered_model().await.unwrap().unwrap();
assert_eq!(sig.name, "fake-new");
assert_eq!(sig.dimension, 128);
// (c) vector_search returns results in the new space.
let probe = new.embed("memory number 5 talks about rust and async").await.unwrap();
let results = harness.store.vector_search(&probe, 10).await.unwrap();
assert!(!results.is_empty());
}
The FakeEmbedder from common/test_embedder.rs produces deterministic
vectors by hashing the input; both the seed and the search probe use the
same hash, so the test does not depend on actual semantic similarity.
Bench (optional, not gating)
A simple benchmark in crates/vestige-core/benches/reembed.rs reports
throughput at 100k rows with FakeEmbedder. Useful for catching
regressions in the UPDATE-pass batching pattern. Not part of CI.
Acceptance criteria
This sub-plan is complete when:
crates/vestige-core/src/storage/postgres/reembed.rsexists and compiles under--features postgres-backend.ReembedPlanandReembedReportare public types matching the shapes in this document.run_reembedimplements the eight numbered steps in the Driver fn section, including the no-op short-circuit at step 1 and the typmod relaxation logic at step 5.dry_run_reembedreturns counts and estimates without writing.- The
vestige migrate reembed ...subcommand is wired throughcrates/vestige-mcp/src/bin/cli.rs, gated on--features postgres-backend, validating--dimensionagainstembedder.dimension(). - The three unit tests pass.
- The
pg_reembed.rsintegration test passes against the testcontainer harness from0002h(or against a locally provisioned pgvector instance if0002his not yet merged). - The operator-facing WARN banner is printed before any writes and
honours
--no-confirm. - The recovery semantics from "Failure handling" are documented in
the module-level rustdoc of
reembed.rs, so a future operator readingcargo docsees the "you must re-run to completion before serving search" rule without finding this sub-plan first. cargo sqlx prepare --workspaceupdates.sqlx/with the new queries; the resulting JSON files are committed.
When all ten items are checked, sub-plan 0002g lands. Master plan
deliverable D9 is satisfied. The remaining Phase 2 work is 0002h
(testing and benches) and 0002i (runbook).