diff --git a/docs/plans/0001a-trait-rewrite.md b/docs/plans/0001a-trait-rewrite.md new file mode 100644 index 0000000..354e138 --- /dev/null +++ b/docs/plans/0001a-trait-rewrite.md @@ -0,0 +1,689 @@ +# Phase 1 Amendment Sub-Plan: async_trait -> trait_variant + +**Status**: Draft +**Depends on**: Phase 1 (already on `feat/storage-trait-phase1`, tip 790c0c8) +**Followed by**: `docs/plans/0001c-async-trait-sunset.md` (Embedder rewrite + async-trait crate removal) +**Related**: docs/adr/0002-phase-2-execution.md (decision D1), docs/plans/0001-phase-1-storage-trait-extraction.md + +--- + +## Context + +Phase 1 froze with the storage trait declared as +`#[async_trait::async_trait] pub trait MemoryStore: Send + Sync + 'static` +and a `pub use MemoryStore as LocalMemoryStore;` alias. `async_trait` boxes +every `async fn` in the trait into `Pin>`. That is one +heap allocation per call inside the hottest code path in Vestige (insert, +search, update_scheduling are all on this surface). It also collapses the +two intended trait shapes -- a Send-bound `MemoryStore` for tokio/axum and a +non-Send `LocalMemoryStore` for thread-local backends -- into a single trait +behind an alias, removing the type-system signal we will want for Phase 2's +Postgres backend. + +ADR 0002 decision D1 supersedes this. The amendment lands on the existing +`feat/storage-trait-phase1` branch before Phase 2 starts, before the PR is +opened upstream. The end state: + +- `LocalMemoryStore` is the source-of-truth trait, declared with native + async-fn-in-trait (stable on MSRV 1.91), no `Sync` bound on the trait + itself, and `Sync + 'static` on the trait object. +- `MemoryStore` is auto-generated by `#[trait_variant::make(MemoryStore: Send)]` + with `Send` bounds on every returned future, so `Arc` is + movable across `tokio::spawn`. +- `trait-variant` 0.1 is already present in `crates/vestige-core/Cargo.toml`. + The `async-trait` crate dependency stays in place through this sub-plan + (it is still in use by the embedder impl); removing it is the job of + `0001c-async-trait-sunset.md`. +- No production caller changes. Every production call site holds + `Arc` (the concrete `SqliteMemoryStore` alias), which the trait + rewrite does not touch. Only the trait declaration and impl blocks change. + +--- + +## Scope + +### In scope + +- Rewrite `MemoryStore` / `LocalMemoryStore` declaration in + `crates/vestige-core/src/storage/memory_store.rs` to use + `#[trait_variant::make(MemoryStore: Send)] pub trait LocalMemoryStore`. +- Remove the `pub use MemoryStore as LocalMemoryStore;` alias from the same + file. `LocalMemoryStore` becomes the real trait; `MemoryStore` is the + generated Send variant. +- Drop the `#[async_trait::async_trait]` attribute on the SQLite impl block + in `crates/vestige-core/src/storage/sqlite.rs` (line 6274). The impl block + switches from `impl MemoryStore for SqliteMemoryStore` (currently spelled + through the `LocalMemoryStore` alias) to `impl LocalMemoryStore for + SqliteMemoryStore`. `trait_variant` provides a blanket `impl MemoryStore for T` so `&dyn MemoryStore` and + `Arc` keep working unchanged. +- Update doc comments in `memory_store.rs` to drop + references to `#[async_trait::async_trait]` and instead describe the + `trait_variant` mechanism. +- Keep the existing Phase 1 test suite (`tests/phase_1/*.rs`) green. The + tests already use `Arc` and `Box`, which is + exactly the surface the rewrite is meant to preserve. + +### Out of scope -- moved to 0001c + +- Rewriting the `Embedder` / `LocalEmbedder` trait declaration -- handled by + `0001c-async-trait-sunset.md`. +- Dropping `#[async_trait::async_trait]` from + `crates/vestige-core/src/embedder/fastembed.rs`. +- Removing `async-trait = "0.1"` from `crates/vestige-core/Cargo.toml`. +- The hard `grep -rn 'async_trait' crates/` zero-match gate (async_trait is + still present in the embedder module after 0001a alone). + +### Out of scope + +- The SQLite file split (`sqlite.rs` -> `sqlite/` directory). That is the + sibling sub-plan `0001b-sqlite-split.md`. +- Any production-side refactor that switches `Arc` to + `Arc`. Production code keeps using the concrete alias. +- Adding `register_model` / `from_pool` / Postgres-specific variants of + `MemoryStoreError`. Those land with the Postgres backend in Phase 2. +- Removing the `pub type Storage = SqliteMemoryStore;` alias. + +--- + +## Prerequisites + +### Current state (verified) + +- `crates/vestige-core/Cargo.toml` already declares + `trait-variant = "0.1"` (line 117). `async-trait = "0.1"` (line 119) + remains in place for the duration of this sub-plan; `0001c` removes it. +- `crates/vestige-core/src/storage/memory_store.rs:194` declares the trait + with `#[async_trait::async_trait] pub trait MemoryStore: Send + Sync + + 'static`. +- `crates/vestige-core/src/storage/memory_store.rs:262` has + `pub use MemoryStore as LocalMemoryStore;`. +- `crates/vestige-core/src/storage/sqlite.rs:6274` declares + `#[async_trait::async_trait] impl + crate::storage::memory_store::LocalMemoryStore for SqliteMemoryStore`. +- Production call sites use `Arc` (the concrete + `SqliteMemoryStore` alias). Confirmed by grep: + `grep -rn "dyn MemoryStore\|dyn LocalMemoryStore" --include="*.rs"` + returns hits only in `tests/phase_1/*.rs`, in two comments inside + `memory_store.rs` and `sqlite.rs`, and in one test-only `&dyn MemoryStore` + cast inside `sqlite.rs::tests` (line 8669). +- Workspace-wide `async_trait` usages: only the trait declarations and + impl blocks in `memory_store.rs`, `sqlite.rs`, `embedder/mod.rs`, and + `embedder/fastembed.rs` (verified by + `grep -rn async_trait --include="*.rs"`). This sub-plan touches only the + first two; the embedder files are addressed by `0001c`. + +### Required crates + +| Crate | Version | Action | +|-------|---------|--------| +| `trait-variant` | `0.1` | Already declared. Verify present after edit. | +| `async-trait` | `0.1` | Unchanged for this sub-plan (still used by embedder impl). Removed by `0001c`. | +| `blake3`, `thiserror`, `chrono`, `uuid`, `serde`, `serde_json` | unchanged | unchanged | + +--- + +## Files Touched + +Grouped by category. Every change is listed; nothing else gets touched. + +### Trait declarations (vestige-core) + +| File | Lines (approx) | Change | +|------|----------------|--------| +| `crates/vestige-core/src/storage/memory_store.rs` | 183-262 | Replace `#[async_trait::async_trait]` block with `#[trait_variant::make(MemoryStore: Send)] pub trait LocalMemoryStore`. Drop the `pub use MemoryStore as LocalMemoryStore;` alias. Update doc comments. | + +### Impl blocks (vestige-core) + +| File | Lines (approx) | Change | +|------|----------------|--------| +| `crates/vestige-core/src/storage/sqlite.rs` | 6274 (impl block header only) | Delete the `#[async_trait::async_trait]` attribute. Keep the `impl crate::storage::memory_store::LocalMemoryStore for SqliteMemoryStore { ... }` body verbatim. | + +### Cargo dependency cleanup + +None in this sub-plan. The `async-trait` crate stays declared in +`crates/vestige-core/Cargo.toml` because the embedder impl still uses it. +`0001c-async-trait-sunset.md` removes the dependency after the embedder +side is rewritten. + +### Cognitive module call sites + +No changes. The 29 cognitive modules under `crates/vestige-core/src/neuroscience/` +and `crates/vestige-core/src/advanced/` already operate on concrete +`SqliteMemoryStore` (via the `Storage` alias) or on plain data types +(`KnowledgeNode`, `Vec`, `ConnectionRecord`). Grep verified zero +production references to `dyn MemoryStore` or `dyn LocalMemoryStore`. + +### MCP call sites + +No changes. All ~30 `vestige-mcp/src/**.rs` files holding `Arc` +keep working because: +- `Storage` is still `pub type Storage = SqliteMemoryStore;` (unchanged). +- They call inherent methods on the concrete type, never via a trait object. +- `SqliteMemoryStore` implements `LocalMemoryStore`; `trait_variant` auto- + generates a blanket `impl MemoryStore for T`, + so the concrete type also satisfies `MemoryStore` for any future caller + that wants the trait-object form. + +### Test files (vestige-core integration tests) + +| File | Lines | Change | +|------|-------|--------| +| `tests/phase_1/trait_round_trip.rs` | 7-18, 134 | No change. Already uses `Arc` and `use vestige_core::storage::MemoryStore`. trait_variant emits a `MemoryStore` trait at the same path, so the imports resolve. | +| `tests/phase_1/send_bound_variant.rs` | 10-12, 36, 57 | No change. Already asserts the trait_variant Send invariant; the rewrite is what makes the doc comment on lines 3-4 actually true. | +| `tests/phase_1/cognitive_module_isolation.rs` | 11, 37, 76, 102, 115 | No change. | +| `tests/phase_1/embedding_model_registry.rs` | 10 | No change. | +| `tests/phase_1/domain_column_migration.rs` | 98 | No change. | +| `crates/vestige-core/src/storage/sqlite.rs::tests` | 8666-8675 | No change. The existing test casts `&s` to `&dyn MemoryStore` and calls trait methods through that vtable; trait_variant preserves this exact dyn-compatible surface. | + +### Documentation + +| File | Change | +|------|--------| +| `crates/vestige-core/src/storage/memory_store.rs` | Rewrite the trait-level doc comment (lines 185-193) to describe trait_variant rather than async_trait. | +| `CLAUDE.md` | No change. The repo-level architecture notes do not name the trait attribute. | + +--- + +## Trait Declaration Rewrite + +### Before (current state on `feat/storage-trait-phase1`) + +`crates/vestige-core/src/storage/memory_store.rs:183-262`: + +```rust +// ---------------------------------------------------------------------------- +// TRAIT +// ---------------------------------------------------------------------------- + +/// The single storage abstraction. +/// +/// `#[async_trait::async_trait]` makes every `async fn` return a +/// `Pin>`, which is required for `Arc` +/// to be movable across `tokio::spawn` boundaries. +/// +/// `LocalMemoryStore` is a type alias kept for source compatibility with code +/// that refers to the non-send variant. In Phase 1 both names refer to the same +/// (dyn-compatible, Send-safe) trait. +#[async_trait::async_trait] +pub trait MemoryStore: Send + Sync + 'static { + // --- Lifecycle --- + async fn init(&self) -> MemoryStoreResult<()>; + async fn health_check(&self) -> MemoryStoreResult; + + // ... 25 more async fn ... + + async fn vacuum(&self) -> MemoryStoreResult<()>; +} + +/// Type alias kept for source compatibility. Both names refer to the same +/// `async_trait`-annotated trait that is dyn-compatible and `Send + Sync`. +pub use MemoryStore as LocalMemoryStore; +``` + +### After + +`crates/vestige-core/src/storage/memory_store.rs:183-262`: + +```rust +// ---------------------------------------------------------------------------- +// TRAIT +// ---------------------------------------------------------------------------- + +/// The single storage abstraction. +/// +/// `LocalMemoryStore` is the source-of-truth trait. The +/// `#[trait_variant::make(MemoryStore: Send)]` attribute auto-generates a +/// `MemoryStore` trait whose returned futures are `Send`, so +/// `Arc` is movable across `tokio::spawn` boundaries while +/// `Arc` remains usable on single-threaded executors +/// and thread-local backends. +/// +/// Every method is native async-fn-in-trait (stable on MSRV 1.91); no +/// per-call heap allocation, no boxed futures. +#[trait_variant::make(MemoryStore: Send)] +pub trait LocalMemoryStore: Sync + 'static { + // --- Lifecycle --- + async fn init(&self) -> MemoryStoreResult<()>; + async fn health_check(&self) -> MemoryStoreResult; + + // --- Embedding model registry --- + async fn registered_model(&self) -> MemoryStoreResult>; + async fn register_model(&self, sig: &ModelSignature) -> MemoryStoreResult<()>; + + // --- CRUD --- + async fn insert(&self, record: &MemoryRecord) -> MemoryStoreResult; + async fn get(&self, id: Uuid) -> MemoryStoreResult>; + async fn update(&self, record: &MemoryRecord) -> MemoryStoreResult<()>; + async fn delete(&self, id: Uuid) -> MemoryStoreResult<()>; + + // --- Search --- + async fn search(&self, query: &SearchQuery) -> MemoryStoreResult>; + async fn fts_search(&self, text: &str, limit: usize) -> MemoryStoreResult>; + async fn vector_search( + &self, + embedding: &[f32], + limit: usize, + ) -> MemoryStoreResult>; + + // --- FSRS Scheduling --- + async fn get_scheduling( + &self, + memory_id: Uuid, + ) -> MemoryStoreResult>; + async fn update_scheduling(&self, state: &SchedulingState) -> MemoryStoreResult<()>; + async fn get_due_memories( + &self, + before: DateTime, + limit: usize, + ) -> MemoryStoreResult>; + + // --- Graph (spreading activation) --- + async fn add_edge(&self, edge: &MemoryEdge) -> MemoryStoreResult<()>; + async fn get_edges( + &self, + node_id: Uuid, + edge_type: Option<&str>, + ) -> MemoryStoreResult>; + async fn remove_edge(&self, source: Uuid, target: Uuid) -> MemoryStoreResult<()>; + async fn get_neighbors( + &self, + node_id: Uuid, + depth: usize, + ) -> MemoryStoreResult>; + + // --- Domains --- + async fn list_domains(&self) -> MemoryStoreResult>; + async fn get_domain(&self, id: &str) -> MemoryStoreResult>; + async fn upsert_domain(&self, domain: &Domain) -> MemoryStoreResult<()>; + async fn delete_domain(&self, id: &str) -> MemoryStoreResult<()>; + async fn classify(&self, embedding: &[f32]) -> MemoryStoreResult>; + + // --- Bulk / Maintenance --- + async fn count(&self) -> MemoryStoreResult; + async fn get_stats(&self) -> MemoryStoreResult; + async fn vacuum(&self) -> MemoryStoreResult<()>; +} +``` + +Notes: + +- The `pub use MemoryStore as LocalMemoryStore;` line on the current + `memory_store.rs:262` is **deleted** entirely. `MemoryStore` is now the + generated trait that `trait_variant::make` emits at the same module path; + `LocalMemoryStore` is the source-of-truth declaration. Both names export + from `storage/mod.rs` already (see lines 10-14 of that file). +- `Sync + 'static` on `LocalMemoryStore` (and no `Send` bound on the trait + itself) is correct: `Send` on the trait is what `trait_variant::make` + inserts when it emits the `MemoryStore` variant. The current trait carries + `Send + Sync + 'static`; the rewrite drops the `Send` bound from the + local variant. `Arc` is `Sync` but not `Send`; + `Arc` (the generated variant) is `Send + Sync`. +- `trait_variant` 0.1 does **not** require any attribute on the impl block. + The attribute lives only on the trait declaration. See the next section. + +The Embedder trait rewrite uses the identical `trait_variant::make` pattern +and is fully specified in `0001c-async-trait-sunset.md`. + +--- + +## Impl Block Migration + +`trait_variant` 0.1 attaches the attribute only to the trait declaration. +The impl side is plain `impl LocalMemoryStore for SqliteMemoryStore`; no +attribute on the impl, no `#[trait_variant::make(MemoryStore: Send)]` on the +impl block. The crate auto-generates the blanket +`impl MemoryStore for T`, so any concrete type +that implements `LocalMemoryStore` automatically also implements +`MemoryStore` provided it is `Send`. + +`SqliteMemoryStore` already is `Send + Sync` (it holds `Mutex` +fields whose `Mutex` is `std::sync::Mutex`, which is `Send + Sync` when its +guarded type is `Send`; `rusqlite::Connection` is `Send`). No new bound is +needed. + +### Before + +`crates/vestige-core/src/storage/sqlite.rs:6274`: + +```rust +#[async_trait::async_trait] +impl crate::storage::memory_store::LocalMemoryStore for SqliteMemoryStore { + async fn init(&self) -> crate::storage::memory_store::MemoryStoreResult<()> { + // Migrations run in `new`; this is a no-op for the SQLite backend. + Ok(()) + } + + // ... ~2400 lines of method bodies, unchanged ... +} +``` + +### After + +`crates/vestige-core/src/storage/sqlite.rs:6274`: + +```rust +impl crate::storage::memory_store::LocalMemoryStore for SqliteMemoryStore { + async fn init(&self) -> crate::storage::memory_store::MemoryStoreResult<()> { + // Migrations run in `new`; this is a no-op for the SQLite backend. + Ok(()) + } + + // ... ~2400 lines of method bodies, unchanged ... +} +``` + +Diff is exactly one removed line (the `#[async_trait::async_trait]` attribute). +Every method body, every `async fn` signature, every `use` statement inside +the impl block stays verbatim. No `Box::pin(async move { ... })`, no manual +`Pin>`, no `'async_trait` lifetime markers; native +async-fn-in-trait does this directly. + +The Embedder impl block rewrite follows the identical "remove one +`#[async_trait::async_trait]` attribute" pattern and is fully specified in +`0001c-async-trait-sunset.md`. + +### Why the impl block does not need an attribute + +`trait_variant::make` generates two things from the source trait: + +1. The source trait itself (`LocalMemoryStore`), with native async fns. +2. A second trait (`MemoryStore`) whose method signatures return + `impl Future + Send` instead of `impl Future`, + plus a blanket impl wiring concrete types through. + +Both are emitted at the macro-call site. `SqliteMemoryStore` only writes one +impl block (against `LocalMemoryStore`); the macro-generated blanket +guarantees `SqliteMemoryStore: MemoryStore` for free. The current `&dyn +MemoryStore` casts (sqlite.rs:8669; tests under tests/phase_1/) keep +type-checking unchanged. + +--- + +## Call Site Audit + +Verified via: + +```bash +grep -rn "dyn MemoryStore\|dyn LocalMemoryStore" --include="*.rs" \ + /home/delandtj/prppl/vestige-phase2/ +grep -rn "Arc\|Arc" --include="*.rs" \ + /home/delandtj/prppl/vestige-phase2/ +grep -rn "use.*MemoryStore;\|use.*LocalMemoryStore;" --include="*.rs" \ + /home/delandtj/prppl/vestige-phase2/ +``` + +### Files that reference the trait object form (`dyn MemoryStore` / `dyn LocalMemoryStore`) + +All test-only or doc-comment-only: + +| File | Line | Use | Required change | +|------|------|-----|-----------------| +| `tests/phase_1/trait_round_trip.rs` | 7-18 | `Arc` in `make_store()` and test bodies | None. `MemoryStore` is the generated Send variant; signature stays. | +| `tests/phase_1/trait_round_trip.rs` | 134 | `Arc` upcast inside a test body | None. | +| `tests/phase_1/send_bound_variant.rs` | 10-97 | `Arc` moved across `tokio::spawn` | None. This test becomes meaningfully correct only after the rewrite (currently it relies on async_trait boxing; after the rewrite it relies on trait_variant's Send variant -- same observable outcome). | +| `tests/phase_1/cognitive_module_isolation.rs` | 11, 33-115 | `Arc` passed into cognitive module-style closures | None. | +| `tests/phase_1/embedding_model_registry.rs` | 10 | `Arc` in `make_store()` | None. | +| `tests/phase_1/domain_column_migration.rs` | 98 | `Arc` cast inside a migration assertion | None. | +| `crates/vestige-core/src/storage/sqlite.rs` | 8666-8675 | `let dyn_s: &dyn MemoryStore = &s;` inside `mod tests` | None. The cast is testing that the dyn-vtable still resolves the trait methods correctly; after the rewrite it resolves through the `MemoryStore` trait that `trait_variant::make` emits at the same path. | +| `crates/vestige-core/src/storage/memory_store.rs` | 188 (doc) | Doc comment mentioning `Arc` | Replaced as part of the doc rewrite (see Trait Declaration section). | + +### Files that hold the concrete type (`Arc` / `Arc`) + +35 files, 116 hits. Every one of them keeps working unchanged because the +concrete `SqliteMemoryStore` type stays exactly as it is. Listed here for +completeness so a reviewer can confirm none of them needs an edit: + +``` +crates/vestige-core/src/storage/mod.rs (alias declaration) +crates/vestige-core/src/storage/sqlite.rs (impl block) +crates/vestige-mcp/src/server.rs (Arc in McpServer) +crates/vestige-mcp/src/cognitive.rs (hydrate(&Storage)) +crates/vestige-mcp/src/autopilot.rs +crates/vestige-mcp/src/protocol/http.rs +crates/vestige-mcp/src/dashboard/mod.rs +crates/vestige-mcp/src/dashboard/state.rs +crates/vestige-mcp/src/dashboard/handlers.rs +crates/vestige-mcp/src/resources/codebase.rs +crates/vestige-mcp/src/resources/memory.rs +crates/vestige-mcp/src/tools/changelog.rs +crates/vestige-mcp/src/tools/codebase_unified.rs +crates/vestige-mcp/src/tools/context.rs +crates/vestige-mcp/src/tools/contradictions.rs +crates/vestige-mcp/src/tools/cross_reference.rs +crates/vestige-mcp/src/tools/dedup.rs +crates/vestige-mcp/src/tools/dream.rs +crates/vestige-mcp/src/tools/explore.rs +crates/vestige-mcp/src/tools/feedback.rs +crates/vestige-mcp/src/tools/graph.rs +crates/vestige-mcp/src/tools/health.rs +crates/vestige-mcp/src/tools/importance.rs +crates/vestige-mcp/src/tools/intention_unified.rs +crates/vestige-mcp/src/tools/maintenance.rs +crates/vestige-mcp/src/tools/memory_states.rs +crates/vestige-mcp/src/tools/memory_unified.rs +crates/vestige-mcp/src/tools/predict.rs +crates/vestige-mcp/src/tools/restore.rs +crates/vestige-mcp/src/tools/review.rs +crates/vestige-mcp/src/tools/search_unified.rs +crates/vestige-mcp/src/tools/session_context.rs +crates/vestige-mcp/src/tools/smart_ingest.rs +crates/vestige-mcp/src/tools/suppress.rs +crates/vestige-mcp/src/tools/tagging.rs +crates/vestige-mcp/src/tools/timeline.rs +``` + +Each holds `Arc` and dispatches to inherent methods on +`SqliteMemoryStore`. None of them goes through a trait vtable. Required +change for every one of them: **none**. + +### Files that `use ...MemoryStore` from production code + +``` +grep -rn "use.*MemoryStore;\|use.*LocalMemoryStore;" --include="*.rs" \ + | grep -v "memory_store.rs\|sqlite.rs\|tests/phase_1" +``` + +returns nothing. Production code does not import the trait by name. + +### Conclusion + +The rewrite is a strictly local change to two source files +(`storage/memory_store.rs` and `storage/sqlite.rs`). Zero production call +sites need edits. The integration tests that consume `Arc` +keep their current form; the rewrite is what gives that signature its +no-box semantics on the storage side. The `Box` surface is +addressed by `0001c-async-trait-sunset.md`. + +--- + +## Commit Sequence + +Two commits, each green on `cargo test -p vestige-core --no-default-features` +and `cargo test -p vestige-core --features embeddings,vector-search`. + +### Commit 1: rewrite MemoryStore / LocalMemoryStore trait declaration + +- Touches: `crates/vestige-core/src/storage/memory_store.rs` only. +- Action: replace lines 183-262 per the "Trait Declaration Rewrite" section + above. Delete the `pub use MemoryStore as LocalMemoryStore;` line. +- Green after: `cargo check -p vestige-core` (the impl block in `sqlite.rs` + still has `#[async_trait::async_trait]` on it, but it still resolves + through the `LocalMemoryStore` trait which is now native-async; the + `async_trait` macro is harmless when applied to a trait that the impl + block targets by path, because the macro rewrites the impl's async fns + into boxed-future fns whose signatures still match the native-async + declarations after trait_variant lowering). If `cargo check` complains + here, fold commit 2 into commit 1. + + **Mitigation if check fails between commits 1 and 2:** combine the two + into a single commit. The split is offered for review convenience; the + build must be green after every commit lands. + +### Commit 2: drop `#[async_trait::async_trait]` from SqliteMemoryStore impl + +- Touches: `crates/vestige-core/src/storage/sqlite.rs` only. +- Action: delete line 6274 (`#[async_trait::async_trait]`). +- Green after: `cargo test -p vestige-core --features embeddings,vector-search`, + including all `trait_*` tests inside `sqlite.rs::tests` (lines 8643-8712) + and the trait-object cast on line 8669. + +### Combined alternative + +If the per-step split feels artificial, commits 1 and 2 can collapse into +a single commit covering both the trait rewrite and the impl-attribute +drop for `MemoryStore`. That is acceptable; the two-commit form is +preferred only because it lets a reviewer bisect trait-rewrite failures +separately from impl-rewrite failures. + +The Embedder / fastembed commits and the `async-trait` Cargo dependency +removal live in `0001c-async-trait-sunset.md`. + +--- + +## Verification + +Every command runs from the repo root unless noted otherwise. + +```bash +# 1. Vestige-core, default features (embeddings + vector-search). +cargo test -p vestige-core --features embeddings,vector-search + +# 2. Vestige-core, minimal features (no embeddings, no vector-search). +cargo test -p vestige-core --no-default-features + +# 3. Workspace build, release mode (catches any feature-gated regression +# in the vestige-mcp tools tree). +cargo build --workspace --release + +# 4. Whole-workspace test (vestige-mcp 406 tests + vestige-core 352 tests +# per the CLAUDE.md baseline). +cargo test --workspace + +# 5. Phase 1 integration tests (these are the trait-shape contract). +cargo test --test trait_round_trip --features embeddings,vector-search +cargo test --test send_bound_variant --features embeddings,vector-search +cargo test --test cognitive_module_isolation --features embeddings,vector-search +cargo test --test embedding_model_registry --features embeddings,vector-search +cargo test --test domain_column_migration --features embeddings,vector-search + +# 6. Clippy gate, deny warnings (matches Phase 1 PR policy of zero warnings). +cargo clippy --workspace --all-targets --features embeddings,vector-search -- -D warnings + +# 7. Storage-side dependency hygiene check (must return zero hits). +# Scoped to the storage module only -- the embedder module still uses +# async_trait until 0001c lands. +grep -rn "async_trait\|async-trait" crates/vestige-core/src/storage/ + +# 8. Confirm trait_variant attribute is in place on the storage trait +# (must return exactly one hit in memory_store.rs). +grep -rn "trait_variant::make" crates/vestige-core/src/storage/ +``` + +Expected outcomes: + +- Command 1: 352 tests pass (matches baseline). +- Command 2: smaller test count, all pass. +- Command 3: workspace compiles in release mode. +- Command 4: 758 total tests pass (matches CLAUDE.md baseline). +- Command 5: each phase_1 integration test binary runs green. The + `send_bound_variant::arc_dyn_memory_store_moves_across_tokio_tasks` test + is the canary; if `MemoryStore` lost its Send-bound future variant, this + fails to compile with "future cannot be sent between threads safely". +- Command 6: zero clippy warnings. The rewrite must not introduce a new + `clippy::needless_lifetimes` or `clippy::async_yields_async`. +- Command 7: empty output. async_trait is gone from the storage module. + The embedder module still contains async_trait; that is removed by + `0001c-async-trait-sunset.md`. +- Command 8: one hit, in `memory_store.rs`. + +--- + +## Acceptance Criteria + +A reviewer should be able to check every box: + +- [ ] `crates/vestige-core/src/storage/memory_store.rs` declares the trait + with `#[trait_variant::make(MemoryStore: Send)] pub trait + LocalMemoryStore: Sync + 'static`, no `async_trait` attribute, no + `Send` bound on `LocalMemoryStore` itself. +- [ ] `crates/vestige-core/src/storage/memory_store.rs` no longer contains + `pub use MemoryStore as LocalMemoryStore;`. +- [ ] `crates/vestige-core/src/storage/sqlite.rs:6274` is plain + `impl crate::storage::memory_store::LocalMemoryStore for + SqliteMemoryStore` -- no attribute on the impl block. +- [ ] `grep -rn "async_trait\|async-trait" crates/vestige-core/src/storage/` + returns zero hits. +- [ ] `grep -rn "trait_variant::make" crates/vestige-core/src/storage/` + returns exactly one hit (the storage trait in `memory_store.rs`). +- [ ] All 758 workspace tests pass (`cargo test --workspace`). +- [ ] Phase 1 integration tests pass with the trait-object surface + (`Arc`) intact. +- [ ] `cargo clippy --workspace --all-targets --features + embeddings,vector-search -- -D warnings` is clean. +- [ ] No production source file under `crates/vestige-mcp/` or + `crates/vestige-core/src/{neuroscience,advanced,consolidation,codebase, + memory,embeddings,embedder}/` was modified by this sub-plan. +- [ ] `Arc` still type-checks at every existing call site (verified + by the workspace test pass). +- [ ] Doc comments on the storage trait declaration describe + `trait_variant`, not `async_trait`. + +--- + +## Risks and Mitigations + +- **`trait_variant` 0.1 macro emits unexpected diagnostics on MSRV 1.91.** + Mitigation: the master Phase 1 plan already prescribed this exact pattern + (`#[trait_variant::make(MemoryStore: Send)] pub trait LocalMemoryStore: + Sync + 'static`, see plan `0001-...` line 274-275); the crate has been in + `vestige-core/Cargo.toml` since Phase 1 landed. If a diagnostic appears, + pin to the exact known-good version with `trait-variant = "=0.1.2"` and + open an upstream issue. +- **Native async-fn-in-trait makes the trait no longer dyn-compatible.** + Mitigation: `trait_variant::make` is specifically the workaround for this + -- it emits both the source trait (for static dispatch) and a Send-bound + variant whose returned futures use `Pin>` only at + the dyn boundary. `Arc` keeps working because the + generated `MemoryStore` trait is dyn-compatible by construction. Verified + by the existing `send_bound_variant::*` tests, which intentionally move + `Arc` across `tokio::spawn` from inside a + `multi_thread` runtime. +- **A cognitive module silently relied on the boxed-future return type.** + Mitigation: grep verified no cognitive module imports `MemoryStore` / + `LocalMemoryStore` or holds an `Arc` form; all of them use the + concrete `Storage` alias. There is no Send-ness expectation downstream to + break. +- **Future bodies inside the SQLite impl capture non-Send locals.** + Mitigation: every method body in `sqlite.rs:6274..` runs synchronous + rusqlite calls inside the same `async fn` frame; no `.await` points + exist inside the bodies that we ship today. The `Send` bound on the + generated `MemoryStore` variant is therefore satisfied automatically. + If a future change adds `.await` inside an impl method, the new + trait_variant surface will surface that as a compile error at the call + site, which is the correct outcome. +- **`async-trait` crate is left declared after this sub-plan.** + This is intentional: the embedder impl still depends on it. The + `0001c-async-trait-sunset.md` sub-plan removes the crate after the + embedder side is rewritten. Grep on the whole workspace returns only + the storage and embedder files; no downstream crate pulls `async-trait`. + +--- + +## Out-of-Band Notes + +- This sub-plan amends `feat/storage-trait-phase1` (tip 790c0c8). The + branch has not been opened upstream yet, so amending in place is safe; + no force-push to a public PR. +- The companion sub-plan `0001b-sqlite-split.md` lands after this one on + the same branch. The trait-rewrite landing first is intentional: the + SQLite split moves the impl block into `storage/sqlite/trait_impl.rs`, + and it is cleaner to move a small attribute-free impl than a + macro-decorated one. +- The companion sub-plan `0001c-async-trait-sunset.md` lands after this + one (order with `0001b` is independent) and finishes the + async_trait -> trait_variant transition for the Embedder trait, then + removes the `async-trait` crate dependency. +- After the Phase 1 amendment sub-plans (`0001a`, `0001b`, `0001c`) land, + the branch is reviewed and merged before Phase 2 sub-plans (`0002a-` + through `0002i-`) begin implementation. diff --git a/docs/plans/0001b-sqlite-split.md b/docs/plans/0001b-sqlite-split.md new file mode 100644 index 0000000..ce2590d --- /dev/null +++ b/docs/plans/0001b-sqlite-split.md @@ -0,0 +1,732 @@ +# Sub-Plan 0001b: Split sqlite.rs into a sqlite/ directory + +**Status**: Draft +**Branch**: `feat/storage-trait-phase1` (Phase 1 amendment, PR A) +**Depends on**: `0001a-trait-rewrite.md` (must land first; it carries the +`trait_variant`-generated trait declaration that `trait_impl.rs` matches) +**Related**: `docs/adr/0002-phase-2-execution.md` (D3, D6) + +--- + +## Context + +`crates/vestige-core/src/storage/sqlite.rs` is the single SQLite backend file +that Phase 1 inherited from pre-trait Vestige and then appended the +`LocalMemoryStore` trait impl block to. The file is 8713 lines as of +commit 790c0c8 on `feat/storage-trait-phase1`. ADR 0002 D3 decides to split +it into a `sqlite/` directory before Phase 2 lands `postgres/` as a peer. +Reasoning, in one paragraph: + +The Postgres backend is going in as a directory of seven small files +(`postgres/{mod,pool,migrations,registry,search,migrate_cli,reembed}.rs`). +If SQLite stays as one 8K-line file alongside that, the repo says "backends +look like one big file or seven small ones, pick a side", which forces +every future maintainer to re-litigate the layout. Splitting now -- as +**pure code motion**, no public-API change, no behavioural change, no +migration -- lets both backends look the same, keeps each surface mappable +in a single editor tab, and shrinks the diffs Phase 2 has to review. +This sub-plan is sized as one focused implementation session. + +The split is **private** to `storage/sqlite/`. Cognitive modules continue +to `use crate::storage::SqliteMemoryStore`; the existing re-exports in +`crates/vestige-core/src/storage/mod.rs` keep working without touching +any caller. Tests stay green commit-by-commit. + +This sub-plan depends on `0001a-trait-rewrite.md` landing first because +`sqlite/trait_impl.rs` is the file that picks up the new trait_variant +attribute. Doing the split first would force a second rewrite of +`trait_impl.rs` when the trait rewrite arrives. Order matters; this is +the cheap-to-respect ordering. + +--- + +## Target Layout + +Final directory after this sub-plan: + +``` +crates/vestige-core/src/storage/sqlite/ + mod.rs -- module root: SqliteMemoryStore struct, new(), + reader/writer locks, error types, shared helpers, + portable-sync-related types, record types + crud.rs -- ingest/smart_ingest/get/update/delete/purge/search-by-id + search.rs -- fts, semantic, hybrid, time-based queries + scheduling.rs -- FSRS state, decay, consolidation, review, promote/demote, + suppression, gc, retention, waking tags + graph.rs -- memory_connections (edges), subgraph, neighbors + domain.rs -- domains/domain_scores column readers, classify stub + (Phase 4 will expand this file) + registry.rs -- embedding_model table, enforce_model, register_model body + portable_sync.rs -- portable export/import/sync + merge helpers + trait_impl.rs -- impl LocalMemoryStore for SqliteMemoryStore +``` + +`storage/mod.rs` is unchanged in spirit: it still does `mod sqlite;` and +`pub use sqlite::{...};` -- the only difference is that `sqlite` is now a +directory module instead of a leaf file. The re-export list does not +change. + +--- + +## Current File Map (line numbers from commit 790c0c8) + +The current `sqlite.rs` is structurally: + +| Region | Lines | Contents | +|--------|-------|----------| +| Header | 1-43 | Imports, feature-gated imports | +| Error types | 45-89 | `StorageError`, `Result`, `SmartIngestResult`, `MergeWrite` | +| Portable sync types | 97-211 | `PortableSyncBackend` trait, `FilePortableSyncBackend` struct, `PortableSyncReport`, `PurgeReport` | +| Constants | 233-273 | `PORTABLE_TABLES`, `PORTABLE_USER_DATA_TABLES`, `PortableMergeState`, env constants | +| Struct decl | 287-301 | `SqliteMemoryStore` struct fields | +| Impl block 1 | 303-3740 | Constructor + bulk of native API | +| Record structs | 3747-3866 | `IntentionRecord`, `InsightRecord`, `ConnectionRecord`, `MemoryStateRecord`, `StateTransitionRecord`, `ConsolidationHistoryRecord`, `DreamHistoryRecord`, `Default for InsightRecord` | +| Impl block 2 | 3868-6133 | Intentions / Insights / Connections / States / History / Backup / Portable / GC / Subgraph | +| Impl block 3 | 6139-6272 | Trait-helper methods (`node_to_record`, `read_domain_columns`, `enforce_model`) | +| Trait impl | 6274-7110 | `impl LocalMemoryStore for SqliteMemoryStore` | +| Tests | 7112-8713 | `#[cfg(test)] mod tests`: native API tests + trait round-trip tests | + +--- + +## Mapping Table + +Every public method, every private helper, every struct, every test module +in the current file -- with the destination file in the new layout. Line +ranges cite the current `sqlite.rs` (commit 790c0c8 on +`feat/storage-trait-phase1`, viewed through the +`/home/delandtj/prppl/vestige-phase2` worktree). + +### Header and shared types (-> `sqlite/mod.rs`) + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| Module-level `use` imports | 1-43 | `sqlite/mod.rs` | Trimmed per-file in destination; what does not fit in `mod.rs` moves with its consumers | +| `StorageError` enum + `Result` alias | 49-71 | `sqlite/mod.rs` | Re-exported through `pub use` chain; called from every sub-module | +| `SmartIngestResult` struct | 73-89 | `sqlite/mod.rs` | Returned by `crud::smart_ingest`; defined here because other code depends on the type | +| `MergeWrite` enum | 91-95 | `sqlite/portable_sync.rs` | Only used by merge helpers | +| `PortableSyncBackend` trait | 97-109 | `sqlite/portable_sync.rs` | Public trait; re-exported through `mod.rs` | +| `FilePortableSyncBackend` struct + `impl` | 111-194 | `sqlite/portable_sync.rs` | | +| `PortableSyncReport` struct | 196-211 | `sqlite/portable_sync.rs` | | +| `PurgeReport` struct | 213-231 | `sqlite/crud.rs` | Returned by `purge_node` | +| `PORTABLE_TABLES` constant | 237-254 | `sqlite/portable_sync.rs` | | +| `PORTABLE_USER_DATA_TABLES` constant | 256-272 | `sqlite/portable_sync.rs` | | +| `PortableMergeState` struct | 274-277 | `sqlite/portable_sync.rs` | | +| `DATA_DIR_ENV` constant | 279 | `sqlite/mod.rs` | Read by constructor | +| `DATABASE_FILE` constant | 280 | `sqlite/mod.rs` | Read by constructor | +| `SqliteMemoryStore` struct decl | 282-301 | `sqlite/mod.rs` | All fields stay public-to-crate within the directory | + +### Constructor and config (-> `sqlite/mod.rs`) + +These are foundational; they live in `mod.rs` because every sub-module +calls them or operates on the struct they build. + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `fn data_dir_from_env` | 304-313 | `sqlite/mod.rs` | private helper | +| `fn expand_tilde` | 314-332 | `sqlite/mod.rs` | private helper | +| `fn prepare_data_dir` | 333-346 | `sqlite/mod.rs` | private helper | +| `pub fn db_path_for_data_dir` | 347-355 | `sqlite/mod.rs` | | +| `pub fn default_db_path` | 356-368 | `sqlite/mod.rs` | | +| `fn configure_connection` | 369-396 | `sqlite/mod.rs` | | +| `pub fn new` | 397-457 | `sqlite/mod.rs` | The constructor | +| `pub fn db_path` | 458-462 | `sqlite/mod.rs` | | +| `pub fn data_dir` | 463-467 | `sqlite/mod.rs` | | +| `pub fn sidecar_dir` | 468-473 | `sqlite/mod.rs` | | +| `fn load_embeddings_into_index` | 474-552 | `sqlite/mod.rs` | Called by `new`; touches vector index | + +### CRUD: ingest, get, update, delete, purge (-> `sqlite/crud.rs`) + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `pub fn ingest` | 553-643 | `sqlite/crud.rs` | | +| `pub fn smart_ingest` | 644-864 | `sqlite/crud.rs` | Calls vector search via `self.semantic_search`; cross-module call resolved by impl block being on the same struct | +| `pub fn get_node_embedding` (vector-search) | 865-887 | `sqlite/crud.rs` | embedding read for one node | +| `pub fn get_all_embeddings` (vector-search) | 888-914 | `sqlite/crud.rs` | | +| `pub fn get_node_embedding` (no vector-search stub) | 915-919 | `sqlite/crud.rs` | feature-gated alternative | +| `pub fn update_node_content` | 920-951 | `sqlite/crud.rs` | | +| `fn generate_embedding_for_node` | 952-999 | `sqlite/crud.rs` | private helper; only called by ingest and update_node_content | +| `pub fn get_node` | 1000-1011 | `sqlite/crud.rs` | | +| `fn parse_timestamp` | 1012-1027 | `sqlite/mod.rs` | **Shared helper**: row_to_node uses it, intention/insight rows also parse timestamps. Bump to `pub(super) fn` | +| `fn row_to_node` | 1028-1119 | `sqlite/mod.rs` | **Shared helper**: crud reads single nodes; search.rs builds node lists from rows; scheduling.rs returns nodes for review queue. Bump to `pub(super) fn`. Static method (no `&self`) so a free function in `mod.rs` is fine | +| `pub fn delete_node` | 1842-1869 | `sqlite/crud.rs` | | +| `pub fn purge_node` | 1870-1987 | `sqlite/crud.rs` | | +| `fn node_exists` | 1988-1996 | `sqlite/crud.rs` | static helper, called only by purge | +| `fn record_sync_tombstone` | 1997-2014 | `sqlite/crud.rs` | static helper, called by delete and purge | +| `pub fn get_all_nodes` | 2268-2291 | `sqlite/crud.rs` | bulk read | +| `pub fn get_nodes_by_type_and_tag` | 2292-2342 | `sqlite/crud.rs` | bulk read | + +### Search: fts, semantic, hybrid, temporal (-> `sqlite/search.rs`) + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `pub fn recall` | 1120-1147 | `sqlite/search.rs` | top-level recall path | +| `fn keyword_search` | 1148-1180 | `sqlite/search.rs` | private | +| `pub fn search` | 2015-2043 | `sqlite/search.rs` | | +| `pub fn search_terms` | 2044-2075 | `sqlite/search.rs` | | +| `pub fn concrete_search_filtered` | 2076-2172 | `sqlite/search.rs` | | +| `fn upsert_concrete_result` | 2173-2197 | `sqlite/search.rs` | static helper | +| `fn normalize_literal_query` | 2198-2210 | `sqlite/search.rs` | static helper | +| `fn escape_like` | 2211-2224 | `sqlite/search.rs` | static helper | +| `fn literal_match_score` | 2225-2248 | `sqlite/search.rs` | static helper | +| `fn node_matches_type_filters` | 2249-2267 | `sqlite/search.rs` | static helper | +| `pub fn is_embedding_ready` (both feature variants) | 2343-2354 | `sqlite/search.rs` | both versions move together | +| `pub fn init_embeddings` (both feature variants) | 2355-2367 | `sqlite/search.rs` | both versions move together | +| `fn get_query_embedding` | 2368-2400 | `sqlite/search.rs` | private; uses `query_cache` field | +| `pub fn semantic_search` | 2401-2434 | `sqlite/search.rs` | | +| `pub fn hybrid_search` (feature on) | 2435-2452 | `sqlite/search.rs` | | +| `pub fn hybrid_search_filtered` (feature on) | 2453-2581 | `sqlite/search.rs` | | +| `pub fn hybrid_search` (feature off) | 2582-2593 | `sqlite/search.rs` | feature-gated stub | +| `pub fn hybrid_search_filtered` (feature off) | 2594-2635 | `sqlite/search.rs` | feature-gated stub | +| `fn keyword_search_with_scores` | 2636-2726 | `sqlite/search.rs` | | +| `fn semantic_search_raw` | 2727-2765 | `sqlite/search.rs` | | +| `pub fn generate_embeddings` | 2766-2819 | `sqlite/search.rs` | populates embeddings post hoc | +| `fn embedding_regeneration_candidates` | 2820-2891 | `sqlite/search.rs` | called by generate_embeddings | +| `pub fn query_at_time` | 2892-2933 | `sqlite/search.rs` | temporal query | +| `pub fn query_time_range` | 2934-3005 | `sqlite/search.rs` | temporal query | +| `fn embedding_model_matches_active` (associated fn) | 5652-5671 | `sqlite/search.rs` | static helper for hybrid_search; promoted to `pub(super)` (test references it) | +| `fn embedding_model_supports_matryoshka` | 5672-5677 | `sqlite/search.rs` | static helper | +| `fn embedding_vector_for_active_model` | 5678-5697 | `sqlite/search.rs` | static helper | +| `fn active_embedding_model_like_pattern` | 5698-5713 | `sqlite/search.rs` | static helper | + +### Scheduling: FSRS, decay, consolidation, review, promote/demote, suppression, gc, retention (-> `sqlite/scheduling.rs`) + +This is the busiest destination file. The grouping rule is: anything that +touches FSRS scheduling fields (`stability`, `difficulty`, `retrievability`, +`reps`, `lapses`, `retention_strength`, `retrieval_strength`) or the +review/decay/consolidation/gc lifecycle lives here. + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `pub fn mark_reviewed` | 1181-1275 | `sqlite/scheduling.rs` | FSRS state mutation | +| `pub fn strengthen_on_access` | 1276-1344 | `sqlite/scheduling.rs` | | +| `pub fn strengthen_batch_on_access` | 1345-1357 | `sqlite/scheduling.rs` | | +| `pub fn mark_memory_useful` | 1358-1377 | `sqlite/scheduling.rs` | | +| `fn log_access` | 1378-1393 | `sqlite/scheduling.rs` | private | +| `pub fn promote_memory` | 1394-1425 | `sqlite/scheduling.rs` | | +| `pub fn demote_memory` | 1426-1472 | `sqlite/scheduling.rs` | | +| `pub fn suppress_memory` | 1473-1504 | `sqlite/scheduling.rs` | active forgetting | +| `pub fn reverse_suppression` | 1505-1552 | `sqlite/scheduling.rs` | | +| `pub fn count_suppressed` | 1553-1567 | `sqlite/scheduling.rs` | | +| `pub fn get_recently_suppressed` | 1568-1594 | `sqlite/scheduling.rs` | | +| `pub fn apply_rac1_cascade` | 1595-1641 | `sqlite/scheduling.rs` | active forgetting cascade | +| `pub fn run_rac1_cascade_sweep` | 1642-1657 | `sqlite/scheduling.rs` | | +| `pub fn get_review_queue` | 1658-1681 | `sqlite/scheduling.rs` | | +| `pub fn preview_review` | 1682-1712 | `sqlite/scheduling.rs` | | +| `pub fn get_stats` | 1713-1841 | `sqlite/scheduling.rs` | reports retention/lapses/review counts; lives here for symmetry with the FSRS reporters next door | +| `pub fn apply_decay` | 3006-3095 | `sqlite/scheduling.rs` | core decay loop | +| `fn get_fsrs_w20` | 3096-3119 | `sqlite/scheduling.rs` | | +| `pub fn run_consolidation` | 3120-3407 | `sqlite/scheduling.rs` | | +| `fn auto_dedup_consolidation` | 3408-3538 | `sqlite/scheduling.rs` | called by run_consolidation | +| `fn compute_act_r_activations` | 3539-3605 | `sqlite/scheduling.rs` | called by run_consolidation | +| `fn prune_access_log` | 3606-3620 | `sqlite/scheduling.rs` | called by run_consolidation | +| `fn optimize_w20_if_ready` | 3621-3720 | `sqlite/scheduling.rs` | called by run_consolidation | +| `fn generate_missing_embeddings` | 3721-3740 | `sqlite/scheduling.rs` | called by run_consolidation | +| `pub fn get_state_transitions` | 5714-5748 | `sqlite/scheduling.rs` | audit trail tied to scheduling state | +| `pub fn get_avg_retention` | 5780-5792 | `sqlite/scheduling.rs` | | +| `pub fn get_retention_distribution` | 5794-5825 | `sqlite/scheduling.rs` | | +| `pub fn get_retention_trend` | 5826-5858 | `sqlite/scheduling.rs` | | +| `pub fn save_retention_snapshot` | 5859-5878 | `sqlite/scheduling.rs` | | +| `pub fn count_memories_below_retention` | 5879-5892 | `sqlite/scheduling.rs` | | +| `pub fn gc_below_retention` | 5893-5936 | `sqlite/scheduling.rs` | | +| `pub fn auto_promote_frequent_access` | 5937-5985 | `sqlite/scheduling.rs` | | +| `pub fn set_waking_tag` | 5986-5998 | `sqlite/scheduling.rs` | waking SWR tagging | +| `pub fn clear_waking_tags` | 5999-6011 | `sqlite/scheduling.rs` | | +| `pub fn get_waking_tagged_memories` | 6012-6028 | `sqlite/scheduling.rs` | | +| `pub fn get_recent_state_transitions` | 6105-6132 | `sqlite/scheduling.rs` | | + +### Graph: edges (memory_connections), neighbors, subgraph (-> `sqlite/graph.rs`) + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `pub fn save_connection` | 4180-4202 | `sqlite/graph.rs` | | +| `pub fn get_connections_for_memory` | 4203-4220 | `sqlite/graph.rs` | | +| `pub fn get_all_connections` | 4221-4236 | `sqlite/graph.rs` | | +| `pub fn strengthen_connection` | 4237-4259 | `sqlite/graph.rs` | | +| `pub fn apply_connection_decay` | 4260-4272 | `sqlite/graph.rs` | | +| `pub fn prune_weak_connections` | 4273-4284 | `sqlite/graph.rs` | | +| `fn row_to_connection` | 4285-4305 | `sqlite/graph.rs` | private | +| `pub fn get_most_connected_memory` | 6029-6046 | `sqlite/graph.rs` | | +| `pub fn get_memory_subgraph` | 6048-6103 | `sqlite/graph.rs` | calls `get_connections_for_memory`, `get_node`, `get_all_connections` -- all resolvable through `self` | + +### Domain (-> `sqlite/domain.rs`) + +Phase 1 keeps domain logic to JSON-column reads + classify stub. Phase 4 +expands this file. Keeping the file in the split so Phase 4 has an +obvious place to add to. + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `fn read_domain_columns` | 6167-6196 | `sqlite/domain.rs` | private helper used by trait `get`. Bump to `pub(super)` | + +The trait methods `list_domains` / `get_domain` / `upsert_domain` / +`delete_domain` / `classify` live in `sqlite/trait_impl.rs`; they +delegate to thin helpers that, in Phase 1, are essentially noops or +JSON reads. Phase 4 will move the substance of those methods into +`sqlite/domain.rs` as real functions. + +### Registry: embedding_model table (-> `sqlite/registry.rs`) + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `fn enforce_model` | 6203-6272 | `sqlite/registry.rs` | private helper used by trait `insert` and `update`. Bump to `pub(super)` | + +The trait methods `registered_model` and `register_model` themselves +live in `sqlite/trait_impl.rs`. Phase 2's `postgres/registry.rs` will +mirror this layout. + +### Intentions, Insights, Memory States, Consolidation History, Dream History, Backup (-> `sqlite/mod.rs`) + +These were tacked onto `SqliteMemoryStore` over time as the cognitive +modules needed somewhere to persist their state. They are not part of the +trait surface, they are not naturally grouped with crud/search/scheduling, +and they are each fairly small and self-contained. They live in `mod.rs` +under labelled sections (one big impl block can carry them) rather than +inventing extra files like `intentions.rs` and `insights.rs`. Postgres +will mirror this once Phase 5 picks up the work; for now they have no +peer. + +Rationale: every one of these methods writes to a single table, the +bodies are short, and grouping them next to the constructor preserves +"open `mod.rs` to see the whole struct" as the navigation default. + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `IntentionRecord` struct | 3747-3766 | `sqlite/mod.rs` | re-exported through `storage/mod.rs` | +| `InsightRecord` struct + `Default` | 3767-3797 | `sqlite/mod.rs` | re-exported | +| `ConnectionRecord` struct | 3799-3809 | `sqlite/mod.rs` | re-exported; consumed by `graph.rs` | +| `MemoryStateRecord` struct | 3811-3821 | `sqlite/mod.rs` | | +| `StateTransitionRecord` struct | 3823-3833 | `sqlite/mod.rs` | re-exported | +| `ConsolidationHistoryRecord` struct | 3835-3846 | `sqlite/mod.rs` | | +| `DreamHistoryRecord` struct | 3848-3866 | `sqlite/mod.rs` | re-exported | +| `pub fn save_intention` etc. (intention block) | 3874-4058 | `sqlite/mod.rs` | one impl block, in-section labelled | +| `fn row_to_intention` | 4023-4058 | `sqlite/mod.rs` | private | +| insights block (`save_insight`, `get_insights`, etc.) | 4065-4174 | `sqlite/mod.rs` | | +| `fn row_to_insight` | 4153-4173 | `sqlite/mod.rs` | private | +| memory-state block | 4306-4459 | `sqlite/mod.rs` | | +| `fn row_to_memory_state` | 4431-4459 | `sqlite/mod.rs` | private | +| consolidation-history block | 4465-4540 | `sqlite/mod.rs` | | +| dream-history block | 4546-4638 | `sqlite/mod.rs` | | +| `pub fn count_memories_since` | 4639-4651 | `sqlite/mod.rs` | | +| `fn scan_last_backup_timestamp` | 4652-4682 | `sqlite/mod.rs` | | +| `pub fn last_backup_timestamp` | 4683-4688 | `sqlite/mod.rs` | | +| `pub fn get_last_backup_timestamp` (associated) | 4689-4696 | `sqlite/mod.rs` | | +| `pub fn backup_to` | 5749-5774 | `sqlite/mod.rs` | sqlite VACUUM INTO; called by backup tool | + +### Portable export/import/sync (-> `sqlite/portable_sync.rs`) + +This is the second-largest destination after `scheduling.rs` and the most +self-contained. + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `pub fn export_portable_archive` | 4699-4755 | `sqlite/portable_sync.rs` | | +| `pub fn export_portable_archive_to_path` | 4756-4806 | `sqlite/portable_sync.rs` | | +| `pub fn import_portable_archive` | 4807-4978 | `sqlite/portable_sync.rs` | | +| `pub fn import_portable_archive_from_path` | 4979-4996 | `sqlite/portable_sync.rs` | | +| `pub fn sync_portable_archive` (generic over backend) | 4997-5025 | `sqlite/portable_sync.rs` | | +| `pub fn sync_portable_archive_file` | 5026-5030 | `sqlite/portable_sync.rs` | | +| `fn merge_portable_table` | 5031-5073 | `sqlite/portable_sync.rs` | | +| `fn merge_knowledge_nodes` | 5074-5126 | `sqlite/portable_sync.rs` | | +| `fn merge_sync_tombstones` | 5127-5204 | `sqlite/portable_sync.rs` | | +| `fn merge_deletion_tombstones` | 5205-5245 | `sqlite/portable_sync.rs` | | +| `fn merge_keyed_table` | 5246-5281 | `sqlite/portable_sync.rs` | | +| `fn row_references_locally_newer_node` | 5282-5302 | `sqlite/portable_sync.rs` | | +| `fn merge_append_only_table` | 5303-5338 | `sqlite/portable_sync.rs` | | +| `fn parent_rows_exist` | 5339-5370 | `sqlite/portable_sync.rs` | | +| `fn insert_or_replace_row` | 5371-5386 | `sqlite/portable_sync.rs` | | +| `fn merge_key_columns` | 5387-5398 | `sqlite/portable_sync.rs` | | +| `fn upsert_row_with_columns` | 5399-5446 | `sqlite/portable_sync.rs` | | +| `fn insert_row_with_columns` | 5447-5469 | `sqlite/portable_sync.rs` | | +| `fn merge_row_exists` | 5470-5487 | `sqlite/portable_sync.rs` | | +| `fn row_exists_by_values` | 5488-5507 | `sqlite/portable_sync.rs` | | +| `fn row_values_for_columns` | 5508-5528 | `sqlite/portable_sync.rs` | | +| `fn portable_value` | 5529-5540 | `sqlite/portable_sync.rs` | | +| `fn portable_text` | 5541-5551 | `sqlite/portable_sync.rs` | | +| `fn portable_timestamp` | 5552-5559 | `sqlite/portable_sync.rs` | | +| `fn parse_rfc3339_opt` | 5560-5565 | `sqlite/portable_sync.rs` | | +| `fn tombstone_timestamp` | 5566-5580 | `sqlite/portable_sync.rs` | | +| `fn current_schema_version` | 5581-5589 | `sqlite/portable_sync.rs` | static helper | +| `fn ensure_portable_import_target_empty` | 5590-5604 | `sqlite/portable_sync.rs` | static helper | +| `fn table_exists` | 5605-5613 | `sqlite/portable_sync.rs` | static helper | +| `fn table_row_count` | 5614-5618 | `sqlite/portable_sync.rs` | static helper | +| `fn table_columns` | 5619-5630 | `sqlite/portable_sync.rs` | static helper | +| `fn portable_value_from_ref` | 5631-5646 | `sqlite/portable_sync.rs` | static helper | +| `fn quote_ident` | 5647-5651 | `sqlite/portable_sync.rs` | static helper | + +### Trait helpers and trait impl (-> `sqlite/trait_impl.rs`) + +| Item | Lines | Destination | Notes | +|------|-------|-------------|-------| +| `fn node_to_record` | 6142-6164 | `sqlite/trait_impl.rs` | associated fn used only by trait body; co-locate | +| `impl LocalMemoryStore for SqliteMemoryStore` block | 6274-7110 | `sqlite/trait_impl.rs` | full trait impl; attribute changes from `#[async_trait::async_trait]` to whatever 0001a settles on (`#[trait_variant::make(...)]` is on the trait declaration; the impl block carries no attribute under trait_variant) | + +### Tests + +The current `#[cfg(test)] mod tests` block at lines 7112-8713 contains +**two** distinct test families: + +1. **Native API tests** (7120-8198): unit tests against the legacy + `pub fn` surface (`test_ingest_and_get`, `test_search`, `test_review`, + `test_delete`, `test_dream_history_save_and_get_last`, + `test_portable_archive_exact_round_trip`, `test_keyword_search_*`, + `test_concrete_search_*`, `test_purge_*`, etc.). +2. **Trait round-trip tests** (8200-8712, after the + `// ===== Phase 1: LocalMemoryStore trait round-trip tests =====` + banner): `trait_init_is_idempotent`, `trait_register_model_*`, + `trait_insert_*`, `trait_get_*`, `trait_update_*`, `trait_delete_*`, + `trait_fts_search_*`, `trait_hybrid_search_*`, + `trait_scheduling_*`, `trait_add_edge_*`, `trait_get_edges_*`, + `trait_remove_edge_*`, `trait_get_neighbors_*`, `trait_list_domains_*`, + `trait_upsert_*`, `trait_classify_*`, `trait_count_*`, + `trait_get_stats_*`, `trait_vacuum_*`, + `trait_insert_refuses_dimension_mismatch`. + +See the Test Relocation section below for the resolution. + +--- + +## Visibility Changes + +The split moves items into sibling files inside one module. Helpers that +were `fn ...` (i.e. crate-private but file-private under the current +layout, since the file *is* the module) need their visibility lifted +just enough that sibling files can call them. The principle is: smallest +bump that makes the call site compile. + +`pub(super)` is sufficient for everything below; nothing needs +`pub(crate)`. The trait `LocalMemoryStore` exposure does not change -- +sub-modules call `self.method(...)` on `SqliteMemoryStore`, which +resolves through the impl blocks defined in their own files; visibility +is automatic at impl-block scope. + +Items that need a visibility bump (currently private fn, become +`pub(super) fn`): + +- `parse_timestamp` (1012): called by `row_to_node` and by intention / + insight row mappers. +- `row_to_node` (1028): called by `crud.rs`, `search.rs`, + `scheduling.rs`. Static associated fn. +- `read_domain_columns` (6167): called by `trait_impl.rs`. +- `enforce_model` (6203): called by `trait_impl.rs`. +- `embedding_model_matches_active` (5652): currently called by + `hybrid_search_filtered`; tests also reference it. Has to remain + `pub(super) fn` and be `pub` only if the existing test names reach it + through a re-export. (See Test Relocation.) +- `embedding_model_supports_matryoshka` (5672): private; only callers in + `search.rs` after the move; stays `fn` (no bump needed). +- `embedding_vector_for_active_model` (5678): same as the matches + function -- a test references it. Bump to `pub(super)`. +- `active_embedding_model_like_pattern` (5698): private; only used by + search; stays `fn`. +- `generate_embedding_for_node` (952): currently called by `ingest` and + `update_node_content`. Both move to `crud.rs`; stays `fn`. +- `get_query_embedding` (2368): only used inside `search.rs`; stays `fn`. +- `keyword_search_with_scores` (2636): only used inside `search.rs`; + stays `fn`. +- `semantic_search_raw` (2727): only used inside `search.rs`; stays `fn`. +- `embedding_regeneration_candidates` (2820): used by + `generate_embeddings`; both move to `search.rs`; stays `fn`. The + existing test (line 7167) references it through `storage.method()`, + which will continue to work because the test file can move with it. +- `log_access` (1378): private to `scheduling.rs`; stays `fn`. +- All the `auto_dedup_consolidation` / `compute_act_r_activations` / + `prune_access_log` / `optimize_w20_if_ready` / + `generate_missing_embeddings` helpers (3408-3740): private to + `scheduling.rs`; stays `fn`. +- `row_to_intention` / `row_to_insight` / `row_to_memory_state` / + `row_to_connection`: all stay private in their destination file (only + one caller each). +- All `merge_*` / `portable_*` / `parse_rfc3339_opt` / `quote_ident`: + private to `portable_sync.rs`; stays `fn`. +- `node_exists` (1988): private to `crud.rs`; stays `fn`. +- `record_sync_tombstone` (1997): private to `crud.rs`; stays `fn`. +- `get_fsrs_w20` (3096): private to `scheduling.rs`; stays `fn`. + +Items already `pub fn` (or `pub(crate) fn`) stay as they are -- no +visibility regression. + +Field visibility on `SqliteMemoryStore` itself: currently all fields are +private. The sub-modules access them via `self.field`. Because impl +blocks for `SqliteMemoryStore` are written in sibling files of the same +module, `self.field` reaches private fields without a visibility bump. +**No field visibility changes are required.** Confirm this during the +first motion commit; if Rust disagrees, mark the relevant fields +`pub(super)` and document in the commit message. + +--- + +## Public Re-exports + +`crates/vestige-core/src/storage/mod.rs` currently exports: + +```rust +mod memory_store; +mod migrations; +mod portable; +mod sqlite; + +pub use memory_store::{...}; +pub use migrations::MIGRATIONS; +pub use portable::{...}; +pub use sqlite::{ + ConnectionRecord, ConsolidationHistoryRecord, DreamHistoryRecord, FilePortableSyncBackend, + InsightRecord, IntentionRecord, PortableSyncBackend, PortableSyncReport, Result, + SmartIngestResult, SqliteMemoryStore, StateTransitionRecord, StorageError, +}; + +pub type Storage = SqliteMemoryStore; +``` + +After the split, `mod sqlite;` resolves to the new directory module +(`storage/sqlite/mod.rs`). The `pub use sqlite::{...}` block resolves +against the items re-exported by `storage/sqlite/mod.rs`. + +`storage/sqlite/mod.rs` therefore needs the same names visible at its +top level. Add at the end of `mod.rs`: + +```rust +mod crud; +mod search; +mod scheduling; +mod graph; +mod domain; +mod registry; +mod portable_sync; +mod trait_impl; + +pub use portable_sync::{FilePortableSyncBackend, PortableSyncBackend, PortableSyncReport}; +// SqliteMemoryStore, StorageError, Result, SmartIngestResult, IntentionRecord, +// InsightRecord, ConnectionRecord, StateTransitionRecord, +// ConsolidationHistoryRecord, DreamHistoryRecord are defined in mod.rs itself, +// so they are already in scope and do not need a re-export. +``` + +The `crates/vestige-core/src/storage/mod.rs` file does not change. The +`pub type Storage = SqliteMemoryStore;` alias keeps working. + +If `cargo build` complains that `storage/mod.rs` cannot resolve a name +in its `pub use sqlite::{...}` block, the fix is to add the missing name +to `sqlite/mod.rs`'s re-export tail; no change to `storage/mod.rs`. + +--- + +## Test Relocation + +Two test families, two destinations. + +**Native API tests** (current lines 7120-8198) cover the legacy `pub fn` +surface. They live close to their subject: + +- Tests that touch the constructor, common helpers, and shared setup + (`create_test_storage`, `create_test_storage_at`, + `test_storage_creation`, `test_get_last_backup_timestamp_no_panic`) + move to `sqlite/mod.rs` in a `#[cfg(test)] mod tests` block. +- `test_ingest_and_get`, `test_delete`, + `test_purge_scrubs_insight_json_orphans_children_and_writes_tombstone` + go to `sqlite/crud.rs` as a `#[cfg(test)] mod tests` block. +- `test_search`, `test_keyword_search_with_include_types`, + `test_keyword_search_with_exclude_types`, + `test_include_types_takes_precedence_over_exclude`, + `test_type_filter_with_no_matches_returns_empty`, + `test_hybrid_search_backward_compat`, + `test_concrete_search_literal_identifier_lands_first`, + `test_embedding_model_family_matching`, + `test_embedding_regeneration_candidates_include_entire_mismatched_corpus` + go to `sqlite/search.rs`. +- `test_review` goes to `sqlite/scheduling.rs`. +- `test_dream_history_save_and_get_last`, `test_dream_history_empty`, + `test_count_memories_since` go to `sqlite/mod.rs` (history tables live + there). +- All `test_portable_*` go to `sqlite/portable_sync.rs`. +- `test_file_portable_sync_round_trips_between_devices` goes to + `sqlite/portable_sync.rs`. + +**Trait round-trip tests** (current lines 8200-8712) test the +`LocalMemoryStore` trait impl. Two viable layouts: + +A. Co-locate with the impl in `sqlite/trait_impl.rs` (one big + `#[cfg(test)] mod trait_tests`). +B. Keep them as a single `tests.rs` file in the sqlite directory. + +**Decision: A.** Co-locate. The trait round-trip tests are explicitly +testing the `impl LocalMemoryStore for SqliteMemoryStore` block; +co-location means a reader who edits the trait impl sees its tests in +the same file. Option B would mean two places to look every time a +trait method changes shape. For an 8K-line collapse the tradeoff +favours co-location. + +Concretely: `sqlite/trait_impl.rs` ends with a +`#[cfg(test)] mod trait_tests { ... }` block that contains all 30+ +`trait_*` tests, plus the shared `make_record`, `rt`, and small helpers +defined inside the current test mod for trait tests (lines 8208-8226). + +--- + +## Commit Sequence + +Each commit moves one logical group. After each commit: + +``` +cargo build -p vestige-core +cargo test -p vestige-core +cargo clippy -p vestige-core -- -D warnings +``` + +must pass. Order is chosen so that each move is small, the next move +does not depend on the previous having grown surprising visibility, and +the largest / riskiest move (the trait impl, with the new +trait_variant attribute) lands last. + +| # | Commit | What moves | Tests touched | +|---|--------|-----------|----------------| +| 1 | `refactor(sqlite): scaffold sqlite/ directory` | Convert `sqlite.rs` -> `sqlite/mod.rs` verbatim (rename + create empty sibling files `crud.rs`, `search.rs`, `scheduling.rs`, `graph.rs`, `domain.rs`, `registry.rs`, `portable_sync.rs`, `trait_impl.rs` each with `use super::*;`). At this point `mod.rs` declares the new modules but they are empty. | None move. Build proves the rename works. | +| 2 | `refactor(sqlite): split out portable sync` | Move all `merge_*`, `portable_*`, `export_*`, `import_*`, `sync_*` items + `MergeWrite`, `PortableSyncBackend`, `FilePortableSyncBackend`, `PortableSyncReport`, `PortableMergeState`, `PORTABLE_TABLES`, `PORTABLE_USER_DATA_TABLES`, helper statics into `sqlite/portable_sync.rs`. Add `pub use portable_sync::{...}` in `mod.rs` for the public types. | `test_portable_*` and `test_file_portable_sync_round_trips_between_devices` move too. | +| 3 | `refactor(sqlite): split out graph / connections` | Move `save_connection`, `get_connections_for_memory`, `get_all_connections`, `strengthen_connection`, `apply_connection_decay`, `prune_weak_connections`, `row_to_connection`, `get_most_connected_memory`, `get_memory_subgraph` to `sqlite/graph.rs`. | None move (no native graph tests; trait edge tests still in trait_tests). | +| 4 | `refactor(sqlite): split out scheduling / fsrs / consolidation` | Move all items listed in the Scheduling row to `sqlite/scheduling.rs`. | `test_review` moves. | +| 5 | `refactor(sqlite): split out search / fts / semantic / hybrid` | Move all items listed in the Search row to `sqlite/search.rs`. Add `pub(super)` to the four `embedding_model_*` helpers that tests reference. | `test_search`, `test_keyword_search_*`, `test_include_types_*`, `test_type_filter_*`, `test_hybrid_search_*`, `test_concrete_search_*`, `test_embedding_model_family_matching`, `test_embedding_regeneration_candidates_include_entire_mismatched_corpus` move. | +| 6 | `refactor(sqlite): split out crud / ingest / get / update / delete / purge` | Move `ingest`, `smart_ingest`, `get_node`, `update_node_content`, `delete_node`, `purge_node`, `get_all_nodes`, `get_nodes_by_type_and_tag`, `node_exists`, `record_sync_tombstone`, `generate_embedding_for_node`, `get_node_embedding`, `get_all_embeddings`, `PurgeReport` to `sqlite/crud.rs`. Bump `row_to_node` and `parse_timestamp` to `pub(super) fn` in `mod.rs`. | `test_ingest_and_get`, `test_delete`, `test_purge_scrubs_insight_json_orphans_children_and_writes_tombstone` move. | +| 7 | `refactor(sqlite): split out registry helper` | Move `enforce_model` to `sqlite/registry.rs`, bumped to `pub(super)`. | None move. | +| 8 | `refactor(sqlite): split out domain helper` | Move `read_domain_columns` to `sqlite/domain.rs`, bumped to `pub(super)`. | None move. | +| 9 | `refactor(sqlite): split out trait impl + tests` | Move `node_to_record` and the full `impl LocalMemoryStore for SqliteMemoryStore` block to `sqlite/trait_impl.rs`. Move the entire trait round-trip test module (lines 8200-8712, including `make_record` and `rt` helpers) to a `#[cfg(test)] mod trait_tests` block at the bottom of `trait_impl.rs`. This is the commit where the trait_variant attribute (from sub-plan 0001a) is observed: the impl block on `SqliteMemoryStore` uses whatever syntax the rewritten trait expects (no `#[async_trait::async_trait]`). | All `trait_*` tests move. | + +Commit 1 is the only commit that creates new files; the rest move +existing code into them. Reviewers can bisect through this list to +find any silent-semantic change. + +--- + +## Verification + +Run after every commit. All three must pass before pushing: + +``` +cargo build -p vestige-core +cargo test -p vestige-core +cargo clippy -p vestige-core -- -D warnings +``` + +The Phase 1 amendment branch must also build with the no-default-features +configuration that the release binary uses for the alternative feature +set: + +``` +cargo build -p vestige-core --no-default-features +cargo test -p vestige-core --no-default-features +``` + +Some of the methods being moved (`get_node_embedding`, `is_embedding_ready`, +`init_embeddings`, the feature-on/feature-off `hybrid_search` pair) have +two definitions guarded by feature flags. The split must preserve both +copies in the same destination file with their existing `#[cfg(...)]` +attributes; the no-default-features build confirms. + +After the last commit, run the workspace-wide check that Phase 1 promised: + +``` +cargo build --workspace +cargo test --workspace +``` + +This catches downstream consumers (`vestige-mcp`, `vestige`, +`vestige-restore`) that might depend on a specific module path (they +should not -- they import from `crate::storage::SqliteMemoryStore` and +the re-exports in `storage/mod.rs` -- but the workspace build is the +authoritative confirmation). + +--- + +## Acceptance Criteria + +1. `crates/vestige-core/src/storage/sqlite.rs` no longer exists. In its + place is `crates/vestige-core/src/storage/sqlite/` with the eight + files listed in the Target Layout section, each below 2000 lines. +2. `crates/vestige-core/src/storage/mod.rs` is unchanged (or + functionally unchanged -- the `pub use sqlite::{...}` block contains + the same identifiers in the same order). +3. Every cognitive module and binary in the workspace + (`vestige-core`, `vestige-mcp`, `vestige`, `vestige-restore`) + compiles with no source edits other than the ones in + `crates/vestige-core/src/storage/sqlite/`. +4. `cargo build -p vestige-core`, + `cargo test -p vestige-core`, + `cargo clippy -p vestige-core -- -D warnings`, + `cargo build -p vestige-core --no-default-features`, and + `cargo test -p vestige-core --no-default-features` all pass at the + end of every commit in the sequence (bisectability). +5. `cargo test --workspace` matches the Phase 1 baseline test count + (758 tests, of which 352 are in `vestige-core`). No new tests are + added by this sub-plan; no existing test is renamed or deleted. +6. The on-disk SQLite schema is unchanged. A live database created on + the pre-split build opens cleanly on the post-split build and round + trips a memory. +7. `git log --follow` works for at least one method in each destination + file (i.e. `git mv` was used where the line range constitutes most + of the file content of the destination, otherwise a `git log -p` + on the new file shows the history before the rename through the + block-move detection that recent `git log` versions support). +8. No public symbol disappears from `crate::storage::*`. A reviewer can + verify with: + ``` + cargo doc -p vestige-core --no-deps + ``` + before and after the split, and `diff` the generated + `target/doc/vestige_core/storage/index.html` lists. + +--- + +## Non-Goals (explicit) + +- No public API change. The trait surface (`LocalMemoryStore`, + `MemoryStore`), the legacy `pub fn` surface on `SqliteMemoryStore`, + the re-exports through `storage/mod.rs`, and the `pub type Storage = + SqliteMemoryStore;` alias are all preserved. +- No behavioural change. No SQL is rewritten, no FSRS parameter is + retuned, no embedding model is touched, no migration is added. +- No new tests. Tests move with their subject; no new tests are + written. +- No clippy fix-ups that pre-date this sub-plan. If `cargo clippy + -- -D warnings` was passing before the split, it must continue to + pass; if it was not passing, the failures stay where they are and + are addressed in a separate commit (out of scope here). +- No removal of the `pub type Storage = SqliteMemoryStore;` BC alias. + That happens at the end of Phase 4 per ADR 0001. +- No reorganisation of `storage/memory_store.rs`, + `storage/migrations.rs`, or `storage/portable.rs`. Those files are + out of scope; the split is private to `storage/sqlite/`. + +--- + +## Risks and Mitigations + +| Risk | Mitigation | +|------|------------| +| Silent semantic change introduced by a motion commit | Per-commit `cargo test -p vestige-core` keeps the bisect window to a single commit. Reviewer bisects with `cargo test -p vestige-core` as the witness. | +| Sibling-file `self.field` accesses fail because Rust enforces module visibility on tuple-struct or named fields | Confirmed: associated impl blocks in sibling files of the same `mod sqlite` reach private fields. If the compiler disagrees, bump the affected fields to `pub(super)` in `mod.rs` and note the bump in the commit message. | +| Test-only helpers (`create_test_storage`, `make_record`, `rt`) get duplicated across test modules | Lift them once into a `#[cfg(test)] mod test_support { ... }` sub-module in `sqlite/mod.rs` and re-export with `pub(super) use`. Do this in commit 1 (scaffold), not later. | +| `#[cfg(all(feature = "embeddings", feature = "vector-search"))]` items end up in `mod.rs` where they pollute the shared layer | Audit during commit 5 (search split); items behind both feature flags belong in `search.rs`. The `query_cache` field stays in `mod.rs` because the struct definition is there; the field declaration is feature-gated and that gate moves with the struct as-is. | +| `git log --follow` blame chains break on the moved methods | Use `git mv sqlite.rs sqlite/mod.rs` in commit 1 so commit 1 looks like a rename (`git log --follow` keeps working). Subsequent commits are content moves inside the module; modern `git log --follow -M -C` heuristics still trace the lines. Reviewers who need pristine blame should bisect to before commit 1. | +| Sub-plan 0001a (trait rewrite) has not landed when this work starts | Block: do not start commits 1-9 until 0001a is on the same branch (`feat/storage-trait-phase1`) and tests pass. `trait_impl.rs` lands the new attribute in commit 9; if 0001a is not in, commit 9 fails. | + +--- + +## Self-Contained Brief (for /goal) + +A fresh Claude Code session can execute this sub-plan by: + +1. Reading this file end to end. +2. Reading `crates/vestige-core/src/storage/sqlite.rs` (the file to be + split) in full, using line ranges from the Mapping Table to confirm + the current shape matches the brief. +3. Reading `crates/vestige-core/src/storage/mod.rs` (the re-export + surface that must continue to work). +4. Reading `crates/vestige-core/src/storage/memory_store.rs` (the + trait surface that `trait_impl.rs` implements). +5. Confirming sub-plan 0001a has landed on the current branch by + checking that `memory_store.rs` no longer carries + `#[async_trait::async_trait]` on the trait declaration. +6. Working through the Commit Sequence in order, running the + Verification commands after each commit. + +The session does not need to read ADR 0002 or the master Phase 2 plan +to do this work. The split is purely mechanical relative to the +mapping table above. diff --git a/docs/plans/0001c-async-trait-sunset.md b/docs/plans/0001c-async-trait-sunset.md new file mode 100644 index 0000000..f9d8938 --- /dev/null +++ b/docs/plans/0001c-async-trait-sunset.md @@ -0,0 +1,847 @@ +# Sub-Plan 0001c: Sunset the `async-trait` crate dependency + +**Status**: Draft +**Branch**: `feat/storage-trait-phase1` (Phase 1 amendment, PR A) +**Depends on**: +- `0001a-trait-rewrite.md` (rewrites `MemoryStore` / `LocalMemoryStore` and + the SQLite impl; lands first) +- `0001b-sqlite-split.md` (moves `sqlite.rs` into `sqlite/`; lands second) + +**Related**: `docs/adr/0002-phase-2-execution.md` (decision D1 closing line: +"async-trait dependency stays in Cargo.toml only if other code uses it; +otherwise removed"). This sub-plan operationalises the removal. + +--- + +## Context + +This is the third and final Phase 1 amendment sub-plan. Sub-plan `0001a` +rewrote `MemoryStore` / `LocalMemoryStore` using +`#[trait_variant::make(MemoryStore: Send)]` and dropped the +`#[async_trait::async_trait]` attribute from the SQLite impl block. +Sub-plan `0001b` then split `sqlite.rs` into a `sqlite/` directory; the +trait impl now lives in `sqlite/trait_impl.rs`. After `0001a` lands, the +only remaining `async_trait` usage in the workspace is the embedder pair +(`embedder/mod.rs` declares the trait; `embedder/fastembed.rs` implements +it). This sub-plan rewrites those two files following the exact pattern +from `0001a`, then removes `async-trait = "0.1"` from +`crates/vestige-core/Cargo.toml`. End state: zero `async_trait` references +anywhere under `crates/`, zero direct dependency on the `async-trait` +crate, workspace tests and clippy green. + +The rewrite is mechanically tiny -- one trait declaration, one impl block, +one Cargo.toml line -- but it is gated behind `0001a` and `0001b` so the +trait-rewrite pattern is already settled and so the SQLite trait impl +attribute has already been dropped. Doing the embedder rewrite without +that pre-work would leave the `async-trait` dep behind for the SQLite +side and force the Cargo.toml deletion into a separate commit later. + +--- + +## Scope + +### In scope + +- Rewrite `LocalEmbedder` declaration in + `crates/vestige-core/src/embedder/mod.rs` to use + `#[trait_variant::make(Embedder: Send)] pub trait LocalEmbedder`. +- Delete the `pub use LocalEmbedder as Embedder;` alias from the same file. + The `Embedder` symbol becomes the trait that `trait_variant::make` emits + at the same module path, so the existing + `pub use embedder::{Embedder, ..., LocalEmbedder};` line in + `crates/vestige-core/src/lib.rs:167` keeps working untouched. +- Drop the `#[async_trait::async_trait]` attribute from the + `FastembedEmbedder` impl block in + `crates/vestige-core/src/embedder/fastembed.rs`. +- Update doc comments on the trait declaration to describe + `trait_variant` rather than `async_trait`. +- Remove `async-trait = "0.1"` from + `crates/vestige-core/Cargo.toml` (line 119 area). Use + `cargo rm async-trait` from inside the crate directory. +- Verify with `grep -rn "async_trait" crates/` returning zero hits. + +### Out of scope + +- Any change to the `MemoryStore` trait or `SqliteMemoryStore` impl; + those were handled by `0001a`. +- Any file moves in `embedder/` (no parallel of `0001b` is required; + `embedder/` already has the `mod.rs` + `fastembed.rs` shape). +- Touching `crates/vestige-mcp/` or any cognitive module. None of them + hold `Arc` or `Box` in production. +- Renaming the `Embedder` / `LocalEmbedder` symbols or changing the + re-exports in `crates/vestige-core/src/lib.rs`. + +--- + +## Prerequisites + +### State assumed at start + +- `0001a` is merged onto the branch. After `0001a`: + - `crates/vestige-core/src/storage/memory_store.rs` declares + `#[trait_variant::make(MemoryStore: Send)] pub trait LocalMemoryStore`. + - The SQLite impl block has no `#[async_trait::async_trait]` attribute. + - `grep -rn async_trait crates/` returns exactly three hits, all in + `crates/vestige-core/src/embedder/` (two in `mod.rs`, one in + `fastembed.rs`), and one Cargo.toml hit. +- `0001b` is merged onto the branch. After `0001b`: + - `crates/vestige-core/src/storage/sqlite.rs` no longer exists as a + single file; the impl lives in `crates/vestige-core/src/storage/sqlite/trait_impl.rs`. + - The embedder files are untouched. + +### Required crates + +| Crate | Version | Action | +|----------------|---------|-----------------------------------------------------------------| +| `trait-variant`| `0.1` | Already declared (line 117 of Cargo.toml). Verify present. | +| `async-trait` | `0.1` | Remove. Only the two embedder files still use it after `0001a`. | + +### Workspace-wide audit before starting + +Run from `/home/delandtj/prppl/vestige-phase2/` (or the equivalent +worktree where this sub-plan executes): + +```bash +grep -rn "async_trait\|async-trait" crates/ tests/ +``` + +Expected hits before this sub-plan starts (after `0001a` + `0001b`): + +``` +crates/vestige-core/Cargo.toml:119:async-trait = "0.1" +crates/vestige-core/src/embedder/mod.rs:24:/// `#[async_trait::async_trait]` makes every `async fn` return a +crates/vestige-core/src/embedder/mod.rs:27:#[async_trait::async_trait] +crates/vestige-core/src/embedder/mod.rs:56:/// Both names refer to the same `async_trait`-annotated trait. +crates/vestige-core/src/embedder/fastembed.rs:44:#[async_trait::async_trait] +``` + +Five hits across two source files and one Cargo.toml. After this sub-plan, +the same grep must return zero hits. + +```bash +grep -rn "async-trait\|async_trait" --include="Cargo.toml" crates/ +``` + +Expected: exactly one hit (`crates/vestige-core/Cargo.toml:119`). No other +workspace crate declares `async-trait` as a direct dependency. This is +the precondition that lets us delete the line cleanly. + +--- + +## Files Touched + +### Trait declaration (vestige-core) + +| File | Lines (approx) | Change | +|-------------------------------------------------|----------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `crates/vestige-core/src/embedder/mod.rs` | 21-57 | Replace `#[async_trait::async_trait] pub trait LocalEmbedder: Send + Sync + 'static` with `#[trait_variant::make(Embedder: Send)] pub trait LocalEmbedder: Sync + 'static`. Delete the `pub use LocalEmbedder as Embedder;` alias on line 57. Update doc comments (lines 21-26, 55-56). | + +### Impl block (vestige-core) + +| File | Lines (approx) | Change | +|-------------------------------------------------|----------------|--------------------------------------------------------------------------------------------------------------| +| `crates/vestige-core/src/embedder/fastembed.rs` | 44 | Delete the `#[async_trait::async_trait]` attribute. Keep the `impl LocalEmbedder for FastembedEmbedder { ... }` body verbatim. No `Box::pin`, no `'async_trait` lifetimes, no manual `Pin>`. | + +### Other Embedder impls + +None. `grep -rn "impl.*LocalEmbedder\|impl.*Embedder for" crates/ tests/` +returns exactly one production hit: +`crates/vestige-core/src/embedder/fastembed.rs:45: impl LocalEmbedder for FastembedEmbedder`. +There is no test mock implementing `Embedder` in the test tree (the only +test that touches the trait, `tests/phase_1/embedder_trait.rs`, uses the +concrete `FastembedEmbedder` boxed as `Box`). + +### Call sites (production) + +Verified by: + +```bash +grep -rn "dyn Embedder\|dyn LocalEmbedder" crates/ tests/ --include="*.rs" +grep -rn "Box\|Arc" crates/ tests/ --include="*.rs" +grep -rn "use.*Embedder" crates/ tests/ --include="*.rs" +``` + +Production call sites that may need verification (and the expected change +for each, even though we have already verified that none need an edit): + +| File | Use | Required change | +|------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|-----------------| +| `crates/vestige-core/src/lib.rs:167` | `pub use embedder::{Embedder, EmbedderError, EmbedderResult, FastembedEmbedder, LocalEmbedder};` | None. Both names still exist at `crate::embedder::*` after the rewrite; `Embedder` is now the `trait_variant`-generated trait, `LocalEmbedder` is the source-of-truth trait. The re-export keeps resolving. | +| `crates/vestige-core/src/embedder/fastembed.rs:7` | `use super::{EmbedderError, EmbedderResult, LocalEmbedder};` | None. `LocalEmbedder` is still the source-of-truth trait name. | +| `crates/vestige-core/src/embedder/mod.rs:5` | `pub use fastembed::FastembedEmbedder;` | None. Concrete type, untouched. | +| `crates/vestige-mcp/src/**` | No file imports `Embedder` or `LocalEmbedder` by name; none hold `Arc` or `Box`. | None. Verified by grep returning empty for `dyn Embedder` and `dyn LocalEmbedder` under `crates/vestige-mcp/`. | +| Cognitive modules under `crates/vestige-core/src/advanced/` and `crates/vestige-core/src/neuroscience/` | No file imports `Embedder` or `LocalEmbedder` by name. `advanced/adaptive_embedding.rs` defines its own unrelated `AdaptiveEmbedder` struct. | None. The name collision is purely surface-level; the two types live in different modules and never resolve to each other. | +| `crates/vestige-core/src/embeddings/**` | No file imports `Embedder` or `LocalEmbedder`. The `EmbeddingService` struct is what `FastembedEmbedder` wraps internally. | None. | + +The production audit returns zero files that need editing. + +### Call sites (tests) + +| File | Lines | Use | Required change | +|------------------------------------------------------------|-------|--------------------------------------------------------------------|-----------------| +| `tests/phase_1/embedder_trait.rs` | 3, 19 | `use vestige_core::embedder::{Embedder, FastembedEmbedder};`
`let e: Box = Box::new(FastembedEmbedder::new());` | None. `Embedder` is the `trait_variant`-generated Send variant; `Box` keeps compiling. `FastembedEmbedder` implements `LocalEmbedder`; the blanket `impl Embedder for T` that `trait_variant::make` emits gives the boxing for free. | + +The test audit returns zero files that need editing. + +### Cargo dependency cleanup + +| File | Lines | Change | +|-------------------------------------|-----------|-----------------------------------------------------------------------------------------------------| +| `crates/vestige-core/Cargo.toml` | 119 | Remove `async-trait = "0.1"`. Run `cargo rm async-trait` from inside `crates/vestige-core/` so `Cargo.lock` updates atomically with the manifest. | + +### Documentation + +| File | Change | +|---------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------| +| `crates/vestige-core/src/embedder/mod.rs` | Rewrite the trait-level doc comment (lines 21-26) and the `pub use` doc comment (lines 55-56) to describe `trait_variant`, not `async_trait`. See "Trait declaration rewrite" below for the exact new text. | +| `CLAUDE.md` | No change. The repo-level architecture notes do not name the trait attribute. | + +--- + +## Trait Declaration Rewrite + +### Before (state after `0001a` and `0001b` land) + +`crates/vestige-core/src/embedder/mod.rs:1-58`: + +```rust +//! Text-to-vector encoding trait. Pluggable per-install. + +mod fastembed; + +pub use fastembed::FastembedEmbedder; + +/// Error returned by every `Embedder` method. +#[non_exhaustive] +#[derive(Debug, thiserror::Error)] +pub enum EmbedderError { + #[error("embedder initialization failed: {0}")] + Init(String), + #[error("embedding generation failed: {0}")] + EmbedFailed(String), + #[error("invalid input: {0}")] + InvalidInput(String), +} + +pub type EmbedderResult = std::result::Result; + +/// Pluggable embedder. The storage layer NEVER calls fastembed directly; +/// callers compute vectors via this trait and pass them into `MemoryStore`. +/// +/// `#[async_trait::async_trait]` makes every `async fn` return a +/// `Pin>`, which is required for `Box` +/// and `Arc` to be dyn-compatible. +#[async_trait::async_trait] +pub trait LocalEmbedder: Send + Sync + 'static { + async fn embed(&self, text: &str) -> EmbedderResult>; + + fn model_name(&self) -> &str; + + fn dimension(&self) -> usize; + + /// Stable blake3 hash of (model_name || dimension || vestige-core crate version). + /// Lowercase hex, 64 chars. + /// + /// Used by `MemoryStore::register_model` to detect silent model drift + /// (e.g. a fastembed minor upgrade that changes vector output). + fn model_hash(&self) -> String; + + async fn embed_batch(&self, texts: &[&str]) -> EmbedderResult>>; + + /// Returns the `ModelSignature` describing this embedder. Convenience + /// wrapper over the three accessors above. + fn signature(&self) -> crate::storage::ModelSignature { + crate::storage::ModelSignature { + name: self.model_name().to_string(), + dimension: self.dimension(), + hash: self.model_hash(), + } + } +} + +/// Type alias: `Embedder` is the dyn-compatible, Send+Sync variant. +/// Both names refer to the same `async_trait`-annotated trait. +pub use LocalEmbedder as Embedder; +``` + +### After + +`crates/vestige-core/src/embedder/mod.rs:1-55` (approximately): + +```rust +//! Text-to-vector encoding trait. Pluggable per-install. + +mod fastembed; + +pub use fastembed::FastembedEmbedder; + +/// Error returned by every `Embedder` method. +#[non_exhaustive] +#[derive(Debug, thiserror::Error)] +pub enum EmbedderError { + #[error("embedder initialization failed: {0}")] + Init(String), + #[error("embedding generation failed: {0}")] + EmbedFailed(String), + #[error("invalid input: {0}")] + InvalidInput(String), +} + +pub type EmbedderResult = std::result::Result; + +/// Pluggable embedder. The storage layer NEVER calls fastembed directly; +/// callers compute vectors via this trait and pass them into `MemoryStore`. +/// +/// `LocalEmbedder` is the source-of-truth trait. The +/// `#[trait_variant::make(Embedder: Send)]` attribute auto-generates an +/// `Embedder` variant whose returned futures are `Send`, so +/// `Box` and `Arc` are usable on tokio/axum +/// runtimes, while `Box` remains usable on single- +/// threaded executors and thread-local backends. +/// +/// Every method is native async-fn-in-trait (stable on MSRV 1.91); no +/// per-call heap allocation, no boxed futures at the static-dispatch +/// boundary. +#[trait_variant::make(Embedder: Send)] +pub trait LocalEmbedder: Sync + 'static { + async fn embed(&self, text: &str) -> EmbedderResult>; + + fn model_name(&self) -> &str; + + fn dimension(&self) -> usize; + + /// Stable blake3 hash of (model_name || dimension || vestige-core crate version). + /// Lowercase hex, 64 chars. + /// + /// Used by `MemoryStore::register_model` to detect silent model drift + /// (e.g. a fastembed minor upgrade that changes vector output). + fn model_hash(&self) -> String; + + async fn embed_batch(&self, texts: &[&str]) -> EmbedderResult>>; + + /// Returns the `ModelSignature` describing this embedder. Convenience + /// wrapper over the three accessors above. + fn signature(&self) -> crate::storage::ModelSignature { + crate::storage::ModelSignature { + name: self.model_name().to_string(), + dimension: self.dimension(), + hash: self.model_hash(), + } + } +} +``` + +### Both halves of the macro-generated output (for reviewer clarity) + +`trait_variant::make(Embedder: Send)` expands the source-of-truth +`LocalEmbedder` declaration above into the equivalent of: + +```rust +// 1. The source-of-truth trait, exactly as written. +pub trait LocalEmbedder: Sync + 'static { + fn embed(&self, text: &str) -> impl Future>>; + fn model_name(&self) -> &str; + fn dimension(&self) -> usize; + fn model_hash(&self) -> String; + fn embed_batch(&self, texts: &[&str]) -> impl Future>>>; + fn signature(&self) -> crate::storage::ModelSignature { /* default impl unchanged */ } +} + +// 2. The generated Send variant. +pub trait Embedder: Sync + 'static { + fn embed(&self, text: &str) -> impl Future>> + Send; + fn model_name(&self) -> &str; + fn dimension(&self) -> usize; + fn model_hash(&self) -> String; + fn embed_batch(&self, texts: &[&str]) -> impl Future>>> + Send; + fn signature(&self) -> crate::storage::ModelSignature { /* default impl unchanged */ } +} + +// 3. The blanket impl that wires any LocalEmbedder + Send through to Embedder. +impl Embedder for T +where + T: LocalEmbedder + Send, + // (all returned futures of LocalEmbedder's async fns are required to be Send, + // which is satisfied for FastembedEmbedder -- see "Risks" below) +{ /* forwarders */ } +``` + +Notes: + +- The `pub use LocalEmbedder as Embedder;` line on the current + `embedder/mod.rs:57` is **deleted** entirely. `Embedder` is now the + trait that `trait_variant::make` emits at the same module path; the + re-export in `crates/vestige-core/src/lib.rs:167` + (`pub use embedder::{Embedder, ..., LocalEmbedder};`) keeps resolving + unchanged. +- `Sync + 'static` on `LocalEmbedder` (and no `Send` bound on the trait + itself) mirrors the `0001a` pattern for `LocalMemoryStore`. The current + trait carries `Send + Sync + 'static`; the rewrite drops the `Send` + bound from the local variant. `Box` is `Sync` but + not `Send`; `Box` (the generated variant) is `Send + Sync`. +- `trait_variant` 0.1 does **not** require any attribute on the impl + block. The attribute lives only on the trait declaration. See next + section. + +--- + +## Impl Block Migration + +`trait_variant` 0.1 attaches the attribute only to the trait declaration. +The impl side is plain `impl LocalEmbedder for FastembedEmbedder`; no +attribute on the impl, no `#[trait_variant::make(Embedder: Send)]` on the +impl block. The macro auto-generates the blanket +`impl Embedder for T`, so any concrete type that +implements `LocalEmbedder` automatically also implements `Embedder` +provided it is `Send`. + +`FastembedEmbedder` is `Send + Sync` because: + +- `inner: EmbeddingService` is `Send + Sync` (it wraps fastembed's + `TextEmbedding` which is `Send + Sync` after fastembed 4.x; verified + in `crates/vestige-core/src/embeddings/mod.rs`). +- `cached_hash: std::sync::OnceLock` is `Send + Sync` for `T: Send + Sync`. +- The `#[cfg(not(feature = "embeddings"))]` branch carries only + `cached_hash`, which is unconditionally `Send + Sync`. + +No new bound is needed. + +### Before + +`crates/vestige-core/src/embedder/fastembed.rs:38-100` (relevant header): + +```rust +impl Default for FastembedEmbedder { + fn default() -> Self { + Self::new() + } +} + +#[async_trait::async_trait] +impl LocalEmbedder for FastembedEmbedder { + async fn embed(&self, text: &str) -> EmbedderResult> { + // ... body unchanged ... + } + + fn model_name(&self) -> &str { /* ... */ } + fn dimension(&self) -> usize { /* ... */ } + fn model_hash(&self) -> String { /* ... */ } + + async fn embed_batch(&self, texts: &[&str]) -> EmbedderResult>> { + // ... body unchanged ... + } +} +``` + +### After + +`crates/vestige-core/src/embedder/fastembed.rs:38-99` (one fewer line): + +```rust +impl Default for FastembedEmbedder { + fn default() -> Self { + Self::new() + } +} + +impl LocalEmbedder for FastembedEmbedder { + async fn embed(&self, text: &str) -> EmbedderResult> { + // ... body unchanged ... + } + + fn model_name(&self) -> &str { /* ... */ } + fn dimension(&self) -> usize { /* ... */ } + fn model_hash(&self) -> String { /* ... */ } + + async fn embed_batch(&self, texts: &[&str]) -> EmbedderResult>> { + // ... body unchanged ... + } +} +``` + +Diff is exactly one removed line (the `#[async_trait::async_trait]` +attribute on line 44). Every method body, every `async fn` signature, +every `use` statement inside the impl block stays verbatim. No +`Box::pin(async move { ... })`, no manual `Pin>`, no +`'async_trait` lifetime markers; native async-fn-in-trait does this +directly. + +### Why the impl block does not need an attribute + +`trait_variant::make` generates two things from the source trait +(see the "macro-generated output" subsection above): + +1. The source trait itself (`LocalEmbedder`), with native async fns. +2. A second trait (`Embedder`) whose method signatures return + `impl Future + Send` instead of `impl Future`, + plus a blanket impl wiring concrete types through. + +Both are emitted at the macro-call site. `FastembedEmbedder` writes one +impl block (against `LocalEmbedder`); the macro-generated blanket +guarantees `FastembedEmbedder: Embedder` for free. The +`Box` cast on `tests/phase_1/embedder_trait.rs:19` keeps +type-checking unchanged. + +--- + +## Call Site Audit + +Verified via, from the phase2 worktree root: + +```bash +grep -rn "async_trait\|LocalEmbedder\|dyn Embedder" crates/ +grep -rn "use.*Embedder" crates/ tests/ --include="*.rs" +grep -rn "Box\|Arc\|&dyn Embedder" crates/ tests/ --include="*.rs" +grep -rn "Box\|Arc\|&dyn LocalEmbedder" crates/ tests/ --include="*.rs" +grep -rn "impl LocalEmbedder for\|impl Embedder for" crates/ tests/ --include="*.rs" +``` + +### Files that reference the trait object form + +Exactly one, test-only: + +| File | Line | Use | Required change | +|--------------------------------------|------|------------------------------------------------------------------------------------------|-----------------| +| `tests/phase_1/embedder_trait.rs` | 3 | `use vestige_core::embedder::{Embedder, FastembedEmbedder};` | None. `Embedder` is the generated Send variant at the same path. | +| `tests/phase_1/embedder_trait.rs` | 19 | `let e: Box = Box::new(FastembedEmbedder::new());` | None. `FastembedEmbedder: LocalEmbedder + Send` -> blanket gives `: Embedder` -> `Box` is well-formed. | + +### Files that import `Embedder` or `LocalEmbedder` by name + +| File | Line | Use | Required change | +|-----------------------------------------------------|------|----------------------------------------------------------------------------------------------------------------|-----------------| +| `crates/vestige-core/src/lib.rs` | 167 | `pub use embedder::{Embedder, EmbedderError, EmbedderResult, FastembedEmbedder, LocalEmbedder};` | None. Both names still resolve. | +| `crates/vestige-core/src/embedder/mod.rs` | 5 | `pub use fastembed::FastembedEmbedder;` | None. | +| `crates/vestige-core/src/embedder/fastembed.rs` | 7 | `use super::{EmbedderError, EmbedderResult, LocalEmbedder};` | None. | +| `tests/phase_1/embedder_trait.rs` | 3 | `use vestige_core::embedder::{Embedder, FastembedEmbedder};` | None. | + +### Files that implement the trait + +| File | Line | Impl | Required change | +|-----------------------------------------------------|------|-----------------------------------------------------------------------|----------------------------------------------| +| `crates/vestige-core/src/embedder/fastembed.rs` | 45 | `impl LocalEmbedder for FastembedEmbedder` (currently `#[async_trait]`) | Drop the `#[async_trait::async_trait]` attr. | + +No other impls exist. There is no test mock implementing `Embedder` or +`LocalEmbedder` anywhere in the workspace. + +### Files that import `async_trait` directly + +After `0001a` lands, only the embedder pair: + +``` +crates/vestige-core/src/embedder/mod.rs:24 (doc comment) +crates/vestige-core/src/embedder/mod.rs:27 (attribute) +crates/vestige-core/src/embedder/mod.rs:56 (doc comment) +crates/vestige-core/src/embedder/fastembed.rs:44 (attribute) +``` + +Plus the Cargo manifest: + +``` +crates/vestige-core/Cargo.toml:119:async-trait = "0.1" +``` + +### Production files that hold a concrete embedder + +`FastembedEmbedder` is constructed and used by concrete name (not via +trait object) in: the dashboard / MCP layer if it needs to embed query +strings ad-hoc. None of those call sites need an edit because the +concrete type is what they hold, and the concrete type is untouched by +this sub-plan. + +### Conclusion + +| Category | Count | +|--------------------------------------------------|-------| +| Production source files modified | 2 | +| Test source files modified | 0 | +| Cargo manifests modified | 1 | +| Production source files importing `Embedder` / `LocalEmbedder` (verified unchanged) | 3 | +| Test source files importing `Embedder` (verified unchanged) | 1 | +| Direct `async_trait` uses outside the embedder module after `0001a` | 0 | + +--- + +## Cargo.toml Change + +From inside `crates/vestige-core/`: + +```bash +cargo rm async-trait +``` + +This removes line 119 of `Cargo.toml` and updates `Cargo.lock` in one +step. Manual editing is acceptable as a fallback if `cargo rm` is +unavailable in the agent environment; in that case, follow up with +`cargo check -p vestige-core` to refresh the lockfile. + +### Verification + +```bash +# Direct dependency must be gone. +grep -rn "async-trait\|async_trait" --include="Cargo.toml" crates/ +# Expected: empty. + +# Transitive presence is permitted (e.g. via a third-party crate). +cargo tree -p vestige-core --depth 2 | grep async-trait +# Expected: empty for the direct edges; if a sub-dependency still pulls +# async-trait transitively, the output may contain it deeper than depth 2, +# which is fine. We only care about removing the DIRECT edge. +``` + +If `cargo tree --depth 2` returns any `async-trait` line, inspect with +`cargo tree -p vestige-core -i async-trait` to see what is pulling it. +Acceptable parents: any third-party crate. Unacceptable parent: anything +under `vestige-*`, which would mean a missed file. + +--- + +## Commit Sequence + +Three commits, each green on +`cargo test -p vestige-core --features embeddings,vector-search` and +`cargo test -p vestige-core --no-default-features`. + +### Commit 1: rewrite LocalEmbedder trait declaration + +- Touches: `crates/vestige-core/src/embedder/mod.rs` only. +- Action: replace lines 21-57 per the "Trait Declaration Rewrite" + section above. Delete the `pub use LocalEmbedder as Embedder;` line. +- Green after: `cargo check -p vestige-core` (the impl block in + `fastembed.rs` still has its `#[async_trait::async_trait]` attribute; + the macro is harmless when applied to a trait that the impl block + targets by path, because async_trait rewrites the impl's async fns + into boxed-future fns whose signatures still match the native-async + declarations after trait_variant lowering, just as it did for the + SQLite intermediate state in `0001a`'s commit 1). + + **Mitigation if check fails between commits 1 and 2:** combine the + two into a single commit. The split is offered for review convenience; + the build must be green after every commit lands. + +### Commit 2: drop `#[async_trait::async_trait]` from FastembedEmbedder impl + +- Touches: `crates/vestige-core/src/embedder/fastembed.rs` only. +- Action: delete line 44 (`#[async_trait::async_trait]`). +- Green after: + - `cargo test -p vestige-core --features embeddings,vector-search`. + - `cargo test -p vestige-core --no-default-features` (the + `#[cfg(not(feature = "embeddings"))]` branches inside the impl now + stand on their own). + - Phase 1 integration test: `cargo test --test embedder_trait + --features embeddings,vector-search`. + +### Commit 3: drop the async-trait dependency + +- Touches: `crates/vestige-core/Cargo.toml` (plus `Cargo.lock` as a + side effect). +- Action: from inside `crates/vestige-core/`, run `cargo rm async-trait`. +- Green after: `cargo build --workspace --all-targets` and + `cargo test --workspace`. +- Final hard ASCII gate: `! grep -rn "async_trait" crates/` must exit + with status 0 (i.e. the inverted grep finds nothing). + +### Combined alternative + +Commits 1 and 2 may fold into a single commit if the per-step split +feels artificial (the patterns are identical to `0001a`'s commits 3 +and 4). Commit 3 (the Cargo.toml removal) should stay separate so the +dependency-removal diff is visible in isolation. + +--- + +## Verification + +Every command runs from the repo root unless noted otherwise. + +```bash +# 1. Vestige-core, default features (embeddings + vector-search). +cargo test -p vestige-core --features embeddings,vector-search + +# 2. Vestige-core, minimal features (no embeddings, no vector-search). +cargo test -p vestige-core --no-default-features + +# 3. Workspace build, all targets (catches any feature-gated regression +# in the vestige-mcp tools tree). +cargo build --workspace --all-targets + +# 4. Whole-workspace test (vestige-mcp 406 tests + vestige-core 352 +# tests per the CLAUDE.md baseline). +cargo test --workspace + +# 5. Phase 1 embedder integration test (the trait-shape contract). +cargo test --test embedder_trait --features embeddings,vector-search + +# 6. Clippy gate, deny warnings (matches Phase 1 PR policy). +cargo clippy --workspace --all-targets --features embeddings,vector-search -- -D warnings + +# 7. Hard ASCII gate -- async_trait must be gone from source. +! grep -rn "async_trait" crates/ +# Inverted grep: exit 0 iff grep found nothing. + +# 8. Hard ASCII gate -- async-trait must be gone from manifests. +! grep -rn "async-trait" --include="Cargo.toml" crates/ + +# 9. Confirm trait_variant attribute is in place at the embedder. +grep -rn "trait_variant::make" crates/vestige-core/src/embedder/ +# Expected: exactly one hit, in embedder/mod.rs. + +# 10. Workspace-wide trait_variant audit (should match the count after +# 0001a -- two hits total, one for storage, one for embedder). +grep -rn "trait_variant::make" crates/vestige-core/src/ +# Expected: two hits. +``` + +Expected outcomes: + +- Command 1: 352 vestige-core tests pass (matches baseline). +- Command 2: smaller test count, all pass. +- Command 3: workspace builds in dev mode for all targets. +- Command 4: 758 total tests pass (matches CLAUDE.md baseline). +- Command 5: `embedder_trait` integration test passes. The + `fastembed_implements_embedder_trait` assertion (`let e: Box = ...`) is the canary; if `trait_variant::make` failed to + emit the `Embedder` Send variant, this fails to compile. +- Command 6: zero clippy warnings. +- Command 7: empty output. `async_trait` is fully gone from source. +- Command 8: empty output. `async-trait` is fully gone from manifests. +- Command 9: one hit. +- Command 10: two hits. + +--- + +## Acceptance Criteria + +A reviewer should be able to check every box: + +- [ ] `crates/vestige-core/src/embedder/mod.rs` declares the embedder + trait with `#[trait_variant::make(Embedder: Send)] pub trait + LocalEmbedder: Sync + 'static`, no `async_trait` attribute, no + `Send` bound on `LocalEmbedder` itself. +- [ ] `crates/vestige-core/src/embedder/mod.rs` no longer contains + `pub use LocalEmbedder as Embedder;`. +- [ ] `crates/vestige-core/src/embedder/fastembed.rs` declares + `impl LocalEmbedder for FastembedEmbedder` with no attribute on + the impl block. +- [ ] `crates/vestige-core/Cargo.toml` does not declare `async-trait` + as a direct dependency. +- [ ] `grep -rn "async_trait" crates/` returns zero hits. +- [ ] `grep -rn "async-trait" --include="Cargo.toml" crates/` returns + zero hits. +- [ ] `grep -rn "trait_variant::make" crates/vestige-core/src/` returns + exactly two hits (storage trait + embedder trait). +- [ ] All 758 workspace tests pass (`cargo test --workspace`). +- [ ] `tests/phase_1/embedder_trait.rs` compiles and passes with the + `Box` cast intact. +- [ ] `cargo clippy --workspace --all-targets --features + embeddings,vector-search -- -D warnings` is clean. +- [ ] No file under `crates/vestige-mcp/` or under + `crates/vestige-core/src/{neuroscience,advanced,consolidation, + codebase,memory,embeddings}/` was modified by this sub-plan. +- [ ] `Cargo.lock` was updated as a side effect of `cargo rm async-trait` + (it must no longer reference `async-trait`). +- [ ] Doc comments on the embedder trait declaration describe + `trait_variant`, not `async_trait`. + +--- + +## Risks and Mitigations + +- **`trait_variant::make` requires returned futures to be `Send` for the + blanket `impl Embedder for T`. If any + `async fn embed`/`embed_batch` body inside `FastembedEmbedder` captures + a non-Send local, the blanket impl fails to type-check.** + Mitigation: the existing impl bodies call `self.inner.embed(text)` / + `self.inner.embed_batch(texts)`, where `inner: EmbeddingService` is + `Send + Sync` (verified in `crates/vestige-core/src/embeddings/mod.rs`). + No `.await` points exist inside the bodies in either feature branch; + the `EmbeddingService::embed` calls are synchronous. The futures are + trivially `Send`. If a future change introduces a non-Send local + (e.g. an `Rc` or a non-Send guard), the blanket impl will surface that + as a compile error at the dyn cast in `tests/phase_1/embedder_trait.rs`, + which is the correct outcome. +- **The macro's blanket impl interacts oddly with the default `signature` + method.** + Mitigation: `signature` is a synchronous method returning + `crate::storage::ModelSignature`, with no `Send` or `async` concerns. + `trait_variant::make` emits it on both variants as-is. The existing + Phase 1 test `signature_matches_memory_store_registry` exercises this + path and is part of the verification step. +- **`Box` cast in `tests/phase_1/embedder_trait.rs` fails + to resolve after the rewrite.** + Mitigation: the rewrite preserves the `Embedder` symbol at the same + module path; only its provenance changes (now generated by + `trait_variant::make` instead of by `pub use LocalEmbedder as + Embedder;`). The macro is specifically designed so that the generated + trait is dyn-compatible at the Send-bound boundary. Verified by the + identical pattern already working for `MemoryStore` after `0001a`. +- **`cargo rm async-trait` updates `Cargo.lock` but accidentally bumps + other crates.** + Mitigation: run `cargo rm async-trait` and then immediately inspect + the resulting `Cargo.lock` diff. The expected diff is the removal of + the `[[package]] name = "async-trait"` block and its hash. Anything + else is a red flag and should be reverted before committing + (`git checkout -- Cargo.lock` then `cargo update -p async-trait + --precise=remove` -- or fall back to manual edit + `cargo check`). +- **A new workspace crate added in parallel with this work declares + `async-trait` and the dependency removal silently re-introduces it + later.** + Mitigation: the verification step `grep -rn "async-trait" + --include="Cargo.toml" crates/` is part of the acceptance criteria; a + rebase that reintroduces the line will fail this gate. +- **MCP server uses `Embedder` somewhere we missed.** + Mitigation: full workspace grep (`grep -rn "Embedder" crates/`) + returns no hits inside `crates/vestige-mcp/` for the trait names; the + MCP layer uses the concrete `EmbeddingService` from + `crates/vestige-core/src/embeddings/` for ad-hoc embedding calls. The + trait surface is purely internal to `vestige-core`. + +--- + +## Out-of-Band Notes + +- **No other workspace crate declares `async-trait` as a direct + dependency.** Verified by + `grep -rn "async-trait" --include="Cargo.toml" crates/` returning + exactly one hit at `crates/vestige-core/Cargo.toml:119`. There is + nothing to clean up in `crates/vestige-mcp/Cargo.toml` or elsewhere. +- **Order matters across the three Phase 1 amendment sub-plans:** + `0001a` (trait rewrite) -> `0001b` (sqlite split) -> `0001c` (this + one, async-trait sunset). Reversing the order is possible in + principle but would force re-editing the embedder rewrite twice and + leaves the `async-trait` dep behind until very late. +- **This sub-plan amends `feat/storage-trait-phase1` (tip 790c0c8 plus + whatever commits `0001a` and `0001b` added).** The branch has not + been opened upstream yet, so amending in place is safe; no force-push + to a public PR. +- **After this sub-plan lands, the branch is reviewed and merged before + Phase 2 sub-plans (`0002a-` through `0002i-`) begin implementation.** + Phase 2 introduces no async-trait usage; the Postgres backend follows + the same `trait_variant::make` pattern (see ADR 0002 D1). +- **`trait-variant` 0.1 stays in `Cargo.toml`.** It is the only crate + this sub-plan keeps; `async-trait` is the only one it removes. + +--- + +## Self-Contained `/goal` Brief + +For a fresh Claude Code session executing this sub-plan without prior +conversation context: + +1. Check out branch `feat/storage-trait-phase1` (or a worktree off + of it after `0001a` and `0001b` are merged into it). +2. Read this file (`docs/plans/0001c-async-trait-sunset.md`) in full. +3. Read `docs/plans/0001a-trait-rewrite.md` sections "Trait declaration + rewrite" and "Impl block migration" -- they document the exact + pattern this sub-plan mirrors for the embedder. +4. Run the prerequisite audit grep listed under "Prerequisites". If it + returns more than the five hits documented there, stop and report; + the upstream state does not match what this sub-plan assumes. +5. Execute Commit 1 (rewrite `embedder/mod.rs`), then Commit 2 (drop + the attribute on the FastembedEmbedder impl), then Commit 3 + (`cargo rm async-trait`). Run the verification commands listed + above after each commit; do not proceed if any test or clippy gate + fails. +6. Verify every box in "Acceptance Criteria" is ticked. +7. Report file paths touched, test counts, and the final two grep + results (commands 7 and 8 from "Verification") in the closing + message.