mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-18 02:24:27 +02:00
Index materialization is derived state: defer off the write path, reconcile via optimize (iss-848) (#246)
* test(engine): reproduce empty-table Vector @index aborting schema apply
A Vector (IVF) index trains k-means centroids over the column, so Lance
cannot build it on 0 vectors ("Creating empty vector indices with
train=False is not yet implemented"). schema apply reconciles a table's
whole index set whenever any @index on it changes, so adding an unrelated
scalar @index materializes the dormant empty vector index and aborts the
entire migration (all-or-nothing).
This regression test inits a 0-row Doc with a Vector @index, adds a scalar
@index, and asserts the apply succeeds (then loads one embedded row and
asserts the deferred index materializes). It fails today at the apply step
with the vector-index abort; the fix lands in the next commit.
Refs dev-graph iss-empty-vector-index-schema-apply, iss-848.
* fix(engine): defer Vector @index on an empty table instead of aborting schema apply
build_indices_on_dataset_for_catalog materialized a declared Vector @index
unconditionally. On a 0-row table Lance cannot train the IVF index
("Creating empty vector indices with train=False is not yet implemented"),
so any later migration that touches the table (e.g. adding an unrelated
scalar @index, which reconciles the table's whole index set) aborted the
entire migration on the dormant vector index — all-or-nothing.
Guard the vector arm with a row-count check, matching the guard
ensure_indices_for_branch and the branch-merge rebuild already use: an
untrainable column becomes a pending index that a later ensure_indices /
optimize materializes once the table has rows. Reads stay correct meanwhile
(vector search degrades to a brute-force scan).
Stop-gap: the residual rows-present-but-vectors-null window and the full
decoupling (intent recorded at apply, an idempotent coverage reconciler)
are dev-graph iss-848. Turns the green half of the regression test added in
the previous commit.
Refs dev-graph iss-empty-vector-index-schema-apply, iss-848, iss-687.
* docs(invariants): record the logical-contract-over-physical-state principle
The bug class behind the empty-table vector-index abort (and the schema-apply
vs optimize version drift) is one shape: a physical operation allowed to fail
a logical one. Several hard invariants (2, 5, 7, 13) and deny-list items are
already instances of this, but the unifying rule was never written down.
Add it to docs/dev/invariants.md as a "Governing principle" section above the
hard invariants, naming which invariants and deny-list items instantiate it
and the smell to watch for (a logical operation gated on a physical fact).
Add a one-line always-on rule (7) in AGENTS.md so it stays in working memory,
with the qualifier that genuine logical conflicts still fail loudly — the
licence to lag covers physical convergence, not correctness.
Audience-neutral: no private ticket refs. check-agents-md.sh passes.
* test(engine): index build must tolerate rows with null vectors (load-before-embed)
Loading rows whose vector column is null into a `Vector @index` table fails
today: build_indices (reached via the loader's prepare_updates_for_commit)
calls create_vector_index, and Lance's IVF KMeans errors "cannot train 1
centroids with 0 vectors". The same abort hits ensure_indices/optimize/schema
apply/merge, since they all funnel through build_indices_on_dataset_for_catalog.
This test loads two null-embedding rows and calls ensure_indices; it must not
abort (the untrainable vector column is deferred, sibling indexes still build).
Fails today at the load step; fixed in the next commit.
Refs dev-graph iss-848, iss-empty-vector-index-schema-apply.
* fix(engine): defer unbuildable index columns instead of aborting the write path
build_indices_on_dataset_for_catalog is the chokepoint every write path funnels
through (load/mutate via prepare_updates_for_commit, schema apply, ensure_indices,
optimize, branch merge). Its vector arm called create_vector_index
unconditionally, so a column with no trainable vectors yet — an empty table, or
rows loaded before `embed` populates them — aborted the whole operation with
Lance's IVF KMeans error.
Fault-isolate the vector build: on failure, record the column as a PendingIndex
(table, column, reason), log it, and continue building the sibling indexes; a
later ensure_indices/optimize materializes it once the column is trainable, and
reads use brute-force meanwhile. Manifest/CAS/IO errors at the publish boundary
still propagate. Isolating at the single chokepoint realizes the governing
principle (physical index state never fails a logical operation) for every write
path, and supersedes the earlier symptomatic count_rows==0 stop-gap (removed) —
closing the residual rows-present-but-vectors-null window it left open.
Surfacing pending index status rather than failing is the database norm
(Postgres indisvalid, LanceDB list_indices). ensure_indices and the build_indices
wrappers now return Vec<PendingIndex>; optimize surfaces it in a later commit.
Refs dev-graph iss-848, iss-951 (vector index stays inline-commit until lance#6666).
* test(engine): index-only schema apply must not touch table data
Adding an @index to an existing column should be a pure metadata change once
index materialization moves to the reconciler (iss-848): the apply records the
intent in the catalog/IR but builds nothing inline, so the table's manifest
version is unchanged. Today the indexed_tables block builds the index inline
and bumps the version (4 -> 5). Fixed in the next commit.
Refs dev-graph iss-848.
* fix(engine): schema apply records index intent only; index-only apply is metadata
Schema apply no longer builds indexes inline. The four build_indices calls
(added/renamed/rewritten/index-only tables) are removed; the @index/@key intent
is already persisted in the catalog/IR the apply writes, and the physical index
is materialized off the critical path by ensure_indices/optimize (iss-848).
Concretely:
- AddConstraint (an @index addition — every other added constraint plans as
UnsupportedChange) becomes a pure metadata step alongside the metadata-only
steps: it touches no table data, so the table version is unchanged.
- added/renamed/rewritten tables still write their data; only the trailing
index build is gone. The rewritten table's coverage is restored later by
optimize_indices.
- recovery_pins drops index-only tables (they no longer advance Lance HEAD) and
keeps rewritten tables; their post_commit_pin = expected+1 is now exact (one
rewrite commit), strengthening recovery classification.
- the now-orphaned Omnigraph::build_indices_on_dataset_for_catalog wrapper is
removed.
A migration can no longer abort on an index build, for any index type at any
cardinality. Turns the green half of index_only_constraint_apply_touches_no_table_data.
Refs dev-graph iss-848.
* test(engine): optimize must converge a declared-but-unbuilt index
After iss-848, adding an @index post-data is a metadata-only apply that defers
the physical build, so the column is declared-indexed but unbuilt (reads scan).
`optimize` — the operator's cron reconciler — must materialize it. Today optimize
only maintains coverage of EXISTING indexes (optimize_indices) and never creates
missing ones, so the rank BTREE stays Degraded after optimize. Fixed next commit.
Refs dev-graph iss-848.
* fix(engine): optimize materializes declared-but-unbuilt indexes (the reconciler)
`omnigraph optimize` is the operator's cron reconciler. It already compacts and
folds new fragments into EXISTING indexes (optimize_indices); now it also builds
declared-but-missing indexes, so the indexes schema apply / load defer (iss-848)
converge on the next optimize.
Done inside optimize_one_table (not by composing the all-tables ensure_indices,
which is drift-blind and would re-publish the uncovered HEAD>manifest drift that
optimize deliberately skips): after the per-table drift/blob skips and under the
queue + Optimize sidecar already held, a needs_index_create gate (reusing
needs_index_work_node/edge — "declared index missing AND row_count > 0", so empty
tables stay no-ops) admits index-only work, and Phase B builds the missing index
over the just-compacted layout via the build chokepoint. An untrainable vector
column fault-isolates into the new TableOptimizeStats.pending_indexes (the
list_indices/indisvalid analog operators read), not a failure. committed now
reflects index commits, so the existing post-publish cache invalidation covers
them. LanceDB's optimize only maintains existing indexes; creating
declared-but-missing ones is the L2 behavior omnigraph's declarative @index needs.
Turns the green half of optimize_materializes_index_declared_but_unbuilt.
Refs dev-graph iss-848.
* docs: index materialization is deferred to the reconciler (iss-848)
Update the index-lifecycle docs to reflect the new contract: @index/@key
declares intent and the physical index is derived state that never fails a
logical operation. Schema apply builds nothing (records intent only);
load/mutate build inline through one chokepoint that defers an untrainable
Vector column as pending; optimize/ensure_indices is the reconciler that
creates declared-but-missing indexes and maintains coverage, reporting
still-pending columns.
Touches: dev/invariants.md (truth-matrix Index-lifecycle row), AGENTS.md
(capability matrix), user/search/indexes.md (L2 orchestration), user/operations/
maintenance.md (optimize reconciler bullet), dev/testing.md (new tests).
* test(server): schema_apply_route_can_add_index reflects deferred index build
iss-848 made schema apply record @index intent without building the physical
index inline. The route test asserted the index count increased after apply;
on an empty graph it now stays unchanged (the build is deferred to
ensure_indices/optimize). Assert the new contract: apply succeeds and the
physical index count is unchanged.
* fix(engine): precheck vector trainability — don't pin or swallow (PR review)
Two issues Cursor Bugbot caught in the chokepoint fault-isolation:
1. (HIGH) Pending vector pins roll back siblings. needs_index_work_node counted
a missing vector index as work whenever the table had rows, so a column with
no trainable vectors got pinned in the EnsureIndices recovery sidecar — but
the build deferred it (zero commit). On a crash before manifest publish the
classifier sees NoMovement and the all-or-nothing decision (recovery.rs
decide()) rolls back the WHOLE sidecar, undoing a sibling table's committed
index work.
2. (MED) Vector build swallowed fatal errors. The match arm converted every
create_vector_index error into a deferred PendingIndex, hiding genuine
I/O/manifest/Lance failures as "pending".
Fix both with one trainability precheck (vector_column_trainable: >=1 non-null
vector, the ivf_flat(1) minimum) used identically by needs_index_work_node and
the build arm: an untrainable column is never counted as work (so never pinned —
no zero-commit pin) and never attempted (so it can't fail); only a trainable
column is built, and then any error PROPAGATES (stays fatal). The deferred
column is still recorded as a PendingIndex with a clear reason.
Refs dev-graph iss-848.
* feat(cli): surface pending index column + reason in optimize output (PR review)
Codex (P2): pending_indexes was documented as visible in `optimize --json` but
the CLI projection never emitted it — operators would lose the only signal that
optimize has deferred index work. Greptile (P2): the stat dropped the reason, so
operators saw which column was stuck, not why.
Carry the reason: TableOptimizeStats.pending_indexes is now Vec<PendingIndex>
(column + reason), and `omnigraph optimize --json` emits {column, reason} per
pending index; human output prints a "↳ index pending on '<col>': <reason>" line.
Refs dev-graph iss-848.
* test: align CLI index-add test with deferred build; cover post-rename reconcile
- schema_apply_json_adds_index_for_existing_property (cli_schema_config.rs): the
CLI analog of the server test — asserted the index count grew after apply;
under iss-848 the apply defers the build, so the count is unchanged on an
empty graph. Assert the deferred contract. (The only full-suite failure.)
- optimize_materializes_index_after_type_rename (maintenance.rs, new): covers
the gap Greptile flagged — a RenameType writes the renamed table with rows but
no indexes (inline build removed in Commit B); assert the rank index is
Degraded post-rename and Indexed after optimize reconciles it.
Refs dev-graph iss-848.
* test(engine): in-source apply tests reflect deferred index materialization
The two db::omnigraph in-source unit tests asserted the old "schema apply builds
/ preserves indexes inline" behavior (the only remaining full-suite failures):
- test_apply_schema_defers_index_then_reconciler_builds_it (was
test_apply_schema_adds_index_for_existing_property): apply records the @index
intent but builds nothing; assert the BTREE on `age` is absent after apply and
present after ensure_indices. (Uses `age`, unindexed in TEST_SCHEMA — `name
@key` is already FTS-indexed at seed.)
- test_apply_schema_rewrite_defers_index_then_reconciler_restores (was
test_apply_schema_rewrite_preserves_existing_indices): an AddProperty rewrite
no longer rebuilds indexes inline; assert ensure_indices restores id BTREE +
name FTS after the rewrite.
Verified by grep that these + the server/CLI tests are the complete set of
"apply builds an index" assertions; all other index-presence tests run after
load/ensure_indices/primitives, which still build.
Refs dev-graph iss-848.
* fix(engine): optimize always reports pending indexes, not only on create-work (PR review)
Cursor Bugbot (MED): pending_indexes was filled only when needs_index_create was
true, but the vector trainability precheck makes needs_index_work_node exclude an
untrainable Vector column. So a table whose sole missing index is untrainable, but
which optimize still compacts or reindexes, returned an empty pending_indexes —
contradicting the documented operator contract for deferred columns.
Run the (idempotent) build chokepoint unconditionally once past the no-op gate,
rather than gating it on needs_index_create. It skips existing indexes, builds
any buildable missing one, and reports an untrainable column as pending whether
the table entered for compaction, reindex, or index creation. needs_index_create
still gates the no-op decision (so an index-only table still enters the path).
Refs dev-graph iss-848.
* test(engine): reframe staged-BTREE-failure failpoint onto the reconciler path
ensure_indices_stage_btree_failure_leaves_existing_tables_writable fired
`ensure_indices.post_stage_pre_commit_btree` and expected `apply_schema` (adding
a type) to fail mid-BTREE-build. iss-848 removed apply's inline index build, so
that apply now succeeds and the test's unwrap_err panicked — it exercised a
removed code path.
Reframe onto where BTREE builds happen now: seed Person, add an `@index` on
`age` (apply records intent, defers the build), then `ensure_indices` builds the
deferred BTREE and the failpoint fires between stage and commit. Person's HEAD
is unchanged (no drift) and its EnsureIndices sidecar pins NoMovement; a write to
a different, unpinned table (Company) is unaffected (mutations/loads heal
roll-forward and proceed, unlike optimize/repair which refuse on a pending
sidecar). Preserves the original coverage (staged-index stage failure leaves
other tables writable, no drift) in the new architecture.
Refs dev-graph iss-848.
* feat(server): converge deferred indexes promptly after schema apply (iss-848)
Schema apply records @index intent but defers the physical build. On a
long-lived server, spawn a detached best-effort ensure_indices after a
successful apply so the indexes converge promptly instead of waiting for the
operator's next optimize. Fire-and-forget: it never blocks or fails the apply
response, and a failure is logged (the index still converges on the next
optimize). Guarded on result.applied. The CLI is one-shot, so it has no
equivalent; its convergence path is the optimize cadence.
handle.engine is already an Arc, so the spawn takes an owned clone. Convergence
itself is covered by the engine ensure_indices/optimize tests; the existing
empty-graph schema-apply route tests confirm the response is unaffected (the
spawn is a read-only no-op on an empty table).
Refs dev-graph iss-848.
* docs(maintenance): list pending_indexes in optimize per-table stats (consistency)
This commit is contained in:
parent
6f3e0e3157
commit
b183db078f
17 changed files with 622 additions and 181 deletions
|
|
@ -144,6 +144,7 @@ These are architectural rules that need to be in scope on every change. They're
|
|||
4. **Bearer-token plaintext never persists in process memory.** Tokens are hashed at startup; auth uses constant-time comparison; the actor id is server-resolved from the hash match and must not be settable by the client.
|
||||
5. **Reads always see the current index state for the branch they're reading.** Indexes track the branch head, not historical snapshots. If you change index lifecycle, preserve this guarantee.
|
||||
6. **Stable type IDs survive renames.** Schema migration relies on identity that's stable across rename — don't mint new IDs on rename.
|
||||
7. **Logical contract over physical state.** Physical state (index coverage, fragment layout, compaction versions, staged writes) is derived and rebuildable; it must never fail a logical operation. Check preconditions against logical state and let reconciliation converge the physical state idempotently — genuine logical conflicts still fail loudly. This is the rule rules 1–6 instantiate; full statement and applications in [docs/dev/invariants.md](docs/dev/invariants.md).
|
||||
|
||||
### Deny-list (fast-pass review filter — full reasoning in [docs/dev/invariants.md](docs/dev/invariants.md))
|
||||
|
||||
|
|
@ -250,7 +251,7 @@ omnigraph policy explain --actor act-alice --action change --branch main
|
|||
| Compaction (`compact_files`) + reindex (`optimize_indices`) | ✅ | `omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency; per table runs `compact_files` **then Lance `optimize_indices`** (folds appended/rewritten fragments back into existing indexes — incremental merge, not retrain) and **publishes the resulting version to `__manifest`** (so the manifest tracks the Lance HEAD — required for reads to observe the work and for schema apply / strict writes to pass their HEAD-vs-manifest precondition), under the per-`(table, main)` write queue with `SidecarKind::Optimize` recovery coverage spanning both ops; **commits even with no compaction work if index coverage is stale**; **refuses on an unrecovered graph**; **skips uncovered HEAD > manifest drift** with `DriftNeedsRepair`; **skips blob-bearing tables** (reported via `TableOptimizeStats.skipped`, not silent; reindex is skipped for them too today), gated on `LANCE_SUPPORTS_BLOB_COMPACTION` until the upstream blob-v2 compaction-decode bug is fixed (see [docs/dev/invariants.md](docs/dev/invariants.md) Known Gaps) |
|
||||
| Repair uncovered drift | — | `omnigraph repair` explicitly classifies uncovered table `HEAD > manifest` drift: verified maintenance drift (`ReserveFragments`/`Rewrite`) can be published with `--confirm`; suspicious or unverifiable drift requires `--force --confirm`. Sidecar-covered crash residuals still recover automatically on open. |
|
||||
| Cleanup (`cleanup_old_versions`) | ✅ | `omnigraph cleanup` with `--keep` / `--older-than` policy |
|
||||
| BTREE / inverted (FTS) / vector indexes | ✅ | `ensure_indices` builds them per `@index`/`@key` column, dispatched by type via `node_prop_index_kind` (enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector); idempotent; lazy across branches. Coverage of fragments appended after build is restored by `optimize`'s `optimize_indices` pass (see Compaction row). |
|
||||
| BTREE / inverted (FTS) / vector indexes | ✅ | `@index`/`@key` declares intent; the physical index is derived state that never fails a logical op. Built per column through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector); idempotent; lazy across branches. **Schema apply builds nothing** (records intent only); `load`/`mutate` build inline but **defer an untrainable Vector column** (no trainable vectors yet) as *pending* rather than aborting. `ensure_indices`/`optimize` is the reconciler that materializes declared-but-missing indexes and restores coverage of appended/rewritten fragments (`optimize_indices`), reporting still-pending columns (see Compaction row). |
|
||||
| `merge_insert` upsert | ✅ | `LoadMode::Merge`, mutation `update`/`insert`/`delete` lowering |
|
||||
| Vector search | ✅ | `nearest()` query op; embedding pipeline (Gemini / OpenAI clients); `@embed` in schema |
|
||||
| Full-text search | ✅ | `search/fuzzy/match_text/bm25` query ops |
|
||||
|
|
|
|||
|
|
@ -770,6 +770,10 @@ async fn main() -> Result<()> {
|
|||
"skipped": s.skipped.map(|r| r.as_str()),
|
||||
"manifest_version": s.manifest_version,
|
||||
"lance_head_version": s.lance_head_version,
|
||||
"pending_indexes": s.pending_indexes.iter().map(|p| serde_json::json!({
|
||||
"column": p.column,
|
||||
"reason": p.reason,
|
||||
})).collect::<Vec<_>>(),
|
||||
})).collect::<Vec<_>>(),
|
||||
});
|
||||
print_json(&value)?;
|
||||
|
|
@ -786,6 +790,9 @@ async fn main() -> Result<()> {
|
|||
} else {
|
||||
println!(" {:<40} no-op", s.table_key);
|
||||
}
|
||||
for p in &s.pending_indexes {
|
||||
println!(" ↳ index pending on '{}': {}", p.column, p.reason);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
|
|||
|
|
@ -334,7 +334,13 @@ fn schema_apply_json_adds_index_for_existing_property() {
|
|||
let dataset = snapshot.open("node:Person").await.unwrap();
|
||||
dataset.load_indices().await.unwrap().len()
|
||||
});
|
||||
assert!(after_index_count > before_index_count);
|
||||
// iss-848: `schema apply` records the `@index` intent but defers the physical
|
||||
// index build (materialized later by ensure_indices/optimize; on this empty
|
||||
// table nothing builds anyway). So the physical index count is unchanged.
|
||||
assert_eq!(
|
||||
after_index_count, before_index_count,
|
||||
"schema apply records @index intent but defers the physical build (iss-848)"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
|
|
|
|||
|
|
@ -1196,6 +1196,25 @@ pub(crate) async fn server_schema_apply(
|
|||
.await
|
||||
.map_err(ApiError::from_omni)?
|
||||
};
|
||||
// Prompt index convergence (iss-848): schema apply records `@index` intent
|
||||
// but defers the physical build. On a long-lived server, materialize it
|
||||
// promptly rather than waiting for the next `optimize` cron — spawned
|
||||
// detached so it never blocks or fails the apply response. Best-effort: a
|
||||
// failure is logged and the index still converges on the next optimize.
|
||||
// The CLI is one-shot, so it has no equivalent; its convergence path is the
|
||||
// operator's optimize cadence.
|
||||
if result.applied {
|
||||
let engine = Arc::clone(&handle.engine);
|
||||
tokio::spawn(async move {
|
||||
if let Err(err) = engine.ensure_indices().await {
|
||||
tracing::warn!(
|
||||
target: "omnigraph::server",
|
||||
error = %err,
|
||||
"post-apply ensure_indices failed; indexes will converge on the next optimize",
|
||||
);
|
||||
}
|
||||
});
|
||||
}
|
||||
Ok(Json(schema_apply_output(handle.uri.as_str(), result)))
|
||||
}
|
||||
|
||||
|
|
|
|||
|
|
@ -294,6 +294,11 @@ async fn schema_apply_route_can_add_index() {
|
|||
|
||||
assert_eq!(status, StatusCode::OK);
|
||||
assert_eq!(payload["applied"], true);
|
||||
// iss-848: the /schema/apply route accepts the index-add and applies it as a
|
||||
// metadata change — it records the `@index` intent in the catalog/IR but does
|
||||
// NOT build the physical index inline (the build is deferred to
|
||||
// ensure_indices/optimize; on this empty table nothing would build anyway).
|
||||
// So the physical index count is unchanged by the apply.
|
||||
let reopened = Omnigraph::open(graph.to_str().unwrap()).await.unwrap();
|
||||
let snapshot = reopened
|
||||
.snapshot_of(ReadTarget::branch("main"))
|
||||
|
|
@ -301,7 +306,10 @@ async fn schema_apply_route_can_add_index() {
|
|||
.unwrap();
|
||||
let dataset = snapshot.open("node:Person").await.unwrap();
|
||||
let after_index_count = dataset.load_indices().await.unwrap().len();
|
||||
assert!(after_index_count > before_index_count);
|
||||
assert_eq!(
|
||||
after_index_count, before_index_count,
|
||||
"schema apply records @index intent but defers the physical build (iss-848)"
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
|
|
|
|||
|
|
@ -11,9 +11,9 @@ pub use graph_coordinator::{GraphCoordinator, ReadTarget, ResolvedTarget, Snapsh
|
|||
pub use manifest::{Snapshot, SubTableEntry, SubTableUpdate};
|
||||
pub(crate) use omnigraph::ensure_public_branch_ref;
|
||||
pub use omnigraph::{
|
||||
CleanupPolicyOptions, InitOptions, MergeOutcome, Omnigraph, OpenMode, RepairAction,
|
||||
RepairClassification, RepairOptions, RepairStats, SchemaApplyOptions, SchemaApplyResult,
|
||||
SkipReason, TableCleanupStats, TableOptimizeStats, TableRepairStats,
|
||||
CleanupPolicyOptions, InitOptions, MergeOutcome, Omnigraph, OpenMode, PendingIndex,
|
||||
RepairAction, RepairClassification, RepairOptions, RepairStats, SchemaApplyOptions,
|
||||
SchemaApplyResult, SkipReason, TableCleanupStats, TableOptimizeStats, TableRepairStats,
|
||||
};
|
||||
|
||||
pub(crate) const SCHEMA_APPLY_LOCK_BRANCH: &str = "__schema_apply_lock__";
|
||||
|
|
|
|||
|
|
@ -40,6 +40,7 @@ pub use repair::{
|
|||
RepairAction, RepairClassification, RepairOptions, RepairStats, TableRepairStats,
|
||||
};
|
||||
pub use schema_apply::SchemaApplyOptions;
|
||||
pub use table_ops::PendingIndex;
|
||||
|
||||
use super::commit_graph::GraphCommit;
|
||||
use super::manifest::{
|
||||
|
|
@ -1069,11 +1070,15 @@ impl Omnigraph {
|
|||
/// unbranched subtables keep inheriting `main`, while subtables inherited
|
||||
/// from an ancestor branch are first forked into the active branch before
|
||||
/// their index metadata is updated.
|
||||
pub async fn ensure_indices(&self) -> Result<()> {
|
||||
/// Returns the declared indexes that could not be materialized on this
|
||||
/// pass (today: vector columns with no trainable vectors yet). They are
|
||||
/// deferred, not errors; a later `ensure_indices`/`optimize` builds them
|
||||
/// once the column is trainable. Reads stay correct (brute-force) meanwhile.
|
||||
pub async fn ensure_indices(&self) -> Result<Vec<PendingIndex>> {
|
||||
table_ops::ensure_indices(self).await
|
||||
}
|
||||
|
||||
pub async fn ensure_indices_on(&self, branch: &str) -> Result<()> {
|
||||
pub async fn ensure_indices_on(&self, branch: &str) -> Result<Vec<PendingIndex>> {
|
||||
table_ops::ensure_indices_on(self, branch).await
|
||||
}
|
||||
|
||||
|
|
@ -1530,19 +1535,10 @@ impl Omnigraph {
|
|||
&self,
|
||||
table_key: &str,
|
||||
ds: &mut SnapshotHandle,
|
||||
) -> Result<()> {
|
||||
) -> Result<Vec<PendingIndex>> {
|
||||
table_ops::build_indices_on_dataset(self, table_key, ds).await
|
||||
}
|
||||
|
||||
pub(crate) async fn build_indices_on_dataset_for_catalog(
|
||||
&self,
|
||||
catalog: &Catalog,
|
||||
table_key: &str,
|
||||
ds: &mut SnapshotHandle,
|
||||
) -> Result<()> {
|
||||
table_ops::build_indices_on_dataset_for_catalog(self, catalog, table_key, ds).await
|
||||
}
|
||||
|
||||
// Used only by in-tree tests (`#[cfg(test)]`); the runtime path now
|
||||
// uses `commit_updates_on_branch_with_expected` exclusively.
|
||||
#[cfg(test)]
|
||||
|
|
@ -2498,25 +2494,49 @@ edge WorksAt: Person -> Company
|
|||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_apply_schema_adds_index_for_existing_property() {
|
||||
async fn test_apply_schema_defers_index_then_reconciler_builds_it() {
|
||||
// iss-848: schema apply records the @index intent but builds nothing
|
||||
// inline; a later ensure_indices materializes it once the table has
|
||||
// rows. (Use `age`, which is unindexed in TEST_SCHEMA — `name @key` is
|
||||
// already FTS-indexed at seed, so it can't show the deferral.)
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
let uri = dir.path().to_str().unwrap();
|
||||
let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
|
||||
seed_person_row(&mut db, "Alice", Some(30)).await;
|
||||
|
||||
let desired = TEST_SCHEMA.replace("name: String @key", "name: String @key @index");
|
||||
let desired = TEST_SCHEMA.replace("age: I32?", "age: I32? @index");
|
||||
db.apply_schema(&desired).await.unwrap();
|
||||
|
||||
// Apply built nothing — the BTREE on `age` is deferred.
|
||||
let snapshot = db.snapshot().await;
|
||||
let ds = db
|
||||
.storage()
|
||||
.open_snapshot_at_table(&snapshot, "node:Person")
|
||||
.await
|
||||
.unwrap();
|
||||
assert!(db.storage().has_fts_index(&ds, "name").await.unwrap());
|
||||
assert!(
|
||||
!db.storage().has_btree_index(&ds, "age").await.unwrap(),
|
||||
"apply must not build the index inline (deferred to the reconciler)"
|
||||
);
|
||||
|
||||
// The reconciler materializes it (Person has a row).
|
||||
db.ensure_indices().await.unwrap();
|
||||
let snapshot = db.snapshot().await;
|
||||
let ds = db
|
||||
.storage()
|
||||
.open_snapshot_at_table(&snapshot, "node:Person")
|
||||
.await
|
||||
.unwrap();
|
||||
assert!(
|
||||
db.storage().has_btree_index(&ds, "age").await.unwrap(),
|
||||
"ensure_indices must build the deferred index"
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn test_apply_schema_rewrite_preserves_existing_indices() {
|
||||
async fn test_apply_schema_rewrite_defers_index_then_reconciler_restores() {
|
||||
// iss-848: an AddProperty rewrite writes a new dataset version without
|
||||
// rebuilding indexes inline (deferred); ensure_indices restores them.
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
let uri = dir.path().to_str().unwrap();
|
||||
let initial_schema = TEST_SCHEMA.replace("name: String @key", "name: String @key @index");
|
||||
|
|
@ -2529,6 +2549,8 @@ edge WorksAt: Person -> Company
|
|||
);
|
||||
db.apply_schema(&desired).await.unwrap();
|
||||
|
||||
// After the rewrite the reconciler restores index coverage.
|
||||
db.ensure_indices().await.unwrap();
|
||||
let snapshot = db.snapshot().await;
|
||||
let ds = db
|
||||
.storage()
|
||||
|
|
|
|||
|
|
@ -140,6 +140,12 @@ pub struct TableOptimizeStats {
|
|||
/// Lance HEAD version observed by optimize for drift skips. `None` for
|
||||
/// normal compaction/no-op/blob skips.
|
||||
pub lance_head_version: Option<u64>,
|
||||
/// Declared `@index` columns on this table the reconciler could not build
|
||||
/// this run, each with the `reason` (today: a vector column with no
|
||||
/// trainable vectors yet). Empty on the common path. Reported, not fatal — a
|
||||
/// later `optimize` retries; the `list_indices`/`indisvalid` analog so
|
||||
/// operators can see which index is pending and why.
|
||||
pub pending_indexes: Vec<super::PendingIndex>,
|
||||
}
|
||||
|
||||
impl TableOptimizeStats {
|
||||
|
|
@ -153,6 +159,7 @@ impl TableOptimizeStats {
|
|||
skipped: None,
|
||||
manifest_version: None,
|
||||
lance_head_version: None,
|
||||
pending_indexes: Vec::new(),
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -166,6 +173,7 @@ impl TableOptimizeStats {
|
|||
skipped: Some(reason),
|
||||
manifest_version: None,
|
||||
lance_head_version: None,
|
||||
pending_indexes: Vec::new(),
|
||||
}
|
||||
}
|
||||
|
||||
|
|
@ -183,6 +191,7 @@ impl TableOptimizeStats {
|
|||
skipped: Some(SkipReason::DriftNeedsRepair),
|
||||
manifest_version: Some(manifest_version),
|
||||
lance_head_version: Some(lance_head_version),
|
||||
pending_indexes: Vec::new(),
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
@ -371,14 +380,26 @@ async fn optimize_one_table(
|
|||
let will_compact = plan.num_tasks() > 0;
|
||||
// Even when there is nothing to compact, the table may still have index
|
||||
// work: rows appended since the index was built (e.g. via `ingest --mode
|
||||
// merge`) are scanned unindexed until folded in. Either compaction or stale
|
||||
// index coverage is enough to enter the publish path. If NEITHER, this
|
||||
// table is a no-op and must NOT be pinned in a sidecar — a zero-commit pin
|
||||
// classifies NoMovement on recovery and forces an all-or-nothing rollback
|
||||
// of sibling tables' legitimate work. Uncovered pre-existing manifest/HEAD
|
||||
// drift is skipped above and must go through explicit repair.
|
||||
// merge`) are scanned unindexed until folded in (needs_reindex), OR a
|
||||
// declared `@index` was never built — schema apply records the intent but
|
||||
// defers the physical build (iss-848), so optimize is the operator-facing
|
||||
// reconciler that materializes it (needs_index_create). Any of the three is
|
||||
// enough to enter the publish path. If NONE, this table is a no-op and must
|
||||
// NOT be pinned in a sidecar — a zero-commit pin classifies NoMovement on
|
||||
// recovery and forces an all-or-nothing rollback of sibling tables'
|
||||
// legitimate work. Uncovered pre-existing manifest/HEAD drift is skipped
|
||||
// above and goes through explicit repair, so this only runs on a healthy
|
||||
// table under the per-table queue + sidecar.
|
||||
let needs_reindex = TableStore::has_unindexed_fragments(&ds).await?;
|
||||
if !will_compact && !needs_reindex {
|
||||
// needs_index_work_* checks "a declared index is missing AND row_count > 0",
|
||||
// so empty tables stay no-ops (never pinned). It re-reads the head under the
|
||||
// queue we already hold, so it is consistent with `ds`.
|
||||
let needs_index_create = if let Some(type_name) = table_key.strip_prefix("node:") {
|
||||
super::table_ops::needs_index_work_node(db, type_name, &table_key, &full_path, None).await?
|
||||
} else {
|
||||
super::table_ops::needs_index_work_edge(db, &table_key, &full_path, None).await?
|
||||
};
|
||||
if !will_compact && !needs_reindex && !needs_index_create {
|
||||
return Ok(TableOptimizeStats::compacted(
|
||||
table_key,
|
||||
&CompactionMetrics::default(),
|
||||
|
|
@ -427,7 +448,30 @@ async fn optimize_one_table(
|
|||
ds.optimize_indices(&OptimizeOptions::default())
|
||||
.await
|
||||
.map_err(|e| OmniError::Lance(format!("optimize_indices on {}: {}", table_key, e)))?;
|
||||
let version_after = ds.version().version;
|
||||
|
||||
// Materialize any declared-but-missing index over the just-compacted layout,
|
||||
// reusing the build chokepoint (idempotent: skips existing indexes; fault-
|
||||
// isolates an untrainable vector column into `pending` rather than failing).
|
||||
// Run it UNCONDITIONALLY now that we are past the no-op gate — not only when
|
||||
// `needs_index_create`. A table can enter this path for compaction or
|
||||
// reindex while its sole missing index is an untrainable Vector column
|
||||
// (which `needs_index_work_*` does not count as buildable work); calling the
|
||||
// build here is what surfaces that column in `pending_indexes`, so optimize
|
||||
// can't compact a table yet silently drop the deferred-index signal.
|
||||
// Idempotent + cheap when there is nothing to build. Vector index creation
|
||||
// is an inline-commit residual; the Optimize sidecar's loose post_commit_pin
|
||||
// covers the extra commits.
|
||||
let catalog = db.catalog();
|
||||
let mut snapshot = crate::storage_layer::SnapshotHandle::new(ds);
|
||||
let pending_indexes: Vec<super::PendingIndex> =
|
||||
super::table_ops::build_indices_on_dataset_for_catalog(
|
||||
db,
|
||||
&catalog,
|
||||
&table_key,
|
||||
&mut snapshot,
|
||||
)
|
||||
.await?;
|
||||
let version_after = snapshot.dataset().version().version;
|
||||
let committed = version_after != version_before;
|
||||
|
||||
// Pin the per-writer Phase B → Phase C residual for optimize: Lance HEAD has
|
||||
|
|
@ -438,9 +482,6 @@ async fn optimize_one_table(
|
|||
// expected = the version observed under the queue). On failure the sidecar
|
||||
// is intentionally left for the open-time recovery sweep to roll forward.
|
||||
if committed {
|
||||
// Re-wrap the post-compaction dataset to read its state through the
|
||||
// trait surface (`table_state` is a read; no HEAD advance).
|
||||
let snapshot = crate::storage_layer::SnapshotHandle::new(ds);
|
||||
let state = db.storage().table_state(&full_path, &snapshot).await?;
|
||||
let update = crate::db::SubTableUpdate {
|
||||
table_key: table_key.clone(),
|
||||
|
|
@ -467,7 +508,9 @@ async fn optimize_one_table(
|
|||
);
|
||||
}
|
||||
|
||||
Ok(TableOptimizeStats::compacted(table_key, &metrics, committed))
|
||||
let mut stat = TableOptimizeStats::compacted(table_key, &metrics, committed);
|
||||
stat.pending_indexes = pending_indexes;
|
||||
Ok(stat)
|
||||
}
|
||||
|
||||
/// Run Lance `cleanup_old_versions` on every node + edge table on `main`,
|
||||
|
|
|
|||
|
|
@ -193,7 +193,6 @@ where
|
|||
let mut added_tables = BTreeSet::new();
|
||||
let mut renamed_tables = HashMap::new();
|
||||
let mut rewritten_tables = BTreeSet::new();
|
||||
let mut indexed_tables = BTreeSet::new();
|
||||
let mut dropped_tables = BTreeSet::new();
|
||||
// Hard-drop cleanup targets: (table_key, full_dataset_uri).
|
||||
// Populated for DropProperty { Hard } and DropType { Hard }; the
|
||||
|
|
@ -252,14 +251,14 @@ where
|
|||
.or_default()
|
||||
.insert(to.clone(), from.clone());
|
||||
}
|
||||
SchemaMigrationStep::AddConstraint {
|
||||
type_kind,
|
||||
type_name,
|
||||
..
|
||||
} => {
|
||||
indexed_tables.insert(schema_table_key(*type_kind, type_name));
|
||||
}
|
||||
SchemaMigrationStep::UpdateTypeMetadata { .. }
|
||||
// AddConstraint is only ever an `@index` addition (every other
|
||||
// added constraint plans as UnsupportedChange). It records intent
|
||||
// in the desired catalog/IR; the physical index is built off the
|
||||
// critical path by ensure_indices/optimize (iss-848), so the apply
|
||||
// does no table work for it — a pure metadata change like the two
|
||||
// metadata steps below.
|
||||
SchemaMigrationStep::AddConstraint { .. }
|
||||
| SchemaMigrationStep::UpdateTypeMetadata { .. }
|
||||
| SchemaMigrationStep::UpdatePropertyMetadata { .. } => {}
|
||||
SchemaMigrationStep::DropProperty {
|
||||
type_kind,
|
||||
|
|
@ -347,18 +346,15 @@ where
|
|||
let mut table_updates = HashMap::<String, crate::db::SubTableUpdate>::new();
|
||||
let mut table_tombstones = HashMap::<String, u64>::new();
|
||||
|
||||
// Recovery sidecar: protect the per-table commit_staged loop in
|
||||
// rewritten_tables + indexed_tables. The post_commit_pin we record
|
||||
// here is a lower bound (expected + 1); the classifier loose-matches
|
||||
// for SidecarKind::SchemaApply because the actual N depends on how
|
||||
// many indices need building. See classify_table's loose-match arm.
|
||||
// Recovery sidecar: protect the per-table `stage_overwrite` +
|
||||
// `commit_staged` in rewritten_tables — the only tables that advance Lance
|
||||
// HEAD inline now that index building is deferred to the reconciler
|
||||
// (iss-848). Each rewritten table is exactly one commit, so
|
||||
// `post_commit_pin = expected + 1` is now exact (it was a loose lower bound
|
||||
// when index builds added extra commits); the classifier's loose-match for
|
||||
// SidecarKind::SchemaApply still accepts it.
|
||||
let recovery_pins: Vec<crate::db::manifest::SidecarTablePin> = rewritten_tables
|
||||
.iter()
|
||||
.chain(indexed_tables.iter().filter(|t| {
|
||||
!rewritten_tables.contains(*t)
|
||||
&& !added_tables.contains(*t)
|
||||
&& !renamed_tables.contains_key(*t)
|
||||
}))
|
||||
.filter_map(|table_key| {
|
||||
let entry = snapshot.entry(table_key)?;
|
||||
Some(crate::db::manifest::SidecarTablePin {
|
||||
|
|
@ -490,10 +486,11 @@ where
|
|||
let table_path = table_path_for_table_key(table_key)?;
|
||||
let dataset_uri = db.storage().dataset_uri(&table_path);
|
||||
let schema = schema_for_table_key(&desired_catalog, table_key)?;
|
||||
let mut ds =
|
||||
let ds =
|
||||
SnapshotHandle::new(TableStore::create_empty_dataset(&dataset_uri, &schema).await?);
|
||||
db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut ds)
|
||||
.await?;
|
||||
// Indexes for the new table are materialized off the critical path by
|
||||
// ensure_indices/optimize (iss-848); a 0-row table is never trainable
|
||||
// anyway. The @index intent is recorded in the persisted catalog/IR.
|
||||
let state = db.storage().table_state(&dataset_uri, &ds).await?;
|
||||
table_registrations.insert(table_key.clone(), table_path);
|
||||
table_updates.insert(
|
||||
|
|
@ -533,10 +530,9 @@ where
|
|||
.await?;
|
||||
let table_path = table_path_for_table_key(target_table_key)?;
|
||||
let dataset_uri = db.storage().dataset_uri(&table_path);
|
||||
let mut target_ds =
|
||||
let target_ds =
|
||||
SnapshotHandle::new(TableStore::write_dataset(&dataset_uri, batch).await?);
|
||||
db.build_indices_on_dataset_for_catalog(&desired_catalog, target_table_key, &mut target_ds)
|
||||
.await?;
|
||||
// Indexes on the renamed table are reconciled later (iss-848).
|
||||
let state = db.storage().table_state(&dataset_uri, &target_ds).await?;
|
||||
table_registrations.insert(target_table_key.clone(), table_path);
|
||||
table_updates.insert(
|
||||
|
|
@ -593,9 +589,10 @@ where
|
|||
.open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref())
|
||||
.await?;
|
||||
let staged = db.storage().stage_overwrite(&existing, batch).await?;
|
||||
let mut target_ds = db.storage().commit_staged(existing, staged).await?;
|
||||
db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut target_ds)
|
||||
.await?;
|
||||
let target_ds = db.storage().commit_staged(existing, staged).await?;
|
||||
// The rewrite drops the table's existing index coverage; it is
|
||||
// restored off the critical path by optimize's optimize_indices /
|
||||
// ensure_indices (iss-848). Reads scan uncovered fragments meanwhile.
|
||||
let state = db.storage().table_state(&dataset_uri, &target_ds).await?;
|
||||
table_updates.insert(
|
||||
table_key.clone(),
|
||||
|
|
@ -609,41 +606,12 @@ where
|
|||
);
|
||||
}
|
||||
|
||||
for table_key in &indexed_tables {
|
||||
if added_tables.contains(table_key)
|
||||
|| renamed_tables.contains_key(table_key)
|
||||
|| rewritten_tables.contains(table_key)
|
||||
{
|
||||
continue;
|
||||
}
|
||||
let entry = snapshot.entry(table_key).ok_or_else(|| {
|
||||
OmniError::manifest(format!(
|
||||
"missing table '{}' for schema index apply",
|
||||
table_key
|
||||
))
|
||||
})?;
|
||||
ensure_snapshot_entry_head_matches(db, entry).await?;
|
||||
let dataset_uri = db.storage().dataset_uri(&entry.table_path);
|
||||
let mut ds = db
|
||||
.storage()
|
||||
.open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref())
|
||||
.await?;
|
||||
db.storage()
|
||||
.ensure_expected_version(&ds, table_key, entry.table_version)?;
|
||||
db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut ds)
|
||||
.await?;
|
||||
let state = db.storage().table_state(&dataset_uri, &ds).await?;
|
||||
table_updates.insert(
|
||||
table_key.clone(),
|
||||
crate::db::SubTableUpdate {
|
||||
table_key: table_key.clone(),
|
||||
table_version: state.version,
|
||||
table_branch: None,
|
||||
row_count: state.row_count,
|
||||
version_metadata: state.version_metadata,
|
||||
},
|
||||
);
|
||||
}
|
||||
// Index-only changes (AddConstraint, i.e. adding an `@index`) are pure
|
||||
// metadata: the new `@index` intent is recorded in the desired catalog/IR
|
||||
// persisted below, and the physical index is materialized off the critical
|
||||
// path by `ensure_indices`/`optimize` (iss-848). Schema apply touches no
|
||||
// table data for them, so there is no per-table loop here and no recovery
|
||||
// pin (no Lance HEAD advances). Reads stay correct meanwhile via a scan.
|
||||
|
||||
let mut manifest_changes = Vec::new();
|
||||
for (table_key, table_path) in table_registrations {
|
||||
|
|
|
|||
|
|
@ -21,7 +21,7 @@ pub(super) async fn graph_index_for_resolved(
|
|||
db.runtime_cache.graph_index(resolved, &catalog).await
|
||||
}
|
||||
|
||||
pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<()> {
|
||||
pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<Vec<PendingIndex>> {
|
||||
let current_branch = db
|
||||
.coordinator
|
||||
.read()
|
||||
|
|
@ -31,7 +31,7 @@ pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<()> {
|
|||
ensure_indices_for_branch(db, current_branch.as_deref()).await
|
||||
}
|
||||
|
||||
pub(super) async fn ensure_indices_on(db: &Omnigraph, branch: &str) -> Result<()> {
|
||||
pub(super) async fn ensure_indices_on(db: &Omnigraph, branch: &str) -> Result<Vec<PendingIndex>> {
|
||||
let branch = normalize_branch_name(branch)?;
|
||||
ensure_indices_for_branch(db, branch.as_deref()).await
|
||||
}
|
||||
|
|
@ -73,12 +73,16 @@ pub(super) async fn failpoint_publish_table_head_without_index_rebuild_for_test(
|
|||
.await
|
||||
}
|
||||
|
||||
pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&str>) -> Result<()> {
|
||||
pub(super) async fn ensure_indices_for_branch(
|
||||
db: &Omnigraph,
|
||||
branch: Option<&str>,
|
||||
) -> Result<Vec<PendingIndex>> {
|
||||
db.ensure_schema_state_valid().await?;
|
||||
db.ensure_schema_apply_idle("ensure_indices").await?;
|
||||
let resolved = db.resolved_branch_target(branch).await?;
|
||||
let snapshot = resolved.snapshot;
|
||||
let mut updates = Vec::new();
|
||||
let mut pending = Vec::new();
|
||||
let active_branch = resolved.branch;
|
||||
let catalog = db.catalog();
|
||||
|
||||
|
|
@ -217,7 +221,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
|
|||
};
|
||||
let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0);
|
||||
if row_count > 0 {
|
||||
build_indices_on_dataset(db, &table_key, &mut ds).await?;
|
||||
pending.extend(build_indices_on_dataset(db, &table_key, &mut ds).await?);
|
||||
}
|
||||
|
||||
let state = db.storage().table_state(&full_path, &ds).await?;
|
||||
|
|
@ -265,7 +269,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
|
|||
};
|
||||
let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0);
|
||||
if row_count > 0 {
|
||||
build_indices_on_dataset(db, &table_key, &mut ds).await?;
|
||||
pending.extend(build_indices_on_dataset(db, &table_key, &mut ds).await?);
|
||||
}
|
||||
|
||||
let state = db.storage().table_state(&full_path, &ds).await?;
|
||||
|
|
@ -307,7 +311,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
|
|||
}
|
||||
}
|
||||
|
||||
Ok(())
|
||||
Ok(pending)
|
||||
}
|
||||
|
||||
/// The single scalar/vector index a node property receives from a one-column
|
||||
|
|
@ -352,6 +356,26 @@ fn node_prop_index_kind(prop_type: &PropType) -> Option<NodePropIndexKind> {
|
|||
}
|
||||
}
|
||||
|
||||
/// Whether a vector column currently has at least one non-null vector — the
|
||||
/// minimum for Lance IVF k-means to train (the `ivf_flat(1)` index we build
|
||||
/// needs >=1 vector). Used identically by `needs_index_work_node` (so an
|
||||
/// untrainable column is not pinned for recovery — avoiding a zero-commit pin
|
||||
/// that would roll back a sibling's index work) and by the vector build arm (so
|
||||
/// `create_vector_index` is only attempted when it can succeed, keeping its
|
||||
/// genuine errors fatal instead of swallowed as pending). If index params
|
||||
/// become size-aware (dev-graph iss-687), this threshold moves with them.
|
||||
async fn vector_column_trainable(
|
||||
db: &Omnigraph,
|
||||
ds: &SnapshotHandle,
|
||||
column: &str,
|
||||
) -> Result<bool> {
|
||||
Ok(db
|
||||
.storage()
|
||||
.count_rows(ds, Some(format!("{column} IS NOT NULL")))
|
||||
.await?
|
||||
> 0)
|
||||
}
|
||||
|
||||
/// Returns true if the node table is missing at least one declared
|
||||
/// scalar/vector index that `build_indices_on_dataset_for_catalog` would
|
||||
/// build AND has at least one row (the ensure_indices loop has
|
||||
|
|
@ -366,7 +390,7 @@ fn node_prop_index_kind(prop_type: &PropType) -> Option<NodePropIndexKind> {
|
|||
/// (DateTime/Date/numeric/Bool), FTS for free-text Strings, or a Vector index.
|
||||
/// Edges get BTree only (id, src, dst). This helper and the builder share
|
||||
/// `node_prop_index_kind` so they cannot drift — see its doc comment.
|
||||
async fn needs_index_work_node(
|
||||
pub(super) async fn needs_index_work_node(
|
||||
db: &Omnigraph,
|
||||
type_name: &str,
|
||||
table_key: &str,
|
||||
|
|
@ -409,7 +433,14 @@ async fn needs_index_work_node(
|
|||
}
|
||||
}
|
||||
Some(NodePropIndexKind::Vector) => {
|
||||
if !db.storage().has_vector_index(&ds, prop_name).await? {
|
||||
// Only count a missing vector index as buildable *work* when the
|
||||
// column is trainable (>=1 non-null vector). An untrainable
|
||||
// column would defer in the build and commit nothing; pinning it
|
||||
// for recovery would be a zero-commit pin that classifies
|
||||
// NoMovement and rolls back a sibling table's index work.
|
||||
if !db.storage().has_vector_index(&ds, prop_name).await?
|
||||
&& vector_column_trainable(db, &ds, prop_name).await?
|
||||
{
|
||||
return Ok(true);
|
||||
}
|
||||
}
|
||||
|
|
@ -434,7 +465,7 @@ async fn needs_index_work_node(
|
|||
///
|
||||
/// Empty edge tables are skipped by the ensure_indices loop the same
|
||||
/// way node tables are; see `needs_index_work_node`.
|
||||
async fn needs_index_work_edge(
|
||||
pub(super) async fn needs_index_work_edge(
|
||||
db: &Omnigraph,
|
||||
table_key: &str,
|
||||
full_path: &str,
|
||||
|
|
@ -632,11 +663,25 @@ pub(super) async fn open_dataset_at_state(
|
|||
.await
|
||||
}
|
||||
|
||||
/// A declared index the builder could not materialize on this pass. Today the
|
||||
/// only such case is a vector (IVF) column with no trainable vectors yet
|
||||
/// (KMeans needs >=1 vector), e.g. the load-before-embed window. Reported, not
|
||||
/// fatal: a later `ensure_indices`/`optimize` retries once the column is
|
||||
/// buildable, and reads stay correct via brute-force meanwhile. Surfacing
|
||||
/// pending index *status* rather than failing the operation is the database
|
||||
/// norm (Postgres `indisvalid`, LanceDB `list_indices`).
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct PendingIndex {
|
||||
pub table_key: String,
|
||||
pub column: String,
|
||||
pub reason: String,
|
||||
}
|
||||
|
||||
pub(super) async fn build_indices_on_dataset(
|
||||
db: &Omnigraph,
|
||||
table_key: &str,
|
||||
ds: &mut SnapshotHandle,
|
||||
) -> Result<()> {
|
||||
) -> Result<Vec<PendingIndex>> {
|
||||
let catalog = db.catalog();
|
||||
build_indices_on_dataset_for_catalog(db, &catalog, table_key, ds).await
|
||||
}
|
||||
|
|
@ -646,8 +691,9 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
|
|||
catalog: &Catalog,
|
||||
table_key: &str,
|
||||
ds: &mut SnapshotHandle,
|
||||
) -> Result<()> {
|
||||
) -> Result<Vec<PendingIndex>> {
|
||||
if let Some(type_name) = table_key.strip_prefix("node:") {
|
||||
let mut pending = Vec::new();
|
||||
if !db.storage().has_btree_index(ds, "id").await? {
|
||||
stage_and_commit_btree(db, table_key, ds, &["id"]).await?;
|
||||
}
|
||||
|
|
@ -676,22 +722,52 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
|
|||
}
|
||||
Some(NodePropIndexKind::Vector) => {
|
||||
if !db.storage().has_vector_index(ds, prop_name).await? {
|
||||
// Inline-commit residual: lance-6.0.1 does not
|
||||
// expose `build_index_metadata_from_segments` as
|
||||
// `pub`, so vector indices cannot be staged from
|
||||
// outside the lance crate. Document at the call
|
||||
// site; companion ticket to lance-format/lance#6658.
|
||||
let new_snap = db
|
||||
.storage_inline_residual()
|
||||
.create_vector_index(ds.clone(), prop_name.as_str())
|
||||
.await
|
||||
.map_err(|e| {
|
||||
OmniError::Lance(format!(
|
||||
"create Vector index on {}({}): {}",
|
||||
table_key, prop_name, e
|
||||
))
|
||||
})?;
|
||||
*ds = new_snap;
|
||||
// A vector (IVF) index trains k-means over the column,
|
||||
// so it needs >=1 non-null vector (KMeans errors
|
||||
// "cannot train N centroids with 0 vectors"). Precheck
|
||||
// trainability: a column with no vectors yet (e.g. rows
|
||||
// loaded before `embed`) is recorded as a *pending*
|
||||
// index and skipped — deferred, not failed. The SAME
|
||||
// predicate gates `needs_index_work_node`, so an
|
||||
// untrainable column is never pinned for recovery (no
|
||||
// zero-commit pin that would roll back a sibling
|
||||
// table's index work). This function is the chokepoint
|
||||
// every write path funnels through (load/mutate, schema
|
||||
// apply, ensure_indices, optimize, merge), realizing
|
||||
// the governing principle — physical index state never
|
||||
// fails a logical operation. Only when trainable do we
|
||||
// attempt the build, and then we PROPAGATE any error: a
|
||||
// genuine I/O/manifest/Lance failure must stay fatal,
|
||||
// not be hidden as pending. (Vector creation is an
|
||||
// inline-commit residual until lance#6666; iss-951.)
|
||||
if vector_column_trainable(db, ds, prop_name).await? {
|
||||
let new_snap = db
|
||||
.storage_inline_residual()
|
||||
.create_vector_index(ds.clone(), prop_name.as_str())
|
||||
.await
|
||||
.map_err(|e| {
|
||||
OmniError::Lance(format!(
|
||||
"create Vector index on {}({}): {}",
|
||||
table_key, prop_name, e
|
||||
))
|
||||
})?;
|
||||
*ds = new_snap;
|
||||
} else {
|
||||
tracing::info!(
|
||||
target: "omnigraph::index",
|
||||
table = %table_key,
|
||||
column = %prop_name,
|
||||
"deferring Vector index: column has no \
|
||||
trainable vectors yet",
|
||||
);
|
||||
pending.push(PendingIndex {
|
||||
table_key: table_key.to_string(),
|
||||
column: prop_name.clone(),
|
||||
reason: "column has no non-null vectors to \
|
||||
train on yet"
|
||||
.to_string(),
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
// Enum + orderable scalars (DateTime/Date/numeric/Bool)
|
||||
|
|
@ -709,7 +785,7 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
|
|||
}
|
||||
}
|
||||
}
|
||||
return Ok(());
|
||||
return Ok(pending);
|
||||
}
|
||||
|
||||
if table_key.starts_with("edge:") {
|
||||
|
|
@ -722,7 +798,9 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
|
|||
if !db.storage().has_btree_index(ds, "dst").await? {
|
||||
stage_and_commit_btree(db, table_key, ds, &["dst"]).await?;
|
||||
}
|
||||
return Ok(());
|
||||
// Edge tables only get BTree (id/src/dst), which build at any
|
||||
// cardinality; no pending state is possible here.
|
||||
return Ok(Vec::new());
|
||||
}
|
||||
|
||||
Err(OmniError::manifest(format!(
|
||||
|
|
@ -844,7 +922,11 @@ async fn prepare_updates_for_commit(
|
|||
crate::db::MutationOpKind::SchemaRewrite,
|
||||
)
|
||||
.await?;
|
||||
build_indices_on_dataset(db, &prepared_update.table_key, &mut ds).await?;
|
||||
// Any column not yet buildable (e.g. a vector column whose rows
|
||||
// have null embeddings) is deferred and logged inside
|
||||
// build_indices; a later ensure_indices/optimize materializes it.
|
||||
// The load/mutate/merge commit must not fail on it.
|
||||
let _pending = build_indices_on_dataset(db, &prepared_update.table_key, &mut ds).await?;
|
||||
let state = db.storage().table_state(&full_path, &ds).await?;
|
||||
prepared_update.table_version = state.version;
|
||||
prepared_update.row_count = state.row_count;
|
||||
|
|
|
|||
|
|
@ -2619,69 +2619,66 @@ async fn finalize_publisher_residual_does_not_drift_untouched_tables() {
|
|||
}
|
||||
|
||||
/// Acceptance test: a stage-step failure in the staged-index path
|
||||
/// (`stage_create_btree_index` succeeded; `commit_staged` not yet
|
||||
/// called) leaves NO Lance-HEAD drift on the existing tables.
|
||||
/// Subsequent operations against those tables succeed without
|
||||
/// `ExpectedVersionMismatch`.
|
||||
/// (`stage_create_btree_index` succeeded; `commit_staged` not yet called)
|
||||
/// leaves NO Lance-HEAD drift, so other tables stay writable.
|
||||
///
|
||||
/// Path: `apply_schema(v1 → v2)` adds a new node type. The
|
||||
/// `added_tables` loop in `schema_apply` creates the empty dataset and
|
||||
/// then calls `build_indices_on_dataset_for_catalog` →
|
||||
/// `stage_and_commit_btree(..., &["id"])`. The failpoint fires
|
||||
/// between `stage_create_btree_index` and `commit_staged`, so the
|
||||
/// staged segments are written under `_indices/<uuid>/` but Lance HEAD
|
||||
/// on the new dataset is unchanged at v=1. The schema-apply lock
|
||||
/// branch is released by `apply_schema`'s outer match. Existing
|
||||
/// tables (e.g. `node:Person`) are completely untouched by the new
|
||||
/// node's added_tables iteration — they're outside the failed apply
|
||||
/// path entirely — and we assert that mutations against them continue
|
||||
/// to work.
|
||||
///
|
||||
/// The orphan empty dataset from the failed apply is acceptable
|
||||
/// residual: it's unreferenced by `__manifest` and will be reclaimed
|
||||
/// by `cleanup_old_versions` (or removed when a future apply at the
|
||||
/// same target path resolves the rename).
|
||||
/// Under iss-848 schema apply no longer builds indexes inline — the build
|
||||
/// happens in the reconciler (`ensure_indices`/`optimize`) and at load. So this
|
||||
/// fires the failpoint where it lives now: an `ensure_indices` build of a BTREE
|
||||
/// that a prior apply declared (`@index`) but deferred. The failpoint fires
|
||||
/// between `stage_create_btree_index` and `commit_staged`, so the staged
|
||||
/// segment is written under `_indices/<uuid>/` but `node:Person`'s Lance HEAD is
|
||||
/// unchanged. `ensure_indices` fails and its EnsureIndices sidecar pins only
|
||||
/// Person at NoMovement (a clean no-op on the next open). A write to a
|
||||
/// different, unpinned table (`node:Company`) is unaffected: mutations/loads run
|
||||
/// a roll-forward-only heal and proceed — they do not refuse on a pending
|
||||
/// sidecar the way `optimize`/`repair` do — so the write succeeds with no drift.
|
||||
#[tokio::test]
|
||||
async fn ensure_indices_stage_btree_failure_leaves_existing_tables_writable() {
|
||||
let _scenario = FailScenario::setup();
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
let uri = dir.path().to_str().unwrap().to_string();
|
||||
|
||||
// Init with TEST_SCHEMA which declares Person + Knows. Indices on
|
||||
// those tables get built during init.
|
||||
let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
|
||||
|
||||
// Apply a schema that adds a new node type. The added_tables loop
|
||||
// will hit the failpoint between stage and commit on the new
|
||||
// node:Project table's btree-on-id build. (TEST_SCHEMA already
|
||||
// has Person + Company + Knows + WorksAt — pick a name that isn't
|
||||
// already declared.)
|
||||
let extended_schema = format!(
|
||||
"{}\nnode Project {{ name: String @key }}\n",
|
||||
helpers::TEST_SCHEMA
|
||||
);
|
||||
|
||||
{
|
||||
let _failpoint =
|
||||
ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return");
|
||||
let err = db.apply_schema(&extended_schema).await.unwrap_err();
|
||||
assert!(
|
||||
err.to_string()
|
||||
.contains("ensure_indices.post_stage_pre_commit_btree"),
|
||||
"schema apply should fail with the synthetic failpoint error, got: {err}"
|
||||
);
|
||||
}
|
||||
|
||||
// Existing tables stayed at their pre-apply versions; subsequent
|
||||
// mutations against them succeed (no Lance-HEAD drift).
|
||||
// Seed a Person row — the load builds Person's id BTREE + name FTS.
|
||||
mutate_main(
|
||||
&mut db,
|
||||
helpers::MUTATION_QUERIES,
|
||||
"insert_person",
|
||||
&mixed_params(&[("$name", "Eve")], &[("$age", 22)]),
|
||||
&mixed_params(&[("$name", "Alice")], &[("$age", 30)]),
|
||||
)
|
||||
.await
|
||||
.expect("Person mutation must succeed after the failed schema apply — existing tables are not drifted");
|
||||
.expect("seed Person");
|
||||
|
||||
// Add `@index` on `age`: schema apply records the intent but defers the
|
||||
// physical build (iss-848), so the BTREE on `age` is unbuilt.
|
||||
let indexed_schema = helpers::TEST_SCHEMA.replace("age: I32?", "age: I32? @index");
|
||||
db.apply_schema(&indexed_schema)
|
||||
.await
|
||||
.expect("adding an @index is metadata-only and succeeds");
|
||||
|
||||
{
|
||||
// ensure_indices builds the deferred `age` BTREE on Person; the failpoint
|
||||
// fires between stage and commit, so Person's Lance HEAD does not move.
|
||||
let _failpoint =
|
||||
ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return");
|
||||
let err = db.ensure_indices().await.unwrap_err();
|
||||
assert!(
|
||||
err.to_string()
|
||||
.contains("ensure_indices.post_stage_pre_commit_btree"),
|
||||
"ensure_indices should fail with the synthetic failpoint error, got: {err}"
|
||||
);
|
||||
}
|
||||
|
||||
// A different, unpinned table is untouched by the failed index build.
|
||||
use omnigraph::loader::{LoadMode, load_jsonl};
|
||||
load_jsonl(
|
||||
&mut db,
|
||||
r#"{"type": "Company", "data": {"name": "Acme"}}"#,
|
||||
LoadMode::Append,
|
||||
)
|
||||
.await
|
||||
.expect("Company write on a table untouched by the failed ensure_indices should succeed");
|
||||
}
|
||||
|
||||
fn assert_no_staging_files(graph: &std::path::Path) {
|
||||
|
|
|
|||
|
|
@ -843,3 +843,152 @@ async fn cleanup_reconciles_orphaned_branch_forks() {
|
|||
.await
|
||||
.unwrap();
|
||||
}
|
||||
|
||||
// Regression (iss-848): a table with rows but NULL vectors (the load-before-
|
||||
// embed window) must not abort index building. The vector (IVF) index cannot
|
||||
// train on 0 vectors, so `create_vector_index` errors with "KMeans cannot
|
||||
// train 1 centroids with 0 vectors". `build_indices_on_dataset_for_catalog`
|
||||
// is the chokepoint every caller funnels through (load/mutate via
|
||||
// prepare_updates_for_commit, ensure_indices, optimize, schema apply, merge),
|
||||
// so per-index fault isolation there must defer that one column (pending) and
|
||||
// still build the sibling scalar indexes, instead of propagating the error.
|
||||
// This exercises both the load path (which builds indices inline) and the
|
||||
// ensure_indices reconciler. Pre-fix this fails at the load step.
|
||||
#[tokio::test]
|
||||
async fn index_build_tolerates_null_vector_rows() {
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
let uri = dir.path().to_str().unwrap();
|
||||
let schema = "node Doc {\n \
|
||||
slug: String @key\n \
|
||||
n: I64 @index\n \
|
||||
embedding: Vector(8)? @index\n\
|
||||
}\n";
|
||||
let mut db = Omnigraph::init(uri, schema).await.unwrap();
|
||||
// Rows present, embeddings null (loaded but not yet embedded).
|
||||
load_jsonl(
|
||||
&mut db,
|
||||
"{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"n\":1}}\n\
|
||||
{\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"n\":2}}",
|
||||
LoadMode::Merge,
|
||||
)
|
||||
.await
|
||||
.expect("load rows with null embeddings");
|
||||
|
||||
// Must not abort: the untrainable vector column is deferred, the sibling
|
||||
// BTREE on `n` still builds.
|
||||
db.ensure_indices().await.expect(
|
||||
"ensure_indices must not abort when a vector column has no trainable vectors yet",
|
||||
);
|
||||
}
|
||||
|
||||
// iss-848: `optimize` converges declared-but-unbuilt indexes. After an @index is
|
||||
// added post-data (a metadata-only apply that defers the physical build), the
|
||||
// column is unindexed and reads scan. `optimize` — the operator's reconciler,
|
||||
// run on a cron — must materialize it, by composing the ensure_indices
|
||||
// reconciler after the compaction sweep. Pre-iss-848 optimize only maintained
|
||||
// coverage of EXISTING indexes (optimize_indices) and never created missing ones.
|
||||
#[tokio::test]
|
||||
async fn optimize_materializes_index_declared_but_unbuilt() {
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
let uri = dir.path().to_str().unwrap();
|
||||
let v1 = "node Doc {\n slug: String @key\n rank: I32\n}\n";
|
||||
let mut db = Omnigraph::init(uri, v1).await.unwrap();
|
||||
load_jsonl(
|
||||
&mut db,
|
||||
"{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"rank\":1}}\n\
|
||||
{\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"rank\":2}}",
|
||||
LoadMode::Merge,
|
||||
)
|
||||
.await
|
||||
.unwrap();
|
||||
|
||||
// Add @index on `rank` after data exists: a metadata-only apply that defers
|
||||
// the physical build (iss-848), so the column is declared-indexed but unbuilt.
|
||||
let v2 = "node Doc {\n slug: String @key\n rank: I32 @index\n}\n";
|
||||
db.apply_schema(v2).await.expect("index-only apply");
|
||||
|
||||
// Precondition: `rank` is declared @index but unbuilt -> reads degrade.
|
||||
{
|
||||
let snap = snapshot_main(&db).await.unwrap();
|
||||
let ds = snap.open("node:Doc").await.unwrap();
|
||||
assert!(
|
||||
matches!(
|
||||
TableStore::key_column_index_coverage(&ds, "rank")
|
||||
.await
|
||||
.unwrap(),
|
||||
IndexCoverage::Degraded { .. }
|
||||
),
|
||||
"rank must be unindexed after the deferred apply"
|
||||
);
|
||||
}
|
||||
|
||||
db.optimize().await.unwrap();
|
||||
|
||||
// Postcondition: optimize's reconciler materialized the declared index.
|
||||
let snap = snapshot_main(&db).await.unwrap();
|
||||
let ds = snap.open("node:Doc").await.unwrap();
|
||||
assert_eq!(
|
||||
TableStore::key_column_index_coverage(&ds, "rank")
|
||||
.await
|
||||
.unwrap(),
|
||||
IndexCoverage::Indexed,
|
||||
"optimize must build the declared-but-unbuilt rank index"
|
||||
);
|
||||
}
|
||||
|
||||
// iss-848 (PR review): the rename path also defers index building. A RenameType
|
||||
// migration writes the renamed table as a new dataset with the existing rows
|
||||
// but no indexes (its inline build was removed). optimize must then materialize
|
||||
// the declared index on the renamed table.
|
||||
#[tokio::test]
|
||||
async fn optimize_materializes_index_after_type_rename() {
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
let uri = dir.path().to_str().unwrap();
|
||||
let v1 = "node Doc {\n slug: String @key\n rank: I32 @index\n}\n";
|
||||
let mut db = Omnigraph::init(uri, v1).await.unwrap();
|
||||
load_jsonl(
|
||||
&mut db,
|
||||
"{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"rank\":1}}\n\
|
||||
{\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"rank\":2}}",
|
||||
LoadMode::Merge,
|
||||
)
|
||||
.await
|
||||
.unwrap();
|
||||
|
||||
// Rename Doc -> Item; rows are preserved on the new table key.
|
||||
let v2 = "node Item @rename_from(\"Doc\") {\n slug: String @key\n rank: I32 @index\n}\n";
|
||||
let result = db.apply_schema(v2).await.expect("rename apply");
|
||||
assert!(result.applied);
|
||||
assert_eq!(
|
||||
count_rows(&db, "node:Item").await,
|
||||
2,
|
||||
"rename must preserve rows"
|
||||
);
|
||||
|
||||
// Post-rename the renamed table's declared rank index is unbuilt (deferred).
|
||||
{
|
||||
let snap = snapshot_main(&db).await.unwrap();
|
||||
let ds = snap.open("node:Item").await.unwrap();
|
||||
assert!(
|
||||
matches!(
|
||||
TableStore::key_column_index_coverage(&ds, "rank")
|
||||
.await
|
||||
.unwrap(),
|
||||
IndexCoverage::Degraded { .. }
|
||||
),
|
||||
"rank must be unindexed immediately after the rename"
|
||||
);
|
||||
}
|
||||
|
||||
db.optimize().await.unwrap();
|
||||
|
||||
let snap = snapshot_main(&db).await.unwrap();
|
||||
let ds = snap.open("node:Item").await.unwrap();
|
||||
assert_eq!(
|
||||
TableStore::key_column_index_coverage(&ds, "rank")
|
||||
.await
|
||||
.unwrap(),
|
||||
IndexCoverage::Indexed,
|
||||
"optimize must build the renamed table's deferred rank index"
|
||||
);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -736,3 +736,108 @@ edge Knows: Person -> Person {
|
|||
// current contract, the data is *unreachable* via omnigraph
|
||||
// (no manifest entry), which is the user-facing guarantee.
|
||||
}
|
||||
|
||||
// Regression (bug 3 / dev-graph iss-848): a `Vector @index` on a 0-row table
|
||||
// must not abort an otherwise-valid schema apply. A vector (IVF) index trains
|
||||
// k-means centroids over the column's vectors, so Lance cannot build it on 0
|
||||
// vectors — it errors with "Creating empty vector indices with train=False is
|
||||
// not yet implemented". When a *later* migration touches that table (here, an
|
||||
// unrelated scalar `@index` on `body`), schema apply reconciles the table's
|
||||
// whole index set, which previously tried to materialize the dormant vector
|
||||
// index and aborted the entire migration (all-or-nothing). The build is now
|
||||
// deferred (pending) when the column is untrainable, instead of failing the
|
||||
// migration. The dormant index is materialized by a later `ensure_indices` /
|
||||
// `optimize` once the table has rows. Full decoupling — intent recorded at
|
||||
// apply, an async reconciler converges physical coverage — is iss-848.
|
||||
#[tokio::test]
|
||||
async fn apply_schema_defers_vector_index_on_empty_table() {
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
let uri = dir.path().to_str().unwrap();
|
||||
|
||||
// init does not build indices, so the declared-but-unbuilt vector index
|
||||
// sits harmless on the empty table (this is how it survived earlier
|
||||
// applies that never touched the table).
|
||||
// `slug` is the user @key; omnigraph injects its own internal `id` column,
|
||||
// so the key field must not be named `id`.
|
||||
let v1 = "node Doc {\n \
|
||||
slug: String @key\n \
|
||||
body: String?\n \
|
||||
embedding: Vector(8) @index\n\
|
||||
}\n";
|
||||
let mut db = Omnigraph::init(uri, v1).await.unwrap();
|
||||
|
||||
// Add an *unrelated* scalar @index on `body`. This routes Doc through
|
||||
// schema apply's index reconcile, which must NOT abort on the untrainable
|
||||
// empty vector index.
|
||||
let v2 = "node Doc {\n \
|
||||
slug: String @key\n \
|
||||
body: String? @index\n \
|
||||
embedding: Vector(8) @index\n\
|
||||
}\n";
|
||||
let result = db.apply_schema(v2).await.expect(
|
||||
"schema apply must succeed: an empty-table vector @index is deferred, not fatal",
|
||||
);
|
||||
assert!(result.applied, "the scalar @index change must apply");
|
||||
|
||||
// The deferred vector index is not dropped — once the table has a
|
||||
// trainable vector, `ensure_indices` materializes it without error. (If
|
||||
// the guard wrongly skipped a non-empty column, this would still be
|
||||
// unindexed; if it wrongly tried to build on empty, the apply above would
|
||||
// have failed.)
|
||||
load_jsonl(
|
||||
&mut db,
|
||||
r#"{"type":"Doc","data":{"slug":"d1","body":"hello","embedding":[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8]}}"#,
|
||||
LoadMode::Merge,
|
||||
)
|
||||
.await
|
||||
.expect("loading a Doc with an embedding must succeed");
|
||||
db.ensure_indices()
|
||||
.await
|
||||
.expect("the deferred vector index must build once the table has a trainable vector");
|
||||
}
|
||||
|
||||
// iss-848: adding an `@index` to an existing column is a pure metadata change.
|
||||
// Schema apply records the intent (the catalog/IR now declares the index) but
|
||||
// must NOT build the index inline, so the table's data and manifest version are
|
||||
// untouched. The physical index is materialized later by ensure_indices /
|
||||
// optimize. Pre-iss-848 the indexed_tables block built the index inline and
|
||||
// bumped the table version.
|
||||
#[tokio::test]
|
||||
async fn index_only_constraint_apply_touches_no_table_data() {
|
||||
let dir = tempfile::tempdir().unwrap();
|
||||
let uri = dir.path().to_str().unwrap();
|
||||
let v1 = "node Doc {\n slug: String @key\n n: I64\n}\n";
|
||||
let mut db = Omnigraph::init(uri, v1).await.unwrap();
|
||||
load_jsonl(
|
||||
&mut db,
|
||||
r#"{"type":"Doc","data":{"slug":"d1","n":1}}"#,
|
||||
LoadMode::Merge,
|
||||
)
|
||||
.await
|
||||
.expect("load a Doc");
|
||||
|
||||
let before = db
|
||||
.snapshot_of(ReadTarget::branch("main"))
|
||||
.await
|
||||
.unwrap()
|
||||
.entry("node:Doc")
|
||||
.unwrap()
|
||||
.table_version;
|
||||
|
||||
// Add an @index on the existing `n` column.
|
||||
let v2 = "node Doc {\n slug: String @key\n n: I64 @index\n}\n";
|
||||
let result = db.apply_schema(v2).await.expect("index-only apply must succeed");
|
||||
assert!(result.applied, "the @index addition must apply");
|
||||
|
||||
let after = db
|
||||
.snapshot_of(ReadTarget::branch("main"))
|
||||
.await
|
||||
.unwrap()
|
||||
.entry("node:Doc")
|
||||
.unwrap()
|
||||
.table_version;
|
||||
assert_eq!(
|
||||
before, after,
|
||||
"adding an @index must not bump the table version (no inline index build)"
|
||||
);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -15,6 +15,38 @@ Use it this way:
|
|||
- Keep implementation ledgers, roadmap detail, and historical MR notes in the
|
||||
per-area docs. This file is the filter, not the encyclopedia.
|
||||
|
||||
## Governing principle: logical contract over physical state
|
||||
|
||||
The hard invariants below are instances of one rule. Keep it in view whenever
|
||||
a change touches the boundary between what the graph *means* and how it is
|
||||
physically stored.
|
||||
|
||||
> **Logical state is the contract. Physical state — index coverage, fragment
|
||||
> layout, compaction versions, staged writes — is derived, rebuildable, and may
|
||||
> be produced asynchronously. A physical operation must never fail a logical
|
||||
> one. Preconditions are checked against logical state; physical reconciliation
|
||||
> is idempotent and may lag or retry. Genuine logical conflicts still fail
|
||||
> loudly: the licence to lag covers physical convergence, not correctness.**
|
||||
|
||||
Invariants that instantiate it: **2** (manifest-atomic visibility) and **5**
|
||||
(recovery is part of the commit protocol) — a partially-written physical layer
|
||||
never changes what a graph commit means; **7** (indexes are derived state) — a
|
||||
query is correct under partial index coverage, and expensive index work
|
||||
converges from manifest state instead of gating the write path; **13** (failures
|
||||
bounded and observable) — the licence to lag is not a licence to drop, so a
|
||||
physical step that cannot make progress is surfaced, not swallowed. Deny-list
|
||||
items that enforce it: synchronous inline vector/FTS index rebuilds on the
|
||||
commit path; state that drifts from Lance or the manifest when it can be
|
||||
derived; job queues for manifest-derivable state where a reconciler fits.
|
||||
|
||||
The failure shape it rules out: a legitimate background operation on the
|
||||
physical layer (compaction, an index build, an interrupted staged write) is
|
||||
allowed to break a logical operation (a query's correctness, a migration's
|
||||
success, a branch's writability). The smell to watch for is a logical operation
|
||||
whose precondition is a *physical* fact — a cached file version, an index's
|
||||
existence, a fragment count. Make the precondition logical and let a reconciler
|
||||
converge the physical state.
|
||||
|
||||
## Hard Invariants
|
||||
|
||||
1. **Respect the substrate.** Lance owns columnar storage, per-dataset
|
||||
|
|
@ -105,7 +137,7 @@ Use it this way:
|
|||
| Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema/index.md), [execution.md](execution.md) |
|
||||
| Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema/index.md) |
|
||||
| Storage trait | `TableStorage` (via `db.storage()`) is staged-only; the inline-commit residuals (`delete_where`, `create_vector_index`) are split onto a separate sealed `InlineCommitResidual` trait reached via `db.storage_inline_residual()` (MR-854), so §1 holds by construction; capability/stat surfaces are roadmap | [writes.md](writes.md), [architecture.md](architecture.md) |
|
||||
| Index lifecycle | Index *creation* per `@index`/`@key` property is dispatched by type (enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) via `node_prop_index_kind`; index *coverage maintenance* exists — `optimize` runs Lance `optimize_indices` after compaction to fold appended/rewritten fragments into existing indexes (still an explicit maintenance call, not yet a background reconciler) | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) |
|
||||
| Index lifecycle | `@index`/`@key` declares *intent*; the physical index is derived state and never fails a logical op. `schema apply` builds no indexes (records intent only; index-only changes touch no table data). `load`/`mutate` build inline through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) that fault-isolates an untrainable Vector column into a *pending* index instead of aborting. `optimize`/`ensure_indices` is the reconciler: it creates declared-but-missing indexes and folds appended/rewritten fragments into existing ones (`optimize_indices`), reporting still-pending columns. Explicit maintenance call, not yet a background loop | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) |
|
||||
| Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/queries/index.md) |
|
||||
| Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/operations/server.md), [policy.md](../user/operations/policy.md) |
|
||||
| Tests | Tempdir-backed Lance tests are the current substrate; the storage adapter has an in-memory backend for adapter-level contract tests, but Lance datasets bypass it | [testing.md](testing.md) |
|
||||
|
|
|
|||
|
|
@ -29,7 +29,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
|
|||
| `point_in_time.rs` | Snapshots, time travel (`snapshot_at_version`, `entity_at`) |
|
||||
| `changes.rs` | `diff_between` / `diff_commits` |
|
||||
| `consistency.rs` | Cross-table snapshot isolation, atomic publish |
|
||||
| `schema_apply.rs` | Migration plan + apply, schema-apply lock |
|
||||
| `schema_apply.rs` | Migration plan + apply, schema-apply lock; index materialization deferred to the reconciler (iss-848): `apply_schema_defers_vector_index_on_empty_table` (an empty-table Vector `@index` never aborts the apply) and `index_only_constraint_apply_touches_no_table_data` (adding an `@index` is metadata-only — no table-version bump) |
|
||||
| `search.rs` | FTS / vector / hybrid (`bm25`, `nearest`, `rrf`) |
|
||||
| `traversal.rs` | `Expand`, variable-length hops, anti-join (CSR path — `OMNIGRAPH_TRAVERSAL_MODE` unset) |
|
||||
| `traversal_indexed.rs` | BTREE-indexed Expand (`execute_expand_indexed`) forced via `OMNIGRAPH_TRAVERSAL_MODE`, asserted semantically equal to the CSR path; own binary, all `#[serial]` so env writes never race |
|
||||
|
|
@ -42,7 +42,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
|
|||
| `lance_version_columns.rs` | Per-row `_row_last_updated_at_version` behavior |
|
||||
| `validators.rs` | Schema constraint enforcement (enum, range, unique, cardinality) across JSONL, insert, update paths |
|
||||
| `policy_engine_chassis.rs` | Engine-layer Cedar enforcement (MR-722): allow + deny through every `_as` writer via the SDK directly — no HTTP — proving embedded and CLI callers hit the same gate as the server, with action × scope shapes matching `authorize_request` |
|
||||
| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice |
|
||||
| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice; the index reconciler (iss-848): `index_build_tolerates_null_vector_rows` (an untrainable Vector column defers instead of aborting the build, sibling indexes still build) and `optimize_materializes_index_declared_but_unbuilt` (optimize creates a declared-but-deferred index) |
|
||||
| `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature). Includes the five per-writer Phase B → recovery integration tests (`recovery_rolls_forward_after_finalize_publisher_failure`, `schema_apply_phase_b_failure_recovered_on_next_open`, `branch_merge_phase_b_failure_recovered_on_next_open`, `ensure_indices_phase_b_failure_recovered_on_next_open`, `optimize_phase_b_failure_recovered_on_next_open`) and the write-entry in-process heal contract (the four `*_after_finalize_publisher_failure_heals_without_reopen` tests — load, mutation, schema apply, branch merge: a follow-up write on the same handle rolls a sidecar-covered residual forward without reopen/refresh) and the storage-fault matrix for the sidecar lifecycle (`recovery.sidecar_{write,delete,list}` / `recovery.record_audit` failpoints: Phase A put failure aborts with zero drift, Phase D delete failure is swallowed and healed by the next write, list failures are loud at heal and open, audit-append failures are retried to exactly one audit row; plus the bucket-gated `s3_load_recovers_after_publisher_failure_without_reopen`). |
|
||||
| `recovery.rs` | Open-time recovery sweep — sidecar I/O, classifier dispatch (NoMovement / RolledPastExpected / UnexpectedAtP1 / UnexpectedMultistep / InvariantViolation), all-or-nothing decision, roll-forward via `ManifestBatchPublisher::publish`, roll-back via `Dataset::restore`, audit row in `_graph_commit_recoveries.lance`, `OpenMode::ReadOnly` skip path |
|
||||
| `composite_flow.rs` | Compositional/narrative end-to-end stories — multi-step flows that compose mechanics covered by other test files. Catches integration regressions where individual operations all pass their unit tests but their composition breaks (sequential merges, post-merge main writes, time-travel through merge DAG, reopen consistency over multi-merge histories, post-optimize and post-cleanup strict writes). |
|
||||
|
|
|
|||
|
|
@ -7,11 +7,12 @@
|
|||
- Compacts every node + edge table on `main`, then reindexes them, then **publishes the resulting version to the `__manifest`** so the manifest's recorded version tracks the compacted-and-reindexed state. Reads pin the manifest version, so without this publish the work would be invisible to readers *and* would break the version precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually changed.
|
||||
- Rewrites small fragments into fewer large ones; old fragments remain reachable via older versions until `cleanup` runs.
|
||||
- **Reindex (index coverage maintenance).** A scalar/FTS/vector index only covers the fragments it was built over. Rows appended after the index was built (e.g. by `load --mode merge`, whose commit does not rebuild an already-existing index) are scanned unindexed, and compaction itself rewrites fragments out of an index's coverage. `optimize` runs Lance's incremental `optimize_indices` after compaction to fold those fragments back in (a delta merge, not a full retrain), restoring full coverage so equality/range/traversal predicates stay index-accelerated. This is why a table with **no compaction work but stale index coverage still commits** a new version under `optimize`. Run `optimize` on a cadence at least as frequent as your freshness window so recently-loaded rows do not linger in the unindexed flat-scan tail.
|
||||
- **Create declared-but-missing indexes (the index reconciler).** `@index`/`@key` declares intent; `schema apply` records it but builds nothing, and `load`/`mutate` defer a column that cannot be built yet (a `Vector` column with no trainable vectors). `optimize` materializes any such declared-but-unbuilt index over the compacted layout — so it is the convergence path for an `@index` added after data exists, or a vector index whose embeddings arrived via a later `embed`. A column still not buildable (no vectors yet) is reported on the table's stat as `pending_indexes` (visible in `--json`), not treated as a failure; the next `optimize` retries. So `optimize` is the single operator-facing index reconciler: it compacts, restores coverage, **and** builds declared-but-missing indexes.
|
||||
- Each table's compact→reindex→publish serializes with concurrent mutations on the same table. A crash mid-operation is recovered automatically on the next open (both compaction and reindex are content-preserving, so roll-forward is always safe).
|
||||
- **Requires a recovered graph.** `optimize` refuses (errors) when a pending crash-recovery operation is present — operating on an unrecovered graph could publish a partial write that recovery would roll back. Reopen the graph to run recovery, then re-run `optimize`.
|
||||
- **Uncovered drift is skipped, not interpreted.** If a table's underlying version is ahead of the version recorded in `__manifest` and no crash-recovery record covers that movement, `optimize` reports `skipped: DriftNeedsRepair` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift.
|
||||
- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
|
||||
- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version`.
|
||||
- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version, pending_indexes` (the last lists any declared `@index` column the reconciler could not build this run, with the reason — e.g. a vector column with no trainable vectors yet).
|
||||
- **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: BlobColumnsUnsupportedByLance` (and logged) instead of compacted, and the rest of the sweep proceeds normally. **Reads and writes are unaffected** — only compaction is. Consequence: fragment count and deleted-row space on blob tables are not reclaimed; query results are never affected. A skipped blob table is also **not reindexed** in the same sweep (the skip happens before the reindex step), so its index coverage on appended rows is not refreshed by `optimize` today.
|
||||
|
||||
## `repair` — explicit
|
||||
|
|
|
|||
|
|
@ -27,7 +27,8 @@ list/`Blob` columns → none.
|
|||
|
||||
## L2 — OmniGraph orchestration
|
||||
|
||||
- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted indexes for the current head; safe to re-run.
|
||||
- **`@index`/`@key` declares intent; the physical index is derived state.** A migration records the declaration in the catalog/IR and never fails on it — `schema apply` builds **no** indexes (adding an `@index` to an existing column is a pure metadata change that touches no table data). `load`/`mutate` build declared indexes inline as part of the write, but a column that can't be built yet (a `Vector` column with no trainable vectors — IVF k-means needs ≥1 vector, e.g. rows loaded before `embed` runs) is left **pending**, not fatal. Reads stay correct meanwhile: a missing/partial index degrades to a scan (vector search to brute-force). A later `ensure_indices`/`optimize` materializes the pending index once it is buildable. This mirrors how LanceDB builds indexes asynchronously and serves unindexed rows by brute-force.
|
||||
- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted + vector indexes for the current head; safe to re-run; returns the columns it had to defer as pending. `optimize` runs it after compaction, so the maintenance cron is the convergence path for deferred indexes.
|
||||
- Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state.
|
||||
- **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork.
|
||||
- Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column.
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue