From b183db078ffa57daae10c67041ca2dc12b4720ea Mon Sep 17 00:00:00 2001
From: Ragnor Comerford <ragnor.comerford@gmail.com>
Date: Mon, 15 Jun 2026 18:48:43 +0200
Subject: [PATCH] Index materialization is derived state: defer off the write
 path, reconcile via optimize (iss-848) (#246)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* test(engine): reproduce empty-table Vector @index aborting schema apply

A Vector (IVF) index trains k-means centroids over the column, so Lance
cannot build it on 0 vectors ("Creating empty vector indices with
train=False is not yet implemented"). schema apply reconciles a table's
whole index set whenever any @index on it changes, so adding an unrelated
scalar @index materializes the dormant empty vector index and aborts the
entire migration (all-or-nothing).

This regression test inits a 0-row Doc with a Vector @index, adds a scalar
@index, and asserts the apply succeeds (then loads one embedded row and
asserts the deferred index materializes). It fails today at the apply step
with the vector-index abort; the fix lands in the next commit.

Refs dev-graph iss-empty-vector-index-schema-apply, iss-848.

* fix(engine): defer Vector @index on an empty table instead of aborting schema apply

build_indices_on_dataset_for_catalog materialized a declared Vector @index
unconditionally. On a 0-row table Lance cannot train the IVF index
("Creating empty vector indices with train=False is not yet implemented"),
so any later migration that touches the table (e.g. adding an unrelated
scalar @index, which reconciles the table's whole index set) aborted the
entire migration on the dormant vector index — all-or-nothing.

Guard the vector arm with a row-count check, matching the guard
ensure_indices_for_branch and the branch-merge rebuild already use: an
untrainable column becomes a pending index that a later ensure_indices /
optimize materializes once the table has rows. Reads stay correct meanwhile
(vector search degrades to a brute-force scan).

Stop-gap: the residual rows-present-but-vectors-null window and the full
decoupling (intent recorded at apply, an idempotent coverage reconciler)
are dev-graph iss-848. Turns the green half of the regression test added in
the previous commit.

Refs dev-graph iss-empty-vector-index-schema-apply, iss-848, iss-687.

* docs(invariants): record the logical-contract-over-physical-state principle

The bug class behind the empty-table vector-index abort (and the schema-apply
vs optimize version drift) is one shape: a physical operation allowed to fail
a logical one. Several hard invariants (2, 5, 7, 13) and deny-list items are
already instances of this, but the unifying rule was never written down.

Add it to docs/dev/invariants.md as a "Governing principle" section above the
hard invariants, naming which invariants and deny-list items instantiate it
and the smell to watch for (a logical operation gated on a physical fact).
Add a one-line always-on rule (7) in AGENTS.md so it stays in working memory,
with the qualifier that genuine logical conflicts still fail loudly — the
licence to lag covers physical convergence, not correctness.

Audience-neutral: no private ticket refs. check-agents-md.sh passes.

* test(engine): index build must tolerate rows with null vectors (load-before-embed)

Loading rows whose vector column is null into a `Vector @index` table fails
today: build_indices (reached via the loader's prepare_updates_for_commit)
calls create_vector_index, and Lance's IVF KMeans errors "cannot train 1
centroids with 0 vectors". The same abort hits ensure_indices/optimize/schema
apply/merge, since they all funnel through build_indices_on_dataset_for_catalog.

This test loads two null-embedding rows and calls ensure_indices; it must not
abort (the untrainable vector column is deferred, sibling indexes still build).
Fails today at the load step; fixed in the next commit.

Refs dev-graph iss-848, iss-empty-vector-index-schema-apply.

* fix(engine): defer unbuildable index columns instead of aborting the write path

build_indices_on_dataset_for_catalog is the chokepoint every write path funnels
through (load/mutate via prepare_updates_for_commit, schema apply, ensure_indices,
optimize, branch merge). Its vector arm called create_vector_index
unconditionally, so a column with no trainable vectors yet — an empty table, or
rows loaded before `embed` populates them — aborted the whole operation with
Lance's IVF KMeans error.

Fault-isolate the vector build: on failure, record the column as a PendingIndex
(table, column, reason), log it, and continue building the sibling indexes; a
later ensure_indices/optimize materializes it once the column is trainable, and
reads use brute-force meanwhile. Manifest/CAS/IO errors at the publish boundary
still propagate. Isolating at the single chokepoint realizes the governing
principle (physical index state never fails a logical operation) for every write
path, and supersedes the earlier symptomatic count_rows==0 stop-gap (removed) —
closing the residual rows-present-but-vectors-null window it left open.

Surfacing pending index status rather than failing is the database norm
(Postgres indisvalid, LanceDB list_indices). ensure_indices and the build_indices
wrappers now return Vec<PendingIndex>; optimize surfaces it in a later commit.

Refs dev-graph iss-848, iss-951 (vector index stays inline-commit until lance#6666).

* test(engine): index-only schema apply must not touch table data

Adding an @index to an existing column should be a pure metadata change once
index materialization moves to the reconciler (iss-848): the apply records the
intent in the catalog/IR but builds nothing inline, so the table's manifest
version is unchanged. Today the indexed_tables block builds the index inline
and bumps the version (4 -> 5). Fixed in the next commit.

Refs dev-graph iss-848.

* fix(engine): schema apply records index intent only; index-only apply is metadata

Schema apply no longer builds indexes inline. The four build_indices calls
(added/renamed/rewritten/index-only tables) are removed; the @index/@key intent
is already persisted in the catalog/IR the apply writes, and the physical index
is materialized off the critical path by ensure_indices/optimize (iss-848).

Concretely:
- AddConstraint (an @index addition — every other added constraint plans as
  UnsupportedChange) becomes a pure metadata step alongside the metadata-only
  steps: it touches no table data, so the table version is unchanged.
- added/renamed/rewritten tables still write their data; only the trailing
  index build is gone. The rewritten table's coverage is restored later by
  optimize_indices.
- recovery_pins drops index-only tables (they no longer advance Lance HEAD) and
  keeps rewritten tables; their post_commit_pin = expected+1 is now exact (one
  rewrite commit), strengthening recovery classification.
- the now-orphaned Omnigraph::build_indices_on_dataset_for_catalog wrapper is
  removed.

A migration can no longer abort on an index build, for any index type at any
cardinality. Turns the green half of index_only_constraint_apply_touches_no_table_data.

Refs dev-graph iss-848.

* test(engine): optimize must converge a declared-but-unbuilt index

After iss-848, adding an @index post-data is a metadata-only apply that defers
the physical build, so the column is declared-indexed but unbuilt (reads scan).
`optimize` — the operator's cron reconciler — must materialize it. Today optimize
only maintains coverage of EXISTING indexes (optimize_indices) and never creates
missing ones, so the rank BTREE stays Degraded after optimize. Fixed next commit.

Refs dev-graph iss-848.

* fix(engine): optimize materializes declared-but-unbuilt indexes (the reconciler)

`omnigraph optimize` is the operator's cron reconciler. It already compacts and
folds new fragments into EXISTING indexes (optimize_indices); now it also builds
declared-but-missing indexes, so the indexes schema apply / load defer (iss-848)
converge on the next optimize.

Done inside optimize_one_table (not by composing the all-tables ensure_indices,
which is drift-blind and would re-publish the uncovered HEAD>manifest drift that
optimize deliberately skips): after the per-table drift/blob skips and under the
queue + Optimize sidecar already held, a needs_index_create gate (reusing
needs_index_work_node/edge — "declared index missing AND row_count > 0", so empty
tables stay no-ops) admits index-only work, and Phase B builds the missing index
over the just-compacted layout via the build chokepoint. An untrainable vector
column fault-isolates into the new TableOptimizeStats.pending_indexes (the
list_indices/indisvalid analog operators read), not a failure. committed now
reflects index commits, so the existing post-publish cache invalidation covers
them. LanceDB's optimize only maintains existing indexes; creating
declared-but-missing ones is the L2 behavior omnigraph's declarative @index needs.

Turns the green half of optimize_materializes_index_declared_but_unbuilt.

Refs dev-graph iss-848.

* docs: index materialization is deferred to the reconciler (iss-848)

Update the index-lifecycle docs to reflect the new contract: @index/@key
declares intent and the physical index is derived state that never fails a
logical operation. Schema apply builds nothing (records intent only);
load/mutate build inline through one chokepoint that defers an untrainable
Vector column as pending; optimize/ensure_indices is the reconciler that
creates declared-but-missing indexes and maintains coverage, reporting
still-pending columns.

Touches: dev/invariants.md (truth-matrix Index-lifecycle row), AGENTS.md
(capability matrix), user/search/indexes.md (L2 orchestration), user/operations/
maintenance.md (optimize reconciler bullet), dev/testing.md (new tests).

* test(server): schema_apply_route_can_add_index reflects deferred index build

iss-848 made schema apply record @index intent without building the physical
index inline. The route test asserted the index count increased after apply;
on an empty graph it now stays unchanged (the build is deferred to
ensure_indices/optimize). Assert the new contract: apply succeeds and the
physical index count is unchanged.

* fix(engine): precheck vector trainability — don't pin or swallow (PR review)

Two issues Cursor Bugbot caught in the chokepoint fault-isolation:

1. (HIGH) Pending vector pins roll back siblings. needs_index_work_node counted
   a missing vector index as work whenever the table had rows, so a column with
   no trainable vectors got pinned in the EnsureIndices recovery sidecar — but
   the build deferred it (zero commit). On a crash before manifest publish the
   classifier sees NoMovement and the all-or-nothing decision (recovery.rs
   decide()) rolls back the WHOLE sidecar, undoing a sibling table's committed
   index work.
2. (MED) Vector build swallowed fatal errors. The match arm converted every
   create_vector_index error into a deferred PendingIndex, hiding genuine
   I/O/manifest/Lance failures as "pending".

Fix both with one trainability precheck (vector_column_trainable: >=1 non-null
vector, the ivf_flat(1) minimum) used identically by needs_index_work_node and
the build arm: an untrainable column is never counted as work (so never pinned —
no zero-commit pin) and never attempted (so it can't fail); only a trainable
column is built, and then any error PROPAGATES (stays fatal). The deferred
column is still recorded as a PendingIndex with a clear reason.

Refs dev-graph iss-848.

* feat(cli): surface pending index column + reason in optimize output (PR review)

Codex (P2): pending_indexes was documented as visible in `optimize --json` but
the CLI projection never emitted it — operators would lose the only signal that
optimize has deferred index work. Greptile (P2): the stat dropped the reason, so
operators saw which column was stuck, not why.

Carry the reason: TableOptimizeStats.pending_indexes is now Vec<PendingIndex>
(column + reason), and `omnigraph optimize --json` emits {column, reason} per
pending index; human output prints a "↳ index pending on '<col>': <reason>" line.

Refs dev-graph iss-848.

* test: align CLI index-add test with deferred build; cover post-rename reconcile

- schema_apply_json_adds_index_for_existing_property (cli_schema_config.rs): the
  CLI analog of the server test — asserted the index count grew after apply;
  under iss-848 the apply defers the build, so the count is unchanged on an
  empty graph. Assert the deferred contract. (The only full-suite failure.)
- optimize_materializes_index_after_type_rename (maintenance.rs, new): covers
  the gap Greptile flagged — a RenameType writes the renamed table with rows but
  no indexes (inline build removed in Commit B); assert the rank index is
  Degraded post-rename and Indexed after optimize reconciles it.

Refs dev-graph iss-848.

* test(engine): in-source apply tests reflect deferred index materialization

The two db::omnigraph in-source unit tests asserted the old "schema apply builds
/ preserves indexes inline" behavior (the only remaining full-suite failures):

- test_apply_schema_defers_index_then_reconciler_builds_it (was
  test_apply_schema_adds_index_for_existing_property): apply records the @index
  intent but builds nothing; assert the BTREE on `age` is absent after apply and
  present after ensure_indices. (Uses `age`, unindexed in TEST_SCHEMA — `name
  @key` is already FTS-indexed at seed.)
- test_apply_schema_rewrite_defers_index_then_reconciler_restores (was
  test_apply_schema_rewrite_preserves_existing_indices): an AddProperty rewrite
  no longer rebuilds indexes inline; assert ensure_indices restores id BTREE +
  name FTS after the rewrite.

Verified by grep that these + the server/CLI tests are the complete set of
"apply builds an index" assertions; all other index-presence tests run after
load/ensure_indices/primitives, which still build.

Refs dev-graph iss-848.

* fix(engine): optimize always reports pending indexes, not only on create-work (PR review)

Cursor Bugbot (MED): pending_indexes was filled only when needs_index_create was
true, but the vector trainability precheck makes needs_index_work_node exclude an
untrainable Vector column. So a table whose sole missing index is untrainable, but
which optimize still compacts or reindexes, returned an empty pending_indexes —
contradicting the documented operator contract for deferred columns.

Run the (idempotent) build chokepoint unconditionally once past the no-op gate,
rather than gating it on needs_index_create. It skips existing indexes, builds
any buildable missing one, and reports an untrainable column as pending whether
the table entered for compaction, reindex, or index creation. needs_index_create
still gates the no-op decision (so an index-only table still enters the path).

Refs dev-graph iss-848.

* test(engine): reframe staged-BTREE-failure failpoint onto the reconciler path

ensure_indices_stage_btree_failure_leaves_existing_tables_writable fired
`ensure_indices.post_stage_pre_commit_btree` and expected `apply_schema` (adding
a type) to fail mid-BTREE-build. iss-848 removed apply's inline index build, so
that apply now succeeds and the test's unwrap_err panicked — it exercised a
removed code path.

Reframe onto where BTREE builds happen now: seed Person, add an `@index` on
`age` (apply records intent, defers the build), then `ensure_indices` builds the
deferred BTREE and the failpoint fires between stage and commit. Person's HEAD
is unchanged (no drift) and its EnsureIndices sidecar pins NoMovement; a write to
a different, unpinned table (Company) is unaffected (mutations/loads heal
roll-forward and proceed, unlike optimize/repair which refuse on a pending
sidecar). Preserves the original coverage (staged-index stage failure leaves
other tables writable, no drift) in the new architecture.

Refs dev-graph iss-848.

* feat(server): converge deferred indexes promptly after schema apply (iss-848)

Schema apply records @index intent but defers the physical build. On a
long-lived server, spawn a detached best-effort ensure_indices after a
successful apply so the indexes converge promptly instead of waiting for the
operator's next optimize. Fire-and-forget: it never blocks or fails the apply
response, and a failure is logged (the index still converges on the next
optimize). Guarded on result.applied. The CLI is one-shot, so it has no
equivalent; its convergence path is the optimize cadence.

handle.engine is already an Arc, so the spawn takes an owned clone. Convergence
itself is covered by the engine ensure_indices/optimize tests; the existing
empty-graph schema-apply route tests confirm the response is unaffected (the
spawn is a read-only no-op on an empty table).

Refs dev-graph iss-848.

* docs(maintenance): list pending_indexes in optimize per-table stats (consistency)
---
 AGENTS.md                                     |   3 +-
 crates/omnigraph-cli/src/main.rs              |   7 +
 .../omnigraph-cli/tests/cli_schema_config.rs  |   8 +-
 crates/omnigraph-server/src/handlers.rs       |  19 +++
 .../omnigraph-server/tests/schema_routes.rs   |  10 +-
 crates/omnigraph/src/db/mod.rs                |   6 +-
 crates/omnigraph/src/db/omnigraph.rs          |  54 +++++--
 crates/omnigraph/src/db/omnigraph/optimize.rs |  67 ++++++--
 .../src/db/omnigraph/schema_apply.rs          |  94 ++++-------
 .../omnigraph/src/db/omnigraph/table_ops.rs   | 142 +++++++++++++----
 crates/omnigraph/tests/failpoints.rs          |  95 ++++++-----
 crates/omnigraph/tests/maintenance.rs         | 149 ++++++++++++++++++
 crates/omnigraph/tests/schema_apply.rs        | 105 ++++++++++++
 docs/dev/invariants.md                        |  34 +++-
 docs/dev/testing.md                           |   4 +-
 docs/user/operations/maintenance.md           |   3 +-
 docs/user/search/indexes.md                   |   3 +-
 17 files changed, 622 insertions(+), 181 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 7e42a2a..b4453be 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -144,6 +144,7 @@ These are architectural rules that need to be in scope on every change. They're
 4. **Bearer-token plaintext never persists in process memory.** Tokens are hashed at startup; auth uses constant-time comparison; the actor id is server-resolved from the hash match and must not be settable by the client.
 5. **Reads always see the current index state for the branch they're reading.** Indexes track the branch head, not historical snapshots. If you change index lifecycle, preserve this guarantee.
 6. **Stable type IDs survive renames.** Schema migration relies on identity that's stable across rename — don't mint new IDs on rename.
+7. **Logical contract over physical state.** Physical state (index coverage, fragment layout, compaction versions, staged writes) is derived and rebuildable; it must never fail a logical operation. Check preconditions against logical state and let reconciliation converge the physical state idempotently — genuine logical conflicts still fail loudly. This is the rule rules 1–6 instantiate; full statement and applications in [docs/dev/invariants.md](docs/dev/invariants.md).
 
 ### Deny-list (fast-pass review filter — full reasoning in [docs/dev/invariants.md](docs/dev/invariants.md))
 
@@ -250,7 +251,7 @@ omnigraph policy explain --actor act-alice --action change --branch main
 | Compaction (`compact_files`) + reindex (`optimize_indices`) | ✅ | `omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency; per table runs `compact_files` **then Lance `optimize_indices`** (folds appended/rewritten fragments back into existing indexes — incremental merge, not retrain) and **publishes the resulting version to `__manifest`** (so the manifest tracks the Lance HEAD — required for reads to observe the work and for schema apply / strict writes to pass their HEAD-vs-manifest precondition), under the per-`(table, main)` write queue with `SidecarKind::Optimize` recovery coverage spanning both ops; **commits even with no compaction work if index coverage is stale**; **refuses on an unrecovered graph**; **skips uncovered HEAD > manifest drift** with `DriftNeedsRepair`; **skips blob-bearing tables** (reported via `TableOptimizeStats.skipped`, not silent; reindex is skipped for them too today), gated on `LANCE_SUPPORTS_BLOB_COMPACTION` until the upstream blob-v2 compaction-decode bug is fixed (see [docs/dev/invariants.md](docs/dev/invariants.md) Known Gaps) |
 | Repair uncovered drift | — | `omnigraph repair` explicitly classifies uncovered table `HEAD > manifest` drift: verified maintenance drift (`ReserveFragments`/`Rewrite`) can be published with `--confirm`; suspicious or unverifiable drift requires `--force --confirm`. Sidecar-covered crash residuals still recover automatically on open. |
 | Cleanup (`cleanup_old_versions`) | ✅ | `omnigraph cleanup` with `--keep` / `--older-than` policy |
-| BTREE / inverted (FTS) / vector indexes | ✅ | `ensure_indices` builds them per `@index`/`@key` column, dispatched by type via `node_prop_index_kind` (enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector); idempotent; lazy across branches. Coverage of fragments appended after build is restored by `optimize`'s `optimize_indices` pass (see Compaction row). |
+| BTREE / inverted (FTS) / vector indexes | ✅ | `@index`/`@key` declares intent; the physical index is derived state that never fails a logical op. Built per column through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector); idempotent; lazy across branches. **Schema apply builds nothing** (records intent only); `load`/`mutate` build inline but **defer an untrainable Vector column** (no trainable vectors yet) as *pending* rather than aborting. `ensure_indices`/`optimize` is the reconciler that materializes declared-but-missing indexes and restores coverage of appended/rewritten fragments (`optimize_indices`), reporting still-pending columns (see Compaction row). |
 | `merge_insert` upsert | ✅ | `LoadMode::Merge`, mutation `update`/`insert`/`delete` lowering |
 | Vector search | ✅ | `nearest()` query op; embedding pipeline (Gemini / OpenAI clients); `@embed` in schema |
 | Full-text search | ✅ | `search/fuzzy/match_text/bm25` query ops |
diff --git a/crates/omnigraph-cli/src/main.rs b/crates/omnigraph-cli/src/main.rs
index 45e24f7..a02f9aa 100644
--- a/crates/omnigraph-cli/src/main.rs
+++ b/crates/omnigraph-cli/src/main.rs
@@ -770,6 +770,10 @@ async fn main() -> Result<()> {
                         "skipped": s.skipped.map(|r| r.as_str()),
                         "manifest_version": s.manifest_version,
                         "lance_head_version": s.lance_head_version,
+                        "pending_indexes": s.pending_indexes.iter().map(|p| serde_json::json!({
+                            "column": p.column,
+                            "reason": p.reason,
+                        })).collect::<Vec<_>>(),
                     })).collect::<Vec<_>>(),
                 });
                 print_json(&value)?;
@@ -786,6 +790,9 @@ async fn main() -> Result<()> {
                     } else {
                         println!("  {:<40} no-op", s.table_key);
                     }
+                    for p in &s.pending_indexes {
+                        println!("    ↳ index pending on '{}': {}", p.column, p.reason);
+                    }
                 }
             }
         }
diff --git a/crates/omnigraph-cli/tests/cli_schema_config.rs b/crates/omnigraph-cli/tests/cli_schema_config.rs
index 8c5b1f3..b81d6ff 100644
--- a/crates/omnigraph-cli/tests/cli_schema_config.rs
+++ b/crates/omnigraph-cli/tests/cli_schema_config.rs
@@ -334,7 +334,13 @@ fn schema_apply_json_adds_index_for_existing_property() {
         let dataset = snapshot.open("node:Person").await.unwrap();
         dataset.load_indices().await.unwrap().len()
     });
-    assert!(after_index_count > before_index_count);
+    // iss-848: `schema apply` records the `@index` intent but defers the physical
+    // index build (materialized later by ensure_indices/optimize; on this empty
+    // table nothing builds anyway). So the physical index count is unchanged.
+    assert_eq!(
+        after_index_count, before_index_count,
+        "schema apply records @index intent but defers the physical build (iss-848)"
+    );
 }
 
 #[test]
diff --git a/crates/omnigraph-server/src/handlers.rs b/crates/omnigraph-server/src/handlers.rs
index 26fa33f..8e310fd 100644
--- a/crates/omnigraph-server/src/handlers.rs
+++ b/crates/omnigraph-server/src/handlers.rs
@@ -1196,6 +1196,25 @@ pub(crate) async fn server_schema_apply(
         .await
         .map_err(ApiError::from_omni)?
     };
+    // Prompt index convergence (iss-848): schema apply records `@index` intent
+    // but defers the physical build. On a long-lived server, materialize it
+    // promptly rather than waiting for the next `optimize` cron — spawned
+    // detached so it never blocks or fails the apply response. Best-effort: a
+    // failure is logged and the index still converges on the next optimize.
+    // The CLI is one-shot, so it has no equivalent; its convergence path is the
+    // operator's optimize cadence.
+    if result.applied {
+        let engine = Arc::clone(&handle.engine);
+        tokio::spawn(async move {
+            if let Err(err) = engine.ensure_indices().await {
+                tracing::warn!(
+                    target: "omnigraph::server",
+                    error = %err,
+                    "post-apply ensure_indices failed; indexes will converge on the next optimize",
+                );
+            }
+        });
+    }
     Ok(Json(schema_apply_output(handle.uri.as_str(), result)))
 }
 
diff --git a/crates/omnigraph-server/tests/schema_routes.rs b/crates/omnigraph-server/tests/schema_routes.rs
index d250d8a..65b39a9 100644
--- a/crates/omnigraph-server/tests/schema_routes.rs
+++ b/crates/omnigraph-server/tests/schema_routes.rs
@@ -294,6 +294,11 @@ async fn schema_apply_route_can_add_index() {
 
     assert_eq!(status, StatusCode::OK);
     assert_eq!(payload["applied"], true);
+    // iss-848: the /schema/apply route accepts the index-add and applies it as a
+    // metadata change — it records the `@index` intent in the catalog/IR but does
+    // NOT build the physical index inline (the build is deferred to
+    // ensure_indices/optimize; on this empty table nothing would build anyway).
+    // So the physical index count is unchanged by the apply.
     let reopened = Omnigraph::open(graph.to_str().unwrap()).await.unwrap();
     let snapshot = reopened
         .snapshot_of(ReadTarget::branch("main"))
@@ -301,7 +306,10 @@ async fn schema_apply_route_can_add_index() {
         .unwrap();
     let dataset = snapshot.open("node:Person").await.unwrap();
     let after_index_count = dataset.load_indices().await.unwrap().len();
-    assert!(after_index_count > before_index_count);
+    assert_eq!(
+        after_index_count, before_index_count,
+        "schema apply records @index intent but defers the physical build (iss-848)"
+    );
 }
 
 #[tokio::test]
diff --git a/crates/omnigraph/src/db/mod.rs b/crates/omnigraph/src/db/mod.rs
index 000602a..f382908 100644
--- a/crates/omnigraph/src/db/mod.rs
+++ b/crates/omnigraph/src/db/mod.rs
@@ -11,9 +11,9 @@ pub use graph_coordinator::{GraphCoordinator, ReadTarget, ResolvedTarget, Snapsh
 pub use manifest::{Snapshot, SubTableEntry, SubTableUpdate};
 pub(crate) use omnigraph::ensure_public_branch_ref;
 pub use omnigraph::{
-    CleanupPolicyOptions, InitOptions, MergeOutcome, Omnigraph, OpenMode, RepairAction,
-    RepairClassification, RepairOptions, RepairStats, SchemaApplyOptions, SchemaApplyResult,
-    SkipReason, TableCleanupStats, TableOptimizeStats, TableRepairStats,
+    CleanupPolicyOptions, InitOptions, MergeOutcome, Omnigraph, OpenMode, PendingIndex,
+    RepairAction, RepairClassification, RepairOptions, RepairStats, SchemaApplyOptions,
+    SchemaApplyResult, SkipReason, TableCleanupStats, TableOptimizeStats, TableRepairStats,
 };
 
 pub(crate) const SCHEMA_APPLY_LOCK_BRANCH: &str = "__schema_apply_lock__";
diff --git a/crates/omnigraph/src/db/omnigraph.rs b/crates/omnigraph/src/db/omnigraph.rs
index 6c80117..6d2ccd7 100644
--- a/crates/omnigraph/src/db/omnigraph.rs
+++ b/crates/omnigraph/src/db/omnigraph.rs
@@ -40,6 +40,7 @@ pub use repair::{
     RepairAction, RepairClassification, RepairOptions, RepairStats, TableRepairStats,
 };
 pub use schema_apply::SchemaApplyOptions;
+pub use table_ops::PendingIndex;
 
 use super::commit_graph::GraphCommit;
 use super::manifest::{
@@ -1069,11 +1070,15 @@ impl Omnigraph {
     /// unbranched subtables keep inheriting `main`, while subtables inherited
     /// from an ancestor branch are first forked into the active branch before
     /// their index metadata is updated.
-    pub async fn ensure_indices(&self) -> Result<()> {
+    /// Returns the declared indexes that could not be materialized on this
+    /// pass (today: vector columns with no trainable vectors yet). They are
+    /// deferred, not errors; a later `ensure_indices`/`optimize` builds them
+    /// once the column is trainable. Reads stay correct (brute-force) meanwhile.
+    pub async fn ensure_indices(&self) -> Result<Vec<PendingIndex>> {
         table_ops::ensure_indices(self).await
     }
 
-    pub async fn ensure_indices_on(&self, branch: &str) -> Result<()> {
+    pub async fn ensure_indices_on(&self, branch: &str) -> Result<Vec<PendingIndex>> {
         table_ops::ensure_indices_on(self, branch).await
     }
 
@@ -1530,19 +1535,10 @@ impl Omnigraph {
         &self,
         table_key: &str,
         ds: &mut SnapshotHandle,
-    ) -> Result<()> {
+    ) -> Result<Vec<PendingIndex>> {
         table_ops::build_indices_on_dataset(self, table_key, ds).await
     }
 
-    pub(crate) async fn build_indices_on_dataset_for_catalog(
-        &self,
-        catalog: &Catalog,
-        table_key: &str,
-        ds: &mut SnapshotHandle,
-    ) -> Result<()> {
-        table_ops::build_indices_on_dataset_for_catalog(self, catalog, table_key, ds).await
-    }
-
     // Used only by in-tree tests (`#[cfg(test)]`); the runtime path now
     // uses `commit_updates_on_branch_with_expected` exclusively.
     #[cfg(test)]
@@ -2498,25 +2494,49 @@ edge WorksAt: Person -> Company
     }
 
     #[tokio::test]
-    async fn test_apply_schema_adds_index_for_existing_property() {
+    async fn test_apply_schema_defers_index_then_reconciler_builds_it() {
+        // iss-848: schema apply records the @index intent but builds nothing
+        // inline; a later ensure_indices materializes it once the table has
+        // rows. (Use `age`, which is unindexed in TEST_SCHEMA — `name @key` is
+        // already FTS-indexed at seed, so it can't show the deferral.)
         let dir = tempfile::tempdir().unwrap();
         let uri = dir.path().to_str().unwrap();
         let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
+        seed_person_row(&mut db, "Alice", Some(30)).await;
 
-        let desired = TEST_SCHEMA.replace("name: String @key", "name: String @key @index");
+        let desired = TEST_SCHEMA.replace("age: I32?", "age: I32? @index");
         db.apply_schema(&desired).await.unwrap();
 
+        // Apply built nothing — the BTREE on `age` is deferred.
         let snapshot = db.snapshot().await;
         let ds = db
             .storage()
             .open_snapshot_at_table(&snapshot, "node:Person")
             .await
             .unwrap();
-        assert!(db.storage().has_fts_index(&ds, "name").await.unwrap());
+        assert!(
+            !db.storage().has_btree_index(&ds, "age").await.unwrap(),
+            "apply must not build the index inline (deferred to the reconciler)"
+        );
+
+        // The reconciler materializes it (Person has a row).
+        db.ensure_indices().await.unwrap();
+        let snapshot = db.snapshot().await;
+        let ds = db
+            .storage()
+            .open_snapshot_at_table(&snapshot, "node:Person")
+            .await
+            .unwrap();
+        assert!(
+            db.storage().has_btree_index(&ds, "age").await.unwrap(),
+            "ensure_indices must build the deferred index"
+        );
     }
 
     #[tokio::test]
-    async fn test_apply_schema_rewrite_preserves_existing_indices() {
+    async fn test_apply_schema_rewrite_defers_index_then_reconciler_restores() {
+        // iss-848: an AddProperty rewrite writes a new dataset version without
+        // rebuilding indexes inline (deferred); ensure_indices restores them.
         let dir = tempfile::tempdir().unwrap();
         let uri = dir.path().to_str().unwrap();
         let initial_schema = TEST_SCHEMA.replace("name: String @key", "name: String @key @index");
@@ -2529,6 +2549,8 @@ edge WorksAt: Person -> Company
         );
         db.apply_schema(&desired).await.unwrap();
 
+        // After the rewrite the reconciler restores index coverage.
+        db.ensure_indices().await.unwrap();
         let snapshot = db.snapshot().await;
         let ds = db
             .storage()
diff --git a/crates/omnigraph/src/db/omnigraph/optimize.rs b/crates/omnigraph/src/db/omnigraph/optimize.rs
index 9195256..00cb872 100644
--- a/crates/omnigraph/src/db/omnigraph/optimize.rs
+++ b/crates/omnigraph/src/db/omnigraph/optimize.rs
@@ -140,6 +140,12 @@ pub struct TableOptimizeStats {
     /// Lance HEAD version observed by optimize for drift skips. `None` for
     /// normal compaction/no-op/blob skips.
     pub lance_head_version: Option<u64>,
+    /// Declared `@index` columns on this table the reconciler could not build
+    /// this run, each with the `reason` (today: a vector column with no
+    /// trainable vectors yet). Empty on the common path. Reported, not fatal — a
+    /// later `optimize` retries; the `list_indices`/`indisvalid` analog so
+    /// operators can see which index is pending and why.
+    pub pending_indexes: Vec<super::PendingIndex>,
 }
 
 impl TableOptimizeStats {
@@ -153,6 +159,7 @@ impl TableOptimizeStats {
             skipped: None,
             manifest_version: None,
             lance_head_version: None,
+            pending_indexes: Vec::new(),
         }
     }
 
@@ -166,6 +173,7 @@ impl TableOptimizeStats {
             skipped: Some(reason),
             manifest_version: None,
             lance_head_version: None,
+            pending_indexes: Vec::new(),
         }
     }
 
@@ -183,6 +191,7 @@ impl TableOptimizeStats {
             skipped: Some(SkipReason::DriftNeedsRepair),
             manifest_version: Some(manifest_version),
             lance_head_version: Some(lance_head_version),
+            pending_indexes: Vec::new(),
         }
     }
 }
@@ -371,14 +380,26 @@ async fn optimize_one_table(
     let will_compact = plan.num_tasks() > 0;
     // Even when there is nothing to compact, the table may still have index
     // work: rows appended since the index was built (e.g. via `ingest --mode
-    // merge`) are scanned unindexed until folded in. Either compaction or stale
-    // index coverage is enough to enter the publish path. If NEITHER, this
-    // table is a no-op and must NOT be pinned in a sidecar — a zero-commit pin
-    // classifies NoMovement on recovery and forces an all-or-nothing rollback
-    // of sibling tables' legitimate work. Uncovered pre-existing manifest/HEAD
-    // drift is skipped above and must go through explicit repair.
+    // merge`) are scanned unindexed until folded in (needs_reindex), OR a
+    // declared `@index` was never built — schema apply records the intent but
+    // defers the physical build (iss-848), so optimize is the operator-facing
+    // reconciler that materializes it (needs_index_create). Any of the three is
+    // enough to enter the publish path. If NONE, this table is a no-op and must
+    // NOT be pinned in a sidecar — a zero-commit pin classifies NoMovement on
+    // recovery and forces an all-or-nothing rollback of sibling tables'
+    // legitimate work. Uncovered pre-existing manifest/HEAD drift is skipped
+    // above and goes through explicit repair, so this only runs on a healthy
+    // table under the per-table queue + sidecar.
     let needs_reindex = TableStore::has_unindexed_fragments(&ds).await?;
-    if !will_compact && !needs_reindex {
+    // needs_index_work_* checks "a declared index is missing AND row_count > 0",
+    // so empty tables stay no-ops (never pinned). It re-reads the head under the
+    // queue we already hold, so it is consistent with `ds`.
+    let needs_index_create = if let Some(type_name) = table_key.strip_prefix("node:") {
+        super::table_ops::needs_index_work_node(db, type_name, &table_key, &full_path, None).await?
+    } else {
+        super::table_ops::needs_index_work_edge(db, &table_key, &full_path, None).await?
+    };
+    if !will_compact && !needs_reindex && !needs_index_create {
         return Ok(TableOptimizeStats::compacted(
             table_key,
             &CompactionMetrics::default(),
@@ -427,7 +448,30 @@ async fn optimize_one_table(
     ds.optimize_indices(&OptimizeOptions::default())
         .await
         .map_err(|e| OmniError::Lance(format!("optimize_indices on {}: {}", table_key, e)))?;
-    let version_after = ds.version().version;
+
+    // Materialize any declared-but-missing index over the just-compacted layout,
+    // reusing the build chokepoint (idempotent: skips existing indexes; fault-
+    // isolates an untrainable vector column into `pending` rather than failing).
+    // Run it UNCONDITIONALLY now that we are past the no-op gate — not only when
+    // `needs_index_create`. A table can enter this path for compaction or
+    // reindex while its sole missing index is an untrainable Vector column
+    // (which `needs_index_work_*` does not count as buildable work); calling the
+    // build here is what surfaces that column in `pending_indexes`, so optimize
+    // can't compact a table yet silently drop the deferred-index signal.
+    // Idempotent + cheap when there is nothing to build. Vector index creation
+    // is an inline-commit residual; the Optimize sidecar's loose post_commit_pin
+    // covers the extra commits.
+    let catalog = db.catalog();
+    let mut snapshot = crate::storage_layer::SnapshotHandle::new(ds);
+    let pending_indexes: Vec<super::PendingIndex> =
+        super::table_ops::build_indices_on_dataset_for_catalog(
+            db,
+            &catalog,
+            &table_key,
+            &mut snapshot,
+        )
+        .await?;
+    let version_after = snapshot.dataset().version().version;
     let committed = version_after != version_before;
 
     // Pin the per-writer Phase B → Phase C residual for optimize: Lance HEAD has
@@ -438,9 +482,6 @@ async fn optimize_one_table(
     // expected = the version observed under the queue). On failure the sidecar
     // is intentionally left for the open-time recovery sweep to roll forward.
     if committed {
-        // Re-wrap the post-compaction dataset to read its state through the
-        // trait surface (`table_state` is a read; no HEAD advance).
-        let snapshot = crate::storage_layer::SnapshotHandle::new(ds);
         let state = db.storage().table_state(&full_path, &snapshot).await?;
         let update = crate::db::SubTableUpdate {
             table_key: table_key.clone(),
@@ -467,7 +508,9 @@ async fn optimize_one_table(
         );
     }
 
-    Ok(TableOptimizeStats::compacted(table_key, &metrics, committed))
+    let mut stat = TableOptimizeStats::compacted(table_key, &metrics, committed);
+    stat.pending_indexes = pending_indexes;
+    Ok(stat)
 }
 
 /// Run Lance `cleanup_old_versions` on every node + edge table on `main`,
diff --git a/crates/omnigraph/src/db/omnigraph/schema_apply.rs b/crates/omnigraph/src/db/omnigraph/schema_apply.rs
index f965ad4..c054004 100644
--- a/crates/omnigraph/src/db/omnigraph/schema_apply.rs
+++ b/crates/omnigraph/src/db/omnigraph/schema_apply.rs
@@ -193,7 +193,6 @@ where
     let mut added_tables = BTreeSet::new();
     let mut renamed_tables = HashMap::new();
     let mut rewritten_tables = BTreeSet::new();
-    let mut indexed_tables = BTreeSet::new();
     let mut dropped_tables = BTreeSet::new();
     // Hard-drop cleanup targets: (table_key, full_dataset_uri).
     // Populated for DropProperty { Hard } and DropType { Hard }; the
@@ -252,14 +251,14 @@ where
                     .or_default()
                     .insert(to.clone(), from.clone());
             }
-            SchemaMigrationStep::AddConstraint {
-                type_kind,
-                type_name,
-                ..
-            } => {
-                indexed_tables.insert(schema_table_key(*type_kind, type_name));
-            }
-            SchemaMigrationStep::UpdateTypeMetadata { .. }
+            // AddConstraint is only ever an `@index` addition (every other
+            // added constraint plans as UnsupportedChange). It records intent
+            // in the desired catalog/IR; the physical index is built off the
+            // critical path by ensure_indices/optimize (iss-848), so the apply
+            // does no table work for it — a pure metadata change like the two
+            // metadata steps below.
+            SchemaMigrationStep::AddConstraint { .. }
+            | SchemaMigrationStep::UpdateTypeMetadata { .. }
             | SchemaMigrationStep::UpdatePropertyMetadata { .. } => {}
             SchemaMigrationStep::DropProperty {
                 type_kind,
@@ -347,18 +346,15 @@ where
     let mut table_updates = HashMap::<String, crate::db::SubTableUpdate>::new();
     let mut table_tombstones = HashMap::<String, u64>::new();
 
-    // Recovery sidecar: protect the per-table commit_staged loop in
-    // rewritten_tables + indexed_tables. The post_commit_pin we record
-    // here is a lower bound (expected + 1); the classifier loose-matches
-    // for SidecarKind::SchemaApply because the actual N depends on how
-    // many indices need building. See classify_table's loose-match arm.
+    // Recovery sidecar: protect the per-table `stage_overwrite` +
+    // `commit_staged` in rewritten_tables — the only tables that advance Lance
+    // HEAD inline now that index building is deferred to the reconciler
+    // (iss-848). Each rewritten table is exactly one commit, so
+    // `post_commit_pin = expected + 1` is now exact (it was a loose lower bound
+    // when index builds added extra commits); the classifier's loose-match for
+    // SidecarKind::SchemaApply still accepts it.
     let recovery_pins: Vec<crate::db::manifest::SidecarTablePin> = rewritten_tables
         .iter()
-        .chain(indexed_tables.iter().filter(|t| {
-            !rewritten_tables.contains(*t)
-                && !added_tables.contains(*t)
-                && !renamed_tables.contains_key(*t)
-        }))
         .filter_map(|table_key| {
             let entry = snapshot.entry(table_key)?;
             Some(crate::db::manifest::SidecarTablePin {
@@ -490,10 +486,11 @@ where
         let table_path = table_path_for_table_key(table_key)?;
         let dataset_uri = db.storage().dataset_uri(&table_path);
         let schema = schema_for_table_key(&desired_catalog, table_key)?;
-        let mut ds =
+        let ds =
             SnapshotHandle::new(TableStore::create_empty_dataset(&dataset_uri, &schema).await?);
-        db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut ds)
-            .await?;
+        // Indexes for the new table are materialized off the critical path by
+        // ensure_indices/optimize (iss-848); a 0-row table is never trainable
+        // anyway. The @index intent is recorded in the persisted catalog/IR.
         let state = db.storage().table_state(&dataset_uri, &ds).await?;
         table_registrations.insert(table_key.clone(), table_path);
         table_updates.insert(
@@ -533,10 +530,9 @@ where
         .await?;
         let table_path = table_path_for_table_key(target_table_key)?;
         let dataset_uri = db.storage().dataset_uri(&table_path);
-        let mut target_ds =
+        let target_ds =
             SnapshotHandle::new(TableStore::write_dataset(&dataset_uri, batch).await?);
-        db.build_indices_on_dataset_for_catalog(&desired_catalog, target_table_key, &mut target_ds)
-            .await?;
+        // Indexes on the renamed table are reconciled later (iss-848).
         let state = db.storage().table_state(&dataset_uri, &target_ds).await?;
         table_registrations.insert(target_table_key.clone(), table_path);
         table_updates.insert(
@@ -593,9 +589,10 @@ where
             .open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref())
             .await?;
         let staged = db.storage().stage_overwrite(&existing, batch).await?;
-        let mut target_ds = db.storage().commit_staged(existing, staged).await?;
-        db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut target_ds)
-            .await?;
+        let target_ds = db.storage().commit_staged(existing, staged).await?;
+        // The rewrite drops the table's existing index coverage; it is
+        // restored off the critical path by optimize's optimize_indices /
+        // ensure_indices (iss-848). Reads scan uncovered fragments meanwhile.
         let state = db.storage().table_state(&dataset_uri, &target_ds).await?;
         table_updates.insert(
             table_key.clone(),
@@ -609,41 +606,12 @@ where
         );
     }
 
-    for table_key in &indexed_tables {
-        if added_tables.contains(table_key)
-            || renamed_tables.contains_key(table_key)
-            || rewritten_tables.contains(table_key)
-        {
-            continue;
-        }
-        let entry = snapshot.entry(table_key).ok_or_else(|| {
-            OmniError::manifest(format!(
-                "missing table '{}' for schema index apply",
-                table_key
-            ))
-        })?;
-        ensure_snapshot_entry_head_matches(db, entry).await?;
-        let dataset_uri = db.storage().dataset_uri(&entry.table_path);
-        let mut ds = db
-            .storage()
-            .open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref())
-            .await?;
-        db.storage()
-            .ensure_expected_version(&ds, table_key, entry.table_version)?;
-        db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut ds)
-            .await?;
-        let state = db.storage().table_state(&dataset_uri, &ds).await?;
-        table_updates.insert(
-            table_key.clone(),
-            crate::db::SubTableUpdate {
-                table_key: table_key.clone(),
-                table_version: state.version,
-                table_branch: None,
-                row_count: state.row_count,
-                version_metadata: state.version_metadata,
-            },
-        );
-    }
+    // Index-only changes (AddConstraint, i.e. adding an `@index`) are pure
+    // metadata: the new `@index` intent is recorded in the desired catalog/IR
+    // persisted below, and the physical index is materialized off the critical
+    // path by `ensure_indices`/`optimize` (iss-848). Schema apply touches no
+    // table data for them, so there is no per-table loop here and no recovery
+    // pin (no Lance HEAD advances). Reads stay correct meanwhile via a scan.
 
     let mut manifest_changes = Vec::new();
     for (table_key, table_path) in table_registrations {
diff --git a/crates/omnigraph/src/db/omnigraph/table_ops.rs b/crates/omnigraph/src/db/omnigraph/table_ops.rs
index 3f40c1d..fab0e0c 100644
--- a/crates/omnigraph/src/db/omnigraph/table_ops.rs
+++ b/crates/omnigraph/src/db/omnigraph/table_ops.rs
@@ -21,7 +21,7 @@ pub(super) async fn graph_index_for_resolved(
     db.runtime_cache.graph_index(resolved, &catalog).await
 }
 
-pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<()> {
+pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<Vec<PendingIndex>> {
     let current_branch = db
         .coordinator
         .read()
@@ -31,7 +31,7 @@ pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<()> {
     ensure_indices_for_branch(db, current_branch.as_deref()).await
 }
 
-pub(super) async fn ensure_indices_on(db: &Omnigraph, branch: &str) -> Result<()> {
+pub(super) async fn ensure_indices_on(db: &Omnigraph, branch: &str) -> Result<Vec<PendingIndex>> {
     let branch = normalize_branch_name(branch)?;
     ensure_indices_for_branch(db, branch.as_deref()).await
 }
@@ -73,12 +73,16 @@ pub(super) async fn failpoint_publish_table_head_without_index_rebuild_for_test(
     .await
 }
 
-pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&str>) -> Result<()> {
+pub(super) async fn ensure_indices_for_branch(
+    db: &Omnigraph,
+    branch: Option<&str>,
+) -> Result<Vec<PendingIndex>> {
     db.ensure_schema_state_valid().await?;
     db.ensure_schema_apply_idle("ensure_indices").await?;
     let resolved = db.resolved_branch_target(branch).await?;
     let snapshot = resolved.snapshot;
     let mut updates = Vec::new();
+    let mut pending = Vec::new();
     let active_branch = resolved.branch;
     let catalog = db.catalog();
 
@@ -217,7 +221,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
         };
         let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0);
         if row_count > 0 {
-            build_indices_on_dataset(db, &table_key, &mut ds).await?;
+            pending.extend(build_indices_on_dataset(db, &table_key, &mut ds).await?);
         }
 
         let state = db.storage().table_state(&full_path, &ds).await?;
@@ -265,7 +269,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
         };
         let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0);
         if row_count > 0 {
-            build_indices_on_dataset(db, &table_key, &mut ds).await?;
+            pending.extend(build_indices_on_dataset(db, &table_key, &mut ds).await?);
         }
 
         let state = db.storage().table_state(&full_path, &ds).await?;
@@ -307,7 +311,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
         }
     }
 
-    Ok(())
+    Ok(pending)
 }
 
 /// The single scalar/vector index a node property receives from a one-column
@@ -352,6 +356,26 @@ fn node_prop_index_kind(prop_type: &PropType) -> Option<NodePropIndexKind> {
     }
 }
 
+/// Whether a vector column currently has at least one non-null vector — the
+/// minimum for Lance IVF k-means to train (the `ivf_flat(1)` index we build
+/// needs >=1 vector). Used identically by `needs_index_work_node` (so an
+/// untrainable column is not pinned for recovery — avoiding a zero-commit pin
+/// that would roll back a sibling's index work) and by the vector build arm (so
+/// `create_vector_index` is only attempted when it can succeed, keeping its
+/// genuine errors fatal instead of swallowed as pending). If index params
+/// become size-aware (dev-graph iss-687), this threshold moves with them.
+async fn vector_column_trainable(
+    db: &Omnigraph,
+    ds: &SnapshotHandle,
+    column: &str,
+) -> Result<bool> {
+    Ok(db
+        .storage()
+        .count_rows(ds, Some(format!("{column} IS NOT NULL")))
+        .await?
+        > 0)
+}
+
 /// Returns true if the node table is missing at least one declared
 /// scalar/vector index that `build_indices_on_dataset_for_catalog` would
 /// build AND has at least one row (the ensure_indices loop has
@@ -366,7 +390,7 @@ fn node_prop_index_kind(prop_type: &PropType) -> Option<NodePropIndexKind> {
 /// (DateTime/Date/numeric/Bool), FTS for free-text Strings, or a Vector index.
 /// Edges get BTree only (id, src, dst). This helper and the builder share
 /// `node_prop_index_kind` so they cannot drift — see its doc comment.
-async fn needs_index_work_node(
+pub(super) async fn needs_index_work_node(
     db: &Omnigraph,
     type_name: &str,
     table_key: &str,
@@ -409,7 +433,14 @@ async fn needs_index_work_node(
                 }
             }
             Some(NodePropIndexKind::Vector) => {
-                if !db.storage().has_vector_index(&ds, prop_name).await? {
+                // Only count a missing vector index as buildable *work* when the
+                // column is trainable (>=1 non-null vector). An untrainable
+                // column would defer in the build and commit nothing; pinning it
+                // for recovery would be a zero-commit pin that classifies
+                // NoMovement and rolls back a sibling table's index work.
+                if !db.storage().has_vector_index(&ds, prop_name).await?
+                    && vector_column_trainable(db, &ds, prop_name).await?
+                {
                     return Ok(true);
                 }
             }
@@ -434,7 +465,7 @@ async fn needs_index_work_node(
 ///
 /// Empty edge tables are skipped by the ensure_indices loop the same
 /// way node tables are; see `needs_index_work_node`.
-async fn needs_index_work_edge(
+pub(super) async fn needs_index_work_edge(
     db: &Omnigraph,
     table_key: &str,
     full_path: &str,
@@ -632,11 +663,25 @@ pub(super) async fn open_dataset_at_state(
         .await
 }
 
+/// A declared index the builder could not materialize on this pass. Today the
+/// only such case is a vector (IVF) column with no trainable vectors yet
+/// (KMeans needs >=1 vector), e.g. the load-before-embed window. Reported, not
+/// fatal: a later `ensure_indices`/`optimize` retries once the column is
+/// buildable, and reads stay correct via brute-force meanwhile. Surfacing
+/// pending index *status* rather than failing the operation is the database
+/// norm (Postgres `indisvalid`, LanceDB `list_indices`).
+#[derive(Debug, Clone)]
+pub struct PendingIndex {
+    pub table_key: String,
+    pub column: String,
+    pub reason: String,
+}
+
 pub(super) async fn build_indices_on_dataset(
     db: &Omnigraph,
     table_key: &str,
     ds: &mut SnapshotHandle,
-) -> Result<()> {
+) -> Result<Vec<PendingIndex>> {
     let catalog = db.catalog();
     build_indices_on_dataset_for_catalog(db, &catalog, table_key, ds).await
 }
@@ -646,8 +691,9 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
     catalog: &Catalog,
     table_key: &str,
     ds: &mut SnapshotHandle,
-) -> Result<()> {
+) -> Result<Vec<PendingIndex>> {
     if let Some(type_name) = table_key.strip_prefix("node:") {
+        let mut pending = Vec::new();
         if !db.storage().has_btree_index(ds, "id").await? {
             stage_and_commit_btree(db, table_key, ds, &["id"]).await?;
         }
@@ -676,22 +722,52 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
                         }
                         Some(NodePropIndexKind::Vector) => {
                             if !db.storage().has_vector_index(ds, prop_name).await? {
-                                // Inline-commit residual: lance-6.0.1 does not
-                                // expose `build_index_metadata_from_segments` as
-                                // `pub`, so vector indices cannot be staged from
-                                // outside the lance crate. Document at the call
-                                // site; companion ticket to lance-format/lance#6658.
-                                let new_snap = db
-                                    .storage_inline_residual()
-                                    .create_vector_index(ds.clone(), prop_name.as_str())
-                                    .await
-                                    .map_err(|e| {
-                                        OmniError::Lance(format!(
-                                            "create Vector index on {}({}): {}",
-                                            table_key, prop_name, e
-                                        ))
-                                    })?;
-                                *ds = new_snap;
+                                // A vector (IVF) index trains k-means over the column,
+                                // so it needs >=1 non-null vector (KMeans errors
+                                // "cannot train N centroids with 0 vectors"). Precheck
+                                // trainability: a column with no vectors yet (e.g. rows
+                                // loaded before `embed`) is recorded as a *pending*
+                                // index and skipped — deferred, not failed. The SAME
+                                // predicate gates `needs_index_work_node`, so an
+                                // untrainable column is never pinned for recovery (no
+                                // zero-commit pin that would roll back a sibling
+                                // table's index work). This function is the chokepoint
+                                // every write path funnels through (load/mutate, schema
+                                // apply, ensure_indices, optimize, merge), realizing
+                                // the governing principle — physical index state never
+                                // fails a logical operation. Only when trainable do we
+                                // attempt the build, and then we PROPAGATE any error: a
+                                // genuine I/O/manifest/Lance failure must stay fatal,
+                                // not be hidden as pending. (Vector creation is an
+                                // inline-commit residual until lance#6666; iss-951.)
+                                if vector_column_trainable(db, ds, prop_name).await? {
+                                    let new_snap = db
+                                        .storage_inline_residual()
+                                        .create_vector_index(ds.clone(), prop_name.as_str())
+                                        .await
+                                        .map_err(|e| {
+                                            OmniError::Lance(format!(
+                                                "create Vector index on {}({}): {}",
+                                                table_key, prop_name, e
+                                            ))
+                                        })?;
+                                    *ds = new_snap;
+                                } else {
+                                    tracing::info!(
+                                        target: "omnigraph::index",
+                                        table = %table_key,
+                                        column = %prop_name,
+                                        "deferring Vector index: column has no \
+                                         trainable vectors yet",
+                                    );
+                                    pending.push(PendingIndex {
+                                        table_key: table_key.to_string(),
+                                        column: prop_name.clone(),
+                                        reason: "column has no non-null vectors to \
+                                                 train on yet"
+                                            .to_string(),
+                                    });
+                                }
                             }
                         }
                         // Enum + orderable scalars (DateTime/Date/numeric/Bool)
@@ -709,7 +785,7 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
                 }
             }
         }
-        return Ok(());
+        return Ok(pending);
     }
 
     if table_key.starts_with("edge:") {
@@ -722,7 +798,9 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
         if !db.storage().has_btree_index(ds, "dst").await? {
             stage_and_commit_btree(db, table_key, ds, &["dst"]).await?;
         }
-        return Ok(());
+        // Edge tables only get BTree (id/src/dst), which build at any
+        // cardinality; no pending state is possible here.
+        return Ok(Vec::new());
     }
 
     Err(OmniError::manifest(format!(
@@ -844,7 +922,11 @@ async fn prepare_updates_for_commit(
                 crate::db::MutationOpKind::SchemaRewrite,
             )
             .await?;
-            build_indices_on_dataset(db, &prepared_update.table_key, &mut ds).await?;
+            // Any column not yet buildable (e.g. a vector column whose rows
+            // have null embeddings) is deferred and logged inside
+            // build_indices; a later ensure_indices/optimize materializes it.
+            // The load/mutate/merge commit must not fail on it.
+            let _pending = build_indices_on_dataset(db, &prepared_update.table_key, &mut ds).await?;
             let state = db.storage().table_state(&full_path, &ds).await?;
             prepared_update.table_version = state.version;
             prepared_update.row_count = state.row_count;
diff --git a/crates/omnigraph/tests/failpoints.rs b/crates/omnigraph/tests/failpoints.rs
index b45cfa0..2a0e9aa 100644
--- a/crates/omnigraph/tests/failpoints.rs
+++ b/crates/omnigraph/tests/failpoints.rs
@@ -2619,69 +2619,66 @@ async fn finalize_publisher_residual_does_not_drift_untouched_tables() {
 }
 
 /// Acceptance test: a stage-step failure in the staged-index path
-/// (`stage_create_btree_index` succeeded; `commit_staged` not yet
-/// called) leaves NO Lance-HEAD drift on the existing tables.
-/// Subsequent operations against those tables succeed without
-/// `ExpectedVersionMismatch`.
+/// (`stage_create_btree_index` succeeded; `commit_staged` not yet called)
+/// leaves NO Lance-HEAD drift, so other tables stay writable.
 ///
-/// Path: `apply_schema(v1 → v2)` adds a new node type. The
-/// `added_tables` loop in `schema_apply` creates the empty dataset and
-/// then calls `build_indices_on_dataset_for_catalog` →
-/// `stage_and_commit_btree(..., &["id"])`. The failpoint fires
-/// between `stage_create_btree_index` and `commit_staged`, so the
-/// staged segments are written under `_indices/<uuid>/` but Lance HEAD
-/// on the new dataset is unchanged at v=1. The schema-apply lock
-/// branch is released by `apply_schema`'s outer match. Existing
-/// tables (e.g. `node:Person`) are completely untouched by the new
-/// node's added_tables iteration — they're outside the failed apply
-/// path entirely — and we assert that mutations against them continue
-/// to work.
-///
-/// The orphan empty dataset from the failed apply is acceptable
-/// residual: it's unreferenced by `__manifest` and will be reclaimed
-/// by `cleanup_old_versions` (or removed when a future apply at the
-/// same target path resolves the rename).
+/// Under iss-848 schema apply no longer builds indexes inline — the build
+/// happens in the reconciler (`ensure_indices`/`optimize`) and at load. So this
+/// fires the failpoint where it lives now: an `ensure_indices` build of a BTREE
+/// that a prior apply declared (`@index`) but deferred. The failpoint fires
+/// between `stage_create_btree_index` and `commit_staged`, so the staged
+/// segment is written under `_indices/<uuid>/` but `node:Person`'s Lance HEAD is
+/// unchanged. `ensure_indices` fails and its EnsureIndices sidecar pins only
+/// Person at NoMovement (a clean no-op on the next open). A write to a
+/// different, unpinned table (`node:Company`) is unaffected: mutations/loads run
+/// a roll-forward-only heal and proceed — they do not refuse on a pending
+/// sidecar the way `optimize`/`repair` do — so the write succeeds with no drift.
 #[tokio::test]
 async fn ensure_indices_stage_btree_failure_leaves_existing_tables_writable() {
     let _scenario = FailScenario::setup();
     let dir = tempfile::tempdir().unwrap();
     let uri = dir.path().to_str().unwrap().to_string();
-
-    // Init with TEST_SCHEMA which declares Person + Knows. Indices on
-    // those tables get built during init.
     let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
 
-    // Apply a schema that adds a new node type. The added_tables loop
-    // will hit the failpoint between stage and commit on the new
-    // node:Project table's btree-on-id build. (TEST_SCHEMA already
-    // has Person + Company + Knows + WorksAt — pick a name that isn't
-    // already declared.)
-    let extended_schema = format!(
-        "{}\nnode Project {{ name: String @key }}\n",
-        helpers::TEST_SCHEMA
-    );
-
-    {
-        let _failpoint =
-            ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return");
-        let err = db.apply_schema(&extended_schema).await.unwrap_err();
-        assert!(
-            err.to_string()
-                .contains("ensure_indices.post_stage_pre_commit_btree"),
-            "schema apply should fail with the synthetic failpoint error, got: {err}"
-        );
-    }
-
-    // Existing tables stayed at their pre-apply versions; subsequent
-    // mutations against them succeed (no Lance-HEAD drift).
+    // Seed a Person row — the load builds Person's id BTREE + name FTS.
     mutate_main(
         &mut db,
         helpers::MUTATION_QUERIES,
         "insert_person",
-        &mixed_params(&[("$name", "Eve")], &[("$age", 22)]),
+        &mixed_params(&[("$name", "Alice")], &[("$age", 30)]),
     )
     .await
-    .expect("Person mutation must succeed after the failed schema apply — existing tables are not drifted");
+    .expect("seed Person");
+
+    // Add `@index` on `age`: schema apply records the intent but defers the
+    // physical build (iss-848), so the BTREE on `age` is unbuilt.
+    let indexed_schema = helpers::TEST_SCHEMA.replace("age: I32?", "age: I32? @index");
+    db.apply_schema(&indexed_schema)
+        .await
+        .expect("adding an @index is metadata-only and succeeds");
+
+    {
+        // ensure_indices builds the deferred `age` BTREE on Person; the failpoint
+        // fires between stage and commit, so Person's Lance HEAD does not move.
+        let _failpoint =
+            ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return");
+        let err = db.ensure_indices().await.unwrap_err();
+        assert!(
+            err.to_string()
+                .contains("ensure_indices.post_stage_pre_commit_btree"),
+            "ensure_indices should fail with the synthetic failpoint error, got: {err}"
+        );
+    }
+
+    // A different, unpinned table is untouched by the failed index build.
+    use omnigraph::loader::{LoadMode, load_jsonl};
+    load_jsonl(
+        &mut db,
+        r#"{"type": "Company", "data": {"name": "Acme"}}"#,
+        LoadMode::Append,
+    )
+    .await
+    .expect("Company write on a table untouched by the failed ensure_indices should succeed");
 }
 
 fn assert_no_staging_files(graph: &std::path::Path) {
diff --git a/crates/omnigraph/tests/maintenance.rs b/crates/omnigraph/tests/maintenance.rs
index deb4d2d..02ee8a7 100644
--- a/crates/omnigraph/tests/maintenance.rs
+++ b/crates/omnigraph/tests/maintenance.rs
@@ -843,3 +843,152 @@ async fn cleanup_reconciles_orphaned_branch_forks() {
     .await
     .unwrap();
 }
+
+// Regression (iss-848): a table with rows but NULL vectors (the load-before-
+// embed window) must not abort index building. The vector (IVF) index cannot
+// train on 0 vectors, so `create_vector_index` errors with "KMeans cannot
+// train 1 centroids with 0 vectors". `build_indices_on_dataset_for_catalog`
+// is the chokepoint every caller funnels through (load/mutate via
+// prepare_updates_for_commit, ensure_indices, optimize, schema apply, merge),
+// so per-index fault isolation there must defer that one column (pending) and
+// still build the sibling scalar indexes, instead of propagating the error.
+// This exercises both the load path (which builds indices inline) and the
+// ensure_indices reconciler. Pre-fix this fails at the load step.
+#[tokio::test]
+async fn index_build_tolerates_null_vector_rows() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+    let schema = "node Doc {\n    \
+        slug: String @key\n    \
+        n: I64 @index\n    \
+        embedding: Vector(8)? @index\n\
+        }\n";
+    let mut db = Omnigraph::init(uri, schema).await.unwrap();
+    // Rows present, embeddings null (loaded but not yet embedded).
+    load_jsonl(
+        &mut db,
+        "{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"n\":1}}\n\
+         {\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"n\":2}}",
+        LoadMode::Merge,
+    )
+    .await
+    .expect("load rows with null embeddings");
+
+    // Must not abort: the untrainable vector column is deferred, the sibling
+    // BTREE on `n` still builds.
+    db.ensure_indices().await.expect(
+        "ensure_indices must not abort when a vector column has no trainable vectors yet",
+    );
+}
+
+// iss-848: `optimize` converges declared-but-unbuilt indexes. After an @index is
+// added post-data (a metadata-only apply that defers the physical build), the
+// column is unindexed and reads scan. `optimize` — the operator's reconciler,
+// run on a cron — must materialize it, by composing the ensure_indices
+// reconciler after the compaction sweep. Pre-iss-848 optimize only maintained
+// coverage of EXISTING indexes (optimize_indices) and never created missing ones.
+#[tokio::test]
+async fn optimize_materializes_index_declared_but_unbuilt() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+    let v1 = "node Doc {\n    slug: String @key\n    rank: I32\n}\n";
+    let mut db = Omnigraph::init(uri, v1).await.unwrap();
+    load_jsonl(
+        &mut db,
+        "{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"rank\":1}}\n\
+         {\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"rank\":2}}",
+        LoadMode::Merge,
+    )
+    .await
+    .unwrap();
+
+    // Add @index on `rank` after data exists: a metadata-only apply that defers
+    // the physical build (iss-848), so the column is declared-indexed but unbuilt.
+    let v2 = "node Doc {\n    slug: String @key\n    rank: I32 @index\n}\n";
+    db.apply_schema(v2).await.expect("index-only apply");
+
+    // Precondition: `rank` is declared @index but unbuilt -> reads degrade.
+    {
+        let snap = snapshot_main(&db).await.unwrap();
+        let ds = snap.open("node:Doc").await.unwrap();
+        assert!(
+            matches!(
+                TableStore::key_column_index_coverage(&ds, "rank")
+                    .await
+                    .unwrap(),
+                IndexCoverage::Degraded { .. }
+            ),
+            "rank must be unindexed after the deferred apply"
+        );
+    }
+
+    db.optimize().await.unwrap();
+
+    // Postcondition: optimize's reconciler materialized the declared index.
+    let snap = snapshot_main(&db).await.unwrap();
+    let ds = snap.open("node:Doc").await.unwrap();
+    assert_eq!(
+        TableStore::key_column_index_coverage(&ds, "rank")
+            .await
+            .unwrap(),
+        IndexCoverage::Indexed,
+        "optimize must build the declared-but-unbuilt rank index"
+    );
+}
+
+// iss-848 (PR review): the rename path also defers index building. A RenameType
+// migration writes the renamed table as a new dataset with the existing rows
+// but no indexes (its inline build was removed). optimize must then materialize
+// the declared index on the renamed table.
+#[tokio::test]
+async fn optimize_materializes_index_after_type_rename() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+    let v1 = "node Doc {\n    slug: String @key\n    rank: I32 @index\n}\n";
+    let mut db = Omnigraph::init(uri, v1).await.unwrap();
+    load_jsonl(
+        &mut db,
+        "{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"rank\":1}}\n\
+         {\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"rank\":2}}",
+        LoadMode::Merge,
+    )
+    .await
+    .unwrap();
+
+    // Rename Doc -> Item; rows are preserved on the new table key.
+    let v2 = "node Item @rename_from(\"Doc\") {\n    slug: String @key\n    rank: I32 @index\n}\n";
+    let result = db.apply_schema(v2).await.expect("rename apply");
+    assert!(result.applied);
+    assert_eq!(
+        count_rows(&db, "node:Item").await,
+        2,
+        "rename must preserve rows"
+    );
+
+    // Post-rename the renamed table's declared rank index is unbuilt (deferred).
+    {
+        let snap = snapshot_main(&db).await.unwrap();
+        let ds = snap.open("node:Item").await.unwrap();
+        assert!(
+            matches!(
+                TableStore::key_column_index_coverage(&ds, "rank")
+                    .await
+                    .unwrap(),
+                IndexCoverage::Degraded { .. }
+            ),
+            "rank must be unindexed immediately after the rename"
+        );
+    }
+
+    db.optimize().await.unwrap();
+
+    let snap = snapshot_main(&db).await.unwrap();
+    let ds = snap.open("node:Item").await.unwrap();
+    assert_eq!(
+        TableStore::key_column_index_coverage(&ds, "rank")
+            .await
+            .unwrap(),
+        IndexCoverage::Indexed,
+        "optimize must build the renamed table's deferred rank index"
+    );
+}
diff --git a/crates/omnigraph/tests/schema_apply.rs b/crates/omnigraph/tests/schema_apply.rs
index cc0cae2..508451a 100644
--- a/crates/omnigraph/tests/schema_apply.rs
+++ b/crates/omnigraph/tests/schema_apply.rs
@@ -736,3 +736,108 @@ edge Knows: Person -> Person {
     // current contract, the data is *unreachable* via omnigraph
     // (no manifest entry), which is the user-facing guarantee.
 }
+
+// Regression (bug 3 / dev-graph iss-848): a `Vector @index` on a 0-row table
+// must not abort an otherwise-valid schema apply. A vector (IVF) index trains
+// k-means centroids over the column's vectors, so Lance cannot build it on 0
+// vectors — it errors with "Creating empty vector indices with train=False is
+// not yet implemented". When a *later* migration touches that table (here, an
+// unrelated scalar `@index` on `body`), schema apply reconciles the table's
+// whole index set, which previously tried to materialize the dormant vector
+// index and aborted the entire migration (all-or-nothing). The build is now
+// deferred (pending) when the column is untrainable, instead of failing the
+// migration. The dormant index is materialized by a later `ensure_indices` /
+// `optimize` once the table has rows. Full decoupling — intent recorded at
+// apply, an async reconciler converges physical coverage — is iss-848.
+#[tokio::test]
+async fn apply_schema_defers_vector_index_on_empty_table() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+
+    // init does not build indices, so the declared-but-unbuilt vector index
+    // sits harmless on the empty table (this is how it survived earlier
+    // applies that never touched the table).
+    // `slug` is the user @key; omnigraph injects its own internal `id` column,
+    // so the key field must not be named `id`.
+    let v1 = "node Doc {\n    \
+        slug: String @key\n    \
+        body: String?\n    \
+        embedding: Vector(8) @index\n\
+        }\n";
+    let mut db = Omnigraph::init(uri, v1).await.unwrap();
+
+    // Add an *unrelated* scalar @index on `body`. This routes Doc through
+    // schema apply's index reconcile, which must NOT abort on the untrainable
+    // empty vector index.
+    let v2 = "node Doc {\n    \
+        slug: String @key\n    \
+        body: String? @index\n    \
+        embedding: Vector(8) @index\n\
+        }\n";
+    let result = db.apply_schema(v2).await.expect(
+        "schema apply must succeed: an empty-table vector @index is deferred, not fatal",
+    );
+    assert!(result.applied, "the scalar @index change must apply");
+
+    // The deferred vector index is not dropped — once the table has a
+    // trainable vector, `ensure_indices` materializes it without error. (If
+    // the guard wrongly skipped a non-empty column, this would still be
+    // unindexed; if it wrongly tried to build on empty, the apply above would
+    // have failed.)
+    load_jsonl(
+        &mut db,
+        r#"{"type":"Doc","data":{"slug":"d1","body":"hello","embedding":[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8]}}"#,
+        LoadMode::Merge,
+    )
+    .await
+    .expect("loading a Doc with an embedding must succeed");
+    db.ensure_indices()
+        .await
+        .expect("the deferred vector index must build once the table has a trainable vector");
+}
+
+// iss-848: adding an `@index` to an existing column is a pure metadata change.
+// Schema apply records the intent (the catalog/IR now declares the index) but
+// must NOT build the index inline, so the table's data and manifest version are
+// untouched. The physical index is materialized later by ensure_indices /
+// optimize. Pre-iss-848 the indexed_tables block built the index inline and
+// bumped the table version.
+#[tokio::test]
+async fn index_only_constraint_apply_touches_no_table_data() {
+    let dir = tempfile::tempdir().unwrap();
+    let uri = dir.path().to_str().unwrap();
+    let v1 = "node Doc {\n    slug: String @key\n    n: I64\n}\n";
+    let mut db = Omnigraph::init(uri, v1).await.unwrap();
+    load_jsonl(
+        &mut db,
+        r#"{"type":"Doc","data":{"slug":"d1","n":1}}"#,
+        LoadMode::Merge,
+    )
+    .await
+    .expect("load a Doc");
+
+    let before = db
+        .snapshot_of(ReadTarget::branch("main"))
+        .await
+        .unwrap()
+        .entry("node:Doc")
+        .unwrap()
+        .table_version;
+
+    // Add an @index on the existing `n` column.
+    let v2 = "node Doc {\n    slug: String @key\n    n: I64 @index\n}\n";
+    let result = db.apply_schema(v2).await.expect("index-only apply must succeed");
+    assert!(result.applied, "the @index addition must apply");
+
+    let after = db
+        .snapshot_of(ReadTarget::branch("main"))
+        .await
+        .unwrap()
+        .entry("node:Doc")
+        .unwrap()
+        .table_version;
+    assert_eq!(
+        before, after,
+        "adding an @index must not bump the table version (no inline index build)"
+    );
+}
diff --git a/docs/dev/invariants.md b/docs/dev/invariants.md
index 878adfe..dd802b1 100644
--- a/docs/dev/invariants.md
+++ b/docs/dev/invariants.md
@@ -15,6 +15,38 @@ Use it this way:
 - Keep implementation ledgers, roadmap detail, and historical MR notes in the
   per-area docs. This file is the filter, not the encyclopedia.
 
+## Governing principle: logical contract over physical state
+
+The hard invariants below are instances of one rule. Keep it in view whenever
+a change touches the boundary between what the graph *means* and how it is
+physically stored.
+
+> **Logical state is the contract. Physical state — index coverage, fragment
+> layout, compaction versions, staged writes — is derived, rebuildable, and may
+> be produced asynchronously. A physical operation must never fail a logical
+> one. Preconditions are checked against logical state; physical reconciliation
+> is idempotent and may lag or retry. Genuine logical conflicts still fail
+> loudly: the licence to lag covers physical convergence, not correctness.**
+
+Invariants that instantiate it: **2** (manifest-atomic visibility) and **5**
+(recovery is part of the commit protocol) — a partially-written physical layer
+never changes what a graph commit means; **7** (indexes are derived state) — a
+query is correct under partial index coverage, and expensive index work
+converges from manifest state instead of gating the write path; **13** (failures
+bounded and observable) — the licence to lag is not a licence to drop, so a
+physical step that cannot make progress is surfaced, not swallowed. Deny-list
+items that enforce it: synchronous inline vector/FTS index rebuilds on the
+commit path; state that drifts from Lance or the manifest when it can be
+derived; job queues for manifest-derivable state where a reconciler fits.
+
+The failure shape it rules out: a legitimate background operation on the
+physical layer (compaction, an index build, an interrupted staged write) is
+allowed to break a logical operation (a query's correctness, a migration's
+success, a branch's writability). The smell to watch for is a logical operation
+whose precondition is a *physical* fact — a cached file version, an index's
+existence, a fragment count. Make the precondition logical and let a reconciler
+converge the physical state.
+
 ## Hard Invariants
 
 1. **Respect the substrate.** Lance owns columnar storage, per-dataset
@@ -105,7 +137,7 @@ Use it this way:
 | Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema/index.md), [execution.md](execution.md) |
 | Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema/index.md) |
 | Storage trait | `TableStorage` (via `db.storage()`) is staged-only; the inline-commit residuals (`delete_where`, `create_vector_index`) are split onto a separate sealed `InlineCommitResidual` trait reached via `db.storage_inline_residual()` (MR-854), so §1 holds by construction; capability/stat surfaces are roadmap | [writes.md](writes.md), [architecture.md](architecture.md) |
-| Index lifecycle | Index *creation* per `@index`/`@key` property is dispatched by type (enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) via `node_prop_index_kind`; index *coverage maintenance* exists — `optimize` runs Lance `optimize_indices` after compaction to fold appended/rewritten fragments into existing indexes (still an explicit maintenance call, not yet a background reconciler) | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) |
+| Index lifecycle | `@index`/`@key` declares *intent*; the physical index is derived state and never fails a logical op. `schema apply` builds no indexes (records intent only; index-only changes touch no table data). `load`/`mutate` build inline through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) that fault-isolates an untrainable Vector column into a *pending* index instead of aborting. `optimize`/`ensure_indices` is the reconciler: it creates declared-but-missing indexes and folds appended/rewritten fragments into existing ones (`optimize_indices`), reporting still-pending columns. Explicit maintenance call, not yet a background loop | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) |
 | Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/queries/index.md) |
 | Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/operations/server.md), [policy.md](../user/operations/policy.md) |
 | Tests | Tempdir-backed Lance tests are the current substrate; the storage adapter has an in-memory backend for adapter-level contract tests, but Lance datasets bypass it | [testing.md](testing.md) |
diff --git a/docs/dev/testing.md b/docs/dev/testing.md
index 7e181e1..38b81f2 100644
--- a/docs/dev/testing.md
+++ b/docs/dev/testing.md
@@ -29,7 +29,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
 | `point_in_time.rs` | Snapshots, time travel (`snapshot_at_version`, `entity_at`) |
 | `changes.rs` | `diff_between` / `diff_commits` |
 | `consistency.rs` | Cross-table snapshot isolation, atomic publish |
-| `schema_apply.rs` | Migration plan + apply, schema-apply lock |
+| `schema_apply.rs` | Migration plan + apply, schema-apply lock; index materialization deferred to the reconciler (iss-848): `apply_schema_defers_vector_index_on_empty_table` (an empty-table Vector `@index` never aborts the apply) and `index_only_constraint_apply_touches_no_table_data` (adding an `@index` is metadata-only — no table-version bump) |
 | `search.rs` | FTS / vector / hybrid (`bm25`, `nearest`, `rrf`) |
 | `traversal.rs` | `Expand`, variable-length hops, anti-join (CSR path — `OMNIGRAPH_TRAVERSAL_MODE` unset) |
 | `traversal_indexed.rs` | BTREE-indexed Expand (`execute_expand_indexed`) forced via `OMNIGRAPH_TRAVERSAL_MODE`, asserted semantically equal to the CSR path; own binary, all `#[serial]` so env writes never race |
@@ -42,7 +42,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
 | `lance_version_columns.rs` | Per-row `_row_last_updated_at_version` behavior |
 | `validators.rs` | Schema constraint enforcement (enum, range, unique, cardinality) across JSONL, insert, update paths |
 | `policy_engine_chassis.rs` | Engine-layer Cedar enforcement (MR-722): allow + deny through every `_as` writer via the SDK directly — no HTTP — proving embedded and CLI callers hit the same gate as the server, with action × scope shapes matching `authorize_request` |
-| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice |
+| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice; the index reconciler (iss-848): `index_build_tolerates_null_vector_rows` (an untrainable Vector column defers instead of aborting the build, sibling indexes still build) and `optimize_materializes_index_declared_but_unbuilt` (optimize creates a declared-but-deferred index) |
 | `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature). Includes the five per-writer Phase B → recovery integration tests (`recovery_rolls_forward_after_finalize_publisher_failure`, `schema_apply_phase_b_failure_recovered_on_next_open`, `branch_merge_phase_b_failure_recovered_on_next_open`, `ensure_indices_phase_b_failure_recovered_on_next_open`, `optimize_phase_b_failure_recovered_on_next_open`) and the write-entry in-process heal contract (the four `*_after_finalize_publisher_failure_heals_without_reopen` tests — load, mutation, schema apply, branch merge: a follow-up write on the same handle rolls a sidecar-covered residual forward without reopen/refresh) and the storage-fault matrix for the sidecar lifecycle (`recovery.sidecar_{write,delete,list}` / `recovery.record_audit` failpoints: Phase A put failure aborts with zero drift, Phase D delete failure is swallowed and healed by the next write, list failures are loud at heal and open, audit-append failures are retried to exactly one audit row; plus the bucket-gated `s3_load_recovers_after_publisher_failure_without_reopen`). |
 | `recovery.rs` | Open-time recovery sweep — sidecar I/O, classifier dispatch (NoMovement / RolledPastExpected / UnexpectedAtP1 / UnexpectedMultistep / InvariantViolation), all-or-nothing decision, roll-forward via `ManifestBatchPublisher::publish`, roll-back via `Dataset::restore`, audit row in `_graph_commit_recoveries.lance`, `OpenMode::ReadOnly` skip path |
 | `composite_flow.rs` | Compositional/narrative end-to-end stories — multi-step flows that compose mechanics covered by other test files. Catches integration regressions where individual operations all pass their unit tests but their composition breaks (sequential merges, post-merge main writes, time-travel through merge DAG, reopen consistency over multi-merge histories, post-optimize and post-cleanup strict writes). |
diff --git a/docs/user/operations/maintenance.md b/docs/user/operations/maintenance.md
index 87688d6..161e5d6 100644
--- a/docs/user/operations/maintenance.md
+++ b/docs/user/operations/maintenance.md
@@ -7,11 +7,12 @@
 - Compacts every node + edge table on `main`, then reindexes them, then **publishes the resulting version to the `__manifest`** so the manifest's recorded version tracks the compacted-and-reindexed state. Reads pin the manifest version, so without this publish the work would be invisible to readers *and* would break the version precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually changed.
 - Rewrites small fragments into fewer large ones; old fragments remain reachable via older versions until `cleanup` runs.
 - **Reindex (index coverage maintenance).** A scalar/FTS/vector index only covers the fragments it was built over. Rows appended after the index was built (e.g. by `load --mode merge`, whose commit does not rebuild an already-existing index) are scanned unindexed, and compaction itself rewrites fragments out of an index's coverage. `optimize` runs Lance's incremental `optimize_indices` after compaction to fold those fragments back in (a delta merge, not a full retrain), restoring full coverage so equality/range/traversal predicates stay index-accelerated. This is why a table with **no compaction work but stale index coverage still commits** a new version under `optimize`. Run `optimize` on a cadence at least as frequent as your freshness window so recently-loaded rows do not linger in the unindexed flat-scan tail.
+- **Create declared-but-missing indexes (the index reconciler).** `@index`/`@key` declares intent; `schema apply` records it but builds nothing, and `load`/`mutate` defer a column that cannot be built yet (a `Vector` column with no trainable vectors). `optimize` materializes any such declared-but-unbuilt index over the compacted layout — so it is the convergence path for an `@index` added after data exists, or a vector index whose embeddings arrived via a later `embed`. A column still not buildable (no vectors yet) is reported on the table's stat as `pending_indexes` (visible in `--json`), not treated as a failure; the next `optimize` retries. So `optimize` is the single operator-facing index reconciler: it compacts, restores coverage, **and** builds declared-but-missing indexes.
 - Each table's compact→reindex→publish serializes with concurrent mutations on the same table. A crash mid-operation is recovered automatically on the next open (both compaction and reindex are content-preserving, so roll-forward is always safe).
 - **Requires a recovered graph.** `optimize` refuses (errors) when a pending crash-recovery operation is present — operating on an unrecovered graph could publish a partial write that recovery would roll back. Reopen the graph to run recovery, then re-run `optimize`.
 - **Uncovered drift is skipped, not interpreted.** If a table's underlying version is ahead of the version recorded in `__manifest` and no crash-recovery record covers that movement, `optimize` reports `skipped: DriftNeedsRepair` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift.
 - Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
-- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version`.
+- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version, pending_indexes` (the last lists any declared `@index` column the reconciler could not build this run, with the reason — e.g. a vector column with no trainable vectors yet).
 - **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: BlobColumnsUnsupportedByLance` (and logged) instead of compacted, and the rest of the sweep proceeds normally. **Reads and writes are unaffected** — only compaction is. Consequence: fragment count and deleted-row space on blob tables are not reclaimed; query results are never affected. A skipped blob table is also **not reindexed** in the same sweep (the skip happens before the reindex step), so its index coverage on appended rows is not refreshed by `optimize` today.
 
 ## `repair` — explicit
diff --git a/docs/user/search/indexes.md b/docs/user/search/indexes.md
index ea65a6f..57935cd 100644
--- a/docs/user/search/indexes.md
+++ b/docs/user/search/indexes.md
@@ -27,7 +27,8 @@ list/`Blob` columns → none.
 
 ## L2 — OmniGraph orchestration
 
-- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted indexes for the current head; safe to re-run.
+- **`@index`/`@key` declares intent; the physical index is derived state.** A migration records the declaration in the catalog/IR and never fails on it — `schema apply` builds **no** indexes (adding an `@index` to an existing column is a pure metadata change that touches no table data). `load`/`mutate` build declared indexes inline as part of the write, but a column that can't be built yet (a `Vector` column with no trainable vectors — IVF k-means needs ≥1 vector, e.g. rows loaded before `embed` runs) is left **pending**, not fatal. Reads stay correct meanwhile: a missing/partial index degrades to a scan (vector search to brute-force). A later `ensure_indices`/`optimize` materializes the pending index once it is buildable. This mirrors how LanceDB builds indexes asynchronously and serves unindexed rows by brute-force.
+- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted + vector indexes for the current head; safe to re-run; returns the columns it had to defer as pending. `optimize` runs it after compaction, so the maintenance cron is the convergence path for deferred indexes.
 - Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state.
 - **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork.
 - Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column.