From 5eb47b8c135e0a22680dd327321af00ddb0c9fdb Mon Sep 17 00:00:00 2001
From: Claude
Date: Wed, 29 Apr 2026 07:56:18 +0000
Subject: [PATCH] docs: surface MR-766 publisher OCC in storage / errors / constants

- storage.md: document the row-level CAS annotation on `__manifest.object_id` and the `expected_table_versions` OCC contract on `ManifestBatchPublisher::publish`.
- errors.md: list `ManifestConflictDetails` and its variants alongside `ManifestError`.
- constants.md: add `PUBLISHER_RETRY_BUDGET = 5`.

Per AGENTS.md "Maintenance contract": new schema construct, new constant, and new typed error shape all need to ship with the source change.
---
 docs/constants.md | 1 +
 docs/errors.md    | 4 +++-
 docs/storage.md   | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/docs/constants.md b/docs/constants.md
index b10aba0..8c329b1 100644
--- a/docs/constants.md
+++ b/docs/constants.md
@@ -7,6 +7,7 @@
 | Run registry dir | `_graph_runs.lance` | `db/run_registry.rs` |
 | Run branch prefix | `__run__` | `db/run_registry.rs` |
 | Schema apply lock | `__schema_apply_lock__` | `db/mod.rs` |
+| Manifest publisher retry budget | `PUBLISHER_RETRY_BUDGET = 5` | `db/manifest/publisher.rs` |
 | Merge stage batch | `MERGE_STAGE_BATCH_ROWS = 8192` | `exec/merge.rs` |
 | Maintenance concurrency | `OMNIGRAPH_MAINTENANCE_CONCURRENCY=8` | `db/omnigraph/optimize.rs` |
 | Graph index cache size | `8` (LRU) | `runtime_cache.rs` |
diff --git a/docs/errors.md b/docs/errors.md
index 4a86a5b..257ae4c 100644
--- a/docs/errors.md
+++ b/docs/errors.md
@@ -6,7 +6,9 @@
 - `Lance(String)` — storage layer
 - `DataFusion(String)` — execution layer
 - `Io(io::Error)`
-- `Manifest(ManifestError { kind: BadRequest|NotFound|Conflict|Internal, … })`
+- `Manifest(ManifestError { kind: BadRequest|NotFound|Conflict|Internal, details: Option, … })`
+  - `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }` — caller's `expected_table_versions` did not match the manifest's current latest non-tombstoned version (set by `OmniError::manifest_expected_version_mismatch`).
+  - `ManifestConflictDetails::RowLevelCasContention` — Lance row-level CAS rejected the publish because a concurrent writer landed the same `object_id`. Retried internally by the publisher; only surfaces if the retry budget exhausts.
 - `MergeConflicts(Vec)`
 
 Compiler-side `NanoError` covers parse / catalog / type / storage / plan / execution / arrow / lance / IO / manifest / unique-constraint, each with structured spans (`SourceSpan { start, end }`) for ariadne-style diagnostics.
diff --git a/docs/storage.md b/docs/storage.md
index 21e73b4..db58de4 100644
--- a/docs/storage.md
+++ b/docs/storage.md
@@ -29,6 +29,8 @@ OmniGraph is **not** a single Lance dataset; it is a *graph* of datasets coordin
 - `table_branch` is `null` for the main lineage and the branch name otherwise
 - **Snapshot reconstruction**: latest visible `table_version` per `(table_key, table_branch)` minus tombstones — rows where `object_type = table_tombstone`, whose own `table_version` (acting as the tombstone version) is `>= the entry's table_version`.
 - **Atomic publish**: multi-dataset commits publish via a `ManifestBatchPublisher` so a single write to `__manifest` flips all the new sub-table versions visible at once.
+- **Row-level CAS on the merge-insert join key**: `object_id` carries `lance-schema:unenforced-primary-key=true` so Lance's bloom-filter conflict resolver rejects two concurrent commits that land the same `object_id` row. Without this annotation, Lance's transparent rebase would admit silent duplicates of `version:T@v=N` from racing publishers (see `.context/merge-insert-cas-granularity.md`).
+- **Optimistic concurrency control on publish**: `ManifestBatchPublisher::publish` accepts a `expected_table_versions: HashMap` map. Each entry asserts the manifest's current latest non-tombstoned version for that table is exactly what the caller observed; mismatches surface as `OmniError::Manifest` with `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }`. Empty map preserves the legacy "best-effort publish" semantics. The publisher uses `conflict_retries(0)` against Lance and owns retry itself (`PUBLISHER_RETRY_BUDGET = 5`), re-running the pre-check on each iteration so concurrent advances surface as `ExpectedVersionMismatch` rather than being silently rebased through.
 
 ## URI scheme support (`storage.rs`)