2026-04-10 20:49:41 +03:00
[workspace]
resolver = "2"
members = [
    "crates/omnigraph-compiler",
    "crates/omnigraph",
    "crates/omnigraph-cli",
2026-06-13 17:03:20 +03:00
    "crates/omnigraph-api-types",
2026-06-08 20:07:39 +03:00
    "crates/omnigraph-cluster",
policy: chassis core — omnigraph-policy crate + Omnigraph::enforce() (MR-722) (#102)
PR #2 of the policy chassis series (PR #1 = MR-731, merged in #101).
The structural fix that moves Cedar enforcement from HTTP-only to
engine-wide. apply_schema is the proof-of-concept writer; PR #3 fans
the enforce() call out to the remaining six (mutate_as, load,
ingest_as, branch_create_from, branch_delete, branch_merge).
## What lands
### New crate: omnigraph-policy
The 844-line policy.rs moves from `omnigraph-server` into a new
`omnigraph-policy` workspace crate so both engine and server can
depend on it. Cedar dependency moves with it. The server's policy.rs
becomes a re-export shim (`pub use omnigraph_policy::*`) so existing
`omnigraph_server::PolicyAction` etc. paths keep working — CLI and
test consumers don't have to migrate in one go.
### New trait: PolicyChecker
```rust
pub trait PolicyChecker: Send + Sync {
    fn check(&self, action: PolicyAction, scope: &ResourceScope,
             actor: &str) -> Result<(), PolicyError>;
}
```
`PolicyEngine` (Cedar-backed) implements it. `Omnigraph::with_policy()`
takes `Arc<dyn PolicyChecker>`. Engine tests mock the trait without
spinning up Cedar. MR-725 will extend the trait with `predicate_for()`
for query-layer pushdown — additive, no call-site changes.
### New enum: ResourceScope
Four variants — Graph, Branch, TargetBranch, BranchTransition —
mapping cleanly to today's `(branch, target_branch)` shape on
PolicyRequest via `to_branch_pair()`. Each engine writer picks the
variant that matches the existing HTTP-layer convention so engine
and HTTP evaluate the same Cedar decision.
**Invariant**: ResourceScope stays at branch granularity. Per-type
and per-row scope are MR-725's territory, not engine-layer's.
Adding Type/Row variants here creates two places per-type policy
can be evaluated, which can drift. See chassis design refinements
comment on MR-722 (2026-05-17).
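A minimal sketch of how the four variants could collapse onto the
`(branch, target_branch)` pair. The variant payloads and the exact mapping
below are assumptions for illustration; the commit only names the variants
and `to_branch_pair()`.

```rust
/// Hypothetical shape of the scope enum. The payload types are assumed,
/// not taken from the omnigraph-policy crate.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ResourceScope {
    /// Whole-graph action; no branch involved.
    Graph,
    /// Action on a single source branch.
    Branch(String),
    /// Action whose policy target is a branch (e.g. schema apply on "main").
    TargetBranch(String),
    /// Action that moves state from one branch to another (e.g. a merge).
    BranchTransition { from: String, to: String },
}

impl ResourceScope {
    /// Collapse the scope to a `(branch, target_branch)` pair.
    /// Stays at branch granularity: there is no type or row component.
    pub fn to_branch_pair(&self) -> (Option<&str>, Option<&str>) {
        match self {
            ResourceScope::Graph => (None, None),
            ResourceScope::Branch(b) => (Some(b.as_str()), None),
            ResourceScope::TargetBranch(t) => (None, Some(t.as_str())),
            ResourceScope::BranchTransition { from, to } => {
                (Some(from.as_str()), Some(to.as_str()))
            }
        }
    }
}
```

Because every variant reduces to the same pair, an engine writer and an HTTP
handler that pick the same variant would hand Cedar the same request shape.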
### Omnigraph::with_policy() + enforce()
* New `policy: Option<Arc<dyn PolicyChecker>>` field on Omnigraph,
None by default (preserves embedded/dev no-enforcement mode).
* `with_policy(self, checker)` setter — builder-style, consumes self.
* `enforce(action, scope, actor)` — the gate. When policy is None,
no-op. When policy is Some AND actor is None, hard error — silent
bypass via "I forgot the actor" is exactly the footgun this gate
is here to prevent.
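The three gate outcomes described above can be sketched as follows. This is
an illustration under stated assumptions, not the engine's code: `Engine`
stands in for `Omnigraph`, the scope is reduced to a `&str`, and the
`AllowOnly` checker and the `PolicyError` variants are hypothetical.

```rust
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PolicyAction {
    SchemaApply,
}

/// Hypothetical error shape; the real variants are not shown in the commit.
#[derive(Debug, PartialEq, Eq)]
pub enum PolicyError {
    Denied(String),
    MissingActor,
}

/// Simplified trait: the scope is a plain branch name here.
pub trait PolicyChecker: Send + Sync {
    fn check(&self, action: PolicyAction, scope: &str, actor: &str)
        -> Result<(), PolicyError>;
}

/// Test double that permits exactly one actor, so no Cedar is needed.
pub struct AllowOnly(pub &'static str);

impl PolicyChecker for AllowOnly {
    fn check(&self, _action: PolicyAction, _scope: &str, actor: &str)
        -> Result<(), PolicyError> {
        if actor == self.0 {
            Ok(())
        } else {
            Err(PolicyError::Denied(actor.to_string()))
        }
    }
}

/// Stand-in for the engine handle; policy is `None` by default.
#[derive(Default)]
pub struct Engine {
    policy: Option<Arc<dyn PolicyChecker>>,
}

impl Engine {
    /// Builder-style setter that consumes `self`.
    pub fn with_policy(mut self, checker: Arc<dyn PolicyChecker>) -> Self {
        self.policy = Some(checker);
        self
    }

    /// The gate: no policy is a no-op, policy without an actor is a hard
    /// error, and otherwise the decision is delegated to the checker.
    pub fn enforce(&self, action: PolicyAction, scope: &str, actor: Option<&str>)
        -> Result<(), PolicyError> {
        match (&self.policy, actor) {
            (None, _) => Ok(()),
            (Some(_), None) => Err(PolicyError::MissingActor),
            (Some(checker), Some(actor)) => checker.check(action, scope, actor),
        }
    }
}
```

Matching on the `(policy, actor)` pair keeps the "forgot the actor" case an
explicit arm rather than something that falls through to a permit.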
### apply_schema_as: first writer wired
* New public method `apply_schema_as(source, options, actor)` that
calls `enforce(SchemaApply, TargetBranch("main"), actor)` before
acquiring the schema-apply lock or doing any other work.
* Existing `apply_schema(source)` and `apply_schema_with_options(...)`
delegate to it with actor=None (no-actor variants).
* HTTP handler `server_schema_apply` updated to call apply_schema_as
with the resolved actor. AppState construction injects the
PolicyEngine into Omnigraph via `with_policy`. HTTP-layer
authorize_request still fires first; the engine gate is the
redundant-but-correct backstop and the only path that protects SDK
/ embedded callers. PR #3 removes the HTTP redundancy.
### OmniError::Policy
New error variant for engine-layer policy denial / evaluation
failure. ApiError::from_omni maps it to 403.
### MR-724 Admin action — Option A reservation
PolicyAction::Admin kept in the enum with a load-bearing doc
comment naming its future consumers (hot reload, audit log query,
approvals list per MR-726 / MR-732 / MR-734). No enforce(Admin, ...)
call site exists yet — the variant is reserved so the action
vocabulary is complete from chassis day one. MR-724 closes when
the first consumer surface ships.
### New SDK-side integration test
`crates/omnigraph/tests/policy_engine_chassis.rs` — four tests
covering:
* Policy denies for unauthorized actor → OmniError::Policy
* Policy permits for authorized actor → apply succeeds
* Policy installed + no actor → hard error (forget-the-actor footgun)
* No policy → no-op (embedded/dev default still works)
These exercise the engine path directly — no HTTP layer involved.
## Test results
- cargo test --workspace --locked --no-fail-fast: 851 passed, 0 failed
  * 45 server tests (existing) pass
  * 14 schema_apply tests (existing) pass
  * 4 new chassis tests pass
  * 60 OpenAPI tests pass (no HTTP API surface changes)
  * No regressions across the workspace
## Architectural decisions baked in
Per MR-722 chassis design refinements comment (2026-05-17):
1. PolicyChecker is a trait, not just a concrete. Engine and server
consume the trait. MR-725 adds predicate_for() additively.
2. ResourceScope stays at branch granularity. No Type/Row variants.
3. Coarse-vs-fine framing pinned: engine-layer is action gate;
query-layer (MR-725) is predicate gate. Both backed by same Cedar
engine; non-overlapping responsibilities.
4. Admin action reserved for policy-management surfaces (MR-724
Option A).
## Pending follow-ups (PR #3+)
- Fan-out enforce() to mutate_as, load, ingest_as, branch_create_from,
branch_delete, branch_merge (PR #3).
- Remove HTTP-layer authorize_request redundancy once engine gate
covers all writers (PR #3).
- CLI policy injection into Omnigraph for non-`policy validate|test|explain`
subcommands (PR #3 or follow-up).
- MR-723 default-deny 3-state matrix (PR #4).
- MR-736 severity warn/deny (PR #5).
- AGENTS.md scope-of-enforcement rewrite once chassis fully lands.
- Coarse-vs-fine framing in docs/user/policy.md.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 00:36:36 +03:00
    "crates/omnigraph-policy",
2026-04-10 20:49:41 +03:00
    "crates/omnigraph-server",
]
default-members = [
    "crates/omnigraph",
    "crates/omnigraph-cli",
    "crates/omnigraph-server",
]

[workspace.dependencies]
chore(lance): bump 4.0.0 → 6.0.1 (DataFusion 52→53, Arrow 57→58) (#111)
* tests: add lance_surface_guards pre-flight pins for the v6 bump
Land 8 named guards in a new test file that pin Lance API surfaces
OmniGraph relies on. Each guard turns a silent-break risk (variant
rename, struct restructure, async-flip) into a red CI bar instead of
runtime drift.
Guards (mapped to the silent-break inventory from the v6 migration plan):
Runtime (#[tokio::test]):
1. lance_error_too_much_write_contention_variant_exists — pins the
variant referenced by db/manifest/publisher.rs::map_lance_publish_error.
2. manifest_location_field_shape — pins .path/.size/.e_tag/.naming_scheme
types and ManifestLocation accessor returning &Self (the access
pattern at db/manifest/metadata.rs:84-88).
6. write_params_default_does_not_set_storage_version — confirms our
explicit V2_2 pin remains load-bearing (blob v2 requirement).
Compile-only async fns (#[allow(...)] + unimplemented!() placeholders;
never run, but cargo build --tests enforces the API shape):
3. checkout_version + restore chain — pins the recovery rollback hammer
at db/manifest/recovery.rs:505-522.
4. DatasetBuilder::from_namespace().with_branch().with_version().load()
— pins the namespace builder chain at db/manifest/namespace.rs:162-174.
5. MergeInsertBuilder fluent chain — pins the manifest CAS at
db/manifest/publisher.rs:370-391, including the return shape
(Arc<Dataset>, MergeStats).
7. compact_files(&mut ds, CompactionOptions, None) — pins
db/omnigraph/optimize.rs:107.
8. DeleteResult { new_dataset, num_deleted_rows } — pins the inline
delete result shape (MR-A will repurpose this guard to the staged
two-phase variant once Lance #6658 migration lands).
This is commit 1 of the chore/lance-6.0.1 migration. Cargo bump
follows in commit 2 (will trigger the guards under v6 if any surface
drifted).
Per the migration plan at ~/.claude/plans/shimmering-percolating-duckling.md
(written this session). Two guards from the plan deferred to follow-up:
- manifest_cas_returns_row_level_contention_variant (full publisher
race integration test — needs harness scaffolding)
- table_version_metadata_byte_compatible_with_v4 (TableVersionMetadata
is pub(crate); requires test reach extension).
Verified on v4: cargo test -p omnigraph-engine --test lance_surface_guards
passes 3/3 runtime tests; cargo build -p omnigraph-engine --tests
compiles all 5 compile-only guards clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(deps): bump Lance 4.0.0 → 6.0.1, DataFusion 52 → 53, Arrow 57 → 58
The Cargo bump itself. Source is intentionally untouched — this commit
will not compile. The compile errors are the work-list for subsequent
commits on this branch.
Lance updates: lance + 7 sub-crates 4.0.0 → 6.0.1. Transitive churn:
+ lance-tokenizer v6.0.1 (vendored tokenizer per Lance PR #6512)
+ object_store 0.13.x (Lance 6 brings it transitively; our explicit
pin stays at 0.12.5 for now — revisit in stages if diamond bites)
- tantivy* crates (replaced by lance-tokenizer)
Compile error landscape on this commit (11 errors):
• 1× E0432: `lance_index::DatasetIndexExt` import (Lance PR #6280
moved it to lance::index). Sites: table_store.rs:20,
db/manifest.rs:37 (the second site was missed by the pre-flight
inventory).
• 8× E0599: `create_index_builder` / `load_indices` missing on
`lance::Dataset` — all downstream of the DatasetIndexExt move.
Once the import is corrected on table_store.rs and db/manifest.rs,
these resolve automatically.
• 2× E0063: missing field `is_only_declared` in `DescribeTableResponse`
initializer at db/manifest/namespace.rs:221, 364. New Lance
namespace field per the v5 namespace restructure (PR #6186).
Surface guards (lance_surface_guards.rs, commit d571fa8) all still
compile + the 3 runtime ones pass on v6 — none of the silent-break
surfaces drifted. That's the load-bearing observation: the publisher
CAS chain, ManifestLocation field shape, checkout_version/restore,
DatasetBuilder fluent chain, MergeInsertBuilder return shape,
WriteParams::default, compact_files signature, and DeleteResult
fields are all v6-stable.
Next commits address the 11 errors per the migration plan stages
3-8.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* imports: move DatasetIndexExt to lance::index (Lance PR #6280)
Lance 5.0 (PR #6280) moved `DatasetIndexExt` out of `lance-index` into
`lance::index`. `is_system_index` and `IndexType` stayed in `lance-index`.
Mechanical update of 6 import sites:
crates/omnigraph/src/table_store.rs:20 — split into two `use` lines
crates/omnigraph-server/tests/server.rs:10 — was traits::DatasetIndexExt
crates/omnigraph/tests/search.rs:6
crates/omnigraph/tests/branching.rs:7
crates/omnigraph/tests/failpoints.rs:467
crates/omnigraph-cli/tests/cli.rs:3 — was traits::DatasetIndexExt
All 9 E0599 cascading errors on .create_index_builder / .load_indices
resolve once the trait is back in scope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* namespace: add is_only_declared field to DescribeTableResponse
Lance namespace 6.0.0 added `is_only_declared: Option<bool>` to
`DescribeTableResponse` (lance-namespace-reqwest-client 0.7+ via the
v5.0 namespace API restructure, Lance PR #6186). Set to `Some(false)`
because every table BranchManifestNamespace returns from describe_table
is materialized — the manifest snapshot only includes entries for
tables we've already opened via Dataset::open.
Two sites in db/manifest/namespace.rs (BranchManifestNamespace +
StagedTableNamespace impls of LanceNamespace::describe_table).
Closes the last two compile errors from the v6 bump in the engine lib.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* cargo: add lance to omnigraph-cli + omnigraph-server dev-deps
Stage 3 moved DatasetIndexExt imports from `lance-index` to `lance::index`
in the cli and server test crates. Both crates only had `lance-index`
in their dev-dependencies; add `lance` alongside so the new path
resolves.
This is the last compile-error fix from the v6 bump — `cargo build
--workspace --tests` is now green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: refresh Lance alignment audit for v6.0.1; bump surveyed version
Per CLAUDE.md maintenance rule 2 (same-PR docs):
- docs/dev/lance.md: replace the v4.0.1 alignment audit stanza with
the v6.0.1 audit. Captures every v5/v6 finding from this PR (the
DatasetIndexExt move, DescribeTableResponse.is_only_declared,
MergeInsertBuilder return shape, ManifestLocation field shape,
LanceFileVersion::default flip, file-reader async, tokenizer
vendor, Lance #6658/#6666/#6877 status). Cross-references each
guard in tests/lance_surface_guards.rs.
- AGENTS.md: bump "Storage substrate: Lance 4.x" → "Lance 6.x".
Note: surveyed crate version stays at 0.4.2 — substrate version
bumps are independent of OmniGraph's release version.
- crates/omnigraph/src/storage_layer.rs: update the trait module-level
doc-comment to reflect that Lance #6658 closed 2026-05-14 and
delete_where two-phase migration is MR-A (the next follow-up).
#6666 stays open; create_vector_index inline residual stays.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* tests: silence clippy::diverging_sub_expression on compile-only guards
The five `_compile_*` async fns in lance_surface_guards.rs use
`let ds: Dataset = unimplemented!()` as a placeholder so type inference
can chase the method chain we want to pin, without ever running the
function. Clippy's `diverging_sub_expression` lint flags this pattern
because the RHS diverges; that's the entire point. Added to the
per-fn `#[allow(...)]` list, alongside dead_code / unreachable_code /
unused_variables / unused_mut already there.
No behavior change. cargo test -p omnigraph-engine --test
lance_surface_guards still 3/3 green.
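A minimal sketch of the compile-only guard pattern, for readers who have not
seen it. `Dataset` here is a local stand-in, not Lance's type, and the guard
is synchronous to keep the example small; the guards in the commit are async
fns. All names are hypothetical.

```rust
/// Stand-in for an upstream type whose method chain we want to pin.
pub struct Dataset {
    version: u64,
}

impl Dataset {
    pub fn open(version: u64) -> Dataset {
        Dataset { version }
    }

    pub fn checkout_version(&self, version: u64) -> Result<Dataset, String> {
        Ok(Dataset { version })
    }

    pub fn version(&self) -> u64 {
        self.version
    }
}

/// Never called. It exists only so the compiler type-checks the chain:
/// if `checkout_version` changes its arguments or return type, the build
/// of the test target fails instead of the drift surfacing at runtime.
#[allow(
    dead_code,
    unreachable_code,
    unused_variables,
    unused_mut,
    clippy::diverging_sub_expression
)]
fn _compile_checkout_chain() {
    // The placeholder diverges, which is the point: it gives type
    // inference a `Dataset` to work with without constructing one.
    let ds: Dataset = unimplemented!();
    let restored: Dataset = ds.checkout_version(3).unwrap();
    let _version: u64 = restored.version();
}
```

The explicit type annotations on each `let` are what pin the return shapes;
without them, a changed return type could still infer and compile.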
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: correct #6658 status — closed but API ships in Lance v7.x, not v6.0.1
The audit stanza in docs/dev/lance.md and the storage_layer.rs trait
doc-comment both implied the public DeleteBuilder::execute_uncommitted
API shipped with Lance 6.0.1. It did not. Issue #6658 closed
2026-05-14, but binary search across the release stream confirms:
    v6.0.1          ❌ no pub async fn execute_uncommitted on DeleteBuilder
    v6.1.0-rc.1     ❌
    v7.0.0-beta.5   ❌
    v7.0.0-beta.10  ✅ first appearance
    v7.0.0-rc.1     ✅
So MR-A (delete two-phase migration) is gated on the Lance v7.x bump,
not on this PR. v7.0.0-rc.1 dropped 2026-05-21; GA likely within a
week.
No behavior change. Doc-only correction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci(lib): bump recursion_limit to 256 — Lance 6 trait depth on Linux
Lance 6's heavier trait surface around futures/streams in storage_layer.rs's
staged-write API pushes the rustc trait-resolution recursion limit past
the default 128 on Linux builds. CI on PR #111 surfaced this in both
`Test Workspace` and `Test omnigraph-server --features aws`:
    error: queries overflow the depth limit!
      = help: consider increasing the recursion limit by adding a
        `#![recursion_limit = "256"]` attribute to your crate (`omnigraph`)
      = note: query depth increased by 130 when computing layout of
        `{async block@crates/omnigraph/src/storage_layer.rs:697:5: 697:10}`
(The async block is `stage_create_btree_index`'s body — its return type
is several layers of `impl Future<Output=Result<StagedHandle>>` deep on
top of Lance's own builder return types.)
Local macOS builds happened to short-circuit before tripping the limit,
which is why this didn't surface during the v6 bump sequence. The fix
rustc itself suggests is one line at the crate root.
No behavior change. Revisit if a future Lance bump stops needing it.
Verified: `cargo build --locked -p omnigraph-server --features aws`
compiles clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 00:42:29 +01:00
arrow-array = "58"
arrow-ipc = "58"
arrow-schema = "58"
arrow-select = "58"
arrow-cast = { version = "58", features = ["prettyprint"] }
arrow-ord = "58"
2026-04-10 20:49:41 +03:00
exec/query: structured Expr pushdown via Scanner::filter_expr (unblocks CompOp::Contains) (#113)
* exec/query: pushdown IR filters via DataFusion Expr (Scanner::filter_expr)
Switches `execute_node_scan` from string-flattened Lance SQL pushdown
(`build_lance_filter` + `scanner.filter(&str)`) to structured DataFusion
Expr pushdown (`build_lance_filter_expr` + `scanner.filter_expr(Expr)`).
## What this enables
1. **`CompOp::Contains` now pushes down.** `ir_filter_to_sql` returned
`None` for list-contains (the comment said *"Can't pushdown list
contains"*) because string SQL can't easily express it. With Expr,
it lowers to DataFusion's `array_has(col, value)` builtin via the
`nested_expressions` feature, and pushes down to Lance's scan layer
the same way Eq/Lt/etc. do. Pinned by the new regression test
`end_to_end::ir_filter_with_list_contains_pushes_down`.
2. **DataFusion 53's optimizer rules now reach our predicates.** Once
the Expr lands at the Lance scanner, DF's planner runs:
- `IN`-list vectorized eq kernel (DF #20528)
- `PhysicalExprSimplifier` (DF #20111)
- CASE WHEN x THEN y ELSE NULL shortcut (DF #20097)
- Push limit into hash join (DF #20228)
None of these were applicable before because the string SQL path
short-circuited the optimizer.
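To illustrate why the structured path can carry list-contains while the
string path could not, here is a toy lowering. Everything in it is an
assumption made for the example: the `Expr` enum is a local stand-in for
DataFusion's `Expr`, and `IrFilter`, `filter_to_sql`, and `filter_to_expr`
are hypothetical simplifications of the helpers named in this commit.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Int(i64),
    Str(String),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CompOp {
    Eq,
    Lt,
    Contains,
}

#[derive(Debug, Clone, PartialEq)]
pub struct IrFilter {
    pub column: String,
    pub op: CompOp,
    pub value: Literal,
}

/// Toy structured expression; a stand-in, not DataFusion's type.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(String),
    Literal(Literal),
    Binary { left: Box<Expr>, op: &'static str, right: Box<Expr> },
    Call { func: &'static str, args: Vec<Expr> },
}

/// String path: gives up on list-contains, so that filter is not pushed down.
pub fn filter_to_sql(f: &IrFilter) -> Option<String> {
    let rhs = match &f.value {
        Literal::Int(i) => i.to_string(),
        Literal::Str(s) => format!("'{}'", s.replace('\'', "''")),
    };
    match f.op {
        CompOp::Eq => Some(format!("{} = {}", f.column, rhs)),
        CompOp::Lt => Some(format!("{} < {}", f.column, rhs)),
        CompOp::Contains => None,
    }
}

/// Structured path: every operator lowers, with contains becoming a
/// function call node named after the `array_has` builtin.
pub fn filter_to_expr(f: &IrFilter) -> Expr {
    let col = Box::new(Expr::Column(f.column.clone()));
    let lit = Box::new(Expr::Literal(f.value.clone()));
    match f.op {
        CompOp::Eq => Expr::Binary { left: col, op: "=", right: lit },
        CompOp::Lt => Expr::Binary { left: col, op: "<", right: lit },
        CompOp::Contains => Expr::Call { func: "array_has", args: vec![*col, *lit] },
    }
}
```

The structured function is total over `CompOp`, which is the property the
string path lacked.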
## Scope
This is one of three string-flattened pushdown sites; the other two
(`hydrate_nodes`/Expand pushdown at query.rs:771-796 and the mutation
delete path in `exec/mutation.rs::predicate_to_sql`) stay on the SQL
string path for now:
- The Expand pushdown still serializes through `hydrate_nodes`'s
`extra_filter_sql: Option<&str>` parameter. Migrating it changes the
`TableStorage` trait surface (`scan_stream(filter: Option<&str>)` →
`Option<Expr>`) and the cascading call sites — out of scope for this
MR.
- The mutation delete predicate still goes through `Dataset::delete(&str)`
in Lance 6.0.1. MR-A (delete two-phase via Lance #6658, gated on the
Lance v7 bump per issue #112) will migrate that path to
`DeleteBuilder::execute_uncommitted` taking an Expr.
The existing `ir_filter_to_sql` / `ir_expr_to_sql` / `literal_to_sql`
helpers stay in place to serve the remaining string-SQL consumers
(mutation predicates). They get retired when the other call sites
migrate.
## Cargo
Enables the `nested_expressions` feature on the `datafusion` workspace
dep. Lance already pulls in `datafusion-functions-nested` transitively
(it's listed in their feature set), so this just exposes the
`datafusion::functions_nested::expr_fn::array_has` re-export. No
transitive dep change (Cargo.lock unchanged).
## Tests
- New: `ir_filter_with_list_contains_pushes_down` — pins the case that
was previously impossible (`ir_filter_to_sql` returning `None`).
- 906/906 workspace tests still pass.
- 417/417 engine integration tests pass (was 416 + the new one).
- 19/19 failpoints (recovery canary).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: pin rustfs/rustfs to 1.0.0-beta.3 (last known-good before creds-policy break)
The RustFS S3 Integration job started failing 2026-05-23 with all 3
tests panicking on the first PUT:
HTTP error: error sending request
The "Dump RustFS logs on failure" step revealed the container was
dying at startup:
[FATAL] Server encountered an error and is shutting down:
Default root credentials are not allowed on non-loopback listeners;
set RUSTFS_ACCESS_KEY and RUSTFS_SECRET_KEY to non-default values,
bind to loopback, or set RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true
for local development only
`rustfs/rustfs:latest` was updated 2026-05-21 (1.0.0-beta.4) with a
credentials-policy check that rejects `rustfsadmin`/`rustfsadmin` as
"default" values. PR #111 passed yesterday because it ran against
beta.3; today's runs against beta.4 fail at container startup.
This is unrelated to PR #113's Expr-pushdown refactor — the bump
just happened to hit the same week.
Pin to 1.0.0-beta.3 (2026-05-14, last tag before the change). The
right long-term fix is one of:
- Rotate the CI creds to less-default values (less coupling to
RustFS's "default" set definition)
- Set `RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true` per the
error message
- Use a workflow service container with controlled lifecycle
Deferred — pinning is the minimal restore. Also incidentally
documents *which* version we tested against, which `:latest` never
did.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 12:47:33 +01:00
datafusion = { version = "53", default-features = false, features = ["nested_expressions"] }
2026-05-23 00:42:29 +01:00
datafusion-physical-plan = "53"
datafusion-physical-expr = "53"
datafusion-execution = "53"
datafusion-common = "53"
datafusion-expr = "53"
datafusion-functions-aggregate = "53"
2026-04-10 20:49:41 +03:00
build(deps): bump Lance 6.0.1 → 7.0.0 (correct-by-design substrate alignment) (#229)
* build(deps): bump Lance 6.0.1 → 7.0.0 (object_store 0.13.2, roaring 0.11.4)
Arrow stays 58 and DataFusion stays 53 (no change). The only transitive bump
is object_store 0.12.5 → 0.13.2. 141 upstream commits reviewed; no fixes lost
(the 6.0.x release-branch backports are all forward-ported into 7.0.0).
- object_store 0.13 moved get/put/head/rename/delete behind a new ObjectStoreExt
trait (list/list_with_delimiter/put_opts stay on the core trait). Add
`use object_store::ObjectStoreExt` in storage.rs and db/manifest/namespace.rs;
no call-site changes. Mirrors Lance's own migration in PR #6672.
- roaring pinned to 0.11.4 (cargo update -p roaring --precise 0.11.4). Lance
7.0.0's UpdatedFragmentOffsets newtype (lance#6650) derives Eq over
HashMap<u64, RoaringBitmap>, which needs RoaringBitmap: Eq, added in roaring
0.11.4; the loose `roaring = "0.11"` constraint otherwise resolves 0.11.3 and
lance itself fails to compile.
- lance#6774: merge-insert INSERT rows now stamp _row_created_at_version with the
commit version (was a fallback of 1). Flip the lance_version_columns assertion
to `== v2` and correct the changes/mod.rs rationale comment. Production
change-detection keys on _row_last_updated_at_version + ID membership, so its
logic is unaffected.
Refs lance#6650, lance#6774, lance#6672.
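The roaring pin above comes down to how `#[derive(Eq)]` propagates bounds. A minimal std-only sketch follows, assuming a stand-in `Bitmap` type in place of `roaring::RoaringBitmap` and a hypothetical `UpdatedOffsets` newtype in place of Lance's `UpdatedFragmentOffsets`; neither is the real definition.

```rust
use std::collections::HashMap;

// Stand-in for roaring::RoaringBitmap (hypothetical, std-only).
// Dropping `Eq` from this derive makes the derive on `UpdatedOffsets`
// below fail to compile, because HashMap<K, V>: Eq requires V: Eq.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Bitmap(pub Vec<u32>);

// Hypothetical newtype mirroring the shape described in the commit message:
// a derive of Eq over a map from u64 to bitmap.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UpdatedOffsets(pub HashMap<u64, Bitmap>);

/// Compares two offset maps; only compiles because `Bitmap: Eq` holds.
pub fn same_offsets(a: &UpdatedOffsets, b: &UpdatedOffsets) -> bool {
    a == b
}
```

With a bitmap type that implements only `PartialEq`, the outer derive is rejected at compile time, which matches the failure mode the message attributes to resolving roaring 0.11.3.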
* fix(storage): pin WriteParams::auto_cleanup = None (lance#6755 default flip)
lance#6755 flipped the WriteParams::auto_cleanup default from on (a full cleanup
pass every 20th commit) to None. On 6.0.1 the on-by-default hook could silently
GC versions that __manifest pins for snapshots/time-travel. OmniGraph owns
cleanup explicitly (optimize.rs::cleanup_all_tables) and never set auto_cleanup,
so it was relying on a default that is both wrong for our snapshot model and now
changed upstream.
Pin auto_cleanup: None explicitly at all 11 production WriteParams sites
(table_store ×6, commit_graph ×2, recovery_audit ×1, manifest/graph ×2 — the
__manifest + sub-table Create paths). Removes the dependency on a default-flag
value and locks in the snapshot-safe behavior regardless of future upstream
re-flips.
Refs lance#6755.
* test(lance): pin BTREE range-boundary correctness (lance#6796)
lance#6796 (issue #6792) fixed a BTREE scalar-index range-query bound
inclusiveness bug: `x <= hi AND x > lo` returned the wrong boundary row.
Add lance_surface_guards.rs::btree_range_query_boundary_is_correct, which
reproduces the exact #6792 shape (5 rows + an explicit BTREE drives the index
path even on tiny data) and pins the corrected inclusive-<= / exclusive->
semantics. It turns red if a future Lance regression reintroduces the bug.
OmniGraph today builds BTREE only on string @key columns and queries them by
equality/IN, so its current patterns do not hit this; the guard protects any
future BTREE-range path (BTREE-on-properties, range-on-key).
Refs lance#6796.
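As a reference for the bound semantics the guard pins, here is a brute-force filter a test could compare an index result against. It is a sketch only: the function name and the five sample values are assumptions, not the fixture used in `lance_surface_guards.rs`.

```rust
/// Reference semantics for `x > lo AND x <= hi`:
/// the lower bound is exclusive and the upper bound is inclusive.
pub fn range_exclusive_lo_inclusive_hi(rows: &[i64], lo: i64, hi: i64) -> Vec<i64> {
    rows.iter().copied().filter(|&x| x > lo && x <= hi).collect()
}
```

For rows `1..=5` with `lo = 2` and `hi = 4`, the row equal to `lo` is dropped and the row equal to `hi` is kept, so the expected result is `[3, 4]`.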
* docs(dev): align Lance docs + invariants to 7.0.0
- docs/dev/lance.md: new 2026-06-14 alignment stanza for the 6.0.1 → 7.0.0 bump
(object_store ObjectStoreExt move, roaring 0.11.4, #6774/#6796/#6755 behavior,
#6658 shipped → MR-A unblocked but separate, #6666 + blob compaction still
open); prior 6.0.1 stanza demoted to historical.
- AGENTS.md: storage substrate 6.x → 7.x (line + architecture diagram).
- docs/dev/invariants.md: deletes/vector known gap updated — the staged
two-phase delete API (lance#6658) now exists and MR-A is unblocked, but
delete_where stays inline and D2 stays in place until the migration lands;
create_vector_index still gated on lance#6666.
* fix(storage): skip Lance auto-cleanup on commit paths for legacy datasets
Addresses PR #229 review (Codex P1). `WriteParams::auto_cleanup` is create-time
config with no effect on existing datasets (Lance write.rs docs), so the previous
`auto_cleanup: None` change alone did NOT protect graphs created before the v7
bump: 6.0.1 defaulted auto_cleanup ON, leaving `lance.auto_cleanup.*` config on
those datasets, and Lance's per-commit hook (io/commit.rs: `if
!commit_config.skip_auto_cleanup`) fires off that stored config — so omnigraph's
own writes would GC versions the __manifest pins for snapshots/time-travel.
Skip the hook on every commit path, covering new and legacy datasets alike:
- commit_staged: CommitBuilder::with_skip_auto_cleanup(true) — the staged data path.
- __manifest publisher: MergeInsertBuilder::skip_auto_cleanup(true).
- all 11 WriteParams: skip_auto_cleanup: true (direct Dataset::write/append paths;
auto_cleanup: None retained so new datasets store no cleanup config at all).
Tests:
- lance_surface_guards::skip_auto_cleanup_suppresses_version_gc — substrate:
negative control (config GCs v1 without skip) + with-skip survival.
- staged_writes::commit_staged_skips_auto_cleanup_so_pinned_versions_survive —
omnigraph usage: commit_staged on a legacy-config dataset preserves the pinned
create version.
Refs lance#6755.
* test(lance): assert created_at-preserved + updated_at-bumped on merge_insert UPDATE
Addresses PR #229 review follow-up. `lance_merge_insert_update_preserves_created_at_version`
documented (in a comment) that a merge_insert UPDATE preserves created_at and
bumps updated_at, but only asserted the value change — leaving the change-feed
invariant unguarded. Add the two missing assertions:
- bob created_at == v1 (preserved across UPDATE; what the test name promises;
lance#6774 only changed INSERT-row stamping).
- bob updated_at == v2 (bumped to the commit version) — the invariant
OmniGraph's insert/update classification relies on (changes/mod.rs keys on
_row_last_updated_at_version). A regression here would silently drop updates
from the diff/change feed.
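To illustrate why the `updated_at` assertion matters, here is a rough sketch of a classifier keyed on the last-updated version plus ID membership. The types, field names, and function are hypothetical and are not the code in changes/mod.rs; they only model the rule the message describes.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum Change {
    Insert,
    Update,
}

/// Hypothetical row view: an ID plus the value of `_row_last_updated_at_version`.
pub struct Row<'a> {
    pub id: &'a str,
    pub last_updated_at_version: u64,
}

/// Classifies a row relative to a base snapshot (assumed rule):
/// untouched since `base_version` means no change, otherwise membership
/// of the ID in the base snapshot separates updates from inserts.
pub fn classify(row: &Row, base_ids: &HashSet<&str>, base_version: u64) -> Option<Change> {
    if row.last_updated_at_version <= base_version {
        return None;
    }
    if base_ids.contains(row.id) {
        Some(Change::Update)
    } else {
        Some(Change::Insert)
    }
}
```

Under this model, an UPDATE that failed to bump the last-updated version would fall into the first branch and vanish from the feed, which is the regression the new assertion guards against.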
2026-06-14 20:42:24 +02:00
|
|
|
lance = { version = "7.0.0", default-features = false, features = ["aws"] }
|
feat(engine): Stage the delete path; retire the inline-delete residual (#308)
* test(engine): pin zero-row cascade delete must not drift an edge table (red)
A delete <Node> cascades a delete_where into every incident edge type. The
inline delete_where (Dataset::delete) advances Lance HEAD even when zero edges
match, but the cascade records the new version only if deleted_rows > 0 — so a
node with no incident edges leaves edge:Knows HEAD>manifest drift, which trips
the next strict write's ExpectedVersionMismatch and repair refuses it.
Red today: edge:Knows manifest=v5, Lance HEAD=v6. Goes green when delete moves
to the staged two-phase path (iss-950, Lance 7.0 DeleteBuilder::execute_uncommitted),
where a 0-row delete commits no Lance version and the deleted_rows>0 gate becomes
correct by construction.
* fix(engine): a zero-row delete must not advance Lance HEAD
Lance's Dataset::delete commits a new version even when the predicate matches
nothing (build_transaction always emits Operation::Delete), so a node delete
that cascades a delete_where into an incident edge type with no matching edges
advanced that edge table's Lance HEAD while the cascade skipped record_inline
(gated on deleted_rows > 0) — leaving HEAD>manifest drift that wedged the next
strict write and that repair refused as suspicious/unverifiable.
Use Lance 7.0's two-phase DeleteBuilder::execute_uncommitted to read
num_deleted_rows before committing: a no-match delete now advances nothing (no
version, no drift) and the existing deleted_rows>0 gate is correct by
construction. Non-zero deletes commit the staged transaction with
skip_auto_cleanup + affected_rows (parity with the prior inline path).
First step of the staged-delete migration (iss-950); turns the
node_delete_with_no_incident_edges_leaves_no_edge_table_drift regression green.
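The drift described above can be modeled with a toy table. This is a sketch under stated assumptions: `ToyTable` and both methods are hypothetical and do not call Lance; they only contrast "always commit" with "count first, commit only on a match".

```rust
/// Toy stand-in for a versioned table (hypothetical; not a Lance dataset).
pub struct ToyTable {
    pub head: u64,
    pub rows: Vec<i64>,
}

impl ToyTable {
    /// Inline shape: a new version is committed even when nothing matches.
    pub fn delete_inline(&mut self, pred: impl Fn(&i64) -> bool) -> usize {
        let before = self.rows.len();
        self.rows.retain(|r| !pred(r));
        self.head += 1;
        before - self.rows.len()
    }

    /// Two-phase shape: count matches first, commit only when some row matched.
    pub fn delete_staged(&mut self, pred: impl Fn(&i64) -> bool) -> usize {
        let matched = self.rows.iter().filter(|r| pred(*r)).count();
        if matched > 0 {
            self.rows.retain(|r| !pred(r));
            self.head += 1;
        }
        matched
    }
}
```

A caller that records the new version only when the returned count is nonzero stays consistent with `head` in the staged shape, while the inline shape leaves `head` one version ahead after a no-match delete.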
* feat(engine): stage_delete two-phase primitive (MR-A step 0)
Add TableStore::stage_delete (Lance 7.0 DeleteBuilder::execute_uncommitted),
the two-phase analogue of stage_merge_insert: writes deletion files without
advancing Lance HEAD, returns Option<StagedWrite> (None on 0 rows = true no-op),
carrying the deletion-vector updated_fragments as new_fragments and the
superseded originals as removed_fragment_ids so combine_committed_with_staged
makes the deletion visible to in-query reads.
No affected_rows is threaded: like stage_merge_insert's Operation::Update commit,
the staged delete relies on OmniGraph's per-table write queue + manifest CAS, not
Lance's per-dataset conflict resolver (commit_staged is a single attempt).
Flip the two residual guards to the staged path: staged_writes.rs now asserts
stage_delete does NOT advance HEAD and that a staged delete is read-your-writes
visible (the deletion-vector RYW proof D2 retirement depends on); the
lance_surface_guards delete guard pins execute_uncommitted's UncommittedDelete.
No behavior change yet (callers still use delete_where); Step 1 wires them.
* feat(engine): TableStorage::stage_delete + migrate merge delete path (MR-A step 1a)
Add stage_delete/Option<StagedHandle> to the TableStorage trait (delegates to
TableStore::stage_delete). Migrate the two branch_merge delete sites
(three-way RewriteMerged + adopt delta) from the inline delete_where residual to
stage_delete + commit_staged — identical in shape to the stage_merge_insert +
commit_staged pair above each. HEAD still advances within the merge sequence
(via commit_staged), under the unchanged SidecarKind::BranchMerge Phase-B
confirmation; the _pre_delete/_pre_index failpoints fire by position, unchanged.
merge_truth_table, branching, composite_flow green.
* feat(engine): migrate all delete sites to staged path, retire inline delete (MR-A step 1b/1c)
Routes every delete through the staged write path so delete never advances
Lance HEAD inline — the last inline-commit residual on the mutation path is
gone. `MutationStaging` now accumulates delete predicates (`record_delete`)
alongside pending write batches; at end-of-query `stage_all` combines a
table's predicates into one `(p1) OR (p2) …` `stage_delete` (a deletion-vector
transaction, no HEAD advance) and `commit_all` commits it through the same
`commit_staged` path as inserts/updates. Deletes are now ordinary staged
entries: one sidecar pin at `expected + 1`, no inline special-casing.
Migrated callers (all 5): the 3 mutation.rs sites (delete-node, cascade,
delete-edge) and the 2 merge.rs sites (already on stage_delete in step 1a).
`affected_edges`/`affected` move from post-inline-commit `deleted_rows` to a
committed `count_rows` at record time — exact under D₂, bounded by the cascade
working set. A predicate matching zero rows stages nothing (the staged
equivalent of the old "skip record_inline on 0 deleted rows"), so the zero-row
edge-table drift class stays closed by construction.
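The predicate combination step can be sketched as a small string builder. The function name and signature are assumptions for illustration; the real `stage_all` is not shown in this message.

```rust
/// Joins a table's recorded delete predicates into one `(p1) OR (p2) …` filter.
/// Returns None when nothing was recorded, so the caller stages nothing.
pub fn combine_delete_predicates(preds: &[&str]) -> Option<String> {
    if preds.is_empty() {
        return None;
    }
    let parts: Vec<String> = preds.iter().map(|p| format!("({p})")).collect();
    Some(parts.join(" OR "))
}
```

Wrapping each predicate in parentheses keeps operator precedence inside one predicate from leaking into its neighbors once they are joined with `OR`.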
Retired scaffolding now that no caller remains:
- `MutationStaging.inline_committed` + `record_inline` → `delete_predicates` +
`record_delete`; `StagedMutation.inline_committed`/`paths` fields and all the
`commit_all` inline handling (queue keys, sidecar pins with the
`record_inline` table_version special-case, the inline recheck loop).
- `open_table_for_mutation`'s post-inline-commit reopen branch (deletes no
longer advance HEAD mid-query, so a second touch reopens at the pinned
version like any write).
- `InlineCommitResidual::delete_where` + its `TableStore` impl, the orphaned
`TableStore::delete_where`, and `DeleteState`. `InlineCommitResidual` now
carries only `create_vector_index` (Lance #6666 still open).
D₂ stays for now: staged-delete read-your-writes doesn't yet compose into the
pending accumulator (insert-then-delete on one table), so mixed
insert/update/delete in one query is still rejected at parse time. Retiring D₂
is step 2. Doc comments updated to match across exec/, storage_layer, db/.
Tests (all green): writes, consistency, validators, end_to_end, composite_flow,
merge_truth_table, maintenance, recovery, staged_writes, forbidden_apis,
lance_surface_guards, changes, point_in_time (286), plus failpoints (63).
* docs: delete is a staged write, not an inline-commit residual (MR-A step 1)
Update the docs that described `delete` as the inline-commit residual now that
MR-A routes it through `stage_delete`. Always-loaded surfaces (AGENTS.md rule
4 / capability matrix, invariants.md Invariant 4 / truth matrix / known gaps)
plus the dev write-path docs (writes.md, execution.md incl. its mutation
sequence diagram, architecture.md) now state: deletes accumulate as predicates
and stage like inserts/updates, no inline HEAD advance; `InlineCommitResidual`
carries only `create_vector_index` (Lance #6666). The parse-time D₂ rule is
documented as retained — not because delete inline-commits, but because
staged-delete read-your-writes is not yet wired into the pending accumulator
(MR-A step 2). lance.md's 7.0 audit note marked MR-A as landed.
* docs: D₂ is a deliberate boundary, not temporary scaffolding (MR-A close-out)
After MR-A staged the delete path, D₂ (a mutation query is insert/update-only
OR delete-only) was left framed as temporary — "until Lance ships two-phase
delete" / "retire in step 2". Lance shipped that and we used it for the
inline-commit fix; D₂'s original justification is gone. It now stands for a
different, permanent reason: keeping a query to one kind keeps its
read-your-writes unambiguous and each table to one version per query. Retiring
it would buy single-commit mixed atomicity (cheap workaround: split, or a
branch) at the cost of an in-query delete view, pending pruning, edge
id-resolution, and two-commit-per-table ordering in the hot mutation path —
complexity not worth earning. Decision: keep D₂ as a deliberate boundary.
Reframes the now-stale wording everywhere, no logic change:
- The D₂ parse-time error message no longer promises "this restriction lifts
when Lance exposes a two-phase delete API"; it states the boundary and points
to a branch+merge for one atomic commit.
- `enforce_no_mixed_destructive_constructive` doc, AGENTS.md, invariants.md
(Invariant 4 / truth matrix / removed from the known-gaps), writes.md,
architecture.md, lance.md, and the user mutations doc (which wrongly said
deletes "commit through a different path" — both stage now).
- Swept remaining stale `delete_where` mentions left from the Step-1 migration:
the merge.rs "swap when upstream ships" comments (already swapped), the
forbidden_apis / table_ops residual notes, the staged_writes vector-index
guard doc (was "same as stage_delete's absence" — stage_delete now exists),
and test comments/assert messages in recovery/maintenance/writes/failpoints.
Genuinely-historical records (dated Lance audit, rfc-013, bug-case-fix) left.
Verified: engine builds warning-free; check-agents-md OK; writes/maintenance/
recovery/staged_writes/forbidden_apis all green. Closes MR-A.
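For readers unfamiliar with D₂, the rule can be sketched as a single pass over a query's statement kinds. `StatementKind`, `check_single_kind`, and the error text are hypothetical; this is not the body of `enforce_no_mixed_destructive_constructive`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StatementKind {
    Insert,
    Update,
    Delete,
}

/// Accepts a query that is insert/update-only or delete-only and
/// rejects one that mixes the two kinds (the D₂ boundary).
pub fn check_single_kind(stmts: &[StatementKind]) -> Result<(), String> {
    let has_delete = stmts.iter().any(|s| *s == StatementKind::Delete);
    let has_write = stmts.iter().any(|s| *s != StatementKind::Delete);
    if has_delete && has_write {
        // Hypothetical wording; the real message is described only in prose above.
        Err("a mutation query is insert/update-only or delete-only; \
             split it, or use a branch and merge for one atomic commit"
            .to_string())
    } else {
        Ok(())
    }
}
```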
* test(engine): overlapping delete predicates must not double-count affected_* (red)
Reproduces a reporting regression from the staged-delete migration flagged in
PR #308 review. Because deletes now stage (instead of inline-committing), two
delete statements in one query both scan the same unchanged committed snapshot;
counting each predicate independently over-reports `affected_*` when they
overlap. The old inline path committed each delete before the next ran, so it
counted distinct.
`delete Person where name = "Alice"` then `delete Person where age > 29` over
the standard fixture (Alice 30, Charlie 35) removes 2 distinct nodes and 3
distinct edges, but the buggy per-statement counting returns 3 nodes / 6 edges.
RED at this commit (asserts left=3, right=2).
* fix(engine): dedup overlapping delete predicates when counting affected_*
Count each delete statement against the committed snapshot MINUS the predicates
a prior delete statement on the same table already recorded:
`(pred) AND NOT ((prior1) OR (prior2) …)`. Summed over statements this
partitions the union into disjoint pieces, `Σₙ |pₙ \ (p₁ ∪ … ∪ pₙ₋₁)| = |p₁ ∪ p₂ ∪ …|`,
which is exactly the distinct count the combined `(p1) OR (p2)` staged delete removes. Works for nodes and
edges alike with no edge identity needed; the node ID scan uses the same
exclusion so a later statement also doesn't re-cascade already-deleted nodes.
The ORIGINAL predicate is still what gets recorded (the staged delete removes
the union); only the count uses the exclusion. The common single-delete path is
unchanged (`prior` empty → filter is just the base predicate).
New helper `dedup_delete_filter` + `MutationStaging::recorded_delete_predicates`.
Turns the red regression test green (2 nodes / 3 edges); writes (33),
end_to_end, validators, maintenance, recovery, composite_flow, merge_truth_table,
consistency, changes, and failpoints (63) all stay green.
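The counting difference can be checked on in-memory rows. This sketch assumes non-NULL data and uses plain Rust predicates instead of SQL filters; `Person`, `Pred`, and every function here are hypothetical, and the Bob row is invented to pad out the fixture.

```rust
pub struct Person {
    pub name: &'static str,
    pub age: i64,
}

/// A delete predicate evaluated against one row.
pub type Pred = fn(&Person) -> bool;

pub fn is_alice(p: &Person) -> bool {
    p.name == "Alice"
}

pub fn older_than_29(p: &Person) -> bool {
    p.age > 29
}

/// Per-statement counting: every predicate scans the same snapshot,
/// so a row matched by two statements is counted twice.
pub fn naive_count(rows: &[Person], preds: &[Pred]) -> usize {
    let mut total = 0;
    for pred in preds {
        for row in rows {
            if pred(row) {
                total += 1;
            }
        }
    }
    total
}

/// Exclusion counting: statement i skips rows that any earlier statement matched,
/// so the sum equals the number of distinct rows in the union.
pub fn dedup_count(rows: &[Person], preds: &[Pred]) -> usize {
    let mut total = 0;
    for (i, pred) in preds.iter().enumerate() {
        for row in rows {
            let already_counted = preds[..i].iter().any(|prior| prior(row));
            if pred(row) && !already_counted {
                total += 1;
            }
        }
    }
    total
}
```

With Alice (30), Bob (25), and Charlie (35), the two predicates give a naive count of 3 and a deduplicated count of 2, matching the node numbers quoted in the red test.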
* test(engine): delete dedup must not drop NULL-column rows (red)
Follow-up to the overlapping-delete fix flagged in PR #308 review (Greptile P1):
the `(base) AND NOT (prior)` exclusion breaks under SQL three-valued logic. If a
prior delete predicate references a NULLable column, a later statement's
matching row whose column is NULL makes `prior` evaluate to UNKNOWN, `NOT
UNKNOWN` is UNKNOWN, and the row is filtered out of the scan — even though the
prior delete never matched it. That drops it from `deleted_ids`, skipping its
cascade (orphaned edges) or, if it is the only match, leaving the node
undeleted. A data bug, not just a miscount.
Data: Charlie(age 35), Zoe(age NULL); Knows Zoe→Charlie. `delete Person where
age > 30` then `delete Person where name = "Zoe"`. Under the buggy `NOT`, Zoe's
scan `(name='Zoe') AND NOT (age>30)` is UNKNOWN → Zoe survives. RED at this
commit (Person count left=1, right=0).
* fix(engine): NULL-safe delete dedup — exclude only definitely-matched prior rows
Change `dedup_delete_filter` from `(base) AND NOT (prior)` to
`(base) AND ((prior) IS NOT TRUE)`. `IS NOT TRUE` keeps both FALSE and UNKNOWN
rows, so a prior predicate that evaluates to SQL UNKNOWN (a NULL in a referenced
column) no longer drops a row this statement legitimately matches — only rows a
prior predicate matched as definitely TRUE are excluded from the count/scan. The
distinct-count semantics are unchanged for non-NULL data.
Turns the red NULL-dedup test green (Zoe deleted, her edge cascaded), and the
overlapping-dedup + writes/end_to_end/validators/maintenance/recovery/
composite_flow/consistency suites stay green.
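The difference between `NOT` and `IS NOT TRUE` can be modeled with `Option<bool>`, where `None` stands for SQL UNKNOWN. Everything below is a sketch: the helper names are assumptions, and `dedup_filter_sketch` is a hypothetical string builder that follows the filter shape quoted above, not the real `dedup_delete_filter` signature.

```rust
/// SQL three-valued truth value: Some(true), Some(false), or None for UNKNOWN.
pub type Tri = Option<bool>;

pub fn and3(a: Tri, b: Tri) -> Tri {
    match (a, b) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    }
}

/// NOT UNKNOWN stays UNKNOWN.
pub fn not3(a: Tri) -> Tri {
    a.map(|v| !v)
}

/// IS NOT TRUE never yields UNKNOWN: FALSE and UNKNOWN both become TRUE.
pub fn is_not_true(a: Tri) -> Tri {
    Some(a != Some(true))
}

/// A WHERE clause keeps a row only when the filter is TRUE.
pub fn where_keeps(filter: Tri) -> bool {
    filter == Some(true)
}

/// `col > lit` over a nullable column.
pub fn gt(col: Option<i64>, lit: i64) -> Tri {
    col.map(|v| v > lit)
}

/// Builds `(base) AND ((prior…) IS NOT TRUE)`; with no prior predicates
/// the filter is just the base predicate.
pub fn dedup_filter_sketch(base: &str, prior: &[&str]) -> String {
    if prior.is_empty() {
        return base.to_string();
    }
    let parts: Vec<String> = prior.iter().map(|p| format!("({p})")).collect();
    let union = parts.join(" OR ");
    format!("({base}) AND (({union}) IS NOT TRUE)")
}
```

For Zoe (age NULL), `and3(Some(true), not3(gt(None, 30)))` is UNKNOWN and the row is dropped, while `and3(Some(true), is_not_true(gt(None, 30)))` is TRUE and the row is kept. A row the prior predicate definitely matched still evaluates to FALSE and is excluded.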
* docs(engine): note dedup_delete_filter's load-bearing dependency on D₂
Self-review follow-up: the overlapping-delete dedup assumes the committed
snapshot is invariant across a query's statements, which holds only because D₂
forbids mixing writes with deletes (so a delete-touched table has no pending
writes). Make that dependency explicit at the function so a future D₂ relaxation
is forced to revisit the dedup. Comment-only.
* Preserve staged write commit metadata
2026-06-27 16:48:41 +02:00
|
|
|
lance-core = "7.0.0"
|
build(deps): bump Lance 6.0.1 → 7.0.0 (correct-by-design substrate alignment) (#229)
2026-06-14 20:42:24 +02:00
|
|
|
lance-datafusion = "7.0.0"
|
|
|
|
|
lance-file = "7.0.0"
|
|
|
|
|
lance-index = "7.0.0"
|
|
|
|
|
lance-linalg = "7.0.0"
|
|
|
|
|
lance-namespace = "7.0.0"
|
|
|
|
|
lance-namespace-impls = "7.0.0"
|
|
|
|
|
lance-table = "7.0.0"
|
2026-04-10 20:49:41 +03:00
|
|
|
|
|
|
|
|
ulid = "1"
|
|
|
|
|
futures = "0.3"
|
|
|
|
|
async-trait = "0.1"
|
2026-04-25 14:22:14 +03:00
|
|
|
chrono = { version = "0.4", default-features = false, features = ["clock"] }
|
2026-04-10 20:49:41 +03:00
|
|
|
pest = "2"
|
|
|
|
|
pest_derive = "2"
|
|
|
|
|
thiserror = "2"
|
|
|
|
|
tokio = { version = "1", features = ["rt-multi-thread", "macros", "time", "net", "signal", "sync"] }
|
feat(cli): plane-grouped --help + clap 4.6.1 (RFC-010 Slice 2) (#220)
* chore(deps): bump clap to 4.6.1
Workspace constraint "4" → "4.6" so the resolver picks up the 4.6 line
(a plain `cargo update` stayed on 4.5.x). clap 4.5.58 → 4.6.1
(clap_builder 4.6.0, clap_derive 4.6.1). Minor bump, no API breakage; the
workspace builds and all CLI suites pass unchanged.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
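Why the constraint change matters: a bare version in Cargo is a caret requirement, so "4" admits any 4.x release while "4.6" sets 4.6.0 as the floor. A minimal sketch of that rule follows, assuming a nonzero major version and ignoring pre-release tags; the function is hypothetical and is not Cargo's resolver.

```rust
/// Caret matching for a requirement with a nonzero major version,
/// such as "4" (req_minor = None) or "4.6" (req_minor = Some(6)).
/// `version` is (major, minor, patch); the patch floor is taken as 0.
pub fn caret_allows(req_major: u64, req_minor: Option<u64>, version: (u64, u64, u64)) -> bool {
    let (major, minor, _patch) = version;
    major == req_major && minor >= req_minor.unwrap_or(0)
}
```

Under "4", both 4.5.58 and 4.6.1 are admissible, so an existing lockfile entry on 4.5.x remains valid; under "4.6", 4.5.58 no longer satisfies the requirement and the resolver has to move to the 4.6 line.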
* feat(cli): group --help by plane (RFC-010 Slice 2)
Slice 1 declared the planes (the command_plane table + the wrong-plane
guard); this makes them visible in `--help`. clap can't print labeled
heading rows between subcommand groups (verified against the source —
help_heading is args-only, {subcommands} is one flat block), so per the
chosen approach: cluster + legend.
- Reorder the `Command` enum into plane bands (clap lists subcommands in
declaration order): data (query, mutate, load, branch, snapshot, export,
commit, schema, graphs) → storage/local-graph ops (init, optimize,
repair, cleanup, lint, queries) → control (cluster) → session (policy,
embed, login, logout, config, version). No magic display_order numbers —
the source order IS the help order, with band comments for readers. The
band placement matches `command_plane` (lint/queries are storage-plane:
they reject --server), so the help grouping and the guard agree.
- Add an `after_help` legend on `Cli` naming the planes. Written to
describe the planes (not enumerate every command) so it doesn't drift.
Help-polish (post-review): hide the deprecated `ingest` from the list
(still a valid command); trim the long `login` and `--as` descriptions to
one line each so the columns don't blow up.
The behavioral source of truth for planes stays `planes::command_plane`;
this ordering is its cosmetic counterpart.
Test: `help_groups_commands_by_plane` pins the legend phrase + the cluster
ordering (query < optimize < cluster). Doc: a line under cli-reference's
*Command planes* section.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* feat(cli): qualify mixed-plane commands in the --help legend
Addresses the Greptile P2 on #220: the legend placed `schema` entirely in
Data and `queries` entirely in Storage, but per `command_plane` the
subcommands differ — `schema plan` is storage-plane (rejects --server) and
`queries list` is session (no graph). A user reading the legend then running
`schema plan --server` would hit a rejection contradicting it. The Commands
list is one entry per top-level command (necessarily coarse), so the legend
carries the nuance: `schema [plan: storage]` and `queries [list: session]`.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 01:49:40 +03:00
|
|
|
clap = { version = "4.6", features = ["derive"] }
|
2026-04-10 20:49:41 +03:00
|
|
|
serde = { version = "1", features = ["derive"] }
|
|
|
|
|
serde_json = "1"
|
|
|
|
|
serde_yaml = "0.9"
|
|
|
|
|
tracing = "0.1"
|
|
|
|
|
tracing-subscriber = { version = "0.3", features = ["env-filter", "fmt"] }
|
|
|
|
|
tower = "0.5"
|
|
|
|
|
tower-http = { version = "0.6", features = ["trace"] }
|
|
|
|
|
color-eyre = "0.6"
|
|
|
|
|
tempfile = "3"
|
|
|
|
|
ahash = "0.8"
|
2026-05-07 15:25:22 +02:00
|
|
|
arc-swap = "1"
|
2026-04-10 20:49:41 +03:00
|
|
|
base64 = "0.22"
|
|
|
|
|
ariadne = "0.4"
|
|
|
|
|
regex = "1"
|
|
|
|
|
reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls"] }
|
build(deps): bump Lance 6.0.1 → 7.0.0 (correct-by-design substrate alignment) (#229)
* build(deps): bump Lance 6.0.1 → 7.0.0 (object_store 0.13.2, roaring 0.11.4)
Arrow stays 58 and DataFusion stays 53 (no change). The only transitive bump
is object_store 0.12.5 → 0.13.2. 141 upstream commits reviewed; no fixes lost
(the 6.0.x release-branch backports are all forward-ported into 7.0.0).
- object_store 0.13 moved get/put/head/rename/delete behind a new ObjectStoreExt
trait (list/list_with_delimiter/put_opts stay on the core trait). Add
`use object_store::ObjectStoreExt` in storage.rs and db/manifest/namespace.rs;
no call-site changes. Mirrors Lance's own migration in PR #6672.
- roaring pinned to 0.11.4 (cargo update -p roaring --precise 0.11.4). Lance
7.0.0's UpdatedFragmentOffsets newtype (lance#6650) derives Eq over
HashMap<u64, RoaringBitmap>, which needs RoaringBitmap: Eq, added in roaring
0.11.4; the loose `roaring = "0.11"` constraint otherwise resolves 0.11.3 and
lance itself fails to compile.
- lance#6774: merge-insert INSERT rows now stamp _row_created_at_version with the
commit version (was a fallback of 1). Flip the lance_version_columns assertion
to `== v2` and correct the changes/mod.rs rationale comment. Production
change-detection keys on _row_last_updated_at_version + ID membership, so its
logic is unaffected.
Refs lance#6650, lance#6774, lance#6672.
* fix(storage): pin WriteParams::auto_cleanup = None (lance#6755 default flip)
lance#6755 flipped the WriteParams::auto_cleanup default from on (a full cleanup
pass every 20th commit) to None. On 6.0.1 the on-by-default hook could silently
GC versions that __manifest pins for snapshots/time-travel. OmniGraph owns
cleanup explicitly (optimize.rs::cleanup_all_tables) and never set auto_cleanup,
so it was relying on a default that is both wrong for our snapshot model and now
changed upstream.
Pin auto_cleanup: None explicitly at all 11 production WriteParams sites
(table_store ×6, commit_graph ×2, recovery_audit ×1, manifest/graph ×2 — the
__manifest + sub-table Create paths). Removes the dependency on a default-flag
value and locks in the snapshot-safe behavior regardless of future upstream
re-flips.
Refs lance#6755.
* test(lance): pin BTREE range-boundary correctness (lance#6796)
lance#6796 (issue #6792) fixed a BTREE scalar-index range-query bound
inclusiveness bug: `x <= hi AND x > lo` returned the wrong boundary row.
Add lance_surface_guards.rs::btree_range_query_boundary_is_correct, which
reproduces the exact #6792 shape (5 rows + an explicit BTREE drives the index
path even on tiny data) and pins the corrected inclusive-<= / exclusive->
semantics. It turns red if a future Lance regression reintroduces the bug.
OmniGraph today builds BTREE only on string @key columns and queries them by
equality/IN, so its current patterns do not hit this; the guard protects any
future BTREE-range path (BTREE-on-properties, range-on-key).
Refs lance#6796.
* docs(dev): align Lance docs + invariants to 7.0.0
- docs/dev/lance.md: new 2026-06-14 alignment stanza for the 6.0.1 → 7.0.0 bump
(object_store ObjectStoreExt move, roaring 0.11.4, #6774/#6796/#6755 behavior,
#6658 shipped → MR-A unblocked but separate, #6666 + blob compaction still
open); prior 6.0.1 stanza demoted to historical.
- AGENTS.md: storage substrate 6.x → 7.x (line + architecture diagram).
- docs/dev/invariants.md: deletes/vector known gap updated — the staged
two-phase delete API (lance#6658) now exists and MR-A is unblocked, but
delete_where stays inline and D2 stays in place until the migration lands;
create_vector_index still gated on lance#6666.
* fix(storage): skip Lance auto-cleanup on commit paths for legacy datasets
Addresses PR #229 review (Codex P1). `WriteParams::auto_cleanup` is create-time
config with no effect on existing datasets (Lance write.rs docs), so the previous
`auto_cleanup: None` change alone did NOT protect graphs created before the v7
bump. 6.0.1 defaulted auto_cleanup ON, which left `lance.auto_cleanup.*` config on
those datasets, and Lance's per-commit hook (io/commit.rs: `if
!commit_config.skip_auto_cleanup`) fires off that stored config. As a result,
omnigraph's own writes would GC versions that the __manifest pins for
snapshots/time-travel.
Skip the hook on every commit path, covering new and legacy datasets alike:
- commit_staged: CommitBuilder::with_skip_auto_cleanup(true) — the staged data path.
- __manifest publisher: MergeInsertBuilder::skip_auto_cleanup(true).
- all 11 WriteParams: skip_auto_cleanup: true (direct Dataset::write/append paths;
auto_cleanup: None retained so new datasets store no cleanup config at all).
Tests:
- lance_surface_guards::skip_auto_cleanup_suppresses_version_gc — substrate:
negative control (config GCs v1 without skip) + with-skip survival.
- staged_writes::commit_staged_skips_auto_cleanup_so_pinned_versions_survive —
omnigraph usage: commit_staged on a legacy-config dataset preserves the pinned
create version.
Refs lance#6755.
* test(lance): assert created_at-preserved + updated_at-bumped on merge_insert UPDATE
Addresses PR #229 review follow-up. `lance_merge_insert_update_preserves_created_at_version`
documented (in a comment) that a merge_insert UPDATE preserves created_at and
bumps updated_at, but only asserted the value change — leaving the change-feed
invariant unguarded. Add the two missing assertions:
- bob created_at == v1 (preserved across UPDATE; what the test name promises;
lance#6774 only changed INSERT-row stamping).
- bob updated_at == v2 (bumped to the commit version) — the invariant
OmniGraph's insert/update classification relies on (changes/mod.rs keys on
_row_last_updated_at_version). A regression here would silently drop updates
from the diff/change feed.
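To show why both assertions matter, here is a minimal sketch of an insert/update classifier keyed on per-row version stamps. It is an assumption about the general shape of such logic, not the code in changes/mod.rs; the `RowVersions` struct, the `Change` enum, and `classify` are hypothetical names, and `base` stands for the version the diff starts from.

```rust
/// Per-row version stamps. Field names are shorthand for the created-at and
/// last-updated-at versions discussed above (hypothetical struct).
#[derive(Debug, Clone, Copy)]
struct RowVersions {
    created_at: u64,
    last_updated_at: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum Change {
    Unchanged,
    Insert,
    Update,
}

/// Classify a row when diffing from `base` (exclusive) up to the head version.
fn classify(row: RowVersions, base: u64) -> Change {
    if row.last_updated_at <= base {
        // Not touched since `base`: the row never reaches the change feed.
        Change::Unchanged
    } else if row.created_at > base {
        // First appeared after `base`.
        Change::Insert
    } else {
        // Existed at `base` and was modified afterwards.
        Change::Update
    }
}
```

Under this sketch, bob with created_at == v1 and updated_at == v2 classifies as an update when diffing from v1. If updated_at were not bumped, the row would fall into `Unchanged` and the update would be dropped; if created_at were not preserved, the same row would be reported as an insert.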
2026-06-14 20:42:24 +02:00
|
|
|
object_store = { version = "0.13.2", default-features = false, features = ["aws", "fs"] }
|
2026-04-10 20:49:41 +03:00
|
|
|
fail = "0.5"
|
|
|
|
|
time = { version = "0.3", features = ["formatting"] }
|
|
|
|
|
axum = { version = "0.8", features = ["json", "macros"] }
|
Add OpenAPI spec generation via utoipa with /openapi.json endpoint
Integrate utoipa 5 to auto-generate an OpenAPI 3.1 spec from the existing
Axum handlers and serde types. All 16 endpoints are annotated with path
metadata, request/response schemas, security requirements, and tags. A
public /openapi.json endpoint serves the spec without requiring auth.
Includes 59 tests covering path completeness, HTTP methods, schema fields,
enum variants, security scheme, path/query parameters, request bodies,
response references, and endpoint integration.
https://claude.ai/code/session_01NfoPVx21rZUQned1f7WpXY
2026-04-11 13:11:14 +00:00
|
|
|
utoipa = { version = "5", features = ["axum_extras"] }
|
2026-04-10 20:49:41 +03:00
|
|
|
url = "2"
|
|
|
|
|
cedar-policy = "4.9"
|
|
|
|
|
sha2 = "0.10"
|
2026-04-17 21:40:51 +03:00
|
|
|
subtle = "2"
|
2026-04-10 20:49:41 +03:00
|
|
|
|
|
|
|
|
[profile.dev]
|
|
|
|
|
debug = 0
|
|
|
|
|
|
|
|
|
|
[profile.dev.package."*"]
|
|
|
|
|
opt-level = 2
|
|
|
|
|
|
|
|
|
|
[profile.release]
|
|
|
|
|
opt-level = 2
|
|
|
|
|
lto = "thin"
|
|
|
|
|
codegen-units = 16
|
|
|
|
|
strip = true
|