omnigraph/Cargo.toml
Ragnor Comerford 67baf615d9
build(deps): bump Lance 6.0.1 → 7.0.0 (correct-by-design substrate alignment) (#229)
* build(deps): bump Lance 6.0.1 → 7.0.0 (object_store 0.13.2, roaring 0.11.4)

Arrow stays 58 and DataFusion stays 53 (no change). The only transitive bump
is object_store 0.12.5 → 0.13.2. 141 upstream commits reviewed; no fixes lost
(the 6.0.x release-branch backports are all forward-ported into 7.0.0).

- object_store 0.13 moved get/put/head/rename/delete behind a new ObjectStoreExt
  trait (list/list_with_delimiter/put_opts stay on the core trait). Add
  `use object_store::ObjectStoreExt` in storage.rs and db/manifest/namespace.rs;
  no call-site changes. Mirrors Lance's own migration in PR #6672.
- roaring pinned to 0.11.4 (cargo update -p roaring --precise 0.11.4). Lance
  7.0.0's UpdatedFragmentOffsets newtype (lance#6650) derives Eq over
  HashMap<u64, RoaringBitmap>, which needs RoaringBitmap: Eq, added in roaring
  0.11.4; the loose `roaring = "0.11"` constraint otherwise resolves 0.11.3 and
  lance itself fails to compile.
- lance#6774: merge-insert INSERT rows now stamp _row_created_at_version with the
  commit version (was a fallback of 1). Flip the lance_version_columns assertion
  to `== v2` and correct the changes/mod.rs rationale comment. Production
  change-detection keys on _row_last_updated_at_version + ID membership, so its
  logic is unaffected.

Refs lance#6650, lance#6774, lance#6672.

* fix(storage): pin WriteParams::auto_cleanup = None (lance#6755 default flip)

lance#6755 flipped the WriteParams::auto_cleanup default from on (a full cleanup
pass every 20th commit) to None. On 6.0.1 the on-by-default hook could silently
GC versions that __manifest pins for snapshots/time-travel. OmniGraph owns
cleanup explicitly (optimize.rs::cleanup_all_tables) and never set auto_cleanup,
so it was relying on a default that is both wrong for our snapshot model and now
changed upstream.

Pin auto_cleanup: None explicitly at all 11 production WriteParams sites
(table_store ×6, commit_graph ×2, recovery_audit ×1, manifest/graph ×2 — the
__manifest + sub-table Create paths). Removes the dependency on a default-flag
value and locks in the snapshot-safe behavior regardless of future upstream
re-flips.

Refs lance#6755.

* test(lance): pin BTREE range-boundary correctness (lance#6796)

lance#6796 (issue #6792) fixed a BTREE scalar-index range-query bound
inclusiveness bug: `x <= hi AND x > lo` returned the wrong boundary row.
Add lance_surface_guards.rs::btree_range_query_boundary_is_correct, which
reproduces the exact #6792 shape (5 rows + an explicit BTREE drives the index
path even on tiny data) and pins the corrected inclusive-<= / exclusive->
semantics. It turns red if a future Lance regression reintroduces the bug.

OmniGraph today builds BTREE only on string @key columns and queries them by
equality/IN, so its current patterns do not hit this; the guard protects any
future BTREE-range path (BTREE-on-properties, range-on-key).

Refs lance#6796.

* docs(dev): align Lance docs + invariants to 7.0.0

- docs/dev/lance.md: new 2026-06-14 alignment stanza for the 6.0.1 → 7.0.0 bump
  (object_store ObjectStoreExt move, roaring 0.11.4, #6774/#6796/#6755 behavior,
  #6658 shipped → MR-A unblocked but separate, #6666 + blob compaction still
  open); prior 6.0.1 stanza demoted to historical.
- AGENTS.md: storage substrate 6.x → 7.x (line + architecture diagram).
- docs/dev/invariants.md: deletes/vector known gap updated — the staged
  two-phase delete API (lance#6658) now exists and MR-A is unblocked, but
  delete_where stays inline and D2 stays in place until the migration lands;
  create_vector_index still gated on lance#6666.

* fix(storage): skip Lance auto-cleanup on commit paths for legacy datasets

Addresses PR #229 review (Codex P1). `WriteParams::auto_cleanup` is create-time
config with no effect on existing datasets (Lance write.rs docs), so the previous
`auto_cleanup: None` change alone did NOT protect graphs created before the v7
bump: 6.0.1 defaulted auto_cleanup ON, leaving `lance.auto_cleanup.*` config on
those datasets, and Lance's per-commit hook (io/commit.rs: `if
!commit_config.skip_auto_cleanup`) fires off that stored config — so omnigraph's
own writes would GC versions the __manifest pins for snapshots/time-travel.

Skip the hook on every commit path, covering new and legacy datasets alike:
- commit_staged: CommitBuilder::with_skip_auto_cleanup(true) — the staged data path.
- __manifest publisher: MergeInsertBuilder::skip_auto_cleanup(true).
- all 11 WriteParams: skip_auto_cleanup: true (direct Dataset::write/append paths;
  auto_cleanup: None retained so new datasets store no cleanup config at all).

Tests:
- lance_surface_guards::skip_auto_cleanup_suppresses_version_gc — substrate:
  negative control (config GCs v1 without skip) + with-skip survival.
- staged_writes::commit_staged_skips_auto_cleanup_so_pinned_versions_survive —
  omnigraph usage: commit_staged on a legacy-config dataset preserves the pinned
  create version.

Refs lance#6755.

* test(lance): assert created_at-preserved + updated_at-bumped on merge_insert UPDATE

Addresses PR #229 review follow-up. `lance_merge_insert_update_preserves_created_at_version`
documented (in a comment) that a merge_insert UPDATE preserves created_at and
bumps updated_at, but only asserted the value change — leaving the change-feed
invariant unguarded. Add the two missing assertions:

- bob created_at == v1 (preserved across UPDATE; what the test name promises;
  lance#6774 only changed INSERT-row stamping).
- bob updated_at == v2 (bumped to the commit version) — the invariant
  OmniGraph's insert/update classification relies on (changes/mod.rs keys on
  _row_last_updated_at_version). A regression here would silently drop updates
  from the diff/change feed.
2026-06-14 20:42:24 +02:00

87 lines
2.2 KiB
TOML

[workspace]
resolver = "2"
members = [
"crates/omnigraph-compiler",
"crates/omnigraph",
"crates/omnigraph-cli",
"crates/omnigraph-api-types",
"crates/omnigraph-cluster",
"crates/omnigraph-policy",
"crates/omnigraph-server",
]
default-members = [
"crates/omnigraph",
"crates/omnigraph-cli",
"crates/omnigraph-server",
]
[workspace.dependencies]
arrow-array = "58"
arrow-ipc = "58"
arrow-schema = "58"
arrow-select = "58"
arrow-cast = { version = "58", features = ["prettyprint"] }
arrow-ord = "58"
datafusion = { version = "53", default-features = false, features = ["nested_expressions"] }
datafusion-physical-plan = "53"
datafusion-physical-expr = "53"
datafusion-execution = "53"
datafusion-common = "53"
datafusion-expr = "53"
datafusion-functions-aggregate = "53"
lance = { version = "7.0.0", default-features = false, features = ["aws"] }
lance-datafusion = "7.0.0"
lance-file = "7.0.0"
lance-index = "7.0.0"
lance-linalg = "7.0.0"
lance-namespace = "7.0.0"
lance-namespace-impls = "7.0.0"
lance-table = "7.0.0"
ulid = "1"
futures = "0.3"
async-trait = "0.1"
chrono = { version = "0.4", default-features = false, features = ["clock"] }
pest = "2"
pest_derive = "2"
thiserror = "2"
tokio = { version = "1", features = ["rt-multi-thread", "macros", "time", "net", "signal", "sync"] }
clap = { version = "4.6", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde_yaml = "0.9"
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "fmt"] }
tower = "0.5"
tower-http = { version = "0.6", features = ["trace"] }
color-eyre = "0.6"
tempfile = "3"
ahash = "0.8"
arc-swap = "1"
base64 = "0.22"
ariadne = "0.4"
regex = "1"
reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls"] }
object_store = { version = "0.13.2", default-features = false, features = ["aws", "fs"] }
fail = "0.5"
time = { version = "0.3", features = ["formatting"] }
axum = { version = "0.8", features = ["json", "macros"] }
utoipa = { version = "5", features = ["axum_extras"] }
url = "2"
cedar-policy = "4.9"
sha2 = "0.10"
subtle = "2"
[profile.dev]
debug = 0
[profile.dev.package."*"]
opt-level = 2
[profile.release]
opt-level = 2
lto = "thin"
codegen-units = 16
strip = true