mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-09 01:35:18 +02:00
* gitignore: exclude docs/internal/ from publication
Mirrors the existing "Local-only working files (not for the public
repo)" pattern. Working notes filed under docs/internal/ stay on the
contributor's machine instead of cluttering the published doc tree
or tripping the AGENTS.md / docs-index cross-link check
(scripts/check-agents-md.sh enumerates every docs/*.md and requires
each one to be linked from an audience index — internal notes don't
have an audience index by definition).
Incidental to the v0.5.0 release; lands separately from the version
bump commits.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* ci: skip docs/internal/ in agents-md cross-link check
Matches the .gitignore exclusion. Mirrors the existing 'docs/releases/'
exclusion pattern: notes under docs/internal/ aren't part of the
published doc tree and don't need to be linked from an audience index.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* release: v0.5.0 — Lance 6 substrate, Cedar policy engine, schema-lint v1
Bumps the workspace from 0.4.2 to 0.5.0. Release notes at
docs/releases/v0.5.0.md.
Three user-visible pillars motivate the minor bump:
1. Lance 6.0.1 substrate (DataFusion 52→53, Arrow 57→58)
2. Engine-wide Cedar policy enforcement on every _as writer; server
defaults to deny-all; signed-token-claim-only actor identity
3. Schema-lint v1 chassis: OG-XXX-NNN codes, soft drops, and
`--allow-data-loss` (Hard mode) for destructive migrations
Plus structured DataFusion Expr filter pushdown (unblocks
CompOp::Contains via array_has), HTTP allow_data_loss parity, inline
.gq sources on CLI/HTTP, optional CORS layer, and bug fixes
(merge-insert dup-rowid, branch-merge coordinator restore on error,
blob columns in branch merge).
Sites bumped:
- 5 crate [package].version lines (omnigraph, omnigraph-cli,
omnigraph-compiler, omnigraph-policy, omnigraph-server)
- 10 internal path-dep `version = "..."` constraints across the
four manifests that depend on sister crates (engine, server, cli,
plus engine's dev-dep on the compiler)
- Cargo.lock (regenerated via cargo update --workspace)
- AGENTS.md "Version surveyed:"
- openapi.json `info.version` (regenerated via
OMNIGRAPH_UPDATE_OPENAPI=1 cargo test -p omnigraph-server --test
openapi)
Verification:
- cargo test --workspace --locked: 907/907 green
- cargo test -p omnigraph-engine --test failpoints --features
failpoints: 19/19 green
- cargo test -p omnigraph-engine --test lance_surface_guards: 3/3
- scripts/check-agents-md.sh: clean
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Omnigraph v0.5.0
Omnigraph v0.5.0 is a substrate, security, and migration-safety release. It
jumps the storage substrate from Lance 4 to Lance 6.0.1 (DataFusion 52 → 53,
Arrow 57 → 58), introduces engine-wide Cedar policy enforcement on every
authoring path, and ships a structured schema-lint v1 chassis with
code-tagged diagnostics, soft drops, and an explicit --allow-data-loss
flag for destructive migrations.
Highlights
- Lance 6.0.1 substrate: bump from Lance 4.0.0 → 6.0.1, DataFusion 52 → 53, Arrow 57 → 58. New optimizer rules (vectorized IN-list eq kernel, `PhysicalExprSimplifier`, push-limit-into-hash-join, CASE-NULL shortcut) reach predicates that flow through the engine. `lance-tokenizer` replaces tantivy internally; FTS behavior is preserved.
- Cedar policy engine: a new `omnigraph-policy` crate wires `Omnigraph::enforce(action, scope, actor)` into every `_as` writer (`mutate_as`, `load_as`, `apply_schema_as`, `branch_create_as`, `branch_merge_as`, `branch_delete_as`, plus the load and change variants). The HTTP server defaults to deny-all when no Cedar policy is configured; a YAML policy file is required to enable writes. Actor identity comes only from signed token claims; clients cannot set actor identity directly.
- Schema lint v1 chassis: diagnostics now carry stable codes of the form `OG-XXX-NNN` instead of free-form messages. `omnigraph schema plan` and `apply` understand soft drops on properties and types. Destructive drops require the new `--allow-data-loss` flag (Hard mode) at the CLI and an equivalent JSON flag over HTTP.
- Structured filter pushdown: query-language predicates lower to DataFusion `Expr` and push down through Lance's `Scanner::filter_expr` instead of being flattened to SQL strings. This unlocks `CompOp::Contains` pushdown (via `array_has`), which previously fell through to in-memory post-scan filtering, and lets the DataFusion 53 optimizer rules above act on our predicates.
- HTTP `allow_data_loss` parity: the destructive-drop guard now exists on both the CLI (`--allow-data-loss`) and HTTP (`allow_data_loss: true` in the schema-apply request body).
- Inline query strings on CLI and HTTP: `omnigraph read` / `omnigraph mutate` and the corresponding HTTP endpoints accept inline `.gq` source, not just a file path. Easier ad-hoc queries, clearer request logs.
- Browser CORS layer: optional CORS layer on `omnigraph-server` for browser-based UIs, gated by `OMNIGRAPH_CORS_ORIGINS`.
- Merge-insert dup-rowid fix: Lance's `MergeInsertBuilder` could surface spurious `"Ambiguous merge inserts"` errors on sequential merges against rows previously rewritten by `merge_insert`. The engine now opts into `SourceDedupeBehavior::FirstSeen` with a `check_batch_unique_by_keys` fail-fast precondition that guarantees source-side dedup happens before Lance sees the batch.
- Branch-merge error-path recovery: a branch merge that failed mid-flight could leave the in-process coordinator pointing at a stale active branch. The error path now restores the prior coordinator, matching the success path's invariant.
- Branch merge with blob columns: external blob URIs are now materialized correctly during branch merge instead of being dropped or pointing at the source branch.
- Lance API surface guards: a new test file (`crates/omnigraph/tests/lance_surface_guards.rs`) pins eight specific Lance API surfaces (`LanceError::TooMuchWriteContention`, `ManifestLocation` fields, `MergeInsertBuilder` return shape, `WriteParams::default`, `compact_files` signature, etc.) so the next Lance bump fails compile or runtime on any silent drift rather than producing wrong-state recovery in production.
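The merge-insert fix above relies on source-side deduplication plus a fail-fast uniqueness check before the batch is handed to Lance. The following is a minimal sketch of that idea in plain Rust using only the standard library. `Row`, `dedupe_first_seen`, and `check_unique_by_key` are hypothetical names invented for illustration; they are not the engine's or Lance's API, and the real implementation operates on Arrow batches rather than a slice of structs.

```rust
use std::collections::HashSet;

/// Hypothetical row shape for illustration: a merge key plus a payload.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub key: String,
    pub value: i64,
}

/// Keep the first row seen for each key and drop later duplicates,
/// preserving the original order of the surviving rows.
pub fn dedupe_first_seen(rows: &[Row]) -> Vec<Row> {
    let mut seen = HashSet::new();
    rows.iter()
        // `insert` returns false when the key was already present.
        .filter(|row| seen.insert(row.key.clone()))
        .cloned()
        .collect()
}

/// Fail-fast precondition: report the first duplicated key, if any.
pub fn check_unique_by_key(rows: &[Row]) -> Result<(), String> {
    let mut seen = HashSet::new();
    for row in rows {
        if !seen.insert(row.key.as_str()) {
            return Err(format!("duplicate merge key: {}", row.key));
        }
    }
    Ok(())
}
```

In this sketch a batch with keys `a, b, a` fails `check_unique_by_key`, while the output of `dedupe_first_seen` on the same batch passes it and keeps the first `a` row.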
Behavior changes
- On-disk format unchanged: existing v0.4.2 datasets open unchanged. The Lance file format pin stays at V2_2 (required by Lance's blob v2 feature).
- `omnigraph-server` defaults to deny-all under `--policy`: starting a server with the policy feature enabled but no Cedar YAML policy configured rejects every write. Operators must supply a policy file to authorize anything.
- Schema-lint diagnostics carry stable codes: messages now lead with `OG-XXX-NNN`. CI parsers or tooling that keyed off the v0.4.2 free-form text need to switch to code-based matching.
- Destructive schema drops require `--allow-data-loss`: dropping a property or type returns a structured diagnostic by default. `omnigraph schema apply --allow-data-loss` (CLI) or `{"allow_data_loss": true}` (HTTP) opts into Hard mode.
- `HashJoinExec` null-aware semantics on anti-join: a side effect of the DataFusion 53 bump. `NOT IN` semantics under null-valued anti-join columns are now correct per the SQL standard. Queries that depended on the prior behavior would have been incorrect.
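To make the `NOT IN` change concrete, here is a small sketch of the SQL three-valued logic that a null-aware anti-join has to respect. It models the standard's semantics only and is not DataFusion's implementation; `sql_not_in` and `where_keeps` are hypothetical helper names, and `None` stands in for SQL `NULL`.

```rust
/// Three-valued result of `x NOT IN (list)`.
/// `Some(true)` and `Some(false)` are definite; `None` models SQL NULL (unknown).
pub fn sql_not_in(x: Option<i64>, list: &[Option<i64>]) -> Option<bool> {
    // An empty list makes NOT IN true regardless of x.
    if list.is_empty() {
        return Some(true);
    }
    // A NULL probe value compared against a non-empty list is unknown.
    let x = x?;
    // A definite match makes NOT IN false.
    if list.iter().any(|v| *v == Some(x)) {
        return Some(false);
    }
    // No match, but a NULL in the list might have matched: unknown.
    if list.iter().any(|v| v.is_none()) {
        return None;
    }
    Some(true)
}

/// A WHERE clause keeps a row only when the predicate is definitely true.
pub fn where_keeps(predicate: Option<bool>) -> bool {
    predicate == Some(true)
}
```

Under these rules, `1 NOT IN (2, NULL)` evaluates to unknown, so the row is filtered out. A join that ignored the `NULL` would keep it instead.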
Upgrade Notes
Migration
- No data migration. v0.4.2 repos open directly on v0.5.0.
Clients
- HTTP and SDK clients should switch any string-matching schema-lint parsing to code-based matching against the `OG-XXX-NNN` prefix.
- Clients exercising destructive schema drops (`DropProperty`, `DropType`) must add the `allow_data_loss` request field (HTTP) or `--allow-data-loss` flag (CLI). Default is soft-drop-or-reject.
- Calls to the `mutate_as` / `load_as` / `apply_schema_as` / branch authoring APIs now flow through the policy enforcer. Anything bypassing authorization on v0.4.2 will be rejected on v0.5.0 once a policy is configured.
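For tooling that moves from text matching to code-based matching, a prefix extractor along the following lines may be enough. This is a sketch under a stated assumption: it treats `OG-XXX-NNN` literally as three ASCII uppercase letters followed by three ASCII digits, which the release notes do not confirm. `leading_code` is a hypothetical helper, and the code `OG-DRP-001` in the usage note is made up for illustration.

```rust
/// Extract a leading diagnostic code of the assumed shape `OG-XXX-NNN`
/// (three ASCII uppercase letters, three ASCII digits) from a message.
/// Returns `None` when the message does not start with such a code.
pub fn leading_code(message: &str) -> Option<&str> {
    // The assumed code is exactly 10 bytes long; `get` also guards
    // against slicing through a multi-byte character.
    let code = message.get(..10)?;
    let bytes = code.as_bytes();
    let shape_ok = code.starts_with("OG-")
        && bytes[3..6].iter().all(|b| b.is_ascii_uppercase())
        && bytes[6] == b'-'
        && bytes[7..10].iter().all(|b| b.is_ascii_digit());
    // Reject longer tokens such as "OG-ABC-1234" by requiring that the
    // code is followed by a non-alphanumeric character or end of input.
    let boundary_ok = message[10..]
        .chars()
        .next()
        .map_or(true, |c| !c.is_ascii_alphanumeric());
    if shape_ok && boundary_ok {
        Some(code)
    } else {
        None
    }
}
```

A caller can then branch on the returned code, for example `leading_code("OG-DRP-001: ...")`, and ignore the free-form text that follows it.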
Operators
- Configure a Cedar policy YAML for production servers before enabling writes; deny-all is the new default. The `omnigraph policy validate` / `test` / `explain` CLI commands are unchanged.
- Bearer tokens continue to be the actor-identity source; review the signed-token-claim-only invariant in `docs/dev/invariants.md` if you've built custom authentication.
- If your local CI uses RustFS for S3-compatible storage testing, our CI pins `rustfs/rustfs:1.0.0-beta.3` (the last known-good tag before the upstream credentials-policy change). Mirror the pin or set `RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true` for the new image versions.
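The deny-all default can be pictured as an authorization check in which a missing policy is itself a reason to deny. The sketch below is a deliberately simplified model and not Cedar or the `omnigraph-policy` crate: `Policy`, `Decision`, and `authorize` are hypothetical, and the policy is reduced to a flat allow-list of actor and action pairs.

```rust
/// Hypothetical, simplified policy: an allow-list of (actor, action) pairs.
pub struct Policy {
    allowed: Vec<(String, String)>,
}

impl Policy {
    pub fn new(allowed: &[(&str, &str)]) -> Self {
        Policy {
            allowed: allowed
                .iter()
                .map(|(actor, action)| (actor.to_string(), action.to_string()))
                .collect(),
        }
    }

    fn permits(&self, actor: &str, action: &str) -> bool {
        self.allowed
            .iter()
            .any(|(a, act)| a == actor && act == action)
    }
}

#[derive(Debug, PartialEq)]
pub enum Decision {
    Allow,
    Deny,
}

/// Deny unless a policy is configured AND it explicitly permits the request.
pub fn authorize(policy: Option<&Policy>, actor: &str, action: &str) -> Decision {
    match policy {
        Some(p) if p.permits(actor, action) => Decision::Allow,
        // No policy configured, or no matching permit: deny.
        _ => Decision::Deny,
    }
}
```

With no policy (`None`), every request is denied. Supplying a policy only opens the specific actor and action pairs it names.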
Tests added or strengthened
- `crates/omnigraph/tests/lance_surface_guards.rs`: 8 named guards pinning Lance API surfaces against silent drift on future bumps.
- `crates/omnigraph/tests/policy_engine_chassis.rs`: engine-level policy enforcement coverage; complements the existing HTTP policy tests.
- Policy chassis e2e gap-fills: branch-merge, branch-create, branch-delete policy paths now have explicit end-to-end tests over HTTP and CLI.
- Merge-pair truth table: exhaustive op-variant matrix for three-way merge across `noop`, `addNode`, `removeNode`, `addEdge`, `removeEdge`, `setProperty`, `dropProperty`, `addLabel`, `removeLabel`; the build fails to compile when a new op variant is added without dispositioning every pairing.
- Merge-insert: regression for the dup-rowid bug class on the load surface (`load_merge_repeated_against_overlapping_keys_succeeds`), the update surface (`second_sequential_update_on_same_row_succeeds`), and the upstream-Lance-gap canary (`load_merge_window_2_documents_upstream_lance_gap`).
- Maintenance + destructive-migration coverage: `omnigraph optimize` / `cleanup` boundary cases, plus schema-apply soft-drop and Hard-mode paths.
- Stable-row-id preservation across `stage_overwrite`: pins the invariant that staged overwrites carry stable row IDs through to the committed fragment set.
- `CompOp::Contains` pushdown regression (`ir_filter_with_list_contains_pushes_down`): pins the new structured Expr pushdown path that retired the in-memory fallback.
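The compile-time guarantee described for the merge-pair truth table can be obtained in Rust with a `match` over a tuple of enums that has no wildcard arm. The sketch below shows the technique on a reduced three-variant enum. `Op`, `Disposition`, and the specific outcomes in each arm are invented for illustration and do not reflect the engine's actual merge rules.

```rust
/// Reduced, hypothetical op set (the real matrix covers more variants).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Op {
    Noop,
    AddNode,
    RemoveNode,
}

/// Hypothetical merge outcomes for a pair of ops.
#[derive(Debug, PartialEq)]
pub enum Disposition {
    TakeLeft,
    TakeRight,
    Conflict,
}

pub fn disposition(left: Op, right: Op) -> Disposition {
    use Disposition::*;
    use Op::*;
    // No wildcard arm: adding a variant to `Op` makes this match
    // non-exhaustive, so the build fails until every new pairing is listed.
    match (left, right) {
        (Noop, Noop) => TakeLeft,
        (Noop, AddNode) => TakeRight,
        (Noop, RemoveNode) => TakeRight,
        (AddNode, Noop) => TakeLeft,
        (AddNode, AddNode) => Conflict,
        (AddNode, RemoveNode) => Conflict,
        (RemoveNode, Noop) => TakeLeft,
        (RemoveNode, AddNode) => Conflict,
        (RemoveNode, RemoveNode) => TakeLeft,
    }
}
```

The cost of this style is verbosity, since n variants need n² arms. The benefit is that the compiler, rather than a reviewer, notices an unhandled pairing.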
Included Changes
- Lance 4 → 6.0.1, DataFusion 52 → 53, Arrow 57 → 58 substrate upgrade.
- `omnigraph-policy` crate with engine-wide Cedar enforcement and signed-token-claim-only actor identity.
- Schema-lint v1 chassis with `OG-XXX-NNN` codes, soft `DropProperty` / `DropType` semantics, and `--allow-data-loss` for Hard mode.
- HTTP `allow_data_loss` request field parity with the CLI flag.
- Structured DataFusion `Expr` filter pushdown via `Scanner::filter_expr`, with `CompOp::Contains` lowered through `array_has`.
- Inline `.gq` source acceptance on CLI and HTTP read/mutate endpoints.
- Optional CORS layer on `omnigraph-server` for browser UIs.
- Bug fixes: merge-insert dup-rowid (FirstSeen + uniqueness precondition), branch-merge coordinator restore on error, blob-column materialization during branch merge.
- New Lance API surface-guard test file as the canary for future Lance bumps.
- Recovery-sidecar coverage extended across the four write paths (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) with failpoint regression tests.
- CI: pinned `rustfs/rustfs:1.0.0-beta.3` after the upstream `:latest` introduced a credentials-policy change.
- Version bump to `0.5.0` across workspace crates, `Cargo.lock`, `openapi.json`, and the `AGENTS.md` surveyed version.