mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-09 01:35:18 +02:00
The Run state machine was removed in MR-771 (v0.4.0); `docs/dev/runs.md` and `crates/omnigraph/tests/runs.rs` have since documented and tested the direct-publish write path, so the "runs" name was misleading.
- git mv docs/dev/runs.md → docs/dev/writes.md (reframe H1 + intro; keep MR-771 history note)
- git mv crates/omnigraph/tests/runs.rs → tests/writes.rs (reframe header)
- repoint every runs.md / runs.rs reference across docs, AGENTS.md, and source comments
- fix four pre-existing broken `docs/runs.md` links (the file never lived at that path) to `docs/dev/writes.md`
- fix the stale v0.4.0 anchor to the live section
No behavior change: every source edit is a comment. Engine builds and the renamed test passes 25/25; scripts/check-agents-md.sh passes. The run-removal cleanup itself (run_registry.rs guard, __run__ prefix) is deferred to MR-770.
Omnigraph v0.4.0
Omnigraph v0.4.0 demotes the Run state machine to commit metadata via the publisher's CAS, fixing a write-cancellation hole and reducing the engine's surface area.
Highlights
- Direct-to-target writes: `mutate_as` and `load` write directly to the target tables and call `ManifestBatchPublisher::publish` once at the end with `expected_table_versions`. No more `__run__<id>` staging branches, no more `RunRecord` state machine. Cross-table OCC is enforced inside the publisher's row-level CAS on `__manifest`.
- Cancellation safety by construction: a dropped mutation future leaves no graph-level state, only orphaned Lance fragments, which are reclaimed by `omnigraph cleanup`. The "zombie run" cascade documented in `.context/zombie-run-investigation.md` is gone.
- Read-your-writes inside multi-statement mutations: a `.gq` query that inserts and then references a row in the same statement now sees its own writes via an in-process `MutationStaging` cache, even though no manifest commit happens between ops.
- Structured conflict surface: concurrent writers race through the publisher's CAS; the loser surfaces as `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }`. The HTTP server maps this to 409 Conflict with a structured `manifest_conflict` body so clients can detect-and-retry without parsing the message.
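The single-publish, compare-and-swap idea in the first and last highlights can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Manifest` struct, its `HashMap` layout, the local `ExpectedVersionMismatch` struct, and the `"node:Person"` table key are all hypothetical stand-ins, not the real `ManifestBatchPublisher` or `__manifest` schema. It shows that every expected version is checked before any table is updated, so a losing writer changes nothing.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the conflict detail named in the notes.
#[derive(Debug, PartialEq)]
struct ExpectedVersionMismatch {
    table_key: String,
    expected: u64,
    actual: u64,
}

/// Toy manifest: table key -> committed version (assumed layout, not the real one).
#[derive(Default)]
struct Manifest {
    versions: HashMap<String, u64>,
}

impl Manifest {
    /// Publish all touched tables in one step, or none of them.
    /// `expected` holds the version each writer read; `new_versions` the version it produced.
    fn publish(
        &mut self,
        expected: &HashMap<String, u64>,
        new_versions: &HashMap<String, u64>,
    ) -> Result<(), ExpectedVersionMismatch> {
        // Phase 1: verify every expectation before mutating anything.
        for (table, &want) in expected {
            let actual = self.versions.get(table).copied().unwrap_or(0);
            if actual != want {
                return Err(ExpectedVersionMismatch {
                    table_key: table.clone(),
                    expected: want,
                    actual,
                });
            }
        }
        // Phase 2: all checks passed, so record the new versions.
        for (table, &version) in new_versions {
            self.versions.insert(table.clone(), version);
        }
        Ok(())
    }
}

fn main() {
    let mut manifest = Manifest::default();
    // Two writers both read version 0 of the same (hypothetical) table.
    let expected = HashMap::from([("node:Person".to_string(), 0u64)]);
    let produced = HashMap::from([("node:Person".to_string(), 1u64)]);

    let first = manifest.publish(&expected, &produced);
    let second = manifest.publish(&expected, &produced);
    println!("first writer:  {:?}", first);
    println!("second writer: {:?}", second);
}
```

In this model the first writer wins and the second receives a mismatch carrying the table key plus the expected and actual versions, which is the information a client needs to decide whether to re-read and retry.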
Removed
This is a breaking release. Pre-0.4.0 / no SLA.
- `omnigraph::db::{RunRecord, RunStatus, RunId}` types and the `_graph_runs.lance` / `_graph_run_actors.lance` Lance datasets.
- Engine APIs `begin_run`, `begin_run_as`, `publish_run`, `publish_run_as`, `abort_run`, `fail_run`, `terminate_run`, `list_runs`, `get_run`.
- HTTP endpoints: `GET /runs`, `GET /runs/{run_id}`, `POST /runs/{run_id}/publish`, `POST /runs/{run_id}/abort`. The `RunListOutput` and `RunOutput` schemas are removed from the OpenAPI document.
- CLI subcommands: `omnigraph run list`, `omnigraph run show`, `omnigraph run publish`, `omnigraph run abort`. Use `omnigraph commit list`, which reads the commit graph, for audit history.
- Cedar policy actions `run_publish` and `run_abort`. Existing `policy.yaml` files referencing these actions will fail validation; remove the rules, since the `change` action covers the equivalent gating.
Behavior changes
- `mutate_as` / `load` are now atomic per query, with a single publish at the end. A failed mutation leaves the target unchanged with no intermediate manifest commits.
- The `OmniError::manifest_conflict` shape produced by concurrent writers is now `ExpectedVersionMismatch` (was `MergeConflict::DivergentUpdate` via the run merge path). Clients that match on the conflict body must switch to inspecting `manifest_conflict.table_key` / `expected` / `actual`.
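For clients adapting to the new conflict shape, a detect-and-retry loop might look roughly like the sketch below. Everything here is an assumption made for illustration: `ManifestConflict`, `WriteError`, and `retry_on_conflict` are hypothetical client-side names, and the closure stands in for whatever call actually submits the mutation and decodes a 409 response. Only the three field names come from the notes above.

```rust
/// Hypothetical client-side view of the structured `manifest_conflict` body.
#[derive(Debug, PartialEq)]
struct ManifestConflict {
    table_key: String,
    expected: u64,
    actual: u64,
}

/// Hypothetical error type: a version conflict is retryable, anything else is not.
#[derive(Debug, PartialEq)]
enum WriteError {
    Conflict(ManifestConflict),
    Other(String),
}

/// Call `attempt` until it succeeds, fails with a non-conflict error,
/// or `max_attempts` tries have been used. The closure receives the try number.
fn retry_on_conflict<T>(
    max_attempts: u32,
    mut attempt: impl FnMut(u32) -> Result<T, WriteError>,
) -> Result<T, WriteError> {
    let mut last = WriteError::Other("max_attempts was 0".to_string());
    for n in 1..=max_attempts {
        match attempt(n) {
            Ok(value) => return Ok(value),
            // Lost the race: a real client would re-read current state here, then retry.
            Err(WriteError::Conflict(conflict)) => last = WriteError::Conflict(conflict),
            // Any other failure is returned immediately.
            Err(other) => return Err(other),
        }
    }
    Err(last)
}

fn main() {
    // Simulated server: the first try loses the race, the second succeeds.
    let result = retry_on_conflict(3, |n| {
        if n == 1 {
            Err(WriteError::Conflict(ManifestConflict {
                table_key: "node:Person".to_string(),
                expected: 4,
                actual: 5,
            }))
        } else {
            Ok(n)
        }
    });
    println!("{:?}", result);
}
```

Matching on a typed conflict rather than on the error message is the point of the structured body: the retry decision depends only on the error kind, and the fields remain available for logging.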
Known limitation
A multi-statement mutation that writes a Lance fragment in op-N and then
fails in op-N+1 leaves the touched table with Lance HEAD ahead of the
manifest. The next mutation against that table fails with
`ExpectedVersionMismatch`. Most validation runs before any Lance write,
so single-statement mutations are unaffected; the narrow path is
multi-statement queries with late-op failures. Tracked as a follow-up;
see `docs/dev/writes.md` for the workaround.
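The failure mode can be reproduced in a toy model. This is a deliberately simplified sketch, not the engine's logic: the `Table` struct, its two counters, and the `mutate` function are hypothetical, and the version check is done up front for brevity rather than inside a publisher. Each `true` in `ops` stands for an op that writes a fragment, and each `false` for an op that fails.

```rust
#[derive(Debug, PartialEq)]
enum MutateError {
    /// An op at this index failed after earlier ops had already written.
    OpFailed(usize),
    /// Storage is not at the version the manifest records.
    ExpectedVersionMismatch { expected: u64, actual: u64 },
}

/// Toy table: `head` is the storage version, `manifest` the last published version.
struct Table {
    head: u64,
    manifest: u64,
}

impl Table {
    fn mutate(&mut self, ops: &[bool]) -> Result<(), MutateError> {
        // The writer expects storage to sit exactly where the manifest says.
        if self.head != self.manifest {
            return Err(MutateError::ExpectedVersionMismatch {
                expected: self.manifest,
                actual: self.head,
            });
        }
        for (index, &succeeds) in ops.iter().enumerate() {
            if !succeeds {
                // No publish happens, but earlier writes already advanced `head`.
                return Err(MutateError::OpFailed(index));
            }
            self.head += 1; // this op wrote a fragment
        }
        self.manifest = self.head; // single publish at the end
        Ok(())
    }
}

fn main() {
    let mut table = Table { head: 0, manifest: 0 };
    // Op 0 writes, op 1 fails: head moves to 1 while the manifest stays at 0.
    println!("{:?}", table.mutate(&[true, false]));
    // The next mutation now sees the drift.
    println!("{:?}", table.mutate(&[true]));
}
```

A mutation whose first op fails never advances `head` in this model, which matches the note that single-statement mutations are unaffected.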
Upgrade notes
- Stale `__run__*` branches and `_graph_runs.lance` in legacy v0.3.x repos are inert (the engine no longer reads them), but they remain on disk until production cleanup. This release deliberately does not touch legacy bytes.
- The `is_internal_run_branch` predicate is kept as a defense-in-depth guard against users naming a branch `__run__*`. It will be removed in a follow-up cleanup.
- External scripts hitting `/runs/*` will now receive 404. Migrate them to `/commits` for audit history; mutation status is implied by the HTTP response on `/change` itself.
Included Changes
- Demote Run: write directly to target via publisher
- `ManifestBatchPublisher::publish` accepts per-table `expected_table_versions`