omnigraph/docs/releases/v0.4.0.md
Ragnor Comerford 2d5c4b1202
2026-05-30 23:20:56 +02:00


Omnigraph v0.4.0

Omnigraph v0.4.0 demotes the Run state machine to commit metadata via the publisher's CAS, fixing a write-cancellation hole and reducing the engine's surface area.

Highlights

  • Direct-to-target writes: mutate_as and load write directly to the target tables and call ManifestBatchPublisher::publish once at the end with expected_table_versions. No more __run__<id> staging branches, no more RunRecord state machine. Cross-table OCC is enforced inside the publisher's row-level CAS on __manifest.
  • Cancellation safety by construction: a dropped mutation future leaves no graph-level state — only orphaned Lance fragments, reclaimed by omnigraph cleanup. The "zombie run" cascade documented in .context/zombie-run-investigation.md is gone.
  • Read-your-writes inside multi-statement mutations: a .gq query that inserts a row and then references it later in the same query now sees its own writes via an in-process MutationStaging cache, even though no manifest commit happens between ops.
  • Structured conflict surface: concurrent writers race through the publisher's CAS; the loser surfaces as ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }. The HTTP server maps this to 409 Conflict with a structured manifest_conflict body so clients can detect-and-retry without parsing the message.
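
The publish-time check described above can be illustrated with a toy model. This is a minimal sketch and not the real ManifestBatchPublisher: the `ToyManifest` type, its `publish` signature, the use of plain `u64` versions, and the `node:Person` table key used in examples are all assumptions made for illustration. Only the field names `table_key`, `expected`, and `actual` come from these notes.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the conflict named in these notes (fields as listed there).
#[derive(Debug, PartialEq, Eq)]
pub struct ExpectedVersionMismatch {
    pub table_key: String,
    pub expected: u64,
    pub actual: u64,
}

/// Hypothetical in-memory manifest: table key -> last published version.
#[derive(Debug, Default)]
pub struct ToyManifest {
    versions: BTreeMap<String, u64>,
}

impl ToyManifest {
    /// Version currently recorded for a table (0 if never published).
    pub fn version(&self, table_key: &str) -> u64 {
        self.versions.get(table_key).copied().unwrap_or(0)
    }

    /// Publish new versions for several tables in one step.
    ///
    /// `expected` holds the versions the writer observed when it started.
    /// Every expectation is checked before anything is changed, so a
    /// conflict on one table leaves all tables untouched.
    pub fn publish(
        &mut self,
        expected: &BTreeMap<String, u64>,
        new_versions: &BTreeMap<String, u64>,
    ) -> Result<(), ExpectedVersionMismatch> {
        for (table_key, &want) in expected {
            let actual = self.version(table_key);
            if actual != want {
                return Err(ExpectedVersionMismatch {
                    table_key: table_key.clone(),
                    expected: want,
                    actual,
                });
            }
        }
        // All expectations held: apply the whole batch.
        for (table_key, &version) in new_versions {
            self.versions.insert(table_key.clone(), version);
        }
        Ok(())
    }
}
```

In this model, two writers that both observed version 0 of the same table race on `publish`: the first succeeds and moves the table to version 1, and the second gets `ExpectedVersionMismatch` with `expected: 0` and `actual: 1`, which is the shape a client would then retry against.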

Removed

This is a breaking release. Pre-0.4.0 / no SLA.

  • omnigraph::db::{RunRecord, RunStatus, RunId} types and the _graph_runs.lance / _graph_run_actors.lance Lance datasets.
  • Engine APIs begin_run, begin_run_as, publish_run, publish_run_as, abort_run, fail_run, terminate_run, list_runs, get_run.
  • HTTP endpoints: GET /runs, GET /runs/{run_id}, POST /runs/{run_id}/publish, POST /runs/{run_id}/abort. The RunListOutput and RunOutput schemas are removed from the OpenAPI document.
  • CLI subcommands: omnigraph run list, omnigraph run show, omnigraph run publish, omnigraph run abort. For audit history, use omnigraph commit list, which reads the commit graph.
  • Cedar policy actions run_publish and run_abort. Existing policy.yaml files referencing these actions will fail validation — remove the rules; the change action covers the equivalent gating.

Behavior changes

  • mutate_as / load are now atomic per query, with a single publish at the end. A failed mutation leaves the target unchanged with no intermediate manifest commits.
  • The OmniError::manifest_conflict shape produced by concurrent writers is now ExpectedVersionMismatch (was MergeConflict::DivergentUpdate via the run merge path). Clients that match on the conflict body must switch to inspecting manifest_conflict.table_key/expected/actual.
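
A client that previously matched on the old conflict shape might move to a detect-and-retry loop along the following lines. This is a sketch under stated assumptions: `ManifestConflict`, `ChangeError`, and `retry_on_conflict` are hypothetical client-side names, the `u64` field types are a guess, and the closure stands in for whatever code issues the request and decodes a 409 body. Only the `table_key`, `expected`, and `actual` field names are taken from these notes.

```rust
/// Hypothetical client-side view of the structured conflict body.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ManifestConflict {
    pub table_key: String,
    pub expected: u64,
    pub actual: u64,
}

/// Hypothetical error type for one attempt at a change request.
#[derive(Debug, PartialEq, Eq)]
pub enum ChangeError {
    /// The server reported a manifest conflict (for example via 409 Conflict).
    Conflict(ManifestConflict),
    /// Anything else; not retried.
    Other(String),
}

/// Call `attempt` up to `max_attempts` times, retrying only on conflicts.
/// The closure receives the 1-based attempt number.
pub fn retry_on_conflict<T, F>(max_attempts: u32, mut attempt: F) -> Result<T, ChangeError>
where
    F: FnMut(u32) -> Result<T, ChangeError>,
{
    let mut last = ChangeError::Other("max_attempts was 0".to_string());
    for n in 1..=max_attempts {
        match attempt(n) {
            Ok(value) => return Ok(value),
            // A conflict means another writer won the race: remember it and retry.
            Err(ChangeError::Conflict(c)) => last = ChangeError::Conflict(c),
            // Other failures are returned immediately.
            Err(other) => return Err(other),
        }
    }
    Err(last)
}
```

The loop branches on the error variant rather than on message text, which is the point of the structured body. A real client would likely re-read current state before each retry and add backoff; both are omitted here to keep the sketch short.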

Known limitation

A multi-statement mutation that writes a Lance fragment in op-N and then fails in op-N+1 leaves the touched table with Lance HEAD ahead of the manifest. The next mutation against that table fails with ExpectedVersionMismatch. Most validation runs before any Lance write, so single-statement mutations are unaffected; the narrow path is multi-statement queries with late-op failures. Tracked as a follow-up; see docs/dev/writes.md for the workaround.
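
The failure mode can be reproduced in a toy model. This is a sketch only: `ToyTable`, `MutationError`, and `mutate` are hypothetical, each successful op is modeled as bumping the table's Lance HEAD by one, and the point at which the engine compares the manifest version with Lance HEAD is an assumption made for illustration rather than a description of the real code path.

```rust
/// Hypothetical model of one table: the version Lance has written (HEAD)
/// and the version the manifest has published.
#[derive(Debug)]
pub struct ToyTable {
    pub lance_head: u64,
    pub manifest_version: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum MutationError {
    /// The op at this index failed after earlier ops had already written.
    OpFailed { op_index: usize },
    /// The manifest's version does not match what is on disk.
    ExpectedVersionMismatch { table_key: String, expected: u64, actual: u64 },
}

/// Run a mutation made of several ops; `ops[i] == true` means op i succeeds.
/// The manifest is published once, only if every op succeeds.
pub fn mutate(table: &mut ToyTable, table_key: &str, ops: &[bool]) -> Result<(), MutationError> {
    // Assumed check: the writer expects Lance HEAD to equal the manifest version.
    let expected = table.manifest_version;
    if table.lance_head != expected {
        return Err(MutationError::ExpectedVersionMismatch {
            table_key: table_key.to_string(),
            expected,
            actual: table.lance_head,
        });
    }
    for (op_index, &succeeds) in ops.iter().enumerate() {
        if !succeeds {
            // Earlier writes are not rolled back: HEAD stays ahead of the manifest.
            return Err(MutationError::OpFailed { op_index });
        }
        table.lance_head += 1; // the op wrote a fragment
    }
    table.manifest_version = table.lance_head; // single publish at the end
    Ok(())
}
```

Starting from a table at version 3 on both sides, `mutate(&mut t, "node:Person", &[true, false])` fails at op 1 and leaves HEAD at 4 with the manifest still at 3, so the next call fails with `ExpectedVersionMismatch { expected: 3, actual: 4 }`. A single failing op (`&[false]`) writes nothing, which matches the note that single-statement mutations are unaffected.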

Upgrade notes

  • Stale __run__* branches and _graph_runs.lance in legacy v0.3.x repos are inert — the engine no longer reads them — but they remain on disk until production cleanup. This release deliberately does not touch legacy bytes.
  • The is_internal_run_branch predicate is kept as a defense-in-depth guard against users naming a branch __run__*. It will be removed in a follow-up cleanup.
  • External scripts hitting /runs/* will now receive 404. Migrate them to /commits for audit history; mutation status is implied by the HTTP response on /change itself.

Included Changes

  • Demote Run: write directly to target via publisher
  • ManifestBatchPublisher::publish accepts per-table expected_table_versions