mirror of https://github.com/ModernRelay/omnigraph.git
synced 2026-06-12 01:45:14 +02:00
MR-771: demote Run to direct-publish via expected_table_versions CAS
mutate_as and load now write directly to target tables and call the
publisher once at the end with per-table expected versions; the Run
state machine, _graph_runs.lance writers, __run__ staging branches, and
server /runs/* endpoints are removed.

Multi-statement mutations remain atomic at the manifest level via an
in-memory MutationStaging accumulator that gives read-your-writes within
a query and a single publish at the end. Concurrent-writer conflicts
surface as ExpectedVersionMismatch (HTTP 409 manifest_conflict) instead
of the old DivergentUpdate merge shape.

Documents one known limitation in docs/runs.md: a multi-statement
mid-query failure where op-N writes a Lance fragment and op-N+1 fails
leaves Lance HEAD ahead of the manifest until a follow-up introduces
per-table Lance branches.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent 4e5374a85e
commit 35be20cb05
28 changed files with 1188 additions and 3216 deletions
89
docs/releases/v0.4.0.md
Normal file
@@ -0,0 +1,89 @@
# Omnigraph v0.4.0

Omnigraph v0.4.0 demotes the Run state machine to commit metadata via the
publisher's CAS, fixing the cancellation hole that motivated MR-771 and
reducing the engine's surface area.

## Highlights

- **Direct-to-target writes (MR-771)**: `mutate_as` and `load` write
  directly to the target tables and call
  `ManifestBatchPublisher::publish` once at the end with
  `expected_table_versions`. No more `__run__<id>` staging branches, no
  more `RunRecord` state machine. Cross-table OCC is enforced inside the
  publisher's row-level CAS on `__manifest`.
- **Cancellation safety by construction**: a dropped mutation future
  leaves no graph-level state, only orphaned Lance fragments, which are
  reclaimed by `omnigraph cleanup`. The "zombie run" cascade documented in
  `.context/zombie-run-investigation.md` is gone.
- **Read-your-writes inside multi-statement mutations**: a `.gq` query
  that inserts a row and then references it in a later statement of the
  same query now sees its own writes via an in-process `MutationStaging`
  cache, even though no manifest commit happens between ops.
- **Structured conflict surface**: concurrent writers race through the
  publisher's CAS; the loser surfaces as
  `ManifestConflictDetails::ExpectedVersionMismatch { table_key,
  expected, actual }`. The HTTP server maps this to **409 Conflict** with
  a structured `manifest_conflict` body so clients can detect-and-retry
  without parsing the message.

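The compare-and-swap described in the highlights can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `ToyManifest`, `race_two_writers`, and the table key `node:Person` are hypothetical names invented for illustration, and the error struct merely mirrors the `table_key` / `expected` / `actual` fields named above. It is not the actual `ManifestBatchPublisher` implementation.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the conflict described above; the field names
/// mirror `ExpectedVersionMismatch { table_key, expected, actual }`.
#[derive(Debug, PartialEq)]
pub struct ExpectedVersionMismatch {
    pub table_key: String,
    pub expected: u64,
    pub actual: u64,
}

/// Toy manifest (assumption): table key -> currently published version.
#[derive(Default)]
pub struct ToyManifest {
    versions: HashMap<String, u64>,
}

impl ToyManifest {
    /// Current version of a table (0 if it was never published).
    pub fn version(&self, table_key: &str) -> u64 {
        self.versions.get(table_key).copied().unwrap_or(0)
    }

    /// All-or-nothing publish: check every expected version first, then
    /// apply every update. Any mismatch rejects the whole batch.
    pub fn publish(
        &mut self,
        expected: &HashMap<String, u64>,
        new_versions: &HashMap<String, u64>,
    ) -> Result<(), ExpectedVersionMismatch> {
        for (table_key, &want) in expected {
            let actual = self.version(table_key);
            if actual != want {
                return Err(ExpectedVersionMismatch {
                    table_key: table_key.clone(),
                    expected: want,
                    actual,
                });
            }
        }
        for (table_key, &v) in new_versions {
            self.versions.insert(table_key.clone(), v);
        }
        Ok(())
    }
}

/// Two writers snapshot the same version; the second publish loses the race.
pub fn race_two_writers() -> (
    Result<(), ExpectedVersionMismatch>,
    Result<(), ExpectedVersionMismatch>,
) {
    let mut manifest = ToyManifest::default();
    let key = "node:Person".to_string(); // hypothetical table key
    // Both writers observed version 0 before writing.
    let expected: HashMap<String, u64> = HashMap::from([(key.clone(), 0)]);
    let a = manifest.publish(&expected, &HashMap::from([(key.clone(), 1)]));
    let b = manifest.publish(&expected, &HashMap::from([(key.clone(), 1)]));
    (a, b)
}
```

In this model the first writer succeeds and the second receives a mismatch with `expected: 0, actual: 1`. A client could then re-read the current version and retry, which is the detect-and-retry pattern the structured 409 body is meant to support.
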
## Removed

This is a breaking release. Pre-0.4.0 / no SLA.

- `omnigraph::db::{RunRecord, RunStatus, RunId}` types and the
  `_graph_runs.lance` / `_graph_run_actors.lance` Lance datasets.
- Engine APIs `begin_run`, `begin_run_as`, `publish_run`,
  `publish_run_as`, `abort_run`, `fail_run`, `terminate_run`,
  `list_runs`, `get_run`.
- HTTP endpoints: `GET /runs`, `GET /runs/{run_id}`, `POST
  /runs/{run_id}/publish`, `POST /runs/{run_id}/abort`. The
  `RunListOutput` and `RunOutput` schemas are removed from the OpenAPI
  document.
- CLI subcommands: `omnigraph run list`, `omnigraph run show`, `omnigraph
  run publish`, `omnigraph run abort`. For audit history, use
  `omnigraph commit list`, which reads the commit graph.
- Cedar policy actions `run_publish` and `run_abort`. Existing
  `policy.yaml` files referencing these actions will fail validation.
  Remove the rules; the `change` action covers the equivalent gating.

## Behavior changes

- `mutate_as` / `load` are now **atomic per query, single publish at the
  end**. A failed mutation leaves the target's manifest state unchanged,
  with no intermediate manifest commits (see "Known limitation" for the
  Lance-level exception).
- The `OmniError::manifest_conflict` shape produced by concurrent
  writers is now `ExpectedVersionMismatch` (was `MergeConflict::DivergentUpdate`
  via the run merge path). Clients that match on the conflict body must
  switch to inspecting `manifest_conflict.table_key/expected/actual`.

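The "atomic per query, single publish at the end" behavior can be illustrated with a minimal staging overlay. This is a sketch under stated assumptions: `Committed`, `Staging`, and `run_query` are hypothetical names, rows are reduced to string key/value pairs, and an empty value stands in for a failing statement. It models only manifest-level visibility and does not model Lance fragment writes, so it says nothing about the known limitation described in the next section.

```rust
use std::collections::HashMap;

/// Hypothetical committed state: row id -> value, visible to every reader.
#[derive(Default)]
pub struct Committed {
    pub rows: HashMap<String, String>,
}

/// Hypothetical in-memory accumulator for one query's pending writes.
#[derive(Default)]
pub struct Staging {
    pending: HashMap<String, String>,
}

impl Staging {
    /// Buffer a write; nothing is visible outside this query yet.
    pub fn insert(&mut self, id: &str, value: &str) {
        self.pending.insert(id.to_string(), value.to_string());
    }

    /// Read-your-writes: pending rows shadow committed rows.
    pub fn get<'a>(&'a self, committed: &'a Committed, id: &str) -> Option<&'a str> {
        self.pending
            .get(id)
            .or_else(|| committed.rows.get(id))
            .map(String::as_str)
    }

    /// Single publish at the end: move every pending row into committed state.
    pub fn publish(self, committed: &mut Committed) {
        committed.rows.extend(self.pending);
    }
}

/// Run all statements of one query. On the first failure the staging buffer
/// is dropped, so the committed state is left untouched.
pub fn run_query(
    committed: &mut Committed,
    statements: &[(&str, &str)],
) -> Result<usize, String> {
    let mut staging = Staging::default();
    for (id, value) in statements {
        // Assumption for the sketch: an empty value means the statement fails.
        if value.is_empty() {
            return Err(format!("statement for {id} failed validation"));
        }
        staging.insert(id, value);
    }
    let written = staging.pending.len();
    staging.publish(committed);
    Ok(written)
}
```

With this shape, a query whose second statement fails publishes nothing, while a later statement in a successful query can read rows staged by an earlier one through `Staging::get`.
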
## Known limitation

A multi-statement mutation that writes a Lance fragment in op-N and then
fails in op-N+1 leaves the touched table with Lance HEAD ahead of the
manifest. The next mutation against that table fails with
`ExpectedVersionMismatch`. Most validation runs before any Lance write,
so single-statement mutations are unaffected; the narrow path is
multi-statement queries with late-op failures. Tracked as a follow-up;
see [docs/runs.md](../runs.md#known-limitation-mid-query-partial-failure-on-the-same-table)
for the workaround.

## Upgrade notes

- **Stale `__run__*` branches and `_graph_runs.lance`** in legacy v0.3.x
  repos are *inert* (the engine no longer reads them), but they remain
  on disk until production cleanup. MR-770 owns the destructive sweep;
  this release deliberately does not touch legacy bytes.
- The `is_internal_run_branch` predicate is kept as a defense-in-depth
  guard against users naming a branch `__run__*`. It will be removed in
  a follow-up alongside MR-770.
- External scripts hitting `/runs/*` will now receive 404. Migrate them
  to `/commits` for audit history; mutation status is implied by the
  HTTP response on `/change` itself.

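As an illustration of the kind of guard the second note refers to, a reserved-prefix check might look like the sketch below. The function names `has_reserved_run_prefix` and `validate_branch_name` are hypothetical, and the real `is_internal_run_branch` predicate may be implemented differently; the only detail taken from the notes is the `__run__` prefix.

```rust
/// Hypothetical guard (assumption): treats any branch name starting with
/// `__run__` as reserved. The real `is_internal_run_branch` may differ.
pub fn has_reserved_run_prefix(branch: &str) -> bool {
    branch.starts_with("__run__")
}

/// Reject user-supplied branch names that collide with the legacy prefix.
pub fn validate_branch_name(branch: &str) -> Result<(), String> {
    if has_reserved_run_prefix(branch) {
        return Err(format!(
            "branch name {branch:?} uses the reserved __run__ prefix"
        ));
    }
    Ok(())
}
```
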
## Included Changes

- MR-771 - Demote Run: write directly to target via publisher
- MR-766 - `ManifestBatchPublisher::publish` accepts per-table
  `expected_table_versions` (landed earlier; this release wires it in
  end-to-end)