omnigraph/docs/releases/v0.4.2.md
2026-05-10 14:37:58 +00:00

Omnigraph v0.4.2

Omnigraph v0.4.2 is a concurrency, admission-control, and release-hygiene release. It removes the server-global write lock, lets disjoint writers make progress concurrently, adds per-actor admission limits, hardens branch and mutation races with snapshot-isolation fences, and rewrites the release documentation for a public, open-source audience.

Highlights

  • Unlocked server engine handle: the HTTP server now holds the engine behind a shared handle instead of a server-global write lock. Concurrent handlers can call engine APIs directly while the engine serializes only the resources that actually conflict.
  • Engine-owned writer queues: writers that target the same (table, branch) pair are serialized by per-(table, branch) writer queues inside the engine, while writes to disjoint tables or branches can run concurrently. This narrows contention without relying on route handlers to know storage-level ordering rules.
  • Per-actor admission control: mutating HTTP handlers are gated by a WorkloadController with per-actor in-flight request and estimated-byte budgets. Rejections use HTTP 429 with code: too_many_requests and a Retry-After header, so noisy actors back off without blocking unrelated actors.
  • Admission coverage for all mutating handlers: /change, /ingest, /schema/apply, branch create/delete, and branch merge now flow through the admission controller. Read-only endpoints are not admission-gated.
  • Op-kind-aware version checks: mutation commit-time drift checks distinguish append-like inserts from strict update/delete work. Inserts remain permissive enough for safe concurrent append patterns; updates and deletes get stricter stale-view rejection.
  • Read-time drift checks for strict mutations: staged mutations compare the manifest pin captured when the query opened against the manifest snapshot captured under table-queue ownership. If a concurrent writer moved the table after the query read, the stale writer returns a structured manifest_conflict 409 instead of staging work computed against an old snapshot.
  • Inline-delete recovery coverage: delete-only mutations still use Lance's inline delete path, but their recovery sidecar is now written before the manifest-version rejection path can return. If a delete moves Lance HEAD and a concurrent manifest update makes the query stale, the next read-write open can roll the residual back rather than leaving a head-ahead-of-manifest table.
  • Branch-operation race hardening: branch creation and branch merge avoid coordinator swap-restore races that could expose the wrong active branch to concurrent work. Concurrent branch merges are serialized by a merge mutex.
  • Branch-merge target revalidation: merges re-check target table versions after acquiring target write queues. A stale merge plan returns a structured conflict instead of overwriting concurrent target-branch changes or adopting a source table over newly appended target rows.
  • Schema refresh deadlock fix: recovery refresh releases the write guard before schema reload, preventing a refresh/schema-apply deadlock.
  • Lean admission API: removed the unused global rewrite admission pool, service_unavailable error variant, related 503 documentation, and benchmark flag. The public server surface now reflects only admission behavior that is wired to handlers.
  • Open-source release hygiene: this release adds guidance for public-facing documentation, release notes, and version bumps. Release docs now avoid private issue tracker references and use stable public descriptions instead.
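
The engine-owned writer queues described above can be pictured with a small sketch. It is not Omnigraph's implementation: `WriterQueues`, `queue_for`, and `with_writer` are hypothetical names, and a plain `std::sync::Mutex` stands in for whatever queueing the engine actually uses. The only point it illustrates is that the lock is keyed by (table, branch), so same-key writers wait on each other and disjoint writers do not.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical key type: one queue per (table, branch) pair.
type QueueKey = (String, String);

/// Illustrative registry of per-(table, branch) writer locks.
/// Assumption: a Mutex per key is enough to model "one writer at a time".
#[derive(Default)]
pub struct WriterQueues {
    queues: Mutex<HashMap<QueueKey, Arc<Mutex<()>>>>,
}

impl WriterQueues {
    /// Return the lock for a (table, branch) pair, creating it on first use.
    /// The registry lock is held only for the map lookup, never during a write.
    pub fn queue_for(&self, table: &str, branch: &str) -> Arc<Mutex<()>> {
        let mut map = self.queues.lock().unwrap();
        let queue = map
            .entry((table.to_string(), branch.to_string()))
            .or_default();
        Arc::clone(queue)
    }

    /// Run `write` while owning the queue for (table, branch).
    pub fn with_writer<T>(&self, table: &str, branch: &str, write: impl FnOnce() -> T) -> T {
        let queue = self.queue_for(table, branch);
        let _guard = queue.lock().unwrap();
        write()
    }
}
```

In this sketch, two callers of `with_writer("person", "main", ...)` run one after the other, while a caller of `with_writer("company", "main", ...)` takes a different lock and proceeds independently.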

Behavior changes

  • Disjoint mutating HTTP requests can now make progress concurrently instead of queueing behind one process-wide engine write lock.
  • Mutating handlers may return HTTP 429 when an actor exceeds per-actor in-flight or estimated-byte budgets. Clients should respect Retry-After and retry later.
  • Concurrent update/delete and merge races now return structured manifest_conflict 409 responses in more stale-view cases instead of relying on later publisher-CAS detection or allowing a stale plan to proceed.
  • A branch merge that runs concurrently with a /change on the same target branch may either succeed or return a clean 409 conflict, depending on which operation wins the queue.
  • OMNIGRAPH_GLOBAL_REWRITE_MAX is no longer recognized. Remove it from deployment manifests and use the per-actor in-flight and byte-budget admission settings, which are the only admission controls currently wired to handlers.
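
The per-actor budgets behind the new 429 responses can be sketched as follows. This is a minimal model under stated assumptions, not the real WorkloadController: `AdmissionSketch`, `ActorLimits`, `TooManyRequests`, and their fields are hypothetical names, and the fixed `retry_after` value is an assumption about how a back-off hint might be chosen.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::Duration;

/// Hypothetical per-actor limits; the real setting names are not given in these notes.
#[derive(Clone, Copy)]
pub struct ActorLimits {
    pub max_in_flight: usize,
    pub max_bytes: u64,
    pub retry_after: Duration,
}

/// Budget currently consumed by one actor.
#[derive(Default)]
struct ActorUsage {
    in_flight: usize,
    bytes: u64,
}

/// Rejection that a handler would map to HTTP 429 plus a Retry-After header.
#[derive(Debug, PartialEq)]
pub struct TooManyRequests {
    pub retry_after: Duration,
}

/// Illustrative stand-in for an admission controller.
pub struct AdmissionSketch {
    limits: ActorLimits,
    usage: Mutex<HashMap<String, ActorUsage>>,
}

impl AdmissionSketch {
    pub fn new(limits: ActorLimits) -> Self {
        Self { limits, usage: Mutex::new(HashMap::new()) }
    }

    /// Admit one request for `actor`, or reject it if either budget would be exceeded.
    /// Usage is tracked per actor, so one actor's rejection never affects another.
    pub fn try_admit(&self, actor: &str, est_bytes: u64) -> Result<(), TooManyRequests> {
        let mut usage = self.usage.lock().unwrap();
        let entry = usage.entry(actor.to_string()).or_default();
        let over_count = entry.in_flight >= self.limits.max_in_flight;
        let over_bytes = entry.bytes.saturating_add(est_bytes) > self.limits.max_bytes;
        if over_count || over_bytes {
            return Err(TooManyRequests { retry_after: self.limits.retry_after });
        }
        entry.in_flight += 1;
        entry.bytes += est_bytes;
        Ok(())
    }

    /// Return the budget when the request finishes.
    pub fn release(&self, actor: &str, est_bytes: u64) {
        let mut usage = self.usage.lock().unwrap();
        if let Some(entry) = usage.get_mut(actor) {
            entry.in_flight = entry.in_flight.saturating_sub(1);
            entry.bytes = entry.bytes.saturating_sub(est_bytes);
        }
    }
}
```

A mutating handler in this model would call `try_admit` before doing any work, translate an `Err` into a 429 response carrying `retry_after`, and call `release` once the request completes.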

Upgrade notes

  • No repository migration is required. Existing v0.4.1 repos can be opened directly with v0.4.2.
  • Clients should treat manifest_conflict 409 responses as retryable stale-view conflicts. This was already the documented contract, but this release uses it in more concurrent-write paths.
  • Clients should handle HTTP 429 from every mutating endpoint, not only /change. Honor the Retry-After header.
  • Operators should remove stale references to global rewrite admission and 503 rewrite-pool exhaustion from local runbooks.
  • If you maintain public docs or release notes, use public identifiers and user-facing descriptions rather than private tracker IDs.
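
On the client side, the two retryable responses named above can be told apart with a small decision function. The sketch below assumes the client has already extracted the status, the error code string, and the raw Retry-After value; `RetryDecision` and `classify` are hypothetical names. It parses only the delay-in-seconds form of Retry-After and falls back to one second otherwise, which is an assumption rather than documented server behavior.

```rust
use std::time::Duration;

/// What a client should do after a mutating request returns.
#[derive(Debug, PartialEq)]
pub enum RetryDecision {
    /// The request succeeded.
    Done,
    /// Admission rejected the request; wait, then resend it unchanged.
    RetryAfter(Duration),
    /// The write was computed against a stale view; re-read, then retry.
    RefreshAndRetry,
    /// Any other failure is surfaced to the caller.
    Fail,
}

/// Classify a response from a mutating endpoint.
/// `error_code` is the `code` field of the error body, if any.
/// `retry_after` is the raw Retry-After header value, if any.
pub fn classify(status: u16, error_code: Option<&str>, retry_after: Option<&str>) -> RetryDecision {
    match (status, error_code) {
        (200..=299, _) => RetryDecision::Done,
        (429, _) => {
            // Assumption: default to 1 second when the header is missing
            // or is not a plain number of seconds.
            let secs = retry_after
                .and_then(|v| v.trim().parse::<u64>().ok())
                .unwrap_or(1);
            RetryDecision::RetryAfter(Duration::from_secs(secs))
        }
        (409, Some("manifest_conflict")) => RetryDecision::RefreshAndRetry,
        _ => RetryDecision::Fail,
    }
}
```

Keeping the two cases separate matters because a 429 can be resent as-is after the delay, whereas a manifest_conflict means the client's view of the data is stale and the mutation should be recomputed before retrying.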

Tests added or strengthened

  • Regression tests for update read-your-writes under in-process concurrency.
  • HTTP tests for same-key insert snapshots, disjoint /change concurrency, and /ingest admission 429 + Retry-After.
  • Branch-operation regression tests for branch-create swap-restore races, concurrent /change + branch-merge interleavings, branch-merge swap-restore races, branch-op matrix coverage, and post-reopen consistency.
  • Failpoint-backed regression coverage for inline-delete recovery sidecar creation before version-mismatch rejection.
  • Admission tests use injectable WorkloadController state instead of mutating process environment.

Included changes

  • Shared server engine state and per-actor admission on mutating endpoints.
  • Per-(table, branch) writer queues and op-kind-aware manifest drift checks.
  • Strict read-time version checks for updates/deletes.
  • Branch create/merge race hardening and branch-merge target snapshot revalidation under queue ownership.
  • Retry-After support for admission rejections and OpenAPI updates for reachable 429 responses.
  • Actor-isolation benchmark harness updates for the current admission controller.
  • Removal of the unwired global rewrite admission / 503 server surface.
  • Version bump to 0.4.2 across workspace crates, Cargo.lock, and openapi.json.
  • Public release-note cleanup and new OSS best-practice guidance in AGENTS.md.