diff --git a/docs/releases/v0.8.0.md b/docs/releases/v0.8.0.md
index 18d3dd71..fca48574 100644
--- a/docs/releases/v0.8.0.md
+++ b/docs/releases/v0.8.0.md
@@ -1,114 +1,43 @@
 # Omnigraph v0.8.0
 
-This release moves the graph commit lineage into `__manifest` (RFC-013 Phase 7),
-retires the two legacy commit-graph datasets, unifies constraint validation across
-every write surface (stricter in a few cases), and surfaces the storage-format
-version to operators. It is the first release with an internal-schema change since
-v0.4.0, and storage is **strict-single-version**: there is no in-place migration —
-a graph from an older release is rebuilt via export/import. Read the upgrade notes
-and the stricter-validation notes before rolling it out.
+The first release with an on-disk format change since v0.4.0. Graph commit
+lineage now lives in `__manifest`, the two legacy commit-graph datasets are
+gone, constraint validation is unified (and stricter in a few cases), and the
+storage-format version is now visible to operators. **Storage is
+strict-single-version: a graph from an older release is rebuilt via
+export/import, not migrated in place.** Read the upgrade notes before rolling out.
 
-## Graph lineage now lives in `__manifest` (internal schema v4)
+## Highlights
 
-The graph commit DAG (commits, parents, merge parents, per-branch heads, and the
-authoring actor) is now stored in `__manifest` as `graph_commit` / `graph_head`
-rows, written in the **same commit (CAS)** as the table-version rows of a graph
-publish. Previously the lineage lived in a separate `_graph_commits.lance` dataset
-written after the manifest commit, leaving a narrow window where a crash could land
-a manifest version with no matching lineage row. Folding the lineage into the
-publish closes that gap by construction: a graph commit and its lineage now land
-atomically at one manifest version. The in-memory commit graph is a pure projection
-of those manifest rows.
+- **Graph lineage moved into `__manifest` (internal schema v4).** The commit DAG
+  (commits, parents, branch heads, actor) is now written in the same atomic
+  commit as a graph's table-version rows, closing the crash window where a
+  manifest version could exist without its lineage. The two legacy datasets
+  (`_graph_commits.lance`, `_graph_commit_actors.lance`) are no longer created or
+  read — one fewer moving part and two fewer directory listings per open.
 
-This bumps the `__manifest` internal schema stamp to **v4**.
+- **Stricter, unified constraint validation.** The loader, mutation, and
+  branch-merge paths now share one validator, so they can no longer drift. All
+  changes are stricter (none relax an existing check): enum constraints are
+  enforced on merge, a `@unique` value that collides with an already-committed
+  different row is rejected, duplicate-key semantics are precise (a key repeated
+  within one input batch is rejected; the same id across batches coalesces), and
+  overwrite loads validate per touched table. A pipeline that unknowingly relied
+  on one of these gaps will now fail loudly at write time.
 
-## The `_graph_commits.lance` / `_graph_commit_actors.lance` tables are retired
+- **Storage-format version is visible.** `omnigraph version` prints the format
+  version this binary serves, `omnigraph snapshot` reports a graph's on-disk
+  version, and the server `GET /healthz` response includes it.
 
-With lineage in `__manifest` and branch authority already on the manifest, the two
-legacy commit-graph datasets are no longer created, read, or written. A graph this
-release creates has neither. This removes two cold-open directory listings per
-graph open and simplifies maintenance (`optimize` compacts only `__manifest` among
-the internal tables now).
+- **Prebuilt linux-arm64 (aarch64) binaries** now ship alongside Linux x86_64,
+  macOS arm64, and Windows x86_64, with a matching Homebrew bottle.
 
-## Strict-single-version storage: rebuild via export/import, no in-place migration
+## Upgrade
 
-Internal schema v4 is a hard version gate, enforced in **both** directions on every
-open (read-write and read-only):
-
-- A graph stamped **below v4** (created by an older release whose storage format
-  this binary does not read) is refused with a **rebuild-via-export/import**
-  message. There is no in-place upgrade: export the graph with the older binary,
-  then `omnigraph init` + `omnigraph load` with this one. Data, vectors, and blobs
-  are preserved; commit history and branches are not carried over. See the
+- **Upgrade every binary that touches a graph to v0.8.0 together.** Older-format
+  graphs are refused on open in both directions: a newer binary refuses an
+  old-format graph, and an old binary refuses this one.
+- **Any pre-v0.8.0 graph is rebuilt, not migrated:** export it with the old binary,
+  then `init` + `load` with v0.8.0. Data, vectors, and blobs are preserved;
+  commit history and branches are not carried over. See the
   [upgrade guide](../user/operations/upgrade.md).
-- A graph stamped **above v4** (created by a newer release) is refused with an
-  **`upgrade omnigraph before opening this graph`** error, so an old binary cannot
-  silently misread a newer format.
-
-This replaces the speculative one-time on-disk migration that earlier drafts of
-this release described. The rationale (lower long-term liability than carrying
-in-place migration code for a pre-release format) is in
-[docs/dev/versioning.md](../dev/versioning.md).
-
-## See the storage-format version
-
-Operators can now read the internal-schema version directly instead of discovering
-it through a refusal:
-
-- `omnigraph version` prints an `internal-schema ` line (the version this binary
-  serves).
-- `omnigraph snapshot` reports the opened graph's on-disk `internal_schema_version`.
-- The server `GET /healthz` response includes `internal_schema_version` (the
-  binary's served version).
-
-## Stricter, unified constraint validation (#314)
-
-Constraint enforcement — value/range/`@check`, enum, `@unique`, edge referential
-integrity, and cardinality — was previously implemented separately in the bulk
-loader, the mutation executor, and the branch-merge path, and had drifted. All three
-write surfaces now route through one catalog-derived evaluator, so they can no longer
-diverge. The evaluator is delta-scoped (it checks only the change set) and
-index-backed (it probes committed state through the `@key`/`@unique`/`src`/`dst`
-BTREEs instead of scanning every catalog table), so a one-row merge opens ~3 data
-tables instead of 6+ and validation cost is flat in graph size rather than O(V+E).
-
-The unification closes real gaps. **These behavior changes are all stricter — none
-relax an existing check** — so a graph that already satisfies its schema is
-unaffected, but inputs that previously slipped through are now rejected before the
-commit:
-
-- **Enum constraints are enforced on the merge path** (previously a gap: merge
-  validated `@range`/`@check` but not enum).
-- **Cross-version uniqueness is enforced.** A write or load whose `@unique` value
-  collides with an already-committed *different* row visible to that write is now
-  rejected. Re-upserting an existing `@key` still upserts as before.
-- **Duplicate-key semantics are precise.** A `@key` that appears twice as two
-  distinct records *within one input batch* (e.g. a bulk load listing the same key
-  twice) is rejected; the *same* id reappearing *across* batches (e.g. an
-  insert-then-update in one mutation) is coalesced as ordered supersession of one
-  logical row.
-- **Overwrite loads validate per touched table.** A table in the overwrite batch is
-  validated against its replacement image (an empty committed view); a table absent
-  from the batch keeps its committed rows, so an edges-only overwrite still resolves
-  referential integrity against the retained nodes.
-
-If an ingestion pipeline unknowingly relied on one of these gaps — a duplicate
-`@unique` value, or an enum violation reaching a branch through merge — it will now
-fail loudly at write time. Validate load inputs against the schema before upgrading
-if in doubt.
-
-## New prebuilt platform: linux-arm64 (#316)
-
-Tagged releases now ship an `omnigraph-linux-arm64` (aarch64) archive alongside the
-existing Linux x86_64, macOS arm64, and Windows x86_64 builds, and the Homebrew
-formula carries a matching `on_linux`/`on_arm` bottle. The install script maps
-Linux/aarch64 to the new asset, so aarch64 Linux is now a first-class prebuilt
-target instead of build-from-source.
-
-## Upgrade order
-
-Upgrade every binary that touches a graph to v0.8.0 together. A mixed fleet where an
-older binary still writes a graph another has stamped v4 is unsupported, as with any
-internal-schema bump. To move a pre-v4 graph forward, follow the
-[upgrade guide](../user/operations/upgrade.md): export with the old binary, then
-`init` + `load` with v0.8.0.
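
Note on the version gate described in this diff: the two-directional refusal (an older-format graph is refused with a rebuild message, a newer-format graph with an upgrade message) can be illustrated with a minimal Python sketch. This is not Omnigraph's code. `check_internal_schema`, `StorageVersionError`, and `SERVED_INTERNAL_SCHEMA` are hypothetical names chosen for illustration; only the served version (v4) and the two message phrases come from the release notes.

```python
# Hypothetical sketch of a strict-single-version gate; names are illustrative
# assumptions, not Omnigraph's actual API.

SERVED_INTERNAL_SCHEMA = 4  # the release notes state internal schema v4


class StorageVersionError(Exception):
    """Raised when a graph's on-disk stamp differs from the served version."""


def check_internal_schema(on_disk: int, served: int = SERVED_INTERNAL_SCHEMA) -> None:
    """Refuse to open a graph unless its stamp equals the served version."""
    if on_disk < served:
        # Older format: no in-place migration, so the graph must be rebuilt.
        raise StorageVersionError(
            f"graph is internal schema v{on_disk}, this binary serves v{served}: "
            "rebuild via export/import"
        )
    if on_disk > served:
        # Newer format: an old binary must not guess at a layout it cannot read.
        raise StorageVersionError(
            f"graph is internal schema v{on_disk}, this binary serves v{served}: "
            "upgrade omnigraph before opening this graph"
        )
    # on_disk == served: the open may proceed.
```

Because the check is an equality test with two distinct failure messages, the same function would apply to read-only and read-write opens alike, which matches the "both directions on every open" wording in the removed section.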