Merge branch 'main' into ragnorc/omnigraph-mcp-crate

Folds in v0.7.2 (release #301) + RFC-013 Phase 7 (graph lineage in __manifest,
internal schema v3→v4 migration #299; WriteTxn #298; recovery convergence #296)
under the MCP branch.

Conflict resolutions (2 files):
- crates/omnigraph-server/Cargo.toml: take main's 0.7.2 path-dep constraints;
  keep our omnigraph-mcp dep (bumped to 0.7.2).
- docs/releases/v0.8.0.md (add/add): both branches drafted v0.8.0 notes for the
  same next minor — combined them. v0.8.0 now documents BOTH the MCP surface
  (ours) and main's __manifest lineage fold + the breaking internal-schema-v4
  upgrade-order requirement (kept prominent under Upgrade notes). Corrected our
  'no breaking changes / on-disk format unchanged' line, which the v4 migration
  makes false.

Coherence: omnigraph-mcp [package] + Cargo.lock bumped 0.7.1→0.7.2; openapi.json
auto-merged to info.version 0.7.2 (no API-surface drift from the incoming
engine-internal commits). Verification deferred to CI (no local rebuild).
Ragnor Comerford 2026-06-25 15:53:53 +02:00
commit 4d4c2164de
62 changed files with 5898 additions and 1053 deletions

docs/releases/v0.7.2.md Normal file

@ -0,0 +1,60 @@
# Omnigraph v0.7.2
A patch release over v0.7.1: write-path latency reductions plus three
correctness fixes on the maintenance and recovery paths. No breaking changes, no
on-disk format change, and no migration — drop-in over v0.7.1.
## Performance
- **Write opens go direct, schema validates once (#288, #298).** Write opens
used to route through the per-table Lance namespace catalog, which re-opened
the dataset just to read its location and re-resolved the latest version on
every table open — an O(commit-depth) double resolution that dominated write
latency on object stores (~70%). Writes now open each touched data table
directly by its manifest-recorded location (Lance's O(1) version-hint path),
validate the schema contract once per write instead of ~4×, and open each
touched table once instead of 4×.
- **`optimize` compacts the internal metadata tables (#291).** `optimize`
previously iterated only node/edge tables, so the internal `__manifest`,
`_graph_commits`, and `_graph_commit_actors` tables accumulated one fragment
per commit and were never compacted — making every write's metadata scan grow
with commit history. `optimize` now compacts all three, so a periodically
optimized long-lived graph keeps its per-write metadata scan flat in history.
## Fixes
- **`optimize` survives a cross-process write race (#297).** A CLI `optimize`
racing a served write on the same table could fail: the in-process write queue
doesn't serialize across processes, so a concurrent insert/delete advancing the
manifest between optimize's compaction and its publish broke the strict
equality CAS. Optimize now reopens-and-replans on a genuine Lance conflict and
fast-forwards its publish monotonically, so a maintenance compaction never
fails a live write. The retry is bounded; under sustained contention, optimize
surfaces an explicit conflict error rather than silently dropping work.
- **`optimize` is non-destructive on upgraded graphs (#291).** A graph created by
a pre-0.7.0 binary carries an on-by-default Lance auto-cleanup config; under it,
optimize's compaction commit could fire Lance's version-GC hook and prune
`__manifest`-pinned versions (breaking snapshots and time travel). Optimize now
strips any stale `lance.auto_cleanup.*` config off every table — data and
internal — before its HEAD-advancing commits, so compaction can never GC pinned
versions.
- **Recovery converges instead of failing `open` under a concurrent manifest
advance (#296).** The open-time recovery sweep published its roll-forward at the
sidecar's pinned expected version; if another writer advanced the manifest
during the classify→publish window, the CAS failed and aborted the whole
`Omnigraph::open`. The sweep now treats roll-forward as "the manifest reflects
the sidecar's committed state," not "this sweep won the CAS": on a CAS loss it
re-reads the live manifest and, when the sidecar's intent is already satisfied,
records the recovery and deletes the sidecar idempotently — so a concurrent
advance no longer fails the open. (The destructive roll-back twin still defers
to a cross-process lease, as documented.)
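
The recovery-convergence fix (#296) can be pictured with a toy model. The sketch below is illustrative only and is not Omnigraph's code: `ToyManifest`, `ToySidecar`, `RecoveryOutcome`, and `roll_forward` are hypothetical names, "intent satisfied" is assumed to mean "the live manifest already records at least the sidecar's table version", and the `Conflict` branch is a placeholder for behavior these notes do not describe.

```rust
use std::collections::BTreeMap;

/// Toy manifest: a version counter plus the committed version of each table.
#[derive(Debug, Default)]
struct ToyManifest {
    version: u64,
    tables: BTreeMap<String, u64>,
}

/// Toy recovery sidecar: the manifest version pinned when the sweep
/// classified it, and the table version it intends to publish.
struct ToySidecar {
    expected_version: u64,
    table: String,
    committed_table_version: u64,
}

#[derive(Debug, PartialEq)]
enum RecoveryOutcome {
    /// This sweep won the CAS and published the roll-forward itself.
    RolledForward,
    /// The CAS was lost, but the live manifest already reflects the intent.
    AlreadySatisfied,
    /// The CAS was lost and the intent is not reflected (placeholder branch).
    Conflict,
}

impl ToyManifest {
    /// Strict-equality CAS: publish only if the manifest is still at `expected`.
    fn cas_publish(&mut self, expected: u64, table: &str, table_version: u64) -> bool {
        if self.version != expected {
            return false;
        }
        self.tables.insert(table.to_string(), table_version);
        self.version += 1;
        true
    }
}

/// Roll-forward succeeds when the manifest reflects the sidecar's committed
/// state, whether or not this particular sweep won the CAS.
fn roll_forward(manifest: &mut ToyManifest, sidecar: &ToySidecar) -> RecoveryOutcome {
    if manifest.cas_publish(
        sidecar.expected_version,
        &sidecar.table,
        sidecar.committed_table_version,
    ) {
        return RecoveryOutcome::RolledForward;
    }
    // CAS lost: re-read the live manifest instead of aborting the open.
    match manifest.tables.get(&sidecar.table) {
        // Assumption: "satisfied" means the recorded version is at least the intended one.
        Some(&live) if live >= sidecar.committed_table_version => RecoveryOutcome::AlreadySatisfied,
        _ => RecoveryOutcome::Conflict,
    }
}
```

In this model a second sweep over the same sidecar loses the CAS but still returns `AlreadySatisfied`, which is the idempotent outcome the fix describes.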
## Upgrade notes
Drop-in over v0.7.1 — no configuration, schema, or data changes. Upgrade the
server and CLI together as usual. Graphs created on v0.7.1 read and write
identically on v0.7.2; the optimize non-destructive fix additionally protects
graphs created by pre-0.7.0 binaries from version GC during compaction.


@ -1,16 +1,23 @@
# Omnigraph v0.8.0
v0.8.0 makes every served graph an **MCP (Model Context Protocol) server**. An
MCP-capable agent — Claude Code/Desktop, Cursor, the OpenAI Responses `mcp` tool,
and others — can connect to a graph and operate it directly: run reads and
mutations, load data, manage branches, browse commits, read the schema, and
invoke the graph's curated stored queries. The surface adds no new capability and
no new business logic; every tool delegates to the same engine/handler path the
REST routes use and is gated by the same Cedar policy.
v0.8.0 has two headline changes:
## Highlights
1. **Every served graph becomes an MCP (Model Context Protocol) server** — an
MCP-capable agent (Claude Code/Desktop, Cursor, the OpenAI Responses `mcp`
tool, and others) can connect to a graph and operate it directly. The surface
adds no new capability and no new business logic; every tool delegates to the
same engine/handler path the REST routes use and is gated by the same Cedar
policy. It is **additive**.
2. **Graph commit lineage moves into `__manifest`** (RFC-013 Phase 7), folded
into the publish CAS, via a one-time on-disk migration (internal schema
**v3 → v4**). This is the first internal-schema change since v0.4.0 and carries
an **upgrade-order requirement** — read the upgrade notes before rolling it out.
### MCP surface (`POST /graphs/{id}/mcp`)
## MCP surface (`POST /graphs/{id}/mcp`)
An MCP-capable agent can connect to a graph and run reads and mutations, load
data, manage branches, browse commits, read the schema, and invoke the graph's
curated stored queries.
- **One MCP endpoint per served graph**, mounted automatically by the cluster
server — no separate flag. It is a stateless Streamable-HTTP transport: a
@ -78,8 +85,56 @@ carried in the query source:
unsupported version is a `400`); `initialize` negotiates the version in its
body and is exempt by design.
## Graph lineage now lives in `__manifest` (internal schema v4)
The graph commit DAG (commits, parents, merge parents, per-branch heads, and the
authoring actor) is now stored in `__manifest` as `graph_commit` / `graph_head`
rows, written in the **same commit (CAS)** as the table-version rows of a graph
publish. Previously the lineage lived in a separate `_graph_commits.lance`
dataset written after the manifest commit, leaving a narrow window where a crash
could land a manifest version with no matching lineage row. Folding the lineage
into the publish closes that gap by construction: a graph commit and its lineage
now land atomically at one manifest version. The in-memory commit graph is a
projection of those manifest rows; `_graph_commits.lance` is retained only as a
carrier for Lance branch refs and no longer receives commit rows.
This bumps the `__manifest` internal schema stamp from **v3 to v4**.
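
As a rough illustration of why folding lineage into the publish closes the crash window, the sketch below models a manifest whose publish is one compare-and-swap over a batch that carries both table-version rows and lineage rows. It is a toy under stated assumptions, not the engine's implementation: `LineageManifest`, `Row`, and `publish` are hypothetical names, and only the row kinds `graph_commit` and `graph_head` come from the notes above.

```rust
/// Row kinds in the toy manifest. The lineage kinds mirror the
/// `graph_commit` / `graph_head` rows named in the release notes.
#[derive(Debug, Clone, PartialEq)]
enum Row {
    TableVersion { table: String, version: u64 },
    GraphCommit { id: String, parents: Vec<String>, actor: String },
    GraphHead { branch: String, commit_id: String },
}

/// Toy manifest: every row remembers the manifest version it landed at.
#[derive(Debug, Default)]
struct LineageManifest {
    version: u64,
    rows: Vec<(u64, Row)>,
}

impl LineageManifest {
    /// Publish a whole batch under one strict-equality CAS.
    /// On success every row lands at the same new version; on failure
    /// nothing is written and the current version is returned as the error.
    fn publish(&mut self, expected: u64, batch: Vec<Row>) -> Result<u64, u64> {
        if self.version != expected {
            return Err(self.version);
        }
        let next = self.version + 1;
        self.rows.extend(batch.into_iter().map(|row| (next, row)));
        self.version = next;
        Ok(next)
    }
}
```

Because the table-version rows and the lineage rows travel in one batch, there is no intermediate state in which the manifest has advanced but the commit row is missing; a lost CAS writes neither.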
### Existing graphs migrate seamlessly on first write
A graph created by an earlier binary (internal schema v3) keeps its lineage in
`_graph_commits.lance` with none in `__manifest`. On the **first read-write
open**, Omnigraph backfills that lineage into `__manifest` (the `migrate_v3_to_v4`
internal-schema step) and bumps the stamp to v4. The migration:
- is **per-branch** — each branch backfills on its first write;
- is **idempotent and crash-safe** — the stamp bump is the last step, and the
backfill is keyed on the commit id, so a crash mid-migration re-runs harmlessly
on the next open;
- **preserves all data** — every commit, parent, merge parent, actor, and head is
carried over; commit ids are stable, so existing references still resolve.
No data is lost and no operator action is required beyond upgrading the binary.
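
The idempotent, crash-safe shape described above (backfill keyed on commit id, stamp bump last) can be sketched as follows. This is a simplified model, not the actual `migrate_v3_to_v4` step: `ToyGraph`, `ToyCommit`, and `backfill_sketch` are hypothetical names, and the `crash_after` parameter exists only to simulate an interruption.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct ToyCommit {
    id: String,
    parents: Vec<String>,
    actor: String,
}

#[derive(Debug)]
struct ToyGraph {
    /// Internal schema stamp (3 before migration, 4 after).
    stamp: u32,
    /// Stand-in for the lineage held in `_graph_commits.lance`.
    legacy_commits: Vec<ToyCommit>,
    /// Stand-in for `graph_commit` rows in `__manifest`, keyed on commit id.
    manifest_commits: BTreeMap<String, ToyCommit>,
}

/// Backfill legacy lineage into the manifest, then bump the stamp last.
/// `crash_after = Some(n)` simulates a crash after `n` commits were copied.
/// Returns true once the graph is at v4.
fn backfill_sketch(graph: &mut ToyGraph, crash_after: Option<usize>) -> bool {
    if graph.stamp >= 4 {
        return true; // already migrated, nothing to do
    }
    for (copied, commit) in graph.legacy_commits.iter().enumerate() {
        if crash_after == Some(copied) {
            return false; // interrupted: stamp is still 3, so the next open re-runs
        }
        // Keyed on the commit id: re-inserting an already-copied commit is a no-op.
        graph
            .manifest_commits
            .entry(commit.id.clone())
            .or_insert_with(|| commit.clone());
    }
    graph.stamp = 4; // last step: only now does the graph count as migrated
    true
}
```

A run that is interrupted partway leaves the stamp at 3, and re-running it copies only the missing commits before bumping the stamp, so repeated runs converge on the same state.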
Before its first write migrates the graph, a **read-only** open of a v3 graph
(e.g. `omnigraph commit list`, NDJSON export) still reads correct history via a
transitional fallback that sources the commit DAG from `_graph_commits.lance`.
Read-only opens never write, so they never migrate, but they never show an empty
history either.
## Upgrade notes
- **Breaking: internal schema v4 — upgrade writer (and reader) binaries first.**
Internal schema v4 is a hard version gate. Once a graph has been opened for
write by a v0.8.0 binary, its `__manifest` is stamped v4, and an **older binary
will refuse to open it** — read-write *and* read-only — with an
`upgrade omnigraph before opening this graph` error rather than silently
misreading the new lineage. This is the standard forward-version protection
(same shape as the v1→v2 / v2→v3 steps), now enforced on the read-only path
too. Upgrade every writer (and reader) binary that touches a graph to v0.8.0
before, or together with, the first write under the new version. A mixed fleet
where an old binary still writes the same graph is unsupported, as with any
internal-schema bump.
- **`GET /graphs/{id}/queries` is now `invoke_query`-gated (was `read`).** The
stored-query catalog uses the same authority as invocation and the MCP
`tools/list` surface, so discovery and invocation agree ("see the menu iff you
@ -87,8 +142,9 @@ carried in the query source:
`403` instead of a listing; in default-deny mode the endpoint returns `403`
until an `invoke_query` rule is configured. This is the one observable REST
behavior change in this release.
- Otherwise no breaking changes: the rest of the REST surface, CLI, cluster
config, and on-disk format are unchanged. The MCP endpoint is additive.
- **The MCP endpoint is additive.** Apart from the `GET /queries` gate change and
the v4 on-disk migration above, the REST surface, CLI, and cluster config are
unchanged.
- **Pointing an agent at a graph:** configure your MCP client with the URL
`https://<host>/graphs/<id>/mcp` and the same bearer token you use for REST.
See [docs/user/operations/mcp.md](../user/operations/mcp.md) for the connect