# Omnigraph v0.7.0

v0.7.0 is three large arcs in one release. **Operations:** the cluster control
plane moves to object storage and the configuration architecture collapses to two
single-owner surfaces — a cluster can live entirely on an S3-compatible bucket, a
server boots from it with no local files, and the legacy combined `omnigraph.yaml`
is **removed**. **CLI:** the command-line surface is unified and made honest —
embedded and remote runs are one execution path, `load` becomes the single
bulk-write command, every command declares the **capability** it needs (and
rejects flags that don't apply), and the server boots only from a cluster.
**Engine & substrate:** Lance moves to 7.x, traversal/index/recovery internals
get faster and self-healing, and text embedding becomes provider-independent.

## Highlights

### Clusters & storage on object storage

- **Clusters on object storage (`storage:`).** `cluster.yaml` gains an optional
  `storage: s3://bucket/prefix` root. Every stored byte — state ledger, lock,
  recovery sidecars, approval artifacts, catalog blobs, and the derived graph
  roots (`<storage>/graphs/<id>.omni`) — flows through one storage layer, so
  `file://` (the default, byte-compatible with existing clusters) and `s3://`
  are a single code path. The ledger's compare-and-swap uses S3 conditional
  writes (`If-Match`/`If-None-Match`), verified against AWS, RustFS, and other
  S3-compatible stores; the state lock is genuinely cross-machine on object
  storage.
- **Config-free serving: `--cluster s3://bucket/prefix`.** The server accepts a
  bare storage-root URI and boots from the applied revision on the bucket — the
  ledger and catalog are the whole deployment artifact. Policy bundles serve as
  digest-verified *content* from the catalog (never re-read from disk). The
  preferred container shape becomes **bucket, no volume** (see
  `docs/user/deployment.md`).
- **Cluster-only server.** `omnigraph-server` boots **only** from `--cluster
  <dir | s3://…>` and serves N graphs (N ≥ 1) under cluster routes
  (`/graphs/{id}/…`, plus a read-only `GET /graphs` enumeration). The old
  single-graph flat-route mode, positional-`<URI>` boot, and `omnigraph.yaml`
  `graphs:`-map boot are gone — add or remove graphs with `cluster apply` and
  restart.
- **One storage substrate + recovery liveness.** The cluster storage backend and
  the engine both go through one `StorageAdapter` (versioned read, conditional
  replace/CAS, prefix delete), exercised by a storage fault-injection matrix.
  A long-lived server now heals a recoverable write on its *next write* rather
  than only at restart.

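The compare-and-swap described above can be illustrated with a small in-memory model. This is a sketch of the general conditional-write technique, not Omnigraph's `StorageAdapter`: the `InMemoryBucket` class, the `cas_replace` helper, and the `state/ledger` key are hypothetical names invented for the example.

```python
import hashlib
from typing import Optional


class PreconditionFailed(Exception):
    """Stands in for the HTTP 412 an object store returns on a failed precondition."""


class InMemoryBucket:
    """Hypothetical stand-in for an S3-compatible store with conditional PUT."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    @staticmethod
    def _etag(body: bytes) -> str:
        # Real stores compute ETags differently; a content hash is enough here.
        return hashlib.sha256(body).hexdigest()

    def get(self, key: str) -> Optional[tuple[bytes, str]]:
        body = self._objects.get(key)
        return None if body is None else (body, self._etag(body))

    def put(self, key: str, body: bytes, *,
            if_match: Optional[str] = None,
            if_none_match: Optional[str] = None) -> str:
        current = self._objects.get(key)
        # If-None-Match: * means "create only if the object does not exist".
        if if_none_match == "*" and current is not None:
            raise PreconditionFailed("object already exists")
        # If-Match means "replace only if the stored ETag is the one I read".
        if if_match is not None and (current is None or self._etag(current) != if_match):
            raise PreconditionFailed("ETag mismatch")
        self._objects[key] = body
        return self._etag(body)


def cas_replace(bucket: InMemoryBucket, key: str,
                expected_etag: Optional[str], new_body: bytes) -> bool:
    """Create-if-absent when expected_etag is None, else replace-if-unchanged."""
    try:
        if expected_etag is None:
            bucket.put(key, new_body, if_none_match="*")
        else:
            bucket.put(key, new_body, if_match=expected_etag)
        return True
    except PreconditionFailed:
        return False


bucket = InMemoryBucket()
created = cas_replace(bucket, "state/ledger", None, b"rev-1")
_, etag_v1 = bucket.get("state/ledger")
first_writer = cas_replace(bucket, "state/ledger", etag_v1, b"rev-2")
stale_writer = cas_replace(bucket, "state/ledger", etag_v1, b"rev-2b")
print(created, first_writer, stale_writer)
```

Two writers that read the same revision cannot both win: the second one presents a stale ETag and is refused, which is the property a ledger needs when several machines share one bucket.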
### Configuration: two single-owner surfaces

The legacy combined `omnigraph.yaml` is **removed**. Configuration now lives in
two surfaces with single owners, plus a zero-config tier:

- **Cluster config (`cluster.yaml` + checkout, team-owned)** declares what the
  system *is*: graphs, schemas, stored queries, policies, storage. A server boots
  from it via `--cluster`.
- **Per-operator config (`~/.omnigraph/config.yaml`, person-owned)** declares who
  *you* are: `operator.actor` (the last hop of the `--as` chain), output
  defaults, named servers + clusters, profiles, aliases, and a default scope.
  `$OMNIGRAPH_HOME` relocates it.
- **Credentials keyed by server name.** `omnigraph login <server>` stores a
  bearer token in `~/.omnigraph/credentials` (created `0600`; over-permissive
  files refused). Resolution for a request whose URL matches an operator-defined
  server: `OMNIGRAPH_TOKEN_<NAME>` env → the credentials file → the default
  `OMNIGRAPH_BEARER_TOKEN`. A token is only ever sent to the server it is keyed to.
- **Operator targeting and aliases.** `--server <name>` (with `--graph <id>` for
  multi-graph servers) addresses operator-defined endpoints. Operator aliases are
  pure, **read-only** *bindings* — personal name → (server, graph, stored-query
  name, default params) — invoking catalog-owned stored queries; they carry no
  query content and a binding to a stored mutation is rejected.
- **Default scopes.** `defaults.server` (served) or `defaults.store` (a zero-flag
  *local* default — mutually exclusive with `server`) supply the no-flag scope,
  with an optional `default_graph`. `--profile <name>` / `$OMNIGRAPH_PROFILE`
  selects a named scope bundle wholesale; `omnigraph profile list` /
  `profile show [<name>]` inspect what's defined (read-only).

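The token resolution order from the credentials bullet can be sketched as a three-step lookup. This is an illustration under stated assumptions, not the CLI's code: how `<NAME>` is derived from a server name is not spelled out in these notes, so `env_key_for` guesses an upper-cased, underscore-normalized form, and the credentials file is modeled as a plain mapping from server name to token.

```python
import re
from typing import Mapping, Optional


def env_key_for(server_name: str) -> str:
    # Assumption: <NAME> is the server name upper-cased, with every
    # non-alphanumeric character replaced by an underscore.
    return "OMNIGRAPH_TOKEN_" + re.sub(r"[^A-Za-z0-9]", "_", server_name).upper()


def resolve_token(server_name: str,
                  env: Mapping[str, str],
                  credentials: Mapping[str, str]) -> Optional[str]:
    """Per-server env var first, then the credentials file, then the default env var."""
    per_server = env.get(env_key_for(server_name))
    if per_server:
        return per_server
    stored = credentials.get(server_name)
    if stored:
        return stored
    return env.get("OMNIGRAPH_BEARER_TOKEN")


# Hypothetical environment and credentials contents.
env = {"OMNIGRAPH_TOKEN_PROD": "env-token", "OMNIGRAPH_BEARER_TOKEN": "fallback"}
credentials = {"prod": "file-token", "staging": "staging-file-token"}

prod_token = resolve_token("prod", env, credentials)
staging_token = resolve_token("staging", env, credentials)
dev_token = resolve_token("dev", env, credentials)
print(prod_token, staging_token, dev_token)
```

The per-server environment variable wins over the stored credential, which lets a CI job override a token for one server without touching `~/.omnigraph/credentials`.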
### Unified, capability-aware CLI

- **One bulk-write command: `omnigraph load`.** `load` is now the single data-write
  command and works against remote graphs (over HTTP with the same bearer/actor
  resolution as every other remote command) — previously the only data command
  forced to open storage directly. `--mode overwrite|append|merge` is **required**
  (overwrite is destructive, so there is no default); `--from <base>` opts into
  fork-if-missing for `--branch`. `omnigraph ingest` becomes a **deprecated
  alias** (`--from main --mode merge` defaults; one-line stderr warning).
- **No implicit branch forks.** Loading into a branch that does not exist is an
  **error** unless `--from <base>` is given — a typo'd branch name no longer
  silently forks `main` and lands your data there. Same rule on the server.
- **One execution path, embedded ≡ remote.** Every CLI verb runs through one
  `GraphClient` with two implementations (embedded engine, HTTP) sharing a single
  wire-DTO crate (`omnigraph-api-types`). An executable parity matrix runs every
  verb against both and asserts identical results, so local and remote no longer
  drift.
- **Declared capabilities + honest addressing.** Every command declares the
  **capability** it needs — `any` (run against a graph, served or embedded),
  `served` (needs a server), `direct` (direct storage access), `control`
  (manage/inspect a cluster), or `local` (no graph) — and the CLI enforces it.
  Wrong-capability addressing now fails loudly with a declared message (e.g.
  `--server` on `optimize`) instead of being silently ignored, and a maintenance
  verb pointed at a remote target is rejected. `omnigraph --help` groups commands
  by capability with a legend.
- **Address cluster graphs for maintenance.** `optimize` / `repair` / `cleanup`
  accept `--cluster <dir|s3://…> --graph <id>` (`--cluster` is a cluster directory,
  storage-root URI, or a `clusters:` name from `~/.omnigraph/config.yaml`),
  resolving the graph's storage URI from the served cluster state (no need to
  hand-type `<storage>/graphs/<id>.omni`). `--graph` is the single graph selector
  across server and cluster scopes. Conversely, `omnigraph init` **refuses** a
  cluster-managed path and points at `cluster apply` — graphs in a cluster are
  created with ledger/recovery/approvals, not by hand. `schema apply` refuses a
  cluster-managed graph for the same reason (and the server rejects a
  cluster-backed schema apply with `409`, pointing at `cluster apply`).
- **Write diagnostics + destructive-write safety (RFC-011 Decision 9).** Every
  write (`load`, `mutate`, `branch create|delete|merge`, `schema apply`,
  `optimize`, `repair`, `cleanup`) echoes its resolved target + access path to
  stderr — e.g. `omnigraph load → s3://…/knowledge.omni (direct, remote)` —
  suppressible with the global `--quiet`. Destructive writes against a
  **non-local** scope (`cleanup`, overwrite `load`, `branch delete` against an
  `http(s)://` server or `s3://` store/cluster) require explicit consent: the
  global `--yes`, an interactive TTY prompt, or — for a non-interactive /
  `--json` run — a hard refusal instead of silently proceeding. Local (`file://`)
  writes are unaffected.
- **Route alignment: canonical `POST /load`.** The server gains a canonical
  `POST /load`; `POST /ingest` is now a deprecated alias that emits RFC 9745
  `Deprecation: true` + RFC 8288 `Link: <load>; rel="successor-version"`
  headers (a sibling-relative reference that resolves under `/graphs/{id}/…`).
  The CLI's `load` targets `/load`.
- **Operator aliases get their own namespace (`omnigraph alias <name>`).** A
  personal binding to a stored query on a named server is invoked as
  `omnigraph alias <name> [args]` (RFC-011 Decision 4), so an alias can never
  shadow — or be shadowed by — a built-in verb. `alias` rejects global scope
  flags (`--server`/`--graph`/`--store`/`--cluster`/`--profile`/`--as`) its
  binding already owns.
- **No-graph addressing lists candidates (RFC-011 Decision 7).** When a scope
  has no `--graph` and no `default_graph`, the CLI never silently picks. A
  **cluster** scope with exactly one applied graph uses it automatically and
  otherwise **lists the candidates** (from the served catalog). A multi-graph
  **server** lists the candidates (from `GET /graphs`) and requires `--graph <id>`.
- **Invoke stored queries by name (RFC-011 Decision 3).** `omnigraph query
  <name>` / `mutate <name>` invoke a stored query **by name** from the served
  catalog — `omnigraph query find_people` instead of `--query find.gq --name
  find_people`. The verb asserts the query's kind (an `expect_mutation` flag on
  `POST /queries/{name}`: `query <a-mutation>` is rejected with `'<name>' is a
  mutation — use omnigraph mutate <name>`, and vice-versa). `.gq` files become
  the explicit ad-hoc lane (`-e` / `--query`), with the positional selecting
  which query in the source.

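The no-graph addressing rule (Decision 7) reduces to a short precedence chain. The sketch below is a hypothetical model, not the CLI's resolver: `select_graph` and `AmbiguousGraph` are invented names, and because the notes only describe the multi-graph server case, treating a single-candidate server like a single-graph cluster is an assumption.

```python
from typing import Optional, Sequence


class AmbiguousGraph(Exception):
    """Raised instead of silently picking a graph; carries the candidates to list."""

    def __init__(self, candidates: Sequence[str]) -> None:
        self.candidates = list(candidates)
        super().__init__(
            "no graph selected; pass --graph <id>. Candidates: "
            + ", ".join(self.candidates)
        )


def select_graph(explicit: Optional[str],
                 default_graph: Optional[str],
                 candidates: Sequence[str]) -> str:
    """Precedence: --graph, then default_graph, then a sole candidate, else refuse."""
    if explicit is not None:
        return explicit
    if default_graph is not None:
        return default_graph
    if len(candidates) == 1:
        return candidates[0]
    raise AmbiguousGraph(candidates)


# Hypothetical graph ids.
sole = select_graph(None, None, ["knowledge"])
chosen = select_graph("people", "knowledge", ["knowledge", "people"])
defaulted = select_graph(None, "knowledge", ["knowledge", "people"])

try:
    select_graph(None, None, ["knowledge", "people"])
    listed = []
except AmbiguousGraph as err:
    listed = err.candidates

print(sole, chosen, defaulted, listed)
```

Raising with the candidate list attached lets the caller print the same ids the served catalog or `GET /graphs` returned, so the error doubles as a hint.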
### Engine & substrate

- **Lance 6.0.1 → 7.0.0.** The columnar substrate is bumped to Lance 7.x with
  correct-by-design alignment: the unenforced primary key is immutable once set,
  `WriteParams::auto_cleanup` is disabled so version GC stays operator-owned, and
  the native namespace/`object_store` 0.13 surface is pinned by surface-guard
  tests. No on-disk format change for existing graphs.
- **Indexed graph traversal.** `Expand` can run over a BTREE-indexed path,
  asserted semantically equal to the CSR traversal it accelerates.
- **Scalar index coverage + filter literal coercion.** Closes index-coverage gaps
  and coerces filter literals correctly, cutting query latency on indexed scans.
- **Index materialization is derived state.** `schema apply` records
  `@index`/`@key` *intent* and builds nothing (index-only changes touch no table
  data); `load`/`mutate` build inline through one chokepoint but **defer** an
  untrainable Vector column as *pending* instead of aborting; `optimize` is the
  reconciler that materializes declared-but-missing indexes and folds appended
  fragments back into existing ones.
- **Recovery liveness + one storage substrate.** Writers heal a recoverable
  write on the *next write* (not only at the next read-write open); a storage
  fault-injection matrix exercises the sidecar lifecycle; the cluster and engine
  share one `StorageAdapter` over `object_store`.
- **Branch-fork self-heal.** Manifest-unreferenced branch forks are reclaimed
  (eager best-effort + a `cleanup` reconciler backstop), so a failed branch-delete
  reclaim no longer wedges a reused branch name.
- **Composite `@unique(a, b)`.** Enforced as a true composite key, with one shared
  keying function for intake and branch-merge that fails loudly on an un-keyable
  column type rather than silently exempting it.

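The idea behind the composite `@unique(a, b)` bullet, one keying function shared by every uniqueness check that refuses types it cannot key, can be sketched as follows. This is an illustrative model only: `composite_key`, `find_duplicates`, `UnkeyableColumn`, and the `KEYABLE_TYPES` set are hypothetical, and the engine's real set of keyable column types is not stated in these notes.

```python
from typing import Any, Mapping, Sequence

# Assumption: which value types can participate in a key. Anything else
# (including None in this sketch) is rejected rather than skipped.
KEYABLE_TYPES = (str, int, bool, bytes)


class UnkeyableColumn(TypeError):
    """Raised when a @unique column holds a value that cannot be keyed."""


def composite_key(row: Mapping[str, Any], columns: Sequence[str]) -> tuple:
    """Build one hashable key from all columns; never drop a column silently."""
    parts = []
    for name in columns:
        value = row[name]
        if not isinstance(value, KEYABLE_TYPES):
            raise UnkeyableColumn(
                f"column {name!r} has un-keyable type {type(value).__name__}"
            )
        # Tag each part with its type so that 1 and True stay distinct keys.
        parts.append((type(value).__name__, value))
    return tuple(parts)


def find_duplicates(rows: Sequence[Mapping[str, Any]],
                    columns: Sequence[str]) -> list[tuple[int, int]]:
    """Return (first_index, duplicate_index) pairs that violate the composite key."""
    seen: dict[tuple, int] = {}
    duplicates = []
    for index, row in enumerate(rows):
        key = composite_key(row, columns)
        if key in seen:
            duplicates.append((seen[key], index))
        else:
            seen[key] = index
    return duplicates


rows = [{"a": "x", "b": 1}, {"a": "x", "b": 2}, {"a": "x", "b": 1}]
violations = find_duplicates(rows, ("a", "b"))

try:
    composite_key({"a": "x", "b": 1.5}, ("a", "b"))
    rejected = False
except UnkeyableColumn:
    rejected = True

print(violations, rejected)
```

Rows 0 and 1 share `a` but differ in `b`, so only the pair (0, 2) collides; a per-column check would have flagged all three. Routing both intake and merge through the same function means the two paths cannot disagree about what counts as a duplicate.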
### Embeddings: provider-independent (RFC-012)

- **One client, any provider.** Text embedding moves to a single
  provider-independent `EmbeddingConfig` behind a sealed `Provider` enum:
  **OpenAI-compatible** (the **OpenRouter** default gateway — one key for many
  models — plus OpenAI-direct and self-hosted endpoints), native **Gemini**, and
  a deterministic **Mock**. One client serves both the query path and the offline
  `omnigraph embed` CLI, with a per-query deadline and `tracing` observability.
  The dead, uncallable compiler-crate OpenAI client (and its `reqwest`/`tokio`
  deps) was removed.
- **Same-space guarantee.** `@embed("source", model="…")` records the embedding
  identity (model) in the schema IR so it travels with the data; a string
  `nearest()` whose resolved embedder model differs from the recorded one is
  **rejected with a typed error** instead of silently ranking across vector
  spaces. (`@embed` still does no ingest-time embedding — deferred to a later
  phase.)

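The same-space guarantee amounts to comparing two model identities before a string `nearest()` is embedded. The following is a minimal sketch of that check, not the engine's implementation: `EmbeddingSpaceMismatch`, `check_same_space`, and the model names are hypothetical, and skipping the check when no model was recorded is an assumption.

```python
from typing import Optional


class EmbeddingSpaceMismatch(Exception):
    """Typed error: the query would be embedded in a different vector space."""

    def __init__(self, prop: str, recorded: str, resolved: str) -> None:
        self.prop = prop
        self.recorded = recorded
        self.resolved = resolved
        super().__init__(
            f"property {prop!r} was embedded with {recorded!r} "
            f"but the resolved embedder is {resolved!r}"
        )


def check_same_space(prop: str,
                     recorded_model: Optional[str],
                     resolved_model: str) -> None:
    """Refuse to rank when the stored vectors and the query vector differ in model."""
    # Assumption: a property with no recorded model is not checked.
    if recorded_model is not None and recorded_model != resolved_model:
        raise EmbeddingSpaceMismatch(prop, recorded_model, resolved_model)


# Hypothetical property and model names.
check_same_space("summary_vec", "model-a", "model-a")  # same space: passes

try:
    check_same_space("summary_vec", "model-a", "model-b")
    mismatch = None
except EmbeddingSpaceMismatch as err:
    mismatch = (err.recorded, err.resolved)

print(mismatch)
```

Carrying both model names on the exception gives the caller enough to explain the fix: either re-embed the property or point the embedder back at the recorded model.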
## Breaking & behavior changes

- **`omnigraph.yaml` is removed.** The CLI and server no longer read it at all;
  the `OmnigraphConfig` type, `omnigraph config migrate`, and the deprecation
  env vars (`OMNIGRAPH_NO_LEGACY_CONFIG`, `OMNIGRAPH_SUPPRESS_YAML_DEPRECATION`,
  `OMNIGRAPH_CONFIG`) are gone. Configure via a team `cluster.yaml` and a
  per-operator `~/.omnigraph/config.yaml` (see Upgrade notes).
- **`omnigraph-server` boots only from `--cluster`.** The positional-`<URI>`
  single-graph boot and the `omnigraph.yaml` `graphs:`-map boot are removed; all
  HTTP is under `/graphs/{id}/…` (with flat `/healthz` and the `/graphs`
  enumeration). Upgrade deployments to `omnigraph-server --cluster <dir|s3://…>`.
- **Default embedding provider flips to OpenRouter.** Embedding is no longer
  hardwired to Gemini: the default provider is **OpenAI-compatible via
  OpenRouter**, `OMNIGRAPH_GEMINI_BASE_URL` is dropped, and Gemini-direct users
  must set `OMNIGRAPH_EMBED_PROVIDER=gemini`. A `nearest("string")` query whose
  resolved model differs from a property's recorded `@embed(model=…)` is now a
  typed error rather than silent cross-space ranking.
- **`query --alias <name>` is removed.** Invoke operator aliases via
  `omnigraph alias <name> [args]`.
- **`query`/`mutate` no longer take a positional graph URI, `--uri`, or
  `--name`** (RFC-011 D3). The positional is now the query name; address the
  graph with `--store` (local) / `--server` / `--profile`, and select a query
  within an ad-hoc `--query`/`-e` source with the positional (replacing
  `--name`). By-name catalog invocation is **served-only** (a bare `--store` has
  no catalog — use `-e`/`--query` there). Scripts using
  `query <graph-uri> --query f.gq --name q` become
  `query --store <graph-uri> --query f.gq q`.
- **Legacy data-plane addressing removed** (#238): `--target`, the positional
  `http(s)://`→remote dispatch, and `--as` on a served write (the actor is
  resolved server-side from the bearer token) no longer exist.
- **`omnigraph load` replaces direct-storage-only loading; `--mode` is required.**
  Scripts calling `load` without `--mode` must add one (`overwrite|append|merge`).
- **`omnigraph ingest` is deprecated** (still works; one-line stderr warning).
  Use `load --from <base> --mode <mode>`.
- **Loading into a missing branch is now an error without `--from`** (CLI and
  `POST /load`/`POST /ingest`): a missing branch returns 404 / fails, never an
  implicit fork. Pass `--from <base>` (CLI) or the request `from` field (HTTP) to
  fork-if-missing. This affects any workflow that relied on auto-forking.
- **Scope flags that can't apply now error instead of being silently ignored.**
  `--server` on any direct/control/session command, `--cluster` outside the
  cluster-scoped verbs, and `--graph` where no multi-graph scope applies all fail
  with a declared message. `--graph` is the single graph selector and is
  **accepted** on `optimize` / `repair` / `cleanup` when paired with `--cluster`
  (replacing the removed `--cluster-graph`).
- **`schema apply` is refused against a cluster-managed graph.** The CLI signposts
  `omnigraph cluster apply`; a cluster-backed server returns `409 Conflict`
  (after the Cedar gate, so an unauthorized actor still gets `403`). Cluster
  graphs evolve through `cluster apply`, never a direct apply.
- **Storage-plane error text changed.** A maintenance verb pointed at a remote
  target now fails with a declared direct-capability message (replacing the older
  "only supported against local graph URIs" wording). Error strings are an
  observable contract (Hyrum's Law), so pin any string-matching tests against
  the new text.
- **Non-local destructive writes now require `--yes` in automation.** A
  `cleanup` / overwrite-`load` / `branch delete` against an `http(s)://` or
  `s3://` target with `--json` (or any non-TTY context) previously executed;
  it now **refuses** unless `--yes` is passed. CI scripts that destroy remote
  data must add `--yes`. Local (`file://`) writes are unchanged.
- **`omnigraph init` no longer scaffolds a config file,** and **refuses a
  cluster-managed storage path** (`<root>/graphs/<id>.omni` under a cluster) —
  create those graphs with `cluster apply`.
- **`POST /ingest` is deprecated** (kept indefinitely as a shim) and returns
  `Deprecation`/`Link` headers. **A v0.7 CLI talks to `POST /load`,** which a
  pre-0.7 server does not expose — upgrade the server and CLI together, or keep
  using `ingest`.
- **`ServingPolicy` (cluster crate API) carries verified policy content instead
  of a blob path; `read_serving_snapshot` and several cluster command entry points
  are now `async`.**
- **`omnigraph --help` reorders commands** (grouped by capability) and **hides
  the deprecated `ingest`** from the listing — `ingest` still runs. Help text is
  observable; this is a deliberate output change.

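The consent rule for non-local destructive writes can be read as a small decision table. The sketch below models it under stated assumptions and is not the CLI's code: `consent_decision`, `is_destructive`, and the `Decision` enum are hypothetical names, and classifying a target by URI scheme (with a bare path counted as local) is an assumption drawn from the `http(s)://` / `s3://` / `file://` wording above.

```python
from enum import Enum
from typing import Optional
from urllib.parse import urlparse


class Decision(Enum):
    PROCEED = "proceed"  # run without asking
    PROMPT = "prompt"    # ask on the interactive terminal
    REFUSE = "refuse"    # hard refusal; the caller must pass --yes


NON_LOCAL_SCHEMES = {"http", "https", "s3"}


def is_destructive(verb: str, mode: Optional[str] = None) -> bool:
    """cleanup, branch delete, and load --mode overwrite destroy data."""
    if verb in ("cleanup", "branch delete"):
        return True
    return verb == "load" and mode == "overwrite"


def consent_decision(verb: str, target: str, *,
                     mode: Optional[str] = None,
                     yes: bool = False,
                     is_tty: bool = False,
                     json_output: bool = False) -> Decision:
    if not is_destructive(verb, mode):
        return Decision.PROCEED
    if urlparse(target).scheme not in NON_LOCAL_SCHEMES:
        return Decision.PROCEED  # local writes are unaffected
    if yes:
        return Decision.PROCEED
    if is_tty and not json_output:
        return Decision.PROMPT
    return Decision.REFUSE


ci_cleanup = consent_decision("cleanup", "s3://bucket/prefix", json_output=True)
ci_with_yes = consent_decision("cleanup", "s3://bucket/prefix", json_output=True, yes=True)
interactive = consent_decision("load", "https://graph.example.com",
                               mode="overwrite", is_tty=True)
local = consent_decision("cleanup", "file:///data/g.omni")
print(ci_cleanup, ci_with_yes, interactive, local)
```

The ordering matters: `--yes` is checked before the TTY test so that a script running under a terminal still proceeds without a prompt, and the refusal is the fall-through case so that an unrecognized context never destroys remote data by default.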
## Upgrade notes

- Existing clusters need no migration: an absent `storage:` key keeps the
  config-directory layout byte-for-byte.
- **`omnigraph.yaml` is no longer read.** There is no automated migrate command
  in 0.7.0; recreate configuration as a team `cluster.yaml` (graphs, schemas,
  stored queries, policies — see `docs/user/clusters/`) plus a per-operator
  `~/.omnigraph/config.yaml` (identity, servers, credentials, defaults — see
  `docs/user/cli/reference.md`).
- **`omnigraph-server` now requires `--cluster <dir | s3://…>`** — there is no
  positional-URI boot. Run `cluster apply` first, then serve the applied revision.
- **Gemini-direct embedding users** set `OMNIGRAPH_EMBED_PROVIDER=gemini` (the
  default is now OpenRouter); `OMNIGRAPH_GEMINI_BASE_URL` is removed.
- Audit scripts for two CLI changes: add `--mode` to every `load`, and add
  `--from <base>` anywhere you relied on a missing branch being auto-created.
- Upgrade server and CLI together for the `/load` route (or keep `ingest`).
- Operator setup is three lines: `mkdir -p ~/.omnigraph`, write `operator.actor`
  (and `servers:`) into `~/.omnigraph/config.yaml`, then
  `echo $TOKEN | omnigraph login <server>`.

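When auditing scripts for the `--from` change, it may help to see the fork-if-missing rule as code. This is a hypothetical model of the behavior described in these notes, not the engine's branch logic: `resolve_load_branch` and `MissingBranch` are invented names, and rejecting a `--from` base that does not exist is an assumption.

```python
from typing import Optional


class MissingBranch(Exception):
    """The branch (or the fork base) does not exist; never fork implicitly."""


def resolve_load_branch(branches: set[str], branch: str,
                        from_base: Optional[str] = None) -> tuple[str, bool]:
    """Return (branch, forked). Fork only when --from names an existing base."""
    if branch in branches:
        return branch, False  # fork-if-missing: an existing branch is used as-is
    if from_base is None:
        raise MissingBranch(
            f"branch {branch!r} does not exist; pass --from <base> to create it"
        )
    if from_base not in branches:
        # Assumption: a missing base is an error rather than a fallback to main.
        raise MissingBranch(f"base branch {from_base!r} does not exist")
    branches.add(branch)
    return branch, True


branches = {"main"}
existing = resolve_load_branch(branches, "main")
forked = resolve_load_branch(branches, "feature", from_base="main")

try:
    resolve_load_branch(branches, "featrue")  # typo'd name, no --from
    typo_rejected = False
except MissingBranch:
    typo_rejected = True

print(existing, forked, typo_rejected, sorted(branches))
```

The typo'd `featrue` is rejected and no branch is created for it, which is the failure mode the old implicit fork of `main` used to hide.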
## Internals

- The cluster, server, and CLI crates were modularized (the ~7.9k-line cluster
  `lib.rs` into focused modules; the server and CLI test monoliths into per-area
  suites) — pure code movement.
- The parity matrix (embedded vs remote) is the new referee for CLI behavior; the
  OpenAPI drift test guards `openapi.json`; Lance-surface guard tests pin the
  upstream APIs the engine depends on (the first smoke check on a Lance bump).
- Gated end-to-end suites run the full cluster lifecycle against a real
  S3-compatible store in CI (lock-release regression, config-free boot from a
  bare bucket URI).
- The deployment guide gains the bucket-no-volume container recipe for AWS /
  S3-compatible object storage.
- `clap` updated to 4.6.1. CI runs the full workspace suite on `main` post-merge
  rather than on every PR (faster PR turnaround; the local
  `cargo test --workspace --locked` is the pre-merge gate).