# Omnigraph v0.7.0

v0.7.0 is three large arcs in one release. **Operations:** the cluster control plane moves to object storage and the configuration architecture collapses to two single-owner surfaces — a cluster can live entirely on an S3-compatible bucket, a server boots from it with no local files, and the legacy combined `omnigraph.yaml` is **removed**. **CLI:** the command-line surface is unified and made honest — embedded and remote runs are one execution path, `load` becomes the single bulk-write command, every command declares the **capability** it needs (and rejects flags that don't apply), and the server boots only from a cluster. **Engine & substrate:** Lance moves to 7.x, traversal/index/recovery internals get faster and self-healing, and text embedding becomes provider-independent.

## Highlights

### Clusters & storage on object storage

- **Clusters on object storage (`storage:`).** `cluster.yaml` gains an optional `storage: s3://bucket/prefix` root. Every stored byte — state ledger, lock, recovery sidecars, approval artifacts, catalog blobs, and the derived graph roots (`/graphs/.omni`) — flows through one storage layer, so `file://` (the default, byte-compatible with existing clusters) and `s3://` are a single code path. The ledger's compare-and-swap uses S3 conditional writes (`If-Match`/`If-None-Match`), verified against AWS, RustFS, and other S3-compatible stores; the state lock is genuinely cross-machine on object storage.
- **Config-free serving: `--cluster s3://bucket/prefix`.** The server accepts a bare storage-root URI and boots from the applied revision on the bucket — the ledger and catalog are the whole deployment artifact. Policy bundles serve as digest-verified *content* from the catalog (never re-read from disk). The preferred container shape becomes **bucket, no volume** (see `docs/user/deployment.md`).
- **Cluster-only server.** `omnigraph-server` boots **only** from `--cluster ` and serves N graphs (N ≥ 1) under cluster routes (`/graphs/{id}/…`, plus a read-only `GET /graphs` enumeration). The old single-graph flat-route mode, positional-`` boot, and `omnigraph.yaml` `graphs:`-map boot are gone — add or remove graphs with `cluster apply` and restart.
- **Resilient cluster boot with strict opt-out.** Graph-attributed startup failures now quarantine that graph and let healthy graphs serve; `/graphs` lists only ready graphs, and quarantined graph routes return 404. Cluster-global failures still refuse boot, and `--require-all-graphs` (or `OMNIGRAPH_REQUIRE_ALL_GRAPHS=1`) restores fail-fast all-or-nothing startup for operators who prefer any degraded graph to abort the process.
- **One storage substrate + recovery liveness.** The cluster storage backend and the engine both go through one `StorageAdapter` (versioned read, conditional replace/CAS, prefix delete), exercised by a storage fault-injection matrix. A long-lived server now heals a recoverable write on its *next write* rather than only at restart.

### Configuration: two single-owner surfaces

The legacy combined `omnigraph.yaml` is **removed**. Configuration now lives in two surfaces with single owners, plus a zero-config tier:

- **Cluster config (`cluster.yaml` + checkout, team-owned)** declares what the system *is*: graphs, schemas, stored queries, policies, storage. A server boots from it via `--cluster`.
- **Per-operator config (`~/.omnigraph/config.yaml`, person-owned)** declares who *you* are: `operator.actor` (the last hop of the `--as` chain), output defaults, named servers + clusters, profiles, aliases, and a default scope. `$OMNIGRAPH_HOME` relocates it.
- **Credentials keyed by server name.** `omnigraph login ` stores a bearer token in `~/.omnigraph/credentials` (created `0600`; over-permissive files refused).
  Resolution for a request whose URL matches an operator-defined server: `OMNIGRAPH_TOKEN_` env → the credentials file → the default `OMNIGRAPH_BEARER_TOKEN`. A token is only ever sent to the server it is keyed to.
- **Operator targeting and aliases.** `--server ` (with `--graph ` for multi-graph servers) addresses operator-defined endpoints. Operator aliases are pure, **read-only** *bindings* — personal name → (server, graph, stored-query name, default params) — invoking catalog-owned stored queries; they carry no query content and a binding to a stored mutation is rejected.
- **Default scopes.** `defaults.server` (served) or `defaults.store` (a zero-flag *local* default — mutually exclusive with `server`) supply the no-flag scope, with an optional `default_graph`. `--profile ` / `$OMNIGRAPH_PROFILE` selects a named scope bundle wholesale; `omnigraph profile list` / `profile show []` inspect what's defined (read-only).

### Unified, capability-aware CLI

- **One bulk-write command: `omnigraph load`.** `load` is now the single data-write command and works against remote graphs (over HTTP with the same bearer/actor resolution as every other remote command) — previously the only data command forced to open storage directly. `--mode overwrite|append|merge` is **required** (overwrite is destructive, so there is no default); `--from ` opts into fork-if-missing for `--branch`. `omnigraph ingest` becomes a **deprecated alias** (`--from main --mode merge` defaults; one-line stderr warning).
- **No implicit branch forks.** Loading into a branch that does not exist is an **error** unless `--from ` is given — a typo'd branch name no longer silently forks `main` and lands your data there. Same rule on the server.
- **One execution path, embedded ≡ remote.** Every CLI verb runs through one `GraphClient` with two implementations (embedded engine, HTTP) sharing a single wire-DTO crate (`omnigraph-api-types`).
  An executable parity matrix runs every verb against both and asserts identical results, so local and remote no longer drift.
- **Declared capabilities + honest addressing.** Every command declares the **capability** it needs — `any` (run against a graph, served or embedded), `served` (needs a server), `direct` (direct storage access), `control` (manage/inspect a cluster), or `local` (no graph) — and the CLI enforces it. Wrong-capability addressing now fails loudly with a declared message (e.g. `--server` on `optimize`) instead of being silently ignored, and a maintenance verb pointed at a remote target is rejected. `omnigraph --help` groups commands by capability with a legend.
- **Address cluster graphs for maintenance.** `optimize` / `repair` / `cleanup` accept `--cluster --graph ` (`--cluster` is a cluster directory, storage-root URI, or a `clusters:` name from `~/.omnigraph/config.yaml`), resolving the graph's storage URI from the served cluster state (no need to hand-type `/graphs/.omni`). `--graph` is the single graph selector across server and cluster scopes. Conversely, `omnigraph init` **refuses** a cluster-managed path and points at `cluster apply` — graphs in a cluster are created with ledger/recovery/approvals, not by hand. `schema apply` refuses a cluster-managed graph for the same reason (and the server rejects a cluster-backed schema apply with `409`, pointing at `cluster apply`).
- **Write diagnostics + destructive-write safety (RFC-011 Decision 9).** Every write (`load`, `mutate`, `branch create|delete|merge`, `schema apply`, `optimize`, `repair`, `cleanup`) echoes its resolved target + access path to stderr — e.g. `omnigraph load → s3://…/knowledge.omni (direct, remote)` — suppressible with the global `--quiet`.
  Destructive writes against a **non-local** scope (`cleanup`, overwrite `load`, `branch delete` against an `http(s)://` server or `s3://` store/cluster) require explicit consent: the global `--yes`, an interactive TTY prompt, or — for a non-interactive / `--json` run — a hard refusal instead of silently proceeding. Local (`file://`) writes are unaffected.
- **Route alignment: canonical `POST /load`.** The server gains a canonical `POST /load`; `POST /ingest` is now a deprecated alias that emits RFC 9745 `Deprecation: true` + RFC 8288 `Link: ; rel="successor-version"` headers (a sibling-relative reference that resolves under `/graphs/{id}/…`). The CLI's `load` targets `/load`.
- **Operator aliases get their own namespace (`omnigraph alias `).** A personal binding to a stored query on a named server is invoked as `omnigraph alias [args]` (RFC-011 Decision 4), so an alias can never shadow — or be shadowed by — a built-in verb. `alias` rejects global scope flags (`--server`/`--graph`/`--store`/`--cluster`/`--profile`/`--as`) its binding already owns.
- **No-graph addressing lists candidates (RFC-011 Decision 7).** When a scope has no `--graph` and no `default_graph`, the CLI never silently picks. A **cluster** scope with exactly one applied graph uses it automatically and otherwise **lists the candidates** (from the served catalog). A multi-graph **server** lists the candidates (from `GET /graphs`) and requires `--graph `.
- **Invoke stored queries by name (RFC-011 Decision 3).** `omnigraph query ` / `mutate ` invoke a stored query **by name** from the served catalog — `omnigraph query find_people` instead of `--query find.gq --name find_people`. The verb asserts the query's kind (an `expect_mutation` flag on `POST /queries/{name}`: `query ` is rejected with `'' is a mutation — use omnigraph mutate `, and vice-versa). `.gq` files become the explicit ad-hoc lane (`-e` / `--query`), with the positional selecting which query in the source.
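
The destructive-write consent rule (RFC-011 Decision 9) described in this section can be read as a small decision table. The Python sketch below is an illustration only, not the CLI's code: `Consent`, `is_destructive`, and `consent_decision` are hypothetical names, and the scheme check is a simplification of whatever scope resolution the CLI actually performs.

```python
from enum import Enum
from typing import Optional


class Consent(Enum):
    PROCEED = "proceed"  # run the write
    PROMPT = "prompt"    # ask on an interactive TTY
    REFUSE = "refuse"    # hard refusal, nothing is written


# Assumption: a target counts as non-local when it uses one of these schemes.
NON_LOCAL_SCHEMES = ("http://", "https://", "s3://")


def is_destructive(verb: str, mode: Optional[str] = None) -> bool:
    """cleanup, branch delete, and load in overwrite mode are destructive."""
    if verb == "load":
        return mode == "overwrite"
    return verb in ("cleanup", "branch delete")


def consent_decision(verb, target, *, mode=None, yes=False,
                     is_tty=False, json_output=False) -> Consent:
    """Decide what to do before a write (hypothetical helper)."""
    if not is_destructive(verb, mode):
        return Consent.PROCEED
    if not target.startswith(NON_LOCAL_SCHEMES):
        return Consent.PROCEED  # local (file://) writes are unaffected
    if yes:
        return Consent.PROCEED  # explicit --yes
    if is_tty and not json_output:
        return Consent.PROMPT   # interactive run: ask first
    return Consent.REFUSE       # non-interactive or --json run


print(consent_decision("cleanup", "s3://bucket/prefix", json_output=True))
print(consent_decision("load", "file:///tmp/g.omni", mode="overwrite"))
```

The first call models a CI-style `--json` run against an `s3://` target and is refused; the second is a local overwrite and proceeds without a prompt.
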
### Engine & substrate

- **Lance 6.0.1 → 7.0.0.** The columnar substrate is bumped to Lance 7.x with correct-by-design alignment: the unenforced primary key is immutable once set, `WriteParams::auto_cleanup` is disabled so version GC stays operator-owned, and the native namespace/`object_store` 0.13 surface is pinned by surface-guard tests. No on-disk format change for existing graphs.
- **Indexed graph traversal.** `Expand` can run over a BTREE-indexed path, asserted semantically equal to the CSR traversal it accelerates.
- **Scalar index coverage + filter literal coercion.** Closes index-coverage gaps and coerces filter literals correctly, cutting query latency on indexed scans.
- **Index materialization is derived state.** `schema apply` records `@index`/`@key` *intent* and builds nothing (index-only changes touch no table data); `load`/`mutate` build inline through one chokepoint but **defer** an untrainable Vector column as *pending* instead of aborting; `optimize` is the reconciler that materializes declared-but-missing indexes and folds appended fragments back into existing ones.
- **Recovery liveness + one storage substrate.** Writers heal a recoverable write on the *next write* (not only at the next read-write open); a storage fault-injection matrix exercises the sidecar lifecycle; the cluster and engine share one `StorageAdapter` over `object_store`.
- **Branch-fork self-heal.** Manifest-unreferenced branch forks are reclaimed (eager best-effort + a `cleanup` reconciler backstop), so a failed branch-delete reclaim no longer wedges a reused branch name.
- **Composite `@unique(a, b)`.** Enforced as a true composite key, with one shared keying function for intake and branch-merge that fails loudly on an un-keyable column type rather than silently exempting it.
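
The composite `@unique(a, b)` behavior above, one shared keying function that fails loudly on an un-keyable column, can be pictured with a short Python sketch. This is not the engine's code: `KEYABLE_TYPES`, `composite_key`, and `find_conflicts` are hypothetical names, and the set of keyable types is an assumption made for the example.

```python
# Assumption: the value types this sketch treats as keyable.
KEYABLE_TYPES = (str, int, bytes)


def composite_key(row, columns):
    """Build one tuple key from several columns, refusing un-keyable values."""
    parts = []
    for column in columns:
        value = row[column]
        if not isinstance(value, KEYABLE_TYPES):
            # Fail loudly instead of silently exempting the row from the check.
            raise TypeError(
                f"column {column!r} has un-keyable type {type(value).__name__}"
            )
        parts.append(value)
    return tuple(parts)


def find_conflicts(rows, columns):
    """Return (first_index, duplicate_index) pairs that violate uniqueness."""
    first_seen = {}
    conflicts = []
    for index, row in enumerate(rows):
        key = composite_key(row, columns)
        if key in first_seen:
            conflicts.append((first_seen[key], index))
        else:
            first_seen[key] = index
    return conflicts


rows = [
    {"a": "x", "b": 1},
    {"a": "x", "b": 2},  # same a, different b: allowed
    {"a": "x", "b": 1},  # same (a, b) as row 0: conflict
]
print(find_conflicts(rows, ["a", "b"]))  # [(0, 2)]
```

Because both callers would go through the same `composite_key`, an intake check and a merge check cannot disagree about which rows collide, which is the point of sharing one keying function.
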
### Embeddings: provider-independent (RFC-012)

- **One client, any provider.** Text embedding moves to a single provider-independent `EmbeddingConfig` behind a sealed `Provider` enum: **OpenAI-compatible** (the **OpenRouter** default gateway — one key for many models — plus OpenAI-direct and self-hosted endpoints), native **Gemini**, and a deterministic **Mock**. One client serves both the query path and the offline `omnigraph embed` CLI, with a per-query deadline and `tracing` observability. The dead, uncallable compiler-crate OpenAI client (and its `reqwest`/`tokio` deps) was removed.
- **Same-space guarantee.** `@embed("source", model="…")` records the embedding identity (model) in the schema IR so it travels with the data; a string `nearest()` whose resolved embedder model differs from the recorded one is **rejected with a typed error** instead of silently ranking across vector spaces. (`@embed` still does no ingest-time embedding — deferred to a later phase.)

## Breaking & behavior changes

- **`omnigraph.yaml` is removed.** The CLI and server no longer read it at all; the `OmnigraphConfig` type, `omnigraph config migrate`, and the deprecation env vars (`OMNIGRAPH_NO_LEGACY_CONFIG`, `OMNIGRAPH_SUPPRESS_YAML_DEPRECATION`, `OMNIGRAPH_CONFIG`) are gone. Configure via a team `cluster.yaml` and a per-operator `~/.omnigraph/config.yaml` (see Upgrade notes).
- **`omnigraph-server` boots only from `--cluster`.** The positional-`` single-graph boot and the `omnigraph.yaml` `graphs:`-map boot are removed; all HTTP is under `/graphs/{id}/…` (with flat `/healthz` and the `/graphs` enumeration). Upgrade deployments to `omnigraph-server --cluster `.
- **Default embedding provider flips to OpenRouter.** Embedding is no longer hardwired to Gemini: the default provider is **OpenAI-compatible via OpenRouter**, `OMNIGRAPH_GEMINI_BASE_URL` is dropped, and Gemini-direct users must set `OMNIGRAPH_EMBED_PROVIDER=gemini`.
  A `nearest("string")` query whose resolved model differs from a property's recorded `@embed(model=…)` is now a typed error rather than silent cross-space ranking.
- **`query --alias ` is removed.** Invoke operator aliases via `omnigraph alias [args]`.
- **`query`/`mutate` no longer take a positional graph URI, `--uri`, or `--name`** (RFC-011 D3). The positional is now the query name; address the graph with `--store` (local) / `--server` / `--profile`, and select a query within an ad-hoc `--query`/`-e` source with the positional (replacing `--name`). By-name catalog invocation is **served-only** (a bare `--store` has no catalog — use `-e`/`--query` there). Scripts using `query --query f.gq --name q` become `query --store --query f.gq q`.
- **Legacy data-plane addressing removed** (#238): `--target`, the positional `http(s)://`→remote dispatch, and `--as` on a served write (the actor is resolved server-side from the bearer token) no longer exist.
- **`omnigraph load` replaces direct-storage-only loading; `--mode` is required.** Scripts calling `load` without `--mode` must add one (`overwrite|append|merge`).
- **`omnigraph ingest` is deprecated** (still works; one-line stderr warning). Use `load --from --mode `.
- **Loading into a missing branch is now an error without `--from`** (CLI and `POST /load`/`POST /ingest`): a missing branch returns 404 / fails, never an implicit fork. Pass `--from ` (CLI) or the request `from` field (HTTP) to fork-if-missing. This affects any workflow that relied on auto-forking.
- **Scope flags that can't apply now error instead of being silently ignored.** `--server` on any direct/control/session command, `--cluster` outside the cluster-scoped verbs, and `--graph` where no multi-graph scope applies all fail with a declared message. `--graph` is the single graph selector and is **accepted** on `optimize` / `repair` / `cleanup` when paired with `--cluster` (replacing the removed `--cluster-graph`).
- **`schema apply` is refused against a cluster-managed graph.** The CLI signposts `omnigraph cluster apply`; a cluster-backed server returns `409 Conflict` (after the Cedar gate, so an unauthorized actor still gets `403`). Cluster graphs evolve through `cluster apply`, never a direct apply.
- **Storage-plane error text changed.** A maintenance verb pointed at a remote target now fails with a declared direct-capability message (replacing the older "only supported against local graph URIs" wording). Error strings are observable contract (Hyrum); pin against the new text.
- **Non-local destructive writes now require `--yes` in automation.** A `cleanup` / overwrite-`load` / `branch delete` against an `http(s)://` or `s3://` target with `--json` (or any non-TTY context) previously executed; it now **refuses** unless `--yes` is passed. CI scripts that destroy remote data must add `--yes`. Local (`file://`) writes are unchanged.
- **`omnigraph init` no longer scaffolds a config file,** and **refuses a cluster-managed storage path** (`/graphs/.omni` under a cluster) — create those graphs with `cluster apply`.
- **`POST /ingest` is deprecated** (kept indefinitely as a shim) and returns `Deprecation`/`Link` headers. **A v0.7 CLI talks to `POST /load`,** which a pre-0.7 server does not expose — upgrade the server and CLI together, or keep using `ingest`.
- **`ServingPolicy` (cluster crate API) carries verified policy content instead of a blob path; `read_serving_snapshot` and several cluster command entry points are now `async`.**
- **`omnigraph --help` reorders commands** (grouped by capability) and **hides the deprecated `ingest`** from the listing — `ingest` still runs. Help text is observable; this is a deliberate output change.

## Upgrade notes

- Existing clusters need no migration: an absent `storage:` key keeps the config-directory layout byte-for-byte.
- **`omnigraph.yaml` is no longer read.** There is no automated migrate command in 0.7.0; recreate configuration as a team `cluster.yaml` (graphs, schemas, stored queries, policies — see `docs/user/clusters/`) plus a per-operator `~/.omnigraph/config.yaml` (identity, servers, credentials, defaults — see `docs/user/cli/reference.md`).
- **`omnigraph-server` now requires `--cluster `** — there is no positional-URI boot. Run `cluster apply` first, then serve the applied revision.
- **Gemini-direct embedding users** set `OMNIGRAPH_EMBED_PROVIDER=gemini` (the default is now OpenRouter); `OMNIGRAPH_GEMINI_BASE_URL` is removed.
- Audit scripts for two CLI changes: add `--mode` to every `load`, and add `--from ` anywhere you relied on a missing branch being auto-created.
- Upgrade server and CLI together for the `/load` route (or keep `ingest`).
- Operator setup is three lines: `mkdir -p ~/.omnigraph`, write `operator.actor` (and `servers:`) into `~/.omnigraph/config.yaml`, then `echo $TOKEN | omnigraph login `.

## Internals

- The cluster, server, and CLI crates were modularized (the ~7.9k-line cluster `lib.rs` into focused modules; the server and CLI test monoliths into per-area suites) — pure code movement.
- The parity matrix (embedded vs remote) is the new referee for CLI behavior; the OpenAPI drift test guards `openapi.json`; Lance-surface guard tests pin the upstream APIs the engine depends on (the first smoke check on a Lance bump).
- Gated end-to-end suites run the full cluster lifecycle against a real S3-compatible store in CI (lock-release regression, config-free boot from a bare bucket URI).
- The deployment guide gains the bucket-no-volume container recipe for AWS / S3-compatible object storage.
- `clap` updated to 4.6.1. CI runs the full workspace suite on `main` post-merge rather than on every PR (faster PR turnaround; the local `cargo test --workspace --locked` is the pre-merge gate).
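
For readers unfamiliar with the parity-matrix idea mentioned under Internals, the shape of such a check is roughly the following. This Python sketch is illustrative only and does not reflect the project's test suite: `parity_mismatches` and the three stand-in clients are hypothetical, and plain functions take the place of the real embedded and HTTP `GraphClient` implementations.

```python
def parity_mismatches(cases, embedded, remote):
    """Run each (verb, args) case on both clients and collect differences."""
    mismatches = []
    for verb, args in cases:
        local_result = embedded(verb, args)
        remote_result = remote(verb, args)
        if local_result != remote_result:
            mismatches.append((verb, local_result, remote_result))
    return mismatches


# Stand-in clients: plain functions instead of a real engine or HTTP client.
def embedded_client(verb, args):
    return {"verb": verb, "rows": sorted(args.get("ids", []))}


def remote_client(verb, args):
    return {"verb": verb, "rows": sorted(args.get("ids", []))}


def drifting_client(verb, args):
    # Deliberately drops rows for "query" to show what drift looks like.
    rows = [] if verb == "query" else sorted(args.get("ids", []))
    return {"verb": verb, "rows": rows}


cases = [("query", {"ids": [2, 1]}), ("load", {"ids": [3]})]

# Matching implementations produce no mismatches.
assert parity_mismatches(cases, embedded_client, remote_client) == []

# A drifting implementation is reported per verb.
print(parity_mismatches(cases, embedded_client, drifting_client))
```

The useful property is that the case list is written once and every verb is compared across both paths, so a behavior change in only one implementation shows up as a named mismatch.
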