Merge remote-tracking branch 'origin/main' into ragnorc/shaping-config-integration

# Conflicts:
#	crates/omnigraph-cluster/src/lib.rs
#	crates/omnigraph-cluster/src/serve.rs
#	crates/omnigraph-server/src/lib.rs
#	crates/omnigraph-server/src/settings.rs
#	docs/user/clusters/config.md
aaltshuler 2026-06-16 04:13:00 +03:00
commit 4f8c71fa23
75 changed files with 6557 additions and 6879 deletions


@ -6,34 +6,43 @@
omnigraph init --schema schema.pg graph.omni
omnigraph load --data data.jsonl --mode overwrite graph.omni
omnigraph snapshot graph.omni --branch main --json
omnigraph query --uri graph.omni --query queries.gq --name get_person --params '{"name":"Alice"}'
omnigraph mutate --uri graph.omni --query queries.gq --name insert_person --params '{"name":"Mina","age":28}'
# Invoke a stored query BY NAME from the catalog (served — addressed by scope):
omnigraph query get_person --params '{"name":"Alice"}'
omnigraph mutate insert_person --params '{"name":"Mina","age":28}'
```
`omnigraph query` is the canonical read command (pairs with `POST /query`);
`omnigraph mutate` is the canonical write command (pairs with `POST /mutate`).
The previous names `omnigraph read` and `omnigraph change` keep working as
visible aliases — invocations emit a one-line deprecation warning to stderr
and otherwise behave identically. See [Deprecated names](#deprecated-names)
for the migration table.
The positional argument is the **stored-query name**, invoked from the served
catalog (RFC-011 D3) — the graph is addressed by scope (`--server` / `--profile`
/ defaults), and the verb asserts the query's kind (`query` rejects a stored
mutation, and vice-versa). The previous names `omnigraph read` and
`omnigraph change` keep working as visible aliases — invocations emit a one-line
deprecation warning to stderr. See [Deprecated names](#deprecated-names).
For ad-hoc reads and mutations (REPLs, AI agents, one-off scripts), pass the
GQ source inline with `-e` / `--query-string` instead of a file path:
For **ad-hoc** reads and mutations (REPLs, AI agents, one-off scripts, local dev),
pass the GQ source with `-e` / `--query-string` (inline) or `--query <path>` (a
file), and address a graph's storage directly with `--store`. By-name catalog
invocation is served-only — a bare `--store` has no catalog, so it's the ad-hoc
lane:
```bash
omnigraph query --uri graph.omni \
omnigraph query --store graph.omni \
-e 'query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }' \
--params '{"name":"Alice"}'
omnigraph mutate --uri graph.omni \
omnigraph mutate --store graph.omni \
-e 'query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }' \
--params '{"name":"Inline","age":42}'
# A multi-query file: the positional selects which query to run.
omnigraph query --store graph.omni --query queries.gq get_person --params '{"name":"Alice"}'
```
`-e` is mutually exclusive with `--query <path>` and `--alias <name>`; exactly
one of the three must be provided. The inline source travels through the same
parser, lint, params binding, and commit machinery as a file-based query —
only the source loader changes.
`-e` is mutually exclusive with `--query <path>`. With either, the positional
name (optional) selects which query in the source to run. The inline source
travels through the same parser, lint, params binding, and commit machinery as a
file-based query — only the source loader changes.
## Branching And Reviewable Data Flows
@ -50,19 +59,18 @@ omnigraph commit show --uri graph.omni <commit-id> --json
## Remote Server Mode
Serve a graph:
Serve a cluster-applied graph:
```bash
omnigraph-server graph.omni --bind 127.0.0.1:8080
omnigraph cluster apply --config ./company-brain
omnigraph-server --cluster ./company-brain --bind 127.0.0.1:8080
```
Read through the HTTP API:
Read through the HTTP API — invoke a stored query by name from the catalog:
```bash
omnigraph query \
omnigraph query get_person \
--server http://127.0.0.1:8080 \
--query queries.gq \
--name get_person \
--params '{"name":"Alice"}'
```
@ -71,25 +79,31 @@ literal URL); a positional `http(s)://` URI is rejected. If the server requires
auth, set its bearer token and `omnigraph login <server>` (or
`OMNIGRAPH_BEARER_TOKEN`).
## Multi-graph servers (v0.6.0+)
## Multi-graph servers
Against a multi-graph server (started with `--config omnigraph.yaml` referencing a non-empty `graphs:` map), use `omnigraph graphs list` to enumerate the registered graphs. The server must configure bearer tokens and `server.policy.file` with a rule that allows `graph_list`; `/graphs` is closed by default even when the server runs with `--unauthenticated`.
A server boots from a cluster directory (`omnigraph-server --cluster <dir>`) and
serves every graph the cluster declares. Use `omnigraph graphs list` to enumerate
them. The cluster's server-level policy must allow `graph_list`; `/graphs` is
closed by default even when the server runs with `--unauthenticated`.
```bash
OMNIGRAPH_BEARER_TOKEN=admin-token \
omnigraph graphs list --uri http://server.example.com --json
omnigraph graphs list --server http://server.example.com --json
```
For config-driven clients, set the remote graph's `bearer_token_env` to an environment variable containing a token whose actor is authorized by `server.policy.file`.
For an operator-defined server, store its token with `omnigraph login <name>` (or
`OMNIGRAPH_TOKEN_<NAME>`); the actor must be authorized by the cluster's
server-level policy.
`list` rejects local URI targets — it's for remote multi-graph servers only.
`list` rejects local (`--store`) targets — it's for remote multi-graph servers only.
Runtime add/remove is **not** in v0.6.0. To add a graph, stop the server, add a `graphs.<id>` entry to `omnigraph.yaml`, then restart. To remove, stop the server, delete the entry, restart.
Runtime add/remove via API is not exposed. To add or remove a graph, edit the
cluster's `cluster.yaml`, run `omnigraph cluster apply`, then restart the server.
Per-graph URLs: hit a graph's cluster route from any subcommand by pointing `--uri` at it:
Per-graph addressing: select a graph on a multi-graph server with `--graph`:
```bash
omnigraph read --uri http://server.example.com/graphs/beta --query q.gq ...
omnigraph query get_person --server http://server.example.com --graph beta --params '{"name":"Ada"}'
```
## Runs, Policy, And Diagnostics
@ -100,9 +114,9 @@ omnigraph check --query queries.gq graph.omni --json
omnigraph schema plan --schema next.pg graph.omni --json
omnigraph schema apply --schema next.pg graph.omni --json
omnigraph policy validate --config omnigraph.yaml
omnigraph policy test --config omnigraph.yaml
omnigraph policy explain --config omnigraph.yaml --actor act-alice --action read --branch main
omnigraph policy validate --cluster ./company-brain --graph knowledge
omnigraph policy test --cluster ./company-brain --graph knowledge --tests policy.tests.yaml
omnigraph policy explain --cluster ./company-brain --graph knowledge --actor act-alice --action read --branch main
omnigraph commit list graph.omni --json
omnigraph commit show --uri graph.omni <commit-id> --json
@ -116,34 +130,29 @@ also pass `--schema`.
## Config
`omnigraph.yaml` lets the CLI and server share named graphs, defaults, and
query roots:
Configuration has two surfaces with single owners (see the
[CLI reference](reference.md#config-surfaces) for the full schema):
- **`~/.omnigraph/config.yaml`** — your personal operator config: default actor
(`--as`), named servers + credentials, clusters, profiles, aliases, and
default scope (`defaults.server` / `defaults.store` / `default_graph`). It
decides *who you are* and *what you address by default*.
- **`cluster.yaml`** (a team-owned cluster directory) — declares *what the system
is*: graphs, schemas, stored queries, policies, and storage. A server boots
from it (`--cluster <dir>`); see the [cluster guide](../clusters/index.md).
```yaml
graphs:
local:
uri: demo.omni
# ~/.omnigraph/config.yaml
operator:
actor: act-andrew
servers:
dev:
uri: http://127.0.0.1:8080
bearer_token_env: OMNIGRAPH_BEARER_TOKEN
cli:
graph: local
branch: main
query:
roots:
- queries
- .
url: http://127.0.0.1:8080
defaults:
server: dev
default_graph: knowledge
```
The config file can also define:
- server bind defaults
- auth env files
- query aliases for common read and change commands
- `policy.file` for Cedar authorization rules
When policy is enabled, `schema apply` is authorized through the
`schema_apply` action and is typically limited to admins on protected `main`.
@ -161,6 +170,6 @@ one-line warning to stderr and otherwise behave identically.
| `omnigraph query lint` | `omnigraph lint` | Same flags. The argv-level shim rewrites `query lint` to `lint`. |
| `omnigraph query check` | `omnigraph check` | `check` is a visible alias of `omnigraph lint`. |
The `command:` field in `aliases.<name>` in `omnigraph.yaml` accepts both
`read` / `change` (legacy) and `query` / `mutate` (canonical); the two
The `command:` field in `aliases.<name>` in `~/.omnigraph/config.yaml` accepts
both `read` / `change` (legacy) and `query` / `mutate` (canonical); the two
spellings are interchangeable on the wire via serde aliases.


@ -1,31 +1,32 @@
# CLI Reference (`omnigraph`)
A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](index.md).
A reference for the `omnigraph` binary's command surface and the per-operator `~/.omnigraph/config.yaml` schema. For a quick-start guide, see [cli.md](index.md).
Top-level command families and subcommands. Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server <name|url>` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph <id>` for multi-graph servers; exclusive with a positional URI), `--store <uri>` (a single graph's storage directly), or `--profile <name>` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles-rfc-011)); `cluster` commands use `--config <dir>`. A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected.
Top-level command families and subcommands. Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server <name|url>` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph <id>` for multi-graph servers; exclusive with a positional URI), `--store <uri>` (a single graph's storage directly), or `--profile <name>` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles-rfc-011)); `cluster` commands use `--config <dir>`, while `policy` and `queries` read a cluster's applied state via `--cluster <dir|uri>`. A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected. **`query`/`mutate` are the exception**: their positional is a stored-query *name* (RFC-011 D3), not a graph URI, so they address the graph only via `--store`/`--server`/`--profile`/defaults.
## Top-level commands
| Command | Purpose |
|---|---|
| `init` | `--schema <pg>` → initialize a graph (no longer scaffolds `omnigraph.yaml`; start cluster configs from the [cluster.md](../clusters/index.md) quick-start or `config migrate`) |
| `init` | `--schema <pg>` → initialize a graph (start cluster configs from the [cluster.md](../clusters/index.md) quick-start) |
| `load` | bulk load a branch, local or remote (`--mode overwrite\|append\|merge` is **required** — overwrite is destructive, so there is no default). Without `--from` the target branch must exist; `--from <base>` forks a missing `--branch` from `<base>` first |
| `ingest` | deprecated alias of `load --from <base>` (defaults: `--from main --mode merge`); prints a one-line warning to stderr |
| `query` (alias: `read`) | run named read query; source via `--query <path>`, `-e`/`--query-string <GQ>`, or `--alias <name>` (exactly one). `read` is the deprecated previous name and prints a one-line warning to stderr |
| `mutate` (alias: `change`) | run mutation query; same `--query` / `-e` / `--alias` mutual-exclusion as `query`. `change` is the deprecated previous name and prints a one-line warning to stderr |
| `query <name>` (alias: `read`) | run a read query. **Catalog lane** (default): `<name>` is a stored query invoked **by name** from the served catalog (served-only — address with `--server`/`--profile`; the verb asserts the query is a read). **Ad-hoc lane**: with `--query <path>` or `-e`/`--query-string <GQ>`, runs that source (the positional `<name>` then selects which query in it). No positional graph URI — address via `--store`/`--server`/`--profile`. `read` is the deprecated previous name (one-line stderr warning) |
| `mutate <name>` (alias: `change`) | run a mutation query; same catalog (by-name, served-only, verb asserts mutation) / ad-hoc (`--query`/`-e`) lanes as `query`. `change` is the deprecated previous name (one-line stderr warning) |
| `alias <name> [args]` | invoke an operator alias — a read-only personal binding (under `aliases:` in `~/.omnigraph/config.yaml`) to a stored query on a named server (RFC-011 D4; replaces the removed `--alias` flag; stored mutations are rejected before execution) |
| `snapshot` | print current snapshot (per-table version + row count) |
| `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) |
| `branch create \| list \| delete \| merge` | branching ops |
| `commit list \| show` | inspect commit graph |
| `schema plan \| apply \| show (alias: get)` | migrations |
| `schema plan \| apply \| show (alias: get)` | migrations. `apply` refuses a cluster-managed graph (one whose storage is inside a cluster) and points at `cluster apply` — those graphs evolve through the cluster ledger, not a direct apply |
| `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
| `config migrate` | propose (or `--write`: apply) the split of a legacy `omnigraph.yaml` — team half → ready-to-review `cluster.yaml`, personal half → `~/.omnigraph/config.yaml` (key-level merge, existing entries win), plus dropped-key reasons and manual steps |
| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve <resource> --as <actor>` (`apply`/`approve` default the actor from the per-operator `omnigraph.yaml`'s `cli.actor` when `--as` is omitted; nothing else in that file affects cluster commands); what apply converges is what an `omnigraph-server --cluster <dir>` deployment serves on its next restart (omnigraph.yaml deployments are unaffected); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock <LOCK_ID>` manually removes a held local state lock by exact id |
| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve <resource> --as <actor>` (`apply`/`approve` default the actor from `~/.omnigraph/config.yaml`'s `operator.actor` when `--as` is omitted); what apply converges is what an `omnigraph-server --cluster <dir>` deployment serves on its next restart (`--cluster` is the server's only boot source — RFC-011 cluster-only); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock <LOCK_ID>` manually removes a held local state lock by exact id |
| `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) |
| `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. `--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review |
| `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
| `cleanup --keep N --older-than 7d --confirm` | destructive version GC (`--confirm` to execute; also needs `--yes` against a non-local `s3://` target — see *Write diagnostics & destructive confirmation*) |
| `embed` | offline JSONL embedding pipeline |
| `policy validate \| test \| explain` | Cedar tooling. Selects `cli.graph`, else `server.graph`, else top-level `policy.file` |
| `policy validate \| test \| explain` | Cedar tooling against a cluster's applied policies (`--cluster <dir>`; `--graph <id>` picks a graph's bundle when several apply). `test` takes `--tests <file>`; `explain` takes `--actor`/`--action`/`--branch`/`--target-branch` |
| `profile list \| show [<name>]` | read-only inspection of `~/.omnigraph/config.yaml` profiles. `list` shows each profile's binding (server/cluster/store) + default graph and marks the `$OMNIGRAPH_PROFILE`-active one; JSON keeps `binding` and adds `scope_kind`, `target`, `valid`, and `error`; `show` resolves one profile's scope (endpoint + default graph), defaulting to the active profile, else the flat operator defaults |
| `version` / `-v` | print `omnigraph 0.3.x` |
## Command capabilities
@ -34,21 +35,30 @@ Every command declares the **capability** it needs — what it requires to reach
- **`any`** — `query`, `mutate`, `load`, `ingest`, `branch *`, `snapshot`, `export`, `commit *`, `schema show`, `schema apply`. Run against a graph **served (via a server) or embedded (direct against a store)**: accept a positional `file://`/`s3://` URI, `--server <name|url>` (+ `--graph <id>` for multi-graph servers), `--store <uri>`, or `--profile <name>`. A remote server is addressed with `--server` — a positional `http(s)://` URI does **not** dispatch to one.
- **`served`** — `graphs list`. Requires a server (accepts `--server` / `--profile`).
- **`direct`** — `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `queries validate`, `lint`. Need **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI`, but **not** `--server` / `--graph`, and a remote (`http(s)://`) URI is rejected. `optimize` / `repair` / `cleanup` also accept **`--cluster <dir|s3://…> --cluster-graph <id>`**, which resolves the graph's storage URI from the served cluster state (so you needn't know the `<storage>/graphs/<id>.omni` layout).
- **`control`** — `cluster *`. Operates on a cluster directory via `--config <dir>`.
- **`local`** — `policy *`, `embed`, `login`, `logout`, `config`, `version`, `queries list`. Address no graph.
- **`direct`** — `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `lint`. Need **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI`, but **not** `--server`, and a remote (`http(s)://`) URI is rejected. `optimize` / `repair` / `cleanup` additionally accept **`--cluster <dir|s3://…> --graph <id>`** (`--cluster` is a cluster directory or storage-root URI, named via `clusters:` in `~/.omnigraph/config.yaml` or a literal root), which resolves the graph's storage URI from the served cluster state (so you needn't know the `<storage>/graphs/<id>.omni` layout). `--graph` is the one graph selector across all scopes — on these three verbs it picks the cluster graph; on the other `direct` verbs it does not apply.
- **`control`** — `cluster *` via `--config <dir>`; `policy *` and `queries *` via `--cluster <dir|uri>` or a cluster profile.
- **`local`** — `alias`, `embed`, `login`, `logout`, `profile`, `version`. Address no explicit graph scope.
These restrictions are enforced and reported, not silent:
- A served-graph flag (`--server` / `--graph`) on a verb that doesn't reach a graph through a server fails loudly, e.g.: ``optimize is a direct (storage-native) command; --server/--graph address a served graph and do not apply. Pass a storage URI, or --cluster <dir> --cluster-graph <id>.``
- A scope flag on a verb that can't consume it fails loudly rather than being silently dropped — `--server` outside a served scope, `--cluster` outside cluster-scoped verbs, or `--graph` where no multi-graph scope applies, e.g.: ``optimize is a direct (storage-native) command; --server addresses a served graph and does not apply. Pass a storage URI, or --cluster <dir> --graph <id>.``
- A `direct` verb pointed at a remote URI fails loudly, e.g.: ``optimize is a direct (storage-native) command and needs direct storage access; the resolved target is a remote server (https://…). Pass the graph's file:// or s3:// URI.``
- A data verb pointed at a positional `http(s)://` URI fails loudly: ``a remote graph must be addressed with --server <url> — a positional (or --uri) http(s):// URL no longer dispatches to a server.``
- `init` into an **established cluster's** storage layout (`<root>/graphs/<id>.omni` where `<root>` holds `__cluster/state.json`) is refused — graphs in a cluster are created by `cluster apply` (which records ledger / recovery / approvals), not `init`.
To maintain a server-backed graph, run the `direct` verbs from a host with storage access against the graph's storage URI (a positional URI, or `--cluster … --cluster-graph …`), out-of-band from the serving process — there are no server routes for `optimize` / `repair` / `cleanup` by design.
To maintain a server-backed graph, run the `direct` verbs from a host with storage access against the graph's storage URI (a positional URI, or `--cluster … --graph …`), out-of-band from the serving process — there are no server routes for `optimize` / `repair` / `cleanup` by design.
`omnigraph --help` lists commands with a **capability legend** at the bottom (any / served / direct / control / local).
## Write diagnostics & destructive confirmation
Two global flags make writes self-documenting and guard the dangerous ones (RFC-011 Decision 9):
- **Every write echoes its resolved target to stderr**, e.g. `omnigraph load → s3://acme/brain/graphs/knowledge.omni (direct, remote)`, so you catch a scope that resolved somewhere unexpected (e.g. *prod*) before it lands. Applies to `load`, `ingest`, `mutate`, `branch create|delete|merge`, `schema apply`, `optimize`, `repair`, `cleanup`. The line goes to stderr, so `--json` consumers reading stdout are unaffected; suppress it with **`--quiet`**.
- **Destructive writes against a non-local scope require confirmation.** `cleanup`, overwrite `load` (`--mode overwrite`), and `branch delete` proceed freely against a local (`file://`) graph, but when the resolved target is **not local** (a served `http(s)://` graph or an `s3://` store/cluster) they require explicit consent: pass **`--yes`** to confirm up front; without it, an interactive terminal prompts, and a non-interactive run (no TTY, or `--json`) **refuses with an error** rather than silently destroying data. `cleanup` still also requires its existing `--confirm` (preview→execute); `--yes` is the additional non-local consent.
A "local" target is a bare path or a `file://` URI; `http(s)://`, `s3://`, and other object-store schemes are non-local.
## Config surfaces
Two config surfaces with single owners, plus a zero-config tier:
@ -59,22 +69,20 @@ Two config surfaces with single owners, plus a zero-config tier:
| Operator config | one person | `~/.omnigraph/config.yaml` (override dir with `$OMNIGRAPH_HOME`) | who **I** am: identity, ergonomics |
| Flags / env | per invocation | — | everything, explicitly |
`omnigraph.yaml` (below) is the legacy combined file — fully supported
today, slated for staged deprecation; its keys' future homes are
listed there.
### `~/.omnigraph/config.yaml` (operator)
```yaml
operator:
actor: act-andrew # default identity for every --as cascade:
# --as > legacy cli.actor > operator.actor > none
actor: act-andrew # default identity for the --as cascade: --as > operator.actor > none
servers: # operator-owned endpoints; names key the credentials
prod:
url: https://graph.example.com # no tokens in this file, ever
defaults:
output: table # read format default, below --json/--format/alias/legacy
server: prod # the everyday scope when no address is given (RFC-011)
output: table # read format default, below --json/--format/alias
server: prod # the everyday SERVED scope when no address is given (RFC-011)
# store: file:///data/dev.omni # OR a zero-flag LOCAL default (mutually
# # exclusive with `server`); the local-dev
# # counterpart of `server`
default_graph: knowledge # graph selected in a server/cluster scope
clusters: # admin-only: managed-cluster storage roots (RFC-011).
brain: # the ONLY place a storage root lives in this file.
@ -85,8 +93,8 @@ profiles: # named scope bundles (RFC-011); pick with --profile
```
Absent file = empty layer. Unknown keys warn and load (a file written for a
newer CLI works on an older one). `$OMNIGRAPH_CONFIG=<path>` stands in for
`--config` (the flag wins) in both the CLI and the server.
newer CLI works on an older one). Override the config directory with
`$OMNIGRAPH_HOME`.
#### Scopes & profiles (RFC-011)
@ -95,20 +103,32 @@ graph in it; the served-vs-direct access path is derived from the scope, not
toggled. The scope comes from one of (highest precedence first): an explicit
address (a positional URI, `--server`, or `--store <uri>`); a named
`--profile <name>` (or `$OMNIGRAPH_PROFILE`); or the flat `defaults.server` +
`defaults.default_graph`. A **profile** binds exactly one of `server` / `cluster`
/ `store` plus an optional default graph — config data, not state: every command
resolves its scope fresh, there is no sticky "current" mode.
`defaults.default_graph` (a served default) **or** `defaults.store` (a zero-flag
*local* default — mutually exclusive with `defaults.server`). A **profile** binds
exactly one of `server` / `cluster` / `store` plus an optional default graph —
config data, not state: every command resolves its scope fresh, there is no
sticky "current" mode. Inspect what is defined with `omnigraph profile list` and
`omnigraph profile show [<name>]` (read-only).
- `--store <uri>` addresses a single graph's storage directly (ad-hoc / break-glass).
- A `cluster`-bound profile reaches `optimize` / `repair` / `cleanup` for a managed
graph (resolving its storage root from `clusters:`), the same as
`--cluster <root> --cluster-graph <id>`.
`--cluster <root> --graph <id>`. A `--graph` flag overrides the profile's default.
- A `server`-bound scope on a maintenance verb, or a `cluster`-bound scope on a
data verb, is rejected with a message pointing at the right addressing.
- **No graph selected (RFC-011 D7).** When a scope has no `--graph` and no
`default_graph`, the CLI never silently picks:
- **Cluster scope:** when exactly **one** graph is applied, it is used
automatically; when **several** are, the command errors and lists the
candidates (from the served catalog).
- **Server scope:** against a multi-graph server (any non-empty `GET /graphs`,
even a single entry) the command errors and lists the candidates, and you must
pass `--graph <id>`. A single-graph / flat server (405 on `/graphs`), or one
whose `/graphs` is policy-gated or unreachable, uses its bare URL as before.
`--target` and the positional-`http(s)://`→remote dispatch have been **removed**;
the remaining legacy surfaces (`--cluster-graph`, `omnigraph.yaml`'s `cli.graph`
default) still work and an explicit address always wins.
`--target`, `--cluster-graph`, and the positional-`http(s)://`→remote dispatch
have been **removed** (`--graph` is now the one graph selector across server and
cluster scopes); operator `defaults`/`--profile` supply the no-flag scope and an
explicit address always wins.
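The precedence described in this section (explicit address, then profile, then flat defaults) can be modeled roughly as follows. This is a sketch under stated assumptions, not the CLI's code: `resolve_scope`, the dict shapes, the `default_graph` key inside a profile, and the choice that `--profile` beats `$OMNIGRAPH_PROFILE` are all hypothetical.

```python
def resolve_scope(flags: dict, env: dict, config: dict) -> dict:
    """Return the scope a command would address (illustrative model only)."""
    graph_flag = flags.get("graph")

    # 1. An explicit address (positional URI, --server, --store) always wins.
    for kind in ("uri", "server", "store"):
        if flags.get(kind):
            return {"source": "explicit", "kind": kind,
                    "target": flags[kind], "graph": graph_flag}

    # 2. A named profile: --profile, else $OMNIGRAPH_PROFILE (order assumed).
    name = flags.get("profile") or env.get("OMNIGRAPH_PROFILE")
    if name:
        profile = config.get("profiles", {}).get(name)
        if profile is None:
            raise ValueError(f"unknown profile: {name}")
        bound = [k for k in ("server", "cluster", "store") if profile.get(k)]
        if len(bound) != 1:
            raise ValueError("a profile binds exactly one of server/cluster/store")
        kind = bound[0]
        # A --graph flag overrides the profile's default graph.
        return {"source": "profile", "kind": kind, "target": profile[kind],
                "graph": graph_flag or profile.get("default_graph")}

    # 3. Flat operator defaults: a served default OR a local store, never both.
    defaults = config.get("defaults", {})
    if defaults.get("server") and defaults.get("store"):
        raise ValueError("defaults.server and defaults.store are mutually exclusive")
    if defaults.get("server"):
        return {"source": "defaults", "kind": "server",
                "target": defaults["server"],
                "graph": graph_flag or defaults.get("default_graph")}
    if defaults.get("store"):
        return {"source": "defaults", "kind": "store",
                "target": defaults["store"], "graph": None}
    raise ValueError("no scope: pass an address, a profile, or set defaults")
```

Because every call resolves from flags, environment, and config alone, the model has no sticky "current" mode, which matches the description of profiles as config data rather than state.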
#### Credentials keyed by server name
@ -136,10 +156,11 @@ aliases:
format: table
```
`omnigraph query --alias triage 2026-06-01` invokes
`omnigraph alias triage 2026-06-01` invokes
`POST <server>/graphs/spike/queries/weekly_triage` with the keyed
credential. A legacy `omnigraph.yaml` alias with the same name wins during
the deprecation window (with a warning).
credential. Aliases live in their own `alias` namespace (RFC-011 Decision 4),
so an alias can never shadow — or be shadowed by — a built-in verb. (The old
`--alias <name>` flag on `query`/`mutate` was removed.)
A remote command whose URL prefix-matches an operator server's `url` (the
`gh` host model — no flags needed) resolves its token through:
@ -148,64 +169,10 @@ A remote command whose URL prefix-matches an operator server's `url` (the
|---|---|
| 1 | `OMNIGRAPH_TOKEN_<NAME>` env (`prod` → `OMNIGRAPH_TOKEN_PROD`) |
| 2 | `[<name>]` section in `~/.omnigraph/credentials` |
| 3 | the legacy chain unchanged (`bearer_token_env` → `OMNIGRAPH_BEARER_TOKEN` → `auth.env_file`) |
| 3 | the default `OMNIGRAPH_BEARER_TOKEN` env |
A token is only ever sent to the server it is keyed to: URLs matching no
operator server use the legacy chain alone.
## `omnigraph.yaml` schema (legacy combined file)
> **Deprecated.** Loading this file prints a per-key notice
> naming each present key's new home (suppress in CI with
> `OMNIGRAPH_SUPPRESS_YAML_DEPRECATION=1`); `omnigraph config migrate`
> produces the split. The file keeps working through the deprecation
> window. Migrated teams can set `OMNIGRAPH_NO_LEGACY_CONFIG=1` to turn
> any legacy-file load into a hard error (regression guard; the file's
> absence is always fine).
```yaml
project: { name }
graphs:
<name>:
uri: <local|s3://|http(s)://>
bearer_token_env: <ENV_NAME>
queries: # per-graph stored-query registry (server-role; multi-graph mode)
<query-name>: # key MUST equal the `query <name>` symbol inside the .gq
file: <path-to-.gq> # relative to this config's directory
mcp:
expose: true # default true: listed in the MCP catalog (GET /queries); set false to hide (still HTTP-callable)
tool_name: <name> # optional MCP tool-name override (defaults to <query-name>;
# must be unique across exposed queries)
server:
graph: <name>
bind: <ip:port>
cli:
graph: <name>
branch: <name>
output_format: json|jsonl|csv|kv|table
table_max_column_width: 80
table_cell_layout: truncate|wrap
query:
roots: [<dir>, …] # search path for .gq files
auth:
env_file: .env.omni
aliases:
<alias>:
# accepted values: `read` / `query` (read alias), `change` / `mutate`
# (write alias). `query` and `mutate` are recommended; `read` and
# `change` remain accepted forever for back-compat.
command: read|change|query|mutate
query: <path-to-.gq>
name: <query-name>
args: [<positional-name>, …]
graph: <name>
branch: <name>
format: <output-format>
queries: # top-level registry — applies only to a bare-URI (anonymous) graph; a graph served by name uses its `graphs.<id>.queries`. Mirrors top-level `policy`.
<query-name>: { file: <path-to-.gq> } # mcp.expose defaults to true
policy:
file: policy.yaml
```
A keyed token is only ever sent to the server it is keyed to: a URL matching no
operator server falls back to `OMNIGRAPH_BEARER_TOKEN` alone.
## Cluster config preview
@ -228,8 +195,8 @@ apply, refresh, and import acquire `__cluster/lock.json` by default and release
it before returning. `cluster apply` executes only stored-query/policy catalog
writes (content-addressed under `__cluster/resources/`) and requires an
existing `state.json`; graph/schema changes are deferred with warnings, and
applied resources do not serve traffic — the server still boots from
`omnigraph.yaml`. `cluster status` reads state only and reports any existing
applied resources do not serve traffic until an `omnigraph-server --cluster
<dir>` restart picks them up. `cluster status` reads state only and reports any existing
lock metadata. `force-unlock` removes a lock only when the supplied id exactly
matches the lock file. `refresh` requires an existing `state.json`; `import`
creates one only when it is missing. Both observe declared graphs read-only at
@ -248,7 +215,7 @@ embeddings, aliases, and bindings are reserved for later stages. See
## Param resolution
Precedence (high to low): explicit `--params` / `--params-file`, alias positional args, `omnigraph.yaml` defaults. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients.
Precedence (high to low): explicit `--params` / `--params-file`, alias positional args. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients.
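As an illustration only, the precedence above can be sketched in Python. The function names and the dictionary-based parameter model are hypothetical, not the CLI's internals. The safe-integer check assumes the usual JavaScript definition, where the largest exactly representable integer is 2^53 - 1.

```python
# Hypothetical sketch of CLI param precedence; not the real implementation.
JS_MAX_SAFE_INTEGER = 2**53 - 1  # largest integer a JS Number holds exactly


def is_js_safe_integer(value: int) -> bool:
    """True when `value` survives a round trip through a JS Number."""
    return -JS_MAX_SAFE_INTEGER <= value <= JS_MAX_SAFE_INTEGER


def resolve_params(explicit: dict, alias_names: list, alias_values: list) -> dict:
    """Merge params from lowest to highest precedence so explicit values win."""
    resolved = dict(zip(alias_names, alias_values))  # alias positional args
    resolved.update(explicit)                        # --params / --params-file
    return resolved
```

For example, `resolve_params({"name": "Bob"}, ["name", "age"], ["Alice", 28])` keeps the explicit `name` and takes `age` from the alias positionals.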
## Bearer token resolution (CLI)


@ -32,26 +32,24 @@ omnigraph cluster force-unlock <LOCK_ID> --config company-brain --json
`--config` points at a directory, not a file. The directory must contain
`cluster.yaml`. When omitted, it defaults to the current directory.
## Relationship to `omnigraph.yaml`
## Relationship to `~/.omnigraph/config.yaml`
`cluster.yaml` does not replace `omnigraph.yaml`, and the two never describe
the same fact. `omnigraph.yaml` is the permanent **per-operator** layer (CLI
defaults, the operator's identity and credential references, graph targets
for data-plane commands); `cluster.yaml` is the shared desired state of a
`cluster.yaml` and the per-operator `~/.omnigraph/config.yaml` never describe
the same fact. The operator config is the permanent **per-operator** layer
(the operator's identity and credential references, named servers/clusters,
profiles, and CLI defaults); `cluster.yaml` is the shared desired state of a
whole deployment, read only by the `cluster` commands via `--config`.
The exact contract:
- **Cluster commands read `omnigraph.yaml` for exactly one thing**: the
`cli.actor` default used by `apply`/`approve` when `--as` is omitted —
operator identity is a per-operator fact. With `--as` present, no config
is read at all. Nothing else (its graph set, targets, bind, queries,
policies) ever influences a cluster command; a malformed `omnigraph.yaml`
breaks only the no-flag actor lookup, loudly.
- **A `--cluster` server reads `omnigraph.yaml` for nothing** — not even the
implicit current-directory search runs (mode-inference rule 0). Boot from
cluster state XOR `omnigraph.yaml`, never a merge.
- **The other direction is ergonomics, not coupling**: a per-operator
- **Cluster commands read the operator config for exactly one thing**: the
`operator.actor` default used by `apply`/`approve` when `--as` is omitted —
operator identity is a per-operator fact. With `--as` present, the operator
config is not needed. Nothing else in it influences a cluster command.
- **No legacy `omnigraph.yaml`**: the CLI does not read `omnigraph.yaml` at
all, and a `--cluster` server reads only the cluster catalog — boot is
cluster-only.
- **The other direction is ergonomics, not coupling**: per-operator
data-plane commands address a cluster graph by its derived storage root
(`company-brain/graphs/knowledge.omni`) with `--store <uri>` — an ordinary
local path, no special handling.
@ -269,12 +267,11 @@ Deletes remove the resource from state; their old payload blobs stay on disk
(garbage collection is a later stage). Re-running a converged apply is a no-op:
no state write, no revision change (`state_written: false`).
**Applied means serving — for deployments that opt in.** A server started
with `--cluster <dir>` boots from the applied revision (see
**Applied means serving.** A server started with `--cluster <dir>` boots from
the applied revision (see
[Serving from the cluster](#serving-from-the-cluster-the-mode-switch)); it
picks up newly applied state on its next restart. Deployments still booting
from `omnigraph.yaml` are untouched: for them, applied means recorded in the
catalog, nothing more.
picks up newly applied state on its next restart. Until that restart, applied
means recorded in the catalog, nothing more.
### Graph creation


@ -117,7 +117,7 @@ omnigraph cluster apply --config company-brain --as andrew
`--as <actor>` attributes the run: it is recorded in recovery sidecars and
audit entries and threaded into the engine's commit history. Set
`cli: { actor: <you> }` in your per-operator `omnigraph.yaml` to make it the
`operator: { actor: <you> }` in your `~/.omnigraph/config.yaml` to make it the
default when `--as` is omitted (the flag always wins; `approve` requires one
of the two).
@ -244,12 +244,12 @@ with an in-flight apply.
- **CI-driven convergence**: `validate` and `plan --json` are read-only and
safe in pipelines; gate `apply --as ci` on plan review. Approvals are the
human step by design — keep `cluster approve` out of automation.
- **`omnigraph.yaml` still has a job**: per-operator settings — your
`cli.actor` default for `--as`, CLI defaults, credentials, and data-plane
ergonomics (address a cluster graph by its derived root like
`company-brain/graphs/knowledge.omni` with `--store` for loads). It just no
longer describes the deployment — a server boots from one source or the
other, never a merge of both.
- **`~/.omnigraph/config.yaml` is the per-operator config**: your
`operator.actor` default for `--as`, named servers/clusters, credentials,
profiles, and data-plane ergonomics (address a cluster graph by its derived
root like `company-brain/graphs/knowledge.omni` with `--store` for loads). The
cluster directory's `cluster.yaml` is the **sole deployment declaration**: the
server boots from the cluster only.
## 7. Maintaining a cluster graph
@ -258,10 +258,11 @@ operation — it runs out-of-band, with direct storage access, against the graph
roots. Address a cluster graph by name instead of hand-typing its storage path:
```bash
omnigraph optimize --cluster ./company-brain --cluster-graph knowledge
omnigraph cleanup --cluster ./company-brain --cluster-graph knowledge --keep 10 --confirm
# --cluster also takes the storage-root URI directly (config-free):
omnigraph optimize --cluster s3://bucket/clusters/company-brain --cluster-graph knowledge
omnigraph optimize --cluster ./company-brain --graph knowledge
omnigraph cleanup --cluster ./company-brain --graph knowledge --keep 10 --confirm
# --cluster also takes the storage-root URI directly (config-free), and a
# `clusters:` name from ~/.omnigraph/config.yaml:
omnigraph optimize --cluster s3://bucket/clusters/company-brain --graph knowledge
```
The graph's storage URI is resolved from the **served cluster state** (the same
@ -270,6 +271,16 @@ not resolvable. Run these from a host with storage access — there are no serve
routes for them. Conversely, **`init` refuses** a cluster-managed path: graphs in
a cluster are created by `cluster apply`, not by hand.
If the cluster has exactly **one** applied graph you can omit `--graph` — it is
used automatically. With **several**, omitting `--graph` errors and lists the
candidates (RFC-011 D7); it never picks one for you.
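The selection rule can be sketched as follows. This is a minimal Python illustration with hypothetical names. How the CLI reports an unknown graph id or an empty cluster is an assumption here, not something the docs specify.

```python
# Hypothetical sketch of --graph selection against a cluster's applied graphs.
from typing import Optional, Sequence


class AmbiguousGraphError(Exception):
    """Raised when --graph is omitted and the choice is not unique."""


def select_graph(applied: Sequence[str], requested: Optional[str] = None) -> str:
    if requested is not None:
        if requested not in applied:
            # Assumption: an id missing from the applied revision is an error.
            raise KeyError(f"graph {requested!r} is not in the applied revision")
        return requested
    if len(applied) == 1:
        return applied[0]  # the single applied graph is used automatically
    # Several graphs (an empty cluster is treated the same way here):
    # list the candidates and never guess.
    raise AmbiguousGraphError("pass --graph; candidates: " + ", ".join(sorted(applied)))
```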
Against an **`s3://`-backed cluster** the resolved graph storage is non-local, so a
destructive `cleanup` additionally requires **`--yes`** (an interactive prompt
otherwise, refusal without a TTY) on top of `--confirm` — see [cli-reference.md](../cli/reference.md)'s
*Write diagnostics & destructive confirmation*. Every maintenance run also echoes
its resolved target to stderr (suppress with `--quiet`).
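A rough Python sketch of this consent gate, under stated assumptions: the function and its return labels are hypothetical, only `s3://` targets count as non-local, and `--json` is treated as non-interactive, as the maintenance notes describe.

```python
# Hypothetical sketch of the destructive-cleanup consent gate.
def cleanup_gate(target: str, confirm: bool, yes: bool,
                 is_tty: bool, json_output: bool = False) -> str:
    """Return one of "preview", "run", "prompt", or "refuse"."""
    if not confirm:
        return "preview"  # without --confirm, only a preview line is printed
    non_local = target.startswith("s3://")  # assumption: only s3:// is non-local
    if not non_local or yes:
        return "run"      # local target, or --yes given on top of --confirm
    if is_tty and not json_output:
        return "prompt"   # interactive run: ask before destroying
    return "refuse"       # no TTY (or --json): refuse rather than destroy
```

The order of checks matters: `--yes` alone never bypasses the missing `--confirm`, so a forgotten flag degrades to a preview rather than a deletion.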
## What the control plane does not do (yet)
- **No hot reload** — applied changes serve on the next restart.


@ -13,13 +13,10 @@ Omnigraph supports two broad deployment shapes:
The server binary and container image expose the same HTTP surface.
The server also has two **boot sources**: `omnigraph.yaml` (graph targets
declared in the per-operator config) or a **cluster directory**
(`omnigraph-server --cluster <dir>`), which serves the cluster control
The server has a single **boot source**: a **cluster directory**
(`omnigraph-server --cluster <dir | s3://…>`), which serves the cluster control
plane's applied revision — see
[cluster-config.md](clusters/config.md#serving-from-the-cluster-the-mode-switch).
The two are exclusive per deployment; switching is a restart with a different
flag.
## Binary Deployment
@ -30,21 +27,26 @@ Build or install:
On Windows, the binaries are `omnigraph.exe` and `omnigraph-server.exe`.
Run against a local graph:
The server boots from a cluster only (RFC-011) — there is no positional
`<URI>` / single-graph boot. Point it at a local cluster directory:
```bash
omnigraph-server graph.omni --bind 0.0.0.0:8080
omnigraph-server --cluster ./company-brain --bind 0.0.0.0:8080
```
Run against an object-store-backed graph:
Or boot config-free from an object-storage-rooted cluster:
```bash
OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
AWS_REGION="us-east-1" \
omnigraph-server s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \
omnigraph-server --cluster s3://my-bucket/clusters/company-brain \
--bind 0.0.0.0:8080
```
The server serves every graph in the cluster's applied revision under
`/graphs/{id}/...`. See [clusters](clusters/index.md) for authoring and
applying a cluster.
## Cluster Mode in Containers (AWS, Railway)
A cluster-booted deployment has **two shapes** since the `storage:` root:
@ -80,10 +82,8 @@ docker run -d \
-p 8080:8080 <image>
```
`OMNIGRAPH_CLUSTER` is exclusive: combining it with `OMNIGRAPH_TARGET_URI`,
`OMNIGRAPH_CONFIG`, or `OMNIGRAPH_TARGET` fails fast (exit 64), the same
rule the server itself enforces. The image also ships the `omnigraph` CLI,
so the day-2 loop runs in-container with no `omnigraph.yaml`:
`OMNIGRAPH_CLUSTER` is the server's only boot source. The image also
ships the `omnigraph` CLI, so the day-2 loop runs in-container:
```bash
docker exec -it <container> sh -c \
@ -104,10 +104,10 @@ docker exec -it <container> sh -c \
`omnigraph cluster apply --as <you> --config /var/lib/omnigraph/cluster`
→ force a new deployment (restart).
For a deployment that doesn't need the cluster control plane, the classic
stateless shape — `OMNIGRAPH_TARGET_URI=s3://bucket/graph.omni`, no volume —
remains the simplest AWS architecture (see Binary/Container Deployment
above).
For a stateless, volume-free deployment, root the cluster on object
storage and boot config-free with
`OMNIGRAPH_CLUSTER=s3://bucket/clusters/<name>` (the bucket-no-volume
shape above) — the simplest AWS architecture.
### Railway
@ -181,23 +181,24 @@ Build the image:
docker build -t omnigraph-server:local .
```
Run against a local graph:
The server boots from a cluster only (RFC-011). Run against a cluster
directory on a mounted volume:
```bash
docker run --rm -p 8080:8080 \
-v "$PWD/graph.omni:/data/graph.omni" \
-v "$PWD/company-brain:/var/lib/omnigraph/cluster" \
omnigraph-server:local \
/data/graph.omni --bind 0.0.0.0:8080
--cluster /var/lib/omnigraph/cluster --bind 0.0.0.0:8080
```
Run against an S3-backed graph:
Run config-free against an object-storage-rooted cluster:
```bash
docker run --rm -p 8080:8080 \
-e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
-e AWS_REGION="us-east-1" \
omnigraph-server:local \
s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \
--cluster s3://my-bucket/clusters/company-brain \
--bind 0.0.0.0:8080
```
@ -208,27 +209,14 @@ When no positional args are given, the image entrypoint
| Var | Effect |
|---|---|
| `OMNIGRAPH_TARGET_URI` | Graph URI, passed as the positional argument. |
| `OMNIGRAPH_CONFIG` | Path to an `omnigraph.yaml`, passed as `--config`. Used to supply a `policy.file` (Cedar authorization). The config file and any relative `policy.file` must be mounted into the container. |
| `OMNIGRAPH_TARGET` | Graph name to select from the config's `graphs:` block (with `OMNIGRAPH_CONFIG`, when no `OMNIGRAPH_TARGET_URI`). |
| `OMNIGRAPH_CLUSTER` | Cluster boot source — a config directory or a storage-root URI, forwarded as `--cluster`. The only boot source. |
| `OMNIGRAPH_BIND` | Listen address (default `0.0.0.0:8080`). |
`OMNIGRAPH_TARGET_URI` and `OMNIGRAPH_CONFIG` **compose**: set both to keep the
graph URI in the env var while loading policy from the config file (the
positional URI wins over any `graphs:` entry). To enable Cedar policy on a
container otherwise driven by `OMNIGRAPH_TARGET_URI`, mount the config dir and
add `OMNIGRAPH_CONFIG`:
```bash
docker run --rm -p 8080:8080 \
-e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
-e OMNIGRAPH_TARGET_URI="s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0" \
-e OMNIGRAPH_CONFIG="/etc/omnigraph/omnigraph.yaml" \
-v "$PWD/config:/etc/omnigraph:ro" \
omnigraph-server:local
# /etc/omnigraph/omnigraph.yaml contains `policy: { file: policy.yaml }`;
# policy.yaml (+ optional policy.tests.yaml) sit beside it in the mount.
```
Per-graph and server-level Cedar policy come from the cluster's applied
revision (authored in `cluster.yaml` and published with `cluster apply`),
not from a separate config file. The cluster docker shapes — volume vs.
config-free object-storage root — are detailed under
[Cluster Mode in Containers](#cluster-mode-in-containers-aws-railway) above.
## Auth


@ -1,17 +1,18 @@
# Maintenance: Optimize, Repair & Cleanup
**Addressing.** `optimize`, `repair`, and `cleanup` are **direct** (storage-native) CLI commands: they run with direct storage access against a positional `file://`/`s3://` URI or **`--cluster <dir|s3://…> --cluster-graph <id>`** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `<storage>/graphs/<id>.omni` layout). They never run through a server, and reject `--server` / `--graph` or a remote (`http(s)://`) URI with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command capabilities* section of [cli-reference.md](../cli/reference.md).
**Addressing.** `optimize`, `repair`, and `cleanup` are **direct** (storage-native) CLI commands: they run with direct storage access against a positional `file://`/`s3://` URI or **`--cluster <dir|s3://…> --graph <id>`** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `<storage>/graphs/<id>.omni` layout). They never run through a server, and reject `--server` or a remote (`http(s)://`) URI with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command capabilities* section of [cli-reference.md](../cli/reference.md).
## `optimize` — non-destructive
- Compacts every node + edge table on `main`, then reindexes them, then **publishes the resulting version to the `__manifest`** so the manifest's recorded version tracks the compacted-and-reindexed state. Reads pin the manifest version, so without this publish the work would be invisible to readers *and* would break the version precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually changed.
- Rewrites small fragments into fewer large ones; old fragments remain reachable via older versions until `cleanup` runs.
- **Reindex (index coverage maintenance).** A scalar/FTS/vector index only covers the fragments it was built over. Rows appended after the index was built (e.g. by `load --mode merge`, whose commit does not rebuild an already-existing index) are scanned unindexed, and compaction itself rewrites fragments out of an index's coverage. `optimize` runs Lance's incremental `optimize_indices` after compaction to fold those fragments back in (a delta merge, not a full retrain), restoring full coverage so equality/range/traversal predicates stay index-accelerated. This is why a table with **no compaction work but stale index coverage still commits** a new version under `optimize`. Run `optimize` on a cadence at least as frequent as your freshness window so recently-loaded rows do not linger in the unindexed flat-scan tail.
- **Create declared-but-missing indexes (the index reconciler).** `@index`/`@key` declares intent; `schema apply` records it but builds nothing, and `load`/`mutate` defer a column that cannot be built yet (a `Vector` column with no trainable vectors). `optimize` materializes any such declared-but-unbuilt index over the compacted layout — so it is the convergence path for an `@index` added after data exists, or a vector index whose embeddings arrived via a later `embed`. A column still not buildable (no vectors yet) is reported on the table's stat as `pending_indexes` (visible in `--json`), not treated as a failure; the next `optimize` retries. So `optimize` is the single operator-facing index reconciler: it compacts, restores coverage, **and** builds declared-but-missing indexes.
- Each table's compact→reindex→publish serializes with concurrent mutations on the same table. A crash mid-operation is recovered automatically on the next open (both compaction and reindex are content-preserving, so roll-forward is always safe).
- **Requires a recovered graph.** `optimize` refuses (errors) when a pending crash-recovery operation is present — operating on an unrecovered graph could publish a partial write that recovery would roll back. Reopen the graph to run recovery, then re-run `optimize`.
- **Uncovered drift is skipped, not interpreted.** If a table's underlying version is ahead of the version recorded in `__manifest` and no crash-recovery record covers that movement, `optimize` reports `skipped: DriftNeedsRepair` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift.
- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version`.
- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version, pending_indexes` (the last lists any declared `@index` column the reconciler could not build this run, with the reason — e.g. a vector column with no trainable vectors yet).
- **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: BlobColumnsUnsupportedByLance` (and logged) instead of compacted, and the rest of the sweep proceeds normally. **Reads and writes are unaffected** — only compaction is. Consequence: fragment count and deleted-row space on blob tables are not reclaimed; query results are never affected. A skipped blob table is also **not reindexed** in the same sweep (the skip happens before the reindex step), so its index coverage on appended rows is not refreshed by `optimize` today.
## `repair` — explicit
@ -34,6 +35,7 @@
backstop, so it does as much as it can and converges on re-run. The CLI reports
any failed tables; rerun `cleanup` to retry them.
- CLI guards with `--confirm`; without it, prints a preview line.
- **Non-local consent (RFC-011 D9).** Against a non-local target (an `s3://` store/cluster), `cleanup` additionally requires `--yes` on top of `--confirm`: a TTY is prompted, and a non-interactive run (no TTY, or `--json`) refuses rather than destroying. A local (`file://`) target needs only `--confirm`. The same `--yes` gate applies to overwrite `load` and `branch delete`; every maintenance run echoes its resolved target to stderr (suppress with `--quiet`).
- **Recovery floor:** `--keep < 3` may garbage-collect versions that crash recovery needs as a rollback target. Default `--keep 10` is safe.
- **Orphaned-branch reconciliation:** before the version GC, cleanup reclaims any per-table or commit-graph branch absent from the manifest branch list. These orphans arise when a `branch_delete` flips the manifest authority but a downstream best-effort reclaim does not complete (see [branches-commits.md](../branching/index.md)). The reconciler is idempotent (it no-ops once nothing is orphaned), runs regardless of the `keep_versions` / `older_than` values (those gate version GC only), and never reclaims `main` or system-branch forks. Reclaimed forks are logged.


@ -20,7 +20,7 @@ Server-scoped action (v0.6.0+; binds to `Omnigraph::Server::"root"`):
10. `graph_list`: `GET /graphs` registry enumeration (multi-graph mode)
Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` are reserved but not shipped in v0.6.0; operators add/remove graphs by editing `omnigraph.yaml` and restarting.)
Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` over HTTP are reserved but not shipped; operators add/remove graphs by editing the cluster's `cluster.yaml`, running `omnigraph cluster apply`, and restarting the server.)
## Scope kinds
@ -28,38 +28,34 @@ Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — the
- `target_branch_scope` — applied to destination (`schema_apply`, branch ops, run ops)
- `protected_branches` — named list with special rules; rule scopes are `any | protected | unprotected`
## Per-graph vs. server-level policy (multi-graph mode)
## Per-graph vs. server-level policy
In multi mode (`omnigraph.yaml` with a non-empty `graphs:` map), policy files attach at two levels:
A server boots from a cluster (`--cluster <dir>`), and the cluster's
`cluster.yaml` declares its policy bundles in a `policies:` section. Each bundle
names the scopes it `applies_to`: a graph id (per-graph rules — `read`, `change`,
`branch_*`, `schema_apply`) or the literal `cluster` (server-level rules —
`graph_list`).
```yaml
server:
policy:
file: server-policy.yaml # server-level: graph_list
graphs:
# cluster.yaml
policies:
base:
file: base.policy.yaml
applies_to: [cluster, knowledge] # cluster-level + the `knowledge` graph
alpha:
uri: s3://tenant-bucket/alpha
policy:
file: policies/alpha.yaml # per-graph: read, change, branch_*, schema_apply
beta:
uri: s3://tenant-bucket/beta
# no per-graph policy → no engine-layer Cedar enforcement on beta
file: policies/alpha.yaml
applies_to: [alpha] # per-graph: alpha only
```
**Config follows graph identity, not server mode.** A graph served by **name**
(`--target <name>` or `server.graph`) uses its own `graphs.<name>.policy.file`,
exactly as in multi-graph mode. Top-level `policy.file` applies only to an
**anonymous** graph — one served by a bare `<URI>` with no `graphs:` entry.
Serving a **named** graph (single- or multi-graph mode) while top-level
`policy.file` (or `queries:`) is populated **refuses boot**, naming the block,
since the top-level value would otherwise be silently shadowed by the per-graph
block. Move per-graph rules to `graphs.<graph_id>.policy.file` and `graph_list`
rules to `server.policy.file`.
A graph with no bundle bound to it has no engine-layer Cedar enforcement. Each
graph's HTTP request flows through its bound bundle; the management endpoint
(`GET /graphs`) flows through the `cluster`-scoped bundle. When no bundle binds
`cluster`, `GET /graphs` is denied in every runtime state, including
`--unauthenticated`; with bearer tokens configured it returns 403 after admission
control because `graph_list` is not a `read`-equivalent action. The operator must
bind a `cluster`-scoped bundle granting `graph_list` to expose `/graphs`.
Each graph's HTTP request flows through its own per-graph policy. The management endpoint (`GET /graphs`) flows through the server-level policy. When `server.policy.file` is unset, `GET /graphs` is denied in every runtime state, including `--unauthenticated`; with bearer tokens configured, it returns 403 after admission control because `graph_list` is not a `read`-equivalent action. The operator must explicitly authorize via `server-policy.yaml` to expose `/graphs`.
Example server-level policy:
Example `cluster`-scoped bundle:
```yaml
version: 1
@ -72,38 +68,26 @@ rules:
actions: [graph_list]
```
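To make the `applies_to` binding concrete, here is a small Python sketch that looks up which bundles bind a given scope. It mirrors the shape of the `policies:` section in `cluster.yaml`. The helper name is hypothetical, and how the server combines several bundles bound to one scope is not modelled.

```python
# Hypothetical sketch: which policy bundles bind to a scope in cluster.yaml.
def bundles_for(policies: dict, scope: str) -> list:
    """`scope` is a graph id, or the literal "cluster" for server-level rules."""
    return sorted(
        name for name, bundle in policies.items()
        if scope in bundle.get("applies_to", [])
    )


# Same shape as the `policies:` section, already parsed from YAML.
policies = {
    "base": {"file": "base.policy.yaml", "applies_to": ["cluster", "knowledge"]},
    "alpha": {"file": "policies/alpha.yaml", "applies_to": ["alpha"]},
}
```

An empty result for a graph id means no engine-layer Cedar enforcement on that graph. An empty result for `cluster` means `GET /graphs` stays denied.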
## Configuration
Each per-graph rule may use at most one of `branch_scope` or
`target_branch_scope`. Server-scoped rules (`graph_list`) take neither — they
have no branch context.
`omnigraph.yaml`:
## Actor for direct-engine writes
```yaml
policy:
file: policy.yaml # Cedar rules + groups
tests: policy.tests.yaml # declarative test cases
cli:
actor: act-andrew # default actor for CLI direct-engine writes
```
Each per-graph rule may use at most one of `branch_scope` or `target_branch_scope`. Server-scoped rules (`graph_list`) take neither — they have no branch context.
`cli.actor` is the default actor identity for CLI direct-engine writes
when `policy.file` is configured. Override per-invocation with `--as
<ACTOR>` (top-level flag) — `--as` wins, otherwise `cli.actor` is used,
otherwise no actor. With policy configured and neither set, the
engine-layer footgun guard intentionally denies the write (silent bypass
via "I forgot the actor" is exactly what the guard prevents). Remote
HTTP writes ignore both — they resolve their actor server-side from the
bearer token.
The default actor identity for CLI direct-engine (`--store`) writes is
`operator.actor` in `~/.omnigraph/config.yaml`. Override per-invocation with
`--as <ACTOR>`: `--as` wins, otherwise `operator.actor`, otherwise no actor.
Remote HTTP writes ignore both — they resolve their actor server-side from the
bearer token. (Direct-store access carries no Cedar policy under RFC-011; policy
lives in the cluster/server.)
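The resolution order for direct-engine writes can be sketched in a few lines of Python. The function name is hypothetical, and remote HTTP writes are out of scope because their actor is resolved server-side.

```python
# Hypothetical sketch of actor resolution for direct-engine (--store) writes.
from typing import Optional


def resolve_actor(as_flag: Optional[str], operator_actor: Optional[str]) -> Optional[str]:
    """--as wins, then operator.actor from the operator config, else no actor."""
    if as_flag:
        return as_flag
    return operator_actor or None
```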
## CLI
Policy tooling resolves its graph like server single-mode policy: `cli.graph`
wins, otherwise `server.graph` is used, otherwise the top-level `policy.file`
is validated/tested/explained as the anonymous policy.
Policy tooling reads a cluster's applied policy bundles: pass `--cluster <dir>`,
and `--graph <id>` to pick a graph's bundle when several apply.
- `omnigraph policy validate` — parse + count actors, exit 1 on parse error.
- `omnigraph policy test` — run cases in `policy.tests.yaml`, exit 1 on any expectation mismatch.
- `omnigraph policy test --tests <file>` — run the declarative cases in `<file>` against the selected bundle, exit 1 on any expectation mismatch.
- `omnigraph policy explain --actor … --action … [--branch …] [--target-branch …]` — show decision and matched rule.
- `omnigraph --as <ACTOR> <subcommand>` — set the actor for the duration of one invocation. Effective for `change`, `load` (and its deprecated `ingest` alias), `branch create|delete|merge`, and `schema apply` against a direct (`--store`) graph. **Rejected** on a served write (`--server`): the actor is bearer-token-resolved server-side, so `--as` can't set it there.
@ -132,7 +116,7 @@ reaches the authorization gate without a matching policy permit.
|---|---|---|---|
| **Open** | no | no | Every request is permitted. Refuses to start unless `--unauthenticated` or `OMNIGRAPH_UNAUTHENTICATED=1` is set — the operator must explicitly opt in. |
| **DefaultDeny** | yes | no | Every authenticated request for an action other than `read` is rejected with HTTP 403. Closes the "tokens but forgot the policy file" trap — an operator who sets up auth and forgot to point at a policy file used to ship the illusion of protection. |
| **PolicyEnabled** | yes | yes | Authenticated requests that reach a configured policy engine are evaluated by Cedar. Server-scoped actions still require `server.policy.file`. |
| **PolicyEnabled** | yes | yes | Authenticated requests that reach a configured policy engine are evaluated by Cedar. Server-scoped actions still require a `cluster`-scoped policy bundle. |
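A minimal Python sketch of how these states could be selected at startup. It assumes the two yes/no columns mean "bearer tokens configured" and "policy configured", which is inferred from the row descriptions. The names are hypothetical, and the combination of tokens with `--unauthenticated` is not modelled.

```python
# Hypothetical sketch of startup-state selection; column meanings are inferred.
class StartupRefused(Exception):
    """The server refuses to boot rather than ship an unintended posture."""


def runtime_state(tokens: bool, policy: bool, unauthenticated: bool = False) -> str:
    if policy and not tokens:
        raise StartupRefused("policy configured but no bearer tokens")
    if not tokens:
        if not unauthenticated:
            raise StartupRefused("no tokens, no policy: opt in with --unauthenticated")
        return "Open"
    return "PolicyEnabled" if policy else "DefaultDeny"


def default_deny_allows(action: str) -> bool:
    """In DefaultDeny only `read` passes; any other action gets HTTP 403."""
    return action == "read"
```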
The server refuses to start for the "no tokens, no policy, no flag" cell
and for "policy file, no tokens" — instead of silently shipping an open

View file

@ -1,38 +1,29 @@
# HTTP Server (`omnigraph-server`)
Axum 0.8 + tokio + utoipa-generated OpenAPI. **Two modes** (v0.6.0+): single-graph and multi-graph, with **two boot sources** for multi mode: `omnigraph.yaml` or — exclusively — a cluster directory (`--cluster`). Mode is inferred from CLI args + config shape.
Axum 0.8 + tokio + utoipa-generated OpenAPI. **Cluster-only boot** (RFC-011): the server always boots from a cluster (`--cluster <dir | s3://…>`) and serves N graphs (N ≥ 1) under cluster routes. There is no longer a single-graph flat-route mode, no positional `<URI>` boot, no `--target`, and no `omnigraph.yaml`-`graphs:`-map boot. All HTTP is nested under `/graphs/{graph_id}/...`; `/healthz` and the management `/graphs` enumeration stay flat.
## Modes
## Boot
### Single-graph mode
### Cluster boot (the only boot)
`omnigraph-server <URI>` or `omnigraph-server --target <name> --config omnigraph.yaml`. Routes are flat — `/snapshot`, `/read`, `/branches`, etc.
```bash
omnigraph-server --cluster <dir | s3://…> --bind 0.0.0.0:8080
```
**Config follows graph identity.** A bare `<URI>` is an *anonymous* graph and uses the **top-level** `policy.file` / `queries:`. A graph chosen by **name** (`--target` / `server.graph`) uses its own `graphs.<name>.{policy.file, queries}` — the same block multi-graph mode uses. ⚠️ *Changed from v0.6.0, which always used top-level config in single mode: a named-graph config that puts `policy`/`queries` at top-level now **refuses boot** and points you at `graphs.<name>.…` (move the block there). Bare-`<URI>` single mode is unchanged.*
### Multi-graph mode (v0.6.0+)
`omnigraph-server --config omnigraph.yaml` with a non-empty `graphs:` map and **no** single-mode selector (no `server.graph`, no `<URI>`, no `--target`). The server opens every configured graph in parallel at startup (bounded concurrency = 4, fail-fast on the first open error). Routes are nested under `/graphs/{graph_id}/...`. Bare flat paths return 404 in multi mode.
### Cluster-booted multi mode
`omnigraph-server --cluster <dir-or-uri>` boots from the cluster catalog's **applied
revision** instead of
`omnigraph.yaml` — an exclusive boot source: combining it with `<URI>`,
`--target`, or `--config` is a startup error, and `omnigraph.yaml` is never
read in this mode. Always multi-graph routing. See
`omnigraph-server --cluster <dir-or-uri>` boots from the cluster catalog's
**applied revision**. The server resolves that revision into per-graph
startup configs (id, URI, optional per-graph policy, stored-query
registry) plus an optional server-level policy, then opens every
configured graph in parallel at startup (bounded concurrency = 4,
fail-fast on the first open error). Routing is always multi-graph —
requests to bare flat protected paths (`/read`, `/snapshot`, …) return
404; the served surface is `/graphs/{graph_id}/...`. See
[cluster-config.md](../clusters/config.md#serving-from-the-cluster-the-mode-switch)
for what is read and the fail-fast readiness rules. `--bind`,
`--unauthenticated`, and the bearer-token env vars work identically.
for what is read and the fail-fast readiness rules.
Mode inference:
0. CLI `--cluster <dir | s3://…>` → **multi, cluster-booted** (exclusive; a scheme-qualified argument reads the ledger straight from the storage root, no local config)
1. CLI positional `<URI>` → single
2. CLI `--target <name>` → single
3. `server.graph` in config → single
4. `--config` + non-empty `graphs:` + no single-mode selector → **multi**
5. otherwise → error with migration hint
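
The precedence in the list above can be read as a short decision function. The sketch below is illustrative only: `infer_mode` and its argument names are hypothetical, `config` stands for an already-parsed `omnigraph.yaml` (or `None` when `--config` is not given), and the error messages are placeholders rather than the server's actual text.

```python
def infer_mode(cluster=None, uri=None, target=None, config=None):
    """Return "multi-cluster", "single", or "multi" following the rule order above."""
    # Rule 0: --cluster is an exclusive boot source.
    if cluster is not None:
        if uri is not None or target is not None or config is not None:
            raise ValueError("--cluster cannot be combined with <URI>, --target, or --config")
        return "multi-cluster"
    # Rules 1-2: an explicit CLI selector means single mode.
    if uri is not None or target is not None:
        return "single"
    cfg = config or {}
    # Rule 3: server.graph in the config selects single mode.
    if (cfg.get("server") or {}).get("graph"):
        return "single"
    # Rule 4: --config with a non-empty graphs: map and no selector means multi mode.
    if config is not None and cfg.get("graphs"):
        return "multi"
    # Rule 5: nothing matched.
    raise ValueError("cannot infer serving mode (placeholder for the migration hint)")
```

For example, `infer_mode(config={"graphs": {"a": {}}})` yields `"multi"`, while adding `target="a"` to the same call yields `"single"` because the CLI selector is checked first.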
A scheme-qualified argument (`s3://…`) reads the ledger straight from the
storage root, with no local config directory. `--bind`,
`--unauthenticated`, and the bearer-token env vars all apply.
### Stored-query validation at startup
@ -40,36 +31,37 @@ If a graph declares a `queries:` registry (see [cli-reference](../cli/reference.
## Endpoint inventory
Per-graph endpoints — same body shape across modes; URLs differ:
| Method | Single-mode path | Multi-mode path | Auth | Action |
|---|---|---|---|---|
| GET | `/healthz` | `/healthz` | none | — |
| GET | `/openapi.json` | `/openapi.json` | none | — (strips security if auth disabled; in multi mode emits cluster paths with `cluster_` operation-id prefix) |
| GET | `/snapshot?branch=` | `/graphs/{id}/snapshot?branch=` | bearer + `read` | snapshot of branch |
| POST | `/query` | `/graphs/{id}/query` | bearer + `read` | inline read query (canonical; clean field names `query`/`name`; mutations → 400) |
| POST | `/read` | `/graphs/{id}/read` | bearer + `read` | **deprecated** alias of `/query` (legacy field names `query_source`/`query_name`, byte-stable response; carries `Deprecation: true` + `Link: </query>; rel="successor-version"`) |
| POST | `/export` | `/graphs/{id}/export` | bearer + `export` | NDJSON stream |
| POST | `/mutate` | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) |
| POST | `/change` | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: </mutate>; rel="successor-version"`) |
| GET | `/queries` | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog |
| POST | `/queries/{name}` | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 |
| GET | `/schema` | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source |
| POST | `/schema/apply` | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | migrate |
| POST | `/load` | `/graphs/{id}/load` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | bulk load (canonical); branch creation is opt-in via `from` — without it a missing `branch` is a 404, never an implicit fork (32 MB body limit) |
| POST | `/ingest` | `/graphs/{id}/ingest` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | **deprecated** alias of `/load` (carries `Deprecation: true` + `Link: </load>; rel="successor-version"`) (32 MB body limit) |
| GET | `/branches` | `/graphs/{id}/branches` | bearer + `read` | list branches |
| POST | `/branches` | `/graphs/{id}/branches` | bearer + `branch_create` | create |
| DELETE | `/branches/{branch}` | `/graphs/{id}/branches/{branch}` | bearer + `branch_delete` | delete |
| POST | `/branches/merge` | `/graphs/{id}/branches/merge` | bearer + `branch_merge` | merge `source → target` |
| GET | `/commits?branch=` | `/graphs/{id}/commits?branch=` | bearer + `read` | list |
| GET | `/commits/{commit_id}` | `/graphs/{id}/commits/{commit_id}` | bearer + `read` | show |
Server-level management endpoints (v0.6.0+):
Per-graph endpoints — all nested under `/graphs/{id}/...`. `{id}` is the
graph id from the cluster's applied revision:
| Method | Path | Auth | Action |
|---|---|---|---|
| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs (405 in single mode) |
| GET | `/healthz` | none | — |
| GET | `/openapi.json` | none | — (strips security if auth disabled; emits the nested cluster paths with `cluster_` operation-id prefix) |
| GET | `/graphs/{id}/snapshot?branch=` | bearer + `read` | snapshot of branch |
| POST | `/graphs/{id}/query` | bearer + `read` | inline read query (canonical; clean field names `query`/`name`; mutations → 400) |
| POST | `/graphs/{id}/read` | bearer + `read` | **deprecated** alias of `/query` (legacy field names `query_source`/`query_name`, byte-stable response; carries `Deprecation: true` + `Link: <query>; rel="successor-version"`) |
| POST | `/graphs/{id}/export` | bearer + `export` | NDJSON stream |
| POST | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) |
| POST | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: <mutate>; rel="successor-version"`) |
| GET | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog |
| POST | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 |
| GET | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source |
| POST | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | disabled for cluster-backed serving; returns 409 and points operators at `omnigraph cluster apply` + restart |
| POST | `/graphs/{id}/load` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | bulk load (canonical); branch creation is opt-in via `from` — without it a missing `branch` is a 404, never an implicit fork (32 MB body limit) |
| POST | `/graphs/{id}/ingest` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | **deprecated** alias of `/load` (carries `Deprecation: true` + `Link: <load>; rel="successor-version"`) (32 MB body limit) |
| GET | `/graphs/{id}/branches` | bearer + `read` | list branches |
| POST | `/graphs/{id}/branches` | bearer + `branch_create` | create |
| DELETE | `/graphs/{id}/branches/{branch}` | bearer + `branch_delete` | delete |
| POST | `/graphs/{id}/branches/merge` | bearer + `branch_merge` | merge `source → target` |
| GET | `/graphs/{id}/commits?branch=` | bearer + `read` | list |
| GET | `/graphs/{id}/commits/{commit_id}` | bearer + `read` | show |
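
As a rough illustration of calling one of the nested per-graph routes in the table above, the sketch below builds (but does not send) a `POST /graphs/{id}/query` request. The `query` and `name` body fields come from the table; everything else is an assumption: the helper name `build_query_request`, the base URL, a JSON body, the `Authorization: Bearer` header form, and treating `name` as optional.

```python
import json
import urllib.request
from urllib.parse import quote


def build_query_request(base_url, graph_id, token, query, name=None):
    """Build a POST request for the canonical inline read route (hypothetical helper)."""
    body = {"query": query}  # GQ source text
    if name is not None:
        body["name"] = name  # which query in the source to run (assumed optional)
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/graphs/{quote(graph_id, safe='')}/query",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumed bearer header form
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it. Building the request separately keeps the URL nesting and the field names easy to inspect before anything touches a live server.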
Server-level management endpoints:
| Method | Path | Auth | Action |
|---|---|---|---|
| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs |
### Stored-query catalog (`GET /queries`)
@ -88,13 +80,14 @@ Invoke a curated, server-side stored query by **name** — the source comes from
- **Requires an explicit policy grant when auth is on.** In default-deny mode (bearer tokens but no `policy.file`), only `read` is permitted, so *every* `/queries/{name}` call returns `404` until an `invoke_query` rule is configured.
- A stored mutation cannot target a `snapshot` (`400`); a parameter type error is a structured `400` naming the parameter.
## Adding and removing graphs (multi mode)
## Adding and removing graphs
Runtime add/remove via API is **not** exposed in v0.6.0 — neither
`POST /graphs` nor `DELETE /graphs/{id}` is implemented. Operators add
or remove graphs by stopping the server, editing the `graphs:` map in
`omnigraph.yaml`, then restarting. The server treats `omnigraph.yaml`
as operator-owned configuration and never writes it.
Runtime add/remove via API is **not** exposed — neither `POST /graphs`
nor `DELETE /graphs/{id}` is implemented. Operators add or remove graphs
by running `cluster apply` against the cluster (which publishes a new
applied revision) and restarting the server so it boots from the new
revision. The server treats the cluster source as operator-owned and
never writes it.
A future release may introduce a managed registry and re-expose runtime
mutation on top of it.
@ -138,8 +131,8 @@ channels:
- **Response headers (RFC 9745)**: every response carries `Deprecation: true`.
- **Response headers (RFC 8288)**: every response carries a `Link` header
pointing at the canonical successor:
`Link: </query>; rel="successor-version"` for `/read`, and
`Link: </mutate>; rel="successor-version"` for `/change`. SDKs and HTTP
`Link: <query>; rel="successor-version"` for `/read`, and
`Link: <mutate>; rel="successor-version"` for `/change`. SDKs and HTTP
proxies can pick the successor up automatically.
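
The difference between the two `Link` forms shown above matters once routes are nested. A client that resolves the link target against the request URL (standard relative-reference resolution) gets a different successor for `<query>` than for `</query>`. The sketch below shows this with Python's standard library; `successor_url` is a hypothetical helper and the host, port, and graph id are made up.

```python
import re
from urllib.parse import urljoin

# Matches the target of a Link header whose relation is successor-version.
_SUCCESSOR = re.compile(r'<([^>]*)>\s*;\s*rel="successor-version"')


def successor_url(request_url, link_header):
    """Resolve the successor-version target of a Link header against the request URL."""
    match = _SUCCESSOR.search(link_header)
    if match is None:
        return None
    return urljoin(request_url, match.group(1))


nested = "http://localhost:8080/graphs/g1/read"
print(successor_url(nested, '<query>; rel="successor-version"'))
# http://localhost:8080/graphs/g1/query
print(successor_url(nested, '</query>; rel="successor-version"'))
# http://localhost:8080/query
```

The relative form replaces only the last path segment, so it stays inside `/graphs/{id}/...`. The absolute-path form points at the flat route.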
Migration is purely cosmetic on the client side — swap the URL path, leave
@ -226,4 +219,4 @@ See [deployment.md](../deployment.md) for token-source operational details.
admission control" above). No global rate limiter is configured;
add `tower_http::limit` if a graph-wide cap is needed.
- Pagination — none (commits/branches return everything; export streams).
- Runtime graph add/remove — edit `omnigraph.yaml` and restart.
- Runtime graph add/remove — run `cluster apply` and restart.

View file

@ -27,7 +27,8 @@ list/`Blob` columns → none.
## L2 — OmniGraph orchestration
- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted indexes for the current head; safe to re-run.
- **`@index`/`@key` declares intent; the physical index is derived state.** A migration records the declaration in the catalog/IR and never fails on it — `schema apply` builds **no** indexes (adding an `@index` to an existing column is a pure metadata change that touches no table data). `load`/`mutate` build declared indexes inline as part of the write, but a column that can't be built yet (a `Vector` column with no trainable vectors — IVF k-means needs ≥1 vector, e.g. rows loaded before `embed` runs) is left **pending**, not fatal. Reads stay correct meanwhile: a missing/partial index degrades to a scan (vector search to brute-force). A later `ensure_indices`/`optimize` materializes the pending index once it is buildable. This mirrors how LanceDB builds indexes asynchronously and serves unindexed rows by brute-force.
- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted + vector indexes for the current head; safe to re-run; returns the columns it had to defer as pending. `optimize` runs it after compaction, so the maintenance cron is the convergence path for deferred indexes.
- Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state.
- **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork.
- Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column.