Merge remote-tracking branch 'origin/main' into ragnorc/shaping-config-integration

# Conflicts:
#	crates/omnigraph-cluster/src/lib.rs
#	crates/omnigraph-cluster/src/serve.rs
#	crates/omnigraph-server/src/lib.rs
#	crates/omnigraph-server/src/settings.rs
#	docs/user/clusters/config.md
aaltshuler 2026-06-16 04:13:00 +03:00
commit 4f8c71fa23
75 changed files with 6557 additions and 6879 deletions


@ -33,8 +33,8 @@ OmniGraph is a typed property-graph engine built as a coordination layer over ma
- **Multi-modal querying**: vector ANN (`nearest`), full-text (`search`/`fuzzy`/`match_text`/`bm25`), Reciprocal Rank Fusion (`rrf`), and graph traversal (`Expand`, anti-join `not { … }`) in one runtime. - **Multi-modal querying**: vector ANN (`nearest`), full-text (`search`/`fuzzy`/`match_text`/`bm25`), Reciprocal Rank Fusion (`rrf`), and graph traversal (`Expand`, anti-join `not { … }`) in one runtime.
- **Branches and commits across the whole graph**: Git-style — every successful publish appends to a commit DAG; merges are three-way at the row level. - **Branches and commits across the whole graph**: Git-style — every successful publish appends to a commit DAG; merges are three-way at the row level.
- **Atomic per-query writes**: `mutate_as` and `load` accumulate insert/update batches into an in-memory `MutationStaging.pending` per touched table; one `stage_*` + `commit_staged` per table runs at end-of-query, then `ManifestBatchPublisher::publish` commits the manifest atomically with per-table `expected_table_versions` CAS. A mid-query failure leaves Lance HEAD untouched on staged tables — no drift, no run state machine, no staging branches. Deletes still inline-commit; D₂ at parse time prevents inserts/updates and deletes from coexisting in one query. - **Atomic per-query writes**: `mutate_as` and `load` accumulate insert/update batches into an in-memory `MutationStaging.pending` per touched table; one `stage_*` + `commit_staged` per table runs at end-of-query, then `ManifestBatchPublisher::publish` commits the manifest atomically with per-table `expected_table_versions` CAS. A mid-query failure leaves Lance HEAD untouched on staged tables — no drift, no run state machine, no staging branches. Deletes still inline-commit; D₂ at parse time prevents inserts/updates and deletes from coexisting in one query.
- **HTTP server**: Axum + utoipa OpenAPI, bearer auth (SHA-256 hashed, optional AWS Secrets Manager). Cedar policy enforcement is engine-wide — every `_as` writer calls `Omnigraph::enforce(action, scope, actor)`, so HTTP, CLI, and embedded SDK consumers all hit the same gate. **Two modes** (v0.6.0+): single-graph (legacy flat routes) and multi-graph (`/graphs/{graph_id}/...` cluster routes + read-only `GET /graphs` enumeration). Per-graph + server-level Cedar policies. Multi-graph mode boots from a cluster directory (`--cluster <dir | s3://…>`, RFC-005) or the legacy `omnigraph.yaml` `graphs:` map. Runtime add/remove (`POST /graphs`, `DELETE /graphs/{id}`) is not exposed — operators run `cluster apply` (or edit the legacy file) and restart. - **HTTP server**: Axum + utoipa OpenAPI, bearer auth (SHA-256 hashed, optional AWS Secrets Manager). Cedar policy enforcement is engine-wide — every `_as` writer calls `Omnigraph::enforce(action, scope, actor)`, so HTTP, CLI, and embedded SDK consumers all hit the same gate. **Cluster-only boot** (RFC-011): the server always boots from a cluster directory (`--cluster <dir | s3://…>`, RFC-005) and serves N graphs (N ≥ 1) under multi-graph routes (`/graphs/{graph_id}/...` + read-only `GET /graphs` enumeration); there are no single-graph flat routes and no positional-URI boot. Per-graph + server-level Cedar policies. Runtime add/remove (`POST /graphs`, `DELETE /graphs/{id}`) is not exposed — operators run `cluster apply` and restart.
- **CLI** with two-surface config (RFC-008): the team-owned cluster directory (`cluster.yaml`) plus the per-operator `~/.omnigraph/config.yaml` (servers, credentials, actor, aliases). The legacy combined `omnigraph.yaml` still loads with per-key deprecation warnings — `config migrate` proposes the split, `OMNIGRAPH_NO_LEGACY_CONFIG=1` enforces strict mode. **Never extend `omnigraph.yaml`.** Multi-format output (json/jsonl/csv/kv/table). - **CLI** with two-surface config (RFC-007/008): the team-owned cluster directory (`cluster.yaml`) plus the per-operator `~/.omnigraph/config.yaml` (servers, clusters, credentials, actor, profiles, aliases, defaults). Graphs are addressed via `--store`/`--server`/`--cluster`/`--profile`/operator defaults (RFC-011). Multi-format output (json/jsonl/csv/kv/table).
Throughout the docs, capabilities are split into **L1 — Inherited from Lance** vs **L2 — Added by OmniGraph**. Throughout the docs, capabilities are split into **L1 — Inherited from Lance** vs **L2 — Added by OmniGraph**.
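The atomic per-query write bullet above describes a publish step guarded by per-table `expected_table_versions` compare-and-swap. A minimal in-memory sketch of that check-then-apply shape follows. The `Manifest` and `PublishError` types and the `publish` signature are hypothetical stand-ins for illustration, not the actual `ManifestBatchPublisher` API, and real storage-level atomicity is not modeled here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the manifest: table name -> committed version.
#[derive(Debug, Default)]
pub struct Manifest {
    versions: HashMap<String, u64>,
}

#[derive(Debug, PartialEq)]
pub enum PublishError {
    /// Another writer advanced `table` after this query captured its snapshot.
    VersionConflict { table: String, expected: u64, actual: u64 },
}

impl Manifest {
    /// Publish every staged table version, or none of them.
    /// `staged` maps table -> (expected current version, new version).
    pub fn publish(
        &mut self,
        staged: &HashMap<String, (u64, u64)>,
    ) -> Result<(), PublishError> {
        // Phase 1: verify every expectation before mutating anything.
        for (table, (expected, _)) in staged {
            let actual = self.version(table);
            if actual != *expected {
                return Err(PublishError::VersionConflict {
                    table: table.clone(),
                    expected: *expected,
                    actual,
                });
            }
        }
        // Phase 2: all checks passed, so apply every update.
        for (table, (_, new_version)) in staged {
            self.versions.insert(table.clone(), *new_version);
        }
        Ok(())
    }

    /// Current committed version of `table` (0 if never published).
    pub fn version(&self, table: &str) -> u64 {
        self.versions.get(table).copied().unwrap_or(0)
    }
}
```

Because the version checks all run before any update is applied, a stale expectation on one table leaves every other table's version untouched, which is the property the bullet relies on.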
@ -96,7 +96,7 @@ Full diagram and concurrency model: [docs/dev/architecture.md](docs/dev/architec
| Cedar policy actions, scopes, CLI | [docs/user/operations/policy.md](docs/user/operations/policy.md) | | Cedar policy actions, scopes, CLI | [docs/user/operations/policy.md](docs/user/operations/policy.md) |
| HTTP server endpoints, auth, error model, body limits | [docs/user/operations/server.md](docs/user/operations/server.md) | | HTTP server endpoints, auth, error model, body limits | [docs/user/operations/server.md](docs/user/operations/server.md) |
| CLI quick-start | [docs/user/cli/index.md](docs/user/cli/index.md) | | CLI quick-start | [docs/user/cli/index.md](docs/user/cli/index.md) |
| CLI command surface and config schemas (`~/.omnigraph/config.yaml`, legacy `omnigraph.yaml`) | [docs/user/cli/reference.md](docs/user/cli/reference.md) | | CLI command surface and config schema (`~/.omnigraph/config.yaml`) | [docs/user/cli/reference.md](docs/user/cli/reference.md) |
| Audit / actor tracking | [docs/user/operations/audit.md](docs/user/operations/audit.md) | | Audit / actor tracking | [docs/user/operations/audit.md](docs/user/operations/audit.md) |
| Error taxonomy and result serialization | [docs/user/operations/errors.md](docs/user/operations/errors.md) | | Error taxonomy and result serialization | [docs/user/operations/errors.md](docs/user/operations/errors.md) |
| Install (binary / Homebrew / source / channels) | [docs/user/install.md](docs/user/install.md) | | Install (binary / Homebrew / source / channels) | [docs/user/install.md](docs/user/install.md) |
@ -144,6 +144,7 @@ These are architectural rules that need to be in scope on every change. They're
4. **Bearer-token plaintext never persists in process memory.** Tokens are hashed at startup; auth uses constant-time comparison; the actor id is server-resolved from the hash match and must not be settable by the client. 4. **Bearer-token plaintext never persists in process memory.** Tokens are hashed at startup; auth uses constant-time comparison; the actor id is server-resolved from the hash match and must not be settable by the client.
5. **Reads always see the current index state for the branch they're reading.** Indexes track the branch head, not historical snapshots. If you change index lifecycle, preserve this guarantee. 5. **Reads always see the current index state for the branch they're reading.** Indexes track the branch head, not historical snapshots. If you change index lifecycle, preserve this guarantee.
6. **Stable type IDs survive renames.** Schema migration relies on identity that's stable across rename — don't mint new IDs on rename. 6. **Stable type IDs survive renames.** Schema migration relies on identity that's stable across rename — don't mint new IDs on rename.
7. **Logical contract over physical state.** Physical state (index coverage, fragment layout, compaction versions, staged writes) is derived and rebuildable; it must never fail a logical operation. Check preconditions against logical state and let reconciliation converge the physical state idempotently; genuine logical conflicts still fail loudly. This is the rule that rules 1–6 instantiate; full statement and applications in [docs/dev/invariants.md](docs/dev/invariants.md).
### Deny-list (fast-pass review filter — full reasoning in [docs/dev/invariants.md](docs/dev/invariants.md)) ### Deny-list (fast-pass review filter — full reasoning in [docs/dev/invariants.md](docs/dev/invariants.md))
@ -179,7 +180,7 @@ Rust stable workspace (edition 2024). `protoc` is a build dependency (`brew inst
cargo build --workspace --locked # build everything cargo build --workspace --locked # build everything
cargo test --workspace --locked # the canonical CI gate (matches CI exactly) cargo test --workspace --locked # the canonical CI gate (matches CI exactly)
cargo run -p omnigraph-cli -- <args> # run the `omnigraph` CLI from source cargo run -p omnigraph-cli -- <args> # run the `omnigraph` CLI from source
cargo run -p omnigraph-server -- <uri> --bind 0.0.0.0:8080 # run the server from source cargo run -p omnigraph-server -- --cluster <dir|s3://...> --bind 0.0.0.0:8080 # run the server from source
# Run one crate / one test file / one test fn # Run one crate / one test file / one test fn
cargo test -p omnigraph-engine --test traversal # one integration-test file (see docs/dev/testing.md) cargo test -p omnigraph-engine --test traversal # one integration-test file (see docs/dev/testing.md)
@ -231,10 +232,10 @@ omnigraph cleanup --keep 10 --older-than 7d --confirm s3://my-bucket/graph.omni
# Stand up the HTTP server (token from env) # Stand up the HTTP server (token from env)
OMNIGRAPH_SERVER_BEARER_TOKEN=xxxx \ OMNIGRAPH_SERVER_BEARER_TOKEN=xxxx \
omnigraph-server s3://my-bucket/graph.omni --bind 0.0.0.0:8080 omnigraph-server --cluster s3://my-bucket/cluster --bind 0.0.0.0:8080
# Cedar policy explain # Cedar policy explain
omnigraph policy explain --actor act-alice --action change --branch main omnigraph policy explain --cluster ./company-brain --graph knowledge --actor act-alice --action change --branch main
``` ```
--- ---
@ -250,7 +251,7 @@ omnigraph policy explain --actor act-alice --action change --branch main
| Compaction (`compact_files`) + reindex (`optimize_indices`) | ✅ | `omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency; per table runs `compact_files` **then Lance `optimize_indices`** (folds appended/rewritten fragments back into existing indexes — incremental merge, not retrain) and **publishes the resulting version to `__manifest`** (so the manifest tracks the Lance HEAD — required for reads to observe the work and for schema apply / strict writes to pass their HEAD-vs-manifest precondition), under the per-`(table, main)` write queue with `SidecarKind::Optimize` recovery coverage spanning both ops; **commits even with no compaction work if index coverage is stale**; **refuses on an unrecovered graph**; **skips uncovered HEAD > manifest drift** with `DriftNeedsRepair`; **skips blob-bearing tables** (reported via `TableOptimizeStats.skipped`, not silent; reindex is skipped for them too today), gated on `LANCE_SUPPORTS_BLOB_COMPACTION` until the upstream blob-v2 compaction-decode bug is fixed (see [docs/dev/invariants.md](docs/dev/invariants.md) Known Gaps) | | Compaction (`compact_files`) + reindex (`optimize_indices`) | ✅ | `omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency; per table runs `compact_files` **then Lance `optimize_indices`** (folds appended/rewritten fragments back into existing indexes — incremental merge, not retrain) and **publishes the resulting version to `__manifest`** (so the manifest tracks the Lance HEAD — required for reads to observe the work and for schema apply / strict writes to pass their HEAD-vs-manifest precondition), under the per-`(table, main)` write queue with `SidecarKind::Optimize` recovery coverage spanning both ops; **commits even with no compaction work if index coverage is stale**; **refuses on an unrecovered graph**; **skips uncovered HEAD > manifest drift** with `DriftNeedsRepair`; **skips blob-bearing tables** (reported via `TableOptimizeStats.skipped`, not silent; reindex is skipped for them too today), gated on `LANCE_SUPPORTS_BLOB_COMPACTION` until the upstream blob-v2 compaction-decode bug is fixed (see [docs/dev/invariants.md](docs/dev/invariants.md) Known Gaps) |
| Repair uncovered drift | — | `omnigraph repair` explicitly classifies uncovered table `HEAD > manifest` drift: verified maintenance drift (`ReserveFragments`/`Rewrite`) can be published with `--confirm`; suspicious or unverifiable drift requires `--force --confirm`. Sidecar-covered crash residuals still recover automatically on open. | | Repair uncovered drift | — | `omnigraph repair` explicitly classifies uncovered table `HEAD > manifest` drift: verified maintenance drift (`ReserveFragments`/`Rewrite`) can be published with `--confirm`; suspicious or unverifiable drift requires `--force --confirm`. Sidecar-covered crash residuals still recover automatically on open. |
| Cleanup (`cleanup_old_versions`) | ✅ | `omnigraph cleanup` with `--keep` / `--older-than` policy | | Cleanup (`cleanup_old_versions`) | ✅ | `omnigraph cleanup` with `--keep` / `--older-than` policy |
| BTREE / inverted (FTS) / vector indexes | ✅ | `ensure_indices` builds them per `@index`/`@key` column, dispatched by type via `node_prop_index_kind` (enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector); idempotent; lazy across branches. Coverage of fragments appended after build is restored by `optimize`'s `optimize_indices` pass (see Compaction row). | | BTREE / inverted (FTS) / vector indexes | ✅ | `@index`/`@key` declares intent; the physical index is derived state that never fails a logical op. Built per column through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector); idempotent; lazy across branches. **Schema apply builds nothing** (records intent only); `load`/`mutate` build inline but **defer an untrainable Vector column** (no trainable vectors yet) as *pending* rather than aborting. `ensure_indices`/`optimize` is the reconciler that materializes declared-but-missing indexes and restores coverage of appended/rewritten fragments (`optimize_indices`), reporting still-pending columns (see Compaction row). |
| `merge_insert` upsert | ✅ | `LoadMode::Merge`, mutation `update`/`insert`/`delete` lowering | | `merge_insert` upsert | ✅ | `LoadMode::Merge`, mutation `update`/`insert`/`delete` lowering |
| Vector search | ✅ | `nearest()` query op; embedding pipeline (Gemini / OpenAI clients); `@embed` in schema | | Vector search | ✅ | `nearest()` query op; embedding pipeline (Gemini / OpenAI clients); `@embed` in schema |
| Full-text search | ✅ | `search/fuzzy/match_text/bm25` query ops | | Full-text search | ✅ | `search/fuzzy/match_text/bm25` query ops |
@ -264,8 +265,8 @@ omnigraph policy explain --actor act-alice --action change --branch main
| Three-way row-level merge | — | `OrderedTableCursor` + `StagedTableWriter`, structured `MergeConflictKind` | | Three-way row-level merge | — | `OrderedTableCursor` + `StagedTableWriter`, structured `MergeConflictKind` |
| Change feeds | — | `diff_between` / `diff_commits` with manifest fast path + ID streaming | | Change feeds | — | `diff_between` / `diff_commits` with manifest fast path + ID streaming |
| Cedar policy | — | Per-graph actions plus server-scoped actions (see [docs/user/operations/policy.md](docs/user/operations/policy.md) for the current list), branch / target_branch / protected scopes, validate/test/explain CLI. **Engine-wide enforcement** (MR-722): every `_as` writer (`apply_schema_as`, `mutate_as`, `load_as` — the deprecated `ingest_as` shims route through it — `branch_create_as` / `branch_create_from_as`, `branch_delete_as`, `branch_merge_as`) calls `Omnigraph::enforce(action, scope, actor)` — HTTP, CLI, embedded SDK all hit the same gate. | | Cedar policy | — | Per-graph actions plus server-scoped actions (see [docs/user/operations/policy.md](docs/user/operations/policy.md) for the current list), branch / target_branch / protected scopes, validate/test/explain CLI. **Engine-wide enforcement** (MR-722): every `_as` writer (`apply_schema_as`, `mutate_as`, `load_as` — the deprecated `ingest_as` shims route through it — `branch_create_as` / `branch_create_from_as`, `branch_delete_as`, `branch_merge_as`) calls `Omnigraph::enforce(action, scope, actor)` — HTTP, CLI, embedded SDK all hit the same gate. |
| HTTP server | — | Axum, OpenAPI via utoipa, bearer auth (SHA-256, AWS Secrets Manager option), `authorize_request` at the HTTP boundary (resolves bearer→actor, applies admission control), NDJSON streaming export, **multi-graph mode (v0.6.0+) with cluster routes + read-only `GET /graphs` enumeration + per-graph + server-level Cedar policies. Multi-graph boots from a cluster directory (`--cluster`) or the legacy `omnigraph.yaml`; add/remove graphs via `cluster apply` (or by editing the legacy file) and restarting.** | | HTTP server | — | Axum, OpenAPI via utoipa, bearer auth (SHA-256, AWS Secrets Manager option), `authorize_request` at the HTTP boundary (resolves bearer→actor, applies admission control), NDJSON streaming export, **cluster-only boot (RFC-011): always `--cluster <dir | s3://…>`, serving N graphs (N ≥ 1) under multi-graph routes + read-only `GET /graphs` enumeration + per-graph + server-level Cedar policies. Add/remove graphs via `cluster apply` and restart.** |
| CLI with config | — | two-surface config (team `cluster.yaml` dir + per-operator `~/.omnigraph/config.yaml`; legacy `omnigraph.yaml` deprecated per RFC-008), aliases, multi-format output (json/jsonl/csv/kv/table) | | CLI with config | — | two-surface config (team `cluster.yaml` dir + per-operator `~/.omnigraph/config.yaml`), scope addressing (`--store`/`--server`/`--cluster`/`--profile`/defaults, RFC-011), aliases, multi-format output (json/jsonl/csv/kv/table) |
| Audit / actor tracking | — | `_as` write APIs + actor map in commit graph | | Audit / actor tracking | — | `_as` write APIs + actor map in commit graph |
| Local RustFS bootstrap | — | `scripts/local-rustfs-bootstrap.sh` one-shot S3-backed dev environment | | Local RustFS bootstrap | — | `scripts/local-rustfs-bootstrap.sh` one-shot S3-backed dev environment |
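The index row in the table above describes a type-dispatched choice of index kind, plus deferring an untrainable Vector column as pending instead of failing the write. The sketch below illustrates that rule only. `PropType`, `IndexKind`, `BuildOutcome`, and both function names are hypothetical; the real dispatch lives in `node_prop_index_kind`, whose signature is not shown in this diff. Treating "untrainable" as "zero trainable vectors" is also an assumption.

```rust
/// Hypothetical simplification of a schema property type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PropType {
    Enum,
    OrderableScalar,
    FreeText,
    Vector,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IndexKind {
    BTree,
    Fts,
    Vector,
}

/// Result of an inline build attempt during `load`/`mutate`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildOutcome {
    Built(IndexKind),
    /// Declared but not yet materialized; a later reconcile pass retries.
    Pending,
}

/// Dispatch by type: enum + orderable scalar -> BTREE,
/// free-text String -> FTS, Vector -> vector.
pub fn index_kind_for(prop: PropType) -> IndexKind {
    match prop {
        PropType::Enum | PropType::OrderableScalar => IndexKind::BTree,
        PropType::FreeText => IndexKind::Fts,
        PropType::Vector => IndexKind::Vector,
    }
}

/// Assumption: a vector index is untrainable when no vectors are present yet.
/// In that case the build is deferred rather than aborting the logical write.
pub fn plan_build(prop: PropType, trainable_vectors: usize) -> BuildOutcome {
    let kind = index_kind_for(prop);
    if kind == IndexKind::Vector && trainable_vectors == 0 {
        BuildOutcome::Pending
    } else {
        BuildOutcome::Built(kind)
    }
}
```

Returning `Pending` as an ordinary value, not an error, mirrors the stated design: the physical index is derived state, so its absence is reported and reconciled later instead of failing the operation.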


@ -7,7 +7,8 @@
**Lakehouse native graph engine built for context assembly** **Lakehouse native graph engine built for context assembly**
Omnigraph acts as operational state & coordination layer for agents Omnigraph acts as operational state & coordination layer for agents.
Hundreds of agents can enrich the graph on parallel isolated branches and changes can be reviewed and merged safely.
- Git-style versioning & branching - Git-style versioning & branching
- Multimodal retrieval (graph+vector/fts+filters) optimized for context assembly - Multimodal retrieval (graph+vector/fts+filters) optimized for context assembly


@ -325,6 +325,13 @@ pub struct InvokeStoredQueryRequest {
/// mutation). Mutually exclusive with `branch`. /// mutation). Mutually exclusive with `branch`.
#[serde(default)] #[serde(default)]
pub snapshot: Option<String>, pub snapshot: Option<String>,
/// The kind the caller expects (RFC-011 Decision 3): `Some(false)` for
/// `omnigraph query <name>`, `Some(true)` for `omnigraph mutate <name>`.
/// When set and it disagrees with the stored query's actual kind, the
/// server rejects the call (400) so the verb asserts the kind. `None`
/// (the default) skips the check — preserving older clients and aliases.
#[serde(default)]
pub expect_mutation: Option<bool>,
} }
/// Response for `POST /queries/{name}`: the read envelope for a stored /// Response for `POST /queries/{name}`: the read envelope for a stored
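The doc comment on `expect_mutation` above describes a three-way check: `None` skips it, a matching `Some` passes, and a disagreeing `Some` is rejected with a 400. A minimal sketch of that rule, assuming a hypothetical `KindMismatch` error type and `check_expected_kind` helper (neither appears in this diff, and the server's real error type is not shown):

```rust
/// Hypothetical error carrying the HTTP status the server would return.
#[derive(Debug, PartialEq, Eq)]
pub struct KindMismatch {
    pub status: u16,
    pub message: String,
}

/// Compare the caller's asserted kind with the stored query's actual kind.
/// `expect_mutation == None` preserves older clients by skipping the check.
pub fn check_expected_kind(
    expect_mutation: Option<bool>,
    stored_is_mutation: bool,
) -> Result<(), KindMismatch> {
    match expect_mutation {
        // No assertion from the caller: nothing to verify.
        None => Ok(()),
        // The verb and the stored query agree.
        Some(expected) if expected == stored_is_mutation => Ok(()),
        // The verb asserted one kind but the stored query is the other.
        Some(expected) => Err(KindMismatch {
            status: 400,
            message: format!(
                "stored query kind mismatch: expected mutation={expected}, actual mutation={stored_is_mutation}"
            ),
        }),
    }
}
```

Under this sketch, `omnigraph query <name>` would send `Some(false)` and be rejected when `<name>` is a stored mutation, while a request that omits the field behaves as before.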


@ -18,10 +18,10 @@ any — run against a graph, served (--server / --profile) or embedded (--store
URI): query, mutate, load, branch, snapshot, export, commit, schema show/apply.\n \ URI): query, mutate, load, branch, snapshot, export, commit, schema show/apply.\n \
served require a server: graphs.\n \ served require a server: graphs.\n \
direct direct storage access; reject --server (init, optimize, repair, cleanup, \ direct direct storage access; reject --server (init, optimize, repair, cleanup, \
schema plan, lint, queries validate).\n \ schema plan, lint).\n \
control manage a cluster via --config: cluster.\n \ control manage or inspect a cluster (cluster via --config; policy & queries via \
local no graph; local config & tooling: policy, embed, login, logout, config, \ --cluster).\n \
version, queries list.\n\ local no explicit graph scope; local config & tooling: alias, embed, login, logout, profile, version.\n\
See the 'Command capabilities' section of the CLI reference for which flags apply where.")] See the 'Command capabilities' section of the CLI reference for which flags apply where.")]
pub(crate) struct Cli { pub(crate) struct Cli {
/// Actor id for direct-engine writes; overrides `cli.actor`. No effect on /// Actor id for direct-engine writes; overrides `cli.actor`. No effect on
@ -37,9 +37,11 @@ pub(crate) struct Cli {
#[arg(long, global = true, value_name = "NAME|URL")] #[arg(long, global = true, value_name = "NAME|URL")]
pub(crate) server: Option<String>, pub(crate) server: Option<String>,
/// Graph id on a multi-graph `--server` (appends `/graphs/<id>` to /// Select a graph within a multi-graph scope: on a `--server` it appends
/// the server url). Requires --server. /// `/graphs/<id>` to the server url; on a `--cluster` it picks which
#[arg(long, global = true, value_name = "GRAPH_ID", requires = "server")] /// cluster graph to maintain. Rejected on a single-graph address (a
/// positional URI / `--store`).
#[arg(long, global = true, value_name = "GRAPH_ID")]
pub(crate) graph: Option<String>, pub(crate) graph: Option<String>,
/// Select a named scope bundle (RFC-011) from `profiles:` in /// Select a named scope bundle (RFC-011) from `profiles:` in
@ -56,6 +58,26 @@ pub(crate) struct Cli {
#[arg(long, global = true, value_name = "URI")] #[arg(long, global = true, value_name = "URI")]
pub(crate) store: Option<String>, pub(crate) store: Option<String>,
/// Address a cluster-managed graph's storage for maintenance (RFC-011):
/// a cluster directory or storage-root URI — named via `clusters:` in
/// ~/.omnigraph/config.yaml, or a literal `file://`/`s3://` root. Pair
/// with `--graph <id>` to select the graph. Used by optimize / repair /
/// cleanup; exclusive with a positional URI / `--store` / `--server`.
#[arg(long, global = true, value_name = "DIR|URI")]
pub(crate) cluster: Option<String>,
/// Skip the confirmation prompt for a destructive write (`cleanup`,
/// overwrite `load`, `branch delete`) against a non-local scope (RFC-011
/// Decision 9). Without it, a non-local destructive write prompts on a TTY
/// and refuses (errors) when there is no TTY or `--json` is set.
#[arg(long, global = true)]
pub(crate) yes: bool,
/// Suppress the one-line resolved-write-target diagnostic that write
/// commands echo to stderr (RFC-011 Decision 9).
#[arg(long, global = true)]
pub(crate) quiet: bool,
#[command(subcommand)] #[command(subcommand)]
pub(crate) command: Command, pub(crate) command: Command,
} }
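The `--yes` doc comment above specifies when a destructive write prompts, proceeds, or refuses. That decision table can be sketched as a pure function. `WriteContext`, `Confirm`, and `confirmation_for` are hypothetical names for illustration, and the assumption that local or non-destructive writes proceed without a prompt is inferred from the comment, not stated in it.

```rust
/// What the CLI should do before executing a write.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Confirm {
    Proceed,
    Prompt,
    Refuse,
}

/// Hypothetical bundle of the inputs the decision depends on.
#[derive(Debug, Clone, Copy)]
pub struct WriteContext {
    pub destructive: bool, // cleanup, overwrite load, branch delete
    pub non_local: bool,   // target scope is not local
    pub yes: bool,         // --yes was passed
    pub is_tty: bool,      // stdin is an interactive terminal
    pub json: bool,        // --json was passed
}

pub fn confirmation_for(ctx: &WriteContext) -> Confirm {
    // Assumption: only non-local destructive writes are gated at all.
    if !ctx.destructive || !ctx.non_local {
        return Confirm::Proceed;
    }
    // --yes skips the prompt entirely.
    if ctx.yes {
        return Confirm::Proceed;
    }
    // Prompt only when a human can answer; otherwise refuse with an error.
    if ctx.is_tty && !ctx.json {
        Confirm::Prompt
    } else {
        Confirm::Refuse
    }
}
```

Keeping the decision separate from the actual prompting and I/O makes each row of the table testable without a terminal.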
@ -70,22 +92,16 @@ pub(crate) enum Command {
/// when used. Pairs with `omnigraph mutate` on the write side. /// when used. Pairs with `omnigraph mutate` on the write side.
#[command(visible_alias = "read")] #[command(visible_alias = "read")]
Query { Query {
/// Graph URI /// Query name. With no `--query`/`-e`, the stored query to invoke from
#[arg(long)] /// the catalog (served — addressed via --server/--profile). With
uri: Option<String>, /// `--query`/`-e`, selects which query in that ad-hoc source to run.
#[arg(hide = true)]
legacy_uri: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
#[arg(long, conflicts_with_all = ["query", "query_string"])]
alias: Option<String>,
#[arg(long, conflicts_with_all = ["alias", "query_string"])]
query: Option<PathBuf>,
/// Inline GQ source — alternative to `--query <path>` and `--alias <name>`.
#[arg(short = 'e', long = "query-string", value_name = "GQ", conflicts_with_all = ["query", "alias"])]
query_string: Option<String>,
#[arg(long)]
name: Option<String>, name: Option<String>,
/// Ad-hoc query file (a `.gq` you're authoring / break-glass).
#[arg(long, conflicts_with = "query_string")]
query: Option<PathBuf>,
/// Inline ad-hoc GQ source — alternative to `--query <path>`.
#[arg(short = 'e', long = "query-string", value_name = "GQ", conflicts_with = "query")]
query_string: Option<String>,
#[command(flatten)] #[command(flatten)]
params: ParamsArgs, params: ParamsArgs,
#[arg(long, conflicts_with = "snapshot")] #[arg(long, conflicts_with = "snapshot")]
@ -96,8 +112,6 @@ pub(crate) enum Command {
format: Option<ReadOutputFormat>, format: Option<ReadOutputFormat>,
#[arg(long, conflicts_with = "format")] #[arg(long, conflicts_with = "format")]
json: bool, json: bool,
#[arg()]
alias_args: Vec<String>,
}, },
/// Execute a graph mutation query against a branch. /// Execute a graph mutation query against a branch.
/// ///
@ -106,38 +120,48 @@ pub(crate) enum Command {
/// warning when used. Pairs with `omnigraph query` on the read side. /// warning when used. Pairs with `omnigraph query` on the read side.
#[command(visible_alias = "change")] #[command(visible_alias = "change")]
Mutate { Mutate {
/// Graph URI /// Query name. With no `--query`/`-e`, the stored mutation to invoke
#[arg(long)] /// from the catalog (served — addressed via --server/--profile). With
uri: Option<String>, /// `--query`/`-e`, selects which query in that ad-hoc source to run.
#[arg(hide = true)]
legacy_uri: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
#[arg(long, conflicts_with_all = ["query", "query_string"])]
alias: Option<String>,
#[arg(long, conflicts_with_all = ["alias", "query_string"])]
query: Option<PathBuf>,
/// Inline GQ source — alternative to `--query <path>` and `--alias <name>`.
#[arg(short = 'e', long = "query-string", value_name = "GQ", conflicts_with_all = ["query", "alias"])]
query_string: Option<String>,
#[arg(long)]
name: Option<String>, name: Option<String>,
/// Ad-hoc mutation file (a `.gq` you're authoring / break-glass).
#[arg(long, conflicts_with = "query_string")]
query: Option<PathBuf>,
/// Inline ad-hoc GQ source — alternative to `--query <path>`.
#[arg(short = 'e', long = "query-string", value_name = "GQ", conflicts_with = "query")]
query_string: Option<String>,
#[command(flatten)] #[command(flatten)]
params: ParamsArgs, params: ParamsArgs,
#[arg(long)] #[arg(long)]
branch: Option<String>, branch: Option<String>,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
#[arg()] },
alias_args: Vec<String>, /// Invoke an operator alias (RFC-011 Decision 4).
///
/// An alias is a personal binding under `aliases:` in
/// ~/.omnigraph/config.yaml — name → (server, graph, stored-query name,
/// default params). `omnigraph alias <name> [args]` invokes the bound
/// stored query on its server. Living in its own namespace, an alias can
/// never shadow or be shadowed by a built-in verb. Replaces the removed
/// `--alias` flag on `query`/`mutate`.
Alias {
/// Alias name (a key under `aliases:` in ~/.omnigraph/config.yaml).
name: String,
/// Positional args bound to the alias's declared `args` params, in order.
args: Vec<String>,
#[command(flatten)]
params: ParamsArgs,
#[arg(long, conflicts_with = "json")]
format: Option<ReadOutputFormat>,
#[arg(long, conflicts_with = "format")]
json: bool,
}, },
/// Load data into a graph (local or remote) /// Load data into a graph (local or remote)
Load { Load {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
data: PathBuf, data: PathBuf,
/// Target branch (defaults to main). Without --from it must exist. /// Target branch (defaults to main). Without --from it must exist.
#[arg(long)] #[arg(long)]
@ -159,8 +183,6 @@ pub(crate) enum Command {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
data: PathBuf, data: PathBuf,
#[arg(long)] #[arg(long)]
branch: Option<String>, branch: Option<String>,
@ -181,8 +203,6 @@ pub(crate) enum Command {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
branch: Option<String>, branch: Option<String>,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
@ -192,8 +212,6 @@ pub(crate) enum Command {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
branch: Option<String>, branch: Option<String>,
#[arg(long, hide = true)] #[arg(long, hide = true)]
jsonl: bool, jsonl: bool,
@ -238,30 +256,12 @@ pub(crate) enum Command {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
/// Cluster directory or storage-root URI; with --cluster-graph, resolves
/// the graph's storage URI from the served cluster state.
#[arg(long, conflicts_with = "uri", requires = "cluster_graph")]
cluster: Option<String>,
/// Graph id within --cluster.
#[arg(long, requires = "cluster")]
cluster_graph: Option<String>,
#[arg(long)]
json: bool, json: bool,
}, },
/// Classify and explicitly repair manifest/head drift /// Classify and explicitly repair manifest/head drift
Repair { Repair {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
/// Cluster directory or storage-root URI; with --cluster-graph, resolves
/// the graph's storage URI from the served cluster state.
#[arg(long, conflicts_with = "uri", requires = "cluster_graph")]
cluster: Option<String>,
/// Graph id within --cluster.
#[arg(long, requires = "cluster")]
cluster_graph: Option<String>,
/// Publish verified maintenance drift. Without this flag, repair only /// Publish verified maintenance drift. Without this flag, repair only
/// previews what it would do. /// previews what it would do.
#[arg(long)] #[arg(long)]
@ -277,15 +277,6 @@ pub(crate) enum Command {
Cleanup { Cleanup {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
/// Cluster directory or storage-root URI; with --cluster-graph, resolves
/// the graph's storage URI from the served cluster state.
#[arg(long, conflicts_with = "uri", requires = "cluster_graph")]
cluster: Option<String>,
/// Graph id within --cluster.
#[arg(long, requires = "cluster")]
cluster_graph: Option<String>,
/// Number of recent versions to keep per table. Either `--keep` or /// Number of recent versions to keep per table. Either `--keep` or
/// `--older-than` (or both) must be set. /// `--older-than` (or both) must be set.
#[arg(long)] #[arg(long)]
@ -315,8 +306,6 @@ pub(crate) enum Command {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
query: PathBuf, query: PathBuf,
#[arg(long)] #[arg(long)]
schema: Option<PathBuf>, schema: Option<PathBuf>,
@ -336,8 +325,7 @@ pub(crate) enum Command {
command: ClusterCommand, command: ClusterCommand,
}, },
// ── Session / config ── no graph addressing; local tooling. /// Policy administration and diagnostics against a cluster's applied bundles
/// Policy administration and diagnostics
Policy { Policy {
#[command(subcommand)] #[command(subcommand)]
command: PolicyCommand, command: PolicyCommand,
@ -363,16 +351,32 @@ pub(crate) enum Command {
#[arg(long)] #[arg(long)]
json: bool, json: bool,
}, },
/// Legacy-config tooling (RFC-008): split omnigraph.yaml into its /// Inspect the scope profiles in ~/.omnigraph/config.yaml (read-only).
/// two destinations. Profile {
Config {
#[command(subcommand)] #[command(subcommand)]
command: ConfigCommand, command: ProfileCommand,
}, },
/// Print the CLI version /// Print the CLI version
Version, Version,
} }
#[derive(Debug, Subcommand)]
pub(crate) enum ProfileCommand {
/// List the profiles defined in ~/.omnigraph/config.yaml.
List {
#[arg(long)]
json: bool,
},
/// Show a profile's resolved scope. With no name, shows the active
/// (`$OMNIGRAPH_PROFILE`) profile, else the flat operator defaults.
Show {
/// Profile name (optional).
name: Option<String>,
#[arg(long)]
json: bool,
},
}
#[derive(Debug, Subcommand)] #[derive(Debug, Subcommand)]
pub(crate) enum ClusterCommand { pub(crate) enum ClusterCommand {
/// Validate cluster.yaml and referenced schemas, queries, and policy files. /// Validate cluster.yaml and referenced schemas, queries, and policy files.
@ -469,8 +473,6 @@ pub(crate) enum GraphsCommand {
#[arg(long)] #[arg(long)]
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
json: bool, json: bool,
}, },
} }
@ -483,8 +485,6 @@ pub(crate) enum BranchCommand {
#[arg(long)] #[arg(long)]
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
from: Option<String>, from: Option<String>,
name: String, name: String,
#[arg(long)] #[arg(long)]
@ -496,8 +496,6 @@ pub(crate) enum BranchCommand {
#[arg(long)] #[arg(long)]
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
json: bool, json: bool,
}, },
/// Delete a branch /// Delete a branch
@ -505,8 +503,6 @@ pub(crate) enum BranchCommand {
/// Graph URI /// Graph URI
#[arg(long)] #[arg(long)]
uri: Option<String>, uri: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
name: String, name: String,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
@ -516,8 +512,6 @@ pub(crate) enum BranchCommand {
/// Graph URI /// Graph URI
#[arg(long)] #[arg(long)]
uri: Option<String>, uri: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
source: String, source: String,
#[arg(long)] #[arg(long)]
into: Option<String>, into: Option<String>,
@ -533,8 +527,6 @@ pub(crate) enum SchemaCommand {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
schema: PathBuf, schema: PathBuf,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
@ -549,8 +541,6 @@ pub(crate) enum SchemaCommand {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
schema: PathBuf, schema: PathBuf,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
@ -572,8 +562,6 @@ pub(crate) enum SchemaCommand {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
json: bool, json: bool,
}, },
} }
@ -586,8 +574,6 @@ pub(crate) enum CommitCommand {
/// Graph URI /// Graph URI
uri: Option<String>, uri: Option<String>,
#[arg(long)] #[arg(long)]
config: Option<PathBuf>,
#[arg(long)]
branch: Option<String>, branch: Option<String>,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
@ -597,8 +583,6 @@ pub(crate) enum CommitCommand {
/// Graph URI /// Graph URI
#[arg(long)] #[arg(long)]
uri: Option<String>, uri: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
commit_id: String, commit_id: String,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
@ -607,20 +591,24 @@ pub(crate) enum CommitCommand {
#[derive(Debug, Subcommand)] #[derive(Debug, Subcommand)]
pub(crate) enum PolicyCommand { pub(crate) enum PolicyCommand {
/// Validate policy YAML and compiled Cedar policy state /// Compile and validate the Cedar policy bundle(s) applied in a cluster.
Validate { ///
#[arg(long)] /// Sources the bundle(s) from the cluster's applied policies
config: Option<PathBuf>, /// (`--cluster <dir>`); pass the global `--graph <id>` to pick one
}, /// graph's bundle when several apply.
/// Run declarative policy tests from policy.tests.yaml Validate {},
/// Run declarative policy tests against a cluster's applied bundle.
///
/// The cluster model has no per-bundle tests file, so the cases are
/// supplied explicitly with `--tests <file>` and checked against the
/// bundle selected by `--cluster` (+ optional `--graph`).
Test { Test {
/// Path to a policy.tests.yaml file.
#[arg(long)] #[arg(long)]
config: Option<PathBuf>, tests: PathBuf,
}, },
/// Explain one policy decision locally /// Explain one policy decision against a cluster's applied bundle.
Explain { Explain {
#[arg(long)]
config: Option<PathBuf>,
#[arg(long)] #[arg(long)]
actor: String, actor: String,
#[arg(long)] #[arg(long)]
@ -634,24 +622,19 @@ pub(crate) enum PolicyCommand {
#[derive(Debug, Subcommand)] #[derive(Debug, Subcommand)]
pub(crate) enum QueriesCommand { pub(crate) enum QueriesCommand {
/// Type-check the stored-query registry against the live schema. /// Type-check a cluster's stored-query registry against its schemas.
/// ///
/// Distinct from `omnigraph lint` (which lints one `.gq` file): /// Distinct from `omnigraph lint` (which lints one `.gq` file): this
/// this validates the whole `queries:` registry — opening the graph /// validates the whole `queries:` registry of a cluster (`--cluster
/// to read its schema and confirming every stored query still /// <dir>`, optional `--graph <id>`) by reading each graph's applied
/// type-checks. Exits non-zero on any breakage. /// schema and confirming every stored query still type-checks. Exits
/// non-zero on any breakage.
Validate { Validate {
/// Graph URI
uri: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
}, },
/// List the registered stored queries (name, MCP exposure, params). /// List a cluster's registered stored queries (name, params).
List { List {
#[arg(long)]
config: Option<PathBuf>,
#[arg(long)] #[arg(long)]
json: bool, json: bool,
}, },
@ -682,7 +665,6 @@ impl From<CliLoadMode> for LoadMode {
} }
} }
} }
impl CliLoadMode { impl CliLoadMode {
pub(crate) fn as_str(self) -> &'static str { pub(crate) fn as_str(self) -> &'static str {
match self { match self {
@ -692,21 +674,3 @@ impl CliLoadMode {
} }
} }
} }
#[derive(Debug, Subcommand)]
pub(crate) enum ConfigCommand {
/// Propose (and with --write, apply) the RFC-008 split of a legacy
/// omnigraph.yaml: team half -> a ready-to-review cluster.yaml,
/// personal half -> ~/.omnigraph/config.yaml (key-level merge,
/// existing entries always win). Touches nothing without --write.
Migrate {
/// Path to the legacy omnigraph.yaml (default: ./omnigraph.yaml)
#[arg(long)]
config: Option<PathBuf>,
/// Apply the split instead of only printing it
#[arg(long)]
write: bool,
#[arg(long)]
json: bool,
},
}
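
Review note: the doc comment on the new `ProfileCommand::Show` says that with no name it shows the active (`$OMNIGRAPH_PROFILE`) profile, else the flat operator defaults. A minimal sketch of that precedence follows. `ProfileSelection` and `select_profile` are hypothetical names for illustration, not part of this diff, and treating a blank value as unset is an assumption of the sketch.

```rust
/// Which scope `profile show` would display (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
enum ProfileSelection {
    /// A profile chosen by name, from the positional argument or the environment.
    Named(String),
    /// No profile selected: fall back to the flat operator defaults.
    FlatDefaults,
}

/// Treat a missing or blank value as "not set" (an assumption of this sketch).
fn non_empty(value: Option<&str>) -> Option<&str> {
    value.map(str::trim).filter(|s| !s.is_empty())
}

/// Precedence: explicit name, then the environment profile, then flat defaults.
fn select_profile(name: Option<&str>, env_profile: Option<&str>) -> ProfileSelection {
    non_empty(name)
        .or_else(|| non_empty(env_profile))
        .map(|p| ProfileSelection::Named(p.to_string()))
        .unwrap_or(ProfileSelection::FlatDefaults)
}
```

In a real command, `env_profile` would presumably come from `std::env::var("OMNIGRAPH_PROFILE").ok()`; the sketch takes it as a parameter so the precedence can be checked without touching the process environment.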


@ -29,7 +29,8 @@ use omnigraph::db::{Omnigraph, ReadTarget};
use omnigraph_api_types::{ use omnigraph_api_types::{
BranchCreateOutput, BranchCreateRequest, BranchDeleteOutput, BranchListOutput, BranchCreateOutput, BranchCreateRequest, BranchDeleteOutput, BranchListOutput,
BranchMergeOutput, BranchMergeRequest, ChangeOutput, CommitListOutput, CommitOutput, BranchMergeOutput, BranchMergeRequest, ChangeOutput, CommitListOutput, CommitOutput,
ErrorOutput, ExportRequest, GraphListResponse, IngestOutput, IngestRequest, ReadOutput, ErrorOutput, ExportRequest, GraphListResponse, IngestOutput, IngestRequest,
InvokeStoredQueryRequest, ReadOutput,
ReadRequest, SchemaApplyOutput, SchemaApplyRequest, SchemaOutput, SnapshotOutput, commit_output, ReadRequest, SchemaApplyOutput, SchemaApplyRequest, SchemaOutput, SnapshotOutput, commit_output,
ingest_output, read_output, schema_apply_output, snapshot_payload, ingest_output, read_output, schema_apply_output, snapshot_payload,
}; };
@ -39,22 +40,20 @@ use serde_json::Value;
use crate::cli::CliLoadMode; use crate::cli::CliLoadMode;
use crate::helpers::{ use crate::helpers::{
ResolvedCliGraph, apply_bearer_token, apply_server_flag, build_http_client, is_remote_uri, apply_bearer_token, apply_server_flag, build_http_client, is_remote_uri,
legacy_change_request_body, open_local_db_with_policy, query_params_from_json, legacy_change_request_body, query_params_from_json,
remote_json, remote_url, resolve_cli_actor, resolve_cli_graph, resolve_remote_bearer_token, remote_json, remote_url, resolve_cli_actor, resolve_cli_graph, resolve_remote_bearer_token,
select_named_query, resolve_server_flag, select_named_query,
}; };
use crate::output::{LoadOutput, load_output_from_result, load_output_from_tables}; use crate::output::{LoadOutput, load_output_from_result, load_output_from_tables};
use omnigraph_server::config::OmnigraphConfig;
pub(crate) enum GraphClient { pub(crate) enum GraphClient {
/// Local engine at `uri`. Reads (`resolve()`) leave `graph`/`actor` /// Local engine at `uri`. Reads (`resolve()`) leave `actor` empty;
/// empty and open without policy; writes (`resolve_with_policy()`) /// writes (`resolve_with_policy()`) attribute the resolved actor.
/// fill them, opening through `open_local_db_with_policy` and /// Direct-store access carries no Cedar policy (RFC-011: policy lives
/// attributing the resolved actor. /// in the cluster/server, not in per-operator addressing).
Embedded { Embedded {
uri: String, uri: String,
graph: Option<ResolvedCliGraph>,
actor: Option<String>, actor: Option<String>,
}, },
/// Remote HTTP server. The actor is resolved server-side from the /// Remote HTTP server. The actor is resolved server-side from the
@ -66,6 +65,43 @@ pub(crate) enum GraphClient {
}, },
} }
/// RFC-011 Decision 7: a server scope that selects no graph (no `--graph`, no
/// `default_graph`) must not silently fall through to the bare server URL when
/// the server is multi-graph. Best-effort probe `GET /graphs`: a populated list
/// forces `--graph` (listing the candidates); a single-graph/flat server (405),
/// a policy-gated `/graphs`, or an unreachable server all proceed — the bare URL
/// is then correct, or the real request surfaces the failure. Only fires on the
/// no-graph path, so a `--graph`/`default_graph` happy path does no extra I/O.
async fn require_graph_for_multi_graph_server(
scope: &crate::scope::ResolvedScope,
) -> Result<()> {
let (Some(server), None) = (scope.server.as_deref(), scope.graph.as_deref()) else {
return Ok(());
};
let Some(base) = resolve_server_flag(Some(server), None)? else {
return Ok(());
};
let token = resolve_remote_bearer_token(Some(&base))?;
let probe = GraphClient::Remote {
http: build_http_client()?,
base_url: base,
token,
};
if let Ok(resp) = probe.list_graphs().await {
if !resp.graphs.is_empty() {
let ids: Vec<&str> = resp.graphs.iter().map(|g| g.graph_id.as_str()).collect();
bail!(
"server scope '{server}' has {} {}: [{}]; pass --graph <id> to select one \
(or set `default_graph` in your operator config)",
ids.len(),
if ids.len() == 1 { "graph" } else { "graphs" },
ids.join(", ")
);
}
}
Ok(())
}
/// A remote graph must be addressed with `--server` (RFC-011): a positional or /// A remote graph must be addressed with `--server` (RFC-011): a positional or
/// `--uri` `http(s)://` URL no longer auto-dispatches to a server. A remote URL /// `--uri` `http(s)://` URL no longer auto-dispatches to a server. A remote URL
/// produced by a server scope (`via_server`) is fine. /// produced by a server scope (`via_server`) is fine.
@ -86,8 +122,7 @@ impl GraphClient {
/// fork. Mirrors the read verbs' current preamble (`resolve_uri` /// fork. Mirrors the read verbs' current preamble (`resolve_uri`
/// path, not the policy-bearing `resolve_cli_graph`). Used by reads /// path, not the policy-bearing `resolve_cli_graph`). Used by reads
/// and `query` (which opens without policy, like the reads). /// and `query` (which opens without policy, like the reads).
pub(crate) fn resolve( pub(crate) async fn resolve(
config: &OmnigraphConfig,
server: Option<&str>, server: Option<&str>,
graph: Option<&str>, graph: Option<&str>,
uri: Option<String>, uri: Option<String>,
@ -100,8 +135,9 @@ impl GraphClient {
let scope = crate::scope::resolve_scope( let scope = crate::scope::resolve_scope(
&crate::operator::load_operator_config()?, &crate::operator::load_operator_config()?,
crate::planes::Capability::Any, crate::planes::Capability::Any,
crate::scope::ScopeFlags { profile, store, server, graph, uri }, crate::scope::ScopeFlags { profile, store, server, cluster: None, graph, uri },
)?; )?;
require_graph_for_multi_graph_server(&scope).await?;
let (server, graph, uri) = ( let (server, graph, uri) = (
scope.server.as_deref(), scope.server.as_deref(),
scope.graph.as_deref(), scope.graph.as_deref(),
@ -109,8 +145,8 @@ impl GraphClient {
); );
let via_server = server.is_some(); let via_server = server.is_some();
let uri = apply_server_flag(server, graph, uri)?; let uri = apply_server_flag(server, graph, uri)?;
let token = resolve_remote_bearer_token(config, uri.as_deref())?; let token = resolve_remote_bearer_token(uri.as_deref())?;
let uri = crate::helpers::resolve_uri(config, uri)?; let uri = crate::helpers::resolve_uri(uri)?;
reject_positional_remote(via_server, &uri)?; reject_positional_remote(via_server, &uri)?;
if is_remote_uri(&uri) { if is_remote_uri(&uri) {
Ok(GraphClient::Remote { Ok(GraphClient::Remote {
@ -119,11 +155,7 @@ impl GraphClient {
token, token,
}) })
} else { } else {
Ok(GraphClient::Embedded { Ok(GraphClient::Embedded { uri, actor: None })
uri,
graph: None,
actor: None,
})
} }
} }
@ -133,8 +165,7 @@ impl GraphClient {
/// resolved up front. The embedded arm then opens WITH policy. The /// resolved up front. The embedded arm then opens WITH policy. The
/// resolution order matches the write arms exactly: server flag → /// resolution order matches the write arms exactly: server flag →
/// bearer token → graph. /// bearer token → graph.
pub(crate) fn resolve_with_policy( pub(crate) async fn resolve_with_policy(
config: &OmnigraphConfig,
server: Option<&str>, server: Option<&str>,
graph: Option<&str>, graph: Option<&str>,
uri: Option<String>, uri: Option<String>,
@ -147,8 +178,9 @@ impl GraphClient {
let scope = crate::scope::resolve_scope( let scope = crate::scope::resolve_scope(
&crate::operator::load_operator_config()?, &crate::operator::load_operator_config()?,
crate::planes::Capability::Any, crate::planes::Capability::Any,
crate::scope::ScopeFlags { profile, store, server, graph, uri }, crate::scope::ScopeFlags { profile, store, server, cluster: None, graph, uri },
)?; )?;
require_graph_for_multi_graph_server(&scope).await?;
let (server, graph, uri) = ( let (server, graph, uri) = (
scope.server.as_deref(), scope.server.as_deref(),
scope.graph.as_deref(), scope.graph.as_deref(),
@ -156,8 +188,8 @@ impl GraphClient {
); );
let via_server = server.is_some(); let via_server = server.is_some();
let uri = apply_server_flag(server, graph, uri)?; let uri = apply_server_flag(server, graph, uri)?;
let token = resolve_remote_bearer_token(config, uri.as_deref())?; let token = resolve_remote_bearer_token(uri.as_deref())?;
let resolved = resolve_cli_graph(config, uri)?; let resolved = resolve_cli_graph(uri)?;
reject_positional_remote(via_server, &resolved.uri)?; reject_positional_remote(via_server, &resolved.uri)?;
if resolved.is_remote { if resolved.is_remote {
// A served write resolves the actor server-side from the bearer // A served write resolves the actor server-side from the bearer
@ -175,10 +207,9 @@ impl GraphClient {
token, token,
}) })
} else { } else {
let actor = resolve_cli_actor(cli_as, config)?; let actor = resolve_cli_actor(cli_as)?;
Ok(GraphClient::Embedded { Ok(GraphClient::Embedded {
uri: resolved.uri.clone(), uri: resolved.uri,
graph: Some(resolved),
actor, actor,
}) })
} }
@ -192,28 +223,15 @@ impl GraphClient {
} }
} }
/// The selected graph name, when a policy-bearing embedded client was
/// resolved against a named graph. `None` for remote and for reads.
pub(crate) fn selected(&self) -> Option<&str> {
match self {
GraphClient::Embedded { graph, .. } => graph.as_ref().and_then(ResolvedCliGraph::selected),
GraphClient::Remote { .. } => None,
}
}
pub(crate) fn is_remote(&self) -> bool { pub(crate) fn is_remote(&self) -> bool {
matches!(self, GraphClient::Remote { .. }) matches!(self, GraphClient::Remote { .. })
} }
/// Open the local engine the way the resolved client demands: with /// Open the local engine. Direct-store access carries no Cedar policy
/// policy when a `graph` context is present (write path), bare /// (RFC-011), so both read and write paths open bare; the actor is still
/// otherwise (read/`query` path). Captures today's two open paths in /// attributed on the write via the `_as` engine APIs.
/// one place so each verb stays a single match arm. async fn open_embedded(uri: &str) -> Result<Omnigraph> {
async fn open_embedded(uri: &str, graph: &Option<ResolvedCliGraph>) -> Result<Omnigraph> { Ok(Omnigraph::open(uri).await?)
match graph {
Some(graph) => open_local_db_with_policy(graph).await,
None => Ok(Omnigraph::open(uri).await?),
}
} }
pub(crate) async fn branch_list(&self) -> Result<BranchListOutput> { pub(crate) async fn branch_list(&self) -> Result<BranchListOutput> {
@ -375,8 +393,8 @@ impl GraphClient {
.await?; .await?;
Ok(load_output_from_tables(base_url, branch, mode.as_str(), &output)) Ok(load_output_from_tables(base_url, branch, mode.as_str(), &output))
} }
GraphClient::Embedded { uri, graph, actor } => { GraphClient::Embedded { uri, actor } => {
let db = Self::open_embedded(uri, graph).await?; let db = Self::open_embedded(uri).await?;
let result = db let result = db
.load_file_as(branch, from, data, mode.into(), actor.as_deref()) .load_file_as(branch, from, data, mode.into(), actor.as_deref())
.await?; .await?;
@ -418,8 +436,8 @@ impl GraphClient {
) )
.await .await
} }
GraphClient::Embedded { uri, graph, actor } => { GraphClient::Embedded { uri, actor } => {
let db = Self::open_embedded(uri, graph).await?; let db = Self::open_embedded(uri).await?;
let result = db let result = db
.load_file_as(branch, Some(from), data, mode.into(), actor.as_deref()) .load_file_as(branch, Some(from), data, mode.into(), actor.as_deref())
.await?; .await?;
@ -457,10 +475,10 @@ impl GraphClient {
) )
.await .await
} }
GraphClient::Embedded { uri, graph, actor } => { GraphClient::Embedded { uri, actor } => {
let (selected_name, query_params) = select_named_query(query_source, query_name)?; let (selected_name, query_params) = select_named_query(query_source, query_name)?;
let params = query_params_from_json(&query_params, params_json)?; let params = query_params_from_json(&query_params, params_json)?;
let db = Self::open_embedded(uri, graph).await?; let db = Self::open_embedded(uri).await?;
let actor = actor.as_deref(); let actor = actor.as_deref();
let result = db let result = db
.mutate_as(branch, query_source, &selected_name, &params, actor) .mutate_as(branch, query_source, &selected_name, &params, actor)
@ -511,10 +529,10 @@ impl GraphClient {
) )
.await .await
} }
GraphClient::Embedded { uri, graph, .. } => { GraphClient::Embedded { uri, .. } => {
let (selected_name, query_params) = select_named_query(query_source, query_name)?; let (selected_name, query_params) = select_named_query(query_source, query_name)?;
let params = query_params_from_json(&query_params, params_json)?; let params = query_params_from_json(&query_params, params_json)?;
let db = Self::open_embedded(uri, graph).await?; let db = Self::open_embedded(uri).await?;
let result = db let result = db
.query(target.clone(), query_source, &selected_name, &params) .query(target.clone(), query_source, &selected_name, &params)
.await?; .await?;
@ -523,6 +541,50 @@ impl GraphClient {
} }
} }
/// `invoke_named` — run a stored query **by catalog name** (RFC-011 D3).
/// Served-only: the catalog is server-owned, so a `--store` (embedded)
/// scope has nothing to resolve the name against. `expect_mutation` carries
/// the verb's asserted kind; the server rejects a mismatch (400) before
/// running, so the response is exactly the expected envelope — the caller
/// deserializes it as the concrete `T` (`ReadOutput` for `query`,
/// `ChangeOutput` for `mutate`), sidestepping the untagged wire enum.
pub(crate) async fn invoke_named<T: serde::de::DeserializeOwned>(
&self,
name: &str,
expect_mutation: bool,
params_json: Option<&Value>,
branch: Option<String>,
snapshot: Option<String>,
) -> Result<T> {
match self {
GraphClient::Remote {
http,
base_url,
token,
} => {
let body = InvokeStoredQueryRequest {
params: params_json.cloned(),
branch,
snapshot,
expect_mutation: Some(expect_mutation),
};
remote_json(
http,
Method::POST,
remote_url(base_url, &["queries", name], &[])?,
Some(serde_json::to_value(body)?),
token.as_deref(),
)
.await
}
GraphClient::Embedded { .. } => bail!(
"by-name invocation needs a server (the stored-query catalog is \
server-owned); use -e '<gq>' or --query <file> for an ad-hoc query \
against --store, or address a server with --server / --profile"
),
}
}
pub(crate) async fn branch_create_from( pub(crate) async fn branch_create_from(
&self, &self,
from: &str, from: &str,
@ -546,8 +608,8 @@ impl GraphClient {
) )
.await .await
} }
GraphClient::Embedded { uri, graph, actor } => { GraphClient::Embedded { uri, actor } => {
let db = Self::open_embedded(uri, graph).await?; let db = Self::open_embedded(uri).await?;
let actor = actor.as_deref(); let actor = actor.as_deref();
db.branch_create_from_as(ReadTarget::branch(from), name, actor) db.branch_create_from_as(ReadTarget::branch(from), name, actor)
.await?; .await?;
@ -577,8 +639,8 @@ impl GraphClient {
) )
.await .await
} }
GraphClient::Embedded { uri, graph, actor } => { GraphClient::Embedded { uri, actor } => {
let db = Self::open_embedded(uri, graph).await?; let db = Self::open_embedded(uri).await?;
let actor = actor.as_deref(); let actor = actor.as_deref();
db.branch_delete_as(name, actor).await?; db.branch_delete_as(name, actor).await?;
Ok(BranchDeleteOutput { Ok(BranchDeleteOutput {
@ -609,8 +671,8 @@ impl GraphClient {
) )
.await .await
} }
GraphClient::Embedded { uri, graph, actor } => { GraphClient::Embedded { uri, actor } => {
let db = Self::open_embedded(uri, graph).await?; let db = Self::open_embedded(uri).await?;
let actor = actor.as_deref(); let actor = actor.as_deref();
let outcome = db.branch_merge_as(source, into, actor).await?; let outcome = db.branch_merge_as(source, into, actor).await?;
Ok(BranchMergeOutput { Ok(BranchMergeOutput {
@ -660,8 +722,8 @@ impl GraphClient {
) )
.await .await
} }
GraphClient::Embedded { uri, graph, actor } => { GraphClient::Embedded { uri, actor } => {
let db = Self::open_embedded(uri, graph).await?; let db = Self::open_embedded(uri).await?;
let result = db let result = db
.apply_schema_as_with_catalog_check( .apply_schema_as_with_catalog_check(
schema_source, schema_source,
@ -730,9 +792,9 @@ impl GraphClient {
/// `graphs list` — enumerate the graphs a remote multi-graph server /// `graphs list` — enumerate the graphs a remote multi-graph server
/// serves (`GET /graphs`). Remote-only by design: there is no local /// serves (`GET /graphs`). Remote-only by design: there is no local
/// enumeration endpoint, so the Embedded arm fails loudly pointing the /// enumeration endpoint, so the Embedded arm fails loudly. Routing it
/// operator at `omnigraph.yaml`. Routing it through the enum still buys /// through the enum still buys the shared `resolve()` addressing/token
/// the shared `resolve()` addressing/token preamble. /// preamble.
pub(crate) async fn list_graphs(&self) -> Result<GraphListResponse> { pub(crate) async fn list_graphs(&self) -> Result<GraphListResponse> {
match self { match self {
GraphClient::Remote { GraphClient::Remote {
@ -750,9 +812,9 @@ impl GraphClient {
.await .await
} }
GraphClient::Embedded { .. } => bail!( GraphClient::Embedded { .. } => bail!(
"`omnigraph graphs list` requires a remote multi-graph server URL \ "`omnigraph graphs list` requires a remote multi-graph server \
(http:// or https://). To enumerate local graphs, read `omnigraph.yaml` \ (--server <url>). To enumerate the graphs in a cluster, run \
directly." `omnigraph cluster status --config <dir>`."
), ),
} }
} }
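
Review note: the decision table behind `require_graph_for_multi_graph_server` can be sketched as a pure function, which may help when reading the async version above. The `Probe` enum and `require_graph` are hypothetical stand-ins for illustration; the real function probes `GET /graphs` through `GraphClient::Remote` and reports errors with `bail!`. The probe is passed as a closure so the sketch also shows that it only runs on the no-graph path.

```rust
/// Outcome of the best-effort `GET /graphs` probe (hypothetical type for this sketch).
enum Probe {
    /// The server answered with a graph list, possibly empty.
    Listed(Vec<String>),
    /// A flat server (405), a policy-gated endpoint, or an unreachable server.
    Unavailable,
}

/// Returns an error only when a server scope selects no graph and the probe
/// reports at least one graph. Every other case proceeds.
fn require_graph(
    server: Option<&str>,
    graph: Option<&str>,
    probe: impl FnOnce() -> Probe,
) -> Result<(), String> {
    // Only the "server set, graph unset" path probes at all.
    let (Some(server), None) = (server, graph) else {
        return Ok(());
    };
    match probe() {
        Probe::Listed(ids) if !ids.is_empty() => Err(format!(
            "server scope '{server}' has {} {}: [{}]; pass --graph <id> to select one",
            ids.len(),
            if ids.len() == 1 { "graph" } else { "graphs" },
            ids.join(", ")
        )),
        // Empty list or failed probe: let the real request surface any failure.
        _ => Ok(()),
    }
}
```

Keeping the probe lazy mirrors the comment in the diff: a `--graph` or `default_graph` happy path does no extra I/O.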

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -1,22 +1,16 @@
//! In-source test suite for the CLI binary (moved verbatim from //! In-source test suite for the CLI binary (moved verbatim from
//! main.rs; `use super::*` resolves through the #[path] declaration). //! main.rs; `use super::*` resolves through the #[path] declaration).
use std::fs;
use super::{ use super::{
DEFAULT_BEARER_TOKEN_ENV, apply_bearer_token, bearer_token_from_env_file, DEFAULT_BEARER_TOKEN_ENV, apply_bearer_token, legacy_change_request_body,
legacy_change_request_body, load_cli_config, load_env_file_into_process, normalize_bearer_token, resolve_remote_bearer_token,
normalize_bearer_token, parse_env_assignment, resolve_cli_graph, resolve_policy_context,
resolve_remote_bearer_token,
}; };
use omnigraph_server::load_config;
use reqwest::header::AUTHORIZATION; use reqwest::header::AUTHORIZATION;
use serde_json::json; use serde_json::json;
use tempfile::tempdir;
#[test] #[test]
fn legacy_change_request_body_uses_legacy_field_names() { fn legacy_change_request_body_uses_legacy_field_names() {
// `execute_change_remote` hits `POST /change`, which old // `mutate`'s remote arm hits `POST /change`, which old
// `omnigraph-server` builds deserialize as `ChangeRequest` with // `omnigraph-server` builds deserialize as `ChangeRequest` with
// **required** `query_source` and optional `query_name` keys. // **required** `query_source` and optional `query_name` keys.
// Newer servers accept both spellings via serde alias, but a // Newer servers accept both spellings via serde alias, but a
@ -96,120 +90,20 @@
} }
#[test] #[test]
fn parse_env_assignment_supports_plain_and_exported_values() { fn resolve_remote_bearer_token_falls_back_to_default_env() {
assert_eq!( // RFC-011: with no operator server matching the URL, the only chain
parse_env_assignment("DEMO_TOKEN=demo-token"), // left is the default `OMNIGRAPH_BEARER_TOKEN` env (no omnigraph.yaml
Some(("DEMO_TOKEN".to_string(), "demo-token".to_string())) // scoped chain). Hermetic: no operator config is read for a literal URL
); // that matches no `servers:` entry.
assert_eq!(
parse_env_assignment("export DEMO_TOKEN=\"quoted-token\""),
Some(("DEMO_TOKEN".to_string(), "quoted-token".to_string()))
);
assert_eq!(parse_env_assignment("# comment"), None);
assert_eq!(parse_env_assignment(" "), None);
}
#[test]
fn bearer_token_from_env_file_reads_named_value() {
let temp = tempdir().unwrap();
let env_file = temp.path().join(".env.omni");
fs::write(
&env_file,
"FIRST=ignore\nexport DEMO_TOKEN=\" demo-token \"\n",
)
.unwrap();
assert_eq!(
bearer_token_from_env_file(&env_file, "DEMO_TOKEN")
.unwrap()
.as_deref(),
Some("demo-token")
);
assert_eq!(
bearer_token_from_env_file(&env_file, "MISSING").unwrap(),
None
);
}
#[test]
fn load_env_file_into_process_sets_missing_values_without_overriding_existing_ones() {
let temp = tempdir().unwrap();
let env_file = temp.path().join(".env.omni");
fs::write(
&env_file,
"AUTOLOAD_ONLY=from-file\nAUTOLOAD_PRESET=from-file\n",
)
.unwrap();
let missing_key = "AUTOLOAD_ONLY";
let preset_key = "AUTOLOAD_PRESET";
let previous_missing = std::env::var_os(missing_key);
let previous_preset = std::env::var_os(preset_key);
unsafe {
std::env::remove_var(missing_key);
std::env::set_var(preset_key, "from-env");
}
load_env_file_into_process(&env_file).unwrap();
assert_eq!(std::env::var(missing_key).unwrap(), "from-file");
assert_eq!(std::env::var(preset_key).unwrap(), "from-env");
unsafe {
if let Some(value) = previous_missing {
std::env::set_var(missing_key, value);
} else {
std::env::remove_var(missing_key);
}
if let Some(value) = previous_preset {
std::env::set_var(preset_key, value);
} else {
std::env::remove_var(preset_key);
}
}
}
#[test]
fn resolve_remote_bearer_token_uses_scoped_env_file_with_global_fallback() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
r#"
graphs:
demo:
uri: https://example.com
bearer_token_env: DEMO_TOKEN
auth:
env_file: .env.omni
cli:
graph: demo
"#,
)
.unwrap();
fs::write(
temp.path().join(".env.omni"),
"DEMO_TOKEN=scoped-token\nOMNIGRAPH_BEARER_TOKEN=global-token\n",
)
.unwrap();
let previous = std::env::var_os(DEFAULT_BEARER_TOKEN_ENV); let previous = std::env::var_os(DEFAULT_BEARER_TOKEN_ENV);
let previous_home = std::env::var_os("OMNIGRAPH_HOME"); let previous_home = std::env::var_os("OMNIGRAPH_HOME");
unsafe { unsafe {
std::env::remove_var(DEFAULT_BEARER_TOKEN_ENV); std::env::set_var(DEFAULT_BEARER_TOKEN_ENV, "global-token");
// Hermetic: the keyed hop (RFC-007 PR 2) must not pick up a real std::env::set_var("OMNIGRAPH_HOME", "/nonexistent/omnigraph-test-home");
// ~/.omnigraph on the developer's machine — and with no operator
// servers defined, the legacy chain below must behave
// byte-identically to pre-PR-2 (tested-as-untouched).
std::env::set_var("OMNIGRAPH_HOME", temp.path().join("no-operator-config"));
} }
let config_path = temp.path().join("omnigraph.yaml");
let config = load_config(Some(&config_path)).unwrap();
assert_eq!( assert_eq!(
resolve_remote_bearer_token(&config, Some("https://override.example.com")) resolve_remote_bearer_token(Some("https://override.example.com"))
.unwrap() .unwrap()
.as_deref(), .as_deref(),
Some("global-token") Some("global-token")
@ -228,196 +122,3 @@ cli:
} }
} }
} }
#[test]
fn load_cli_config_autoloads_env_file_into_process() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
r#"
auth:
env_file: .env.omni
graphs:
demo:
uri: s3://bucket/prefix
"#,
)
.unwrap();
fs::write(
temp.path().join(".env.omni"),
"AUTOLOAD_FROM_CONFIG=loaded\n",
)
.unwrap();
let key = "AUTOLOAD_FROM_CONFIG";
let previous = std::env::var_os(key);
unsafe {
std::env::remove_var(key);
}
let config_path = temp.path().join("omnigraph.yaml");
let config = load_cli_config(Some(&config_path)).unwrap();
assert_eq!(
config.resolve_target_uri(None, Some("demo"), None).unwrap(),
"s3://bucket/prefix"
);
assert_eq!(std::env::var(key).unwrap(), "loaded");
unsafe {
if let Some(value) = previous {
std::env::set_var(key, value);
} else {
std::env::remove_var(key);
}
}
}
#[test]
fn graph_identity_resolve_policy_context_named_cli_graph_uses_graph_key_not_project_name_or_uri()
{
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
local:
uri: /tmp/local-policy-graph.omni
policy:
file: ./policy.yaml
cli:
graph: local
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let context = resolve_policy_context(&config).unwrap();
assert_eq!(context.graph_id, "local");
}
#[test]
fn graph_identity_resolve_policy_context_server_graph_uses_graph_key_when_cli_graph_absent() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
local:
uri: /tmp/local-policy-graph.omni
policy:
file: ./server-policy.yaml
server:
graph: local
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let context = resolve_policy_context(&config).unwrap();
assert_eq!(context.graph_id, "local");
assert!(context.policy_file.ends_with("server-policy.yaml"));
}
#[test]
fn graph_identity_resolve_policy_context_anonymous_uses_top_level_default_identity() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
local:
uri: /tmp/local-policy-graph.omni
policy:
file: ./top-policy.yaml
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let context = resolve_policy_context(&config).unwrap();
assert_eq!(context.graph_id, "default");
assert!(context.policy_file.ends_with("top-policy.yaml"));
}
#[test]
fn graph_identity_resolve_cli_graph_named_target_uses_graph_key_not_project_name_or_uri() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
prod:
uri: s3://bucket/prod-graph/
policy:
file: ./prod-policy.yaml
cli:
graph: prod
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
// `--target` is removed; the `cli.graph` default drives the same
// graph-key (not project name / URI) selection.
let graph = resolve_cli_graph(&config, None).unwrap();
assert_eq!(graph.selected(), Some("prod"));
assert_eq!(graph.graph_id, "prod");
assert_eq!(graph.uri, "s3://bucket/prod-graph/");
}
#[test]
fn graph_identity_resolve_cli_graph_positional_uri_uses_anonymous_normalized_uri() {
let temp = tempdir().unwrap();
let config_path = temp.path().join("omnigraph.yaml");
fs::write(
&config_path,
r#"
project:
name: misleading-project
graphs:
local:
uri: /tmp/configured-graph.omni
policy:
file: ./policy.yaml
cli:
graph: local
"#,
)
.unwrap();
let config = load_config(Some(&config_path)).unwrap();
let local_graph_path = temp.path().join("explicit-graph.omni");
let local_graph = resolve_cli_graph(
&config,
Some(format!("file://{}", local_graph_path.display())),
)
.unwrap();
assert_eq!(local_graph.selected(), None);
assert_eq!(
local_graph.graph_id,
local_graph_path.to_string_lossy().as_ref()
);
assert_eq!(local_graph.policy_file, None);
let s3_graph = resolve_cli_graph(
&config,
Some("s3://bucket/anonymous-graph/".to_string()),
)
.unwrap();
assert_eq!(s3_graph.selected(), None);
assert_eq!(s3_graph.graph_id, "s3://bucket/anonymous-graph");
assert_eq!(s3_graph.policy_file, None);
}
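The last test pins down how a positional URI becomes an anonymous graph identity: a `file://` URI collapses to its plain path, and a trailing slash on an object-store URI is dropped. A minimal sketch of that normalization follows. `normalize_graph_id` is a hypothetical name used only for illustration, and the real resolver may handle more cases than the two asserted above.

```rust
/// Hypothetical helper (not the crate's real API): derive an anonymous graph
/// id from a positional URI, covering the two cases the test above asserts.
fn normalize_graph_id(uri: &str) -> String {
    // A `file://` URI identifies the graph by its plain filesystem path.
    let without_scheme = uri.strip_prefix("file://").unwrap_or(uri);
    // Trailing slashes carry no identity, but a bare "/" root is kept intact.
    let trimmed = without_scheme.trim_end_matches('/');
    if trimmed.is_empty() {
        without_scheme.to_string()
    } else {
        trimmed.to_string()
    }
}
```

Under these assumptions, `s3://bucket/anonymous-graph/` and `s3://bucket/anonymous-graph` resolve to the same identity.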


@ -1,408 +0,0 @@
//! `omnigraph config migrate` (RFC-008 stage 2): split a legacy
//! `omnigraph.yaml` into its two destinations — the team half as a
//! ready-to-review `cluster.yaml` proposal, the personal half merged into
//! `~/.omnigraph/config.yaml` — and name what's obsolete. The command is
//! the completeness test of RFC-008's migration map: any key it cannot
//! place is a bug in the RFC.
//!
//! Touches nothing without `--write`. Referenced `.gq`/policy files are
//! never moved; manual steps are printed instead.
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};
use color_eyre::Result;
use color_eyre::eyre::eyre;
use omnigraph_server::OmnigraphConfig;
use serde::Serialize;
use crate::operator;
#[derive(Debug, Serialize)]
pub(crate) struct MigrateReport {
pub(crate) source: String,
/// The ready-to-review cluster.yaml text (None when the legacy file
/// declares nothing team-shaped).
pub(crate) cluster_yaml: Option<String>,
/// Operator keys to merge: dotted key -> YAML value text.
pub(crate) operator_merge: BTreeMap<String, String>,
/// Keys with no destination, and why.
pub(crate) dropped: Vec<DroppedKey>,
/// Steps the command will not do for you.
pub(crate) manual_steps: Vec<String>,
}
#[derive(Debug, Serialize)]
pub(crate) struct DroppedKey {
pub(crate) key: String,
pub(crate) reason: String,
}
/// Classify a parsed legacy config into the report. Pure — no I/O.
pub(crate) fn build_report(config: &OmnigraphConfig, source: &Path) -> MigrateReport {
let mut dropped = Vec::new();
let mut manual_steps = Vec::new();
let mut operator_merge: BTreeMap<String, String> = BTreeMap::new();
// ---- personal half ----
if let Some(actor) = &config.cli.actor {
operator_merge.insert("operator.actor".into(), actor.clone());
}
if let Some(format) = config.cli.output_format {
operator_merge.insert(
"defaults.output".into(),
serde_yaml::to_string(&format).unwrap_or_default().trim().to_string(),
);
}
if let Some(width) = config.cli.table_max_column_width {
operator_merge.insert("defaults.table_max_column_width".into(), width.to_string());
}
if let Some(layout) = config.cli.table_cell_layout {
operator_merge.insert(
"defaults.table_cell_layout".into(),
serde_yaml::to_string(&layout).unwrap_or_default().trim().to_string(),
);
}
if config.cli.graph.is_some() {
dropped.push(DroppedKey {
key: "cli.graph".into(),
reason: "address graphs explicitly via --store/--server, or set defaults.default_graph in the operator config".into(),
});
}
if config.cli.branch.is_some() {
dropped.push(DroppedKey {
key: "cli.branch".into(),
reason: "pass --branch explicitly".into(),
});
}
// Remote graphs with a token env become operator servers (the keyed
// chain replaces invented env-var names).
for (name, target) in &config.graphs {
if target.uri.starts_with("http://") || target.uri.starts_with("https://") {
operator_merge.insert(format!("servers.{name}.url"), target.uri.clone());
if target.bearer_token_env.is_some() {
manual_steps.push(format!(
"store the '{name}' token in the keyed chain: echo $TOKEN | omnigraph login {name} (replaces bearer_token_env)"
));
}
}
}
if config.auth.env_file.is_some() {
manual_steps.push(
"auth.env_file keeps working during the window; prefer `omnigraph login <server>` per server going forward".into(),
);
}
// Legacy aliases split: content -> catalog stored query, binding ->
// operator alias referencing the name.
for (name, alias) in &config.aliases {
let query_name = alias.name.clone().unwrap_or_else(|| name.clone());
operator_merge.insert(
format!("aliases.{name}"),
format!(
"{{ server: TODO-server-name, graph: {}, query: {query_name}, args: [{}] }}",
alias.graph.as_deref().unwrap_or("TODO-graph-id"),
alias.args.join(", ")
),
);
manual_steps.push(format!(
"alias '{name}': move its query content ('{}') into the cluster checkout's queries/ so '{query_name}' becomes a catalog stored query",
alias.query
));
}
// ---- team half ----
let has_team_content = !config.graphs.is_empty()
|| !config.queries.is_empty()
|| config.policy.file.is_some()
|| config.server.policy.file.is_some();
let cluster_yaml = has_team_content.then(|| {
let mut out = String::from("version: 1\n");
if let Some(name) = &config.project.name {
out.push_str(&format!("metadata:\n name: {name}\n"));
}
out.push_str("# storage: s3://bucket/prefix # or omit: this folder is the root\n");
if !config.graphs.is_empty() || !config.queries.is_empty() {
out.push_str("graphs:\n");
}
// Single-graph top-level queries belong to a graph the legacy file
// never named; propose one.
if !config.queries.is_empty() && config.graphs.is_empty() {
out.push_str(" default: # TODO: pick the graph id\n schema: # TODO: path to this graph's .pg schema\n queries: queries/\n");
}
for (name, target) in &config.graphs {
out.push_str(&format!(" {name}:\n"));
out.push_str(" schema: # TODO: path to this graph's .pg schema\n");
if !target.queries.is_empty() {
out.push_str(" queries: queries/ # move the .gq files here\n");
}
out.push_str(&format!(
" # legacy root: {} — the cluster manages graph roots under its storage; run `omnigraph cluster import` after reviewing\n",
target.uri
));
}
let mut policies: Vec<(String, String, String)> = Vec::new();
if let Some(file) = &config.policy.file {
policies.push(("default".into(), file.clone(), "graph.<id> # TODO: bind".into()));
}
if let Some(file) = &config.server.policy.file {
policies.push(("server".into(), file.clone(), "cluster".into()));
}
for (name, target) in &config.graphs {
if let Some(file) = &target.policy.file {
policies.push((name.clone(), file.clone(), format!("graph.{name}")));
}
}
if !policies.is_empty() {
out.push_str("policies:\n");
for (name, file, binding) in policies {
out.push_str(&format!(
" {name}:\n file: {file}\n applies_to: [{binding}]\n"
));
}
}
out
});
if !config.query.roots.is_empty() {
dropped.push(DroppedKey {
key: "query.roots".into(),
reason: "obsolete — cluster query discovery (queries: <dir>) replaced it".into(),
});
}
if config.server.bind.is_some() || config.server.graph.is_some() {
dropped.push(DroppedKey {
key: "server.bind / server.graph".into(),
reason: "deployment runtime — pass --bind / target flags or env".into(),
});
}
if config.project.name.is_some() && cluster_yaml.is_none() {
dropped.push(DroppedKey {
key: "project.name".into(),
reason: "the cluster's metadata.name is the deployment label".into(),
});
}
MigrateReport {
source: source.display().to_string(),
cluster_yaml,
operator_merge,
dropped,
manual_steps,
}
}
pub(crate) fn render_report(report: &MigrateReport) -> String {
let mut out = format!("migration plan for {}\n", report.source);
if let Some(cluster) = &report.cluster_yaml {
out.push_str("\n== team half -> cluster.yaml (ready to review) ==\n");
out.push_str(cluster);
}
if !report.operator_merge.is_empty() {
out.push_str("\n== personal half -> ~/.omnigraph/config.yaml ==\n");
for (key, value) in &report.operator_merge {
out.push_str(&format!(" {key}: {value}\n"));
}
}
if !report.dropped.is_empty() {
out.push_str("\n== no destination ==\n");
for dropped in &report.dropped {
out.push_str(&format!(" {}: {}\n", dropped.key, dropped.reason));
}
}
if !report.manual_steps.is_empty() {
out.push_str("\n== manual steps ==\n");
for step in &report.manual_steps {
out.push_str(&format!(" - {step}\n"));
}
}
out.push_str("\n(nothing written; pass --write to apply the operator merge and emit cluster.yaml)\n");
out
}
/// `--write`: merge the personal half into the operator config (key-level,
/// existing entries always win; the prior file is backed up) and write the
/// team half to cluster.yaml in the legacy config's directory (or
/// cluster.yaml.proposed when one already exists).
pub(crate) fn apply_report(report: &MigrateReport, legacy_dir: &Path) -> Result<Vec<String>> {
let mut written = Vec::new();
if !report.operator_merge.is_empty() {
let dir = operator::operator_dir()
.ok_or_else(|| eyre!("no home directory resolvable for the operator config"))?;
std::fs::create_dir_all(&dir)?;
let path = dir.join(operator::OPERATOR_CONFIG_FILE);
let existing_text = std::fs::read_to_string(&path).unwrap_or_default();
let mut mapping: serde_yaml::Mapping = if existing_text.trim().is_empty() {
serde_yaml::Mapping::new()
} else {
serde_yaml::from_str(&existing_text)
.map_err(|err| eyre!("operator config '{}' does not parse: {err}", path.display()))?
};
let mut merged_any = false;
for (dotted, value_text) in &report.operator_merge {
if merge_dotted_if_absent(&mut mapping, dotted, value_text)? {
merged_any = true;
}
}
if merged_any {
if !existing_text.is_empty() {
let backup = path.with_extension("yaml.bak");
std::fs::write(&backup, &existing_text)?;
written.push(format!("backed up prior operator config to {}", backup.display()));
}
let rendered = serde_yaml::to_string(&mapping)?;
let tmp = path.with_extension(format!("yaml.tmp.{}", std::process::id()));
std::fs::write(&tmp, &rendered)?;
std::fs::rename(&tmp, &path)?;
written.push(format!("merged personal keys into {}", path.display()));
} else {
written.push("operator config already carries every personal key (nothing merged)".into());
}
}
if let Some(cluster) = &report.cluster_yaml {
let target = legacy_dir.join("cluster.yaml");
let target = if target.exists() {
legacy_dir.join("cluster.yaml.proposed")
} else {
target
};
std::fs::write(&target, cluster)?;
written.push(format!("wrote team-half proposal to {}", target.display()));
}
Ok(written)
}
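`apply_report` never rewrites the operator config in place: it backs up the prior text, writes the merged text to a sibling temp file, then renames it over the target. A std-only sketch of that pattern follows, assuming the temp file sits on the same filesystem as the target. `write_with_backup` is a hypothetical helper name, not part of the crate.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Sketch: replace `path` with `contents`, keeping a `.bak` copy of any prior
/// non-empty file. Returns whether a backup was written.
fn write_with_backup(path: &Path, contents: &str) -> io::Result<bool> {
    // A missing or unreadable file is treated as empty, as in the code above.
    let existing = fs::read_to_string(path).unwrap_or_default();
    let backed_up = !existing.is_empty();
    if backed_up {
        fs::write(path.with_extension("yaml.bak"), &existing)?;
    }
    // Write the new text beside the target, then rename over it, so readers
    // see either the old file or the new one, never a half-written file.
    let tmp = path.with_extension(format!("yaml.tmp.{}", std::process::id()));
    fs::write(&tmp, contents)?;
    fs::rename(&tmp, path)?;
    Ok(backed_up)
}
```

Suffixing the temp name with the process id keeps two concurrent runs from writing the same temp file, although the last rename still wins.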
/// Set `a.b.c` in the mapping only when absent; returns whether it wrote.
fn merge_dotted_if_absent(
mapping: &mut serde_yaml::Mapping,
dotted: &str,
value_text: &str,
) -> Result<bool> {
let value: serde_yaml::Value =
serde_yaml::from_str(value_text).unwrap_or_else(|_| serde_yaml::Value::String(value_text.into()));
let parts: Vec<&str> = dotted.split('.').collect();
let mut current = mapping;
for part in &parts[..parts.len() - 1] {
let key = serde_yaml::Value::String((*part).into());
let entry = current
.entry(key)
.or_insert_with(|| serde_yaml::Value::Mapping(serde_yaml::Mapping::new()));
current = entry
.as_mapping_mut()
.ok_or_else(|| eyre!("operator config key '{dotted}' collides with a non-mapping"))?;
}
let leaf = serde_yaml::Value::String(parts[parts.len() - 1].into());
if current.contains_key(&leaf) {
return Ok(false);
}
current.insert(leaf, value);
Ok(true)
}
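The key-level merge rule ("existing entries always win") can be illustrated without `serde_yaml`. The sketch below models a YAML value as a tiny enum; `Node` and `set_if_absent` are hypothetical stand-ins for `serde_yaml::Value` and `merge_dotted_if_absent`, chosen so the example compiles with the standard library alone.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a YAML value: a scalar or a nested mapping.
#[derive(Debug, PartialEq)]
enum Node {
    Scalar(String),
    Map(BTreeMap<String, Node>),
}

/// Set `a.b.c` only when absent. Ok(true) if written, Ok(false) if a value
/// already exists, Err if an intermediate segment is not a mapping.
fn set_if_absent(
    root: &mut BTreeMap<String, Node>,
    dotted: &str,
    value: &str,
) -> Result<bool, String> {
    let mut parts: Vec<&str> = dotted.split('.').collect();
    // `split` always yields at least one item, so `pop` cannot fail.
    let leaf = parts.pop().expect("split yields at least one segment");
    let mut current = root;
    for part in parts {
        // Create missing intermediate mappings on the way down.
        let entry = current
            .entry(part.to_string())
            .or_insert_with(|| Node::Map(BTreeMap::new()));
        current = match entry {
            Node::Map(map) => map,
            Node::Scalar(_) => {
                return Err(format!("key '{dotted}' collides with a non-mapping"));
            }
        };
    }
    if current.contains_key(leaf) {
        return Ok(false); // existing entries always win
    }
    current.insert(leaf.to_string(), Node::Scalar(value.to_string()));
    Ok(true)
}
```

Returning a `bool` rather than overwriting lets the caller decide whether anything changed, which is what gates the backup-and-rewrite step in `apply_report`.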
pub(crate) fn legacy_config_path(explicit: Option<&PathBuf>) -> PathBuf {
explicit.cloned().unwrap_or_else(|| PathBuf::from("omnigraph.yaml"))
}
#[cfg(test)]
mod tests {
use super::*;
use omnigraph_server::config::load_config;
fn full_legacy_fixture(dir: &Path) -> PathBuf {
let path = dir.join("omnigraph.yaml");
std::fs::write(
&path,
r#"
project: { name: brain }
graphs:
prod:
uri: https://graph.example.com
bearer_token_env: PROD_TOKEN
policy: { file: ./prod.policy.yaml }
queries:
find: { file: ./find.gq }
local:
uri: /tmp/local.omni
server: { bind: "0.0.0.0:9999", policy: { file: ./server.policy.yaml } }
auth: { env_file: .env.omni }
cli:
graph: prod
branch: main
actor: act-me
output_format: json
table_max_column_width: 40
query: { roots: ["."] }
aliases:
triage: { command: query, query: ./triage.gq, name: weekly_triage, args: [since], graph: prod }
policy: { file: ./top.policy.yaml }
queries:
top_q: { file: ./top.gq }
"#,
)
.unwrap();
path
}
/// The RFC-008 completeness contract: every top-level key of the
/// legacy schema must appear in the report somewhere (team half,
/// operator merge, dropped, or manual steps).
#[test]
fn every_legacy_key_is_classified() {
let dir = tempfile::tempdir().unwrap();
let path = full_legacy_fixture(dir.path());
let config = load_config(Some(&path)).unwrap();
let report = build_report(&config, &path);
let rendered = render_report(&report);
let serialized =
serde_yaml::to_value(OmnigraphConfig::default()).expect("default serializes");
for key in serialized.as_mapping().unwrap().keys() {
let key = key.as_str().unwrap();
assert!(
rendered.contains(key)
|| report.operator_merge.keys().any(|k| k.contains(key))
|| (matches!(key, "graphs" | "queries" | "policy" | "project")
&& report.cluster_yaml.is_some()),
"legacy key '{key}' is unclassified — fix the RFC-008 map: {rendered}"
);
}
// spot checks on each section
assert_eq!(report.operator_merge["operator.actor"], "act-me");
assert_eq!(report.operator_merge["defaults.output"], "json");
assert_eq!(
report.operator_merge["servers.prod.url"],
"https://graph.example.com"
);
assert!(report.operator_merge["aliases.triage"].contains("query: weekly_triage"));
let cluster = report.cluster_yaml.as_deref().unwrap();
assert!(cluster.contains("version: 1"));
assert!(cluster.contains("name: brain"));
assert!(cluster.contains(" prod:"));
assert!(cluster.contains("applies_to: [cluster]"));
assert!(cluster.contains("applies_to: [graph.prod]"));
assert!(report.dropped.iter().any(|d| d.key == "query.roots"));
assert!(report.dropped.iter().any(|d| d.key.contains("server.bind")));
assert!(
report
.manual_steps
.iter()
.any(|s| s.contains("omnigraph login prod"))
);
}
#[test]
fn merge_dotted_never_clobbers_existing() {
let mut mapping: serde_yaml::Mapping =
serde_yaml::from_str("operator:\n actor: keep-me\n").unwrap();
assert!(!merge_dotted_if_absent(&mut mapping, "operator.actor", "new").unwrap());
assert!(merge_dotted_if_absent(&mut mapping, "defaults.output", "json").unwrap());
let text = serde_yaml::to_string(&mapping).unwrap();
assert!(text.contains("keep-me") && !text.contains("new"));
assert!(text.contains("output: json"));
}
}


@ -18,10 +18,10 @@ use std::env;
 use std::path::{Path, PathBuf};

 use color_eyre::Result;
-use color_eyre::eyre::eyre;
+use color_eyre::eyre::{bail, eyre};
 use serde::Deserialize;

-use omnigraph_server::config::ReadOutputFormat;
+use crate::read_format::{ReadOutputFormat, TableCellLayout};

 pub(crate) const OPERATOR_HOME_ENV: &str = "OMNIGRAPH_HOME";
 pub(crate) const OPERATOR_DIR: &str = ".omnigraph";
@ -91,8 +91,7 @@ pub(crate) struct OperatorServer {
 #[derive(Debug, Default, Deserialize)]
 pub(crate) struct OperatorIdentity {
     /// Default actor for every `--as` cascade (CLI direct-engine writes and
-    /// cluster commands alike): `--as` > legacy config actor (RFC-008
-    /// window) > this > none.
+    /// cluster commands alike): `--as` > this > none.
     pub(crate) actor: Option<String>,
     #[serde(flatten)]
     unknown: serde_yaml::Mapping,
@ -102,14 +101,19 @@ pub(crate) struct OperatorIdentity {
 pub(crate) struct OperatorDefaults {
     /// Default read output format, below every more-specific source.
     pub(crate) output: Option<ReadOutputFormat>,
-    /// Table rendering preferences (below the legacy cli.table_* keys
-    /// during the RFC-008 window).
+    /// Table rendering preferences for `--format table`.
     pub(crate) table_max_column_width: Option<usize>,
-    pub(crate) table_cell_layout: Option<omnigraph_server::config::TableCellLayout>,
+    pub(crate) table_cell_layout: Option<TableCellLayout>,
     /// Default server scope (RFC-011): the everyday addressing when no
     /// `--profile` / primitive / legacy address is given. Names an entry
-    /// under `servers:`.
+    /// under `servers:`. Mutually exclusive with `store` — a scope binds one
+    /// entity.
     pub(crate) server: Option<String>,
+    /// Default **store** scope (RFC-011): a `file://` / `s3://` graph storage
+    /// URI used as the zero-flag local default for graph commands when no
+    /// `--profile` / primitive address is given. The local-dev counterpart of
+    /// `server`; mutually exclusive with it.
+    pub(crate) store: Option<String>,
     /// Default graph selected within a server/cluster scope when no
     /// `--graph` is passed (RFC-011).
     pub(crate) default_graph: Option<String>,
@ -202,10 +206,36 @@ impl OperatorConfig {
         self.defaults.server.as_deref()
     }

+    /// The flat-default store scope URI, if set (RFC-011) — the zero-flag
+    /// local-dev default.
+    pub(crate) fn default_store(&self) -> Option<&str> {
+        self.defaults.store.as_deref()
+    }
+
     /// The flat-default graph within a server/cluster scope, if set (RFC-011).
     pub(crate) fn default_graph(&self) -> Option<&str> {
         self.defaults.default_graph.as_deref()
     }
+
+    /// A scope binds one entity (Decision 6): `defaults.server` and
+    /// `defaults.store` are mutually exclusive, and a `store` (already a single
+    /// graph) cannot carry a `default_graph`. Both are refused loudly rather
+    /// than silently dropped.
+    fn validate_defaults(&self) -> Result<()> {
+        if self.defaults.server.is_some() && self.defaults.store.is_some() {
+            bail!(
+                "operator config `defaults` sets both `server` and `store` — a default scope \
+                 binds one entity; keep one (use a `profile` if you need both)"
+            );
+        }
+        if self.defaults.store.is_some() && self.defaults.default_graph.is_some() {
+            bail!(
+                "operator config `defaults` sets both `store` and `default_graph` — a store is \
+                 already a single graph; drop `default_graph` (it applies only to a server/cluster scope)"
+            );
+        }
+        Ok(())
+    }
 }

 impl OperatorProfile {
@ -282,6 +312,7 @@ pub(crate) fn load_operator_config_at(path: &Path) -> Result<OperatorConfig> {
     for warning in config.unknown_key_warnings() {
         eprintln!("warning: {warning} in operator config '{}'", path.display());
     }
+    config.validate_defaults()?;
     Ok(config)
 }
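The two "one scope binds one entity" rules enforced at load time can be sketched in isolation. The struct below is a hypothetical, trimmed-down stand-in for the `defaults` section (the real `OperatorDefaults` carries more fields), and it uses `String` errors instead of `color_eyre` so the example compiles with the standard library alone.

```rust
/// Hypothetical stand-in for the operator config's `defaults` section.
#[derive(Default)]
struct Defaults {
    server: Option<String>,
    store: Option<String>,
    default_graph: Option<String>,
}

/// Refuse contradictory defaults loudly instead of silently preferring one.
fn validate_defaults(defaults: &Defaults) -> Result<(), String> {
    // A default scope is either a server or a store, never both.
    if defaults.server.is_some() && defaults.store.is_some() {
        return Err(
            "`defaults` sets both `server` and `store`; a default scope binds one entity".into(),
        );
    }
    // A store is one graph already, so a graph selector has nothing to select.
    if defaults.store.is_some() && defaults.default_graph.is_some() {
        return Err(
            "`defaults` sets both `store` and `default_graph`; a store is already a single graph"
                .into(),
        );
    }
    Ok(())
}
```

Running the check once at load time means every later consumer of the defaults can assume at most one scope is set.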
@ -560,6 +591,42 @@ mod tests {
         assert_eq!(config.output(), Some(ReadOutputFormat::Json));
     }

+    #[test]
+    fn defaults_store_parses_and_is_accessible() {
+        let dir = tempfile::tempdir().unwrap();
+        let path = dir.path().join("config.yaml");
+        fs::write(&path, "defaults:\n store: file:///tmp/dev.omni\n").unwrap();
+        let config = load_operator_config_at(&path).unwrap();
+        assert_eq!(config.default_store(), Some("file:///tmp/dev.omni"));
+        assert_eq!(config.default_server(), None);
+    }
+
+    #[test]
+    fn defaults_server_and_store_together_is_a_loud_error() {
+        let dir = tempfile::tempdir().unwrap();
+        let path = dir.path().join("config.yaml");
+        fs::write(
+            &path,
+            "defaults:\n server: prod\n store: file:///tmp/dev.omni\n",
+        )
+        .unwrap();
+        let err = load_operator_config_at(&path).unwrap_err().to_string();
+        assert!(err.contains("binds one entity"), "{err}");
+    }
+
+    #[test]
+    fn defaults_store_with_default_graph_is_a_loud_error() {
+        let dir = tempfile::tempdir().unwrap();
+        let path = dir.path().join("config.yaml");
+        fs::write(
+            &path,
+            "defaults:\n store: file:///tmp/dev.omni\n default_graph: knowledge\n",
+        )
+        .unwrap();
+        let err = load_operator_config_at(&path).unwrap_err().to_string();
+        assert!(err.contains("already a single graph"), "{err}");
+    }
+
     #[test]
     fn unknown_keys_warn_but_load() {
         // A file written for a later slice (servers/aliases) must load


@ -749,15 +749,10 @@ pub(crate) fn print_snapshot_human(branch: &str, manifest_version: u64, entries:
 pub(crate) fn print_read_output(
     output: &ReadOutput,
     format: ReadOutputFormat,
-    config: &OmnigraphConfig,
 ) -> Result<()> {
     println!(
         "{}",
-        render_read(
-            output,
-            format,
-            &resolve_table_render_options(config),
-        )?
+        render_read(output, format, &resolve_table_render_options())?
     );
     Ok(())
 }
@ -907,21 +902,87 @@ pub(crate) fn finish_logout(
     Ok(())
 }

-/// Table prefs cascade (RFC-007/008): legacy cli.table_* (window) >
-/// operator defaults.table_* > built-in.
-pub(crate) fn resolve_table_render_options(config: &OmnigraphConfig) -> ReadRenderOptions {
+#[derive(Debug, Serialize)]
+pub(crate) struct ProfileListItem {
+    pub(crate) name: String,
+    /// `server: <n>` / `cluster: <n>` / `store: <uri>` / `invalid: <reason>`.
+    pub(crate) binding: String,
+    /// `server` | `cluster` | `store` | `invalid`.
+    pub(crate) scope_kind: String,
+    /// The bound server/cluster name, or the store URI. `None` when invalid.
+    pub(crate) target: Option<String>,
+    pub(crate) valid: bool,
+    pub(crate) error: Option<String>,
+    pub(crate) default_graph: Option<String>,
+    pub(crate) active: bool,
+}
+
+#[derive(Debug, Serialize)]
+pub(crate) struct ProfileDetail {
+    /// Profile name, or `(defaults)` for the no-name flat-defaults view.
+    pub(crate) name: String,
+    /// `server` | `cluster` | `store` | `none`.
+    pub(crate) scope_kind: String,
+    /// The bound server/cluster name, or the store URI.
+    pub(crate) target: Option<String>,
+    /// Resolved endpoint: a server's URL / a cluster's root / the store URI;
+    /// `None` if a named server/cluster isn't defined in this config.
+    pub(crate) endpoint: Option<String>,
+    pub(crate) default_graph: Option<String>,
+    pub(crate) output_format: Option<String>,
+}
+
+pub(crate) fn print_profile_list(items: &[ProfileListItem], json: bool) -> Result<()> {
+    if json {
+        return print_json(&items);
+    }
+    if items.is_empty() {
+        println!("no profiles defined in the operator config");
+        return Ok(());
+    }
+    for item in items {
+        let active = if item.active { " (active)" } else { "" };
+        let graph = item
+            .default_graph
+            .as_deref()
+            .map(|g| format!(" · graph: {g}"))
+            .unwrap_or_default();
+        println!("{}{active} {}{graph}", item.name, item.binding);
+    }
+    Ok(())
+}
+
+pub(crate) fn print_profile_detail(detail: &ProfileDetail, json: bool) -> Result<()> {
+    if json {
+        return print_json(detail);
+    }
+    println!("profile: {}", detail.name);
+    let target = detail
+        .target
+        .as_deref()
+        .map(|t| format!(" {t}"))
+        .unwrap_or_default();
+    println!(" scope: {}{target}", detail.scope_kind);
+    if let Some(endpoint) = &detail.endpoint {
+        println!(" endpoint: {endpoint}");
+    } else if matches!(detail.scope_kind.as_str(), "server" | "cluster") {
+        println!(" endpoint: (undefined — name not in this config)");
+    }
+    if let Some(graph) = &detail.default_graph {
+        println!(" default graph: {graph}");
+    }
+    if let Some(format) = &detail.output_format {
+        println!(" output: {format}");
+    }
+    Ok(())
+}
+
+/// Table prefs cascade (RFC-011): operator defaults.table_* > built-in.
+pub(crate) fn resolve_table_render_options() -> ReadRenderOptions {
     let operator = crate::operator::load_operator_config().unwrap_or_default();
     ReadRenderOptions {
-        max_column_width: config
-            .cli
-            .table_max_column_width
-            .or(operator.defaults.table_max_column_width)
-            .unwrap_or(80),
-        cell_layout: config
-            .cli
-            .table_cell_layout
-            .or(operator.defaults.table_cell_layout)
-            .unwrap_or_default(),
+        max_column_width: operator.defaults.table_max_column_width.unwrap_or(80),
+        cell_layout: operator.defaults.table_cell_layout.unwrap_or_default(),
     }
 }
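Both versions of `resolve_table_render_options` are instances of the same precedence cascade: take the first layer that is set, else fall back to a built-in value. A generic sketch of that rule follows; `resolve_pref` is a hypothetical helper, not something the crate defines, and the layer order is whatever the caller passes in.

```rust
/// Return the first set layer (highest precedence first), or the built-in
/// fallback when no layer is set.
fn resolve_pref<T: Copy>(layers: &[Option<T>], built_in: T) -> T {
    layers.iter().copied().flatten().next().unwrap_or(built_in)
}
```

With the legacy `cli.table_*` layer gone, the cascade in this change shrinks to a single layer, which is why the new code is a plain `unwrap_or(80)`.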


@ -82,9 +82,7 @@ impl Capability {
 /// classifier) plus the one Data→Served refinement: `graphs` is remote-only.
 ///
 /// This reflects *current enforced behavior*, so messages stay truthful:
-/// `queries list` is `Local` (reads config today) and `queries validate` is
-/// `Direct` (opens a graph directly today). Both converge to the RFC end-state
-/// (served / control) only when later slices re-route them.
+/// `queries`/`policy` read a cluster's applied state (`Control`).
 pub(crate) fn command_capability(cmd: &Command) -> Capability {
     if let Command::Graphs { .. } = cmd {
         return Capability::Served;
@ -100,8 +98,7 @@ pub(crate) fn command_capability(cmd: &Command) -> Capability {
 /// The plane a subcommand belongs to. Exhaustive — a new `Command` variant
 /// will not compile until classified. Descends into the nested enums where
 /// the plane differs per subcommand (`schema plan` is storage while `schema
-/// show`/`apply` are data; `queries validate` opens the graph while `queries
-/// list` only reads config).
+/// show`/`apply` are data; `queries`/`policy` read cluster applied state).
 pub(crate) fn command_plane(cmd: &Command) -> Plane {
     match cmd {
         Command::Query { .. }
@ -119,23 +116,22 @@ pub(crate) fn command_plane(cmd: &Command) -> Plane {
         Command::Schema {
             command: SchemaCommand::Plan { .. },
         } => Plane::Storage,
-        Command::Queries {
-            command: QueriesCommand::Validate { .. },
-        } => Plane::Storage,
-        Command::Queries {
-            command: QueriesCommand::List { .. },
-        } => Plane::Session,
+        // `queries` and `policy` tooling now source their inputs from a
+        // cluster's applied state (`--cluster`), so they live on the control
+        // plane (RFC-011 — omnigraph.yaml excised from the CLI).
+        Command::Queries { .. } => Plane::Control,
+        Command::Policy { .. } => Plane::Control,
         Command::Init { .. }
         | Command::Optimize { .. }
         | Command::Repair { .. }
         | Command::Cleanup { .. }
         | Command::Lint { .. } => Plane::Storage,
         Command::Cluster { .. } => Plane::Control,
-        Command::Policy { .. }
+        Command::Alias { .. }
         | Command::Embed(_)
         | Command::Login { .. }
         | Command::Logout { .. }
-        | Command::Config { .. }
+        | Command::Profile { .. }
         | Command::Version => Plane::Session,
     }
 }
@ -147,7 +143,7 @@ pub(crate) fn command_label(cmd: &Command) -> &'static str {
         Command::Version => "version",
         Command::Login { .. } => "login",
         Command::Logout { .. } => "logout",
-        Command::Config { .. } => "config",
+        Command::Profile { .. } => "profile",
         Command::Embed(_) => "embed",
         Command::Init { .. } => "init",
         Command::Load { .. } => "load",
@ -168,6 +164,7 @@ pub(crate) fn command_label(cmd: &Command) -> &'static str {
         Command::Commit { .. } => "commit",
         Command::Query { .. } => "query",
         Command::Mutate { .. } => "mutate",
+        Command::Alias { .. } => "alias",
         Command::Policy { .. } => "policy",
         Command::Optimize { .. } => "optimize",
         Command::Repair { .. } => "repair",
@ -177,35 +174,128 @@ pub(crate) fn command_label(cmd: &Command) -> &'static str {
     }
 }

-/// Reject the data-plane addressing flags (`--server`/`--graph`) on any verb
-/// that does not live on the data plane. This replaces the old silent-ignore
-/// — e.g. `optimize --server prod` previously dropped `--server` and tried to
-/// resolve a default target, failing (if at all) with an unrelated message.
-/// Now it fails with one honest, declared error. RFC-010 Slice 1.
+/// The verbs that consume a cluster scope. Maintenance/lint select a graph with
+/// `--cluster <root> --graph <id>`; policy/queries inspect the cluster's
+/// applied control-plane state and may optionally use `--graph` to select one
+/// bundle/registry. `init` is storage-plane too but *creates* a graph (cluster
+/// graphs are born from `cluster apply`, not `init`), and `schema plan` takes a
+/// positional URI, so the guard rejects `--cluster`/`--graph` there rather than
+/// silently dropping the flag.
+pub(crate) fn accepts_cluster_addressing(cmd: &Command) -> bool {
+    matches!(
+        cmd,
+        Command::Optimize { .. }
+            | Command::Repair { .. }
+            | Command::Cleanup { .. }
+            // `lint` can type-check a `.gq` against a cluster graph's schema
+            // (RFC-011): `--cluster <dir> --graph <id>`.
+            | Command::Lint { .. }
+            // The policy/queries tooling addresses a cluster's applied state
+            // (RFC-011): `--cluster <dir>` selects the cluster, `--graph <id>`
+            // picks a graph's bundle/registry within it.
+            | Command::Policy { .. }
+            | Command::Queries { .. }
+    )
+}
+
+/// Reject a scope-addressing flag (`--server`/`--cluster`/`--graph`) on a verb
+/// that cannot consume it, rather than silently dropping it (the old behavior:
+/// e.g. `optimize --server prod` dropped `--server` and failed later with an
+/// unrelated message). `alias` gets an extra guard because its binding owns all
+/// addressing and several ignored globals sit outside this three-flag guard.
+/// Each flag has a distinct valid surface:
+/// - `--server` → served-graph scopes (`any`/`served`);
+/// - `--cluster` → cluster-scoped direct/control verbs;
+/// - `--graph` → any multi-graph scope: a served scope *or* a cluster one.
+/// RFC-010 Slice 1, generalized for RFC-011 cluster addressing.
 pub(crate) fn guard_addressing(cli: &Cli) -> Result<()> {
-    if cli.server.is_none() && cli.graph.is_none() {
+    if let Command::Alias { .. } = &cli.command {
+        let mut flags = Vec::new();
+        if cli.server.is_some() {
+            flags.push("--server");
+        }
+        if cli.graph.is_some() {
+            flags.push("--graph");
+        }
+        if cli.store.is_some() {
+            flags.push("--store");
+        }
+        if cli.cluster.is_some() {
+            flags.push("--cluster");
+        }
+        if cli.profile.is_some() {
+            flags.push("--profile");
+        }
+        if cli.as_actor.is_some() {
+            flags.push("--as");
+        }
+        if !flags.is_empty() {
+            bail!(
+                "`alias` uses the server, graph, and stored query declared in \
+                 `aliases.<name>` in ~/.omnigraph/config.yaml; remove global scope \
+                 flag(s): {}",
+                flags.join(", ")
+            );
+        }
+    }
+    if cli.server.is_none() && cli.cluster.is_none() && cli.graph.is_none() {
         return Ok(());
     }
     let capability = command_capability(&cli.command);
-    if capability.accepts_server_addressing() {
-        return Ok(());
-    }
     let label = command_label(&cli.command);
-    let how = match capability {
-        Capability::Direct => match cli.command {
-            Command::Init { .. } => "Pass a storage URI.",
-            _ => "Pass a storage URI, or --cluster <dir> --cluster-graph <id>.",
+    let cluster_ok = accepts_cluster_addressing(&cli.command);
+
+    if cli.server.is_some() && !capability.accepts_server_addressing() {
+        bail!(
+            "`{label}` is a {} command; --server addresses a served graph and does not apply.{}",
+            capability.describe(),
+            remediation(capability, &cli.command),
+        );
+    }
+    if cli.cluster.is_some() && !cluster_ok {
+        bail!(
+            "`{label}` is a {} command; --cluster addresses a cluster-scoped command \
+             and does not apply.{}",
+            capability.describe(),
+            remediation(capability, &cli.command),
+        );
+    }
+    if cli.graph.is_some() && !(capability.accepts_server_addressing() || cluster_ok) {
+        bail!(
+            "`{label}` is a {} command; --graph selects a graph within a server or cluster \
+             scope and does not apply.{}",
+            capability.describe(),
+            remediation(capability, &cli.command),
+        );
+    }
+    Ok(())
+}
+
+/// The "what to do instead" tail for a wrong-address error, by capability.
+/// Includes its own leading space when non-empty so the caller appends it
+/// directly — an empty tail (the served-addressing capabilities, which only
+/// reach this fn for a misplaced `--cluster`/`--graph`) leaves no trailing space.
+fn remediation(capability: Capability, cmd: &Command) -> &'static str {
match capability {
Capability::Direct => match cmd {
Command::Init { .. } => " Pass a storage URI.",
Command::Optimize { .. } | Command::Repair { .. } | Command::Cleanup { .. } => {
" Pass a storage URI, or --cluster <dir> --graph <id>."
}
_ => " Pass a storage URI.",
}, },
Capability::Control => "It operates on a cluster (pass --config <dir>).", Capability::Control => match cmd {
Capability::Local => "It does not address a graph.", Command::Cluster { .. } => {
Capability::Any | Capability::Served => { " It operates on a cluster config directory (pass --config <dir>)."
unreachable!("served-addressing capabilities returned early") }
} Command::Policy { .. } | Command::Queries { .. } => {
}; " It operates on a cluster (pass --cluster <dir|uri>, or select a cluster profile)."
bail!( }
"`{label}` is a {} command; --server/--graph address a served graph and do not apply. {how}", _ => " It operates on a cluster.",
capability.describe() },
); Capability::Local => " It does not address a graph.",
Capability::Any | Capability::Served => "",
}
} }
#[cfg(test)] #[cfg(test)]
@@ -235,11 +325,17 @@ mod tests {
// The one Data→Served refinement — if the `graphs` guard were deleted, // The one Data→Served refinement — if the `graphs` guard were deleted,
// every other assertion here would still pass. // every other assertion here would still pass.
assert_eq!(cap(&["omnigraph", "graphs", "list"]), Capability::Served); assert_eq!(cap(&["omnigraph", "graphs", "list"]), Capability::Served);
assert_eq!(cap(&["omnigraph", "alias", "who"]), Capability::Local);
assert_eq!(cap(&["omnigraph", "optimize", "graph.omni"]), Capability::Direct); assert_eq!(cap(&["omnigraph", "optimize", "graph.omni"]), Capability::Direct);
assert_eq!(cap(&["omnigraph", "schema", "plan", "--schema", "s.pg", "graph.omni"]), Capability::Direct); assert_eq!(cap(&["omnigraph", "schema", "plan", "--schema", "s.pg", "graph.omni"]), Capability::Direct);
assert_eq!(cap(&["omnigraph", "cluster", "status", "--config", "."]), Capability::Control); assert_eq!(cap(&["omnigraph", "cluster", "status", "--config", "."]), Capability::Control);
assert_eq!(cap(&["omnigraph", "version"]), Capability::Local); assert_eq!(cap(&["omnigraph", "version"]), Capability::Local);
assert_eq!(cap(&["omnigraph", "queries", "list"]), Capability::Local); // `queries`/`policy` tooling reads cluster state now (control plane).
assert_eq!(cap(&["omnigraph", "queries", "list"]), Capability::Control);
assert_eq!(
cap(&["omnigraph", "policy", "validate"]),
Capability::Control
);
} }
#[test] #[test]
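The trailing-space rule in `remediation` can be sketched in isolation. This is a minimal sketch, assuming a reduced `Capability` enum and a hypothetical `wrong_address_message` helper; only the `direct (storage-native)` description and the two non-empty tails are taken from the diff, the other strings are placeholders.

```rust
// Simplified stand-ins for the CLI's types (assumption: the real enum has
// more variants and `describe` has different text for most of them).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Capability {
    Direct,
    Local,
    Served,
}

impl Capability {
    fn describe(self) -> &'static str {
        match self {
            Capability::Direct => "direct (storage-native)",
            Capability::Local => "local",   // placeholder wording
            Capability::Served => "served", // placeholder wording
        }
    }
}

/// The "what to do instead" tail. A non-empty tail carries its own leading
/// space, so an empty tail leaves nothing dangling after the final period.
fn remediation(capability: Capability) -> &'static str {
    match capability {
        Capability::Direct => " Pass a storage URI.",
        Capability::Local => " It does not address a graph.",
        Capability::Served => "",
    }
}

/// Hypothetical helper: builds the message the way the guard's `bail!` does,
/// appending the tail directly with no separator.
fn wrong_address_message(label: &str, capability: Capability, flag: &str) -> String {
    format!(
        "`{label}` is a {} command; {flag} does not apply.{}",
        capability.describe(),
        remediation(capability)
    )
}
```

Putting the separator inside the tail rather than in the format string is what lets the served-addressing case end cleanly at the period.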

@@ -1,9 +1,31 @@
use clap::ValueEnum;
use color_eyre::eyre::Result; use color_eyre::eyre::Result;
use omnigraph_server::ReadOutputFormat;
use omnigraph_server::api::ReadOutput; use omnigraph_server::api::ReadOutput;
use omnigraph_server::config::TableCellLayout; use serde::{Deserialize, Serialize};
use serde_json::{Map, Value}; use serde_json::{Map, Value};
/// Output rendering format for read-shaped commands (`read`/`query`/`alias`).
/// A CLI presentation concern — lives here, not in the server.
#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)]
#[serde(rename_all = "snake_case")]
pub enum ReadOutputFormat {
#[default]
Table,
Kv,
Csv,
Jsonl,
Json,
}
/// How an over-wide table cell is laid out when rendering `--format table`.
#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)]
#[serde(rename_all = "snake_case")]
pub enum TableCellLayout {
#[default]
Truncate,
Wrap,
}
pub struct ReadRenderOptions { pub struct ReadRenderOptions {
pub max_column_width: usize, pub max_column_width: usize,
pub cell_layout: TableCellLayout, pub cell_layout: TableCellLayout,

View file

@@ -42,6 +42,7 @@ pub(crate) struct ScopeFlags<'a> {
pub(crate) profile: Option<&'a str>, pub(crate) profile: Option<&'a str>,
pub(crate) store: Option<&'a str>, pub(crate) store: Option<&'a str>,
pub(crate) server: Option<&'a str>, pub(crate) server: Option<&'a str>,
pub(crate) cluster: Option<&'a str>,
pub(crate) graph: Option<&'a str>, pub(crate) graph: Option<&'a str>,
pub(crate) uri: Option<String>, pub(crate) uri: Option<String>,
} }
@@ -56,17 +57,49 @@ pub(crate) fn resolve_scope(
capability: Capability, capability: Capability,
flags: ScopeFlags<'_>, flags: ScopeFlags<'_>,
) -> Result<ResolvedScope> { ) -> Result<ResolvedScope> {
// `--store` is its own way to address a graph; combining it with a positional // At most one explicit scope primitive may address a command — a positional
// URI or `--server` is a contradiction, not a silent precedence. // URI, `--store`, `--server`, or `--cluster` are mutually exclusive ways to
if flags.store.is_some() && (flags.uri.is_some() || flags.server.is_some()) { // name the graph. Combining them is a contradiction, not a silent precedence.
let primitives: Vec<&str> = [
flags.uri.as_deref().map(|_| "a positional URI"),
flags.store.map(|_| "--store"),
flags.server.map(|_| "--server"),
flags.cluster.map(|_| "--cluster"),
]
.into_iter()
.flatten()
.collect();
if primitives.len() > 1 {
bail!( bail!(
"--store is exclusive with a positional URI and --server — pick one way to \ "{} are mutually exclusive — pick one way to address the graph",
address the graph" primitives.join(" and ")
); );
} }
// 1. Any explicit address wins; reproduce today's behavior untouched.
// `--store` is an explicit store URI — fold it into `uri`. // 1a. `--cluster` is the cluster scope primitive (maintenance): resolve its
// root + select the graph with `--graph`.
if let Some(cluster) = flags.cluster {
return scope_from_binding(
op,
capability,
ScopeBinding::Cluster(cluster.to_string()),
flags.graph.map(str::to_string),
"--cluster",
);
}
// 1b. Any other explicit address wins; reproduce today's behavior untouched.
// `--store` is an explicit store URI — fold it into `uri`.
if flags.uri.is_some() || flags.server.is_some() || flags.store.is_some() { if flags.uri.is_some() || flags.server.is_some() || flags.store.is_some() {
// `--graph` selects within a multi-graph scope; a bare positional URI /
// `--store` is already a single graph, so a stray `--graph` is an error
// rather than a silently-dropped flag.
if flags.graph.is_some() && flags.server.is_none() {
bail!(
"--graph selects a graph within a server or cluster scope; a positional \
URI / --store is already a single graph"
);
}
return Ok(ResolvedScope { return Ok(ResolvedScope {
server: flags.server.map(str::to_string), server: flags.server.map(str::to_string),
graph: flags.graph.map(str::to_string), graph: flags.graph.map(str::to_string),
@@ -107,6 +140,18 @@ pub(crate) fn resolve_scope(
); );
} }
// 3b. Flat default store scope — the zero-flag local-dev default (RFC-011).
// Mutually exclusive with `defaults.server` (enforced at config load).
if let Some(store) = op.default_store() {
return scope_from_binding(
op,
capability,
ScopeBinding::Store(store.to_string()),
flags.graph.map(str::to_string),
"operator defaults",
);
}
// 4. Nothing resolved — leave the tuple empty; downstream falls through to // 4. Nothing resolved — leave the tuple empty; downstream falls through to
// today's behavior (legacy `cli.graph` default or a no-address error). // today's behavior (legacy `cli.graph` default or a no-address error).
Ok(ResolvedScope::default()) Ok(ResolvedScope::default())
@@ -128,8 +173,8 @@ fn scope_from_binding(
if capability == Capability::Direct { if capability == Capability::Direct {
bail!( bail!(
"this command needs direct storage access, but {source} resolves a \ "this command needs direct storage access, but {source} resolves a \
server scope; name storage explicitly with --store <uri> (or a \ server scope; name storage explicitly with --store <uri> (or \
--cluster/--cluster-graph for a managed graph)" --cluster <dir> --graph <id> for a managed graph)"
); );
} }
Ok(ResolvedScope { Ok(ResolvedScope {
@@ -141,23 +186,25 @@ fn scope_from_binding(
ScopeBinding::Cluster(cluster) => { ScopeBinding::Cluster(cluster) => {
if capability == Capability::Any { if capability == Capability::Any {
bail!( bail!(
"{source} resolves a cluster scope, which is maintenance-only; run \ "{source} resolves a cluster scope, which is not valid for graph data \
data commands through a server, or use --store <uri> for ad-hoc \ commands; run data commands through a server, or use --store <uri> \
direct access" for ad-hoc direct access"
); );
} }
// A cluster binding is a config name (resolved against `clusters:`) // A cluster value is a config name (resolved against `clusters:`)
// or a literal root URI. // or a literal root: an `s3://`/`file://` URI or a local cluster
let root = if let Some(root) = op.cluster_root(&cluster) { // directory. Only a configured name is rewritten; anything else is
root.to_string() // passed through to the cluster-state resolver verbatim, so a bare
} else if cluster.contains("://") { // directory path keeps working as it did for per-command `--cluster`.
cluster let root = op
} else { .cluster_root(&cluster)
bail!( .map(str::to_string)
"unknown cluster '{cluster}' ({source}); define it under `clusters:` \ .unwrap_or(cluster);
in operator config, or use a literal root URI" // A cluster holds many graphs; maintenance addresses one at a time.
); // When no `--graph`/`default_graph` is given, leave `cluster_graph`
}; // empty and defer to the async storage-URI resolver (RFC-011 D7),
// which enumerates the catalog: auto-use a sole graph, else error
// and list the candidates.
Ok(ResolvedScope { Ok(ResolvedScope {
cluster: Some(root), cluster: Some(root),
cluster_graph: graph, cluster_graph: graph,
@@ -192,6 +239,7 @@ mod tests {
profile: None, profile: None,
store: None, store: None,
server: None, server: None,
cluster: None,
graph: None, graph: None,
uri: None, uri: None,
} }
@@ -230,7 +278,7 @@ mod tests {
} }
#[test] #[test]
fn store_is_exclusive_with_positional_uri_and_server() { fn scope_primitives_are_mutually_exclusive() {
let op = OperatorConfig::default(); let op = OperatorConfig::default();
for flags in [ for flags in [
ScopeFlags { ScopeFlags {
@@ -243,12 +291,128 @@ mod tests {
server: Some("prod"), server: Some("prod"),
..flags() ..flags()
}, },
ScopeFlags {
cluster: Some("./brain"),
uri: Some("file://other.omni".into()),
..flags()
},
ScopeFlags {
cluster: Some("./brain"),
server: Some("prod"),
..flags()
},
] { ] {
let err = resolve_scope(&op, Capability::Any, flags).unwrap_err().to_string(); let err = resolve_scope(&op, Capability::Direct, flags)
assert!(err.contains("--store is exclusive"), "{err}"); .unwrap_err()
.to_string();
assert!(err.contains("mutually exclusive"), "{err}");
} }
} }
#[test]
fn cluster_flag_resolves_root_and_graph_for_maintenance() {
let op = cfg("clusters:\n brain:\n root: s3://acme/brain\n");
let scope = resolve_scope(
&op,
Capability::Direct,
ScopeFlags {
cluster: Some("brain"),
graph: Some("knowledge"),
..flags()
},
)
.unwrap();
assert_eq!(scope.cluster.as_deref(), Some("s3://acme/brain"));
assert_eq!(scope.cluster_graph.as_deref(), Some("knowledge"));
}
#[test]
fn cluster_flag_accepts_a_literal_root_uri() {
let op = OperatorConfig::default();
let scope = resolve_scope(
&op,
Capability::Direct,
ScopeFlags {
cluster: Some("s3://bucket/clusters/brain"),
graph: Some("knowledge"),
..flags()
},
)
.unwrap();
assert_eq!(scope.cluster.as_deref(), Some("s3://bucket/clusters/brain"));
assert_eq!(scope.cluster_graph.as_deref(), Some("knowledge"));
}
#[test]
fn cluster_scope_without_a_graph_defers_to_catalog_enumeration() {
// RFC-011 D7: with no `--graph`/`default_graph`, resolution no longer
// bails here — it resolves the cluster root and leaves `cluster_graph`
// empty, deferring to the async storage-URI resolver (which enumerates
// the catalog: auto-use a sole graph, else error listing candidates).
let op = cfg("clusters:\n brain:\n root: s3://acme/brain\n");
let scope = resolve_scope(
&op,
Capability::Direct,
ScopeFlags {
cluster: Some("brain"),
..flags()
},
)
.unwrap();
assert_eq!(scope.cluster.as_deref(), Some("s3://acme/brain"));
assert_eq!(scope.cluster_graph, None);
}
#[test]
fn graph_on_a_bare_store_or_uri_is_rejected() {
let op = OperatorConfig::default();
for flags in [
ScopeFlags {
uri: Some("graph.omni".into()),
graph: Some("knowledge"),
..flags()
},
ScopeFlags {
store: Some("s3://b/g.omni"),
graph: Some("knowledge"),
..flags()
},
] {
let err = resolve_scope(&op, Capability::Any, flags)
.unwrap_err()
.to_string();
assert!(err.contains("already a single graph"), "{err}");
}
}
#[test]
fn flat_default_store_drives_local_verbs() {
// RFC-011: `defaults.store` is the zero-flag local default — no flags,
// no profile → the store URI resolves as the (single-graph) store scope.
let op = cfg("defaults:\n store: file:///tmp/dev.omni\n");
let scope = resolve_scope(&op, Capability::Any, flags()).unwrap();
assert_eq!(scope.uri.as_deref(), Some("file:///tmp/dev.omni"));
assert_eq!(scope.server, None);
}
#[test]
fn flat_default_store_rejects_graph() {
// A store is already a single graph, so `--graph` against a default
// store is a loud error.
let op = cfg("defaults:\n store: file:///tmp/dev.omni\n");
let err = resolve_scope(
&op,
Capability::Any,
ScopeFlags {
graph: Some("knowledge"),
..flags()
},
)
.unwrap_err()
.to_string();
assert!(err.contains("does not apply to a store scope"), "{err}");
}
#[test] #[test]
fn flat_default_server_drives_data_verbs() { fn flat_default_server_drives_data_verbs() {
let op = cfg("defaults:\n server: prod\n default_graph: knowledge\nservers:\n prod:\n url: https://x\n"); let op = cfg("defaults:\n server: prod\n default_graph: knowledge\nservers:\n prod:\n url: https://x\n");
@@ -294,6 +458,27 @@ mod tests {
assert_eq!(scope.cluster_graph.as_deref(), Some("knowledge")); assert_eq!(scope.cluster_graph.as_deref(), Some("knowledge"));
} }
#[test]
fn profile_cluster_scope_with_graph_override() {
// The deferral closed by this slice: a `--graph` flag overrides a
// profile cluster's default_graph, exactly as it does for a server scope.
let op = cfg(
"clusters:\n brain:\n root: s3://acme/brain\nprofiles:\n admin:\n cluster: brain\n default_graph: knowledge\n",
);
let scope = resolve_scope(
&op,
Capability::Direct,
ScopeFlags {
profile: Some("admin"),
graph: Some("archive"),
..flags()
},
)
.unwrap();
assert_eq!(scope.cluster.as_deref(), Some("s3://acme/brain"));
assert_eq!(scope.cluster_graph.as_deref(), Some("archive")); // flag beats profile default
}
#[test] #[test]
fn server_scope_on_maintenance_verb_errors() { fn server_scope_on_maintenance_verb_errors() {
let op = cfg("defaults:\n server: prod\nservers:\n prod:\n url: https://x\n"); let op = cfg("defaults:\n server: prod\nservers:\n prod:\n url: https://x\n");
@@ -316,7 +501,7 @@ mod tests {
) )
.unwrap_err() .unwrap_err()
.to_string(); .to_string();
assert!(err.contains("maintenance-only"), "{err}"); assert!(err.contains("not valid for graph data commands"), "{err}");
} }
#[test] #[test]
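The "at most one scope primitive" rule at the top of `resolve_scope` can be tried on its own. This is a minimal sketch, assuming a trimmed-down `ScopeFlags` (no `profile` or `graph`) and a hypothetical `check_exclusive` function that returns a plain `String` error instead of the `eyre` result used in the diff.

```rust
/// Trimmed-down flags struct (assumption: the real one also carries
/// `profile` and `graph`, which do not take part in this check).
#[derive(Default)]
struct ScopeFlags<'a> {
    uri: Option<String>,
    store: Option<&'a str>,
    server: Option<&'a str>,
    cluster: Option<&'a str>,
}

/// Collect the display name of every primitive that is set and fail when
/// more than one is present. No primitive silently takes precedence.
fn check_exclusive(flags: &ScopeFlags<'_>) -> Result<(), String> {
    let candidates = [
        flags.uri.as_deref().map(|_| "a positional URI"),
        flags.store.map(|_| "--store"),
        flags.server.map(|_| "--server"),
        flags.cluster.map(|_| "--cluster"),
    ];
    // `iter().flatten()` keeps only the `Some` entries, in declaration order.
    let primitives: Vec<&str> = candidates.iter().flatten().copied().collect();
    if primitives.len() > 1 {
        return Err(format!(
            "{} are mutually exclusive; pick one way to address the graph",
            primitives.join(" and ")
        ));
    }
    Ok(())
}
```

Because the message is built from whichever primitives are actually set, adding a new primitive only needs one more array entry rather than a new hand-written error string.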

@@ -683,51 +683,8 @@ fn cluster_apply_locked_exits_nonzero() {
assert!(!temp.path().join("__cluster/resources").exists()); assert!(!temp.path().join("__cluster/resources").exists());
} }
#[test] /// RFC-011: the actor chain is `--as` > `operator.actor` > none. The CLI no
fn cluster_apply_uses_cli_actor_from_local_config() { /// longer reads omnigraph.yaml `cli.actor`.
let temp = tempdir().unwrap();
write_cluster_config_fixture(temp.path());
fs::write(
temp.path().join("omnigraph.yaml"),
"cli:\n actor: act-local\n",
)
.unwrap();
// Phase 1: import once (setup, not under test).
let output = cli()
.current_dir(temp.path())
.arg("cluster")
.arg("import")
.arg("--config")
.arg(temp.path())
.output()
.unwrap();
assert!(output.status.success(), "{output:?}");
// Phase 2: apply alone, capturing the echoed actor (idempotent re-runs).
let apply = |extra: &[&str]| {
let mut command = cli();
command.current_dir(temp.path());
for arg in extra {
command.arg(arg);
}
let output = command
.arg("cluster")
.arg("apply")
.arg("--config")
.arg(temp.path())
.arg("--json")
.output()
.unwrap();
let json: serde_json::Value =
serde_json::from_str(String::from_utf8_lossy(&output.stdout).trim()).unwrap();
json["actor"].clone()
};
assert_eq!(apply(&[]), "act-local", "cli.actor is the no-flag default");
assert_eq!(apply(&["--as", "andrew"]), "andrew", "--as overrides cli.actor");
}
/// RFC-007 PR 1: the operator layer joins the actor chain —
/// `--as` > legacy `cli.actor` (RFC-008 window) > `operator.actor` > none.
#[test] #[test]
fn cluster_apply_uses_operator_actor_from_omnigraph_home() { fn cluster_apply_uses_operator_actor_from_omnigraph_home() {
let temp = tempdir().unwrap(); let temp = tempdir().unwrap();
@@ -771,41 +728,31 @@ fn cluster_apply_uses_operator_actor_from_omnigraph_home() {
json["actor"].clone() json["actor"].clone()
}; };
// No --as, no omnigraph.yaml: the operator identity applies. // No --as: the operator identity applies.
assert_eq!( assert_eq!(
apply(&[]), apply(&[]),
"act-operator", "act-operator",
"operator.actor is the no-flag, no-legacy-config default" "operator.actor is the no-flag default"
); );
// --as still wins over everything. // --as still wins over the operator layer.
assert_eq!(apply(&["--as", "andrew"]), "andrew"); assert_eq!(apply(&["--as", "andrew"]), "andrew");
// A legacy cli.actor (RFC-008 window) outranks the operator layer.
fs::write(
temp.path().join("omnigraph.yaml"),
"cli:\n actor: act-legacy\n",
)
.unwrap();
assert_eq!(
apply(&[]),
"act-legacy",
"legacy cli.actor wins over operator.actor during the deprecation window"
);
} }
#[test] #[test]
fn cluster_approve_uses_cli_actor_fallback() { fn cluster_approve_uses_operator_actor_fallback() {
let temp = tempdir().unwrap(); let temp = tempdir().unwrap();
write_cluster_config_fixture(temp.path()); write_cluster_config_fixture(temp.path());
let operator_home = tempdir().unwrap();
fs::write( fs::write(
temp.path().join("omnigraph.yaml"), operator_home.path().join("config.yaml"),
"cli:\n actor: act-local\n", "operator:\n actor: act-operator\n",
) )
.unwrap(); .unwrap();
// Converge, then remove the graph so a gated delete is pending. // Converge, then remove the graph so a gated delete is pending.
for command in ["import", "apply"] { for command in ["import", "apply"] {
let output = cli() let output = cli()
.current_dir(temp.path()) .current_dir(temp.path())
.env("OMNIGRAPH_HOME", operator_home.path())
.arg("cluster") .arg("cluster")
.arg(command) .arg(command)
.arg("--config") .arg("--config")
@@ -818,6 +765,7 @@ fn cluster_approve_uses_cli_actor_fallback() {
let output = cli() let output = cli()
.current_dir(temp.path()) .current_dir(temp.path())
.env("OMNIGRAPH_HOME", operator_home.path())
.arg("cluster") .arg("cluster")
.arg("approve") .arg("approve")
.arg("graph.knowledge") .arg("graph.knowledge")
@@ -829,14 +777,17 @@ fn cluster_approve_uses_cli_actor_fallback() {
assert!(output.status.success(), "{output:?}"); assert!(output.status.success(), "{output:?}");
let json: serde_json::Value = let json: serde_json::Value =
serde_json::from_str(String::from_utf8_lossy(&output.stdout).trim()).unwrap(); serde_json::from_str(String::from_utf8_lossy(&output.stdout).trim()).unwrap();
assert_eq!(json["approved_by"], "act-local"); assert_eq!(json["approved_by"], "act-operator");
// With neither flag nor config: refused with the actionable message. // With neither flag nor operator config: refused with the actionable
// message (an approval without an approver is meaningless).
let bare = tempdir().unwrap(); let bare = tempdir().unwrap();
write_cluster_config_fixture(bare.path()); write_cluster_config_fixture(bare.path());
let bare_home = tempdir().unwrap();
let output = output_failure( let output = output_failure(
cli() cli()
.current_dir(bare.path()) .current_dir(bare.path())
.env("OMNIGRAPH_HOME", bare_home.path())
.arg("cluster") .arg("cluster")
.arg("approve") .arg("approve")
.arg("graph.knowledge") .arg("graph.knowledge")
@@ -845,11 +796,13 @@ fn cluster_approve_uses_cli_actor_fallback() {
); );
let stderr = String::from_utf8_lossy(&output.stderr); let stderr = String::from_utf8_lossy(&output.stderr);
assert!(stderr.contains("--as"), "{stderr}"); assert!(stderr.contains("--as"), "{stderr}");
assert!(stderr.contains("cli.actor"), "{stderr}");
} }
#[test] #[test]
fn cluster_commands_ignore_malformed_local_config() { fn cluster_commands_ignore_legacy_omnigraph_yaml() {
// RFC-011: the CLI never reads omnigraph.yaml for cluster commands — a
// present (even malformed) legacy file is inert. The actor falls back to
// `operator.actor`, then to none (no loud failure on absence).
let temp = tempdir().unwrap(); let temp = tempdir().unwrap();
write_cluster_config_fixture(temp.path()); write_cluster_config_fixture(temp.path());
fs::write(temp.path().join("omnigraph.yaml"), "{{{{ not yaml").unwrap(); fs::write(temp.path().join("omnigraph.yaml"), "{{{{ not yaml").unwrap();
@@ -873,14 +826,11 @@ fn cluster_commands_ignore_malformed_local_config() {
"cluster {command} touched omnigraph.yaml" "cluster {command} touched omnigraph.yaml"
); );
} }
// import + apply with an explicit --as: the config is never loaded. // import + apply (no --as, no operator config): the legacy file is never
for (command, args) in [("import", vec![]), ("apply", vec!["--as", "andrew"])] { // loaded and the no-actor apply succeeds (actor defaults to none).
let mut invocation = cli(); for command in ["import", "apply"] {
invocation.current_dir(temp.path()); let output = cli()
for arg in &args { .current_dir(temp.path())
invocation.arg(arg);
}
let output = invocation
.arg("cluster") .arg("cluster")
.arg(command) .arg(command)
.arg("--config") .arg("--config")
@@ -893,20 +843,6 @@ fn cluster_commands_ignore_malformed_local_config() {
String::from_utf8_lossy(&output.stderr) String::from_utf8_lossy(&output.stderr)
); );
} }
// Only the no-flag actor lookup is allowed to fail, and loudly.
let output = output_failure(
cli()
.current_dir(temp.path())
.arg("cluster")
.arg("apply")
.arg("--config")
.arg(temp.path()),
);
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("omnigraph.yaml") && stderr.contains("--as"),
"the actor-default config read must fail loudly and actionably: {stderr}"
);
} }
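The actor chain these tests pin down (`--as` > `operator.actor` > none) reduces to a two-step fallback. This is a minimal sketch, assuming a hypothetical `resolve_actor` function and plain `Option<&str>` inputs; how the real CLI loads `operator.actor` from `$OMNIGRAPH_HOME/config.yaml` is not shown here.

```rust
/// Hypothetical resolver for the post-RFC-011 actor chain:
/// the `--as` flag wins, then the operator config, otherwise no actor.
/// The legacy `cli.actor` from omnigraph.yaml is deliberately not an input.
fn resolve_actor(as_flag: Option<&str>, operator_actor: Option<&str>) -> Option<String> {
    as_flag.or(operator_actor).map(str::to_string)
}

/// Hypothetical check for verbs such as `cluster approve`, where an
/// approval without an approver is meaningless and must be refused.
fn require_actor(actor: Option<String>) -> Result<String, String> {
    actor.ok_or_else(|| "no actor resolved; pass --as <actor>".to_string())
}
```

Under this shape, `apply` can proceed with `None` while `approve` routes the same value through `require_actor`, which matches the split the tests above assert.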
#[test] #[test]
@@ -975,7 +911,7 @@ fn optimize_resolves_a_cluster_graph_by_id() {
.arg("optimize") .arg("optimize")
.arg("--cluster") .arg("--cluster")
.arg(temp.path()) .arg(temp.path())
.arg("--cluster-graph") .arg("--graph")
.arg("knowledge") .arg("knowledge")
.arg("--json"), .arg("--json"),
); );
@@ -994,7 +930,7 @@ fn optimize_unknown_cluster_graph_id_errors() {
.arg("optimize") .arg("optimize")
.arg("--cluster") .arg("--cluster")
.arg(temp.path()) .arg(temp.path())
.arg("--cluster-graph") .arg("--graph")
.arg("does-not-exist") .arg("does-not-exist")
.arg("--json"), .arg("--json"),
); );
@@ -1006,19 +942,80 @@ fn optimize_unknown_cluster_graph_id_errors() {
} }
#[test] #[test]
fn cluster_flag_requires_cluster_graph() { fn optimize_auto_uses_the_sole_cluster_graph() {
// clap enforces both-or-neither. // RFC-011 D7: a cluster with exactly one applied graph needs no --graph —
// the resolver enumerates the catalog and uses the only candidate.
let temp = applied_knowledge_cluster();
let out = output_success(
cli()
.arg("optimize")
.arg("--cluster")
.arg(temp.path())
.arg("--json"),
);
assert!(
parse_stdout_json(&out)["tables"].as_array().is_some(),
"optimize should auto-resolve the sole cluster graph"
);
}
/// Stand up an applied cluster with two graphs (`knowledge`, `archive`).
fn applied_two_graph_cluster() -> tempfile::TempDir {
let temp = tempdir().unwrap();
let root = temp.path();
fs::write(
root.join("people.pg"),
"node Person {\n name: String @key\n age: I32?\n}\n",
)
.unwrap();
fs::write(root.join("base.policy.yaml"), "rules: []\n").unwrap();
fs::write(
root.join("cluster.yaml"),
r#"
version: 1
metadata:
name: two-graph
state:
backend: cluster
lock: true
graphs:
knowledge:
schema: ./people.pg
archive:
schema: ./people.pg
policies:
base:
file: ./base.policy.yaml
applies_to: [knowledge, archive]
"#,
)
.unwrap();
init_named_cluster_graph(root, "knowledge", "people.pg");
init_named_cluster_graph(root, "archive", "people.pg");
assert_eq!(cluster_json(root, "import")["ok"], true);
assert_eq!(cluster_json(root, "apply")["converged"], true);
temp
}
#[test]
fn optimize_on_multi_graph_cluster_without_graph_lists_candidates() {
// RFC-011 D7: >1 graph and no --graph → error naming every candidate,
// never an auto-pick.
let temp = applied_two_graph_cluster();
let out = output_failure( let out = output_failure(
cli() cli()
.arg("optimize") .arg("optimize")
.arg("--cluster") .arg("--cluster")
.arg(".") .arg(temp.path())
.arg("--json"), .arg("--json"),
); );
let stderr = String::from_utf8_lossy(&out.stderr); let stderr = String::from_utf8_lossy(&out.stderr);
assert!( assert!(
stderr.contains("cluster-graph") || stderr.contains("required"), stderr.contains("2 graphs")
"expected --cluster to require --cluster-graph; got: {stderr}" && stderr.contains("archive")
&& stderr.contains("knowledge")
&& stderr.contains("--graph <id>"),
"expected a candidate-listing error; got: {stderr}"
); );
} }
@@ -1042,6 +1039,47 @@ fn init_refuses_a_cluster_managed_path_and_signposts_cluster_apply() {
assert!(!temp.path().join("graphs").join("sneaky.omni").exists()); assert!(!temp.path().join("graphs").join("sneaky.omni").exists());
} }
#[test]
fn schema_apply_refuses_a_cluster_managed_graph_and_signposts_cluster_apply() {
// RFC-011 Decision 10: a direct `schema apply` against a cluster-managed
// graph's storage root would bypass the ledger/recovery/approvals, so it is
// refused and points at `cluster apply` (mirrors `init`'s refusal).
let temp = applied_knowledge_cluster();
// A schema that WOULD change the graph (adds `bio`) — so the no-mutation
// assertion below is meaningful, not a no-op re-apply.
fs::write(
temp.path().join("people_v2.pg"),
"node Person {\n name: String @key\n age: I32?\n bio: String?\n}\n",
)
.unwrap();
let out = output_failure(
cli()
.arg("schema")
.arg("apply")
.arg("--schema")
.arg(temp.path().join("people_v2.pg"))
.arg("--store")
.arg(temp.path().join("graphs").join("knowledge.omni")),
);
let stderr = String::from_utf8_lossy(&out.stderr);
assert!(
stderr.contains("cluster apply"),
"schema apply against a cluster-managed graph should signpost `cluster apply`; got: {stderr}"
);
// And it bailed BEFORE mutating: the live schema still lacks `bio`.
let show = output_success(
cli()
.arg("schema")
.arg("show")
.arg(temp.path().join("graphs").join("knowledge.omni")),
);
assert!(
!stdout_string(&show).contains("bio"),
"the refused apply must not have changed the live schema; got: {}",
stdout_string(&show)
);
}
#[test] #[test]
fn init_outside_a_cluster_still_works() { fn init_outside_a_cluster_still_works() {
// Regression guard: ordinary init (no cluster layout) is unaffected. // Regression guard: ordinary init (no cluster layout) is unaffected.
@@ -1076,7 +1114,7 @@ fn optimize_by_cluster_works_when_catalog_payloads_are_degraded() {
.arg("optimize") .arg("optimize")
.arg("--cluster") .arg("--cluster")
.arg(temp.path()) .arg(temp.path())
.arg("--cluster-graph") .arg("--graph")
.arg("knowledge") .arg("knowledge")
.arg("--json"), .arg("--json"),
); );
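The RFC-011 D7 selection rule exercised by the `optimize` tests (use `--graph` when given, auto-use a sole graph, otherwise list candidates) can be sketched as a pure function. This is a minimal sketch, assuming a hypothetical `select_cluster_graph` over an in-memory list of graph ids; the real resolver is async and enumerates the cluster catalog, and its exact error wording is not shown in this diff.

```rust
/// Hypothetical selector: `requested` is the `--graph`/`default_graph`
/// value, `catalog` the ids of the graphs applied in the cluster.
fn select_cluster_graph(requested: Option<&str>, catalog: &[&str]) -> Result<String, String> {
    if let Some(id) = requested {
        // An explicit id must exist in the catalog.
        return if catalog.iter().any(|g| *g == id) {
            Ok(id.to_string())
        } else {
            Err(format!("unknown cluster graph '{id}'"))
        };
    }
    match catalog {
        [] => Err("cluster has no applied graphs".to_string()),
        // Exactly one graph: use it without asking for `--graph`.
        [only] => Ok((*only).to_string()),
        // Several graphs: never auto-pick, name every candidate instead.
        many => {
            let mut ids: Vec<&str> = many.to_vec();
            ids.sort_unstable();
            Err(format!(
                "cluster has {} graphs ({}); pass --graph <id>",
                ids.len(),
                ids.join(", ")
            ))
        }
    }
}
```

Sorting the candidate list keeps the error text stable regardless of catalog enumeration order, which matters when the message is asserted on in tests.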

@@ -3,6 +3,7 @@
use std::fs; use std::fs;
use omnigraph::db::Omnigraph;
use tempfile::tempdir; use tempfile::tempdir;
mod support; mod support;
@@ -236,27 +237,28 @@ fn cluster_e2e_out_of_band_schema_drift_then_apply_converges_it() {
let apply = cluster_json(temp.path(), "apply"); let apply = cluster_json(temp.path(), "apply");
assert_eq!(apply["converged"], true, "{apply}"); assert_eq!(apply["converged"], true, "{apply}");
// Out-of-band: the live graph evolves, cluster.yaml stays put. // Out-of-band: the live graph evolves while cluster.yaml stays put. RFC-011
fs::write( // D10 makes the CLI `schema apply` refuse a cluster-managed graph, so this
temp.path().join("people_v2.pg"), // simulates a true bypass — a direct engine apply against the storage root,
r#" // exactly the drift the control plane must still detect and converge.
let people_v2 = r#"
node Person { node Person {
name: String @key name: String @key
age: I32? age: I32?
bio: String? bio: String?
} }
"#, "#;
) tokio::runtime::Runtime::new().unwrap().block_on(async {
.unwrap(); let db = Omnigraph::open(
output_success( temp.path()
cli() .join("graphs/knowledge.omni")
.arg("schema") .to_string_lossy()
.arg("apply") .as_ref(),
.arg(temp.path().join("graphs/knowledge.omni")) )
.arg("--schema") .await
.arg(temp.path().join("people_v2.pg")) .unwrap();
.arg("--json"), db.apply_schema(people_v2).await.unwrap();
); });
// Drift is visible... // Drift is visible...
let refresh = cluster_json(temp.path(), "refresh"); let refresh = cluster_json(temp.path(), "refresh");

@@ -165,12 +165,87 @@ fn optimize_with_server_flag_errors_wrong_plane() {
let stderr = String::from_utf8_lossy(&output.stderr); let stderr = String::from_utf8_lossy(&output.stderr);
assert!( assert!(
stderr.contains("`optimize` is a direct (storage-native) command") stderr.contains("`optimize` is a direct (storage-native) command")
&& stderr.contains("--server/--graph address a served graph and do not apply") && stderr.contains("--server addresses a served graph and does not apply")
&& stderr.contains("Pass a storage URI, or --cluster <dir> --cluster-graph <id>."), && stderr.contains("Pass a storage URI, or --cluster <dir> --graph <id>."),
"wrong-capability guard message not found; got: {stderr}" "wrong-capability guard message not found; got: {stderr}"
); );
} }
#[test]
fn wrong_address_guard_message_has_no_trailing_space() {
// The remediation tail is empty for served-addressing capabilities, so a
// misplaced --cluster on a data verb must not leave "… does not apply. "
// with a dangling space (error text is observable contract). NO_COLOR keeps
// the assertion off ANSI styling.
let output = output_failure(
cli()
.env("NO_COLOR", "1")
.arg("query")
.arg("--cluster")
.arg("./brain")
.arg("-e")
.arg("query q { Person { id } }"),
);
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("and does not apply."),
"expected the wrong-address message; got: {stderr}"
);
assert!(
!stderr.contains("and does not apply. "),
"trailing space after the message; got: {stderr}"
);
}
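The trailing-space contract tested above can be illustrated with a minimal sketch. `guard_message` and its parameters are hypothetical names chosen for illustration, not the CLI's actual implementation; the only behavior taken from the test is that an empty remediation tail must not leave a space after "does not apply."

```rust
/// Hypothetical helper (not from the OmniGraph source): builds a
/// wrong-address message and appends the remediation tail only when it is
/// non-empty, so no dangling space follows the final period.
fn guard_message(reason: &str, remediation: &str) -> String {
    let head = format!("{reason} and does not apply.");
    match remediation.trim() {
        // Empty tail: return the sentence as-is, with no separator.
        "" => head,
        // Non-empty tail: exactly one space between sentence and tail.
        tail => format!("{head} {tail}"),
    }
}
```

Joining conditionally, rather than always formatting `"{head} {tail}"`, is what keeps the observable error text free of a trailing space.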
#[test]
fn graph_flag_on_a_positional_uri_errors() {
// RFC-011: `--graph` selects within a multi-graph scope (a server or
// cluster). An explicit `--store <uri>` is already a single graph, so
// pairing it with `--graph` is a loud error, not a silently-dropped flag.
// (The guard lets `--graph` reach a data verb; the scope resolver rejects
// it.)
let temp = tempdir().unwrap();
let graph = graph_path(temp.path());
init_graph(&graph);
let output = output_failure(
cli()
.arg("query")
.arg("--store")
.arg(&graph)
.arg("--graph")
.arg("knowledge")
.arg("-e")
.arg("query q { Person { id } }"),
);
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("already a single graph"),
"expected --graph-on-explicit-store rejection; got: {stderr}"
);
}
#[test]
fn query_by_name_against_a_store_needs_a_server() {
// RFC-011 D3: by-name (catalog) invocation is served-only — the catalog is
// server-owned, so a bare `--store` has nothing to resolve the name
// against. The ad-hoc lane (`-e`/`--query`) is the local alternative.
let temp = tempdir().unwrap();
let graph = graph_path(temp.path());
init_graph(&graph);
let output = output_failure(
cli()
.arg("query")
.arg("find_people")
.arg("--store")
.arg(&graph),
);
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("needs a server"),
"expected a served-only by-name error; got: {stderr}"
);
}
#[test] #[test]
fn optimize_with_remote_target_errors_storage_plane() { fn optimize_with_remote_target_errors_storage_plane() {
// RFC-010 Slice 1: a maintenance verb pointed at a remote URI fails loudly // RFC-010 Slice 1: a maintenance verb pointed at a remote URI fails loudly
@ -454,10 +529,9 @@ query list_people() {
#[test] #[test]
fn deprecated_read_and_change_subcommands_emit_warnings() { fn deprecated_read_and_change_subcommands_emit_warnings() {
// Both subcommands require `--query`/`--query-string`/`--alias`, so // Both subcommands require `--query`/`--query-string`, so invoking them
// invoking them with no args will exit non-zero. That's fine -- // with no args will exit non-zero. That's fine -- we only care that the
// we only care that the deprecation warning is printed before the // deprecation warning is printed before the argument-required error.
// argument-required error.
let output = cli().arg("read").output().unwrap(); let output = cli().arg("read").output().unwrap();
let stderr = String::from_utf8(output.stderr).unwrap(); let stderr = String::from_utf8(output.stderr).unwrap();
assert!( assert!(
@ -525,13 +599,15 @@ query list_people() {
} }
#[test] #[test]
fn query_lint_can_resolve_graph_and_query_from_config() { fn query_lint_can_resolve_graph_from_store_scope() {
// RFC-011: lint resolves its graph target through `--store` (the direct
// scope), not omnigraph.yaml's cli.graph; the .gq path is plain cwd-relative.
let temp = tempdir().unwrap(); let temp = tempdir().unwrap();
let graph = graph_path(temp.path()); let graph = graph_path(temp.path());
let config_path = temp.path().join("omnigraph.yaml");
init_graph(&graph); init_graph(&graph);
let query_path = temp.path().join("queries.gq");
write_query_file( write_query_file(
&temp.path().join("queries.gq"), &query_path,
r#" r#"
query list_people() { query list_people() {
match { $p: Person } match { $p: Person }
@ -539,16 +615,15 @@ query list_people() {
} }
"#, "#,
); );
write_config(&config_path, &local_yaml_config(&graph));
let output = output_success( let output = output_success(
cli() cli()
.arg("query") .arg("query")
.arg("lint") .arg("lint")
.arg("--query") .arg("--query")
.arg("queries.gq") .arg(&query_path)
.arg("--config") .arg("--store")
.arg(&config_path) .arg(&graph)
.arg("--json"), .arg("--json"),
); );
let payload: Value = serde_json::from_slice(&output.stdout).unwrap(); let payload: Value = serde_json::from_slice(&output.stdout).unwrap();
@ -616,7 +691,9 @@ query list_people() {
); );
let stderr = String::from_utf8_lossy(&output.stderr); let stderr = String::from_utf8_lossy(&output.stderr);
assert!( assert!(
stderr.contains("lint requires --schema <schema.pg> or a resolvable graph target") stderr.contains("lint requires --schema <schema.pg>")
|| stderr.contains("no graph addressed"),
"expected a schema-or-graph-target requirement; got: {stderr}"
); );
} }
@ -785,10 +862,10 @@ fn read_json_outputs_rows_for_named_query() {
let output = output_success( let output = output_success(
cli() cli()
.arg("read") .arg("read")
.arg("--store")
.arg(&graph) .arg(&graph)
.arg("--query") .arg("--query")
.arg(&queries) .arg(&queries)
.arg("--name")
.arg("get_person") .arg("get_person")
.arg("--params") .arg("--params")
.arg(r#"{"name":"Alice"}"#) .arg(r#"{"name":"Alice"}"#)
@ -817,7 +894,6 @@ fn read_via_store_flag_and_profile_match_positional_uri() {
let output = output_success( let output = output_success(
cmd.arg("--query") cmd.arg("--query")
.arg(&queries) .arg(&queries)
.arg("--name")
.arg("get_person") .arg("get_person")
.arg("--params") .arg("--params")
.arg(r#"{"name":"Alice"}"#) .arg(r#"{"name":"Alice"}"#)
@ -826,8 +902,8 @@ fn read_via_store_flag_and_profile_match_positional_uri() {
serde_json::from_slice(&output.stdout).unwrap() serde_json::from_slice(&output.stdout).unwrap()
}; };
// Baseline: positional URI. // Baseline: --store names the graph.
let baseline = read_rows(cli().arg("query").arg(&graph)); let baseline = read_rows(cli().arg("query").arg("--store").arg(&graph));
assert_eq!(baseline["rows"][0]["p.name"], "Alice"); assert_eq!(baseline["rows"][0]["p.name"], "Alice");
// --store names the same graph directly. // --store names the same graph directly.
@ -914,43 +990,38 @@ fn export_jsonl_outputs_source_rows_for_selected_branch_and_type() {
); );
} }
// RFC-011: `policy validate|test|explain` source the Cedar bundle from a
// converged cluster's applied policies (`--cluster <dir>` + `--graph <id>`),
// not omnigraph.yaml's policy.file.
#[test] #[test]
fn policy_validate_accepts_valid_policy_file() { fn policy_validate_accepts_cluster_bundle() {
let temp = tempdir().unwrap(); let cluster = converged_loaded_cluster("knowledge", Some(POLICY_YAML));
let (config, _) = write_policy_config_fixture(temp.path());
let output = output_success( let output = output_success(
cli() cli()
.arg("policy") .arg("policy")
.arg("validate") .arg("validate")
.arg("--config") .arg("--cluster")
.arg(&config), .arg(cluster.path())
.arg("--graph")
.arg("knowledge"),
); );
let stdout = stdout_string(&output); let stdout = stdout_string(&output);
assert!(stdout.contains("policy valid:")); assert!(stdout.contains("policy valid:"));
assert!(stdout.contains("policy.yaml"));
assert!(stdout.contains("[2 actors]")); assert!(stdout.contains("[2 actors]"));
} }
#[test] #[test]
fn policy_validate_fails_for_invalid_policy_file() { fn policy_validate_fails_for_invalid_cluster_bundle() {
let temp = tempdir().unwrap(); // The cluster does not validate a policy bundle's internal rules, so an
let config = temp.path().join("omnigraph.yaml"); // applied-but-malformed bundle reaches `policy validate`, which compiles it
let policy = temp.path().join("policy.yaml"); // and surfaces the error (here: a duplicate rule id).
fs::write( let cluster = converged_loaded_cluster(
&config, "knowledge",
r#" Some(
project: r#"
name: policy-test-graph
policy:
file: ./policy.yaml
"#,
)
.unwrap();
fs::write(
&policy,
r#"
version: 1 version: 1
groups: groups:
team: [act-andrew] team: [act-andrew]
@ -966,26 +1037,42 @@ rules:
actions: [export] actions: [export]
branch_scope: any branch_scope: any
"#, "#,
) ),
.unwrap(); );
let output = output_failure( let output = output_failure(
cli() cli()
.arg("policy") .arg("policy")
.arg("validate") .arg("validate")
.arg("--config") .arg("--cluster")
.arg(&config), .arg(cluster.path())
.arg("--graph")
.arg("knowledge"),
); );
let stderr = String::from_utf8(output.stderr).unwrap(); let stderr = String::from_utf8(output.stderr).unwrap();
assert!(stderr.contains("duplicate policy rule id")); assert!(
stderr.contains("duplicate policy rule id"),
"expected a duplicate-rule error; got: {stderr}"
);
} }
#[test] #[test]
fn policy_test_runs_declarative_cases() { fn policy_test_runs_declarative_cases_against_cluster_bundle() {
let temp = tempdir().unwrap(); let cluster = converged_loaded_cluster("knowledge", Some(POLICY_YAML));
let (config, _) = write_policy_config_fixture(temp.path()); let tests = cluster.path().join("policy.tests.yaml");
fs::write(&tests, POLICY_TESTS_YAML).unwrap();
let output = output_success(cli().arg("policy").arg("test").arg("--config").arg(&config)); let output = output_success(
cli()
.arg("policy")
.arg("test")
.arg("--cluster")
.arg(cluster.path())
.arg("--graph")
.arg("knowledge")
.arg("--tests")
.arg(&tests),
);
let stdout = stdout_string(&output); let stdout = stdout_string(&output);
assert!(stdout.contains("policy tests passed: 2 cases")); assert!(stdout.contains("policy tests passed: 2 cases"));
@ -993,15 +1080,16 @@ fn policy_test_runs_declarative_cases() {
#[test] #[test]
fn policy_explain_reports_decision_and_matched_rule() { fn policy_explain_reports_decision_and_matched_rule() {
let temp = tempdir().unwrap(); let cluster = converged_loaded_cluster("knowledge", Some(POLICY_YAML));
let (config, _) = write_policy_config_fixture(temp.path());
let allow = output_success( let allow = output_success(
cli() cli()
.arg("policy") .arg("policy")
.arg("explain") .arg("explain")
.arg("--config") .arg("--cluster")
.arg(&config) .arg(cluster.path())
.arg("--graph")
.arg("knowledge")
.arg("--actor") .arg("--actor")
.arg("act-andrew") .arg("act-andrew")
.arg("--action") .arg("--action")
@ -1017,8 +1105,10 @@ fn policy_explain_reports_decision_and_matched_rule() {
cli() cli()
.arg("policy") .arg("policy")
.arg("explain") .arg("explain")
.arg("--config") .arg("--cluster")
.arg(&config) .arg(cluster.path())
.arg("--graph")
.arg("knowledge")
.arg("--actor") .arg("--actor")
.arg("act-bruno") .arg("act-bruno")
.arg("--action") .arg("--action")
@ -1032,22 +1122,26 @@ fn policy_explain_reports_decision_and_matched_rule() {
} }
#[test] #[test]
fn read_can_resolve_uri_from_config() { fn read_resolves_uri_from_default_store_scope() {
// RFC-011: a zero-flag read resolves its graph from `defaults.store` in the
// operator config (the local-dev default scope) — no omnigraph.yaml.
let temp = tempdir().unwrap(); let temp = tempdir().unwrap();
let graph = graph_path(temp.path()); let graph = graph_path(temp.path());
let config = temp.path().join("omnigraph.yaml");
init_graph(&graph); init_graph(&graph);
load_fixture(&graph); load_fixture(&graph);
write_config(&config, &local_yaml_config(&graph)); let home = tempdir().unwrap();
std::fs::write(
home.path().join("config.yaml"),
format!("defaults:\n store: {}\n", graph.to_string_lossy()),
)
.unwrap();
let output = output_success( let output = output_success(
cli() cli()
.env("OMNIGRAPH_HOME", home.path())
.arg("read") .arg("read")
.arg("--config")
.arg(&config)
.arg("--query") .arg("--query")
.arg(fixture("test.gq")) .arg(fixture("test.gq"))
.arg("--name")
.arg("get_person") .arg("get_person")
.arg("--params") .arg("--params")
.arg(r#"{"name":"Alice"}"#) .arg(r#"{"name":"Alice"}"#)
@ -1067,10 +1161,10 @@ fn read_csv_format_outputs_header_and_row_values() {
let output = output_success( let output = output_success(
cli() cli()
.arg("read") .arg("read")
.arg("--store")
.arg(&graph) .arg(&graph)
.arg("--query") .arg("--query")
.arg(fixture("test.gq")) .arg(fixture("test.gq"))
.arg("--name")
.arg("get_person") .arg("get_person")
.arg("--params") .arg("--params")
.arg(r#"{"name":"Alice"}"#) .arg(r#"{"name":"Alice"}"#)
@ -1104,10 +1198,10 @@ fn read_uses_operator_default_output_format() {
command command
.env("OMNIGRAPH_HOME", operator_home.path()) .env("OMNIGRAPH_HOME", operator_home.path())
.arg("read") .arg("read")
.arg("--store")
.arg(&graph) .arg(&graph)
.arg("--query") .arg("--query")
.arg(fixture("test.gq")) .arg(fixture("test.gq"))
.arg("--name")
.arg("get_person") .arg("get_person")
.arg("--params") .arg("--params")
.arg(r#"{"name":"Alice"}"#); .arg(r#"{"name":"Alice"}"#);
@ -1139,10 +1233,10 @@ fn read_jsonl_format_outputs_metadata_header_first() {
let output = output_success( let output = output_success(
cli() cli()
.arg("read") .arg("read")
.arg("--store")
.arg(&graph) .arg(&graph)
.arg("--query") .arg("--query")
.arg(fixture("test.gq")) .arg(fixture("test.gq"))
.arg("--name")
.arg("get_person") .arg("get_person")
.arg("--params") .arg("--params")
.arg(r#"{"name":"Alice"}"#) .arg(r#"{"name":"Alice"}"#)
@ -1174,6 +1268,7 @@ query insert_person($name: String, $age: I32) {
let output = output_success( let output = output_success(
cli() cli()
.arg("change") .arg("change")
.arg("--store")
.arg(&graph) .arg(&graph)
.arg("--query") .arg("--query")
.arg(&mutation_file) .arg(&mutation_file)
@ -1190,10 +1285,10 @@ query insert_person($name: String, $age: I32) {
let verify = output_success( let verify = output_success(
cli() cli()
.arg("read") .arg("read")
.arg("--store")
.arg(&graph) .arg(&graph)
.arg("--query") .arg("--query")
.arg(fixture("test.gq")) .arg(fixture("test.gq"))
.arg("--name")
.arg("get_person") .arg("get_person")
.arg("--params") .arg("--params")
.arg(r#"{"name":"Eve"}"#) .arg(r#"{"name":"Eve"}"#)
@ -1205,13 +1300,13 @@ query insert_person($name: String, $age: I32) {
} }
#[test] #[test]
fn change_can_resolve_uri_and_branch_from_config() { fn change_resolves_uri_and_default_branch_from_store_scope() {
// RFC-011: a mutate resolves its graph from `--store` and defaults the
// branch to main (no omnigraph.yaml cli.graph / cli.branch).
let temp = tempdir().unwrap(); let temp = tempdir().unwrap();
let graph = graph_path(temp.path()); let graph = graph_path(temp.path());
let config = temp.path().join("omnigraph.yaml");
init_graph(&graph); init_graph(&graph);
load_fixture(&graph); load_fixture(&graph);
write_config(&config, &local_yaml_config(&graph));
let mutation_file = temp.path().join("config-mutations.gq"); let mutation_file = temp.path().join("config-mutations.gq");
write_query_file( write_query_file(
&mutation_file, &mutation_file,
@ -1225,8 +1320,8 @@ query insert_person($name: String, $age: I32) {
let output = output_success( let output = output_success(
cli() cli()
.arg("change") .arg("change")
.arg("--config") .arg("--store")
.arg(&config) .arg(&graph)
.arg("--query") .arg("--query")
.arg(&mutation_file) .arg(&mutation_file)
.arg("--params") .arg("--params")
@ -1248,6 +1343,7 @@ fn read_requires_name_for_multi_query_files() {
let output = output_failure( let output = output_failure(
cli() cli()
.arg("read") .arg("read")
.arg("--store")
.arg(&graph) .arg(&graph)
.arg("--query") .arg("--query")
.arg(fixture("test.gq")), .arg(fixture("test.gq")),
@ -1266,6 +1362,7 @@ fn read_supports_inline_query_string() {
let output = output_success( let output = output_success(
cli() cli()
.arg("read") .arg("read")
.arg("--store")
.arg(&repo) .arg(&repo)
.arg("-e") .arg("-e")
.arg("query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }") .arg("query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }")
@ -1281,11 +1378,12 @@ fn read_supports_inline_query_string() {
#[test] #[test]
fn positional_http_uri_on_a_data_verb_is_rejected() { fn positional_http_uri_on_a_data_verb_is_rejected() {
// RFC-011: a positional/`--uri` http(s):// URL no longer dispatches to a // RFC-011: a `--store` http(s):// URL no longer dispatches to a remote
// remote server — that requires `--server <url>`. // server — that requires `--server <url>`.
let output = output_failure( let output = output_failure(
cli() cli()
.arg("query") .arg("query")
.arg("--store")
.arg("http://127.0.0.1:1") .arg("http://127.0.0.1:1")
.arg("-e") .arg("-e")
.arg("query q() { match { $p: Person { } } return { $p } }"), .arg("query q() { match { $p: Person { } } return { $p } }"),
@ -1293,7 +1391,7 @@ fn positional_http_uri_on_a_data_verb_is_rejected() {
let stderr = String::from_utf8_lossy(&output.stderr); let stderr = String::from_utf8_lossy(&output.stderr);
assert!( assert!(
stderr.contains("must be addressed with `--server <url>`"), stderr.contains("must be addressed with `--server <url>`"),
"expected positional-remote rejection; got: {stderr}" "expected store-remote rejection; got: {stderr}"
); );
} }
@ -1331,6 +1429,7 @@ fn change_supports_inline_query_string() {
let output = output_success( let output = output_success(
cli() cli()
.arg("change") .arg("change")
.arg("--store")
.arg(&repo) .arg(&repo)
.arg("--query-string") .arg("--query-string")
.arg("query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }") .arg("query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }")
@ -1345,6 +1444,7 @@ fn change_supports_inline_query_string() {
let verify = output_success( let verify = output_success(
cli() cli()
.arg("read") .arg("read")
.arg("--store")
.arg(&repo) .arg(&repo)
.arg("-e") .arg("-e")
.arg("query find($name: String) { match { $p: Person { name: $name } } return { $p.name } }") .arg("query find($name: String) { match { $p: Person { name: $name } } return { $p.name } }")
@ -1366,6 +1466,7 @@ fn read_rejects_query_string_combined_with_query() {
let output = output_failure( let output = output_failure(
cli() cli()
.arg("read") .arg("read")
.arg("--store")
.arg(&repo) .arg(&repo)
.arg("--query") .arg("--query")
.arg(fixture("test.gq")) .arg(fixture("test.gq"))
@ -1386,7 +1487,7 @@ fn read_rejects_empty_query_string() {
init_graph(&repo); init_graph(&repo);
load_fixture(&repo); load_fixture(&repo);
let output = output_failure(cli().arg("read").arg(&repo).arg("-e").arg("")); let output = output_failure(cli().arg("read").arg("--store").arg(&repo).arg("-e").arg(""));
let stderr = String::from_utf8(output.stderr).unwrap(); let stderr = String::from_utf8(output.stderr).unwrap();
assert!( assert!(
stderr.contains("must not be empty"), stderr.contains("must not be empty"),
@ -1514,6 +1615,160 @@ fn branch_delete_rejects_main() {
assert!(stderr.contains("cannot delete branch 'main'")); assert!(stderr.contains("cannot delete branch 'main'"));
} }
// ── RFC-011 Decision 9: write diagnostics + non-local destructive-confirm ──
#[test]
fn write_echoes_resolved_target_to_stderr() {
// Every write echoes its resolved target + access path to stderr; --json
// (stdout) is unaffected. A local load → "(direct, local)".
let temp = tempdir().unwrap();
let graph = graph_path(temp.path());
init_graph(&graph);
let data = fixture("test.jsonl");
let output = output_success(
cli()
.arg("load")
.arg("--mode")
.arg("append")
.arg("--data")
.arg(&data)
.arg(&graph)
.arg("--json"),
);
let stderr = String::from_utf8(output.stderr).unwrap();
assert!(
stderr.contains("omnigraph load →") && stderr.contains("(direct, local)"),
"missing write-target echo; stderr: {stderr}"
);
// stdout still parses as JSON — the echo went to stderr.
let _: Value = serde_json::from_slice(&output.stdout).unwrap();
}
#[test]
fn quiet_suppresses_the_write_target_echo() {
let temp = tempdir().unwrap();
let graph = graph_path(temp.path());
init_graph(&graph);
let data = fixture("test.jsonl");
let output = output_success(
cli()
.arg("--quiet")
.arg("load")
.arg("--mode")
.arg("append")
.arg("--data")
.arg(&data)
.arg(&graph),
);
let stderr = String::from_utf8(output.stderr).unwrap();
assert!(
!stderr.contains("omnigraph load →"),
"--quiet should suppress the echo; stderr: {stderr}"
);
}
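The two tests above rely on the echo going to stderr while stdout stays machine-readable. A minimal sketch of that separation follows, assuming a hypothetical `emit_write_result` function that is generic over the two writers; the real CLI's internals are not shown in this diff.

```rust
use std::io::{self, Write};

/// Hypothetical sketch: write the human-facing target echo to `err`
/// (skipped when `quiet` is set) and the payload to `out`, so a JSON
/// consumer reading `out` never sees the echo.
fn emit_write_result<O: Write, E: Write>(
    out: &mut O,
    err: &mut E,
    echo_line: &str,
    payload: &str,
    quiet: bool,
) -> io::Result<()> {
    if !quiet {
        writeln!(err, "{echo_line}")?;
    }
    writeln!(out, "{payload}")
}
```

Passing the writers in, instead of calling `println!`/`eprintln!` directly, lets the same logic be exercised against in-memory buffers.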
#[test]
fn branch_delete_against_non_local_scope_refuses_without_yes() {
// No bucket needed: the confirm gate fires before the graph is opened.
let output = output_failure(
cli()
.arg("branch")
.arg("delete")
.arg("--store")
.arg("s3://fake-bucket/g.omni")
.arg("feature")
.arg("--json"),
);
let stderr = String::from_utf8(output.stderr).unwrap();
assert!(
stderr.contains("refusing destructive `branch delete`") && stderr.contains("--yes"),
"expected a non-local destructive refusal; stderr: {stderr}"
);
}
#[test]
fn branch_delete_against_non_local_scope_passes_gate_with_yes() {
// With --yes the gate is bypassed; the command then fails for an unrelated
// reason (the fake bucket can't be opened), so the refusal must be ABSENT.
let output = output_failure(
cli()
.arg("branch")
.arg("delete")
.arg("--store")
.arg("s3://fake-bucket/g.omni")
.arg("feature")
.arg("--yes")
.arg("--json"),
);
let stderr = String::from_utf8(output.stderr).unwrap();
assert!(
!stderr.contains("refusing destructive"),
"--yes should bypass the confirm gate; stderr: {stderr}"
);
}
#[test]
fn overwrite_load_against_non_local_scope_refuses_without_yes() {
let output = output_failure(
cli()
.arg("load")
.arg("--mode")
.arg("overwrite")
.arg("--data")
.arg(fixture("test.jsonl"))
.arg("--store")
.arg("s3://fake-bucket/g.omni")
.arg("--json"),
);
let stderr = String::from_utf8(output.stderr).unwrap();
assert!(
stderr.contains("refusing destructive `load --mode overwrite`"),
"expected a non-local overwrite refusal; stderr: {stderr}"
);
}
#[test]
fn cleanup_against_non_local_scope_refuses_without_yes() {
// Past the --confirm preview gate, a non-local cleanup still needs --yes.
let output = output_failure(
cli()
.arg("cleanup")
.arg("--store")
.arg("s3://fake-bucket/g.omni")
.arg("--keep")
.arg("5")
.arg("--confirm")
.arg("--json"),
);
let stderr = String::from_utf8(output.stderr).unwrap();
assert!(
stderr.contains("refusing destructive `cleanup`"),
"expected a non-local cleanup refusal; stderr: {stderr}"
);
}
#[test]
fn cleanup_against_local_scope_executes_with_confirm() {
// Local cleanup needs no --yes; --confirm alone executes (and echoes).
let temp = tempdir().unwrap();
let graph = graph_path(temp.path());
init_graph(&graph);
load_fixture(&graph);
let output = output_success(
cli()
.arg("cleanup")
.arg("--keep")
.arg("1")
.arg("--confirm")
.arg(&graph)
.arg("--json"),
);
let payload: Value = serde_json::from_slice(&output.stdout).unwrap();
assert!(payload["tables"].as_array().is_some(), "{payload}");
let stderr = String::from_utf8(output.stderr).unwrap();
assert!(stderr.contains("omnigraph cleanup →"), "stderr: {stderr}");
}
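The confirm-gate behavior exercised by the tests above can be sketched as follows. Both function names are hypothetical, and the locality rule (a plain path or a `file://` URI counts as local) is an assumption made for illustration; the diff does not show how the CLI actually classifies scopes.

```rust
/// Assumption for this sketch: a scope is local when it has no URI scheme
/// or when the scheme is `file`.
fn is_local_scope(uri: &str) -> bool {
    match uri.split_once("://") {
        None => true,
        Some((scheme, _)) => scheme.eq_ignore_ascii_case("file"),
    }
}

/// Hypothetical gate: destructive verbs against a non-local scope are
/// refused unless the caller passed `--yes`. It runs before any I/O, which
/// is why the tests need no real bucket.
fn confirm_destructive(verb: &str, uri: &str, yes: bool) -> Result<(), String> {
    if is_local_scope(uri) || yes {
        return Ok(());
    }
    Err(format!(
        "refusing destructive `{verb}` against non-local scope {uri}; pass --yes to proceed"
    ))
}
```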
#[test] #[test]
fn branch_merge_defaults_target_to_main() { fn branch_merge_defaults_target_to_main() {
let temp = tempdir().unwrap(); let temp = tempdir().unwrap();
@ -1663,19 +1918,17 @@ fn snapshot_json_returns_manifest_version_and_tables() {
} }
#[test] #[test]
fn snapshot_can_resolve_uri_from_config() { fn snapshot_resolves_uri_from_store_scope() {
let temp = tempdir().unwrap(); let temp = tempdir().unwrap();
let graph = graph_path(temp.path()); let graph = graph_path(temp.path());
let config = temp.path().join("omnigraph.yaml");
init_graph(&graph); init_graph(&graph);
load_fixture(&graph); load_fixture(&graph);
write_config(&config, &local_yaml_config(&graph));
let output = output_success( let output = output_success(
cli() cli()
.arg("snapshot") .arg("snapshot")
.arg("--config") .arg("--store")
.arg(&config) .arg(&graph)
.arg("--json"), .arg("--json"),
); );
let payload: Value = serde_json::from_slice(&output.stdout).unwrap(); let payload: Value = serde_json::from_slice(&output.stdout).unwrap();
@ -1816,3 +2069,162 @@ fn cli_fails_for_invalid_merge_requests() {
.contains("distinct source and target") .contains("distinct source and target")
); );
} }
/// RFC-011 Decision 8: `profile list` / `profile show` inspect the operator
/// config's profiles read-only. Hermetic via OMNIGRAPH_HOME.
fn profile_home() -> tempfile::TempDir {
let home = tempdir().unwrap();
std::fs::write(
home.path().join("config.yaml"),
"operator:\n  actor: act-andrew\n\
defaults:\n  output: json\n  server: prod\n  default_graph: knowledge\n\
servers:\n  prod:\n    url: https://graph.example.com\n\
clusters:\n  brain:\n    root: s3://acme/clusters/brain\n\
profiles:\n\
\x20 staging:\n    server: prod\n    default_graph: kb\n\
\x20 brain-admin:\n    cluster: brain\n\
\x20 localdev:\n    store: file:///data/dev.omni\n\
\x20 broken:\n    server: a\n    store: b\n",
)
.unwrap();
home
}
#[test]
fn profile_list_names_each_profile_with_its_binding_and_marks_active() {
let home = profile_home();
let out = output_success(
cli()
.env("OMNIGRAPH_HOME", home.path())
.env("OMNIGRAPH_PROFILE", "staging")
.arg("profile")
.arg("list"),
);
let stdout = stdout_string(&out);
assert!(stdout.contains("staging (active)"), "{stdout}");
assert!(stdout.contains("server: prod"), "{stdout}");
assert!(stdout.contains("cluster: brain"), "{stdout}");
assert!(stdout.contains("store: file:///data/dev.omni"), "{stdout}");
// A malformed (two-scope) profile is reported, not a hard failure.
assert!(stdout.contains("broken") && stdout.contains("invalid:"), "{stdout}");
}
#[test]
fn profile_list_json_shape() {
let home = profile_home();
let out = output_success(
cli()
.env("OMNIGRAPH_HOME", home.path())
.arg("profile")
.arg("list")
.arg("--json"),
);
let items: Value = serde_json::from_slice(&out.stdout).unwrap();
let brain = items
.as_array()
.unwrap()
.iter()
.find(|p| p["name"] == "brain-admin")
.unwrap();
assert_eq!(brain["binding"], "cluster: brain");
assert_eq!(brain["scope_kind"], "cluster");
assert_eq!(brain["target"], "brain");
assert_eq!(brain["valid"], true);
assert!(brain["error"].is_null());
assert_eq!(brain["active"], false);
let broken = items
.as_array()
.unwrap()
.iter()
.find(|p| p["name"] == "broken")
.unwrap();
assert_eq!(broken["scope_kind"], "invalid");
assert_eq!(broken["valid"], false);
assert!(broken["target"].is_null());
assert!(
broken["error"]
.as_str()
.unwrap()
.contains("profile 'broken'")
);
}
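The `broken` fixture profile binds both `server` and `store`, and the tests expect it to be reported as invalid rather than to abort the listing. A minimal sketch of a one-scope-per-profile check follows; `Binding` and `resolve_binding` are hypothetical names and the error wording is illustrative, not the CLI's real message.

```rust
/// Hypothetical model of the single scope a profile may bind.
#[derive(Debug, PartialEq)]
enum Binding {
    Server(String),
    Cluster(String),
    Store(String),
}

/// Hypothetical validator: exactly one of the three scope keys may be set.
/// An error is returned as data so a listing can show it next to the
/// profile name instead of failing the whole command.
fn resolve_binding(
    name: &str,
    server: Option<&str>,
    cluster: Option<&str>,
    store: Option<&str>,
) -> Result<Binding, String> {
    match (server, cluster, store) {
        (Some(s), None, None) => Ok(Binding::Server(s.to_string())),
        (None, Some(c), None) => Ok(Binding::Cluster(c.to_string())),
        (None, None, Some(u)) => Ok(Binding::Store(u.to_string())),
        (None, None, None) => Err(format!("profile '{name}' binds no scope")),
        _ => Err(format!("profile '{name}' binds more than one scope")),
    }
}
```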
#[test]
fn profile_show_resolves_named_scope_endpoints() {
let home = profile_home();
// A cluster profile resolves its root.
let cluster = output_success(
cli()
.env("OMNIGRAPH_HOME", home.path())
.arg("profile")
.arg("show")
.arg("brain-admin"),
);
let cs = stdout_string(&cluster);
assert!(cs.contains("scope: cluster brain"), "{cs}");
assert!(cs.contains("endpoint: s3://acme/clusters/brain"), "{cs}");
// A store profile shows its URI as the endpoint.
let store = output_success(
cli()
.env("OMNIGRAPH_HOME", home.path())
.arg("profile")
.arg("show")
.arg("localdev")
.arg("--json"),
);
let detail: Value = serde_json::from_slice(&store.stdout).unwrap();
assert_eq!(detail["scope_kind"], "store");
assert_eq!(detail["endpoint"], "file:///data/dev.omni");
}
#[test]
fn profile_show_without_name_falls_back_to_flat_defaults() {
let home = profile_home();
let out = output_success(
cli()
.env("OMNIGRAPH_HOME", home.path())
.arg("profile")
.arg("show")
.arg("--json"),
);
let detail: Value = serde_json::from_slice(&out.stdout).unwrap();
assert_eq!(detail["name"], "(defaults)");
assert_eq!(detail["scope_kind"], "server");
assert_eq!(detail["endpoint"], "https://graph.example.com");
assert_eq!(detail["default_graph"], "knowledge");
}
#[test]
fn profile_show_without_name_uses_active_env_profile() {
let home = profile_home();
let out = output_success(
cli()
.env("OMNIGRAPH_HOME", home.path())
.env("OMNIGRAPH_PROFILE", "brain-admin")
.arg("profile")
.arg("show")
.arg("--json"),
);
let detail: Value = serde_json::from_slice(&out.stdout).unwrap();
// No name arg, but $OMNIGRAPH_PROFILE selects brain-admin (not the flat defaults).
assert_eq!(detail["name"], "brain-admin");
assert_eq!(detail["scope_kind"], "cluster");
assert_eq!(detail["endpoint"], "s3://acme/clusters/brain");
// output_format renders as the canonical lowercase value name.
assert_eq!(detail["output_format"], "json");
}
#[test]
fn profile_show_unknown_name_errors() {
let home = profile_home();
let out = output_failure(
cli()
.env("OMNIGRAPH_HOME", home.path())
.arg("profile")
.arg("show")
.arg("nope"),
);
let stderr = String::from_utf8_lossy(&out.stderr);
assert!(stderr.contains("unknown profile 'nope'"), "{stderr}");
}

@ -2,7 +2,6 @@
//! Moved verbatim from tests/cli.rs in the modularization. //! Moved verbatim from tests/cli.rs in the modularization.
use serde_json::Value;
use tempfile::tempdir; use tempfile::tempdir;
mod support; mod support;
@ -57,227 +56,172 @@ query list_people() {
assert_eq!(stdout_string(&lint_output), stdout_string(&check_output)); assert_eq!(stdout_string(&lint_output), stdout_string(&check_output));
} }
// Legacy `omnigraph.yaml` `aliases:` invoked via the `--alias` flag were
// removed in RFC-011 D4 — operator aliases now live under `omnigraph alias
// <name>` (the happy path is covered by system_local's operator-alias e2e).
// The legacy file-alias path has no CLI entry point.
#[test] #[test]
fn read_alias_from_yaml_config_runs_with_kv_output() { fn alias_flag_is_removed_from_query() {
let temp = tempdir().unwrap(); // RFC-011 D4: `--alias` no longer exists on query/mutate; use `alias <name>`.
let graph = graph_path(temp.path()); let output = output_failure(cli().arg("query").arg("--alias").arg("who"));
let config = temp.path().join("omnigraph.yaml"); let stderr = String::from_utf8_lossy(&output.stderr);
let query = temp.path().join("aliases.gq"); assert!(
init_graph(&graph); stderr.contains("unexpected argument") && stderr.contains("--alias"),
load_fixture(&graph); "expected clap to reject --alias on query; got: {stderr}"
write_query_file(
&query,
&std::fs::read_to_string(fixture("test.gq")).unwrap(),
); );
write_config(
&config,
&format!(
"{}aliases:\n owner:\n command: read\n query: aliases.gq\n name: get_person\n args: [name]\n format: kv\n",
local_yaml_config(&graph)
),
);
let output = output_success(
cli()
.arg("read")
.arg("--config")
.arg(&config)
.arg("--alias")
.arg("owner")
.arg("Alice"),
);
let stdout = stdout_string(&output);
assert!(stdout.contains("row 1"));
assert!(stdout.contains("p.name: Alice"));
} }
#[test] #[test]
fn read_alias_uses_alias_target_without_cli_default_and_accepts_url_like_arg() { fn alias_unknown_name_errors_listing_defined() {
let temp = tempdir().unwrap(); // Hermetic: an unknown alias fails before any network, listing defined ones.
let graph = graph_path(temp.path()); let home = tempdir().unwrap();
let config = temp.path().join("omnigraph.yaml"); std::fs::write(
let query = temp.path().join("aliases.gq"); home.path().join("config.yaml"),
let data = temp.path().join("url-like.jsonl"); "servers:\n  dev:\n    url: https://x\naliases:\n  who:\n    server: dev\n    query: find_person\n",
init_graph(&graph); )
write_jsonl( .unwrap();
&data, let output = output_failure(
r#"{"type":"Person","data":{"name":"https://example.com","age":30}}"#,
);
output_success(
cli() cli()
.arg("load") .env("OMNIGRAPH_HOME", home.path())
.arg("--mode") .arg("alias")
.arg("overwrite") .arg("nope"),
.arg("--data")
.arg(&data)
.arg(&graph),
); );
write_query_file( let stderr = String::from_utf8_lossy(&output.stderr);
&query, assert!(
&std::fs::read_to_string(fixture("test.gq")).unwrap(), stderr.contains("unknown alias 'nope'") && stderr.contains("who"),
"expected an unknown-alias error listing defined aliases; got: {stderr}"
); );
write_config(
&config,
&format!(
"graphs:\n local:\n uri: '{}'\nquery:\n roots:\n - .\npolicy: {{}}\naliases:\n owner:\n command: read\n query: aliases.gq\n name: get_person\n args: [name]\n graph: local\n format: kv\n",
graph.to_string_lossy()
),
);
let output = output_success(
cli()
.arg("read")
.arg("--config")
.arg(&config)
.arg("--alias")
.arg("owner")
.arg("https://example.com"),
);
let stdout = stdout_string(&output);
assert!(stdout.contains("row 1"));
assert!(stdout.contains("p.name: https://example.com"));
} }
#[test] #[test]
fn change_alias_from_yaml_config_persists_changes() { fn alias_rejects_global_scope_flags_that_the_binding_owns() {
let temp = tempdir().unwrap(); for (flag, value) in [
let graph = graph_path(temp.path()); ("--server", "dev"),
let config = temp.path().join("omnigraph.yaml"); ("--graph", "local"),
let query = temp.path().join("mutations.gq"); ("--store", "file:///tmp/graph.omni"),
init_graph(&graph); ("--cluster", "."),
load_fixture(&graph); ("--profile", "prod"),
write_query_file( ("--as", "act-op"),
&query, ] {
r#" let output = output_failure(cli().arg(flag).arg(value).arg("alias").arg("who"));
query insert_person($name: String, $age: I32) { let stderr = String::from_utf8_lossy(&output.stderr);
insert Person { name: $name, age: $age } assert!(
stderr.contains("`alias` uses the server, graph, and stored query")
&& stderr.contains(flag),
"expected {flag} to be rejected by the alias binding guard; got: {stderr}"
);
}
} }
"#,
#[test]
fn queries_and_policy_wrong_server_scope_points_at_cluster_scope() {
let output = output_failure(cli().arg("--server").arg("prod").arg("queries").arg("list"));
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("pass --cluster <dir|uri>") && !stderr.contains("pass --config <dir>"),
"queries should point at --cluster, not --config; got: {stderr}"
); );
write_config(
&config, let output = output_failure(
&format!( cli()
"{}aliases:\n add_person:\n command: change\n query: mutations.gq\n name: insert_person\n args: [name, age]\n", .arg("--server")
local_yaml_config(&graph) .arg("prod")
.arg("policy")
.arg("validate"),
);
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("pass --cluster <dir|uri>") && !stderr.contains("pass --config <dir>"),
"policy should point at --cluster, not --config; got: {stderr}"
);
}
// RFC-011: `queries validate`/`list` source the registry + schemas from a
// converged cluster's applied state (`--cluster <dir>`), not omnigraph.yaml.
/// Build a converged single-graph cluster (id `knowledge`) with one stored
/// query. `query_block` is the YAML under the graph's `queries:` key.
fn converged_cluster_with_query(query_file: &str, query_src: &str, query_block: &str) -> tempfile::TempDir {
let temp = tempdir().unwrap();
let dir = temp.path();
std::fs::copy(fixture("test.pg"), dir.join("graph.pg")).unwrap();
write_query_file(&dir.join(query_file), query_src);
std::fs::write(
dir.join("cluster.yaml"),
format!(
"version: 1\nmetadata:\n name: sys\nstate:\n backend: cluster\n lock: true\n\
graphs:\n knowledge:\n schema: ./graph.pg\n queries:\n{query_block}"
), ),
); )
.unwrap();
let output = output_success( output_success(cli().arg("cluster").arg("import").arg("--config").arg(dir));
cli() output_success(cli().arg("cluster").arg("apply").arg("--config").arg(dir));
.arg("change") temp
.arg("--config")
.arg(&config)
.arg("--alias")
.arg("add_person")
.arg("Eve")
.arg("29")
.arg("--json"),
);
let payload: Value = serde_json::from_slice(&output.stdout).unwrap();
assert_eq!(payload["affected_nodes"], 1);
let verify = output_success(
cli()
.arg("read")
.arg(&graph)
.arg("--query")
.arg(fixture("test.gq"))
.arg("--name")
.arg("get_person")
.arg("--params")
.arg(r#"{"name":"Eve"}"#)
.arg("--json"),
);
let verify_payload: Value = serde_json::from_slice(&verify.stdout).unwrap();
assert_eq!(verify_payload["row_count"], 1);
} }
#[test] #[test]
fn queries_validate_exits_zero_on_clean_registry() { fn queries_validate_exits_zero_on_clean_registry() {
let graph = SystemGraph::loaded(); let cluster = converged_cluster_with_query(
graph.write_query(
"find_person.gq", "find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }", "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
); " find_person:\n file: ./find_person.gq\n",
let config = graph.write_config(
"omnigraph.yaml",
&queries_test_config(
&graph.path().to_string_lossy(),
"find_person",
"find_person.gq",
),
); );
let output = output_success( let output = output_success(
cli() cli()
.arg("queries") .arg("queries")
.arg("validate") .arg("validate")
.arg("--config") .arg("--cluster")
.arg(&config), .arg(cluster.path()),
); );
let stdout = stdout_string(&output); let stdout = stdout_string(&output);
assert!(stdout.contains("OK"), "stdout:\n{stdout}"); assert!(stdout.contains("OK"), "stdout:\n{stdout}");
} }
#[test] #[test]
fn queries_validate_exits_nonzero_on_type_broken_query() { fn cluster_import_rejects_a_type_broken_query() {
let graph = SystemGraph::loaded(); // In the cluster model a stored query is type-checked at the cluster
// `Widget` is not in the fixture schema. // boundary (import/apply), so a broken query can never reach the applied
graph.write_query( // state `queries validate` reads — the gate is upstream. `Widget` is not in
"ghost.gq", // the fixture schema, so import must reject it, naming the query.
let temp = tempdir().unwrap();
let dir = temp.path();
std::fs::copy(fixture("test.pg"), dir.join("graph.pg")).unwrap();
write_query_file(
&dir.join("ghost.gq"),
"query ghost() { match { $w: Widget } return { $w.name } }", "query ghost() { match { $w: Widget } return { $w.name } }",
); );
let config = graph.write_config( std::fs::write(
"omnigraph.yaml", dir.join("cluster.yaml"),
&queries_test_config(&graph.path().to_string_lossy(), "ghost", "ghost.gq"), "version: 1\nmetadata:\n name: sys\nstate:\n backend: cluster\n lock: true\n\
graphs:\n knowledge:\n schema: ./graph.pg\n queries:\n ghost:\n file: ./ghost.gq\n",
)
.unwrap();
let output = output_failure(cli().arg("cluster").arg("import").arg("--config").arg(dir));
let combined = format!(
"{}{}",
stdout_string(&output),
String::from_utf8_lossy(&output.stderr)
); );
let output = output_failure(
cli()
.arg("queries")
.arg("validate")
.arg("--config")
.arg(&config),
);
let stdout = stdout_string(&output);
assert!( assert!(
stdout.contains("ghost"), combined.contains("ghost"),
"validation should name the broken query; stdout:\n{stdout}" "cluster import must reject the broken query, naming it; got:\n{combined}"
); );
} }
#[test] #[test]
fn queries_list_prints_registered_query() { fn queries_list_prints_registered_query() {
let graph = SystemGraph::loaded(); let cluster = converged_cluster_with_query(
graph.write_query(
"find_person.gq", "find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }", "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
); " find_person:\n file: ./find_person.gq\n",
// Exposed with an explicit tool name so the list shows the MCP suffix.
let config = graph.write_config(
"omnigraph.yaml",
&format!(
concat!(
"graphs:\n",
" local:\n",
" uri: '{}'\n",
" queries:\n",
" find_person:\n",
" file: ./find_person.gq\n",
" mcp: {{ expose: true, tool_name: lookup_person }}\n",
"cli:\n",
" graph: local\n",
"policy: {{}}\n",
),
graph.path().to_string_lossy().replace('\'', "''")
),
); );
let output = output_success( let output = output_success(
cli() cli()
.arg("queries") .arg("queries")
.arg("list") .arg("list")
.arg("--config") .arg("--cluster")
.arg(&config), .arg(cluster.path()),
); );
let stdout = stdout_string(&output); let stdout = stdout_string(&output);
assert!(stdout.contains("find_person"), "stdout:\n{stdout}"); assert!(stdout.contains("find_person"), "stdout:\n{stdout}");
@ -285,242 +229,37 @@ fn queries_list_prints_registered_query() {
stdout.contains("$name: String"), stdout.contains("$name: String"),
"list should show typed params; stdout:\n{stdout}" "list should show typed params; stdout:\n{stdout}"
); );
assert!(
stdout.contains("[mcp: lookup_person]"),
"list should show the MCP tool name for exposed queries; stdout:\n{stdout}"
);
} }
#[test] #[test]
fn queries_list_requires_graph_selection_for_per_graph_only_registries() { fn queries_validate_requires_a_cluster() {
let graph = SystemGraph::loaded(); // RFC-011: with no --cluster (and no cluster profile), the command errors
graph.write_query( // loudly rather than reading any omnigraph.yaml.
"find_person.gq", let output = output_failure(cli().arg("queries").arg("validate"));
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph.yaml",
&format!(
concat!(
"graphs:\n",
" local:\n",
" uri: '{}'\n",
" queries:\n",
" find_person:\n",
" file: ./find_person.gq\n",
"policy: {{}}\n",
),
graph.path().to_string_lossy().replace('\'', "''")
),
);
let output = output_failure(
cli()
.arg("queries")
.arg("list")
.arg("--config")
.arg(&config),
);
let stderr = String::from_utf8_lossy(&output.stderr); let stderr = String::from_utf8_lossy(&output.stderr);
assert!( assert!(
stderr.contains("local") && stderr.contains("set `cli.graph`"), stderr.contains("needs a cluster") || stderr.contains("--cluster"),
"error must name the graph and give a concrete selection hint; stderr:\n{stderr}" "queries validate must require a cluster; stderr:\n{stderr}"
); );
} }
#[test] #[test]
fn queries_list_without_graph_selection_lists_top_level_registry() { fn queries_validate_graph_filter_selects_one_graph() {
let graph = SystemGraph::loaded(); // A multi-graph cluster: validate scoped to `knowledge` type-checks only
graph.write_query( // that graph's registry, ignoring `engineering`'s.
"top_find.gq", let temp = tempdir().unwrap();
"query top_find($name: String) { match { $p: Person { name: $name } } return { $p.age } }", let dir = temp.path();
); write_multi_graph_cluster_fixture(dir);
let config = graph.write_config( output_success(cli().arg("cluster").arg("import").arg("--config").arg(dir));
"omnigraph.yaml", output_success(cli().arg("cluster").arg("apply").arg("--config").arg(dir));
concat!(
"queries:\n",
" top_find:\n",
" file: ./top_find.gq\n",
"policy: {}\n",
),
);
let output = output_success(
cli()
.arg("queries")
.arg("list")
.arg("--config")
.arg(&config),
);
let stdout = stdout_string(&output);
assert!(stdout.contains("top_find"), "stdout:\n{stdout}");
}
#[test]
fn queries_list_unknown_cli_graph_errors() {
// `queries list` opens no graph URI, so unknown-graph validation can't ride
// along on URI resolution the way it does for every other command. An
// unknown `cli.graph` selection must still error (naming the graph) instead
// of silently falling back to the top-level registry and showing the wrong
// (or empty) catalog. (`--target` was removed; `cli.graph` drives selection.)
let graph = SystemGraph::loaded();
graph.write_query(
"find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph.yaml",
&format!(
"graphs:\n local:\n uri: '{}'\n queries:\n find_person:\n file: ./find_person.gq\ncli:\n graph: nonexistent\npolicy: {{}}\n",
graph.path().to_string_lossy().replace('\'', "''"),
),
);
let output = output_failure(cli().arg("queries").arg("list").arg("--config").arg(&config));
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("nonexistent"),
"error must name the unknown graph; stderr:\n{stderr}"
);
}
#[test]
fn queries_commands_reject_named_graph_with_populated_top_level_block() {
// A named graph (here via `cli.graph`) uses its own `graphs.<name>` block,
// so a populated top-level `queries:` block would be silently ignored — a
// config the server REFUSES to boot. `queries validate`/`list` must reject
// it too (matching boot) instead of validating/listing the per-graph block
// and giving a false green.
let graph = SystemGraph::loaded();
graph.write_query(
"find_person.gq",
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
let config = graph.write_config(
"omnigraph.yaml",
&format!(
concat!(
"graphs:\n",
" local:\n",
" uri: '{}'\n",
" queries:\n",
" find_person:\n",
" file: ./find_person.gq\n",
"cli:\n",
" graph: local\n",
"queries:\n", // populated top-level block: the coherence violation
" legacy:\n",
" file: ./legacy.gq\n",
"policy: {{}}\n",
),
graph.path().to_string_lossy().replace('\'', "''")
),
);
// Both resolve `local` from cli.graph (no positional URI), so both must
// error and name the graph + the ignored block — like server boot does.
for sub in ["validate", "list"] {
let output = output_failure(cli().arg("queries").arg(sub).arg("--config").arg(&config));
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("local") && stderr.contains("queries"),
"`queries {sub}` must reject a named graph with a populated top-level block; stderr:\n{stderr}"
);
}
}
#[test]
fn queries_validate_exits_nonzero_on_duplicate_tool_name() {
// Two exposed queries claiming one MCP tool name is a load-time
// collision — `queries validate` must fail (offline, before the engine
// opens) and name both queries plus the contested tool.
let graph = SystemGraph::loaded();
graph.write_query(
"a.gq",
"query a() { match { $p: Person } return { $p.name } }",
);
graph.write_query(
"b.gq",
"query b() { match { $p: Person } return { $p.name } }",
);
let config = graph.write_config(
"omnigraph.yaml",
&format!(
concat!(
"graphs:\n",
" local:\n",
" uri: '{}'\n",
" queries:\n",
" a:\n",
" file: ./a.gq\n",
" mcp: {{ expose: true, tool_name: dup }}\n",
" b:\n",
" file: ./b.gq\n",
" mcp: {{ expose: true, tool_name: dup }}\n",
"cli:\n",
" graph: local\n",
"policy: {{}}\n",
),
graph.path().to_string_lossy().replace('\'', "''")
),
);
let output = output_failure(
cli()
.arg("queries")
.arg("validate")
.arg("--config")
.arg(&config),
);
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("dup") && stderr.contains("'a'") && stderr.contains("'b'"),
"duplicate tool name should be reported naming both queries; stderr:\n{stderr}"
);
}
#[test]
fn queries_validate_positional_uri_ignores_default_graph() {
// A positional URI is anonymous → the schema AND the registry both come
// from top-level, even when `cli.graph` names a graph whose per-graph
// queries would fail. Pins that the URI and registry can't diverge.
let graph = SystemGraph::loaded();
graph.write_query(
"clean.gq",
"query clean($name: String) { match { $p: Person { name: $name } } return { $p.age } }",
);
// `Widget` is not in the fixture schema — the default graph's per-graph
// query would break validate if it were (wrongly) selected.
graph.write_query(
"broken.gq",
"query broken() { match { $w: Widget } return { $w.name } }",
);
let config = graph.write_config(
"omnigraph.yaml",
concat!(
"cli:\n graph: prod\n",
"graphs:\n",
" prod:\n",
" uri: /nonexistent-prod.omni\n",
" queries:\n",
" broken:\n",
" file: ./broken.gq\n",
"queries:\n",
" clean:\n",
" file: ./clean.gq\n",
"policy: {}\n",
),
);
// Positional URI = the real loaded graph; selection is anonymous, so the
// CLEAN top-level registry validates (not prod's broken one).
let output = output_success( let output = output_success(
cli() cli()
.arg("queries") .arg("queries")
.arg("validate") .arg("validate")
.arg(graph.path()) .arg("--cluster")
.arg("--config") .arg(dir)
.arg(&config), .arg("--graph")
); .arg("knowledge"),
let stdout = stdout_string(&output);
assert!(
stdout.contains("OK"),
"positional URI must validate the top-level registry, not the cli.graph default; stdout:\n{stdout}"
); );
assert!(stdout_string(&output).contains("OK"));
} }
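
Reviewer's sketch: the cluster-backed tests in this file all follow the same order, `cluster import`, then `cluster apply`, then `queries validate --cluster <dir>` (optionally narrowed with `--graph`). The snippet below only builds the argument vectors for that sequence and runs nothing. `converge_then_validate_argv` is a hypothetical name introduced for illustration, not a helper from the test suite; the subcommands and flags are the ones the tests above pass.

```rust
use std::path::Path;

/// Hypothetical helper (not part of the test suite): returns the argument
/// vectors for the three CLI calls the tests chain together, in order.
fn converge_then_validate_argv(cluster_dir: &Path, graph: Option<&str>) -> Vec<Vec<String>> {
    let dir = cluster_dir.to_string_lossy().into_owned();

    // Step 3 is built first because it may carry an optional graph filter.
    let mut validate = vec![
        "queries".to_string(),
        "validate".to_string(),
        "--cluster".to_string(),
        dir.clone(),
    ];
    if let Some(id) = graph {
        validate.push("--graph".to_string());
        validate.push(id.to_string());
    }

    vec![
        // Step 1: import type-checks stored queries at the cluster boundary.
        vec!["cluster".into(), "import".into(), "--config".into(), dir.clone()],
        // Step 2: apply converges the cluster to the declared state.
        vec!["cluster".into(), "apply".into(), "--config".into(), dir],
        // Step 3: validate reads the applied state, not omnigraph.yaml.
        validate,
    ]
}
```

Keeping the order explicit shows why a type-broken query is rejected at step 1 and never reaches step 3.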


@ -121,7 +121,7 @@ fn schema_plan_with_server_flag_errors_wrong_plane() {
let stderr = String::from_utf8_lossy(&output.stderr); let stderr = String::from_utf8_lossy(&output.stderr);
assert!( assert!(
stderr.contains("`schema plan` is a direct (storage-native) command") stderr.contains("`schema plan` is a direct (storage-native) command")
&& stderr.contains("Pass a storage URI, or --cluster <dir> --cluster-graph <id>."), && stderr.contains("Pass a storage URI."),
"schema plan wrong-capability message not found; got: {stderr}" "schema plan wrong-capability message not found; got: {stderr}"
); );
} }
@ -334,7 +334,13 @@ fn schema_apply_json_adds_index_for_existing_property() {
let dataset = snapshot.open("node:Person").await.unwrap(); let dataset = snapshot.open("node:Person").await.unwrap();
dataset.load_indices().await.unwrap().len() dataset.load_indices().await.unwrap().len()
}); });
assert!(after_index_count > before_index_count); // iss-848: `schema apply` records the `@index` intent but defers the physical
// index build (materialized later by ensure_indices/optimize; on this empty
// table nothing builds anyway). So the physical index count is unchanged.
assert_eq!(
after_index_count, before_index_count,
"schema apply records @index intent but defers the physical build (iss-848)"
);
} }
#[test] #[test]
@ -540,163 +546,18 @@ fn graphs_subcommand_help_lists_list_only() {
#[test] #[test]
fn graphs_list_against_local_uri_errors_with_remote_only_message() { fn graphs_list_against_local_uri_errors_with_remote_only_message() {
// RFC-011: `graphs list` is served-only; a `--store` (local) address has no
// enumeration endpoint, so it fails loudly pointing at a server / cluster.
let output = output_failure( let output = output_failure(
cli() cli()
.arg("graphs") .arg("graphs")
.arg("list") .arg("list")
.arg("--uri") .arg("--store")
.arg("/tmp/local"), .arg("/tmp/local"),
); );
let stderr = String::from_utf8_lossy(&output.stderr).into_owned(); let stderr = String::from_utf8_lossy(&output.stderr).into_owned();
assert!( assert!(
stderr.contains("remote multi-graph server URL"), stderr.contains("remote multi-graph server"),
"expected 'remote multi-graph server URL' rejection in stderr; got:\n{stderr}" "expected a remote-server rejection in stderr; got:\n{stderr}"
); );
} }
/// RFC-008 stage 1: loading a legacy omnigraph.yaml emits the per-key
/// deprecation block (the migration map applied to THIS file), suppressible
/// via OMNIGRAPH_SUPPRESS_YAML_DEPRECATION.
#[test]
fn legacy_config_load_warns_per_key_and_suppression_silences() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"cli:\n actor: act-x\ngraphs:\n g:\n uri: /tmp/never-opened\n",
)
.unwrap();
// `graphs list --json` loads the config and exits without touching the
// graph URI.
let output = cli()
.current_dir(temp.path())
.arg("graphs")
.arg("list")
.arg("--json")
.output()
.unwrap();
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("deprecated (RFC-008)") && stderr.contains("`cli.actor` -> `operator.actor`"),
"{stderr}"
);
assert!(stderr.contains("config migrate"), "{stderr}");
let output = cli()
.current_dir(temp.path())
.env("OMNIGRAPH_SUPPRESS_YAML_DEPRECATION", "1")
.arg("graphs")
.arg("list")
.arg("--json")
.output()
.unwrap();
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(!stderr.contains("deprecated (RFC-008)"), "{stderr}");
}
/// RFC-008 stage 2: `config migrate` proposes the split read-only, applies
/// it with --write (operator merge never clobbers; cluster.yaml emitted),
/// and a second --write is idempotent.
#[test]
fn config_migrate_splits_legacy_config() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"graphs:\n prod:\n uri: https://graph.example.com\n bearer_token_env: PROD_TOKEN\ncli:\n actor: act-me\n output_format: json\npolicy:\n file: ./top.policy.yaml\n",
)
.unwrap();
let operator_home = tempfile::tempdir().unwrap();
fs::write(
operator_home.path().join("config.yaml"),
"operator:\n actor: act-existing\n",
)
.unwrap();
// Read-only proposal: names both halves, writes nothing.
let output = cli()
.current_dir(temp.path())
.env("OMNIGRAPH_HOME", operator_home.path())
.env("OMNIGRAPH_SUPPRESS_YAML_DEPRECATION", "1")
.arg("config")
.arg("migrate")
.output()
.unwrap();
assert!(output.status.success(), "{output:?}");
let stdout = String::from_utf8_lossy(&output.stdout);
assert!(stdout.contains("team half -> cluster.yaml"), "{stdout}");
assert!(stdout.contains("operator.actor: act-me"), "{stdout}");
assert!(stdout.contains("omnigraph login prod"), "{stdout}");
assert!(!temp.path().join("cluster.yaml").exists());
// --write: cluster.yaml lands; the existing operator actor is KEPT.
let output = cli()
.current_dir(temp.path())
.env("OMNIGRAPH_HOME", operator_home.path())
.env("OMNIGRAPH_SUPPRESS_YAML_DEPRECATION", "1")
.arg("config")
.arg("migrate")
.arg("--write")
.output()
.unwrap();
assert!(output.status.success(), "{output:?}");
let cluster = fs::read_to_string(temp.path().join("cluster.yaml")).unwrap();
assert!(cluster.contains("version: 1") && cluster.contains(" prod:"), "{cluster}");
let operator_text =
fs::read_to_string(operator_home.path().join("config.yaml")).unwrap();
assert!(operator_text.contains("act-existing"), "{operator_text}");
assert!(!operator_text.contains("act-me"), "existing keys win: {operator_text}");
assert!(operator_text.contains("output: json"), "{operator_text}");
assert!(
operator_text.contains("url: https://graph.example.com"),
"{operator_text}"
);
// Second --write: cluster.yaml exists -> proposal file, no clobber.
let output = cli()
.current_dir(temp.path())
.env("OMNIGRAPH_HOME", operator_home.path())
.env("OMNIGRAPH_SUPPRESS_YAML_DEPRECATION", "1")
.arg("config")
.arg("migrate")
.arg("--write")
.output()
.unwrap();
assert!(output.status.success(), "{output:?}");
assert!(temp.path().join("cluster.yaml.proposed").exists());
}
/// RFC-008 stage 4: OMNIGRAPH_NO_LEGACY_CONFIG refuses a present legacy
/// file (pointing at config migrate) but changes nothing on migrated
/// setups with no file.
#[test]
fn strict_mode_refuses_legacy_file_but_not_its_absence() {
let temp = tempdir().unwrap();
fs::write(temp.path().join("omnigraph.yaml"), "cli:\n actor: a\n").unwrap();
let output = cli()
.current_dir(temp.path())
.env("OMNIGRAPH_NO_LEGACY_CONFIG", "1")
.arg("graphs")
.arg("list")
.arg("--json")
.output()
.unwrap();
assert!(!output.status.success());
let stderr = String::from_utf8_lossy(&output.stderr);
assert!(
stderr.contains("OMNIGRAPH_NO_LEGACY_CONFIG") && stderr.contains("config migrate"),
"{stderr}"
);
// Migrated setup (no file): strict mode is a no-op — a config-loading
// command that tolerates empty defaults succeeds.
let clean = tempdir().unwrap();
let output = cli()
.current_dir(clean.path())
.env("OMNIGRAPH_NO_LEGACY_CONFIG", "1")
.arg("queries")
.arg("list")
.arg("--json")
.output()
.unwrap();
assert!(output.status.success(), "{output:?}");
}


@ -25,21 +25,23 @@ const KNOWN_DIVERGENCES: &[&str] = &[
// populated by the rows below as they are written // populated by the rows below as they are written
]; ];
/// One matched setup per row: twin graphs + the SAME Cedar bundle on both /// One matched setup per row: twin graphs + the parity Cedar bundle on the
/// arms (the local arm via --config top-level policy.file; the server via /// served arm. The local (`--store`) arm carries no policy (RFC-011); the
/// its config). Returns everything a row needs. /// bundle is permissive for `act-parity`, so the arms still agree.
struct Parity { struct Parity {
_temp: TempDir, _temp: TempDir,
local: std::path::PathBuf, local: std::path::PathBuf,
local_cfg: std::path::PathBuf,
server: TestServer, server: TestServer,
} }
fn parity() -> Parity { fn parity() -> Parity {
let (temp, local, remote) = twin_graphs(); let (temp, local, remote) = twin_graphs();
let (local_cfg, server_cfg) = parity_configs(temp.path(), &local, &remote); // RFC-011 cluster-only: the remote arm is served from a converged
let server = spawn_server_with_config_env( // cluster directory (one graph, id `parity`), seeded with the same
&server_cfg, // fixture data as the local twin.
let cluster_dir = parity_configs(temp.path(), &local, &remote);
let server = spawn_server_with_cluster_env(
&cluster_dir,
&[( &[(
"OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", "OMNIGRAPH_SERVER_BEARER_TOKENS_JSON",
r#"{"act-parity":"parity-tok"}"#, r#"{"act-parity":"parity-tok"}"#,
@ -48,14 +50,13 @@ fn parity() -> Parity {
Parity { Parity {
_temp: temp, _temp: temp,
local, local,
local_cfg,
server, server,
} }
} }
impl Parity { impl Parity {
fn run(&self, args: &[&str]) -> (std::process::Output, std::process::Output) { fn run(&self, args: &[&str]) -> (std::process::Output, std::process::Output) {
run_both_with_config(&self.local, Some(&self.local_cfg), &self.server.base_url, args) run_both(&self.local, &self.server.base_url, args)
} }
} }
@ -83,7 +84,6 @@ fn parity_query() {
"query", "query",
"--query", "--query",
query.to_str().unwrap(), query.to_str().unwrap(),
"--name",
"get_person", "get_person",
"--params", "--params",
r#"{"name":"Alice"}"#, r#"{"name":"Alice"}"#,
@ -142,7 +142,10 @@ fn parity_branch_create_delete() {
let (l, r) = p.run(&["branch", "create", "--from", "main", "parity-branch", "--json"], let (l, r) = p.run(&["branch", "create", "--from", "main", "parity-branch", "--json"],
); );
assert_parity("branch create", &l, &r); assert_parity("branch create", &l, &r);
let (l, r) = p.run(&["branch", "delete", "parity-branch", "--json"], // `branch delete` is destructive: the served (remote) arm is non-local and
// requires consent (RFC-011 Decision 9), so the row passes `--yes` to test
// the operation itself, not the safety gate. The local arm ignores `--yes`.
let (l, r) = p.run(&["branch", "delete", "parity-branch", "--yes", "--json"],
); );
assert_parity("branch delete", &l, &r); assert_parity("branch delete", &l, &r);
} }
@ -229,7 +232,6 @@ fn parity_errors_share_exit_codes() {
"query", "query",
"--query", "--query",
query.to_str().unwrap(), query.to_str().unwrap(),
"--name",
"no_such_query", "no_such_query",
"--json", "--json",
], ],
@ -249,7 +251,6 @@ fn parity_errors_share_exit_codes() {
"query", "query",
"--query", "--query",
query.to_str().unwrap(), query.to_str().unwrap(),
"--name",
"get_person", "get_person",
"--json", "--json",
], ],
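
Reviewer's sketch: each parity row sends the same verb arguments to both arms and differs only in the addressing flags appended after them (`--store` plus `--as` locally, `--server` plus `--graph` remotely). The function below illustrates that split as plain argument vectors. `arm_argv` and its parameters are hypothetical names, and the values used with it are placeholders, not the suite's constants.

```rust
/// Hypothetical sketch of how one parity row becomes two argument vectors.
/// The shared verb arguments come first; the addressing flags are appended
/// afterwards so they land correctly regardless of subcommand nesting.
fn arm_argv(
    args: &[&str],
    local_store: &str,
    actor: &str,
    server_url: &str,
    graph_id: &str,
) -> (Vec<String>, Vec<String>) {
    let verb: Vec<String> = args.iter().map(|a| a.to_string()).collect();

    // Local arm: embedded store, actor passed explicitly.
    let mut local = verb.clone();
    local.extend(["--store", local_store, "--as", actor].map(String::from));

    // Remote arm: served cluster, so a graph selector is required.
    let mut remote = verb;
    remote.extend(["--server", server_url, "--graph", graph_id].map(String::from));

    (local, remote)
}
```

Because the verb arguments are identical on both sides, a flag such as `--yes` reaches the local arm too, where it is simply ignored.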


@ -339,6 +339,63 @@ impl SystemGraph {
} }
} }
/// A converged cluster directory the server can boot from (`--cluster`),
/// serving one graph seeded with the standard fixture. Holds the temp dir
/// alive for the test's lifetime.
pub struct ClusterFixture {
_temp: TempDir,
dir: PathBuf,
}
impl ClusterFixture {
pub fn path(&self) -> &Path {
&self.dir
}
}
/// Build a converged cluster (RFC-011 cluster-only serving) with a single
/// graph `graph_id`, seeded with the `test.jsonl` fixture so reads return
/// data. When `policy_yaml` is `Some`, the bundle is bound to the graph
/// scope. The server boots from the returned path via `--cluster`.
pub fn converged_loaded_cluster(graph_id: &str, policy_yaml: Option<&str>) -> ClusterFixture {
let temp = tempdir().unwrap();
let dir = temp.path().to_path_buf();
fs::copy(fixture("test.pg"), dir.join("graph.pg")).unwrap();
let policy_block = match policy_yaml {
Some(source) => {
fs::write(dir.join("graph.policy.yaml"), source).unwrap();
format!(
"policies:\n graph:\n file: ./graph.policy.yaml\n applies_to: [{graph_id}]\n"
)
}
None => String::new(),
};
fs::write(
dir.join("cluster.yaml"),
format!(
"version: 1\nmetadata:\n name: sys\nstate:\n backend: cluster\n lock: true\ngraphs:\n {graph_id}:\n schema: ./graph.pg\n{policy_block}"
),
)
.unwrap();
output_success(cli().arg("cluster").arg("import").arg("--config").arg(&dir));
output_success(cli().arg("cluster").arg("apply").arg("--config").arg(&dir));
let served_root = dir.join("graphs").join(format!("{graph_id}.omni"));
output_success(
cli()
.arg("load")
.arg("--data")
.arg(fixture("test.jsonl"))
.arg("--mode")
.arg("overwrite")
.arg(&served_root),
);
ClusterFixture { _temp: temp, dir }
}
// ---- helpers moved from the monolithic tests/cli.rs ---- // ---- helpers moved from the monolithic tests/cli.rs ----
#[allow(unused_imports)] #[allow(unused_imports)]
use lance::Dataset; use lance::Dataset;
@ -788,29 +845,94 @@ rules:
.to_string() .to_string()
} }
/// Per-arm config files carrying the same policy. Both arms address the /// The graph id the parity cluster serves the remote arm under. The
/// graph by positional URI, so the TOP-LEVEL policy.file applies on each /// remote arm addresses it with `--graph PARITY_GRAPH_ID` (RFC-011: the
/// side (single-graph semantics). /// server is cluster-only, so a graph selector is required).
pub fn parity_configs(root: &Path, _local_graph: &Path, remote_graph: &Path) -> (PathBuf, PathBuf) { pub const PARITY_GRAPH_ID: &str = "parity";
/// Build the remote arm's configuration (RFC-011 cluster-only server).
///
/// The remote arm is served from a converged cluster directory whose single
/// graph (id `parity`) carries the parity Cedar bundle (bound to the graph
/// scope). The cluster's derived graph root (`<dir>/graphs/parity.omni`) is
/// seeded with the SAME fixture data as the local twin so the two arms compare
/// like-for-like. The local (`--store`) arm carries no Cedar policy (RFC-011),
/// which is fine because the parity bundle is permissive for `act-parity`.
///
/// `local_graph` is overwritten with a byte-for-byte copy of the cluster's
/// seeded served graph so identity-bearing values that are NOT scrubbed
/// (e.g. `graph_commit_id`, edge `id`s in export) match across the arms —
/// the served graph is the source of truth and the local twin mirrors it.
///
/// Returns the `cluster_dir`. The caller spawns the server with `--cluster`.
pub fn parity_configs(root: &Path, local_graph: &Path, _remote_graph: &Path) -> PathBuf {
let policy = root.join("parity.policy.yaml"); let policy = root.join("parity.policy.yaml");
fs::write(&policy, parity_policy_yaml()).unwrap(); fs::write(&policy, parity_policy_yaml()).unwrap();
let local_cfg = root.join("local.omnigraph.yaml");
// Remote arm: a cluster directory the server boots from. One graph
// (`parity`), schema = the shared fixture, policy bound to the graph.
let cluster_dir = root.join("parity-cluster");
fs::create_dir_all(&cluster_dir).unwrap();
fs::copy(fixture("test.pg"), cluster_dir.join("parity.pg")).unwrap();
fs::copy(&policy, cluster_dir.join("parity.policy.yaml")).unwrap();
fs::write( fs::write(
&local_cfg, cluster_dir.join("cluster.yaml"),
format!("policy:\n file: {}\n", policy.display()),
)
.unwrap();
let server_cfg = root.join("server.omnigraph.yaml");
fs::write(
&server_cfg,
format!( format!(
"server:\n graph: parity\ngraphs:\n parity:\n uri: {}\n policy:\n file: {}\n", r#"version: 1
remote_graph.display(), metadata:
policy.display() name: parity
state:
backend: cluster
lock: true
graphs:
{PARITY_GRAPH_ID}:
schema: ./parity.pg
policies:
parity:
file: ./parity.policy.yaml
applies_to: [{PARITY_GRAPH_ID}]
"#
), ),
) )
.unwrap(); .unwrap();
(local_cfg, server_cfg)
// Converge the cluster (creates the empty graph at the derived root),
// then seed it with the same fixture data the local twin holds.
output_success(
cli()
.arg("cluster")
.arg("import")
.arg("--config")
.arg(&cluster_dir),
);
output_success(
cli()
.arg("cluster")
.arg("apply")
.arg("--config")
.arg(&cluster_dir),
);
let served_root = cluster_dir
.join("graphs")
.join(format!("{PARITY_GRAPH_ID}.omni"));
output_success(
cli()
.arg("load")
.arg("--data")
.arg(fixture("test.jsonl"))
.arg("--mode")
.arg("overwrite")
.arg(&served_root),
);
// Mirror the seeded served graph into the local twin so both arms hold
// identical ULIDs / commit ids (the served graph is authoritative).
if local_graph.exists() {
fs::remove_dir_all(local_graph).unwrap();
}
copy_dir(&served_root, local_graph);
cluster_dir
} }
/// Run one CLI invocation per arm with identical verb args: locally against /// Run one CLI invocation per arm with identical verb args: locally against
@ -821,21 +943,14 @@ pub fn run_both(
local_graph: &Path, local_graph: &Path,
server_url: &str, server_url: &str,
args: &[&str], args: &[&str],
) -> (std::process::Output, std::process::Output) {
run_both_with_config(local_graph, None, server_url, args)
}
pub fn run_both_with_config(
local_graph: &Path,
local_config: Option<&Path>,
server_url: &str,
args: &[&str],
) -> (std::process::Output, std::process::Output) { ) -> (std::process::Output, std::process::Output) {
// Address both arms with GLOBAL flags (`--store` / `--server`) appended after // Address both arms with GLOBAL flags (`--store` / `--server`) appended after
// the verb + its args, so the address is placed correctly regardless of // the verb + its args, so the address is placed correctly regardless of
// subcommand nesting (a positional graph only works for top-level verbs; // subcommand nesting (a positional graph only works for top-level verbs;
// `schema show <graph>` etc. need the global flag). Local = embedded store, // `schema show <graph>` etc. need the global flag). Local = embedded store,
// remote = served. // remote = served. RFC-011: a direct (`--store`) write carries no Cedar
// policy — the parity policy is permissive for `act-parity` on the served
// arm, so the two arms still agree.
let mut local = cli(); let mut local = cli();
local local
.args(args) .args(args)
@ -843,9 +958,6 @@ pub fn run_both_with_config(
.arg(local_graph) .arg(local_graph)
.arg("--as") .arg("--as")
.arg(PARITY_ACTOR); .arg(PARITY_ACTOR);
if let Some(config) = local_config {
local.arg("--config").arg(config);
}
let local_out = local.output().unwrap(); let local_out = local.output().unwrap();
let mut remote = cli(); let mut remote = cli();
@ -853,7 +965,11 @@ pub fn run_both_with_config(
.env("OMNIGRAPH_BEARER_TOKEN", PARITY_TOKEN) .env("OMNIGRAPH_BEARER_TOKEN", PARITY_TOKEN)
.args(args) .args(args)
.arg("--server") .arg("--server")
.arg(server_url); .arg(server_url)
// RFC-011: the parity server is cluster-only (multi-graph), so the
// remote arm must name the graph it addresses.
.arg("--graph")
.arg(PARITY_GRAPH_ID);
let remote_out = remote.output().unwrap(); let remote_out = remote.output().unwrap();
(local_out, remote_out) (local_out, remote_out)
} }
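
The comment in `run_both` explains that both arms put the verb and its arguments first and append the addressing flags afterwards, so nested subcommands such as `schema show` are addressed correctly. A minimal sketch of that ordering, assuming a hypothetical helper `arm_args` (not part of the test harness) and placeholder paths, URLs, and ids:

```rust
/// Hypothetical helper: verb + args first, then global flags appended last.
fn arm_args(verb_args: &[&str], global_flags: &[(&str, &str)]) -> Vec<String> {
    let mut out: Vec<String> = verb_args.iter().map(|s| s.to_string()).collect();
    for (flag, value) in global_flags {
        out.push(flag.to_string());
        out.push(value.to_string());
    }
    out
}

fn main() {
    let verb = ["schema", "show"];
    // Local arm: embedded store addressed with `--store` (placeholder path).
    let local = arm_args(&verb, &[("--store", "/tmp/local.omni"), ("--as", "act-parity")]);
    // Remote arm: served graph addressed with `--server` plus `--graph`.
    let remote = arm_args(
        &verb,
        &[("--server", "http://127.0.0.1:8080"), ("--graph", "parity")],
    );
    println!("{local:?}");
    println!("{remote:?}");
}
```

Because the flags trail the verb, the same `args` slice can be reused unchanged for both arms.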

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -38,8 +38,9 @@ use diff::{
 diff_resources, resource_kind,
 };
 pub use serve::{
-ServingGraph, ServingPolicy, ServingQuery, ServingSnapshot, cluster_root_for_graph_uri,
-read_serving_snapshot, read_serving_snapshot_from_storage, resolve_graph_storage_uri,
+ServingGraph, ServingPolicy, ServingQuery, ServingSnapshot, cluster_graph_ids,
+cluster_root_for_graph_uri, read_serving_snapshot, read_serving_snapshot_from_storage,
+resolve_graph_storage_uri,
 };
 use store::{ClusterStore, StateLockGuard, StateSnapshot};
 use sweep::{


@ -112,32 +112,14 @@ pub async fn cluster_root_for_graph_uri(graph_uri: &str) -> Option<String> {
/// ///
/// `cluster` is a config directory or a storage-root URI (`s3://…`, config-free), /// `cluster` is a config directory or a storage-root URI (`s3://…`, config-free),
/// mirroring the server's `--cluster` dispatch. /// mirroring the server's `--cluster` dispatch.
pub async fn resolve_graph_storage_uri( pub async fn resolve_graph_storage_uri(cluster: &str, graph_id: &str) -> Result<String, Diagnostic> {
cluster: &str, let backend = open_cluster_backend(cluster)?;
graph_id: &str,
) -> Result<String, Diagnostic> {
let backend = if cluster.contains("://") {
ClusterStore::for_storage_root(cluster)?
} else {
ClusterStore::for_config_dir(Path::new(cluster))
};
let mut observations = backend.observations(); let mut observations = backend.observations();
let snapshot = backend.read_state(&mut observations).await?; let snapshot = backend.read_state(&mut observations).await?;
let state = snapshot.state.ok_or_else(|| { let state = snapshot.state.ok_or_else(|| missing_state_diagnostic(cluster))?;
Diagnostic::error(
"cluster_state_missing",
CLUSTER_STATE_FILE,
format!("cluster `{cluster}` has no applied state; run `cluster apply` first"),
)
})?;
let address = format!("graph.{graph_id}"); let address = format!("graph.{graph_id}");
if !state.applied_revision.resources.contains_key(&address) { if !state.applied_revision.resources.contains_key(&address) {
let applied: Vec<&str> = state let applied = applied_graph_ids(&state);
.applied_revision
.resources
.keys()
.filter_map(|a| a.strip_prefix("graph."))
.collect();
return Err(Diagnostic::error( return Err(Diagnostic::error(
"graph_not_applied", "graph_not_applied",
address, address,
@ -151,6 +133,46 @@ pub async fn resolve_graph_storage_uri(
Ok(backend.graph_root(graph_id)) Ok(backend.graph_root(graph_id))
} }
/// List the graph ids applied in a cluster's served state (sorted). Reads the
/// ledger only — no catalog validation — like `resolve_graph_storage_uri`, so
/// it works on a degraded cluster. Used to enumerate candidates when no
/// `--graph` is selected (RFC-011 Decision 7).
pub async fn cluster_graph_ids(cluster: &str) -> Result<Vec<String>, Diagnostic> {
let backend = open_cluster_backend(cluster)?;
let mut observations = backend.observations();
let snapshot = backend.read_state(&mut observations).await?;
let state = snapshot.state.ok_or_else(|| missing_state_diagnostic(cluster))?;
Ok(applied_graph_ids(&state))
}
fn open_cluster_backend(cluster: &str) -> Result<ClusterStore, Diagnostic> {
if cluster.contains("://") {
ClusterStore::for_storage_root(cluster)
} else {
Ok(ClusterStore::for_config_dir(Path::new(cluster)))
}
}
fn missing_state_diagnostic(cluster: &str) -> Diagnostic {
Diagnostic::error(
"cluster_state_missing",
CLUSTER_STATE_FILE,
format!("cluster `{cluster}` has no applied state; run `cluster apply` first"),
)
}
fn applied_graph_ids(state: &crate::types::ClusterState) -> Vec<String> {
let mut ids: Vec<String> = state
.applied_revision
.resources
.keys()
.filter_map(|a| a.strip_prefix("graph."))
.map(str::to_string)
.collect();
ids.sort();
ids
}
/// Split `<root>/graphs/<id>.omni` → `<root>`, gating on the exact cluster /// Split `<root>/graphs/<id>.omni` → `<root>`, gating on the exact cluster
/// graph-layout shape (a single `<id>` segment, no nested path). `None` for /// graph-layout shape (a single `<id>` segment, no nested path). `None` for
/// anything else — no I/O is done for non-cluster-shaped URIs. /// anything else — no I/O is done for non-cluster-shaped URIs.
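
The doc comment above describes a purely syntactic gate: only a URI of the exact shape `<root>/graphs/<id>.omni` yields a cluster root. A minimal sketch of that shape check, assuming a hypothetical synchronous helper `cluster_root` (the real `cluster_root_for_graph_uri` is async, returns `Option<String>`, and its body is not shown in this hunk):

```rust
/// Hypothetical helper: split `<root>/graphs/<id>.omni` into `<root>`.
/// Returns `None` for anything that is not exactly cluster-shaped.
fn cluster_root(graph_uri: &str) -> Option<&str> {
    // The last path segment must be `<id>.omni` with a non-empty id.
    let (head, last) = graph_uri.rsplit_once('/')?;
    let id = last.strip_suffix(".omni")?;
    if id.is_empty() {
        return None;
    }
    // The segment before it must be exactly `graphs` (no nested path).
    let root = head.strip_suffix("/graphs")?;
    if root.is_empty() {
        return None;
    }
    Some(root)
}

fn main() {
    // Placeholder URIs for illustration only.
    println!("{:?}", cluster_root("s3://bucket/prod/graphs/kb.omni"));
    println!("{:?}", cluster_root("/tmp/c/graphs/a/b.omni"));
}
```

Because the check is string-only, non-cluster-shaped URIs are rejected before any storage access is attempted.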


@ -1,14 +1,15 @@
//! Server-level concurrent HTTP benchmark for MR-686 (PR 0 baseline). //! Server-level concurrent HTTP benchmark for MR-686 (PR 0 baseline).
//! //!
//! Drives concurrent `/change` requests against an in-process Omnigraph HTTP //! Drives concurrent `/change` requests against an in-process Omnigraph HTTP
//! server. Measures the global `Arc<RwLock<Omnigraph>>` lock penalty on //! server. Originally written to measure the global `Arc<RwLock<Omnigraph>>`
//! current `main` so PR 1 + PR 2 can be evaluated against a real baseline. //! lock penalty as an MR-686 baseline; that lock has since been removed
//! (engine write APIs are `&self`, the server holds a lockless
//! `Arc<Omnigraph>`), so this now measures the concurrent write path itself
//! (per-`(table, branch)` queue contention + Lance I/O).
//! //!
//! Per the MR-686 plan: this is the load-bearing bench. `Omnigraph::mutate_as` //! Driving the HTTP server is still the right level: an engine-level bench on
//! is `&mut self`, so an engine-level concurrent bench either serializes on the //! a single handle measures Lance contention, not the server's request-path
//! borrow checker (measures nothing) or drives multiple handles (measures Lance //! concurrency.
//! contention, not the server bottleneck). Driving the HTTP server is the only
//! way to measure the actual `RwLock<Omnigraph>` contention this work removes.
//! //!
//! Usage: //! Usage:
//! ```sh //! ```sh

File diff suppressed because it is too large


@ -51,25 +51,15 @@ pub(crate) async fn server_graphs_list(
State(state): State<AppState>, State(state): State<AppState>,
actor: Option<Extension<ResolvedActor>>, actor: Option<Extension<ResolvedActor>>,
) -> std::result::Result<Json<GraphListResponse>, ApiError> { ) -> std::result::Result<Json<GraphListResponse>, ApiError> {
// 405 in single mode — there's no registry to enumerate, and the let registry = &state.routing().registry;
// legacy URL surface didn't expose this endpoint.
let registry = match state.routing() {
GraphRouting::Single { .. } => {
return Err(ApiError::method_not_allowed(
"GET /graphs is only available in multi-graph mode",
));
}
GraphRouting::Multi { registry, .. } => registry,
};
// Server-level Cedar gate. `state.server_policy` is loaded from // Server-level Cedar gate. `state.server_policy` is loaded from the
// `server.policy.file` in `omnigraph.yaml` at startup. When no // cluster-scoped policy bundle at startup. When no server policy is
// server policy is configured, `authorize_request_server` falls // configured, `authorize_request_server` falls through to the MR-723
// through to the MR-723 default-deny semantics (every non-Read // default-deny semantics (every non-Read action denied for an
// action denied for an authenticated actor). `GraphList` is not // authenticated actor). `GraphList` is not `Read`, so without a server
// `Read`, so without a server policy the request gets 403 — which // policy the request gets 403 — which is the right default (don't leak
// is the right default (don't leak the registry until the operator // the registry until the operator explicitly authorizes it).
// explicitly authorizes it).
authorize_request( authorize_request(
actor.as_ref().map(|Extension(actor)| actor), actor.as_ref().map(|Extension(actor)| actor),
state.server_policy.as_deref(), state.server_policy.as_deref(),
@ -93,17 +83,15 @@ pub(crate) async fn server_graphs_list(
} }
pub(crate) async fn server_openapi(State(state): State<AppState>) -> Json<utoipa::openapi::OpenApi> { pub(crate) async fn server_openapi(State(state): State<AppState>) -> Json<utoipa::openapi::OpenApi> {
let mut doc = ApiDoc::openapi(); // `served_openapi` is the single nesting source — the protected
// routes always live under `/graphs/{graph_id}/...` (public/management
// paths `/healthz`, `/graphs` stay flat). Building from it here means
// the runtime spec and the committed `openapi.json` share one nesting
// pass and can't drift.
let mut doc = crate::served_openapi();
if !state.requires_bearer_auth() { if !state.requires_bearer_auth() {
strip_security(&mut doc); strip_security(&mut doc);
} }
// MR-668: in multi mode, the protected routes live under
// `/graphs/{graph_id}/...`. Rewrite the doc so the spec matches
// the routes the router actually serves. Public paths (`/healthz`)
// stay flat in both modes.
if matches!(state.routing(), GraphRouting::Multi { .. }) {
nest_paths_under_cluster_prefix(&mut doc);
}
Json(doc) Json(doc)
} }
@ -248,16 +236,11 @@ pub(crate) async fn require_bearer_auth(
Ok(next.run(request).await) Ok(next.run(request).await)
} }
/// Routing middleware (MR-668). Resolves the active graph for the /// Routing middleware (RFC-011 cluster-only). Resolves the active graph
/// request and injects `Arc<GraphHandle>` as an extension so handlers can /// for the request and injects `Arc<GraphHandle>` as an extension so
/// extract it via `Extension<Arc<GraphHandle>>`. /// handlers can extract it via `Extension<Arc<GraphHandle>>`.
/// ///
/// **Single mode**: the routing field holds the single handle directly. /// Routes are always nested under `/graphs/{graph_id}/...`. The
/// Routes are flat; every request resolves to that handle, regardless
/// of the URI path. No registry walk, no sentinel key, no
/// programmer-error guard.
///
/// **Multi mode**: routes are nested under `/graphs/{graph_id}/...`. The
/// middleware extracts `{graph_id}` from the URI path and looks it up in /// middleware extracts `{graph_id}` from the URI path and looks it up in
/// the registry. Returns 404 if the graph is not registered. /// the registry. Returns 404 if the graph is not registered.
/// ///
@ -268,39 +251,33 @@ pub(crate) async fn resolve_graph_handle(
mut request: Request, mut request: Request,
next: Next, next: Next,
) -> std::result::Result<Response, ApiError> { ) -> std::result::Result<Response, ApiError> {
let handle = match &state.routing { let registry = &state.routing.registry;
GraphRouting::Single { handle } => Arc::clone(handle), // `Router::nest("/graphs/{graph_id}", inner)` rewrites
GraphRouting::Multi { registry, .. } => { // `request.uri().path()` to the inner suffix (e.g. `/snapshot`).
// `Router::nest("/graphs/{graph_id}", inner)` rewrites // The pre-rewrite URI is preserved in the `OriginalUri`
// `request.uri().path()` to the inner suffix (e.g. `/snapshot`). // request extension by axum's router; we read from there to
// The pre-rewrite URI is preserved in the `OriginalUri` // extract `{graph_id}`. Fall back to the current URI only if
// request extension by axum's router; we read from there to // the extension is missing, which shouldn't happen for
// extract `{graph_id}`. Fall back to the current URI only if // nested routes but is safe defensive code.
// the extension is missing, which shouldn't happen for let original_path: String = request
// nested routes but is safe defensive code. .extensions()
let original_path: String = request .get::<OriginalUri>()
.extensions() .map(|OriginalUri(uri)| uri.path().to_string())
.get::<OriginalUri>() .unwrap_or_else(|| request.uri().path().to_string());
.map(|OriginalUri(uri)| uri.path().to_string()) let graph_id_str = original_path
.unwrap_or_else(|| request.uri().path().to_string()); .strip_prefix("/graphs/")
let graph_id_str = original_path .and_then(|rest| rest.split('/').next())
.strip_prefix("/graphs/") .filter(|s| !s.is_empty())
.and_then(|rest| rest.split('/').next()) .ok_or_else(|| {
.filter(|s| !s.is_empty()) ApiError::bad_request("cluster route missing /graphs/{graph_id} prefix".to_string())
.ok_or_else(|| { })?;
ApiError::bad_request( let graph_id = GraphId::try_from(graph_id_str.to_string())
"cluster route missing /graphs/{graph_id} prefix".to_string(), .map_err(|err| ApiError::bad_request(err.to_string()))?;
) let key = GraphKey::cluster(graph_id.clone());
})?; let handle = match registry.get(&key) {
let graph_id = GraphId::try_from(graph_id_str.to_string()) RegistryLookup::Ready(handle) => handle,
.map_err(|err| ApiError::bad_request(err.to_string()))?; RegistryLookup::Gone => {
let key = GraphKey::cluster(graph_id.clone()); return Err(ApiError::not_found(format!("graph '{graph_id}' not found")));
match registry.get(&key) {
RegistryLookup::Ready(handle) => handle,
RegistryLookup::Gone => {
return Err(ApiError::not_found(format!("graph '{graph_id}' not found")));
}
}
} }
}; };
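
The path handling in `resolve_graph_handle` can be tried in isolation. The sketch below lifts the `strip_prefix` / `split` / `filter` chain from the hunk into a hypothetical free function `graph_id_from_path`; the `GraphId` validation and registry lookup that follow in the real handler are left out:

```rust
/// Hypothetical helper: extract `{graph_id}` from a `/graphs/{graph_id}/...` path.
fn graph_id_from_path(path: &str) -> Option<&str> {
    path.strip_prefix("/graphs/")
        .and_then(|rest| rest.split('/').next())
        .filter(|s| !s.is_empty())
}

fn main() {
    // Example paths are illustrative.
    for path in ["/graphs/kb/snapshot", "/graphs/kb", "/graphs/", "/healthz"] {
        println!("{path} -> {:?}", graph_id_from_path(path));
    }
}
```

In the handler, a `None` here maps to the `400 Bad Request` about the missing `/graphs/{graph_id}` prefix, while an id that is well-formed but unregistered yields `404`.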
@ -382,22 +359,25 @@ pub(crate) fn authorize(
 // runtime state means the docstring contract on
 // `server_graphs_list` ("don't leak the registry until the
 // operator explicitly authorizes it") holds uniformly; the
-// operator's only path to enabling it is configuring an
-// explicit `server.policy.file` in omnigraph.yaml.
+// operator's only path to enabling it is configuring a
+// cluster-scoped policy bundle, applying the cluster, and
+// restarting the server.
 if request.action.resource_kind() == PolicyResourceKind::Server {
 return Ok(Authz::Denied(
-"server-scoped actions require an explicit `server.policy.file` \
-configured in omnigraph.yaml the management surface is closed \
-by default in every runtime state, including --unauthenticated, \
-so that server topology is never exposed without operator opt-in."
+"server-scoped actions require an explicit cluster policy bundle \
+applied with `omnigraph cluster apply` and served after restart \
+the management surface is closed by default in every runtime state, \
+including --unauthenticated, so that server topology is never exposed \
+without operator opt-in."
 .to_string(),
 ));
 }
 if actor.is_some() && request.action != PolicyAction::Read {
 return Ok(Authz::Denied(
 "server runs in default-deny mode (bearer tokens configured but no \
-policy file). Only `read` actions are permitted; configure \
-`policy.file` in omnigraph.yaml to enable other actions."
+applied policy bundle). Only `read` actions are permitted; configure \
+a graph or cluster policy bundle in the cluster config, run \
+`omnigraph cluster apply`, and restart the server to enable other actions."
 .to_string(),
 ));
 }
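
The two no-policy checks above can be summarized as a small decision function. This is a sketch under stated assumptions: the `Action` and `ResourceKind` enums are simplified stand-ins for `PolicyAction` and `PolicyResourceKind`, and the final `Ok(())` is an assumption, since the code after these checks is outside the hunk:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Action {
    Read,
    Change,
    GraphList,
}

#[derive(Debug, PartialEq)]
enum ResourceKind {
    Graph,
    Server,
}

impl Action {
    fn resource_kind(self) -> ResourceKind {
        match self {
            Action::GraphList => ResourceKind::Server,
            Action::Read | Action::Change => ResourceKind::Graph,
        }
    }
}

/// Hypothetical model of the fallback when no policy bundle is loaded.
fn authorize_without_policy(has_actor: bool, action: Action) -> Result<(), &'static str> {
    // Server-scoped actions are closed in every runtime state.
    if action.resource_kind() == ResourceKind::Server {
        return Err("server-scoped action requires an explicit policy bundle");
    }
    // Authenticated actors get default-deny for everything except reads.
    if has_actor && action != Action::Read {
        return Err("default-deny: only read actions are permitted");
    }
    // Assumption: whatever passes both checks is allowed.
    Ok(())
}

fn main() {
    println!("{:?}", authorize_without_policy(true, Action::Read));
    println!("{:?}", authorize_without_policy(true, Action::Change));
    println!("{:?}", authorize_without_policy(false, Action::GraphList));
}
```

Note the ordering: the server-scope check runs first, so `GraphList` is denied even when no actor is present.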
@ -510,7 +490,7 @@ pub(crate) fn deprecation_headers(successor_link: &'static str) -> [(HeaderName,
operation_id = "read", operation_id = "read",
request_body = ReadRequest, request_body = ReadRequest,
responses( responses(
(status = 200, description = "Query results (response includes `Deprecation: true` + `Link: </query>; rel=\"successor-version\"`)", body = ReadOutput), (status = 200, description = "Query results (response includes `Deprecation: true` + `Link: <query>; rel=\"successor-version\"`)", body = ReadOutput),
(status = 400, description = "Bad request", body = ErrorOutput), (status = 400, description = "Bad request", body = ErrorOutput),
(status = 401, description = "Unauthorized", body = ErrorOutput), (status = 401, description = "Unauthorized", body = ErrorOutput),
(status = 403, description = "Forbidden", body = ErrorOutput), (status = 403, description = "Forbidden", body = ErrorOutput),
@ -524,7 +504,7 @@ pub(crate) fn deprecation_headers(successor_link: &'static str) -> [(HeaderName,
/// route is kept indefinitely for byte-stable back-compat. New integrations /// route is kept indefinitely for byte-stable back-compat. New integrations
/// should target `POST /query`, which has clean field names (`query` / /// should target `POST /query`, which has clean field names (`query` /
/// `name`) and a 400-on-mutation guard. Responses from this route include /// `name`) and a 400-on-mutation guard. Responses from this route include
/// `Deprecation: true` and `Link: </query>; rel="successor-version"` /// `Deprecation: true` and `Link: <query>; rel="successor-version"`
/// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the
/// signal. /// signal.
pub(crate) async fn server_read( pub(crate) async fn server_read(
@ -544,7 +524,7 @@ pub(crate) async fn server_read(
) )
.await?; .await?;
Ok(( Ok((
deprecation_headers("</query>; rel=\"successor-version\""), deprecation_headers("<query>; rel=\"successor-version\""),
Json(api::read_output(selected_name, &target, result)), Json(api::read_output(selected_name, &target, result)),
)) ))
} }
@ -793,7 +773,7 @@ pub(crate) async fn run_query(
operation_id = "change", operation_id = "change",
request_body = ChangeRequest, request_body = ChangeRequest,
responses( responses(
(status = 200, description = "Mutation results (response includes `Deprecation: true` + `Link: </mutate>; rel=\"successor-version\"`)", body = ChangeOutput), (status = 200, description = "Mutation results (response includes `Deprecation: true` + `Link: <mutate>; rel=\"successor-version\"`)", body = ChangeOutput),
(status = 400, description = "Bad request", body = ErrorOutput), (status = 400, description = "Bad request", body = ErrorOutput),
(status = 401, description = "Unauthorized", body = ErrorOutput), (status = 401, description = "Unauthorized", body = ErrorOutput),
(status = 403, description = "Forbidden", body = ErrorOutput), (status = 403, description = "Forbidden", body = ErrorOutput),
@ -809,7 +789,7 @@ pub(crate) async fn run_query(
/// kept indefinitely for back-compat. New integrations should target /// kept indefinitely for back-compat. New integrations should target
/// `POST /mutate`, which has identical semantics and a name that pairs /// `POST /mutate`, which has identical semantics and a name that pairs
/// cleanly with `POST /query`. Responses from this route include /// cleanly with `POST /query`. Responses from this route include
/// `Deprecation: true` and `Link: </mutate>; rel="successor-version"` /// `Deprecation: true` and `Link: <mutate>; rel="successor-version"`
/// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the
/// signal. /// signal.
pub(crate) async fn server_change( pub(crate) async fn server_change(
@ -830,7 +810,7 @@ pub(crate) async fn server_change(
) )
.await?; .await?;
Ok(( Ok((
deprecation_headers("</mutate>; rel=\"successor-version\""), deprecation_headers("<mutate>; rel=\"successor-version\""),
Json(output), Json(output),
)) ))
} }
@ -980,6 +960,22 @@ pub(crate) async fn server_invoke_query(
let query_name = stored.name.clone(); let query_name = stored.name.clone();
let is_mutation = stored.is_mutation(); let is_mutation = stored.is_mutation();
// RFC-011 D3: the CLI verb asserts the stored query's kind. `query <name>`
// sends `expect_mutation: false`, `mutate <name>` sends `true`; a mismatch
// is rejected here so the wrong verb errors instead of silently running.
if let Some(expected) = req.expect_mutation {
if expected != is_mutation {
let (actual, verb) = if is_mutation {
("mutation", "mutate")
} else {
("read", "query")
};
return Err(ApiError::bad_request(format!(
"'{query_name}' is a {actual} — use omnigraph {verb} {query_name}"
)));
}
}
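
The verb/kind assertion added here is easy to exercise on its own. A minimal sketch, assuming a hypothetical free function `check_verb` that mirrors the branch logic; the error wording is approximated rather than copied, and `report_weekly` is a made-up query name:

```rust
/// Hypothetical helper mirroring the `expect_mutation` check.
/// `None` means the caller made no assertion, so nothing is rejected.
fn check_verb(
    query_name: &str,
    is_mutation: bool,
    expect_mutation: Option<bool>,
) -> Result<(), String> {
    if let Some(expected) = expect_mutation {
        if expected != is_mutation {
            // Name the actual kind and the verb that matches it.
            let (actual, verb) = if is_mutation {
                ("mutation", "mutate")
            } else {
                ("read", "query")
            };
            return Err(format!(
                "'{query_name}' is a {actual}; use omnigraph {verb} {query_name}"
            ));
        }
    }
    Ok(())
}

fn main() {
    // `query report_weekly` against a stored mutation is rejected.
    println!("{:?}", check_verb("report_weekly", true, Some(false)));
    // Older clients that omit the field are unaffected.
    println!("{:?}", check_verb("report_weekly", true, None));
}
```

Keeping the field optional means a request without `expect_mutation` behaves exactly as before the change.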
info!( info!(
graph = %handle.uri, graph = %handle.uri,
actor = ?actor_ref.map(|a| a.actor_id.as_ref()), actor = ?actor_ref.map(|a| a.actor_id.as_ref()),
@ -1117,12 +1113,16 @@ pub(crate) async fn server_schema_get(
(status = 400, description = "Bad request", body = ErrorOutput), (status = 400, description = "Bad request", body = ErrorOutput),
(status = 401, description = "Unauthorized", body = ErrorOutput), (status = 401, description = "Unauthorized", body = ErrorOutput),
(status = 403, description = "Forbidden", body = ErrorOutput), (status = 403, description = "Forbidden", body = ErrorOutput),
(status = 409, description = "Schema apply is disabled for cluster-backed serving; use `omnigraph cluster apply` and restart", body = ErrorOutput),
(status = 429, description = "Per-actor admission cap exceeded; honor `Retry-After` header", body = ErrorOutput), (status = 429, description = "Per-actor admission cap exceeded; honor `Retry-After` header", body = ErrorOutput),
), ),
security(("bearer_token" = [])), security(("bearer_token" = [])),
)] )]
/// Apply a schema migration. /// Apply a schema migration.
/// ///
/// Cluster-backed servers reject this route with `409 Conflict`; operators
/// must apply schema changes through `omnigraph cluster apply` and restart.
///
/// Diffs `schema_source` against the current schema and applies the resulting /// Diffs `schema_source` against the current schema and applies the resulting
/// migration steps (add/drop type, add/drop column, etc.). **Destructive**: /// migration steps (add/drop type, add/drop column, etc.). **Destructive**:
/// some steps drop data. Returns the list of steps applied; if `applied` is /// some steps drop data. Returns the list of steps applied; if `applied` is
@ -1149,6 +1149,17 @@ pub(crate) async fn server_schema_apply(
target_branch: Some("main".to_string()), target_branch: Some("main".to_string()),
}, },
)?; )?;
// Disable HTTP schema apply on cluster-backed serving AFTER the Cedar gate,
// so an unauthorized actor gets a 403 (not a 409 that would disclose the
// server is cluster-backed): 401 → 403 → 409, never leak topology before
// authorization. An authorized actor gets the actionable 409 signpost.
if state.routing().config_path.is_some() {
return Err(ApiError::conflict(
"server-side schema apply is disabled for cluster-backed serving; \
update the cluster config, run `omnigraph cluster apply`, and restart \
the server.",
));
}
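
The comment fixes an ordering: 401, then 403, then 409, so the cluster-backed signpost is only visible to callers who already passed authorization. A sketch of that precedence, assuming a hypothetical function `schema_apply_status` with boolean inputs standing in for the bearer check, the Cedar gate, and the cluster-backed test:

```rust
/// Hypothetical model of the status precedence for schema apply.
fn schema_apply_status(authenticated: bool, authorized: bool, cluster_backed: bool) -> u16 {
    if !authenticated {
        return 401; // bearer auth fails first
    }
    if !authorized {
        return 403; // policy gate runs before any topology-revealing check
    }
    if cluster_backed {
        return 409; // only an authorized actor learns the server is cluster-backed
    }
    200
}

fn main() {
    println!("{}", schema_apply_status(true, false, true)); // unauthorized on a cluster
    println!("{}", schema_apply_status(true, true, true)); // authorized on a cluster
}
```

An unauthorized caller sees the same 403 whether or not the server is cluster-backed, which is the property the comment is protecting.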
let est_bytes = request.schema_source.len() as u64; let est_bytes = request.schema_source.len() as u64;
let _admission = state let _admission = state
.workload .workload
@ -1180,6 +1191,25 @@ pub(crate) async fn server_schema_apply(
.await .await
.map_err(ApiError::from_omni)? .map_err(ApiError::from_omni)?
}; };
// Prompt index convergence (iss-848): schema apply records `@index` intent
// but defers the physical build. On a long-lived server, materialize it
// promptly rather than waiting for the next `optimize` cron — spawned
// detached so it never blocks or fails the apply response. Best-effort: a
// failure is logged and the index still converges on the next optimize.
// The CLI is one-shot, so it has no equivalent; its convergence path is the
// operator's optimize cadence.
if result.applied {
let engine = Arc::clone(&handle.engine);
tokio::spawn(async move {
if let Err(err) = engine.ensure_indices().await {
tracing::warn!(
target: "omnigraph::server",
error = %err,
"post-apply ensure_indices failed; indexes will converge on the next optimize",
);
}
});
}
Ok(Json(schema_apply_output(handle.uri.as_str(), result))) Ok(Json(schema_apply_output(handle.uri.as_str(), result)))
} }
@ -1311,7 +1341,7 @@ pub(crate) async fn server_load(
operation_id = "ingest", operation_id = "ingest",
request_body = IngestRequest, request_body = IngestRequest,
responses( responses(
(status = 200, description = "Load results (response includes `Deprecation: true` + `Link: </load>; rel=\"successor-version\"`)", body = IngestOutput), (status = 200, description = "Load results (response includes `Deprecation: true` + `Link: <load>; rel=\"successor-version\"`)", body = IngestOutput),
(status = 400, description = "Bad request", body = ErrorOutput), (status = 400, description = "Bad request", body = ErrorOutput),
(status = 401, description = "Unauthorized", body = ErrorOutput), (status = 401, description = "Unauthorized", body = ErrorOutput),
(status = 403, description = "Forbidden", body = ErrorOutput), (status = 403, description = "Forbidden", body = ErrorOutput),
@ -1325,7 +1355,7 @@ pub(crate) async fn server_load(
/// Bulk-load NDJSON data into a branch. Behavior is unchanged; the route is /// Bulk-load NDJSON data into a branch. Behavior is unchanged; the route is
/// kept indefinitely for back-compat. New integrations should target /// kept indefinitely for back-compat. New integrations should target
/// `POST /load`, which has identical semantics. Responses from this route /// `POST /load`, which has identical semantics. Responses from this route
/// include `Deprecation: true` and `Link: </load>; rel="successor-version"` /// include `Deprecation: true` and `Link: <load>; rel="successor-version"`
/// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the signal. /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the signal.
pub(crate) async fn server_ingest( pub(crate) async fn server_ingest(
State(state): State<AppState>, State(state): State<AppState>,
@ -1341,7 +1371,7 @@ pub(crate) async fn server_ingest(
) )
.await?; .await?;
Ok(( Ok((
deprecation_headers("</load>; rel=\"successor-version\""), deprecation_headers("<load>; rel=\"successor-version\""),
Json(output), Json(output),
)) ))
} }
@ -1725,4 +1755,3 @@ pub(crate) fn query_params_from_json(
json_params_to_param_map(params_json, query_params, JsonParamMode::Standard) json_params_to_param_map(params_json, query_params, JsonParamMode::Standard)
.map_err(|err| color_eyre::eyre::eyre!(err.to_string())) .map_err(|err| color_eyre::eyre::eyre!(err.to_string()))
} }


@ -1,11 +1,10 @@
 pub mod api;
 mod handlers;
 mod settings;
-pub use settings::{load_server_settings, classify_server_runtime_state, server_config_is_multi, ServerRuntimeState};
+pub use settings::{load_server_settings, classify_server_runtime_state, ServerRuntimeState};
 use settings::*;
 use handlers::*;
 pub mod auth;
-pub mod config;
 pub mod graph_id;
 pub mod identity;
 pub mod policy;
@ -46,11 +45,6 @@ use axum::response::{IntoResponse, Response};
use axum::routing::{delete, get, post}; use axum::routing::{delete, get, post};
use axum::{Json, Router}; use axum::{Json, Router};
use color_eyre::eyre::{Result, WrapErr, bail, eyre}; use color_eyre::eyre::{Result, WrapErr, bail, eyre};
pub use config::{
AliasCommand, AliasConfig, CliDefaults, DEFAULT_CONFIG_FILE, OmnigraphConfig, PolicySettings,
ProjectConfig, QueryDefaults, ReadOutputFormat, ServerDefaults, TableCellLayout, TargetConfig,
graph_resource_id_for_selection, load_config,
};
use futures::stream; use futures::stream;
use omnigraph::db::{Omnigraph, ReadTarget}; use omnigraph::db::{Omnigraph, ReadTarget};
use omnigraph::error::{ManifestConflictDetails, ManifestErrorKind, OmniError}; use omnigraph::error::{ManifestConflictDetails, ManifestErrorKind, OmniError};
@ -122,6 +116,20 @@ fn hash_bearer_token(token: &str) -> BearerTokenHash {
)] )]
pub struct ApiDoc; pub struct ApiDoc;
/// The canonical served OpenAPI shape (RFC-011 cluster-only): the static
/// `ApiDoc` with every protected path nested under `/graphs/{graph_id}/…`
/// and `cluster_`-prefixed operation ids. `/healthz` and `/graphs` stay
/// flat. This is the single source of nesting — both the runtime
/// `server_openapi` handler and the committed `openapi.json` derive from
/// it, so the published spec can never describe routes the server does
/// not serve. The handler additionally strips security in open mode; the
/// committed spec retains it.
pub fn served_openapi() -> utoipa::openapi::OpenApi {
let mut doc = ApiDoc::openapi();
handlers::nest_paths_under_cluster_prefix(&mut doc);
doc
}
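
The doc comment on `served_openapi` describes the nesting rule: protected paths move under `/graphs/{graph_id}/…` with `cluster_`-prefixed operation ids, while `/healthz` and `/graphs` stay flat. The body of `nest_paths_under_cluster_prefix` is not shown in this diff, so the following is only a sketch of that rule over a plain path-to-operation-id map, with a hypothetical function name and made-up example entries:

```rust
use std::collections::BTreeMap;

/// Paths that stay flat according to the doc comment.
const FLAT_PATHS: [&str; 2] = ["/healthz", "/graphs"];

/// Hypothetical model: map of path -> operation id, nested per the rule above.
fn nest_under_cluster_prefix(flat: &BTreeMap<String, String>) -> BTreeMap<String, String> {
    flat.iter()
        .map(|(path, op_id)| {
            if FLAT_PATHS.contains(&path.as_str()) {
                // Public/management paths are left untouched.
                (path.clone(), op_id.clone())
            } else {
                (
                    format!("/graphs/{{graph_id}}{path}"),
                    format!("cluster_{op_id}"),
                )
            }
        })
        .collect()
}

fn main() {
    let mut flat = BTreeMap::new();
    flat.insert("/query".to_string(), "query".to_string());
    flat.insert("/healthz".to_string(), "healthz".to_string());
    for (path, op_id) in nest_under_cluster_prefix(&flat) {
        println!("{path} -> {op_id}");
    }
}
```

Doing the rewrite in one function that both the handler and the committed spec call is what keeps the two from drifting.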
struct SecurityAddon; struct SecurityAddon;
impl utoipa::Modify for SecurityAddon { impl utoipa::Modify for SecurityAddon {
@ -143,11 +151,10 @@ const SERVER_SOURCE_VERSION: Option<&str> = option_env!("OMNIGRAPH_SOURCE_VERSIO
#[derive(Debug, Clone)] #[derive(Debug, Clone)]
pub struct ServerConfig { pub struct ServerConfig {
/// Server topology + the graphs to open at startup. Single-mode /// Server topology + the graphs to open at startup. RFC-011
/// invocations (`omnigraph-server <URI>` or `--target <name>`) /// cluster-only: the server always boots from a cluster
/// produce `ServerConfigMode::Single`; multi-mode invocations /// (`--cluster <dir | s3://…>`) and serves N graphs under cluster
/// (`--config omnigraph.yaml` with a non-empty `graphs:` map and /// routes.
/// no single-mode selector) produce `ServerConfigMode::Multi`.
pub mode: ServerConfigMode, pub mode: ServerConfigMode,
pub bind: String, pub bind: String,
/// Operator opt-in for fully-unauthenticated dev mode (MR-723). /// Operator opt-in for fully-unauthenticated dev mode (MR-723).
@ -161,49 +168,33 @@ pub struct ServerConfig {
pub allow_unauthenticated: bool, pub allow_unauthenticated: bool,
} }
/// What `load_server_settings` produces after applying the four-rule /// What `load_server_settings` produces. RFC-011 cluster-only: the
/// mode inference matrix (MR-668 decision 2). /// server always boots from a cluster's applied revision into a
/// multi-graph deployment (N ≥ 1 graphs).
#[derive(Debug, Clone)] #[derive(Debug, Clone)]
pub enum ServerConfigMode { pub enum ServerConfigMode {
/// Legacy invocation — one graph at the given URI. Either: /// Cluster boot — `--cluster <dir | s3://…>` resolves the applied
/// * `omnigraph-server <URI>` (CLI positional), or /// revision into per-graph startup configs plus an optional
/// * `omnigraph-server --target <name> --config omnigraph.yaml`, or /// server-level policy.
/// * `omnigraph-server --config omnigraph.yaml` with `server.graph`
/// set to a named target.
Single {
uri: String,
/// Cedar graph resource id for the single graph. A named selection
/// uses the graph name; an anonymous URI uses the normalized URI to
/// preserve legacy single-graph policy identity.
graph_id: String,
/// Top-level `policy.file` (single-graph Cedar policy).
policy_file: Option<PathBuf>,
/// Top-level stored-query registry, loaded and identity-checked
/// at settings-build time; type-checked against the schema when
/// the engine opens.
queries: QueryRegistry,
},
/// Multi-graph invocation — `--config omnigraph.yaml` with a
/// non-empty `graphs:` map and no single-mode selector.
Multi { Multi {
/// Per-graph startup configs, sorted by graph id (BTreeMap /// Per-graph startup configs, sorted by graph id (BTreeMap
/// iteration order). The parallel-open loop iterates this. /// iteration order). The parallel-open loop iterates this.
graphs: Vec<GraphStartupConfig>, graphs: Vec<GraphStartupConfig>,
/// Path to the config file the server was started from. Kept on /// The cluster boot source (config directory or storage root).
/// the mode so future runtime mutation (deferred — see release /// Kept on the mode so future runtime mutation (deferred — see
/// notes) can locate the source of truth without re-parsing CLI /// release notes) can locate the source of truth without
/// args. /// re-parsing CLI args.
config_path: PathBuf, config_path: PathBuf,
/// `server.policy.file` (server-level Cedar policy for the /// Server-level Cedar policy for the management endpoints
/// management endpoints). Wired into `GET /graphs` authorization. /// (`GET /graphs`). Wired into `GET /graphs` authorization.
server_policy: Option<PolicySource>, server_policy: Option<PolicySource>,
}, },
} }
/// Where a Cedar policy bundle comes from at startup. File-based for /// Where a Cedar policy bundle comes from at startup. Cluster-local files are
/// omnigraph.yaml deployments; inline (digest-verified catalog content) /// used during config application; inline digest-verified catalog content is
/// for cluster-mode boots, where the catalog may live on object storage /// used for serving, where the catalog may live on object storage and the
/// and the server must not re-read mutable state after the snapshot. /// server must not re-read mutable state after the snapshot.
#[derive(Debug, Clone)] #[derive(Debug, Clone)]
pub enum PolicySource { pub enum PolicySource {
File(PathBuf), File(PathBuf),
@ -227,36 +218,25 @@ pub struct GraphStartupConfig {
pub queries: QueryRegistry, pub queries: QueryRegistry,
} }
/// Runtime routing for the server. Single mode = legacy /// Runtime routing for the server (RFC-011 cluster-only). Every
/// `omnigraph-server <URI>` invocation, one graph, flat HTTP routes. /// deployment serves cluster routes (`/graphs/{graph_id}/...`) backed by
/// Multi mode = `--config omnigraph.yaml` with a non-empty `graphs:` /// a registry of N graphs (N ≥ 1). The single-graph convenience
/// map, N graphs, cluster routes (`/graphs/{graph_id}/...`). Mode is /// constructors build a one-graph registry keyed by `default`; the
/// determined at startup by `load_server_settings`. /// cluster boot path builds an N-graph registry. There is no longer a
/// flat-route mode.
/// ///
/// In single mode the handle lives here directly — there is no /// `config_path` is the boot source (the cluster directory or storage
/// registry, no sentinel key, no walk-and-assert. In multi mode the /// root); preserved here so future runtime mutation (deferred) can find
/// registry carries N handles and the middleware dispatches on the /// the source of truth without re-parsing CLI args. The server treats
/// URL's `{graph_id}` segment. /// the source as operator-owned and never writes it.
/// ///
/// Both modes share the same handler bodies — the routing middleware /// All handler bodies are mode-agnostic — the routing middleware
/// (`resolve_graph_handle`) injects `Arc<GraphHandle>` as a request /// (`resolve_graph_handle`) injects `Arc<GraphHandle>` as a request
/// extension so handlers never see the routing discriminator. /// extension by looking up the `{graph_id}` URL segment in the registry.
#[derive(Clone)] #[derive(Clone)]
pub enum GraphRouting { pub struct GraphRouting {
/// Single-graph deployment: one handle, flat routes (`/snapshot`, pub registry: Arc<GraphRegistry>,
/// `/read`, …). The `handle.uri` field carries the URI the engine pub config_path: Option<PathBuf>,
/// was opened from. Backward compatible with v0.6.0 deployments.
Single { handle: Arc<GraphHandle> },
/// Multi-graph deployment: many handles, cluster routes
/// (`/graphs/{graph_id}/...`). `config_path` is the `omnigraph.yaml`
/// the server reads at startup; preserved here so future runtime
/// mutation (deferred) can find the source of truth without
/// re-parsing CLI args. The server treats the file as
/// operator-owned and never writes it.
Multi {
registry: Arc<GraphRegistry>,
config_path: Option<PathBuf>,
},
} }
#[derive(Clone)] #[derive(Clone)]
@ -272,12 +252,10 @@ pub struct AppState {
/// see MR-668 decision Q6. /// see MR-668 decision Q6.
workload: Arc<workload::WorkloadController>, workload: Arc<workload::WorkloadController>,
bearer_tokens: Arc<[(BearerTokenHash, Arc<str>)]>, bearer_tokens: Arc<[(BearerTokenHash, Arc<str>)]>,
/// Server-level Cedar policy. Used by management endpoints (`POST /// Server-level Cedar policy. Used by management endpoints (`GET
/// /graphs`, `GET /graphs`) which act on the registry resource, /// /graphs`) which act on the registry resource, not on a per-graph
/// not on a per-graph resource. Loaded from `server.policy.file` /// resource. Loaded from the cluster-scoped policy binding when
/// in `omnigraph.yaml`. `None` outside multi mode and when no /// configured. Per-graph policies live on each `GraphHandle.policy`.
/// server policy is configured. Per-graph policies live on each
/// `GraphHandle.policy`.
server_policy: Option<Arc<PolicyEngine>>, server_policy: Option<Arc<PolicyEngine>>,
} }
@ -502,11 +480,13 @@ impl AppState {
)) ))
} }
/// Single-mode shared construction: wraps the bare engine + per-graph /// Single-graph convenience construction (RFC-011 cluster-only):
/// policy in a `GraphHandle` carried directly by `GraphRouting::Single`. /// wraps the bare engine + per-graph policy in a `GraphHandle` keyed
/// Per-graph policy enforcement on the engine (MR-722) is re-applied /// by `default`, then builds a one-graph registry so the deployment
/// via `Omnigraph::with_policy` so HTTP and engine layers can never /// serves the same `/graphs/{graph_id}/...` cluster routes as any
/// diverge. /// other. Per-graph policy enforcement on the engine (MR-722) is
/// re-applied via `Omnigraph::with_policy` so HTTP and engine layers
/// can never diverge.
fn build_single_mode( fn build_single_mode(
uri: String, uri: String,
db: Omnigraph, db: Omnigraph,
@ -525,18 +505,13 @@ impl AppState {
} else { } else {
db db
}; };
// `GraphHandle.key` is required by the struct, but in single // The convenience constructors address the single graph by the
// mode it is never a registry key (there's no registry) and // reserved id `default` — both the registry key and the URL
// never compared against user input (routes are flat, no // segment (`/graphs/default/...`).
// `{graph_id}` parameter). The label appears only in tracing
// output from `resolve_graph_handle`. The literal below is a
// log label, not a routing key — when the future cluster
// catalog ships, single mode may carry the catalog-assigned
// id here instead.
let uri = normalize_root_uri(&uri).unwrap_or(uri); let uri = normalize_root_uri(&uri).unwrap_or(uri);
let key = GraphKey::cluster( let graph_id =
GraphId::try_from("default").expect("'default' is a valid GraphId log label"), GraphId::try_from("default").expect("'default' is a valid GraphId");
); let key = GraphKey::cluster(graph_id);
let handle = Arc::new(GraphHandle { let handle = Arc::new(GraphHandle {
key, key,
uri, uri,
@ -544,8 +519,15 @@ impl AppState {
policy: policy_engine, policy: policy_engine,
queries, queries,
}); });
let registry = Arc::new(
GraphRegistry::from_handles(vec![handle])
.expect("a single handle never collides on graph id"),
);
Self { Self {
routing: GraphRouting::Single { handle }, routing: GraphRouting {
registry,
config_path: None,
},
workload, workload,
bearer_tokens, bearer_tokens,
server_policy: None, server_policy: None,
@ -553,12 +535,11 @@ impl AppState {
} }
/// Multi-mode constructor — used by the startup loop. Operators /// Multi-mode constructor — used by the startup loop. Operators
/// reach this by invoking `omnigraph-server --config omnigraph.yaml` /// reach this by invoking `omnigraph-server --cluster <dir|s3://...>`.
/// with a non-empty `graphs:` map.
/// ///
/// Caller supplies the already-opened `GraphHandle`s and (optionally) /// Caller supplies the already-opened `GraphHandle`s and (optionally)
/// the path to the source config file. `server_policy` is loaded /// the path to the source cluster. `server_policy` is loaded from the
/// from `server.policy.file` if configured. /// cluster-scoped policy binding if configured.
pub fn new_multi( pub fn new_multi(
handles: Vec<Arc<GraphHandle>>, handles: Vec<Arc<GraphHandle>>,
bearer_tokens: Vec<(String, String)>, bearer_tokens: Vec<(String, String)>,
@ -569,7 +550,7 @@ impl AppState {
let bearer_tokens = hash_bearer_tokens(bearer_tokens); let bearer_tokens = hash_bearer_tokens(bearer_tokens);
let registry = Arc::new(GraphRegistry::from_handles(handles)?); let registry = Arc::new(GraphRegistry::from_handles(handles)?);
Ok(Self { Ok(Self {
routing: GraphRouting::Multi { routing: GraphRouting {
registry, registry,
config_path, config_path,
}, },
@ -581,9 +562,7 @@ impl AppState {
/// Runtime routing accessor. Handlers don't typically inspect this — /// Runtime routing accessor. Handlers don't typically inspect this —
/// they extract `Arc<GraphHandle>` via the routing middleware — but /// they extract `Arc<GraphHandle>` via the routing middleware — but
/// `build_app` matches on it to decide flat vs nested route /// `server_graphs_list` reads the registry through it.
/// mounting, and a handful of management endpoints (`GET /graphs`,
/// the OpenAPI cluster rewrite) match on the discriminant.
pub fn routing(&self) -> &GraphRouting { pub fn routing(&self) -> &GraphRouting {
&self.routing &self.routing
} }
@ -597,13 +576,9 @@ impl AppState {
} }
// Any per-graph policy also requires auth — otherwise the // Any per-graph policy also requires auth — otherwise the
// policy gate would receive unauthenticated requests. Reading // policy gate would receive unauthenticated requests. Reading
// from `routing` is O(1) in both arms: single mode is a direct // the cached `any_per_graph_policy` flag off the registry
// `handle.policy.is_some()` check, multi mode reads the // snapshot is O(1).
// cached `any_per_graph_policy` flag on the registry snapshot. self.routing.registry.snapshot_ref().any_per_graph_policy
match &self.routing {
GraphRouting::Single { handle } => handle.policy.is_some(),
GraphRouting::Multi { registry, .. } => registry.snapshot_ref().any_per_graph_policy,
}
} }
fn authenticate_bearer_token(&self, provided_token: &str) -> Option<ResolvedActor> { fn authenticate_bearer_token(&self, provided_token: &str) -> Option<ResolvedActor> {
@ -898,18 +873,6 @@ fn validate_and_attach(
}) })
} }
/// Format every load error (parse / identity failure) into a multi-line
/// boot-abort message.
fn format_registry_load_errors(label: &str, errors: &[queries::LoadError]) -> String {
let joined = errors
.iter()
.map(|e| e.to_string())
.collect::<Vec<_>>()
.join("\n ");
format!("graph '{label}': stored-query registry failed to load:\n {joined}")
}
pub fn build_app(state: AppState) -> Router { pub fn build_app(state: AppState) -> Router {
// The per-graph protected routes, identical in single + multi mode. // The per-graph protected routes, identical in single + multi mode.
// Two middleware layers wrap them (outer first, inner last): // Two middleware layers wrap them (outer first, inner last):
@ -975,13 +938,9 @@ pub fn build_app(state: AppState) -> Router {
// Management endpoints (`GET /graphs`) live alongside the per-graph // Management endpoints (`GET /graphs`) live alongside the per-graph
// router. They go through bearer auth but NOT through // router. They go through bearer auth but NOT through
// `resolve_graph_handle` — they operate on the registry directly. // `resolve_graph_handle` — they operate on the registry directly.
// The endpoint is mounted in both modes; in single mode the handler
// returns 405 so clients see "resource exists, wrong context"
// rather than 404 "no such resource."
// //
// Runtime add/remove (`POST /graphs`, `DELETE /graphs/{id}`) is not // Runtime add/remove (`POST /graphs`, `DELETE /graphs/{id}`) is not
// exposed in v0.6.0 — operators add graphs by editing // exposed — operators run `cluster apply` and restart.
// `omnigraph.yaml` and restarting.
let management = Router::new() let management = Router::new()
.route("/graphs", get(server_graphs_list)) .route("/graphs", get(server_graphs_list))
.route_layer(middleware::from_fn_with_state( .route_layer(middleware::from_fn_with_state(
@ -989,15 +948,11 @@ pub fn build_app(state: AppState) -> Router {
require_bearer_auth, require_bearer_auth,
)); ));
// Mount the protected routes differently per mode: // RFC-011 cluster-only: per-graph routes always nest under
// * Single → flat routes (legacy: `/snapshot`, `/read`, etc.) // `/graphs/{graph_id}/...`; there are no flat single-graph routes.
// * Multi → nested under `/graphs/{graph_id}/...` let protected: Router<AppState> = Router::new()
let protected: Router<AppState> = match state.routing() { .nest("/graphs/{graph_id}", per_graph_protected)
GraphRouting::Single { .. } => per_graph_protected.merge(management), .merge(management);
GraphRouting::Multi { .. } => Router::new()
.nest("/graphs/{graph_id}", per_graph_protected)
.merge(management),
};
Router::new() Router::new()
.route("/healthz", get(server_health)) .route("/healthz", get(server_health))
@ -1018,7 +973,6 @@ pub async fn serve(config: ServerConfig) -> Result<()> {
// policy OR any per-graph policy file. Mirrors the // policy OR any per-graph policy file. Mirrors the
// `requires_bearer_auth` semantics on AppState. // `requires_bearer_auth` semantics on AppState.
let has_policy_configured = match &config.mode { let has_policy_configured = match &config.mode {
ServerConfigMode::Single { policy_file, .. } => policy_file.is_some(),
ServerConfigMode::Multi { ServerConfigMode::Multi {
graphs, graphs,
server_policy, server_policy,
@ -1039,36 +993,14 @@ pub async fn serve(config: ServerConfig) -> Result<()> {
ServerRuntimeState::DefaultDeny => warn!( ServerRuntimeState::DefaultDeny => warn!(
"bearer tokens are configured but no policy file is set — running in \ "bearer tokens are configured but no policy file is set — running in \
default-deny mode (only `read` actions are permitted for authenticated \ default-deny mode (only `read` actions are permitted for authenticated \
actors). Configure `policy.file` in omnigraph.yaml to enable Cedar rules." actors). Configure a graph or cluster policy bundle in the cluster config, \
run `omnigraph cluster apply`, and restart to enable Cedar rules."
), ),
ServerRuntimeState::PolicyEnabled => {} ServerRuntimeState::PolicyEnabled => {}
} }
let bind = config.bind.clone(); let bind = config.bind.clone();
let state = match config.mode { let state = match config.mode {
ServerConfigMode::Single {
uri,
graph_id,
policy_file,
queries,
} => {
let uri_for_log = uri.clone();
info!(
uri = %uri_for_log,
graph_id = %graph_id,
bind = %bind,
mode = "single",
"serving omnigraph"
);
AppState::open_single_with_queries_for_graph_id(
uri,
tokens,
policy_file.as_ref(),
queries,
Some(graph_id),
)
.await?
}
ServerConfigMode::Multi { ServerConfigMode::Multi {
graphs, graphs,
config_path, config_path,
@ -1076,7 +1008,7 @@ pub async fn serve(config: ServerConfig) -> Result<()> {
} => { } => {
info!( info!(
bind = %bind, bind = %bind,
mode = "multi", mode = "cluster",
graph_count = graphs.len(), graph_count = graphs.len(),
config = %config_path.display(), config = %config_path.display(),
"serving omnigraph" "serving omnigraph"
@ -1197,4 +1129,3 @@ async fn shutdown_signal() {
} }
info!("shutdown signal received"); info!("shutdown signal received");
} }
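
The routing change in this file can be sketched in isolation. The block below is a minimal illustration under stated assumptions, not the crate's code: `Handle`, `Registry`, `resolve`, and `single_graph_registry` are hypothetical stand-ins for `GraphHandle`, `GraphRegistry`, the `resolve_graph_handle` middleware, and `build_single_mode`. It shows a one-graph registry keyed by `default`, lookup by the `{graph_id}` URL segment, and the cached per-graph-policy flag.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Hypothetical stand-in for `GraphHandle`; only the fields the sketch needs.
#[derive(Debug)]
pub struct Handle {
    pub graph_id: String,
    pub uri: String,
    pub has_policy: bool,
}

/// Hypothetical stand-in for `GraphRegistry`.
#[derive(Debug)]
pub struct Registry {
    by_id: BTreeMap<String, Arc<Handle>>,
    // Computed once at construction so the auth check is O(1).
    any_per_graph_policy: bool,
}

impl Registry {
    /// Builds the registry, rejecting two handles with the same graph id.
    pub fn from_handles(handles: Vec<Arc<Handle>>) -> Result<Self, String> {
        let mut by_id = BTreeMap::new();
        let mut any_per_graph_policy = false;
        for handle in handles {
            any_per_graph_policy |= handle.has_policy;
            let id = handle.graph_id.clone();
            if by_id.insert(id.clone(), handle).is_some() {
                return Err(format!("duplicate graph id '{id}'"));
            }
        }
        Ok(Self { by_id, any_per_graph_policy })
    }

    /// Resolves `/graphs/{graph_id}/...` to a handle. Flat paths such as
    /// `/snapshot` and the bare `/graphs` management path do not resolve.
    pub fn resolve(&self, path: &str) -> Option<Arc<Handle>> {
        let rest = path.strip_prefix("/graphs/")?;
        let graph_id = rest.split('/').next()?;
        self.by_id.get(graph_id).cloned()
    }

    /// Reads the cached flag instead of walking every handle.
    pub fn requires_bearer_auth(&self) -> bool {
        self.any_per_graph_policy
    }
}

/// Single-graph convenience: a one-entry registry keyed by `default`.
pub fn single_graph_registry(uri: &str) -> Registry {
    let handle = Arc::new(Handle {
        graph_id: "default".to_string(),
        uri: uri.to_string(),
        has_policy: false,
    });
    Registry::from_handles(vec![handle]).expect("a single handle never collides on graph id")
}
```

With this shape, a client that used a flat route such as `/snapshot` would address the same graph at `/graphs/default/snapshot`, and the handler code needs no per-mode branch.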

@ -8,16 +8,10 @@ use omnigraph_server::{ServerConfig, init_tracing, load_server_settings, serve};
#[command(name = "omnigraph-server")] #[command(name = "omnigraph-server")]
#[command(about = "HTTP server for the Omnigraph graph database")] #[command(about = "HTTP server for the Omnigraph graph database")]
struct Cli { struct Cli {
/// Graph URI
uri: Option<String>,
#[arg(long)]
target: Option<String>,
#[arg(long)]
config: Option<PathBuf>,
/// Boot from a cluster: either a config directory (storage resolved /// Boot from a cluster: either a config directory (storage resolved
/// through cluster.yaml) or a storage-root URI directly /// through cluster.yaml) or a storage-root URI directly
/// (s3://bucket/prefix — config-free serving from the bucket). /// (s3://bucket/prefix — config-free serving from the bucket).
/// Exclusive: cannot combine with <URI>, --target, or --config. /// The server's only boot source (RFC-011 cluster-only).
#[arg(long)] #[arg(long)]
cluster: Option<PathBuf>, cluster: Option<PathBuf>,
#[arg(long)] #[arg(long)]
@ -36,14 +30,7 @@ async fn main() -> Result<()> {
init_tracing(); init_tracing();
let cli = Cli::parse(); let cli = Cli::parse();
let settings: ServerConfig = load_server_settings( let settings: ServerConfig =
cli.config.as_ref(), load_server_settings(cli.cluster.as_ref(), cli.bind, cli.unauthenticated).await?;
cli.cluster.as_ref(),
cli.uri,
cli.target,
cli.bind,
cli.unauthenticated,
)
.await?;
serve(settings).await serve(settings).await
} }
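
The cluster-only boot contract can be sketched as a small guard. This is an illustration under assumptions, not the server's implementation: `BootSource` and `resolve_boot_source` are hypothetical names, and telling a storage root from a config directory by an `s3://` prefix is a guess at how the two forms described in the `--cluster` help text might be distinguished.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical classification of the `--cluster` value.
#[derive(Debug, PartialEq)]
pub enum BootSource {
    /// A local config directory (storage resolved through cluster.yaml).
    ConfigDir(PathBuf),
    /// A storage-root URI such as `s3://bucket/prefix`.
    StorageRoot(String),
}

/// Refuses to boot without `--cluster`, mirroring the `let ... else` guard
/// in `load_server_settings`. The prefix check is an assumption.
pub fn resolve_boot_source(cli_cluster: Option<&Path>) -> Result<BootSource, String> {
    let Some(cluster) = cli_cluster else {
        return Err(
            "omnigraph-server boots from a cluster: pass --cluster <dir|s3://...>".to_string(),
        );
    };
    let text = cluster.to_string_lossy();
    if text.starts_with("s3://") {
        Ok(BootSource::StorageRoot(text.into_owned()))
    } else {
        Ok(BootSource::ConfigDir(cluster.to_path_buf()))
    }
}
```

Returning an error for the missing flag, rather than falling back to a default directory, keeps the "one boot source" rule explicit at the call site.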

@ -13,7 +13,6 @@
//! Renaming either is a breaking change to callers, by design. //! Renaming either is a breaking change to callers, by design.
use std::collections::BTreeMap; use std::collections::BTreeMap;
use std::fs;
use std::sync::Arc; use std::sync::Arc;
use omnigraph_compiler::catalog::Catalog; use omnigraph_compiler::catalog::Catalog;
@ -22,8 +21,6 @@ use omnigraph_compiler::query::parser::parse_query;
use omnigraph_compiler::query::typecheck::typecheck_query_decl; use omnigraph_compiler::query::typecheck::typecheck_query_decl;
use omnigraph_compiler::types::{PropType, ScalarType}; use omnigraph_compiler::types::{PropType, ScalarType};
use crate::config::{OmnigraphConfig, QueryEntry};
/// One loaded stored query. `source` is the full `.gq` file text — the /// One loaded stored query. `source` is the full `.gq` file text — the
/// invocation handler hands it to `run_query` / `run_mutate` verbatim, /// invocation handler hands it to `run_query` / `run_mutate` verbatim,
/// which reuse the same parse/IR/exec path as the inline routes (no /// which reuse the same parse/IR/exec path as the inline routes (no
@ -68,8 +65,9 @@ pub struct QueryRegistry {
by_name: BTreeMap<String, StoredQuery>, by_name: BTreeMap<String, StoredQuery>,
} }
/// In-memory registry entry before file I/O. Used by [`QueryRegistry::load`] /// In-memory registry spec: a query's name + already-read `.gq` source. The
/// (after reading each `.gq` from disk) and directly by tests. /// input to [`QueryRegistry::from_specs`] — built by the server's cluster boot
/// and by the CLI's `queries` tooling from a cluster serving snapshot.
#[derive(Debug, Clone)] #[derive(Debug, Clone)]
pub struct RegistrySpec { pub struct RegistrySpec {
pub name: String, pub name: String,
@ -169,47 +167,6 @@ impl QueryRegistry {
} }
} }
/// Read each registry entry's `.gq` file from disk and build the
/// registry. `entries` is either the top-level `queries` map (single
/// mode) or a graph's `queries` map (multi mode); `config` resolves
/// each entry's relative `file:` path against `base_dir`.
pub fn load(
config: &OmnigraphConfig,
entries: &BTreeMap<String, QueryEntry>,
) -> Result<Self, Vec<LoadError>> {
let mut specs = Vec::with_capacity(entries.len());
let mut errors = Vec::new();
for (name, entry) in entries {
let path = config.resolve_query_file(&entry.file);
match fs::read_to_string(&path) {
Ok(source) => specs.push(RegistrySpec {
name: name.clone(),
source,
expose: entry.mcp.expose,
tool_name: entry.mcp.tool_name.clone(),
}),
Err(err) => errors.push(LoadError {
query: Some(name.clone()),
message: format!("cannot read '{}': {err}", path.display()),
}),
}
}
// Parse/identity/uniqueness-check the readable specs even when some
// files failed to read, so every broken entry (I/O, parse, identity,
// tool-name collision) surfaces in one pass rather than one per
// restart. I/O errors come first (in `entries` key order), then the
// spec errors. A non-empty `errors` always fails the load.
match Self::from_specs(specs) {
Ok(registry) if errors.is_empty() => Ok(registry),
Ok(_) => Err(errors),
Err(spec_errors) => {
errors.extend(spec_errors);
Err(errors)
}
}
}
pub fn lookup(&self, name: &str) -> Option<&StoredQuery> { pub fn lookup(&self, name: &str) -> Option<&StoredQuery> {
self.by_name.get(name) self.by_name.get(name)
} }
@ -653,36 +610,4 @@ embedding: Vector(4)
assert!(entry2.params.is_empty(), "no declared params → empty list"); assert!(entry2.params.is_empty(), "no declared params → empty list");
} }
// --- load() error collection (file I/O + parse in one pass) ---
#[test]
fn load_collects_io_and_parse_errors_in_one_pass() {
use crate::config::load_config;
let temp = tempfile::tempdir().unwrap();
std::fs::write(
temp.path().join("good.gq"),
"query good() { match { $u: User } return { $u.name } }",
)
.unwrap();
std::fs::write(temp.path().join("broken.gq"), "query broken( {{ not valid").unwrap();
// `missing.gq` is deliberately not written (an I/O failure).
std::fs::write(
temp.path().join("omnigraph.yaml"),
"queries:\n good:\n file: ./good.gq\n \
missing:\n file: ./missing.gq\n broken:\n file: ./broken.gq\n",
)
.unwrap();
let config = load_config(Some(&temp.path().join("omnigraph.yaml"))).unwrap();
let errors = QueryRegistry::load(&config, config.query_entries()).unwrap_err();
let joined = errors.iter().map(|e| e.to_string()).collect::<Vec<_>>().join("\n");
// Both the missing file AND the parse error surface in one pass —
// the I/O failure must not mask the parse failure.
assert!(joined.contains("missing"), "I/O error must surface: {joined}");
assert!(
joined.contains("broken") && joined.contains("parse error"),
"the parse error in a readable file must surface in the same pass: {joined}"
);
assert!(!joined.contains("'good'"), "the valid entry is not an error: {joined}");
}
} }
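
The "surface every broken entry in one pass" behaviour that the removed `load` test exercised can be sketched without file I/O. The block below is a simplified illustration, not `QueryRegistry::from_specs` itself: `Spec`, `SpecError`, and `check_specs` are hypothetical, the `starts_with("query ")` test is a stand-in for the real parser, and the tool-name rule is an assumed reading of the "tool-name collision" check.

```rust
use std::collections::BTreeMap;

/// Hypothetical, trimmed-down version of `RegistrySpec`.
#[derive(Debug, Clone)]
pub struct Spec {
    pub name: String,
    pub source: String,
    pub tool_name: Option<String>,
}

/// Hypothetical error record: which query failed and why.
#[derive(Debug, PartialEq)]
pub struct SpecError {
    pub query: String,
    pub message: String,
}

/// Validates every spec and collects all failures instead of stopping at
/// the first one, so a single run reports each broken entry.
pub fn check_specs(specs: &[Spec]) -> Result<BTreeMap<String, Spec>, Vec<SpecError>> {
    let mut errors = Vec::new();
    let mut by_name = BTreeMap::new();
    let mut tool_owner: BTreeMap<String, String> = BTreeMap::new();
    for spec in specs {
        // Stand-in for the real parse step.
        if !spec.source.trim_start().starts_with("query ") {
            errors.push(SpecError {
                query: spec.name.clone(),
                message: "parse error: expected `query`".to_string(),
            });
            continue;
        }
        // Assumed rule: the tool name defaults to the query name and must be unique.
        let tool = spec.tool_name.clone().unwrap_or_else(|| spec.name.clone());
        if let Some(owner) = tool_owner.get(&tool) {
            errors.push(SpecError {
                query: spec.name.clone(),
                message: format!("tool name '{tool}' already used by '{owner}'"),
            });
            continue;
        }
        tool_owner.insert(tool, spec.name.clone());
        by_name.insert(spec.name.clone(), spec.clone());
    }
    // Any collected error fails the whole load.
    if errors.is_empty() { Ok(by_name) } else { Err(errors) }
}
```

Collecting into a `Vec` trades an early exit for a complete report, which matters at boot time: an operator fixes every bad entry before the next restart instead of finding them one per attempt.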

@ -1,14 +1,13 @@
//! Server settings: omnigraph.yaml/CLI/env resolution, mode inference //! Server settings: cluster/CLI/env resolution, bearer-token sources, and
//! (single vs multi vs cluster), bearer-token sources, and runtime-state //! runtime-state classification (moved verbatim from lib.rs in the
//! classification (moved verbatim from lib.rs in the modularization). //! modularization).
use super::*; use super::*;
/// Build serving settings from a cluster directory's applied revision /// Build serving settings from a cluster directory's applied revision
/// (RFC-005 §D2): graphs at derived roots, stored queries from verified /// (RFC-005 §D2): graphs at derived roots, stored queries from verified
/// catalog blob content, policy bundles from blob paths with their applied /// catalog blob content, policy bundles from blob paths with their applied
/// bindings. Always multi-graph routing. The unauthenticated/env handling /// bindings. Always multi-graph routing.
/// matches the omnigraph.yaml path.
pub(crate) async fn load_cluster_settings( pub(crate) async fn load_cluster_settings(
cluster_dir: &PathBuf, cluster_dir: &PathBuf,
cli_bind: Option<String>, cli_bind: Option<String>,
@ -131,163 +130,24 @@ pub(crate) async fn load_cluster_settings(
}) })
} }
/// RFC-011 cluster-only boot: the server serves exclusively from a
/// cluster's applied revision (`--cluster <dir | s3://…>`). The legacy
/// omnigraph.yaml / `--target` / positional-URI single-graph boot paths
/// were removed — a deployment serves from exactly one source.
pub async fn load_server_settings( pub async fn load_server_settings(
config_path: Option<&PathBuf>,
cli_cluster: Option<&PathBuf>, cli_cluster: Option<&PathBuf>,
cli_uri: Option<String>,
cli_target: Option<String>,
cli_bind: Option<String>, cli_bind: Option<String>,
cli_allow_unauthenticated: bool, cli_allow_unauthenticated: bool,
) -> Result<ServerConfig> { ) -> Result<ServerConfig> {
// Rule 0 (RFC-005): --cluster is an exclusive boot source. It is checked let Some(cluster_dir) = cli_cluster else {
// before anything reads omnigraph.yaml — in cluster mode that file is
// never opened, not even the implicit current-directory search.
if let Some(cluster_dir) = cli_cluster {
if cli_uri.is_some() || cli_target.is_some() || config_path.is_some() {
bail!(
"--cluster is an exclusive boot source; it cannot combine with a graph URI, --target, or --config (axiom 15: a deployment serves from one source)"
);
}
return load_cluster_settings(cluster_dir, cli_bind, cli_allow_unauthenticated).await;
}
let config = load_config(config_path)?;
let bind = cli_bind.unwrap_or_else(|| config.server_bind().to_string());
// Either `--unauthenticated` or `OMNIGRAPH_UNAUTHENTICATED=1` flips
// this. Treat any non-empty, non-"0"/"false" string as truthy —
// standard 12-factor "any value is true" reading of the env var.
let env_unauth = std::env::var("OMNIGRAPH_UNAUTHENTICATED")
.ok()
.map(|v| {
let trimmed = v.trim();
!trimmed.is_empty() && trimmed != "0" && !trimmed.eq_ignore_ascii_case("false")
})
.unwrap_or(false);
let allow_unauthenticated = cli_allow_unauthenticated || env_unauth;
// MR-668 decision 2 — four-rule mode inference matrix.
//
// 1. CLI `<URI>` positional → Single (URI = the value)
// 2. CLI `--target <name>` → Single (URI = graphs.<name>.uri)
// 3. `server.graph` in config → Single (URI = graphs.<server.graph>.uri)
// 4. `--config` + non-empty `graphs:` + no single-mode selector
// → Multi (every entry in `graphs:`)
// 5. otherwise → error with migration hint
//
// Rules 1-3 are mutually compatible (CLI URI wins over `--target`
// wins over `server.graph`), reusing the existing
// `resolve_target_uri` precedence.
let has_cli_uri = cli_uri.is_some();
let has_cli_target = cli_target.is_some();
let has_server_graph = config.server_graph_name().is_some();
let has_graphs_map = !config.graphs.is_empty();
let has_explicit_config = config_path.is_some();
let mode = if has_cli_uri || has_cli_target || has_server_graph {
// Rules 1, 2, or 3 → Single mode.
let raw_uri = config.resolve_target_uri(
cli_uri,
cli_target.as_deref(),
config.server_graph_name(),
)?;
let uri = normalize_root_uri(&raw_uri).wrap_err_with(|| {
format!("normalize single-graph URI '{raw_uri}' from server settings")
})?;
// Config follows graph IDENTITY, not mode: a bare URI is anonymous
// (top-level config); a graph chosen by name uses its per-graph
// `graphs.<name>.{policy,queries}`. `resolve_target_uri` already
// errored on an unknown name, so a `Some(name)` here is a known graph.
let selected: Option<&str> = if has_cli_uri {
None
} else {
cli_target.as_deref().or_else(|| config.server_graph_name())
};
// A named selection must not leave a populated top-level block
// silently unused — refuse boot and point at the per-graph block. The
// same rule the CLI selection gate enforces, shared via one helper so
// the boot check and `omnigraph queries validate`/`list` can't drift.
config.ensure_top_level_blocks_honored(selected)?;
// Load + identity-check now (no engine needed); the schema
// type-check happens when the engine opens.
let policy_file = config.resolve_policy_file_for(selected);
let queries = QueryRegistry::load(&config, config.query_entries_for(selected))
.map_err(|errs| color_eyre::eyre::eyre!(format_registry_load_errors(&uri, &errs)))?;
let graph_id = graph_resource_id_for_selection(selected, &uri);
ServerConfigMode::Single {
uri,
graph_id,
policy_file,
queries,
}
} else if has_explicit_config && has_graphs_map {
// Multi mode: every graph uses its per-graph block; top-level
// policy/queries are never honored, so a populated one is an error.
let unhonored = config.populated_top_level_blocks();
if !unhonored.is_empty() {
bail!(
"multi-graph mode: top-level {} {} not honored — each graph uses its own \
`graphs.<graph_id>.` block. Move per-graph rules there (and any \
`graph_list` policy to `server.policy.file`).",
unhonored.join(" and "),
if unhonored.len() == 1 { "is" } else { "are" },
);
}
// Rule 4 → Multi mode. Build a startup config per graph.
let mut graphs = Vec::with_capacity(config.graphs.len());
for (name, target) in &config.graphs {
// Validate the graph id can construct a `GraphId` newtype.
// Doing this here (not at registry insert) so a malformed
// omnigraph.yaml fails at startup with a clear error.
GraphId::try_from(name.clone()).map_err(|err| {
color_eyre::eyre::eyre!("invalid graph id '{name}' in omnigraph.yaml: {err}")
})?;
let raw_uri = config.resolve_uri_value(&target.uri);
let uri = normalize_root_uri(&raw_uri).wrap_err_with(|| {
format!("normalize URI '{raw_uri}' for graph '{name}' in omnigraph.yaml")
})?;
// Per-graph `queries:`, selected through the shared
// `query_entries_for` so server and CLI resolve identically.
// Load + identity-check now; the schema type-check happens
// when this graph's engine opens.
let queries = QueryRegistry::load(&config, config.query_entries_for(Some(name.as_str())))
.map_err(|errs| color_eyre::eyre::eyre!(format_registry_load_errors(name, &errs)))?;
graphs.push(GraphStartupConfig {
graph_id: name.clone(),
uri,
policy: config.resolve_target_policy_file(name).map(PolicySource::File),
embedding: None,
queries,
});
}
let config_path = config_path
.cloned()
.expect("has_explicit_config implies config_path is Some");
let server_policy = config.resolve_server_policy_file().map(PolicySource::File);
ServerConfigMode::Multi {
graphs,
config_path,
server_policy,
}
} else {
// Rule 5 → error with migration hint.
bail!( bail!(
"no graph to serve: pass a URI (`omnigraph-server <URI>`), select a target \ "omnigraph-server boots from a cluster: pass --cluster <dir|s3://…> \
(`--target <name> --config omnigraph.yaml`), set `server.graph: <name>` in \ (the cluster's applied revision is the deployment artifact). The legacy \
omnigraph.yaml, or for multi-graph mode add a `graphs:` map to the config \ single-graph boot (positional <URI>, --target, --config omnigraph.yaml) \
file referenced by `--config`." was removed in RFC-011."
); );
}; };
load_cluster_settings(cluster_dir, cli_bind, cli_allow_unauthenticated).await
Ok(ServerConfig {
mode,
bind,
allow_unauthenticated,
})
}
/// Whether the loaded config will run the server in multi-graph mode.
/// Useful for the test that constructs `ServerConfig` directly.
pub fn server_config_is_multi(config: &ServerConfig) -> bool {
matches!(config.mode, ServerConfigMode::Multi { .. })
} }
/// MR-723 server runtime state, classified from the three-state matrix /// MR-723 server runtime state, classified from the three-state matrix
@ -337,7 +197,8 @@ pub fn classify_server_runtime_state(
"server has no bearer tokens and no policy file configured. This is a fully \ "server has no bearer tokens and no policy file configured. This is a fully \
open server pass `--unauthenticated` (or set OMNIGRAPH_UNAUTHENTICATED=1) \ open server pass `--unauthenticated` (or set OMNIGRAPH_UNAUTHENTICATED=1) \
if you actually want that, otherwise configure bearer tokens (see \ if you actually want that, otherwise configure bearer tokens (see \
docs/user/operations/server.md) and/or `policy.file` in omnigraph.yaml." docs/user/operations/server.md) and a graph or cluster policy bundle in \
the cluster config, then run `omnigraph cluster apply` and restart."
), ),
(false, false, true) => Ok(ServerRuntimeState::Open), (false, false, true) => Ok(ServerRuntimeState::Open),
(true, false, _) => Ok(ServerRuntimeState::DefaultDeny), (true, false, _) => Ok(ServerRuntimeState::DefaultDeny),
@@ -427,8 +288,8 @@ pub(crate) fn server_bearer_tokens_from_env() -> Result<Vec<(String, String)>> {
 mod tests {
 use super::{
 GraphStartupConfig, ServerConfig, ServerConfigMode, ServerRuntimeState,
-classify_server_runtime_state, hash_bearer_token, load_server_settings,
-normalize_bearer_token, parse_bearer_tokens_json, serve, server_bearer_tokens_from_env,
+classify_server_runtime_state, hash_bearer_token, normalize_bearer_token,
+parse_bearer_tokens_json, serve, server_bearer_tokens_from_env,
 };
 use serial_test::serial;
 use std::env;
@@ -587,108 +448,15 @@ mod tests {
 }
 #[tokio::test]
-async fn server_settings_load_from_yaml_config() {
-let temp = tempdir().unwrap();
-let config = temp.path().join("omnigraph.yaml");
-fs::write(
-&config,
-r#"
-graphs:
-local:
-uri: /tmp/demo.omni
-server:
-graph: local
-bind: 0.0.0.0:9090
-"#,
-)
-.unwrap();
-let settings = load_server_settings(Some(&config), None, None, None, None, false).await.unwrap();
-match &settings.mode {
-ServerConfigMode::Single { uri, graph_id, .. } => {
-assert_eq!(uri, "/tmp/demo.omni");
-assert_eq!(graph_id, "local");
-}
-ServerConfigMode::Multi { .. } => panic!("expected Single mode, got Multi"),
-}
-assert_eq!(settings.bind, "0.0.0.0:9090");
-}
-#[tokio::test]
-async fn server_settings_cli_flags_override_yaml_config() {
-let temp = tempdir().unwrap();
-let config = temp.path().join("omnigraph.yaml");
-fs::write(
-&config,
-r#"
-graphs:
-local:
-uri: /tmp/demo.omni
-server:
-graph: local
-bind: 127.0.0.1:8080
-"#,
-)
-.unwrap();
-let settings = load_server_settings(
-Some(&config),
-None,
-Some("/tmp/override.omni".to_string()),
-None,
-Some("0.0.0.0:9999".to_string()),
-false,
-)
-.await
-.unwrap();
-match &settings.mode {
-ServerConfigMode::Single { uri, graph_id, .. } => {
-assert_eq!(uri, "/tmp/override.omni");
-assert_eq!(graph_id, "/tmp/override.omni");
-}
-ServerConfigMode::Multi { .. } => panic!("expected Single mode, got Multi"),
-}
-assert_eq!(settings.bind, "0.0.0.0:9999");
-}
-#[tokio::test]
-async fn server_settings_can_resolve_named_target() {
-let temp = tempdir().unwrap();
-let config = temp.path().join("omnigraph.yaml");
-fs::write(
-&config,
-r#"
-graphs:
-local:
-uri: ./demo.omni
-dev:
-uri: http://127.0.0.1:8080
-server:
-graph: local
-bind: 127.0.0.1:8080
-"#,
-)
-.unwrap();
-let settings =
-load_server_settings(Some(&config), None, None, Some("dev".to_string()), None, false)
-.await
-.unwrap();
-match &settings.mode {
-ServerConfigMode::Single { uri, graph_id, .. } => {
-assert_eq!(uri, "http://127.0.0.1:8080");
-assert_eq!(graph_id, "dev");
-}
-ServerConfigMode::Multi { .. } => panic!("expected Single mode, got Multi"),
-}
-}
-#[tokio::test]
-async fn server_settings_require_uri_from_cli_or_config() {
-let error = load_server_settings(None, None, None, None, None, false).await.unwrap_err();
+async fn server_settings_require_cluster_boot_source() {
+// RFC-011 cluster-only: with no --cluster the server refuses to
+// start and names the cluster-required remedy.
+let error = super::load_server_settings(None, None, false)
+.await
+.unwrap_err();
 assert!(
-error.to_string().contains("no graph to serve"),
-"expected mode-inference error, got: {error}",
+error.to_string().contains("boots from a cluster"),
+"expected cluster-required error, got: {error}",
 );
 }
@@ -799,17 +567,21 @@ server:
 ]);
 let temp = tempdir().unwrap();
 // Graph path doesn't need to exist — classifier fires before
-// `AppState::open_with_bearer_tokens_and_policy`.
+// any engine open.
 let config = ServerConfig {
-mode: ServerConfigMode::Single {
-uri: temp
-.path()
-.join("graph.omni")
-.to_string_lossy()
-.into_owned(),
-graph_id: "default".to_string(),
-policy_file: None,
-queries: crate::queries::QueryRegistry::default(),
+mode: ServerConfigMode::Multi {
+graphs: vec![GraphStartupConfig {
+graph_id: "default".to_string(),
+uri: temp
+.path()
+.join("graph.omni")
+.to_string_lossy()
+.into_owned(),
+policy: None,
+queries: crate::queries::QueryRegistry::default(),
+}],
+config_path: temp.path().join("cluster"),
+server_policy: None,
 },
 bind: "127.0.0.1:0".to_string(),
 allow_unauthenticated: false,
@@ -824,75 +596,6 @@ server:
 );
 }
-#[tokio::test]
-#[serial]
-async fn unauthenticated_env_var_classification() {
-// MR-723 PR A: closes the gap where the env-var read path inside
-// `load_server_settings` was structurally implemented but not
-// exercised by any test. Three properties to pin, all in one
-// sequential test because `cargo test` runs the mod test suite
-// in parallel and `OMNIGRAPH_UNAUTHENTICATED` is process-global
-// — interleaving with another test that sets the same env var
-// (concurrent classifier tests, even the bearer-token suite
-// sharing `EnvGuard`) corrupts the read. Sequential within one
-// test fn is the simplest race-free shape.
-let temp = tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-graphs:
-local:
-uri: /tmp/demo-unauth.omni
-server:
-graph: local
-"#,
-)
-.unwrap();
-// Truthy values flip Open mode on, even with CLI flag off.
-for value in ["1", "true", "yes", "TRUE", "anything"] {
-let _guard = EnvGuard::set(&[("OMNIGRAPH_UNAUTHENTICATED", Some(value))]);
-let settings = load_server_settings(Some(&config_path), None, None, None, None, false).await
-.expect("settings load should succeed");
-assert!(
-settings.allow_unauthenticated,
-"OMNIGRAPH_UNAUTHENTICATED={value:?} should enable Open mode",
-);
-}
-// Falsy values keep refusal behavior, even with CLI flag off.
-for value in ["0", "false", "FALSE", ""] {
-let _guard = EnvGuard::set(&[("OMNIGRAPH_UNAUTHENTICATED", Some(value))]);
-let settings = load_server_settings(Some(&config_path), None, None, None, None, false).await
-.expect("settings load should succeed");
-assert!(
-!settings.allow_unauthenticated,
-"OMNIGRAPH_UNAUTHENTICATED={value:?} should NOT enable Open mode",
-);
-}
-// Unset env var: also false.
-let _guard = EnvGuard::set(&[("OMNIGRAPH_UNAUTHENTICATED", None)]);
-let settings = load_server_settings(Some(&config_path), None, None, None, None, false).await
-.expect("settings load should succeed");
-assert!(
-!settings.allow_unauthenticated,
-"OMNIGRAPH_UNAUTHENTICATED unset should NOT enable Open mode",
-);
-drop(_guard);
-// CLI flag wins even when env is falsy — `serve()` honors the
-// OR of both inputs.
-let _guard = EnvGuard::set(&[("OMNIGRAPH_UNAUTHENTICATED", Some("0"))]);
-let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await
-.expect("settings load should succeed");
-assert!(
-settings.allow_unauthenticated,
-"--unauthenticated CLI flag should win even when env is falsy",
-);
-}
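The value lists in the removed test above imply a simple truthiness rule for `OMNIGRAPH_UNAUTHENTICATED`: unset, empty, `0`, and `false` leave Open mode off, every other value turns it on, and the CLI flag is OR-ed with the result. Below is a minimal sketch of that rule. The helper names `env_flag_enabled` and `allow_unauthenticated` are hypothetical, and matching `false` case-insensitively is an assumption (the test only exercises `false` and `FALSE`).

```rust
/// Interpret an opt-in environment flag (hypothetical helper).
/// `None` models an unset variable.
pub fn env_flag_enabled(raw: Option<&str>) -> bool {
    match raw {
        None => false,
        Some(value) => {
            // Falsy spellings taken from the removed test: "", "0", "false", "FALSE".
            let falsy = value.is_empty() || value == "0" || value.eq_ignore_ascii_case("false");
            !falsy
        }
    }
}

/// Combine the CLI flag and the environment value: either one enables Open mode.
pub fn allow_unauthenticated(cli_flag: bool, env_value: Option<&str>) -> bool {
    cli_flag || env_flag_enabled(env_value)
}
```

Taking the raw value as a parameter, rather than reading the process environment inside the helper, sidesteps the process-global race that the removed test's comment describes.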
 #[test]
 fn classify_policy_enabled_requires_tokens() {
 // State 3: tokens + policy → PolicyEnabled, regardless of the


@@ -50,7 +50,7 @@ async fn protected_routes_require_bearer_token() {
 let (status, body) = json_response(
 &app,
 Request::builder()
-.uri("/branches")
+.uri(g("/branches"))
 .method(Method::GET)
 .body(Body::empty())
 .unwrap(),
@@ -85,7 +85,7 @@ async fn protected_routes_accept_valid_bearer_token_while_healthz_stays_open() {
 let (status, body) = json_response(
 &app,
 Request::builder()
-.uri("/branches")
+.uri(g("/branches"))
 .method(Method::GET)
 .header("authorization", "Bearer demo-token")
 .body(Body::empty())
@@ -108,7 +108,7 @@ async fn protected_routes_accept_any_configured_team_bearer_token() {
 let (status, body) = json_response(
 &app,
 Request::builder()
-.uri("/branches")
+.uri(g("/branches"))
 .method(Method::GET)
 .header("authorization", "Bearer token-two")
 .body(Body::empty())
@@ -158,7 +158,7 @@ rules:
 let (ok_status, _) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot?branch=main")
+.uri(g("/snapshot?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer token-a")
 .body(Body::empty())
@@ -172,7 +172,7 @@ rules:
 let (denied_status, denied_body) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot?branch=main")
+.uri(g("/snapshot?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer token-b")
 .body(Body::empty())
@@ -190,7 +190,7 @@ rules:
 let (bad_status, _) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot?branch=main")
+.uri(g("/snapshot?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer wrong-token")
 .body(Body::empty())
@@ -245,7 +245,7 @@ rules:
 let (spoof_up_status, spoof_up_body) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot?branch=main")
+.uri(g("/snapshot?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer token-b")
 .header("x-actor-id", "act-a")
@@ -270,7 +270,7 @@ rules:
 let (spoof_down_status, _) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot?branch=main")
+.uri(g("/snapshot?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer token-a")
 .header("x-actor-id", "act-b")
@@ -290,7 +290,7 @@ rules:
 let (empty_spoof_status, _) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot?branch=main")
+.uri(g("/snapshot?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer token-b")
 .header("x-actor-id", "")
@@ -316,7 +316,7 @@ async fn policy_allows_read_but_distinguishes_401_from_403() {
 let (missing_status, missing_body) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot?branch=main")
+.uri(g("/snapshot?branch=main"))
 .method(Method::GET)
 .body(Body::empty())
 .unwrap(),
@@ -332,7 +332,7 @@ async fn policy_allows_read_but_distinguishes_401_from_403() {
 let (snapshot_status, snapshot_body) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot?branch=main")
+.uri(g("/snapshot?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer team-token")
 .body(Body::empty())
@@ -350,7 +350,7 @@ async fn policy_allows_read_but_distinguishes_401_from_403() {
 let (forbidden_status, forbidden_body) = json_response(
 &app,
 Request::builder()
-.uri("/export")
+.uri(g("/export"))
 .method(Method::POST)
 .header("authorization", "Bearer team-token")
 .header("content-type", "application/json")
@@ -369,7 +369,7 @@ async fn policy_allows_read_but_distinguishes_401_from_403() {
 .clone()
 .oneshot(
 Request::builder()
-.uri("/export")
+.uri(g("/export"))
 .method(Method::POST)
 .header("authorization", "Bearer admin-token")
 .header("content-type", "application/json")
@@ -410,7 +410,7 @@ async fn policy_uses_resolved_branch_for_snapshot_reads() {
 let (status, body) = json_response(
 &app,
 Request::builder()
-.uri("/read")
+.uri(g("/read"))
 .method(Method::POST)
 .header("authorization", "Bearer team-token")
 .header("content-type", "application/json")
@@ -458,7 +458,7 @@ async fn policy_blocks_change_on_protected_main_but_allows_unprotected_branch()
 let (main_status, main_body) = json_response(
 &app,
 Request::builder()
-.uri("/change")
+.uri(g("/change"))
 .method(Method::POST)
 .header("authorization", "Bearer team-token")
 .header("content-type", "application/json")
@@ -482,7 +482,7 @@ async fn policy_blocks_change_on_protected_main_but_allows_unprotected_branch()
 let (feature_status, feature_body) = json_response(
 &app,
 Request::builder()
-.uri("/change")
+.uri(g("/change"))
 .method(Method::POST)
 .header("authorization", "Bearer team-token")
 .header("content-type", "application/json")
@@ -533,7 +533,7 @@ async fn policy_blocks_non_admin_merge_to_main_and_allows_admin() {
 let (deny_status, deny_body) = json_response(
 &app,
 Request::builder()
-.uri("/branches/merge")
+.uri(g("/branches/merge"))
 .method(Method::POST)
 .header("authorization", "Bearer team-token")
 .header("content-type", "application/json")
@@ -551,7 +551,7 @@ async fn policy_blocks_non_admin_merge_to_main_and_allows_admin() {
 let (allow_status, allow_body) = json_response(
 &app,
 Request::builder()
-.uri("/branches/merge")
+.uri(g("/branches/merge"))
 .method(Method::POST)
 .header("authorization", "Bearer admin-token")
 .header("content-type", "application/json")
@@ -578,7 +578,7 @@ async fn authenticated_change_stamps_actor_on_commits() {
 let (change_status, change_body) = json_response(
 &app,
 Request::builder()
-.uri("/change")
+.uri(g("/change"))
 .method(Method::POST)
 .header("authorization", "Bearer token-one")
 .header("content-type", "application/json")
@@ -592,7 +592,7 @@ async fn authenticated_change_stamps_actor_on_commits() {
 let (commits_status, commits_body) = json_response(
 &app,
 Request::builder()
-.uri("/commits?branch=main")
+.uri(g("/commits?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer token-one")
 .body(Body::empty())
@@ -623,7 +623,7 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() {
 let (create_status, _) = json_response(
 &app,
 Request::builder()
-.uri("/branches")
+.uri(g("/branches"))
 .method(Method::POST)
 .header("authorization", "Bearer token-one")
 .header("content-type", "application/json")
@@ -642,7 +642,7 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() {
 let (change_status, _) = json_response(
 &app,
 Request::builder()
-.uri("/change")
+.uri(g("/change"))
 .method(Method::POST)
 .header("authorization", "Bearer token-one")
 .header("content-type", "application/json")
@@ -659,7 +659,7 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() {
 let (merge_status, merge_body) = json_response(
 &app,
 Request::builder()
-.uri("/branches/merge")
+.uri(g("/branches/merge"))
 .method(Method::POST)
 .header("authorization", "Bearer token-two")
 .header("content-type", "application/json")
@@ -673,7 +673,7 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() {
 let (commit_status, commit_body) = json_response(
 &app,
 Request::builder()
-.uri("/commits?branch=main")
+.uri(g("/commits?branch=main"))
 .method(Method::GET)
 .header("authorization", "Bearer token-two")
 .body(Body::empty())
@@ -691,7 +691,6 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() {
 #[tokio::test(flavor = "multi_thread")]
 async fn engine_layer_policy_fires_via_direct_arc_omnigraph_from_new_single() {
-use omnigraph_server::GraphRouting;
 let temp = init_loaded_graph().await;
 let graph = graph_path(temp.path());
 let db = Omnigraph::open(graph.to_str().unwrap()).await.unwrap();
@@ -717,9 +716,14 @@ async fn engine_layer_policy_fires_via_direct_arc_omnigraph_from_new_single() {
 // embedded consumer holding `Arc<Omnigraph>` would. If `new_single`
 // failed to apply `with_policy` to the engine, this `mutate_as`
 // would succeed — the HTTP-layer is bypassed entirely.
-let handle = match state.routing() {
-GraphRouting::Single { handle } => Arc::clone(handle),
-GraphRouting::Multi { .. } => panic!("expected single-mode routing"),
+// RFC-011 cluster-only: the single-graph convenience constructor
+// registers the graph under the reserved id `default`.
+let key = omnigraph_server::GraphKey::cluster(
+omnigraph_server::GraphId::try_from("default").unwrap(),
+);
+let handle = match state.routing().registry.get(&key) {
+omnigraph_server::RegistryLookup::Ready(handle) => handle,
+omnigraph_server::RegistryLookup::Gone => panic!("default graph must be registered"),
 };
 let engine = Arc::clone(&handle.engine);
@@ -758,7 +762,7 @@ async fn oversized_request_body_returns_payload_too_large() {
 .clone()
 .oneshot(
 Request::builder()
-.uri("/read")
+.uri(g("/read"))
 .method(Method::POST)
 .header("content-type", "application/json")
 .body(Body::from(oversized))
@@ -781,7 +785,7 @@ async fn default_deny_mode_allows_read_for_authenticated_actor() {
 let (status, _body) = json_response(
 &app,
 Request::builder()
-.uri("/snapshot")
+.uri(g("/snapshot"))
 .method(Method::GET)
 .header(AUTHORIZATION, "Bearer demo-token")
 .body(Body::empty())
@@ -808,7 +812,7 @@ async fn default_deny_mode_rejects_change_with_forbidden() {
 let (status, body) = json_response(
 &app,
 Request::builder()
-.uri("/change")
+.uri(g("/change"))
 .method(Method::POST)
 .header(AUTHORIZATION, "Bearer demo-token")
 .header("content-type", "application/json")
@@ -840,7 +844,7 @@ async fn default_deny_mode_rejects_schema_apply_with_forbidden() {
 let (status, body) = json_response(
 &app,
 Request::builder()
-.uri("/schema/apply")
+.uri(g("/schema/apply"))
 .method(Method::POST)
 .header(AUTHORIZATION, "Bearer demo-token")
 .header("content-type", "application/json")
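Every hunk in this test file wraps the request path in `g(...)`, whose definition is not part of this diff (it presumably lives in the shared `support` module). Given that the router mounts only under `/graphs/{graph_id}/...` and the single-graph constructor registers the graph under the reserved id `default`, a plausible shape for the helper is sketched below. The constant name and the exact prefixing are assumptions for illustration.

```rust
/// Assumed graph id for single-graph test fixtures (the diff names `default`
/// as the reserved id used by the convenience constructor).
const TEST_GRAPH_ID: &str = "default";

/// Hypothetical test helper: prefix a flat route with the per-graph mount point.
/// The path may carry a query string, which is passed through unchanged.
pub fn g(path: &str) -> String {
    format!("/graphs/{}{}", TEST_GRAPH_ID, path)
}
```

Returning an owned `String` keeps call sites such as `.uri(g("/branches"))` unchanged, since the request builder accepts any value convertible into a URI.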


@@ -18,10 +18,7 @@ use support::*;
 mod multi_graph_startup {
 use super::*;
 use omnigraph::storage::normalize_root_uri;
-use omnigraph_server::{
-GraphHandle, GraphId, GraphKey, GraphRegistry, InsertError, ServerConfig, ServerConfigMode,
-load_server_settings,
-};
+use omnigraph_server::{GraphHandle, GraphId, GraphKey, GraphRegistry, InsertError};
 use std::sync::Arc;
 async fn build_multi_mode_app(graph_ids: &[&str]) -> (Vec<tempfile::TempDir>, Router) {
@@ -280,10 +277,11 @@ mod multi_graph_startup {
 );
 }
-/// Flat routes 404 in multi mode — the router only mounts under
-/// `/graphs/{graph_id}/...` so `/snapshot` doesn't resolve.
+/// RFC-011 cluster-only: flat per-graph routes never resolve — the
+/// router only mounts under `/graphs/{graph_id}/...` so a root
+/// `/snapshot` returns 404.
 #[tokio::test(flavor = "multi_thread")]
-async fn flat_routes_404_in_multi_mode() {
+async fn flat_routes_404_at_root() {
 let (_dirs, app) = build_multi_mode_app(&["alpha"]).await;
 let resp = app
 .oneshot(
@@ -298,28 +296,6 @@ mod multi_graph_startup {
 assert_eq!(resp.status(), StatusCode::NOT_FOUND);
 }
-/// `GraphId` validation runs at startup — a reserved name in
-/// `omnigraph.yaml` produces a clear error rather than getting
-/// rejected per-request.
-#[tokio::test]
-async fn load_server_settings_rejects_reserved_graph_id() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-graphs:
-policies:
-uri: /tmp/g1.omni
-"#,
-)
-.unwrap();
-let err = load_server_settings(Some(&config_path), None, None, None, None, false).await.unwrap_err();
-assert!(
-err.to_string().contains("invalid graph id 'policies'"),
-"expected reserved-name rejection, got: {err}"
-);
-}
 #[tokio::test(flavor = "multi_thread")]
 async fn registry_rejects_duplicate_normalized_graph_uris() {
@@ -375,372 +351,6 @@ graphs:
 assert_eq!(listed[0].uri, graph_uri);
 }
-// ── Four-rule mode inference matrix ───────────────────────────────
-/// Rule 1: CLI positional URI → Single.
-#[tokio::test]
-async fn mode_inference_cli_uri_is_single() {
-let settings = load_server_settings(
-None,
-None,
-Some("/tmp/cli.omni".to_string()),
-None,
-None,
-true, // allow unauth so we get past the runtime-state check
-)
-.await
-.unwrap();
-match settings.mode {
-ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "/tmp/cli.omni"),
-ServerConfigMode::Multi { .. } => panic!("expected Single (rule 1), got Multi"),
-}
-}
-/// Rule 2: --target picks one graph from `graphs:` map → Single.
-#[tokio::test]
-async fn mode_inference_cli_target_is_single() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-graphs:
-alpha:
-uri: /tmp/alpha.omni
-beta:
-uri: /tmp/beta.omni
-"#,
-)
-.unwrap();
-let settings =
-load_server_settings(Some(&config_path), None, None, Some("alpha".into()), None, true)
-.await
-.unwrap();
-match settings.mode {
-ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "/tmp/alpha.omni"),
-ServerConfigMode::Multi { .. } => panic!("expected Single (rule 2), got Multi"),
-}
-}
-/// Rule 3: `server.graph` set → Single (target picked from config).
-#[tokio::test]
-async fn mode_inference_server_graph_is_single() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-graphs:
-alpha:
-uri: /tmp/alpha.omni
-beta:
-uri: /tmp/beta.omni
-server:
-graph: beta
-"#,
-)
-.unwrap();
-let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap();
-match settings.mode {
-ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "/tmp/beta.omni"),
-ServerConfigMode::Multi { .. } => panic!("expected Single (rule 3), got Multi"),
-}
-}
-/// Rule 4: `--config` + non-empty `graphs:` + no single-mode selector → Multi.
-#[tokio::test]
-async fn mode_inference_config_plus_graphs_is_multi() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-graphs:
-alpha:
-uri: /tmp/alpha.omni
-beta:
-uri: /tmp/beta.omni
-"#,
-)
-.unwrap();
-let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap();
-match settings.mode {
-ServerConfigMode::Multi { graphs, .. } => {
-let ids: Vec<&str> = graphs.iter().map(|g| g.graph_id.as_str()).collect();
-// BTreeMap iteration order is alphabetical.
-assert_eq!(ids, vec!["alpha", "beta"]);
-}
-ServerConfigMode::Single { .. } => panic!("expected Multi (rule 4), got Single"),
-}
-}
-#[tokio::test]
-async fn mode_inference_multi_rejects_top_level_policy_file() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-policy:
-file: ./policy.yaml
-graphs:
-alpha:
-uri: /tmp/alpha.omni
-"#,
-)
-.unwrap();
-let err = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap_err();
-let msg = err.to_string();
-assert!(
-msg.contains("top-level") && msg.contains("policy.file") && msg.contains("not honored"),
-"expected top-level-not-honored guidance, got: {msg}"
-);
-assert!(
-msg.contains("graphs.<graph_id>"),
-"expected per-graph migration guidance, got: {msg}"
-);
-assert!(
-msg.contains("server.policy.file"),
-"expected server policy migration guidance, got: {msg}"
-);
-}
-#[tokio::test]
-async fn mode_inference_multi_rejects_top_level_queries() {
-// Symmetric to the policy guard: a top-level `queries:` block in
-// multi-graph mode is not honored (each graph uses its own), so it
-// is a loud error rather than a silent no-op.
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-"queries:\n q:\n file: ./q.gq\ngraphs:\n alpha:\n uri: /tmp/alpha.omni\n",
-)
-.unwrap();
-let err = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap_err();
-let msg = err.to_string();
-assert!(
-msg.contains("queries") && msg.contains("not honored"),
-"top-level queries must be rejected in multi-graph mode: {msg}"
-);
-}
-#[tokio::test]
-async fn single_mode_named_graph_rejects_top_level_blocks() {
-// Serving a graph by name (`--target`/`server.graph`) uses its
-// per-graph block; a populated top-level block would be silently
-// shadowed, so boot refuses and names the per-graph location.
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-"policy:\n file: ./top.yaml\ngraphs:\n prod:\n uri: /tmp/prod.omni\n",
-)
-.unwrap();
-let err =
-load_server_settings(Some(&config_path), None, None, Some("prod".to_string()), None, true)
-.await
-.unwrap_err();
-let msg = err.to_string();
-assert!(
-msg.contains("prod") && msg.contains("policy.file") && msg.contains("graphs.prod"),
-"named single-mode + top-level policy must refuse, naming the graph: {msg}"
-);
-}
-#[tokio::test]
-async fn single_mode_named_graph_uses_per_graph_policy_and_queries() {
-// The identity rule: `--target prod` attaches `graphs.prod`'s own
-// policy + queries, not the top-level ones (which are absent here).
-let temp = tempfile::tempdir().unwrap();
-fs::write(
-temp.path().join("prod.gq"),
-"query pq() { match { $u: User } return { $u.name } }",
-)
-.unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-"graphs:\n prod:\n uri: /tmp/prod.omni\n policy:\n file: ./prod-policy.yaml\n \
-queries:\n pq:\n file: ./prod.gq\n",
-)
-.unwrap();
-let settings =
-load_server_settings(Some(&config_path), None, None, Some("prod".to_string()), None, true)
-.await
-.unwrap();
-match settings.mode {
-ServerConfigMode::Single {
-graph_id,
-policy_file,
-queries,
-..
-} => {
-assert_eq!(graph_id, "prod", "named single-mode keeps graph identity");
-assert!(
-policy_file
-.as_ref()
-.is_some_and(|p| p.ends_with("prod-policy.yaml")),
-"per-graph policy attached: {policy_file:?}"
-);
-assert!(queries.lookup("pq").is_some(), "per-graph query attached");
-}
-other => panic!("expected Single mode, got {other:?}"),
-}
-}
-#[tokio::test]
-async fn mode_inference_normalizes_multi_graph_uris() {
-let temp = tempfile::tempdir().unwrap();
-let graph = temp.path().join("alpha.omni");
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-format!(
-r#"
-graphs:
-alpha:
-uri: file://{}/
-"#,
-graph.display()
-),
-)
-.unwrap();
-let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap();
-match settings.mode {
-ServerConfigMode::Multi { graphs, .. } => {
-assert_eq!(graphs[0].uri, graph.to_string_lossy());
-}
-ServerConfigMode::Single { .. } => panic!("expected Multi"),
-}
-}
-/// Rule 5: nothing → error with migration hint.
-#[tokio::test]
-async fn mode_inference_no_inputs_errors_with_migration_hint() {
-let err = load_server_settings(None, None, None, None, None, true).await.unwrap_err();
-let msg = err.to_string();
-assert!(
-msg.contains("no graph to serve"),
-"expected migration-hint error, got: {msg}"
-);
-}
-/// Rule 4 sub-case: `--config` with empty `graphs:` map and no
-/// single-mode selector → rule 5 fires (no graph to serve).
-#[tokio::test]
-async fn mode_inference_empty_graphs_map_errors() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(&config_path, "server:\n bind: 127.0.0.1:8080\n").unwrap();
-let err = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap_err();
-assert!(err.to_string().contains("no graph to serve"));
-}
-/// `--config` + `<URI>` together: URI wins → Single (the CLI URI
-/// takes precedence over the config's graphs map).
-#[tokio::test]
-async fn mode_inference_cli_uri_overrides_graphs_map() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-graphs:
-alpha:
-uri: /tmp/alpha.omni
-"#,
-)
-.unwrap();
-let settings = load_server_settings(
-Some(&config_path),
-None,
-Some("/tmp/cli-override.omni".to_string()),
-None,
-None,
-true,
-)
-.await
-.unwrap();
-match settings.mode {
-ServerConfigMode::Single { uri, .. } => {
-assert_eq!(
-uri, "/tmp/cli-override.omni",
-"CLI URI must win over graphs: map"
-);
-}
-ServerConfigMode::Multi { .. } => {
-panic!("expected Single (CLI URI wins), got Multi")
-}
-}
-}
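For readers tracking what this commit retires, the removed tests above pin a legacy precedence: CLI URI, then `--target`, then `server.graph`, then a non-empty `graphs:` map, and otherwise an error. The sketch below restates that precedence as a standalone function. The `Mode` enum, the `infer_mode` signature, and the "unknown graph" error are hypothetical simplifications, not the deleted implementation.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the removed `ServerConfigMode` variants.
#[derive(Debug, PartialEq, Eq)]
pub enum Mode {
    Single { uri: String },
    Multi { graph_ids: Vec<String> },
}

/// Legacy mode inference, restated from the removed tests.
/// `graphs` maps graph id to URI, as in the old `graphs:` config block.
pub fn infer_mode(
    cli_uri: Option<&str>,
    cli_target: Option<&str>,
    server_graph: Option<&str>,
    graphs: &BTreeMap<String, String>,
) -> Result<Mode, String> {
    // Rule 1: a positional CLI URI always wins.
    if let Some(uri) = cli_uri {
        return Ok(Mode::Single { uri: uri.to_string() });
    }
    // Rules 2 and 3: a named graph, with `--target` taking precedence
    // over `server.graph`.
    if let Some(name) = cli_target.or(server_graph) {
        return graphs
            .get(name)
            .map(|uri| Mode::Single { uri: uri.clone() })
            // Assumption: the removed code's wording for this case is not shown.
            .ok_or_else(|| format!("unknown graph '{name}'"));
    }
    // Rule 4: no selector but a non-empty map serves every graph.
    if !graphs.is_empty() {
        // BTreeMap keys iterate in sorted order, matching the test's expectation.
        return Ok(Mode::Multi { graph_ids: graphs.keys().cloned().collect() });
    }
    // Rule 5: nothing to serve.
    Err("no graph to serve".to_string())
}
```

After this commit none of these branches apply: the replacement test shows `load_server_settings` refusing to start without a cluster boot source.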
-/// Per-graph `policy.file` is resolved relative to the config base_dir.
-#[tokio::test]
-async fn per_graph_policy_file_is_resolved_relative_to_base_dir() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-graphs:
-alpha:
-uri: /tmp/alpha.omni
-policy:
-file: ./policies/alpha.yaml
-beta:
-uri: /tmp/beta.omni
-"#,
-)
-.unwrap();
-let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap();
-let graphs = match settings.mode {
-ServerConfigMode::Multi { graphs, .. } => graphs,
-_ => panic!("expected Multi"),
-};
-// graphs is BTreeMap-iter order (alphabetical).
-let alpha = &graphs[0];
-let beta = &graphs[1];
-assert_eq!(alpha.graph_id, "alpha");
-let omnigraph_server::PolicySource::File(alpha_policy) =
-alpha.policy.as_ref().unwrap()
-else {
-panic!("yaml-configured policy must stay file-based");
-};
-assert_eq!(alpha_policy, &temp.path().join("policies/alpha.yaml"));
-assert_eq!(beta.graph_id, "beta");
-assert!(beta.policy.is_none());
-}
-/// `server.policy.file` resolves alongside the graphs map.
-#[tokio::test]
-async fn server_policy_file_is_resolved_relative_to_base_dir() {
-let temp = tempfile::tempdir().unwrap();
-let config_path = temp.path().join("omnigraph.yaml");
-fs::write(
-&config_path,
-r#"
-server:
-policy:
-file: ./server-policy.yaml
-graphs:
-alpha:
-uri: /tmp/alpha.omni
-"#,
-)
-.unwrap();
-let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap();
-match settings.mode {
-ServerConfigMode::Multi { server_policy, .. } => {
-let omnigraph_server::PolicySource::File(path) = server_policy.unwrap() else {
-panic!("yaml-configured server policy must stay file-based");
-};
-assert_eq!(path, temp.path().join("server-policy.yaml"));
-}
-_ => panic!("expected Multi"),
-}
-}
 /// `GET /graphs` must NOT leak the registry in Open mode without
 /// an explicit server policy. Operators who pass `--unauthenticated`
 /// opted into trusting the network for graph DATA, not for leaking
@ -786,28 +396,6 @@ graphs:
); );
} }
/// `GET /graphs` returns 405 in single mode (resource exists in the
/// API surface, just not operational without a `graphs:` map).
#[tokio::test(flavor = "multi_thread")]
async fn get_graphs_returns_405_in_single_mode() {
let temp = init_loaded_graph().await;
let graph = graph_path(temp.path());
let state = AppState::open(graph.to_string_lossy().to_string())
.await
.unwrap();
let app = build_app(state);
let resp = app
.oneshot(
Request::builder()
.method(Method::GET)
.uri("/graphs")
.body(Body::empty())
.unwrap(),
)
.await
.unwrap();
assert_eq!(resp.status(), StatusCode::METHOD_NOT_ALLOWED);
}
/// `GET /graphs` requires bearer auth when tokens are configured. /// `GET /graphs` requires bearer auth when tokens are configured.
#[tokio::test(flavor = "multi_thread")] #[tokio::test(flavor = "multi_thread")]
@ -971,52 +559,4 @@ rules:
); );
} }
/// Loads an `omnigraph.yaml` with two graphs and verifies multi-mode
/// inference plus graph entry resolution. Cluster-route dispatch is
/// covered by the route tests above.
#[tokio::test(flavor = "multi_thread")]
async fn server_settings_load_multi_graph_config_entries() {
let cfg_dir = tempfile::tempdir().unwrap();
// Real graph storage dirs (the URIs in the config must point to
// a graph init-able location).
let alpha_dir = cfg_dir.path().join("alpha.omni");
let beta_dir = cfg_dir.path().join("beta.omni");
let schema = fs::read_to_string(fixture("test.pg")).unwrap();
Omnigraph::init(alpha_dir.to_str().unwrap(), &schema)
.await
.unwrap();
Omnigraph::init(beta_dir.to_str().unwrap(), &schema)
.await
.unwrap();
let config_path = cfg_dir.path().join("omnigraph.yaml");
fs::write(
&config_path,
format!(
r#"
graphs:
alpha:
uri: {alpha}
beta:
uri: {beta}
"#,
alpha = alpha_dir.display(),
beta = beta_dir.display(),
),
)
.unwrap();
let settings: ServerConfig =
load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap();
assert!(matches!(settings.mode, ServerConfigMode::Multi { .. }));
match settings.mode {
ServerConfigMode::Multi { graphs, .. } => {
assert_eq!(graphs.len(), 2);
let ids: Vec<&str> = graphs.iter().map(|g| g.graph_id.as_str()).collect();
assert_eq!(ids, vec!["alpha", "beta"]);
}
_ => unreachable!(),
}
}
} }


@ -63,7 +63,7 @@ async fn export_route_returns_jsonl_for_branch_snapshot() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/export") .uri(g("/export"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", format!("Bearer {}", token)) .header("authorization", format!("Bearer {}", token))
@ -99,7 +99,7 @@ async fn snapshot_route_returns_manifest_dataset_version() {
let (snapshot_status, snapshot_body) = json_response( let (snapshot_status, snapshot_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/snapshot?branch=main") .uri(g("/snapshot?branch=main"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -131,7 +131,7 @@ async fn ingest_creates_branch_returns_metadata_and_stamps_actor() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("authorization", "Bearer token-one") .header("authorization", "Bearer token-one")
.header("content-type", "application/json") .header("content-type", "application/json")
@ -195,7 +195,7 @@ async fn ingest_existing_branch_skips_branch_create_policy_check() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("authorization", "Bearer team-token") .header("authorization", "Bearer team-token")
.header("content-type", "application/json") .header("content-type", "application/json")
@ -223,7 +223,7 @@ async fn ingest_without_from_returns_404_for_missing_branch_and_creates_nothing(
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&ingest).unwrap())) .body(Body::from(serde_json::to_vec(&ingest).unwrap()))
@ -264,7 +264,7 @@ async fn ingest_without_from_loads_into_existing_branch() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&ingest).unwrap())) .body(Body::from(serde_json::to_vec(&ingest).unwrap()))
@ -294,7 +294,7 @@ async fn ingest_denies_missing_branch_without_branch_create_permission() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("authorization", "Bearer team-token") .header("authorization", "Bearer team-token")
.header("content-type", "application/json") .header("content-type", "application/json")
@ -327,7 +327,7 @@ async fn ingest_denies_when_actor_lacks_change_permission() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("authorization", "Bearer team-token") .header("authorization", "Bearer team-token")
.header("content-type", "application/json") .header("content-type", "application/json")
@ -357,7 +357,7 @@ async fn ingest_rejects_payloads_over_32_mib() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&oversize).unwrap())) .body(Body::from(serde_json::to_vec(&oversize).unwrap()))
@ -419,7 +419,7 @@ async fn branch_merge_conflict_response_includes_structured_conflicts() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches/merge") .uri(g("/branches/merge"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&merge).unwrap())) .body(Body::from(serde_json::to_vec(&merge).unwrap()))
@ -451,7 +451,7 @@ async fn repeated_read_after_change_sees_updated_state_from_same_app() {
let (change_status, change_body) = json_response( let (change_status, change_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&change).unwrap())) .body(Body::from(serde_json::to_vec(&change).unwrap()))
@ -471,7 +471,7 @@ async fn repeated_read_after_change_sees_updated_state_from_same_app() {
let (read_status, read_body) = json_response( let (read_status, read_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/read") .uri(g("/read"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&read).unwrap())) .body(Body::from(serde_json::to_vec(&read).unwrap()))
@ -497,7 +497,7 @@ async fn query_endpoint_runs_inline_read() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/query") .uri(g("/query"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&query).unwrap())) .body(Body::from(serde_json::to_vec(&query).unwrap()))
@ -524,7 +524,7 @@ async fn query_endpoint_rejects_mutation_with_400() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/query") .uri(g("/query"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&query).unwrap())) .body(Body::from(serde_json::to_vec(&query).unwrap()))
@ -555,7 +555,7 @@ async fn mutate_endpoint_runs_inline_mutation() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/mutate") .uri(g("/mutate"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&request).unwrap())) .body(Body::from(serde_json::to_vec(&request).unwrap()))
@ -580,7 +580,7 @@ async fn mutate_endpoint_runs_inline_mutation() {
#[tokio::test(flavor = "multi_thread")] #[tokio::test(flavor = "multi_thread")]
async fn change_endpoint_emits_deprecation_headers() { async fn change_endpoint_emits_deprecation_headers() {
// `/change` is kept indefinitely for back-compat but flagged at runtime // `/change` is kept indefinitely for back-compat but flagged at runtime
// per RFC 9745 (`Deprecation: true`) + RFC 8288 (`Link: </mutate>; // per RFC 9745 (`Deprecation: true`) + RFC 8288 (`Link: <mutate>;
// rel="successor-version"`). The OpenAPI side is covered by // rel="successor-version"`). The OpenAPI side is covered by
// `openapi_change_is_deprecated` in tests/openapi.rs. // `openapi_change_is_deprecated` in tests/openapi.rs.
let (_temp, app) = app_for_loaded_graph().await; let (_temp, app) = app_for_loaded_graph().await;
@ -595,7 +595,7 @@ async fn change_endpoint_emits_deprecation_headers() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&request).unwrap())) .body(Body::from(serde_json::to_vec(&request).unwrap()))
@ -615,7 +615,7 @@ async fn change_endpoint_emits_deprecation_headers() {
); );
assert_eq!( assert_eq!(
response.headers().get("link").and_then(|v| v.to_str().ok()), response.headers().get("link").and_then(|v| v.to_str().ok()),
Some("</mutate>; rel=\"successor-version\""), Some("<mutate>; rel=\"successor-version\""),
"POST /change must point at /mutate via `Link` rel=successor-version (RFC 8288)" "POST /change must point at /mutate via `Link` rel=successor-version (RFC 8288)"
); );
} }
@ -635,7 +635,7 @@ async fn load_endpoint_loads_into_existing_branch() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/load") .uri(g("/load"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&request).unwrap())) .body(Body::from(serde_json::to_vec(&request).unwrap()))
@ -658,7 +658,7 @@ async fn load_endpoint_loads_into_existing_branch() {
#[tokio::test(flavor = "multi_thread")] #[tokio::test(flavor = "multi_thread")]
async fn ingest_endpoint_emits_deprecation_headers() { async fn ingest_endpoint_emits_deprecation_headers() {
// `/ingest` is the deprecated alias of `/load` (RFC-009 Phase 5): flagged // `/ingest` is the deprecated alias of `/load` (RFC-009 Phase 5): flagged
// at runtime per RFC 9745 (`Deprecation: true`) + RFC 8288 (`Link: </load>; // at runtime per RFC 9745 (`Deprecation: true`) + RFC 8288 (`Link: <load>;
// rel="successor-version"`). The OpenAPI side is covered by // rel="successor-version"`). The OpenAPI side is covered by
// `openapi_ingest_is_deprecated` in tests/openapi.rs. // `openapi_ingest_is_deprecated` in tests/openapi.rs.
let (_temp, app) = app_for_loaded_graph().await; let (_temp, app) = app_for_loaded_graph().await;
@ -672,7 +672,7 @@ async fn ingest_endpoint_emits_deprecation_headers() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&request).unwrap())) .body(Body::from(serde_json::to_vec(&request).unwrap()))
@ -692,7 +692,7 @@ async fn ingest_endpoint_emits_deprecation_headers() {
); );
assert_eq!( assert_eq!(
response.headers().get("link").and_then(|v| v.to_str().ok()), response.headers().get("link").and_then(|v| v.to_str().ok()),
Some("</load>; rel=\"successor-version\""), Some("<load>; rel=\"successor-version\""),
"POST /ingest must point at /load via `Link` rel=successor-version (RFC 8288)" "POST /ingest must point at /load via `Link` rel=successor-version (RFC 8288)"
); );
} }
@ -714,7 +714,7 @@ async fn read_endpoint_emits_deprecation_headers() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/read") .uri(g("/read"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&request).unwrap())) .body(Body::from(serde_json::to_vec(&request).unwrap()))
@ -734,7 +734,7 @@ async fn read_endpoint_emits_deprecation_headers() {
); );
assert_eq!( assert_eq!(
response.headers().get("link").and_then(|v| v.to_str().ok()), response.headers().get("link").and_then(|v| v.to_str().ok()),
Some("</query>; rel=\"successor-version\""), Some("<query>; rel=\"successor-version\""),
"POST /read must point at /query via `Link` rel=successor-version (RFC 8288)" "POST /read must point at /query via `Link` rel=successor-version (RFC 8288)"
); );
} }
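Note on the `Link` change in the three deprecation tests above: the target moves from an absolute-path reference (`</mutate>`) to a relative-path reference (`<mutate>`). Presumably this lets the successor link resolve under whatever prefix the request was served from. The sketch below illustrates that resolution, assuming RFC 3986 merge semantics. `resolve_link_target` is a hypothetical helper, not part of `omnigraph_server`, and it deliberately ignores dot segments, queries, and fragments.

```rust
/// Hypothetical helper (not in the crate): resolve a `Link` target against
/// the path of the request that produced it.
///
/// Only absolute-path references ("/mutate") and plain relative-path
/// references ("mutate") are handled; "." / ".." segments are not.
fn resolve_link_target(request_path: &str, target: &str) -> String {
    if target.starts_with('/') {
        // Absolute-path reference: replaces the whole request path.
        return target.to_string();
    }
    // Relative-path reference: keep everything up to and including the
    // last '/' of the request path, then append the target.
    match request_path.rfind('/') {
        Some(idx) => format!("{}{}", &request_path[..=idx], target),
        None => format!("/{}", target),
    }
}
```

With a request to `/graphs/alpha/change` (the graph id `alpha` is illustrative), the relative target `mutate` resolves to `/graphs/alpha/mutate`, while the old absolute target `/mutate` would resolve to `/mutate` and drop the graph prefix.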
@ -757,7 +757,7 @@ async fn query_endpoint_does_not_emit_deprecation_headers() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/query") .uri(g("/query"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&request).unwrap())) .body(Body::from(serde_json::to_vec(&request).unwrap()))
@ -789,7 +789,7 @@ async fn change_endpoint_accepts_legacy_field_names() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&legacy_body).unwrap())) .body(Body::from(serde_json::to_vec(&legacy_body).unwrap()))
@ -808,7 +808,7 @@ async fn change_endpoint_accepts_legacy_field_names() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&canonical_body).unwrap())) .body(Body::from(serde_json::to_vec(&canonical_body).unwrap()))
@ -826,7 +826,7 @@ async fn remote_branch_list_create_merge_flow_works() {
let (list_status, list_body) = json_response( let (list_status, list_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches") .uri(g("/branches"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -842,7 +842,7 @@ async fn remote_branch_list_create_merge_flow_works() {
let (create_status, create_body) = json_response( let (create_status, create_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches") .uri(g("/branches"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&create).unwrap())) .body(Body::from(serde_json::to_vec(&create).unwrap()))
@ -856,7 +856,7 @@ async fn remote_branch_list_create_merge_flow_works() {
let (list_status, list_body) = json_response( let (list_status, list_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches") .uri(g("/branches"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -874,7 +874,7 @@ async fn remote_branch_list_create_merge_flow_works() {
let (change_status, change_body) = json_response( let (change_status, change_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&change).unwrap())) .body(Body::from(serde_json::to_vec(&change).unwrap()))
@ -895,7 +895,7 @@ async fn remote_branch_list_create_merge_flow_works() {
let (read_status, read_body) = json_response( let (read_status, read_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/read") .uri(g("/read"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&read_main_before).unwrap())) .body(Body::from(serde_json::to_vec(&read_main_before).unwrap()))
@ -912,7 +912,7 @@ async fn remote_branch_list_create_merge_flow_works() {
let (merge_status, merge_body) = json_response( let (merge_status, merge_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches/merge") .uri(g("/branches/merge"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&merge).unwrap())) .body(Body::from(serde_json::to_vec(&merge).unwrap()))
@ -934,7 +934,7 @@ async fn remote_branch_list_create_merge_flow_works() {
let (read_status, read_body) = json_response( let (read_status, read_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/read") .uri(g("/read"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&read_main_after).unwrap())) .body(Body::from(serde_json::to_vec(&read_main_after).unwrap()))
@ -957,7 +957,7 @@ async fn remote_branch_delete_flow_works() {
let (create_status, _) = json_response( let (create_status, _) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches") .uri(g("/branches"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&create).unwrap())) .body(Body::from(serde_json::to_vec(&create).unwrap()))
@ -969,7 +969,7 @@ async fn remote_branch_delete_flow_works() {
let (delete_status, delete_body) = json_response( let (delete_status, delete_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches/feature") .uri(g("/branches/feature"))
.method(Method::DELETE) .method(Method::DELETE)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -981,7 +981,7 @@ async fn remote_branch_delete_flow_works() {
let (list_status, list_body) = json_response( let (list_status, list_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches") .uri(g("/branches"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -1009,7 +1009,7 @@ async fn branch_delete_denies_without_policy_permission() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches/feature") .uri(g("/branches/feature"))
.method(Method::DELETE) .method(Method::DELETE)
.header("authorization", "Bearer token-team") .header("authorization", "Bearer token-team")
.body(Body::empty()) .body(Body::empty())
@ -1081,7 +1081,7 @@ query vector_search_string($q: String) {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/read") .uri(g("/read"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&read).unwrap())) .body(Body::from(serde_json::to_vec(&read).unwrap()))
@ -1134,7 +1134,7 @@ async fn change_conflict_returns_manifest_conflict_409() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from( .body(Body::from(
@ -1206,7 +1206,7 @@ async fn change_concurrent_inserts_same_key_serialize_without_409() {
}) })
.unwrap(); .unwrap();
let req = Request::builder() let req = Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -1238,7 +1238,7 @@ async fn change_concurrent_inserts_same_key_serialize_without_409() {
let (snapshot_status, snapshot_body) = json_response( let (snapshot_status, snapshot_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/snapshot?branch=main") .uri(g("/snapshot?branch=main"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -1319,7 +1319,7 @@ async fn change_concurrent_updates_same_key_serialize_via_publisher_cas() {
}) })
.unwrap(); .unwrap();
let req = Request::builder() let req = Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -1428,7 +1428,7 @@ query insert_c($name: String) {
}) })
.unwrap(); .unwrap();
let req = Request::builder() let req = Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -1445,7 +1445,7 @@ query insert_c($name: String) {
}) })
.unwrap(); .unwrap();
let req = Request::builder() let req = Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -1474,7 +1474,7 @@ query insert_c($name: String) {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/snapshot?branch=main") .uri(g("/snapshot?branch=main"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -1582,7 +1582,7 @@ async fn ingest_per_actor_admission_cap_returns_429() {
}) })
.unwrap(); .unwrap();
let req = Request::builder() let req = Request::builder()
.uri("/ingest") .uri(g("/ingest"))
.method(Method::POST) .method(Method::POST)
.header("authorization", "Bearer flooder-token") .header("authorization", "Bearer flooder-token")
.header("content-type", "application/json") .header("content-type", "application/json")
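Note on `g(...)`: every protected route in this file is now wrapped in `g(...)`, but the helper's definition falls outside the hunks shown. A minimal sketch of what such a helper might look like, assuming it nests a flat route under `/graphs/{graph_id}` to match the served OpenAPI layout. The constant `TEST_GRAPH_ID` and its value are hypothetical; the real fixture may use a different id or derive it elsewhere.

```rust
/// Hypothetical graph id for the test fixture (an assumption, not taken
/// from the diff).
const TEST_GRAPH_ID: &str = "default";

/// Sketch of a route-prefixing helper: turns a flat route such as
/// "/export" into its graph-scoped form "/graphs/<id>/export".
/// Query strings are carried through unchanged.
fn g(path: &str) -> String {
    format!("/graphs/{}{}", TEST_GRAPH_ID, path)
}
```

Under these assumptions `g("/snapshot?branch=main")` yields `/graphs/default/snapshot?branch=main`. Returning an owned `String` keeps call sites such as `.uri(g("/ingest"))` unchanged in shape, since the `http` request builder accepts a `String` for the URI.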


@ -248,7 +248,7 @@ async fn concurrent_branch_ops_morphological_matrix() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/branches") .uri(g("/branches"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -369,7 +369,7 @@ async fn concurrent_branch_ops_morphological_matrix() {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/snapshot?branch=main") .uri(g("/snapshot?branch=main"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -717,31 +717,15 @@ graphs:
#[tokio::test] #[tokio::test]
async fn cluster_boot_refusals() { async fn cluster_boot_refusals() {
// Mutual exclusion with --config / URI. // RFC-011 cluster-only: with no --cluster, boot refuses with the
// cluster-required remedy.
let err = omnigraph_server::load_server_settings(None, None, true)
.await
.unwrap_err();
assert!(err.to_string().contains("boots from a cluster"), "{err}");
let temp = converged_cluster_dir("").await; let temp = converged_cluster_dir("").await;
let dir = temp.path().to_path_buf(); let dir = temp.path().to_path_buf();
let err = omnigraph_server::load_server_settings(
Some(&dir.join("omnigraph.yaml")),
Some(&dir),
None,
None,
None,
true,
)
.await
.unwrap_err();
assert!(err.to_string().contains("exclusive boot source"), "{err}");
let err = omnigraph_server::load_server_settings(
None,
Some(&dir),
Some("file:///tmp/x.omni".to_string()),
None,
None,
true,
)
.await
.unwrap_err();
assert!(err.to_string().contains("exclusive boot source"), "{err}");
// Tampered catalog blob refuses boot with the remedy. // Tampered catalog blob refuses boot with the remedy.
let blob_dir = dir.join("__cluster/resources/query/knowledge/find_person"); let blob_dir = dir.join("__cluster/resources/query/knowledge/find_person");


@ -8,10 +8,9 @@ use axum::body::{Body, to_bytes};
use axum::http::{Method, Request, StatusCode}; use axum::http::{Method, Request, StatusCode};
use omnigraph::db::Omnigraph; use omnigraph::db::Omnigraph;
use omnigraph::loader::{LoadMode, load_jsonl}; use omnigraph::loader::{LoadMode, load_jsonl};
use omnigraph_server::{ApiDoc, AppState, build_app}; use omnigraph_server::{AppState, build_app, served_openapi};
use serde_json::Value; use serde_json::Value;
use tower::ServiceExt; use tower::ServiceExt;
use utoipa::OpenApi;
fn fixture(name: &str) -> PathBuf { fn fixture(name: &str) -> PathBuf {
PathBuf::from(env!("CARGO_MANIFEST_DIR")) PathBuf::from(env!("CARGO_MANIFEST_DIR"))
@ -71,7 +70,10 @@ async fn json_response(app: &Router, request: Request<Body>) -> (StatusCode, Val
} }
fn openapi_doc() -> utoipa::openapi::OpenApi { fn openapi_doc() -> utoipa::openapi::OpenApi {
ApiDoc::openapi() // RFC-011 cluster-only: the canonical committed spec is the SERVED
// shape — protected routes nested under `/graphs/{graph_id}/…`,
// `/healthz` and `/graphs` flat. This matches what the server serves.
served_openapi()
} }
fn openapi_json() -> Value { fn openapi_json() -> Value {
@ -159,26 +161,28 @@ fn openapi_info_contains_version() {
// Path coverage tests // Path coverage tests
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// The canonical served spec keeps `/healthz` and `/graphs` flat; every
// protected route nests under `/graphs/{graph_id}/…`.
const EXPECTED_PATHS: &[&str] = &[ const EXPECTED_PATHS: &[&str] = &[
"/healthz", "/healthz",
"/graphs", "/graphs",
"/snapshot", "/graphs/{graph_id}/snapshot",
"/read", "/graphs/{graph_id}/read",
"/query", "/graphs/{graph_id}/query",
"/export", "/graphs/{graph_id}/export",
"/change", "/graphs/{graph_id}/change",
"/mutate", "/graphs/{graph_id}/mutate",
"/queries", "/graphs/{graph_id}/queries",
"/queries/{name}", "/graphs/{graph_id}/queries/{name}",
"/schema", "/graphs/{graph_id}/schema",
"/schema/apply", "/graphs/{graph_id}/schema/apply",
"/load", "/graphs/{graph_id}/load",
"/ingest", "/graphs/{graph_id}/ingest",
"/branches", "/graphs/{graph_id}/branches",
"/branches/{branch}", "/graphs/{graph_id}/branches/{branch}",
"/branches/merge", "/graphs/{graph_id}/branches/merge",
"/commits", "/graphs/{graph_id}/commits",
"/commits/{commit_id}", "/graphs/{graph_id}/commits/{commit_id}",
]; ];
#[test] #[test]
@ -222,25 +226,25 @@ fn openapi_healthz_is_get() {
#[test] #[test]
fn openapi_read_is_post() { fn openapi_read_is_post() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/read"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/read"]["post"].is_object());
} }
#[test] #[test]
fn openapi_export_is_post() { fn openapi_export_is_post() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/export"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/export"]["post"].is_object());
} }
#[test] #[test]
fn openapi_change_is_post() { fn openapi_change_is_post() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/change"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/change"]["post"].is_object());
} }
#[test] #[test]
fn openapi_mutate_is_post() { fn openapi_mutate_is_post() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/mutate"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/mutate"]["post"].is_object());
} }
// Deprecation flagging — `/read` and `/change` are kept indefinitely for // Deprecation flagging — `/read` and `/change` are kept indefinitely for
@ -253,7 +257,7 @@ fn openapi_mutate_is_post() {
fn openapi_read_is_deprecated() { fn openapi_read_is_deprecated() {
let doc = openapi_json(); let doc = openapi_json();
assert_eq!( assert_eq!(
doc["paths"]["/read"]["post"]["deprecated"], doc["paths"]["/graphs/{graph_id}/read"]["post"]["deprecated"],
serde_json::Value::Bool(true), serde_json::Value::Bool(true),
"/read must be flagged deprecated in OpenAPI; use /query instead" "/read must be flagged deprecated in OpenAPI; use /query instead"
); );
@ -263,7 +267,7 @@ fn openapi_read_is_deprecated() {
fn openapi_change_is_deprecated() { fn openapi_change_is_deprecated() {
let doc = openapi_json(); let doc = openapi_json();
assert_eq!( assert_eq!(
doc["paths"]["/change"]["post"]["deprecated"], doc["paths"]["/graphs/{graph_id}/change"]["post"]["deprecated"],
serde_json::Value::Bool(true), serde_json::Value::Bool(true),
"/change must be flagged deprecated in OpenAPI; use /mutate instead" "/change must be flagged deprecated in OpenAPI; use /mutate instead"
); );
@ -272,7 +276,7 @@ fn openapi_change_is_deprecated() {
#[test] #[test]
fn openapi_query_is_not_deprecated() { fn openapi_query_is_not_deprecated() {
let doc = openapi_json(); let doc = openapi_json();
let deprecated = doc["paths"]["/query"]["post"] let deprecated = doc["paths"]["/graphs/{graph_id}/query"]["post"]
.get("deprecated") .get("deprecated")
.and_then(serde_json::Value::as_bool) .and_then(serde_json::Value::as_bool)
.unwrap_or(false); .unwrap_or(false);
@ -285,7 +289,7 @@ fn openapi_query_is_not_deprecated() {
#[test] #[test]
fn openapi_mutate_is_not_deprecated() { fn openapi_mutate_is_not_deprecated() {
let doc = openapi_json(); let doc = openapi_json();
let deprecated = doc["paths"]["/mutate"]["post"] let deprecated = doc["paths"]["/graphs/{graph_id}/mutate"]["post"]
.get("deprecated") .get("deprecated")
.and_then(serde_json::Value::as_bool) .and_then(serde_json::Value::as_bool)
.unwrap_or(false); .unwrap_or(false);
@ -298,15 +302,15 @@ fn openapi_mutate_is_not_deprecated() {
#[test] #[test]
fn openapi_ingest_is_post() { fn openapi_ingest_is_post() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/ingest"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/ingest"]["post"].is_object());
} }
#[test] #[test]
fn openapi_load_is_not_deprecated() { fn openapi_load_is_not_deprecated() {
// RFC-009 Phase 5: /load is the canonical bulk-load endpoint. // RFC-009 Phase 5: /load is the canonical bulk-load endpoint.
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/load"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/load"]["post"].is_object());
let deprecated = doc["paths"]["/load"]["post"] let deprecated = doc["paths"]["/graphs/{graph_id}/load"]["post"]
.get("deprecated") .get("deprecated")
.and_then(serde_json::Value::as_bool) .and_then(serde_json::Value::as_bool)
.unwrap_or(false); .unwrap_or(false);
@ -321,7 +325,7 @@ fn openapi_ingest_is_deprecated() {
// RFC-009 Phase 5: /ingest is now the deprecated alias of /load. // RFC-009 Phase 5: /ingest is now the deprecated alias of /load.
let doc = openapi_json(); let doc = openapi_json();
assert_eq!( assert_eq!(
doc["paths"]["/ingest"]["post"]["deprecated"], doc["paths"]["/graphs/{graph_id}/ingest"]["post"]["deprecated"],
serde_json::Value::Bool(true), serde_json::Value::Bool(true),
"/ingest must be flagged deprecated now that /load is canonical" "/ingest must be flagged deprecated now that /load is canonical"
); );
@ -330,32 +334,32 @@ fn openapi_ingest_is_deprecated() {
#[test] #[test]
fn openapi_branches_supports_get_and_post() { fn openapi_branches_supports_get_and_post() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/branches"]["get"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/branches"]["get"].is_object());
assert!(doc["paths"]["/branches"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/branches"]["post"].is_object());
} }
#[test] #[test]
fn openapi_branch_delete_is_delete() { fn openapi_branch_delete_is_delete() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/branches/{branch}"]["delete"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/branches/{branch}"]["delete"].is_object());
} }
#[test] #[test]
fn openapi_branch_merge_is_post() { fn openapi_branch_merge_is_post() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/branches/merge"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/branches/merge"]["post"].is_object());
} }
#[test] #[test]
fn openapi_commits_is_get() { fn openapi_commits_is_get() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/commits"]["get"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/commits"]["get"].is_object());
} }
#[test] #[test]
fn openapi_commit_show_is_get() { fn openapi_commit_show_is_get() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/commits/{commit_id}"]["get"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/commits/{commit_id}"]["get"].is_object());
} }
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
@ -510,13 +514,13 @@ fn query_request_query_is_required() {
#[test] #[test]
fn openapi_query_is_post() { fn openapi_query_is_post() {
let doc = openapi_json(); let doc = openapi_json();
assert!(doc["paths"]["/query"]["post"].is_object()); assert!(doc["paths"]["/graphs/{graph_id}/query"]["post"].is_object());
} }
#[test] #[test]
fn query_endpoint_documents_mutation_400() { fn query_endpoint_documents_mutation_400() {
let doc = openapi_json(); let doc = openapi_json();
let four_hundred = &doc["paths"]["/query"]["post"]["responses"]["400"]; let four_hundred = &doc["paths"]["/graphs/{graph_id}/query"]["post"]["responses"]["400"];
let description = four_hundred["description"].as_str().unwrap_or_default(); let description = four_hundred["description"].as_str().unwrap_or_default();
assert!( assert!(
description.contains("mutations") || description.contains("POST /mutate"), description.contains("mutations") || description.contains("POST /mutate"),
@ -727,21 +731,21 @@ fn openapi_defines_bearer_token_security_scheme() {
fn protected_endpoints_reference_bearer_token_security() { fn protected_endpoints_reference_bearer_token_security() {
let doc = openapi_json(); let doc = openapi_json();
let protected_paths = [ let protected_paths = [
("/read", "post"), ("/graphs/{graph_id}/read", "post"),
("/change", "post"), ("/graphs/{graph_id}/change", "post"),
("/schema/apply", "post"), ("/graphs/{graph_id}/schema/apply", "post"),
("/queries", "get"), ("/graphs/{graph_id}/queries", "get"),
("/queries/{name}", "post"), ("/graphs/{graph_id}/queries/{name}", "post"),
("/load", "post"), ("/graphs/{graph_id}/load", "post"),
("/ingest", "post"), ("/graphs/{graph_id}/ingest", "post"),
("/export", "post"), ("/graphs/{graph_id}/export", "post"),
("/snapshot", "get"), ("/graphs/{graph_id}/snapshot", "get"),
("/branches", "get"), ("/graphs/{graph_id}/branches", "get"),
("/branches", "post"), ("/graphs/{graph_id}/branches", "post"),
("/branches/{branch}", "delete"), ("/graphs/{graph_id}/branches/{branch}", "delete"),
("/branches/merge", "post"), ("/graphs/{graph_id}/branches/merge", "post"),
("/commits", "get"), ("/graphs/{graph_id}/commits", "get"),
("/commits/{commit_id}", "get"), ("/graphs/{graph_id}/commits/{commit_id}", "get"),
]; ];
for (path, method) in protected_paths { for (path, method) in protected_paths {
@ -773,7 +777,7 @@ fn healthz_does_not_require_security() {
#[test] #[test]
fn branch_delete_has_branch_path_parameter() { fn branch_delete_has_branch_path_parameter() {
let doc = openapi_json(); let doc = openapi_json();
let params = doc["paths"]["/branches/{branch}"]["delete"]["parameters"] let params = doc["paths"]["/graphs/{graph_id}/branches/{branch}"]["delete"]["parameters"]
.as_array() .as_array()
.unwrap(); .unwrap();
let has_branch = params let has_branch = params
@ -788,7 +792,7 @@ fn branch_delete_has_branch_path_parameter() {
#[test] #[test]
fn commit_show_has_commit_id_path_parameter() { fn commit_show_has_commit_id_path_parameter() {
let doc = openapi_json(); let doc = openapi_json();
let params = doc["paths"]["/commits/{commit_id}"]["get"]["parameters"] let params = doc["paths"]["/graphs/{graph_id}/commits/{commit_id}"]["get"]["parameters"]
.as_array() .as_array()
.unwrap(); .unwrap();
let has_commit_id = params let has_commit_id = params
@ -803,7 +807,7 @@ fn commit_show_has_commit_id_path_parameter() {
#[test] #[test]
fn snapshot_has_branch_query_parameter() { fn snapshot_has_branch_query_parameter() {
let doc = openapi_json(); let doc = openapi_json();
let params = doc["paths"]["/snapshot"]["get"]["parameters"] let params = doc["paths"]["/graphs/{graph_id}/snapshot"]["get"]["parameters"]
.as_array() .as_array()
.unwrap(); .unwrap();
let has_branch = params let has_branch = params
@ -818,7 +822,7 @@ fn snapshot_has_branch_query_parameter() {
#[test] #[test]
fn commits_has_branch_query_parameter() { fn commits_has_branch_query_parameter() {
let doc = openapi_json(); let doc = openapi_json();
let params = doc["paths"]["/commits"]["get"]["parameters"] let params = doc["paths"]["/graphs/{graph_id}/commits"]["get"]["parameters"]
.as_array() .as_array()
.unwrap(); .unwrap();
let has_branch = params let has_branch = params
@ -858,7 +862,7 @@ fn openapi_operations_have_tags() {
#[test] #[test]
fn read_endpoint_200_references_read_output_schema() { fn read_endpoint_200_references_read_output_schema() {
let doc = openapi_json(); let doc = openapi_json();
let content = &doc["paths"]["/read"]["post"]["responses"]["200"]["content"]; let content = &doc["paths"]["/graphs/{graph_id}/read"]["post"]["responses"]["200"]["content"];
let schema = &content["application/json"]["schema"]; let schema = &content["application/json"]["schema"];
let ref_path = schema["$ref"].as_str().unwrap(); let ref_path = schema["$ref"].as_str().unwrap();
assert!( assert!(
@ -870,7 +874,7 @@ fn read_endpoint_200_references_read_output_schema() {
#[test] #[test]
fn change_endpoint_200_references_change_output_schema() { fn change_endpoint_200_references_change_output_schema() {
let doc = openapi_json(); let doc = openapi_json();
let content = &doc["paths"]["/change"]["post"]["responses"]["200"]["content"]; let content = &doc["paths"]["/graphs/{graph_id}/change"]["post"]["responses"]["200"]["content"];
let schema = &content["application/json"]["schema"]; let schema = &content["application/json"]["schema"];
let ref_path = schema["$ref"].as_str().unwrap(); let ref_path = schema["$ref"].as_str().unwrap();
assert!( assert!(
@ -895,11 +899,11 @@ fn healthz_200_references_health_output_schema() {
fn error_responses_reference_error_output_schema() { fn error_responses_reference_error_output_schema() {
let doc = openapi_json(); let doc = openapi_json();
let paths_with_errors = [ let paths_with_errors = [
("/read", "post", "400"), ("/graphs/{graph_id}/read", "post", "400"),
("/read", "post", "401"), ("/graphs/{graph_id}/read", "post", "401"),
("/change", "post", "400"), ("/graphs/{graph_id}/change", "post", "400"),
("/change", "post", "409"), ("/graphs/{graph_id}/change", "post", "409"),
("/branches", "post", "409"), ("/graphs/{graph_id}/branches", "post", "409"),
]; ];
for (path, method, status) in paths_with_errors { for (path, method, status) in paths_with_errors {
@ -921,13 +925,13 @@ fn error_responses_reference_error_output_schema() {
fn post_endpoints_have_request_body() { fn post_endpoints_have_request_body() {
let doc = openapi_json(); let doc = openapi_json();
let post_paths = [ let post_paths = [
("/read", "ReadRequest"), ("/graphs/{graph_id}/read", "ReadRequest"),
("/change", "ChangeRequest"), ("/graphs/{graph_id}/change", "ChangeRequest"),
("/schema/apply", "SchemaApplyRequest"), ("/graphs/{graph_id}/schema/apply", "SchemaApplyRequest"),
("/ingest", "IngestRequest"), ("/graphs/{graph_id}/ingest", "IngestRequest"),
("/export", "ExportRequest"), ("/graphs/{graph_id}/export", "ExportRequest"),
("/branches", "BranchCreateRequest"), ("/graphs/{graph_id}/branches", "BranchCreateRequest"),
("/branches/merge", "BranchMergeRequest"), ("/graphs/{graph_id}/branches/merge", "BranchMergeRequest"),
]; ];
for (path, expected_schema) in post_paths { for (path, expected_schema) in post_paths {
@ -948,7 +952,7 @@ fn post_endpoints_have_request_body() {
#[test] #[test]
fn invoke_stored_query_request_body_is_optional() { fn invoke_stored_query_request_body_is_optional() {
let doc = openapi_json(); let doc = openapi_json();
let request_body = &doc["paths"]["/queries/{name}"]["post"]["requestBody"]; let request_body = &doc["paths"]["/graphs/{graph_id}/queries/{name}"]["post"]["requestBody"];
assert!( assert!(
request_body.is_object(), request_body.is_object(),
"POST /queries/{{name}} should document its optional request body" "POST /queries/{{name}} should document its optional request body"
@ -1051,12 +1055,14 @@ async fn auth_mode_spec_has_security_on_protected_operations() {
.body(Body::empty()) .body(Body::empty())
.unwrap(); .unwrap();
let (_, json) = json_response(&app, request).await; let (_, json) = json_response(&app, request).await;
// RFC-011 cluster-only: the served spec always nests protected
// routes under `/graphs/{graph_id}/...`.
let protected_paths = [ let protected_paths = [
("/read", "post"), ("/graphs/{graph_id}/read", "post"),
("/change", "post"), ("/graphs/{graph_id}/change", "post"),
("/snapshot", "get"), ("/graphs/{graph_id}/snapshot", "get"),
("/branches", "get"), ("/graphs/{graph_id}/branches", "get"),
("/commits", "get"), ("/graphs/{graph_id}/commits", "get"),
]; ];
for (path, method) in protected_paths { for (path, method) in protected_paths {
let security = &json["paths"][path][method]["security"]; let security = &json["paths"][path][method]["security"];
@ -1073,22 +1079,6 @@ async fn auth_mode_spec_has_security_on_protected_operations() {
} }
} }
#[tokio::test]
async fn auth_mode_spec_matches_static_generation() {
let (_temp, app) = app_for_loaded_graph_with_auth("secret").await;
let request = Request::builder()
.method(Method::GET)
.uri("/openapi.json")
.body(Body::empty())
.unwrap();
let (_, served) = json_response(&app, request).await;
let static_doc = openapi_json();
assert_eq!(
served, static_doc,
"auth-mode served spec must match static generation"
);
}
#[tokio::test] #[tokio::test]
async fn auth_mode_healthz_still_has_no_security() { async fn auth_mode_healthz_still_has_no_security() {
let (_temp, app) = app_for_loaded_graph_with_auth("secret").await; let (_temp, app) = app_for_loaded_graph_with_auth("secret").await;
@ -1394,8 +1384,9 @@ async fn multi_mode_operation_ids_are_unique() {
} }
#[tokio::test] #[tokio::test]
async fn single_mode_openapi_unchanged_by_cluster_filter() { async fn served_spec_always_nests_under_cluster_prefix() {
// Regression: single mode still emits the legacy flat surface. // RFC-011 cluster-only: even a one-graph convenience app serves the
// nested cluster surface and never the flat protected routes.
let (_temp, app) = app_for_loaded_graph().await; let (_temp, app) = app_for_loaded_graph().await;
let request = Request::builder() let request = Request::builder()
.method(Method::GET) .method(Method::GET)
@ -1405,16 +1396,37 @@ async fn single_mode_openapi_unchanged_by_cluster_filter() {
let (_, json) = json_response(&app, request).await; let (_, json) = json_response(&app, request).await;
let paths = json["paths"].as_object().unwrap(); let paths = json["paths"].as_object().unwrap();
let path_keys: HashSet<&str> = paths.keys().map(|k| k.as_str()).collect(); let path_keys: HashSet<&str> = paths.keys().map(|k| k.as_str()).collect();
for expected in EXPECTED_PATHS {
assert!(
path_keys.contains(expected),
"single mode must still emit flat path: {expected}"
);
}
for cluster in EXPECTED_CLUSTER_PATHS { for cluster in EXPECTED_CLUSTER_PATHS {
assert!( assert!(
!path_keys.contains(cluster), path_keys.contains(cluster),
"single mode must NOT emit cluster path: {cluster}" "served spec must emit cluster path: {cluster}. Found: {path_keys:?}"
);
}
// The flat protected routes must NOT appear — only the nested
// cluster surface plus the always-flat `/healthz` and `/graphs`.
let flat_protected = [
"/snapshot",
"/read",
"/query",
"/export",
"/change",
"/mutate",
"/queries",
"/queries/{name}",
"/schema",
"/schema/apply",
"/load",
"/ingest",
"/branches",
"/branches/{branch}",
"/branches/merge",
"/commits",
"/commits/{commit_id}",
];
for flat in flat_protected {
assert!(
!path_keys.contains(flat),
"served spec must NOT emit flat protected path: {flat}"
); );
} }
} }
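
Review note: the rewritten `served_spec_always_nests_under_cluster_prefix` test checks two things: every protected route appears under `/graphs/{graph_id}/...`, and no flat protected route is left in the served spec. The sketch below restates that check on plain string sets. `nest` and `leaked_flat_routes` are hypothetical names used only for illustration; they are not part of the server crate or its test helpers.

```rust
use std::collections::HashSet;

/// Hypothetical helper: nest a flat route under the cluster prefix.
/// `{{graph_id}}` in the format string yields the literal `{graph_id}`
/// template segment used as the OpenAPI path key.
fn nest(flat: &str) -> String {
    format!("/graphs/{{graph_id}}{}", flat)
}

/// Hypothetical helper: return the flat protected routes that still
/// appear among the served path keys. An empty result means the spec
/// exposes only the nested surface (plus always-flat routes).
fn leaked_flat_routes<'a>(served: &HashSet<&str>, flat: &[&'a str]) -> Vec<&'a str> {
    flat.iter().copied().filter(|p| served.contains(p)).collect()
}
```

Under these assumptions, a served spec passes when each `nest(route)` is present in the path keys and `leaked_flat_routes` returns an empty vector, while `/healthz` and `/graphs` may stay flat.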

@ -43,7 +43,7 @@ async fn server_opens_s3_graph_directly_and_serves_snapshot_and_read() {
let (snapshot_status, snapshot_body) = json_response( let (snapshot_status, snapshot_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/snapshot") .uri(g("/snapshot"))
.method(Method::GET) .method(Method::GET)
.header("authorization", "Bearer s3-token") .header("authorization", "Bearer s3-token")
.body(Body::empty()) .body(Body::empty())
@ -63,7 +63,7 @@ async fn server_opens_s3_graph_directly_and_serves_snapshot_and_read() {
let (read_status, read_body) = json_response( let (read_status, read_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/read") .uri(g("/read"))
.method(Method::POST) .method(Method::POST)
.header("authorization", "Bearer s3-token") .header("authorization", "Bearer s3-token")
.header("content-type", "application/json") .header("content-type", "application/json")
@ -134,11 +134,8 @@ async fn server_boots_cluster_from_bare_storage_uri_and_serves_query() {
} }
let settings = omnigraph_server::load_server_settings( let settings = omnigraph_server::load_server_settings(
None,
Some(&std::path::PathBuf::from(&root)), Some(&std::path::PathBuf::from(&root)),
None, None,
None,
None,
true, true,
) )
.await .await
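
Review note: the request URIs in these tests now go through a `g(...)` helper whose definition is not included in this diff. A minimal sketch of what such a helper might look like follows, assuming it only prefixes the route with the cluster segment and that the test graph is registered under the id `default` (the id the schema tests pass to `GraphId::try_from`). Both the constant name and the body are assumptions, not the repository's code.

```rust
/// Assumed graph id for the test app; the real helper may use another value.
const TEST_GRAPH_ID: &str = "default";

/// Hypothetical stand-in for the `g(...)` test helper: prefix a flat route
/// with `/graphs/<id>`. Plain prefixing leaves any query string untouched,
/// so `/snapshot?branch=main` keeps its `?branch=main` suffix.
fn g(route: &str) -> String {
    format!("/graphs/{}{}", TEST_GRAPH_ID, route)
}
```

Because the result is an owned `String`, it can be passed straight to `Request::builder().uri(...)`, which matches how the tests call it.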

@ -2,6 +2,7 @@
//! Moved verbatim from tests/server.rs in the modularization. //! Moved verbatim from tests/server.rs in the modularization.
use std::fs; use std::fs;
use std::sync::Arc;
use axum::body::Body; use axum::body::Body;
use axum::http::{Method, Request, StatusCode}; use axum::http::{Method, Request, StatusCode};
@ -11,7 +12,9 @@ use omnigraph::loader::LoadMode;
use omnigraph_server::api::{ use omnigraph_server::api::{
ChangeRequest, ErrorOutput, ReadRequest, SchemaApplyRequest, SchemaOutput, ChangeRequest, ErrorOutput, ReadRequest, SchemaApplyRequest, SchemaOutput,
}; };
use omnigraph_server::{AppState, build_app}; use omnigraph_server::{
AppState, GraphHandle, GraphId, GraphKey, PolicyEngine, build_app, workload,
};
use serde_json::json; use serde_json::json;
@ -30,7 +33,7 @@ async fn schema_apply_route_updates_graph_for_authorized_admin() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -54,6 +57,111 @@ async fn schema_apply_route_updates_graph_for_authorized_admin() {
); );
} }
#[tokio::test]
async fn schema_apply_route_refuses_cluster_backed_server_mode() {
let temp = init_graph_with_schema(&fs::read_to_string(fixture("test.pg")).unwrap()).await;
let graph = graph_path(temp.path());
let graph_uri = graph.to_string_lossy().to_string();
let engine = Omnigraph::open(&graph_uri).await.unwrap();
let handle = Arc::new(GraphHandle {
key: GraphKey::cluster(GraphId::try_from("default").unwrap()),
uri: graph_uri.clone(),
engine: Arc::new(engine),
policy: None,
queries: None,
});
let state = AppState::new_multi(
vec![handle],
Vec::new(),
None,
workload::WorkloadController::from_env(),
Some(temp.path().join("cluster.yaml")),
)
.unwrap();
let app = build_app(state);
let request = Request::builder()
.method(Method::POST)
.uri(g("/schema/apply"))
.header("content-type", "application/json")
.body(Body::from(
serde_json::to_vec(&SchemaApplyRequest {
schema_source: additive_schema_with_nickname(),
..Default::default()
})
.unwrap(),
))
.unwrap();
let (status, payload) = json_response(&app, request).await;
assert_eq!(status, StatusCode::CONFLICT, "body: {payload}");
assert!(
payload["error"]
.as_str()
.unwrap_or_default()
.contains("cluster apply"),
"body: {payload}"
);
let reopened = Omnigraph::open(&graph_uri).await.unwrap();
assert!(
!reopened.catalog().node_types["Person"]
.properties
.contains_key("nickname"),
"cluster-backed schema apply must not mutate the graph"
);
}
#[tokio::test]
async fn schema_apply_route_cluster_backed_denies_unauthorized_actor_before_409() {
// The cluster-backed 409 is reported AFTER the Cedar gate, so an actor
// without `schema_apply` permission gets a 403 — never a 409 that would
// disclose the server is cluster-backed (401 → 403 → 409, no topology leak
// before authorization). POLICY_YAML grants read/export but not schema_apply,
// so act-ragnor is denied.
let temp = init_graph_with_schema(&fs::read_to_string(fixture("test.pg")).unwrap()).await;
let graph = graph_path(temp.path());
let graph_uri = graph.to_string_lossy().to_string();
let engine = Omnigraph::open(&graph_uri).await.unwrap();
let policy = PolicyEngine::load_graph_from_source(POLICY_YAML, "default").unwrap();
let handle = Arc::new(GraphHandle {
key: GraphKey::cluster(GraphId::try_from("default").unwrap()),
uri: graph_uri,
engine: Arc::new(engine),
policy: Some(Arc::new(policy)),
queries: None,
});
let state = AppState::new_multi(
vec![handle],
vec![("act-ragnor".to_string(), "admin-token".to_string())],
None,
workload::WorkloadController::from_env(),
Some(temp.path().join("cluster.yaml")),
)
.unwrap();
let app = build_app(state);
let request = Request::builder()
.method(Method::POST)
.uri(g("/schema/apply"))
.header("content-type", "application/json")
.header("authorization", "Bearer admin-token")
.body(Body::from(
serde_json::to_vec(&SchemaApplyRequest {
schema_source: additive_schema_with_nickname(),
..Default::default()
})
.unwrap(),
))
.unwrap();
let (status, payload) = json_response(&app, request).await;
assert_eq!(
status,
StatusCode::FORBIDDEN,
"an unauthorized actor must get 403 before the cluster-backed 409: {payload}"
);
}
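
Review note: the two new cluster-backed tests pin an ordering of 401, then 403, then 409, so that an unauthorized caller never learns the server is cluster-backed. The sketch below models that ordering as a pure function. The `Facts` struct, its field names, and `schema_apply_gate` are hypothetical and exist only to illustrate the sequence; the actual handler is not shown in this diff.

```rust
/// Hypothetical request facts for illustration only.
struct Facts {
    /// Bearer tokens are configured, so a token is required.
    auth_required: bool,
    /// The request carried a token that maps to a known actor.
    has_valid_token: bool,
    /// The policy permits `schema_apply` (treated as true when no policy is set).
    policy_allows: bool,
    /// The server was booted from a cluster config.
    cluster_backed: bool,
}

/// Returns the rejecting HTTP status, or `None` if the apply may proceed.
/// The checks run in a fixed order so topology is disclosed only after
/// authentication and authorization have both passed.
fn schema_apply_gate(f: &Facts) -> Option<u16> {
    if f.auth_required && !f.has_valid_token {
        return Some(401); // unauthenticated
    }
    if !f.policy_allows {
        return Some(403); // authenticated but not permitted
    }
    if f.cluster_backed {
        return Some(409); // permitted, but schema changes go through cluster apply
    }
    None
}
```

In this model the first new test corresponds to no auth and no policy on a cluster-backed server (409), and the second to a valid token whose policy lacks `schema_apply` (403, even though the server is cluster-backed).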
#[tokio::test(flavor = "multi_thread")] #[tokio::test(flavor = "multi_thread")]
async fn schema_apply_route_rejects_stored_query_breakage_before_publish() { async fn schema_apply_route_rejects_stored_query_breakage_before_publish() {
let (temp, app) = app_with_stored_queries( let (temp, app) = app_with_stored_queries(
@ -65,7 +173,7 @@ async fn schema_apply_route_rejects_stored_query_breakage_before_publish() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -115,7 +223,7 @@ async fn schema_apply_route_noop_keeps_valid_stored_query_registry() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -142,7 +250,7 @@ async fn schema_apply_route_requires_schema_apply_policy_permission() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -173,7 +281,7 @@ async fn schema_apply_route_requires_bearer_token_when_policy_enabled() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from( .body(Body::from(
serde_json::to_vec(&SchemaApplyRequest { serde_json::to_vec(&SchemaApplyRequest {
@ -203,7 +311,7 @@ async fn schema_apply_route_can_rename_type() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -239,7 +347,7 @@ async fn schema_apply_route_can_rename_property() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -279,7 +387,7 @@ async fn schema_apply_route_can_add_index() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -294,6 +402,11 @@ async fn schema_apply_route_can_add_index() {
assert_eq!(status, StatusCode::OK); assert_eq!(status, StatusCode::OK);
assert_eq!(payload["applied"], true); assert_eq!(payload["applied"], true);
// iss-848: the /schema/apply route accepts the index-add and applies it as a
// metadata change — it records the `@index` intent in the catalog/IR but does
// NOT build the physical index inline (the build is deferred to
// ensure_indices/optimize; on this empty table nothing would build anyway).
// So the physical index count is unchanged by the apply.
let reopened = Omnigraph::open(graph.to_str().unwrap()).await.unwrap(); let reopened = Omnigraph::open(graph.to_str().unwrap()).await.unwrap();
let snapshot = reopened let snapshot = reopened
.snapshot_of(ReadTarget::branch("main")) .snapshot_of(ReadTarget::branch("main"))
@ -301,7 +414,10 @@ async fn schema_apply_route_can_add_index() {
.unwrap(); .unwrap();
let dataset = snapshot.open("node:Person").await.unwrap(); let dataset = snapshot.open("node:Person").await.unwrap();
let after_index_count = dataset.load_indices().await.unwrap().len(); let after_index_count = dataset.load_indices().await.unwrap().len();
assert!(after_index_count > before_index_count); assert_eq!(
after_index_count, before_index_count,
"schema apply records @index intent but defers the physical build (iss-848)"
);
} }
#[tokio::test] #[tokio::test]
@ -315,7 +431,7 @@ async fn schema_apply_route_rejects_unsupported_plan() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -356,7 +472,7 @@ async fn schema_apply_route_rejects_when_non_main_branch_exists() {
let request = Request::builder() let request = Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -385,7 +501,7 @@ async fn schema_drift_returns_conflict_for_snapshot_read_and_change() {
let (snapshot_status, snapshot_body) = json_response( let (snapshot_status, snapshot_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/snapshot?branch=main") .uri(g("/snapshot?branch=main"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -413,7 +529,7 @@ async fn schema_drift_returns_conflict_for_snapshot_read_and_change() {
let (read_status, read_body) = json_response( let (read_status, read_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/read") .uri(g("/read"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&read).unwrap())) .body(Body::from(serde_json::to_vec(&read).unwrap()))
@ -441,7 +557,7 @@ async fn schema_drift_returns_conflict_for_snapshot_read_and_change() {
let (change_status, change_body) = json_response( let (change_status, change_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(serde_json::to_vec(&change).unwrap())) .body(Body::from(serde_json::to_vec(&change).unwrap()))
@ -467,7 +583,7 @@ async fn schema_route_returns_current_source() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/schema") .uri(g("/schema"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -486,7 +602,7 @@ async fn schema_route_requires_bearer_token_when_auth_configured() {
let (missing_status, missing_body) = json_response( let (missing_status, missing_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/schema") .uri(g("/schema"))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -502,7 +618,7 @@ async fn schema_route_requires_bearer_token_when_auth_configured() {
let (ok_status, ok_body) = json_response( let (ok_status, ok_body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/schema") .uri(g("/schema"))
.method(Method::GET) .method(Method::GET)
.header("authorization", "Bearer demo-token") .header("authorization", "Bearer demo-token")
.body(Body::empty()) .body(Body::empty())
@ -533,7 +649,7 @@ async fn schema_route_denied_when_actor_lacks_read_permission() {
let (status, body) = json_response( let (status, body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/schema") .uri(g("/schema"))
.method(Method::GET) .method(Method::GET)
.header("authorization", "Bearer team-token") .header("authorization", "Bearer team-token")
.body(Body::empty()) .body(Body::empty())
@ -574,7 +690,7 @@ async fn schema_apply_route_soft_drops_property_via_http() {
&app, &app,
Request::builder() Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -631,7 +747,7 @@ async fn schema_apply_route_soft_drops_node_type_via_http() {
&app, &app,
Request::builder() Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -683,7 +799,7 @@ async fn schema_apply_route_hard_drops_property_with_allow_data_loss() {
&app, &app,
Request::builder() Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -738,7 +854,7 @@ async fn schema_apply_route_keeps_drops_soft_without_flag() {
&app, &app,
Request::builder() Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -770,29 +886,27 @@ async fn schema_apply_route_additive_property_preserves_existing_rows() {
// AddProperty wasn't pinned with a row-count check anywhere. // AddProperty wasn't pinned with a row-count check anywhere.
// Load N rows, apply schema adding nullable property, verify // Load N rows, apply schema adding nullable property, verify
// every row is still readable and the new column is null. // every row is still readable and the new column is null.
let (temp, app) = app_for_graph_with_auth_tokens_and_policy( let (temp, app) = app_for_loaded_graph_with_auth_tokens_and_policy(
&fs::read_to_string(fixture("test.pg")).unwrap(),
&[("act-ragnor", "admin-token")], &[("act-ragnor", "admin-token")],
SCHEMA_APPLY_POLICY_YAML, SCHEMA_APPLY_POLICY_YAML,
) )
.await; .await;
let graph = graph_path(temp.path()); let graph = graph_path(temp.path());
// Standard fixture data: 4 Persons + 1 Company. Load it. // Standard fixture data is loaded before the app is built, so the server
// handle applies schema from the same manifest it is serving.
let pre_count = { let pre_count = {
let db = Omnigraph::open(graph.to_str().unwrap()).await.unwrap(); let db = Omnigraph::open(graph.to_str().unwrap()).await.unwrap();
db.load(
"main",
&fs::read_to_string(fixture("test.jsonl")).unwrap(),
LoadMode::Append,
)
.await
.unwrap();
let snap = db let snap = db
.snapshot_of(omnigraph::db::ReadTarget::branch("main")) .snapshot_of(omnigraph::db::ReadTarget::branch("main"))
.await .await
.unwrap(); .unwrap();
snap.entry("node:Person").expect("Person").row_count snap.open("node:Person")
.await
.expect("Person")
.count_rows(None)
.await
.unwrap()
}; };
assert!(pre_count > 0, "fixture should have loaded Person rows"); assert!(pre_count > 0, "fixture should have loaded Person rows");
@ -800,7 +914,7 @@ async fn schema_apply_route_additive_property_preserves_existing_rows() {
&app, &app,
Request::builder() Request::builder()
.method(Method::POST) .method(Method::POST)
.uri("/schema/apply") .uri(g("/schema/apply"))
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", "Bearer admin-token") .header("authorization", "Bearer admin-token")
.body(Body::from( .body(Body::from(
@ -822,7 +936,13 @@ async fn schema_apply_route_additive_property_preserves_existing_rows() {
.snapshot_of(omnigraph::db::ReadTarget::branch("main")) .snapshot_of(omnigraph::db::ReadTarget::branch("main"))
.await .await
.unwrap(); .unwrap();
let post_count = snap.entry("node:Person").expect("Person").row_count; let post_count = snap
.open("node:Person")
.await
.expect("Person")
.count_rows(None)
.await
.unwrap();
assert_eq!( assert_eq!(
post_count, pre_count, post_count, pre_count,
"AddProperty should preserve row count", "AddProperty should preserve row count",

@ -82,6 +82,58 @@ async fn invoke_stored_read_returns_rows() {
assert!(body["rows"].is_array(), "read envelope shape; body: {body}"); assert!(body["rows"].is_array(), "read envelope shape; body: {body}");
} }
#[tokio::test(flavor = "multi_thread")]
async fn invoke_with_mismatched_expected_kind_is_rejected() {
// RFC-011 D3: the CLI verb asserts the stored query's kind via
// `expect_mutation`. Invoking a read with `expect_mutation: true`
// (i.e. `omnigraph mutate <a-read>`) is a 400 naming the right verb.
let (_temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, false)],
&[("act-invoke", "t-invoke")],
INVOKE_POLICY_YAML,
)
.await;
let (status, body) = json_response(
&app,
invoke_request(
"find_person",
"t-invoke",
json!({ "expect_mutation": true, "params": { "name": "Alice" } }),
),
)
.await;
assert_eq!(status, StatusCode::BAD_REQUEST, "body: {body}");
assert!(
body["error"]
.as_str()
.unwrap_or_default()
.contains("'find_person' is a read — use omnigraph query find_person"),
"expected a kind-mismatch error; body: {body}"
);
}
#[tokio::test(flavor = "multi_thread")]
async fn invoke_with_matching_expected_kind_runs() {
// The matching assertion (`omnigraph query <a-read>`) passes through.
let (_temp, app) = app_with_stored_queries(
&[("find_person", FIND_PERSON_GQ, false)],
&[("act-invoke", "t-invoke")],
INVOKE_POLICY_YAML,
)
.await;
let (status, body) = json_response(
&app,
invoke_request(
"find_person",
"t-invoke",
json!({ "expect_mutation": false, "params": { "name": "Alice" } }),
),
)
.await;
assert_eq!(status, StatusCode::OK, "matching kind should run; body: {body}");
assert_eq!(body["query_name"], "find_person");
}
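
The two tests above pin a kind assertion: a request may carry `expect_mutation`, and a mismatch with the stored query's kind is a 400 that names the right verb. A minimal sketch of that check, assuming a hypothetical `QueryKind` enum and `check_expected_kind` function (neither appears in the diff). The read-side message is copied from the test. The mutation-side wording and the "absent assertion passes" rule are assumptions.

```rust
/// Hypothetical stand-in for a stored query's kind; not the server's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QueryKind {
    Read,
    Mutation,
}

/// Compare the caller's optional `expect_mutation` assertion with the stored kind.
/// `None` means the caller asserted nothing, so any kind passes (assumption).
pub fn check_expected_kind(
    name: &str,
    stored: QueryKind,
    expect_mutation: Option<bool>,
) -> Result<(), String> {
    match (stored, expect_mutation) {
        // Wording taken from the test's expected substring.
        (QueryKind::Read, Some(true)) => Err(format!(
            "'{name}' is a read — use omnigraph query {name}"
        )),
        // Assumed wording, mirrored from the read-side message.
        (QueryKind::Mutation, Some(false)) => Err(format!(
            "'{name}' is a mutation — use omnigraph mutate {name}"
        )),
        _ => Ok(()),
    }
}
```

A handler would presumably map the `Err` to `StatusCode::BAD_REQUEST` and put the string in the `error` field of the JSON body, which is the shape the first test asserts on.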
#[tokio::test(flavor = "multi_thread")] #[tokio::test(flavor = "multi_thread")]
async fn invoke_stored_read_accepts_absent_or_empty_body() { async fn invoke_stored_read_accepts_absent_or_empty_body() {
let no_param_query = "query list_people() { match { $p: Person } return { $p.name } }"; let no_param_query = "query list_people() { match { $p: Person } return { $p.name } }";
@ -272,7 +324,7 @@ async fn list_queries_returns_only_exposed_with_typed_params() {
INVOKE_POLICY_YAML, INVOKE_POLICY_YAML,
) )
.await; .await;
let (status, body) = json_response(&app, get_request("/queries", "t-invoke")).await; let (status, body) = json_response(&app, get_request(&g("/queries"), "t-invoke")).await;
assert_eq!(status, StatusCode::OK, "body: {body}"); assert_eq!(status, StatusCode::OK, "body: {body}");
let entries = body["queries"].as_array().unwrap(); let entries = body["queries"].as_array().unwrap();
@ -303,7 +355,7 @@ async fn list_queries_is_read_gated_so_a_non_invoker_can_list() {
INVOKE_POLICY_YAML, INVOKE_POLICY_YAML,
) )
.await; .await;
let (status, body) = json_response(&app, get_request("/queries", "t-noinvoke")).await; let (status, body) = json_response(&app, get_request(&g("/queries"), "t-noinvoke")).await;
assert_eq!(status, StatusCode::OK, "read-gated catalog; body: {body}"); assert_eq!(status, StatusCode::OK, "read-gated catalog; body: {body}");
let names: Vec<&str> = body["queries"] let names: Vec<&str> = body["queries"]
.as_array() .as_array()
@ -320,7 +372,7 @@ async fn list_queries_is_read_gated_so_a_non_invoker_can_list() {
#[tokio::test(flavor = "multi_thread")] #[tokio::test(flavor = "multi_thread")]
async fn list_queries_is_empty_when_no_registry() { async fn list_queries_is_empty_when_no_registry() {
let (_temp, app) = app_for_loaded_graph_with_auth("demo-token").await; let (_temp, app) = app_for_loaded_graph_with_auth("demo-token").await;
let (status, body) = json_response(&app, get_request("/queries", "demo-token")).await; let (status, body) = json_response(&app, get_request(&g("/queries"), "demo-token")).await;
assert_eq!(status, StatusCode::OK, "body: {body}"); assert_eq!(status, StatusCode::OK, "body: {body}");
assert!( assert!(
body["queries"].as_array().unwrap().is_empty(), body["queries"].as_array().unwrap().is_empty(),

@ -248,9 +248,17 @@ rules:
pub const FIND_PERSON_GQ: &str = pub const FIND_PERSON_GQ: &str =
"query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }"; "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }";
/// RFC-011 cluster-only: the single-graph convenience apps built by the
/// `app_for_loaded_graph*` helpers serve the graph under the reserved id
/// `default`. This prefixes a flat per-graph path (e.g. `/snapshot`) with
/// the cluster route prefix so tests address `/graphs/default/snapshot`.
pub fn g(path: &str) -> String {
format!("/graphs/default{path}")
}
pub fn invoke_request(name: &str, token: &str, body: Value) -> Request<Body> { pub fn invoke_request(name: &str, token: &str, body: Value) -> Request<Body> {
Request::builder() Request::builder()
.uri(format!("/queries/{name}")) .uri(g(&format!("/queries/{name}")))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.header("authorization", format!("Bearer {token}")) .header("authorization", format!("Bearer {token}"))
@ -265,7 +273,7 @@ pub fn invoke_request_bytes(
content_type: Option<&str>, content_type: Option<&str>,
) -> Request<Body> { ) -> Request<Body> {
let mut builder = Request::builder() let mut builder = Request::builder()
.uri(format!("/queries/{name}")) .uri(g(&format!("/queries/{name}")))
.method(Method::POST) .method(Method::POST)
.header("authorization", format!("Bearer {token}")); .header("authorization", format!("Bearer {token}"));
if let Some(content_type) = content_type { if let Some(content_type) = content_type {
@ -656,7 +664,7 @@ pub mod matrix {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/branches") .uri(g("/branches"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -686,7 +694,7 @@ pub mod matrix {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -728,7 +736,7 @@ pub mod matrix {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri(format!("/snapshot?branch={}", branch)) .uri(g(&format!("/snapshot?branch={}", branch)))
.method(Method::GET) .method(Method::GET)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -766,7 +774,7 @@ pub mod matrix {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/read") .uri(g("/read"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -833,7 +841,7 @@ pub mod matrix {
.clone() .clone()
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -874,7 +882,7 @@ pub mod matrix {
let response = app let response = app
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/branches/merge") .uri(g("/branches/merge"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -910,7 +918,7 @@ pub mod matrix {
let response = app let response = app
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -943,7 +951,7 @@ pub mod matrix {
let response = app let response = app
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri("/branches") .uri(g("/branches"))
.method(Method::POST) .method(Method::POST)
.header("content-type", "application/json") .header("content-type", "application/json")
.body(Body::from(body)) .body(Body::from(body))
@ -970,7 +978,7 @@ pub mod matrix {
let response = app let response = app
.oneshot( .oneshot(
Request::builder() Request::builder()
.uri(format!("/branches/{}", name)) .uri(g(&format!("/branches/{}", name)))
.method(Method::DELETE) .method(Method::DELETE)
.body(Body::empty()) .body(Body::empty())
.unwrap(), .unwrap(),
@ -1091,7 +1099,7 @@ pub async fn http_change_decision(
let (status, _body) = json_response( let (status, _body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/change") .uri(g("/change"))
.method(Method::POST) .method(Method::POST)
.header(AUTHORIZATION, format!("Bearer {token}")) .header(AUTHORIZATION, format!("Bearer {token}"))
.header("content-type", "application/json") .header("content-type", "application/json")
@ -1141,7 +1149,7 @@ pub async fn http_merge_decision(
let (status, _body) = json_response( let (status, _body) = json_response(
&app, &app,
Request::builder() Request::builder()
.uri("/branches/merge") .uri(g("/branches/merge"))
.method(Method::POST) .method(Method::POST)
.header(AUTHORIZATION, format!("Bearer {token}")) .header(AUTHORIZATION, format!("Bearer {token}"))
.header("content-type", "application/json") .header("content-type", "application/json")
@ -1191,5 +1199,5 @@ graphs:
} }
pub async fn cluster_settings(dir: &Path) -> color_eyre::eyre::Result<omnigraph_server::ServerConfig> { pub async fn cluster_settings(dir: &Path) -> color_eyre::eyre::Result<omnigraph_server::ServerConfig> {
omnigraph_server::load_server_settings(None, Some(&dir.to_path_buf()), None, None, None, true).await omnigraph_server::load_server_settings(Some(&dir.to_path_buf()), None, true).await
} }
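
For reviewers skimming the many `.uri(g(...))` changes: the helper only prepends the cluster route prefix, so query strings and path parameters pass through untouched. The sketch below repeats `g` as it appears in this diff and adds `graph_route`, a hypothetical generalization that is not part of the change, to show what a non-default graph id would look like.

```rust
/// Same shape as the `g` helper in the diff: prefix a flat per-graph path
/// with the cluster route prefix for the reserved graph id `default`.
pub fn g(path: &str) -> String {
    format!("/graphs/default{path}")
}

/// Hypothetical generalization (not in the diff): address an arbitrary graph id.
pub fn graph_route(graph_id: &str, path: &str) -> String {
    format!("/graphs/{graph_id}{path}")
}
```

Because `g` returns a `String`, call sites that previously passed `format!(...)` now pass `g(&format!(...))`, and helpers that take `&str` (such as `get_request`) receive `&g("/queries")`.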

@ -34,10 +34,10 @@ pub(crate) use namespace::open_table_head_for_write;
use namespace::{branch_manifest_namespace, staged_table_namespace}; use namespace::{branch_manifest_namespace, staged_table_namespace};
use publisher::{GraphNamespacePublisher, ManifestBatchPublisher}; use publisher::{GraphNamespacePublisher, ManifestBatchPublisher};
pub(crate) use recovery::{ pub(crate) use recovery::{
RecoveryMode, RecoverySidecar, RecoverySidecarHandle, SidecarKind, SidecarTablePin, RecoveryMode, RecoverySidecarHandle, SidecarKind, SidecarTablePin, SidecarTableRegistration,
SidecarTableRegistration, SidecarTombstone, delete_sidecar, has_schema_apply_sidecar, SidecarTombstone, delete_sidecar, has_schema_apply_sidecar, heal_pending_sidecars_roll_forward,
heal_pending_sidecars_roll_forward, list_sidecars, new_sidecar, recover_manifest_drift, list_sidecars, new_sidecar, recover_manifest_drift, schema_apply_serial_queue_key,
schema_apply_serial_queue_key, write_sidecar, write_sidecar,
}; };
pub use state::SubTableEntry; pub use state::SubTableEntry;
#[cfg(test)] #[cfg(test)]

@ -793,10 +793,10 @@ pub(crate) fn schema_apply_serial_queue_key() -> crate::db::write_queue::TableQu
/// same table append extra Lance restore commits which `omnigraph /// same table append extra Lance restore commits which `omnigraph
/// cleanup` reclaims. /// cleanup` reclaims.
/// ///
/// Concurrency: today recovery runs synchronously in `Omnigraph::open` /// Concurrency: the open-time sweep runs synchronously in `Omnigraph::open`
/// *before* the engine is wrapped in the server's `Arc<RwLock<Omnigraph>>`. /// before the engine handle is published to any caller, so no request
/// No request handlers can race, so this sweep does NOT acquire write /// handler can race it and it does NOT acquire write queues. In-process
/// queues. In-process callers (refresh, write entry points) must use /// callers (refresh, write entry points) must use
/// [`heal_pending_sidecars_roll_forward`] instead, which serializes /// [`heal_pending_sidecars_roll_forward`] instead, which serializes
/// against live writers via per-(table_key, branch) queue acquisition. /// against live writers via per-(table_key, branch) queue acquisition.
pub(crate) async fn recover_manifest_drift( pub(crate) async fn recover_manifest_drift(

@ -11,9 +11,9 @@ pub use graph_coordinator::{GraphCoordinator, ReadTarget, ResolvedTarget, Snapsh
pub use manifest::{Snapshot, SubTableEntry, SubTableUpdate}; pub use manifest::{Snapshot, SubTableEntry, SubTableUpdate};
pub(crate) use omnigraph::ensure_public_branch_ref; pub(crate) use omnigraph::ensure_public_branch_ref;
pub use omnigraph::{ pub use omnigraph::{
CleanupPolicyOptions, InitOptions, MergeOutcome, Omnigraph, OpenMode, RepairAction, CleanupPolicyOptions, InitOptions, MergeOutcome, Omnigraph, OpenMode, PendingIndex,
RepairClassification, RepairOptions, RepairStats, SchemaApplyOptions, SchemaApplyResult, RepairAction, RepairClassification, RepairOptions, RepairStats, SchemaApplyOptions,
SkipReason, TableCleanupStats, TableOptimizeStats, TableRepairStats, SchemaApplyResult, SkipReason, TableCleanupStats, TableOptimizeStats, TableRepairStats,
}; };
pub(crate) const SCHEMA_APPLY_LOCK_BRANCH: &str = "__schema_apply_lock__"; pub(crate) const SCHEMA_APPLY_LOCK_BRANCH: &str = "__schema_apply_lock__";

@ -40,6 +40,7 @@ pub use repair::{
RepairAction, RepairClassification, RepairOptions, RepairStats, TableRepairStats, RepairAction, RepairClassification, RepairOptions, RepairStats, TableRepairStats,
}; };
pub use schema_apply::SchemaApplyOptions; pub use schema_apply::SchemaApplyOptions;
pub use table_ops::PendingIndex;
use super::commit_graph::GraphCommit; use super::commit_graph::GraphCommit;
use super::manifest::{ use super::manifest::{
@ -113,10 +114,11 @@ pub struct Omnigraph {
/// Read-heavy on schema introspection paths, written only by /// Read-heavy on schema introspection paths, written only by
/// `apply_schema`. Same ArcSwap rationale as `catalog`. /// `apply_schema`. Same ArcSwap rationale as `catalog`.
schema_source: Arc<ArcSwap<String>>, schema_source: Arc<ArcSwap<String>>,
/// Per-`(table_key, branch)` writer queues. Reachable from engine /// Per-`(table_key, branch)` writer queues — the engine's
/// internals (mutation finalize, schema_apply, branch_merge, /// write-serialization mechanism (the server holds the engine as a
/// ensure_indices, delete_where) and from future MR-870 recovery /// lockless `Arc<Omnigraph>`). Reachable from engine internals
/// reconciler. PR 1b adds the field; callers acquire in commits 4+. /// (mutation finalize, schema_apply, branch_merge, ensure_indices,
/// delete_where, the fork path, recovery reconciler).
write_queue: Arc<crate::db::write_queue::WriteQueueManager>, write_queue: Arc<crate::db::write_queue::WriteQueueManager>,
/// Process-wide mutex held across the swap → operate → restore window /// Process-wide mutex held across the swap → operate → restore window
/// in `branch_merge_impl`. Two concurrent merges with distinct targets /// in `branch_merge_impl`. Two concurrent merges with distinct targets
@ -1107,11 +1109,15 @@ impl Omnigraph {
/// unbranched subtables keep inheriting `main`, while subtables inherited /// unbranched subtables keep inheriting `main`, while subtables inherited
/// from an ancestor branch are first forked into the active branch before /// from an ancestor branch are first forked into the active branch before
/// their index metadata is updated. /// their index metadata is updated.
pub async fn ensure_indices(&self) -> Result<()> { /// Returns the declared indexes that could not be materialized on this
/// pass (today: vector columns with no trainable vectors yet). They are
/// deferred, not errors; a later `ensure_indices`/`optimize` builds them
/// once the column is trainable. Reads stay correct (brute-force) meanwhile.
pub async fn ensure_indices(&self) -> Result<Vec<PendingIndex>> {
table_ops::ensure_indices(self).await table_ops::ensure_indices(self).await
} }
pub async fn ensure_indices_on(&self, branch: &str) -> Result<()> { pub async fn ensure_indices_on(&self, branch: &str) -> Result<Vec<PendingIndex>> {
table_ops::ensure_indices_on(self, branch).await table_ops::ensure_indices_on(self, branch).await
} }
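
The signature change from `Result<()>` to `Result<Vec<PendingIndex>>` means callers now decide what to do with deferred indexes instead of treating them as errors. A rough sketch of a caller-side report, assuming a stand-in `PendingIndex` with `table_key`, `column`, and `reason` fields. The diff only confirms that a pending index carries a `reason`; the other fields and `describe_pending` are hypothetical.

```rust
/// Stand-in for the engine's `PendingIndex`; the field set is an assumption.
#[derive(Debug, Clone, PartialEq)]
pub struct PendingIndex {
    pub table_key: String,
    pub column: String,
    pub reason: String,
}

/// Turn the deferred indexes into operator-facing lines.
/// An empty slice (the common path) yields no lines.
pub fn describe_pending(pending: &[PendingIndex]) -> Vec<String> {
    pending
        .iter()
        .map(|p| format!("{}.{}: index deferred ({})", p.table_key, p.column, p.reason))
        .collect()
}
```

The design point is that a non-empty list is informational: reads stay correct without the index, and a later `ensure_indices` or `optimize` pass retries the build.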
@ -1517,6 +1523,13 @@ impl Omnigraph {
table_ops::open_for_mutation_on_branch(self, branch, table_key, op_kind).await table_ops::open_for_mutation_on_branch(self, branch, table_key, op_kind).await
} }
/// Fork `table_key` onto `active_branch` from the given source state,
/// self-healing a manifest-unreferenced leftover fork if one is in the
/// way. Callers that reach this MUST already hold the per-`(table_key,
/// active_branch)` write queue (so the reclaim cannot race an in-process
/// fork) and must have confirmed via the live manifest that the table is
/// not yet on `active_branch`. Both the first-write fork path
/// (`open_owned_dataset_for_branch_write`) and `branch_merge` satisfy this.
pub(crate) async fn fork_dataset_from_entry_state( pub(crate) async fn fork_dataset_from_entry_state(
&self, &self,
table_key: &str, table_key: &str,
@ -1525,7 +1538,7 @@ impl Omnigraph {
source_version: u64, source_version: u64,
active_branch: &str, active_branch: &str,
) -> Result<SnapshotHandle> { ) -> Result<SnapshotHandle> {
table_ops::fork_dataset_from_entry_state( match table_ops::fork_dataset_from_entry_state(
self, self,
table_key, table_key,
full_path, full_path,
@ -1533,7 +1546,21 @@ impl Omnigraph {
source_version, source_version,
active_branch, active_branch,
) )
.await .await?
{
crate::storage_layer::ForkOutcome::Created(ds) => Ok(ds),
crate::storage_layer::ForkOutcome::RefAlreadyExists => {
table_ops::reclaim_orphaned_fork_and_refork(
self,
table_key,
full_path,
source_branch,
source_version,
active_branch,
)
.await
}
}
} }
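
The new match arm turns "ref already exists" from a hard failure into a reclaim-and-retry. A toy model of that control flow, assuming an in-memory `RefStore` in place of the real storage layer. Here `ForkOutcome::Created` carries a ref name rather than a dataset handle, and `fork_with_reclaim` is a hypothetical name. As in the doc comment above, the sketch assumes the caller already holds the write queue and has confirmed through the manifest that the leftover ref is unreferenced.

```rust
use std::collections::HashSet;

/// Simplified mirror of the two outcomes matched in the diff.
pub enum ForkOutcome {
    Created(String),
    RefAlreadyExists,
}

/// Toy ref store: one entry per (table_key, branch) fork ref.
#[derive(Default)]
pub struct RefStore {
    refs: HashSet<String>,
}

impl RefStore {
    pub fn try_fork(&mut self, table_key: &str, branch: &str) -> ForkOutcome {
        let name = format!("{table_key}@{branch}");
        // `insert` returns false when the ref is already present.
        if self.refs.insert(name.clone()) {
            ForkOutcome::Created(name)
        } else {
            ForkOutcome::RefAlreadyExists
        }
    }

    pub fn drop_ref(&mut self, table_key: &str, branch: &str) -> bool {
        self.refs.remove(&format!("{table_key}@{branch}"))
    }
}

/// Fork, and if a leftover ref is in the way, drop it and fork once more.
pub fn fork_with_reclaim(
    store: &mut RefStore,
    table_key: &str,
    branch: &str,
) -> Result<String, String> {
    match store.try_fork(table_key, branch) {
        ForkOutcome::Created(name) => Ok(name),
        ForkOutcome::RefAlreadyExists => {
            store.drop_ref(table_key, branch);
            match store.try_fork(table_key, branch) {
                ForkOutcome::Created(name) => Ok(name),
                ForkOutcome::RefAlreadyExists => Err(format!(
                    "fork ref for {table_key} on {branch} reappeared during reclaim"
                )),
            }
        }
    }
}
```

The single retry is deliberate in this sketch: under the stated precondition nothing else can recreate the ref, so a second collision signals a broken invariant rather than something to loop on.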
pub(crate) async fn reopen_for_mutation( pub(crate) async fn reopen_for_mutation(
@ -1568,19 +1595,10 @@ impl Omnigraph {
&self, &self,
table_key: &str, table_key: &str,
ds: &mut SnapshotHandle, ds: &mut SnapshotHandle,
) -> Result<()> { ) -> Result<Vec<PendingIndex>> {
table_ops::build_indices_on_dataset(self, table_key, ds).await table_ops::build_indices_on_dataset(self, table_key, ds).await
} }
pub(crate) async fn build_indices_on_dataset_for_catalog(
&self,
catalog: &Catalog,
table_key: &str,
ds: &mut SnapshotHandle,
) -> Result<()> {
table_ops::build_indices_on_dataset_for_catalog(self, catalog, table_key, ds).await
}
// Used only by in-tree tests (`#[cfg(test)]`); the runtime path now // Used only by in-tree tests (`#[cfg(test)]`); the runtime path now
// uses `commit_updates_on_branch_with_expected` exclusively. // uses `commit_updates_on_branch_with_expected` exclusively.
#[cfg(test)] #[cfg(test)]
@ -2536,25 +2554,49 @@ edge WorksAt: Person -> Company
} }
#[tokio::test] #[tokio::test]
async fn test_apply_schema_adds_index_for_existing_property() { async fn test_apply_schema_defers_index_then_reconciler_builds_it() {
// iss-848: schema apply records the @index intent but builds nothing
// inline; a later ensure_indices materializes it once the table has
// rows. (Use `age`, which is unindexed in TEST_SCHEMA — `name @key` is
// already FTS-indexed at seed, so it can't show the deferral.)
let dir = tempfile::tempdir().unwrap(); let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap(); let uri = dir.path().to_str().unwrap();
let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap(); let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap();
seed_person_row(&mut db, "Alice", Some(30)).await;
let desired = TEST_SCHEMA.replace("name: String @key", "name: String @key @index"); let desired = TEST_SCHEMA.replace("age: I32?", "age: I32? @index");
db.apply_schema(&desired).await.unwrap(); db.apply_schema(&desired).await.unwrap();
// Apply built nothing — the BTREE on `age` is deferred.
let snapshot = db.snapshot().await; let snapshot = db.snapshot().await;
let ds = db let ds = db
.storage() .storage()
.open_snapshot_at_table(&snapshot, "node:Person") .open_snapshot_at_table(&snapshot, "node:Person")
.await .await
.unwrap(); .unwrap();
assert!(db.storage().has_fts_index(&ds, "name").await.unwrap()); assert!(
!db.storage().has_btree_index(&ds, "age").await.unwrap(),
"apply must not build the index inline (deferred to the reconciler)"
);
// The reconciler materializes it (Person has a row).
db.ensure_indices().await.unwrap();
let snapshot = db.snapshot().await;
let ds = db
.storage()
.open_snapshot_at_table(&snapshot, "node:Person")
.await
.unwrap();
assert!(
db.storage().has_btree_index(&ds, "age").await.unwrap(),
"ensure_indices must build the deferred index"
);
} }
#[tokio::test] #[tokio::test]
async fn test_apply_schema_rewrite_preserves_existing_indices() { async fn test_apply_schema_rewrite_defers_index_then_reconciler_restores() {
// iss-848: an AddProperty rewrite writes a new dataset version without
// rebuilding indexes inline (deferred); ensure_indices restores them.
let dir = tempfile::tempdir().unwrap(); let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap(); let uri = dir.path().to_str().unwrap();
let initial_schema = TEST_SCHEMA.replace("name: String @key", "name: String @key @index"); let initial_schema = TEST_SCHEMA.replace("name: String @key", "name: String @key @index");
@ -2567,6 +2609,8 @@ edge WorksAt: Person -> Company
); );
db.apply_schema(&desired).await.unwrap(); db.apply_schema(&desired).await.unwrap();
// After the rewrite the reconciler restores index coverage.
db.ensure_indices().await.unwrap();
let snapshot = db.snapshot().await; let snapshot = db.snapshot().await;
let ds = db let ds = db
.storage() .storage()

@ -140,6 +140,12 @@ pub struct TableOptimizeStats {
/// Lance HEAD version observed by optimize for drift skips. `None` for /// Lance HEAD version observed by optimize for drift skips. `None` for
/// normal compaction/no-op/blob skips. /// normal compaction/no-op/blob skips.
pub lance_head_version: Option<u64>, pub lance_head_version: Option<u64>,
/// Declared `@index` columns on this table the reconciler could not build
/// this run, each with the `reason` (today: a vector column with no
/// trainable vectors yet). Empty on the common path. Reported, not fatal — a
/// later `optimize` retries; the `list_indices`/`indisvalid` analog so
/// operators can see which index is pending and why.
pub pending_indexes: Vec<super::PendingIndex>,
} }
impl TableOptimizeStats { impl TableOptimizeStats {
@ -153,6 +159,7 @@ impl TableOptimizeStats {
skipped: None, skipped: None,
manifest_version: None, manifest_version: None,
lance_head_version: None, lance_head_version: None,
pending_indexes: Vec::new(),
} }
} }
@ -166,6 +173,7 @@ impl TableOptimizeStats {
skipped: Some(reason), skipped: Some(reason),
manifest_version: None, manifest_version: None,
lance_head_version: None, lance_head_version: None,
pending_indexes: Vec::new(),
} }
} }
@ -183,6 +191,7 @@ impl TableOptimizeStats {
skipped: Some(SkipReason::DriftNeedsRepair), skipped: Some(SkipReason::DriftNeedsRepair),
manifest_version: Some(manifest_version), manifest_version: Some(manifest_version),
lance_head_version: Some(lance_head_version), lance_head_version: Some(lance_head_version),
pending_indexes: Vec::new(),
} }
} }
} }
@ -259,9 +268,7 @@ pub async fn optimize_all_tables(db: &Omnigraph) -> Result<Vec<TableOptimizeStat
// the original row addresses on rewrite). The CSR/CSC graph topology index // the original row addresses on rewrite). The CSR/CSC graph topology index
// is rebuilt only when an edge table moved. Mirrors schema_apply's // is rebuilt only when an edge table moved. Mirrors schema_apply's
// post-publish invalidation. // post-publish invalidation.
let any_committed = stats let any_committed = stats.iter().any(|s| matches!(s, Ok(st) if st.committed));
.iter()
.any(|s| matches!(s, Ok(st) if st.committed));
let edge_committed = stats let edge_committed = stats
.iter() .iter()
.any(|s| matches!(s, Ok(st) if st.committed && st.table_key.starts_with("edge:"))); .any(|s| matches!(s, Ok(st) if st.committed && st.table_key.starts_with("edge:")));
@ -371,14 +378,26 @@ async fn optimize_one_table(
let will_compact = plan.num_tasks() > 0; let will_compact = plan.num_tasks() > 0;
// Even when there is nothing to compact, the table may still have index // Even when there is nothing to compact, the table may still have index
// work: rows appended since the index was built (e.g. via `ingest --mode // work: rows appended since the index was built (e.g. via `ingest --mode
// merge`) are scanned unindexed until folded in. Either compaction or stale // merge`) are scanned unindexed until folded in (needs_reindex), OR a
// index coverage is enough to enter the publish path. If NEITHER, this // declared `@index` was never built — schema apply records the intent but
// table is a no-op and must NOT be pinned in a sidecar — a zero-commit pin // defers the physical build (iss-848), so optimize is the operator-facing
// classifies NoMovement on recovery and forces an all-or-nothing rollback // reconciler that materializes it (needs_index_create). Any of the three is
// of sibling tables' legitimate work. Uncovered pre-existing manifest/HEAD // enough to enter the publish path. If NONE, this table is a no-op and must
// drift is skipped above and must go through explicit repair. // NOT be pinned in a sidecar — a zero-commit pin classifies NoMovement on
// recovery and forces an all-or-nothing rollback of sibling tables'
// legitimate work. Uncovered pre-existing manifest/HEAD drift is skipped
// above and goes through explicit repair, so this only runs on a healthy
// table under the per-table queue + sidecar.
let needs_reindex = TableStore::has_unindexed_fragments(&ds).await?; let needs_reindex = TableStore::has_unindexed_fragments(&ds).await?;
if !will_compact && !needs_reindex { // needs_index_work_* checks "a declared index is missing AND row_count > 0",
// so empty tables stay no-ops (never pinned). It re-reads the head under the
// queue we already hold, so it is consistent with `ds`.
let needs_index_create = if let Some(type_name) = table_key.strip_prefix("node:") {
super::table_ops::needs_index_work_node(db, type_name, &table_key, &full_path, None).await?
} else {
super::table_ops::needs_index_work_edge(db, &table_key, &full_path, None).await?
};
if !will_compact && !needs_reindex && !needs_index_create {
return Ok(TableOptimizeStats::compacted( return Ok(TableOptimizeStats::compacted(
table_key, table_key,
&CompactionMetrics::default(), &CompactionMetrics::default(),
@ -427,7 +446,30 @@ async fn optimize_one_table(
ds.optimize_indices(&OptimizeOptions::default()) ds.optimize_indices(&OptimizeOptions::default())
.await .await
.map_err(|e| OmniError::Lance(format!("optimize_indices on {}: {}", table_key, e)))?; .map_err(|e| OmniError::Lance(format!("optimize_indices on {}: {}", table_key, e)))?;
let version_after = ds.version().version;
// Materialize any declared-but-missing index over the just-compacted layout,
// reusing the build chokepoint (idempotent: skips existing indexes; fault-
// isolates an untrainable vector column into `pending` rather than failing).
// Run it UNCONDITIONALLY now that we are past the no-op gate — not only when
// `needs_index_create`. A table can enter this path for compaction or
// reindex while its sole missing index is an untrainable Vector column
// (which `needs_index_work_*` does not count as buildable work); calling the
// build here is what surfaces that column in `pending_indexes`, so optimize
// can't compact a table yet silently drop the deferred-index signal.
// Idempotent + cheap when there is nothing to build. Vector index creation
// is an inline-commit residual; the Optimize sidecar's loose post_commit_pin
// covers the extra commits.
let catalog = db.catalog();
let mut snapshot = crate::storage_layer::SnapshotHandle::new(ds);
let pending_indexes: Vec<super::PendingIndex> =
super::table_ops::build_indices_on_dataset_for_catalog(
db,
&catalog,
&table_key,
&mut snapshot,
)
.await?;
let version_after = snapshot.dataset().version().version;
let committed = version_after != version_before; let committed = version_after != version_before;
// Pin the per-writer Phase B → Phase C residual for optimize: Lance HEAD has // Pin the per-writer Phase B → Phase C residual for optimize: Lance HEAD has
@ -438,9 +480,6 @@ async fn optimize_one_table(
// expected = the version observed under the queue). On failure the sidecar // expected = the version observed under the queue). On failure the sidecar
// is intentionally left for the open-time recovery sweep to roll forward. // is intentionally left for the open-time recovery sweep to roll forward.
if committed { if committed {
// Re-wrap the post-compaction dataset to read its state through the
// trait surface (`table_state` is a read; no HEAD advance).
let snapshot = crate::storage_layer::SnapshotHandle::new(ds);
let state = db.storage().table_state(&full_path, &snapshot).await?; let state = db.storage().table_state(&full_path, &snapshot).await?;
let update = crate::db::SubTableUpdate { let update = crate::db::SubTableUpdate {
table_key: table_key.clone(), table_key: table_key.clone(),
@ -467,7 +506,9 @@ async fn optimize_one_table(
); );
} }
Ok(TableOptimizeStats::compacted(table_key, &metrics, committed)) let mut stat = TableOptimizeStats::compacted(table_key, &metrics, committed);
stat.pending_indexes = pending_indexes;
Ok(stat)
} }
/// Run Lance `cleanup_old_versions` on every node + edge table on `main`, /// Run Lance `cleanup_old_versions` on every node + edge table on `main`,
@ -599,27 +640,37 @@ pub struct BranchReconcileStats {
pub failures: Vec<(String, String)>, pub failures: Vec<(String, String)>,
} }
/// Drop every per-table and commit-graph Lance branch that the manifest no /// Drop every per-table and commit-graph Lance branch fork the manifest does
/// longer references. /// not reference.
/// ///
/// Orphaned forks arise when a `branch_delete` flips the manifest authority /// Two origins produce a manifest-unreferenced fork:
/// (atomic) but a downstream best-effort reclaim does not complete. They are /// 1. A `branch_delete` flips the manifest authority (atomic) but a
/// unreachable through any snapshot — no manifest entry can name them — yet /// downstream best-effort reclaim does not complete — the whole branch is
/// they pin their `tree/{branch}/` storage and can block reusing the branch /// gone from the manifest, but a `tree/{branch}/` ref lingers.
/// name. This is the guaranteed convergence backstop: it is idempotent and /// 2. A first-write fork (or a merge fork) creates the branch ref before the
/// derived purely from the manifest authority, so it no-ops once everything is /// manifest publish, then the writer dies / is cancelled — the branch is
/// reconciled, and it would harmlessly find nothing if a future Lance atomic /// still a live manifest branch, but the manifest's snapshot of it does
/// multi-dataset branch op prevented orphans from forming. /// not place *this table* on the branch.
/// ///
/// The keep-set is the full (unfiltered) manifest branch list, so system /// The write path self-heals (2) on the next write to the table
/// branches' forks are never reclaimed; `main`/default is not a named Lance /// (`reclaim_orphaned_fork_and_refork`); this is the guaranteed-convergence
/// branch and so is never a candidate. Referencing children are dropped before /// backstop that also covers (1) and any table the write path never revisits.
/// parents (Lance refuses to delete a referenced parent) by ordering longest ///
/// branch names first. /// The orphan test is therefore **per-table**, not per-branch-name: a Lance
/// branch `B` on table `T` is an orphan iff `B` is not a live manifest branch
/// at all (origin 1) OR the manifest's branch-`B` snapshot does not place `T`
/// on `B` (origin 2). A legitimately-forked table (`table_branch == Some(B)`)
/// is kept. `main` and internal/system branches are never candidates. Lance
/// refuses to force-delete a branch with referencing descendants, so children
/// are dropped before parents (longest name first). Idempotent and authority-
/// derived: no-ops once reconciled, and degrades to finding nothing if a future
/// Lance atomic multi-dataset branch op prevents orphans from forming.
pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result<BranchReconcileStats> { pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result<BranchReconcileStats> {
use std::collections::HashSet; use std::collections::{HashMap, HashSet};
let keep: HashSet<String> = db // Live manifest branches: the set whose per-table placements are
// authoritative. A branch absent here is a whole-branch (origin-1) orphan.
let live_branches: HashSet<String> = db
.coordinator .coordinator
.read() .read()
.await .await
@ -640,6 +691,12 @@ pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result<BranchReconci
.collect(); .collect();
let mut stats = BranchReconcileStats::default(); let mut stats = BranchReconcileStats::default();
// Per-branch snapshots are resolved once and cached across tables (few
// branches in practice); origin-2 detection consults the branch's own view.
// Failures are cached too: one branch-level read failure should not refetch
// and append duplicate per-table noise for every table that lists the ref.
let mut branch_snapshots: HashMap<String, crate::db::Snapshot> = HashMap::new();
let mut failed_branch_snapshots: HashSet<String> = HashSet::new();
// Per-table fault isolation: one table's transient failure is recorded and // Per-table fault isolation: one table's transient failure is recorded and
// logged, never aborting the rest of the sweep. // logged, never aborting the rest of the sweep.
@ -658,7 +715,104 @@ pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result<BranchReconci
continue; continue;
} }
}; };
for branch in orphan_branches(listed, &keep) {
// Decide per (table, branch) whether the fork is an orphan.
let mut orphans: Vec<String> = Vec::new();
for branch in listed {
// `main` is not a named Lance branch; system/internal branches
// (e.g. the schema-apply lock) own legitimate forks — never touch.
if branch == "main" || crate::db::is_internal_system_branch(&branch) {
continue;
}
let is_orphan = if !live_branches.contains(&branch) {
true // origin 1: whole branch gone from the manifest
} else {
// origin 2: live branch, but does the manifest place THIS
// table on it? Resolve (and cache) the branch's snapshot.
if failed_branch_snapshots.contains(&branch) {
continue;
}
if !branch_snapshots.contains_key(&branch) {
let branch_snapshot =
match crate::failpoints::maybe_fail("cleanup.resolve_branch_snapshot") {
Ok(()) => db.snapshot_for_branch(Some(&branch)).await,
Err(injected) => Err(injected),
};
match branch_snapshot {
Ok(snap) => {
branch_snapshots.insert(branch.clone(), snap);
}
Err(err) => {
tracing::warn!(
target: "omnigraph::cleanup",
table = %table_key,
branch = %branch,
error = %err,
"resolving branch snapshot failed during reconcile; skipping",
);
stats.failures.push((table_key.clone(), err.to_string()));
failed_branch_snapshots.insert(branch.clone());
continue;
}
}
}
branch_snapshots[&branch]
.entry(&table_key)
.map(|e| e.table_branch.as_deref() != Some(branch.as_str()))
.unwrap_or(true)
};
if is_orphan {
orphans.push(branch);
}
}
// Children before parents (longest name first) so Lance's referenced-
// parent RefConflict cannot block reclamation.
orphans.sort_by(|a, b| b.len().cmp(&a.len()).then_with(|| a.cmp(b)));
for branch in orphans {
// Serialize against in-process live writers before destroying a ref.
// A first-write fork holds the per-(table, branch) write queue from
// before the fork through the manifest publish; on a LIVE branch its
// in-flight fork looks exactly like an origin-2 orphan (manifest not
// yet advanced). Acquire the same queue so cleanup waits for any such
// writer, then RE-VALIDATE under the queue with a fresh read: if the
// writer published in the meantime (table now placed on the branch),
// it is no longer an orphan — skip it. (Cross-process writers remain
// the documented one-winner-CAS gap.) One key held at a time → no
// lock-order inversion against multi-table `acquire_many` writers.
let _guard = db
.write_queue()
.acquire(&(table_key.clone(), Some(branch.clone())))
.await;
// Decide under the queue from FRESH authority via the shared
// classifier (same decision the write-path reclaim uses) — never
// from the sweep-start `live_branches` capture. A branch created
// AFTER that capture is absent from the stale set yet may already
// carry a legitimately-published fork (an in-process writer held
// this queue through its fork+publish; we just waited on it), so a
// stale "origin-1 ⇒ delete" shortcut would destroy a live fork.
// Only `Orphan` is reclaimed; `Indeterminate` (transient read) is
// skipped and recorded.
match super::table_ops::classify_fork_ref(db, &table_key, &branch).await {
super::table_ops::ForkRefStatus::Orphan => {}
super::table_ops::ForkRefStatus::Legitimate => continue,
super::table_ops::ForkRefStatus::Indeterminate => {
tracing::warn!(
target: "omnigraph::cleanup",
table = %table_key,
branch = %branch,
"fresh re-check inconclusive during reconcile; skipping to avoid \
destroying a possibly-live fork (will retry next cleanup)",
);
stats.failures.push((
table_key.clone(),
format!("indeterminate fork status for {branch}"),
));
continue;
}
}
let outcome = match crate::failpoints::maybe_fail("cleanup.reconcile_fork") { let outcome = match crate::failpoints::maybe_fail("cleanup.reconcile_fork") {
Ok(()) => storage.force_delete_branch(&full_path, &branch).await, Ok(()) => storage.force_delete_branch(&full_path, &branch).await,
Err(injected) => Err(injected), Err(injected) => Err(injected),
@ -679,15 +833,17 @@ pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result<BranchReconci
} }
} }
// Commit-graph orphans (best-effort: the dataset may not exist on a graph // Commit-graph orphans are whole-branch (not per-table), so the simple
// that has never committed; any failure is isolated and retried next time). // "branch name not in the live set" test still applies there.
if let Err(err) = reconcile_commit_graph_orphans(db, &keep, &mut stats).await { if let Err(err) = reconcile_commit_graph_orphans(db, &live_branches, &mut stats).await {
tracing::warn!( tracing::warn!(
target: "omnigraph::cleanup", target: "omnigraph::cleanup",
error = %err, error = %err,
"commit-graph orphan reconcile failed; will retry next cleanup", "commit-graph orphan reconcile failed; will retry next cleanup",
); );
stats.failures.push(("_graph_commits".to_string(), err.to_string())); stats
.failures
.push(("_graph_commits".to_string(), err.to_string()));
} }
Ok(stats) Ok(stats)
@ -715,7 +871,9 @@ async fn reconcile_commit_graph_orphans(
error = %err, error = %err,
"reclaiming orphaned commit-graph branch failed; will retry next cleanup", "reclaiming orphaned commit-graph branch failed; will retry next cleanup",
); );
stats.failures.push(("_graph_commits".to_string(), err.to_string())); stats
.failures
.push(("_graph_commits".to_string(), err.to_string()));
} }
} }
} }
@ -744,3 +902,66 @@ pub(super) fn all_table_keys(catalog: &omnigraph_compiler::catalog::Catalog) ->
keys.sort(); keys.sort();
keys keys
} }
#[cfg(all(test, feature = "failpoints"))]
mod tests {
use super::*;
use crate::failpoints::ScopedFailPoint;
use crate::loader::{LoadMode, load_jsonl};
fn node_table_uri(root: &str, type_name: &str) -> String {
let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
for &b in type_name.as_bytes() {
hash ^= b as u64;
hash = hash.wrapping_mul(0x100_0000_01b3);
}
format!("{}/nodes/{hash:016x}", root.trim_end_matches('/'))
}
#[tokio::test]
async fn reconcile_caches_live_branch_snapshot_resolution_failure() {
let _scenario = fail::FailScenario::setup();
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap();
let schema = "node Person { name: String @key }\nnode Company { name: String @key }\n";
let mut db = Omnigraph::init(uri, schema).await.unwrap();
load_jsonl(
&mut db,
"{\"type\":\"Person\",\"data\":{\"name\":\"Alice\"}}\n\
{\"type\":\"Company\",\"data\":{\"name\":\"Acme\"}}",
LoadMode::Merge,
)
.await
.unwrap();
db.branch_create("feature").await.unwrap();
for type_name in ["Person", "Company"] {
let table_uri = node_table_uri(uri, type_name);
let mut ds = lance::Dataset::open(&table_uri).await.unwrap();
let base = ds.version().version;
ds.create_branch("feature", base, None).await.unwrap();
}
let _fp = ScopedFailPoint::new("cleanup.resolve_branch_snapshot", "return");
let stats = reconcile_orphaned_branches(&db).await.unwrap();
assert_eq!(
stats.failures.len(),
1,
"one live-branch snapshot resolution failure should be reported once, \
not once per table: {:?}",
stats.failures
);
assert!(
stats.failures[0]
.1
.contains("cleanup.resolve_branch_snapshot"),
"the recorded failure should be the branch-snapshot resolution failure: {:?}",
stats.failures
);
assert!(
stats.reclaimed.is_empty(),
"unreadable live-branch refs must be left for the next cleanup run"
);
}
}
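
The per-table orphan test and the children-before-parents ordering described in the `reconcile_orphaned_branches` doc comment can be sketched in isolation. This is a minimal illustration, not the crate's code: `Placements`, `is_orphan`, `reclaim_order`, and the `node:Person`-style table keys are hypothetical stand-ins for the manifest snapshot lookups, and it assumes a child branch name is always longer than its parent's.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for manifest authority: for each live branch, the
/// branch each table is placed on (`None` = the table still reads from main).
type Placements = HashMap<String, HashMap<String, Option<String>>>;

/// Per-table orphan test. A Lance ref `branch` on `table` is an orphan when
/// the branch is not live at all (origin 1), or the branch's own snapshot
/// does not place this table on it (origin 2).
fn is_orphan(live: &Placements, table: &str, branch: &str) -> bool {
    match live.get(branch) {
        // Origin 1: the whole branch is gone from the manifest.
        None => true,
        // Origin 2: live branch; orphan unless the table is placed on it.
        Some(tables) => tables
            .get(table)
            .map(|placed| placed.as_deref() != Some(branch))
            .unwrap_or(true),
    }
}

/// Children before parents: longest name first, ties broken lexicographically
/// so the order is deterministic.
fn reclaim_order(mut orphans: Vec<String>) -> Vec<String> {
    orphans.sort_by(|a, b| b.len().cmp(&a.len()).then_with(|| a.cmp(b)));
    orphans
}
```

In the real sweep this first-pass decision is only a candidate filter: each candidate is re-checked under the write queue from fresh authority before anything is deleted.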


@ -193,7 +193,6 @@ where
let mut added_tables = BTreeSet::new(); let mut added_tables = BTreeSet::new();
let mut renamed_tables = HashMap::new(); let mut renamed_tables = HashMap::new();
let mut rewritten_tables = BTreeSet::new(); let mut rewritten_tables = BTreeSet::new();
let mut indexed_tables = BTreeSet::new();
let mut dropped_tables = BTreeSet::new(); let mut dropped_tables = BTreeSet::new();
// Hard-drop cleanup targets: (table_key, full_dataset_uri). // Hard-drop cleanup targets: (table_key, full_dataset_uri).
// Populated for DropProperty { Hard } and DropType { Hard }; the // Populated for DropProperty { Hard } and DropType { Hard }; the
@ -252,14 +251,14 @@ where
.or_default() .or_default()
.insert(to.clone(), from.clone()); .insert(to.clone(), from.clone());
} }
SchemaMigrationStep::AddConstraint { // AddConstraint is only ever an `@index` addition (every other
type_kind, // added constraint plans as UnsupportedChange). It records intent
type_name, // in the desired catalog/IR; the physical index is built off the
.. // critical path by ensure_indices/optimize (iss-848), so the apply
} => { // does no table work for it — a pure metadata change like the two
indexed_tables.insert(schema_table_key(*type_kind, type_name)); // metadata steps below.
} SchemaMigrationStep::AddConstraint { .. }
SchemaMigrationStep::UpdateTypeMetadata { .. } | SchemaMigrationStep::UpdateTypeMetadata { .. }
| SchemaMigrationStep::UpdatePropertyMetadata { .. } => {} | SchemaMigrationStep::UpdatePropertyMetadata { .. } => {}
SchemaMigrationStep::DropProperty { SchemaMigrationStep::DropProperty {
type_kind, type_kind,
@ -347,18 +346,15 @@ where
let mut table_updates = HashMap::<String, crate::db::SubTableUpdate>::new(); let mut table_updates = HashMap::<String, crate::db::SubTableUpdate>::new();
let mut table_tombstones = HashMap::<String, u64>::new(); let mut table_tombstones = HashMap::<String, u64>::new();
// Recovery sidecar: protect the per-table commit_staged loop in // Recovery sidecar: protect the per-table `stage_overwrite` +
// rewritten_tables + indexed_tables. The post_commit_pin we record // `commit_staged` in rewritten_tables — the only tables that advance Lance
// here is a lower bound (expected + 1); the classifier loose-matches // HEAD inline now that index building is deferred to the reconciler
// for SidecarKind::SchemaApply because the actual N depends on how // (iss-848). Each rewritten table is exactly one commit, so
// many indices need building. See classify_table's loose-match arm. // `post_commit_pin = expected + 1` is now exact (it was a loose lower bound
// when index builds added extra commits); the classifier's loose-match for
// SidecarKind::SchemaApply still accepts it.
let recovery_pins: Vec<crate::db::manifest::SidecarTablePin> = rewritten_tables let recovery_pins: Vec<crate::db::manifest::SidecarTablePin> = rewritten_tables
.iter() .iter()
.chain(indexed_tables.iter().filter(|t| {
!rewritten_tables.contains(*t)
&& !added_tables.contains(*t)
&& !renamed_tables.contains_key(*t)
}))
.filter_map(|table_key| { .filter_map(|table_key| {
let entry = snapshot.entry(table_key)?; let entry = snapshot.entry(table_key)?;
Some(crate::db::manifest::SidecarTablePin { Some(crate::db::manifest::SidecarTablePin {
@ -432,10 +428,10 @@ where
// manifest publish via `commit_changes_with_actor` below. // manifest publish via `commit_changes_with_actor` below.
// //
// Schema-apply already holds the graph-wide `__schema_apply_lock__` // Schema-apply already holds the graph-wide `__schema_apply_lock__`
// sentinel branch, so under PR 1b's intermediate state these // sentinel branch, so these per-table acquisitions are uncontended in
// per-table acquisitions are uncontended. They exist for symmetry // practice. They exist for symmetry with the recovery reconciler, which
// with future MR-870 recovery, which will need queue acquisition // acquires the same queues before any `Dataset::restore` it issues for
// before any `Dataset::restore` it issues for SchemaApply sidecars. // SchemaApply sidecars.
let mut schema_apply_queue_keys: Vec<(String, Option<String>)> = recovery_pins let mut schema_apply_queue_keys: Vec<(String, Option<String>)> = recovery_pins
.iter() .iter()
.map(|pin| (pin.table_key.clone(), pin.table_branch.clone())) .map(|pin| (pin.table_key.clone(), pin.table_branch.clone()))
@ -490,10 +486,11 @@ where
let table_path = table_path_for_table_key(table_key)?; let table_path = table_path_for_table_key(table_key)?;
let dataset_uri = db.storage().dataset_uri(&table_path); let dataset_uri = db.storage().dataset_uri(&table_path);
let schema = schema_for_table_key(&desired_catalog, table_key)?; let schema = schema_for_table_key(&desired_catalog, table_key)?;
let mut ds = let ds =
SnapshotHandle::new(TableStore::create_empty_dataset(&dataset_uri, &schema).await?); SnapshotHandle::new(TableStore::create_empty_dataset(&dataset_uri, &schema).await?);
db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut ds) // Indexes for the new table are materialized off the critical path by
.await?; // ensure_indices/optimize (iss-848); a 0-row table is never trainable
// anyway. The @index intent is recorded in the persisted catalog/IR.
let state = db.storage().table_state(&dataset_uri, &ds).await?; let state = db.storage().table_state(&dataset_uri, &ds).await?;
table_registrations.insert(table_key.clone(), table_path); table_registrations.insert(table_key.clone(), table_path);
table_updates.insert( table_updates.insert(
@ -533,10 +530,9 @@ where
.await?; .await?;
let table_path = table_path_for_table_key(target_table_key)?; let table_path = table_path_for_table_key(target_table_key)?;
let dataset_uri = db.storage().dataset_uri(&table_path); let dataset_uri = db.storage().dataset_uri(&table_path);
let mut target_ds = let target_ds =
SnapshotHandle::new(TableStore::write_dataset(&dataset_uri, batch).await?); SnapshotHandle::new(TableStore::write_dataset(&dataset_uri, batch).await?);
db.build_indices_on_dataset_for_catalog(&desired_catalog, target_table_key, &mut target_ds) // Indexes on the renamed table are reconciled later (iss-848).
.await?;
let state = db.storage().table_state(&dataset_uri, &target_ds).await?; let state = db.storage().table_state(&dataset_uri, &target_ds).await?;
table_registrations.insert(target_table_key.clone(), table_path); table_registrations.insert(target_table_key.clone(), table_path);
table_updates.insert( table_updates.insert(
@ -593,9 +589,10 @@ where
.open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref()) .open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref())
.await?; .await?;
let staged = db.storage().stage_overwrite(&existing, batch).await?; let staged = db.storage().stage_overwrite(&existing, batch).await?;
let mut target_ds = db.storage().commit_staged(existing, staged).await?; let target_ds = db.storage().commit_staged(existing, staged).await?;
db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut target_ds) // The rewrite drops the table's existing index coverage; it is
.await?; // restored off the critical path by optimize's optimize_indices /
// ensure_indices (iss-848). Reads scan uncovered fragments meanwhile.
let state = db.storage().table_state(&dataset_uri, &target_ds).await?; let state = db.storage().table_state(&dataset_uri, &target_ds).await?;
table_updates.insert( table_updates.insert(
table_key.clone(), table_key.clone(),
@ -609,41 +606,12 @@ where
); );
} }
for table_key in &indexed_tables { // Index-only changes (AddConstraint, i.e. adding an `@index`) are pure
if added_tables.contains(table_key) // metadata: the new `@index` intent is recorded in the desired catalog/IR
|| renamed_tables.contains_key(table_key) // persisted below, and the physical index is materialized off the critical
|| rewritten_tables.contains(table_key) // path by `ensure_indices`/`optimize` (iss-848). Schema apply touches no
{ // table data for them, so there is no per-table loop here and no recovery
continue; // pin (no Lance HEAD advances). Reads stay correct meanwhile via a scan.
}
let entry = snapshot.entry(table_key).ok_or_else(|| {
OmniError::manifest(format!(
"missing table '{}' for schema index apply",
table_key
))
})?;
ensure_snapshot_entry_head_matches(db, entry).await?;
let dataset_uri = db.storage().dataset_uri(&entry.table_path);
let mut ds = db
.storage()
.open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref())
.await?;
db.storage()
.ensure_expected_version(&ds, table_key, entry.table_version)?;
db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut ds)
.await?;
let state = db.storage().table_state(&dataset_uri, &ds).await?;
table_updates.insert(
table_key.clone(),
crate::db::SubTableUpdate {
table_key: table_key.clone(),
table_version: state.version,
table_branch: None,
row_count: state.row_count,
version_metadata: state.version_metadata,
},
);
}
let mut manifest_changes = Vec::new(); let mut manifest_changes = Vec::new();
for (table_key, table_path) in table_registrations { for (table_key, table_path) in table_registrations {

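
The schema-apply change above narrows inline table work to rewritten tables and treats an `@index` addition as metadata only. A rough sketch of that planning rule follows, under stated assumptions: `Step`, `TableWork`, `plan`, and `recovery_pins` are hypothetical simplifications (the real `SchemaMigrationStep` has more variants and fields), and the pin tuple is `(table_key, expected_version, post_commit_pin)`.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, trimmed-down migration step for illustration only.
#[allow(dead_code)]
enum Step {
    AddType { table: String },
    RewriteTable { table: String },
    AddConstraint { table: String },
    UpdateTypeMetadata,
}

/// Tables that need physical work during schema apply.
#[derive(Default, Debug)]
struct TableWork {
    added: BTreeSet<String>,
    rewritten: BTreeSet<String>,
}

/// Sort steps into table work. Index additions record intent only, so they
/// contribute nothing here; the index is built later, off the critical path.
fn plan(steps: &[Step]) -> TableWork {
    let mut work = TableWork::default();
    for step in steps {
        match step {
            Step::AddType { table } => {
                work.added.insert(table.clone());
            }
            Step::RewriteTable { table } => {
                work.rewritten.insert(table.clone());
            }
            // Pure metadata: no table data is touched.
            Step::AddConstraint { .. } | Step::UpdateTypeMetadata => {}
        }
    }
    work
}

/// Only rewritten tables advance HEAD inline, one commit each, so the
/// post-commit pin is exactly `expected + 1`. Tables with no existing
/// snapshot entry are skipped.
fn recovery_pins(work: &TableWork, versions: &HashMap<String, u64>) -> Vec<(String, u64, u64)> {
    work.rewritten
        .iter()
        .filter_map(|table| {
            let expected = *versions.get(table)?;
            Some((table.clone(), expected, expected + 1))
        })
        .collect()
}
```

With this shape, a table that only gains an `@index` never appears in the pin list, which matches the comment's point that no Lance HEAD advances for it.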

@ -21,7 +21,7 @@ pub(super) async fn graph_index_for_resolved(
db.runtime_cache.graph_index(resolved, &catalog).await db.runtime_cache.graph_index(resolved, &catalog).await
} }
pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<()> { pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<Vec<PendingIndex>> {
let current_branch = db let current_branch = db
.coordinator .coordinator
.read() .read()
@ -31,7 +31,7 @@ pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<()> {
ensure_indices_for_branch(db, current_branch.as_deref()).await ensure_indices_for_branch(db, current_branch.as_deref()).await
} }
pub(super) async fn ensure_indices_on(db: &Omnigraph, branch: &str) -> Result<()> { pub(super) async fn ensure_indices_on(db: &Omnigraph, branch: &str) -> Result<Vec<PendingIndex>> {
let branch = normalize_branch_name(branch)?; let branch = normalize_branch_name(branch)?;
ensure_indices_for_branch(db, branch.as_deref()).await ensure_indices_for_branch(db, branch.as_deref()).await
} }
@ -73,12 +73,16 @@ pub(super) async fn failpoint_publish_table_head_without_index_rebuild_for_test(
.await .await
} }
pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&str>) -> Result<()> { pub(super) async fn ensure_indices_for_branch(
db: &Omnigraph,
branch: Option<&str>,
) -> Result<Vec<PendingIndex>> {
db.ensure_schema_state_valid().await?; db.ensure_schema_state_valid().await?;
db.ensure_schema_apply_idle("ensure_indices").await?; db.ensure_schema_apply_idle("ensure_indices").await?;
let resolved = db.resolved_branch_target(branch).await?; let resolved = db.resolved_branch_target(branch).await?;
let snapshot = resolved.snapshot; let snapshot = resolved.snapshot;
let mut updates = Vec::new(); let mut updates = Vec::new();
let mut pending = Vec::new();
let active_branch = resolved.branch; let active_branch = resolved.branch;
let catalog = db.catalog(); let catalog = db.catalog();
@ -160,9 +164,8 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
// that needs index work. Held across the per-table commit loop and // that needs index work. Held across the per-table commit loop and
// the manifest publish at the end of this function. Sorted-order // the manifest publish at the end of this function. Sorted-order
// acquisition prevents lock-order inversion against concurrent // acquisition prevents lock-order inversion against concurrent
// multi-table writers (mutation finalize, branch_merge, future // multi-table writers (mutation finalize, branch_merge, the fork
// MR-870 recovery). Under PR 1b's intermediate state (global server // path, recovery).
// RwLock still in place), this acquisition is uncontended.
let queue_keys: Vec<(String, Option<String>)> = recovery_pins let queue_keys: Vec<(String, Option<String>)> = recovery_pins
.iter() .iter()
.map(|pin| (pin.table_key.clone(), pin.table_branch.clone())) .map(|pin| (pin.table_key.clone(), pin.table_branch.clone()))
@ -217,7 +220,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
}; };
let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0); let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0);
if row_count > 0 { if row_count > 0 {
build_indices_on_dataset(db, &table_key, &mut ds).await?; pending.extend(build_indices_on_dataset(db, &table_key, &mut ds).await?);
} }
let state = db.storage().table_state(&full_path, &ds).await?; let state = db.storage().table_state(&full_path, &ds).await?;
@ -265,7 +268,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
}; };
let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0); let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0);
if row_count > 0 { if row_count > 0 {
build_indices_on_dataset(db, &table_key, &mut ds).await?; pending.extend(build_indices_on_dataset(db, &table_key, &mut ds).await?);
} }
let state = db.storage().table_state(&full_path, &ds).await?; let state = db.storage().table_state(&full_path, &ds).await?;
@ -307,7 +310,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st
} }
} }
Ok(()) Ok(pending)
} }
/// The single scalar/vector index a node property receives from a one-column /// The single scalar/vector index a node property receives from a one-column
@ -352,6 +355,26 @@ fn node_prop_index_kind(prop_type: &PropType) -> Option<NodePropIndexKind> {
} }
} }
/// Whether a vector column currently has at least one non-null vector — the
/// minimum for Lance IVF k-means to train (the `ivf_flat(1)` index we build
/// needs >=1 vector). Used identically by `needs_index_work_node` (so an
/// untrainable column is not pinned for recovery — avoiding a zero-commit pin
/// that would roll back a sibling's index work) and by the vector build arm (so
/// `create_vector_index` is only attempted when it can succeed, keeping its
/// genuine errors fatal instead of swallowed as pending). If index params
/// become size-aware (dev-graph iss-687), this threshold moves with them.
async fn vector_column_trainable(
db: &Omnigraph,
ds: &SnapshotHandle,
column: &str,
) -> Result<bool> {
Ok(db
.storage()
.count_rows(ds, Some(format!("{column} IS NOT NULL")))
.await?
> 0)
}
/// Returns true if the node table is missing at least one declared /// Returns true if the node table is missing at least one declared
/// scalar/vector index that `build_indices_on_dataset_for_catalog` would /// scalar/vector index that `build_indices_on_dataset_for_catalog` would
/// build AND has at least one row (the ensure_indices loop has /// build AND has at least one row (the ensure_indices loop has
@ -366,7 +389,7 @@ fn node_prop_index_kind(prop_type: &PropType) -> Option<NodePropIndexKind> {
/// (DateTime/Date/numeric/Bool), FTS for free-text Strings, or a Vector index. /// (DateTime/Date/numeric/Bool), FTS for free-text Strings, or a Vector index.
/// Edges get BTree only (id, src, dst). This helper and the builder share /// Edges get BTree only (id, src, dst). This helper and the builder share
/// `node_prop_index_kind` so they cannot drift — see its doc comment. /// `node_prop_index_kind` so they cannot drift — see its doc comment.
async fn needs_index_work_node( pub(super) async fn needs_index_work_node(
db: &Omnigraph, db: &Omnigraph,
type_name: &str, type_name: &str,
table_key: &str, table_key: &str,
@ -409,7 +432,14 @@ async fn needs_index_work_node(
} }
} }
Some(NodePropIndexKind::Vector) => { Some(NodePropIndexKind::Vector) => {
if !db.storage().has_vector_index(&ds, prop_name).await? { // Only count a missing vector index as buildable *work* when the
// column is trainable (>=1 non-null vector). An untrainable
// column would defer in the build and commit nothing; pinning it
// for recovery would be a zero-commit pin that classifies
// NoMovement and rolls back a sibling table's index work.
if !db.storage().has_vector_index(&ds, prop_name).await?
&& vector_column_trainable(db, &ds, prop_name).await?
{
return Ok(true); return Ok(true);
} }
} }
@ -434,7 +464,7 @@ async fn needs_index_work_node(
/// ///
/// Empty edge tables are skipped by the ensure_indices loop the same /// Empty edge tables are skipped by the ensure_indices loop the same
/// way node tables are; see `needs_index_work_node`. /// way node tables are; see `needs_index_work_node`.
async fn needs_index_work_edge( pub(super) async fn needs_index_work_edge(
db: &Omnigraph, db: &Omnigraph,
table_key: &str, table_key: &str,
full_path: &str, full_path: &str,
@ -551,8 +581,14 @@ pub(super) async fn open_owned_dataset_for_branch_write(
)); ));
} }
} }
fork_dataset_from_entry_state( // The fork advances Lance state before the manifest publish. The
db, // caller holds the per-(table, active_branch) write queue from
// before this fork through the publish, so a leftover ref is a
// manifest-unreferenced fork (interrupted prior fork, or
// delete+recreate), not a live in-process fork. The wrapper
// self-heals it (reclaim + re-fork); see
// `Omnigraph::fork_dataset_from_entry_state`.
db.fork_dataset_from_entry_state(
table_key, table_key,
full_path, full_path,
source_branch, source_branch,
@ -580,7 +616,7 @@ pub(super) async fn fork_dataset_from_entry_state(
source_branch: Option<&str>, source_branch: Option<&str>,
source_version: u64, source_version: u64,
active_branch: &str, active_branch: &str,
) -> Result<SnapshotHandle> { ) -> Result<crate::storage_layer::ForkOutcome<SnapshotHandle>> {
db.storage() db.storage()
.fork_branch_from_state( .fork_branch_from_state(
full_path, full_path,
@ -592,6 +628,172 @@ pub(super) async fn fork_dataset_from_entry_state(
.await .await
} }
/// Classification of a Lance branch ref `B` on table `T` against FRESH manifest
/// authority — the single decision both fork-ref reclaim sites share: the
/// write-path reclaim ([`reclaim_orphaned_fork_and_refork`]) and the cleanup
/// reconciler (`optimize::reconcile_orphaned_branches`). Having one classifier
/// keeps the two destructive sites from drifting (the bug history: each was
/// hardened separately and the other lagged).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub(crate) enum ForkRefStatus {
/// The manifest places `T` on `B` — a legitimate fork. Never destroy.
Legitimate,
/// The manifest does not reference this fork (`T` not on `B`, or `B` absent
/// from the manifest entirely). Reclaimable.
Orphan,
/// Fresh authority could not be established (a transient read failure on a
/// live branch). Ambiguous — do not destroy; the caller retries / converges.
Indeterminate,
}
/// Classify a fork ref from FRESH manifest authority (bypasses the coordinator
/// cache). MUST be called with the per-`(table, branch)` write queue held, so
/// the classification is stable against in-process writers for the caller's
/// critical section. Both reclaim sites map the result to their own action
/// (write path: reclaim vs retryable; cleanup: delete vs skip), but the
/// destroy-only-on-`Orphan` rule is enforced here, once.
pub(crate) async fn classify_fork_ref(
db: &Omnigraph,
table_key: &str,
branch: &str,
) -> ForkRefStatus {
// `classify.fresh_read` failpoint: simulate a transient failure of the
// fresh-authority read (no-op without the `failpoints` feature). Lets a
// test exercise the Indeterminate path — a read failure on a live branch
// must classify as Indeterminate (skip), never Orphan (destroy).
let fresh = match crate::failpoints::maybe_fail("classify.fresh_read") {
Ok(()) => db.fresh_snapshot_for_branch(Some(branch)).await,
Err(injected) => Err(injected),
};
match fresh {
Ok(snap) => {
let placed = snap
.entry(table_key)
.map(|e| e.table_branch.as_deref() == Some(branch))
.unwrap_or(false);
if placed {
ForkRefStatus::Legitimate
} else {
// Branch resolves but the manifest does not place this table on
// it — a manifest-unreferenced fork.
ForkRefStatus::Orphan
}
}
// Branch did not resolve. `all_branches` lists `_refs/branches/` live, so
// absent there = genuinely no such manifest branch (origin-1 orphan);
// present (or a list error) = transient read — never destroy on that.
Err(_) => match db.coordinator.read().await.all_branches().await {
Ok(fresh) if !fresh.iter().any(|b| b == branch) => ForkRefStatus::Orphan,
_ => ForkRefStatus::Indeterminate,
},
}
}
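The decision table that `classify_fork_ref` encodes can be read off in isolation. The following is a minimal synchronous sketch, not the engine's code: `Status`, `classify`, and the two `Result` parameters are hypothetical stand-ins for `ForkRefStatus`, the fresh snapshot read, and the live `all_branches` listing.

```rust
/// Simplified stand-in for `ForkRefStatus` (hypothetical name).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Legitimate,
    Orphan,
    Indeterminate,
}

/// Pure model of the classification (assumption: the real function performs
/// these two reads itself, asynchronously).
///
/// * `fresh_read`: `Ok(placement)` when the branch resolved, where `placement`
///   is the branch the manifest places the table on (`None` if unplaced);
///   `Err(())` when the branch did not resolve.
/// * `live_branches`: the live branch listing, or `Err(())` on a list failure.
pub fn classify(
    branch: &str,
    fresh_read: Result<Option<&str>, ()>,
    live_branches: Result<&[&str], ()>,
) -> Status {
    match fresh_read {
        // The manifest places the table on this branch: never destroy.
        Ok(placement) if placement == Some(branch) => Status::Legitimate,
        // The branch resolves but does not own this table: reclaimable.
        Ok(_) => Status::Orphan,
        // The branch did not resolve: only a confirmed absence is an orphan.
        Err(()) => match live_branches {
            Ok(names) if !names.contains(&branch) => Status::Orphan,
            _ => Status::Indeterminate,
        },
    }
}
```

Note that the only two routes to `Orphan` both require a successful read; any failure that leaves the branch's existence in doubt falls through to `Indeterminate`.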
/// Reclaim a manifest-unreferenced fork and re-fork in its place.
///
/// Reached when `fork_branch_from_state` reports `RefAlreadyExists`. This is a
/// destructive op (it force-deletes a Lance branch ref), so it owns its own
/// safety precondition rather than trusting the caller's: it re-derives, via
/// [`classify_fork_ref`], that the manifest does not place this table on
/// `active_branch`. The caller's earlier proof may have come from the
/// coordinator's *cached* branch snapshot (`resolved_branch_target` returns
/// the cache when the handle is bound to `active_branch` — an embedded handle
/// on the branch, or `branch_merge`'s target swap); trusting it could
/// force-delete a fork a concurrent writer just legitimately published. Only
/// once fresh authority confirms the ref is unreferenced does it drop the ref
/// (idempotent `force_delete_branch`) and re-fork, exactly once.
///
/// If fresh authority shows the table IS on `active_branch` (a legitimate
/// concurrent fork), or a second collision occurs after reclaim (a foreign-
/// process writer recreated the ref — the documented one-winner-CAS gap), it
/// surfaces a retryable conflict; on retry the winner's fork is visible and
/// the no-fork path runs.
pub(super) async fn reclaim_orphaned_fork_and_refork(
db: &Omnigraph,
table_key: &str,
full_path: &str,
source_branch: Option<&str>,
source_version: u64,
active_branch: &str,
) -> Result<SnapshotHandle> {
// Self-validate against FRESH authority before destroying anything. Only an
// Orphan is reclaimable; a Legitimate status (a concurrent writer published
// a real fork despite the caller's possibly-cached proof) or an
// Indeterminate one (transient read) surfaces a retryable conflict rather
// than stranding the manifest at a version the recreated ref won't have.
match classify_fork_ref(db, table_key, active_branch).await {
ForkRefStatus::Orphan => {}
ForkRefStatus::Legitimate => {
let actual = db
.fresh_snapshot_for_branch(Some(active_branch))
.await
.ok()
.and_then(|s| s.entry(table_key).map(|e| e.table_version))
.unwrap_or(source_version);
return Err(OmniError::manifest_expected_version_mismatch(
table_key,
source_version,
actual,
));
}
ForkRefStatus::Indeterminate => {
return Err(OmniError::manifest_conflict(format!(
"could not verify whether branch '{active_branch}' still owns an orphaned \
fork for table '{table_key}' because fresh manifest authority was \
unavailable; refresh and retry"
)));
}
}
crate::failpoints::maybe_fail("fork.before_reclaim")?;
db.storage()
.force_delete_branch(full_path, active_branch)
.await
.map_err(|e| {
// Lance refuses to delete a branch with dependent child branches
// even under force (RefConflict). Unreachable for a leaf first-write
// fork (the cleanup reconciler also drops children before parents),
// but surface it actionably if it ever happens. We match loosely on
// "referenc" rather than the exact prose, which is not a Lance API
// contract; a typed RefConflict variant through `force_delete_branch`
// is the durable follow-up.
if e.to_string().contains("referenc") {
OmniError::manifest_conflict(format!(
"branch '{active_branch}' cannot reclaim the leftover fork for \
table '{table_key}' because it has dependent child branches; \
delete the child branches (or run `omnigraph cleanup`) first"
))
} else {
e
}
})?;
match fork_dataset_from_entry_state(
db,
table_key,
full_path,
source_branch,
source_version,
active_branch,
)
.await?
{
crate::storage_layer::ForkOutcome::Created(ds) => Ok(ds),
crate::storage_layer::ForkOutcome::RefAlreadyExists => {
let live = db.fresh_snapshot_for_branch(Some(active_branch)).await?;
let actual = live
.entry(table_key)
.map(|e| e.table_version)
.unwrap_or(source_version);
Err(OmniError::manifest_expected_version_mismatch(
table_key,
source_version,
actual,
))
}
}
}
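The control flow of the reclaim path (fork, reclaim at most once, re-fork, and treat a second collision as a retryable conflict) can be sketched without any storage. This is an illustrative model only: `fork_with_single_reclaim`, `ForkError`, and the closure parameters are hypothetical, and the local `ForkOutcome` merely mirrors the shape of the real enum.

```rust
/// Local mirror of the fork outcome shape (illustration only).
#[derive(Debug, PartialEq)]
pub enum ForkOutcome<D> {
    Created(D),
    RefAlreadyExists,
}

/// Hypothetical error type standing in for the retryable conflict.
#[derive(Debug, PartialEq)]
pub enum ForkError {
    RetryableConflict,
}

/// Try to fork; on a ref collision run `reclaim` exactly once and re-fork.
/// A second collision is surfaced as a conflict instead of looping.
pub fn fork_with_single_reclaim<D>(
    mut fork: impl FnMut() -> ForkOutcome<D>,
    reclaim: impl FnOnce() -> Result<(), ForkError>,
) -> Result<D, ForkError> {
    if let ForkOutcome::Created(ds) = fork() {
        return Ok(ds);
    }
    // `reclaim` stands for "classify against fresh authority, then delete the ref";
    // it fails when the ref turns out to be legitimate or unverifiable.
    reclaim()?;
    match fork() {
        ForkOutcome::Created(ds) => Ok(ds),
        // Someone recreated the ref after the reclaim: let the caller retry.
        ForkOutcome::RefAlreadyExists => Err(ForkError::RetryableConflict),
    }
}
```

Taking `reclaim` as `FnOnce` makes the "exactly once" property a compile-time fact in this sketch rather than a convention.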
pub(super) async fn reopen_for_mutation( pub(super) async fn reopen_for_mutation(
db: &Omnigraph, db: &Omnigraph,
table_key: &str, table_key: &str,
@ -632,11 +834,25 @@ pub(super) async fn open_dataset_at_state(
.await .await
} }
/// A declared index the builder could not materialize on this pass. Today the
/// only such case is a vector (IVF) column with no trainable vectors yet
/// (KMeans needs >=1 vector), e.g. the load-before-embed window. Reported, not
/// fatal: a later `ensure_indices`/`optimize` retries once the column is
/// buildable, and reads stay correct via brute-force meanwhile. Surfacing
/// pending index *status* rather than failing the operation is the database
/// norm (Postgres `indisvalid`, LanceDB `list_indices`).
#[derive(Debug, Clone)]
pub struct PendingIndex {
pub table_key: String,
pub column: String,
pub reason: String,
}
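The "defer when untrainable, propagate when trainable" rule can be sketched as a small pure function. This is a hedged illustration, not the builder itself: `build_or_defer`, `has_trainable_vectors`, the `Option<Vec<f32>>` column model, and the `String` error are assumptions made for the example; only the `PendingIndex` fields come from the diff.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct PendingIndex {
    pub table_key: String,
    pub column: String,
    pub reason: String,
}

/// Assumed trainability predicate: at least one non-null vector in the column.
pub fn has_trainable_vectors(column: &[Option<Vec<f32>>]) -> bool {
    column.iter().any(|v| v.is_some())
}

/// Defer the index when the column is untrainable; otherwise attempt the
/// build and propagate any failure instead of hiding it as pending.
pub fn build_or_defer(
    table_key: &str,
    column_name: &str,
    column: &[Option<Vec<f32>>],
    build: impl FnOnce() -> Result<(), String>,
) -> Result<Option<PendingIndex>, String> {
    if !has_trainable_vectors(column) {
        // Reported, not fatal: a later pass retries once vectors exist.
        return Ok(Some(PendingIndex {
            table_key: table_key.to_string(),
            column: column_name.to_string(),
            reason: "column has no non-null vectors to train on yet".to_string(),
        }));
    }
    // Trainable: a genuine build failure must stay an error.
    build()?;
    Ok(None)
}
```

The sketch keeps the two outcomes distinct in the return type: `Ok(Some(_))` is a deferred index, while `Err(_)` is reserved for real build failures.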
pub(super) async fn build_indices_on_dataset( pub(super) async fn build_indices_on_dataset(
db: &Omnigraph, db: &Omnigraph,
table_key: &str, table_key: &str,
ds: &mut SnapshotHandle, ds: &mut SnapshotHandle,
) -> Result<()> { ) -> Result<Vec<PendingIndex>> {
let catalog = db.catalog(); let catalog = db.catalog();
build_indices_on_dataset_for_catalog(db, &catalog, table_key, ds).await build_indices_on_dataset_for_catalog(db, &catalog, table_key, ds).await
} }
@ -646,8 +862,9 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
catalog: &Catalog, catalog: &Catalog,
table_key: &str, table_key: &str,
ds: &mut SnapshotHandle, ds: &mut SnapshotHandle,
) -> Result<()> { ) -> Result<Vec<PendingIndex>> {
if let Some(type_name) = table_key.strip_prefix("node:") { if let Some(type_name) = table_key.strip_prefix("node:") {
let mut pending = Vec::new();
if !db.storage().has_btree_index(ds, "id").await? { if !db.storage().has_btree_index(ds, "id").await? {
stage_and_commit_btree(db, table_key, ds, &["id"]).await?; stage_and_commit_btree(db, table_key, ds, &["id"]).await?;
} }
@ -676,22 +893,52 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
} }
Some(NodePropIndexKind::Vector) => { Some(NodePropIndexKind::Vector) => {
if !db.storage().has_vector_index(ds, prop_name).await? { if !db.storage().has_vector_index(ds, prop_name).await? {
// Inline-commit residual: lance-6.0.1 does not // A vector (IVF) index trains k-means over the column,
// expose `build_index_metadata_from_segments` as // so it needs >=1 non-null vector (KMeans errors
// `pub`, so vector indices cannot be staged from // "cannot train N centroids with 0 vectors"). Precheck
// outside the lance crate. Document at the call // trainability: a column with no vectors yet (e.g. rows
// site; companion ticket to lance-format/lance#6658. // loaded before `embed`) is recorded as a *pending*
let new_snap = db // index and skipped — deferred, not failed. The SAME
.storage_inline_residual() // predicate gates `needs_index_work_node`, so an
.create_vector_index(ds.clone(), prop_name.as_str()) // untrainable column is never pinned for recovery (no
.await // zero-commit pin that would roll back a sibling
.map_err(|e| { // table's index work). This function is the chokepoint
OmniError::Lance(format!( // every write path funnels through (load/mutate, schema
"create Vector index on {}({}): {}", // apply, ensure_indices, optimize, merge), realizing
table_key, prop_name, e // the governing principle — physical index state never
)) // fails a logical operation. Only when trainable do we
})?; // attempt the build, and then we PROPAGATE any error: a
*ds = new_snap; // genuine I/O/manifest/Lance failure must stay fatal,
// not be hidden as pending. (Vector creation is an
// inline-commit residual until lance#6666; iss-951.)
if vector_column_trainable(db, ds, prop_name).await? {
let new_snap = db
.storage_inline_residual()
.create_vector_index(ds.clone(), prop_name.as_str())
.await
.map_err(|e| {
OmniError::Lance(format!(
"create Vector index on {}({}): {}",
table_key, prop_name, e
))
})?;
*ds = new_snap;
} else {
tracing::info!(
target: "omnigraph::index",
table = %table_key,
column = %prop_name,
"deferring Vector index: column has no \
trainable vectors yet",
);
pending.push(PendingIndex {
table_key: table_key.to_string(),
column: prop_name.clone(),
reason: "column has no non-null vectors to \
train on yet"
.to_string(),
});
}
} }
} }
// Enum + orderable scalars (DateTime/Date/numeric/Bool) // Enum + orderable scalars (DateTime/Date/numeric/Bool)
@ -709,7 +956,7 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
} }
} }
} }
return Ok(()); return Ok(pending);
} }
if table_key.starts_with("edge:") { if table_key.starts_with("edge:") {
@ -722,7 +969,9 @@ pub(super) async fn build_indices_on_dataset_for_catalog(
if !db.storage().has_btree_index(ds, "dst").await? { if !db.storage().has_btree_index(ds, "dst").await? {
stage_and_commit_btree(db, table_key, ds, &["dst"]).await?; stage_and_commit_btree(db, table_key, ds, &["dst"]).await?;
} }
return Ok(()); // Edge tables only get BTree (id/src/dst), which build at any
// cardinality; no pending state is possible here.
return Ok(Vec::new());
} }
Err(OmniError::manifest(format!( Err(OmniError::manifest(format!(
@ -844,7 +1093,11 @@ async fn prepare_updates_for_commit(
crate::db::MutationOpKind::SchemaRewrite, crate::db::MutationOpKind::SchemaRewrite,
) )
.await?; .await?;
build_indices_on_dataset(db, &prepared_update.table_key, &mut ds).await?; // Any column not yet buildable (e.g. a vector column whose rows
// have null embeddings) is deferred and logged inside
// build_indices_on_dataset; a later ensure_indices/optimize materializes it.
// The load/mutate/merge commit must not fail on it.
let _pending = build_indices_on_dataset(db, &prepared_update.table_key, &mut ds).await?;
let state = db.storage().table_state(&full_path, &ds).await?; let state = db.storage().table_state(&full_path, &ds).await?;
prepared_update.table_version = state.version; prepared_update.table_version = state.version;
prepared_update.row_count = state.row_count; prepared_update.row_count = state.row_count;
@ -1045,3 +1298,78 @@ pub(super) async fn ensure_commit_graph_initialized(db: &Omnigraph) -> Result<()
pub(super) async fn invalidate_graph_index(db: &Omnigraph) { pub(super) async fn invalidate_graph_index(db: &Omnigraph) {
db.runtime_cache.invalidate_all().await; db.runtime_cache.invalidate_all().await;
} }
#[cfg(test)]
mod classify_fork_ref_tests {
//! Direct coverage of [`classify_fork_ref`] — the single fresh-authority
//! decision both fork-ref reclaim sites (write-path reclaim + cleanup
//! reconciler) route through. Pins each deterministic status so reverting
//! the fresh-authority logic at either site fails here. (The `Indeterminate`
//! arm needs an injected transient read and is covered under the
//! `failpoints` suite.)
use super::*;
use crate::db::Omnigraph;
use crate::loader::LoadMode;
const SCHEMA: &str = "node Person { name: String @key }\nnode Company { name: String @key }\n";
/// On-disk dataset path for a node table, taken from the manifest entry
/// (the same path the engine uses) so the test forges against the real ref.
async fn node_path(db: &Omnigraph, branch: &str, table_key: &str) -> String {
let snap = db.snapshot_for_branch(Some(branch)).await.unwrap();
let entry = snap.entry(table_key).unwrap();
format!("{}/{}", db.root_uri, entry.table_path)
}
#[tokio::test]
async fn classify_distinguishes_legitimate_unreferenced_and_ghost() {
let dir = tempfile::tempdir().unwrap();
let db = Omnigraph::init(dir.path().to_str().unwrap(), SCHEMA)
.await
.unwrap();
db.branch_create("feature").await.unwrap();
// Legitimate: a real write forks Company onto `feature`, and the
// manifest places Company on `feature`.
db.load_as(
"feature",
None,
r#"{"type":"Company","data":{"name":"Acme"}}"#,
LoadMode::Merge,
None,
)
.await
.unwrap();
assert_eq!(
classify_fork_ref(&db, "node:Company", "feature").await,
ForkRefStatus::Legitimate,
"a manifest-placed fork must classify as Legitimate (never destroyed)"
);
// Orphan (manifest-unreferenced): forge a `feature` ref on Person, which
// the manifest's `feature` snapshot still places on main.
let person = node_path(&db, "feature", "node:Person").await;
{
let mut ds = lance::Dataset::open(&person).await.unwrap();
let v = ds.version().version;
ds.create_branch("feature", v, None).await.unwrap();
}
assert_eq!(
classify_fork_ref(&db, "node:Person", "feature").await,
ForkRefStatus::Orphan,
"a ref the manifest does not place on the branch must classify as Orphan"
);
// Orphan (ghost): a ref for a branch the manifest does not have at all.
{
let mut ds = lance::Dataset::open(&person).await.unwrap();
let v = ds.version().version;
ds.create_branch("ghost", v, None).await.unwrap();
}
assert_eq!(
classify_fork_ref(&db, "node:Person", "ghost").await,
ForkRefStatus::Orphan,
"a ref for a branch absent from the manifest must classify as Orphan"
);
}
}


@ -1,12 +1,15 @@
//! Per-`(table_key, branch)` writer queues — MR-686 scaffolding. //! Per-`(table_key, branch)` writer queues.
//! //!
//! Today every server-layer write serializes on the global //! These queues are the engine's write-serialization mechanism: the server
//! `Arc<RwLock<Omnigraph>>` in `AppState`. MR-686 replaces that with //! holds the engine as a lockless `Arc<Omnigraph>` (writes are `&self`), so
//! per-`(table_key, branch_ref)` queues so disjoint-key writes proceed //! disjoint-key writes proceed concurrently and only writes to the same
//! concurrently. This module owns the queue data structure; callers in //! `(table_key, branch_ref)` serialize here. This module owns the queue
//! `MutationStaging::commit_all`, `branch_merge`, `schema_apply`, //! data structure; callers in `MutationStaging::commit_all`, `branch_merge`,
//! `ensure_indices`, `delete_where`, and the future MR-870 recovery //! `schema_apply`, `ensure_indices`, `delete_where`, the fork path (first
//! reconciler acquire guards before any per-table Lance commit. //! write to a table on a branch — acquired before the fork, held through the
//! manifest publish), and the recovery reconciler acquire guards before any
//! per-table Lance commit. Serialization is in-process only; cross-process
//! writers on one graph remain one-winner-CAS at the manifest publish.
//! //!
//! ## Why exclusive `tokio::sync::Mutex<()>` per key //! ## Why exclusive `tokio::sync::Mutex<()>` per key
//! //!


@ -1323,9 +1323,9 @@ impl Omnigraph {
// branch_merge writes only to the target branch. // branch_merge writes only to the target branch.
// //
// Held across the per-table publish loop and the manifest // Held across the per-table publish loop and the manifest
// commit + record_merge_commit calls below. Under PR 1b's // commit + record_merge_commit calls below, so no concurrent
// intermediate state (global server RwLock still in place), // writer to a touched (table, target_branch) can interleave
// this acquisition is uncontended. // between our commit_staged and our publish.
let active_branch_for_keys = self.active_branch().await; let active_branch_for_keys = self.active_branch().await;
let merge_queue_keys: Vec<(String, Option<String>)> = ordered_table_keys let merge_queue_keys: Vec<(String, Option<String>)> = ordered_table_keys
.iter() .iter()


@ -741,14 +741,45 @@ impl Omnigraph {
// tables. Branch is threaded explicitly — no coordinator swap. // tables. Branch is threaded explicitly — no coordinator swap.
let mut staging = MutationStaging::default(); let mut staging = MutationStaging::default();
// Lower + validate up front so the touched-table set is known before
// execution. A lowering/validation error returns exactly as it did
// when this happened inside execute_named_mutation.
let ir = self.lower_named_mutation(query_source, query_name)?;
// Up-front fork-queue acquisition (see the loader for the full
// rationale): if this mutation will fork any touched table onto a
// non-main branch, acquire the per-(table, branch) write queues for
// every touched table before the first fork and hold them through the
// publish, so the orphan-fork reclaim can't race a concurrent
// in-process fork. The touched set is derived from the lowered IR.
let fork_queue_guards: Option<(
Vec<(String, Option<String>)>,
Vec<tokio::sync::OwnedMutexGuard<()>>,
)> = if let Some(active) = requested.as_deref() {
let snapshot = self.snapshot_for_branch(Some(active)).await?;
let touched: Vec<(String, Option<String>)> = self
.touched_table_keys(&ir)
.into_iter()
.map(|k| (k, Some(active.to_string())))
.collect();
let needs_fork = touched.iter().any(|(table_key, _)| {
snapshot
.entry(table_key)
.map(|e| e.table_branch.as_deref() != Some(active))
.unwrap_or(false)
});
if needs_fork {
let guards = self.write_queue().acquire_many(&touched).await;
Some((touched, guards))
} else {
None
}
} else {
None
};
let exec_result = self let exec_result = self
.execute_named_mutation( .execute_named_mutation(&ir, &resolved_params, requested.as_deref(), &mut staging)
query_source,
query_name,
&resolved_params,
requested.as_deref(),
&mut staging,
)
.await; .await;
match exec_result { match exec_result {
@ -768,6 +799,7 @@ impl Omnigraph {
requested.as_deref(), requested.as_deref(),
crate::db::manifest::SidecarKind::Mutation, crate::db::manifest::SidecarKind::Mutation,
actor_id, actor_id,
fork_queue_guards,
) )
.await?; .await?;
// Failpoint that wedges the documented finalize→publisher // Failpoint that wedges the documented finalize→publisher
@ -817,14 +849,19 @@ impl Omnigraph {
} }
} }
async fn execute_named_mutation( /// Lower + validate a named mutation query into its IR.
///
/// Hoisted out of [`Self::execute_named_mutation`] so the caller can
/// inspect the IR before execution — specifically to compute the
/// touched-table set (see [`Self::touched_table_keys`]) for up-front
/// write-queue acquisition. Performs the same find → typecheck → lower
/// → D₂ checks that execution previously did inline, so error behavior
/// is unchanged.
fn lower_named_mutation(
&self, &self,
query_source: &str, query_source: &str,
query_name: &str, query_name: &str,
params: &ParamMap, ) -> Result<omnigraph_compiler::ir::MutationIR> {
branch: Option<&str>,
staging: &mut MutationStaging,
) -> Result<MutationResult> {
let query_decl = omnigraph_compiler::find_named_query(query_source, query_name) let query_decl = omnigraph_compiler::find_named_query(query_source, query_name)
.map_err(|e| OmniError::manifest(e.to_string()))?; .map_err(|e| OmniError::manifest(e.to_string()))?;
@ -841,7 +878,61 @@ impl Omnigraph {
let ir = lower_mutation_query(&query_decl)?; let ir = lower_mutation_query(&query_decl)?;
// D₂: reject mixed insert/update + delete before any I/O. // D₂: reject mixed insert/update + delete before any I/O.
enforce_no_mixed_destructive_constructive(&ir)?; enforce_no_mixed_destructive_constructive(&ir)?;
Ok(ir)
}
/// The COMPLETE set of `(node|edge):{type}` table keys a mutation IR can
/// touch at execution time, keyed as `MutationStaging`/`commit_all` key
/// them. Must be a superset of everything execution forks/commits, since
/// it drives the up-front fork-queue acquisition and `commit_all`'s
/// held-guard coverage check — a miss means an unserialized fork/commit.
///
/// The set is a pure function of (IR ops + catalog). For each op it mirrors
/// the execute path's node-vs-edge dispatch (`node_types` first, then
/// `edge_types`). A `delete <Node>` additionally **cascades** to every edge
/// type whose endpoint is that node (see `execute_delete_node`), forking
/// those edge tables during execution — so they are included here, derived
/// the same way the executor derives them (`from_type`/`to_type` match).
/// Unknown types are skipped (the execute path surfaces the error).
/// Sorted + deduped for one-shot `acquire_many`.
fn touched_table_keys(&self, ir: &omnigraph_compiler::ir::MutationIR) -> Vec<String> {
use omnigraph_compiler::ir::MutationOpIR;
let catalog = self.catalog();
let mut keys: Vec<String> = Vec::new();
for op in &ir.ops {
let type_name = match op {
MutationOpIR::Insert { type_name, .. }
| MutationOpIR::Update { type_name, .. }
| MutationOpIR::Delete { type_name, .. } => type_name,
};
if catalog.node_types.contains_key(type_name) {
keys.push(format!("node:{type_name}"));
// A node delete cascades to every edge touching this node type,
// forking those edge tables. Include them so the up-front
// acquisition covers the cascade (mirrors execute_delete_node).
if matches!(op, MutationOpIR::Delete { .. }) {
for (edge_name, edge_type) in &catalog.edge_types {
if edge_type.from_type == *type_name || edge_type.to_type == *type_name {
keys.push(format!("edge:{edge_name}"));
}
}
}
} else if catalog.edge_types.contains_key(type_name) {
keys.push(format!("edge:{type_name}"));
}
}
keys.sort();
keys.dedup();
keys
}
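To make the delete cascade concrete, here is a self-contained sketch of the same derivation over a toy catalog. The `Catalog`, `EdgeType`, and `Op` types and the `sample_catalog` helper are simplified, hypothetical stand-ins for the compiler's catalog and `MutationOpIR`; only the node-first dispatch, the cascade rule, and the sort + dedup step follow the function above.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub struct EdgeType {
    pub from_type: String,
    pub to_type: String,
}

pub struct Catalog {
    pub node_types: BTreeSet<String>,
    pub edge_types: BTreeMap<String, EdgeType>,
}

pub enum Op {
    Insert(String),
    Update(String),
    Delete(String),
}

pub fn touched_table_keys(catalog: &Catalog, ops: &[Op]) -> Vec<String> {
    let mut keys = Vec::new();
    for op in ops {
        let (type_name, is_delete) = match op {
            Op::Insert(t) | Op::Update(t) => (t, false),
            Op::Delete(t) => (t, true),
        };
        if catalog.node_types.contains(type_name) {
            keys.push(format!("node:{type_name}"));
            if is_delete {
                // A node delete cascades to every edge with this endpoint.
                for (edge_name, edge) in &catalog.edge_types {
                    if edge.from_type == *type_name || edge.to_type == *type_name {
                        keys.push(format!("edge:{edge_name}"));
                    }
                }
            }
        } else if catalog.edge_types.contains_key(type_name) {
            keys.push(format!("edge:{type_name}"));
        }
        // Unknown types are skipped here; execution reports the error.
    }
    keys.sort();
    keys.dedup();
    keys
}

/// Toy catalog: Person and Company nodes, WorksAt (Person -> Company)
/// and Knows (Person -> Person) edges.
pub fn sample_catalog() -> Catalog {
    let node_types = ["Person", "Company"].iter().map(|s| s.to_string()).collect();
    let mut edge_types = BTreeMap::new();
    edge_types.insert(
        "WorksAt".to_string(),
        EdgeType { from_type: "Person".into(), to_type: "Company".into() },
    );
    edge_types.insert(
        "Knows".to_string(),
        EdgeType { from_type: "Person".into(), to_type: "Person".into() },
    );
    Catalog { node_types, edge_types }
}
```

With this catalog, deleting `Person` yields `edge:Knows`, `edge:WorksAt`, and `node:Person`, while inserting `Company` yields only `node:Company`.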
async fn execute_named_mutation(
&self,
ir: &omnigraph_compiler::ir::MutationIR,
params: &ParamMap,
branch: Option<&str>,
staging: &mut MutationStaging,
) -> Result<MutationResult> {
let mut total = MutationResult::default(); let mut total = MutationResult::default();
for op in &ir.ops { for op in &ir.ops {
let result = match op { let result = match op {

View file

@ -463,12 +463,28 @@ impl StagedMutation {
/// unreferenced (cleaned by `cleanup_old_versions`'s age sweep) /// unreferenced (cleaned by `cleanup_old_versions`'s age sweep)
/// rather than being committed and creating a Lance-HEAD-ahead /// rather than being committed and creating a Lance-HEAD-ahead
/// residual. /// residual.
/// `held_guards`: when the caller already holds the per-`(table_key,
/// branch)` write queues for every touched table (the fork path acquires
/// them up front, before the fork, and holds them through the manifest
/// publish), it passes `(acquired_keys, guards)` here so `commit_all`
/// reuses them instead of re-acquiring — the queue is a non-re-entrant
/// `tokio::Mutex`, so re-acquiring a held key would self-deadlock.
/// `None` (the steady-state path) means `commit_all` acquires them
/// itself. `acquired_keys` must cover every key `commit_all` would
/// acquire (checked below in all builds); the guards from `acquire_many` don't
/// carry their keys, so the caller hands the key set alongside them. The
/// fork path guarantees coverage by keying every touched table uniformly
/// by the resolved target branch.
pub(crate) async fn commit_all( pub(crate) async fn commit_all(
self, self,
db: &crate::db::Omnigraph, db: &crate::db::Omnigraph,
branch: Option<&str>, branch: Option<&str>,
sidecar_kind: SidecarKind, sidecar_kind: SidecarKind,
actor_id: Option<&str>, actor_id: Option<&str>,
held_guards: Option<(
Vec<(String, Option<String>)>,
Vec<tokio::sync::OwnedMutexGuard<()>>,
)>,
) -> Result<( ) -> Result<(
Vec<SubTableUpdate>, Vec<SubTableUpdate>,
HashMap<String, u64>, HashMap<String, u64>,
@ -483,21 +499,18 @@ impl StagedMutation {
op_kinds, op_kinds,
} = self; } = self;
// Acquire per-(table_key, branch) queues for every touched // Per-(table_key, branch) queues for every touched table — both
// table — both staged and inline-committed. Sorted by // staged and inline-committed. Sorted by `acquire_many` internally
// `acquire_many` internally so all multi-table writers // so all multi-table writers (mutation, branch_merge, schema_apply,
// (mutation, branch_merge, schema_apply, future MR-870 // the fork path, recovery) agree on acquisition order — prevents
// recovery) agree on acquisition order — prevents lock-order // lock-order inversion deadlock.
// inversion deadlock.
// //
// For inline-committed tables (delete-only mutations), Lance // For inline-committed tables (delete-only mutations), Lance HEAD
// HEAD has already advanced inside `delete_where` before // has already advanced inside `delete_where` before `commit_all`
// `commit_all` runs. Holding the queue here doesn't prevent // runs. Holding the queue here prevents another writer from
// that interleaving (commit 6 will move queue acquisition into // interleaving between our delete and our publish, which would
// `delete_where`'s call site); it does prevent another writer // otherwise leave a Lance-HEAD-ahead residual the delete-only
// from interleaving between our delete and our publish, which // sidecar (added below) would have to recover.
// would otherwise leave a Lance-HEAD-ahead residual the
// delete-only sidecar (added below) would have to recover.
let mut queue_keys: Vec<(String, Option<String>)> = let mut queue_keys: Vec<(String, Option<String>)> =
Vec::with_capacity(staged.len() + inline_committed.len()); Vec::with_capacity(staged.len() + inline_committed.len());
for entry in &staged { for entry in &staged {
@ -512,7 +525,30 @@ impl StagedMutation {
})?; })?;
queue_keys.push((table_key.clone(), path.table_branch.clone())); queue_keys.push((table_key.clone(), path.table_branch.clone()));
} }
let guards = db.write_queue().acquire_many(&queue_keys).await; // Reuse the caller's guards (fork path) when handed in, else acquire
// our own. When reusing, every key we would acquire MUST already be
// covered — re-acquiring a held non-re-entrant key would deadlock, and
// a key we'd need but DON'T hold would commit unserialized. This is a
// load-bearing safety invariant, so it is checked in ALL builds (not a
// debug_assert) and fails the write loudly+safely rather than silently
// proceeding unguarded if a future execution path ever touches a table
// outside the caller's pre-computed set.
let guards = match held_guards {
Some((acquired_keys, guards)) => {
let held: std::collections::HashSet<&(String, Option<String>)> =
acquired_keys.iter().collect();
if let Some(missing) = queue_keys.iter().find(|k| !held.contains(k)) {
return Err(OmniError::manifest_internal(format!(
"commit_all: pre-held write-queue guards do not cover touched table \
'{}' on branch {:?}; the caller's up-front acquisition set diverged \
from the staged/inline set (a touched-table-set bug)",
missing.0, missing.1
)));
}
guards
}
None => db.write_queue().acquire_many(&queue_keys).await,
};
// Re-capture manifest pins under the queue (PR 2 / MR-686). // Re-capture manifest pins under the queue (PR 2 / MR-686).
// //
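The coverage check on pre-held guards reduces to a set-membership test over `(table_key, branch)` pairs. A minimal sketch, assuming keys are plain tuples as in the diff; the `QueueKey` alias and the `key` and `first_uncovered` helpers are hypothetical names introduced for illustration.

```rust
use std::collections::HashSet;

/// `(table_key, branch)` pair, shaped like the queue keys in `commit_all`.
pub type QueueKey = (String, Option<String>);

/// Convenience constructor for the example.
pub fn key(table: &str, branch: Option<&str>) -> QueueKey {
    (table.to_string(), branch.map(|b| b.to_string()))
}

/// Return the first key that would be committed without a held guard,
/// or `None` when the held set covers everything needed.
pub fn first_uncovered<'a>(held: &[QueueKey], needed: &'a [QueueKey]) -> Option<&'a QueueKey> {
    let held: HashSet<&QueueKey> = held.iter().collect();
    needed.iter().find(|k| !held.contains(k))
}
```

Because the branch is part of the key, a guard held for `("node:Person", Some("feature"))` does not cover `("node:Person", None)`; this is why the fork path keys every touched table uniformly by the target branch.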

View file

@ -418,6 +418,45 @@ async fn load_jsonl_reader<R: BufRead>(
LoadMode::Overwrite => crate::db::MutationOpKind::SchemaRewrite, LoadMode::Overwrite => crate::db::MutationOpKind::SchemaRewrite,
}; };
// Up-front fork-queue acquisition. The first write to a table on a
// non-main branch forks it (create_branch), which advances Lance state
// before the manifest publish; the reclaim of any manifest-unreferenced
// leftover (`reclaim_orphaned_fork_and_refork`) must not race a concurrent
// in-process fork. So when this load will fork at least one touched table,
// acquire the per-(table, branch) write queues for ALL touched tables up
// front (one sorted `acquire_many`, keyed uniformly by the target branch
// so it covers what `commit_all` recomputes) and hold them through the
// publish. Main-branch loads never fork; branch loads where every touched
// table is already forked skip this and let `commit_all` acquire at commit.
let fork_queue_guards: Option<(
Vec<(String, Option<String>)>,
Vec<tokio::sync::OwnedMutexGuard<()>>,
)> = if let Some(active) = branch {
let touched: Vec<(String, Option<String>)> = node_rows
.keys()
.map(|t| (format!("node:{t}"), Some(active.to_string())))
.chain(
edge_rows
.keys()
.map(|e| (format!("edge:{e}"), Some(active.to_string()))),
)
.collect();
let needs_fork = touched.iter().any(|(table_key, _)| {
snapshot
.entry(table_key)
.map(|e| e.table_branch.as_deref() != Some(active))
.unwrap_or(false)
});
if needs_fork {
let guards = db.write_queue().acquire_many(&touched).await;
Some((touched, guards))
} else {
None
}
} else {
None
};
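The `needs_fork` predicate used by both the loader and the mutation path can be modeled on its own. In this sketch the snapshot is reduced to a map from table key to the branch the manifest places the table on; that map type and the free function `needs_fork` are assumptions for illustration, not the engine's `Snapshot` API.

```rust
use std::collections::HashMap;

/// `snapshot` maps a table key to the branch the manifest places it on
/// (`None` meaning the table is not placed on any named branch).
pub fn needs_fork(
    snapshot: &HashMap<String, Option<String>>,
    touched: &[String],
    active: &str,
) -> bool {
    touched.iter().any(|table_key| {
        snapshot
            .get(table_key)
            // An existing table placed elsewhere must be forked on first write.
            .map(|placed| placed.as_deref() != Some(active))
            // A table absent from the snapshot does not trigger the fork path.
            .unwrap_or(false)
    })
}
```

A single touched table that still needs a fork is enough to take the up-front acquisition path for all touched tables, which matches the `any` in the predicate.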
// Phase 2a: build and validate every node batch up front. Cheap and // Phase 2a: build and validate every node batch up front. Cheap and
// synchronous — surfaces validation errors before any S3 traffic. // synchronous — surfaces validation errors before any S3 traffic.
let mut prepared_nodes: Vec<(String, String, RecordBatch, usize)> = let mut prepared_nodes: Vec<(String, String, RecordBatch, usize)> =
@ -551,7 +590,13 @@ async fn load_jsonl_reader<R: BufRead>(
// across the manifest publish below — see exec/mutation.rs for // across the manifest publish below — see exec/mutation.rs for
// the rationale (interleaving prevention). // the rationale (interleaving prevention).
let (updates, expected_versions, sidecar_handle, _queue_guards) = staged let (updates, expected_versions, sidecar_handle, _queue_guards) = staged
.commit_all(db, branch, crate::db::manifest::SidecarKind::Load, actor_id) .commit_all(
db,
branch,
crate::db::manifest::SidecarKind::Load,
actor_id,
fork_queue_guards,
)
.await?; .await?;
// Same finalize → publisher residual as mutations: per-table // Same finalize → publisher residual as mutations: per-table
// staged commits have advanced Lance HEAD, but the manifest // staged commits have advanced Lance HEAD, but the manifest

View file

@ -184,6 +184,26 @@ pub(crate) fn staged_handles_as_writes(handles: &[StagedHandle]) -> Vec<StagedWr
handles.iter().map(|h| h.inner.clone()).collect() handles.iter().map(|h| h.inner.clone()).collect()
} }
/// Outcome of a per-table branch fork (`fork_branch_from_state`).
///
/// `RefAlreadyExists` means a Lance branch ref for the target already exists
/// on the dataset, so `create_branch` could not create it cleanly. By the
/// fork caller's contract — the caller re-checks the live manifest under the
/// held per-`(table, branch)` write queue and only forks when the manifest
/// does *not* place the table on the branch — such a ref is a
/// manifest-unreferenced fork (the residue of an interrupted prior fork, or a
/// delete+recreate), which the caller reclaims and re-forks. The fork
/// operation does not editorialize ("incomplete prior delete"); it returns
/// this typed signal and lets the db layer decide.
// `pub` (not `pub(crate)`) to match the visibility of the sealed
// `TableStorage::fork_branch_from_state` that returns it (and the already-`pub`
// `SnapshotHandle`); avoids a private-interfaces warning. The trait is sealed,
// so this widening does not let external code construct or branch on it.
pub enum ForkOutcome<D> {
Created(D),
RefAlreadyExists,
}
// ─── TableStorage trait ──────────────────────────────────────────────────── // ─── TableStorage trait ────────────────────────────────────────────────────
/// Engine-internal trait covering every Lance dataset operation an /// Engine-internal trait covering every Lance dataset operation an
@ -231,7 +251,7 @@ pub trait TableStorage: sealed::Sealed + Send + Sync + Debug {
table_key: &str, table_key: &str,
source_version: u64, source_version: u64,
target_branch: &str, target_branch: &str,
) -> Result<SnapshotHandle>; ) -> Result<ForkOutcome<SnapshotHandle>>;
async fn delete_branch(&self, dataset_uri: &str, branch: &str) -> Result<()>; async fn delete_branch(&self, dataset_uri: &str, branch: &str) -> Result<()>;
@ -497,17 +517,22 @@ impl TableStorage for TableStore {
table_key: &str, table_key: &str,
source_version: u64, source_version: u64,
target_branch: &str, target_branch: &str,
) -> Result<SnapshotHandle> { ) -> Result<ForkOutcome<SnapshotHandle>> {
TableStore::fork_branch_from_state( Ok(
self, match TableStore::fork_branch_from_state(
dataset_uri, self,
source_branch, dataset_uri,
table_key, source_branch,
source_version, table_key,
target_branch, source_version,
target_branch,
)
.await?
{
ForkOutcome::Created(ds) => ForkOutcome::Created(SnapshotHandle::new(ds)),
ForkOutcome::RefAlreadyExists => ForkOutcome::RefAlreadyExists,
},
) )
.await
.map(SnapshotHandle::new)
} }
async fn delete_branch(&self, dataset_uri: &str, branch: &str) -> Result<()> { async fn delete_branch(&self, dataset_uri: &str, branch: &str) -> Result<()> {
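Reviewer sketch: the trait impl above re-wraps only the `Created` payload and passes `RefAlreadyExists` through unchanged. A small `map` combinator could express the same thing. This is an illustration only: `map` and the `Handle` stand-in are hypothetical names that do not appear in the diff, and the enum is re-declared here so the snippet is self-contained.

```rust
/// Mirror of the two-variant outcome introduced in the diff, re-declared
/// here so the sketch compiles on its own.
#[derive(Debug, PartialEq)]
pub enum ForkOutcome<D> {
    Created(D),
    RefAlreadyExists,
}

impl<D> ForkOutcome<D> {
    /// HYPOTHETICAL helper (not in the diff): transform the `Created`
    /// payload and leave `RefAlreadyExists` untouched.
    pub fn map<U>(self, f: impl FnOnce(D) -> U) -> ForkOutcome<U> {
        match self {
            ForkOutcome::Created(d) => ForkOutcome::Created(f(d)),
            ForkOutcome::RefAlreadyExists => ForkOutcome::RefAlreadyExists,
        }
    }
}

/// HYPOTHETICAL stand-in for the engine's `SnapshotHandle` wrapper.
#[derive(Debug, PartialEq)]
pub struct Handle(pub String);
```

With such a helper, the trait impl body could plausibly shrink to a single `.await.map(|o| o.map(SnapshotHandle::new))`; whether that reads better than the explicit `match` is a style call for the authors.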


@ -26,6 +26,7 @@ use std::sync::Arc;
use crate::db::manifest::{TableVersionMetadata, open_table_head_for_write}; use crate::db::manifest::{TableVersionMetadata, open_table_head_for_write};
use crate::db::{Snapshot, SubTableEntry}; use crate::db::{Snapshot, SubTableEntry};
use crate::error::{OmniError, Result}; use crate::error::{OmniError, Result};
use crate::storage_layer::ForkOutcome;
#[derive(Debug, Clone, PartialEq, Eq)] #[derive(Debug, Clone, PartialEq, Eq)]
pub struct TableState { pub struct TableState {
@ -285,7 +286,7 @@ impl TableStore {
table_key: &str, table_key: &str,
source_version: u64, source_version: u64,
target_branch: &str, target_branch: &str,
) -> Result<Dataset> { ) -> Result<ForkOutcome<Dataset>> {
let mut source_ds = self let mut source_ds = self
.open_dataset_head(dataset_uri, source_branch) .open_dataset_head(dataset_uri, source_branch)
.await? .await?
@ -294,31 +295,49 @@ impl TableStore {
.map_err(|e| OmniError::Lance(e.to_string()))?; .map_err(|e| OmniError::Lance(e.to_string()))?;
self.ensure_expected_version(&source_ds, table_key, source_version)?; self.ensure_expected_version(&source_ds, table_key, source_version)?;
if source_ds if let Err(create_err) = source_ds
.create_branch(target_branch, source_version, None) .create_branch(target_branch, source_version, None)
.await .await
.is_err()
{ {
// The target branch ref already exists. The caller // Disambiguate the failure: only a genuinely pre-existing ref is a
// (`open_owned_dataset_for_branch_write`) re-reads the live manifest // reclaim candidate. Mapping EVERY create_branch failure to
// before forking and returns a retryable error when a concurrent // `RefAlreadyExists` would route a transient I/O / version / Lance
// writer legitimately holds the fork, so reaching here means the // internal error into the destructive reclaim path. So check whether
// manifest does NOT reference this fork: it is an orphan from an // the ref actually exists; if not, the failure is real — propagate
// incomplete prior `branch_delete`. Surface the actionable cleanup // it (preserving error fidelity) rather than force-deleting.
// error rather than guessing from Lance branch versions. //
return Err(OmniError::manifest_conflict(format!( // `list_branches` reads `_refs/branches/` from the store, so it sees
"branch '{}' has orphaned table state for '{}' from an incomplete \ // a fully-formed manifest-unreferenced fork (our common case — a
prior delete; run `omnigraph cleanup` to reclaim it before reusing \ // create_branch that completed but whose manifest publish did not).
this branch name", // It does NOT see a phase-1-only Lance "zombie" (tree dir written,
target_branch, table_key // no BranchContents) — but neither does `cleanup`'s reconciler, also
))); // list_branches-based. A zombie only forms if create_branch is
// interrupted *between its two internal phases* (a far narrower
// window than the manifest-publish gap), and it surfaces here as the
// propagated create error requiring manual reclaim. We deliberately
// do NOT force-delete on a not-found-ref failure: it is
// indistinguishable from a transient error on a fresh create, and
// force-deleting there is the destructive overreach this guard
// removes. The caller holds the per-(table, branch) write queue, so
// no in-process writer races this fork; a cross-process create
// between our check and now is the documented one-winner-CAS gap and
// propagates as a retryable error.
let ref_exists = source_ds
.list_branches()
.await
.map(|b| b.contains_key(target_branch))
.unwrap_or(false);
if ref_exists {
return Ok(ForkOutcome::RefAlreadyExists);
}
return Err(OmniError::Lance(create_err.to_string()));
} }
let ds = self let ds = self
.open_dataset_head(dataset_uri, Some(target_branch)) .open_dataset_head(dataset_uri, Some(target_branch))
.await?; .await?;
self.ensure_expected_version(&ds, table_key, source_version)?; self.ensure_expected_version(&ds, table_key, source_version)?;
Ok(ds) Ok(ForkOutcome::Created(ds))
} }
pub async fn scan_batches(&self, ds: &Dataset) -> Result<Vec<RecordBatch>> { pub async fn scan_batches(&self, ds: &Dataset) -> Result<Vec<RecordBatch>> {
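Reviewer sketch: the new failure handling in `fork_branch_from_state` can be read as a three-way decision: create succeeded, create failed because the ref already exists, or create failed for some other reason. The synchronous model below illustrates that decision under stated assumptions. `FakeDataset`, `inject_failure`, and `fork` are hypothetical names, and the in-memory set stands in for Lance's branch refs; it is not the engine's code.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
pub enum ForkOutcome<D> {
    Created(D),
    RefAlreadyExists,
}

/// HYPOTHETICAL in-memory stand-in for a dataset's branch refs.
pub struct FakeDataset {
    pub branches: HashSet<String>,
    /// When set, `create_branch` fails with this message regardless of
    /// whether the ref exists (models a transient I/O error).
    pub inject_failure: Option<String>,
}

impl FakeDataset {
    fn create_branch(&mut self, name: &str) -> Result<(), String> {
        if let Some(msg) = &self.inject_failure {
            return Err(msg.clone());
        }
        // `insert` returns false when the name was already present.
        if !self.branches.insert(name.to_string()) {
            return Err(format!("branch '{name}' already exists"));
        }
        Ok(())
    }

    fn list_branches(&self) -> Result<&HashSet<String>, String> {
        Ok(&self.branches)
    }
}

/// Only a ref that is actually present is reported as a reclaim
/// candidate; any other create failure is propagated unchanged.
pub fn fork(ds: &mut FakeDataset, target: &str) -> Result<ForkOutcome<String>, String> {
    if let Err(create_err) = ds.create_branch(target) {
        let ref_exists = ds
            .list_branches()
            .map(|b| b.contains(target))
            .unwrap_or(false); // a failed listing is treated as "not found"
        if ref_exists {
            return Ok(ForkOutcome::RefAlreadyExists);
        }
        return Err(create_err);
    }
    Ok(ForkOutcome::Created(target.to_string()))
}
```

In this model a transient failure on a fresh create never reaches the reclaim path, which is the property the diff's comment calls out. Note that, as in the diff, a failed branch listing also falls through to propagating the original create error.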


@ -5,7 +5,9 @@ mod helpers;
use fail::FailScenario; use fail::FailScenario;
use futures::FutureExt; use futures::FutureExt;
use omnigraph::db::Omnigraph; use omnigraph::db::Omnigraph;
use omnigraph::error::{ManifestErrorKind, OmniError};
use omnigraph::failpoints::ScopedFailPoint; use omnigraph::failpoints::ScopedFailPoint;
use omnigraph::loader::LoadMode;
use helpers::recovery::{ use helpers::recovery::{
FollowUpMutation, RecoveryExpectation, TableExpectation, assert_post_recovery_invariants, FollowUpMutation, RecoveryExpectation, TableExpectation, assert_post_recovery_invariants,
@ -127,12 +129,12 @@ async fn branch_delete_partial_failure_converges_via_cleanup() {
} }
// Reusing a branch name whose delete left an orphaned fork (before `cleanup` // Reusing a branch name whose delete left an orphaned fork (before `cleanup`
// reconciles it) must fail with a clear, actionable error pointing at // reconciles it) must SELF-HEAL on the next write — the write reclaims the
// `cleanup`, not the opaque `ExpectedVersionMismatch` that leaks from the fork // manifest-unreferenced fork and re-forks, rather than wedging with "incomplete
// path. The recreate itself succeeds; the first write to the previously-forked // prior delete; run cleanup". (This test was the inverse before the fork-as-
// table is where the stale orphan collides. // idempotent-reconcile fix; its flip is the signal the bug class is closed.)
#[tokio::test] #[tokio::test]
async fn recreate_over_orphaned_fork_before_cleanup_is_actionable() { async fn recreate_over_orphaned_fork_self_heals_without_cleanup() {
let _scenario = FailScenario::setup(); let _scenario = FailScenario::setup();
let dir = tempfile::tempdir().unwrap(); let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap().to_string(); let uri = dir.path().to_str().unwrap().to_string();
@ -158,10 +160,10 @@ async fn recreate_over_orphaned_fork_before_cleanup_is_actionable() {
} }
// Recreate the name and write to the previously-forked table WITHOUT a // Recreate the name and write to the previously-forked table WITHOUT a
// cleanup in between. // cleanup in between. The write must self-heal the stale orphan fork.
main.branch_create("feature").await.unwrap(); main.branch_create("feature").await.unwrap();
let mut feature2 = Omnigraph::open(&uri).await.unwrap(); let mut feature2 = Omnigraph::open(&uri).await.unwrap();
let err = helpers::mutate_branch( helpers::mutate_branch(
&mut feature2, &mut feature2,
"feature", "feature",
MUTATION_QUERIES, MUTATION_QUERIES,
@ -169,20 +171,83 @@ async fn recreate_over_orphaned_fork_before_cleanup_is_actionable() {
&mixed_params(&[("$name", "Frank")], &[("$age", 41)]), &mixed_params(&[("$name", "Frank")], &[("$age", 41)]),
) )
.await .await
.expect_err("write should collide with the stale orphaned fork"); .expect("recreate-over-orphan write must self-heal, not require cleanup");
let msg = err.to_string(); // The recreated branch forks FRESH from main: the deleted branch's Eve is
assert!( // gone and only the new Frank is added on top of main's seed. A count of
msg.contains("cleanup") // main + 2 would mean Eve resurrected from the stale fork (the bug).
&& (msg.contains("orphan") || msg.contains("incomplete prior delete")), let main_people = helpers::count_rows(&main, "node:Person").await;
"expected an actionable orphaned-fork error pointing at cleanup, got: {msg}" let feature_people = helpers::count_rows_branch(&feature2, "feature", "node:Person").await;
); assert_eq!(
assert!( feature_people,
!msg.contains("expected manifest table version"), main_people + 1,
"should not surface the opaque ExpectedVersionMismatch, got: {msg}" "self-healed feature must fork fresh from main (+Frank only); \
main={main_people}, feature={feature_people} (main+2 would mean Eve resurrected)"
); );
} }
// The write-path orphan reclaim shares the same fresh-authority classifier as
// cleanup. If that classifier is Indeterminate (transient read on a live
// branch), the write must return a clear retryable authority-read conflict and
// leave the ref in place. It must not squeeze the ambiguity through
// ExpectedVersionMismatch with expected == actual, which lies about the cause.
#[tokio::test]
async fn recreate_over_orphaned_fork_reports_indeterminate_authority_read() {
let _scenario = FailScenario::setup();
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap().to_string();
let db = helpers::init_and_load(&dir).await;
db.branch_create("feature").await.unwrap();
let person_uri = node_table_uri(&uri, "Person");
{
let mut ds = lance::Dataset::open(&person_uri).await.unwrap();
let base = ds.version().version;
ds.create_branch("feature", base, None).await.unwrap();
}
let row = r#"{"type":"Person","data":{"name":"Grace","age":37}}"#;
{
let _fp = ScopedFailPoint::new("classify.fresh_read", "return");
let err = db
.load_as("feature", None, row, LoadMode::Merge, None)
.await
.expect_err("indeterminate authority read must fail retryably");
match &err {
OmniError::Manifest(manifest) => {
assert_eq!(manifest.kind, ManifestErrorKind::Conflict);
assert!(
manifest.details.is_none(),
"indeterminate authority read is not an expected-version mismatch: {manifest:?}"
);
}
other => panic!("expected manifest conflict, got {other:?}"),
}
let message = err.to_string();
assert!(
message.contains("could not verify")
&& message.contains("fresh manifest authority was unavailable")
&& message.contains("refresh and retry"),
"error should name the unavailable authority read, got: {message}"
);
assert!(
!message.contains("expected manifest table version"),
"indeterminate authority must not be reported as a version mismatch: {message}"
);
let ds = lance::Dataset::open(&person_uri).await.unwrap();
assert!(
ds.list_branches().await.unwrap().contains_key("feature"),
"ambiguous orphan status must leave the fork for a later retry"
);
}
db.load_as("feature", None, row, LoadMode::Merge, None)
.await
.expect("when fresh authority is available, the orphan is reclaimed and write converges");
}
// cleanup is the guaranteed convergence backstop, so one table's transient // cleanup is the guaranteed convergence backstop, so one table's transient
// failure must not abort the whole sweep. Inject a one-shot version-GC failure // failure must not abort the whole sweep. Inject a one-shot version-GC failure
// for a single table and assert: cleanup still succeeds, the failure is // for a single table and assert: cleanup still succeeds, the failure is
@ -330,6 +395,68 @@ async fn cleanup_reclaims_orphaned_commit_graph_branch() {
} }
} }
// `classify_fork_ref` returns `Indeterminate` when the fresh-authority read
// fails on a LIVE branch — and a destructive caller must SKIP, never delete, on
// that ambiguity. Here the reconciler has a genuine origin-2 orphan candidate
// (a manifest-unreferenced Person fork on the live `feature` branch), but the
// `classify.fresh_read` failpoint makes the fresh re-check fail: cleanup must
// leave the ref in place (cannot confirm it is unreferenced), then reclaim it on
// the next run once the read succeeds. This pins the Indeterminate arm and the
// don't-destroy-on-ambiguity rule end-to-end through cleanup.
#[tokio::test]
async fn reconcile_skips_fork_when_fresh_recheck_is_unavailable_then_converges() {
let _scenario = FailScenario::setup();
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap().to_string();
let mut db = helpers::init_and_load(&dir).await;
db.branch_create("feature").await.unwrap();
// Forge a manifest-unreferenced Person fork on the live `feature` branch —
// a genuine orphan the reconciler would normally reclaim.
let person_uri = node_table_uri(&uri, "Person");
{
let mut ds = lance::Dataset::open(&person_uri).await.unwrap();
let base = ds.version().version;
ds.create_branch("feature", base, None).await.unwrap();
assert!(
ds.list_branches().await.unwrap().contains_key("feature"),
"precondition: forged orphan fork present"
);
}
// With the fresh re-check failing, the fork's status is Indeterminate (the
// branch is live but unreadable) → cleanup must SKIP it, not delete.
{
let _fp = ScopedFailPoint::new("classify.fresh_read", "return");
db.cleanup(omnigraph::db::CleanupPolicyOptions {
keep_versions: Some(1),
older_than: None,
})
.await
.unwrap();
let ds = lance::Dataset::open(&person_uri).await.unwrap();
assert!(
ds.list_branches().await.unwrap().contains_key("feature"),
"reconcile must NOT delete a fork whose fresh re-check is inconclusive"
);
}
// Read succeeds now → cleanup confirms the orphan and reclaims it (converges).
db.cleanup(omnigraph::db::CleanupPolicyOptions {
keep_versions: Some(1),
older_than: None,
})
.await
.unwrap();
{
let ds = lance::Dataset::open(&person_uri).await.unwrap();
assert!(
!ds.list_branches().await.unwrap().contains_key("feature"),
"next cleanup (fresh read available) must reclaim the confirmed orphan"
);
}
}
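Reviewer sketch: the two failpoint tests above pin a "don't destroy on ambiguity" rule. A minimal model of that rule follows. Only the name `Indeterminate` comes from the diff; the other variant names, `classify`, and `decide` are hypothetical, and the real `classify_fork_ref` presumably takes richer inputs than a single `Result<bool, _>`.

```rust
/// Status of a table-level branch ref relative to the manifest.
/// Only `Indeterminate` is named in the diff; the rest are ASSUMED names.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ForkRefStatus {
    Referenced,
    Orphan,
    Indeterminate,
}

/// What a destructive caller (cleanup, or the write-path reclaim) does next.
#[derive(Debug, PartialEq)]
pub enum ReclaimAction {
    Keep,
    Delete,
    SkipAndRetry,
}

/// `fresh_read` models the fresh-authority read: `Ok(true)` if the manifest
/// places the table on the branch, `Ok(false)` if it does not, `Err` if the
/// read itself failed.
pub fn classify(fresh_read: Result<bool, String>) -> ForkRefStatus {
    match fresh_read {
        Ok(true) => ForkRefStatus::Referenced,
        Ok(false) => ForkRefStatus::Orphan,
        Err(_) => ForkRefStatus::Indeterminate,
    }
}

/// Delete only a confirmed orphan; an unreadable authority never deletes.
pub fn decide(status: ForkRefStatus) -> ReclaimAction {
    match status {
        ForkRefStatus::Referenced => ReclaimAction::Keep,
        ForkRefStatus::Orphan => ReclaimAction::Delete,
        ForkRefStatus::Indeterminate => ReclaimAction::SkipAndRetry,
    }
}
```

Under this reading, the cleanup test corresponds to `SkipAndRetry` followed by `Delete` on the next run, and the write-path test corresponds to surfacing `SkipAndRetry` as a retryable conflict.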
// A branch_delete whose best-effort commit-graph reclaim fails leaves a // A branch_delete whose best-effort commit-graph reclaim fails leaves a
// commit-graph "zombie" branch. Recreating that name must heal the zombie and // commit-graph "zombie" branch. Recreating that name must heal the zombie and
// succeed (branch_create force-deletes a stale commit-graph ref since the // succeed (branch_create force-deletes a stale commit-graph ref since the
@ -2619,69 +2746,66 @@ async fn finalize_publisher_residual_does_not_drift_untouched_tables() {
} }
/// Acceptance test: a stage-step failure in the staged-index path /// Acceptance test: a stage-step failure in the staged-index path
/// (`stage_create_btree_index` succeeded; `commit_staged` not yet /// (`stage_create_btree_index` succeeded; `commit_staged` not yet called)
/// called) leaves NO Lance-HEAD drift on the existing tables. /// leaves NO Lance-HEAD drift, so other tables stay writable.
/// Subsequent operations against those tables succeed without
/// `ExpectedVersionMismatch`.
/// ///
/// Path: `apply_schema(v1 → v2)` adds a new node type. The /// Under iss-848 schema apply no longer builds indexes inline — the build
/// `added_tables` loop in `schema_apply` creates the empty dataset and /// happens in the reconciler (`ensure_indices`/`optimize`) and at load. So this
/// then calls `build_indices_on_dataset_for_catalog` → /// fires the failpoint where it lives now: an `ensure_indices` build of a BTREE
/// `stage_and_commit_btree(..., &["id"])`. The failpoint fires /// that a prior apply declared (`@index`) but deferred. The failpoint fires
/// between `stage_create_btree_index` and `commit_staged`, so the /// between `stage_create_btree_index` and `commit_staged`, so the staged
/// staged segments are written under `_indices/<uuid>/` but Lance HEAD /// segment is written under `_indices/<uuid>/` but `node:Person`'s Lance HEAD is
/// on the new dataset is unchanged at v=1. The schema-apply lock /// unchanged. `ensure_indices` fails and its EnsureIndices sidecar pins only
/// branch is released by `apply_schema`'s outer match. Existing /// Person at NoMovement (a clean no-op on the next open). A write to a
/// tables (e.g. `node:Person`) are completely untouched by the new /// different, unpinned table (`node:Company`) is unaffected: mutations/loads run
/// node's added_tables iteration — they're outside the failed apply /// a roll-forward-only heal and proceed — they do not refuse on a pending
/// path entirely — and we assert that mutations against them continue /// sidecar the way `optimize`/`repair` do — so the write succeeds with no drift.
/// to work.
///
/// The orphan empty dataset from the failed apply is acceptable
/// residual: it's unreferenced by `__manifest` and will be reclaimed
/// by `cleanup_old_versions` (or removed when a future apply at the
/// same target path resolves the rename).
#[tokio::test] #[tokio::test]
async fn ensure_indices_stage_btree_failure_leaves_existing_tables_writable() { async fn ensure_indices_stage_btree_failure_leaves_existing_tables_writable() {
let _scenario = FailScenario::setup(); let _scenario = FailScenario::setup();
let dir = tempfile::tempdir().unwrap(); let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap().to_string(); let uri = dir.path().to_str().unwrap().to_string();
// Init with TEST_SCHEMA which declares Person + Knows. Indices on
// those tables get built during init.
let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap(); let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap();
// Apply a schema that adds a new node type. The added_tables loop // Seed a Person row — the load builds Person's id BTREE + name FTS.
// will hit the failpoint between stage and commit on the new
// node:Project table's btree-on-id build. (TEST_SCHEMA already
// has Person + Company + Knows + WorksAt — pick a name that isn't
// already declared.)
let extended_schema = format!(
"{}\nnode Project {{ name: String @key }}\n",
helpers::TEST_SCHEMA
);
{
let _failpoint =
ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return");
let err = db.apply_schema(&extended_schema).await.unwrap_err();
assert!(
err.to_string()
.contains("ensure_indices.post_stage_pre_commit_btree"),
"schema apply should fail with the synthetic failpoint error, got: {err}"
);
}
// Existing tables stayed at their pre-apply versions; subsequent
// mutations against them succeed (no Lance-HEAD drift).
mutate_main( mutate_main(
&mut db, &mut db,
helpers::MUTATION_QUERIES, helpers::MUTATION_QUERIES,
"insert_person", "insert_person",
&mixed_params(&[("$name", "Eve")], &[("$age", 22)]), &mixed_params(&[("$name", "Alice")], &[("$age", 30)]),
) )
.await .await
.expect("Person mutation must succeed after the failed schema apply — existing tables are not drifted"); .expect("seed Person");
// Add `@index` on `age`: schema apply records the intent but defers the
// physical build (iss-848), so the BTREE on `age` is unbuilt.
let indexed_schema = helpers::TEST_SCHEMA.replace("age: I32?", "age: I32? @index");
db.apply_schema(&indexed_schema)
.await
.expect("adding an @index is metadata-only and succeeds");
{
// ensure_indices builds the deferred `age` BTREE on Person; the failpoint
// fires between stage and commit, so Person's Lance HEAD does not move.
let _failpoint =
ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return");
let err = db.ensure_indices().await.unwrap_err();
assert!(
err.to_string()
.contains("ensure_indices.post_stage_pre_commit_btree"),
"ensure_indices should fail with the synthetic failpoint error, got: {err}"
);
}
// A different, unpinned table is untouched by the failed index build.
use omnigraph::loader::{LoadMode, load_jsonl};
load_jsonl(
&mut db,
r#"{"type": "Company", "data": {"name": "Acme"}}"#,
LoadMode::Append,
)
.await
.expect("Company write on a table untouched by the failed ensure_indices should succeed");
} }
fn assert_no_staging_files(graph: &std::path::Path) { fn assert_no_staging_files(graph: &std::path::Path) {


@ -54,6 +54,19 @@ pub async fn init_and_load(dir: &tempfile::TempDir) -> Omnigraph {
db db
} }
/// On-disk Lance dataset URI for a node type, mirroring the engine's
/// `nodes/{fnv1a(type)}` layout. Used by tests that reach the raw Lance
/// dataset to forge or inspect branch state. (Local copies exist in
/// `failpoints.rs` / `maintenance.rs`; this is the shared one for new tests.)
pub fn node_table_uri(root: &str, type_name: &str) -> String {
let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
for &b in type_name.as_bytes() {
hash ^= b as u64;
hash = hash.wrapping_mul(0x100_0000_01b3);
}
format!("{}/nodes/{hash:016x}", root.trim_end_matches('/'))
}
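Reviewer sketch: the helper above inlines a 64-bit FNV-1a hash. Splitting the hash out makes it checkable against the published FNV-1a test vectors. `fnv1a64` is a hypothetical name introduced for this illustration; the constants and the URI layout are copied from the helper in the diff.

```rust
/// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
/// (HYPOTHETICAL free function; the diff inlines this loop.)
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for &b in bytes {
        hash ^= b as u64;
        hash = hash.wrapping_mul(0x100_0000_01b3); // FNV 64-bit prime
    }
    hash
}

/// Same layout as the test helper: `{root}/nodes/{hash as 16 hex digits}`,
/// with any trailing slashes on the root trimmed first.
pub fn node_table_uri(root: &str, type_name: &str) -> String {
    format!(
        "{}/nodes/{:016x}",
        root.trim_end_matches('/'),
        fnv1a64(type_name.as_bytes())
    )
}
```

The empty input hashes to the offset basis itself, which gives a quick sanity check that the loop order (XOR first, then multiply) is FNV-1a rather than FNV-1.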
/// Read all rows from a sub-table by table_key. /// Read all rows from a sub-table by table_key.
pub async fn read_table(db: &Omnigraph, table_key: &str) -> Vec<RecordBatch> { pub async fn read_table(db: &Omnigraph, table_key: &str) -> Vec<RecordBatch> {
let snap = snapshot_main(db).await.unwrap(); let snap = snapshot_main(db).await.unwrap();


@ -843,3 +843,222 @@ async fn cleanup_reconciles_orphaned_branch_forks() {
.await .await
.unwrap(); .unwrap();
} }
// cleanup must reclaim a manifest-unreferenced fork even when the BRANCH is
// still live (origin 2: an interrupted first-write fork), while KEEPING a table
// that is legitimately forked on that same live branch. Before the per-table
// authority broadening, the reconciler keyed only on the branch name and so
// never reclaimed a fork on a live branch — the wedge the handoff hit.
#[tokio::test]
async fn cleanup_reconciles_live_branch_orphan_fork_but_keeps_legitimate_fork() {
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap().to_string();
let mut db = init_and_load(&dir).await;
db.branch_create("feature").await.unwrap();
// Legitimately fork Company onto the live `feature` branch (a real write).
db.load_as(
"feature",
None,
r#"{"type":"Company","data":{"name":"Acme"}}"#,
LoadMode::Merge,
None,
)
.await
.unwrap();
// Forge a manifest-unreferenced Person fork on the SAME live branch: the
// manifest's `feature` snapshot still places Person on main (Person was
// never written on feature), so this ref is an origin-2 orphan.
let person_uri = node_table_uri(&uri, "Person");
{
let mut ds = Dataset::open(&person_uri).await.unwrap();
let base = ds.version().version;
ds.create_branch("feature", base, None).await.unwrap();
assert!(
ds.list_branches().await.unwrap().contains_key("feature"),
"precondition: forged orphan Person fork present on the live branch"
);
}
let company_uri = node_table_uri(&uri, "Company");
let main_people = count_rows(&db, "node:Person").await;
let main_companies = count_rows(&db, "node:Company").await;
db.cleanup(CleanupPolicyOptions {
keep_versions: Some(1),
older_than: None,
})
.await
.unwrap();
// Origin-2 orphan reclaimed...
{
let ds = Dataset::open(&person_uri).await.unwrap();
assert!(
!ds.list_branches().await.unwrap().contains_key("feature"),
"cleanup must reclaim the manifest-unreferenced Person fork on the live branch"
);
}
// ...but the legitimate Company fork on the same live branch is kept.
{
let ds = Dataset::open(&company_uri).await.unwrap();
assert!(
ds.list_branches().await.unwrap().contains_key("feature"),
"cleanup must NOT reclaim a legitimately-forked table on a live branch"
);
}
// main is untouched.
assert_eq!(count_rows(&db, "node:Person").await, main_people);
assert_eq!(count_rows(&db, "node:Company").await, main_companies);
}
// Regression (iss-848): a table with rows but NULL vectors (the load-before-
// embed window) must not abort index building. The vector (IVF) index cannot
// train on 0 vectors, so `create_vector_index` errors with "KMeans cannot
// train 1 centroids with 0 vectors". `build_indices_on_dataset_for_catalog`
// is the chokepoint every caller funnels through (load/mutate via
// prepare_updates_for_commit, ensure_indices, optimize, schema apply, merge),
// so per-index fault isolation there must defer that one column (pending) and
// still build the sibling scalar indexes, instead of propagating the error.
// This exercises both the load path (which builds indices inline) and the
// ensure_indices reconciler. Pre-fix this fails at the load step.
#[tokio::test]
async fn index_build_tolerates_null_vector_rows() {
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap();
let schema = "node Doc {\n \
slug: String @key\n \
n: I64 @index\n \
embedding: Vector(8)? @index\n\
}\n";
let mut db = Omnigraph::init(uri, schema).await.unwrap();
// Rows present, embeddings null (loaded but not yet embedded).
load_jsonl(
&mut db,
"{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"n\":1}}\n\
{\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"n\":2}}",
LoadMode::Merge,
)
.await
.expect("load rows with null embeddings");
// Must not abort: the untrainable vector column is deferred, the sibling
// BTREE on `n` still builds.
db.ensure_indices()
.await
.expect("ensure_indices must not abort when a vector column has no trainable vectors yet");
}
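Reviewer sketch: the regression test above relies on per-index fault isolation, where one untrainable vector column is deferred and its sibling indexes still build. The model below shows that shape under stated assumptions. All type and function names are hypothetical, and the "no non-null vectors" check is a simplification of whatever trainability condition `build_indices_on_dataset_for_catalog` actually applies.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum IndexKind {
    BTree,
    Vector,
}

/// HYPOTHETICAL description of one declared `@index` on a table.
pub struct DeclaredIndex {
    pub column: String,
    pub kind: IndexKind,
    /// Number of rows with a non-null value in the column.
    pub non_null_values: usize,
}

#[derive(Debug, PartialEq)]
pub enum BuildStatus {
    Built,
    /// Deferred: left for a later reconciler pass once data is trainable.
    Pending,
}

/// Decide each index independently so that one untrainable vector column
/// does not abort the build of its siblings.
pub fn build_declared_indexes(declared: &[DeclaredIndex]) -> Vec<(String, BuildStatus)> {
    declared
        .iter()
        .map(|idx| {
            let status = match idx.kind {
                // ASSUMPTION: a vector index with zero vectors cannot train.
                IndexKind::Vector if idx.non_null_values == 0 => BuildStatus::Pending,
                _ => BuildStatus::Built,
            };
            (idx.column.clone(), status)
        })
        .collect()
}
```

For the schema in the test (`n: I64 @index`, `embedding: Vector(8)? @index`, two rows with null embeddings), this model yields `Built` for `n` and `Pending` for `embedding`, matching the behavior the test expects from `ensure_indices`.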
// iss-848: `optimize` converges declared-but-unbuilt indexes. After an @index is
// added post-data (a metadata-only apply that defers the physical build), the
// column is unindexed and reads scan. `optimize` — the operator's reconciler,
// run on a cron — must materialize it, by composing the ensure_indices
// reconciler after the compaction sweep. Pre-iss-848 optimize only maintained
// coverage of EXISTING indexes (optimize_indices) and never created missing ones.
#[tokio::test]
async fn optimize_materializes_index_declared_but_unbuilt() {
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap();
let v1 = "node Doc {\n slug: String @key\n rank: I32\n}\n";
let mut db = Omnigraph::init(uri, v1).await.unwrap();
load_jsonl(
&mut db,
"{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"rank\":1}}\n\
{\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"rank\":2}}",
LoadMode::Merge,
)
.await
.unwrap();
// Add @index on `rank` after data exists: a metadata-only apply that defers
// the physical build (iss-848), so the column is declared-indexed but unbuilt.
let v2 = "node Doc {\n slug: String @key\n rank: I32 @index\n}\n";
db.apply_schema(v2).await.expect("index-only apply");
// Precondition: `rank` is declared @index but unbuilt -> reads degrade.
{
let snap = snapshot_main(&db).await.unwrap();
let ds = snap.open("node:Doc").await.unwrap();
assert!(
matches!(
TableStore::key_column_index_coverage(&ds, "rank")
.await
.unwrap(),
IndexCoverage::Degraded { .. }
),
"rank must be unindexed after the deferred apply"
);
}
db.optimize().await.unwrap();
// Postcondition: optimize's reconciler materialized the declared index.
let snap = snapshot_main(&db).await.unwrap();
let ds = snap.open("node:Doc").await.unwrap();
assert_eq!(
TableStore::key_column_index_coverage(&ds, "rank")
.await
.unwrap(),
IndexCoverage::Indexed,
"optimize must build the declared-but-unbuilt rank index"
);
}
// iss-848 (PR review): the rename path also defers index building. A RenameType
// migration writes the renamed table as a new dataset with the existing rows
// but no indexes (its inline build was removed). optimize must then materialize
// the declared index on the renamed table.
#[tokio::test]
async fn optimize_materializes_index_after_type_rename() {
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap();
let v1 = "node Doc {\n slug: String @key\n rank: I32 @index\n}\n";
let mut db = Omnigraph::init(uri, v1).await.unwrap();
load_jsonl(
&mut db,
"{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"rank\":1}}\n\
{\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"rank\":2}}",
LoadMode::Merge,
)
.await
.unwrap();
// Rename Doc -> Item; rows are preserved on the new table key.
let v2 = "node Item @rename_from(\"Doc\") {\n slug: String @key\n rank: I32 @index\n}\n";
let result = db.apply_schema(v2).await.expect("rename apply");
assert!(result.applied);
assert_eq!(
count_rows(&db, "node:Item").await,
2,
"rename must preserve rows"
);
// Post-rename the renamed table's declared rank index is unbuilt (deferred).
{
let snap = snapshot_main(&db).await.unwrap();
let ds = snap.open("node:Item").await.unwrap();
assert!(
matches!(
TableStore::key_column_index_coverage(&ds, "rank")
.await
.unwrap(),
IndexCoverage::Degraded { .. }
),
"rank must be unindexed immediately after the rename"
);
}
db.optimize().await.unwrap();
let snap = snapshot_main(&db).await.unwrap();
let ds = snap.open("node:Item").await.unwrap();
assert_eq!(
TableStore::key_column_index_coverage(&ds, "rank")
.await
.unwrap(),
IndexCoverage::Indexed,
"optimize must build the renamed table's deferred rank index"
);
}


@ -736,3 +736,108 @@ edge Knows: Person -> Person {
// current contract, the data is *unreachable* via omnigraph // current contract, the data is *unreachable* via omnigraph
// (no manifest entry), which is the user-facing guarantee. // (no manifest entry), which is the user-facing guarantee.
} }
// Regression (bug 3 / dev-graph iss-848): a `Vector @index` on a 0-row table
// must not abort an otherwise-valid schema apply. A vector (IVF) index trains
// k-means centroids over the column's vectors, so Lance cannot build it on 0
// vectors — it errors with "Creating empty vector indices with train=False is
// not yet implemented". When a *later* migration touches that table (here, an
// unrelated scalar `@index` on `body`), schema apply reconciles the table's
// whole index set, which previously tried to materialize the dormant vector
// index and aborted the entire migration (all-or-nothing). The build is now
// deferred (pending) when the column is untrainable, instead of failing the
// migration. The dormant index is materialized by a later `ensure_indices` /
// `optimize` once the table has rows. Full decoupling — intent recorded at
// apply, an async reconciler converges physical coverage — is iss-848.
#[tokio::test]
async fn apply_schema_defers_vector_index_on_empty_table() {
let dir = tempfile::tempdir().unwrap();
let uri = dir.path().to_str().unwrap();
// init does not build indices, so the declared-but-unbuilt vector index
// sits harmless on the empty table (this is how it survived earlier
// applies that never touched the table).
// `slug` is the user @key; omnigraph injects its own internal `id` column,
// so the key field must not be named `id`.
let v1 = "node Doc {\n \
slug: String @key\n \
body: String?\n \
embedding: Vector(8) @index\n\
}\n";
let mut db = Omnigraph::init(uri, v1).await.unwrap();
// Add an *unrelated* scalar @index on `body`. This routes Doc through
// schema apply's index reconcile, which must NOT abort on the untrainable
// empty vector index.
let v2 = "node Doc {\n \
slug: String @key\n \
body: String? @index\n \
embedding: Vector(8) @index\n\
}\n";
let result = db.apply_schema(v2).await.expect(
"schema apply must succeed: an empty-table vector @index is deferred, not fatal",
);
assert!(result.applied, "the scalar @index change must apply");
// The deferred vector index is not dropped — once the table has a
// trainable vector, `ensure_indices` materializes it without error. (If
// the guard wrongly skipped a non-empty column, this would still be
// unindexed; if it wrongly tried to build on empty, the apply above would
// have failed.)
load_jsonl(
&mut db,
r#"{"type":"Doc","data":{"slug":"d1","body":"hello","embedding":[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8]}}"#,
LoadMode::Merge,
)
.await
.expect("loading a Doc with an embedding must succeed");
db.ensure_indices()
.await
.expect("the deferred vector index must build once the table has a trainable vector");
}
// iss-848: adding an `@index` to an existing column is a pure metadata change.
// Schema apply records the intent (the catalog/IR now declares the index) but
// must NOT build the index inline, so the table's data and manifest version are
// untouched. The physical index is materialized later by ensure_indices /
// optimize. Pre-iss-848 the indexed_tables block built the index inline and
// bumped the table version.
#[tokio::test]
async fn index_only_constraint_apply_touches_no_table_data() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap();
    let v1 = "node Doc {\n slug: String @key\n n: I64\n}\n";
    let mut db = Omnigraph::init(uri, v1).await.unwrap();
    load_jsonl(
        &mut db,
        r#"{"type":"Doc","data":{"slug":"d1","n":1}}"#,
        LoadMode::Merge,
    )
    .await
    .expect("load a Doc");
    let before = db
        .snapshot_of(ReadTarget::branch("main"))
        .await
        .unwrap()
        .entry("node:Doc")
        .unwrap()
        .table_version;
    // Add an @index on the existing `n` column.
    let v2 = "node Doc {\n slug: String @key\n n: I64 @index\n}\n";
    let result = db.apply_schema(v2).await.expect("index-only apply must succeed");
    assert!(result.applied, "the @index addition must apply");
    let after = db
        .snapshot_of(ReadTarget::branch("main"))
        .await
        .unwrap()
        .entry("node:Doc")
        .unwrap()
        .table_version;
    assert_eq!(
        before, after,
        "adding an @index must not bump the table version (no inline index build)"
    );
}


@ -1540,3 +1540,109 @@ async fn second_sequential_update_on_same_row_succeeds() {
        "Alice's age must reflect the second update"
    );
}
// An interrupted first-write fork (create_branch succeeded, the manifest
// publish did not) leaves a fully-formed Lance branch ref on the table that
// the manifest never references — a "manifest-unreferenced fork". The branch
// itself stays a valid manifest branch, so `cleanup`'s reconciler (keyed on
// the manifest branch list) never reclaims it. Today the next write to that
// table on that branch re-enters the fork path, `create_branch` collides, and
// the engine wedges with "incomplete prior delete; run `omnigraph cleanup`".
//
// We forge that exact residue (a live `feature` branch + a directly-created
// `feature` ref on the Person table the manifest doesn't reference) and assert
// the next write — via both `load` and `mutate` — self-heals by reclaiming the
// orphan fork and re-forking, rather than wedging. No process death / timing
// needed: the forge is the post-crash state.
#[tokio::test]
async fn first_write_self_heals_manifest_unreferenced_fork_on_live_branch() {
    let dir = tempfile::tempdir().unwrap();
    let uri = dir.path().to_str().unwrap().to_string();
    let mut db = init_and_load(&dir).await;
    db.branch_create("feature").await.unwrap();
    // Forge the manifest-unreferenced fork directly at the Lance layer.
    let person_uri = node_table_uri(&uri, "Person");
    {
        let mut ds = lance::Dataset::open(&person_uri).await.unwrap();
        let base = ds.version().version;
        ds.create_branch("feature", base, None).await.unwrap();
        assert!(
            ds.list_branches().await.unwrap().contains_key("feature"),
            "precondition: forged orphan fork present on Person"
        );
    }
    // load → must self-heal, not wedge with "incomplete prior delete".
    let row = r#"{"type":"Person","data":{"name":"Zoe","age":30}}"#;
    db.load_as("feature", None, row, LoadMode::Merge, None)
        .await
        .expect("load onto a manifest-unreferenced fork must self-heal, not wedge");
    // mutate → same path, must also self-heal.
    mutate_branch(
        &mut db,
        "feature",
        MUTATION_QUERIES,
        "insert_person",
        &mixed_params(&[("$name", "Yan")], &[("$age", 41)]),
    )
    .await
    .expect("mutate onto a manifest-unreferenced fork must self-heal");
    // The healed branch holds the new rows; main is untouched (still no Zoe/Yan).
    let feature_people = count_rows_branch(&db, "feature", "node:Person").await;
    let main_people = count_rows(&db, "node:Person").await;
    assert!(
        feature_people >= main_people + 2,
        "feature must contain the two new rows on top of the inherited set \
         (feature={feature_people}, main={main_people})"
    );
}
// A node delete cascades to every edge table touching that node, forking those
// edge tables during execution. The up-front fork-queue acquisition must cover
// those cascade-forked edges, not just the node table named in the IR — else
// commit_all's held-guard coverage check fails the write (and, before the
// coverage check was promoted out of debug-only, edge commits would slip
// through unserialized). This drives the new code via a DELETE (the only
// cascading op), on a branch, as the FIRST write (so it actually forks).
#[tokio::test]
async fn branch_cascade_delete_forks_node_and_edges_under_held_queues() {
    let dir = tempfile::tempdir().unwrap();
    let mut db = init_and_load(&dir).await;
    db.branch_create("feature").await.unwrap();
    // Baseline inherited from main (Alice has 2 Knows + 1 WorksAt edge).
    let main_people = count_rows(&db, "node:Person").await;
    let main_knows = count_rows(&db, "edge:Knows").await;
    // First write to `feature` is `delete Person Alice`, whose cascade forks
    // node:Person AND edge:Knows + edge:WorksAt. Pre-fix the up-front set held
    // only node:Person, so commit_all's coverage check rejected the write.
    mutate_branch(
        &mut db,
        "feature",
        MUTATION_QUERIES,
        "remove_person",
        &mixed_params(&[("$name", "Alice")], &[]),
    )
    .await
    .expect("branch cascade-delete must hold queues for cascade-forked edge tables");
    // Alice and her edges are gone on feature; main is untouched.
    assert_eq!(
        count_rows_branch(&db, "feature", "node:Person").await,
        main_people - 1,
        "feature should have Alice removed from the inherited set"
    );
    assert!(
        count_rows_branch(&db, "feature", "edge:Knows").await < main_knows,
        "feature should have Alice's cascade-deleted Knows edges removed"
    );
    assert_eq!(
        count_rows(&db, "node:Person").await,
        main_people,
        "main must be untouched by the branch delete"
    );
}


@ -1,6 +1,6 @@
# Architecture
-OmniGraph is a typed property-graph engine built as a coordination layer over many Lance datasets, with Git-style branches and commits across the whole graph, multi-modal querying (vector + FTS + BM25 + RRF + graph traversal) in one runtime, an HTTP server with Cedar policy, and a CLI driven by a single `omnigraph.yaml`.
+OmniGraph is a typed property-graph engine built as a coordination layer over many Lance datasets, with Git-style branches and commits across the whole graph, multi-modal querying (vector + FTS + BM25 + RRF + graph traversal) in one runtime, an HTTP server with Cedar policy, and a CLI driven by a per-operator `~/.omnigraph/config.yaml` plus team-owned cluster directories.
## Reading guide


@ -15,6 +15,38 @@ Use it this way:
- Keep implementation ledgers, roadmap detail, and historical MR notes in the
per-area docs. This file is the filter, not the encyclopedia.
## Governing principle: logical contract over physical state
The hard invariants below are instances of one rule. Keep it in view whenever
a change touches the boundary between what the graph *means* and how it is
physically stored.
> **Logical state is the contract. Physical state — index coverage, fragment
> layout, compaction versions, staged writes — is derived, rebuildable, and may
> be produced asynchronously. A physical operation must never fail a logical
> one. Preconditions are checked against logical state; physical reconciliation
> is idempotent and may lag or retry. Genuine logical conflicts still fail
> loudly: the licence to lag covers physical convergence, not correctness.**
Invariants that instantiate it: **2** (manifest-atomic visibility) and **5**
(recovery is part of the commit protocol) — a partially-written physical layer
never changes what a graph commit means; **7** (indexes are derived state) — a
query is correct under partial index coverage, and expensive index work
converges from manifest state instead of gating the write path; **13** (failures
bounded and observable) — the licence to lag is not a licence to drop, so a
physical step that cannot make progress is surfaced, not swallowed. Deny-list
items that enforce it: synchronous inline vector/FTS index rebuilds on the
commit path; state that drifts from Lance or the manifest when it can be
derived; job queues for manifest-derivable state where a reconciler fits.
The failure shape it rules out: a legitimate background operation on the
physical layer (compaction, an index build, an interrupted staged write) is
allowed to break a logical operation (a query's correctness, a migration's
success, a branch's writability). The smell to watch for is a logical operation
whose precondition is a *physical* fact — a cached file version, an index's
existence, a fragment count. Make the precondition logical and let a reconciler
converge the physical state.
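
The split described above can be illustrated with a small, self-contained sketch. It is not OmniGraph code: `Catalog`, `PhysicalTable`, `declare_index`, and `reconcile` are hypothetical names invented for this example, and "trainable rows" stands in for whatever makes an index buildable. The point it tries to show is that declaring an index is a purely logical step that cannot fail on physical facts, while a separate idempotent reconciler converges the physical state and reports what is still pending.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Logical state (hypothetical model): the columns the schema declares as indexed.
#[derive(Default)]
struct Catalog {
    declared_indexes: BTreeSet<String>,
}

/// Physical state (hypothetical model): indexes that actually exist, plus how
/// many rows per column are usable to build one.
#[derive(Default)]
struct PhysicalTable {
    built_indexes: BTreeSet<String>,
    trainable_rows: BTreeMap<String, usize>,
}

/// Logical operation: records intent only. It never inspects physical state,
/// so no physical condition can make it fail.
fn declare_index(catalog: &mut Catalog, column: &str) {
    catalog.declared_indexes.insert(column.to_string());
}

/// Reconciler: idempotent. Builds every declared-but-missing index it can and
/// returns the columns that are still pending instead of returning an error.
fn reconcile(catalog: &Catalog, table: &mut PhysicalTable) -> Vec<String> {
    let mut pending = Vec::new();
    for column in &catalog.declared_indexes {
        if table.built_indexes.contains(column) {
            continue; // already converged, nothing to do
        }
        let rows = table.trainable_rows.get(column).copied().unwrap_or(0);
        if rows == 0 {
            // Cannot build yet: surface it as pending, do not fail.
            pending.push(column.clone());
        } else {
            table.built_indexes.insert(column.clone());
        }
    }
    pending
}
```

Calling `reconcile` repeatedly is safe in this model: once an index is built, later calls skip it, and a column that cannot be built yet keeps showing up in the returned pending list until data arrives.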
## Hard Invariants
1. **Respect the substrate.** Lance owns columnar storage, per-dataset
@ -105,7 +137,7 @@ Use it this way:
| Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema/index.md), [execution.md](execution.md) |
| Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema/index.md) |
| Storage trait | `TableStorage` (via `db.storage()`) is staged-only; the inline-commit residuals (`delete_where`, `create_vector_index`) are split onto a separate sealed `InlineCommitResidual` trait reached via `db.storage_inline_residual()` (MR-854), so §1 holds by construction; capability/stat surfaces are roadmap | [writes.md](writes.md), [architecture.md](architecture.md) |
-| Index lifecycle | Index *creation* per `@index`/`@key` property is dispatched by type (enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) via `node_prop_index_kind`; index *coverage maintenance* exists — `optimize` runs Lance `optimize_indices` after compaction to fold appended/rewritten fragments into existing indexes (still an explicit maintenance call, not yet a background reconciler) | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) |
+| Index lifecycle | `@index`/`@key` declares *intent*; the physical index is derived state and never fails a logical op. `schema apply` builds no indexes (records intent only; index-only changes touch no table data). `load`/`mutate` build inline through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) that fault-isolates an untrainable Vector column into a *pending* index instead of aborting. `optimize`/`ensure_indices` is the reconciler: it creates declared-but-missing indexes and folds appended/rewritten fragments into existing ones (`optimize_indices`), reporting still-pending columns. Explicit maintenance call, not yet a background loop | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) |
| Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/queries/index.md) |
| Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/operations/server.md), [policy.md](../user/operations/policy.md) |
| Tests | Tempdir-backed Lance tests are the current substrate; the storage adapter has an in-memory backend for adapter-level contract tests, but Lance datasets bypass it | [testing.md](testing.md) |
@ -165,6 +197,22 @@ them explicit.
one-winner-CAS territory; closing this fully needs a cross-process
serialization primitive (e.g. lease-based use of the schema-apply lock
branch) — design it before promoting multi-process write topologies.
- **Fork reclaim is in-process-safe only:** the first write to a table on a
branch forks it (a Lance `create_branch` that advances state before the
manifest publish). An interrupted fork (crash, or a cancelled request
future) leaves a manifest-unreferenced branch ref. The next write self-heals
it — `reclaim_orphaned_fork_and_refork` (`force_delete_branch` + re-fork)
— but reclaim is only safe because the writer holds the per-`(table,
branch)` write queue from before the fork through the publish AND re-checks
the live manifest under it, so no *in-process* writer can be mid-fork. A
reclaim cannot serialize against a foreign-*process* in-flight fork: it may
force-delete a peer's just-created ref, which makes that peer's commit fail
and retry — the same one-winner-CAS exposure as above, not corruption. The
reclaim never fires unless in-process-queue + manifest authority both prove
the ref is manifest-unreferenced. `cleanup`'s per-table reconciler
(`reconcile_orphaned_branches`) is the guaranteed backstop for any fork the
write path never revisits. Both degrade to a no-op if Lance ships an atomic
multi-dataset branch op.
- **Local `write_text_if_match` is not a cross-process CAS:** object-store
backends use a true conditional put (ETag If-Match; the in-memory test
backend too), but upstream `object_store` leaves `PutMode::Update`


@ -1,6 +1,9 @@
# RFC-011: CLI refactoring — one addressing & config model
-**Status:** Proposed
+**Status:** Accepted — implemented (the `omnigraph.yaml` excision landed as
+#250/#251/#252; D1–D4, D6, D7, D9, D10 shipped). Two items remain: **D11**
+(server-side maintenance jobs) is gated on the bulk-data-plane RFC #219; **D5**
+(combined admin scope) stays deferred by design.
**Date:** 2026-06-14
**Audience:** CLI/server maintainers
**Builds on:** [rfc-007-operator-config.md](rfc-007-operator-config.md)
@ -526,10 +529,9 @@ Non-blocking; settle when convenient.
server scope and maintain via `--cluster`. A `deployments: { … }` object
(server + cluster validated coherent, referenced by a profile) is revisited only
if admin ergonomics demand it — and Decision 11 largely removes the need.
-- **D8 — the `profile` command surface.** `profile list` / `profile show`
-(read-only inspection) are additive diagnostics, shippable anytime; they don't
-touch the grammar or resolution. The *no sticky `profile use`* constraint holds
-regardless — it is a design principle, not a command.
+- **D8 — the `profile` command surface.** *Shipped:* `profile list` / `profile
+show [<name>]` (read-only inspection). The *no sticky `profile use`* constraint
+holds — it is a design principle, not a command.
## Safety


@ -7,7 +7,7 @@ This file is the always-on map of the test surface. **Consult it before every ta
| Crate | Path | Style |
|---|---|---|
| `omnigraph` (engine) | `crates/omnigraph/tests/` | Integration tests (28 files), fixture-driven, share `tests/helpers/mod.rs` |
-| `omnigraph-cli` | `crates/omnigraph-cli/tests/` | Per-area suites (post-modularization): `cli_cluster.rs` (cluster command surface + operator-actor cascade), `cli_cluster_e2e.rs` (spawned-binary lifecycle compositions — lost-state re-import recovery, out-of-band drift, graph-root destruction, multi-graph mixed-disposition convergence), `cli_data.rs` (load/read/change/branch/commit/export/snapshot/policy/embed/maintenance + operator format cascade), `cli_schema_config.rs` (init/config, schema plan/apply, RFC-008 deprecation warnings + `config migrate` + strict mode), `cli_queries.rs`, `parity_matrix.rs` (RFC-009 Phase 1: the embedded-vs-remote referee — every forked verb run against both arms with matched Cedar policy and the same actor, scrubbed-JSON + exit-code equality; divergences are pinned in its `KNOWN_DIVERGENCES` ledger, never silently repaired), `system_local.rs` (full-cycle cluster lifecycle with a spawned `--cluster` server, applied-policy enforcement over HTTP, keyed-credential auth, operator aliases), `system_remote.rs`; share `tests/support/mod.rs` (hermetic `OMNIGRAPH_HOME` by default) |
+| `omnigraph-cli` | `crates/omnigraph-cli/tests/` | Per-area suites (post-modularization): `cli_cluster.rs` (cluster command surface + operator-actor cascade), `cli_cluster_e2e.rs` (spawned-binary lifecycle compositions — lost-state re-import recovery, out-of-band drift, graph-root destruction, multi-graph mixed-disposition convergence), `cli_data.rs` (load/read/change/branch/commit/export/snapshot/policy/embed/maintenance + operator format cascade), `cli_schema_config.rs` (init/config, schema plan/apply), `cli_queries.rs`, `parity_matrix.rs` (RFC-009 Phase 1: the embedded-vs-remote referee — every forked verb run against both arms with matched Cedar policy and the same actor, scrubbed-JSON + exit-code equality; divergences are pinned in its `KNOWN_DIVERGENCES` ledger, never silently repaired), `system_local.rs` (full-cycle cluster lifecycle with a spawned `--cluster` server, applied-policy enforcement over HTTP, keyed-credential auth, operator aliases), `system_remote.rs`; share `tests/support/mod.rs` (hermetic `OMNIGRAPH_HOME` by default) |
| `omnigraph-cluster` | mostly in-source `#[cfg(test)] mod tests`; `tests/failpoints.rs` (feature-gated); `tests/s3_cluster.rs` (bucket-gated full lifecycle on object storage) | Cluster config parser, local JSON state diff, state CAS/lock handling/recovery, read-only validate/plan/status plus explicit refresh/import graph observations, config-only apply (content-addressed payload publish, disposition gating, composite-digest convergence, idempotent re-apply), catalog payload verification (status read-only, refresh drift + self-heal), failpoint crash-mid-apply / CAS-race coverage, Stage 4A graph creation (create executor, recovery sidecars + sweep rows, create crash windows), Stage 4B schema apply (migration previews in plan, schema executor, schema-apply sweep classification, schema crash windows), Stage 4C gated deletes (digest-bound approvals, delete executor + tombstones, delete sweep rows, delete crash windows), and 5A policy binding metadata (applies_to in the applied revision, binding-change diffing + convergence, pre-5A backfill), and the 5B serving-snapshot read API (converged read, refusal rows) |
| `omnigraph-server` | `crates/omnigraph-server/tests/` | Per-area suites (post-modularization): `auth_policy.rs`, `data_routes.rs`, `schema_routes.rs`, `stored_queries.rs`, `multi_graph.rs` (cluster-mode boot — converged serving, policy binding wiring, boot refusals — + the concurrent branch-ops matrix), `boot_settings.rs` (mode inference, PolicySource), `s3.rs` (bucket-gated: single-graph serving + config-free `--cluster s3://` boot), `openapi.rs` (OpenAPI drift / regeneration); share `tests/support/mod.rs` |
| `omnigraph-compiler` | mostly in-source `#[cfg(test)] mod tests` | Parser, type-checker, IR lowering, lint |
@ -29,7 +29,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
| `point_in_time.rs` | Snapshots, time travel (`snapshot_at_version`, `entity_at`) |
| `changes.rs` | `diff_between` / `diff_commits` |
| `consistency.rs` | Cross-table snapshot isolation, atomic publish |
-| `schema_apply.rs` | Migration plan + apply, schema-apply lock |
+| `schema_apply.rs` | Migration plan + apply, schema-apply lock; index materialization deferred to the reconciler (iss-848): `apply_schema_defers_vector_index_on_empty_table` (an empty-table Vector `@index` never aborts the apply) and `index_only_constraint_apply_touches_no_table_data` (adding an `@index` is metadata-only — no table-version bump) |
| `search.rs` | FTS / vector / hybrid (`bm25`, `nearest`, `rrf`) |
| `traversal.rs` | `Expand`, variable-length hops, anti-join (CSR path — `OMNIGRAPH_TRAVERSAL_MODE` unset) |
| `traversal_indexed.rs` | BTREE-indexed Expand (`execute_expand_indexed`) forced via `OMNIGRAPH_TRAVERSAL_MODE`, asserted semantically equal to the CSR path; own binary, all `#[serial]` so env writes never race |
@ -42,7 +42,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav
| `lance_version_columns.rs` | Per-row `_row_last_updated_at_version` behavior |
| `validators.rs` | Schema constraint enforcement (enum, range, unique, cardinality) across JSONL, insert, update paths |
| `policy_engine_chassis.rs` | Engine-layer Cedar enforcement (MR-722): allow + deny through every `_as` writer via the SDK directly — no HTTP — proving embedded and CLI callers hit the same gate as the server, with action × scope shapes matching `authorize_request` |
-| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice |
+| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice; the index reconciler (iss-848): `index_build_tolerates_null_vector_rows` (an untrainable Vector column defers instead of aborting the build, sibling indexes still build) and `optimize_materializes_index_declared_but_unbuilt` (optimize creates a declared-but-deferred index) |
| `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature). Includes the five per-writer Phase B → recovery integration tests (`recovery_rolls_forward_after_finalize_publisher_failure`, `schema_apply_phase_b_failure_recovered_on_next_open`, `branch_merge_phase_b_failure_recovered_on_next_open`, `ensure_indices_phase_b_failure_recovered_on_next_open`, `optimize_phase_b_failure_recovered_on_next_open`) and the write-entry in-process heal contract (the four `*_after_finalize_publisher_failure_heals_without_reopen` tests — load, mutation, schema apply, branch merge: a follow-up write on the same handle rolls a sidecar-covered residual forward without reopen/refresh) and the storage-fault matrix for the sidecar lifecycle (`recovery.sidecar_{write,delete,list}` / `recovery.record_audit` failpoints: Phase A put failure aborts with zero drift, Phase D delete failure is swallowed and healed by the next write, list failures are loud at heal and open, audit-append failures are retried to exactly one audit row; plus the bucket-gated `s3_load_recovers_after_publisher_failure_without_reopen`). |
| `recovery.rs` | Open-time recovery sweep — sidecar I/O, classifier dispatch (NoMovement / RolledPastExpected / UnexpectedAtP1 / UnexpectedMultistep / InvariantViolation), all-or-nothing decision, roll-forward via `ManifestBatchPublisher::publish`, roll-back via `Dataset::restore`, audit row in `_graph_commit_recoveries.lance`, `OpenMode::ReadOnly` skip path |
| `composite_flow.rs` | Compositional/narrative end-to-end stories — multi-step flows that compose mechanics covered by other test files. Catches integration regressions where individual operations all pass their unit tests but their composition breaks (sequential merges, post-merge main writes, time-travel through merge DAG, reopen consistency over multi-merge histories, post-optimize and post-cleanup strict writes). |


@ -19,8 +19,14 @@ publisher's row-level CAS on `__manifest` is the single fence.
`__run__*` branch on an upgraded graph is swept off `__manifest` by the
v2→v3 internal-schema migration on first read-write open. (The inert
`_graph_runs.lance` bytes remain until a `delete_prefix` primitive lands.)
-- Cancelled mutation futures leave **no graph-level state** — only orphaned
-Lance fragments, which the existing `omnigraph cleanup` pipe reclaims.
+- Cancelled mutation futures leave **no graph-visible state** — the manifest
+is never advanced. They can leave two kinds of unreferenced residue, both
+self-healing: orphaned Lance fragments (reclaimed by `omnigraph cleanup`),
+and — on the *first* write to a table on a branch, which forks it before the
+publish — a manifest-unreferenced branch ref. The next write to that table
+reclaims the stale fork and re-forks (`reclaim_orphaned_fork_and_refork`),
+and `cleanup`'s per-table reconciler is the guaranteed backstop; see the
+fork-reclaim note in [invariants.md](invariants.md).
## Read-your-writes within a multi-statement mutation ## Read-your-writes within a multi-statement mutation


@ -6,34 +6,43 @@
omnigraph init --schema schema.pg graph.omni omnigraph init --schema schema.pg graph.omni
omnigraph load --data data.jsonl --mode overwrite graph.omni omnigraph load --data data.jsonl --mode overwrite graph.omni
omnigraph snapshot graph.omni --branch main --json omnigraph snapshot graph.omni --branch main --json
omnigraph query --uri graph.omni --query queries.gq --name get_person --params '{"name":"Alice"}' # Invoke a stored query BY NAME from the catalog (served — addressed by scope):
omnigraph mutate --uri graph.omni --query queries.gq --name insert_person --params '{"name":"Mina","age":28}' omnigraph query get_person --params '{"name":"Alice"}'
omnigraph mutate insert_person --params '{"name":"Mina","age":28}'
``` ```
`omnigraph query` is the canonical read command (pairs with `POST /query`); `omnigraph query` is the canonical read command (pairs with `POST /query`);
`omnigraph mutate` is the canonical write command (pairs with `POST /mutate`). `omnigraph mutate` is the canonical write command (pairs with `POST /mutate`).
The previous names `omnigraph read` and `omnigraph change` keep working as The positional argument is the **stored-query name**, invoked from the served
visible aliases — invocations emit a one-line deprecation warning to stderr catalog (RFC-011 D3) — the graph is addressed by scope (`--server` / `--profile`
and otherwise behave identically. See [Deprecated names](#deprecated-names) / defaults), and the verb asserts the query's kind (`query` rejects a stored
for the migration table. mutation, and vice-versa). The previous names `omnigraph read` and
`omnigraph change` keep working as visible aliases — invocations emit a one-line
deprecation warning to stderr. See [Deprecated names](#deprecated-names).
For ad-hoc reads and mutations (REPLs, AI agents, one-off scripts), pass the For **ad-hoc** reads and mutations (REPLs, AI agents, one-off scripts, local dev),
GQ source inline with `-e` / `--query-string` instead of a file path: pass the GQ source with `-e` / `--query-string` (inline) or `--query <path>` (a
file), and address a graph's storage directly with `--store`. By-name catalog
invocation is served-only — a bare `--store` has no catalog, so it's the ad-hoc
lane:
```bash ```bash
omnigraph query --uri graph.omni \ omnigraph query --store graph.omni \
-e 'query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }' \ -e 'query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }' \
--params '{"name":"Alice"}' --params '{"name":"Alice"}'
omnigraph mutate --uri graph.omni \ omnigraph mutate --store graph.omni \
-e 'query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }' \ -e 'query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }' \
--params '{"name":"Inline","age":42}' --params '{"name":"Inline","age":42}'
# A multi-query file: the positional selects which query to run.
omnigraph query --store graph.omni --query queries.gq get_person --params '{"name":"Alice"}'
``` ```
`-e` is mutually exclusive with `--query <path>` and `--alias <name>`; exactly `-e` is mutually exclusive with `--query <path>`. With either, the positional
one of the three must be provided. The inline source travels through the same name (optional) selects which query in the source to run. The inline source
parser, lint, params binding, and commit machinery as a file-based query — travels through the same parser, lint, params binding, and commit machinery as a
only the source loader changes. file-based query — only the source loader changes.
## Branching And Reviewable Data Flows ## Branching And Reviewable Data Flows
@ -50,19 +59,18 @@ omnigraph commit show --uri graph.omni <commit-id> --json
## Remote Server Mode ## Remote Server Mode
Serve a graph: Serve a cluster-applied graph:
```bash ```bash
omnigraph-server graph.omni --bind 127.0.0.1:8080 omnigraph cluster apply --config ./company-brain
omnigraph-server --cluster ./company-brain --bind 127.0.0.1:8080
``` ```
Read through the HTTP API: Read through the HTTP API — invoke a stored query by name from the catalog:
```bash ```bash
omnigraph query \ omnigraph query get_person \
--server http://127.0.0.1:8080 \ --server http://127.0.0.1:8080 \
--query queries.gq \
--name get_person \
--params '{"name":"Alice"}' --params '{"name":"Alice"}'
``` ```
@ -71,25 +79,31 @@ literal URL); a positional `http(s)://` URI is rejected. If the server requires
auth, set its bearer token and `omnigraph login <server>` (or auth, set its bearer token and `omnigraph login <server>` (or
`OMNIGRAPH_BEARER_TOKEN`). `OMNIGRAPH_BEARER_TOKEN`).
## Multi-graph servers (v0.6.0+) ## Multi-graph servers
Against a multi-graph server (started with `--config omnigraph.yaml` referencing a non-empty `graphs:` map), use `omnigraph graphs list` to enumerate the registered graphs. The server must configure bearer tokens and `server.policy.file` with a rule that allows `graph_list`; `/graphs` is closed by default even when the server runs with `--unauthenticated`. A server boots from a cluster directory (`omnigraph-server --cluster <dir>`) and
serves every graph the cluster declares. Use `omnigraph graphs list` to enumerate
them. The cluster's server-level policy must allow `graph_list`; `/graphs` is
closed by default even when the server runs with `--unauthenticated`.
```bash ```bash
OMNIGRAPH_BEARER_TOKEN=admin-token \ OMNIGRAPH_BEARER_TOKEN=admin-token \
omnigraph graphs list --uri http://server.example.com --json omnigraph graphs list --server http://server.example.com --json
``` ```
For config-driven clients, set the remote graph's `bearer_token_env` to an environment variable containing a token whose actor is authorized by `server.policy.file`. For an operator-defined server, store its token with `omnigraph login <name>` (or
`OMNIGRAPH_TOKEN_<NAME>`); the actor must be authorized by the cluster's
server-level policy.
`list` rejects local URI targets — it's for remote multi-graph servers only. `list` rejects local (`--store`) targets — it's for remote multi-graph servers only.
Runtime add/remove is **not** in v0.6.0. To add a graph, stop the server, add a `graphs.<id>` entry to `omnigraph.yaml`, then restart. To remove, stop the server, delete the entry, restart. Runtime add/remove via API is not exposed. To add or remove a graph, edit the
cluster's `cluster.yaml`, run `omnigraph cluster apply`, then restart the server.
Per-graph URLs: hit a graph's cluster route from any subcommand by pointing `--uri` at it: Per-graph addressing: select a graph on a multi-graph server with `--graph`:
```bash ```bash
omnigraph read --uri http://server.example.com/graphs/beta --query q.gq ... omnigraph query get_person --server http://server.example.com --graph beta --params '{"name":"Ada"}'
``` ```
## Runs, Policy, And Diagnostics ## Runs, Policy, And Diagnostics
@ -100,9 +114,9 @@ omnigraph check --query queries.gq graph.omni --json
omnigraph schema plan --schema next.pg graph.omni --json omnigraph schema plan --schema next.pg graph.omni --json
omnigraph schema apply --schema next.pg graph.omni --json omnigraph schema apply --schema next.pg graph.omni --json
omnigraph policy validate --config omnigraph.yaml omnigraph policy validate --cluster ./company-brain --graph knowledge
omnigraph policy test --config omnigraph.yaml omnigraph policy test --cluster ./company-brain --graph knowledge --tests policy.tests.yaml
omnigraph policy explain --config omnigraph.yaml --actor act-alice --action read --branch main omnigraph policy explain --cluster ./company-brain --graph knowledge --actor act-alice --action read --branch main
omnigraph commit list graph.omni --json omnigraph commit list graph.omni --json
omnigraph commit show --uri graph.omni <commit-id> --json omnigraph commit show --uri graph.omni <commit-id> --json
@ -116,34 +130,29 @@ also pass `--schema`.
## Config ## Config
`omnigraph.yaml` lets the CLI and server share named graphs, defaults, and Configuration has two surfaces with single owners (see the
query roots: [CLI reference](reference.md#config-surfaces) for the full schema):
- **`~/.omnigraph/config.yaml`** — your personal operator config: default actor
(`--as`), named servers + credentials, clusters, profiles, aliases, and
default scope (`defaults.server` / `defaults.store` / `default_graph`). It
decides *who you are* and *what you address by default*.
- **`cluster.yaml`** (a team-owned cluster directory) — declares *what the system
is*: graphs, schemas, stored queries, policies, and storage. A server boots
from it (`--cluster <dir>`); see the [cluster guide](../clusters/index.md).
```yaml ```yaml
graphs: # ~/.omnigraph/config.yaml
local: operator:
uri: demo.omni actor: act-andrew
servers:
dev: dev:
uri: http://127.0.0.1:8080 url: http://127.0.0.1:8080
bearer_token_env: OMNIGRAPH_BEARER_TOKEN defaults:
server: dev
cli: default_graph: knowledge
graph: local
branch: main
query:
roots:
- queries
- .
``` ```
The config file can also define:
- server bind defaults
- auth env files
- query aliases for common read and change commands
- `policy.file` for Cedar authorization rules
When policy is enabled, `schema apply` is authorized through the When policy is enabled, `schema apply` is authorized through the
`schema_apply` action and is typically limited to admins on protected `main`. `schema_apply` action and is typically limited to admins on protected `main`.
@ -161,6 +170,6 @@ one-line warning to stderr and otherwise behave identically.
| `omnigraph query lint` | `omnigraph lint` | Same flags. The argv-level shim rewrites `query lint` to `lint`. | | `omnigraph query lint` | `omnigraph lint` | Same flags. The argv-level shim rewrites `query lint` to `lint`. |
| `omnigraph query check` | `omnigraph check` | `check` is a visible alias of `omnigraph lint`. | | `omnigraph query check` | `omnigraph check` | `check` is a visible alias of `omnigraph lint`. |
The `command:` field in `aliases.<name>` in `omnigraph.yaml` accepts both The `command:` field in `aliases.<name>` in `~/.omnigraph/config.yaml` accepts
`read` / `change` (legacy) and `query` / `mutate` (canonical); the two both `read` / `change` (legacy) and `query` / `mutate` (canonical); the two
spellings are interchangeable on the wire via serde aliases. spellings are interchangeable on the wire via serde aliases.


@ -1,31 +1,32 @@
# CLI Reference (`omnigraph`) # CLI Reference (`omnigraph`)
A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](index.md). A reference for the `omnigraph` binary's command surface and the per-operator `~/.omnigraph/config.yaml` schema. For a quick-start guide, see [cli.md](index.md).
Top-level command families and subcommands. Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server <name|url>` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph <id>` for multi-graph servers; exclusive with a positional URI), `--store <uri>` (a single graph's storage directly), or `--profile <name>` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles-rfc-011)); `cluster` commands use `--config <dir>`. A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected. Top-level command families and subcommands. Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server <name|url>` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph <id>` for multi-graph servers; exclusive with a positional URI), `--store <uri>` (a single graph's storage directly), or `--profile <name>` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles-rfc-011)); `cluster` commands use `--config <dir>`, while `policy` and `queries` read a cluster's applied state via `--cluster <dir|uri>`. A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected. **`query`/`mutate` are the exception**: their positional is a stored-query *name* (RFC-011 D3), not a graph URI, so they address the graph only via `--store`/`--server`/`--profile`/defaults.
## Top-level commands ## Top-level commands
| Command | Purpose | | Command | Purpose |
|---|---| |---|---|
| `init` | `--schema <pg>` → initialize a graph (no longer scaffolds `omnigraph.yaml`; start cluster configs from the [cluster.md](../clusters/index.md) quick-start or `config migrate`) | | `init` | `--schema <pg>` → initialize a graph (start cluster configs from the [cluster.md](../clusters/index.md) quick-start) |
| `load` | bulk load a branch, local or remote (`--mode overwrite\|append\|merge` is **required** — overwrite is destructive, so there is no default). Without `--from` the target branch must exist; `--from <base>` forks a missing `--branch` from `<base>` first | | `load` | bulk load a branch, local or remote (`--mode overwrite\|append\|merge` is **required** — overwrite is destructive, so there is no default). Without `--from` the target branch must exist; `--from <base>` forks a missing `--branch` from `<base>` first |
| `ingest` | deprecated alias of `load --from <base>` (defaults: `--from main --mode merge`); prints a one-line warning to stderr | | `ingest` | deprecated alias of `load --from <base>` (defaults: `--from main --mode merge`); prints a one-line warning to stderr |
| `query` (alias: `read`) | run named read query; source via `--query <path>`, `-e`/`--query-string <GQ>`, or `--alias <name>` (exactly one). `read` is the deprecated previous name and prints a one-line warning to stderr | | `query <name>` (alias: `read`) | run a read query. **Catalog lane** (default): `<name>` is a stored query invoked **by name** from the served catalog (served-only — address with `--server`/`--profile`; the verb asserts the query is a read). **Ad-hoc lane**: with `--query <path>` or `-e`/`--query-string <GQ>`, runs that source (the positional `<name>` then selects which query in it). No positional graph URI — address via `--store`/`--server`/`--profile`. `read` is the deprecated previous name (one-line stderr warning) |
| `mutate` (alias: `change`) | run mutation query; same `--query` / `-e` / `--alias` mutual-exclusion as `query`. `change` is the deprecated previous name and prints a one-line warning to stderr | | `mutate <name>` (alias: `change`) | run a mutation query; same catalog (by-name, served-only, verb asserts mutation) / ad-hoc (`--query`/`-e`) lanes as `query`. `change` is the deprecated previous name (one-line stderr warning) |
| `alias <name> [args]` | invoke an operator alias — a read-only personal binding (under `aliases:` in `~/.omnigraph/config.yaml`) to a stored query on a named server (RFC-011 D4; replaces the removed `--alias` flag; stored mutations are rejected before execution) |
| `snapshot` | print current snapshot (per-table version + row count) | | `snapshot` | print current snapshot (per-table version + row count) |
| `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) | | `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) |
| `branch create \| list \| delete \| merge` | branching ops | | `branch create \| list \| delete \| merge` | branching ops |
| `commit list \| show` | inspect commit graph | | `commit list \| show` | inspect commit graph |
| `schema plan \| apply \| show (alias: get)` | migrations | | `schema plan \| apply \| show (alias: get)` | migrations. `apply` refuses a cluster-managed graph (one whose storage is inside a cluster) and points at `cluster apply` — those graphs evolve through the cluster ledger, not a direct apply |
| `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` | | `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
| `config migrate` | propose (or `--write`: apply) the split of a legacy `omnigraph.yaml` — team half → ready-to-review `cluster.yaml`, personal half → `~/.omnigraph/config.yaml` (key-level merge, existing entries win), plus dropped-key reasons and manual steps | | `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve <resource> --as <actor>` (`apply`/`approve` default the actor from `~/.omnigraph/config.yaml`'s `operator.actor` when `--as` is omitted); what apply converges is what an `omnigraph-server --cluster <dir>` deployment serves on its next restart (`--cluster` is the server's only boot source — RFC-011 cluster-only); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock <LOCK_ID>` manually removes a held local state lock by exact id |
| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve <resource> --as <actor>` (`apply`/`approve` default the actor from the per-operator `omnigraph.yaml`'s `cli.actor` when `--as` is omitted; nothing else in that file affects cluster commands); what apply converges is what an `omnigraph-server --cluster <dir>` deployment serves on its next restart (omnigraph.yaml deployments are unaffected); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock <LOCK_ID>` manually removes a held local state lock by exact id |
| `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) | | `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) |
| `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. `--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review | | `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. `--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review |
| `cleanup --keep N --older-than 7d --confirm` | destructive version GC | | `cleanup --keep N --older-than 7d --confirm` | destructive version GC (`--confirm` to execute; also needs `--yes` against a non-local `s3://` target — see *Write diagnostics & destructive confirmation*) |
| `embed` | offline JSONL embedding pipeline | | `embed` | offline JSONL embedding pipeline |
| `policy validate \| test \| explain` | Cedar tooling. Selects `cli.graph`, else `server.graph`, else top-level `policy.file` | | `policy validate \| test \| explain` | Cedar tooling against a cluster's applied policies (`--cluster <dir>`; `--graph <id>` picks a graph's bundle when several apply). `test` takes `--tests <file>`; `explain` takes `--actor`/`--action`/`--branch`/`--target-branch` |
| `profile list \| show [<name>]` | read-only inspection of `~/.omnigraph/config.yaml` profiles. `list` shows each profile's binding (server/cluster/store) + default graph and marks the `$OMNIGRAPH_PROFILE`-active one; JSON keeps `binding` and adds `scope_kind`, `target`, `valid`, and `error`; `show` resolves one profile's scope (endpoint + default graph), defaulting to the active profile, else the flat operator defaults |
| `version` / `-v` | print `omnigraph 0.3.x` | | `version` / `-v` | print `omnigraph 0.3.x` |
## Command capabilities ## Command capabilities
@ -34,21 +35,30 @@ Every command declares the **capability** it needs — what it requires to reach
- **`any`** — `query`, `mutate`, `load`, `ingest`, `branch *`, `snapshot`, `export`, `commit *`, `schema show`, `schema apply`. Run against a graph **served (via a server) or embedded (direct against a store)**: accept a positional `file://`/`s3://` URI, `--server <name|url>` (+ `--graph <id>` for multi-graph servers), `--store <uri>`, or `--profile <name>`. A remote server is addressed with `--server` — a positional `http(s)://` URI does **not** dispatch to one. - **`any`** — `query`, `mutate`, `load`, `ingest`, `branch *`, `snapshot`, `export`, `commit *`, `schema show`, `schema apply`. Run against a graph **served (via a server) or embedded (direct against a store)**: accept a positional `file://`/`s3://` URI, `--server <name|url>` (+ `--graph <id>` for multi-graph servers), `--store <uri>`, or `--profile <name>`. A remote server is addressed with `--server` — a positional `http(s)://` URI does **not** dispatch to one.
- **`served`** — `graphs list`. Requires a server (accepts `--server` / `--profile`). - **`served`** — `graphs list`. Requires a server (accepts `--server` / `--profile`).
- **`direct`** — `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `queries validate`, `lint`. Need **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI`, but **not** `--server` / `--graph`, and a remote (`http(s)://`) URI is rejected. `optimize` / `repair` / `cleanup` also accept **`--cluster <dir|s3://…> --cluster-graph <id>`**, which resolves the graph's storage URI from the served cluster state (so you needn't know the `<storage>/graphs/<id>.omni` layout). - **`direct`** — `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `lint`. Need **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI`, but **not** `--server`, and a remote (`http(s)://`) URI is rejected. `optimize` / `repair` / `cleanup` additionally accept **`--cluster <dir|s3://…> --graph <id>`** (`--cluster` is a cluster directory or storage-root URI, named via `clusters:` in `~/.omnigraph/config.yaml` or a literal root), which resolves the graph's storage URI from the served cluster state (so you needn't know the `<storage>/graphs/<id>.omni` layout). `--graph` is the one graph selector across all scopes — on these three verbs it picks the cluster graph; on the other `direct` verbs it does not apply.
- **`control`** — `cluster *`. Operates on a cluster directory via `--config <dir>`. - **`control`** — `cluster *` via `--config <dir>`; `policy *` and `queries *` via `--cluster <dir|uri>` or a cluster profile.
- **`local`** — `policy *`, `embed`, `login`, `logout`, `config`, `version`, `queries list`. Address no graph. - **`local`** — `alias`, `embed`, `login`, `logout`, `profile`, `version`. Address no explicit graph scope.
These restrictions are enforced and reported, not silent: These restrictions are enforced and reported, not silent:
- A served-graph flag (`--server` / `--graph`) on a verb that doesn't reach a graph through a server fails loudly, e.g.: ``optimize is a direct (storage-native) command; --server/--graph address a served graph and do not apply. Pass a storage URI, or --cluster <dir> --cluster-graph <id>.`` - A scope flag on a verb that can't consume it fails loudly rather than being silently dropped — `--server` outside a served scope, `--cluster` outside cluster-scoped verbs, or `--graph` where no multi-graph scope applies, e.g.: ``optimize is a direct (storage-native) command; --server addresses a served graph and does not apply. Pass a storage URI, or --cluster <dir> --graph <id>.``
- A `direct` verb pointed at a remote URI fails loudly, e.g.: ``optimize is a direct (storage-native) command and needs direct storage access; the resolved target is a remote server (https://…). Pass the graph's file:// or s3:// URI.`` - A `direct` verb pointed at a remote URI fails loudly, e.g.: ``optimize is a direct (storage-native) command and needs direct storage access; the resolved target is a remote server (https://…). Pass the graph's file:// or s3:// URI.``
- A data verb pointed at a positional `http(s)://` URI fails loudly: ``a remote graph must be addressed with --server <url> — a positional (or --uri) http(s):// URL no longer dispatches to a server.`` - A data verb pointed at a positional `http(s)://` URI fails loudly: ``a remote graph must be addressed with --server <url> — a positional (or --uri) http(s):// URL no longer dispatches to a server.``
- `init` into an **established cluster's** storage layout (`<root>/graphs/<id>.omni` where `<root>` holds `__cluster/state.json`) is refused — graphs in a cluster are created by `cluster apply` (which records ledger / recovery / approvals), not `init`. - `init` into an **established cluster's** storage layout (`<root>/graphs/<id>.omni` where `<root>` holds `__cluster/state.json`) is refused — graphs in a cluster are created by `cluster apply` (which records ledger / recovery / approvals), not `init`.
To maintain a server-backed graph, run the `direct` verbs from a host with storage access against the graph's storage URI (a positional URI, or `--cluster … --cluster-graph …`), out-of-band from the serving process — there are no server routes for `optimize` / `repair` / `cleanup` by design. To maintain a server-backed graph, run the `direct` verbs from a host with storage access against the graph's storage URI (a positional URI, or `--cluster … --graph …`), out-of-band from the serving process — there are no server routes for `optimize` / `repair` / `cleanup` by design.
`omnigraph --help` lists commands with a **capability legend** at the bottom (any / served / direct / control / local). `omnigraph --help` lists commands with a **capability legend** at the bottom (any / served / direct / control / local).
## Write diagnostics & destructive confirmation
Two global flags make writes self-documenting and guard the dangerous ones (RFC-011 Decision 9):
- **Every write echoes its resolved target to stderr**, e.g. `omnigraph load → s3://acme/brain/graphs/knowledge.omni (direct, remote)`, so you catch a scope that resolved somewhere unexpected (e.g. *prod*) before it lands. This applies to `load`, `ingest`, `mutate`, `branch create|delete|merge`, `schema apply`, `optimize`, `repair`, and `cleanup`. The line goes to stderr, so `--json` consumers reading stdout are unaffected; suppress it with **`--quiet`**.
- **Destructive writes against a non-local scope require confirmation.** `cleanup`, overwrite `load` (`--mode overwrite`), and `branch delete` proceed freely against a local (`file://`) graph, but when the resolved target is **not local** (a served `http(s)://` graph or an `s3://` store/cluster) they require explicit consent. Pass **`--yes`** to confirm; without it, an interactive terminal is prompted, and a non-interactive run (no TTY, or `--json`) **refuses with an error** rather than silently destroying data. `cleanup` still also requires its existing `--confirm` (preview→execute); `--yes` is the additional non-local consent.
A "local" target is a bare path or a `file://` URI; `http(s)://`, `s3://`, and other object-store schemes are non-local.
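
The locality and confirmation rules above can be sketched as a small decision function. This is a hypothetical Python illustration, not the CLI's actual implementation: the names `is_local_target` and `confirmation_outcome`, the operation labels, and the returned strings are assumptions made for the example. Only the rules themselves come from the text.

```python
from urllib.parse import urlparse

# Hypothetical labels for the destructive operations named in the text.
DESTRUCTIVE = {"cleanup", "load --mode overwrite", "branch delete"}


def is_local_target(target: str) -> bool:
    """A bare path or a file:// URI is local; every other scheme is non-local."""
    if "://" not in target:
        return True  # bare path such as graph.omni
    return urlparse(target).scheme == "file"


def confirmation_outcome(op: str, target: str, *, yes: bool = False,
                         interactive: bool = False,
                         json_output: bool = False) -> str:
    """Return 'proceed', 'prompt', or 'refuse' for an operation on a resolved target."""
    # Non-destructive writes and local targets never need extra consent.
    if op not in DESTRUCTIVE or is_local_target(target):
        return "proceed"
    # Explicit consent via --yes.
    if yes:
        return "proceed"
    # An interactive terminal gets a prompt; --json counts as non-interactive.
    if interactive and not json_output:
        return "prompt"
    # No TTY (or --json) and no --yes: refuse instead of destroying data.
    return "refuse"
```

The sketch deliberately leaves out `cleanup`'s separate `--confirm` step, which the text describes as an additional, independent requirement.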
## Config surfaces ## Config surfaces
Two config surfaces with single owners, plus a zero-config tier: Two config surfaces with single owners, plus a zero-config tier:
@ -59,22 +69,20 @@ Two config surfaces with single owners, plus a zero-config tier:
| Operator config | one person | `~/.omnigraph/config.yaml` (override dir with `$OMNIGRAPH_HOME`) | who **I** am: identity, ergonomics | | Operator config | one person | `~/.omnigraph/config.yaml` (override dir with `$OMNIGRAPH_HOME`) | who **I** am: identity, ergonomics |
| Flags / env | per invocation | — | everything, explicitly | | Flags / env | per invocation | — | everything, explicitly |
`omnigraph.yaml` (below) is the legacy combined file — fully supported
today, slated for staged deprecation; its keys' future homes are
listed there.
### `~/.omnigraph/config.yaml` (operator) ### `~/.omnigraph/config.yaml` (operator)
```yaml ```yaml
operator: operator:
actor: act-andrew # default identity for every --as cascade: actor: act-andrew # default identity for the --as cascade: --as > operator.actor > none
# --as > legacy cli.actor > operator.actor > none
servers: # operator-owned endpoints; names key the credentials servers: # operator-owned endpoints; names key the credentials
prod: prod:
url: https://graph.example.com # no tokens in this file, ever url: https://graph.example.com # no tokens in this file, ever
defaults: defaults:
output: table # read format default, below --json/--format/alias/legacy output: table # read format default, below --json/--format/alias
server: prod # the everyday scope when no address is given (RFC-011) server: prod # the everyday SERVED scope when no address is given (RFC-011)
# store: file:///data/dev.omni # OR a zero-flag LOCAL default (mutually
# # exclusive with `server`); the local-dev
# # counterpart of `server`
default_graph: knowledge # graph selected in a server/cluster scope default_graph: knowledge # graph selected in a server/cluster scope
clusters: # admin-only: managed-cluster storage roots (RFC-011). clusters: # admin-only: managed-cluster storage roots (RFC-011).
brain: # the ONLY place a storage root lives in this file. brain: # the ONLY place a storage root lives in this file.
@ -85,8 +93,8 @@ profiles: # named scope bundles (RFC-011); pick with --profile
``` ```
Absent file = empty layer. Unknown keys warn and load (a file written for a Absent file = empty layer. Unknown keys warn and load (a file written for a
newer CLI works on an older one). `$OMNIGRAPH_CONFIG=<path>` stands in for newer CLI works on an older one). Override the config directory with
`--config` (the flag wins) in both the CLI and the server. `$OMNIGRAPH_HOME`.
#### Scopes & profiles (RFC-011) #### Scopes & profiles (RFC-011)
@ -95,20 +103,32 @@ graph in it; the served-vs-direct access path is derived from the scope, not
toggled. The scope comes from one of (highest precedence first): an explicit toggled. The scope comes from one of (highest precedence first): an explicit
address (a positional URI, `--server`, or `--store <uri>`); a named address (a positional URI, `--server`, or `--store <uri>`); a named
`--profile <name>` (or `$OMNIGRAPH_PROFILE`); or the flat `defaults.server` + `--profile <name>` (or `$OMNIGRAPH_PROFILE`); or the flat `defaults.server` +
`defaults.default_graph`. A **profile** binds exactly one of `server` / `cluster` `defaults.default_graph` (a served default) **or** `defaults.store` (a zero-flag
/ `store` plus an optional default graph — config data, not state: every command *local* default — mutually exclusive with `defaults.server`). A **profile** binds
resolves its scope fresh, there is no sticky "current" mode. exactly one of `server` / `cluster` / `store` plus an optional default graph —
config data, not state: every command resolves its scope fresh, there is no
sticky "current" mode. Inspect what is defined with `omnigraph profile list` and
`omnigraph profile show [<name>]` (read-only).
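The precedence described above can be sketched as a small pure function. This is a minimal illustration in Python under stated assumptions, not the CLI's implementation: the `Scope` type, the `resolve_scope` name, the dict layout of the parsed config, and the `default_graph` key inside a profile are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

KINDS = ("server", "cluster", "store")


@dataclass(frozen=True)
class Scope:
    kind: str                    # "server", "cluster", or "store"
    target: str                  # server name/URL, cluster name/root, or store URI
    graph: Optional[str] = None  # None for a store scope (assumed: it is one graph)


def _graph_for(kind, flag_graph, default_graph):
    # Assumption: a store scope addresses a single graph, so no graph id applies.
    if kind == "store":
        return None
    # An explicit --graph overrides the configured default.
    return flag_graph or default_graph


def resolve_scope(explicit=None, profile_name=None, config=None, graph=None):
    """Resolve a scope: explicit address > named profile > flat defaults."""
    config = config or {}

    # 1. An explicit address (positional URI, --server, --store) always wins.
    if explicit is not None:
        kind, target = explicit
        return Scope(kind, target, _graph_for(kind, graph, None))

    # 2. A named profile binds exactly one of server / cluster / store.
    if profile_name is not None:
        profile = config.get("profiles", {})[profile_name]
        bound = [k for k in KINDS if k in profile]
        if len(bound) != 1:
            raise ValueError(f"profile {profile_name!r} must bind exactly one of {KINDS}")
        kind = bound[0]
        return Scope(kind, profile[kind], _graph_for(kind, graph, profile.get("default_graph")))

    # 3. Flat defaults: a served default OR a local default, never both.
    defaults = config.get("defaults", {})
    if "server" in defaults and "store" in defaults:
        raise ValueError("defaults.server and defaults.store are mutually exclusive")
    if "server" in defaults:
        return Scope("server", defaults["server"],
                     _graph_for("server", graph, defaults.get("default_graph")))
    if "store" in defaults:
        return Scope("store", defaults["store"], None)
    raise ValueError("no scope: pass an address, a --profile, or set defaults")
```

Because the function takes everything as arguments and keeps no state, each call resolves the scope fresh, which mirrors the "config data, not state" point above.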
- `--store <uri>` addresses a single graph's storage directly (ad-hoc / break-glass). - `--store <uri>` addresses a single graph's storage directly (ad-hoc / break-glass).
- A `cluster`-bound profile reaches `optimize` / `repair` / `cleanup` for a managed - A `cluster`-bound profile reaches `optimize` / `repair` / `cleanup` for a managed
graph (resolving its storage root from `clusters:`), the same as graph (resolving its storage root from `clusters:`), the same as
`--cluster <root> --cluster-graph <id>`. `--cluster <root> --graph <id>`. A `--graph` flag overrides the profile's default.
- A `server`-bound scope on a maintenance verb, or a `cluster`-bound scope on a - A `server`-bound scope on a maintenance verb, or a `cluster`-bound scope on a
data verb, is rejected with a message pointing at the right addressing. data verb, is rejected with a message pointing at the right addressing.
- **No graph selected (RFC-011 D7).** When a scope has no `--graph` and no
`default_graph`, the CLI never silently picks:
- **Cluster scope** — exactly **one** applied graph is used automatically;
**several** errors and lists the candidates (from the served catalog).
- **Server scope** — a multi-graph server (any non-empty `GET /graphs`, even a
single entry) errors and lists the candidates: you must pass `--graph <id>`.
A single-graph / flat server (405 on `/graphs`), or one whose `/graphs` is
policy-gated or unreachable, uses its bare URL as before.
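The "never silently picks" rule can be expressed as a short decision function. The sketch below is illustrative only: `select_graph`, `GraphSelectionError`, and the convention that `candidates=None` stands for "listing unavailable" (a 405, a policy-gated route, or an unreachable server) are assumptions made for the example. The handling of a cluster with zero applied graphs and of an empty server listing is not specified above, so those branches are guesses and are marked as such.

```python
class GraphSelectionError(Exception):
    """Raised when the graph is ambiguous and the user must pass --graph."""


def select_graph(scope_kind, requested=None, default_graph=None, candidates=None):
    """Return the graph id to use, or None to use a server's bare URL.

    candidates: graph ids from the catalog, or None when no listing is available.
    """
    # An explicit --graph wins, then the configured default graph.
    if requested:
        return requested
    if default_graph:
        return default_graph

    if scope_kind == "cluster":
        applied = candidates or []
        if len(applied) == 1:
            return applied[0]  # exactly one applied graph: used automatically
        if not applied:
            # Assumption: the text above does not cover the zero-graph case.
            raise GraphSelectionError("cluster has no applied graphs")
        raise GraphSelectionError(
            "several graphs; pass --graph <id>: " + ", ".join(sorted(applied))
        )

    if scope_kind == "server":
        if candidates:
            # Any non-empty listing errors, even when it has a single entry.
            raise GraphSelectionError(
                "multi-graph server; pass --graph <id>: " + ", ".join(sorted(candidates))
            )
        # No listing (or, by assumption, an empty one): use the bare URL.
        return None

    raise ValueError(f"unknown scope kind: {scope_kind!r}")
```

Note the asymmetry the sketch makes visible: a single candidate is auto-selected in a cluster scope but is still an error in a server scope.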
`--target` and the positional-`http(s)://`→remote dispatch have been **removed**; `--target`, `--cluster-graph`, and the positional-`http(s)://`→remote dispatch
the remaining legacy surfaces (`--cluster-graph`, `omnigraph.yaml`'s `cli.graph` have been **removed** (`--graph` is now the one graph selector across server and
default) still work and an explicit address always wins. cluster scopes); operator `defaults`/`--profile` supply the no-flag scope and an
explicit address always wins.
#### Credentials keyed by server name #### Credentials keyed by server name
@ -136,10 +156,11 @@ aliases:
format: table format: table
``` ```
`omnigraph query --alias triage 2026-06-01` invokes `omnigraph alias triage 2026-06-01` invokes
`POST <server>/graphs/spike/queries/weekly_triage` with the keyed `POST <server>/graphs/spike/queries/weekly_triage` with the keyed
credential. A legacy `omnigraph.yaml` alias with the same name wins during credential. Aliases live in their own `alias` namespace (RFC-011 Decision 4),
the deprecation window (with a warning). so an alias can never shadow — or be shadowed by — a built-in verb. (The old
`--alias <name>` flag on `query`/`mutate` was removed.)
A remote command whose URL prefix-matches an operator server's `url` (the A remote command whose URL prefix-matches an operator server's `url` (the
`gh` host model — no flags needed) resolves its token through: `gh` host model — no flags needed) resolves its token through:
@ -148,64 +169,10 @@ A remote command whose URL prefix-matches an operator server's `url` (the
|---|---| |---|---|
| 1 | `OMNIGRAPH_TOKEN_<NAME>` env (`prod` → `OMNIGRAPH_TOKEN_PROD`) | | 1 | `OMNIGRAPH_TOKEN_<NAME>` env (`prod` → `OMNIGRAPH_TOKEN_PROD`) |
| 2 | `[<name>]` section in `~/.omnigraph/credentials` | | 2 | `[<name>]` section in `~/.omnigraph/credentials` |
| 3 | the legacy chain unchanged (`bearer_token_env` → `OMNIGRAPH_BEARER_TOKEN` → `auth.env_file`) | | 3 | the default `OMNIGRAPH_BEARER_TOKEN` env |
A token is only ever sent to the server it is keyed to: URLs matching no A keyed token is only ever sent to the server it is keyed to: a URL matching no
operator server use the legacy chain alone. operator server falls back to `OMNIGRAPH_BEARER_TOKEN` alone.
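A rough sketch of the keyed lookup may help make the three steps concrete. It is written in Python purely for illustration and assumes several things the text does not state: the function names, a `token` key inside each credentials section, that the env name is the server name upper-cased with no other normalization, and that "prefix-matches" is checked on a path boundary.

```python
import os


def token_env_name(server_name):
    # `prod` maps to OMNIGRAPH_TOKEN_PROD; handling of characters such as "-"
    # is not specified in the docs, so this sketch only upper-cases.
    return "OMNIGRAPH_TOKEN_" + server_name.upper()


def match_server(url, servers):
    """Return the name of the operator server whose `url` prefixes `url`, else None."""
    for name, entry in servers.items():
        base = entry["url"].rstrip("/")
        # Assumption: match on a path boundary so that a look-alike host such as
        # https://graph.example.com.evil.io does not match https://graph.example.com.
        if url == base or url.startswith(base + "/"):
            return name
    return None


def resolve_token(url, servers, credentials, env=None):
    """Resolve a bearer token for `url` using the documented precedence.

    servers:     {"prod": {"url": "https://graph.example.com"}, ...}
    credentials: parsed ~/.omnigraph/credentials, e.g. {"prod": {"token": "..."}}
                 (the "token" key is a hypothetical name for this sketch)
    """
    env = os.environ if env is None else env
    name = match_server(url, servers)
    if name is not None:
        keyed = env.get(token_env_name(name))           # step 1: keyed env var
        if keyed:
            return keyed
        from_file = credentials.get(name, {}).get("token")  # step 2: credentials section
        if from_file:
            return from_file
    # Step 3, and the only step for URLs matching no operator server.
    return env.get("OMNIGRAPH_BEARER_TOKEN")
```

The keyed branches run only after a server match, which is what keeps a keyed token from being sent to any other host.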
## `omnigraph.yaml` schema (legacy combined file)
> **Deprecated.** Loading this file prints a per-key notice
> naming each present key's new home (suppress in CI with
> `OMNIGRAPH_SUPPRESS_YAML_DEPRECATION=1`); `omnigraph config migrate`
> produces the split. The file keeps working through the deprecation
> window. Migrated teams can set `OMNIGRAPH_NO_LEGACY_CONFIG=1` to turn
> any legacy-file load into a hard error (regression guard; the file's
> absence is always fine).
```yaml
project: { name }
graphs:
<name>:
uri: <local|s3://|http(s)://>
bearer_token_env: <ENV_NAME>
queries: # per-graph stored-query registry (server-role; multi-graph mode)
<query-name>: # key MUST equal the `query <name>` symbol inside the .gq
file: <path-to-.gq> # relative to this config's directory
mcp:
expose: true # default true: listed in the MCP catalog (GET /queries); set false to hide (still HTTP-callable)
tool_name: <name> # optional MCP tool-name override (defaults to <query-name>;
# must be unique across exposed queries)
server:
graph: <name>
bind: <ip:port>
cli:
graph: <name>
branch: <name>
output_format: json|jsonl|csv|kv|table
table_max_column_width: 80
table_cell_layout: truncate|wrap
query:
roots: [<dir>, …] # search path for .gq files
auth:
env_file: .env.omni
aliases:
<alias>:
# accepted values: `read` / `query` (read alias), `change` / `mutate`
# (write alias). `query` and `mutate` are recommended; `read` and
# `change` remain accepted forever for back-compat.
command: read|change|query|mutate
query: <path-to-.gq>
name: <query-name>
args: [<positional-name>, …]
graph: <name>
branch: <name>
format: <output-format>
queries: # top-level registry — applies only to a bare-URI (anonymous) graph; a graph served by name uses its `graphs.<id>.queries`. Mirrors top-level `policy`.
<query-name>: { file: <path-to-.gq> } # mcp.expose defaults to true
policy:
file: policy.yaml
```
## Cluster config preview ## Cluster config preview
@ -228,8 +195,8 @@ apply, refresh, and import acquire `__cluster/lock.json` by default and release
it before returning. `cluster apply` executes only stored-query/policy catalog it before returning. `cluster apply` executes only stored-query/policy catalog
writes (content-addressed under `__cluster/resources/`) and requires an writes (content-addressed under `__cluster/resources/`) and requires an
existing `state.json`; graph/schema changes are deferred with warnings, and existing `state.json`; graph/schema changes are deferred with warnings, and
applied resources do not serve traffic — the server still boots from applied resources do not serve traffic until an `omnigraph-server --cluster
`omnigraph.yaml`. `cluster status` reads state only and reports any existing <dir>` restart picks them up. `cluster status` reads state only and reports any existing
lock metadata. `force-unlock` removes a lock only when the supplied id exactly lock metadata. `force-unlock` removes a lock only when the supplied id exactly
matches the lock file. `refresh` requires an existing `state.json`; `import` matches the lock file. `refresh` requires an existing `state.json`; `import`
creates one only when it is missing. Both observe declared graphs read-only at creates one only when it is missing. Both observe declared graphs read-only at
@ -248,7 +215,7 @@ embeddings, aliases, and bindings are reserved for later stages. See
## Param resolution ## Param resolution
Precedence (high to low): explicit `--params` / `--params-file`, alias positional args, `omnigraph.yaml` defaults. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients. Precedence (high to low): explicit `--params` / `--params-file`, alias positional args. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients.
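The precedence and the safe-integer bound can be illustrated with a few lines of Python. This is a sketch, not the engine's code: `resolve_params` and its argument shapes are hypothetical, and `is_js_safe_integer` merely mirrors the idea behind the Rust helpers named above using the well-known JavaScript bound of 2^53 - 1. What the engine does with an out-of-range value is not described here, so the sketch only exposes the check.

```python
JS_MAX_SAFE_INTEGER = 2**53 - 1  # 9007199254740991, JavaScript's Number.MAX_SAFE_INTEGER


def is_js_safe_integer(n):
    """True when `n` survives a round trip through a JavaScript number."""
    return -JS_MAX_SAFE_INTEGER <= n <= JS_MAX_SAFE_INTEGER


def resolve_params(alias_arg_names, positional, explicit):
    """Merge parameters: explicit --params values override alias positional args.

    alias_arg_names: names declared by the alias, in positional order
    positional:      values given after the alias name on the command line
    explicit:        dict parsed from --params / --params-file
    """
    if len(positional) > len(alias_arg_names):
        raise ValueError("more positional values than the alias declares")
    params = dict(zip(alias_arg_names, positional))  # lower precedence
    params.update(explicit)                          # higher precedence
    return params
```

For example, with a hypothetical alias argument named `since`, `resolve_params(["since"], ["2026-06-01"], {})` yields `{"since": "2026-06-01"}`, and an explicit `since` in `--params` would replace that value.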
## Bearer token resolution (CLI) ## Bearer token resolution (CLI)


@ -32,26 +32,24 @@ omnigraph cluster force-unlock <LOCK_ID> --config company-brain --json
`--config` points at a directory, not a file. The directory must contain `--config` points at a directory, not a file. The directory must contain
`cluster.yaml`. When omitted, it defaults to the current directory. `cluster.yaml`. When omitted, it defaults to the current directory.
## Relationship to `omnigraph.yaml` ## Relationship to `~/.omnigraph/config.yaml`
`cluster.yaml` does not replace `omnigraph.yaml`, and the two never describe `cluster.yaml` and the per-operator `~/.omnigraph/config.yaml` never describe
the same fact. `omnigraph.yaml` is the permanent **per-operator** layer (CLI the same fact. The operator config is the permanent **per-operator** layer
defaults, the operator's identity and credential references, graph targets (the operator's identity and credential references, named servers/clusters,
for data-plane commands); `cluster.yaml` is the shared desired state of a profiles, and CLI defaults); `cluster.yaml` is the shared desired state of a
whole deployment, read only by the `cluster` commands via `--config`. whole deployment, read only by the `cluster` commands via `--config`.
The exact contract: The exact contract:
- **Cluster commands read `omnigraph.yaml` for exactly one thing**: the - **Cluster commands read the operator config for exactly one thing**: the
`cli.actor` default used by `apply`/`approve` when `--as` is omitted — `operator.actor` default used by `apply`/`approve` when `--as` is omitted —
operator identity is a per-operator fact. With `--as` present, no config operator identity is a per-operator fact. With `--as` present, the operator
is read at all. Nothing else (its graph set, targets, bind, queries, config is not needed. Nothing else in it influences a cluster command.
policies) ever influences a cluster command; a malformed `omnigraph.yaml` - **No legacy `omnigraph.yaml`**: the CLI does not read `omnigraph.yaml` at
breaks only the no-flag actor lookup, loudly. all, and a `--cluster` server reads only the cluster catalog — boot is
- **A `--cluster` server reads `omnigraph.yaml` for nothing** — not even the cluster-only.
implicit current-directory search runs (mode-inference rule 0). Boot from - **The other direction is ergonomics, not coupling**: per-operator
cluster state XOR `omnigraph.yaml`, never a merge.
- **The other direction is ergonomics, not coupling**: a per-operator
data-plane commands address a cluster graph by its derived storage root data-plane commands address a cluster graph by its derived storage root
(`company-brain/graphs/knowledge.omni`) with `--store <uri>` — an ordinary (`company-brain/graphs/knowledge.omni`) with `--store <uri>` — an ordinary
local path, no special handling. local path, no special handling.
@ -269,12 +267,11 @@ Deletes remove the resource from state; their old payload blobs stay on disk
(garbage collection is a later stage). Re-running a converged apply is a no-op: (garbage collection is a later stage). Re-running a converged apply is a no-op:
no state write, no revision change (`state_written: false`). no state write, no revision change (`state_written: false`).
**Applied means serving — for deployments that opt in.** A server started **Applied means serving.** A server started with `--cluster <dir>` boots from
with `--cluster <dir>` boots from the applied revision (see the applied revision (see
[Serving from the cluster](#serving-from-the-cluster-the-mode-switch)); it [Serving from the cluster](#serving-from-the-cluster-the-mode-switch)); it
picks up newly applied state on its next restart. Deployments still booting picks up newly applied state on its next restart. Until that restart, applied
from `omnigraph.yaml` are untouched: for them, applied means recorded in the means recorded in the catalog, nothing more.
catalog, nothing more.
### Graph creation ### Graph creation


@ -117,7 +117,7 @@ omnigraph cluster apply --config company-brain --as andrew
`--as <actor>` attributes the run: it is recorded in recovery sidecars and `--as <actor>` attributes the run: it is recorded in recovery sidecars and
audit entries and threaded into the engine's commit history. Set audit entries and threaded into the engine's commit history. Set
`cli: { actor: <you> }` in your per-operator `omnigraph.yaml` to make it the `operator: { actor: <you> }` in your `~/.omnigraph/config.yaml` to make it the
default when `--as` is omitted (the flag always wins; `approve` requires one default when `--as` is omitted (the flag always wins; `approve` requires one
of the two). of the two).
@ -244,12 +244,12 @@ with an in-flight apply.
- **CI-driven convergence**: `validate` and `plan --json` are read-only and - **CI-driven convergence**: `validate` and `plan --json` are read-only and
safe in pipelines; gate `apply --as ci` on plan review. Approvals are the safe in pipelines; gate `apply --as ci` on plan review. Approvals are the
human step by design — keep `cluster approve` out of automation. human step by design — keep `cluster approve` out of automation.
- **`omnigraph.yaml` still has a job**: per-operator settings — your - **`~/.omnigraph/config.yaml` is the per-operator config**: your
`cli.actor` default for `--as`, CLI defaults, credentials, and data-plane `operator.actor` default for `--as`, named servers/clusters, credentials,
ergonomics (address a cluster graph by its derived root like profiles, and data-plane ergonomics (address a cluster graph by its derived
`company-brain/graphs/knowledge.omni` with `--store` for loads). It just no root like `company-brain/graphs/knowledge.omni` with `--store` for loads). The
longer describes the deployment — a server boots from one source or the cluster directory's `cluster.yaml` is the **sole deployment declaration**; the
other, never a merge of both. server boots from the cluster only.
## 7. Maintaining a cluster graph ## 7. Maintaining a cluster graph
@ -258,10 +258,11 @@ operation — it runs out-of-band, with direct storage access, against the graph
roots. Address a cluster graph by name instead of hand-typing its storage path: roots. Address a cluster graph by name instead of hand-typing its storage path:
```bash ```bash
omnigraph optimize --cluster ./company-brain --cluster-graph knowledge omnigraph optimize --cluster ./company-brain --graph knowledge
omnigraph cleanup --cluster ./company-brain --cluster-graph knowledge --keep 10 --confirm omnigraph cleanup --cluster ./company-brain --graph knowledge --keep 10 --confirm
# --cluster also takes the storage-root URI directly (config-free): # --cluster also takes the storage-root URI directly (config-free), and a
omnigraph optimize --cluster s3://bucket/clusters/company-brain --cluster-graph knowledge # `clusters:` name from ~/.omnigraph/config.yaml:
omnigraph optimize --cluster s3://bucket/clusters/company-brain --graph knowledge
``` ```
The graph's storage URI is resolved from the **served cluster state** (the same The graph's storage URI is resolved from the **served cluster state** (the same
@ -270,6 +271,16 @@ not resolvable. Run these from a host with storage access — there are no serve
routes for them. Conversely, **`init` refuses** a cluster-managed path: graphs in routes for them. Conversely, **`init` refuses** a cluster-managed path: graphs in
a cluster are created by `cluster apply`, not by hand. a cluster are created by `cluster apply`, not by hand.
If the cluster has exactly **one** applied graph you can omit `--graph` — it is
used automatically. With **several**, omitting `--graph` errors and lists the
candidates (RFC-011 D7); it never picks one for you.
Against an **`s3://`-backed cluster** the resolved graph storage is non-local, so a
destructive `cleanup` additionally requires **`--yes`** (an interactive prompt
otherwise, refusal without a TTY) on top of `--confirm` — see [cli-reference.md](../cli/reference.md)'s
*Write diagnostics & destructive confirmation*. Every maintenance run also echoes
its resolved target to stderr (suppress with `--quiet`).
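The consent rules for a destructive `cleanup` reduce to a small decision table. The following Python sketch is an illustration under assumptions rather than the CLI's logic: `cleanup_decision`, its return values, the injected `prompt` callable, and the treatment of a bare path as local are all hypothetical.

```python
def cleanup_decision(target_uri, confirm, yes, interactive, prompt=None):
    """Return "preview", "run", or "refuse" for a cleanup invocation.

    interactive: True only when a TTY is attached and --json is not set.
    prompt:      callable(target_uri) -> bool, asked only on an interactive run.
    """
    # Without --confirm the command only prints a preview.
    if not confirm:
        return "preview"

    # Assumption: file:// URIs and bare filesystem paths count as local.
    is_local = target_uri.startswith("file://") or "://" not in target_uri
    if is_local:
        return "run"  # a local target needs only --confirm

    # Non-local target (for example s3://): --yes is required on top of --confirm.
    if yes:
        return "run"
    if not interactive:
        return "refuse"  # no TTY (or --json): refuse rather than destroy
    return "run" if prompt is not None and prompt(target_uri) else "refuse"
```

Injecting `prompt` keeps the function free of terminal I/O, so the non-interactive refusal path can be exercised in a test without a TTY.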
## What the control plane does not do (yet) ## What the control plane does not do (yet)
- **No hot reload** — applied changes serve on the next restart. - **No hot reload** — applied changes serve on the next restart.


@ -13,13 +13,10 @@ Omnigraph supports two broad deployment shapes:
The server binary and container image expose the same HTTP surface. The server binary and container image expose the same HTTP surface.
The server also has two **boot sources**: `omnigraph.yaml` (graph targets The server has a single **boot source**: a **cluster directory**
declared in the per-operator config) or a **cluster directory** (`omnigraph-server --cluster <dir | s3://…>`), which serves the cluster control
(`omnigraph-server --cluster <dir>`), which serves the cluster control
plane's applied revision — see plane's applied revision — see
[cluster-config.md](clusters/config.md#serving-from-the-cluster-the-mode-switch). [cluster-config.md](clusters/config.md#serving-from-the-cluster-the-mode-switch).
The two are exclusive per deployment; switching is a restart with a different
flag.
## Binary Deployment ## Binary Deployment
@ -30,21 +27,26 @@ Build or install:
On Windows, the binaries are `omnigraph.exe` and `omnigraph-server.exe`. On Windows, the binaries are `omnigraph.exe` and `omnigraph-server.exe`.
Run against a local graph: The server boots from a cluster only (RFC-011) — there is no positional
`<URI>` / single-graph boot. Point it at a local cluster directory:
```bash ```bash
omnigraph-server graph.omni --bind 0.0.0.0:8080 omnigraph-server --cluster ./company-brain --bind 0.0.0.0:8080
``` ```
Run against an object-store-backed graph: Or boot config-free from an object-storage-rooted cluster:
```bash ```bash
OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \ OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
AWS_REGION="us-east-1" \ AWS_REGION="us-east-1" \
omnigraph-server s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \ omnigraph-server --cluster s3://my-bucket/clusters/company-brain \
--bind 0.0.0.0:8080 --bind 0.0.0.0:8080
``` ```
The server serves every graph in the cluster's applied revision under
`/graphs/{id}/...`. See [clusters](clusters/index.md) for authoring and
applying a cluster.
## Cluster Mode in Containers (AWS, Railway) ## Cluster Mode in Containers (AWS, Railway)
A cluster-booted deployment has **two shapes** since the `storage:` root: A cluster-booted deployment has **two shapes** since the `storage:` root:
@ -80,10 +82,8 @@ docker run -d \
-p 8080:8080 <image> -p 8080:8080 <image>
``` ```
`OMNIGRAPH_CLUSTER` is exclusive: combining it with `OMNIGRAPH_TARGET_URI`, `OMNIGRAPH_CLUSTER` is the server's only boot source. The image also
`OMNIGRAPH_CONFIG`, or `OMNIGRAPH_TARGET` fails fast (exit 64), the same ships the `omnigraph` CLI, so the day-2 loop runs in-container:
rule the server itself enforces. The image also ships the `omnigraph` CLI,
so the day-2 loop runs in-container with no `omnigraph.yaml`:
```bash ```bash
docker exec -it <container> sh -c \ docker exec -it <container> sh -c \
@ -104,10 +104,10 @@ docker exec -it <container> sh -c \
`omnigraph cluster apply --as <you> --config /var/lib/omnigraph/cluster` `omnigraph cluster apply --as <you> --config /var/lib/omnigraph/cluster`
→ force a new deployment (restart). → force a new deployment (restart).
For a deployment that doesn't need the cluster control plane, the classic For a stateless, volume-free deployment, root the cluster on object
stateless shape — `OMNIGRAPH_TARGET_URI=s3://bucket/graph.omni`, no volume — storage and boot config-free with
remains the simplest AWS architecture (see Binary/Container Deployment `OMNIGRAPH_CLUSTER=s3://bucket/clusters/<name>` (the bucket-no-volume
above). shape above) — the simplest AWS architecture.
### Railway ### Railway
@ -181,23 +181,24 @@ Build the image:
docker build -t omnigraph-server:local . docker build -t omnigraph-server:local .
``` ```
Run against a local graph: The server boots from a cluster only (RFC-011). Run against a cluster
directory on a mounted volume:
```bash ```bash
docker run --rm -p 8080:8080 \ docker run --rm -p 8080:8080 \
-v "$PWD/graph.omni:/data/graph.omni" \ -v "$PWD/company-brain:/var/lib/omnigraph/cluster" \
omnigraph-server:local \ omnigraph-server:local \
/data/graph.omni --bind 0.0.0.0:8080 --cluster /var/lib/omnigraph/cluster --bind 0.0.0.0:8080
``` ```
Run against an S3-backed graph: Run config-free against an object-storage-rooted cluster:
```bash ```bash
docker run --rm -p 8080:8080 \ docker run --rm -p 8080:8080 \
-e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \ -e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
-e AWS_REGION="us-east-1" \ -e AWS_REGION="us-east-1" \
omnigraph-server:local \ omnigraph-server:local \
s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \ --cluster s3://my-bucket/clusters/company-brain \
--bind 0.0.0.0:8080 --bind 0.0.0.0:8080
``` ```
@ -208,27 +209,14 @@ When no positional args are given, the image entrypoint
| Var | Effect | | Var | Effect |
|---|---| |---|---|
| `OMNIGRAPH_TARGET_URI` | Graph URI, passed as the positional argument. | | `OMNIGRAPH_CLUSTER` | Cluster boot source — a config directory or a storage-root URI, forwarded as `--cluster`. The only boot source. |
| `OMNIGRAPH_CONFIG` | Path to an `omnigraph.yaml`, passed as `--config`. Used to supply a `policy.file` (Cedar authorization). The config file and any relative `policy.file` must be mounted into the container. |
| `OMNIGRAPH_TARGET` | Graph name to select from the config's `graphs:` block (with `OMNIGRAPH_CONFIG`, when no `OMNIGRAPH_TARGET_URI`). |
| `OMNIGRAPH_BIND` | Listen address (default `0.0.0.0:8080`). | | `OMNIGRAPH_BIND` | Listen address (default `0.0.0.0:8080`). |
`OMNIGRAPH_TARGET_URI` and `OMNIGRAPH_CONFIG` **compose**: set both to keep the Per-graph and server-level Cedar policy come from the cluster's applied
graph URI in the env var while loading policy from the config file (the revision (authored in `cluster.yaml` and published with `cluster apply`),
positional URI wins over any `graphs:` entry). To enable Cedar policy on a not from a separate config file. The cluster docker shapes — volume vs.
container otherwise driven by `OMNIGRAPH_TARGET_URI`, mount the config dir and config-free object-storage root — are detailed under
add `OMNIGRAPH_CONFIG`: [Cluster Mode in Containers](#cluster-mode-in-containers-aws-railway) above.
```bash
docker run --rm -p 8080:8080 \
-e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
-e OMNIGRAPH_TARGET_URI="s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0" \
-e OMNIGRAPH_CONFIG="/etc/omnigraph/omnigraph.yaml" \
-v "$PWD/config:/etc/omnigraph:ro" \
omnigraph-server:local
# /etc/omnigraph/omnigraph.yaml contains `policy: { file: policy.yaml }`;
# policy.yaml (+ optional policy.tests.yaml) sit beside it in the mount.
```
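As a rough model of the cluster-only environment table, the mapping from variables to server arguments might look like the Python sketch below. The real image entrypoint is not shown in this document, so this is not its code: `server_argv` is a hypothetical name, and failing when `OMNIGRAPH_CLUSTER` is unset is an assumption about behavior the table does not spell out.

```python
def server_argv(env):
    """Build an omnigraph-server command line from container environment variables."""
    cluster = env.get("OMNIGRAPH_CLUSTER")
    if not cluster:
        # Assumption: with the cluster as the only boot source, a missing value
        # is treated as a configuration error in this sketch.
        raise ValueError("OMNIGRAPH_CLUSTER is required (config directory or storage-root URI)")
    bind = env.get("OMNIGRAPH_BIND", "0.0.0.0:8080")  # documented default listen address
    return ["omnigraph-server", "--cluster", cluster, "--bind", bind]
```

Calling `server_argv({"OMNIGRAPH_CLUSTER": "s3://my-bucket/clusters/company-brain"})` produces the same arguments as the config-free `docker run` example earlier in this section.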
## Auth ## Auth


@ -1,17 +1,18 @@
# Maintenance: Optimize, Repair & Cleanup # Maintenance: Optimize, Repair & Cleanup
**Addressing.** `optimize`, `repair`, and `cleanup` are **direct** (storage-native) CLI commands: they run with direct storage access against a positional `file://`/`s3://` URI or **`--cluster <dir|s3://…> --cluster-graph <id>`** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `<storage>/graphs/<id>.omni` layout). They never run through a server, and reject `--server` / `--graph` or a remote (`http(s)://`) URI with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command capabilities* section of [cli-reference.md](../cli/reference.md). **Addressing.** `optimize`, `repair`, and `cleanup` are **direct** (storage-native) CLI commands: they run with direct storage access against a positional `file://`/`s3://` URI or **`--cluster <dir|s3://…> --graph <id>`** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `<storage>/graphs/<id>.omni` layout). They never run through a server, and reject `--server` or a remote (`http(s)://`) URI with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command capabilities* section of [cli-reference.md](../cli/reference.md).
## `optimize` — non-destructive ## `optimize` — non-destructive
- Compacts every node + edge table on `main`, then reindexes them, then **publishes the resulting version to the `__manifest`** so the manifest's recorded version tracks the compacted-and-reindexed state. Reads pin the manifest version, so without this publish the work would be invisible to readers *and* would break the version precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually changed. - Compacts every node + edge table on `main`, then reindexes them, then **publishes the resulting version to the `__manifest`** so the manifest's recorded version tracks the compacted-and-reindexed state. Reads pin the manifest version, so without this publish the work would be invisible to readers *and* would break the version precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually changed.
- Rewrites small fragments into fewer large ones; old fragments remain reachable via older versions until `cleanup` runs. - Rewrites small fragments into fewer large ones; old fragments remain reachable via older versions until `cleanup` runs.
- **Reindex (index coverage maintenance).** A scalar/FTS/vector index only covers the fragments it was built over. Rows appended after the index was built (e.g. by `load --mode merge`, whose commit does not rebuild an already-existing index) are scanned unindexed, and compaction itself rewrites fragments out of an index's coverage. `optimize` runs Lance's incremental `optimize_indices` after compaction to fold those fragments back in (a delta merge, not a full retrain), restoring full coverage so equality/range/traversal predicates stay index-accelerated. This is why a table with **no compaction work but stale index coverage still commits** a new version under `optimize`. Run `optimize` on a cadence at least as frequent as your freshness window so recently-loaded rows do not linger in the unindexed flat-scan tail. - **Reindex (index coverage maintenance).** A scalar/FTS/vector index only covers the fragments it was built over. Rows appended after the index was built (e.g. by `load --mode merge`, whose commit does not rebuild an already-existing index) are scanned unindexed, and compaction itself rewrites fragments out of an index's coverage. `optimize` runs Lance's incremental `optimize_indices` after compaction to fold those fragments back in (a delta merge, not a full retrain), restoring full coverage so equality/range/traversal predicates stay index-accelerated. This is why a table with **no compaction work but stale index coverage still commits** a new version under `optimize`. Run `optimize` on a cadence at least as frequent as your freshness window so recently-loaded rows do not linger in the unindexed flat-scan tail.
- **Create declared-but-missing indexes (the index reconciler).** `@index`/`@key` declares intent; `schema apply` records it but builds nothing, and `load`/`mutate` defer a column that cannot be built yet (a `Vector` column with no trainable vectors). `optimize` materializes any such declared-but-unbuilt index over the compacted layout — so it is the convergence path for an `@index` added after data exists, or a vector index whose embeddings arrived via a later `embed`. A column still not buildable (no vectors yet) is reported on the table's stat as `pending_indexes` (visible in `--json`), not treated as a failure; the next `optimize` retries. So `optimize` is the single operator-facing index reconciler: it compacts, restores coverage, **and** builds declared-but-missing indexes.
- Each table's compact→reindex→publish serializes with concurrent mutations on the same table. A crash mid-operation is recovered automatically on the next open (both compaction and reindex are content-preserving, so roll-forward is always safe). - Each table's compact→reindex→publish serializes with concurrent mutations on the same table. A crash mid-operation is recovered automatically on the next open (both compaction and reindex are content-preserving, so roll-forward is always safe).
- **Requires a recovered graph.** `optimize` refuses (errors) when a pending crash-recovery operation is present — operating on an unrecovered graph could publish a partial write that recovery would roll back. Reopen the graph to run recovery, then re-run `optimize`. - **Requires a recovered graph.** `optimize` refuses (errors) when a pending crash-recovery operation is present — operating on an unrecovered graph could publish a partial write that recovery would roll back. Reopen the graph to run recovery, then re-run `optimize`.
- **Uncovered drift is skipped, not interpreted.** If a table's underlying version is ahead of the version recorded in `__manifest` and no crash-recovery record covers that movement, `optimize` reports `skipped: DriftNeedsRepair` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift. - **Uncovered drift is skipped, not interpreted.** If a table's underlying version is ahead of the version recorded in `__manifest` and no crash-recovery record covers that movement, `optimize` reports `skipped: DriftNeedsRepair` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift.
- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8). - Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version`. - Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version, pending_indexes` (the last lists any declared `@index` column the reconciler could not build this run, with the reason — e.g. a vector column with no trainable vectors yet).
- **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: BlobColumnsUnsupportedByLance` (and logged) instead of compacted, and the rest of the sweep proceeds normally. **Reads and writes are unaffected** — only compaction is. Consequence: fragment count and deleted-row space on blob tables are not reclaimed; query results are never affected. A skipped blob table is also **not reindexed** in the same sweep (the skip happens before the reindex step), so its index coverage on appended rows is not refreshed by `optimize` today. - **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: BlobColumnsUnsupportedByLance` (and logged) instead of compacted, and the rest of the sweep proceeds normally. **Reads and writes are unaffected** — only compaction is. Consequence: fragment count and deleted-row space on blob tables are not reclaimed; query results are never affected. A skipped blob table is also **not reindexed** in the same sweep (the skip happens before the reindex step), so its index coverage on appended rows is not refreshed by `optimize` today.
 ## `repair` — explicit
@ -34,6 +35,7 @@
 backstop, so it does as much as it can and converges on re-run. The CLI reports
 any failed tables; rerun `cleanup` to retry them.
 - CLI guards with `--confirm`; without it, prints a preview line.
+- **Non-local consent (RFC-011 D9).** Against a non-local target (an `s3://` store/cluster), `cleanup` additionally requires `--yes` on top of `--confirm`: a TTY is prompted, and a non-interactive run (no TTY, or `--json`) refuses rather than destroying. A local (`file://`) target needs only `--confirm`. The same `--yes` gate applies to overwrite `load` and `branch delete`; every maintenance run echoes its resolved target to stderr (suppress with `--quiet`).
 - **Recovery floor:** `--keep < 3` may garbage-collect versions that crash recovery needs as a rollback target. Default `--keep 10` is safe.
 - **Orphaned-branch reconciliation:** before the version GC, cleanup reclaims any per-table or commit-graph branch absent from the manifest branch list. These orphans arise when a `branch_delete` flips the manifest authority but a downstream best-effort reclaim does not complete (see [branches-commits.md](../branching/index.md)). The reconciler is idempotent (it no-ops once nothing is orphaned), runs regardless of the `keep_versions` / `older_than` values (those gate version GC only), and never reclaims `main` or system-branch forks. Reclaimed forks are logged.
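One way to read the `--confirm` / `--yes` rules above is as a small decision function. The sketch below is a hedged interpretation, not the CLI's implementation: `cleanup_gate` and its outcome strings are hypothetical, "non-local" is approximated as "starts with `s3://`", and the assumption that `--yes` skips the TTY prompt is inferred from the wording.

```python
def cleanup_gate(target: str, confirm: bool, yes: bool,
                 is_tty: bool, json_output: bool) -> str:
    """Return 'preview', 'run', 'prompt', or 'refuse' (hypothetical labels)."""
    if not confirm:
        return "preview"           # without --confirm only a preview line is printed
    if not target.startswith("s3://"):
        return "run"               # local target: --confirm alone is enough
    if yes:
        return "run"               # assumption: --yes pre-answers the consent prompt
    if is_tty and not json_output:
        return "prompt"            # interactive run: ask before destroying
    return "refuse"                # no TTY or --json: refuse rather than destroy
```

The bucket and path names used when calling it (for example `s3://bucket/graph`) are placeholders.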


@ -20,7 +20,7 @@ Server-scoped action (v0.6.0+; binds to `Omnigraph::Server::"root"`):
 10. `graph_list`: `GET /graphs` registry enumeration (multi-graph mode)
-Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` are reserved but not shipped in v0.6.0; operators add/remove graphs by editing `omnigraph.yaml` and restarting.)
+Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` over HTTP are reserved but not shipped; operators add/remove graphs by editing the cluster's `cluster.yaml`, running `omnigraph cluster apply`, and restarting the server.)
 ## Scope kinds
@ -28,38 +28,34 @@ Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — the
 - `target_branch_scope` — applied to destination (`schema_apply`, branch ops, run ops)
 - `protected_branches` — named list with special rules; rule scopes are `any | protected | unprotected`
-## Per-graph vs. server-level policy (multi-graph mode)
-In multi mode (`omnigraph.yaml` with a non-empty `graphs:` map), policy files attach at two levels:
+## Per-graph vs. server-level policy
+A server boots from a cluster (`--cluster <dir>`), and the cluster's
+`cluster.yaml` declares its policy bundles in a `policies:` section. Each bundle
+names the scopes it `applies_to`: a graph id (per-graph rules — `read`, `change`,
+`branch_*`, `schema_apply`) or the literal `cluster` (server-level rules —
+`graph_list`).
 ```yaml
-server:
-  policy:
-    file: server-policy.yaml # server-level: graph_list
-graphs:
+# cluster.yaml
+policies:
+  base:
+    file: base.policy.yaml
+    applies_to: [cluster, knowledge] # cluster-level + the `knowledge` graph
   alpha:
-    uri: s3://tenant-bucket/alpha
-    policy:
-      file: policies/alpha.yaml # per-graph: read, change, branch_*, schema_apply
-  beta:
-    uri: s3://tenant-bucket/beta
-    # no per-graph policy → no engine-layer Cedar enforcement on beta
+    file: policies/alpha.yaml
+    applies_to: [alpha] # per-graph: alpha only
 ```
-**Config follows graph identity, not server mode.** A graph served by **name**
-(`--target <name>` or `server.graph`) uses its own `graphs.<name>.policy.file`,
-exactly as in multi-graph mode. Top-level `policy.file` applies only to an
-**anonymous** graph — one served by a bare `<URI>` with no `graphs:` entry.
-Serving a **named** graph (single- or multi-graph mode) while top-level
-`policy.file` (or `queries:`) is populated **refuses boot**, naming the block,
-since the top-level value would otherwise be silently shadowed by the per-graph
-block. Move per-graph rules to `graphs.<graph_id>.policy.file` and `graph_list`
-rules to `server.policy.file`.
-Each graph's HTTP request flows through its own per-graph policy. The management endpoint (`GET /graphs`) flows through the server-level policy. When `server.policy.file` is unset, `GET /graphs` is denied in every runtime state, including `--unauthenticated`; with bearer tokens configured, it returns 403 after admission control because `graph_list` is not a `read`-equivalent action. The operator must explicitly authorize via `server-policy.yaml` to expose `/graphs`.
-Example server-level policy:
+A graph with no bundle bound to it has no engine-layer Cedar enforcement. Each
+graph's HTTP request flows through its bound bundle; the management endpoint
+(`GET /graphs`) flows through the `cluster`-scoped bundle. When no bundle binds
+`cluster`, `GET /graphs` is denied in every runtime state, including
+`--unauthenticated`; with bearer tokens configured it returns 403 after admission
+control because `graph_list` is not a `read`-equivalent action. The operator must
+bind a `cluster`-scoped bundle granting `graph_list` to expose `/graphs`.
+Example `cluster`-scoped bundle:
 ```yaml
 version: 1
@ -72,38 +68,26 @@ rules:
 actions: [graph_list]
 ```
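To make the `applies_to` binding concrete, the sketch below resolves which bundles cover a given scope from an already-parsed `policies:` mapping. It is a minimal model under stated assumptions: `bundles_for` and `enforcement` are hypothetical helper names, the dict mirrors the new `cluster.yaml` example, and the graph id `beta` stands in for any graph with no bundle.

```python
# Parsed form of the `policies:` section from the cluster.yaml example.
POLICIES = {
    "base": {"file": "base.policy.yaml", "applies_to": ["cluster", "knowledge"]},
    "alpha": {"file": "policies/alpha.yaml", "applies_to": ["alpha"]},
}


def bundles_for(policies: dict, scope: str) -> list:
    """Names of the bundles whose `applies_to` lists the scope.

    `scope` is either a graph id or the literal "cluster".
    """
    return sorted(name for name, bundle in policies.items()
                  if scope in bundle.get("applies_to", []))


def enforcement(policies: dict, graph_id: str) -> str:
    """'cedar' when some bundle binds the graph, 'none' otherwise."""
    return "cedar" if bundles_for(policies, graph_id) else "none"
```

With this data, `bundles_for(POLICIES, "cluster")` yields the `base` bundle, which is the one that would need to grant `graph_list` for `GET /graphs` to be reachable, while a graph such as `beta` with no bound bundle gets no engine-layer enforcement.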
-## Configuration
-`omnigraph.yaml`:
-```yaml
-policy:
-  file: policy.yaml # Cedar rules + groups
-  tests: policy.tests.yaml # declarative test cases
-cli:
-  actor: act-andrew # default actor for CLI direct-engine writes
-```
-Each per-graph rule may use at most one of `branch_scope` or `target_branch_scope`. Server-scoped rules (`graph_list`) take neither — they have no branch context.
-`cli.actor` is the default actor identity for CLI direct-engine writes
-when `policy.file` is configured. Override per-invocation with `--as
-<ACTOR>` (top-level flag) — `--as` wins, otherwise `cli.actor` is used,
-otherwise no actor. With policy configured and neither set, the
-engine-layer footgun guard intentionally denies the write (silent bypass
-via "I forgot the actor" is exactly what the guard prevents). Remote
-HTTP writes ignore both — they resolve their actor server-side from the
-bearer token.
+Each per-graph rule may use at most one of `branch_scope` or
+`target_branch_scope`. Server-scoped rules (`graph_list`) take neither — they
+have no branch context.
+## Actor for direct-engine writes
+The default actor identity for CLI direct-engine (`--store`) writes is
+`operator.actor` in `~/.omnigraph/config.yaml`. Override per-invocation with
+`--as <ACTOR>`: `--as` wins, otherwise `operator.actor`, otherwise no actor.
+Remote HTTP writes ignore both — they resolve their actor server-side from the
+bearer token. (Direct-store access carries no Cedar policy under RFC-011; policy
+lives in the cluster/server.)
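The precedence described on the new side of this hunk (`--as` first, then `operator.actor`, then no actor) can be written as a short function. This is a sketch of the stated rule only; `resolve_actor` and its parameters are hypothetical, and the actor names are placeholders.

```python
from typing import Optional


def resolve_actor(as_flag: Optional[str],
                  operator_actor: Optional[str],
                  remote: bool = False) -> Optional[str]:
    """Actor the client contributes to a write, or None.

    For remote HTTP writes the server resolves the actor from the bearer
    token, so nothing set on the client side is used.
    """
    if remote:
        return None
    # `--as` wins; otherwise fall back to operator.actor; otherwise no actor.
    return as_flag or operator_actor
```

For example, with `operator.actor` set to `act-andrew`, passing `--as act-ci` for one invocation would resolve to `act-ci`, and omitting it would resolve to `act-andrew`.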
 ## CLI
-Policy tooling resolves its graph like server single-mode policy: `cli.graph`
-wins, otherwise `server.graph` is used, otherwise the top-level `policy.file`
-is validated/tested/explained as the anonymous policy.
+Policy tooling reads a cluster's applied policy bundles: pass `--cluster <dir>`,
+and `--graph <id>` to pick a graph's bundle when several apply.
 - `omnigraph policy validate` — parse + count actors, exit 1 on parse error.
-- `omnigraph policy test` — run cases in `policy.tests.yaml`, exit 1 on any expectation mismatch.
+- `omnigraph policy test --tests <file>` — run the declarative cases in `<file>` against the selected bundle, exit 1 on any expectation mismatch.
 - `omnigraph policy explain --actor … --action … [--branch …] [--target-branch …]` — show decision and matched rule.
 - `omnigraph --as <ACTOR> <subcommand>` — set the actor for the duration of one invocation. Effective for `change`, `load` (and its deprecated `ingest` alias), `branch create|delete|merge`, and `schema apply` against a direct (`--store`) graph. **Rejected** on a served write (`--server`): the actor is bearer-token-resolved server-side, so `--as` can't set it there.
@ -132,7 +116,7 @@ reaches the authorization gate without a matching policy permit.
 |---|---|---|---|
 | **Open** | no | no | Every request is permitted. Refuses to start unless `--unauthenticated` or `OMNIGRAPH_UNAUTHENTICATED=1` is set — the operator must explicitly opt in. |
 | **DefaultDeny** | yes | no | Every authenticated request for an action other than `read` is rejected with HTTP 403. Closes the "tokens but forgot the policy file" trap — an operator who sets up auth and forgot to point at a policy file used to ship the illusion of protection. |
-| **PolicyEnabled** | yes | yes | Authenticated requests that reach a configured policy engine are evaluated by Cedar. Server-scoped actions still require `server.policy.file`. |
+| **PolicyEnabled** | yes | yes | Authenticated requests that reach a configured policy engine are evaluated by Cedar. Server-scoped actions still require a `cluster`-scoped policy bundle. |
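The state table can be modelled as a function of three booleans. The sketch assumes the two yes/no columns mean "bearer tokens configured" and "policy configured" (the header row falls outside this hunk), and it folds in the two startup refusals this section describes. `runtime_state` is a hypothetical name; a combination the docs do not cover, such as tokens together with the opt-in flag, is simply treated like the tokens-only case here.

```python
def runtime_state(tokens: bool, policy: bool, unauthenticated: bool) -> str:
    """Map the startup configuration to a runtime state, or refuse to start."""
    if tokens and policy:
        return "PolicyEnabled"   # requests are evaluated by Cedar
    if tokens:
        return "DefaultDeny"     # only `read` passes; other actions get 403
    if policy:
        # "policy file, no tokens" is a refused combination
        raise ValueError("refusing to start: policy configured without bearer tokens")
    if unauthenticated:
        return "Open"            # explicit operator opt-in
    raise ValueError("refusing to start: no tokens, no policy, no --unauthenticated")
```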
 The server refuses to start for the "no tokens, no policy, no flag" cell
 and for "policy file, no tokens" — instead of silently shipping an open


@ -1,38 +1,29 @@
 # HTTP Server (`omnigraph-server`)
-Axum 0.8 + tokio + utoipa-generated OpenAPI. **Two modes** (v0.6.0+): single-graph and multi-graph, with **two boot sources** for multi mode: `omnigraph.yaml` or — exclusively — a cluster directory (`--cluster`). Mode is inferred from CLI args + config shape.
-## Modes
-### Single-graph mode
-`omnigraph-server <URI>` or `omnigraph-server --target <name> --config omnigraph.yaml`. Routes are flat — `/snapshot`, `/read`, `/branches`, etc.
-**Config follows graph identity.** A bare `<URI>` is an *anonymous* graph and uses the **top-level** `policy.file` / `queries:`. A graph chosen by **name** (`--target` / `server.graph`) uses its own `graphs.<name>.{policy.file, queries}` — the same block multi-graph mode uses. ⚠️ *Changed from v0.6.0, which always used top-level config in single mode: a named-graph config that puts `policy`/`queries` at top-level now **refuses boot** and points you at `graphs.<name>.…` (move the block there). Bare-`<URI>` single mode is unchanged.*
-### Multi-graph mode (v0.6.0+)
-`omnigraph-server --config omnigraph.yaml` with a non-empty `graphs:` map and **no** single-mode selector (no `server.graph`, no `<URI>`, no `--target`). The server opens every configured graph in parallel at startup (bounded concurrency = 4, fail-fast on the first open error). Routes are nested under `/graphs/{graph_id}/...`. Bare flat paths return 404 in multi mode.
-### Cluster-booted multi mode
-`omnigraph-server --cluster <dir-or-uri>` boots from the cluster catalog's **applied
-revision** instead of
-`omnigraph.yaml` — an exclusive boot source: combining it with `<URI>`,
-`--target`, or `--config` is a startup error, and `omnigraph.yaml` is never
-read in this mode. Always multi-graph routing. See
+Axum 0.8 + tokio + utoipa-generated OpenAPI. **Cluster-only boot** (RFC-011): the server always boots from a cluster (`--cluster <dir | s3://…>`) and serves N graphs (N ≥ 1) under cluster routes. There is no longer a single-graph flat-route mode, no positional `<URI>` boot, no `--target`, and no `omnigraph.yaml`-`graphs:`-map boot. All HTTP is nested under `/graphs/{graph_id}/...`; `/healthz` and the management `/graphs` enumeration stay flat.
+## Boot
+### Cluster boot (the only boot)
+```bash
+omnigraph-server --cluster <dir | s3://…> --bind 0.0.0.0:8080
+```
+`omnigraph-server --cluster <dir-or-uri>` boots from the cluster catalog's
+**applied revision**. The server resolves that revision into per-graph
+startup configs (id, URI, optional per-graph policy, stored-query
+registry) plus an optional server-level policy, then opens every
+configured graph in parallel at startup (bounded concurrency = 4,
+fail-fast on the first open error). Routing is always multi-graph —
+requests to bare flat protected paths (`/read`, `/snapshot`, …) return
+404; the served surface is `/graphs/{graph_id}/...`. See
 [cluster-config.md](../clusters/config.md#serving-from-the-cluster-the-mode-switch)
-for what is read and the fail-fast readiness rules. `--bind`,
-`--unauthenticated`, and the bearer-token env vars work identically.
-Mode inference:
-0. CLI `--cluster <dir | s3://…>` → **multi, cluster-booted** (exclusive; a scheme-qualified argument reads the ledger straight from the storage root, no local config)
-1. CLI positional `<URI>` → single
-2. CLI `--target <name>` → single
-3. `server.graph` in config → single
-4. `--config` + non-empty `graphs:` + no single-mode selector → **multi**
-5. otherwise → error with migration hint
+for what is read and the fail-fast readiness rules.
+A scheme-qualified argument (`s3://…`) reads the ledger straight from the
+storage root, with no local config directory. `--bind`,
+`--unauthenticated`, and the bearer-token env vars all apply.
 ### Stored-query validation at startup
@ -40,36 +31,37 @@ If a graph declares a `queries:` registry (see [cli-reference](../cli/reference.
 ## Endpoint inventory
-Per-graph endpoints — same body shape across modes; URLs differ:
-| Method | Single-mode path | Multi-mode path | Auth | Action |
-|---|---|---|---|---|
-| GET | `/healthz` | `/healthz` | none | — |
-| GET | `/openapi.json` | `/openapi.json` | none | — (strips security if auth disabled; in multi mode emits cluster paths with `cluster_` operation-id prefix) |
-| GET | `/snapshot?branch=` | `/graphs/{id}/snapshot?branch=` | bearer + `read` | snapshot of branch |
-| POST | `/query` | `/graphs/{id}/query` | bearer + `read` | inline read query (canonical; clean field names `query`/`name`; mutations → 400) |
-| POST | `/read` | `/graphs/{id}/read` | bearer + `read` | **deprecated** alias of `/query` (legacy field names `query_source`/`query_name`, byte-stable response; carries `Deprecation: true` + `Link: </query>; rel="successor-version"`) |
-| POST | `/export` | `/graphs/{id}/export` | bearer + `export` | NDJSON stream |
-| POST | `/mutate` | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) |
-| POST | `/change` | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: </mutate>; rel="successor-version"`) |
-| GET | `/queries` | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog |
-| POST | `/queries/{name}` | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 |
-| GET | `/schema` | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source |
-| POST | `/schema/apply` | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | migrate |
-| POST | `/load` | `/graphs/{id}/load` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | bulk load (canonical); branch creation is opt-in via `from` — without it a missing `branch` is a 404, never an implicit fork (32 MB body limit) |
-| POST | `/ingest` | `/graphs/{id}/ingest` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | **deprecated** alias of `/load` (carries `Deprecation: true` + `Link: </load>; rel="successor-version"`) (32 MB body limit) |
-| GET | `/branches` | `/graphs/{id}/branches` | bearer + `read` | list branches |
-| POST | `/branches` | `/graphs/{id}/branches` | bearer + `branch_create` | create |
-| DELETE | `/branches/{branch}` | `/graphs/{id}/branches/{branch}` | bearer + `branch_delete` | delete |
-| POST | `/branches/merge` | `/graphs/{id}/branches/merge` | bearer + `branch_merge` | merge `source → target` |
-| GET | `/commits?branch=` | `/graphs/{id}/commits?branch=` | bearer + `read` | list |
-| GET | `/commits/{commit_id}` | `/graphs/{id}/commits/{commit_id}` | bearer + `read` | show |
-Server-level management endpoints (v0.6.0+):
+Per-graph endpoints — all nested under `/graphs/{id}/...`. `{id}` is the
+graph id from the cluster's applied revision:
 | Method | Path | Auth | Action |
 |---|---|---|---|
-| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs (405 in single mode) |
+| GET | `/healthz` | none | — |
+| GET | `/openapi.json` | none | — (strips security if auth disabled; emits the nested cluster paths with `cluster_` operation-id prefix) |
+| GET | `/graphs/{id}/snapshot?branch=` | bearer + `read` | snapshot of branch |
+| POST | `/graphs/{id}/query` | bearer + `read` | inline read query (canonical; clean field names `query`/`name`; mutations → 400) |
+| POST | `/graphs/{id}/read` | bearer + `read` | **deprecated** alias of `/query` (legacy field names `query_source`/`query_name`, byte-stable response; carries `Deprecation: true` + `Link: <query>; rel="successor-version"`) |
+| POST | `/graphs/{id}/export` | bearer + `export` | NDJSON stream |
+| POST | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) |
+| POST | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: <mutate>; rel="successor-version"`) |
+| GET | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog |
+| POST | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 |
+| GET | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source |
+| POST | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | disabled for cluster-backed serving; returns 409 and points operators at `omnigraph cluster apply` + restart |
+| POST | `/graphs/{id}/load` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | bulk load (canonical); branch creation is opt-in via `from` — without it a missing `branch` is a 404, never an implicit fork (32 MB body limit) |
+| POST | `/graphs/{id}/ingest` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | **deprecated** alias of `/load` (carries `Deprecation: true` + `Link: <load>; rel="successor-version"`) (32 MB body limit) |
+| GET | `/graphs/{id}/branches` | bearer + `read` | list branches |
+| POST | `/graphs/{id}/branches` | bearer + `branch_create` | create |
+| DELETE | `/graphs/{id}/branches/{branch}` | bearer + `branch_delete` | delete |
+| POST | `/graphs/{id}/branches/merge` | bearer + `branch_merge` | merge `source → target` |
+| GET | `/graphs/{id}/commits?branch=` | bearer + `read` | list |
+| GET | `/graphs/{id}/commits/{commit_id}` | bearer + `read` | show |
+Server-level management endpoints:
+| Method | Path | Auth | Action |
+|---|---|---|---|
+| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs |
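A client-side sketch of the nested routing in the new tables: per-graph paths gain a `/graphs/{id}` prefix, while `/healthz`, `/openapi.json`, and `/graphs` stay flat. The helper names, base URL, and token below are placeholders, and the request body carries only the `query`/`name` fields the table mentions; any further fields the server expects are not shown in this diff, so this is not a complete client.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

# Paths the tables list without a /graphs/{id} prefix.
FLAT_PATHS = {"/healthz", "/openapi.json", "/graphs"}


def route(base: str, path: str, graph_id: Optional[str] = None) -> str:
    """Build the URL for a flat or per-graph endpoint."""
    base = base.rstrip("/")
    if path in FLAT_PATHS:
        return base + path
    if graph_id is None:
        raise ValueError("per-graph endpoints need a graph id")
    return f"{base}/graphs/{quote(graph_id, safe='')}{path}"


def build_query_request(base: str, graph_id: str, token: str,
                        query: str, name: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the canonical /query endpoint."""
    body = json.dumps({"query": query, "name": name}).encode("utf-8")
    return urllib.request.Request(
        route(base, "/query", graph_id),
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a server.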
 ### Stored-query catalog (`GET /queries`)
@ -88,13 +80,14 @@ Invoke a curated, server-side stored query by **name** — the source comes from
 - **Requires an explicit policy grant when auth is on.** In default-deny mode (bearer tokens but no `policy.file`), only `read` is permitted, so *every* `/queries/{name}` call returns `404` until an `invoke_query` rule is configured.
 - A stored mutation cannot target a `snapshot` (`400`); a parameter type error is a structured `400` naming the parameter.
-## Adding and removing graphs (multi mode)
-Runtime add/remove via API is **not** exposed in v0.6.0 — neither
-`POST /graphs` nor `DELETE /graphs/{id}` is implemented. Operators add
-or remove graphs by stopping the server, editing the `graphs:` map in
-`omnigraph.yaml`, then restarting. The server treats `omnigraph.yaml`
-as operator-owned configuration and never writes it.
+## Adding and removing graphs
+Runtime add/remove via API is **not** exposed — neither `POST /graphs`
+nor `DELETE /graphs/{id}` is implemented. Operators add or remove graphs
+by running `cluster apply` against the cluster (which publishes a new
+applied revision) and restarting the server so it boots from the new
+revision. The server treats the cluster source as operator-owned and
+never writes it.
 A future release may introduce a managed registry and re-expose runtime
 mutation on top of it.
@ -138,8 +131,8 @@ channels:
 - **Response headers (RFC 9745)**: every response carries `Deprecation: true`.
 - **Response headers (RFC 8288)**: every response carries a `Link` header
 pointing at the canonical successor:
-`Link: </query>; rel="successor-version"` for `/read`, and
-`Link: </mutate>; rel="successor-version"` for `/change`. SDKs and HTTP
+`Link: <query>; rel="successor-version"` for `/read`, and
+`Link: <mutate>; rel="successor-version"` for `/change`. SDKs and HTTP
 proxies can pick the successor up automatically.
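A client can follow these headers mechanically. The sketch below resolves the `Link` target against the request URL with standard relative-reference rules, which suggests (as an inference, not something the diff states) why the target changed from `</query>` to `<query>`: under nested routes a relative `query` stays inside `/graphs/{id}/`, whereas `/query` would resolve to a flat path. `successor_url` is a hypothetical helper that assumes header names arrive with exactly the casing shown.

```python
import re
from typing import Mapping, Optional
from urllib.parse import urljoin

_SUCCESSOR = re.compile(r'<([^>]*)>\s*;\s*rel="successor-version"')


def successor_url(request_url: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return the canonical successor URL for a deprecated route, if any."""
    if headers.get("Deprecation") != "true":
        return None
    match = _SUCCESSOR.search(headers.get("Link", ""))
    if match is None:
        return None
    # Resolve the link target relative to the URL that was requested.
    return urljoin(request_url, match.group(1))
```

Called with a request to `/graphs/alpha/read` and the new-style header, this returns the `/graphs/alpha/query` URL on the same host.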
 Migration is purely cosmetic on the client side — swap the URL path, leave
@ -226,4 +219,4 @@ See [deployment.md](../deployment.md) for token-source operational details.
 admission control" above). No global rate limiter is configured;
 add `tower_http::limit` if a graph-wide cap is needed.
 - Pagination — none (commits/branches return everything; export streams).
-- Runtime graph add/remove — edit `omnigraph.yaml` and restart.
+- Runtime graph add/remove — run `cluster apply` and restart.


@ -27,7 +27,8 @@ list/`Blob` columns → none.
 ## L2 — OmniGraph orchestration
-- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted indexes for the current head; safe to re-run.
+- **`@index`/`@key` declares intent; the physical index is derived state.** A migration records the declaration in the catalog/IR and never fails on it — `schema apply` builds **no** indexes (adding an `@index` to an existing column is a pure metadata change that touches no table data). `load`/`mutate` build declared indexes inline as part of the write, but a column that can't be built yet (a `Vector` column with no trainable vectors — IVF k-means needs ≥1 vector, e.g. rows loaded before `embed` runs) is left **pending**, not fatal. Reads stay correct meanwhile: a missing/partial index degrades to a scan (vector search to brute-force). A later `ensure_indices`/`optimize` materializes the pending index once it is buildable. This mirrors how LanceDB builds indexes asynchronously and serves unindexed rows by brute-force.
+- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted + vector indexes for the current head; safe to re-run; returns the columns it had to defer as pending. `optimize` runs it after compaction, so the maintenance cron is the convergence path for deferred indexes.
 - Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state.
 - **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork.
 - Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column.
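The "declared intent, derived index" rule can be illustrated with a toy reconciler pass: every declared column is either built or deferred as pending with a reason, and a later pass picks up deferred columns once they become buildable. Everything here is hypothetical (function name, index-kind strings, column names, and the reason text); it models the behaviour described in the bullets above, not the real `ensure_indices`.

```python
from typing import Dict, List, Tuple


def plan_index_builds(declared: Dict[str, str],
                      trainable_vectors: Dict[str, int]
                      ) -> Tuple[List[str], Dict[str, str]]:
    """Split declared index columns into (buildable, pending-with-reason).

    `declared` maps column -> index kind ("btree", "inverted", "vector").
    `trainable_vectors` maps vector column -> count of non-null vectors.
    """
    built: List[str] = []
    pending: Dict[str, str] = {}
    for column, kind in declared.items():
        if kind == "vector" and trainable_vectors.get(column, 0) < 1:
            # Training needs at least one vector; defer instead of failing.
            pending[column] = "no trainable vectors yet"
        else:
            built.append(column)
    return built, pending
```

Re-running the same function after embeddings exist moves the vector column from the pending map to the built list, which is the convergence behaviour the maintenance sweep relies on.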


@ -10,14 +10,82 @@
"version": "0.7.0" "version": "0.7.0"
   },
   "paths": {
-    "/branches": {
+    "/graphs": {
+      "get": {
+        "tags": [
+          "management"
+        ],
+        "summary": "List every graph currently registered with this server (MR-668).",
+        "description": "Multi-graph mode only. In single mode, the route returns 405 — there's\nno registry to enumerate. Cedar-gated by the server-level policy via\nthe `graph_list` action against `Omnigraph::Server::\"root\"`.\n\nOrder: alphabetical by `graph_id` (server-sorted so clients see\ndeterministic output across requests).",
+        "operationId": "listGraphs",
+        "responses": {
+          "200": {
+            "description": "List of registered graphs",
+            "content": {
+              "application/json": {
+                "schema": {
+                  "$ref": "#/components/schemas/GraphListResponse"
+                }
+              }
+            }
+          },
+          "401": {
+            "description": "Unauthorized",
+            "content": {
+              "application/json": {
+                "schema": {
+                  "$ref": "#/components/schemas/ErrorOutput"
+                }
+              }
+            }
+          },
+          "403": {
+            "description": "Forbidden",
+            "content": {
+              "application/json": {
+                "schema": {
+                  "$ref": "#/components/schemas/ErrorOutput"
+                }
+              }
+            }
+          },
+          "405": {
+            "description": "Method not allowed (single-graph mode)",
+            "content": {
+              "application/json": {
+                "schema": {
+                  "$ref": "#/components/schemas/ErrorOutput"
+                }
+              }
+            }
+          }
+        },
+        "security": [
+          {
+            "bearer_token": []
+          }
+        ]
+      }
+    },
+    "/graphs/{graph_id}/branches": {
       "get": {
         "tags": [
           "branches"
         ],
         "summary": "List all branches.",
         "description": "Returns branch names sorted alphabetically. Read-only.",
-        "operationId": "listBranches",
+        "operationId": "cluster_listBranches",
+        "parameters": [
+          {
+            "name": "graph_id",
+            "in": "path",
+            "description": "Graph id to route the request to.",
+            "required": true,
+            "schema": {
+              "type": "string"
+            }
+          }
+        ],
         "responses": {
           "200": {
             "description": "List of branches",
@ -62,7 +130,18 @@
         ],
         "summary": "Create a new branch.",
         "description": "Forks `name` off of `from` (defaults to `main`). The new branch shares\ntable data with its parent until it is mutated. Returns 409 if `name`\nalready exists.",
-        "operationId": "createBranch",
+        "operationId": "cluster_createBranch",
+        "parameters": [
+          {
+            "name": "graph_id",
+            "in": "path",
+            "description": "Graph id to route the request to.",
+            "required": true,
+            "schema": {
+              "type": "string"
+            }
+          }
+        ],
         "requestBody": {
           "content": {
             "application/json": {
@ -142,14 +221,25 @@
         ]
       }
     },
-    "/branches/merge": {
+    "/graphs/{graph_id}/branches/merge": {
       "post": {
         "tags": [
           "branches"
         ],
         "summary": "Merge one branch into another.",
         "description": "Merges `source` into `target` (defaults to `main`). Outcome is one of\n`already_up_to_date`, `fast_forward`, or `merged`. Returns 409 with the\nlist of conflicts if the merge cannot be completed; the target is left\nunchanged in that case. **Destructive** to `target` on success.",
-        "operationId": "mergeBranches",
+        "operationId": "cluster_mergeBranches",
+        "parameters": [
+          {
+            "name": "graph_id",
+            "in": "path",
+            "description": "Graph id to route the request to.",
+            "required": true,
+            "schema": {
+              "type": "string"
+            }
+          }
+        ],
         "requestBody": {
           "content": {
             "application/json": {
@ -229,15 +319,24 @@
         ]
       }
     },
-    "/branches/{branch}": {
+    "/graphs/{graph_id}/branches/{branch}": {
       "delete": {
         "tags": [
           "branches"
         ],
         "summary": "Delete a branch.",
         "description": "**Irreversible.** Removes the branch pointer; commits remain reachable\nonly if referenced by another branch. Returns 404 if the branch does not\nexist.",
-        "operationId": "deleteBranch",
+        "operationId": "cluster_deleteBranch",
         "parameters": [
+          {
+            "name": "graph_id",
+            "in": "path",
+            "description": "Graph id to route the request to.",
+            "required": true,
+            "schema": {
+              "type": "string"
+            }
+          },
           {
             "name": "branch",
             "in": "path",
@ -307,14 +406,25 @@
] ]
} }
}, },
"/change": { "/graphs/{graph_id}/change": {
"post": { "post": {
"tags": [ "tags": [
"mutations" "mutations"
], ],
"summary": "**Deprecated** — use [`POST /mutate`](#tag/mutations/operation/mutate) instead.", "summary": "**Deprecated** — use [`POST /mutate`](#tag/mutations/operation/mutate) instead.",
"description": "Apply a GQ mutation to a branch. Behavior is unchanged; the route is\nkept indefinitely for back-compat. New integrations should target\n`POST /mutate`, which has identical semantics and a name that pairs\ncleanly with `POST /query`. Responses from this route include\n`Deprecation: true` and `Link: </mutate>; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the\nsignal.", "description": "Apply a GQ mutation to a branch. Behavior is unchanged; the route is\nkept indefinitely for back-compat. New integrations should target\n`POST /mutate`, which has identical semantics and a name that pairs\ncleanly with `POST /query`. Responses from this route include\n`Deprecation: true` and `Link: <mutate>; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the\nsignal.",
"operationId": "change", "operationId": "cluster_change",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": { "requestBody": {
"content": { "content": {
"application/json": { "application/json": {
@ -327,7 +437,7 @@
}, },
"responses": { "responses": {
"200": { "200": {
"description": "Mutation results (response includes `Deprecation: true` + `Link: </mutate>; rel=\"successor-version\"`)", "description": "Mutation results (response includes `Deprecation: true` + `Link: <mutate>; rel=\"successor-version\"`)",
"content": { "content": {
"application/json": { "application/json": {
"schema": { "schema": {
@ -395,15 +505,24 @@
] ]
} }
}, },
"/commits": { "/graphs/{graph_id}/commits": {
"get": { "get": {
"tags": [ "tags": [
"commits" "commits"
], ],
"summary": "List commits.", "summary": "List commits.",
"description": "Filter by `branch` to get the commits on a single branch (most recent\nfirst); omit to list across all branches. Read-only.", "description": "Filter by `branch` to get the commits on a single branch (most recent\nfirst); omit to list across all branches. Read-only.",
"operationId": "listCommits", "operationId": "cluster_listCommits",
"parameters": [ "parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
},
{ {
"name": "branch", "name": "branch",
"in": "query", "in": "query",
@ -455,15 +574,24 @@
] ]
} }
}, },
"/commits/{commit_id}": { "/graphs/{graph_id}/commits/{commit_id}": {
"get": { "get": {
"tags": [ "tags": [
"commits" "commits"
], ],
"summary": "Get a single commit.", "summary": "Get a single commit.",
"description": "Returns the commit's manifest version, parent commit(s), and creation\nmetadata. Read-only.", "description": "Returns the commit's manifest version, parent commit(s), and creation\nmetadata. Read-only.",
"operationId": "getCommit", "operationId": "cluster_getCommit",
"parameters": [ "parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
},
{ {
"name": "commit_id", "name": "commit_id",
"in": "path", "in": "path",
@ -523,14 +651,25 @@
] ]
} }
}, },
"/export": { "/graphs/{graph_id}/export": {
"post": { "post": {
"tags": [ "tags": [
"queries" "queries"
], ],
"summary": "Stream the contents of a branch as NDJSON.", "summary": "Stream the contents of a branch as NDJSON.",
"description": "Emits one JSON object per line (`application/x-ndjson`). Filter with\n`type_names` (node/edge type names) and/or `table_keys`; both empty\nstreams the entire branch. Suitable for large exports — the response is\nstreamed, not buffered. Read-only.", "description": "Emits one JSON object per line (`application/x-ndjson`). Filter with\n`type_names` (node/edge type names) and/or `table_keys`; both empty\nstreams the entire branch. Suitable for large exports — the response is\nstreamed, not buffered. Read-only.",
"operationId": "export", "operationId": "cluster_export",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": { "requestBody": {
"content": { "content": {
"application/json": { "application/json": {
@ -586,93 +725,25 @@
] ]
} }
}, },
"/graphs": { "/graphs/{graph_id}/ingest": {
"get": {
"tags": [
"management"
],
"summary": "List every graph currently registered with this server (MR-668).",
"description": "Multi-graph mode only. In single mode, the route returns 405 — there's\nno registry to enumerate. Cedar-gated by the server-level policy via\nthe `graph_list` action against `Omnigraph::Server::\"root\"`.\n\nOrder: alphabetical by `graph_id` (server-sorted so clients see\ndeterministic output across requests).",
"operationId": "listGraphs",
"responses": {
"200": {
"description": "List of registered graphs",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/GraphListResponse"
}
}
}
},
"401": {
"description": "Unauthorized",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"403": {
"description": "Forbidden",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"405": {
"description": "Method not allowed (single-graph mode)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
}
},
"security": [
{
"bearer_token": []
}
]
}
},
"/healthz": {
"get": {
"tags": [
"health"
],
"summary": "Liveness probe.",
"description": "Returns server status and version. Unauthenticated; safe to call from any\ncaller. Use this to confirm the server is reachable before invoking other\nendpoints.",
"operationId": "health",
"responses": {
"200": {
"description": "Server is healthy",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HealthOutput"
}
}
}
}
}
}
},
"/ingest": {
"post": { "post": {
"tags": [ "tags": [
"mutations" "mutations"
], ],
"summary": "**Deprecated** — use [`POST /load`](#tag/mutations/operation/load) instead.", "summary": "**Deprecated** — use [`POST /load`](#tag/mutations/operation/load) instead.",
"description": "Bulk-load NDJSON data into a branch. Behavior is unchanged; the route is\nkept indefinitely for back-compat. New integrations should target\n`POST /load`, which has identical semantics. Responses from this route\ninclude `Deprecation: true` and `Link: </load>; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the signal.", "description": "Bulk-load NDJSON data into a branch. Behavior is unchanged; the route is\nkept indefinitely for back-compat. New integrations should target\n`POST /load`, which has identical semantics. Responses from this route\ninclude `Deprecation: true` and `Link: <load>; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the signal.",
"operationId": "ingest", "operationId": "cluster_ingest",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": { "requestBody": {
"content": { "content": {
"application/json": { "application/json": {
@ -685,7 +756,7 @@
}, },
"responses": { "responses": {
"200": { "200": {
"description": "Load results (response includes `Deprecation: true` + `Link: </load>; rel=\"successor-version\"`)", "description": "Load results (response includes `Deprecation: true` + `Link: <load>; rel=\"successor-version\"`)",
"content": { "content": {
"application/json": { "application/json": {
"schema": { "schema": {
@ -743,14 +814,25 @@
] ]
} }
}, },
"/load": { "/graphs/{graph_id}/load": {
"post": { "post": {
"tags": [ "tags": [
"mutations" "mutations"
], ],
"summary": "Bulk-load NDJSON data into a branch (canonical load endpoint).", "summary": "Bulk-load NDJSON data into a branch (canonical load endpoint).",
"description": "`data` is NDJSON with one record per line. `mode` controls behavior on\nexisting rows: `merge` upserts by id (default), `append` blindly inserts,\n`overwrite` replaces table contents. Branch creation is opt-in by\npresence of `from`: with `from` set, a missing `branch` is created from\nit; without `from`, `branch` must already exist — a missing branch is a\n404, never an implicit fork. **Destructive** when `mode` is `overwrite`\nor when the load produces conflicting writes.\n\nThe legacy `POST /ingest` route has identical semantics and is kept as a\ndeprecated alias.", "description": "`data` is NDJSON with one record per line. `mode` controls behavior on\nexisting rows: `merge` upserts by id (default), `append` blindly inserts,\n`overwrite` replaces table contents. Branch creation is opt-in by\npresence of `from`: with `from` set, a missing `branch` is created from\nit; without `from`, `branch` must already exist — a missing branch is a\n404, never an implicit fork. **Destructive** when `mode` is `overwrite`\nor when the load produces conflicting writes.\n\nThe legacy `POST /ingest` route has identical semantics and is kept as a\ndeprecated alias.",
"operationId": "load", "operationId": "cluster_load",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": { "requestBody": {
"content": { "content": {
"application/json": { "application/json": {
@ -820,14 +902,25 @@
] ]
} }
}, },
"/mutate": { "/graphs/{graph_id}/mutate": {
"post": { "post": {
"tags": [ "tags": [
"mutations" "mutations"
], ],
"summary": "Apply a GQ mutation to a branch (canonical mutation endpoint).", "summary": "Apply a GQ mutation to a branch (canonical mutation endpoint).",
"description": "Writes to the named `branch` (defaults to `main`). Mutations are atomic\nper call and produce a new commit. Returns counts of nodes and edges\naffected. **Destructive**: on success the branch is updated; rejected\nmutations may still acquire locks briefly. Returns 409 on merge conflict.\n\nPairs with `POST /query` (read-only). The legacy `POST /change` route\nhas identical semantics and is kept as a deprecated alias.", "description": "Writes to the named `branch` (defaults to `main`). Mutations are atomic\nper call and produce a new commit. Returns counts of nodes and edges\naffected. **Destructive**: on success the branch is updated; rejected\nmutations may still acquire locks briefly. Returns 409 on merge conflict.\n\nPairs with `POST /query` (read-only). The legacy `POST /change` route\nhas identical semantics and is kept as a deprecated alias.",
"operationId": "mutate", "operationId": "cluster_mutate",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": { "requestBody": {
"content": { "content": {
"application/json": { "application/json": {
@ -907,14 +1000,25 @@
] ]
} }
}, },
"/queries": { "/graphs/{graph_id}/queries": {
"get": { "get": {
"tags": [ "tags": [
"queries" "queries"
], ],
"summary": "List the graph's exposed stored queries as a typed tool catalog.", "summary": "List the graph's exposed stored queries as a typed tool catalog.",
"description": "Returns the `mcp.expose == true` subset of the `queries:` registry, each\nwith its MCP tool name, read/mutate flag, description/instruction, and\ntyped parameters — enough for a client to register them as tools without\nfetching `.gq` source. Read-gated; the catalog is graph-wide (branch\nindependent — `read` is authorized against `main`). **Not** Cedar-filtered\nper query yet, so it can list a query whose `invoke_query` the caller\nlacks (a known gap until per-query authorization lands).", "description": "Returns the `mcp.expose == true` subset of the `queries:` registry, each\nwith its MCP tool name, read/mutate flag, description/instruction, and\ntyped parameters — enough for a client to register them as tools without\nfetching `.gq` source. Read-gated; the catalog is graph-wide (branch\nindependent — `read` is authorized against `main`). **Not** Cedar-filtered\nper query yet, so it can list a query whose `invoke_query` the caller\nlacks (a known gap until per-query authorization lands).",
"operationId": "list_queries", "operationId": "cluster_list_queries",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": { "responses": {
"200": { "200": {
"description": "Stored-query catalog (the mcp.expose subset, with typed params)", "description": "Stored-query catalog (the mcp.expose subset, with typed params)",
@ -954,15 +1058,24 @@
] ]
} }
}, },
"/queries/{name}": { "/graphs/{graph_id}/queries/{name}": {
"post": { "post": {
"tags": [ "tags": [
"queries" "queries"
], ],
"summary": "Invoke a curated, server-side stored query by name.", "summary": "Invoke a curated, server-side stored query by name.",
"description": "The query source comes from the graph's `queries:` registry, not the\nrequest body — callers send only runtime inputs (`params`, `branch`,\n`snapshot`). Gated by the `invoke_query` Cedar action at the boundary;\na stored *mutation* additionally passes the engine's `change` gate\n(double-gated). An actor **without** `invoke_query` cannot tell a denied\nquery from a missing one — both return the same 404, so the catalog\ncan't be probed without the grant. Once `invoke_query` is held, the\ninner `read`/`change` gate may surface a 403 for an existing query the\nactor can't run (the intended double-gate signal).", "description": "The query source comes from the graph's `queries:` registry, not the\nrequest body — callers send only runtime inputs (`params`, `branch`,\n`snapshot`). Gated by the `invoke_query` Cedar action at the boundary;\na stored *mutation* additionally passes the engine's `change` gate\n(double-gated). An actor **without** `invoke_query` cannot tell a denied\nquery from a missing one — both return the same 404, so the catalog\ncan't be probed without the grant. Once `invoke_query` is held, the\ninner `read`/`change` gate may surface a 403 for an existing query the\nactor can't run (the intended double-gate signal).",
"operationId": "invoke_query", "operationId": "cluster_invoke_query",
"parameters": [ "parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
},
{ {
"name": "name", "name": "name",
"in": "path", "in": "path",
@ -1078,14 +1191,25 @@
] ]
} }
}, },
"/query": { "/graphs/{graph_id}/query": {
"post": { "post": {
"tags": [ "tags": [
"queries" "queries"
], ],
"summary": "Execute an inline read query (friendlier-named alternative to `POST /read`).", "summary": "Execute an inline read query (friendlier-named alternative to `POST /read`).",
"description": "Designed for ad-hoc exploration and AI-agent tool-use: short field\nnames (`query`, `name`) match the CLI `-e` flag and the GQ `query`\nkeyword. Mutations (`insert`/`update`/`delete`) are rejected with 400\n-- use `POST /mutate` (or its deprecated alias `POST /change`) for\nwrite queries. Otherwise behaves identically to `POST /read`: same\ntarget semantics (branch xor snapshot), same Cedar action (Read),\nsame response shape.", "description": "Designed for ad-hoc exploration and AI-agent tool-use: short field\nnames (`query`, `name`) match the CLI `-e` flag and the GQ `query`\nkeyword. Mutations (`insert`/`update`/`delete`) are rejected with 400\n-- use `POST /mutate` (or its deprecated alias `POST /change`) for\nwrite queries. Otherwise behaves identically to `POST /read`: same\ntarget semantics (branch xor snapshot), same Cedar action (Read),\nsame response shape.",
"operationId": "query", "operationId": "cluster_query",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": { "requestBody": {
"content": { "content": {
"application/json": { "application/json": {
@ -1145,14 +1269,25 @@
] ]
} }
}, },
"/read": { "/graphs/{graph_id}/read": {
"post": { "post": {
"tags": [ "tags": [
"queries" "queries"
], ],
"summary": "**Deprecated** — use [`POST /query`](#tag/queries/operation/query) instead.", "summary": "**Deprecated** — use [`POST /query`](#tag/queries/operation/query) instead.",
"description": "Execute a GQ read query. Behavior is unchanged from prior releases; the\nroute is kept indefinitely for byte-stable back-compat. New integrations\nshould target `POST /query`, which has clean field names (`query` /\n`name`) and a 400-on-mutation guard. Responses from this route include\n`Deprecation: true` and `Link: </query>; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the\nsignal.", "description": "Execute a GQ read query. Behavior is unchanged from prior releases; the\nroute is kept indefinitely for byte-stable back-compat. New integrations\nshould target `POST /query`, which has clean field names (`query` /\n`name`) and a 400-on-mutation guard. Responses from this route include\n`Deprecation: true` and `Link: <query>; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the\nsignal.",
"operationId": "read", "operationId": "cluster_read",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": { "requestBody": {
"content": { "content": {
"application/json": { "application/json": {
@ -1165,7 +1300,7 @@
}, },
"responses": { "responses": {
"200": { "200": {
"description": "Query results (response includes `Deprecation: true` + `Link: </query>; rel=\"successor-version\"`)", "description": "Query results (response includes `Deprecation: true` + `Link: <query>; rel=\"successor-version\"`)",
"content": { "content": {
"application/json": { "application/json": {
"schema": { "schema": {
@ -1213,14 +1348,25 @@
] ]
} }
}, },
"/schema": { "/graphs/{graph_id}/schema": {
"get": { "get": {
"tags": [ "tags": [
"schema" "schema"
], ],
"summary": "Read the current schema source.", "summary": "Read the current schema source.",
"description": "Returns the project's schema as a single string in `.pg` source form.\nUseful for clients that want to introspect available types and tables\nbefore constructing GQ queries. Read-only.", "description": "Returns the project's schema as a single string in `.pg` source form.\nUseful for clients that want to introspect available types and tables\nbefore constructing GQ queries. Read-only.",
"operationId": "getSchema", "operationId": "cluster_getSchema",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": { "responses": {
"200": { "200": {
"description": "Current schema source", "description": "Current schema source",
@ -1260,14 +1406,25 @@
] ]
} }
}, },
"/schema/apply": { "/graphs/{graph_id}/schema/apply": {
"post": { "post": {
"tags": [ "tags": [
"mutations" "mutations"
], ],
"summary": "Apply a schema migration.", "summary": "Apply a schema migration.",
"description": "Diffs `schema_source` against the current schema and applies the resulting\nmigration steps (add/drop type, add/drop column, etc.). **Destructive**:\nsome steps drop data. Returns the list of steps applied; if `applied` is\nfalse the diff was unsupported and no changes were made.", "description": "Cluster-backed servers reject this route with `409 Conflict`; operators\nmust apply schema changes through `omnigraph cluster apply` and restart.\n\nDiffs `schema_source` against the current schema and applies the resulting\nmigration steps (add/drop type, add/drop column, etc.). **Destructive**:\nsome steps drop data. Returns the list of steps applied; if `applied` is\nfalse the diff was unsupported and no changes were made.",
"operationId": "applySchema", "operationId": "cluster_applySchema",
"parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
}
],
"requestBody": { "requestBody": {
"content": { "content": {
"application/json": { "application/json": {
@ -1319,6 +1476,16 @@
} }
} }
}, },
"409": {
"description": "Schema apply is disabled for cluster-backed serving; use `omnigraph cluster apply` and restart",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"429": { "429": {
"description": "Per-actor admission cap exceeded; honor `Retry-After` header", "description": "Per-actor admission cap exceeded; honor `Retry-After` header",
"content": { "content": {
@ -1337,15 +1504,24 @@
] ]
} }
}, },
"/snapshot": { "/graphs/{graph_id}/snapshot": {
"get": { "get": {
"tags": [ "tags": [
"snapshots" "snapshots"
], ],
"summary": "Read the current snapshot of a branch.", "summary": "Read the current snapshot of a branch.",
"description": "Returns the manifest version plus per-table metadata (path, version, row\ncount) for every table on the branch. Defaults to `main` when `branch` is\nomitted. Read-only.", "description": "Returns the manifest version plus per-table metadata (path, version, row\ncount) for every table on the branch. Defaults to `main` when `branch` is\nomitted. Read-only.",
"operationId": "getSnapshot", "operationId": "cluster_getSnapshot",
"parameters": [ "parameters": [
{
"name": "graph_id",
"in": "path",
"description": "Graph id to route the request to.",
"required": true,
"schema": {
"type": "string"
}
},
{ {
"name": "branch", "name": "branch",
"in": "query", "in": "query",
@ -1396,6 +1572,28 @@
} }
] ]
} }
},
"/healthz": {
"get": {
"tags": [
"health"
],
"summary": "Liveness probe.",
"description": "Returns server status and version. Unauthenticated; safe to call from any\ncaller. Use this to confirm the server is reachable before invoking other\nendpoints.",
"operationId": "health",
"responses": {
"200": {
"description": "Server is healthy",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/HealthOutput"
}
}
}
}
}
}
} }
}, },
"components": { "components": {
@ -1891,6 +2089,13 @@
], ],
"description": "Branch to run against. Defaults to `main`; for a stored mutation the\nwrite targets this branch." "description": "Branch to run against. Defaults to `main`; for a stored mutation the\nwrite targets this branch."
}, },
"expect_mutation": {
"type": [
"boolean",
"null"
],
"description": "The kind the caller expects (RFC-011 Decision 3): `Some(false)` for\n`omnigraph query <name>`, `Some(true)` for `omnigraph mutate <name>`.\nWhen set and it disagrees with the stored query's actual kind, the\nserver rejects the call (400) so the verb asserts the kind. `None`\n(the default) skips the check — preserving older clients and aliases."
},
"params": { "params": {
"description": "JSON object whose keys match the stored query's declared parameters." "description": "JSON object whose keys match the stored query's declared parameters."
}, },