diff --git a/AGENTS.md b/AGENTS.md
index 91e25ae..8894278 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -33,8 +33,8 @@ OmniGraph is a typed property-graph engine built as a coordination layer over ma
 - **Multi-modal querying**: vector ANN (`nearest`), full-text (`search`/`fuzzy`/`match_text`/`bm25`), Reciprocal Rank Fusion (`rrf`), and graph traversal (`Expand`, anti-join `not { … }`) in one runtime.
 - **Branches and commits across the whole graph**: Git-style — every successful publish appends to a commit DAG; merges are three-way at the row level.
 - **Atomic per-query writes**: `mutate_as` and `load` accumulate insert/update batches into an in-memory `MutationStaging.pending` per touched table; one `stage_*` + `commit_staged` per table runs at end-of-query, then `ManifestBatchPublisher::publish` commits the manifest atomically with per-table `expected_table_versions` CAS. A mid-query failure leaves Lance HEAD untouched on staged tables — no drift, no run state machine, no staging branches. Deletes still inline-commit; D₂ at parse time prevents inserts/updates and deletes from coexisting in one query.
-- **HTTP server**: Axum + utoipa OpenAPI, bearer auth (SHA-256 hashed, optional AWS Secrets Manager). Cedar policy enforcement is engine-wide — every `_as` writer calls `Omnigraph::enforce(action, scope, actor)`, so HTTP, CLI, and embedded SDK consumers all hit the same gate. **Two modes** (v0.6.0+): single-graph (legacy flat routes) and multi-graph (`/graphs/{graph_id}/...` cluster routes + read-only `GET /graphs` enumeration). Per-graph + server-level Cedar policies. Multi-graph mode boots from a cluster directory (`--cluster `, RFC-005) or the legacy `omnigraph.yaml` `graphs:` map. Runtime add/remove (`POST /graphs`, `DELETE /graphs/{id}`) is not exposed — operators run `cluster apply` (or edit the legacy file) and restart.
-- **CLI** with two-surface config (RFC-008): the team-owned cluster directory (`cluster.yaml`) plus the per-operator `~/.omnigraph/config.yaml` (servers, credentials, actor, aliases). The legacy combined `omnigraph.yaml` still loads with per-key deprecation warnings — `config migrate` proposes the split, `OMNIGRAPH_NO_LEGACY_CONFIG=1` enforces strict mode. **Never extend `omnigraph.yaml`.** Multi-format output (json/jsonl/csv/kv/table).
+- **HTTP server**: Axum + utoipa OpenAPI, bearer auth (SHA-256 hashed, optional AWS Secrets Manager). Cedar policy enforcement is engine-wide — every `_as` writer calls `Omnigraph::enforce(action, scope, actor)`, so HTTP, CLI, and embedded SDK consumers all hit the same gate. **Cluster-only boot** (RFC-011): the server always boots from a cluster directory (`--cluster `, RFC-005) and serves N graphs (N ≥ 1) under multi-graph routes (`/graphs/{graph_id}/...` + read-only `GET /graphs` enumeration); there are no single-graph flat routes and no positional-URI boot. Per-graph + server-level Cedar policies. Runtime add/remove (`POST /graphs`, `DELETE /graphs/{id}`) is not exposed — operators run `cluster apply` and restart.
+- **CLI** with two-surface config (RFC-007/008): the team-owned cluster directory (`cluster.yaml`) plus the per-operator `~/.omnigraph/config.yaml` (servers, clusters, credentials, actor, profiles, aliases, defaults). Graphs are addressed via `--store`/`--server`/`--cluster`/`--profile`/operator defaults (RFC-011). Multi-format output (json/jsonl/csv/kv/table).
 Throughout the docs, capabilities are split into **L1 — Inherited from Lance** vs **L2 — Added by OmniGraph**.
@@ -96,7 +96,7 @@ Full diagram and concurrency model: [docs/dev/architecture.md](docs/dev/architec
 | Cedar policy actions, scopes, CLI | [docs/user/operations/policy.md](docs/user/operations/policy.md) |
 | HTTP server endpoints, auth, error model, body limits | [docs/user/operations/server.md](docs/user/operations/server.md) |
 | CLI quick-start | [docs/user/cli/index.md](docs/user/cli/index.md) |
-| CLI command surface and config schemas (`~/.omnigraph/config.yaml`, legacy `omnigraph.yaml`) | [docs/user/cli/reference.md](docs/user/cli/reference.md) |
+| CLI command surface and config schema (`~/.omnigraph/config.yaml`) | [docs/user/cli/reference.md](docs/user/cli/reference.md) |
 | Audit / actor tracking | [docs/user/operations/audit.md](docs/user/operations/audit.md) |
 | Error taxonomy and result serialization | [docs/user/operations/errors.md](docs/user/operations/errors.md) |
 | Install (binary / Homebrew / source / channels) | [docs/user/install.md](docs/user/install.md) |
@@ -144,6 +144,7 @@ These are architectural rules that need to be in scope on every change. They're
 4. **Bearer-token plaintext never persists in process memory.** Tokens are hashed at startup; auth uses constant-time comparison; the actor id is server-resolved from the hash match and must not be settable by the client.
 5. **Reads always see the current index state for the branch they're reading.** Indexes track the branch head, not historical snapshots. If you change index lifecycle, preserve this guarantee.
 6. **Stable type IDs survive renames.** Schema migration relies on identity that's stable across rename — don't mint new IDs on rename.
+7. **Logical contract over physical state.** Physical state (index coverage, fragment layout, compaction versions, staged writes) is derived and rebuildable; it must never fail a logical operation. Check preconditions against logical state and let reconciliation converge the physical state idempotently — genuine logical conflicts still fail loudly. This is the rule rules 1–6 instantiate; full statement and applications in [docs/dev/invariants.md](docs/dev/invariants.md).
 ### Deny-list (fast-pass review filter — full reasoning in [docs/dev/invariants.md](docs/dev/invariants.md))
@@ -179,7 +180,7 @@ Rust stable workspace (edition 2024). `protoc` is a build dependency (`brew inst
 cargo build --workspace --locked # build everything
 cargo test --workspace --locked # the canonical CI gate (matches CI exactly)
 cargo run -p omnigraph-cli -- # run the `omnigraph` CLI from source
-cargo run -p omnigraph-server -- --bind 0.0.0.0:8080 # run the server from source
+cargo run -p omnigraph-server -- --cluster --bind 0.0.0.0:8080 # run the server from source
 # Run one crate / one test file / one test fn
 cargo test -p omnigraph-engine --test traversal # one integration-test file (see docs/dev/testing.md)
@@ -231,10 +232,10 @@ omnigraph cleanup --keep 10 --older-than 7d --confirm s3://my-bucket/graph.omni
 # Stand up the HTTP server (token from env)
 OMNIGRAPH_SERVER_BEARER_TOKEN=xxxx \
-  omnigraph-server s3://my-bucket/graph.omni --bind 0.0.0.0:8080
+  omnigraph-server --cluster s3://my-bucket/cluster --bind 0.0.0.0:8080
 # Cedar policy explain
-omnigraph policy explain --actor act-alice --action change --branch main
+omnigraph policy explain --cluster ./company-brain --graph knowledge --actor act-alice --action change --branch main
 ```
 ---
@@ -250,7 +251,7 @@ omnigraph policy explain --actor act-alice --action change --branch main
 | Compaction (`compact_files`) + reindex (`optimize_indices`) | ✅ | `omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency; per table runs `compact_files` **then Lance `optimize_indices`** (folds appended/rewritten fragments back into existing indexes — incremental merge, not retrain) and **publishes the resulting version to `__manifest`** (so the manifest tracks the Lance HEAD — required for reads to observe the work and for schema apply / strict writes to pass their HEAD-vs-manifest precondition), under the per-`(table, main)` write queue with `SidecarKind::Optimize` recovery coverage spanning both ops; **commits even with no compaction work if index coverage is stale**; **refuses on an unrecovered graph**; **skips uncovered HEAD > manifest drift** with `DriftNeedsRepair`; **skips blob-bearing tables** (reported via `TableOptimizeStats.skipped`, not silent; reindex is skipped for them too today), gated on `LANCE_SUPPORTS_BLOB_COMPACTION` until the upstream blob-v2 compaction-decode bug is fixed (see [docs/dev/invariants.md](docs/dev/invariants.md) Known Gaps) |
 | Repair uncovered drift | — | `omnigraph repair` explicitly classifies uncovered table `HEAD > manifest` drift: verified maintenance drift (`ReserveFragments`/`Rewrite`) can be published with `--confirm`; suspicious or unverifiable drift requires `--force --confirm`. Sidecar-covered crash residuals still recover automatically on open. |
 | Cleanup (`cleanup_old_versions`) | ✅ | `omnigraph cleanup` with `--keep` / `--older-than` policy |
-| BTREE / inverted (FTS) / vector indexes | ✅ | `ensure_indices` builds them per `@index`/`@key` column, dispatched by type via `node_prop_index_kind` (enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector); idempotent; lazy across branches. Coverage of fragments appended after build is restored by `optimize`'s `optimize_indices` pass (see Compaction row). |
+| BTREE / inverted (FTS) / vector indexes | ✅ | `@index`/`@key` declares intent; the physical index is derived state that never fails a logical op. Built per column through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector); idempotent; lazy across branches. **Schema apply builds nothing** (records intent only); `load`/`mutate` build inline but **defer an untrainable Vector column** (no trainable vectors yet) as *pending* rather than aborting. `ensure_indices`/`optimize` is the reconciler that materializes declared-but-missing indexes and restores coverage of appended/rewritten fragments (`optimize_indices`), reporting still-pending columns (see Compaction row). |
 | `merge_insert` upsert | ✅ | `LoadMode::Merge`, mutation `update`/`insert`/`delete` lowering |
 | Vector search | ✅ | `nearest()` query op; embedding pipeline (Gemini / OpenAI clients); `@embed` in schema |
 | Full-text search | ✅ | `search/fuzzy/match_text/bm25` query ops |
@@ -264,8 +265,8 @@ omnigraph policy explain --actor act-alice --action change --branch main
 | Three-way row-level merge | — | `OrderedTableCursor` + `StagedTableWriter`, structured `MergeConflictKind` |
 | Change feeds | — | `diff_between` / `diff_commits` with manifest fast path + ID streaming |
 | Cedar policy | — | Per-graph actions plus server-scoped actions (see [docs/user/operations/policy.md](docs/user/operations/policy.md) for the current list), branch / target_branch / protected scopes, validate/test/explain CLI. **Engine-wide enforcement** (MR-722): every `_as` writer (`apply_schema_as`, `mutate_as`, `load_as` — the deprecated `ingest_as` shims route through it — `branch_create_as` / `branch_create_from_as`, `branch_delete_as`, `branch_merge_as`) calls `Omnigraph::enforce(action, scope, actor)` — HTTP, CLI, embedded SDK all hit the same gate. |
-| HTTP server | — | Axum, OpenAPI via utoipa, bearer auth (SHA-256, AWS Secrets Manager option), `authorize_request` at the HTTP boundary (resolves bearer→actor, applies admission control), NDJSON streaming export, **multi-graph mode (v0.6.0+) with cluster routes + read-only `GET /graphs` enumeration + per-graph + server-level Cedar policies. Multi-graph boots from a cluster directory (`--cluster`) or the legacy `omnigraph.yaml`; add/remove graphs via `cluster apply` (or by editing the legacy file) and restarting.** |
-| CLI with config | — | two-surface config (team `cluster.yaml` dir + per-operator `~/.omnigraph/config.yaml`; legacy `omnigraph.yaml` deprecated per RFC-008), aliases, multi-format output (json/jsonl/csv/kv/table) |
+| HTTP server | — | Axum, OpenAPI via utoipa, bearer auth (SHA-256, AWS Secrets Manager option), `authorize_request` at the HTTP boundary (resolves bearer→actor, applies admission control), NDJSON streaming export, **cluster-only boot (RFC-011): always `--cluster `, serving N graphs (N ≥ 1) under multi-graph routes + read-only `GET /graphs` enumeration + per-graph + server-level Cedar policies. Add/remove graphs via `cluster apply` and restart.** |
+| CLI with config | — | two-surface config (team `cluster.yaml` dir + per-operator `~/.omnigraph/config.yaml`), scope addressing (`--store`/`--server`/`--cluster`/`--profile`/defaults, RFC-011), aliases, multi-format output (json/jsonl/csv/kv/table) |
 | Audit / actor tracking | — | `_as` write APIs + actor map in commit graph |
 | Local RustFS bootstrap | — | `scripts/local-rustfs-bootstrap.sh` one-shot S3-backed dev environment |
diff --git a/README.md b/README.md
index a75a839..35513a6 100644
--- a/README.md
+++ b/README.md
@@ -7,7 +7,8 @@
 **Lakehouse native graph engine built for context assembly**
-Omnigraph acts as operational state & coordination layer for agents
+Omnigraph acts as operational state & coordination layer for agents.
+Hundreds of agents can enrich the graph on parallel isolated branches and changes can be reviewed and merged safely.
 - Git-style versioning & branching
 - Multimodal retrieval (graph+vector/fts+filters) optimized for context assembly
diff --git a/crates/omnigraph-api-types/src/lib.rs b/crates/omnigraph-api-types/src/lib.rs
index 910d86b..2814602 100644
--- a/crates/omnigraph-api-types/src/lib.rs
+++ b/crates/omnigraph-api-types/src/lib.rs
@@ -325,6 +325,13 @@ pub struct InvokeStoredQueryRequest {
     /// mutation). Mutually exclusive with `branch`.
     #[serde(default)]
     pub snapshot: Option,
+    /// The kind the caller expects (RFC-011 Decision 3): `Some(false)` for
+    /// `omnigraph query `, `Some(true)` for `omnigraph mutate `.
+    /// When set and it disagrees with the stored query's actual kind, the
+    /// server rejects the call (400) so the verb asserts the kind. `None`
+    /// (the default) skips the check — preserving older clients and aliases.
+    #[serde(default)]
+    pub expect_mutation: Option,
 }
 /// Response for `POST /queries/{name}`: the read envelope for a stored
diff --git a/crates/omnigraph-cli/src/cli.rs b/crates/omnigraph-cli/src/cli.rs
index ec0da08..94bec5a 100644
--- a/crates/omnigraph-cli/src/cli.rs
+++ b/crates/omnigraph-cli/src/cli.rs
@@ -18,10 +18,10 @@
 any — run against a graph, served (--server / --profile) or embedded (--store URI): query, mutate, load, branch, snapshot, export, commit, schema show/apply.\n \
 served — require a server: graphs.\n \
 direct — direct storage access; reject --server (init, optimize, repair, cleanup, \
-schema plan, lint, queries validate).\n \
-control — manage a cluster via --config: cluster.\n \
-local — no graph; local config & tooling: policy, embed, login, logout, config, \
-version, queries list.\n\
+schema plan, lint).\n \
+control — manage or inspect a cluster (cluster via --config; policy & queries via \
+--cluster).\n \
+local — no explicit graph scope; local config & tooling: alias, embed, login, logout, profile, version.\n\
 See the 'Command capabilities' section of the CLI reference for which flags apply where.")]
 pub(crate) struct Cli {
     /// Actor id for direct-engine writes; overrides `cli.actor`. No effect on
@@ -37,9 +37,11 @@ pub(crate) struct Cli {
     #[arg(long, global = true, value_name = "NAME|URL")]
     pub(crate) server: Option,
-    /// Graph id on a multi-graph `--server` (appends `/graphs/` to
-    /// the server url). Requires --server.
-    #[arg(long, global = true, value_name = "GRAPH_ID", requires = "server")]
+    /// Select a graph within a multi-graph scope: on a `--server` it appends
+    /// `/graphs/` to the server url; on a `--cluster` it picks which
+    /// cluster graph to maintain. Rejected on a single-graph address (a
+    /// positional URI / `--store`).
+    #[arg(long, global = true, value_name = "GRAPH_ID")]
     pub(crate) graph: Option,
     /// Select a named scope bundle (RFC-011) from `profiles:` in
@@ -56,6 +58,26 @@ pub(crate) struct Cli {
     #[arg(long, global = true, value_name = "URI")]
     pub(crate) store: Option,
+    /// Address a cluster-managed graph's storage for maintenance (RFC-011):
+    /// a cluster directory or storage-root URI — named via `clusters:` in
+    /// ~/.omnigraph/config.yaml, or a literal `file://`/`s3://` root. Pair
+    /// with `--graph ` to select the graph. Used by optimize / repair /
+    /// cleanup; exclusive with a positional URI / `--store` / `--server`.
+    #[arg(long, global = true, value_name = "DIR|URI")]
+    pub(crate) cluster: Option,
+
+    /// Skip the confirmation prompt for a destructive write (`cleanup`,
+    /// overwrite `load`, `branch delete`) against a non-local scope (RFC-011
+    /// Decision 9). Without it, a non-local destructive write prompts on a TTY
+    /// and refuses (errors) when there is no TTY or `--json` is set.
+    #[arg(long, global = true)]
+    pub(crate) yes: bool,
+
+    /// Suppress the one-line resolved-write-target diagnostic that write
+    /// commands echo to stderr (RFC-011 Decision 9).
+    #[arg(long, global = true)]
+    pub(crate) quiet: bool,
+
     #[command(subcommand)]
     pub(crate) command: Command,
 }
@@ -70,22 +92,16 @@ pub(crate) enum Command {
     /// when used. Pairs with `omnigraph mutate` on the write side.
     #[command(visible_alias = "read")]
     Query {
-        /// Graph URI
-        #[arg(long)]
-        uri: Option,
-        #[arg(hide = true)]
-        legacy_uri: Option,
-        #[arg(long)]
-        config: Option,
-        #[arg(long, conflicts_with_all = ["query", "query_string"])]
-        alias: Option,
-        #[arg(long, conflicts_with_all = ["alias", "query_string"])]
-        query: Option,
-        /// Inline GQ source — alternative to `--query ` and `--alias `.
-        #[arg(short = 'e', long = "query-string", value_name = "GQ", conflicts_with_all = ["query", "alias"])]
-        query_string: Option,
-        #[arg(long)]
+        /// Query name. With no `--query`/`-e`, the stored query to invoke from
+        /// the catalog (served — addressed via --server/--profile). With
+        /// `--query`/`-e`, selects which query in that ad-hoc source to run.
         name: Option,
+        /// Ad-hoc query file (a `.gq` you're authoring / break-glass).
+        #[arg(long, conflicts_with = "query_string")]
+        query: Option,
+        /// Inline ad-hoc GQ source — alternative to `--query `.
+        #[arg(short = 'e', long = "query-string", value_name = "GQ", conflicts_with = "query")]
+        query_string: Option,
         #[command(flatten)]
         params: ParamsArgs,
         #[arg(long, conflicts_with = "snapshot")]
@@ -96,8 +112,6 @@ pub(crate) enum Command {
         format: Option,
         #[arg(long, conflicts_with = "format")]
         json: bool,
-        #[arg()]
-        alias_args: Vec,
     },
     /// Execute a graph mutation query against a branch.
     ///
@@ -106,38 +120,48 @@ pub(crate) enum Command {
     /// warning when used. Pairs with `omnigraph query` on the read side.
     #[command(visible_alias = "change")]
     Mutate {
-        /// Graph URI
-        #[arg(long)]
-        uri: Option,
-        #[arg(hide = true)]
-        legacy_uri: Option,
-        #[arg(long)]
-        config: Option,
-        #[arg(long, conflicts_with_all = ["query", "query_string"])]
-        alias: Option,
-        #[arg(long, conflicts_with_all = ["alias", "query_string"])]
-        query: Option,
-        /// Inline GQ source — alternative to `--query ` and `--alias `.
-        #[arg(short = 'e', long = "query-string", value_name = "GQ", conflicts_with_all = ["query", "alias"])]
-        query_string: Option,
-        #[arg(long)]
+        /// Query name. With no `--query`/`-e`, the stored mutation to invoke
+        /// from the catalog (served — addressed via --server/--profile). With
+        /// `--query`/`-e`, selects which query in that ad-hoc source to run.
         name: Option,
+        /// Ad-hoc mutation file (a `.gq` you're authoring / break-glass).
+        #[arg(long, conflicts_with = "query_string")]
+        query: Option,
+        /// Inline ad-hoc GQ source — alternative to `--query `.
+        #[arg(short = 'e', long = "query-string", value_name = "GQ", conflicts_with = "query")]
+        query_string: Option,
         #[command(flatten)]
         params: ParamsArgs,
         #[arg(long)]
         branch: Option,
         #[arg(long)]
         json: bool,
-        #[arg()]
-        alias_args: Vec,
+    },
+    /// Invoke an operator alias (RFC-011 Decision 4).
+    ///
+    /// An alias is a personal binding under `aliases:` in
+    /// ~/.omnigraph/config.yaml — name → (server, graph, stored-query name,
+    /// default params). `omnigraph alias [args]` invokes the bound
+    /// stored query on its server. Living in its own namespace, an alias can
+    /// never shadow or be shadowed by a built-in verb. Replaces the removed
+    /// `--alias` flag on `query`/`mutate`.
+    Alias {
+        /// Alias name (a key under `aliases:` in ~/.omnigraph/config.yaml).
+        name: String,
+        /// Positional args bound to the alias's declared `args` params, in order.
+        args: Vec,
+        #[command(flatten)]
+        params: ParamsArgs,
+        #[arg(long, conflicts_with = "json")]
+        format: Option,
+        #[arg(long, conflicts_with = "format")]
+        json: bool,
     },
     /// Load data into a graph (local or remote)
     Load {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         data: PathBuf,
         /// Target branch (defaults to main). Without --from it must exist.
         #[arg(long)]
@@ -159,8 +183,6 @@ pub(crate) enum Command {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         data: PathBuf,
         #[arg(long)]
         branch: Option,
@@ -181,8 +203,6 @@ pub(crate) enum Command {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         branch: Option,
         #[arg(long)]
         json: bool,
@@ -192,8 +212,6 @@ pub(crate) enum Command {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         branch: Option,
         #[arg(long, hide = true)]
         jsonl: bool,
@@ -238,30 +256,12 @@ pub(crate) enum Command {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        /// Cluster directory or storage-root URI; with --cluster-graph, resolves
-        /// the graph's storage URI from the served cluster state.
-        #[arg(long, conflicts_with = "uri", requires = "cluster_graph")]
-        cluster: Option,
-        /// Graph id within --cluster.
-        #[arg(long, requires = "cluster")]
-        cluster_graph: Option,
-        #[arg(long)]
         json: bool,
     },
     /// Classify and explicitly repair manifest/head drift
     Repair {
         /// Graph URI
         uri: Option,
-        #[arg(long)]
-        config: Option,
-        /// Cluster directory or storage-root URI; with --cluster-graph, resolves
-        /// the graph's storage URI from the served cluster state.
-        #[arg(long, conflicts_with = "uri", requires = "cluster_graph")]
-        cluster: Option,
-        /// Graph id within --cluster.
-        #[arg(long, requires = "cluster")]
-        cluster_graph: Option,
         /// Publish verified maintenance drift. Without this flag, repair only
         /// previews what it would do.
         #[arg(long)]
@@ -277,15 +277,6 @@ pub(crate) enum Command {
     Cleanup {
         /// Graph URI
         uri: Option,
-        #[arg(long)]
-        config: Option,
-        /// Cluster directory or storage-root URI; with --cluster-graph, resolves
-        /// the graph's storage URI from the served cluster state.
-        #[arg(long, conflicts_with = "uri", requires = "cluster_graph")]
-        cluster: Option,
-        /// Graph id within --cluster.
-        #[arg(long, requires = "cluster")]
-        cluster_graph: Option,
         /// Number of recent versions to keep per table. Either `--keep` or
         /// `--older-than` (or both) must be set.
         #[arg(long)]
@@ -315,8 +306,6 @@ pub(crate) enum Command {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         query: PathBuf,
         #[arg(long)]
         schema: Option,
@@ -336,8 +325,7 @@ pub(crate) enum Command {
         command: ClusterCommand,
     },
-    // ── Session / config ── no graph addressing; local tooling.
-    /// Policy administration and diagnostics
+    /// Policy administration and diagnostics against a cluster's applied bundles
     Policy {
         #[command(subcommand)]
         command: PolicyCommand,
@@ -363,16 +351,32 @@ pub(crate) enum Command {
         #[arg(long)]
         json: bool,
     },
-    /// Legacy-config tooling (RFC-008): split omnigraph.yaml into its
-    /// two destinations.
-    Config {
+    /// Inspect the scope profiles in ~/.omnigraph/config.yaml (read-only).
+    Profile {
         #[command(subcommand)]
-        command: ConfigCommand,
+        command: ProfileCommand,
     },
     /// Print the CLI version
     Version,
 }
+#[derive(Debug, Subcommand)]
+pub(crate) enum ProfileCommand {
+    /// List the profiles defined in ~/.omnigraph/config.yaml.
+    List {
+        #[arg(long)]
+        json: bool,
+    },
+    /// Show a profile's resolved scope. With no name, shows the active
+    /// (`$OMNIGRAPH_PROFILE`) profile, else the flat operator defaults.
+    Show {
+        /// Profile name (optional).
+        name: Option,
+        #[arg(long)]
+        json: bool,
+    },
+}
+
 #[derive(Debug, Subcommand)]
 pub(crate) enum ClusterCommand {
     /// Validate cluster.yaml and referenced schemas, queries, and policy files.
@@ -469,8 +473,6 @@ pub(crate) enum GraphsCommand {
         #[arg(long)]
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         json: bool,
     },
 }
@@ -483,8 +485,6 @@ pub(crate) enum BranchCommand {
         #[arg(long)]
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         from: Option,
         name: String,
         #[arg(long)]
@@ -496,8 +496,6 @@ pub(crate) enum BranchCommand {
         #[arg(long)]
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         json: bool,
     },
     /// Delete a branch
@@ -505,8 +503,6 @@ pub(crate) enum BranchCommand {
         /// Graph URI
         #[arg(long)]
         uri: Option,
-        #[arg(long)]
-        config: Option,
         name: String,
         #[arg(long)]
         json: bool,
@@ -516,8 +512,6 @@ pub(crate) enum BranchCommand {
         /// Graph URI
         #[arg(long)]
         uri: Option,
-        #[arg(long)]
-        config: Option,
         source: String,
         #[arg(long)]
         into: Option,
@@ -533,8 +527,6 @@ pub(crate) enum SchemaCommand {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         schema: PathBuf,
         #[arg(long)]
         json: bool,
@@ -549,8 +541,6 @@ pub(crate) enum SchemaCommand {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         schema: PathBuf,
         #[arg(long)]
         json: bool,
@@ -572,8 +562,6 @@ pub(crate) enum SchemaCommand {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         json: bool,
     },
 }
@@ -586,8 +574,6 @@ pub(crate) enum CommitCommand {
         /// Graph URI
         uri: Option,
         #[arg(long)]
-        config: Option,
-        #[arg(long)]
         branch: Option,
         #[arg(long)]
         json: bool,
@@ -597,8 +583,6 @@ pub(crate) enum CommitCommand {
         /// Graph URI
         #[arg(long)]
         uri: Option,
-        #[arg(long)]
-        config: Option,
         commit_id: String,
         #[arg(long)]
         json: bool,
@@ -607,20 +591,24 @@ pub(crate) enum CommitCommand {
 #[derive(Debug, Subcommand)]
 pub(crate) enum PolicyCommand {
-    /// Validate policy YAML and compiled Cedar policy state
-    Validate {
-        #[arg(long)]
-        config: Option,
-    },
-    /// Run declarative policy tests from policy.tests.yaml
+    /// Compile and validate the Cedar policy bundle(s) applied in a cluster.
+    ///
+    /// Sources the bundle(s) from the cluster's applied policies
+    /// (`--cluster `); pass the global `--graph ` to pick one
+    /// graph's bundle when several apply.
+    Validate {},
+    /// Run declarative policy tests against a cluster's applied bundle.
+    ///
+    /// The cluster model has no per-bundle tests file, so the cases are
+    /// supplied explicitly with `--tests ` and checked against the
+    /// bundle selected by `--cluster` (+ optional `--graph`).
     Test {
+        /// Path to a policy.tests.yaml file.
         #[arg(long)]
-        config: Option,
+        tests: PathBuf,
     },
-    /// Explain one policy decision locally
+    /// Explain one policy decision against a cluster's applied bundle.
     Explain {
-        #[arg(long)]
-        config: Option,
         #[arg(long)]
         actor: String,
         #[arg(long)]
@@ -634,24 +622,19 @@ pub(crate) enum PolicyCommand {
 #[derive(Debug, Subcommand)]
 pub(crate) enum QueriesCommand {
-    /// Type-check the stored-query registry against the live schema.
+    /// Type-check a cluster's stored-query registry against its schemas.
     ///
-    /// Distinct from `omnigraph lint` (which lints one `.gq` file):
-    /// this validates the whole `queries:` registry — opening the graph
-    /// to read its schema and confirming every stored query still
-    /// type-checks. Exits non-zero on any breakage.
+    /// Distinct from `omnigraph lint` (which lints one `.gq` file): this
+    /// validates the whole `queries:` registry of a cluster (`--cluster
+    /// `, optional `--graph `) by reading each graph's applied
+    /// schema and confirming every stored query still type-checks. Exits
+    /// non-zero on any breakage.
     Validate {
-        /// Graph URI
-        uri: Option,
-        #[arg(long)]
-        config: Option,
         #[arg(long)]
         json: bool,
     },
-    /// List the registered stored queries (name, MCP exposure, params).
+    /// List a cluster's registered stored queries (name, params).
     List {
-        #[arg(long)]
-        config: Option,
         #[arg(long)]
         json: bool,
     },
@@ -682,7 +665,6 @@ impl From for LoadMode {
         }
     }
 }
-
 impl CliLoadMode {
     pub(crate) fn as_str(self) -> &'static str {
         match self {
@@ -692,21 +674,3 @@ impl CliLoadMode {
         }
     }
 }
-
-#[derive(Debug, Subcommand)]
-pub(crate) enum ConfigCommand {
-    /// Propose (and with --write, apply) the RFC-008 split of a legacy
-    /// omnigraph.yaml: team half -> a ready-to-review cluster.yaml,
-    /// personal half -> ~/.omnigraph/config.yaml (key-level merge,
-    /// existing entries always win). Touches nothing without --write.
-    Migrate {
-        /// Path to the legacy omnigraph.yaml (default: ./omnigraph.yaml)
-        #[arg(long)]
-        config: Option,
-        /// Apply the split instead of only printing it
-        #[arg(long)]
-        write: bool,
-        #[arg(long)]
-        json: bool,
-    },
-}
diff --git a/crates/omnigraph-cli/src/client.rs b/crates/omnigraph-cli/src/client.rs
index 5c427f2..7151f5e 100644
--- a/crates/omnigraph-cli/src/client.rs
+++ b/crates/omnigraph-cli/src/client.rs
@@ -29,7 +29,8 @@ use omnigraph::db::{Omnigraph, ReadTarget};
 use omnigraph_api_types::{
     BranchCreateOutput, BranchCreateRequest, BranchDeleteOutput, BranchListOutput,
     BranchMergeOutput, BranchMergeRequest, ChangeOutput, CommitListOutput, CommitOutput,
-    ErrorOutput, ExportRequest, GraphListResponse, IngestOutput, IngestRequest, ReadOutput,
+    ErrorOutput, ExportRequest, GraphListResponse, IngestOutput, IngestRequest,
+    InvokeStoredQueryRequest, ReadOutput,
     ReadRequest, SchemaApplyOutput, SchemaApplyRequest, SchemaOutput, SnapshotOutput,
     commit_output, ingest_output, read_output, schema_apply_output, snapshot_payload,
 };
@@ -39,22 +40,20 @@ use serde_json::Value;
 use crate::cli::CliLoadMode;
 use crate::helpers::{
-    ResolvedCliGraph, apply_bearer_token, apply_server_flag, build_http_client, is_remote_uri,
-    legacy_change_request_body, open_local_db_with_policy, query_params_from_json,
+    apply_bearer_token, apply_server_flag, build_http_client, is_remote_uri,
+    legacy_change_request_body, query_params_from_json,
     remote_json, remote_url, resolve_cli_actor, resolve_cli_graph, resolve_remote_bearer_token,
-    select_named_query,
+    resolve_server_flag, select_named_query,
 };
 use crate::output::{LoadOutput, load_output_from_result, load_output_from_tables};
-use omnigraph_server::config::OmnigraphConfig;
 pub(crate) enum GraphClient {
-    /// Local engine at `uri`. Reads (`resolve()`) leave `graph`/`actor`
-    /// empty and open without policy; writes (`resolve_with_policy()`)
-    /// fill them, opening through `open_local_db_with_policy` and
-    /// attributing the resolved actor.
+    /// Local engine at `uri`. Reads (`resolve()`) leave `actor` empty;
+    /// writes (`resolve_with_policy()`) attribute the resolved actor.
+    /// Direct-store access carries no Cedar policy (RFC-011: policy lives
+    /// in the cluster/server, not in per-operator addressing).
     Embedded {
         uri: String,
-        graph: Option,
         actor: Option,
     },
     /// Remote HTTP server. The actor is resolved server-side from the
@@ -66,6 +65,43 @@ pub(crate) enum GraphClient {
     },
 }
+/// RFC-011 Decision 7: a server scope that selects no graph (no `--graph`, no
+/// `default_graph`) must not silently fall through to the bare server URL when
+/// the server is multi-graph. Best-effort probe `GET /graphs`: a populated list
+/// forces `--graph` (listing the candidates); a single-graph/flat server (405),
+/// a policy-gated `/graphs`, or an unreachable server all proceed — the bare URL
+/// is then correct, or the real request surfaces the failure. Only fires on the
+/// no-graph path, so a `--graph`/`default_graph` happy path does no extra I/O.
+async fn require_graph_for_multi_graph_server(
+    scope: &crate::scope::ResolvedScope,
+) -> Result<()> {
+    let (Some(server), None) = (scope.server.as_deref(), scope.graph.as_deref()) else {
+        return Ok(());
+    };
+    let Some(base) = resolve_server_flag(Some(server), None)? else {
+        return Ok(());
+    };
+    let token = resolve_remote_bearer_token(Some(&base))?;
+    let probe = GraphClient::Remote {
+        http: build_http_client()?,
+        base_url: base,
+        token,
+    };
+    if let Ok(resp) = probe.list_graphs().await {
+        if !resp.graphs.is_empty() {
+            let ids: Vec<&str> = resp.graphs.iter().map(|g| g.graph_id.as_str()).collect();
+            bail!(
+                "server scope '{server}' has {} {}: [{}]; pass --graph to select one \
+                 (or set `default_graph` in your operator config)",
+                ids.len(),
+                if ids.len() == 1 { "graph" } else { "graphs" },
+                ids.join(", ")
+            );
+        }
+    }
+    Ok(())
+}
+
 /// A remote graph must be addressed with `--server` (RFC-011): a positional or
 /// `--uri` `http(s)://` URL no longer auto-dispatches to a server. A remote URL
 /// produced by a server scope (`via_server`) is fine.
@@ -86,8 +122,7 @@ impl GraphClient {
     /// fork. Mirrors the read verbs' current preamble (`resolve_uri`
     /// path, not the policy-bearing `resolve_cli_graph`). Used by reads
     /// and `query` (which opens without policy, like the reads).
- pub(crate) fn resolve( - config: &OmnigraphConfig, + pub(crate) async fn resolve( server: Option<&str>, graph: Option<&str>, uri: Option, @@ -100,8 +135,9 @@ impl GraphClient { let scope = crate::scope::resolve_scope( &crate::operator::load_operator_config()?, crate::planes::Capability::Any, - crate::scope::ScopeFlags { profile, store, server, graph, uri }, + crate::scope::ScopeFlags { profile, store, server, cluster: None, graph, uri }, )?; + require_graph_for_multi_graph_server(&scope).await?; let (server, graph, uri) = ( scope.server.as_deref(), scope.graph.as_deref(), @@ -109,8 +145,8 @@ impl GraphClient { ); let via_server = server.is_some(); let uri = apply_server_flag(server, graph, uri)?; - let token = resolve_remote_bearer_token(config, uri.as_deref())?; - let uri = crate::helpers::resolve_uri(config, uri)?; + let token = resolve_remote_bearer_token(uri.as_deref())?; + let uri = crate::helpers::resolve_uri(uri)?; reject_positional_remote(via_server, &uri)?; if is_remote_uri(&uri) { Ok(GraphClient::Remote { @@ -119,11 +155,7 @@ impl GraphClient { token, }) } else { - Ok(GraphClient::Embedded { - uri, - graph: None, - actor: None, - }) + Ok(GraphClient::Embedded { uri, actor: None }) } } @@ -133,8 +165,7 @@ impl GraphClient { /// resolved up front. The embedded arm then opens WITH policy. The /// resolution order matches the write arms exactly: server flag → /// bearer token → graph. 
- pub(crate) fn resolve_with_policy( - config: &OmnigraphConfig, + pub(crate) async fn resolve_with_policy( server: Option<&str>, graph: Option<&str>, uri: Option, @@ -147,8 +178,9 @@ impl GraphClient { let scope = crate::scope::resolve_scope( &crate::operator::load_operator_config()?, crate::planes::Capability::Any, - crate::scope::ScopeFlags { profile, store, server, graph, uri }, + crate::scope::ScopeFlags { profile, store, server, cluster: None, graph, uri }, )?; + require_graph_for_multi_graph_server(&scope).await?; let (server, graph, uri) = ( scope.server.as_deref(), scope.graph.as_deref(), @@ -156,8 +188,8 @@ impl GraphClient { ); let via_server = server.is_some(); let uri = apply_server_flag(server, graph, uri)?; - let token = resolve_remote_bearer_token(config, uri.as_deref())?; - let resolved = resolve_cli_graph(config, uri)?; + let token = resolve_remote_bearer_token(uri.as_deref())?; + let resolved = resolve_cli_graph(uri)?; reject_positional_remote(via_server, &resolved.uri)?; if resolved.is_remote { // A served write resolves the actor server-side from the bearer @@ -175,10 +207,9 @@ impl GraphClient { token, }) } else { - let actor = resolve_cli_actor(cli_as, config)?; + let actor = resolve_cli_actor(cli_as)?; Ok(GraphClient::Embedded { - uri: resolved.uri.clone(), - graph: Some(resolved), + uri: resolved.uri, actor, }) } @@ -192,28 +223,15 @@ impl GraphClient { } } - /// The selected graph name, when a policy-bearing embedded client was - /// resolved against a named graph. `None` for remote and for reads. - pub(crate) fn selected(&self) -> Option<&str> { - match self { - GraphClient::Embedded { graph, .. } => graph.as_ref().and_then(ResolvedCliGraph::selected), - GraphClient::Remote { .. } => None, - } - } - pub(crate) fn is_remote(&self) -> bool { matches!(self, GraphClient::Remote { .. 
}) } - /// Open the local engine the way the resolved client demands: with - /// policy when a `graph` context is present (write path), bare - /// otherwise (read/`query` path). Captures today's two open paths in - /// one place so each verb stays a single match arm. - async fn open_embedded(uri: &str, graph: &Option) -> Result { - match graph { - Some(graph) => open_local_db_with_policy(graph).await, - None => Ok(Omnigraph::open(uri).await?), - } + /// Open the local engine. Direct-store access carries no Cedar policy + /// (RFC-011), so both read and write paths open bare; the actor is still + /// attributed on the write via the `_as` engine APIs. + async fn open_embedded(uri: &str) -> Result { + Ok(Omnigraph::open(uri).await?) } pub(crate) async fn branch_list(&self) -> Result { @@ -375,8 +393,8 @@ impl GraphClient { .await?; Ok(load_output_from_tables(base_url, branch, mode.as_str(), &output)) } - GraphClient::Embedded { uri, graph, actor } => { - let db = Self::open_embedded(uri, graph).await?; + GraphClient::Embedded { uri, actor } => { + let db = Self::open_embedded(uri).await?; let result = db .load_file_as(branch, from, data, mode.into(), actor.as_deref()) .await?; @@ -418,8 +436,8 @@ impl GraphClient { ) .await } - GraphClient::Embedded { uri, graph, actor } => { - let db = Self::open_embedded(uri, graph).await?; + GraphClient::Embedded { uri, actor } => { + let db = Self::open_embedded(uri).await?; let result = db .load_file_as(branch, Some(from), data, mode.into(), actor.as_deref()) .await?; @@ -457,10 +475,10 @@ impl GraphClient { ) .await } - GraphClient::Embedded { uri, graph, actor } => { + GraphClient::Embedded { uri, actor } => { let (selected_name, query_params) = select_named_query(query_source, query_name)?; let params = query_params_from_json(&query_params, params_json)?; - let db = Self::open_embedded(uri, graph).await?; + let db = Self::open_embedded(uri).await?; let actor = actor.as_deref(); let result = db .mutate_as(branch, query_source, 
&selected_name, &params, actor) @@ -511,10 +529,10 @@ impl GraphClient { ) .await } - GraphClient::Embedded { uri, graph, .. } => { + GraphClient::Embedded { uri, .. } => { let (selected_name, query_params) = select_named_query(query_source, query_name)?; let params = query_params_from_json(&query_params, params_json)?; - let db = Self::open_embedded(uri, graph).await?; + let db = Self::open_embedded(uri).await?; let result = db .query(target.clone(), query_source, &selected_name, &params) .await?; @@ -523,6 +541,50 @@ impl GraphClient { } } + /// `invoke_named` — run a stored query **by catalog name** (RFC-011 D3). + /// Served-only: the catalog is server-owned, so a `--store` (embedded) + /// scope has nothing to resolve the name against. `expect_mutation` carries + /// the verb's asserted kind; the server rejects a mismatch (400) before + /// running, so the response is exactly the expected envelope — the caller + /// deserializes it as the concrete `T` (`ReadOutput` for `query`, + /// `ChangeOutput` for `mutate`), sidestepping the untagged wire enum. + pub(crate) async fn invoke_named( + &self, + name: &str, + expect_mutation: bool, + params_json: Option<&Value>, + branch: Option, + snapshot: Option, + ) -> Result { + match self { + GraphClient::Remote { + http, + base_url, + token, + } => { + let body = InvokeStoredQueryRequest { + params: params_json.cloned(), + branch, + snapshot, + expect_mutation: Some(expect_mutation), + }; + remote_json( + http, + Method::POST, + remote_url(base_url, &["queries", name], &[])?, + Some(serde_json::to_value(body)?), + token.as_deref(), + ) + .await + } + GraphClient::Embedded { .. 
} => bail!( + "by-name invocation needs a server (the stored-query catalog is \ + server-owned); use -e '' or --query for an ad-hoc query \ + against --store, or address a server with --server / --profile" + ), + } + } + pub(crate) async fn branch_create_from( &self, from: &str, @@ -546,8 +608,8 @@ impl GraphClient { ) .await } - GraphClient::Embedded { uri, graph, actor } => { - let db = Self::open_embedded(uri, graph).await?; + GraphClient::Embedded { uri, actor } => { + let db = Self::open_embedded(uri).await?; let actor = actor.as_deref(); db.branch_create_from_as(ReadTarget::branch(from), name, actor) .await?; @@ -577,8 +639,8 @@ impl GraphClient { ) .await } - GraphClient::Embedded { uri, graph, actor } => { - let db = Self::open_embedded(uri, graph).await?; + GraphClient::Embedded { uri, actor } => { + let db = Self::open_embedded(uri).await?; let actor = actor.as_deref(); db.branch_delete_as(name, actor).await?; Ok(BranchDeleteOutput { @@ -609,8 +671,8 @@ impl GraphClient { ) .await } - GraphClient::Embedded { uri, graph, actor } => { - let db = Self::open_embedded(uri, graph).await?; + GraphClient::Embedded { uri, actor } => { + let db = Self::open_embedded(uri).await?; let actor = actor.as_deref(); let outcome = db.branch_merge_as(source, into, actor).await?; Ok(BranchMergeOutput { @@ -660,8 +722,8 @@ impl GraphClient { ) .await } - GraphClient::Embedded { uri, graph, actor } => { - let db = Self::open_embedded(uri, graph).await?; + GraphClient::Embedded { uri, actor } => { + let db = Self::open_embedded(uri).await?; let result = db .apply_schema_as_with_catalog_check( schema_source, @@ -730,9 +792,9 @@ impl GraphClient { /// `graphs list` — enumerate the graphs a remote multi-graph server /// serves (`GET /graphs`). Remote-only by design: there is no local - /// enumeration endpoint, so the Embedded arm fails loudly pointing the - /// operator at `omnigraph.yaml`. 
Routing it through the enum still buys - /// the shared `resolve()` addressing/token preamble. + /// enumeration endpoint, so the Embedded arm fails loudly. Routing it + /// through the enum still buys the shared `resolve()` addressing/token + /// preamble. pub(crate) async fn list_graphs(&self) -> Result { match self { GraphClient::Remote { @@ -750,9 +812,9 @@ impl GraphClient { .await } GraphClient::Embedded { .. } => bail!( - "`omnigraph graphs list` requires a remote multi-graph server URL \ - (http:// or https://). To enumerate local graphs, read `omnigraph.yaml` \ - directly." + "`omnigraph graphs list` requires a remote multi-graph server \ + (--server ). To enumerate the graphs in a cluster, run \ + `omnigraph cluster status --config `." ), } } diff --git a/crates/omnigraph-cli/src/helpers.rs b/crates/omnigraph-cli/src/helpers.rs index d49d17f..971ca30 100644 --- a/crates/omnigraph-cli/src/helpers.rs +++ b/crates/omnigraph-cli/src/helpers.rs @@ -2,6 +2,8 @@ //! remote HTTP, env/token handling, scaffolding (moved verbatim from //! main.rs in the modularization). +use std::io::IsTerminal; + use super::*; use crate::operator; @@ -16,6 +18,59 @@ pub(crate) fn is_remote_uri(uri: &str) -> bool { uri.starts_with("http://") || uri.starts_with("https://") } +/// Whether a resolved write target is *local* for the purposes of the RFC-011 +/// Decision 9 destructive-confirm gate: a bare path or a `file://` URI. Anything +/// else carrying a scheme — `http(s)://` (served), `s3://` / `gs://` / … (object +/// store) — is non-local and a destructive write against it requires explicit +/// consent. Generalizes `is_remote_uri` (which only catches http(s)). +pub(crate) fn uri_is_local(uri: &str) -> bool { + !uri.contains("://") || uri.starts_with("file://") +} + +/// Echo the resolved write target + access path to stderr (RFC-011 Decision 9), +/// unless `--quiet`. One line, e.g. `omnigraph load → file://g.omni (direct, +/// local)`. 
stderr so `--json` consumers reading stdout are unaffected; the line +/// legitimately differs embedded-vs-served (that visibility is the point). +pub(crate) fn echo_write_target(quiet: bool, label: &str, uri: &str, served: bool) { + if quiet { + return; + } + let access = if served { + "served" + } else if uri_is_local(uri) { + "direct, local" + } else { + "direct, remote" + }; + eprintln!("omnigraph {label} → {uri} ({access})"); +} + +/// Gate a destructive write (`cleanup`, overwrite `load`, `branch delete`) +/// against a non-local scope (RFC-011 Decision 9). A local target needs no +/// confirmation; otherwise `--yes` consents, an interactive TTY is prompted, and +/// a non-TTY / `--json` run refuses rather than silently proceeding. +pub(crate) fn confirm_destructive(label: &str, uri: &str, yes: bool, json: bool) -> Result<()> { + if uri_is_local(uri) || yes { + return Ok(()); + } + if json || !std::io::stdin().is_terminal() { + bail!( + "refusing destructive `{label}` against non-local target {uri} without confirmation; \ + pass --yes to confirm (an interactive TTY would be prompted instead)" + ); + } + eprint!( + "About to run a destructive `{label}` against {uri} (not local). Type 'yes' to continue: " + ); + io::stderr().flush()?; + let mut answer = String::new(); + io::stdin().read_line(&mut answer)?; + match answer.trim().to_ascii_lowercase().as_str() { + "yes" | "y" => Ok(()), + _ => bail!("aborted: destructive `{label}` not confirmed"), + } +} + /// THE one way the CLI composes a remote request URL. Every remote call /// routes through here so URL assembly has a single mechanism instead of /// per-callsite string interpolation. 
@@ -64,231 +119,174 @@ pub(crate) fn bearer_token_from_env(var_name: &str) -> Option { normalize_bearer_token(std::env::var(var_name).ok()) } -pub(crate) fn parse_env_assignment(line: &str) -> Option<(String, String)> { - let line = line.trim(); - if line.is_empty() || line.starts_with('#') { - return None; - } - - let line = line.strip_prefix("export ").unwrap_or(line).trim(); - let (name, value) = line.split_once('=')?; - let name = name.trim(); - if name.is_empty() { - return None; - } - - let value = value.trim(); - let value = if value.len() >= 2 - && ((value.starts_with('"') && value.ends_with('"')) - || (value.starts_with('\'') && value.ends_with('\''))) - { - &value[1..value.len() - 1] - } else { - value - }; - - Some((name.to_string(), value.to_string())) -} - -pub(crate) fn bearer_token_from_env_file(path: &Path, var_name: &str) -> Result> { - if !path.exists() { - return Ok(None); - } - - for line in fs::read_to_string(path)?.lines() { - let Some((name, value)) = parse_env_assignment(line) else { - continue; - }; - if name == var_name { - return Ok(normalize_bearer_token(Some(value))); - } - } - - Ok(None) -} - -pub(crate) fn load_env_file_into_process(path: &Path) -> Result<()> { - if !path.exists() { - return Ok(()); - } - - for line in fs::read_to_string(path)?.lines() { - let Some((name, value)) = parse_env_assignment(line) else { - continue; - }; - if std::env::var_os(&name).is_none() { - unsafe { - std::env::set_var(name, value); - } - } - } - - Ok(()) -} - -pub(crate) fn load_cli_config(config_path: Option<&PathBuf>) -> Result { - let config = load_config(config_path)?; - if let Some(path) = config.resolve_auth_env_file() { - load_env_file_into_process(&path)?; - } - Ok(config) +/// The Cedar resource id for a graph selection: the explicit graph name when one +/// is given, else the normalized URI (the anonymous fallback). Used by the +/// `policy` tooling to address a graph's bundle. 
+pub(crate) fn graph_resource_id_for_selection( + selected_graph: Option<&str>, + normalized_uri: &str, +) -> String { + selected_graph.unwrap_or(normalized_uri).to_string() } #[derive(Debug, Clone)] pub(crate) struct ResolvedCliGraph { pub(crate) uri: String, - pub(crate) selected: Option, - pub(crate) graph_id: String, - pub(crate) policy_file: Option, pub(crate) is_remote: bool, } -impl ResolvedCliGraph { - pub(crate) fn selected(&self) -> Option<&str> { - self.selected.as_deref() - } -} - -pub(crate) struct ResolvedPolicyContext { - pub(crate) policy_file: PathBuf, - pub(crate) graph_id: String, -} - -pub(crate) fn resolve_policy_context(config: &OmnigraphConfig) -> Result { - let selected = config.resolve_policy_tooling_graph_selection()?; - let policy_file = config.resolve_policy_file_for(selected).ok_or_else(|| { - color_eyre::eyre::eyre!( - "policy.file or graphs..policy.file must be set in omnigraph.yaml" - ) - })?; - let graph_id = match selected { - Some(name) => graph_resource_id_for_selection(Some(name), ""), - None => graph_resource_id_for_selection(None, "default"), +/// Resolve the cluster for a control-plane tooling command (`policy`, +/// `queries`) from `--cluster`. A configured name (`clusters:` in operator +/// config) is rewritten to its root; a literal dir / `s3://`/`file://` root is +/// passed through. A `--profile`/`OMNIGRAPH_PROFILE` cluster binding also +/// resolves here when `--cluster` is absent. No omnigraph.yaml. +pub(crate) fn require_cluster_scope( + cluster: Option<&str>, + profile: Option<&str>, + command: &str, +) -> Result { + let op = operator::load_operator_config()?; + let resolve_name = |name: &str| { + op.cluster_root(name) + .map(str::to_string) + .unwrap_or_else(|| name.to_string()) }; - Ok(ResolvedPolicyContext { - policy_file, - graph_id, - }) + if let Some(cluster) = cluster { + return Ok(resolve_name(cluster)); + } + // A cluster profile (flag, else OMNIGRAPH_PROFILE) binds the cluster too. 
+ let profile_name = profile + .map(str::to_string) + .or_else(|| std::env::var(scope::PROFILE_ENV).ok().filter(|s| !s.is_empty())); + if let Some(name) = profile_name { + let profile = op.profile(&name).ok_or_else(|| { + color_eyre::eyre::eyre!("unknown profile '{name}' (not defined under `profiles:`)") + })?; + if let crate::operator::ScopeBinding::Cluster(cluster) = profile.binding(&name)? { + return Ok(resolve_name(&cluster)); + } + } + bail!( + "`{command}` needs a cluster — pass --cluster (or a name from `clusters:` \ + in ~/.omnigraph/config.yaml), or select a cluster profile" + ) } -pub(crate) fn resolve_policy_engine(context: &ResolvedPolicyContext) -> Result { - PolicyEngine::load_graph(&context.policy_file, &context.graph_id) +/// Read a cluster's serving snapshot for a control-plane tooling command, +/// flattening the readiness `Diagnostic` list into one loud error. The single +/// snapshot entry point for `policy`/`queries` so the not-servable message stays +/// identical across them. +async fn read_serving_snapshot_or_report( + cluster: &str, +) -> Result { + omnigraph_cluster::read_serving_snapshot(cluster) + .await + .map_err(|diagnostics| { + color_eyre::eyre::eyre!( + "cluster `{cluster}` is not servable:\n {}", + diagnostics + .iter() + .map(|d| d.message.clone()) + .collect::>() + .join("\n ") + ) + }) } -pub(crate) fn resolve_policy_engine_for_graph(graph: &ResolvedCliGraph) -> Result { - let policy_file = graph.policy_file.as_ref().ok_or_else(|| { - color_eyre::eyre::eyre!( - "policy.file or graphs..policy.file must be set in omnigraph.yaml" - ) - })?; - PolicyEngine::load_graph(policy_file, &graph.graph_id) +/// Resolve the Cedar policy bundle(s) for a `--cluster` policy-tooling command +/// (RFC-011). Sources the applied policies from the cluster's serving snapshot; +/// each `ServingPolicy` carries its `source` (digest-verified content) and the +/// scopes it `applies_to` (`cluster` | `graph.`). 
The optional `graph` +/// selects a graph's bundle when several apply. +pub(crate) async fn read_cluster_policies( + cluster: &str, +) -> Result> { + Ok(read_serving_snapshot_or_report(cluster).await?.policies) } -pub(crate) async fn open_local_db_with_policy(graph: &ResolvedCliGraph) -> Result { - let db = Omnigraph::open(&graph.uri).await?; - if graph.policy_file.is_some() { - let engine = Arc::new(resolve_policy_engine_for_graph(graph)?); - Ok(db.with_policy(engine as Arc)) - } else { - Ok(db) +/// Pick the single policy bundle that applies to the selection. With `--graph`, +/// the bundle bound to `graph.` (or the cluster-wide one); without it, the +/// sole bundle if there's exactly one. Ambiguity or absence is a loud error. +pub(crate) fn select_cluster_policy<'p>( + cluster: &str, + policies: &'p [omnigraph_cluster::ServingPolicy], + graph: Option<&str>, +) -> Result<&'p omnigraph_cluster::ServingPolicy> { + if let Some(graph_id) = graph { + let graph_ref = format!("graph.{graph_id}"); + let matching: Vec<&omnigraph_cluster::ServingPolicy> = policies + .iter() + .filter(|p| { + p.applies_to + .iter() + .any(|s| s == &graph_ref || s == "cluster") + }) + .collect(); + return match matching.as_slice() { + [only] => Ok(only), + [] => bail!( + "cluster `{cluster}` has no policy bundle bound to graph `{graph_id}` \ + (or to the cluster scope)" + ), + many => bail!( + "graph `{graph_id}` in cluster `{cluster}` matches {} policy bundles ([{}]); \ + the cluster model expects one bundle per graph scope", + many.len(), + many.iter().map(|p| p.name.as_str()).collect::>().join(", ") + ), + }; + } + match policies { + [only] => Ok(only), + [] => bail!("cluster `{cluster}` has no applied policy bundles"), + many => bail!( + "cluster `{cluster}` has {} policy bundles ([{}]); pass --graph to select one", + many.len(), + many.iter().map(|p| p.name.as_str()).collect::>().join(", ") + ), } } -/// THE actor chain (RFC-007 §D3) — every command that needs an identity +/// THE 
actor chain (RFC-011) — every command that needs an identity /// resolves through this one function (one path per concern): -/// `--as` > legacy `cli.actor` in omnigraph.yaml (RFC-008 window) > -/// `operator.actor` in ~/.omnigraph/config.yaml > none. -pub(crate) fn resolve_actor( - cli_as: Option<&str>, - legacy_config_actor: Option<&str>, -) -> Result> { +/// `--as` > `operator.actor` in ~/.omnigraph/config.yaml > none. +pub(crate) fn resolve_actor(cli_as: Option<&str>) -> Result> { if let Some(actor) = cli_as { return Ok(Some(actor.to_string())); } - if let Some(actor) = legacy_config_actor { - return Ok(Some(actor.to_string())); - } Ok(operator::load_operator_config()? .actor() .map(str::to_string)) } pub(crate) fn resolve_cluster_actor(cli_as: Option<&str>) -> Result> { - if let Some(actor) = cli_as { - return Ok(Some(actor.to_string())); - } - let config = load_config(None).wrap_err( - "resolving the default actor from omnigraph.yaml (pass --as to skip this lookup)", - )?; - resolve_actor(None, config.cli.actor.as_deref()) + resolve_actor(cli_as) } -pub(crate) fn resolve_cli_actor( - cli_as: Option<&str>, - config: &OmnigraphConfig, -) -> Result> { - resolve_actor(cli_as, config.cli.actor.as_deref()) +pub(crate) fn resolve_cli_actor(cli_as: Option<&str>) -> Result> { + resolve_actor(cli_as) } -pub(crate) fn resolve_policy_tests_path(context: &ResolvedPolicyContext) -> PathBuf { - context.policy_file.with_file_name("policy.tests.yaml") -} - -pub(crate) fn normalize_policy_graph_uri(uri: &str) -> Result { - if is_remote_uri(uri) { - Ok(uri.trim_end_matches('/').to_string()) - } else { - Ok(normalize_root_uri(uri)?) - } -} - -pub(crate) fn resolve_remote_bearer_token( - config: &OmnigraphConfig, - explicit_uri: Option<&str>, -) -> Result> { - // `--target` is gone; the legacy explicit-target name is always None. 
- let explicit_target: Option<&str> = None; +/// The bearer token for a remote request (RFC-011): the operator keyed chain +/// for the matching server (`OMNIGRAPH_TOKEN_` env → 0600 credentials +/// file), then the default `OMNIGRAPH_BEARER_TOKEN` env. No omnigraph.yaml +/// chain. +pub(crate) fn resolve_remote_bearer_token(explicit_uri: Option<&str>) -> Result> { // The keyed hop (RFC-007 §D4, gh-host model): when the effective remote // URL belongs to an operator-defined server, that server's keyed chain // applies first — OMNIGRAPH_TOKEN_ env, then the 0600 credentials - // file. Ok(None) falls through to the legacy chain unchanged, and the - // keyed token is structurally scoped to its own server (§D5 rule 3): - // a URL matching no operator server never sees it. - if let Some(remote_url) = effective_remote_url(config, explicit_uri, explicit_target) { + // file. The keyed token is structurally scoped to its own server: a URL + // matching no operator server never sees it. + if let Some(remote_url) = explicit_uri.filter(|uri| is_remote_uri(uri)) { let operator_config = operator::load_operator_config()?; - if let Some(server) = operator_config.find_server_for_url(&remote_url) { + if let Some(server) = operator_config.find_server_for_url(remote_url) { if let Some(token) = operator::resolve_keyed_token(server)? { return Ok(Some(token)); } } } - let scoped_env = - config.graph_bearer_token_env(explicit_uri, explicit_target, config.cli_graph_name()); - let mut env_names = Vec::new(); - if let Some(name) = scoped_env { - env_names.push(name.to_string()); - } - if env_names - .iter() - .all(|name| name != DEFAULT_BEARER_TOKEN_ENV) - { - env_names.push(DEFAULT_BEARER_TOKEN_ENV.to_string()); - } - - let env_file = config.resolve_auth_env_file(); - for env_name in env_names { - if let Some(token) = bearer_token_from_env(&env_name) { - return Ok(Some(token)); - } - if let Some(path) = env_file.as_ref() { - if let Some(token) = bearer_token_from_env_file(path, &env_name)? 
{ - return Ok(Some(token)); - } - } - } - - Ok(None) + Ok(bearer_token_from_env(DEFAULT_BEARER_TOKEN_ENV)) } /// `--server ` (RFC-007 PR 3): resolve an operator-defined server @@ -336,7 +334,6 @@ pub(crate) fn resolve_server_flag( /// params. The keyed token applies via the ordinary URL match. pub(crate) async fn execute_operator_alias( client: &reqwest::Client, - config: &OmnigraphConfig, alias_name: &str, alias: &crate::operator::OperatorAlias, alias_args: &[String], @@ -344,7 +341,7 @@ pub(crate) async fn execute_operator_alias( ) -> Result { let uri = resolve_server_flag(Some(&alias.server), alias.graph.as_deref())? .expect("server name is present"); - let bearer_token = resolve_remote_bearer_token(config, Some(&uri))?; + let bearer_token = resolve_remote_bearer_token(Some(&uri))?; let mut params = serde_json::Map::new(); for (key, value) in &alias.params { @@ -370,12 +367,16 @@ pub(crate) async fn execute_operator_alias( } } - let body = (!params.is_empty()).then(|| serde_json::json!({ "params": params })); + let mut body = serde_json::Map::new(); + body.insert("expect_mutation".to_string(), Value::Bool(false)); + if !params.is_empty() { + body.insert("params".to_string(), Value::Object(params)); + } remote_json( client, Method::POST, remote_url(&uri, &["queries", &alias.query], &[])?, - body, + Some(Value::Object(body)), bearer_token.as_deref(), ) .await @@ -399,22 +400,6 @@ pub(crate) fn apply_server_flag( resolve_server_flag(server, graph) } -/// The remote base URL a token resolution is FOR — the same scoping -/// `graph_bearer_token_env` uses: an explicit http(s) `--uri` wins, else -/// the config-resolved target's uri (when remote). Local URIs → None. 
-fn effective_remote_url( - config: &OmnigraphConfig, - explicit_uri: Option<&str>, - explicit_target: Option<&str>, -) -> Option { - if let Some(uri) = explicit_uri { - return is_remote_uri(uri).then(|| uri.to_string()); - } - let target = config.resolve_target_name(explicit_uri, explicit_target, config.cli_graph_name())?; - let uri = &config.graphs.get(target)?.uri; - is_remote_uri(uri).then(|| uri.clone()) -} - pub(crate) fn build_http_client() -> Result { Ok(reqwest::Client::new()) } @@ -455,40 +440,31 @@ pub(crate) async fn remote_json( Ok(serde_json::from_str(&text)?) } -pub(crate) fn resolve_uri(config: &OmnigraphConfig, cli_uri: Option) -> Result { - // `--target` is gone; the second arg (the legacy explicit-target name) is - // always None. A bare command still falls back to `cli.graph` (the third arg). - config.resolve_target_uri(cli_uri, None, config.cli_graph_name()) +/// The graph URI a command addresses (RFC-011): the scope-resolved URI string +/// (positional URI / `--store` / `--profile` / `defaults.store`). No +/// omnigraph.yaml `cli.graph` fallback — an absent address is a loud error. 
+pub(crate) fn resolve_uri(cli_uri: Option) -> Result { + cli_uri.ok_or_else(|| { + color_eyre::eyre::eyre!( + "no graph addressed — pass a positional URI, --store , --server , \ + --profile , or set a default scope in ~/.omnigraph/config.yaml" + ) + }) } -pub(crate) fn resolve_cli_graph( - config: &OmnigraphConfig, - cli_uri: Option, -) -> Result { - let selected = if cli_uri.is_some() { - None - } else { - config.cli_graph_name().map(str::to_string) - }; - config.resolve_graph_selection(selected.as_deref())?; - let uri = resolve_uri(config, cli_uri)?; - let normalized_uri = normalize_policy_graph_uri(&uri)?; - let graph_id = graph_resource_id_for_selection(selected.as_deref(), &normalized_uri); +pub(crate) fn resolve_cli_graph(cli_uri: Option) -> Result { + let uri = resolve_uri(cli_uri)?; Ok(ResolvedCliGraph { - graph_id, is_remote: is_remote_uri(&uri), - policy_file: config.resolve_policy_file_for(selected.as_deref()), - selected, uri, }) } pub(crate) fn resolve_local_graph( - config: &OmnigraphConfig, cli_uri: Option, operation: &str, ) -> Result { - let graph = resolve_cli_graph(config, cli_uri)?; + let graph = resolve_cli_graph(cli_uri)?; if graph.is_remote { bail!( "`{}` is a direct (storage-native) command and needs direct storage \ @@ -531,22 +507,55 @@ pub(crate) fn parse_duration_arg(s: &str) -> Result { Ok(std::time::Duration::from_secs(secs)) } -pub(crate) fn resolve_local_uri( - config: &OmnigraphConfig, +pub(crate) fn resolve_local_uri(cli_uri: Option, operation: &str) -> Result { + Ok(resolve_local_graph(cli_uri, operation)?.uri) +} + +/// Resolve a direct (storage-native) verb's address to a storage URI through the +/// one RFC-011 scope path — the maintenance verbs (optimize/repair/cleanup) plus +/// `schema plan` and `lint`'s graph-target path. 
Every primitive funnels here: a +/// positional URI, `--store`, `--cluster --graph `, a `--profile` +/// cluster binding, or operator defaults — all resolved at the `Direct` +/// capability (so a server scope is rejected, a cluster scope is allowed when the +/// verb opts into cluster addressing), then mapped to a storage URI by +/// `resolve_storage_uri`. +pub(crate) async fn resolve_maintenance_uri( + profile: Option<&str>, + store: Option<&str>, + cluster: Option<&str>, + graph: Option<&str>, cli_uri: Option, operation: &str, ) -> Result { - Ok(resolve_local_graph(config, cli_uri, operation)?.uri) + let scope = scope::resolve_scope( + &operator::load_operator_config()?, + planes::Capability::Direct, + scope::ScopeFlags { + profile, + store, + server: None, + cluster, + graph, + uri: cli_uri, + }, + )?; + resolve_storage_uri( + scope.uri, + scope.cluster.as_deref(), + scope.cluster_graph.as_deref(), + operation, + ) + .await } -/// Resolve a storage-plane verb's address to a direct storage URI (RFC-010 -/// Slice 3). `--cluster --cluster-graph ` resolves the graph's -/// storage URI from the **served cluster state** (the truth a `--cluster` -/// server serves); otherwise the ordinary positional-URI path. -/// clap enforces both-or-neither and exclusion with `uri`, so the mismatched -/// arm is defensive. +/// Map a resolved direct address to a storage URI: a cluster scope +/// (`--cluster --graph `, or a `--profile` cluster binding) resolves +/// the graph's storage URI from the **served cluster state**; otherwise the +/// ordinary positional-URI path. When a cluster scope carries no graph +/// selection (RFC-011 D7), enumerate the catalog: a sole graph is used +/// automatically, otherwise error and list the candidates so the operator can +/// pass `--graph `. 
pub(crate) async fn resolve_storage_uri( - config: &OmnigraphConfig, cli_uri: Option, cluster: Option<&str>, cluster_graph: Option<&str>, @@ -554,8 +563,32 @@ pub(crate) async fn resolve_storage_uri( ) -> Result { match (cluster, cluster_graph) { (Some(cluster), Some(graph_id)) => resolve_cluster_graph_uri(cluster, graph_id).await, - (None, None) => resolve_local_uri(config, cli_uri, operation), - _ => bail!("--cluster and --cluster-graph must be given together"), + (Some(cluster), None) => { + let graph_id = resolve_sole_cluster_graph(cluster).await?; + resolve_cluster_graph_uri(cluster, &graph_id).await + } + (None, None) => resolve_local_uri(cli_uri, operation), + (None, Some(_)) => { + bail!("internal error: a graph was selected without a cluster scope") + } + } +} + +/// Pick the graph for a cluster scope that has no `--graph`/`default_graph` +/// (RFC-011 D7): exactly one applied graph → use it; zero → error; more than +/// one → error and list the candidates. Never auto-picks among several. 
+async fn resolve_sole_cluster_graph(cluster: &str) -> Result { + let ids = omnigraph_cluster::cluster_graph_ids(cluster) + .await + .map_err(|diagnostic| color_eyre::eyre::eyre!("{}", diagnostic.message))?; + match ids.as_slice() { + [only] => Ok(only.clone()), + [] => bail!("cluster `{cluster}` has no applied graphs; run `cluster apply` first"), + many => bail!( + "cluster `{cluster}` has {} graphs: [{}]; pass --graph to select one", + many.len(), + many.join(", ") + ), } } @@ -570,19 +603,16 @@ async fn resolve_cluster_graph_uri(cluster: &str, graph_id: &str) -> Result, alias_branch: Option, default_branch: &str, ) -> String { cli_branch .or(alias_branch) - .or_else(|| config.cli.branch.clone()) .unwrap_or_else(|| default_branch.to_string()) } pub(crate) fn resolve_read_target( - config: &OmnigraphConfig, cli_branch: Option, cli_snapshot: Option, alias_branch: Option, @@ -590,19 +620,15 @@ pub(crate) fn resolve_read_target( if cli_branch.is_some() && cli_snapshot.is_some() { bail!("read target may specify branch or snapshot, not both"); } - Ok(read_target_from_cli( - cli_branch - .or(alias_branch) - .or_else(|| config.cli.branch.clone()), - cli_snapshot, - )) + Ok(read_target_from_cli(cli_branch.or(alias_branch), cli_snapshot)) } pub(crate) fn resolve_query_path( - config: &OmnigraphConfig, explicit_query: Option<&PathBuf>, alias_query: Option<&str>, ) -> Result { + // The `.gq` path is resolved plainly (cwd-relative) — no omnigraph.yaml + // `query.roots` search. 
explicit_query .map(PathBuf::from) .or_else(|| alias_query.map(PathBuf::from)) @@ -611,11 +637,9 @@ pub(crate) fn resolve_query_path( "exactly one of --query, --query-string, or --alias must be provided" ) }) - .and_then(|query_path| config.resolve_query_path(&query_path)) } pub(crate) fn resolve_query_source( - config: &OmnigraphConfig, explicit_query: Option<&PathBuf>, inline_query: Option<&str>, alias_query: Option<&str>, @@ -627,7 +651,6 @@ pub(crate) fn resolve_query_source( return Ok(inline.to_string()); } Ok(fs::read_to_string(resolve_query_path( - config, explicit_query, alias_query, )?)?) @@ -637,49 +660,9 @@ pub(crate) fn parse_alias_value(value: &str) -> Value { serde_json::from_str(value).unwrap_or_else(|_| Value::String(value.to_string())) } -pub(crate) fn merged_params_json( - alias_name: Option<&str>, - alias_arg_names: &[String], - alias_arg_values: &[String], - explicit: Option, -) -> Result> { - if alias_arg_values.len() > alias_arg_names.len() { - let alias = alias_name.unwrap_or(""); - bail!( - "alias '{}' expects at most {} args but got {}", - alias, - alias_arg_names.len(), - alias_arg_values.len() - ); - } - - let mut merged = serde_json::Map::new(); - for (arg_name, arg_value) in alias_arg_names.iter().zip(alias_arg_values.iter()) { - merged.insert(arg_name.clone(), parse_alias_value(arg_value)); - } - - match explicit { - Some(Value::Object(object)) => { - for (key, value) in object { - merged.insert(key, value); - } - } - Some(_) => bail!("params JSON must be an object"), - None => {} - } - - if merged.is_empty() { - Ok(None) - } else { - Ok(Some(Value::Object(merged))) - } -} - -/// The format cascade (RFC-007 §D3): `--json` > `--format` > alias format > -/// legacy `cli.output_format` (RFC-008 window) > operator `defaults.output` -/// > table. +/// The format cascade (RFC-011): `--json` > `--format` > alias format > +/// operator `defaults.output` > table. 
pub(crate) fn resolve_read_format( - config: &OmnigraphConfig, cli_format: Option, json: bool, alias_format: Option, @@ -689,7 +672,6 @@ pub(crate) fn resolve_read_format( } cli_format .or(alias_format) - .or(config.cli.output_format) .or_else(|| { operator::load_operator_config() .ok() @@ -698,43 +680,6 @@ pub(crate) fn resolve_read_format( .unwrap_or_default() } -pub(crate) fn resolve_alias<'a>( - config: &'a OmnigraphConfig, - alias_name: Option<&'a str>, - expected: AliasCommand, -) -> Result> { - let Some(alias_name) = alias_name else { - return Ok(None); - }; - let alias = config.alias(alias_name)?; - if alias.command != expected { - bail!( - "alias '{}' is a {:?} alias, not a {:?} alias", - alias_name, - alias.command, - expected - ); - } - Ok(Some((alias_name, alias))) -} - -pub(crate) fn normalize_legacy_alias_uri( - uri: Option, - target_available: bool, - alias_name: Option<&str>, - mut alias_args: Vec, -) -> (Option, Vec) { - let Some(candidate) = uri else { - return (None, alias_args); - }; - - if alias_name.is_some() && target_available { - alias_args.insert(0, candidate); - return (None, alias_args); - } - - (Some(candidate), alias_args) -} pub(crate) fn read_target_from_cli(branch: Option, snapshot: Option) -> ReadTarget { @@ -783,12 +728,11 @@ pub(crate) fn query_params_from_json( } pub(crate) async fn execute_query_lint( - config: &OmnigraphConfig, cli_uri: Option, schema_path: Option<&PathBuf>, query_path: &PathBuf, ) -> Result { - let resolved_query_path = resolve_query_path(config, Some(query_path), None)?; + let resolved_query_path = resolve_query_path(Some(query_path), None)?; let query_source = fs::read_to_string(&resolved_query_path)?; let query_path = resolved_query_path.to_string_lossy().into_owned(); @@ -806,12 +750,14 @@ pub(crate) async fn execute_query_lint( )); } - let has_graph_target = cli_uri.is_some() || config.cli_graph_name().is_some(); - if !has_graph_target { - bail!("lint requires --schema or a resolvable graph target"); + 
if cli_uri.is_none() { + bail!( + "lint requires --schema (offline) or a graph target \ + (--store / --cluster --graph )" + ); } - let uri = resolve_local_uri(config, cli_uri, "lint")?; + let uri = resolve_local_uri(cli_uri, "lint")?; let db = Omnigraph::open(&uri).await?; Ok(lint_query_file( &db.catalog(), @@ -821,20 +767,24 @@ pub(crate) async fn execute_query_lint( )) } -pub(crate) fn resolve_selected_graph( - config: &OmnigraphConfig, - cli_uri: Option, - operation: &str, -) -> Result<(String, Option)> { - let graph = resolve_local_graph(config, cli_uri, operation)?; - Ok((graph.uri, graph.selected)) -} - -pub(crate) fn load_registry_or_report( - config: &OmnigraphConfig, - selected: Option<&str>, +/// Build a `QueryRegistry` from a cluster serving snapshot's stored queries, +/// optionally scoped to one graph. The `ServingQuery.source` is the +/// digest-verified `.gq` content, so no file I/O or omnigraph.yaml is involved. +fn registry_from_serving_queries( + queries: &[omnigraph_cluster::ServingQuery], + graph: Option<&str>, ) -> Result { - QueryRegistry::load(config, config.query_entries_for(selected)).map_err(|errors| { + let specs: Vec = queries + .iter() + .filter(|q| graph.is_none_or(|g| q.graph_id == g)) + .map(|q| omnigraph_server::queries::RegistrySpec { + name: q.name.clone(), + source: q.source.clone(), + expose: false, + tool_name: None, + }) + .collect(); + QueryRegistry::from_specs(specs).map_err(|errors| { color_eyre::eyre::eyre!( "stored-query registry failed to load:\n {}", errors @@ -846,83 +796,58 @@ pub(crate) fn load_registry_or_report( }) } -pub(crate) fn graph_query_registry_names(config: &OmnigraphConfig) -> Vec<&str> { - config - .graphs - .iter() - .filter_map(|(name, graph)| (!graph.queries.is_empty()).then_some(name.as_str())) - .collect() -} - -pub(crate) fn resolve_registry_selection_for_list( - config: &OmnigraphConfig, -) -> Result> { - let selected = config.cli_graph_name().map(str::to_string); - if let Some(name) = 
selected.as_deref() { - config.resolve_graph_selection(Some(name))?; - return Ok(selected); - } - - if !config.query_entries().is_empty() { - return Ok(None); - } - - let graph_names = graph_query_registry_names(config); - if graph_names.is_empty() { - return Ok(None); - } - - bail!( - "stored-query registries are configured for graph{} {} but no graph was selected. Pass a positional URI or set `cli.graph`.", - if graph_names.len() == 1 { "" } else { "s" }, - graph_names.join(", "), - ) -} - -pub(crate) fn validate_registry_for_catalog( - registry: &QueryRegistry, - catalog: &omnigraph_compiler::catalog::Catalog, - label: &str, -) -> omnigraph::error::Result<()> { - let report = check(registry, catalog); - if report.has_breakages() { - return Err(omnigraph::error::OmniError::manifest( - format_check_breakages(label, &report), - )); - } - Ok(()) -} +/// `queries validate --cluster ` (RFC-011): type-check every stored query +/// in the cluster catalog against its graph's applied schema. Both the registry +/// and the schemas come from the cluster serving snapshot — no omnigraph.yaml. +/// With `--graph`, scope to a single graph. pub(crate) async fn execute_queries_validate( - uri: Option, - config_path: Option<&PathBuf>, + cluster: &str, + graph: Option<&str>, json: bool, ) -> Result<()> { - let config = load_cli_config(config_path)?; - // One selection drives both the schema URI and the registry. - let (uri, selected) = resolve_selected_graph(&config, uri, "queries validate")?; - let registry = load_registry_or_report(&config, selected.as_deref())?; - let db = Omnigraph::open(&uri).await?; - let report = check(&registry, &db.catalog()); + let snapshot = read_serving_snapshot_or_report(cluster).await?; - let output = QueriesValidateOutput { - ok: !report.has_breakages(), - breakages: report - .breakages - .iter() - .map(|b| QueriesIssue { + // Type-check per graph: each graph's stored queries against its own schema + // (read from the graph's applied storage root). 
A `--graph` filter scopes to + // exactly one graph; an unknown id is a loud error. + let mut breakages = Vec::new(); + let mut warnings = Vec::new(); + let mut total = 0usize; + let mut matched_any = false; + for serving_graph in &snapshot.graphs { + if graph.is_some_and(|g| g != serving_graph.graph_id) { + continue; + } + matched_any = true; + let registry = registry_from_serving_queries(&snapshot.queries, Some(&serving_graph.graph_id))?; + let db = Omnigraph::open(&serving_graph.root.to_string_lossy()).await?; + let report = check(&registry, &db.catalog()); + total += registry.len(); + for b in &report.breakages { + breakages.push(QueriesIssue { query: b.query.clone(), message: b.message.clone(), - }) - .collect(), - warnings: report - .warnings - .iter() - .map(|w| QueriesIssue { + }); + } + for w in &report.warnings { + warnings.push(QueriesIssue { query: w.query.clone(), message: w.message.clone(), - }) - .collect(), + }); + } + } + if let Some(graph_id) = graph { + if !matched_any { + bail!("graph `{graph_id}` is not applied in cluster `{cluster}`"); + } + } + + let has_breakages = !breakages.is_empty(); + let output = QueriesValidateOutput { + ok: !has_breakages, + breakages, + warnings, }; if json { @@ -931,8 +856,8 @@ pub(crate) async fn execute_queries_validate( if output.breakages.is_empty() { println!( "OK {} stored quer{} type-check against the schema", - registry.len(), - if registry.len() == 1 { "y" } else { "ies" } + total, + if total == 1 { "y" } else { "ies" } ); } for issue in &output.breakages { @@ -943,17 +868,22 @@ pub(crate) async fn execute_queries_validate( } } - if report.has_breakages() { + if has_breakages { io::stdout().flush()?; std::process::exit(1); } Ok(()) } -pub(crate) fn execute_queries_list(config_path: Option<&PathBuf>, json: bool) -> Result<()> { - let config = load_cli_config(config_path)?; - let selected = resolve_registry_selection_for_list(&config)?; - let registry = load_registry_or_report(&config, selected.as_deref())?; 
+/// `queries list --cluster ` (RFC-011): list the catalog's stored queries. +/// With `--graph`, scope to one graph. +pub(crate) async fn execute_queries_list( + cluster: &str, + graph: Option<&str>, + json: bool, +) -> Result<()> { + let snapshot = read_serving_snapshot_or_report(cluster).await?; + let registry = registry_from_serving_queries(&snapshot.queries, graph)?; let output = QueriesListOutput { queries: registry @@ -1075,6 +1005,52 @@ pub(crate) fn rewrite_deprecated_argv(args: Vec) -> Vec { mod tests { use super::*; + #[test] + fn graph_resource_id_for_selection_uses_name_or_anonymous_uri() { + assert_eq!( + graph_resource_id_for_selection(Some("local"), "/tmp/graph.omni"), + "local" + ); + assert_eq!( + graph_resource_id_for_selection(None, "/tmp/graph.omni"), + "/tmp/graph.omni" + ); + } + + // RFC-011 Decision 9: locality classifier for the destructive-confirm gate. + #[test] + fn uri_is_local_truth_table() { + // Local: bare path or file://. + assert!(uri_is_local("graph.omni")); + assert!(uri_is_local("/abs/path/graph.omni")); + assert!(uri_is_local("file:///tmp/graph.omni")); + // Non-local: served or object-store schemes. + assert!(!uri_is_local("http://host/graphs/g")); + assert!(!uri_is_local("https://host/graphs/g")); + assert!(!uri_is_local("s3://bucket/graph.omni")); + assert!(!uri_is_local("gs://bucket/graph.omni")); + } + + // RFC-011 Decision 9: a non-local destructive write with `--json` (the CI + // shape — also covers the no-TTY case, since tests run without a terminal) + // refuses rather than proceeding; a local one and an explicit `--yes` pass. + #[test] + fn confirm_destructive_refuses_non_local_without_consent() { + let err = confirm_destructive("cleanup", "s3://b/g.omni", false, true) + .unwrap_err() + .to_string(); + assert!(err.contains("--yes"), "{err}"); + } + + #[test] + fn confirm_destructive_allows_local_and_explicit_yes() { + // Local needs no confirmation, even with --json. 
+ assert!(confirm_destructive("cleanup", "file:///tmp/g.omni", false, true).is_ok()); + assert!(confirm_destructive("branch delete", "graph.omni", false, true).is_ok()); + // --yes consents to a non-local target. + assert!(confirm_destructive("cleanup", "s3://b/g.omni", true, true).is_ok()); + } + // RFC-011 Decision 2: `--server` accepts a literal URL (value with `://`), // bypassing the operator-config registry — so no config / OMNIGRAPH_HOME is // read on this path (hermetic). diff --git a/crates/omnigraph-cli/src/main.rs b/crates/omnigraph-cli/src/main.rs index 606db96..bb3b062 100644 --- a/crates/omnigraph-cli/src/main.rs +++ b/crates/omnigraph-cli/src/main.rs @@ -1,15 +1,11 @@ use std::ffi::OsString; use std::fs; use std::io::{self, Write}; -use std::path::Path; use std::path::PathBuf; -use std::sync::Arc; - use clap::{Arg, ArgAction, Args, CommandFactory, FromArgMatches, Parser, Subcommand, ValueEnum}; -use color_eyre::eyre::{Result, WrapErr, bail}; +use color_eyre::eyre::{Result, bail}; use omnigraph::db::{Omnigraph, ReadTarget, SnapshotId}; use omnigraph::loader::LoadMode; -use omnigraph::storage::normalize_root_uri; use omnigraph_cluster::{ ApplyOptions, ApplyOutput, ApproveOutput, DiagnosticSeverity, ForceUnlockOutput, PlanOutput, StateSyncOutput, StatusOutput, ValidateOutput, apply_config_dir_with_options, approve_config_dir, force_unlock_config_dir, import_config_dir, plan_config_dir, @@ -26,10 +22,9 @@ use omnigraph_api_types::{ ChangeOutput, CommitOutput, ErrorOutput, IngestOutput, ReadOutput, SchemaApplyOutput, SnapshotTableOutput, }; -use omnigraph_server::queries::{QueryRegistry, check, format_check_breakages}; +use omnigraph_server::queries::{QueryRegistry, check}; use omnigraph_server::{ - AliasCommand, OmnigraphConfig, PolicyAction, PolicyDecision, PolicyEngine, PolicyRequest, - PolicyTestConfig, ReadOutputFormat, graph_resource_id_for_selection, load_config, + PolicyAction, PolicyDecision, PolicyEngine, PolicyRequest, PolicyTestConfig, }; use 
reqwest::Method; use reqwest::header::AUTHORIZATION; @@ -38,12 +33,11 @@ use serde::de::DeserializeOwned; use serde_json::Value; mod embed; -mod migrate; mod operator; mod read_format; use embed::{EmbedArgs, EmbedOutput, execute_embed}; -use read_format::{ReadRenderOptions, render_read}; +use read_format::{ReadOutputFormat, ReadRenderOptions, render_read}; mod cli; mod client; @@ -77,42 +71,6 @@ async fn main() -> Result<()> { // before any per-command dispatch. planes::guard_addressing(&cli)?; match cli.command { - Command::Config { command } => match command { - ConfigCommand::Migrate { config, write, json } => { - let path = migrate::legacy_config_path(config.as_ref()); - if !path.exists() { - bail!( - "no legacy config at '{}' — nothing to migrate", - path.display() - ); - } - let legacy = load_config(Some(&path))?; - let report = migrate::build_report(&legacy, &path); - if write { - let legacy_dir = path - .parent() - .filter(|parent| !parent.as_os_str().is_empty()) - .unwrap_or(std::path::Path::new(".")) - .to_path_buf(); - let written = migrate::apply_report(&report, &legacy_dir)?; - if json { - print_json(&serde_json::json!({ - "report": report, - "written": written, - }))?; - } else { - print!("{}", migrate::render_report(&report)); - for line in written { - println!("wrote: {line}"); - } - } - } else if json { - print_json(&report)?; - } else { - print!("{}", migrate::render_report(&report)); - } - } - }, Command::Login { name, token, json } => { let token = match token { Some(token) => token, @@ -136,6 +94,124 @@ async fn main() -> Result<()> { let path = crate::operator::remove_credential(&name)?; finish_logout(&name, &path, json)?; } + Command::Profile { command } => { + use crate::operator::ScopeBinding; + let op = crate::operator::load_operator_config()?; + let active = std::env::var(scope::PROFILE_ENV) + .ok() + .filter(|s| !s.is_empty()); + match command { + ProfileCommand::List { json } => { + let items: Vec = op + .profiles + .iter() + 
.map(|(name, profile)| { + let (binding, scope_kind, target, valid, error) = + match profile.binding(name) { + Ok(ScopeBinding::Server(s)) => ( + format!("server: {s}"), + "server".to_string(), + Some(s), + true, + None, + ), + Ok(ScopeBinding::Cluster(c)) => ( + format!("cluster: {c}"), + "cluster".to_string(), + Some(c), + true, + None, + ), + Ok(ScopeBinding::Store(u)) => ( + format!("store: {u}"), + "store".to_string(), + Some(u), + true, + None, + ), + Err(e) => ( + format!("invalid: {e}"), + "invalid".to_string(), + None, + false, + Some(e.to_string()), + ), + }; + ProfileListItem { + name: name.clone(), + binding, + scope_kind, + target, + valid, + error, + default_graph: profile.default_graph.clone(), + active: active.as_deref() == Some(name.as_str()), + } + }) + .collect(); + print_profile_list(&items, json)?; + } + ProfileCommand::Show { name, json } => { + let detail = match name.or(active) { + Some(name) => { + let profile = op.profile(&name).ok_or_else(|| { + color_eyre::eyre::eyre!( + "unknown profile '{name}' (not defined under `profiles:`)" + ) + })?; + let (kind, target, endpoint) = match profile.binding(&name)? { + ScopeBinding::Server(s) => { + let endpoint = op.servers.get(&s).map(|sv| sv.url.clone()); + ("server", Some(s), endpoint) + } + ScopeBinding::Cluster(c) => { + let endpoint = op.cluster_root(&c).map(str::to_string); + ("cluster", Some(c), endpoint) + } + ScopeBinding::Store(u) => ("store", Some(u.clone()), Some(u)), + }; + ProfileDetail { + name, + scope_kind: kind.to_string(), + target, + endpoint, + default_graph: profile + .default_graph + .clone() + .or_else(|| op.default_graph().map(str::to_string)), + output_format: op + .output() + .and_then(|f| f.to_possible_value()) + .map(|v| v.get_name().to_string()), + } + } + // No name and no active profile: the flat operator defaults. 
+ None => { + let (kind, target, endpoint) = if let Some(s) = op.default_server() { + let endpoint = op.servers.get(s).map(|sv| sv.url.clone()); + ("server", Some(s.to_string()), endpoint) + } else if let Some(u) = op.default_store() { + ("store", Some(u.to_string()), Some(u.to_string())) + } else { + ("none", None, None) + }; + ProfileDetail { + name: "(defaults)".to_string(), + scope_kind: kind.to_string(), + target, + endpoint, + default_graph: op.default_graph().map(str::to_string), + output_format: op + .output() + .and_then(|f| f.to_possible_value()) + .map(|v| v.get_name().to_string()), + } + } + }; + print_profile_detail(&detail, json)?; + } + } + } Command::Version => { println!("omnigraph {}", env!("CARGO_PKG_VERSION")); } @@ -170,24 +246,26 @@ async fn main() -> Result<()> { } Command::Load { uri, - config, data, branch, from, mode, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve_with_policy( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.as_actor.as_deref(), cli.profile.as_deref(), cli.store.as_deref(), - )?; - let branch = resolve_branch(&config, branch, None, "main"); + ) + .await?; + let branch = resolve_branch(branch, None, "main"); + if matches!(mode, CliLoadMode::Overwrite) { + confirm_destructive("load --mode overwrite", client.uri(), cli.yes, json)?; + } + echo_write_target(cli.quiet, "load", client.uri(), client.is_remote()); let payload = client .load(&branch, from.as_deref(), &data.to_string_lossy(), mode) .await?; @@ -199,7 +277,6 @@ async fn main() -> Result<()> { } Command::Ingest { uri, - config, data, branch, from, @@ -211,18 +288,18 @@ async fn main() -> Result<()> { "warning: `omnigraph ingest` is deprecated and will be removed in a future release; \ use `omnigraph load --from --mode ` (ingest defaults: --from main --mode merge)" ); - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve_with_policy( - &config, 
cli.server.as_deref(), cli.graph.as_deref(), uri, cli.as_actor.as_deref(), cli.profile.as_deref(), cli.store.as_deref(), - )?; - let branch = resolve_branch(&config, branch, None, "main"); - let from = resolve_branch(&config, from, None, "main"); + ) + .await?; + let branch = resolve_branch(branch, None, "main"); + let from = resolve_branch(from, None, "main"); + echo_write_target(cli.quiet, "ingest", client.uri(), client.is_remote()); let payload = client .ingest(&branch, &from, &data.to_string_lossy(), mode) .await?; @@ -235,22 +312,21 @@ async fn main() -> Result<()> { Command::Branch { command } => match command { BranchCommand::Create { uri, - config, from, name, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve_with_policy( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.as_actor.as_deref(), cli.profile.as_deref(), cli.store.as_deref(), - )?; - let from = resolve_branch(&config, from, None, "main"); + ) + .await?; + let from = resolve_branch(from, None, "main"); + echo_write_target(cli.quiet, "branch create", client.uri(), client.is_remote()); let payload = client.branch_create_from(&from, &name).await?; if json { print_json(&payload)?; @@ -260,18 +336,16 @@ async fn main() -> Result<()> { } BranchCommand::List { uri, - config, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.profile.as_deref(), cli.store.as_deref(), - )?; + ) + .await?; let payload = client.branch_list().await?; if json { print_json(&payload)?; @@ -283,20 +357,20 @@ async fn main() -> Result<()> { } BranchCommand::Delete { uri, - config, name, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve_with_policy( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.as_actor.as_deref(), cli.profile.as_deref(), cli.store.as_deref(), - )?; + ) + 
.await?; + confirm_destructive("branch delete", client.uri(), cli.yes, json)?; + echo_write_target(cli.quiet, "branch delete", client.uri(), client.is_remote()); let payload = client.branch_delete(&name).await?; if json { print_json(&payload)?; @@ -306,22 +380,21 @@ async fn main() -> Result<()> { } BranchCommand::Merge { uri, - config, source, into, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve_with_policy( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.as_actor.as_deref(), cli.profile.as_deref(), cli.store.as_deref(), - )?; - let into = resolve_branch(&config, into, None, "main"); + ) + .await?; + let into = resolve_branch(into, None, "main"); + echo_write_target(cli.quiet, "branch merge", client.uri(), client.is_remote()); let payload = client.branch_merge(&source, &into).await?; if json { print_json(&payload)?; @@ -338,19 +411,17 @@ async fn main() -> Result<()> { Command::Commit { command } => match command { CommitCommand::List { uri, - config, branch, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.profile.as_deref(), cli.store.as_deref(), - )?; + ) + .await?; let payload = client.list_commits(branch.as_deref()).await?; if json { print_json(&payload)?; @@ -360,19 +431,17 @@ async fn main() -> Result<()> { } CommitCommand::Show { uri, - config, commit_id, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.profile.as_deref(), cli.store.as_deref(), - )?; + ) + .await?; let commit = client.get_commit(&commit_id).await?; if json { print_json(&commit)?; @@ -384,13 +453,19 @@ async fn main() -> Result<()> { Command::Schema { command } => match command { SchemaCommand::Plan { uri, - config, schema, json, allow_data_loss, } => { - let config = 
load_cli_config(config.as_ref())?; - let uri = resolve_local_uri(&config, uri, "schema plan")?; + let uri = resolve_maintenance_uri( + cli.profile.as_deref(), + cli.store.as_deref(), + cli.cluster.as_deref(), + cli.graph.as_deref(), + uri, + "schema plan", + ) + .await?; let schema_source = fs::read_to_string(&schema)?; let db = Omnigraph::open(&uri).await?; let plan = db @@ -413,40 +488,46 @@ async fn main() -> Result<()> { } SchemaCommand::Apply { uri, - config, schema, json, allow_data_loss, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve_with_policy( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.as_actor.as_deref(), cli.profile.as_deref(), cli.store.as_deref(), - )?; + ) + .await?; + // RFC-011 Decision 10: a graph managed by a cluster evolves via + // `cluster apply` (ledger/recovery/approvals), not a direct + // `schema apply` against its storage root — that would bypass the + // ledger. Mirrors `init`'s refusal. Only the embedded path can + // address a storage root; a served apply (`--server`) is the + // server's concern. + if !client.is_remote() { + if let Some(root) = + omnigraph_cluster::cluster_root_for_graph_uri(client.uri()).await + { + bail!( + "`{}` is inside cluster `{root}`. A graph in a cluster evolves via \ + `cluster apply` (which records ledger, recovery, and approvals), not \ + `schema apply`. Update the schema in cluster.yaml and run `cluster apply`.", + client.uri() + ); + } + } let schema_source = fs::read_to_string(&schema)?; - // The stored-query registry check is an embedded-only concern - // (the remote arm ignores the validator — the server runs its - // own check); build it only for the local path so the remote - // path keeps its no-registry-load behavior. 
- let registry = if client.is_remote() { - None - } else { - let registry = load_registry_or_report(&config, client.selected())?; - (!registry.is_empty()).then_some(registry) - }; - let label = client.selected().unwrap_or(client.uri()).to_string(); + // The embedded (direct-store) arm carries no stored-query + // registry — the registry is cluster-owned (RFC-011), so a + // direct apply has nothing to validate against. The served arm + // runs the server's own catalog check. So the validator is a + // no-op here on both arms. + echo_write_target(cli.quiet, "schema apply", client.uri(), client.is_remote()); let output = client - .apply_schema(&schema_source, allow_data_loss, |catalog| { - if let Some(registry) = registry.as_ref() { - validate_registry_for_catalog(registry, catalog, &label)?; - } - Ok(()) - }) + .apply_schema(&schema_source, allow_data_loss, |_catalog| Ok(())) .await?; if json { print_json(&output)?; @@ -456,18 +537,16 @@ async fn main() -> Result<()> { } SchemaCommand::Show { uri, - config, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.profile.as_deref(), cli.store.as_deref(), - )?; + ) + .await?; let output = client.schema_source().await?; if json { print_json(&output)?; @@ -478,48 +557,58 @@ async fn main() -> Result<()> { }, Command::Lint { uri, - config, query, schema, json, } => { - let config = load_cli_config(config.as_ref())?; - let output = - execute_query_lint(&config, uri, schema.as_ref(), &query) - .await?; + // A graph target (when `--schema` is absent) resolves through the + // direct scope path (positional URI / --store / --profile / + // defaults.store). Offline (`--schema`) needs no graph, so leave + // the uri unresolved in that case. 
+ let graph_uri = if schema.is_some() { + uri + } else { + Some( + resolve_maintenance_uri( + cli.profile.as_deref(), + cli.store.as_deref(), + cli.cluster.as_deref(), + cli.graph.as_deref(), + uri, + "lint", + ) + .await?, + ) + }; + let output = execute_query_lint(graph_uri, schema.as_ref(), &query).await?; finish_query_lint(&output, json)?; } - Command::Queries { command } => match command { - QueriesCommand::Validate { - uri, - config, - json, - } => { - execute_queries_validate(uri, config.as_ref(), json).await?; + Command::Queries { command } => { + let cluster = + require_cluster_scope(cli.cluster.as_deref(), cli.profile.as_deref(), "queries")?; + match command { + QueriesCommand::Validate { json } => { + execute_queries_validate(&cluster, cli.graph.as_deref(), json).await?; + } + QueriesCommand::List { json } => { + execute_queries_list(&cluster, cli.graph.as_deref(), json).await?; + } } - QueriesCommand::List { - config, - json, - } => { - execute_queries_list(config.as_ref(), json)?; - } - }, + } Command::Snapshot { uri, - config, branch, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.profile.as_deref(), cli.store.as_deref(), - )?; - let branch = resolve_branch(&config, branch, None, "main"); + ) + .await?; + let branch = resolve_branch(branch, None, "main"); let payload = client.snapshot(&branch).await?; if json { print_json(&payload)?; @@ -529,22 +618,20 @@ async fn main() -> Result<()> { } Command::Export { uri, - config, branch, jsonl, type_names, table_keys, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.profile.as_deref(), cli.store.as_deref(), - )?; - let branch = resolve_branch(&config, branch, None, "main"); + ) + .await?; + let branch = resolve_branch(branch, None, "main"); if jsonl { eprintln!("warning: 
--jsonl is deprecated; `omnigraph export` always emits JSONL"); } @@ -556,280 +643,182 @@ async fn main() -> Result<()> { .await?; } Command::Query { - uri, - legacy_uri, - config, - alias, + name, query, query_string, - name, params, branch, snapshot, format, json, - alias_args, } => { - if alias.is_none() && query.is_none() && query_string.is_none() { - bail!("exactly one of --query, --query-string, or --alias must be provided"); - } - - let config = load_cli_config(config.as_ref())?; - // Operator aliases (RFC-007 PR 3): pure bindings to stored - // queries. A legacy file-alias with the same name wins during - // the RFC-008 window (with a warning); an alias name found - // only in the operator layer takes the invoke path here. - if let Some(alias_name) = alias.as_deref() { - let operator_config = crate::operator::load_operator_config()?; - if let Some(operator_alias) = operator_config.aliases.get(alias_name) { - if config.alias(alias_name).is_ok() { - eprintln!( - "warning: alias '{alias_name}' is defined in both omnigraph.yaml (legacy, wins during the deprecation window) and the operator config; the legacy definition applies" - ); - } else { - // The hidden legacy-uri positional swallows the first - // bare arg; an operator alias always knows its target, - // so reclaim it as the first positional param. 
- let (_, alias_args) = normalize_legacy_alias_uri( - legacy_uri.clone(), - true, - Some(alias_name), - alias_args.clone(), - ); - let output = execute_operator_alias( - &http_client, - &config, - alias_name, - operator_alias, - &alias_args, - load_params_json(&params)?, - ) - .await?; - let format = - resolve_read_format(&config, format, json, operator_alias.format); - print_read_output(&output, format, &config)?; - return Ok(()); - } - } - } - let alias = resolve_alias(&config, alias.as_deref(), AliasCommand::Read)?; - let alias_name = alias.as_ref().map(|(name, _)| *name); - let alias_config = alias.as_ref().map(|(_, alias)| *alias); - let alias_graph = alias_config.and_then(|alias| alias.graph.as_deref()); - let target_available = alias_graph.is_some() || config.cli_graph_name().is_some(); - let (legacy_uri, alias_args) = - normalize_legacy_alias_uri(legacy_uri, target_available, alias_name, alias_args); - // `--target` is gone; resolve an alias's legacy `graph` name to its - // URI (a positional URI still wins). 
- let uri = match uri.or(legacy_uri) { - Some(uri) => Some(uri), - None => match alias_graph { - Some(name) => Some(config.resolve_target_uri(None, Some(name), None)?), - None => None, - }, - }; let client = client::GraphClient::resolve( - &config, cli.server.as_deref(), cli.graph.as_deref(), - uri, + None, cli.profile.as_deref(), cli.store.as_deref(), - )?; - let query_source = resolve_query_source( - &config, - query.as_ref(), - query_string.as_deref(), - alias_config.map(|a| a.query.as_str()), - )?; - let params_json = merged_params_json( - alias_name, - alias_config - .map(|alias| alias.args.as_slice()) - .unwrap_or(&[]), - &alias_args, - load_params_json(&params)?, - )?; - let target = resolve_read_target( - &config, - branch, - snapshot, - alias_config.and_then(|alias| alias.branch.clone()), - )?; - let query_name = name.or_else(|| alias_config.and_then(|alias| alias.name.clone())); - let output = client - .query( - target, - &query_source, - query_name.as_deref(), - params_json.as_ref(), - ) - .await?; - let format = resolve_read_format( - &config, - format, - json, - alias_config.and_then(|alias| alias.format), - ); - print_read_output(&output, format, &config)?; + ) + .await?; + let params_json = load_params_json(&params)?; + let target = resolve_read_target(branch, snapshot, None)?; + let output: ReadOutput = if query.is_some() || query_string.is_some() { + // Ad-hoc lane: run the source; the positional `name` selects + // within it when it holds more than one query. + let query_source = + resolve_query_source(query.as_ref(), query_string.as_deref(), None)?; + client + .query(target, &query_source, name.as_deref(), params_json.as_ref()) + .await? + } else { + // Catalog lane (served-only): invoke the stored query by name. 
+ let Some(name) = name else { + bail!( + "provide a query name to invoke from the catalog, or -e '' / \ + --query for an ad-hoc query" + ); + }; + let (branch, snapshot) = match &target { + ReadTarget::Branch(b) => (Some(b.clone()), None), + ReadTarget::Snapshot(s) => (None, Some(s.as_str().to_string())), + }; + client + .invoke_named(&name, false, params_json.as_ref(), branch, snapshot) + .await? + }; + let format = resolve_read_format(format, json, None); + print_read_output(&output, format)?; } Command::Mutate { - uri, - legacy_uri, - config, - alias, + name, query, query_string, - name, params, branch, json, - alias_args, } => { - if alias.is_none() && query.is_none() && query_string.is_none() { - bail!("exactly one of --query, --query-string, or --alias must be provided"); - } - - let config = load_cli_config(config.as_ref())?; - let alias = resolve_alias(&config, alias.as_deref(), AliasCommand::Change)?; - let alias_name = alias.as_ref().map(|(name, _)| *name); - let alias_config = alias.as_ref().map(|(_, alias)| *alias); - let alias_graph = alias_config.and_then(|alias| alias.graph.as_deref()); - let target_available = alias_graph.is_some() || config.cli_graph_name().is_some(); - let (legacy_uri, alias_args) = - normalize_legacy_alias_uri(legacy_uri, target_available, alias_name, alias_args); - // `--target` is gone; resolve an alias's legacy `graph` name to its - // URI (a positional URI still wins). 
- let uri = match uri.or(legacy_uri) { - Some(uri) => Some(uri), - None => match alias_graph { - Some(name) => Some(config.resolve_target_uri(None, Some(name), None)?), - None => None, - }, - }; let client = client::GraphClient::resolve_with_policy( - &config, cli.server.as_deref(), cli.graph.as_deref(), - uri, + None, cli.as_actor.as_deref(), cli.profile.as_deref(), cli.store.as_deref(), - )?; - let query_source = resolve_query_source( - &config, - query.as_ref(), - query_string.as_deref(), - alias_config.map(|a| a.query.as_str()), - )?; - let params_json = merged_params_json( - alias_name, - alias_config - .map(|alias| alias.args.as_slice()) - .unwrap_or(&[]), - &alias_args, - load_params_json(&params)?, - )?; - let branch = resolve_branch( - &config, - branch, - alias_config.and_then(|alias| alias.branch.clone()), - "main", - ); - let query_name = name.or_else(|| alias_config.and_then(|alias| alias.name.clone())); - let output = client - .mutate( - &branch, - &query_source, - query_name.as_deref(), - params_json.as_ref(), - ) - .await?; + ) + .await?; + let params_json = load_params_json(&params)?; + let branch = resolve_branch(branch, None, "main"); + let output: ChangeOutput = if query.is_some() || query_string.is_some() { + // Ad-hoc lane: run the source; positional `name` selects within it. + let query_source = + resolve_query_source(query.as_ref(), query_string.as_deref(), None)?; + client + .mutate(&branch, &query_source, name.as_deref(), params_json.as_ref()) + .await? + } else { + // Catalog lane (served-only): invoke the stored mutation by name. + let Some(name) = name else { + bail!( + "provide a mutation name to invoke from the catalog, or -e '' / \ + --query for an ad-hoc mutation" + ); + }; + client + .invoke_named(&name, true, params_json.as_ref(), Some(branch), None) + .await? 
+ }; if json { print_json(&output)?; } else { print_change_human(&output); } } - Command::Policy { command } => match command { - PolicyCommand::Validate { config } => { - let config = load_cli_config(config.as_ref())?; - let context = resolve_policy_context(&config)?; - let engine = resolve_policy_engine(&context)?; - println!( - "policy valid: {} [{} actors]", - context.policy_file.display(), - engine.known_actor_count() + Command::Alias { + name, + args, + params, + format, + json, + } => { + let operator_config = crate::operator::load_operator_config()?; + let Some(operator_alias) = operator_config.aliases.get(&name) else { + let defined: Vec<&str> = + operator_config.aliases.keys().map(String::as_str).collect(); + bail!( + "unknown alias '{name}'; defined aliases: [{}] \ + (add it under `aliases:` in ~/.omnigraph/config.yaml)", + defined.join(", ") ); - } - PolicyCommand::Test { config } => { - let config = load_cli_config(config.as_ref())?; - let context = resolve_policy_context(&config)?; - let engine = resolve_policy_engine(&context)?; - let tests_path = resolve_policy_tests_path(&context); - let tests = PolicyTestConfig::load(&tests_path)?; - engine.run_tests(&tests)?; - println!("policy tests passed: {} cases", tests.cases.len()); - } - PolicyCommand::Explain { - config, - actor, - action, - branch, - target_branch, - } => { - let config = load_cli_config(config.as_ref())?; - let context = resolve_policy_context(&config)?; - let engine = resolve_policy_engine(&context)?; - let request = PolicyRequest { + }; + let output = execute_operator_alias( + &http_client, + &name, + operator_alias, + &args, + load_params_json(&params)?, + ) + .await?; + let format = resolve_read_format(format, json, operator_alias.format); + print_read_output(&output, format)?; + } + Command::Policy { command } => { + // Policy tooling sources the Cedar bundle(s) from the cluster's + // applied policies (RFC-011): --cluster , + the global --graph + // to pick a graph's bundle when 
several apply. + let cluster = + require_cluster_scope(cli.cluster.as_deref(), cli.profile.as_deref(), "policy")?; + let graph = cli.graph.as_deref(); + let graph_id = match graph { + Some(id) => graph_resource_id_for_selection(Some(id), ""), + None => graph_resource_id_for_selection(None, "default"), + }; + let policies = read_cluster_policies(&cluster).await?; + match command { + PolicyCommand::Validate {} => { + let bundle = select_cluster_policy(&cluster, &policies, graph)?; + let engine = PolicyEngine::load_graph_from_source(&bundle.source, &graph_id)?; + println!( + "policy valid: bundle '{}' [{} actors]", + bundle.name, + engine.known_actor_count() + ); + } + PolicyCommand::Test { tests } => { + let bundle = select_cluster_policy(&cluster, &policies, graph)?; + let engine = PolicyEngine::load_graph_from_source(&bundle.source, &graph_id)?; + let tests = PolicyTestConfig::load(&tests)?; + engine.run_tests(&tests)?; + println!("policy tests passed: {} cases", tests.cases.len()); + } + PolicyCommand::Explain { + actor, action, branch, target_branch, - }; - let decision = engine.authorize(&actor, &request)?; - print_policy_explain(&decision, &actor, &request); + } => { + let bundle = select_cluster_policy(&cluster, &policies, graph)?; + let engine = PolicyEngine::load_graph_from_source(&bundle.source, &graph_id)?; + let request = PolicyRequest { + action, + branch, + target_branch, + }; + let decision = engine.authorize(&actor, &request)?; + print_policy_explain(&decision, &actor, &request); + } } - }, - Command::Optimize { - uri, - config, - cluster, - cluster_graph, - json, - } => { - let config = load_cli_config(config.as_ref())?; - let uri = if uri.is_some() || cluster.is_some() { - resolve_storage_uri( - &config, - uri, - cluster.as_deref(), - cluster_graph.as_deref(), - "optimize", - ) - .await? - } else { - // RFC-011: no explicit per-command address — consult the scope - // (a --profile cluster binding, --store, or operator defaults). 
- let scope = scope::resolve_scope( - &operator::load_operator_config()?, - planes::Capability::Direct, - scope::ScopeFlags { - profile: cli.profile.as_deref(), - store: cli.store.as_deref(), - server: None, - graph: cli.graph.as_deref(), - uri: None, - }, - )?; - resolve_storage_uri( - &config, - scope.uri, - scope.cluster.as_deref(), - scope.cluster_graph.as_deref(), - "optimize", - ) - .await? - }; + } + Command::Optimize { uri, json } => { + let uri = resolve_maintenance_uri( + cli.profile.as_deref(), + cli.store.as_deref(), + cli.cluster.as_deref(), + cli.graph.as_deref(), + uri, + "optimize", + ) + .await?; + echo_write_target(cli.quiet, "optimize", &uri, false); let db = Omnigraph::open(&uri).await?; let stats = db.optimize().await?; if json { @@ -843,6 +832,10 @@ async fn main() -> Result<()> { "skipped": s.skipped.map(|r| r.as_str()), "manifest_version": s.manifest_version, "lance_head_version": s.lance_head_version, + "pending_indexes": s.pending_indexes.iter().map(|p| serde_json::json!({ + "column": p.column, + "reason": p.reason, + })).collect::>(), })).collect::>(), }); print_json(&value)?; @@ -859,50 +852,28 @@ async fn main() -> Result<()> { } else { println!(" {:<40} no-op", s.table_key); } + for p in &s.pending_indexes { + println!(" ↳ index pending on '{}': {}", p.column, p.reason); + } } } } Command::Repair { uri, - config, - cluster, - cluster_graph, confirm, force, json, } => { - let config = load_cli_config(config.as_ref())?; - let uri = if uri.is_some() || cluster.is_some() { - resolve_storage_uri( - &config, - uri, - cluster.as_deref(), - cluster_graph.as_deref(), - "repair", - ) - .await? - } else { - // RFC-011: no explicit per-command address — consult the scope. 
- let scope = scope::resolve_scope( - &operator::load_operator_config()?, - planes::Capability::Direct, - scope::ScopeFlags { - profile: cli.profile.as_deref(), - store: cli.store.as_deref(), - server: None, - graph: cli.graph.as_deref(), - uri: None, - }, - )?; - resolve_storage_uri( - &config, - scope.uri, - scope.cluster.as_deref(), - scope.cluster_graph.as_deref(), - "repair", - ) - .await? - }; + let uri = resolve_maintenance_uri( + cli.profile.as_deref(), + cli.store.as_deref(), + cli.cluster.as_deref(), + cli.graph.as_deref(), + uri, + "repair", + ) + .await?; + echo_write_target(cli.quiet, "repair", &uri, false); let db = Omnigraph::open(&uri).await?; let stats = db .repair(omnigraph::db::RepairOptions { confirm, force }) @@ -978,46 +949,20 @@ async fn main() -> Result<()> { } Command::Cleanup { uri, - config, - cluster, - cluster_graph, keep, older_than, confirm, json, } => { - let config = load_cli_config(config.as_ref())?; - let uri = if uri.is_some() || cluster.is_some() { - resolve_storage_uri( - &config, - uri, - cluster.as_deref(), - cluster_graph.as_deref(), - "cleanup", - ) - .await? - } else { - // RFC-011: no explicit per-command address — consult the scope. - let scope = scope::resolve_scope( - &operator::load_operator_config()?, - planes::Capability::Direct, - scope::ScopeFlags { - profile: cli.profile.as_deref(), - store: cli.store.as_deref(), - server: None, - graph: cli.graph.as_deref(), - uri: None, - }, - )?; - resolve_storage_uri( - &config, - scope.uri, - scope.cluster.as_deref(), - scope.cluster_graph.as_deref(), - "cleanup", - ) - .await? - }; + let uri = resolve_maintenance_uri( + cli.profile.as_deref(), + cli.store.as_deref(), + cli.cluster.as_deref(), + cli.graph.as_deref(), + uri, + "cleanup", + ) + .await?; let older_than_dur = older_than.as_deref().map(parse_duration_arg).transpose()?; @@ -1041,6 +986,11 @@ async fn main() -> Result<()> { ); return Ok(()); } + // Past the preview gate: a real destructive run. 
Against a non-local + // scope this additionally requires --yes (or an interactive yes), so + // `cleanup --confirm s3://…` in CI refuses rather than destroying. + confirm_destructive("cleanup", &uri, cli.yes, json)?; + echo_write_target(cli.quiet, "cleanup", &uri, false); let options = omnigraph::db::CleanupPolicyOptions { keep_versions: keep, @@ -1142,18 +1092,16 @@ async fn main() -> Result<()> { Command::Graphs { command } => match command { GraphsCommand::List { uri, - config, json, } => { - let config = load_cli_config(config.as_ref())?; let client = client::GraphClient::resolve( - &config, cli.server.as_deref(), cli.graph.as_deref(), uri, cli.profile.as_deref(), cli.store.as_deref(), - )?; + ) + .await?; let payload = client.list_graphs().await?; if json { print_json(&payload)?; diff --git a/crates/omnigraph-cli/src/main_tests.rs b/crates/omnigraph-cli/src/main_tests.rs index 2e1db5c..4f93277 100644 --- a/crates/omnigraph-cli/src/main_tests.rs +++ b/crates/omnigraph-cli/src/main_tests.rs @@ -1,22 +1,16 @@ //! In-source test suite for the CLI binary (moved verbatim from //! main.rs; `use super::*` resolves through the #[path] declaration). 
- use std::fs; - use super::{ - DEFAULT_BEARER_TOKEN_ENV, apply_bearer_token, bearer_token_from_env_file, - legacy_change_request_body, load_cli_config, load_env_file_into_process, - normalize_bearer_token, parse_env_assignment, resolve_cli_graph, resolve_policy_context, - resolve_remote_bearer_token, + DEFAULT_BEARER_TOKEN_ENV, apply_bearer_token, legacy_change_request_body, + normalize_bearer_token, resolve_remote_bearer_token, }; - use omnigraph_server::load_config; use reqwest::header::AUTHORIZATION; use serde_json::json; - use tempfile::tempdir; #[test] fn legacy_change_request_body_uses_legacy_field_names() { - // `execute_change_remote` hits `POST /change`, which old + // `mutate`'s remote arm hits `POST /change`, which old // `omnigraph-server` builds deserialize as `ChangeRequest` with // **required** `query_source` and optional `query_name` keys. // Newer servers accept both spellings via serde alias, but a @@ -96,120 +90,20 @@ } #[test] - fn parse_env_assignment_supports_plain_and_exported_values() { - assert_eq!( - parse_env_assignment("DEMO_TOKEN=demo-token"), - Some(("DEMO_TOKEN".to_string(), "demo-token".to_string())) - ); - assert_eq!( - parse_env_assignment("export DEMO_TOKEN=\"quoted-token\""), - Some(("DEMO_TOKEN".to_string(), "quoted-token".to_string())) - ); - assert_eq!(parse_env_assignment("# comment"), None); - assert_eq!(parse_env_assignment(" "), None); - } - - #[test] - fn bearer_token_from_env_file_reads_named_value() { - let temp = tempdir().unwrap(); - let env_file = temp.path().join(".env.omni"); - fs::write( - &env_file, - "FIRST=ignore\nexport DEMO_TOKEN=\" demo-token \"\n", - ) - .unwrap(); - - assert_eq!( - bearer_token_from_env_file(&env_file, "DEMO_TOKEN") - .unwrap() - .as_deref(), - Some("demo-token") - ); - assert_eq!( - bearer_token_from_env_file(&env_file, "MISSING").unwrap(), - None - ); - } - - #[test] - fn load_env_file_into_process_sets_missing_values_without_overriding_existing_ones() { - let temp = tempdir().unwrap(); 
- let env_file = temp.path().join(".env.omni"); - fs::write( - &env_file, - "AUTOLOAD_ONLY=from-file\nAUTOLOAD_PRESET=from-file\n", - ) - .unwrap(); - - let missing_key = "AUTOLOAD_ONLY"; - let preset_key = "AUTOLOAD_PRESET"; - let previous_missing = std::env::var_os(missing_key); - let previous_preset = std::env::var_os(preset_key); - - unsafe { - std::env::remove_var(missing_key); - std::env::set_var(preset_key, "from-env"); - } - - load_env_file_into_process(&env_file).unwrap(); - - assert_eq!(std::env::var(missing_key).unwrap(), "from-file"); - assert_eq!(std::env::var(preset_key).unwrap(), "from-env"); - - unsafe { - if let Some(value) = previous_missing { - std::env::set_var(missing_key, value); - } else { - std::env::remove_var(missing_key); - } - - if let Some(value) = previous_preset { - std::env::set_var(preset_key, value); - } else { - std::env::remove_var(preset_key); - } - } - } - - #[test] - fn resolve_remote_bearer_token_uses_scoped_env_file_with_global_fallback() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - r#" -graphs: - demo: - uri: https://example.com - bearer_token_env: DEMO_TOKEN -auth: - env_file: .env.omni -cli: - graph: demo -"#, - ) - .unwrap(); - fs::write( - temp.path().join(".env.omni"), - "DEMO_TOKEN=scoped-token\nOMNIGRAPH_BEARER_TOKEN=global-token\n", - ) - .unwrap(); - + fn resolve_remote_bearer_token_falls_back_to_default_env() { + // RFC-011: with no operator server matching the URL, the only chain + // left is the default `OMNIGRAPH_BEARER_TOKEN` env (no omnigraph.yaml + // scoped chain). Hermetic: no operator config is read for a literal URL + // that matches no `servers:` entry. 
let previous = std::env::var_os(DEFAULT_BEARER_TOKEN_ENV); let previous_home = std::env::var_os("OMNIGRAPH_HOME"); unsafe { - std::env::remove_var(DEFAULT_BEARER_TOKEN_ENV); - // Hermetic: the keyed hop (RFC-007 PR 2) must not pick up a real - // ~/.omnigraph on the developer's machine — and with no operator - // servers defined, the legacy chain below must behave - // byte-identically to pre-PR-2 (tested-as-untouched). - std::env::set_var("OMNIGRAPH_HOME", temp.path().join("no-operator-config")); + std::env::set_var(DEFAULT_BEARER_TOKEN_ENV, "global-token"); + std::env::set_var("OMNIGRAPH_HOME", "/nonexistent/omnigraph-test-home"); } - let config_path = temp.path().join("omnigraph.yaml"); - let config = load_config(Some(&config_path)).unwrap(); - assert_eq!( - resolve_remote_bearer_token(&config, Some("https://override.example.com")) + resolve_remote_bearer_token(Some("https://override.example.com")) .unwrap() .as_deref(), Some("global-token") @@ -228,196 +122,3 @@ cli: } } } - - #[test] - fn load_cli_config_autoloads_env_file_into_process() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - r#" -auth: - env_file: .env.omni -graphs: - demo: - uri: s3://bucket/prefix -"#, - ) - .unwrap(); - fs::write( - temp.path().join(".env.omni"), - "AUTOLOAD_FROM_CONFIG=loaded\n", - ) - .unwrap(); - - let key = "AUTOLOAD_FROM_CONFIG"; - let previous = std::env::var_os(key); - unsafe { - std::env::remove_var(key); - } - - let config_path = temp.path().join("omnigraph.yaml"); - let config = load_cli_config(Some(&config_path)).unwrap(); - - assert_eq!( - config.resolve_target_uri(None, Some("demo"), None).unwrap(), - "s3://bucket/prefix" - ); - assert_eq!(std::env::var(key).unwrap(), "loaded"); - - unsafe { - if let Some(value) = previous { - std::env::set_var(key, value); - } else { - std::env::remove_var(key); - } - } - } - - #[test] - fn graph_identity_resolve_policy_context_named_cli_graph_uses_graph_key_not_project_name_or_uri() - { - let 
temp = tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -project: - name: misleading-project -graphs: - local: - uri: /tmp/local-policy-graph.omni - policy: - file: ./policy.yaml -cli: - graph: local -"#, - ) - .unwrap(); - - let config = load_config(Some(&config_path)).unwrap(); - let context = resolve_policy_context(&config).unwrap(); - assert_eq!(context.graph_id, "local"); - } - - #[test] - fn graph_identity_resolve_policy_context_server_graph_uses_graph_key_when_cli_graph_absent() { - let temp = tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -project: - name: misleading-project -graphs: - local: - uri: /tmp/local-policy-graph.omni - policy: - file: ./server-policy.yaml -server: - graph: local -"#, - ) - .unwrap(); - - let config = load_config(Some(&config_path)).unwrap(); - let context = resolve_policy_context(&config).unwrap(); - assert_eq!(context.graph_id, "local"); - assert!(context.policy_file.ends_with("server-policy.yaml")); - } - - #[test] - fn graph_identity_resolve_policy_context_anonymous_uses_top_level_default_identity() { - let temp = tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -project: - name: misleading-project -graphs: - local: - uri: /tmp/local-policy-graph.omni -policy: - file: ./top-policy.yaml -"#, - ) - .unwrap(); - - let config = load_config(Some(&config_path)).unwrap(); - let context = resolve_policy_context(&config).unwrap(); - assert_eq!(context.graph_id, "default"); - assert!(context.policy_file.ends_with("top-policy.yaml")); - } - - #[test] - fn graph_identity_resolve_cli_graph_named_target_uses_graph_key_not_project_name_or_uri() { - let temp = tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -project: - name: misleading-project -graphs: - prod: - uri: s3://bucket/prod-graph/ - 
policy: - file: ./prod-policy.yaml -cli: - graph: prod -"#, - ) - .unwrap(); - - let config = load_config(Some(&config_path)).unwrap(); - // `--target` is removed; the `cli.graph` default drives the same - // graph-key (not project name / URI) selection. - let graph = resolve_cli_graph(&config, None).unwrap(); - assert_eq!(graph.selected(), Some("prod")); - assert_eq!(graph.graph_id, "prod"); - assert_eq!(graph.uri, "s3://bucket/prod-graph/"); - } - - #[test] - fn graph_identity_resolve_cli_graph_positional_uri_uses_anonymous_normalized_uri() { - let temp = tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -project: - name: misleading-project -graphs: - local: - uri: /tmp/configured-graph.omni - policy: - file: ./policy.yaml -cli: - graph: local -"#, - ) - .unwrap(); - - let config = load_config(Some(&config_path)).unwrap(); - let local_graph_path = temp.path().join("explicit-graph.omni"); - let local_graph = resolve_cli_graph( - &config, - Some(format!("file://{}", local_graph_path.display())), - ) - .unwrap(); - assert_eq!(local_graph.selected(), None); - assert_eq!( - local_graph.graph_id, - local_graph_path.to_string_lossy().as_ref() - ); - assert_eq!(local_graph.policy_file, None); - - let s3_graph = resolve_cli_graph( - &config, - Some("s3://bucket/anonymous-graph/".to_string()), - ) - .unwrap(); - assert_eq!(s3_graph.selected(), None); - assert_eq!(s3_graph.graph_id, "s3://bucket/anonymous-graph"); - assert_eq!(s3_graph.policy_file, None); - } diff --git a/crates/omnigraph-cli/src/migrate.rs b/crates/omnigraph-cli/src/migrate.rs deleted file mode 100644 index 7410381..0000000 --- a/crates/omnigraph-cli/src/migrate.rs +++ /dev/null @@ -1,408 +0,0 @@ -//! `omnigraph config migrate` (RFC-008 stage 2): split a legacy -//! `omnigraph.yaml` into its two destinations — the team half as a -//! ready-to-review `cluster.yaml` proposal, the personal half merged into -//! 
`~/.omnigraph/config.yaml` — and name what's obsolete. The command is -//! the completeness test of RFC-008's migration map: any key it cannot -//! place is a bug in the RFC. -//! -//! Touches nothing without `--write`. Referenced `.gq`/policy files are -//! never moved; manual steps are printed instead. - -use std::collections::BTreeMap; -use std::path::{Path, PathBuf}; - -use color_eyre::Result; -use color_eyre::eyre::eyre; -use omnigraph_server::OmnigraphConfig; -use serde::Serialize; - -use crate::operator; - -#[derive(Debug, Serialize)] -pub(crate) struct MigrateReport { - pub(crate) source: String, - /// The ready-to-review cluster.yaml text (None when the legacy file - /// declares nothing team-shaped). - pub(crate) cluster_yaml: Option, - /// Operator keys to merge: dotted key -> YAML value text. - pub(crate) operator_merge: BTreeMap, - /// Keys with no destination, and why. - pub(crate) dropped: Vec, - /// Steps the command will not do for you. - pub(crate) manual_steps: Vec, -} - -#[derive(Debug, Serialize)] -pub(crate) struct DroppedKey { - pub(crate) key: String, - pub(crate) reason: String, -} - -/// Classify a parsed legacy config into the report. Pure — no I/O. 
-pub(crate) fn build_report(config: &OmnigraphConfig, source: &Path) -> MigrateReport { - let mut dropped = Vec::new(); - let mut manual_steps = Vec::new(); - let mut operator_merge: BTreeMap = BTreeMap::new(); - - // ---- personal half ---- - if let Some(actor) = &config.cli.actor { - operator_merge.insert("operator.actor".into(), actor.clone()); - } - if let Some(format) = config.cli.output_format { - operator_merge.insert( - "defaults.output".into(), - serde_yaml::to_string(&format).unwrap_or_default().trim().to_string(), - ); - } - if let Some(width) = config.cli.table_max_column_width { - operator_merge.insert("defaults.table_max_column_width".into(), width.to_string()); - } - if let Some(layout) = config.cli.table_cell_layout { - operator_merge.insert( - "defaults.table_cell_layout".into(), - serde_yaml::to_string(&layout).unwrap_or_default().trim().to_string(), - ); - } - if config.cli.graph.is_some() { - dropped.push(DroppedKey { - key: "cli.graph".into(), - reason: "address graphs explicitly via --store/--server, or set defaults.default_graph in the operator config".into(), - }); - } - if config.cli.branch.is_some() { - dropped.push(DroppedKey { - key: "cli.branch".into(), - reason: "pass --branch explicitly".into(), - }); - } - - // Remote graphs with a token env become operator servers (the keyed - // chain replaces invented env-var names). 
- for (name, target) in &config.graphs { - if target.uri.starts_with("http://") || target.uri.starts_with("https://") { - operator_merge.insert(format!("servers.{name}.url"), target.uri.clone()); - if target.bearer_token_env.is_some() { - manual_steps.push(format!( - "store the '{name}' token in the keyed chain: echo $TOKEN | omnigraph login {name} (replaces bearer_token_env)" - )); - } - } - } - if config.auth.env_file.is_some() { - manual_steps.push( - "auth.env_file keeps working during the window; prefer `omnigraph login ` per server going forward".into(), - ); - } - - // Legacy aliases split: content -> catalog stored query, binding -> - // operator alias referencing the name. - for (name, alias) in &config.aliases { - let query_name = alias.name.clone().unwrap_or_else(|| name.clone()); - operator_merge.insert( - format!("aliases.{name}"), - format!( - "{{ server: TODO-server-name, graph: {}, query: {query_name}, args: [{}] }}", - alias.graph.as_deref().unwrap_or("TODO-graph-id"), - alias.args.join(", ") - ), - ); - manual_steps.push(format!( - "alias '{name}': move its query content ('{}') into the cluster checkout's queries/ so '{query_name}' becomes a catalog stored query", - alias.query - )); - } - - // ---- team half ---- - let has_team_content = !config.graphs.is_empty() - || !config.queries.is_empty() - || config.policy.file.is_some() - || config.server.policy.file.is_some(); - let cluster_yaml = has_team_content.then(|| { - let mut out = String::from("version: 1\n"); - if let Some(name) = &config.project.name { - out.push_str(&format!("metadata:\n name: {name}\n")); - } - out.push_str("# storage: s3://bucket/prefix # or omit: this folder is the root\n"); - if !config.graphs.is_empty() || !config.queries.is_empty() { - out.push_str("graphs:\n"); - } - // Single-graph top-level queries belong to a graph the legacy file - // never named; propose one. 
- if !config.queries.is_empty() && config.graphs.is_empty() { - out.push_str(" default: # TODO: pick the graph id\n schema: # TODO: path to this graph's .pg schema\n queries: queries/\n"); - } - for (name, target) in &config.graphs { - out.push_str(&format!(" {name}:\n")); - out.push_str(" schema: # TODO: path to this graph's .pg schema\n"); - if !target.queries.is_empty() { - out.push_str(" queries: queries/ # move the .gq files here\n"); - } - out.push_str(&format!( - " # legacy root: {} — the cluster manages graph roots under its storage; run `omnigraph cluster import` after reviewing\n", - target.uri - )); - } - let mut policies: Vec<(String, String, String)> = Vec::new(); - if let Some(file) = &config.policy.file { - policies.push(("default".into(), file.clone(), "graph. # TODO: bind".into())); - } - if let Some(file) = &config.server.policy.file { - policies.push(("server".into(), file.clone(), "cluster".into())); - } - for (name, target) in &config.graphs { - if let Some(file) = &target.policy.file { - policies.push((name.clone(), file.clone(), format!("graph.{name}"))); - } - } - if !policies.is_empty() { - out.push_str("policies:\n"); - for (name, file, binding) in policies { - out.push_str(&format!( - " {name}:\n file: {file}\n applies_to: [{binding}]\n" - )); - } - } - out - }); - - if !config.query.roots.is_empty() { - dropped.push(DroppedKey { - key: "query.roots".into(), - reason: "obsolete — cluster query discovery (queries: ) replaced it".into(), - }); - } - if config.server.bind.is_some() || config.server.graph.is_some() { - dropped.push(DroppedKey { - key: "server.bind / server.graph".into(), - reason: "deployment runtime — pass --bind / target flags or env".into(), - }); - } - if config.project.name.is_some() && cluster_yaml.is_none() { - dropped.push(DroppedKey { - key: "project.name".into(), - reason: "the cluster's metadata.name is the deployment label".into(), - }); - } - - MigrateReport { - source: source.display().to_string(), - 
cluster_yaml, - operator_merge, - dropped, - manual_steps, - } -} - -pub(crate) fn render_report(report: &MigrateReport) -> String { - let mut out = format!("migration plan for {}\n", report.source); - if let Some(cluster) = &report.cluster_yaml { - out.push_str("\n== team half -> cluster.yaml (ready to review) ==\n"); - out.push_str(cluster); - } - if !report.operator_merge.is_empty() { - out.push_str("\n== personal half -> ~/.omnigraph/config.yaml ==\n"); - for (key, value) in &report.operator_merge { - out.push_str(&format!(" {key}: {value}\n")); - } - } - if !report.dropped.is_empty() { - out.push_str("\n== no destination ==\n"); - for dropped in &report.dropped { - out.push_str(&format!(" {} — {}\n", dropped.key, dropped.reason)); - } - } - if !report.manual_steps.is_empty() { - out.push_str("\n== manual steps ==\n"); - for step in &report.manual_steps { - out.push_str(&format!(" - {step}\n")); - } - } - out.push_str("\n(nothing written; pass --write to apply the operator merge and emit cluster.yaml)\n"); - out -} - -/// `--write`: merge the personal half into the operator config (key-level, -/// existing entries always win; the prior file is backed up) and write the -/// team half to cluster.yaml in the legacy config's directory (or -/// cluster.yaml.proposed when one already exists). -pub(crate) fn apply_report(report: &MigrateReport, legacy_dir: &Path) -> Result> { - let mut written = Vec::new(); - - if !report.operator_merge.is_empty() { - let dir = operator::operator_dir() - .ok_or_else(|| eyre!("no home directory resolvable for the operator config"))?; - std::fs::create_dir_all(&dir)?; - let path = dir.join(operator::OPERATOR_CONFIG_FILE); - let existing_text = std::fs::read_to_string(&path).unwrap_or_default(); - let mut mapping: serde_yaml::Mapping = if existing_text.trim().is_empty() { - serde_yaml::Mapping::new() - } else { - serde_yaml::from_str(&existing_text) - .map_err(|err| eyre!("operator config '{}' does not parse: {err}", path.display()))? 
- }; - let mut merged_any = false; - for (dotted, value_text) in &report.operator_merge { - if merge_dotted_if_absent(&mut mapping, dotted, value_text)? { - merged_any = true; - } - } - if merged_any { - if !existing_text.is_empty() { - let backup = path.with_extension("yaml.bak"); - std::fs::write(&backup, &existing_text)?; - written.push(format!("backed up prior operator config to {}", backup.display())); - } - let rendered = serde_yaml::to_string(&mapping)?; - let tmp = path.with_extension(format!("yaml.tmp.{}", std::process::id())); - std::fs::write(&tmp, &rendered)?; - std::fs::rename(&tmp, &path)?; - written.push(format!("merged personal keys into {}", path.display())); - } else { - written.push("operator config already carries every personal key (nothing merged)".into()); - } - } - - if let Some(cluster) = &report.cluster_yaml { - let target = legacy_dir.join("cluster.yaml"); - let target = if target.exists() { - legacy_dir.join("cluster.yaml.proposed") - } else { - target - }; - std::fs::write(&target, cluster)?; - written.push(format!("wrote team-half proposal to {}", target.display())); - } - - Ok(written) -} - -/// Set `a.b.c` in the mapping only when absent; returns whether it wrote. 
-fn merge_dotted_if_absent( - mapping: &mut serde_yaml::Mapping, - dotted: &str, - value_text: &str, -) -> Result { - let value: serde_yaml::Value = - serde_yaml::from_str(value_text).unwrap_or(serde_yaml::Value::String(value_text.into())); - let parts: Vec<&str> = dotted.split('.').collect(); - let mut current = mapping; - for part in &parts[..parts.len() - 1] { - let key = serde_yaml::Value::String((*part).into()); - let entry = current - .entry(key) - .or_insert_with(|| serde_yaml::Value::Mapping(serde_yaml::Mapping::new())); - current = entry - .as_mapping_mut() - .ok_or_else(|| eyre!("operator config key '{dotted}' collides with a non-mapping"))?; - } - let leaf = serde_yaml::Value::String(parts[parts.len() - 1].into()); - if current.contains_key(&leaf) { - return Ok(false); - } - current.insert(leaf, value); - Ok(true) -} - -pub(crate) fn legacy_config_path(explicit: Option<&PathBuf>) -> PathBuf { - explicit.cloned().unwrap_or_else(|| PathBuf::from("omnigraph.yaml")) -} - -#[cfg(test)] -mod tests { - use super::*; - use omnigraph_server::config::load_config; - - fn full_legacy_fixture(dir: &Path) -> PathBuf { - let path = dir.join("omnigraph.yaml"); - std::fs::write( - &path, - r#" -project: { name: brain } -graphs: - prod: - uri: https://graph.example.com - bearer_token_env: PROD_TOKEN - policy: { file: ./prod.policy.yaml } - queries: - find: { file: ./find.gq } - local: - uri: /tmp/local.omni -server: { bind: "0.0.0.0:9999", policy: { file: ./server.policy.yaml } } -auth: { env_file: .env.omni } -cli: - graph: prod - branch: main - actor: act-me - output_format: json - table_max_column_width: 40 -query: { roots: ["."] } -aliases: - triage: { command: query, query: ./triage.gq, name: weekly_triage, args: [since], graph: prod } -policy: { file: ./top.policy.yaml } -queries: - top_q: { file: ./top.gq } -"#, - ) - .unwrap(); - path - } - - /// The RFC-008 completeness contract: every top-level key of the - /// legacy schema must appear in the report somewhere 
(team half, - /// operator merge, dropped, or manual steps). - #[test] - fn every_legacy_key_is_classified() { - let dir = tempfile::tempdir().unwrap(); - let path = full_legacy_fixture(dir.path()); - let config = load_config(Some(&path)).unwrap(); - let report = build_report(&config, &path); - let rendered = render_report(&report); - - let serialized = - serde_yaml::to_value(OmnigraphConfig::default()).expect("default serializes"); - for key in serialized.as_mapping().unwrap().keys() { - let key = key.as_str().unwrap(); - assert!( - rendered.contains(key) - || report.operator_merge.keys().any(|k| k.contains(key)) - || matches!(key, "graphs" | "queries" | "policy" | "project") - && report.cluster_yaml.is_some(), - "legacy key '{key}' is unclassified — fix the RFC-008 map: {rendered}" - ); - } - - // spot checks on each section - assert_eq!(report.operator_merge["operator.actor"], "act-me"); - assert_eq!(report.operator_merge["defaults.output"], "json"); - assert_eq!( - report.operator_merge["servers.prod.url"], - "https://graph.example.com" - ); - assert!(report.operator_merge["aliases.triage"].contains("query: weekly_triage")); - let cluster = report.cluster_yaml.as_deref().unwrap(); - assert!(cluster.contains("version: 1")); - assert!(cluster.contains("name: brain")); - assert!(cluster.contains(" prod:")); - assert!(cluster.contains("applies_to: [cluster]")); - assert!(cluster.contains("applies_to: [graph.prod]")); - assert!(report.dropped.iter().any(|d| d.key == "query.roots")); - assert!(report.dropped.iter().any(|d| d.key.contains("server.bind"))); - assert!( - report - .manual_steps - .iter() - .any(|s| s.contains("omnigraph login prod")) - ); - } - - #[test] - fn merge_dotted_never_clobbers_existing() { - let mut mapping: serde_yaml::Mapping = - serde_yaml::from_str("operator:\n actor: keep-me\n").unwrap(); - assert!(!merge_dotted_if_absent(&mut mapping, "operator.actor", "new").unwrap()); - assert!(merge_dotted_if_absent(&mut mapping, "defaults.output", 
"json").unwrap()); - let text = serde_yaml::to_string(&mapping).unwrap(); - assert!(text.contains("keep-me") && !text.contains("new")); - assert!(text.contains("output: json")); - } -} diff --git a/crates/omnigraph-cli/src/operator.rs b/crates/omnigraph-cli/src/operator.rs index e48af50..dbfe781 100644 --- a/crates/omnigraph-cli/src/operator.rs +++ b/crates/omnigraph-cli/src/operator.rs @@ -18,10 +18,10 @@ use std::env; use std::path::{Path, PathBuf}; use color_eyre::Result; -use color_eyre::eyre::eyre; +use color_eyre::eyre::{bail, eyre}; use serde::Deserialize; -use omnigraph_server::config::ReadOutputFormat; +use crate::read_format::{ReadOutputFormat, TableCellLayout}; pub(crate) const OPERATOR_HOME_ENV: &str = "OMNIGRAPH_HOME"; pub(crate) const OPERATOR_DIR: &str = ".omnigraph"; @@ -91,8 +91,7 @@ pub(crate) struct OperatorServer { #[derive(Debug, Default, Deserialize)] pub(crate) struct OperatorIdentity { /// Default actor for every `--as` cascade (CLI direct-engine writes and - /// cluster commands alike): `--as` > legacy config actor (RFC-008 - /// window) > this > none. + /// cluster commands alike): `--as` > this > none. pub(crate) actor: Option, #[serde(flatten)] unknown: serde_yaml::Mapping, @@ -102,14 +101,19 @@ pub(crate) struct OperatorIdentity { pub(crate) struct OperatorDefaults { /// Default read output format, below every more-specific source. pub(crate) output: Option, - /// Table rendering preferences (below the legacy cli.table_* keys - /// during the RFC-008 window). + /// Table rendering preferences for `--format table`. pub(crate) table_max_column_width: Option, - pub(crate) table_cell_layout: Option, + pub(crate) table_cell_layout: Option, /// Default server scope (RFC-011): the everyday addressing when no /// `--profile` / primitive / legacy address is given. Names an entry - /// under `servers:`. + /// under `servers:`. Mutually exclusive with `store` — a scope binds one + /// entity. 
pub(crate) server: Option, + /// Default **store** scope (RFC-011): a `file://` / `s3://` graph storage + /// URI used as the zero-flag local default for graph commands when no + /// `--profile` / primitive address is given. The local-dev counterpart of + /// `server`; mutually exclusive with it. + pub(crate) store: Option, /// Default graph selected within a server/cluster scope when no /// `--graph` is passed (RFC-011). pub(crate) default_graph: Option, @@ -202,10 +206,36 @@ impl OperatorConfig { self.defaults.server.as_deref() } + /// The flat-default store scope URI, if set (RFC-011) — the zero-flag + /// local-dev default. + pub(crate) fn default_store(&self) -> Option<&str> { + self.defaults.store.as_deref() + } + /// The flat-default graph within a server/cluster scope, if set (RFC-011). pub(crate) fn default_graph(&self) -> Option<&str> { self.defaults.default_graph.as_deref() } + + /// A scope binds one entity (Decision 6): `defaults.server` and + /// `defaults.store` are mutually exclusive, and a `store` (already a single + /// graph) cannot carry a `default_graph`. Both are refused loudly rather + /// than silently dropped. 
+ fn validate_defaults(&self) -> Result<()> { + if self.defaults.server.is_some() && self.defaults.store.is_some() { + bail!( + "operator config `defaults` sets both `server` and `store` — a default scope \ + binds one entity; keep one (use a `profile` if you need both)" + ); + } + if self.defaults.store.is_some() && self.defaults.default_graph.is_some() { + bail!( + "operator config `defaults` sets both `store` and `default_graph` — a store is \ + already a single graph; drop `default_graph` (it applies only to a server/cluster scope)" + ); + } + Ok(()) + } } impl OperatorProfile { @@ -282,6 +312,7 @@ pub(crate) fn load_operator_config_at(path: &Path) -> Result { for warning in config.unknown_key_warnings() { eprintln!("warning: {warning} in operator config '{}'", path.display()); } + config.validate_defaults()?; Ok(config) } @@ -560,6 +591,42 @@ mod tests { assert_eq!(config.output(), Some(ReadOutputFormat::Json)); } + #[test] + fn defaults_store_parses_and_is_accessible() { + let dir = tempfile::tempdir().unwrap(); + let path = dir.path().join("config.yaml"); + fs::write(&path, "defaults:\n store: file:///tmp/dev.omni\n").unwrap(); + let config = load_operator_config_at(&path).unwrap(); + assert_eq!(config.default_store(), Some("file:///tmp/dev.omni")); + assert_eq!(config.default_server(), None); + } + + #[test] + fn defaults_server_and_store_together_is_a_loud_error() { + let dir = tempfile::tempdir().unwrap(); + let path = dir.path().join("config.yaml"); + fs::write( + &path, + "defaults:\n server: prod\n store: file:///tmp/dev.omni\n", + ) + .unwrap(); + let err = load_operator_config_at(&path).unwrap_err().to_string(); + assert!(err.contains("binds one entity"), "{err}"); + } + + #[test] + fn defaults_store_with_default_graph_is_a_loud_error() { + let dir = tempfile::tempdir().unwrap(); + let path = dir.path().join("config.yaml"); + fs::write( + &path, + "defaults:\n store: file:///tmp/dev.omni\n default_graph: knowledge\n", + ) + .unwrap(); + let err = 
load_operator_config_at(&path).unwrap_err().to_string(); + assert!(err.contains("already a single graph"), "{err}"); + } + #[test] fn unknown_keys_warn_but_load() { // A file written for a later slice (servers/aliases) must load diff --git a/crates/omnigraph-cli/src/output.rs b/crates/omnigraph-cli/src/output.rs index 8352426..d6903f4 100644 --- a/crates/omnigraph-cli/src/output.rs +++ b/crates/omnigraph-cli/src/output.rs @@ -749,15 +749,10 @@ pub(crate) fn print_snapshot_human(branch: &str, manifest_version: u64, entries: pub(crate) fn print_read_output( output: &ReadOutput, format: ReadOutputFormat, - config: &OmnigraphConfig, ) -> Result<()> { println!( "{}", - render_read( - output, - format, - &resolve_table_render_options(config), - )? + render_read(output, format, &resolve_table_render_options())? ); Ok(()) } @@ -907,21 +902,87 @@ pub(crate) fn finish_logout( Ok(()) } -/// Table prefs cascade (RFC-007/008): legacy cli.table_* (window) > -/// operator defaults.table_* > built-in. -pub(crate) fn resolve_table_render_options(config: &OmnigraphConfig) -> ReadRenderOptions { +#[derive(Debug, Serialize)] +pub(crate) struct ProfileListItem { + pub(crate) name: String, + /// `server: ` / `cluster: ` / `store: ` / `invalid: `. + pub(crate) binding: String, + /// `server` | `cluster` | `store` | `invalid`. + pub(crate) scope_kind: String, + /// The bound server/cluster name, or the store URI. `None` when invalid. + pub(crate) target: Option, + pub(crate) valid: bool, + pub(crate) error: Option, + pub(crate) default_graph: Option, + pub(crate) active: bool, +} + +#[derive(Debug, Serialize)] +pub(crate) struct ProfileDetail { + /// Profile name, or `(defaults)` for the no-name flat-defaults view. + pub(crate) name: String, + /// `server` | `cluster` | `store` | `none`. + pub(crate) scope_kind: String, + /// The bound server/cluster name, or the store URI. 
+ pub(crate) target: Option, + /// Resolved endpoint: a server's URL / a cluster's root / the store URI; + /// `None` if a named server/cluster isn't defined in this config. + pub(crate) endpoint: Option, + pub(crate) default_graph: Option, + pub(crate) output_format: Option, +} + +pub(crate) fn print_profile_list(items: &[ProfileListItem], json: bool) -> Result<()> { + if json { + return print_json(&items); + } + if items.is_empty() { + println!("no profiles defined in the operator config"); + return Ok(()); + } + for item in items { + let active = if item.active { " (active)" } else { "" }; + let graph = item + .default_graph + .as_deref() + .map(|g| format!(" · graph: {g}")) + .unwrap_or_default(); + println!("{}{active} {}{graph}", item.name, item.binding); + } + Ok(()) +} + +pub(crate) fn print_profile_detail(detail: &ProfileDetail, json: bool) -> Result<()> { + if json { + return print_json(detail); + } + println!("profile: {}", detail.name); + let target = detail + .target + .as_deref() + .map(|t| format!(" {t}")) + .unwrap_or_default(); + println!(" scope: {}{target}", detail.scope_kind); + if let Some(endpoint) = &detail.endpoint { + println!(" endpoint: {endpoint}"); + } else if matches!(detail.scope_kind.as_str(), "server" | "cluster") { + println!(" endpoint: (undefined — name not in this config)"); + } + if let Some(graph) = &detail.default_graph { + println!(" default graph: {graph}"); + } + if let Some(format) = &detail.output_format { + println!(" output: {format}"); + } + Ok(()) +} + +/// Table prefs cascade (RFC-011): operator defaults.table_* > built-in. 
+pub(crate) fn resolve_table_render_options() -> ReadRenderOptions { let operator = crate::operator::load_operator_config().unwrap_or_default(); ReadRenderOptions { - max_column_width: config - .cli - .table_max_column_width - .or(operator.defaults.table_max_column_width) - .unwrap_or(80), - cell_layout: config - .cli - .table_cell_layout - .or(operator.defaults.table_cell_layout) - .unwrap_or_default(), + max_column_width: operator.defaults.table_max_column_width.unwrap_or(80), + cell_layout: operator.defaults.table_cell_layout.unwrap_or_default(), } } diff --git a/crates/omnigraph-cli/src/planes.rs b/crates/omnigraph-cli/src/planes.rs index c289daa..b599076 100644 --- a/crates/omnigraph-cli/src/planes.rs +++ b/crates/omnigraph-cli/src/planes.rs @@ -82,9 +82,7 @@ impl Capability { /// classifier) plus the one Data→Served refinement: `graphs` is remote-only. /// /// This reflects *current enforced behavior*, so messages stay truthful: -/// `queries list` is `Local` (reads config today) and `queries validate` is -/// `Direct` (opens a graph directly today). Both converge to the RFC end-state -/// (served / control) only when later slices re-route them. +/// `queries`/`policy` read a cluster's applied state (`Control`). pub(crate) fn command_capability(cmd: &Command) -> Capability { if let Command::Graphs { .. } = cmd { return Capability::Served; @@ -100,8 +98,7 @@ pub(crate) fn command_capability(cmd: &Command) -> Capability { /// The plane a subcommand belongs to. Exhaustive — a new `Command` variant /// will not compile until classified. Descends into the nested enums where /// the plane differs per subcommand (`schema plan` is storage while `schema -/// show`/`apply` are data; `queries validate` opens the graph while `queries -/// list` only reads config). +/// show`/`apply` are data; `queries`/`policy` read cluster applied state). pub(crate) fn command_plane(cmd: &Command) -> Plane { match cmd { Command::Query { .. 
} @@ -119,23 +116,22 @@ pub(crate) fn command_plane(cmd: &Command) -> Plane { Command::Schema { command: SchemaCommand::Plan { .. }, } => Plane::Storage, - Command::Queries { - command: QueriesCommand::Validate { .. }, - } => Plane::Storage, - Command::Queries { - command: QueriesCommand::List { .. }, - } => Plane::Session, + // `queries` and `policy` tooling now source their inputs from a + // cluster's applied state (`--cluster`), so they live on the control + // plane (RFC-011 — omnigraph.yaml excised from the CLI). + Command::Queries { .. } => Plane::Control, + Command::Policy { .. } => Plane::Control, Command::Init { .. } | Command::Optimize { .. } | Command::Repair { .. } | Command::Cleanup { .. } | Command::Lint { .. } => Plane::Storage, Command::Cluster { .. } => Plane::Control, - Command::Policy { .. } + Command::Alias { .. } | Command::Embed(_) | Command::Login { .. } | Command::Logout { .. } - | Command::Config { .. } + | Command::Profile { .. } | Command::Version => Plane::Session, } } @@ -147,7 +143,7 @@ pub(crate) fn command_label(cmd: &Command) -> &'static str { Command::Version => "version", Command::Login { .. } => "login", Command::Logout { .. } => "logout", - Command::Config { .. } => "config", + Command::Profile { .. } => "profile", Command::Embed(_) => "embed", Command::Init { .. } => "init", Command::Load { .. } => "load", @@ -168,6 +164,7 @@ pub(crate) fn command_label(cmd: &Command) -> &'static str { Command::Commit { .. } => "commit", Command::Query { .. } => "query", Command::Mutate { .. } => "mutate", + Command::Alias { .. } => "alias", Command::Policy { .. } => "policy", Command::Optimize { .. } => "optimize", Command::Repair { .. } => "repair", @@ -177,35 +174,128 @@ pub(crate) fn command_label(cmd: &Command) -> &'static str { } } -/// Reject the data-plane addressing flags (`--server`/`--graph`) on any verb -/// that does not live on the data plane. This replaces the old silent-ignore -/// — e.g. 
`optimize --server prod` previously dropped `--server` and tried to -/// resolve a default target, failing (if at all) with an unrelated message. -/// Now it fails with one honest, declared error. RFC-010 Slice 1. +/// The verbs that consume a cluster scope. Maintenance/lint select a graph with +/// `--cluster --graph `; policy/queries inspect the cluster's +/// applied control-plane state and may optionally use `--graph` to select one +/// bundle/registry. `init` is storage-plane too but *creates* a graph (cluster +/// graphs are born from `cluster apply`, not `init`), and `schema plan` takes a +/// positional URI, so the guard rejects `--cluster`/`--graph` there rather than +/// silently dropping the flag. +pub(crate) fn accepts_cluster_addressing(cmd: &Command) -> bool { + matches!( + cmd, + Command::Optimize { .. } + | Command::Repair { .. } + | Command::Cleanup { .. } + // `lint` can type-check a `.gq` against a cluster graph's schema + // (RFC-011): `--cluster --graph `. + | Command::Lint { .. } + // The policy/queries tooling addresses a cluster's applied state + // (RFC-011): `--cluster ` selects the cluster, `--graph ` + // picks a graph's bundle/registry within it. + | Command::Policy { .. } + | Command::Queries { .. } + ) +} + +/// Reject a scope-addressing flag (`--server`/`--cluster`/`--graph`) on a verb +/// that cannot consume it, rather than silently dropping it (the old behavior: +/// e.g. `optimize --server prod` dropped `--server` and failed later with an +/// unrelated message). `alias` gets an extra guard because its binding owns all +/// addressing and several ignored globals sit outside this three-flag guard. +/// Each flag has a distinct valid surface: +/// - `--server` → served-graph scopes (`any`/`served`); +/// - `--cluster` → cluster-scoped direct/control verbs; +/// - `--graph` → any multi-graph scope: a served scope *or* a cluster one. +/// RFC-010 Slice 1, generalized for RFC-011 cluster addressing. 
pub(crate) fn guard_addressing(cli: &Cli) -> Result<()> { - if cli.server.is_none() && cli.graph.is_none() { + if let Command::Alias { .. } = &cli.command { + let mut flags = Vec::new(); + if cli.server.is_some() { + flags.push("--server"); + } + if cli.graph.is_some() { + flags.push("--graph"); + } + if cli.store.is_some() { + flags.push("--store"); + } + if cli.cluster.is_some() { + flags.push("--cluster"); + } + if cli.profile.is_some() { + flags.push("--profile"); + } + if cli.as_actor.is_some() { + flags.push("--as"); + } + if !flags.is_empty() { + bail!( + "`alias` uses the server, graph, and stored query declared in \ + `aliases.` in ~/.omnigraph/config.yaml; remove global scope \ + flag(s): {}", + flags.join(", ") + ); + } + } + if cli.server.is_none() && cli.cluster.is_none() && cli.graph.is_none() { return Ok(()); } let capability = command_capability(&cli.command); - if capability.accepts_server_addressing() { - return Ok(()); - } let label = command_label(&cli.command); - let how = match capability { - Capability::Direct => match cli.command { - Command::Init { .. 
} => "Pass a storage URI.", - _ => "Pass a storage URI, or --cluster --cluster-graph .", + let cluster_ok = accepts_cluster_addressing(&cli.command); + + if cli.server.is_some() && !capability.accepts_server_addressing() { + bail!( + "`{label}` is a {} command; --server addresses a served graph and does not apply.{}", + capability.describe(), + remediation(capability, &cli.command), + ); + } + if cli.cluster.is_some() && !cluster_ok { + bail!( + "`{label}` is a {} command; --cluster addresses a cluster-scoped command \ + and does not apply.{}", + capability.describe(), + remediation(capability, &cli.command), + ); + } + if cli.graph.is_some() && !(capability.accepts_server_addressing() || cluster_ok) { + bail!( + "`{label}` is a {} command; --graph selects a graph within a server or cluster \ + scope and does not apply.{}", + capability.describe(), + remediation(capability, &cli.command), + ); + } + Ok(()) +} + +/// The "what to do instead" tail for a wrong-address error, by capability. +/// Includes its own leading space when non-empty so the caller appends it +/// directly — an empty tail (the served-addressing capabilities, which only +/// reach this fn for a misplaced `--cluster`/`--graph`) leaves no trailing space. +fn remediation(capability: Capability, cmd: &Command) -> &'static str { + match capability { + Capability::Direct => match cmd { + Command::Init { .. } => " Pass a storage URI.", + Command::Optimize { .. } | Command::Repair { .. } | Command::Cleanup { .. } => { + " Pass a storage URI, or --cluster --graph ." + } + _ => " Pass a storage URI.", }, - Capability::Control => "It operates on a cluster (pass --config ).", - Capability::Local => "It does not address a graph.", - Capability::Any | Capability::Served => { - unreachable!("served-addressing capabilities returned early") - } - }; - bail!( - "`{label}` is a {} command; --server/--graph address a served graph and do not apply. 
{how}", - capability.describe() - ); + Capability::Control => match cmd { + Command::Cluster { .. } => { + " It operates on a cluster config directory (pass --config )." + } + Command::Policy { .. } | Command::Queries { .. } => { + " It operates on a cluster (pass --cluster , or select a cluster profile)." + } + _ => " It operates on a cluster.", + }, + Capability::Local => " It does not address a graph.", + Capability::Any | Capability::Served => "", + } } #[cfg(test)] @@ -235,11 +325,17 @@ mod tests { // The one Data→Served refinement — if the `graphs` guard were deleted, // every other assertion here would still pass. assert_eq!(cap(&["omnigraph", "graphs", "list"]), Capability::Served); + assert_eq!(cap(&["omnigraph", "alias", "who"]), Capability::Local); assert_eq!(cap(&["omnigraph", "optimize", "graph.omni"]), Capability::Direct); assert_eq!(cap(&["omnigraph", "schema", "plan", "--schema", "s.pg", "graph.omni"]), Capability::Direct); assert_eq!(cap(&["omnigraph", "cluster", "status", "--config", "."]), Capability::Control); assert_eq!(cap(&["omnigraph", "version"]), Capability::Local); - assert_eq!(cap(&["omnigraph", "queries", "list"]), Capability::Local); + // `queries`/`policy` tooling reads cluster state now (control plane). + assert_eq!(cap(&["omnigraph", "queries", "list"]), Capability::Control); + assert_eq!( + cap(&["omnigraph", "policy", "validate"]), + Capability::Control + ); } #[test] diff --git a/crates/omnigraph-cli/src/read_format.rs b/crates/omnigraph-cli/src/read_format.rs index b205b19..3ffa9e6 100644 --- a/crates/omnigraph-cli/src/read_format.rs +++ b/crates/omnigraph-cli/src/read_format.rs @@ -1,9 +1,31 @@ +use clap::ValueEnum; use color_eyre::eyre::Result; -use omnigraph_server::ReadOutputFormat; use omnigraph_server::api::ReadOutput; -use omnigraph_server::config::TableCellLayout; +use serde::{Deserialize, Serialize}; use serde_json::{Map, Value}; +/// Output rendering format for read-shaped commands (`read`/`query`/`alias`). 
+/// A CLI presentation concern — lives here, not in the server. +#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)] +#[serde(rename_all = "snake_case")] +pub enum ReadOutputFormat { + #[default] + Table, + Kv, + Csv, + Jsonl, + Json, +} + +/// How an over-wide table cell is laid out when rendering `--format table`. +#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)] +#[serde(rename_all = "snake_case")] +pub enum TableCellLayout { + #[default] + Truncate, + Wrap, +} + pub struct ReadRenderOptions { pub max_column_width: usize, pub cell_layout: TableCellLayout, diff --git a/crates/omnigraph-cli/src/scope.rs b/crates/omnigraph-cli/src/scope.rs index 4349231..257907d 100644 --- a/crates/omnigraph-cli/src/scope.rs +++ b/crates/omnigraph-cli/src/scope.rs @@ -42,6 +42,7 @@ pub(crate) struct ScopeFlags<'a> { pub(crate) profile: Option<&'a str>, pub(crate) store: Option<&'a str>, pub(crate) server: Option<&'a str>, + pub(crate) cluster: Option<&'a str>, pub(crate) graph: Option<&'a str>, pub(crate) uri: Option, } @@ -56,17 +57,49 @@ pub(crate) fn resolve_scope( capability: Capability, flags: ScopeFlags<'_>, ) -> Result { - // `--store` is its own way to address a graph; combining it with a positional - // URI or `--server` is a contradiction, not a silent precedence. - if flags.store.is_some() && (flags.uri.is_some() || flags.server.is_some()) { + // At most one explicit scope primitive may address a command — a positional + // URI, `--store`, `--server`, or `--cluster` are mutually exclusive ways to + // name the graph. Combining them is a contradiction, not a silent precedence. 
+ let primitives: Vec<&str> = [ + flags.uri.as_deref().map(|_| "a positional URI"), + flags.store.map(|_| "--store"), + flags.server.map(|_| "--server"), + flags.cluster.map(|_| "--cluster"), + ] + .into_iter() + .flatten() + .collect(); + if primitives.len() > 1 { bail!( - "--store is exclusive with a positional URI and --server — pick one way to \ - address the graph" + "{} are mutually exclusive — pick one way to address the graph", + primitives.join(" and ") ); } - // 1. Any explicit address wins; reproduce today's behavior untouched. - // `--store` is an explicit store URI — fold it into `uri`. + + // 1a. `--cluster` is the cluster scope primitive (maintenance): resolve its + // root + select the graph with `--graph`. + if let Some(cluster) = flags.cluster { + return scope_from_binding( + op, + capability, + ScopeBinding::Cluster(cluster.to_string()), + flags.graph.map(str::to_string), + "--cluster", + ); + } + + // 1b. Any other explicit address wins; reproduce today's behavior untouched. + // `--store` is an explicit store URI — fold it into `uri`. if flags.uri.is_some() || flags.server.is_some() || flags.store.is_some() { + // `--graph` selects within a multi-graph scope; a bare positional URI / + // `--store` is already a single graph, so a stray `--graph` is an error + // rather than a silently-dropped flag. + if flags.graph.is_some() && flags.server.is_none() { + bail!( + "--graph selects a graph within a server or cluster scope; a positional \ + URI / --store is already a single graph" + ); + } return Ok(ResolvedScope { server: flags.server.map(str::to_string), graph: flags.graph.map(str::to_string), @@ -107,6 +140,18 @@ pub(crate) fn resolve_scope( ); } + // 3b. Flat default store scope — the zero-flag local-dev default (RFC-011). + // Mutually exclusive with `defaults.server` (enforced at config load). 
+ if let Some(store) = op.default_store() { + return scope_from_binding( + op, + capability, + ScopeBinding::Store(store.to_string()), + flags.graph.map(str::to_string), + "operator defaults", + ); + } + // 4. Nothing resolved — leave the tuple empty; downstream falls through to // today's behavior (legacy `cli.graph` default or a no-address error). Ok(ResolvedScope::default()) @@ -128,8 +173,8 @@ fn scope_from_binding( if capability == Capability::Direct { bail!( "this command needs direct storage access, but {source} resolves a \ - server scope; name storage explicitly with --store (or a \ - --cluster/--cluster-graph for a managed graph)" + server scope; name storage explicitly with --store (or \ + --cluster --graph for a managed graph)" ); } Ok(ResolvedScope { @@ -141,23 +186,25 @@ fn scope_from_binding( ScopeBinding::Cluster(cluster) => { if capability == Capability::Any { bail!( - "{source} resolves a cluster scope, which is maintenance-only; run \ - data commands through a server, or use --store for ad-hoc \ - direct access" + "{source} resolves a cluster scope, which is not valid for graph data \ + commands; run data commands through a server, or use --store \ + for ad-hoc direct access" ); } - // A cluster binding is a config name (resolved against `clusters:`) - // or a literal root URI. - let root = if let Some(root) = op.cluster_root(&cluster) { - root.to_string() - } else if cluster.contains("://") { - cluster - } else { - bail!( - "unknown cluster '{cluster}' ({source}); define it under `clusters:` \ - in operator config, or use a literal root URI" - ); - }; + // A cluster value is a config name (resolved against `clusters:`) + // or a literal root: an `s3://`/`file://` URI or a local cluster + // directory. Only a configured name is rewritten; anything else is + // passed through to the cluster-state resolver verbatim, so a bare + // directory path keeps working as it did for per-command `--cluster`. 
+ let root = op + .cluster_root(&cluster) + .map(str::to_string) + .unwrap_or(cluster); + // A cluster holds many graphs; maintenance addresses one at a time. + // When no `--graph`/`default_graph` is given, leave `cluster_graph` + // empty and defer to the async storage-URI resolver (RFC-011 D7), + // which enumerates the catalog: auto-use a sole graph, else error + // and list the candidates. Ok(ResolvedScope { cluster: Some(root), cluster_graph: graph, @@ -192,6 +239,7 @@ mod tests { profile: None, store: None, server: None, + cluster: None, graph: None, uri: None, } @@ -230,7 +278,7 @@ mod tests { } #[test] - fn store_is_exclusive_with_positional_uri_and_server() { + fn scope_primitives_are_mutually_exclusive() { let op = OperatorConfig::default(); for flags in [ ScopeFlags { @@ -243,12 +291,128 @@ mod tests { server: Some("prod"), ..flags() }, + ScopeFlags { + cluster: Some("./brain"), + uri: Some("file://other.omni".into()), + ..flags() + }, + ScopeFlags { + cluster: Some("./brain"), + server: Some("prod"), + ..flags() + }, ] { - let err = resolve_scope(&op, Capability::Any, flags).unwrap_err().to_string(); - assert!(err.contains("--store is exclusive"), "{err}"); + let err = resolve_scope(&op, Capability::Direct, flags) + .unwrap_err() + .to_string(); + assert!(err.contains("mutually exclusive"), "{err}"); } } + #[test] + fn cluster_flag_resolves_root_and_graph_for_maintenance() { + let op = cfg("clusters:\n brain:\n root: s3://acme/brain\n"); + let scope = resolve_scope( + &op, + Capability::Direct, + ScopeFlags { + cluster: Some("brain"), + graph: Some("knowledge"), + ..flags() + }, + ) + .unwrap(); + assert_eq!(scope.cluster.as_deref(), Some("s3://acme/brain")); + assert_eq!(scope.cluster_graph.as_deref(), Some("knowledge")); + } + + #[test] + fn cluster_flag_accepts_a_literal_root_uri() { + let op = OperatorConfig::default(); + let scope = resolve_scope( + &op, + Capability::Direct, + ScopeFlags { + cluster: Some("s3://bucket/clusters/brain"), + graph: 
Some("knowledge"), + ..flags() + }, + ) + .unwrap(); + assert_eq!(scope.cluster.as_deref(), Some("s3://bucket/clusters/brain")); + assert_eq!(scope.cluster_graph.as_deref(), Some("knowledge")); + } + + #[test] + fn cluster_scope_without_a_graph_defers_to_catalog_enumeration() { + // RFC-011 D7: with no `--graph`/`default_graph`, resolution no longer + // bails here — it resolves the cluster root and leaves `cluster_graph` + // empty, deferring to the async storage-URI resolver (which enumerates + // the catalog: auto-use a sole graph, else error listing candidates). + let op = cfg("clusters:\n brain:\n root: s3://acme/brain\n"); + let scope = resolve_scope( + &op, + Capability::Direct, + ScopeFlags { + cluster: Some("brain"), + ..flags() + }, + ) + .unwrap(); + assert_eq!(scope.cluster.as_deref(), Some("s3://acme/brain")); + assert_eq!(scope.cluster_graph, None); + } + + #[test] + fn graph_on_a_bare_store_or_uri_is_rejected() { + let op = OperatorConfig::default(); + for flags in [ + ScopeFlags { + uri: Some("graph.omni".into()), + graph: Some("knowledge"), + ..flags() + }, + ScopeFlags { + store: Some("s3://b/g.omni"), + graph: Some("knowledge"), + ..flags() + }, + ] { + let err = resolve_scope(&op, Capability::Any, flags) + .unwrap_err() + .to_string(); + assert!(err.contains("already a single graph"), "{err}"); + } + } + + #[test] + fn flat_default_store_drives_local_verbs() { + // RFC-011: `defaults.store` is the zero-flag local default — no flags, + // no profile → the store URI resolves as the (single-graph) store scope. + let op = cfg("defaults:\n store: file:///tmp/dev.omni\n"); + let scope = resolve_scope(&op, Capability::Any, flags()).unwrap(); + assert_eq!(scope.uri.as_deref(), Some("file:///tmp/dev.omni")); + assert_eq!(scope.server, None); + } + + #[test] + fn flat_default_store_rejects_graph() { + // A store is already a single graph, so `--graph` against a default + // store is a loud error. 
+ let op = cfg("defaults:\n store: file:///tmp/dev.omni\n"); + let err = resolve_scope( + &op, + Capability::Any, + ScopeFlags { + graph: Some("knowledge"), + ..flags() + }, + ) + .unwrap_err() + .to_string(); + assert!(err.contains("does not apply to a store scope"), "{err}"); + } + #[test] fn flat_default_server_drives_data_verbs() { let op = cfg("defaults:\n server: prod\n default_graph: knowledge\nservers:\n prod:\n url: https://x\n"); @@ -294,6 +458,27 @@ mod tests { assert_eq!(scope.cluster_graph.as_deref(), Some("knowledge")); } + #[test] + fn profile_cluster_scope_with_graph_override() { + // The deferral closed by this slice: a `--graph` flag overrides a + // profile cluster's default_graph, exactly as it does for a server scope. + let op = cfg( + "clusters:\n brain:\n root: s3://acme/brain\nprofiles:\n admin:\n cluster: brain\n default_graph: knowledge\n", + ); + let scope = resolve_scope( + &op, + Capability::Direct, + ScopeFlags { + profile: Some("admin"), + graph: Some("archive"), + ..flags() + }, + ) + .unwrap(); + assert_eq!(scope.cluster.as_deref(), Some("s3://acme/brain")); + assert_eq!(scope.cluster_graph.as_deref(), Some("archive")); // flag beats profile default + } + #[test] fn server_scope_on_maintenance_verb_errors() { let op = cfg("defaults:\n server: prod\nservers:\n prod:\n url: https://x\n"); @@ -316,7 +501,7 @@ mod tests { ) .unwrap_err() .to_string(); - assert!(err.contains("maintenance-only"), "{err}"); + assert!(err.contains("not valid for graph data commands"), "{err}"); } #[test] diff --git a/crates/omnigraph-cli/tests/cli_cluster.rs b/crates/omnigraph-cli/tests/cli_cluster.rs index 9205b84..e35a54d 100644 --- a/crates/omnigraph-cli/tests/cli_cluster.rs +++ b/crates/omnigraph-cli/tests/cli_cluster.rs @@ -683,51 +683,8 @@ fn cluster_apply_locked_exits_nonzero() { assert!(!temp.path().join("__cluster/resources").exists()); } -#[test] -fn cluster_apply_uses_cli_actor_from_local_config() { - let temp = tempdir().unwrap(); - 
write_cluster_config_fixture(temp.path()); - fs::write( - temp.path().join("omnigraph.yaml"), - "cli:\n actor: act-local\n", - ) - .unwrap(); - // Phase 1: import once (setup, not under test). - let output = cli() - .current_dir(temp.path()) - .arg("cluster") - .arg("import") - .arg("--config") - .arg(temp.path()) - .output() - .unwrap(); - assert!(output.status.success(), "{output:?}"); - - // Phase 2: apply alone, capturing the echoed actor (idempotent re-runs). - let apply = |extra: &[&str]| { - let mut command = cli(); - command.current_dir(temp.path()); - for arg in extra { - command.arg(arg); - } - let output = command - .arg("cluster") - .arg("apply") - .arg("--config") - .arg(temp.path()) - .arg("--json") - .output() - .unwrap(); - let json: serde_json::Value = - serde_json::from_str(String::from_utf8_lossy(&output.stdout).trim()).unwrap(); - json["actor"].clone() - }; - assert_eq!(apply(&[]), "act-local", "cli.actor is the no-flag default"); - assert_eq!(apply(&["--as", "andrew"]), "andrew", "--as overrides cli.actor"); -} - -/// RFC-007 PR 1: the operator layer joins the actor chain — -/// `--as` > legacy `cli.actor` (RFC-008 window) > `operator.actor` > none. +/// RFC-011: the actor chain is `--as` > `operator.actor` > none. The CLI no +/// longer reads omnigraph.yaml `cli.actor`. #[test] fn cluster_apply_uses_operator_actor_from_omnigraph_home() { let temp = tempdir().unwrap(); @@ -771,41 +728,31 @@ fn cluster_apply_uses_operator_actor_from_omnigraph_home() { json["actor"].clone() }; - // No --as, no omnigraph.yaml: the operator identity applies. + // No --as: the operator identity applies. assert_eq!( apply(&[]), "act-operator", - "operator.actor is the no-flag, no-legacy-config default" + "operator.actor is the no-flag default" ); - // --as still wins over everything. + // --as still wins over the operator layer. assert_eq!(apply(&["--as", "andrew"]), "andrew"); - - // A legacy cli.actor (RFC-008 window) outranks the operator layer. 
- fs::write( - temp.path().join("omnigraph.yaml"), - "cli:\n actor: act-legacy\n", - ) - .unwrap(); - assert_eq!( - apply(&[]), - "act-legacy", - "legacy cli.actor wins over operator.actor during the deprecation window" - ); } #[test] -fn cluster_approve_uses_cli_actor_fallback() { +fn cluster_approve_uses_operator_actor_fallback() { let temp = tempdir().unwrap(); write_cluster_config_fixture(temp.path()); + let operator_home = tempdir().unwrap(); fs::write( - temp.path().join("omnigraph.yaml"), - "cli:\n actor: act-local\n", + operator_home.path().join("config.yaml"), + "operator:\n actor: act-operator\n", ) .unwrap(); // Converge, then remove the graph so a gated delete is pending. for command in ["import", "apply"] { let output = cli() .current_dir(temp.path()) + .env("OMNIGRAPH_HOME", operator_home.path()) .arg("cluster") .arg(command) .arg("--config") @@ -818,6 +765,7 @@ fn cluster_approve_uses_cli_actor_fallback() { let output = cli() .current_dir(temp.path()) + .env("OMNIGRAPH_HOME", operator_home.path()) .arg("cluster") .arg("approve") .arg("graph.knowledge") @@ -829,14 +777,17 @@ fn cluster_approve_uses_cli_actor_fallback() { assert!(output.status.success(), "{output:?}"); let json: serde_json::Value = serde_json::from_str(String::from_utf8_lossy(&output.stdout).trim()).unwrap(); - assert_eq!(json["approved_by"], "act-local"); + assert_eq!(json["approved_by"], "act-operator"); - // With neither flag nor config: refused with the actionable message. + // With neither flag nor operator config: refused with the actionable + // message (an approval without an approver is meaningless). 
let bare = tempdir().unwrap(); write_cluster_config_fixture(bare.path()); + let bare_home = tempdir().unwrap(); let output = output_failure( cli() .current_dir(bare.path()) + .env("OMNIGRAPH_HOME", bare_home.path()) .arg("cluster") .arg("approve") .arg("graph.knowledge") @@ -845,11 +796,13 @@ fn cluster_approve_uses_cli_actor_fallback() { ); let stderr = String::from_utf8_lossy(&output.stderr); assert!(stderr.contains("--as"), "{stderr}"); - assert!(stderr.contains("cli.actor"), "{stderr}"); } #[test] -fn cluster_commands_ignore_malformed_local_config() { +fn cluster_commands_ignore_legacy_omnigraph_yaml() { + // RFC-011: the CLI never reads omnigraph.yaml for cluster commands — a + // present (even malformed) legacy file is inert. The actor falls back to + // `operator.actor`, then to none (no loud failure on absence). let temp = tempdir().unwrap(); write_cluster_config_fixture(temp.path()); fs::write(temp.path().join("omnigraph.yaml"), "{{{{ not yaml").unwrap(); @@ -873,14 +826,11 @@ fn cluster_commands_ignore_malformed_local_config() { "cluster {command} touched omnigraph.yaml" ); } - // import + apply with an explicit --as: the config is never loaded. - for (command, args) in [("import", vec![]), ("apply", vec!["--as", "andrew"])] { - let mut invocation = cli(); - invocation.current_dir(temp.path()); - for arg in &args { - invocation.arg(arg); - } - let output = invocation + // import + apply (no --as, no operator config): the legacy file is never + // loaded and the no-actor apply succeeds (actor defaults to none). + for command in ["import", "apply"] { + let output = cli() + .current_dir(temp.path()) .arg("cluster") .arg(command) .arg("--config") @@ -893,20 +843,6 @@ fn cluster_commands_ignore_malformed_local_config() { String::from_utf8_lossy(&output.stderr) ); } - // Only the no-flag actor lookup is allowed to fail, and loudly. 
- let output = output_failure( - cli() - .current_dir(temp.path()) - .arg("cluster") - .arg("apply") - .arg("--config") - .arg(temp.path()), - ); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!( - stderr.contains("omnigraph.yaml") && stderr.contains("--as"), - "the actor-default config read must fail loudly and actionably: {stderr}" - ); } #[test] @@ -975,7 +911,7 @@ fn optimize_resolves_a_cluster_graph_by_id() { .arg("optimize") .arg("--cluster") .arg(temp.path()) - .arg("--cluster-graph") + .arg("--graph") .arg("knowledge") .arg("--json"), ); @@ -994,7 +930,7 @@ fn optimize_unknown_cluster_graph_id_errors() { .arg("optimize") .arg("--cluster") .arg(temp.path()) - .arg("--cluster-graph") + .arg("--graph") .arg("does-not-exist") .arg("--json"), ); @@ -1006,19 +942,80 @@ fn optimize_unknown_cluster_graph_id_errors() { } #[test] -fn cluster_flag_requires_cluster_graph() { - // clap enforces both-or-neither. +fn optimize_auto_uses_the_sole_cluster_graph() { + // RFC-011 D7: a cluster with exactly one applied graph needs no --graph — + // the resolver enumerates the catalog and uses the only candidate. + let temp = applied_knowledge_cluster(); + let out = output_success( + cli() + .arg("optimize") + .arg("--cluster") + .arg(temp.path()) + .arg("--json"), + ); + assert!( + parse_stdout_json(&out)["tables"].as_array().is_some(), + "optimize should auto-resolve the sole cluster graph" + ); +} + +/// Stand up an applied cluster with two graphs (`knowledge`, `archive`). 
+fn applied_two_graph_cluster() -> tempfile::TempDir { + let temp = tempdir().unwrap(); + let root = temp.path(); + fs::write( + root.join("people.pg"), + "node Person {\n name: String @key\n age: I32?\n}\n", + ) + .unwrap(); + fs::write(root.join("base.policy.yaml"), "rules: []\n").unwrap(); + fs::write( + root.join("cluster.yaml"), + r#" +version: 1 +metadata: + name: two-graph +state: + backend: cluster + lock: true +graphs: + knowledge: + schema: ./people.pg + archive: + schema: ./people.pg +policies: + base: + file: ./base.policy.yaml + applies_to: [knowledge, archive] +"#, + ) + .unwrap(); + init_named_cluster_graph(root, "knowledge", "people.pg"); + init_named_cluster_graph(root, "archive", "people.pg"); + assert_eq!(cluster_json(root, "import")["ok"], true); + assert_eq!(cluster_json(root, "apply")["converged"], true); + temp +} + +#[test] +fn optimize_on_multi_graph_cluster_without_graph_lists_candidates() { + // RFC-011 D7: >1 graph and no --graph → error naming every candidate, + // never an auto-pick. 
+ let temp = applied_two_graph_cluster(); let out = output_failure( cli() .arg("optimize") .arg("--cluster") - .arg(".") + .arg(temp.path()) .arg("--json"), ); let stderr = String::from_utf8_lossy(&out.stderr); assert!( - stderr.contains("cluster-graph") || stderr.contains("required"), - "expected --cluster to require --cluster-graph; got: {stderr}" + stderr.contains("2 graphs") + && stderr.contains("archive") + && stderr.contains("knowledge") + && stderr.contains("--graph "), + "expected a candidate-listing error; got: {stderr}" ); } @@ -1042,6 +1039,47 @@ fn init_refuses_a_cluster_managed_path_and_signposts_cluster_apply() { assert!(!temp.path().join("graphs").join("sneaky.omni").exists()); } +#[test] +fn schema_apply_refuses_a_cluster_managed_graph_and_signposts_cluster_apply() { + // RFC-011 Decision 10: a direct `schema apply` against a cluster-managed + // graph's storage root would bypass the ledger/recovery/approvals, so it is + // refused and points at `cluster apply` (mirrors `init`'s refusal). + let temp = applied_knowledge_cluster(); + // A schema that WOULD change the graph (adds `bio`) — so the no-mutation + // assertion below is meaningful, not a no-op re-apply. + fs::write( + temp.path().join("people_v2.pg"), + "node Person {\n name: String @key\n age: I32?\n bio: String?\n}\n", + ) + .unwrap(); + let out = output_failure( + cli() + .arg("schema") + .arg("apply") + .arg("--schema") + .arg(temp.path().join("people_v2.pg")) + .arg("--store") + .arg(temp.path().join("graphs").join("knowledge.omni")), + ); + let stderr = String::from_utf8_lossy(&out.stderr); + assert!( + stderr.contains("cluster apply"), + "schema apply against a cluster-managed graph should signpost `cluster apply`; got: {stderr}" + ); + // And it bailed BEFORE mutating: the live schema still lacks `bio`. 
+ let show = output_success( + cli() + .arg("schema") + .arg("show") + .arg(temp.path().join("graphs").join("knowledge.omni")), + ); + assert!( + !stdout_string(&show).contains("bio"), + "the refused apply must not have changed the live schema; got: {}", + stdout_string(&show) + ); +} + #[test] fn init_outside_a_cluster_still_works() { // Regression guard: ordinary init (no cluster layout) is unaffected. @@ -1076,7 +1114,7 @@ fn optimize_by_cluster_works_when_catalog_payloads_are_degraded() { .arg("optimize") .arg("--cluster") .arg(temp.path()) - .arg("--cluster-graph") + .arg("--graph") .arg("knowledge") .arg("--json"), ); diff --git a/crates/omnigraph-cli/tests/cli_cluster_e2e.rs b/crates/omnigraph-cli/tests/cli_cluster_e2e.rs index 36b476a..35ded58 100644 --- a/crates/omnigraph-cli/tests/cli_cluster_e2e.rs +++ b/crates/omnigraph-cli/tests/cli_cluster_e2e.rs @@ -3,6 +3,7 @@ use std::fs; +use omnigraph::db::Omnigraph; use tempfile::tempdir; mod support; @@ -236,27 +237,28 @@ fn cluster_e2e_out_of_band_schema_drift_then_apply_converges_it() { let apply = cluster_json(temp.path(), "apply"); assert_eq!(apply["converged"], true, "{apply}"); - // Out-of-band: the live graph evolves, cluster.yaml stays put. - fs::write( - temp.path().join("people_v2.pg"), - r#" + // Out-of-band: the live graph evolves while cluster.yaml stays put. RFC-011 + // D10 makes the CLI `schema apply` refuse a cluster-managed graph, so this + // simulates a true bypass — a direct engine apply against the storage root, + // exactly the drift the control plane must still detect and converge. + let people_v2 = r#" node Person { name: String @key age: I32? bio: String? 
} -"#, - ) - .unwrap(); - output_success( - cli() - .arg("schema") - .arg("apply") - .arg(temp.path().join("graphs/knowledge.omni")) - .arg("--schema") - .arg(temp.path().join("people_v2.pg")) - .arg("--json"), - ); +"#; + tokio::runtime::Runtime::new().unwrap().block_on(async { + let db = Omnigraph::open( + temp.path() + .join("graphs/knowledge.omni") + .to_string_lossy() + .as_ref(), + ) + .await + .unwrap(); + db.apply_schema(people_v2).await.unwrap(); + }); // Drift is visible... let refresh = cluster_json(temp.path(), "refresh"); diff --git a/crates/omnigraph-cli/tests/cli_data.rs b/crates/omnigraph-cli/tests/cli_data.rs index fc5db0a..81e1aab 100644 --- a/crates/omnigraph-cli/tests/cli_data.rs +++ b/crates/omnigraph-cli/tests/cli_data.rs @@ -165,12 +165,87 @@ fn optimize_with_server_flag_errors_wrong_plane() { let stderr = String::from_utf8_lossy(&output.stderr); assert!( stderr.contains("`optimize` is a direct (storage-native) command") - && stderr.contains("--server/--graph address a served graph and do not apply") - && stderr.contains("Pass a storage URI, or --cluster --cluster-graph ."), + && stderr.contains("--server addresses a served graph and does not apply") + && stderr.contains("Pass a storage URI, or --cluster --graph ."), "wrong-capability guard message not found; got: {stderr}" ); } +#[test] +fn wrong_address_guard_message_has_no_trailing_space() { + // The remediation tail is empty for served-addressing capabilities, so a + // misplaced --cluster on a data verb must not leave "… does not apply. " + // with a dangling space (error text is observable contract). NO_COLOR keeps + // the assertion off ANSI styling. 
+ let output = output_failure( + cli() + .env("NO_COLOR", "1") + .arg("query") + .arg("--cluster") + .arg("./brain") + .arg("-e") + .arg("query q { Person { id } }"), + ); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("and does not apply."), + "expected the wrong-address message; got: {stderr}" + ); + assert!( + !stderr.contains("and does not apply. "), + "trailing space after the message; got: {stderr}" + ); +} + +#[test] +fn graph_flag_on_a_positional_uri_errors() { + // RFC-011: `--graph` selects within a multi-graph scope (a server or + // cluster). An explicit `--store ` is already a single graph, so + // pairing it with `--graph` is a loud error, not a silently-dropped flag. + // (The guard lets `--graph` reach a data verb; the scope resolver rejects + // it.) + let temp = tempdir().unwrap(); + let graph = graph_path(temp.path()); + init_graph(&graph); + let output = output_failure( + cli() + .arg("query") + .arg("--store") + .arg(&graph) + .arg("--graph") + .arg("knowledge") + .arg("-e") + .arg("query q { Person { id } }"), + ); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("already a single graph"), + "expected --graph-on-explicit-store rejection; got: {stderr}" + ); +} + +#[test] +fn query_by_name_against_a_store_needs_a_server() { + // RFC-011 D3: by-name (catalog) invocation is served-only — the catalog is + // server-owned, so a bare `--store` has nothing to resolve the name + // against. The ad-hoc lane (`-e`/`--query`) is the local alternative. 
+ let temp = tempdir().unwrap(); + let graph = graph_path(temp.path()); + init_graph(&graph); + let output = output_failure( + cli() + .arg("query") + .arg("find_people") + .arg("--store") + .arg(&graph), + ); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("needs a server"), + "expected a served-only by-name error; got: {stderr}" + ); +} + #[test] fn optimize_with_remote_target_errors_storage_plane() { // RFC-010 Slice 1: a maintenance verb pointed at a remote URI fails loudly @@ -454,10 +529,9 @@ query list_people() { #[test] fn deprecated_read_and_change_subcommands_emit_warnings() { - // Both subcommands require `--query`/`--query-string`/`--alias`, so - // invoking them with no args will exit non-zero. That's fine -- - // we only care that the deprecation warning is printed before the - // argument-required error. + // Both subcommands require `--query`/`--query-string`, so invoking them + // with no args will exit non-zero. That's fine -- we only care that the + // deprecation warning is printed before the argument-required error. let output = cli().arg("read").output().unwrap(); let stderr = String::from_utf8(output.stderr).unwrap(); assert!( @@ -525,13 +599,15 @@ query list_people() { } #[test] -fn query_lint_can_resolve_graph_and_query_from_config() { +fn query_lint_can_resolve_graph_from_store_scope() { + // RFC-011: lint resolves its graph target through `--store` (the direct + // scope), not omnigraph.yaml's cli.graph; the .gq path is plain cwd-relative. 
let temp = tempdir().unwrap(); let graph = graph_path(temp.path()); - let config_path = temp.path().join("omnigraph.yaml"); init_graph(&graph); + let query_path = temp.path().join("queries.gq"); write_query_file( - &temp.path().join("queries.gq"), + &query_path, r#" query list_people() { match { $p: Person } @@ -539,16 +615,15 @@ query list_people() { } "#, ); - write_config(&config_path, &local_yaml_config(&graph)); let output = output_success( cli() .arg("query") .arg("lint") .arg("--query") - .arg("queries.gq") - .arg("--config") - .arg(&config_path) + .arg(&query_path) + .arg("--store") + .arg(&graph) .arg("--json"), ); let payload: Value = serde_json::from_slice(&output.stdout).unwrap(); @@ -616,7 +691,9 @@ query list_people() { ); let stderr = String::from_utf8_lossy(&output.stderr); assert!( - stderr.contains("lint requires --schema or a resolvable graph target") + stderr.contains("lint requires --schema ") + || stderr.contains("no graph addressed"), + "expected a schema-or-graph-target requirement; got: {stderr}" ); } @@ -785,10 +862,10 @@ fn read_json_outputs_rows_for_named_query() { let output = output_success( cli() .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(&queries) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -817,7 +894,6 @@ fn read_via_store_flag_and_profile_match_positional_uri() { let output = output_success( cmd.arg("--query") .arg(&queries) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -826,8 +902,8 @@ fn read_via_store_flag_and_profile_match_positional_uri() { serde_json::from_slice(&output.stdout).unwrap() }; - // Baseline: positional URI. - let baseline = read_rows(cli().arg("query").arg(&graph)); + // Baseline: --store names the graph. + let baseline = read_rows(cli().arg("query").arg("--store").arg(&graph)); assert_eq!(baseline["rows"][0]["p.name"], "Alice"); // --store names the same graph directly. 
@@ -914,43 +990,38 @@ fn export_jsonl_outputs_source_rows_for_selected_branch_and_type() { ); } +// RFC-011: `policy validate|test|explain` source the Cedar bundle from a +// converged cluster's applied policies (`--cluster ` + `--graph `), +// not omnigraph.yaml's policy.file. + #[test] -fn policy_validate_accepts_valid_policy_file() { - let temp = tempdir().unwrap(); - let (config, _) = write_policy_config_fixture(temp.path()); +fn policy_validate_accepts_cluster_bundle() { + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_YAML)); let output = output_success( cli() .arg("policy") .arg("validate") - .arg("--config") - .arg(&config), + .arg("--cluster") + .arg(cluster.path()) + .arg("--graph") + .arg("knowledge"), ); let stdout = stdout_string(&output); assert!(stdout.contains("policy valid:")); - assert!(stdout.contains("policy.yaml")); assert!(stdout.contains("[2 actors]")); } #[test] -fn policy_validate_fails_for_invalid_policy_file() { - let temp = tempdir().unwrap(); - let config = temp.path().join("omnigraph.yaml"); - let policy = temp.path().join("policy.yaml"); - fs::write( - &config, - r#" -project: - name: policy-test-graph -policy: - file: ./policy.yaml -"#, - ) - .unwrap(); - fs::write( - &policy, - r#" +fn policy_validate_fails_for_invalid_cluster_bundle() { + // The cluster does not validate a policy bundle's internal rules, so an + // applied-but-malformed bundle reaches `policy validate`, which compiles it + // and surfaces the error (here: a duplicate rule id). 
+ let cluster = converged_loaded_cluster( + "knowledge", + Some( + r#" version: 1 groups: team: [act-andrew] @@ -966,26 +1037,42 @@ rules: actions: [export] branch_scope: any "#, - ) - .unwrap(); + ), + ); let output = output_failure( cli() .arg("policy") .arg("validate") - .arg("--config") - .arg(&config), + .arg("--cluster") + .arg(cluster.path()) + .arg("--graph") + .arg("knowledge"), ); let stderr = String::from_utf8(output.stderr).unwrap(); - assert!(stderr.contains("duplicate policy rule id")); + assert!( + stderr.contains("duplicate policy rule id"), + "expected a duplicate-rule error; got: {stderr}" + ); } #[test] -fn policy_test_runs_declarative_cases() { - let temp = tempdir().unwrap(); - let (config, _) = write_policy_config_fixture(temp.path()); +fn policy_test_runs_declarative_cases_against_cluster_bundle() { + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_YAML)); + let tests = cluster.path().join("policy.tests.yaml"); + fs::write(&tests, POLICY_TESTS_YAML).unwrap(); - let output = output_success(cli().arg("policy").arg("test").arg("--config").arg(&config)); + let output = output_success( + cli() + .arg("policy") + .arg("test") + .arg("--cluster") + .arg(cluster.path()) + .arg("--graph") + .arg("knowledge") + .arg("--tests") + .arg(&tests), + ); let stdout = stdout_string(&output); assert!(stdout.contains("policy tests passed: 2 cases")); @@ -993,15 +1080,16 @@ fn policy_test_runs_declarative_cases() { #[test] fn policy_explain_reports_decision_and_matched_rule() { - let temp = tempdir().unwrap(); - let (config, _) = write_policy_config_fixture(temp.path()); + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_YAML)); let allow = output_success( cli() .arg("policy") .arg("explain") - .arg("--config") - .arg(&config) + .arg("--cluster") + .arg(cluster.path()) + .arg("--graph") + .arg("knowledge") .arg("--actor") .arg("act-andrew") .arg("--action") @@ -1017,8 +1105,10 @@ fn policy_explain_reports_decision_and_matched_rule() 
{ cli() .arg("policy") .arg("explain") - .arg("--config") - .arg(&config) + .arg("--cluster") + .arg(cluster.path()) + .arg("--graph") + .arg("knowledge") .arg("--actor") .arg("act-bruno") .arg("--action") @@ -1032,22 +1122,26 @@ fn policy_explain_reports_decision_and_matched_rule() { } #[test] -fn read_can_resolve_uri_from_config() { +fn read_resolves_uri_from_default_store_scope() { + // RFC-011: a zero-flag read resolves its graph from `defaults.store` in the + // operator config (the local-dev default scope) — no omnigraph.yaml. let temp = tempdir().unwrap(); let graph = graph_path(temp.path()); - let config = temp.path().join("omnigraph.yaml"); init_graph(&graph); load_fixture(&graph); - write_config(&config, &local_yaml_config(&graph)); + let home = tempdir().unwrap(); + std::fs::write( + home.path().join("config.yaml"), + format!("defaults:\n store: {}\n", graph.to_string_lossy()), + ) + .unwrap(); let output = output_success( cli() + .env("OMNIGRAPH_HOME", home.path()) .arg("read") - .arg("--config") - .arg(&config) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -1067,10 +1161,10 @@ fn read_csv_format_outputs_header_and_row_values() { let output = output_success( cli() .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -1104,10 +1198,10 @@ fn read_uses_operator_default_output_format() { command .env("OMNIGRAPH_HOME", operator_home.path()) .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#); @@ -1139,10 +1233,10 @@ fn read_jsonl_format_outputs_metadata_header_first() { let output = output_success( cli() .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") 
.arg(r#"{"name":"Alice"}"#) @@ -1174,6 +1268,7 @@ query insert_person($name: String, $age: I32) { let output = output_success( cli() .arg("change") + .arg("--store") .arg(&graph) .arg("--query") .arg(&mutation_file) @@ -1190,10 +1285,10 @@ query insert_person($name: String, $age: I32) { let verify = output_success( cli() .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Eve"}"#) @@ -1205,13 +1300,13 @@ query insert_person($name: String, $age: I32) { } #[test] -fn change_can_resolve_uri_and_branch_from_config() { +fn change_resolves_uri_and_default_branch_from_store_scope() { + // RFC-011: a mutate resolves its graph from `--store` and defaults the + // branch to main (no omnigraph.yaml cli.graph / cli.branch). let temp = tempdir().unwrap(); let graph = graph_path(temp.path()); - let config = temp.path().join("omnigraph.yaml"); init_graph(&graph); load_fixture(&graph); - write_config(&config, &local_yaml_config(&graph)); let mutation_file = temp.path().join("config-mutations.gq"); write_query_file( &mutation_file, @@ -1225,8 +1320,8 @@ query insert_person($name: String, $age: I32) { let output = output_success( cli() .arg("change") - .arg("--config") - .arg(&config) + .arg("--store") + .arg(&graph) .arg("--query") .arg(&mutation_file) .arg("--params") @@ -1248,6 +1343,7 @@ fn read_requires_name_for_multi_query_files() { let output = output_failure( cli() .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(fixture("test.gq")), @@ -1266,6 +1362,7 @@ fn read_supports_inline_query_string() { let output = output_success( cli() .arg("read") + .arg("--store") .arg(&repo) .arg("-e") .arg("query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }") @@ -1281,11 +1378,12 @@ fn read_supports_inline_query_string() { #[test] fn positional_http_uri_on_a_data_verb_is_rejected() { - // RFC-011: a positional/`--uri` http(s):// URL 
no longer dispatches to a - // remote server — that requires `--server `. + // RFC-011: a `--store` http(s):// URL no longer dispatches to a remote + // server — that requires `--server `. let output = output_failure( cli() .arg("query") + .arg("--store") .arg("http://127.0.0.1:1") .arg("-e") .arg("query q() { match { $p: Person { } } return { $p } }"), @@ -1293,7 +1391,7 @@ fn positional_http_uri_on_a_data_verb_is_rejected() { let stderr = String::from_utf8_lossy(&output.stderr); assert!( stderr.contains("must be addressed with `--server `"), - "expected positional-remote rejection; got: {stderr}" + "expected store-remote rejection; got: {stderr}" ); } @@ -1331,6 +1429,7 @@ fn change_supports_inline_query_string() { let output = output_success( cli() .arg("change") + .arg("--store") .arg(&repo) .arg("--query-string") .arg("query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }") @@ -1345,6 +1444,7 @@ fn change_supports_inline_query_string() { let verify = output_success( cli() .arg("read") + .arg("--store") .arg(&repo) .arg("-e") .arg("query find($name: String) { match { $p: Person { name: $name } } return { $p.name } }") @@ -1366,6 +1466,7 @@ fn read_rejects_query_string_combined_with_query() { let output = output_failure( cli() .arg("read") + .arg("--store") .arg(&repo) .arg("--query") .arg(fixture("test.gq")) @@ -1386,7 +1487,7 @@ fn read_rejects_empty_query_string() { init_graph(&repo); load_fixture(&repo); - let output = output_failure(cli().arg("read").arg(&repo).arg("-e").arg("")); + let output = output_failure(cli().arg("read").arg("--store").arg(&repo).arg("-e").arg("")); let stderr = String::from_utf8(output.stderr).unwrap(); assert!( stderr.contains("must not be empty"), @@ -1514,6 +1615,160 @@ fn branch_delete_rejects_main() { assert!(stderr.contains("cannot delete branch 'main'")); } +// ── RFC-011 Decision 9: write diagnostics + non-local destructive-confirm ── + +#[test] +fn write_echoes_resolved_target_to_stderr() { + // 
Every write echoes its resolved target + access path to stderr; --json + // (stdout) is unaffected. A local load → "(direct, local)". + let temp = tempdir().unwrap(); + let graph = graph_path(temp.path()); + init_graph(&graph); + let data = fixture("test.jsonl"); + let output = output_success( + cli() + .arg("load") + .arg("--mode") + .arg("append") + .arg("--data") + .arg(&data) + .arg(&graph) + .arg("--json"), + ); + let stderr = String::from_utf8(output.stderr).unwrap(); + assert!( + stderr.contains("omnigraph load →") && stderr.contains("(direct, local)"), + "missing write-target echo; stderr: {stderr}" + ); + // stdout still parses as JSON — the echo went to stderr. + let _: Value = serde_json::from_slice(&output.stdout).unwrap(); +} + +#[test] +fn quiet_suppresses_the_write_target_echo() { + let temp = tempdir().unwrap(); + let graph = graph_path(temp.path()); + init_graph(&graph); + let data = fixture("test.jsonl"); + let output = output_success( + cli() + .arg("--quiet") + .arg("load") + .arg("--mode") + .arg("append") + .arg("--data") + .arg(&data) + .arg(&graph), + ); + let stderr = String::from_utf8(output.stderr).unwrap(); + assert!( + !stderr.contains("omnigraph load →"), + "--quiet should suppress the echo; stderr: {stderr}" + ); +} + +#[test] +fn branch_delete_against_non_local_scope_refuses_without_yes() { + // No bucket needed: the confirm gate fires before the graph is opened. 
+ let output = output_failure( + cli() + .arg("branch") + .arg("delete") + .arg("--store") + .arg("s3://fake-bucket/g.omni") + .arg("feature") + .arg("--json"), + ); + let stderr = String::from_utf8(output.stderr).unwrap(); + assert!( + stderr.contains("refusing destructive `branch delete`") && stderr.contains("--yes"), + "expected a non-local destructive refusal; stderr: {stderr}" + ); +} + +#[test] +fn branch_delete_against_non_local_scope_passes_gate_with_yes() { + // With --yes the gate is bypassed; the command then fails for an unrelated + // reason (the fake bucket can't be opened), so the refusal must be ABSENT. + let output = output_failure( + cli() + .arg("branch") + .arg("delete") + .arg("--store") + .arg("s3://fake-bucket/g.omni") + .arg("feature") + .arg("--yes") + .arg("--json"), + ); + let stderr = String::from_utf8(output.stderr).unwrap(); + assert!( + !stderr.contains("refusing destructive"), + "--yes should bypass the confirm gate; stderr: {stderr}" + ); +} + +#[test] +fn overwrite_load_against_non_local_scope_refuses_without_yes() { + let output = output_failure( + cli() + .arg("load") + .arg("--mode") + .arg("overwrite") + .arg("--data") + .arg(fixture("test.jsonl")) + .arg("--store") + .arg("s3://fake-bucket/g.omni") + .arg("--json"), + ); + let stderr = String::from_utf8(output.stderr).unwrap(); + assert!( + stderr.contains("refusing destructive `load --mode overwrite`"), + "expected a non-local overwrite refusal; stderr: {stderr}" + ); +} + +#[test] +fn cleanup_against_non_local_scope_refuses_without_yes() { + // Past the --confirm preview gate, a non-local cleanup still needs --yes. 
+ let output = output_failure( + cli() + .arg("cleanup") + .arg("--store") + .arg("s3://fake-bucket/g.omni") + .arg("--keep") + .arg("5") + .arg("--confirm") + .arg("--json"), + ); + let stderr = String::from_utf8(output.stderr).unwrap(); + assert!( + stderr.contains("refusing destructive `cleanup`"), + "expected a non-local cleanup refusal; stderr: {stderr}" + ); +} + +#[test] +fn cleanup_against_local_scope_executes_with_confirm() { + // Local cleanup needs no --yes; --confirm alone executes (and echoes). + let temp = tempdir().unwrap(); + let graph = graph_path(temp.path()); + init_graph(&graph); + load_fixture(&graph); + let output = output_success( + cli() + .arg("cleanup") + .arg("--keep") + .arg("1") + .arg("--confirm") + .arg(&graph) + .arg("--json"), + ); + let payload: Value = serde_json::from_slice(&output.stdout).unwrap(); + assert!(payload["tables"].as_array().is_some(), "{payload}"); + let stderr = String::from_utf8(output.stderr).unwrap(); + assert!(stderr.contains("omnigraph cleanup →"), "stderr: {stderr}"); +} + #[test] fn branch_merge_defaults_target_to_main() { let temp = tempdir().unwrap(); @@ -1663,19 +1918,17 @@ fn snapshot_json_returns_manifest_version_and_tables() { } #[test] -fn snapshot_can_resolve_uri_from_config() { +fn snapshot_resolves_uri_from_store_scope() { let temp = tempdir().unwrap(); let graph = graph_path(temp.path()); - let config = temp.path().join("omnigraph.yaml"); init_graph(&graph); load_fixture(&graph); - write_config(&config, &local_yaml_config(&graph)); let output = output_success( cli() .arg("snapshot") - .arg("--config") - .arg(&config) + .arg("--store") + .arg(&graph) .arg("--json"), ); let payload: Value = serde_json::from_slice(&output.stdout).unwrap(); @@ -1816,3 +2069,162 @@ fn cli_fails_for_invalid_merge_requests() { .contains("distinct source and target") ); } + +/// RFC-011 Decision 8: `profile list` / `profile show` inspect the operator +/// config's profiles read-only. Hermetic via OMNIGRAPH_HOME. 
+fn profile_home() -> tempfile::TempDir { + let home = tempdir().unwrap(); + std::fs::write( + home.path().join("config.yaml"), + "operator:\n actor: act-andrew\n\ + defaults:\n output: json\n server: prod\n default_graph: knowledge\n\ + servers:\n prod:\n url: https://graph.example.com\n\ + clusters:\n brain:\n root: s3://acme/clusters/brain\n\ + profiles:\n\ + \x20 staging:\n server: prod\n default_graph: kb\n\ + \x20 brain-admin:\n cluster: brain\n\ + \x20 localdev:\n store: file:///data/dev.omni\n\ + \x20 broken:\n server: a\n store: b\n", + ) + .unwrap(); + home +} + +#[test] +fn profile_list_names_each_profile_with_its_binding_and_marks_active() { + let home = profile_home(); + let out = output_success( + cli() + .env("OMNIGRAPH_HOME", home.path()) + .env("OMNIGRAPH_PROFILE", "staging") + .arg("profile") + .arg("list"), + ); + let stdout = stdout_string(&out); + assert!(stdout.contains("staging (active)"), "{stdout}"); + assert!(stdout.contains("server: prod"), "{stdout}"); + assert!(stdout.contains("cluster: brain"), "{stdout}"); + assert!(stdout.contains("store: file:///data/dev.omni"), "{stdout}"); + // A malformed (two-scope) profile is reported, not a hard failure. 
+ assert!(stdout.contains("broken") && stdout.contains("invalid:"), "{stdout}"); +} + +#[test] +fn profile_list_json_shape() { + let home = profile_home(); + let out = output_success( + cli() + .env("OMNIGRAPH_HOME", home.path()) + .arg("profile") + .arg("list") + .arg("--json"), + ); + let items: Value = serde_json::from_slice(&out.stdout).unwrap(); + let brain = items + .as_array() + .unwrap() + .iter() + .find(|p| p["name"] == "brain-admin") + .unwrap(); + assert_eq!(brain["binding"], "cluster: brain"); + assert_eq!(brain["scope_kind"], "cluster"); + assert_eq!(brain["target"], "brain"); + assert_eq!(brain["valid"], true); + assert!(brain["error"].is_null()); + assert_eq!(brain["active"], false); + let broken = items + .as_array() + .unwrap() + .iter() + .find(|p| p["name"] == "broken") + .unwrap(); + assert_eq!(broken["scope_kind"], "invalid"); + assert_eq!(broken["valid"], false); + assert!(broken["target"].is_null()); + assert!( + broken["error"] + .as_str() + .unwrap() + .contains("profile 'broken'") + ); +} + +#[test] +fn profile_show_resolves_named_scope_endpoints() { + let home = profile_home(); + // A cluster profile resolves its root. + let cluster = output_success( + cli() + .env("OMNIGRAPH_HOME", home.path()) + .arg("profile") + .arg("show") + .arg("brain-admin"), + ); + let cs = stdout_string(&cluster); + assert!(cs.contains("scope: cluster brain"), "{cs}"); + assert!(cs.contains("endpoint: s3://acme/clusters/brain"), "{cs}"); + + // A store profile shows its URI as the endpoint. 
+ let store = output_success( + cli() + .env("OMNIGRAPH_HOME", home.path()) + .arg("profile") + .arg("show") + .arg("localdev") + .arg("--json"), + ); + let detail: Value = serde_json::from_slice(&store.stdout).unwrap(); + assert_eq!(detail["scope_kind"], "store"); + assert_eq!(detail["endpoint"], "file:///data/dev.omni"); +} + +#[test] +fn profile_show_without_name_falls_back_to_flat_defaults() { + let home = profile_home(); + let out = output_success( + cli() + .env("OMNIGRAPH_HOME", home.path()) + .arg("profile") + .arg("show") + .arg("--json"), + ); + let detail: Value = serde_json::from_slice(&out.stdout).unwrap(); + assert_eq!(detail["name"], "(defaults)"); + assert_eq!(detail["scope_kind"], "server"); + assert_eq!(detail["endpoint"], "https://graph.example.com"); + assert_eq!(detail["default_graph"], "knowledge"); +} + +#[test] +fn profile_show_without_name_uses_active_env_profile() { + let home = profile_home(); + let out = output_success( + cli() + .env("OMNIGRAPH_HOME", home.path()) + .env("OMNIGRAPH_PROFILE", "brain-admin") + .arg("profile") + .arg("show") + .arg("--json"), + ); + let detail: Value = serde_json::from_slice(&out.stdout).unwrap(); + // No name arg, but $OMNIGRAPH_PROFILE selects brain-admin (not the flat defaults). + assert_eq!(detail["name"], "brain-admin"); + assert_eq!(detail["scope_kind"], "cluster"); + assert_eq!(detail["endpoint"], "s3://acme/clusters/brain"); + // output_format renders as the canonical lowercase value name. 
+ assert_eq!(detail["output_format"], "json"); +} + +#[test] +fn profile_show_unknown_name_errors() { + let home = profile_home(); + let out = output_failure( + cli() + .env("OMNIGRAPH_HOME", home.path()) + .arg("profile") + .arg("show") + .arg("nope"), + ); + let stderr = String::from_utf8_lossy(&out.stderr); + assert!(stderr.contains("unknown profile 'nope'"), "{stderr}"); +} diff --git a/crates/omnigraph-cli/tests/cli_queries.rs b/crates/omnigraph-cli/tests/cli_queries.rs index 2f2ff00..92f7879 100644 --- a/crates/omnigraph-cli/tests/cli_queries.rs +++ b/crates/omnigraph-cli/tests/cli_queries.rs @@ -2,7 +2,6 @@ //! Moved verbatim from tests/cli.rs in the modularization. -use serde_json::Value; use tempfile::tempdir; mod support; @@ -57,227 +56,172 @@ query list_people() { assert_eq!(stdout_string(&lint_output), stdout_string(&check_output)); } +// Legacy `omnigraph.yaml` `aliases:` invoked via the `--alias` flag were +// removed in RFC-011 D4 — operator aliases now live under `omnigraph alias +// ` (the happy path is covered by system_local's operator-alias e2e). +// The legacy file-alias path has no CLI entry point. + #[test] -fn read_alias_from_yaml_config_runs_with_kv_output() { - let temp = tempdir().unwrap(); - let graph = graph_path(temp.path()); - let config = temp.path().join("omnigraph.yaml"); - let query = temp.path().join("aliases.gq"); - init_graph(&graph); - load_fixture(&graph); - write_query_file( - &query, - &std::fs::read_to_string(fixture("test.gq")).unwrap(), +fn alias_flag_is_removed_from_query() { + // RFC-011 D4: `--alias` no longer exists on query/mutate; use `alias `. 
+ let output = output_failure(cli().arg("query").arg("--alias").arg("who")); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("unexpected argument") && stderr.contains("--alias"), + "expected clap to reject --alias on query; got: {stderr}" ); - write_config( - &config, - &format!( - "{}aliases:\n owner:\n command: read\n query: aliases.gq\n name: get_person\n args: [name]\n format: kv\n", - local_yaml_config(&graph) - ), - ); - - let output = output_success( - cli() - .arg("read") - .arg("--config") - .arg(&config) - .arg("--alias") - .arg("owner") - .arg("Alice"), - ); - let stdout = stdout_string(&output); - - assert!(stdout.contains("row 1")); - assert!(stdout.contains("p.name: Alice")); } #[test] -fn read_alias_uses_alias_target_without_cli_default_and_accepts_url_like_arg() { - let temp = tempdir().unwrap(); - let graph = graph_path(temp.path()); - let config = temp.path().join("omnigraph.yaml"); - let query = temp.path().join("aliases.gq"); - let data = temp.path().join("url-like.jsonl"); - init_graph(&graph); - write_jsonl( - &data, - r#"{"type":"Person","data":{"name":"https://example.com","age":30}}"#, - ); - output_success( +fn alias_unknown_name_errors_listing_defined() { + // Hermetic: an unknown alias fails before any network, listing defined ones. 
+ let home = tempdir().unwrap(); + std::fs::write( + home.path().join("config.yaml"), + "servers:\n dev:\n url: https://x\naliases:\n who:\n server: dev\n query: find_person\n", + ) + .unwrap(); + let output = output_failure( cli() - .arg("load") - .arg("--mode") - .arg("overwrite") - .arg("--data") - .arg(&data) - .arg(&graph), + .env("OMNIGRAPH_HOME", home.path()) + .arg("alias") + .arg("nope"), ); - write_query_file( - &query, - &std::fs::read_to_string(fixture("test.gq")).unwrap(), + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("unknown alias 'nope'") && stderr.contains("who"), + "expected an unknown-alias error listing defined aliases; got: {stderr}" ); - write_config( - &config, - &format!( - "graphs:\n local:\n uri: '{}'\nquery:\n roots:\n - .\npolicy: {{}}\naliases:\n owner:\n command: read\n query: aliases.gq\n name: get_person\n args: [name]\n graph: local\n format: kv\n", - graph.to_string_lossy() - ), - ); - - let output = output_success( - cli() - .arg("read") - .arg("--config") - .arg(&config) - .arg("--alias") - .arg("owner") - .arg("https://example.com"), - ); - let stdout = stdout_string(&output); - - assert!(stdout.contains("row 1")); - assert!(stdout.contains("p.name: https://example.com")); } #[test] -fn change_alias_from_yaml_config_persists_changes() { - let temp = tempdir().unwrap(); - let graph = graph_path(temp.path()); - let config = temp.path().join("omnigraph.yaml"); - let query = temp.path().join("mutations.gq"); - init_graph(&graph); - load_fixture(&graph); - write_query_file( - &query, - r#" -query insert_person($name: String, $age: I32) { - insert Person { name: $name, age: $age } +fn alias_rejects_global_scope_flags_that_the_binding_owns() { + for (flag, value) in [ + ("--server", "dev"), + ("--graph", "local"), + ("--store", "file:///tmp/graph.omni"), + ("--cluster", "."), + ("--profile", "prod"), + ("--as", "act-op"), + ] { + let output = 
output_failure(cli().arg(flag).arg(value).arg("alias").arg("who")); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("`alias` uses the server, graph, and stored query") + && stderr.contains(flag), + "expected {flag} to be rejected by the alias binding guard; got: {stderr}" + ); + } } -"#, + +#[test] +fn queries_and_policy_wrong_server_scope_points_at_cluster_scope() { + let output = output_failure(cli().arg("--server").arg("prod").arg("queries").arg("list")); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("pass --cluster ") && !stderr.contains("pass --config "), + "queries should point at --cluster, not --config; got: {stderr}" ); - write_config( - &config, - &format!( - "{}aliases:\n add_person:\n command: change\n query: mutations.gq\n name: insert_person\n args: [name, age]\n", - local_yaml_config(&graph) + + let output = output_failure( + cli() + .arg("--server") + .arg("prod") + .arg("policy") + .arg("validate"), + ); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("pass --cluster ") && !stderr.contains("pass --config "), + "policy should point at --cluster, not --config; got: {stderr}" + ); +} + +// RFC-011: `queries validate`/`list` source the registry + schemas from a +// converged cluster's applied state (`--cluster `), not omnigraph.yaml. + +/// Build a converged single-graph cluster (id `knowledge`) with one stored +/// query. `query_block` is the YAML under the graph's `queries:` key. 
+fn converged_cluster_with_query(query_file: &str, query_src: &str, query_block: &str) -> tempfile::TempDir { + let temp = tempdir().unwrap(); + let dir = temp.path(); + std::fs::copy(fixture("test.pg"), dir.join("graph.pg")).unwrap(); + write_query_file(&dir.join(query_file), query_src); + std::fs::write( + dir.join("cluster.yaml"), + format!( + "version: 1\nmetadata:\n name: sys\nstate:\n backend: cluster\n lock: true\n\ + graphs:\n knowledge:\n schema: ./graph.pg\n queries:\n{query_block}" ), - ); - - let output = output_success( - cli() - .arg("change") - .arg("--config") - .arg(&config) - .arg("--alias") - .arg("add_person") - .arg("Eve") - .arg("29") - .arg("--json"), - ); - let payload: Value = serde_json::from_slice(&output.stdout).unwrap(); - assert_eq!(payload["affected_nodes"], 1); - - let verify = output_success( - cli() - .arg("read") - .arg(&graph) - .arg("--query") - .arg(fixture("test.gq")) - .arg("--name") - .arg("get_person") - .arg("--params") - .arg(r#"{"name":"Eve"}"#) - .arg("--json"), - ); - let verify_payload: Value = serde_json::from_slice(&verify.stdout).unwrap(); - assert_eq!(verify_payload["row_count"], 1); + ) + .unwrap(); + output_success(cli().arg("cluster").arg("import").arg("--config").arg(dir)); + output_success(cli().arg("cluster").arg("apply").arg("--config").arg(dir)); + temp } #[test] fn queries_validate_exits_zero_on_clean_registry() { - let graph = SystemGraph::loaded(); - graph.write_query( + let cluster = converged_cluster_with_query( "find_person.gq", "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }", - ); - let config = graph.write_config( - "omnigraph.yaml", - &queries_test_config( - &graph.path().to_string_lossy(), - "find_person", - "find_person.gq", - ), + " find_person:\n file: ./find_person.gq\n", ); let output = output_success( cli() .arg("queries") .arg("validate") - .arg("--config") - .arg(&config), + .arg("--cluster") + .arg(cluster.path()), ); let stdout = 
stdout_string(&output); assert!(stdout.contains("OK"), "stdout:\n{stdout}"); } #[test] -fn queries_validate_exits_nonzero_on_type_broken_query() { - let graph = SystemGraph::loaded(); - // `Widget` is not in the fixture schema. - graph.write_query( - "ghost.gq", +fn cluster_import_rejects_a_type_broken_query() { + // In the cluster model a stored query is type-checked at the cluster + // boundary (import/apply), so a broken query can never reach the applied + // state `queries validate` reads — the gate is upstream. `Widget` is not in + // the fixture schema, so import must reject it, naming the query. + let temp = tempdir().unwrap(); + let dir = temp.path(); + std::fs::copy(fixture("test.pg"), dir.join("graph.pg")).unwrap(); + write_query_file( + &dir.join("ghost.gq"), "query ghost() { match { $w: Widget } return { $w.name } }", ); - let config = graph.write_config( - "omnigraph.yaml", - &queries_test_config(&graph.path().to_string_lossy(), "ghost", "ghost.gq"), + std::fs::write( + dir.join("cluster.yaml"), + "version: 1\nmetadata:\n name: sys\nstate:\n backend: cluster\n lock: true\n\ + graphs:\n knowledge:\n schema: ./graph.pg\n queries:\n ghost:\n file: ./ghost.gq\n", + ) + .unwrap(); + let output = output_failure(cli().arg("cluster").arg("import").arg("--config").arg(dir)); + let combined = format!( + "{}{}", + stdout_string(&output), + String::from_utf8_lossy(&output.stderr) ); - let output = output_failure( - cli() - .arg("queries") - .arg("validate") - .arg("--config") - .arg(&config), - ); - let stdout = stdout_string(&output); assert!( - stdout.contains("ghost"), - "validation should name the broken query; stdout:\n{stdout}" + combined.contains("ghost"), + "cluster import must reject the broken query, naming it; got:\n{combined}" ); } #[test] fn queries_list_prints_registered_query() { - let graph = SystemGraph::loaded(); - graph.write_query( + let cluster = converged_cluster_with_query( "find_person.gq", "query find_person($name: String) { match { $p: 
Person { name: $name } } return { $p.age } }", - ); - // Exposed with an explicit tool name so the list shows the MCP suffix. - let config = graph.write_config( - "omnigraph.yaml", - &format!( - concat!( - "graphs:\n", - " local:\n", - " uri: '{}'\n", - " queries:\n", - " find_person:\n", - " file: ./find_person.gq\n", - " mcp: {{ expose: true, tool_name: lookup_person }}\n", - "cli:\n", - " graph: local\n", - "policy: {{}}\n", - ), - graph.path().to_string_lossy().replace('\'', "''") - ), + " find_person:\n file: ./find_person.gq\n", ); let output = output_success( cli() .arg("queries") .arg("list") - .arg("--config") - .arg(&config), + .arg("--cluster") + .arg(cluster.path()), ); let stdout = stdout_string(&output); assert!(stdout.contains("find_person"), "stdout:\n{stdout}"); @@ -285,242 +229,37 @@ fn queries_list_prints_registered_query() { stdout.contains("$name: String"), "list should show typed params; stdout:\n{stdout}" ); - assert!( - stdout.contains("[mcp: lookup_person]"), - "list should show the MCP tool name for exposed queries; stdout:\n{stdout}" - ); } #[test] -fn queries_list_requires_graph_selection_for_per_graph_only_registries() { - let graph = SystemGraph::loaded(); - graph.write_query( - "find_person.gq", - "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }", - ); - let config = graph.write_config( - "omnigraph.yaml", - &format!( - concat!( - "graphs:\n", - " local:\n", - " uri: '{}'\n", - " queries:\n", - " find_person:\n", - " file: ./find_person.gq\n", - "policy: {{}}\n", - ), - graph.path().to_string_lossy().replace('\'', "''") - ), - ); - - let output = output_failure( - cli() - .arg("queries") - .arg("list") - .arg("--config") - .arg(&config), - ); +fn queries_validate_requires_a_cluster() { + // RFC-011: with no --cluster (and no cluster profile), the command errors + // loudly rather than reading any omnigraph.yaml. 
+ let output = output_failure(cli().arg("queries").arg("validate")); let stderr = String::from_utf8_lossy(&output.stderr); assert!( - stderr.contains("local") && stderr.contains("set `cli.graph`"), - "error must name the graph and give a concrete selection hint; stderr:\n{stderr}" + stderr.contains("needs a cluster") || stderr.contains("--cluster"), + "queries validate must require a cluster; stderr:\n{stderr}" ); } #[test] -fn queries_list_without_graph_selection_lists_top_level_registry() { - let graph = SystemGraph::loaded(); - graph.write_query( - "top_find.gq", - "query top_find($name: String) { match { $p: Person { name: $name } } return { $p.age } }", - ); - let config = graph.write_config( - "omnigraph.yaml", - concat!( - "queries:\n", - " top_find:\n", - " file: ./top_find.gq\n", - "policy: {}\n", - ), - ); - - let output = output_success( - cli() - .arg("queries") - .arg("list") - .arg("--config") - .arg(&config), - ); - let stdout = stdout_string(&output); - assert!(stdout.contains("top_find"), "stdout:\n{stdout}"); -} - -#[test] -fn queries_list_unknown_cli_graph_errors() { - // `queries list` opens no graph URI, so unknown-graph validation can't ride - // along on URI resolution the way it does for every other command. An - // unknown `cli.graph` selection must still error (naming the graph) instead - // of silently falling back to the top-level registry and showing the wrong - // (or empty) catalog. (`--target` was removed; `cli.graph` drives selection.) 
- let graph = SystemGraph::loaded(); - graph.write_query( - "find_person.gq", - "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }", - ); - let config = graph.write_config( - "omnigraph.yaml", - &format!( - "graphs:\n local:\n uri: '{}'\n queries:\n find_person:\n file: ./find_person.gq\ncli:\n graph: nonexistent\npolicy: {{}}\n", - graph.path().to_string_lossy().replace('\'', "''"), - ), - ); - let output = output_failure(cli().arg("queries").arg("list").arg("--config").arg(&config)); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!( - stderr.contains("nonexistent"), - "error must name the unknown graph; stderr:\n{stderr}" - ); -} - -#[test] -fn queries_commands_reject_named_graph_with_populated_top_level_block() { - // A named graph (here via `cli.graph`) uses its own `graphs.` block, - // so a populated top-level `queries:` block would be silently ignored — a - // config the server REFUSES to boot. `queries validate`/`list` must reject - // it too (matching boot) instead of validating/listing the per-graph block - // and giving a false green. - let graph = SystemGraph::loaded(); - graph.write_query( - "find_person.gq", - "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }", - ); - let config = graph.write_config( - "omnigraph.yaml", - &format!( - concat!( - "graphs:\n", - " local:\n", - " uri: '{}'\n", - " queries:\n", - " find_person:\n", - " file: ./find_person.gq\n", - "cli:\n", - " graph: local\n", - "queries:\n", // populated top-level block: the coherence violation - " legacy:\n", - " file: ./legacy.gq\n", - "policy: {{}}\n", - ), - graph.path().to_string_lossy().replace('\'', "''") - ), - ); - // Both resolve `local` from cli.graph (no positional URI), so both must - // error and name the graph + the ignored block — like server boot does. 
- for sub in ["validate", "list"] { - let output = output_failure(cli().arg("queries").arg(sub).arg("--config").arg(&config)); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!( - stderr.contains("local") && stderr.contains("queries"), - "`queries {sub}` must reject a named graph with a populated top-level block; stderr:\n{stderr}" - ); - } -} - -#[test] -fn queries_validate_exits_nonzero_on_duplicate_tool_name() { - // Two exposed queries claiming one MCP tool name is a load-time - // collision — `queries validate` must fail (offline, before the engine - // opens) and name both queries plus the contested tool. - let graph = SystemGraph::loaded(); - graph.write_query( - "a.gq", - "query a() { match { $p: Person } return { $p.name } }", - ); - graph.write_query( - "b.gq", - "query b() { match { $p: Person } return { $p.name } }", - ); - let config = graph.write_config( - "omnigraph.yaml", - &format!( - concat!( - "graphs:\n", - " local:\n", - " uri: '{}'\n", - " queries:\n", - " a:\n", - " file: ./a.gq\n", - " mcp: {{ expose: true, tool_name: dup }}\n", - " b:\n", - " file: ./b.gq\n", - " mcp: {{ expose: true, tool_name: dup }}\n", - "cli:\n", - " graph: local\n", - "policy: {{}}\n", - ), - graph.path().to_string_lossy().replace('\'', "''") - ), - ); - let output = output_failure( - cli() - .arg("queries") - .arg("validate") - .arg("--config") - .arg(&config), - ); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!( - stderr.contains("dup") && stderr.contains("'a'") && stderr.contains("'b'"), - "duplicate tool name should be reported naming both queries; stderr:\n{stderr}" - ); -} - -#[test] -fn queries_validate_positional_uri_ignores_default_graph() { - // A positional URI is anonymous → the schema AND the registry both come - // from top-level, even when `cli.graph` names a graph whose per-graph - // queries would fail. Pins that the URI and registry can't diverge. 
- let graph = SystemGraph::loaded(); - graph.write_query( - "clean.gq", - "query clean($name: String) { match { $p: Person { name: $name } } return { $p.age } }", - ); - // `Widget` is not in the fixture schema — the default graph's per-graph - // query would break validate if it were (wrongly) selected. - graph.write_query( - "broken.gq", - "query broken() { match { $w: Widget } return { $w.name } }", - ); - let config = graph.write_config( - "omnigraph.yaml", - concat!( - "cli:\n graph: prod\n", - "graphs:\n", - " prod:\n", - " uri: /nonexistent-prod.omni\n", - " queries:\n", - " broken:\n", - " file: ./broken.gq\n", - "queries:\n", - " clean:\n", - " file: ./clean.gq\n", - "policy: {}\n", - ), - ); - // Positional URI = the real loaded graph; selection is anonymous, so the - // CLEAN top-level registry validates (not prod's broken one). +fn queries_validate_graph_filter_selects_one_graph() { + // A multi-graph cluster: validate scoped to `knowledge` type-checks only + // that graph's registry, ignoring `engineering`'s. 
+ let temp = tempdir().unwrap(); + let dir = temp.path(); + write_multi_graph_cluster_fixture(dir); + output_success(cli().arg("cluster").arg("import").arg("--config").arg(dir)); + output_success(cli().arg("cluster").arg("apply").arg("--config").arg(dir)); let output = output_success( cli() .arg("queries") .arg("validate") - .arg(graph.path()) - .arg("--config") - .arg(&config), - ); - let stdout = stdout_string(&output); - assert!( - stdout.contains("OK"), - "positional URI must validate the top-level registry, not the cli.graph default; stdout:\n{stdout}" + .arg("--cluster") + .arg(dir) + .arg("--graph") + .arg("knowledge"), ); + assert!(stdout_string(&output).contains("OK")); } diff --git a/crates/omnigraph-cli/tests/cli_schema_config.rs b/crates/omnigraph-cli/tests/cli_schema_config.rs index 0b6eca9..5577aa8 100644 --- a/crates/omnigraph-cli/tests/cli_schema_config.rs +++ b/crates/omnigraph-cli/tests/cli_schema_config.rs @@ -121,7 +121,7 @@ fn schema_plan_with_server_flag_errors_wrong_plane() { let stderr = String::from_utf8_lossy(&output.stderr); assert!( stderr.contains("`schema plan` is a direct (storage-native) command") - && stderr.contains("Pass a storage URI, or --cluster --cluster-graph ."), + && stderr.contains("Pass a storage URI."), "schema plan wrong-capability message not found; got: {stderr}" ); } @@ -334,7 +334,13 @@ fn schema_apply_json_adds_index_for_existing_property() { let dataset = snapshot.open("node:Person").await.unwrap(); dataset.load_indices().await.unwrap().len() }); - assert!(after_index_count > before_index_count); + // iss-848: `schema apply` records the `@index` intent but defers the physical + // index build (materialized later by ensure_indices/optimize; on this empty + // table nothing builds anyway). So the physical index count is unchanged. 
+ assert_eq!( + after_index_count, before_index_count, + "schema apply records @index intent but defers the physical build (iss-848)" + ); } #[test] @@ -540,163 +546,18 @@ fn graphs_subcommand_help_lists_list_only() { #[test] fn graphs_list_against_local_uri_errors_with_remote_only_message() { + // RFC-011: `graphs list` is served-only; a `--store` (local) address has no + // enumeration endpoint, so it fails loudly pointing at a server / cluster. let output = output_failure( cli() .arg("graphs") .arg("list") - .arg("--uri") + .arg("--store") .arg("/tmp/local"), ); let stderr = String::from_utf8_lossy(&output.stderr).into_owned(); assert!( - stderr.contains("remote multi-graph server URL"), - "expected 'remote multi-graph server URL' rejection in stderr; got:\n{stderr}" + stderr.contains("remote multi-graph server"), + "expected a remote-server rejection in stderr; got:\n{stderr}" ); } - -/// RFC-008 stage 1: loading a legacy omnigraph.yaml emits the per-key -/// deprecation block (the migration map applied to THIS file), suppressible -/// via OMNIGRAPH_SUPPRESS_YAML_DEPRECATION. -#[test] -fn legacy_config_load_warns_per_key_and_suppression_silences() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "cli:\n actor: act-x\ngraphs:\n g:\n uri: /tmp/never-opened\n", - ) - .unwrap(); - - // `graphs list --json` loads the config and exits without touching the - // graph URI. 
- let output = cli() - .current_dir(temp.path()) - .arg("graphs") - .arg("list") - .arg("--json") - .output() - .unwrap(); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!( - stderr.contains("deprecated (RFC-008)") && stderr.contains("`cli.actor` -> `operator.actor`"), - "{stderr}" - ); - assert!(stderr.contains("config migrate"), "{stderr}"); - - let output = cli() - .current_dir(temp.path()) - .env("OMNIGRAPH_SUPPRESS_YAML_DEPRECATION", "1") - .arg("graphs") - .arg("list") - .arg("--json") - .output() - .unwrap(); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!(!stderr.contains("deprecated (RFC-008)"), "{stderr}"); -} - -/// RFC-008 stage 2: `config migrate` proposes the split read-only, applies -/// it with --write (operator merge never clobbers; cluster.yaml emitted), -/// and a second --write is idempotent. -#[test] -fn config_migrate_splits_legacy_config() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "graphs:\n prod:\n uri: https://graph.example.com\n bearer_token_env: PROD_TOKEN\ncli:\n actor: act-me\n output_format: json\npolicy:\n file: ./top.policy.yaml\n", - ) - .unwrap(); - let operator_home = tempfile::tempdir().unwrap(); - fs::write( - operator_home.path().join("config.yaml"), - "operator:\n actor: act-existing\n", - ) - .unwrap(); - - // Read-only proposal: names both halves, writes nothing. 
- let output = cli() - .current_dir(temp.path()) - .env("OMNIGRAPH_HOME", operator_home.path()) - .env("OMNIGRAPH_SUPPRESS_YAML_DEPRECATION", "1") - .arg("config") - .arg("migrate") - .output() - .unwrap(); - assert!(output.status.success(), "{output:?}"); - let stdout = String::from_utf8_lossy(&output.stdout); - assert!(stdout.contains("team half -> cluster.yaml"), "{stdout}"); - assert!(stdout.contains("operator.actor: act-me"), "{stdout}"); - assert!(stdout.contains("omnigraph login prod"), "{stdout}"); - assert!(!temp.path().join("cluster.yaml").exists()); - - // --write: cluster.yaml lands; the existing operator actor is KEPT. - let output = cli() - .current_dir(temp.path()) - .env("OMNIGRAPH_HOME", operator_home.path()) - .env("OMNIGRAPH_SUPPRESS_YAML_DEPRECATION", "1") - .arg("config") - .arg("migrate") - .arg("--write") - .output() - .unwrap(); - assert!(output.status.success(), "{output:?}"); - let cluster = fs::read_to_string(temp.path().join("cluster.yaml")).unwrap(); - assert!(cluster.contains("version: 1") && cluster.contains(" prod:"), "{cluster}"); - let operator_text = - fs::read_to_string(operator_home.path().join("config.yaml")).unwrap(); - assert!(operator_text.contains("act-existing"), "{operator_text}"); - assert!(!operator_text.contains("act-me"), "existing keys win: {operator_text}"); - assert!(operator_text.contains("output: json"), "{operator_text}"); - assert!( - operator_text.contains("url: https://graph.example.com"), - "{operator_text}" - ); - - // Second --write: cluster.yaml exists -> proposal file, no clobber. 
- let output = cli() - .current_dir(temp.path()) - .env("OMNIGRAPH_HOME", operator_home.path()) - .env("OMNIGRAPH_SUPPRESS_YAML_DEPRECATION", "1") - .arg("config") - .arg("migrate") - .arg("--write") - .output() - .unwrap(); - assert!(output.status.success(), "{output:?}"); - assert!(temp.path().join("cluster.yaml.proposed").exists()); -} - -/// RFC-008 stage 4: OMNIGRAPH_NO_LEGACY_CONFIG refuses a present legacy -/// file (pointing at config migrate) but changes nothing on migrated -/// setups with no file. -#[test] -fn strict_mode_refuses_legacy_file_but_not_its_absence() { - let temp = tempdir().unwrap(); - fs::write(temp.path().join("omnigraph.yaml"), "cli:\n actor: a\n").unwrap(); - let output = cli() - .current_dir(temp.path()) - .env("OMNIGRAPH_NO_LEGACY_CONFIG", "1") - .arg("graphs") - .arg("list") - .arg("--json") - .output() - .unwrap(); - assert!(!output.status.success()); - let stderr = String::from_utf8_lossy(&output.stderr); - assert!( - stderr.contains("OMNIGRAPH_NO_LEGACY_CONFIG") && stderr.contains("config migrate"), - "{stderr}" - ); - - // Migrated setup (no file): strict mode is a no-op — a config-loading - // command that tolerates empty defaults succeeds. - let clean = tempdir().unwrap(); - let output = cli() - .current_dir(clean.path()) - .env("OMNIGRAPH_NO_LEGACY_CONFIG", "1") - .arg("queries") - .arg("list") - .arg("--json") - .output() - .unwrap(); - assert!(output.status.success(), "{output:?}"); -} diff --git a/crates/omnigraph-cli/tests/parity_matrix.rs b/crates/omnigraph-cli/tests/parity_matrix.rs index 65a584f..e46f064 100644 --- a/crates/omnigraph-cli/tests/parity_matrix.rs +++ b/crates/omnigraph-cli/tests/parity_matrix.rs @@ -25,21 +25,23 @@ const KNOWN_DIVERGENCES: &[&str] = &[ // populated by the rows below as they are written ]; -/// One matched setup per row: twin graphs + the SAME Cedar bundle on both -/// arms (the local arm via --config top-level policy.file; the server via -/// its config). Returns everything a row needs. 
+/// One matched setup per row: twin graphs + the parity Cedar bundle on the +/// served arm. The local (`--store`) arm carries no policy (RFC-011); the +/// bundle is permissive for `act-parity`, so the arms still agree. struct Parity { _temp: TempDir, local: std::path::PathBuf, - local_cfg: std::path::PathBuf, server: TestServer, } fn parity() -> Parity { let (temp, local, remote) = twin_graphs(); - let (local_cfg, server_cfg) = parity_configs(temp.path(), &local, &remote); - let server = spawn_server_with_config_env( - &server_cfg, + // RFC-011 cluster-only: the remote arm is served from a converged + // cluster directory (one graph, id `parity`), seeded with the same + // fixture data as the local twin. + let cluster_dir = parity_configs(temp.path(), &local, &remote); + let server = spawn_server_with_cluster_env( + &cluster_dir, &[( "OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", r#"{"act-parity":"parity-tok"}"#, @@ -48,14 +50,13 @@ fn parity() -> Parity { Parity { _temp: temp, local, - local_cfg, server, } } impl Parity { fn run(&self, args: &[&str]) -> (std::process::Output, std::process::Output) { - run_both_with_config(&self.local, Some(&self.local_cfg), &self.server.base_url, args) + run_both(&self.local, &self.server.base_url, args) } } @@ -83,7 +84,6 @@ fn parity_query() { "query", "--query", query.to_str().unwrap(), - "--name", "get_person", "--params", r#"{"name":"Alice"}"#, @@ -142,7 +142,10 @@ fn parity_branch_create_delete() { let (l, r) = p.run(&["branch", "create", "--from", "main", "parity-branch", "--json"], ); assert_parity("branch create", &l, &r); - let (l, r) = p.run(&["branch", "delete", "parity-branch", "--json"], + // `branch delete` is destructive: the served (remote) arm is non-local and + // requires consent (RFC-011 Decision 9), so the row passes `--yes` to test + // the operation itself, not the safety gate. The local arm ignores `--yes`. 
+ let (l, r) = p.run(&["branch", "delete", "parity-branch", "--yes", "--json"], ); assert_parity("branch delete", &l, &r); } @@ -229,7 +232,6 @@ fn parity_errors_share_exit_codes() { "query", "--query", query.to_str().unwrap(), - "--name", "no_such_query", "--json", ], @@ -249,7 +251,6 @@ fn parity_errors_share_exit_codes() { "query", "--query", query.to_str().unwrap(), - "--name", "get_person", "--json", ], diff --git a/crates/omnigraph-cli/tests/support/mod.rs b/crates/omnigraph-cli/tests/support/mod.rs index c19d6a6..ff6a5d4 100644 --- a/crates/omnigraph-cli/tests/support/mod.rs +++ b/crates/omnigraph-cli/tests/support/mod.rs @@ -339,6 +339,63 @@ impl SystemGraph { } } +/// A converged cluster directory the server can boot from (`--cluster`), +/// serving one graph seeded with the standard fixture. Holds the temp dir +/// alive for the test's lifetime. +pub struct ClusterFixture { + _temp: TempDir, + dir: PathBuf, +} + +impl ClusterFixture { + pub fn path(&self) -> &Path { + &self.dir + } +} + +/// Build a converged cluster (RFC-011 cluster-only serving) with a single +/// graph `graph_id`, seeded with the `test.jsonl` fixture so reads return +/// data. When `policy_yaml` is `Some`, the bundle is bound to the graph +/// scope. The server boots from the returned path via `--cluster`. 
+pub fn converged_loaded_cluster(graph_id: &str, policy_yaml: Option<&str>) -> ClusterFixture { + let temp = tempdir().unwrap(); + let dir = temp.path().to_path_buf(); + fs::copy(fixture("test.pg"), dir.join("graph.pg")).unwrap(); + + let policy_block = match policy_yaml { + Some(source) => { + fs::write(dir.join("graph.policy.yaml"), source).unwrap(); + format!( + "policies:\n graph:\n file: ./graph.policy.yaml\n applies_to: [{graph_id}]\n" + ) + } + None => String::new(), + }; + fs::write( + dir.join("cluster.yaml"), + format!( + "version: 1\nmetadata:\n name: sys\nstate:\n backend: cluster\n lock: true\ngraphs:\n {graph_id}:\n schema: ./graph.pg\n{policy_block}" + ), + ) + .unwrap(); + + output_success(cli().arg("cluster").arg("import").arg("--config").arg(&dir)); + output_success(cli().arg("cluster").arg("apply").arg("--config").arg(&dir)); + + let served_root = dir.join("graphs").join(format!("{graph_id}.omni")); + output_success( + cli() + .arg("load") + .arg("--data") + .arg(fixture("test.jsonl")) + .arg("--mode") + .arg("overwrite") + .arg(&served_root), + ); + + ClusterFixture { _temp: temp, dir } +} + // ---- helpers moved from the monolithic tests/cli.rs ---- #[allow(unused_imports)] use lance::Dataset; @@ -788,29 +845,94 @@ rules: .to_string() } -/// Per-arm config files carrying the same policy. Both arms address the -/// graph by positional URI, so the TOP-LEVEL policy.file applies on each -/// side (single-graph semantics). -pub fn parity_configs(root: &Path, _local_graph: &Path, remote_graph: &Path) -> (PathBuf, PathBuf) { +/// The graph id the parity cluster serves the remote arm under. The +/// remote arm addresses it with `--graph PARITY_GRAPH_ID` (RFC-011: the +/// server is cluster-only, so a graph selector is required). +pub const PARITY_GRAPH_ID: &str = "parity"; + +/// Build the remote arm's configuration (RFC-011 cluster-only server). 
+/// +/// The remote arm is served from a converged cluster directory whose single +/// graph (id `parity`) carries the parity Cedar bundle (bound to the graph +/// scope). The cluster's derived graph root (`/graphs/parity.omni`) is +/// seeded with the SAME fixture data as the local twin so the two arms compare +/// like-for-like. The local (`--store`) arm carries no Cedar policy (RFC-011), +/// which is fine because the parity bundle is permissive for `act-parity`. +/// +/// `local_graph` is overwritten with a byte-for-byte copy of the cluster's +/// seeded served graph so identity-bearing values that are NOT scrubbed +/// (e.g. `graph_commit_id`, edge `id`s in export) match across the arms — +/// the served graph is the source of truth and the local twin mirrors it. +/// +/// Returns the `cluster_dir`. The caller spawns the server with `--cluster`. +pub fn parity_configs(root: &Path, local_graph: &Path, _remote_graph: &Path) -> PathBuf { let policy = root.join("parity.policy.yaml"); fs::write(&policy, parity_policy_yaml()).unwrap(); - let local_cfg = root.join("local.omnigraph.yaml"); + + // Remote arm: a cluster directory the server boots from. One graph + // (`parity`), schema = the shared fixture, policy bound to the graph. 
+ let cluster_dir = root.join("parity-cluster"); + fs::create_dir_all(&cluster_dir).unwrap(); + fs::copy(fixture("test.pg"), cluster_dir.join("parity.pg")).unwrap(); + fs::copy(&policy, cluster_dir.join("parity.policy.yaml")).unwrap(); fs::write( - &local_cfg, - format!("policy:\n file: {}\n", policy.display()), - ) - .unwrap(); - let server_cfg = root.join("server.omnigraph.yaml"); - fs::write( - &server_cfg, + cluster_dir.join("cluster.yaml"), format!( - "server:\n graph: parity\ngraphs:\n parity:\n uri: {}\n policy:\n file: {}\n", - remote_graph.display(), - policy.display() + r#"version: 1 +metadata: + name: parity +state: + backend: cluster + lock: true +graphs: + {PARITY_GRAPH_ID}: + schema: ./parity.pg +policies: + parity: + file: ./parity.policy.yaml + applies_to: [{PARITY_GRAPH_ID}] +"# ), ) .unwrap(); - (local_cfg, server_cfg) + + // Converge the cluster (creates the empty graph at the derived root), + // then seed it with the same fixture data the local twin holds. + output_success( + cli() + .arg("cluster") + .arg("import") + .arg("--config") + .arg(&cluster_dir), + ); + output_success( + cli() + .arg("cluster") + .arg("apply") + .arg("--config") + .arg(&cluster_dir), + ); + let served_root = cluster_dir + .join("graphs") + .join(format!("{PARITY_GRAPH_ID}.omni")); + output_success( + cli() + .arg("load") + .arg("--data") + .arg(fixture("test.jsonl")) + .arg("--mode") + .arg("overwrite") + .arg(&served_root), + ); + + // Mirror the seeded served graph into the local twin so both arms hold + // identical ULIDs / commit ids (the served graph is authoritative). 
+ if local_graph.exists() { + fs::remove_dir_all(local_graph).unwrap(); + } + copy_dir(&served_root, local_graph); + + cluster_dir } /// Run one CLI invocation per arm with identical verb args: locally against @@ -821,21 +943,14 @@ pub fn run_both( local_graph: &Path, server_url: &str, args: &[&str], -) -> (std::process::Output, std::process::Output) { - run_both_with_config(local_graph, None, server_url, args) -} - -pub fn run_both_with_config( - local_graph: &Path, - local_config: Option<&Path>, - server_url: &str, - args: &[&str], ) -> (std::process::Output, std::process::Output) { // Address both arms with GLOBAL flags (`--store` / `--server`) appended after // the verb + its args, so the address is placed correctly regardless of // subcommand nesting (a positional graph only works for top-level verbs; // `schema show ` etc. need the global flag). Local = embedded store, - // remote = served. + // remote = served. RFC-011: a direct (`--store`) write carries no Cedar + // policy — the parity policy is permissive for `act-parity` on the served + // arm, so the two arms still agree. let mut local = cli(); local .args(args) @@ -843,9 +958,6 @@ pub fn run_both_with_config( .arg(local_graph) .arg("--as") .arg(PARITY_ACTOR); - if let Some(config) = local_config { - local.arg("--config").arg(config); - } let local_out = local.output().unwrap(); let mut remote = cli(); @@ -853,7 +965,11 @@ pub fn run_both_with_config( .env("OMNIGRAPH_BEARER_TOKEN", PARITY_TOKEN) .args(args) .arg("--server") - .arg(server_url); + .arg(server_url) + // RFC-011: the parity server is cluster-only (multi-graph), so the + // remote arm must name the graph it addresses. 
+ .arg("--graph") + .arg(PARITY_GRAPH_ID); let remote_out = remote.output().unwrap(); (local_out, remote_out) } diff --git a/crates/omnigraph-cli/tests/system_local.rs b/crates/omnigraph-cli/tests/system_local.rs index ddedaf7..e9e4b2f 100644 --- a/crates/omnigraph-cli/tests/system_local.rs +++ b/crates/omnigraph-cli/tests/system_local.rs @@ -3,6 +3,7 @@ mod support; use std::env; use std::fs; +use omnigraph::db::Omnigraph; use reqwest::blocking::Client; use serde_json::Value; @@ -62,53 +63,6 @@ cases: expect: allow "#; -fn yaml_string(value: &str) -> String { - format!("'{}'", value.replace('\'', "''")) -} - -fn local_policy_config(graph: &SystemGraph) -> String { - format!( - "\ -project: - name: policy-e2e-local -graphs: - local: - uri: {} - policy: - file: ./policy.yaml -cli: - graph: local - branch: main -query: - roots: - - . -", - yaml_string(&graph.path().to_string_lossy()) - ) -} - -fn local_policy_server_graph_config(graph: &SystemGraph) -> String { - format!( - "\ -project: - name: policy-e2e-local -graphs: - local: - uri: {} - policy: - file: ./policy.yaml -server: - graph: local -cli: - branch: main -query: - roots: - - . 
-", - yaml_string(&graph.path().to_string_lossy()) - ) -} - fn insert_person_query(graph: &SystemGraph, name: &str) -> std::path::PathBuf { graph.write_query( name, @@ -231,10 +185,10 @@ fn local_cli_end_to_end_init_load_read_change_read_flow() { let read_before = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -246,6 +200,7 @@ fn local_cli_end_to_end_init_load_read_change_read_flow() { let change_payload = parse_stdout_json(&output_success( cli() .arg("change") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(&mutation_file) @@ -259,10 +214,10 @@ fn local_cli_end_to_end_init_load_read_change_read_flow() { let read_after = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Eve"}"#) @@ -277,6 +232,7 @@ fn local_cli_end_to_end_init_load_read_change_read_flow() { let inline_change = parse_stdout_json(&output_success( cli() .arg("change") + .arg("--store") .arg(graph.path()) .arg("-e") .arg("query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }") @@ -291,6 +247,7 @@ fn local_cli_end_to_end_init_load_read_change_read_flow() { let inline_read = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(graph.path()) .arg("--query-string") .arg("query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }") @@ -322,6 +279,7 @@ fn local_cli_end_to_end_branch_change_merge_flow() { let change_payload = parse_stdout_json(&output_success( cli() .arg("change") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(&mutation_file) @@ -337,10 +295,10 @@ fn local_cli_end_to_end_branch_change_merge_flow() { let feature_read = parse_stdout_json(&output_success( cli() .arg("read") + 
.arg("--store") .arg(graph.path()) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--branch") .arg("feature") @@ -365,10 +323,10 @@ fn local_cli_end_to_end_branch_change_merge_flow() { let main_read = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Zoe"}"#) @@ -435,10 +393,10 @@ fn local_cli_ingest_creates_review_branch_and_keeps_it_readable() { let zoe = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--branch") .arg("feature-ingest") @@ -452,10 +410,10 @@ fn local_cli_ingest_creates_review_branch_and_keeps_it_readable() { let bob = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--branch") .arg("feature-ingest") @@ -629,10 +587,10 @@ fn local_cli_export_round_trips_full_branch_graph() { let eve = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(&imported_graph) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Eve"}"#) @@ -644,10 +602,10 @@ fn local_cli_export_round_trips_full_branch_graph() { let friends = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(&imported_graph) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("friends_of") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -665,31 +623,9 @@ fn local_cli_s3_end_to_end_init_load_read_flow() { let temp = tempfile::tempdir().unwrap(); let query_root = temp.path(); - let config = query_root.join("omnigraph.yaml"); let query = query_root.join("test.gq"); fs::copy(fixture("test.gq"), &query).unwrap(); - write_config( - &config, - &format!( - "\ 
-graphs: - rustfs: - uri: '{}' -cli: - graph: rustfs - branch: main -query: - roots: - - . -policy: {{}} -", - graph_uri - ), - ); - // current_dir matters: `init` scaffolds an omnigraph.yaml into its cwd, - // and without this it pollutes the crate dir, breaking unrelated tests - // (anything resolving a graph target from the cwd config). output_success( cli() .current_dir(query_root) @@ -709,15 +645,16 @@ policy: {{}} .arg(&graph_uri), ); + // RFC-011: the graph is addressed by `--store `; the `.gq` path is + // resolved cwd-relative (no omnigraph.yaml `query.roots`). let read = parse_stdout_json(&output_success( cli() .current_dir(query_root) .arg("read") - .arg("--config") - .arg(&config) + .arg("--store") + .arg(&graph_uri) .arg("--query") .arg("test.gq") - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -730,8 +667,8 @@ policy: {{}} cli() .current_dir(query_root) .arg("snapshot") - .arg("--config") - .arg(&config) + .arg("--store") + .arg(&graph_uri) .arg("--json"), )); assert!(snapshot["tables"].is_array()); @@ -779,6 +716,7 @@ fn local_cli_failed_change_keeps_target_state_unchanged() { let output = output_failure( cli() .arg("change") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(&mutation_file) @@ -791,10 +729,10 @@ fn local_cli_failed_change_keeps_target_state_unchanged() { let friends_payload = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("friends_of") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -806,36 +744,22 @@ fn local_cli_failed_change_keeps_target_state_unchanged() { } #[test] -fn local_cli_resolves_relative_query_against_config_base_dir() { +fn local_cli_resolves_relative_query_cwd_relative() { + // RFC-011: omnigraph.yaml `query.roots` search is gone — a `--query` + // path is resolved plainly relative to the process cwd. 
This pins that + // a bare relative `.gq` filename resolves against `.current_dir`, and + // that the file actually read is the cwd-local one (a same-named query + // elsewhere with different columns is never picked up). let graph = SystemGraph::loaded(); let root = graph.path().parent().unwrap(); - let config_dir = root.join("config"); - let query_dir = config_dir.join("queries"); - let ambient_dir = root.join("ambient"); - fs::create_dir_all(&query_dir).unwrap(); - fs::create_dir_all(&ambient_dir).unwrap(); + let cwd_dir = root.join("cwd"); + let other_dir = root.join("other"); + fs::create_dir_all(&cwd_dir).unwrap(); + fs::create_dir_all(&other_dir).unwrap(); - let config = config_dir.join("omnigraph.yaml"); - write_config( - &config, - &format!( - "\ -graphs: - local: - uri: '{}' -cli: - graph: local - branch: main -query: - roots: - - queries -policy: {{}} -", - graph.path().display() - ), - ); + // The query in the cwd projects (age, name). write_query_file( - &query_dir.join("local.gq"), + &cwd_dir.join("local.gq"), r#" query get_person($name: String) { match { @@ -845,8 +769,10 @@ query get_person($name: String) { } "#, ); + // A same-named query elsewhere projects only (name): if cwd-relative + // resolution regressed and picked this up, the columns assert fails. 
write_query_file( - &ambient_dir.join("local.gq"), + &other_dir.join("local.gq"), r#" query get_person($name: String) { match { @@ -859,13 +785,12 @@ query get_person($name: String) { let payload = parse_stdout_json(&output_success( cli() - .current_dir(&ambient_dir) + .current_dir(&cwd_dir) .arg("read") - .arg("--config") - .arg(&config) + .arg("--store") + .arg(graph.path()) .arg("--query") .arg("local.gq") - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -974,10 +899,10 @@ query get_task($slug: String) { let filtered = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(&queries) - .arg("--name") .arg("due_with_tag") .arg("--params") .arg(r#"{"deadline":"2026-04-02T00:00:00Z","tag":"launch"}"#) @@ -999,10 +924,10 @@ query get_task($slug: String) { let insert_payload = parse_stdout_json(&output_success( cli() .arg("change") + .arg("--store") .arg(&graph) .arg("--query") .arg(&queries) - .arg("--name") .arg("insert_task") .arg("--params") .arg( @@ -1015,10 +940,10 @@ query get_task($slug: String) { let update_payload = parse_stdout_json(&output_success( cli() .arg("change") + .arg("--store") .arg(&graph) .arg("--query") .arg(&queries) - .arg("--name") .arg("update_task") .arg("--params") .arg(r#"{"slug":"gamma","due_at":"2026-04-04T10:45:00Z","tags":["embed","released"],"scores":[13,21],"active_days":["2026-04-04","2026-04-05"]}"#) @@ -1029,10 +954,10 @@ query get_task($slug: String) { let gamma = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(&queries) - .arg("--name") .arg("get_task") .arg("--params") .arg(r#"{"slug":"gamma"}"#) @@ -1117,10 +1042,10 @@ query vector_search($q: String) { .env("OMNIGRAPH_EMBED_PROVIDER", "gemini") .env("OMNIGRAPH_EMBED_MODEL", "gemini-embedding-2-preview") .arg("read") + .arg("--store") .arg(&graph) .arg("--query") .arg(&queries) - .arg("--name") .arg("vector_search") .arg("--params") 
.arg(r#"{"q":"alpha"}"#) @@ -1141,122 +1066,145 @@ query vector_search($q: String) { #[test] fn local_cli_policy_tooling_is_end_to_end() { - // Sanity check for the read-only policy CLI surfaces. These don't - // mutate the graph; they parse and evaluate the effective policy for - // named graph selections, including per-graph policy files. - let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - let server_graph_config = graph.write_config( - "omnigraph-policy-server.yaml", - &local_policy_server_graph_config(&graph), + // RFC-011: the read-only policy CLI surfaces source the bundle from a + // cluster's applied policies (`--cluster ` + `--graph `), not + // from an omnigraph.yaml `graphs:` map. These don't mutate the graph; + // they parse and evaluate the effective bundle bound to the graph. + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_E2E_YAML)); + // `policy test` has no per-bundle tests file in the cluster model, so + // the cases are supplied explicitly via `--tests`. 
+ let tests_file = cluster.path().join("policy.tests.yaml"); + fs::write(&tests_file, POLICY_E2E_TESTS_YAML).unwrap(); + + let validate = output_success( + cli() + .arg("policy") + .arg("validate") + .arg("--cluster") + .arg(cluster.path()) + .arg("--graph") + .arg("knowledge"), ); - graph.write_config("policy.yaml", POLICY_E2E_YAML); - graph.write_config("policy.tests.yaml", POLICY_E2E_TESTS_YAML); + assert!(stdout_string(&validate).contains("policy valid:")); - for config in [&config, &server_graph_config] { - let validate = output_success( - cli() - .arg("policy") - .arg("validate") - .arg("--config") - .arg(config), - ); - assert!(stdout_string(&validate).contains("policy valid:")); + let tests = output_success( + cli() + .arg("policy") + .arg("test") + .arg("--cluster") + .arg(cluster.path()) + .arg("--graph") + .arg("knowledge") + .arg("--tests") + .arg(&tests_file), + ); + assert!(stdout_string(&tests).contains("policy tests passed: 2 cases")); - let tests = output_success(cli().arg("policy").arg("test").arg("--config").arg(config)); - assert!(stdout_string(&tests).contains("policy tests passed: 2 cases")); - - let explain = output_success( - cli() - .arg("policy") - .arg("explain") - .arg("--config") - .arg(config) - .arg("--actor") - .arg("act-bruno") - .arg("--action") - .arg("change") - .arg("--branch") - .arg("main"), - ); - let explain_stdout = stdout_string(&explain); - assert!(explain_stdout.contains("decision: deny")); - assert!(explain_stdout.contains("branch: main")); - } + let explain = output_success( + cli() + .arg("policy") + .arg("explain") + .arg("--cluster") + .arg(cluster.path()) + .arg("--graph") + .arg("knowledge") + .arg("--actor") + .arg("act-bruno") + .arg("--action") + .arg("change") + .arg("--branch") + .arg("main"), + ); + let explain_stdout = stdout_string(&explain); + assert!(explain_stdout.contains("decision: deny")); + assert!(explain_stdout.contains("branch: main")); } +/// Token→actor map for the served-policy tests: the 
bearer tokens the +/// cluster server resolves to `act-bruno` / `act-ragnor`. +const POLICY_TOKENS_JSON: &str = r#"{"act-bruno":"bruno-tok","act-ragnor":"ragnor-tok"}"#; + #[test] fn local_cli_change_enforces_engine_layer_policy() { - // Asserts MR-722 PR #4: when the selected graph has a configured - // policy file, the CLI loads PolicyEngine into Omnigraph and every - // direct-engine write hits `enforce(action, scope, actor)` — identical - // to what the HTTP server gets, regardless of transport. + // RFC-011: a CLI direct-store write carries NO policy — policy lives in + // the cluster/server. So engine-layer policy on a direct write no longer + // exists; this test asserts the faithful migration: the SERVER enforces + // the bundle bound to the served graph, addressed via `--server --graph` + // with a bearer token that resolves to the actor. // // Three cases, each discriminating: // - // 1. Policy installed, no actor source (no `cli.actor` in config, - // no `--as` flag) → engine-layer footgun guard fires; CLI exits - // non-zero with a "no actor" message. Silent bypass is the bug - // PR #4 prevents. - // 2. Policy installed, `--as act-bruno`, change on main → Cedar - // denies (bruno can change unprotected branches; main is - // protected). CLI exits non-zero with a "denied" message. - // 3. Policy installed, `--as act-ragnor`, change on main → - // Cedar permits (admins-write rule). Write succeeds and the - // inserted row is readable. - let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - graph.write_config("policy.yaml", POLICY_E2E_YAML); - let mutation_file = insert_person_query(&graph, "system-local-policy-change.gq"); - - // Case 1: policy configured, no actor threaded → footgun guard. 
- let no_actor = output_failure( - cli() - .arg("change") - .arg("--config") - .arg(&config) - .arg("--query") - .arg(&mutation_file) - .arg("--params") - .arg(r#"{"name":"NoActorPerson","age":1}"#) - .arg("--json"), + // 1. No token → the server refuses (401, unauthenticated). The old + // embedded "no actor" footgun does not apply to the served path + // (the actor comes from the token), so this replaces it. + // 2. bruno token, change on protected main → Cedar denies (bruno can + // change unprotected branches; main is protected). Non-zero exit, + // "denied" surfaced from the server error body. + // 3. ragnor token, change on main → Cedar permits (admins-write). Write + // succeeds and the inserted row is readable. + if skip_system_e2e("local_cli_change_enforces_engine_layer_policy") { + return; + } + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_E2E_YAML)); + let server = spawn_server_with_cluster_env( + cluster.path(), + &[("OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", POLICY_TOKENS_JSON)], ); - let no_actor_stderr = String::from_utf8_lossy(&no_actor.stderr); + let insert = + "query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }"; + + // Case 1: no token → the server refuses before any policy check. + let no_token = cli() + .arg("change") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") + .arg("-e") + .arg(insert) + .arg("--params") + .arg(r#"{"name":"NoTokenPerson","age":1}"#) + .arg("--json") + .output() + .unwrap(); assert!( - no_actor_stderr.contains("no actor"), - "expected 'no actor' footgun message, got stderr: {no_actor_stderr}" + !no_token.status.success(), + "unauthenticated served write must be refused: {no_token:?}" ); - // Case 2: `--as act-bruno` against protected main → denied. 
- let denied = output_failure( - cli() - .arg("--as") - .arg("act-bruno") - .arg("change") - .arg("--config") - .arg(&config) - .arg("--query") - .arg(&mutation_file) - .arg("--params") - .arg(r#"{"name":"BrunoOnMain","age":2}"#) - .arg("--json"), - ); + // Case 2: bruno token against protected main → denied by the server. + let denied = cli() + .env("OMNIGRAPH_BEARER_TOKEN", "bruno-tok") + .arg("change") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") + .arg("-e") + .arg(insert) + .arg("--params") + .arg(r#"{"name":"BrunoOnMain","age":2}"#) + .arg("--json") + .output() + .unwrap(); + assert!(!denied.status.success(), "bruno/main must be denied"); let denied_stderr = String::from_utf8_lossy(&denied.stderr); assert!( denied_stderr.contains("denied"), "expected 'denied' message for bruno/main, got stderr: {denied_stderr}" ); - // Case 3: `--as act-ragnor` against main → permitted by admins-write. + // Case 3: ragnor token against main → permitted by admins-write. let allowed = parse_stdout_json(&output_success( cli() - .arg("--as") - .arg("act-ragnor") + .env("OMNIGRAPH_BEARER_TOKEN", "ragnor-tok") .arg("change") - .arg("--config") - .arg(&config) - .arg("--query") - .arg(&mutation_file) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") + .arg("-e") + .arg(insert) .arg("--params") .arg(r#"{"name":"RagnorOnMain","age":3}"#) .arg("--json"), @@ -1266,14 +1214,19 @@ fn local_cli_change_enforces_engine_layer_policy() { assert_eq!(allowed["actor_id"], "act-ragnor"); // Verify the row landed — proves the write actually committed, not - // just that enforce returned Ok and silently dropped the work. + // just that enforce returned Ok and silently dropped the work. The read + // uses the bruno token: POLICY_E2E_YAML grants `read` to the `team` + // group (bruno), while admins (ragnor) get write-only rules. 
let verify = parse_stdout_json(&output_success( cli() + .env("OMNIGRAPH_BEARER_TOKEN", "bruno-tok") .arg("read") - .arg(graph.path()) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"RagnorOnMain"}"#) @@ -1284,27 +1237,30 @@ fn local_cli_change_enforces_engine_layer_policy() { } #[test] -fn local_cli_positional_uri_does_not_inherit_default_graph_policy() { +fn local_cli_direct_store_write_is_unpoliced_regardless_of_actor() { + // RFC-011: a direct (`--store`) write carries no Cedar policy at all — + // policy lives in the cluster/server. So a write that the SERVED path + // would deny (bruno changing protected main) succeeds on the direct + // path, regardless of the actor. This is the faithful replacement for + // the obsolete `..._positional_uri_does_not_inherit_default_graph_policy` + // premise: a positional/`--store` address has no policy to inherit. 
let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - graph.write_config("policy.yaml", POLICY_E2E_YAML); - let mutation_file = insert_person_query(&graph, "system-local-policy-positional.gq"); + let mutation_file = insert_person_query(&graph, "system-local-policy-direct.gq"); let allowed = parse_stdout_json(&output_success( cli() .arg("--as") .arg("act-bruno") .arg("change") - .arg("--config") - .arg(&config) - .arg("--uri") + .arg("--store") .arg(graph.path()) .arg("--query") .arg(&mutation_file) .arg("--params") - .arg(r#"{"name":"PositionalUriBruno","age":4}"#) + .arg(r#"{"name":"DirectStoreBruno","age":4}"#) .arg("--json"), )); + assert_eq!(allowed["branch"], "main"); assert_eq!(allowed["affected_nodes"], 1); assert_eq!(allowed["actor_id"], "act-bruno"); } @@ -1322,28 +1278,44 @@ fn local_cli_positional_uri_does_not_inherit_default_graph_policy() { #[test] fn local_cli_load_enforces_engine_layer_policy() { - let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - graph.write_config("policy.yaml", POLICY_E2E_YAML); - let data = graph.write_jsonl( - "system-local-policy-load.jsonl", - r#"{"type":"Person","data":{"name":"LoadPolicy","age":11}}"#, + // RFC-011 served re-point: the server enforces the graph-bound bundle on + // a remote load. A load into protected main is a `change`: bruno + // (team-write-unprotected) is denied, ragnor (admins-write) is allowed. 
+ if skip_system_e2e("local_cli_load_enforces_engine_layer_policy") { + return; + } + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_E2E_YAML)); + let server = spawn_server_with_cluster_env( + cluster.path(), + &[("OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", POLICY_TOKENS_JSON)], ); + let temp = tempfile::tempdir().unwrap(); + let data = temp.path().join("policy-load.jsonl"); + fs::write( + &data, + r#"{"type":"Person","data":{"name":"LoadPolicy","age":11}}"#, + ) + .unwrap(); // act-bruno: change-on-protected is denied (team-write-unprotected only). - let denied = output_failure( - cli() - .arg("--as") - .arg("act-bruno") - .arg("load") - .arg("--mode") - .arg("overwrite") - .arg("--config") - .arg(&config) - .arg("--data") - .arg(&data) - .arg("--json"), - ); + let denied = cli() + .env("OMNIGRAPH_BEARER_TOKEN", "bruno-tok") + .arg("load") + .arg("--mode") + .arg("overwrite") + // `--yes` clears the RFC-011 Decision 9 destructive-write confirmation + // so the policy check (not the confirmation refusal) is what denies. + .arg("--yes") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") + .arg("--data") + .arg(&data) + .arg("--json") + .output() + .unwrap(); + assert!(!denied.status.success(), "bruno/main load must be denied"); let stderr = String::from_utf8_lossy(&denied.stderr); assert!( stderr.contains("denied"), @@ -1353,13 +1325,15 @@ fn local_cli_load_enforces_engine_layer_policy() { // act-ragnor: admins-write rule permits change anywhere. 
let allowed = parse_stdout_json(&output_success( cli() - .arg("--as") - .arg("act-ragnor") + .env("OMNIGRAPH_BEARER_TOKEN", "ragnor-tok") .arg("load") .arg("--mode") .arg("overwrite") - .arg("--config") - .arg(&config) + .arg("--yes") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") .arg("--data") .arg(&data) .arg("--json"), @@ -1370,47 +1344,55 @@ fn local_cli_load_enforces_engine_layer_policy() { #[test] fn local_cli_ingest_enforces_engine_layer_policy() { - let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - graph.write_config("policy.yaml", POLICY_E2E_YAML); - let data = graph.write_jsonl( - "system-local-policy-ingest.jsonl", + // RFC-011 served re-point: ingest into a new branch requires both + // BranchCreate and Change. Bruno has change-unprotected only (no + // branch-ops) — either gate denies. Ragnor has admins-write + + // admins-branch-ops — both fire as ingest creates the branch + loads. + if skip_system_e2e("local_cli_ingest_enforces_engine_layer_policy") { + return; + } + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_E2E_YAML)); + let server = spawn_server_with_cluster_env( + cluster.path(), + &[("OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", POLICY_TOKENS_JSON)], + ); + let temp = tempfile::tempdir().unwrap(); + let data = temp.path().join("policy-ingest.jsonl"); + fs::write( + &data, r#"{"type":"Person","data":{"name":"IngestPolicy","age":12}}"#, - ); + ) + .unwrap(); - // act-bruno: ingest into a new branch requires both BranchCreate and - // Change. Bruno has change-unprotected only, and the implicit - // branch_create fires first when the target branch doesn't exist. - // Either gate is enough to deny — assert denial without pinning - // which one fires first. 
- let denied = output_failure( - cli() - .arg("--as") - .arg("act-bruno") - .arg("ingest") - .arg("--config") - .arg(&config) - .arg("--data") - .arg(&data) - .arg("--branch") - .arg("policy-ingest-feature") - .arg("--json"), - ); + let denied = cli() + .env("OMNIGRAPH_BEARER_TOKEN", "bruno-tok") + .arg("ingest") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") + .arg("--data") + .arg(&data) + .arg("--branch") + .arg("policy-ingest-feature") + .arg("--json") + .output() + .unwrap(); + assert!(!denied.status.success(), "bruno ingest must be denied"); let stderr = String::from_utf8_lossy(&denied.stderr); assert!( stderr.contains("denied"), "expected 'denied' for bruno ingest, got: {stderr}" ); - // act-ragnor: admins-write covers Change, admins-branch-ops covers - // BranchCreate. Both fire as ingest creates the branch + loads. let allowed = parse_stdout_json(&output_success( cli() - .arg("--as") - .arg("act-ragnor") + .env("OMNIGRAPH_BEARER_TOKEN", "ragnor-tok") .arg("ingest") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") .arg("--data") .arg(&data) .arg("--branch") @@ -1421,130 +1403,33 @@ fn local_cli_ingest_enforces_engine_layer_policy() { assert_eq!(allowed["branch_created"], true); } -#[test] -fn local_cli_schema_apply_enforces_engine_layer_policy() { - let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - graph.write_config("policy.yaml", POLICY_E2E_YAML); - - // Additive: add a nullable property; SDK-compatible with the fixture - // schema. Uses the schema-apply scope (TargetBranch("main")). 
- let new_schema = std::fs::read_to_string(fixture("test.pg")) - .unwrap() - .replace( - " age: I32?\n}", - " age: I32?\n nickname: String?\n}", - ); - let schema_path = graph.path().join("policy-additive.pg"); - std::fs::write(&schema_path, &new_schema).unwrap(); - - let denied = output_failure( - cli() - .arg("--as") - .arg("act-bruno") - .arg("schema") - .arg("apply") - .arg("--config") - .arg(&config) - .arg("--schema") - .arg(&schema_path) - .arg("--json"), - ); - let stderr = String::from_utf8_lossy(&denied.stderr); - assert!( - stderr.contains("denied"), - "expected 'denied' for bruno schema apply, got: {stderr}" - ); - - let allowed = parse_stdout_json(&output_success( - cli() - .arg("--as") - .arg("act-ragnor") - .arg("schema") - .arg("apply") - .arg("--config") - .arg(&config) - .arg("--schema") - .arg(&schema_path) - .arg("--json"), - )); - assert_eq!(allowed["applied"], true); -} - -#[test] -fn local_cli_schema_apply_rejects_stored_query_breakage_before_publish() { - let graph = SystemGraph::loaded(); - graph.write_query( - "stored-find-person.gq", - "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }", - ); - let config = graph.write_config( - "omnigraph-stored-query-schema.yaml", - &format!( - "\ -graphs: - local: - uri: {} - queries: - find_person: - file: ./stored-find-person.gq -cli: - graph: local - branch: main -query: - roots: - - . -policy: {{}} -", - yaml_string(&graph.path().to_string_lossy()) - ), - ); - let renamed_schema = std::fs::read_to_string(fixture("test.pg")) - .unwrap() - .replace("age: I32?", "years: I32? 
@rename_from(\"age\")"); - let schema_path = graph.write_file("stored-query-breaks.pg", &renamed_schema); - - let rejected = output_failure( - cli() - .arg("schema") - .arg("apply") - .arg("--config") - .arg(&config) - .arg("--schema") - .arg(&schema_path) - .arg("--json"), - ); - let stderr = String::from_utf8_lossy(&rejected.stderr); - assert!( - stderr.contains("find_person") && stderr.contains("schema check"), - "schema apply should reject the stored-query breakage before publish; stderr: {stderr}" - ); - - let schema = stdout_string(&output_success( - cli().arg("schema").arg("show").arg("--config").arg(&config), - )); - assert!(schema.contains("age: I32?")); - assert!(!schema.contains("years: I32?")); -} - #[test] fn local_cli_branch_create_enforces_engine_layer_policy() { - let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - graph.write_config("policy.yaml", POLICY_E2E_YAML); - - let denied = output_failure( - cli() - .arg("--as") - .arg("act-bruno") - .arg("branch") - .arg("create") - .arg("--config") - .arg(&config) - .arg("--from") - .arg("main") - .arg("bruno-feature"), + // RFC-011 served re-point: bruno has no branch-ops rule → denied; + // ragnor has admins-branch-ops → allowed. 
+ if skip_system_e2e("local_cli_branch_create_enforces_engine_layer_policy") { + return; + } + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_E2E_YAML)); + let server = spawn_server_with_cluster_env( + cluster.path(), + &[("OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", POLICY_TOKENS_JSON)], ); + + let denied = cli() + .env("OMNIGRAPH_BEARER_TOKEN", "bruno-tok") + .arg("branch") + .arg("create") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") + .arg("--from") + .arg("main") + .arg("bruno-feature") + .output() + .unwrap(); + assert!(!denied.status.success(), "bruno branch create must be denied"); let stderr = String::from_utf8_lossy(&denied.stderr); assert!( stderr.contains("denied"), @@ -1553,12 +1438,13 @@ fn local_cli_branch_create_enforces_engine_layer_policy() { output_success( cli() - .arg("--as") - .arg("act-ragnor") + .env("OMNIGRAPH_BEARER_TOKEN", "ragnor-tok") .arg("branch") .arg("create") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") .arg("--from") .arg("main") .arg("ragnor-feature"), @@ -1567,34 +1453,47 @@ fn local_cli_branch_create_enforces_engine_layer_policy() { #[test] fn local_cli_branch_delete_enforces_engine_layer_policy() { - let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - graph.write_config("policy.yaml", POLICY_E2E_YAML); + // RFC-011 served re-point: bruno has no branch-ops rule → denied; + // ragnor has admins-branch-ops → allowed. + if skip_system_e2e("local_cli_branch_delete_enforces_engine_layer_policy") { + return; + } + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_E2E_YAML)); + let server = spawn_server_with_cluster_env( + cluster.path(), + &[("OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", POLICY_TOKENS_JSON)], + ); // Pre-create the branch as ragnor so there's something to delete. 
output_success( cli() - .arg("--as") - .arg("act-ragnor") + .env("OMNIGRAPH_BEARER_TOKEN", "ragnor-tok") .arg("branch") .arg("create") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") .arg("--from") .arg("main") .arg("doomed"), ); - let denied = output_failure( - cli() - .arg("--as") - .arg("act-bruno") - .arg("branch") - .arg("delete") - .arg("--config") - .arg(&config) - .arg("doomed"), - ); + // `--yes` clears the RFC-011 Decision 9 destructive-write confirmation so + // the policy check (not the confirmation refusal) is what denies. + let denied = cli() + .env("OMNIGRAPH_BEARER_TOKEN", "bruno-tok") + .arg("branch") + .arg("delete") + .arg("--yes") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") + .arg("doomed") + .output() + .unwrap(); + assert!(!denied.status.success(), "bruno branch delete must be denied"); let stderr = String::from_utf8_lossy(&denied.stderr); assert!( stderr.contains("denied"), @@ -1603,48 +1502,61 @@ fn local_cli_branch_delete_enforces_engine_layer_policy() { output_success( cli() - .arg("--as") - .arg("act-ragnor") + .env("OMNIGRAPH_BEARER_TOKEN", "ragnor-tok") .arg("branch") .arg("delete") - .arg("--config") - .arg(&config) + .arg("--yes") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") .arg("doomed"), ); } #[test] fn local_cli_branch_merge_enforces_engine_layer_policy() { - let graph = SystemGraph::loaded(); - let config = graph.write_config("omnigraph-policy.yaml", &local_policy_config(&graph)); - graph.write_config("policy.yaml", POLICY_E2E_YAML); + // RFC-011 served re-point: merging into protected main needs + // branch_merge with target_branch_scope protected. bruno has no such + // rule → denied; ragnor has admins-promote → allowed. 
+ if skip_system_e2e("local_cli_branch_merge_enforces_engine_layer_policy") { + return; + } + let cluster = converged_loaded_cluster("knowledge", Some(POLICY_E2E_YAML)); + let server = spawn_server_with_cluster_env( + cluster.path(), + &[("OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", POLICY_TOKENS_JSON)], + ); // Pre-create a feature branch as ragnor (admins-branch-ops covers it). output_success( cli() - .arg("--as") - .arg("act-ragnor") + .env("OMNIGRAPH_BEARER_TOKEN", "ragnor-tok") .arg("branch") .arg("create") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") .arg("--from") .arg("main") .arg("merge-feature"), ); - let denied = output_failure( - cli() - .arg("--as") - .arg("act-bruno") - .arg("branch") - .arg("merge") - .arg("--config") - .arg(&config) - .arg("merge-feature") - .arg("--into") - .arg("main"), - ); + let denied = cli() + .env("OMNIGRAPH_BEARER_TOKEN", "bruno-tok") + .arg("branch") + .arg("merge") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") + .arg("merge-feature") + .arg("--into") + .arg("main") + .output() + .unwrap(); + assert!(!denied.status.success(), "bruno branch merge must be denied"); let stderr = String::from_utf8_lossy(&denied.stderr); assert!( stderr.contains("denied"), @@ -1653,68 +1565,56 @@ fn local_cli_branch_merge_enforces_engine_layer_policy() { output_success( cli() - .arg("--as") - .arg("act-ragnor") + .env("OMNIGRAPH_BEARER_TOKEN", "ragnor-tok") .arg("branch") .arg("merge") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg("knowledge") .arg("merge-feature") .arg("--into") .arg("main"), ); } -// ─── MR-722 PR A: cli.actor config-only precedence ──────────────────────── +// ─── RFC-011: operator.actor cascade ────────────────────────────────────── // -// The change-writer test above uses `--as` directly. 
These two tests -// pin the precedence rule that `main.rs::resolve_cli_actor` implements: -// `--as` flag > `cli.actor` from `omnigraph.yaml` > None. +// The CLI actor chain is `--as` > `operator.actor` (in the operator config +// at $OMNIGRAPH_HOME/config.yaml) > none. These two tests pin that order on +// a direct (`--store`) write. RFC-011 makes direct-store writes unpoliced, +// so the assertion is on which `actor_id` the write records, not on a Cedar +// allow/deny — the actor still has to be resolved correctly and stamped onto +// the commit. -fn local_policy_config_with_actor(graph: &SystemGraph, actor: &str) -> String { - // Mirrors `local_policy_config` but adds `cli.actor` so the - // config-only precedence path is exercised. The `cli:` block - // already has `graph` and `branch`; appending `actor` here. - format!( - "\ -project: - name: policy-e2e-local -graphs: - local: - uri: {} - policy: - file: ./policy.yaml -cli: - graph: local - branch: main - actor: {} -query: - roots: - - . -", - yaml_string(&graph.path().to_string_lossy()), - actor, +/// An operator config (`$OMNIGRAPH_HOME/config.yaml`) carrying just +/// `operator.actor`. Pointing OMNIGRAPH_HOME at the holding dir makes the +/// CLI read it as the operator layer. +fn operator_home_with_actor(actor: &str) -> tempfile::TempDir { + let home = tempfile::tempdir().unwrap(); + fs::write( + home.path().join("config.yaml"), + format!("operator:\n actor: {actor}\n"), ) + .unwrap(); + home } #[test] fn local_cli_actor_from_config_used_when_no_flag() { - // cli.actor: act-ragnor in omnigraph.yaml, no --as flag → change - // permitted via admins-write rule. Proves the config-only path - // works; previously the only proof was structural. + // operator.actor: act-ragnor in the operator config, no --as flag → + // the write records act-ragnor. Proves the operator-layer actor source + // is consulted when `--as` is absent. 
let graph = SystemGraph::loaded(); - let config = graph.write_config( - "omnigraph-policy.yaml", - &local_policy_config_with_actor(&graph, "act-ragnor"), - ); - graph.write_config("policy.yaml", POLICY_E2E_YAML); + let home = operator_home_with_actor("act-ragnor"); let mutation_file = insert_person_query(&graph, "system-local-cli-actor.gq"); let allowed = parse_stdout_json(&output_success( cli() + .env("OMNIGRAPH_HOME", home.path()) .arg("change") - .arg("--config") - .arg(&config) + .arg("--store") + .arg(graph.path()) .arg("--query") .arg(&mutation_file) .arg("--params") @@ -1727,35 +1627,30 @@ fn local_cli_actor_from_config_used_when_no_flag() { #[test] fn local_cli_actor_flag_overrides_config_actor() { - // cli.actor: act-ragnor in config + --as act-bruno on CLI → change - // denied. Flag wins per the precedence rule. Without this test, a - // future change that reverses precedence would ride through silently. + // operator.actor: act-ragnor in the config + --as act-bruno on the CLI → + // the write records act-bruno. The flag wins per the precedence rule. + // Without this test, a future change that reverses precedence would ride + // through silently. 
let graph = SystemGraph::loaded(); - let config = graph.write_config( - "omnigraph-policy.yaml", - &local_policy_config_with_actor(&graph, "act-ragnor"), - ); - graph.write_config("policy.yaml", POLICY_E2E_YAML); + let home = operator_home_with_actor("act-ragnor"); let mutation_file = insert_person_query(&graph, "system-local-cli-actor-override.gq"); - let denied = output_failure( + let overridden = parse_stdout_json(&output_success( cli() + .env("OMNIGRAPH_HOME", home.path()) .arg("--as") .arg("act-bruno") .arg("change") - .arg("--config") - .arg(&config) + .arg("--store") + .arg(graph.path()) .arg("--query") .arg(&mutation_file) .arg("--params") .arg(r#"{"name":"OverrideEve","age":19}"#) .arg("--json"), - ); - let stderr = String::from_utf8_lossy(&denied.stderr); - assert!( - stderr.contains("denied"), - "expected 'denied' when --as overrides config to bruno, got: {stderr}" - ); + )); + assert_eq!(overridden["affected_nodes"], 1); + assert_eq!(overridden["actor_id"], "act-bruno"); } /// Phase 5 (RFC-005): "applied means serving" — converge a cluster with the @@ -2053,22 +1948,16 @@ fn local_cluster_full_lifecycle_declare_serve_evolve_delete() { } // Out-of-band drift: the live graph evolves behind the cluster's back; - // refresh observes it, apply converges it back to the declared schema. - std::fs::write( - dir.join("rogue.pg"), - "\nnode Person {\n name: String @key\n bio: String?\n rogue: String?\n}\n", - ) - .unwrap(); - let output = cli() - .arg("schema") - .arg("apply") - .arg(dir.join("graphs/knowledge.omni")) - .arg("--schema") - .arg(dir.join("rogue.pg")) - .arg("--json") - .output() - .unwrap(); - assert!(output.status.success(), "out-of-band schema apply failed"); + // refresh observes it, apply converges it back to the declared schema. RFC-011 + // D10 makes the CLI `schema apply` refuse a cluster-managed graph, so a true + // bypass is a direct engine apply against the storage root. 
+ let rogue_pg = "\nnode Person {\n name: String @key\n bio: String?\n rogue: String?\n}\n"; + tokio::runtime::Runtime::new().unwrap().block_on(async { + let db = Omnigraph::open(dir.join("graphs/knowledge.omni").to_string_lossy().as_ref()) + .await + .unwrap(); + db.apply_schema(rogue_pg).await.unwrap(); + }); let refresh = cluster_cli(dir, &["refresh"]); assert_eq!( refresh["resource_statuses"]["schema.knowledge"]["status"], @@ -2321,9 +2210,12 @@ fn cluster_server_boot_ignores_local_config_in_cwd() { /// 3), and `logout` revokes. #[test] fn local_cli_keyed_credentials_authenticate_url_matched_server() { - let graph = SystemGraph::loaded(); - let server = spawn_server_with_env( - graph.path(), + // RFC-011 cluster-only: the server boots from a converged cluster + // serving the fixture graph under id `local`; tokens-only boot is + // default-deny, which still permits `read`. + let cluster = converged_loaded_cluster("local", None); + let server = spawn_server_with_cluster_env( + cluster.path(), &[("OMNIGRAPH_SERVER_BEARER_TOKEN", "secret-tok")], ); let operator_home = tempfile::tempdir().unwrap(); @@ -2346,9 +2238,10 @@ fn local_cli_keyed_credentials_authenticate_url_matched_server() { .arg("read") .arg("--server") .arg(&server.base_url) + .arg("--graph") + .arg("local") .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -2435,26 +2328,45 @@ fn local_cli_keyed_credentials_authenticate_url_matched_server() { /// stored queries) end to end, with the keyed credential from PR 2. #[test] fn local_cli_operator_alias_and_server_flag_invoke_stored_query() { - let graph = SystemGraph::loaded(); - graph.write_query( - "stored-find-person.gq", + // RFC-011 cluster-only: build a converged cluster serving graph `local` + // with a stored query `find_person` and a per-graph policy granting the + // operator invoke_query + read (invoke_query is policy-gated — anti-probing + // 404 without the grant). 
+ let cluster = tempfile::tempdir().unwrap(); + fs::copy(fixture("test.pg"), cluster.path().join("local.pg")).unwrap(); + fs::write( + cluster.path().join("find-person.gq"), "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.name } }", + ) + .unwrap(); + fs::write( + cluster.path().join("insert-person.gq"), + "query insert_person($name: String) { insert Person { name: $name, age: 41 } }", + ) + .unwrap(); + fs::write( + cluster.path().join("graph.policy.yaml"), + "version: 1\ngroups:\n ops: [\"act-op\"]\nprotected_branches: [main]\nrules:\n - id: allow-invoke\n allow:\n actors: { group: ops }\n actions: [invoke_query]\n - id: allow-read\n allow:\n actors: { group: ops }\n actions: [read]\n branch_scope: any\n - id: allow-change\n allow:\n actors: { group: ops }\n actions: [change]\n branch_scope: any\n", + ) + .unwrap(); + fs::write( + cluster.path().join("cluster.yaml"), + "version: 1\nmetadata:\n name: alias-sys\nstate:\n backend: cluster\n lock: true\ngraphs:\n local:\n schema: ./local.pg\n queries:\n find_person:\n file: ./find-person.gq\n insert_person:\n file: ./insert-person.gq\npolicies:\n graph:\n file: ./graph.policy.yaml\n applies_to: [local]\n", + ) + .unwrap(); + output_success(cli().arg("cluster").arg("import").arg("--config").arg(cluster.path())); + output_success(cli().arg("cluster").arg("apply").arg("--config").arg(cluster.path())); + output_success( + cli() + .arg("load") + .arg("--data") + .arg(fixture("test.jsonl")) + .arg("--mode") + .arg("overwrite") + .arg(cluster.path().join("graphs").join("local.omni")), ); - // invoke_query is policy-gated (anti-probing 404 without the grant), - // so the server gets a per-graph bundle granting it to the operator. 
- graph.write_file( - "graph.policy.yaml", - "version: 1\ngroups:\n ops: [\"act-op\"]\nprotected_branches: [main]\nrules:\n - id: allow-invoke\n allow:\n actors: { group: ops }\n actions: [invoke_query]\n - id: allow-read\n allow:\n actors: { group: ops }\n actions: [read]\n branch_scope: any\n", - ); - let config = graph.write_config( - "omnigraph-server.yaml", - &format!( - "graphs:\n local:\n uri: {}\n policy:\n file: ./graph.policy.yaml\n queries:\n find_person:\n file: ./stored-find-person.gq\n", - yaml_string(&graph.path().to_string_lossy()) - ), - ); - let server = spawn_server_with_config_env( - &config, + let server = spawn_server_with_cluster_env( + cluster.path(), &[( "OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", r#"{"act-op":"srv-tok"}"#, @@ -2465,7 +2377,7 @@ fn local_cli_operator_alias_and_server_flag_invoke_stored_query() { fs::write( operator_home.path().join("config.yaml"), format!( - "servers:\n dev:\n url: {}\naliases:\n who:\n server: dev\n graph: local\n query: find_person\n args: [name]\n", + "servers:\n dev:\n url: {}\naliases:\n who:\n server: dev\n graph: local\n query: find_person\n args: [name]\n create_person:\n server: dev\n graph: local\n query: insert_person\n args: [name]\n", server.base_url ), ) @@ -2485,12 +2397,11 @@ fn local_cli_operator_alias_and_server_flag_invoke_stored_query() { .unwrap(); } - // The operator alias: name + positional arg, nothing else — server, + // The operator alias (RFC-011 D4): `alias [args]` — server, // graph, stored query, and token all resolve from the operator layer. 
let output = cli() .env("OMNIGRAPH_HOME", operator_home.path()) - .arg("query") - .arg("--alias") + .arg("alias") .arg("who") .arg("Alice") .arg("--json") @@ -2503,6 +2414,46 @@ fn local_cli_operator_alias_and_server_flag_invoke_stored_query() { let payload: serde_json::Value = serde_json::from_slice(&output.stdout).unwrap(); assert_eq!(payload["rows"][0]["p.name"], "Alice", "{payload}"); + // Operator aliases are read-only conveniences: a binding to a stored + // mutation must be rejected before the server executes it. + let output = cli() + .env("OMNIGRAPH_HOME", operator_home.path()) + .arg("alias") + .arg("create_person") + .arg("AliasGuardPerson") + .output() + .unwrap(); + assert!(!output.status.success(), "mutation alias must fail"); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("'insert_person' is a mutation") + && stderr.contains("omnigraph mutate insert_person"), + "expected mutation-kind mismatch; got: {stderr}" + ); + let output = cli() + .env("OMNIGRAPH_HOME", operator_home.path()) + .arg("query") + .arg("find_person") + .arg("--server") + .arg("dev") + .arg("--graph") + .arg("local") + .arg("--params") + .arg(r#"{"name":"AliasGuardPerson"}"#) + .arg("--json") + .output() + .unwrap(); + assert!( + output.status.success(), + "post-alias read should succeed: {output:?}" + ); + let payload: serde_json::Value = serde_json::from_slice(&output.stdout).unwrap(); + assert_eq!( + payload["rows"].as_array().unwrap().len(), + 0, + "mutation alias must not insert AliasGuardPerson: {payload}" + ); + // --server/--graph: the same stored query via explicit targeting. let output = cli() .env("OMNIGRAPH_HOME", operator_home.path()) @@ -2520,6 +2471,45 @@ fn local_cli_operator_alias_and_server_flag_invoke_stored_query() { .unwrap(); assert!(output.status.success(), "{output:?}"); + // RFC-011 D3: invoke the STORED query by name (catalog lane, served-only). + // No `-e`/`--query` — the positional `find_person` is the catalog name. 
+ let output = cli() + .env("OMNIGRAPH_HOME", operator_home.path()) + .arg("query") + .arg("find_person") + .arg("--server") + .arg("dev") + .arg("--graph") + .arg("local") + .arg("--params") + .arg(r#"{"name":"Alice"}"#) + .arg("--json") + .output() + .unwrap(); + assert!(output.status.success(), "by-name catalog invocation: {output:?}"); + let payload: serde_json::Value = serde_json::from_slice(&output.stdout).unwrap(); + assert_eq!(payload["rows"][0]["p.name"], "Alice", "{payload}"); + + // The verb asserts kind: `mutate ` is rejected by the server. + let output = cli() + .env("OMNIGRAPH_HOME", operator_home.path()) + .arg("mutate") + .arg("find_person") + .arg("--server") + .arg("dev") + .arg("--graph") + .arg("local") + .arg("--params") + .arg(r#"{"name":"Alice"}"#) + .output() + .unwrap(); + assert!(!output.status.success(), "mutate on a read query must fail"); + let stderr = String::from_utf8_lossy(&output.stderr); + assert!( + stderr.contains("'find_person' is a read — use omnigraph query find_person"), + "expected a kind-mismatch error; got: {stderr}" + ); + // Unknown --server errors listing what IS defined. let output = cli() .env("OMNIGRAPH_HOME", operator_home.path()) @@ -2534,10 +2524,14 @@ fn local_cli_operator_alias_and_server_flag_invoke_stored_query() { let stderr = String::from_utf8_lossy(&output.stderr); assert!(stderr.contains("unknown server 'nope'") && stderr.contains("dev"), "{stderr}"); - // --server is exclusive with a positional URI. + // --server is exclusive with --store (two ways to address the graph). + // (RFC-011 D3: there is no positional URI anymore — the positional is a + // query name — so the double-addressing contradiction now surfaces between + // the two scope primitives.) 
let output = cli() .env("OMNIGRAPH_HOME", operator_home.path()) .arg("query") + .arg("--store") .arg(&server.base_url) .arg("--server") .arg("dev") diff --git a/crates/omnigraph-cli/tests/system_remote.rs b/crates/omnigraph-cli/tests/system_remote.rs index 95a53e7..19f460e 100644 --- a/crates/omnigraph-cli/tests/system_remote.rs +++ b/crates/omnigraph-cli/tests/system_remote.rs @@ -8,6 +8,14 @@ use serde_json::json; use support::*; +/// Graph id every served test addresses (`--server --graph GRAPH_ID`). +/// RFC-011: the server is cluster-only, so a graph selector is always required +/// — even for a single-graph cluster. +const GRAPH_ID: &str = "knowledge"; + +/// Graph-bound Cedar bundle for the policy-flavored remote tests. `act-bruno` +/// (team) reads + writes unprotected branches; `act-ragnor` (admins) merges +/// into protected `main`. const REMOTE_POLICY_E2E_YAML: &str = r#" version: 1 groups: @@ -37,6 +45,8 @@ rules: target_branch_scope: protected "#; +/// Server-scoped bundle granting `act-admin` the `graph_list` action so +/// `GET /graphs` succeeds. const GRAPH_LIST_SERVER_POLICY_YAML: &str = r#" version: 1 groups: @@ -48,61 +58,24 @@ rules: actions: [graph_list] "#; -fn yaml_string(value: &str) -> String { - format!("'{}'", value.replace('\'', "''")) -} - -fn remote_policy_server_config(graph: &SystemGraph) -> String { - format!( - "\ -project: - name: remote-policy-e2e -graphs: - local: - uri: {} - policy: - file: ./policy.yaml -server: - graph: local -", - yaml_string(&graph.path().to_string_lossy()) - ) -} - -fn remote_policy_client_config(url: &str) -> String { - format!( - "\ -graphs: - dev: - uri: {} - bearer_token_env: POLICY_TEST_TOKEN -cli: - graph: dev - branch: main -query: - roots: - - . 
-auth: - env_file: ./.env.omni -", - yaml_string(url) - ) -} - #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_server_and_cli_end_to_end_flow() { - let graph = SystemGraph::loaded(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let mutation_file = graph.write_query( - "system-remote-change.gq", + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + // The served graph's storage root — used for embedded-side cross checks. + let served_root = cluster.path().join("graphs").join(format!("{GRAPH_ID}.omni")); + let temp = tempfile::tempdir().unwrap(); + let mutation_file = temp.path().join("system-remote-change.gq"); + fs::write( + &mutation_file, r#" query insert_person($name: String, $age: I32) { insert Person { name: $name, age: $age } } "#, - ); + ) + .unwrap(); let client = Client::new(); let health = client @@ -116,13 +89,15 @@ query insert_person($name: String, $age: I32) { assert_eq!(health["status"], "ok"); let local_snapshot = parse_stdout_json(&output_success( - cli().arg("snapshot").arg(graph.path()).arg("--json"), + cli().arg("snapshot").arg(&served_root).arg("--json"), )); let snapshot = parse_stdout_json(&output_success( cli() .arg("snapshot") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--json"), )); assert_eq!(snapshot["branch"], "main"); @@ -131,10 +106,10 @@ query insert_person($name: String, $age: I32) { let local_read = parse_stdout_json(&output_success( cli() .arg("read") - .arg(graph.path()) + .arg("--store") + .arg(&served_root) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -143,11 +118,12 @@ query insert_person($name: String, $age: I32) { let read_payload = parse_stdout_json(&output_success( cli() 
.arg("read") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -157,11 +133,15 @@ query insert_person($name: String, $age: I32) { assert_eq!(read_payload["row_count"], 1); assert_eq!(read_payload["rows"][0]["p.name"], "Alice"); + // Served write: no `--as` (the server resolves the actor; here the server + // is `--unauthenticated`, so the actor is the server default). let change_payload = parse_stdout_json(&output_success( cli() .arg("change") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(&mutation_file) .arg("--params") @@ -172,7 +152,7 @@ query insert_person($name: String, $age: I32) { let query_source = fs::read_to_string(fixture("test.gq")).unwrap(); let http_read = client - .post(format!("{}/read", server.base_url)) + .post(format!("{}/graphs/{GRAPH_ID}/read", server.base_url)) .json(&json!({ "branch": "main", "query_source": query_source, @@ -191,10 +171,10 @@ query insert_person($name: String, $age: I32) { let local_verify = parse_stdout_json(&output_success( cli() .arg("read") - .arg(graph.path()) + .arg("--store") + .arg(&served_root) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Mina"}"#) @@ -203,15 +183,16 @@ query insert_person($name: String, $age: I32) { assert_eq!(local_verify["row_count"], 1); assert_eq!(local_verify["rows"][0]["p.name"], "Mina"); - // CLI `-e` over the HTTP transport (--config points at remote server). - // Confirms inline source survives the remote-execution path identically - // to file-based queries, and exercises `POST /query` end-to-end via the - // change-then-read round trip we just established. + // CLI inline source over the HTTP transport (--server). 
Confirms inline + // source survives the remote-execution path identically to file-based + // queries. let inline_remote_read = parse_stdout_json(&output_success( cli() .arg("read") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("-e") .arg("query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }") .arg("--params") @@ -224,8 +205,10 @@ query insert_person($name: String, $age: I32) { let inline_remote_change = parse_stdout_json(&output_success( cli() .arg("change") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query-string") .arg("query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }") .arg("--params") @@ -234,10 +217,9 @@ query insert_person($name: String, $age: I32) { )); assert_eq!(inline_remote_change["affected_nodes"], 1); - // `POST /query` happy path directly: a hand-rolled HTTP body using the - // new clean field names. + // `POST /graphs/{id}/query` happy path directly. let http_query = client - .post(format!("{}/query", server.base_url)) + .post(format!("{}/graphs/{GRAPH_ID}/query", server.base_url)) .json(&json!({ "branch": "main", "query": "query find($name: String) { match { $p: Person { name: $name } } return { $p.name } }", @@ -252,9 +234,9 @@ query insert_person($name: String, $age: I32) { assert_eq!(http_query["row_count"], 1); assert_eq!(http_query["rows"][0]["p.name"], "Inline"); - // `POST /query` rejects mutations with 400. + // `POST /graphs/{id}/query` rejects mutations with 400. 
let http_query_mutation = client - .post(format!("{}/query", server.base_url)) + .post(format!("{}/graphs/{GRAPH_ID}/query", server.base_url)) .json(&json!({ "branch": "main", "query": "query bad($name: String, $age: I32) { insert Person { name: $name, age: $age } }", @@ -263,32 +245,33 @@ query insert_person($name: String, $age: I32) { .send() .unwrap(); assert_eq!(http_query_mutation.status(), reqwest::StatusCode::BAD_REQUEST); - - // `run publish` / `run list` removed. Direct-to-target writes - // already landed via the change call above; the commit graph is now - // the audit surface (verified separately by `commit list`). } #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_schema_apply_via_cli_updates_graph() { - let graph = SystemGraph::initialized(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let next_schema = graph.write_file( - "next.pg", - &fs::read_to_string(fixture("test.pg")).unwrap().replace( + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + let served_root = cluster.path().join("graphs").join(format!("{GRAPH_ID}.omni")); + let temp = tempfile::tempdir().unwrap(); + let next_schema = temp.path().join("next.pg"); + fs::write( + &next_schema, + fs::read_to_string(fixture("test.pg")).unwrap().replace( " age: I32?\n}", " age: I32?\n nickname: String?\n}", ), - ); + ) + .unwrap(); let payload = parse_stdout_json(&output_success( cli() .arg("schema") .arg("apply") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--schema") .arg(&next_schema) .arg("--json"), @@ -297,7 +280,7 @@ fn remote_schema_apply_via_cli_updates_graph() { let db = tokio::runtime::Runtime::new() .unwrap() - .block_on(Omnigraph::open(graph.path().to_string_lossy().as_ref())) + 
.block_on(Omnigraph::open(served_root.to_string_lossy().as_ref())) .unwrap(); assert!( db.catalog().node_types["Person"] @@ -309,74 +292,95 @@ fn remote_schema_apply_via_cli_updates_graph() { #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_schema_apply_rejects_unsupported_plan() { - let graph = SystemGraph::initialized(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let breaking_schema = graph.write_file( - "breaking.pg", - &fs::read_to_string(fixture("test.pg")) + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + let temp = tempfile::tempdir().unwrap(); + let breaking_schema = temp.path().join("breaking.pg"); + fs::write( + &breaking_schema, + fs::read_to_string(fixture("test.pg")) .unwrap() .replace("age: I32?", "age: I64?"), - ); + ) + .unwrap(); let output = output_failure( cli() .arg("schema") .arg("apply") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--schema") .arg(&breaking_schema), ); let stderr = String::from_utf8_lossy(&output.stderr); - assert!(stderr.contains("changing property type")); + assert!( + stderr.contains("changing property type"), + "expected unsupported-plan error, got: {stderr}" + ); } #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_schema_apply_rejects_when_non_main_branch_exists() { - let graph = SystemGraph::initialized(); + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + + // Create a non-main branch over the served path so the schema-apply + // single-branch precondition fails. 
output_success( cli() .arg("branch") .arg("create") + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--from") .arg("main") - .arg("--uri") - .arg(graph.path()) .arg("feature"), ); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let next_schema = graph.write_file( - "next.pg", - &fs::read_to_string(fixture("test.pg")).unwrap().replace( + + let temp = tempfile::tempdir().unwrap(); + let next_schema = temp.path().join("next.pg"); + fs::write( + &next_schema, + fs::read_to_string(fixture("test.pg")).unwrap().replace( " age: I32?\n}", " age: I32?\n nickname: String?\n}", ), - ); + ) + .unwrap(); let output = output_failure( cli() .arg("schema") .arg("apply") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--schema") .arg(&next_schema), ); let stderr = String::from_utf8_lossy(&output.stderr); - assert!(stderr.contains("schema apply requires a graph with only main")); + assert!( + stderr.contains("schema apply requires a graph with only main"), + "expected single-branch precondition error, got: {stderr}" + ); } #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_read_preserves_projection_order_in_json_and_csv() { - let graph = SystemGraph::loaded(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let ordered_query = graph.write_query( - "ordered-remote.gq", + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + let temp = tempfile::tempdir().unwrap(); + let ordered_query = temp.path().join("ordered-remote.gq"); + fs::write( + &ordered_query, r#" query ordered_person($name: String) { match { @@ -385,16 +389,18 @@ query ordered_person($name: String) { return { $p.age, $p.name } } "#, - ); + ) + .unwrap(); 
let json_payload = parse_stdout_json(&output_success( cli() .arg("read") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(&ordered_query) - .arg("--name") .arg("ordered_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -411,11 +417,12 @@ query ordered_person($name: String) { let csv = stdout_string(&output_success( cli() .arg("read") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(&ordered_query) - .arg("--name") .arg("ordered_person") .arg("--params") .arg(r#"{"name":"Alice"}"#) @@ -430,24 +437,28 @@ query ordered_person($name: String) { #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_branch_create_list_merge_flow() { - let graph = SystemGraph::loaded(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let mutation_file = graph.write_query( - "system-remote-branch-change.gq", + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + let temp = tempfile::tempdir().unwrap(); + let mutation_file = temp.path().join("system-remote-branch-change.gq"); + fs::write( + &mutation_file, r#" query insert_person($name: String, $age: I32) { insert Person { name: $name, age: $age } } "#, - ); + ) + .unwrap(); let initial = parse_stdout_json(&output_success( cli() .arg("branch") .arg("list") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--json"), )); assert_eq!(initial["branches"], json!(["main"])); @@ -456,8 +467,10 @@ query insert_person($name: String, $age: I32) { cli() .arg("branch") .arg("create") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--from") .arg("main") 
.arg("feature") @@ -470,8 +483,10 @@ query insert_person($name: String, $age: I32) { cli() .arg("branch") .arg("list") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--json"), )); assert_eq!(listed["branches"], json!(["feature", "main"])); @@ -479,8 +494,10 @@ query insert_person($name: String, $age: I32) { let changed = parse_stdout_json(&output_success( cli() .arg("change") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(&mutation_file) .arg("--branch") @@ -496,8 +513,10 @@ query insert_person($name: String, $age: I32) { cli() .arg("branch") .arg("merge") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("feature") .arg("--into") .arg("main") @@ -510,11 +529,12 @@ query insert_person($name: String, $age: I32) { let verify = parse_stdout_json(&output_success( cli() .arg("read") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Zoe"}"#) @@ -527,16 +547,17 @@ query insert_person($name: String, $age: I32) { #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_branch_delete_removes_branch() { - let graph = SystemGraph::loaded(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); parse_stdout_json(&output_success( cli() .arg("branch") .arg("create") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--from") .arg("main") .arg("feature") @@ -547,9 +568,13 @@ fn 
remote_branch_delete_removes_branch() { cli() .arg("branch") .arg("delete") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("feature") + // Served target is non-local → destructive-confirm gate (RFC-011 D9). + .arg("--yes") .arg("--json"), )); assert_eq!(deleted["name"], "feature"); @@ -558,8 +583,10 @@ fn remote_branch_delete_removes_branch() { cli() .arg("branch") .arg("list") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--json"), )); assert_eq!(listed["branches"], json!(["main"])); @@ -568,11 +595,12 @@ fn remote_branch_delete_removes_branch() { #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_export_round_trips_full_branch_graph() { - let graph = SystemGraph::loaded(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let mutation_file = graph.write_query( - "system-remote-export-change.gq", + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + let temp = tempfile::tempdir().unwrap(); + let mutation_file = temp.path().join("system-remote-export-change.gq"); + fs::write( + &mutation_file, r#" query insert_person($name: String, $age: I32) { insert Person { name: $name, age: $age } @@ -582,14 +610,17 @@ query add_friend($from: String, $to: String) { insert Knows { from: $from, to: $to } } "#, - ); + ) + .unwrap(); output_success( cli() .arg("branch") .arg("create") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--from") .arg("main") .arg("feature"), @@ -598,11 +629,12 @@ query add_friend($from: String, $to: String) { output_success( cli() .arg("change") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) 
.arg("--query") .arg(&mutation_file) - .arg("--name") .arg("insert_person") .arg("--branch") .arg("feature") @@ -613,11 +645,12 @@ query add_friend($from: String, $to: String) { output_success( cli() .arg("change") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(&mutation_file) - .arg("--name") .arg("add_friend") .arg("--branch") .arg("feature") @@ -629,18 +662,17 @@ query add_friend($from: String, $to: String) { let exported = stdout_string(&output_success( cli() .arg("export") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--branch") .arg("feature") .arg("--jsonl"), )); - let export_path = graph.write_jsonl("system-remote-exported.jsonl", &exported); - let imported_graph = graph - .path() - .parent() - .unwrap() - .join("imported-remote-export.omni"); + let export_path = temp.path().join("system-remote-exported.jsonl"); + fs::write(&export_path, &exported).unwrap(); + let imported_graph = temp.path().join("imported-remote-export.omni"); output_success( cli() @@ -684,10 +716,10 @@ query add_friend($from: String, $to: String) { let eve = parse_stdout_json(&output_success( cli() .arg("read") + .arg("--store") .arg(&imported_graph) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"Eve"}"#) @@ -700,20 +732,24 @@ query add_friend($from: String, $to: String) { #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_ingest_creates_review_branch_and_keeps_it_readable() { - let graph = SystemGraph::loaded(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let ingest_data = graph.write_jsonl( - "system-remote-ingest.jsonl", + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + 
let temp = tempfile::tempdir().unwrap(); + let ingest_data = temp.path().join("system-remote-ingest.jsonl"); + fs::write( + &ingest_data, r#"{"type":"Person","data":{"name":"Zoe","age":33}} {"type":"Person","data":{"name":"Bob","age":26}}"#, - ); + ) + .unwrap(); let ingest_payload = parse_stdout_json(&output_success( cli() .arg("ingest") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--data") .arg(&ingest_data) .arg("--branch") @@ -730,8 +766,10 @@ fn remote_ingest_creates_review_branch_and_keeps_it_readable() { let feature_snapshot = parse_stdout_json(&output_success( cli() .arg("snapshot") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--branch") .arg("feature-ingest") .arg("--json"), @@ -741,11 +779,12 @@ fn remote_ingest_creates_review_branch_and_keeps_it_readable() { let zoe = parse_stdout_json(&output_success( cli() .arg("read") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--branch") .arg("feature-ingest") @@ -763,20 +802,24 @@ fn remote_ingest_creates_review_branch_and_keeps_it_readable() { #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_load_round_trips_and_requires_from_for_new_branches() { - let graph = SystemGraph::loaded(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); - let extra = graph.write_jsonl( - "system-remote-load.jsonl", + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); + let temp = tempfile::tempdir().unwrap(); + let extra = temp.path().join("system-remote-load.jsonl"); + fs::write( + &extra, r#"{"type":"Person","data":{"name":"Zoe","age":33}}"#, - ); + ) + 
.unwrap(); // Missing branch without --from: refused remotely, nothing created. let failure = output_failure( cli() .arg("load") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--mode") .arg("merge") .arg("--data") @@ -793,8 +836,10 @@ fn remote_load_round_trips_and_requires_from_for_new_branches() { let payload = parse_stdout_json(&output_success( cli() .arg("load") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--mode") .arg("merge") .arg("--data") @@ -813,8 +858,10 @@ fn remote_load_round_trips_and_requires_from_for_new_branches() { let snapshot = parse_stdout_json(&output_success( cli() .arg("snapshot") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--branch") .arg("feature-load") .arg("--json"), @@ -825,32 +872,38 @@ fn remote_load_round_trips_and_requires_from_for_new_branches() { #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_ingest_reuses_existing_branch_and_merges_updates() { - let graph = SystemGraph::loaded(); - let server = graph.spawn_server(); - let config = graph.write_config("omnigraph.yaml", &remote_yaml_config(&server.base_url)); + let cluster = converged_loaded_cluster(GRAPH_ID, None); + let server = spawn_server_with_cluster(cluster.path()); output_success( cli() .arg("branch") .arg("create") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--from") .arg("main") .arg("feature-ingest"), ); - let ingest_data = graph.write_jsonl( - "system-remote-ingest-merge.jsonl", + let temp = tempfile::tempdir().unwrap(); + let ingest_data = temp.path().join("system-remote-ingest-merge.jsonl"); + fs::write( + &ingest_data, r#"{"type":"Person","data":{"name":"Bob","age":26}} {"type":"Person","data":{"name":"Zoe","age":33}}"#, - ); + ) + 
.unwrap(); let ingest_payload = parse_stdout_json(&output_success( cli() .arg("ingest") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--data") .arg(&ingest_data) .arg("--branch") @@ -869,11 +922,12 @@ fn remote_ingest_reuses_existing_branch_and_merges_updates() { let bob = parse_stdout_json(&output_success( cli() .arg("read") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--branch") .arg("feature-ingest") @@ -887,11 +941,12 @@ fn remote_ingest_reuses_existing_branch_and_merges_updates() { let zoe = parse_stdout_json(&output_success( cli() .arg("read") - .arg("--config") - .arg(&config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--branch") .arg("feature-ingest") @@ -906,45 +961,51 @@ fn remote_ingest_reuses_existing_branch_and_merges_updates() { #[test] #[ignore = "requires loopback socket permissions in sandboxed runners"] fn remote_policy_enforces_branch_first_cli_workflow() { - let graph = SystemGraph::loaded(); - let server_config = - graph.write_config("server-policy.yaml", &remote_policy_server_config(&graph)); - graph.write_config("policy.yaml", REMOTE_POLICY_E2E_YAML); - let server = graph.spawn_server_with_config_env( - &server_config, + // Served policy enforcement: the cluster binds REMOTE_POLICY_E2E_YAML to the + // graph, and the server maps bearer tokens to actors. The actor is resolved + // from the token (no `--as` on served writes). 
+ let cluster = converged_loaded_cluster(GRAPH_ID, Some(REMOTE_POLICY_E2E_YAML)); + let server = spawn_server_with_cluster_env( + cluster.path(), &[( "OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", r#"{"act-bruno":"team-token","act-ragnor":"admin-token"}"#, )], ); - let client_config = graph.write_config( - "omnigraph-policy.yaml", - &remote_policy_client_config(&server.base_url), - ); - graph.write_config(".env.omni", "POLICY_TEST_TOKEN=team-token\n"); - let mutation_file = graph.write_query( - "system-remote-policy-change.gq", + let temp = tempfile::tempdir().unwrap(); + let mutation_file = temp.path().join("system-remote-policy-change.gq"); + fs::write( + &mutation_file, r#" query insert_person($name: String, $age: I32) { insert Person { name: $name, age: $age } } "#, - ); + ) + .unwrap(); + // Reads are granted to the team group (bruno). let snapshot = parse_stdout_json(&output_success( cli() + .env("OMNIGRAPH_BEARER_TOKEN", "team-token") .arg("snapshot") - .arg("--config") - .arg(&client_config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--json"), )); assert_eq!(snapshot["branch"], "main"); + // bruno cannot change protected main (team-write-unprotected only). 
let denied_main_change = output_failure( cli() + .env("OMNIGRAPH_BEARER_TOKEN", "team-token") .arg("change") - .arg("--config") - .arg(&client_config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(&mutation_file) .arg("--params") @@ -952,14 +1013,23 @@ query insert_person($name: String, $age: I32) { .arg("--json"), ); let denied_main_stderr = String::from_utf8(denied_main_change.stderr).unwrap(); - assert!(denied_main_stderr.contains("policy denied action 'change' on branch 'main'")); + assert!( + denied_main_stderr.contains("denied") + && denied_main_stderr.contains("change") + && denied_main_stderr.contains("main"), + "expected change-on-main denial, got: {denied_main_stderr}" + ); + // bruno can create an unprotected branch. let created = parse_stdout_json(&output_success( cli() + .env("OMNIGRAPH_BEARER_TOKEN", "team-token") .arg("branch") .arg("create") - .arg("--config") - .arg(&client_config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--from") .arg("main") .arg("feature") @@ -967,11 +1037,15 @@ query insert_person($name: String, $age: I32) { )); assert_eq!(created["name"], "feature"); + // bruno can change the unprotected branch; actor resolves from the token. let changed = parse_stdout_json(&output_success( cli() + .env("OMNIGRAPH_BEARER_TOKEN", "team-token") .arg("change") - .arg("--config") - .arg(&client_config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(&mutation_file) .arg("--branch") @@ -982,28 +1056,39 @@ query insert_person($name: String, $age: I32) { )); assert_eq!(changed["branch"], "feature"); assert_eq!(changed["affected_nodes"], 1); + assert_eq!(changed["actor_id"], "act-bruno"); + // bruno cannot merge into protected main (admins-promote only). 
let denied_merge = output_failure( cli() + .env("OMNIGRAPH_BEARER_TOKEN", "team-token") .arg("branch") .arg("merge") - .arg("--config") - .arg(&client_config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("feature") .arg("--into") .arg("main") .arg("--json"), ); let denied_merge_stderr = String::from_utf8(denied_merge.stderr).unwrap(); - assert!(denied_merge_stderr.contains("policy denied action 'branch_merge'")); + assert!( + denied_merge_stderr.contains("denied") && denied_merge_stderr.contains("branch_merge"), + "expected branch_merge denial, got: {denied_merge_stderr}" + ); + // ragnor (admins) can promote into protected main. let merged = parse_stdout_json(&output_success( cli() - .env("POLICY_TEST_TOKEN", "admin-token") + .env("OMNIGRAPH_BEARER_TOKEN", "admin-token") .arg("branch") .arg("merge") - .arg("--config") - .arg(&client_config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("feature") .arg("--into") .arg("main") @@ -1013,12 +1098,14 @@ query insert_person($name: String, $age: I32) { let verify = parse_stdout_json(&output_success( cli() + .env("OMNIGRAPH_BEARER_TOKEN", "team-token") .arg("read") - .arg("--config") - .arg(&client_config) + .arg("--server") + .arg(&server.base_url) + .arg("--graph") + .arg(GRAPH_ID) .arg("--query") .arg(fixture("test.gq")) - .arg("--name") .arg("get_person") .arg("--params") .arg(r#"{"name":"PolicyRemote"}"#) @@ -1030,13 +1117,16 @@ query insert_person($name: String, $age: I32) { // ─── MR-668 PR 8 — omnigraph graphs list end-to-end ──────────────────────── -/// Multi-graph server + CLI `omnigraph graphs list` end-to-end. +/// Multi-graph server + CLI `omnigraph graphs list` end-to-end (RFC-011 +/// cluster-only serving). /// /// Steps: -/// 1. Init a graph `alpha` on disk and write an `omnigraph.yaml` -/// whose `graphs:` map references it. -/// 2. Spawn the server with `--config `. -/// 3. `omnigraph graphs list` — expect to see `alpha`. 
+/// 1. Build a converged cluster serving one graph `alpha` with a +/// server-scoped policy granting `act-admin` the `graph_list` action. +/// 2. Spawn the server with `--cluster` + a bearer-token map. +/// 3. `omnigraph graphs list --server ` (admin token) — expect `alpha`. +/// 4. Addressing the server via `--server ` with NO `--graph` errors and +/// lists the candidate graphs (RFC-011 D7). /// /// Ignored by default — spawning servers needs loopback socket /// permissions some sandboxes lack. @@ -1044,86 +1134,33 @@ query insert_person($name: String, $age: I32) { #[ignore = "requires loopback socket permissions in sandboxed runners"] fn graphs_list_against_multi_graph_server() { let cfg_dir = tempfile::tempdir().unwrap(); - let schema_path = fixture("test.pg"); - - // Init `alpha` on disk. - let alpha_uri = cfg_dir.path().join("alpha.omni"); - tokio::runtime::Runtime::new().unwrap().block_on(async { - Omnigraph::init( - alpha_uri.to_str().unwrap(), - &fs::read_to_string(&schema_path).unwrap(), - ) - .await - .unwrap(); - }); - + let dir = cfg_dir.path(); + fs::copy(fixture("test.pg"), dir.join("alpha.pg")).unwrap(); + fs::write(dir.join("server.policy.yaml"), GRAPH_LIST_SERVER_POLICY_YAML).unwrap(); fs::write( - cfg_dir.path().join("server-policy.yaml"), - GRAPH_LIST_SERVER_POLICY_YAML, + dir.join("cluster.yaml"), + "version: 1\nmetadata:\n name: sys\nstate:\n backend: cluster\n lock: true\ngraphs:\n alpha:\n schema: ./alpha.pg\npolicies:\n server:\n file: ./server.policy.yaml\n applies_to: [cluster]\n", ) .unwrap(); + output_success(cli().arg("cluster").arg("import").arg("--config").arg(dir)); + output_success(cli().arg("cluster").arg("apply").arg("--config").arg(dir)); - // Server config with `graphs:` map and no `server.graph` selector - // — multi mode (rule 4 of the inference matrix). `GET /graphs` is a - // server-scoped action, so the success path needs an explicit server - // policy and bearer token. 
- let server_config_path = cfg_dir.path().join("omnigraph.yaml"); - fs::write( - &server_config_path, - format!( - "\ -server: - policy: - file: ./server-policy.yaml -graphs: - alpha: - uri: {} -", - yaml_string(&alpha_uri.to_string_lossy()) - ), - ) - .unwrap(); - - let server = spawn_server_with_config_env( - &server_config_path, + let server = spawn_server_with_cluster_env( + dir, &[( "OMNIGRAPH_SERVER_BEARER_TOKENS_JSON", r#"{"act-admin":"admin-token"}"#, )], ); - // Client config — the CLI's `--target dev` resolves to `server.base_url`. - let client_config_path = cfg_dir.path().join("client.yaml"); - fs::write( - &client_config_path, - format!( - "\ -graphs: - dev: - uri: {} - bearer_token_env: GRAPH_LIST_TOKEN -cli: - graph: dev -auth: - env_file: ./.env.omni -", - yaml_string(&server.base_url) - ), - ) - .unwrap(); - fs::write( - cfg_dir.path().join(".env.omni"), - "GRAPH_LIST_TOKEN=admin-token\n", - ) - .unwrap(); - // `graphs list` lists `alpha`. let payload = parse_stdout_json(&output_success( cli() + .env("OMNIGRAPH_BEARER_TOKEN", "admin-token") .arg("graphs") .arg("list") - .arg("--config") - .arg(&client_config_path) + .arg("--server") + .arg(&server.base_url) .arg("--json"), )); let ids: Vec<&str> = payload["graphs"] @@ -1134,5 +1171,27 @@ auth: .collect(); assert_eq!(ids, vec!["alpha"]); + // RFC-011 D7: addressing the multi-graph server via `--server ` with no + // `--graph` errors and lists the candidate graphs (the resolver probes + // GET /graphs; the default-env token authorizes it). 
+ let no_graph = cli() + .env("OMNIGRAPH_BEARER_TOKEN", "admin-token") + .arg("query") + .arg("--server") + .arg(&server.base_url) + .arg("-e") + .arg("query q { match { $p: Person { name: \"x\" } } return { $p.name } }") + .output() + .unwrap(); + assert!( + !no_graph.status.success(), + "multi-graph server with no --graph must error" + ); + let stderr = String::from_utf8_lossy(&no_graph.stderr); + assert!( + stderr.contains("alpha") && stderr.contains("--graph "), + "expected a candidate-listing error naming alpha; got: {stderr}" + ); + drop(server); } diff --git a/crates/omnigraph-cluster/src/lib.rs b/crates/omnigraph-cluster/src/lib.rs index 6251913..1c4e4fc 100644 --- a/crates/omnigraph-cluster/src/lib.rs +++ b/crates/omnigraph-cluster/src/lib.rs @@ -38,8 +38,9 @@ use diff::{ diff_resources, resource_kind, }; pub use serve::{ - ServingGraph, ServingPolicy, ServingQuery, ServingSnapshot, cluster_root_for_graph_uri, - read_serving_snapshot, read_serving_snapshot_from_storage, resolve_graph_storage_uri, + ServingGraph, ServingPolicy, ServingQuery, ServingSnapshot, cluster_graph_ids, + cluster_root_for_graph_uri, read_serving_snapshot, read_serving_snapshot_from_storage, + resolve_graph_storage_uri, }; use store::{ClusterStore, StateLockGuard, StateSnapshot}; use sweep::{ diff --git a/crates/omnigraph-cluster/src/serve.rs b/crates/omnigraph-cluster/src/serve.rs index a0357b4..6f89e2d 100644 --- a/crates/omnigraph-cluster/src/serve.rs +++ b/crates/omnigraph-cluster/src/serve.rs @@ -112,32 +112,14 @@ pub async fn cluster_root_for_graph_uri(graph_uri: &str) -> Option { /// /// `cluster` is a config directory or a storage-root URI (`s3://…`, config-free), /// mirroring the server's `--cluster` dispatch. -pub async fn resolve_graph_storage_uri( - cluster: &str, - graph_id: &str, -) -> Result { - let backend = if cluster.contains("://") { - ClusterStore::for_storage_root(cluster)? 
- } else { - ClusterStore::for_config_dir(Path::new(cluster)) - }; +pub async fn resolve_graph_storage_uri(cluster: &str, graph_id: &str) -> Result { + let backend = open_cluster_backend(cluster)?; let mut observations = backend.observations(); let snapshot = backend.read_state(&mut observations).await?; - let state = snapshot.state.ok_or_else(|| { - Diagnostic::error( - "cluster_state_missing", - CLUSTER_STATE_FILE, - format!("cluster `{cluster}` has no applied state; run `cluster apply` first"), - ) - })?; + let state = snapshot.state.ok_or_else(|| missing_state_diagnostic(cluster))?; let address = format!("graph.{graph_id}"); if !state.applied_revision.resources.contains_key(&address) { - let applied: Vec<&str> = state - .applied_revision - .resources - .keys() - .filter_map(|a| a.strip_prefix("graph.")) - .collect(); + let applied = applied_graph_ids(&state); return Err(Diagnostic::error( "graph_not_applied", address, @@ -151,6 +133,46 @@ pub async fn resolve_graph_storage_uri( Ok(backend.graph_root(graph_id)) } +/// List the graph ids applied in a cluster's served state (sorted). Reads the +/// ledger only — no catalog validation — like `resolve_graph_storage_uri`, so +/// it works on a degraded cluster. Used to enumerate candidates when no +/// `--graph` is selected (RFC-011 Decision 7). 
+pub async fn cluster_graph_ids(cluster: &str) -> Result, Diagnostic> { + let backend = open_cluster_backend(cluster)?; + let mut observations = backend.observations(); + let snapshot = backend.read_state(&mut observations).await?; + let state = snapshot.state.ok_or_else(|| missing_state_diagnostic(cluster))?; + Ok(applied_graph_ids(&state)) +} + +fn open_cluster_backend(cluster: &str) -> Result { + if cluster.contains("://") { + ClusterStore::for_storage_root(cluster) + } else { + Ok(ClusterStore::for_config_dir(Path::new(cluster))) + } +} + +fn missing_state_diagnostic(cluster: &str) -> Diagnostic { + Diagnostic::error( + "cluster_state_missing", + CLUSTER_STATE_FILE, + format!("cluster `{cluster}` has no applied state; run `cluster apply` first"), + ) +} + +fn applied_graph_ids(state: &crate::types::ClusterState) -> Vec { + let mut ids: Vec = state + .applied_revision + .resources + .keys() + .filter_map(|a| a.strip_prefix("graph.")) + .map(str::to_string) + .collect(); + ids.sort(); + ids +} + /// Split `/graphs/.omni` → ``, gating on the exact cluster /// graph-layout shape (a single `` segment, no nested path). `None` for /// anything else — no I/O is done for non-cluster-shaped URIs. diff --git a/crates/omnigraph-server/examples/bench_concurrent_http.rs b/crates/omnigraph-server/examples/bench_concurrent_http.rs index 6a8411a..044b2ce 100644 --- a/crates/omnigraph-server/examples/bench_concurrent_http.rs +++ b/crates/omnigraph-server/examples/bench_concurrent_http.rs @@ -1,14 +1,15 @@ //! Server-level concurrent HTTP benchmark for MR-686 (PR 0 baseline). //! //! Drives concurrent `/change` requests against an in-process Omnigraph HTTP -//! server. Measures the global `Arc>` lock penalty on -//! current `main` so PR 1 + PR 2 can be evaluated against a real baseline. +//! server. Originally written to measure the global `Arc>` +//! lock penalty as an MR-686 baseline; that lock has since been removed +//! 
(engine write APIs are `&self`, the server holds a lockless +//! `Arc`), so this now measures the concurrent write path itself +//! (per-`(table, branch)` queue contention + Lance I/O). //! -//! Per the MR-686 plan: this is the load-bearing bench. `Omnigraph::mutate_as` -//! is `&mut self`, so an engine-level concurrent bench either serializes on the -//! borrow checker (measures nothing) or drives multiple handles (measures Lance -//! contention, not the server bottleneck). Driving the HTTP server is the only -//! way to measure the actual `RwLock` contention this work removes. +//! Driving the HTTP server is still the right level: an engine-level bench on +//! a single handle measures Lance contention, not the server's request-path +//! concurrency. //! //! Usage: //! ```sh diff --git a/crates/omnigraph-server/src/config.rs b/crates/omnigraph-server/src/config.rs deleted file mode 100644 index 15b957d..0000000 --- a/crates/omnigraph-server/src/config.rs +++ /dev/null @@ -1,1103 +0,0 @@ -use std::collections::BTreeMap; -use std::env; -use std::fs; -use std::path::{Path, PathBuf}; - -use clap::ValueEnum; -use color_eyre::eyre::{Result, bail}; -use serde::{Deserialize, Serialize}; - -pub const DEFAULT_CONFIG_FILE: &str = "omnigraph.yaml"; - -pub fn graph_resource_id_for_selection( - selected_graph: Option<&str>, - normalized_uri: &str, -) -> String { - selected_graph.unwrap_or(normalized_uri).to_string() -} - -#[derive(Debug, Clone, Default, Serialize, Deserialize)] -pub struct ProjectConfig { - pub name: Option, -} - -#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct TargetConfig { - pub uri: String, - pub bearer_token_env: Option, - /// Per-graph Cedar policy file (MR-668). In single-graph mode this - /// field is unused — the top-level `policy.file` applies. In - /// multi-graph mode, each `graphs..policy.file` governs that - /// graph's HTTP-layer Cedar enforcement. 
- #[serde(default)] - pub policy: PolicySettings, - /// Per-graph stored-query registry: an inline `name -> entry` - /// map. Mirrors the per-graph `policy` shape — each - /// `graphs..queries` declares that graph's stored queries. Absent - /// (or empty) = no stored queries for the graph. v1 is inline-only; - /// an external `queries.yaml` manifest indirection is a deferred - /// convenience. - #[serde(default)] - pub queries: BTreeMap, -} - -#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)] -#[serde(rename_all = "snake_case")] -pub enum ReadOutputFormat { - #[default] - Table, - Kv, - Csv, - Jsonl, - Json, -} - -#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)] -#[serde(rename_all = "snake_case")] -pub enum TableCellLayout { - #[default] - Truncate, - Wrap, -} - -#[derive(Debug, Clone, Default, Serialize, Deserialize)] -pub struct CliDefaults { - #[serde(rename = "graph")] - pub graph: Option, - pub branch: Option, - pub output_format: Option, - pub table_max_column_width: Option, - pub table_cell_layout: Option, - /// Default actor identity for CLI direct-engine writes (MR-722). - /// Used when `policy.file` is configured and the operator hasn't - /// passed `--as ` on the command line. With policy configured - /// and neither this nor `--as` set, the engine-layer footgun guard - /// fires (no silent bypass). - pub actor: Option, -} - -#[derive(Debug, Clone, Default, Serialize, Deserialize)] -pub struct ServerDefaults { - #[serde(rename = "graph")] - pub graph: Option, - pub bind: Option, - /// Server-level Cedar policy (MR-668). Governs management endpoints - /// — currently `GET /graphs`; future runtime add/remove endpoints - /// will plug in here too. In single-graph mode this is unused — the - /// top-level `policy.file` covers the single graph. 
- #[serde(default)] - pub policy: PolicySettings, -} - -#[derive(Debug, Clone, Default, Serialize, Deserialize)] -pub struct AuthDefaults { - pub env_file: Option, -} - -#[derive(Debug, Clone, Default, Serialize, Deserialize)] -pub struct QueryDefaults { - #[serde(default)] - pub roots: Vec, -} - -#[derive(Debug, Clone, Default, Serialize, Deserialize)] -pub struct PolicySettings { - pub file: Option, -} - -/// One stored-query registry entry. The map **key** is the query's -/// identity — it must equal the `query ` symbol declared inside -/// the referenced `.gq` file (asserted when the registry loads). -/// Renaming the key (or the symbol) is a breaking change to callers, by -/// design. -#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct QueryEntry { - /// Path to the `.gq` file (relative to the config's `base_dir`). The - /// file may declare several queries; the registry selects the one - /// whose symbol matches the map key. - pub file: String, - #[serde(default)] - pub mcp: McpSettings, -} - -/// MCP exposure for a stored query. A *deployment* concern (the same -/// `.gq` may be exposed in one graph and hidden in another), so it lives -/// in YAML rather than in the `.gq` source. **Default `expose: true`** — -/// declaring a query in the manifest *is* the opt-in, so it appears in the -/// MCP tool catalog (`GET /queries`) by default; set `expose: false` to -/// keep a query HTTP/service-callable but hidden from the agent tool list. -/// `expose` governs catalog membership only — it is **not** an -/// authorization gate (invocation is gated by `invoke_query`), so a hidden -/// query is still invocable by name with the right permission. 
-#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct McpSettings { - #[serde(default = "mcp_expose_default")] - pub expose: bool, - pub tool_name: Option, -} - -fn mcp_expose_default() -> bool { - true -} - -impl Default for McpSettings { - fn default() -> Self { - Self { - expose: mcp_expose_default(), - tool_name: None, - } - } -} - -#[derive(Debug, Clone, Copy, Eq, PartialEq, Serialize, Deserialize)] -#[serde(rename_all = "snake_case")] -pub enum AliasCommand { - /// Read alias (canonical: `query`). The legacy spelling `read` is - /// kept as the variant name for back-compat with serialized configs - /// and external SDK callers; `query` is accepted on the wire via the - /// serde alias. - #[serde(alias = "query")] - Read, - /// Mutation alias (canonical: `mutate`). The legacy spelling `change` - /// is kept as the variant name for back-compat; `mutate` is accepted - /// on the wire via the serde alias. - #[serde(alias = "mutate")] - Change, -} - -#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct AliasConfig { - pub command: AliasCommand, - pub query: String, - pub name: Option, - #[serde(default)] - pub args: Vec, - #[serde(rename = "graph")] - pub graph: Option, - pub branch: Option, - pub format: Option, -} - -#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct OmnigraphConfig { - #[serde(default)] - pub project: ProjectConfig, - #[serde(default, rename = "graphs")] - pub graphs: BTreeMap, - #[serde(default)] - pub server: ServerDefaults, - #[serde(default)] - pub auth: AuthDefaults, - #[serde(default)] - pub cli: CliDefaults, - #[serde(default)] - pub query: QueryDefaults, - #[serde(default)] - pub aliases: BTreeMap, - #[serde(default)] - pub policy: PolicySettings, - /// Top-level stored-query registry, used in single-graph - /// mode — mirrors how the top-level `policy` applies to the single - /// graph. In multi-graph mode this is unused; each graph's - /// `graphs..queries` applies instead. 
- #[serde(default)] - pub queries: BTreeMap, - #[serde(skip)] - base_dir: PathBuf, -} - -impl Default for OmnigraphConfig { - fn default() -> Self { - Self { - project: ProjectConfig::default(), - graphs: BTreeMap::new(), - server: ServerDefaults::default(), - auth: AuthDefaults::default(), - cli: CliDefaults::default(), - query: QueryDefaults::default(), - aliases: BTreeMap::new(), - policy: PolicySettings::default(), - queries: BTreeMap::new(), - base_dir: PathBuf::new(), - } - } -} - -impl OmnigraphConfig { - pub fn base_dir(&self) -> &Path { - &self.base_dir - } - - pub fn cli_branch(&self) -> &str { - self.cli.branch.as_deref().unwrap_or("main") - } - - pub fn cli_output_format(&self) -> ReadOutputFormat { - self.cli.output_format.unwrap_or_default() - } - - pub fn table_max_column_width(&self) -> usize { - self.cli.table_max_column_width.unwrap_or(80) - } - - pub fn table_cell_layout(&self) -> TableCellLayout { - self.cli.table_cell_layout.unwrap_or_default() - } - - pub fn cli_graph_name(&self) -> Option<&str> { - self.cli.graph.as_deref() - } - - pub fn server_graph_name(&self) -> Option<&str> { - self.server.graph.as_deref() - } - - pub fn server_bind(&self) -> &str { - self.server.bind.as_deref().unwrap_or("127.0.0.1:8080") - } - - pub fn resolve_target_name<'a>( - &self, - explicit_uri: Option<&str>, - explicit_target: Option<&'a str>, - default_target: Option<&'a str>, - ) -> Option<&'a str> { - explicit_target.or_else(|| { - if explicit_uri.is_some() { - None - } else { - default_target - } - }) - } - - pub fn graph_bearer_token_env( - &self, - explicit_uri: Option<&str>, - explicit_target: Option<&str>, - default_target: Option<&str>, - ) -> Option<&str> { - let target_name = - self.resolve_target_name(explicit_uri, explicit_target, default_target)?; - self.graphs - .get(target_name) - .and_then(|target| target.bearer_token_env.as_deref()) - } - - pub fn resolve_auth_env_file(&self) -> Option { - self.auth - .env_file - .as_deref() - .map(|path| 
self.resolve_config_path(path)) - } - - pub fn resolve_policy_file(&self) -> Option { - self.policy - .file - .as_deref() - .map(|path| self.resolve_config_path(path)) - } - - /// Resolve the per-graph policy file path for the named target, - /// relative to the config file's `base_dir`. Returns `None` if the - /// target is unknown or no per-graph `policy.file` is set. - pub fn resolve_target_policy_file(&self, target_name: &str) -> Option { - let target = self.graphs.get(target_name)?; - target - .policy - .file - .as_deref() - .map(|path| self.resolve_config_path(path)) - } - - /// The top-level stored-query registry entries (single-graph mode). - pub fn query_entries(&self) -> &BTreeMap { - &self.queries - } - - /// The per-graph stored-query registry entries for a named target - /// (multi-graph mode). Returns `None` if the target is unknown. - pub fn target_query_entries( - &self, - target_name: &str, - ) -> Option<&BTreeMap> { - self.graphs.get(target_name).map(|target| &target.queries) - } - - /// The stored-query registry entries that apply for a graph - /// selection — the single definition of "which `queries:` block - /// governs graph X", shared by server boot and the CLI so the two - /// can't drift. A named graph present in `graphs:` uses its - /// per-graph block; everything else (no selection, or a name that is - /// not a known graph, e.g. a bare URI) falls back to the top-level - /// block (single-graph mode). - pub fn query_entries_for(&self, graph: Option<&str>) -> &BTreeMap { - match graph { - Some(name) if self.graphs.contains_key(name) => &self.graphs[name].queries, - _ => &self.queries, - } - } - - /// The single CLI gate that turns a raw graph selection into a *validated* - /// one — the fallible counterpart to the infallible - /// [`OmnigraphConfig::query_entries_for`]. 
Both `queries` subcommands route - /// their selection through here so neither can skip a check the other (or - /// server boot) applies: - /// * a known name passes through, but only after the same coherence check - /// server boot enforces - /// ([`OmnigraphConfig::ensure_top_level_blocks_honored`]) — a named graph - /// with a populated top-level block is rejected; - /// * an unknown name errors with the **same** message - /// [`OmnigraphConfig::resolve_target_uri`] produces, so a command that - /// opens no URI rejects an unknown `--target` exactly like the - /// URI-resolving commands do; - /// * an anonymous selection (`None`, e.g. a bare URI) stays anonymous, - /// resolving to the top-level registry downstream (top-level honored). - pub fn resolve_graph_selection<'a>(&self, graph: Option<&'a str>) -> Result> { - match graph { - Some(name) if self.graphs.contains_key(name) => { - self.ensure_top_level_blocks_honored(Some(name))?; - Ok(Some(name)) - } - Some(name) => bail!("graph '{}' not found in {}", name, DEFAULT_CONFIG_FILE), - None => Ok(None), - } - } - - pub fn resolve_policy_tooling_graph_selection(&self) -> Result> { - self.resolve_graph_selection(self.cli_graph_name().or_else(|| self.server_graph_name())) - } - - /// The policy file that applies for a graph selection — the policy - /// sibling of [`OmnigraphConfig::query_entries_for`], so policy and - /// queries resolve by the same identity rule. A named graph in - /// `graphs:` uses its per-graph `policy.file` with **no** top-level - /// fallback (a named graph with no per-graph policy has no policy — - /// that keeps the boot-time coherence check meaningful); anything else - /// (no selection, or a bare URI) uses the top-level `policy.file`. 
- pub fn resolve_policy_file_for(&self, graph: Option<&str>) -> Option { - match graph { - Some(name) if self.graphs.contains_key(name) => self.resolve_target_policy_file(name), - _ => self.resolve_policy_file(), - } - } - - /// Names of any top-level config blocks (`policy.file`, `queries:`) - /// that are populated. Used by the boot-time coherence check: when a - /// **named** graph is served (single-mode by name, or multi-mode), - /// the top-level blocks are not honored, so a populated one is a - /// configuration error rather than a silent no-op. - pub fn populated_top_level_blocks(&self) -> Vec<&'static str> { - let mut blocks = Vec::new(); - if self.policy.file.is_some() { - blocks.push("policy.file"); - } - if !self.queries.is_empty() { - blocks.push("queries"); - } - blocks - } - - /// A named graph uses its own `graphs.` block, so a populated - /// top-level block would be silently ignored — a config error. The single - /// definition of that rule, shared by server boot and the CLI selection - /// gate ([`OmnigraphConfig::resolve_graph_selection`]) so the two can't - /// drift. An anonymous selection (`None`, e.g. a bare URI) legitimately - /// honors the top-level blocks, so it is never rejected here. - pub fn ensure_top_level_blocks_honored(&self, selected: Option<&str>) -> Result<()> { - if let Some(name) = selected { - let unhonored = self.populated_top_level_blocks(); - if !unhonored.is_empty() { - bail!( - "named graph '{name}' uses its own `graphs.{name}.…` block, but top-level {} \ - {} set and would be ignored. Move it to `graphs.{name}` (e.g. \ - `graphs.{name}.policy.file`, `graphs.{name}.queries`).", - unhonored.join(" and "), - if unhonored.len() == 1 { "is" } else { "are" }, - ); - } - } - Ok(()) - } - - /// Resolve a stored-query `.gq` file path (from a registry entry), - /// relative to the config's `base_dir`. 
Mirrors policy-file - /// resolution; the registry loader calls this to turn each entry's - /// `file:` value into an absolute path. - pub fn resolve_query_file(&self, value: &str) -> PathBuf { - self.resolve_config_path(value) - } - - /// Resolve the server-level policy file path (used by management - /// endpoints). Returns `None` if `server.policy.file` is not set. - pub fn resolve_server_policy_file(&self) -> Option { - self.server - .policy - .file - .as_deref() - .map(|path| self.resolve_config_path(path)) - } - - /// Resolve a raw config-supplied URI (which may be relative) to its - /// absolute form. URIs containing `://` are passed through as-is; - /// relative paths are joined with the config file's `base_dir`. - pub fn resolve_uri_value(&self, value: &str) -> String { - self.resolve_config_uri(value) - } - - pub fn resolve_policy_tests_file(&self) -> Option { - let policy_file = self.resolve_policy_file()?; - Some(policy_file.with_file_name("policy.tests.yaml")) - } - - pub fn alias(&self, name: &str) -> Result<&AliasConfig> { - self.aliases - .get(name) - .ok_or_else(|| color_eyre::eyre::eyre!("alias '{}' not found", name)) - } - - pub fn resolve_target_uri( - &self, - explicit_uri: Option, - explicit_target: Option<&str>, - default_target: Option<&str>, - ) -> Result { - if let Some(uri) = explicit_uri { - return Ok(uri); - } - - let target_name = explicit_target.or(default_target).ok_or_else(|| { - color_eyre::eyre::eyre!("URI must be provided via , --target, or config") - })?; - let target = self.graphs.get(target_name).ok_or_else(|| { - color_eyre::eyre::eyre!( - "graph '{}' not found in {}", - target_name, - DEFAULT_CONFIG_FILE - ) - })?; - Ok(self.resolve_config_uri(&target.uri)) - } - - pub fn resolve_query_path(&self, query: &Path) -> Result { - if query.is_absolute() { - return Ok(query.to_path_buf()); - } - - let direct = self.base_dir.join(query); - if direct.exists() { - return Ok(direct); - } - - for root in &self.query.roots { - let 
candidate = self.base_dir.join(root).join(query); - if candidate.exists() { - return Ok(candidate); - } - } - - bail!("query file '{}' not found", query.display()); - } - - fn resolve_config_uri(&self, value: &str) -> String { - if value.contains("://") { - return value.to_string(); - } - - let path = Path::new(value); - if path.is_absolute() { - value.to_string() - } else { - self.base_dir.join(path).to_string_lossy().to_string() - } - } - - fn resolve_config_path(&self, value: &str) -> PathBuf { - let path = Path::new(value); - if path.is_absolute() { - path.to_path_buf() - } else { - self.base_dir.join(path) - } - } -} - -pub fn default_config_path() -> PathBuf { - PathBuf::from(DEFAULT_CONFIG_FILE) -} - -/// `OMNIGRAPH_CONFIG` env var: a first-class stand-in for `--config`, one -/// name with one meaning in both binaries (the container entrypoint already -/// uses it for the server; RFC-007 §D1 extends it to the CLI). -pub const CONFIG_PATH_ENV: &str = "OMNIGRAPH_CONFIG"; - -/// RFC-008 stage 4 — opt-in strict mode: when set, loading a legacy -/// `omnigraph.yaml` is a hard error instead of a warning. For teams that -/// finished migrating and want regressions caught (a stray legacy file -/// would otherwise silently outrank operator config during the window). -/// The rehearsal for stage 5's removal. -pub const NO_LEGACY_CONFIG_ENV: &str = "OMNIGRAPH_NO_LEGACY_CONFIG"; - -pub fn load_config(config_path: Option<&PathBuf>) -> Result { - let env_path = env::var_os(CONFIG_PATH_ENV).map(PathBuf::from); - let strict = env::var_os(NO_LEGACY_CONFIG_ENV).is_some(); - load_config_in(&env::current_dir()?, config_path, env_path.as_ref(), strict) -} - -fn load_config_in( - cwd: &Path, - config_path: Option<&PathBuf>, - env_path: Option<&PathBuf>, - strict_no_legacy: bool, -) -> Result { - // Precedence: explicit --config flag > $OMNIGRAPH_CONFIG > ./omnigraph.yaml. 
- let explicit_path = config_path.or(env_path).cloned(); - let config_path = explicit_path.or_else(|| { - let default_path = cwd.join(DEFAULT_CONFIG_FILE); - default_path.exists().then_some(default_path) - }); - - let mut config = if let Some(path) = &config_path { - if strict_no_legacy { - // Strict refuses the FILE, not its absence — flag-less - // invocations on migrated setups keep working. - bail!( - "legacy config '{}' refused: {NO_LEGACY_CONFIG_ENV} is set (RFC-008 strict mode); run `omnigraph config migrate`, then remove the file — or unset the variable", - path.display() - ); - } - let text = fs::read_to_string(path)?; - warn_yaml_deprecation_once(path, &text); - serde_yaml::from_str::(&text)? - } else { - OmnigraphConfig::default() - }; - - config.base_dir = if let Some(path) = config_path { - absolute_base_dir(cwd, &path)? - } else { - cwd.to_path_buf() - }; - - Ok(config) -} - -/// RFC-008 stage 1: suppress the legacy-config deprecation warning -/// (one process), for CI logs during the deprecation window. -pub const SUPPRESS_YAML_DEPRECATION_ENV: &str = "OMNIGRAPH_SUPPRESS_YAML_DEPRECATION"; - -/// RFC-008's migration map (the "Where every key goes" table), applied to -/// the keys actually present in a loaded file — never a generic banner. -/// Keys are `(yaml pointer, destination)`; the pointer is matched against -/// the file's real top-level/nested keys. 
-const YAML_DEPRECATION_MAP: &[(&str, &str)] = &[ - ("graphs", "cluster.yaml `graphs:` (team surface) — or flags/env for the zero-config tier"), - ("queries", "the cluster catalog (`.gq` discovery in cluster.yaml)"), - ("policy", "cluster.yaml `policies:` + `applies_to` bindings"), - ("server", "flags/env (`--bind`); meaningless under cluster boot"), - ("auth", "the operator credentials chain (`omnigraph login `)"), - ("aliases", "operator `aliases:` (bindings) + catalog stored queries (content)"), - ("query", "obsolete — cluster query discovery replaced `query.roots`"), - ("project", "cluster.yaml `metadata.name`"), - ("cli.actor", "`operator.actor` in ~/.omnigraph/config.yaml"), - ("cli.output_format", "`defaults.output` in ~/.omnigraph/config.yaml"), - ("cli.table_max_column_width", "`defaults.table_max_column_width` in ~/.omnigraph/config.yaml"), - ("cli.table_cell_layout", "`defaults.table_cell_layout` in ~/.omnigraph/config.yaml"), - ("cli.graph", "explicit `--target`/`--server` (no operator default-target yet)"), - ("cli.branch", "explicit `--branch`"), -]; - -/// Emit the per-key deprecation block once per process when a legacy -/// `omnigraph.yaml` is actually loaded. `omnigraph config migrate` -/// produces the split these lines describe. 
-fn warn_yaml_deprecation_once(path: &Path, text: &str) { - static WARNED: std::sync::OnceLock<()> = std::sync::OnceLock::new(); - if env::var_os(SUPPRESS_YAML_DEPRECATION_ENV).is_some() { - return; - } - let lines = yaml_deprecation_lines(text); - if lines.is_empty() { - return; - } - WARNED.get_or_init(|| { - eprintln!( - "warning: '{}' is deprecated (RFC-008) — its keys have new homes; run `omnigraph config migrate` for the split, set {SUPPRESS_YAML_DEPRECATION_ENV}=1 to silence:", - path.display() - ); - for line in &lines { - eprintln!(" {line}"); - } - }); -} - -fn yaml_deprecation_lines(text: &str) -> Vec { - let Ok(mapping) = serde_yaml::from_str::(text) else { - return Vec::new(); - }; - let present = |pointer: &str| -> bool { - match pointer.split_once('.') { - None => mapping.contains_key(pointer), - Some((outer, inner)) => mapping - .get(outer) - .and_then(|value| value.as_mapping()) - .is_some_and(|nested| nested.contains_key(inner)), - } - }; - YAML_DEPRECATION_MAP - .iter() - .filter(|(pointer, _)| present(pointer)) - .map(|(pointer, destination)| format!("`{pointer}` -> {destination}")) - .collect() -} - -fn absolute_base_dir(cwd: &Path, path: &Path) -> Result { - let path = if path.is_absolute() { - path.to_path_buf() - } else { - cwd.join(path) - }; - Ok(path - .parent() - .map(Path::to_path_buf) - .unwrap_or_else(|| cwd.to_path_buf())) -} - -#[cfg(test)] -mod tests { - use std::fs; - use std::path::{Path, PathBuf}; - - use tempfile::tempdir; - - use super::{ - ReadOutputFormat, TableCellLayout, graph_resource_id_for_selection, load_config_in, - }; - - #[test] - fn env_config_path_stands_in_for_the_flag_but_loses_to_it() { - let temp = tempdir().unwrap(); - let flag_path = temp.path().join("flag.yaml"); - let env_path = temp.path().join("env.yaml"); - fs::write(&flag_path, "cli:\n actor: act-flag\n").unwrap(); - fs::write(&env_path, "cli:\n actor: act-env\n").unwrap(); - - // $OMNIGRAPH_CONFIG used when no flag… - let config = 
load_config_in(temp.path(), None, Some(&env_path), false).unwrap(); - assert_eq!(config.cli.actor.as_deref(), Some("act-env")); - - // …loses to an explicit --config… - let config = load_config_in(temp.path(), Some(&flag_path), Some(&env_path), false).unwrap(); - assert_eq!(config.cli.actor.as_deref(), Some("act-flag")); - - // …and beats the cwd default file. - fs::write(temp.path().join("omnigraph.yaml"), "cli:\n actor: act-cwd\n").unwrap(); - let config = load_config_in(temp.path(), None, Some(&env_path), false).unwrap(); - assert_eq!(config.cli.actor.as_deref(), Some("act-env")); - } - - #[test] - fn strict_mode_refuses_the_file_not_its_absence() { - let temp = tempdir().unwrap(); - // No file: strict mode changes nothing (defaults load). - let config = load_config_in(temp.path(), None, None, true).unwrap(); - assert!(config.cli.actor.is_none()); - - // File present: strict refuses with the migrate pointer. - fs::write(temp.path().join("omnigraph.yaml"), "cli:\n actor: a\n").unwrap(); - let err = load_config_in(temp.path(), None, None, true).unwrap_err(); - let message = err.to_string(); - assert!( - message.contains("OMNIGRAPH_NO_LEGACY_CONFIG") && message.contains("config migrate"), - "{message}" - ); - // Without strict, the same file loads. 
- assert!(load_config_in(temp.path(), None, None, false).is_ok()); - } - - #[test] - fn yaml_deprecation_lines_name_present_keys_only() { - let lines = super::yaml_deprecation_lines( - "graphs:\n g:\n uri: /tmp/x\ncli:\n actor: a\n branch: main\n", - ); - let joined = lines.join("\n"); - assert!(joined.contains("`graphs` ->"), "{joined}"); - assert!(joined.contains("`cli.actor` -> `operator.actor`"), "{joined}"); - assert!(joined.contains("`cli.branch` ->"), "{joined}"); - assert!(!joined.contains("`aliases`"), "{joined}"); - assert!(!joined.contains("`cli.output_format`"), "{joined}"); - - assert!(super::yaml_deprecation_lines("").is_empty()); - assert!(super::yaml_deprecation_lines("not: [valid").is_empty()); - } - - #[test] - fn load_config_reads_yaml_defaults_from_current_dir() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - r#" -graphs: - local: - uri: ./demo.omni - bearer_token_env: DEMO_TOKEN -auth: - env_file: .env.omni -cli: - graph: local - branch: main - output_format: kv - table_max_column_width: 40 - table_cell_layout: wrap -policy: {} -"#, - ) - .unwrap(); - - let config = load_config_in(temp.path(), None, None, false).unwrap(); - assert_eq!(config.cli_graph_name(), Some("local")); - assert_eq!(config.cli_branch(), "main"); - assert_eq!(config.cli_output_format(), ReadOutputFormat::Kv); - assert_eq!(config.table_max_column_width(), 40); - assert_eq!(config.table_cell_layout(), TableCellLayout::Wrap); - assert_eq!( - config.graph_bearer_token_env(None, None, config.cli_graph_name()), - Some("DEMO_TOKEN") - ); - assert_eq!( - config.resolve_auth_env_file().unwrap(), - temp.path().join(".env.omni") - ); - assert_eq!( - PathBuf::from( - config - .resolve_target_uri(None, None, config.cli_graph_name()) - .unwrap() - ), - temp.path().join("./demo.omni") - ); - } - - #[test] - fn load_config_does_not_walk_parent_directories() { - let temp = tempdir().unwrap(); - let child = temp.path().join("child"); - 
fs::create_dir_all(&child).unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "graphs:\n local:\n uri: ./demo.omni\n", - ) - .unwrap(); - - let config = load_config_in(&child, None, None, false).unwrap(); - assert!(config.graphs.is_empty()); - } - - #[test] - fn graph_resource_id_for_selection_uses_name_or_anonymous_uri() { - assert_eq!( - graph_resource_id_for_selection(Some("local"), "/tmp/graph.omni"), - "local" - ); - assert_eq!( - graph_resource_id_for_selection(None, "/tmp/graph.omni"), - "/tmp/graph.omni" - ); - } - - #[test] - fn resolve_graph_selection_validates_membership_and_coherence() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "graphs:\n local:\n uri: ./demo.omni\n", - ) - .unwrap(); - let config = load_config_in(temp.path(), None, None, false).unwrap(); - - // A known graph passes through unchanged. - assert_eq!(config.resolve_graph_selection(Some("local")).unwrap(), Some("local")); - // An anonymous selection stays anonymous (→ top-level registry downstream). - assert_eq!(config.resolve_graph_selection(None).unwrap(), None); - // An unknown name errors, naming the graph (matching resolve_target_uri). - let err = config.resolve_graph_selection(Some("ghost")).unwrap_err().to_string(); - assert!( - err.contains("ghost") && err.contains("not found"), - "unknown graph must error naming it: {err}" - ); - - // Coherence: a named graph plus a populated top-level block is the - // config server boot refuses, so the gate rejects it too (shared rule - // via ensure_top_level_blocks_honored). An anonymous selection still - // passes — top-level is honored when no graph is named. 
- let temp2 = tempdir().unwrap(); - fs::write( - temp2.path().join("omnigraph.yaml"), - "graphs:\n local:\n uri: ./demo.omni\npolicy:\n file: ./top.yaml\n", - ) - .unwrap(); - let incoherent = load_config_in(temp2.path(), None, None, false).unwrap(); - let err = incoherent - .resolve_graph_selection(Some("local")) - .unwrap_err() - .to_string(); - assert!( - err.contains("local") && err.contains("policy.file"), - "named graph + populated top-level block must be rejected, naming both: {err}" - ); - assert_eq!( - incoherent.resolve_graph_selection(None).unwrap(), - None, - "anonymous selection still honors top-level" - ); - } - - #[test] - fn policy_tooling_graph_selection_prefers_cli_then_server_and_validates() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "graphs:\n local:\n uri: ./local.omni\n prod:\n uri: ./prod.omni\n\ - server:\n graph: local\ncli:\n graph: prod\n", - ) - .unwrap(); - let config = load_config_in(temp.path(), None, None, false).unwrap(); - assert_eq!( - config.resolve_policy_tooling_graph_selection().unwrap(), - Some("prod") - ); - - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "graphs:\n local:\n uri: ./local.omni\nserver:\n graph: local\n", - ) - .unwrap(); - let config = load_config_in(temp.path(), None, None, false).unwrap(); - assert_eq!( - config.resolve_policy_tooling_graph_selection().unwrap(), - Some("local") - ); - - let temp = tempdir().unwrap(); - fs::write(temp.path().join("omnigraph.yaml"), "policy: {}\n").unwrap(); - let config = load_config_in(temp.path(), None, None, false).unwrap(); - assert_eq!(config.resolve_policy_tooling_graph_selection().unwrap(), None); - - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "graphs:\n local:\n uri: ./local.omni\nserver:\n graph: ghost\n", - ) - .unwrap(); - let config = load_config_in(temp.path(), None, None, false).unwrap(); - let err = config - 
.resolve_policy_tooling_graph_selection() - .unwrap_err() - .to_string(); - assert!( - err.contains("ghost") && err.contains("not found"), - "unknown server.graph must use graph-selection validation: {err}" - ); - } - - #[test] - fn resolve_query_path_searches_config_roots() { - let temp = tempdir().unwrap(); - fs::create_dir_all(temp.path().join("queries")).unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "query:\n roots:\n - queries\npolicy: {}\n", - ) - .unwrap(); - fs::write( - temp.path().join("queries").join("test.gq"), - "query q { return {} }", - ) - .unwrap(); - - let config = load_config_in(temp.path(), None, None, false).unwrap(); - let resolved = config.resolve_query_path(Path::new("test.gq")).unwrap(); - assert_eq!(resolved, temp.path().join("queries").join("test.gq")); - } - - #[test] - fn resolve_query_path_prefers_config_base_dir_over_ambient_cwd() { - let workspace = tempdir().unwrap(); - let config_dir = workspace.path().join("config"); - let ambient_dir = workspace.path().join("ambient"); - fs::create_dir_all(&config_dir).unwrap(); - fs::create_dir_all(&ambient_dir).unwrap(); - fs::write(config_dir.join("omnigraph.yaml"), "policy: {}\n").unwrap(); - fs::write(config_dir.join("local.gq"), "query local { return {} }").unwrap(); - fs::write(ambient_dir.join("local.gq"), "query ambient { return {} }").unwrap(); - - let config = - load_config_in(&ambient_dir, Some(&config_dir.join("omnigraph.yaml")), None, false).unwrap(); - let resolved = config.resolve_query_path(Path::new("local.gq")).unwrap(); - - assert_eq!(resolved, config_dir.join("local.gq")); - } - - #[test] - fn queries_block_round_trips_inline_and_per_graph() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - r#" -graphs: - prod: - uri: s3://bucket/prod - queries: - find_user: - file: ./queries/find_user.gq - mcp: - expose: true - tool_name: lookup_user - internal_audit: - file: ./queries/audit.gq -queries: - single_mode_q: - file: ./q.gq 
-"#, - ) - .unwrap(); - - let config = load_config_in(temp.path(), None, None, false).unwrap(); - - // Per-graph registry (multi-graph mode). - let prod = config.target_query_entries("prod").unwrap(); - assert_eq!(prod.len(), 2); - let find_user = &prod["find_user"]; - assert_eq!(find_user.file, "./queries/find_user.gq"); - assert!(find_user.mcp.expose); - assert_eq!(find_user.mcp.tool_name.as_deref(), Some("lookup_user")); - // Default exposure is true (the manifest entry is the opt-in); tool_name absent. - let audit = &prod["internal_audit"]; - assert!(audit.mcp.expose); - assert!(audit.mcp.tool_name.is_none()); - - // Top-level registry (single-graph mode). - assert_eq!(config.query_entries().len(), 1); - - // The shared selector resolves the same blocks the server boot - // and the CLI use: a known graph → its per-graph block; no - // selection or an unknown name → the top-level block (the latter - // pins the behavior of the CLI's now-deleted fallback arm). - assert_eq!(config.query_entries_for(Some("prod")).len(), 2); - assert_eq!(config.query_entries_for(None).len(), 1); - assert_eq!(config.query_entries_for(Some("nonexistent")).len(), 1); - - // Path resolution joins against base_dir, like policy files. - assert_eq!( - config.resolve_query_file(&find_user.file), - temp.path().join("./queries/find_user.gq") - ); - } - - #[test] - fn resolve_policy_file_for_follows_identity() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "policy:\n file: ./top.yaml\ngraphs:\n prod:\n uri: s3://b/prod\n \ - policy:\n file: ./prod.yaml\n bare:\n uri: s3://b/bare\n", - ) - .unwrap(); - let config = load_config_in(temp.path(), None, None, false).unwrap(); - - // Named graph with its own policy → per-graph (not top-level). 
- assert!( - config - .resolve_policy_file_for(Some("prod")) - .unwrap() - .ends_with("prod.yaml") - ); - // Named graph with NO per-graph policy → None (no top-level fallback; - // load-bearing for the boot coherence check). - assert!(config.resolve_policy_file_for(Some("bare")).is_none()); - // Anonymous (bare URI) or an unknown name → top-level. - assert!( - config - .resolve_policy_file_for(None) - .unwrap() - .ends_with("top.yaml") - ); - assert!( - config - .resolve_policy_file_for(Some("nope")) - .unwrap() - .ends_with("top.yaml") - ); - } - - #[test] - fn queries_block_absent_yields_empty_registry() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "graphs:\n local:\n uri: ./demo.omni\n", - ) - .unwrap(); - - let config = load_config_in(temp.path(), None, None, false).unwrap(); - // Additive: no `queries:` anywhere → empty registries everywhere. - assert!(config.query_entries().is_empty()); - assert!( - config - .target_query_entries("local") - .unwrap() - .is_empty() - ); - } - - #[test] - fn policy_block_accepts_non_empty_mapping() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - "policy:\n file: ./policy.yaml\n", - ) - .unwrap(); - - let config = load_config_in(temp.path(), None, None, false).unwrap(); - assert_eq!( - config.resolve_policy_file().unwrap(), - temp.path().join("policy.yaml") - ); - } - - #[test] - fn scoped_auth_env_ignores_default_target_when_uri_is_explicit() { - let temp = tempdir().unwrap(); - fs::write( - temp.path().join("omnigraph.yaml"), - r#" -graphs: - demo: - uri: https://example.com - bearer_token_env: DEMO_TOKEN -cli: - graph: demo -"#, - ) - .unwrap(); - - let config = load_config_in(temp.path(), None, None, false).unwrap(); - assert_eq!( - config.graph_bearer_token_env( - Some("https://override.example.com"), - None, - config.cli_graph_name() - ), - None - ); - assert_eq!( - config.graph_bearer_token_env( - Some("https://override.example.com"), - 
Some("demo"), - config.cli_graph_name() - ), - Some("DEMO_TOKEN") - ); - } -} diff --git a/crates/omnigraph-server/src/handlers.rs b/crates/omnigraph-server/src/handlers.rs index 94f4743..7de38d2 100644 --- a/crates/omnigraph-server/src/handlers.rs +++ b/crates/omnigraph-server/src/handlers.rs @@ -51,25 +51,15 @@ pub(crate) async fn server_graphs_list( State(state): State, actor: Option>, ) -> std::result::Result, ApiError> { - // 405 in single mode — there's no registry to enumerate, and the - // legacy URL surface didn't expose this endpoint. - let registry = match state.routing() { - GraphRouting::Single { .. } => { - return Err(ApiError::method_not_allowed( - "GET /graphs is only available in multi-graph mode", - )); - } - GraphRouting::Multi { registry, .. } => registry, - }; + let registry = &state.routing().registry; - // Server-level Cedar gate. `state.server_policy` is loaded from - // `server.policy.file` in `omnigraph.yaml` at startup. When no - // server policy is configured, `authorize_request_server` falls - // through to the MR-723 default-deny semantics (every non-Read - // action denied for an authenticated actor). `GraphList` is not - // `Read`, so without a server policy the request gets 403 — which - // is the right default (don't leak the registry until the operator - // explicitly authorizes it). + // Server-level Cedar gate. `state.server_policy` is loaded from the + // cluster-scoped policy bundle at startup. When no server policy is + // configured, `authorize_request_server` falls through to the MR-723 + // default-deny semantics (every non-Read action denied for an + // authenticated actor). `GraphList` is not `Read`, so without a server + // policy the request gets 403 — which is the right default (don't leak + // the registry until the operator explicitly authorizes it). 
authorize_request( actor.as_ref().map(|Extension(actor)| actor), state.server_policy.as_deref(), @@ -93,17 +83,15 @@ pub(crate) async fn server_graphs_list( } pub(crate) async fn server_openapi(State(state): State) -> Json { - let mut doc = ApiDoc::openapi(); + // `served_openapi` is the single nesting source — the protected + // routes always live under `/graphs/{graph_id}/...` (public/management + // paths `/healthz`, `/graphs` stay flat). Building from it here means + // the runtime spec and the committed `openapi.json` share one nesting + // pass and can't drift. + let mut doc = crate::served_openapi(); if !state.requires_bearer_auth() { strip_security(&mut doc); } - // MR-668: in multi mode, the protected routes live under - // `/graphs/{graph_id}/...`. Rewrite the doc so the spec matches - // the routes the router actually serves. Public paths (`/healthz`) - // stay flat in both modes. - if matches!(state.routing(), GraphRouting::Multi { .. }) { - nest_paths_under_cluster_prefix(&mut doc); - } Json(doc) } @@ -248,16 +236,11 @@ pub(crate) async fn require_bearer_auth( Ok(next.run(request).await) } -/// Routing middleware (MR-668). Resolves the active graph for the -/// request and injects `Arc` as an extension so handlers can -/// extract it via `Extension>`. +/// Routing middleware (RFC-011 cluster-only). Resolves the active graph +/// for the request and injects `Arc` as an extension so +/// handlers can extract it via `Extension>`. /// -/// **Single mode**: the routing field holds the single handle directly. -/// Routes are flat; every request resolves to that handle, regardless -/// of the URI path. No registry walk, no sentinel key, no -/// programmer-error guard. -/// -/// **Multi mode**: routes are nested under `/graphs/{graph_id}/...`. The +/// Routes are always nested under `/graphs/{graph_id}/...`. The /// middleware extracts `{graph_id}` from the URI path and looks it up in /// the registry. Returns 404 if the graph is not registered. 
/// @@ -268,39 +251,33 @@ pub(crate) async fn resolve_graph_handle( mut request: Request, next: Next, ) -> std::result::Result { - let handle = match &state.routing { - GraphRouting::Single { handle } => Arc::clone(handle), - GraphRouting::Multi { registry, .. } => { - // `Router::nest("/graphs/{graph_id}", inner)` rewrites - // `request.uri().path()` to the inner suffix (e.g. `/snapshot`). - // The pre-rewrite URI is preserved in the `OriginalUri` - // request extension by axum's router; we read from there to - // extract `{graph_id}`. Fall back to the current URI only if - // the extension is missing, which shouldn't happen for - // nested routes but is safe defensive code. - let original_path: String = request - .extensions() - .get::() - .map(|OriginalUri(uri)| uri.path().to_string()) - .unwrap_or_else(|| request.uri().path().to_string()); - let graph_id_str = original_path - .strip_prefix("/graphs/") - .and_then(|rest| rest.split('/').next()) - .filter(|s| !s.is_empty()) - .ok_or_else(|| { - ApiError::bad_request( - "cluster route missing /graphs/{graph_id} prefix".to_string(), - ) - })?; - let graph_id = GraphId::try_from(graph_id_str.to_string()) - .map_err(|err| ApiError::bad_request(err.to_string()))?; - let key = GraphKey::cluster(graph_id.clone()); - match registry.get(&key) { - RegistryLookup::Ready(handle) => handle, - RegistryLookup::Gone => { - return Err(ApiError::not_found(format!("graph '{graph_id}' not found"))); - } - } + let registry = &state.routing.registry; + // `Router::nest("/graphs/{graph_id}", inner)` rewrites + // `request.uri().path()` to the inner suffix (e.g. `/snapshot`). + // The pre-rewrite URI is preserved in the `OriginalUri` + // request extension by axum's router; we read from there to + // extract `{graph_id}`. Fall back to the current URI only if + // the extension is missing, which shouldn't happen for + // nested routes but is safe defensive code. 
+ let original_path: String = request + .extensions() + .get::() + .map(|OriginalUri(uri)| uri.path().to_string()) + .unwrap_or_else(|| request.uri().path().to_string()); + let graph_id_str = original_path + .strip_prefix("/graphs/") + .and_then(|rest| rest.split('/').next()) + .filter(|s| !s.is_empty()) + .ok_or_else(|| { + ApiError::bad_request("cluster route missing /graphs/{graph_id} prefix".to_string()) + })?; + let graph_id = GraphId::try_from(graph_id_str.to_string()) + .map_err(|err| ApiError::bad_request(err.to_string()))?; + let key = GraphKey::cluster(graph_id.clone()); + let handle = match registry.get(&key) { + RegistryLookup::Ready(handle) => handle, + RegistryLookup::Gone => { + return Err(ApiError::not_found(format!("graph '{graph_id}' not found"))); } }; @@ -382,22 +359,25 @@ pub(crate) fn authorize( // runtime state means the docstring contract on // `server_graphs_list` ("don't leak the registry until the // operator explicitly authorizes it") holds uniformly; the - // operator's only path to enabling it is configuring an - // explicit `server.policy.file` in omnigraph.yaml. + // operator's only path to enabling it is configuring a + // cluster-scoped policy bundle, applying the cluster, and + // restarting the server. if request.action.resource_kind() == PolicyResourceKind::Server { return Ok(Authz::Denied( - "server-scoped actions require an explicit `server.policy.file` \ - configured in omnigraph.yaml — the management surface is closed \ - by default in every runtime state, including --unauthenticated, \ - so that server topology is never exposed without operator opt-in." + "server-scoped actions require an explicit cluster policy bundle \ + applied with `omnigraph cluster apply` and served after restart — \ + the management surface is closed by default in every runtime state, \ + including --unauthenticated, so that server topology is never exposed \ + without operator opt-in." 
.to_string(), )); } if actor.is_some() && request.action != PolicyAction::Read { return Ok(Authz::Denied( "server runs in default-deny mode (bearer tokens configured but no \ - policy file). Only `read` actions are permitted; configure \ - `policy.file` in omnigraph.yaml to enable other actions." + applied policy bundle). Only `read` actions are permitted; configure \ + a graph or cluster policy bundle in the cluster config, run \ + `omnigraph cluster apply`, and restart the server to enable other actions." .to_string(), )); } @@ -510,7 +490,7 @@ pub(crate) fn deprecation_headers(successor_link: &'static str) -> [(HeaderName, operation_id = "read", request_body = ReadRequest, responses( - (status = 200, description = "Query results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", body = ReadOutput), + (status = 200, description = "Query results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", body = ReadOutput), (status = 400, description = "Bad request", body = ErrorOutput), (status = 401, description = "Unauthorized", body = ErrorOutput), (status = 403, description = "Forbidden", body = ErrorOutput), @@ -524,7 +504,7 @@ pub(crate) fn deprecation_headers(successor_link: &'static str) -> [(HeaderName, /// route is kept indefinitely for byte-stable back-compat. New integrations /// should target `POST /query`, which has clean field names (`query` / /// `name`) and a 400-on-mutation guard. Responses from this route include -/// `Deprecation: true` and `Link: ; rel="successor-version"` +/// `Deprecation: true` and `Link: ; rel="successor-version"` /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the /// signal. 
pub(crate) async fn server_read( @@ -544,7 +524,7 @@ pub(crate) async fn server_read( ) .await?; Ok(( - deprecation_headers("; rel=\"successor-version\""), + deprecation_headers("; rel=\"successor-version\""), Json(api::read_output(selected_name, &target, result)), )) } @@ -793,7 +773,7 @@ pub(crate) async fn run_query( operation_id = "change", request_body = ChangeRequest, responses( - (status = 200, description = "Mutation results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", body = ChangeOutput), + (status = 200, description = "Mutation results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", body = ChangeOutput), (status = 400, description = "Bad request", body = ErrorOutput), (status = 401, description = "Unauthorized", body = ErrorOutput), (status = 403, description = "Forbidden", body = ErrorOutput), @@ -809,7 +789,7 @@ pub(crate) async fn run_query( /// kept indefinitely for back-compat. New integrations should target /// `POST /mutate`, which has identical semantics and a name that pairs /// cleanly with `POST /query`. Responses from this route include -/// `Deprecation: true` and `Link: ; rel="successor-version"` +/// `Deprecation: true` and `Link: ; rel="successor-version"` /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the /// signal. pub(crate) async fn server_change( @@ -830,7 +810,7 @@ pub(crate) async fn server_change( ) .await?; Ok(( - deprecation_headers("; rel=\"successor-version\""), + deprecation_headers("; rel=\"successor-version\""), Json(output), )) } @@ -980,6 +960,22 @@ pub(crate) async fn server_invoke_query( let query_name = stored.name.clone(); let is_mutation = stored.is_mutation(); + // RFC-011 D3: the CLI verb asserts the stored query's kind. `query ` + // sends `expect_mutation: false`, `mutate ` sends `true`; a mismatch + // is rejected here so the wrong verb errors instead of silently running. 
+ if let Some(expected) = req.expect_mutation { + if expected != is_mutation { + let (actual, verb) = if is_mutation { + ("mutation", "mutate") + } else { + ("read", "query") + }; + return Err(ApiError::bad_request(format!( + "'{query_name}' is a {actual} — use omnigraph {verb} {query_name}" + ))); + } + } + info!( graph = %handle.uri, actor = ?actor_ref.map(|a| a.actor_id.as_ref()), @@ -1117,12 +1113,16 @@ pub(crate) async fn server_schema_get( (status = 400, description = "Bad request", body = ErrorOutput), (status = 401, description = "Unauthorized", body = ErrorOutput), (status = 403, description = "Forbidden", body = ErrorOutput), + (status = 409, description = "Schema apply is disabled for cluster-backed serving; use `omnigraph cluster apply` and restart", body = ErrorOutput), (status = 429, description = "Per-actor admission cap exceeded; honor `Retry-After` header", body = ErrorOutput), ), security(("bearer_token" = [])), )] /// Apply a schema migration. /// +/// Cluster-backed servers reject this route with `409 Conflict`; operators +/// must apply schema changes through `omnigraph cluster apply` and restart. +/// /// Diffs `schema_source` against the current schema and applies the resulting /// migration steps (add/drop type, add/drop column, etc.). **Destructive**: /// some steps drop data. Returns the list of steps applied; if `applied` is @@ -1149,6 +1149,17 @@ pub(crate) async fn server_schema_apply( target_branch: Some("main".to_string()), }, )?; + // Disable HTTP schema apply on cluster-backed serving AFTER the Cedar gate, + // so an unauthorized actor gets a 403 (not a 409 that would disclose the + // server is cluster-backed): 401 → 403 → 409, never leak topology before + // authorization. An authorized actor gets the actionable 409 signpost. 
+ if state.routing().config_path.is_some() { + return Err(ApiError::conflict( + "server-side schema apply is disabled for cluster-backed serving; \ + update the cluster config, run `omnigraph cluster apply`, and restart \ + the server.", + )); + } let est_bytes = request.schema_source.len() as u64; let _admission = state .workload @@ -1180,6 +1191,25 @@ pub(crate) async fn server_schema_apply( .await .map_err(ApiError::from_omni)? }; + // Prompt index convergence (iss-848): schema apply records `@index` intent + // but defers the physical build. On a long-lived server, materialize it + // promptly rather than waiting for the next `optimize` cron — spawned + // detached so it never blocks or fails the apply response. Best-effort: a + // failure is logged and the index still converges on the next optimize. + // The CLI is one-shot, so it has no equivalent; its convergence path is the + // operator's optimize cadence. + if result.applied { + let engine = Arc::clone(&handle.engine); + tokio::spawn(async move { + if let Err(err) = engine.ensure_indices().await { + tracing::warn!( + target: "omnigraph::server", + error = %err, + "post-apply ensure_indices failed; indexes will converge on the next optimize", + ); + } + }); + } Ok(Json(schema_apply_output(handle.uri.as_str(), result))) } @@ -1311,7 +1341,7 @@ pub(crate) async fn server_load( operation_id = "ingest", request_body = IngestRequest, responses( - (status = 200, description = "Load results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", body = IngestOutput), + (status = 200, description = "Load results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", body = IngestOutput), (status = 400, description = "Bad request", body = ErrorOutput), (status = 401, description = "Unauthorized", body = ErrorOutput), (status = 403, description = "Forbidden", body = ErrorOutput), @@ -1325,7 +1355,7 @@ pub(crate) async fn server_load( /// Bulk-load NDJSON data into a 
branch. Behavior is unchanged; the route is /// kept indefinitely for back-compat. New integrations should target /// `POST /load`, which has identical semantics. Responses from this route -/// include `Deprecation: true` and `Link: ; rel="successor-version"` +/// include `Deprecation: true` and `Link: ; rel="successor-version"` /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the signal. pub(crate) async fn server_ingest( State(state): State, @@ -1341,7 +1371,7 @@ pub(crate) async fn server_ingest( ) .await?; Ok(( - deprecation_headers("; rel=\"successor-version\""), + deprecation_headers("; rel=\"successor-version\""), Json(output), )) } @@ -1725,4 +1755,3 @@ pub(crate) fn query_params_from_json( json_params_to_param_map(params_json, query_params, JsonParamMode::Standard) .map_err(|err| color_eyre::eyre::eyre!(err.to_string())) } - diff --git a/crates/omnigraph-server/src/lib.rs b/crates/omnigraph-server/src/lib.rs index d45c74d..5451b05 100644 --- a/crates/omnigraph-server/src/lib.rs +++ b/crates/omnigraph-server/src/lib.rs @@ -1,11 +1,10 @@ pub mod api; mod handlers; mod settings; -pub use settings::{load_server_settings, classify_server_runtime_state, server_config_is_multi, ServerRuntimeState}; +pub use settings::{load_server_settings, classify_server_runtime_state, ServerRuntimeState}; use settings::*; use handlers::*; pub mod auth; -pub mod config; pub mod graph_id; pub mod identity; pub mod policy; @@ -46,11 +45,6 @@ use axum::response::{IntoResponse, Response}; use axum::routing::{delete, get, post}; use axum::{Json, Router}; use color_eyre::eyre::{Result, WrapErr, bail, eyre}; -pub use config::{ - AliasCommand, AliasConfig, CliDefaults, DEFAULT_CONFIG_FILE, OmnigraphConfig, PolicySettings, - ProjectConfig, QueryDefaults, ReadOutputFormat, ServerDefaults, TableCellLayout, TargetConfig, - graph_resource_id_for_selection, load_config, -}; use futures::stream; use omnigraph::db::{Omnigraph, ReadTarget}; use 
omnigraph::error::{ManifestConflictDetails, ManifestErrorKind, OmniError}; @@ -122,6 +116,20 @@ fn hash_bearer_token(token: &str) -> BearerTokenHash { )] pub struct ApiDoc; +/// The canonical served OpenAPI shape (RFC-011 cluster-only): the static +/// `ApiDoc` with every protected path nested under `/graphs/{graph_id}/…` +/// and `cluster_`-prefixed operation ids. `/healthz` and `/graphs` stay +/// flat. This is the single source of nesting — both the runtime +/// `server_openapi` handler and the committed `openapi.json` derive from +/// it, so the published spec can never describe routes the server does +/// not serve. The handler additionally strips security in open mode; the +/// committed spec retains it. +pub fn served_openapi() -> utoipa::openapi::OpenApi { + let mut doc = ApiDoc::openapi(); + handlers::nest_paths_under_cluster_prefix(&mut doc); + doc +} + struct SecurityAddon; impl utoipa::Modify for SecurityAddon { @@ -143,11 +151,10 @@ const SERVER_SOURCE_VERSION: Option<&str> = option_env!("OMNIGRAPH_SOURCE_VERSIO #[derive(Debug, Clone)] pub struct ServerConfig { - /// Server topology + the graphs to open at startup. Single-mode - /// invocations (`omnigraph-server ` or `--target `) - /// produce `ServerConfigMode::Single`; multi-mode invocations - /// (`--config omnigraph.yaml` with a non-empty `graphs:` map and - /// no single-mode selector) produce `ServerConfigMode::Multi`. + /// Server topology + the graphs to open at startup. RFC-011 + /// cluster-only: the server always boots from a cluster + /// (`--cluster `) and serves N graphs under cluster + /// routes. pub mode: ServerConfigMode, pub bind: String, /// Operator opt-in for fully-unauthenticated dev mode (MR-723). @@ -161,49 +168,33 @@ pub struct ServerConfig { pub allow_unauthenticated: bool, } -/// What `load_server_settings` produces after applying the four-rule -/// mode inference matrix (MR-668 decision 2). +/// What `load_server_settings` produces. 
RFC-011 cluster-only: the +/// server always boots from a cluster's applied revision into a +/// multi-graph deployment (N ≥ 1 graphs). #[derive(Debug, Clone)] pub enum ServerConfigMode { - /// Legacy invocation — one graph at the given URI. Either: - /// * `omnigraph-server ` (CLI positional), or - /// * `omnigraph-server --target --config omnigraph.yaml`, or - /// * `omnigraph-server --config omnigraph.yaml` with `server.graph` - /// set to a named target. - Single { - uri: String, - /// Cedar graph resource id for the single graph. A named selection - /// uses the graph name; an anonymous URI uses the normalized URI to - /// preserve legacy single-graph policy identity. - graph_id: String, - /// Top-level `policy.file` (single-graph Cedar policy). - policy_file: Option, - /// Top-level stored-query registry, loaded and identity-checked - /// at settings-build time; type-checked against the schema when - /// the engine opens. - queries: QueryRegistry, - }, - /// Multi-graph invocation — `--config omnigraph.yaml` with a - /// non-empty `graphs:` map and no single-mode selector. + /// Cluster boot — `--cluster ` resolves the applied + /// revision into per-graph startup configs plus an optional + /// server-level policy. Multi { /// Per-graph startup configs, sorted by graph id (BTreeMap /// iteration order). The parallel-open loop iterates this. graphs: Vec, - /// Path to the config file the server was started from. Kept on - /// the mode so future runtime mutation (deferred — see release - /// notes) can locate the source of truth without re-parsing CLI - /// args. + /// The cluster boot source (config directory or storage root). + /// Kept on the mode so future runtime mutation (deferred — see + /// release notes) can locate the source of truth without + /// re-parsing CLI args. config_path: PathBuf, - /// `server.policy.file` (server-level Cedar policy for the - /// management endpoints). Wired into `GET /graphs` authorization. 
+ /// Server-level Cedar policy for the management endpoints + /// (`GET /graphs`). Wired into `GET /graphs` authorization. server_policy: Option, }, } -/// Where a Cedar policy bundle comes from at startup. File-based for -/// omnigraph.yaml deployments; inline (digest-verified catalog content) -/// for cluster-mode boots, where the catalog may live on object storage -/// and the server must not re-read mutable state after the snapshot. +/// Where a Cedar policy bundle comes from at startup. Cluster-local files are +/// used during config application; inline digest-verified catalog content is +/// used for serving, where the catalog may live on object storage and the +/// server must not re-read mutable state after the snapshot. #[derive(Debug, Clone)] pub enum PolicySource { File(PathBuf), @@ -227,36 +218,25 @@ pub struct GraphStartupConfig { pub queries: QueryRegistry, } -/// Runtime routing for the server. Single mode = legacy -/// `omnigraph-server ` invocation, one graph, flat HTTP routes. -/// Multi mode = `--config omnigraph.yaml` with a non-empty `graphs:` -/// map, N graphs, cluster routes (`/graphs/{graph_id}/...`). Mode is -/// determined at startup by `load_server_settings`. +/// Runtime routing for the server (RFC-011 cluster-only). Every +/// deployment serves cluster routes (`/graphs/{graph_id}/...`) backed by +/// a registry of N graphs (N ≥ 1). The single-graph convenience +/// constructors build a one-graph registry keyed by `default`; the +/// cluster boot path builds an N-graph registry. There is no longer a +/// flat-route mode. /// -/// In single mode the handle lives here directly — there is no -/// registry, no sentinel key, no walk-and-assert. In multi mode the -/// registry carries N handles and the middleware dispatches on the -/// URL's `{graph_id}` segment. 
+/// `config_path` is the boot source (the cluster directory or storage +/// root); preserved here so future runtime mutation (deferred) can find +/// the source of truth without re-parsing CLI args. The server treats +/// the source as operator-owned and never writes it. /// -/// Both modes share the same handler bodies — the routing middleware +/// All handler bodies are mode-agnostic — the routing middleware /// (`resolve_graph_handle`) injects `Arc` as a request -/// extension so handlers never see the routing discriminator. +/// extension by looking up the `{graph_id}` URL segment in the registry. #[derive(Clone)] -pub enum GraphRouting { - /// Single-graph deployment: one handle, flat routes (`/snapshot`, - /// `/read`, …). The `handle.uri` field carries the URI the engine - /// was opened from. Backward compatible with v0.6.0 deployments. - Single { handle: Arc }, - /// Multi-graph deployment: many handles, cluster routes - /// (`/graphs/{graph_id}/...`). `config_path` is the `omnigraph.yaml` - /// the server reads at startup; preserved here so future runtime - /// mutation (deferred) can find the source of truth without - /// re-parsing CLI args. The server treats the file as - /// operator-owned and never writes it. - Multi { - registry: Arc, - config_path: Option, - }, +pub struct GraphRouting { + pub registry: Arc, + pub config_path: Option, } #[derive(Clone)] @@ -272,12 +252,10 @@ pub struct AppState { /// see MR-668 decision Q6. workload: Arc, bearer_tokens: Arc<[(BearerTokenHash, Arc)]>, - /// Server-level Cedar policy. Used by management endpoints (`POST - /// /graphs`, `GET /graphs`) which act on the registry resource, - /// not on a per-graph resource. Loaded from `server.policy.file` - /// in `omnigraph.yaml`. `None` outside multi mode and when no - /// server policy is configured. Per-graph policies live on each - /// `GraphHandle.policy`. + /// Server-level Cedar policy. 
Used by management endpoints (`GET + /// /graphs`) which act on the registry resource, not on a per-graph + /// resource. Loaded from the cluster-scoped policy binding when + /// configured. Per-graph policies live on each `GraphHandle.policy`. server_policy: Option>, } @@ -502,11 +480,13 @@ impl AppState { )) } - /// Single-mode shared construction: wraps the bare engine + per-graph - /// policy in a `GraphHandle` carried directly by `GraphRouting::Single`. - /// Per-graph policy enforcement on the engine (MR-722) is re-applied - /// via `Omnigraph::with_policy` so HTTP and engine layers can never - /// diverge. + /// Single-graph convenience construction (RFC-011 cluster-only): + /// wraps the bare engine + per-graph policy in a `GraphHandle` keyed + /// by `default`, then builds a one-graph registry so the deployment + /// serves the same `/graphs/{graph_id}/...` cluster routes as any + /// other. Per-graph policy enforcement on the engine (MR-722) is + /// re-applied via `Omnigraph::with_policy` so HTTP and engine layers + /// can never diverge. fn build_single_mode( uri: String, db: Omnigraph, @@ -525,18 +505,13 @@ impl AppState { } else { db }; - // `GraphHandle.key` is required by the struct, but in single - // mode it is never a registry key (there's no registry) and - // never compared against user input (routes are flat, no - // `{graph_id}` parameter). The label appears only in tracing - // output from `resolve_graph_handle`. The literal below is a - // log label, not a routing key — when the future cluster - // catalog ships, single mode may carry the catalog-assigned - // id here instead. + // The convenience constructors address the single graph by the + // reserved id `default` — both the registry key and the URL + // segment (`/graphs/default/...`). 
let uri = normalize_root_uri(&uri).unwrap_or(uri); - let key = GraphKey::cluster( - GraphId::try_from("default").expect("'default' is a valid GraphId log label"), - ); + let graph_id = + GraphId::try_from("default").expect("'default' is a valid GraphId"); + let key = GraphKey::cluster(graph_id); let handle = Arc::new(GraphHandle { key, uri, @@ -544,8 +519,15 @@ impl AppState { policy: policy_engine, queries, }); + let registry = Arc::new( + GraphRegistry::from_handles(vec![handle]) + .expect("a single handle never collides on graph id"), + ); Self { - routing: GraphRouting::Single { handle }, + routing: GraphRouting { + registry, + config_path: None, + }, workload, bearer_tokens, server_policy: None, @@ -553,12 +535,11 @@ impl AppState { } /// Multi-mode constructor — used by the startup loop. Operators - /// reach this by invoking `omnigraph-server --config omnigraph.yaml` - /// with a non-empty `graphs:` map. + /// reach this by invoking `omnigraph-server --cluster `. /// /// Caller supplies the already-opened `GraphHandle`s and (optionally) - /// the path to the source config file. `server_policy` is loaded - /// from `server.policy.file` if configured. + /// the path to the source cluster. `server_policy` is loaded from the + /// cluster-scoped policy binding if configured. pub fn new_multi( handles: Vec>, bearer_tokens: Vec<(String, String)>, @@ -569,7 +550,7 @@ impl AppState { let bearer_tokens = hash_bearer_tokens(bearer_tokens); let registry = Arc::new(GraphRegistry::from_handles(handles)?); Ok(Self { - routing: GraphRouting::Multi { + routing: GraphRouting { registry, config_path, }, @@ -581,9 +562,7 @@ impl AppState { /// Runtime routing accessor. Handlers don't typically inspect this — /// they extract `Arc` via the routing middleware — but - /// `build_app` matches on it to decide flat vs nested route - /// mounting, and a handful of management endpoints (`GET /graphs`, - /// the OpenAPI cluster rewrite) match on the discriminant. 
+ /// `server_graphs_list` reads the registry through it. pub fn routing(&self) -> &GraphRouting { &self.routing } @@ -597,13 +576,9 @@ impl AppState { } // Any per-graph policy also requires auth — otherwise the // policy gate would receive unauthenticated requests. Reading - // from `routing` is O(1) in both arms: single mode is a direct - // `handle.policy.is_some()` check, multi mode reads the - // cached `any_per_graph_policy` flag on the registry snapshot. - match &self.routing { - GraphRouting::Single { handle } => handle.policy.is_some(), - GraphRouting::Multi { registry, .. } => registry.snapshot_ref().any_per_graph_policy, - } + // the cached `any_per_graph_policy` flag off the registry + // snapshot is O(1). + self.routing.registry.snapshot_ref().any_per_graph_policy } fn authenticate_bearer_token(&self, provided_token: &str) -> Option { @@ -898,18 +873,6 @@ fn validate_and_attach( }) } -/// Format every load error (parse / identity failure) into a multi-line -/// boot-abort message. -fn format_registry_load_errors(label: &str, errors: &[queries::LoadError]) -> String { - let joined = errors - .iter() - .map(|e| e.to_string()) - .collect::>() - .join("\n "); - format!("graph '{label}': stored-query registry failed to load:\n {joined}") -} - - pub fn build_app(state: AppState) -> Router { // The per-graph protected routes, identical in single + multi mode. // Two middleware layers wrap them (outer first, inner last): @@ -975,13 +938,9 @@ pub fn build_app(state: AppState) -> Router { // Management endpoints (`GET /graphs`) live alongside the per-graph // router. They go through bearer auth but NOT through // `resolve_graph_handle` — they operate on the registry directly. - // The endpoint is mounted in both modes; in single mode the handler - // returns 405 so clients see "resource exists, wrong context" - // rather than 404 "no such resource." 
// // Runtime add/remove (`POST /graphs`, `DELETE /graphs/{id}`) is not - // exposed in v0.6.0 — operators add graphs by editing - // `omnigraph.yaml` and restarting. + // exposed — operators run `cluster apply` and restart. let management = Router::new() .route("/graphs", get(server_graphs_list)) .route_layer(middleware::from_fn_with_state( @@ -989,15 +948,11 @@ pub fn build_app(state: AppState) -> Router { require_bearer_auth, )); - // Mount the protected routes differently per mode: - // * Single → flat routes (legacy: `/snapshot`, `/read`, etc.) - // * Multi → nested under `/graphs/{graph_id}/...` - let protected: Router = match state.routing() { - GraphRouting::Single { .. } => per_graph_protected.merge(management), - GraphRouting::Multi { .. } => Router::new() - .nest("/graphs/{graph_id}", per_graph_protected) - .merge(management), - }; + // RFC-011 cluster-only: per-graph routes always nest under + // `/graphs/{graph_id}/...`; there are no flat single-graph routes. + let protected: Router = Router::new() + .nest("/graphs/{graph_id}", per_graph_protected) + .merge(management); Router::new() .route("/healthz", get(server_health)) @@ -1018,7 +973,6 @@ pub async fn serve(config: ServerConfig) -> Result<()> { // policy OR any per-graph policy file. Mirrors the // `requires_bearer_auth` semantics on AppState. let has_policy_configured = match &config.mode { - ServerConfigMode::Single { policy_file, .. } => policy_file.is_some(), ServerConfigMode::Multi { graphs, server_policy, @@ -1039,36 +993,14 @@ pub async fn serve(config: ServerConfig) -> Result<()> { ServerRuntimeState::DefaultDeny => warn!( "bearer tokens are configured but no policy file is set — running in \ default-deny mode (only `read` actions are permitted for authenticated \ - actors). Configure `policy.file` in omnigraph.yaml to enable Cedar rules." + actors). Configure a graph or cluster policy bundle in the cluster config, \ + run `omnigraph cluster apply`, and restart to enable Cedar rules." 
), ServerRuntimeState::PolicyEnabled => {} } let bind = config.bind.clone(); let state = match config.mode { - ServerConfigMode::Single { - uri, - graph_id, - policy_file, - queries, - } => { - let uri_for_log = uri.clone(); - info!( - uri = %uri_for_log, - graph_id = %graph_id, - bind = %bind, - mode = "single", - "serving omnigraph" - ); - AppState::open_single_with_queries_for_graph_id( - uri, - tokens, - policy_file.as_ref(), - queries, - Some(graph_id), - ) - .await? - } ServerConfigMode::Multi { graphs, config_path, @@ -1076,7 +1008,7 @@ pub async fn serve(config: ServerConfig) -> Result<()> { } => { info!( bind = %bind, - mode = "multi", + mode = "cluster", graph_count = graphs.len(), config = %config_path.display(), "serving omnigraph" @@ -1197,4 +1129,3 @@ async fn shutdown_signal() { } info!("shutdown signal received"); } - diff --git a/crates/omnigraph-server/src/main.rs b/crates/omnigraph-server/src/main.rs index a138d12..482c9af 100644 --- a/crates/omnigraph-server/src/main.rs +++ b/crates/omnigraph-server/src/main.rs @@ -8,16 +8,10 @@ use omnigraph_server::{ServerConfig, init_tracing, load_server_settings, serve}; #[command(name = "omnigraph-server")] #[command(about = "HTTP server for the Omnigraph graph database")] struct Cli { - /// Graph URI - uri: Option, - #[arg(long)] - target: Option, - #[arg(long)] - config: Option, /// Boot from a cluster: either a config directory (storage resolved /// through cluster.yaml) or a storage-root URI directly /// (s3://bucket/prefix — config-free serving from the bucket). - /// Exclusive: cannot combine with , --target, or --config. + /// The server's only boot source (RFC-011 cluster-only). 
#[arg(long)] cluster: Option, #[arg(long)] @@ -36,14 +30,7 @@ async fn main() -> Result<()> { init_tracing(); let cli = Cli::parse(); - let settings: ServerConfig = load_server_settings( - cli.config.as_ref(), - cli.cluster.as_ref(), - cli.uri, - cli.target, - cli.bind, - cli.unauthenticated, - ) - .await?; + let settings: ServerConfig = + load_server_settings(cli.cluster.as_ref(), cli.bind, cli.unauthenticated).await?; serve(settings).await } diff --git a/crates/omnigraph-server/src/queries.rs b/crates/omnigraph-server/src/queries.rs index bf131c8..09d2491 100644 --- a/crates/omnigraph-server/src/queries.rs +++ b/crates/omnigraph-server/src/queries.rs @@ -13,7 +13,6 @@ //! Renaming either is a breaking change to callers, by design. use std::collections::BTreeMap; -use std::fs; use std::sync::Arc; use omnigraph_compiler::catalog::Catalog; @@ -22,8 +21,6 @@ use omnigraph_compiler::query::parser::parse_query; use omnigraph_compiler::query::typecheck::typecheck_query_decl; use omnigraph_compiler::types::{PropType, ScalarType}; -use crate::config::{OmnigraphConfig, QueryEntry}; - /// One loaded stored query. `source` is the full `.gq` file text — the /// invocation handler hands it to `run_query` / `run_mutate` verbatim, /// which reuse the same parse/IR/exec path as the inline routes (no @@ -68,8 +65,9 @@ pub struct QueryRegistry { by_name: BTreeMap, } -/// In-memory registry entry before file I/O. Used by [`QueryRegistry::load`] -/// (after reading each `.gq` from disk) and directly by tests. +/// In-memory registry spec: a query's name + already-read `.gq` source. The +/// input to [`QueryRegistry::from_specs`] — built by the server's cluster boot +/// and by the CLI's `queries` tooling from a cluster serving snapshot. #[derive(Debug, Clone)] pub struct RegistrySpec { pub name: String, @@ -169,47 +167,6 @@ impl QueryRegistry { } } - /// Read each registry entry's `.gq` file from disk and build the - /// registry. 
`entries` is either the top-level `queries` map (single - /// mode) or a graph's `queries` map (multi mode); `config` resolves - /// each entry's relative `file:` path against `base_dir`. - pub fn load( - config: &OmnigraphConfig, - entries: &BTreeMap, - ) -> Result> { - let mut specs = Vec::with_capacity(entries.len()); - let mut errors = Vec::new(); - for (name, entry) in entries { - let path = config.resolve_query_file(&entry.file); - match fs::read_to_string(&path) { - Ok(source) => specs.push(RegistrySpec { - name: name.clone(), - source, - expose: entry.mcp.expose, - tool_name: entry.mcp.tool_name.clone(), - }), - Err(err) => errors.push(LoadError { - query: Some(name.clone()), - message: format!("cannot read '{}': {err}", path.display()), - }), - } - } - - // Parse/identity/uniqueness-check the readable specs even when some - // files failed to read, so every broken entry (I/O, parse, identity, - // tool-name collision) surfaces in one pass rather than one per - // restart. I/O errors come first (in `entries` key order), then the - // spec errors. A non-empty `errors` always fails the load. - match Self::from_specs(specs) { - Ok(registry) if errors.is_empty() => Ok(registry), - Ok(_) => Err(errors), - Err(spec_errors) => { - errors.extend(spec_errors); - Err(errors) - } - } - } - pub fn lookup(&self, name: &str) -> Option<&StoredQuery> { self.by_name.get(name) } @@ -653,36 +610,4 @@ embedding: Vector(4) assert!(entry2.params.is_empty(), "no declared params → empty list"); } - // --- load() error collection (file I/O + parse in one pass) --- - - #[test] - fn load_collects_io_and_parse_errors_in_one_pass() { - use crate::config::load_config; - let temp = tempfile::tempdir().unwrap(); - std::fs::write( - temp.path().join("good.gq"), - "query good() { match { $u: User } return { $u.name } }", - ) - .unwrap(); - std::fs::write(temp.path().join("broken.gq"), "query broken( {{ not valid").unwrap(); - // `missing.gq` is deliberately not written (an I/O failure). 
- std::fs::write( - temp.path().join("omnigraph.yaml"), - "queries:\n good:\n file: ./good.gq\n \ - missing:\n file: ./missing.gq\n broken:\n file: ./broken.gq\n", - ) - .unwrap(); - let config = load_config(Some(&temp.path().join("omnigraph.yaml"))).unwrap(); - - let errors = QueryRegistry::load(&config, config.query_entries()).unwrap_err(); - let joined = errors.iter().map(|e| e.to_string()).collect::>().join("\n"); - // Both the missing file AND the parse error surface in one pass — - // the I/O failure must not mask the parse failure. - assert!(joined.contains("missing"), "I/O error must surface: {joined}"); - assert!( - joined.contains("broken") && joined.contains("parse error"), - "the parse error in a readable file must surface in the same pass: {joined}" - ); - assert!(!joined.contains("'good'"), "the valid entry is not an error: {joined}"); - } } diff --git a/crates/omnigraph-server/src/settings.rs b/crates/omnigraph-server/src/settings.rs index 0054e03..338400a 100644 --- a/crates/omnigraph-server/src/settings.rs +++ b/crates/omnigraph-server/src/settings.rs @@ -1,14 +1,13 @@ -//! Server settings: omnigraph.yaml/CLI/env resolution, mode inference -//! (single vs multi vs cluster), bearer-token sources, and runtime-state -//! classification (moved verbatim from lib.rs in the modularization). +//! Server settings: cluster/CLI/env resolution, bearer-token sources, and +//! runtime-state classification (moved verbatim from lib.rs in the +//! modularization). use super::*; /// Build serving settings from a cluster directory's applied revision /// (RFC-005 §D2): graphs at derived roots, stored queries from verified /// catalog blob content, policy bundles from blob paths with their applied -/// bindings. Always multi-graph routing. The unauthenticated/env handling -/// matches the omnigraph.yaml path. +/// bindings. Always multi-graph routing. 
pub(crate) async fn load_cluster_settings( cluster_dir: &PathBuf, cli_bind: Option, @@ -131,163 +130,24 @@ pub(crate) async fn load_cluster_settings( }) } +/// RFC-011 cluster-only boot: the server serves exclusively from a +/// cluster's applied revision (`--cluster `). The legacy +/// omnigraph.yaml / `--target` / positional-URI single-graph boot paths +/// were removed — a deployment serves from exactly one source. pub async fn load_server_settings( - config_path: Option<&PathBuf>, cli_cluster: Option<&PathBuf>, - cli_uri: Option, - cli_target: Option, cli_bind: Option, cli_allow_unauthenticated: bool, ) -> Result { - // Rule 0 (RFC-005): --cluster is an exclusive boot source. It is checked - // before anything reads omnigraph.yaml — in cluster mode that file is - // never opened, not even the implicit current-directory search. - if let Some(cluster_dir) = cli_cluster { - if cli_uri.is_some() || cli_target.is_some() || config_path.is_some() { - bail!( - "--cluster is an exclusive boot source; it cannot combine with a graph URI, --target, or --config (axiom 15: a deployment serves from one source)" - ); - } - return load_cluster_settings(cluster_dir, cli_bind, cli_allow_unauthenticated).await; - } - let config = load_config(config_path)?; - let bind = cli_bind.unwrap_or_else(|| config.server_bind().to_string()); - // Either `--unauthenticated` or `OMNIGRAPH_UNAUTHENTICATED=1` flips - // this. Treat any non-empty, non-"0"/"false" string as truthy — - // standard 12-factor "any value is true" reading of the env var. - let env_unauth = std::env::var("OMNIGRAPH_UNAUTHENTICATED") - .ok() - .map(|v| { - let trimmed = v.trim(); - !trimmed.is_empty() && trimmed != "0" && !trimmed.eq_ignore_ascii_case("false") - }) - .unwrap_or(false); - let allow_unauthenticated = cli_allow_unauthenticated || env_unauth; - - // MR-668 decision 2 — four-rule mode inference matrix. - // - // 1. CLI `` positional → Single (URI = the value) - // 2. 
CLI `--target ` → Single (URI = graphs..uri) - // 3. `server.graph` in config → Single (URI = graphs..uri) - // 4. `--config` + non-empty `graphs:` + no single-mode selector - // → Multi (every entry in `graphs:`) - // 5. otherwise → error with migration hint - // - // Rules 1-3 are mutually compatible (CLI URI wins over `--target` - // wins over `server.graph`), reusing the existing - // `resolve_target_uri` precedence. - let has_cli_uri = cli_uri.is_some(); - let has_cli_target = cli_target.is_some(); - let has_server_graph = config.server_graph_name().is_some(); - let has_graphs_map = !config.graphs.is_empty(); - let has_explicit_config = config_path.is_some(); - - let mode = if has_cli_uri || has_cli_target || has_server_graph { - // Rules 1, 2, or 3 → Single mode. - let raw_uri = config.resolve_target_uri( - cli_uri, - cli_target.as_deref(), - config.server_graph_name(), - )?; - let uri = normalize_root_uri(&raw_uri).wrap_err_with(|| { - format!("normalize single-graph URI '{raw_uri}' from server settings") - })?; - // Config follows graph IDENTITY, not mode: a bare URI is anonymous - // (top-level config); a graph chosen by name uses its per-graph - // `graphs..{policy,queries}`. `resolve_target_uri` already - // errored on an unknown name, so a `Some(name)` here is a known graph. - let selected: Option<&str> = if has_cli_uri { - None - } else { - cli_target.as_deref().or_else(|| config.server_graph_name()) - }; - // A named selection must not leave a populated top-level block - // silently unused — refuse boot and point at the per-graph block. The - // same rule the CLI selection gate enforces, shared via one helper so - // the boot check and `omnigraph queries validate`/`list` can't drift. - config.ensure_top_level_blocks_honored(selected)?; - // Load + identity-check now (no engine needed); the schema - // type-check happens when the engine opens. 
- let policy_file = config.resolve_policy_file_for(selected); - let queries = QueryRegistry::load(&config, config.query_entries_for(selected)) - .map_err(|errs| color_eyre::eyre::eyre!(format_registry_load_errors(&uri, &errs)))?; - let graph_id = graph_resource_id_for_selection(selected, &uri); - ServerConfigMode::Single { - uri, - graph_id, - policy_file, - queries, - } - } else if has_explicit_config && has_graphs_map { - // Multi mode: every graph uses its per-graph block; top-level - // policy/queries are never honored, so a populated one is an error. - let unhonored = config.populated_top_level_blocks(); - if !unhonored.is_empty() { - bail!( - "multi-graph mode: top-level {} {} not honored — each graph uses its own \ - `graphs..…` block. Move per-graph rules there (and any \ - `graph_list` policy to `server.policy.file`).", - unhonored.join(" and "), - if unhonored.len() == 1 { "is" } else { "are" }, - ); - } - // Rule 4 → Multi mode. Build a startup config per graph. - let mut graphs = Vec::with_capacity(config.graphs.len()); - for (name, target) in &config.graphs { - // Validate the graph id can construct a `GraphId` newtype. - // Doing this here (not at registry insert) so a malformed - // omnigraph.yaml fails at startup with a clear error. - GraphId::try_from(name.clone()).map_err(|err| { - color_eyre::eyre::eyre!("invalid graph id '{name}' in omnigraph.yaml: {err}") - })?; - let raw_uri = config.resolve_uri_value(&target.uri); - let uri = normalize_root_uri(&raw_uri).wrap_err_with(|| { - format!("normalize URI '{raw_uri}' for graph '{name}' in omnigraph.yaml") - })?; - // Per-graph `queries:`, selected through the shared - // `query_entries_for` so server and CLI resolve identically. - // Load + identity-check now; the schema type-check happens - // when this graph's engine opens. 
- let queries = QueryRegistry::load(&config, config.query_entries_for(Some(name.as_str()))) - .map_err(|errs| color_eyre::eyre::eyre!(format_registry_load_errors(name, &errs)))?; - graphs.push(GraphStartupConfig { - graph_id: name.clone(), - uri, - policy: config.resolve_target_policy_file(name).map(PolicySource::File), - embedding: None, - queries, - }); - } - let config_path = config_path - .cloned() - .expect("has_explicit_config implies config_path is Some"); - let server_policy = config.resolve_server_policy_file().map(PolicySource::File); - ServerConfigMode::Multi { - graphs, - config_path, - server_policy, - } - } else { - // Rule 5 → error with migration hint. + let Some(cluster_dir) = cli_cluster else { bail!( - "no graph to serve: pass a URI (`omnigraph-server `), select a target \ - (`--target --config omnigraph.yaml`), set `server.graph: ` in \ - omnigraph.yaml, or for multi-graph mode add a `graphs:` map to the config \ - file referenced by `--config`." + "omnigraph-server boots from a cluster: pass --cluster \ + (the cluster's applied revision is the deployment artifact). The legacy \ + single-graph boot (positional , --target, --config omnigraph.yaml) \ + was removed in RFC-011." ); }; - - Ok(ServerConfig { - mode, - bind, - allow_unauthenticated, - }) -} - -/// Whether the loaded config will run the server in multi-graph mode. -/// Useful for the test that constructs `ServerConfig` directly. -pub fn server_config_is_multi(config: &ServerConfig) -> bool { - matches!(config.mode, ServerConfigMode::Multi { .. }) + load_cluster_settings(cluster_dir, cli_bind, cli_allow_unauthenticated).await } /// MR-723 server runtime state, classified from the three-state matrix @@ -337,7 +197,8 @@ pub fn classify_server_runtime_state( "server has no bearer tokens and no policy file configured. 
This is a fully \ open server — pass `--unauthenticated` (or set OMNIGRAPH_UNAUTHENTICATED=1) \ if you actually want that, otherwise configure bearer tokens (see \ - docs/user/operations/server.md) and/or `policy.file` in omnigraph.yaml." + docs/user/operations/server.md) and a graph or cluster policy bundle in \ + the cluster config, then run `omnigraph cluster apply` and restart." ), (false, false, true) => Ok(ServerRuntimeState::Open), (true, false, _) => Ok(ServerRuntimeState::DefaultDeny), @@ -427,8 +288,8 @@ pub(crate) fn server_bearer_tokens_from_env() -> Result> { mod tests { use super::{ GraphStartupConfig, ServerConfig, ServerConfigMode, ServerRuntimeState, - classify_server_runtime_state, hash_bearer_token, load_server_settings, - normalize_bearer_token, parse_bearer_tokens_json, serve, server_bearer_tokens_from_env, + classify_server_runtime_state, hash_bearer_token, normalize_bearer_token, + parse_bearer_tokens_json, serve, server_bearer_tokens_from_env, }; use serial_test::serial; use std::env; @@ -587,108 +448,15 @@ mod tests { } #[tokio::test] - async fn server_settings_load_from_yaml_config() { - let temp = tempdir().unwrap(); - let config = temp.path().join("omnigraph.yaml"); - fs::write( - &config, - r#" -graphs: - local: - uri: /tmp/demo.omni -server: - graph: local - bind: 0.0.0.0:9090 -"#, - ) - .unwrap(); - - let settings = load_server_settings(Some(&config), None, None, None, None, false).await.unwrap(); - match &settings.mode { - ServerConfigMode::Single { uri, graph_id, .. } => { - assert_eq!(uri, "/tmp/demo.omni"); - assert_eq!(graph_id, "local"); - } - ServerConfigMode::Multi { .. 
} => panic!("expected Single mode, got Multi"), - } - assert_eq!(settings.bind, "0.0.0.0:9090"); - } - - #[tokio::test] - async fn server_settings_cli_flags_override_yaml_config() { - let temp = tempdir().unwrap(); - let config = temp.path().join("omnigraph.yaml"); - fs::write( - &config, - r#" -graphs: - local: - uri: /tmp/demo.omni -server: - graph: local - bind: 127.0.0.1:8080 -"#, - ) - .unwrap(); - - let settings = load_server_settings( - Some(&config), - None, - Some("/tmp/override.omni".to_string()), - None, - Some("0.0.0.0:9999".to_string()), - false, - ) - .await - .unwrap(); - match &settings.mode { - ServerConfigMode::Single { uri, graph_id, .. } => { - assert_eq!(uri, "/tmp/override.omni"); - assert_eq!(graph_id, "/tmp/override.omni"); - } - ServerConfigMode::Multi { .. } => panic!("expected Single mode, got Multi"), - } - assert_eq!(settings.bind, "0.0.0.0:9999"); - } - - #[tokio::test] - async fn server_settings_can_resolve_named_target() { - let temp = tempdir().unwrap(); - let config = temp.path().join("omnigraph.yaml"); - fs::write( - &config, - r#" -graphs: - local: - uri: ./demo.omni - dev: - uri: http://127.0.0.1:8080 -server: - graph: local - bind: 127.0.0.1:8080 -"#, - ) - .unwrap(); - - let settings = - load_server_settings(Some(&config), None, None, Some("dev".to_string()), None, false) - .await - .unwrap(); - match &settings.mode { - ServerConfigMode::Single { uri, graph_id, .. } => { - assert_eq!(uri, "http://127.0.0.1:8080"); - assert_eq!(graph_id, "dev"); - } - ServerConfigMode::Multi { .. } => panic!("expected Single mode, got Multi"), - } - } - - #[tokio::test] - async fn server_settings_require_uri_from_cli_or_config() { - let error = load_server_settings(None, None, None, None, None, false).await.unwrap_err(); + async fn server_settings_require_cluster_boot_source() { + // RFC-011 cluster-only: with no --cluster the server refuses to + // start and names the cluster-required remedy. 
+ let error = super::load_server_settings(None, None, false) + .await + .unwrap_err(); assert!( - error.to_string().contains("no graph to serve"), - "expected mode-inference error, got: {error}", + error.to_string().contains("boots from a cluster"), + "expected cluster-required error, got: {error}", ); } @@ -799,17 +567,21 @@ server: ]); let temp = tempdir().unwrap(); // Graph path doesn't need to exist — classifier fires before - // `AppState::open_with_bearer_tokens_and_policy`. + // any engine open. let config = ServerConfig { - mode: ServerConfigMode::Single { - uri: temp - .path() - .join("graph.omni") - .to_string_lossy() - .into_owned(), - graph_id: "default".to_string(), - policy_file: None, - queries: crate::queries::QueryRegistry::default(), + mode: ServerConfigMode::Multi { + graphs: vec![GraphStartupConfig { + graph_id: "default".to_string(), + uri: temp + .path() + .join("graph.omni") + .to_string_lossy() + .into_owned(), + policy: None, + queries: crate::queries::QueryRegistry::default(), + }], + config_path: temp.path().join("cluster"), + server_policy: None, }, bind: "127.0.0.1:0".to_string(), allow_unauthenticated: false, @@ -824,75 +596,6 @@ server: ); } - #[tokio::test] - #[serial] - async fn unauthenticated_env_var_classification() { - // MR-723 PR A: closes the gap where the env-var read path inside - // `load_server_settings` was structurally implemented but not - // exercised by any test. Three properties to pin, all in one - // sequential test because `cargo test` runs the mod test suite - // in parallel and `OMNIGRAPH_UNAUTHENTICATED` is process-global - // — interleaving with another test that sets the same env var - // (concurrent classifier tests, even the bearer-token suite - // sharing `EnvGuard`) corrupts the read. Sequential within one - // test fn is the simplest race-free shape. 
- let temp = tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -graphs: - local: - uri: /tmp/demo-unauth.omni -server: - graph: local -"#, - ) - .unwrap(); - - // Truthy values flip Open mode on, even with CLI flag off. - for value in ["1", "true", "yes", "TRUE", "anything"] { - let _guard = EnvGuard::set(&[("OMNIGRAPH_UNAUTHENTICATED", Some(value))]); - let settings = load_server_settings(Some(&config_path), None, None, None, None, false).await - .expect("settings load should succeed"); - assert!( - settings.allow_unauthenticated, - "OMNIGRAPH_UNAUTHENTICATED={value:?} should enable Open mode", - ); - } - - // Falsy values keep refusal behavior, even with CLI flag off. - for value in ["0", "false", "FALSE", ""] { - let _guard = EnvGuard::set(&[("OMNIGRAPH_UNAUTHENTICATED", Some(value))]); - let settings = load_server_settings(Some(&config_path), None, None, None, None, false).await - .expect("settings load should succeed"); - assert!( - !settings.allow_unauthenticated, - "OMNIGRAPH_UNAUTHENTICATED={value:?} should NOT enable Open mode", - ); - } - - // Unset env var: also false. - let _guard = EnvGuard::set(&[("OMNIGRAPH_UNAUTHENTICATED", None)]); - let settings = load_server_settings(Some(&config_path), None, None, None, None, false).await - .expect("settings load should succeed"); - assert!( - !settings.allow_unauthenticated, - "OMNIGRAPH_UNAUTHENTICATED unset should NOT enable Open mode", - ); - drop(_guard); - - // CLI flag wins even when env is falsy — `serve()` honors the - // OR of both inputs. 
- let _guard = EnvGuard::set(&[("OMNIGRAPH_UNAUTHENTICATED", Some("0"))]); - let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await - .expect("settings load should succeed"); - assert!( - settings.allow_unauthenticated, - "--unauthenticated CLI flag should win even when env is falsy", - ); - } - #[test] fn classify_policy_enabled_requires_tokens() { // State 3: tokens + policy → PolicyEnabled, regardless of the diff --git a/crates/omnigraph-server/tests/auth_policy.rs b/crates/omnigraph-server/tests/auth_policy.rs index 05c0c56..5cbbb97 100644 --- a/crates/omnigraph-server/tests/auth_policy.rs +++ b/crates/omnigraph-server/tests/auth_policy.rs @@ -50,7 +50,7 @@ async fn protected_routes_require_bearer_token() { let (status, body) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -85,7 +85,7 @@ async fn protected_routes_accept_valid_bearer_token_while_healthz_stays_open() { let (status, body) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::GET) .header("authorization", "Bearer demo-token") .body(Body::empty()) @@ -108,7 +108,7 @@ async fn protected_routes_accept_any_configured_team_bearer_token() { let (status, body) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::GET) .header("authorization", "Bearer token-two") .body(Body::empty()) @@ -158,7 +158,7 @@ rules: let (ok_status, _) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .header("authorization", "Bearer token-a") .body(Body::empty()) @@ -172,7 +172,7 @@ rules: let (denied_status, denied_body) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .header("authorization", "Bearer token-b") .body(Body::empty()) @@ -190,7 +190,7 
@@ rules: let (bad_status, _) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .header("authorization", "Bearer wrong-token") .body(Body::empty()) @@ -245,7 +245,7 @@ rules: let (spoof_up_status, spoof_up_body) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .header("authorization", "Bearer token-b") .header("x-actor-id", "act-a") @@ -270,7 +270,7 @@ rules: let (spoof_down_status, _) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .header("authorization", "Bearer token-a") .header("x-actor-id", "act-b") @@ -290,7 +290,7 @@ rules: let (empty_spoof_status, _) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .header("authorization", "Bearer token-b") .header("x-actor-id", "") @@ -316,7 +316,7 @@ async fn policy_allows_read_but_distinguishes_401_from_403() { let (missing_status, missing_body) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -332,7 +332,7 @@ async fn policy_allows_read_but_distinguishes_401_from_403() { let (snapshot_status, snapshot_body) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .header("authorization", "Bearer team-token") .body(Body::empty()) @@ -350,7 +350,7 @@ async fn policy_allows_read_but_distinguishes_401_from_403() { let (forbidden_status, forbidden_body) = json_response( &app, Request::builder() - .uri("/export") + .uri(g("/export")) .method(Method::POST) .header("authorization", "Bearer team-token") .header("content-type", "application/json") @@ -369,7 +369,7 @@ async fn 
policy_allows_read_but_distinguishes_401_from_403() { .clone() .oneshot( Request::builder() - .uri("/export") + .uri(g("/export")) .method(Method::POST) .header("authorization", "Bearer admin-token") .header("content-type", "application/json") @@ -410,7 +410,7 @@ async fn policy_uses_resolved_branch_for_snapshot_reads() { let (status, body) = json_response( &app, Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("authorization", "Bearer team-token") .header("content-type", "application/json") @@ -458,7 +458,7 @@ async fn policy_blocks_change_on_protected_main_but_allows_unprotected_branch() let (main_status, main_body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("authorization", "Bearer team-token") .header("content-type", "application/json") @@ -482,7 +482,7 @@ async fn policy_blocks_change_on_protected_main_but_allows_unprotected_branch() let (feature_status, feature_body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("authorization", "Bearer team-token") .header("content-type", "application/json") @@ -533,7 +533,7 @@ async fn policy_blocks_non_admin_merge_to_main_and_allows_admin() { let (deny_status, deny_body) = json_response( &app, Request::builder() - .uri("/branches/merge") + .uri(g("/branches/merge")) .method(Method::POST) .header("authorization", "Bearer team-token") .header("content-type", "application/json") @@ -551,7 +551,7 @@ async fn policy_blocks_non_admin_merge_to_main_and_allows_admin() { let (allow_status, allow_body) = json_response( &app, Request::builder() - .uri("/branches/merge") + .uri(g("/branches/merge")) .method(Method::POST) .header("authorization", "Bearer admin-token") .header("content-type", "application/json") @@ -578,7 +578,7 @@ async fn authenticated_change_stamps_actor_on_commits() { let (change_status, change_body) = json_response( &app, Request::builder() - 
.uri("/change") + .uri(g("/change")) .method(Method::POST) .header("authorization", "Bearer token-one") .header("content-type", "application/json") @@ -592,7 +592,7 @@ async fn authenticated_change_stamps_actor_on_commits() { let (commits_status, commits_body) = json_response( &app, Request::builder() - .uri("/commits?branch=main") + .uri(g("/commits?branch=main")) .method(Method::GET) .header("authorization", "Bearer token-one") .body(Body::empty()) @@ -623,7 +623,7 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() { let (create_status, _) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::POST) .header("authorization", "Bearer token-one") .header("content-type", "application/json") @@ -642,7 +642,7 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() { let (change_status, _) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("authorization", "Bearer token-one") .header("content-type", "application/json") @@ -659,7 +659,7 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() { let (merge_status, merge_body) = json_response( &app, Request::builder() - .uri("/branches/merge") + .uri(g("/branches/merge")) .method(Method::POST) .header("authorization", "Bearer token-two") .header("content-type", "application/json") @@ -673,7 +673,7 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() { let (commit_status, commit_body) = json_response( &app, Request::builder() - .uri("/commits?branch=main") + .uri(g("/commits?branch=main")) .method(Method::GET) .header("authorization", "Bearer token-two") .body(Body::empty()) @@ -691,7 +691,6 @@ async fn authenticated_branch_merge_stamps_merge_actor_on_head_commit() { #[tokio::test(flavor = "multi_thread")] async fn engine_layer_policy_fires_via_direct_arc_omnigraph_from_new_single() { - use omnigraph_server::GraphRouting; let temp = 
init_loaded_graph().await; let graph = graph_path(temp.path()); let db = Omnigraph::open(graph.to_str().unwrap()).await.unwrap(); @@ -717,9 +716,14 @@ async fn engine_layer_policy_fires_via_direct_arc_omnigraph_from_new_single() { // embedded consumer holding `Arc` would. If `new_single` // failed to apply `with_policy` to the engine, this `mutate_as` // would succeed — the HTTP-layer is bypassed entirely. - let handle = match state.routing() { - GraphRouting::Single { handle } => Arc::clone(handle), - GraphRouting::Multi { .. } => panic!("expected single-mode routing"), + // RFC-011 cluster-only: the single-graph convenience constructor + // registers the graph under the reserved id `default`. + let key = omnigraph_server::GraphKey::cluster( + omnigraph_server::GraphId::try_from("default").unwrap(), + ); + let handle = match state.routing().registry.get(&key) { + omnigraph_server::RegistryLookup::Ready(handle) => handle, + omnigraph_server::RegistryLookup::Gone => panic!("default graph must be registered"), }; let engine = Arc::clone(&handle.engine); @@ -758,7 +762,7 @@ async fn oversized_request_body_returns_payload_too_large() { .clone() .oneshot( Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(oversized)) @@ -781,7 +785,7 @@ async fn default_deny_mode_allows_read_for_authenticated_actor() { let (status, _body) = json_response( &app, Request::builder() - .uri("/snapshot") + .uri(g("/snapshot")) .method(Method::GET) .header(AUTHORIZATION, "Bearer demo-token") .body(Body::empty()) @@ -808,7 +812,7 @@ async fn default_deny_mode_rejects_change_with_forbidden() { let (status, body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header(AUTHORIZATION, "Bearer demo-token") .header("content-type", "application/json") @@ -840,7 +844,7 @@ async fn default_deny_mode_rejects_schema_apply_with_forbidden() { let (status, body) = 
json_response( &app, Request::builder() - .uri("/schema/apply") + .uri(g("/schema/apply")) .method(Method::POST) .header(AUTHORIZATION, "Bearer demo-token") .header("content-type", "application/json") diff --git a/crates/omnigraph-server/tests/boot_settings.rs b/crates/omnigraph-server/tests/boot_settings.rs index 3869d27..4ccc8da 100644 --- a/crates/omnigraph-server/tests/boot_settings.rs +++ b/crates/omnigraph-server/tests/boot_settings.rs @@ -18,10 +18,7 @@ use support::*; mod multi_graph_startup { use super::*; use omnigraph::storage::normalize_root_uri; - use omnigraph_server::{ - GraphHandle, GraphId, GraphKey, GraphRegistry, InsertError, ServerConfig, ServerConfigMode, - load_server_settings, - }; + use omnigraph_server::{GraphHandle, GraphId, GraphKey, GraphRegistry, InsertError}; use std::sync::Arc; async fn build_multi_mode_app(graph_ids: &[&str]) -> (Vec, Router) { @@ -280,10 +277,11 @@ mod multi_graph_startup { ); } - /// Flat routes 404 in multi mode — the router only mounts under - /// `/graphs/{graph_id}/...` so `/snapshot` doesn't resolve. + /// RFC-011 cluster-only: flat per-graph routes never resolve — the + /// router only mounts under `/graphs/{graph_id}/...` so a root + /// `/snapshot` returns 404. #[tokio::test(flavor = "multi_thread")] - async fn flat_routes_404_in_multi_mode() { + async fn flat_routes_404_at_root() { let (_dirs, app) = build_multi_mode_app(&["alpha"]).await; let resp = app .oneshot( @@ -298,28 +296,6 @@ mod multi_graph_startup { assert_eq!(resp.status(), StatusCode::NOT_FOUND); } - /// `GraphId` validation runs at startup — a reserved name in - /// `omnigraph.yaml` produces a clear error rather than getting - /// rejected per-request. 
- #[tokio::test] - async fn load_server_settings_rejects_reserved_graph_id() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -graphs: - policies: - uri: /tmp/g1.omni -"#, - ) - .unwrap(); - let err = load_server_settings(Some(&config_path), None, None, None, None, false).await.unwrap_err(); - assert!( - err.to_string().contains("invalid graph id 'policies'"), - "expected reserved-name rejection, got: {err}" - ); - } #[tokio::test(flavor = "multi_thread")] async fn registry_rejects_duplicate_normalized_graph_uris() { @@ -375,372 +351,6 @@ graphs: assert_eq!(listed[0].uri, graph_uri); } - // ── Four-rule mode inference matrix ─────────────────────────────── - - /// Rule 1: CLI positional URI → Single. - #[tokio::test] - async fn mode_inference_cli_uri_is_single() { - let settings = load_server_settings( - None, - None, - Some("/tmp/cli.omni".to_string()), - None, - None, - true, // allow unauth so we get past the runtime-state check - ) - .await - .unwrap(); - match settings.mode { - ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "/tmp/cli.omni"), - ServerConfigMode::Multi { .. } => panic!("expected Single (rule 1), got Multi"), - } - } - - /// Rule 2: --target picks one graph from `graphs:` map → Single. - #[tokio::test] - async fn mode_inference_cli_target_is_single() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -graphs: - alpha: - uri: /tmp/alpha.omni - beta: - uri: /tmp/beta.omni -"#, - ) - .unwrap(); - let settings = - load_server_settings(Some(&config_path), None, None, Some("alpha".into()), None, true) - .await - .unwrap(); - match settings.mode { - ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "/tmp/alpha.omni"), - ServerConfigMode::Multi { .. 
} => panic!("expected Single (rule 2), got Multi"), - } - } - - /// Rule 3: `server.graph` set → Single (target picked from config). - #[tokio::test] - async fn mode_inference_server_graph_is_single() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -graphs: - alpha: - uri: /tmp/alpha.omni - beta: - uri: /tmp/beta.omni -server: - graph: beta -"#, - ) - .unwrap(); - let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap(); - match settings.mode { - ServerConfigMode::Single { uri, .. } => assert_eq!(uri, "/tmp/beta.omni"), - ServerConfigMode::Multi { .. } => panic!("expected Single (rule 3), got Multi"), - } - } - - /// Rule 4: `--config` + non-empty `graphs:` + no single-mode selector → Multi. - #[tokio::test] - async fn mode_inference_config_plus_graphs_is_multi() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -graphs: - alpha: - uri: /tmp/alpha.omni - beta: - uri: /tmp/beta.omni -"#, - ) - .unwrap(); - let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap(); - match settings.mode { - ServerConfigMode::Multi { graphs, .. } => { - let ids: Vec<&str> = graphs.iter().map(|g| g.graph_id.as_str()).collect(); - // BTreeMap iteration order is alphabetical. - assert_eq!(ids, vec!["alpha", "beta"]); - } - ServerConfigMode::Single { .. 
} => panic!("expected Multi (rule 4), got Single"), - } - } - - #[tokio::test] - async fn mode_inference_multi_rejects_top_level_policy_file() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -policy: - file: ./policy.yaml -graphs: - alpha: - uri: /tmp/alpha.omni -"#, - ) - .unwrap(); - let err = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap_err(); - let msg = err.to_string(); - assert!( - msg.contains("top-level") && msg.contains("policy.file") && msg.contains("not honored"), - "expected top-level-not-honored guidance, got: {msg}" - ); - assert!( - msg.contains("graphs."), - "expected per-graph migration guidance, got: {msg}" - ); - assert!( - msg.contains("server.policy.file"), - "expected server policy migration guidance, got: {msg}" - ); - } - - #[tokio::test] - async fn mode_inference_multi_rejects_top_level_queries() { - // Symmetric to the policy guard: a top-level `queries:` block in - // multi-graph mode is not honored (each graph uses its own), so it - // is a loud error rather than a silent no-op. - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - "queries:\n q:\n file: ./q.gq\ngraphs:\n alpha:\n uri: /tmp/alpha.omni\n", - ) - .unwrap(); - let err = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap_err(); - let msg = err.to_string(); - assert!( - msg.contains("queries") && msg.contains("not honored"), - "top-level queries must be rejected in multi-graph mode: {msg}" - ); - } - - #[tokio::test] - async fn single_mode_named_graph_rejects_top_level_blocks() { - // Serving a graph by name (`--target`/`server.graph`) uses its - // per-graph block; a populated top-level block would be silently - // shadowed, so boot refuses and names the per-graph location. 
- let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - "policy:\n file: ./top.yaml\ngraphs:\n prod:\n uri: /tmp/prod.omni\n", - ) - .unwrap(); - let err = - load_server_settings(Some(&config_path), None, None, Some("prod".to_string()), None, true) - .await - .unwrap_err(); - let msg = err.to_string(); - assert!( - msg.contains("prod") && msg.contains("policy.file") && msg.contains("graphs.prod"), - "named single-mode + top-level policy must refuse, naming the graph: {msg}" - ); - } - - #[tokio::test] - async fn single_mode_named_graph_uses_per_graph_policy_and_queries() { - // The identity rule: `--target prod` attaches `graphs.prod`'s own - // policy + queries, not the top-level ones (which are absent here). - let temp = tempfile::tempdir().unwrap(); - fs::write( - temp.path().join("prod.gq"), - "query pq() { match { $u: User } return { $u.name } }", - ) - .unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - "graphs:\n prod:\n uri: /tmp/prod.omni\n policy:\n file: ./prod-policy.yaml\n \ - queries:\n pq:\n file: ./prod.gq\n", - ) - .unwrap(); - let settings = - load_server_settings(Some(&config_path), None, None, Some("prod".to_string()), None, true) - .await - .unwrap(); - match settings.mode { - ServerConfigMode::Single { - graph_id, - policy_file, - queries, - .. 
- } => { - assert_eq!(graph_id, "prod", "named single-mode keeps graph identity"); - assert!( - policy_file - .as_ref() - .is_some_and(|p| p.ends_with("prod-policy.yaml")), - "per-graph policy attached: {policy_file:?}" - ); - assert!(queries.lookup("pq").is_some(), "per-graph query attached"); - } - other => panic!("expected Single mode, got {other:?}"), - } - } - - #[tokio::test] - async fn mode_inference_normalizes_multi_graph_uris() { - let temp = tempfile::tempdir().unwrap(); - let graph = temp.path().join("alpha.omni"); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - format!( - r#" -graphs: - alpha: - uri: file://{}/ -"#, - graph.display() - ), - ) - .unwrap(); - let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap(); - match settings.mode { - ServerConfigMode::Multi { graphs, .. } => { - assert_eq!(graphs[0].uri, graph.to_string_lossy()); - } - ServerConfigMode::Single { .. } => panic!("expected Multi"), - } - } - - /// Rule 5: nothing → error with migration hint. - #[tokio::test] - async fn mode_inference_no_inputs_errors_with_migration_hint() { - let err = load_server_settings(None, None, None, None, None, true).await.unwrap_err(); - let msg = err.to_string(); - assert!( - msg.contains("no graph to serve"), - "expected migration-hint error, got: {msg}" - ); - } - - /// Rule 4 sub-case: `--config` with empty `graphs:` map and no - /// single-mode selector → rule 5 fires (no graph to serve). 
- #[tokio::test] - async fn mode_inference_empty_graphs_map_errors() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write(&config_path, "server:\n bind: 127.0.0.1:8080\n").unwrap(); - let err = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap_err(); - assert!(err.to_string().contains("no graph to serve")); - } - - /// `--config` + `` together: URI wins → Single (the CLI URI - /// takes precedence over the config's graphs map). - #[tokio::test] - async fn mode_inference_cli_uri_overrides_graphs_map() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -graphs: - alpha: - uri: /tmp/alpha.omni -"#, - ) - .unwrap(); - let settings = load_server_settings( - Some(&config_path), - None, - Some("/tmp/cli-override.omni".to_string()), - None, - None, - true, - ) - .await - .unwrap(); - match settings.mode { - ServerConfigMode::Single { uri, .. } => { - assert_eq!( - uri, "/tmp/cli-override.omni", - "CLI URI must win over graphs: map" - ); - } - ServerConfigMode::Multi { .. } => { - panic!("expected Single (CLI URI wins), got Multi") - } - } - } - - /// Per-graph `policy.file` is resolved relative to the config base_dir. - #[tokio::test] - async fn per_graph_policy_file_is_resolved_relative_to_base_dir() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -graphs: - alpha: - uri: /tmp/alpha.omni - policy: - file: ./policies/alpha.yaml - beta: - uri: /tmp/beta.omni -"#, - ) - .unwrap(); - let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap(); - let graphs = match settings.mode { - ServerConfigMode::Multi { graphs, .. } => graphs, - _ => panic!("expected Multi"), - }; - // graphs is BTreeMap-iter order (alphabetical). 
- let alpha = &graphs[0]; - let beta = &graphs[1]; - assert_eq!(alpha.graph_id, "alpha"); - let omnigraph_server::PolicySource::File(alpha_policy) = - alpha.policy.as_ref().unwrap() - else { - panic!("yaml-configured policy must stay file-based"); - }; - assert_eq!(alpha_policy, &temp.path().join("policies/alpha.yaml")); - assert_eq!(beta.graph_id, "beta"); - assert!(beta.policy.is_none()); - } - - /// `server.policy.file` resolves alongside the graphs map. - #[tokio::test] - async fn server_policy_file_is_resolved_relative_to_base_dir() { - let temp = tempfile::tempdir().unwrap(); - let config_path = temp.path().join("omnigraph.yaml"); - fs::write( - &config_path, - r#" -server: - policy: - file: ./server-policy.yaml -graphs: - alpha: - uri: /tmp/alpha.omni -"#, - ) - .unwrap(); - let settings = load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap(); - match settings.mode { - ServerConfigMode::Multi { server_policy, .. } => { - let omnigraph_server::PolicySource::File(path) = server_policy.unwrap() else { - panic!("yaml-configured server policy must stay file-based"); - }; - assert_eq!(path, temp.path().join("server-policy.yaml")); - } - _ => panic!("expected Multi"), - } - } - /// `GET /graphs` must NOT leak the registry in Open mode without /// an explicit server policy. Operators who pass `--unauthenticated` /// opted into trusting the network for graph DATA, not for leaking @@ -786,28 +396,6 @@ graphs: ); } - /// `GET /graphs` returns 405 in single mode (resource exists in the - /// API surface, just not operational without a `graphs:` map). 
- #[tokio::test(flavor = "multi_thread")] - async fn get_graphs_returns_405_in_single_mode() { - let temp = init_loaded_graph().await; - let graph = graph_path(temp.path()); - let state = AppState::open(graph.to_string_lossy().to_string()) - .await - .unwrap(); - let app = build_app(state); - let resp = app - .oneshot( - Request::builder() - .method(Method::GET) - .uri("/graphs") - .body(Body::empty()) - .unwrap(), - ) - .await - .unwrap(); - assert_eq!(resp.status(), StatusCode::METHOD_NOT_ALLOWED); - } /// `GET /graphs` requires bearer auth when tokens are configured. #[tokio::test(flavor = "multi_thread")] @@ -971,52 +559,4 @@ rules: ); } - /// Loads an `omnigraph.yaml` with two graphs and verifies multi-mode - /// inference plus graph entry resolution. Cluster-route dispatch is - /// covered by the route tests above. - #[tokio::test(flavor = "multi_thread")] - async fn server_settings_load_multi_graph_config_entries() { - let cfg_dir = tempfile::tempdir().unwrap(); - // Real graph storage dirs (the URIs in the config must point to - // a graph init-able location). - let alpha_dir = cfg_dir.path().join("alpha.omni"); - let beta_dir = cfg_dir.path().join("beta.omni"); - let schema = fs::read_to_string(fixture("test.pg")).unwrap(); - Omnigraph::init(alpha_dir.to_str().unwrap(), &schema) - .await - .unwrap(); - Omnigraph::init(beta_dir.to_str().unwrap(), &schema) - .await - .unwrap(); - - let config_path = cfg_dir.path().join("omnigraph.yaml"); - fs::write( - &config_path, - format!( - r#" -graphs: - alpha: - uri: {alpha} - beta: - uri: {beta} -"#, - alpha = alpha_dir.display(), - beta = beta_dir.display(), - ), - ) - .unwrap(); - - let settings: ServerConfig = - load_server_settings(Some(&config_path), None, None, None, None, true).await.unwrap(); - assert!(matches!(settings.mode, ServerConfigMode::Multi { .. })); - - match settings.mode { - ServerConfigMode::Multi { graphs, .. 
} => { - assert_eq!(graphs.len(), 2); - let ids: Vec<&str> = graphs.iter().map(|g| g.graph_id.as_str()).collect(); - assert_eq!(ids, vec!["alpha", "beta"]); - } - _ => unreachable!(), - } - } } diff --git a/crates/omnigraph-server/tests/data_routes.rs b/crates/omnigraph-server/tests/data_routes.rs index 5dc47c1..65af2c6 100644 --- a/crates/omnigraph-server/tests/data_routes.rs +++ b/crates/omnigraph-server/tests/data_routes.rs @@ -63,7 +63,7 @@ async fn export_route_returns_jsonl_for_branch_snapshot() { .clone() .oneshot( Request::builder() - .uri("/export") + .uri(g("/export")) .method(Method::POST) .header("content-type", "application/json") .header("authorization", format!("Bearer {}", token)) @@ -99,7 +99,7 @@ async fn snapshot_route_returns_manifest_dataset_version() { let (snapshot_status, snapshot_body) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -131,7 +131,7 @@ async fn ingest_creates_branch_returns_metadata_and_stamps_actor() { let (status, body) = json_response( &app, Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("authorization", "Bearer token-one") .header("content-type", "application/json") @@ -195,7 +195,7 @@ async fn ingest_existing_branch_skips_branch_create_policy_check() { let (status, body) = json_response( &app, Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("authorization", "Bearer team-token") .header("content-type", "application/json") @@ -223,7 +223,7 @@ async fn ingest_without_from_returns_404_for_missing_branch_and_creates_nothing( let (status, body) = json_response( &app, Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&ingest).unwrap())) @@ -264,7 +264,7 @@ async fn ingest_without_from_loads_into_existing_branch() { let 
(status, body) = json_response( &app, Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&ingest).unwrap())) @@ -294,7 +294,7 @@ async fn ingest_denies_missing_branch_without_branch_create_permission() { let (status, body) = json_response( &app, Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("authorization", "Bearer team-token") .header("content-type", "application/json") @@ -327,7 +327,7 @@ async fn ingest_denies_when_actor_lacks_change_permission() { let (status, body) = json_response( &app, Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("authorization", "Bearer team-token") .header("content-type", "application/json") @@ -357,7 +357,7 @@ async fn ingest_rejects_payloads_over_32_mib() { .clone() .oneshot( Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&oversize).unwrap())) @@ -419,7 +419,7 @@ async fn branch_merge_conflict_response_includes_structured_conflicts() { let (status, body) = json_response( &app, Request::builder() - .uri("/branches/merge") + .uri(g("/branches/merge")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&merge).unwrap())) @@ -451,7 +451,7 @@ async fn repeated_read_after_change_sees_updated_state_from_same_app() { let (change_status, change_body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&change).unwrap())) @@ -471,7 +471,7 @@ async fn repeated_read_after_change_sees_updated_state_from_same_app() { let (read_status, read_body) = json_response( &app, Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("content-type", 
"application/json") .body(Body::from(serde_json::to_vec(&read).unwrap())) @@ -497,7 +497,7 @@ async fn query_endpoint_runs_inline_read() { let (status, body) = json_response( &app, Request::builder() - .uri("/query") + .uri(g("/query")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&query).unwrap())) @@ -524,7 +524,7 @@ async fn query_endpoint_rejects_mutation_with_400() { let (status, body) = json_response( &app, Request::builder() - .uri("/query") + .uri(g("/query")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&query).unwrap())) @@ -555,7 +555,7 @@ async fn mutate_endpoint_runs_inline_mutation() { .clone() .oneshot( Request::builder() - .uri("/mutate") + .uri(g("/mutate")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&request).unwrap())) @@ -580,7 +580,7 @@ async fn mutate_endpoint_runs_inline_mutation() { #[tokio::test(flavor = "multi_thread")] async fn change_endpoint_emits_deprecation_headers() { // `/change` is kept indefinitely for back-compat but flagged at runtime - // per RFC 9745 (`Deprecation: true`) + RFC 8288 (`Link: ; + // per RFC 9745 (`Deprecation: true`) + RFC 8288 (`Link: ; // rel="successor-version"`). The OpenAPI side is covered by // `openapi_change_is_deprecated` in tests/openapi.rs. 
let (_temp, app) = app_for_loaded_graph().await; @@ -595,7 +595,7 @@ async fn change_endpoint_emits_deprecation_headers() { .clone() .oneshot( Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&request).unwrap())) @@ -615,7 +615,7 @@ async fn change_endpoint_emits_deprecation_headers() { ); assert_eq!( response.headers().get("link").and_then(|v| v.to_str().ok()), - Some("; rel=\"successor-version\""), + Some("; rel=\"successor-version\""), "POST /change must point at /mutate via `Link` rel=successor-version (RFC 8288)" ); } @@ -635,7 +635,7 @@ async fn load_endpoint_loads_into_existing_branch() { .clone() .oneshot( Request::builder() - .uri("/load") + .uri(g("/load")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&request).unwrap())) @@ -658,7 +658,7 @@ async fn load_endpoint_loads_into_existing_branch() { #[tokio::test(flavor = "multi_thread")] async fn ingest_endpoint_emits_deprecation_headers() { // `/ingest` is the deprecated alias of `/load` (RFC-009 Phase 5): flagged - // at runtime per RFC 9745 (`Deprecation: true`) + RFC 8288 (`Link: ; + // at runtime per RFC 9745 (`Deprecation: true`) + RFC 8288 (`Link: ; // rel="successor-version"`). The OpenAPI side is covered by // `openapi_ingest_is_deprecated` in tests/openapi.rs. 
let (_temp, app) = app_for_loaded_graph().await; @@ -672,7 +672,7 @@ async fn ingest_endpoint_emits_deprecation_headers() { .clone() .oneshot( Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&request).unwrap())) @@ -692,7 +692,7 @@ async fn ingest_endpoint_emits_deprecation_headers() { ); assert_eq!( response.headers().get("link").and_then(|v| v.to_str().ok()), - Some("; rel=\"successor-version\""), + Some("; rel=\"successor-version\""), "POST /ingest must point at /load via `Link` rel=successor-version (RFC 8288)" ); } @@ -714,7 +714,7 @@ async fn read_endpoint_emits_deprecation_headers() { .clone() .oneshot( Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&request).unwrap())) @@ -734,7 +734,7 @@ async fn read_endpoint_emits_deprecation_headers() { ); assert_eq!( response.headers().get("link").and_then(|v| v.to_str().ok()), - Some("; rel=\"successor-version\""), + Some("; rel=\"successor-version\""), "POST /read must point at /query via `Link` rel=successor-version (RFC 8288)" ); } @@ -757,7 +757,7 @@ async fn query_endpoint_does_not_emit_deprecation_headers() { .clone() .oneshot( Request::builder() - .uri("/query") + .uri(g("/query")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&request).unwrap())) @@ -789,7 +789,7 @@ async fn change_endpoint_accepts_legacy_field_names() { let (status, body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&legacy_body).unwrap())) @@ -808,7 +808,7 @@ async fn change_endpoint_accepts_legacy_field_names() { let (status, body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) 
.header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&canonical_body).unwrap())) @@ -826,7 +826,7 @@ async fn remote_branch_list_create_merge_flow_works() { let (list_status, list_body) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -842,7 +842,7 @@ async fn remote_branch_list_create_merge_flow_works() { let (create_status, create_body) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&create).unwrap())) @@ -856,7 +856,7 @@ async fn remote_branch_list_create_merge_flow_works() { let (list_status, list_body) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -874,7 +874,7 @@ async fn remote_branch_list_create_merge_flow_works() { let (change_status, change_body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&change).unwrap())) @@ -895,7 +895,7 @@ async fn remote_branch_list_create_merge_flow_works() { let (read_status, read_body) = json_response( &app, Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&read_main_before).unwrap())) @@ -912,7 +912,7 @@ async fn remote_branch_list_create_merge_flow_works() { let (merge_status, merge_body) = json_response( &app, Request::builder() - .uri("/branches/merge") + .uri(g("/branches/merge")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&merge).unwrap())) @@ -934,7 +934,7 @@ async fn remote_branch_list_create_merge_flow_works() { let (read_status, read_body) = json_response( &app, 
Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&read_main_after).unwrap())) @@ -957,7 +957,7 @@ async fn remote_branch_delete_flow_works() { let (create_status, _) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&create).unwrap())) @@ -969,7 +969,7 @@ async fn remote_branch_delete_flow_works() { let (delete_status, delete_body) = json_response( &app, Request::builder() - .uri("/branches/feature") + .uri(g("/branches/feature")) .method(Method::DELETE) .body(Body::empty()) .unwrap(), @@ -981,7 +981,7 @@ async fn remote_branch_delete_flow_works() { let (list_status, list_body) = json_response( &app, Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -1009,7 +1009,7 @@ async fn branch_delete_denies_without_policy_permission() { let (status, body) = json_response( &app, Request::builder() - .uri("/branches/feature") + .uri(g("/branches/feature")) .method(Method::DELETE) .header("authorization", "Bearer token-team") .body(Body::empty()) @@ -1081,7 +1081,7 @@ query vector_search_string($q: String) { let (status, body) = json_response( &app, Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&read).unwrap())) @@ -1134,7 +1134,7 @@ async fn change_conflict_returns_manifest_conflict_409() { let (status, body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from( @@ -1206,7 +1206,7 @@ async fn change_concurrent_inserts_same_key_serialize_without_409() { }) .unwrap(); let req = Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) 
.header("content-type", "application/json") .body(Body::from(body)) @@ -1238,7 +1238,7 @@ async fn change_concurrent_inserts_same_key_serialize_without_409() { let (snapshot_status, snapshot_body) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -1319,7 +1319,7 @@ async fn change_concurrent_updates_same_key_serialize_via_publisher_cas() { }) .unwrap(); let req = Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -1428,7 +1428,7 @@ query insert_c($name: String) { }) .unwrap(); let req = Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -1445,7 +1445,7 @@ query insert_c($name: String) { }) .unwrap(); let req = Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -1474,7 +1474,7 @@ query insert_c($name: String) { let (status, body) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -1582,7 +1582,7 @@ async fn ingest_per_actor_admission_cap_returns_429() { }) .unwrap(); let req = Request::builder() - .uri("/ingest") + .uri(g("/ingest")) .method(Method::POST) .header("authorization", "Bearer flooder-token") .header("content-type", "application/json") diff --git a/crates/omnigraph-server/tests/multi_graph.rs b/crates/omnigraph-server/tests/multi_graph.rs index 6410719..617cc66 100644 --- a/crates/omnigraph-server/tests/multi_graph.rs +++ b/crates/omnigraph-server/tests/multi_graph.rs @@ -248,7 +248,7 @@ async fn concurrent_branch_ops_morphological_matrix() { .clone() .oneshot( Request::builder() - .uri("/branches") + .uri(g("/branches")) 
.method(Method::GET) .body(Body::empty()) .unwrap(), @@ -369,7 +369,7 @@ async fn concurrent_branch_ops_morphological_matrix() { .clone() .oneshot( Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -717,31 +717,15 @@ graphs: #[tokio::test] async fn cluster_boot_refusals() { - // Mutual exclusion with --config / URI. + // RFC-011 cluster-only: with no --cluster, boot refuses with the + // cluster-required remedy. + let err = omnigraph_server::load_server_settings(None, None, true) + .await + .unwrap_err(); + assert!(err.to_string().contains("boots from a cluster"), "{err}"); + let temp = converged_cluster_dir("").await; let dir = temp.path().to_path_buf(); - let err = omnigraph_server::load_server_settings( - Some(&dir.join("omnigraph.yaml")), - Some(&dir), - None, - None, - None, - true, - ) - .await - .unwrap_err(); - assert!(err.to_string().contains("exclusive boot source"), "{err}"); - let err = omnigraph_server::load_server_settings( - None, - Some(&dir), - Some("file:///tmp/x.omni".to_string()), - None, - None, - true, - ) - .await - .unwrap_err(); - assert!(err.to_string().contains("exclusive boot source"), "{err}"); // Tampered catalog blob refuses boot with the remedy. 
let blob_dir = dir.join("__cluster/resources/query/knowledge/find_person"); diff --git a/crates/omnigraph-server/tests/openapi.rs b/crates/omnigraph-server/tests/openapi.rs index ac1fb59..9276482 100644 --- a/crates/omnigraph-server/tests/openapi.rs +++ b/crates/omnigraph-server/tests/openapi.rs @@ -8,10 +8,9 @@ use axum::body::{Body, to_bytes}; use axum::http::{Method, Request, StatusCode}; use omnigraph::db::Omnigraph; use omnigraph::loader::{LoadMode, load_jsonl}; -use omnigraph_server::{ApiDoc, AppState, build_app}; +use omnigraph_server::{AppState, build_app, served_openapi}; use serde_json::Value; use tower::ServiceExt; -use utoipa::OpenApi; fn fixture(name: &str) -> PathBuf { PathBuf::from(env!("CARGO_MANIFEST_DIR")) @@ -71,7 +70,10 @@ async fn json_response(app: &Router, request: Request) -> (StatusCode, Val } fn openapi_doc() -> utoipa::openapi::OpenApi { - ApiDoc::openapi() + // RFC-011 cluster-only: the canonical committed spec is the SERVED + // shape — protected routes nested under `/graphs/{graph_id}/…`, + // `/healthz` and `/graphs` flat. This matches what the server serves. + served_openapi() } fn openapi_json() -> Value { @@ -159,26 +161,28 @@ fn openapi_info_contains_version() { // Path coverage tests // --------------------------------------------------------------------------- +// The canonical served spec keeps `/healthz` and `/graphs` flat; every +// protected route nests under `/graphs/{graph_id}/…`. 
const EXPECTED_PATHS: &[&str] = &[ "/healthz", "/graphs", - "/snapshot", - "/read", - "/query", - "/export", - "/change", - "/mutate", - "/queries", - "/queries/{name}", - "/schema", - "/schema/apply", - "/load", - "/ingest", - "/branches", - "/branches/{branch}", - "/branches/merge", - "/commits", - "/commits/{commit_id}", + "/graphs/{graph_id}/snapshot", + "/graphs/{graph_id}/read", + "/graphs/{graph_id}/query", + "/graphs/{graph_id}/export", + "/graphs/{graph_id}/change", + "/graphs/{graph_id}/mutate", + "/graphs/{graph_id}/queries", + "/graphs/{graph_id}/queries/{name}", + "/graphs/{graph_id}/schema", + "/graphs/{graph_id}/schema/apply", + "/graphs/{graph_id}/load", + "/graphs/{graph_id}/ingest", + "/graphs/{graph_id}/branches", + "/graphs/{graph_id}/branches/{branch}", + "/graphs/{graph_id}/branches/merge", + "/graphs/{graph_id}/commits", + "/graphs/{graph_id}/commits/{commit_id}", ]; #[test] @@ -222,25 +226,25 @@ fn openapi_healthz_is_get() { #[test] fn openapi_read_is_post() { let doc = openapi_json(); - assert!(doc["paths"]["/read"]["post"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/read"]["post"].is_object()); } #[test] fn openapi_export_is_post() { let doc = openapi_json(); - assert!(doc["paths"]["/export"]["post"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/export"]["post"].is_object()); } #[test] fn openapi_change_is_post() { let doc = openapi_json(); - assert!(doc["paths"]["/change"]["post"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/change"]["post"].is_object()); } #[test] fn openapi_mutate_is_post() { let doc = openapi_json(); - assert!(doc["paths"]["/mutate"]["post"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/mutate"]["post"].is_object()); } // Deprecation flagging — `/read` and `/change` are kept indefinitely for @@ -253,7 +257,7 @@ fn openapi_mutate_is_post() { fn openapi_read_is_deprecated() { let doc = openapi_json(); assert_eq!( - doc["paths"]["/read"]["post"]["deprecated"], + 
doc["paths"]["/graphs/{graph_id}/read"]["post"]["deprecated"], serde_json::Value::Bool(true), "/read must be flagged deprecated in OpenAPI; use /query instead" ); @@ -263,7 +267,7 @@ fn openapi_read_is_deprecated() { fn openapi_change_is_deprecated() { let doc = openapi_json(); assert_eq!( - doc["paths"]["/change"]["post"]["deprecated"], + doc["paths"]["/graphs/{graph_id}/change"]["post"]["deprecated"], serde_json::Value::Bool(true), "/change must be flagged deprecated in OpenAPI; use /mutate instead" ); @@ -272,7 +276,7 @@ fn openapi_change_is_deprecated() { #[test] fn openapi_query_is_not_deprecated() { let doc = openapi_json(); - let deprecated = doc["paths"]["/query"]["post"] + let deprecated = doc["paths"]["/graphs/{graph_id}/query"]["post"] .get("deprecated") .and_then(serde_json::Value::as_bool) .unwrap_or(false); @@ -285,7 +289,7 @@ fn openapi_query_is_not_deprecated() { #[test] fn openapi_mutate_is_not_deprecated() { let doc = openapi_json(); - let deprecated = doc["paths"]["/mutate"]["post"] + let deprecated = doc["paths"]["/graphs/{graph_id}/mutate"]["post"] .get("deprecated") .and_then(serde_json::Value::as_bool) .unwrap_or(false); @@ -298,15 +302,15 @@ fn openapi_mutate_is_not_deprecated() { #[test] fn openapi_ingest_is_post() { let doc = openapi_json(); - assert!(doc["paths"]["/ingest"]["post"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/ingest"]["post"].is_object()); } #[test] fn openapi_load_is_not_deprecated() { // RFC-009 Phase 5: /load is the canonical bulk-load endpoint. let doc = openapi_json(); - assert!(doc["paths"]["/load"]["post"].is_object()); - let deprecated = doc["paths"]["/load"]["post"] + assert!(doc["paths"]["/graphs/{graph_id}/load"]["post"].is_object()); + let deprecated = doc["paths"]["/graphs/{graph_id}/load"]["post"] .get("deprecated") .and_then(serde_json::Value::as_bool) .unwrap_or(false); @@ -321,7 +325,7 @@ fn openapi_ingest_is_deprecated() { // RFC-009 Phase 5: /ingest is now the deprecated alias of /load. 
let doc = openapi_json(); assert_eq!( - doc["paths"]["/ingest"]["post"]["deprecated"], + doc["paths"]["/graphs/{graph_id}/ingest"]["post"]["deprecated"], serde_json::Value::Bool(true), "/ingest must be flagged deprecated now that /load is canonical" ); @@ -330,32 +334,32 @@ fn openapi_ingest_is_deprecated() { #[test] fn openapi_branches_supports_get_and_post() { let doc = openapi_json(); - assert!(doc["paths"]["/branches"]["get"].is_object()); - assert!(doc["paths"]["/branches"]["post"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/branches"]["get"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/branches"]["post"].is_object()); } #[test] fn openapi_branch_delete_is_delete() { let doc = openapi_json(); - assert!(doc["paths"]["/branches/{branch}"]["delete"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/branches/{branch}"]["delete"].is_object()); } #[test] fn openapi_branch_merge_is_post() { let doc = openapi_json(); - assert!(doc["paths"]["/branches/merge"]["post"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/branches/merge"]["post"].is_object()); } #[test] fn openapi_commits_is_get() { let doc = openapi_json(); - assert!(doc["paths"]["/commits"]["get"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/commits"]["get"].is_object()); } #[test] fn openapi_commit_show_is_get() { let doc = openapi_json(); - assert!(doc["paths"]["/commits/{commit_id}"]["get"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/commits/{commit_id}"]["get"].is_object()); } // --------------------------------------------------------------------------- @@ -510,13 +514,13 @@ fn query_request_query_is_required() { #[test] fn openapi_query_is_post() { let doc = openapi_json(); - assert!(doc["paths"]["/query"]["post"].is_object()); + assert!(doc["paths"]["/graphs/{graph_id}/query"]["post"].is_object()); } #[test] fn query_endpoint_documents_mutation_400() { let doc = openapi_json(); - let four_hundred = 
&doc["paths"]["/query"]["post"]["responses"]["400"]; + let four_hundred = &doc["paths"]["/graphs/{graph_id}/query"]["post"]["responses"]["400"]; let description = four_hundred["description"].as_str().unwrap_or_default(); assert!( description.contains("mutations") || description.contains("POST /mutate"), @@ -727,21 +731,21 @@ fn openapi_defines_bearer_token_security_scheme() { fn protected_endpoints_reference_bearer_token_security() { let doc = openapi_json(); let protected_paths = [ - ("/read", "post"), - ("/change", "post"), - ("/schema/apply", "post"), - ("/queries", "get"), - ("/queries/{name}", "post"), - ("/load", "post"), - ("/ingest", "post"), - ("/export", "post"), - ("/snapshot", "get"), - ("/branches", "get"), - ("/branches", "post"), - ("/branches/{branch}", "delete"), - ("/branches/merge", "post"), - ("/commits", "get"), - ("/commits/{commit_id}", "get"), + ("/graphs/{graph_id}/read", "post"), + ("/graphs/{graph_id}/change", "post"), + ("/graphs/{graph_id}/schema/apply", "post"), + ("/graphs/{graph_id}/queries", "get"), + ("/graphs/{graph_id}/queries/{name}", "post"), + ("/graphs/{graph_id}/load", "post"), + ("/graphs/{graph_id}/ingest", "post"), + ("/graphs/{graph_id}/export", "post"), + ("/graphs/{graph_id}/snapshot", "get"), + ("/graphs/{graph_id}/branches", "get"), + ("/graphs/{graph_id}/branches", "post"), + ("/graphs/{graph_id}/branches/{branch}", "delete"), + ("/graphs/{graph_id}/branches/merge", "post"), + ("/graphs/{graph_id}/commits", "get"), + ("/graphs/{graph_id}/commits/{commit_id}", "get"), ]; for (path, method) in protected_paths { @@ -773,7 +777,7 @@ fn healthz_does_not_require_security() { #[test] fn branch_delete_has_branch_path_parameter() { let doc = openapi_json(); - let params = doc["paths"]["/branches/{branch}"]["delete"]["parameters"] + let params = doc["paths"]["/graphs/{graph_id}/branches/{branch}"]["delete"]["parameters"] .as_array() .unwrap(); let has_branch = params @@ -788,7 +792,7 @@ fn 
branch_delete_has_branch_path_parameter() { #[test] fn commit_show_has_commit_id_path_parameter() { let doc = openapi_json(); - let params = doc["paths"]["/commits/{commit_id}"]["get"]["parameters"] + let params = doc["paths"]["/graphs/{graph_id}/commits/{commit_id}"]["get"]["parameters"] .as_array() .unwrap(); let has_commit_id = params @@ -803,7 +807,7 @@ fn commit_show_has_commit_id_path_parameter() { #[test] fn snapshot_has_branch_query_parameter() { let doc = openapi_json(); - let params = doc["paths"]["/snapshot"]["get"]["parameters"] + let params = doc["paths"]["/graphs/{graph_id}/snapshot"]["get"]["parameters"] .as_array() .unwrap(); let has_branch = params @@ -818,7 +822,7 @@ fn snapshot_has_branch_query_parameter() { #[test] fn commits_has_branch_query_parameter() { let doc = openapi_json(); - let params = doc["paths"]["/commits"]["get"]["parameters"] + let params = doc["paths"]["/graphs/{graph_id}/commits"]["get"]["parameters"] .as_array() .unwrap(); let has_branch = params @@ -858,7 +862,7 @@ fn openapi_operations_have_tags() { #[test] fn read_endpoint_200_references_read_output_schema() { let doc = openapi_json(); - let content = &doc["paths"]["/read"]["post"]["responses"]["200"]["content"]; + let content = &doc["paths"]["/graphs/{graph_id}/read"]["post"]["responses"]["200"]["content"]; let schema = &content["application/json"]["schema"]; let ref_path = schema["$ref"].as_str().unwrap(); assert!( @@ -870,7 +874,7 @@ fn read_endpoint_200_references_read_output_schema() { #[test] fn change_endpoint_200_references_change_output_schema() { let doc = openapi_json(); - let content = &doc["paths"]["/change"]["post"]["responses"]["200"]["content"]; + let content = &doc["paths"]["/graphs/{graph_id}/change"]["post"]["responses"]["200"]["content"]; let schema = &content["application/json"]["schema"]; let ref_path = schema["$ref"].as_str().unwrap(); assert!( @@ -895,11 +899,11 @@ fn healthz_200_references_health_output_schema() { fn 
error_responses_reference_error_output_schema() { let doc = openapi_json(); let paths_with_errors = [ - ("/read", "post", "400"), - ("/read", "post", "401"), - ("/change", "post", "400"), - ("/change", "post", "409"), - ("/branches", "post", "409"), + ("/graphs/{graph_id}/read", "post", "400"), + ("/graphs/{graph_id}/read", "post", "401"), + ("/graphs/{graph_id}/change", "post", "400"), + ("/graphs/{graph_id}/change", "post", "409"), + ("/graphs/{graph_id}/branches", "post", "409"), ]; for (path, method, status) in paths_with_errors { @@ -921,13 +925,13 @@ fn error_responses_reference_error_output_schema() { fn post_endpoints_have_request_body() { let doc = openapi_json(); let post_paths = [ - ("/read", "ReadRequest"), - ("/change", "ChangeRequest"), - ("/schema/apply", "SchemaApplyRequest"), - ("/ingest", "IngestRequest"), - ("/export", "ExportRequest"), - ("/branches", "BranchCreateRequest"), - ("/branches/merge", "BranchMergeRequest"), + ("/graphs/{graph_id}/read", "ReadRequest"), + ("/graphs/{graph_id}/change", "ChangeRequest"), + ("/graphs/{graph_id}/schema/apply", "SchemaApplyRequest"), + ("/graphs/{graph_id}/ingest", "IngestRequest"), + ("/graphs/{graph_id}/export", "ExportRequest"), + ("/graphs/{graph_id}/branches", "BranchCreateRequest"), + ("/graphs/{graph_id}/branches/merge", "BranchMergeRequest"), ]; for (path, expected_schema) in post_paths { @@ -948,7 +952,7 @@ fn post_endpoints_have_request_body() { #[test] fn invoke_stored_query_request_body_is_optional() { let doc = openapi_json(); - let request_body = &doc["paths"]["/queries/{name}"]["post"]["requestBody"]; + let request_body = &doc["paths"]["/graphs/{graph_id}/queries/{name}"]["post"]["requestBody"]; assert!( request_body.is_object(), "POST /queries/{{name}} should document its optional request body" @@ -1051,12 +1055,14 @@ async fn auth_mode_spec_has_security_on_protected_operations() { .body(Body::empty()) .unwrap(); let (_, json) = json_response(&app, request).await; + // RFC-011 cluster-only: 
the served spec always nests protected + // routes under `/graphs/{graph_id}/...`. let protected_paths = [ - ("/read", "post"), - ("/change", "post"), - ("/snapshot", "get"), - ("/branches", "get"), - ("/commits", "get"), + ("/graphs/{graph_id}/read", "post"), + ("/graphs/{graph_id}/change", "post"), + ("/graphs/{graph_id}/snapshot", "get"), + ("/graphs/{graph_id}/branches", "get"), + ("/graphs/{graph_id}/commits", "get"), ]; for (path, method) in protected_paths { let security = &json["paths"][path][method]["security"]; @@ -1073,22 +1079,6 @@ async fn auth_mode_spec_has_security_on_protected_operations() { } } -#[tokio::test] -async fn auth_mode_spec_matches_static_generation() { - let (_temp, app) = app_for_loaded_graph_with_auth("secret").await; - let request = Request::builder() - .method(Method::GET) - .uri("/openapi.json") - .body(Body::empty()) - .unwrap(); - let (_, served) = json_response(&app, request).await; - let static_doc = openapi_json(); - assert_eq!( - served, static_doc, - "auth-mode served spec must match static generation" - ); -} - #[tokio::test] async fn auth_mode_healthz_still_has_no_security() { let (_temp, app) = app_for_loaded_graph_with_auth("secret").await; @@ -1394,8 +1384,9 @@ async fn multi_mode_operation_ids_are_unique() { } #[tokio::test] -async fn single_mode_openapi_unchanged_by_cluster_filter() { - // Regression: single mode still emits the legacy flat surface. +async fn served_spec_always_nests_under_cluster_prefix() { + // RFC-011 cluster-only: even a one-graph convenience app serves the + // nested cluster surface and never the flat protected routes. 
let (_temp, app) = app_for_loaded_graph().await; let request = Request::builder() .method(Method::GET) @@ -1405,16 +1396,37 @@ async fn single_mode_openapi_unchanged_by_cluster_filter() { let (_, json) = json_response(&app, request).await; let paths = json["paths"].as_object().unwrap(); let path_keys: HashSet<&str> = paths.keys().map(|k| k.as_str()).collect(); - for expected in EXPECTED_PATHS { - assert!( - path_keys.contains(expected), - "single mode must still emit flat path: {expected}" - ); - } for cluster in EXPECTED_CLUSTER_PATHS { assert!( - !path_keys.contains(cluster), - "single mode must NOT emit cluster path: {cluster}" + path_keys.contains(cluster), + "served spec must emit cluster path: {cluster}. Found: {path_keys:?}" + ); + } + // The flat protected routes must NOT appear — only the nested + // cluster surface plus the always-flat `/healthz` and `/graphs`. + let flat_protected = [ + "/snapshot", + "/read", + "/query", + "/export", + "/change", + "/mutate", + "/queries", + "/queries/{name}", + "/schema", + "/schema/apply", + "/load", + "/ingest", + "/branches", + "/branches/{branch}", + "/branches/merge", + "/commits", + "/commits/{commit_id}", + ]; + for flat in flat_protected { + assert!( + !path_keys.contains(flat), + "served spec must NOT emit flat protected path: {flat}" ); } } diff --git a/crates/omnigraph-server/tests/s3.rs b/crates/omnigraph-server/tests/s3.rs index 2c61125..99bf98d 100644 --- a/crates/omnigraph-server/tests/s3.rs +++ b/crates/omnigraph-server/tests/s3.rs @@ -43,7 +43,7 @@ async fn server_opens_s3_graph_directly_and_serves_snapshot_and_read() { let (snapshot_status, snapshot_body) = json_response( &app, Request::builder() - .uri("/snapshot") + .uri(g("/snapshot")) .method(Method::GET) .header("authorization", "Bearer s3-token") .body(Body::empty()) @@ -63,7 +63,7 @@ async fn server_opens_s3_graph_directly_and_serves_snapshot_and_read() { let (read_status, read_body) = json_response( &app, Request::builder() - .uri("/read") + 
.uri(g("/read")) .method(Method::POST) .header("authorization", "Bearer s3-token") .header("content-type", "application/json") @@ -134,11 +134,8 @@ async fn server_boots_cluster_from_bare_storage_uri_and_serves_query() { } let settings = omnigraph_server::load_server_settings( - None, Some(&std::path::PathBuf::from(&root)), None, - None, - None, true, ) .await diff --git a/crates/omnigraph-server/tests/schema_routes.rs b/crates/omnigraph-server/tests/schema_routes.rs index d250d8a..c73591c 100644 --- a/crates/omnigraph-server/tests/schema_routes.rs +++ b/crates/omnigraph-server/tests/schema_routes.rs @@ -2,6 +2,7 @@ //! Moved verbatim from tests/server.rs in the modularization. use std::fs; +use std::sync::Arc; use axum::body::Body; use axum::http::{Method, Request, StatusCode}; @@ -11,7 +12,9 @@ use omnigraph::loader::LoadMode; use omnigraph_server::api::{ ChangeRequest, ErrorOutput, ReadRequest, SchemaApplyRequest, SchemaOutput, }; -use omnigraph_server::{AppState, build_app}; +use omnigraph_server::{ + AppState, GraphHandle, GraphId, GraphKey, PolicyEngine, build_app, workload, +}; use serde_json::json; @@ -30,7 +33,7 @@ async fn schema_apply_route_updates_graph_for_authorized_admin() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -54,6 +57,111 @@ async fn schema_apply_route_updates_graph_for_authorized_admin() { ); } +#[tokio::test] +async fn schema_apply_route_refuses_cluster_backed_server_mode() { + let temp = init_graph_with_schema(&fs::read_to_string(fixture("test.pg")).unwrap()).await; + let graph = graph_path(temp.path()); + let graph_uri = graph.to_string_lossy().to_string(); + let engine = Omnigraph::open(&graph_uri).await.unwrap(); + let handle = Arc::new(GraphHandle { + key: GraphKey::cluster(GraphId::try_from("default").unwrap()), + uri: graph_uri.clone(), + engine: 
Arc::new(engine), + policy: None, + queries: None, + }); + let state = AppState::new_multi( + vec![handle], + Vec::new(), + None, + workload::WorkloadController::from_env(), + Some(temp.path().join("cluster.yaml")), + ) + .unwrap(); + let app = build_app(state); + + let request = Request::builder() + .method(Method::POST) + .uri(g("/schema/apply")) + .header("content-type", "application/json") + .body(Body::from( + serde_json::to_vec(&SchemaApplyRequest { + schema_source: additive_schema_with_nickname(), + ..Default::default() + }) + .unwrap(), + )) + .unwrap(); + let (status, payload) = json_response(&app, request).await; + + assert_eq!(status, StatusCode::CONFLICT, "body: {payload}"); + assert!( + payload["error"] + .as_str() + .unwrap_or_default() + .contains("cluster apply"), + "body: {payload}" + ); + let reopened = Omnigraph::open(&graph_uri).await.unwrap(); + assert!( + !reopened.catalog().node_types["Person"] + .properties + .contains_key("nickname"), + "cluster-backed schema apply must not mutate the graph" + ); +} + +#[tokio::test] +async fn schema_apply_route_cluster_backed_denies_unauthorized_actor_before_409() { + // The cluster-backed 409 is reported AFTER the Cedar gate, so an actor + // without `schema_apply` permission gets a 403 — never a 409 that would + // disclose the server is cluster-backed (401 → 403 → 409, no topology leak + // before authorization). POLICY_YAML grants read/export but not schema_apply, + // so act-ragnor is denied. 
+ let temp = init_graph_with_schema(&fs::read_to_string(fixture("test.pg")).unwrap()).await; + let graph = graph_path(temp.path()); + let graph_uri = graph.to_string_lossy().to_string(); + let engine = Omnigraph::open(&graph_uri).await.unwrap(); + let policy = PolicyEngine::load_graph_from_source(POLICY_YAML, "default").unwrap(); + let handle = Arc::new(GraphHandle { + key: GraphKey::cluster(GraphId::try_from("default").unwrap()), + uri: graph_uri, + engine: Arc::new(engine), + policy: Some(Arc::new(policy)), + queries: None, + }); + let state = AppState::new_multi( + vec![handle], + vec![("act-ragnor".to_string(), "admin-token".to_string())], + None, + workload::WorkloadController::from_env(), + Some(temp.path().join("cluster.yaml")), + ) + .unwrap(); + let app = build_app(state); + + let request = Request::builder() + .method(Method::POST) + .uri(g("/schema/apply")) + .header("content-type", "application/json") + .header("authorization", "Bearer admin-token") + .body(Body::from( + serde_json::to_vec(&SchemaApplyRequest { + schema_source: additive_schema_with_nickname(), + ..Default::default() + }) + .unwrap(), + )) + .unwrap(); + let (status, payload) = json_response(&app, request).await; + + assert_eq!( + status, + StatusCode::FORBIDDEN, + "an unauthorized actor must get 403 before the cluster-backed 409: {payload}" + ); +} + #[tokio::test(flavor = "multi_thread")] async fn schema_apply_route_rejects_stored_query_breakage_before_publish() { let (temp, app) = app_with_stored_queries( @@ -65,7 +173,7 @@ async fn schema_apply_route_rejects_stored_query_breakage_before_publish() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -115,7 +223,7 @@ async fn schema_apply_route_noop_keeps_valid_stored_query_registry() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + 
.uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -142,7 +250,7 @@ async fn schema_apply_route_requires_schema_apply_policy_permission() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -173,7 +281,7 @@ async fn schema_apply_route_requires_bearer_token_when_policy_enabled() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .body(Body::from( serde_json::to_vec(&SchemaApplyRequest { @@ -203,7 +311,7 @@ async fn schema_apply_route_can_rename_type() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -239,7 +347,7 @@ async fn schema_apply_route_can_rename_property() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -279,7 +387,7 @@ async fn schema_apply_route_can_add_index() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -294,6 +402,11 @@ async fn schema_apply_route_can_add_index() { assert_eq!(status, StatusCode::OK); assert_eq!(payload["applied"], true); + // iss-848: the /schema/apply route accepts the index-add and applies it as a + // metadata change — it records the `@index` intent in the catalog/IR but does + // NOT build the physical index inline (the build is deferred to + // ensure_indices/optimize; on 
this empty table nothing would build anyway). + // So the physical index count is unchanged by the apply. let reopened = Omnigraph::open(graph.to_str().unwrap()).await.unwrap(); let snapshot = reopened .snapshot_of(ReadTarget::branch("main")) @@ -301,7 +414,10 @@ async fn schema_apply_route_can_add_index() { .unwrap(); let dataset = snapshot.open("node:Person").await.unwrap(); let after_index_count = dataset.load_indices().await.unwrap().len(); - assert!(after_index_count > before_index_count); + assert_eq!( + after_index_count, before_index_count, + "schema apply records @index intent but defers the physical build (iss-848)" + ); } #[tokio::test] @@ -315,7 +431,7 @@ async fn schema_apply_route_rejects_unsupported_plan() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -356,7 +472,7 @@ async fn schema_apply_route_rejects_when_non_main_branch_exists() { let request = Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -385,7 +501,7 @@ async fn schema_drift_returns_conflict_for_snapshot_read_and_change() { let (snapshot_status, snapshot_body) = json_response( &app, Request::builder() - .uri("/snapshot?branch=main") + .uri(g("/snapshot?branch=main")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -413,7 +529,7 @@ async fn schema_drift_returns_conflict_for_snapshot_read_and_change() { let (read_status, read_body) = json_response( &app, Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&read).unwrap())) @@ -441,7 +557,7 @@ async fn schema_drift_returns_conflict_for_snapshot_read_and_change() { let (change_status, change_body) = 
json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(serde_json::to_vec(&change).unwrap())) @@ -467,7 +583,7 @@ async fn schema_route_returns_current_source() { let (status, body) = json_response( &app, Request::builder() - .uri("/schema") + .uri(g("/schema")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -486,7 +602,7 @@ async fn schema_route_requires_bearer_token_when_auth_configured() { let (missing_status, missing_body) = json_response( &app, Request::builder() - .uri("/schema") + .uri(g("/schema")) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -502,7 +618,7 @@ async fn schema_route_requires_bearer_token_when_auth_configured() { let (ok_status, ok_body) = json_response( &app, Request::builder() - .uri("/schema") + .uri(g("/schema")) .method(Method::GET) .header("authorization", "Bearer demo-token") .body(Body::empty()) @@ -533,7 +649,7 @@ async fn schema_route_denied_when_actor_lacks_read_permission() { let (status, body) = json_response( &app, Request::builder() - .uri("/schema") + .uri(g("/schema")) .method(Method::GET) .header("authorization", "Bearer team-token") .body(Body::empty()) @@ -574,7 +690,7 @@ async fn schema_apply_route_soft_drops_property_via_http() { &app, Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -631,7 +747,7 @@ async fn schema_apply_route_soft_drops_node_type_via_http() { &app, Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -683,7 +799,7 @@ async fn schema_apply_route_hard_drops_property_with_allow_data_loss() { &app, Request::builder() .method(Method::POST) - .uri("/schema/apply") + 
.uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -738,7 +854,7 @@ async fn schema_apply_route_keeps_drops_soft_without_flag() { &app, Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -770,29 +886,27 @@ async fn schema_apply_route_additive_property_preserves_existing_rows() { // AddProperty wasn't pinned with a row-count check anywhere. // Load N rows, apply schema adding nullable property, verify // every row is still readable and the new column is null. - let (temp, app) = app_for_graph_with_auth_tokens_and_policy( - &fs::read_to_string(fixture("test.pg")).unwrap(), + let (temp, app) = app_for_loaded_graph_with_auth_tokens_and_policy( &[("act-ragnor", "admin-token")], SCHEMA_APPLY_POLICY_YAML, ) .await; let graph = graph_path(temp.path()); - // Standard fixture data: 4 Persons + 1 Company. Load it. + // Standard fixture data is loaded before the app is built, so the server + // handle applies schema from the same manifest it is serving. 
let pre_count = { let db = Omnigraph::open(graph.to_str().unwrap()).await.unwrap(); - db.load( - "main", - &fs::read_to_string(fixture("test.jsonl")).unwrap(), - LoadMode::Append, - ) - .await - .unwrap(); let snap = db .snapshot_of(omnigraph::db::ReadTarget::branch("main")) .await .unwrap(); - snap.entry("node:Person").expect("Person").row_count + snap.open("node:Person") + .await + .expect("Person") + .count_rows(None) + .await + .unwrap() }; assert!(pre_count > 0, "fixture should have loaded Person rows"); @@ -800,7 +914,7 @@ async fn schema_apply_route_additive_property_preserves_existing_rows() { &app, Request::builder() .method(Method::POST) - .uri("/schema/apply") + .uri(g("/schema/apply")) .header("content-type", "application/json") .header("authorization", "Bearer admin-token") .body(Body::from( @@ -822,7 +936,13 @@ async fn schema_apply_route_additive_property_preserves_existing_rows() { .snapshot_of(omnigraph::db::ReadTarget::branch("main")) .await .unwrap(); - let post_count = snap.entry("node:Person").expect("Person").row_count; + let post_count = snap + .open("node:Person") + .await + .expect("Person") + .count_rows(None) + .await + .unwrap(); assert_eq!( post_count, pre_count, "AddProperty should preserve row count", diff --git a/crates/omnigraph-server/tests/stored_queries.rs b/crates/omnigraph-server/tests/stored_queries.rs index e4da1d3..02553a7 100644 --- a/crates/omnigraph-server/tests/stored_queries.rs +++ b/crates/omnigraph-server/tests/stored_queries.rs @@ -82,6 +82,58 @@ async fn invoke_stored_read_returns_rows() { assert!(body["rows"].is_array(), "read envelope shape; body: {body}"); } +#[tokio::test(flavor = "multi_thread")] +async fn invoke_with_mismatched_expected_kind_is_rejected() { + // RFC-011 D3: the CLI verb asserts the stored query's kind via + // `expect_mutation`. Invoking a read with `expect_mutation: true` + // (i.e. `omnigraph mutate `) is a 400 naming the right verb. 
+ let (_temp, app) = app_with_stored_queries( + &[("find_person", FIND_PERSON_GQ, false)], + &[("act-invoke", "t-invoke")], + INVOKE_POLICY_YAML, + ) + .await; + let (status, body) = json_response( + &app, + invoke_request( + "find_person", + "t-invoke", + json!({ "expect_mutation": true, "params": { "name": "Alice" } }), + ), + ) + .await; + assert_eq!(status, StatusCode::BAD_REQUEST, "body: {body}"); + assert!( + body["error"] + .as_str() + .unwrap_or_default() + .contains("'find_person' is a read — use omnigraph query find_person"), + "expected a kind-mismatch error; body: {body}" + ); +} + +#[tokio::test(flavor = "multi_thread")] +async fn invoke_with_matching_expected_kind_runs() { + // The matching assertion (`omnigraph query `) passes through. + let (_temp, app) = app_with_stored_queries( + &[("find_person", FIND_PERSON_GQ, false)], + &[("act-invoke", "t-invoke")], + INVOKE_POLICY_YAML, + ) + .await; + let (status, body) = json_response( + &app, + invoke_request( + "find_person", + "t-invoke", + json!({ "expect_mutation": false, "params": { "name": "Alice" } }), + ), + ) + .await; + assert_eq!(status, StatusCode::OK, "matching kind should run; body: {body}"); + assert_eq!(body["query_name"], "find_person"); +} + #[tokio::test(flavor = "multi_thread")] async fn invoke_stored_read_accepts_absent_or_empty_body() { let no_param_query = "query list_people() { match { $p: Person } return { $p.name } }"; @@ -272,7 +324,7 @@ async fn list_queries_returns_only_exposed_with_typed_params() { INVOKE_POLICY_YAML, ) .await; - let (status, body) = json_response(&app, get_request("/queries", "t-invoke")).await; + let (status, body) = json_response(&app, get_request(&g("/queries"), "t-invoke")).await; assert_eq!(status, StatusCode::OK, "body: {body}"); let entries = body["queries"].as_array().unwrap(); @@ -303,7 +355,7 @@ async fn list_queries_is_read_gated_so_a_non_invoker_can_list() { INVOKE_POLICY_YAML, ) .await; - let (status, body) = json_response(&app, 
get_request("/queries", "t-noinvoke")).await; + let (status, body) = json_response(&app, get_request(&g("/queries"), "t-noinvoke")).await; assert_eq!(status, StatusCode::OK, "read-gated catalog; body: {body}"); let names: Vec<&str> = body["queries"] .as_array() @@ -320,7 +372,7 @@ async fn list_queries_is_read_gated_so_a_non_invoker_can_list() { #[tokio::test(flavor = "multi_thread")] async fn list_queries_is_empty_when_no_registry() { let (_temp, app) = app_for_loaded_graph_with_auth("demo-token").await; - let (status, body) = json_response(&app, get_request("/queries", "demo-token")).await; + let (status, body) = json_response(&app, get_request(&g("/queries"), "demo-token")).await; assert_eq!(status, StatusCode::OK, "body: {body}"); assert!( body["queries"].as_array().unwrap().is_empty(), diff --git a/crates/omnigraph-server/tests/support/mod.rs b/crates/omnigraph-server/tests/support/mod.rs index 0e32410..157c58e 100644 --- a/crates/omnigraph-server/tests/support/mod.rs +++ b/crates/omnigraph-server/tests/support/mod.rs @@ -248,9 +248,17 @@ rules: pub const FIND_PERSON_GQ: &str = "query find_person($name: String) { match { $p: Person { name: $name } } return { $p.age } }"; +/// RFC-011 cluster-only: the single-graph convenience apps built by the +/// `app_for_loaded_graph*` helpers serve the graph under the reserved id +/// `default`. This prefixes a flat per-graph path (e.g. `/snapshot`) with +/// the cluster route prefix so tests address `/graphs/default/snapshot`. 
+pub fn g(path: &str) -> String { + format!("/graphs/default{path}") +} + pub fn invoke_request(name: &str, token: &str, body: Value) -> Request { Request::builder() - .uri(format!("/queries/{name}")) + .uri(g(&format!("/queries/{name}"))) .method(Method::POST) .header("content-type", "application/json") .header("authorization", format!("Bearer {token}")) @@ -265,7 +273,7 @@ pub fn invoke_request_bytes( content_type: Option<&str>, ) -> Request { let mut builder = Request::builder() - .uri(format!("/queries/{name}")) + .uri(g(&format!("/queries/{name}"))) .method(Method::POST) .header("authorization", format!("Bearer {token}")); if let Some(content_type) = content_type { @@ -656,7 +664,7 @@ pub mod matrix { .clone() .oneshot( Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -686,7 +694,7 @@ pub mod matrix { .clone() .oneshot( Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -728,7 +736,7 @@ pub mod matrix { .clone() .oneshot( Request::builder() - .uri(format!("/snapshot?branch={}", branch)) + .uri(g(&format!("/snapshot?branch={}", branch))) .method(Method::GET) .body(Body::empty()) .unwrap(), @@ -766,7 +774,7 @@ pub mod matrix { .clone() .oneshot( Request::builder() - .uri("/read") + .uri(g("/read")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -833,7 +841,7 @@ pub mod matrix { .clone() .oneshot( Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -874,7 +882,7 @@ pub mod matrix { let response = app .oneshot( Request::builder() - .uri("/branches/merge") + .uri(g("/branches/merge")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -910,7 +918,7 @@ pub mod matrix { let 
response = app .oneshot( Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -943,7 +951,7 @@ pub mod matrix { let response = app .oneshot( Request::builder() - .uri("/branches") + .uri(g("/branches")) .method(Method::POST) .header("content-type", "application/json") .body(Body::from(body)) @@ -970,7 +978,7 @@ pub mod matrix { let response = app .oneshot( Request::builder() - .uri(format!("/branches/{}", name)) + .uri(g(&format!("/branches/{}", name))) .method(Method::DELETE) .body(Body::empty()) .unwrap(), @@ -1091,7 +1099,7 @@ pub async fn http_change_decision( let (status, _body) = json_response( &app, Request::builder() - .uri("/change") + .uri(g("/change")) .method(Method::POST) .header(AUTHORIZATION, format!("Bearer {token}")) .header("content-type", "application/json") @@ -1141,7 +1149,7 @@ pub async fn http_merge_decision( let (status, _body) = json_response( &app, Request::builder() - .uri("/branches/merge") + .uri(g("/branches/merge")) .method(Method::POST) .header(AUTHORIZATION, format!("Bearer {token}")) .header("content-type", "application/json") @@ -1191,5 +1199,5 @@ graphs: } pub async fn cluster_settings(dir: &Path) -> color_eyre::eyre::Result { - omnigraph_server::load_server_settings(None, Some(&dir.to_path_buf()), None, None, None, true).await + omnigraph_server::load_server_settings(Some(&dir.to_path_buf()), None, true).await } diff --git a/crates/omnigraph/src/db/manifest.rs b/crates/omnigraph/src/db/manifest.rs index 6cd271a..f130523 100644 --- a/crates/omnigraph/src/db/manifest.rs +++ b/crates/omnigraph/src/db/manifest.rs @@ -34,10 +34,10 @@ pub(crate) use namespace::open_table_head_for_write; use namespace::{branch_manifest_namespace, staged_table_namespace}; use publisher::{GraphNamespacePublisher, ManifestBatchPublisher}; pub(crate) use recovery::{ - RecoveryMode, RecoverySidecar, RecoverySidecarHandle, SidecarKind, SidecarTablePin, - 
SidecarTableRegistration, SidecarTombstone, delete_sidecar, has_schema_apply_sidecar, - heal_pending_sidecars_roll_forward, list_sidecars, new_sidecar, recover_manifest_drift, - schema_apply_serial_queue_key, write_sidecar, + RecoveryMode, RecoverySidecarHandle, SidecarKind, SidecarTablePin, SidecarTableRegistration, + SidecarTombstone, delete_sidecar, has_schema_apply_sidecar, heal_pending_sidecars_roll_forward, + list_sidecars, new_sidecar, recover_manifest_drift, schema_apply_serial_queue_key, + write_sidecar, }; pub use state::SubTableEntry; #[cfg(test)] diff --git a/crates/omnigraph/src/db/manifest/recovery.rs b/crates/omnigraph/src/db/manifest/recovery.rs index d49e86a..968d3f4 100644 --- a/crates/omnigraph/src/db/manifest/recovery.rs +++ b/crates/omnigraph/src/db/manifest/recovery.rs @@ -793,10 +793,10 @@ pub(crate) fn schema_apply_serial_queue_key() -> crate::db::write_queue::TableQu /// same table append extra Lance restore commits which `omnigraph /// cleanup` reclaims. /// -/// Concurrency: today recovery runs synchronously in `Omnigraph::open` -/// *before* the engine is wrapped in the server's `Arc>`. -/// No request handlers can race, so this sweep does NOT acquire write -/// queues. In-process callers (refresh, write entry points) must use +/// Concurrency: the open-time sweep runs synchronously in `Omnigraph::open` +/// before the engine handle is published to any caller, so no request +/// handler can race it and it does NOT acquire write queues. In-process +/// callers (refresh, write entry points) must use /// [`heal_pending_sidecars_roll_forward`] instead, which serializes /// against live writers via per-(table_key, branch) queue acquisition. 
pub(crate) async fn recover_manifest_drift( diff --git a/crates/omnigraph/src/db/mod.rs b/crates/omnigraph/src/db/mod.rs index 000602a..f382908 100644 --- a/crates/omnigraph/src/db/mod.rs +++ b/crates/omnigraph/src/db/mod.rs @@ -11,9 +11,9 @@ pub use graph_coordinator::{GraphCoordinator, ReadTarget, ResolvedTarget, Snapsh pub use manifest::{Snapshot, SubTableEntry, SubTableUpdate}; pub(crate) use omnigraph::ensure_public_branch_ref; pub use omnigraph::{ - CleanupPolicyOptions, InitOptions, MergeOutcome, Omnigraph, OpenMode, RepairAction, - RepairClassification, RepairOptions, RepairStats, SchemaApplyOptions, SchemaApplyResult, - SkipReason, TableCleanupStats, TableOptimizeStats, TableRepairStats, + CleanupPolicyOptions, InitOptions, MergeOutcome, Omnigraph, OpenMode, PendingIndex, + RepairAction, RepairClassification, RepairOptions, RepairStats, SchemaApplyOptions, + SchemaApplyResult, SkipReason, TableCleanupStats, TableOptimizeStats, TableRepairStats, }; pub(crate) const SCHEMA_APPLY_LOCK_BRANCH: &str = "__schema_apply_lock__"; diff --git a/crates/omnigraph/src/db/omnigraph.rs b/crates/omnigraph/src/db/omnigraph.rs index a1779a9..48be274 100644 --- a/crates/omnigraph/src/db/omnigraph.rs +++ b/crates/omnigraph/src/db/omnigraph.rs @@ -40,6 +40,7 @@ pub use repair::{ RepairAction, RepairClassification, RepairOptions, RepairStats, TableRepairStats, }; pub use schema_apply::SchemaApplyOptions; +pub use table_ops::PendingIndex; use super::commit_graph::GraphCommit; use super::manifest::{ @@ -113,10 +114,11 @@ pub struct Omnigraph { /// Read-heavy on schema introspection paths, written only by /// `apply_schema`. Same ArcSwap rationale as `catalog`. schema_source: Arc>, - /// Per-`(table_key, branch)` writer queues. Reachable from engine - /// internals (mutation finalize, schema_apply, branch_merge, - /// ensure_indices, delete_where) and from future MR-870 recovery - /// reconciler. PR 1b adds the field; callers acquire in commits 4+. 
+ /// Per-`(table_key, branch)` writer queues — the engine's + /// write-serialization mechanism (the server holds the engine as a + /// lockless `Arc`). Reachable from engine internals + /// (mutation finalize, schema_apply, branch_merge, ensure_indices, + /// delete_where, the fork path, recovery reconciler). write_queue: Arc, /// Process-wide mutex held across the swap → operate → restore window /// in `branch_merge_impl`. Two concurrent merges with distinct targets @@ -1107,11 +1109,15 @@ impl Omnigraph { /// unbranched subtables keep inheriting `main`, while subtables inherited /// from an ancestor branch are first forked into the active branch before /// their index metadata is updated. - pub async fn ensure_indices(&self) -> Result<()> { + /// Returns the declared indexes that could not be materialized on this + /// pass (today: vector columns with no trainable vectors yet). They are + /// deferred, not errors; a later `ensure_indices`/`optimize` builds them + /// once the column is trainable. Reads stay correct (brute-force) meanwhile. + pub async fn ensure_indices(&self) -> Result> { table_ops::ensure_indices(self).await } - pub async fn ensure_indices_on(&self, branch: &str) -> Result<()> { + pub async fn ensure_indices_on(&self, branch: &str) -> Result> { table_ops::ensure_indices_on(self, branch).await } @@ -1517,6 +1523,13 @@ impl Omnigraph { table_ops::open_for_mutation_on_branch(self, branch, table_key, op_kind).await } + /// Fork `table_key` onto `active_branch` from the given source state, + /// self-healing a manifest-unreferenced leftover fork if one is in the + /// way. Callers that reach this MUST already hold the per-`(table_key, + /// active_branch)` write queue (so the reclaim cannot race an in-process + /// fork) and must have confirmed via the live manifest that the table is + /// not yet on `active_branch`. Both the first-write fork path + /// (`open_owned_dataset_for_branch_write`) and `branch_merge` satisfy this. 
pub(crate) async fn fork_dataset_from_entry_state( &self, table_key: &str, @@ -1525,7 +1538,7 @@ impl Omnigraph { source_version: u64, active_branch: &str, ) -> Result { - table_ops::fork_dataset_from_entry_state( + match table_ops::fork_dataset_from_entry_state( self, table_key, full_path, @@ -1533,7 +1546,21 @@ impl Omnigraph { source_version, active_branch, ) - .await + .await? + { + crate::storage_layer::ForkOutcome::Created(ds) => Ok(ds), + crate::storage_layer::ForkOutcome::RefAlreadyExists => { + table_ops::reclaim_orphaned_fork_and_refork( + self, + table_key, + full_path, + source_branch, + source_version, + active_branch, + ) + .await + } + } } pub(crate) async fn reopen_for_mutation( @@ -1568,19 +1595,10 @@ impl Omnigraph { &self, table_key: &str, ds: &mut SnapshotHandle, - ) -> Result<()> { + ) -> Result> { table_ops::build_indices_on_dataset(self, table_key, ds).await } - pub(crate) async fn build_indices_on_dataset_for_catalog( - &self, - catalog: &Catalog, - table_key: &str, - ds: &mut SnapshotHandle, - ) -> Result<()> { - table_ops::build_indices_on_dataset_for_catalog(self, catalog, table_key, ds).await - } - // Used only by in-tree tests (`#[cfg(test)]`); the runtime path now // uses `commit_updates_on_branch_with_expected` exclusively. #[cfg(test)] @@ -2536,25 +2554,49 @@ edge WorksAt: Person -> Company } #[tokio::test] - async fn test_apply_schema_adds_index_for_existing_property() { + async fn test_apply_schema_defers_index_then_reconciler_builds_it() { + // iss-848: schema apply records the @index intent but builds nothing + // inline; a later ensure_indices materializes it once the table has + // rows. (Use `age`, which is unindexed in TEST_SCHEMA — `name @key` is + // already FTS-indexed at seed, so it can't show the deferral.) 
let dir = tempfile::tempdir().unwrap(); let uri = dir.path().to_str().unwrap(); let mut db = Omnigraph::init(uri, TEST_SCHEMA).await.unwrap(); + seed_person_row(&mut db, "Alice", Some(30)).await; - let desired = TEST_SCHEMA.replace("name: String @key", "name: String @key @index"); + let desired = TEST_SCHEMA.replace("age: I32?", "age: I32? @index"); db.apply_schema(&desired).await.unwrap(); + // Apply built nothing — the BTREE on `age` is deferred. let snapshot = db.snapshot().await; let ds = db .storage() .open_snapshot_at_table(&snapshot, "node:Person") .await .unwrap(); - assert!(db.storage().has_fts_index(&ds, "name").await.unwrap()); + assert!( + !db.storage().has_btree_index(&ds, "age").await.unwrap(), + "apply must not build the index inline (deferred to the reconciler)" + ); + + // The reconciler materializes it (Person has a row). + db.ensure_indices().await.unwrap(); + let snapshot = db.snapshot().await; + let ds = db + .storage() + .open_snapshot_at_table(&snapshot, "node:Person") + .await + .unwrap(); + assert!( + db.storage().has_btree_index(&ds, "age").await.unwrap(), + "ensure_indices must build the deferred index" + ); } #[tokio::test] - async fn test_apply_schema_rewrite_preserves_existing_indices() { + async fn test_apply_schema_rewrite_defers_index_then_reconciler_restores() { + // iss-848: an AddProperty rewrite writes a new dataset version without + // rebuilding indexes inline (deferred); ensure_indices restores them. let dir = tempfile::tempdir().unwrap(); let uri = dir.path().to_str().unwrap(); let initial_schema = TEST_SCHEMA.replace("name: String @key", "name: String @key @index"); @@ -2567,6 +2609,8 @@ edge WorksAt: Person -> Company ); db.apply_schema(&desired).await.unwrap(); + // After the rewrite the reconciler restores index coverage. 
+ db.ensure_indices().await.unwrap(); let snapshot = db.snapshot().await; let ds = db .storage() diff --git a/crates/omnigraph/src/db/omnigraph/optimize.rs b/crates/omnigraph/src/db/omnigraph/optimize.rs index 9195256..9181822 100644 --- a/crates/omnigraph/src/db/omnigraph/optimize.rs +++ b/crates/omnigraph/src/db/omnigraph/optimize.rs @@ -140,6 +140,12 @@ pub struct TableOptimizeStats { /// Lance HEAD version observed by optimize for drift skips. `None` for /// normal compaction/no-op/blob skips. pub lance_head_version: Option, + /// Declared `@index` columns on this table the reconciler could not build + /// this run, each with the `reason` (today: a vector column with no + /// trainable vectors yet). Empty on the common path. Reported, not fatal — a + /// later `optimize` retries; the `list_indices`/`indisvalid` analog so + /// operators can see which index is pending and why. + pub pending_indexes: Vec, } impl TableOptimizeStats { @@ -153,6 +159,7 @@ impl TableOptimizeStats { skipped: None, manifest_version: None, lance_head_version: None, + pending_indexes: Vec::new(), } } @@ -166,6 +173,7 @@ impl TableOptimizeStats { skipped: Some(reason), manifest_version: None, lance_head_version: None, + pending_indexes: Vec::new(), } } @@ -183,6 +191,7 @@ impl TableOptimizeStats { skipped: Some(SkipReason::DriftNeedsRepair), manifest_version: Some(manifest_version), lance_head_version: Some(lance_head_version), + pending_indexes: Vec::new(), } } } @@ -259,9 +268,7 @@ pub async fn optimize_all_tables(db: &Omnigraph) -> Result 0; // Even when there is nothing to compact, the table may still have index // work: rows appended since the index was built (e.g. via `ingest --mode - // merge`) are scanned unindexed until folded in. Either compaction or stale - // index coverage is enough to enter the publish path. 
If NEITHER, this - // table is a no-op and must NOT be pinned in a sidecar — a zero-commit pin - // classifies NoMovement on recovery and forces an all-or-nothing rollback - // of sibling tables' legitimate work. Uncovered pre-existing manifest/HEAD - // drift is skipped above and must go through explicit repair. + // merge`) are scanned unindexed until folded in (needs_reindex), OR a + // declared `@index` was never built — schema apply records the intent but + // defers the physical build (iss-848), so optimize is the operator-facing + // reconciler that materializes it (needs_index_create). Any of the three is + // enough to enter the publish path. If NONE, this table is a no-op and must + // NOT be pinned in a sidecar — a zero-commit pin classifies NoMovement on + // recovery and forces an all-or-nothing rollback of sibling tables' + // legitimate work. Uncovered pre-existing manifest/HEAD drift is skipped + // above and goes through explicit repair, so this only runs on a healthy + // table under the per-table queue + sidecar. let needs_reindex = TableStore::has_unindexed_fragments(&ds).await?; - if !will_compact && !needs_reindex { + // needs_index_work_* checks "a declared index is missing AND row_count > 0", + // so empty tables stay no-ops (never pinned). It re-reads the head under the + // queue we already hold, so it is consistent with `ds`. + let needs_index_create = if let Some(type_name) = table_key.strip_prefix("node:") { + super::table_ops::needs_index_work_node(db, type_name, &table_key, &full_path, None).await? + } else { + super::table_ops::needs_index_work_edge(db, &table_key, &full_path, None).await? 
+ }; + if !will_compact && !needs_reindex && !needs_index_create { return Ok(TableOptimizeStats::compacted( table_key, &CompactionMetrics::default(), @@ -427,7 +446,30 @@ async fn optimize_one_table( ds.optimize_indices(&OptimizeOptions::default()) .await .map_err(|e| OmniError::Lance(format!("optimize_indices on {}: {}", table_key, e)))?; - let version_after = ds.version().version; + + // Materialize any declared-but-missing index over the just-compacted layout, + // reusing the build chokepoint (idempotent: skips existing indexes; fault- + // isolates an untrainable vector column into `pending` rather than failing). + // Run it UNCONDITIONALLY now that we are past the no-op gate — not only when + // `needs_index_create`. A table can enter this path for compaction or + // reindex while its sole missing index is an untrainable Vector column + // (which `needs_index_work_*` does not count as buildable work); calling the + // build here is what surfaces that column in `pending_indexes`, so optimize + // can't compact a table yet silently drop the deferred-index signal. + // Idempotent + cheap when there is nothing to build. Vector index creation + // is an inline-commit residual; the Optimize sidecar's loose post_commit_pin + // covers the extra commits. + let catalog = db.catalog(); + let mut snapshot = crate::storage_layer::SnapshotHandle::new(ds); + let pending_indexes: Vec = + super::table_ops::build_indices_on_dataset_for_catalog( + db, + &catalog, + &table_key, + &mut snapshot, + ) + .await?; + let version_after = snapshot.dataset().version().version; let committed = version_after != version_before; // Pin the per-writer Phase B → Phase C residual for optimize: Lance HEAD has @@ -438,9 +480,6 @@ async fn optimize_one_table( // expected = the version observed under the queue). On failure the sidecar // is intentionally left for the open-time recovery sweep to roll forward. 
if committed { - // Re-wrap the post-compaction dataset to read its state through the - // trait surface (`table_state` is a read; no HEAD advance). - let snapshot = crate::storage_layer::SnapshotHandle::new(ds); let state = db.storage().table_state(&full_path, &snapshot).await?; let update = crate::db::SubTableUpdate { table_key: table_key.clone(), @@ -467,7 +506,9 @@ async fn optimize_one_table( ); } - Ok(TableOptimizeStats::compacted(table_key, &metrics, committed)) + let mut stat = TableOptimizeStats::compacted(table_key, &metrics, committed); + stat.pending_indexes = pending_indexes; + Ok(stat) } /// Run Lance `cleanup_old_versions` on every node + edge table on `main`, @@ -599,27 +640,37 @@ pub struct BranchReconcileStats { pub failures: Vec<(String, String)>, } -/// Drop every per-table and commit-graph Lance branch that the manifest no -/// longer references. +/// Drop every per-table and commit-graph Lance branch fork the manifest does +/// not reference. /// -/// Orphaned forks arise when a `branch_delete` flips the manifest authority -/// (atomic) but a downstream best-effort reclaim does not complete. They are -/// unreachable through any snapshot — no manifest entry can name them — yet -/// they pin their `tree/{branch}/` storage and can block reusing the branch -/// name. This is the guaranteed convergence backstop: it is idempotent and -/// derived purely from the manifest authority, so it no-ops once everything is -/// reconciled, and it would harmlessly find nothing if a future Lance atomic -/// multi-dataset branch op prevented orphans from forming. +/// Two origins produce a manifest-unreferenced fork: +/// 1. A `branch_delete` flips the manifest authority (atomic) but a +/// downstream best-effort reclaim does not complete — the whole branch is +/// gone from the manifest, but a `tree/{branch}/` ref lingers. +/// 2. 
A first-write fork (or a merge fork) creates the branch ref before the +/// manifest publish, then the writer dies / is cancelled — the branch is +/// still a live manifest branch, but the manifest's snapshot of it does +/// not place *this table* on the branch. /// -/// The keep-set is the full (unfiltered) manifest branch list, so system -/// branches' forks are never reclaimed; `main`/default is not a named Lance -/// branch and so is never a candidate. Referencing children are dropped before -/// parents (Lance refuses to delete a referenced parent) by ordering longest -/// branch names first. +/// The write path self-heals (2) on the next write to the table +/// (`reclaim_orphaned_fork_and_refork`); this is the guaranteed-convergence +/// backstop that also covers (1) and any table the write path never revisits. +/// +/// The orphan test is therefore **per-table**, not per-branch-name: a Lance +/// branch `B` on table `T` is an orphan iff `B` is not a live manifest branch +/// at all (origin 1) OR the manifest's branch-`B` snapshot does not place `T` +/// on `B` (origin 2). A legitimately-forked table (`table_branch == Some(B)`) +/// is kept. `main` and internal/system branches are never candidates. Lance +/// refuses to force-delete a branch with referencing descendants, so children +/// are dropped before parents (longest name first). Idempotent and authority- +/// derived: no-ops once reconciled, and degrades to finding nothing if a future +/// Lance atomic multi-dataset branch op prevents orphans from forming. pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result { - use std::collections::HashSet; + use std::collections::{HashMap, HashSet}; - let keep: HashSet = db + // Live manifest branches: the set whose per-table placements are + // authoritative. A branch absent here is a whole-branch (origin-1) orphan. 
+ let live_branches: HashSet = db .coordinator .read() .await @@ -640,6 +691,12 @@ pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result = HashMap::new(); + let mut failed_branch_snapshots: HashSet = HashSet::new(); // Per-table fault isolation: one table's transient failure is recorded and // logged, never aborting the rest of the sweep. @@ -658,7 +715,104 @@ pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result = Vec::new(); + for branch in listed { + // `main` is not a named Lance branch; system/internal branches + // (e.g. the schema-apply lock) own legitimate forks — never touch. + if branch == "main" || crate::db::is_internal_system_branch(&branch) { + continue; + } + let is_orphan = if !live_branches.contains(&branch) { + true // origin 1: whole branch gone from the manifest + } else { + // origin 2: live branch, but does the manifest place THIS + // table on it? Resolve (and cache) the branch's snapshot. + if failed_branch_snapshots.contains(&branch) { + continue; + } + if !branch_snapshots.contains_key(&branch) { + let branch_snapshot = + match crate::failpoints::maybe_fail("cleanup.resolve_branch_snapshot") { + Ok(()) => db.snapshot_for_branch(Some(&branch)).await, + Err(injected) => Err(injected), + }; + match branch_snapshot { + Ok(snap) => { + branch_snapshots.insert(branch.clone(), snap); + } + Err(err) => { + tracing::warn!( + target: "omnigraph::cleanup", + table = %table_key, + branch = %branch, + error = %err, + "resolving branch snapshot failed during reconcile; skipping", + ); + stats.failures.push((table_key.clone(), err.to_string())); + failed_branch_snapshots.insert(branch.clone()); + continue; + } + } + } + branch_snapshots[&branch] + .entry(&table_key) + .map(|e| e.table_branch.as_deref() != Some(branch.as_str())) + .unwrap_or(true) + }; + if is_orphan { + orphans.push(branch); + } + } + // Children before parents (longest name first) so Lance's referenced- + // parent RefConflict cannot block reclamation. 
+ orphans.sort_by(|a, b| b.len().cmp(&a.len()).then_with(|| a.cmp(b))); + + for branch in orphans { + // Serialize against in-process live writers before destroying a ref. + // A first-write fork holds the per-(table, branch) write queue from + // before the fork through the manifest publish; on a LIVE branch its + // in-flight fork looks exactly like an origin-2 orphan (manifest not + // yet advanced). Acquire the same queue so cleanup waits for any such + // writer, then RE-VALIDATE under the queue with a fresh read: if the + // writer published in the meantime (table now placed on the branch), + // it is no longer an orphan — skip it. (Cross-process writers remain + // the documented one-winner-CAS gap.) One key held at a time → no + // lock-order inversion against multi-table `acquire_many` writers. + let _guard = db + .write_queue() + .acquire(&(table_key.clone(), Some(branch.clone()))) + .await; + // Decide under the queue from FRESH authority via the shared + // classifier (same decision the write-path reclaim uses) — never + // from the sweep-start `live_branches` capture. A branch created + // AFTER that capture is absent from the stale set yet may already + // carry a legitimately-published fork (an in-process writer held + // this queue through its fork+publish; we just waited on it), so a + // stale "origin-1 ⇒ delete" shortcut would destroy a live fork. + // Only `Orphan` is reclaimed; `Indeterminate` (transient read) is + // skipped and recorded. (Cross-process writers remain the documented + // one-winner-CAS gap.) One key held at a time → no lock-order + // inversion vs multi-table `acquire_many` writers. 
+ match super::table_ops::classify_fork_ref(db, &table_key, &branch).await { + super::table_ops::ForkRefStatus::Orphan => {} + super::table_ops::ForkRefStatus::Legitimate => continue, + super::table_ops::ForkRefStatus::Indeterminate => { + tracing::warn!( + target: "omnigraph::cleanup", + table = %table_key, + branch = %branch, + "fresh re-check inconclusive during reconcile; skipping to avoid \ + destroying a possibly-live fork (will retry next cleanup)", + ); + stats.failures.push(( + table_key.clone(), + format!("indeterminate fork status for {branch}"), + )); + continue; + } + } let outcome = match crate::failpoints::maybe_fail("cleanup.reconcile_fork") { Ok(()) => storage.force_delete_branch(&full_path, &branch).await, Err(injected) => Err(injected), @@ -679,15 +833,17 @@ pub async fn reconcile_orphaned_branches(db: &Omnigraph) -> Result keys.sort(); keys } + +#[cfg(all(test, feature = "failpoints"))] +mod tests { + use super::*; + use crate::failpoints::ScopedFailPoint; + use crate::loader::{LoadMode, load_jsonl}; + + fn node_table_uri(root: &str, type_name: &str) -> String { + let mut hash: u64 = 0xcbf2_9ce4_8422_2325; + for &b in type_name.as_bytes() { + hash ^= b as u64; + hash = hash.wrapping_mul(0x100_0000_01b3); + } + format!("{}/nodes/{hash:016x}", root.trim_end_matches('/')) + } + + #[tokio::test] + async fn reconcile_caches_live_branch_snapshot_resolution_failure() { + let _scenario = fail::FailScenario::setup(); + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap(); + let schema = "node Person { name: String @key }\nnode Company { name: String @key }\n"; + let mut db = Omnigraph::init(uri, schema).await.unwrap(); + load_jsonl( + &mut db, + "{\"type\":\"Person\",\"data\":{\"name\":\"Alice\"}}\n\ + {\"type\":\"Company\",\"data\":{\"name\":\"Acme\"}}", + LoadMode::Merge, + ) + .await + .unwrap(); + db.branch_create("feature").await.unwrap(); + + for type_name in ["Person", "Company"] { + let table_uri = node_table_uri(uri, 
type_name); + let mut ds = lance::Dataset::open(&table_uri).await.unwrap(); + let base = ds.version().version; + ds.create_branch("feature", base, None).await.unwrap(); + } + + let _fp = ScopedFailPoint::new("cleanup.resolve_branch_snapshot", "return"); + let stats = reconcile_orphaned_branches(&db).await.unwrap(); + + assert_eq!( + stats.failures.len(), + 1, + "one live-branch snapshot resolution failure should be reported once, \ + not once per table: {:?}", + stats.failures + ); + assert!( + stats.failures[0] + .1 + .contains("cleanup.resolve_branch_snapshot"), + "the recorded failure should be the branch-snapshot resolution failure: {:?}", + stats.failures + ); + assert!( + stats.reclaimed.is_empty(), + "unreadable live-branch refs must be left for the next cleanup run" + ); + } +} diff --git a/crates/omnigraph/src/db/omnigraph/schema_apply.rs b/crates/omnigraph/src/db/omnigraph/schema_apply.rs index f965ad4..48f8099 100644 --- a/crates/omnigraph/src/db/omnigraph/schema_apply.rs +++ b/crates/omnigraph/src/db/omnigraph/schema_apply.rs @@ -193,7 +193,6 @@ where let mut added_tables = BTreeSet::new(); let mut renamed_tables = HashMap::new(); let mut rewritten_tables = BTreeSet::new(); - let mut indexed_tables = BTreeSet::new(); let mut dropped_tables = BTreeSet::new(); // Hard-drop cleanup targets: (table_key, full_dataset_uri). // Populated for DropProperty { Hard } and DropType { Hard }; the @@ -252,14 +251,14 @@ where .or_default() .insert(to.clone(), from.clone()); } - SchemaMigrationStep::AddConstraint { - type_kind, - type_name, - .. - } => { - indexed_tables.insert(schema_table_key(*type_kind, type_name)); - } - SchemaMigrationStep::UpdateTypeMetadata { .. } + // AddConstraint is only ever an `@index` addition (every other + // added constraint plans as UnsupportedChange). 
It records intent + // in the desired catalog/IR; the physical index is built off the + // critical path by ensure_indices/optimize (iss-848), so the apply + // does no table work for it — a pure metadata change like the two + // metadata steps below. + SchemaMigrationStep::AddConstraint { .. } + | SchemaMigrationStep::UpdateTypeMetadata { .. } | SchemaMigrationStep::UpdatePropertyMetadata { .. } => {} SchemaMigrationStep::DropProperty { type_kind, @@ -347,18 +346,15 @@ where let mut table_updates = HashMap::::new(); let mut table_tombstones = HashMap::::new(); - // Recovery sidecar: protect the per-table commit_staged loop in - // rewritten_tables + indexed_tables. The post_commit_pin we record - // here is a lower bound (expected + 1); the classifier loose-matches - // for SidecarKind::SchemaApply because the actual N depends on how - // many indices need building. See classify_table's loose-match arm. + // Recovery sidecar: protect the per-table `stage_overwrite` + + // `commit_staged` in rewritten_tables — the only tables that advance Lance + // HEAD inline now that index building is deferred to the reconciler + // (iss-848). Each rewritten table is exactly one commit, so + // `post_commit_pin = expected + 1` is now exact (it was a loose lower bound + // when index builds added extra commits); the classifier's loose-match for + // SidecarKind::SchemaApply still accepts it. let recovery_pins: Vec = rewritten_tables .iter() - .chain(indexed_tables.iter().filter(|t| { - !rewritten_tables.contains(*t) - && !added_tables.contains(*t) - && !renamed_tables.contains_key(*t) - })) .filter_map(|table_key| { let entry = snapshot.entry(table_key)?; Some(crate::db::manifest::SidecarTablePin { @@ -432,10 +428,10 @@ where // manifest publish via `commit_changes_with_actor` below. // // Schema-apply already holds the graph-wide `__schema_apply_lock__` - // sentinel branch, so under PR 1b's intermediate state these - // per-table acquisitions are uncontended. 
They exist for symmetry - // with future MR-870 recovery, which will need queue acquisition - // before any `Dataset::restore` it issues for SchemaApply sidecars. + // sentinel branch, so these per-table acquisitions are uncontended in + // practice. They exist for symmetry with the recovery reconciler, which + // acquires the same queues before any `Dataset::restore` it issues for + // SchemaApply sidecars. let mut schema_apply_queue_keys: Vec<(String, Option)> = recovery_pins .iter() .map(|pin| (pin.table_key.clone(), pin.table_branch.clone())) @@ -490,10 +486,11 @@ where let table_path = table_path_for_table_key(table_key)?; let dataset_uri = db.storage().dataset_uri(&table_path); let schema = schema_for_table_key(&desired_catalog, table_key)?; - let mut ds = + let ds = SnapshotHandle::new(TableStore::create_empty_dataset(&dataset_uri, &schema).await?); - db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut ds) - .await?; + // Indexes for the new table are materialized off the critical path by + // ensure_indices/optimize (iss-848); a 0-row table is never trainable + // anyway. The @index intent is recorded in the persisted catalog/IR. let state = db.storage().table_state(&dataset_uri, &ds).await?; table_registrations.insert(table_key.clone(), table_path); table_updates.insert( @@ -533,10 +530,9 @@ where .await?; let table_path = table_path_for_table_key(target_table_key)?; let dataset_uri = db.storage().dataset_uri(&table_path); - let mut target_ds = + let target_ds = SnapshotHandle::new(TableStore::write_dataset(&dataset_uri, batch).await?); - db.build_indices_on_dataset_for_catalog(&desired_catalog, target_table_key, &mut target_ds) - .await?; + // Indexes on the renamed table are reconciled later (iss-848). 
let state = db.storage().table_state(&dataset_uri, &target_ds).await?; table_registrations.insert(target_table_key.clone(), table_path); table_updates.insert( @@ -593,9 +589,10 @@ where .open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref()) .await?; let staged = db.storage().stage_overwrite(&existing, batch).await?; - let mut target_ds = db.storage().commit_staged(existing, staged).await?; - db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut target_ds) - .await?; + let target_ds = db.storage().commit_staged(existing, staged).await?; + // The rewrite drops the table's existing index coverage; it is + // restored off the critical path by optimize's optimize_indices / + // ensure_indices (iss-848). Reads scan uncovered fragments meanwhile. let state = db.storage().table_state(&dataset_uri, &target_ds).await?; table_updates.insert( table_key.clone(), @@ -609,41 +606,12 @@ where ); } - for table_key in &indexed_tables { - if added_tables.contains(table_key) - || renamed_tables.contains_key(table_key) - || rewritten_tables.contains(table_key) - { - continue; - } - let entry = snapshot.entry(table_key).ok_or_else(|| { - OmniError::manifest(format!( - "missing table '{}' for schema index apply", - table_key - )) - })?; - ensure_snapshot_entry_head_matches(db, entry).await?; - let dataset_uri = db.storage().dataset_uri(&entry.table_path); - let mut ds = db - .storage() - .open_dataset_head_for_write(table_key, &dataset_uri, entry.table_branch.as_deref()) - .await?; - db.storage() - .ensure_expected_version(&ds, table_key, entry.table_version)?; - db.build_indices_on_dataset_for_catalog(&desired_catalog, table_key, &mut ds) - .await?; - let state = db.storage().table_state(&dataset_uri, &ds).await?; - table_updates.insert( - table_key.clone(), - crate::db::SubTableUpdate { - table_key: table_key.clone(), - table_version: state.version, - table_branch: None, - row_count: state.row_count, - version_metadata: 
state.version_metadata, - }, - ); - } + // Index-only changes (AddConstraint, i.e. adding an `@index`) are pure + // metadata: the new `@index` intent is recorded in the desired catalog/IR + // persisted below, and the physical index is materialized off the critical + // path by `ensure_indices`/`optimize` (iss-848). Schema apply touches no + // table data for them, so there is no per-table loop here and no recovery + // pin (no Lance HEAD advances). Reads stay correct meanwhile via a scan. let mut manifest_changes = Vec::new(); for (table_key, table_path) in table_registrations { diff --git a/crates/omnigraph/src/db/omnigraph/table_ops.rs b/crates/omnigraph/src/db/omnigraph/table_ops.rs index 3f40c1d..d30acff 100644 --- a/crates/omnigraph/src/db/omnigraph/table_ops.rs +++ b/crates/omnigraph/src/db/omnigraph/table_ops.rs @@ -21,7 +21,7 @@ pub(super) async fn graph_index_for_resolved( db.runtime_cache.graph_index(resolved, &catalog).await } -pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<()> { +pub(super) async fn ensure_indices(db: &Omnigraph) -> Result> { let current_branch = db .coordinator .read() @@ -31,7 +31,7 @@ pub(super) async fn ensure_indices(db: &Omnigraph) -> Result<()> { ensure_indices_for_branch(db, current_branch.as_deref()).await } -pub(super) async fn ensure_indices_on(db: &Omnigraph, branch: &str) -> Result<()> { +pub(super) async fn ensure_indices_on(db: &Omnigraph, branch: &str) -> Result> { let branch = normalize_branch_name(branch)?; ensure_indices_for_branch(db, branch.as_deref()).await } @@ -73,12 +73,16 @@ pub(super) async fn failpoint_publish_table_head_without_index_rebuild_for_test( .await } -pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&str>) -> Result<()> { +pub(super) async fn ensure_indices_for_branch( + db: &Omnigraph, + branch: Option<&str>, +) -> Result> { db.ensure_schema_state_valid().await?; db.ensure_schema_apply_idle("ensure_indices").await?; let resolved = 
db.resolved_branch_target(branch).await?; let snapshot = resolved.snapshot; let mut updates = Vec::new(); + let mut pending = Vec::new(); let active_branch = resolved.branch; let catalog = db.catalog(); @@ -160,9 +164,8 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st // that needs index work. Held across the per-table commit loop and // the manifest publish at the end of this function. Sorted-order // acquisition prevents lock-order inversion against concurrent - // multi-table writers (mutation finalize, branch_merge, future - // MR-870 recovery). Under PR 1b's intermediate state (global server - // RwLock still in place), this acquisition is uncontended. + // multi-table writers (mutation finalize, branch_merge, the fork + // path, recovery). let queue_keys: Vec<(String, Option)> = recovery_pins .iter() .map(|pin| (pin.table_key.clone(), pin.table_branch.clone())) @@ -217,7 +220,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st }; let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0); if row_count > 0 { - build_indices_on_dataset(db, &table_key, &mut ds).await?; + pending.extend(build_indices_on_dataset(db, &table_key, &mut ds).await?); } let state = db.storage().table_state(&full_path, &ds).await?; @@ -265,7 +268,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st }; let row_count = db.storage().count_rows(&ds, None).await.unwrap_or(0); if row_count > 0 { - build_indices_on_dataset(db, &table_key, &mut ds).await?; + pending.extend(build_indices_on_dataset(db, &table_key, &mut ds).await?); } let state = db.storage().table_state(&full_path, &ds).await?; @@ -307,7 +310,7 @@ pub(super) async fn ensure_indices_for_branch(db: &Omnigraph, branch: Option<&st } } - Ok(()) + Ok(pending) } /// The single scalar/vector index a node property receives from a one-column @@ -352,6 +355,26 @@ fn node_prop_index_kind(prop_type: &PropType) -> Option { } } +/// 
Whether a vector column currently has at least one non-null vector — the +/// minimum for Lance IVF k-means to train (the `ivf_flat(1)` index we build +/// needs >=1 vector). Used identically by `needs_index_work_node` (so an +/// untrainable column is not pinned for recovery — avoiding a zero-commit pin +/// that would roll back a sibling's index work) and by the vector build arm (so +/// `create_vector_index` is only attempted when it can succeed, keeping its +/// genuine errors fatal instead of swallowed as pending). If index params +/// become size-aware (dev-graph iss-687), this threshold moves with them. +async fn vector_column_trainable( + db: &Omnigraph, + ds: &SnapshotHandle, + column: &str, +) -> Result { + Ok(db + .storage() + .count_rows(ds, Some(format!("{column} IS NOT NULL"))) + .await? + > 0) +} + /// Returns true if the node table is missing at least one declared /// scalar/vector index that `build_indices_on_dataset_for_catalog` would /// build AND has at least one row (the ensure_indices loop has @@ -366,7 +389,7 @@ fn node_prop_index_kind(prop_type: &PropType) -> Option { /// (DateTime/Date/numeric/Bool), FTS for free-text Strings, or a Vector index. /// Edges get BTree only (id, src, dst). This helper and the builder share /// `node_prop_index_kind` so they cannot drift — see its doc comment. -async fn needs_index_work_node( +pub(super) async fn needs_index_work_node( db: &Omnigraph, type_name: &str, table_key: &str, @@ -409,7 +432,14 @@ async fn needs_index_work_node( } } Some(NodePropIndexKind::Vector) => { - if !db.storage().has_vector_index(&ds, prop_name).await? { + // Only count a missing vector index as buildable *work* when the + // column is trainable (>=1 non-null vector). An untrainable + // column would defer in the build and commit nothing; pinning it + // for recovery would be a zero-commit pin that classifies + // NoMovement and rolls back a sibling table's index work. + if !db.storage().has_vector_index(&ds, prop_name).await? 
+ && vector_column_trainable(db, &ds, prop_name).await? + { return Ok(true); } } @@ -434,7 +464,7 @@ async fn needs_index_work_node( /// /// Empty edge tables are skipped by the ensure_indices loop the same /// way node tables are; see `needs_index_work_node`. -async fn needs_index_work_edge( +pub(super) async fn needs_index_work_edge( db: &Omnigraph, table_key: &str, full_path: &str, @@ -551,8 +581,14 @@ pub(super) async fn open_owned_dataset_for_branch_write( )); } } - fork_dataset_from_entry_state( - db, + // The fork advances Lance state before the manifest publish. The + // caller holds the per-(table, active_branch) write queue from + // before this fork through the publish, so a leftover ref is a + // manifest-unreferenced fork (interrupted prior fork, or + // delete+recreate), not a live in-process fork. The wrapper + // self-heals it (reclaim + re-fork); see + // `Omnigraph::fork_dataset_from_entry_state`. + db.fork_dataset_from_entry_state( table_key, full_path, source_branch, @@ -580,7 +616,7 @@ pub(super) async fn fork_dataset_from_entry_state( source_branch: Option<&str>, source_version: u64, active_branch: &str, -) -> Result { +) -> Result> { db.storage() .fork_branch_from_state( full_path, @@ -592,6 +628,172 @@ pub(super) async fn fork_dataset_from_entry_state( .await } +/// Classification of a Lance branch ref `B` on table `T` against FRESH manifest +/// authority — the single decision both fork-ref reclaim sites share: the +/// write-path reclaim ([`reclaim_orphaned_fork_and_refork`]) and the cleanup +/// reconciler (`optimize::reconcile_orphaned_branches`). Having one classifier +/// keeps the two destructive sites from drifting (the bug history: each was +/// hardened separately and the other lagged). +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub(crate) enum ForkRefStatus { + /// The manifest places `T` on `B` — a legitimate fork. Never destroy. 
+ Legitimate, + /// The manifest does not reference this fork (`T` not on `B`, or `B` absent + /// from the manifest entirely). Reclaimable. + Orphan, + /// Fresh authority could not be established (a transient read failure on a + /// live branch). Ambiguous — do not destroy; the caller retries / converges. + Indeterminate, +} + +/// Classify a fork ref from FRESH manifest authority (bypasses the coordinator +/// cache). MUST be called with the per-`(table, branch)` write queue held, so +/// the classification is stable against in-process writers for the caller's +/// critical section. Both reclaim sites map the result to their own action +/// (write path: reclaim vs retryable; cleanup: delete vs skip), but the +/// destroy-only-on-`Orphan` rule is enforced here, once. +pub(crate) async fn classify_fork_ref( + db: &Omnigraph, + table_key: &str, + branch: &str, +) -> ForkRefStatus { + // `classify.fresh_read` failpoint: simulate a transient failure of the + // fresh-authority read (no-op without the `failpoints` feature). Lets a + // test exercise the Indeterminate path — a read failure on a live branch + // must classify as Indeterminate (skip), never Orphan (destroy). + let fresh = match crate::failpoints::maybe_fail("classify.fresh_read") { + Ok(()) => db.fresh_snapshot_for_branch(Some(branch)).await, + Err(injected) => Err(injected), + }; + match fresh { + Ok(snap) => { + let placed = snap + .entry(table_key) + .map(|e| e.table_branch.as_deref() == Some(branch)) + .unwrap_or(false); + if placed { + ForkRefStatus::Legitimate + } else { + // Branch resolves but the manifest does not place this table on + // it — a manifest-unreferenced fork. + ForkRefStatus::Orphan + } + } + // Branch did not resolve. `all_branches` lists `_refs/branches/` live, so + // absent there = genuinely no such manifest branch (origin-1 orphan); + // present (or a list error) = transient read — never destroy on that. 
+ Err(_) => match db.coordinator.read().await.all_branches().await { + Ok(fresh) if !fresh.iter().any(|b| b == branch) => ForkRefStatus::Orphan, + _ => ForkRefStatus::Indeterminate, + }, + } +} + +/// Reclaim a manifest-unreferenced fork and re-fork in its place. +/// +/// Reached when `fork_branch_from_state` reports `RefAlreadyExists`. This is a +/// destructive op (it force-deletes a Lance branch ref), so it owns its own +/// safety precondition rather than trusting the caller's: it re-derives, via +/// [`classify_fork_ref`], that the manifest does not place this table on +/// `active_branch`. The caller's earlier proof may have come from the +/// coordinator's *cached* branch snapshot (`resolved_branch_target` returns +/// the cache when the handle is bound to `active_branch` — an embedded handle +/// on the branch, or `branch_merge`'s target swap); trusting it could +/// force-delete a fork a concurrent writer just legitimately published. Only +/// once fresh authority confirms the ref is unreferenced does it drop the ref +/// (idempotent `force_delete_branch`) and re-fork, exactly once. +/// +/// If fresh authority shows the table IS on `active_branch` (a legitimate +/// concurrent fork), or a second collision occurs after reclaim (a foreign- +/// process writer recreated the ref — the documented one-winner-CAS gap), it +/// surfaces a retryable conflict; on retry the winner's fork is visible and +/// the no-fork path runs. +pub(super) async fn reclaim_orphaned_fork_and_refork( + db: &Omnigraph, + table_key: &str, + full_path: &str, + source_branch: Option<&str>, + source_version: u64, + active_branch: &str, +) -> Result { + // Self-validate against FRESH authority before destroying anything. 
Only an + // Orphan is reclaimable; a Legitimate status (a concurrent writer published + // a real fork despite the caller's possibly-cached proof) or an + // Indeterminate one (transient read) surfaces a retryable conflict rather + // than stranding the manifest at a version the recreated ref won't have. + match classify_fork_ref(db, table_key, active_branch).await { + ForkRefStatus::Orphan => {} + ForkRefStatus::Legitimate => { + let actual = db + .fresh_snapshot_for_branch(Some(active_branch)) + .await + .ok() + .and_then(|s| s.entry(table_key).map(|e| e.table_version)) + .unwrap_or(source_version); + return Err(OmniError::manifest_expected_version_mismatch( + table_key, + source_version, + actual, + )); + } + ForkRefStatus::Indeterminate => { + return Err(OmniError::manifest_conflict(format!( + "could not verify whether branch '{active_branch}' still owns an orphaned \ + fork for table '{table_key}' because fresh manifest authority was \ + unavailable; refresh and retry" + ))); + } + } + + crate::failpoints::maybe_fail("fork.before_reclaim")?; + db.storage() + .force_delete_branch(full_path, active_branch) + .await + .map_err(|e| { + // Lance refuses to delete a branch with dependent child branches + // even under force (RefConflict). Unreachable for a leaf first-write + // fork (the cleanup reconciler also drops children before parents), + // but surface it actionably if it ever happens. We match loosely on + // "referenc" rather than the exact prose, which is not a Lance API + // contract; a typed RefConflict variant through `force_delete_branch` + // is the durable follow-up. 
+ if e.to_string().contains("referenc") { + OmniError::manifest_conflict(format!( + "branch '{active_branch}' cannot reclaim the leftover fork for \ + table '{table_key}' because it has dependent child branches; \ + delete the child branches (or run `omnigraph cleanup`) first" + )) + } else { + e + } + })?; + + match fork_dataset_from_entry_state( + db, + table_key, + full_path, + source_branch, + source_version, + active_branch, + ) + .await? + { + crate::storage_layer::ForkOutcome::Created(ds) => Ok(ds), + crate::storage_layer::ForkOutcome::RefAlreadyExists => { + let live = db.fresh_snapshot_for_branch(Some(active_branch)).await?; + let actual = live + .entry(table_key) + .map(|e| e.table_version) + .unwrap_or(source_version); + Err(OmniError::manifest_expected_version_mismatch( + table_key, + source_version, + actual, + )) + } + } +} + pub(super) async fn reopen_for_mutation( db: &Omnigraph, table_key: &str, @@ -632,11 +834,25 @@ pub(super) async fn open_dataset_at_state( .await } +/// A declared index the builder could not materialize on this pass. Today the +/// only such case is a vector (IVF) column with no trainable vectors yet +/// (KMeans needs >=1 vector), e.g. the load-before-embed window. Reported, not +/// fatal: a later `ensure_indices`/`optimize` retries once the column is +/// buildable, and reads stay correct via brute-force meanwhile. Surfacing +/// pending index *status* rather than failing the operation is the database +/// norm (Postgres `indisvalid`, LanceDB `list_indices`). 
+#[derive(Debug, Clone)] +pub struct PendingIndex { + pub table_key: String, + pub column: String, + pub reason: String, +} + pub(super) async fn build_indices_on_dataset( db: &Omnigraph, table_key: &str, ds: &mut SnapshotHandle, -) -> Result<()> { +) -> Result> { let catalog = db.catalog(); build_indices_on_dataset_for_catalog(db, &catalog, table_key, ds).await } @@ -646,8 +862,9 @@ pub(super) async fn build_indices_on_dataset_for_catalog( catalog: &Catalog, table_key: &str, ds: &mut SnapshotHandle, -) -> Result<()> { +) -> Result> { if let Some(type_name) = table_key.strip_prefix("node:") { + let mut pending = Vec::new(); if !db.storage().has_btree_index(ds, "id").await? { stage_and_commit_btree(db, table_key, ds, &["id"]).await?; } @@ -676,22 +893,52 @@ pub(super) async fn build_indices_on_dataset_for_catalog( } Some(NodePropIndexKind::Vector) => { if !db.storage().has_vector_index(ds, prop_name).await? { - // Inline-commit residual: lance-6.0.1 does not - // expose `build_index_metadata_from_segments` as - // `pub`, so vector indices cannot be staged from - // outside the lance crate. Document at the call - // site; companion ticket to lance-format/lance#6658. - let new_snap = db - .storage_inline_residual() - .create_vector_index(ds.clone(), prop_name.as_str()) - .await - .map_err(|e| { - OmniError::Lance(format!( - "create Vector index on {}({}): {}", - table_key, prop_name, e - )) - })?; - *ds = new_snap; + // A vector (IVF) index trains k-means over the column, + // so it needs >=1 non-null vector (KMeans errors + // "cannot train N centroids with 0 vectors"). Precheck + // trainability: a column with no vectors yet (e.g. rows + // loaded before `embed`) is recorded as a *pending* + // index and skipped — deferred, not failed. The SAME + // predicate gates `needs_index_work_node`, so an + // untrainable column is never pinned for recovery (no + // zero-commit pin that would roll back a sibling + // table's index work). 
This function is the chokepoint + // every write path funnels through (load/mutate, schema + // apply, ensure_indices, optimize, merge), realizing + // the governing principle — physical index state never + // fails a logical operation. Only when trainable do we + // attempt the build, and then we PROPAGATE any error: a + // genuine I/O/manifest/Lance failure must stay fatal, + // not be hidden as pending. (Vector creation is an + // inline-commit residual until lance#6666; iss-951.) + if vector_column_trainable(db, ds, prop_name).await? { + let new_snap = db + .storage_inline_residual() + .create_vector_index(ds.clone(), prop_name.as_str()) + .await + .map_err(|e| { + OmniError::Lance(format!( + "create Vector index on {}({}): {}", + table_key, prop_name, e + )) + })?; + *ds = new_snap; + } else { + tracing::info!( + target: "omnigraph::index", + table = %table_key, + column = %prop_name, + "deferring Vector index: column has no \ + trainable vectors yet", + ); + pending.push(PendingIndex { + table_key: table_key.to_string(), + column: prop_name.clone(), + reason: "column has no non-null vectors to \ + train on yet" + .to_string(), + }); + } } } // Enum + orderable scalars (DateTime/Date/numeric/Bool) @@ -709,7 +956,7 @@ pub(super) async fn build_indices_on_dataset_for_catalog( } } } - return Ok(()); + return Ok(pending); } if table_key.starts_with("edge:") { @@ -722,7 +969,9 @@ pub(super) async fn build_indices_on_dataset_for_catalog( if !db.storage().has_btree_index(ds, "dst").await? { stage_and_commit_btree(db, table_key, ds, &["dst"]).await?; } - return Ok(()); + // Edge tables only get BTree (id/src/dst), which build at any + // cardinality; no pending state is possible here. + return Ok(Vec::new()); } Err(OmniError::manifest(format!( @@ -844,7 +1093,11 @@ async fn prepare_updates_for_commit( crate::db::MutationOpKind::SchemaRewrite, ) .await?; - build_indices_on_dataset(db, &prepared_update.table_key, &mut ds).await?; + // Any column not yet buildable (e.g. 
a vector column whose rows + // have null embeddings) is deferred and logged inside + // build_indices; a later ensure_indices/optimize materializes it. + // The load/mutate/merge commit must not fail on it. + let _pending = build_indices_on_dataset(db, &prepared_update.table_key, &mut ds).await?; let state = db.storage().table_state(&full_path, &ds).await?; prepared_update.table_version = state.version; prepared_update.row_count = state.row_count; @@ -1045,3 +1298,78 @@ pub(super) async fn ensure_commit_graph_initialized(db: &Omnigraph) -> Result<() pub(super) async fn invalidate_graph_index(db: &Omnigraph) { db.runtime_cache.invalidate_all().await; } + +#[cfg(test)] +mod classify_fork_ref_tests { + //! Direct coverage of [`classify_fork_ref`] — the single fresh-authority + //! decision both fork-ref reclaim sites (write-path reclaim + cleanup + //! reconciler) route through. Pins each deterministic status so reverting + //! the fresh-authority logic at either site fails here. (The `Indeterminate` + //! arm needs an injected transient read and is covered under the + //! `failpoints` suite.) + use super::*; + use crate::db::Omnigraph; + use crate::loader::LoadMode; + + const SCHEMA: &str = "node Person { name: String @key }\nnode Company { name: String @key }\n"; + + /// On-disk dataset path for a node table, taken from the manifest entry + /// (the same path the engine uses) so the test forges against the real ref. 
+ async fn node_path(db: &Omnigraph, branch: &str, table_key: &str) -> String { + let snap = db.snapshot_for_branch(Some(branch)).await.unwrap(); + let entry = snap.entry(table_key).unwrap(); + format!("{}/{}", db.root_uri, entry.table_path) + } + + #[tokio::test] + async fn classify_distinguishes_legitimate_unreferenced_and_ghost() { + let dir = tempfile::tempdir().unwrap(); + let db = Omnigraph::init(dir.path().to_str().unwrap(), SCHEMA) + .await + .unwrap(); + db.branch_create("feature").await.unwrap(); + + // Legitimate: a real write forks Company onto `feature`, and the + // manifest places Company on `feature`. + db.load_as( + "feature", + None, + r#"{"type":"Company","data":{"name":"Acme"}}"#, + LoadMode::Merge, + None, + ) + .await + .unwrap(); + assert_eq!( + classify_fork_ref(&db, "node:Company", "feature").await, + ForkRefStatus::Legitimate, + "a manifest-placed fork must classify as Legitimate (never destroyed)" + ); + + // Orphan (manifest-unreferenced): forge a `feature` ref on Person, which + // the manifest's `feature` snapshot still places on main. + let person = node_path(&db, "feature", "node:Person").await; + { + let mut ds = lance::Dataset::open(&person).await.unwrap(); + let v = ds.version().version; + ds.create_branch("feature", v, None).await.unwrap(); + } + assert_eq!( + classify_fork_ref(&db, "node:Person", "feature").await, + ForkRefStatus::Orphan, + "a ref the manifest does not place on the branch must classify as Orphan" + ); + + // Orphan (ghost): a ref for a branch the manifest does not have at all. 
+ { + let mut ds = lance::Dataset::open(&person).await.unwrap(); + let v = ds.version().version; + ds.create_branch("ghost", v, None).await.unwrap(); + } + assert_eq!( + classify_fork_ref(&db, "node:Person", "ghost").await, + ForkRefStatus::Orphan, + "a ref for a branch absent from the manifest must classify as Orphan" + ); + } +} diff --git a/crates/omnigraph/src/db/write_queue.rs b/crates/omnigraph/src/db/write_queue.rs index 1f0c53a..18a14d1 100644 --- a/crates/omnigraph/src/db/write_queue.rs +++ b/crates/omnigraph/src/db/write_queue.rs @@ -1,12 +1,15 @@ -//! Per-`(table_key, branch)` writer queues — MR-686 scaffolding. +//! Per-`(table_key, branch)` writer queues. //! -//! Today every server-layer write serializes on the global -//! `Arc>` in `AppState`. MR-686 replaces that with -//! per-`(table_key, branch_ref)` queues so disjoint-key writes proceed -//! concurrently. This module owns the queue data structure; callers in -//! `MutationStaging::commit_all`, `branch_merge`, `schema_apply`, -//! `ensure_indices`, `delete_where`, and the future MR-870 recovery -//! reconciler acquire guards before any per-table Lance commit. +//! These queues are the engine's write-serialization mechanism: the server +//! holds the engine as a lockless `Arc` (writes are `&self`), so +//! disjoint-key writes proceed concurrently and only writes to the same +//! `(table_key, branch_ref)` serialize here. This module owns the queue +//! data structure; callers in `MutationStaging::commit_all`, `branch_merge`, +//! `schema_apply`, `ensure_indices`, `delete_where`, the fork path (first +//! write to a table on a branch — acquired before the fork, held through the +//! manifest publish), and the recovery reconciler acquire guards before any +//! per-table Lance commit. Serialization is in-process only; cross-process +//! writers on one graph remain one-winner-CAS at the manifest publish. //! //! ## Why exclusive `tokio::sync::Mutex<()>` per key //! 
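Reviewer sketch (not part of the patch): the module doc above describes per-`(table_key, branch)` exclusive queues, and the call sites in this diff rely on `acquire_many` sorting its keys so that multi-table writers agree on lock order. The following is a minimal illustration of that idea only. It assumes `std::sync::Mutex` in place of the real `tokio::sync::Mutex`, and the names `WriteQueues`, `acquisition_order`, and `with_many` are hypothetical stand-ins, not the crate's API.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical key shape, mirroring the `(table_key, branch)` pairs in the diff.
type QueueKey = (String, Option<String>);

/// Hypothetical registry: one exclusive lock per key, created on first use.
#[derive(Default)]
struct WriteQueues {
    queues: Mutex<HashMap<QueueKey, Arc<Mutex<()>>>>,
}

impl WriteQueues {
    /// Look up (or lazily create) the lock for one key.
    fn queue_for(&self, key: &QueueKey) -> Arc<Mutex<()>> {
        let mut map = self.queues.lock().unwrap();
        Arc::clone(map.entry(key.clone()).or_default())
    }

    /// Sorted, deduplicated order in which locks are taken. Sorting gives
    /// every caller the same global order (no lock-order inversion);
    /// deduplication avoids locking the same non-re-entrant mutex twice.
    fn acquisition_order(keys: &[QueueKey]) -> Vec<QueueKey> {
        let mut ordered = keys.to_vec();
        ordered.sort();
        ordered.dedup();
        ordered
    }

    /// Run `f` while holding every requested queue; guards drop on return.
    fn with_many<R>(&self, keys: &[QueueKey], f: impl FnOnce() -> R) -> R {
        let ordered = Self::acquisition_order(keys);
        let locks: Vec<Arc<Mutex<()>>> =
            ordered.iter().map(|k| self.queue_for(k)).collect();
        let _guards: Vec<_> = locks.iter().map(|l| l.lock().unwrap()).collect();
        f()
    }
}
```

In this sketch a caller that touches `node:B` and `node:A` on the same branch always locks `node:A` first, regardless of the order it lists them in, which is the property the comments in this diff attribute to `acquire_many`. Like the real queues, it serializes in-process writers only.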
diff --git a/crates/omnigraph/src/exec/merge.rs b/crates/omnigraph/src/exec/merge.rs index ea16b15..5d0be74 100644 --- a/crates/omnigraph/src/exec/merge.rs +++ b/crates/omnigraph/src/exec/merge.rs @@ -1323,9 +1323,9 @@ impl Omnigraph { // branch_merge writes only to the target branch. // // Held across the per-table publish loop and the manifest - // commit + record_merge_commit calls below. Under PR 1b's - // intermediate state (global server RwLock still in place), - // this acquisition is uncontended. + // commit + record_merge_commit calls below, so no concurrent + // writer to a touched (table, target_branch) can interleave + // between our commit_staged and our publish. let active_branch_for_keys = self.active_branch().await; let merge_queue_keys: Vec<(String, Option)> = ordered_table_keys .iter() diff --git a/crates/omnigraph/src/exec/mutation.rs b/crates/omnigraph/src/exec/mutation.rs index e9051c4..9fcff45 100644 --- a/crates/omnigraph/src/exec/mutation.rs +++ b/crates/omnigraph/src/exec/mutation.rs @@ -741,14 +741,45 @@ impl Omnigraph { // tables. Branch is threaded explicitly — no coordinator swap. let mut staging = MutationStaging::default(); + // Lower + validate up front so the touched-table set is known before + // execution. A lowering/validation error returns exactly as it did + // when this happened inside execute_named_mutation. + let ir = self.lower_named_mutation(query_source, query_name)?; + + // Up-front fork-queue acquisition (see the loader for the full + // rationale): if this mutation will fork any touched table onto a + // non-main branch, acquire the per-(table, branch) write queues for + // every touched table before the first fork and hold them through the + // publish, so the orphan-fork reclaim can't race a concurrent + // in-process fork. The touched set is derived from the lowered IR. 
+ let fork_queue_guards: Option<( + Vec<(String, Option)>, + Vec>, + )> = if let Some(active) = requested.as_deref() { + let snapshot = self.snapshot_for_branch(Some(active)).await?; + let touched: Vec<(String, Option)> = self + .touched_table_keys(&ir) + .into_iter() + .map(|k| (k, Some(active.to_string()))) + .collect(); + let needs_fork = touched.iter().any(|(table_key, _)| { + snapshot + .entry(table_key) + .map(|e| e.table_branch.as_deref() != Some(active)) + .unwrap_or(false) + }); + if needs_fork { + let guards = self.write_queue().acquire_many(&touched).await; + Some((touched, guards)) + } else { + None + } + } else { + None + }; + let exec_result = self - .execute_named_mutation( - query_source, - query_name, - &resolved_params, - requested.as_deref(), - &mut staging, - ) + .execute_named_mutation(&ir, &resolved_params, requested.as_deref(), &mut staging) .await; match exec_result { @@ -768,6 +799,7 @@ impl Omnigraph { requested.as_deref(), crate::db::manifest::SidecarKind::Mutation, actor_id, + fork_queue_guards, ) .await?; // Failpoint that wedges the documented finalize→publisher @@ -817,14 +849,19 @@ impl Omnigraph { } } - async fn execute_named_mutation( + /// Lower + validate a named mutation query into its IR. + /// + /// Hoisted out of [`Self::execute_named_mutation`] so the caller can + /// inspect the IR before execution — specifically to compute the + /// touched-table set (see [`Self::touched_table_keys`]) for up-front + /// write-queue acquisition. Performs the same find → typecheck → lower + /// → D₂ checks that execution previously did inline, so error behavior + /// is unchanged. 
+ fn lower_named_mutation( &self, query_source: &str, query_name: &str, - params: &ParamMap, - branch: Option<&str>, - staging: &mut MutationStaging, - ) -> Result { + ) -> Result { let query_decl = omnigraph_compiler::find_named_query(query_source, query_name) .map_err(|e| OmniError::manifest(e.to_string()))?; @@ -841,7 +878,61 @@ impl Omnigraph { let ir = lower_mutation_query(&query_decl)?; // D₂: reject mixed insert/update + delete before any I/O. enforce_no_mixed_destructive_constructive(&ir)?; + Ok(ir) + } + /// The COMPLETE set of `(node|edge):{type}` table keys a mutation IR can + /// touch at execution time, keyed as `MutationStaging`/`commit_all` key + /// them. Must be a superset of everything execution forks/commits, since + /// it drives the up-front fork-queue acquisition and `commit_all`'s + /// held-guard coverage check — a miss means an unserialized fork/commit. + /// + /// The set is a pure function of (IR ops + catalog). For each op it mirrors + /// the execute path's node-vs-edge dispatch (`node_types` first, then + /// `edge_types`). A `delete ` additionally **cascades** to every edge + /// type whose endpoint is that node (see `execute_delete_node`), forking + /// those edge tables during execution — so they are included here, derived + /// the same way the executor derives them (`from_type`/`to_type` match). + /// Unknown types are skipped (the execute path surfaces the error). + /// Sorted + deduped for one-shot `acquire_many`. + fn touched_table_keys(&self, ir: &omnigraph_compiler::ir::MutationIR) -> Vec { + use omnigraph_compiler::ir::MutationOpIR; + let catalog = self.catalog(); + let mut keys: Vec = Vec::new(); + for op in &ir.ops { + let type_name = match op { + MutationOpIR::Insert { type_name, .. } + | MutationOpIR::Update { type_name, .. } + | MutationOpIR::Delete { type_name, .. 
} => type_name, + }; + if catalog.node_types.contains_key(type_name) { + keys.push(format!("node:{type_name}")); + // A node delete cascades to every edge touching this node type, + // forking those edge tables. Include them so the up-front + // acquisition covers the cascade (mirrors execute_delete_node). + if matches!(op, MutationOpIR::Delete { .. }) { + for (edge_name, edge_type) in &catalog.edge_types { + if edge_type.from_type == *type_name || edge_type.to_type == *type_name { + keys.push(format!("edge:{edge_name}")); + } + } + } + } else if catalog.edge_types.contains_key(type_name) { + keys.push(format!("edge:{type_name}")); + } + } + keys.sort(); + keys.dedup(); + keys + } + + async fn execute_named_mutation( + &self, + ir: &omnigraph_compiler::ir::MutationIR, + params: &ParamMap, + branch: Option<&str>, + staging: &mut MutationStaging, + ) -> Result { let mut total = MutationResult::default(); for op in &ir.ops { let result = match op { diff --git a/crates/omnigraph/src/exec/staging.rs b/crates/omnigraph/src/exec/staging.rs index cbfd52d..464ec34 100644 --- a/crates/omnigraph/src/exec/staging.rs +++ b/crates/omnigraph/src/exec/staging.rs @@ -463,12 +463,28 @@ impl StagedMutation { /// unreferenced (cleaned by `cleanup_old_versions`'s age sweep) /// rather than being committed and creating a Lance-HEAD-ahead /// residual. + /// `held_guards`: when the caller already holds the per-`(table_key, + /// branch)` write queues for every touched table (the fork path acquires + /// them up front, before the fork, and holds them through the manifest + /// publish), it passes `(acquired_keys, guards)` here so `commit_all` + /// reuses them instead of re-acquiring — the queue is a non-re-entrant + /// `tokio::Mutex`, so re-acquiring a held key would self-deadlock. + /// `None` (the steady-state path) means `commit_all` acquires them + /// itself. 
`acquired_keys` must cover every key `commit_all` would + /// acquire (debug-asserted below) — the guards from `acquire_many` don't + /// carry their keys, so the caller hands the key set alongside them. The + /// fork path guarantees coverage by keying every touched table uniformly + /// by the resolved target branch. pub(crate) async fn commit_all( self, db: &crate::db::Omnigraph, branch: Option<&str>, sidecar_kind: SidecarKind, actor_id: Option<&str>, + held_guards: Option<( + Vec<(String, Option)>, + Vec>, + )>, ) -> Result<( Vec, HashMap, @@ -483,21 +499,18 @@ impl StagedMutation { op_kinds, } = self; - // Acquire per-(table_key, branch) queues for every touched - // table — both staged and inline-committed. Sorted by - // `acquire_many` internally so all multi-table writers - // (mutation, branch_merge, schema_apply, future MR-870 - // recovery) agree on acquisition order — prevents lock-order - // inversion deadlock. + // Per-(table_key, branch) queues for every touched table — both + // staged and inline-committed. Sorted by `acquire_many` internally + // so all multi-table writers (mutation, branch_merge, schema_apply, + // the fork path, recovery) agree on acquisition order — prevents + // lock-order inversion deadlock. // - // For inline-committed tables (delete-only mutations), Lance - // HEAD has already advanced inside `delete_where` before - // `commit_all` runs. Holding the queue here doesn't prevent - // that interleaving (commit 6 will move queue acquisition into - // `delete_where`'s call site); it does prevent another writer - // from interleaving between our delete and our publish, which - // would otherwise leave a Lance-HEAD-ahead residual the - // delete-only sidecar (added below) would have to recover. + // For inline-committed tables (delete-only mutations), Lance HEAD + // has already advanced inside `delete_where` before `commit_all` + // runs. 
Holding the queue here prevents another writer from + // interleaving between our delete and our publish, which would + // otherwise leave a Lance-HEAD-ahead residual the delete-only + // sidecar (added below) would have to recover. let mut queue_keys: Vec<(String, Option)> = Vec::with_capacity(staged.len() + inline_committed.len()); for entry in &staged { @@ -512,7 +525,30 @@ impl StagedMutation { })?; queue_keys.push((table_key.clone(), path.table_branch.clone())); } - let guards = db.write_queue().acquire_many(&queue_keys).await; + // Reuse the caller's guards (fork path) when handed in, else acquire + // our own. When reusing, every key we would acquire MUST already be + // covered — re-acquiring a held non-re-entrant key would deadlock, and + // a key we'd need but DON'T hold would commit unserialized. This is a + // load-bearing safety invariant, so it is checked in ALL builds (not a + // debug_assert) and fails the write loudly+safely rather than silently + // proceeding unguarded if a future execution path ever touches a table + // outside the caller's pre-computed set. + let guards = match held_guards { + Some((acquired_keys, guards)) => { + let held: std::collections::HashSet<&(String, Option)> = + acquired_keys.iter().collect(); + if let Some(missing) = queue_keys.iter().find(|k| !held.contains(k)) { + return Err(OmniError::manifest_internal(format!( + "commit_all: pre-held write-queue guards do not cover touched table \ + '{}' on branch {:?} — the caller's up-front acquisition set diverged \ + from the staged/inline set (a touched-table-set bug)", + missing.0, missing.1 + ))); + } + guards + } + None => db.write_queue().acquire_many(&queue_keys).await, + }; // Re-capture manifest pins under the queue (PR 2 / MR-686). 
// diff --git a/crates/omnigraph/src/loader/mod.rs b/crates/omnigraph/src/loader/mod.rs index 69ada79..2365243 100644 --- a/crates/omnigraph/src/loader/mod.rs +++ b/crates/omnigraph/src/loader/mod.rs @@ -418,6 +418,45 @@ async fn load_jsonl_reader( LoadMode::Overwrite => crate::db::MutationOpKind::SchemaRewrite, }; + // Up-front fork-queue acquisition. The first write to a table on a + // non-main branch forks it (create_branch), which advances Lance state + // before the manifest publish; the reclaim of any manifest-unreferenced + // leftover (`reclaim_orphaned_fork_and_refork`) must not race a concurrent + // in-process fork. So when this load will fork at least one touched table, + // acquire the per-(table, branch) write queues for ALL touched tables up + // front (one sorted `acquire_many`, keyed uniformly by the target branch + // so it covers what `commit_all` recomputes) and hold them through the + // publish. Main-branch loads never fork; branch loads where every touched + // table is already forked skip this and let `commit_all` acquire at commit. + let fork_queue_guards: Option<( + Vec<(String, Option)>, + Vec>, + )> = if let Some(active) = branch { + let touched: Vec<(String, Option)> = node_rows + .keys() + .map(|t| (format!("node:{t}"), Some(active.to_string()))) + .chain( + edge_rows + .keys() + .map(|e| (format!("edge:{e}"), Some(active.to_string()))), + ) + .collect(); + let needs_fork = touched.iter().any(|(table_key, _)| { + snapshot + .entry(table_key) + .map(|e| e.table_branch.as_deref() != Some(active)) + .unwrap_or(false) + }); + if needs_fork { + let guards = db.write_queue().acquire_many(&touched).await; + Some((touched, guards)) + } else { + None + } + } else { + None + }; + // Phase 2a: build and validate every node batch up front. Cheap and // synchronous — surfaces validation errors before any S3 traffic. 
let mut prepared_nodes: Vec<(String, String, RecordBatch, usize)> = @@ -551,7 +590,13 @@ async fn load_jsonl_reader( // across the manifest publish below — see exec/mutation.rs for // the rationale (interleaving prevention). let (updates, expected_versions, sidecar_handle, _queue_guards) = staged - .commit_all(db, branch, crate::db::manifest::SidecarKind::Load, actor_id) + .commit_all( + db, + branch, + crate::db::manifest::SidecarKind::Load, + actor_id, + fork_queue_guards, + ) .await?; // Same finalize → publisher residual as mutations: per-table // staged commits have advanced Lance HEAD, but the manifest diff --git a/crates/omnigraph/src/storage_layer.rs b/crates/omnigraph/src/storage_layer.rs index d2f6b01..7c7685d 100644 --- a/crates/omnigraph/src/storage_layer.rs +++ b/crates/omnigraph/src/storage_layer.rs @@ -184,6 +184,26 @@ pub(crate) fn staged_handles_as_writes(handles: &[StagedHandle]) -> Vec { + Created(D), + RefAlreadyExists, +} + // ─── TableStorage trait ──────────────────────────────────────────────────── /// Engine-internal trait covering every Lance dataset operation an @@ -231,7 +251,7 @@ pub trait TableStorage: sealed::Sealed + Send + Sync + Debug { table_key: &str, source_version: u64, target_branch: &str, - ) -> Result; + ) -> Result>; async fn delete_branch(&self, dataset_uri: &str, branch: &str) -> Result<()>; @@ -497,17 +517,22 @@ impl TableStorage for TableStore { table_key: &str, source_version: u64, target_branch: &str, - ) -> Result { - TableStore::fork_branch_from_state( - self, - dataset_uri, - source_branch, - table_key, - source_version, - target_branch, + ) -> Result> { + Ok( + match TableStore::fork_branch_from_state( + self, + dataset_uri, + source_branch, + table_key, + source_version, + target_branch, + ) + .await? 
+ { + ForkOutcome::Created(ds) => ForkOutcome::Created(SnapshotHandle::new(ds)), + ForkOutcome::RefAlreadyExists => ForkOutcome::RefAlreadyExists, + }, ) - .await - .map(SnapshotHandle::new) } async fn delete_branch(&self, dataset_uri: &str, branch: &str) -> Result<()> { diff --git a/crates/omnigraph/src/table_store.rs b/crates/omnigraph/src/table_store.rs index b458aec..5c99b01 100644 --- a/crates/omnigraph/src/table_store.rs +++ b/crates/omnigraph/src/table_store.rs @@ -26,6 +26,7 @@ use std::sync::Arc; use crate::db::manifest::{TableVersionMetadata, open_table_head_for_write}; use crate::db::{Snapshot, SubTableEntry}; use crate::error::{OmniError, Result}; +use crate::storage_layer::ForkOutcome; #[derive(Debug, Clone, PartialEq, Eq)] pub struct TableState { @@ -285,7 +286,7 @@ impl TableStore { table_key: &str, source_version: u64, target_branch: &str, - ) -> Result { + ) -> Result> { let mut source_ds = self .open_dataset_head(dataset_uri, source_branch) .await? @@ -294,31 +295,49 @@ impl TableStore { .map_err(|e| OmniError::Lance(e.to_string()))?; self.ensure_expected_version(&source_ds, table_key, source_version)?; - if source_ds + if let Err(create_err) = source_ds .create_branch(target_branch, source_version, None) .await - .is_err() { - // The target branch ref already exists. The caller - // (`open_owned_dataset_for_branch_write`) re-reads the live manifest - // before forking and returns a retryable error when a concurrent - // writer legitimately holds the fork, so reaching here means the - // manifest does NOT reference this fork: it is an orphan from an - // incomplete prior `branch_delete`. Surface the actionable cleanup - // error rather than guessing from Lance branch versions. 
- return Err(OmniError::manifest_conflict(format!( - "branch '{}' has orphaned table state for '{}' from an incomplete \ - prior delete; run `omnigraph cleanup` to reclaim it before reusing \ - this branch name", - target_branch, table_key - ))); + // Disambiguate the failure: only a genuinely pre-existing ref is a + // reclaim candidate. Mapping EVERY create_branch failure to + // `RefAlreadyExists` would route a transient I/O / version / Lance + // internal error into the destructive reclaim path. So check whether + // the ref actually exists; if not, the failure is real — propagate + // it (preserving error fidelity) rather than force-deleting. + // + // `list_branches` reads `_refs/branches/` from the store, so it sees + // a fully-formed manifest-unreferenced fork (our common case — a + // create_branch that completed but whose manifest publish did not). + // It does NOT see a phase-1-only Lance "zombie" (tree dir written, + // no BranchContents) — but neither does `cleanup`'s reconciler, also + // list_branches-based. A zombie only forms if create_branch is + // interrupted *between its two internal phases* (a far narrower + // window than the manifest-publish gap), and it surfaces here as the + // propagated create error requiring manual reclaim. We deliberately + // do NOT force-delete on a not-found-ref failure: it is + // indistinguishable from a transient error on a fresh create, and + // force-deleting there is the destructive overreach this guard + // removes. The caller holds the per-(table, branch) write queue, so + // no in-process writer races this fork; a cross-process create + // between our check and now is the documented one-winner-CAS gap and + // propagates as a retryable error. 
+ let ref_exists = source_ds + .list_branches() + .await + .map(|b| b.contains_key(target_branch)) + .unwrap_or(false); + if ref_exists { + return Ok(ForkOutcome::RefAlreadyExists); + } + return Err(OmniError::Lance(create_err.to_string())); } let ds = self .open_dataset_head(dataset_uri, Some(target_branch)) .await?; self.ensure_expected_version(&ds, table_key, source_version)?; - Ok(ds) + Ok(ForkOutcome::Created(ds)) } pub async fn scan_batches(&self, ds: &Dataset) -> Result> { diff --git a/crates/omnigraph/tests/failpoints.rs b/crates/omnigraph/tests/failpoints.rs index b45cfa0..38a60ae 100644 --- a/crates/omnigraph/tests/failpoints.rs +++ b/crates/omnigraph/tests/failpoints.rs @@ -5,7 +5,9 @@ mod helpers; use fail::FailScenario; use futures::FutureExt; use omnigraph::db::Omnigraph; +use omnigraph::error::{ManifestErrorKind, OmniError}; use omnigraph::failpoints::ScopedFailPoint; +use omnigraph::loader::LoadMode; use helpers::recovery::{ FollowUpMutation, RecoveryExpectation, TableExpectation, assert_post_recovery_invariants, @@ -127,12 +129,12 @@ async fn branch_delete_partial_failure_converges_via_cleanup() { } // Reusing a branch name whose delete left an orphaned fork (before `cleanup` -// reconciles it) must fail with a clear, actionable error pointing at -// `cleanup`, not the opaque `ExpectedVersionMismatch` that leaks from the fork -// path. The recreate itself succeeds; the first write to the previously-forked -// table is where the stale orphan collides. +// reconciles it) must SELF-HEAL on the next write — the write reclaims the +// manifest-unreferenced fork and re-forks, rather than wedging with "incomplete +// prior delete; run cleanup". (This test was the inverse before the fork-as- +// idempotent-reconcile fix; its flip is the signal the bug class is closed.) 
#[tokio::test] -async fn recreate_over_orphaned_fork_before_cleanup_is_actionable() { +async fn recreate_over_orphaned_fork_self_heals_without_cleanup() { let _scenario = FailScenario::setup(); let dir = tempfile::tempdir().unwrap(); let uri = dir.path().to_str().unwrap().to_string(); @@ -158,10 +160,10 @@ async fn recreate_over_orphaned_fork_before_cleanup_is_actionable() { } // Recreate the name and write to the previously-forked table WITHOUT a - // cleanup in between. + // cleanup in between. The write must self-heal the stale orphan fork. main.branch_create("feature").await.unwrap(); let mut feature2 = Omnigraph::open(&uri).await.unwrap(); - let err = helpers::mutate_branch( + helpers::mutate_branch( &mut feature2, "feature", MUTATION_QUERIES, @@ -169,20 +171,83 @@ async fn recreate_over_orphaned_fork_before_cleanup_is_actionable() { &mixed_params(&[("$name", "Frank")], &[("$age", 41)]), ) .await - .expect_err("write should collide with the stale orphaned fork"); + .expect("recreate-over-orphan write must self-heal, not require cleanup"); - let msg = err.to_string(); - assert!( - msg.contains("cleanup") - && (msg.contains("orphan") || msg.contains("incomplete prior delete")), - "expected an actionable orphaned-fork error pointing at cleanup, got: {msg}" - ); - assert!( - !msg.contains("expected manifest table version"), - "should not surface the opaque ExpectedVersionMismatch, got: {msg}" + // The recreated branch forks FRESH from main: the deleted branch's Eve is + // gone and only the new Frank is added on top of main's seed. A count of + // main + 2 would mean Eve resurrected from the stale fork (the bug). 
+ let main_people = helpers::count_rows(&main, "node:Person").await; + let feature_people = helpers::count_rows_branch(&feature2, "feature", "node:Person").await; + assert_eq!( + feature_people, + main_people + 1, + "self-healed feature must fork fresh from main (+Frank only); \ + main={main_people}, feature={feature_people} (main+2 ⇒ Eve resurrected)" ); } +// The write-path orphan reclaim shares the same fresh-authority classifier as +// cleanup. If that classifier is Indeterminate (transient read on a live +// branch), the write must return a clear retryable authority-read conflict and +// leave the ref in place. It must not squeeze the ambiguity through +// ExpectedVersionMismatch with expected == actual, which lies about the cause. +#[tokio::test] +async fn recreate_over_orphaned_fork_reports_indeterminate_authority_read() { + let _scenario = FailScenario::setup(); + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap().to_string(); + let db = helpers::init_and_load(&dir).await; + db.branch_create("feature").await.unwrap(); + + let person_uri = node_table_uri(&uri, "Person"); + { + let mut ds = lance::Dataset::open(&person_uri).await.unwrap(); + let base = ds.version().version; + ds.create_branch("feature", base, None).await.unwrap(); + } + + let row = r#"{"type":"Person","data":{"name":"Grace","age":37}}"#; + { + let _fp = ScopedFailPoint::new("classify.fresh_read", "return"); + let err = db + .load_as("feature", None, row, LoadMode::Merge, None) + .await + .expect_err("indeterminate authority read must fail retryably"); + + match &err { + OmniError::Manifest(manifest) => { + assert_eq!(manifest.kind, ManifestErrorKind::Conflict); + assert!( + manifest.details.is_none(), + "indeterminate authority read is not an expected-version mismatch: {manifest:?}" + ); + } + other => panic!("expected manifest conflict, got {other:?}"), + } + let message = err.to_string(); + assert!( + message.contains("could not verify") + && 
message.contains("fresh manifest authority was unavailable") + && message.contains("refresh and retry"), + "error should name the unavailable authority read, got: {message}" + ); + assert!( + !message.contains("expected manifest table version"), + "indeterminate authority must not be reported as a version mismatch: {message}" + ); + + let ds = lance::Dataset::open(&person_uri).await.unwrap(); + assert!( + ds.list_branches().await.unwrap().contains_key("feature"), + "ambiguous orphan status must leave the fork for a later retry" + ); + } + + db.load_as("feature", None, row, LoadMode::Merge, None) + .await + .expect("when fresh authority is available, the orphan is reclaimed and write converges"); +} + // cleanup is the guaranteed convergence backstop, so one table's transient // failure must not abort the whole sweep. Inject a one-shot version-GC failure // for a single table and assert: cleanup still succeeds, the failure is @@ -330,6 +395,68 @@ async fn cleanup_reclaims_orphaned_commit_graph_branch() { } } +// `classify_fork_ref` returns `Indeterminate` when the fresh-authority read +// fails on a LIVE branch — and a destructive caller must SKIP, never delete, on +// that ambiguity. Here the reconciler has a genuine origin-2 orphan candidate +// (a manifest-unreferenced Person fork on the live `feature` branch), but the +// `classify.fresh_read` failpoint makes the fresh re-check fail: cleanup must +// leave the ref in place (cannot confirm it is unreferenced), then reclaim it on +// the next run once the read succeeds. This pins the Indeterminate arm and the +// don't-destroy-on-ambiguity rule end-to-end through cleanup. 
+#[tokio::test] +async fn reconcile_skips_fork_when_fresh_recheck_is_unavailable_then_converges() { + let _scenario = FailScenario::setup(); + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap().to_string(); + let mut db = helpers::init_and_load(&dir).await; + db.branch_create("feature").await.unwrap(); + + // Forge a manifest-unreferenced Person fork on the live `feature` branch — + // a genuine orphan the reconciler would normally reclaim. + let person_uri = node_table_uri(&uri, "Person"); + { + let mut ds = lance::Dataset::open(&person_uri).await.unwrap(); + let base = ds.version().version; + ds.create_branch("feature", base, None).await.unwrap(); + assert!( + ds.list_branches().await.unwrap().contains_key("feature"), + "precondition: forged orphan fork present" + ); + } + + // With the fresh re-check failing, the fork's status is Indeterminate (the + // branch is live but unreadable) → cleanup must SKIP it, not delete. + { + let _fp = ScopedFailPoint::new("classify.fresh_read", "return"); + db.cleanup(omnigraph::db::CleanupPolicyOptions { + keep_versions: Some(1), + older_than: None, + }) + .await + .unwrap(); + let ds = lance::Dataset::open(&person_uri).await.unwrap(); + assert!( + ds.list_branches().await.unwrap().contains_key("feature"), + "reconcile must NOT delete a fork whose fresh re-check is inconclusive" + ); + } + + // Read succeeds now → cleanup confirms the orphan and reclaims it (converges). + db.cleanup(omnigraph::db::CleanupPolicyOptions { + keep_versions: Some(1), + older_than: None, + }) + .await + .unwrap(); + { + let ds = lance::Dataset::open(&person_uri).await.unwrap(); + assert!( + !ds.list_branches().await.unwrap().contains_key("feature"), + "next cleanup (fresh read available) must reclaim the confirmed orphan" + ); + } +} + // A branch_delete whose best-effort commit-graph reclaim fails leaves a // commit-graph "zombie" branch. 
Recreating that name must heal the zombie and // succeed (branch_create force-deletes a stale commit-graph ref since the @@ -2619,69 +2746,66 @@ async fn finalize_publisher_residual_does_not_drift_untouched_tables() { } /// Acceptance test: a stage-step failure in the staged-index path -/// (`stage_create_btree_index` succeeded; `commit_staged` not yet -/// called) leaves NO Lance-HEAD drift on the existing tables. -/// Subsequent operations against those tables succeed without -/// `ExpectedVersionMismatch`. +/// (`stage_create_btree_index` succeeded; `commit_staged` not yet called) +/// leaves NO Lance-HEAD drift, so other tables stay writable. /// -/// Path: `apply_schema(v1 → v2)` adds a new node type. The -/// `added_tables` loop in `schema_apply` creates the empty dataset and -/// then calls `build_indices_on_dataset_for_catalog` → -/// `stage_and_commit_btree(..., &["id"])`. The failpoint fires -/// between `stage_create_btree_index` and `commit_staged`, so the -/// staged segments are written under `_indices//` but Lance HEAD -/// on the new dataset is unchanged at v=1. The schema-apply lock -/// branch is released by `apply_schema`'s outer match. Existing -/// tables (e.g. `node:Person`) are completely untouched by the new -/// node's added_tables iteration — they're outside the failed apply -/// path entirely — and we assert that mutations against them continue -/// to work. -/// -/// The orphan empty dataset from the failed apply is acceptable -/// residual: it's unreferenced by `__manifest` and will be reclaimed -/// by `cleanup_old_versions` (or removed when a future apply at the -/// same target path resolves the rename). +/// Under iss-848 schema apply no longer builds indexes inline — the build +/// happens in the reconciler (`ensure_indices`/`optimize`) and at load. So this +/// fires the failpoint where it lives now: an `ensure_indices` build of a BTREE +/// that a prior apply declared (`@index`) but deferred. 
The failpoint fires +/// between `stage_create_btree_index` and `commit_staged`, so the staged +/// segment is written under `_indices//` but `node:Person`'s Lance HEAD is +/// unchanged. `ensure_indices` fails and its EnsureIndices sidecar pins only +/// Person at NoMovement (a clean no-op on the next open). A write to a +/// different, unpinned table (`node:Company`) is unaffected: mutations/loads run +/// a roll-forward-only heal and proceed — they do not refuse on a pending +/// sidecar the way `optimize`/`repair` do — so the write succeeds with no drift. #[tokio::test] async fn ensure_indices_stage_btree_failure_leaves_existing_tables_writable() { let _scenario = FailScenario::setup(); let dir = tempfile::tempdir().unwrap(); let uri = dir.path().to_str().unwrap().to_string(); - - // Init with TEST_SCHEMA which declares Person + Knows. Indices on - // those tables get built during init. let mut db = Omnigraph::init(&uri, helpers::TEST_SCHEMA).await.unwrap(); - // Apply a schema that adds a new node type. The added_tables loop - // will hit the failpoint between stage and commit on the new - // node:Project table's btree-on-id build. (TEST_SCHEMA already - // has Person + Company + Knows + WorksAt — pick a name that isn't - // already declared.) - let extended_schema = format!( - "{}\nnode Project {{ name: String @key }}\n", - helpers::TEST_SCHEMA - ); - - { - let _failpoint = - ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return"); - let err = db.apply_schema(&extended_schema).await.unwrap_err(); - assert!( - err.to_string() - .contains("ensure_indices.post_stage_pre_commit_btree"), - "schema apply should fail with the synthetic failpoint error, got: {err}" - ); - } - - // Existing tables stayed at their pre-apply versions; subsequent - // mutations against them succeed (no Lance-HEAD drift). + // Seed a Person row — the load builds Person's id BTREE + name FTS. 
mutate_main( &mut db, helpers::MUTATION_QUERIES, "insert_person", - &mixed_params(&[("$name", "Eve")], &[("$age", 22)]), + &mixed_params(&[("$name", "Alice")], &[("$age", 30)]), ) .await - .expect("Person mutation must succeed after the failed schema apply — existing tables are not drifted"); + .expect("seed Person"); + + // Add `@index` on `age`: schema apply records the intent but defers the + // physical build (iss-848), so the BTREE on `age` is unbuilt. + let indexed_schema = helpers::TEST_SCHEMA.replace("age: I32?", "age: I32? @index"); + db.apply_schema(&indexed_schema) + .await + .expect("adding an @index is metadata-only and succeeds"); + + { + // ensure_indices builds the deferred `age` BTREE on Person; the failpoint + // fires between stage and commit, so Person's Lance HEAD does not move. + let _failpoint = + ScopedFailPoint::new("ensure_indices.post_stage_pre_commit_btree", "return"); + let err = db.ensure_indices().await.unwrap_err(); + assert!( + err.to_string() + .contains("ensure_indices.post_stage_pre_commit_btree"), + "ensure_indices should fail with the synthetic failpoint error, got: {err}" + ); + } + + // A different, unpinned table is untouched by the failed index build. + use omnigraph::loader::{LoadMode, load_jsonl}; + load_jsonl( + &mut db, + r#"{"type": "Company", "data": {"name": "Acme"}}"#, + LoadMode::Append, + ) + .await + .expect("Company write on a table untouched by the failed ensure_indices should succeed"); } fn assert_no_staging_files(graph: &std::path::Path) { diff --git a/crates/omnigraph/tests/helpers/mod.rs b/crates/omnigraph/tests/helpers/mod.rs index 295cab7..6476e1a 100644 --- a/crates/omnigraph/tests/helpers/mod.rs +++ b/crates/omnigraph/tests/helpers/mod.rs @@ -54,6 +54,19 @@ pub async fn init_and_load(dir: &tempfile::TempDir) -> Omnigraph { db } +/// On-disk Lance dataset URI for a node type, mirroring the engine's +/// `nodes/{fnv1a(type)}` layout. 
Used by tests that reach the raw Lance +/// dataset to forge or inspect branch state. (Local copies exist in +/// `failpoints.rs` / `maintenance.rs`; this is the shared one for new tests.) +pub fn node_table_uri(root: &str, type_name: &str) -> String { + let mut hash: u64 = 0xcbf2_9ce4_8422_2325; + for &b in type_name.as_bytes() { + hash ^= b as u64; + hash = hash.wrapping_mul(0x100_0000_01b3); + } + format!("{}/nodes/{hash:016x}", root.trim_end_matches('/')) +} + /// Read all rows from a sub-table by table_key. pub async fn read_table(db: &Omnigraph, table_key: &str) -> Vec { let snap = snapshot_main(db).await.unwrap(); diff --git a/crates/omnigraph/tests/maintenance.rs b/crates/omnigraph/tests/maintenance.rs index deb4d2d..78e31fa 100644 --- a/crates/omnigraph/tests/maintenance.rs +++ b/crates/omnigraph/tests/maintenance.rs @@ -843,3 +843,222 @@ async fn cleanup_reconciles_orphaned_branch_forks() { .await .unwrap(); } + +// cleanup must reclaim a manifest-unreferenced fork even when the BRANCH is +// still live (origin 2: an interrupted first-write fork), while KEEPING a table +// that is legitimately forked on that same live branch. Before the per-table +// authority broadening, the reconciler keyed only on the branch name and so +// never reclaimed a fork on a live branch — the wedge the handoff hit. +#[tokio::test] +async fn cleanup_reconciles_live_branch_orphan_fork_but_keeps_legitimate_fork() { + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap().to_string(); + let mut db = init_and_load(&dir).await; + + db.branch_create("feature").await.unwrap(); + + // Legitimately fork Company onto the live `feature` branch (a real write). 
+ db.load_as( + "feature", + None, + r#"{"type":"Company","data":{"name":"Acme"}}"#, + LoadMode::Merge, + None, + ) + .await + .unwrap(); + + // Forge a manifest-unreferenced Person fork on the SAME live branch: the + // manifest's `feature` snapshot still places Person on main (Person was + // never written on feature), so this ref is an origin-2 orphan. + let person_uri = node_table_uri(&uri, "Person"); + { + let mut ds = Dataset::open(&person_uri).await.unwrap(); + let base = ds.version().version; + ds.create_branch("feature", base, None).await.unwrap(); + assert!( + ds.list_branches().await.unwrap().contains_key("feature"), + "precondition: forged orphan Person fork present on the live branch" + ); + } + + let company_uri = node_table_uri(&uri, "Company"); + let main_people = count_rows(&db, "node:Person").await; + let main_companies = count_rows(&db, "node:Company").await; + + db.cleanup(CleanupPolicyOptions { + keep_versions: Some(1), + older_than: None, + }) + .await + .unwrap(); + + // Origin-2 orphan reclaimed... + { + let ds = Dataset::open(&person_uri).await.unwrap(); + assert!( + !ds.list_branches().await.unwrap().contains_key("feature"), + "cleanup must reclaim the manifest-unreferenced Person fork on the live branch" + ); + } + // ...but the legitimate Company fork on the same live branch is kept. + { + let ds = Dataset::open(&company_uri).await.unwrap(); + assert!( + ds.list_branches().await.unwrap().contains_key("feature"), + "cleanup must NOT reclaim a legitimately-forked table on a live branch" + ); + } + // main is untouched. + assert_eq!(count_rows(&db, "node:Person").await, main_people); + assert_eq!(count_rows(&db, "node:Company").await, main_companies); +} + +// Regression (iss-848): a table with rows but NULL vectors (the load-before- +// embed window) must not abort index building. The vector (IVF) index cannot +// train on 0 vectors, so `create_vector_index` errors with "KMeans cannot +// train 1 centroids with 0 vectors". 
`build_indices_on_dataset_for_catalog` +// is the chokepoint every caller funnels through (load/mutate via +// prepare_updates_for_commit, ensure_indices, optimize, schema apply, merge), +// so per-index fault isolation there must defer that one column (pending) and +// still build the sibling scalar indexes, instead of propagating the error. +// This exercises both the load path (which builds indices inline) and the +// ensure_indices reconciler. Pre-fix this fails at the load step. +#[tokio::test] +async fn index_build_tolerates_null_vector_rows() { + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap(); + let schema = "node Doc {\n \ + slug: String @key\n \ + n: I64 @index\n \ + embedding: Vector(8)? @index\n\ + }\n"; + let mut db = Omnigraph::init(uri, schema).await.unwrap(); + // Rows present, embeddings null (loaded but not yet embedded). + load_jsonl( + &mut db, + "{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"n\":1}}\n\ + {\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"n\":2}}", + LoadMode::Merge, + ) + .await + .expect("load rows with null embeddings"); + + // Must not abort: the untrainable vector column is deferred, the sibling + // BTREE on `n` still builds. + db.ensure_indices() + .await + .expect("ensure_indices must not abort when a vector column has no trainable vectors yet"); +} + +// iss-848: `optimize` converges declared-but-unbuilt indexes. After an @index is +// added post-data (a metadata-only apply that defers the physical build), the +// column is unindexed and reads scan. `optimize` — the operator's reconciler, +// run on a cron — must materialize it, by composing the ensure_indices +// reconciler after the compaction sweep. Pre-iss-848 optimize only maintained +// coverage of EXISTING indexes (optimize_indices) and never created missing ones. 
+#[tokio::test] +async fn optimize_materializes_index_declared_but_unbuilt() { + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap(); + let v1 = "node Doc {\n slug: String @key\n rank: I32\n}\n"; + let mut db = Omnigraph::init(uri, v1).await.unwrap(); + load_jsonl( + &mut db, + "{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"rank\":1}}\n\ + {\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"rank\":2}}", + LoadMode::Merge, + ) + .await + .unwrap(); + + // Add @index on `rank` after data exists: a metadata-only apply that defers + // the physical build (iss-848), so the column is declared-indexed but unbuilt. + let v2 = "node Doc {\n slug: String @key\n rank: I32 @index\n}\n"; + db.apply_schema(v2).await.expect("index-only apply"); + + // Precondition: `rank` is declared @index but unbuilt -> reads degrade. + { + let snap = snapshot_main(&db).await.unwrap(); + let ds = snap.open("node:Doc").await.unwrap(); + assert!( + matches!( + TableStore::key_column_index_coverage(&ds, "rank") + .await + .unwrap(), + IndexCoverage::Degraded { .. } + ), + "rank must be unindexed after the deferred apply" + ); + } + + db.optimize().await.unwrap(); + + // Postcondition: optimize's reconciler materialized the declared index. + let snap = snapshot_main(&db).await.unwrap(); + let ds = snap.open("node:Doc").await.unwrap(); + assert_eq!( + TableStore::key_column_index_coverage(&ds, "rank") + .await + .unwrap(), + IndexCoverage::Indexed, + "optimize must build the declared-but-unbuilt rank index" + ); +} + +// iss-848 (PR review): the rename path also defers index building. A RenameType +// migration writes the renamed table as a new dataset with the existing rows +// but no indexes (its inline build was removed). optimize must then materialize +// the declared index on the renamed table. 
+#[tokio::test] +async fn optimize_materializes_index_after_type_rename() { + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap(); + let v1 = "node Doc {\n slug: String @key\n rank: I32 @index\n}\n"; + let mut db = Omnigraph::init(uri, v1).await.unwrap(); + load_jsonl( + &mut db, + "{\"type\":\"Doc\",\"data\":{\"slug\":\"d1\",\"rank\":1}}\n\ + {\"type\":\"Doc\",\"data\":{\"slug\":\"d2\",\"rank\":2}}", + LoadMode::Merge, + ) + .await + .unwrap(); + + // Rename Doc -> Item; rows are preserved on the new table key. + let v2 = "node Item @rename_from(\"Doc\") {\n slug: String @key\n rank: I32 @index\n}\n"; + let result = db.apply_schema(v2).await.expect("rename apply"); + assert!(result.applied); + assert_eq!( + count_rows(&db, "node:Item").await, + 2, + "rename must preserve rows" + ); + + // Post-rename the renamed table's declared rank index is unbuilt (deferred). + { + let snap = snapshot_main(&db).await.unwrap(); + let ds = snap.open("node:Item").await.unwrap(); + assert!( + matches!( + TableStore::key_column_index_coverage(&ds, "rank") + .await + .unwrap(), + IndexCoverage::Degraded { .. } + ), + "rank must be unindexed immediately after the rename" + ); + } + + db.optimize().await.unwrap(); + + let snap = snapshot_main(&db).await.unwrap(); + let ds = snap.open("node:Item").await.unwrap(); + assert_eq!( + TableStore::key_column_index_coverage(&ds, "rank") + .await + .unwrap(), + IndexCoverage::Indexed, + "optimize must build the renamed table's deferred rank index" + ); +} diff --git a/crates/omnigraph/tests/schema_apply.rs b/crates/omnigraph/tests/schema_apply.rs index cc0cae2..508451a 100644 --- a/crates/omnigraph/tests/schema_apply.rs +++ b/crates/omnigraph/tests/schema_apply.rs @@ -736,3 +736,108 @@ edge Knows: Person -> Person { // current contract, the data is *unreachable* via omnigraph // (no manifest entry), which is the user-facing guarantee. 
} + +// Regression (bug 3 / dev-graph iss-848): a `Vector @index` on a 0-row table +// must not abort an otherwise-valid schema apply. A vector (IVF) index trains +// k-means centroids over the column's vectors, so Lance cannot build it on 0 +// vectors — it errors with "Creating empty vector indices with train=False is +// not yet implemented". When a *later* migration touches that table (here, an +// unrelated scalar `@index` on `body`), schema apply reconciles the table's +// whole index set, which previously tried to materialize the dormant vector +// index and aborted the entire migration (all-or-nothing). The build is now +// deferred (pending) when the column is untrainable, instead of failing the +// migration. The dormant index is materialized by a later `ensure_indices` / +// `optimize` once the table has rows. Full decoupling — intent recorded at +// apply, an async reconciler converges physical coverage — is iss-848. +#[tokio::test] +async fn apply_schema_defers_vector_index_on_empty_table() { + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap(); + + // init does not build indices, so the declared-but-unbuilt vector index + // sits harmless on the empty table (this is how it survived earlier + // applies that never touched the table). + // `slug` is the user @key; omnigraph injects its own internal `id` column, + // so the key field must not be named `id`. + let v1 = "node Doc {\n \ + slug: String @key\n \ + body: String?\n \ + embedding: Vector(8) @index\n\ + }\n"; + let mut db = Omnigraph::init(uri, v1).await.unwrap(); + + // Add an *unrelated* scalar @index on `body`. This routes Doc through + // schema apply's index reconcile, which must NOT abort on the untrainable + // empty vector index. + let v2 = "node Doc {\n \ + slug: String @key\n \ + body: String? 
@index\n \ + embedding: Vector(8) @index\n\ + }\n"; + let result = db.apply_schema(v2).await.expect( + "schema apply must succeed: an empty-table vector @index is deferred, not fatal", + ); + assert!(result.applied, "the scalar @index change must apply"); + + // The deferred vector index is not dropped — once the table has a + // trainable vector, `ensure_indices` materializes it without error. (If + // the guard wrongly skipped a non-empty column, this would still be + // unindexed; if it wrongly tried to build on empty, the apply above would + // have failed.) + load_jsonl( + &mut db, + r#"{"type":"Doc","data":{"slug":"d1","body":"hello","embedding":[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8]}}"#, + LoadMode::Merge, + ) + .await + .expect("loading a Doc with an embedding must succeed"); + db.ensure_indices() + .await + .expect("the deferred vector index must build once the table has a trainable vector"); +} + +// iss-848: adding an `@index` to an existing column is a pure metadata change. +// Schema apply records the intent (the catalog/IR now declares the index) but +// must NOT build the index inline, so the table's data and manifest version are +// untouched. The physical index is materialized later by ensure_indices / +// optimize. Pre-iss-848 the indexed_tables block built the index inline and +// bumped the table version. +#[tokio::test] +async fn index_only_constraint_apply_touches_no_table_data() { + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap(); + let v1 = "node Doc {\n slug: String @key\n n: I64\n}\n"; + let mut db = Omnigraph::init(uri, v1).await.unwrap(); + load_jsonl( + &mut db, + r#"{"type":"Doc","data":{"slug":"d1","n":1}}"#, + LoadMode::Merge, + ) + .await + .expect("load a Doc"); + + let before = db + .snapshot_of(ReadTarget::branch("main")) + .await + .unwrap() + .entry("node:Doc") + .unwrap() + .table_version; + + // Add an @index on the existing `n` column. 
+ let v2 = "node Doc {\n slug: String @key\n n: I64 @index\n}\n"; + let result = db.apply_schema(v2).await.expect("index-only apply must succeed"); + assert!(result.applied, "the @index addition must apply"); + + let after = db + .snapshot_of(ReadTarget::branch("main")) + .await + .unwrap() + .entry("node:Doc") + .unwrap() + .table_version; + assert_eq!( + before, after, + "adding an @index must not bump the table version (no inline index build)" + ); +} diff --git a/crates/omnigraph/tests/writes.rs b/crates/omnigraph/tests/writes.rs index b006f4c..8120940 100644 --- a/crates/omnigraph/tests/writes.rs +++ b/crates/omnigraph/tests/writes.rs @@ -1540,3 +1540,109 @@ async fn second_sequential_update_on_same_row_succeeds() { "Alice's age must reflect the second update" ); } + +// An interrupted first-write fork (create_branch succeeded, the manifest +// publish did not) leaves a fully-formed Lance branch ref on the table that +// the manifest never references — a "manifest-unreferenced fork". The branch +// itself stays a valid manifest branch, so `cleanup`'s reconciler (keyed on +// the manifest branch list) never reclaims it. Before this fix, the next write +// to that table on that branch re-entered the fork path, `create_branch` +// collided, and the engine wedged with "incomplete prior delete; run +// `omnigraph cleanup`". +// +// We forge that exact residue (a live `feature` branch + a directly-created +// `feature` ref on the Person table the manifest doesn't reference) and assert +// the next write — via both `load` and `mutate` — self-heals by reclaiming the +// orphan fork and re-forking, rather than wedging. No process death / timing +// needed: the forge is the post-crash state. 
+#[tokio::test] +async fn first_write_self_heals_manifest_unreferenced_fork_on_live_branch() { + let dir = tempfile::tempdir().unwrap(); + let uri = dir.path().to_str().unwrap().to_string(); + let mut db = init_and_load(&dir).await; + db.branch_create("feature").await.unwrap(); + + // Forge the manifest-unreferenced fork directly at the Lance layer. + let person_uri = node_table_uri(&uri, "Person"); + { + let mut ds = lance::Dataset::open(&person_uri).await.unwrap(); + let base = ds.version().version; + ds.create_branch("feature", base, None).await.unwrap(); + assert!( + ds.list_branches().await.unwrap().contains_key("feature"), + "precondition: forged orphan fork present on Person" + ); + } + + // load → must self-heal, not wedge with "incomplete prior delete". + let row = r#"{"type":"Person","data":{"name":"Zoe","age":30}}"#; + db.load_as("feature", None, row, LoadMode::Merge, None) + .await + .expect("load onto a manifest-unreferenced fork must self-heal, not wedge"); + + // mutate → same path, must also self-heal. + mutate_branch( + &mut db, + "feature", + MUTATION_QUERIES, + "insert_person", + &mixed_params(&[("$name", "Yan")], &[("$age", 41)]), + ) + .await + .expect("mutate onto a manifest-unreferenced fork must self-heal"); + + // The healed branch holds the new rows; main is untouched (still no Zoe/Yan). + let feature_people = count_rows_branch(&db, "feature", "node:Person").await; + let main_people = count_rows(&db, "node:Person").await; + assert!( + feature_people >= main_people + 2, + "feature must contain the two new rows on top of the inherited set \ + (feature={feature_people}, main={main_people})" + ); +} + +// A node delete cascades to every edge table touching that node, forking those +// edge tables during execution. 
The up-front fork-queue acquisition must cover +// those cascade-forked edges, not just the node table named in the IR — else +// commit_all's held-guard coverage check fails the write (and, before the +// coverage check was promoted out of debug-only, edge commits would slip +// through unserialized). This drives the new code via a DELETE (the only +// cascading op), on a branch, as the FIRST write (so it actually forks). +#[tokio::test] +async fn branch_cascade_delete_forks_node_and_edges_under_held_queues() { + let dir = tempfile::tempdir().unwrap(); + let mut db = init_and_load(&dir).await; + db.branch_create("feature").await.unwrap(); + + // Baseline inherited from main (Alice has 2 Knows + 1 WorksAt edge). + let main_people = count_rows(&db, "node:Person").await; + let main_knows = count_rows(&db, "edge:Knows").await; + + // First write to `feature` is `delete Person Alice`, whose cascade forks + // node:Person AND edge:Knows + edge:WorksAt. Pre-fix the up-front set held + // only node:Person, so commit_all's coverage check rejected the write. + mutate_branch( + &mut db, + "feature", + MUTATION_QUERIES, + "remove_person", + &mixed_params(&[("$name", "Alice")], &[]), + ) + .await + .expect("branch cascade-delete must hold queues for cascade-forked edge tables"); + + // Alice and her edges are gone on feature; main is untouched. 
+ assert_eq!( + count_rows_branch(&db, "feature", "node:Person").await, + main_people - 1, + "feature should have Alice removed from the inherited set" + ); + assert!( + count_rows_branch(&db, "feature", "edge:Knows").await < main_knows, + "feature should have Alice's cascade-deleted Knows edges removed" + ); + assert_eq!( + count_rows(&db, "node:Person").await, + main_people, + "main must be untouched by the branch delete" + ); +} diff --git a/docs/dev/architecture.md b/docs/dev/architecture.md index 4e8d3c6..004a98a 100644 --- a/docs/dev/architecture.md +++ b/docs/dev/architecture.md @@ -1,6 +1,6 @@ # Architecture -OmniGraph is a typed property-graph engine built as a coordination layer over many Lance datasets, with Git-style branches and commits across the whole graph, multi-modal querying (vector + FTS + BM25 + RRF + graph traversal) in one runtime, an HTTP server with Cedar policy, and a CLI driven by a single `omnigraph.yaml`. +OmniGraph is a typed property-graph engine built as a coordination layer over many Lance datasets, with Git-style branches and commits across the whole graph, multi-modal querying (vector + FTS + BM25 + RRF + graph traversal) in one runtime, an HTTP server with Cedar policy, and a CLI driven by a per-operator `~/.omnigraph/config.yaml` plus team-owned cluster directories. ## Reading guide diff --git a/docs/dev/invariants.md b/docs/dev/invariants.md index 878adfe..2fa87d1 100644 --- a/docs/dev/invariants.md +++ b/docs/dev/invariants.md @@ -15,6 +15,38 @@ Use it this way: - Keep implementation ledgers, roadmap detail, and historical MR notes in the per-area docs. This file is the filter, not the encyclopedia. +## Governing principle: logical contract over physical state + +The hard invariants below are instances of one rule. Keep it in view whenever +a change touches the boundary between what the graph *means* and how it is +physically stored. + +> **Logical state is the contract. 
Physical state — index coverage, fragment +> layout, compaction versions, staged writes — is derived, rebuildable, and may +> be produced asynchronously. A physical operation must never fail a logical +> one. Preconditions are checked against logical state; physical reconciliation +> is idempotent and may lag or retry. Genuine logical conflicts still fail +> loudly: the licence to lag covers physical convergence, not correctness.** + +Invariants that instantiate it: **2** (manifest-atomic visibility) and **5** +(recovery is part of the commit protocol) — a partially-written physical layer +never changes what a graph commit means; **7** (indexes are derived state) — a +query is correct under partial index coverage, and expensive index work +converges from manifest state instead of gating the write path; **13** (failures +bounded and observable) — the licence to lag is not a licence to drop, so a +physical step that cannot make progress is surfaced, not swallowed. Deny-list +items that enforce it: synchronous inline vector/FTS index rebuilds on the +commit path; state that drifts from Lance or the manifest when it can be +derived; job queues for manifest-derivable state where a reconciler fits. + +The failure shape it rules out: a legitimate background operation on the +physical layer (compaction, an index build, an interrupted staged write) is +allowed to break a logical operation (a query's correctness, a migration's +success, a branch's writability). The smell to watch for is a logical operation +whose precondition is a *physical* fact — a cached file version, an index's +existence, a fragment count. Make the precondition logical and let a reconciler +converge the physical state. + ## Hard Invariants 1. 
**Respect the substrate.** Lance owns columnar storage, per-dataset @@ -105,7 +137,7 @@ Use it this way: | Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema/index.md), [execution.md](execution.md) | | Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema/index.md) | | Storage trait | `TableStorage` (via `db.storage()`) is staged-only; the inline-commit residuals (`delete_where`, `create_vector_index`) are split onto a separate sealed `InlineCommitResidual` trait reached via `db.storage_inline_residual()` (MR-854), so §1 holds by construction; capability/stat surfaces are roadmap | [writes.md](writes.md), [architecture.md](architecture.md) | -| Index lifecycle | Index *creation* per `@index`/`@key` property is dispatched by type (enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) via `node_prop_index_kind`; index *coverage maintenance* exists — `optimize` runs Lance `optimize_indices` after compaction to fold appended/rewritten fragments into existing indexes (still an explicit maintenance call, not yet a background reconciler) | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) | +| Index lifecycle | `@index`/`@key` declares *intent*; the physical index is derived state and never fails a logical op. `schema apply` builds no indexes (records intent only; index-only changes touch no table data). 
`load`/`mutate` build inline through one chokepoint (`build_indices_on_dataset_for_catalog`, type-dispatched by `node_prop_index_kind`: enum + orderable scalar → BTREE, free-text String → FTS, Vector → vector) that fault-isolates an untrainable Vector column into a *pending* index instead of aborting. `optimize`/`ensure_indices` is the reconciler: it creates declared-but-missing indexes and folds appended/rewritten fragments into existing ones (`optimize_indices`), reporting still-pending columns. Explicit maintenance call, not yet a background loop | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) | | Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/queries/index.md) | | Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/operations/server.md), [policy.md](../user/operations/policy.md) | | Tests | Tempdir-backed Lance tests are the current substrate; the storage adapter has an in-memory backend for adapter-level contract tests, but Lance datasets bypass it | [testing.md](testing.md) | @@ -165,6 +197,22 @@ them explicit. one-winner-CAS territory; closing this fully needs a cross-process serialization primitive (e.g. lease-based use of the schema-apply lock branch) — design it before promoting multi-process write topologies. +- **Fork reclaim is in-process-safe only:** the first write to a table on a + branch forks it (a Lance `create_branch` that advances state before the + manifest publish). An interrupted fork (crash, or a cancelled request + future) leaves a manifest-unreferenced branch ref. 
The next write self-heals + it — `reclaim_orphaned_fork_and_refork` (`force_delete_branch` + re-fork) + — but reclaim is only safe because the writer holds the per-`(table, + branch)` write queue from before the fork through the publish AND re-checks + the live manifest under it, so no *in-process* writer can be mid-fork. A + reclaim cannot serialize against a foreign-*process* in-flight fork: it may + force-delete a peer's just-created ref, which makes that peer's commit fail + and retry — the same one-winner-CAS exposure as above, not corruption. The + reclaim never fires unless in-process-queue + manifest authority both prove + the ref is manifest-unreferenced. `cleanup`'s per-table reconciler + (`reconcile_orphaned_branches`) is the guaranteed backstop for any fork the + write path never revisits. Both degrade to a no-op if Lance ships an atomic + multi-dataset branch op. - **Local `write_text_if_match` is not a cross-process CAS:** object-store backends use a true conditional put (ETag If-Match; the in-memory test backend too), but upstream `object_store` leaves `PutMode::Update` diff --git a/docs/dev/rfc-011-cli-refactoring.md b/docs/dev/rfc-011-cli-refactoring.md index 768509b..d26dd84 100644 --- a/docs/dev/rfc-011-cli-refactoring.md +++ b/docs/dev/rfc-011-cli-refactoring.md @@ -1,6 +1,9 @@ # RFC-011: CLI refactoring — one addressing & config model -**Status:** Proposed +**Status:** Accepted — implemented (the `omnigraph.yaml` excision landed as +#250/#251/#252; D1–D4, D6, D7, D9, D10 shipped). Two items remain: **D11** +(server-side maintenance jobs) is gated on the bulk-data-plane RFC #219; **D5** +(combined admin scope) stays deferred by design. **Date:** 2026-06-14 **Audience:** CLI/server maintainers **Builds on:** [rfc-007-operator-config.md](rfc-007-operator-config.md) @@ -526,10 +529,9 @@ Non-blocking; settle when convenient. server scope and maintain via `--cluster`. 
A `deployments: { … }` object (server + cluster validated coherent, referenced by a profile) is revisited only if admin ergonomics demand it — and Decision 11 largely removes the need. -- **D8 — the `profile` command surface.** `profile list` / `profile show` - (read-only inspection) are additive diagnostics, shippable anytime; they don't - touch the grammar or resolution. The *no sticky `profile use`* constraint holds - regardless — it is a design principle, not a command. +- **D8 — the `profile` command surface.** *Shipped:* `profile list` / `profile + show []` (read-only inspection). The *no sticky `profile use`* constraint + holds — it is a design principle, not a command. ## Safety diff --git a/docs/dev/testing.md b/docs/dev/testing.md index 7e181e1..8d6a305 100644 --- a/docs/dev/testing.md +++ b/docs/dev/testing.md @@ -7,7 +7,7 @@ This file is the always-on map of the test surface. **Consult it before every ta | Crate | Path | Style | |---|---|---| | `omnigraph` (engine) | `crates/omnigraph/tests/` | Integration tests (28 files), fixture-driven, share `tests/helpers/mod.rs` | -| `omnigraph-cli` | `crates/omnigraph-cli/tests/` | Per-area suites (post-modularization): `cli_cluster.rs` (cluster command surface + operator-actor cascade), `cli_cluster_e2e.rs` (spawned-binary lifecycle compositions — lost-state re-import recovery, out-of-band drift, graph-root destruction, multi-graph mixed-disposition convergence), `cli_data.rs` (load/read/change/branch/commit/export/snapshot/policy/embed/maintenance + operator format cascade), `cli_schema_config.rs` (init/config, schema plan/apply, RFC-008 deprecation warnings + `config migrate` + strict mode), `cli_queries.rs`, `parity_matrix.rs` (RFC-009 Phase 1: the embedded-vs-remote referee — every forked verb run against both arms with matched Cedar policy and the same actor, scrubbed-JSON + exit-code equality; divergences are pinned in its `KNOWN_DIVERGENCES` ledger, never silently repaired), `system_local.rs` (full-cycle 
cluster lifecycle with a spawned `--cluster` server, applied-policy enforcement over HTTP, keyed-credential auth, operator aliases), `system_remote.rs`; share `tests/support/mod.rs` (hermetic `OMNIGRAPH_HOME` by default) | +| `omnigraph-cli` | `crates/omnigraph-cli/tests/` | Per-area suites (post-modularization): `cli_cluster.rs` (cluster command surface + operator-actor cascade), `cli_cluster_e2e.rs` (spawned-binary lifecycle compositions — lost-state re-import recovery, out-of-band drift, graph-root destruction, multi-graph mixed-disposition convergence), `cli_data.rs` (load/read/change/branch/commit/export/snapshot/policy/embed/maintenance + operator format cascade), `cli_schema_config.rs` (init/config, schema plan/apply), `cli_queries.rs`, `parity_matrix.rs` (RFC-009 Phase 1: the embedded-vs-remote referee — every forked verb run against both arms with matched Cedar policy and the same actor, scrubbed-JSON + exit-code equality; divergences are pinned in its `KNOWN_DIVERGENCES` ledger, never silently repaired), `system_local.rs` (full-cycle cluster lifecycle with a spawned `--cluster` server, applied-policy enforcement over HTTP, keyed-credential auth, operator aliases), `system_remote.rs`; share `tests/support/mod.rs` (hermetic `OMNIGRAPH_HOME` by default) | | `omnigraph-cluster` | mostly in-source `#[cfg(test)] mod tests`; `tests/failpoints.rs` (feature-gated); `tests/s3_cluster.rs` (bucket-gated full lifecycle on object storage) | Cluster config parser, local JSON state diff, state CAS/lock handling/recovery, read-only validate/plan/status plus explicit refresh/import graph observations, config-only apply (content-addressed payload publish, disposition gating, composite-digest convergence, idempotent re-apply), catalog payload verification (status read-only, refresh drift + self-heal), failpoint crash-mid-apply / CAS-race coverage, Stage 4A graph creation (create executor, recovery sidecars + sweep rows, create crash windows), Stage 4B schema apply (migration 
previews in plan, schema executor, schema-apply sweep classification, schema crash windows), Stage 4C gated deletes (digest-bound approvals, delete executor + tombstones, delete sweep rows, delete crash windows), and 5A policy binding metadata (applies_to in the applied revision, binding-change diffing + convergence, pre-5A backfill), and the 5B serving-snapshot read API (converged read, refusal rows) | | `omnigraph-server` | `crates/omnigraph-server/tests/` | Per-area suites (post-modularization): `auth_policy.rs`, `data_routes.rs`, `schema_routes.rs`, `stored_queries.rs`, `multi_graph.rs` (cluster-mode boot — converged serving, policy binding wiring, boot refusals — + the concurrent branch-ops matrix), `boot_settings.rs` (mode inference, PolicySource), `s3.rs` (bucket-gated: single-graph serving + config-free `--cluster s3://` boot), `openapi.rs` (OpenAPI drift / regeneration); share `tests/support/mod.rs` | | `omnigraph-compiler` | mostly in-source `#[cfg(test)] mod tests` | Parser, type-checker, IR lowering, lint | @@ -29,7 +29,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav | `point_in_time.rs` | Snapshots, time travel (`snapshot_at_version`, `entity_at`) | | `changes.rs` | `diff_between` / `diff_commits` | | `consistency.rs` | Cross-table snapshot isolation, atomic publish | -| `schema_apply.rs` | Migration plan + apply, schema-apply lock | +| `schema_apply.rs` | Migration plan + apply, schema-apply lock; index materialization deferred to the reconciler (iss-848): `apply_schema_defers_vector_index_on_empty_table` (an empty-table Vector `@index` never aborts the apply) and `index_only_constraint_apply_touches_no_table_data` (adding an `@index` is metadata-only — no table-version bump) | | `search.rs` | FTS / vector / hybrid (`bm25`, `nearest`, `rrf`) | | `traversal.rs` | `Expand`, variable-length hops, anti-join (CSR path — `OMNIGRAPH_TRAVERSAL_MODE` unset) | | `traversal_indexed.rs` | BTREE-indexed Expand 
(`execute_expand_indexed`) forced via `OMNIGRAPH_TRAVERSAL_MODE`, asserted semantically equal to the CSR path; own binary, all `#[serial]` so env writes never race | @@ -42,7 +42,7 @@ The engine's `tests/` is the principal coverage surface; most graph-shaped behav | `lance_version_columns.rs` | Per-row `_row_last_updated_at_version` behavior | | `validators.rs` | Schema constraint enforcement (enum, range, unique, cardinality) across JSONL, insert, update paths | | `policy_engine_chassis.rs` | Engine-layer Cedar enforcement (MR-722): allow + deny through every `_as` writer via the SDK directly — no HTTP — proving embedded and CLI callers hit the same gate as the server, with action × scope shapes matching `authorize_request` | -| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and forced repair publishes only by explicit operator choice | +| `maintenance.rs` | `optimize` (compaction), `repair` (explicit uncovered-drift publish), and `cleanup` (version GC): empty/idempotent/no-op edges, policy validation, head preservation; `optimize` publishes its own compaction (`optimize_publishes_compaction_to_manifest_so_schema_apply_succeeds`), skips pre-existing uncovered drift (`optimize_skips_preexisting_manifest_head_drift`), and refuses to run while a `__recovery` sidecar is pending (`optimize_defers_when_recovery_sidecar_is_pending`); `repair` previews/heals verified maintenance drift, refuses raw semantic drift without `--force`, and 
forced repair publishes only by explicit operator choice; the index reconciler (iss-848): `index_build_tolerates_null_vector_rows` (an untrainable Vector column defers instead of aborting the build, sibling indexes still build), `optimize_materializes_index_declared_but_unbuilt` (optimize creates a declared-but-deferred index), and `optimize_materializes_index_after_type_rename` (optimize builds the deferred index on a table renamed by a RenameType migration) | | `failpoints.rs` | Failure-injection coverage (gated on `failpoints` feature). Includes the five per-writer Phase B → recovery integration tests (`recovery_rolls_forward_after_finalize_publisher_failure`, `schema_apply_phase_b_failure_recovered_on_next_open`, `branch_merge_phase_b_failure_recovered_on_next_open`, `ensure_indices_phase_b_failure_recovered_on_next_open`, `optimize_phase_b_failure_recovered_on_next_open`) and the write-entry in-process heal contract (the four `*_after_finalize_publisher_failure_heals_without_reopen` tests — load, mutation, schema apply, branch merge: a follow-up write on the same handle rolls a sidecar-covered residual forward without reopen/refresh) and the storage-fault matrix for the sidecar lifecycle (`recovery.sidecar_{write,delete,list}` / `recovery.record_audit` failpoints: Phase A put failure aborts with zero drift, Phase D delete failure is swallowed and healed by the next write, list failures are loud at heal and open, audit-append failures are retried to exactly one audit row; plus the bucket-gated `s3_load_recovers_after_publisher_failure_without_reopen`). | | `recovery.rs` | Open-time recovery sweep — sidecar I/O, classifier dispatch (NoMovement / RolledPastExpected / UnexpectedAtP1 / UnexpectedMultistep / InvariantViolation), all-or-nothing decision, roll-forward via `ManifestBatchPublisher::publish`, roll-back via `Dataset::restore`, audit row in `_graph_commit_recoveries.lance`, `OpenMode::ReadOnly` skip path | | `composite_flow.rs` | Compositional/narrative end-to-end stories — multi-step flows that compose mechanics covered by other test files. 
Catches integration regressions where individual operations all pass their unit tests but their composition breaks (sequential merges, post-merge main writes, time-travel through merge DAG, reopen consistency over multi-merge histories, post-optimize and post-cleanup strict writes). | diff --git a/docs/dev/writes.md b/docs/dev/writes.md index ccfd5bc..01c166e 100644 --- a/docs/dev/writes.md +++ b/docs/dev/writes.md @@ -19,8 +19,14 @@ publisher's row-level CAS on `__manifest` is the single fence. `__run__*` branch on an upgraded graph is swept off `__manifest` by the v2→v3 internal-schema migration on first read-write open. (The inert `_graph_runs.lance` bytes remain until a `delete_prefix` primitive lands.) -- Cancelled mutation futures leave **no graph-level state** — only orphaned - Lance fragments, which the existing `omnigraph cleanup` pipe reclaims. +- Cancelled mutation futures leave **no graph-visible state** — the manifest + is never advanced. They can leave two kinds of unreferenced residue, both + self-healing: orphaned Lance fragments (reclaimed by `omnigraph cleanup`), + and — on the *first* write to a table on a branch, which forks it before the + publish — a manifest-unreferenced branch ref. The next write to that table + reclaims the stale fork and re-forks (`reclaim_orphaned_fork_and_refork`), + and `cleanup`'s per-table reconciler is the guaranteed backstop; see the + fork-reclaim note in [invariants.md](invariants.md). 
## Read-your-writes within a multi-statement mutation diff --git a/docs/user/cli/index.md b/docs/user/cli/index.md index 6df606c..6f49c42 100644 --- a/docs/user/cli/index.md +++ b/docs/user/cli/index.md @@ -6,34 +6,43 @@ omnigraph init --schema schema.pg graph.omni omnigraph load --data data.jsonl --mode overwrite graph.omni omnigraph snapshot graph.omni --branch main --json -omnigraph query --uri graph.omni --query queries.gq --name get_person --params '{"name":"Alice"}' -omnigraph mutate --uri graph.omni --query queries.gq --name insert_person --params '{"name":"Mina","age":28}' +# Invoke a stored query BY NAME from the catalog (served — addressed by scope): +omnigraph query get_person --params '{"name":"Alice"}' +omnigraph mutate insert_person --params '{"name":"Mina","age":28}' ``` `omnigraph query` is the canonical read command (pairs with `POST /query`); `omnigraph mutate` is the canonical write command (pairs with `POST /mutate`). -The previous names `omnigraph read` and `omnigraph change` keep working as -visible aliases — invocations emit a one-line deprecation warning to stderr -and otherwise behave identically. See [Deprecated names](#deprecated-names) -for the migration table. +The positional argument is the **stored-query name**, invoked from the served +catalog (RFC-011 D3) — the graph is addressed by scope (`--server` / `--profile` +/ defaults), and the verb asserts the query's kind (`query` rejects a stored +mutation, and vice-versa). The previous names `omnigraph read` and +`omnigraph change` keep working as visible aliases — invocations emit a one-line +deprecation warning to stderr. See [Deprecated names](#deprecated-names). 
-For ad-hoc reads and mutations (REPLs, AI agents, one-off scripts), pass the -GQ source inline with `-e` / `--query-string` instead of a file path: +For **ad-hoc** reads and mutations (REPLs, AI agents, one-off scripts, local dev), +pass the GQ source with `-e` / `--query-string` (inline) or `--query ` (a +file), and address a graph's storage directly with `--store`. By-name catalog +invocation is served-only — a bare `--store` has no catalog, so it's the ad-hoc +lane: ```bash -omnigraph query --uri graph.omni \ +omnigraph query --store graph.omni \ -e 'query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }' \ --params '{"name":"Alice"}' -omnigraph mutate --uri graph.omni \ +omnigraph mutate --store graph.omni \ -e 'query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }' \ --params '{"name":"Inline","age":42}' + +# A multi-query file: the positional selects which query to run. +omnigraph query --store graph.omni --query queries.gq get_person --params '{"name":"Alice"}' ``` -`-e` is mutually exclusive with `--query ` and `--alias `; exactly -one of the three must be provided. The inline source travels through the same -parser, lint, params binding, and commit machinery as a file-based query — -only the source loader changes. +`-e` is mutually exclusive with `--query `. With either, the positional +name (optional) selects which query in the source to run. The inline source +travels through the same parser, lint, params binding, and commit machinery as a +file-based query — only the source loader changes. 
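
The two lanes described above can be modelled as a small decision function. This is a minimal sketch in Python, not the CLI's actual code: `select_lane`, its parameters, and the error messages are hypothetical names chosen for illustration. Only the rules come from the text: by-name invocation is served-only, `-e` and `--query` are mutually exclusive, and with either of them the optional positional selects a query inside the source.

```python
from typing import Optional, Tuple


def select_lane(
    name: Optional[str],
    query_file: Optional[str] = None,    # models --query
    query_string: Optional[str] = None,  # models -e / --query-string
    served: bool = False,                # True when the scope resolves to a server
) -> Tuple[str, Optional[str]]:
    """Return (lane, selected_query_name) for a query/mutate invocation."""
    if query_file is not None and query_string is not None:
        raise ValueError("-e/--query-string and --query are mutually exclusive")

    if query_file is not None or query_string is not None:
        # Ad-hoc lane: the positional is optional and picks a query in the source.
        return ("ad-hoc", name)

    # Catalog lane: a stored query invoked by name from the served catalog.
    if name is None:
        # Assumption: a missing name is an error; the text does not spell this out.
        raise ValueError("a stored-query name is required without --query or -e")
    if not served:
        raise ValueError("by-name invocation is served-only; a bare --store has no catalog")
    return ("catalog", name)
```

Under this model, `select_lane("get_person", served=True)` lands in the catalog lane, while the same name combined with `query_file="queries.gq"` lands in the ad-hoc lane and selects `get_person` from that file.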
## Branching And Reviewable Data Flows @@ -50,19 +59,18 @@ omnigraph commit show --uri graph.omni --json ## Remote Server Mode -Serve a graph: +Serve a cluster-applied graph: ```bash -omnigraph-server graph.omni --bind 127.0.0.1:8080 +omnigraph cluster apply --config ./company-brain +omnigraph-server --cluster ./company-brain --bind 127.0.0.1:8080 ``` -Read through the HTTP API: +Read through the HTTP API — invoke a stored query by name from the catalog: ```bash -omnigraph query \ +omnigraph query get_person \ --server http://127.0.0.1:8080 \ - --query queries.gq \ - --name get_person \ --params '{"name":"Alice"}' ``` @@ -71,25 +79,31 @@ literal URL); a positional `http(s)://` URI is rejected. If the server requires auth, set its bearer token and `omnigraph login ` (or `OMNIGRAPH_BEARER_TOKEN`). -## Multi-graph servers (v0.6.0+) +## Multi-graph servers -Against a multi-graph server (started with `--config omnigraph.yaml` referencing a non-empty `graphs:` map), use `omnigraph graphs list` to enumerate the registered graphs. The server must configure bearer tokens and `server.policy.file` with a rule that allows `graph_list`; `/graphs` is closed by default even when the server runs with `--unauthenticated`. +A server boots from a cluster directory (`omnigraph-server --cluster `) and +serves every graph the cluster declares. Use `omnigraph graphs list` to enumerate +them. The cluster's server-level policy must allow `graph_list`; `/graphs` is +closed by default even when the server runs with `--unauthenticated`. ```bash OMNIGRAPH_BEARER_TOKEN=admin-token \ - omnigraph graphs list --uri http://server.example.com --json + omnigraph graphs list --server http://server.example.com --json ``` -For config-driven clients, set the remote graph's `bearer_token_env` to an environment variable containing a token whose actor is authorized by `server.policy.file`. 
+For an operator-defined server, store its token with `omnigraph login ` (or +`OMNIGRAPH_TOKEN_`); the actor must be authorized by the cluster's +server-level policy. -`list` rejects local URI targets — it's for remote multi-graph servers only. +`list` rejects local (`--store`) targets — it's for remote multi-graph servers only. -Runtime add/remove is **not** in v0.6.0. To add a graph, stop the server, add a `graphs.` entry to `omnigraph.yaml`, then restart. To remove, stop the server, delete the entry, restart. +Runtime add/remove via API is not exposed. To add or remove a graph, edit the +cluster's `cluster.yaml`, run `omnigraph cluster apply`, then restart the server. -Per-graph URLs: hit a graph's cluster route from any subcommand by pointing `--uri` at it: +Per-graph addressing: select a graph on a multi-graph server with `--graph`: ```bash -omnigraph read --uri http://server.example.com/graphs/beta --query q.gq ... +omnigraph query get_person --server http://server.example.com --graph beta --params '{"name":"Ada"}' ``` ## Runs, Policy, And Diagnostics @@ -100,9 +114,9 @@ omnigraph check --query queries.gq graph.omni --json omnigraph schema plan --schema next.pg graph.omni --json omnigraph schema apply --schema next.pg graph.omni --json -omnigraph policy validate --config omnigraph.yaml -omnigraph policy test --config omnigraph.yaml -omnigraph policy explain --config omnigraph.yaml --actor act-alice --action read --branch main +omnigraph policy validate --cluster ./company-brain --graph knowledge +omnigraph policy test --cluster ./company-brain --graph knowledge --tests policy.tests.yaml +omnigraph policy explain --cluster ./company-brain --graph knowledge --actor act-alice --action read --branch main omnigraph commit list graph.omni --json omnigraph commit show --uri graph.omni --json @@ -116,34 +130,29 @@ also pass `--schema`. 
## Config -`omnigraph.yaml` lets the CLI and server share named graphs, defaults, and -query roots: +Configuration has two surfaces with single owners (see the +[CLI reference](reference.md#config-surfaces) for the full schema): + +- **`~/.omnigraph/config.yaml`** — your personal operator config: default actor + (`--as`), named servers + credentials, clusters, profiles, aliases, and + default scope (`defaults.server` / `defaults.store` / `default_graph`). It + decides *who you are* and *what you address by default*. +- **`cluster.yaml`** (a team-owned cluster directory) — declares *what the system + is*: graphs, schemas, stored queries, policies, and storage. A server boots + from it (`--cluster `); see the [cluster guide](../clusters/index.md). ```yaml -graphs: - local: - uri: demo.omni +# ~/.omnigraph/config.yaml +operator: + actor: act-andrew +servers: dev: - uri: http://127.0.0.1:8080 - bearer_token_env: OMNIGRAPH_BEARER_TOKEN - -cli: - graph: local - branch: main - -query: - roots: - - queries - - . + url: http://127.0.0.1:8080 +defaults: + server: dev + default_graph: knowledge ``` -The config file can also define: - -- server bind defaults -- auth env files -- query aliases for common read and change commands -- `policy.file` for Cedar authorization rules - When policy is enabled, `schema apply` is authorized through the `schema_apply` action and is typically limited to admins on protected `main`. @@ -161,6 +170,6 @@ one-line warning to stderr and otherwise behave identically. | `omnigraph query lint` | `omnigraph lint` | Same flags. The argv-level shim rewrites `query lint` to `lint`. | | `omnigraph query check` | `omnigraph check` | `check` is a visible alias of `omnigraph lint`. 
| -The `command:` field in `aliases.` in `omnigraph.yaml` accepts both -`read` / `change` (legacy) and `query` / `mutate` (canonical); the two +The `command:` field in `aliases.` in `~/.omnigraph/config.yaml` accepts +both `read` / `change` (legacy) and `query` / `mutate` (canonical); the two spellings are interchangeable on the wire via serde aliases. diff --git a/docs/user/cli/reference.md b/docs/user/cli/reference.md index 3da502a..1d52e45 100644 --- a/docs/user/cli/reference.md +++ b/docs/user/cli/reference.md @@ -1,31 +1,32 @@ # CLI Reference (`omnigraph`) -A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](index.md). +A reference for the `omnigraph` binary's command surface and the per-operator `~/.omnigraph/config.yaml` schema. For a quick-start guide, see [cli.md](index.md). -Top-level command families and subcommands. Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server ` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph ` for multi-graph servers; exclusive with a positional URI), `--store ` (a single graph's storage directly), or `--profile ` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles-rfc-011)); `cluster` commands use `--config `. A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected. +Top-level command families and subcommands. 
Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server ` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph ` for multi-graph servers; exclusive with a positional URI), `--store ` (a single graph's storage directly), or `--profile ` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles-rfc-011)); `cluster` commands use `--config `, while `policy` and `queries` read a cluster's applied state via `--cluster `. A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected. **`query`/`mutate` are the exception**: their positional is a stored-query *name* (RFC-011 D3), not a graph URI, so they address the graph only via `--store`/`--server`/`--profile`/defaults. ## Top-level commands | Command | Purpose | |---|---| -| `init` | `--schema ` → initialize a graph (no longer scaffolds `omnigraph.yaml`; start cluster configs from the [cluster.md](../clusters/index.md) quick-start or `config migrate`) | +| `init` | `--schema ` → initialize a graph (start cluster configs from the [cluster.md](../clusters/index.md) quick-start) | | `load` | bulk load a branch, local or remote (`--mode overwrite\|append\|merge` is **required** — overwrite is destructive, so there is no default). Without `--from` the target branch must exist; `--from ` forks a missing `--branch` from `` first | | `ingest` | deprecated alias of `load --from ` (defaults: `--from main --mode merge`); prints a one-line warning to stderr | -| `query` (alias: `read`) | run named read query; source via `--query `, `-e`/`--query-string `, or `--alias ` (exactly one). `read` is the deprecated previous name and prints a one-line warning to stderr | -| `mutate` (alias: `change`) | run mutation query; same `--query` / `-e` / `--alias` mutual-exclusion as `query`. 
`change` is the deprecated previous name and prints a one-line warning to stderr | +| `query ` (alias: `read`) | run a read query. **Catalog lane** (default): `` is a stored query invoked **by name** from the served catalog (served-only — address with `--server`/`--profile`; the verb asserts the query is a read). **Ad-hoc lane**: with `--query ` or `-e`/`--query-string `, runs that source (the positional `` then selects which query in it). No positional graph URI — address via `--store`/`--server`/`--profile`. `read` is the deprecated previous name (one-line stderr warning) | +| `mutate ` (alias: `change`) | run a mutation query; same catalog (by-name, served-only, verb asserts mutation) / ad-hoc (`--query`/`-e`) lanes as `query`. `change` is the deprecated previous name (one-line stderr warning) | +| `alias [args]` | invoke an operator alias — a read-only personal binding (under `aliases:` in `~/.omnigraph/config.yaml`) to a stored query on a named server (RFC-011 D4; replaces the removed `--alias` flag; stored mutations are rejected before execution) | | `snapshot` | print current snapshot (per-table version + row count) | | `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) | | `branch create \| list \| delete \| merge` | branching ops | | `commit list \| show` | inspect commit graph | -| `schema plan \| apply \| show (alias: get)` | migrations | +| `schema plan \| apply \| show (alias: get)` | migrations. `apply` refuses a cluster-managed graph (one whose storage is inside a cluster) and points at `cluster apply` — those graphs evolve through the cluster ledger, not a direct apply | | `lint` (alias: `check`) | offline / graph-backed query validation. 
Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` | -| `config migrate` | propose (or `--write`: apply) the split of a legacy `omnigraph.yaml` — team half → ready-to-review `cluster.yaml`, personal half → `~/.omnigraph/config.yaml` (key-level merge, existing entries win), plus dropped-key reasons and manual steps | -| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve --as ` (`apply`/`approve` default the actor from the per-operator `omnigraph.yaml`'s `cli.actor` when `--as` is omitted; nothing else in that file affects cluster commands); what apply converges is what an `omnigraph-server --cluster ` deployment serves on its next restart (omnigraph.yaml deployments are unaffected); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock ` manually removes a held local state lock by exact id | +| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. 
`validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve --as ` (`apply`/`approve` default the actor from `~/.omnigraph/config.yaml`'s `operator.actor` when `--as` is omitted); what apply converges is what an `omnigraph-server --cluster ` deployment serves on its next restart (`--cluster` is the server's only boot source — RFC-011 cluster-only); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock ` manually removes a held local state lock by exact id | | `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) | | `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. `--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review | -| `cleanup --keep N --older-than 7d --confirm` | destructive version GC | +| `cleanup --keep N --older-than 7d --confirm` | destructive version GC (`--confirm` to execute; also needs `--yes` against a non-local `s3://` target — see *Write diagnostics & destructive confirmation*) | | `embed` | offline JSONL embedding pipeline | -| `policy validate \| test \| explain` | Cedar tooling. 
Selects `cli.graph`, else `server.graph`, else top-level `policy.file` | +| `policy validate \| test \| explain` | Cedar tooling against a cluster's applied policies (`--cluster `; `--graph ` picks a graph's bundle when several apply). `test` takes `--tests `; `explain` takes `--actor`/`--action`/`--branch`/`--target-branch` | +| `profile list \| show []` | read-only inspection of `~/.omnigraph/config.yaml` profiles. `list` shows each profile's binding (server/cluster/store) + default graph and marks the `$OMNIGRAPH_PROFILE`-active one; JSON keeps `binding` and adds `scope_kind`, `target`, `valid`, and `error`; `show` resolves one profile's scope (endpoint + default graph), defaulting to the active profile, else the flat operator defaults | | `version` / `-v` | print `omnigraph 0.3.x` | ## Command capabilities @@ -34,21 +35,30 @@ Every command declares the **capability** it needs — what it requires to reach - **`any`** — `query`, `mutate`, `load`, `ingest`, `branch *`, `snapshot`, `export`, `commit *`, `schema show`, `schema apply`. Run against a graph **served (via a server) or embedded (direct against a store)**: accept a positional `file://`/`s3://` URI, `--server ` (+ `--graph ` for multi-graph servers), `--store `, or `--profile `. A remote server is addressed with `--server` — a positional `http(s)://` URI does **not** dispatch to one. - **`served`** — `graphs list`. Requires a server (accepts `--server` / `--profile`). -- **`direct`** — `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `queries validate`, `lint`. Need **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI`, but **not** `--server` / `--graph`, and a remote (`http(s)://`) URI is rejected. `optimize` / `repair` / `cleanup` also accept **`--cluster --cluster-graph `**, which resolves the graph's storage URI from the served cluster state (so you needn't know the `/graphs/.omni` layout). -- **`control`** — `cluster *`. 
Operates on a cluster directory via `--config `. -- **`local`** — `policy *`, `embed`, `login`, `logout`, `config`, `version`, `queries list`. Address no graph. +- **`direct`** — `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `lint`. Need **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI`, but **not** `--server`, and a remote (`http(s)://`) URI is rejected. `optimize` / `repair` / `cleanup` additionally accept **`--cluster --graph `** (`--cluster` is a cluster directory or storage-root URI, named via `clusters:` in `~/.omnigraph/config.yaml` or a literal root), which resolves the graph's storage URI from the served cluster state (so you needn't know the `/graphs/.omni` layout). `--graph` is the one graph selector across all scopes — on these three verbs it picks the cluster graph; on the other `direct` verbs it does not apply. +- **`control`** — `cluster *` via `--config `; `policy *` and `queries *` via `--cluster ` or a cluster profile. +- **`local`** — `alias`, `embed`, `login`, `logout`, `profile`, `version`. Address no explicit graph scope. These restrictions are enforced and reported, not silent: -- A served-graph flag (`--server` / `--graph`) on a verb that doesn't reach a graph through a server fails loudly, e.g.: ``optimize is a direct (storage-native) command; --server/--graph address a served graph and do not apply. Pass a storage URI, or --cluster --cluster-graph .`` +- A scope flag on a verb that can't consume it fails loudly rather than being silently dropped — `--server` outside a served scope, `--cluster` outside cluster-scoped verbs, or `--graph` where no multi-graph scope applies, e.g.: ``optimize is a direct (storage-native) command; --server addresses a served graph and does not apply. 
Pass a storage URI, or --cluster --graph .`` - A `direct` verb pointed at a remote URI fails loudly, e.g.: ``optimize is a direct (storage-native) command and needs direct storage access; the resolved target is a remote server (https://…). Pass the graph's file:// or s3:// URI.`` - A data verb pointed at a positional `http(s)://` URI fails loudly: ``a remote graph must be addressed with --server — a positional (or --uri) http(s):// URL no longer dispatches to a server.`` - `init` into an **established cluster's** storage layout (`/graphs/.omni` where `` holds `__cluster/state.json`) is refused — graphs in a cluster are created by `cluster apply` (which records ledger / recovery / approvals), not `init`. -To maintain a server-backed graph, run the `direct` verbs from a host with storage access against the graph's storage URI (a positional URI, or `--cluster … --cluster-graph …`), out-of-band from the serving process — there are no server routes for `optimize` / `repair` / `cleanup` by design. +To maintain a server-backed graph, run the `direct` verbs from a host with storage access against the graph's storage URI (a positional URI, or `--cluster … --graph …`), out-of-band from the serving process — there are no server routes for `optimize` / `repair` / `cleanup` by design. `omnigraph --help` lists commands with a **capability legend** at the bottom (any / served / direct / control / local). +## Write diagnostics & destructive confirmation + +Two global flags make writes self-documenting and guard the dangerous ones (RFC-011 Decision 9): + +- **Every write echoes its resolved target to stderr** — `omnigraph load → s3://acme/brain/graphs/knowledge.omni (direct, remote)` — so you catch a scope that resolved somewhere unexpected (e.g. *prod*) before it lands. Applies to `load`, `ingest`, `mutate`, `branch create|delete|merge`, `schema apply`, `optimize`, `repair`, `cleanup`. 
The line is stderr, so `--json` consumers reading stdout are unaffected; suppress it with **`--quiet`**. +- **Destructive writes against a non-local scope require confirmation.** `cleanup`, overwrite `load` (`--mode overwrite`), and `branch delete` proceed freely against a local (`file://`) graph, but when the resolved target is **not local** (a served `http(s)://` graph or an `s3://` store/cluster) they require explicit consent: pass **`--yes`** to confirm, an interactive terminal is prompted, and a non-interactive run (no TTY, or `--json`) **refuses with an error** rather than silently destroying. `cleanup` still also requires its existing `--confirm` (preview→execute); `--yes` is the additional non-local consent. + +A "local" target is a bare path or a `file://` URI; `http(s)://`, `s3://`, and other object-store schemes are non-local. + ## Config surfaces Two config surfaces with single owners, plus a zero-config tier: @@ -59,22 +69,20 @@ Two config surfaces with single owners, plus a zero-config tier: | Operator config | one person | `~/.omnigraph/config.yaml` (override dir with `$OMNIGRAPH_HOME`) | who **I** am: identity, ergonomics | | Flags / env | per invocation | — | everything, explicitly | -`omnigraph.yaml` (below) is the legacy combined file — fully supported -today, slated for staged deprecation; its keys' future homes are -listed there. 
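
The destructive-confirmation rule from *Write diagnostics & destructive confirmation* can be sketched as follows. This is an illustrative Python model under stated assumptions, not the CLI's implementation: `is_local`, `may_proceed`, and the `prompt` callback are hypothetical names, and `cleanup`'s separate `--confirm` step is deliberately not modelled.

```python
import re
from typing import Callable, Optional

# Verbs the text lists as destructive; overwrite `load` is handled via `mode`.
DESTRUCTIVE_VERBS = {"cleanup", "branch delete"}


def is_local(target: str) -> bool:
    """A bare path or a file:// URI is local; any other scheme is non-local."""
    match = re.match(r"^([A-Za-z][A-Za-z0-9+.-]*)://", target)
    return match is None or match.group(1).lower() == "file"


def may_proceed(
    verb: str,
    target: str,
    *,
    mode: Optional[str] = None,       # models load's --mode
    yes: bool = False,                # models --yes
    interactive: bool = False,        # True when a TTY is attached
    json_output: bool = False,        # models --json
    prompt: Callable[[], bool] = lambda: False,  # hypothetical confirmation prompt
) -> bool:
    destructive = verb in DESTRUCTIVE_VERBS or (verb == "load" and mode == "overwrite")
    if not destructive or is_local(target):
        return True   # non-destructive writes and local targets proceed freely
    if yes:
        return True   # explicit non-local consent
    if interactive and not json_output:
        return prompt()  # an interactive terminal is asked
    # No TTY, or --json: refuse rather than silently destroy.
    raise PermissionError(f"{verb} against non-local target {target} requires --yes")
```

The sketch keeps the locality test separate from the consent test so that each can be checked on its own, which mirrors how the text defines "local" in a standalone sentence.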
- ### `~/.omnigraph/config.yaml` (operator) ```yaml operator: - actor: act-andrew # default identity for every --as cascade: - # --as > legacy cli.actor > operator.actor > none + actor: act-andrew # default identity for the --as cascade: --as > operator.actor > none servers: # operator-owned endpoints; names key the credentials prod: url: https://graph.example.com # no tokens in this file, ever defaults: - output: table # read format default, below --json/--format/alias/legacy - server: prod # the everyday scope when no address is given (RFC-011) + output: table # read format default, below --json/--format/alias + server: prod # the everyday SERVED scope when no address is given (RFC-011) + # store: file:///data/dev.omni # OR a zero-flag LOCAL default (mutually + # # exclusive with `server`); the local-dev + # # counterpart of `server` default_graph: knowledge # graph selected in a server/cluster scope clusters: # admin-only: managed-cluster storage roots (RFC-011). brain: # the ONLY place a storage root lives in this file. @@ -85,8 +93,8 @@ profiles: # named scope bundles (RFC-011); pick with --profile ``` Absent file = empty layer. Unknown keys warn and load (a file written for a -newer CLI works on an older one). `$OMNIGRAPH_CONFIG=` stands in for -`--config` (the flag wins) in both the CLI and the server. +newer CLI works on an older one). Override the config directory with +`$OMNIGRAPH_HOME`. #### Scopes & profiles (RFC-011) @@ -95,20 +103,32 @@ graph in it; the served-vs-direct access path is derived from the scope, not toggled. The scope comes from one of (highest precedence first): an explicit address (a positional URI, `--server`, or `--store `); a named `--profile ` (or `$OMNIGRAPH_PROFILE`); or the flat `defaults.server` + -`defaults.default_graph`. A **profile** binds exactly one of `server` / `cluster` -/ `store` plus an optional default graph — config data, not state: every command -resolves its scope fresh, there is no sticky "current" mode. 
+`defaults.default_graph` (a served default) **or** `defaults.store` (a zero-flag +*local* default — mutually exclusive with `defaults.server`). A **profile** binds +exactly one of `server` / `cluster` / `store` plus an optional default graph — +config data, not state: every command resolves its scope fresh, there is no +sticky "current" mode. Inspect what is defined with `omnigraph profile list` and +`omnigraph profile show []` (read-only). - `--store ` addresses a single graph's storage directly (ad-hoc / break-glass). - A `cluster`-bound profile reaches `optimize` / `repair` / `cleanup` for a managed graph (resolving its storage root from `clusters:`), the same as - `--cluster --cluster-graph `. + `--cluster --graph `. A `--graph` flag overrides the profile's default. - A `server`-bound scope on a maintenance verb, or a `cluster`-bound scope on a data verb, is rejected with a message pointing at the right addressing. +- **No graph selected (RFC-011 D7).** When a scope has no `--graph` and no + `default_graph`, the CLI never silently picks: + - **Cluster scope** — exactly **one** applied graph is used automatically; + **several** errors and lists the candidates (from the served catalog). + - **Server scope** — a multi-graph server (any non-empty `GET /graphs`, even a + single entry) errors and lists the candidates: you must pass `--graph `. + A single-graph / flat server (405 on `/graphs`), or one whose `/graphs` is + policy-gated or unreachable, uses its bare URL as before. -`--target` and the positional-`http(s)://`→remote dispatch have been **removed**; -the remaining legacy surfaces (`--cluster-graph`, `omnigraph.yaml`'s `cli.graph` -default) still work and an explicit address always wins. +`--target`, `--cluster-graph`, and the positional-`http(s)://`→remote dispatch +have been **removed** (`--graph` is now the one graph selector across server and +cluster scopes); operator `defaults`/`--profile` supply the no-flag scope and an +explicit address always wins. 
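
The precedence and no-graph-selected rules above can be sketched in a few lines. This is a hedged Python model, not the CLI's resolver: `Scope`, `resolve_scope`, and `select_graph` are hypothetical names, `candidates` stands in for whatever the served catalog or `GET /graphs` returned (`None` meaning a 405, policy-gated, or unreachable endpoint), and the handling of a cluster with zero graphs is an assumption the text does not cover.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Scope:
    kind: str                    # "server", "cluster", or "store"
    target: str                  # server name/URL, cluster name/root, or store URI
    graph: Optional[str] = None  # default graph, if any


def resolve_scope(explicit: Optional[Scope], profile: Optional[str],
                  profiles: Dict[str, Scope], defaults: Dict[str, str]) -> Scope:
    """Highest precedence first: explicit address, named profile, flat defaults."""
    if explicit is not None:
        return explicit
    if profile is not None:
        return profiles[profile]
    if "server" in defaults and "store" in defaults:
        raise ValueError("defaults.server and defaults.store are mutually exclusive")
    if "server" in defaults:
        return Scope("server", defaults["server"], defaults.get("default_graph"))
    if "store" in defaults:
        return Scope("store", defaults["store"])
    raise ValueError("no scope: pass an address, --profile, or set defaults")


def select_graph(scope: Scope, graph_flag: Optional[str],
                 candidates: Optional[List[str]]) -> Optional[str]:
    """Pick the graph; never silently choose among several (RFC-011 D7)."""
    if graph_flag is not None:
        return graph_flag            # --graph overrides the default
    if scope.graph is not None:
        return scope.graph
    if scope.kind == "cluster":
        if candidates is not None and len(candidates) == 1:
            return candidates[0]     # exactly one applied graph is used
        # Several graphs error per the text; zero is treated the same here (assumption).
        raise ValueError(f"pass --graph; candidates: {candidates}")
    if scope.kind == "server" and candidates:
        # A multi-graph server errors even when it lists a single entry.
        raise ValueError(f"pass --graph; candidates: {candidates}")
    return None                      # bare URL or single store: no graph name needed
```

Resolving the scope and selecting the graph are kept as two steps because the text treats them separately: the scope decides the access path, and `--graph` is the one selector applied within it.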
#### Credentials keyed by server name @@ -136,10 +156,11 @@ aliases: format: table ``` -`omnigraph query --alias triage 2026-06-01` invokes +`omnigraph alias triage 2026-06-01` invokes `POST /graphs/spike/queries/weekly_triage` with the keyed -credential. A legacy `omnigraph.yaml` alias with the same name wins during -the deprecation window (with a warning). +credential. Aliases live in their own `alias` namespace (RFC-011 Decision 4), +so an alias can never shadow — or be shadowed by — a built-in verb. (The old +`--alias ` flag on `query`/`mutate` was removed.) A remote command whose URL prefix-matches an operator server's `url` (the `gh` host model — no flags needed) resolves its token through: @@ -148,64 +169,10 @@ A remote command whose URL prefix-matches an operator server's `url` (the |---|---| | 1 | `OMNIGRAPH_TOKEN_` env (`prod` → `OMNIGRAPH_TOKEN_PROD`) | | 2 | `[]` section in `~/.omnigraph/credentials` | -| 3 | the legacy chain unchanged (`bearer_token_env` → `OMNIGRAPH_BEARER_TOKEN` → `auth.env_file`) | +| 3 | the default `OMNIGRAPH_BEARER_TOKEN` env | -A token is only ever sent to the server it is keyed to: URLs matching no -operator server use the legacy chain alone. - -## `omnigraph.yaml` schema (legacy combined file) - -> **Deprecated.** Loading this file prints a per-key notice -> naming each present key's new home (suppress in CI with -> `OMNIGRAPH_SUPPRESS_YAML_DEPRECATION=1`); `omnigraph config migrate` -> produces the split. The file keeps working through the deprecation -> window. Migrated teams can set `OMNIGRAPH_NO_LEGACY_CONFIG=1` to turn -> any legacy-file load into a hard error (regression guard; the file's -> absence is always fine). 
- -```yaml -project: { name } -graphs: - : - uri: - bearer_token_env: - queries: # per-graph stored-query registry (server-role; multi-graph mode) - : # key MUST equal the `query ` symbol inside the .gq - file: # relative to this config's directory - mcp: - expose: true # default true: listed in the MCP catalog (GET /queries); set false to hide (still HTTP-callable) - tool_name: # optional MCP tool-name override (defaults to ; - # must be unique across exposed queries) -server: - graph: - bind: -cli: - graph: - branch: - output_format: json|jsonl|csv|kv|table - table_max_column_width: 80 - table_cell_layout: truncate|wrap -query: - roots: [, …] # search path for .gq files -auth: - env_file: .env.omni -aliases: - : - # accepted values: `read` / `query` (read alias), `change` / `mutate` - # (write alias). `query` and `mutate` are recommended; `read` and - # `change` remain accepted forever for back-compat. - command: read|change|query|mutate - query: - name: - args: [, …] - graph: - branch: - format: -queries: # top-level registry — applies only to a bare-URI (anonymous) graph; a graph served by name uses its `graphs..queries`. Mirrors top-level `policy`. - : { file: } # mcp.expose defaults to true -policy: - file: policy.yaml -``` +A keyed token is only ever sent to the server it is keyed to: a URL matching no +operator server falls back to `OMNIGRAPH_BEARER_TOKEN` alone. ## Cluster config preview @@ -228,8 +195,8 @@ apply, refresh, and import acquire `__cluster/lock.json` by default and release it before returning. `cluster apply` executes only stored-query/policy catalog writes (content-addressed under `__cluster/resources/`) and requires an existing `state.json`; graph/schema changes are deferred with warnings, and -applied resources do not serve traffic — the server still boots from -`omnigraph.yaml`. `cluster status` reads state only and reports any existing +applied resources do not serve traffic until an `omnigraph-server --cluster +` restart picks them up. 
`cluster status` reads state only and reports any existing lock metadata. `force-unlock` removes a lock only when the supplied id exactly matches the lock file. `refresh` requires an existing `state.json`; `import` creates one only when it is missing. Both observe declared graphs read-only at @@ -248,7 +215,7 @@ embeddings, aliases, and bindings are reserved for later stages. See ## Param resolution -Precedence (high to low): explicit `--params` / `--params-file`, alias positional args, `omnigraph.yaml` defaults. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients. +Precedence (high to low): explicit `--params` / `--params-file`, alias positional args. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients. ## Bearer token resolution (CLI) diff --git a/docs/user/clusters/config.md b/docs/user/clusters/config.md index 348d3a4..8f8caf4 100644 --- a/docs/user/clusters/config.md +++ b/docs/user/clusters/config.md @@ -32,26 +32,24 @@ omnigraph cluster force-unlock --config company-brain --json `--config` points at a directory, not a file. The directory must contain `cluster.yaml`. When omitted, it defaults to the current directory. -## Relationship to `omnigraph.yaml` +## Relationship to `~/.omnigraph/config.yaml` -`cluster.yaml` does not replace `omnigraph.yaml`, and the two never describe -the same fact. `omnigraph.yaml` is the permanent **per-operator** layer (CLI -defaults, the operator's identity and credential references, graph targets -for data-plane commands); `cluster.yaml` is the shared desired state of a +`cluster.yaml` and the per-operator `~/.omnigraph/config.yaml` never describe +the same fact. 
The operator config is the permanent **per-operator** layer +(the operator's identity and credential references, named servers/clusters, +profiles, and CLI defaults); `cluster.yaml` is the shared desired state of a whole deployment, read only by the `cluster` commands via `--config`. The exact contract: -- **Cluster commands read `omnigraph.yaml` for exactly one thing**: the - `cli.actor` default used by `apply`/`approve` when `--as` is omitted — - operator identity is a per-operator fact. With `--as` present, no config - is read at all. Nothing else (its graph set, targets, bind, queries, - policies) ever influences a cluster command; a malformed `omnigraph.yaml` - breaks only the no-flag actor lookup, loudly. -- **A `--cluster` server reads `omnigraph.yaml` for nothing** — not even the - implicit current-directory search runs (mode-inference rule 0). Boot from - cluster state XOR `omnigraph.yaml`, never a merge. -- **The other direction is ergonomics, not coupling**: a per-operator +- **Cluster commands read the operator config for exactly one thing**: the + `operator.actor` default used by `apply`/`approve` when `--as` is omitted — + operator identity is a per-operator fact. With `--as` present, the operator + config is not needed. Nothing else in it influences a cluster command. +- **No legacy `omnigraph.yaml`**: the CLI does not read `omnigraph.yaml` at + all, and a `--cluster` server reads only the cluster catalog — boot is + cluster-only. +- **The other direction is ergonomics, not coupling**: per-operator data-plane commands address a cluster graph by its derived storage root (`company-brain/graphs/knowledge.omni`) with `--store ` — an ordinary local path, no special handling. @@ -269,12 +267,11 @@ Deletes remove the resource from state; their old payload blobs stay on disk (garbage collection is a later stage). Re-running a converged apply is a no-op: no state write, no revision change (`state_written: false`). 
-**Applied means serving — for deployments that opt in.** A server started -with `--cluster ` boots from the applied revision (see +**Applied means serving.** A server started with `--cluster ` boots from +the applied revision (see [Serving from the cluster](#serving-from-the-cluster-the-mode-switch)); it -picks up newly applied state on its next restart. Deployments still booting -from `omnigraph.yaml` are untouched: for them, applied means recorded in the -catalog, nothing more. +picks up newly applied state on its next restart. Until that restart, applied +means recorded in the catalog, nothing more. ### Graph creation diff --git a/docs/user/clusters/index.md b/docs/user/clusters/index.md index 053d5a1..c59ff9d 100644 --- a/docs/user/clusters/index.md +++ b/docs/user/clusters/index.md @@ -117,7 +117,7 @@ omnigraph cluster apply --config company-brain --as andrew `--as ` attributes the run: it is recorded in recovery sidecars and audit entries and threaded into the engine's commit history. Set -`cli: { actor: }` in your per-operator `omnigraph.yaml` to make it the +`operator: { actor: }` in your `~/.omnigraph/config.yaml` to make it the default when `--as` is omitted (the flag always wins; `approve` requires one of the two). @@ -244,12 +244,12 @@ with an in-flight apply. - **CI-driven convergence**: `validate` and `plan --json` are read-only and safe in pipelines; gate `apply --as ci` on plan review. Approvals are the human step by design — keep `cluster approve` out of automation. -- **`omnigraph.yaml` still has a job**: per-operator settings — your - `cli.actor` default for `--as`, CLI defaults, credentials, and data-plane - ergonomics (address a cluster graph by its derived root like - `company-brain/graphs/knowledge.omni` with `--store` for loads). It just no - longer describes the deployment — a server boots from one source or the - other, never a merge of both. 
+- **`~/.omnigraph/config.yaml` is the per-operator config**: your + `operator.actor` default for `--as`, named servers/clusters, credentials, + profiles, and data-plane ergonomics (address a cluster graph by its derived + root like `company-brain/graphs/knowledge.omni` with `--store` for loads). The + cluster directory's `cluster.yaml` is the **sole deployment declaration** — the + server boots from the cluster only. ## 7. Maintaining a cluster graph @@ -258,10 +258,11 @@ operation — it runs out-of-band, with direct storage access, against the graph roots. Address a cluster graph by name instead of hand-typing its storage path: ```bash -omnigraph optimize --cluster ./company-brain --cluster-graph knowledge -omnigraph cleanup --cluster ./company-brain --cluster-graph knowledge --keep 10 --confirm -# --cluster also takes the storage-root URI directly (config-free): -omnigraph optimize --cluster s3://bucket/clusters/company-brain --cluster-graph knowledge +omnigraph optimize --cluster ./company-brain --graph knowledge +omnigraph cleanup --cluster ./company-brain --graph knowledge --keep 10 --confirm +# --cluster also takes the storage-root URI directly (config-free), and a +# `clusters:` name from ~/.omnigraph/config.yaml: +omnigraph optimize --cluster s3://bucket/clusters/company-brain --graph knowledge ``` The graph's storage URI is resolved from the **served cluster state** (the same @@ -270,6 +271,16 @@ not resolvable. Run these from a host with storage access — there are no serve routes for them. Conversely, **`init` refuses** a cluster-managed path: graphs in a cluster are created by `cluster apply`, not by hand. +If the cluster has exactly **one** applied graph you can omit `--graph` — it is +used automatically. With **several**, omitting `--graph` errors and lists the +candidates (RFC-011 D7); it never picks one for you. 
+ +Against an **`s3://`-backed cluster** the resolved graph storage is non-local, so a +destructive `cleanup` additionally requires **`--yes`** (an interactive prompt +otherwise, refusal without a TTY) on top of `--confirm` — see [cli-reference.md](../cli/reference.md)'s +*Write diagnostics & destructive confirmation*. Every maintenance run also echoes +its resolved target to stderr (suppress with `--quiet`). + ## What the control plane does not do (yet) - **No hot reload** — applied changes serve on the next restart. diff --git a/docs/user/deployment.md b/docs/user/deployment.md index 71cd5c8..21b8087 100644 --- a/docs/user/deployment.md +++ b/docs/user/deployment.md @@ -13,13 +13,10 @@ Omnigraph supports two broad deployment shapes: The server binary and container image expose the same HTTP surface. -The server also has two **boot sources**: `omnigraph.yaml` (graph targets -declared in the per-operator config) or a **cluster directory** -(`omnigraph-server --cluster `), which serves the cluster control +The server has a single **boot source**: a **cluster directory** +(`omnigraph-server --cluster `), which serves the cluster control plane's applied revision — see [cluster-config.md](clusters/config.md#serving-from-the-cluster-the-mode-switch). -The two are exclusive per deployment; switching is a restart with a different -flag. ## Binary Deployment @@ -30,21 +27,26 @@ Build or install: On Windows, the binaries are `omnigraph.exe` and `omnigraph-server.exe`. -Run against a local graph: +The server boots from a cluster only (RFC-011) — there is no positional +`` / single-graph boot. 
Point it at a local cluster directory: ```bash -omnigraph-server graph.omni --bind 0.0.0.0:8080 +omnigraph-server --cluster ./company-brain --bind 0.0.0.0:8080 ``` -Run against an object-store-backed graph: +Or boot config-free from an object-storage-rooted cluster: ```bash OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \ AWS_REGION="us-east-1" \ -omnigraph-server s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \ +omnigraph-server --cluster s3://my-bucket/clusters/company-brain \ --bind 0.0.0.0:8080 ``` +The server serves every graph in the cluster's applied revision under +`/graphs/{id}/...`. See [clusters](clusters/index.md) for authoring and +applying a cluster. + ## Cluster Mode in Containers (AWS, Railway) A cluster-booted deployment has **two shapes** since the `storage:` root: @@ -80,10 +82,8 @@ docker run -d \ -p 8080:8080 ``` -`OMNIGRAPH_CLUSTER` is exclusive: combining it with `OMNIGRAPH_TARGET_URI`, -`OMNIGRAPH_CONFIG`, or `OMNIGRAPH_TARGET` fails fast (exit 64), the same -rule the server itself enforces. The image also ships the `omnigraph` CLI, -so the day-2 loop runs in-container with no `omnigraph.yaml`: +`OMNIGRAPH_CLUSTER` is the server's only boot source. The image also +ships the `omnigraph` CLI, so the day-2 loop runs in-container: ```bash docker exec -it sh -c \ @@ -104,10 +104,10 @@ docker exec -it sh -c \ `omnigraph cluster apply --as --config /var/lib/omnigraph/cluster` → force a new deployment (restart). -For a deployment that doesn't need the cluster control plane, the classic -stateless shape — `OMNIGRAPH_TARGET_URI=s3://bucket/graph.omni`, no volume — -remains the simplest AWS architecture (see Binary/Container Deployment -above). +For a stateless, volume-free deployment, root the cluster on object +storage and boot config-free with +`OMNIGRAPH_CLUSTER=s3://bucket/clusters/` (the bucket-no-volume +shape above) — the simplest AWS architecture. ### Railway @@ -181,23 +181,24 @@ Build the image: docker build -t omnigraph-server:local . 
``` -Run against a local graph: +The server boots from a cluster only (RFC-011). Run against a cluster +directory on a mounted volume: ```bash docker run --rm -p 8080:8080 \ - -v "$PWD/graph.omni:/data/graph.omni" \ + -v "$PWD/company-brain:/var/lib/omnigraph/cluster" \ omnigraph-server:local \ - /data/graph.omni --bind 0.0.0.0:8080 + --cluster /var/lib/omnigraph/cluster --bind 0.0.0.0:8080 ``` -Run against an S3-backed graph: +Run config-free against an object-storage-rooted cluster: ```bash docker run --rm -p 8080:8080 \ -e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \ -e AWS_REGION="us-east-1" \ omnigraph-server:local \ - s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \ + --cluster s3://my-bucket/clusters/company-brain \ --bind 0.0.0.0:8080 ``` @@ -208,27 +209,14 @@ When no positional args are given, the image entrypoint | Var | Effect | |---|---| -| `OMNIGRAPH_TARGET_URI` | Graph URI, passed as the positional argument. | -| `OMNIGRAPH_CONFIG` | Path to an `omnigraph.yaml`, passed as `--config`. Used to supply a `policy.file` (Cedar authorization). The config file and any relative `policy.file` must be mounted into the container. | -| `OMNIGRAPH_TARGET` | Graph name to select from the config's `graphs:` block (with `OMNIGRAPH_CONFIG`, when no `OMNIGRAPH_TARGET_URI`). | +| `OMNIGRAPH_CLUSTER` | Cluster boot source — a config directory or a storage-root URI, forwarded as `--cluster`. The only boot source. | | `OMNIGRAPH_BIND` | Listen address (default `0.0.0.0:8080`). | -`OMNIGRAPH_TARGET_URI` and `OMNIGRAPH_CONFIG` **compose**: set both to keep the -graph URI in the env var while loading policy from the config file (the -positional URI wins over any `graphs:` entry). 
To enable Cedar policy on a -container otherwise driven by `OMNIGRAPH_TARGET_URI`, mount the config dir and -add `OMNIGRAPH_CONFIG`: - -```bash -docker run --rm -p 8080:8080 \ - -e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \ - -e OMNIGRAPH_TARGET_URI="s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0" \ - -e OMNIGRAPH_CONFIG="/etc/omnigraph/omnigraph.yaml" \ - -v "$PWD/config:/etc/omnigraph:ro" \ - omnigraph-server:local -# /etc/omnigraph/omnigraph.yaml contains `policy: { file: policy.yaml }`; -# policy.yaml (+ optional policy.tests.yaml) sit beside it in the mount. -``` +Per-graph and server-level Cedar policy come from the cluster's applied +revision (authored in `cluster.yaml` and published with `cluster apply`), +not from a separate config file. The cluster docker shapes — volume vs. +config-free object-storage root — are detailed under +[Cluster Mode in Containers](#cluster-mode-in-containers-aws-railway) above. ## Auth diff --git a/docs/user/operations/maintenance.md b/docs/user/operations/maintenance.md index bf7a81c..161e5d6 100644 --- a/docs/user/operations/maintenance.md +++ b/docs/user/operations/maintenance.md @@ -1,17 +1,18 @@ # Maintenance: Optimize, Repair & Cleanup -**Addressing.** `optimize`, `repair`, and `cleanup` are **direct** (storage-native) CLI commands: they run with direct storage access against a positional `file://`/`s3://` URI or **`--cluster --cluster-graph `** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `/graphs/.omni` layout). They never run through a server, and reject `--server` / `--graph` or a remote (`http(s)://`) URI with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command capabilities* section of [cli-reference.md](../cli/reference.md). 
+**Addressing.** `optimize`, `repair`, and `cleanup` are **direct** (storage-native) CLI commands: they run with direct storage access against a positional `file://`/`s3://` URI or **`--cluster --graph `** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `/graphs/.omni` layout). They never run through a server, and reject `--server` or a remote (`http(s)://`) URI with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command capabilities* section of [cli-reference.md](../cli/reference.md). ## `optimize` — non-destructive - Compacts every node + edge table on `main`, then reindexes them, then **publishes the resulting version to the `__manifest`** so the manifest's recorded version tracks the compacted-and-reindexed state. Reads pin the manifest version, so without this publish the work would be invisible to readers *and* would break the version precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually changed. - Rewrites small fragments into fewer large ones; old fragments remain reachable via older versions until `cleanup` runs. - **Reindex (index coverage maintenance).** A scalar/FTS/vector index only covers the fragments it was built over. Rows appended after the index was built (e.g. by `load --mode merge`, whose commit does not rebuild an already-existing index) are scanned unindexed, and compaction itself rewrites fragments out of an index's coverage. `optimize` runs Lance's incremental `optimize_indices` after compaction to fold those fragments back in (a delta merge, not a full retrain), restoring full coverage so equality/range/traversal predicates stay index-accelerated. 
This is why a table with **no compaction work but stale index coverage still commits** a new version under `optimize`. Run `optimize` on a cadence at least as frequent as your freshness window so recently-loaded rows do not linger in the unindexed flat-scan tail. +- **Create declared-but-missing indexes (the index reconciler).** `@index`/`@key` declares intent; `schema apply` records it but builds nothing, and `load`/`mutate` defer a column that cannot be built yet (a `Vector` column with no trainable vectors). `optimize` materializes any such declared-but-unbuilt index over the compacted layout — so it is the convergence path for an `@index` added after data exists, or a vector index whose embeddings arrived via a later `embed`. A column still not buildable (no vectors yet) is reported on the table's stat as `pending_indexes` (visible in `--json`), not treated as a failure; the next `optimize` retries. So `optimize` is the single operator-facing index reconciler: it compacts, restores coverage, **and** builds declared-but-missing indexes. - Each table's compact→reindex→publish serializes with concurrent mutations on the same table. A crash mid-operation is recovered automatically on the next open (both compaction and reindex are content-preserving, so roll-forward is always safe). - **Requires a recovered graph.** `optimize` refuses (errors) when a pending crash-recovery operation is present — operating on an unrecovered graph could publish a partial write that recovery would roll back. Reopen the graph to run recovery, then re-run `optimize`. - **Uncovered drift is skipped, not interpreted.** If a table's underlying version is ahead of the version recorded in `__manifest` and no crash-recovery record covers that movement, `optimize` reports `skipped: DriftNeedsRepair` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift. - Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8). 
-- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version`. +- Returns per-table stats: `table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version, pending_indexes` (the last lists any declared `@index` column the reconciler could not build this run, with the reason — e.g. a vector column with no trainable vectors yet). - **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: BlobColumnsUnsupportedByLance` (and logged) instead of compacted, and the rest of the sweep proceeds normally. **Reads and writes are unaffected** — only compaction is. Consequence: fragment count and deleted-row space on blob tables are not reclaimed; query results are never affected. A skipped blob table is also **not reindexed** in the same sweep (the skip happens before the reindex step), so its index coverage on appended rows is not refreshed by `optimize` today. ## `repair` — explicit @@ -34,6 +35,7 @@ backstop, so it does as much as it can and converges on re-run. The CLI reports any failed tables; rerun `cleanup` to retry them. - CLI guards with `--confirm`; without it, prints a preview line. +- **Non-local consent (RFC-011 D9).** Against a non-local target (an `s3://` store/cluster), `cleanup` additionally requires `--yes` on top of `--confirm`: a TTY is prompted, and a non-interactive run (no TTY, or `--json`) refuses rather than destroying. A local (`file://`) target needs only `--confirm`. The same `--yes` gate applies to overwrite `load` and `branch delete`; every maintenance run echoes its resolved target to stderr (suppress with `--quiet`). - **Recovery floor:** `--keep < 3` may garbage-collect versions that crash recovery needs as a rollback target. Default `--keep 10` is safe. 
- **Orphaned-branch reconciliation:** before the version GC, cleanup reclaims any per-table or commit-graph branch absent from the manifest branch list. These orphans arise when a `branch_delete` flips the manifest authority but a downstream best-effort reclaim does not complete (see [branches-commits.md](../branching/index.md)). The reconciler is idempotent (it no-ops once nothing is orphaned), runs regardless of the `keep_versions` / `older_than` values (those gate version GC only), and never reclaims `main` or system-branch forks. Reclaimed forks are logged. diff --git a/docs/user/operations/policy.md b/docs/user/operations/policy.md index ced1c60..c6096d0 100644 --- a/docs/user/operations/policy.md +++ b/docs/user/operations/policy.md @@ -20,7 +20,7 @@ Server-scoped action (v0.6.0+; binds to `Omnigraph::Server::"root"`): 10. `graph_list` — `GET /graphs` registry enumeration (multi-graph mode) -Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` are reserved but not shipped in v0.6.0; operators add/remove graphs by editing `omnigraph.yaml` and restarting.) +Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — they operate on the registry, not on a graph's branches. A rule cannot mix server-scoped and per-graph actions; split into separate rules. (Runtime `graph_create` / `graph_delete` over HTTP are reserved but not shipped; operators add/remove graphs by editing the cluster's `cluster.yaml`, running `omnigraph cluster apply`, and restarting the server.) 
## Scope kinds @@ -28,38 +28,34 @@ Server-scoped actions cannot use `branch_scope` or `target_branch_scope` — the - `target_branch_scope` — applied to destination (`schema_apply`, branch ops, run ops) - `protected_branches` — named list with special rules; rule scopes are `any | protected | unprotected` -## Per-graph vs. server-level policy (multi-graph mode) +## Per-graph vs. server-level policy -In multi mode (`omnigraph.yaml` with a non-empty `graphs:` map), policy files attach at two levels: +A server boots from a cluster (`--cluster `), and the cluster's +`cluster.yaml` declares its policy bundles in a `policies:` section. Each bundle +names the scopes it `applies_to`: a graph id (per-graph rules — `read`, `change`, +`branch_*`, `schema_apply`) or the literal `cluster` (server-level rules — +`graph_list`). ```yaml -server: - policy: - file: server-policy.yaml # server-level: graph_list - -graphs: +# cluster.yaml +policies: + base: + file: base.policy.yaml + applies_to: [cluster, knowledge] # cluster-level + the `knowledge` graph alpha: - uri: s3://tenant-bucket/alpha - policy: - file: policies/alpha.yaml # per-graph: read, change, branch_*, schema_apply - beta: - uri: s3://tenant-bucket/beta - # no per-graph policy → no engine-layer Cedar enforcement on beta + file: policies/alpha.yaml + applies_to: [alpha] # per-graph: alpha only ``` -**Config follows graph identity, not server mode.** A graph served by **name** -(`--target ` or `server.graph`) uses its own `graphs..policy.file`, -exactly as in multi-graph mode. Top-level `policy.file` applies only to an -**anonymous** graph — one served by a bare `` with no `graphs:` entry. -Serving a **named** graph (single- or multi-graph mode) while top-level -`policy.file` (or `queries:`) is populated **refuses boot**, naming the block, -since the top-level value would otherwise be silently shadowed by the per-graph -block. Move per-graph rules to `graphs..policy.file` and `graph_list` -rules to `server.policy.file`. 
+A graph with no bundle bound to it has no engine-layer Cedar enforcement. Each +graph's HTTP request flows through its bound bundle; the management endpoint +(`GET /graphs`) flows through the `cluster`-scoped bundle. When no bundle binds +`cluster`, `GET /graphs` is denied in every runtime state, including +`--unauthenticated`; with bearer tokens configured it returns 403 after admission +control because `graph_list` is not a `read`-equivalent action. The operator must +bind a `cluster`-scoped bundle granting `graph_list` to expose `/graphs`. -Each graph's HTTP request flows through its own per-graph policy. The management endpoint (`GET /graphs`) flows through the server-level policy. When `server.policy.file` is unset, `GET /graphs` is denied in every runtime state, including `--unauthenticated`; with bearer tokens configured, it returns 403 after admission control because `graph_list` is not a `read`-equivalent action. The operator must explicitly authorize via `server-policy.yaml` to expose `/graphs`. - -Example server-level policy: +Example `cluster`-scoped bundle: ```yaml version: 1 @@ -72,38 +68,26 @@ rules: actions: [graph_list] ``` -## Configuration +Each per-graph rule may use at most one of `branch_scope` or +`target_branch_scope`. Server-scoped rules (`graph_list`) take neither — they +have no branch context. -`omnigraph.yaml`: +## Actor for direct-engine writes -```yaml -policy: - file: policy.yaml # Cedar rules + groups - tests: policy.tests.yaml # declarative test cases - -cli: - actor: act-andrew # default actor for CLI direct-engine writes -``` - -Each per-graph rule may use at most one of `branch_scope` or `target_branch_scope`. Server-scoped rules (`graph_list`) take neither — they have no branch context. - -`cli.actor` is the default actor identity for CLI direct-engine writes -when `policy.file` is configured. Override per-invocation with `--as -` (top-level flag) — `--as` wins, otherwise `cli.actor` is used, -otherwise no actor. 
With policy configured and neither set, the -engine-layer footgun guard intentionally denies the write (silent bypass -via "I forgot the actor" is exactly what the guard prevents). Remote -HTTP writes ignore both — they resolve their actor server-side from the -bearer token. +The default actor identity for CLI direct-engine (`--store`) writes is +`operator.actor` in `~/.omnigraph/config.yaml`. Override per-invocation with +`--as ` — `--as` wins, otherwise `operator.actor`, otherwise no actor. +Remote HTTP writes ignore both — they resolve their actor server-side from the +bearer token. (Direct-store access carries no Cedar policy under RFC-011; policy +lives in the cluster/server.) ## CLI -Policy tooling resolves its graph like server single-mode policy: `cli.graph` -wins, otherwise `server.graph` is used, otherwise the top-level `policy.file` -is validated/tested/explained as the anonymous policy. +Policy tooling reads a cluster's applied policy bundles: pass `--cluster `, +and `--graph ` to pick a graph's bundle when several apply. - `omnigraph policy validate` — parse + count actors, exit 1 on parse error. -- `omnigraph policy test` — run cases in `policy.tests.yaml`, exit 1 on any expectation mismatch. +- `omnigraph policy test --tests ` — run the declarative cases in `` against the selected bundle, exit 1 on any expectation mismatch. - `omnigraph policy explain --actor … --action … [--branch …] [--target-branch …]` — show decision and matched rule. - `omnigraph --as ` — set the actor for the duration of one invocation. Effective for `change`, `load` (and its deprecated `ingest` alias), `branch create|delete|merge`, and `schema apply` against a direct (`--store`) graph. **Rejected** on a served write (`--server`): the actor is bearer-token-resolved server-side, so `--as` can't set it there. @@ -132,7 +116,7 @@ reaches the authorization gate without a matching policy permit. |---|---|---|---| | **Open** | no | no | Every request is permitted. 
Refuses to start unless `--unauthenticated` or `OMNIGRAPH_UNAUTHENTICATED=1` is set — the operator must explicitly opt in. | | **DefaultDeny** | yes | no | Every authenticated request for an action other than `read` is rejected with HTTP 403. Closes the "tokens but forgot the policy file" trap — an operator who sets up auth and forgot to point at a policy file used to ship the illusion of protection. | -| **PolicyEnabled** | yes | yes | Authenticated requests that reach a configured policy engine are evaluated by Cedar. Server-scoped actions still require `server.policy.file`. | +| **PolicyEnabled** | yes | yes | Authenticated requests that reach a configured policy engine are evaluated by Cedar. Server-scoped actions still require a `cluster`-scoped policy bundle. | The server refuses to start for the "no tokens, no policy, no flag" cell and for "policy file, no tokens" — instead of silently shipping an open diff --git a/docs/user/operations/server.md b/docs/user/operations/server.md index 0eb2ae8..bd14e1e 100644 --- a/docs/user/operations/server.md +++ b/docs/user/operations/server.md @@ -1,38 +1,29 @@ # HTTP Server (`omnigraph-server`) -Axum 0.8 + tokio + utoipa-generated OpenAPI. **Two modes** (v0.6.0+): single-graph and multi-graph, with **two boot sources** for multi mode: `omnigraph.yaml` or — exclusively — a cluster directory (`--cluster`). Mode is inferred from CLI args + config shape. +Axum 0.8 + tokio + utoipa-generated OpenAPI. **Cluster-only boot** (RFC-011): the server always boots from a cluster (`--cluster `) and serves N graphs (N ≥ 1) under cluster routes. There is no longer a single-graph flat-route mode, no positional `` boot, no `--target`, and no `omnigraph.yaml`-`graphs:`-map boot. All HTTP is nested under `/graphs/{graph_id}/...`; `/healthz` and the management `/graphs` enumeration stay flat. 
-## Modes +## Boot -### Single-graph mode +### Cluster boot (the only boot) -`omnigraph-server ` or `omnigraph-server --target --config omnigraph.yaml`. Routes are flat — `/snapshot`, `/read`, `/branches`, etc. +```bash +omnigraph-server --cluster --bind 0.0.0.0:8080 +``` -**Config follows graph identity.** A bare `` is an *anonymous* graph and uses the **top-level** `policy.file` / `queries:`. A graph chosen by **name** (`--target` / `server.graph`) uses its own `graphs..{policy.file, queries}` — the same block multi-graph mode uses. ⚠️ *Changed from v0.6.0, which always used top-level config in single mode: a named-graph config that puts `policy`/`queries` at top-level now **refuses boot** and points you at `graphs..…` (move the block there). Bare-`` single mode is unchanged.* - -### Multi-graph mode (v0.6.0+) - -`omnigraph-server --config omnigraph.yaml` with a non-empty `graphs:` map and **no** single-mode selector (no `server.graph`, no ``, no `--target`). The server opens every configured graph in parallel at startup (bounded concurrency = 4, fail-fast on the first open error). Routes are nested under `/graphs/{graph_id}/...`. Bare flat paths return 404 in multi mode. - -### Cluster-booted multi mode - -`omnigraph-server --cluster ` boots from the cluster catalog's **applied -revision** instead of -`omnigraph.yaml` — an exclusive boot source: combining it with ``, -`--target`, or `--config` is a startup error, and `omnigraph.yaml` is never -read in this mode. Always multi-graph routing. See +`omnigraph-server --cluster ` boots from the cluster catalog's +**applied revision**. The server resolves that revision into per-graph +startup configs (id, URI, optional per-graph policy, stored-query +registry) plus an optional server-level policy, then opens every +configured graph in parallel at startup (bounded concurrency = 4, +fail-fast on the first open error). 
Routing is always multi-graph — +requests to bare flat protected paths (`/read`, `/snapshot`, …) return +404; the served surface is `/graphs/{graph_id}/...`. See [cluster-config.md](../clusters/config.md#serving-from-the-cluster-the-mode-switch) -for what is read and the fail-fast readiness rules. `--bind`, -`--unauthenticated`, and the bearer-token env vars work identically. +for what is read and the fail-fast readiness rules. -Mode inference: - -0. CLI `--cluster ` → **multi, cluster-booted** (exclusive; a scheme-qualified argument reads the ledger straight from the storage root, no local config) -1. CLI positional `` → single -2. CLI `--target ` → single -3. `server.graph` in config → single -4. `--config` + non-empty `graphs:` + no single-mode selector → **multi** -5. otherwise → error with migration hint +A scheme-qualified argument (`s3://…`) reads the ledger straight from the +storage root, with no local config directory. `--bind`, +`--unauthenticated`, and the bearer-token env vars all apply. ### Stored-query validation at startup @@ -40,36 +31,37 @@ If a graph declares a `queries:` registry (see [cli-reference](../cli/reference. 
## Endpoint inventory -Per-graph endpoints — same body shape across modes; URLs differ: - -| Method | Single-mode path | Multi-mode path | Auth | Action | -|---|---|---|---|---| -| GET | `/healthz` | `/healthz` | none | — | -| GET | `/openapi.json` | `/openapi.json` | none | — (strips security if auth disabled; in multi mode emits cluster paths with `cluster_` operation-id prefix) | -| GET | `/snapshot?branch=` | `/graphs/{id}/snapshot?branch=` | bearer + `read` | snapshot of branch | -| POST | `/query` | `/graphs/{id}/query` | bearer + `read` | inline read query (canonical; clean field names `query`/`name`; mutations → 400) | -| POST | `/read` | `/graphs/{id}/read` | bearer + `read` | **deprecated** alias of `/query` (legacy field names `query_source`/`query_name`, byte-stable response; carries `Deprecation: true` + `Link: ; rel="successor-version"`) | -| POST | `/export` | `/graphs/{id}/export` | bearer + `export` | NDJSON stream | -| POST | `/mutate` | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) | -| POST | `/change` | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: ; rel="successor-version"`) | -| GET | `/queries` | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog | -| POST | `/queries/{name}` | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 | -| GET | `/schema` | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source | -| POST | `/schema/apply` | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | migrate | -| POST | `/load` | `/graphs/{id}/load` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | bulk load (canonical); branch creation is opt-in via `from` — without 
it a missing `branch` is a 404, never an implicit fork (32 MB body limit) | -| POST | `/ingest` | `/graphs/{id}/ingest` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | **deprecated** alias of `/load` (carries `Deprecation: true` + `Link: ; rel="successor-version"`) (32 MB body limit) | -| GET | `/branches` | `/graphs/{id}/branches` | bearer + `read` | list branches | -| POST | `/branches` | `/graphs/{id}/branches` | bearer + `branch_create` | create | -| DELETE | `/branches/{branch}` | `/graphs/{id}/branches/{branch}` | bearer + `branch_delete` | delete | -| POST | `/branches/merge` | `/graphs/{id}/branches/merge` | bearer + `branch_merge` | merge `source → target` | -| GET | `/commits?branch=` | `/graphs/{id}/commits?branch=` | bearer + `read` | list | -| GET | `/commits/{commit_id}` | `/graphs/{id}/commits/{commit_id}` | bearer + `read` | show | - -Server-level management endpoints (v0.6.0+): +Per-graph endpoints — all nested under `/graphs/{id}/...`. 
`{id}` is the +graph id from the cluster's applied revision: | Method | Path | Auth | Action | |---|---|---|---| -| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs (405 in single mode) | +| GET | `/healthz` | none | — | +| GET | `/openapi.json` | none | — (strips security if auth disabled; emits the nested cluster paths with `cluster_` operation-id prefix) | +| GET | `/graphs/{id}/snapshot?branch=` | bearer + `read` | snapshot of branch | +| POST | `/graphs/{id}/query` | bearer + `read` | inline read query (canonical; clean field names `query`/`name`; mutations → 400) | +| POST | `/graphs/{id}/read` | bearer + `read` | **deprecated** alias of `/query` (legacy field names `query_source`/`query_name`, byte-stable response; carries `Deprecation: true` + `Link: ; rel="successor-version"`) | +| POST | `/graphs/{id}/export` | bearer + `export` | NDJSON stream | +| POST | `/graphs/{id}/mutate` | bearer + `change` | mutation (canonical; `query`/`name`; accepts legacy `query_source`/`query_name` as serde aliases) | +| POST | `/graphs/{id}/change` | bearer + `change` | **deprecated** alias of `/mutate` (carries `Deprecation: true` + `Link: ; rel="successor-version"`) | +| GET | `/graphs/{id}/queries` | bearer + `read` | list the `mcp.expose` stored queries as a typed tool catalog | +| POST | `/graphs/{id}/queries/{name}` | bearer + `invoke_query` (+ `change` for a stored mutation) | invoke a named query from the `queries:` registry; deny == 404 | +| GET | `/graphs/{id}/schema` | bearer + `read` | get current `.pg` source | +| POST | `/graphs/{id}/schema/apply` | bearer + `schema_apply` (target=`main`) | disabled for cluster-backed serving; returns 409 and points operators at `omnigraph cluster apply` + restart | +| POST | `/graphs/{id}/load` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | bulk load (canonical); branch creation is opt-in via `from` — without it a missing `branch` is a 404, never 
an implicit fork (32 MB body limit) | +| POST | `/graphs/{id}/ingest` | bearer + `branch_create` (only when `from` is set and the branch is created) + `change` | **deprecated** alias of `/load` (carries `Deprecation: true` + `Link: ; rel="successor-version"`) (32 MB body limit) | +| GET | `/graphs/{id}/branches` | bearer + `read` | list branches | +| POST | `/graphs/{id}/branches` | bearer + `branch_create` | create | +| DELETE | `/graphs/{id}/branches/{branch}` | bearer + `branch_delete` | delete | +| POST | `/graphs/{id}/branches/merge` | bearer + `branch_merge` | merge `source → target` | +| GET | `/graphs/{id}/commits?branch=` | bearer + `read` | list | +| GET | `/graphs/{id}/commits/{commit_id}` | bearer + `read` | show | + +Server-level management endpoints: + +| Method | Path | Auth | Action | +|---|---|---|---| +| GET | `/graphs` | bearer + `graph_list` on `Server::"root"` | list registered graphs | ### Stored-query catalog (`GET /queries`) @@ -88,13 +80,14 @@ Invoke a curated, server-side stored query by **name** — the source comes from - **Requires an explicit policy grant when auth is on.** In default-deny mode (bearer tokens but no `policy.file`), only `read` is permitted, so *every* `/queries/{name}` call returns `404` until an `invoke_query` rule is configured. - A stored mutation cannot target a `snapshot` (`400`); a parameter type error is a structured `400` naming the parameter. -## Adding and removing graphs (multi mode) +## Adding and removing graphs -Runtime add/remove via API is **not** exposed in v0.6.0 — neither -`POST /graphs` nor `DELETE /graphs/{id}` is implemented. Operators add -or remove graphs by stopping the server, editing the `graphs:` map in -`omnigraph.yaml`, then restarting. The server treats `omnigraph.yaml` -as operator-owned configuration and never writes it. +Runtime add/remove via API is **not** exposed — neither `POST /graphs` +nor `DELETE /graphs/{id}` is implemented. 
Operators add or remove graphs +by running `cluster apply` against the cluster (which publishes a new +applied revision) and restarting the server so it boots from the new +revision. The server treats the cluster source as operator-owned and +never writes it. A future release may introduce a managed registry and re-expose runtime mutation on top of it. @@ -138,8 +131,8 @@ channels: - **Response headers (RFC 9745)**: every response carries `Deprecation: true`. - **Response headers (RFC 8288)**: every response carries a `Link` header pointing at the canonical successor: - `Link: ; rel="successor-version"` for `/read`, and - `Link: ; rel="successor-version"` for `/change`. SDKs and HTTP + `Link: ; rel="successor-version"` for `/read`, and + `Link: ; rel="successor-version"` for `/change`. SDKs and HTTP proxies can pick the successor up automatically. Migration is purely cosmetic on the client side — swap the URL path, leave @@ -226,4 +219,4 @@ See [deployment.md](../deployment.md) for token-source operational details. admission control" above). No global rate limiter is configured; add `tower_http::limit` if a graph-wide cap is needed. - Pagination — none (commits/branches return everything; export streams). -- Runtime graph add/remove — edit `omnigraph.yaml` and restart. +- Runtime graph add/remove — run `cluster apply` and restart. diff --git a/docs/user/search/indexes.md b/docs/user/search/indexes.md index ea65a6f..57935cd 100644 --- a/docs/user/search/indexes.md +++ b/docs/user/search/indexes.md @@ -27,7 +27,8 @@ list/`Blob` columns → none. ## L2 — OmniGraph orchestration -- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted indexes for the current head; safe to re-run. 
+- **`@index`/`@key` declares intent; the physical index is derived state.** A migration records the declaration in the catalog/IR and never fails on it — `schema apply` builds **no** indexes (adding an `@index` to an existing column is a pure metadata change that touches no table data). `load`/`mutate` build declared indexes inline as part of the write, but a column that can't be built yet (a `Vector` column with no trainable vectors — IVF k-means needs ≥1 vector, e.g. rows loaded before `embed` runs) is left **pending**, not fatal. Reads stay correct meanwhile: a missing/partial index degrades to a scan (vector search to brute-force). A later `ensure_indices`/`optimize` materializes the pending index once it is buildable. This mirrors how LanceDB builds indexes asynchronously and serves unindexed rows by brute-force. +- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted + vector indexes for the current head; safe to re-run; returns the columns it had to defer as pending. `optimize` runs it after compaction, so the maintenance cron is the convergence path for deferred indexes. - Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state. - **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork. - Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column. diff --git a/openapi.json b/openapi.json index 4f0309f..fb76fae 100644 --- a/openapi.json +++ b/openapi.json @@ -10,14 +10,82 @@ "version": "0.7.0" }, "paths": { - "/branches": { + "/graphs": { + "get": { + "tags": [ + "management" + ], + "summary": "List every graph currently registered with this server (MR-668).", + "description": "Multi-graph mode only. 
In single mode, the route returns 405 — there's\nno registry to enumerate. Cedar-gated by the server-level policy via\nthe `graph_list` action against `Omnigraph::Server::\"root\"`.\n\nOrder: alphabetical by `graph_id` (server-sorted so clients see\ndeterministic output across requests).", + "operationId": "listGraphs", + "responses": { + "200": { + "description": "List of registered graphs", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/GraphListResponse" + } + } + } + }, + "401": { + "description": "Unauthorized", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ErrorOutput" + } + } + } + }, + "403": { + "description": "Forbidden", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ErrorOutput" + } + } + } + }, + "405": { + "description": "Method not allowed (single-graph mode)", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ErrorOutput" + } + } + } + } + }, + "security": [ + { + "bearer_token": [] + } + ] + } + }, + "/graphs/{graph_id}/branches": { "get": { "tags": [ "branches" ], "summary": "List all branches.", "description": "Returns branch names sorted alphabetically. Read-only.", - "operationId": "listBranches", + "operationId": "cluster_listBranches", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "responses": { "200": { "description": "List of branches", @@ -62,7 +130,18 @@ ], "summary": "Create a new branch.", "description": "Forks `name` off of `from` (defaults to `main`). The new branch shares\ntable data with its parent until it is mutated. 
Returns 409 if `name`\nalready exists.", - "operationId": "createBranch", + "operationId": "cluster_createBranch", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -142,14 +221,25 @@ ] } }, - "/branches/merge": { + "/graphs/{graph_id}/branches/merge": { "post": { "tags": [ "branches" ], "summary": "Merge one branch into another.", "description": "Merges `source` into `target` (defaults to `main`). Outcome is one of\n`already_up_to_date`, `fast_forward`, or `merged`. Returns 409 with the\nlist of conflicts if the merge cannot be completed; the target is left\nunchanged in that case. **Destructive** to `target` on success.", - "operationId": "mergeBranches", + "operationId": "cluster_mergeBranches", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -229,15 +319,24 @@ ] } }, - "/branches/{branch}": { + "/graphs/{graph_id}/branches/{branch}": { "delete": { "tags": [ "branches" ], "summary": "Delete a branch.", "description": "**Irreversible.** Removes the branch pointer; commits remain reachable\nonly if referenced by another branch. 
Returns 404 if the branch does not\nexist.", - "operationId": "deleteBranch", + "operationId": "cluster_deleteBranch", "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + }, { "name": "branch", "in": "path", @@ -307,14 +406,25 @@ ] } }, - "/change": { + "/graphs/{graph_id}/change": { "post": { "tags": [ "mutations" ], "summary": "**Deprecated** — use [`POST /mutate`](#tag/mutations/operation/mutate) instead.", - "description": "Apply a GQ mutation to a branch. Behavior is unchanged; the route is\nkept indefinitely for back-compat. New integrations should target\n`POST /mutate`, which has identical semantics and a name that pairs\ncleanly with `POST /query`. Responses from this route include\n`Deprecation: true` and `Link: ; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the\nsignal.", - "operationId": "change", + "description": "Apply a GQ mutation to a branch. Behavior is unchanged; the route is\nkept indefinitely for back-compat. New integrations should target\n`POST /mutate`, which has identical semantics and a name that pairs\ncleanly with `POST /query`. 
Responses from this route include\n`Deprecation: true` and `Link: ; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the\nsignal.", + "operationId": "cluster_change", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -327,7 +437,7 @@ }, "responses": { "200": { - "description": "Mutation results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", + "description": "Mutation results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", "content": { "application/json": { "schema": { @@ -395,15 +505,24 @@ ] } }, - "/commits": { + "/graphs/{graph_id}/commits": { "get": { "tags": [ "commits" ], "summary": "List commits.", "description": "Filter by `branch` to get the commits on a single branch (most recent\nfirst); omit to list across all branches. Read-only.", - "operationId": "listCommits", + "operationId": "cluster_listCommits", "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + }, { "name": "branch", "in": "query", @@ -455,15 +574,24 @@ ] } }, - "/commits/{commit_id}": { + "/graphs/{graph_id}/commits/{commit_id}": { "get": { "tags": [ "commits" ], "summary": "Get a single commit.", "description": "Returns the commit's manifest version, parent commit(s), and creation\nmetadata. 
Read-only.", - "operationId": "getCommit", + "operationId": "cluster_getCommit", "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + }, { "name": "commit_id", "in": "path", @@ -523,14 +651,25 @@ ] } }, - "/export": { + "/graphs/{graph_id}/export": { "post": { "tags": [ "queries" ], "summary": "Stream the contents of a branch as NDJSON.", "description": "Emits one JSON object per line (`application/x-ndjson`). Filter with\n`type_names` (node/edge type names) and/or `table_keys`; both empty\nstreams the entire branch. Suitable for large exports — the response is\nstreamed, not buffered. Read-only.", - "operationId": "export", + "operationId": "cluster_export", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -586,93 +725,25 @@ ] } }, - "/graphs": { - "get": { - "tags": [ - "management" - ], - "summary": "List every graph currently registered with this server (MR-668).", - "description": "Multi-graph mode only. In single mode, the route returns 405 — there's\nno registry to enumerate. 
Cedar-gated by the server-level policy via\nthe `graph_list` action against `Omnigraph::Server::\"root\"`.\n\nOrder: alphabetical by `graph_id` (server-sorted so clients see\ndeterministic output across requests).", - "operationId": "listGraphs", - "responses": { - "200": { - "description": "List of registered graphs", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/GraphListResponse" - } - } - } - }, - "401": { - "description": "Unauthorized", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ErrorOutput" - } - } - } - }, - "403": { - "description": "Forbidden", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ErrorOutput" - } - } - } - }, - "405": { - "description": "Method not allowed (single-graph mode)", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/ErrorOutput" - } - } - } - } - }, - "security": [ - { - "bearer_token": [] - } - ] - } - }, - "/healthz": { - "get": { - "tags": [ - "health" - ], - "summary": "Liveness probe.", - "description": "Returns server status and version. Unauthenticated; safe to call from any\ncaller. Use this to confirm the server is reachable before invoking other\nendpoints.", - "operationId": "health", - "responses": { - "200": { - "description": "Server is healthy", - "content": { - "application/json": { - "schema": { - "$ref": "#/components/schemas/HealthOutput" - } - } - } - } - } - } - }, - "/ingest": { + "/graphs/{graph_id}/ingest": { "post": { "tags": [ "mutations" ], "summary": "**Deprecated** — use [`POST /load`](#tag/mutations/operation/load) instead.", - "description": "Bulk-load NDJSON data into a branch. Behavior is unchanged; the route is\nkept indefinitely for back-compat. New integrations should target\n`POST /load`, which has identical semantics. 
Responses from this route\ninclude `Deprecation: true` and `Link: ; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the signal.", - "operationId": "ingest", + "description": "Bulk-load NDJSON data into a branch. Behavior is unchanged; the route is\nkept indefinitely for back-compat. New integrations should target\n`POST /load`, which has identical semantics. Responses from this route\ninclude `Deprecation: true` and `Link: ; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the signal.", + "operationId": "cluster_ingest", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -685,7 +756,7 @@ }, "responses": { "200": { - "description": "Load results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", + "description": "Load results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", "content": { "application/json": { "schema": { @@ -743,14 +814,25 @@ ] } }, - "/load": { + "/graphs/{graph_id}/load": { "post": { "tags": [ "mutations" ], "summary": "Bulk-load NDJSON data into a branch (canonical load endpoint).", "description": "`data` is NDJSON with one record per line. `mode` controls behavior on\nexisting rows: `merge` upserts by id (default), `append` blindly inserts,\n`overwrite` replaces table contents. Branch creation is opt-in by\npresence of `from`: with `from` set, a missing `branch` is created from\nit; without `from`, `branch` must already exist — a missing branch is a\n404, never an implicit fork. 
**Destructive** when `mode` is `overwrite`\nor when the load produces conflicting writes.\n\nThe legacy `POST /ingest` route has identical semantics and is kept as a\ndeprecated alias.", - "operationId": "load", + "operationId": "cluster_load", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -820,14 +902,25 @@ ] } }, - "/mutate": { + "/graphs/{graph_id}/mutate": { "post": { "tags": [ "mutations" ], "summary": "Apply a GQ mutation to a branch (canonical mutation endpoint).", "description": "Writes to the named `branch` (defaults to `main`). Mutations are atomic\nper call and produce a new commit. Returns counts of nodes and edges\naffected. **Destructive**: on success the branch is updated; rejected\nmutations may still acquire locks briefly. Returns 409 on merge conflict.\n\nPairs with `POST /query` (read-only). The legacy `POST /change` route\nhas identical semantics and is kept as a deprecated alias.", - "operationId": "mutate", + "operationId": "cluster_mutate", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -907,14 +1000,25 @@ ] } }, - "/queries": { + "/graphs/{graph_id}/queries": { "get": { "tags": [ "queries" ], "summary": "List the graph's exposed stored queries as a typed tool catalog.", "description": "Returns the `mcp.expose == true` subset of the `queries:` registry, each\nwith its MCP tool name, read/mutate flag, description/instruction, and\ntyped parameters — enough for a client to register them as tools without\nfetching `.gq` source. Read-gated; the catalog is graph-wide (branch\nindependent — `read` is authorized against `main`). 
**Not** Cedar-filtered\nper query yet, so it can list a query whose `invoke_query` the caller\nlacks (a known gap until per-query authorization lands).", - "operationId": "list_queries", + "operationId": "cluster_list_queries", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "responses": { "200": { "description": "Stored-query catalog (the mcp.expose subset, with typed params)", @@ -954,15 +1058,24 @@ ] } }, - "/queries/{name}": { + "/graphs/{graph_id}/queries/{name}": { "post": { "tags": [ "queries" ], "summary": "Invoke a curated, server-side stored query by name.", "description": "The query source comes from the graph's `queries:` registry, not the\nrequest body — callers send only runtime inputs (`params`, `branch`,\n`snapshot`). Gated by the `invoke_query` Cedar action at the boundary;\na stored *mutation* additionally passes the engine's `change` gate\n(double-gated). An actor **without** `invoke_query` cannot tell a denied\nquery from a missing one — both return the same 404, so the catalog\ncan't be probed without the grant. Once `invoke_query` is held, the\ninner `read`/`change` gate may surface a 403 for an existing query the\nactor can't run (the intended double-gate signal).", - "operationId": "invoke_query", + "operationId": "cluster_invoke_query", "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + }, { "name": "name", "in": "path", @@ -1078,14 +1191,25 @@ ] } }, - "/query": { + "/graphs/{graph_id}/query": { "post": { "tags": [ "queries" ], "summary": "Execute an inline read query (friendlier-named alternative to `POST /read`).", "description": "Designed for ad-hoc exploration and AI-agent tool-use: short field\nnames (`query`, `name`) match the CLI `-e` flag and the GQ `query`\nkeyword. 
Mutations (`insert`/`update`/`delete`) are rejected with 400\n-- use `POST /mutate` (or its deprecated alias `POST /change`) for\nwrite queries. Otherwise behaves identically to `POST /read`: same\ntarget semantics (branch xor snapshot), same Cedar action (Read),\nsame response shape.", - "operationId": "query", + "operationId": "cluster_query", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -1145,14 +1269,25 @@ ] } }, - "/read": { + "/graphs/{graph_id}/read": { "post": { "tags": [ "queries" ], "summary": "**Deprecated** — use [`POST /query`](#tag/queries/operation/query) instead.", - "description": "Execute a GQ read query. Behavior is unchanged from prior releases; the\nroute is kept indefinitely for byte-stable back-compat. New integrations\nshould target `POST /query`, which has clean field names (`query` /\n`name`) and a 400-on-mutation guard. Responses from this route include\n`Deprecation: true` and `Link: ; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the\nsignal.", - "operationId": "read", + "description": "Execute a GQ read query. Behavior is unchanged from prior releases; the\nroute is kept indefinitely for byte-stable back-compat. New integrations\nshould target `POST /query`, which has clean field names (`query` /\n`name`) and a 400-on-mutation guard. 
Responses from this route include\n`Deprecation: true` and `Link: ; rel=\"successor-version\"`\nheaders per RFC 9745 / RFC 8288 so SDKs and proxies can surface the\nsignal.", + "operationId": "cluster_read", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -1165,7 +1300,7 @@ }, "responses": { "200": { - "description": "Query results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", + "description": "Query results (response includes `Deprecation: true` + `Link: ; rel=\"successor-version\"`)", "content": { "application/json": { "schema": { @@ -1213,14 +1348,25 @@ ] } }, - "/schema": { + "/graphs/{graph_id}/schema": { "get": { "tags": [ "schema" ], "summary": "Read the current schema source.", "description": "Returns the project's schema as a single string in `.pg` source form.\nUseful for clients that want to introspect available types and tables\nbefore constructing GQ queries. Read-only.", - "operationId": "getSchema", + "operationId": "cluster_getSchema", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "responses": { "200": { "description": "Current schema source", @@ -1260,14 +1406,25 @@ ] } }, - "/schema/apply": { + "/graphs/{graph_id}/schema/apply": { "post": { "tags": [ "mutations" ], "summary": "Apply a schema migration.", - "description": "Diffs `schema_source` against the current schema and applies the resulting\nmigration steps (add/drop type, add/drop column, etc.). **Destructive**:\nsome steps drop data. 
Returns the list of steps applied; if `applied` is\nfalse the diff was unsupported and no changes were made.", - "operationId": "applySchema", + "description": "Cluster-backed servers reject this route with `409 Conflict`; operators\nmust apply schema changes through `omnigraph cluster apply` and restart.\n\nDiffs `schema_source` against the current schema and applies the resulting\nmigration steps (add/drop type, add/drop column, etc.). **Destructive**:\nsome steps drop data. Returns the list of steps applied; if `applied` is\nfalse the diff was unsupported and no changes were made.", + "operationId": "cluster_applySchema", + "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + } + ], "requestBody": { "content": { "application/json": { @@ -1319,6 +1476,16 @@ } } }, + "409": { + "description": "Schema apply is disabled for cluster-backed serving; use `omnigraph cluster apply` and restart", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/ErrorOutput" + } + } + } + }, "429": { "description": "Per-actor admission cap exceeded; honor `Retry-After` header", "content": { @@ -1337,15 +1504,24 @@ ] } }, - "/snapshot": { + "/graphs/{graph_id}/snapshot": { "get": { "tags": [ "snapshots" ], "summary": "Read the current snapshot of a branch.", "description": "Returns the manifest version plus per-table metadata (path, version, row\ncount) for every table on the branch. Defaults to `main` when `branch` is\nomitted. 
Read-only.", - "operationId": "getSnapshot", + "operationId": "cluster_getSnapshot", "parameters": [ + { + "name": "graph_id", + "in": "path", + "description": "Graph id to route the request to.", + "required": true, + "schema": { + "type": "string" + } + }, { "name": "branch", "in": "query", @@ -1396,6 +1572,28 @@ } ] } + }, + "/healthz": { + "get": { + "tags": [ + "health" + ], + "summary": "Liveness probe.", + "description": "Returns server status and version. Unauthenticated; safe to call from any\ncaller. Use this to confirm the server is reachable before invoking other\nendpoints.", + "operationId": "health", + "responses": { + "200": { + "description": "Server is healthy", + "content": { + "application/json": { + "schema": { + "$ref": "#/components/schemas/HealthOutput" + } + } + } + } + } + } } }, "components": { @@ -1891,6 +2089,13 @@ ], "description": "Branch to run against. Defaults to `main`; for a stored mutation the\nwrite targets this branch." }, + "expect_mutation": { + "type": [ + "boolean", + "null" + ], + "description": "The kind the caller expects (RFC-011 Decision 3): `Some(false)` for\n`omnigraph query `, `Some(true)` for `omnigraph mutate `.\nWhen set and it disagrees with the stored query's actual kind, the\nserver rejects the call (400) so the verb asserts the kind. `None`\n(the default) skips the check — preserving older clients and aliases." + }, "params": { "description": "JSON object whose keys match the stored query's declared parameters." },