diff --git a/docs/dev/testing.md b/docs/dev/testing.md
index 3c5ee32..1f818e9 100644
--- a/docs/dev/testing.md
+++ b/docs/dev/testing.md
@@ -8,7 +8,7 @@ This file is the always-on map of the test surface. **Consult it before every ta
 |---|---|---|
 | `omnigraph` (engine) | `crates/omnigraph/tests/` | Integration tests (21 files), fixture-driven, share `tests/helpers/mod.rs` |
 | `omnigraph-cli` | `crates/omnigraph-cli/tests/` | `cli.rs` (unit-ish), `system_local.rs`, `system_remote.rs`, share `tests/support/mod.rs` |
-| `omnigraph-cluster` | mostly in-source `#[cfg(test)] mod tests` | Cluster config parser, local JSON state diff, state CAS/lock handling/recovery, read-only validate/plan/status plus explicit refresh/import graph observations |
+| `omnigraph-cluster` | mostly in-source `#[cfg(test)] mod tests` | Cluster config parser, local JSON state diff, state CAS/lock handling/recovery, read-only validate/plan/status plus explicit refresh/import graph observations, and config-only apply (content-addressed payload publish, disposition gating, composite-digest convergence, idempotent re-apply) |
 | `omnigraph-server` | `crates/omnigraph-server/tests/` | `server.rs` (HTTP-level), `openapi.rs` (OpenAPI drift / regeneration) |
 | `omnigraph-compiler` | mostly in-source `#[cfg(test)] mod tests` | Parser, type-checker, IR lowering, lint |
 
diff --git a/docs/user/cli-reference.md b/docs/user/cli-reference.md
index 70ac6f4..774ea6b 100644
--- a/docs/user/cli-reference.md
+++ b/docs/user/cli-reference.md
@@ -19,7 +19,7 @@ Top-level command families and subcommands. Graph-targeting commands accept eith
 | `commit list \| show` | inspect commit graph |
 | `schema plan \| apply \| show (alias: get)` | migrations |
 | `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
-| `cluster validate \| plan \| status \| refresh \| import \| force-unlock` | cluster-control preview. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`; `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock ` manually removes a held local state lock by exact id. No apply, graph-resource mutation, server change, automatic stale-lock breaking, or `plan --refresh` occurs in Stage 2C |
+| `cluster validate \| plan \| apply \| status \| refresh \| import \| force-unlock` | cluster-control preview. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json` and annotates each change with its apply disposition; `apply` executes the config-only (stored-query/policy) subset into the content-addressed local catalog under `__cluster/resources/` — graph/schema changes are deferred loudly, and nothing applied serves traffic (the server still boots from `omnigraph.yaml`); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock ` manually removes a held local state lock by exact id. No graph-manifest movement, server change, automatic stale-lock breaking, or `plan --refresh` occurs in Stage 3A |
 | `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) |
 | `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. `--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review |
 | `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
@@ -78,6 +78,7 @@ policy:
 ```bash
 omnigraph cluster validate --config ./company-brain
 omnigraph cluster plan --config ./company-brain --json
+omnigraph cluster apply --config ./company-brain --json
 omnigraph cluster status --config ./company-brain --json
 omnigraph cluster refresh --config ./company-brain --json
 omnigraph cluster import --config ./company-brain --json
@@ -85,16 +86,20 @@ omnigraph cluster force-unlock --config ./company-brain --json
 ```
 
 `--config` is a directory containing `cluster.yaml`; it defaults to `.`.
-Stage 2C accepts graphs, schemas, stored queries, and policy bundle file
+Stage 3A accepts graphs, schemas, stored queries, and policy bundle file
 references. `cluster plan` reads local JSON state from
 `/__cluster/state.json`; a missing file means empty state. Plan,
-refresh, and import acquire `__cluster/lock.json` by default and release it
-before returning. `cluster status` reads state only and reports any existing
+apply, refresh, and import acquire `__cluster/lock.json` by default and release
+it before returning. `cluster apply` executes only stored-query/policy catalog
+writes (content-addressed under `__cluster/resources/`) and requires an
+existing `state.json`; graph/schema changes are deferred with warnings, and
+applied resources do not serve traffic — the server still boots from
+`omnigraph.yaml`. `cluster status` reads state only and reports any existing
 lock metadata. `force-unlock` removes a lock only when the supplied id exactly
 matches the lock file. `refresh` requires an existing `state.json`; `import`
 creates one only when it is missing. Both observe declared graphs read-only at
-`/graphs/.omni`. External state backends, apply,
-automatic stale-lock breaking, `plan --refresh`, pipelines, UI specs,
+`/graphs/.omni`. External state backends, graph/schema
+apply, automatic stale-lock breaking, `plan --refresh`, pipelines, UI specs,
 embeddings, aliases, and bindings are reserved for later stages. See
 [cluster-config.md](cluster-config.md).
 
diff --git a/docs/user/cluster-config.md b/docs/user/cluster-config.md
index 24718b1..b285cf3 100644
--- a/docs/user/cluster-config.md
+++ b/docs/user/cluster-config.md
@@ -1,19 +1,23 @@
 # Cluster Config
 
-**Status:** Stage 2C state-lock recovery preview.
+**Status:** Stage 3A config-only apply preview.
 
 Cluster config is the future control-plane configuration surface for a whole
 OmniGraph deployment. In this stage, OmniGraph can validate a local
 `cluster.yaml` folder, produce a deterministic read-only plan, inspect the
-local JSON state ledger, and explicitly refresh/import graph observations into
-that ledger. It can also manually remove a held local state lock by exact lock
-id. It does not apply desired changes, start servers, or write graph resources.
+local JSON state ledger, explicitly refresh/import graph observations into
+that ledger, manually remove a held local state lock by exact lock id, and
+**apply the config-only subset of the plan** — stored-query and policy-bundle
+catalog writes. It does not move graph manifests, change schemas, start
+servers, or serve anything it applies: the server still boots from
+`omnigraph.yaml`.
 
 ## Commands
 
 ```bash
 omnigraph cluster validate --config ./company-brain
 omnigraph cluster plan --config ./company-brain --json
+omnigraph cluster apply --config ./company-brain --json
 omnigraph cluster status --config ./company-brain --json
 omnigraph cluster refresh --config ./company-brain --json
 omnigraph cluster import --config ./company-brain --json
@@ -51,9 +55,9 @@ policies:
 
 `metadata.name` is a display label. `state.backend` may be omitted or set to
 `cluster`; external state backends are reserved for a later stage. `state.lock`
-defaults to `true`. When enabled, `cluster plan`, `cluster refresh`, and
-`cluster import` briefly acquire `/__cluster/lock.json`, then remove
-it before returning. `cluster status` never acquires the lock; it only reports
+defaults to `true`. When enabled, `cluster plan`, `cluster apply`,
+`cluster refresh`, and `cluster import` briefly acquire
+`/__cluster/lock.json`, then remove it before returning. `cluster status` never acquires the lock; it only reports
 whether one is present. `cluster force-unlock` is the only lock-removal
 command; it requires the exact lock id and should be run only after confirming
 no cluster operation is active.
@@ -125,8 +129,53 @@ successful `plan` instead reports `lock_acquired: true` and an
 `acquired_lock_id`, then releases the lock before returning. The command never
 writes `state.json` and does not scan live graphs. Use explicit
 `cluster refresh` / `cluster import` when the state ledger should be updated
-from live observations. Apply and live drift scans during plan are later-stage
-work.
+from live observations. Live drift scans during plan are later-stage work.
+
+Each plan change carries a `disposition` field — an honest preview of what
+`cluster apply` will do with it in this stage: `applied` (executes), `derived`
+(a `graph.` composite-digest update that converges automatically once its
+query digests land), `deferred` (graph/schema change, later phase), or
+`blocked` (query/policy gated by an unapplied or missing dependency, with the
+condition in `reason`).
+
+## Apply
+
+`cluster apply` executes the config-only subset of the plan — stored-query and
+policy-bundle changes. There is no confirm flag: `cluster plan` is the preview,
+and apply recomputes the same diff under the state lock before executing, so a
+stale preview can never be applied. Apply requires an existing `state.json`
+(`state_missing` directs you to `cluster import` first).
+
+For each applied create/update, the resource payload is written
+content-addressed into the local catalog:
+
+```text
+/__cluster/resources/query///.gq
+/__cluster/resources/policy//.yaml
+```
+
+Extensions are fixed per kind regardless of the source file's name. Payloads
+are written before the state update because `state.json` is the publish point:
+if the final CAS-checked state write fails, no success is reported and the
+digest-named blobs already written are inert — re-running apply is the repair.
+Deletes remove the resource from state; their old payload blobs stay on disk
+(garbage collection is a later stage). Re-running a converged apply is a no-op:
+no state write, no revision change (`state_written: false`).
+
+**Applied means recorded in the cluster catalog — nothing more.** The server
+still boots from `omnigraph.yaml`; no query or policy applied here serves
+traffic until the server-boot stage ships, as an explicit per-deployment mode
+switch.
+
+Graph and schema changes are never executed by this stage. They are reported
+as `deferred` (warning `apply_unsupported_change`), and query/policy changes
+that depend on them are `blocked` (warning `apply_dependency_blocked`, status
+`blocked` in state). A partially-applicable plan still exits 0 with warnings;
+the JSON `converged` field is the automation signal for "state now matches the
+desired revision". The applied `config_digest` is only recorded when apply
+fully converges. The `graph.` composite digest is recomputed from state's
+own schema/query digests after each apply, so applied query changes converge
+without graph movement.
 
 ## Status
 