Implement cluster refresh and import

This commit is contained in:
aaltshuler 2026-06-08 23:18:44 +03:00
parent a7956ea5a9
commit cb1e7bb5ea
9 changed files with 1208 additions and 29 deletions

View file

@ -5,6 +5,13 @@
**Date:** 2026-06-07
**Relationship:** generalizes today's `omnigraph.yaml` graph/query/policy configuration surface ([CLI reference](../user/cli-reference.md), [server docs](../user/server.md)) into a future cluster control plane. The distilled rules are in [cluster-axioms.md](cluster-axioms.md); detailed downstream implementation spec and blast-radius assessment in [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md). This is a proposed architecture, not an implemented RFC.
> **Implementation status.** The examples below describe the full target schema.
> Stage 2B only accepts the read-only subset documented in
> [cluster-config.md](../user/cluster-config.md). Future-phase fields such as
> `env_file`, `apply`, `providers`, `pipelines`, `embeddings`, `ui`, `aliases`,
> and `bindings` are intentionally rejected with typed diagnostics until their
> reconciler semantics are implemented.
> **Revision 2026-06-07 — full commitment to the Terraform paradigm.** Three changes from the earlier draft: (1) **state is an authoritative, locked ledger in a backend** (server-hosted *or* a separate cloud store), not "a mostly-rebuildable projection"; (2) `plan` is framed as the **CLI diff between local config and state**; (3) **ETL pipelines** (external data sources) are a first-class config asset — a second seam, alongside schema, where a definition triggers a data-plane effect. The full set of config assets (incl. **aliases**, **embeddings**) is enumerated below.
---

View file

@ -8,7 +8,7 @@ This file is the always-on map of the test surface. **Consult it before every ta
|---|---|---|
| `omnigraph` (engine) | `crates/omnigraph/tests/` | Integration tests (21 files), fixture-driven, share `tests/helpers/mod.rs` |
| `omnigraph-cli` | `crates/omnigraph-cli/tests/` | `cli.rs` (unit-ish), `system_local.rs`, `system_remote.rs`, share `tests/support/mod.rs` |
| `omnigraph-cluster` | mostly in-source `#[cfg(test)] mod tests` | Cluster config parser, local JSON state diff, state CAS/lock handling, read-only validate/plan/status |
| `omnigraph-cluster` | mostly in-source `#[cfg(test)] mod tests` | Cluster config parser, local JSON state diff, state CAS/lock handling, read-only validate/plan/status plus explicit refresh/import graph observations |
| `omnigraph-server` | `crates/omnigraph-server/tests/` | `server.rs` (HTTP-level), `openapi.rs` (OpenAPI drift / regeneration) |
| `omnigraph-compiler` | mostly in-source `#[cfg(test)] mod tests` | Parser, type-checker, IR lowering, lint |

View file

@ -21,7 +21,7 @@ A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` sc
| `schema plan \| apply \| show (alias: get)` | migrations |
| `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
| `queries validate \| list` | operate on the server-side stored-query registry (the `queries:` block). `validate` type-checks every stored query against the live schema offline (opens the selected graph; exits non-zero on any breakage), catching schema drift without restarting the server; `list` prints the selected registry's query names, MCP exposure, and typed params. For per-graph registries, pass `--target <graph>` or set `cli.graph`; with no graph selection, `list` shows only top-level `queries:`. Distinct from `lint`, which validates a single `.gq` file |
| `cluster validate \| plan \| status` | read-only cluster-control preview. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json` while briefly holding `__cluster/lock.json`; `status` reads the state ledger. No apply, graph open, live drift scan, server change, or `state.json` mutation occurs in Stage 2A |
| `cluster validate \| plan \| status \| refresh \| import` | cluster-control preview. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`; `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations. No apply, graph-resource mutation, server change, or `plan --refresh` occurs in Stage 2B |
| `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns; `--json` reports a `skipped` field) |
| `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
| `embed` | offline JSONL embedding pipeline |
@ -80,16 +80,21 @@ policy:
omnigraph cluster validate --config ./company-brain
omnigraph cluster plan --config ./company-brain --json
omnigraph cluster status --config ./company-brain --json
omnigraph cluster refresh --config ./company-brain --json
omnigraph cluster import --config ./company-brain --json
```
`--config` is a directory containing `cluster.yaml`; it defaults to `.`.
Stage 2A accepts graphs, schemas, stored queries, and policy bundle file
Stage 2B accepts graphs, schemas, stored queries, and policy bundle file
references. `cluster plan` reads local JSON state from
`<config-dir>/__cluster/state.json`; a missing file means empty state. Plan
acquires `__cluster/lock.json` by default and releases it before returning.
`cluster status` reads state only and reports any existing lock. External state
backends, apply, refresh/import, pipelines, UI specs, embeddings, aliases, and
bindings are reserved for later stages. See [cluster-config.md](cluster-config.md).
`<config-dir>/__cluster/state.json`; a missing file means empty state. Plan,
refresh, and import acquire `__cluster/lock.json` by default and release it
before returning. `cluster status` reads state only and reports any existing
lock. `refresh` requires an existing `state.json`; `import` creates one only
when it is missing. Both observe declared graphs read-only at
`<config-dir>/graphs/<graph-id>.omni`. External state backends, apply,
`plan --refresh`, pipelines, UI specs, embeddings, aliases, and bindings are
reserved for later stages. See [cluster-config.md](cluster-config.md).
## Output formats (`query` command, alias: `read`)

View file

@ -1,12 +1,13 @@
# Cluster Config
**Status:** Stage 2A read-only preview.
**Status:** Stage 2B state-observation preview.
Cluster config is the future control-plane configuration surface for a whole
OmniGraph deployment. In this stage, OmniGraph can validate a local
`cluster.yaml` folder, produce a deterministic read-only plan, and inspect the
local JSON state ledger. It does not apply changes, open graph roots, scan live
cluster state, start servers, or write graph resources.
`cluster.yaml` folder, produce a deterministic read-only plan, inspect the
local JSON state ledger, and explicitly refresh/import graph observations into
that ledger. It does not apply desired changes, start servers, or write graph
resources.
## Commands
@ -14,6 +15,8 @@ cluster state, start servers, or write graph resources.
omnigraph cluster validate --config ./company-brain
omnigraph cluster plan --config ./company-brain --json
omnigraph cluster status --config ./company-brain --json
omnigraph cluster refresh --config ./company-brain --json
omnigraph cluster import --config ./company-brain --json
```
`--config` points at a directory, not a file. The directory must contain
@ -21,7 +24,7 @@ omnigraph cluster status --config ./company-brain --json
## Supported `cluster.yaml`
Stage 2A accepts only the read-only resource subset:
Stage 2B accepts only the read-only resource subset:
```yaml
version: 1
@ -47,10 +50,10 @@ policies:
`metadata.name` is a display label. `state.backend` may be omitted or set to
`cluster`; external state backends are reserved for a later stage. `state.lock`
defaults to `true`. When enabled, `cluster plan` briefly acquires
`<config-dir>/__cluster/lock.json` while it reads state, then removes it before
returning. `cluster status` never acquires the lock; it only reports whether one
is present.
defaults to `true`. When enabled, `cluster plan`, `cluster refresh`, and
`cluster import` briefly acquire `<config-dir>/__cluster/lock.json`, then remove
it before returning. `cluster status` never acquires the lock; it only reports
whether one is present.
## Validation
@ -113,8 +116,10 @@ Missing `state_revision` is treated as `0`. Resource status values are
Plan output compares desired resource digests against state resource digests
and reports `create`, `update`, and `delete` changes. It also reports the state
CAS (`sha256:<digest>`), state revision, and lock id used for the read. The
command never writes `state.json`; apply, refresh, import, and live drift scans
are later-stage work.
command never writes `state.json` and does not scan live graphs. Use explicit
`cluster refresh` / `cluster import` when the state ledger should be updated
from live observations. Apply and live drift scans during plan are later-stage
work.
## Status
@ -122,3 +127,24 @@ are later-stage work.
ledger says is deployed. It does not validate referenced schema/query/policy
files and does not inspect live graphs. Missing `state.json` succeeds with a
warning; invalid state JSON or an unsupported state version fails.
## Refresh And Import
`cluster refresh` updates an existing `state.json` from actual observations.
`cluster import` creates the first `state.json` when the ledger is missing.
Both commands open declared graphs read-only at:
```text
<config-dir>/graphs/<graph-id>.omni
```
They observe only branch `main`, recording graph existence, manifest version,
live schema digest, desired schema digest, and schema-match status under
`observations["graph.<id>"]`. Missing graph roots are recorded as drift and
remove the graph/schema digests from state so a later `plan` proposes creates.
Invalid graph roots are recorded as errors; `refresh` persists the error
observation and exits non-zero, while `import` exits non-zero without creating
initial state.
Refresh/import do not observe query or policy resources yet. Existing query and
policy state digests are preserved on refresh and are not invented on import.