docs: state cluster apply is storage-direct, not server-routed (#306)

* docs: state cluster apply is storage-direct, not server-routed

`cluster apply` reaches the object store directly — the `__cluster/` ledger
and each graph's Lance datasets — never through a running omnigraph-server,
so the host that runs it needs storage credentials. The rationale (declarative
control plane, not a runtime mutation API) was documented in cluster-axioms.md
§3/§4, and the out-of-band/direct-storage fact was stated for the maintenance
verbs and init/load, but never spelled out for apply itself.

- docs/user/clusters/index.md: add a day-2 note making apply's storage-direct
  execution and credential requirement explicit, linking the why to axioms 3/4.
- skills/omnigraph/SKILL.md: extend the "init/load write storage directly
  (bypassing the server)" line to include cluster apply, with the same reasoning.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: disambiguate the §5 cross-reference in cluster apply note

The trailing (§5) sat right after the cluster-axioms.md §3/§4 citation, so a
reader could read §5 as referring to cluster-axioms.md (whose §5 covers locked
state) rather than this guide's §5. Make it an explicit same-page forward
reference. Addresses Greptile P2 on #306.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs: don't claim the server is read-only against storage

The "server only reads from it" wording was wrong: the data plane serves HTTP
writes (mutate/load/branch) that go through the server to the graph datasets,
so omnigraph-server is not read-only against object storage. The hazard is an
operator granting the server read-only S3 creds and breaking runtime writes.
Scope the read-only claim to cluster (control-plane) state at boot, and state
that data-plane writes still need read-write storage access. Addresses Greptile
P-level finding on #306.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Ragnor Comerford <ragnor.comerford@gmail.com>
Andrew Altshuler 2026-06-28 18:14:58 +03:00 committed by GitHub
parent 7779b72446
commit 20e5fada8a
2 changed files with 15 additions and 1 deletion

docs/user/clusters/index.md

@@ -121,6 +121,20 @@ audit entries and threaded into the engine's commit history. Set
default when `--as` is omitted (the flag always wins; `approve` requires one
of the two).
**`apply` runs out-of-band, with direct storage access — there are no server
routes for it.** Like `init`/`load` and the maintenance verbs (§7),
`cluster apply` reaches the object store directly: it reads and writes the
cluster ledger under `__cluster/` *and* opens each graph's Lance datasets to
create, migrate, or delete them. It never goes through a running
`omnigraph-server`, so the host that runs it (an operator or CI) needs storage
access — the `AWS_*` credential contract for an `s3://` cluster. This is by
design, not a missing feature: the control plane is **declarative** (config →
cluster), not a runtime mutation API on the serving process — intent lives in
the config files, outside the running system (the reasoning is
[cluster-axioms.md](../../dev/cluster-axioms.md) §3 and §4). The server only ever
*reads* the converged ledger, which is why a held apply lock never blocks
serving (see §5 below, in this guide).
What each change kind does:
| You edit | Plan shows | Apply does |

skills/omnigraph/SKILL.md

@@ -295,7 +295,7 @@ A graph's bytes live in one of two backends:
set -a && source .env.omni && set +a
```
`init` and `load` write storage directly (bypassing the server); the server reads from it. Validate with `curl http://127.0.0.1:8080/healthz`, then `omnigraph snapshot <graph-uri> --json`.
`init`, `load`, and **`cluster apply`** write storage directly (bypassing the server). `cluster apply` is a storage-direct control-plane command — it reaches the object store directly (the `__cluster/` ledger *and* each graph's Lance datasets, to create/migrate/delete them), never through a running server, so the host that runs it needs storage access (the `AWS_*` contract for an `s3://` cluster). That is by design: the control plane is declarative (config → cluster), not a runtime mutation API on the serving process. The server reads **cluster** state read-only at boot, but it is not read-only against storage overall — data-plane HTTP writes (`mutate`/`load`/`branch`) still go through the server to the graph datasets, so it needs read-write storage access. Validate with `curl http://127.0.0.1:8080/healthz`, then `omnigraph snapshot <graph-uri> --json`.
## Project Layout