Mirror of https://github.com/ModernRelay/omnigraph.git, synced 2026-06-12 01:45:14 +02:00
docs(cluster): axiom 15 — single ownership, mode-switch migration, per-operator layer (#164)
Encode the omnigraph.yaml ↔ cluster.yaml coexistence rules that were implicit across the specs:

- cluster-axioms.md: new axiom 15 (every fact has exactly one owner at a time; coexistence is a mode switch, never a merge; omnigraph.yaml's job description shrinks to the permanent per-operator layer). Added review-tension bullet.
- cluster-config-specs.md: "Migration model" subsection (three coexistence windows: no-conflict, Phase-5 mode switch, bridges-with-sunsets) and a "per-operator layer" completeness table (connection, credential reference, active context, ergonomics, personal aliases) with its global-config-dir destination per the RFC-002 direction.
- cluster-config-implementation-spec.md: Compatibility Stance #7–#9 (single ownership, shrinking role, bridges carry sunsets); Phase 5 boot is an exclusive XOR mode switch; fixed the duplicated recoveries/recovery dirs in the Phase-1 storage layout.
- docs/user/cluster-config.md: "Relationship to omnigraph.yaml" section in current-reality terms (cluster catalog is inspectable, not live).

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
parent 2c578a60b2
commit cec65b8ef8
4 changed files with 100 additions and 4 deletions
|
|
@@ -24,6 +24,12 @@ consequences that follow from them.
|
|||
> Terraform-style JSON documents plus backend lock/CAS, not Lance control-plane
|
||||
> datasets. Lance remains a possible later backend only if row-level history or
|
||||
> queryability justifies the extra machinery.
|
||||
>
|
||||
> **Revision 2026-06-09 — single ownership during migration.** Axiom **15**
|
||||
> added: while `omnigraph.yaml` and the cluster catalog coexist, every fact has
|
||||
> exactly one owner at a time — coexistence is a **mode switch, never a merge**.
|
||||
> `omnigraph.yaml` does not get replaced; its job description shrinks to the
|
||||
> permanent per-operator layer.
|
||||
|
||||
---
|
||||
|
||||
|
|
@@ -72,6 +78,8 @@ invoke_query. This axiom is the target control-plane rule, not a statement
|
|||
about today's server catalog. -->
|
||||
**14. Exposure is a policy decision, not a config flag.** Target design: which stored queries (and the tools/dashboards built on them) an actor may **list or invoke** is decided by the policy layer (Cedar: `invoke_query` + catalog visibility), not by a per-query `expose:` boolean. The registry only says a query *exists* (name → file); **policy says who may see and run it**, so the MCP catalog (`GET /queries`) becomes each actor's policy-permitted set. This supersedes the engine's current `mcp.expose` flag only after per-query `invoke_query` scope and Cedar-filtered catalog listing land; until then, proposals must state the compatibility bridge to today's `mcp.expose` + coarse invocation gate.
|
||||
|
||||
**15. Every fact has exactly one owner at a time; coexistence is a mode switch, never a merge.** `cluster.yaml` is not `omnigraph.yaml` v2 — the two documents end with disjoint jobs, and only the *shared-truth* parts of today's `omnigraph.yaml` (the set of graphs, stored-query registry, policy wiring, server boot source) migrate to the cluster catalog. The per-operator parts — connection/cluster selection, the operator's own credential reference, active graph/branch context, CLI ergonomics — are per-operator *by nature* (Sarah's and Bob's differ) and stay in the per-operator layer permanently; plan a **shrinking job description** for `omnigraph.yaml`, not an exit. During the migration window each fact is read from exactly one source at a time: a deployment serves from `omnigraph.yaml` **or** boots from cluster state (an exclusive mode switch), never from a precedence-merge of both. Two readers for one fact is the brittle-backcompat failure mode — it is the deny-list's "state that drifts from what it can be derived from" wearing a compatibility costume. Any compatibility bridge must name its replacement and its removal phase (the `mcp.expose` → policy-owned exposure bridge of axiom 14 is the template); bridges that accumulate without an exit are rejected at review.
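The exclusive mode switch in axiom 15 can be sketched in a few lines. The Python below is a minimal illustration under stated assumptions, not the server's boot code: `BootMode`, `SHARED_FACTS`, and `load_shared_facts` are hypothetical names, and both sources are modeled as plain dictionaries. It shows one property only: the selected mode picks a single source for every shared fact, and a missing fact is an error rather than a reason to fall back to the other source.

```python
from enum import Enum


class BootMode(Enum):
    """Hypothetical per-deployment switch; exactly one value is active."""
    OMNIGRAPH_YAML = "omnigraph.yaml"
    CLUSTER_STATE = "cluster state"


# Illustrative labels for some of the shared-truth facts named in axiom 15.
SHARED_FACTS = ("graphs", "query_registry", "policy_wiring")


def load_shared_facts(mode, omnigraph_yaml, cluster_state):
    """Read every shared fact from the single source selected by `mode`."""
    source = omnigraph_yaml if mode is BootMode.OMNIGRAPH_YAML else cluster_state
    facts = {}
    for name in SHARED_FACTS:
        if name not in source:
            # No fallback to the other source: that would be a merge.
            raise KeyError(f"{name!r} missing from {mode.value}")
        facts[name] = source[name]
    return facts
```

A precedence-merge would instead consult the second dictionary when the first lacks a key; the sketch refuses to do that, which is what keeps "which file won?" from ever being a question.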
|
||||
|
||||
---
|
||||
|
||||
## The one-line compression
|
||||
|
|
@@ -82,7 +90,7 @@ about today's server catalog. -->
|
|||
|
||||
## How to use this file
|
||||
|
||||
- **Reviewing a proposal:** walk axioms 0–14; any conflict is the burden of the proposer to justify. The most common tensions:
|
||||
- **Reviewing a proposal:** walk axioms 0–15; any conflict is the burden of the proposer to justify. The most common tensions:
|
||||
- Treating the *running system* as the source of truth for **intent** → axioms 2, 4 (intent lives in config).
|
||||
- Treating state as a throwaway derivation rather than an authoritative, locked, backend-held ledger → axioms 5, 12.
|
||||
- A runtime config-mutation API instead of declarative apply → axiom 3.
|
||||
|
|
@@ -94,4 +102,5 @@ about today's server catalog. -->
|
|||
- A secret value (token, embedding key, pipeline source credential) inline in config instead of in the gitignored `.env` file → axiom 10.
|
||||
- A per-query `expose:`/visibility flag in target-state cluster config instead of governing list/invoke in policy; or failing to account for today's `mcp.expose` compatibility bridge → axiom 14.
|
||||
- Shipping `apply` before hermetic `validate` + read-only `plan` tests, or shipping graph/schema-moving apply before recovery tests for the graph/resource-moved-before-cluster-publish gap → axiom 5 and axiom 12.
|
||||
- Reading one fact from both `omnigraph.yaml` and the cluster catalog with precedence rules (a merge instead of a mode switch), migrating per-operator concerns into shared cluster config, or adding a compatibility bridge with no named replacement and removal phase → axiom 15.
|
||||
- **Citing:** reference axioms by number in PRs and review comments so the rationale is stable across renames and refactors.
|
||||
|
|
|
|||
|
|
@@ -64,6 +64,21 @@ is trying to create. -->
|
|||
identity. It is not committed into `cluster.yaml`.
|
||||
6. `mcp.expose` remains supported in current `omnigraph.yaml` until the
|
||||
per-query policy replacement ships.
|
||||
7. **Single ownership (axiom 15).** While `omnigraph.yaml` and the cluster
|
||||
catalog coexist, each fact is read from exactly one source at a time.
|
||||
Phase 5 server boot is an exclusive mode switch — boot from cluster state
|
||||
XOR from `omnigraph.yaml` — never a precedence-merge of both. No phase may
|
||||
introduce a surface that reads the same fact (graph set, query registry,
|
||||
policy wiring, bind address) from both sources with tie-break rules.
|
||||
8. **`omnigraph.yaml` shrinks; it does not get deprecated.** Its terminal role
|
||||
is the per-operator layer: connection/cluster selection, the operator's
|
||||
credential reference, active graph/branch context, CLI ergonomics, and
|
||||
purely personal aliases (target home: the operator's global config dir per
|
||||
RFC-002). Shared-truth keys migrate to `cluster.yaml`; per-operator keys
|
||||
never do.
|
||||
9. **Bridges carry sunsets.** Every compatibility bridge names its replacement
|
||||
and the phase that removes it (`mcp.expose` → Phase 6 policy-owned exposure
|
||||
is the template). A bridge without an exit is a review-blocking finding.
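The "bridges carry sunsets" rule lends itself to a mechanical review check. The sketch below is one possible shape for it, assuming a hypothetical `Bridge` record and `sunset_findings` helper that do not exist in the repository; only the `mcp.expose` entry and its Phase 6 removal come from the stance above, and `legacy-flag` is a made-up counterexample.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Bridge:
    """Hypothetical record describing one compatibility bridge."""
    name: str
    replacement: Optional[str] = None
    removal_phase: Optional[int] = None


def sunset_findings(bridges: List[Bridge]) -> List[str]:
    """Return one finding per missing exit detail; empty means all bridges pass."""
    findings = []
    for bridge in bridges:
        if not bridge.replacement:
            findings.append(f"{bridge.name}: no named replacement")
        if bridge.removal_phase is None:
            findings.append(f"{bridge.name}: no removal phase")
    return findings


# The template bridge from the stance above, plus an invented failing one.
template = Bridge("mcp.expose", "policy-owned exposure", removal_phase=6)
open_ended = Bridge("legacy-flag")  # made-up name, for illustration only
```

Calling `sunset_findings([template, open_ended])` reports two findings, both against `legacy-flag`; a non-empty result corresponds to the review-blocking case.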
|
||||
|
||||
## Terraform-Aligned Schema Validation
|
||||
|
||||
|
|
@@ -335,8 +350,6 @@ Target Phase-1 cluster-root layout:
|
|||
<ulid>.json
|
||||
recoveries/
|
||||
<ulid>.json
|
||||
recovery/
|
||||
<ulid>.json
|
||||
resources/
|
||||
query/<graph>/<name>/<digest>.gq
|
||||
policy/<name>/<digest>.yaml
|
||||
|
|
@@ -586,7 +599,9 @@ replacement would make every invariant harder to audit. -->
|
|||
|
||||
- Allow server startup from cluster state.
|
||||
- Add status and catalog endpoints as needed.
|
||||
- Keep the current `omnigraph.yaml` startup path as compatibility mode.
|
||||
- Keep the current `omnigraph.yaml` startup path as compatibility mode — an
|
||||
**exclusive mode switch** per deployment (cluster state XOR `omnigraph.yaml`),
|
||||
never a merged read of both (Compatibility Stance #7, axiom 15).
|
||||
- Regenerate OpenAPI for any HTTP surface.
|
||||
|
||||
### Phase 6: Policy-Owned Query Exposure
|
||||
|
|
|
|||
|
|
@@ -387,6 +387,65 @@ This proposal:
|
|||
|
||||
The connection/credential/preference layer remains per operator: it points at a cluster, resolves that operator's identity, and holds personal ergonomics. The cluster config stays shared, secret-free, and reviewable; the state ledger stays authoritative and locked.
|
||||
|
||||
### Migration model: single ownership, mode switch, shrinking job description (axiom 15)
|
||||
|
||||
`omnigraph.yaml` is not being replaced; its **job description shrinks**. Only the
|
||||
shared-truth parts of its current role migrate to the cluster catalog (the set of
|
||||
graphs, the stored-query registry, policy wiring, the server boot source). The
|
||||
per-operator parts are per-operator *by nature* — Sarah's and Bob's differ — and
|
||||
keep `omnigraph.yaml`/the per-operator layer as a permanent, well-defined home.
|
||||
|
||||
While both exist, **each fact has exactly one owner at any moment, and
|
||||
coexistence is a mode switch, never a merge**. The brittle version of backward
|
||||
compatibility — the server reading graphs from `omnigraph.yaml` *and* from
|
||||
cluster state with precedence rules gluing them together — is rejected outright:
|
||||
two readers for one truth means every bug becomes "which file won?" and every
|
||||
feature pays the tax twice. The realistic timeline has three windows:
|
||||
|
||||
1. **Now → Phase 4 (no conflict).** Cluster apply writes only to its own catalog
|
||||
(`__cluster/`); `omnigraph.yaml` serves traffic. `Applied` status must
|
||||
visibly mean "recorded in the cluster catalog, not yet serving" so the
|
||||
overlap is loud, not hidden.
|
||||
2. **Phase 5 (the mode switch).** A deployment opts into booting from cluster
|
||||
state; `omnigraph.yaml`'s server-role keys become inert *for that
|
||||
deployment*. Exclusive — boot from cluster state XOR `omnigraph.yaml` — with
|
||||
no key-level aliasing and no merged precedence.
|
||||
3. **Phase 6+ (bridges with sunsets).** Targeted compatibility bridges are
|
||||
allowed only with a named replacement and a removal phase; `mcp.expose` →
|
||||
policy-owned exposure is the template. Bridges that accumulate without an
|
||||
exit are review-rejected.
|
||||
|
||||
Key-by-key compatibility inside one evolving file is the expensive kind of
|
||||
backcompat (the v1 `omnigraph.yaml` reshape's `--target`/legacy-key regressions
|
||||
are the in-repo cautionary tale); resource-ownership seams between two files
|
||||
with a mode switch are the cheap kind. Police the single-owner rule in every
|
||||
Phase 3–6 PR: a proposal that merges the two sources for one fact is the
|
||||
deny-list's "state that drifts from what it can be derived from" wearing a
|
||||
compatibility costume.
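One way to police the single-owner rule at review time is to list every read site a proposal introduces and flag any fact that ends up with two sources. The following Python is a hedged sketch of that idea; `dual_owned_facts` and the `(fact, source)` pair format are assumptions made for illustration, not an existing tool.

```python
from collections import defaultdict


def dual_owned_facts(reads):
    """Map each fact read from more than one source to its sorted sources.

    `reads` is an iterable of (fact, source) pairs, one per read site.
    """
    sources_by_fact = defaultdict(set)
    for fact, source in reads:
        sources_by_fact[fact].add(source)
    return {
        fact: sorted(sources)
        for fact, sources in sources_by_fact.items()
        if len(sources) > 1
    }
```

An empty result means every fact in the proposal has one owner; any entry names the fact and both files that would have to be reconciled.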
|
||||
|
||||
### The per-operator layer: contents and destination
|
||||
|
||||
The per-operator layer must be **complete** — everything an operator needs to
|
||||
work against any cluster from any directory, and nothing that two operators must
|
||||
agree on:
|
||||
|
||||
| Per-operator concern | Today | Target |
|
||||
|---|---|---|
|
||||
| Connection (which cluster/server, named endpoints) | `omnigraph.yaml` `graphs.<name>` URIs / `servers:` refs | global config, per-operator |
|
||||
| Operator credential **reference** (`bearer_token_env`, env-file lookup) | `omnigraph.yaml` + `.env` | global config references; secret values stay in env/`.env`, never in any config |
|
||||
| Active context (current graph/branch selection) | ad-hoc per-command flags / `defaults` | global state layer (e.g. `omnigraph use`), explicitly **not** the cluster state ledger (axiom 5's "state" is the applied-cluster ledger, not a personal selection) |
|
||||
| CLI ergonomics (output format, table layout) | `omnigraph.yaml` `cli:`/`defaults:` | global config, per-operator |
|
||||
| Personal command shortcuts (purely personal aliases) | `omnigraph.yaml` `aliases:` | global config; *shared* aliases (team vocabulary) are cluster config — see the aliases split note above |
|
||||
|
||||
Destination: this layer belongs in the operator's **global config dir**
|
||||
(`~/.omnigraph`, per the RFC-002 global-first layered-config direction —
|
||||
global config + active-context state file), not in a repo-committed file, so it
|
||||
survives `git clone`, works from any directory, and never collides with the
|
||||
shared cluster folder. The RFC-002 layering implementation is currently parked
|
||||
(PRs #139/#162 closed over review findings), but the *boundary* it draws is the
|
||||
one this spec depends on: per-operator → global dir; shared deployment intent →
|
||||
the cluster config folder; deployed reality → the state ledger.
|
||||
|
||||
Implementation gate: the Terraform-style workflow must be testable in order.
|
||||
`cluster validate` must catch bad config before any apply path exists;
|
||||
read-only `cluster plan` must have deterministic structured-plan tests before
|
||||
|
|
|
|||
|
|
@@ -23,6 +23,19 @@ omnigraph cluster force-unlock <LOCK_ID> --config ./company-brain --json
|
|||
`--config` points at a directory, not a file. The directory must contain
|
||||
`cluster.yaml`. When omitted, it defaults to the current directory.
|
||||
|
||||
## Relationship to `omnigraph.yaml`
|
||||
|
||||
`cluster.yaml` does not replace `omnigraph.yaml`, and the two never describe
|
||||
the same fact. `omnigraph.yaml` remains how the CLI and server are configured
|
||||
today (graph targets, server bind, CLI defaults, credential env references) and
|
||||
is the long-term home for per-operator settings. `cluster.yaml` is the shared
|
||||
desired state of a whole deployment, read only by the `cluster` commands via
|
||||
`--config`. In the current stage, nothing recorded in the cluster state ledger
|
||||
affects what a server serves or what other CLI commands target — the cluster
|
||||
catalog is inspectable, not live. When server boot from cluster state ships in
|
||||
a later stage, it will be an explicit per-deployment mode switch, not a merge
|
||||
of the two files.
|
||||
|
||||
## Supported `cluster.yaml`
|
||||
|
||||
Stage 2C accepts only the read-only resource subset:
|
||||
|
|
|
|||