From d0e06a6ff6cc549e55c1bf1f19e61138b7ecfeb7 Mon Sep 17 00:00:00 2001
From: Andrew Altshuler
Date: Wed, 17 Jun 2026 02:58:47 +0300
Subject: [PATCH] =?UTF-8?q?docs:=20audit=20pass=20=E2=80=94=20drop=20pre-0?=
 =?UTF-8?q?.7.0=20release=20notes;=20scrub=20RFC=20refs=20from=20user=20do?=
 =?UTF-8?q?cs=20(#272)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* docs: audit pass — drop pre-0.7.0 release notes; scrub RFC refs from user docs

- Delete the pre-0.7.0 release-notes archive (v0.2.0 … v0.6.2); keep v0.7.0.
- Rewrite every inline "RFC-0NN" citation in docs/user/** into durable plain language (the behavior is the contract, not the planning doc): cli/index.md, cli/reference.md, clusters/index.md, operations/{maintenance,policy,server}.md. Updated the in-page "Scopes & profiles" anchor to match the de-RFC'd heading.

No sub-0.7.0 version caveats or stale Lance-version refs were present in docs/user/**. Dev docs, AGENTS.md, and instruction files are out of scope for this pass.

Co-Authored-By: Claude Opus 4.8

* docs: second alignment pass — drop residual pre-cluster-only framing

- cli/reference.md: rewrite the server-scope graph-resolution rule — an omnigraph-server is always cluster-backed, so GET /graphs always answers and --graph is required; the bare-URL path is only the fallback for an unavailable/non-omnigraph endpoint (was "a single-graph / flat server … uses its bare URL as before").
- embeddings.md: "Direct single-graph serving" → "Direct (--store) access" (there is no single-graph serving mode under cluster-only).
- clusters/{config,index}.md: drop the removed --target flag from the "--cluster cannot combine with …" clauses.

Verified: no Linear tickets, no RFC refs, no single-graph-as-current, no --target-as-combinable in docs/user/**.
Co-Authored-By: Claude Opus 4.8

---------

Co-authored-by: Claude Opus 4.8
---
 docs/releases/v0.2.0.md             |  86 --------------
 docs/releases/v0.2.1.md             |  59 ----------
 docs/releases/v0.2.2.md             |  29 -----
 docs/releases/v0.3.0.md             |  49 --------
 docs/releases/v0.3.1.md             |  19 ----
 docs/releases/v0.4.0.md             |  88 --------------
 docs/releases/v0.4.1.md             | 142 -----------------------
 docs/releases/v0.4.2.md             | 115 -------------------
 docs/releases/v0.5.0.md             | 171 ----------------------------
 docs/releases/v0.6.0.md             | 141 -----------------------
 docs/releases/v0.6.1.md             |  28 -----
 docs/releases/v0.6.2.md             |  69 -----------
 docs/user/cli/index.md              |   2 +-
 docs/user/cli/reference.md          |  29 ++---
 docs/user/clusters/config.md        |   2 +-
 docs/user/clusters/index.md         |   4 +-
 docs/user/operations/maintenance.md |   2 +-
 docs/user/operations/policy.md      |   2 +-
 docs/user/operations/server.md      |   2 +-
 docs/user/search/embeddings.md      |   2 +-
 20 files changed, 23 insertions(+), 1018 deletions(-)
 delete mode 100644 docs/releases/v0.2.0.md
 delete mode 100644 docs/releases/v0.2.1.md
 delete mode 100644 docs/releases/v0.2.2.md
 delete mode 100644 docs/releases/v0.3.0.md
 delete mode 100644 docs/releases/v0.3.1.md
 delete mode 100644 docs/releases/v0.4.0.md
 delete mode 100644 docs/releases/v0.4.1.md
 delete mode 100644 docs/releases/v0.4.2.md
 delete mode 100644 docs/releases/v0.5.0.md
 delete mode 100644 docs/releases/v0.6.0.md
 delete mode 100644 docs/releases/v0.6.1.md
 delete mode 100644 docs/releases/v0.6.2.md

diff --git a/docs/releases/v0.2.0.md b/docs/releases/v0.2.0.md
deleted file mode 100644
index 7872ecf..0000000
--- a/docs/releases/v0.2.0.md
+++ /dev/null
@@ -1,86 +0,0 @@
-# Omnigraph v0.2.0
-
-Omnigraph v0.2.0 focuses on day-to-day operability: safer schema evolution, more capable mutation queries, better local and remote ergonomics, and a documented HTTP surface for clients and tooling.
-
-This release is especially relevant if you are running Omnigraph locally on RustFS or using the CLI and server together as a graph application backend.
-
-## Highlights
-
-### Schema planning and apply
-
-Schema changes can now move from planning to execution with first-class CLI and server support.
-
-- Added `omnigraph schema apply --schema ...` alongside `schema plan`
-- Added `POST /schema/apply` on the server
-- Added policy support for schema application through the `schema_apply` action
-- Persisted accepted schema updates as part of a supported apply flow
-
-This makes schema evolution an actual product capability instead of a plan-only diagnostic.
-
-### Safer schema apply on live repos
-
-After the initial schema-apply rollout, the apply path was hardened to avoid clobbering concurrent writes and to preserve indexes during table rewrites.
-
-- Blocks writes while schema apply is in progress
-- Verifies source heads before publishing rewritten tables
-- Rebuilds the full expected index set after rewrite operations
-- Keeps schema apply constrained to repos whose only branch is `main`
-
-The result is a much more defensible v1 schema migration path.
-
-### Multi-statement mutations
-
-Mutation queries can now contain multiple sequential statements that execute atomically within one run.
-
-Example:
-
-```gq
-query add_and_link($name: String, $age: I32, $friend: String) {
-    insert Person { name: $name, age: $age }
-    insert Knows { from: $name, to: $friend }
-}
-```
-
-This is a meaningful step toward richer write-side workflows without forcing multiple client round trips.
-
-### OpenAPI support
-
-The server now publishes an OpenAPI document at `/openapi.json`.
-
-- Added schema-backed endpoint documentation for the Omnigraph HTTP API
-- Documented request and response types for the current server surface
-- Made the published spec reflect runtime auth mode, so open local deployments are documented correctly
-
-This makes Omnigraph easier to integrate with generated clients, inspection tools, and API consumers that want a machine-readable contract.
-
-### CLI and export ergonomics
-
-Several rough edges in the CLI were fixed.
-
-- Export now streams instead of buffering the full snapshot in memory first
-- Load summaries now report actual loaded row counts
-- Alias handling no longer steals legitimate first arguments
-- `commit show` matches the documented `--uri` usage
-- Remote and local usage are more consistent for common admin flows
-
-## Additional Improvements
-
-- RustFS CI is now scoped to relevant changes instead of burning time on unrelated pull requests
-- README and install docs were tightened around public binary install behavior
-- The local RustFS bootstrap remains aligned with the rolling `edge` binary channel
-
-## Upgrade Notes
-
-- If you use local or remote schema administration, prefer `schema plan` before `schema apply`
-- `schema apply` is intentionally conservative in v1 and rejects repos with non-`main` branches
-- If policy is enabled, make sure admin actors are allowed to perform `schema_apply`
-- If you rely on published binaries, this release is the point where stable installers can pick up schema apply and the newer CLI/runtime behavior without using `edge`
-
-## Included Changes
-
-- PR #2: CLI ergonomics and streamed export output
-- PR #5: schema apply command and policy support
-- PR #7: schema apply concurrency and index-preservation hardening
-- PR #4: multi-statement mutations
-- PR #1: OpenAPI generation and auth-aware `/openapi.json`
-- PR #8: RustFS CI scoping improvements
diff --git a/docs/releases/v0.2.1.md b/docs/releases/v0.2.1.md
deleted file mode 100644
index b840885..0000000
--- a/docs/releases/v0.2.1.md
+++ /dev/null
@@ -1,59 +0,0 @@
-# Omnigraph v0.2.1
-
-Omnigraph v0.2.1 is a focused follow-up release on top of v0.2.0. It adds query linting, improves query execution correctness, hardens the local RustFS bootstrap flow, and cleans up project config naming.
-
-## Highlights
-
-### Query lint and check
-
-The CLI now ships a first-class query validation surface:
-
-- `omnigraph query lint`
-- `omnigraph query check`
-
-These commands validate `.gq` files against either an explicit schema file or a local/S3-backed repo schema, emit structured results, and support both human-readable and JSON output.
-
-### Query execution fixes and aggregate support
-
-This release includes several improvements in the query engine:
-
-- aggregate execution support for read queries
-- nullable query parameters now accept omission and explicit null for nullable params
-- traversal planning and join alignment are more robust for traversal-introduced bindings
-
-Together, these changes make complex read queries more dependable and easier to author.
-
-### Better local RustFS startup
-
-The local RustFS bootstrap is more resilient:
-
-- detects dirty/stale repo prefixes before blindly reinitializing
-- makes bootstrap recovery clearer for persisted local RustFS state
-- ships a more generic demo fixture instead of user-specific seed content
-
-This reduces the most common failure mode in local-first setup.
-
-### Config terminology cleanup
-
-`omnigraph.yaml` now uses graph-oriented naming:
-
-- `graphs:` instead of `targets:`
-- `cli.graph` / `server.graph` instead of `target`
-
-This removes one of the more confusing overloaded terms in the CLI/server config model.
-
-## Included Changes
-
-- PR #15: query lint and query check commands
-- PR #6: aggregate execution support
-- PR #3: nullable query parameter fixes
-- PR #16: traversal planning and join-alignment fixes
-- PR #13: local RustFS bootstrap recovery hardening
-- PR #14: generic bootstrap fixture
-- PR #17: config rename from targets to graphs
-
-## Upgrade Notes
-
-- If you maintain `.gq` files in-repo, add `omnigraph query lint` to your local validation workflow
-- Existing configs must use `graphs:` / `graph:` after this release
-- Local RustFS users should prefer the current bootstrap script from `main` or this release rather than older cached copies
diff --git a/docs/releases/v0.2.2.md b/docs/releases/v0.2.2.md
deleted file mode 100644
index 88d086e..0000000
--- a/docs/releases/v0.2.2.md
+++ /dev/null
@@ -1,29 +0,0 @@
-# Omnigraph v0.2.2
-
-Omnigraph v0.2.2 is a packaging follow-up to v0.2.1. It keeps the CLI and server surface the same, but renames the published runtime crate from `omnigraph` to `omnigraph-engine` so the full crate set can be published cleanly to crates.io.
-
-## Highlights
-
-### Published runtime crate rename
-
-The runtime package is now published as:
-
-- `omnigraph-engine`
-
-The in-code Rust library name remains `omnigraph`, so internal imports and code paths stay stable. CLI users are unaffected.
-
-### Crates.io metadata cleanup
-
-All published crates now ship repository, homepage, and documentation metadata so the crates.io pages are complete and the release pipeline no longer emits missing-package-metadata warnings.
-
-## Included Changes
-
-- rename runtime package from `omnigraph` to `omnigraph-engine`
-- bump `omnigraph-engine`, `omnigraph-compiler`, `omnigraph-server`, and `omnigraph-cli` to `0.2.2`
-- update dependent manifests and CI package references to the new runtime package name
-
-## Upgrade Notes
-
-- Rust consumers should depend on `omnigraph-engine` on crates.io
-- Code that imports the library can continue using `omnigraph` as the crate name
-- The `omnigraph` CLI binary name is unchanged
diff --git a/docs/releases/v0.3.0.md b/docs/releases/v0.3.0.md
deleted file mode 100644
index 4c900a7..0000000
--- a/docs/releases/v0.3.0.md
+++ /dev/null
@@ -1,49 +0,0 @@
-# Omnigraph v0.3.0
-
-Omnigraph v0.3.0 is a feature and security release. It adds an AWS deployment path for the server, hardens bearer-token authentication, introduces a schema inspection endpoint, and ships the CodeBuild-driven image packaging pipeline.
-
-## Highlights
-
-### AWS deployment path
-
-A new `aws` Cargo feature enables an AWS-native bearer-token backend. When compiled with `--features aws` and pointed at an AWS Secrets Manager secret ARN via `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET`, the server fetches and parses bearer tokens directly from Secrets Manager at startup. The token loading path is abstracted behind a `TokenSource` trait so additional backends are easy to add.
-
-A manually-dispatched Package workflow builds two variants of the server image (default and `--features aws`) via AWS CodeBuild, tags them by source SHA in ECR, and records the digests for downstream deploy automation.
-
-### Bearer auth hardening
-
-Bearer tokens are now hashed (SHA-256) at rest inside the server and compared using constant-time equality (`subtle::ConstantTimeEq`). The authenticated actor id is resolved server-side from the hash match — requests can no longer assert their own actor id by setting a header.
-
-### Schema inspection API
-
-A new `GET /schema` endpoint and matching CLI `schema get` command return the active graph schema as JSON. A static OpenAPI spec is published at `openapi.json` and kept in sync with the server via a CI job.
-
-### Stricter run-branch hygiene
-
-Internal `__run__…` branches, used for short-lived write staging, are now filtered out of user-visible branch listings and are deleted on every terminal state transition instead of accumulating over time.
-
-## Breaking changes
-
-### Schema state is now required
-
-The server refuses to open a repo that lacks persisted schema state (`_schema.pg`, `_schema.ir.json`, `__schema_state.json`) or that has non-main public branches left over from earlier versions. Existing repos created with 0.2.x need to be reinitialized (or have their schema state written explicitly) before they can be opened with 0.3.0.
-
-## Included Changes
-
-- Add `aws` feature + `SecretsManagerTokenSource` backend
-- Extract `TokenSource` trait for bearer token loading
-- Harden bearer auth: constant-time compare, SHA-256 hashed at rest, server-authoritative actor id
-- Add manually-dispatched Package workflow for CodeBuild image builds (default + aws variants)
-- Add `GET /schema` endpoint and `schema get` CLI command
-- Ship static `openapi.json` spec with CI auto-sync
-- Filter and delete ephemeral `__run__` branches
-- Switch Dockerfile base to ECR Public (avoid Docker Hub rate limits)
-- Raise `LANCE_MEM_POOL_SIZE` default to 1 GB for stable parallel tests
-- Automate Homebrew tap updates on release tags
-- Documentation for the AWS build variant and bearer-token sources
-
-## Upgrade Notes
-
-- Repos created with 0.2.x must be reinitialized (or have their schema state generated) before they can be opened with 0.3.0
-- Deployments using AWS Secrets Manager for bearer tokens must build the server with `--features aws` and set `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` to the secret ARN
-- The default token source (env
var or JSON file) continues to work unchanged
diff --git a/docs/releases/v0.3.1.md b/docs/releases/v0.3.1.md
deleted file mode 100644
index 1f5d7dc..0000000
--- a/docs/releases/v0.3.1.md
+++ /dev/null
@@ -1,19 +0,0 @@
-# Omnigraph v0.3.1
-
-Omnigraph v0.3.1 is a performance and operability point release.
-
-## Highlights
-
-- **Parallel per-type load writes**: the bulk loader writes to each node/edge table concurrently rather than serially, materially reducing wall-clock time on multi-table loads.
-- **`omnigraph optimize` and `omnigraph cleanup` CLI commands**: previously only available via the engine API. `optimize` runs Lance `compact_files()` across every node/edge table; `cleanup` runs Lance `cleanup_old_versions()` with a `--keep`/`--older-than` policy and requires `--confirm` for the destructive form.
-- **Dst-id deduplication during edge expand hydration**: avoids redundant lookups when the same destination id appears multiple times in an `Expand` step (#45).
-
-## Included Changes
-
-- Parallel per-type load writes (#46)
-- `omnigraph optimize` / `cleanup` CLI commands and runtime APIs (#46)
-- Dedupe dst ids before hydrating nodes in `execute_expand` (#45)
-
-## Upgrade Notes
-
-No breaking changes. Existing v0.3.0 repos can be opened directly with v0.3.1.
diff --git a/docs/releases/v0.4.0.md b/docs/releases/v0.4.0.md
deleted file mode 100644
index d3a8244..0000000
--- a/docs/releases/v0.4.0.md
+++ /dev/null
@@ -1,88 +0,0 @@
-# Omnigraph v0.4.0
-
-Omnigraph v0.4.0 demotes the Run state machine to commit metadata via the
-publisher's CAS, fixing a write-cancellation hole and reducing the engine's
-surface area.
-
-## Highlights
-
-- **Direct-to-target writes**: `mutate_as` and `load` write
-  directly to the target tables and call
-  `ManifestBatchPublisher::publish` once at the end with
-  `expected_table_versions`. No more `__run__` staging branches, no
-  more `RunRecord` state machine.
Cross-table OCC is enforced inside the - publisher's row-level CAS on `__manifest`. -- **Cancellation safety by construction**: a dropped mutation future - leaves no graph-level state — only orphaned Lance fragments, reclaimed - by `omnigraph cleanup`. The "zombie run" cascade documented in - `.context/zombie-run-investigation.md` is gone. -- **Read-your-writes inside multi-statement mutations**: a `.gq` query - that inserts and then references a row in the same statement now sees - its own writes via an in-process `MutationStaging` cache, even though - no manifest commit happens between ops. -- **Structured conflict surface**: concurrent writers race through the - publisher's CAS; the loser surfaces as - `ManifestConflictDetails::ExpectedVersionMismatch { table_key, - expected, actual }`. The HTTP server maps this to **409 Conflict** with - a structured `manifest_conflict` body so clients can detect-and-retry - without parsing the message. - -## Removed - -This is a breaking release. Pre-0.4.0 / no SLA. - -- `omnigraph::db::{RunRecord, RunStatus, RunId}` types and the - `_graph_runs.lance` / `_graph_run_actors.lance` Lance datasets. -- Engine APIs `begin_run`, `begin_run_as`, `publish_run`, - `publish_run_as`, `abort_run`, `fail_run`, `terminate_run`, - `list_runs`, `get_run`. -- HTTP endpoints: `GET /runs`, `GET /runs/{run_id}`, `POST - /runs/{run_id}/publish`, `POST /runs/{run_id}/abort`. The - `RunListOutput` and `RunOutput` schemas are removed from the OpenAPI - document. -- CLI subcommands: `omnigraph run list`, `omnigraph run show`, `omnigraph - run publish`, `omnigraph run abort`. Use `omnigraph commit list` - reading the commit graph for audit history. -- Cedar policy actions `run_publish` and `run_abort`. Existing - `policy.yaml` files referencing these actions will fail validation — - remove the rules; the `change` action covers the equivalent gating. 
- -## Behavior changes - -- `mutate_as` / `load` are now **atomic per query, single publish at the - end**. A failed mutation leaves the target unchanged with no - intermediate manifest commits. -- The `OmniError::manifest_conflict` shape produced by concurrent - writers is now `ExpectedVersionMismatch` (was `MergeConflict::DivergentUpdate` - via the run merge path). Clients that match on the conflict body must - switch to inspecting `manifest_conflict.table_key/expected/actual`. - -## Known limitation - -A multi-statement mutation that writes a Lance fragment in op-N and then -fails in op-N+1 leaves the touched table with Lance HEAD ahead of the -manifest. The next mutation against that table fails with -`ExpectedVersionMismatch`. Most validation runs before any Lance write, -so single-statement mutations are unaffected; the narrow path is -multi-statement queries with late-op failures. Tracked as a follow-up; -see [docs/dev/writes.md](../dev/writes.md#mid-query-partial-failure-closed-by-mr-794) -for the workaround. - -## Upgrade notes - -- **Stale `__run__*` branches and `_graph_runs.lance`** in legacy v0.3.x - repos are *inert* — the engine no longer reads them — but they remain - on disk until production cleanup. This release deliberately does not touch - legacy bytes. -- The `is_internal_run_branch` predicate is kept as a defense-in-depth - guard against users naming a branch `__run__*`. It will be removed in - a follow-up cleanup. -- External scripts hitting `/runs/*` will now receive 404. Migrate them - to `/commits` for audit history; mutation status is implied by the - HTTP response on `/change` itself. 
- -## Included Changes - -- Demote Run: write directly to target via publisher -- `ManifestBatchPublisher::publish` accepts per-table - `expected_table_versions` diff --git a/docs/releases/v0.4.1.md b/docs/releases/v0.4.1.md deleted file mode 100644 index 4983015..0000000 --- a/docs/releases/v0.4.1.md +++ /dev/null @@ -1,142 +0,0 @@ -# Omnigraph v0.4.1 - -Omnigraph v0.4.1 closes the multi-statement-mutation atomicity gap that -v0.4.0 documented as a known limitation. Inserts and updates now route -through an in-memory `MutationStaging` accumulator and commit via Lance's -two-phase distributed-write API at end-of-query. A failed mid-query op -no longer leaves Lance HEAD drifted on the touched table — the next -mutation proceeds normally. - -## Highlights - -- **Staged-write rewire**: `mutate_as` and `load` (Append / - Merge modes) accumulate insert/update batches into - `MutationStaging.pending` per touched table. No Lance HEAD advance - happens during op execution; one `stage_*` + `commit_staged` per - table runs at end-of-query, then `ManifestBatchPublisher::publish` - commits the manifest atomically. **For op-execution failures** - (validation errors, missing endpoints, parse-time D₂ rejection), Lance - HEAD on every staged table is untouched and the next mutation - proceeds normally. A narrowed residual remains at the - finalize→publisher boundary (multi-table `commit_staged` is not - atomic with the manifest commit) — see [docs/dev/writes.md](../dev/writes.md) - "Finalize → publisher residual" for details. -- **D₂ parse-time rule**: a single mutation query is either - insert/update-only or delete-only. Mixed → rejected with a clear - error directing the caller to split into two queries. Lance 4.0.0 - has no public two-phase delete; deletes still inline-commit, and D₂ - keeps that path safe. 
-- **Read-your-writes via DataFusion `MemTable`**: read sites in - multi-statement mutations consume `TableStore::scan_with_pending`, - which Lance-scans the committed snapshot at the captured - `expected_version` and unions with a DataFusion `MemTable` over the - pending batches. Replaces the previous "reopen at staged Lance - version" pattern. -- **Coordinator swap-restore eliminated** from `mutate_with_current_actor`. - Branch is threaded explicitly through the per-op execution path - (`execute_named_mutation`, `execute_insert`, `execute_update`, - `execute_delete*`, `validate_edge_insert_endpoints`, - `ensure_node_id_exists`). The `swap_coordinator_for_branch` / - `restore_coordinator` API and `CoordinatorRestoreGuard` are removed - from `mutation.rs`. (`merge.rs` keeps its own swap pattern; that's - a separate workflow.) -- **`docs/dev/invariants.md` mutation atomicity / read-your-writes status** - flips from `aspirational/open` to `upheld for inserts/updates`. The within-query read-your-writes - guarantee is now load-bearing for the publisher CAS contract. - -## Behavior changes - -- A failed multi-statement mutation no longer surfaces - `ExpectedVersionMismatch` on the *next* mutation against the same - table. The next call proceeds normally — Lance HEAD on staged - tables is unchanged. -- Mixed insert/update + delete in one query is rejected at parse - time. Existing test queries that mixed both must be split. -- `MutationStaging`'s shape changed: `pending: HashMap` - + `inline_committed: HashMap` replaces the - previous `latest: HashMap`. This is an internal - type; no public API impact. - -## Residual / out of scope - -- **`LoadMode::Overwrite`** keeps the legacy inline-commit path - (truncate-then-append doesn't fit the staged shape). A mid-overwrite - failure can still drift Lance HEAD on a partially-truncated table; - the next overwrite replaces it. Operator-driven, rare. -- **Delete-only multi-statement mutations** still inline-commit per op. 
- D₂ keeps inserts/updates from coexisting with deletes, so the - inline path remains atomic per op but not per query for delete-only - cascades. Closing this requires Lance to expose - `DeleteJob::execute_uncommitted`; tracked upstream with Lance. -- **`schema_apply`, `branch_merge_internal`, `ensure_indices`** still - use Lance's inline-commit APIs. The two-phase pattern is in - `mutate_as` and `load` only; hoisting it to a storage-trait invariant - covering all writers remains future work. - -## Tests added - -- `tests/writes.rs::partial_failure_leaves_target_queryable_and_unblocks_next_mutation` - (replaces the old `partial_failure_observably_rolls_back_but_blocks_next_mutation_on_same_table`) -- `tests/writes.rs::mutation_rejects_mixed_insert_and_delete_at_parse_time` -- `tests/writes.rs::mixed_insert_and_update_on_same_person_coalesces_to_one_merge` -- `tests/writes.rs::multiple_appends_to_same_edge_coalesce_to_one_append` -- `tests/writes.rs::multi_statement_inserts_publish_exactly_once` -- `tests/writes.rs::load_with_bad_edge_reference_unblocks_next_load` -- `tests/writes.rs::load_with_cardinality_violation_unblocks_next_load` - -## Files changed - -- `crates/omnigraph/src/exec/staging.rs` (NEW) — `MutationStaging`, - `PendingTable`, `PendingMode`, `StagedTablePath`, - `dedupe_merge_batches_by_id`. -- `crates/omnigraph/src/exec/mutation.rs` — D₂ check; per-op - rewires (`execute_insert`, `execute_update`, `execute_delete*`); - branch threading; coordinator-swap removal; helper - `validate_edge_cardinality_with_pending`; helper - `concat_match_batches_to_schema`; `apply_assignments` updated to - copy unassigned blob columns from full-schema scans. -- `crates/omnigraph/src/loader/mod.rs` — `load_jsonl_reader` split: - staged path for Append/Merge, legacy inline-commit path for - Overwrite. Helpers `collect_node_ids_with_pending` and - `validate_edge_cardinality_with_pending_loader`. 
-- `crates/omnigraph/src/table_store.rs` — `scan_with_pending`, - `count_rows_with_pending` (DataFusion `MemTable`-backed union with - Lance scan). -- `Cargo.toml` (workspace) + `crates/omnigraph/Cargo.toml` — added - `datafusion = "52"` direct dep (transitively pulled by Lance - already; required for `MemTable`). -- `docs/dev/writes.md` — removed "Known limitation" section; documented - the new accumulator + D₂ + LoadMode::Overwrite residual. -- `docs/dev/invariants.md` — mutation atomicity / read-your-writes status - flipped to `upheld for inserts/updates`. -- `docs/dev/architecture.md` — added "Mutation atomicity — in-memory - accumulator" subsection; refreshed the engine + state - diagrams to drop `RunRegistry` and add `MutationStaging`. -- `docs/dev/execution.md` — rewrote the mutation flow sequence diagram - for the staged-write path; updated the `LoadMode` table to call - out per-mode commit semantics; rewrote `load` vs `ingest`. -- `docs/user/query-language.md` — documented the D₂ parse-time rule. -- `docs/user/errors.md` — added the D₂ `BadRequest` rejection path. -- `docs/user/storage.md` — dropped the live `_graph_runs.lance` reference - from the layout diagram and prose. -- `docs/user/branches-commits.md` — moved `__run__` to a legacy note; - removed `publish_run` from the publish-trigger list. -- `docs/user/audit.md` — current `_as` API list refreshed; legacy - `RunRecord.actor_id` moved to a historical note. -- `docs/user/constants.md` — marked the run registry / branch-prefix rows - as legacy. -- `docs/user/cli.md` — replaced the legacy `omnigraph run *` quickstart - block with `omnigraph commit list/show`. -- `docs/dev/testing.md` — extended the `writes.rs` row to cover the new - staged-write contract tests; added the `staged_writes.rs` row. -- `AGENTS.md` (CLAUDE.md symlink) — updated the atomic-per-query - description and the L2 capability matrix row. 
- -## Included Changes - -- Rewire `mutate_as` and `load` via in-memory `MutationStaging` + - `stage_*` / `commit_staged` per touched table at end-of-query. -- (The storage substrate shipped in v0.4.0's PR #67 — `StagedWrite`, - `stage_append`, `stage_merge_insert`, `commit_staged`, - `scan_with_staged`, `count_rows_with_staged` — and is the substrate - this release builds on.) diff --git a/docs/releases/v0.4.2.md b/docs/releases/v0.4.2.md deleted file mode 100644 index bc45716..0000000 --- a/docs/releases/v0.4.2.md +++ /dev/null @@ -1,115 +0,0 @@ -# Omnigraph v0.4.2 - -Omnigraph v0.4.2 is a concurrency, admission-control, and release-hygiene -release. It removes the server-global write lock, lets disjoint writers make -progress concurrently, adds per-actor admission limits, hardens branch and -mutation races with snapshot-isolation fences, and documents the release in -public open-source terms. - -## Highlights - -- **Unlocked server engine handle**: the HTTP server now holds the engine behind - a shared handle instead of a server-global write lock. Concurrent handlers can - call engine APIs directly while the engine serializes only the resources that - actually conflict. -- **Engine-owned writer queues**: same `(table, branch)` writers are serialized - by per-table writer queues inside the engine, while disjoint table/branch - writes can run concurrently. This narrows contention without relying on route - handlers to know storage-level ordering rules. -- **Per-actor admission control**: mutating HTTP handlers are gated by a - `WorkloadController` with per-actor in-flight request and estimated-byte - budgets. Rejections use HTTP 429 with `code: too_many_requests` and a - `Retry-After` header, so noisy actors back off without blocking unrelated - actors. -- **Admission coverage for all mutating handlers**: `/change`, `/ingest`, - `/schema/apply`, branch create/delete, and branch merge now flow through the - admission controller. 
Read-only endpoints are not admission-gated. -- **Op-kind-aware version checks**: mutation commit-time drift checks distinguish - append-like inserts from strict update/delete work. Inserts remain permissive - enough for safe concurrent append patterns; updates and deletes get stricter - stale-view rejection. -- **Read-time drift checks for strict mutations**: staged mutations compare the - manifest pin captured when the query opened against the manifest snapshot - captured under table-queue ownership. If a concurrent writer moved the table - after the query read, the stale writer returns a structured - `manifest_conflict` 409 instead of staging work computed against an old - snapshot. -- **Inline-delete recovery coverage**: delete-only mutations still use Lance's - inline delete path, but their recovery sidecar is now written before the - manifest-version rejection path can return. If a delete moves Lance HEAD and a - concurrent manifest update makes the query stale, the next read-write open can - roll the residual back rather than leaving a head-ahead-of-manifest table. -- **Branch-operation race hardening**: branch creation and branch merge avoid - coordinator swap-restore races that could expose the wrong active branch to - concurrent work. Concurrent branch merges are serialized by a merge mutex. -- **Branch-merge target revalidation**: merges re-check target table versions - after acquiring target write queues. A stale merge plan returns a structured - conflict instead of overwriting concurrent target-branch changes or adopting a - source table over newly appended target rows. -- **Schema refresh deadlock fix**: recovery refresh releases the write guard - before schema reload, preventing a refresh/schema-apply deadlock. -- **Lean admission API**: removed the unused global rewrite admission pool, - `service_unavailable` error variant, related 503 documentation, and benchmark - flag. 
The public server surface now reflects only admission behavior that is - wired to handlers. -- **Open-source release hygiene**: this release adds guidance for public-facing - documentation, release notes, and version bumps. Release docs now avoid - private issue tracker references and use stable public descriptions instead. - -## Behavior changes - -- Disjoint mutating HTTP requests can now make progress concurrently instead of - queueing behind one process-wide engine write lock. -- Mutating handlers may return HTTP 429 when an actor exceeds per-actor in-flight - or estimated-byte budgets. Clients should respect `Retry-After` and retry - later. -- Concurrent update/delete and merge races now return structured - `manifest_conflict` 409 responses in more stale-view cases instead of relying - on later publisher-CAS detection or allowing a stale plan to proceed. -- Concurrent branch merge × change on the same target branch may return either - success or a clean 409 conflict, depending on which operation wins the queue. -- `OMNIGRAPH_GLOBAL_REWRITE_MAX` is no longer recognized. Remove it from - deployment manifests; use the per-actor in-flight and byte-budget admission - settings for the currently wired server controls. - -## Upgrade Notes - -- No repository migration is required. Existing v0.4.1 repos can be opened - directly with v0.4.2. -- Clients should treat `manifest_conflict` 409 responses as retryable stale-view - conflicts. This was already the documented contract, but this release uses it - in more concurrent-write paths. -- Clients should handle HTTP 429 from every mutating endpoint, not only - `/change`. Honor the `Retry-After` header. -- Operators should remove stale references to global rewrite admission and 503 - rewrite-pool exhaustion from local runbooks. -- If you maintain public docs or release notes, use public identifiers and - user-facing descriptions rather than private tracker IDs. 
- -## Tests added or strengthened - -- Regression tests for update read-your-writes under in-process concurrency. -- HTTP tests for same-key insert snapshots, disjoint `/change` concurrency, and - `/ingest` admission 429 + `Retry-After`. -- Branch-operation regression tests for branch-create swap-restore races, - concurrent `/change` + branch-merge interleavings, branch-merge swap-restore - races, branch-op matrix coverage, and post-reopen consistency. -- Failpoint-backed regression coverage for inline-delete recovery sidecar - creation before version-mismatch rejection. -- Admission tests use injectable `WorkloadController` state instead of mutating - process environment. - -## Included Changes - -- Shared server engine state and per-actor admission on mutating endpoints. -- Per-(table, branch) writer queues and op-kind-aware manifest drift checks. -- Strict read-time version checks for updates/deletes. -- Branch create/merge race hardening and branch-merge target snapshot - revalidation under queue ownership. -- Retry-after support for admission rejections and OpenAPI updates for reachable - 429 responses. -- Actor-isolation benchmark harness updates for the current admission controller. -- Removal of the unwired global rewrite admission / 503 server surface. -- Version bump to `0.4.2` across workspace crates, `Cargo.lock`, and - `openapi.json`. -- Public release-note cleanup and new OSS best-practice guidance in `AGENTS.md`. diff --git a/docs/releases/v0.5.0.md b/docs/releases/v0.5.0.md deleted file mode 100644 index 16e284e..0000000 --- a/docs/releases/v0.5.0.md +++ /dev/null @@ -1,171 +0,0 @@ -# Omnigraph v0.5.0 - -Omnigraph v0.5.0 is a substrate, security, and migration-safety release. 
It -jumps the storage substrate from Lance 4 to Lance 6.0.1 (DataFusion 52 → 53, -Arrow 57 → 58), introduces engine-wide Cedar policy enforcement on every -authoring path, and ships a structured schema-lint v1 chassis with -code-tagged diagnostics, soft drops, and an explicit `--allow-data-loss` -flag for destructive migrations. - -## Highlights - -- **Lance 6.0.1 substrate**: bump from Lance 4.0.0 → 6.0.1, DataFusion 52 → - 53, Arrow 57 → 58. New optimizer rules (vectorized `IN`-list eq kernel, - `PhysicalExprSimplifier`, push-limit-into-hash-join, CASE-NULL shortcut) - reach predicates that flow through the engine. `lance-tokenizer` replaces - tantivy internally; FTS behavior preserved. -- **Cedar policy engine**: a new `omnigraph-policy` crate wires - `Omnigraph::enforce(action, scope, actor)` into every `_as` writer - (`mutate_as`, `load_as`, `apply_schema_as`, `branch_create_as`, - `branch_merge_as`, `branch_delete_as`, plus the load and change - variants). The HTTP server defaults to deny-all when no Cedar policy is - configured; a YAML policy file is required to enable writes. Actor - identity comes only from signed token claims — clients cannot set actor - identity directly. -- **Schema lint v1 chassis**: diagnostics now carry stable codes of the form - `OG-XXX-NNN` instead of free-form messages. `omnigraph schema plan` and - `apply` understand soft drops on properties and types — destructive drops - require the new `--allow-data-loss` flag (Hard mode) at the CLI and an - equivalent JSON flag over HTTP. -- **Structured filter pushdown**: query-language predicates lower to - DataFusion `Expr` and push down through Lance's `Scanner::filter_expr` - instead of being flattened to SQL strings. This unlocks `CompOp::Contains` - pushdown (via `array_has`), which previously fell through to in-memory - post-scan filtering, and lets the DataFusion 53 optimizer rules above act - on our predicates. 
-- **HTTP `allow_data_loss` parity**: the destructive-drop guard now exists - on both the CLI (`--allow-data-loss`) and HTTP (`allow_data_loss: true` in - the schema-apply request body). -- **Inline query strings on CLI and HTTP**: `omnigraph read` / - `omnigraph mutate` and the corresponding HTTP endpoints accept inline - `.gq` source, not just a file path. Easier ad-hoc queries, clearer - request logs. -- **Browser CORS layer**: optional CORS layer on `omnigraph-server` for - browser-based UIs, gated by `OMNIGRAPH_CORS_ORIGINS`. -- **Merge-insert dup-rowid fix**: Lance's `MergeInsertBuilder` could surface - spurious `"Ambiguous merge inserts"` errors on sequential merges against - rows previously rewritten by `merge_insert`. The engine now opts into - `SourceDedupeBehavior::FirstSeen` with a `check_batch_unique_by_keys` - fail-fast precondition that guarantees source-side dedup happens before - Lance sees the batch. -- **Branch-merge error-path recovery**: a branch merge that failed - mid-flight could leave the in-process coordinator pointing at a stale - active branch. The error path now restores the prior coordinator, - matching the success path's invariant. -- **Branch merge with blob columns**: external blob URIs are now - materialized correctly during branch merge instead of being dropped or - pointing at the source branch. -- **Lance API surface guards**: a new test file - (`crates/omnigraph/tests/lance_surface_guards.rs`) pins eight specific - Lance API surfaces (`LanceError::TooMuchWriteContention`, - `ManifestLocation` fields, `MergeInsertBuilder` return shape, - `WriteParams::default`, `compact_files` signature, etc.) so the next - Lance bump fails compile or runtime on any silent drift rather than - producing wrong-state recovery in production. - -## Behavior changes - -- **On-disk format unchanged**: existing v0.4.2 datasets open unchanged. - The Lance file format pin stays at V2_2 (required by Lance's blob v2 - feature). 
-- **`omnigraph-server` defaults to deny-all under `--policy`**: starting a - server with the policy feature enabled but no Cedar YAML policy - configured rejects every write. Operators must supply a policy file to - authorize anything. -- **Schema-lint diagnostics carry stable codes**: messages now lead with - `OG-XXX-NNN`. CI parsers or tooling that keyed off the v0.4.2 free-form - text need to switch to code-based matching. -- **Destructive schema drops require `--allow-data-loss`**: dropping a - property or type returns a structured diagnostic by default. - `omnigraph schema apply --allow-data-loss` (CLI) or - `{"allow_data_loss": true}` (HTTP) opts into Hard mode. -- **`HashJoinExec` null-aware semantics on anti-join**: a side effect of - the DataFusion 53 bump — `NOT IN` semantics under null-valued anti-join - columns are now correct per SQL standard. Queries that depended on the - prior behavior would have been incorrect. - -## Upgrade Notes - -### Migration - -- No data migration. v0.4.2 repos open directly on v0.5.0. - -### Clients - -- HTTP and SDK clients should switch any string-matching schema-lint - parsing to code-based matching against the `OG-XXX-NNN` prefix. -- Clients exercising destructive schema drops (`DropProperty`, `DropType`) - must add the `allow_data_loss` request field (HTTP) or - `--allow-data-loss` flag (CLI). Default is soft-drop-or-reject. -- Clients consuming `mutate_as` / `load_as` / `apply_schema_as` / branch - authoring APIs now flow through the policy enforcer. Anything bypassing - authorization on v0.4.2 will be rejected on v0.5.0 once a policy is - configured. - -### Operators - -- Configure a Cedar policy YAML for production servers before enabling - writes; deny-all is the new default. The `omnigraph policy validate` / - `test` / `explain` CLI commands are unchanged. 
-- Bearer tokens continue to be the actor-identity source; review the - signed-token-claim-only invariant in `docs/dev/invariants.md` if you've - built custom authentication. -- If your local CI uses RustFS for S3-compatible storage testing, our CI - pins `rustfs/rustfs:1.0.0-beta.3` (the last known-good tag before the - upstream credentials-policy change). Mirror the pin or set - `RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true` for the new image - versions. - -## Tests added or strengthened - -- `crates/omnigraph/tests/lance_surface_guards.rs` — 8 named guards pinning - Lance API surfaces against silent drift on future bumps. -- `crates/omnigraph/tests/policy_engine_chassis.rs` — engine-level policy - enforcement coverage; complements the existing HTTP policy tests. -- Policy chassis e2e gap-fills — branch-merge, branch-create, branch-delete - policy paths now have explicit end-to-end tests over HTTP and CLI. -- Merge-pair truth table — exhaustive op-variant matrix for three-way - merge across `noop`, `addNode`, `removeNode`, `addEdge`, `removeEdge`, - `setProperty`, `dropProperty`, `addLabel`, `removeLabel`; the build - fails to compile when a new op variant is added without dispositioning - every pairing. -- Merge-insert: regression for the dup-rowid bug class on the load surface - (`load_merge_repeated_against_overlapping_keys_succeeds`), the update - surface (`second_sequential_update_on_same_row_succeeds`), and the - upstream-Lance-gap canary - (`load_merge_window_2_documents_upstream_lance_gap`). -- Maintenance + destructive-migration coverage — `omnigraph optimize` / - `cleanup` boundary cases, plus schema-apply soft-drop and Hard-mode - paths. -- Stable-row-id preservation across `stage_overwrite` — pins the invariant - that staged overwrites carry stable row IDs through to the committed - fragment set. 
-- `CompOp::Contains` pushdown regression - (`ir_filter_with_list_contains_pushes_down`) — pins the new structured - Expr pushdown path that retired the in-memory fallback. - -## Included Changes - -- Lance 4 → 6.0.1, DataFusion 52 → 53, Arrow 57 → 58 substrate upgrade. -- `omnigraph-policy` crate with engine-wide Cedar enforcement and - signed-token-claim-only actor identity. -- Schema-lint v1 chassis with `OG-XXX-NNN` codes, soft `DropProperty` / - `DropType` semantics, and `--allow-data-loss` for Hard mode. -- HTTP `allow_data_loss` request field parity with the CLI flag. -- Structured DataFusion `Expr` filter pushdown via - `Scanner::filter_expr`, with `CompOp::Contains` lowered through - `array_has`. -- Inline `.gq` source acceptance on CLI and HTTP read/mutate endpoints. -- Optional CORS layer on `omnigraph-server` for browser UIs. -- Bug fixes: merge-insert dup-rowid (FirstSeen + uniqueness precondition), - branch-merge coordinator restore on error, blob-column materialization - during branch merge. -- New Lance API surface-guard test file as the canary for future Lance - bumps. -- Recovery-sidecar coverage extended across the four write paths - (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, - `ensure_indices`) with failpoint regression tests. -- CI: pinned `rustfs/rustfs:1.0.0-beta.3` after the upstream `:latest` - introduced a credentials-policy change. -- Version bump to `0.5.0` across workspace crates, `Cargo.lock`, - `openapi.json`, and the `AGENTS.md` surveyed version. diff --git a/docs/releases/v0.6.0.md b/docs/releases/v0.6.0.md deleted file mode 100644 index 7984056..0000000 --- a/docs/releases/v0.6.0.md +++ /dev/null @@ -1,141 +0,0 @@ -# Omnigraph v0.6.0 - -Three pieces of work land in this release: - -1. The **graph terminology rename** (renamed `Repo` → `Graph` across the Cedar resource model, policy API, and query-lint schema source). -2. 
**Multi-graph server mode** — one `omnigraph-server` process can now serve 1–10 graphs concurrently behind cluster routes (`/graphs/{graph_id}/...`), with per-graph and server-level Cedar policy, read-only `GET /graphs` enumeration, and CLI parity (`omnigraph graphs list`). -3. **Inline + canonical-named queries and mutations.** New `POST /query` and `POST /mutate` endpoints pair with the CLI's new `-e/--query-string` flag for ad-hoc execution without a temp file. `POST /read` and `POST /change` continue serving indefinitely as deprecated aliases that carry RFC 9745 `Deprecation: true` and RFC 8288 `Link: ; rel="successor-version"` response headers, plus `deprecated: true` in `openapi.json`. Same canonicalization on the CLI: `omnigraph query`, `omnigraph mutate`, and top-level `omnigraph lint` / `omnigraph check` replace `omnigraph read`, `omnigraph change`, and the nested `omnigraph query lint` / `omnigraph query check`. Every deprecated spelling remains a `visible_alias` that warns to stderr once per invocation. - -Runtime add/remove (`POST /graphs`, `DELETE /graphs/{id}`, `omnigraph graphs create`) is **not** in v0.6.0. Operators add or remove graphs by editing `omnigraph.yaml` and restarting. The first cut of `POST /graphs` shipped behind an atomic-YAML-rewrite design that we pulled before release once its concurrency guarantees were challenged (flock-on-renamed-inode race, duplicate-check outside the critical section, and an init-cleanup path that could destroy an existing graph's schema on re-init). The correct fix is a Lance-style cluster catalog (reserve → init → publish with recovery sidecars); that work is deferred. - -## Breaking Changes - -### Graph terminology rename - -- Renamed the Cedar resource entity from `Omnigraph::Repo` to `Omnigraph::Graph`. -- Renamed policy API terminology from `repo_id` to `graph_id` on `PolicyCompiler::compile` (and on the new `PolicyEngine::load_graph` / `PolicyEngine::load_server` loaders described below). 
-- Renamed query-lint schema source JSON from `"repo"` to `"graph"` for `schema_source.kind`. - -### Multi-graph server mode - -- **Multi-graph deployments lose flat routes.** Single-graph invocation (`omnigraph-server `) is unchanged — same flat `/snapshot`, `/read`, `/branches`, etc. Multi-graph deployments serve those routes under `/graphs/{graph_id}/...`; bare flat paths return 404 in multi mode. -- **`ServerConfig` shape change** (programmatic embedders only): `ServerConfig { uri, policy_file }` is replaced by `ServerConfig { mode: ServerConfigMode }`, where `ServerConfigMode = Single { uri, policy_file } | Multi { graphs, config_path, server_policy_file }`. Callers that use `load_server_settings` are unaffected; callers that construct `ServerConfig` directly need to wrap their fields in `ServerConfigMode::Single`. -- **`AppState`'s routing surface** is `AppState::routing() -> &GraphRouting`, where `GraphRouting = Single { handle } | Multi { registry, config_path }`. The previous `AppState::uri()`, `AppState::mode()`, `AppState::registry()` accessors and the `ServerMode` enum are gone — embedders read `state.routing()` and match on the arm they need. Per-graph URIs live on `handle.uri`. -- **`AppState::new_multi`** is the new multi-graph constructor. Single-mode `new_*` / `open_*` constructors are unchanged. -- **`AuthenticatedActor(Arc)` → `ResolvedActor { actor_id, tenant_id, scopes, source }`** (programmatic embedders only). The struct shape changes, but the HTTP contract — bearer auth and the bearer-derived-actor-identity guarantee — is unchanged. Cluster-mode call sites construct with `tenant_id: None`, `scopes: vec![Scope::Full]`, `source: AuthSource::Static`. The new fields are forward-compat seams for future multi-tenant and OAuth deployments; they're inert in this release. 
-- **`PolicyEngine::load(path, graph_id)` removed** in favor of two kind-typed loaders: `PolicyEngine::load_graph(path, graph_id)` for per-graph policies and `PolicyEngine::load_server(path)` for server-level policies. Each loader rejects rules whose action `resource_kind()` doesn't match the engine kind — operators who put a `graph_list` rule in a per-graph file (or a `read` rule in a server file) now get a load-time error instead of a silently-never-matching rule. -- **`PolicyRequest::actor_id` field removed.** Actor identity is now a separate parameter on `PolicyEngine::authorize(actor_id, &request)`. The type system enforces the server-authoritative-actor invariant: actor identity is always sourced from the bearer-token match resolved at the auth boundary; handlers cannot smuggle identity through the request body. -- **`Omnigraph::init` is strict by default.** Initialization at a URI that already holds schema files now errors with `OmniError::AlreadyInitialized` instead of silently overwriting. Operators who actually want to overwrite use `InitOptions { force: true }` (CLI: `omnigraph init --force`). Closes the destructive-cleanup footgun where a failed re-init would delete an existing graph's schema files. -- **Top-level `policy.file` is rejected in multi-graph server mode.** It remains valid for single-graph / CLI-local policy. Multi-graph deployments must move graph rules to `graphs..policy.file` and server-scoped `graph_list` rules to `server.policy.file`. -- **Open server startup requires explicit opt-in.** A server with no bearer tokens and no policy now refuses to start unless passed `--unauthenticated` or `OMNIGRAPH_UNAUTHENTICATED=1`. -- **Policy requires bearer tokens.** Configuring any policy file without bearer tokens now refuses startup; otherwise every protected request would 401 before Cedar could evaluate it. 
-- **Tokens without policy default-deny non-read actions.** Existing authenticated deployments that relied on writes or admin routes without Cedar policy must add policy rules for those actions. -- **`GET /graphs` requires `server.policy.file` in every runtime state.** Even `--unauthenticated` mode keeps server topology closed until the operator explicitly authorizes `graph_list`. - -### Query / mutation rename - -- **`ChangeRequest` field rename**: `query_source` → `query`, `query_name` → `name`. Both legacy names continue to deserialize via `#[serde(alias = "...")]`, so existing clients sending the old JSON keys keep working. CLI remote calls against `/change` still emit the legacy keys verbatim through the `legacy_change_request_body` helper so a newer CLI talking to an older server keeps working byte-for-byte. -- **CLI `omnigraph query lint` / `omnigraph query check`** are now top-level — canonical name is **`omnigraph lint`**. The three deprecated invocations (`omnigraph query lint`, `omnigraph query check`, and bare `omnigraph check`) remain as argv-level shims that rewrite to `omnigraph lint` and print a one-line stderr deprecation warning. `check` is deliberately **not** a clap `visible_alias` on `lint` — two equivalent canonical names would split agent emissions between them depending on training-data drift, so the deprecation pattern (rewrite + warn) gives one unambiguous canonical name in `omnigraph --help`. - -## New - -- **Multi-graph mode**. Invoke with `omnigraph-server --config omnigraph.yaml` where the YAML has a non-empty `graphs:` map and no single-mode selector (no `server.graph`, no CLI `` or `--target`). At startup the server opens every configured graph in parallel (bounded concurrency, fail-fast). -- **`GET /graphs`**. Lists every registered graph, sorted alphabetically by `graph_id`. Auth-required when bearer tokens are configured; Cedar-gated by `PolicyAction::GraphList` against `Omnigraph::Server::"root"`. Returns 405 in single mode. 
Server-scoped actions require an explicit `server.policy.file` in every runtime state — the management surface is closed by default even in `--unauthenticated` mode so that server topology is never exposed without operator opt-in. -- **CLI `omnigraph graphs list`**. Mirrors the HTTP surface. Rejects local URI targets with a clear message — for remote multi-graph servers only. -- **CLI `omnigraph init --force`**. Bypasses the strict-init preflight when an operator deliberately wants to recover from orphan schema files. Does NOT purge existing Lance datasets; recursive deletion needs `StorageAdapter::delete_prefix` (deferred — see below). -- **Per-graph Cedar policy**. Each entry in the `graphs:` map can carry a `policy.file` path, loaded at startup via `PolicyEngine::load_graph`. Cedar's `Omnigraph::Graph::""` resource is per-graph; the new `Omnigraph::Server::"root"` resource governs server-level actions. -- **Server-level Cedar policy**. `server.policy.file` in the config governs the `graph_list` action on `Omnigraph::Server::"root"`. Required to expose `GET /graphs` in every runtime state — without a server policy the default-deny posture rejects `graph_list`, including in `--unauthenticated` mode. -- **Cedar action vocabulary**: `graph_list` (server-scoped). Runtime `graph_create` / `graph_delete` are reserved but not shipped — see "Deferred." -- **Canonical graph URI identity.** Server startup normalizes graph root URIs before registry insertion and response output, so aliases such as `/tmp/g`, `/tmp/g/`, and `file:///tmp/g` cannot register as distinct graphs that actually share one Lance root. -- **`POST /query`** and **`POST /mutate`**. Canonical inline endpoints. `/query` rejects mutations with a typed 400 (the D2 rule lives at the URL — read-only contract enforced before execution); body uses the clean `{ query, name, params, branch, snapshot }` shape. `/mutate` accepts the same shape for mutations. 
Both available in single mode and per-graph multi mode (`/graphs/{id}/query`, `/graphs/{id}/mutate`). Internal call sites share two helpers (`run_query`, `run_mutate`) that take decoupled args, not request bodies — the seam MR-969's future stored-query handler plugs into. -- **CLI `omnigraph query` / `omnigraph mutate`** as top-level canonical subcommands. Pairs with new top-level **`omnigraph lint` (alias `check`)** so query validation no longer sits under `omnigraph query`. -- **CLI `-e, --query-string `** on both `omnigraph query` and `omnigraph mutate`. 3-way mutex with `--query ` and `--alias ` — exactly one is required. Empty string rejected. Suits ad-hoc exploration, REPL workflows, and agent tool-use without temp files. -- **Three-channel deprecation signal on `/read` and `/change`**: OpenAPI `deprecated: true` on the operation (every codegen flags the generated SDK method), RFC 9745 `Deprecation: true` response header, and RFC 8288 `Link: ; rel="successor-version"` (or ``) response header. Auto-discoverable; no SDK breakage. -- **`omnigraph.yaml` `aliases..command`** now accepts `query` and `mutate` as canonical values alongside the legacy `read` and `change`. The internal `AliasCommand` enum retains the legacy variant names so serialized configs stay byte-stable. - -## Configuration - -`omnigraph.yaml` schema additions (all optional, single-mode unaffected): - -```yaml -server: - bind: 0.0.0.0:8080 - policy: - file: ./server-policy.yaml # server-level Cedar (graph_list) - -graphs: - alpha: - uri: s3://tenant-bucket/alpha - policy: - file: ./policies/alpha.yaml # per-graph Cedar - beta: - uri: s3://tenant-bucket/beta - # no per-graph policy → engine-layer enforcement is a no-op -``` - -## Deferred - -- **`POST /graphs` runtime graph creation** and **CLI `omnigraph graphs create`**. Pulled before release after the YAML-rewrite design's correctness story didn't survive review. 
A future release will add a managed cluster catalog (Lance-backed reserve → init → publish with recovery sidecars) and re-expose runtime creation on top of it. Until then, operators add graphs by editing `omnigraph.yaml` and restarting. -- **`DELETE /graphs/{id}`**. Never shipped in v0.6.0; deferred with the same cluster-catalog work. -- **`StorageAdapter::delete_prefix`**. The substrate primitive a managed catalog would need. Will land alongside runtime mutation. -- **`omnigraph init --force` purging Lance state.** Today `--force` only bypasses the schema-file preflight; recursive deletion of existing Lance datasets needs `delete_prefix`. -- **`X-Actor-Id` service delegation forwarding**. Needs durable both-actor audit on `_graph_commits.lance` — out of scope. -- **Hot policy reload**. Restart is cheap at N≤10 graphs. - -## User Impact - -- **No on-disk migration is required.** Existing `.omni` graphs from v0.5.0 (and earlier) open cleanly under v0.6.0 — Lance datasets, `__manifest`, `_schema.pg`, `_schema.ir.json`, `__schema_state.json`, `_graph_commits.lance`, `_graph_commit_recoveries.lance` all use unchanged formats. No conversion step. -- **Existing single-graph storage upgrades without migration.** Server deployments may need auth/policy config changes: explicitly pass `--unauthenticated` for local open mode, configure tokens when using policy, and add Cedar policy for non-read authenticated actions. -- **Multi-graph adoption is opt-in.** Add a `graphs:` map to `omnigraph.yaml` (and remove `server.graph`) to switch a deployment to multi mode. -- **Cluster routes are breaking for client SDKs targeting multi mode.** Generated clients from previous v0.5.0 OpenAPI specs will hit 404 on flat paths against a multi-mode server. Regenerate against the v0.6.0 `openapi.json`. 
-- **Supported YAML policy authoring is unchanged.** The Cedar `Omnigraph::Graph` and `Omnigraph::Server` entities are internally generated by `compile_policy_source` — operator YAML only references actions and groups. -- **Operators with unsupported raw Cedar policy files** should update `Omnigraph::Repo` resource references to `Omnigraph::Graph`. -- **Endpoint and CLI rename is cosmetic on the client side.** Existing callers on `/read`, `/change`, `omnigraph read`, `omnigraph change`, and `omnigraph query lint` keep working — they pick up the `Deprecation` + `Link` headers (or stderr deprecation warning on the CLI) so SDKs and proxies can surface the successor name automatically. New integrations should target the canonical names. ChangeRequest field names migrate at the caller's pace — both `query_source`/`query_name` and `query`/`name` accepted indefinitely. - -## Migration: single → multi - -```yaml -# Before (v0.5.0 single-mode invocation) -server: - graph: my-graph -graphs: - my-graph: - uri: /var/lib/omnigraph/my-graph -policy: - file: ./policy.yaml -``` - -```yaml -# After (v0.6.0 multi-mode — drop `server.graph` and the top-level `policy`) -server: - policy: - file: ./server-policy.yaml # NEW: governs GET /graphs -graphs: - my-graph: - uri: /var/lib/omnigraph/my-graph - policy: - file: ./policy.yaml # MOVED: was top-level -``` - -Same `omnigraph.yaml` file; restart the server. Clients targeting the old flat routes (`/snapshot`, `/read`, …) must update to `/graphs/my-graph/snapshot`, etc. - -To add a new graph after rollout: stop the server, append a new `graphs.` entry, restart. - -## Documentation - -- Public docs, CLI help, examples, server docs, and test helpers now consistently use "graph" for the OmniGraph data artifact. -- GitHub/source repository terminology remains spelled out as "repository" where needed. 
-- New: `docs/user/cli.md` documents `omnigraph graphs list`; `docs/user/server.md` documents the multi-graph mode and the cluster route convention; `docs/user/policy.md` documents the per-graph vs server-scoped action distinction. -- New: `docs/user/server.md` documents `POST /query` / `POST /mutate` and the three-channel deprecation signal on `/read` / `/change`. `docs/user/cli.md` documents the `-e/--query-string` flag with examples. `docs/user/cli-reference.md` shows the canonical CLI verbs (`query`, `mutate`, `lint`, `check`) with legacy spellings as visible aliases. -- New: `docs/dev/rfc-001-queries-envelope-mcp.md` is the cross-cutting design doc for the inline / stored query work that started landing in this release. It sequences the v0.6.x patch series (request/response envelope hardening) and the v0.7.0 stored-query + MCP work. - -## Test coverage - -- `GraphId` newtype validation, registry race tests, init failpoints (still reachable from `omnigraph init` CLI). -- Mode-inference four-rule matrix, parallel multi-graph startup, cluster routing. -- Cedar `Server` resource refactor, backwards-compat for graph-only policies, kind-alignment rejection (server actions in graph files / vice versa). -- `GET /graphs` enumeration, 405-in-single-mode, 403-in-Open-mode-without-server-policy, Cedar admin/viewer authorization. -- Cluster routes with inner path params (`/branches/{branch}`, `/commits/{commit_id}`) deserialize correctly under axum 0.8 nested routing. -- Policy-requires-tokens startup invariant enforced uniformly across single and multi mode. -- The bearer-auth-derived-actor-identity regression test (client-supplied identity headers are ignored; the server-resolved actor is the only identity Cedar sees) stays green across the entire refactor. 
- diff --git a/docs/releases/v0.6.1.md b/docs/releases/v0.6.1.md deleted file mode 100644 index eb76e1f..0000000 --- a/docs/releases/v0.6.1.md +++ /dev/null @@ -1,28 +0,0 @@ -# Omnigraph v0.6.1 - -v0.6.1 focuses on operational polish after v0.6.0: stored-query registries, safer branch cleanup, more complete release artifacts, and a Lance blob-compaction workaround. - -## Highlights - -- **Stored-query registries.** `omnigraph.yaml` can declare curated `queries:` blocks per graph. Servers load and type-check them at startup, `omnigraph queries validate` checks them offline, `omnigraph queries list` shows exposed queries and typed params, `GET /queries` exposes a typed catalog, and `POST /queries/{name}` invokes a stored query without accepting ad hoc `.gq` source from the client. -- **Stored-query policy gate.** New Cedar action `invoke_query` gates the stored-query invocation surface. Stored mutations are double-gated: `invoke_query` to reach the stored query and `change` for the actual write. -- **Safer branch deletion.** `branch_delete` now treats the manifest as the authority, flips branch visibility atomically, and reclaims per-table/commit-graph forks as derived state. If best-effort reclaim is interrupted, `cleanup` reconciles orphaned forks; reusing a branch name before cleanup reports an actionable error. -- **Legacy `__run__` cleanup (MR-770).** *(Correction: this item shipped in [v0.6.2](v0.6.2.md), not v0.6.1 — the v0.6.1 notes over-claimed it. At the v0.6.1 tag the `__run__` branch-name guard and `run_registry.rs` were still present and no v2→v3 sweep migration existed.)* The guard removal and the one-time v2→v3 `__manifest` migration that sweeps stale `__run__*` staging branches on first read-write open are described in the v0.6.2 release notes. -- **Blob-safe optimize.** `omnigraph optimize` skips tables with `Blob` properties instead of failing the whole sweep on Lance's blob-v2 compaction decode bug. 
Skips are visible in human output, `--json` as `skipped`, `TableOptimizeStats.skipped`, and logs; non-blob tables still compact normally. -- **Deployment improvements.** The container entrypoint now composes `OMNIGRAPH_TARGET_URI` with `OMNIGRAPH_CONFIG`, so operators can keep the graph URI in env while loading policy/query config from a mounted file. The local RustFS bootstrap pins RustFS beta.3 and allows the current insecure local-dev default credentials. -- **Windows release support.** Tagged and edge releases now publish Windows x86_64 archives containing `omnigraph.exe` and `omnigraph-server.exe`, with a PowerShell installer and Windows install docs. -- **Release tooling.** Homebrew formula generation was tightened to produce audit-clean formulas. - -## Compatibility Notes - -- A graph selected by name (`--target` or `server.graph`) now uses `graphs..policy` and `graphs..queries`. Top-level `policy` / `queries` blocks are only for anonymous bare-URI single-graph mode; using them with a named graph now fails loudly with migration guidance. -- `mcp.expose` defaults to `true` for stored-query registry entries. Set `mcp: { expose: false }` for service-only queries that should not appear in the catalog. -- `invoke_query` is graph-scoped, not branch-scoped. Branch/snapshot access remains enforced by the inner `read` / `change` gate. -- **Legacy `__run__` migration.** *(Correction: deferred to [v0.6.2](v0.6.2.md).)* The automatic v2→v3 `__manifest` stamp migration that sweeps stale `__run__*` branches on first read-write open ships in v0.6.2, not v0.6.1; a v0.6.1 binary does not perform it. See the v0.6.2 notes for the migration behavior and the read-only caveat. -- Blob tables are not compacted until the upstream Lance fix lands, so fragment count and deleted-row space on blob tables are not reclaimed by `optimize`. Reads, writes, and query results are unaffected; no on-disk migration is required. 
-- `TableOptimizeStats` is now `#[non_exhaustive]` and gains a `skipped: Option` field (so does the new `SkipReason` enum). This is a source-level change only for downstream code that built this returned result struct by literal — rare, since it is produced by `optimize` and consumed by reading its fields; field access is unaffected, and `#[non_exhaustive]` keeps future additions non-breaking. - -## Docs And Cleanup - -- Public docs were updated for stored queries, policy, server routes, deployment, Windows installation, branch deletion, maintenance, and the `runs` docs rename to `writes`. -- README copy and release documentation were refreshed; older release notes had small typo/wording fixes. diff --git a/docs/releases/v0.6.2.md b/docs/releases/v0.6.2.md deleted file mode 100644 index f97f67b..0000000 --- a/docs/releases/v0.6.2.md +++ /dev/null @@ -1,69 +0,0 @@ -# Omnigraph v0.6.2 - -v0.6.2 is a maintenance-safety release on top of v0.6.1. It tightens the -`optimize` / recovery boundary, adds an explicit repair path for uncovered -manifest/head drift, completes the legacy `__run__` branch cleanup (MR-770), -accepts pretty-printed JSON load input, and updates the project governance and -release automation around those fixes. - -## Highlights - -- **Explicit `omnigraph repair`.** New `repair` CLI support previews uncovered - manifest/head drift by default and reports each table's classification, - action, manifest version, Lance HEAD version, Lance operations, and any - classification error. `--confirm` publishes verified maintenance-only drift; - `--force --confirm` can publish suspicious or unverifiable drift after - operator review. -- **Optimize skips uncovered drift.** `omnigraph optimize` now refuses to - interpret Lance HEAD movement that is ahead of `__manifest` without a recovery - sidecar. Those tables are reported as `skipped: DriftNeedsRepair` and left - untouched until `omnigraph repair` classifies them. 
-- **Optimize publishes compaction.** Successful compaction now publishes the - compacted Lance version back through the graph manifest and is covered by an - `Optimize` recovery sidecar. A crash after Lance compaction but before - manifest publish converges through the normal recovery sweep instead of - leaving hidden drift. -- **Recovery roll-back convergence.** Recovery roll-back now aligns the - manifest-visible version after restoring a table, closing the residual where - Lance HEAD and `__manifest` could stay out of sync after recovery. -- **Legacy `__run__` branch cleanup (MR-770).** Completes the retirement of the - Run state machine (removed in v0.4.0). A one-time v2→v3 `__manifest` - internal-schema migration runs on the first read-write open and deletes any - stale `__run__*` staging branches left by pre-v0.4.0 graphs — they previously - leaked into `branch list` and counted as blocking branches at `schema apply` - time. The migration is idempotent, and the `is_internal_run_branch` guard - (and `run_registry.rs`) is retired now that `__run__*` is an ordinary branch - name. (The earlier v0.6.1 notes described this as shipped in v0.6.1; it - actually landed here in v0.6.2.) -- **Pretty-printed JSON load input.** `load` accepts multi-line JSON objects in - addition to one-object-per-line JSONL, so formatted fixture or export files no - longer need to be minified before import. - -## Operational Notes - -- `repair` requires a clean recovery state. Pending `__recovery` sidecars still - belong to automatic open-time recovery; reopen the graph first, then run - repair if drift remains. -- `repair --confirm` only auto-publishes drift made of Lance maintenance - operations (`Rewrite` and `ReserveFragments`). Semantic operations such as - append, delete, update, and merge are refused unless the operator uses - `--force --confirm`. -- `optimize` remains non-destructive. 
It still skips blob-bearing tables while - OmniGraph is pinned to the Lance version with the blob-v2 compaction issue. -- No manual on-disk migration is required. Existing graphs open under v0.6.2. - Graphs already at internal manifest schema stamp v3 are unchanged; graphs - created before v0.4.0 that still carry the v2 stamp auto-migrate v2→v3 on the - first **read-write** open (the `__run__*` sweep above). The migration is - write-path-only, so a long-lived **read-only** deployment still lists any - stale `__run__*` branch until it is next opened read-write. - -## Docs, Governance, And CI - -- Added issue, discussion, RFC, and pull-request templates plus governance docs - for the external contribution path. -- Regenerated CODEOWNERS tables and adjusted branch-protection docs so code - owners can bypass required PR review where repository rules allow it. -- Trimmed Windows release builds out of per-PR CI and kept Windows packaging on - tag releases. -- Made Homebrew audit diagnostic-only in the release workflow so a flaky audit - cannot block publishing an otherwise valid formula update. diff --git a/docs/user/cli/index.md b/docs/user/cli/index.md index 6f49c42..b00d42b 100644 --- a/docs/user/cli/index.md +++ b/docs/user/cli/index.md @@ -14,7 +14,7 @@ omnigraph mutate insert_person --params '{"name":"Mina","age":28}' `omnigraph query` is the canonical read command (pairs with `POST /query`); `omnigraph mutate` is the canonical write command (pairs with `POST /mutate`). The positional argument is the **stored-query name**, invoked from the served -catalog (RFC-011 D3) — the graph is addressed by scope (`--server` / `--profile` +catalog — the graph is addressed by scope (`--server` / `--profile` / defaults), and the verb asserts the query's kind (`query` rejects a stored mutation, and vice-versa). 
The previous names `omnigraph read` and `omnigraph change` keep working as visible aliases — invocations emit a one-line diff --git a/docs/user/cli/reference.md b/docs/user/cli/reference.md index 1d52e45..9d83ead 100644 --- a/docs/user/cli/reference.md +++ b/docs/user/cli/reference.md @@ -2,7 +2,7 @@ A reference for the `omnigraph` binary's command surface and the per-operator `~/.omnigraph/config.yaml` schema. For a quick-start guide, see [cli.md](index.md). -Top-level command families and subcommands. Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server ` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph ` for multi-graph servers; exclusive with a positional URI), `--store ` (a single graph's storage directly), or `--profile ` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles-rfc-011)); `cluster` commands use `--config `, while `policy` and `queries` read a cluster's applied state via `--cluster `. A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected. **`query`/`mutate` are the exception**: their positional is a stored-query *name* (RFC-011 D3), not a graph URI, so they address the graph only via `--store`/`--server`/`--profile`/defaults. +Top-level command families and subcommands. Graph-targeting commands accept a positional `file://`/`s3://` URI, `--server ` (an operator-defined server from `~/.omnigraph/config.yaml` by name, or a literal `http(s)://` URL, optionally with `--graph ` for multi-graph servers; exclusive with a positional URI), `--store ` (a single graph's storage directly), or `--profile ` / `$OMNIGRAPH_PROFILE` (a named scope bundle; see [Scopes & profiles](#scopes--profiles)); `cluster` commands use `--config `, while `policy` and `queries` read a cluster's applied state via `--cluster `. 
A remote server is addressed only with `--server` — a positional `http(s)://` URI is rejected. **`query`/`mutate` are the exception**: their positional is a stored-query *name*, not a graph URI, so they address the graph only via `--store`/`--server`/`--profile`/defaults. ## Top-level commands @@ -13,14 +13,14 @@ Top-level command families and subcommands. Graph-targeting commands accept a po | `ingest` | deprecated alias of `load --from ` (defaults: `--from main --mode merge`); prints a one-line warning to stderr | | `query ` (alias: `read`) | run a read query. **Catalog lane** (default): `` is a stored query invoked **by name** from the served catalog (served-only — address with `--server`/`--profile`; the verb asserts the query is a read). **Ad-hoc lane**: with `--query ` or `-e`/`--query-string `, runs that source (the positional `` then selects which query in it). No positional graph URI — address via `--store`/`--server`/`--profile`. `read` is the deprecated previous name (one-line stderr warning) | | `mutate ` (alias: `change`) | run a mutation query; same catalog (by-name, served-only, verb asserts mutation) / ad-hoc (`--query`/`-e`) lanes as `query`. 
`change` is the deprecated previous name (one-line stderr warning) | -| `alias [args]` | invoke an operator alias — a read-only personal binding (under `aliases:` in `~/.omnigraph/config.yaml`) to a stored query on a named server (RFC-011 D4; replaces the removed `--alias` flag; stored mutations are rejected before execution) | +| `alias [args]` | invoke an operator alias — a read-only personal binding (under `aliases:` in `~/.omnigraph/config.yaml`) to a stored query on a named server (replaces the removed `--alias` flag; stored mutations are rejected before execution) | | `snapshot` | print current snapshot (per-table version + row count) | | `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) | | `branch create \| list \| delete \| merge` | branching ops | | `commit list \| show` | inspect commit graph | | `schema plan \| apply \| show (alias: get)` | migrations. `apply` refuses a cluster-managed graph (one whose storage is inside a cluster) and points at `cluster apply` — those graphs evolve through the cluster ledger, not a direct apply | | `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` | -| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. 
`validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve --as ` (`apply`/`approve` default the actor from `~/.omnigraph/config.yaml`'s `operator.actor` when `--as` is omitted); what apply converges is what an `omnigraph-server --cluster ` deployment serves on its next restart (`--cluster` is the server's only boot source — RFC-011 cluster-only); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock ` manually removes a held local state lock by exact id | +| `cluster validate \| plan \| apply \| approve \| status \| refresh \| import \| force-unlock` | declarative cluster control plane. 
`validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`, annotates dispositions, and embeds real schema-migration previews; `apply` converges the cluster — stored-query/policy catalog writes (content-addressed under `__cluster/resources/`), graph creates, schema updates (soft drops only; `--as` records the actor), and graph deletes behind a digest-bound approval from `cluster approve --as ` (`apply`/`approve` default the actor from `~/.omnigraph/config.yaml`'s `operator.actor` when `--as` is omitted); what apply converges is what an `omnigraph-server --cluster ` deployment serves on its next restart (`--cluster` is the server's only boot source — cluster-only); `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations; `force-unlock ` manually removes a held local state lock by exact id | | `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) | | `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. 
`--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review | | `cleanup --keep N --older-than 7d --confirm` | destructive version GC (`--confirm` to execute; also needs `--yes` against a non-local `s3://` target — see *Write diagnostics & destructive confirmation*) | @@ -52,7 +52,7 @@ To maintain a server-backed graph, run the `direct` verbs from a host with stora ## Write diagnostics & destructive confirmation -Two global flags make writes self-documenting and guard the dangerous ones (RFC-011 Decision 9): +Two global flags make writes self-documenting and guard the dangerous ones: - **Every write echoes its resolved target to stderr** — `omnigraph load → s3://acme/brain/graphs/knowledge.omni (direct, remote)` — so you catch a scope that resolved somewhere unexpected (e.g. *prod*) before it lands. Applies to `load`, `ingest`, `mutate`, `branch create|delete|merge`, `schema apply`, `optimize`, `repair`, `cleanup`. The line is stderr, so `--json` consumers reading stdout are unaffected; suppress it with **`--quiet`**. - **Destructive writes against a non-local scope require confirmation.** `cleanup`, overwrite `load` (`--mode overwrite`), and `branch delete` proceed freely against a local (`file://`) graph, but when the resolved target is **not local** (a served `http(s)://` graph or an `s3://` store/cluster) they require explicit consent: pass **`--yes`** to confirm, an interactive terminal is prompted, and a non-interactive run (no TTY, or `--json`) **refuses with an error** rather than silently destroying. `cleanup` still also requires its existing `--confirm` (preview→execute); `--yes` is the additional non-local consent. 
@@ -79,15 +79,15 @@ servers: # operator-owned endpoints; names key the credentials url: https://graph.example.com # no tokens in this file, ever defaults: output: table # read format default, below --json/--format/alias - server: prod # the everyday SERVED scope when no address is given (RFC-011) + server: prod # the everyday SERVED scope when no address is given # store: file:///data/dev.omni # OR a zero-flag LOCAL default (mutually # # exclusive with `server`); the local-dev # # counterpart of `server` default_graph: knowledge # graph selected in a server/cluster scope -clusters: # admin-only: managed-cluster storage roots (RFC-011). +clusters: # admin-only: managed-cluster storage roots. brain: # the ONLY place a storage root lives in this file. root: s3://acme/clusters/brain -profiles: # named scope bundles (RFC-011); pick with --profile +profiles: # named scope bundles; pick with --profile staging: { server: staging, default_graph: knowledge } # a served scope brain-admin: { cluster: brain, default_graph: knowledge } # a direct cluster scope ``` @@ -96,7 +96,7 @@ Absent file = empty layer. Unknown keys warn and load (a file written for a newer CLI works on an older one). Override the config directory with `$OMNIGRAPH_HOME`. -#### Scopes & profiles (RFC-011) +#### Scopes & profiles A command resolves a **scope** — a server, a cluster, or a store — then selects a graph in it; the served-vs-direct access path is derived from the scope, not @@ -116,14 +116,15 @@ sticky "current" mode. Inspect what is defined with `omnigraph profile list` and `--cluster --graph `. A `--graph` flag overrides the profile's default. - A `server`-bound scope on a maintenance verb, or a `cluster`-bound scope on a data verb, is rejected with a message pointing at the right addressing. 
-- **No graph selected (RFC-011 D7).** When a scope has no `--graph` and no +- **No graph selected.** When a scope has no `--graph` and no `default_graph`, the CLI never silently picks: - **Cluster scope** — exactly **one** applied graph is used automatically; **several** errors and lists the candidates (from the served catalog). - - **Server scope** — a multi-graph server (any non-empty `GET /graphs`, even a - single entry) errors and lists the candidates: you must pass `--graph `. - A single-graph / flat server (405 on `/graphs`), or one whose `/graphs` is - policy-gated or unreachable, uses its bare URL as before. + - **Server scope** — an `omnigraph-server` is always cluster-backed, so its + `GET /graphs` lists the graphs and you must pass `--graph ` (the CLI + lists the candidates if you omit it). It falls back to the bare URL only + when `/graphs` is unavailable: policy-gated, unreachable, or a + non-`omnigraph` endpoint. `--target`, `--cluster-graph`, and the positional-`http(s)://`→remote dispatch have been **removed** (`--graph` is now the one graph selector across server and @@ -158,7 +159,7 @@ aliases: `omnigraph alias triage 2026-06-01` invokes `POST /graphs/spike/queries/weekly_triage` with the keyed -credential. Aliases live in their own `alias` namespace (RFC-011 Decision 4), +credential. Aliases live in their own `alias` namespace, so an alias can never shadow — or be shadowed by — a built-in verb. (The old `--alias ` flag on `query`/`mutate` was removed.) 
diff --git a/docs/user/clusters/config.md b/docs/user/clusters/config.md index 8f8caf4..cd4d772 100644 --- a/docs/user/clusters/config.md +++ b/docs/user/clusters/config.md @@ -390,7 +390,7 @@ omnigraph-server --cluster company-brain --bind 0.0.0.0:8080 ``` `--cluster ` is an **exclusive boot source** (axiom 15): it cannot -combine with a graph URI, `--target`, or `--config`, and in this mode +combine with a graph URI or `--config`, and in this mode `omnigraph.yaml` is never read — not for graphs, not for queries, not for policies. The server serves the **applied revision**: graph roots recorded in `state.json`, stored-query and policy content from the content-addressed diff --git a/docs/user/clusters/index.md b/docs/user/clusters/index.md index c59ff9d..0c2e7d7 100644 --- a/docs/user/clusters/index.md +++ b/docs/user/clusters/index.md @@ -91,7 +91,7 @@ only the URI and credentials, no checkout of the config repo. The ledger and catalog on the bucket are the deployment artifact. `--cluster` is an **exclusive boot source**: it cannot be combined with a -graph URI, `--target`, or `--config`, and `omnigraph.yaml` is never read in +graph URI or `--config`, and `omnigraph.yaml` is never read in this mode. Routing is always multi-graph: ```bash @@ -273,7 +273,7 @@ a cluster are created by `cluster apply`, not by hand. If the cluster has exactly **one** applied graph you can omit `--graph` — it is used automatically. With **several**, omitting `--graph` errors and lists the -candidates (RFC-011 D7); it never picks one for you. +candidates; it never picks one for you. 
Against an **`s3://`-backed cluster** the resolved graph storage is non-local, so a destructive `cleanup` additionally requires **`--yes`** (an interactive prompt diff --git a/docs/user/operations/maintenance.md b/docs/user/operations/maintenance.md index 161e5d6..e2a88eb 100644 --- a/docs/user/operations/maintenance.md +++ b/docs/user/operations/maintenance.md @@ -35,7 +35,7 @@ backstop, so it does as much as it can and converges on re-run. The CLI reports any failed tables; rerun `cleanup` to retry them. - CLI guards with `--confirm`; without it, prints a preview line. -- **Non-local consent (RFC-011 D9).** Against a non-local target (an `s3://` store/cluster), `cleanup` additionally requires `--yes` on top of `--confirm`: a TTY is prompted, and a non-interactive run (no TTY, or `--json`) refuses rather than destroying. A local (`file://`) target needs only `--confirm`. The same `--yes` gate applies to overwrite `load` and `branch delete`; every maintenance run echoes its resolved target to stderr (suppress with `--quiet`). +- **Non-local consent.** Against a non-local target (an `s3://` store/cluster), `cleanup` additionally requires `--yes` on top of `--confirm`: a TTY is prompted, and a non-interactive run (no TTY, or `--json`) refuses rather than destroying. A local (`file://`) target needs only `--confirm`. The same `--yes` gate applies to overwrite `load` and `branch delete`; every maintenance run echoes its resolved target to stderr (suppress with `--quiet`). - **Recovery floor:** `--keep < 3` may garbage-collect versions that crash recovery needs as a rollback target. Default `--keep 10` is safe. - **Orphaned-branch reconciliation:** before the version GC, cleanup reclaims any per-table or commit-graph branch absent from the manifest branch list. These orphans arise when a `branch_delete` flips the manifest authority but a downstream best-effort reclaim does not complete (see [branches-commits.md](../branching/index.md)). 
The reconciler is idempotent (it no-ops once nothing is orphaned), runs regardless of the `keep_versions` / `older_than` values (those gate version GC only), and never reclaims `main` or system-branch forks. Reclaimed forks are logged. diff --git a/docs/user/operations/policy.md b/docs/user/operations/policy.md index c6096d0..54fbea5 100644 --- a/docs/user/operations/policy.md +++ b/docs/user/operations/policy.md @@ -78,7 +78,7 @@ The default actor identity for CLI direct-engine (`--store`) writes is `operator.actor` in `~/.omnigraph/config.yaml`. Override per-invocation with `--as ` — `--as` wins, otherwise `operator.actor`, otherwise no actor. Remote HTTP writes ignore both — they resolve their actor server-side from the -bearer token. (Direct-store access carries no Cedar policy under RFC-011; policy +bearer token. (Direct-store access carries no Cedar policy; policy lives in the cluster/server.) ## CLI diff --git a/docs/user/operations/server.md b/docs/user/operations/server.md index bd14e1e..ced9d0d 100644 --- a/docs/user/operations/server.md +++ b/docs/user/operations/server.md @@ -1,6 +1,6 @@ # HTTP Server (`omnigraph-server`) -Axum 0.8 + tokio + utoipa-generated OpenAPI. **Cluster-only boot** (RFC-011): the server always boots from a cluster (`--cluster `) and serves N graphs (N ≥ 1) under cluster routes. There is no longer a single-graph flat-route mode, no positional `` boot, no `--target`, and no `omnigraph.yaml`-`graphs:`-map boot. All HTTP is nested under `/graphs/{graph_id}/...`; `/healthz` and the management `/graphs` enumeration stay flat. +Axum 0.8 + tokio + utoipa-generated OpenAPI. **Cluster-only boot**: the server always boots from a cluster (`--cluster `) and serves N graphs (N ≥ 1) under cluster routes. There is no longer a single-graph flat-route mode, no positional `` boot, no `--target`, and no `omnigraph.yaml`-`graphs:`-map boot. 
All HTTP is nested under `/graphs/{graph_id}/...`; `/healthz` and the management `/graphs` enumeration stay flat. ## Boot diff --git a/docs/user/search/embeddings.md b/docs/user/search/embeddings.md index e69d928..11f3540 100644 --- a/docs/user/search/embeddings.md +++ b/docs/user/search/embeddings.md @@ -42,7 +42,7 @@ boots from the applied cluster ledger, so `cluster validate`, `plan`, and needs no key. Vector dimensions stay schema-driven by the target `Vector(N)` column. -Direct single-graph serving, embedded callers, and the offline +Direct (`--store`) access, embedded callers, and the offline `omnigraph embed` pipeline use environment configuration unless they inject an `EmbeddingConfig` directly.