mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-15 01:55:13 +02:00
docs(user): restructure user docs into topic sections (Phase 1) (#223)
Move the 23 flat docs/user/*.md files into topic subdirectories so the user guide is organized by area (schema, queries, search, branching, cli, operations, clusters, concepts, reference) instead of a flat list. This is a pure structural move — whole files relocated, every cross-doc link recomputed, no prose rewrites or content splits (those follow in Phase 2). - 19 `git mv`s (install.md, deployment.md stay top-level); history preserved (renames detected at 92–100% similarity). - All intra-doc links, AGENTS.md's topic table (52 pointers), and the docs/dev + docs/releases back-links recomputed via relpath from each file's new location. - docs/user/index.md rewritten as a sectioned nav hub. - Fixed 5 doc-path references in Rust (comments + two user-facing server settings error strings) to point at the new locations. Verified: zero broken .md links across tracked docs; check-agents-md.sh green (with the untracked scratch docs set aside); touched crates build. Note: the public site (omnigraph-web) imports docs/ via a flat-only script; its import-docs.mjs needs a subdir-aware update before the next re-sync. Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
parent
8726ca92ec
commit
d46e50dd6d
33 changed files with 126 additions and 109 deletions
40
AGENTS.md
40
AGENTS.md
|
|
@ -73,32 +73,32 @@ Full diagram and concurrency model: [docs/dev/architecture.md](docs/dev/architec
|
|||
| **Lance docs index — fetch upstream Lance docs by problem domain** | **[docs/dev/lance.md](docs/dev/lance.md)** |
|
||||
| **Test coverage map — what's covered, what helpers to reuse, before-every-task checklist** | **[docs/dev/testing.md](docs/dev/testing.md)** |
|
||||
| Architecture, L1/L2 framing, concurrency model | [docs/dev/architecture.md](docs/dev/architecture.md) |
|
||||
| Storage layout, `__manifest` schema, URI schemes, S3 env vars | [docs/user/storage.md](docs/user/storage.md) |
|
||||
| `.pg` schema language, types, constraints, annotations, migration planning | [docs/user/schema-language.md](docs/user/schema-language.md) |
|
||||
| Schema-lint codes (`OG-XXX-NNN`), families, severity, suppression | [docs/user/schema-lint.md](docs/user/schema-lint.md) |
|
||||
| `.gq` query language, MATCH/RETURN/ORDER, search funcs, mutations, IR ops, lint codes | [docs/user/query-language.md](docs/user/query-language.md) |
|
||||
| Indexes (BTREE / inverted / vector / graph topology) | [docs/user/indexes.md](docs/user/indexes.md) |
|
||||
| Embeddings (compiler + engine clients, env vars, `@embed`) | [docs/user/embeddings.md](docs/user/embeddings.md) |
|
||||
| Branches, commit graph, snapshots, system branches | [docs/user/branches-commits.md](docs/user/branches-commits.md) |
|
||||
| Transactions and atomicity (per-query atomic; branches as multi-query transactions) | [docs/user/transactions.md](docs/user/transactions.md) |
|
||||
| Storage layout, `__manifest` schema, URI schemes, S3 env vars | [docs/user/storage.md](docs/user/concepts/storage.md) |
|
||||
| `.pg` schema language, types, constraints, annotations, migration planning | [docs/user/schema-language.md](docs/user/schema/index.md) |
|
||||
| Schema-lint codes (`OG-XXX-NNN`), families, severity, suppression | [docs/user/schema-lint.md](docs/user/schema/lint.md) |
|
||||
| `.gq` query language, MATCH/RETURN/ORDER, search funcs, mutations, IR ops, lint codes | [docs/user/query-language.md](docs/user/queries/index.md) |
|
||||
| Indexes (BTREE / inverted / vector / graph topology) | [docs/user/indexes.md](docs/user/search/indexes.md) |
|
||||
| Embeddings (compiler + engine clients, env vars, `@embed`) | [docs/user/embeddings.md](docs/user/search/embeddings.md) |
|
||||
| Branches, commit graph, snapshots, system branches | [docs/user/branches-commits.md](docs/user/branching/index.md) |
|
||||
| Transactions and atomicity (per-query atomic; branches as multi-query transactions) | [docs/user/transactions.md](docs/user/branching/transactions.md) |
|
||||
| Direct-publish write path (staging, D2, recovery sidecars; the former Run state machine) | [docs/dev/writes.md](docs/dev/writes.md) |
|
||||
| Three-way merge and conflict kinds | [docs/dev/merge.md](docs/dev/merge.md) |
|
||||
| Diff / change feed (`diff_between`, `diff_commits`) | [docs/user/changes.md](docs/user/changes.md) |
|
||||
| Diff / change feed (`diff_between`, `diff_commits`) | [docs/user/changes.md](docs/user/branching/changes.md) |
|
||||
| Query execution, mutation execution, bulk loader, `load` vs `ingest` | [docs/dev/execution.md](docs/dev/execution.md) |
|
||||
| `optimize` (compaction) and `cleanup` (version GC) | [docs/user/maintenance.md](docs/user/maintenance.md) |
|
||||
| Cluster operator guide (deploy/manage clusters, approvals, recovery, serving) | [docs/user/cluster.md](docs/user/cluster.md) |
|
||||
| Cedar policy actions, scopes, CLI | [docs/user/policy.md](docs/user/policy.md) |
|
||||
| HTTP server endpoints, auth, error model, body limits | [docs/user/server.md](docs/user/server.md) |
|
||||
| CLI quick-start | [docs/user/cli.md](docs/user/cli.md) |
|
||||
| CLI command surface and config schemas (`~/.omnigraph/config.yaml`, legacy `omnigraph.yaml`) | [docs/user/cli-reference.md](docs/user/cli-reference.md) |
|
||||
| Audit / actor tracking | [docs/user/audit.md](docs/user/audit.md) |
|
||||
| Error taxonomy and result serialization | [docs/user/errors.md](docs/user/errors.md) |
|
||||
| `optimize` (compaction) and `cleanup` (version GC) | [docs/user/maintenance.md](docs/user/operations/maintenance.md) |
|
||||
| Cluster operator guide (deploy/manage clusters, approvals, recovery, serving) | [docs/user/cluster.md](docs/user/clusters/index.md) |
|
||||
| Cedar policy actions, scopes, CLI | [docs/user/policy.md](docs/user/operations/policy.md) |
|
||||
| HTTP server endpoints, auth, error model, body limits | [docs/user/server.md](docs/user/operations/server.md) |
|
||||
| CLI quick-start | [docs/user/cli.md](docs/user/cli/index.md) |
|
||||
| CLI command surface and config schemas (`~/.omnigraph/config.yaml`, legacy `omnigraph.yaml`) | [docs/user/cli-reference.md](docs/user/cli/reference.md) |
|
||||
| Audit / actor tracking | [docs/user/audit.md](docs/user/operations/audit.md) |
|
||||
| Error taxonomy and result serialization | [docs/user/errors.md](docs/user/operations/errors.md) |
|
||||
| Install (binary / Homebrew / source / channels) | [docs/user/install.md](docs/user/install.md) |
|
||||
| Deployment (binary / container / RustFS bootstrap / auth / build variants) | [docs/user/deployment.md](docs/user/deployment.md) |
|
||||
| CI / release workflows | [docs/dev/ci.md](docs/dev/ci.md) |
|
||||
| Code ownership (CODEOWNERS source of truth, roles, regeneration) | [docs/dev/codeowners.md](docs/dev/codeowners.md) |
|
||||
| Branch protection policy (declarative, applied via `scripts/apply-branch-protection.sh`) | [docs/dev/branch-protection.md](docs/dev/branch-protection.md) |
|
||||
| Constants & tunables cheat sheet | [docs/user/constants.md](docs/user/constants.md) |
|
||||
| Constants & tunables cheat sheet | [docs/user/constants.md](docs/user/reference/constants.md) |
|
||||
| Per-version release notes | [docs/releases/](docs/releases/) |
|
||||
|
||||
---
|
||||
|
|
@ -257,7 +257,7 @@ omnigraph policy explain --actor act-alice --action change --branch main
|
|||
| Per-query atomic writes | — | In-memory `MutationStaging.pending` accumulator + `stage_*` / `commit_staged` per touched table at end-of-query + publisher CAS via `commit_with_expected` (single manifest commit per `mutate_as` / `load`); D₂ parse-time rule keeps inserts/updates and deletes from mixing |
|
||||
| Three-way row-level merge | — | `OrderedTableCursor` + `StagedTableWriter`, structured `MergeConflictKind` |
|
||||
| Change feeds | — | `diff_between` / `diff_commits` with manifest fast path + ID streaming |
|
||||
| Cedar policy | — | Per-graph actions plus server-scoped actions (see [docs/user/policy.md](docs/user/policy.md) for the current list), branch / target_branch / protected scopes, validate/test/explain CLI. **Engine-wide enforcement** (MR-722): every `_as` writer (`apply_schema_as`, `mutate_as`, `load_as` — the deprecated `ingest_as` shims route through it — `branch_create_as` / `branch_create_from_as`, `branch_delete_as`, `branch_merge_as`) calls `Omnigraph::enforce(action, scope, actor)` — HTTP, CLI, embedded SDK all hit the same gate. |
|
||||
| Cedar policy | — | Per-graph actions plus server-scoped actions (see [docs/user/policy.md](docs/user/operations/policy.md) for the current list), branch / target_branch / protected scopes, validate/test/explain CLI. **Engine-wide enforcement** (MR-722): every `_as` writer (`apply_schema_as`, `mutate_as`, `load_as` — the deprecated `ingest_as` shims route through it — `branch_create_as` / `branch_create_from_as`, `branch_delete_as`, `branch_merge_as`) calls `Omnigraph::enforce(action, scope, actor)` — HTTP, CLI, embedded SDK all hit the same gate. |
|
||||
| HTTP server | — | Axum, OpenAPI via utoipa, bearer auth (SHA-256, AWS Secrets Manager option), `authorize_request` at the HTTP boundary (resolves bearer→actor, applies admission control), NDJSON streaming export, **multi-graph mode (v0.6.0+) with cluster routes + read-only `GET /graphs` enumeration + per-graph + server-level Cedar policies. Multi-graph boots from a cluster directory (`--cluster`) or the legacy `omnigraph.yaml`; add/remove graphs via `cluster apply` (or by editing the legacy file) and restarting.** |
|
||||
| CLI with config | — | two-surface config (team `cluster.yaml` dir + per-operator `~/.omnigraph/config.yaml`; legacy `omnigraph.yaml` deprecated per RFC-008), aliases, multi-format output (json/jsonl/csv/kv/table) |
|
||||
| Audit / actor tracking | — | `_as` write APIs + actor map in commit graph |
|
||||
|
|
@ -282,7 +282,7 @@ Rules:
|
|||
7. **Re-verify before recommending.** If you cite a flag, env var, endpoint, or constant to the user or in code, grep for it in source first. Memory and docs go stale; the code is authoritative.
|
||||
8. **Keep AGENTS.md short.** This file is always loaded into agent context, so every added line has a recurring context-window cost. Prefer pointers and terse invariants here; put detail in `docs/`.
|
||||
9. **Keep AGENTS.md a map, not an encyclopedia.** New deep content goes into `docs/`. Add an entry to "Where to find each topic" instead of pasting prose into this file. The "Always-on rules" section is the exception — it's for invariants that should always be in scope.
|
||||
10. **Re-read on schema/query/IR changes.** Edits to `schema.pest`, `query.pest`, `ir/lower.rs`, `query/typecheck.rs`, or `query/lint.rs` should trigger a re-read of [docs/user/schema-language.md](docs/user/schema-language.md), [docs/user/query-language.md](docs/user/query-language.md), and [docs/dev/execution.md](docs/dev/execution.md) to confirm they still describe reality.
|
||||
10. **Re-read on schema/query/IR changes.** Edits to `schema.pest`, `query.pest`, `ir/lower.rs`, `query/typecheck.rs`, or `query/lint.rs` should trigger a re-read of [docs/user/schema-language.md](docs/user/schema/index.md), [docs/user/query-language.md](docs/user/queries/index.md), and [docs/dev/execution.md](docs/dev/execution.md) to confirm they still describe reality.
|
||||
11. **Always make smaller commits.** Each commit does one thing, compiles, and passes tests; mechanical refactors land separately from the behavior changes they enable.
|
||||
12. **Test-first for bug fixes.** When fixing an identified bug, write a regression test that reproduces the failure first. Confirm it fails against the current code with the predicted symptom (not an unrelated error). Then land the fix in a separate commit and confirm the test turns green. The test commit lands just before the fix commit so the red → green pair is visible in `git log` and a reviewer can check out the test commit alone and reproduce the failure.
|
||||
13. **Correct by design over symptomatic patches.** When a bug surfaces, identify the root cause and make the fix correct by construction. Don't patch the symptom. If the design admits the bug class, the fix is to close the class, not to add a guard around the latest instance. A symptomatic patch is acceptable only as a stop-gap, with an explicit note in the commit message and a follow-up issue tracking the design fix.
|
||||
|
|
|
|||
|
|
@ -26,7 +26,7 @@ pub(crate) struct Cli {
|
|||
/// Actor id for direct-engine writes; overrides `cli.actor`. No effect on
|
||||
/// remote writes (the server resolves the actor from the bearer token).
|
||||
/// With a policy configured but no actor set, the write is denied — see
|
||||
/// docs/user/policy.md.
|
||||
/// docs/user/operations/policy.md.
|
||||
#[arg(long = "as", global = true, value_name = "ACTOR")]
|
||||
pub(crate) as_actor: Option<String>,
|
||||
|
||||
|
|
|
|||
|
|
@ -327,14 +327,14 @@ pub fn classify_server_runtime_state(
|
|||
"server has no bearer tokens and no policy file configured. This is a fully \
|
||||
open server — pass `--unauthenticated` (or set OMNIGRAPH_UNAUTHENTICATED=1) \
|
||||
if you actually want that, otherwise configure bearer tokens (see \
|
||||
docs/user/server.md) and/or `policy.file` in omnigraph.yaml."
|
||||
docs/user/operations/server.md) and/or `policy.file` in omnigraph.yaml."
|
||||
),
|
||||
(false, false, true) => Ok(ServerRuntimeState::Open),
|
||||
(true, false, _) => Ok(ServerRuntimeState::DefaultDeny),
|
||||
(false, true, _) => bail!(
|
||||
"policy file is configured but no bearer tokens — every request would 401 \
|
||||
because no token can ever match. Configure at least one bearer token (see \
|
||||
docs/user/server.md), or remove the policy file. To deny all unauthenticated \
|
||||
docs/user/operations/server.md), or remove the policy file. To deny all unauthenticated \
|
||||
traffic deliberately, configure tokens plus a deny-all Cedar rule — that \
|
||||
produces meaningful 403s with policy-decision logging instead of silent 401s."
|
||||
),
|
||||
|
|
|
|||
|
|
@ -763,7 +763,7 @@ fn traversal_indexed_override() -> Option<bool> {
|
|||
|
||||
/// Max source-row frontier for which Expand uses the BTREE-indexed path.
|
||||
/// Larger frontiers fall back to the in-memory CSR (dense / whole-graph). See
|
||||
/// `docs/user/constants.md`.
|
||||
/// `docs/user/reference/constants.md`.
|
||||
const DEFAULT_EXPAND_INDEXED_MAX_FRONTIER: usize = 1024;
|
||||
/// Max hop count for the indexed path (each hop is one indexed scan; very deep
|
||||
/// traversals fan out toward whole-graph and are better served by CSR).
|
||||
|
|
|
|||
|
|
@ -7,7 +7,7 @@
|
|||
//! keys yield a TOTAL, deterministic order (and `ORDER … LIMIT` is
|
||||
//! deterministic). NULL placement is `nulls_first = !descending` (NULLs first
|
||||
//! under ASC, last under DESC). Both are documented in
|
||||
//! `docs/user/query-language.md`.
|
||||
//! `docs/user/queries/index.md`.
|
||||
|
||||
mod helpers;
|
||||
|
||||
|
|
|
|||
|
|
@ -10,7 +10,7 @@ Three views, increasing zoom:
|
|||
2. **Layer view** — the eight-layer stack inside one OmniGraph process.
|
||||
3. **Component zoom-ins** — what's inside each layer.
|
||||
|
||||
For runtime flows (read query, mutation), see [`docs/dev/execution.md`](execution.md). For the on-disk layout of a graph, see [`docs/user/storage.md`](../user/storage.md).
|
||||
For runtime flows (read query, mutation), see [`docs/dev/execution.md`](execution.md). For the on-disk layout of a graph, see [`docs/user/storage.md`](../user/concepts/storage.md).
|
||||
|
||||
L1 (orange in the diagrams) is what we inherit from Lance; L2 (blue) is what OmniGraph adds. The L1/L2 framing is also called out in prose at the bottom of this doc.
|
||||
|
||||
|
|
@ -280,7 +280,7 @@ flowchart LR
|
|||
eng --> wq
|
||||
```
|
||||
|
||||
The server applies Cedar policy at the HTTP boundary today. The roadmap, called out in [docs/dev/invariants.md](invariants.md) as a known gap, is to push policy into the planner as predicates. After Cedar, mutating handlers go through `WorkloadController` (per-actor admission cap + byte budget; PR 2 / MR-686) before reaching the engine. The engine itself holds an `Arc<WriteQueueManager>` so concurrent mutations on the same `(table, branch)` serialize at the queue, while disjoint keys run in parallel — see [docs/user/server.md](../user/server.md) "Per-actor admission control" and [docs/dev/writes.md](writes.md). The CLI bypasses the HTTP layer (and admission) and calls the engine API directly.
|
||||
The server applies Cedar policy at the HTTP boundary today. The roadmap, called out in [docs/dev/invariants.md](invariants.md) as a known gap, is to push policy into the planner as predicates. After Cedar, mutating handlers go through `WorkloadController` (per-actor admission cap + byte budget; PR 2 / MR-686) before reaching the engine. The engine itself holds an `Arc<WriteQueueManager>` so concurrent mutations on the same `(table, branch)` serialize at the queue, while disjoint keys run in parallel — see [docs/user/server.md](../user/operations/server.md) "Per-actor admission control" and [docs/dev/writes.md](writes.md). The CLI bypasses the HTTP layer (and admission) and calls the engine API directly.
|
||||
|
||||
Code paths:
|
||||
|
||||
|
|
|
|||
|
|
@ -3,11 +3,11 @@
|
|||
**Status:** Draft / thinking-in-progress
|
||||
**Type:** Architecture direction
|
||||
**Date:** 2026-06-07
|
||||
**Relationship:** generalizes today's `omnigraph.yaml` graph/query/policy configuration surface ([CLI reference](../user/cli-reference.md), [server docs](../user/server.md)) into a future cluster control plane. The distilled rules are in [cluster-axioms.md](cluster-axioms.md); detailed downstream implementation spec and blast-radius assessment in [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md). This is a proposed architecture, not an implemented RFC.
|
||||
**Relationship:** generalizes today's `omnigraph.yaml` graph/query/policy configuration surface ([CLI reference](../user/cli/reference.md), [server docs](../user/operations/server.md)) into a future cluster control plane. The distilled rules are in [cluster-axioms.md](cluster-axioms.md); detailed downstream implementation spec and blast-radius assessment in [cluster-config-implementation-spec.md](cluster-config-implementation-spec.md). This is a proposed architecture, not an implemented RFC.
|
||||
|
||||
> **Implementation status.** The examples below describe the full target schema.
|
||||
> Stage 2B only accepts the read-only subset documented in
|
||||
> [cluster-config.md](../user/cluster-config.md). Future-phase fields such as
|
||||
> [cluster-config.md](../user/clusters/config.md). Future-phase fields such as
|
||||
> `env_file`, `apply`, `providers`, `pipelines`, `embeddings`, `ui`, `aliases`,
|
||||
> and `bindings` are intentionally rejected with typed diagnostics until their
|
||||
> reconciler semantics are implemented.
|
||||
|
|
|
|||
|
|
@ -177,4 +177,4 @@ For all three modes, a mid-load failure (RI / cardinality violation, validation
|
|||
|
||||
## Embeddings during load
|
||||
|
||||
If a node type has `@embed` properties, the loader calls the engine embedding client (Gemini, RETRIEVAL_DOCUMENT) per row to populate the vector column. See [embeddings.md](../user/embeddings.md).
|
||||
If a node type has `@embed` properties, the loader calls the engine embedding client (Gemini, RETRIEVAL_DOCUMENT) per row to populate the vector column. See [embeddings.md](../user/search/embeddings.md).
|
||||
|
|
|
|||
|
|
@ -20,13 +20,13 @@ constraints. User-facing behavior should still be documented through
|
|||
| Area | Read |
|
||||
|---|---|
|
||||
| System structure, L1/L2 framing, component diagrams | [architecture.md](architecture.md) |
|
||||
| On-disk layout, manifest schema, URI behavior | [storage.md](../user/storage.md) |
|
||||
| On-disk layout, manifest schema, URI behavior | [storage.md](../user/concepts/storage.md) |
|
||||
| Direct-publish writes, D2, staged writes, recovery sidecars | [writes.md](writes.md) |
|
||||
| Query execution, mutation execution, loader flow | [execution.md](execution.md) |
|
||||
| Index lifecycle and graph topology indexes | [indexes.md](../user/indexes.md) |
|
||||
| Branch and commit internals | [branches-commits.md](../user/branches-commits.md) |
|
||||
| Index lifecycle and graph topology indexes | [indexes.md](../user/search/indexes.md) |
|
||||
| Branch and commit internals | [branches-commits.md](../user/branching/index.md) |
|
||||
| Three-way merge implementation and conflicts | [merge.md](merge.md) |
|
||||
| Diff/change-feed implementation | [changes.md](../user/changes.md) |
|
||||
| Diff/change-feed implementation | [changes.md](../user/branching/changes.md) |
|
||||
| Branch protection policy | [branch-protection.md](branch-protection.md) |
|
||||
| CODEOWNERS source of truth | [codeowners.md](codeowners.md) |
|
||||
|
||||
|
|
@ -34,14 +34,14 @@ constraints. User-facing behavior should still be documented through
|
|||
|
||||
| Area | Read |
|
||||
|---|---|
|
||||
| Schema grammar, catalog, migration planner | [schema-language.md](../user/schema-language.md) |
|
||||
| Query grammar, IR, lints, mutation restrictions | [query-language.md](../user/query-language.md) |
|
||||
| Embedding client and `@embed` integration | [embeddings.md](../user/embeddings.md) |
|
||||
| Cedar policy surface and server gating | [policy.md](../user/policy.md) |
|
||||
| Server auth, OpenAPI, endpoint handlers | [server.md](../user/server.md) |
|
||||
| Error taxonomy and serialization | [errors.md](../user/errors.md) |
|
||||
| Constants and tunables | [constants.md](../user/constants.md) |
|
||||
| Transaction model public contract | [transactions.md](../user/transactions.md) |
|
||||
| Schema grammar, catalog, migration planner | [schema-language.md](../user/schema/index.md) |
|
||||
| Query grammar, IR, lints, mutation restrictions | [query-language.md](../user/queries/index.md) |
|
||||
| Embedding client and `@embed` integration | [embeddings.md](../user/search/embeddings.md) |
|
||||
| Cedar policy surface and server gating | [policy.md](../user/operations/policy.md) |
|
||||
| Server auth, OpenAPI, endpoint handlers | [server.md](../user/operations/server.md) |
|
||||
| Error taxonomy and serialization | [errors.md](../user/operations/errors.md) |
|
||||
| Constants and tunables | [constants.md](../user/reference/constants.md) |
|
||||
| Transaction model public contract | [transactions.md](../user/branching/transactions.md) |
|
||||
|
||||
## Project Operations
|
||||
|
||||
|
|
|
|||
|
|
@ -58,7 +58,7 @@ Use it this way:
|
|||
branch they read even when index coverage is partial. Expensive index work
|
||||
should converge from manifest state instead of extending the critical write
|
||||
path. Scalar staged index builds and vector inline residuals are documented
|
||||
in [writes.md](writes.md) and [indexes.md](../user/indexes.md).
|
||||
in [writes.md](writes.md) and [indexes.md](../user/search/indexes.md).
|
||||
|
||||
8. **Schema identity survives renames.** Accepted schema identity must remain
|
||||
stable across type and property renames. Rename support belongs in migration
|
||||
|
|
@ -100,14 +100,14 @@ Use it this way:
|
|||
|---|---|---|
|
||||
| Multi-table commit | Manifest CAS plus recovery sidecars; not a single Lance primitive | [writes.md](writes.md), [architecture.md](architecture.md) |
|
||||
| Constructive mutations | In-memory `MutationStaging`, one end-of-query table commit per touched table, then one manifest publish | [writes.md](writes.md), [execution.md](execution.md) |
|
||||
| Deletes | Inline-commit residual; delete-only queries allowed, mixed insert/update/delete rejected by D2 | [query-language.md](../user/query-language.md), [writes.md](writes.md) |
|
||||
| Branch delete | Manifest is the single authority, flipped atomically first; per-table forks + commit-graph branch are derived state, reclaimed best-effort (`force_delete_branch`) with the `cleanup` reconciler as the guaranteed backstop. Reusing a name whose reclaim failed before `cleanup` surfaces an actionable error | [branches-commits.md](../user/branches-commits.md), [maintenance.md](../user/maintenance.md) |
|
||||
| Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema-language.md), [execution.md](execution.md) |
|
||||
| Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema-language.md) |
|
||||
| Deletes | Inline-commit residual; delete-only queries allowed, mixed insert/update/delete rejected by D2 | [query-language.md](../user/queries/index.md), [writes.md](writes.md) |
|
||||
| Branch delete | Manifest is the single authority, flipped atomically first; per-table forks + commit-graph branch are derived state, reclaimed best-effort (`force_delete_branch`) with the `cleanup` reconciler as the guaranteed backstop. Reusing a name whose reclaim failed before `cleanup` surfaces an actionable error | [branches-commits.md](../user/branching/index.md), [maintenance.md](../user/operations/maintenance.md) |
|
||||
| Schema validation | Type checks, required fields, defaults, edge endpoint checks, and edge cardinality are enforced on write paths | [schema-language.md](../user/schema/index.md), [execution.md](execution.md) |
|
||||
| Unique constraints | Intra-batch and write-path checks exist; intake and branch-merge derive the composite key through one shared function (`loader::composite_unique_key`, a separator-free `Vec<String>` tuple) and fail loudly on an un-keyable column type rather than silently exempting it; full cross-version uniqueness against already-committed rows is still a gap | [schema-language.md](../user/schema/index.md) |
|
||||
| Storage trait | `TableStorage` (via `db.storage()`) is staged-only; the inline-commit residuals (`delete_where`, `create_vector_index`) are split onto a separate sealed `InlineCommitResidual` trait reached via `db.storage_inline_residual()` (MR-854), so §1 holds by construction; capability/stat surfaces are roadmap | [writes.md](writes.md), [architecture.md](architecture.md) |
|
||||
| Index lifecycle | `ensure_indices` is explicit today; reconciler-based convergence is roadmap | [indexes.md](../user/indexes.md), [maintenance.md](../user/maintenance.md) |
|
||||
| Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/query-language.md) |
|
||||
| Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/server.md), [policy.md](../user/policy.md) |
|
||||
| Index lifecycle | `ensure_indices` is explicit today; reconciler-based convergence is roadmap | [indexes.md](../user/search/indexes.md), [maintenance.md](../user/operations/maintenance.md) |
|
||||
| Traversal IDs | Runtime still builds `TypeIndex`; Lance stable row-id based graph IDs are roadmap | [architecture.md](architecture.md), [query-language.md](../user/queries/index.md) |
|
||||
| Auth | Bearer token hashing and server-side actor resolution are implemented at the HTTP boundary | [server.md](../user/operations/server.md), [policy.md](../user/operations/policy.md) |
|
||||
| Tests | Tempdir-backed Lance tests are the current substrate; the storage adapter has an in-memory backend for adapter-level contract tests, but Lance datasets bypass it | [testing.md](testing.md) |
|
||||
|
||||
The branch-delete reconciler is authority-derived: it reclaims orphaned forks
|
||||
|
|
|
|||
|
|
@ -348,4 +348,4 @@ Callers move at their own pace. The envelope upgrades + URL rename ship in v0.6.
|
|||
- RFC 8288 (`Link` relations, `successor-version`)
|
||||
- MCP spec: [modelcontextprotocol.io](https://modelcontextprotocol.io)
|
||||
- [invariants.md](./invariants.md) — substrate boundaries this work respects
|
||||
- [../user/server.md](../user/server.md) — current HTTP surface (post-MR-656 picks up the `/query`+`/mutate` rename and deprecation)
|
||||
- [../user/server.md](../user/operations/server.md) — current HTTP surface (post-MR-656 picks up the `/query`+`/mutate` rename and deprecation)
|
||||
|
|
|
|||
|
|
@ -305,7 +305,7 @@ success and one failure. The losing writer's error is
|
|||
`ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected,
|
||||
actual }`. The HTTP server maps this to **409 Conflict** with body
|
||||
`{"error": "...", "code": "conflict", "manifest_conflict": { "table_key":
|
||||
"...", "expected": N, "actual": M }}` — see [docs/user/server.md](../user/server.md).
|
||||
"...", "expected": N, "actual": M }}` — see [docs/user/server.md](../user/operations/server.md).
|
||||
|
||||
## Audit
|
||||
|
||||
|
|
|
|||
|
|
@ -10,7 +10,7 @@ OmniGraph builds *graph branches* on top by branching every sub-table coherently
|
|||
|
||||
- `branch_create(name)` / `branch_create_from(target, name)` — disallowed name `main`; fails if branch exists; ensures the schema-apply lock is idle. Atomic and authority-first like `branch_delete`: it flips the `__manifest` branch (authority), then creates the derived commit-graph branch, force-dropping any orphaned commit-graph ref left by an incomplete prior delete (the manifest branch is fresh, so a same-named commit-graph branch is provably a zombie). If commit-graph creation fails, the manifest branch is rolled back so the name never half-exists.
|
||||
- `branch_list()` — returns public branches, **filters the internal** `__schema_apply_lock__` branch.
|
||||
- `branch_delete(name)` — refuses if there are descendants on the branch, or if it is the current branch. The manifest is the single authority for branch existence: deletion flips the `__manifest` branch ref first (one atomic op), after which the branch is gone from every snapshot. The owned per-table forks and the commit-graph branch are derived state, reclaimed best-effort with `force_delete_branch` after the flip. A failure during that reclaim (transient object-store error) does not fail the call or block the authority flip; the leftover forks are unreachable orphans that the [`cleanup`](maintenance.md) reconciler converges. One consequence: if a delete's best-effort reclaim fails, reusing that branch name before the next `cleanup` surfaces a clear error pointing at `cleanup` (the stale fork would otherwise collide on first write).
|
||||
- `branch_delete(name)` — refuses if there are descendants on the branch, or if it is the current branch. The manifest is the single authority for branch existence: deletion flips the `__manifest` branch ref first (one atomic op), after which the branch is gone from every snapshot. The owned per-table forks and the commit-graph branch are derived state, reclaimed best-effort with `force_delete_branch` after the flip. A failure during that reclaim (transient object-store error) does not fail the call or block the authority flip; the leftover forks are unreachable orphans that the [`cleanup`](../operations/maintenance.md) reconciler converges. One consequence: if a delete's best-effort reclaim fails, reusing that branch name before the next `cleanup` surfaces a clear error pointing at `cleanup` (the stale fork would otherwise collide on first write).
|
||||
- **Lazy forking**: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share fragments with their source. A fork collision is classified by the manifest authority, not by Lance branch versions: if the live manifest already records the fork on the active branch, a concurrent first-write won and the caller gets a retryable "refresh and retry"; if the manifest does not, a physical branch there is an orphan and the caller is pointed at `cleanup`.
|
||||
- `sync_branch(branch)` — re-binds the in-memory handle to the latest head of the branch.
|
||||
|
||||
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
OmniGraph does not have `BEGIN` / `COMMIT` / `ROLLBACK`. Branches do that job. This page explains the model, when to use which primitive, and shows worked examples for the patterns that come up most.
|
||||
|
||||
The architectural rule lives in [`docs/dev/invariants.md`](../dev/invariants.md):
|
||||
The architectural rule lives in [`docs/dev/invariants.md`](../../dev/invariants.md):
|
||||
|
||||
> **Mutations publish at one boundary.** A `mutate_as` or `load` operation
|
||||
> accumulates constructive writes, commits each touched table at the end, then
|
||||
|
|
@ -161,8 +161,8 @@ This is the workflow MR-797 / agentic loops are designed around: **branches are
|
|||
|
||||
## See also
|
||||
|
||||
- [`docs/user/branches-commits.md`](branches-commits.md) — branch and commit-graph mechanics.
|
||||
- [`docs/dev/merge.md`](../dev/merge.md) — three-way merge details and conflict kinds.
|
||||
- [`docs/user/query-language.md`](query-language.md) — `.gq` syntax for the multi-statement queries used above.
|
||||
- [`docs/dev/writes.md`](../dev/writes.md) — the per-query commit pipeline that gives single-query atomicity.
|
||||
- [`docs/dev/invariants.md`](../dev/invariants.md) — the architectural rule.
|
||||
- [`docs/user/branches-commits.md`](index.md) — branch and commit-graph mechanics.
|
||||
- [`docs/dev/merge.md`](../../dev/merge.md) — three-way merge details and conflict kinds.
|
||||
- [`docs/user/query-language.md`](../queries/index.md) — `.gq` syntax for the multi-statement queries used above.
|
||||
- [`docs/dev/writes.md`](../../dev/writes.md) — the per-query commit pipeline that gives single-query atomicity.
|
||||
- [`docs/dev/invariants.md`](../../dev/invariants.md) — the architectural rule.
|
||||
|
|
@ -1,6 +1,6 @@
|
|||
# CLI Reference (`omnigraph`)
|
||||
|
||||
A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](cli.md).
|
||||
A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](index.md).
|
||||
|
||||
Top-level command families and subcommands. Graph-targeting commands accept a positional `URI`, `--uri`, a `--target <name>` resolved against `omnigraph.yaml`, or `--server <name>` (an operator-defined server from `~/.omnigraph/config.yaml`, optionally with `--graph <id>` for multi-graph servers; exclusive with the other forms); `cluster` commands use `--config <dir>`.
|
||||
|
||||
|
|
@ -8,7 +8,7 @@ Top-level command families and subcommands. Graph-targeting commands accept a po
|
|||
|
||||
| Command | Purpose |
|
||||
|---|---|
|
||||
| `init` | `--schema <pg>` → initialize a graph (no longer scaffolds `omnigraph.yaml` — RFC-008; start cluster configs from the [cluster.md](cluster.md) quick-start or `config migrate`) |
|
||||
| `init` | `--schema <pg>` → initialize a graph (no longer scaffolds `omnigraph.yaml` — RFC-008; start cluster configs from the [cluster.md](../clusters/index.md) quick-start or `config migrate`) |
|
||||
| `load` | bulk load a branch, local or remote (`--mode overwrite\|append\|merge` is **required** — overwrite is destructive, so there is no default). Without `--from` the target branch must exist; `--from <base>` forks a missing `--branch` from `<base>` first |
|
||||
| `ingest` | deprecated alias of `load --from <base>` (defaults: `--from main --mode merge`); prints a one-line warning to stderr |
|
||||
| `query` (alias: `read`) | run named read query; source via `--query <path>`, `-e`/`--query-string <GQ>`, or `--alias <name>` (exactly one). `read` is the deprecated previous name and prints a one-line warning to stderr |
|
||||
|
|
@ -53,7 +53,7 @@ tier:
|
|||
|
||||
| Surface | Owner | Location | Declares |
|
||||
|---|---|---|---|
|
||||
| Cluster config | the team, in a repo | `cluster.yaml` + checkout ([cluster-config.md](cluster-config.md)) | what the system **is**: graphs, schemas, queries, policies, storage |
|
||||
| Cluster config | the team, in a repo | `cluster.yaml` + checkout ([cluster-config.md](../clusters/config.md)) | what the system **is**: graphs, schemas, queries, policies, storage |
|
||||
| Operator config | one person | `~/.omnigraph/config.yaml` (override dir with `$OMNIGRAPH_HOME`) | who **I** am: identity, ergonomics |
|
||||
| Flags / env | per invocation | — | everything, explicitly |
|
||||
|
||||
|
|
@ -204,7 +204,7 @@ creates one only when it is missing. Both observe declared graphs read-only at
|
|||
`<config-dir>/graphs/<graph-id>.omni`. External state backends, graph/schema
|
||||
apply, automatic stale-lock breaking, `plan --refresh`, pipelines, UI specs,
|
||||
embeddings, aliases, and bindings are reserved for later stages. See
|
||||
[cluster-config.md](cluster-config.md).
|
||||
[cluster-config.md](../clusters/config.md).
|
||||
|
||||
## Output formats (`query` command, alias: `read`)
|
||||
|
||||
|
|
@ -3,7 +3,7 @@
|
|||
**Status:** Phase 5 — cluster-booted serving (`omnigraph-server --cluster`).
|
||||
|
||||
> New to the cluster tooling? Start with the operator how-to guide,
|
||||
> [cluster.md](cluster.md) — this document is the reference.
|
||||
> [cluster.md](index.md) — this document is the reference.
|
||||
|
||||
Cluster config is the future control-plane configuration surface for a whole
|
||||
OmniGraph deployment. In this stage, OmniGraph can validate a local
|
||||
|
|
@ -7,8 +7,8 @@ destructive changes, and recovering from crashes.
|
|||
|
||||
It is a **how-to**. The reference for every `cluster.yaml` key, command flag,
|
||||
state-file field, and diagnostic code is
|
||||
[cluster-config.md](cluster-config.md); the HTTP surface is
|
||||
[server.md](server.md).
|
||||
[cluster-config.md](config.md); the HTTP surface is
|
||||
[server.md](../operations/server.md).
|
||||
|
||||
## The model in one paragraph
|
||||
|
||||
|
|
@ -102,7 +102,7 @@ curl -H 'authorization: Bearer s3cret' \
|
|||
|
||||
Bearer tokens and the bind address are deliberately *not* cluster facts —
|
||||
they are per-replica, set by flag or environment
|
||||
([server.md](server.md#modes) for the token sources).
|
||||
([server.md](../operations/server.md#modes) for the token sources).
|
||||
|
||||
## 2. The day-2 loop: edit → plan → apply → restart
|
||||
|
||||
|
|
@ -237,7 +237,7 @@ with an in-flight apply.
|
|||
directory; boot is read-only. Roll out a change by `apply` once, then
|
||||
restarting replicas (serving is static per process — there is no hot
|
||||
reload yet). Container/cloud recipes (AWS ECS+EFS, Railway volumes):
|
||||
[deployment.md](deployment.md#cluster-mode-in-containers-aws-railway).
|
||||
[deployment.md](../deployment.md#cluster-mode-in-containers-aws-railway).
|
||||
- **The directory is the deployable unit**: config, catalog, ledger,
|
||||
approvals, and graph data all live under it. Back it up as a whole;
|
||||
version the *config files* (not `__cluster/` or `graphs/`) in git.
|
||||
|
|
@ -282,4 +282,4 @@ a cluster are created by `cluster apply`, not by hand.
|
|||
reserved and rejected loudly.
|
||||
|
||||
For the full reference — every key, flag, status, disposition, and
|
||||
diagnostic — see [cluster-config.md](cluster-config.md).
|
||||
diagnostic — see [cluster-config.md](config.md).
|
||||
|
|
@ -17,7 +17,7 @@ The server also has two **boot sources**: `omnigraph.yaml` (graph targets
|
|||
declared in the per-operator config) or a **cluster directory**
|
||||
(`omnigraph-server --cluster <dir>`), which serves the cluster control
|
||||
plane's applied revision — see
|
||||
[cluster-config.md](cluster-config.md#serving-from-the-cluster-the-mode-switch).
|
||||
[cluster-config.md](clusters/config.md#serving-from-the-cluster-the-mode-switch).
|
||||
The two are exclusive per deployment; switching is a restart with a different
|
||||
flag.
|
||||
|
||||
|
|
|
|||
|
|
@ -2,44 +2,62 @@
|
|||
|
||||
**Audience:** users, CLI users, HTTP clients, and self-hosting operators
|
||||
|
||||
This is the public-facing entry point. These docs should describe behavior,
|
||||
commands, configuration, and operational contracts without requiring knowledge
|
||||
of MRs, internal recovery mechanics, or contributor-only invariants.
|
||||
This is the public-facing entry point. These docs describe behavior, commands,
|
||||
configuration, and operational contracts without requiring knowledge of internal
|
||||
recovery mechanics or contributor-only invariants. They are organized by topic —
|
||||
start with install, then follow the section that matches your task.
|
||||
|
||||
## Start Here
|
||||
## Start here
|
||||
|
||||
| Goal | Read |
|
||||
|---|---|
|
||||
| Install OmniGraph | [install.md](install.md) |
|
||||
| Run the CLI locally | [cli.md](cli.md) |
|
||||
| Look up every CLI flag and config field | [cli-reference.md](cli-reference.md) |
|
||||
| Deploy and operate a cluster (how-to guide) | [cluster.md](cluster.md) |
|
||||
| Validate and plan cluster config | [cluster-config.md](cluster-config.md) |
|
||||
| Write schemas | [schema-language.md](schema-language.md) |
|
||||
| Read schema-lint diagnostic codes | [schema-lint.md](schema-lint.md) |
|
||||
| Write queries and mutations | [query-language.md](query-language.md) |
|
||||
| Use embeddings | [embeddings.md](embeddings.md) |
|
||||
| Run the CLI | [cli/index.md](cli/index.md) |
|
||||
| Look up every CLI flag and config field | [cli/reference.md](cli/reference.md) |
|
||||
|
||||
## Operate A Graph
|
||||
## Schema & queries
|
||||
|
||||
| Goal | Read |
|
||||
|---|---|
|
||||
| Understand graph layout and URI support | [storage.md](storage.md) |
|
||||
| Work with branches, commits, and snapshots | [branches-commits.md](branches-commits.md) |
|
||||
| Coordinate multi-query workflows | [transactions.md](transactions.md) |
|
||||
| Read diffs and change feeds | [changes.md](changes.md) |
|
||||
| Build and use indexes | [indexes.md](indexes.md) |
|
||||
| Compact and clean old versions | [maintenance.md](maintenance.md) |
|
||||
| Interpret errors and output formats | [errors.md](errors.md) |
|
||||
| Write schemas (the `.pg` language) | [schema/index.md](schema/index.md) |
|
||||
| Read schema-lint diagnostic codes | [schema/lint.md](schema/lint.md) |
|
||||
| Write queries and mutations (the `.gq` language) | [queries/index.md](queries/index.md) |
|
||||
| Use vector / full-text / hybrid search | [search/indexes.md](search/indexes.md) |
|
||||
| Generate embeddings | [search/embeddings.md](search/embeddings.md) |
|
||||
| Build and use indexes | [search/indexes.md](search/indexes.md) |
|
||||
|
||||
## Run The Server
|
||||
## Branching & version control
|
||||
|
||||
| Goal | Read |
|
||||
|---|---|
|
||||
| Work with branches, commits, and snapshots | [branching/index.md](branching/index.md) |
|
||||
| Coordinate multi-query workflows | [branching/transactions.md](branching/transactions.md) |
|
||||
| Read diffs and change feeds | [branching/changes.md](branching/changes.md) |
|
||||
|
||||
## Operations
|
||||
|
||||
| Goal | Read |
|
||||
|---|---|
|
||||
| Deploy the binary or container | [deployment.md](deployment.md) |
|
||||
| Use HTTP endpoints | [server.md](server.md) |
|
||||
| Configure Cedar authorization | [policy.md](policy.md) |
|
||||
| Track actors and audit behavior | [audit.md](audit.md) |
|
||||
| Use HTTP endpoints | [operations/server.md](operations/server.md) |
|
||||
| Compact, repair, and clean old versions | [operations/maintenance.md](operations/maintenance.md) |
|
||||
| Configure Cedar authorization | [operations/policy.md](operations/policy.md) |
|
||||
| Track actors and audit behavior | [operations/audit.md](operations/audit.md) |
|
||||
| Interpret errors and output formats | [operations/errors.md](operations/errors.md) |
|
||||
|
||||
## Clusters
|
||||
|
||||
| Goal | Read |
|
||||
|---|---|
|
||||
| Deploy and operate a cluster (how-to) | [clusters/index.md](clusters/index.md) |
|
||||
| Reference every `cluster.yaml` key and command | [clusters/config.md](clusters/config.md) |
|
||||
|
||||
## Concepts & reference
|
||||
|
||||
| Goal | Read |
|
||||
|---|---|
|
||||
| Understand graph layout and URI support | [concepts/storage.md](concepts/storage.md) |
|
||||
| Look up constants and tunables | [reference/constants.md](reference/constants.md) |
|
||||
|
||||
## Releases
|
||||
|
||||
|
|
@ -48,7 +66,6 @@ changes between versions, not for contributor design history.
|
|||
|
||||
## Boundary
|
||||
|
||||
User docs should focus on stable behavior. If a paragraph needs to explain
|
||||
internal sidecars, Lance API blockers, MR numbers, test strategy, or review
|
||||
rules, it probably belongs in [docs/dev/index.md](../dev/index.md) or a developer-area document
|
||||
instead.
|
||||
User docs focus on stable behavior. If a paragraph needs to explain internal
|
||||
sidecars, Lance API blockers, or test strategy, it probably belongs in
|
||||
[docs/dev/index.md](../dev/index.md) or a developer-area document instead.
|
||||
|
|
|
|||
|
|
@ -9,7 +9,7 @@
|
|||
- `Manifest(ManifestError { kind: BadRequest|NotFound|Conflict|Internal, details: Option<ManifestConflictDetails>, … })`
|
||||
- `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }` — caller's `expected_table_versions` did not match the manifest's current latest non-tombstoned version (set by `OmniError::manifest_expected_version_mismatch`).
|
||||
- `ManifestConflictDetails::RowLevelCasContention` — Lance row-level CAS rejected the publish because a concurrent writer landed the same `object_id`. Retried internally by the publisher; only surfaces if the retry budget exhausts.
|
||||
- **D₂ parse-time rejection** (MR-794): a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [docs/user/query-language.md](query-language.md) for the rule and [docs/dev/writes.md](../dev/writes.md) for the underlying staged-write rationale.
|
||||
- **D₂ parse-time rejection** (MR-794): a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [docs/user/query-language.md](../queries/index.md) for the rule and [docs/dev/writes.md](../../dev/writes.md) for the underlying staged-write rationale.
|
||||
- `MergeConflicts(Vec<MergeConflict>)`
|
||||
|
||||
Compiler-side `NanoError` covers parse / catalog / type / storage / plan / execution / arrow / lance / IO / manifest / unique-constraint, each with structured spans (`SourceSpan { start, end }`) for ariadne-style diagnostics.
|
||||
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
`db/omnigraph/optimize.rs` and `db/omnigraph/repair.rs`.
|
||||
|
||||
**Addressing (RFC-010).** `optimize`, `repair`, and `cleanup` are **storage-plane** CLI commands: they run with direct storage access against a positional `URI`, `--target`, or **`--cluster <dir|s3://…> --cluster-graph <id>`** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `<storage>/graphs/<id>.omni` layout). They never run through a server, and reject `--server` / `--graph` or a `--target` that resolves to a remote (`http(s)://`) URL with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command planes* section of [cli-reference.md](cli-reference.md).
|
||||
**Addressing (RFC-010).** `optimize`, `repair`, and `cleanup` are **storage-plane** CLI commands: they run with direct storage access against a positional `URI`, `--target`, or **`--cluster <dir|s3://…> --cluster-graph <id>`** (which resolves the graph's storage URI from the served cluster state, so you needn't know the `<storage>/graphs/<id>.omni` layout). They never run through a server, and reject `--server` / `--graph` or a `--target` that resolves to a remote (`http(s)://`) URL with a declared error. There are no server routes for them by design — to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command planes* section of [cli-reference.md](../cli/reference.md).
|
||||
|
||||
## `optimize_all_tables(db)` — non-destructive
|
||||
|
||||
|
|
@ -13,7 +13,7 @@
|
|||
- **Uncovered drift is skipped, not interpreted.** If a table's Lance HEAD is ahead of the version recorded in `__manifest` and no recovery sidecar covers that movement, `optimize` reports `skipped: Some(DriftNeedsRepair)` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift.
|
||||
- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
|
||||
- Returns `[TableOptimizeStats { table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version }]`.
|
||||
- **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: Some(BlobColumnsUnsupportedByLance)` (and logged via `tracing::warn`) instead of compacted, and the rest of the sweep proceeds normally. The current Lance `compact_files` mis-decodes blob-v2 columns under its forced `BlobHandling::AllBinary` read; **reads and writes are unaffected** — only compaction is. This is gated by `LANCE_SUPPORTS_BLOB_COMPACTION` (`db/omnigraph/optimize.rs`) and removed when the upstream Lance fix lands (see [docs/dev/lance.md](../dev/lance.md)). Consequence: fragment count and deleted-row space on blob tables are not reclaimed until then; query results are never affected.
|
||||
- **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: Some(BlobColumnsUnsupportedByLance)` (and logged via `tracing::warn`) instead of compacted, and the rest of the sweep proceeds normally. The current Lance `compact_files` mis-decodes blob-v2 columns under its forced `BlobHandling::AllBinary` read; **reads and writes are unaffected** — only compaction is. This is gated by `LANCE_SUPPORTS_BLOB_COMPACTION` (`db/omnigraph/optimize.rs`) and removed when the upstream Lance fix lands (see [docs/dev/lance.md](../../dev/lance.md)). Consequence: fragment count and deleted-row space on blob tables are not reclaimed until then; query results are never affected.
|
||||
|
||||
## `repair_all_tables(db, options)` — explicit
|
||||
|
||||
|
|
@ -36,7 +36,7 @@
|
|||
any failed tables; rerun `cleanup` to retry them.
|
||||
- CLI guards with `--confirm`; without it, prints a preview line.
|
||||
- **Recovery floor:** `--keep < 3` may garbage-collect Lance versions that the open-time recovery sweep needs as a rollback target (the sweep restores to the branch's manifest-pinned table version, which is HEAD-1 in the typical Phase B → Phase C drift case). Default `--keep 10` is safe.
|
||||
- **Orphaned-branch reconciliation:** before the version GC, cleanup runs `reconcile_orphaned_branches`, which `force_delete_branch`es any per-table or commit-graph Lance branch absent from the manifest branch list. These orphans arise when a `branch_delete` flips the manifest authority but a downstream best-effort reclaim does not complete (see [branches-commits.md](branches-commits.md)). The reconciler is authority-derived and idempotent (it no-ops once nothing is orphaned), runs regardless of the `keep_versions` / `older_than` values (those gate version GC only), and never reclaims `main` or system-branch forks. Reclaimed forks are logged via `tracing::info`.
|
||||
- **Orphaned-branch reconciliation:** before the version GC, cleanup runs `reconcile_orphaned_branches`, which `force_delete_branch`es any per-table or commit-graph Lance branch absent from the manifest branch list. These orphans arise when a `branch_delete` flips the manifest authority but a downstream best-effort reclaim does not complete (see [branches-commits.md](../branching/index.md)). The reconciler is authority-derived and idempotent (it no-ops once nothing is orphaned), runs regardless of the `keep_versions` / `older_than` values (those gate version GC only), and never reclaims `main` or system-branch forks. Reclaimed forks are logged via `tracing::info`.
|
||||
|
||||
## Tombstones
|
||||
|
||||
|
|
@ -44,6 +44,6 @@ Logical sub-table delete markers in `__manifest`; `tombstone_object_id(table_key
|
|||
|
||||
## Internal schema migrations (`db/manifest/migrations.rs`)
|
||||
|
||||
Version evolutions of the on-disk `__manifest` shape are reconciled automatically on the first write under a new binary. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape the binary expects; the on-disk stamp `omnigraph:internal_schema_version` (Lance schema-level metadata) records the on-disk shape. The publisher's open-for-write path calls `migrate_internal_schema` before reading state; reads are side-effect-free. No operator action is required for in-place upgrades. See [storage.md → Internal schema versioning](storage.md) for the full mechanism.
|
||||
Version evolutions of the on-disk `__manifest` shape are reconciled automatically on the first write under a new binary. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape the binary expects; the on-disk stamp `omnigraph:internal_schema_version` (Lance schema-level metadata) records the on-disk shape. The publisher's open-for-write path calls `migrate_internal_schema` before reading state; reads are side-effect-free. No operator action is required for in-place upgrades. See [storage.md → Internal schema versioning](../concepts/storage.md) for the full mechanism.
|
||||
|
||||
A binary opening a manifest stamped at a version *higher* than it knows about refuses to publish with a clear "upgrade omnigraph first" error — old binaries cannot clobber a newer schema.
|
||||
|
|
@ -21,7 +21,7 @@ revision** (`state.json` + content-addressed blobs) instead of
|
|||
`omnigraph.yaml` — an exclusive boot source: combining it with `<URI>`,
|
||||
`--target`, or `--config` is a startup error, and `omnigraph.yaml` is never
|
||||
read in this mode. Always multi-graph routing. See
|
||||
[cluster-config.md](cluster-config.md#serving-from-the-cluster-the-mode-switch)
|
||||
[cluster-config.md](../clusters/config.md#serving-from-the-cluster-the-mode-switch)
|
||||
for what is read and the fail-fast readiness rules. `--bind`,
|
||||
`--unauthenticated`, and the bearer-token env vars work identically.
|
||||
|
||||
|
|
@ -36,7 +36,7 @@ Mode inference:
|
|||
|
||||
### Stored-query validation at startup
|
||||
|
||||
If a graph declares a `queries:` registry (see [cli-reference](cli-reference.md)), the server **loads and type-checks every stored query against that graph's live schema at startup** and **refuses to boot** if any query references a type or property the schema lacks — the same fail-loud posture as a malformed policy file, so schema drift surfaces at the deploy boundary rather than at invocation. Two MCP-exposed queries claiming the same tool name is likewise a boot error. Non-blocking advisories (e.g. an MCP-exposed query with a vector parameter an agent cannot supply) are logged. Validate offline before deploying with `omnigraph queries validate`. Discover the exposed queries as a typed tool catalog with `GET /queries`, and invoke one over HTTP with `POST /queries/{name}` (both below).
|
||||
If a graph declares a `queries:` registry (see [cli-reference](../cli/reference.md)), the server **loads and type-checks every stored query against that graph's live schema at startup** and **refuses to boot** if any query references a type or property the schema lacks — the same fail-loud posture as a malformed policy file, so schema drift surfaces at the deploy boundary rather than at invocation. Two MCP-exposed queries claiming the same tool name is likewise a boot error. Non-blocking advisories (e.g. an MCP-exposed query with a vector parameter an agent cannot supply) are logged. Validate offline before deploying with `omnigraph queries validate`. Discover the exposed queries as a typed tool catalog with `GET /queries`, and invoke one over HTTP with `POST /queries/{name}` (both below).
|
||||
|
||||
## Endpoint inventory
|
||||
|
||||
|
|
@ -209,7 +209,7 @@ admission-gated.
|
|||
policy file without tokens is also rejected at startup. In open mode
|
||||
`/openapi.json` strips the security scheme.
|
||||
|
||||
See [deployment.md](deployment.md) for token-source operational details.
|
||||
See [deployment.md](../deployment.md) for token-source operational details.
|
||||
|
||||
## Tracing & observability
|
||||
|
||||
|
|
@ -72,7 +72,7 @@ A single mutation query must be **either insert/update-only or delete-only**. Mi
|
|||
|
||||
> `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes. This restriction lifts when Lance exposes a two-phase delete API (tracked: MR-793 / Lance-upstream).`
|
||||
|
||||
Reason: under the staged-write rewire (MR-794), inserts and updates accumulate in memory and commit at end-of-query, while deletes still inline-commit (Lance v6.0.1 has no public two-phase delete). Mixing creates ordering hazards (same-row insert→delete becomes a no-op because the staged insert isn't visible to delete; cascading deletes of just-inserted edges break referential integrity by silent design). Until the MR-A Lance v7 bump migrates `delete_where` to staged (`DeleteBuilder::execute_uncommitted` first ships in `v7.0.0-beta.10`), the parse-time rejection keeps both paths atomic and correct. See [docs/dev/writes.md](../dev/writes.md), [docs/dev/lance.md](../dev/lance.md), and [docs/dev/invariants.md](../dev/invariants.md).
|
||||
Reason: under the staged-write rewire (MR-794), inserts and updates accumulate in memory and commit at end-of-query, while deletes still inline-commit (Lance v6.0.1 has no public two-phase delete). Mixing creates ordering hazards (same-row insert→delete becomes a no-op because the staged insert isn't visible to delete; cascading deletes of just-inserted edges break referential integrity by silent design). Until the MR-A Lance v7 bump migrates `delete_where` to staged (`DeleteBuilder::execute_uncommitted` first ships in `v7.0.0-beta.10`), the parse-time rejection keeps both paths atomic and correct. See [docs/dev/writes.md](../../dev/writes.md), [docs/dev/lance.md](../../dev/lance.md), and [docs/dev/invariants.md](../../dev/invariants.md).
|
||||
|
||||
## IR (Intermediate Representation)
|
||||
|
||||
|
|
@ -81,7 +81,7 @@ Reason: under the staged-write rewire (MR-794), inserts and updates accumulate i
|
|||
Pipeline operations:
|
||||
|
||||
- `NodeScan { variable, type_name, filters }`
|
||||
- `Expand { src_var, dst_var, edge_type, direction (Out|In), dst_type, min_hops, max_hops, dst_filters }` — destination filters are pushed *into* the expand so Lance scalar pushdown can prune. Executed one of two ways, chosen per-expand by a cost model over cheap manifest counts (frontier size, |E|, source-vertex count, hops) plus index coverage: selective traversals (small frontier relative to the source set) resolve neighbors from the persisted `src`/`dst` BTREE (one indexed scan per hop); dense / deep / large-frontier traversals — or those whose BTREE coverage is degraded so a full scan would be paid per hop — use the in-memory CSR adjacency index. Both produce identical results. The `OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER` / `OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS` ceilings bound the *initial dispatch* frontier/hops (beyond them CSR is always used); the cost model estimates total indexed work as ~`hops × frontier × fanout` and prices dense fan-out toward CSR — they are not a hard per-hop bound. `OMNIGRAPH_TRAVERSAL_MODE=indexed|csr` forces a mode (see [constants](constants.md)).
|
||||
- `Expand { src_var, dst_var, edge_type, direction (Out|In), dst_type, min_hops, max_hops, dst_filters }` — destination filters are pushed *into* the expand so Lance scalar pushdown can prune. Executed one of two ways, chosen per-expand by a cost model over cheap manifest counts (frontier size, |E|, source-vertex count, hops) plus index coverage: selective traversals (small frontier relative to the source set) resolve neighbors from the persisted `src`/`dst` BTREE (one indexed scan per hop); dense / deep / large-frontier traversals — or those whose BTREE coverage is degraded so a full scan would be paid per hop — use the in-memory CSR adjacency index. Both produce identical results. The `OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER` / `OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS` ceilings bound the *initial dispatch* frontier/hops (beyond them CSR is always used); the cost model estimates total indexed work as ~`hops × frontier × fanout` and prices dense fan-out toward CSR — they are not a hard per-hop bound. `OMNIGRAPH_TRAVERSAL_MODE=indexed|csr` forces a mode (see [constants](../reference/constants.md)).
|
||||
- `Filter { left, op, right }`
|
||||
- `AntiJoin { outer_var, inner: Vec<IROp> }` — for `not { … }`
|
||||
|
||||
|
|
@ -23,4 +23,4 @@ This is OmniGraph-specific (not Lance):
|
|||
- `CsrIndex`: Compressed Sparse Row representation of edges per edge type — `offsets[i]..offsets[i+1]` slices into `targets`.
|
||||
- `GraphIndex { type_indices, csr (out), csc (in) }` — built on demand from a snapshot's edge tables, **lazily**: only when an `Expand` the planner routes to the CSR path (dense / large frontier) or an `AntiJoin` actually needs it.
|
||||
- Cached in `RuntimeCache::graph_indices` (LRU, max 8 entries, keyed by snapshot id + edge table versions).
|
||||
- Selective `Expand`s resolve neighbors from the persisted `src`/`dst` BTREE instead (one indexed scan per hop) and never trigger the CSR build; see [query-language](query-language.md) → Expand. Pure scans, and queries served entirely by the indexed traversal path, skip it.
|
||||
- Selective `Expand`s resolve neighbors from the persisted `src`/`dst` BTREE instead (one indexed scan per hop) and never trigger the CSR build; see [query-language](../queries/index.md) → Expand. Pure scans, and queries served entirely by the indexed traversal path, skip it.
|
||||
Loading…
Add table
Add a link
Reference in a new issue