mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-27 02:39:38 +02:00
Merge origin/main (v0.6.2, cluster Stage 2B) into ragnorc/scrutinize-rfc-002
Conflict resolutions: - cli/main.rs: keep the omnigraph_api_types import (branch extraction) and add main's omnigraph_cluster import; test-import list takes the branch's ResolvedCliGraph/is_remote_uri additions - Cargo manifests: union of deps — branch's config/queries/api-types crates plus main's cluster crate; all versions unified at 0.6.2 (new crates bumped from 0.6.1) - cli/Cargo.toml: omnigraph-server dep stays dropped (branch decision; CLI references it only in comments and the test harness binary spawn) - AGENTS.md: 0.6.2 + union crate list - cli-reference.md: branch's --graph/layered-config sections + main's cluster/repair rows - Cargo.lock regenerated cargo test --workspace --locked passes; scripts/check-agents-md.sh passes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
commit
48912167d0
91 changed files with 10889 additions and 736 deletions
|
|
@ -4,4 +4,4 @@
|
|||
- `_as` variants of every write API let callers override the actor: `mutate_as`, `ingest_as`, `branch_merge_as`, `apply_schema_as`, etc.
|
||||
- Actor IDs are persisted on `GraphCommit.actor_id` with split storage in `_graph_commit_actors.lance` (the commit graph is split into `_graph_commits.lance` for the linkage and `_graph_commit_actors.lance` for the actor map).
|
||||
- HTTP server uses the bearer-token actor automatically; CLI uses the local user / explicit env (no implicit actor).
|
||||
- Pre-v0.4.0 graphs also stored actor IDs on `RunRecord.actor_id` in `_graph_runs.lance` / `_graph_run_actors.lance`. The Run state machine was removed in MR-771; those files are inert post-v0.4.0 and reclaimed by MR-770's production sweep.
|
||||
- Pre-v0.4.0 graphs also stored actor IDs on `RunRecord.actor_id` in `_graph_runs.lance` / `_graph_run_actors.lance`. The Run state machine was removed in MR-771; those files are inert post-v0.4.0. The v2→v3 manifest migration sweeps any stale `__run__*` branches on first write-open (MR-770); the inert dataset bytes remain until a `delete_prefix` primitive lands.
|
||||
|
|
|
|||
|
|
@ -9,8 +9,8 @@ Lance supports branching at the dataset level: a branch is a named lineage of ve
|
|||
OmniGraph builds *graph branches* on top by branching every sub-table coherently:
|
||||
|
||||
- `branch_create(name)` / `branch_create_from(target, name)` — disallowed name `main`; fails if branch exists; ensures the schema-apply lock is idle. Atomic and authority-first like `branch_delete`: it flips the `__manifest` branch (authority), then creates the derived commit-graph branch, force-dropping any orphaned commit-graph ref left by an incomplete prior delete (the manifest branch is fresh, so a same-named commit-graph branch is provably a zombie). If commit-graph creation fails, the manifest branch is rolled back so the name never half-exists.
|
||||
- `branch_list()` — returns public branches, **filters internal** `__run__…` and `__schema_apply_lock__` prefixes.
|
||||
- `branch_delete(name)` — refuses if there are descendants or active runs on the branch. The manifest is the single authority for branch existence: deletion flips the `__manifest` branch ref first (one atomic op), after which the branch is gone from every snapshot. The owned per-table forks and the commit-graph branch are derived state, reclaimed best-effort with `force_delete_branch` after the flip. A failure during that reclaim (transient object-store error) does not fail the call or block the authority flip; the leftover forks are unreachable orphans that the [`cleanup`](maintenance.md) reconciler converges. One consequence: if a delete's best-effort reclaim fails, reusing that branch name before the next `cleanup` surfaces a clear error pointing at `cleanup` (the stale fork would otherwise collide on first write).
|
||||
- `branch_list()` — returns public branches, **filters the internal** `__schema_apply_lock__` branch.
|
||||
- `branch_delete(name)` — refuses if there are descendants on the branch, or if it is the current branch. The manifest is the single authority for branch existence: deletion flips the `__manifest` branch ref first (one atomic op), after which the branch is gone from every snapshot. The owned per-table forks and the commit-graph branch are derived state, reclaimed best-effort with `force_delete_branch` after the flip. A failure during that reclaim (transient object-store error) does not fail the call or block the authority flip; the leftover forks are unreachable orphans that the [`cleanup`](maintenance.md) reconciler converges. One consequence: if a delete's best-effort reclaim fails, reusing that branch name before the next `cleanup` surfaces a clear error pointing at `cleanup` (the stale fork would otherwise collide on first write).
|
||||
- **Lazy forking**: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share fragments with their source. A fork collision is classified by the manifest authority, not by Lance branch versions: if the live manifest already records the fork on the active branch, a concurrent first-write won and the caller gets a retryable "refresh and retry"; if the manifest does not, a physical branch there is an orphan and the caller is pointed at `cleanup`.
|
||||
- `sync_branch(branch)` — re-binds the in-memory handle to the latest head of the branch.
|
||||
|
||||
|
|
@ -51,13 +51,13 @@ Notes:
|
|||
|
||||
## L2 — Internal system branches
|
||||
|
||||
Filtered from `branch_list()` but visible to internals:
|
||||
Internal or legacy branch refs:
|
||||
|
||||
- `__schema_apply_lock__` — serializes schema migrations.
|
||||
- `__run__<run-id>` — legacy from the pre-v0.4.0 Run state machine (removed in MR-771). The branch-name guard predicate `is_internal_run_branch` is kept as defense-in-depth so users cannot create a branch matching the legacy prefix; the filter will be removed once production legacy branches are swept (MR-770).
|
||||
- `__schema_apply_lock__` — serializes schema migrations; filtered from `branch_list()` but visible to internals.
|
||||
- `__run__<run-id>` — legacy from the pre-v0.4.0 Run state machine (removed in MR-771). These are swept off `__manifest` on the first read-write open by the v2→v3 internal-schema migration (MR-770), and `__run__*` is no longer a reserved name. Known limitation: a pre-v0.4.0 graph opened **read-only** still surfaces any stale `__run__*` branch in `branch_list()` until its first read-write open (the migration is write-path-only, like all manifest migrations).
|
||||
|
||||
## L2 — Recovery audit trail
|
||||
|
||||
The four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) protect their multi-table commits with a sidecar at `__recovery/{ulid}.json` written before Phase B and deleted after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the recovery sweep in `crates/omnigraph/src/db/manifest/recovery.rs`: classify per-table state, decide all-or-nothing per sidecar, roll forward / back, record an audit row.
|
||||
The five migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`, `optimize_all_tables`) protect their multi-table commits with a sidecar at `__recovery/{ulid}.json` written before Phase B and deleted after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the recovery sweep in `crates/omnigraph/src/db/manifest/recovery.rs`: classify per-table state, decide all-or-nothing per sidecar, roll forward / back, record an audit row.
|
||||
|
||||
Audit rows live in `_graph_commit_recoveries.lance` (sibling to `_graph_commits.lance`) and reference the commit graph by `graph_commit_id`. The linked recovery commit is identified by that same `graph_commit_id`, and `actor_id="omnigraph:recovery"` is stored in `_graph_commit_actors.lance` (joined by `graph_commit_id`) — `_graph_commits.lance` itself does not carry the `actor_id` column. To find recoveries for a specific original actor: `omnigraph commit list --filter actor=omnigraph:recovery`, then join to `_graph_commit_recoveries.lance` by `graph_commit_id` to read `recovery_for_actor`. Schema: see `crates/omnigraph/src/db/recovery_audit.rs`.
|
||||
|
|
|
|||
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](cli.md).
|
||||
|
||||
17 top-level command families, 40+ subcommands. All commands accept either a positional `URI`, `--uri`, or a `--graph <name>` resolved against `omnigraph.yaml`.
|
||||
Top-level command families and subcommands. Graph-targeting commands accept either a positional `URI`, `--uri`, or a `--graph <name>` resolved against `omnigraph.yaml`; `cluster` commands use `--config <dir>`.
|
||||
|
||||
## Top-level commands
|
||||
|
||||
|
|
@ -17,11 +17,12 @@ A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` sc
|
|||
| `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) |
|
||||
| `branch create \| list \| delete \| merge` | branching ops |
|
||||
| `commit list \| show` | inspect commit graph |
|
||||
| `run list \| show \| publish \| abort` | transactional run ops |
|
||||
| `schema plan \| apply \| show (alias: get)` | migrations |
|
||||
| `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
|
||||
| `queries validate \| list` | operate on the server-side stored-query registry (the `queries:` block). `validate` type-checks every stored query against the live schema offline (opens the selected graph; exits non-zero on any breakage), catching schema drift without restarting the server; `list` prints the selected registry's query names, MCP exposure, and typed params. For per-graph registries, pass `--graph <graph>` or set `defaults.graph`; with no graph selection, `list` shows only top-level `queries:`. Distinct from `lint`, which validates a single `.gq` file |
|
||||
| `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns; `--json` reports a `skipped` field) |
|
||||
| `cluster validate \| plan \| status \| refresh \| import` | cluster-control preview. `validate` checks a local `cluster.yaml` folder and referenced schema/query/policy files; `plan` diffs it against local JSON state at `__cluster/state.json`; `status` reads the state ledger; `refresh`/`import` explicitly update local JSON state from read-only graph observations. No apply, graph-resource mutation, server change, or `plan --refresh` occurs in Stage 2B |
|
||||
| `optimize` | non-destructive Lance compaction (skips tables with `Blob` columns or uncovered drift; `--json` reports `skipped`) |
|
||||
| `repair [--confirm] [--force]` | preview or explicitly publish uncovered manifest/head drift. `--confirm` heals verified maintenance drift and exits non-zero if suspicious/unverifiable drift is refused; `--force --confirm` publishes suspicious/unverifiable drift after operator review |
|
||||
| `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
|
||||
| `embed` | offline JSONL embedding pipeline |
|
||||
| `policy validate \| test \| explain` | Cedar tooling. Selects `defaults.graph`, else `serve.graphs`, else top-level `policy.file` |
|
||||
|
|
@ -123,6 +124,28 @@ directory of the layer that defined them.
|
|||
`config view --resolved <graph>` prints the final embedded/remote locator. The server
|
||||
does **not** layer the global config — it reads only the project/`--config` manifest.
|
||||
|
||||
## Cluster config preview
|
||||
|
||||
```bash
|
||||
omnigraph cluster validate --config ./company-brain
|
||||
omnigraph cluster plan --config ./company-brain --json
|
||||
omnigraph cluster status --config ./company-brain --json
|
||||
omnigraph cluster refresh --config ./company-brain --json
|
||||
omnigraph cluster import --config ./company-brain --json
|
||||
```
|
||||
|
||||
`--config` is a directory containing `cluster.yaml`; it defaults to `.`.
|
||||
Stage 2B accepts graphs, schemas, stored queries, and policy bundle file
|
||||
references. `cluster plan` reads local JSON state from
|
||||
`<config-dir>/__cluster/state.json`; a missing file means empty state. Plan,
|
||||
refresh, and import acquire `__cluster/lock.json` by default and release it
|
||||
before returning. `cluster status` reads state only and reports any existing
|
||||
lock. `refresh` requires an existing `state.json`; `import` creates one only
|
||||
when it is missing. Both observe declared graphs read-only at
|
||||
`<config-dir>/graphs/<graph-id>.omni`. External state backends, apply,
|
||||
`plan --refresh`, pipelines, UI specs, embeddings, aliases, and bindings are
|
||||
reserved for later stages. See [cluster-config.md](cluster-config.md).
|
||||
|
||||
## Output formats (`query` command, alias: `read`)
|
||||
|
||||
- `json` — pretty-printed object with metadata + rows
|
||||
|
|
|
|||
152
docs/user/cluster-config.md
Normal file
152
docs/user/cluster-config.md
Normal file
|
|
@ -0,0 +1,152 @@
|
|||
# Cluster Config
|
||||
|
||||
**Status:** Stage 2B state-observation preview.
|
||||
|
||||
Cluster config is the future control-plane configuration surface for a whole
|
||||
OmniGraph deployment. In this stage, OmniGraph can validate a local
|
||||
`cluster.yaml` folder, produce a deterministic read-only plan, inspect the
|
||||
local JSON state ledger, and explicitly refresh/import graph observations into
|
||||
that ledger. It does not apply desired changes, start servers, or write graph
|
||||
resources.
|
||||
|
||||
## Commands
|
||||
|
||||
```bash
|
||||
omnigraph cluster validate --config ./company-brain
|
||||
omnigraph cluster plan --config ./company-brain --json
|
||||
omnigraph cluster status --config ./company-brain --json
|
||||
omnigraph cluster refresh --config ./company-brain --json
|
||||
omnigraph cluster import --config ./company-brain --json
|
||||
```
|
||||
|
||||
`--config` points at a directory, not a file. The directory must contain
|
||||
`cluster.yaml`. When omitted, it defaults to the current directory.
|
||||
|
||||
## Supported `cluster.yaml`
|
||||
|
||||
Stage 2B accepts only the read-only resource subset:
|
||||
|
||||
```yaml
|
||||
version: 1
|
||||
metadata:
|
||||
name: company-brain
|
||||
|
||||
state:
|
||||
backend: cluster
|
||||
lock: true
|
||||
|
||||
graphs:
|
||||
knowledge:
|
||||
schema: ./knowledge.pg
|
||||
queries:
|
||||
find_experts:
|
||||
file: ./knowledge.gq
|
||||
|
||||
policies:
|
||||
base:
|
||||
file: ./base.policy.yaml
|
||||
applies_to: [knowledge]
|
||||
```
|
||||
|
||||
`metadata.name` is a display label. `state.backend` may be omitted or set to
|
||||
`cluster`; external state backends are reserved for a later stage. `state.lock`
|
||||
defaults to `true`. When enabled, `cluster plan`, `cluster refresh`, and
|
||||
`cluster import` briefly acquire `<config-dir>/__cluster/lock.json`, then remove
|
||||
it before returning. `cluster status` never acquires the lock; it only reports
|
||||
whether one is present.
|
||||
|
||||
## Validation
|
||||
|
||||
`cluster validate` checks:
|
||||
|
||||
- `cluster.yaml` syntax and supported fields
|
||||
- duplicate YAML keys
|
||||
- schema, query, and policy file existence
|
||||
- schema parsing and catalog construction
|
||||
- stored-query parsing and query-name matching
|
||||
- stored-query type-checking against the desired schema
|
||||
- policy `applies_to` graph references
|
||||
|
||||
Fields reserved for later phases, such as `pipelines`, `embeddings`, `ui`,
|
||||
`aliases`, and `bindings`, fail with a typed diagnostic instead of being
|
||||
silently ignored.
|
||||
|
||||
## Planning
|
||||
|
||||
`cluster plan` first performs validation, then reads local JSON state from:
|
||||
|
||||
```text
|
||||
<config-dir>/__cluster/state.json
|
||||
```
|
||||
|
||||
If the file is missing, the state is treated as empty and every desired
|
||||
resource is planned as a create. If present, the file must use this shape:
|
||||
|
||||
```json
|
||||
{
|
||||
"version": 1,
|
||||
"state_revision": 0,
|
||||
"applied_revision": {
|
||||
"config_digest": "...",
|
||||
"resources": {
|
||||
"graph.knowledge": { "digest": "..." },
|
||||
"schema.knowledge": { "digest": "..." },
|
||||
"query.knowledge.find_experts": { "digest": "..." },
|
||||
"policy.base": { "digest": "..." }
|
||||
}
|
||||
},
|
||||
"resource_statuses": {
|
||||
"graph.knowledge": {
|
||||
"status": "applied",
|
||||
"conditions": [],
|
||||
"message": "optional status detail"
|
||||
}
|
||||
},
|
||||
"approval_records": {},
|
||||
"recovery_records": {},
|
||||
"observations": {}
|
||||
}
|
||||
```
|
||||
|
||||
`state_revision`, `resource_statuses`, `approval_records`, `recovery_records`,
|
||||
and `observations` are optional so older Stage 1 state fixtures keep working.
|
||||
Missing `state_revision` is treated as `0`. Resource status values are
|
||||
`pending`, `planned`, `applying`, `applied`, `drifted`, `blocked`, or `error`.
|
||||
|
||||
Plan output compares desired resource digests against state resource digests
|
||||
and reports `create`, `update`, and `delete` changes. It also reports the state
|
||||
CAS (`sha256:<digest>`) and state revision. `state_observations.locked` means an
|
||||
existing lock file was observed; a successful `plan` instead reports
|
||||
`lock_acquired: true` and an `acquired_lock_id`, then releases the lock before
|
||||
returning. The command never writes `state.json` and does not scan live graphs.
|
||||
Use explicit `cluster refresh` / `cluster import` when the state ledger should
|
||||
be updated from live observations. Apply and live drift scans during plan are
|
||||
later-stage work.
|
||||
|
||||
## Status
|
||||
|
||||
`cluster status` reads the same local JSON state ledger and prints what the
|
||||
ledger says is deployed. It does not validate referenced schema/query/policy
|
||||
files and does not inspect live graphs. Missing `state.json` succeeds with a
|
||||
warning; invalid state JSON or an unsupported state version fails.
|
||||
|
||||
## Refresh And Import
|
||||
|
||||
`cluster refresh` updates an existing `state.json` from actual observations.
|
||||
`cluster import` creates the first `state.json` when the ledger is missing.
|
||||
Both commands open declared graphs read-only at:
|
||||
|
||||
```text
|
||||
<config-dir>/graphs/<graph-id>.omni
|
||||
```
|
||||
|
||||
They observe only branch `main`, recording graph existence, manifest version,
|
||||
live schema digest, desired schema digest, and schema-match status under
|
||||
`observations["graph.<id>"]`. Missing graph roots are recorded as drift and
|
||||
remove the graph/schema digests from state so a later `plan` proposes creates.
|
||||
Invalid graph roots are recorded as errors; `refresh` persists the error
|
||||
observation and exits non-zero, while `import` exits non-zero without creating
|
||||
initial state.
|
||||
|
||||
Refresh/import do not observe query or policy resources yet. Existing query and
|
||||
policy state digests are preserved on refresh and are not invented on import.
|
||||
|
|
@ -4,15 +4,19 @@
|
|||
|---|---|---|
|
||||
| `MANIFEST_DIR` | `__manifest` | `db/manifest/layout.rs` |
|
||||
| Commit graph dir | `_graph_commits.lance` | `db/commit_graph.rs` |
|
||||
| Run registry dir (legacy, removed MR-771) | `_graph_runs.lance` | inert post-v0.4.0; reclaimed by MR-770 |
|
||||
| Run branch prefix (legacy, removed MR-771) | `__run__` | filtered by `is_internal_run_branch` defense-in-depth |
|
||||
| Run registry dir (legacy, removed MR-771) | `_graph_runs.lance` | inert post-v0.4.0; bytes remain until a `delete_prefix` primitive lands |
|
||||
| Run branch prefix (legacy, removed MR-771/MR-770) | `__run__` | swept off `__manifest` by the v2→v3 migration; no longer a reserved name |
|
||||
| Schema apply lock | `__schema_apply_lock__` | `db/mod.rs` |
|
||||
| Manifest publisher retry budget | `PUBLISHER_RETRY_BUDGET = 5` | `db/manifest/publisher.rs` |
|
||||
| Internal manifest schema version | `INTERNAL_MANIFEST_SCHEMA_VERSION = 2` | `db/manifest/migrations.rs` |
|
||||
| Internal manifest schema version | `INTERNAL_MANIFEST_SCHEMA_VERSION = 3` | `db/manifest/migrations.rs` |
|
||||
| Merge stage batch | `MERGE_STAGE_BATCH_ROWS = 8192` | `exec/merge.rs` |
|
||||
| Maintenance concurrency | `OMNIGRAPH_MAINTENANCE_CONCURRENCY=8` | `db/omnigraph/optimize.rs` |
|
||||
| Lance blob compaction support | `LANCE_SUPPORTS_BLOB_COMPACTION = false` | `db/omnigraph/optimize.rs` |
|
||||
| Graph index cache size | `8` (LRU) | `runtime_cache.rs` |
|
||||
| Expand indexed-path frontier ceiling | `OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER=1024` | `exec/query.rs` |
|
||||
| Expand indexed-path hop ceiling | `OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS=6` | `exec/query.rs` |
|
||||
| Expand CSR-build cost factor | `CSR_BUILD_FACTOR = 1.5` | `exec/query.rs` |
|
||||
| Expand mode override | `OMNIGRAPH_TRAVERSAL_MODE` (`indexed`\|`csr`; unset = cost-based auto) | `exec/query.rs` |
|
||||
| Default body limit | `1 MB` | `omnigraph-server/lib.rs` |
|
||||
| Ingest body limit | `32 MB` | `omnigraph-server/lib.rs` |
|
||||
| Engine embed model | `gemini-embedding-2-preview` | `omnigraph/embedding.rs` |
|
||||
|
|
@ -21,3 +25,16 @@
|
|||
| Embed retries | `4` | both clients |
|
||||
| Embed retry backoff | `200 ms` | both clients |
|
||||
| LANCE memory pool default | `1 GB` (raised in v0.3.0) | runtime |
|
||||
|
||||
**Expand traversal dispatch.** With `OMNIGRAPH_TRAVERSAL_MODE` unset, the engine
|
||||
chooses the indexed (per-hop BTREE) vs CSR (whole-graph in-memory) path with a
|
||||
cost model over cheap manifest counts (frontier size, |E|, source-vertex count,
|
||||
hops) plus the index-coverage signal: the indexed path is preferred when its
|
||||
frontier-relative work beats building the CSR (≈ when `hops × frontier` is a
|
||||
small fraction of the source-vertex set), and CSR is preferred for dense/deep
|
||||
traversals or when the BTREE coverage is degraded and a full scan would be paid
|
||||
per hop. The two ceilings bound the **initial dispatch** frontier/hops (beyond
|
||||
them CSR is always used); they are not a hard per-hop bound — the cost model
|
||||
*estimates* total indexed work as ~`hops × frontier × fanout`, so dense fan-out is
|
||||
priced toward CSR rather than capped mid-traversal. The override flag forces a path (the `auto` result is identical either way;
|
||||
only the path differs).
|
||||
|
|
|
|||
|
|
@ -13,6 +13,7 @@ of MRs, internal recovery mechanics, or contributor-only invariants.
|
|||
| Install OmniGraph | [install.md](install.md) |
|
||||
| Run the CLI locally | [cli.md](cli.md) |
|
||||
| Look up every CLI flag and config field | [cli-reference.md](cli-reference.md) |
|
||||
| Validate and plan cluster config | [cluster-config.md](cluster-config.md) |
|
||||
| Write schemas | [schema-language.md](schema-language.md) |
|
||||
| Read schema-lint diagnostic codes | [schema-lint.md](schema-lint.md) |
|
||||
| Write queries and mutations | [query-language.md](query-language.md) |
|
||||
|
|
|
|||
|
|
@ -21,6 +21,6 @@ This is OmniGraph-specific (not Lance):
|
|||
|
||||
- `TypeIndex`: dense `u32 ↔ String id` mapping per node type.
|
||||
- `CsrIndex`: Compressed Sparse Row representation of edges per edge type — `offsets[i]..offsets[i+1]` slices into `targets`.
|
||||
- `GraphIndex { type_indices, csr (out), csc (in) }` — built on demand from a snapshot's edge tables.
|
||||
- `GraphIndex { type_indices, csr (out), csc (in) }` — built on demand from a snapshot's edge tables, **lazily**: only when an `Expand` the planner routes to the CSR path (dense / large frontier) or an `AntiJoin` actually needs it.
|
||||
- Cached in `RuntimeCache::graph_indices` (LRU, max 8 entries, keyed by snapshot id + edge table versions).
|
||||
- Built only when an `Expand` or `AntiJoin` IR op is present in the lowered query, so pure scans skip it.
|
||||
- Selective `Expand`s resolve neighbors from the persisted `src`/`dst` BTREE instead (one indexed scan per hop) and never trigger the CSR build; see [query-language](query-language.md) → Expand. Pure scans, and queries served entirely by the indexed traversal path, skip it.
|
||||
|
|
|
|||
|
|
@ -1,15 +1,26 @@
|
|||
# Maintenance: Optimize & Cleanup
|
||||
# Maintenance: Optimize, Repair & Cleanup
|
||||
|
||||
`db/omnigraph/optimize.rs`.
|
||||
`db/omnigraph/optimize.rs` and `db/omnigraph/repair.rs`.
|
||||
|
||||
## `optimize_all_tables(db)` — non-destructive
|
||||
|
||||
- Lance `compact_files()` on every node + edge table on `main`.
|
||||
- Rewrites small fragments into fewer large ones; old fragments remain reachable via older manifests.
|
||||
- Lance `compact_files()` on every node + edge table on `main`, then **publishes the compacted version to the `__manifest`** so the manifest's `table_version` tracks the compacted Lance HEAD. Reads pin the manifest version, so without this publish compaction would be invisible to readers *and* would break the HEAD-vs-manifest precondition of the next schema apply / strict update/delete ("stale view … refresh and retry"). The publish advances the graph version (a system-attributed commit) only for tables that actually compacted.
|
||||
- Rewrites small fragments into fewer large ones; old fragments remain reachable via older manifests until `cleanup` runs.
|
||||
- Each table's compact→publish runs under its per-`(table, main)` write queue (serializing with concurrent mutations — compaction is a Lance `Rewrite` op that retryable-conflicts with a concurrent merge/update/delete on overlapping fragments). The Lance-HEAD-before-manifest-publish gap is covered by a `SidecarKind::Optimize` recovery sidecar (loose-match): a crash in that window rolls the compacted version forward on the next `Omnigraph::open` (compaction is content-preserving, so roll-forward is always safe).
|
||||
- **Requires a recovered graph.** `optimize` refuses (errors) when an unresolved recovery sidecar is present under `__recovery` — operating on an unrecovered graph could publish a partial write the open-time recovery sweep would roll back. Reopen the graph to run the recovery sweep, then re-run `optimize`.
|
||||
- **Uncovered drift is skipped, not interpreted.** If a table's Lance HEAD is ahead of the version recorded in `__manifest` and no recovery sidecar covers that movement, `optimize` reports `skipped: Some(DriftNeedsRepair)` with the manifest/head versions and leaves the table untouched. Run `omnigraph repair` to classify and explicitly publish that drift.
|
||||
- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
|
||||
- Returns `[TableOptimizeStats { table_key, fragments_removed, fragments_added, committed, skipped }]`.
|
||||
- Returns `[TableOptimizeStats { table_key, fragments_removed, fragments_added, committed, skipped, manifest_version, lance_head_version }]`.
|
||||
- **Blob tables are skipped.** A table that declares any `Blob` property is not compacted: it is reported with `skipped: Some(BlobColumnsUnsupportedByLance)` (and logged via `tracing::warn`) instead of compacted, and the rest of the sweep proceeds normally. The current Lance `compact_files` mis-decodes blob-v2 columns under its forced `BlobHandling::AllBinary` read; **reads and writes are unaffected** — only compaction is. This is gated by `LANCE_SUPPORTS_BLOB_COMPACTION` (`db/omnigraph/optimize.rs`) and removed when the upstream Lance fix lands (see [docs/dev/lance.md](../dev/lance.md)). Consequence: fragment count and deleted-row space on blob tables are not reclaimed until then; query results are never affected.
|
||||
|
||||
## `repair_all_tables(db, options)` — explicit
|
||||
|
||||
- Handles **uncovered manifest/head drift**: a table's Lance HEAD is ahead of the manifest pin and no recovery sidecar records the writer intent.
|
||||
- Preview by default. `omnigraph repair --json <uri>` reports each table's `classification`, `action`, manifest/head versions, Lance operation names, and any classification error. `--confirm` publishes only verified maintenance drift; if any suspicious or unverifiable table is refused, the CLI prints the per-table output and exits non-zero. `--force --confirm` also publishes suspicious or unverifiable drift after operator review.
|
||||
- Classifies drift by reading Lance transactions from `manifest_version + 1` through `lance_head_version`. Only `ReserveFragments` and `Rewrite` are verified maintenance. Semantic operations such as `Append`, `Delete`, `Update`, `Merge`, or missing transaction history are not auto-healed.
|
||||
- Publishes repair by advancing `__manifest` to the existing Lance HEAD; it does **not** rewrite Lance data. If the publish succeeds, normal reads and strict writes use the repaired version. If it fails, no new data-side partial state was created.
|
||||
- Requires a clean recovery state. Pending `__recovery` sidecars still belong to automatic sidecar recovery, not manual repair.
|
||||
|
||||
## `cleanup_all_tables(db, options)` — destructive
|
||||
|
||||
- Lance `cleanup_old_versions()` per table.
|
||||
|
|
|
|||
|
|
@ -55,6 +55,8 @@ Used inside MATCH or as expressions inside RETURN/ORDER:
|
|||
|
||||
- `order { <expr> [asc|desc], … }` — supports plain expressions and `nearest(...)`.
|
||||
- `limit <integer>` — required when there is a `nearest(...)` ordering.
|
||||
- **Total, deterministic order.** Rows with equal user-sort keys are broken by the bound entities' key columns (`<var>.id`, ascending) appended as a final tie-break, so the result is a *total* order — reproducible across runs, and `order … limit N` returns a deterministic top-N even when ties straddle the cutoff. (Aggregate results have no entity-key columns; their group rows are already distinct on the projected group keys.)
|
||||
- **NULL placement** is *nulls-first ascending, nulls-last descending* (i.e. `nulls_first = !descending`): a NULL sorts as if smaller than any value.
|
||||
|
||||
## Mutation statements
|
||||
|
||||
|
|
@ -79,7 +81,7 @@ Reason: under the staged-write rewire (MR-794), inserts and updates accumulate i
|
|||
Pipeline operations:
|
||||
|
||||
- `NodeScan { variable, type_name, filters }`
|
||||
- `Expand { src_var, dst_var, edge_type, direction (Out|In), dst_type, min_hops, max_hops, dst_filters }` — destination filters are pushed *into* the expand so Lance scalar pushdown can prune.
|
||||
- `Expand { src_var, dst_var, edge_type, direction (Out|In), dst_type, min_hops, max_hops, dst_filters }` — destination filters are pushed *into* the expand so Lance scalar pushdown can prune. Executed one of two ways, chosen per-expand by a cost model over cheap manifest counts (frontier size, |E|, source-vertex count, hops) plus index coverage: selective traversals (small frontier relative to the source set) resolve neighbors from the persisted `src`/`dst` BTREE (one indexed scan per hop); dense / deep / large-frontier traversals — or those whose BTREE coverage is degraded so a full scan would be paid per hop — use the in-memory CSR adjacency index. Both produce identical results. The `OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER` / `OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS` ceilings bound the *initial dispatch* frontier/hops (beyond them CSR is always used); the cost model estimates total indexed work as ~`hops × frontier × fanout` and prices dense fan-out toward CSR — they are not a hard per-hop bound. `OMNIGRAPH_TRAVERSAL_MODE=indexed|csr` forces a mode (see [constants](constants.md)).
|
||||
- `Filter { left, op, right }`
|
||||
- `AntiJoin { outer_var, inner: Vec<IROp> }` — for `not { … }`
|
||||
|
||||
|
|
|
|||
|
|
@ -22,7 +22,7 @@ OmniGraph is **not** a single Lance dataset; it is a *graph* of datasets coordin
|
|||
- `edges/{fnv1a64-hex(edge_type_name)}` — one Lance dataset per edge type
|
||||
- `__manifest/` — the catalog of all sub-tables and their published versions
|
||||
- `_graph_commits.lance` / `_graph_commit_actors.lance` — the commit graph and its actor map
|
||||
- (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 graphs are inert; the run state machine was removed in MR-771 and these files are cleaned up via MR-770's production sweep)
|
||||
- (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 graphs are inert; the run state machine was removed in MR-771. The v2→v3 manifest migration sweeps stale `__run__*` branches on first write-open; the inert dataset bytes themselves remain until a `delete_prefix` storage primitive lands)
|
||||
- **Manifest row schema** (`object_id, object_type, location, metadata, base_objects, table_key, table_version, table_branch, row_count`):
|
||||
- `object_type` ∈ `table | table_version | table_tombstone`
|
||||
- `table_key` ∈ `node:<TypeName> | edge:<EdgeName>`
|
||||
|
|
@ -47,6 +47,7 @@ Adding a new on-disk shape change is one constant bump (`INTERNAL_MANIFEST_SCHEM
|
|||
|---|---|
|
||||
| v1 (implicit, pre-stamp) | `__manifest.object_id` had no PK annotation; publisher had no row-level CAS protection. |
|
||||
| v2 | `__manifest.object_id` carries `lance-schema:unenforced-primary-key=true`; row-level CAS engaged. Stamped as `omnigraph:internal_schema_version=2`. |
|
||||
| v3 | One-time sweep of legacy `__run__*` staging branches (pre-v0.4.0 Run state machine, removed MR-771) off `__manifest`. Runs at `Omnigraph::open(ReadWrite)` and on publish. Stamped as `omnigraph:internal_schema_version=3`. |
|
||||
|
||||
## On-disk layout
|
||||
|
||||
|
|
@ -91,9 +92,9 @@ flowchart TB
|
|||
- **Graph root** is one directory (or S3 prefix). Everything below is part of one OmniGraph graph.
|
||||
- **`__manifest/`** is a Lance dataset whose rows describe which sub-table version is published at which graph-branch. Reading a snapshot starts here.
|
||||
- **`nodes/`** and **`edges/`** are sibling directories holding one Lance dataset per declared type. Names are `fnv1a64-hex` of the type name to keep paths fixed-length and case-safe.
|
||||
- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 graphs also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; MR-770 sweeps these in production.)
|
||||
- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 graphs also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; the v2→v3 migration sweeps their stale `__run__*` branches, and the dataset bytes are reclaimed once `delete_prefix` lands.)
|
||||
- **`_graph_commit_recoveries.lance`** — one row per recovery sweep action. Joined to `_graph_commits.lance` by `graph_commit_id`; the linked commit row carries `actor_id=omnigraph:recovery`. Operators correlate recoveries with the original mutations they rolled forward / back via this join. See `crates/omnigraph/src/db/recovery_audit.rs`.
|
||||
- **`__recovery/{ulid}.json`** — transient sidecar files written by the four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) before Phase B begins, deleted after Phase C succeeds. A sidecar persisting after process exit means the writer crashed in the Phase B → Phase C window; the next `Omnigraph::open` recovery sweep processes it. Steady-state directory is empty. See `crates/omnigraph/src/db/manifest/recovery.rs`.
|
||||
- **`__recovery/{ulid}.json`** — transient sidecar files written by the five migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`, `optimize_all_tables`) before Phase B begins, deleted after Phase C succeeds. A sidecar persisting after process exit means the writer crashed in the Phase B → Phase C window; the next `Omnigraph::open` recovery sweep processes it. Steady-state directory is empty. See `crates/omnigraph/src/db/manifest/recovery.rs`.
|
||||
- **`_refs/branches/{name}.json`** is graph-level branch metadata — pointers from a branch name to the manifest version it heads.
|
||||
- **Inside each Lance dataset** (orange): the standard Lance directory layout. `_versions/{n}.manifest` records every commit; `data/` holds the actual Arrow fragments; `_indices/{uuid}/` holds index segments with their own `fragment_bitmap` for partial coverage; `_refs/` holds Lance-native per-dataset branches and tags.
|
||||
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue