mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-12 01:45:14 +02:00
Merge branch 'main' into devin/1779464281-mr-656-inline-query-strings
Resolve conflicts: keep query/mutate canonical CLI subcommands and top-level lint command (this branch) alongside the repo→graph terminology rename from main. Update test helpers (repo_path → graph_path, init_repo → init_graph, app_for_loaded_repo → app_for_loaded_graph) and align tempdir variable names so the merged tests compile. Drop the now-unused QueryCommand enum (Lint was promoted to a top-level Command).

Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com>
commit
9ff4af47fb
79 changed files with 2780 additions and 1894 deletions
@@ -4,4 +4,4 @@
 - `_as` variants of every write API let callers override the actor: `mutate_as`, `ingest_as`, `branch_merge_as`, `apply_schema_as`, etc.
 - Actor IDs are persisted on `GraphCommit.actor_id` with split storage in `_graph_commit_actors.lance` (the commit graph is split into `_graph_commits.lance` for the linkage and `_graph_commit_actors.lance` for the actor map).
 - HTTP server uses the bearer-token actor automatically; CLI uses the local user / explicit env (no implicit actor).
-- Pre-v0.4.0 repos also stored actor IDs on `RunRecord.actor_id` in `_graph_runs.lance` / `_graph_run_actors.lance`. The Run state machine was removed in MR-771; those files are inert post-v0.4.0 and reclaimed by MR-770's production sweep.
+- Pre-v0.4.0 graphs also stored actor IDs on `RunRecord.actor_id` in `_graph_runs.lance` / `_graph_run_actors.lance`. The Run state machine was removed in MR-771; those files are inert post-v0.4.0 and reclaimed by MR-770's production sweep.

@@ -8,7 +8,7 @@ A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` sc

 | Command | Purpose |
 |---|---|
-| `init` | `--schema <pg>` → initialize a repo (also scaffolds `omnigraph.yaml` if missing) |
+| `init` | `--schema <pg>` → initialize a graph (also scaffolds `omnigraph.yaml` if missing) |
 | `load` | bulk load a branch (`--mode overwrite\|append\|merge`) |
 | `ingest` | branch-creating transactional load (`--from <base>`) |
 | `query` (alias: `read`) | run named read query; source via `--query <path>`, `-e`/`--query-string <GQ>`, or `--alias <name>` (exactly one). `read` is the deprecated previous name and prints a one-line warning to stderr |
@@ -19,7 +19,7 @@ A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` sc
 | `commit list \| show` | inspect commit graph |
 | `run list \| show \| publish \| abort` | transactional run ops |
 | `schema plan \| apply \| show (alias: get)` | migrations |
-| `lint` (alias: `check`) | offline / repo-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
+| `lint` (alias: `check`) | offline / graph-backed query validation. Replaces `query lint` / `query check`, which are kept as deprecated argv-level shims that print a one-line warning and rewrite to `omnigraph lint` |
 | `optimize` | non-destructive Lance compaction |
 | `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
 | `embed` | offline JSONL embedding pipeline |

@@ -1,13 +1,13 @@
 # CLI Guide

-## Core Repo Flow
+## Core Graph Flow

 ```bash
-omnigraph init --schema ./schema.pg ./repo.omni
-omnigraph load --data ./data.jsonl --mode overwrite ./repo.omni
-omnigraph snapshot ./repo.omni --branch main --json
-omnigraph query --uri ./repo.omni --query ./queries.gq --name get_person --params '{"name":"Alice"}'
-omnigraph mutate --uri ./repo.omni --query ./queries.gq --name insert_person --params '{"name":"Mina","age":28}'
+omnigraph init --schema ./schema.pg ./graph.omni
+omnigraph load --data ./data.jsonl --mode overwrite ./graph.omni
+omnigraph snapshot ./graph.omni --branch main --json
+omnigraph query --uri ./graph.omni --query ./queries.gq --name get_person --params '{"name":"Alice"}'
+omnigraph mutate --uri ./graph.omni --query ./queries.gq --name insert_person --params '{"name":"Mina","age":28}'
 ```

 `omnigraph query` is the canonical read command (pairs with `POST /query`);
@@ -21,11 +21,11 @@ For ad-hoc reads and mutations (REPLs, AI agents, one-off scripts), pass the
 GQ source inline with `-e` / `--query-string` instead of a file path:

 ```bash
-omnigraph query --uri ./repo.omni \
+omnigraph query --uri ./graph.omni \
 -e 'query find($name: String) { match { $p: Person { name: $name } } return { $p.name, $p.age } }' \
 --params '{"name":"Alice"}'

-omnigraph mutate --uri ./repo.omni \
+omnigraph mutate --uri ./graph.omni \
 -e 'query add($name: String, $age: I32) { insert Person { name: $name, age: $age } }' \
 --params '{"name":"Inline","age":42}'
 ```
@@ -38,22 +38,22 @@ only the source loader changes.
 ## Branching And Reviewable Data Flows

 ```bash
-omnigraph branch create --uri ./repo.omni --from main feature-x
-omnigraph branch list --uri ./repo.omni
-omnigraph branch merge --uri ./repo.omni feature-x --into main
+omnigraph branch create --uri ./graph.omni --from main feature-x
+omnigraph branch list --uri ./graph.omni
+omnigraph branch merge --uri ./graph.omni feature-x --into main

-omnigraph ingest --data ./batch.jsonl --branch review/import-2026-04-09 ./repo.omni
-omnigraph export ./repo.omni --branch main --type Person > people.jsonl
-omnigraph commit list ./repo.omni --branch main --json
-omnigraph commit show --uri ./repo.omni <commit-id> --json
+omnigraph ingest --data ./batch.jsonl --branch review/import-2026-04-09 ./graph.omni
+omnigraph export ./graph.omni --branch main --type Person > people.jsonl
+omnigraph commit list ./graph.omni --branch main --json
+omnigraph commit show --uri ./graph.omni <commit-id> --json
 ```

 ## Remote Server Mode

-Serve a repo:
+Serve a graph:

 ```bash
-omnigraph-server ./repo.omni --bind 127.0.0.1:8080
+omnigraph-server ./graph.omni --bind 127.0.0.1:8080
 ```

 Read through the HTTP API:
@@ -73,22 +73,22 @@ and configure the matching `bearer_token_env` in `omnigraph.yaml`.

 ```bash
 omnigraph lint --query ./queries.gq --schema ./schema.pg --json
-omnigraph check --query ./queries.gq ./repo.omni --json
+omnigraph check --query ./queries.gq ./graph.omni --json

-omnigraph schema plan --schema ./next.pg ./repo.omni --json
-omnigraph schema apply --schema ./next.pg ./repo.omni --json
+omnigraph schema plan --schema ./next.pg ./graph.omni --json
+omnigraph schema apply --schema ./next.pg ./graph.omni --json
 omnigraph policy validate --config ./omnigraph.yaml
 omnigraph policy test --config ./omnigraph.yaml
 omnigraph policy explain --config ./omnigraph.yaml --actor act-alice --action read --branch main

-omnigraph commit list ./repo.omni --json
-omnigraph commit show --uri ./repo.omni <commit-id> --json
+omnigraph commit list ./graph.omni --json
+omnigraph commit show --uri ./graph.omni <commit-id> --json
 ```

 (The legacy `omnigraph run list/show/publish/abort` subcommands were removed in MR-771; mutations and loads publish atomically and the commit graph (`omnigraph commit list`) is the audit surface.)

-`query lint` and `query check` are the same command surface. In v1, repo-backed
-lint uses local or `s3://` repo URIs; HTTP targets are only supported when you
+`query lint` and `query check` are the same command surface. In v1, graph-backed
+lint uses local or `s3://` graph URIs; HTTP targets are only supported when you
 also pass `--schema`.

 ## Config

@@ -8,8 +8,8 @@ internal deploy automation.

 Omnigraph supports two broad deployment shapes:

-- local directory repos
-- `s3://` repos on AWS S3 or S3-compatible object stores
+- local directory graphs
+- `s3://` graphs on AWS S3 or S3-compatible object stores

 The server binary and container image expose the same HTTP surface.

@@ -20,18 +20,18 @@ Build or install:
 - `omnigraph`
 - `omnigraph-server`

-Run against a local repo:
+Run against a local graph:

 ```bash
-omnigraph-server ./repo.omni --bind 0.0.0.0:8080
+omnigraph-server ./graph.omni --bind 0.0.0.0:8080
 ```

-Run against an object-store-backed repo:
+Run against an object-store-backed graph:

 ```bash
 OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
 AWS_REGION="us-east-1" \
-omnigraph-server s3://my-bucket/repos/example/releases/2026-04-10-v0.1.0 \
+omnigraph-server s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \
 --bind 0.0.0.0:8080
 ```

@@ -46,7 +46,7 @@ curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/
 The bootstrap:

 - starts a local RustFS-backed object store
-- creates a bucket and S3-backed Omnigraph repo
+- creates a bucket and S3-backed Omnigraph graph
 - loads the checked-in context fixture
 - starts `omnigraph-server` on `127.0.0.1:8080`

@@ -60,8 +60,8 @@ Useful overrides:

 - `WORKDIR=/path/to/state`
 - `BUCKET=omnigraph-local`
-- `PREFIX=repos/context`
-- `RESET_REPO=1` to delete an existing partially initialized repo prefix before recreating it
+- `PREFIX=graphs/context`
+- `RESET_REPO=1` to delete an existing partially initialized graph prefix before recreating it
 - `BIND=127.0.0.1:8080`
 - `RUSTFS_CONTAINER_NAME=omnigraph-rustfs-demo`

@@ -76,7 +76,7 @@ If `aws` is not installed, the script attempts a user-local AWS CLI install via
 running.

 If a previous bootstrap left objects behind under the selected `PREFIX` but did
-not finish initializing the repo, rerun with `RESET_REPO=1` or choose a new
+not finish initializing the graph, rerun with `RESET_REPO=1` or choose a new
 `PREFIX`.

 ## Container Deployment
@@ -87,23 +87,23 @@ Build the image:
 docker build -t omnigraph-server:local .
 ```

-Run against a local repo:
+Run against a local graph:

 ```bash
 docker run --rm -p 8080:8080 \
--v "$PWD/repo.omni:/data/repo.omni" \
+-v "$PWD/graph.omni:/data/graph.omni" \
 omnigraph-server:local \
-/data/repo.omni --bind 0.0.0.0:8080
+/data/graph.omni --bind 0.0.0.0:8080
 ```

-Run against an S3-backed repo:
+Run against an S3-backed graph:

 ```bash
 docker run --rm -p 8080:8080 \
 -e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
 -e AWS_REGION="us-east-1" \
 omnigraph-server:local \
-s3://my-bucket/repos/example/releases/2026-04-10-v0.1.0 \
+s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \
 --bind 0.0.0.0:8080
 ```

@@ -154,7 +154,7 @@ Manager secret whose `SecretString` is a JSON object of
 `{"actor_id": "token", ...}`:

 ```bash
-omnigraph-server-aws s3://my-bucket/repos/example ...
+omnigraph-server-aws s3://my-bucket/graphs/example ...
 # Environment:
 # OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET=arn:aws:secretsmanager:us-east-1:123456789012:secret:omnigraph-tokens-AbCdEf
 ```

@@ -22,7 +22,7 @@ Mark a Vector property with `@embed("source_text_property")`. At ingest, the eng

 ## CLI `omnigraph embed` (offline file pipeline)

-Operates on **JSONL files** (not on a repo). Three modes (mutually exclusive):
+Operates on **JSONL files** (not on a graph). Three modes (mutually exclusive):

 - (default) `fill_missing` — only embed rows whose target field is empty
 - `--reembed-all` — overwrite all

@@ -18,11 +18,11 @@ of MRs, internal recovery mechanics, or contributor-only invariants.
 | Write queries and mutations | [query-language.md](query-language.md) |
 | Use embeddings | [embeddings.md](embeddings.md) |

-## Operate A Repo
+## Operate A Graph

 | Goal | Read |
 |---|---|
-| Understand repo layout and URI support | [storage.md](storage.md) |
+| Understand graph layout and URI support | [storage.md](storage.md) |
 | Work with branches, commits, and snapshots | [branches-commits.md](branches-commits.md) |
 | Coordinate multi-query workflows | [transactions.md](transactions.md) |
 | Read diffs and change feeds | [changes.md](changes.md) |

@@ -1,6 +1,6 @@
 # HTTP Server (`omnigraph-server`)

-Axum 0.8 + tokio + utoipa-generated OpenAPI. Single repo per process; deploy multiple processes for multi-tenant.
+Axum 0.8 + tokio + utoipa-generated OpenAPI. Single graph per process; deploy multiple processes for multi-tenant.

 ## Endpoint inventory

@@ -136,7 +136,7 @@ See [deployment.md](deployment.md) for token-source operational details.

 - `tower_http::TraceLayer::new_for_http()`
 - Policy decisions logged at INFO level with actor, action, branch, decision, matched rule
-- Startup logs: token source name, repo URI, bind address
+- Startup logs: token source name, graph URI, bind address
 - Graceful SIGINT shutdown

 ## Not implemented (by design or "TBD")
@@ -148,4 +148,4 @@ See [deployment.md](deployment.md) for token-source operational details.
 admission control" above). No global rate limiter is configured;
 add `tower_http::limit` if a graph-wide cap is needed.
 - Pagination — none (commits/branches return everything; export streams).
-- Multi-tenant routing — one repo per process.
+- Multi-tenant routing — one graph per process.

@@ -7,7 +7,7 @@ Every node type and every edge type is its own Lance dataset:
 - **Columnar Arrow storage**: each property is a column; nullable per Arrow schema.
 - **Fragments**: data is partitioned into fragments; new writes create new fragments.
 - **Manifest versioning**: every commit produces a new dataset version; old versions remain readable.
-- **Stable row IDs**: `enable_stable_row_ids: true` is set on every Lance dataset OmniGraph creates — node and edge data tables, `__manifest`, `_graph_commits.lance`, `_graph_commit_recoveries.lance`, and any future system tables. This is an architectural invariant: the flag is one-way at dataset create per Lance's row-id-lineage spec, so a future change that introduces a Lance dataset must preserve it. Consequences: `_row_created_at_version` and `_row_last_updated_at_version` are available on every dataset (load-bearing for change-feed validators); `CreateIndex × Rewrite` is not a retryable conflict, so indices survive `omnigraph optimize` without needing the Fragment Reuse Index; readers must use a Lance build that recognises the flag (our pinned 4.0.0 is fine). Pre-0.4.x repos created before this code path settled may have datasets without the flag and cannot be retrofitted in place — the supported path is dump-and-reload. The `stage_overwrite` rewrite path (used by `schema_apply`) preserves the flag through `Operation::Overwrite`; pinned by `stage_overwrite_preserves_stable_row_ids` in `crates/omnigraph/tests/staged_writes.rs`.
+- **Stable row IDs**: `enable_stable_row_ids: true` is set on every Lance dataset OmniGraph creates — node and edge data tables, `__manifest`, `_graph_commits.lance`, `_graph_commit_recoveries.lance`, and any future system tables. This is an architectural invariant: the flag is one-way at dataset create per Lance's row-id-lineage spec, so a future change that introduces a Lance dataset must preserve it. Consequences: `_row_created_at_version` and `_row_last_updated_at_version` are available on every dataset (load-bearing for change-feed validators); `CreateIndex × Rewrite` is not a retryable conflict, so indices survive `omnigraph optimize` without needing the Fragment Reuse Index; readers must use a Lance build that recognises the flag (our pinned 4.0.0 is fine). Pre-0.4.x graphs created before this code path settled may have datasets without the flag and cannot be retrofitted in place — the supported path is dump-and-reload. The `stage_overwrite` rewrite path (used by `schema_apply`) preserves the flag through `Operation::Overwrite`; pinned by `stage_overwrite_preserves_stable_row_ids` in `crates/omnigraph/tests/staged_writes.rs`.
 - **Append / delete / `merge_insert`**: native Lance write modes.
 - **Per-dataset branches** (Lance native): copy-on-write at the dataset level.
 - **Object-store agnostic**: file://, s3://, gs://, az://, http (read-only via Lance) — OmniGraph wires file:// and s3:// (`storage.rs`).
@@ -22,7 +22,7 @@ OmniGraph is **not** a single Lance dataset; it is a *graph* of datasets coordin
 - `edges/{fnv1a64-hex(edge_type_name)}` — one Lance dataset per edge type
 - `__manifest/` — the catalog of all sub-tables and their published versions
 - `_graph_commits.lance` / `_graph_commit_actors.lance` — the commit graph and its actor map
-- (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 repos are inert; the run state machine was removed in MR-771 and these files are cleaned up via MR-770's production sweep)
+- (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 graphs are inert; the run state machine was removed in MR-771 and these files are cleaned up via MR-770's production sweep)
 - **Manifest row schema** (`object_id, object_type, location, metadata, base_objects, table_key, table_version, table_branch, row_count`):
 - `object_type` ∈ `table | table_version | table_tombstone`
 - `table_key` ∈ `node:<TypeName> | edge:<EdgeName>`
@@ -36,7 +36,7 @@ OmniGraph is **not** a single Lance dataset; it is a *graph* of datasets coordin

 The on-disk shape of `__manifest` is reconciled with the binary via a single stamp + dispatcher. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape this binary writes; the on-disk stamp `omnigraph:internal_schema_version` lives in the manifest dataset's schema-level metadata (Lance `update_schema_metadata`).

-- **`init_manifest_repo`** stamps the current version at creation, so newly initialized repos never need migration.
+- **`init_manifest_graph`** stamps the current version at creation, so newly initialized graphs never need migration.
 - **Publisher open-for-write path** (`load_publish_state`) calls `migrate_internal_schema(&mut dataset)` before reading state. When the on-disk stamp matches the binary, this is a single metadata read with no writes; otherwise the dispatcher walks `match`-arm steps forward (1→2, 2→3, …) until the stamp matches, then proceeds with the publish. Reads stay side-effect-free.
 - **Forward-version protection**: a stamp *higher* than the binary's known version triggers a clear "upgrade omnigraph first" error. An old binary cannot clobber a newer schema by silently treating "unknown stamp" as "missing stamp".
 - **Idempotency**: each migration step is safe to re-run. A crash between two metadata updates inside a single step leaves the partial state; the next open re-runs the step and the second update lands. The dispatcher itself is a cheap stamp-read on the steady-state path.
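The stamp + dispatcher behaviour described in the bullets above can be sketched roughly as follows. This is an illustrative model only, not the OmniGraph implementation: `ManifestMeta`, `MigrateError`, the value `3` for the version constant, and the choice to treat a missing stamp as version 1 are all assumptions made for the example. Only the names `migrate_internal_schema` and `INTERNAL_MANIFEST_SCHEMA_VERSION` come from the text, and the signature here takes a plain in-memory struct rather than a Lance dataset.

```rust
/// Hypothetical in-memory stand-in for the manifest dataset's schema-level metadata.
#[derive(Debug, Default)]
struct ManifestMeta {
    /// `None` models a dataset written before the stamp existed (assumption).
    stamp: Option<u32>,
    /// Steps applied so far, kept only so the example is observable.
    applied: Vec<&'static str>,
}

/// Illustrative value; the real constant lives in the OmniGraph source.
const INTERNAL_MANIFEST_SCHEMA_VERSION: u32 = 3;

#[derive(Debug, PartialEq)]
enum MigrateError {
    /// The on-disk stamp is newer than this binary knows about.
    UpgradeBinaryFirst { on_disk: u32, known: u32 },
}

/// Walk the stamp forward one step at a time until it matches the binary.
fn migrate_internal_schema(meta: &mut ManifestMeta) -> Result<(), MigrateError> {
    loop {
        // Assumption: a missing stamp is treated as version 1.
        let current = meta.stamp.unwrap_or(1);
        if current > INTERNAL_MANIFEST_SCHEMA_VERSION {
            // Forward-version protection: refuse instead of guessing.
            return Err(MigrateError::UpgradeBinaryFirst {
                on_disk: current,
                known: INTERNAL_MANIFEST_SCHEMA_VERSION,
            });
        }
        if current == INTERNAL_MANIFEST_SCHEMA_VERSION {
            // Steady state: one stamp read, no writes.
            return Ok(());
        }
        // One `match` arm per on-disk shape change.
        match current {
            1 => meta.applied.push("1->2"),
            2 => meta.applied.push("2->3"),
            other => unreachable!("no step registered for version {}", other),
        }
        // Stamp only after the step's own updates have landed.
        meta.stamp = Some(current + 1);
    }
}
```

The sketch does not model crash recovery inside a step; in the real system each step has to be safe to re-run, which is why the stamp is bumped last.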
@@ -50,14 +50,14 @@ Adding a new on-disk shape change is one constant bump (`INTERNAL_MANIFEST_SCHEM

 ## On-disk layout

-A repo on disk is a directory tree of Lance datasets. Each dataset follows the standard Lance layout (`_versions/`, `data/`, `_indices/`, `_refs/`); OmniGraph adds the multi-dataset coordination by keeping `__manifest/` alongside the per-type datasets.
+A graph on disk is a directory tree of Lance datasets. Each dataset follows the standard Lance layout (`_versions/`, `data/`, `_indices/`, `_refs/`); OmniGraph adds the multi-dataset coordination by keeping `__manifest/` alongside the per-type datasets.

 ```mermaid
 flowchart TB
 classDef l1 fill:#fef3e8,stroke:#c46900,color:#000
 classDef l2 fill:#e8f4fd,stroke:#1e6aa8,color:#000

-repo["repo URI<br/>file:// or s3://bucket/prefix"]:::l2
+graph["graph URI<br/>file:// or s3://bucket/prefix"]:::l2

 manifest["__manifest/<br/>L2 catalog of sub-tables"]:::l2
 nodes["nodes/{fnv1a64-hex}/<br/>one dataset per node type"]:::l2
@@ -66,12 +66,12 @@ flowchart TB
 recovery["__recovery/{ulid}.json<br/>recovery sidecars (transient)"]:::l2
 refs["_refs/branches/{name}.json<br/>graph-level branches"]:::l2

-repo --> manifest
-repo --> nodes
-repo --> edges
-repo --> cgraph
-repo --> recovery
-repo --> refs
+graph --> manifest
+graph --> nodes
+graph --> edges
+graph --> cgraph
+graph --> recovery
+graph --> refs

 subgraph dataset[Inside each Lance dataset — L1]
 ds_v["_versions/{n}.manifest<br/>per-dataset versions"]:::l1
@@ -88,10 +88,10 @@ flowchart TB

 **What's where:**

-- **Repo root** is one directory (or S3 prefix). Everything below is part of one OmniGraph repo.
+- **Graph root** is one directory (or S3 prefix). Everything below is part of one OmniGraph graph.
 - **`__manifest/`** is a Lance dataset whose rows describe which sub-table version is published at which graph-branch. Reading a snapshot starts here.
 - **`nodes/`** and **`edges/`** are sibling directories holding one Lance dataset per declared type. Names are `fnv1a64-hex` of the type name to keep paths fixed-length and case-safe.
-- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 repos also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; MR-770 sweeps these in production.)
+- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 graphs also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; MR-770 sweeps these in production.)
 - **`_graph_commit_recoveries.lance`** — one row per recovery sweep action. Joined to `_graph_commits.lance` by `graph_commit_id`; the linked commit row carries `actor_id=omnigraph:recovery`. Operators correlate recoveries with the original mutations they rolled forward / back via this join. See `crates/omnigraph/src/db/recovery_audit.rs`.
 - **`__recovery/{ulid}.json`** — transient sidecar files written by the four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) before Phase B begins, deleted after Phase C succeeds. A sidecar persisting after process exit means the writer crashed in the Phase B → Phase C window; the next `Omnigraph::open` recovery sweep processes it. Steady-state directory is empty. See `crates/omnigraph/src/db/manifest/recovery.rs`.
 - **`_refs/branches/{name}.json`** is graph-level branch metadata — pointers from a branch name to the manifest version it heads.
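To make the `fnv1a64-hex` naming rule concrete, here is a minimal sketch of 64-bit FNV-1a rendered as fixed-width hex. The hash itself follows the published FNV-1a parameters; hashing the UTF-8 bytes of the type name, using lowercase zero-padded hex, and the helper `node_dataset_path` are assumptions for illustration and are not confirmed by the text above.

```rust
/// 64-bit FNV-1a over the UTF-8 bytes of `input`.
fn fnv1a64(input: &str) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for byte in input.bytes() {
        // FNV-1a: XOR the byte in first, then multiply by the prime.
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Render the hash as 16 lowercase hex characters (assumed encoding).
fn fnv1a64_hex(type_name: &str) -> String {
    format!("{:016x}", fnv1a64(type_name))
}

/// Hypothetical helper: relative dataset path for a node type.
fn node_dataset_path(type_name: &str) -> String {
    format!("nodes/{}", fnv1a64_hex(type_name))
}
```

Because the output is always 16 hex characters, type names that differ only in case (for example `Person` and `person`) map to different fixed-length paths without relying on a case-sensitive file system.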
@@ -48,7 +48,7 @@ query register_employee_with_team($name: String, $age: I32, $team: String) {

 ```bash
 omnigraph change --query ./mutations.gq --name register_employee_with_team \
---params '{"name":"Alice","age":30,"team":"Acme"}' ./repo.omni
+--params '{"name":"Alice","age":30,"team":"Acme"}' ./graph.omni
 ```

 If the second statement fails (e.g. `Acme` doesn't exist), the publisher never publishes; `Alice` is not in the database. Atomic.
@@ -57,10 +57,10 @@ If the second statement fails (e.g. `Acme` doesn't exist), the publisher never p

 ```bash
 # Query 1
-omnigraph change --query ./mutations.gq --name register_employee --params '{"name":"Alice","age":30}' ./repo.omni
+omnigraph change --query ./mutations.gq --name register_employee --params '{"name":"Alice","age":30}' ./graph.omni

 # Query 2 — runs after Query 1 has already published
-omnigraph change --query ./mutations.gq --name link_to_team --params '{"name":"Alice","team":"Acme"}' ./repo.omni
+omnigraph change --query ./mutations.gq --name link_to_team --params '{"name":"Alice","team":"Acme"}' ./graph.omni
 ```

 These are **two publishes** on `main`. If Query 2 fails, Query 1's effects are already visible. There is no `ROLLBACK` for Query 1.
@@ -75,32 +75,32 @@ The pattern when you need to run multiple queries — possibly across multiple c

 ```bash
 # Fork a working branch from main.
-omnigraph branch create --from main onboarding/2026-04-25 ./repo.omni
+omnigraph branch create --from main onboarding/2026-04-25 ./graph.omni

 # Run any number of mutations on the branch — each one is its own publish on the branch.
 # Concurrent reads of `main` are unaffected.
 omnigraph change --branch onboarding/2026-04-25 \
 --query ./mutations.gq --name register_employee \
---params '{"name":"Alice","age":30}' ./repo.omni
+--params '{"name":"Alice","age":30}' ./graph.omni

 omnigraph change --branch onboarding/2026-04-25 \
 --query ./mutations.gq --name register_employee \
---params '{"name":"Bob","age":25}' ./repo.omni
+--params '{"name":"Bob","age":25}' ./graph.omni

 omnigraph change --branch onboarding/2026-04-25 \
 --query ./mutations.gq --name link_to_team \
---params '{"name":"Alice","team":"Acme"}' ./repo.omni
+--params '{"name":"Alice","team":"Acme"}' ./graph.omni

 # Inspect the branch — read queries work just like on main.
 omnigraph read --branch onboarding/2026-04-25 \
---query ./queries.gq --name list_employees ./repo.omni
+--query ./queries.gq --name list_employees ./graph.omni

 # Happy with what's on the branch? Merge it. This is one atomic publish:
 # `main` flips to include every commit on the branch.
-omnigraph branch merge onboarding/2026-04-25 --into main ./repo.omni
+omnigraph branch merge onboarding/2026-04-25 --into main ./graph.omni

 # OR: not happy? Throw it away. `main` is untouched.
-# omnigraph branch delete onboarding/2026-04-25 ./repo.omni
+# omnigraph branch delete onboarding/2026-04-25 ./graph.omni
 ```

 Properties:
@@ -115,16 +115,16 @@ Two agents writing to the same graph independently:

 ```bash
 # Agent A
-omnigraph branch create --from main agent-a/work ./repo.omni
-omnigraph change --branch agent-a/work … ./repo.omni
+omnigraph branch create --from main agent-a/work ./graph.omni
+omnigraph change --branch agent-a/work … ./graph.omni
 # … many mutations …
-omnigraph branch merge agent-a/work --into main ./repo.omni
+omnigraph branch merge agent-a/work --into main ./graph.omni

 # Agent B (running concurrently)
-omnigraph branch create --from main agent-b/work ./repo.omni
-omnigraph change --branch agent-b/work … ./repo.omni
+omnigraph branch create --from main agent-b/work ./graph.omni
+omnigraph change --branch agent-b/work … ./graph.omni
 # … many mutations …
-omnigraph branch merge agent-b/work --into main ./repo.omni
+omnigraph branch merge agent-b/work --into main ./graph.omni
 ```

 Each agent sees a consistent snapshot of `main` at the time it forked. The first merge to `main` lands as a fast-forward (or a no-op if no concurrent change). The second merge runs three-way: rows touched by both branches surface as `MergeConflict`s for the caller to resolve.
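The three-way rule in the paragraph above can be illustrated with a small row-level sketch. It is a toy model under stated assumptions, not OmniGraph's merge code: rows are plain `String` key/value pairs, the `MergeConflict` struct here is a hypothetical stand-in that carries only a row id, and two branches that make the identical change to a row are treated as agreeing rather than conflicting, which the source does not specify.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy table: row id -> value. A missing key means the row does not exist.
type Rows = BTreeMap<String, String>;

/// Hypothetical stand-in for the conflict type named in the text.
#[derive(Debug, PartialEq)]
struct MergeConflict {
    row_id: String,
}

/// Merge `source` into `target`, using `base` (the fork point) as the common ancestor.
fn three_way_merge(
    base: &Rows,
    target: &Rows,
    source: &Rows,
) -> Result<Rows, Vec<MergeConflict>> {
    // Every row id that appears on any of the three sides.
    let ids: BTreeSet<&String> = base
        .keys()
        .chain(target.keys())
        .chain(source.keys())
        .collect();

    let mut merged = Rows::new();
    let mut conflicts = Vec::new();

    for id in ids {
        let (b, t, s) = (base.get(id), target.get(id), source.get(id));
        let winner = if t == s {
            t // both sides agree, including "both deleted"
        } else if t == b {
            s // only the source branch touched this row
        } else if s == b {
            t // only the target branch touched this row
        } else {
            // Both sides changed the row in different ways.
            conflicts.push(MergeConflict { row_id: id.clone() });
            continue;
        };
        if let Some(value) = winner {
            merged.insert(id.clone(), value.clone());
        }
    }

    if conflicts.is_empty() {
        Ok(merged)
    } else {
        Err(conflicts)
    }
}
```

In this model a fast-forward is simply the case where `target` equals `base`, so every differing row resolves to the source side and no conflicts are reported.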
@@ -138,7 +138,7 @@ This is the workflow MR-797 / agentic loops are designed around: **branches are
 | Single query fails mid-flight | Publisher never publishes; target unchanged | Read the error, decide whether to retry |
 | Concurrent writers race the same `(table, branch)` | Publisher CAS rejects the loser with `ManifestConflictDetails::ExpectedVersionMismatch` | Refresh handle, retry the query |
 | Branch with N successful mutations, then merge fails (three-way conflict) | Each individual mutation already committed on the branch; merge surfaces `MergeConflicts` | Inspect, decide whether to keep working on the branch, abandon it (`branch_delete`), or resolve and re-merge |
-| Process crashes mid-branch-workflow | Each completed mutation on the branch is durable | Re-open the repo, continue where you left off |
+| Process crashes mid-branch-workflow | Each completed mutation on the branch is durable | Re-open the graph, continue where you left off |
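The "refresh handle, retry the query" recovery for a lost CAS race can be sketched with an in-process atomic standing in for the published version of one `(table, branch)`. Everything here is a hypothetical model: `Head`, `publish`, `publish_with_retry`, and the local `ExpectedVersionMismatch` struct are illustration names, and a real retry would also re-run the query against the refreshed snapshot rather than only re-reading the version.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical error mirroring the idea of an expected-version mismatch.
#[derive(Debug, PartialEq)]
struct ExpectedVersionMismatch {
    expected: u64,
    actual: u64,
}

/// Hypothetical stand-in for the published head of one `(table, branch)`.
struct Head {
    version: AtomicU64,
}

impl Head {
    fn current(&self) -> u64 {
        self.version.load(Ordering::Acquire)
    }

    /// Publish succeeds only if the head is still at `expected` (compare-and-swap).
    fn publish(&self, expected: u64) -> Result<u64, ExpectedVersionMismatch> {
        self.version
            .compare_exchange(expected, expected + 1, Ordering::AcqRel, Ordering::Acquire)
            .map(|_| expected + 1)
            .map_err(|actual| ExpectedVersionMismatch { expected, actual })
    }
}

/// Optimistic loop: on a mismatch, refresh the expected version and try again.
fn publish_with_retry(head: &Head, max_attempts: u32) -> Result<u64, ExpectedVersionMismatch> {
    let mut expected = head.current();
    let mut attempt = 1;
    loop {
        match head.publish(expected) {
            Ok(new_version) => return Ok(new_version),
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(err) => {
                // The loser of the race refreshes its view and retries.
                expected = err.actual;
                attempt += 1;
            }
        }
    }
}
```

Bounding the retries keeps a writer from spinning forever under sustained contention; the right bound depends on the workload.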

 ## When to use what

@@ -156,7 +156,7 @@ This is the workflow MR-797 / agentic loops are designed around: **branches are

 - **Cross-query atomicity on `main` without a branch.** If you don't want to fork a branch, multiple queries on `main` publish independently. There is no implicit transaction.
 - **Long-running interactive transactions.** No `BEGIN` over a connection. Branches are the durable equivalent.
-- **Cross-graph (cross-repo) transactions.** Each repo is its own atomicity domain.
+- **Cross-graph transactions.** Each graph is its own atomicity domain.
 - **"Pessimistic" locks** that serialize writers before they reach the storage layer. Snapshot-MVCC + publisher CAS handles concurrency optimistically; the loser retries.

 ## See also