mirror of https://github.com/ModernRelay/omnigraph.git
synced 2026-06-09 01:35:18 +02:00
docs: split user and developer docs (#93)
parent e8d49559c4
commit 60eee78465
39 changed files with 499 additions and 445 deletions
docs/user/audit.md
@@ -0,0 +1,7 @@
# Audit / Actor tracking

- `Omnigraph::audit_actor_id: Option<String>` is the actor in effect.
- `_as` variants of every write API let callers override the actor: `mutate_as`, `ingest_as`, `branch_merge_as`, `apply_schema_as`, etc.
- Actor IDs are persisted on `GraphCommit.actor_id` with split storage in `_graph_commit_actors.lance` (the commit graph is split into `_graph_commits.lance` for the linkage and `_graph_commit_actors.lance` for the actor map).
- HTTP server uses the bearer-token actor automatically; CLI uses the local user / explicit env (no implicit actor).
- Pre-v0.4.0 repos also stored actor IDs on `RunRecord.actor_id` in `_graph_runs.lance` / `_graph_run_actors.lance`. The Run state machine was removed in MR-771; those files are inert post-v0.4.0 and reclaimed by MR-770's production sweep.
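
The override rule described in the bullets above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `effective_actor` and its parameters are hypothetical names, not part of the OmniGraph API. Only `audit_actor_id` and the `_as` naming come from the text.

```rust
/// Hypothetical helper: choose the actor that would be recorded on a commit.
/// `default_actor` stands in for `Omnigraph::audit_actor_id`;
/// `override_actor` stands in for the actor passed to an `_as` write variant.
fn effective_actor<'a>(
    default_actor: Option<&'a str>,
    override_actor: Option<&'a str>,
) -> Option<&'a str> {
    // An explicit per-call actor wins; otherwise fall back to the handle's actor.
    override_actor.or(default_actor)
}
```

With no default and no override the result is `None`, which matches the `Option<String>` type of `GraphCommit.actor_id`.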
docs/user/branches-commits.md
@@ -0,0 +1,63 @@
# Branches, Commits, Snapshots

## L1 — Lance per-dataset branches

Lance supports branching at the dataset level: a branch is a named lineage of versions, and `fork_branch_from_state(source_branch, target_branch, source_version)` creates a copy-on-write fork.

## L2 — Graph-level branches

OmniGraph builds *graph branches* on top by branching every sub-table coherently:

- `branch_create(name)` / `branch_create_from(target, name)` — rejects the name `main`; fails if the branch already exists; checks that the schema-apply lock is idle.
- `branch_list()` — returns public branches, **filters internal** `__run__…` and `__schema_apply_lock__` prefixes.
- `branch_delete(name)` — refuses if there are descendants or active runs on the branch; cleans up owned per-branch fragments.
- **Lazy forking**: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share fragments with their source.
- `sync_branch(branch)` — re-binds the in-memory handle to the latest head of the branch.
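
Lazy forking can be pictured with a toy model that only tracks which lineage owns each sub-table. The sketch below is an assumption-laden illustration: `ToyBranch`, `create_from`, `mutate`, and `shares_with_source` are hypothetical names, and no Lance data is involved.

```rust
use std::collections::HashMap;

/// Toy model of a graph branch (illustrative only, not an OmniGraph type).
#[derive(Debug)]
struct ToyBranch {
    name: String,
    source: String,
    /// table_key -> name of the branch whose lineage currently holds the data
    owners: HashMap<String, String>,
}

impl ToyBranch {
    /// Creating a branch copies nothing: every sub-table still points at the source.
    fn create_from(source: &str, name: &str, table_keys: &[&str]) -> Self {
        let owners = table_keys
            .iter()
            .map(|k| (k.to_string(), source.to_string()))
            .collect();
        ToyBranch { name: name.to_string(), source: source.to_string(), owners }
    }

    /// The first write to a sub-table forks it; later writes reuse the fork.
    /// Returns true only when this call performed the fork
    /// (unknown table keys are ignored and return false).
    fn mutate(&mut self, table_key: &str) -> bool {
        match self.owners.get_mut(table_key) {
            Some(owner) if *owner != self.name => {
                *owner = self.name.clone();
                true
            }
            _ => false,
        }
    }

    /// True while the sub-table is still shared with the source branch.
    fn shares_with_source(&self, table_key: &str) -> bool {
        self.owners.get(table_key).map_or(false, |o| *o == self.source)
    }
}
```

A branch that only reads never calls `mutate`, so every sub-table keeps reporting `shares_with_source == true`.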

## L2 — Commit graph (`db/commit_graph.rs`)

In-memory shape of a graph commit:

```
GraphCommit {
  graph_commit_id: ULID,
  manifest_branch: Option<String>,
  manifest_version: u64,
  parent_commit_id: Option<String>,
  merged_parent_commit_id: Option<String>,   // populated for merge commits
  actor_id: Option<String>,                  // joined in-memory from _graph_commit_actors.lance, NOT a column on _graph_commits.lance
  created_at: i64 (microseconds since epoch),
}
```

Storage is split across two Lance datasets (both with stable row IDs):

- `_graph_commits.lance` — every column above *except* `actor_id`.
- `_graph_commit_actors.lance` — optional separate `(graph_commit_id, actor_id)` map, created on demand. The `actor_id` field above is populated by joining this dataset in-memory at load time.

Notes:

- Every successful publish (load / change / merge / schema_apply) appends one commit.
- Merge commits have two parents; linear commits have one.
- API: `list_commits(branch)`, `get_commit(id)`, `head_commit_id_for_branch(branch)`.
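
The in-memory join between the two datasets can be sketched as a map lookup. The types below are simplified stand-ins chosen for this example (`CommitRow`, `JoinedCommit`, and `join_actors` are hypothetical names); the real loader reads Lance datasets rather than slices and maps.

```rust
use std::collections::HashMap;

/// Simplified row as it might come from `_graph_commits.lance` (no actor column).
#[derive(Debug, Clone)]
struct CommitRow {
    graph_commit_id: String,
    parent_commit_id: Option<String>,
    merged_parent_commit_id: Option<String>,
}

/// Simplified in-memory commit after the join.
#[derive(Debug, Clone, PartialEq)]
struct JoinedCommit {
    graph_commit_id: String,
    actor_id: Option<String>,
    is_merge: bool,
}

/// `actors` models the optional `(graph_commit_id, actor_id)` map.
/// Commits without an entry simply get `actor_id: None`.
fn join_actors(rows: &[CommitRow], actors: &HashMap<String, String>) -> Vec<JoinedCommit> {
    rows.iter()
        .map(|row| JoinedCommit {
            graph_commit_id: row.graph_commit_id.clone(),
            actor_id: actors.get(&row.graph_commit_id).cloned(),
            // A merge commit carries both a parent and a merged parent.
            is_merge: row.parent_commit_id.is_some() && row.merged_parent_commit_id.is_some(),
        })
        .collect()
}
```

Because the actor map is optional, passing an empty `HashMap` models a repo where `_graph_commit_actors.lance` was never created.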

## L2 — Snapshots & time travel

- `snapshot()` — current snapshot for the bound branch; cached.
- `snapshot_of(target)` — snapshot at a `ReadTarget` (branch | snapshot id).
- `snapshot_at_version(v: u64)` — historical snapshot from any manifest version.
- `entity_at(table_key, id, version)` — single-entity time travel without building a full snapshot.
- A `Snapshot` is a `(version, HashMap<table_key, SubTableEntry>)` — cheap to build, snapshot-isolated cross-table reads.

## L2 — Internal system branches

Filtered from `branch_list()` but visible to internals:

- `__schema_apply_lock__` — serializes schema migrations.
- `__run__<run-id>` — legacy from the pre-v0.4.0 Run state machine (removed in MR-771). The branch-name guard predicate `is_internal_run_branch` is kept as defense-in-depth so users cannot create a branch matching the legacy prefix; the filter will be removed once production legacy branches are swept (MR-770).

## L2 — Recovery audit trail

The four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) protect their multi-table commits with a sidecar at `__recovery/{ulid}.json` written before Phase B and deleted after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the recovery sweep in `crates/omnigraph/src/db/manifest/recovery.rs`: classify per-table state, decide all-or-nothing per sidecar, roll forward / back, record an audit row.

Audit rows live in `_graph_commit_recoveries.lance` (sibling to `_graph_commits.lance`) and reference the commit graph by `graph_commit_id`. The linked recovery commit is identified by that same `graph_commit_id`, and `actor_id="omnigraph:recovery"` is stored in `_graph_commit_actors.lance` (joined by `graph_commit_id`) — `_graph_commits.lance` itself does not carry the `actor_id` column. To find recoveries for a specific original actor: `omnigraph commit list --filter actor=omnigraph:recovery`, then join to `_graph_commit_recoveries.lance` by `graph_commit_id` to read `recovery_for_actor`. Schema: see `crates/omnigraph/src/db/recovery_audit.rs`.
docs/user/changes.md
@@ -0,0 +1,24 @@
# Change Detection / Diff

`changes/mod.rs`. Three-level algorithm:

1. **Manifest diff**: skip sub-tables whose `(table_version, table_branch)` is unchanged.
2. **Lineage check**:
   - Same branch lineage → fast path: use the per-row `_row_last_updated_at_version` column to classify Insert/Update/Delete.
   - Different lineages → ID-based streaming comparison.
3. **Row-level diff**: streaming, no full materialization.
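
Step 1 of the algorithm above, the manifest diff, amounts to comparing per-table state and keeping only the tables that differ. The sketch below assumes a manifest can be modeled as a sorted map from table key to `(table_version, table_branch)`; `TableState` and `changed_tables` are hypothetical names used only for illustration.

```rust
use std::collections::BTreeMap;

/// `(table_version, table_branch)` as recorded per sub-table in a manifest.
type TableState = (u64, Option<String>);

/// Returns the table keys that still need a row-level diff, in sorted order.
/// Tables present on only one side count as changed.
fn changed_tables(
    from: &BTreeMap<String, TableState>,
    to: &BTreeMap<String, TableState>,
) -> Vec<String> {
    // Union of keys from both manifests, deduplicated.
    let mut keys: Vec<&String> = from.keys().chain(to.keys()).collect();
    keys.sort();
    keys.dedup();
    keys.into_iter()
        // Unchanged (version, branch) pairs are skipped without touching any rows.
        .filter(|k| from.get(*k) != to.get(*k))
        .cloned()
        .collect()
}
```

Only the keys returned here would proceed to the lineage check and the streaming row-level diff, which is what keeps diffs between nearby versions cheap.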

## Public API

- `diff_between(from: ReadTarget, to: ReadTarget, filter: Option<ChangeFilter>) -> ChangeSet`
- `diff_commits(from_commit_id, to_commit_id, filter)` — cross-branch safe.

## Types

```
ChangeOp:     Insert | Update | Delete
EntityKind:   Node | Edge
EntityChange { table_key, kind, type_name, id, op, manifest_version, endpoints?: {src, dst} }
ChangeFilter { kinds?, type_names?, ops? }
ChangeSet    { from_version, to_version, branch?, changes[], stats }
```
docs/user/cli-reference.md
@@ -0,0 +1,83 @@
# CLI Reference (`omnigraph`)

A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](cli.md).

17 top-level command families, 40+ subcommands. All commands accept a positional `URI`, `--uri`, or a `--target <name>` resolved against `omnigraph.yaml`.

## Top-level commands

| Command | Purpose |
|---|---|
| `init` | `--schema <pg>` → initialize a repo (also scaffolds `omnigraph.yaml` if missing) |
| `load` | bulk load a branch (`--mode overwrite\|append\|merge`) |
| `ingest` | branch-creating transactional load (`--from <base>`) |
| `read` | run named query (params via `--params`, `--params-file`, or alias args) |
| `change` | run mutation query |
| `snapshot` | print current snapshot (per-table version + row count) |
| `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) |
| `branch create \| list \| delete \| merge` | branching ops |
| `commit list \| show` | inspect commit graph |
| `run list \| show \| publish \| abort` | transactional run ops |
| `schema plan \| apply \| show (alias: get)` | migrations |
| `query lint \| check` | offline / repo-backed validation |
| `optimize` | non-destructive Lance compaction |
| `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
| `embed` | offline JSONL embedding pipeline |
| `policy validate \| test \| explain` | Cedar tooling |
| `version` / `-v` | print `omnigraph 0.3.x` |

## `omnigraph.yaml` schema

```yaml
project: { name }
graphs:
  <name>:
    uri: <local|s3://|http(s)://>
    bearer_token_env: <ENV_NAME>
server:
  graph: <name>
  bind: <ip:port>
cli:
  graph: <name>
  branch: <name>
  output_format: json|jsonl|csv|kv|table
  table_max_column_width: 80
  table_cell_layout: truncate|wrap
query:
  roots: [<dir>, …]      # search path for .gq files
auth:
  env_file: ./.env.omni
aliases:
  <alias>:
    command: read|change
    query: <path-to-.gq>
    name: <query-name>
    args: [<positional-name>, …]
    graph: <name>
    branch: <name>
    format: <output-format>
policy:
  file: ./policy.yaml
```

## Output formats (read command)

- `json` — pretty-printed object with metadata + rows
- `jsonl` — one metadata line then one JSON object per row
- `csv` — RFC 4180-ish quoting
- `table` — fitted text table, honors `table_max_column_width` + `table_cell_layout`
- `kv` — grouped per-row key/value blocks

## Param resolution

Precedence (high to low): explicit `--params` / `--params-file`, alias positional args, `omnigraph.yaml` defaults. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients.
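
The JS-safe-integer check mentioned above can be illustrated as follows. The two names `is_js_safe_integer_i64` and `JS_MAX_SAFE_INTEGER_U64` come from the text, but the bodies here are an assumption based on the usual definition of JavaScript's `Number.MAX_SAFE_INTEGER` (2^53 - 1). `encode_id` is a hypothetical helper, and its string fallback is one possible convention rather than the documented wire format.

```rust
/// 2^53 - 1, the value of JavaScript's Number.MAX_SAFE_INTEGER.
const JS_MAX_SAFE_INTEGER_U64: u64 = (1u64 << 53) - 1;

/// True when `v` survives a round trip through an IEEE-754 double unchanged
/// under the JavaScript "safe integer" definition.
fn is_js_safe_integer_i64(v: i64) -> bool {
    v.unsigned_abs() <= JS_MAX_SAFE_INTEGER_U64
}

/// Hypothetical encoder: safe values stay JSON numbers,
/// anything larger is emitted as a JSON string so no digits are lost.
fn encode_id(v: i64) -> String {
    if is_js_safe_integer_i64(v) {
        v.to_string()
    } else {
        format!("\"{}\"", v)
    }
}
```

Using `unsigned_abs` avoids the overflow that `abs` would hit on `i64::MIN`.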

## Bearer token resolution (CLI)

1. `graphs.<name>.bearer_token_env`
2. `OMNIGRAPH_BEARER_TOKEN` global env
3. `auth.env_file` referenced `.env`
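
The three-step lookup above can be sketched as a chain of fallbacks. This is a hedged illustration: `resolve_bearer_token` is a hypothetical function, the process environment and the parsed env file are modeled as plain maps to keep the example testable, and the assumption that the env file is searched for the same variable names is mine, not something the text states.

```rust
use std::collections::HashMap;

const GLOBAL_TOKEN_ENV: &str = "OMNIGRAPH_BEARER_TOKEN";

/// `bearer_token_env` is the variable name from `graphs.<name>.bearer_token_env`.
/// `env` models the process environment, `env_file` the parsed `auth.env_file`.
fn resolve_bearer_token(
    bearer_token_env: Option<&str>,
    env: &HashMap<String, String>,
    env_file: &HashMap<String, String>,
) -> Option<String> {
    bearer_token_env
        // 1. The per-graph variable, if one is configured and set.
        .and_then(|name| env.get(name))
        // 2. The global variable.
        .or_else(|| env.get(GLOBAL_TOKEN_ENV))
        // 3. The env file (assumption: same names, per-graph first).
        .or_else(|| bearer_token_env.and_then(|name| env_file.get(name)))
        .or_else(|| env_file.get(GLOBAL_TOKEN_ENV))
        .cloned()
}
```

Returning `None` leaves it to the caller to decide whether an unauthenticated request is acceptable for the target.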

## Duration parsing (cleanup)

`s | m | h | d | w` units, e.g. `--older-than 7d`.
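
A parser for this duration syntax could look like the sketch below. It is not the CLI's actual parser: `parse_duration` is a hypothetical name, and the example assumes a single integer followed by one unit letter, with `m` meaning minutes and `w` meaning seven days.

```rust
use std::time::Duration;

/// Parse strings such as "7d" or "90m"; returns None for anything else.
fn parse_duration(input: &str) -> Option<Duration> {
    let input = input.trim();
    // The last character is the unit, everything before it is the count.
    let unit = input.chars().last()?;
    let digits = &input[..input.len() - unit.len_utf8()];
    let value: u64 = digits.parse().ok()?;
    let seconds_per_unit = match unit {
        's' => 1,
        'm' => 60,
        'h' => 3_600,
        'd' => 86_400,
        'w' => 604_800,
        _ => return None,
    };
    // checked_mul guards against overflow on absurdly large counts.
    value.checked_mul(seconds_per_unit).map(Duration::from_secs)
}
```

Under these assumptions `--older-than 7d` and `--older-than 1w` would resolve to the same cutoff.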
docs/user/cli.md
@@ -0,0 +1,100 @@
# CLI Guide

## Core Repo Flow

```bash
omnigraph init --schema ./schema.pg ./repo.omni
omnigraph load --data ./data.jsonl --mode overwrite ./repo.omni
omnigraph snapshot ./repo.omni --branch main --json
omnigraph read --uri ./repo.omni --query ./queries.gq --name get_person --params '{"name":"Alice"}'
omnigraph change --uri ./repo.omni --query ./queries.gq --name insert_person --params '{"name":"Mina","age":28}'
```

## Branching And Reviewable Data Flows

```bash
omnigraph branch create --uri ./repo.omni --from main feature-x
omnigraph branch list --uri ./repo.omni
omnigraph branch merge --uri ./repo.omni feature-x --into main

omnigraph ingest --data ./batch.jsonl --branch review/import-2026-04-09 ./repo.omni
omnigraph export ./repo.omni --branch main --type Person > people.jsonl
omnigraph commit list ./repo.omni --branch main --json
omnigraph commit show --uri ./repo.omni <commit-id> --json
```

## Remote Server Mode

Serve a repo:

```bash
omnigraph-server ./repo.omni --bind 127.0.0.1:8080
```

Read through the HTTP API:

```bash
omnigraph read \
  --target http://127.0.0.1:8080 \
  --query ./queries.gq \
  --name get_person \
  --params '{"name":"Alice"}'
```

If the server requires auth, set `OMNIGRAPH_SERVER_BEARER_TOKEN` on the server
and configure the matching `bearer_token_env` in `omnigraph.yaml`.

## Runs, Policy, And Diagnostics

```bash
omnigraph query lint --query ./queries.gq --schema ./schema.pg --json
omnigraph query check --query ./queries.gq ./repo.omni --json

omnigraph schema plan --schema ./next.pg ./repo.omni --json
omnigraph schema apply --schema ./next.pg ./repo.omni --json
omnigraph policy validate --config ./omnigraph.yaml
omnigraph policy test --config ./omnigraph.yaml
omnigraph policy explain --config ./omnigraph.yaml --actor act-alice --action read --branch main

omnigraph commit list ./repo.omni --json
omnigraph commit show --uri ./repo.omni <commit-id> --json
```

(The legacy `omnigraph run list/show/publish/abort` subcommands were removed in MR-771; mutations and loads publish atomically and the commit graph (`omnigraph commit list`) is the audit surface.)

`query lint` and `query check` are the same command surface. In v1, repo-backed
lint uses local or `s3://` repo URIs; HTTP targets are only supported when you
also pass `--schema`.

## Config

`omnigraph.yaml` lets the CLI and server share named graphs, defaults, and
query roots:

```yaml
graphs:
  local:
    uri: ./demo.omni
  dev:
    uri: http://127.0.0.1:8080
    bearer_token_env: OMNIGRAPH_BEARER_TOKEN

cli:
  graph: local
  branch: main

query:
  roots:
    - queries
    - .
```

The config file can also define:

- server bind defaults
- auth env files
- query aliases for common read and change commands
- `policy.file` for Cedar authorization rules

When policy is enabled, `schema apply` is authorized through the
`schema_apply` action and is typically limited to admins on protected `main`.
docs/user/constants.md
@@ -0,0 +1,22 @@
# Constants & Tunables (cheat sheet)

| Name | Value | Where |
|---|---|---|
| `MANIFEST_DIR` | `__manifest` | `db/manifest/layout.rs` |
| Commit graph dir | `_graph_commits.lance` | `db/commit_graph.rs` |
| Run registry dir (legacy, removed MR-771) | `_graph_runs.lance` | inert post-v0.4.0; reclaimed by MR-770 |
| Run branch prefix (legacy, removed MR-771) | `__run__` | filtered by `is_internal_run_branch` defense-in-depth |
| Schema apply lock | `__schema_apply_lock__` | `db/mod.rs` |
| Manifest publisher retry budget | `PUBLISHER_RETRY_BUDGET = 5` | `db/manifest/publisher.rs` |
| Internal manifest schema version | `INTERNAL_MANIFEST_SCHEMA_VERSION = 2` | `db/manifest/migrations.rs` |
| Merge stage batch | `MERGE_STAGE_BATCH_ROWS = 8192` | `exec/merge.rs` |
| Maintenance concurrency | `OMNIGRAPH_MAINTENANCE_CONCURRENCY=8` | `db/omnigraph/optimize.rs` |
| Graph index cache size | `8` (LRU) | `runtime_cache.rs` |
| Default body limit | `1 MB` | `omnigraph-server/lib.rs` |
| Ingest body limit | `32 MB` | `omnigraph-server/lib.rs` |
| Engine embed model | `gemini-embedding-2-preview` | `omnigraph/embedding.rs` |
| Compiler embed model | `text-embedding-3-small` | `omnigraph-compiler/embedding.rs` |
| Embed timeout | `30 000 ms` | both clients |
| Embed retries | `4` | both clients |
| Embed retry backoff | `200 ms` | both clients |
| LANCE memory pool default | `1 GB` (raised in v0.3.0) | runtime |
docs/user/deployment.md
@@ -0,0 +1,184 @@
# Deployment

This doc describes the public runtime contract for self-hosting Omnigraph. It
does not include environment-specific secrets, private infrastructure, or
internal deploy automation.

## Runtime Modes

Omnigraph supports two broad deployment shapes:

- local directory repos
- `s3://` repos on AWS S3 or S3-compatible object stores

The server binary and container image expose the same HTTP surface.

## Binary Deployment

Build or install:

- `omnigraph`
- `omnigraph-server`

Run against a local repo:

```bash
omnigraph-server ./repo.omni --bind 0.0.0.0:8080
```

Run against an object-store-backed repo:

```bash
OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
AWS_REGION="us-east-1" \
omnigraph-server s3://my-bucket/repos/example/releases/2026-04-10-v0.1.0 \
  --bind 0.0.0.0:8080
```

## One-Command Local RustFS Bootstrap

The easiest local S3-backed deployment path is:

```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/local-rustfs-bootstrap.sh | bash
```

The bootstrap:

- starts a local RustFS-backed object store
- creates a bucket and S3-backed Omnigraph repo
- loads the checked-in context fixture
- starts `omnigraph-server` on `127.0.0.1:8080`

Supported behavior:

- downloads the rolling `edge` binary when one exists for the current platform
- otherwise clones `ModernRelay/omnigraph` and builds from source
- reuses an existing RustFS container if it is already running

Useful overrides:

- `WORKDIR=/path/to/state`
- `BUCKET=omnigraph-local`
- `PREFIX=repos/context`
- `RESET_REPO=1` to delete an existing partially initialized repo prefix before recreating it
- `BIND=127.0.0.1:8080`
- `RUSTFS_CONTAINER_NAME=omnigraph-rustfs-demo`

The bootstrap expects:

- Docker
- `curl`
- either a matching release asset or a local Rust toolchain plus `git`

If `aws` is not installed, the script attempts a user-local AWS CLI install via
`python3 -m pip`. Docker Desktop or another Docker daemon must already be
running.

If a previous bootstrap left objects behind under the selected `PREFIX` but did
not finish initializing the repo, rerun with `RESET_REPO=1` or choose a new
`PREFIX`.

## Container Deployment

Build the image:

```bash
docker build -t omnigraph-server:local .
```

Run against a local repo:

```bash
docker run --rm -p 8080:8080 \
  -v "$PWD/repo.omni:/data/repo.omni" \
  omnigraph-server:local \
  /data/repo.omni --bind 0.0.0.0:8080
```

Run against an S3-backed repo:

```bash
docker run --rm -p 8080:8080 \
  -e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
  -e AWS_REGION="us-east-1" \
  omnigraph-server:local \
  s3://my-bucket/repos/example/releases/2026-04-10-v0.1.0 \
  --bind 0.0.0.0:8080
```

## Auth

The server can run unauthenticated for local development, but any shared or
internet-facing deployment should set a bearer token source.

### Token sources

The server reads bearer tokens from one of three places, in precedence order:

1. **AWS Secrets Manager** (build with `--features aws`, see below) — set
   `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` to the secret ID or ARN.
2. **JSON file or env** — set one of:
   - `OMNIGRAPH_SERVER_BEARER_TOKENS_FILE` — path to a JSON `{"actor": "token", ...}` file.
   - `OMNIGRAPH_SERVER_BEARER_TOKENS_JSON` — the JSON literal inline.
3. **Single-token env** — `OMNIGRAPH_SERVER_BEARER_TOKEN` (assigns the
   implicit actor `default`).
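
The precedence order above can be sketched as a sequence of environment lookups. This is an illustration only: `TokenSource` and `select_token_source` are hypothetical names, the lookup is injected as a closure so the example does not touch the real environment, and checking the file variable before the inline JSON variable within step 2 is an assumption (the text only says to set one of them).

```rust
/// Which bearer-token source would be selected (illustrative type).
#[derive(Debug, PartialEq)]
enum TokenSource {
    AwsSecret(String),
    File(String),
    InlineJson(String),
    SingleToken(String),
    Unauthenticated,
}

/// `get_env` returns the value of an environment variable, if set.
fn select_token_source(get_env: impl Fn(&str) -> Option<String>) -> TokenSource {
    // 1. AWS Secrets Manager takes precedence when configured.
    if let Some(v) = get_env("OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET") {
        return TokenSource::AwsSecret(v);
    }
    // 2. JSON actor-to-token map, from a file or inline (order is assumed).
    if let Some(v) = get_env("OMNIGRAPH_SERVER_BEARER_TOKENS_FILE") {
        return TokenSource::File(v);
    }
    if let Some(v) = get_env("OMNIGRAPH_SERVER_BEARER_TOKENS_JSON") {
        return TokenSource::InlineJson(v);
    }
    // 3. Single token, mapped to the implicit actor `default`.
    if let Some(v) = get_env("OMNIGRAPH_SERVER_BEARER_TOKEN") {
        return TokenSource::SingleToken(v);
    }
    TokenSource::Unauthenticated
}
```

In a real process the closure could be `|name| std::env::var(name).ok()`; a test can pass a closure over a `HashMap` instead.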

Tokens are hashed with SHA-256 immediately on ingest; plaintext does not
persist in process memory after startup.

The health endpoint `/healthz` remains suitable for load balancer health checks
and is never gated.

## Build Variants

The server binary ships in two flavors:

| Variant | Command | Contents |
|---------|---------|----------|
| **Default** (on-prem / local dev) | `cargo build --release` | Core server, no AWS SDK |
| **AWS** | `cargo build --release --features aws` | Adds AWS Secrets Manager backend for bearer tokens |

Release artifacts are published with matching suffixes —
`omnigraph-server-<version>-<platform>.tar.gz` for the default build and
`omnigraph-server-<version>-<platform>-aws.tar.gz` for the AWS-enabled build.

The AWS build adds ~150 transitive deps and ~30-60s of first-build compile
time. Default builds don't pay that cost.

## AWS Secrets Manager

When the binary is built with `--features aws`, set
`OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` to the ARN or name of a Secrets
Manager secret whose `SecretString` is a JSON object of
`{"actor_id": "token", ...}`:

```bash
omnigraph-server-aws s3://my-bucket/repos/example ...
# Environment:
#   OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET=arn:aws:secretsmanager:us-east-1:123456789012:secret:omnigraph-tokens-AbCdEf
```

Credentials are resolved via the AWS default chain (env vars, shared config,
IMDSv2 instance role, ECS task role) — no explicit credential plumbing is
needed when running under an IAM instance role on EC2/ECS/EKS.

The IAM role must permit `secretsmanager:GetSecretValue` on the referenced
secret.

Setting the env var without building with `--features aws` is a hard error
with a rebuild instruction — it does not silently fall back to the env/file
source.

## S3-Compatible Storage

For S3-compatible backends such as RustFS or MinIO, set the usual AWS SDK
environment variables:

- `AWS_ACCESS_KEY_ID`
- `AWS_SECRET_ACCESS_KEY`
- `AWS_REGION`
- optional `AWS_ENDPOINT_URL`
- optional `AWS_ENDPOINT_URL_S3`
- optional `AWS_ALLOW_HTTP=true`
- optional `AWS_S3_FORCE_PATH_STYLE=true`
docs/user/embeddings.md
@@ -0,0 +1,31 @@
# Embeddings

OmniGraph has **two** embedding clients with different defaults and purposes.

## Compiler-side client (`omnigraph-compiler/src/embedding.rs`) — query-time normalization

- Default model: `text-embedding-3-small` (OpenAI-style schema)
- Env: `NANOGRAPH_EMBED_MODEL`, `OPENAI_API_KEY`, `OPENAI_BASE_URL` (default `https://api.openai.com/v1`), `NANOGRAPH_EMBEDDINGS_MOCK`, `NANOGRAPH_EMBED_TIMEOUT_MS=30000`, `NANOGRAPH_EMBED_RETRY_ATTEMPTS=4`, `NANOGRAPH_EMBED_RETRY_BACKOFF_MS=200`
- Methods: `embed_text(input, expected_dim)`, `embed_texts(inputs, expected_dim)`
- Mock mode: deterministic FNV-1a + xorshift64 → L2-normalized vectors
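
The mock-mode recipe (FNV-1a hash, xorshift64 stream, L2 normalization) can be sketched as below. This shows one plausible way to combine the three pieces, not the compiler's actual code: `fnv1a64`, `xorshift64`, and `mock_embedding` are hypothetical names, and the seeding and the mapping from random bits to floats are assumptions.

```rust
/// 64-bit FNV-1a hash of the input bytes.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3); // FNV 64-bit prime
    }
    hash
}

/// One xorshift64 step using the (13, 7, 17) shift triple.
fn xorshift64(state: &mut u64) -> u64 {
    let mut x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    x
}

/// Deterministic pseudo-embedding: same text and dim always give the same vector.
fn mock_embedding(text: &str, dim: usize) -> Vec<f32> {
    // xorshift64 must never start from zero, so force the low bit on.
    let mut state = fnv1a64(text.as_bytes()) | 1;
    let mut v: Vec<f32> = (0..dim)
        .map(|_| {
            // Take the top 24 bits as a value in [0, 1), then center on zero.
            let r = (xorshift64(&mut state) >> 40) as f32 / (1u64 << 24) as f32;
            r * 2.0 - 1.0
        })
        .collect();
    // L2-normalize so the vector has unit length.
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}
```

Determinism is the point of such a mock: tests can assert on similarity rankings without an API key or network access.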

## Engine-side client (`omnigraph/src/embedding.rs`) — runtime ingest

- Model: `gemini-embedding-2-preview`
- Env: `GEMINI_API_KEY`, `OMNIGRAPH_GEMINI_BASE_URL` (default Google generativelanguage v1beta), `OMNIGRAPH_EMBED_TIMEOUT_MS=30000`, `OMNIGRAPH_EMBED_RETRY_ATTEMPTS=4`, `OMNIGRAPH_EMBED_RETRY_BACKOFF_MS=200`, `OMNIGRAPH_EMBEDDINGS_MOCK`
- Two task types: `embed_query_text` (RETRIEVAL_QUERY) and `embed_document_text` (RETRIEVAL_DOCUMENT)
- Exponential backoff with retryable detection (timeouts, 429, 5xx)
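
The retry behavior in the last bullet can be sketched with the documented defaults (200 ms base backoff, 4 attempts). The doubling factor and the reading of "4 attempts" as one initial try plus three retries are assumptions, and `is_retryable`, `backoff_delay`, and `retry_schedule` are hypothetical names.

```rust
use std::time::Duration;

/// `None` models a timeout (no HTTP status was received).
fn is_retryable(status: Option<u16>) -> bool {
    match status {
        None => true,
        Some(429) => true,
        Some(code) => (500..=599).contains(&code),
    }
}

/// Delay before retry number `retry_index` (0-based): base * 2^retry_index.
fn backoff_delay(base_ms: u64, retry_index: u32) -> Duration {
    let factor = 1u64.checked_shl(retry_index).unwrap_or(u64::MAX);
    Duration::from_millis(base_ms.saturating_mul(factor))
}

/// All waits for a call allowed `attempts` tries in total.
fn retry_schedule(base_ms: u64, attempts: u32) -> Vec<Duration> {
    // The first try happens immediately, so there are attempts - 1 waits.
    (0..attempts.saturating_sub(1))
        .map(|i| backoff_delay(base_ms, i))
        .collect()
}
```

Under these assumptions the defaults would wait 200 ms, 400 ms, and 800 ms before giving up, and a 400 response would fail immediately.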

## Schema integration

Mark a Vector property with `@embed("source_text_property")`. At ingest, the engine pulls the source text and writes the embedding into the vector column. Stored as L2-normalized FixedSizeList(Float32, dim).

## CLI `omnigraph embed` (offline file pipeline)

Operates on **JSONL files** (not on a repo). Three modes (mutually exclusive):

- (default) `fill_missing` — only embed rows whose target field is empty
- `--reembed-all` — overwrite all
- `--clean` — strip embeddings

Inputs are either a single seed manifest YAML or `--input/--output/--spec`. Selectors `--type T`, `--select T:field=value` filter rows. Streams JSONL → JSONL.
docs/user/errors.md
@@ -0,0 +1,24 @@
# Errors and Result Serialization

## Error taxonomy (`omnigraph::error::OmniError`)

- `Compiler(...)` — schema/query parse/typecheck errors
- `Lance(String)` — storage layer
- `DataFusion(String)` — execution layer
- `Io(io::Error)`
- `Manifest(ManifestError { kind: BadRequest|NotFound|Conflict|Internal, details: Option<ManifestConflictDetails>, … })`
  - `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }` — caller's `expected_table_versions` did not match the manifest's current latest non-tombstoned version (set by `OmniError::manifest_expected_version_mismatch`).
  - `ManifestConflictDetails::RowLevelCasContention` — Lance row-level CAS rejected the publish because a concurrent writer landed the same `object_id`. Retried internally by the publisher; only surfaces if the retry budget exhausts.
  - **D₂ parse-time rejection** (MR-794): a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [docs/user/query-language.md](query-language.md) for the rule and [docs/dev/runs.md](../dev/runs.md) for the underlying staged-write rationale.
- `MergeConflicts(Vec<MergeConflict>)`

Compiler-side `NanoError` covers parse / catalog / type / storage / plan / execution / arrow / lance / IO / manifest / unique-constraint, each with structured spans (`SourceSpan { start, end }`) for ariadne-style diagnostics.

## Result serialization (`omnigraph_compiler::result::QueryResult`)

- `to_arrow_ipc()` — efficient binary
- `to_sdk_json()` — JS-safe JSON (large i64 wrapped in metadata)
- `to_rust_json()` — Rust-friendly JSON
- `batches()` — direct Arrow `RecordBatch` access

Mutation results: `{ affectedNodes: usize, affectedEdges: usize }` (also exposed as a tiny Arrow batch).
docs/user/index.md
@@ -0,0 +1,52 @@
# User Docs

**Audience:** users, CLI users, HTTP clients, and self-hosting operators

This is the public-facing entry point. These docs should describe behavior,
commands, configuration, and operational contracts without requiring knowledge
of MRs, internal recovery mechanics, or contributor-only invariants.

## Start Here

| Goal | Read |
|---|---|
| Install OmniGraph | [install.md](install.md) |
| Run the CLI locally | [cli.md](cli.md) |
| Look up every CLI flag and config field | [cli-reference.md](cli-reference.md) |
| Write schemas | [schema-language.md](schema-language.md) |
| Read schema-lint diagnostic codes | [schema-lint.md](schema-lint.md) |
| Write queries and mutations | [query-language.md](query-language.md) |
| Use embeddings | [embeddings.md](embeddings.md) |

## Operate A Repo

| Goal | Read |
|---|---|
| Understand repo layout and URI support | [storage.md](storage.md) |
| Work with branches, commits, and snapshots | [branches-commits.md](branches-commits.md) |
| Coordinate multi-query workflows | [transactions.md](transactions.md) |
| Read diffs and change feeds | [changes.md](changes.md) |
| Build and use indexes | [indexes.md](indexes.md) |
| Compact and clean old versions | [maintenance.md](maintenance.md) |
| Interpret errors and output formats | [errors.md](errors.md) |

## Run The Server

| Goal | Read |
|---|---|
| Deploy the binary or container | [deployment.md](deployment.md) |
| Use HTTP endpoints | [server.md](server.md) |
| Configure Cedar authorization | [policy.md](policy.md) |
| Track actors and audit behavior | [audit.md](audit.md) |

## Releases

Release notes live in [releases/](../releases/). Use them for user-visible
changes between versions, not for contributor design history.

## Boundary

User docs should focus on stable behavior. If a paragraph needs to explain
internal sidecars, Lance API blockers, MR numbers, test strategy, or review
rules, it probably belongs in [docs/dev/index.md](../dev/index.md) or a developer-area document
instead.
docs/user/indexes.md
@@ -0,0 +1,26 @@
# Indexes

## L1 — Lance index types OmniGraph exposes

| Index | Use | Notes |
|---|---|---|
| **BTREE scalar** | range / equality on any scalar | created on `@key`, `@index(...)`, and on key columns by `ensure_indices()` |
| **Inverted (FTS)** | `search`, `fuzzy`, `match_text`, `bm25` | created on text columns referenced by FTS queries |
| **Vector** | `nearest()` k-NN | Lance picks IVF_PQ vs HNSW family by configuration; OmniGraph stores as FixedSizeList(Float32, dim) |

## L2 — OmniGraph orchestration

- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted indexes for the current head; safe to re-run.
- Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state.
- **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork.
- Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column.

## L2 — Graph topology index (`graph_index/mod.rs`)

This is OmniGraph-specific (not Lance):

- `TypeIndex`: dense `u32 ↔ String id` mapping per node type.
- `CsrIndex`: Compressed Sparse Row representation of edges per edge type — `offsets[i]..offsets[i+1]` slices into `targets`.
- `GraphIndex { type_indices, csr (out), csc (in) }` — built on demand from a snapshot's edge tables.
- Cached in `RuntimeCache::graph_indices` (LRU, max 8 entries, keyed by snapshot id + edge table versions).
- Built only when an `Expand` or `AntiJoin` IR op is present in the lowered query, so pure scans skip it.
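
The `offsets[i]..offsets[i+1]` layout can be made concrete with a small Compressed Sparse Row builder. This is a generic CSR sketch over dense `u32` node ids, not the contents of `graph_index/mod.rs`; `Csr`, `from_edges`, and `neighbors` are hypothetical names.

```rust
/// Compressed Sparse Row adjacency: the neighbors of node `i` are
/// `targets[offsets[i]..offsets[i + 1]]`.
#[derive(Debug)]
struct Csr {
    offsets: Vec<usize>,
    targets: Vec<u32>,
}

impl Csr {
    /// Build from `(src, dst)` pairs over dense node ids `0..node_count`.
    fn from_edges(node_count: usize, edges: &[(u32, u32)]) -> Self {
        // Pass 1: count the out-degree of every source node.
        let mut offsets = vec![0usize; node_count + 1];
        for &(src, _) in edges {
            offsets[src as usize + 1] += 1;
        }
        // Prefix sum turns the counts into start offsets.
        for i in 0..node_count {
            offsets[i + 1] += offsets[i];
        }
        // Pass 2: scatter each destination into its source's slice.
        let mut cursor = offsets.clone();
        let mut targets = vec![0u32; edges.len()];
        for &(src, dst) in edges {
            let slot = &mut cursor[src as usize];
            targets[*slot] = dst;
            *slot += 1;
        }
        Csr { offsets, targets }
    }

    /// Neighbors of `node` as a borrowed slice, with no allocation.
    fn neighbors(&self, node: u32) -> &[u32] {
        let i = node as usize;
        &self.targets[self.offsets[i]..self.offsets[i + 1]]
    }
}
```

The incoming-edge direction (the `csc` side) can be obtained from the same builder by swapping each pair to `(dst, src)` before calling `from_edges`.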
docs/user/install.md
@@ -0,0 +1,94 @@
# Install

## Quick Install

```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash
```

By default the installer places:

- `omnigraph`
- `omnigraph-server`

in `~/.local/bin`.

The default installer is binary-only. It downloads a published release asset,
verifies the SHA256 checksum, and unpacks it. It does not build from source.
If no stable tag is published yet, the installer automatically falls back to
the rolling `edge` release.

## Homebrew

```bash
brew tap ModernRelay/tap
brew install ModernRelay/tap/omnigraph
```

## Channels

Stable binaries:

```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash
```

Rolling edge binaries from `main`:

```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | RELEASE_CHANNEL=edge bash
```

Install from source:

```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install-source.sh | bash
```

## Useful Overrides

Install to a different directory:

```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | INSTALL_DIR="$HOME/bin" bash
```

Install a specific tag:

```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | VERSION=v0.1.0 bash
```

Build from a specific git ref:

```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install-source.sh | SOURCE_REF=main bash
```

## Manual Source Build

```bash
cargo build --release --locked -p omnigraph-cli -p omnigraph-server
install -m 0755 target/release/omnigraph ~/.local/bin/omnigraph
install -m 0755 target/release/omnigraph-server ~/.local/bin/omnigraph-server
```

## Release Assets

Tagged releases are expected to publish:

- `omnigraph-linux-x86_64.tar.gz`
- `omnigraph-macos-x86_64.tar.gz`
- `omnigraph-macos-arm64.tar.gz`

Each archive contains both binaries:

- `omnigraph`
- `omnigraph-server`

## Verify The Install

```bash
omnigraph version
omnigraph-server --help
```

29
docs/user/maintenance.md
Normal file
@ -0,0 +1,29 @@
# Maintenance: Optimize & Cleanup

`db/omnigraph/optimize.rs`.

## `optimize_all_tables(db)` — non-destructive

- Lance `compact_files()` on every node + edge table on `main`.
- Rewrites small fragments into fewer large ones; old fragments remain reachable via older manifests.
- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
- Returns `[TableOptimizeStats { table_key, fragments_removed, fragments_added, committed }]`.

## `cleanup_all_tables(db, options)` — destructive

- Lance `cleanup_old_versions()` per table.
- Removes manifests (and their unique fragments) older than the retention policy.
- `CleanupPolicyOptions { keep_versions: Option<u32>, older_than: Option<Duration> }` — at least one is required.
- Returns `[TableCleanupStats { table_key, bytes_removed, old_versions_removed }]`.
- CLI guards with `--confirm`; without it, prints a preview line.
- **Recovery floor:** `--keep < 3` may garbage-collect Lance versions that the open-time recovery sweep needs as a rollback target (the sweep restores to the branch's manifest-pinned table version, which is HEAD-1 in the typical Phase B → Phase C drift case). Default `--keep 10` is safe.
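
The two rules in this list ("at least one is required" and the recovery floor) can be sketched as a small pre-flight check. The struct fields mirror the `CleanupPolicyOptions` shape quoted above; the `check_policy` function, the `PolicyCheck` enum, and the `RECOVERY_FLOOR_KEEP` constant are hypothetical names for illustration, not OmniGraph APIs.

```rust
use std::time::Duration;

/// Field names follow the shape quoted in the list above.
#[derive(Debug, Clone, Default)]
pub struct CleanupPolicyOptions {
    pub keep_versions: Option<u32>,
    pub older_than: Option<Duration>,
}

/// Hypothetical constant: the "--keep < 3" threshold from the recovery-floor note.
pub const RECOVERY_FLOOR_KEEP: u32 = 3;

#[derive(Debug, PartialEq)]
pub enum PolicyCheck {
    /// Policy is usable as-is.
    Ok,
    /// Policy is valid but may remove versions the recovery sweep needs.
    BelowRecoveryFloor,
}

/// Hypothetical validation helper; returns an error when neither bound is set.
pub fn check_policy(opts: &CleanupPolicyOptions) -> Result<PolicyCheck, String> {
    if opts.keep_versions.is_none() && opts.older_than.is_none() {
        return Err("at least one of keep_versions / older_than is required".to_string());
    }
    match opts.keep_versions {
        Some(keep) if keep < RECOVERY_FLOOR_KEEP => Ok(PolicyCheck::BelowRecoveryFloor),
        _ => Ok(PolicyCheck::Ok),
    }
}
```

A caller might treat `BelowRecoveryFloor` as a warning to surface before asking for `--confirm`; whether the real CLI does so is not stated here.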

## Tombstones

Logical sub-table delete markers in `__manifest`; `tombstone_object_id(table_key, version)` excludes a sub-table version from snapshot reconstruction.
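
As a rough illustration of how a tombstone could hide a sub-table during snapshot reconstruction, the sketch below keeps the latest version per table and drops it when a tombstone with an equal or higher version exists. That `>=` rule is taken from the storage notes on snapshot reconstruction; the `ManifestRow` struct is a simplified, hypothetical subset of the manifest row, and branches are ignored entirely.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum ObjectType {
    TableVersion,
    TableTombstone,
}

/// Simplified, hypothetical subset of a `__manifest` row (single branch only).
#[derive(Debug, Clone)]
pub struct ManifestRow {
    pub object_type: ObjectType,
    pub table_key: String,
    pub table_version: u64,
}

/// Returns `table_key -> visible version` for one lineage.
pub fn visible_versions(rows: &[ManifestRow]) -> HashMap<String, u64> {
    let mut latest: HashMap<String, u64> = HashMap::new();
    let mut tombstone: HashMap<String, u64> = HashMap::new();
    for row in rows {
        // Track the highest version seen, separately for versions and tombstones.
        let map = match row.object_type {
            ObjectType::TableVersion => &mut latest,
            ObjectType::TableTombstone => &mut tombstone,
        };
        let slot = map.entry(row.table_key.clone()).or_insert(row.table_version);
        *slot = (*slot).max(row.table_version);
    }
    // A table stays visible only if its latest version is newer than any tombstone.
    latest.retain(|key, version| match tombstone.get(key) {
        Some(t) => *t < *version,
        None => true,
    });
    latest
}
```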

## Internal schema migrations (`db/manifest/migrations.rs`)

Version evolutions of the on-disk `__manifest` shape are reconciled automatically on the first write under a new binary. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape the binary expects; the on-disk stamp `omnigraph:internal_schema_version` (Lance schema-level metadata) records the on-disk shape. The publisher's open-for-write path calls `migrate_internal_schema` before reading state; reads are side-effect-free. No operator action is required for in-place upgrades. See [storage.md → Internal schema versioning](storage.md) for the full mechanism.

A binary opening a manifest stamped at a version *higher* than it knows about refuses to publish with a clear "upgrade omnigraph first" error — old binaries cannot clobber a newer schema.

44
docs/user/policy.md
Normal file
@ -0,0 +1,44 @@
# Authorization (Cedar policy)

OmniGraph integrates AWS Cedar (`cedar-policy = 4.9`) for ABAC.

## Policy actions

1. `read` — query / snapshot / list branches & commits
2. `export` — NDJSON export
3. `change` — mutations
4. `schema_apply` — apply schema migrations
5. `branch_create`
6. `branch_delete`
7. `branch_merge`
8. `run_publish`
9. `run_abort`
10. `admin` — reserved

## Scope kinds

- `branch_scope` — applied to source branch (`read`, `export`, `change`)
- `target_branch_scope` — applied to destination (`schema_apply`, branch ops, run ops)
- `protected_branches` — named list with special rules; rule scopes are `any | protected | unprotected`

## Configuration

`omnigraph.yaml`:

```yaml
policy:
  file: ./policy.yaml          # Cedar rules + groups
  tests: ./policy.tests.yaml   # declarative test cases
```

Each rule must use exactly one of `branch_scope` or `target_branch_scope`.
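
The action-to-scope pairing listed under "Scope kinds" and the "exactly one of" rule can be sketched as follows. The `Action`, `ScopeKind`, and `Rule` types and both functions are hypothetical illustrations; the real rule format lives in `policy.yaml` and is not reproduced here. Since `admin` is only documented as reserved, the sketch returns no scope for it.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Read, Export, Change, SchemaApply, BranchCreate,
    BranchDelete, BranchMerge, RunPublish, RunAbort, Admin,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ScopeKind {
    Branch,       // `branch_scope`: applied to the source branch
    TargetBranch, // `target_branch_scope`: applied to the destination
}

/// Which scope kind applies to an action, per the "Scope kinds" list.
pub fn scope_for(action: Action) -> Option<ScopeKind> {
    match action {
        Action::Read | Action::Export | Action::Change => Some(ScopeKind::Branch),
        Action::SchemaApply
        | Action::BranchCreate
        | Action::BranchDelete
        | Action::BranchMerge
        | Action::RunPublish
        | Action::RunAbort => Some(ScopeKind::TargetBranch),
        Action::Admin => None, // reserved; no scope documented
    }
}

/// Hypothetical in-memory rule shape with the two mutually exclusive scopes.
pub struct Rule {
    pub branch_scope: Option<String>,
    pub target_branch_scope: Option<String>,
}

/// Enforce "exactly one of `branch_scope` or `target_branch_scope`".
pub fn rule_scope(rule: &Rule) -> Result<ScopeKind, String> {
    match (&rule.branch_scope, &rule.target_branch_scope) {
        (Some(_), None) => Ok(ScopeKind::Branch),
        (None, Some(_)) => Ok(ScopeKind::TargetBranch),
        (Some(_), Some(_)) => Err("rule sets both branch_scope and target_branch_scope".to_string()),
        (None, None) => Err("rule sets neither branch_scope nor target_branch_scope".to_string()),
    }
}
```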

## CLI

- `omnigraph policy validate` — parse + count actors, exit 1 on parse error.
- `omnigraph policy test` — run cases in `policy.tests.yaml`, exit 1 on any expectation mismatch.
- `omnigraph policy explain --actor … --action … [--branch …] [--target-branch …]` — show decision and matched rule.

## Server enforcement

Every mutating endpoint calls `authorize_request()` *before* the handler runs; decisions are logged with actor / action / branch / outcome / matched rule.

111
docs/user/query-language.md
Normal file
@ -0,0 +1,111 @@
# Query Language (`.gq`)

Pest grammar at `crates/omnigraph-compiler/src/query/query.pest`. AST in `query/ast.rs`. Type checker in `query/typecheck.rs`. Lowering in `ir/lower.rs`.

## Query declarations

```
query <name>($p1: T1, $p2: T2?, …)
    @description("…") @instruction("…") {
    …
}
```

Two body shapes:

- **Read**: `match { … } return { … } [order { … }] [limit N]`
- **Mutation**: one or more of `insert | update | delete` statements

Param types reuse all schema scalars; trailing `?` makes a param optional. The compiler reserves `$__nanograph_now` for `now()`.

## MATCH clauses

- **Binding**: `$x: NodeType { prop: <literal | $param | now()>, … }`
- **Traversal**: `$src EDGE_NAME { min, max? } $dst` — variable-length paths via hop bounds; default 1..1 if bounds omitted.
- **Filter**: `<expr> <op> <expr>` with operators `>=`, `<=`, `!=`, `>`, `<`, `=`, and string `contains`.
- **Negation**: `not { clause+ }` — desugars to anti-join over the inner pipeline.

## Search clauses (multi-modal)

Used inside MATCH or as expressions inside RETURN/ORDER:

| Function | Purpose | Underlying Lance facility |
|---|---|---|
| `nearest($x.vec, $q)` | k-NN vector search (cosine) | Lance vector index (IVF / HNSW) |
| `search(field, q)` | Generic FTS | Inverted index |
| `fuzzy(field, q [, max_edits])` | Levenshtein-tolerant text search | Inverted index |
| `match_text(field, q)` | Pattern match | Inverted index |
| `bm25(field, q)` | BM25 scoring | Inverted index |
| `rrf(rank_a, rank_b [, k])` | Reciprocal Rank Fusion of two rankings (default k=60) | OmniGraph fuses scored rankings |

`nearest()` requires a `LIMIT`; the compiler resolves the query vector via the param map (or via the runtime embedding client when bound to a text input).
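
For readers unfamiliar with Reciprocal Rank Fusion, the standard formula sums `1 / (k + rank)` over the rankings in which a row appears. The sketch below applies that textbook formula to two rankings; it assumes 1-based ranks and treats a missing rank as contributing nothing. How OmniGraph's `rrf()` derives ranks and handles missing entries is not specified here, so the function and constant names are illustrative only.

```rust
/// Default smoothing constant, matching the "default k=60" noted in the table.
pub const DEFAULT_K: f64 = 60.0;

/// Textbook RRF score for one row across two rankings.
/// `rank_a` / `rank_b` are assumed 1-based; `None` means the row is absent
/// from that ranking and contributes 0 to the fused score.
pub fn rrf(rank_a: Option<u32>, rank_b: Option<u32>, k: f64) -> f64 {
    [rank_a, rank_b]
        .iter()
        .flatten()
        .map(|&rank| 1.0 / (k + f64::from(rank)))
        .sum()
}
```

A row ranked first in both lists scores `2 / 61` with the default `k`, while a row present in only one list can score at most `1 / 61`, which is why rows that both rankings agree on tend to rise to the top.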

## RETURN clause

`return { <expr> [as <alias>], … }` with expressions:

- Variable / property access: `$x`, `$x.prop`
- Literals: string, int, float, bool, list
- `now()`
- Aggregates: `count`, `sum`, `avg`, `min`, `max`
- All search functions above (so you can return a score column)
- `AliasRef` — re-use a previous projection alias

## ORDER & LIMIT

- `order { <expr> [asc|desc], … }` — supports plain expressions and `nearest(...)`.
- `limit <integer>` — required when there is a `nearest(...)` ordering.

## Mutation statements

- `insert <Type> { prop: <value>, … }`
- `update <Type> set { prop: <value>, … } where <prop> <op> <value>`
- `delete <Type> where <prop> <op> <value>`

`<value>` is a literal, `$param`, or `now()`. Multi-statement mutations execute atomically (added in v0.2.0).

### D₂ — mixed insert/update + delete is rejected at parse time

A single mutation query must be **either insert/update-only or delete-only**. Mixed → rejected before any I/O with the message:

> `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes. This restriction lifts when Lance exposes a two-phase delete API (tracked: MR-793 / Lance-upstream).`

Reason: under the staged-write rewire (MR-794), inserts and updates accumulate in memory and commit at end-of-query, while deletes still inline-commit (Lance 4.0.0 has no public two-phase delete). Mixing creates ordering hazards (same-row insert→delete becomes a no-op because the staged insert isn't visible to delete; cascading deletes of just-inserted edges break referential integrity by silent design). Until Lance exposes `DeleteJob::execute_uncommitted`, the parse-time rejection keeps both paths atomic and correct. See [docs/dev/runs.md](../dev/runs.md) and [docs/dev/invariants.md](../dev/invariants.md).
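
The D₂ rule amounts to a single pass over the statement kinds of a mutation. A minimal sketch, assuming a hypothetical `StatementKind` enum and an abbreviated error message (the real check lives in the compiler and emits the full text quoted above):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StatementKind {
    Insert,
    Update,
    Delete,
}

/// Reject a mutation that mixes inserts/updates with deletes.
/// Hypothetical helper; the message is shortened for the example.
pub fn check_mutation_kinds(name: &str, stmts: &[StatementKind]) -> Result<(), String> {
    let has_delete = stmts.iter().any(|s| *s == StatementKind::Delete);
    let has_write = stmts.iter().any(|s| *s != StatementKind::Delete);
    if has_delete && has_write {
        return Err(format!(
            "mutation '{}' mixes inserts/updates and deletes; split into separate mutations",
            name
        ));
    }
    Ok(())
}
```

Because the check needs only the parsed statement list, it can run before any storage access, which is consistent with the "rejected before any I/O" behaviour described above.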

## IR (Intermediate Representation)

`QueryIR { name, params, pipeline: Vec<IROp>, return_exprs, order_by, limit }`

Pipeline operations:

- `NodeScan { variable, type_name, filters }`
- `Expand { src_var, dst_var, edge_type, direction (Out|In), dst_type, min_hops, max_hops, dst_filters }` — destination filters are pushed *into* the expand so Lance scalar pushdown can prune.
- `Filter { left, op, right }`
- `AntiJoin { outer_var, inner: Vec<IROp> }` — for `not { … }`

Lowering:

1. Partition MATCH clauses (bindings, traversals, filters, negations).
2. Identify "deferred" bindings (a destination of a traversal that has filters) so the Expand can carry the filter as a pushdown.
3. Emit NodeScan for the first binding, then Expand operations, then remaining Filter operations, then AntiJoins for negations.
4. Translate RETURN / ORDER expressions; preserve LIMIT.

## Linting & validation (`query/lint.rs`)

Codes seen so far:

- **Q000** (Error): parse error
- **L201** (Warning): nullable property never set by any UPDATE — "{type}.{prop} exists in schema but no update query sets it"
- (Warning): mutation declares no params — hardcoded mutations are easy to miss
- Plus all type errors from `typecheck_query_decl()` (undefined types, mismatched operators, undefined edges, etc.)

Output:

```
QueryLintOutput { status, schema_source, query_path,
                  queries_processed, errors, warnings, infos,
                  results: [{ name, kind, status, error?, warnings[] }],
                  findings: [{ severity, code, message, type_name?, property?, query_names[] }] }
```

CLI exits non-zero only on `status = Error`.

80
docs/user/schema-language.md
Normal file
@ -0,0 +1,80 @@
# Schema Language (`.pg`)

Pest grammar at `crates/omnigraph-compiler/src/schema/schema.pest`. AST at `schema/ast.rs`. Catalog at `catalog/mod.rs`.

## Top-level declarations

- `interface <Name> { property* }` — reusable property contracts.
- `node <Name> [implements <Iface>, ...] { property* | constraint* }`
- `edge <Name>: <FromType> -> <ToType> [@card(min..max)] { property* | constraint* }`
- Comments: line `//` and block `/* … */`.

## Property declarations

`<ident>: <TypeRef> [annotation*]`

## Built-in scalar types

| Scalar | Arrow type |
|---|---|
| `String` | Utf8 |
| `Blob` | LargeBinary |
| `Bool` | Boolean |
| `I32` / `I64` | Int32 / Int64 |
| `U32` / `U64` | UInt32 / UInt64 |
| `F32` / `F64` | Float32 / Float64 |
| `Date` | Date32 |
| `DateTime` | Date64 |
| `Vector(<dim>)` | FixedSizeList(Float32, dim), `1 ≤ dim ≤ i32::MAX` |
| `[<scalar>]` | List(scalar) |
| `enum(v1, v2, …)` | Utf8 with sorted/dedup'd set of allowed string values |
| `<scalar>?` | Same as scalar but `nullable: true` |

## Constraints (body level)

| Constraint | On | Effect |
|---|---|---|
| `@key(p, …)` | node | Primary key; implies index on key columns; `key_property()` returns the first key |
| `@unique(p, …)` | node, edge | Uniqueness across listed columns |
| `@index(p, …)` | node, edge | Build a scalar (BTREE) index on the columns |
| `@range(p, min..max)` | node | Numeric range validation (open ranges allowed) |
| `@check(p, "regex")` | node | Regex pattern validation |
| `@card(min..max?)` | edge | Edge multiplicity — default `0..*`; `0..1`, `1..1`, `1..*`, etc. |

Edge bodies only allow `@unique` and `@index`.
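
To make "open ranges allowed" and the `min..max?` cardinality notation concrete, the sketch below models both as optional bounds. The type names, the inclusive treatment of bounds, and the reading of `*` as "no upper bound" are assumptions for illustration; the compiler's actual representation is not shown in this document.

```rust
/// Hypothetical model of `@range(p, min..max)`; a missing bound leaves that side open.
pub struct RangeConstraint {
    pub min: Option<f64>,
    pub max: Option<f64>,
}

impl RangeConstraint {
    /// Bounds are treated as inclusive here (an assumption).
    pub fn accepts(&self, value: f64) -> bool {
        self.min.map_or(true, |lo| value >= lo) && self.max.map_or(true, |hi| value <= hi)
    }
}

/// Hypothetical model of `@card(min..max?)`; `max: None` stands for `*`.
pub struct Cardinality {
    pub min: u32,
    pub max: Option<u32>,
}

impl Cardinality {
    /// The documented default, `0..*`.
    pub fn default_card() -> Self {
        Cardinality { min: 0, max: None }
    }

    /// Whether a source node with `edge_count` outgoing edges satisfies the bound.
    pub fn accepts(&self, edge_count: u32) -> bool {
        edge_count >= self.min && self.max.map_or(true, |hi| edge_count <= hi)
    }
}
```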

## Annotations

- `@<ident>` or `@<ident>(<literal>)` on any declaration or property.
- Known annotations:
  - `@embed` on a Vector property — names the *source* property whose text gets embedded into this vector at ingest (`embed_sources` map in NodeType).
  - `@description("…")`, `@instruction("…")` on query declarations (carried through to clients).
- Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.

## Catalog construction

- Pass 0: collect interfaces.
- Pass 1: collect nodes, expand `implements`, build constraint and `@embed` mappings, build the Arrow schema for each node table (`id: Utf8` plus all properties; blob columns get `LargeBinary`).
- Pass 2: collect edges, validate that `from_type` / `to_type` exist, normalize edge names case-insensitively for lookup, validate constraints for edges. Edge Arrow schema: `id: Utf8, src: Utf8, dst: Utf8` plus edge properties.

## Schema IR & stable type IDs

- `SCHEMA_IR_VERSION = 1` (`catalog/schema_ir.rs`).
- Each interface/node/edge currently gets a `stable_type_id` from a kind+name hash.
- Rename-preserving accepted IDs are an architectural invariant, but the current hash-on-name implementation is a known gap until migration carries IDs across `@rename_from`.
- Serialized as JSON for diff/migration plans.

## Schema migration planning

`plan_schema_migration(accepted, desired) -> SchemaMigrationPlan { supported, steps[] }` with step types:

- `AddType { type_kind, name }`
- `RenameType { type_kind, from, to }`
- `AddProperty { type_kind, type_name, property_name, property_type }`
- `RenameProperty { type_kind, type_name, from, to }`
- `AddConstraint { type_kind, type_name, constraint }`
- `UpdateTypeMetadata { … annotations }`
- `UpdatePropertyMetadata { … annotations }`
- `UnsupportedChange { entity, reason }` (forces `supported=false`)

`apply_schema()` returns `SchemaApplyResult { supported, applied, manifest_version, steps }` and is gated by an internal `__schema_apply_lock__` system branch so concurrent schema applies serialize.
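
The relationship between `steps` and `supported` can be shown with a reduced model: the plan is supported only if no step is an `UnsupportedChange`. The enum below keeps three of the step types with trimmed fields (for example, `type_kind` is omitted), and `finish_plan` is a hypothetical helper, not the planner's real entry point.

```rust
/// Reduced, illustrative subset of the step types listed above.
#[derive(Debug)]
pub enum Step {
    AddType { name: String },
    AddProperty { type_name: String, property_name: String },
    UnsupportedChange { entity: String, reason: String },
}

#[derive(Debug)]
pub struct Plan {
    pub supported: bool,
    pub steps: Vec<Step>,
}

/// Derive `supported` from the steps: any `UnsupportedChange` forces `false`.
pub fn finish_plan(steps: Vec<Step>) -> Plan {
    let supported = !steps
        .iter()
        .any(|step| matches!(step, Step::UnsupportedChange { .. }));
    Plan { supported, steps }
}
```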

61
docs/user/schema-lint.md
Normal file
@ -0,0 +1,61 @@
# Schema lint

The migration planner emits **code-tagged diagnostics** for every schema change it rejects. Codes have the form `OG-XXX-NNN` and identify the rule (not the message); operators reference them in suppression directives, severity overrides, and CI reports.

This page is the catalog of codes shipped today. The chassis behind it is tracked in [MR-694](https://linear.app/modernrelay/issue/MR-694).

## What's shipped in v0

- Stable code attached to every rejection the planner emits (today: 5 of 17 paths — the rest carry `code: None` and are tagged as future work).
- Code appears in the user-visible error message: `[OG-DS-104] removing property 'Person.age' is not supported …`.
- CLI `omnigraph schema plan` shows the code on `unsupported change …` lines.
- Tests in `tests/schema_apply.rs` assert on codes, not on free-text prose.

## What's not shipped yet

- Severity configuration in `omnigraph.yaml` (planned: `lint: { OG-DS-103: error }`).
- `@allow(OG-XXX-NNN, "rationale")` suppression directives.
- Pre-migration checks (the `migration_check { … }` block — MR-941).
- The CD / VE / LK / NM families (MR-942..945).
- CI integration (MR-946).
- Cost-class annotations (MR-944).

See the parent chassis issue (MR-694) for the design and the per-family sub-issues for what's planned.

## Code catalog (v0)

The chassis defines ten families. Today only DS and MF have emitted codes. The remaining families are reserved for future PRs.

| Code | Family | Tier | Default severity | Meaning |
|---|---|---|---|---|
| `OG-DS-101` | Destructive | destructive | error | drop graph type with rows (reserved; not yet emitted) |
| `OG-DS-102` | Destructive | destructive | error | drop node type with rows |
| `OG-DS-103` | Destructive | destructive | error | drop edge type with rows |
| `OG-DS-104` | Destructive | destructive | error | drop property with rows |
| `OG-DS-105` | Destructive | destructive | error | drop populated vector column (reserved) |
| `OG-MF-103` | Maybe-fail | validated | error | add required property without `@default` to populated type |
| `OG-MF-104` | Maybe-fail | validated | error | tighten nullable to non-nullable (reserved) |
| `OG-MF-106` | Maybe-fail | destructive | error | narrowing scalar type |

The full code catalog source of truth lives in `crates/omnigraph-compiler/src/lint/codes.rs`. CI-level invariants (uniqueness, format, family coverage) are unit-tested in the same module.

## Families

The ten chassis families:

| Prefix | Family | Status |
|---|---|---|
| **DS** | Destructive (data-loss) | shipped, v0 |
| **MF** | Maybe-fail / data-dependent | shipped, v0 |
| **CD** | Constraint deletion (relaxation warning) | tracked in MR-942 |
| **BC** | Backward-incompatible (rename) | implicit in `@rename_from`; codify later |
| **NM** | Naming conventions | tracked in MR-945 |
| **OW** | Ownership (per-resource Cedar) | tracked in MR-722 |
| **NL** | Non-linear (branch-merge divergence) | stubbed in MR-947 |
| **VE** | Vector / embedding | tracked in MR-943 |
| **ED** | Edge / graph topology | tracked in MR-701, MR-943 |
| **LK** | Lock duration / cost | tracked in MR-944 |

## Prior art

The chassis is modeled on [Atlas's `sqlcheck` analyzers](https://atlasgo.io/lint/analyzers) (DS / MF / CD / BC / NM families). Atlas was the direct inspiration for stable codes, per-rule severity, suppression directives with rationale, and pre-migration checks. omnigraph adapts the chassis to a typed-IR substrate (no SQL injection vector, no per-engine locking, native vector / edge / embedding types Atlas doesn't have).

101
docs/user/server.md
Normal file
@ -0,0 +1,101 @@
# HTTP Server (`omnigraph-server`)

Axum 0.8 + tokio + utoipa-generated OpenAPI. Single repo per process; deploy multiple processes for multi-tenant.

## Endpoint inventory

| Method | Path | Auth | Action | Handler |
|---|---|---|---|---|
| GET | `/healthz` | none | — | `server_health` |
| GET | `/openapi.json` | none | — | `server_openapi` (strips security if auth disabled) |
| GET | `/snapshot?branch=` | bearer + `read` | snapshot of branch | `server_snapshot` |
| POST | `/read` | bearer + `read` | run named query | `server_read` |
| POST | `/export` | bearer + `export` | NDJSON stream | `server_export` |
| POST | `/change` | bearer + `change` | mutation | `server_change` |
| GET | `/schema` | bearer + `read` | get current `.pg` source | `server_schema_get` |
| POST | `/schema/apply` | bearer + `schema_apply` (target=`main`) | migrate | `server_schema_apply` |
| POST | `/ingest` | bearer + `branch_create` (if new) + `change` | bulk load | `server_ingest` (32 MB body limit) |
| GET | `/branches` | bearer + `read` | list branches | `server_branch_list` |
| POST | `/branches` | bearer + `branch_create` | create | `server_branch_create` |
| DELETE | `/branches/{branch}` | bearer + `branch_delete` | delete | `server_branch_delete` |
| POST | `/branches/merge` | bearer + `branch_merge` | merge `source → target` | `server_branch_merge` |
| GET | `/commits?branch=` | bearer + `read` | list | `server_commit_list` |
| GET | `/commits/{commit_id}` | bearer + `read` | show | `server_commit_show` |

## Streaming

Only `/export` streams (`application/x-ndjson`, MPSC channel + `Body::from_stream`). Everything else is buffered JSON.

## Error model

Uniform `ErrorOutput { error, code?, merge_conflicts[], manifest_conflict? }` with `code ∈ unauthorized | forbidden | bad_request | not_found | conflict | too_many_requests | internal`. Merge conflicts attach structured `MergeConflictOutput { table_key, row_id?, kind, message }`.

`manifest_conflict` is set on **publisher CAS rejections** (HTTP 409): the
caller's pre-write view of one table's manifest version was stale.
`ManifestConflictOutput { table_key, expected, actual }` tells the client
which table to refresh and retry. This is the conflict shape produced by
concurrent `/change` or `/ingest` calls landing the same `(table, branch)`
race.

HTTP status codes used: 200, 400, 401, 403, 404, 409, 429, 500.
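
The `manifest_conflict` payload follows from a compare step: each table version the caller observed before writing is checked against the manifest's current version. The sketch below models that comparison with plain maps. The function name, the use of `0` for a table with no published version, and the sorted iteration order are assumptions; the real check is part of the publisher and operates on the manifest dataset.

```rust
use std::collections::HashMap;

/// Mirrors the documented `ManifestConflictOutput { table_key, expected, actual }`.
#[derive(Debug, PartialEq)]
pub struct ManifestConflict {
    pub table_key: String,
    pub expected: u64,
    pub actual: u64,
}

/// Hypothetical pre-publish check: every expected version must match the
/// current one. An empty `expected` map asserts nothing and always passes.
pub fn check_expected_versions(
    current: &HashMap<String, u64>,
    expected: &HashMap<String, u64>,
) -> Result<(), ManifestConflict> {
    // Sort keys so the first reported conflict is deterministic.
    let mut keys: Vec<&String> = expected.keys().collect();
    keys.sort();
    for key in keys {
        let want = expected[key];
        // Assumption: a table with no published version is treated as version 0.
        let actual = current.get(key).copied().unwrap_or(0);
        if actual != want {
            return Err(ManifestConflict { table_key: key.clone(), expected: want, actual });
        }
    }
    Ok(())
}
```

A client receiving such a conflict would typically re-read the named table's version and retry the write, as the paragraph above suggests.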

## Per-actor admission control

Disjoint `(table, branch)` writes from different actors now run concurrently,
guarded only by the engine's per-(table, branch) write queue. To keep
one heavy actor from exhausting shared capacity (Lance I/O, manifest
churn, network), the server gates mutating handlers through a
`WorkloadController` configured per-process from environment variables:

| Env var | Default | Purpose |
|---|---|---|
| `OMNIGRAPH_PER_ACTOR_INFLIGHT_MAX` | 16 | Concurrent in-flight mutations per actor |
| `OMNIGRAPH_PER_ACTOR_BYTES_MAX` | 4 GiB | In-flight estimated bytes per actor |

When an actor exceeds its in-flight count or byte budget, the server
returns **HTTP 429 Too Many Requests** with `code: too_many_requests`
and a `Retry-After` header (seconds). The actor should back off; other
actors are unaffected.

Cedar policy authorization runs **before** admission accounting so
denied requests don't consume admission slots.

Today admission gates every mutating handler: `/change`, `/ingest`,
`/branches/{create,delete,merge}`, and `/schema/apply`. Read-only
endpoints (`/snapshot`, `/read`, `/export`, `/branches` GET, `/commits`,
`/schema` GET) are not admission-gated.
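
The per-actor budget can be pictured as two counters per actor, checked on entry and decremented on completion. The sketch below is a single-threaded model with hypothetical names (`Admission`, `try_admit`, `release`); it is not the `WorkloadController` itself, which would need synchronization and would map a rejection to HTTP 429 with `Retry-After`.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum Rejected {
    TooManyInflight,
    ByteBudgetExceeded,
}

#[derive(Default)]
struct Usage {
    inflight: u32,
    bytes: u64,
}

/// Single-threaded illustration of per-actor admission accounting.
pub struct Admission {
    max_inflight: u32,
    max_bytes: u64,
    usage: HashMap<String, Usage>,
}

impl Admission {
    pub fn new(max_inflight: u32, max_bytes: u64) -> Self {
        Admission { max_inflight, max_bytes, usage: HashMap::new() }
    }

    /// Admit a mutation for `actor` if both budgets allow it.
    pub fn try_admit(&mut self, actor: &str, est_bytes: u64) -> Result<(), Rejected> {
        let usage = self.usage.entry(actor.to_string()).or_default();
        if usage.inflight >= self.max_inflight {
            return Err(Rejected::TooManyInflight);
        }
        if usage.bytes.saturating_add(est_bytes) > self.max_bytes {
            return Err(Rejected::ByteBudgetExceeded);
        }
        usage.inflight += 1;
        usage.bytes += est_bytes;
        Ok(())
    }

    /// Release the slot and bytes once the mutation finishes (success or failure).
    pub fn release(&mut self, actor: &str, est_bytes: u64) {
        if let Some(usage) = self.usage.get_mut(actor) {
            usage.inflight = usage.inflight.saturating_sub(1);
            usage.bytes = usage.bytes.saturating_sub(est_bytes);
        }
    }
}
```

Because usage is keyed by actor, one actor hitting its limit leaves other actors' budgets untouched, which matches the isolation property described above.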

## Body limits

- Default: 1 MB
- `/ingest`: 32 MB

## Auth model (`bearer + SHA-256`)

- Tokens are SHA-256 hashed on startup; plaintext is never persisted in memory.
- Constant-time comparison via `subtle::ConstantTimeEq`.
- Three sources, in precedence:
  1. `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` — AWS Secrets Manager (build with `--features aws`)
  2. `OMNIGRAPH_SERVER_BEARER_TOKENS_FILE` or `OMNIGRAPH_SERVER_BEARER_TOKENS_JSON` — JSON `{actor_id: token, …}`
  3. `OMNIGRAPH_SERVER_BEARER_TOKEN` — single legacy token, actor `default`
- If no tokens configured, server runs unauthenticated (local dev) and `/openapi.json` strips the security scheme.

See [deployment.md](deployment.md) for token-source operational details.

## Tracing & observability

- `tower_http::TraceLayer::new_for_http()`
- Policy decisions logged at INFO level with actor, action, branch, decision, matched rule
- Startup logs: token source name, repo URI, bind address
- Graceful SIGINT shutdown

## Not implemented (by design or "TBD")

- CORS — not configured; add `tower_http::cors` if needed.
- Rate limiting — per-actor admission control gates `/change`, `/ingest`,
  `/branches/{create,delete,merge}`, `/schema/apply` (see "Per-actor
  admission control" above). No global rate limiter is configured;
  add `tower_http::limit` if a graph-wide cap is needed.
- Pagination — none (commits/branches return everything; export streams).
- Multi-tenant routing — one repo per process.

115
docs/user/storage.md
Normal file
@ -0,0 +1,115 @@
# Storage

## L1 — Lance dataset (per node/edge type)

Every node type and every edge type is its own Lance dataset:

- **Columnar Arrow storage**: each property is a column; nullable per Arrow schema.
- **Fragments**: data is partitioned into fragments; new writes create new fragments.
- **Manifest versioning**: every commit produces a new dataset version; old versions remain readable.
- **Stable row IDs**: `enable_stable_row_ids: true` is set on every Lance dataset OmniGraph creates — node and edge data tables, `__manifest`, `_graph_commits.lance`, `_graph_commit_recoveries.lance`, and any future system tables. This is an architectural invariant: the flag is one-way at dataset create per Lance's row-id-lineage spec, so a future change that introduces a Lance dataset must preserve it. Consequences: `_row_created_at_version` and `_row_last_updated_at_version` are available on every dataset (load-bearing for change-feed validators); `CreateIndex × Rewrite` is not a retryable conflict, so indices survive `omnigraph optimize` without needing the Fragment Reuse Index; readers must use a Lance build that recognises the flag (our pinned 4.0.0 is fine). Pre-0.4.x repos created before this code path settled may have datasets without the flag and cannot be retrofitted in place — the supported path is dump-and-reload. The `stage_overwrite` rewrite path (used by `schema_apply`) preserves the flag through `Operation::Overwrite`; pinned by `stage_overwrite_preserves_stable_row_ids` in `crates/omnigraph/tests/staged_writes.rs`.
- **Append / delete / `merge_insert`**: native Lance write modes.
- **Per-dataset branches** (Lance native): copy-on-write at the dataset level.
- **Object-store agnostic**: file://, s3://, gs://, az://, http (read-only via Lance) — OmniGraph wires file:// and s3:// (`storage.rs`).

## L2 — Multi-dataset coordination via `__manifest`

OmniGraph is **not** a single Lance dataset; it is a *graph* of datasets coordinated through one append-only manifest table.

- **Manifest table**: `__manifest/` Lance dataset.
- **Layout** (`db/manifest/layout.rs`, `db/manifest/state.rs`):
  - `nodes/{fnv1a64-hex(type_name)}` — one Lance dataset per node type
  - `edges/{fnv1a64-hex(edge_type_name)}` — one Lance dataset per edge type
  - `__manifest/` — the catalog of all sub-tables and their published versions
  - `_graph_commits.lance` / `_graph_commit_actors.lance` — the commit graph and its actor map
  - (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 repos are inert; the run state machine was removed in MR-771 and these files are cleaned up via MR-770's production sweep)
- **Manifest row schema** (`object_id, object_type, location, metadata, base_objects, table_key, table_version, table_branch, row_count`):
  - `object_type` ∈ `table | table_version | table_tombstone`
  - `table_key` ∈ `node:<TypeName> | edge:<EdgeName>`
  - `table_branch` is `null` for the main lineage and the branch name otherwise
- **Snapshot reconstruction**: latest visible `table_version` per `(table_key, table_branch)` minus tombstones — rows where `object_type = table_tombstone`, whose own `table_version` (acting as the tombstone version) is `>= the entry's table_version`.
- **Atomic publish**: multi-dataset commits publish via a `ManifestBatchPublisher` so a single write to `__manifest` flips all the new sub-table versions visible at once.
- **Row-level CAS on the merge-insert join key**: `object_id` carries `lance-schema:unenforced-primary-key=true` so Lance's bloom-filter conflict resolver rejects two concurrent commits that land the same `object_id` row. Without this annotation, Lance's transparent rebase would admit silent duplicates of `version:T@v=N` from racing publishers (see `.context/merge-insert-cas-granularity.md`).
- **Optimistic concurrency control on publish**: `ManifestBatchPublisher::publish` accepts an `expected_table_versions: HashMap<table_key, u64>` map. Each entry asserts the manifest's current latest non-tombstoned version for that table is exactly what the caller observed; mismatches surface as `OmniError::Manifest` with `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }`. Empty map preserves the legacy "best-effort publish" semantics. The publisher uses `conflict_retries(0)` against Lance and owns retry itself (`PUBLISHER_RETRY_BUDGET = 5`), re-running the pre-check on each iteration so concurrent advances surface as `ExpectedVersionMismatch` rather than being silently rebased through.
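
The `nodes/{fnv1a64-hex(type_name)}` layout entry above refers to the 64-bit FNV-1a hash. A minimal sketch of that hash and of a path built from it follows. The hash constants are the standard FNV-1a 64-bit parameters; hashing the UTF-8 bytes of the type name and zero-padding the hex to 16 lowercase digits are assumptions about OmniGraph's exact encoding, and `node_table_path` is a hypothetical helper.

```rust
/// Standard 64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
    const PRIME: u64 = 0x0000_0100_0000_01b3;
    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical path builder. Assumes the type name is hashed as UTF-8 and
/// rendered as 16 zero-padded lowercase hex digits.
pub fn node_table_path(type_name: &str) -> String {
    format!("nodes/{:016x}", fnv1a64(type_name.as_bytes()))
}
```

Hashing the name keeps dataset directory names fixed-width and filesystem-safe regardless of what characters a type name contains.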
|
||||
|
||||
### Internal schema versioning (`db/manifest/migrations.rs`)
|
||||
|
||||
The on-disk shape of `__manifest` is reconciled with the binary via a single stamp + dispatcher. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape this binary writes; the on-disk stamp `omnigraph:internal_schema_version` lives in the manifest dataset's schema-level metadata (Lance `update_schema_metadata`).
|
||||
|
||||
- **`init_manifest_repo`** stamps the current version at creation, so newly initialized repos never need migration.
|
||||
- **Publisher open-for-write path** (`load_publish_state`) calls `migrate_internal_schema(&mut dataset)` before reading state. When the on-disk stamp matches the binary, this is a single metadata read with no writes; otherwise the dispatcher walks `match`-arm steps forward (1→2, 2→3, …) until the stamp matches, then proceeds with the publish. Reads stay side-effect-free.
|
||||
- **Forward-version protection**: a stamp *higher* than the binary's known version triggers a clear "upgrade omnigraph first" error. An old binary cannot clobber a newer schema by silently treating "unknown stamp" as "missing stamp".
|
||||
- **Idempotency**: each migration step is safe to re-run. A crash between two metadata updates inside a single step leaves the partial state; the next open re-runs the step and the second update lands. The dispatcher itself is a cheap stamp-read on the steady-state path.
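The dispatcher behaviour in the bullets above (implicit v1, step-by-step forward walk, forward-version protection, re-runnable steps) can be sketched against an in-memory stand-in for the dataset metadata. `ManifestMeta`, `MigrateError`, and the "annotate first, stamp last" ordering inside the step are assumptions for illustration; the real function takes a Lance dataset and its signature may differ.

```rust
use std::collections::HashMap;

pub const INTERNAL_MANIFEST_SCHEMA_VERSION: u32 = 2;
const STAMP_KEY: &str = "omnigraph:internal_schema_version";
const PK_KEY: &str = "lance-schema:unenforced-primary-key";

/// Hypothetical stand-in for the manifest dataset's metadata.
#[derive(Debug, Default)]
pub struct ManifestMeta {
    pub schema_metadata: HashMap<String, String>,
    pub object_id_metadata: HashMap<String, String>,
}

#[derive(Debug, PartialEq)]
pub enum MigrateError {
    /// The on-disk stamp is newer than this binary understands.
    UpgradeRequired { on_disk: u32, supported: u32 },
    BadStamp(String),
}

fn read_stamp(meta: &ManifestMeta) -> Result<u32, MigrateError> {
    match meta.schema_metadata.get(STAMP_KEY) {
        None => Ok(1), // v1 is implicit: pre-stamp repos carry no key
        Some(raw) => raw
            .parse::<u32>()
            .map_err(|_| MigrateError::BadStamp(raw.clone())),
    }
}

/// Walks the stamp forward one step at a time.
/// Returns the number of migration steps that were applied.
pub fn migrate_internal_schema(meta: &mut ManifestMeta) -> Result<u32, MigrateError> {
    let mut steps = 0;
    loop {
        let stamp = read_stamp(meta)?;
        if stamp > INTERNAL_MANIFEST_SCHEMA_VERSION {
            return Err(MigrateError::UpgradeRequired {
                on_disk: stamp,
                supported: INTERNAL_MANIFEST_SCHEMA_VERSION,
            });
        }
        if stamp == INTERNAL_MANIFEST_SCHEMA_VERSION {
            return Ok(steps); // steady state: one read, no writes
        }
        match stamp {
            1 => {
                // Annotate first, stamp last. A crash between the two
                // writes leaves stamp 1, so the next open re-runs this
                // arm and both inserts are harmless to repeat.
                meta.object_id_metadata
                    .insert(PK_KEY.to_string(), "true".to_string());
                meta.schema_metadata
                    .insert(STAMP_KEY.to_string(), "2".to_string());
            }
            other => return Err(MigrateError::BadStamp(other.to_string())),
        }
        steps += 1;
    }
}
```

Under this shape, a future v3 would add one `2 => { ... }` arm and bump the constant, matching the "one constant, one match arm, one test" rule.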
|
||||
|
||||
Adding a new on-disk shape change is one constant bump (`INTERNAL_MANIFEST_SCHEMA_VERSION`), one match arm in `migrate_internal_schema`, and one test. No code outside this module branches on the stamp.
|
||||
|
||||
| Stamp | Shape change |
|
||||
|---|---|
|
||||
| v1 (implicit, pre-stamp) | `__manifest.object_id` had no PK annotation; publisher had no row-level CAS protection. |
|
||||
| v2 | `__manifest.object_id` carries `lance-schema:unenforced-primary-key=true`; row-level CAS engaged. Stamped as `omnigraph:internal_schema_version=2`. |
|
||||
|
||||
## On-disk layout
|
||||
|
||||
A repo on disk is a directory tree of Lance datasets. Each dataset follows the standard Lance layout (`_versions/`, `data/`, `_indices/`, `_refs/`); OmniGraph adds the multi-dataset coordination by keeping `__manifest/` alongside the per-type datasets.
|
||||
|
||||
```mermaid
|
||||
flowchart TB
|
||||
classDef l1 fill:#fef3e8,stroke:#c46900,color:#000
|
||||
classDef l2 fill:#e8f4fd,stroke:#1e6aa8,color:#000
|
||||
|
||||
repo["repo URI<br/>file:// or s3://bucket/prefix"]:::l2
|
||||
|
||||
manifest["__manifest/<br/>L2 catalog of sub-tables"]:::l2
|
||||
nodes["nodes/{fnv1a64-hex}/<br/>one dataset per node type"]:::l2
|
||||
edges["edges/{fnv1a64-hex}/<br/>one dataset per edge type"]:::l2
|
||||
cgraph["_graph_commits.lance/<br/>_graph_commit_actors.lance/<br/>_graph_commit_recoveries.lance/"]:::l2
|
||||
recovery["__recovery/{ulid}.json<br/>recovery sidecars (transient)"]:::l2
|
||||
refs["_refs/branches/{name}.json<br/>graph-level branches"]:::l2
|
||||
|
||||
repo --> manifest
|
||||
repo --> nodes
|
||||
repo --> edges
|
||||
repo --> cgraph
|
||||
repo --> recovery
|
||||
repo --> refs
|
||||
|
||||
subgraph dataset[Inside each Lance dataset — L1]
|
||||
ds_v["_versions/{n}.manifest<br/>per-dataset versions"]:::l1
|
||||
ds_data["data/<br/>fragment files (Arrow IPC)"]:::l1
|
||||
ds_idx["_indices/{uuid}/<br/>BTREE · Inverted FTS · IVF/HNSW"]:::l1
|
||||
ds_refs["_refs/<br/>per-dataset Lance branches/tags"]:::l1
|
||||
ds_tx["_transactions/<br/>commit transaction logs"]:::l1
|
||||
end
|
||||
|
||||
nodes -.-> dataset
|
||||
edges -.-> dataset
|
||||
manifest -.-> dataset
|
||||
```
|
||||
|
||||
**What's where:**
|
||||
|
||||
- **Repo root** is one directory (or S3 prefix). Everything below is part of one OmniGraph repo.
|
||||
- **`__manifest/`** is a Lance dataset whose rows describe which sub-table version is published at which graph-branch. Reading a snapshot starts here.
|
||||
- **`nodes/`** and **`edges/`** are sibling directories holding one Lance dataset per declared type. Names are `fnv1a64-hex` of the type name to keep paths fixed-length and case-safe.
|
||||
- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 repos also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; MR-770 sweeps these in production.)
|
||||
- **`_graph_commit_recoveries.lance`** — one row per recovery sweep action. Joined to `_graph_commits.lance` by `graph_commit_id`; the linked commit row carries `actor_id=omnigraph:recovery`. Operators correlate recoveries with the original mutations they rolled forward / back via this join. See `crates/omnigraph/src/db/recovery_audit.rs`.
|
||||
- **`__recovery/{ulid}.json`** — transient sidecar files written by the four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) before Phase B begins, deleted after Phase C succeeds. A sidecar persisting after process exit means the writer crashed in the Phase B → Phase C window; the next `Omnigraph::open` recovery sweep processes it. Steady-state directory is empty. See `crates/omnigraph/src/db/manifest/recovery.rs`.
|
||||
- **`_refs/branches/{name}.json`** is graph-level branch metadata — pointers from a branch name to the manifest version it heads.
|
||||
- **Inside each Lance dataset** (orange): the standard Lance directory layout. `_versions/{n}.manifest` records every commit; `data/` holds the actual Arrow fragments; `_indices/{uuid}/` holds index segments with their own `fragment_bitmap` for partial coverage; `_refs/` holds Lance-native per-dataset branches and tags.
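To make the `fnv1a64-hex` directory naming concrete, here is a sketch of 64-bit FNV-1a rendered as fixed-width hex. The hash constants are the standard FNV-1a parameters; hashing the UTF-8 bytes of the type name, the lowercase 16-character rendering, and the helper names are assumptions, since the text does not spell out the exact encoding.

```rust
/// Standard 64-bit FNV-1a parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a over the UTF-8 bytes of `type_name`,
/// rendered as 16 lowercase hex characters.
pub fn fnv1a64_hex(type_name: &str) -> String {
    let mut hash = FNV_OFFSET_BASIS;
    for byte in type_name.as_bytes() {
        hash ^= u64::from(*byte); // xor first, then multiply (the "1a" order)
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    format!("{hash:016x}")
}

/// Hypothetical helper: relative path of the dataset for a node type.
pub fn node_dataset_path(type_name: &str) -> String {
    format!("nodes/{}", fnv1a64_hex(type_name))
}
```

Whatever the type name's length or casing, the path component is always 16 hex characters, which is the fixed-length, case-safe property the layout relies on.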
|
||||
|
||||
The split — L2 owns the cross-dataset catalog; L1 owns the per-dataset internals — means that schema work (which adds or removes datasets) updates `__manifest`, while data work (which adds fragments) updates `_versions/` inside the affected dataset and then bumps `__manifest`.
|
||||
|
||||
## URI scheme support (`storage.rs`)
|
||||
|
||||
| Scheme | Backend | Notes |
|
||||
|---|---|---|
|
||||
| local path / `file://` | `LocalStorageAdapter` (tokio) | Normalized to absolute paths |
|
||||
| `s3://bucket/prefix` | `S3StorageAdapter` (object_store) | Honors `AWS_ENDPOINT_URL_S3`, `AWS_ALLOW_HTTP`, `AWS_S3_FORCE_PATH_STYLE` |
|
||||
| `http(s)://host:port` | HTTP client to `omnigraph-server` | Used by CLI as a target, not a storage backend |
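The table can be read as a dispatch on the URI prefix. The sketch below is a hypothetical classifier, not the code in `storage.rs`: the `Target` enum and `classify` function are invented for illustration, and relative paths are made absolute by joining them onto a caller-supplied working directory without further normalization.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical result type; the real adapters are
/// `LocalStorageAdapter` and `S3StorageAdapter`.
#[derive(Debug, PartialEq)]
pub enum Target {
    Local(PathBuf), // absolute path on the local filesystem
    S3 { bucket: String, prefix: String },
    HttpServer(String), // CLI target, not a storage backend
}

pub fn classify(uri: &str, cwd: &Path) -> Result<Target, String> {
    if let Some(rest) = uri.strip_prefix("s3://") {
        // Everything up to the first '/' is the bucket.
        let (bucket, prefix) = rest.split_once('/').unwrap_or((rest, ""));
        if bucket.is_empty() {
            return Err(format!("missing bucket in {uri}"));
        }
        return Ok(Target::S3 {
            bucket: bucket.to_string(),
            prefix: prefix.to_string(),
        });
    }
    if uri.starts_with("http://") || uri.starts_with("https://") {
        return Ok(Target::HttpServer(uri.to_string()));
    }
    // Plain paths and file:// URIs both resolve to a local path.
    let raw = uri.strip_prefix("file://").unwrap_or(uri);
    let path = Path::new(raw);
    let absolute = if path.is_absolute() {
        path.to_path_buf()
    } else {
        cwd.join(path)
    };
    Ok(Target::Local(absolute))
}
```

For example, `classify("s3://bucket/prefix", cwd)` yields the S3 variant, while a bare `repo.omni` resolves against `cwd`.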
|
||||
|
||||
## Object-store env vars (S3-compatible)
|
||||
|
||||
- `AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`
|
||||
- `AWS_ENDPOINT_URL`, `AWS_ENDPOINT_URL_S3` — for MinIO / RustFS / GCS-via-XML
|
||||
- `AWS_S3_FORCE_PATH_STYLE=true` — path-style URLs
|
||||
- `AWS_ALLOW_HTTP=true` — allow plain HTTP (local dev)
|
||||
168
docs/user/transactions.md
Normal file
|
|
@ -0,0 +1,168 @@
|
|||
# Transactions in OmniGraph
|
||||
|
||||
OmniGraph does not have `BEGIN` / `COMMIT` / `ROLLBACK`. Branches do that job. This page explains the model, when to use which primitive, and shows worked examples for the patterns that come up most.
|
||||
|
||||
The architectural rule lives in [`docs/dev/invariants.md`](../dev/invariants.md):
|
||||
|
||||
> **Mutations publish at one boundary.** A `mutate_as` or `load` operation
|
||||
> accumulates constructive writes, commits each touched table at the end, then
|
||||
> publishes one manifest update.
|
||||
|
||||
If you need to coordinate multiple queries atomically, you fork a branch, run mutations on it, and merge when you're satisfied. If something goes wrong, you delete the branch.
|
||||
|
||||
## The atomicity model
|
||||
|
||||
Two primitives, two scopes:
|
||||
|
||||
| Scope | Primitive | Atomic? | Failure mode |
|
||||
|---|---|---|---|
|
||||
| **One `.gq` query** (any number of statements inside) | The query itself — handled by the publisher's atomic manifest commit | Yes — all statements land together or none of them do | The publisher never publishes; target unchanged |
|
||||
| **Many queries that must succeed together** | Branches: `branch_create` → run N queries on the branch → `branch_merge` | Yes — the merge is a single atomic publish | Drop the branch (`branch_delete`); main is unaffected |
|
||||
|
||||
Snapshot isolation is per-query — every read inside one query sees one consistent manifest version. Two concurrent queries on the same branch see independent snapshots; the publisher's CAS catches racing writes.
|
||||
|
||||
## Comparison with `BEGIN` / `COMMIT`
|
||||
|
||||
| Postgres / MySQL | OmniGraph |
|
||||
|---|---|
|
||||
| `BEGIN; … ; COMMIT` | `branch_create review/X` → mutations on `review/X` → `branch_merge review/X --into main` |
|
||||
| `ROLLBACK` | `branch_delete review/X` |
|
||||
| Connection-bound session state | Branch-scoped lineage on disk |
|
||||
| Locks (or MVCC + abort on conflict) | Snapshot isolation per query + three-way merge at branch-join |
|
||||
| Transaction is invisible to ops | Branch is a durable artifact (visible in `branch_list`, queryable, time-travelable) |
|
||||
|
||||
The trade-off: branches are heavier than a connection-scoped transaction (they exist on disk, have a name, show up in `branch_list`), but they fit the agent-as-user model — agents naturally fork branches to plan, batch, and review work. And they're durable: if your process crashes mid-workflow, the branch survives and you can pick up where you left off.
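The `BEGIN` / `COMMIT` / `ROLLBACK` mapping in the table above can be wrapped in a small helper. This is a sketch under stated assumptions: the `BranchOps` trait is a hypothetical subset of a repo handle whose method names follow the API names used in these docs but whose signatures are invented, and `RecordingRepo` is a test double that only logs calls.

```rust
/// Hypothetical subset of a repo handle; real signatures may differ.
pub trait BranchOps {
    type Error;
    fn branch_create_from(&mut self, source: &str, name: &str) -> Result<(), Self::Error>;
    fn branch_merge(&mut self, source: &str, target: &str) -> Result<(), Self::Error>;
    fn branch_delete(&mut self, name: &str) -> Result<(), Self::Error>;
}

/// Fork `name` from `base`, run `work` on it, then merge on success
/// ("COMMIT") or delete the branch on failure ("ROLLBACK").
/// If the merge itself fails, the branch is kept for inspection.
pub fn with_branch<R, T, F>(repo: &mut R, base: &str, name: &str, work: F) -> Result<T, R::Error>
where
    R: BranchOps,
    F: FnOnce(&mut R, &str) -> Result<T, R::Error>,
{
    repo.branch_create_from(base, name)?;
    match work(&mut *repo, name) {
        Ok(value) => {
            repo.branch_merge(name, base)?;
            Ok(value)
        }
        Err(err) => {
            // Best-effort cleanup; the original error is what matters.
            let _ = repo.branch_delete(name);
            Err(err)
        }
    }
}

/// Test double that records the calls it receives.
#[derive(Default)]
pub struct RecordingRepo {
    pub log: Vec<String>,
}

impl BranchOps for RecordingRepo {
    type Error = String;
    fn branch_create_from(&mut self, source: &str, name: &str) -> Result<(), String> {
        self.log.push(format!("create {name} from {source}"));
        Ok(())
    }
    fn branch_merge(&mut self, source: &str, target: &str) -> Result<(), String> {
        self.log.push(format!("merge {source} into {target}"));
        Ok(())
    }
    fn branch_delete(&mut self, name: &str) -> Result<(), String> {
        self.log.push(format!("delete {name}"));
        Ok(())
    }
}
```

Unlike a connection-scoped transaction, nothing here is lost if the process dies inside `work`: the branch stays on disk and a later run can continue or delete it.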
|
||||
|
||||
## Worked examples
|
||||
|
||||
### 1. Single query, multi-statement (atomic by default)
|
||||
|
||||
A `.gq` query with multiple `insert` / `update` statements is one transaction. Either all statements land together at publish time, or none do.
|
||||
|
||||
```gq
|
||||
query register_employee_with_team($name: String, $age: I32, $team: String) {
|
||||
insert Person { name: $name, age: $age }
|
||||
insert WorksAt { from: $name, to: $team }
|
||||
}
|
||||
```
|
||||
|
||||
```bash
|
||||
omnigraph change --query ./mutations.gq --name register_employee_with_team \
|
||||
--params '{"name":"Alice","age":30,"team":"Acme"}' ./repo.omni
|
||||
```
|
||||
|
||||
If the second statement fails (e.g. `Acme` doesn't exist), the publisher never publishes; `Alice` is not in the database. Atomic.
|
||||
|
||||
### 2. Two separate queries on `main` (NOT atomic)
|
||||
|
||||
```bash
|
||||
# Query 1
|
||||
omnigraph change --query ./mutations.gq --name register_employee --params '{"name":"Alice","age":30}' ./repo.omni
|
||||
|
||||
# Query 2 — runs after Query 1 has already published
|
||||
omnigraph change --query ./mutations.gq --name link_to_team --params '{"name":"Alice","team":"Acme"}' ./repo.omni
|
||||
```
|
||||
|
||||
These are **two publishes** on `main`. If Query 2 fails, Query 1's effects are already visible. There is no `ROLLBACK` for Query 1.
|
||||
|
||||
If you want both-or-neither, you have two options:
|
||||
- Combine them into a single `.gq` query (example 1 above), or
|
||||
- Use a branch (example 3 below).
|
||||
|
||||
### 3. Many queries, atomic via a branch
|
||||
|
||||
The pattern when you need to run multiple queries — possibly across multiple commands, agents, or sessions — and have them succeed or fail as a unit.
|
||||
|
||||
```bash
|
||||
# Fork a working branch from main.
|
||||
omnigraph branch create --from main onboarding/2026-04-25 ./repo.omni
|
||||
|
||||
# Run any number of mutations on the branch — each one is its own publish on the branch.
|
||||
# Concurrent reads of `main` are unaffected.
|
||||
omnigraph change --branch onboarding/2026-04-25 \
|
||||
--query ./mutations.gq --name register_employee \
|
||||
--params '{"name":"Alice","age":30}' ./repo.omni
|
||||
|
||||
omnigraph change --branch onboarding/2026-04-25 \
|
||||
--query ./mutations.gq --name register_employee \
|
||||
--params '{"name":"Bob","age":25}' ./repo.omni
|
||||
|
||||
omnigraph change --branch onboarding/2026-04-25 \
|
||||
--query ./mutations.gq --name link_to_team \
|
||||
--params '{"name":"Alice","team":"Acme"}' ./repo.omni
|
||||
|
||||
# Inspect the branch — read queries work just like on main.
|
||||
omnigraph read --branch onboarding/2026-04-25 \
|
||||
--query ./queries.gq --name list_employees ./repo.omni
|
||||
|
||||
# Happy with what's on the branch? Merge it. This is one atomic publish:
|
||||
# `main` flips to include every commit on the branch.
|
||||
omnigraph branch merge onboarding/2026-04-25 --into main ./repo.omni
|
||||
|
||||
# OR: not happy? Throw it away. `main` is untouched.
|
||||
# omnigraph branch delete onboarding/2026-04-25 ./repo.omni
|
||||
```
|
||||
|
||||
Properties:
|
||||
- Each query on the branch is its own publisher commit — so they're individually atomic. Per-query CAS works on branches just like on main.
|
||||
- The branch lives on disk. Process crash mid-workflow? Re-open and resume.
|
||||
- Multiple agents can work on different branches in parallel without blocking each other.
|
||||
- The merge is a three-way merge at the row level. Conflicts surface as `OmniError::MergeConflicts(Vec<MergeConflict>)`, with structured kinds (`DivergentInsert`, `DivergentUpdate`, `DeleteVsUpdate`, …) so callers can handle them programmatically.
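A row-level three-way merge, as mentioned in the last property, compares each row on both sides against the common ancestor. The sketch below reuses the conflict-kind names from the text, but the classification rules and the `merge_row` function are assumptions for illustration; the real merge may distinguish more cases.

```rust
/// Conflict kinds named in the text; the real enum has more variants
/// and carries details about the conflicting rows.
#[derive(Debug, PartialEq)]
pub enum MergeConflict {
    DivergentInsert,
    DivergentUpdate,
    DeleteVsUpdate,
}

/// Three-way merge of a single row. `None` means the row is absent on
/// that side. Returns the merged row, or the kind of conflict.
pub fn merge_row<T: Clone + PartialEq>(
    base: Option<&T>,
    ours: Option<&T>,
    theirs: Option<&T>,
) -> Result<Option<T>, MergeConflict> {
    if ours == theirs {
        return Ok(ours.cloned()); // both sides agree
    }
    if ours == base {
        return Ok(theirs.cloned()); // only theirs changed
    }
    if theirs == base {
        return Ok(ours.cloned()); // only ours changed
    }
    // Both sides changed the row, and in different ways.
    match (base, ours, theirs) {
        (None, _, _) => Err(MergeConflict::DivergentInsert),
        (Some(_), Some(_), Some(_)) => Err(MergeConflict::DivergentUpdate),
        (Some(_), _, _) => Err(MergeConflict::DeleteVsUpdate),
    }
}
```

Changes made on only one side merge cleanly; a conflict is reported only when both branches touched the same row and ended up with different results.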
|
||||
|
||||
### 4. Coordinating multiple agents
|
||||
|
||||
Two agents writing to the same graph independently:
|
||||
|
||||
```bash
|
||||
# Agent A
|
||||
omnigraph branch create --from main agent-a/work ./repo.omni
|
||||
omnigraph change --branch agent-a/work … ./repo.omni
|
||||
# … many mutations …
|
||||
omnigraph branch merge agent-a/work --into main ./repo.omni
|
||||
|
||||
# Agent B (running concurrently)
|
||||
omnigraph branch create --from main agent-b/work ./repo.omni
|
||||
omnigraph change --branch agent-b/work … ./repo.omni
|
||||
# … many mutations …
|
||||
omnigraph branch merge agent-b/work --into main ./repo.omni
|
||||
```
|
||||
|
||||
Each agent sees a consistent snapshot of `main` at the time it forked. The first merge to `main` lands as a fast-forward (or a no-op if no concurrent change). The second merge runs three-way: rows touched by both branches surface as `MergeConflict`s for the caller to resolve.
|
||||
|
||||
This is the workflow MR-797 / agentic loops are designed around: **branches are the unit of "an agent's working set."**
|
||||
|
||||
## Failure modes
|
||||
|
||||
| Scenario | What happens | Caller action |
|
||||
|---|---|---|
|
||||
| Single query fails mid-flight | Publisher never publishes; target unchanged | Read the error, decide whether to retry |
|
||||
| Concurrent writers race the same `(table, branch)` | Publisher CAS rejects the loser with `ManifestConflictDetails::ExpectedVersionMismatch` | Refresh handle, retry the query |
|
||||
| Branch with N successful mutations, then merge fails (three-way conflict) | Each individual mutation already committed on the branch; merge surfaces `MergeConflicts` | Inspect, decide whether to keep working on the branch, abandon it (`branch_delete`), or resolve and re-merge |
|
||||
| Process crashes mid-branch-workflow | Each completed mutation on the branch is durable | Re-open the repo, continue where you left off |
|
||||
|
||||
## When to use what
|
||||
|
||||
| Intent | Use |
|
||||
|---|---|
|
||||
| One conceptual change, multiple statements | One `.gq` query with multiple statements |
|
||||
| Bulk import of a related set of records | One `omnigraph load` (the loader is one atomic query under the hood) |
|
||||
| Many independent changes, no coordination needed | Many separate queries on `main`. Each is its own atomic unit. |
|
||||
| "Do these N things, all together or not at all" | Branch → run N queries → merge |
|
||||
| "Try things, evaluate, then commit" | Branch → mutate → read/inspect → merge or delete |
|
||||
| "Multiple agents writing concurrently" | One branch per agent, merge to `main` at end of agent task |
|
||||
| "Long-running workflow that may span sessions or process restarts" | Branch (durable on disk) |
|
||||
|
||||
## What this model can't do
|
||||
|
||||
- **Cross-query atomicity on `main` without a branch.** If you don't want to fork a branch, multiple queries on `main` publish independently. There is no implicit transaction.
|
||||
- **Long-running interactive transactions.** No `BEGIN` over a connection. Branches are the durable equivalent.
|
||||
- **Cross-graph (cross-repo) transactions.** Each repo is its own atomicity domain.
|
||||
- **"Pessimistic" locks** that serialize writers before they reach the storage layer. Snapshot MVCC and publisher CAS handle concurrency optimistically; the loser retries.
|
||||
|
||||
## See also
|
||||
|
||||
- [`docs/user/branches-commits.md`](branches-commits.md) — branch and commit-graph mechanics.
|
||||
- [`docs/dev/merge.md`](../dev/merge.md) — three-way merge details and conflict kinds.
|
||||
- [`docs/user/query-language.md`](query-language.md) — `.gq` syntax for the multi-statement queries used above.
|
||||
- [`docs/dev/runs.md`](../dev/runs.md) — the per-query commit pipeline that gives single-query atomicity.
|
||||
- [`docs/dev/invariants.md`](../dev/invariants.md) — the architectural rule.
|
||||