docs: split user and developer docs (#93)

2026-06-09 01:35:18 +02:00 · 2026-05-15 03:45:22 +03:00 · 60eee78465
commit 60eee78465
parent e8d49559c4
39 changed files with 499 additions and 445 deletions
--- a/docs/user/audit.md
+++ b/docs/user/audit.md
@@ -0,0 +1,7 @@
+# Audit / Actor tracking
+
+- `Omnigraph::audit_actor_id: Option<String>` is the actor in effect.
+- `_as` variants of every write API let callers override the actor: `mutate_as`, `ingest_as`, `branch_merge_as`, `apply_schema_as`, etc.
+- Actor IDs are persisted on `GraphCommit.actor_id`. On disk the commit graph is split in two: `_graph_commits.lance` holds the commit linkage and `_graph_commit_actors.lance` holds the actor map.
+- HTTP server uses the bearer-token actor automatically; CLI uses the local user / explicit env (no implicit actor).
+- Pre-v0.4.0 repos also stored actor IDs on `RunRecord.actor_id` in `_graph_runs.lance` / `_graph_run_actors.lance`. The Run state machine was removed in MR-771; those files are inert post-v0.4.0 and reclaimed by MR-770's production sweep.
--- a/docs/user/branches-commits.md
+++ b/docs/user/branches-commits.md
@@ -0,0 +1,63 @@
+# Branches, Commits, Snapshots
+
+## L1 — Lance per-dataset branches
+
+Lance supports branching at the dataset level: a branch is a named lineage of versions, and `fork_branch_from_state(source_branch, target_branch, source_version)` creates a copy-on-write fork.
+
+## L2 — Graph-level branches
+
+OmniGraph builds *graph branches* on top by branching every sub-table coherently:
+
+- `branch_create(name)` / `branch_create_from(target, name)` — disallowed name `main`; fails if branch exists; ensures the schema-apply lock is idle.
+- `branch_list()` — returns public branches, **filters internal** `__run__…` and `__schema_apply_lock__` prefixes.
+- `branch_delete(name)` — refuses if there are descendants or active runs on the branch; cleans up owned per-branch fragments.
+- **Lazy forking**: a branch only forks a sub-table when that sub-table is first mutated on it. Pure-read branches share fragments with their source.
+- `sync_branch(branch)` — re-binds the in-memory handle to the latest head of the branch.
+
+## L2 — Commit graph (`db/commit_graph.rs`)
+
+In-memory shape of a graph commit:
+
+```
+GraphCommit {
+  graph_commit_id: ULID,
+  manifest_branch: Option<String>,
+  manifest_version: u64,
+  parent_commit_id: Option<String>,
+  merged_parent_commit_id: Option<String>,   // populated for merge commits
+  actor_id: Option<String>,                  // joined in-memory from _graph_commit_actors.lance, NOT a column on _graph_commits.lance
+  created_at: i64,                           // microseconds since epoch
+}
+```
+
+Storage is split across two Lance datasets (both with stable row IDs):
+
+- `_graph_commits.lance` — every column above *except* `actor_id`.
+- `_graph_commit_actors.lance` — optional separate `(graph_commit_id, actor_id)` map, created on demand. The `actor_id` field above is populated by joining this dataset in-memory at load time.
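
The load-time join described above can be pictured with a minimal sketch. It assumes the actor map is available as plain `(graph_commit_id, actor_id)` pairs; the `join_actors` helper and the trimmed-down `GraphCommit` class are hypothetical illustrations, not OmniGraph's actual types or API.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class GraphCommit:
    # Trimmed to the fields needed for the illustration.
    graph_commit_id: str
    manifest_version: int
    parent_commit_id: Optional[str] = None
    actor_id: Optional[str] = None  # not stored alongside the commit row


def join_actors(commits, actor_rows):
    """Attach actor_id to each commit from (graph_commit_id, actor_id) rows.

    Hypothetical helper: commits without a row in the actor map keep actor_id=None.
    """
    actor_by_commit = dict(actor_rows)
    return [
        replace(c, actor_id=actor_by_commit.get(c.graph_commit_id))
        for c in commits
    ]


commits = [GraphCommit("01A", 1), GraphCommit("01B", 2, parent_commit_id="01A")]
joined = join_actors(commits, [("01B", "act-alice")])
```

Because the actor dataset is optional and created on demand, a missing entry simply leaves `actor_id` unset in this sketch.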
+
+Notes:
+
+- Every successful publish (load / change / merge / schema_apply) appends one commit.
+- Merge commits have two parents; linear commits have one.
+- API: `list_commits(branch)`, `get_commit(id)`, `head_commit_id_for_branch(branch)`.
+
+## L2 — Snapshots & time travel
+
+- `snapshot()` — current snapshot for the bound branch; cached.
+- `snapshot_of(target)` — snapshot at a `ReadTarget` (branch | snapshot id).
+- `snapshot_at_version(v: u64)` — historical snapshot from any manifest version.
+- `entity_at(table_key, id, version)` — single-entity time travel without building a full snapshot.
+- A `Snapshot` is a `(version, HashMap<table_key, SubTableEntry>)` pair: it is cheap to build and gives snapshot-isolated reads across tables.
+
+## L2 — Internal system branches
+
+Filtered from `branch_list()` but visible to internals:
+
+- `__schema_apply_lock__` — serializes schema migrations.
+- `__run__<run-id>` — legacy from the pre-v0.4.0 Run state machine (removed in MR-771). The branch-name guard predicate `is_internal_run_branch` is kept as defense-in-depth so users cannot create a branch matching the legacy prefix; the filter will be removed once production legacy branches are swept (MR-770).
+
+## L2 — Recovery audit trail
+
+The four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) protect their multi-table commits with a sidecar at `__recovery/{ulid}.json` written before Phase B and deleted after Phase C. The next `Omnigraph::open` (gated on `OpenMode::ReadWrite`) runs the recovery sweep in `crates/omnigraph/src/db/manifest/recovery.rs`: classify per-table state, decide all-or-nothing per sidecar, roll forward / back, record an audit row.
+
+Audit rows live in `_graph_commit_recoveries.lance` (sibling to `_graph_commits.lance`) and reference the commit graph by `graph_commit_id`. The linked recovery commit is identified by that same `graph_commit_id`, and `actor_id="omnigraph:recovery"` is stored in `_graph_commit_actors.lance` (joined by `graph_commit_id`) — `_graph_commits.lance` itself does not carry the `actor_id` column. To find recoveries for a specific original actor: `omnigraph commit list --filter actor=omnigraph:recovery`, then join to `_graph_commit_recoveries.lance` by `graph_commit_id` to read `recovery_for_actor`. Schema: see `crates/omnigraph/src/db/recovery_audit.rs`.
--- a/docs/user/changes.md
+++ b/docs/user/changes.md
@@ -0,0 +1,24 @@
+# Change Detection / Diff
+
+Implemented in `changes/mod.rs`. The algorithm has three levels:
+
+1. **Manifest diff**: skip sub-tables whose `(table_version, table_branch)` is unchanged.
+2. **Lineage check**:
+   - Same branch lineage → fast path: use the per-row `_row_last_updated_at_version` column to classify Insert/Update/Delete.
+   - Different lineages → ID-based streaming comparison.
+3. **Row-level diff**: streaming, no full materialization.
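
As a rough illustration of step 1, the manifest diff can be thought of as comparing two maps from table key to `(table_version, table_branch)`. The map shape, the `changed_tables` name, and the `node:Person`-style keys are assumptions made for this sketch only.

```python
def changed_tables(from_manifest, to_manifest):
    """Return table keys whose (table_version, table_branch) pair differs.

    Tables present on only one side count as changed; identical pairs are
    skipped, so no row-level work is done for them.
    """
    keys = set(from_manifest) | set(to_manifest)
    return sorted(k for k in keys if from_manifest.get(k) != to_manifest.get(k))


before = {"node:Person": (3, "main"), "edge:Knows": (5, "main")}
after = {
    "node:Person": (4, "main"),
    "edge:Knows": (5, "main"),
    "node:City": (1, "main"),
}
changed = changed_tables(before, after)
```

Only the tables returned here would go on to the lineage check and the streaming row-level diff.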
+
+## Public API
+
+- `diff_between(from: ReadTarget, to: ReadTarget, filter: Option<ChangeFilter>) -> ChangeSet`
+- `diff_commits(from_commit_id, to_commit_id, filter)` — cross-branch safe.
+
+## Types
+
+```
+ChangeOp: Insert | Update | Delete
+EntityKind: Node | Edge
+EntityChange { table_key, kind, type_name, id, op, manifest_version, endpoints?: {src, dst} }
+ChangeFilter { kinds?, type_names?, ops? }
+ChangeSet { from_version, to_version, branch?, changes[], stats }
+```
--- a/docs/user/cli-reference.md
+++ b/docs/user/cli-reference.md
@@ -0,0 +1,83 @@
+# CLI Reference (`omnigraph`)
+
+A reference for the `omnigraph` binary's command surface and `omnigraph.yaml` schema. For a quick-start guide, see [cli.md](cli.md).
+
+17 top-level command families, 40+ subcommands. All commands accept a positional `URI`, `--uri`, or `--target <name>` (resolved against `omnigraph.yaml`).
+
+## Top-level commands
+
+| Command | Purpose |
+|---|---|
+| `init` | `--schema <pg>` → initialize a repo (also scaffolds `omnigraph.yaml` if missing) |
+| `load` | bulk load a branch (`--mode overwrite\|append\|merge`) |
+| `ingest` | branch-creating transactional load (`--from <base>`) |
+| `read` | run named query (params via `--params`, `--params-file`, or alias args) |
+| `change` | run mutation query |
+| `snapshot` | print current snapshot (per-table version + row count) |
+| `export` | dump to JSONL on stdout (`--type T`, `--table K` filters) |
+| `branch create \| list \| delete \| merge` | branching ops |
+| `commit list \| show` | inspect commit graph |
+| `run list \| show \| publish \| abort` | transactional run ops |
+| `schema plan \| apply \| show (alias: get)` | migrations |
+| `query lint \| check` | offline / repo-backed validation |
+| `optimize` | non-destructive Lance compaction |
+| `cleanup --keep N --older-than 7d --confirm` | destructive version GC |
+| `embed` | offline JSONL embedding pipeline |
+| `policy validate \| test \| explain` | Cedar tooling |
+| `version` / `-v` | print `omnigraph 0.3.x` |
+
+## `omnigraph.yaml` schema
+
+```yaml
+project: { name }
+graphs:
+  <name>:
+    uri: <local|s3://|http(s)://>
+    bearer_token_env: <ENV_NAME>
+server:
+  graph: <name>
+  bind: <ip:port>
+cli:
+  graph: <name>
+  branch: <name>
+  output_format: json|jsonl|csv|kv|table
+  table_max_column_width: 80
+  table_cell_layout: truncate|wrap
+query:
+  roots: [<dir>, …]   # search path for .gq files
+auth:
+  env_file: ./.env.omni
+aliases:
+  <alias>:
+    command: read|change
+    query: <path-to-.gq>
+    name: <query-name>
+    args: [<positional-name>, …]
+    graph: <name>
+    branch: <name>
+    format: <output-format>
+policy:
+  file: ./policy.yaml
+```
+
+## Output formats (read command)
+
+- `json` — pretty-printed object with metadata + rows
+- `jsonl` — one metadata line then one JSON object per row
+- `csv` — RFC 4180-ish quoting
+- `table` — fitted text table, honors `table_max_column_width` + `table_cell_layout`
+- `kv` — grouped per-row key/value blocks
+
+## Param resolution
+
+Precedence (high to low): explicit `--params` / `--params-file`, alias positional args, `omnigraph.yaml` defaults. JS-safe-integer handling is built in (`is_js_safe_integer_i64`, `JS_MAX_SAFE_INTEGER_U64`) so 64-bit ids round-trip safely through JSON clients.
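
A small sketch of both ideas in this paragraph, under stated assumptions: parameter sources are modeled as plain dictionaries merged so that higher-precedence sources win, and the JS-safe check uses JavaScript's `Number.MAX_SAFE_INTEGER` (2^53 - 1). The function names here are illustrative and are not the CLI's internals.

```python
JS_MAX_SAFE_INTEGER = 2**53 - 1  # 9007199254740991, JavaScript's Number.MAX_SAFE_INTEGER


def resolve_params(yaml_defaults, alias_args, explicit):
    """Merge parameter sources; later dictionaries override earlier ones.

    Order (low to high): omnigraph.yaml defaults, alias positional args,
    explicit --params / --params-file.
    """
    return {**yaml_defaults, **alias_args, **explicit}


def is_js_safe_integer(value: int) -> bool:
    """True if the integer survives a round trip through an IEEE-754 double."""
    return -JS_MAX_SAFE_INTEGER <= value <= JS_MAX_SAFE_INTEGER


params = resolve_params({"name": "A", "limit": 10}, {"name": "B"}, {"name": "C"})
```

The safe-integer bound matters because a double cannot distinguish `2**53` from `2**53 + 1`, so larger 64-bit ids need special handling before they reach a JavaScript client.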
+
+## Bearer token resolution (CLI)
+
+1. `graphs.<name>.bearer_token_env`
+2. `OMNIGRAPH_BEARER_TOKEN` global env
+3. The `.env` file referenced by `auth.env_file`
+
+## Duration parsing (cleanup)
+
+`s | m | h | d | w` units, e.g. `--older-than 7d`.
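
A minimal sketch of how such a duration string could be parsed, assuming a single non-negative integer followed by one unit letter. Whether the real CLI also accepts compound or fractional forms is not stated here, so this sketch rejects them; `parse_duration` is a hypothetical name.

```python
import re
from datetime import timedelta

# Seconds per unit for the five documented suffixes.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(text: str) -> timedelta:
    """Parse strings such as '7d' or '12h' into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhdw])", text.strip())
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(seconds=int(amount) * _UNIT_SECONDS[unit])
```

For example, `parse_duration("7d")` yields the same value as `timedelta(days=7)`.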
--- a/docs/user/cli.md
+++ b/docs/user/cli.md
@@ -0,0 +1,100 @@
+# CLI Guide
+
+## Core Repo Flow
+
+```bash
+omnigraph init --schema ./schema.pg ./repo.omni
+omnigraph load --data ./data.jsonl --mode overwrite ./repo.omni
+omnigraph snapshot ./repo.omni --branch main --json
+omnigraph read --uri ./repo.omni --query ./queries.gq --name get_person --params '{"name":"Alice"}'
+omnigraph change --uri ./repo.omni --query ./queries.gq --name insert_person --params '{"name":"Mina","age":28}'
+```
+
+## Branching And Reviewable Data Flows
+
+```bash
+omnigraph branch create --uri ./repo.omni --from main feature-x
+omnigraph branch list --uri ./repo.omni
+omnigraph branch merge --uri ./repo.omni feature-x --into main
+
+omnigraph ingest --data ./batch.jsonl --branch review/import-2026-04-09 ./repo.omni
+omnigraph export ./repo.omni --branch main --type Person > people.jsonl
+omnigraph commit list ./repo.omni --branch main --json
+omnigraph commit show --uri ./repo.omni <commit-id> --json
+```
+
+## Remote Server Mode
+
+Serve a repo:
+
+```bash
+omnigraph-server ./repo.omni --bind 127.0.0.1:8080
+```
+
+Read through the HTTP API:
+
+```bash
+omnigraph read \
+  --target http://127.0.0.1:8080 \
+  --query ./queries.gq \
+  --name get_person \
+  --params '{"name":"Alice"}'
+```
+
+If the server requires auth, set `OMNIGRAPH_SERVER_BEARER_TOKEN` on the server
+and configure the matching `bearer_token_env` in `omnigraph.yaml`.
+
+## Runs, Policy, And Diagnostics
+
+```bash
+omnigraph query lint --query ./queries.gq --schema ./schema.pg --json
+omnigraph query check --query ./queries.gq ./repo.omni --json
+
+omnigraph schema plan --schema ./next.pg ./repo.omni --json
+omnigraph schema apply --schema ./next.pg ./repo.omni --json
+omnigraph policy validate --config ./omnigraph.yaml
+omnigraph policy test --config ./omnigraph.yaml
+omnigraph policy explain --config ./omnigraph.yaml --actor act-alice --action read --branch main
+
+omnigraph commit list ./repo.omni --json
+omnigraph commit show --uri ./repo.omni <commit-id> --json
+```
+
+(The legacy `omnigraph run list/show/publish/abort` subcommands were removed in MR-771; mutations and loads publish atomically and the commit graph (`omnigraph commit list`) is the audit surface.)
+
+`query lint` and `query check` are the same command surface. In v1, repo-backed
+lint uses local or `s3://` repo URIs; HTTP targets are only supported when you
+also pass `--schema`.
+
+## Config
+
+`omnigraph.yaml` lets the CLI and server share named graphs, defaults, and
+query roots:
+
+```yaml
+graphs:
+  local:
+    uri: ./demo.omni
+  dev:
+    uri: http://127.0.0.1:8080
+    bearer_token_env: OMNIGRAPH_BEARER_TOKEN
+
+cli:
+  graph: local
+  branch: main
+
+query:
+  roots:
+    - queries
+    - .
+```
+
+The config file can also define:
+
+- server bind defaults
+- auth env files
+- query aliases for common read and change commands
+- `policy.file` for Cedar authorization rules
+
+When policy is enabled, `schema apply` is authorized through the
+`schema_apply` action and is typically limited to admins on protected `main`.
--- a/docs/user/constants.md
+++ b/docs/user/constants.md
@@ -0,0 +1,22 @@
+# Constants & Tunables (cheat sheet)
+
+| Name | Value | Where |
+|---|---|---|
+| `MANIFEST_DIR` | `__manifest` | `db/manifest/layout.rs` |
+| Commit graph dir | `_graph_commits.lance` | `db/commit_graph.rs` |
+| Run registry dir (legacy, removed MR-771) | `_graph_runs.lance` | inert post-v0.4.0; reclaimed by MR-770 |
+| Run branch prefix (legacy, removed MR-771) | `__run__` | filtered by `is_internal_run_branch` defense-in-depth |
+| Schema apply lock | `__schema_apply_lock__` | `db/mod.rs` |
+| Manifest publisher retry budget | `PUBLISHER_RETRY_BUDGET = 5` | `db/manifest/publisher.rs` |
+| Internal manifest schema version | `INTERNAL_MANIFEST_SCHEMA_VERSION = 2` | `db/manifest/migrations.rs` |
+| Merge stage batch | `MERGE_STAGE_BATCH_ROWS = 8192` | `exec/merge.rs` |
+| Maintenance concurrency | `OMNIGRAPH_MAINTENANCE_CONCURRENCY=8` | `db/omnigraph/optimize.rs` |
+| Graph index cache size | `8` (LRU) | `runtime_cache.rs` |
+| Default body limit | `1 MB` | `omnigraph-server/lib.rs` |
+| Ingest body limit | `32 MB` | `omnigraph-server/lib.rs` |
+| Engine embed model | `gemini-embedding-2-preview` | `omnigraph/embedding.rs` |
+| Compiler embed model | `text-embedding-3-small` | `omnigraph-compiler/embedding.rs` |
+| Embed timeout | `30 000 ms` | both clients |
+| Embed retries | `4` | both clients |
+| Embed retry backoff | `200 ms` | both clients |
+| LANCE memory pool default | `1 GB` (raised in v0.3.0) | runtime |
--- a/docs/user/deployment.md
+++ b/docs/user/deployment.md
@@ -0,0 +1,184 @@
+# Deployment
+
+This doc describes the public runtime contract for self-hosting Omnigraph. It
+does not include environment-specific secrets, private infrastructure, or
+internal deploy automation.
+
+## Runtime Modes
+
+Omnigraph supports two broad deployment shapes:
+
+- local directory repos
+- `s3://` repos on AWS S3 or S3-compatible object stores
+
+The server binary and container image expose the same HTTP surface.
+
+## Binary Deployment
+
+Build or install:
+
+- `omnigraph`
+- `omnigraph-server`
+
+Run against a local repo:
+
+```bash
+omnigraph-server ./repo.omni --bind 0.0.0.0:8080
+```
+
+Run against an object-store-backed repo:
+
+```bash
+OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
+AWS_REGION="us-east-1" \
+omnigraph-server s3://my-bucket/repos/example/releases/2026-04-10-v0.1.0 \
+  --bind 0.0.0.0:8080
+```
+
+## One-Command Local RustFS Bootstrap
+
+The easiest local S3-backed deployment path is:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/local-rustfs-bootstrap.sh | bash
+```
+
+The bootstrap:
+
+- starts a local RustFS-backed object store
+- creates a bucket and S3-backed Omnigraph repo
+- loads the checked-in context fixture
+- starts `omnigraph-server` on `127.0.0.1:8080`
+
+Supported behavior:
+
+- downloads the rolling `edge` binary when one exists for the current platform
+- otherwise clones `ModernRelay/omnigraph` and builds from source
+- reuses an existing RustFS container if it is already running
+
+Useful overrides:
+
+- `WORKDIR=/path/to/state`
+- `BUCKET=omnigraph-local`
+- `PREFIX=repos/context`
+- `RESET_REPO=1` to delete an existing partially initialized repo prefix before recreating it
+- `BIND=127.0.0.1:8080`
+- `RUSTFS_CONTAINER_NAME=omnigraph-rustfs-demo`
+
+The bootstrap expects:
+
+- Docker
+- `curl`
+- either a matching release asset or a local Rust toolchain plus `git`
+
+If `aws` is not installed, the script attempts a user-local AWS CLI install via
+`python3 -m pip`. Docker Desktop or another Docker daemon must already be
+running.
+
+If a previous bootstrap left objects behind under the selected `PREFIX` but did
+not finish initializing the repo, rerun with `RESET_REPO=1` or choose a new
+`PREFIX`.
+
+## Container Deployment
+
+Build the image:
+
+```bash
+docker build -t omnigraph-server:local .
+```
+
+Run against a local repo:
+
+```bash
+docker run --rm -p 8080:8080 \
+  -v "$PWD/repo.omni:/data/repo.omni" \
+  omnigraph-server:local \
+  /data/repo.omni --bind 0.0.0.0:8080
+```
+
+Run against an S3-backed repo:
+
+```bash
+docker run --rm -p 8080:8080 \
+  -e OMNIGRAPH_SERVER_BEARER_TOKEN="change-me" \
+  -e AWS_REGION="us-east-1" \
+  omnigraph-server:local \
+  s3://my-bucket/repos/example/releases/2026-04-10-v0.1.0 \
+  --bind 0.0.0.0:8080
+```
+
+## Auth
+
+The server can run unauthenticated for local development, but any shared or
+internet-facing deployment should set a bearer token source.
+
+### Token sources
+
+The server reads bearer tokens from one of three places, in precedence order:
+
+1. **AWS Secrets Manager** (build with `--features aws`, see below) — set
+   `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` to the secret ID or ARN.
+2. **JSON file or env** — set one of:
+   - `OMNIGRAPH_SERVER_BEARER_TOKENS_FILE` — path to a JSON `{"actor": "token", ...}` file.
+   - `OMNIGRAPH_SERVER_BEARER_TOKENS_JSON` — the JSON literal inline.
+3. **Single-token env** — `OMNIGRAPH_SERVER_BEARER_TOKEN` (assigns the
+   implicit actor `default`).
+
+Tokens are hashed with SHA-256 immediately on ingest; plaintext does not
+persist in process memory after startup.
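
The hash-on-ingest idea can be sketched as follows, assuming the `{"actor": "token"}` JSON shape shown above. The function names and the lookup strategy are illustrative assumptions, not the server's actual code.

```python
import hashlib
import hmac
import json


def load_token_table(tokens_json: str) -> dict:
    """Map sha256(token) hex digest -> actor id; plaintext tokens are not kept."""
    actors = json.loads(tokens_json)
    return {
        hashlib.sha256(token.encode("utf-8")).hexdigest(): actor
        for actor, token in actors.items()
    }


def authenticate(table: dict, presented: str):
    """Return the actor for a presented bearer token, or None if unknown."""
    digest = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    for stored, actor in table.items():
        # Constant-time comparison of the two hex digests.
        if hmac.compare_digest(stored, digest):
            return actor
    return None


table = load_token_table('{"act-alice": "s3cret", "act-bob": "hunter2"}')
```

Only digests live in the table, so a dump of it does not reveal the original tokens.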
+
+The health endpoint `/healthz` remains suitable for load balancer health checks
+and is never gated.
+
+## Build Variants
+
+The server binary ships in two flavors:
+
+| Variant | Command | Contents |
+|---------|---------|----------|
+| **Default** (on-prem / local dev) | `cargo build --release` | Core server, no AWS SDK |
+| **AWS** | `cargo build --release --features aws` | Adds AWS Secrets Manager backend for bearer tokens |
+
+Release artifacts are published with matching suffixes —
+`omnigraph-server-<version>-<platform>.tar.gz` for the default build and
+`omnigraph-server-<version>-<platform>-aws.tar.gz` for the AWS-enabled build.
+
+The AWS build adds ~150 transitive deps and ~30-60s of first-build compile
+time. Default builds don't pay that cost.
+
+## AWS Secrets Manager
+
+When the binary is built with `--features aws`, set
+`OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` to the ARN or name of a Secrets
+Manager secret whose `SecretString` is a JSON object of
+`{"actor_id": "token", ...}`:
+
+```bash
+# The secret ARN is passed through the environment:
+OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET="arn:aws:secretsmanager:us-east-1:123456789012:secret:omnigraph-tokens-AbCdEf" \
+omnigraph-server-aws s3://my-bucket/repos/example ...
+```
+
+Credentials are resolved via the AWS default chain (env vars, shared config,
+IMDSv2 instance role, ECS task role) — no explicit credential plumbing is
+needed when running under an IAM instance role on EC2/ECS/EKS.
+
+The IAM role must permit `secretsmanager:GetSecretValue` on the referenced
+secret.
+
+Setting the env var without building with `--features aws` is a hard error
+with a rebuild instruction — it does not silently fall back to the env/file
+source.
+
+## S3-Compatible Storage
+
+For S3-compatible backends such as RustFS or MinIO, set the usual AWS SDK
+environment variables:
+
+- `AWS_ACCESS_KEY_ID`
+- `AWS_SECRET_ACCESS_KEY`
+- `AWS_REGION`
+- optional `AWS_ENDPOINT_URL`
+- optional `AWS_ENDPOINT_URL_S3`
+- optional `AWS_ALLOW_HTTP=true`
+- optional `AWS_S3_FORCE_PATH_STYLE=true`
--- a/docs/user/embeddings.md
+++ b/docs/user/embeddings.md
@@ -0,0 +1,31 @@
+# Embeddings
+
+OmniGraph has **two** embedding clients with different defaults and purposes.
+
+## Compiler-side client (`omnigraph-compiler/src/embedding.rs`) — query-time normalization
+
+- Default model: `text-embedding-3-small` (OpenAI-style schema)
+- Env: `NANOGRAPH_EMBED_MODEL`, `OPENAI_API_KEY`, `OPENAI_BASE_URL` (default `https://api.openai.com/v1`), `NANOGRAPH_EMBEDDINGS_MOCK`, `NANOGRAPH_EMBED_TIMEOUT_MS=30000`, `NANOGRAPH_EMBED_RETRY_ATTEMPTS=4`, `NANOGRAPH_EMBED_RETRY_BACKOFF_MS=200`
+- Methods: `embed_text(input, expected_dim)`, `embed_texts(inputs, expected_dim)`
+- Mock mode: deterministic FNV-1a + xorshift64 → L2-normalized vectors
+
+## Engine-side client (`omnigraph/src/embedding.rs`) — runtime ingest
+
+- Model: `gemini-embedding-2-preview`
+- Env: `GEMINI_API_KEY`, `OMNIGRAPH_GEMINI_BASE_URL` (default Google generativelanguage v1beta), `OMNIGRAPH_EMBED_TIMEOUT_MS=30000`, `OMNIGRAPH_EMBED_RETRY_ATTEMPTS=4`, `OMNIGRAPH_EMBED_RETRY_BACKOFF_MS=200`, `OMNIGRAPH_EMBEDDINGS_MOCK`
+- Two task types: `embed_query_text` (RETRIEVAL_QUERY) and `embed_document_text` (RETRIEVAL_DOCUMENT)
+- Exponential backoff with retryable detection (timeouts, 429, 5xx)
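
The retry behavior in the last bullet might look roughly like the sketch below. It assumes that "4 attempts" means four total tries and that the delay doubles from the 200 ms base; neither detail is confirmed here, and `HttpError` and `call_with_retry` are hypothetical names.

```python
import time

# Status codes treated as transient: 429 plus the 5xx range.
RETRYABLE_STATUS = {429} | set(range(500, 600))


class HttpError(Exception):
    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(exc):
    """Timeouts, 429, and 5xx responses are worth retrying."""
    if isinstance(exc, TimeoutError):
        return True
    return isinstance(exc, HttpError) and exc.status in RETRYABLE_STATUS


def call_with_retry(request, attempts=4, backoff_ms=200, sleep=time.sleep):
    """Call request(); on a retryable error wait backoff_ms * 2**attempt and retry."""
    for attempt in range(attempts):
        try:
            return request()
        except Exception as exc:
            is_last = attempt == attempts - 1
            if is_last or not is_retryable(exc):
                raise
            sleep(backoff_ms * (2 ** attempt) / 1000.0)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.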
+
+## Schema integration
+
+Mark a Vector property with `@embed("source_text_property")`. At ingest, the engine reads the source text and writes its embedding into the vector column, stored as an L2-normalized `FixedSizeList(Float32, dim)`.
+
+## CLI `omnigraph embed` (offline file pipeline)
+
+Operates on **JSONL files** (not on a repo). Three modes (mutually exclusive):
+
+- (default) `fill_missing` — only embed rows whose target field is empty
+- `--reembed-all` — overwrite all
+- `--clean` — strip embeddings
+
+Inputs are either a single seed manifest YAML or `--input/--output/--spec`. Selectors `--type T`, `--select T:field=value` filter rows. Streams JSONL → JSONL.
--- a/docs/user/errors.md
+++ b/docs/user/errors.md
@@ -0,0 +1,24 @@
+# Errors and Result Serialization
+
+## Error taxonomy (`omnigraph::error::OmniError`)
+
+- `Compiler(...)` — schema/query parse/typecheck errors
+- `Lance(String)` — storage layer
+- `DataFusion(String)` — execution layer
+- `Io(io::Error)`
+- `Manifest(ManifestError { kind: BadRequest|NotFound|Conflict|Internal, details: Option<ManifestConflictDetails>, … })`
+  - `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }` — caller's `expected_table_versions` did not match the manifest's current latest non-tombstoned version (set by `OmniError::manifest_expected_version_mismatch`).
+  - `ManifestConflictDetails::RowLevelCasContention` — Lance row-level CAS rejected the publish because a concurrent writer landed the same `object_id`. Retried internally by the publisher; only surfaces if the retry budget exhausts.
+  - **D₂ parse-time rejection** (MR-794): a single mutation query that mixes inserts/updates with deletes errors out *before any I/O* with kind `BadRequest`. Message: `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes`. See [docs/user/query-language.md](query-language.md) for the rule and [docs/dev/runs.md](../dev/runs.md) for the underlying staged-write rationale.
+- `MergeConflicts(Vec<MergeConflict>)`
+
+Compiler-side `NanoError` covers parse / catalog / type / storage / plan / execution / arrow / lance / IO / manifest / unique-constraint, each with structured spans (`SourceSpan { start, end }`) for ariadne-style diagnostics.
+
+## Result serialization (`omnigraph_compiler::result::QueryResult`)
+
+- `to_arrow_ipc()` — efficient binary
+- `to_sdk_json()` — JS-safe JSON (large i64 wrapped in metadata)
+- `to_rust_json()` — Rust-friendly JSON
+- `batches()` — direct Arrow `RecordBatch` access
+
+Mutation results: `{ affectedNodes: usize, affectedEdges: usize }` (also exposed as a tiny Arrow batch).
--- a/docs/user/index.md
+++ b/docs/user/index.md
@@ -0,0 +1,52 @@
+# User Docs
+
+**Audience:** users, CLI users, HTTP clients, and self-hosting operators
+
+This is the public-facing entry point. These docs should describe behavior,
+commands, configuration, and operational contracts without requiring knowledge
+of MRs, internal recovery mechanics, or contributor-only invariants.
+
+## Start Here
+
+| Goal | Read |
+|---|---|
+| Install OmniGraph | [install.md](install.md) |
+| Run the CLI locally | [cli.md](cli.md) |
+| Look up every CLI flag and config field | [cli-reference.md](cli-reference.md) |
+| Write schemas | [schema-language.md](schema-language.md) |
+| Read schema-lint diagnostic codes | [schema-lint.md](schema-lint.md) |
+| Write queries and mutations | [query-language.md](query-language.md) |
+| Use embeddings | [embeddings.md](embeddings.md) |
+
+## Operate A Repo
+
+| Goal | Read |
+|---|---|
+| Understand repo layout and URI support | [storage.md](storage.md) |
+| Work with branches, commits, and snapshots | [branches-commits.md](branches-commits.md) |
+| Coordinate multi-query workflows | [transactions.md](transactions.md) |
+| Read diffs and change feeds | [changes.md](changes.md) |
+| Build and use indexes | [indexes.md](indexes.md) |
+| Compact and clean old versions | [maintenance.md](maintenance.md) |
+| Interpret errors and output formats | [errors.md](errors.md) |
+
+## Run The Server
+
+| Goal | Read |
+|---|---|
+| Deploy the binary or container | [deployment.md](deployment.md) |
+| Use HTTP endpoints | [server.md](server.md) |
+| Configure Cedar authorization | [policy.md](policy.md) |
+| Track actors and audit behavior | [audit.md](audit.md) |
+
+## Releases
+
+Release notes live in [releases/](../releases/). Use them for user-visible
+changes between versions, not for contributor design history.
+
+## Boundary
+
+User docs should focus on stable behavior. If a paragraph needs to explain
+internal sidecars, Lance API blockers, MR numbers, test strategy, or review
+rules, it probably belongs in [docs/dev/index.md](../dev/index.md) or a developer-area document
+instead.
--- a/docs/user/indexes.md
+++ b/docs/user/indexes.md
@@ -0,0 +1,26 @@
+# Indexes
+
+## L1 — Lance index types OmniGraph exposes
+
+| Index | Use | Notes |
+|---|---|---|
+| **BTREE scalar** | range / equality on any scalar | created on `@key`, `@index(...)`, and on key columns by `ensure_indices()` |
+| **Inverted (FTS)** | `search`, `fuzzy`, `match_text`, `bm25` | created on text columns referenced by FTS queries |
+| **Vector** | `nearest()` k-NN | Lance picks IVF_PQ vs HNSW family by configuration; OmniGraph stores as FixedSizeList(Float32, dim) |
+
+## L2 — OmniGraph orchestration
+
+- `ensure_indices()` / `ensure_indices_on(branch)` — idempotent build of BTREE + inverted indexes for the current head; safe to re-run.
+- Indexes are built on the *branch head* (not on a snapshot), so reads always see the current index state.
+- **Lazy branch forking for indexes**: a branch that hasn't mutated a sub-table doesn't need its own index — the main lineage's index is reused until the first write triggers a copy-on-write fork.
+- Vector index parameters (metric, nlist, nprobe, etc.) are not exposed in the schema; they default at the Lance layer and are picked up automatically when an index is asked for on a Vector column.
+
+## L2 — Graph topology index (`graph_index/mod.rs`)
+
+This is OmniGraph-specific (not Lance):
+
+- `TypeIndex`: dense `u32 ↔ String id` mapping per node type.
+- `CsrIndex`: Compressed Sparse Row representation of edges per edge type — `offsets[i]..offsets[i+1]` slices into `targets`.
+- `GraphIndex { type_indices, csr (out), csc (in) }` — built on demand from a snapshot's edge tables.
+- Cached in `RuntimeCache::graph_indices` (LRU, max 8 entries, keyed by snapshot id + edge table versions).
+- Built only when an `Expand` or `AntiJoin` IR op is present in the lowered query, so pure scans skip it.
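
The `offsets[i]..offsets[i+1]` slicing used by a CSR index can be shown with a generic sketch. It assumes node ids are already dense integers (the role `TypeIndex` plays above); `build_csr` and `neighbors` are illustrative names, not functions from `graph_index/mod.rs`.

```python
def build_csr(num_nodes, edges):
    """Build (offsets, targets) from (src, dst) pairs over ids 0..num_nodes-1."""
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1          # count out-degree per source
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]   # prefix sum turns counts into slice bounds
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()       # next free slot for each source
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets, targets, node):
    """Adjacent nodes are the slice targets[offsets[node]:offsets[node + 1]]."""
    return targets[offsets[node]:offsets[node + 1]]


edges = [(0, 1), (0, 2), (2, 0)]
offsets, targets = build_csr(3, edges)
# Reversing each edge gives the incoming-edge (CSC-style) view.
in_offsets, in_targets = build_csr(3, [(dst, src) for src, dst in edges])
```

Building the structure twice, once per direction, mirrors the `csr (out)` and `csc (in)` pair listed above.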
--- a/docs/user/install.md
+++ b/docs/user/install.md
@@ -0,0 +1,94 @@
+# Install
+
+## Quick Install
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash
+```
+
+By default the installer places:
+
+- `omnigraph`
+- `omnigraph-server`
+
+in `~/.local/bin`.
+
+The default installer is binary-only. It downloads a published release asset,
+verifies the SHA256 checksum, and unpacks it. It does not build from source.
+If no stable tag is published yet, the installer automatically falls back to
+the rolling `edge` release.
+
+## Homebrew
+
+```bash
+brew tap ModernRelay/tap
+brew install ModernRelay/tap/omnigraph
+```
+
+## Channels
+
+Stable binaries:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash
+```
+
+Rolling edge binaries from `main`:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | RELEASE_CHANNEL=edge bash
+```
+
+Install from source:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install-source.sh | bash
+```
+
+## Useful Overrides
+
+Install to a different directory:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | INSTALL_DIR="$HOME/bin" bash
+```
+
+Install a specific tag:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | VERSION=v0.1.0 bash
+```
+
+Build from a specific git ref:
+
+```bash
+curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install-source.sh | SOURCE_REF=main bash
+```
+
+## Manual Source Build
+
+```bash
+cargo build --release --locked -p omnigraph-cli -p omnigraph-server
+install -m 0755 target/release/omnigraph ~/.local/bin/omnigraph
+install -m 0755 target/release/omnigraph-server ~/.local/bin/omnigraph-server
+```
+
+## Release Assets
+
+Tagged releases are expected to publish:
+
+- `omnigraph-linux-x86_64.tar.gz`
+- `omnigraph-macos-x86_64.tar.gz`
+- `omnigraph-macos-arm64.tar.gz`
+
+Each archive contains both binaries:
+
+- `omnigraph`
+- `omnigraph-server`
+
+## Verify The Install
+
+```bash
+omnigraph version
+omnigraph-server --help
+```
--- a/docs/user/maintenance.md
+++ b/docs/user/maintenance.md
@@ -0,0 +1,29 @@
+# Maintenance: Optimize & Cleanup
+
+Implemented in `db/omnigraph/optimize.rs`.
+
+## `optimize_all_tables(db)` — non-destructive
+
+- Lance `compact_files()` on every node + edge table on `main`.
+- Rewrites small fragments into fewer large ones; old fragments remain reachable via older manifests.
+- Bounded by `OMNIGRAPH_MAINTENANCE_CONCURRENCY` (default 8).
+- Returns `[TableOptimizeStats { table_key, fragments_removed, fragments_added, committed }]`.
+
+## `cleanup_all_tables(db, options)` — destructive
+
+- Lance `cleanup_old_versions()` per table.
+- Removes manifests (and their unique fragments) older than the retention policy.
+- `CleanupPolicyOptions { keep_versions: Option<u32>, older_than: Option<Duration> }` — at least one is required.
+- Returns `[TableCleanupStats { table_key, bytes_removed, old_versions_removed }]`.
+- CLI guards with `--confirm`; without it, prints a preview line.
+- **Recovery floor:** `--keep < 3` may garbage-collect Lance versions that the open-time recovery sweep needs as a rollback target (the sweep restores to the branch's manifest-pinned table version, which is HEAD-1 in the typical Phase B → Phase C drift case). Default `--keep 10` is safe.
+
+## Tombstones
+
+Tombstones are logical sub-table delete markers in `__manifest`; `tombstone_object_id(table_key, version)` excludes a sub-table version from snapshot reconstruction.
+
+## Internal schema migrations (`db/manifest/migrations.rs`)
+
+Version evolutions of the on-disk `__manifest` shape are reconciled automatically on the first write under a new binary. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape the binary expects; the on-disk stamp `omnigraph:internal_schema_version` (Lance schema-level metadata) records the on-disk shape. The publisher's open-for-write path calls `migrate_internal_schema` before reading state; reads are side-effect-free. No operator action is required for in-place upgrades. See [storage.md → Internal schema versioning](storage.md) for the full mechanism.
+
+A binary opening a manifest stamped at a version *higher* than it knows about refuses to publish with a clear "upgrade omnigraph first" error — old binaries cannot clobber a newer schema.
--- a/docs/user/policy.md
+++ b/docs/user/policy.md
@ -0,0 +1,44 @@
+# Authorization (Cedar policy)
+
+OmniGraph integrates AWS Cedar (`cedar-policy = 4.9`) for ABAC.
+
+## Policy actions
+
+1. `read` — query / snapshot / list branches & commits
+2. `export` — NDJSON export
+3. `change` — mutations
+4. `schema_apply` — apply schema migrations
+5. `branch_create`
+6. `branch_delete`
+7. `branch_merge`
+8. `run_publish`
+9. `run_abort`
+10. `admin` — reserved
+
+## Scope kinds
+
+- `branch_scope` — applied to source branch (`read`, `export`, `change`)
+- `target_branch_scope` — applied to destination (`schema_apply`, branch ops, run ops)
+- `protected_branches` — named list with special rules; rule scopes are `any | protected | unprotected`
+
+## Configuration
+
+`omnigraph.yaml`:
+
+```yaml
+policy:
+  file: ./policy.yaml          # Cedar rules + groups
+  tests: ./policy.tests.yaml   # declarative test cases
+```
+
+Each rule must use exactly one of `branch_scope` or `target_branch_scope`.
+
+## CLI
+
+- `omnigraph policy validate` — parse + count actors, exit 1 on parse error.
+- `omnigraph policy test` — run cases in `policy.tests.yaml`, exit 1 on any expectation mismatch.
+- `omnigraph policy explain --actor … --action … [--branch …] [--target-branch …]` — show decision and matched rule.
+
+## Server enforcement
+
+Every mutating endpoint calls `authorize_request()` *before* the handler runs; decisions are logged with actor / action / branch / outcome / matched rule.
--- a/docs/user/query-language.md
+++ b/docs/user/query-language.md
@ -0,0 +1,111 @@
+# Query Language (`.gq`)
+
+Pest grammar at `crates/omnigraph-compiler/src/query/query.pest`. AST in `query/ast.rs`. Type checker in `query/typecheck.rs`. Lowering in `ir/lower.rs`.
+
+## Query declarations
+
+```
+query <name>($p1: T1, $p2: T2?, …)
+  @description("…") @instruction("…") {
+  …
+}
+```
+
+Two body shapes:
+
+- **Read**: `match { … } return { … } [order { … }] [limit N]`
+- **Mutation**: one or more of `insert | update | delete` statements
+
+Param types reuse all schema scalars; trailing `?` makes a param optional. The compiler reserves `$__nanograph_now` for `now()`.
+
+## MATCH clauses
+
+- **Binding**: `$x: NodeType { prop: <literal | $param | now()>, … }`
+- **Traversal**: `$src EDGE_NAME { min, max? } $dst` — variable-length paths via hop bounds; default 1..1 if bounds omitted.
+- **Filter**: `<expr> <op> <expr>` with operators `>=`, `<=`, `!=`, `>`, `<`, `=`, and string `contains`.
+- **Negation**: `not { clause+ }` — desugars to anti-join over the inner pipeline.
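+
+A minimal sketch of how these clauses combine in one read query, assuming a hypothetical schema with a `Person` node (`name`, `age`) and a `Knows` edge between people. The names are illustrative only, and the `//` comment assumes `.gq` accepts the same line-comment syntax as the schema language; check `query.pest` for the exact clause separators.
+
+```gq
+// Hypothetical schema: node Person { name, age }, edge Knows: Person -> Person
+query friends_of_friends($name: String, $min_age: I32) {
+    match {
+        $p: Person { name: $name }
+        $p Knows { 1, 2 } $f
+        $f.age >= $min_age
+        not { $f Knows $p }
+    }
+    return { $f.name as friend, $f.age }
+    limit 20
+}
+```
+
+Here the binding pins `$p`, the traversal follows one or two `Knows` hops, the filter prunes by age, and the `not { … }` block drops anyone who links back to `$p`.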
+
+## Search clauses (multi-modal)
+
+Used inside MATCH or as expressions inside RETURN/ORDER:
+
+| Function | Purpose | Underlying Lance facility |
+|---|---|---|
+| `nearest($x.vec, $q)` | k-NN vector search (cosine) | Lance vector index (IVF / HNSW) |
+| `search(field, q)` | Generic FTS | Inverted index |
+| `fuzzy(field, q [, max_edits])` | Levenshtein-tolerant text search | Inverted index |
+| `match_text(field, q)` | Pattern match | Inverted index |
+| `bm25(field, q)` | BM25 scoring | Inverted index |
+| `rrf(rank_a, rank_b [, k])` | Reciprocal Rank Fusion of two rankings (default k=60) | OmniGraph fuses scored rankings |
+
+`nearest()` requires a `LIMIT`; the compiler resolves the query vector via the param map (or via the runtime embedding client when bound to a text input).
+
+## RETURN clause
+
+`return { <expr> [as <alias>], … }` with expressions:
+
+- Variable / property access: `$x`, `$x.prop`
+- Literals: string, int, float, bool, list
+- `now()`
+- Aggregates: `count`, `sum`, `avg`, `min`, `max`
+- All search functions above (so you can return a score column)
+- `AliasRef` — re-use a previous projection alias
+
+## ORDER & LIMIT
+
+- `order { <expr> [asc|desc], … }` — supports plain expressions and `nearest(...)`.
+- `limit <integer>` — required when there is a `nearest(...)` ordering.
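+
+As an illustration of the `nearest(...)` and `limit` pairing, here is a sketch assuming a hypothetical `Doc` node with a `title`, a `lang` string, and a 384-dimensional `embedding` vector property (all names and the dimension are assumptions, as is the `//` comment syntax):
+
+```gq
+// Hypothetical schema: node Doc { title: String, lang: String, embedding: Vector(384) }
+query similar_docs($q: Vector(384), $lang: String) {
+    match {
+        $d: Doc { lang: $lang }
+    }
+    return { $d.title as title, nearest($d.embedding, $q) as score }
+    order { nearest($d.embedding, $q) }
+    limit 10
+}
+```
+
+Dropping the `limit` line should cause the query to be rejected, since a `nearest(...)` ordering requires one.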
+
+## Mutation statements
+
+- `insert <Type> { prop: <value>, … }`
+- `update <Type> set { prop: <value>, … } where <prop> <op> <value>`
+- `delete <Type> where <prop> <op> <value>`
+
+`<value>` is a literal, `$param`, or `now()`. Multi-statement mutations execute atomically (added in v0.2.0).
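+
+A sketch of the three statement forms, assuming a hypothetical `Person` node with `name`, `age`, and `updated_at` properties (the schema and the `//` comment syntax are assumptions):
+
+```gq
+// Hypothetical schema: node Person { name: String, age: I32, updated_at: DateTime }
+query add_person($name: String, $age: I32) {
+    insert Person { name: $name, age: $age, updated_at: now() }
+}
+
+query set_age($name: String, $age: I32) {
+    update Person set { age: $age, updated_at: now() } where name = $name
+}
+
+query remove_person($name: String) {
+    delete Person where name = $name
+}
+```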
+
+### D₂ — mixed insert/update + delete is rejected at parse time
+
+A single mutation query must be **either insert/update-only or delete-only**. A mixed query is rejected before any I/O with the message:
+
+> `mutation '<name>' on the same query mixes inserts/updates and deletes; split into separate mutations: (1) inserts and updates, then (2) deletes. This restriction lifts when Lance exposes a two-phase delete API (tracked: MR-793 / Lance-upstream).`
+
+Reason: under the staged-write rewire (MR-794), inserts and updates accumulate in memory and commit at end-of-query, while deletes still inline-commit (Lance 4.0.0 has no public two-phase delete). Mixing creates ordering hazards: a same-row insert→delete becomes a no-op because the staged insert isn't visible to the delete, and cascading deletes of just-inserted edges silently break referential integrity. Until Lance exposes `DeleteJob::execute_uncommitted`, the parse-time rejection keeps both paths atomic and correct. See [docs/dev/runs.md](../dev/runs.md) and [docs/dev/invariants.md](../dev/invariants.md).
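+
+In practice the split looks like the following sketch (hypothetical `Person` node; names and the `//` comment syntax are assumptions). A single query combining the `update` and the `delete` below would be rejected, so each goes in its own mutation and the caller runs them in order:
+
+```gq
+// Step 1: inserts and updates only
+query promote_person($name: String, $age: I32) {
+    update Person set { age: $age } where name = $name
+}
+
+// Step 2: deletes only, run after step 1 has published
+query purge_person($old_name: String) {
+    delete Person where name = $old_name
+}
+```
+
+If the two steps must land together, run both on a branch and merge it, as described in [transactions.md](transactions.md).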
+
+## IR (Intermediate Representation)
+
+`QueryIR { name, params, pipeline: Vec<IROp>, return_exprs, order_by, limit }`
+
+Pipeline operations:
+
+- `NodeScan { variable, type_name, filters }`
+- `Expand { src_var, dst_var, edge_type, direction (Out|In), dst_type, min_hops, max_hops, dst_filters }` — destination filters are pushed *into* the expand so Lance scalar pushdown can prune.
+- `Filter { left, op, right }`
+- `AntiJoin { outer_var, inner: Vec<IROp> }` — for `not { … }`
+
+Lowering:
+
+1. Partition MATCH clauses (bindings, traversals, filters, negations).
+2. Identify "deferred" bindings (a destination of a traversal that has filters) so the Expand can carry the filter as a pushdown.
+3. Emit NodeScan for the first binding, then Expand operations, then remaining Filter operations, then AntiJoins for negations.
+4. Translate RETURN / ORDER expressions; preserve LIMIT.
+
+## Linting & validation (`query/lint.rs`)
+
+Codes seen so far:
+
+- **Q000** (Error): parse error
+- **L201** (Warning): nullable property never set by any UPDATE — "{type}.{prop} exists in schema but no update query sets it"
+- (Warning): mutation declares no params — hardcoded mutations are easy to miss
+- Plus all type errors from `typecheck_query_decl()` (undefined types, mismatched operators, undefined edges, etc.)
+
+Output:
+
+```
+QueryLintOutput { status, schema_source, query_path,
+  queries_processed, errors, warnings, infos,
+  results: [{ name, kind, status, error?, warnings[] }],
+  findings: [{ severity, code, message, type_name?, property?, query_names[] }] }
+```
+
+CLI exits non-zero only on `status = Error`.
--- a/docs/user/schema-language.md
+++ b/docs/user/schema-language.md
@ -0,0 +1,80 @@
+# Schema Language (`.pg`)
+
+Pest grammar at `crates/omnigraph-compiler/src/schema/schema.pest`. AST at `schema/ast.rs`. Catalog at `catalog/mod.rs`.
+
+## Top-level declarations
+
+- `interface <Name> { property* }` — reusable property contracts.
+- `node <Name> [implements <Iface>, ...] { property* | constraint* }`
+- `edge <Name>: <FromType> -> <ToType> [@card(min..max)] { property* | constraint* }`
+- Comments: line `//` and block `/* … */`.
+
+## Property declarations
+
+`<ident>: <TypeRef> [annotation*]`
+
+## Built-in scalar types
+
+| Scalar | Arrow type |
+|---|---|
+| `String` | Utf8 |
+| `Blob` | LargeBinary |
+| `Bool` | Boolean |
+| `I32` / `I64` | Int32 / Int64 |
+| `U32` / `U64` | UInt32 / UInt64 |
+| `F32` / `F64` | Float32 / Float64 |
+| `Date` | Date32 |
+| `DateTime` | Date64 |
+| `Vector(<dim>)` | FixedSizeList(Float32, dim), `1 ≤ dim ≤ i32::MAX` |
+| `[<scalar>]` | List(scalar) |
+| `enum(v1, v2, …)` | Utf8 with sorted/dedup'd set of allowed string values |
+| `<scalar>?` | Same as scalar but `nullable: true` |
+
+## Constraints (body level)
+
+| Constraint | On | Effect |
+|---|---|---|
+| `@key(p, …)` | node | Primary key; implies index on key columns; `key_property()` returns the first key |
+| `@unique(p, …)` | node, edge | Uniqueness across listed columns |
+| `@index(p, …)` | node, edge | Build a scalar (BTREE) index on the columns |
+| `@range(p, min..max)` | node | Numeric range validation (open ranges allowed) |
+| `@check(p, "regex")` | node | Regex pattern validation |
+| `@card(min..max?)` | edge | Edge multiplicity — default `0..*`; `0..1`, `1..1`, `1..*`, etc. |
+
+Edge bodies only allow `@unique` and `@index`.
+
+## Annotations
+
+- `@<ident>` or `@<ident>(<literal>)` on any declaration or property.
+- Known annotations:
+  - `@embed` on a Vector property — names the *source* property whose text gets embedded into this vector at ingest (`embed_sources` map in NodeType).
+  - `@description("…")`, `@instruction("…")` on query declarations (carried through to clients).
+- Custom annotations are accepted by the parser and surfaced in catalog metadata; unrecognized annotations don't fail compilation.
+
+## Catalog construction
+
+- Pass 0: collect interfaces.
+- Pass 1: collect nodes, expand `implements`, build constraint and `@embed` mappings, build the Arrow schema for each node table (`id: Utf8` plus all properties; blob columns get `LargeBinary`).
+- Pass 2: collect edges, validate that `from_type` / `to_type` exist, normalize edge names case-insensitively for lookup, validate constraints for edges. Edge Arrow schema: `id: Utf8, src: Utf8, dst: Utf8` plus edge properties.
+
+## Schema IR & stable type IDs
+
+- `SCHEMA_IR_VERSION = 1` (`catalog/schema_ir.rs`).
+- Each interface/node/edge currently gets a `stable_type_id` from a kind+name hash.
+- Rename-preserving accepted IDs are an architectural invariant, but the current hash-on-name implementation is a known gap until migration carries IDs across `@rename_from`.
+- Serialized as JSON for diff/migration plans.
+
+## Schema migration planning
+
+`plan_schema_migration(accepted, desired) -> SchemaMigrationPlan { supported, steps[] }` with step types:
+
+- `AddType { type_kind, name }`
+- `RenameType { type_kind, from, to }`
+- `AddProperty { type_kind, type_name, property_name, property_type }`
+- `RenameProperty { type_kind, type_name, from, to }`
+- `AddConstraint { type_kind, type_name, constraint }`
+- `UpdateTypeMetadata { … annotations }`
+- `UpdatePropertyMetadata { … annotations }`
+- `UnsupportedChange { entity, reason }` (forces `supported=false`)
+
+`apply_schema()` returns `SchemaApplyResult { supported, applied, manifest_version, steps }` and is gated by an internal `__schema_apply_lock__` system branch so concurrent schema applies serialize.
--- a/docs/user/schema-lint.md
+++ b/docs/user/schema-lint.md
@ -0,0 +1,61 @@
+# Schema lint
+
+The migration planner emits **code-tagged diagnostics** for every schema change it rejects. Codes have the form `OG-XXX-NNN` and identify the rule (not the message); operators reference them in suppression directives, severity overrides, and CI reports.
+
+This page is the catalog of codes shipped today. The chassis behind it is tracked in [MR-694](https://linear.app/modernrelay/issue/MR-694).
+
+## What's shipped in v0
+
+- Stable codes attached to planner rejections (today: 5 of 17 rejection paths carry a code; the rest carry `code: None` and are tagged as future work).
+- Code appears in the user-visible error message: `[OG-DS-104] removing property 'Person.age' is not supported …`.
+- CLI `omnigraph schema plan` shows the code on `unsupported change …` lines.
+- Tests in `tests/schema_apply.rs` assert on codes, not on free-text prose.
+
+## What's not shipped yet
+
+- Severity configuration in `omnigraph.yaml` (planned: `lint: { OG-DS-103: error }`).
+- `@allow(OG-XXX-NNN, "rationale")` suppression directives.
+- Pre-migration checks (the `migration_check { … }` block — MR-941).
+- The CD / VE / LK / NM families (MR-942..945).
+- CI integration (MR-946).
+- Cost-class annotations (MR-944).
+
+See the parent chassis issue (MR-694) for the design and the per-family sub-issues for what's planned.
+
+## Code catalog (v0)
+
+The chassis defines ten families. Today only DS and MF have emitted codes. The remaining families are reserved for future PRs.
+
+| Code | Family | Tier | Default severity | Meaning |
+|---|---|---|---|---|
+| `OG-DS-101` | Destructive | destructive | error | drop graph type with rows (reserved; not yet emitted) |
+| `OG-DS-102` | Destructive | destructive | error | drop node type with rows |
+| `OG-DS-103` | Destructive | destructive | error | drop edge type with rows |
+| `OG-DS-104` | Destructive | destructive | error | drop property with rows |
+| `OG-DS-105` | Destructive | destructive | error | drop populated vector column (reserved) |
+| `OG-MF-103` | Maybe-fail | validated | error | add required property without `@default` to populated type |
+| `OG-MF-104` | Maybe-fail | validated | error | tighten nullable to non-nullable (reserved) |
+| `OG-MF-106` | Maybe-fail | destructive | error | narrowing scalar type |
+
+The full code catalog source of truth lives in `crates/omnigraph-compiler/src/lint/codes.rs`. CI-level invariants (uniqueness, format, family coverage) are unit-tested in the same module.
+
+## Families
+
+The ten chassis families:
+
+| Prefix | Family | Status |
+|---|---|---|
+| **DS** | Destructive (data-loss) | shipped, v0 |
+| **MF** | Maybe-fail / data-dependent | shipped, v0 |
+| **CD** | Constraint deletion (relaxation warning) | tracked in MR-942 |
+| **BC** | Backward-incompatible (rename) | implicit in `@rename_from`; codify later |
+| **NM** | Naming conventions | tracked in MR-945 |
+| **OW** | Ownership (per-resource Cedar) | tracked in MR-722 |
+| **NL** | Non-linear (branch-merge divergence) | stubbed in MR-947 |
+| **VE** | Vector / embedding | tracked in MR-943 |
+| **ED** | Edge / graph topology | tracked in MR-701, MR-943 |
+| **LK** | Lock duration / cost | tracked in MR-944 |
+
+## Prior art
+
+The chassis is modeled on [Atlas's `sqlcheck` analyzers](https://atlasgo.io/lint/analyzers) (DS / MF / CD / BC / NM families). Atlas was the direct inspiration for stable codes, per-rule severity, suppression directives with rationale, and pre-migration checks. OmniGraph adapts the chassis to a typed-IR substrate (no SQL injection vector, no per-engine locking, native vector / edge / embedding types Atlas doesn't have).
--- a/docs/user/server.md
+++ b/docs/user/server.md
@ -0,0 +1,101 @@
+# HTTP Server (`omnigraph-server`)
+
+Axum 0.8 + tokio + utoipa-generated OpenAPI. Single repo per process; deploy multiple processes for multi-tenancy.
+
+## Endpoint inventory
+
+| Method | Path | Auth | Action | Handler |
+|---|---|---|---|---|
+| GET | `/healthz` | none | — | `server_health` |
+| GET | `/openapi.json` | none | — | `server_openapi` (strips security if auth disabled) |
+| GET | `/snapshot?branch=` | bearer + `read` | snapshot of branch | `server_snapshot` |
+| POST | `/read` | bearer + `read` | run named query | `server_read` |
+| POST | `/export` | bearer + `export` | NDJSON stream | `server_export` |
+| POST | `/change` | bearer + `change` | mutation | `server_change` |
+| GET | `/schema` | bearer + `read` | get current `.pg` source | `server_schema_get` |
+| POST | `/schema/apply` | bearer + `schema_apply` (target=`main`) | migrate | `server_schema_apply` |
+| POST | `/ingest` | bearer + `branch_create` (if new) + `change` | bulk load | `server_ingest` (32 MB body limit) |
+| GET | `/branches` | bearer + `read` | list branches | `server_branch_list` |
+| POST | `/branches` | bearer + `branch_create` | create | `server_branch_create` |
+| DELETE | `/branches/{branch}` | bearer + `branch_delete` | delete | `server_branch_delete` |
+| POST | `/branches/merge` | bearer + `branch_merge` | merge `source → target` | `server_branch_merge` |
+| GET | `/commits?branch=` | bearer + `read` | list | `server_commit_list` |
+| GET | `/commits/{commit_id}` | bearer + `read` | show | `server_commit_show` |
+
+## Streaming
+
+Only `/export` streams (`application/x-ndjson`, MPSC channel + `Body::from_stream`). Everything else is buffered JSON.
+
+## Error model
+
+Uniform `ErrorOutput { error, code?, merge_conflicts[], manifest_conflict? }` with `code ∈ unauthorized | forbidden | bad_request | not_found | conflict | too_many_requests | internal`. Merge conflicts attach structured `MergeConflictOutput { table_key, row_id?, kind, message }`.
+
+`manifest_conflict` is set on **publisher CAS rejections** (HTTP 409): the
+caller's pre-write view of one table's manifest version was stale.
+`ManifestConflictOutput { table_key, expected, actual }` tells the client
+which table to refresh and retry. This is the conflict shape produced by
+concurrent `/change` or `/ingest` calls racing on the same `(table, branch)`.
+
+HTTP status codes used: 200, 400, 401, 403, 404, 409, 429, 500.
+
+## Per-actor admission control
+
+Disjoint `(table, branch)` writes from different actors now run concurrently,
+guarded only by the engine's per-(table, branch) write queue. To keep
+one heavy actor from exhausting shared capacity (Lance I/O, manifest
+churn, network), the server gates mutating handlers through a
+`WorkloadController` configured per-process from environment variables:
+
+| Env var | Default | Purpose |
+|---|---|---|
+| `OMNIGRAPH_PER_ACTOR_INFLIGHT_MAX` | 16 | Concurrent in-flight mutations per actor |
+| `OMNIGRAPH_PER_ACTOR_BYTES_MAX` | 4 GiB | In-flight estimated bytes per actor |
+
+When an actor exceeds its in-flight count or byte budget, the server
+returns **HTTP 429 Too Many Requests** with `code: too_many_requests`
+and a `Retry-After` header (seconds). The actor should back off; other
+actors are unaffected.
+
+Cedar policy authorization runs **before** admission accounting so
+denied requests don't consume admission slots.
+
+Today admission gates every mutating handler: `/change`, `/ingest`,
+`/branches/{create,delete,merge}`, and `/schema/apply`. Read-only
+endpoints (`/snapshot`, `/read`, `/export`, `/branches` GET, `/commits`,
+`/schema` GET) are not admission-gated.
+
+## Body limits
+
+- Default: 1 MB
+- `/ingest`: 32 MB
+
+## Auth model (`bearer + SHA-256`)
+
+- Tokens are SHA-256 hashed on startup; plaintext is not retained in memory.
+- Constant-time comparison via `subtle::ConstantTimeEq`.
+- Three sources, in order of precedence:
+  1. `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` — AWS Secrets Manager (build with `--features aws`)
+  2. `OMNIGRAPH_SERVER_BEARER_TOKENS_FILE` or `OMNIGRAPH_SERVER_BEARER_TOKENS_JSON` — JSON `{actor_id: token, …}`
+  3. `OMNIGRAPH_SERVER_BEARER_TOKEN` — single legacy token, actor `default`
+- If no tokens configured, server runs unauthenticated (local dev) and `/openapi.json` strips the security scheme.
+
+See [deployment.md](deployment.md) for token-source operational details.
+
+## Tracing & observability
+
+- `tower_http::TraceLayer::new_for_http()`
+- Policy decisions logged at INFO level with actor, action, branch, decision, matched rule
+- Startup logs: token source name, repo URI, bind address
+- Graceful SIGINT shutdown
+
+## Not implemented (by design or "TBD")
+
+- CORS — not configured; add `tower_http::cors` if needed.
+- Rate limiting — per-actor admission control gates `/change`, `/ingest`,
+  `/branches/{create,delete,merge}`, `/schema/apply` (see "Per-actor
+  admission control" above). No global rate limiter is configured;
+  add `tower_http::limit` if a graph-wide cap is needed.
+- Pagination — none (commits/branches return everything; export streams).
+- Multi-tenant routing — one repo per process.
--- a/docs/user/storage.md
+++ b/docs/user/storage.md
@ -0,0 +1,115 @@
+# Storage
+
+## L1 — Lance dataset (per node/edge type)
+
+Every node type and every edge type is its own Lance dataset:
+
+- **Columnar Arrow storage**: each property is a column; nullable per Arrow schema.
+- **Fragments**: data is partitioned into fragments; new writes create new fragments.
+- **Manifest versioning**: every commit produces a new dataset version; old versions remain readable.
+- **Stable row IDs**: `enable_stable_row_ids: true` is set on every Lance dataset OmniGraph creates: node and edge data tables, `__manifest`, `_graph_commits.lance`, `_graph_commit_recoveries.lance`, and any future system tables. This is an architectural invariant: the flag is one-way at dataset create per Lance's row-id-lineage spec, so a future change that introduces a Lance dataset must preserve it. Consequences and caveats:
+  - `_row_created_at_version` and `_row_last_updated_at_version` are available on every dataset (load-bearing for change-feed validators).
+  - `CreateIndex × Rewrite` is not a retryable conflict, so indices survive `omnigraph optimize` without needing the Fragment Reuse Index.
+  - Readers must use a Lance build that recognises the flag (our pinned 4.0.0 is fine).
+  - Pre-0.4.x repos created before this code path settled may have datasets without the flag and cannot be retrofitted in place; the supported path is dump-and-reload.
+  - The `stage_overwrite` rewrite path (used by `schema_apply`) preserves the flag through `Operation::Overwrite`; pinned by `stage_overwrite_preserves_stable_row_ids` in `crates/omnigraph/tests/staged_writes.rs`.
+- **Append / delete / `merge_insert`**: native Lance write modes.
+- **Per-dataset branches** (Lance native): copy-on-write at the dataset level.
+- **Object-store agnostic**: file://, s3://, gs://, az://, http (read-only via Lance) — OmniGraph wires file:// and s3:// (`storage.rs`).
+
+## L2 — Multi-dataset coordination via `__manifest`
+
+OmniGraph is **not** a single Lance dataset; it is a *graph* of datasets coordinated through one append-only manifest table.
+
+- **Manifest table**: `__manifest/` Lance dataset.
+- **Layout** (`db/manifest/layout.rs`, `db/manifest/state.rs`):
+  - `nodes/{fnv1a64-hex(type_name)}` — one Lance dataset per node type
+  - `edges/{fnv1a64-hex(edge_type_name)}` — one Lance dataset per edge type
+  - `__manifest/` — the catalog of all sub-tables and their published versions
+  - `_graph_commits.lance` / `_graph_commit_actors.lance` — the commit graph and its actor map
+  - (legacy `_graph_runs.lance` / `_graph_run_actors.lance` from pre-v0.4.0 repos are inert; the run state machine was removed in MR-771 and these files are cleaned up via MR-770's production sweep)
+- **Manifest row schema** (`object_id, object_type, location, metadata, base_objects, table_key, table_version, table_branch, row_count`):
+  - `object_type` ∈ `table | table_version | table_tombstone`
+  - `table_key` ∈ `node:<TypeName> | edge:<EdgeName>`
+  - `table_branch` is `null` for the main lineage and the branch name otherwise
+- **Snapshot reconstruction**: latest visible `table_version` per `(table_key, table_branch)` minus tombstones, i.e. rows where `object_type = table_tombstone` whose own `table_version` (acting as the tombstone version) is `>=` the entry's `table_version`.
+- **Atomic publish**: multi-dataset commits publish via a `ManifestBatchPublisher` so a single write to `__manifest` flips all the new sub-table versions visible at once.
+- **Row-level CAS on the merge-insert join key**: `object_id` carries `lance-schema:unenforced-primary-key=true` so Lance's bloom-filter conflict resolver rejects two concurrent commits that land the same `object_id` row. Without this annotation, Lance's transparent rebase would admit silent duplicates of `version:T@v=N` from racing publishers (see `.context/merge-insert-cas-granularity.md`).
+- **Optimistic concurrency control on publish**: `ManifestBatchPublisher::publish` accepts an `expected_table_versions: HashMap<table_key, u64>` map. Each entry asserts the manifest's current latest non-tombstoned version for that table is exactly what the caller observed; mismatches surface as `OmniError::Manifest` with `ManifestConflictDetails::ExpectedVersionMismatch { table_key, expected, actual }`. An empty map preserves the legacy "best-effort publish" semantics. The publisher uses `conflict_retries(0)` against Lance and owns retry itself (`PUBLISHER_RETRY_BUDGET = 5`), re-running the pre-check on each iteration so concurrent advances surface as `ExpectedVersionMismatch` rather than being silently rebased through.
+
+### Internal schema versioning (`db/manifest/migrations.rs`)
+
+The on-disk shape of `__manifest` is reconciled with the binary via a single stamp + dispatcher. `INTERNAL_MANIFEST_SCHEMA_VERSION` declares the shape this binary writes; the on-disk stamp `omnigraph:internal_schema_version` lives in the manifest dataset's schema-level metadata (Lance `update_schema_metadata`).
+
+- **`init_manifest_repo`** stamps the current version at creation, so newly initialized repos never need migration.
+- **Publisher open-for-write path** (`load_publish_state`) calls `migrate_internal_schema(&mut dataset)` before reading state. When the on-disk stamp matches the binary, this is a single metadata read with no writes; otherwise the dispatcher walks `match`-arm steps forward (1→2, 2→3, …) until the stamp matches, then proceeds with the publish. Reads stay side-effect-free.
+- **Forward-version protection**: a stamp *higher* than the binary's known version triggers a clear "upgrade omnigraph first" error. An old binary cannot clobber a newer schema by silently treating "unknown stamp" as "missing stamp".
+- **Idempotency**: each migration step is safe to re-run. A crash between two metadata updates inside a single step leaves the partial state; the next open re-runs the step and the second update lands. The dispatcher itself is a cheap stamp-read on the steady-state path.
+
+Adding a new on-disk shape change is one constant bump (`INTERNAL_MANIFEST_SCHEMA_VERSION`), one match arm in `migrate_internal_schema`, and one test. No code outside this module branches on the stamp.
+
+| Stamp | Shape change |
+|---|---|
+| v1 (implicit, pre-stamp) | `__manifest.object_id` had no PK annotation; publisher had no row-level CAS protection. |
+| v2 | `__manifest.object_id` carries `lance-schema:unenforced-primary-key=true`; row-level CAS engaged. Stamped as `omnigraph:internal_schema_version=2`. |
+
+## On-disk layout
+
+A repo on disk is a directory tree of Lance datasets. Each dataset follows the standard Lance layout (`_versions/`, `data/`, `_indices/`, `_refs/`); OmniGraph adds the multi-dataset coordination by keeping `__manifest/` alongside the per-type datasets.
+
+```mermaid
+flowchart TB
+    classDef l1 fill:#fef3e8,stroke:#c46900,color:#000
+    classDef l2 fill:#e8f4fd,stroke:#1e6aa8,color:#000
+
+    repo["repo URI<br/>file:// or s3://bucket/prefix"]:::l2
+
+    manifest["__manifest/<br/>L2 catalog of sub-tables"]:::l2
+    nodes["nodes/{fnv1a64-hex}/<br/>one dataset per node type"]:::l2
+    edges["edges/{fnv1a64-hex}/<br/>one dataset per edge type"]:::l2
+    cgraph["_graph_commits.lance/<br/>_graph_commit_actors.lance/<br/>_graph_commit_recoveries.lance/"]:::l2
+    recovery["__recovery/{ulid}.json<br/>recovery sidecars (transient)"]:::l2
+    refs["_refs/branches/{name}.json<br/>graph-level branches"]:::l2
+
+    repo --> manifest
+    repo --> nodes
+    repo --> edges
+    repo --> cgraph
+    repo --> recovery
+    repo --> refs
+
+    subgraph dataset[Inside each Lance dataset — L1]
+        ds_v["_versions/{n}.manifest<br/>per-dataset versions"]:::l1
+        ds_data["data/<br/>fragment files (Lance file format)"]:::l1
+        ds_idx["_indices/{uuid}/<br/>BTREE · Inverted FTS · IVF/HNSW"]:::l1
+        ds_refs["_refs/<br/>per-dataset Lance branches/tags"]:::l1
+        ds_tx["_transactions/<br/>commit transaction logs"]:::l1
+    end
+
+    nodes -.-> dataset
+    edges -.-> dataset
+    manifest -.-> dataset
+```
+
+**What's where:**
+
+- **Repo root** is one directory (or S3 prefix). Everything below is part of one OmniGraph repo.
+- **`__manifest/`** is a Lance dataset whose rows describe which sub-table version is published at which graph-branch. Reading a snapshot starts here.
+- **`nodes/`** and **`edges/`** are sibling directories holding one Lance dataset per declared type. Names are `fnv1a64-hex` of the type name to keep paths fixed-length and case-safe.
+- **`_graph_commits.lance`** is an L2 dataset that records the graph-level commit DAG, with a paired `_graph_commit_actors.lance` for the actor map. (Pre-v0.4.0 repos also have inert `_graph_runs.lance` / `_graph_run_actors.lance` from the removed Run state machine; MR-770 sweeps these in production.)
+- **`_graph_commit_recoveries.lance`** — one row per recovery sweep action. Joined to `_graph_commits.lance` by `graph_commit_id`; the linked commit row carries `actor_id=omnigraph:recovery`. Operators correlate recoveries with the original mutations they rolled forward / back via this join. See `crates/omnigraph/src/db/recovery_audit.rs`.
+- **`__recovery/{ulid}.json`** — transient sidecar files written by the four migrated writers (`MutationStaging::finalize`, `schema_apply`, `branch_merge`, `ensure_indices`) before Phase B begins, deleted after Phase C succeeds. A sidecar persisting after process exit means the writer crashed in the Phase B → Phase C window; the next `Omnigraph::open` recovery sweep processes it. Steady-state directory is empty. See `crates/omnigraph/src/db/manifest/recovery.rs`.
+- **`_refs/branches/{name}.json`** is graph-level branch metadata — pointers from a branch name to the manifest version it heads.
+- **Inside each Lance dataset** (orange): the standard Lance directory layout. `_versions/{n}.manifest` records every commit; `data/` holds the actual Arrow fragments; `_indices/{uuid}/` holds index segments with their own `fragment_bitmap` for partial coverage; `_refs/` holds Lance-native per-dataset branches and tags.
+
+The split — L2 owns the cross-dataset catalog; L1 owns the per-dataset internals — means that schema work (which adds or removes datasets) updates `__manifest`, while data work (which adds fragments) updates `_versions/` inside the affected dataset and then bumps `__manifest`.
+
+## URI scheme support (`storage.rs`)
+
+| Scheme | Backend | Notes |
+|---|---|---|
+| local path / `file://` | `LocalStorageAdapter` (tokio) | Normalized to absolute paths |
+| `s3://bucket/prefix` | `S3StorageAdapter` (object_store) | Honors `AWS_ENDPOINT_URL_S3`, `AWS_ALLOW_HTTP`, `AWS_S3_FORCE_PATH_STYLE` |
+| `http(s)://host:port` | HTTP client to `omnigraph-server` | Used by CLI as a target, not a storage backend |
+
+## Object-store env vars (S3-compatible)
+
+- `AWS_REGION`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`
+- `AWS_ENDPOINT_URL`, `AWS_ENDPOINT_URL_S3` — for MinIO / RustFS / GCS-via-XML
+- `AWS_S3_FORCE_PATH_STYLE=true` — path-style URLs
+- `AWS_ALLOW_HTTP=true` — allow plain HTTP (local dev)
--- a/docs/user/transactions.md
+++ b/docs/user/transactions.md
@ -0,0 +1,168 @@
+# Transactions in OmniGraph
+
+OmniGraph does not have `BEGIN` / `COMMIT` / `ROLLBACK`. Branches do that job. This page explains the model, when to use which primitive, and shows worked examples for the patterns that come up most.
+
+The architectural rule lives in [`docs/dev/invariants.md`](../dev/invariants.md):
+
+> **Mutations publish at one boundary.** A `mutate_as` or `load` operation
+> accumulates constructive writes, commits each touched table at the end, then
+> publishes one manifest update.
+
+If you need to coordinate multiple queries atomically, you fork a branch, run mutations on it, and merge when you're satisfied. If something goes wrong, you delete the branch.
+
+## The atomicity model
+
+Two primitives, two scopes:
+
+| Scope | Primitive | Atomic? | Failure mode |
+|---|---|---|---|
+| **One `.gq` query** (any number of statements inside) | The query itself — handled by the publisher's atomic manifest commit | Yes — all statements land together or none of them do | The publisher never publishes; target unchanged |
+| **Many queries that must succeed together** | Branches: `branch_create` → run N queries on the branch → `branch_merge` | Yes — the merge is a single atomic publish | Drop the branch (`branch_delete`); main is unaffected |
+
+Snapshot isolation is per-query: every read inside one query sees one consistent manifest version. Two concurrent queries on the same branch see independent snapshots; the publisher's compare-and-swap (CAS) on the manifest catches racing writes.
+
+## Comparison with `BEGIN` / `COMMIT`
+
+| Postgres / MySQL | OmniGraph |
+|---|---|
+| `BEGIN; … ; COMMIT` | `branch_create review/X` → mutations on `review/X` → `branch_merge review/X --into main` |
+| `ROLLBACK` | `branch_delete review/X` |
+| Connection-bound session state | Branch-scoped lineage on disk |
+| Locks (or MVCC + abort on conflict) | Snapshot isolation per query + three-way merge at branch-join |
+| Transaction is invisible to ops | Branch is a durable artifact (visible in `branch_list`, queryable, time-travelable) |
+
+The trade-off: branches are heavier than a connection-scoped transaction (they exist on disk, have a name, show up in `branch_list`), but they fit the agent-as-user model — agents naturally fork branches to plan, batch, and review work. And they're durable: if your process crashes mid-workflow, the branch survives and you can pick up where you left off.
+
+## Worked examples
+
+### 1. Single query, multi-statement (atomic by default)
+
+A `.gq` query with multiple `insert` / `update` statements is one transaction. Either all statements land together at publish time, or none do.
+
+```gq
+query register_employee_with_team($name: String, $age: I32, $team: String) {
+    insert Person { name: $name, age: $age }
+    insert WorksAt { from: $name, to: $team }
+}
+```
+
+```bash
+omnigraph change --query ./mutations.gq --name register_employee_with_team \
+    --params '{"name":"Alice","age":30,"team":"Acme"}' ./repo.omni
+```
+
+If the second statement fails (e.g. because `Acme` doesn't exist), the publisher never publishes, so `Alice` never becomes visible in the database. The query is atomic.
+
+### 2. Two separate queries on `main` (NOT atomic)
+
+```bash
+# Query 1
+omnigraph change --query ./mutations.gq --name register_employee --params '{"name":"Alice","age":30}' ./repo.omni
+
+# Query 2 — runs after Query 1 has already published
+omnigraph change --query ./mutations.gq --name link_to_team --params '{"name":"Alice","team":"Acme"}' ./repo.omni
+```
+
+These are **two publishes** on `main`. If Query 2 fails, Query 1's effects are already visible. There is no `ROLLBACK` for Query 1.
+
+If you want both-or-neither, you have two options:
+- Combine them into a single `.gq` query (example 1 above), or
+- Use a branch (example 3 below).
+
+### 3. Many queries, atomic via a branch
+
+Use this pattern when you need to run multiple queries, possibly across multiple commands, agents, or sessions, and have them succeed or fail as a unit.
+
+```bash
+# Fork a working branch from main.
+omnigraph branch create --from main onboarding/2026-04-25 ./repo.omni
+
+# Run any number of mutations on the branch — each one is its own publish on the branch.
+# Concurrent reads of `main` are unaffected.
+omnigraph change --branch onboarding/2026-04-25 \
+    --query ./mutations.gq --name register_employee \
+    --params '{"name":"Alice","age":30}' ./repo.omni
+
+omnigraph change --branch onboarding/2026-04-25 \
+    --query ./mutations.gq --name register_employee \
+    --params '{"name":"Bob","age":25}' ./repo.omni
+
+omnigraph change --branch onboarding/2026-04-25 \
+    --query ./mutations.gq --name link_to_team \
+    --params '{"name":"Alice","team":"Acme"}' ./repo.omni
+
+# Inspect the branch — read queries work just like on main.
+omnigraph read --branch onboarding/2026-04-25 \
+    --query ./queries.gq --name list_employees ./repo.omni
+
+# Happy with what's on the branch? Merge it. This is one atomic publish:
+# `main` flips to include every commit on the branch.
+omnigraph branch merge onboarding/2026-04-25 --into main ./repo.omni
+
+# OR: not happy? Throw it away. `main` is untouched.
+# omnigraph branch delete onboarding/2026-04-25 ./repo.omni
+```
+
+Properties:
+- Each query on the branch is its own publisher commit — so they're individually atomic. Per-query CAS works on branches just like on main.
+- The branch lives on disk. Process crash mid-workflow? Re-open and resume.
+- Multiple agents can work on different branches in parallel without blocking each other.
+- The merge is a three-way merge at the row level. Conflicts surface as `OmniError::MergeConflicts(Vec<MergeConflict>)`, with structured kinds (`DivergentInsert`, `DivergentUpdate`, `DeleteVsUpdate`, …) so callers can handle them programmatically.
+
+### 4. Coordinating multiple agents
+
+Two agents writing to the same graph independently:
+
+```bash
+# Agent A
+omnigraph branch create --from main agent-a/work ./repo.omni
+omnigraph change --branch agent-a/work … ./repo.omni
+# … many mutations …
+omnigraph branch merge agent-a/work --into main ./repo.omni
+
+# Agent B (running concurrently)
+omnigraph branch create --from main agent-b/work ./repo.omni
+omnigraph change --branch agent-b/work … ./repo.omni
+# … many mutations …
+omnigraph branch merge agent-b/work --into main ./repo.omni
+```
+
+Each agent sees a consistent snapshot of `main` at the time it forked. The first merge to `main` lands as a fast-forward (or a no-op if no concurrent change). The second merge runs three-way: rows touched by both branches surface as `MergeConflict`s for the caller to resolve.
+
+This is the workflow MR-797 / agentic loops are designed around: **branches are the unit of "an agent's working set."**
+
+## Failure modes
+
+| Scenario | What happens | Caller action |
+|---|---|---|
+| Single query fails mid-flight | Publisher never publishes; target unchanged | Read the error, decide whether to retry |
+| Concurrent writers race on the same `(table, branch)` | Publisher CAS rejects the loser with `ManifestConflictDetails::ExpectedVersionMismatch` | Refresh handle, retry the query |
+| Branch with N successful mutations, then merge fails (three-way conflict) | Each individual mutation already committed on the branch; merge surfaces `MergeConflicts` | Inspect, decide whether to keep working on the branch, abandon it (`branch_delete`), or resolve and re-merge |
+| Process crashes mid-branch-workflow | Each completed mutation on the branch is durable | Re-open the repo, continue where you left off |
+
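+The "refresh handle, retry the query" action for a rejected publish can be sketched as a small shell wrapper. This is an illustration, not part of the `omnigraph` CLI: `retry_publish` and `RETRY_DELAY_SECONDS` are hypothetical names, and the sketch assumes that the CLI exits non-zero when a publish is rejected and that each new invocation opens the repo at the latest manifest version.
+
+```bash
+# Hypothetical helper: re-run a command until it succeeds or the attempt
+# budget is exhausted.
+# ASSUMPTION: the wrapped command exits non-zero when its publish is rejected.
+# Usage: retry_publish <max_attempts> <command> [args...]
+retry_publish() {
+    local max_attempts="$1"
+    shift
+    local attempt=1
+    # Run the command; on failure, wait briefly and try again.
+    until "$@"; do
+        if [ "$attempt" -ge "$max_attempts" ]; then
+            echo "giving up after ${attempt} attempts" >&2
+            return 1
+        fi
+        attempt=$((attempt + 1))
+        sleep "${RETRY_DELAY_SECONDS:-1}"
+    done
+}
+
+# Example call, using only the flags shown earlier on this page:
+# retry_publish 3 omnigraph change --query ./mutations.gq \
+#     --name register_employee --params '{"name":"Alice","age":30}' ./repo.omni
+```
+
+The wrapper retries on any failure, not only on a CAS conflict, so read the error first: a query that fails for another reason will fail the same way on every attempt.
+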
+## When to use what
+
+| Intent | Use |
+|---|---|
+| One conceptual change, multiple statements | One `.gq` query with multiple statements |
+| Bulk import of a related set of records | One `omnigraph load` (the loader is one atomic query under the hood) |
+| Many independent changes, no coordination needed | Many separate queries on `main`. Each is its own atomic unit. |
+| "Do these N things, all together or not at all" | Branch → run N queries → merge |
+| "Try things, evaluate, then commit" | Branch → mutate → read/inspect → merge or delete |
+| "Multiple agents writing concurrently" | One branch per agent, merge to `main` at end of agent task |
+| "Long-running workflow that may span sessions or process restarts" | Branch (durable on disk) |
+
+## What this model can't do
+
+- **Cross-query atomicity on `main` without a branch.** Without a branch, multiple queries on `main` publish independently. There is no implicit transaction.
+- **Long-running interactive transactions.** No `BEGIN` over a connection. Branches are the durable equivalent.
+- **Cross-graph (cross-repo) transactions.** Each repo is its own atomicity domain.
+- **"Pessimistic" locks** that serialize writers before they reach the storage layer. Snapshot MVCC and publisher CAS handle concurrency optimistically; the loser retries.
+
+## See also
+
+- [`docs/user/branches-commits.md`](branches-commits.md) — branch and commit-graph mechanics.
+- [`docs/dev/merge.md`](../dev/merge.md) — three-way merge details and conflict kinds.
+- [`docs/user/query-language.md`](query-language.md) — `.gq` syntax for the multi-statement queries used above.
+- [`docs/dev/runs.md`](../dev/runs.md) — the per-query commit pipeline that gives single-query atomicity.
+- [`docs/dev/invariants.md`](../dev/invariants.md) — the architectural rule.