* test(engine): pin zero-row cascade delete must not drift an edge table (red) A delete <Node> cascades a delete_where into every incident edge type. The inline delete_where (Dataset::delete) advances Lance HEAD even when zero edges match, but the cascade records the new version only if deleted_rows > 0 — so a node with no incident edges leaves edge:Knows HEAD>manifest drift, which trips the next strict write's ExpectedVersionMismatch and repair refuses it. Red today: edge:Knows manifest=v5, Lance HEAD=v6. Goes green when delete moves to the staged two-phase path (iss-950, Lance 7.0 DeleteBuilder::execute_uncommitted), where a 0-row delete commits no Lance version and the deleted_rows>0 gate becomes correct by construction. * fix(engine): a zero-row delete must not advance Lance HEAD Lance's Dataset::delete commits a new version even when the predicate matches nothing (build_transaction always emits Operation::Delete), so a node delete that cascades a delete_where into an incident edge type with no matching edges advanced that edge table's Lance HEAD while the cascade skipped record_inline (gated on deleted_rows > 0) — leaving HEAD>manifest drift that wedged the next strict write and that repair refused as suspicious/unverifiable. Use Lance 7.0's two-phase DeleteBuilder::execute_uncommitted to read num_deleted_rows before committing: a no-match delete now advances nothing (no version, no drift) and the existing deleted_rows>0 gate is correct by construction. Non-zero deletes commit the staged transaction with skip_auto_cleanup + affected_rows (parity with the prior inline path). First step of the staged-delete migration (iss-950); turns the node_delete_with_no_incident_edges_leaves_no_edge_table_drift regression green. * feat(engine): stage_delete two-phase primitive (MR-A step 0) Add TableStore::stage_delete (Lance 7.0 DeleteBuilder::execute_uncommitted), the two-phase analogue of stage_merge_insert: writes deletion files without advancing Lance HEAD, returns Option<StagedWrite> (None on 0 rows = true no-op), carrying the deletion-vector updated_fragments as new_fragments and the superseded originals as removed_fragment_ids so combine_committed_with_staged makes the deletion visible to in-query reads. No affected_rows is threaded: like stage_merge_insert's Operation::Update commit, the staged delete relies on OmniGraph's per-table write queue + manifest CAS, not Lance's per-dataset conflict resolver (commit_staged is a single attempt). Flip the two residual guards to the staged path: staged_writes.rs now asserts stage_delete does NOT advance HEAD and that a staged delete is read-your-writes visible (the deletion-vector RYW proof D2 retirement depends on); the lance_surface_guards delete guard pins execute_uncommitted's UncommittedDelete. No behavior change yet (callers still use delete_where); Step 1 wires them. * feat(engine): TableStorage::stage_delete + migrate merge delete path (MR-A step 1a) Add stage_delete/Option<StagedHandle> to the TableStorage trait (delegates to TableStore::stage_delete). Migrate the two branch_merge delete sites (three-way RewriteMerged + adopt delta) from the inline delete_where residual to stage_delete + commit_staged — identical in shape to the stage_merge_insert + commit_staged pair above each. HEAD still advances within the merge sequence (via commit_staged), under the unchanged SidecarKind::BranchMerge Phase-B confirmation; the _pre_delete/_pre_index failpoints fire by position, unchanged. merge_truth_table, branching, composite_flow green. * feat(engine): migrate all delete sites to staged path, retire inline delete (MR-A step 1b/1c) Routes every delete through the staged write path so delete never advances Lance HEAD inline — the last inline-commit residual on the mutation path is gone. `MutationStaging` now accumulates delete predicates (`record_delete`) alongside pending write batches; at end-of-query `stage_all` combines a table's predicates into one `(p1) OR (p2) …` `stage_delete` (a deletion-vector transaction, no HEAD advance) and `commit_all` commits it through the same `commit_staged` path as inserts/updates. Deletes are now ordinary staged entries: one sidecar pin at `expected + 1`, no inline special-casing. Migrated callers (all 5): the 3 mutation.rs sites (delete-node, cascade, delete-edge) and the 2 merge.rs sites (already on stage_delete in step 1a). `affected_edges`/`affected` move from post-inline-commit `deleted_rows` to a committed `count_rows` at record time — exact under D₂, bounded by the cascade working set. A predicate matching zero rows stages nothing (the staged equivalent of the old "skip record_inline on 0 deleted rows"), so the zero-row edge-table drift class stays closed by construction. Retired scaffolding now that no caller remains: - `MutationStaging.inline_committed` + `record_inline` → `delete_predicates` + `record_delete`; `StagedMutation.inline_committed`/`paths` fields and all the `commit_all` inline handling (queue keys, sidecar pins with the `record_inline` table_version special-case, the inline recheck loop). - `open_table_for_mutation`'s post-inline-commit reopen branch (deletes no longer advance HEAD mid-query, so a second touch reopens at the pinned version like any write). - `InlineCommitResidual::delete_where` + its `TableStore` impl, the orphaned `TableStore::delete_where`, and `DeleteState`. `InlineCommitResidual` now carries only `create_vector_index` (Lance #6666 still open). D₂ stays for now: staged-delete read-your-writes doesn't yet compose into the pending accumulator (insert-then-delete on one table), so mixed insert/update/delete in one query is still rejected at parse time. Retiring D₂ is step 2. Doc comments updated to match across exec/, storage_layer, db/. Tests (all green): writes, consistency, validators, end_to_end, composite_flow, merge_truth_table, maintenance, recovery, staged_writes, forbidden_apis, lance_surface_guards, changes, point_in_time (286), plus failpoints (63). * docs: delete is a staged write, not an inline-commit residual (MR-A step 1) Update the docs that described `delete` as the inline-commit residual now that MR-A routes it through `stage_delete`. Always-loaded surfaces (AGENTS.md rule 4 / capability matrix, invariants.md Invariant 4 / truth matrix / known gaps) plus the dev write-path docs (writes.md, execution.md incl. its mutation sequence diagram, architecture.md) now state: deletes accumulate as predicates and stage like inserts/updates, no inline HEAD advance; `InlineCommitResidual` carries only `create_vector_index` (Lance #6666). The parse-time D₂ rule is documented as retained — not because delete inline-commits, but because staged-delete read-your-writes is not yet wired into the pending accumulator (MR-A step 2). lance.md's 7.0 audit note marked MR-A as landed. * docs: D₂ is a deliberate boundary, not temporary scaffolding (MR-A close-out) After MR-A staged the delete path, D₂ (a mutation query is insert/update-only OR delete-only) was left framed as temporary — "until Lance ships two-phase delete" / "retire in step 2". Lance shipped that and we used it for the inline-commit fix; D₂'s original justification is gone. It now stands for a different, permanent reason: keeping a query to one kind keeps its read-your-writes unambiguous and each table to one version per query. Retiring it would buy single-commit mixed atomicity (cheap workaround: split, or a branch) at the cost of an in-query delete view, pending pruning, edge id-resolution, and two-commit-per-table ordering in the hot mutation path — complexity not worth earning. Decision: keep D₂ as a deliberate boundary. Reframes the now-stale wording everywhere, no logic change: - The D₂ parse-time error message no longer promises "this restriction lifts when Lance exposes a two-phase delete API"; it states the boundary and points to a branch+merge for one atomic commit. - `enforce_no_mixed_destructive_constructive` doc, AGENTS.md, invariants.md (Invariant 4 / truth matrix / removed from the known-gaps), writes.md, architecture.md, lance.md, and the user mutations doc (which wrongly said deletes "commit through a different path" — both stage now). - Swept remaining stale `delete_where` mentions left from the Step-1 migration: the merge.rs "swap when upstream ships" comments (already swapped), the forbidden_apis / table_ops residual notes, the staged_writes vector-index guard doc (was "same as stage_delete's absence" — stage_delete now exists), and test comments/assert messages in recovery/maintenance/writes/failpoints. Genuinely-historical records (dated Lance audit, rfc-013, bug-case-fix) left. Verified: engine builds warning-free; check-agents-md OK; writes/maintenance/ recovery/staged_writes/forbidden_apis all green. Closes MR-A. * test(engine): overlapping delete predicates must not double-count affected_* (red) Reproduces a reporting regression from the staged-delete migration flagged in PR #308 review. Because deletes now stage (instead of inline-committing), two delete statements in one query both scan the same unchanged committed snapshot; counting each predicate independently over-reports `affected_*` when they overlap. The old inline path committed each delete before the next ran, so it counted distinct. `delete Person where name = "Alice"` then `delete Person where age > 29` over the standard fixture (Alice 30, Charlie 35) removes 2 distinct nodes and 3 distinct edges, but the buggy per-statement counting returns 3 nodes / 6 edges. RED at this commit (asserts left=3, right=2). * fix(engine): dedup overlapping delete predicates when counting affected_* Count each delete statement against the committed snapshot MINUS the predicates a prior delete statement on the same table already recorded: `(pred) AND NOT ((prior1) OR (prior2) …)`. Summed over statements this is inclusion-exclusion — `Σ |pₙ \ (p₁ ∪ …)| = |p₁ ∪ p₂ ∪ …|` — exactly the distinct count the combined `(p1) OR (p2)` staged delete removes. Works for nodes and edges alike with no edge identity needed; the node ID scan uses the same exclusion so a later statement also doesn't re-cascade already-deleted nodes. The ORIGINAL predicate is still what gets recorded (the staged delete removes the union); only the count uses the exclusion. The common single-delete path is unchanged (`prior` empty → filter is just the base predicate). New helper `dedup_delete_filter` + `MutationStaging::recorded_delete_predicates`. Turns the red regression test green (2 nodes / 3 edges); writes (33), end_to_end, validators, maintenance, recovery, composite_flow, merge_truth_table, consistency, changes, and failpoints (63) all stay green. * test(engine): delete dedup must not drop NULL-column rows (red) Follow-up to the overlapping-delete fix flagged in PR #308 review (Greptile P1): the `(base) AND NOT (prior)` exclusion breaks under SQL three-valued logic. If a prior delete predicate references a NULLable column, a later statement's matching row whose column is NULL makes `prior` evaluate to UNKNOWN, `NOT UNKNOWN` is UNKNOWN, and the row is filtered out of the scan — even though the prior delete never matched it. That drops it from `deleted_ids`, skipping its cascade (orphaned edges) or, if it is the only match, leaving the node undeleted. A data bug, not just a miscount. Data: Charlie(age 35), Zoe(age NULL); Knows Zoe→Charlie. `delete Person where age > 30` then `delete Person where name = "Zoe"`. Under the buggy `NOT`, Zoe's scan `(name='Zoe') AND NOT (age>30)` is UNKNOWN → Zoe survives. RED at this commit (Person count left=1, right=0). * fix(engine): NULL-safe delete dedup — exclude only definitely-matched prior rows Change `dedup_delete_filter` from `(base) AND NOT (prior)` to `(base) AND ((prior) IS NOT TRUE)`. `IS NOT TRUE` keeps both FALSE and UNKNOWN rows, so a prior predicate that evaluates to SQL UNKNOWN (a NULL in a referenced column) no longer drops a row this statement legitimately matches — only rows a prior predicate matched as definitely TRUE are excluded from the count/scan. The distinct-count semantics are unchanged for non-NULL data. Turns the red NULL-dedup test green (Zoe deleted, her edge cascaded), and the overlapping-dedup + writes/end_to_end/validators/maintenance/recovery/ composite_flow/consistency suites stay green. * docs(engine): note dedup_delete_filter's load-bearing dependency on D₂ Self-review follow-up: the overlapping-delete dedup assumes the committed snapshot is invariant across a query's statements, which holds only because D₂ forbids mixing writes with deletes (so a delete-touched table has no pending writes). Make that dependency explicit at the function so a future D₂ relaxation is forced to revisit the dedup. Comment-only. * Preserve staged write commit metadata |
||
|---|---|---|
| .cargo | ||
| .context | ||
| .github | ||
| assets | ||
| crates | ||
| docker | ||
| docs | ||
| scripts | ||
| skills/omnigraph | ||
| .dockerignore | ||
| .gitignore | ||
| AGENTS.md | ||
| Cargo.lock | ||
| Cargo.toml | ||
| CLAUDE.md | ||
| CODE_OF_CONDUCT.md | ||
| CONTRIBUTING.md | ||
| Dockerfile | ||
| GOVERNANCE.md | ||
| LICENSE | ||
| og-cheet-sheet.md | ||
| omnigraph.example.yaml | ||
| openapi.json | ||
| README.md | ||
| rust-toolchain.toml | ||
| SECURITY.md | ||
Lakehouse graph database for context assembly & multi-agent coordination
Multimodal retrieval · Git-style branching · object-storage native
Quickstart · Docs · Cookbooks · CLI
Omnigraph is the operational state and coordination layer for fleets of agents.
Run it as a server, declared as code; hundreds of agents operate and enrich the graph on parallel isolated branches, and every change is reviewed and merged safely.
Key capabilities
| Capability | What it gives you |
|---|---|
| Declared as code | A cluster.yaml declares graphs, schemas, stored queries, embedding providers, and policies; cluster apply converges it and omnigraph-server brings every graph online at /graphs/{id}/…. |
| Built for fleets of agents | Hundreds of agents enrich the graph on parallel isolated branches; changes are reviewed and merged safely, Git-style, across the whole graph. |
| Multimodal retrieval | Graph traversal + vector ANN + full-text + Reciprocal Rank Fusion in one query runtime, for context assembly. |
| Security as code | Cedar policy enforced server-side on every mutation, per-graph and server-wide; bearer auth; actor/audit tracking. |
| Runs on your infrastructure | Any S3-compatible object store: on-prem via RustFS / MinIO, or AWS S3 / R2 / GCS. VPC, on-prem, hybrid; your data never leaves your store. |
| Open, versioned storage | Lance columnar format: branchable, time-travelable, with native blob-as-data (docs, images, video). |
What you can build
| Use case | What it's for |
|---|---|
| Company brain | Org knowledge unified into one graph every agent can query |
| Agentic memory | Durable, versioned memory: a branch per agent or per task, merged on review |
| Context graph | Decision traces and codified tribal knowledge for retrieval |
| Dev graph | Issues & dependency model that coding agents read and write |
| R&D / ML data layer | Experiments and trials written into branches, versioned for training & eval |
Install
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash
This installs omnigraph (CLI) and omnigraph-server into ~/.local/bin from
published release binaries. Or with Homebrew:
brew tap ModernRelay/tap
brew install ModernRelay/tap/omnigraph
Set it up with an AI agent
Omnigraph is built to be run by coding agents. Two ways in:
Teach your agent the playbook. This repo ships the
omnigraph agent skill: the operational playbook
covering cluster mode, the two config surfaces, schema evolution, query linting,
data writes, branches, Cedar policy, and the common gotchas.
npx skills add ModernRelay/omnigraph@omnigraph
Or have an agent set it up from scratch. Paste this into Claude Code, Codex, or any agent that can read a URL and run a shell command:
Help me set up Omnigraph
1. Read the docs at https://github.com/ModernRelay/omnigraph, starting with
docs/user/clusters/index.md, then docs/user/deployment.md.
2. Skim the starter graphs and seed data in the cookbooks:
https://github.com/ModernRelay/omnigraph-cookbooks
3. Ask me what I want to build (company brain, agent memory, dev graph,
research / R&D layer, …). Then stand up a cluster for it, load a little
data, and run a query so I can see it working.
For ready-to-run graphs with real seed data (company brain, VC operating system,
pharma & industry intel),
ModernRelay/omnigraph-cookbooks
is the fastest way to see Omnigraph shaped to a real domain.
Deploy
A deployment is a cluster: a multigraph config directory that declares
its graphs, schemas, stored queries, and policies as code. You manage it
Terraform-style: cluster plan previews the diff, cluster apply converges
it. omnigraph-server then boots from the cluster and brings every graph online
at /graphs/{id}/…, each behind its own policy.
1. Declare the cluster.
company-brain/
├── cluster.yaml
├── people.pg # schema for the "knowledge" graph
├── queries/ # stored queries: the .gq files ARE the declaration
│ └── people.gq
└── base.policy.yaml # a Cedar policy bundle
# cluster.yaml
version: 1
metadata:
name: company-brain
storage: s3://company/clusters/company-brain # ledger, catalog, and graph data live here
graphs:
knowledge:
schema: people.pg
queries: queries/ # every `query <name>` in queries/*.gq registers
policies:
base:
file: base.policy.yaml
applies_to: [knowledge] # graph-bound; use [cluster] for server-level
2. Stand up your object store. On-prem, run RustFS (or MinIO); Omnigraph
writes Lance to it over the standard S3
API. In the cloud, point the same AWS_* env at S3 / R2 / GCS instead.
3. Converge and run. apply creates each graph, applies its schema, and
publishes queries and policies into the content-addressed catalog. It is
idempotent; re-running is always safe.
omnigraph cluster validate # parse + typecheck everything
omnigraph cluster plan # preview what apply would do
omnigraph cluster apply # converge
# Boot the server from the cluster dir; storage resolves through cluster.yaml
omnigraph-server --cluster company-brain --bind 0.0.0.0:8080
See the cluster guide for the day-2 loop
(edit → plan → apply → restart), approval gates for destructive changes, drift
inspection, and recovery; the deployment guide for
containers, AWS/Railway, auth, and the full AWS_* contract.
Query and mutate
Set a default server and graph once in ~/.omnigraph/config.yaml, and the
everyday commands stay short. Stored queries and mutations run by name:
omnigraph query search_docs --params '{"q":"AI safety"}'
omnigraph mutate add_person --params '{"name":"Mina"}'
# Branch, review, merge across the whole graph; agents write in isolation
omnigraph branch create --from main agent/ingest-42
omnigraph branch merge agent/ingest-42 --into main
An alias is shorter still: bind a server, graph, and stored query to one
name, then omnigraph alias triage runs it. For an ad-hoc target, any command
still takes --server <name|url> --graph <id> (or --store <uri> for a local
graph). See the CLI reference.
Security & governance
- Engine-wide enforcement: every write path goes through the same Cedar gate, so the HTTP server, the CLI, and the embedded SDK obey identical rules.
- Declared in the cluster: a policy bundle is bound to graphs (or the whole server) via
policies:→applies_to. - Scoped: rules apply per graph, per branch, or server-wide.
- No plaintext tokens: bearer tokens are hashed at startup and compared in constant time.
- Forge-proof identity: the actor is resolved server-side from the token; clients can't set it.
See the policy guide.
Clients & SDKs
| Client | Use it for | Where |
|---|---|---|
| TypeScript SDK | typed access from Node / TS | @modernrelay/omnigraph · source |
| MCP server | bridge Omnigraph to LLM hosts (Claude, Codex, …) | @modernrelay/omnigraph-mcp |
| HTTP / OpenAPI | any language, the wire contract | the server's OpenAPI spec |
| Python SDK | typed access from Python | coming soon |
Both npm packages are versioned in lockstep with omnigraph-server.
Local quick test (no server)
1-min setup to try it: an embedded, local file-backed graph (no server, no object store). For dev and experiments; production is the deployed cluster above.
cat > schema.pg <<'PG'
node Signal { slug: String @key, title: String }
node Pattern { slug: String @key, name: String }
edge Indicates: Signal -> Pattern
PG
printf '%s\n' \
'{"type":"Signal","data":{"slug":"s1","title":"OSS model adoption surging"}}' \
'{"type":"Pattern","data":{"slug":"p1","name":"adoption"}}' \
'{"edge":"Indicates","from":"s1","to":"p1"}' > data.jsonl
omnigraph init --schema schema.pg ./graph.omni
omnigraph load --data data.jsonl --mode overwrite --store ./graph.omni
# "What pattern does signal s1 indicate?"
omnigraph query --store ./graph.omni \
-e 'query indicates() { match { $s: Signal { slug: "s1" } $s indicates $p } return { $p.name } }'
# → adoption
Docs
Build And Test
cargo build --workspace
cargo test --workspace
Notes:
- Rust stable toolchain, edition 2024
- CI runs
cargo test --workspace --locked - Full CI and some local test flows require
protobuf-compiler - S3 integration tests expect an S3-compatible endpoint such as RustFS
Workspace Crates
crates/omnigraph-compiler: shared schema/query parser, typechecker, catalog, and IR lowering (zero Lance dependency)crates/omnigraph(packageomnigraph-engine): storage/runtime, branching, merge, change detection, query execution, and embeddingscrates/omnigraph-policy: Cedar policy compilation and enforcementcrates/omnigraph-api-types: shared HTTP wire DTOs used by both the server and the CLIcrates/omnigraph-cluster: cluster config validation, planning, and apply (the control plane)crates/omnigraph-server: Axum HTTP server, cluster-first, runs N graphs under/graphs/{id}/…crates/omnigraph-cli: CLI for graph lifecycle, query/mutate, branch/commit/merge, schema/lint, snapshot/export, cluster control, policy/queries, profiles, and maintenance
Contributing
Please open an issue, spec, or design discussion before sending large code changes. Design feedback and concrete problem statements are the fastest way to collaborate on the roadmap.
Community
Join the Omnigraph Slack community to ask questions, share feedback, and follow development.