* docs(dev): write-latency roadmap (validated cost model + layered fix)
Records the validated 6-LIST warm-write cost model, the two root causes
(un-GC'd _versions/; re-resolving latest by listing), and the layered fix
(GC + capture-once reuse), plus how commit-graph-table retirement feeds in.
Linked from docs/dev/index.md next to the RFC-013 docs.
* feat(engine)!: strand storage versioning — one internal-schema version, no in-place migration
Set MIN_SUPPORTED == CURRENT == 4: this binary reads exactly one `__manifest`
internal-schema version and refuses any older graph on open with a
rebuild-via-export/import message, instead of migrating it in place. Storage
format changes become a deliberate cutover, not a permanently-carried in-place
migration — the pre-release "complexity must be earned" contract.
Delete the entire in-place migration apparatus and everything that existed only
to support it: the `migrate_vN` arms + dispatcher + stamp-bump helpers + the
schema-version-floor tripwire; `migrate_on_open` (both open modes now refuse);
the legacy `_graph_commits.lance` readers + the v3 test fixtures + migration
tests + `migration.v3_to_v4.*` failpoints + the two surface guards that pinned
Lance variants only the migration matched on; and `state::merge_lineage_rows`.
Keep `read_stamp` / `stamp_current_version` / `set_stamp` /
`refuse_if_stamp_unsupported` — the seam a future one-shot converter plugs into.
`load_commit_cache_for_branch` now reads the `__manifest` projection
unconditionally (sub-v4 graphs are refused at open). Adds
`sub_current_graph_is_refused_on_open_with_rebuild_hint`.
The commit-graph TABLES are still created/used as branch-ref ledgers — their
retirement (CommitGraph -> pure `__manifest` projection) is the next commit.
BREAKING CHANGE: a graph created by omnigraph <= 0.7.2 (internal schema v3) is
refused on open. Rebuild it: `omnigraph export` with the old release, then
`omnigraph init` + `omnigraph load` with this one. Data, vectors, and blobs are
preserved; commit history and branches are not.
* feat(engine)!: retire `_graph_commits.lance` / `_graph_commit_actors.lance` — CommitGraph is a pure `__manifest` projection
Since RFC-013 Phase 7, graph lineage lives in `__manifest` (`graph_commit` /
`graph_head` rows) and branch authority is `__manifest` (branch create forks it
first). The two commit-graph datasets were vestigial: `_graph_commit_actors.lance`
was never written or read; `_graph_commits.lance` carried zero commit rows and
only mirrored the manifest's branch refs (a deny-list "parallel copy"). Retire
both.
- `CommitGraph` collapses to a pure projection: drops its Lance dataset handles
(`dataset`/`actor_dataset`) and all branch methods; `open`/`open_at_branch`/
`refresh`/`init` open NO dataset, building the cache from
`ManifestCoordinator::read_graph_lineage_at`. Removes ~1.4s of cold-open
dataset opens.
- `graph_coordinator`: `commit_graph` is now non-`Option` (always a valid
projection). `branch_create`/`branch_delete` go through `ManifestCoordinator`
only — a single atomic op, replacing the two-step manifest-fork +
commit-graph-fork + rollback. Deleted `create_commit_graph_branch`,
`reclaim_commit_graph_branch`, `ensure_commit_graph_initialized`, and every
`storage.exists(_graph_commits.lance)` gate.
- `optimize`: dropped `reconcile_commit_graph_orphans` and the two tables from
the internal-table compaction set (now `__manifest` only).
- `instrumentation`: `INTERNAL_TABLE_DIRS` no longer lists the two tables.
- Fresh graphs create neither table; `lineage_projection.rs` now asserts both
`.lance` dirs are absent. Deleted the obsolete commit-graph-branch-race
failpoint tests + their failpoint names, and updated the `maintenance`
optimize tests (one internal table, not three).
Review-pass fixes folded in:
- Removed two stale `omnigraph.rs` in-source tests the prior run missed (a
disk-full link failure masked them): one asserting `open` probes
`_graph_commits.lance` (the exists-gate this commit removes) — it was masked
earlier by a disk-full link failure.
- Corrected src comments referencing deleted code (`migrate_v3_to_v4`,
`append_commit`/`append_merge_commit`, the three-internal-table list,
the `_graph_commits` reconcile owner) in publisher/recovery/optimize/recovery_audit.
- Narrowed `set_stamp_for_test` to `cfg(test)` (its only caller is the refusal
test) — removes a dead-code warning in the failpoints build.
Branch create/delete atomicity improves (single atomic `__manifest` op). No
behavior change for reads or branches.
Follow-up (separate commit): the now-always-0 `IoCounts::commit_graph_reads` test
counter + its `IOTracker`, threaded through ~11 cost-test files.
* feat: surface the internal-schema (storage-format) version to operators
After stranding storage versioning (a sub-v4 graph is refused on open), operators
could only discover the storage-format version by hitting a refusal. Surface it:
- `omnigraph version` prints an `internal-schema <N>` line (the binary's CURRENT
storage-format version).
- `omnigraph snapshot` includes `internal_schema_version` — the GRAPH's per-branch
on-disk stamp, read via the new `Omnigraph::internal_schema_version_of`.
- `GET /healthz` includes `internal_schema_version` (server-scoped: the binary's
CURRENT, alongside `version`/`source_version`).
Wire: re-expose `INTERNAL_MANIFEST_SCHEMA_VERSION` as `pub` on `db::manifest`;
add `internal_schema_version: u32` to `SnapshotOutput` + `HealthOutput`;
`snapshot_payload` takes the per-graph version (the `Snapshot` does not carry it),
threaded through the embedded CLI + server snapshot callers. `openapi.json`
regenerated (two added int32 properties). Extends the existing healthz / snapshot /
version tests.
* docs(engine): gate internal-schema version at the graph level; record the per-branch read gap
PR reviewers flagged that the open path validates only main's internal-schema stamp, so a branch read could decode a branch stamped outside this binary's range. The stamp is a graph-wide storage-format property (the upgrade path is a whole-graph export/import), so with one binary version every branch is always CURRENT; divergence needs concurrent multi-version writers, an unsupported topology already in one-winner-CAS territory. Gating per-branch would add a second __manifest open per non-main branch read to defend a state we do not support, unearned complexity that regresses the warm-read budget.
Keep the graph-level gate, document it at the code site (refuse_if_internal_schema_unsupported), and record the read-only residual hole as a known gap in invariants.md to close only when multi-version write topologies become supported. Also clarify the sub-floor rebuild message to say "export with the older omnigraph binary that created it."
No behavior change: HEAD already gated at the graph level.
* test(cost): remove the dead commit_graph_reads IO counter
Phase B retired _graph_commits.lance / _graph_commit_actors.lance, so no commit-graph dataset is opened and the commit_graph IOTracker term is structurally always 0. Remove IoCounts::commit_graph_reads, its total_reads() term, the commit_graph IOTracker in OpProbes, and the now-dead commit_graph_wrapper field on QueryIoProbes (it had no accessor — nothing ever attached it). Drop the 7 trivially-true assert_eq!(commit_graph_reads, 0) checks in warm_read_cost.rs and the debug-print refs in write_cost{,_s3}.rs.
Lineage and actor rows now live in __manifest (RFC-013 Phase 7), so the internal_table_scans_are_flat_in_history gate folds into the single manifest_reads flat-assertion — the manifest scan already covers them. Harness-only; no production runtime impact.
* docs: align with the commit-graph retirement + strand storage versioning
Update the always-loaded and user-facing docs to match the landed state: graph lineage lives in __manifest, the _graph_commits.lance / _graph_commit_actors.lance tables are retired, and storage is strict-single-version (no in-place migration — a sub-CURRENT graph is refused with an export/import rebuild).
Fixed stale claims in invariants.md (the migration/atomicity known-gap entry, the Truth Matrix branch-delete row, the read-path/optimize internal-table scope), lance.md (the migrate_v1_to_v2 PK bullet now reflects init-time set; removed the two deleted v3->v4 migration surface guards), testing.md (dropped the deleted migration failpoint tests; manifest-only internal-table term), writes.md (rewrote the Migration-code section to the strand model), storage.md / maintenance.md / constants.md (retired tables out of the layout, internal-table compaction scope, and the constants cheat-sheet), and AGENTS.md. Marked the retirement DONE in the RFC-013 handoff/roadmap and banner-noted the historical RFC analysis.
Added docs/user/operations/upgrade.md (the export/import rebuild recipe) and docs/dev/versioning.md (the four-axis compatibility policy: release lockstep / wire additive / storage strict-single-version / Lance pinned), cross-linked from the audience indexes and the AGENTS.md topic map, and rewrote the in-progress v0.8.0 release note for the strand model + version surfacing. check-agents-md.sh passes (65 links, 62 docs).
* test(manifest): cover the v3-refusal→export/import rebuild cycle and branch stamp inheritance
Two coverage additions from PR review (P1):
(a) sub_current_graph_is_refused_then_rebuilt_via_export_import — the full operator narrative in one flow: load → export → a sub-CURRENT graph (stamp rewound below CURRENT) is refused with the export nudge → fresh init + load(export) → data present and the rebuilt graph opens. The refusal is stamp-only (read before any data), so a stamp-rewound graph is a faithful stand-in for a real older-release graph without a second binary; vector/blob fidelity stays covered by tests/export.rs.
(b) branch_inherits_main_internal_schema_stamp — proves a branch cannot diverge from main's stamp under single-binary operation (create_branch forks main's __manifest, the publisher does not re-stamp), which is why the graph-level (main-only) stamp gate is sufficient for supported inputs. A divergent branch stamp needs concurrent multi-version writers, the unsupported topology recorded as a known gap.
|
||
|---|---|---|
| .cargo | ||
| .context | ||
| .github | ||
| assets | ||
| crates | ||
| docker | ||
| docs | ||
| scripts | ||
| skills/omnigraph | ||
| .dockerignore | ||
| .gitignore | ||
| AGENTS.md | ||
| Cargo.lock | ||
| Cargo.toml | ||
| CLAUDE.md | ||
| CODE_OF_CONDUCT.md | ||
| CONTRIBUTING.md | ||
| Dockerfile | ||
| GOVERNANCE.md | ||
| LICENSE | ||
| og-cheet-sheet.md | ||
| omnigraph.example.yaml | ||
| openapi.json | ||
| README.md | ||
| rust-toolchain.toml | ||
| SECURITY.md | ||
Lakehouse graph database for context assembly & multi-agent coordination
Multimodal retrieval · Git-style branching · object-storage native
Quickstart · Docs · Cookbooks · CLI
Omnigraph is the operational state and coordination layer for fleets of agents.
Run it as a server, declared as code; hundreds of agents operate and enrich the graph on parallel isolated branches, and every change is reviewed and merged safely.
Key capabilities
| Capability | What it gives you |
|---|---|
| Declared as code | A cluster.yaml declares graphs, schemas, stored queries, embedding providers, and policies; cluster apply converges it and omnigraph-server brings every graph online at /graphs/{id}/…. |
| Built for fleets of agents | Hundreds of agents enrich the graph on parallel isolated branches; changes are reviewed and merged safely, Git-style, across the whole graph. |
| Multimodal retrieval | Graph traversal + vector ANN + full-text + Reciprocal Rank Fusion in one query runtime, for context assembly. |
| Security as code | Cedar policy enforced server-side on every mutation, per-graph and server-wide; bearer auth; actor/audit tracking. |
| Runs on your infrastructure | Any S3-compatible object store: on-prem via RustFS / MinIO, or AWS S3 / R2 / GCS. VPC, on-prem, hybrid; your data never leaves your store. |
| Open, versioned storage | Lance columnar format: branchable, time-travelable, with native blob-as-data (docs, images, video). |
What you can build
| Use case | What it's for |
|---|---|
| Company brain | Org knowledge unified into one graph every agent can query |
| Agentic memory | Durable, versioned memory: a branch per agent or per task, merged on review |
| Context graph | Decision traces and codified tribal knowledge for retrieval |
| Dev graph | Issues & dependency model that coding agents read and write |
| R&D / ML data layer | Experiments and trials written into branches, versioned for training & eval |
Install
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash
This installs omnigraph (CLI) and omnigraph-server into ~/.local/bin from
published release binaries. Or with Homebrew:
brew tap ModernRelay/tap
brew install ModernRelay/tap/omnigraph
Set it up with an AI agent
Omnigraph is built to be run by coding agents. Two ways in:
Teach your agent the playbook. This repo ships the
omnigraph agent skill: the operational playbook
covering cluster mode, the two config surfaces, schema evolution, query linting,
data writes, branches, Cedar policy, and the common gotchas.
npx skills add ModernRelay/omnigraph@omnigraph
Or have an agent set it up from scratch. Paste this into Claude Code, Codex, or any agent that can read a URL and run a shell command:
Help me set up Omnigraph
1. Read the docs at https://github.com/ModernRelay/omnigraph, starting with
docs/user/clusters/index.md, then docs/user/deployment.md.
2. Skim the starter graphs and seed data in the cookbooks:
https://github.com/ModernRelay/omnigraph-cookbooks
3. Ask me what I want to build (company brain, agent memory, dev graph,
research / R&D layer, …). Then stand up a cluster for it, load a little
data, and run a query so I can see it working.
For ready-to-run graphs with real seed data (company brain, VC operating system,
pharma & industry intel),
ModernRelay/omnigraph-cookbooks
is the fastest way to see Omnigraph shaped to a real domain.
Deploy
A deployment is a cluster: a multigraph config directory that declares
its graphs, schemas, stored queries, and policies as code. You manage it
Terraform-style: cluster plan previews the diff, cluster apply converges
it. omnigraph-server then boots from the cluster and brings every graph online
at /graphs/{id}/…, each behind its own policy.
1. Declare the cluster.
company-brain/
├── cluster.yaml
├── people.pg # schema for the "knowledge" graph
├── queries/ # stored queries: the .gq files ARE the declaration
│ └── people.gq
└── base.policy.yaml # a Cedar policy bundle
# cluster.yaml
version: 1
metadata:
name: company-brain
storage: s3://company/clusters/company-brain # ledger, catalog, and graph data live here
graphs:
knowledge:
schema: people.pg
queries: queries/ # every `query <name>` in queries/*.gq registers
policies:
base:
file: base.policy.yaml
applies_to: [knowledge] # graph-bound; use [cluster] for server-level
2. Stand up your object store. On-prem, run RustFS (or MinIO); Omnigraph
writes Lance to it over the standard S3
API. In the cloud, point the same AWS_* env at S3 / R2 / GCS instead.
3. Converge and run. apply creates each graph, applies its schema, and
publishes queries and policies into the content-addressed catalog. It is
idempotent; re-running is always safe.
omnigraph cluster validate # parse + typecheck everything
omnigraph cluster plan # preview what apply would do
omnigraph cluster apply # converge
# Boot the server from the cluster dir; storage resolves through cluster.yaml
omnigraph-server --cluster company-brain --bind 0.0.0.0:8080
See the cluster guide for the day-2 loop
(edit → plan → apply → restart), approval gates for destructive changes, drift
inspection, and recovery; the deployment guide for
containers, AWS/Railway, auth, and the full AWS_* contract.
Query and mutate
Set a default server and graph once in ~/.omnigraph/config.yaml, and the
everyday commands stay short. Stored queries and mutations run by name:
omnigraph query search_docs --params '{"q":"AI safety"}'
omnigraph mutate add_person --params '{"name":"Mina"}'
# Branch, review, merge across the whole graph; agents write in isolation
omnigraph branch create --from main agent/ingest-42
omnigraph branch merge agent/ingest-42 --into main
An alias is shorter still: bind a server, graph, and stored query to one
name, then omnigraph alias triage runs it. For an ad-hoc target, any command
still takes --server <name|url> --graph <id> (or --store <uri> for a local
graph). See the CLI reference.
Security & governance
- Engine-wide enforcement: every write path goes through the same Cedar gate, so the HTTP server, the CLI, and the embedded SDK obey identical rules.
- Declared in the cluster: a policy bundle is bound to graphs (or the whole server) via
policies:→applies_to. - Scoped: rules apply per graph, per branch, or server-wide.
- No plaintext tokens: bearer tokens are hashed at startup and compared in constant time.
- Forge-proof identity: the actor is resolved server-side from the token; clients can't set it.
See the policy guide.
Clients & SDKs
| Client | Use it for | Where |
|---|---|---|
| TypeScript SDK | typed access from Node / TS | @modernrelay/omnigraph · source |
| MCP server | bridge Omnigraph to LLM hosts (Claude, Codex, …) | @modernrelay/omnigraph-mcp |
| HTTP / OpenAPI | any language, the wire contract | the server's OpenAPI spec |
| Python SDK | typed access from Python | coming soon |
Both npm packages are versioned in lockstep with omnigraph-server.
Local quick test (no server)
1-min setup to try it: an embedded, local file-backed graph (no server, no object store). For dev and experiments; production is the deployed cluster above.
cat > schema.pg <<'PG'
node Signal { slug: String @key, title: String }
node Pattern { slug: String @key, name: String }
edge Indicates: Signal -> Pattern
PG
printf '%s\n' \
'{"type":"Signal","data":{"slug":"s1","title":"OSS model adoption surging"}}' \
'{"type":"Pattern","data":{"slug":"p1","name":"adoption"}}' \
'{"edge":"Indicates","from":"s1","to":"p1"}' > data.jsonl
omnigraph init --schema schema.pg ./graph.omni
omnigraph load --data data.jsonl --mode overwrite --store ./graph.omni
# "What pattern does signal s1 indicate?"
omnigraph query --store ./graph.omni \
-e 'query indicates() { match { $s: Signal { slug: "s1" } $s indicates $p } return { $p.name } }'
# → adoption
Docs
Build And Test
cargo build --workspace
cargo test --workspace
Notes:
- Rust stable toolchain, edition 2024
- CI runs
cargo test --workspace --locked - Full CI and some local test flows require
protobuf-compiler - S3 integration tests expect an S3-compatible endpoint such as RustFS
Workspace Crates
crates/omnigraph-compiler: shared schema/query parser, typechecker, catalog, and IR lowering (zero Lance dependency)crates/omnigraph(packageomnigraph-engine): storage/runtime, branching, merge, change detection, query execution, and embeddingscrates/omnigraph-policy: Cedar policy compilation and enforcementcrates/omnigraph-api-types: shared HTTP wire DTOs used by both the server and the CLIcrates/omnigraph-cluster: cluster config validation, planning, and apply (the control plane)crates/omnigraph-server: Axum HTTP server, cluster-first, runs N graphs under/graphs/{id}/…crates/omnigraph-cli: CLI for graph lifecycle, query/mutate, branch/commit/merge, schema/lint, snapshot/export, cluster control, policy/queries, profiles, and maintenance
Contributing
Please open an issue, spec, or design discussion before sending large code changes. Design feedback and concrete problem statements are the fastest way to collaborate on the roadmap.
Community
Join the Omnigraph Slack community to ask questions, share feedback, and follow development.