Concepts
OmniGraph is a typed property-graph engine built as a coordination layer over the Lance columnar storage format. It gives you a schema-checked graph with vector, full-text, and graph queries in one runtime, plus Git-style branches and commits across the whole graph.
The data model
- A graph has node types and edge types, declared in a schema.
- Each node type and each edge type is stored as its own Lance dataset — columnar, versioned, on local disk or object storage.
- A single __manifest table coordinates all of those datasets, so the graph has one coherent version even though it spans many datasets.
This split is what lets a graph commit be atomic across every type at once: a publish flips every relevant dataset's version together in one manifest write, so readers never see a half-applied change. See storage for the layout.
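The idea of a single manifest write can be illustrated with a toy model. This is a minimal sketch, not OmniGraph's implementation: the `Manifest` class, its methods, and the dataset names such as `node:Person` are all hypothetical, and an in-memory reference swap stands in for the real manifest commit.

```python
import threading
from types import MappingProxyType


class Manifest:
    """Toy stand-in for a manifest table: dataset name -> pinned version."""

    def __init__(self, versions):
        self._lock = threading.Lock()
        self._current = MappingProxyType(dict(versions))

    def snapshot(self):
        # Readers take one immutable mapping; later publishes never mutate it.
        return self._current

    def publish(self, updates):
        # Build the next manifest off to the side, then swap it in with a
        # single reference assignment (the analogue of one manifest write).
        with self._lock:
            merged = dict(self._current)
            merged.update(updates)
            self._current = MappingProxyType(merged)


# Hypothetical dataset names and versions, for illustration only.
manifest = Manifest({"node:Person": 3, "node:Company": 5, "edge:WorksAt": 2})
before = manifest.snapshot()
manifest.publish({"node:Person": 4, "edge:WorksAt": 3})
after = manifest.snapshot()
```

A reader holding `before` keeps seeing the old versions of every dataset, while a reader that calls `snapshot()` after the publish sees all the new ones. No reader can observe `node:Person` at version 4 alongside `edge:WorksAt` at version 2.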
Two layers: inherited vs. added
Throughout the docs, capabilities are framed as L1 (inherited from Lance) or L2 (added by OmniGraph):
| | L1 (from Lance) | L2 (added by OmniGraph) |
|---|---|---|
| Storage | Columnar Arrow datasets on object storage | Per-type datasets coordinated as one graph |
| Versioning | Per-dataset versions + time travel | Snapshots across all types at once |
| Branches | Per-dataset branches | Graph-level branches, atomic across types |
| Commits | Per-dataset commits | Commit DAG for the whole graph; three-way merge |
| Indexes | Scalar / vector / full-text indexes | Built per relevant column; graph topology index for traversal |
| Search | Vector + full-text primitives | nearest / bm25 / rrf in one query, plus graph traversal |
| Querying | — | The .gq query language and .pg schema language |
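To make the Search row more concrete, here is the textbook form of reciprocal rank fusion, the technique that rrf is named after. This is a generic sketch, and it is an assumption that OmniGraph's rrf follows the same formula. The function signature, the constant `k=60` (a commonly used default), and the node ids are illustrative and not taken from the OmniGraph code.

```python
from collections import defaultdict


def rrf(rankings, k=60):
    """Fuse several ranked id lists with reciprocal rank fusion.

    Each list contributes 1 / (k + rank) to an id's score, with rank
    starting at 1 for the best hit in that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))


vector_hits = ["n7", "n2", "n9"]  # hypothetical ids from a vector search
text_hits = ["n2", "n4", "n7"]    # hypothetical ids from a full-text search
fused = rrf([vector_hits, text_hits])
```

Ids that appear in both lists (`n2`, `n7`) rise above ids that appear in only one, which is the point of fusing a vector ranking with a full-text ranking.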
How the pieces fit
- The schema (.pg) and query (.gq) languages are compiled to a typed intermediate representation.
- The engine runs queries and mutations against Lance, coordinates the manifest, maintains the commit graph, and builds indexes.
- The CLI (omnigraph) and the HTTP server (operations/server.md) are two front ends over the same engine, so embedded and remote behavior match.
- Cedar policy enforcement is engine-wide: every writer goes through the same authorization gate regardless of front end.
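The "two front ends, one gate" arrangement can be sketched as follows. Everything here is hypothetical: the `Engine` class, the two front-end functions, and the actor names are invented for illustration, and a plain allowlist stands in for Cedar policy evaluation.

```python
class Denied(Exception):
    """Raised when the authorization gate rejects a writer."""


class Engine:
    """Hypothetical engine with a single authorization gate for writes."""

    def __init__(self, allowed_writers):
        # A plain allowlist stands in for real policy evaluation.
        self._allowed = set(allowed_writers)
        self.rows = []

    def write(self, actor, row):
        # Every front end funnels through this one check.
        if actor not in self._allowed:
            raise Denied(actor)
        self.rows.append((actor, row))


def cli_insert(engine, actor, row):
    # CLI-style front end: the actor is passed in directly.
    engine.write(actor, row)


def http_insert(engine, token_to_actor, token, row):
    # Server-style front end: the actor is resolved from a token first.
    engine.write(token_to_actor[token], row)


engine = Engine(allowed_writers={"alice"})
tokens = {"tok-a": "alice", "tok-m": "mallory"}
cli_insert(engine, "alice", {"name": "x"})
http_insert(engine, tokens, "tok-a", {"name": "y"})
```

Because neither front end performs its own check, a writer that is rejected through one path is rejected through the other as well.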
For deployment-scale topics — multi-graph servers, control-plane operations, recovery — see clusters.