omnigraph/docs/user/concepts/index.md
Andrew Altshuler 612741b387
docs(user): split language/branching pages + add front-door pages (Phase 2) (#225)
Content build-out on top of the Phase 1 topic move. No behavior changes.

Splits (existing content relocated, cross-linked):
- queries/index.md → mutations/index.md (insert/update/delete + the
  inserts-vs-deletes rule) and search/index.md (the multi-modal search
  functions + a hybrid-ranking overview tying nearest/bm25/rrf together).
  queries/index.md now covers the read shape and points at both.
- branching/index.md → branching/time-travel.md (snapshots/time travel) and
  branching/merge.md (three-way merge + the 7 conflict kinds, verified against
  error.rs MergeConflictKind).

New pages (written from the code, user-facing):
- quickstart.md — init → load → query → branch, with verified CLI flags.
- concepts/index.md — what OmniGraph is + the L1/L2 (Lance/OmniGraph) framing.

Expanded operations/audit.md from a 7-line struct dump into a real
actor-tracking page (server token-resolved vs CLI --as chain; reading the
trail; the omnigraph:recovery reserved actor).

Index wiring: docs/user/index.md and AGENTS.md's topic table link every new
page; also normalized AGENTS.md's docs/user link display text to match the
Phase 1 retargeted paths.

Verified: zero broken .md links; check-agents-md.sh green (57 links, 54 docs).

Deferred to Phase 3: de-dev polish (grammar paths, IR internals still in
queries/branching), guides/, and a possible reference/config.md split (the
config schema is already coherent in cli/reference.md).

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-14 13:53:46 +03:00


Concepts

OmniGraph is a typed property-graph engine built as a coordination layer over the Lance columnar storage format. It gives you a schema-checked graph with vector, full-text, and graph queries in one runtime, plus Git-style branches and commits across the whole graph.

The data model

  • A graph has node types and edge types, declared in a schema.
  • Each node type and each edge type is stored as its own Lance dataset — columnar, versioned, on local disk or object storage.
  • A single __manifest table coordinates all of those datasets, so the graph has one coherent version even though it spans many datasets.

This split is what lets a graph commit be atomic across every type at once: a publish flips every relevant dataset's version together in one manifest write, so readers never see a half-applied change. See storage for the layout.
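The manifest-coordinated publish described above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: `Dataset`, `Graph`, `snapshot`, `read`, and `commit` are hypothetical names invented for illustration, not OmniGraph or Lance APIs, and a real manifest write is a durable storage operation rather than a Python assignment.

```python
from dataclasses import dataclass, field
from types import MappingProxyType


@dataclass
class Dataset:
    """Toy stand-in for one versioned per-type dataset (hypothetical)."""
    versions: list = field(default_factory=lambda: [[]])  # version 0 is empty

    def write(self, rows):
        """Append a new version (old rows plus `rows`) and return its number."""
        self.versions.append(self.versions[-1] + list(rows))
        return len(self.versions) - 1


class Graph:
    """Toy graph: many datasets, one manifest pinning a version of each."""

    def __init__(self, type_names):
        self.datasets = {name: Dataset() for name in type_names}
        # Read-only mapping of dataset name -> pinned version.
        self.manifest = MappingProxyType({name: 0 for name in type_names})

    def snapshot(self):
        """Capture the current manifest; it never changes afterwards."""
        return self.manifest

    def read(self, snapshot, name):
        """Read a dataset at the version the snapshot pins."""
        return self.datasets[name].versions[snapshot[name]]

    def commit(self, changes):
        """Stage new dataset versions, then publish them in one manifest swap."""
        staged = {
            name: self.datasets[name].write(rows)
            for name, rows in changes.items()
        }
        # Staged versions are invisible until this single reassignment.
        self.manifest = MappingProxyType({**self.manifest, **staged})


graph = Graph(["Person", "Knows"])
before = graph.snapshot()
graph.commit({"Person": [{"id": 1}], "Knows": [{"src": 1, "dst": 1}]})
after = graph.snapshot()
```

A reader holding `before` still sees both types empty, while a reader holding `after` sees both changes. In this model no snapshot can pin the new `Person` version together with the old `Knows` version.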

Two layers: inherited vs. added

Throughout the docs, capabilities are framed as L1 (inherited from Lance) or L2 (added by OmniGraph):

| | L1 (from Lance) | L2 (added by OmniGraph) |
|---|---|---|
| Storage | Columnar Arrow datasets on object storage | Per-type datasets coordinated as one graph |
| Versioning | Per-dataset versions + time travel | Snapshots across all types at once |
| Branches | Per-dataset branches | Graph-level branches, atomic across types |
| Commits | Per-dataset commits | Commit DAG for the whole graph; three-way merge |
| Indexes | Scalar / vector / full-text indexes | Built per relevant column; graph topology index for traversal |
| Search | Vector + full-text primitives | nearest / bm25 / rrf in one query, plus graph traversal |
| Querying | | The .gq query language and .pg schema language |
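To make the Search row more concrete, here is a small sketch of reciprocal rank fusion, the general technique the name `rrf` usually refers to. It assumes the textbook formulation, where each result scores the sum of `1 / (k + rank)` over the lists it appears in, with the conventional `k = 60`. How OmniGraph's own `rrf` is parameterized is not stated here, so treat the function, its signature, and the sample ids as illustrative only.

```python
def rrf(rankings, k=60):
    """Fuse several ranked lists of ids with reciprocal rank fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, assumed here.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical result ids from a vector search and a full-text search.
vector_hits = ["a", "b", "c"]
text_hits = ["b", "d", "a"]
fused = rrf([vector_hits, text_hits])
```

Ids that rank well in both lists (`a` and `b` here) rise above ids that appear in only one, which is the point of fusing vector and full-text results by rank rather than by raw score.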

How the pieces fit

  • The schema (.pg) and query (.gq) languages are compiled to a typed intermediate representation.
  • The engine runs queries and mutations against Lance, coordinates the manifest, maintains the commit graph, and builds indexes.
  • The CLI (omnigraph) and the HTTP server (operations/server.md) are two front ends over the same engine, so embedded and remote behavior match.
  • Cedar policy enforcement is engine-wide — every writer goes through the same authorization gate regardless of front end.
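The arrangement above, with two front ends sharing one engine and one authorization gate, might be sketched as follows. Everything here is hypothetical: `Engine`, `Denied`, `cli_insert`, and `http_insert` are invented names, and the `allow` callable is a stand-in for policy evaluation, not the Cedar API.

```python
class Denied(Exception):
    """Raised when the policy check rejects a write."""


class Engine:
    """Toy engine: every write passes through one authorization gate."""

    def __init__(self, allow):
        self.allow = allow  # callable(actor, action) -> bool (policy stand-in)
        self.rows = []

    def insert(self, actor, row):
        # The gate lives in the engine, so no front end can bypass it.
        if not self.allow(actor, "insert"):
            raise Denied(f"{actor} may not insert")
        self.rows.append(row)


def cli_insert(engine, as_actor, row):
    """CLI-style front end: the actor is supplied explicitly."""
    engine.insert(as_actor, row)


def http_insert(engine, tokens, token, row):
    """Server-style front end: the actor is resolved from a token."""
    engine.insert(tokens[token], row)


engine = Engine(allow=lambda actor, action: actor == "alice")
tokens = {"t-alice": "alice", "t-bob": "bob"}  # made-up tokens

cli_insert(engine, "alice", {"id": 1})
http_insert(engine, tokens, "t-alice", {"id": 2})

try:
    http_insert(engine, tokens, "t-bob", {"id": 3})
    bob_denied = False
except Denied:
    bob_denied = True
```

Because the check sits inside `Engine.insert` rather than in either front end, a request that would be denied over HTTP is denied from the CLI as well.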

For deployment-scale topics — multi-graph servers, control-plane operations, recovery — see clusters.