Lakehouse-native graph engine with git-style workflows https://omnigraph.dev
Find a file
Ragnor Comerford e0d88d1295
fix(unique): collision-free tuple key shared by intake and merge, loud on un-keyable types (#160)
* fix(unique): collision-free tuple key shared by intake and merge, loud on un-keyable types

Hardening on top of #133. That PR introduced a shared
`loader::composite_unique_key(parts)` joining per-column scalars with U+001F
and routed both intake and branch-merge through it, closing the original
'|' vs U+001F separator drift. This takes the shared keying the rest of the
way to correct-by-design:

- Collision-free by construction: the key is now the tuple of per-column
  scalar strings (Vec<String>) keyed directly, no separator, so no data value
  (not even a literal U+001F) can forge a collision.
- One scalar converter across both paths: intake used an explicit type-match,
  merge used Arrow's array_value_to_string. Both now derive the key through
  composite_unique_key(group_columns, row), so they can't drift on conversion.
- Loud on un-keyable types: the scalar converter returned None for any Arrow
  type it didn't recognize, and the caller treated None as null-exempt, so a
  @unique on a column type it couldn't reduce (list, blob) was silently
  un-enforced. It now returns Err, surfacing the constraint it can't enforce
  instead of weakening it in silence.

Tests:
- consistency::composite_unique_key_is_consistent_across_intake_and_merge pins
  that intake and merge key the tuple identically (load-on-branch then merge
  of values containing '|').
- loader unit tests pin tuple keying + null exemption and the loud error on an
  un-keyable (binary) column.

Docs: invariants truth-matrix updated; stale loader/mod.rs line pointers fixed.
Scope unchanged: intra-batch / merge-candidate-set only; cross-version
uniqueness against committed rows stays a documented gap.

* fix(unique): cover all string encodings; make format_tuple private (PR #160 review)

Addresses two Greptile P2 comments on PR #160:

- unique_key_scalar handled only StringArray (Utf8). The loud-on-unknown-type
  behavior turned any legal string column that read back as LargeUtf8 or
  Utf8View into a hard write failure (the old code silently returned None). Add
  LargeStringArray and StringViewArray arms so a legal string column is keyable
  in every physical Arrow encoding; the Err path now fires only for a genuinely
  un-keyable logical type (list/blob/vector), never a legal value in an
  unenumerated encoding.
- format_tuple was pub(crate) but only used within loader/mod.rs; make it a
  private fn (matches the old format_unique_columns it replaced, minimal
  exposed surface).

New unit test unique_key_scalar_handles_all_string_encodings pins that Utf8 /
LargeUtf8 / Utf8View all render rather than error.
2026-06-09 19:28:21 +02:00
.cargo Raise LANCE_MEM_POOL_SIZE to 1 GB in .cargo/config.toml 2026-04-19 22:27:49 +03:00
.context Investigate Lance MergeInsertBuilder CAS granularity (MR-766 prereq) 2026-04-28 23:30:17 +00:00
.github ci(branch-protection): let code owners bypass required PR review (#154) 2026-06-08 22:26:04 +03:00
crates fix(unique): collision-free tuple key shared by intake and merge, loud on un-keyable types (#160) 2026-06-09 19:28:21 +02:00
docker feat(server): compose OMNIGRAPH_TARGET_URI with OMNIGRAPH_CONFIG in entrypoint (#129) 2026-05-30 20:17:55 +01:00
docs fix(unique): collision-free tuple key shared by intake and merge, loud on un-keyable types (#160) 2026-06-09 19:28:21 +02:00
scripts governance: external contribution model (issues/discussions/RFCs/PRs) (#143) 2026-06-06 23:58:08 +03:00
.dockerignore Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
.gitignore release: v0.5.0 (#115) 2026-05-23 13:59:42 +01:00
AGENTS.md Merge origin/main into cluster-config-docs 2026-06-09 18:11:12 +03:00
Cargo.lock feat(engine): indexed graph traversal (#149) 2026-06-09 18:09:13 +02:00
Cargo.toml feat(cluster): add read-only validate and plan 2026-06-08 20:07:39 +03:00
CLAUDE.md Add AGENTS.md as canonical agent guide; symlink CLAUDE.md to it 2026-04-28 23:10:09 +02:00
CODE_OF_CONDUCT.md Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
CONTRIBUTING.md governance: external contribution model (issues/discussions/RFCs/PRs) (#143) 2026-06-06 23:58:08 +03:00
Dockerfile Dockerfile: switch base from Docker Hub to ECR Public 2026-04-20 13:46:23 +03:00
GOVERNANCE.md governance: external contribution model (issues/discussions/RFCs/PRs) (#143) 2026-06-06 23:58:08 +03:00
LICENSE Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
og-cheet-sheet.md feat: inline query strings in CLI and HTTP server (#110) 2026-05-29 13:41:54 +02:00
omnigraph.example.yaml example config: use graphs / cli.graph, matching the MR-603 rename 2026-04-18 23:40:35 +03:00
openapi.json release: v0.6.2 2026-06-09 15:59:59 +02:00
README.md README: link to TypeScript SDK and MCP server (#92) 2026-05-30 14:29:49 +02:00
rust-toolchain.toml Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
SECURITY.md Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00

Omnigraph

License: MIT Rust Crates.io CI

Lakehouse native graph engine built for context assembly

Omnigraph acts as operational state & coordination layer for agents

  • Git-style versioning & branching
  • Multimodal retrieval (graph+vector/fts+filters) optimized for context assembly
  • Object storage native (S3, RustFS)
  • Native blob-as-data support (docs, images, videos, etc)
  • VPC, On-prem, hybrid deployment
  • Lance format as open storage layer
AS CODE What it means
Schema AS CODE Typed .pg schemas, planned, applied, enforced
Context AS CODE Linted queries & agentic nudges, versioned and reusable
Security AS CODE Cedar policies enforced server-side on every mutation
Dashboards AS CODE Declarative views & controls over the graph (coming)

Core Use Cases

Use case What it's for
Company brain Org knowledge unified into one queryable graph
Context graph Decision traces and codified tribal knowledge
Agentic memory Durable, versioned memory for long-running agents
Dev graph Issues & dependency model for coding agents
R&D data layer Experiments & trials data written into branches
ML workflows Versioned, branchable graphs for training & eval
Karpathy's LLM wiki A living, agent-updatable knowledge base

Quick Install

curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash

This installs omnigraph and omnigraph-server into ~/.local/bin from published release binaries.

Or install with Homebrew:

brew tap ModernRelay/tap
brew install ModernRelay/tap/omnigraph

For starter graphs and agent skills to bootstrap and operate Omnigraph, see ModernRelay/omnigraph-cookbooks.

One-Command Local RustFS Bootstrap

curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/local-rustfs-bootstrap.sh | bash

That bootstrap:

  • starts RustFS on 127.0.0.1:9000
  • creates a bucket and S3-backed graph
  • loads the checked-in context fixture
  • launches omnigraph-server on 127.0.0.1:8080

Docker must be installed and running first.

The RustFS bootstrap prefers the rolling edge binaries and only falls back to source builds when release assets are unavailable.

If a previous run left objects under the same graph prefix but did not finish initializing the graph, rerun with RESET_REPO=1 or set PREFIX to a new value.

Common Commands

The same URI works for local paths, s3://…, or http://host:port.

omnigraph init   --schema ./schema.pg ./graph.omni
omnigraph load   --data   ./data.jsonl ./graph.omni
omnigraph read   --query  ./queries.gq --name get_person --params '{"name":"Alice"}' ./graph.omni
omnigraph change --query  ./queries.gq --name insert_person --params '{"name":"Mina"}' ./graph.omni
omnigraph branch create --from main feature-x ./graph.omni
omnigraph branch merge  feature-x --into main ./graph.omni

See docs/user/cli.md for schema apply, snapshots, ingest, commits, and policy commands.

Clients

For programmatic access to a running omnigraph-server:

  • TypeScript SDK@modernrelay/omnigraph (source). Instance-per-client, typed errors, camelCase types, async-iterator streaming export.

    npm install @modernrelay/omnigraph
    
  • Model Context Protocol server@modernrelay/omnigraph-mcp (source). Bridges Omnigraph to LLM hosts (Claude Desktop, Claude Code, …) over stdio. Exposes tools and resources for schema, branches, queries, mutations, ingest, and bundles curated best-practices guidance from the cookbook.

    npm install -g @modernrelay/omnigraph-mcp
    

Both packages are versioned in lockstep with omnigraph-server on major.minor: @modernrelay/omnigraph@X.Y.* targets omnigraph-server@X.Y.*. See ModernRelay/omnigraph-ts for the monorepo.

Docs

Build And Test

cargo build --workspace
cargo check --workspace
cargo test --workspace

Notes:

  • Rust stable toolchain, edition 2024
  • CI runs cargo test --workspace --locked
  • Full CI and some local test flows require protobuf-compiler
  • S3 integration tests expect an S3-compatible endpoint such as RustFS

Workspace Crates

  • crates/omnigraph-compiler: shared schema/query parser, typechecker, catalog, and IR lowering
  • crates/omnigraph: storage/runtime, branching, merge, change detection, and query execution
  • crates/omnigraph-cli: CLI for graph lifecycle (init/load/ingest), query/mutate, branch/commit/merge, schema/lint, snapshot/export, policy, and maintenance (optimize/cleanup)
  • crates/omnigraph-server: Axum HTTP server for remote reads, changes, ingest, export, branches, and commits

Contributing

Please open an issue, spec, or design discussion before sending large code changes. Design feedback and concrete problem statements are the fastest way to collaborate on the roadmap.

Community

Join the Omnigraph Slack community to ask questions, share feedback, and follow development.