Lakehouse-native graph engine with git-style workflows https://omnigraph.dev
Find a file
Andrew Altshuler 74eb5a5380
Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46)
* Parallel per-type load writes + omnigraph optimize/cleanup CLI

## MR-677.3 — parallel per-type load writes

The load path already groups records into one RecordBatch per type and
makes one Lance commit per table (loader::mod.rs:249-..), but those
commits ran sequentially. Wrap node and edge write loops in
`futures::stream::buffered(N)` against a new helper
`write_batches_concurrently`. Concurrency tunable via
`OMNIGRAPH_LOAD_CONCURRENCY` (default 8).

## MR-676 — `omnigraph optimize` and `omnigraph cleanup`

New CLI subcommands that walk every node + edge table in the repo:

- `omnigraph optimize <uri>` — runs Lance `compact_files` on each
  table to merge small fragments into fewer larger ones.
- `omnigraph cleanup <uri> --keep N | --older-than 7d --confirm` —
  runs Lance `cleanup_old_versions` to prune historical manifests +
  unique fragments. Requires `--confirm` because it's destructive.
  Supports both count-based and time-based retention (or both AND'd
  together). Time uses chrono `DateTime<Utc>` (added as a workspace
  dep, default-features off).

Both commands run their per-table loops in parallel (8-way bounded,
`OMNIGRAPH_MAINTENANCE_CONCURRENCY` env override). Smoke-tested
against the 114-table prod graph: optimize went 7m15s sequential
→ 1m28s parallel. cleanup --keep 1 removed 137 historical versions
across 114 tables in 1m57s without disrupting `/healthz` or query
responses.

Public API on `Omnigraph`:

  pub async fn optimize(&mut self) -> Result<Vec<TableOptimizeStats>>
  pub async fn cleanup(&mut self, opts: CleanupPolicyOptions)
      -> Result<Vec<TableCleanupStats>>

All 10 existing loader tests still pass.

Closes MR-676.
Partially addresses MR-677 (the .3 — parallel by type — piece;
MR-677.1 is for the `omnigraph embed` path, not load, since load
doesn't call Gemini directly. .2 was already in place).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate openapi.json

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-25 14:22:14 +03:00
.cargo Raise LANCE_MEM_POOL_SIZE to 1 GB in .cargo/config.toml 2026-04-19 22:27:49 +03:00
.github/workflows Fix release.yml: move HOMEBREW_TAP_TOKEN guard into steps 2026-04-21 19:24:41 +03:00
crates Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46) 2026-04-25 14:22:14 +03:00
docker Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
docs Prepare v0.3.0 release (#44) 2026-04-21 19:11:34 +03:00
scripts Revert "Add opt-in git hook for openapi.json drift" 2026-04-17 16:26:57 +02:00
.dockerignore Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
.gitignore Remove Stainless SDK config 2026-04-17 14:31:35 +02:00
Cargo.lock Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46) 2026-04-25 14:22:14 +03:00
Cargo.toml Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46) 2026-04-25 14:22:14 +03:00
CODE_OF_CONDUCT.md Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
CONTRIBUTING.md Merge remote-tracking branch 'origin/main' into ragnorc/explore-api 2026-04-18 20:24:39 +02:00
Dockerfile Dockerfile: switch base from Docker Hub to ECR Public 2026-04-20 13:46:23 +03:00
LICENSE Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
og-cheet-sheet.md Add query lint and check commands 2026-04-13 00:37:44 +03:00
omnigraph.example.yaml example config: use graphs / cli.graph, matching the MR-603 rename 2026-04-18 23:40:35 +03:00
openapi.json Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46) 2026-04-25 14:22:14 +03:00
README.md Add crates.io badge for omnigraph-cli 2026-04-14 22:48:20 +03:00
rust-toolchain.toml Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00
SECURITY.md Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00

Omnigraph

License: MIT Rust Crates.io CI

Typed graph engine built for reasoning paths, not just storage.
Git-style workflows, schema-as-code graph modeling, S3-optimized.

Use Cases

  • On-prem & hybrid context graphs
  • Backbone for multi-agentic research
  • Enterprise knowledge systems

Quick Install

curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash

This installs omnigraph and omnigraph-server into ~/.local/bin from published release binaries.

Or install with Homebrew:

brew tap ModernRelay/tap
brew install ModernRelay/tap/omnigraph

For starter graphs and agent skills to bootstrap and operate Omnigraph, see ModernRelay/omnigraph-starters.

One-Command Local RustFS Bootstrap

curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/local-rustfs-bootstrap.sh | bash

That bootstrap:

  • starts RustFS on 127.0.0.1:9000
  • creates a bucket and S3-backed repo
  • loads the checked-in context fixture
  • launches omnigraph-server on 127.0.0.1:8080

Docker must be installed and running first.

The RustFS bootstrap prefers the rolling edge binaries and only falls back to source builds when release assets are unavailable.

If a previous run left objects under the same repo prefix but did not finish initializing the repo, rerun with RESET_REPO=1 or set PREFIX to a new value.

Omnigraph CORE

  • Typed schema, typed queries, and typed mutations
  • Schema-as-code, query validation and linting
  • Git-style graph workflows: branches, commits, merges, and transactional runs
  • Local, on-prem & cloud S3-native storage with snapshot-pinned reads
  • Graph traversal + text, fuzzy, BM25, vector, and RRF search in one runtime
  • Policy-as-code for server-side access control
  • Single CLI for multiple deployments

Common Commands

The same URI works for local paths, s3://…, or http://host:port.

omnigraph init   --schema ./schema.pg ./repo.omni
omnigraph load   --data   ./data.jsonl ./repo.omni
omnigraph read   --query  ./queries.gq --name get_person --params '{"name":"Alice"}' ./repo.omni
omnigraph change --query  ./queries.gq --name insert_person --params '{"name":"Mina"}' ./repo.omni
omnigraph branch create --from main feature-x ./repo.omni
omnigraph branch merge  feature-x --into main ./repo.omni

See docs/cli.md for schema apply, snapshots, ingest, runs, and policy commands.

Docs

Build And Test

cargo build --workspace
cargo check --workspace
cargo test --workspace

Notes:

  • Rust stable toolchain, edition 2024
  • CI runs cargo test --workspace --locked
  • Full CI and some local test flows require protobuf-compiler
  • S3 integration tests expect an S3-compatible endpoint such as RustFS

Workspace Crates

  • crates/omnigraph-compiler: shared schema/query parser, typechecker, catalog, and IR lowering
  • crates/omnigraph: storage/runtime, branching, merge, change detection, and query execution
  • crates/omnigraph-cli: CLI for init/load/ingest/read/change/branch/snapshot/export/policy operations
  • crates/omnigraph-server: Axum HTTP server for remote reads, changes, ingest, export, branches, commits, and runs

Contributing

Please open an issue, spec, or design discussion before sending large code changes. Design feedback and concrete problem statements are the fastest way to collaborate on the roadmap.