docs(readme): server-first rewrite — deploy, agents, on-prem RustFS (#269)

* docs(readme): server-first rewrite — deploy, agents, on-prem RustFS

Reframe the README around the actual product: a self-hostable, multigraph
graph server for context assembly and multi-agent coordination, deployed
Terraform-style on your own object storage (on-prem via RustFS, or S3/R2/GCS).

- Lead with key capabilities and what you can build, not a local toy.
- Promote "Drive it with an AI agent" (skill + a docs-first setup prompt) above
  the manual deploy walkthrough — agents are the primary operator.
- "Deploy" is the hero: cluster.yaml → object store → validate/plan/apply →
  omnigraph-server, with RustFS as the on-prem path front and center.
- "Query and mutate": stored queries by name + branch/review/merge.
- Security & governance as scannable bullets; Clients & SDKs as a table.
- Embedded local graph demoted to a clearly-labeled "quick test" (Signal →
  Indicates → Pattern), explicitly dev/experiment-only.
- Drop the "serve/served/serving" vocabulary tic in favor of deploy/run.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(readme): add the server boot command to Deploy §3 (Greptile P1)

The "Converge and run" step showed only the converge half — the code block
ended at `cluster apply` with no `omnigraph-server` command, leaving a linear
reader without a way to actually start the server. Add the boot line.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(readme): simplify the server boot command

Drop the inline OMNIGRAPH_SERVER_BEARER_TOKEN prefix from the Deploy hero —
the example cluster declares a policy so the server boots without it, and
bearer auth is covered in Security & governance. Leaves a single clean line.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(readme): boot the server from the cluster dir, not a raw s3:// URI

Pointing --cluster at the bucket hardcodes the storage URI in the run command.
Boot from the config directory instead; the storage URI lives once in
cluster.yaml and the server resolves it — single source of truth, and
consistent with the cluster apply commands above.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Andrew Altshuler 2026-06-17 00:41:10 +03:00 committed by GitHub
parent e510937a7e
commit 8cb127d8fa

README.md

@@ -4,197 +4,214 @@
[![Rust](https://img.shields.io/badge/rust-stable-orange.svg)](rust-toolchain.toml)
[![Crates.io](https://img.shields.io/crates/v/omnigraph-cli.svg)](https://crates.io/crates/omnigraph-cli)
**Lakehouse native graph engine built for context assembly**
**Lakehouse graph db for context assembly & multi-agent coordination**
Multimodal retrieval, Git-style branching, object storage-native
Omnigraph acts as operational state & coordination layer for agents.
Hundreds of agents can enrich the graph on parallel isolated branches and changes can be reviewed and merged safely.
Omnigraph is the operational state and coordination layer for fleets of agents.
Run it as a server, declared as code; hundreds of agents operate and enrich the graph on
parallel isolated branches, and every change is reviewed and merged safely.
- Git-style versioning & branching
- Multimodal retrieval (graph+vector/fts+filters) optimized for context assembly
- Runs on the local filesystem or any S3-compatible object store (AWS S3, R2, MinIO, RustFS)
- Native blob-as-data support (docs, images, videos, etc)
- VPC, On-prem, hybrid deployment
- [`Lance`](https://github.com/lance-format/lance) format as open storage layer
## Key capabilities
| AS CODE | What it means |
- **A graph server you run, declared as code** — a `cluster.yaml` declares graphs, schemas, stored queries, embedding providers, and policies. `cluster apply` converges it; `omnigraph-server` boots from it and brings every graph online at `/graphs/{id}/…`.
- **Built for fleets of agents** — hundreds of agents enrich the graph on **parallel isolated branches**; changes are reviewed and merged safely, Git-style, across the whole graph.
- **Multimodal retrieval for context assembly** — graph traversal + vector ANN + full-text + Reciprocal Rank Fusion in **one** query runtime.
- **Security as code** — Cedar policy enforced **server-side on every mutation**, per-graph and server-wide; bearer auth; actor/audit tracking.
- **Runs on your infrastructure** — any S3-compatible object store: **on-prem via RustFS / MinIO**, or AWS S3 / R2 / GCS. VPC, on-prem, hybrid — your data never leaves your store.
- **Open, versioned storage** — [`Lance`](https://github.com/lance-format/lance) columnar format: branchable, time-travelable, with native blob-as-data (docs, images, video).
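Reciprocal Rank Fusion, named in the retrieval bullet above, can be sketched in a few lines. This is a minimal illustration of the general technique, not Omnigraph's query runtime: the function name, the sample ids, and the constant `k=60` (the conventional default for RRF) are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores 1 / (k + rank) in every list it appears in (rank starts
    at 1); the fused score is the sum over all lists.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from three retrieval modes.
vector_hits = ["doc-a", "doc-b", "doc-c"]
fulltext_hits = ["doc-b", "doc-d", "doc-a"]
graph_hits = ["doc-b", "doc-a"]

fused = reciprocal_rank_fusion([vector_hits, fulltext_hits, graph_hits])
print(fused)
# → ['doc-b', 'doc-a', 'doc-d', 'doc-c']
```

Because RRF uses only ranks, not raw scores, it can combine result lists whose scores are not comparable (vector distance, full-text relevance, graph hops).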
## What you can build
| Use case | What it's for |
|---|---|
| **Schema AS CODE** | Typed `.pg` schemas, planned, applied, enforced |
| **Context AS CODE** | Linted queries & agentic nudges, versioned and reusable |
| **Security AS CODE** | Cedar policies enforced server-side on every mutation |
| **Dashboards AS CODE** | Declarative views & controls over the graph *(coming)* |
| **Company brain** | Org knowledge unified into one graph every agent can query |
| **Agentic memory** | Durable, versioned memory — a branch per agent or per task, merged on review |
| **Context graph** | Decision traces and codified tribal knowledge for retrieval |
| **Dev graph** | Issues & dependency model that coding agents read and write |
| **R&D / ML data layer** | Experiments and trials written into branches, versioned for training & eval |
## Core Use Cases
| Use case | What it's for |
|---|---|
| **Company brain** | Org knowledge unified into one queryable graph |
| **Context graph** | Decision traces and codified tribal knowledge |
| **Agentic memory** | Durable, versioned memory for long-running agents |
| **Dev graph** | Issues & dependency model for coding agents |
| **R&D data layer** | Experiments & trials data written into branches |
| **ML workflows** | Versioned, branchable graphs for training & eval |
| **Karpathy's LLM wiki** | A living, agent-updatable knowledge base |
## Quick Install
## Install
```bash
curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash
```
This installs `omnigraph` and `omnigraph-server` into `~/.local/bin` from
published release binaries.
Or install with Homebrew:
This installs `omnigraph` (CLI) and `omnigraph-server` into `~/.local/bin` from
published release binaries. Or with Homebrew:
```bash
brew tap ModernRelay/tap
brew install ModernRelay/tap/omnigraph
```
## Quick start
## Drive it with an AI agent
The fastest path is an **embedded, local file-backed graph** — no server, no
object store, no Docker:
Omnigraph is built to be run by coding agents — two ways in.
```bash
# A schema and one row of data
cat > schema.pg <<'PG'
node Person {
slug: String @key
name: String
title: String?
}
PG
echo '{"type":"Person","data":{"slug":"alice","name":"Alice","title":"Engineer"}}' > people.jsonl
# Create → load (--mode is required) → query
omnigraph init --schema schema.pg ./graph.omni
omnigraph load --data people.jsonl --mode overwrite --store ./graph.omni
omnigraph query find_people --store ./graph.omni --params '{"t":"Engineer"}' \
-e 'query find_people($t: String) { match { $p: Person { title: $t } } return { $p.name } }'
# Branch, write in isolation, merge — Git-style across the whole graph
omnigraph branch create --from main review/new-hires --store ./graph.omni
omnigraph branch merge review/new-hires --into main --store ./graph.omni
```
**Storage backends** — the same flow runs on any backend; only the graph address changes:
| Backend | Use it for | Graph address |
|---|---|---|
| **Embedded** (local filesystem) | dev, demos, single machine — the default | `./graph.omni` |
| **Object storage** (AWS S3, R2, GCS-S3) | shared, multi-host, durable | `s3://bucket/graph.omni` (+ the `AWS_*` env) |
| **RustFS / MinIO** | rehearse the S3 path locally, no cloud account | `s3://…` against a local endpoint → [deployment guide](docs/user/deployment.md#testing-against-s3-locally) |
`init` takes the address as its positional argument (`omnigraph init --schema schema.pg <address>`); `load`, `query`, and `branch` take it via `--store <address>`.
For a **served, multi-graph deployment** (the cluster model), see [Common Commands](#common-commands) below.
## Set it up with an AI agent
Omnigraph is built to be set up by coding agents. Paste this into Claude Code,
Cursor, or any agent that can read a URL, install a package, and run a shell
command — it installs the skill, reads the docs, and walks you through setup for
your use case:
```text
Help me set up Omnigraph (a lakehouse-native graph engine for agents).
1. Install the Omnigraph skill so you operate it correctly:
npx skills add ModernRelay/omnigraph@omnigraph
2. Read the docs at https://github.com/ModernRelay/omnigraph — start with
docs/user/quickstart.md, then docs/user/clusters/index.md.
3. Skim the starter graphs and seed data in the cookbooks:
https://github.com/ModernRelay/omnigraph-cookbooks
4. Ask me what I want to build (company brain, agent memory, dev graph,
research / R&D layer, …). Then install the CLI, stand up a first graph for
that use case, load a little data, and run a query so I can see it working.
```
Works with any agent that can browse a URL, install a package, and run a shell.
## Agent skill & starter graphs
This repo ships the [**`omnigraph` agent skill**](skills/omnigraph) — the
operational playbook (cluster mode, the two config surfaces, schema evolution,
query linting, data writes, branches, Cedar policy, and common gotchas) that
teaches a coding agent to drive Omnigraph correctly. Install it with:
**Teach your agent the playbook.** This repo ships the
[**`omnigraph` agent skill**](skills/omnigraph): the operational playbook —
cluster mode, the two config surfaces, schema evolution, query linting, data
writes, branches, Cedar policy, and the common gotchas.
```bash
npx skills add ModernRelay/omnigraph@omnigraph
```
**Or have an agent set it up from scratch.** Paste this into Claude Code,
Cursor, or any agent that can read a URL and run a shell command:
```text
Help me set up Omnigraph
1. Read the docs at https://github.com/ModernRelay/omnigraph — start with
docs/user/clusters/index.md, then docs/user/deployment.md.
2. Skim the starter graphs and seed data in the cookbooks:
https://github.com/ModernRelay/omnigraph-cookbooks
3. Ask me what I want to build (company brain, agent memory, dev graph,
research / R&D layer, …). Then stand up a cluster for it, load a little
data, and run a query so I can see it working.
```
For ready-to-run graphs with real seed data (company brain, VC operating system,
pharma & industry intel),
[`ModernRelay/omnigraph-cookbooks`](https://github.com/ModernRelay/omnigraph-cookbooks)
is the fastest way to see Omnigraph shaped to a real domain. To rehearse the S3
path locally, see [deployment.md → Testing against S3 locally](docs/user/deployment.md#testing-against-s3-locally).
is the fastest way to see Omnigraph shaped to a real domain.
## Common Commands
## Deploy
A deployment is a **cluster**. A `cluster.yaml` declares its graphs, schemas,
stored queries, and policies; you converge it with `cluster apply` and serve it.
The server is cluster-first — it boots only from a cluster and serves every graph
under `/graphs/{id}/…`. Day-to-day work goes through that server: graphs are
addressed with `--server <name|url>` (+ `--graph <id>`), and `query`/`mutate`
invoke a stored query from the catalog **by name**.
A deployment is a **cluster** — a **multigraph** config directory that declares
its graphs, schemas, stored queries, and policies as code. You manage it
**Terraform-style**: `cluster plan` previews the diff, `cluster apply` converges
it. `omnigraph-server` then boots from the cluster and brings every graph online
at `/graphs/{id}/…`, each behind its own policy.
```bash
# 1. Converge the declared cluster, then serve it (--as attributes the apply)
omnigraph cluster apply --config ./company-brain --as you
omnigraph-server --cluster ./company-brain --bind 0.0.0.0:8080
# or config-free from object storage — the bucket IS the deployment:
# omnigraph-server --cluster s3://my-bucket/company-brain --bind 0.0.0.0:8080
**1. Declare the cluster.**
# 2. Work against the served graph — stored queries invoked by name
omnigraph query find_people --server prod --graph knowledge --params '{"q":"AI safety"}'
omnigraph mutate add_person --server prod --graph knowledge --params '{"name":"Mina"}'
omnigraph load --data ./data.jsonl --mode merge --server prod --graph knowledge
# 3. Branch and merge, Git-style across the whole graph
omnigraph branch create --from main review/2026-06 --server prod --graph knowledge
omnigraph branch merge review/2026-06 --into main --server prod --graph knowledge
```
company-brain/
├── cluster.yaml
├── people.pg # schema for the "knowledge" graph
├── queries/ # stored queries — the .gq files ARE the declaration
│ └── people.gq
└── base.policy.yaml # a Cedar policy bundle
```
Set a default scope (or a `--profile`) in `~/.omnigraph/config.yaml` — operator
identity, named servers/clusters, credentials — and the `--server`/`--graph`
flags drop away (`omnigraph query find_people --params …`).
**Local / ad-hoc.** For quick iteration on a standalone graph (no cluster, no
server), address storage directly with `--store` (or a positional `file://` /
`s3://` URI) and run ad-hoc `.gq` with `--query` (the positional then selects
which query in the file):
```bash
omnigraph init --schema ./schema.pg ./graph.omni
omnigraph load --data ./data.jsonl --mode merge --store ./graph.omni
omnigraph query --query ./queries.gq get_person --params '{"name":"Alice"}' --store ./graph.omni
```yaml
# cluster.yaml
version: 1
metadata:
name: company-brain
storage: s3://company/clusters/company-brain # ledger, catalog, and graph data live here
graphs:
knowledge:
schema: people.pg
queries: queries/ # every `query <name>` in queries/*.gq registers
policies:
base:
file: base.policy.yaml
applies_to: [knowledge] # graph-bound; use [cluster] for server-level
```
See [docs/user/cli/index.md](docs/user/cli/index.md), the
[CLI reference](docs/user/cli/reference.md), the
[cluster guide](docs/user/clusters/index.md), and the
[deployment guide](docs/user/deployment.md) for schema apply, snapshots, commits,
profiles, and policy/queries tooling.
**2. Stand up your object store.** On-prem, run RustFS (or MinIO) — Omnigraph
writes [Lance](https://github.com/lance-format/lance) to it over the standard S3
API. In the cloud, point the same `AWS_*` env at S3 / R2 / GCS instead.
## Clients
**3. Converge and run.** `apply` creates each graph, applies its schema, and
publishes queries and policies into the content-addressed catalog. It is
idempotent — re-running is always safe.
For programmatic access to a running `omnigraph-server`:
```bash
omnigraph cluster validate # parse + typecheck everything
omnigraph cluster plan # preview what apply would do
omnigraph cluster apply # converge
- **TypeScript SDK + MCP server** — [`@modernrelay/omnigraph`](https://www.npmjs.com/package/@modernrelay/omnigraph) and [`@modernrelay/omnigraph-mcp`](https://www.npmjs.com/package/@modernrelay/omnigraph-mcp), versioned in lockstep with `omnigraph-server`. Source, docs, and examples: [`ModernRelay/omnigraph-ts`](https://github.com/ModernRelay/omnigraph-ts).
- **Python SDK** — coming soon.
# Boot the server from the cluster dir — storage resolves through cluster.yaml
omnigraph-server --cluster company-brain --bind 0.0.0.0:8080
```
See the [cluster guide](docs/user/clusters/index.md) for the day-2 loop
(edit → plan → apply → restart), approval gates for destructive changes, drift
inspection, and recovery; the [deployment guide](docs/user/deployment.md) for
containers, AWS/Railway, auth, and the full `AWS_*` contract.
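The plan/apply loop and its idempotence can be illustrated with a toy model. This sketch assumes each declared resource is identified by a name and a content digest, which is one way to read "content-addressed catalog". The resource names, the SHA-256 choice, and the `plan` / `apply_plan` helpers are hypothetical, not the `omnigraph cluster` implementation.

```python
import hashlib


def digest(content: str) -> str:
    """Content address for a resource body (SHA-256 is an assumption here)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan(declared: dict, applied: dict) -> list:
    """Return the (action, name) steps that would make `applied` match `declared`.

    Both arguments map a resource name to its content digest.
    """
    steps = []
    for name in sorted(declared):
        if name not in applied:
            steps.append(("create", name))
        elif applied[name] != declared[name]:
            steps.append(("update", name))
    for name in sorted(applied):
        if name not in declared:
            steps.append(("delete", name))
    return steps


def apply_plan(declared: dict, applied: dict) -> list:
    """Execute the plan against `applied` in place and return the steps taken."""
    steps = plan(declared, applied)
    for action, name in steps:
        if action == "delete":
            del applied[name]
        else:
            applied[name] = declared[name]
    return steps


# Hypothetical declared state: one schema and one stored query.
declared = {
    "schema/knowledge": digest("hypothetical schema text"),
    "query/find_people": digest("hypothetical query text"),
}
applied = {}

first = apply_plan(declared, applied)
second = apply_plan(declared, applied)
print(first)   # → [('create', 'query/find_people'), ('create', 'schema/knowledge')]
print(second)  # → []
```

The second run produces an empty plan because the applied digests already equal the declared ones, which is what makes re-running safe in this model.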
## Query and mutate
Point the CLI at a running server and a graph. Stored queries and mutations run
**by name** from the catalog; branch and merge run across the whole graph, so a
fleet of agents can write in isolation and have changes reviewed before they
land on `main`.
```bash
# Stored query / mutation, parameters as JSON
omnigraph query search_docs --server https://graph.internal:8080 --graph knowledge --params '{"q":"AI safety"}'
omnigraph mutate add_person --server https://graph.internal:8080 --graph knowledge --params '{"name":"Mina","team":"Research"}'
# An agent enriches on its own branch; you review, then merge
omnigraph branch create --from main agent/ingest-42 --server https://graph.internal:8080 --graph knowledge
omnigraph branch merge agent/ingest-42 --into main --server https://graph.internal:8080 --graph knowledge
```
Name the server (and a default graph) once in `~/.omnigraph/config.yaml` — with
operator identity and credentials — and the `--server`/`--graph` flags drop
away: `omnigraph query search_docs --params '{"q":"…"}'`. See the
[CLI reference](docs/user/cli/reference.md).
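An orchestrating script that gives each agent its own branch could assemble the same commands programmatically. The sketch below only builds argument lists from the flags shown above; the helper names are hypothetical, and nothing is executed, so it runs without a server.

```python
import shlex


def branch_create_cmd(server: str, graph: str, branch: str, base: str = "main") -> list:
    """argv for `omnigraph branch create`, mirroring the example above."""
    return ["omnigraph", "branch", "create", "--from", base, branch,
            "--server", server, "--graph", graph]


def branch_merge_cmd(server: str, graph: str, branch: str, into: str = "main") -> list:
    """argv for `omnigraph branch merge`, mirroring the example above."""
    return ["omnigraph", "branch", "merge", branch, "--into", into,
            "--server", server, "--graph", graph]


server = "https://graph.internal:8080"
graph = "knowledge"

# One isolated branch per agent task (the task ids are made up).
for task_id in (42, 43):
    branch = f"agent/ingest-{task_id}"
    print(shlex.join(branch_create_cmd(server, graph, branch)))

# After review, the merge command for one of those branches:
print(shlex.join(branch_merge_cmd(server, graph, "agent/ingest-42")))
```

To actually run a command, pass the list to `subprocess.run(cmd, check=True)`; keeping argv as a list avoids shell-quoting problems in branch names.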
## Security & governance
- **Engine-wide enforcement** — every write path goes through the same Cedar gate, so the HTTP server, the CLI, and the embedded SDK obey identical rules.
- **Declared in the cluster** — a policy bundle is bound to graphs (or the whole server) via `applies_to` under `policies:`.
- **Scoped** — rules apply per graph, per branch, or server-wide.
- **No plaintext tokens** — bearer tokens are hashed at startup and compared in constant time.
- **Forge-proof identity** — the actor is resolved server-side from the token; clients can't set it.
See the [policy guide](docs/user/operations/policy.md).
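The two token bullets above (hashed at startup, compared in constant time, actor resolved server-side) describe a common pattern that can be sketched with the Python standard library. This is an illustration of the pattern only: the `TokenTable` class, the SHA-256 choice, and the sample actors and tokens are assumptions, not the server's code.

```python
import hashlib
import hmac


def hash_token(token: str) -> bytes:
    """Digest of a bearer token (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(token.encode("utf-8")).digest()


class TokenTable:
    """Keeps only token digests and maps a presented token to its actor."""

    def __init__(self, tokens_by_actor: dict):
        # Hash once at startup; the plaintext tokens are not retained.
        self._hashes = {actor: hash_token(tok) for actor, tok in tokens_by_actor.items()}

    def resolve_actor(self, presented: str):
        """Return the actor for `presented`, or None if no token matches."""
        candidate = hash_token(presented)
        matched = None
        # Check every entry with a constant-time comparison and no early exit,
        # so timing does not reveal how close a guess was or which entry matched.
        for actor, stored in self._hashes.items():
            if hmac.compare_digest(stored, candidate):
                matched = actor
        return matched


# Hypothetical actors and tokens.
table = TokenTable({"alice": "token-for-alice", "agent-42": "token-for-agent-42"})
print(table.resolve_actor("token-for-agent-42"))  # → agent-42
print(table.resolve_actor("not-a-real-token"))    # → None
```

The caller never supplies an actor name; identity comes only from which stored digest the presented token matches.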
## Clients & SDKs
| Client | Use it for | Where |
|---|---|---|
| **TypeScript SDK** | typed access from Node / TS | [`@modernrelay/omnigraph`](https://www.npmjs.com/package/@modernrelay/omnigraph) · [source](https://github.com/ModernRelay/omnigraph-ts) |
| **MCP server** | bridge Omnigraph to LLM hosts (Claude, Cursor, …) | [`@modernrelay/omnigraph-mcp`](https://www.npmjs.com/package/@modernrelay/omnigraph-mcp) |
| **HTTP / OpenAPI** | any language — the wire contract | the server's OpenAPI spec |
| **Python SDK** | typed access from Python | *coming soon* |
Both npm packages are versioned in lockstep with `omnigraph-server`.
## Local quick test (no server)
A one-minute setup to try it: an **embedded, local file-backed graph** with no
server and no object store. For dev and experiments only; production is the deployed cluster above.
```bash
cat > schema.pg <<'PG'
node Signal { slug: String @key, title: String }
node Pattern { slug: String @key, name: String }
edge Indicates: Signal -> Pattern
PG
printf '%s\n' \
'{"type":"Signal","data":{"slug":"s1","title":"OSS model adoption surging"}}' \
'{"type":"Pattern","data":{"slug":"p1","name":"adoption"}}' \
'{"edge":"Indicates","from":"s1","to":"p1"}' > data.jsonl
omnigraph init --schema schema.pg ./graph.omni
omnigraph load --data data.jsonl --mode overwrite --store ./graph.omni
# "What pattern does signal s1 indicate?"
omnigraph query --store ./graph.omni \
-e 'query indicates() { match { $s: Signal { slug: "s1" } $s indicates $p } return { $p.name } }'
# → adoption
```
## Docs
- [Install guide](docs/user/install.md)
- [Deployment guide](docs/user/deployment.md)
- [Cluster guide](docs/user/clusters/index.md) · [Deployment guide](docs/user/deployment.md) · [CLI reference](docs/user/cli/reference.md)
- [Schema](docs/user/schema/index.md) · [Queries](docs/user/queries/index.md) · [Search](docs/user/search/index.md) · [Policy](docs/user/operations/policy.md)
## Build And Test
```bash
cargo build --workspace
cargo check --workspace
cargo test --workspace
cargo test --workspace
```
Notes:
@@ -211,8 +228,8 @@ Notes:
- `crates/omnigraph-policy`: Cedar policy compilation and enforcement
- `crates/omnigraph-api-types`: shared HTTP wire DTOs used by both the server and the CLI
- `crates/omnigraph-cluster`: cluster config validation, planning, and apply (the control plane)
- `crates/omnigraph-server`: Axum HTTP server — cluster-first, serving N graphs under `/graphs/{id}/…`
- `crates/omnigraph-cli`: CLI for graph lifecycle (init/load), query/mutate, branch/commit/merge, schema/lint, snapshot/export, cluster control, policy/queries, profiles, and maintenance (optimize/repair/cleanup)
- `crates/omnigraph-server`: Axum HTTP server — cluster-first, runs N graphs under `/graphs/{id}/…`
- `crates/omnigraph-cli`: CLI for graph lifecycle, query/mutate, branch/commit/merge, schema/lint, snapshot/export, cluster control, policy/queries, profiles, and maintenance
## Contributing