From 8cb127d8faacd96f8e7749ef185fa4a0313bf2c5 Mon Sep 17 00:00:00 2001
From: Andrew Altshuler
Date: Wed, 17 Jun 2026 00:41:10 +0300
Subject: [PATCH] =?UTF-8?q?docs(readme):=20server-first=20rewrite=20?=
 =?UTF-8?q?=E2=80=94=20deploy,=20agents,=20on-prem=20RustFS=20(#269)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* docs(readme): server-first rewrite — deploy, agents, on-prem RustFS

Reframe the README around the actual product: a self-hostable, multigraph
graph server for context assembly and multi-agent coordination, deployed
Terraform-style on your own object storage (on-prem via RustFS, or
S3/R2/GCS).

- Lead with key capabilities and what you can build, not a local toy.
- Promote "Drive it with an AI agent" (skill + a docs-first setup prompt)
  above the manual deploy walkthrough — agents are the primary operator.
- "Deploy" is the hero: cluster.yaml → object store → validate/plan/apply →
  omnigraph-server, with RustFS as the on-prem path front and center.
- "Query and mutate": stored queries by name + branch/review/merge.
- Security & governance as scannable bullets; Clients & SDKs as a table.
- Embedded local graph demoted to a clearly-labeled "quick test"
  (Signal → Indicates → Pattern), explicitly dev/experiment-only.
- Drop the "serve/served/serving" vocabulary tic in favor of deploy/run.

Co-Authored-By: Claude Opus 4.8

* docs(readme): add the server boot command to Deploy §3 (Greptile P1)

The "Converge and run" step showed only the converge half — the code block
ended at `cluster apply` with no `omnigraph-server` command, leaving a
linear reader without a way to actually start the server. Add the boot line.

Co-Authored-By: Claude Opus 4.8

* docs(readme): simplify the server boot command

Drop the inline OMNIGRAPH_SERVER_BEARER_TOKEN prefix from the Deploy hero —
the example cluster declares a policy so the server boots without it, and
bearer auth is covered in Security & governance. Leaves a single clean line.

Co-Authored-By: Claude Opus 4.8

* docs(readme): boot the server from the cluster dir, not a raw s3:// URI

Pointing --cluster at the bucket hardcodes the storage URI in the run
command. Boot from the config directory instead; the storage URI lives once
in cluster.yaml and the server resolves it — single source of truth, and
consistent with the cluster apply commands above.

Co-Authored-By: Claude Opus 4.8

---------

Co-authored-by: Claude Opus 4.8
---
 README.md | 313 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 165 insertions(+), 148 deletions(-)

diff --git a/README.md b/README.md
index 9c4f8bc..e3876f6 100644
--- a/README.md
+++ b/README.md
@@ -4,197 +4,214 @@
 [![Rust](https://img.shields.io/badge/rust-stable-orange.svg)](rust-toolchain.toml)
 [![Crates.io](https://img.shields.io/crates/v/omnigraph-cli.svg)](https://crates.io/crates/omnigraph-cli)

-**Lakehouse native graph engine built for context assembly**
+**Lakehouse graph db for context assembly & multi-agent coordination**
+Multimodal retrieval, Git-style branching, object storage-native

-Omnigraph acts as operational state & coordination layer for agents.
-Hundreds of agents can enrich the graph on parallel isolated branches and changes can be reviewed and merged safely.
+Omnigraph is the operational state and coordination layer for fleets of agents.
+Run it as a server, declared as code; hundreds of agents operate and enrich the graph on
+parallel isolated branches, and every change is reviewed and merged safely.

-- Git-style versioning & branching
-- Multimodal retrieval (graph+vector/fts+filters) optimized for context assembly
-- Runs on the local filesystem or any S3-compatible object store (AWS S3, R2, MinIO, RustFS)
-- Native blob-as-data support (docs, images, videos, etc)
-- VPC, On-prem, hybrid deployment
-- [`Lance`](https://github.com/lance-format/lance) format as open storage layer
+## Key capabilities

-| AS CODE | What it means |
+- **A graph server you run, declared as code** — a `cluster.yaml` declares graphs, schemas, stored queries, embedding providers, and policies. `cluster apply` converges it; `omnigraph-server` boots from it and brings every graph online at `/graphs/{id}/…`.
+- **Built for fleets of agents** — hundreds of agents enrich the graph on **parallel isolated branches**; changes are reviewed and merged safely, Git-style, across the whole graph.
+- **Multimodal retrieval for context assembly** — graph traversal + vector ANN + full-text + Reciprocal Rank Fusion in **one** query runtime.
+- **Security as code** — Cedar policy enforced **server-side on every mutation**, per-graph and server-wide; bearer auth; actor/audit tracking.
+- **Runs on your infrastructure** — any S3-compatible object store: **on-prem via RustFS / MinIO**, or AWS S3 / R2 / GCS. VPC, on-prem, hybrid — your data never leaves your store.
+- **Open, versioned storage** — [`Lance`](https://github.com/lance-format/lance) columnar format: branchable, time-travelable, with native blob-as-data (docs, images, video).
+
+## What you can build
+
+| Use case | What it's for |
 |---|---|
-| **Schema AS CODE** | Typed `.pg` schemas, planned, applied, enforced |
-| **Context AS CODE** | Linted queries & agentic nudges, versioned and reusable |
-| **Security AS CODE** | Cedar policies enforced server-side on every mutation |
-| **Dashboards AS CODE** | Declarative views & controls over the graph *(coming)* |
+| **Company brain** | Org knowledge unified into one graph every agent can query |
+| **Agentic memory** | Durable, versioned memory — a branch per agent or per task, merged on review |
+| **Context graph** | Decision traces and codified tribal knowledge for retrieval |
+| **Dev graph** | Issues & dependency model that coding agents read and write |
+| **R&D / ML data layer** | Experiments and trials written into branches, versioned for training & eval |

-## Core Use Cases
-
-| Use case | What it's for
-|---|---|
-| **Company brain** | Org knowledge unified into one queryable graph |
-| **Context graph** | Decision traces and codified tribal knowledge |
-| **Agentic memory** | Durable, versioned memory for long-running agents |
-| **Dev graph** | Issues & dependency model for coding agents |
-| **R&D data layer** | Experiments & trials data written into branches |
-| **ML workflows** | Versioned, branchable graphs for training & eval |
-| **Karpathy's LLM wiki** | A living, agent-updatable knowledge base |
-
-## Quick Install
+## Install

 ```bash
 curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/install.sh | bash
 ```

-This installs `omnigraph` and `omnigraph-server` into `~/.local/bin` from
-published release binaries.
-
-Or install with Homebrew:
+This installs `omnigraph` (CLI) and `omnigraph-server` into `~/.local/bin` from
+published release binaries. Or with Homebrew:

 ```bash
 brew tap ModernRelay/tap
 brew install ModernRelay/tap/omnigraph
 ```

-## Quick start
+## Drive it with an AI agent

-The fastest path is an **embedded, local file-backed graph** — no server, no
-object store, no Docker:
+Omnigraph is built to be run by coding agents — two ways in.

-```bash
-# A schema and one row of data
-cat > schema.pg <<'PG'
-node Person {
-  slug: String @key
-  name: String
-  title: String?
-}
-PG
-echo '{"type":"Person","data":{"slug":"alice","name":"Alice","title":"Engineer"}}' > people.jsonl
-
-# Create → load (--mode is required) → query
-omnigraph init --schema schema.pg ./graph.omni
-omnigraph load --data people.jsonl --mode overwrite --store ./graph.omni
-omnigraph query find_people --store ./graph.omni --params '{"t":"Engineer"}' \
-  -e 'query find_people($t: String) { match { $p: Person { title: $t } } return { $p.name } }'
-
-# Branch, write in isolation, merge — Git-style across the whole graph
-omnigraph branch create --from main review/new-hires --store ./graph.omni
-omnigraph branch merge review/new-hires --into main --store ./graph.omni
-```
-
-**Storage backends** — the same flow runs on any backend; only the graph address changes:
-
-| Backend | Use it for | Graph address |
-|---|---|---|
-| **Embedded** (local filesystem) | dev, demos, single machine — the default | `./graph.omni` |
-| **Object storage** (AWS S3, R2, GCS-S3) | shared, multi-host, durable | `s3://bucket/graph.omni` (+ the `AWS_*` env) |
-| **RustFS / MinIO** | rehearse the S3 path locally, no cloud account | `s3://…` against a local endpoint → [deployment guide](docs/user/deployment.md#testing-against-s3-locally) |
-
-`init` takes the address as its positional argument (`omnigraph init --schema schema.pg `); `load`, `query`, and `branch` take it via `--store `.
-
-For a **served, multi-graph deployment** (the cluster model), see [Common Commands](#common-commands) below.
-
-## Set it up with an AI agent
-
-Omnigraph is built to be set up by coding agents. Paste this into Claude Code,
-Cursor, or any agent that can read a URL, install a package, and run a shell
-command — it installs the skill, reads the docs, and walks you through setup for
-your use case:
-
-```text
-Help me set up Omnigraph (a lakehouse-native graph engine for agents).
-
-1. Install the Omnigraph skill so you operate it correctly:
-   npx skills add ModernRelay/omnigraph@omnigraph
-2. Read the docs at https://github.com/ModernRelay/omnigraph — start with
-   docs/user/quickstart.md, then docs/user/clusters/index.md.
-3. Skim the starter graphs and seed data in the cookbooks:
-   https://github.com/ModernRelay/omnigraph-cookbooks
-4. Ask me what I want to build (company brain, agent memory, dev graph,
-   research / R&D layer, …). Then install the CLI, stand up a first graph for
-   that use case, load a little data, and run a query so I can see it working.
-```
-
-Works with any agent that can browse a URL, install a package, and run a shell.
-
-## Agent skill & starter graphs
-
-This repo ships the [**`omnigraph` agent skill**](skills/omnigraph) — the
-operational playbook (cluster mode, the two config surfaces, schema evolution,
-query linting, data writes, branches, Cedar policy, and common gotchas) that
-teaches a coding agent to drive Omnigraph correctly. Install it with:
+**Teach your agent the playbook.** This repo ships the
+[**`omnigraph` agent skill**](skills/omnigraph): the operational playbook —
+cluster mode, the two config surfaces, schema evolution, query linting, data
+writes, branches, Cedar policy, and the common gotchas.

 ```bash
 npx skills add ModernRelay/omnigraph@omnigraph
 ```

+**Or have an agent set it up from scratch.** Paste this into Claude Code,
+Cursor, or any agent that can read a URL and run a shell command:
+
+```text
+Help me set up Omnigraph
+
+1. Read the docs at https://github.com/ModernRelay/omnigraph — start with
+   docs/user/clusters/index.md, then docs/user/deployment.md.
+2. Skim the starter graphs and seed data in the cookbooks:
+   https://github.com/ModernRelay/omnigraph-cookbooks
+3. Ask me what I want to build (company brain, agent memory, dev graph,
+   research / R&D layer, …). Then stand up a cluster for it, load a little
+   data, and run a query so I can see it working.
+```
+
 For ready-to-run graphs with real seed data (company brain, VC operating
 system, pharma & industry intel),
 [`ModernRelay/omnigraph-cookbooks`](https://github.com/ModernRelay/omnigraph-cookbooks)
-is the fastest way to see Omnigraph shaped to a real domain. To rehearse the S3
-path locally, see [deployment.md → Testing against S3 locally](docs/user/deployment.md#testing-against-s3-locally).
+is the fastest way to see Omnigraph shaped to a real domain.

-## Common Commands
+## Deploy

-A deployment is a **cluster**. A `cluster.yaml` declares its graphs, schemas,
-stored queries, and policies; you converge it with `cluster apply` and serve it.
-The server is cluster-first — it boots only from a cluster and serves every graph
-under `/graphs/{id}/…`. Day-to-day work goes through that server: graphs are
-addressed with `--server ` (+ `--graph `), and `query`/`mutate`
-invoke a stored query from the catalog **by name**.
+A deployment is a **cluster** — a **multigraph** config directory that declares
+its graphs, schemas, stored queries, and policies as code. You manage it
+**Terraform-style**: `cluster plan` previews the diff, `cluster apply` converges
+it. `omnigraph-server` then boots from the cluster and brings every graph online
+at `/graphs/{id}/…`, each behind its own policy.

-```bash
-# 1. Converge the declared cluster, then serve it (--as attributes the apply)
-omnigraph cluster apply --config ./company-brain --as you
-omnigraph-server --cluster ./company-brain --bind 0.0.0.0:8080
-# or config-free from object storage — the bucket IS the deployment:
-# omnigraph-server --cluster s3://my-bucket/company-brain --bind 0.0.0.0:8080
+**1. Declare the cluster.**

-# 2. Work against the served graph — stored queries invoked by name
-omnigraph query find_people --server prod --graph knowledge --params '{"q":"AI safety"}'
-omnigraph mutate add_person --server prod --graph knowledge --params '{"name":"Mina"}'
-omnigraph load --data ./data.jsonl --mode merge --server prod --graph knowledge
-
-# 3. Branch and merge, Git-style across the whole graph
-omnigraph branch create --from main review/2026-06 --server prod --graph knowledge
-omnigraph branch merge review/2026-06 --into main --server prod --graph knowledge
+```
+company-brain/
+├── cluster.yaml
+├── people.pg # schema for the "knowledge" graph
+├── queries/ # stored queries — the .gq files ARE the declaration
+│   └── people.gq
+└── base.policy.yaml # a Cedar policy bundle
 ```

-Set a default scope (or a `--profile`) in `~/.omnigraph/config.yaml` — operator
-identity, named servers/clusters, credentials — and the `--server`/`--graph`
-flags drop away (`omnigraph query find_people --params …`).
-
-**Local / ad-hoc.** For quick iteration on a standalone graph (no cluster, no
-server), address storage directly with `--store` (or a positional `file://` /
-`s3://` URI) and run ad-hoc `.gq` with `--query` (the positional then selects
-which query in the file):
-
-```bash
-omnigraph init --schema ./schema.pg ./graph.omni
-omnigraph load --data ./data.jsonl --mode merge --store ./graph.omni
-omnigraph query --query ./queries.gq get_person --params '{"name":"Alice"}' --store ./graph.omni
+```yaml
+# cluster.yaml
+version: 1
+metadata:
+  name: company-brain
+storage: s3://company/clusters/company-brain # ledger, catalog, and graph data live here
+graphs:
+  knowledge:
+    schema: people.pg
+    queries: queries/ # every `query ` in queries/*.gq registers
+policies:
+  base:
+    file: base.policy.yaml
+    applies_to: [knowledge] # graph-bound; use [cluster] for server-level
 ```

-See [docs/user/cli/index.md](docs/user/cli/index.md), the
-[CLI reference](docs/user/cli/reference.md), the
-[cluster guide](docs/user/clusters/index.md), and the
-[deployment guide](docs/user/deployment.md) for schema apply, snapshots, commits,
-profiles, and policy/queries tooling.
+**2. Stand up your object store.** On-prem, run RustFS (or MinIO) — Omnigraph
+writes [Lance](https://github.com/lance-format/lance) to it over the standard S3
+API. In the cloud, point the same `AWS_*` env at S3 / R2 / GCS instead.

-## Clients
+**3. Converge and run.** `apply` creates each graph, applies its schema, and
+publishes queries and policies into the content-addressed catalog. It is
+idempotent — re-running is always safe.

-For programmatic access to a running `omnigraph-server`:
+```bash
+omnigraph cluster validate # parse + typecheck everything
+omnigraph cluster plan # preview what apply would do
+omnigraph cluster apply # converge

-- **TypeScript SDK + MCP server** — [`@modernrelay/omnigraph`](https://www.npmjs.com/package/@modernrelay/omnigraph) and [`@modernrelay/omnigraph-mcp`](https://www.npmjs.com/package/@modernrelay/omnigraph-mcp), versioned in lockstep with `omnigraph-server`. Source, docs, and examples: [`ModernRelay/omnigraph-ts`](https://github.com/ModernRelay/omnigraph-ts).
-- **Python SDK** — coming soon.
+# Boot the server from the cluster dir — storage resolves through cluster.yaml
+omnigraph-server --cluster company-brain --bind 0.0.0.0:8080
+```
+
+See the [cluster guide](docs/user/clusters/index.md) for the day-2 loop
+(edit → plan → apply → restart), approval gates for destructive changes, drift
+inspection, and recovery; the [deployment guide](docs/user/deployment.md) for
+containers, AWS/Railway, auth, and the full `AWS_*` contract.
+
+## Query and mutate
+
+Point the CLI at a running server and a graph. Stored queries and mutations run
+**by name** from the catalog; branch and merge run across the whole graph, so a
+fleet of agents can write in isolation and have changes reviewed before they
+land on `main`.
+
+```bash
+# Stored query / mutation, parameters as JSON
+omnigraph query search_docs --server https://graph.internal:8080 --graph knowledge --params '{"q":"AI safety"}'
+omnigraph mutate add_person --server https://graph.internal:8080 --graph knowledge --params '{"name":"Mina","team":"Research"}'
+
+# An agent enriches on its own branch; you review, then merge
+omnigraph branch create --from main agent/ingest-42 --server https://graph.internal:8080 --graph knowledge
+omnigraph branch merge agent/ingest-42 --into main --server https://graph.internal:8080 --graph knowledge
+```
+
+Name the server (and a default graph) once in `~/.omnigraph/config.yaml` — with
+operator identity and credentials — and the `--server`/`--graph` flags drop
+away: `omnigraph query search_docs --params '{"q":"…"}'`. See the
+[CLI reference](docs/user/cli/reference.md).
+
+## Security & governance
+
+- **Engine-wide enforcement** — every write path goes through the same Cedar gate, so the HTTP server, the CLI, and the embedded SDK obey identical rules.
+- **Declared in the cluster** — a policy bundle is bound to graphs (or the whole server) via `policies:` → `applies_to`.
+- **Scoped** — rules apply per graph, per branch, or server-wide.
+- **No plaintext tokens** — bearer tokens are hashed at startup and compared in constant time.
+- **Forge-proof identity** — the actor is resolved server-side from the token; clients can't set it.
+
+See the [policy guide](docs/user/operations/policy.md).
+
+## Clients & SDKs
+
+| Client | Use it for | Where |
+|---|---|---|
+| **TypeScript SDK** | typed access from Node / TS | [`@modernrelay/omnigraph`](https://www.npmjs.com/package/@modernrelay/omnigraph) · [source](https://github.com/ModernRelay/omnigraph-ts) |
+| **MCP server** | bridge Omnigraph to LLM hosts (Claude, Cursor, …) | [`@modernrelay/omnigraph-mcp`](https://www.npmjs.com/package/@modernrelay/omnigraph-mcp) |
+| **HTTP / OpenAPI** | any language — the wire contract | the server's OpenAPI spec |
+| **Python SDK** | typed access from Python | *coming soon* |
+
+Both npm packages are versioned in lockstep with `omnigraph-server`.
+
+## Local quick test (no server)
+
+1-min setup to try it: an **embedded, local file-backed graph** — no server, no
+object store. For dev and experiments; production is the deployed cluster above.
+
+```bash
+cat > schema.pg <<'PG'
+node Signal { slug: String @key, title: String }
+node Pattern { slug: String @key, name: String }
+edge Indicates: Signal -> Pattern
+PG
+printf '%s\n' \
+  '{"type":"Signal","data":{"slug":"s1","title":"OSS model adoption surging"}}' \
+  '{"type":"Pattern","data":{"slug":"p1","name":"adoption"}}' \
+  '{"edge":"Indicates","from":"s1","to":"p1"}' > data.jsonl
+
+omnigraph init --schema schema.pg ./graph.omni
+omnigraph load --data data.jsonl --mode overwrite --store ./graph.omni
+
+# "What pattern does signal s1 indicate?"
+omnigraph query --store ./graph.omni \
+  -e 'query indicates() { match { $s: Signal { slug: "s1" } $s indicates $p } return { $p.name } }'
+# → adoption
+```

 ## Docs

-- [Install guide](docs/user/install.md)
-- [Deployment guide](docs/user/deployment.md)
+- [Cluster guide](docs/user/clusters/index.md) · [Deployment guide](docs/user/deployment.md) · [CLI reference](docs/user/cli/reference.md)
+- [Schema](docs/user/schema/index.md) · [Queries](docs/user/queries/index.md) · [Search](docs/user/search/index.md) · [Policy](docs/user/operations/policy.md)

 ## Build And Test

 ```bash
 cargo build --workspace
-cargo check --workspace
-cargo test --workspace
+cargo test --workspace
 ```

 Notes:
@@ -211,8 +228,8 @@ Notes:
 - `crates/omnigraph-policy`: Cedar policy compilation and enforcement
 - `crates/omnigraph-api-types`: shared HTTP wire DTOs used by both the server and the CLI
 - `crates/omnigraph-cluster`: cluster config validation, planning, and apply (the control plane)
-- `crates/omnigraph-server`: Axum HTTP server — cluster-first, serving N graphs under `/graphs/{id}/…`
-- `crates/omnigraph-cli`: CLI for graph lifecycle (init/load), query/mutate, branch/commit/merge, schema/lint, snapshot/export, cluster control, policy/queries, profiles, and maintenance (optimize/repair/cleanup)
+- `crates/omnigraph-server`: Axum HTTP server — cluster-first, runs N graphs under `/graphs/{id}/…`
+- `crates/omnigraph-cli`: CLI for graph lifecycle, query/mutate, branch/commit/merge, schema/lint, snapshot/export, cluster control, policy/queries, profiles, and maintenance

 ## Contributing