mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-09 01:35:18 +02:00

Ragnor Comerford c924e121d2

Add architectural invariants & deny-list as docs/invariants.md

A standing reference for invariants that hold across storage, engine,
server, schema, indexing, observability, and the OSS/Cloud split. Used
to check RFCs and PRs against the substrate boundaries (don't rebuild
what Lance gives us), layering rules (one trait boundary per layer),
distributability constraints (Send+Sync, location-neutral IR), honesty
expectations (estimate-vs-actual, bounded failure modes), unified
patterns (reconciler, Union polymorphism, SIP, factorize), the §IX
deny-list, and the §X review checklist.

§IV (additivity / migration) and §VIII (OSS/Cloud kernel-product split)
are referenced but not yet drafted — flagged as placeholders pending
upstream fill-in.

AGENTS.md surfaces it from the topic index, the always-on rules
section, and the maintenance contract; the deny-list is also inlined
there as a fast-pass review filter so it stays in scope every turn.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-28 23:34:44 +02:00


OmniGraph — Agent Guide

This file is the always-on map for AI coding agents (Claude Code, Codex, Cursor, Cline) working in this repo. It is loaded into context on every turn, so it stays as a map plus the rules and invariants that need to be in scope at all times — the encyclopedia content lives under docs/. When you need depth, follow a pointer.

CLAUDE.md is a symlink to this file — there is exactly one source of truth. Edit AGENTS.md.

Version surveyed: 0.3.1
Workspace crates: omnigraph-compiler, omnigraph (engine), omnigraph-cli, omnigraph-server
Storage substrate: Lance 4.x (columnar, versioned, branchable)
License: MIT
Toolchain: Rust stable, edition 2024

Start here — what is this?

OmniGraph is a typed property-graph engine built as a coordination layer over many Lance datasets. Highlights:

Storage: per node/edge type a separate Lance dataset; multi-dataset commits coordinated atomically through one __manifest table.
Languages: a .pg schema language and a .gq query language, both Pest-based, with a typed IR.
Multi-modal querying: vector ANN (nearest), full-text (search/fuzzy/match_text/bm25), Reciprocal Rank Fusion (rrf), and graph traversal (Expand, anti-join not { … }) in one runtime.
Branches and commits across the whole graph: Git-style — every successful publish appends to a commit DAG; merges are three-way at the row level.
Transactional runs: ephemeral __run__<id> branches for isolated mutation, fast-path or merge-path publish.
HTTP server: Axum + utoipa OpenAPI, bearer auth (SHA-256 hashed, optional AWS Secrets Manager), Cedar policy gating.
CLI driven by a single omnigraph.yaml; multi-format output (json/jsonl/csv/kv/table).
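
For orientation on the hybrid-ranking highlight, here is a minimal sketch of the textbook Reciprocal Rank Fusion formula: each ranked list adds 1 / (k + rank) to a document's fused score. It is not OmniGraph's rrf implementation. The function name `rrf_fuse`, the string ids, and the choice of `k` are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Textbook Reciprocal Rank Fusion (illustrative, not the engine's code).
/// Each ranked list contributes 1 / (k + rank) to a document's fused score,
/// with rank starting at 1. `k` is a smoothing constant chosen by the caller.
pub fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64;
            *scores.entry((*id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the output is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Calling `rrf_fuse` with one ranking from a vector search and one from a BM25 search returns a single list ordered by fused score, with the score kept alongside each id rather than discarded.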

Throughout the docs, capabilities are split into L1 — Inherited from Lance vs L2 — Added by OmniGraph.

Architecture at a glance

CLI (omnigraph)        HTTP Server (omnigraph-server, Axum)
        │                            │
        └─────────────┬──────────────┘
                      ▼
           omnigraph-compiler  ── Pest grammars, catalog, IR, lowering, lint, migration plan
                      │
                      ▼
           omnigraph (engine)  ── ManifestRepo, CommitGraph, RunRegistry, GraphIndex (CSR/CSC), exec
                      │
                      ▼
              Lance 4.x         ── columnar Arrow, fragments, per-dataset versions/branches, indexes
                      │
                      ▼
        Object store (file / s3 / RustFS / MinIO / S3-compat)

Full diagram and concurrency model: docs/architecture.md.

Where to find each topic

Area	Read
Architectural invariants & deny-list (read before any non-trivial proposal or review)	docs/invariants.md
Architecture, L1/L2 framing, concurrency model	docs/architecture.md
Storage layout, `__manifest` schema, URI schemes, S3 env vars	docs/storage.md
`.pg` schema language, types, constraints, annotations, migration planning	docs/schema-language.md
`.gq` query language, MATCH/RETURN/ORDER, search funcs, mutations, IR ops, lint codes	docs/query-language.md
Indexes (BTREE / inverted / vector / graph topology)	docs/indexes.md
Embeddings (compiler + engine clients, env vars, `@embed`)	docs/embeddings.md
Branches, commit graph, snapshots, system branches	docs/branches-commits.md
Runs (transactional graph mutations, `__run__<id>`, publish paths)	docs/runs.md
Three-way merge and conflict kinds	docs/merge.md
Diff / change feed (`diff_between`, `diff_commits`)	docs/changes.md
Query execution, mutation execution, bulk loader, `load` vs `ingest`	docs/execution.md
`optimize` (compaction) and `cleanup` (version GC)	docs/maintenance.md
Cedar policy actions, scopes, CLI	docs/policy.md
HTTP server endpoints, auth, error model, body limits	docs/server.md
CLI quick-start	docs/cli.md
CLI command surface and `omnigraph.yaml` schema	docs/cli-reference.md
Audit / actor tracking	docs/audit.md
Error taxonomy and result serialization	docs/errors.md
Install (binary / Homebrew / source / channels)	docs/install.md
Deployment (binary / container / RustFS bootstrap / auth / build variants)	docs/deployment.md
CI / release workflows	docs/ci.md
Constants & tunables cheat sheet	docs/constants.md
Per-version release notes	docs/releases/

Always-on rules (load these into your working memory)

These invariants need to be in scope on every change — they're the ones that quietly break if forgotten. The full architectural invariants and deny-list live in docs/invariants.md; §IX (deny-list) is the fastest first-pass when reviewing any change.

__manifest is the atomic-publish boundary. Multi-dataset commits flip via a single ManifestBatchPublisher write. Don't introduce code paths that publish per sub-table outside the batch publisher — you'll lose snapshot isolation across tables.
nearest($x.vec, $q) requires a LIMIT. The compiler enforces it, but if you're touching the query lowering or executor, don't break this rule. ANN without a limit is unbounded.
Snapshot isolation per query. A query holds one Snapshot for its lifetime. Don't read against db.head() mid-query; use the snapshot bound at lowering time.
Run isolation lives on __run__<id> branches. Mutations inside begin_run … publish_run must go through run_branch, not target_branch. Publish picks fast-path (target unmoved) or merge-path (three-way).
Schema apply is serialized via __schema_apply_lock__. Concurrent apply_schema is not safe. Don't bypass the lock.
branch_list() filters internal branches. __run__… and __schema_apply_lock__ must not appear in user-visible listings, exports, or policy-scoped operations. If you add a new system branch, follow the __name__ prefix convention and add it to the filter.
Bearer-token plaintext never persists in process memory. Tokens are SHA-256 hashed at startup; comparison uses subtle::ConstantTimeEq. The actor id is server-resolved from the hash match — it must not be settable by the client.
Mutations are atomic at the manifest commit boundary. Multi-statement change queries publish one commit. Don't commit per-statement.
Indexes are built on the branch head, not on a snapshot. Reads always see the current index state. Lazy fork: a branch that hasn't mutated a sub-table reuses the source's index until the first write.
Stable type IDs survive renames. Schema migration uses stable_type_id (kind+name hashed at first sight). Don't mint new IDs on rename.
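
The branch_list() rule above can be sketched as a single predicate that every user-visible listing goes through. This is an illustration only: `is_internal_branch`, `user_visible_branches`, and the two constants are hypothetical names, and the real filter in the engine may be shaped differently. Grep the source before relying on any of them.

```rust
// Hypothetical constants; the two branch names come from the rules above.
const RUN_BRANCH_PREFIX: &str = "__run__";
const SCHEMA_APPLY_LOCK: &str = "__schema_apply_lock__";

/// Returns true for branches that must stay out of user-visible listings,
/// exports, and policy-scoped operations.
pub fn is_internal_branch(name: &str) -> bool {
    name.starts_with(RUN_BRANCH_PREFIX) || name == SCHEMA_APPLY_LOCK
}

/// Reduces a raw branch listing to the branches a user may see.
pub fn user_visible_branches<'a>(all: &[&'a str]) -> Vec<&'a str> {
    all.iter()
        .copied()
        .filter(|name| !is_internal_branch(name))
        .collect()
}
```

Keeping the check in one place means that adding a new system branch is a one-line change to the predicate, not a hunt through every call site that lists branches.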

Deny-list (fast-pass review filter — full reasoning in docs/invariants.md §IX)

If a proposal fits one of these, the burden is on the proposer to justify why this case is the exception:

Synchronous-inline index updates for indexes expensive to build (vector ANN, FTS) — use the reconciler pattern.
Custom WAL / transaction manager / buffer pool — Lance owns these.
Job queue for state derivable from manifest — reconciler pattern instead.
Per-feature lowering for shapes that share a structure (interfaces, wildcards, alternation) — use one mechanism.
Eager materialization of cross-products in multi-hop — factorize; flatten only when needed.
Ad-hoc IN-list filtering when SIP fits.
String-flattened SQL filter generation when structured pushdown is available.
In-process-only Dataset impls — Send + Sync, remote descriptors.
Cost-blind plan choice — lowering-order execution is not a planner.
Hidden statistics — if a metric matters for plan choice, it must be exposed through the trait surface.
Side-channels for query semantics — search modes, mutations, polymorphism are first-class IR concepts.
Discarding rank in retrieval — score and rank propagate as columns.
State that drifts from the manifest — derive from observable state.
Cloud-only correctness fixes — correctness is always OSS.
Forking the codebase for Cloud — trait-extension only.
Hand-rolling something Lance already does — check the spec first.
Mutating in place state that should be immutable (Lance fragments, index segments) — new segments instead.
Silent failures — OOM, timeout, partial result must all be surfaced and bounded.
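
Several deny-list entries point to the reconciler pattern. As a rough sketch under stated assumptions, a reconciler compares desired state (table versions recorded in the manifest) with observed state (the version each index was built at) and returns the gap. It keeps no queue of its own. The types and names below (`IndexAction`, `reconcile_indexes`, the map shapes) are hypothetical and do not describe the engine's actual API.

```rust
use std::collections::BTreeMap;

/// Work the reconciler decides on for one table (hypothetical shape).
#[derive(Debug, PartialEq, Eq)]
pub enum IndexAction {
    /// No index has been built for this table yet.
    Build { table: String, target_version: u64 },
    /// An index exists but trails the table version in the manifest.
    Refresh { table: String, from_version: u64, target_version: u64 },
}

/// Compare manifest versions with the versions the indexes were built at.
/// Nothing is queued or remembered: the same inputs always give the same
/// actions, so rerunning after a crash is safe.
pub fn reconcile_indexes(
    manifest: &BTreeMap<String, u64>,
    indexed_at: &BTreeMap<String, u64>,
) -> Vec<IndexAction> {
    let mut actions = Vec::new();
    for (table, &target_version) in manifest {
        match indexed_at.get(table) {
            None => actions.push(IndexAction::Build {
                table: table.clone(),
                target_version,
            }),
            Some(&from_version) if from_version < target_version => {
                actions.push(IndexAction::Refresh {
                    table: table.clone(),
                    from_version,
                    target_version,
                })
            }
            Some(_) => {} // index is already current
        }
    }
    actions
}
```

Because the action list is derived from observable state on each pass, there is no separate job state that can drift from the manifest.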

Quick-reference flows

# Initialize an S3-backed repo
omnigraph init --schema ./schema.pg s3://my-bucket/repo.omni

# Bulk load
omnigraph load --data ./seed.jsonl --mode overwrite s3://my-bucket/repo.omni

# Branch + ingest a review batch
omnigraph branch create --from main review/2026-04-25 s3://my-bucket/repo.omni
omnigraph ingest --branch review/2026-04-25 --data ./batch.jsonl s3://my-bucket/repo.omni

# Run a hybrid (vector + BM25) query
omnigraph read --query ./queries.gq --name find_similar \
  --params '{"q":"trends in AI safety"}' --format table s3://my-bucket/repo.omni

# Plan + apply schema migration
omnigraph schema plan  --schema ./next.pg s3://my-bucket/repo.omni
omnigraph schema apply --schema ./next.pg s3://my-bucket/repo.omni --json

# Merge review branch back
omnigraph branch merge review/2026-04-25 --into main s3://my-bucket/repo.omni

# Compact + GC (preview, then confirm)
omnigraph optimize s3://my-bucket/repo.omni
omnigraph cleanup  --keep 10 --older-than 7d s3://my-bucket/repo.omni
omnigraph cleanup  --keep 10 --older-than 7d --confirm s3://my-bucket/repo.omni

# Stand up the HTTP server (token from env)
OMNIGRAPH_SERVER_BEARER_TOKEN=xxxx \
  omnigraph-server s3://my-bucket/repo.omni --bind 0.0.0.0:8080

# Cedar policy explain
omnigraph policy explain --actor act-alice --action change --branch main

Capability matrix: "Lance by default vs. added by OmniGraph"

Capability	L1 (Lance default)	L2 (OmniGraph adds)
Columnar storage on object store	✅ Arrow/Lance	URI normalization, S3 env-var plumbing
Per-dataset versioning + time travel	✅	`snapshot_at_version`, `entity_at`, snapshot-pinned reads across many tables
Per-dataset branches	✅	Graph-level branches (atomic across all sub-tables), lazy fork, system branch filtering
Atomic single-dataset commits	✅	Atomic multi-dataset publish via `__manifest` + `ManifestBatchPublisher`
Compaction (`compact_files`)	✅	`omnigraph optimize` orchestrates over all node/edge tables, bounded concurrency
Cleanup (`cleanup_old_versions`)	✅	`omnigraph cleanup` with `--keep` / `--older-than` policy
BTREE / inverted (FTS) / vector indexes	✅	`ensure_indices` builds them on every relevant column; idempotent; lazy across branches
`merge_insert` upsert	✅	`LoadMode::Merge`, mutation `update`/`insert`/`delete` lowering
Vector search	✅	`nearest()` query op; embedding pipeline (Gemini / OpenAI clients); `@embed` in schema
Full-text search	✅	`search/fuzzy/match_text/bm25` query ops
Hybrid ranking	—	`rrf(...)` Reciprocal Rank Fusion in one runtime
Graph traversal	—	CSR/CSC topology index, `Expand` IR op, variable-length hops, `not { }` anti-join
Schema language	—	`.pg` + Pest grammar + catalog + interfaces + constraints + annotations
Query language	—	`.gq` + Pest grammar + IR + lowering + linter
Schema migration planning	—	`plan_schema_migration` + `apply_schema` step types + `__schema_apply_lock__`
Commit graph (DAG) across whole repo	—	`_graph_commits.lance` with linear + merge parents, ULID ids, actor map
Transactional runs	—	`_graph_runs.lance`, `__run__<id>` ephemeral branches, fast-path & merge-path publish
Three-way row-level merge	—	`OrderedTableCursor` + `StagedTableWriter`, structured `MergeConflictKind`
Change feeds	—	`diff_between` / `diff_commits` with manifest fast path + ID streaming
Cedar policy	—	10 actions, branch / target_branch / protected scopes, validate/test/explain CLI
HTTP server	—	Axum, OpenAPI via utoipa, bearer auth (SHA-256, AWS Secrets Manager option), policy gating, NDJSON streaming export
CLI with config	—	`omnigraph.yaml`, aliases, multi-format output (json/jsonl/csv/kv/table)
Audit / actor tracking	—	`_as` write APIs + actor maps in commit & run datasets
Local RustFS bootstrap	—	`scripts/local-rustfs-bootstrap.sh` one-shot S3-backed dev environment
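
The graph-traversal row mentions a CSR/CSC topology index. For readers unfamiliar with the layout, here is a minimal sketch of a generic Compressed Sparse Row adjacency structure with a one-hop neighbor lookup. It assumes dense `u32` node ids and is not the engine's GraphIndex. The `Csr` type and its methods are hypothetical.

```rust
/// Compressed Sparse Row adjacency (generic sketch, not the engine's type).
/// Neighbors of node `v` are `targets[offsets[v]..offsets[v + 1]]`.
pub struct Csr {
    offsets: Vec<usize>,
    targets: Vec<u32>,
}

impl Csr {
    /// Build from (src, dst) edges over dense node ids `0..num_nodes`.
    /// Panics if an edge references a source id outside that range.
    pub fn from_edges(num_nodes: usize, edges: &[(u32, u32)]) -> Self {
        // Count the out-degree of each source node.
        let mut offsets = vec![0usize; num_nodes + 1];
        for &(src, _) in edges {
            offsets[src as usize + 1] += 1;
        }
        // A prefix sum turns the counts into start offsets.
        for i in 0..num_nodes {
            offsets[i + 1] += offsets[i];
        }
        // Scatter each destination into the next free slot of its source.
        let mut cursor = offsets.clone();
        let mut targets = vec![0u32; edges.len()];
        for &(src, dst) in edges {
            let slot = &mut cursor[src as usize];
            targets[*slot] = dst;
            *slot += 1;
        }
        Csr { offsets, targets }
    }

    /// One-hop expansion: the outgoing neighbors of `node`.
    pub fn neighbors(&self, node: u32) -> &[u32] {
        let v = node as usize;
        &self.targets[self.offsets[v]..self.offsets[v + 1]]
    }
}
```

Building the same structure from reversed `(dst, src)` pairs gives the CSC view, which answers incoming-edge lookups with the same slice access.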

Maintenance contract for agents

When you change something user-visible, update the relevant docs/<area>.md in the same change. Pointers from this file to that doc must keep working — CI enforces cross-link integrity via scripts/check-agents-md.sh.

When proposing or reviewing a non-trivial change, walk docs/invariants.md — at minimum the §IX deny-list and §X review checklist. Add to the deny-list when a new anti-pattern surfaces; relaxing an invariant requires the same review process as code.

Rules:

Update in the same PR. New endpoint, query function, CLI flag, env var, constant, schema construct, or invariant: update both the source code and the doc in the same change. Never split documentation drift into a follow-up.
Bump version on release. When a release boundary is crossed (e.g. v0.3.1 → v0.3.2), update the version line at the top of this file and add a docs/releases/<version>.md describing the user-visible delta. Update docs/architecture.md only if the architecture itself changed.
Don't lie. If a section becomes wrong but you can't rewrite it fully right now, replace the wrong line with *(stale — needs update after <change>)* rather than leaving silently incorrect text. Then fix it ASAP.
Re-verify before recommending. If you cite a flag, env var, endpoint, or constant to the user or in code, grep for it in source first. Memory and docs go stale; the code is authoritative.
Keep AGENTS.md a map, not an encyclopedia. New deep content goes into docs/. Add an entry to "Where to find each topic" instead of pasting prose into this file. The "Always-on rules" section is the exception — it's for invariants that should always be in scope.
Re-read on schema/query/IR changes. Edits to schema.pest, query.pest, ir/lower.rs, query/typecheck.rs, or query/lint.rs should trigger a re-read of docs/schema-language.md, docs/query-language.md, and docs/execution.md to confirm they still describe reality.

CI check: scripts/check-agents-md.sh verifies that every docs/*.md link in this file resolves and that every doc in the canonical set is linked. Run it locally before opening a PR if you've moved or renamed docs.
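
The two directions of that check can be sketched as set differences. This is an illustration of the logic only, not the contents of scripts/check-agents-md.sh. The function names and the simple tokenizer are assumptions.

```rust
use std::collections::BTreeSet;

/// Collect every `docs/<path>.md` token mentioned in `text`.
/// The tokenizer is a simplification: it splits on anything that is not
/// a letter, digit, '/', '.', '-' or '_'.
pub fn linked_docs(text: &str) -> BTreeSet<String> {
    text.split(|c: char| !(c.is_ascii_alphanumeric() || matches!(c, '/' | '.' | '-' | '_')))
        .map(|tok| tok.trim_end_matches('.')) // drop a sentence-ending period
        .filter(|tok| tok.starts_with("docs/") && tok.ends_with(".md"))
        .map(str::to_string)
        .collect()
}

/// Returns (links that do not resolve, canonical docs that are never linked).
pub fn check_doc_links(
    text: &str,
    existing: &BTreeSet<String>,
    canonical: &BTreeSet<String>,
) -> (Vec<String>, Vec<String>) {
    let linked = linked_docs(text);
    let broken: Vec<String> = linked.difference(existing).cloned().collect();
    let unlinked: Vec<String> = canonical.difference(&linked).cloned().collect();
    (broken, unlinked)
}
```

Both returned lists being empty corresponds to a passing check. A non-empty first list means a pointer in this file is dead, and a non-empty second list means a doc was added without a map entry.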
