diff --git a/AGENTS.md b/AGENTS.md index 0ef8f92..378de88 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -100,7 +100,7 @@ Full diagram and concurrency model: [docs/dev/architecture.md](docs/dev/architec | Audit / actor tracking | [docs/user/operations/audit.md](docs/user/operations/audit.md) | | Error taxonomy and result serialization | [docs/user/operations/errors.md](docs/user/operations/errors.md) | | Install (binary / Homebrew / source / channels) | [docs/user/install.md](docs/user/install.md) | -| Deployment (binary / container / RustFS bootstrap / auth / build variants) | [docs/user/deployment.md](docs/user/deployment.md) | +| Deployment (binary / container / S3-local testing / auth / build variants) | [docs/user/deployment.md](docs/user/deployment.md) | | CI / release workflows | [docs/dev/ci.md](docs/dev/ci.md) | | Code ownership (CODEOWNERS source of truth, roles, regeneration) | [docs/dev/codeowners.md](docs/dev/codeowners.md) | | Branch protection policy (declarative, applied via `scripts/apply-branch-protection.sh`) | [docs/dev/branch-protection.md](docs/dev/branch-protection.md) | @@ -192,7 +192,7 @@ cargo test -p omnigraph-engine --features failpoints --test failpoints # fault cargo build -p omnigraph-server --features aws # AWS Secrets Manager bearer-token source ``` -S3-backed tests (`s3_storage`, and the S3 paths in server/CLI system tests) **skip** unless `OMNIGRAPH_S3_TEST_BUCKET` + `AWS_*` (incl. `AWS_ENDPOINT_URL_S3` for non-AWS) are set; CI runs them against containerized RustFS. `scripts/local-rustfs-bootstrap.sh` stands up a local S3 environment. +S3-backed tests (`s3_storage`, and the S3 paths in server/CLI system tests) **skip** unless `OMNIGRAPH_S3_TEST_BUCKET` + `AWS_*` (incl. `AWS_ENDPOINT_URL_S3` for non-AWS) are set; CI runs them against containerized RustFS. To run RustFS/MinIO yourself, see [docs/user/deployment.md](docs/user/deployment.md) → *Testing against S3 locally*. 
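A minimal preflight sketch for the skip condition described above. `s3_test_env_missing` is a hypothetical helper, not a script in this repo, and the exact set of `AWS_*` names the tests read is an assumption (the endpoint variable only matters for non-AWS stores such as RustFS or MinIO):

```bash
# Hypothetical helper (not part of the repo): print each variable that is
# unset or empty. The list of names is an assumption; adjust it to match
# what the S3-backed tests actually read.
s3_test_env_missing() (
  for name in OMNIGRAPH_S3_TEST_BUCKET AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY \
              AWS_REGION AWS_ENDPOINT_URL_S3; do
    eval "value=\${$name:-}"   # indirect lookup that works in any POSIX shell
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
)

# Report only; this does not run the test suite.
missing="$(s3_test_env_missing)"
if [ -n "$missing" ]; then
  echo "S3-backed tests would skip; unset or empty:" $missing
fi
```

Running this before `cargo test --workspace --locked` makes a silent skip visible, which may help when a local run passes but the S3 paths were never exercised.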
CI does **not** run `clippy` or `rustfmt` as gates — but `cargo test --workspace --locked` is the exact gate, so run it before pushing. Two non-test CI checks: `scripts/check-agents-md.sh` (doc cross-link integrity — run it after moving/renaming docs) and OpenAPI drift (`crates/omnigraph-server/tests/openapi.rs` regenerates `openapi.json`; set `OMNIGRAPH_UPDATE_OPENAPI=1` to update the checked-in copy when a server/API change is intentional). @@ -268,7 +268,8 @@ omnigraph policy explain --cluster ./company-brain --graph knowledge --actor act | HTTP server | — | Axum, OpenAPI via utoipa, bearer auth (SHA-256, AWS Secrets Manager option), `authorize_request` at the HTTP boundary (resolves bearer→actor, applies admission control), NDJSON streaming export, **cluster-only boot (RFC-011): always `--cluster `, serving N graphs (N ≥ 1) under multi-graph routes + read-only `GET /graphs` enumeration + per-graph + server-level Cedar policies. Add/remove graphs via `cluster apply` and restart.** | | CLI with config | — | two-surface config (team `cluster.yaml` dir + per-operator `~/.omnigraph/config.yaml`), scope addressing (`--store`/`--server`/`--cluster`/`--profile`/defaults, RFC-011), aliases, multi-format output (json/jsonl/csv/kv/table) | | Audit / actor tracking | — | `_as` write APIs + actor map in commit graph | -| Local RustFS bootstrap | — | `scripts/local-rustfs-bootstrap.sh` one-shot S3-backed dev environment | +| Local S3 testing | — | run RustFS/MinIO + the `AWS_*` env; see [docs/user/deployment.md](docs/user/deployment.md) → *Testing against S3 locally* | +| Agent skill | — | `skills/omnigraph` — operational playbook for driving Omnigraph; install with `npx skills add ModernRelay/omnigraph@omnigraph` | --- diff --git a/README.md b/README.md index bee0fa5..e1a99f6 100644 --- a/README.md +++ b/README.md @@ -12,7 +12,7 @@ Hundreds of agents can enrich the graph on parallel isolated branches and change - Git-style versioning & branching - Multimodal retrieval 
(graph+vector/fts+filters) optimized for context assembly -- Object storage native (S3, RustFS) +- Runs on the local filesystem or any S3-compatible object store (AWS S3, R2, MinIO, RustFS) - Native blob-as-data support (docs, images, videos, etc) - VPC, On-prem, hybrid deployment - [`Lance`](https://github.com/lance-format/lance) format as open storage layer @@ -52,29 +52,45 @@ brew tap ModernRelay/tap brew install ModernRelay/tap/omnigraph ``` -For starter graphs and agent skills to bootstrap and operate Omnigraph, see [`ModernRelay/omnigraph-cookbooks`](https://github.com/ModernRelay/omnigraph-cookbooks). +## Set it up with an AI agent -## One-Command Local RustFS Bootstrap +Omnigraph is built to be set up by coding agents. Paste this into Claude Code, +Cursor, or any agent that can read a URL, install a package, and run a shell +command — it installs the skill, reads the docs, and walks you through setup for +your use case: -```bash -curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/local-rustfs-bootstrap.sh | bash +```text +Help me set up Omnigraph (a lakehouse-native graph engine for agents). + +1. Install the Omnigraph skill so you operate it correctly: + npx skills add ModernRelay/omnigraph@omnigraph +2. Read the docs at https://github.com/ModernRelay/omnigraph — start with + docs/user/quickstart.md, then docs/user/clusters/index.md. +3. Skim the starter graphs and seed data in the cookbooks: + https://github.com/ModernRelay/omnigraph-cookbooks +4. Ask me what I want to build (company brain, agent memory, dev graph, + research / R&D layer, …). Then install the CLI, stand up a first graph for + that use case, load a little data, and run a query so I can see it working. ``` -That bootstrap: +Works with any agent that can browse a URL, install a package, and run a shell. 
-- starts RustFS on `127.0.0.1:9000` -- creates a bucket and S3-backed graph -- loads the checked-in context fixture -- launches `omnigraph-server` on `127.0.0.1:8080` +## Agent skill & starter graphs -Docker must be installed and running first. +This repo ships the [**`omnigraph` agent skill**](skills/omnigraph) — the +operational playbook (cluster mode, the two config surfaces, schema evolution, +query linting, data writes, branches, Cedar policy, and common gotchas) that +teaches a coding agent to drive Omnigraph correctly. Install it with: -The RustFS bootstrap prefers the rolling `edge` binaries and only falls back to -source builds when release assets are unavailable. +```bash +npx skills add ModernRelay/omnigraph@omnigraph +``` -If a previous run left objects under the same graph prefix but did not finish -initializing the graph, rerun with `RESET_REPO=1` or set `PREFIX` to a new -value. +For ready-to-run graphs with real seed data (company brain, VC operating system, +pharma & industry intel), +[`ModernRelay/omnigraph-cookbooks`](https://github.com/ModernRelay/omnigraph-cookbooks) +is the fastest way to see Omnigraph shaped to a real domain. To rehearse the S3 +path locally, see [deployment.md → Testing against S3 locally](docs/user/deployment.md#testing-against-s3-locally). ## Common Commands diff --git a/docs/user/deployment.md b/docs/user/deployment.md index 21b8087..a0d8e9f 100644 --- a/docs/user/deployment.md +++ b/docs/user/deployment.md @@ -129,49 +129,46 @@ shape above) — the simplest AWS architecture. unvalidated** — boot is lock-free read-only so it should compose, but it is not yet exercised by tests. -## One-Command Local RustFS Bootstrap +## Testing against S3 locally -The easiest local S3-backed deployment path is: +To exercise the S3 storage path without a cloud account, run any S3-compatible +store in Docker and point the standard `AWS_*` environment at it. RustFS is +shown; MinIO works the same way. 
```bash -curl -fsSL https://raw.githubusercontent.com/ModernRelay/omnigraph/main/scripts/local-rustfs-bootstrap.sh | bash +docker run -d --name omnigraph-s3 -p 9000:9000 \ + -e RUSTFS_ACCESS_KEY=omnigraph -e RUSTFS_SECRET_KEY=omnigraph \ + -e RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true \ + rustfs/rustfs:latest /data + +export AWS_ACCESS_KEY_ID=omnigraph AWS_SECRET_ACCESS_KEY=omnigraph \ + AWS_REGION=us-east-1 AWS_ENDPOINT_URL_S3=http://127.0.0.1:9000 \ + AWS_ALLOW_HTTP=true AWS_S3_FORCE_PATH_STYLE=true + +# create the bucket once (any S3 client works) +aws --endpoint-url "$AWS_ENDPOINT_URL_S3" s3 mb s3://omnigraph-local ``` -The bootstrap: +Now an `s3://…` URI works anywhere a graph or cluster root is expected. Root a +cluster on the bucket and serve it config-free: -- starts a local RustFS-backed object store -- creates a bucket and S3-backed Omnigraph graph -- loads the checked-in context fixture -- starts `omnigraph-server` on `127.0.0.1:8080` +```bash +# cluster.yaml +# version: 1 +# storage: s3://omnigraph-local/clusters/demo +# graphs: { demo: { schema: schema.pg } } -Supported behavior: +omnigraph cluster validate --config . +omnigraph cluster import --config . +omnigraph cluster apply --config . 
--as you +omnigraph load --data seed.jsonl --mode merge \ + s3://omnigraph-local/clusters/demo/graphs/demo.omni +omnigraph-server --cluster s3://omnigraph-local/clusters/demo \ + --bind 127.0.0.1:8080 --unauthenticated +``` -- downloads the rolling `edge` binary when one exists for the current platform -- otherwise clones `ModernRelay/omnigraph` and builds from source -- reuses an existing RustFS container if it is already running - -Useful overrides: - -- `WORKDIR=/path/to/state` -- `BUCKET=omnigraph-local` -- `PREFIX=graphs/context` -- `RESET_REPO=1` to delete an existing partially initialized graph prefix before recreating it -- `BIND=127.0.0.1:8080` -- `RUSTFS_CONTAINER_NAME=omnigraph-rustfs-demo` - -The bootstrap expects: - -- Docker -- `curl` -- either a matching release asset or a local Rust toolchain plus `git` - -If `aws` is not installed, the script attempts a user-local AWS CLI install via -`python3 -m pip`. Docker Desktop or another Docker daemon must already be -running. - -If a previous bootstrap left objects behind under the selected `PREFIX` but did -not finish initializing the graph, rerun with `RESET_REPO=1` or choose a new -`PREFIX`. +The same `AWS_*` contract applies to a production object store — swap the +endpoint and credentials. CI exercises this path against containerized RustFS. ## Container Deployment diff --git a/docs/user/quickstart.md b/docs/user/quickstart.md index b39ff1b..dd8c2e7 100644 --- a/docs/user/quickstart.md +++ b/docs/user/quickstart.md @@ -53,10 +53,13 @@ query find_people($title: String) { Run it: ```bash -omnigraph read --query queries.gq --name find_people \ - --params '{"title":"Engineer"}' --format table graph.omni +omnigraph query find_people --query queries.gq \ + --params '{"title":"Engineer"}' --format table --store graph.omni ``` +The query name is positional; `--query` points at the `.gq` source and +`--store` addresses the graph's storage directly. 
+ The [query language](queries/index.md) covers `match`/`return`/`order`, and [search](search/index.md) covers vector and full-text search. diff --git a/scripts/local-rustfs-bootstrap.sh b/scripts/local-rustfs-bootstrap.sh deleted file mode 100755 index 2425c77..0000000 --- a/scripts/local-rustfs-bootstrap.sh +++ /dev/null @@ -1,425 +0,0 @@ -#!/usr/bin/env bash -set -euo pipefail - -REPO_SLUG="${REPO_SLUG:-ModernRelay/omnigraph}" -SOURCE_REF="${SOURCE_REF:-main}" -RELEASE_CHANNEL="${RELEASE_CHANNEL:-edge}" -WORKDIR="${WORKDIR:-$PWD/.omnigraph-rustfs-demo}" -RUSTFS_CONTAINER_NAME="${RUSTFS_CONTAINER_NAME:-omnigraph-rustfs-demo}" -# Pinned to 1.0.0-beta.8 (2026-06-10), matching CI (.github/workflows/ci.yml). -# beta.4+ has a credentials-policy check that refuses to start when the -# access/secret keys are values it considers "default" (rustfsadmin/rustfsadmin -# here); this script passes RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true -# below, so overriding RUSTFS_IMAGE to another tag is safe. 
-RUSTFS_IMAGE="${RUSTFS_IMAGE:-rustfs/rustfs:1.0.0-beta.8}" -RUSTFS_DATA_DIR="${RUSTFS_DATA_DIR:-$WORKDIR/rustfs-data}" -BUCKET="${BUCKET:-omnigraph-local}" -PREFIX="${PREFIX:-repos/context}" -BIND="${BIND:-127.0.0.1:8080}" -AWS_ACCESS_KEY_ID="${AWS_ACCESS_KEY_ID:-rustfsadmin}" -AWS_SECRET_ACCESS_KEY="${AWS_SECRET_ACCESS_KEY:-rustfsadmin}" -AWS_REGION="${AWS_REGION:-us-east-1}" -AWS_ENDPOINT_URL="${AWS_ENDPOINT_URL:-http://127.0.0.1:9000}" -AWS_ENDPOINT_URL_S3="${AWS_ENDPOINT_URL_S3:-$AWS_ENDPOINT_URL}" -AWS_ALLOW_HTTP="${AWS_ALLOW_HTTP:-true}" -AWS_S3_FORCE_PATH_STYLE="${AWS_S3_FORCE_PATH_STYLE:-true}" -FORCE_BUILD="${FORCE_BUILD:-0}" -RESET_REPO="${RESET_REPO:-0}" - -REPO_URI="s3://$BUCKET/$PREFIX" -SERVER_LOG="$WORKDIR/omnigraph-server.log" -SERVER_PID_FILE="$WORKDIR/omnigraph-server.pid" -BIN_DIR="" -FIXTURE_DIR="" -AWS_BIN="" - -log() { - printf '==> %s\n' "$*" -} - -die() { - printf 'error: %s\n' "$*" >&2 - exit 1 -} - -need_cmd() { - command -v "$1" >/dev/null 2>&1 || die "missing required command: $1" -} - -repo_root_from_shell() { - if [ -f "$PWD/Cargo.toml" ] && [ -f "$PWD/crates/omnigraph/tests/fixtures/context.pg" ]; then - printf '%s\n' "$PWD" - return 0 - fi - - if [ -n "${BASH_SOURCE[0]:-}" ] && [ -f "${BASH_SOURCE[0]}" ]; then - local candidate - candidate="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." 
&& pwd)" - if [ -f "$candidate/Cargo.toml" ] && [ -f "$candidate/crates/omnigraph/tests/fixtures/context.pg" ]; then - printf '%s\n' "$candidate" - return 0 - fi - fi - - return 1 -} - -latest_release_tag() { - local json - json="$(curl -fsSL "https://api.github.com/repos/$REPO_SLUG/releases/latest" 2>/dev/null || true)" - printf '%s' "$json" | sed -n 's/.*"tag_name":[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1 -} - -platform_asset_name() { - local os arch - os="$(uname -s)" - arch="$(uname -m)" - - case "$os/$arch" in - Linux/x86_64) - printf 'omnigraph-linux-x86_64.tar.gz\n' - ;; - Darwin/arm64) - printf 'omnigraph-macos-arm64.tar.gz\n' - ;; - *) - return 1 - ;; - esac -} - -checksum_command() { - if command -v shasum >/dev/null 2>&1; then - printf 'shasum -a 256' - return - fi - - if command -v sha256sum >/dev/null 2>&1; then - printf 'sha256sum' - return - fi - - die "missing checksum tool: expected shasum or sha256sum" -} - -release_base_url() { - case "$RELEASE_CHANNEL" in - stable) - printf 'https://github.com/%s/releases/latest/download\n' "$REPO_SLUG" - ;; - edge) - printf 'https://github.com/%s/releases/download/edge\n' "$REPO_SLUG" - ;; - *) - die "unsupported RELEASE_CHANNEL '$RELEASE_CHANNEL' (expected stable or edge)" - ;; - esac -} - -verify_checksum() { - local archive="$1" - local checksum_file="$2" - local expected actual tool - - expected="$(awk '{print $1}' "$checksum_file")" - [ -n "$expected" ] || die "checksum file did not contain a SHA256 digest" - - tool="$(checksum_command)" - actual="$($tool "$archive" | awk '{print $1}')" - - [ "$actual" = "$expected" ] || die "checksum verification failed for $(basename "$archive")" -} - -ensure_aws_cli() { - if command -v aws >/dev/null 2>&1; then - AWS_BIN="$(command -v aws)" - return - fi - - need_cmd python3 - - if ! 
python3 -m pip --version >/dev/null 2>&1; then - python3 -m ensurepip --upgrade --user >/dev/null 2>&1 || die "aws cli not found and python3 pip bootstrap failed" - fi - - log "Installing a user-local AWS CLI" - python3 -m pip install --user awscli >/dev/null - export PATH="$HOME/.local/bin:$PATH" - - command -v aws >/dev/null 2>&1 || die "aws cli installation succeeded but aws was not found on PATH" - AWS_BIN="$(command -v aws)" -} - -download_fixture_files() { - local ref="$1" - local fixture_target="$WORKDIR/fixtures" - mkdir -p "$fixture_target" - - for file in context.pg context.jsonl; do - curl -fsSL \ - "https://raw.githubusercontent.com/$REPO_SLUG/$ref/crates/omnigraph/tests/fixtures/$file" \ - -o "$fixture_target/$file" || return 1 - done - - FIXTURE_DIR="$fixture_target" -} - -download_release_binaries() { - local asset asset_stem archive_dir archive_path checksum_path base_url - - [ "$FORCE_BUILD" = "1" ] && return 1 - - asset="$(platform_asset_name)" || return 1 - asset_stem="${asset%.tar.gz}" - archive_dir="$WORKDIR/release" - archive_path="$archive_dir/$asset" - checksum_path="$archive_dir/$asset_stem.sha256" - mkdir -p "$archive_dir" "$WORKDIR/bin" - base_url="$(release_base_url)" - - log "Downloading release asset $asset" - curl -fsSL \ - "$base_url/$asset" \ - -o "$archive_path" || return 1 - curl -fsSL \ - "$base_url/$asset_stem.sha256" \ - -o "$checksum_path" || return 1 - verify_checksum "$archive_path" "$checksum_path" || return 1 - tar -C "$WORKDIR/bin" -xzf "$archive_path" || return 1 - - BIN_DIR="$WORKDIR/bin" - if [ "$RELEASE_CHANNEL" = "stable" ]; then - local tag - tag="$(latest_release_tag)" - [ -n "$tag" ] || return 1 - download_fixture_files "$tag" || return 1 - else - download_fixture_files "main" || return 1 - fi -} - -build_from_source() { - local repo_root - repo_root="${1:-}" - - if [ -z "$repo_root" ]; then - need_cmd git - need_cmd cargo - - repo_root="$WORKDIR/source" - if [ ! 
-d "$repo_root/.git" ]; then - log "Cloning $REPO_SLUG at $SOURCE_REF" - git clone --depth 1 --branch "$SOURCE_REF" "https://github.com/$REPO_SLUG.git" "$repo_root" - fi - fi - - need_cmd cargo - log "Building omnigraph binaries from source" - ( - cd "$repo_root" - cargo build --release --locked -p omnigraph-cli -p omnigraph-server - ) - - BIN_DIR="$repo_root/target/release" - FIXTURE_DIR="$repo_root/crates/omnigraph/tests/fixtures" -} - -setup_binaries() { - local repo_root - repo_root="$(repo_root_from_shell || true)" - - if [ -n "${OMNIGRAPH_BIN_DIR:-}" ]; then - BIN_DIR="$OMNIGRAPH_BIN_DIR" - if [ -n "${OMNIGRAPH_FIXTURE_DIR:-}" ]; then - FIXTURE_DIR="$OMNIGRAPH_FIXTURE_DIR" - elif [ -n "$repo_root" ]; then - FIXTURE_DIR="$repo_root/crates/omnigraph/tests/fixtures" - fi - elif ! download_release_binaries; then - if [ -n "$repo_root" ]; then - build_from_source "$repo_root" - else - build_from_source - fi - fi - - [ -x "$BIN_DIR/omnigraph" ] || die "omnigraph binary not found in $BIN_DIR" - [ -x "$BIN_DIR/omnigraph-server" ] || die "omnigraph-server binary not found in $BIN_DIR" - [ -f "$FIXTURE_DIR/context.pg" ] || die "context fixture schema not found in $FIXTURE_DIR" - [ -f "$FIXTURE_DIR/context.jsonl" ] || die "context fixture data not found in $FIXTURE_DIR" -} - -start_rustfs() { - mkdir -p "$RUSTFS_DATA_DIR" - - if docker ps --format '{{.Names}}' | grep -qx "$RUSTFS_CONTAINER_NAME"; then - log "Reusing existing RustFS container $RUSTFS_CONTAINER_NAME" - return - fi - - if docker ps -a --format '{{.Names}}' | grep -qx "$RUSTFS_CONTAINER_NAME"; then - log "Removing stopped RustFS container $RUSTFS_CONTAINER_NAME" - docker rm -f "$RUSTFS_CONTAINER_NAME" >/dev/null - fi - - log "Starting RustFS on $AWS_ENDPOINT_URL_S3" - docker run -d \ - --name "$RUSTFS_CONTAINER_NAME" \ - -p 9000:9000 \ - -p 9001:9001 \ - -v "$RUSTFS_DATA_DIR:/data" \ - -e RUSTFS_ACCESS_KEY="$AWS_ACCESS_KEY_ID" \ - -e RUSTFS_SECRET_KEY="$AWS_SECRET_ACCESS_KEY" \ - -e 
RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true \ - "$RUSTFS_IMAGE" \ - /data >/dev/null -} - -wait_for_rustfs() { - local attempt - for attempt in $(seq 1 30); do - if "$AWS_BIN" --endpoint-url "$AWS_ENDPOINT_URL_S3" s3api list-buckets >/dev/null 2>&1; then - return - fi - sleep 2 - done - - docker logs "$RUSTFS_CONTAINER_NAME" || true - die "RustFS did not become ready" -} - -ensure_bucket() { - log "Ensuring bucket $BUCKET exists" - "$AWS_BIN" --endpoint-url "$AWS_ENDPOINT_URL_S3" \ - s3api create-bucket --bucket "$BUCKET" >/dev/null 2>&1 || true -} - -graph_prefix_has_objects() { - local key_count - key_count="$("$AWS_BIN" --endpoint-url "$AWS_ENDPOINT_URL_S3" \ - s3api list-objects-v2 \ - --bucket "$BUCKET" \ - --prefix "$PREFIX/" \ - --max-keys 1 \ - --query 'KeyCount' \ - --output text 2>/dev/null || true)" - - [ -n "$key_count" ] && [ "$key_count" != "None" ] && [ "$key_count" != "0" ] -} - -reset_graph_prefix() { - log "Removing existing objects under $REPO_URI" - "$AWS_BIN" --endpoint-url "$AWS_ENDPOINT_URL_S3" \ - s3 rm "s3://$BUCKET/$PREFIX" --recursive >/dev/null -} - -initialize_graph() { - if "$BIN_DIR/omnigraph" snapshot "$REPO_URI" --json >/dev/null 2>&1; then - log "Reusing existing graph at $REPO_URI" - return - fi - - if graph_prefix_has_objects; then - if [ "$RESET_REPO" = "1" ]; then - reset_graph_prefix - else - die "found existing objects under $REPO_URI but could not open an Omnigraph graph there. This usually means a previous bootstrap left a partially initialized prefix. Rerun with RESET_REPO=1 to delete that prefix and recreate it, or set PREFIX to a new value." 
- fi - fi - - log "Initializing graph at $REPO_URI" - "$BIN_DIR/omnigraph" init --schema "$FIXTURE_DIR/context.pg" "$REPO_URI" - - log "Loading context fixture into $REPO_URI" - "$BIN_DIR/omnigraph" load --data "$FIXTURE_DIR/context.jsonl" "$REPO_URI" -} - -start_server() { - mkdir -p "$WORKDIR" - - if [ -f "$SERVER_PID_FILE" ] && kill -0 "$(cat "$SERVER_PID_FILE")" >/dev/null 2>&1; then - log "Stopping existing server process $(cat "$SERVER_PID_FILE")" - kill "$(cat "$SERVER_PID_FILE")" >/dev/null 2>&1 || true - sleep 1 - fi - - log "Starting omnigraph-server on $BIND" - nohup "$BIN_DIR/omnigraph-server" "$REPO_URI" --bind "$BIND" >"$SERVER_LOG" 2>&1 & - echo "$!" > "$SERVER_PID_FILE" -} - -wait_for_server() { - local bind_host bind_port health_host base_url - bind_host="${BIND%:*}" - bind_port="${BIND##*:}" - health_host="$bind_host" - if [ "$health_host" = "0.0.0.0" ]; then - health_host="127.0.0.1" - fi - base_url="http://$health_host:$bind_port" - - for _ in $(seq 1 30); do - if curl -fsSL "$base_url/healthz" >/dev/null 2>&1; then - printf '%s\n' "$base_url" - return - fi - sleep 1 - done - - cat "$SERVER_LOG" >&2 || true - die "omnigraph-server did not pass /healthz" -} - -print_summary() { - local base_url="$1" - - cat </dev/null 2>&1 || die "docker is installed but the daemon is not reachable; start Docker Desktop or another daemon and rerun" - - export AWS_ACCESS_KEY_ID - export AWS_SECRET_ACCESS_KEY - export AWS_REGION - export AWS_ENDPOINT_URL - export AWS_ENDPOINT_URL_S3 - export AWS_ALLOW_HTTP - export AWS_S3_FORCE_PATH_STYLE - - mkdir -p "$WORKDIR" - - setup_binaries - ensure_aws_cli - start_rustfs - wait_for_rustfs - ensure_bucket - initialize_graph - start_server - print_summary "$(wait_for_server)" -} - -main "$@" diff --git a/skills/omnigraph/SKILL.md b/skills/omnigraph/SKILL.md new file mode 100644 index 0000000..7bf044a --- /dev/null +++ b/skills/omnigraph/SKILL.md @@ -0,0 +1,414 @@ +--- +name: omnigraph +description: Store, retrieve, and query 
knowledge, memory, and relationships in an Omnigraph graph, and operate a local or remote Omnigraph deployment. Use when the user wants to capture or recall facts, notes, or entities, build or query a knowledge graph or agent memory, or run Omnigraph — and whenever you see Omnigraph CLI commands (omnigraph init/query/mutate/load/schema/lint/embed/branch/commit/login/profile/cluster), .pg schema or .gq query files, s3:// graph URIs, bearer-authed graph endpoints, 504 errors, or a cluster.yaml / omnigraph.yaml / ~/.omnigraph/config.yaml. Covers cluster-mode deployments (cluster.yaml plan/apply, omnigraph-server --cluster), the two config surfaces (cluster.yaml + ~/.omnigraph/config.yaml), schema evolution, query linting, data writes (mutate; load needs --mode/--from), branches, embeddings, Cedar policy, and remote ops. Especially important before schema apply (plan first), any load (--mode required), any .gq/.pg edit (lint after), or any remote write (verify via commit list). +license: MIT (see LICENSE at repo root) +compatibility: Requires omnigraph CLI >= 0.7.0 — the unified `load`, the two config surfaces (cluster.yaml + ~/.omnigraph/config.yaml), and cluster apply/serve all require 0.7.0. +metadata: + author: ModernRelay + version: "0.7.0" + repository: https://github.com/ModernRelay/omnigraph +--- + +# Operating Omnigraph Locally + +This skill captures the operational rules for working with a locally or remotely deployed Omnigraph. Follow them when authoring schema, writing queries, loading data, evolving schema, or automating graph operations. + +## The Seven Rules + +1. **Lint before commit** — `omnigraph lint --schema schema.pg --query queries/foo.gq` validates both sides against each other. No running repo required. +2. **Plan before apply** — never run `schema apply` without a successful `schema plan` first. Apply is destructive; plan is free. 
(Cluster mode has the same rule with different verbs: `cluster plan` before `cluster apply` — the plan embeds the engine's real migration steps.) +3. **Branches are for data; apply is for schema** — review bulk data loads on a feature branch then merge. Schema changes go straight to `main`: in cluster mode edit the `.pg` and run `cluster apply` (a direct `schema apply` **refuses** a cluster-managed graph); `schema plan`/`apply` is for a non-cluster store. +4. **Pick the right write command** — `mutate` for edits (typechecked, parameterized); `load` for bulk JSONL, local **or** remote, with a **required** `--mode` (`merge` upsert · `append` strict-insert · `overwrite` clean-slate). `load --from ` forks a review branch in one shot; bare `load` needs an existing target branch. +5. **Parameterize everything** — never string-interpolate values into `.gq` bodies or `--params`. Declare `$var: Type` and pass via `--params`. +6. **Expose agent operations as aliases** — not raw CLI invocations. Aliases decouple the operation name from the query implementation. +7. **Verify after every remote write** — compare `commit list --branch main` head before and after. The CLI's exit code is not authoritative on remote graphs; proxies can drop the response while the write commits server-side. See `references/remote-ops.md` for the verification ritual and how to recover from 504s. + +## Essentials: Queries, Mutations, Loads + +The patterns below cover the daily 80% — enough to write correct `.gq` and JSONL without leaving this file. The long tail (multi-hop, negation, aggregations, hybrid search, every decorator) is in [`references/queries.md`](references/queries.md) and [`references/schema.md`](references/schema.md). + +**Comments in `.pg` and `.gq` are `//`, never `#`** (the #1 parse error). 
+ +### Read query (`.gq`) + +```gq +query get_signal($slug: String) { + match { + $s: Signal { slug: $slug } // inline property filter goes in the match block + $s formsPattern $p // edge FormsPattern declared PascalCase, traversed lowerCamelCase + } + return { $s.slug, $s.name, $p.slug } +} +``` + +- **Parameterize, never interpolate.** Declare `$var: Type` in the signature; pass via `--params '{"slug":"sig-foo"}'`. An empty signature still needs parens: `query foo() { ... }`. +- **Edge traversal is lowerCamelCase** even though the schema declares edges PascalCase (`FormsPattern` → `formsPattern`). +- **List/sort** by appending `order { $s.stagingTimestamp desc } limit 50` after `return`. +- **Ranking ops (`nearest`/`bm25`/`rrf`) require a trailing `limit N`** — omitting it is a compile error. They live in `order { }`, not as filters. Scope with `match`/filters first, then rank (`order { nearest($d.embedding, $q) } limit 10`). + +### Mutation (`.gq`) + +There is **no top-level `mutation { }`** — every block is a named `query`; the verb (`insert`/`update`/`delete`) makes it a write. Dispatch with `omnigraph mutate` (not `query`). + +```gq +query add_signal($slug: String, $name: String, $brief: String, $createdAt: DateTime) { + insert Signal { slug: $slug, name: $name, brief: $brief, + stagingTimestamp: $createdAt, createdAt: $createdAt, updatedAt: $createdAt } +} +query link($from: String, $to: String) { insert FormsPattern { from: $from, to: $to } } +query retitle($slug: String, $t: String) { update Signal set { name: $t } where slug = $slug } +query remove($slug: String) { delete Signal where slug = $slug } +``` + +- **Every non-nullable property must be supplied** or lint fails (`T12: insert for 'Signal' must provide non-nullable property 'X'`). +- A single mutation is insert/update-only **or** delete-only — never both (parse-time D₂ rule); split them. +- Edges have no `@key`: give `from`/`to` slugs; the property block is `{}` when the edge has none. 
+ +### Bulk load (JSONL) + +```jsonl +{"type":"Signal","data":{"slug":"sig-foo","name":"Foo","brief":"…","stagingTimestamp":"2026-04-14T00:00:00Z","createdAt":"2026-04-14T00:00:00Z","updatedAt":"2026-04-14T00:00:00Z"}} +{"edge":"FormsPattern","from":"sig-foo","to":"pat-bar","data":{}} +``` + +```bash +omnigraph load --data seed.jsonl --mode merge $GRAPH # --mode is REQUIRED (no default) +omnigraph load --data delta.jsonl --from main --branch review --mode merge $GRAPH # fork a review branch in one shot +``` + +- `--mode`: `merge` (upsert by `@key`) · `append` (fails on collision) · `overwrite` (destructive, staged). `--from ` forks a missing `--branch`; bare `load` needs an existing branch. Works local **and** remote. +- **Date footgun**: `mutate --params` takes ISO strings (`Date` `"2026-04-29"`, `DateTime` `"…T00:00:00Z"`); `load` JSONL takes **integer days since epoch** for `Date` (`20572`) but ISO for `DateTime`. + +### Dispatching + +```bash +omnigraph alias signal sig-foo # operator alias → its bound stored query (read or write) +omnigraph query get_signal --params '{"slug":"sig-foo"}' # served stored query by name (verb asserts read vs write) +omnigraph query -e 'query q() { match { $s: Signal } return { $s.slug } limit 5 }' # ad-hoc/inline (or: --query f.gq ) +omnigraph mutate add_signal --query mutations.gq --params '{"slug":"sig-foo", ...}' # name positional; ad-hoc file source +omnigraph lint --schema schema.pg --query queries/foo.gq # after EVERY .gq/.pg edit (no server needed) +``` + +### `.gq` grammar + +The non-obvious facts that bite, then the full grammar: + +- **Scalar param types**: `String Bool I32 I64 U32 U64 F32 F64 DateTime Date Blob`. Modifiers: `T?` (optional), `[T]` (list), `Vector(N)`. There is **no `Int`** — use `I64`. +- **A read query needs `match` *and* `return`** (`order`/`limit` optional); a mutation has neither — only `insert`/`update`/`delete`. +- **`limit` takes an integer literal, not a param** — `limit 50`, never `limit $n`. 
+- **Variable-hop traversal**: `$p knows{1,3} $f` (`{1,}` = unbounded). +- **Literals & calls**: `now()`, `date("2026-04-29")`, `datetime("…T00:00:00Z")`, list `[…]`. +- **Filters** `= != > < >= <= contains`; **aggregates** `count/sum/avg/min/max` (`count($f) as n`). +- **Stored-query metadata**: `@description("…")` / `@instruction("…")` may follow the param list. +- **Casing**: type names uppercase-initial (`Signal`); idents/edges lowercase-initial (`formsPattern`); variables `$`-prefixed. `//` and `/* */` comments only. + +Authoritative PEG grammar (pest) for `.gq` files ("NanoGraph" is the legacy engine name): + +```pest +// NanoGraph Query Grammar (.gq files) + +WHITESPACE = _{ " " | "\t" | "\r" | "\n" } +COMMENT = _{ LINE_COMMENT | BLOCK_COMMENT } +LINE_COMMENT = _{ "//" ~ (!"\n" ~ ANY)* } +BLOCK_COMMENT = _{ "/*" ~ (!"*/" ~ ANY)* ~ "*/" } + +query_file = { SOI ~ query_decl* ~ EOI } + +query_decl = { + "query" ~ ident ~ "(" ~ param_list? ~ ")" ~ query_annotation* ~ "{" + ~ query_body + ~ "}" +} +query_annotation = { description_annotation | instruction_annotation } +description_annotation = { "@description" ~ "(" ~ string_lit ~ ")" } +instruction_annotation = { "@instruction" ~ "(" ~ string_lit ~ ")" } + +query_body = { read_query_body | mutation_body } +mutation_body = { mutation_stmt+ } +read_query_body = { + match_clause + ~ return_clause + ~ order_clause? + ~ limit_clause? +} + +mutation_stmt = { insert_stmt | update_stmt | delete_stmt } +insert_stmt = { "insert" ~ type_name ~ "{" ~ mutation_assignment+ ~ "}" } +update_stmt = { "update" ~ type_name ~ "set" ~ "{" ~ mutation_assignment+ ~ "}" ~ "where" ~ mutation_predicate } +delete_stmt = { "delete" ~ type_name ~ "where" ~ mutation_predicate } +mutation_assignment = { ident ~ ":" ~ match_value ~ ","? } +mutation_predicate = { ident ~ comp_op ~ match_value } + +param_list = { param ~ ("," ~ param)* } +param = { variable ~ ":" ~ type_ref } + +type_ref = { (list_type | base_type | vector_type) ~ "?"? 
} +list_type = { "[" ~ base_type ~ "]" } +vector_type = { "Vector" ~ "(" ~ integer ~ ")" } +base_type = { "String" | "Blob" | "Bool" | "I32" | "I64" | "U32" | "U64" | "F32" | "F64" | "DateTime" | "Date" } + +match_clause = { "match" ~ "{" ~ clause+ ~ "}" } + +clause = { negation | binding | traversal | filter | text_search_clause } +text_search_clause = { search_call | fuzzy_call | match_text_call } + +// Binding: $p: Person { name: "Alice" } +binding = { variable ~ ":" ~ type_name ~ ("{" ~ prop_match_list ~ "}")? } + +prop_match_list = { prop_match ~ ("," ~ prop_match)* ~ ","? } +prop_match = { ident ~ ":" ~ match_value } +match_value = { literal | variable | now_call } + +// Traversal: $p knows $f +traversal = { variable ~ edge_ident ~ traversal_bounds? ~ variable } +traversal_bounds = { "{" ~ integer ~ "," ~ integer? ~ "}" } + +// Filter: $f.age > 25 +filter = { expr ~ filter_op ~ expr } + +// Negation: not { ... } +negation = { "not" ~ "{" ~ clause+ ~ "}" } + +// Return clause — projections separated by commas or newlines +return_clause = { "return" ~ "{" ~ projection+ ~ "}" } +projection = { expr ~ ("as" ~ ident)? ~ ","? } + +// Order clause +order_clause = { "order" ~ "{" ~ ordering ~ ("," ~ ordering)* ~ "}" } +ordering = { nearest_ordering | (expr ~ order_dir?) } +nearest_ordering = { "nearest" ~ "(" ~ prop_access ~ "," ~ expr ~ ")" } +order_dir = { "asc" | "desc" } + +// Limit clause +limit_clause = { "limit" ~ integer } + +// Expressions +expr = { now_call | nearest_ordering | search_call | fuzzy_call | match_text_call | bm25_call | rrf_call | agg_call | prop_access | variable | literal | ident } +now_call = { "now" ~ "(" ~ ")" } +search_call = { "search" ~ "(" ~ expr ~ "," ~ expr ~ ")" } +fuzzy_call = { "fuzzy" ~ "(" ~ expr ~ "," ~ expr ~ ("," ~ expr)? 
~ ")" } +match_text_call = { "match_text" ~ "(" ~ expr ~ "," ~ expr ~ ")" } +bm25_call = { "bm25" ~ "(" ~ expr ~ "," ~ expr ~ ")" } +rank_expr = { nearest_ordering | bm25_call } +rrf_call = { "rrf" ~ "(" ~ rank_expr ~ "," ~ rank_expr ~ ("," ~ expr)? ~ ")" } + +prop_access = { variable ~ "." ~ ident } + +agg_call = { agg_func ~ "(" ~ expr ~ ")" } +agg_func = { "count" | "sum" | "avg" | "min" | "max" } + +comp_op = { ">=" | "<=" | "!=" | ">" | "<" | "=" } +filter_op = { "contains" | comp_op } + +// Terminals +variable = @{ "$" ~ (ident_chars | "_") } +ident_chars = @{ (ASCII_ALPHA_LOWER | "_") ~ (ASCII_ALPHANUMERIC | "_")* } + +// Edge identifier — lowercase start, same as ident but used in traversal context +// Must not match keywords +edge_ident = @{ !("not" ~ !ASCII_ALPHANUMERIC) ~ (ASCII_ALPHA_LOWER | "_") ~ (ASCII_ALPHANUMERIC | "_")* } + +type_name = @{ ASCII_ALPHA_UPPER ~ (ASCII_ALPHANUMERIC | "_")* } +ident = @{ (ASCII_ALPHA_LOWER | "_") ~ (ASCII_ALPHANUMERIC | "_")* } + +literal = { list_lit | datetime_lit | date_lit | string_lit | float_lit | integer | bool_lit } +date_lit = { "date" ~ "(" ~ string_lit ~ ")" } +datetime_lit = { "datetime" ~ "(" ~ string_lit ~ ")" } +list_lit = { "[" ~ (literal ~ ("," ~ literal)*)? ~ "]" } +string_lit = @{ "\"" ~ string_char* ~ "\"" } +string_char = @{ !("\"" | "\\") ~ ANY | "\\" ~ ANY } +float_lit = @{ ASCII_DIGIT+ ~ "." ~ ASCII_DIGIT+ } +integer = @{ ASCII_DIGIT+ } +bool_lit = { "true" | "false" } +``` + +## CLI Reference (condensed) + +Notation: `` required · `[x]` optional · `` choice · `…` repeatable. + +**Global addressing flags**: `--as ` (direct/`--store` writes only — a server resolves the actor from its token), `--server `, `--cluster ` (cluster-managed storage, for maintenance), `--graph ` (selects the graph within a `--server` or `--cluster` scope), `--profile ` (`$OMNIGRAPH_PROFILE`), `--store `. Data commands also take a positional `file://`/`s3://` URI (`--config ` is for `cluster` commands only). 
Output: `--json`, or reads take `--format `. **Write guards:** `--yes` skips the confirm prompt for a destructive write (`cleanup`, overwrite `load`, `branch delete`) against a non-local scope (it *refuses* without it when non-TTY or `--json`); `--quiet` suppresses the resolved-target echo. + +**Data plane** — `any` (served via `--server`/`--profile`, or direct via `--store`/URI): +- `query` (alias `read`) `` — a **served stored query** by name (via `--server`/`--profile`); or ad-hoc `[] (--query | -e '')` where `` picks which query in the source. `[--params | --params-file
] [--branch | --snapshot ] [--format | --json]`. No positional URI — address via `--server`/`--store`/`--profile`. +- `mutate` (alias `change`) — same shape (served stored mutation by ``, or ad-hoc `--query`/`-e`); `[--params …] [--branch ] [--json]`. The verb asserts kind: `query`→read, `mutate`→write (400 on mismatch). +- `alias [args…]` — invoke an operator alias's bound stored query (read or write); `[--params … | --params-file
] [--format | --json]` (server/graph/query come from the binding) +- `load --data --mode [--branch ] [--from ] [--json]` — `--mode` required; `--from` forks a missing `--branch` +- `snapshot [--branch ] [--json]` +- `export [--branch ] [--type …] [--table …]` (streams JSONL) +- `branch [--from ] | list | delete | merge --into > [--json]` +- `commit ] | show > [--json]` +- `schema --schema [--allow-data-loss] [--json]` · `schema show` (alias `get`) — `apply` **refuses a cluster-managed graph** (evolve those via `cluster apply`) + +**Served only** (needs `--server`/`--profile`): `graphs list [--json]` + +**Direct / storage** — reject `--server`; address by positional URI or `--cluster --graph `: +- `init --schema [--force]` +- `lint --query [--schema ] [] [--json]` — offline with `--schema`, graph-backed with a URI +- `optimize [--json]` · `repair [--confirm] [--force] [--json]` · `cleanup (--keep | --older-than <7d>) --confirm [--json]` +- `queries ] | list> [--json]` + +**Control plane** — cluster (`--config
`, default `.`): +- `cluster [--config ] [--json]` +- `cluster approve --as [--config ] [--json]` · `cluster force-unlock [--config ] [--json]` + +**Local** (no graph): +- `policy | explain --actor --action [--branch | --target-branch ]> --cluster [--graph ]` +- `embed --seed [--reembed_all | --clean | --select ":="]` +- `login [--token ]` (prefer piping the token on stdin) · `logout ` · `profile ]>` · `version` + +Pre-0.7.0 spellings (`read`/`change`/`ingest`, `--target`, positional `http://`) → [`references/migrations.md`](references/migrations.md). + +## Five Ontology Design Criteria (Gruber 1993) + +Omnigraph schemas are ontologies. The canonical design criteria from Gruber's *Toward Principles for the Design of Ontologies Used for Knowledge Sharing* (Int. J. Human-Computer Studies 43:907–928) apply directly when authoring `.pg` files. + +1. **Clarity** — definitions should communicate intended meaning unambiguously and be independent of social or computational context. In Omnigraph: precise type names, narrow enums over `String`, `@check`/`@range` for stated invariants. A reviewer should understand the domain from the schema alone. +2. **Coherence** — inferences sanctioned by the schema must be consistent with the domain modeled. Gruber's trap: defining quantity as a `(magnitude, unit)` pair makes `6 feet ≠ 2 yards` even though they describe the same length. In Omnigraph: watch for `@card`, `@unique`, and edge directionality that let the schema distinguish things the domain treats as equal. +3. **Extendibility** — the schema should support specialization without revising existing definitions. In Omnigraph: prefer interfaces for shared shape, leave enums open where the domain genuinely admits more, model identifiers via mapping functions rather than baking units/formats into the entity. +4. **Minimal encoding bias** — representation choices made for notation or implementation convenience leak into the model. 
In Omnigraph: don't type dates as `String` because the source API returns strings; separate conceptual entities (a publication date, a person) from their surface encoding (a year integer, a name string) when both matter. +5. **Minimal ontological commitment** — make as few claims about the world as the use case requires. In Omnigraph: don't add required properties, closed enums, or `@card(1..1)` "in case"; tighten later via `schema plan`/`apply` when a real constraint emerges. Weaker schemas leave consumers room to specialize. + +The criteria trade off against each other — Clarity wants tight definitions while Minimal Commitment wants weak ones. Gruber's resolution: *having decided a distinction is worth making, give it the tightest possible definition*. Decide what to model conservatively; once modeled, constrain precisely. + +## Schema Authoring Principles + +Twelve practical rules for `.pg` authoring — full text and examples in [`docs/omni-schema.md`](../../docs/omni-schema.md). In short: schema-is-the-contract · explicit identity via `@key` · model meaning not tables · strong intentional types · deliberate optionality · shared shape in interfaces · schema-level constraints (`@unique`/`@index`/`@range`/`@check`/`@card`) · search as a schema decision · edge semantics matter · reviewable schemas · intentional migrations (`@rename_from`) · domain clarity over ORM habits. + +Design flow: entities → stable keys → relationships worth their own edge → enum candidates → uniqueness/bounds/cardinality → search needs → shared shape into interfaces → evolution plan. + +## Provenance Is Structural (Multi-Agent Source of Truth) + +When Omnigraph serves as canonical truth across multiple agents, every assertion must answer *who said it, when, based on what evidence*. This is the runtime guarantee Gruber's criteria don't cover — his agents shared vocabulary; ours additionally must share attribution. Provenance belongs in the schema, not in logs. 
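
As a rough illustration of why structured attribution matters, the sketch below models an assertion that carries who, when, and evidence as typed fields, then retracts everything resting on one source. It is plain Python, not Omnigraph schema syntax: the `Claim` class, the `retract_source` helper, and all sample values are hypothetical, while the field names follow the `Claim`-style fields this section recommends.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """Hypothetical in-memory stand-in for a Claim-style provenance record."""
    fact: str                           # the assertion itself
    asserted_by: str                    # who said it (actor id)
    asserted_at: datetime               # when it was said
    evidence_source: str                # what it is based on
    confidence: Optional[float] = None  # optional strength of evidence


def retract_source(claims, discredited_source):
    """Keep only the claims that do not rest on the discredited source."""
    return [c for c in claims if c.evidence_source != discredited_source]


# Sample data is invented for illustration only.
claims = [
    Claim("sig-foo forms pat-bar", "act-andrew",
          datetime(2026, 4, 14, tzinfo=timezone.utc), "src-report-1", 0.9),
    Claim("sig-foo is obsolete", "act-bot",
          datetime(2026, 4, 15, tzinfo=timezone.utc), "src-rumor", 0.2),
]

remaining = retract_source(claims, "src-rumor")
```

With a free-text `source` string, the same retraction would depend on string matching rather than an equality filter on a typed field.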
+ +Without structural provenance, agents cannot reconcile contradictory assertions, retract facts when a source is discredited, replay graph state at a past timestamp, or distinguish high-evidence facts from speculation. + +**In Omnigraph:** model provenance as a `Claim`-style interface (or a separate `Claim` node linked to each sourced fact) with required fields — `asserted_by: Actor`, `asserted_at: DateTime`, `evidence_source: Source`, optionally `confidence: F64`. Don't stash provenance into a free-text `source: String` or a `metadata: JSON` dump — structured provenance is queryable, indexable, and migratable; free-form is none of these. + +## Storage & Credentials + +A graph's bytes live in one of two backends: + +- **Local filesystem** — a path or `file://` URI. In cluster mode `storage:` defaults to the config directory, so local dev needs no object store. +- **S3-compatible object storage** — AWS, Railway, Tigris, etc. (`s3://bucket/prefix`). Authenticate with the standard `AWS_*` environment contract; keep dev creds in a git-ignored `.env.omni` and source it before CLI calls: + +```bash +set -a && source .env.omni && set +a +``` + +`init` and `load` write storage directly (bypassing the server); the server reads from it. Validate with `curl http://127.0.0.1:8080/healthz`, then `omnigraph snapshot --json`. + +## Project Layout + +### Deployment & access (omnigraph >= 0.7.0) + +- **Cluster deployment — the only way to serve.** A `cluster.yaml` declares the + whole deployment (graphs, schemas, stored queries, policies, optional S3 + `storage:` root); `omnigraph cluster apply` converges it and + `omnigraph-server --cluster .` (or `--cluster s3://bucket/prefix`, + config-free) serves it. See `references/cluster.md`. +- **Direct / embedded access — no server.** Address a graph's storage directly + with `--store ` or a positional URI for one-off CLI ops. + There is **no single-graph server mode** — the server is cluster-only. 
+ +### The two config surfaces (omnigraph >= 0.7.0) + +Configuration has two single-owner homes (RFC-007/008), plus an +everything-explicit flag/env tier: + +| Surface | Owner | Location | Declares | +|---|---|---|---| +| **Cluster config** | the team, in the repo | `cluster.yaml` + the `.pg`/`.gq`/policy files it references | what the system **is**: graphs, schemas, queries, policies, storage | +| **Operator config** | one person | `~/.omnigraph/config.yaml` (`$OMNIGRAPH_HOME` relocates it) | who **I** am: identity, named servers, output defaults, personal aliases | +| Flags / env | per invocation | — | everything, explicitly | + +```yaml +# ~/.omnigraph/config.yaml — per operator, never committed +operator: + actor: act-andrew # default --as identity +servers: + intel-dev: + url: https://graph.example.com # no tokens here, ever +defaults: + output: table # read-format default + server: intel-dev # default served scope (or `store: file://…/g.omni` for a local default — mutually exclusive) + default_graph: spike # graph within a server/cluster scope +profiles: # optional named scope bundles — pick with --profile + staging: { server: intel-staging, default_graph: spike } +aliases: # personal bindings to TEAM stored queries (see references/aliases.md) + triage: { server: intel-dev, graph: spike, query: weekly_triage, args: [since] } +``` + +The operator config and credentials are **auto-discovered — no flag points at them**: the CLI reads `$OMNIGRAPH_HOME/config.yaml` (default `~/.omnigraph/config.yaml`), and an absent file is just an empty layer (zero-config). `$OMNIGRAPH_HOME` relocates the *directory* only, not a specific file. (`--config`/`$OMNIGRAPH_CONFIG` is a separate flag for the cluster / server config — not this.) + +Credentials live outside config: `echo $TOKEN | omnigraph login intel-dev` +writes `~/.omnigraph/credentials` (`0600`); the matching token resolves via +`OMNIGRAPH_TOKEN_INTEL_DEV` or that file. 
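
The discovery rules above can be sketched as two small functions. This is an illustration only, not the CLI's code: `operator_config_path` and `token_env_var` are hypothetical helper names, and the env-var naming rule (upper-case the server name, hyphens to underscores) is an assumption inferred from the single `intel-dev` to `OMNIGRAPH_TOKEN_INTEL_DEV` example.

```python
from pathlib import Path


def operator_config_path(env):
    """Return the operator config location for a given environment mapping."""
    # $OMNIGRAPH_HOME relocates the directory; the file name stays config.yaml.
    home = env.get("OMNIGRAPH_HOME")
    base = Path(home) if home else Path.home() / ".omnigraph"
    return base / "config.yaml"


def token_env_var(server_name):
    """Assumed naming rule: upper-case, with '-' replaced by '_'."""
    return "OMNIGRAPH_TOKEN_" + server_name.upper().replace("-", "_")


relocated = operator_config_path({"OMNIGRAPH_HOME": "/tmp/og"})
default = operator_config_path({})
var_name = token_env_var("intel-dev")
```

Passing the environment as a mapping keeps the sketch easy to check without touching the real process environment.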
+ +**Addressing a graph**: `--store ` or a positional URI for +direct storage; `--server ` (+ `--graph `) for a served remote; +`--profile ` for a named bundle; else the operator `defaults`. A remote is +addressed with `--server` (a bare `http(s)://` URL is not a graph address). Run +data-plane commands from a graph's project folder so relative `queries/`, +`schema.pg`, and `.env.omni` paths resolve. + +### What to commit + +**Commit:** `schema.pg`, `queries/*.gq`, `cluster.yaml`, `seed.md`, `seed.jsonl`, and the project's `README.md` and `CLAUDE.md`. + +**Ignore:** `.env.omni` (credentials), `.claude/` (local agent state), `*.omni/` (local graph artifacts), `__cluster/` and `graphs/` (cluster state + derived graph roots). + +### Give agents a `CLAUDE.md` + +A per-project `CLAUDE.md` tells coding agents where files live and what conventions matter. Without it, agents re-discover the same things every session. + +## Common Gotchas + +These are the traps most likely to bite. Scan this table before debugging any parse or runtime error. + +| Trap | Symptom | Fix | +|------|---------|-----| +| `#` comments in `.pg` | `parse error: expected schema_file` | Use `//` | +| Standalone `enum Foo { ... 
}` block | `parse error: expected EOI or schema_decl` | Inline: `kind: enum(a, b)` | +| `[Category]` (list of enum) | compile error | Use `[String]`; lists must contain scalars | +| `@embed(text)` without quotes | `unexpected constraint_name` | `@embed("text")` | +| `@unique(src)` on edge without body block | parse error | `@card(1..1) { @unique(src) }` | +| `load --mode merge` after `@embed` source change | stale embeddings | `omnigraph embed --reembed_all` or `load --mode overwrite` | +| `schema apply` with feature branches open | rejected | Merge or delete branches first | +| `nearest(...)` / `bm25(...)` / `rrf(...)` without `limit` | compile error | Add `limit N` | +| Adding non-nullable property without backfill | unsupported migration | Make optional → backfill → tighten in follow-up apply | +| `omnigraph init --json` | `unexpected argument --json` | `init` doesn't support `--json`; drop the flag | +| `omnigraph init` on an already-initialized URI | `AlreadyInitialized` error (v0.6.0+) | `--force` to re-init (skips the schema preflight; does **not** purge data) | +| `schema apply` dropping a property/type | soft-dropped or rejected (no data loss) | add `--allow-data-loss` to actually drop the column | +| Committing `.env.omni` | credential leak | Add `.env*` to `.gitignore` | +| Non-parameterized query values | typecheck surprise, injection risk | Declare `$param: Type` and pass via `--params` | +| Missing required field in `insert` | `T12: insert for 'X' must provide non-nullable property 'Y'` | Accept the param in the mutation signature | +| Long-lived feature branches | merge conflicts, schema apply blocked | Merge promptly; delete when done | +| `mutation { ... }` wrapper in `.gq` | `parse error: expected query_file` at line 1 | Use `query (...) { insert T { ... } }`; there is no top-level `mutation` keyword | +| `--config` placed before subcommand | `unexpected argument --config` | Put `--config` **after** the subcommand (e.g. 
`omnigraph schema show --config X`) | +| Reading a large schema via stdout-capped tool | Truncated, garbled, or duplicated output | `omnigraph schema show > /tmp/schema.pg` first; then read the file with offset/limit | +| `omnigraph load` without `--mode` | error: `--mode` is required | Pass `--mode merge\|append\|overwrite` — there is no default (overwrite is destructive, so it is never implicit). `load` works against local and remote URIs | +| Blind retry after 504 | Duplicate Signal/Decision/Claim (append-only types lack `@key` dedup) | `commit list --branch main --json` first; head advanced means it landed; only retry if unchanged | +| `sync_branch()` mentioned in version-drift error | Searching for nonexistent CLI command | Server-internal directive in error text; just retry — the next call re-pins to the new head | +| Stale empty branches at `main`'s head | 504-orphaned forks from a timed-out `load --from`; eventually block writes | List branches, find ones at `main`'s `graph_commit_id`, `omnigraph branch delete --config X ` | +| `omnigraph schema apply` / `init` on a cluster-managed graph | refused — bypasses the cluster ledger | Evolve cluster graphs via `omnigraph cluster apply --config .`; `schema apply`/`init` are for a non-cluster store | +| `omnigraph optimize` against a table with a `Blob` property | table is **skipped**, not failed (Lance blob-v2 compaction bug) | Expected — `--json` reports it under `skipped`; non-blob tables still compact | +| `@unique` on a `[List]`/`Blob` column | `load` now errors loudly (was silently un-enforced before #160) | Use `@unique` only on scalar columns (and composite `@unique(a, b)`, now keyed as a true tuple) — uniqueness needs a type that reduces to a scalar key | + +## Deep Dives + +- `references/cluster.md` — cluster-mode declarative deployments: cluster.yaml, the validate/import/plan/apply loop, approval-gated deletes, `--cluster` serving, the two-file contract, recovery + +For anything beyond the basics, load 
the relevant reference file. Each is self-contained — load only what you need. + +| Reference | When to load | +|-----------|--------------| +| [`references/schema.md`](references/schema.md) | Editing `.pg` files, running `schema plan`/`apply`, renaming types, backfilling required fields | +| [`references/queries.md`](references/queries.md) | Writing or linting `.gq` files, search functions, aggregations, multi-hop patterns | +| [`references/data.md`](references/data.md) | Choosing between `mutate` and `load` (required `--mode`, `--from` to fork a review branch); branch review workflow; destructive ops | +| [`references/remote-ops.md`](references/remote-ops.md) | Operating against a remote/CloudFront-fronted graph: 504 verification ritual, version drift, fork-branch 504 fingerprints, append-only retry safety, operator `--server`/`login` targeting | +| [`references/search.md`](references/search.md) | Embeddings, `@embed`, vector/text ranking, scope-then-rank pattern | +| [`references/aliases.md`](references/aliases.md) | Defining aliases for agents, structured output, JSON args | +| [`references/stored-queries.md`](references/stored-queries.md) | Server-side stored-query registry: declared in `cluster.yaml`, `omnigraph queries validate/list`, `GET /graphs/{id}/queries` + `POST /graphs/{id}/queries/{name}`, `invoke_query` Cedar gating | +| [`references/server-policy.md`](references/server-policy.md) | Starting the HTTP server, routes, bearer auth, Cedar policy gating, multi-graph mode | +| [`references/commands.md`](references/commands.md) | `snapshot`, `export`, `commit list/show`, addressing & resolution | +| [`references/migrations.md`](references/migrations.md) | Migrating a pre-0.7.0 setup, or you hit an old config/command/flag/route/error and need its current form | diff --git a/skills/omnigraph/references/aliases.md b/skills/omnigraph/references/aliases.md new file mode 100644 index 0000000..85dba93 --- /dev/null +++ b/skills/omnigraph/references/aliases.md @@ 
-0,0 +1,141 @@ +# Aliases & Agent Automation + +## Contents +- What an alias is +- Operator alias schema +- Args binding & JSON-first parsing +- Default to structured output +- Alias naming convention +- Secrets don't belong in aliases +- Example alias set +- Invocation patterns + +How to wire Omnigraph operations for agents and scripts. + +## What an alias is + +An **operator alias** decouples a stable **operation name** from its implementation, so an agent calling `omnigraph alias signal …` keeps working as the query evolves. Aliases live in `~/.omnigraph/config.yaml` and are personal *bindings* to a **stored query on a named server** — they carry no query content; the stored query in the cluster catalog is the team's contract. + +```yaml +# ~/.omnigraph/config.yaml +aliases: + triage: + server: intel-dev # an entry under servers: + graph: spike # optional (multi-graph servers) + query: weekly_triage # the STORED query's name — never a file + args: [since] # positional args → params, in order + params: { limit: 20 } # fixed defaults; positionals/--params win + format: table +``` + +```bash +omnigraph alias triage 2026-06-01 +# → POST /graphs/spike/queries/weekly_triage with the keyed credential +``` + +> **Alias vs stored query.** The alias is *yours* (a personal name + defaults); the **stored query** it points at is the *team's* — declared in `cluster.yaml`, type-checked and served by the cluster (`GET /graphs//queries`, `POST /graphs//queries/`, gated by `invoke_query`). See [`stored-queries.md`](stored-queries.md). 
+## Operator Alias Schema + +```yaml +aliases: + : + server: # an entry under servers: in ~/.omnigraph/config.yaml + graph: # optional: for multi-graph servers + query: # the stored query's NAME (never a file path) + args: [, ] # positional CLI args → named params, in order + params: { : } # fixed default params; positionals / --params win + format: table|kv|csv|jsonl|json # optional: output format +``` + +Dispatch with `omnigraph alias [args]` — one subcommand for read **and** write stored queries (a mutation alias is double-gated by `invoke_query` + `change`). Aliases live in their own namespace, so one can never shadow or be shadowed by a built-in verb. + +### `args` bind to query parameters + +If `args: [slug, name, age]`, then: + +```bash +omnigraph alias foo sig-bar "Some Name" 29 +``` + +...maps to `{"slug":"sig-bar","name":"Some Name","age":29}`. + +### Args are JSON-first + +Each arg is parsed as JSON first, then falls back to string: +- `29` → integer +- `"29"` → string +- `true` → boolean +- `Alice` → string (JSON parse fails, falls back) +- `{"x":1}` → object + +Explicit `--params '{...}'` wins on key conflict. + +## Default to Structured Output + +For scripts and agents, prefer `jsonl` or `json`; `table` is for humans. Set a default in `~/.omnigraph/config.yaml`: + +```yaml +defaults: + output: jsonl +``` + +Or per-alias (`format: jsonl`), or per-call (`--format jsonl`). 
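
The argument-binding rules described under `args` (JSON-first parsing, string fallback, explicit `--params` winning on key conflict) can be sketched as below. This is a hedged Python illustration of the documented behavior, not the CLI's actual implementation: `parse_arg` and `bind_alias_args` are hypothetical names, and the handling of surplus positionals is not specified in this reference, so the sketch simply ignores them.

```python
import json


def parse_arg(raw):
    """Parse one CLI argument as JSON, falling back to the raw string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return raw


def bind_alias_args(arg_names, positionals, defaults=None, explicit=None):
    """Merge params in increasing precedence: defaults, positionals, --params."""
    params = dict(defaults or {})
    # zip stops at the shorter sequence, so extra positionals are ignored here.
    for name, raw in zip(arg_names, positionals):
        params[name] = parse_arg(raw)
    params.update(explicit or {})
    return params


bound = bind_alias_args(["slug", "name", "age"], ["sig-bar", "Some Name", "29"])
overridden = bind_alias_args(["slug"], ["sig-bar"], explicit={"slug": "sig-override"})
```

Here `bound` reproduces the `{"slug":"sig-bar","name":"Some Name","age":29}` mapping from the example above: the first two arguments fail JSON parsing and stay strings, while `29` parses as an integer.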
+ +### When to use which + +- **`jsonl`** — one JSON object per line, first line is metadata; streams; ideal for agents +- **`json`** — pretty-printed JSON array; smaller results; human-readable +- **`kv`** — `key: value` per line; good for single-row lookups +- **`csv`** — for spreadsheets or line-count-heavy analysis +- **`table`** — default human view; don't use in automation + +## Alias Naming Convention + +Short, hyphenated, matches the conceptual operation: + +- `signal`, `pattern`, `element` — single lookup (typical pair with `format: kv`) +- `signals`, `patterns`, `elements` — list +- `signal-patterns`, `pattern-signals` — traversals +- `add-signal`, `link-forms-pattern` — mutations + +## Secrets Don't Belong in Aliases + +Credentials never live in an alias or any config file. For remote servers, `omnigraph login ` stores the bearer token in `~/.omnigraph/credentials` (`0600`); for S3-backed storage, AWS creds go in `.env.omni`. Aliases should only contain query names and parameter bindings — never tokens, passwords, or API keys. 
+ +## Example Alias Set + +```yaml +# ~/.omnigraph/config.yaml +servers: + intel-dev: { url: https://graph.example.com } +aliases: + # Lookups (kv format for single-row readability) + signal: { server: intel-dev, graph: spike, query: get_signal, args: [slug], format: kv } + pattern: { server: intel-dev, graph: spike, query: get_pattern, args: [slug], format: kv } + # Lists + signals: { server: intel-dev, graph: spike, query: recent_signals } + # Traversals + pattern-signals: { server: intel-dev, graph: spike, query: pattern_signals, args: [slug] } + # Mutations (stored mutation; invoke_query + change) + add-signal: { server: intel-dev, graph: spike, query: add_signal, args: [slug, name, brief, stagingTimestamp, createdAt, updatedAt] } + link-forms-pattern: { server: intel-dev, graph: spike, query: link_signal_forms_pattern, args: [signal, pattern] } +``` + +Each `query:` names a stored query the cluster serves — declare them in `cluster.yaml` and `cluster apply` first (see [`stored-queries.md`](stored-queries.md)). + +## Invocation Patterns + +```bash +# Invoke an alias (read or write — the bound stored query decides) +omnigraph alias signal sig-kimi-k25 +omnigraph alias add-signal sig-new "Name" "Brief" \ + 2026-04-14T00:00:00Z 2026-04-14T00:00:00Z 2026-04-14T00:00:00Z + +# Override output format +omnigraph alias signals --format jsonl + +# Explicit --params (wins over positional args on key conflict) +omnigraph alias signal --params '{"slug":"sig-override"}' +``` + +The `alias` subcommand carries `--params`/`--params-file`, `--format`/`--json`, and `--config`; the server, graph, and stored-query name come from the binding. For a different server/graph or a branch read, call `query`/`mutate` directly. 
diff --git a/skills/omnigraph/references/cluster.md b/skills/omnigraph/references/cluster.md new file mode 100644 index 0000000..3e9f6e9 --- /dev/null +++ b/skills/omnigraph/references/cluster.md @@ -0,0 +1,128 @@ +# Cluster Mode — Declarative Deployments + +## Contents +- The model +- The loop (validate → import → plan → apply → serve) +- The config contract (`cluster.yaml` vs `~/.omnigraph/config.yaml`) +- Serving (`--cluster`, config-free bucket boot) +- Recovery cheat-sheet + +The cluster control plane (omnigraph >= 0.7.0) manages a whole deployment — +graphs, schemas, stored queries, Cedar policies — as **declared files in one +directory**, converged Terraform-style. It is the **only way to serve** a +graph (the server is cluster-only); the data-plane operations in the other +references work against the cluster's graphs unchanged. + +## The model + +``` +company-brain/ +├── cluster.yaml # the deployment: graphs, schemas, queries, policies +├── schema.pg +├── queries/*.gq +├── *.policy.yaml +├── graphs/.omni # DERIVED — created by apply, never by hand (gitignore) +└── __cluster/ # ledger + catalog + approvals — local state (gitignore) +``` + +```yaml +# cluster.yaml +version: 1 +# storage: s3://my-bucket/clusters/company-brain # optional — put ledger, +# catalog, and graph roots on S3 object storage (default: this folder) +state: { backend: cluster, lock: true } +graphs: + knowledge: + schema: schema.pg + queries: queries/ # the .gq files ARE the declaration — every `query ` registers +policies: + base: { file: base.policy.yaml, applies_to: [knowledge] } # or [cluster] for server-level +``` + +`queries` also accepts a file list (`[a.gq, b.gq]`) or a fine-grained +`name: { file: ... }` map. Discovery is loud: unparseable files and duplicate +names across files fail validation. + +## The loop (memorize this) + +```bash +omnigraph cluster validate --config . # parse + typecheck everything +omnigraph cluster import --config . 
# one-time: create the state ledger +omnigraph cluster plan --config . # preview — REQUIRED reading before apply +omnigraph cluster apply --config . --as # converge (idempotent) +omnigraph-server --cluster . --bind 127.0.0.1:8080 --unauthenticated # serve (local dev) +``` + +- **`apply` creates graphs** at `graphs/.omni` — there is no separate + `omnigraph init` in cluster mode. +- **Schema changes**: edit the `.pg`, `plan` shows the engine's real migration + steps (`add_property`, `drop_property [soft]`, `unsupported: …`), `apply` + migrates the live graph. **Soft drops only** — data-loss migrations are not + reachable from cluster apply (prior versions retain dropped columns). +- **Applied = serving on the next server restart.** No hot reload. +- **`storage: s3://bucket/prefix`** (optional) puts the entire cluster — state + ledger, lock, content-addressed catalog, recovery sidecars, approval + artifacts, and the derived graph roots (`/graphs/.omni`) — on + S3-compatible object storage. The ledger CAS uses S3 conditional writes and + the lock becomes genuinely cross-machine. Absent, everything defaults to the + config directory (byte-compatible with pre-existing clusters). Credentials + come from the standard `AWS_*` env contract, never `cluster.yaml`. +- **`--as ` attributes every run** (sidecars, audit, engine commits). + Defaults from your operator config's `operator.actor`; required for `approve`. +- **Destructive changes are gated**: removing a graph from `cluster.yaml` + blocks with `approval_required` until + `omnigraph cluster approve graph. --config . --as ` records a + digest-bound approval. Any config/state drift after approving invalidates it. +- **Drift**: `cluster refresh` re-observes live graphs and marks out-of-band + changes `drifted`; the next `apply` converges them back to the declaration. +- **Data is NOT cluster's job**: rows flow through `omnigraph load / mutate` + against the derived roots, with branches as usual. 
+ +## The config contract (do not blur this) + +| File | Owns | Read by | +|---|---|---| +| `cluster.yaml` | the deployment: graph set, schemas, stored queries, policy bindings, storage | `cluster` commands; the `--cluster` server | +| `~/.omnigraph/config.yaml` | per-operator: identity (`operator.actor`), named `servers:`, output defaults, personal aliases | data-plane CLI commands (tokens live in `~/.omnigraph/credentials` via `omnigraph login`) | + +Cluster commands read the operator config for **exactly one thing**: the actor +default when `--as` is omitted (`--as` > `operator.actor`). A `--cluster` server +reads it for **nothing** — boot from cluster state XOR the operator file, never +a merge. +Address a cluster-managed graph's data directly with `--store /graphs/.omni`, +or via `--server`/aliases against a serving instance — that is ergonomics, not +coupling. + +## Serving + +`omnigraph-server --cluster ` is exclusive (cannot combine with a URI, +`--target`, or `--config`), always multi-graph (`/graphs/{id}/...`), and +fail-fast: missing/pending/tampered state refuses boot with a remedy. Every +declared query is exposed (`GET /graphs//queries`, `POST +/graphs//queries/`); Cedar bundles attach via `applies_to` +(`cluster` → server-level gate incl. `graph_list`; `graph.` → that +graph's gate incl. `invoke_query`). Bearer tokens and bind stay process-level +(env/flags). + +**Config-free serving.** `--cluster` also accepts the storage-root URI +directly — `omnigraph-server --cluster s3://bucket/prefix` boots from the +applied revision on the bucket with **no checkout of the config repo**. The +ledger and catalog on the bucket are the whole deployment artifact; policy +bundles serve as digest-verified content from the catalog. The preferred +container shape is **bucket, no volume** (AWS ECS / Railway recipes in the +omnigraph repo's `docs/user/deployment.md`). 
For a mounted config directory +instead, `OMNIGRAPH_CLUSTER=` works and the image ships the CLI for +in-container `cluster apply`. + +## Recovery cheat-sheet + +| Symptom | Fix | +|---|---| +| Apply crashed mid-run | run `cluster apply` again — sidecars + sweep reconcile | +| Held lock | `cluster status` (shows lock id) → `cluster force-unlock --config .` | +| Lost/corrupt `state.json` | `cluster import` rebuilds from config + live graphs, then `apply` | +| Server refuses to boot | the error names its remedy (usually `cluster refresh` + `apply`, restart) | +| `approval_stale` warning | re-run `cluster approve` — the plan changed since you approved | + +Full reference: the omnigraph repo's `docs/user/clusters/index.md` (operator guide) +and `docs/user/clusters/config.md` (every key, flag, and diagnostic). diff --git a/skills/omnigraph/references/commands.md b/skills/omnigraph/references/commands.md new file mode 100644 index 0000000..a76844b --- /dev/null +++ b/skills/omnigraph/references/commands.md @@ -0,0 +1,237 @@ +# Reference Commands + +## Contents +- Inspect state (snapshot, export) +- Branches · commits · graphs +- Schema · lint · embed · init +- Load (bulk JSONL) +- Query / mutate +- Maintenance (optimize, cleanup) +- Stored queries +- Operator config & credentials +- Config resolution order +- Output formats · health check +- Cluster control plane + +Commands you'll reach for but don't need best-practice rules around. Quick syntax reference. + +## Inspect State + +### `snapshot` — tables + row counts + +```bash +omnigraph snapshot $REPO --branch main --json +``` + +Returns the manifest: all node/edge tables with row counts and versions. Use this to verify a load succeeded or to see what types exist. + +### `export` — full JSONL dump + +```bash +omnigraph export $REPO --branch main > graph.jsonl +``` + +Streams all nodes and edges as JSONL. The right tool for large-snapshot inspection. Don't try to page through the whole graph with read queries. 
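
When an export is too large to eyeball, a short script can summarize it. The sketch below counts records per node type and per edge type. It assumes export lines use the same `{"type": ...}` / `{"edge": ...}` shape as the `load` JSONL format, which this page does not confirm, and `summarize_export` is a hypothetical helper name.

```python
import collections
import json
from typing import Dict, Iterable


def summarize_export(lines: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Count node records by `type` and edge records by `edge` in a JSONL stream."""
    nodes: collections.Counter = collections.Counter()
    edges: collections.Counter = collections.Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        if "type" in record:
            nodes[record["type"]] += 1
        elif "edge" in record:
            edges[record["edge"]] += 1
        # records with neither key are ignored (assumed not to be graph rows)
    return {"nodes": dict(nodes), "edges": dict(edges)}


# Invented sample rows, for illustration only.
sample = [
    '{"type": "Signal", "data": {"slug": "sig-foo"}}',
    '{"type": "Signal", "data": {"slug": "sig-bar"}}',
    '{"edge": "FormsPattern", "from": "sig-foo", "to": "pat-bar", "data": {}}',
]
print(summarize_export(sample))
```

With a real dump, pass the open file instead of a list: `with open("graph.jsonl") as f: summarize_export(f)`.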
+ +Filter by type: + +```bash +omnigraph export $REPO --branch main --type Signal > signals.jsonl +``` + +## Branches + +```bash +omnigraph branch create --from main --store $REPO +omnigraph branch list --store $REPO +omnigraph branch merge --into main --store $REPO +omnigraph branch delete --store $REPO +``` + +All support `--json`. + +## Commits (History) + +```bash +omnigraph commit list $REPO --branch main +omnigraph commit show $REPO +``` + +Inspect graph history. Useful for "what changed between these two points" investigation. + +## Graphs (multi-graph servers) + +```bash +omnigraph graphs list --config X --json +``` + +Lists the graphs a multi-graph server serves. Remote servers only (rejects local URIs); the server must expose `GET /graphs` via `server.policy.file`. See `references/server-policy.md`. + +## Schema + +```bash +omnigraph schema plan --schema next.pg $REPO --json +omnigraph schema apply --schema next.pg $REPO +``` + +See `references/schema.md` for the full workflow. + +## Lint + +```bash +omnigraph lint --schema schema.pg --query queries/foo.gq --json +# or against a live repo: +omnigraph lint --query queries/foo.gq $REPO --json +``` + +`lint` is the single query-validation command. See `references/queries.md`. + +## Embed + +```bash +omnigraph embed --seed embed-config.yaml # fill missing +omnigraph embed --seed embed-config.yaml --reembed_all # regenerate all +omnigraph embed --seed embed-config.yaml --clean # delete +omnigraph embed --seed embed-config.yaml --select "Type:field=value" +``` + +See `references/search.md`. + +## Init + +```bash +omnigraph init --schema schema.pg $REPO +``` + +Creates a new graph at `$REPO` with the given schema. Declare the deployment in a `cluster.yaml` (see `references/cluster.md`). + +**Strict by default (v0.6.0+):** `init` against a URI that already holds schema files errors with `AlreadyInitialized` instead of silently overwriting. Use `omnigraph init --force` to re-init deliberately. 
`--force` only skips the schema-file preflight — it does **not** purge existing Lance datasets. + +**Note:** `init` does not accept `--json`. Drop the flag if you see `unexpected argument --json`. + +## Load (bulk JSONL) + +```bash +# bare load: operates on an existing branch (default main); --mode is required +omnigraph load --data seed.jsonl --mode merge $REPO + +# --from forks a missing branch from , then loads onto it (one-shot review branch) +omnigraph load --data delta.jsonl --branch feature-x --from main --mode merge $REPO +``` + +`--mode` is **required** (no default): `merge`, `append`, or `overwrite`. `load` works against local **and** remote URIs. See `references/data.md`. + +## Query / Mutate + +```bash +omnigraph query get_signal --query queries/signals.gq --params '{"slug":"sig-foo"}' # ad-hoc file; is positional +omnigraph query get_signal --server intel-dev --params '{"slug":"sig-foo"}' # served stored query by name +omnigraph mutate add_signal --query queries/mutations.gq --params '{"slug":"sig-foo",...}' +``` + +With aliases: + +```bash +omnigraph alias signal sig-foo +omnigraph alias add-signal sig-foo "Name" "Brief" 2026-04-14T00:00:00Z 2026-04-14T00:00:00Z 2026-04-14T00:00:00Z +``` + +> `query` and `mutate` also accept inline source via `-e/--query-string ''` instead of `--query `. + +## Maintenance: Optimize & Cleanup (v0.6.1) + +### `optimize` — non-destructive Lance compaction + +```bash +omnigraph optimize $REPO --json +``` + +Compacts fragments and reclaims deleted-row space. Non-destructive — safe to run any time. **Skips tables with a `Blob` property** (Lance blob-v2 compaction decode bug); skipped tables are reported in the `skipped` field of `--json` output and in logs. Non-blob tables compact normally. Blob-table fragment count won't shrink until the upstream Lance fix lands — reads/writes are unaffected. 
+ +### `cleanup` — destructive version GC + +```bash +omnigraph cleanup $REPO --keep 5 --older-than 7d --confirm +``` + +Garbage-collects old table versions, dropping time-travel reachability for anything pruned. **Destructive** — requires `--confirm`. Duration units for `--older-than`: `s`, `m`, `h`, `d`, `w`. Also reconciles orphaned per-table forks left by an interrupted `branch delete`. + +## Stored Queries (v0.6.1) + +```bash +omnigraph queries validate # type-check the stored-query registry vs the live schema (offline; exits non-zero on drift) +omnigraph queries list # list registry query names, MCP exposure, and typed params +``` + +`validate` opens the addressed graph and type-checks every applied stored query against the live schema — catches drift without restarting the server. `list` prints that graph's registry. Address the graph with `--store ` or a positional URI. Distinct from `lint` (which validates a single `.gq` file). See `references/stored-queries.md`. + +## Operator Config & Credentials + +```bash +echo "$TOKEN" | omnigraph login # store a bearer token in ~/.omnigraph/credentials (0600) +omnigraph logout # remove it (idempotent) +``` + +The operator config and `~/.omnigraph/credentials` are **auto-discovered — there is no flag to point at them.** `$OMNIGRAPH_HOME` relocates the `~/.omnigraph` *directory* (mainly for test isolation), and an absent file is just an empty layer (zero-config). Separately, `$OMNIGRAPH_CONFIG` stands in for the `--config` flag — which targets the **cluster directory / server config**, never the operator config. See SKILL.md → *The two config surfaces*. + +## Addressing a Graph + +How the CLI resolves which graph a data command (`query`, `mutate`, `load`, `branch`, …) runs against. A remote is addressed with `--server` (a bare `http(s)://` URL is not a graph address). + +Precedence (highest first): + +1. 
**`--store `** or a **positional `file://`/`s3://` URI** — direct storage access (bypasses any server; no catalog, so stored-query *names* don't resolve). `--store` is exclusive with a positional URI and with `--server`. +2. **`--server `** (+ `--graph ` for a multi-graph server) — served/remote. A name resolves from `servers:` in `~/.omnigraph/config.yaml`; a literal `http(s)://` URL also works. +3. **`--profile `** (or `$OMNIGRAPH_PROFILE`) — a named scope bundle from `profiles:` in the operator config (binds one of server/cluster/store + a default graph). +4. **Operator defaults** — `defaults.server` + `defaults.default_graph`, or `defaults.store` for a zero-flag local scope (mutually exclusive with `defaults.server`). + +Control-plane commands use `--config ` (cluster); maintenance against a cluster-managed graph uses `--cluster --graph `. Each command declares a **capability** — `any` / `served` / `direct` / `control` / `local` — shown in `omnigraph --help`; mis-addressing (e.g. `--server` on a `direct` verb, or a remote URI to `optimize`) fails loudly. + +For query source (`query`/`mutate`): + +1. **`--query `** or **`-e/--query-string ''`** — exactly one (operator aliases are invoked via the separate `alias` subcommand) +2. Relative `--query` paths resolve through **`query.roots`** in config + +For params: + +1. **Explicit `--params '{...}'`** wins on key conflict +2. **Positional alias args** map to alias `args` list + +## Output Formats + +`--format ` on query/mutate: + +- `table` (default) — human-readable +- `kv` — `key: value` per line; good for single rows +- `csv` — comma-separated +- `jsonl` — NDJSON, one per line, with metadata line first +- `json` — pretty JSON array + +For admin commands (branch, commit, schema, policy): use `--json` for structured output, otherwise human text. + +## Health Check + +```bash +curl http://127.0.0.1:8080/healthz +``` + +Returns `200 OK` if the server is up. 
+ +## Cluster Control Plane (omnigraph >= 0.7.0) + +```bash +omnigraph cluster validate --config # parse + typecheck the declaration +omnigraph cluster import --config # one-time: create the state ledger +omnigraph cluster plan --config [--json] # preview (schema changes show migration steps) +omnigraph cluster apply --config --as # converge; idempotent +omnigraph cluster approve --config --as # gate destructive changes (graph deletes) +omnigraph cluster status --config [--json] # read the ledger (read-only) +omnigraph cluster refresh --config # re-observe live graphs; flags drift +omnigraph cluster force-unlock --config # clear a crashed run's lock (exact id from status) +``` + +Topology rule: `omnigraph schema apply` and `omnigraph init` **refuse a +cluster-managed graph** — in a cluster their jobs belong to `cluster apply`. +Data commands (`load`, `mutate`, branches) work either way — point them at the +derived root (`/graphs/.omni`, or `/graphs/.omni` for an +S3-backed cluster). See `references/cluster.md`. diff --git a/skills/omnigraph/references/data.md b/skills/omnigraph/references/data.md new file mode 100644 index 0000000..f553270 --- /dev/null +++ b/skills/omnigraph/references/data.md @@ -0,0 +1,175 @@ +# Data Changes & Branches + +## Contents +- Choose the right write command +- `mutate` — single edits +- `load` — bulk JSONL (`--mode`, `--from`) +- Branches: review before merge +- Destructive ops go through a branch +- Branch commands +- Inspecting state after changes + +How to modify data safely in Omnigraph. + +## Choose the Right Write Command + +`load` is the one bulk-JSONL command — local **or** remote, against any +existing branch, with a **required** `--mode`. `mutate` is for single typed +edits. 
+ +| Task | Command | Why | +|------|---------|-----| +| Add/update a single entity | `mutate` with a named mutation | typechecked, parameterized, auditable | +| Bulk upsert by `@key` | `load --mode merge` | preserves rows not in the file | +| Additive-only bulk | `load --mode append` | fails on key collision | +| Clean-slate reseed | `load --mode overwrite` | **destructive** — wipes the branch | +| Bulk load onto a fresh review branch | `load --from main --mode merge --branch ` | forks `` from `main`, loads onto it, leaves it for review | + +> **`--mode` is required** — there is no default. Overwrite is destructive, so +> the CLI never picks a mode for you. +> +> **Local and remote are one command.** `load` works against a local repo URI +> (writing storage directly) *and* a remote `omnigraph-server` endpoint (the +> server orchestrates the write and publishes one atomic commit). See +> [`references/remote-ops.md`](remote-ops.md) for remote-specific concerns +> (504 handling, write-verification ritual). + +## `mutate` — Single Edits + +Goes through the running server (the configured default graph, or an alias): + +```bash +omnigraph mutate add_signal \ + --query mutations.gq \ + --params '{"slug":"sig-foo","name":"Foo","brief":"...","stagingTimestamp":"2026-04-14T00:00:00Z","createdAt":"2026-04-14T00:00:00Z","updatedAt":"2026-04-14T00:00:00Z"}' +``` + +Or via an alias: + +```bash +omnigraph alias add-signal sig-foo "Foo" "..." 2026-04-14T00:00:00Z 2026-04-14T00:00:00Z 2026-04-14T00:00:00Z +``` + +Prefer `mutate` for interactive edits, mutations called from agents, and anything you want typechecked at call time. 
+ +## `load` — Bulk JSONL + +JSONL format: + +```jsonl +{"type":"Signal","data":{"id":"sig-foo","slug":"sig-foo","name":"Foo","brief":"...","stagingTimestamp":"2026-04-14T00:00:00Z","createdAt":"2026-04-14T00:00:00Z","updatedAt":"2026-04-14T00:00:00Z"}} +{"edge":"FormsPattern","from":"sig-foo","to":"pat-bar","data":{}} +``` + +- Nodes: `{"type":"","data":{...props...}}` — `id` equals `slug` +- Edges: `{"edge":"","from":"","to":"","data":{...edge_props...}}` + +Load command: + +```bash +omnigraph load --data seed.jsonl --mode merge s3://my-bucket/repos/spike-intel +``` + +`--from ` forks a missing `--branch` from `` before loading (the +one-shot review-branch flow below). Without `--from`, the target `--branch` +(default `main`) must already exist. + +### `--mode` semantics + +- **`overwrite`** (destructive) — replaces every node/edge table on the branch with the file's contents. **Staged**: the loader validates node/edge constraints, referential integrity, and edge cardinality *before* any data moves, so a bad file fails before touching the branch. Safe on a **first** load; risky afterward. Don't run it against `main` in production without a branch backup path. +- **`merge`** (upsert) — for each row, insert if `@key` is new, update if it exists. Rows not in the file are preserved. The safe default for incremental bulk updates. +- **`append`** (strict insert) — fails on key collision. Use when you're certain every row is new. + +### `merge` does NOT recompute embeddings + +If you change seed rows that feed into `@embed("source")` via `load --mode merge`, the source field updates but the embedding stays stale. + +**Fix:** run `omnigraph embed --reembed_all` after, or use `load --mode overwrite` once (which re-triggers embedding on load). + +### `overwrite` is destructive + +Wipes the entire branch's data for every node and edge type. 
Use only for: +- First-time seed +- Intentional full reseed on a feature branch +- Recovery scenarios + +Never on `main` without a branch backup. + +## Branches: Review Before Merge + +Branches exist for **data review**, not schema changes. Schema goes straight to `main` via `plan` + `apply`. + +### The review loop + +```bash +REPO=s3://my-bucket/repos/spike-intel + +# 1. Create feature branch from main +omnigraph branch create --from main staging-2026-04-14 --store $REPO + +# 2. Load delta onto the branch (merge mode is typical for review) +omnigraph load --data delta.jsonl --branch staging-2026-04-14 --mode merge $REPO + +# 3. Verify on the branch (reads can target --branch or --snapshot) +omnigraph query recent_signals --query queries/signals.gq --branch staging-2026-04-14 --store $REPO + +# 4. Merge to main when happy +omnigraph branch merge staging-2026-04-14 --into main --store $REPO + +# 5. Optionally delete the branch +omnigraph branch delete staging-2026-04-14 --store $REPO +``` + +### Fork a branch in one shot with `--from` + +- Bare `load` operates on an existing branch (default `main`). +- `load --from main --branch ` forks `` from `main`, loads onto it, and leaves it for review — the whole review-branch flow in one command. + +Use `--from` for anything you want reviewed before it touches `main`. + +### Keep branches short-lived + +Long-lived branches compound merge risk. The usual flow is: create → load → verify → merge → delete, all in the same session. A week-old feature branch is a yellow flag. + +### Schema apply blocks non-main branches + +`omnigraph schema apply` rejects the request if any non-main branches exist. Merge or delete them first. This is enforced — it's not just a guideline. 
+ +## Destructive Ops Go Through a Branch + +For any bulk load that could disrupt downstream queries (overwriting a heavily-referenced node type, removing edges en masse, reseeding a core table), use a feature branch: + +```bash +omnigraph load --data risky.jsonl --branch recovery-2026-04-14 \ + --from main --mode overwrite $REPO +# inspect, diff, verify reads +omnigraph branch merge recovery-2026-04-14 --into main --store $REPO +``` + +## Branch Commands (quick reference) + +```bash +omnigraph branch create --from main --store $REPO +omnigraph branch list --store $REPO +omnigraph branch merge --into main --store $REPO +omnigraph branch delete --store $REPO +``` + +All support `--json` for automation-friendly output. Address the graph with a +positional `file://`/`s3://` URI (shown), `--store `, or `--server `. + +## Inspecting State After Changes + +```bash +omnigraph snapshot $REPO --branch main --json # tables + row counts +omnigraph export $REPO --branch main > graph.jsonl # full JSONL dump +omnigraph commit list $REPO --branch main --json # history +``` + +`export` is the right tool for large-snapshot inspection — don't try to page through the whole graph with read queries. + +> **Cluster note:** everything in this file applies unchanged in cluster +> deployments — the control plane owns schema/queries/policies; rows, loads, +> and branches stay on the data plane against the derived graph roots +> (`/graphs/.omni`, or `/graphs/.omni` for an S3-backed +> cluster). diff --git a/skills/omnigraph/references/migrations.md b/skills/omnigraph/references/migrations.md new file mode 100644 index 0000000..9aca605 --- /dev/null +++ b/skills/omnigraph/references/migrations.md @@ -0,0 +1,65 @@ +# Migration & Deprecations (pre-0.7.0 → 0.7.0) + +The rest of this skill teaches the **current 0.7.0 surface only**. Consult this page solely when you meet an old config file, command, flag, route, or error and need its current form. 
Pre-0.7.0 spellings keep working as deprecated aliases (they print a warning) unless marked **removed**. + +## Config files + +| Before (pre-0.7.0) | Now (0.7.0) | +|---|---| +| `omnigraph.yaml` (one combined file) | **`cluster.yaml`** (team deployment) + **`~/.omnigraph/config.yaml`** (operator) | +| `cli.actor` | `operator.actor` | +| `cli.graph` / `server.graph` | `defaults.default_graph` (+ `defaults.server`) | +| `targets:` / `target:` | `graphs:` / `graph:` | +| `omnigraph init` scaffolds `omnigraph.yaml` | `init` scaffolds nothing — start a `cluster.yaml` from [`cluster.md`](cluster.md) | + +- **`omnigraph.yaml` is fully removed in 0.7.0** — no CLI command or server reads it, and there is **no `config migrate`**. Move team settings to `cluster.yaml` and personal settings (identity, `servers:`, `defaults:`, `aliases:`) to `~/.omnigraph/config.yaml` by hand. + +## CLI addressing (RFC-011) + +| Before | Now | +|---|---| +| `--target ` | **removed** — use `--server `, `--store `, or `--profile ` (SKILL.md → *Addressing a graph*) | +| positional `http(s)://` URL → a server | **removed** — address a remote with `--server ` | +| `--as` on a served (remote) write | no-op — the server resolves the actor from the bearer token (`--as` applies to direct `--store` writes) | +| `--cluster-graph ` | **removed** — `--cluster ` is a global scope; pick the graph with `--graph `. `--graph` now selects within a `--server` *or* `--cluster` scope | +| `query`/`mutate` `--name ` + positional graph URI / `--uri` | **removed** — the query name is the **positional** (`omnigraph query `): a bare `` invokes a served stored query (kind-asserted), `--query`/`-e` is the ad-hoc lane. 
Address the graph via `--server`/`--store`/`--profile` (not a positional URI on query/mutate) | + +## Server boot & schema (RFC-011) + +| Before | Now | +|---|---| +| `omnigraph-server ` / `--config omnigraph.yaml` / `--target` / single-graph flat routes | **removed** — the server is **cluster-only**: `omnigraph-server --cluster `; all HTTP is nested under `/graphs//...` (flat routes → 404) | +| `omnigraph schema apply` on a cluster-managed graph | **refused** — evolve cluster graphs via `cluster apply` (the ledger). `schema apply` still works on a non-cluster store or via `--server` | +| `policy …` / `queries validate` via `--config omnigraph.yaml` | `policy validate\|test\|explain` reads `--cluster ` (+ `--graph`); `queries validate` takes the store URI | + +## CLI verbs + +| Before | Now | +|---|---| +| `omnigraph ingest …` | `omnigraph load --from main --mode merge …` | +| `omnigraph read` | `omnigraph query` | +| `omnigraph change` | `omnigraph mutate` | +| `omnigraph query lint` / `query check` | `omnigraph lint` | +| `omnigraph query --alias ` / `mutate --alias ` | `omnigraph alias ` (dedicated subcommand; the `--alias` flag was removed) | + +## HTTP routes + +| Before | Now | +|---|---| +| `POST /ingest` | `POST /load` | +| `POST /read` | `POST /query` | +| `POST /change` | `POST /mutate` | + +The old routes remain as **deprecated aliases** (retained indefinitely), carrying `Deprecation: true` + `Link: ` response headers. + +## Server token resolution + +| Before | Now | +|---|---| +| `graphs..bearer_token_env` in `omnigraph.yaml` | `omnigraph login ` → `~/.omnigraph/credentials`, or `OMNIGRAPH_TOKEN_` | + +The client bearer token now comes only from `OMNIGRAPH_TOKEN_` or the credentials file — the `omnigraph.yaml` `bearer_token_env` chain is gone with the file. + +## Older removals (still worth knowing) + +- The transactional **Run** state machine, its `/runs` routes, and the `run_publish` / `run_abort` Cedar actions were **removed in v0.4.0**. 
Writes publish directly — use `GET /commits` for history and the `change` action for write gating; `/runs` returns 404. diff --git a/skills/omnigraph/references/queries.md b/skills/omnigraph/references/queries.md new file mode 100644 index 0000000..f9f84e0 --- /dev/null +++ b/skills/omnigraph/references/queries.md @@ -0,0 +1,302 @@ +# Query Authoring & Linting + +## Contents +- File organization +- Linting +- Parameterization +- Query structure +- Search functions +- Aggregations +- Filter operators +- Mutations +- Naming convention +- Aliases over raw queries + +Writing `.gq` query files in Omnigraph. + +## File Organization + +- One `.gq` file per primary node type (`signals.gq`, `patterns.gq`, `elements.gq`) +- One `mutations.gq` file for all insert/update/delete queries +- Put query files in `queries/` — cluster mode discovers `queries/*.gq` automatically + +## Linting + +```bash +omnigraph lint --schema schema.pg --query queries/signals.gq +``` + +Or (lint against a live repo): + +```bash +omnigraph lint --query queries/signals.gq s3://bucket/repo +``` + +Lint returns: +- `"status": "ok"` — all queries passed +- `"errors": N` — count of type errors (exit 1 when nonzero) +- `"warnings": N` — count of drift warnings + +Run lint after every `.gq` or `.pg` edit. Wire into precommit. + +## Parameterization + +### Always declare typed parameters + +```gq +query get_signal($slug: String) { + match { $s: Signal { slug: $slug } } + return { $s.slug, $s.name } +} +``` + +Never string-interpolate values into query bodies. Pass them via `--params`: + +```bash +omnigraph query get_signal --query signals.gq --params '{"slug":"sig-foo"}' +``` + +The compiler typechecks parameter values against declared types. 
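
When a script or agent assembles the call, the same rule applies: build the `--params` JSON with a serializer and pass the arguments as a list, so no value is ever spliced into the query text or into a shell string. A minimal Python sketch, where `query_argv` is a hypothetical helper and the flags are the ones shown on this page:

```python
import json
from typing import Any, Dict, List


def query_argv(name: str, query_file: str, params: Dict[str, Any]) -> List[str]:
    """Build an argv list for `omnigraph query`; params travel as one JSON argument."""
    return [
        "omnigraph", "query", name,
        "--query", query_file,
        "--params", json.dumps(params),
    ]


# A value with quotes and shell metacharacters stays inert inside the JSON argument.
argv = query_argv("get_signal", "signals.gq", {"slug": 'sig-"foo"; echo hi'})
print(argv)

# To execute: subprocess.run(argv, check=True)
# No shell is involved, so the value is never re-parsed as a command.
```

The query body never changes between calls; only the JSON payload does, which is what lets the compiler typecheck each value against its declared parameter type.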
+ +> For one-off/ad-hoc execution, pass the query inline instead of a file with `-e/--query-string` (v0.6.0+): `omnigraph query -e 'query q($slug: String){ match { $s: Signal { slug: $slug } } return { $s.name } }' --params '{"slug":"sig-foo"}'` (and `omnigraph mutate -e '...'`). `-e` is mutually exclusive with `--query ` — exactly one of the two is required. (Operator aliases are invoked via the separate `omnigraph alias ` subcommand.) + +## Query Structure + +### Match → Return → Order → Limit + +```gq +query recent_signals() { + match { + $s: Signal + } + return { $s.slug, $s.name, $s.stagingTimestamp } + order { $s.stagingTimestamp desc } + limit 50 +} +``` + +### Edge traversal (lowerCamelCase) + +Schema edges are PascalCase; traversal uses lowerCamelCase: + +```gq +match { + $s: Signal { slug: $slug } + $s formsPattern $p // edge FormsPattern: Signal -> Pattern +} +``` + +### Multi-hop + +Chain traversal clauses: + +```gq +query friends_of_friends($name: String) { + match { + $p: Person { name: $name } + $p knows $mid + $mid knows $fof + } + return { $fof.name } +} +``` + +### Reverse traversal + +Flip the subject/object: + +```gq +query employees_of($company: String) { + match { + $c: Company { name: $company } + $p worksAt $c + } + return { $p.name } +} +``` + +### Negation + +```gq +query orphan_signals() { + match { + $s: Signal + not { $s formsPattern $_ } + } + return { $s.slug } +} +``` + +## Search Functions + +### Text search + +```gq +match { + $d: Doc + search($d.title, $q) // full-text on @index'd String +} +``` + +```gq +match { + $d: Doc + fuzzy($d.title, $q, 2) // fuzzy match, max 2 edits +} +``` + +```gq +match { + $d: Doc + match_text($d.body, $q) // phrase match +} +``` + +### Vector/ranking (require `limit`) + +```gq +query vector_search($q: Vector(3072)) { + match { $d: Doc } + return { $d.slug, $d.title } + order { nearest($d.embedding, $q) } + limit 10 +} +``` + +`nearest`, `bm25`, and `rrf` are ranking operators, not filters. 
Every query using them **must** end with `limit N` — omitting it is a compile error. + +### Hybrid (reciprocal rank fusion) + +```gq +query hybrid_search($vq: Vector(3072), $tq: String) { + match { $d: Doc } + return { $d.slug, $d.title } + order { rrf(nearest($d.embedding, $vq), bm25($d.title, $tq)) } + limit 10 +} +``` + +## Aggregations + +```gq +query friend_counts() { + match { + $p: Person + $p knows $f + } + return { + $p.name + count($f) as friends + } + order { friends desc } + limit 20 +} +``` + +Supported: `count`, `sum`, `avg`, `min`, `max`. Grouping is implicit on non-aggregated return fields. + +## Filter Operators + +`=`, `!=`, `>`, `<`, `>=`, `<=`, `contains` + +```gq +match { + $p: Person + $p.age > 30 + $p.name contains "Al" +} +``` + +## Mutations + +> **No top-level `mutation { ... }` wrapper.** Agents trained on GraphQL reflexively write `mutation { insert T { ... } }` — that fails the parser at character 1 with `parse error: expected query_file`. Every executable block in a `.gq` file is a named `query`; the body's verb (`insert` / `update` / `delete`) determines whether it's a write. Dispatch via `omnigraph mutate` (not `query`). + +### Insert + +```gq +query add_signal($slug: String, $name: String, $brief: String, + $stagingTimestamp: DateTime, $createdAt: DateTime, $updatedAt: DateTime) { + insert Signal { + slug: $slug, + name: $name, + brief: $brief, + stagingTimestamp: $stagingTimestamp, + createdAt: $createdAt, + updatedAt: $updatedAt + } +} +``` + +**Every non-nullable property must be provided.** Lint catches missing ones as: + +``` +error: T12: insert for 'Signal' must provide non-nullable property 'brief' +``` + +### Insert edge + +```gq +query link_signal_forms_pattern($signal: String, $pattern: String) { + insert FormsPattern { from: $signal, to: $pattern } +} +``` + +Edge `data` block is `{}` if the edge has no properties — just specify `from` and `to` slugs. 
+ +### Update + +```gq +query retitle_signal($slug: String, $new_title: String) { + update Signal set { name: $new_title } where slug = $slug +} +``` + +### Delete + +```gq +query remove_signal($slug: String) { + delete Signal where slug = $slug +} +``` + +### Multi-statement + +```gq +query add_and_link($slug: String, $pattern: String, $createdAt: DateTime, $updatedAt: DateTime) { + insert Signal { slug: $slug, name: $slug, brief: $slug, + stagingTimestamp: $createdAt, createdAt: $createdAt, updatedAt: $updatedAt } + insert FormsPattern { from: $slug, to: $pattern } +} +``` + +There's no `upsert` keyword at the query level — use `load --mode merge` for bulk upsert. + +> **Insert/update-only OR delete-only (the D₂ rule).** A single mutation query may contain inserts and updates, **or** deletes — never both. Mixing a `delete` with an `insert`/`update` in the same query is rejected at parse time. (Inserts/updates go through a staged two-phase publish; deletes inline-commit — omnigraph doesn't yet use Lance's two-phase delete API (it shipped in Lance 7.0.0 but isn't wired in) — so they can't share one atomic statement.) Split a delete-then-insert into two separate mutations. + +### Date and DateTime values + +Date format is asymmetric between `mutate` (parameter values) and `load` (JSONL): + +| Path | Date | DateTime | +|---|---|---| +| `mutate --params` | ISO string `"2026-04-29"` | ISO string `"2026-04-29T10:00:00Z"` | +| `load` JSONL | Integer days since epoch `20572` | ISO string `"2026-04-29T10:00:00Z"` | + +Compute integer days form for a given date `d`: + +```python +(d - datetime.date(1970, 1, 1)).days # d is the date you're loading, not today() +``` + +This asymmetry is one of the most common silent type errors when bulk-loading data prepared for one path through the other. 
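
Since the two paths disagree only on `Date`, a small converter at the point where values are serialized removes the guesswork. A minimal sketch: the helper names, the `Event` type, and the `occurredOn` property are hypothetical, and the node shape follows the `{"type": ..., "data": ...}` form used by `load`.

```python
import datetime
import json

EPOCH = datetime.date(1970, 1, 1)


def date_for_load(d: datetime.date) -> int:
    """Date as integer days since 1970-01-01, the form `load` JSONL expects."""
    return (d - EPOCH).days


def date_for_mutate(d: datetime.date) -> str:
    """Date as an ISO string, the form `mutate --params` expects."""
    return d.isoformat()


d = datetime.date(2026, 4, 29)
print(date_for_load(d))    # 20572
print(date_for_mutate(d))  # 2026-04-29

# One JSONL node row carrying the integer-days form (hypothetical type and property).
row = {"type": "Event", "data": {"slug": "evt-foo", "occurredOn": date_for_load(d)}}
print(json.dumps(row))
```

Pass a `datetime.date`, not a `datetime.datetime`: subtracting a `date` from a `datetime` raises `TypeError`, and `DateTime` values are ISO strings on both paths anyway.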
+ +## Naming Convention + +`verb_object`: +- `get_signal`, `recent_signals`, `search_signals` +- `signal_patterns`, `signal_elements` (traversal queries) +- `add_signal`, `link_signal_forms_pattern` (mutations) + +## Aliases Over Raw Queries + +For anything an agent or script will call repeatedly, define an operator alias. See `references/aliases.md`. diff --git a/skills/omnigraph/references/remote-ops.md b/skills/omnigraph/references/remote-ops.md new file mode 100644 index 0000000..e956dd7 --- /dev/null +++ b/skills/omnigraph/references/remote-ops.md @@ -0,0 +1,142 @@ +# Remote Graph Operations + +## Contents +- What's different about remote +- Verify after every write +- 504 Gateway Timeout +- Fork-branch 504 fingerprint +- Targeting a remote graph (`--server`, `login`) +- Version drift / `sync_branch()` +- `manifest_conflict` 409 +- 429 Too Many Requests +- Duplicate risk on blind retry +- Reading large schemas safely +- Prevention checklist + +When the graph URI is a remote endpoint (`omnigraph-server` behind ALB / CloudFront, bearer-authenticated) instead of a local S3 path, several CLI behaviors change in ways the local-storage workflow never exposes. This reference covers the failures and operational rituals specific to remote graphs. + +## What's different about remote + +A remote graph runs server-side. Every write executes on the server — staged per touched table, then published atomically as a **single manifest commit** guarded by a compare-and-swap on expected table versions — and is gated by a connection-level idle timeout (CloudFront defaults to ~30s). There is no separate "run" object to poll — write status is implied by the HTTP response (and verifiable via `commit list`). The local CLI is a thin client; it never sees the commit happen, only the HTTP response. That asymmetry is the root of every gotcha below. 
+ +| Local repo | Remote repo | +|---|---| +| CLI writes S3 directly | Server executes the write, publishes one atomic manifest commit | +| No connection timeout | ~30s idle timeout (CloudFront) | +| No admission control | Per-actor `429` + `Retry-After` on writes | +| `load` writes S3-backed storage directly | `load` is server-orchestrated — same command, one atomic commit | +| CLI exit code is authoritative | CLI exit code can lie — verify via `commit list` | + +## Verify after every write + +The CLI's exit code is **not authoritative on remote graphs**. The proxy can drop a response after the server has already committed. Always verify by comparing `main`'s head: + +```bash +HEAD_BEFORE=$(omnigraph commit list --config X --branch main --json | jq -r '.commits[0].graph_commit_id') + +# … run your load / mutate … + +HEAD_AFTER=$(omnigraph commit list --config X --branch main --json | jq -r '.commits[0].graph_commit_id') + +if [[ "$HEAD_BEFORE" != "$HEAD_AFTER" ]]; then + echo "landed" +else + echo "did NOT land — safe to retry" +fi +``` + +For a `load --from` that forks a review branch, also compare the new branch head's `graph_commit_id` against `main`'s. **Identical means the load didn't land — empty fork left behind.** + +For pointed verification of a single record: + +```bash +omnigraph export --config X --type | grep +omnigraph export --config X --type | grep +``` + +## 504 Gateway Timeout: response lost, write status unknown + +A 504 from the proxy means the server didn't respond within the idle timeout. Two server-side outcomes are possible — **the 504 alone cannot distinguish them**: + +1. **Write completed and published** — landed, `main`'s head advanced. Common for small mutations finishing just past the 30s edge. +2. **Write still in progress** — will publish or fail soon. Re-check after a minute. + +Always verify via `commit list` before retrying. Blind retry on append-only types creates duplicates. 
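
For scripts that already capture the CLI's JSON, the head comparison can be expressed in Python. This is a sketch under assumptions: it relies only on the `.commits[0].graph_commit_id` path used in the `jq` calls above, the sample payloads are invented for illustration, and the helper names are hypothetical.

```python
import json


def head_commit_id(commit_list_json: str) -> str:
    """Extract the newest commit id from `commit list --json` output."""
    return json.loads(commit_list_json)["commits"][0]["graph_commit_id"]


def write_landed(head_before: str, head_after: str) -> bool:
    """The write landed if and only if the branch head advanced."""
    return head_before != head_after


# Invented payloads standing in for captured CLI output.
before = '{"commits": [{"graph_commit_id": "c-100"}]}'
after = '{"commits": [{"graph_commit_id": "c-101"}, {"graph_commit_id": "c-100"}]}'

print(write_landed(head_commit_id(before), head_commit_id(after)))   # True
print(write_landed(head_commit_id(before), head_commit_id(before)))  # False
```

After a 504, a `False` result is not yet permission to retry: the write may still be in progress, so re-check after a minute and retry only if the head is still unchanged.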
+ +## Fork-branch 504 fingerprint + +`load --from ` creates the branch **before** loading data. A timed-out fork-load where the data didn't land leaves an empty branch at ``'s head. Stale numbered branches (`feature-v2`, `-v3`, `-v4` …) all sitting at the same `graph_commit_id` as `main` are the fingerprint of prior 504-blocked attempts. + +Find them by comparing each branch's head against `main`'s in `omnigraph branch list --config X --json`, then delete the empty ones. + +## Targeting a remote graph: `--server` and `login` + +`load`, `query`, and `mutate` all run against a remote `omnigraph-server` endpoint — there is no local-only restriction as of 0.7.0. Address an operator-defined server by name instead of pasting URLs and juggling tokens: + +```bash +echo "$TOKEN" | omnigraph login intel-dev # stores it in ~/.omnigraph/credentials (0600) +omnigraph load --server intel-dev --graph spike \ + --data delta.jsonl --from main --mode merge --branch staging +``` + +`--server ` resolves the URL from `~/.omnigraph/config.yaml` and the token via `OMNIGRAPH_TOKEN_` or the credentials file. A token is only ever sent to the server it is keyed to. `--graph ` selects the graph on a multi-graph server. + +## Version drift / `sync_branch()` + +``` +version drift on node:: snapshot pinned vN but dataset is at vM — call sync_branch() and retry +``` + +- `sync_branch()` is **not a CLI command** — it's a server-internal directive that leaked into the error text. Don't go looking for it. +- Cause: another actor committed to `main` between your CLI's snapshot pin and your `mutate` attempt. +- Usually self-resolves on retry — the next call re-pins. +- Calling `omnigraph snapshot` does **not** reliably re-pin for subsequent `mutate`s in the same session. +- If persistent, fall back to `load --from main` onto a fresh branch — a forked branch doesn't suffer from concurrent-commit drift on `main`. 
+- The cleaner, modern form of this conflict is a structured `manifest_conflict` **409** — see below. + +## `manifest_conflict` 409 — stale snapshot, retry + +When another actor commits to the same branch between your query's snapshot pin and your write, the server returns a structured **`manifest_conflict` 409** carrying `table_key` / `expected` / `actual`, rather than silently overwriting. Since v0.4.2 this is the form most concurrent update/delete/merge races take. + +- **Retry it.** A 409 means your write was computed against a stale view and was rejected *before* committing — there is no partial state and no duplicate risk. Re-issue the same call; it re-pins to the new head. +- Concurrent `mutate` × branch-merge on the same target branch resolves to either success or a clean 409 depending on who wins the server's per-table queue — both outcomes are safe. + +## 429 Too Many Requests — back off, then retry + +The server applies **per-actor admission control** to every mutating endpoint (`mutate` / `load` / `schema apply` / branch create·delete·merge). An actor that exceeds its in-flight-request or estimated-byte budget gets a structured **HTTP 429** (`code: too_many_requests`) with a `Retry-After` header — instead of blocking unrelated actors behind a global lock. + +- This is **not** a failed write — the write never started. Honor `Retry-After` and retry; it is always safe (no partial write, no duplicate risk). +- It's per-actor, so one noisy automation can't starve others. If you hit it constantly, batch less aggressively or space your calls out. +- Read-only endpoints are not admission-gated. + +## Duplicate risk on blind retry + +After a 504, never retry without verifying first. 
Different node kinds have different retry semantics: + +| Kind | Retry safety | +|---|---| +| Pointer nodes (`Org`, `Person`, `Opportunity`, `Channel`, `Actor`, `ActionItem`, `Artifact`, `Meeting`, `Technology`, `Campaign`, `UseCase`) | ✓ Idempotent — `@key` upserts dedupe | +| Append-only nodes (`Signal`, `Claim`, `Decision`, `Event`, `Interaction`, `MarketingElement`, `Policy`, `Outcome`) | ✗ Duplicates on retry — verify before retrying | +| Edges | ⚠ No `@key`. Verify via `export --type ` + grep. Some simple edges dedupe server-side; don't rely on it. | + +## Reading large schemas safely + +Remote schemas can be large (tens of KB). Tools that cap stdout (~50KB is common) will truncate or duplicate the output silently — leading to memory-based answers from agents that look correct but reference nonexistent fields. + +Always redirect to a file before reading: + +```bash +omnigraph schema show --config X > /tmp/schema.pg +wc -l /tmp/schema.pg +``` + +Then read the file with offset/limit, not via piped stdout. + +## Prevention checklist + +- Keep mutations small. Single-node inserts finish well under the timeout. +- Prefer `mutate` over `load` for ≤ a handful of records. +- Always run `commit list` after a 504 before deciding to retry. +- For destructive or large-batch work, use `load --from main` onto a feature branch and verify the branch head before merging. +- Read large schemas via file redirect, not piped stdout. +- A `429` (throttle) or a `manifest_conflict` `409` (stale snapshot) is always safe to retry — the write never committed. Honor `Retry-After` on a 429. 
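The retry rules in this checklist can be folded into one triage step. A sketch, assuming a hypothetical `next_step_for_status` helper (not an Omnigraph command) and a `Retry-After` value expressed in seconds; the 1-second fallback is also an assumption:

```bash
# Hypothetical helper, not part of the omnigraph CLI.
# $1 = HTTP status of a remote write, $2 = Retry-After in seconds (optional).
next_step_for_status() {
  case "$1" in
    429) echo "wait ${2:-1}" ;;  # throttled, write never started: sleep, then retry
    409) echo "retry" ;;         # manifest_conflict: rejected before commit, re-issue
    504) echo "verify" ;;        # outcome unknown: compare heads via commit list first
    2??) echo "ok" ;;            # a success response was actually received
    *)   echo "inspect" ;;       # anything else: read the response before acting
  esac
}
```

A caller would sleep for the printed number of seconds on `wait`, re-issue the same call on `retry`, and run the head comparison on `verify` before deciding anything.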
diff --git a/skills/omnigraph/references/schema.md b/skills/omnigraph/references/schema.md new file mode 100644 index 0000000..b30745b --- /dev/null +++ b/skills/omnigraph/references/schema.md @@ -0,0 +1,192 @@ +# Schema Authoring & Evolution + +## Contents +- Authoring (.pg files) +- Evolution (schema plan/apply) +- Supported types +- Decorators (quick reference) +- Interfaces +- Design principles +- Schema evolution in cluster mode + +How to write and evolve `.pg` schemas in Omnigraph. + +## Authoring (.pg files) + +### Use `//` for comments + +Not `#`. The compiler rejects `#` with a parse error that looks like: + +``` +parse error: expected schema_file +``` + +### Enums are inline, not standalone + +The compiler does **not** accept top-level `enum Foo { ... }` blocks. Put the values inline on the property: + +```pg +kind: enum(product, technology, framework, concept, ops) @index +``` + +If the same enum appears on multiple nodes, duplicate it inline — there's no shared enum type. + +### Lists contain scalars only + +`[String]` and `[I32]` are fine. `[Category]` (a list of enum values) is **not** supported. Use `[String]` with query-side filtering, or use a single-valued enum property if one value is enough. + +### `@embed` takes a quoted string + +```pg +embedding: Vector(3072) @embed("text") @index +``` + +Not `@embed(text)`. The source property name is a string literal. + +### Edge constraints go inside a body block + +`@unique(src, dst)` on an edge goes inside `{ }`, after `@card(...)`: + +```pg +edge PartOfArtifact: Chunk -> InformationArtifact @card(1..1) { + @unique(src) +} +``` + +### Lint after every edit + +```bash +omnigraph lint --schema schema.pg --query queries/signals.gq +``` + +This validates the schema **and** the queries against it. No running repo required. Wire it into a precommit hook. 
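One way to wire the lint step into Git is a `pre-commit` hook. A minimal sketch, assuming the schema lives at `schema.pg` and the queries under `queries/` (both paths are assumptions) and that `lint` is given one `.gq` file per call, as in the command above; `lint_queries` is a hypothetical helper name:

```bash
# Sketch of a .git/hooks/pre-commit body; paths and helper name are assumptions.
# lint_queries <schema.pg> <query_dir>
# Lints every .gq file in <query_dir> against the schema, one file per call.
# Returns non-zero if any file fails, which makes Git abort the commit.
lint_queries() {
  lint_status=0
  for gq in "$2"/*.gq; do
    [ -e "$gq" ] || continue   # glob matched nothing: no .gq files to lint
    omnigraph lint --schema "$1" --query "$gq" || lint_status=1
  done
  return "$lint_status"
}
```

To use it, save the function in `.git/hooks/pre-commit`, end the file with `lint_queries schema.pg queries`, and make the file executable. The loop keeps going after a failure so that one commit attempt reports every broken query.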
+ +## Evolution (schema plan/apply) + +### Plan before apply — always + +```bash +omnigraph schema plan --schema next.pg s3://bucket/repo --json +# inspect "supported": true|false and the step list +omnigraph schema apply --schema next.pg s3://bucket/repo +``` + +If `supported: false`, fix the source before applying. Plan is free; run it as often as needed. + +Plan/apply diagnostics carry stable codes of the form **`OG-XXX-NNN`** (since v0.5.0) — match on the code, not the free-form message text. + +**Destructive drops are gated (since v0.5.0).** Dropping a property or type is a soft drop by default (or rejected); to actually lose data you must opt in: + +```bash +omnigraph schema apply --schema next.pg s3://bucket/repo --allow-data-loss +``` + +Over HTTP the equivalent is `{"allow_data_loss": true}` in the schema-apply body. Without the flag, a destructive drop returns a structured diagnostic instead of silently deleting columns. + +### Apply is main-only + +`omnigraph schema apply` rejects any non-`main` branches. Delete or merge feature branches first. This is deliberate: schema changes don't go through review branches. They go straight to main via `plan` + `apply`. + +### Rename, don't replace + +Use `@rename_from(...)` on renames so the planner emits a rename step (preserves data), not a drop+add pair (loses data): + +```pg +node Account @rename_from("User") { + full_name: String @rename_from("name") +} +``` + +Works on node types, edge types, and properties. + +### Required properties need a backfill plan + +Adding a non-nullable property to an existing node is rejected as unsupported. Pattern: + +1. Add as optional: `new_prop: String?` +2. Apply +3. Backfill via a `mutate` or `load --mode merge` +4. Tighten to required in a follow-up apply: `new_prop: String` + +### Keep `@key` stable + +Changing the key field is effectively a replace — it invalidates every external reference to the node. 
Treat identity changes as deliberate, multi-step migrations, not casual field renames. + +### `schema apply` blocks writes while running + +No concurrent mutations during an apply. Plan for a short read-only window. + +## Supported Types + +- **Scalars:** `String`, `Bool`, `I32`, `I64`, `U32`, `U64`, `F32`, `F64`, `Date`, `DateTime`, `Blob` +- **Collections:** `Vector(N)` (fixed-size float vector), `[ScalarType]` (list of scalar) +- **Enums:** `enum(value1, value2, ...)` — inline only, values can contain alphanumerics, underscores, hyphens +- **Optional:** any type + `?` suffix (`String?`, `[I32]?`, `Vector(4)?`) + +## Decorators (quick reference) + +**Property-level:** +- `@key` — primary key (implies index; usually one per node) +- `@unique` — uniqueness constraint +- `@index` — query optimization +- `@range(min, max)` — numeric bounds (open ranges allowed) +- `@check(prop, "regex")` — regex pattern validation on a String property +- `@embed("source_prop")` — embed from a String source into a Vector property +- `@description("...")` — metadata (no migration impact) +- `@instruction("...")` — semantic hint for LLMs/operators + +**Edge-level:** +- `@card(min..max)` — edge cardinality (default: `0..*`) + +**Type-level (nodes/edges/properties):** +- `@rename_from("OldName")` — migration-aware rename + +**Group-level (inside body block):** +- `@unique(prop1, prop2)` — composite uniqueness, enforced as a true tuple key at both intake and merge (works on edges too: `@unique(src, dst)`). Columns must reduce to a scalar key: `@unique` on a `[List]`/`Blob` column is rejected loudly at `load` (it used to be silently un-enforced — fixed in #160). +- `@index(prop1, prop2)` — composite index + +## Interfaces + +Supported but rarely used. 
Declare shared property contracts and node types implement them: + +```pg +interface Searchable { + title: String @index + embedding: Vector(3072) @embed("title") +} + +node Doc implements Searchable { + slug: String @key + body: String +} +``` + +Most schemas are fine without interfaces. Reach for them only when 3+ node types need to share a property contract. + +## Design Principles (brief) + +- **Identity is explicit** — use `@key` on a semantic slug, not internal row IDs +- **Narrow types** — `Date` over `String` for dates, `enum` over `String` for lifecycle states +- **Edge semantics matter** — prefer `AuthoredBy` over `RelatedTo` +- **Constraints live in the schema** — `@unique`, `@range`, `@card` keep invariants out of application code +- **Schemas are reviewable** — clear names, explicit enums, obvious keys + +## Schema Evolution in Cluster Mode + +In a cluster deployment there is **no direct `omnigraph schema apply`** — the +schema is declared (`graphs..schema:` in `cluster.yaml`) and converged: + +```bash +$EDITOR schema.pg +omnigraph cluster plan --config . # shows the engine's migration steps +omnigraph cluster apply --config . --as +# restart the --cluster server to serve the new shape +``` + +Differences from direct `schema apply` (on a non-cluster store): **soft drops +only** (`--allow-data-loss` is not reachable from cluster apply — prior versions +retain dropped columns), +and out-of-band schema changes on the live graph are *drift* — `cluster +refresh` flags them and the next `apply` converges the graph back to the +declared schema. Everything else in this file (`@rename_from`, backfills, +linting, enum discipline) applies unchanged to the `.pg` you edit. 
diff --git a/skills/omnigraph/references/search.md b/skills/omnigraph/references/search.md new file mode 100644 index 0000000..53397ab --- /dev/null +++ b/skills/omnigraph/references/search.md @@ -0,0 +1,150 @@ +# Search & Embeddings + +## Contents +- Embeddings are schema-declared +- Generating embeddings +- Embeddings + `load --mode merge` interaction +- Search functions in queries +- The key pattern: scope first, rank second +- Model / config + +Vector embeddings and text search in Omnigraph. + +## Embeddings are Schema-Declared + +```pg +node Chunk { + text: String + chunk_index: I32 + embedding: Vector(3072) @embed("text") @index + createdAt: DateTime +} +``` + +- `Vector(N)` — fixed-size float vector +- `@embed("source_prop")` — what text field to embed from (quoted string) +- `@index` — enables vector search on this field + +The schema says **where** embeddings live and **what** they come from. Queries don't recompute; they read. + +## Generating Embeddings + +### First time / refresh missing + +```bash +omnigraph embed --seed embed-config.yaml +``` + +Default mode is `fill_missing` — only generates embeddings for rows without one. + +### Re-embed everything + +```bash +omnigraph embed --seed embed-config.yaml --reembed_all +``` + +Use when: +- You changed the source field: `@embed("body")` → `@embed("title")` +- You mutated text at scale and need fresh embeddings +- You switched embedding models (rare) + +### Selective refresh + +```bash +omnigraph embed --seed embed-config.yaml --select "Chunk:chunk_index=42" +``` + +Regenerate only rows matching the selector. + +### Clean (delete) embeddings + +```bash +omnigraph embed --seed embed-config.yaml --clean +``` + +## Embeddings + `load --mode merge` Interaction + +**`load --mode merge` does NOT recompute embeddings.** + +If you update rows whose source fields feed into `@embed(...)`, the source updates but the embedding stays stale. + +Two fixes: +1. Run `omnigraph embed --reembed_all` after the merge +2. 
Use `load --mode overwrite` instead, which re-triggers embedding on load + +## Search Functions in Queries + +All ranking functions require `limit N` — they're order operators, not filters. + +### Vector similarity + +```gq +query nearest_chunks($q: Vector(3072)) { + match { $c: Chunk } + return { $c.text } + order { nearest($c.embedding, $q) } + limit 10 +} +``` + +### BM25 text ranking + +```gq +query top_titles($q: String) { + match { $d: Doc } + return { $d.slug, $d.title } + order { bm25($d.title, $q) } + limit 10 +} +``` + +### Hybrid (Reciprocal Rank Fusion) + +```gq +query hybrid($vq: Vector(3072), $tq: String) { + match { $d: Doc } + return { $d.slug, $d.title } + order { rrf(nearest($d.embedding, $vq), bm25($d.title, $tq)) } + limit 10 +} +``` + +### Text filter (not ranking — no `limit` required) + +```gq +match { + $d: Doc + search($d.title, $q) // full-text filter + fuzzy($d.title, $q, 2) // fuzzy filter, max 2 edits + match_text($d.body, $q) // phrase filter +} +``` + +## The Key Pattern: Scope First, Rank Second + +Filter with graph traversal before invoking vector or text ranking. Ranking over a narrow set is both cheaper and more relevant. + +```gq +query related_chunks($artifact_slug: String, $q: Vector(3072)) { + match { + $a: InformationArtifact { slug: $artifact_slug } + $c partOfArtifact $a // scope: only this artifact's chunks + } + return { $c.text } + order { nearest($c.embedding, $q) } // rank: vector similarity within scope + limit 10 +} +``` + +Don't rank over the entire chunk set if you know a traversal can narrow it first. 
+ +## Model / Config + +Omnigraph uses **two distinct embedding clients** — don't conflate them: + +| Client | When it runs | Default model | Configured via | +|--------|--------------|---------------|----------------| +| **Engine / load-time** | At load, when an `@embed("source")` field is populated (and `omnigraph embed`) | `gemini-embedding-2-preview` (3072-dim) | `GEMINI_API_KEY`, `OMNIGRAPH_GEMINI_BASE_URL`, `OMNIGRAPH_EMBED_*`, `OMNIGRAPH_EMBEDDINGS_MOCK` | +| **Compiler / query-time** | When a query passes a *string* to a ranking op (e.g. `nearest($c.embedding, "some text")`) and the server auto-embeds it | `text-embedding-3-small` (OpenAI-style) | `NANOGRAPH_EMBED_MODEL`, `OPENAI_API_KEY`, `OPENAI_BASE_URL`, `NANOGRAPH_EMBEDDINGS_MOCK` | + +The vector stored in the schema is produced by the **load-time (engine)** client, so `Vector(N)` must match that model's output dimension — `Vector(3072)` for `gemini-embedding-2-preview`. If you point the query-time client at a model with a different dimension than your stored vectors, similarity search returns garbage or errors — keep both sides on the same dimension. Vectors are stored L2-normalized. diff --git a/skills/omnigraph/references/server-policy.md b/skills/omnigraph/references/server-policy.md new file mode 100644 index 0000000..225c708 --- /dev/null +++ b/skills/omnigraph/references/server-policy.md @@ -0,0 +1,224 @@ +# HTTP Server & Cedar Policy + +## Contents +- Starting the server (boot sources) +- HTTP routes +- Auth +- Setup operations bypass the server +- Cedar policy +- Multi-graph mode +- Server + policy together +- Cluster-booted servers + +How to run `omnigraph-server` and gate operations with Cedar policies. + +## Starting the Server + +The server is the canonical runtime entry point — all CLI queries, mutations, and admin ops go through it. **Boot is cluster-only** (RFC-011): the server boots from a cluster and serves N graphs (N ≥ 1) under nested routes. 
There is **no** single-graph / bare-URI / `omnigraph.yaml` boot. + +```bash +omnigraph-server --cluster ./company-brain --bind 127.0.0.1:8080 # a config directory … +omnigraph-server --cluster s3://bucket/prefix --bind 0.0.0.0:8080 # … or a storage-root URI (config-free) +``` + +`--cluster` boots from the cluster's applied revision (see *Cluster-Booted Servers* below). Run it in a separate terminal or background process. + +## HTTP Routes + +All per-graph routes are nested under `/graphs/{id}/...` (`{id}` = a graph id from the applied cluster); bare flat paths (`/query`, `/snapshot`, …) return **404**. `/healthz` and `/graphs` stay flat. + +| Route | Purpose | +|-------|---------| +| `GET /healthz` | liveness probe (flat) | +| `GET /graphs` | enumerate served graphs (flat; `graph_list`-gated) | +| `GET /graphs/{id}/snapshot?branch=` | table state + row counts | +| `POST /graphs/{id}/query` | read query (canonical; `/read` = deprecated alias) | +| `POST /graphs/{id}/mutate` | mutation (`/change` = deprecated alias) | +| `POST /graphs/{id}/load` | bulk JSONL load, 32 MB; branch creation opt-in via `from` (`/ingest` = deprecated alias) | +| `POST /graphs/{id}/export` | NDJSON stream of a branch | +| `GET /graphs/{id}/queries` · `POST /graphs/{id}/queries/{name}` | stored-query catalog (`read`) + invocation (`invoke_query`, +`change` for a stored mutation; deny == 404) | +| `GET /graphs/{id}/schema` · `POST /graphs/{id}/schema/apply` | read `.pg` · migrate (`schema_apply`) | +| `GET/POST /graphs/{id}/branches` · `DELETE …/branches/{b}` · `POST …/branches/merge` | branch ops | +| `GET /graphs/{id}/commits?branch=` · `…/commits/{commit_id}` | history | + +Read routes take `?branch=main` or `?snapshot=`. Writes publish directly and commit atomically via `__manifest`; use the commits route for write/audit history. + +## Auth + +Set bearer tokens on the server process. 
Three sources, in order of precedence: `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` (AWS Secrets Manager) → `OMNIGRAPH_SERVER_BEARER_TOKENS_JSON`/`_FILE` (JSON `{actor_id: token}`) → `OMNIGRAPH_SERVER_BEARER_TOKEN` (single token, actor `default`): + +```bash +OMNIGRAPH_SERVER_BEARER_TOKENS_JSON='{"act-reader":"s3cret"}' \ + omnigraph-server --cluster ./company-brain --bind 0.0.0.0:8080 +``` + +On the client side (0.7.0), register the server once and store its token out of band: + +```bash +echo "s3cret" | omnigraph login remote # → ~/.omnigraph/credentials (0600) +omnigraph query get_signal --server remote --graph spike --params '{"slug":"sig-foo"}' +``` + +`--server remote` resolves the URL from `~/.omnigraph/config.yaml`'s `servers:` and the token via `OMNIGRAPH_TOKEN_REMOTE` or the credentials file. A token is only ever sent to the server it is keyed to. + +### Running without auth requires an explicit opt-in + +You can no longer just "leave auth off." Since v0.6.0 the server **refuses to start** when it has neither bearer tokens nor a policy file, unless you explicitly opt in: + +```bash +omnigraph-server --cluster . --unauthenticated +# or: OMNIGRAPH_UNAUTHENTICATED=1 omnigraph-server --cluster . +``` + +This is a guardrail against accidentally shipping an open server. For pure local dev, pass `--unauthenticated` deliberately. + +## Setup Operations Bypass the Server + +`init` and **local** `load` write storage directly — they don't go through the server (a **remote** `load` is server-orchestrated, POSTing to `/graphs/{id}/load`). Pass the repo URI: + +```bash +omnigraph init --schema schema.pg s3://my-bucket/repos/ +omnigraph load --data seed.jsonl --mode overwrite s3://my-bucket/repos/ +``` + +Everything else — `query`, `mutate`, `snapshot`, `schema plan/apply`, `branch`, `commit` — goes through the running server. + +## Cedar Policy + +Omnigraph can gate sensitive actions with [Cedar](https://www.cedarpolicy.com/) policies.
+ +### Default-deny posture + +Policy is enforced engine-wide (every authoring path calls the same gate), and the default is **closed**, not open: + +| Server state | Bearer tokens | Policy file | Behavior | +|---|---|---|---| +| **Open** | no | no | Every request permitted — but the server refuses to start without `--unauthenticated` / `OMNIGRAPH_UNAUTHENTICATED=1`. | +| **DefaultDeny** | yes | no | Every authenticated request for an action other than `read` is rejected (HTTP 403). "Tokens but forgot the policy file" no longer ships the illusion of protection. | +| **PolicyEnabled** | yes | yes | Requests are evaluated against your Cedar rules. | + +So configuring a policy file is what *enables* writes — there is no "permit everything by default" mode once tokens are set. + +### Gated actions + +Per-graph actions (evaluated against the graph being addressed): + +| Action | Protects | +|--------|----------| +| `read` | query execution | +| `export` | data export | +| `change` | mutations | +| `invoke_query` | stored-query invocation via `POST /graphs/{id}/queries/{name}` (graph-scoped, not branch-scoped). A stored **mutation** is double-gated — it also passes `change`. For a caller without the grant, a denial and an unknown query name both return the same **404** so the catalog can't be probed. | +| `schema_apply` | schema migrations | +| `branch_create` | branch creation | +| `branch_delete` | branch deletion | +| `branch_merge` | merges (especially into protected branches) | + +`admin` exists but is reserved (no call site yet — don't write rules for it). A server-scoped `graph_list` action gates `GET /graphs`; declare it in a `[cluster]`-scoped bundle. + +For any shared repo, gate at least `schema_apply` and `branch_merge`. 
+ +### Where policy is declared + +Cedar bundles are declared in `cluster.yaml` and attach via `applies_to`: `[cluster]` is the server-level engine (gates `graph_list` / `GET /graphs`); `[]` is that graph's engine (gates `invoke_query`, `read`, `change`, `branch_*`, `schema_apply`). `cluster apply` publishes them and the `--cluster` server enforces the applied revision. The `policy.yaml` rule format (below) is the bundle content. + +### `policy.yaml` shape + +The policy model is **allow-only**: every rule is a `permit`. You grant capabilities to groups; anything ungranted is denied by default. There is **no `deny` / `effect` key** — to forbid something, simply don't grant it. + +```yaml +version: 1 # required; must be 1 + +groups: + admins: [act-alice, act-bob] + team: [act-carol, act-dan] + +protected_branches: + - main + +rules: + - id: admins-can-apply-schema # rules use `id`, not `name` + allow: # required `allow:` block + actors: { group: admins } # references a group by name + actions: [schema_apply] + target_branch_scope: protected + + - id: team-can-merge-to-protected + allow: + actors: { group: team } + actions: [branch_merge] + target_branch_scope: protected + + - id: team-can-read-write-unprotected + allow: + actors: { group: team } + actions: [read, change] + branch_scope: unprotected +``` + +To "block unreviewed schema applies," you don't write a deny rule — you just don't grant `schema_apply` to that group. Default-deny does the rest. + +Scope rules (a rule's `allow` block may use **at most one**): + +- `branch_scope: any | protected | unprotected` — for `read`, `export`, `change` (matches the source branch). +- `target_branch_scope: any | protected | unprotected` — for `schema_apply`, `branch_create`, `branch_delete`, `branch_merge` (matches the destination branch). + +### Validate, test, explain + +```bash +# Compile Cedar + check the cluster's applied policies +omnigraph policy validate --cluster . 
+ +# Run declarative test cases +omnigraph policy test --cluster . --tests policy.tests.yaml + +# Debug a single decision +omnigraph policy explain \ + --actor act-alice \ + --action schema_apply \ + --target-branch main \ + --cluster . +``` + +### Test cases (`policy.tests.yaml`) + +```yaml +version: 1 # required; must be 1 +cases: + - id: alice-can-apply-schema # cases use `id`, not `name` + actor: act-alice + action: schema_apply + target_branch: main # schema_apply is target-branch scoped + expect: allow # `allow` / `deny` (not `permit`) + + - id: random-user-cannot-merge-to-main + actor: act-random + action: branch_merge + target_branch: main + expect: deny +``` + +Run `policy test` after every policy edit. Tests are cheap. + +## Multi-graph serving + +A `--cluster` server serves every graph in the applied cluster, each under `/graphs/{id}/...`. `GET /graphs` enumerates them (sorted by id), gated by the cluster-level `graph_list` action — even under `--unauthenticated`, topology stays closed until a `[cluster]` policy grants it. `omnigraph graphs list` mirrors it (remote servers only). + +Policy attaches at two levels via `cluster.yaml` `applies_to`: +- `[]` — per-graph rules (`read`, `change`, `branch_*`, `schema_apply`, `invoke_query`). +- `[cluster]` — server-level rules (`graph_list`). + +There is no runtime add/remove of graphs — edit `cluster.yaml`, `cluster apply`, restart. + +## Server + Policy Together + +When the server is running with a policy file: +1. Every request resolves the actor from the bearer token (the client cannot set actor identity) and checks it against Cedar rules. +2. Unauthorized requests return `403 Forbidden`. +3. The CLI doesn't bypass policy when it connects over HTTP — it's enforced at the server. Enforcement is also engine-wide, so CLI direct-engine writes and embedded SDK consumers hit the same gate. + +Setup ops (`init`, `load`) write storage directly. 
With a policy configured they still flow through the engine-layer enforce gate for the actor you pass via `--as` (or `operator.actor` in `~/.omnigraph/config.yaml`); gate the raw storage layer too (S3 bucket ACLs, object locks) if the bucket is shared. + +## Cluster-Booted Servers + +`omnigraph-server --cluster ` is the only boot source (covered above). It serves the cluster's **applied revision**: `cluster apply` changes take effect on the next restart (no hot reload), and boot is fail-fast with named remedies for missing/pending/tampered state. Bearer tokens and bind stay process-level (env/flags). See `references/cluster.md`. diff --git a/skills/omnigraph/references/stored-queries.md b/skills/omnigraph/references/stored-queries.md new file mode 100644 index 0000000..02aaf75 --- /dev/null +++ b/skills/omnigraph/references/stored-queries.md @@ -0,0 +1,54 @@ +# Stored-Query Registries + +A **stored query** is a `.gq` query that the *server* loads, type-checks at startup, and exposes by name — without ever accepting ad-hoc query source from the client. It's how you publish a vetted, typed query surface to remote callers and MCP tools. + +This is a server-side feature introduced in **v0.6.1**. It is distinct from CLI `aliases:` (see [`aliases.md`](aliases.md)): an alias is local client ergonomics; a stored query is a server-published, policy-gated endpoint. + +## Declaring stored queries (`cluster.yaml`) + +Stored queries are declared in the cluster's `cluster.yaml` — every `query ` in the listed `.gq` files registers: + +```yaml +graphs: + : + schema: schema.pg + queries: queries/ # discover every `query ` in queries/*.gq +``` + +`queries` also accepts an explicit file list (`[a.gq, b.gq]`) or a fine-grained `name: { file: … }` map; an unparseable `.gq` or a duplicate query name across files fails `cluster validate`. `cluster apply` publishes them to the content-addressed catalog, and the `--cluster` server type-checks and serves every applied query. 
Every applied query is listed (per-query `mcp:`/expose flags are a planned phase). + +## CLI + +```bash +omnigraph queries validate # type-check every stored query against the live schema (offline; opens the graph; exits non-zero on drift) +omnigraph queries list # print the addressed graph's registry: query names and typed params +``` + +- `validate` catches schema drift **without restarting the server** — run it after a `schema apply` or before deploying a config change. The server also runs this check at startup and **refuses to boot** on drift or on a duplicate MCP tool name. +- `validate` opens the graph (address with `--store ` or a positional URI); `list` reads the addressed graph's catalog. +- `queries` is distinct from `lint` — `lint` validates a single `.gq` file you point it at; `queries validate` validates the registry the server will actually serve. + +## HTTP surface + +| Route | Gate | Purpose | +|-------|------|---------| +| `GET /graphs/{id}/queries` | `read` | Typed tool catalog of the served queries. Graph-wide (branch-independent; `read` authorized against `main`). | +| `POST /graphs/{id}/queries/{name}` | `invoke_query` (+ `change` for a stored mutation) | Invoke a named query. Body carries params only — **never** `.gq` source. A stored mutation cannot target a `snapshot` (`400`); a param type error is a structured `400` naming the param. | + +`?branch=` / `?snapshot=` query params apply to `POST /graphs/{id}/queries/{name}` reads; branch/snapshot access stays enforced by the inner `read`/`change` gate (`invoke_query` itself is graph-scoped, not branch-scoped). + +## Policy gating (`invoke_query`) + +- **`invoke_query`** is a per-graph Cedar action gating the whole stored-query invocation surface. Grant it like any other action (see [`server-policy.md`](server-policy.md)). +- **Stored mutations are double-gated:** the caller needs `invoke_query` to reach the query **and** `change` for the write. 
An actor with `invoke_query` but not `change` gets `403` on a stored mutation. +- **Deny == unknown:** for a caller *lacking* `invoke_query`, a denial and an unknown query name return the **same 404** (identical body) — the catalog can't be probed. A caller who *holds* `invoke_query` may still get a `403` from the inner gate for a query it can't `read`/`change`, so existence is visible to grant-holders by design. +- **Default-deny mode** (bearer tokens, no `policy.file`) permits only `read`, so *every* `/graphs/{id}/queries/{name}` call returns `404` until an `invoke_query` rule is configured. + +## MCP exposure + +Every applied query is listed in `GET /graphs/{id}/queries` as a typed MCP tool. Per-query exposure controls (`mcp.expose`, `tool_name`) are a planned phase — there is no per-query `mcp:` flag in cluster mode today. + +## Note on per-query authorization + +The catalog is **not** Cedar-filtered per query yet: a caller with `read` but not `invoke_query` can *list* a query it cannot *invoke* (invocation would 404). Per-query authorization is future work; for now the catalog is a discovery surface and `invoke_query` is the invocation gate. +
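For scripting against the HTTP surface described above, the invocation route can be assembled from its parts. A small sketch, assuming a hypothetical `stored_query_url` helper and values that need no URL encoding; the base URL in the usage note is a placeholder:

```bash
# Hypothetical helper, not part of the omnigraph CLI.
# $1 = server base URL, $2 = graph id, $3 = stored query name, $4 = optional branch.
# Prints the POST target for a stored-query invocation (no URL encoding applied).
stored_query_url() {
  url="${1%/}/graphs/$2/queries/$3"   # drop one trailing slash from the base URL
  if [ -n "${4:-}" ]; then
    url="$url?branch=$4"              # branch only matters for stored reads
  fi
  echo "$url"
}
```

For example, `stored_query_url https://example.test spike get_signal main` prints the route for `get_signal` on graph `spike` at branch `main`. The request body carries params only; its exact JSON shape is not specified in this reference, so check the server's OpenAPI document before scripting against it.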