mirror of https://github.com/ModernRelay/omnigraph.git (synced 2026-06-12 01:45:14 +02:00)
Merge pull request #181 from ModernRelay/feat/container-cluster-mode
feat(docker): cluster-mode container + AWS/Railway recipes
commit c3ff076e89
6 changed files with 107 additions and 2 deletions
@@ -2,3 +2,4 @@
 !Dockerfile
 !docker/entrypoint.sh
 !target/release/omnigraph-server
+!target/release/omnigraph
@@ -11,9 +11,13 @@ RUN groupadd --system omnigraph \
     && useradd --system --gid omnigraph --create-home --home-dir /var/lib/omnigraph omnigraph
 
 COPY target/release/omnigraph-server /usr/local/bin/omnigraph-server
+# The CLI ships in the image so the cluster day-2 loop (cluster
+# apply/approve/status, data loads by explicit URI) runs in-container via
+# `docker exec` / ECS exec / `railway shell` — no omnigraph.yaml required.
+COPY target/release/omnigraph /usr/local/bin/omnigraph
 COPY docker/entrypoint.sh /usr/local/bin/omnigraph-entrypoint
 
-RUN chmod 0755 /usr/local/bin/omnigraph-server /usr/local/bin/omnigraph-entrypoint
+RUN chmod 0755 /usr/local/bin/omnigraph-server /usr/local/bin/omnigraph /usr/local/bin/omnigraph-entrypoint
 
 ENV OMNIGRAPH_BIND=0.0.0.0:8080
 
@@ -9,6 +9,17 @@ fi
 
 bind="${OMNIGRAPH_BIND:-0.0.0.0:8080}"
 
+# Cluster mode first, and exclusive (the server's mode-inference rule 0):
+# a deployment serves from cluster state XOR omnigraph.yaml, never a merge.
+# Fail fast here with the same contract the server enforces.
+if [ -n "${OMNIGRAPH_CLUSTER:-}" ]; then
+  if [ -n "${OMNIGRAPH_TARGET_URI:-}" ] || [ -n "${OMNIGRAPH_CONFIG:-}" ] || [ -n "${OMNIGRAPH_TARGET:-}" ]; then
+    echo "OMNIGRAPH_CLUSTER is an exclusive boot source; unset OMNIGRAPH_TARGET_URI/OMNIGRAPH_CONFIG/OMNIGRAPH_TARGET" >&2
+    exit 64
+  fi
+  exec "$SERVER_BIN" --cluster "${OMNIGRAPH_CLUSTER}" --bind "${bind}"
+fi
+
 # URI comes from the env var (the positional arg wins over any config
 # `graphs` block in resolve_target_uri). OMNIGRAPH_CONFIG, when also set,
 # is forwarded as --config purely to supply a policy file — the two
@@ -28,6 +39,8 @@ fi
 
 cat >&2 <<'EOF'
 omnigraph-server container startup requires one of:
+  - OMNIGRAPH_CLUSTER (serve a cluster directory's applied revision;
+                       exclusive — cannot combine with the others)
   - OMNIGRAPH_TARGET_URI
   - OMNIGRAPH_CONFIG
 
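The exclusivity rule in the entrypoint hunks can be tried without building the image. What follows is a minimal sketch under stated assumptions, not the shipped entrypoint: `cluster_boot_check` is a hypothetical function name, the `not-cluster` and `cluster` labels are illustrative, and the function only reports the decision instead of exec-ing the server.

```sh
#!/bin/sh
# Hypothetical helper (not part of the repository): applies the same
# "cluster XOR everything else" rule as the entrypoint hunk, but only
# reports the outcome instead of starting omnigraph-server.
cluster_boot_check() {
  if [ -z "${OMNIGRAPH_CLUSTER:-}" ]; then
    # Cluster mode not requested; other boot sources are out of scope here.
    echo "not-cluster"
    return 0
  fi
  if [ -n "${OMNIGRAPH_TARGET_URI:-}" ] || [ -n "${OMNIGRAPH_CONFIG:-}" ] || [ -n "${OMNIGRAPH_TARGET:-}" ]; then
    # Same exit code the entrypoint uses for a mixed environment.
    echo "OMNIGRAPH_CLUSTER is an exclusive boot source" >&2
    return 64
  fi
  echo "cluster"
}
```

With only `OMNIGRAPH_CLUSTER` set the function prints `cluster`; setting any of the three other variables alongside it makes the function return 64, the status the entrypoint test asserts.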
@@ -58,6 +58,26 @@ got=$(sh "$ep" some-uri --bind 1.2.3.4:9 --extra)
 check "explicit args passthrough" \
   "ARGS: some-uri --bind 1.2.3.4:9 --extra" "$got"
 
+got=$(OMNIGRAPH_CLUSTER="/var/lib/omnigraph/company-brain" OMNIGRAPH_BIND="0.0.0.0:8080" sh "$ep")
+check "CLUSTER only (Phase 5 mode switch)" \
+  "ARGS: --cluster /var/lib/omnigraph/company-brain --bind 0.0.0.0:8080" "$got"
+
+# Exclusivity: OMNIGRAPH_CLUSTER refuses every combination, exit 64.
+for combo in "OMNIGRAPH_TARGET_URI=s3://b/g" "OMNIGRAPH_CONFIG=/etc/o.yaml" "OMNIGRAPH_TARGET=active"; do
+  if out=$(env "$combo" OMNIGRAPH_CLUSTER="/data/cluster" sh "$ep" 2>&1); then
+    echo "FAIL: CLUSTER + ${combo%%=*} unexpectedly succeeded: $out"
+    fail=1
+  else
+    status=$?
+    if [ "$status" -ne 64 ]; then
+      echo "FAIL: CLUSTER + ${combo%%=*} exited $status, want 64"
+      fail=1
+    else
+      echo "ok: CLUSTER + ${combo%%=*} refused (64)"
+    fi
+  fi
+done
+
 if [ "$fail" -ne 0 ]; then
   echo "entrypoint_test: FAILED"
   exit 1
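The test hunk calls `check` and reads `$fail`, both of which are defined outside the lines shown. To make the excerpt easier to follow on its own, here is a hypothetical stand-in for such a helper; the real definition and its output format are not visible in this diff and may differ.

```sh
#!/bin/sh
# Hypothetical stand-in for the test file's `check` helper; the actual
# definition lives outside the hunk. It compares expected vs actual
# output and records a failure in the shared `fail` flag.
fail=0

check() {
  name=$1
  want=$2
  got=$3
  if [ "$want" = "$got" ]; then
    echo "ok: $name"
  else
    echo "FAIL: $name: want '$want', got '$got'"
    fail=1
  fi
}
```

A failing comparison leaves `fail=1` set, which is what the final `if [ "$fail" -ne 0 ]` guard in the hunk relies on.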
@@ -229,7 +229,8 @@ with an in-flight apply.
 - **Replicas**: any number of `--cluster` servers can serve the same config
   directory; boot is read-only. Roll out a change by `apply` once, then
   restarting replicas (serving is static per process — there is no hot
-  reload yet).
+  reload yet). Container/cloud recipes (AWS ECS+EFS, Railway volumes):
+  [deployment.md](deployment.md#cluster-mode-in-containers-aws-railway).
 - **The directory is the deployable unit**: config, catalog, ledger,
   approvals, and graph data all live under it. Back it up as a whole;
   version the *config files* (not `__cluster/` or `graphs/`) in git.
@@ -45,6 +45,72 @@ omnigraph-server s3://my-bucket/graphs/example/releases/2026-04-10-v0.1.0 \
   --bind 0.0.0.0:8080
 ```
 
+## Cluster Mode in Containers (AWS, Railway)
+
+A cluster-booted deployment serves a **cluster directory** (config + state
+ledger + content-addressed catalog + graph data) from a mounted volume — the
+one structural difference from the stateless S3 single-graph shape, which
+needs no volume at all. The container contract:
+
+```bash
+docker run -d \
+  -v /srv/company-brain:/var/lib/omnigraph/cluster \
+  -e OMNIGRAPH_CLUSTER=/var/lib/omnigraph/cluster \
+  -e OMNIGRAPH_SERVER_BEARER_TOKEN=... \
+  -p 8080:8080 <image>
+```
+
+`OMNIGRAPH_CLUSTER` is exclusive: combining it with `OMNIGRAPH_TARGET_URI`,
+`OMNIGRAPH_CONFIG`, or `OMNIGRAPH_TARGET` fails fast (exit 64), the same
+rule the server itself enforces. The image also ships the `omnigraph` CLI,
+so the day-2 loop runs in-container with no `omnigraph.yaml`:
+
+```bash
+docker exec -it <container> sh -c \
+  'omnigraph cluster apply --as <you> --config /var/lib/omnigraph/cluster'
+# then restart the container to pick up the applied state
+```
+
+### AWS (ECS/Fargate + EFS)
+
+1. Push the image to ECR (the `package.yml` workflow builds it).
+2. Create an EFS filesystem; mount it in the task definition at
+   `/var/lib/omnigraph/cluster`.
+3. Task environment: `OMNIGRAPH_CLUSTER=/var/lib/omnigraph/cluster`, bearer
+   tokens via Secrets Manager/SSM into `OMNIGRAPH_SERVER_BEARER_TOKENS_JSON`
+   (or the `--features aws` build's native Secrets Manager source).
+4. ALB in front for TLS; target the container's 8080 with `/healthz` checks.
+5. Day-2: ECS exec into the task → edit/upload config on the volume →
+   `omnigraph cluster apply --as <you> --config /var/lib/omnigraph/cluster`
+   → force a new deployment (restart).
+
+For a deployment that doesn't need the cluster control plane, the classic
+stateless shape — `OMNIGRAPH_TARGET_URI=s3://bucket/graph.omni`, no volume —
+remains the simplest AWS architecture (see Binary/Container Deployment
+above).
+
+### Railway
+
+1. Create a service from the image; attach a **volume** mounted at
+   `/var/lib/omnigraph/cluster`.
+2. Variables: `OMNIGRAPH_CLUSTER=/var/lib/omnigraph/cluster`,
+   `OMNIGRAPH_SERVER_BEARER_TOKEN=<token>`. Railway terminates TLS at its
+   edge and routes to the exposed 8080.
+3. Day-2: `railway shell` (or `railway run`) → `omnigraph cluster apply
+   --as <you> --config /var/lib/omnigraph/cluster` → redeploy/restart the
+   service.
+
+### Constraints (current honest list)
+
+- **Cluster directories are local-filesystem** — the volume is mandatory;
+  S3-hosted cluster dirs are not supported.
+- **No hot reload** — applied changes serve on the next restart.
+- **Single-writer apply** — run `cluster apply` from one place at a time
+  (the state lock enforces this; CI or one operator shell, not both).
+- **Multi-replica serving off a shared volume (EFS) is documented but
+  unvalidated** — boot is lock-free read-only so it should compose, but it
+  is not yet exercised by tests.
+
 ## One-Command Local RustFS Bootstrap
 
 The easiest local S3-backed deployment path is:
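For the cluster-mode recipes added to the deployment docs in this commit, the day-2 loop (apply once, then restart because there is no hot reload) can be wrapped in a small script. This is a rough sketch for the single local container case only: `run`, `day2_apply`, and the `DRY_RUN` switch are hypothetical names introduced here, and the container name and actor are whatever the operator supplies. The `omnigraph cluster apply --as ... --config ...` invocation is taken from the docs text; `docker exec` and `docker restart` are standard Docker CLI commands.

```sh
#!/bin/sh
# Sketch of the day-2 loop for one local container.
# DRY_RUN and both function names are illustrative, not part of omnigraph.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# day2_apply <container> <actor> [cluster_dir]
day2_apply() {
  container=$1
  actor=$2
  cluster_dir=${3:-/var/lib/omnigraph/cluster}
  # Apply once against the cluster directory on the mounted volume;
  # stop here if the apply fails so a bad state is never restarted into.
  run docker exec "$container" omnigraph cluster apply --as "$actor" --config "$cluster_dir" || return
  # Restart so the server boots from the newly applied state.
  run docker restart "$container"
}
```

With `DRY_RUN=1` set, `day2_apply <container> <you>` prints the two commands instead of executing them, which is a cheap way to review what would run before touching a live container.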