mr-668: POST /graphs runtime create endpoint (PR 7/10)

PR 7 of the MR-668 multi-graph server work. Operators can now add a
graph to a running multi-graph server without restarting:

  curl -X POST http://server/graphs \
    -H "Content-Type: application/json" \
    -d '{
          "graph_id": "beta",
          "uri": "/data/beta.omni",
          "schema": { "source": "node Person { name: String @key }\n" },
          "policy": { "file": "./policies/beta.yaml" }
        }'

DELETE remains deferred (out of v0.7.0 scope per the trimmed plan —
no `delete_prefix`, no tombstones).

Body shape (decision 7):
  - Nested `schema: { source: "..." }` (mirrors the `policy: { file }`
    pattern; leaves room for future fields without breakage).
  - Optional nested `policy: { file: "..." }` for per-graph Cedar.
  - 32 MiB body limit (reuses `INGEST_REQUEST_BODY_LIMIT_BYTES`).
  - Asymmetric with `SchemaApplyRequest` which keeps flat
    `schema_source: String` — documented in api.rs.

Atomic YAML rewrite + drift detection:
  - New `config::rewrite_atomic(path, new_config, expected_hash)`:
    flock → re-read + hash check → serialize → write `.tmp` → fsync
    → rename → fsync parent dir. Returns the new hash for the caller
    to update its in-memory baseline.
  - New `config::hash_config_file(path)` — SHA-256 of the on-disk
    bytes, used at startup and after each rewrite.
  - New `RewriteAtomicError { Drift | Io | Serialize }` enum.
  - `AppState.config_hash: Option<Arc<Mutex<[u8;32]>>>` carries the
    in-memory baseline. Updated after every successful rewrite so
    subsequent POSTs don't false-trigger drift.
  - The mutex is `std::sync::Mutex` (brief critical section, no .await
    inside). The flock itself serializes file access process-wide
    AND across multiple server instances (defense in depth).
  - All sync I/O runs inside `tokio::task::spawn_blocking` — flock
    is sync.

Handler ordering (the load-bearing sequence):
  1. Mode check: 405 in single mode.
  2. Cedar authorize: `GraphCreate` against `Omnigraph::Server::"root"`.
  3. Validate body: `GraphId::try_from` (regex + reserved-name), empty
     schema/uri checks, per-graph policy file parse.
  4. Pre-check registry for duplicate graph_id / duplicate uri (409).
  5. `Omnigraph::init` the new engine.
  6. Atomic YAML rewrite (drift detection inside).
  7. Publish in registry (atomic re-check via `GraphRegistry::insert`).

Failure modes (documented in handler rustdoc):
  - Init fails → orphan storage at `req.uri` (PR 2a cleans up schema
    files; Lance datasets remain orphans until `delete_prefix` lands).
  - YAML rewrite fails (drift, IO) → orphan storage; YAML unchanged.
  - Registry insert fails (race) → YAML has entry but registry doesn't;
    next restart opens it cleanly.

New dependency: `fs2 = "0.4"` (workspace + omnigraph-server). POSIX-only
file locking. Linux/macOS deployment supported; Windows out of scope.

Tests (10 new in `tests/server.rs::multi_graph_startup`):
  - `post_graphs_creates_a_new_graph_end_to_end` — happy path, includes
    YAML inspection to confirm the rewrite landed.
  - `post_graphs_baseline_hash_updates_between_rewrites` — two POSTs in
    a row both succeed (drift baseline updates correctly).
  - `post_graphs_duplicate_graph_id_returns_409`
  - `post_graphs_duplicate_uri_returns_409`
  - `post_graphs_invalid_graph_id_returns_400` (reserved name)
  - `post_graphs_empty_schema_source_returns_400`
  - `post_graphs_returns_405_in_single_mode`
  - `post_graphs_yaml_drift_detection_returns_503` — operator hand-edits
    omnigraph.yaml; server refuses to clobber.
  - `hash_config_file_is_deterministic_and_detects_changes`
  - `rewrite_atomic_refuses_when_hash_drifts`

OpenAPI: `server_graphs_create` registered in `ApiDoc::paths(...)`;
openapi.json regenerated.

Result: 225 server tests green (74 lib + 66 openapi + 85 integration),
all MR-731 regressions still pinned.

LOC: ~580 lib.rs net (handler + helpers), ~120 config.rs (rewrite
machinery), +71 api.rs (request/response shapes), +332 tests/server.rs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Ragnor Comerford 2026-05-25 20:38:58 +02:00
parent 94b6346bdd
commit a4e6cb689a
No known key found for this signature in database
9 changed files with 1030 additions and 5 deletions

View file

@ -640,6 +640,121 @@
"bearer_token": []
}
]
},
"post": {
"tags": [
"management"
],
"summary": "Create a new graph at runtime (MR-668 PR 7).",
"description": "Multi-graph mode only. Operators add a graph to the registry\nwithout restarting the server. The server `Omnigraph::init`s the\nnew graph at `req.uri`, atomically rewrites `omnigraph.yaml` to\ninclude the new entry, then publishes the handle in the registry.\n\nCedar-gated by `PolicyAction::GraphCreate` against\n`Omnigraph::Server::\"root\"` (the same server-level policy as\n`GET /graphs`).\n\nFailure modes:\n* Init fails → orphan storage files at `req.uri` (PR 2a cleans up\n schema files but not Lance datasets; operator removes manually).\n* Rewrite fails (`fs2::flock` IO error) → orphan storage; YAML\n unchanged.\n* YAML drift (operator edited the file) → 503; YAML and storage\n both unchanged.\n* Duplicate `graph_id` or `uri` → 409; storage already in use.",
"operationId": "createGraph",
"requestBody": {
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/GraphCreateRequest"
}
}
},
"required": true
},
"responses": {
"201": {
"description": "Graph created",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/GraphCreateResponse"
}
}
}
},
"400": {
"description": "Invalid request body (graph_id, schema, policy file)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"401": {
"description": "Unauthorized",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"403": {
"description": "Forbidden",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"405": {
"description": "Method not allowed (single-graph mode)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"409": {
"description": "graph_id or uri already registered",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"413": {
"description": "Request body too large (>32 MiB)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"500": {
"description": "Init failure or YAML rewrite failure",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
},
"503": {
"description": "omnigraph.yaml drift detected (operator edited the file)",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ErrorOutput"
}
}
}
}
},
"security": [
{
"bearer_token": []
}
]
}
},
"/healthz": {
@ -1325,6 +1440,56 @@
}
}
},
"GraphCreateRequest": {
"type": "object",
"description": "Request body for `POST /graphs` (MR-668 PR 7).\n\nBody shape:\n```json\n{\n \"graph_id\": \"alpha\",\n \"uri\": \"/path/to/alpha.omni\",\n \"schema\": { \"source\": \"<inline .pg source>\" },\n \"policy\": { \"file\": \"./policies/alpha.yaml\" }\n}\n```\n\n32 MiB body limit (matches `INGEST_REQUEST_BODY_LIMIT_BYTES`).",
"required": [
"graph_id",
"uri",
"schema"
],
"properties": {
"graph_id": {
"type": "string",
"description": "New graph's id. Must satisfy `^[a-zA-Z0-9-]{1,64}$`, not start with\n`_`, and not be a reserved name. See `GraphId::try_from`."
},
"policy": {
"oneOf": [
{
"type": "null"
},
{
"$ref": "#/components/schemas/GraphPolicySpec",
"description": "Per-graph Cedar policy. Optional — `None` means the graph has\nno per-graph policy enforcement (HTTP auth still applies if\nconfigured)."
}
]
},
"schema": {
"$ref": "#/components/schemas/GraphSchemaSpec",
"description": "Inline schema (`{ source }`). Required."
},
"uri": {
"type": "string",
"description": "Storage URI (local path or `s3://...`). Must NOT already be in\nuse by another registered graph. Server `Omnigraph::init`s the\ngraph at this URI."
}
}
},
"GraphCreateResponse": {
"type": "object",
"description": "Response from `POST /graphs` on success (201 Created).",
"required": [
"graph_id",
"uri"
],
"properties": {
"graph_id": {
"type": "string"
},
"uri": {
"type": "string"
}
}
},
"GraphInfo": {
"type": "object",
"description": "One entry in the response from `GET /graphs`. Cluster operators\nconsume this list to discover which graphs the server is currently\nserving. The shape is intentionally minimal — `graph_id` and `uri`\nare the only fields a routing client needs.",
@ -1356,6 +1521,33 @@
}
}
},
"GraphPolicySpec": {
"type": "object",
"description": "Per-graph policy specification in `POST /graphs`. Mirrors the\n`policy: { file }` shape in `omnigraph.yaml`'s `graphs.<id>.policy`\nsection.",
"properties": {
"file": {
"type": [
"string",
"null"
],
"description": "Path to the per-graph Cedar policy file, server-side.\nMust be readable by the server process at request time.\nPath is relative to the server's working directory (NOT to the\n`omnigraph.yaml`'s `base_dir`) — caller-supplied paths are\ntrusted as-is."
}
}
},
"GraphSchemaSpec": {
"type": "object",
"description": "Schema specification for a new graph in `POST /graphs`. Nested\nper MR-668 decision 7 — leaves room for future fields without\nbreaking the request shape. Mirrors the `policy: { file }` nesting\npattern.\n\nToday only `source` (inline `.pg` text) is supported. Future fields\nmight include `schema.allow_data_loss`, `schema.version`, etc.\n\n**Asymmetric with `SchemaApplyRequest`**: `POST /schema/apply` still\nuses a flat `schema_source: String` for backwards compatibility.\nA follow-up release may migrate that too.",
"required": [
"source"
],
"properties": {
"source": {
"type": "string",
"description": "Inline `.pg` schema source.",
"example": "node Person {\n name: String @key\n age: I32?\n}"
}
}
},
"HealthOutput": {
"type": "object",
"required": [