mr-668: remove POST /graphs and CLI graphs create (defer runtime graph mgmt)

The POST /graphs runtime-create endpoint shipped in PR 7/10 has three
unresolved high-severity bugs:

  - flock-on-renamed-inode race: the YAML flock is taken on
    omnigraph.yaml itself, then a temp file is renamed over it.
    Cross-process writers end up locking different inodes — both
    believing they hold exclusive access.
  - duplicate-check outside the file lock: precheck runs against
    the in-memory registry only; the locked closure does
    config.graphs.insert(...) unconditionally. Concurrent same-id
    POSTs can persist the loser in YAML while the in-memory registry
    keeps the winner — they disagree after restart.
  - best_effort_cleanup_init_artifacts deletes _schema.pg /
    _schema.ir.json / __schema_state.json on any init failure. An
    accidental re-init against an existing graph's URI destroys its
    schema; subsequent open() fails at read_text(_schema.pg).

The correct fix is a Lance-style cluster catalog (reserve → init →
publish with recovery sidecars), parallel to the engine's existing
__manifest discipline. That work is out of scope for v0.7.0.

For now, disable runtime add/remove from the network and CLI surface.
Operators add graphs by editing omnigraph.yaml and restarting. The
GET /graphs read-only enumeration stays.

Removed:
- POST /graphs handler + router fragment + utoipa registration
- 13 post_graphs_* server tests + 3 composite POST tests +
  multi_mode_app_with_real_config / post_graph helpers
- CLI omnigraph graphs create subcommand + its handler + cli.rs tests
- system_remote.rs combined list+create test trimmed to list-only
- YAML rewrite infra: rewrite_atomic[_with_modify], RewriteAtomicError,
  staging_path, hash_config_file, AppState::config_hash field +
  threading through new_multi and open_multi_graph_state
- fs2 dependency (verified absent from cargo tree)
- sha2/fs2 imports in config.rs (only the rewrite path used them)
- Cedar PolicyAction::GraphCreate variant + "graph_create" match arms
  + action def in Cedar schema + graph_create_action_authorizes_against_server_resource test
- GraphCreateRequest / GraphCreateResponse / GraphSchemaSpec /
  GraphPolicySpec API types (only the POST handler / CLI imported them)

Kept:
- GET /graphs (read-only enumeration) and graph_list Cedar action
- omnigraph graphs list CLI subcommand
- All multi-graph startup, mode inference, cluster routes,
  per-graph + server-level Cedar policies
- server_settings_drive_multi_graph_startup_end_to_end (the test
  that covers operator-authored YAML + restart — the path that
  survives)
- best_effort_cleanup_init_artifacts and the three init failpoints
  (still reachable from CLI `omnigraph init`; preflight fix deferred
  as a follow-up)
- GraphRegistry::insert and its concurrency tests — production
  callers gone, but the method is the natural seam for the future
  cluster-catalog work

Also fixed (transcript issue 4):
- ALWAYS_FLAT_PATHS now includes /graphs so multi-mode OpenAPI
  advertises the management route correctly (was previously rewritten
  to /graphs/{graph_id}/graphs)
- multi_mode_openapi_keeps_healthz_flat → renamed to
  multi_mode_openapi_keeps_management_paths_flat, asserts both
  /healthz and /graphs stay flat
- multi_mode_openapi_prefixes_operation_ids_with_cluster skips
  /graphs in addition to /healthz

Doc fixes:
- docs/user/cli.md: graphs list example was --target http://...,
  but --target is a config-graph-name lookup; corrected to --uri.
  Removed the graphs create example.
- docs/user/server.md: dropped POST /graphs row, "omnigraph.yaml
  ownership", and "POST /graphs body shape" sections. Added a
  paragraph stating runtime add/remove is not exposed in v0.7.0.
- docs/user/policy.md: dropped graph_create action; reworded the
  "Configuration" line to clarify that server-scoped rules (graph_list)
  take neither branch_scope nor target_branch_scope.
- docs/releases/v0.7.0.md: rewrote release narrative — multi-graph
  mode ships; runtime add/remove deferred.
- AGENTS.md: HTTP server bullet and capability matrix row updated to
  reflect read-only GET /graphs and the operator-edit workflow.
- openapi.json regenerated; /graphs has only .get, no .post.

Diff: 17 files, +123 −1525 LOC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Ragnor Comerford 2026-05-26 17:49:38 +02:00
parent d11c18fb27
commit 937fd6382d
No known key found for this signature in database
18 changed files with 136 additions and 1727 deletions

View file

@ -2043,25 +2043,26 @@ fn schema_plan_parity_cli_and_sdk() {
// ─── MR-668 PR 8 — omnigraph graphs subcommand ─────────────────────────────
/// `omnigraph graphs --help` lists the two subcommands shipped in
/// v0.7.0 (`list`, `create`). `delete` is intentionally NOT in the
/// help — DELETE was deferred to bound the v0.7.0 scope.
/// `omnigraph graphs --help` lists only the read-only `list`
/// subcommand. Runtime add (`create`) and remove (`delete`) are
/// deferred — operators add/remove graphs by editing `omnigraph.yaml`
/// and restarting. This test pins the deferral against accidental
/// re-introduction.
#[test]
fn graphs_subcommand_help_lists_list_and_create() {
fn graphs_subcommand_help_lists_list_only() {
let output = output_success(cli().arg("graphs").arg("--help"));
let stdout = stdout_string(&output);
assert!(
stdout.contains("list"),
"expected `list` subcommand in help output:\n{stdout}"
);
let lowered = stdout.to_lowercase();
assert!(
stdout.contains("create"),
"expected `create` subcommand in help output:\n{stdout}"
!lowered.contains("create a new graph"),
"graph create should not be in v0.7.0 help; got:\n{stdout}"
);
// Pin the deferral: `delete` must not appear yet (catches an
// accidental scope expansion).
assert!(
!stdout.to_lowercase().contains("delete a graph"),
!lowered.contains("delete a graph"),
"graph delete should not be in v0.7.0 help; got:\n{stdout}"
);
}
@ -2078,53 +2079,3 @@ fn graphs_list_against_local_uri_errors_with_remote_only_message() {
);
}
/// `omnigraph graphs create` against a local URI errors the same way.
#[test]
fn graphs_create_against_local_uri_errors_with_remote_only_message() {
let temp = tempdir().unwrap();
let schema = temp.path().join("schema.pg");
fs::write(&schema, "node Person { name: String @key }\n").unwrap();
let output = output_failure(
cli()
.arg("graphs")
.arg("create")
.arg("--uri")
.arg("/tmp/local")
.arg("--graph-id")
.arg("alpha")
.arg("--graph-uri")
.arg("/tmp/alpha.omni")
.arg("--schema")
.arg(&schema),
);
let stderr = String::from_utf8_lossy(&output.stderr).into_owned();
assert!(
stderr.contains("remote multi-graph server URL"),
"expected 'remote multi-graph server URL' rejection in stderr; got:\n{stderr}"
);
}
/// `omnigraph graphs create` with a missing `--schema` file errors
/// with the IO context. Catches a regression where the CLI silently
/// sent an empty body.
#[test]
fn graphs_create_with_missing_schema_file_errors() {
let output = output_failure(
cli()
.arg("graphs")
.arg("create")
.arg("--uri")
.arg("http://127.0.0.1:0")
.arg("--graph-id")
.arg("alpha")
.arg("--graph-uri")
.arg("/tmp/alpha.omni")
.arg("--schema")
.arg("/this/path/does/not/exist.pg"),
);
let stderr = String::from_utf8_lossy(&output.stderr).into_owned();
assert!(
stderr.contains("reading schema file"),
"expected 'reading schema file' error in stderr; got:\n{stderr}"
);
}