mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-30 02:49:39 +02:00
[codex] fix RFC-011 follow-up regressions (#258)

* fix rfc-011 follow-up regressions

* test(cli): remove served schema-apply tests obsoleted by the cluster 409

This PR disables server-side schema apply for cluster-backed serving (409 →
`omnigraph cluster apply`). Two system_local tests still drove *served* schema
apply against a spawned `--cluster` server and asserted the pre-409 behavior, so
they failed under `cargo test --workspace`:

- `local_cli_schema_apply_enforces_engine_layer_policy`: expected a per-actor
  policy `denied`/allow on the served route; the route now 409s for everyone
  before policy runs.
- `local_cli_schema_apply_rejects_stored_query_breakage_before_publish`:
  expected a served apply to reject a stored-query breakage; the route now 409s
  before any apply.

Both exercise a path the PR intentionally removed. Their surviving coverage:
the 409 itself is pinned by `schema_routes::schema_apply_route_refuses_cluster_backed_server_mode`
(asserts 409 + no mutation); stored-query-breakage-before-publish stays covered
by `schema_routes::schema_apply_route_rejects_stored_query_breakage_before_publish`
(single-mode); engine-layer schema_apply Cedar enforcement stays covered by
`policy_engine_chassis`. Remove the obsolete served versions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(server): report the cluster-backed schema-apply 409 after the Cedar gate

The 409 ("schema apply is disabled for cluster-backed serving") fired at the top
of `server_schema_apply`, before `authorize_request`. An authenticated-but-
unauthorized actor therefore learned the server is cluster-backed (409) instead
of getting a normal 403, leaking topology before authorization, against the
same posture that keeps `GET /graphs` default-deny.

Move the 409 below the Cedar gate so the route reports 401 → 403 → 409: an
unauthorized actor gets 403, and only an actor authorized for `schema_apply`
sees the actionable "use `omnigraph cluster apply`" 409. (An open/unauthenticated
server still 409s, as it has no topology to protect.)

Regression: `schema_apply_route_cluster_backed_denies_unauthorized_actor_before_409`
(POLICY_YAML grants no schema_apply → act-ragnor gets 403, not 409). Addresses the
bot-review finding on #258.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
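
The 401 → 403 → 409 ordering described in the commit message can be sketched as a small decision function. This is only an illustrative model, not the server's handler: `Outcome`, `RequestFacts`, and `schema_apply_outcome` are hypothetical names, and the booleans stand in for state the real code derives from bearer tokens, the Cedar policy, and the routing config.

```rust
/// Outcome of a schema-apply request in this simplified, hypothetical model.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Unauthorized,
    Forbidden,
    Conflict,
    Proceed,
}

impl Outcome {
    /// HTTP status for a refusal, or `None` when the request may proceed.
    pub fn status(self) -> Option<u16> {
        match self {
            Outcome::Unauthorized => Some(401),
            Outcome::Forbidden => Some(403),
            Outcome::Conflict => Some(409),
            Outcome::Proceed => None,
        }
    }
}

/// Hypothetical request facts; the real handler computes these itself.
#[derive(Debug, Clone, Copy)]
pub struct RequestFacts {
    /// Server runs open/unauthenticated, so no auth or policy gate applies.
    pub open_server: bool,
    pub has_valid_token: bool,
    pub policy_allows_schema_apply: bool,
    pub cluster_backed: bool,
}

/// Checks run in the order 401, 403, 409, so the topology-revealing 409
/// is only reachable once authentication and authorization have passed.
pub fn schema_apply_outcome(facts: RequestFacts) -> Outcome {
    if !facts.open_server {
        if !facts.has_valid_token {
            return Outcome::Unauthorized;
        }
        if !facts.policy_allows_schema_apply {
            return Outcome::Forbidden;
        }
    }
    if facts.cluster_backed {
        return Outcome::Conflict;
    }
    Outcome::Proceed
}
```

In this model an authenticated actor without a `schema_apply` grant on a cluster-backed server gets `Forbidden`, matching the regression test named in the commit message; only an authorized actor (or any caller on an open server) reaches `Conflict`.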
parent 9513b076d2
commit b5658dc696
19 changed files with 429 additions and 261 deletions
@@ -53,14 +53,13 @@ pub(crate) async fn server_graphs_list(
 ) -> std::result::Result<Json<GraphListResponse>, ApiError> {
     let registry = &state.routing().registry;
 
-    // Server-level Cedar gate. `state.server_policy` is loaded from
-    // `server.policy.file` in `omnigraph.yaml` at startup. When no
-    // server policy is configured, `authorize_request_server` falls
-    // through to the MR-723 default-deny semantics (every non-Read
-    // action denied for an authenticated actor). `GraphList` is not
-    // `Read`, so without a server policy the request gets 403 — which
-    // is the right default (don't leak the registry until the operator
-    // explicitly authorizes it).
+    // Server-level Cedar gate. `state.server_policy` is loaded from the
+    // cluster-scoped policy bundle at startup. When no server policy is
+    // configured, `authorize_request_server` falls through to the MR-723
+    // default-deny semantics (every non-Read action denied for an
+    // authenticated actor). `GraphList` is not `Read`, so without a server
+    // policy the request gets 403 — which is the right default (don't leak
+    // the registry until the operator explicitly authorizes it).
     authorize_request(
         actor.as_ref().map(|Extension(actor)| actor),
         state.server_policy.as_deref(),
@@ -360,22 +359,25 @@ pub(crate) fn authorize(
         // runtime state means the docstring contract on
         // `server_graphs_list` ("don't leak the registry until the
         // operator explicitly authorizes it") holds uniformly; the
-        // operator's only path to enabling it is configuring an
-        // explicit `server.policy.file` in omnigraph.yaml.
+        // operator's only path to enabling it is configuring a
+        // cluster-scoped policy bundle, applying the cluster, and
+        // restarting the server.
         if request.action.resource_kind() == PolicyResourceKind::Server {
             return Ok(Authz::Denied(
-                "server-scoped actions require an explicit `server.policy.file` \
-                 configured in omnigraph.yaml — the management surface is closed \
-                 by default in every runtime state, including --unauthenticated, \
-                 so that server topology is never exposed without operator opt-in."
+                "server-scoped actions require an explicit cluster policy bundle \
+                 applied with `omnigraph cluster apply` and served after restart — \
+                 the management surface is closed by default in every runtime state, \
+                 including --unauthenticated, so that server topology is never exposed \
+                 without operator opt-in."
                     .to_string(),
             ));
         }
         if actor.is_some() && request.action != PolicyAction::Read {
             return Ok(Authz::Denied(
                 "server runs in default-deny mode (bearer tokens configured but no \
-                 policy file). Only `read` actions are permitted; configure \
-                 `policy.file` in omnigraph.yaml to enable other actions."
+                 applied policy bundle). Only `read` actions are permitted; configure \
+                 a graph or cluster policy bundle in the cluster config, run \
+                 `omnigraph cluster apply`, and restart the server to enable other actions."
                     .to_string(),
             ));
         }
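
The two denial branches in the hunk above (server-scoped actions are always closed without a policy; authenticated actors may only `read`) can be modelled in a few lines. This is a sketch under stated assumptions: `Action`, `is_server_scoped`, and `allowed_without_policy` are hypothetical names, and the final "allow" fall-through is an assumption, since the hunk shows only the two denials and not what follows them.

```rust
/// Hypothetical subset of policy actions, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Read,
    Change,
    SchemaApply,
    GraphList,
}

impl Action {
    /// Assumption: `GraphList` stands in for every server-scoped
    /// (registry-level) action.
    pub fn is_server_scoped(self) -> bool {
        matches!(self, Action::GraphList)
    }
}

/// Decision when NO policy bundle is loaded.
/// `authenticated` is true when the request carried a valid bearer token.
pub fn allowed_without_policy(authenticated: bool, action: Action) -> bool {
    // Management surface is closed in every runtime state, even open servers.
    if action.is_server_scoped() {
        return false;
    }
    // Default-deny: an authenticated actor may only read.
    if authenticated && action != Action::Read {
        return false;
    }
    // Assumed fall-through: anything not denied above is allowed.
    true
}
```

Under these assumptions, `allowed_without_policy(true, Action::Change)` is `false`, while `allowed_without_policy(false, Action::GraphList)` is also `false`, which reflects the "never exposed without operator opt-in" wording in the denial message.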
@@ -488,7 +490,7 @@ pub(crate) fn deprecation_headers(successor_link: &'static str) -> [(HeaderName,
     operation_id = "read",
     request_body = ReadRequest,
     responses(
-        (status = 200, description = "Query results (response includes `Deprecation: true` + `Link: </query>; rel=\"successor-version\"`)", body = ReadOutput),
+        (status = 200, description = "Query results (response includes `Deprecation: true` + `Link: <query>; rel=\"successor-version\"`)", body = ReadOutput),
         (status = 400, description = "Bad request", body = ErrorOutput),
         (status = 401, description = "Unauthorized", body = ErrorOutput),
         (status = 403, description = "Forbidden", body = ErrorOutput),
@@ -502,7 +504,7 @@ pub(crate) fn deprecation_headers(successor_link: &'static str) -> [(HeaderName,
 /// route is kept indefinitely for byte-stable back-compat. New integrations
 /// should target `POST /query`, which has clean field names (`query` /
 /// `name`) and a 400-on-mutation guard. Responses from this route include
-/// `Deprecation: true` and `Link: </query>; rel="successor-version"`
+/// `Deprecation: true` and `Link: <query>; rel="successor-version"`
 /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the
 /// signal.
 pub(crate) async fn server_read(
@@ -522,7 +524,7 @@ pub(crate) async fn server_read(
     )
     .await?;
     Ok((
-        deprecation_headers("</query>; rel=\"successor-version\""),
+        deprecation_headers("<query>; rel=\"successor-version\""),
         Json(api::read_output(selected_name, &target, result)),
     ))
 }
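
The header pair that the deprecated routes attach can be sketched with plain strings. This is not the crate's `deprecation_headers` function (whose return type is truncated in the hunk header above); `successor_link` and `deprecation_header_pairs` are hypothetical helpers that only show how the `Link` value quoted in the diff is assembled from a link target.

```rust
/// Build a `Link` header value pointing at the successor route, in the
/// `<target>; rel="successor-version"` shape quoted in the diff.
pub fn successor_link(target: &str) -> String {
    format!("<{target}>; rel=\"successor-version\"")
}

/// Hypothetical stand-in for the real helper: returns the two
/// (header name, header value) pairs as plain strings.
pub fn deprecation_header_pairs(target: &str) -> [(&'static str, String); 2] {
    [
        ("deprecation", "true".to_string()),
        ("link", successor_link(target)),
    ]
}
```

For example, `successor_link("query")` yields `<query>; rel="successor-version"`, the new value in this commit, while passing `"/query"` reproduces the previous one.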
@@ -771,7 +773,7 @@ pub(crate) async fn run_query(
     operation_id = "change",
     request_body = ChangeRequest,
     responses(
-        (status = 200, description = "Mutation results (response includes `Deprecation: true` + `Link: </mutate>; rel=\"successor-version\"`)", body = ChangeOutput),
+        (status = 200, description = "Mutation results (response includes `Deprecation: true` + `Link: <mutate>; rel=\"successor-version\"`)", body = ChangeOutput),
         (status = 400, description = "Bad request", body = ErrorOutput),
         (status = 401, description = "Unauthorized", body = ErrorOutput),
         (status = 403, description = "Forbidden", body = ErrorOutput),
@@ -787,7 +789,7 @@ pub(crate) async fn run_query(
 /// kept indefinitely for back-compat. New integrations should target
 /// `POST /mutate`, which has identical semantics and a name that pairs
 /// cleanly with `POST /query`. Responses from this route include
-/// `Deprecation: true` and `Link: </mutate>; rel="successor-version"`
+/// `Deprecation: true` and `Link: <mutate>; rel="successor-version"`
 /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the
 /// signal.
 pub(crate) async fn server_change(
@@ -808,7 +810,7 @@ pub(crate) async fn server_change(
     )
     .await?;
     Ok((
-        deprecation_headers("</mutate>; rel=\"successor-version\""),
+        deprecation_headers("<mutate>; rel=\"successor-version\""),
         Json(output),
     ))
 }
@@ -1111,12 +1113,16 @@ pub(crate) async fn server_schema_get(
         (status = 400, description = "Bad request", body = ErrorOutput),
         (status = 401, description = "Unauthorized", body = ErrorOutput),
         (status = 403, description = "Forbidden", body = ErrorOutput),
+        (status = 409, description = "Schema apply is disabled for cluster-backed serving; use `omnigraph cluster apply` and restart", body = ErrorOutput),
         (status = 429, description = "Per-actor admission cap exceeded; honor `Retry-After` header", body = ErrorOutput),
     ),
     security(("bearer_token" = [])),
 )]
 /// Apply a schema migration.
 ///
+/// Cluster-backed servers reject this route with `409 Conflict`; operators
+/// must apply schema changes through `omnigraph cluster apply` and restart.
+///
 /// Diffs `schema_source` against the current schema and applies the resulting
 /// migration steps (add/drop type, add/drop column, etc.). **Destructive**:
 /// some steps drop data. Returns the list of steps applied; if `applied` is
@@ -1143,6 +1149,17 @@ pub(crate) async fn server_schema_apply(
             target_branch: Some("main".to_string()),
         },
     )?;
+    // Disable HTTP schema apply on cluster-backed serving AFTER the Cedar gate,
+    // so an unauthorized actor gets a 403 (not a 409 that would disclose the
+    // server is cluster-backed): 401 → 403 → 409, never leak topology before
+    // authorization. An authorized actor gets the actionable 409 signpost.
+    if state.routing().config_path.is_some() {
+        return Err(ApiError::conflict(
+            "server-side schema apply is disabled for cluster-backed serving; \
+             update the cluster config, run `omnigraph cluster apply`, and restart \
+             the server.",
+        ));
+    }
     let est_bytes = request.schema_source.len() as u64;
     let _admission = state
         .workload
@@ -1324,7 +1341,7 @@ pub(crate) async fn server_load(
     operation_id = "ingest",
     request_body = IngestRequest,
     responses(
-        (status = 200, description = "Load results (response includes `Deprecation: true` + `Link: </load>; rel=\"successor-version\"`)", body = IngestOutput),
+        (status = 200, description = "Load results (response includes `Deprecation: true` + `Link: <load>; rel=\"successor-version\"`)", body = IngestOutput),
         (status = 400, description = "Bad request", body = ErrorOutput),
         (status = 401, description = "Unauthorized", body = ErrorOutput),
         (status = 403, description = "Forbidden", body = ErrorOutput),
@@ -1338,7 +1355,7 @@ pub(crate) async fn server_load(
 /// Bulk-load NDJSON data into a branch. Behavior is unchanged; the route is
 /// kept indefinitely for back-compat. New integrations should target
 /// `POST /load`, which has identical semantics. Responses from this route
-/// include `Deprecation: true` and `Link: </load>; rel="successor-version"`
+/// include `Deprecation: true` and `Link: <load>; rel="successor-version"`
 /// headers per RFC 9745 / RFC 8288 so SDKs and proxies can surface the signal.
 pub(crate) async fn server_ingest(
     State(state): State<AppState>,
@@ -1354,7 +1371,7 @@ pub(crate) async fn server_ingest(
     )
     .await?;
     Ok((
-        deprecation_headers("</load>; rel=\"successor-version\""),
+        deprecation_headers("<load>; rel=\"successor-version\""),
         Json(output),
     ))
 }
@@ -1738,4 +1755,3 @@ pub(crate) fn query_params_from_json(
     json_params_to_param_map(params_json, query_params, JsonParamMode::Standard)
         .map_err(|err| color_eyre::eyre::eyre!(err.to_string()))
 }
-

@@ -191,10 +191,10 @@ pub enum ServerConfigMode {
     },
 }
 
-/// Where a Cedar policy bundle comes from at startup. File-based for
-/// omnigraph.yaml deployments; inline (digest-verified catalog content)
-/// for cluster-mode boots, where the catalog may live on object storage
-/// and the server must not re-read mutable state after the snapshot.
+/// Where a Cedar policy bundle comes from at startup. Cluster-local files are
+/// used during config application; inline digest-verified catalog content is
+/// used for serving, where the catalog may live on object storage and the
+/// server must not re-read mutable state after the snapshot.
 #[derive(Debug, Clone)]
 pub enum PolicySource {
     File(PathBuf),
@@ -249,12 +249,10 @@ pub struct AppState {
     /// see MR-668 decision Q6.
     workload: Arc<workload::WorkloadController>,
     bearer_tokens: Arc<[(BearerTokenHash, Arc<str>)]>,
-    /// Server-level Cedar policy. Used by management endpoints (`POST
-    /// /graphs`, `GET /graphs`) which act on the registry resource,
-    /// not on a per-graph resource. Loaded from `server.policy.file`
-    /// in `omnigraph.yaml`. `None` outside multi mode and when no
-    /// server policy is configured. Per-graph policies live on each
-    /// `GraphHandle.policy`.
+    /// Server-level Cedar policy. Used by management endpoints (`GET
+    /// /graphs`) which act on the registry resource, not on a per-graph
+    /// resource. Loaded from the cluster-scoped policy binding when
+    /// configured. Per-graph policies live on each `GraphHandle.policy`.
     server_policy: Option<Arc<PolicyEngine>>,
 }
 
@@ -534,12 +532,11 @@ impl AppState {
     }
 
     /// Multi-mode constructor — used by the startup loop. Operators
-    /// reach this by invoking `omnigraph-server --config omnigraph.yaml`
-    /// with a non-empty `graphs:` map.
+    /// reach this by invoking `omnigraph-server --cluster <dir|s3://...>`.
     ///
     /// Caller supplies the already-opened `GraphHandle`s and (optionally)
-    /// the path to the source config file. `server_policy` is loaded
-    /// from `server.policy.file` if configured.
+    /// the path to the source cluster. `server_policy` is loaded from the
+    /// cluster-scoped policy binding if configured.
     pub fn new_multi(
         handles: Vec<Arc<GraphHandle>>,
         bearer_tokens: Vec<(String, String)>,
@@ -993,7 +990,8 @@ pub async fn serve(config: ServerConfig) -> Result<()> {
         ServerRuntimeState::DefaultDeny => warn!(
             "bearer tokens are configured but no policy file is set — running in \
              default-deny mode (only `read` actions are permitted for authenticated \
-             actors). Configure `policy.file` in omnigraph.yaml to enable Cedar rules."
+             actors). Configure a graph or cluster policy bundle in the cluster config, \
+             run `omnigraph cluster apply`, and restart to enable Cedar rules."
         ),
         ServerRuntimeState::PolicyEnabled => {}
     }
@@ -1123,5 +1121,3 @@ async fn shutdown_signal() {
     }
     info!("shutdown signal received");
 }
-
-

@@ -1,14 +1,13 @@
-//! Server settings: omnigraph.yaml/CLI/env resolution, mode inference
-//! (single vs multi vs cluster), bearer-token sources, and runtime-state
-//! classification (moved verbatim from lib.rs in the modularization).
+//! Server settings: cluster/CLI/env resolution, bearer-token sources, and
+//! runtime-state classification (moved verbatim from lib.rs in the
+//! modularization).
 
 use super::*;
 
 /// Build serving settings from a cluster directory's applied revision
 /// (RFC-005 §D2): graphs at derived roots, stored queries from verified
 /// catalog blob content, policy bundles from blob paths with their applied
-/// bindings. Always multi-graph routing. The unauthenticated/env handling
-/// matches the omnigraph.yaml path.
+/// bindings. Always multi-graph routing.
 pub(crate) async fn load_cluster_settings(
     cluster_dir: &PathBuf,
     cli_bind: Option<String>,
@@ -189,7 +188,8 @@ pub fn classify_server_runtime_state(
             "server has no bearer tokens and no policy file configured. This is a fully \
              open server — pass `--unauthenticated` (or set OMNIGRAPH_UNAUTHENTICATED=1) \
              if you actually want that, otherwise configure bearer tokens (see \
-             docs/user/operations/server.md) and/or `policy.file` in omnigraph.yaml."
+             docs/user/operations/server.md) and a graph or cluster policy bundle in \
+             the cluster config, then run `omnigraph cluster apply` and restart."
         ),
         (false, false, true) => Ok(ServerRuntimeState::Open),
         (true, false, _) => Ok(ServerRuntimeState::DefaultDeny),
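
The match arms visible in the hunk above suggest a three-flag classification. The sketch below is a guess at its shape, not the crate's `classify_server_runtime_state`: the tuple order (tokens, policy, unauthenticated flag), the `PolicyEnabled` arm, and the error type are assumptions, and `RuntimeState` and `classify_runtime_state` are hypothetical names.

```rust
/// Hypothetical mirror of the three runtime states named in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeState {
    Open,
    DefaultDeny,
    PolicyEnabled,
}

/// Classify the server from three startup facts.
/// Assumed tuple order: (has bearer tokens, has policy, --unauthenticated).
pub fn classify_runtime_state(
    has_tokens: bool,
    has_policy: bool,
    unauthenticated: bool,
) -> Result<RuntimeState, String> {
    match (has_tokens, has_policy, unauthenticated) {
        // No tokens, no policy, and no explicit opt-in: refuse to start open.
        (false, false, false) => Err(
            "no bearer tokens and no policy configured; pass --unauthenticated to run open"
                .to_string(),
        ),
        (false, false, true) => Ok(RuntimeState::Open),
        (true, false, _) => Ok(RuntimeState::DefaultDeny),
        // Assumption: any configured policy enables Cedar rules.
        (_, true, _) => Ok(RuntimeState::PolicyEnabled),
    }
}
```

The point of the first arm is that an accidentally unconfigured server fails at startup instead of silently serving without any gate; the operator has to ask for the open state explicitly.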