omnigraph/crates/omnigraph-server/src/config.rs

535 lines
16 KiB
Rust

2026-04-10 20:49:41 +03:00
use std::collections::BTreeMap;
use std::env;
use std::fs;
use std::path::{Path, PathBuf};
use clap::ValueEnum;
use color_eyre::eyre::{Result, bail};
use serde::{Deserialize, Serialize};
mr-668: remove POST /graphs and CLI graphs create (defer runtime graph mgmt)

The POST /graphs runtime-create endpoint shipped in PR 7/10 has three unresolved high-severity bugs:

- flock-on-renamed-inode race: the YAML flock is taken on omnigraph.yaml itself, then a temp file is renamed over it. Cross-process writers end up locking different inodes, both believing they hold exclusive access.
- duplicate-check outside the file lock: precheck runs against the in-memory registry only; the locked closure does config.graphs.insert(...) unconditionally. Concurrent same-id POSTs can persist the loser in YAML while the in-memory registry keeps the winner, so they disagree after restart.
- best_effort_cleanup_init_artifacts deletes _schema.pg / _schema.ir.json / __schema_state.json on any init failure. An accidental re-init against an existing graph's URI destroys its schema; subsequent open() fails at read_text(_schema.pg).

The correct fix is a Lance-style cluster catalog (reserve → init → publish with recovery sidecars), parallel to the engine's existing __manifest discipline. That work is out of scope for v0.7.0.

For now, disable runtime add/remove from the network and CLI surface. Operators add graphs by editing omnigraph.yaml and restarting. The GET /graphs read-only enumeration stays.

Removed:
- POST /graphs handler + router fragment + utoipa registration
- 13 post_graphs_* server tests + 3 composite POST tests + multi_mode_app_with_real_config / post_graph helpers
- CLI omnigraph graphs create subcommand + its handler + cli.rs tests
- system_remote.rs combined list+create test trimmed to list-only
- YAML rewrite infra: rewrite_atomic[_with_modify], RewriteAtomicError, staging_path, hash_config_file, AppState::config_hash field + threading through new_multi and open_multi_graph_state
- fs2 dependency (verified absent from cargo tree)
- sha2/fs2 imports in config.rs (only the rewrite path used them)
- Cedar PolicyAction::GraphCreate variant + "graph_create" match arms + action def in Cedar schema + graph_create_action_authorizes_against_server_resource test
- GraphCreateRequest / GraphCreateResponse / GraphSchemaSpec / GraphPolicySpec API types (only the POST handler / CLI imported them)

Kept:
- GET /graphs (read-only enumeration) and graph_list Cedar action
- omnigraph graphs list CLI subcommand
- All multi-graph startup, mode inference, cluster routes, per-graph + server-level Cedar policies
- server_settings_drive_multi_graph_startup_end_to_end (the test that covers operator-authored YAML + restart, the path that survives)
- best_effort_cleanup_init_artifacts and the three init failpoints (still reachable from CLI `omnigraph init`; preflight fix deferred as a follow-up)
- GraphRegistry::insert and its concurrency tests: production callers are gone, but the method is the natural seam for the future cluster-catalog work

Also fixed (transcript issue 4):
- ALWAYS_FLAT_PATHS now includes /graphs so multi-mode OpenAPI advertises the management route correctly (was previously rewritten to /graphs/{graph_id}/graphs)
- multi_mode_openapi_keeps_healthz_flat → renamed to multi_mode_openapi_keeps_management_paths_flat, asserts both /healthz and /graphs stay flat
- multi_mode_openapi_prefixes_operation_ids_with_cluster skips /graphs in addition to /healthz

Doc fixes:
- docs/user/cli.md: graphs list example was --target http://..., but --target is a config-graph-name lookup; corrected to --uri. Removed the graphs create example.
- docs/user/server.md: dropped POST /graphs row, "omnigraph.yaml ownership", and "POST /graphs body shape" sections. Added a paragraph stating runtime add/remove is not exposed in v0.7.0.
- docs/user/policy.md: dropped graph_create action; reworded the "Configuration" line to clarify that server-scoped rules (graph_list) take neither branch_scope nor target_branch_scope.
- docs/releases/v0.7.0.md: rewrote release narrative: multi-graph mode ships; runtime add/remove deferred.
- AGENTS.md: HTTP server bullet and capability matrix row updated to reflect read-only GET /graphs and the operator-edit workflow.
- openapi.json regenerated; /graphs has only .get, no .post.

Diff: 17 files, +123 −1525 LOC.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 17:49:38 +02:00

pub const DEFAULT_CONFIG_FILE: &str = "omnigraph.yaml";

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct ProjectConfig {
    pub name: Option<String>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TargetConfig {
    pub uri: String,
    pub bearer_token_env: Option<String>,
mr-668: multi-graph startup + mode inference (PR 5/10)

PR 5 of the MR-668 multi-graph server work. This is the first PR that makes multi mode actually usable end-to-end: operators invoking `omnigraph-server --config omnigraph.yaml` with a non-empty `graphs:` map and no single-mode selector now get a running multi-graph server.

Mode inference (MR-668 decision 2, four-rule matrix in `load_server_settings`):
1. CLI `<URI>` positional → Single
2. CLI `--target <name>` → Single (URI from graphs.<name>)
3. `server.graph` in config → Single (URI from graphs.<name>)
4. `--config` + non-empty `graphs:` + no single-mode selector → Multi (all entries in `graphs:`)
5. otherwise → error with migration hint

Rule 5's error message names every escape hatch so operators can fix their invocation without grepping docs.

Config schema extensions:
- `TargetConfig.policy: PolicySettings` (per-graph Cedar policy file). `#[serde(default)]` so existing single-graph YAMLs keep parsing.
- `ServerDefaults.policy: PolicySettings` (server-level Cedar policy for management endpoints; loaded in PR 5, wired into `GET /graphs` in PR 6b).
- `OmnigraphConfig::resolve_target_policy_file(name)` and `resolve_server_policy_file()` helpers; both resolve relative to the config file's `base_dir`.

Public types added to `omnigraph-server`:
- `ServerConfigMode { Single { uri, policy_file } | Multi { graphs, config_path, server_policy_file } }`.
- `GraphStartupConfig { graph_id, uri, policy_file }`: one entry per graph in multi mode.

`ServerConfig` shape change:
- WAS: `{ uri: String, bind, policy_file, allow_unauthenticated }`.
- NOW: `{ mode: ServerConfigMode, bind, allow_unauthenticated }`.
- Breaking for any code that constructs `ServerConfig` directly. `main.rs` is unaffected (uses `load_server_settings`).

`serve()` now forks on `ServerConfig.mode`:
- Single: existing flow via `AppState::open_with_bearer_tokens_and_policy`.
- Multi: parallel open via `futures::stream::iter(graphs).map(open_single_graph).buffer_unordered(4).collect()`. Bound 4 is a rule-of-thumb for I/O-bound work; at N≤10 this trades startup latency for a small amount of concurrent S3/Lance open pressure. Fail-fast: first open error aborts startup; in-flight opens drop their engine via Arc (Lance datasets close cleanly).

New helper `open_single_graph(GraphStartupConfig)`:
- Validates `GraphId` per the regex in PR 1.
- `Omnigraph::open(uri).await` with descriptive error context.
- Loads per-graph policy file and re-applies it via `Omnigraph::with_policy` (engine-layer enforcement, MR-722).
- Returns `Arc<GraphHandle>` ready for the registry.

Routing middleware bug fix:
- `Router::nest("/graphs/{graph_id}", inner)` rewrites `request.uri().path()` to the inner suffix (e.g. `/snapshot`). The previous middleware tried to parse `{graph_id}` from `request.uri().path()` and got 400 instead of 200. Fixed by reading from `axum::extract::OriginalUri` request extension, which preserves the pre-rewrite URI.
- Caught by the two new tests `cluster_routes_dispatch_per_graph_handle` and `cluster_route_for_unknown_graph_returns_404`.

Tests (14 new, all passing):
- Four-rule matrix: one test per branch + the joint case `mode_inference_cli_uri_overrides_graphs_map` + the empty-graphs-map error case.
- Per-graph + server-level policy file path resolution.
- Reserved `GraphId` rejection at startup.
- End-to-end multi-graph routing: two graphs side by side, each cluster route hits the right engine.
- Unknown graph id under cluster prefix → 404.
- Flat routes 404 in multi mode.

Inline `ServerConfig` test (`serve_refuses_to_start_in_state_1_without_unauthenticated`) and three `server_settings_*` tests updated to the new `mode` shape.

Result: 211 server tests green (74 lib + 71 integration + 66 openapi), MR-731 regression test still pinned and passing.

LOC: +45 config.rs, +281 lib.rs (net), +395 tests/server.rs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 20:09:12 +02:00
    /// Per-graph Cedar policy file (MR-668). In single-graph mode this
    /// field is unused; the top-level `policy.file` applies. In
    /// multi-graph mode, each `graphs.<id>.policy.file` governs that
    /// graph's HTTP-layer Cedar enforcement.
    #[serde(default)]
    pub policy: PolicySettings,
}
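
The mode-inference rules listed in the PR 5/10 commit message can be sketched as a single function. This is a simplified illustration, not the actual `load_server_settings`: the `Mode` enum, the `infer_mode` name, the `have_config` flag, and the flattened name-to-URI map are all assumptions made for the example.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for the server's mode enum.
#[derive(Debug, PartialEq)]
enum Mode {
    Single { uri: String },
    Multi { graph_ids: Vec<String> },
}

/// Sketch of the inference order. `graphs` maps graph name -> URI
/// (an assumed flattening of the `graphs:` section).
fn infer_mode(
    cli_uri: Option<&str>,
    cli_target: Option<&str>,
    server_graph: Option<&str>,
    have_config: bool,
    graphs: &BTreeMap<String, String>,
) -> Result<Mode, String> {
    // Rule 1: a positional URI always selects single mode.
    if let Some(uri) = cli_uri {
        return Ok(Mode::Single { uri: uri.to_string() });
    }
    // Rules 2 and 3: a named graph from `--target`, then `server.graph`.
    if let Some(name) = cli_target.or(server_graph) {
        return graphs
            .get(name)
            .map(|uri| Mode::Single { uri: uri.clone() })
            .ok_or_else(|| format!("graph '{}' not found in config", name));
    }
    // Rule 4: config file with a non-empty `graphs:` map and no selector.
    if have_config && !graphs.is_empty() {
        return Ok(Mode::Multi {
            graph_ids: graphs.keys().cloned().collect(),
        });
    }
    // Rule 5: nothing selected; name the escape hatches in the error.
    Err("no graph selected: pass <URI>, --target <name>, set server.graph, \
         or list graphs under `graphs:`"
        .to_string())
}
```

Because the checks run top to bottom, a positional URI overrides a populated `graphs:` map, which matches the joint case the commit message calls out.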
#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)]
#[serde(rename_all = "snake_case")]
pub enum ReadOutputFormat {
    #[default]
    Table,
    Kv,
    Csv,
    Jsonl,
    Json,
}

#[derive(Debug, Clone, Copy, Default, Eq, PartialEq, Serialize, Deserialize, ValueEnum)]
#[serde(rename_all = "snake_case")]
pub enum TableCellLayout {
    #[default]
    Truncate,
    Wrap,
}

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct CliDefaults {
2026-04-14 04:12:14 +03:00
    #[serde(rename = "graph")]
    pub graph: Option<String>,
    pub branch: Option<String>,
    pub output_format: Option<ReadOutputFormat>,
    pub table_max_column_width: Option<usize>,
    pub table_cell_layout: Option<TableCellLayout>,
policy: CLI policy injection - local writes go through engine enforce (MR-722) (#104)

Closes the CLI side of the policy chassis fan-out. Before this commit, CLI direct-engine writes bypassed Cedar entirely because the CLI never called `Omnigraph::with_policy(...)` for non-`policy validate|test|explain` subcommands.

After this commit, every CLI direct-engine writer (change, load, ingest, branch create/delete/merge, schema apply) opens the engine via a new `open_local_db_with_policy(uri, &config)` helper that installs the configured `PolicyEngine` when `policy.file` is set, and threads the resolved actor through to the `_as` writer methods.

Actor identity resolution:
- New top-level `--as <ACTOR>` global flag on the CLI overrides config.
- New `cli.actor` field in `omnigraph.yaml` provides a default actor.
- Precedence: `--as` > `cli.actor` > None.
- When policy is configured and neither is set, the engine-layer footgun guard fires and the write is denied. Silent bypass via "I forgot the actor" is exactly what the guard prevents.
- Remote HTTP writes ignore both: the actor is bearer-token-resolved server-side.

Helpers added in main.rs:
- `open_local_db_with_policy(uri, &config) -> Result<Omnigraph>`: opens the DB and installs the PolicyEngine when configured. Without policy this is identical to a bare `Omnigraph::open`.
- `resolve_cli_actor(cli_as, &config) -> Option<&str>`: implements the flag > config > None precedence.

Engine: added `load_file_as` to the loader as the actor-aware mirror of `load_file`, so CLI file-path loads flow through the same enforce gate as in-memory `load_as` calls.

Test rewrite: `local_cli_policy_tooling_is_end_to_end_while_local_writes_stay_unenforced` was the explicit assertion of the pre-chassis hole. Renamed and split:
- `local_cli_policy_tooling_is_end_to_end`: sanity for the read-only policy CLI surfaces (validate/test/explain), unchanged behavior.
- `local_cli_change_enforces_engine_layer_policy`: the new assertion. Policy installed + no actor → footgun-guard denial; `--as act-bruno` on protected main → Cedar denial; `--as act-ragnor` (admins-write rule) on main → permit, write committed.

POLICY_E2E_YAML gains an `admins-write` rule so the permit case has a non-trivial actor to exercise.

docs/user/policy.md updated with `cli.actor` + `--as <ACTOR>` usage.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-18 04:06:21 +03:00
    /// Default actor identity for CLI direct-engine writes (MR-722).
    /// Used when `policy.file` is configured and the operator hasn't
    /// passed `--as <actor>` on the command line. With policy configured
    /// and neither this nor `--as` set, the engine-layer footgun guard
    /// fires (no silent bypass).
    pub actor: Option<String>,
}
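
The actor precedence described for `cli.actor` (`--as` > `cli.actor` > None) and the guard that denies writes when policy is configured without an actor can be sketched as below. Both function names and signatures are hypothetical; the real `resolve_cli_actor` takes the config struct, and the real guard lives in the engine layer.

```rust
/// Hypothetical standalone version of the precedence: flag > config > None.
fn resolve_actor<'a>(cli_as: Option<&'a str>, config_actor: Option<&'a str>) -> Option<&'a str> {
    cli_as.or(config_actor)
}

/// Sketch of the guard: with a policy configured, a write without a
/// resolved actor is denied rather than silently bypassing the policy.
fn check_write_allowed(policy_configured: bool, actor: Option<&str>) -> Result<(), String> {
    if policy_configured && actor.is_none() {
        return Err("policy is configured but no actor was resolved".to_string());
    }
    Ok(())
}
```

In this sketch a missing actor is only an error when a policy is present, so configurations without `policy.file` behave as before.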
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct ServerDefaults {
    #[serde(rename = "graph")]
    pub graph: Option<String>,
    pub bind: Option<String>,
    /// Server-level Cedar policy (MR-668). Governs management endpoints
    /// (currently `GET /graphs`; future runtime add/remove endpoints
    /// will plug in here too). In single-graph mode this is unused; the
    /// top-level `policy.file` covers the single graph.
    #[serde(default)]
    pub policy: PolicySettings,
}
#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct AuthDefaults {
    pub env_file: Option<String>,
}

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct QueryDefaults {
    #[serde(default)]
    pub roots: Vec<String>,
}

#[derive(Debug, Clone, Default, Serialize, Deserialize)]
pub struct PolicySettings {
    pub file: Option<String>,
}

#[derive(Debug, Clone, Copy, Eq, PartialEq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum AliasCommand {
    Read,
    Change,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AliasConfig {
    pub command: AliasCommand,
    pub query: String,
    pub name: Option<String>,
    #[serde(default)]
    pub args: Vec<String>,
    #[serde(rename = "graph")]
    pub graph: Option<String>,
    pub branch: Option<String>,
    pub format: Option<ReadOutputFormat>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct OmnigraphConfig {
    #[serde(default)]
    pub project: ProjectConfig,
    #[serde(default, rename = "graphs")]
    pub graphs: BTreeMap<String, TargetConfig>,
    #[serde(default)]
    pub server: ServerDefaults,
    #[serde(default)]
    pub auth: AuthDefaults,
    #[serde(default)]
    pub cli: CliDefaults,
    #[serde(default)]
    pub query: QueryDefaults,
    #[serde(default)]
    pub aliases: BTreeMap<String, AliasConfig>,
    #[serde(default)]
    pub policy: PolicySettings,
    #[serde(skip)]
    base_dir: PathBuf,
}

impl Default for OmnigraphConfig {
    fn default() -> Self {
        Self {
            project: ProjectConfig::default(),
            graphs: BTreeMap::new(),
            server: ServerDefaults::default(),
            auth: AuthDefaults::default(),
            cli: CliDefaults::default(),
            query: QueryDefaults::default(),
            aliases: BTreeMap::new(),
            policy: PolicySettings::default(),
            base_dir: PathBuf::new(),
        }
    }
}

impl OmnigraphConfig {
    pub fn base_dir(&self) -> &Path {
        &self.base_dir
    }

    pub fn cli_branch(&self) -> &str {
        self.cli.branch.as_deref().unwrap_or("main")
    }

    pub fn cli_output_format(&self) -> ReadOutputFormat {
        self.cli.output_format.unwrap_or_default()
    }

    pub fn table_max_column_width(&self) -> usize {
        self.cli.table_max_column_width.unwrap_or(80)
    }

    pub fn table_cell_layout(&self) -> TableCellLayout {
        self.cli.table_cell_layout.unwrap_or_default()
    }

    pub fn cli_graph_name(&self) -> Option<&str> {
        self.cli.graph.as_deref()
    }

    pub fn server_graph_name(&self) -> Option<&str> {
        self.server.graph.as_deref()
    }

    pub fn server_bind(&self) -> &str {
        self.server.bind.as_deref().unwrap_or("127.0.0.1:8080")
    }

    pub fn resolve_target_name<'a>(
        &self,
        explicit_uri: Option<&str>,
        explicit_target: Option<&'a str>,
        default_target: Option<&'a str>,
    ) -> Option<&'a str> {
        explicit_target.or_else(|| {
            if explicit_uri.is_some() {
                None
            } else {
                default_target
            }
        })
    }

    pub fn graph_bearer_token_env(
        &self,
        explicit_uri: Option<&str>,
        explicit_target: Option<&str>,
        default_target: Option<&str>,
    ) -> Option<&str> {
        let target_name =
            self.resolve_target_name(explicit_uri, explicit_target, default_target)?;
        self.graphs
            .get(target_name)
            .and_then(|target| target.bearer_token_env.as_deref())
    }
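
The precedence in `resolve_target_name` is easy to misread: an explicit target always wins, and an explicit URI suppresses the configured default. The standalone sketch below mirrors that logic as free functions so the cases can be checked in isolation. `pick_target`, `token_env`, and the name-to-env-var map are hypothetical simplifications, not part of this file.

```rust
use std::collections::BTreeMap;

/// Same precedence as `resolve_target_name`, written as a free function.
fn pick_target<'a>(
    explicit_uri: Option<&str>,
    explicit_target: Option<&'a str>,
    default_target: Option<&'a str>,
) -> Option<&'a str> {
    // An explicit target wins; otherwise the default applies only when
    // no explicit URI was given.
    explicit_target.or(if explicit_uri.is_some() {
        None
    } else {
        default_target
    })
}

/// Hypothetical flattened view of `graphs`: graph name -> name of the
/// environment variable holding its bearer token.
fn token_env<'a>(
    token_envs: &'a BTreeMap<String, String>,
    explicit_uri: Option<&str>,
    explicit_target: Option<&str>,
    default_target: Option<&str>,
) -> Option<&'a str> {
    let name = pick_target(explicit_uri, explicit_target, default_target)?;
    token_envs.get(name).map(|s| s.as_str())
}
```

One consequence worth noting: when a raw URI is passed without a target name, no bearer-token variable is resolved from the config, even if a default graph is configured.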

    pub fn resolve_auth_env_file(&self) -> Option<PathBuf> {
        let path = self.auth.env_file.as_deref()?;
        let path = Path::new(path);
        Some(if path.is_absolute() {
            path.to_path_buf()
        } else {
            self.base_dir.join(path)
        })
    }

    pub fn resolve_policy_file(&self) -> Option<PathBuf> {
        let path = self.policy.file.as_deref()?;
        let path = Path::new(path);
        Some(if path.is_absolute() {
            path.to_path_buf()
        } else {
            self.base_dir.join(path)
        })
    }
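
The resolvers above share one rule: absolute paths are used as given, and relative paths are joined onto the config file's `base_dir`. A minimal sketch of that rule as a free function follows; `resolve_against` is a hypothetical name for illustration and does not exist in this file.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: resolve `raw` against `base_dir` unless it is
/// already absolute.
fn resolve_against(base_dir: &Path, raw: &str) -> PathBuf {
    let path = Path::new(raw);
    if path.is_absolute() {
        path.to_path_buf()
    } else {
        base_dir.join(path)
    }
}
```

The standard library's `Path::join` already replaces the base when its argument is absolute, so the explicit `is_absolute` branch mainly documents intent.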
mr-668: multi-graph startup + mode inference (PR 5/10) PR 5 of the MR-668 multi-graph server work. This is the first PR that makes multi mode actually usable end-to-end: operators invoking `omnigraph-server --config omnigraph.yaml` with a non-empty `graphs:` map and no single-mode selector now get a running multi-graph server. Mode inference (MR-668 decision 2, four-rule matrix in `load_server_settings`): 1. CLI `<URI>` positional → Single 2. CLI `--target <name>` → Single (URI from graphs.<name>) 3. `server.graph` in config → Single (URI from graphs.<name>) 4. `--config` + non-empty `graphs:` + no single-mode selector → Multi (all entries in `graphs:`) 5. otherwise → error with migration hint Rule 5's error message names every escape hatch so operators can fix their invocation without grepping docs. Config schema extensions: - `TargetConfig.policy: PolicySettings` (per-graph Cedar policy file). `#[serde(default)]` so existing single-graph YAMLs keep parsing. - `ServerDefaults.policy: PolicySettings` (server-level Cedar policy for management endpoints — loaded in PR 5, wired into `GET /graphs` in PR 6b). - `OmnigraphConfig::resolve_target_policy_file(name)` and `resolve_server_policy_file()` helpers — both resolve relative to the config file's `base_dir`. Public types added to `omnigraph-server`: - `ServerConfigMode { Single { uri, policy_file } | Multi { graphs, config_path, server_policy_file } }`. - `GraphStartupConfig { graph_id, uri, policy_file }` — one entry per graph in multi mode. `ServerConfig` shape change: - WAS: `{ uri: String, bind, policy_file, allow_unauthenticated }`. - NOW: `{ mode: ServerConfigMode, bind, allow_unauthenticated }`. - Breaking for any code that constructs `ServerConfig` directly. `main.rs` is unaffected (uses `load_server_settings`). `serve()` now forks on `ServerConfig.mode`: - Single: existing flow via `AppState::open_with_bearer_tokens_and_policy`. 
- Multi: parallel open via `futures::stream::iter(graphs) .map(open_single_graph).buffer_unordered(4).collect()`. Bound 4 is a rule-of-thumb for I/O-bound work — at N≤10 this trades startup latency for a small amount of concurrent S3/Lance open pressure. Fail-fast: first open error aborts startup; in-flight opens drop their engine via Arc (Lance datasets close cleanly). New helper `open_single_graph(GraphStartupConfig)`: - Validates `GraphId` per the regex in PR 1. - `Omnigraph::open(uri).await` with descriptive error context. - Loads per-graph policy file and re-applies it via `Omnigraph::with_policy` (engine-layer enforcement, MR-722). - Returns `Arc<GraphHandle>` ready for the registry. Routing middleware bug fix: - `Router::nest("/graphs/{graph_id}", inner)` rewrites `request.uri().path()` to the inner suffix (e.g. `/snapshot`). The previous middleware tried to parse `{graph_id}` from `request.uri().path()` and got 400 instead of 200. Fixed by reading from `axum::extract::OriginalUri` request extension, which preserves the pre-rewrite URI. - Caught by the two new tests `cluster_routes_dispatch_per_graph_handle` and `cluster_route_for_unknown_graph_returns_404`. Tests (14 new, all passing): - Four-rule matrix: one test per branch + the joint case `mode_inference_cli_uri_overrides_graphs_map` + the empty-graphs-map error case. - Per-graph + server-level policy file path resolution. - Reserved `GraphId` rejection at startup. - End-to-end multi-graph routing: two graphs side by side, each cluster route hits the right engine. - Unknown graph id under cluster prefix → 404. - Flat routes 404 in multi mode. Inline `ServerConfig` test (`serve_refuses_to_start_in_state_1_without_unauthenticated`) and three `server_settings_*` tests updated to the new `mode` shape. Result: 211 server tests green (74 lib + 71 integration + 66 openapi), MR-731 regression test still pinned and passing. LOC: +45 config.rs, +281 lib.rs (net), +395 tests/server.rs. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 20:09:12 +02:00
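The mode-inference rules listed in the commit message above can be sketched as a small precedence function. This is only an illustration under stated assumptions: `Mode`, `Inputs`, and `infer_mode` are hypothetical names invented here, the graphs map is reduced to name → URI strings, and the real `load_server_settings` (not shown on this page) may differ in shape and error wording.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified stand-in for the server's mode type.
#[derive(Debug, PartialEq)]
enum Mode {
    Single { uri: String },
    Multi { graphs: Vec<(String, String)> },
}

/// Hypothetical bundle of the inputs the rules look at.
#[derive(Clone, Copy)]
struct Inputs<'a> {
    cli_uri: Option<&'a str>,      // positional <URI>
    cli_target: Option<&'a str>,   // --target <name>
    server_graph: Option<&'a str>, // server.graph in the config
    has_config: bool,              // --config was supplied
    graphs: &'a BTreeMap<String, String>, // graph name -> URI
}

fn infer_mode(i: &Inputs<'_>) -> Result<Mode, String> {
    // Rule 1: an explicit URI always wins.
    if let Some(uri) = i.cli_uri {
        return Ok(Mode::Single { uri: uri.to_string() });
    }
    // Rules 2 and 3: a named graph, with --target taking precedence
    // over server.graph; the URI comes from the graphs map.
    if let Some(name) = i.cli_target.or(i.server_graph) {
        let uri = i
            .graphs
            .get(name)
            .ok_or_else(|| format!("graph '{name}' not found"))?;
        return Ok(Mode::Single { uri: uri.clone() });
    }
    // Rule 4: a config with a non-empty graphs map and no selector.
    if i.has_config && !i.graphs.is_empty() {
        let graphs = i
            .graphs
            .iter()
            .map(|(name, uri)| (name.clone(), uri.clone()))
            .collect();
        return Ok(Mode::Multi { graphs });
    }
    // Rule 5: nothing selected; name every way out in the message.
    Err("no graph selected: pass <URI>, --target <name>, set server.graph, \
         or use --config with a non-empty graphs: map"
        .to_string())
}
```

Checking the rules in this order means the first matching selector decides the mode, which is why a positional URI overrides a populated `graphs:` map.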
/// Resolve the per-graph policy file path for the named target,
/// relative to the config file's `base_dir`. Returns `None` if the
/// target is unknown or no per-graph `policy.file` is set.
pub fn resolve_target_policy_file(&self, target_name: &str) -> Option<PathBuf> {
let target = self.graphs.get(target_name)?;
let path = target.policy.file.as_deref()?;
let path = Path::new(path);
Some(if path.is_absolute() {
path.to_path_buf()
} else {
self.base_dir.join(path)
})
}
/// Resolve the server-level policy file path (used by management
/// endpoints). Returns `None` if `server.policy.file` is not set.
pub fn resolve_server_policy_file(&self) -> Option<PathBuf> {
let path = self.server.policy.file.as_deref()?;
let path = Path::new(path);
Some(if path.is_absolute() {
path.to_path_buf()
} else {
self.base_dir.join(path)
})
}
/// Resolve a raw config-supplied URI (which may be relative) to its
/// absolute form. URIs containing `://` and absolute paths are passed
/// through as-is; relative paths are joined with the config file's `base_dir`.
pub fn resolve_uri_value(&self, value: &str) -> String {
self.resolve_config_uri(value)
}
pub fn resolve_policy_tests_file(&self) -> Option<PathBuf> {
let policy_file = self.resolve_policy_file()?;
Some(policy_file.with_file_name("policy.tests.yaml"))
}
pub fn alias(&self, name: &str) -> Result<&AliasConfig> {
self.aliases
.get(name)
.ok_or_else(|| color_eyre::eyre::eyre!("alias '{}' not found", name))
}
pub fn resolve_target_uri(
&self,
explicit_uri: Option<String>,
explicit_target: Option<&str>,
default_target: Option<&str>,
) -> Result<String> {
if let Some(uri) = explicit_uri {
return Ok(uri);
}
let target_name = explicit_target.or(default_target).ok_or_else(|| {
color_eyre::eyre::eyre!("URI must be provided via <URI>, --target, or config")
})?;
let target = self.graphs.get(target_name).ok_or_else(|| {
color_eyre::eyre::eyre!(
"graph '{}' not found in {}",
target_name,
DEFAULT_CONFIG_FILE
)
})?;
Ok(self.resolve_config_uri(&target.uri))
}
pub fn resolve_query_path(&self, query: &Path) -> Result<PathBuf> {
if query.is_absolute() {
return Ok(query.to_path_buf());
}
let direct = self.base_dir.join(query);
if direct.exists() {
return Ok(direct);
}
for root in &self.query.roots {
let candidate = self.base_dir.join(root).join(query);
if candidate.exists() {
return Ok(candidate);
}
}
bail!("query file '{}' not found", query.display());
}
fn resolve_config_uri(&self, value: &str) -> String {
if value.contains("://") {
return value.to_string();
}
let path = Path::new(value);
if path.is_absolute() {
value.to_string()
} else {
self.base_dir.join(path).to_string_lossy().to_string()
}
}
}
pub fn default_config_path() -> PathBuf {
PathBuf::from(DEFAULT_CONFIG_FILE)
}
pub fn load_config(config_path: Option<&PathBuf>) -> Result<OmnigraphConfig> {
load_config_in(&env::current_dir()?, config_path)
}
fn load_config_in(cwd: &Path, config_path: Option<&PathBuf>) -> Result<OmnigraphConfig> {
let explicit_path = config_path.cloned();
let config_path = explicit_path.or_else(|| {
let default_path = cwd.join(DEFAULT_CONFIG_FILE);
default_path.exists().then_some(default_path)
});
let mut config = if let Some(path) = &config_path {
serde_yaml::from_str::<OmnigraphConfig>(&fs::read_to_string(path)?)?
} else {
OmnigraphConfig::default()
};
config.base_dir = if let Some(path) = config_path {
absolute_base_dir(cwd, &path)?
} else {
cwd.to_path_buf()
};
Ok(config)
}
fn absolute_base_dir(cwd: &Path, path: &Path) -> Result<PathBuf> {
let path = if path.is_absolute() {
path.to_path_buf()
} else {
cwd.join(path)
};
Ok(path
.parent()
.map(Path::to_path_buf)
.unwrap_or_else(|| cwd.to_path_buf()))
}
#[cfg(test)]
mod tests {
use std::fs;
use std::path::{Path, PathBuf};
use tempfile::tempdir;
use super::{ReadOutputFormat, TableCellLayout, load_config_in};
#[test]
fn load_config_reads_yaml_defaults_from_current_dir() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
r#"
graphs:
  local:
    uri: ./demo.omni
    bearer_token_env: DEMO_TOKEN
auth:
  env_file: .env.omni
cli:
  graph: local
  branch: main
  output_format: kv
  table_max_column_width: 40
  table_cell_layout: wrap
policy: {}
"#,
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
assert_eq!(config.cli_graph_name(), Some("local"));
assert_eq!(config.cli_branch(), "main");
assert_eq!(config.cli_output_format(), ReadOutputFormat::Kv);
assert_eq!(config.table_max_column_width(), 40);
assert_eq!(config.table_cell_layout(), TableCellLayout::Wrap);
assert_eq!(
config.graph_bearer_token_env(None, None, config.cli_graph_name()),
Some("DEMO_TOKEN")
);
assert_eq!(
config.resolve_auth_env_file().unwrap(),
temp.path().join(".env.omni")
);
assert_eq!(
PathBuf::from(
config
.resolve_target_uri(None, None, config.cli_graph_name())
.unwrap()
),
temp.path().join("./demo.omni")
);
}
#[test]
fn load_config_does_not_walk_parent_directories() {
let temp = tempdir().unwrap();
let child = temp.path().join("child");
fs::create_dir_all(&child).unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"graphs:\n  local:\n    uri: ./demo.omni\n",
)
.unwrap();
let config = load_config_in(&child, None).unwrap();
assert!(config.graphs.is_empty());
}
#[test]
fn resolve_query_path_searches_config_roots() {
let temp = tempdir().unwrap();
fs::create_dir_all(temp.path().join("queries")).unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"query:\n roots:\n - queries\npolicy: {}\n",
)
.unwrap();
fs::write(
temp.path().join("queries").join("test.gq"),
"query q { return {} }",
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
let resolved = config.resolve_query_path(Path::new("test.gq")).unwrap();
assert_eq!(resolved, temp.path().join("queries").join("test.gq"));
}
#[test]
fn resolve_query_path_prefers_config_base_dir_over_ambient_cwd() {
let workspace = tempdir().unwrap();
let config_dir = workspace.path().join("config");
let ambient_dir = workspace.path().join("ambient");
fs::create_dir_all(&config_dir).unwrap();
fs::create_dir_all(&ambient_dir).unwrap();
fs::write(config_dir.join("omnigraph.yaml"), "policy: {}\n").unwrap();
fs::write(config_dir.join("local.gq"), "query local { return {} }").unwrap();
fs::write(ambient_dir.join("local.gq"), "query ambient { return {} }").unwrap();
let config =
load_config_in(&ambient_dir, Some(&config_dir.join("omnigraph.yaml"))).unwrap();
let resolved = config.resolve_query_path(Path::new("local.gq")).unwrap();
assert_eq!(resolved, config_dir.join("local.gq"));
}
#[test]
fn policy_block_accepts_non_empty_mapping() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
"policy:\n file: ./policy.yaml\n",
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
assert_eq!(
config.resolve_policy_file().unwrap(),
temp.path().join("policy.yaml")
);
}
#[test]
fn scoped_auth_env_ignores_default_target_when_uri_is_explicit() {
let temp = tempdir().unwrap();
fs::write(
temp.path().join("omnigraph.yaml"),
r#"
graphs:
  demo:
    uri: https://example.com
    bearer_token_env: DEMO_TOKEN
cli:
  graph: demo
"#,
)
.unwrap();
let config = load_config_in(temp.path(), None).unwrap();
assert_eq!(
config.graph_bearer_token_env(
Some("https://override.example.com"),
None,
config.cli_graph_name()
),
None
);
assert_eq!(
config.graph_bearer_token_env(
Some("https://override.example.com"),
Some("demo"),
config.cli_graph_name()
),
Some("DEMO_TOKEN")
);
}
}