mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-15 01:55:13 +02:00
feat(cli): cluster-managed maintenance addressing + init signpost (RFC-010 Slice 3) (#221)
* feat(cluster): cluster_root_for_graph_uri detection helper (RFC-010 Slice 3)

  Public helper the CLI uses to refuse `init` into a cluster-managed location: given a graph storage URI of the cluster layout (`<root>/graphs/<id>.omni`), return the cluster root if `<root>` holds `__cluster/state.json`, else None. Cheap by construction: a URI that doesn't match the `<root>/graphs/<id>.omni` shape returns None with zero I/O, so ordinary `init` targets never probe storage. Works for file:// and s3:// via the storage adapter. Adds two ClusterStore accessors (`display_root`, `has_state`).

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(cli): cluster-managed maintenance addressing + init signpost (RFC-010 Slice 3)

  Two cluster-graph-aware CLI behaviors, sharing the cluster-resolution path.

  Maintenance addressing. `optimize`/`repair`/`cleanup` gain `--cluster <dir|s3://…> --cluster-graph <id>`, which resolves the graph's storage URI from the served cluster snapshot (the same truth a `--cluster` server boots from, `read_serving_snapshot*`) and opens it embedded. The operator no longer hand-types `<storage>/graphs/<id>.omni`. A distinct flag is required because the global `--graph` is `requires = server` and means a remote multi-graph id. clap enforces both-or-neither and exclusion with the positional URI / `--target`; an unserved graph errors loudly, pointing at `cluster apply`.

  init signpost. `init` refuses a cluster-managed positional path (the `<root>/graphs/<id>.omni` layout where `<root>` holds `__cluster/state.json`, detected by `cluster_root_for_graph_uri`) and points at `cluster apply`: graphs in an established cluster are created with ledger/recovery/approvals, not by hand. The check is gated on the path shape, so ordinary `init` does no extra I/O and existing pre-apply cluster-graph inits are unaffected.

  planes guard remediation now also mentions `--cluster … --cluster-graph …` (the two Slice-1 guard-string tests track it).

  Docs updated (cli-reference Command planes, maintenance.md, cluster.md §7); the stale "no S3-hosted cluster directories" limitation is dropped (RFC-006 landed it).

  Tests (cli_cluster.rs, reusing the apply-a-cluster fixture): resolve by id, unknown-id error, `--cluster` requires `--cluster-graph`, init refusal + signpost, and ordinary init still works.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(cli): resolve cluster graphs from the state ledger, not the serving snapshot

  Addresses the Greptile review on #221. `read_serving_snapshot*` does all-or-nothing serving validation: recovery-sidecar checks plus a digest verify of every catalog payload (query .gq, policy blobs). Using it to resolve a maintenance target coupled `optimize`/`repair`/`cleanup` to the readiness of unrelated resources: a single corrupt policy blob, or a pending recovery sweep, would block the command before it could touch the graph. That is worst for `repair`, the tool you reach for *when the cluster is degraded*.

  Add `omnigraph_cluster::resolve_graph_storage_uri(cluster, graph_id)`: read the state ledger, confirm the graph is in the applied revision, return `graph_root(id)`. The URI is deterministically derivable, so no catalog validation is needed. The CLI's cluster resolver now calls it.

  Test: `optimize --cluster … --cluster-graph …` still resolves after the catalog payloads (`__cluster/resources/`) are removed, so the ledger-only path is not blocked by degraded/unrelated catalog state.

  Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
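The "cheap by construction" claim in the first commit can be illustrated with a small, std-only sketch. It is not the crate's code: `layout_root`, `cluster_root_for`, and `detect_counting_probes` are hypothetical names, and the closure stands in for the storage probe of `__cluster/state.json`. The point is that the probe only runs once the path shape matches.

```rust
/// Hypothetical shape gate: split `<root>/graphs/<id>.omni` into `<root>`,
/// returning `None` for anything that is not exactly that layout.
fn layout_root(graph_uri: &str) -> Option<&str> {
    let trimmed = graph_uri.trim_end_matches('/');
    let rest = trimmed.strip_suffix(".omni")?;
    let (root, id) = rest.rsplit_once("/graphs/")?;
    if root.is_empty() || id.is_empty() || id.contains('/') {
        return None;
    }
    Some(root)
}

/// Two-stage detection. `has_state` is an assumed stand-in for the storage
/// lookup; it is only called when the shape gate above has passed.
fn cluster_root_for<F>(graph_uri: &str, mut has_state: F) -> Option<String>
where
    F: FnMut(&str) -> bool,
{
    let root = layout_root(graph_uri)?;
    has_state(root).then(|| root.to_string())
}

/// Illustration helper: returns the detected root plus how many times the
/// (fake) storage probe was invoked.
fn detect_counting_probes(graph_uri: &str, is_cluster: bool) -> (Option<String>, usize) {
    let mut probes = 0;
    let root = cluster_root_for(graph_uri, |_| {
        probes += 1;
        is_cluster
    });
    (root, probes)
}
```

Under these assumptions, `detect_counting_probes("./kb.omni", true)` reports zero probes, while a cluster-shaped path such as `/data/cluster/graphs/kb.omni` triggers exactly one.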
parent d6cf5b298c
commit 6144bb18d6
13 changed files with 401 additions and 14 deletions
|
|
@ -238,6 +238,13 @@ pub(crate) enum Command {
|
|||
target: Option<String>,
|
||||
#[arg(long)]
|
||||
config: Option<PathBuf>,
|
||||
/// Cluster directory or storage-root URI; with --cluster-graph, resolves
/// the graph's storage URI from the cluster's applied state ledger.
|
||||
#[arg(long, conflicts_with_all = ["uri", "target"], requires = "cluster_graph")]
|
||||
cluster: Option<String>,
|
||||
/// Graph id within --cluster.
|
||||
#[arg(long, requires = "cluster")]
|
||||
cluster_graph: Option<String>,
|
||||
#[arg(long)]
|
||||
json: bool,
|
||||
},
|
||||
|
|
@ -249,6 +256,13 @@ pub(crate) enum Command {
|
|||
target: Option<String>,
|
||||
#[arg(long)]
|
||||
config: Option<PathBuf>,
|
||||
/// Cluster directory or storage-root URI; with --cluster-graph, resolves
/// the graph's storage URI from the cluster's applied state ledger.
|
||||
#[arg(long, conflicts_with_all = ["uri", "target"], requires = "cluster_graph")]
|
||||
cluster: Option<String>,
|
||||
/// Graph id within --cluster.
|
||||
#[arg(long, requires = "cluster")]
|
||||
cluster_graph: Option<String>,
|
||||
/// Publish verified maintenance drift. Without this flag, repair only
|
||||
/// previews what it would do.
|
||||
#[arg(long)]
|
||||
|
|
@ -268,6 +282,13 @@ pub(crate) enum Command {
|
|||
target: Option<String>,
|
||||
#[arg(long)]
|
||||
config: Option<PathBuf>,
|
||||
/// Cluster directory or storage-root URI; with --cluster-graph, resolves
/// the graph's storage URI from the cluster's applied state ledger.
|
||||
#[arg(long, conflicts_with_all = ["uri", "target"], requires = "cluster_graph")]
|
||||
cluster: Option<String>,
|
||||
/// Graph id within --cluster.
|
||||
#[arg(long, requires = "cluster")]
|
||||
cluster_graph: Option<String>,
|
||||
/// Number of recent versions to keep per table. Either `--keep` or
|
||||
/// `--older-than` (or both) must be set.
|
||||
#[arg(long)]
|
||||
|
|
|
|||
|
|
@ -513,6 +513,37 @@ pub(crate) fn resolve_local_uri(
|
|||
Ok(resolve_local_graph(config, cli_uri, cli_target, operation)?.uri)
|
||||
}
|
||||
|
||||
/// Resolve a storage-plane verb's target to a direct storage URI (RFC-010
/// Slice 3). `--cluster <dir|uri> --cluster-graph <id>` resolves the graph's
/// storage URI from the cluster's **applied state ledger** (see
/// `resolve_cluster_graph_uri`); otherwise the ordinary positional-URI /
/// `--target` path applies. clap enforces both-or-neither and exclusion with
/// `uri`/`--target`, so the mismatched arm is defensive.
|
||||
pub(crate) async fn resolve_storage_uri(
|
||||
config: &OmnigraphConfig,
|
||||
cli_uri: Option<String>,
|
||||
cli_target: Option<&str>,
|
||||
cluster: Option<&str>,
|
||||
cluster_graph: Option<&str>,
|
||||
operation: &str,
|
||||
) -> Result<String> {
|
||||
match (cluster, cluster_graph) {
|
||||
(Some(cluster), Some(graph_id)) => resolve_cluster_graph_uri(cluster, graph_id).await,
|
||||
(None, None) => resolve_local_uri(config, cli_uri, cli_target, operation),
|
||||
_ => bail!("--cluster and --cluster-graph must be given together"),
|
||||
}
|
||||
}
|
||||
|
||||
/// Look up a graph's storage URI from a cluster's applied state ledger. Uses
|
||||
/// the lightweight `resolve_graph_storage_uri` (NOT the full serving-snapshot
|
||||
/// validation), so maintenance — especially `repair` — works even when an
|
||||
/// unrelated catalog payload is corrupt or a recovery sweep is pending.
|
||||
async fn resolve_cluster_graph_uri(cluster: &str, graph_id: &str) -> Result<String> {
|
||||
omnigraph_cluster::resolve_graph_storage_uri(cluster, graph_id)
|
||||
.await
|
||||
.map_err(|diagnostic| color_eyre::eyre::eyre!("{}", diagnostic.message))
|
||||
}
|
||||
|
||||
pub(crate) fn resolve_branch(
|
||||
config: &OmnigraphConfig,
|
||||
cli_branch: Option<String>,
|
||||
|
|
|
|||
|
|
@ -147,6 +147,16 @@ async fn main() -> Result<()> {
|
|||
}
|
||||
}
|
||||
Command::Init { schema, uri, force } => {
|
||||
// RFC-010 Slice 3: graphs inside an established cluster are created
|
||||
// by `cluster apply` (which records ledger/recovery/approvals), not
|
||||
// by hand-running `init` into the cluster's storage layout.
|
||||
if let Some(root) = omnigraph_cluster::cluster_root_for_graph_uri(&uri).await {
|
||||
bail!(
|
||||
"`{uri}` is inside cluster `{root}`. Graphs in a cluster are created by \
|
||||
`cluster apply` (which records ledger, recovery, and approvals), not `init`. \
|
||||
Declare the graph in cluster.yaml and run `cluster apply`."
|
||||
);
|
||||
}
|
||||
let schema_source = fs::read_to_string(&schema)?;
|
||||
ensure_local_graph_parent(&uri)?;
|
||||
Omnigraph::init_with_options(
|
||||
|
|
@ -783,10 +793,20 @@ async fn main() -> Result<()> {
|
|||
uri,
|
||||
target,
|
||||
config,
|
||||
cluster,
|
||||
cluster_graph,
|
||||
json,
|
||||
} => {
|
||||
let config = load_cli_config(config.as_ref())?;
|
||||
- let uri = resolve_local_uri(&config, uri, target.as_deref(), "optimize")?;
+ let uri = resolve_storage_uri(
+     &config,
+     uri,
+     target.as_deref(),
+     cluster.as_deref(),
+     cluster_graph.as_deref(),
+     "optimize",
+ )
+ .await?;
|
||||
let db = Omnigraph::open(&uri).await?;
|
||||
let stats = db.optimize().await?;
|
||||
if json {
|
||||
|
|
@ -823,12 +843,22 @@ async fn main() -> Result<()> {
|
|||
uri,
|
||||
target,
|
||||
config,
|
||||
cluster,
|
||||
cluster_graph,
|
||||
confirm,
|
||||
force,
|
||||
json,
|
||||
} => {
|
||||
let config = load_cli_config(config.as_ref())?;
|
||||
- let uri = resolve_local_uri(&config, uri, target.as_deref(), "repair")?;
+ let uri = resolve_storage_uri(
+     &config,
+     uri,
+     target.as_deref(),
+     cluster.as_deref(),
+     cluster_graph.as_deref(),
+     "repair",
+ )
+ .await?;
|
||||
let db = Omnigraph::open(&uri).await?;
|
||||
let stats = db
|
||||
.repair(omnigraph::db::RepairOptions { confirm, force })
|
||||
|
|
@ -906,13 +936,23 @@ async fn main() -> Result<()> {
|
|||
uri,
|
||||
target,
|
||||
config,
|
||||
cluster,
|
||||
cluster_graph,
|
||||
keep,
|
||||
older_than,
|
||||
confirm,
|
||||
json,
|
||||
} => {
|
||||
let config = load_cli_config(config.as_ref())?;
|
||||
- let uri = resolve_local_uri(&config, uri, target.as_deref(), "cleanup")?;
+ let uri = resolve_storage_uri(
+     &config,
+     uri,
+     target.as_deref(),
+     cluster.as_deref(),
+     cluster_graph.as_deref(),
+     "cleanup",
+ )
+ .await?;
|
||||
|
||||
let older_than_dur = older_than.as_deref().map(parse_duration_arg).transpose()?;
|
||||
|
||||
|
|
|
|||
|
|
@ -139,7 +139,7 @@ pub(crate) fn guard_addressing(cli: &Cli) -> Result<()> {
|
|||
// required positional URI), so its remediation drops the `--target` half.
|
||||
Plane::Storage => match cli.command {
|
||||
Command::Init { .. } => "Pass a storage URI.",
|
||||
- _ => "Use --target <name> or a storage URI.",
+ _ => "Use --target <name>, a storage URI, or --cluster <dir> --cluster-graph <id>.",
|
||||
},
|
||||
Plane::Control => "It operates on a cluster directory (pass --config <dir>).",
|
||||
Plane::Session => "It does not address a graph.",
|
||||
|
|
|
|||
|
|
@ -950,3 +950,138 @@ graphs:
|
|||
assert!(!leaked.contains("phantom") && !leaked.contains("9999"), "{leaked}");
|
||||
}
|
||||
|
||||
|
||||
// ── RFC-010 Slice 3: cluster-managed maintenance addressing + init signpost ──
|
||||
|
||||
/// Stand up an applied, served cluster with the `knowledge` graph and return
|
||||
/// its directory guard. Mirrors the e2e setup (fixture → init → import → apply).
|
||||
fn applied_knowledge_cluster() -> tempfile::TempDir {
|
||||
let temp = tempdir().unwrap();
|
||||
write_cluster_config_fixture(temp.path());
|
||||
init_cluster_derived_graph(temp.path());
|
||||
let import = cluster_json(temp.path(), "import");
|
||||
assert_eq!(import["ok"], true, "{import}");
|
||||
let apply = cluster_json(temp.path(), "apply");
|
||||
assert_eq!(apply["converged"], true, "{apply}");
|
||||
temp
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn optimize_resolves_a_cluster_graph_by_id() {
|
||||
let temp = applied_knowledge_cluster();
|
||||
// No hand-typed storage path: address the graph by cluster dir + id.
|
||||
let out = output_success(
|
||||
cli()
|
||||
.arg("optimize")
|
||||
.arg("--cluster")
|
||||
.arg(temp.path())
|
||||
.arg("--cluster-graph")
|
||||
.arg("knowledge")
|
||||
.arg("--json"),
|
||||
);
|
||||
let payload = parse_stdout_json(&out);
|
||||
assert!(
|
||||
payload["tables"].as_array().is_some(),
|
||||
"optimize did not run against the resolved cluster graph: {payload}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn optimize_unknown_cluster_graph_id_errors() {
|
||||
let temp = applied_knowledge_cluster();
|
||||
let out = output_failure(
|
||||
cli()
|
||||
.arg("optimize")
|
||||
.arg("--cluster")
|
||||
.arg(temp.path())
|
||||
.arg("--cluster-graph")
|
||||
.arg("does-not-exist")
|
||||
.arg("--json"),
|
||||
);
|
||||
let stderr = String::from_utf8_lossy(&out.stderr);
|
||||
assert!(
|
||||
stderr.contains("is not applied in cluster") && stderr.contains("cluster apply"),
|
||||
"expected an unapplied-graph error pointing at cluster apply; got: {stderr}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn cluster_flag_requires_cluster_graph() {
|
||||
// clap enforces both-or-neither.
|
||||
let out = output_failure(
|
||||
cli()
|
||||
.arg("optimize")
|
||||
.arg("--cluster")
|
||||
.arg(".")
|
||||
.arg("--json"),
|
||||
);
|
||||
let stderr = String::from_utf8_lossy(&out.stderr);
|
||||
assert!(
|
||||
stderr.contains("cluster-graph") || stderr.contains("required"),
|
||||
"expected --cluster to require --cluster-graph; got: {stderr}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn init_refuses_a_cluster_managed_path_and_signposts_cluster_apply() {
|
||||
let temp = applied_knowledge_cluster();
|
||||
// Hand-init a NEW graph into the established cluster's storage layout.
|
||||
let out = output_failure(
|
||||
cli()
|
||||
.arg("init")
|
||||
.arg("--schema")
|
||||
.arg(temp.path().join("people.pg"))
|
||||
.arg(temp.path().join("graphs").join("sneaky.omni")),
|
||||
);
|
||||
let stderr = String::from_utf8_lossy(&out.stderr);
|
||||
assert!(
|
||||
stderr.contains("cluster apply"),
|
||||
"init into a cluster-managed path should signpost `cluster apply`; got: {stderr}"
|
||||
);
|
||||
// And it did not create the graph.
|
||||
assert!(!temp.path().join("graphs").join("sneaky.omni").exists());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn init_outside_a_cluster_still_works() {
|
||||
// Regression guard: ordinary init (no cluster layout) is unaffected.
|
||||
let temp = tempdir().unwrap();
|
||||
let schema = fixture("test.pg");
|
||||
let out = output_success(
|
||||
cli()
|
||||
.arg("init")
|
||||
.arg("--schema")
|
||||
.arg(&schema)
|
||||
.arg(temp.path().join("plain.omni")),
|
||||
);
|
||||
assert!(stdout_string(&out).contains("initialized"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn optimize_by_cluster_works_when_catalog_payloads_are_degraded() {
|
||||
// Robustness (Greptile, #221): maintenance resolves the graph URI from the
|
||||
// state ledger alone, so an unrelated corrupt/missing catalog payload (or a
|
||||
// pending recovery sweep) does NOT block it — unlike the full serving-snapshot
|
||||
// read. This is what keeps `repair --cluster` usable on a degraded cluster.
|
||||
let temp = applied_knowledge_cluster();
|
||||
// Remove the verified catalog payloads (queries/policies) — a serving read
|
||||
// would refuse with a catalog-payload diagnostic; the ledger-only resolve
|
||||
// must not care.
|
||||
let resources = temp.path().join("__cluster").join("resources");
|
||||
if resources.exists() {
|
||||
fs::remove_dir_all(&resources).unwrap();
|
||||
}
|
||||
let out = output_success(
|
||||
cli()
|
||||
.arg("optimize")
|
||||
.arg("--cluster")
|
||||
.arg(temp.path())
|
||||
.arg("--cluster-graph")
|
||||
.arg("knowledge")
|
||||
.arg("--json"),
|
||||
);
|
||||
assert!(
|
||||
parse_stdout_json(&out)["tables"].as_array().is_some(),
|
||||
"optimize should resolve via the ledger despite degraded catalog payloads"
|
||||
);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -165,7 +165,7 @@ fn optimize_with_server_flag_errors_wrong_plane() {
|
|||
assert!(
|
||||
stderr.contains("`optimize` is a storage-plane command")
|
||||
&& stderr.contains("--server/--graph address the data plane and do not apply")
|
||||
- && stderr.contains("Use --target <name> or a storage URI."),
+ && stderr.contains("Use --target <name>, a storage URI, or --cluster <dir> --cluster-graph <id>."),
|
||||
"wrong-plane guard message not found; got: {stderr}"
|
||||
);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -121,7 +121,7 @@ fn schema_plan_with_server_flag_errors_wrong_plane() {
|
|||
let stderr = String::from_utf8_lossy(&output.stderr);
|
||||
assert!(
|
||||
stderr.contains("`schema plan` is a storage-plane command")
|
||||
- && stderr.contains("Use --target <name> or a storage URI."),
+ && stderr.contains("Use --target <name>, a storage URI, or --cluster <dir> --cluster-graph <id>."),
|
||||
"schema plan wrong-plane message not found; got: {stderr}"
|
||||
);
|
||||
}
|
||||
|
|
|
|||
|
|
@ -28,7 +28,7 @@ mod store;
|
|||
use store::{ClusterStore, StateLockGuard, StateSnapshot};
|
||||
pub use types::*;
|
||||
use types::*;
|
||||
- pub use serve::{ServingGraph, ServingPolicy, ServingQuery, ServingSnapshot, read_serving_snapshot, read_serving_snapshot_from_storage};
+ pub use serve::{ServingGraph, ServingPolicy, ServingQuery, ServingSnapshot, cluster_root_for_graph_uri, read_serving_snapshot, read_serving_snapshot_from_storage, resolve_graph_storage_uri};
|
||||
use config::{QueriesDecl, observe_declared_graphs, validate_cluster_header, future_field_diagnostics, initial_import_state, observe_live_graph, preview_schema_migration, state_resource_digests, graph_address, policy_address, query_address, schema_address, load_desired, normalize_policy_target, parse_cluster_config, resolve_config_path, resolve_query_decls, validate_id, validate_query_source};
|
||||
use diff::{FailedGraphOrigin, ResourceKind, append_policy_binding_changes, approved_resources, classify_changes, compute_approvals, compute_blast_radius, demote_dependents_of_failed_graphs, diff_resources, resource_kind};
|
||||
use sweep::{mark_approvals_consumed, record_approval_consumed, sweep_recovery_sidecars, tombstone_graph_subtree, warn_pending_recovery_sidecars};
|
||||
|
|
|
|||
|
|
@ -79,6 +79,87 @@ pub async fn read_serving_snapshot_from_storage(
|
|||
read_snapshot_with_store(backend).await
|
||||
}
|
||||
|
||||
/// Cluster root for a graph **storage URI** of the cluster layout
|
||||
/// (`<root>/graphs/<id>.omni`), if `<root>` is actually a cluster (holds
|
||||
/// `__cluster/state.json`); otherwise `None`. Used by the CLI to refuse
|
||||
/// `init` into a cluster-managed location — graphs there are created by
|
||||
/// `cluster apply`, not `init`.
|
||||
///
|
||||
/// Cheap by construction: a URI that does not match the `<root>/graphs/<id>.omni`
|
||||
/// shape returns `None` without any I/O, so ordinary `init` targets
|
||||
/// (`./kb.omni`, `s3://bucket/kb.omni`) never probe storage. Works for
|
||||
/// `file://` and `s3://` via the storage adapter.
|
||||
pub async fn cluster_root_for_graph_uri(graph_uri: &str) -> Option<String> {
|
||||
let root = cluster_root_of_graph_layout(graph_uri)?;
|
||||
let store = ClusterStore::for_storage_root(&root).ok()?;
|
||||
store
|
||||
.has_state()
|
||||
.await
|
||||
.then(|| store.display_root().to_string())
|
||||
}
|
||||
|
||||
/// Resolve a graph's **storage URI** (`<root>/graphs/<id>.omni`) from a cluster's
|
||||
/// applied state ledger — the lightweight path for storage-plane maintenance
|
||||
/// (`optimize`/`repair`/`cleanup`).
|
||||
///
|
||||
/// Unlike [`read_serving_snapshot`], this deliberately does NOT validate catalog
|
||||
/// payloads or recovery readiness: maintenance only needs the derivable graph
|
||||
/// root, and must not be blocked by an unrelated corrupt policy/query blob or a
|
||||
/// pending recovery sweep — a degraded cluster is exactly when an operator
|
||||
/// reaches for `repair`. It reads the state ledger, confirms the graph is in the
|
||||
/// applied revision, and returns `graph_root(id)`.
|
||||
///
|
||||
/// `cluster` is a config directory or a storage-root URI (`s3://…`, config-free),
|
||||
/// mirroring the server's `--cluster` dispatch.
|
||||
pub async fn resolve_graph_storage_uri(cluster: &str, graph_id: &str) -> Result<String, Diagnostic> {
|
||||
let backend = if cluster.contains("://") {
|
||||
ClusterStore::for_storage_root(cluster)?
|
||||
} else {
|
||||
ClusterStore::for_config_dir(Path::new(cluster))
|
||||
};
|
||||
let mut observations = backend.observations();
|
||||
let snapshot = backend.read_state(&mut observations).await?;
|
||||
let state = snapshot.state.ok_or_else(|| {
|
||||
Diagnostic::error(
|
||||
"cluster_state_missing",
|
||||
CLUSTER_STATE_FILE,
|
||||
format!("cluster `{cluster}` has no applied state; run `cluster apply` first"),
|
||||
)
|
||||
})?;
|
||||
let address = format!("graph.{graph_id}");
|
||||
if !state.applied_revision.resources.contains_key(&address) {
|
||||
let applied: Vec<&str> = state
|
||||
.applied_revision
|
||||
.resources
|
||||
.keys()
|
||||
.filter_map(|a| a.strip_prefix("graph."))
|
||||
.collect();
|
||||
return Err(Diagnostic::error(
|
||||
"graph_not_applied",
|
||||
address,
|
||||
format!(
|
||||
"graph `{graph_id}` is not applied in cluster `{cluster}` (applied graphs: [{}]); \
|
||||
declare it in cluster.yaml and run `cluster apply`, or check the id",
|
||||
applied.join(", ")
|
||||
),
|
||||
));
|
||||
}
|
||||
Ok(backend.graph_root(graph_id))
|
||||
}
|
||||
|
||||
/// Split `<root>/graphs/<id>.omni` → `<root>`, gating on the exact cluster
|
||||
/// graph-layout shape (a single `<id>` segment, no nested path). `None` for
|
||||
/// anything else — no I/O is done for non-cluster-shaped URIs.
|
||||
fn cluster_root_of_graph_layout(graph_uri: &str) -> Option<String> {
|
||||
let trimmed = graph_uri.trim_end_matches('/');
|
||||
let rest = trimmed.strip_suffix(".omni")?;
|
||||
let (root, id) = rest.rsplit_once("/graphs/")?;
|
||||
if root.is_empty() || id.is_empty() || id.contains('/') {
|
||||
return None;
|
||||
}
|
||||
Some(root.to_string())
|
||||
}
|
||||
|
||||
async fn read_snapshot_with_store(
|
||||
backend: ClusterStore,
|
||||
) -> Result<ServingSnapshot, Vec<Diagnostic>> {
|
||||
|
|
@ -186,3 +267,50 @@ async fn read_snapshot_with_store(
|
|||
})
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn graph_layout_gating_does_no_io_for_non_cluster_shapes() {
|
||||
// Only `<root>/graphs/<id>.omni` matches; everything else is None.
|
||||
assert_eq!(
|
||||
cluster_root_of_graph_layout("/data/cluster/graphs/kb.omni").as_deref(),
|
||||
Some("/data/cluster")
|
||||
);
|
||||
assert_eq!(
|
||||
cluster_root_of_graph_layout("s3://bucket/prefix/graphs/kb.omni").as_deref(),
|
||||
Some("s3://bucket/prefix")
|
||||
);
|
||||
assert_eq!(cluster_root_of_graph_layout("./kb.omni"), None);
|
||||
assert_eq!(cluster_root_of_graph_layout("s3://bucket/kb.omni"), None);
|
||||
// nested id under graphs/ is not the cluster layout
|
||||
assert_eq!(cluster_root_of_graph_layout("/c/graphs/a/b.omni"), None);
|
||||
// not a .omni graph
|
||||
assert_eq!(cluster_root_of_graph_layout("/c/graphs/kb"), None);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn cluster_root_detected_only_when_state_ledger_present() {
|
||||
let temp = tempfile::tempdir().unwrap();
|
||||
let root = temp.path();
|
||||
std::fs::create_dir_all(root.join("graphs")).unwrap();
|
||||
let graph_uri = format!("{}/graphs/kb.omni", root.to_string_lossy());
|
||||
|
||||
// No __cluster/state.json yet → not a cluster.
|
||||
assert_eq!(cluster_root_for_graph_uri(&graph_uri).await, None);
|
||||
|
||||
// Lay down the state ledger → now it's a cluster-managed location.
|
||||
std::fs::create_dir_all(root.join("__cluster")).unwrap();
|
||||
std::fs::write(root.join(CLUSTER_STATE_FILE), "{}").unwrap();
|
||||
let detected = cluster_root_for_graph_uri(&graph_uri).await;
|
||||
assert!(detected.is_some(), "expected cluster root to be detected");
|
||||
|
||||
// A non-cluster-shaped target never probes and is always None.
|
||||
assert_eq!(
|
||||
cluster_root_for_graph_uri(&format!("{}/plain.omni", root.to_string_lossy())).await,
|
||||
None
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
|
|
|
|||
|
|
@ -154,6 +154,21 @@ impl ClusterStore {
|
|||
}
|
||||
}
|
||||
|
||||
/// Display-form storage root (plain local path for `file://`, URI for S3).
|
||||
pub(crate) fn display_root(&self) -> &str {
|
||||
&self.display_root
|
||||
}
|
||||
|
||||
/// Whether this root holds the cluster state ledger (`__cluster/state.json`)
|
||||
/// — i.e. is an actual cluster, not just any directory. Probed via the
|
||||
/// adapter (`file://` or `s3://`), failures read as "not a cluster".
|
||||
pub(crate) async fn has_state(&self) -> bool {
|
||||
self.adapter
|
||||
.exists(&self.uri(CLUSTER_STATE_FILE))
|
||||
.await
|
||||
.unwrap_or(false)
|
||||
}
|
||||
|
||||
/// `read_text_versioned`, returning None for a missing object (probed
|
||||
/// via `exists` — the engine error type doesn't discriminate NotFound).
|
||||
async fn read_versioned_opt(&self, uri: &str) -> Result<Option<(String, String)>, String> {
|
||||
|
|
|
|||
|
|
@ -33,15 +33,16 @@ Top-level command families and subcommands. Graph-targeting commands accept a po
|
|||
Every command lives on one **plane**, which determines how it reaches a graph and which addressing flags apply (RFC-010):
|
||||
|
||||
- **Data plane** — `query`, `mutate`, `load`, `ingest`, `branch *`, `snapshot`, `export`, `commit *`, `schema show`, `schema apply` (and `graphs list`, remote-only today). Run against a graph **embedded or via a server**: accept a positional `URI` / `--target` / `--server` (+ `--graph` for multi-graph servers).
|
||||
- - **Storage / maintenance plane**: `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `queries validate`, `lint`. Run with **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI` or `--target`, but **not** `--server` / `--graph`, and a `--target` that resolves to a remote (`http(s)://`) server is rejected. (`init` takes only a positional `URI` today, with no `--target`.)
+ - **Storage / maintenance plane**: `init`, `optimize`, `repair`, `cleanup`, `schema plan`, `queries validate`, `lint`. Run with **direct storage access** (`file://` / `s3://`), never through a server. They accept a positional `URI` or `--target`, but **not** `--server` / `--graph`, and a `--target` that resolves to a remote (`http(s)://`) server is rejected. (`init` takes only a positional `URI` today, with no `--target`.) `optimize` / `repair` / `cleanup` also accept **`--cluster <dir|s3://…> --cluster-graph <id>`**, which resolves the graph's storage URI from the cluster's applied state ledger (so you needn't know the `<storage>/graphs/<id>.omni` layout).
|
||||
- **Control plane** — `cluster *`. Operates on a cluster directory via `--config <dir>`.
|
||||
|
||||
These restrictions are enforced and reported, not silent:
|
||||
|
||||
- - A data-plane addressing flag on a non-data verb fails loudly, e.g.: ``optimize is a storage-plane command; --server/--graph address the data plane and do not apply. Use --target <name> or a storage URI.``
+ - A data-plane addressing flag on a non-data verb fails loudly, e.g.: ``optimize is a storage-plane command; --server/--graph address the data plane and do not apply. Use --target <name>, a storage URI, or --cluster <dir> --cluster-graph <id>.``
|
||||
- A storage-plane verb pointed at a remote target fails loudly, e.g.: ``optimize is a storage-plane command and needs direct storage access; the resolved target is a remote server (https://…). Pass the graph's file:// or s3:// URI.``
|
||||
- `init` into an **established cluster's** storage layout (`<root>/graphs/<id>.omni` where `<root>` holds `__cluster/state.json`) is refused — graphs in a cluster are created by `cluster apply` (which records ledger / recovery / approvals), not `init`.
|
||||
|
||||
- To maintain a server-backed graph, run the maintenance verbs from a host with storage access against the graph's storage URI (or `--target`), out-of-band from the serving process; there are no server routes for `optimize` / `repair` / `cleanup` by design.
+ To maintain a server-backed graph, run the maintenance verbs from a host with storage access against the graph's storage URI (or `--target`, or `--cluster … --cluster-graph …`), out-of-band from the serving process; there are no server routes for `optimize` / `repair` / `cleanup` by design.
|
||||
|
||||
`omnigraph --help` lists commands **clustered by plane** (data → storage → control → session) with a plane legend at the bottom.
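
The addressing rules for `optimize` / `repair` / `cleanup` described above (both `--cluster` and `--cluster-graph` or neither, and no mixing with a positional `URI` / `--target`) can be sketched as a plain function. This is only an illustration under stated assumptions: `Addressing` and `classify` are hypothetical names, and the conflict message is invented here, since in the real CLI clap reports that case itself.

```rust
/// Illustrative type (not part of the commit): how a storage-plane verb
/// ends up addressing its graph.
#[derive(Debug, PartialEq)]
enum Addressing<'a> {
    /// `--cluster <dir|uri> --cluster-graph <id>`
    ClusterGraph { cluster: &'a str, graph_id: &'a str },
    /// Ordinary positional URI and/or `--target`.
    Local { uri: Option<&'a str>, target: Option<&'a str> },
}

/// Both-or-neither check plus exclusion with `uri` / `--target`.
fn classify<'a>(
    uri: Option<&'a str>,
    target: Option<&'a str>,
    cluster: Option<&'a str>,
    cluster_graph: Option<&'a str>,
) -> Result<Addressing<'a>, String> {
    match (cluster, cluster_graph) {
        (Some(cluster), Some(graph_id)) => {
            if uri.is_some() || target.is_some() {
                // Hypothetical wording; clap produces its own message.
                return Err("--cluster conflicts with a positional URI / --target".to_string());
            }
            Ok(Addressing::ClusterGraph { cluster, graph_id })
        }
        (None, None) => Ok(Addressing::Local { uri, target }),
        // Message text taken from the defensive arm in the diff.
        _ => Err("--cluster and --cluster-graph must be given together".to_string()),
    }
}
```

For instance, `classify(None, None, Some("./company-brain"), Some("knowledge"))` yields the `ClusterGraph` variant, while passing `--cluster` alone yields the "must be given together" error.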
|
||||
|
||||
|
|
|
|||
|
|
@ -251,12 +251,28 @@ with an in-flight apply.
|
|||
loads). It just no longer describes the deployment — a server boots from
|
||||
one source or the other, never a merge of both.
|
||||
|
||||
## 7. Maintaining a cluster graph
|
||||
|
||||
Storage maintenance (`optimize` / `repair` / `cleanup`) is **not** a control-plane
|
||||
operation — it runs out-of-band, with direct storage access, against the graph's
|
||||
roots. Address a cluster graph by name instead of hand-typing its storage path:
|
||||
|
||||
```bash
|
||||
omnigraph optimize --cluster ./company-brain --cluster-graph knowledge
|
||||
omnigraph cleanup --cluster ./company-brain --cluster-graph knowledge --keep 10 --confirm
|
||||
# --cluster also takes the storage-root URI directly (config-free):
|
||||
omnigraph optimize --cluster s3://bucket/clusters/company-brain --cluster-graph knowledge
|
||||
```
|
||||
|
||||
The graph's storage URI is resolved from the cluster's **applied state ledger**,
without the full serving-snapshot validation a `--cluster` server performs at
boot, so an unrelated corrupt catalog payload or a pending recovery sweep does
not block maintenance. A graph that hasn't been applied yet is not resolvable.
Run these from a host with storage access; there are no server routes for them.
Conversely, **`init` refuses** a cluster-managed path: graphs in a cluster are
created by `cluster apply`, not by hand.
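
As a rough sketch of what resolving from the ledger alone amounts to: look up `graph.<id>` in the applied revision and derive `<root>/graphs/<id>.omni`, without touching catalog payloads. The types below are assumptions made for illustration (`AppliedRevision`, `resolve_from_ledger`, and the sample addresses are hypothetical); the real function reads the ledger through the storage adapter and returns a `Diagnostic`.

```rust
use std::collections::BTreeMap;

/// Hypothetical, pared-down view of the applied revision in the state ledger:
/// resource address (for example `graph.knowledge`) mapped to a digest.
struct AppliedRevision {
    resources: BTreeMap<String, String>,
}

/// Ledger-only resolution: confirm the graph is applied, then derive its root.
/// No query or policy payload is read or verified here.
fn resolve_from_ledger(
    root: &str,
    applied: &AppliedRevision,
    graph_id: &str,
) -> Result<String, String> {
    let address = format!("graph.{graph_id}");
    if !applied.resources.contains_key(&address) {
        // List the graphs that are applied, to help spot a mistyped id.
        let known: Vec<&str> = applied
            .resources
            .keys()
            .filter_map(|a| a.strip_prefix("graph."))
            .collect();
        return Err(format!(
            "graph `{graph_id}` is not applied in cluster `{root}` (applied graphs: [{}])",
            known.join(", ")
        ));
    }
    Ok(format!("{}/graphs/{graph_id}.omni", root.trim_end_matches('/')))
}

/// Sample ledger contents; the non-graph address is made up for the example.
fn sample_revision() -> AppliedRevision {
    let mut resources = BTreeMap::new();
    resources.insert("graph.knowledge".to_string(), "digest-a".to_string());
    resources.insert("policy.default".to_string(), "digest-b".to_string());
    AppliedRevision { resources }
}
```

With the sample revision, resolving `knowledge` under `s3://bucket/clusters/company-brain` gives the derived `.../graphs/knowledge.omni` URI, and an unknown id produces an error that names the applied graphs.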
|
||||
|
||||
## What the control plane does not do (yet)
|
||||
|
||||
- **No hot reload** — applied changes serve on the next restart.
|
||||
- - **No S3-hosted cluster directories**: the config dir, ledger, catalog,
-   and derived graph roots are local-filesystem paths today. (Individual
-   *graphs* on S3 are a server feature outside cluster mode.)
|
||||
- **No data operations** — rows move through `omnigraph load / ingest /
|
||||
mutate` against the graph roots, with branches and merges as usual.
|
||||
- **Stored-query exposure is all-or-nothing per cluster** — every applied
|
||||
|
|
|
|||
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
`db/omnigraph/optimize.rs` and `db/omnigraph/repair.rs`.
|
||||
|
||||
- **Addressing (RFC-010).** `optimize`, `repair`, and `cleanup` are **storage-plane** CLI commands: they run with direct storage access against a positional `URI` or `--target`, never through a server, and reject `--server` / `--graph` or a `--target` that resolves to a remote (`http(s)://`) URL with a declared error. There are no server routes for them by design; to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command planes* section of [cli-reference.md](cli-reference.md).
+ **Addressing (RFC-010).** `optimize`, `repair`, and `cleanup` are **storage-plane** CLI commands: they run with direct storage access against a positional `URI`, `--target`, or **`--cluster <dir|s3://…> --cluster-graph <id>`** (which resolves the graph's storage URI from the cluster's applied state ledger, so you needn't know the `<storage>/graphs/<id>.omni` layout). They never run through a server, and reject `--server` / `--graph` or a `--target` that resolves to a remote (`http(s)://`) URL with a declared error. There are no server routes for them by design; to maintain a server-backed graph, run them out-of-band against the graph's storage URI. See the *Command planes* section of [cli-reference.md](cli-reference.md).
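
The "remote target is rejected" rule can be pictured as a small guard. This is a minimal sketch, assuming a simple scheme-prefix check; `ensure_direct_storage` is a hypothetical name, and the message text follows the example quoted in cli-reference.md rather than the CLI's actual implementation.

```rust
/// Reject a resolved target that points at a server instead of storage.
/// Assumption: anything starting with http:// or https:// counts as remote;
/// local paths, file:// and s3:// URIs pass through.
fn ensure_direct_storage(verb: &str, resolved: &str) -> Result<(), String> {
    let is_remote = resolved.starts_with("http://") || resolved.starts_with("https://");
    if is_remote {
        return Err(format!(
            "`{verb}` is a storage-plane command and needs direct storage access; \
             the resolved target is a remote server ({resolved}). \
             Pass the graph's file:// or s3:// URI."
        ));
    }
    Ok(())
}
```

Under this assumption, `ensure_direct_storage("optimize", "s3://bucket/kb.omni")` succeeds, while an `https://` target returns the storage-plane error.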
|
||||
|
||||
## `optimize_all_tables(db)` — non-destructive
|
||||
|
||||
|
|
|
|||