Commit graph

488 commits

Author SHA1 Message Date
Andrew Altshuler
867138499e
Merge pull request #200 from ModernRelay/feat/no-legacy-config-strict
feat(config): OMNIGRAPH_NO_LEGACY_CONFIG strict mode (RFC-008 stage 4)
2026-06-12 00:15:20 +03:00
aaltshuler
4c50170c77 feat(config): OMNIGRAPH_NO_LEGACY_CONFIG strict mode (RFC-008 stage 4)
Opt-in: with the env set, loading a legacy omnigraph.yaml is a hard
error pointing at config migrate — the regression guard for migrated
teams (a stray legacy file would otherwise silently outrank operator
config during the window) and the rehearsal for stage 5's removal.
Strict refuses the FILE, never its absence: flag-less invocations on
migrated setups are untouched. Inert unless set.

The RFC's stages-1-3-then-4 release gap collapsed honestly: no version
boundary was crossed between them, so all four ship in the same release
(noted in the RFC).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 00:03:10 +03:00
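
The strict-mode rule in the commit above ("refuses the FILE, never its absence") can be sketched as a pure decision function. This is a minimal illustration, not the repository's code: the `LegacyCheck` type and `check_legacy_config` function are hypothetical names, and treating any set value of the variable as "on" is an assumption.

```rust
/// Outcome of the strict-mode check (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
pub enum LegacyCheck {
    /// Loading may continue (with or without a legacy file).
    Proceed,
    /// Strict mode is on and a legacy file was found: hard error.
    Refuse(String),
}

/// `strict_env` is the value of OMNIGRAPH_NO_LEGACY_CONFIG, if set.
/// `legacy_file_found` says whether a legacy omnigraph.yaml would be loaded.
pub fn check_legacy_config(strict_env: Option<&str>, legacy_file_found: bool) -> LegacyCheck {
    // Assumption: any set value enables strict mode ("with the env set").
    let strict = strict_env.is_some();
    if strict && legacy_file_found {
        // The error points at the migration tool, as the commit describes.
        LegacyCheck::Refuse(
            "legacy omnigraph.yaml refused in strict mode; run `omnigraph config migrate`"
                .to_string(),
        )
    } else {
        // Inert unless set, and an absent file is never an error.
        LegacyCheck::Proceed
    }
}
```

Only the combination "strict and file present" is refused, so a migrated setup with no legacy file behaves the same with or without the variable.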
Andrew Altshuler
108d2defa6
Merge pull request #199 from ModernRelay/feat/yaml-deprecation-stages
feat(cli,config): RFC-008 stages 1–3 — deprecate omnigraph.yaml (warnings, config migrate, scaffold flip)
2026-06-11 23:55:16 +03:00
aaltshuler
5328c91341 refactor(cli): drop cluster init — no replacement scaffold
Andrew's call, and the right one by the repo's own lens: a minimal
cluster.yaml is five lines; a generator is a second copy of the schema to
keep in sync forever, emitting a file that is unusable until hand-edited
anyway (graphs: {} cannot apply or serve). Terraform has no config
scaffolder either. New users copy from the cluster quick-start; migrants
get a ready-to-review cluster.yaml from config migrate. RFC-008 stage 3
becomes purely subtractive.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 23:45:18 +03:00
aaltshuler
3adbc65af2 docs(cli): config migrate, cluster init, the legacy-file deprecation notice
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 23:37:12 +03:00
aaltshuler
5ba9656666 feat(cli): init stops scaffolding omnigraph.yaml; cluster init replaces it (RFC-008 stage 3)
omnigraph init no longer writes a legacy config into cwd (the source of
the earlier test-pollution bug, and a scaffold for a deprecated file);
the scaffolder is deleted. omnigraph cluster init scaffolds the
replacement: a minimal valid cluster.yaml (version: 1, optional
metadata.name / storage:, a commented graphs example), refusing to
overwrite. The scaffold validates clean via cluster validate in the e2e.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 23:34:04 +03:00
aaltshuler
cd1f175396 feat(cli): omnigraph config migrate — the RFC-008 split (stage 2)
Reads a legacy omnigraph.yaml and produces the three-section split: team
half as a ready-to-review cluster.yaml proposal (graphs with TODO schema
pointers — the legacy file never knew schemas — per-graph queries
directories, policies with applies_to bindings), personal half as an
operator-config merge (actor, output/table defaults — OperatorDefaults
gains the two table keys with their cascade hops — remote graphs with
bearer_token_env become servers entries plus a printed login step, and
legacy aliases split per the RFC: content to the catalog as a manual
step, binding to an operator alias), plus a dropped-keys section with
reasons. Touches nothing without --write; with it, the operator merge is
key-level (existing entries always win; prior file backed up), and
cluster.yaml is emitted only when absent (else cluster.yaml.proposed).
--json emits the report structurally.

The completeness contract is a unit test: every top-level key of the
legacy schema must classify somewhere, or the RFC-008 map has a bug.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 23:32:05 +03:00
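
The key-level operator merge described above ("existing entries always win") can be illustrated with a small sketch. It assumes a flat string-to-string map for simplicity; the real operator config is presumably nested YAML, and `merge_existing_wins` is a hypothetical name.

```rust
use std::collections::BTreeMap;

/// Merge proposed keys into an existing config at key level.
/// A key already present in `existing` is never overwritten.
pub fn merge_existing_wins(
    existing: &BTreeMap<String, String>,
    proposed: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut merged = existing.clone();
    for (key, value) in proposed {
        // `or_insert_with` only writes when the key is absent.
        merged.entry(key.clone()).or_insert_with(|| value.clone());
    }
    merged
}
```

Because the merge only ever adds missing keys, rerunning it with the same proposal is a no-op.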
aaltshuler
c89d268b23 feat(config): per-key deprecation warnings on legacy omnigraph.yaml load (RFC-008 stage 1)
Loading a legacy file (flag, env, or cwd-found — never on defaults) emits
one stderr block listing each key actually present with its destination
from RFC-008's migration map — the map applied to YOUR file, not a
generic banner. Once per process; both binaries warn (cluster-mode boots
never reach load_config, silent by construction); suppressible via
OMNIGRAPH_SUPPRESS_YAML_DEPRECATION=1 for CI logs during the window.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 23:28:33 +03:00
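
A rough sketch of the warning shape described above: list only the keys actually present, each with its destination, and emit the block at most once per process. The function names, the banner wording, and the example destinations are assumptions made for illustration; only the once-per-process behavior and the suppression switch come from the commit message.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Process-wide flag: has the deprecation block already been printed?
static WARNED: AtomicBool = AtomicBool::new(false);

/// Build the warning block for the keys actually present in the legacy file.
/// `map` pairs each legacy key with its destination (hypothetical entries).
/// Returns None when suppressed or when there is nothing to report.
pub fn deprecation_block(present: &[&str], map: &[(&str, &str)], suppressed: bool) -> Option<String> {
    if suppressed || present.is_empty() {
        return None;
    }
    let mut out = String::from("warning: omnigraph.yaml is deprecated (RFC-008)\n");
    for key in present {
        let dest = map
            .iter()
            .find(|(k, _)| k == key)
            .map(|(_, d)| *d)
            .unwrap_or("unmapped");
        out.push_str(&format!("  {key} -> {dest}\n"));
    }
    Some(out)
}

/// Print the block to stderr the first time only; returns whether it printed.
pub fn warn_once(block: &str) -> bool {
    if WARNED.swap(true, Ordering::SeqCst) {
        return false;
    }
    eprint!("{block}");
    true
}
```

A caller would compute `suppressed` from `OMNIGRAPH_SUPPRESS_YAML_DEPRECATION` and skip the whole path when no legacy file was loaded.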
Andrew Altshuler
588b0c1b6c
Merge pull request #198 from ModernRelay/feat/operator-targeting
feat(cli): operator targeting — --server + aliases as pure bindings (RFC-007 PR 3)
2026-06-11 22:54:50 +03:00
aaltshuler
20ddfc61c1 fix(cli): reclaim the hidden legacy-uri positional for operator aliases
Caught on the live smoke: with --alias, the first bare CLI arg lands in
the hidden legacy_uri positional, so an operator alias's positional param
never bound ('parameter not provided' from the server). An operator alias
always knows its target, so the existing normalize_legacy_alias_uri
reclaims the swallowed positional as the first alias arg — same rule the
legacy path already applies.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 22:29:57 +03:00
aaltshuler
dc91c55970 feat(cli): operator aliases — pure bindings invoking stored queries (RFC-007 PR 3, part 2)
aliases: in the operator config bind a personal name to (server, graph,
stored-query NAME, positional arg mapping, fixed param defaults, format)
— zero content, per the ratified bindings-not-content model. Invocation
goes through the server's stored-query endpoint (POST
{base}/graphs/{g}/queries/{name}) with the keyed credential resolving via
the ordinary URL match; param precedence --params > positionals > fixed
defaults; the result renders through the existing format cascade with the
alias's format as its hop. A legacy omnigraph.yaml alias with the same
name wins during the RFC-008 window, with a warning naming both.

E2e (spawned policy-gated server, invoke_query granted via a per-graph
bundle): the alias invokes with name + one positional and nothing else —
server, graph, query, and token all from the operator layer; --server/
--graph explicit targeting; unknown --server lists defined names;
--server exclusive with a positional URI.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 22:25:42 +03:00
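
The parameter precedence stated above (`--params` > positionals > fixed defaults) amounts to layering three maps from weakest to strongest. A minimal sketch, assuming string-valued parameters; `resolve_params` and its argument names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Resolve the parameters sent to the stored query.
/// Later layers overwrite earlier ones, so the order of the three steps
/// encodes the precedence: fixed defaults < positionals < --params.
pub fn resolve_params(
    fixed_defaults: &BTreeMap<String, String>,
    positional_names: &[&str],
    positionals: &[&str],
    explicit_params: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    // 1. Weakest: the alias's fixed defaults.
    let mut out = fixed_defaults.clone();
    // 2. Positional CLI args, bound to names by the alias's arg mapping.
    for (name, value) in positional_names.iter().zip(positionals) {
        out.insert(name.to_string(), value.to_string());
    }
    // 3. Strongest: explicit --params.
    for (key, value) in explicit_params {
        out.insert(key.clone(), value.clone());
    }
    out
}
```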
aaltshuler
2b33ab64f2 feat(cli): --server <name> targeting (RFC-007 PR 3, part 1)
Global flags --server (operator-defined server name) and --graph (graph id
on a multi-graph server, requires --server) resolve to the effective
remote URI through one helper and feed the ordinary uri slot — graph
resolution and the PR-2 keyed-token URL match work unchanged; the flag is
sugar for a URI the operator already owns. Exclusive with a positional
URI and --target (loud error, never silent precedence). Unknown names
fail listing the servers that ARE defined.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 22:19:25 +03:00
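
The targeting rules above can be sketched as one resolution helper. This is an illustration under assumptions, not the actual helper: `resolve_target` is a hypothetical name, the `--target` flag is left out, and the `{base}/graphs/{g}` URI shape is inferred from the stored-query endpoint quoted elsewhere in this log.

```rust
use std::collections::BTreeMap;

/// Resolve --server / --graph / positional URI into the effective remote URI.
/// Conflicts are loud errors, never silent precedence.
pub fn resolve_target(
    servers: &BTreeMap<String, String>,
    server: Option<&str>,
    graph: Option<&str>,
    positional_uri: Option<&str>,
) -> Result<Option<String>, String> {
    match (server, graph, positional_uri) {
        (Some(_), _, Some(_)) => {
            Err("--server cannot be combined with a positional URI".into())
        }
        (None, Some(_), _) => Err("--graph requires --server".into()),
        // No operator targeting: the positional URI (if any) passes through.
        (None, None, uri) => Ok(uri.map(str::to_string)),
        (Some(name), graph, None) => {
            let base = servers.get(name).ok_or_else(|| {
                // Unknown names fail listing the servers that are defined.
                let known: Vec<&str> = servers.keys().map(String::as_str).collect();
                format!("unknown server '{name}'; defined servers: {}", known.join(", "))
            })?;
            let base = base.trim_end_matches('/');
            Ok(Some(match graph {
                // Assumed URI shape for a graph on a multi-graph server.
                Some(g) => format!("{base}/graphs/{g}"),
                None => base.to_string(),
            }))
        }
    }
}
```

The result feeds the ordinary URI slot, so everything downstream sees a plain URI and needs no knowledge of the flags.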
aaltshuler
65160cc060 docs(rfc): aliases are bindings, not content — the ratified alias model
RFC-007 §D2 gains the model the alias design reasoned through: stored
queries are content + its canonical team-owned name; legacy
omnigraph.yaml aliases conflate a personal name with a local-file content
pointer (the muddle RFC-008 retires); operator aliases are pure bindings
(server, graph, stored-query NAME, arg mapping, defaults) — an alias that
carries content competes with the catalog, one that references a name
composes with it. The three senses of 'global' are resolved explicitly:
cross-graph globality is strengthened (one $HOME file vs per-directory),
team-shared shorthand is deliberately NOT an alias mechanism (the shared
name IS the catalog name), cross-machine follows the dotfile. Collision
rule: legacy wins during the RFC-008 window, with a warning.

RFC-008's migration row for aliases sharpens accordingly: a legacy alias
splits — content to the catalog (via cluster apply), binding to the
operator layer; config migrate proposes both halves.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 22:15:19 +03:00
Andrew Altshuler
b6ebe6cbe5
Merge pull request #197 from ModernRelay/feat/operator-keyed-credentials
feat(cli): keyed credentials — servers:, the token chain, omnigraph login (RFC-007 PR 2)
2026-06-11 21:44:58 +03:00
aaltshuler
a819ab500e feat(cli): keyed credentials — servers:, the token chain, login/logout (RFC-007 PR 2)
The operator config gains servers: (name -> url; never a token). A remote
command whose URL prefix-matches an operator server resolves its bearer
token through the keyed chain first — OMNIGRAPH_TOKEN_<NAME> env, then the
[<name>] section of ~/.omnigraph/credentials (created 0600 via temp+rename,
#139 finding 7; group/world-readable files refused loudly) — falling
through to the legacy chain unchanged. URL keying makes §D5 rule 3
structural: a token is only ever sent to the server it is keyed to.
Longest-prefix matching with a path-boundary check (http://h:8080 never
matches http://h:8080-evil). Inserting the keyed hop above the legacy chain
is safe by construction — no existing setup can have servers: defined.

omnigraph login <name> stores/rotates one section (token from --token or
one stdin line — the pipe flow keeps secrets out of shell history);
omnigraph logout removes it, idempotently; logging in before declaring the
server warns instead of failing (the gh model).

Coverage: URL-match/no-substring-trap, credentials round-trip preserving
sibling sections, 0600 write + over-permissive refusal, env-name mapping;
the legacy resolve test is now hermetic against a real ~/.omnigraph and
asserts byte-identical legacy behavior with no servers defined; one
spawned-binary e2e walks the whole lifecycle against an authed server:
refusal -> wrong-token login (stdin) -> rotate (--token) -> authorized read
-> env-beats-file -> non-matching-URL negative -> logout revokes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 21:24:51 +03:00
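
The longest-prefix match with a path-boundary check can be sketched as follows. This is a minimal illustration rather than the repository's implementation: `match_server` and `is_prefix_at_boundary` are hypothetical names, and the choice of boundary characters (`/`, `?`, `#`, or end of string) is an assumption.

```rust
/// Return the name of the operator server whose URL is the longest
/// boundary-respecting prefix of `url`, if any.
/// `servers` holds (name, base_url) pairs.
pub fn match_server<'a>(servers: &'a [(&'a str, &'a str)], url: &str) -> Option<&'a str> {
    servers
        .iter()
        .filter(|(_, base)| is_prefix_at_boundary(base, url))
        // Longest prefix wins when several servers match.
        .max_by_key(|(_, base)| base.trim_end_matches('/').len())
        .map(|(name, _)| *name)
}

/// True when `base` is a prefix of `url` and the match ends on a boundary,
/// so "http://h:8080" never matches "http://h:8080-evil".
fn is_prefix_at_boundary(base: &str, url: &str) -> bool {
    let base = base.trim_end_matches('/');
    if base.is_empty() {
        return false;
    }
    match url.strip_prefix(base) {
        Some(rest) => {
            rest.is_empty()
                || rest.starts_with('/')
                || rest.starts_with('?')
                || rest.starts_with('#')
        }
        None => false,
    }
}
```

A plain `starts_with` check would accept `http://h:8080-evil` as a match for `http://h:8080` and send the token to the wrong host, which is the substring trap the boundary check closes.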
Andrew Altshuler
5db42fb660
Merge pull request #196 from ModernRelay/feat/operator-config-identity
feat(cli): operator config surface — identity + output defaults (RFC-007 PR 1)
2026-06-11 21:01:58 +03:00
Andrew Altshuler
d5d703fccc
Merge pull request #195 from ModernRelay/rfc/operator-config
docs(rfc): RFC-007 + RFC-008 — the config architecture pair (operator layer; deprecate omnigraph.yaml)
2026-06-11 21:01:54 +03:00
aaltshuler
9427fb510e docs(cli): the two config surfaces + the operator file reference
cli-reference.md gains the config-surfaces table (cluster / operator /
flags-env, with omnigraph.yaml marked as the legacy combined file per
RFC-008) and the operator config.yaml reference; audit.md documents the
unified actor chain.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 20:32:04 +03:00
aaltshuler
be4bd46212 feat(cli): the operator config surface — identity and output defaults (RFC-007 PR 1)
~/.omnigraph/config.yaml joins the resolution chains as the operator
surface: operator.actor becomes the last hop of THE actor chain (--as >
legacy cli.actor during the RFC-008 window > operator.actor > none, one
implementation for direct-engine and cluster commands alike) and
defaults.output joins the read-format cascade below every more-specific
source. Discovery honors $OMNIGRAPH_HOME (tilde-expanded, #139 finding 9);
an absent file is an empty layer; unknown keys WARN and load (a file
written for later slices must not break this CLI); malformed YAML is a
loud error. The module is CLI-only — the server never reads operator
config (invariant 11 by construction).

$OMNIGRAPH_CONFIG becomes a first-class stand-in for --config in
load_config (flag > env > ./omnigraph.yaml), one meaning in both binaries.

The test harness pins hermeticity: spawned binaries get a nonexistent
OMNIGRAPH_HOME by default so no test ever reads the developer's real
operator config. New coverage: loader unit tests, the env-precedence
matrix on load_config_in, and spawned-binary e2es for the actor chain
(operator wins with no flag/legacy key; legacy outranks it; --as wins) and
the format cascade.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 20:29:02 +03:00
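
The actor chain above (`--as` > legacy `cli.actor` > `operator.actor` > none) reduces to a first-present-wins lookup. A minimal sketch with a hypothetical `resolve_actor` function, assuming each source has already been read into an `Option`:

```rust
/// Resolve the effective actor. The first source that is present wins:
/// --as flag, then legacy cli.actor (RFC-008 window), then operator.actor.
pub fn resolve_actor<'a>(
    as_flag: Option<&'a str>,
    legacy_cli_actor: Option<&'a str>,
    operator_actor: Option<&'a str>,
) -> Option<&'a str> {
    as_flag.or(legacy_cli_actor).or(operator_actor)
}
```

Keeping the chain in one function matches the commit's point that direct-engine and cluster commands share a single implementation.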
aaltshuler
08ce8dc34d docs(rfc): align RFC-007 with RFC-008's two-surface architecture
RFC-007 now speaks the end-state language throughout: the operator surface
is one half of the two-surface split (cluster config / operator config),
not a layer over a living omnigraph.yaml. The precedence cascade drops the
project layer (cluster config carries no operator-resolvable keys — a
checkout can never supply identity); legacy omnigraph.yaml appears only as
the RFC-008 deprecation-window slot. The trust boundary is restated as
closed-by-construction in the end state, with the rules governing the
window. PR 3 becomes operator targeting (--server + operator aliases — the
replacement RFC-008 needs before legacy aliases migrate), and the schema
example gains the aliases block.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 19:54:34 +03:00
aaltshuler
320311e759 docs(rfc): RFC-008 — deprecate omnigraph.yaml, one concern per config surface
The file is three unrelated concerns wearing one filename — server
deployment config, project/CLI conveniences, operator identity — and the
mixture is the root cause of a recurring problem class (per-operator
copies of project files, checkout-supplied credential redirection, init
scaffold pollution). End state: two single-owner surfaces — cluster
config (team, repo) and operator config (person, $HOME) — plus the
zero-config flags/env tier.

Complete key-by-key migration map over the verified OmnigraphConfig
surface; staged retirement per the repo's Hyrum rules (warn with per-key
guidance -> `config migrate` tool -> stop scaffolding -> opt-in strict ->
removal at the next major). RFC-007's project-layer framing is amended to
transitional accordingly.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 19:33:19 +03:00
aaltshuler
d531f60999 docs(rfc): RFC-007 — per-operator config, the operator slice of RFC-002
Terraform-style operator/project split: ~/.omnigraph/config.yaml for
identity (operator.actor in the --as cascade), credentials keyed by
server name (env -> 0600 credentials file; no inline secrets), and
operator-owned named servers that project configs reference but cannot
redefine. Explicitly a staged subset of RFC-002: adopts its settled
decisions (one dir, keyed credentials, env precedence), defers
GraphLocator/use/state-layer, and encodes the ten confirmed PR #139
findings as design rules (compat shims, key-level merges, atomic writes,
the project-layer trust boundary).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 18:29:55 +03:00
Andrew Altshuler
29dd827208
Merge pull request #194 from ModernRelay/feat/cluster-bucket-serving
feat(cluster,server): bucket-backed serving — config-free --cluster s3:// + RustFS e2e (RFC-006 PR 3/3)
2026-06-11 17:03:14 +03:00
aaltshuler
8d7aed065f test(cluster,server): gated object-storage cluster e2e + CI wiring + docs
s3_cluster.rs runs the full control-plane lifecycle against a real
bucket (CI: containerized RustFS; locally the RustFS binary): import →
lock released (pins the drop-time release regression caught on the first
live smoke) → apply (graph roots + catalog on the bucket, nothing local)
→ serving snapshots from both the config dir and the bare URI → schema
evolution → approved delete (prefix removal) → empty-cluster refusal.
The server suite gains the config-free boot test: --cluster s3://… with
zero local files serves a stored query over HTTP.

CI: the rustfs job runs both suites; the classify filter covers the
cluster store/serve modules and the new test files. The server smoke
drops its name filter — every test in the s3 target is bucket-gated, and
a filter matching nothing passes vacuously (which silently ran zero
tests for a while).

Docs: deployment.md gains the Bucket-no-volume shape as the preferred
cloud deployment; cluster.md/server.md document --cluster <uri>;
testing.md maps the new suite.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:56:40 +03:00
aaltshuler
58855c0a7c feat(cluster,server): inline policy content + config-free --cluster URI boot
Two serving changes that complete RFC-006's read side:

ServingPolicy carries the policy bundle CONTENT (digest-verified at
snapshot read) instead of a blob path — the catalog may live on object
storage, and the server must not re-read mutable state after the
snapshot. The server grows a PolicySource enum: File for omnigraph.yaml
deployments (unchanged), Inline for cluster boots, wired through
PolicyEngine::load_{graph,server}_from_source.

read_serving_snapshot_from_storage(uri) reads the applied revision
straight from a storage root, and --cluster accepts a scheme-qualified
URI (s3://bucket/prefix): config-free serving — a serving box needs only
the URI and credentials; the ledger and catalog on the bucket ARE the
deployment artifact. Bare paths keep the config-directory behavior.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:56:22 +03:00
Andrew Altshuler
7af3697397
Merge pull request #193 from ModernRelay/refactor/cli-modularize
refactor(cli): modularize main.rs and the test monolith — pure code movement
2026-06-11 15:37:28 +03:00
Andrew Altshuler
c116a12fc9
Merge pull request #192 from ModernRelay/refactor/server-modularize
refactor(server): modularize the test monolith and lib.rs — pure code movement
2026-06-11 15:37:23 +03:00
aaltshuler
4a3f8e3a96 ci: point the RustFS server smoke at the renamed s3 test target
The test-split renamed tests/server.rs away; the job now targets --test
s3. Also fixes a stale name filter (s3_repo vs the actual s3_graph test):
a substring filter matching nothing passes vacuously, so this step had
been running zero tests.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:21:44 +03:00
aaltshuler
d5e75df272 refactor(cli): split the test monolith into command-area suites
tests/cli.rs (4,548 lines, 112 tests) becomes five area files —
cli_cluster (24), cli_cluster_e2e (10, the spawned-binary lifecycle
compositions), cli_data (49), cli_schema_config (16), cli_queries (13) —
with the file-local helpers joining the existing tests/support harness.
Verbatim moves + visibility bumps; 161 crate tests green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:16:51 +03:00
aaltshuler
916015c416 refactor(cli): split main.rs into cli/helpers/output modules
Verbatim moves: the clap surface (every command/subcommand/arg struct) to
cli.rs, resolution helpers (config/actor/graph/branch/query, remote HTTP,
env/token, scaffolding) to helpers.rs, human/JSON formatting to output.rs,
the in-source test mod to main_tests.rs via #[path]. main.rs (1,184 lines)
keeps main() and the dispatch match. Visibility bumps only; 22 binary
tests green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:14:27 +03:00
aaltshuler
127440d873 refactor(server): split lib.rs into handlers and settings modules
Verbatim moves: route handlers + bearer-auth middleware + per-request
authorization + the cluster-prefix OpenAPI rewrite go to handlers.rs;
settings resolution (omnigraph.yaml/CLI/env, mode inference, bearer-token
sources, runtime-state classification) and its in-source test mod go to
settings.rs. lib.rs (1,158 lines) keeps the public types, app/router
assembly, and serve(). The ApiDoc derive references handlers::-qualified
paths; the one multi-line utoipa attribute the cut orphaned was relocated
with its handler. 289 crate tests green, OpenAPI drift check included.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:08:25 +03:00
aaltshuler
b036073ec6 refactor(server): split the test monolith into area suites
tests/server.rs (6,517 lines, 110 tests) becomes seven area files —
auth_policy, data_routes, schema_routes, stored_queries, multi_graph,
boot_settings, s3 — with shared helpers in tests/support/mod.rs. Verbatim
moves + visibility bumps (pub on helpers, pub(super)->pub inside the
matrix harness); cargo fix stripped the per-file unused imports. All 110
tests pass in their new homes (289 across the crate including lib and
openapi).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 15:03:51 +03:00
Andrew Altshuler
4e526b3e5a
Merge pull request #190 from ModernRelay/feat/cluster-storage-root-v2
feat(cluster): storage root — ledger, catalog, and graphs on the StorageAdapter (RFC-006 PR 2/3)
2026-06-11 14:54:10 +03:00
aaltshuler
f6ae3e4fa3 fix(cluster): lock release must complete before a CLI process exits
Caught by the first live s3 smoke: StateLockGuard's spawned async delete
dies with the runtime when a short-lived CLI process exits right after the
command — import's lock survived into the next command as state_lock_held.
On the multi-thread runtime (the CLI, and the gated s3 tests)
block_in_place waits for the delete to complete; current-thread runtimes
keep the spawn fallback with force-unlock as the documented recovery, same
as a crash.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:33:26 +03:00
aaltshuler
8dc2f15255 feat(cluster): the storage: root — state, catalog, and graph roots relocatable
cluster.yaml gains an optional storage: URI deciding where everything the
cluster STORES lives: the state ledger, lock, content-addressed catalog,
recovery sidecars, approval artifacts, and the derived graph roots
(<storage>/graphs/<id>.omni). Absent, it defaults to the config directory
itself — the original layout, byte-compatible, so pre-existing clusters and
the whole test suite are untouched. Declared configuration always stays in
the working tree (Terraform's config-local/state-remote split); credentials
are env-only, never in cluster.yaml.

Every command resolves its store from the declared root (a bad root is a
loud invalid_storage_root). Graph-root derivation, the delete executor
(prefix delete via the adapter), the sweep's existence probes, the catalog
payload write/verify/read paths, and the serving snapshot all flow through
ClusterStore — the last raw-fs holdouts for stored state are gone, and the
deny-list gains the rule that keeps it that way.

Tests: default-layout byte-compat, a file:// root relocating the entire
cluster (ledger+catalog+graphs under the new root, nothing under the config
dir, serving snapshot follows), invalid-root validation. 98 in-crate + 9
failpoints + full workspace gate green. The s3:// flavor lands with PR 3's
gated RustFS e2e.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:28:04 +03:00
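
Graph-root derivation from the optional `storage:` root, as described above, might look like this. The `graph_root` function is a hypothetical illustration that works on plain strings; the real code goes through `ClusterStore` and validates the root.

```rust
/// Derive a graph root as <storage>/graphs/<id>.omni.
/// When `storage` is absent, the config directory itself is the root,
/// which reproduces the original layout.
pub fn graph_root(storage: Option<&str>, config_dir: &str, graph_id: &str) -> String {
    // Trim trailing slashes so the joined path has exactly one separator.
    let root = storage.unwrap_or(config_dir).trim_end_matches('/');
    format!("{root}/graphs/{graph_id}.omni")
}
```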
aaltshuler
fd002abaa5 feat(cluster): port the storage backend to the engine StorageAdapter
LocalStateBackend becomes ClusterStore: every stored byte — state ledger,
lock, recovery sidecars, approval artifacts — now flows through the
engine's StorageAdapter, making file:// and s3:// one code path. Behavior
on the file backend is byte-compatible (layout, CAS semantics, diagnostics,
lock release timing) and the entire pre-existing suite passes unchanged.

Mechanics: the ledger CAS keeps its public sha256 vocabulary while the
physical swap is token-conditioned (ETag If-Match on S3 via PR #186's
primitives; content-token + temp/rename locally — the pre-port semantics);
the lock is a create-only put (genuinely cross-machine on object stores)
with deterministic drop-release locally and best-effort spawned release on
S3; sidecars/approvals address by URI (SweepOutcome and the executors carry
strings); sweep row-1 retirement joins the uniform deferred post-CAS
cleanup. ClusterStore also gains the catalog-payload and graph-root
methods that commit 2 wires in.

Async ripple: status/force-unlock/serving-snapshot and the server's
settings loader chain go async (CLI dispatch and ~20 test hosts follow,
mechanically). tokio joins the cluster crate's runtime deps for the lock
guard's handle.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 14:11:14 +03:00
Andrew Altshuler
2f58fc47fa
Merge pull request #188 from ModernRelay/refactor/cluster-modularize
refactor(cluster): modularize lib.rs — pure code movement
2026-06-11 12:07:32 +03:00
Andrew Altshuler
e19c095e8c
Merge pull request #189 from ModernRelay/ci/test-workspace-timeout
ci: raise Test Workspace timeout to 75 minutes
2026-06-11 11:39:13 +03:00
aaltshuler
7f32e6f1bc ci: raise Test Workspace timeout to 75 minutes
A cold rust-cache (every Cargo.lock change) means a full workspace +
failpoints-feature build on the 2-core runner, which now exceeds 45
minutes on slow runner days — and because a timed-out run never saves its
cache, an undersized budget self-perpetuates: every retry starts cold and
dies identically (observed four consecutive 45-minute cancellations on
main and PR #188 after #186's lock bump). Warm-cache runs stay ~15
minutes; 75 is headroom matching the rustfs job's budget, not a target.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 10:38:54 +03:00
aaltshuler
db6fe03be1 refactor(cluster): move type definitions to types.rs
Verbatim move of the public output/diagnostic types and the internal
state/sidecar/approval models; previously-private types and their fields
get pub(crate) (they were crate-visible by position before). lib.rs is now
the command pipeline + public API. 95 tests green; full workspace gate
green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:42:02 +03:00
aaltshuler
dc0a1fc5a5 refactor(cluster): move declared-config loading to config.rs
Verbatim move of cluster.yaml parsing, query discovery, source digesting,
header/id validation, path resolution, and live-graph observation. Two
helpers that the cut swept along were relocated to their right homes
(state-status helpers back to lib.rs, lock-file helpers to store.rs). 95
tests green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:37:20 +03:00
aaltshuler
dd17c0c50f refactor(cluster): move diffing and classification to diff.rs
Verbatim move of diff_resources, binding-change diffing, blast radius,
approval gating, ResourceKind, classify_changes, and demotion. 95 tests
green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:33:13 +03:00
aaltshuler
9c3e09e838 refactor(cluster): move the recovery sweep to sweep.rs
Verbatim move of the sidecar classification (all RFC-004 D3 rows),
tombstoning, and approval-consumption helpers. 95 tests green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:30:55 +03:00
aaltshuler
00fc5cf537 refactor(cluster): move the serving snapshot to serve.rs
Verbatim move of the Serving* types, read_serving_snapshot, and
read_verified_payload; public re-exports preserved (the server's imports
are unchanged). 95 tests green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:29:44 +03:00
aaltshuler
5a8047e5d0 refactor(cluster): move the storage backend to store.rs
Verbatim move of LocalStateBackend, StateSnapshot, StateLockGuard and their
impls — the single home for stored-state I/O (state ledger, lock, recovery
sidecars, approval artifacts), where the RFC-006 object-storage port lands
next as a focused diff. Visibility bumps (pub(crate)) only; 95 tests green
before and after.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:28:04 +03:00
Andrew Altshuler
1b1583d897
Merge pull request #186 from ModernRelay/feat/object-store-primitives
feat(storage,policy): object-store primitives for the cluster port (RFC-006 PR 1/3)
2026-06-11 05:26:55 +03:00
aaltshuler
fbb86dee0e refactor(cluster): move the in-source test suite to tests.rs
Verbatim move (indentation preserved — embedded raw-string fixtures are
content). lib.rs drops from 7,857 to ~4,750 lines; `use super::*` resolves
to the crate root through the #[path] module declaration unchanged. 95
tests green before and after.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:25:53 +03:00
aaltshuler
d702fd106a feat(policy): from-source twins for the policy loaders
PolicyConfig::from_source + PolicyEngine::load_graph_from_source /
load_server_from_source — the path-based loaders delegate to them. Needed by
callers whose policy bundles don't live on the local filesystem (the cluster
catalog on object storage); kind-alignment validation stays loud through the
new path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:09:45 +03:00
aaltshuler
f48e69b999 feat(storage): versioned CAS, conditional replace, and prefix delete on StorageAdapter
Three primitives the cluster's object-storage port (RFC-006) needs, on the
engine's existing adapter rather than a parallel store:

- read_text_versioned: content + an opaque backend version token (S3: the
  ETag from GET; local: content sha256 — ETags don't exist on a filesystem).
- write_text_if_match: replace only when the token still matches. S3 maps to
  a conditional put (PutMode::Update / If-Match) — verified against RustFS
  beta.8 through the real object_store 0.12.5 path, no extra builder config
  needed; local compares content then swaps via temp+rename, the same
  single-machine semantics callers had before this trait (safe under their
  own lock protocol, not a cross-process barrier by itself). CAS-lost is
  Ok(None), never silent.
- delete_prefix: recursive + idempotent (local remove_dir_all; S3 list +
  delete, with the non-atomicity documented for crash-retry callers).

Gated S3 coverage: s3_adapter_conditional_writes_contract pins the
conditional-write behavior the cluster ledger will depend on (red if a
backend bump regresses it), and s3_schema_apply_migrates_live_graph closes
the previously-untested schema-apply-on-S3 path before the cluster's schema
executor leans on it. Engine gains the sha2 workspace dep.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 05:09:45 +03:00
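
The versioned-read and conditional-replace contract can be illustrated with an in-memory stand-in. This sketch is not the engine's `StorageAdapter`: `MemStore` is hypothetical, the standard library's `DefaultHasher` stands in for sha256 as the content token, and returning the new token on success is an assumption. The one behavior taken directly from the commit is that a lost CAS is `Ok(None)`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Content-derived version token (stand-in for the local sha256 token).
fn token_of(content: &str) -> String {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// A single in-memory object, for illustration only.
pub struct MemStore {
    content: String,
}

impl MemStore {
    pub fn new(initial: &str) -> Self {
        Self { content: initial.to_string() }
    }

    /// Return the content together with its opaque version token.
    pub fn read_text_versioned(&self) -> (String, String) {
        (self.content.clone(), token_of(&self.content))
    }

    /// Replace the content only if the token still matches.
    /// A lost race is Ok(None); Err is reserved for real I/O failures.
    pub fn write_text_if_match(
        &mut self,
        expected_token: &str,
        new_content: &str,
    ) -> Result<Option<String>, String> {
        if token_of(&self.content) != expected_token {
            return Ok(None);
        }
        self.content = new_content.to_string();
        Ok(Some(token_of(new_content)))
    }
}
```

A caller that receives `Ok(None)` re-reads to obtain a fresh token and decides whether to retry, so a lost update is always visible to the caller.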
Andrew Altshuler
328bfef6fb
Merge pull request #184 from ModernRelay/refactor/load-ingest-unification
refactor: unify load/ingest — load survives, ingest deprecated
2026-06-11 04:43:36 +03:00