Caught on the live smoke: with --alias, the first bare CLI arg lands in
the hidden legacy_uri positional, so an operator alias's positional param
was never bound ('parameter not provided' from the server). An operator alias
always knows its target, so the existing normalize_legacy_alias_uri
reclaims the swallowed positional as the first alias arg — same rule the
legacy path already applies.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The aliases: block in the operator config binds a personal name to
(server, graph, stored-query NAME, positional arg mapping, fixed param
defaults, format) and carries zero content, per the ratified
bindings-not-content model. Invocation
goes through the server's stored-query endpoint (POST
{base}/graphs/{g}/queries/{name}) with the keyed credential resolving via
the ordinary URL match; param precedence --params > positionals > fixed
defaults; the result renders through the existing format cascade with the
alias's format as its hop. A legacy omnigraph.yaml alias with the same
name wins during the RFC-008 window, with a warning naming both.
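A minimal sketch of the param precedence described above (--params >
positionals > fixed defaults). AliasBinding and merge_params are
hypothetical names for illustration, not the CLI's actual types; the merge
applies the lowest-precedence layer first so later layers overwrite it.

```rust
use std::collections::BTreeMap;

/// Hypothetical alias binding: positional param names in order, plus fixed defaults.
struct AliasBinding {
    positional: Vec<String>,
    defaults: BTreeMap<String, String>,
}

/// Merge lowest precedence first: fixed defaults, then positionals, then --params.
fn merge_params(
    alias: &AliasBinding,
    positionals: &[&str],
    explicit: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut out = alias.defaults.clone();
    // Bind each CLI positional to the param name the alias maps it to.
    for (name, value) in alias.positional.iter().zip(positionals) {
        out.insert(name.clone(), value.to_string());
    }
    // Explicit --params entries win over everything else.
    for (key, value) in explicit {
        out.insert(key.clone(), value.clone());
    }
    out
}
```

Positionals beyond the alias's mapping are silently dropped by zip in this
sketch; a real implementation would presumably reject them.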
E2e (spawned policy-gated server, invoke_query granted via a per-graph
bundle): the alias invokes with name + one positional and nothing else —
server, graph, query, and token all from the operator layer; --server/
--graph explicit targeting; unknown --server lists defined names;
--server exclusive with a positional URI.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Global flags --server (operator-defined server name) and --graph (graph id
on a multi-graph server, requires --server) resolve to the effective
remote URI through one helper and feed the ordinary uri slot — graph
resolution and the PR-2 keyed-token URL match work unchanged; the flag is
sugar for a URI the operator already owns. Exclusive with a positional
URI and --target (loud error, never silent precedence). Unknown names
fail listing the servers that ARE defined.
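A sketch of the one-helper resolution, under stated assumptions:
effective_uri and its error strings are hypothetical, the {base}/graphs/{id}
shape is inferred from the stored-query endpoint path rather than confirmed,
and the --target exclusion is left out for brevity.

```rust
use std::collections::BTreeMap;

/// Resolve --server/--graph to a remote URI, or pass a positional URI through.
/// Assumption: a graph on a multi-graph server lives at `{base}/graphs/{id}`.
fn effective_uri(
    servers: &BTreeMap<String, String>,
    server: Option<&str>,
    graph: Option<&str>,
    positional_uri: Option<&str>,
) -> Result<Option<String>, String> {
    let Some(name) = server else {
        if graph.is_some() {
            return Err("--graph requires --server".to_string());
        }
        return Ok(positional_uri.map(str::to_string));
    };
    // Loud error instead of silent precedence between the two sources.
    if positional_uri.is_some() {
        return Err("--server cannot be combined with a positional URI".to_string());
    }
    let base = servers.get(name).ok_or_else(|| {
        let known: Vec<&str> = servers.keys().map(String::as_str).collect();
        format!("unknown server '{name}'; defined servers: {}", known.join(", "))
    })?;
    let base = base.trim_end_matches('/');
    Ok(Some(match graph {
        Some(id) => format!("{base}/graphs/{id}"),
        None => base.to_string(),
    }))
}
```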
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
RFC-007 §D2 gains the model the alias design reasoned through: stored
queries are content + its canonical team-owned name; legacy
omnigraph.yaml aliases conflate a personal name with a local-file content
pointer (the muddle RFC-008 retires); operator aliases are pure bindings
(server, graph, stored-query NAME, arg mapping, defaults) — an alias that
carries content competes with the catalog, one that references a name
composes with it. The three senses of 'global' are resolved explicitly:
cross-graph globality is strengthened (one $HOME file vs per-directory),
team-shared shorthand is deliberately NOT an alias mechanism (the shared
name IS the catalog name), cross-machine follows the dotfile. Collision
rule: legacy wins during the RFC-008 window, with a warning.
RFC-008's migration row for aliases sharpens accordingly: a legacy alias
splits — content to the catalog (via cluster apply), binding to the
operator layer; config migrate proposes both halves.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The operator config gains servers: (name -> url; never a token). A remote
command whose URL prefix-matches an operator server resolves its bearer
token through the keyed chain first — OMNIGRAPH_TOKEN_<NAME> env, then the
[<name>] section of ~/.omnigraph/credentials (created 0600 via temp+rename,
#139 finding 7; group/world-readable files refused loudly) — falling
through to the legacy chain unchanged. URL keying makes §D5 rule 3
structural: a token is only ever sent to the server it is keyed to.
Longest-prefix matching with a path-boundary check (http://h:8080 never
matches http://h:8080-evil). Inserting the keyed hop above the legacy chain
is safe by construction — no existing setup can have servers: defined.
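The longest-prefix match with a path-boundary check can be sketched as
follows. prefix_matches and match_server are hypothetical names, and the
boundary rule used here (the remainder is empty or starts with '/') is an
assumption about what the real check accepts.

```rust
/// True when `url` equals `base` or extends it at a path boundary.
fn prefix_matches(base: &str, url: &str) -> bool {
    let base = base.trim_end_matches('/');
    match url.strip_prefix(base) {
        // "http://h:8080-evil" leaves "-evil", which is not a boundary.
        Some(rest) => rest.is_empty() || rest.starts_with('/'),
        None => false,
    }
}

/// Pick the server whose URL is the longest boundary-respecting prefix of `url`.
/// `servers` holds (name, url) pairs from the operator config.
fn match_server<'a>(servers: &'a [(String, String)], url: &str) -> Option<&'a str> {
    servers
        .iter()
        .filter(|(_, base)| prefix_matches(base, url))
        .max_by_key(|(_, base)| base.trim_end_matches('/').len())
        .map(|(name, _)| name.as_str())
}
```

A None result corresponds to falling through to the legacy chain.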
omnigraph login <name> stores/rotates one section (token from --token or
one stdin line — the pipe flow keeps secrets out of shell history);
omnigraph logout removes it, idempotently; logging in before declaring the
server warns instead of failing (the gh model).
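For illustration, a Unix-only sketch of the 0600 temp+rename write and the
over-permissive refusal mentioned above. write_private and read_private are
hypothetical names; the temp-file naming and the error text are assumptions,
not the CLI's actual behavior.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;

/// Write `contents` to a sibling temp file created 0600, then rename over `path`.
fn write_private(path: &Path, contents: &str) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .mode(0o600) // only takes effect when the temp file is newly created
        .open(&tmp)?;
    file.write_all(contents.as_bytes())?;
    file.sync_all()?;
    // Rename is atomic on the same filesystem, so readers never see a partial file.
    fs::rename(&tmp, path)
}

/// Refuse to read a file that group or others can access.
fn read_private(path: &Path) -> io::Result<String> {
    let mode = fs::metadata(path)?.permissions().mode() & 0o777;
    if mode & 0o077 != 0 {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{} has mode {:o}; expected 0600", path.display(), mode),
        ));
    }
    fs::read_to_string(path)
}
```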
Coverage: URL-match/no-substring-trap, credentials round-trip preserving
sibling sections, 0600 write + over-permissive refusal, env-name mapping;
the legacy resolve test is now hermetic against a real ~/.omnigraph and
asserts byte-identical legacy behavior with no servers defined; one
spawned-binary e2e walks the whole lifecycle against an authed server:
refusal -> wrong-token login (stdin) -> rotate (--token) -> authorized read
-> env-beats-file -> non-matching-URL negative -> logout revokes.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
cli-reference.md gains the config-surfaces table (cluster / operator /
flags-env, with omnigraph.yaml marked as the legacy combined file per
RFC-008) and the operator config.yaml reference; audit.md documents the
unified actor chain.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
~/.omnigraph/config.yaml joins the resolution chains as the operator
surface: operator.actor becomes the last hop of THE actor chain (--as >
legacy cli.actor during the RFC-008 window > operator.actor > none, one
implementation for direct-engine and cluster commands alike) and
defaults.output joins the read-format cascade below every more-specific
source. Discovery honors $OMNIGRAPH_HOME (tilde-expanded, #139 finding 9);
an absent file is an empty layer; unknown keys WARN and load (a file
written for later slices must not break this CLI); malformed YAML is a
loud error. The module is CLI-only — the server never reads operator
config (invariant 11 by construction).
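The actor chain above reads as a first-match cascade. A minimal sketch,
assuming each source has already been loaded into an Option; resolve_actor
and ActorSource are hypothetical names.

```rust
/// Which hop supplied the actor; variant names are hypothetical.
#[derive(Debug, PartialEq)]
enum ActorSource {
    Flag,
    LegacyConfig,
    OperatorConfig,
}

/// --as > legacy cli.actor > operator.actor > none.
fn resolve_actor<'a>(
    flag_as: Option<&'a str>,
    legacy_cli_actor: Option<&'a str>,
    operator_actor: Option<&'a str>,
) -> Option<(&'a str, ActorSource)> {
    flag_as
        .map(|a| (a, ActorSource::Flag))
        .or(legacy_cli_actor.map(|a| (a, ActorSource::LegacyConfig)))
        .or(operator_actor.map(|a| (a, ActorSource::OperatorConfig)))
}
```

Returning the source alongside the value is a design choice of the sketch;
it makes the chain easy to assert on in tests.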
$OMNIGRAPH_CONFIG becomes a first-class stand-in for --config in
load_config (flag > env > ./omnigraph.yaml), one meaning in both binaries.
The test harness pins hermeticity: spawned binaries get a nonexistent
OMNIGRAPH_HOME by default so no test ever reads the developer's real
operator config. New coverage: loader unit tests, the env-precedence
matrix on load_config_in, and spawned-binary e2es for the actor chain
(operator wins with no flag/legacy key; legacy outranks it; --as wins) and
the format cascade.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
RFC-007 now speaks the end-state language throughout: the operator surface
is one half of the two-surface split (cluster config / operator config),
not a layer over a living omnigraph.yaml. The precedence cascade drops the
project layer (cluster config carries no operator-resolvable keys — a
checkout can never supply identity); legacy omnigraph.yaml appears only as
the RFC-008 deprecation-window slot. The trust boundary is restated as
closed-by-construction in the end state, with the rules governing the
window. PR 3 becomes operator targeting (--server + operator aliases — the
replacement RFC-008 needs before legacy aliases migrate), and the schema
example gains the aliases block.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
omnigraph.yaml is three unrelated concerns wearing one filename (server
deployment config, project/CLI conveniences, operator identity), and the
mixture is the root cause of a recurring problem class (per-operator
copies of project files, checkout-supplied credential redirection, init
scaffold pollution). End state: two single-owner surfaces — cluster
config (team, repo) and operator config (person, $HOME) — plus the
zero-config flags/env tier.
Complete key-by-key migration map over the verified OmnigraphConfig
surface; staged retirement per the repo's Hyrum rules (warn with per-key
guidance -> `config migrate` tool -> stop scaffolding -> opt-in strict ->
removal at the next major). RFC-007's project-layer framing is amended to
transitional accordingly.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Terraform-style operator/project split: ~/.omnigraph/config.yaml for
identity (operator.actor in the --as cascade), credentials keyed by
server name (env -> 0600 credentials file; no inline secrets), and
operator-owned named servers that project configs reference but cannot
redefine. Explicitly a staged subset of RFC-002: adopts its settled
decisions (one dir, keyed credentials, env precedence), defers
GraphLocator/use/state-layer, and encodes the ten confirmed PR #139
findings as design rules (compat shims, key-level merges, atomic writes,
the project-layer trust boundary).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
s3_cluster.rs runs the full control-plane lifecycle against a real
bucket (CI: containerized RustFS; locally the RustFS binary): import →
lock released (pins the drop-time release regression caught on the first
live smoke) → apply (graph roots + catalog on the bucket, nothing local)
→ serving snapshots from both the config dir and the bare URI → schema
evolution → approved delete (prefix removal) → empty-cluster refusal.
The server suite gains the config-free boot test: --cluster s3://… with
zero local files serves a stored query over HTTP.
CI: the rustfs job runs both suites; the classify filter covers the
cluster store/serve modules and the new test files. The server smoke
drops its name filter: every test in the s3 target is bucket-gated, and
a filter that matches nothing passes vacuously (the stale filter had
silently run zero tests for a while).
Docs: deployment.md gains the Bucket-no-volume shape as the preferred
cloud deployment; cluster.md/server.md document --cluster <uri>;
testing.md maps the new suite.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Two serving changes that complete RFC-006's read side:
ServingPolicy carries the policy bundle CONTENT (digest-verified at
snapshot read) instead of a blob path — the catalog may live on object
storage, and the server must not re-read mutable state after the
snapshot. The server grows a PolicySource enum: File for omnigraph.yaml
deployments (unchanged), Inline for cluster boots, wired through
PolicyEngine::load_{graph,server}_from_source.
read_serving_snapshot_from_storage(uri) reads the applied revision
straight from a storage root, and --cluster accepts a scheme-qualified
URI (s3://bucket/prefix): config-free serving — a serving box needs only
the URI and credentials; the ledger and catalog on the bucket ARE the
deployment artifact. Bare paths keep the config-directory behavior.
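A sketch of how the --cluster argument might be told apart, assuming the
rule is simply "has a URI scheme followed by ://". ClusterSource and
classify_cluster_arg are hypothetical names, and treating every scheme
(including file://) as a storage root is an assumption.

```rust
use std::path::PathBuf;

/// Hypothetical classification of the --cluster argument.
#[derive(Debug, PartialEq)]
enum ClusterSource {
    /// Bare path: keep the config-directory behavior.
    ConfigDir(PathBuf),
    /// Scheme-qualified URI: read the applied revision from the storage root.
    StorageRoot(String),
}

fn classify_cluster_arg(arg: &str) -> ClusterSource {
    match arg.split_once("://") {
        Some((scheme, _))
            if !scheme.is_empty()
                && scheme
                    .chars()
                    .all(|c| c.is_ascii_alphanumeric() || matches!(c, '+' | '-' | '.')) =>
        {
            ClusterSource::StorageRoot(arg.to_string())
        }
        _ => ClusterSource::ConfigDir(PathBuf::from(arg)),
    }
}
```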
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The test split retired tests/server.rs; the job now targets --test s3.
Also fixes a stale name filter (s3_repo vs the actual s3_graph test): a
substring filter that matches nothing passes vacuously, so this step had
been running zero tests.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Verbatim moves: the clap surface (every command/subcommand/arg struct) to
cli.rs, resolution helpers (config/actor/graph/branch/query, remote HTTP,
env/token, scaffolding) to helpers.rs, human/JSON formatting to output.rs,
the in-source test mod to main_tests.rs via #[path]. main.rs (1,184 lines)
keeps main() and the dispatch match. Visibility bumps only; 22 binary
tests green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Verbatim moves: route handlers + bearer-auth middleware + per-request
authorization + the cluster-prefix OpenAPI rewrite go to handlers.rs;
settings resolution (omnigraph.yaml/CLI/env, mode inference, bearer-token
sources, runtime-state classification) and its in-source test mod go to
settings.rs. lib.rs (1,158 lines) keeps the public types, app/router
assembly, and serve(). The ApiDoc derive references handlers::-qualified
paths; the one multi-line utoipa attribute the cut orphaned was relocated
with its handler. 289 crate tests green, OpenAPI drift check included.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
tests/server.rs (6,517 lines, 110 tests) becomes seven area files —
auth_policy, data_routes, schema_routes, stored_queries, multi_graph,
boot_settings, s3 — with shared helpers in tests/support/mod.rs. Verbatim
moves + visibility bumps (pub on helpers, pub(super)->pub inside the
matrix harness); cargo fix stripped the per-file unused imports. All 110
tests pass in their new homes (289 across the crate including lib and
openapi).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Caught by the first live s3 smoke: StateLockGuard's spawned async delete
dies with the runtime when a short-lived CLI process exits right after the
command — import's lock survived into the next command as state_lock_held.
On the multi-thread runtime (the CLI, and the gated s3 tests)
block_in_place waits for the delete to complete; current-thread runtimes
keep the spawn fallback with force-unlock as the documented recovery, same
as a crash.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
cluster.yaml gains an optional storage: URI deciding where everything the
cluster STORES lives: the state ledger, lock, content-addressed catalog,
recovery sidecars, approval artifacts, and the derived graph roots
(<storage>/graphs/<id>.omni). Absent, it defaults to the config directory
itself — the original layout, byte-compatible, so pre-existing clusters and
the whole test suite are untouched. Declared configuration always stays in
the working tree (Terraform's config-local/state-remote split); credentials
are env-only, never in cluster.yaml.
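A small sketch of the defaulting and graph-root derivation described above.
storage_root and graph_root are hypothetical helpers working on plain
strings; the real code goes through ClusterStore and validates the root,
which this sketch does not attempt.

```rust
/// The declared storage: URI, or the config directory when it is absent.
fn storage_root(declared: Option<&str>, config_dir: &str) -> String {
    declared
        .unwrap_or(config_dir)
        .trim_end_matches('/')
        .to_string()
}

/// Derived graph root: <storage>/graphs/<id>.omni
fn graph_root(storage: &str, graph_id: &str) -> String {
    format!("{storage}/graphs/{graph_id}.omni")
}
```

With no storage: key, graph_root(&storage_root(None, "deploy"), "sales")
yields deploy/graphs/sales.omni, i.e. the original layout.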
Every command resolves its store from the declared root (a bad root is a
loud invalid_storage_root). Graph-root derivation, the delete executor
(prefix delete via the adapter), the sweep's existence probes, the catalog
payload write/verify/read paths, and the serving snapshot all flow through
ClusterStore — the last raw-fs holdouts for stored state are gone, and the
deny-list gains the rule that keeps it that way.
Tests: default-layout byte-compat, a file:// root relocating the entire
cluster (ledger+catalog+graphs under the new root, nothing under the config
dir, serving snapshot follows), invalid-root validation. 98 in-crate + 9
failpoints + full workspace gate green. The s3:// flavor lands with PR 3's
gated RustFS e2e.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
LocalStateBackend becomes ClusterStore: every stored byte — state ledger,
lock, recovery sidecars, approval artifacts — now flows through the
engine's StorageAdapter, making file:// and s3:// one code path. Behavior
on the file backend is byte-compatible (layout, CAS semantics, diagnostics,
lock release timing) and the entire pre-existing suite passes unchanged.
Mechanics: the ledger CAS keeps its public sha256 vocabulary while the
physical swap is token-conditioned (ETag If-Match on S3 via PR #186's
primitives; content-token + temp/rename locally — the pre-port semantics);
the lock is a create-only put (genuinely cross-machine on object stores)
with deterministic drop-release locally and best-effort spawned release on
S3; sidecars/approvals address by URI (SweepOutcome and the executors carry
strings); sweep row-1 retirement joins the uniform deferred post-CAS
cleanup. ClusterStore also gains the catalog-payload and graph-root
methods that commit 2 wires in.
Async ripple: status/force-unlock/serving-snapshot and the server's
settings loader chain go async (CLI dispatch and ~20 test hosts follow,
mechanically). tokio joins the cluster crate's runtime deps for the lock
guard's handle.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
A cold rust-cache (every Cargo.lock change) means a full workspace +
failpoints-feature build on the 2-core runner, which now exceeds 45
minutes on slow runner days — and because a timed-out run never saves its
cache, an undersized budget self-perpetuates: every retry starts cold and
dies identically (observed four consecutive 45-minute cancellations on
main and PR #188 after #186's lock bump). Warm-cache runs stay ~15
minutes; 75 is headroom matching the rustfs job's budget, not a target.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Verbatim move of the public output/diagnostic types and the internal
state/sidecar/approval models; previously-private types and their fields
get pub(crate) (they were crate-visible by position before). lib.rs is now
the command pipeline + public API. 95 tests green; full workspace gate
green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Verbatim move of cluster.yaml parsing, query discovery, source digesting,
header/id validation, path resolution, and live-graph observation. Two
helpers that the cut swept along were relocated to their right homes
(state-status helpers back to lib.rs, lock-file helpers to store.rs). 95
tests green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Verbatim move of the Serving* types, read_serving_snapshot, and
read_verified_payload; public re-exports preserved (the server's imports
are unchanged). 95 tests green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Verbatim move of LocalStateBackend, StateSnapshot, StateLockGuard and their
impls — the single home for stored-state I/O (state ledger, lock, recovery
sidecars, approval artifacts), where the RFC-006 object-storage port lands
next as a focused diff. Visibility bumps (pub(crate)) only; 95 tests green
before and after.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Verbatim move (indentation preserved — embedded raw-string fixtures are
content). lib.rs drops from 7,857 to ~4,750 lines; `use super::*` resolves
to the crate root through the #[path] module declaration unchanged. 95
tests green before and after.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
PolicyConfig::from_source + PolicyEngine::load_graph_from_source /
load_server_from_source — the path-based loaders delegate to them. Needed by
callers whose policy bundles don't live on the local filesystem (the cluster
catalog on object storage); kind-alignment validation stays loud through the
new path.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Three primitives the cluster's object-storage port (RFC-006) needs, on the
engine's existing adapter rather than a parallel store:
- read_text_versioned: content + an opaque backend version token (S3: the
ETag from GET; local: content sha256 — ETags don't exist on a filesystem).
- write_text_if_match: replace only when the token still matches. S3 maps to
a conditional put (PutMode::Update / If-Match) — verified against RustFS
beta.8 through the real object_store 0.12.5 path, no extra builder config
needed; local compares content then swaps via temp+rename, the same
single-machine semantics callers had before this trait (safe under their
own lock protocol, not a cross-process barrier by itself). CAS-lost is
Ok(None), never silent.
- delete_prefix: recursive + idempotent (local remove_dir_all; S3 list +
delete, with the non-atomicity documented for crash-retry callers).
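The contract of the three primitives can be illustrated with an in-memory
stand-in. MemStore is hypothetical and is not the engine's adapter; the
signatures are simplified (synchronous, String errors), and DefaultHasher
replaces the ETag/sha256 token only to keep the sketch std-only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// In-memory stand-in for a storage backend (illustration only).
#[derive(Default)]
struct MemStore {
    objects: BTreeMap<String, String>,
}

/// Opaque version token derived from content (stand-in for ETag / sha256).
fn token_of(content: &str) -> String {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

impl MemStore {
    /// Content plus the token a later conditional write must present.
    fn read_text_versioned(&self, key: &str) -> Option<(String, String)> {
        self.objects.get(key).map(|c| (c.clone(), token_of(c)))
    }

    /// Replace only when the token still matches. A lost CAS is Ok(None),
    /// so callers must handle it explicitly. Assumption: a missing key is Err.
    fn write_text_if_match(
        &mut self,
        key: &str,
        expected: &str,
        new: &str,
    ) -> Result<Option<String>, String> {
        let current = self
            .objects
            .get(key)
            .ok_or_else(|| format!("{key} not found"))?;
        if token_of(current) != expected {
            return Ok(None);
        }
        self.objects.insert(key.to_string(), new.to_string());
        Ok(Some(token_of(new)))
    }

    /// Remove every key under `prefix`; repeating the call is a no-op.
    fn delete_prefix(&mut self, prefix: &str) {
        self.objects.retain(|k, _| !k.starts_with(prefix));
    }
}
```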
Gated S3 coverage: s3_adapter_conditional_writes_contract pins the
conditional-write behavior the cluster ledger will depend on (red if a
backend bump regresses it), and s3_schema_apply_migrates_live_graph closes
the previously-untested schema-apply-on-S3 path before the cluster's schema
executor leans on it. Engine gains the sha2 workspace dep.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
omnigraph load is now the single data-write command:
- works against remote graphs (POSTs the server's /ingest endpoint with the
same bearer/actor resolution as other remote commands) — previously load
was the only data command forced to open Lance storage directly
- --from <base> opts into fork-if-missing for --branch (the former ingest
semantics); without --from a missing branch is an error, never a fork
- --mode is now required: overwrite is destructive, so there is no implicit
default (the old silent default was overwrite)
- output gains base_branch/branch_created (and table sums on remote loads)
omnigraph ingest stays as a deprecated alias (defaults preserved: --from
main --mode merge) that prints a one-line warning to stderr, matching the
read/change deprecation convention; removal in a later release.
Docs updated in the same change: cli.md, cli-reference.md, policy.md,
audit.md, execution.md (unified load section), AGENTS.md quick-flow,
README.md.
BREAKING CHANGE: scripts running omnigraph load without --mode must now
pass it explicitly (previously defaulted to the destructive overwrite).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Branch creation becomes opt-in by presence of the request's 'from' field.
Previously the handler defaulted 'from' to 'main' and always auto-created a
missing branch — a typo'd branch name silently forked main and landed the
data there, with the client none the wiser. Now a request without 'from'
against a missing branch returns 404 branch-not-found and creates nothing;
with 'from' set, fork-if-missing behaves as before. The BranchCreate
authority is only consulted when a fork will actually happen.
The handler calls the unified load_as directly (the deprecated ingest_as
shim is no longer used in the server). IngestOutput.base_branch becomes
nullable: it echoes the request's 'from' and is null when absent. OpenAPI
regenerated; the CLI's local ingest arm moves to load_file_as + the new
converter shape.
BREAKING CHANGE: clients that relied on implicit fork-from-main with 'from'
omitted must now pass from='main' explicitly. IngestOutput.base_branch is
now nullable.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The free helpers needlessly demanded &mut Omnigraph (every load API takes
&self) and read as leftovers. Rather than rewriting their ~200 call sites
across the test suites — which would have to re-derive the active-branch
resolution at each site — keep the one convenience and make it honest:
borrow immutably (&mut callers coerce, no churn) and document it as the
active-branch shorthand over Omnigraph::load.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
load_as/load_file_as gain a base: Option<&str> parameter: with Some(base) a
missing target branch is forked from base first (the former ingest
semantics); with None the target branch must exist — staging fails on an
unknown branch, so a typo'd name can never create one. LoadResult gains
branch/base_branch/branch_created metadata (additive).
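The base semantics can be sketched against an in-memory branch set.
LoadTarget and resolve_target are hypothetical; the real check happens
during staging inside load_as. Assumptions: the sketch echoes base even
when no fork happens, and a missing base branch is an error.

```rust
use std::collections::BTreeSet;

/// Hypothetical mirror of the new LoadResult metadata.
#[derive(Debug, PartialEq)]
struct LoadTarget {
    branch: String,
    base_branch: Option<String>,
    branch_created: bool,
}

/// With Some(base) a missing branch is forked from base first;
/// with None the branch must already exist, so a typo never creates one.
fn resolve_target(
    branches: &mut BTreeSet<String>,
    branch: &str,
    base: Option<&str>,
) -> Result<LoadTarget, String> {
    let exists = branches.contains(branch);
    let branch_created = match (exists, base) {
        (true, _) => false,
        (false, Some(b)) => {
            if !branches.contains(b) {
                return Err(format!("base branch '{b}' not found"));
            }
            branches.insert(branch.to_string());
            true
        }
        (false, None) => return Err(format!("branch '{branch}' not found")),
    };
    Ok(LoadTarget {
        branch: branch.to_string(),
        base_branch: base.map(str::to_string),
        branch_created,
    })
}
```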
The ingest family (ingest, ingest_as, ingest_file, ingest_file_as) becomes
#[deprecated] shims over load_as that preserve the historical contract
exactly (from: None still means fork from main; base recorded even when no
fork happened). IngestResult and to_ingest_tables stay for the shims and
the server until the removal release.
The layered policy check is unchanged: Change on the target branch always,
BranchCreate additionally when a fork actually happens (enforced inside
branch_create_from_as with the actor threaded through).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The LoadMode table still described Overwrite as an inline-commit-per-type
residual with a partial-truncation failure window. Since MR-793 Phase 2,
Overwrite goes through the same MutationStaging accumulator as Append/Merge,
staged as a Lance Operation::Overwrite transaction via stage_overwrite
(table_store.rs) and committed with commit_staged + publisher CAS — a
mid-load failure leaves Lance HEAD untouched in all three modes.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
resolve_query_decls hands its file contents to the caller; the per-query
digest/typecheck pass reuses them instead of re-reading (a file with N
queries was read N+1 times), which also closes the window where a file
changing between enumeration and validation produced a confusing
query_key_mismatch for a just-discovered name. Explicit-map declarations
read as before.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Paths in cluster.yaml and command examples are relative to one explicit
config folder (Terraform-shaped) — the ./ prefixes were noise and are gone
across the user docs (109 instances; ../ links and ./scripts executables
untouched). The cluster docs now present directory discovery as the primary
queries form with the list and map forms documented alongside.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>