omnigraph

mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-12 01:45:14 +02:00

Author	SHA1	Message	Date
aaltshuler	c3007369cd	feat(cluster): execute graph creates in cluster apply Stage 4A (RFC-004 §D1/§D5): graph.<id> Create — and its paired schema Create, which the init carries — classify Applied and execute first in the run, sequentially and sidecar-fenced: sidecar written before Omnigraph::init at the derived root, rewritten with the post-init manifest pin, deleted only after the final state CAS lands. Dependent queries and policies no longer block on a graph create in the same plan — creates run first, so they apply in the same run; a create failure demotes them to blocked (dependency_not_applied) and stops further graph-moving work (loud partials), with the sidecar left for the sweep to classify. Graphs with a kept recovery sidecar (rows 5/6) classify Blocked/cluster_recovery_pending, and the sweep's Drifted/Error statuses are never clobbered by a generic Blocked. Schema source is re-read and digest-verified under the lock before the init (the write_resource_payload TOCTOU posture). Plan previews the same dispositions. e2e fallout updated: a fresh multi-graph config now converges in one apply; a destroyed root is re-created as an EMPTY graph by the next apply (declarative convergence — visible in plan, called out in docs); the new cluster_e2e_declared_graph_created_by_apply pins the no-manual-init flow. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 04:58:56 +03:00
aaltshuler	bf8cc7a753	feat(cluster): graph-create recovery sidecars and sweep RFC-004 §D2/§D3 for the graph_create kind. RecoverySidecar records intent under __cluster/recoveries/{ulid}.json; the roll-forward-only sweep runs at the start of apply/refresh/import under the state lock and classifies each survivor by observation: root absent -> intent removed (row 1); outcome already recorded -> retired (row 2); create completed but state stale -> ledger rolled forward with a recovery_records audit entry (row 4); partial root -> Error/graph_create_incomplete, kept, never auto-deleted (row 5); unexpected schema -> Drifted/actual_applied_state_pending, kept (row 6). Sweep mutations ride the command's existing CAS write; completed sidecars are deleted only after that write lands. Read-only status/plan warn (cluster_recovery_pending) without acting. The apply payload gate now counts only payload-phase errors so kept-sidecar diagnostics don't abort the run before their statuses persist. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 04:50:42 +03:00
aaltshuler	6fbf09d5c9	refactor(cluster): make apply_config_dir async Mechanical conversion ahead of Stage 4A graph create (which calls the async Omnigraph::init from inside apply): the fn signature, the CLI dispatch arm, and every test caller (#[test] -> #[tokio::test]). Zero behavior change; all 60 lib tests and 3 failpoint tests green before and after. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 04:43:38 +03:00
Andrew Altshuler	26b26999fd	ci(codeowners): aaltshuler owns all paths; remove ragnorc (#169 ) Engineering and docs roles both resolve to @aaltshuler; every path (catch-all, crates/, docs/, repo-level docs) now requires their review. CODEOWNERS and the doc tables regenerated from codeowners-roles.yml via render-codeowners.py. Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 04:34:17 +03:00
Andrew Altshuler	58c66a54a2	docs(cluster): RFC-004 — graph & schema apply design (Phase 4) (#168 ) * docs(cluster): RFC-004 — graph & schema apply design (Phase 4) The design the implementation spec's exit criteria require before graph-moving cluster apply ships. Core positions: - Cluster recovery is roll-forward-only: the engine's own sidecars make every graph-level operation atomic within the graph, so the cluster never rolls a graph back — its sidecars (__cluster/recoveries/{ulid}.json) classify and record, converging the ledger to observable reality (axiom 5) or surfacing a loud pending-repair condition. Eight-row decision matrix, every row testable with the Stage 3B failpoint harness. - Irreversible operations (graph delete, allow_data_loss schema apply) consume digest-bound approval artifacts written by a new cluster approve command and retired into state.approval_records (axiom 11). A stale approval can never authorize a different change. - cluster apply gains an actor, threaded to apply_schema_as so engine Cedar enforcement and commit attribution work unchanged; the cluster adds no policy engine of its own. - Deterministic ordering (creates -> schema applies -> catalog -> deletes), per-resource apply groups, cross-graph atomicity explicitly not promised. - Staged 4A graph create / 4B schema apply / 4C graph delete, each gated on per-matrix-row failpoint tests. Answers exit criteria 2 and 4 fully, 1/5/6 partially; 3/7/8/9 deferred to their phases (coverage table in the RFC). Linked from the dev index and the implementation spec's Phase 4 section. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs(cluster): RFC-004 review fixes — graph_delete sweep rows, state_cas_base contract Two greptile findings: (1) D3 row 2 could not be evaluated for graph_delete (no manifest to version-check after prefix removal) and 'root absent, state already tombstoned' fell into the stale row — split into rows 7 (delete's analog of row 2) and 7b (the roll-forward), with expected_manifest_version documented as always null for the delete kind. (2) state_cas_base is now explicitly audit/diagnostics-only — the sweep never consults it; independent state mutations are handled by the ordinary CAS like any concurrent write. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 04:34:14 +03:00
Andrew Altshuler	effb9cc068	Merge pull request #167 from ModernRelay/feat/cluster-stage3b feat(cluster): Stage 3B — catalog payload verification + failpoint coverage	2026-06-10 03:17:11 +03:00
aaltshuler	16759b28b9	fix(cluster): RAII-guard the callback failpoint ScopedFailPoint::with_callback gives cfg_callback the same Drop-based cleanup as cfg actions; a panic while the point is active no longer leaks the callback into the process-global registry where it would fire under later tests (greptile review, PR #167). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 02:36:24 +03:00
aaltshuler	08ea659c9b	build: commit Cargo.lock for omnigraph-cluster's optional fail dependency The failpoints feature added fail = { workspace = true, optional = true } to the crate manifest; the lockfile edge belongs with it (--locked CI gate). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 02:21:10 +03:00
aaltshuler	50543a8ce0	docs(cluster): record Stage 3B failpoint + verification coverage Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 02:15:13 +03:00
aaltshuler	211b37e6de	test(cluster): failpoint tests for crash-mid-apply and state CAS race The apply-side coverage the implementation spec's hard gate requires before Phase 4 graph-moving apply: - crash after the payload phase: state.json byte-identical, blobs inert on disk, lock released, no phantom statuses, nothing acknowledged; a plain re-run repairs via skip-if-exists blob reuse. - CAS race: a cfg_callback rewrites state.json at the exact read->write window (the state.lock:false concurrent-writer scenario); apply surfaces state_cas_mismatch, acknowledges nothing, reports the persisted status snapshot, leaves the concurrent writer's state on disk; a re-run converges. CI's failpoints step now runs both the engine and cluster suites. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 02:14:06 +03:00
aaltshuler	21b531605f	feat(cluster): failpoint infrastructure mirroring the engine Optional failpoints feature (dep:fail + fail/failpoints, deliberately NOT enabling omnigraph/failpoints), a maybe_fail/ScopedFailPoint module returning Diagnostic-typed injected errors, and two call sites in apply_config_dir: cluster_apply.after_payload_phase (the crash point: blobs on disk, state untouched) and cluster_apply.before_state_write (routes through the persisted-statuses revert contract; a cfg_callback here can mutate state.json to make the CAS check fail organically). Feature off compiles to Ok(()) — zero behavior change. Tests live in a separate integration binary because the fail registry is process-global. Also refresh the crate description (stale 'read-only' since Stage 3A). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 02:12:59 +03:00
aaltshuler	acb3f1cc14	test(cli): e2e for catalog payload drift self-heal loop status warns read-only -> refresh persists drift and drops the digest -> apply republishes the blob -> status clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 02:08:14 +03:00
aaltshuler	15868972ff	feat(cluster): verify catalog payload blobs in status and refresh Closes the Stage 3A product gap where a deleted or corrupted blob under __cluster/resources/ went unnoticed forever (status reported converged and apply could not repair it because the digests matched). verify_catalog_payloads checks every query/policy digest in state against its content-addressed blob (existence + full sha256 re-hash; graph/schema/unknown addresses have no payloads and are skipped). status reports findings read-only (warnings catalog_payload_missing/_mismatch; error catalog_payload_read_error — an unverifiable catalog must not report healthy). refresh closes the self-heal loop: missing/mismatched blobs mark the resource drifted and remove its digest from state so the next plan proposes a create and the next apply republishes; unreadable blobs keep the digest (no spurious republish), mark error, and exit non-zero. Verification runs before graph observation so the recomputed graph composite already excludes removed query digests. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 02:07:08 +03:00
Andrew Altshuler	b6d228ff54	test(cli): cluster e2e hardening — lost-state recovery, out-of-band drift, root destruction, multi-graph convergence (#166 ) Four lifecycle compositions over the spawned binary that pin spec claims no single-command test proves: - Lost ledger: delete state.json -> re-import from the live graph -> re-apply converges onto the same content-addressed blobs (axiom 5's reconstructable- state resilience edge, end to end). - Out-of-band schema apply (the Sarah/Bob violation): refresh marks graph/schema Drifted with schema_mismatch, status and plan surface it, and cluster apply refuses to silently correct it — state keeps the LIVE schema digest (drift correction is gated, axiom 8). - Destroyed graph root: refresh records graph_missing drift and drops graph/schema digests while preserving query/policy; plan proposes deferred creates only; apply moves nothing and the catalog stays intact. - Two graphs (one live, one not yet created) + a graph-spanning policy + a cluster-scoped policy: a single apply yields all four dispositions at once (applied/derived/deferred/blocked, deterministically ordered), then the second graph appears, refresh observes it, and apply converges. Helpers: init_named_cluster_graph generalizes init_cluster_derived_graph; write_multi_graph_cluster_fixture builds the two-graph config. Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 00:59:20 +03:00
Andrew Altshuler	2d1c25d3fa	Merge pull request #165 from ModernRelay/feat/cluster-apply-stage3a feat(cluster): Stage 3A — config-only cluster apply	2026-06-10 00:52:54 +03:00
aaltshuler	7f3ecf282a	Merge origin/main (#164 axiom-15 docs, #86 TableStorage migration) into feat/cluster-apply-stage3a Clean auto-merge; also fix the stale 'Stage 2C accepts' line in cluster-config.md to Stage 3A. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 00:45:44 +03:00
aaltshuler	69b63c33ac	Merge remote-tracking branch 'origin/main' into feat/cluster-apply-stage3a	2026-06-10 00:45:03 +03:00
Andrew Altshuler	cec65b8ef8	docs(cluster): axiom 15 — single ownership, mode-switch migration, per-operator layer (#164 ) Encode the omnigraph.yaml ↔ cluster.yaml coexistence rules that were implicit across the specs: - cluster-axioms.md: new axiom 15 — every fact has exactly one owner at a time; coexistence is a mode switch, never a merge; omnigraph.yaml's job description shrinks to the permanent per-operator layer. Added review-tension bullet. - cluster-config-specs.md: "Migration model" subsection (three coexistence windows: no-conflict, Phase-5 mode switch, bridges-with-sunsets) and a "per-operator layer" completeness table (connection, credential reference, active context, ergonomics, personal aliases) with its global-config-dir destination per the RFC-002 direction. - cluster-config-implementation-spec.md: Compatibility Stance #7–#9 (single ownership, shrinking role, bridges carry sunsets); Phase 5 boot is an exclusive XOR mode switch; fixed the duplicated recoveries/recovery dirs in the Phase-1 storage layout. - docs/user/cluster-config.md: "Relationship to omnigraph.yaml" section in current-reality terms (cluster catalog is inspectable, not live). Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 00:44:51 +03:00
aaltshuler	5e1dede08f	fix(cluster,cli): apply failure output — persisted statuses only, changes list printed Two review findings (greptile, PR #165): - ApplyOutput.resource_statuses on a failed state write now carries the pre-apply on-disk snapshot instead of the in-memory mutations that were never persisted, so automation reading the field independently of `ok` cannot see phantom applied/blocked statuses. Regression test forces the state write to fail via a read-only __cluster dir (unix-only, skips when permissions are not enforced). - Human-mode `cluster apply` prints the classified changes list on failure too, so an operator debugging a partial apply without --json sees what was attempted. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 00:35:03 +03:00
devin-ai-integration[bot]	2c578a60b2	(feat) convert engine call sites to &dyn TableStorage; demote legacy TableStore methods to pub(crate) (#86 ) * MR-854: convert engine call sites to &dyn TableStorage; demote legacy methods Phase 1b: every db.table_store.X(...) call site converts to db.storage().X(...), reaching the storage layer through the sealed TableStorage trait (returns &dyn TableStorage). Opaque SnapshotHandle and StagedHandle replace bare lance::Dataset and Transaction in the threaded values. Phase 9: the inherent inline-commit methods on TableStore (append_batch, merge_insert_batch{,es}, overwrite_batch, create_btree_index, create_inverted_index) demote from pub to pub(crate). Their only remaining direct users are table_store.rs itself and the bulk loader's LoadMode::{Append, Overwrite, Merge} concurrent fast-paths in loader::write_batch_to_dataset (no two-phase shape in Lance 4.0.0 — closes after lance#6658 and #6666). Docs: - invariants.md \u00a7VI.23: drop "at the writer-trait surface" qualifier; staged primitives are now the only engine surface. - runs.md: residual matrix shrinks to delete_where and create_vector_index (the two upstream-blocked residuals). - forbidden_apis.rs: replace transitional language with the current allow-list shape (table_store.rs + loader concurrent fast-path only). Files touched: - changes/mod.rs, db/omnigraph.rs (+export/optimize/schema_apply/ table_ops.rs), exec/{merge,mod,mutation,staging}.rs, loader/mod.rs, storage_layer.rs, table_store.rs, tests/forbidden_apis.rs, docs/{invariants,runs}.md. Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com> * MR-854: replace test-only inline-commit append callers with local Lance helpers After demoting TableStore::append_batch from pub to pub(crate), the integration tests in tests/recovery.rs and tests/staged_writes.rs that previously called store.append_batch(...) directly to simulate HEAD-ahead-of-manifest drift can no longer access the inherent method. Replace those calls with small in-test helpers that do a raw Dataset::append (the same body the inherent method runs). - tests/helpers/mod.rs gains lance_append_inline (shared helper). - tests/staged_writes.rs gets a file-local lance_append_inline_local (staged_writes.rs does not import helpers::). - tests/recovery.rs drops the unused TableStore import in the one function whose store binding became unused after the conversion. Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com> * MR-854: retrigger CI for flaky Test Workspace job Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com> * MR-854: convert remaining table_store call sites in export.rs / read_blob Two leftover `self.table_store.X` / `db.table_store.X` call sites were missed in the initial sweep — flagged by Devin Review on PR #86. Both now go through the trait surface: - `entity_from_snapshot` (db/omnigraph/export.rs): switch from `db.table_store.open_snapshot_table` + `db.table_store.scan` to `db.storage().open_snapshot_at_table` + `db.storage().scan`. - `read_blob` (db/omnigraph.rs): replace `snapshot.open(table_key)` + `self.table_store.first_row_id_for_filter` with `self.storage().open_snapshot_at_table` + `self.storage().first_row_id_for_filter`. The follow-up `take_blobs` call still needs an `Arc<Dataset>` (it's a Lance blob accessor not surfaced through the trait), so we hand off via `SnapshotHandle::into_arc()` with a comment. After this commit, no engine code outside `table_store.rs` reaches the inherent `TableStore` API — the docs/runs.md and docs/invariants.md claim is now uniformly true. Co-Authored-By: Ragnor Comerford <ragnor.comerford@gmail.com> * MR-854: post-rebase doc fixes (Lance 6.0.1, MR-A framing, into_dataset note) Reviewer feedback on the rebased PR: * docs/dev/writes.md residuals matrix: drop demoted methods from the trait-surface table (now `pub(crate)`); keep only the two genuine trait-surface residuals (`delete_where`, `create_vector_index`); reframe under MR-A (Lance v7.x bump) per docs/dev/lance.md. * tests/forbidden_apis.rs: update transitional allow-list header to (a) drop the truncate_table mislabel (truncate_table is a Lance Dataset method, not a TableStore method — overwrite_batch's internal call), (b) reframe trait-surface residuals under MR-A / Lance #6666. * crates/omnigraph/src/storage_layer.rs::SnapshotHandle::{into_arc, into_dataset}: add single-ref invariant doc — both consume Arc via try_unwrap-or-clone; sibling SnapshotHandle clones across an await point force a deep Dataset clone. * Replace lance-4.0.0 version refs with lance-6.0.1 in active source/test/dev-doc comments (storage_layer.rs, table_store.rs, table_ops.rs, schema_apply.rs, merge.rs, recovery.rs, staged_writes.rs, consistency.rs, docs/dev/execution.md, docs/user/query-language.md). Historical refs in docs/releases/v0.4.1.md and the canonical "Lance 4.0.0 → 6.0.1 migration" line in docs/dev/lance.md left intact. No engine code changes. * MR-854: update docs/dev/invariants.md Storage trait row + gap entry Reviewer feedback: the docs reorg landed; the invariant row now lives in docs/dev/invariants.md with stable headings (no more numbered §VI.23). Update two pieces to reflect MR-854 completion: * Status table 'Storage trait' row: was 'full call-site migration ... incomplete'; now 'engine call sites all route through db.storage() (MR-854); inline-commit inherent methods are pub(crate)-demoted; capability/stat surfaces are roadmap'. * 'Known Gaps' 'Storage abstraction' entry: was 'older inherent TableStore call sites and inline residuals remain'; now names the closed scope (MR-854 — call sites migrated, methods demoted, loader fast-paths) and the remaining trait-surface residuals under MR-A (Lance v7.x bump) and Lance #6666. Cross-links to docs/dev/lance.md and docs/dev/writes.md so the framing stays co-located with the canonical Lance surface tracking. * MR-854: remove dead inline-commit methods from the storage surface The loader concurrent fast-path (write_batch_to_dataset) is only reached for LoadMode::Overwrite — Append/Merge route through MutationStaging — so its Append/Merge arms were unreachable. Collapse it to overwrite-only and drop the now-unused mode params, which removes the only callers of: - TableStorage::append_batch + TableStorage::merge_insert_batches (trait) - TableStore::merge_insert_batch + merge_insert_batches (inherent) create_btree_index / create_inverted_index had zero callers anywhere (scalar index builds use the stage_* primitives). Remove both from the trait and the inherent impl. Inherent append_batch stays pub(crate): overwrite_batch and recovery tests use it. Migrate the one trait-append_batch test caller (seed_person_row) to stage_append + commit_staged. The merge_insert FirstSeen-workaround rationale moves from the deleted merge_insert_batch into stage_merge_insert (now the sole merge path). No behavior change. Also corrects the inaccurate loader residual comment (the prior text blamed Lance #6658/#6666, which are the delete and vector-index issues, for keeping overwrite inline; a stage_overwrite primitive already exists and schema_apply uses it). * MR-854: seal db.storage() to staged-only; move residuals to InlineCommitResidual Split the three remaining inline-commit writes (overwrite_batch, delete_where, create_vector_index) off the TableStorage trait onto a new sealed InlineCommitResidual trait, reachable only via the explicit Omnigraph::storage_inline_residual() accessor. db.storage() now exposes only staged primitives + reads, so engine code cannot couple a write with a Lance HEAD advance through the default surface — MR-793 acceptance §1 ("no public method commits as a side effect of writing") now holds by construction, not by review + naming. Call sites moved to storage_inline_residual(): loader overwrite fast-path, the three mutation delete_where paths, the branch-merge delete, and the vector-index build. Impl bodies are unchanged (same delegation to the pub(crate) inherent methods); this is a pure surface reshape with no behavior change. The residual trait holds two genuinely upstream-blocked methods (delete_where -> Lance #6658/v7.x, create_vector_index -> Lance #6666) plus overwrite_batch, kept for the loader's cross-table bulk-overwrite concurrency until its staged migration lands (tracked follow-up). * MR-854 docs: describe the staged-only seal; fix stale Lance index URLs - writes.md / invariants.md / AGENTS.md: the inline-commit residuals now live on InlineCommitResidual behind db.storage_inline_residual(), so acceptance §1 holds by construction rather than 'option (b)' per-method enumeration. Drop the inaccurate 'until Lance exposes Operation::Overwrite { fragments }' claim (that op exists; stage_overwrite already builds it) and reframe overwrite_batch as a removable legacy residual gated on the loader's bulk-overwrite concurrency. - forbidden_apis.rs: rewrite the allow-list doc for the split surface. - lance.md: the index spec pages moved from /format/table/index/ to /format/index/ in Lance 6.x (the old paths 404). Fix all 13 URLs. * MR-854: fix stale lance-4.0.0 comment refs flagged in review Addresses greptile (exec/merge.rs) and aaltshuler's stale-version blocker: update lance-4.0.0 -> 6.0.1 in the comment/doc refs within this PR's footprint (exec/merge.rs, exec/mutation.rs, docs/dev/writes.md). Also corrects exec/merge.rs to cite lance#6666 (not #6658) for build_index_metadata_from_segments — that is the vector-index segment-commit API; #6658 is the two-phase delete. (Pre-existing 4.0.0 refs in untouched files like architecture.md/storage.md are main's incomplete migration cleanup, left out of scope.) * fix(storage): stage loader overwrites * fix(storage): stage empty schema rewrites --------- Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com> Co-authored-by: Ragnor Comerford <ragnor.comerford@gmail.com> Co-authored-by: Ragnor Comerford <hello@ragnor.co>	2026-06-09 23:03:08 +02:00
aaltshuler	d870eaaf3f	test(cli): cluster lifecycle e2e — real-graph import/apply/refresh, schema-change loop, force-unlock retry Three composition tests over the spawned binary against a real derived graph: - import -> plan (dispositions) -> apply -> status -> refresh -> plan-empty, then a query edit round-trip. Pins that refresh and apply recompute the graph composite digest identically — divergence would silently re-open the plan forever and no single-command test would catch it. - The Stage 3A operator workflow across the control/data-plane boundary: cluster apply defers a schema change, omnigraph schema apply executes it, cluster refresh observes it, the next cluster apply re-converges. - Held lock refuses apply, force-unlock clears it, retried apply converges. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-09 23:44:49 +03:00
aaltshuler	40a21e4e77	docs(cluster): document Stage 3A config-only cluster apply Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-09 23:36:33 +03:00
aaltshuler	bcef8444dd	feat(cli): omnigraph cluster apply Terraform-style: apply executes directly (cluster plan is the preview, now annotated with apply dispositions). Human output prints per-change dispositions, convergence, and the catalog-only caveat; --json emits the full ApplyOutput. Exit is non-zero only on errors — deferred/blocked changes are warnings with converged: false as the automation signal. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-09 23:34:48 +03:00
aaltshuler	1f8e5945cf	feat(cluster): config-only apply with content-addressed catalog publish apply_config_dir executes the query/policy subset of the plan: payloads are written content-addressed under __cluster/resources/{query,policy}/... before the state CAS (state is the publish point; orphaned blobs from a failed CAS are inert and re-apply is the repair), then state.json is CAS-updated with applied digests, Applied/Blocked statuses, and a revision bump. Graph/schema changes are never executed here: schema content and graph lifecycle defer to a later phase with loud warnings, while graph.<id> composite-digest updates whose schema component is unchanged converge automatically via recomputation from state's own components (without which apply could never converge). Idempotent re-apply leaves state bytes and revision untouched. PlanChange gains optional disposition/reason fields, populated by the same classifier in cluster plan, so plan is an honest preview of what apply will execute, derive, defer, or block. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-09 23:32:13 +03:00
Andrew Altshuler	171a8c5d13	docs(releases): attribute the __run__ sweep (MR-770) to v0.6.2, not v0.6.1 (#161 ) The v0.6.2 notes omitted the MR-770 `__run__` cleanup entirely, and the v0.6.1 notes wrongly claimed it shipped in v0.6.1. The code (the `migrate_v2_to_v3` `__manifest` sweep + `is_internal_run_branch`/`run_registry.rs` removal) first appears at the v0.6.2 tag via #132 and is absent at v0.6.1. - v0.6.2: add the MR-770 highlight, correct the manifest-stamp note to describe the v2→v3 auto-migration on first read-write open (with the read-only caveat), and mention the cleanup in the intro. - v0.6.1: replace the two over-claiming `__run__` lines with corrections that point to v0.6.2. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 22:31:50 +03:00
aaltshuler	89b876c797	Add cluster state lock recovery	2026-06-09 22:31:46 +03:00
Andrew Altshuler	737a0f6e45	Merge pull request #159 from ModernRelay/codex/cluster-config-stage2-refresh-import [codex] add cluster refresh and import	2026-06-09 21:25:08 +03:00
aaltshuler	d00d42274e	Implement cluster refresh and import	2026-06-09 21:17:23 +03:00
Ragnor Comerford	e0d88d1295	fix(unique): collision-free tuple key shared by intake and merge, loud on un-keyable types (#160 ) * fix(unique): collision-free tuple key shared by intake and merge, loud on un-keyable types Hardening on top of #133. That PR introduced a shared `loader::composite_unique_key(parts)` joining per-column scalars with U+001F and routed both intake and branch-merge through it, closing the original '\|' vs U+001F separator drift. This takes the shared keying the rest of the way to correct-by-design: - Collision-free by construction: the key is now the tuple of per-column scalar strings (Vec<String>) keyed directly, no separator, so no data value (not even a literal U+001F) can forge a collision. - One scalar converter across both paths: intake used an explicit type-match, merge used Arrow's array_value_to_string. Both now derive the key through composite_unique_key(group_columns, row), so they can't drift on conversion. - Loud on un-keyable types: the scalar converter returned None for any Arrow type it didn't recognize, and the caller treated None as null-exempt, so a @unique on a column type it couldn't reduce (list, blob) was silently un-enforced. It now returns Err, surfacing the constraint it can't enforce instead of weakening it in silence. Tests: - consistency::composite_unique_key_is_consistent_across_intake_and_merge pins that intake and merge key the tuple identically (load-on-branch then merge of values containing '\|'). - loader unit tests pin tuple keying + null exemption and the loud error on an un-keyable (binary) column. Docs: invariants truth-matrix updated; stale loader/mod.rs line pointers fixed. Scope unchanged: intra-batch / merge-candidate-set only; cross-version uniqueness against committed rows stays a documented gap. * fix(unique): cover all string encodings; make format_tuple private (PR #160 review) Addresses two Greptile P2 comments on PR #160: - unique_key_scalar handled only StringArray (Utf8). The loud-on-unknown-type behavior turned any legal string column that read back as LargeUtf8 or Utf8View into a hard write failure (the old code silently returned None). Add LargeStringArray and StringViewArray arms so a legal string column is keyable in every physical Arrow encoding; the Err path now fires only for a genuinely un-keyable logical type (list/blob/vector), never a legal value in an unenumerated encoding. - format_tuple was pub(crate) but only used within loader/mod.rs; make it a private fn (matches the old format_unique_columns it replaced, minimal exposed surface). New unit test unique_key_scalar_handles_all_string_encodings pins that Utf8 / LargeUtf8 / Utf8View all render rather than error.	2026-06-09 19:28:21 +02:00
Ragnor Comerford	dbfdddc952	feat(engine): indexed graph traversal (#149 ) * perf(engine): route Expand node hydration through the id BTREE via structured filter hydrate_nodes built an `id IN (...)` SQL string applied via Scanner::filter, which DataFusion evaluates with InListEval (O(N×M)) rather than using the id BTREE scalar index — measured at 72× the indexed cost on a 100k-node hop (MR-376). Build the id IN-list as a structured DataFusion Expr, AND it with the pushable destination filters, and apply via Scanner::filter_expr (the same path execute_node_scan already uses); Lance then compiles it to scalar-index-search -> take. Destination-filter pushability is now decided by ir_filter_to_expr (structured) instead of ir_filter_to_sql, so list-contains (array_has) pushes down too. Removes the now-dead string-filter helpers build_lance_filter, ir_filter_to_sql, and ir_expr_to_sql; literal_to_sql stays (still used by the mutation delete path). * feat(engine): add TableStore::scan_edges_by_endpoint for indexed neighbor lookup Static helper returning edge rows that match a set of endpoint keys on src/dst, projected to [key_col, opposite_col], via a structured `key_col IN (keys)` filter_expr. Lance routes it through the persisted BTREE on the endpoint column (index-search -> take), so cost scales with the frontier size rather than \|E\|. Unused until execute_expand's indexed mode lands; isolated in its own commit so the storage-layer primitive is reviewable on its own. * feat(engine): add BTREE-indexed Expand traversal path Split execute_expand into a dispatcher over execute_expand_csr (the existing in-memory CSR BFS, unchanged) and a new execute_expand_indexed that serves each hop by batching the frontier into one scan_edges_by_endpoint call against the persisted src/dst BTREE (index-search -> take), then fans out per source row. Both share expand_hydrate_and_align — the destination hydration + alignment + hconcat + in-memory non-pushable filters — which now aligns by string id (a HashMap) instead of a dense row-id vec, so one tail serves both modes. Mode selection is OMNIGRAPH_TRAVERSAL_MODE for now (default csr); the frontier-size auto policy and lazy CSR build follow. AntiJoin stays on CSR. tests/traversal_indexed.rs (its own #[serial] binary, so env writes never race a reader) asserts the indexed path matches CSR for one-hop, multi-hop, cross-type, and no-match cases, and that a freshly-appended unindexed edge is still found (partial index coverage — fast_search=false unindexed-fragment scan). * feat(engine): frontier-size Expand dispatcher + lazy CSR build Replace the env-only mode switch with an auto policy: Expand uses the BTREE-indexed path when the source frontier is small and the hop count bounded (OMNIGRAPH_EXPAND_INDEXED_MAX_FRONTIER=1024, OMNIGRAPH_EXPAND_INDEXED_MAX_HOPS=6), else the in-memory CSR. OMNIGRAPH_TRAVERSAL_MODE=indexed\|csr still forces a mode. Make the CSR index lazy: thread a GraphIndexHandle (memoizing OnceCell over a Cached/Direct/None builder) through execute_query/execute_pipeline/ execute_rrf_query/execute_anti_join instead of a pre-built Option<&GraphIndex>. A query served entirely by the indexed path with no AntiJoin never pays the O(\|E\|) CSR build — the perf win of Tier 3. AntiJoin still realizes the index (its negation uses CSR has_neighbors). Net effect: selective traversals (the common case) skip the whole-graph CSR build and resolve neighbors from the persisted, incrementally-maintained src/dst BTREE. Existing traversal/aggregation/end_to_end/search suites now run the indexed path by default and stay green. Docs: constants.md (new env knobs), query-language.md (Expand dual path), indexes.md (graph index is lazy + the indexed alternative). * test(engine): bench indexed vs CSR selective traversal Add a selective single-source knows{1,2} comparison to bench_expand: per growing \|E\|, time the cold query in csr vs indexed mode (fresh db each, so CSR pays its O(\|E\|) build) and assert both modes return identical rows — a guard against the scalar-index physical_rows silent fallback dropping unindexed-fragment rows. The existing dense hop1/2/3 latency bench is unchanged. * feat(engine): surface silent scalar-index fallback in indexed traversal (C6) Add TableStore::key_column_index_coverage — a metadata-only check (no IO) of whether a `key_col IN (...)` scan will be served by the persisted BTREE or silently fall back to a full filtered scan, mirroring Lance's own decision: no BTREE on the column, or any fragment missing physical_rows (which disables scalar indices for the whole scan, lance dataset/scanner.rs create_filter_plan). execute_expand_indexed calls it once per traversal and tracing::warn!s on Degraded, so the perf cliff is observable instead of hidden behind a bench oracle. Detection-only: results are correct either way (the scan returns all rows). Closes the "no silent failures" gap the traversal best-practice audit flagged as the top deviation, and adds an IndexCoverage value a future cost-based planner can consume. * perf(engine): dense-id BFS on the indexed traversal path (C3) execute_expand_indexed ran its per-source BFS in string space (Vec<HashSet<String>>, HashMap<String,Vec<String>>, ~4 String clones per neighbor occurrence). Intern node ids to u32 once via a per-traversal TypeIndex (no GraphIndex/CSR build — laziness preserved) and run visited/seen/frontier/ neighbor-map in dense u32 space, mirroring the CSR path; de-intern only for the per-hop IN-list and the emitted dst ids handed to the hydrate+align tail. Behavior-preserving — the traversal_indexed CSR-vs-indexed equivalence tests are the guard (results are identical, the key type just changes String -> u32). * refactor(engine): thread the opened edge dataset into indexed Expand Hoist the edge-dataset open and the C6 index-coverage warning out of execute_expand_indexed into execute_expand, threading the opened dataset in as a parameter so it is opened exactly once. Extract the endpoint-column mapping (endpoint_columns) and the coverage warning (warn_on_degraded_coverage) as helpers. Behavior-preserving: same dataset, same warning, same dispatch decision. This only relocates the open so the upcoming cost-based chooser can consult index coverage before dispatch without opening the dataset twice. * feat(engine): cost-based Expand dispatch chooser (C5) Replace the fixed frontier<=1024 && hops<=6 dispatch threshold with a pure, IO-free cost model. choose_expand_mode compares the indexed path's frontier-relative work (hops * frontier * fanout, or hops * \|E\| when BTREE coverage is degraded) against the cost of building the whole-graph CSR (BUILD_FACTOR * \|E\|), from cheap manifest row counts. Under good coverage this reduces to a selectivity ratio independent of \|E\|, preserving the flat-in-\|E\| indexed win for selective traversals while routing dense / deep / high-fanout or degraded-and-expensive traversals to CSR. execute_expand decides cardinality-first and only opens the edge dataset to confirm coverage when it leans indexed (no open on a clearly-CSR traversal). The two env knobs become hard ceilings layered on the model; the OMNIGRAPH_TRAVERSAL_MODE override still forces a path; the chosen mode is traced. Results are unchanged across modes — only the path differs. Adds inline crossover unit tests and extends the traversal_indexed both_modes harness with an auto pass asserting the chooser is result-preserving across every traversal shape. Documents the new flag semantics in docs/user/{constants,query-language}.md. * test(engine): pin Lance scalar-index coverage + system-column/deletion-metadata surface Add three Lance surface guards de-risking a future persisted-adjacency cache: - a compile-only guard pinning the fragment physical_rows + index-detail surface that key_column_index_coverage mirrors (the C6 fallback); - a runtime probe confirming a scalar BTREE on the system column _row_last_updated_at_version is not buildable via the normal create-index path (the column is not in the user schema), so a version-column range delta is not viable as drafted; - a runtime probe confirming per-fragment deletion metadata (deletion_file.num_deleted_rows) is available as cheap O(fragments) metadata, the primitive a fragment-coverage delete model would rely on. The probes turn the two largest substrate assumptions into green/red CI facts before any cache work begins. * test(engine): regression for cross-type id-collision in indexed traversal A node id is unique only within a type, so a Person and a Company can share an id string. A variable-length traversal over a cross-type edge (WorksAt) must structurally stop after one hop. This test builds a graph where 'shared' is both a Person and a Company id and asserts worksAt{1,2} returns only the one-hop company. It fails today: the indexed path's single string interner de-interns the hop-1 Company id back to the colliding Person id and runs a hop-2 scan that matches that Person's edges, emitting a spurious second-hop company (indexed ["other","shared"] vs csr ["shared"]). * fix(engine): structurally cap cross-type Expand at one hop A cross-type edge cannot chain (e.g. a Company is not a WorksAt source), so a variable-length traversal over one is structurally single-hop. Both traversal paths now enforce this by capping max hops at 1 when from_type != to_type, instead of relying on the hop-2 scan returning empty. That reliance was a correctness hole on the indexed path: it interns every endpoint string into one dense id space, so a cross-type id-string collision (a Person and a Company sharing an id) let hop 2 de-intern a destination id back to the colliding source-type id and match its edges, emitting rows the CSR path never produces. With the cap the cross-type second-hop scan never runs, so the shared interner can no longer alias across types. Turns the regression test green (indexed == csr == ["shared"]). * perf(engine): set-oriented filtered anti-join, remove per-row dispatch execute_anti_join's filtered slow path sliced the outer batch to one row at a time and re-ran the inner pipeline per row, so each 1-row inner Expand dispatched to the indexed path — one Lance scan per outer row, while the CSR realized up front sat unused. Replace it with a set-oriented anti-semi-join: tag each outer row with a synthetic index column, run the inner pipeline once over the whole frontier (the tag survives Expand's hconcat and Filter's row-drop), then exclude outer rows whose tag survived. The inner Expand now runs as a single set-at-a-time traversal over the full frontier; config is read once per operator, not per row (the env nit is mooted). A produced-but-untagged inner batch fails loudly rather than silently keeping every row. Results are unchanged (the predicated-negation tests exercise the path over a multi-row outer with dst-filters). * test(engine): drop flaky wall-clock budget from the merge truth table The 30s wall-clock assertion in merge_pair_truth_table flakes under parallel test load: it tripped at ~31s in the full --test-threads=4 gate while passing at ~20s in isolation. A fixed time budget in a correctness test depends on machine and parallelism, not correctness; elapsed is still logged for visibility, and a real merge-perf regression belongs in a bench. The cell-count correctness assertions (81 / 36 / 45) are unchanged. * fix(engine): total deterministic ORDER via entity-key tie-break + NULL contract apply_ordering used an unstable lexsort with no tie-break, so rows with equal user-sort keys came out in a run-dependent order (the input order depends on scan parallelism / upstream hashing) — making ORDER ... LIMIT non-deterministic, a latent deny-list violation (no nondeterministic result ordering). Append the bound entities' key columns (<var>.id, unique per row) in canonical name-sorted order as ascending tie-breaks, giving a total, reproducible order (and a deterministic top-N when ties straddle the LIMIT cutoff). NULL placement (nulls_first = !descending) is unchanged and now documented as the contract. New tests/ordering.rs locks descending, multi-key precedence, the deterministic key tie-break (data loaded in a different order than the expected output, so it proves the tie sorts by key not by load order), and NULL placement under ASC/DESC. docs/user/query-language.md documents the total-order + NULL contract. * test(engine): property-based query-correctness invariants over generated graphs Adds a proptest harness (new dev-dep) that generates small graphs whose Person and Company keys are drawn from a shared 5-key alphabet, so cross-type id collisions, cycles, and self-loops arise by search rather than from one hand-built fixture. Three invariants: - prop_expand_indexed_eq_csr: csr == indexed == auto over knows{1,3} (same-type, cycles) and worksAt{1,2} (cross-type, collision-prone) from every start. - prop_results_subset_of_existing_nodes: no phantom rows (catches over-emission even if both modes are wrong identically). - prop_antijoin_partitions_persons: not{worksAt} and its complement are disjoint and cover all persons. Verified the guard bites: neutering the cross-type hop cap makes prop_expand_indexed_eq_csr fail and proptest shrinks it to persons["c","e"] / companies["b","c"] — the cross-type collision class the hand-built fixture only sampled once. Tests are sync + #[serial] (per-case runtime; the mode test writes OMNIGRAPH_TRAVERSAL_MODE). * test(engine): cover cycle/self-loop termination + nested anti-join (C5 edge cases) - variable_hops_terminate_and_dedup_on_cycle: a 3-cycle a->b->c->a traversed with knows{1,5} (ceiling above the cycle length) terminates and emits each node once (the c->a back-edge hits the seeded source); both_modes confirms indexed == csr. Uses a bounded range deliberately — unbounded {1,} is a typecheck error, not a runtime path. - variable_hops_handle_self_loop: a->a self-loop does not loop forever and does not re-emit the seeded source. - nested_anti_join_double_negation: not { worksAt; not { name = Acme } } recurses through execute_pipeline, yielding [Alice,Charlie,Diana] (people with no non-Acme employer) — distinct from plain unemployed [Charlie,Diana]. * test(engine): execution goldens for typed-literal filters (C4 gap #4) New literal_filters.rs covers filtering by F64/F32/Bool/Date/DateTime LITERALS across both arms: standalone comparisons ($m.score > 1.5, $m.ratio <= 0.25, $m.active = true, $m.born >= date(...), $m.seen < datetime(...)) exercise the in-memory comparison path, and inline bindings (Metric { active: true }, Metric { score: 3.0 }) exercise Lance filter_expr pushdown. Seeds partition each predicate so a dropped/miscast filter returns all rows. (Param-bound scalars and list-column contains are covered elsewhere.) * test(engine): full rank-order goldens for nearest + bm25 (gap #2) Existing search tests stopped at top-1 (nearest) or non-empty (bm25), so a regression corrupting ranks 2..k or reversing the sort direction passed CI silently. Pin the FULL ordered slug list: nearest([0.1,0.2,0.3,0.4]) -> [ml-intro, nlp-guide, rl-intro] (ml-intro exact at dist 0, rest by ascending L2); bm25(Learning) -> [rl-intro, ml-intro, dl-basics] (descending score). nearest/bm25 skip apply_ordering (is_search_ordered) and return Lance native order, so result_slugs row order == rank order; values resolved by running and confirmed stable across runs. * test(engine): search fuzzy/match_text characterization + RRF non-default pairings - match_text_matches_exact_set_excludes_unrelated: match_text(body,'neural') == [dl-basics] exactly (not just contains). - fuzzy_does_not_match_under_default_tokenizer: characterizes that fuzzy() is inert with the default tokenizer here (search/match_text work, fuzzy returns nothing); turns red — to be promoted to a real golden — if fuzzy starts matching. - rrf_fuses_two_fts_fields / rrf_fuses_two_vector_queries: RRF fuses arms other than the default nearest+bm25 (bm25 title+body; two vector queries), proving primary_var resolves and fusion runs. New fixtures/search.gq queries + two_vector_params helper. Orders resolved by running, confirmed stable. * test(engine): anti-join fast-vs-slow path equivalence harness anti_join_fast_and_slow_paths_agree: the CSR has_neighbors fast path (not { $p worksAt $_ }) and the set-oriented inner-pipeline replay (same negation forced slow by an always-true $c.name != "" dst filter) must produce the same result ([Charlie, Diana]). Closes the second real engine fork explicitly. * test(engine): regression for nested slow-path anti-join tag collision A nested not { ... not { ... } } where both levels hit the set-oriented slow path collides on the fixed __antijoin_outer_row correlation column: the inner call appends a duplicate, and column_by_name reads the OUTER tag. Fan-out (p1 works at two companies) makes inner row indices diverge from outer tags, so the bug returns the wrong person set. Fails on current code (left ["p2","p4"] vs right ["p3","p4"]). * fix(engine): collision-free anti-join correlation tag for nested negation The set-oriented anti-join tagged the outer batch with a fixed column name and read it back by name. Under a nested slow-path anti-join the enclosing tag rides through the inner pipeline, so the inner call produced a duplicate field; Arrow permits duplicate names and column_by_name returns the first, so the inner negation mis-correlated against the outer row indices. Choose a tag name not already present in the batch (suffix-incremented), so each nesting level reads its own correlation column. Turns the fan-out regression green; the existing nested/fast-vs-slow/proptest anti-join invariants still pass. * fix(engine): cap cross-type hops in the Expand cost model gather_cost_inputs fed the requested max_hops into choose_expand_mode even though execute_expand_indexed runs at most one hop for a cross-type edge. So a cross-type variable-length expand (e.g. worksAt{1,5}) had its indexed cost scaled by 5 while only one hop runs, skewing the chooser toward CSR (an unnecessary whole-graph build) near the crossover. Results were unaffected (modes are equivalent); this is a plan-accuracy fix. Add cost_effective_hops(requested, same_type) — caps to 1 for cross-type — and apply it in gather_cost_inputs so the estimate matches what executes. Unit test covers the cap and the crossover consequence (capped 1 hop stays indexed where the requested 5 would have flipped to CSR). * perf(engine): realize anti-join CSR lazily + reuse a warm CSR in the chooser Two CSR build/reuse fixes flagged on the set-oriented anti-join work (results unchanged — plan/perf accuracy): - execute_anti_join called graph_index.get() (the O(\|E\|) whole-graph CSR build) unconditionally, but only the bulk fast path consumes it; a filtered/nested slow-path anti-join's inner Expand picks its own access path. Gate the build on a pure shape predicate (bulk_anti_join_applies) so a selective anti-join over a large graph no longer pays a build it won't use. - gather_cost_inputs hardcoded csr_cached=false, so once an earlier op realized the CSR, later Expands still cost it as a cold build and could pick per-hop indexed scans over reusing the warm in-memory CSR. Add GraphIndexHandle:: is_built() and thread it through so the chooser reuses a materialized CSR. Anti-join, cross-type, proptest-equivalence, and chooser unit tests stay green. * test(engine): RAII traversal-mode guard in proptest equivalence prop_expand_indexed_eq_csr set/cleared OMNIGRAPH_TRAVERSAL_MODE manually; a panic between set and clear (e.g. a query unwrap on a generated case) would leak the forced mode into proptest's shrink/subsequent cases and mask the divergence under test. Replace with a ModeGuard that clears on drop (including on unwind), scoping the forced mode to a single query. * test(engine): regression for multi-hop anti-join hop bounds The bulk anti-join fast path answers via has_neighbors (one-hop existence), so not { $p knows{2,2} $x } wrongly drops a node with a 1-hop neighbor but no 2-hop path. On a->b (sink) and c->d->e, only c has a 2-hop path; the query should keep [a,b,d,e]. Fails on current code (left ["b","e"] — only the sinks). * fix(engine): restrict anti-join bulk fast path to one-hop expands bulk_anti_join_applies accepted any single Expand, but try_bulk_anti_join_mask decides via the CSR has_neighbors one-hop existence check — wrong for multi-hop negations. Require min_hops==1 && max_hops==1 in the predicate; anything else falls to the slow path, whose inner Expand runs the real bounded traversal. Turns the multi-hop regression green; one-hop anti-joins unchanged. * fix(engine): IndexCoverage reports Degraded for uncovered fragments key_column_index_coverage checked BTREE-exists + physical_rows but not that the index actually covers the current fragments. Since edge-index creation is skipped once a BTREE exists, fragments appended later stay unindexed while coverage still reported Indexed — so the cost chooser priced a partly-full scan as fully indexed. Compare the BTREE's fragment_bitmap (public on lance_table IndexMetadata) against the dataset's current fragment ids; report Degraded when any are uncovered. A None bitmap means Lance can't report coverage — don't over-degrade. Results are unaffected (the scan returns unindexed-fragment rows either way); this corrects the cost signal. Test: a freshly-loaded edge BTREE is Indexed; after appending an edge the new fragment is uncovered → Degraded. Surface guard pins IndexMetadata.fragment_bitmap. * docs: clarify the Expand frontier ceiling bounds the initial dispatch frontier The cap is applied at dispatch on the initial frontier; per-hop fan-out (union_dense) is not hard-capped. Correct the constants.md and query-language.md claims: the ceilings bound the initial-dispatch frontier/hops, the cost model estimates total indexed work as ~hopsfrontierfanout (pricing dense fan-out toward CSR), and per-hop work is not a hard bound. Drops the overstated 'hard caps bound indexed work' / 'cost ∝ frontier' wording.	2026-06-09 18:09:13 +02:00
Andrew Altshuler	59b64ea097	Merge pull request #158 from ModernRelay/cluster-config-docs [codex] integrate cluster config read-only state stack	2026-06-09 18:54:12 +03:00
aaltshuler	2f19656c0e	fix(cluster): tighten state lock observations	2026-06-09 18:30:33 +03:00
aaltshuler	b046515e1c	Merge origin/main into cluster-config-docs	2026-06-09 18:11:12 +03:00
Andrew Altshuler	4d7c1ef42f	Merge pull request #153 from ModernRelay/codex/cluster-config-stage2-state-ledger [codex] Add cluster JSON state ledger status	2026-06-09 17:46:14 +03:00
Andrew Altshuler	83785a8c4d	Merge pull request #152 from ModernRelay/codex/cluster-config-stage1 [codex] add read-only cluster validate and plan	2026-06-09 17:44:29 +03:00
devin-ai-integration[bot]	b7f5276ab5	fix(loader): enforce composite @unique(a, b) as a true composite key (#133 ) * fix(loader): enforce composite @unique(a, b) as a true composite key Node/edge composite uniqueness constraints were flattened into a single list of property names, so @unique(a, b) was enforced as independent single-field checks @unique(a) AND @unique(b) at intake. Preserve the constraint grouping and check each group as a composite key, mirroring the merge-path enforcement. Error messages now name the full composite. MR-983 * docs: clarify unit-separator comment in composite unique check * docs: fix separator reference in composite unique comment (merge.rs also uses U+001F) * fix(merge): align composite @unique key separator with intake (U+001F) The branch-merge path (update_unique_constraints) joined composite key columns with '\|', while intake joins with U+001F. The same @unique(a, b) was keyed two different ways, and '\|'-join can raise phantom merge conflicts for values containing '\|' (e.g. ('x\|y','z') vs ('x','y\|z')). Factor the tuple-join into one shared helper (loader::composite_unique_key) so the intake and merge paths cannot drift again. Add branching regression tests for edge @unique(src, dst) on the merge path. Refs MR-983. --------- Co-authored-by: Ragnor Comerford <ragnor.comerford@gmail.com> Co-authored-by: Andrew Altshuler <andrew@collectivelab.io>	2026-06-09 17:17:31 +03:00
Ragnor Comerford	131b78705d	release: v0.6.2	2026-06-09 15:59:59 +02:00
Ragnor Comerford	d0e39e677e	fix(maintenance): route uncovered drift through repair (#156 ) * docs(invariants): note the non-atomic manifest->commit-graph publish gap Every graph publish commits __manifest then appends _graph_commits as two separate writes; a crash between them leaves the manifest ahead of the commit DAG. Live reads + durability are unaffected (reads resolve via the manifest) and recovery does not repair it; impact is bounded to commit history / time-travel by commit id / merge-base completeness. Pre-existing across all publishes, not the optimize reconcile specifically. Documented as a Known Gap; the fix is a commit-graph reconcilable from the manifest, not a recovery sidecar. * fix(maintenance): route uncovered drift through repair * fix(maintenance): harden repair review feedback	2026-06-09 14:42:54 +02:00
Andrew Altshuler	5eead8d29e	ci(branch-protection): let code owners bypass required PR review (#154 ) Some checks failed CI / Classify Changes (push) Has been cancelled Details CI / Check AGENTS.md Links (push) Has been cancelled Details CI / Container Entrypoint (push) Has been cancelled Details Release Edge / Prepare edge release (push) Has been cancelled Details CI / Test Workspace (push) Has been cancelled Details CI / Test omnigraph-server --features aws (push) Has been cancelled Details CI / RustFS S3 Integration (push) Has been cancelled Details Release Edge / Build edge omnigraph-linux-x86_64 (push) Has been cancelled Details Release Edge / Build edge omnigraph-macos-arm64 (push) Has been cancelled Details Release Edge / Build edge omnigraph-windows-x86_64 (push) Has been cancelled Details Release Edge / Smoke Windows installer (push) Has been cancelled Details require_code_owner_reviews + count=1 with no bypass meant EVERY PR needed a code-owner approval — including code owners' own PRs, which can't be self-approved, so an owner's PR deadlocked on the other owner (forcing admin overrides). Intended behavior: review is required only for non-owners. Add bypass_pull_request_allowances for the two engineering owners (ragnorc, aaltshuler): they merge their own PRs after CI without a second review; non-owners still require a code-owner approval. CI status checks remain required for everyone. Applied live via scripts/apply-branch-protection.sh. Note: the bypass list mirrors codeowners-roles.yml engineering members by hand (render-codeowners.py doesn't generate it) — keep in sync on owner changes. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 22:26:04 +03:00
Andrew Altshuler	c2a97f4559	ci: drop per-PR Windows release build; bind to release tags (#155 ) The `test_windows_binaries` job ran a full Windows --release build + smoke test on every code PR. It was a non-required (non-blocking) check, so it never gated a merge — it only burned the slowest/most expensive runner (windows-latest, --release, 75-min ceiling) on every code change. Windows binary validation is already covered (better) on release tags: release.yml's `smoke_windows_installer` (on v* tags) builds the release binaries, installs via scripts/install.ps1, and smoke-runs `omnigraph.exe version` + `omnigraph-server.exe --help` — the same smoke test plus the real installer path. Nothing `needs:` the removed job. Trade-off (accepted): a PR that breaks the Windows build or install.ps1 syntax is now caught at release-cut rather than at PR time. install.ps1 and platform-specific code change rarely; the cost savings on every PR outweigh the earlier signal. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 22:25:33 +03:00
Andrew Altshuler	ce150fb0ca	docs(testing): fix stale optimize test name in maintenance.rs row (#148 ) The maintenance.rs row referenced `optimize_reconciles_preexisting_manifest_head_drift`, which never existed (leftover from the reconcile-drift heuristic removed in #141). The actual second test is `optimize_defers_when_recovery_sidecar_is_pending`. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 22:19:21 +03:00
aaltshuler	a7956ea5a9	Add cluster JSON state ledger status	2026-06-08 21:09:23 +03:00
aaltshuler	043b02e617	feat(cluster): add read-only validate and plan	2026-06-08 20:07:39 +03:00
aaltshuler	ab5f3b878a	docs: add cluster config specs	2026-06-08 17:31:36 +03:00
Ragnor Comerford	e62d9166fb	fix: optimize publishes compaction; recovery roll-back converges manifest (#141 ) * test(optimize): cover manifest publish + HEAD-drift reconcile Red against the pre-fix optimize, which ran compact_files without publishing the compacted version to __manifest: - maintenance: optimize must publish so the manifest table_version tracks the compacted Lance HEAD and a later schema apply succeeds; and must reconcile a pre-existing manifest-behind-HEAD drift (forged via raw Lance compaction) so strict writes commit again. - end_to_end + composite_flow: post-optimize query / strict update / reopen in the full lifecycle (the canonical flow previously omitted post-optimize writes as a documented "known limitation"). - failpoints: a crash between compaction and the manifest publish rolls forward on next open. * fix(optimize): publish compaction to manifest and reconcile HEAD drift optimize ran Lance compact_files without publishing the new version to __manifest, so the manifest table_version lagged the Lance HEAD: reads stayed pinned to the pre-compaction version, and the next schema apply or strict update/delete failed its HEAD-vs-manifest precondition with "stale view ... refresh and retry" (open-time recovery rollback inflated the gap on retry). optimize now publishes each compacted table's version under the per-(table, main) write queue, guarded by a manifest CAS and a SidecarKind::Optimize recovery sidecar (loose-match; roll-forward is safe because compaction is content-preserving). When a table has nothing left to compact but its Lance HEAD is already ahead of the manifest pin (pre-fix drift, or a recovery restore commit), optimize reconciles the manifest forward to HEAD (metadata-only, no sidecar). Caches and the CSR/CSC graph index are invalidated after a publish. Docs updated (maintenance, storage, branches-commits, writes, testing). * test(recovery): rollback convergence + optimize-defer regressions Red against the current code, landed before the fix: - recovery: after the open-time sweep rolls a sidecar back, the manifest must track Lance HEAD (no residual drift) so a follow-up schema apply succeeds — the original "+1 per retry" loop. Today roll-back restores without publishing, so the manifest lags HEAD and the apply fails its HEAD-vs-manifest precondition. - maintenance: optimize must refuse while a recovery sidecar is pending — operating on an unrecovered graph could publish a partial write the sweep would roll back. Also removes optimize_reconciles_preexisting_manifest_head_drift: the ad-hoc drift reconcile it covered is replaced by recovery-side convergence. * fix(recovery): converge manifest on roll-back; optimize defers on pending recovery Root of PR #141's review findings and the original "+1 per retry" loop: a Lance HEAD ahead of the manifest was ambiguous (benign content-preserving drift vs. a partial write a sidecar will roll back), and optimize's reconcile guessed it benign. Close the class instead of guessing: - Recovery roll-back now PUBLISHES the restored version (via a push_table_update_at_head helper shared with roll-forward), so the manifest tracks the Lance HEAD after recovery — symmetric with roll-forward. This fixes the +1 loop (after one roll-back the retry's HEAD-vs-manifest precondition passes) and removes the only remaining source of orphaned drift. The audit still records the logical rolled-back-to version; the manifest is published at the restore commit (identical content). - optimize drops the ad-hoc drift reconcile and instead REFUSES when a __recovery sidecar is pending, so it only ever operates on a recovered graph (manifest == HEAD); its compaction publish can no longer commit a partial write. With the reconcile gone, the blob-skip-vs-reconcile gap is moot. Updates the rollback recovery-test helper (manifest == HEAD after roll-back), the failpoints assertions, and the user/dev docs. * test(recovery): fix rollback assertion for manifest convergence The roll-back-publishes change makes the manifest version advance after a SchemaApply roll-back (to the old-schema content), so the schema_apply_without_schema_staging_rolls_back_on_next_open assertion must be `version > pre`, not `version == pre`. This update was dropped during the commit churn and surfaced as a CI Test Workspace failure; the old-schema-preserved intent stays covered by count_rows + _schema.pg + the RolledBack convergence invariant.	2026-06-08 02:50:12 +03:00
Aaron Goh	4a66d6e071	fix(loader): accept multi-line (pretty-printed) JSON in load (#146 ) The loader read input line-by-line (reader.lines() + serde_json::from_str per line), so any delta where a JSON object spanned multiple lines failed with 'invalid JSON on line 1: EOF while parsing an object'. Compact JSONL worked; pretty-printed JSON never did. Switch to a streaming value deserializer (Deserializer::from_reader().into_iter::<Value>()), which treats any whitespace (including newlines inside objects) as a separator — so both compact JSONL and pretty-printed JSON load. Error labels switch from line numbers to record numbers (line numbers are meaningless once objects span lines). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Co-authored-by: Ragnor Comerford <ragnor.comerford@gmail.com>	2026-06-07 20:37:37 +02:00
Ragnor Comerford	54842808db	feat(engine): sweep & remove legacy __run__ branch guard (MR-770) (#132 ) * feat(engine): sweep legacy __run__ branches via v2→v3 manifest migration Pre-v0.4.0 graphs can carry stale `__run__<id>` staging branches on the `__manifest` dataset, left by the Run state machine removed in MR-771. Lance's `list_branches` still enumerates them, so they leak into `branch_list()` and count as blocking branches at schema-apply time. Add a one-time `migrate_v2_to_v3` arm to the internal-schema dispatcher: on the first read-write open it enumerates `__manifest` branches, deletes every `__run__` ref, and bumps the stamp to 3. Idempotent under retry (re-enumerates fresh each run). The `"__run__"` prefix is inlined so the migration does not depend on the run_registry guard that MR-770 removes next. This is the prerequisite sweep; the guard removal follows in the next commit. refactor(engine): remove the legacy __run__ branch guard (MR-770) With the v2→v3 migration sweeping stale `__run__` branches off `__manifest` on first read-write open, the defense-in-depth `is_internal_run_branch` guard is no longer needed. - delete `db/run_registry.rs`; drop the module + re-export from `db/mod.rs` - collapse `is_internal_system_branch` to the schema-apply-lock check only - `ensure_public_branch_ref`: drop the run-ref rejection; `__run__` is now an ordinary branch name - `branch_merge`: reject `is_internal_system_branch` (was run-only) so the schema-apply lock is rejected consistently with create/delete — a small, deliberate tightening - update the inline schema-apply test + the writes integration tests (`public_branch_apis_reject_internal_run_refs` → `public_branch_apis_reject_internal_system_refs`, which also asserts `__run__` now creates successfully) - docs: flip the "pending production sweep / defense-in-depth" notes to "auto-swept by the v2→v3 migration"; document the read-only-open limitation Known residual: the inert `_graph_runs.lance` / `_graph_run_actors.lance` bytes remain until a `StorageAdapter::delete_prefix` primitive lands. fix(engine): run __run__ sweep at Omnigraph::open, not only on publish Review (PR #132) caught a regression: removing __run__ from `is_internal_system_branch` exposed legacy `__run__` branches to the schema-apply blocking-branch checks (schema_apply.rs:104 and :778) and to `branch_list()`, but the v2→v3 sweep ran only inside the publisher's `load_publish_state`. On a pre-v0.4.0 graph whose first write is a schema apply, the blocking-branch check fires before any publish, so apply failed with "found non-main branches: __run__…". The same lazy timing also created a reverse hazard: a user-created `__run__` branch on a still-v2 graph could be deleted by the first publish's sweep. Fix: run the internal-schema migration in `Omnigraph::open(ReadWrite)` (new `manifest::migrate_on_open`), before the coordinator reads branch state. The sweep now lands before any branch-observing code, and a graph is stamped v3 at open — so the one-time sweep can never catch a legitimately-created branch. Both checks and `branch_list` see the swept graph; correct by construction for every write path. Accepted residual: a read-only open of an unmigrated legacy graph still lists `__run__` (read-only opens must not write, so they can't sweep). Documented. Regression test `legacy_run_branch_is_swept_on_open_and_does_not_block_schema_apply` confirmed RED before the fix (panicked on the branch_list leak assertion) and GREEN after. Also updates the stale schema_apply.rs comment, the writes.md "Migration code" section, and adds the v3 row to storage.md's migration table. test(engine): sweep multiple legacy __run__ branches; doc nit Strengthen the v2→v3 migration test to synthesize three `__run__` branches (a real legacy graph accumulates one per run) so the migration's delete loop is exercised on a single reused dataset handle, not just a single branch. Confirms multi-branch deletion is safe. Also drop a stale "active runs" reference from the branch_delete doc line. fix(engine): force-delete in __run__ sweep for concurrency safety `migrate_v2_to_v3` ran `Dataset::delete_branch` (= `branches().delete(.., false)`), which errors "BranchContents not found" if the branch is already gone. Since the sweep now runs in `Omnigraph::open(ReadWrite)`, two processes opening the same legacy v2 graph concurrently would race: one wins each delete, the other's open fails. The migration only claimed idempotency under sequential retry. Switch to `Dataset::force_delete_branch` (= `delete(.., true)`), Lance's documented path for cleaning up zombie branches, which tolerates an already-absent branch. The sweep is now idempotent under concurrent runners and robust to partial/zombie state. Found in self-review; no behavior change for the common single-open path. * docs(release): note MR-770 __run__ cleanup in v0.6.1 * docs(branches): reconcile branch cleanup semantics	2026-06-07 18:33:14 +03:00
Andrew Altshuler	fd8e078a77	ci(codeowners): add aaltshuler to engineering role (#147 ) Restores aaltshuler as an `engineering` code-owner (removed in #142), so `crates/**` and repo-infra PRs have a second reviewer besides the sole owner ragnorc — unblocking review of author-ragnorc PRs (e.g. #132) that ragnorc cannot self-approve. Edited the source of truth (.github/codeowners-roles.yml) and re-rendered .github/CODEOWNERS + the docs/dev/codeowners.md tables via .github/scripts/render-codeowners.py, per the documented flow. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 18:05:01 +03:00
Andrew Altshuler	343f1f17ed	governance: external contribution model (issues/discussions/RFCs/PRs) (#143 ) Formalize the public contribution surface. Maintainers keep a separate internal process and are exempt from the intake gates; everyone stays bound by review, CODEOWNERS, and branch protection. Model: - Issues = problem reports only (bug form + config.yml redirects ideas to Discussions and disables blank issues). - Discussions = ideas + RFC incubation. - RFCs = anyone (incl. external) authors docs/rfcs/NNNN-.md; a maintainer merging it is acceptance. Distinct from the maintainer-internal docs/dev/rfc-00N- track. - PRs = link an `accepted` issue or accepted RFC, or use the trivial fast-lane (typos/docs/deps). Enforced softly to start (template + review). Adds GOVERNANCE.md, rewrites CONTRIBUTING.md, adds docs/rfcs/ (README + template), .github issue/PR/discussion templates. Wires docs/rfcs/ into the doc-link checker (excluded like releases; linked from docs/dev/index.md). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 23:58:08 +03:00
Andrew Altshuler	c7365bf8ef	ci(codeowners): un-trap required checks, auto-render, generate owner tables (#142 ) The CODEOWNERS required checks blocked every PR — the real root cause was a name mismatch, compounded by a path filter: - branch-protection.json required the contexts `CODEOWNERS / drift` and `CODEOWNERS / noedit` (the GitHub UI "workflow / job-id" display form), but the jobs report check-run names from their `name:` fields — "CODEOWNERS matches source" / "CODEOWNERS not hand-edited". The required contexts therefore never matched any reported check and sat permanently pending. - The workflow was also path-filtered to CODEOWNERS files, so it didn't even run for most PRs. Net effect: with both required checks unsatisfiable, every PR could only land via admin override (e.g. #140). Fixes: - A: drop the `paths:` filter so the workflow runs on every PR and both required contexts always report. - name fix: point branch-protection.json at the actual job names verbatim, and add a doc note that the contexts must equal the job `name:` values. - B: the `drift` job now re-renders and, on same-repo PRs, auto-commits the regenerated artifacts back to the branch (mirrors the openapi.json job in ci.yml); forks / manual runs strict-check instead. Contributors no longer run the script by hand. - D: render-codeowners.py also generates a "who owns what" path->owners + roles table spliced into docs/dev/codeowners.md between markers, so the human-readable view never drifts. Idempotent; CODEOWNERS output unchanged. - docs: correct the stale `enforce_admins: true` line (JSON and live are false). NOTE: the branch-protection.json change only takes effect after an admin runs `./scripts/apply-branch-protection.sh` (deliberate manual step, per docs/dev/branch-protection.md). Until then `main` still requires the old mismatched contexts, so this PR itself needs an admin-override merge — the last one that should be necessary. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 18:09:47 +03:00

1 2 3 4 5 ...

387 commits