omnigraph

mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-09 01:35:18 +02:00

Author	SHA1	Message	Date
Ragnor Comerford	24413844ae	Add Windows release binaries (#127) * Add Windows release binaries * Fix Windows installer downloads	2026-05-30 14:23:40 +02:00
Andrew Altshuler	587fbeabd8	ci(publish-crates): set User-Agent + treat "already exists" as success (#117) Two related fixes uncovered while recovering the v0.5.0 publish. 1. crates.io API requires a User-Agent header. The `publish_if_new` skip check was doing a bare `curl -fsSL https://crates.io/api/...` which crates.io rejects with HTTP 403. With `-f` curl exits non-zero, the pipeline returns empty, the script doesn't recognize already-published crates, and we fall through to a real publish attempt. On a re-run that means cargo publish errors with "already exists on crates.io index" for crates that DID publish successfully on the previous run. Fix: send a `User-Agent: ModernRelay-omnigraph-ci (URL)` header. 2. Defense in depth: even with the UA, the API could hiccup. If the skip check misses an existing version and cargo publish errors with "already exists on crates.io index", treat as success instead of failing the whole run. This makes the workflow re-runnable after any partial publish without needing manual intervention. Both fixes are required to recover from the v0.5.0 partial publish where omnigraph-compiler@0.5.0 made it through but the run failed before omnigraph-policy / engine / server / cli — re-triggering the workflow now succeeds end-to-end. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 14:19:17 +01:00
Andrew Altshuler	1a9f8b1f7f	ci(publish-crates): include omnigraph-policy in the publish list (#116) omnigraph-policy is a new crate this release cycle (Cedar policy engine, MR-722). It wasn't added to the publish list when it was created, so v0.5.0's tag-triggered publish run succeeded for omnigraph-compiler but failed at omnigraph-engine: failed to prepare local package for uploading Caused by: no matching package named `omnigraph-policy` found location searched: crates.io index required by package `omnigraph-engine v0.5.0` omnigraph-policy has no internal omnigraph-* deps so it can publish after omnigraph-compiler (either could go first). omnigraph-engine depends on both; server on the three; cli on everything. publish_if_new is idempotent — re-running with the v0.5.0 tag after this lands will skip omnigraph-compiler (already published), then publish policy + engine + server + cli. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 14:09:58 +01:00
Andrew Altshuler	cb80fa40f1	exec/query: structured Expr pushdown via Scanner::filter_expr (unblocks CompOp::Contains) (#113) * exec/query: pushdown IR filters via DataFusion Expr (Scanner::filter_expr) Switches `execute_node_scan` from string-flattened Lance SQL pushdown (`build_lance_filter` + `scanner.filter(&str)`) to structured DataFusion Expr pushdown (`build_lance_filter_expr` + `scanner.filter_expr(Expr)`). ## What this enables 1. `CompOp::Contains` now pushes down. `ir_filter_to_sql` returned `None` for list-contains (the comment said "Can't pushdown list contains") because string SQL can't easily express it. With Expr, it lowers to DataFusion's `array_has(col, value)` builtin via the `nested_expressions` feature, and pushes down to Lance's scan layer the same way Eq/Lt/etc. do. Pinned by the new regression test `end_to_end::ir_filter_with_list_contains_pushes_down`. 2. DataFusion 53's optimizer rules now reach our predicates. Once the Expr lands at the Lance scanner, DF's planner runs: - `IN`-list vectorized eq kernel (DF #20528) - `PhysicalExprSimplifier` (DF #20111) - CASE WHEN x THEN y ELSE NULL shortcut (DF #20097) - Push limit into hash join (DF #20228) None of these were applicable before because the string SQL path short-circuited the optimizer. ## Scope This is one of three string-flattened pushdown sites; the other two (`hydrate_nodes`/Expand pushdown at query.rs:771-796 and the mutation delete path in `exec/mutation.rs::predicate_to_sql`) stay on the SQL string path for now: - The Expand pushdown still serializes through `hydrate_nodes`'s `extra_filter_sql: Option<&str>` parameter. Migrating it changes the `TableStorage` trait surface (`scan_stream(filter: Option<&str>)` → `Option<Expr>`) and the cascading call sites — out of scope for this MR. - The mutation delete predicate still goes through `Dataset::delete(&str)` in Lance 6.0.1.
MR-A (delete two-phase via Lance #6658, gated on the Lance v7 bump per issue #112) will migrate that path to `DeleteBuilder::execute_uncommitted` taking an Expr. The existing `ir_filter_to_sql` / `ir_expr_to_sql` / `literal_to_sql` helpers stay in place to serve the remaining string-SQL consumers (mutation predicates). They get retired when the other call sites migrate. ## Cargo Enables the `nested_expressions` feature on the `datafusion` workspace dep. Lance already pulls in `datafusion-functions-nested` transitively (it's listed in their feature set), so this just exposes the `datafusion::functions_nested::expr_fn::array_has` re-export. No transitive dep change (Cargo.lock unchanged). ## Tests - New: `ir_filter_with_list_contains_pushes_down` — pins the case that was previously impossible (`ir_filter_to_sql` returning `None`). - 906/906 workspace tests still pass. - 417/417 engine integration tests pass (was 416 + the new one). - 19/19 failpoints (recovery canary). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ci: pin rustfs/rustfs to 1.0.0-beta.3 (last known-good before creds-policy break) The RustFS S3 Integration job started failing 2026-05-23 with all 3 tests panicking on the first PUT: HTTP error: error sending request The "Dump RustFS logs on failure" step revealed the container was dying at startup: [FATAL] Server encountered an error and is shutting down: Default root credentials are not allowed on non-loopback listeners; set RUSTFS_ACCESS_KEY and RUSTFS_SECRET_KEY to non-default values, bind to loopback, or set RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true for local development only `rustfs/rustfs:latest` was updated 2026-05-21 (1.0.0-beta.4) with a credentials-policy check that rejects `rustfsadmin`/`rustfsadmin` as "default" values. PR #111 passed yesterday because it ran against beta.3; today's runs against beta.4 fail at container startup. 
This is unrelated to PR #113's Expr-pushdown refactor — the bump just happened to hit the same week. Pin to 1.0.0-beta.3 (2026-05-14, last tag before the change). The right long-term fix is one of: - Rotate the CI creds to less-default values (less coupling to RustFS's "default" set definition) - Set `RUSTFS_ALLOW_INSECURE_DEFAULT_CREDENTIALS=true` per the error message - Use a workflow service container with controlled lifecycle Deferred — pinning is the minimal restore. Also incidentally documents which version we tested against, which `:latest` never did. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 12:47:33 +01:00
Andrew Altshuler	730712b73f	codeowners: yml source of truth + generator + drift CI (#88) * codeowners: generator + drift CI + initial roles Source-of-truth approach to CODEOWNERS: yml is hand-edited, CODEOWNERS is generated and CI-enforced. Every role change is a reviewable PR with a permanent in-repo audit trail. No GitHub UI clicks, no shadow state. Initial roles: engineering @aaltshuler owns crates/** + default (.github/, scripts/, Cargo., openapi.json, everything else not docs) docs @aaltshuler @ragnorc owns docs/, README.md, AGENTS.md, CLAUDE.md, SECURITY.md Per GitHub semantics, multiple owners on a CODEOWNERS line means "any one satisfies the review" — for docs, either named member can approve. Strict "N distinct approvers" would need a CI workaround (not wired today; tracked for future hardening). Components: - .github/codeowners-roles.yml — source of truth. Edit this. - .github/scripts/render-codeowners.py — generator (PyYAML; ~100 LoC). - .github/CODEOWNERS — generated. CI rejects hand-edits. - .github/workflows/codeowners.yml — two checks: drift: re-render and assert CODEOWNERS matches. * noedit: reject PRs that edit CODEOWNERS without editing the yml. - docs/codeowners.md — explains the source-of-truth pattern, how to change roles, how to add new roles. - AGENTS.md topic-index row. What's NOT in this PR: - Branch protection on main (separate PR; needs `gh api` call against the org). - Required-reviewer enforcement (depends on branch protection landing). - Required CI status checks (depends on branch protection landing). - Scheduled rotation (the schedule: block in the yml + a weekly workflow). Today's roles are stable; rotation isn't needed yet. - Linear-as-source-of-truth integration (Approach 4 from the design discussion; deferred). Verified: - Generator output is deterministic (idempotent re-runs). - scripts/check-agents-md.sh OK (28 links, 28 docs).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * codeowners: fix catch-all ordering (Devin review #88) Devin caught a real bug: GitHub CODEOWNERS uses "last match wins" semantics, but the generator emitted the catch-all `*` AFTER specific patterns. Net effect: `*` won for every file, silently nullifying the docs role and never routing reviews to @ragnorc. Fix is one-line — emit the default `*` line before iterating the specific paths. Also: - Added a regression assertion in the generator: after rendering, the first non-comment line must start with `*` if a default is configured. Generator exits non-zero otherwise. Catches the same class of mistake in any future refactor. - Rewrote the yml header comment, which incorrectly stated "keep more-specific paths after broader patterns" (correct for GitHub semantics but the generator was doing the opposite — so the comment read as a description of behavior when it was actually a contradicted intention). Verified by re-rendering: `*` is now line 12, `crates/` is line 14, `docs/` is line 15, etc. README.md matches both `*` and `README.md`; `README.md` is later → wins → @aaltshuler + @ragnorc both assigned. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-13 17:26:06 +03:00
Andrew Altshuler	92e3886cfa	ci: add publish-crates workflow for crates.io releases (#74) The release.yml workflow builds binaries and updates Homebrew but never published to crates.io — v0.4.0 and v0.4.1 are missing from the registry even though the local Cargo.toml and the v0.4.1 tag are at 0.4.1. This adds a separate workflow that: - auto-publishes on every v* tag push (future releases self-publish) - can be manually dispatched with a tag input (catch up on v0.4.1) - is idempotent: skips a crate if its current crates.io version already matches local, so a partial failure is safe to retry - gates on CARGO_REGISTRY_TOKEN (already in repo secrets); skips cleanly if the token is ever rotated out Publishes in dependency order: omnigraph-compiler → omnigraph-engine → omnigraph-server → omnigraph-cli. Path-only deps in Cargo.toml carry explicit version fields, so cargo publish strips paths and resolves against crates.io. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-08 15:48:37 +03:00
Ragnor Comerford	675568ce85	ci: fold failpoints test into Test Workspace job The standalone test_failpoints_feature job took 21min on first run (cold cache; the omnigraph-engine crate has lance + datafusion deps that make any fresh build expensive). Folding into Test Workspace shares the warm cache so the failpoints invocation is incremental — ~30s vs 21min on subsequent runs, and within the workspace job's existing budget. The failpoints feature is gated behind a Cargo flag and only adds the small `fail` crate dep + a few feature-gated code paths; it doesn't change the dep tree of any other crate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 21:15:14 +02:00
Ragnor Comerford	052b6e680f	MR-794 step 2: address PR #68 follow-up review (Cubic) — pending dedupe + projection guard + CI Three new findings from Cubic on commit `3223b51`: * Pending edge cardinality counted within-input duplicates (P2): count_src_per_edge's pending walk added every row to the count, including duplicate rows that finalize will collapse via dedupe_merge_batches_by_id. A LoadMode::Merge with the same edge id twice would over-count → spurious @card violation. Fix: when dedupe_key_column is Some, walk pending in reverse, track seen keys via HashSet, count only the kept (last-occurrence) rows. Mirrors finalize-time dedupe so cardinality counts what stage_merge_insert actually publishes. * scan_with_pending silently disabled merge-shadow when projection omitted key_column (P2): if a caller passed Some("id") as key_column but their projection didn't include "id", the filter_out_rows_where_string_in helper passed batches through unchanged — silently degrading to union semantics. Fix: validate up front that projection contains key_column when both are Some; return a typed Lance error otherwise. Tightened the helper too: missing column is now an internal error (was a silent passthrough). * Cascade-vs-explicit delete test was too weak (P2): asserted only that edge count decreased after delete. The cascade alone could satisfy that even if the explicit second-delete silently no-op'd. Strengthened: assert post_knows == 0, which only holds when both ops landed (Bob→Diana would survive if op-2 no-op'd). CI gap: also added test_failpoints_feature job to .github/workflows/ci.yml. The workspace test runs without --features failpoints (the feature is behind a Cargo flag), so the failpoints test suite was never exercised by CI before now. The new job builds + runs `cargo test -p omnigraph-engine --features failpoints --test failpoints` on every full CI run, mirroring the test_aws_feature pattern. 
New tests on tests/runs.rs: * load_merge_mode_dedupes_within_pending_for_cardinality_count (Cubic P2 #2 — pending-vs-pending dedup, distinct from the load_merge_mode_dedupes_edge_for_cardinality_count test which covers committed-vs-pending dedup). * scan_with_pending_rejects_key_column_missing_from_projection (Cubic P2 #3 — verifies the up-front validation rejects bad callers and that the happy path still works correctly). Local test results: * tests/runs.rs: 23/23 passed * tests/failpoints.rs --features failpoints: 7/7 passed (includes the two new finalize→publisher residual tests landed in `3223b51`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 20:47:45 +02:00
Ragnor Comerford	a9430978fb	Merge pull request #60 from ModernRelay/ragnorc/omnigraph-spec Add AGENTS.md (map) + docs/ knowledge base + CI link check	2026-04-29 00:15:19 +02:00
Ragnor Comerford	a335d98854	Refactor AGENTS.md from encyclopedia to map; move spec into docs/ Splits the 990-line AGENTS.md into a 184-line map (architecture, where-to-find index, always-on invariants, capability matrix, maintenance contract) plus 18 new docs/*.md files holding the deep content per topic (storage, schema and query languages, indexes, embeddings, branches/commits, runs, merge, changes, execution, policy, server, CLI reference, audit, errors, CI, constants, v0.3.1 notes). Adds scripts/check-agents-md.sh and a check_agents_md CI job that verifies every docs/ link in AGENTS.md resolves and every doc in the canonical set is linked. CLAUDE.md remains a symlink to AGENTS.md. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-28 23:31:08 +02:00
Andrew Altshuler	372f793ad6	Drop macOS x86_64 build target (#55) Stop producing the omnigraph-macos-x86_64 archive in both the stable and edge release workflows. The macos-15-intel runner build was the slowest of the matrix and Apple Silicon is now the default Mac developer target. - release.yml + release-edge.yml: drop the macos-15-intel matrix entry - install.sh: drop the Darwin/x86_64 case so Intel Macs get a clear "no prebuilt binary" error instead of attempting an absent download - update-homebrew-formula.sh: drop the MACOS_X86_* variables and emit an arm64-only Homebrew formula. The on_macos block now declares `depends_on arch: :arm64` so Intel `brew install` fails fast with a clear architecture message instead of installing an arm64 binary that errors at exec time. Linux x86_64 build is unaffected. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 18:19:26 +03:00
andrew	a1b00e2d06	Fix release.yml: move HOMEBREW_TAP_TOKEN guard into steps GitHub Actions rejects `secrets.*` in job-level `if:` conditions at runtime (job-level `if` is evaluated before secrets are available), causing the workflow to abort in 0s with "workflow file issue" on every trigger. Moving the guard into a step-level check that writes `HOMEBREW_TAP_SKIP` to GITHUB_ENV lets the rest of the steps conditionally no-op when the tap token isn't configured. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 19:24:41 +03:00
Ragnor Comerford	567ebe5f24	Merge pull request #24 from ModernRelay/ragnorc/explore-api Add static OpenAPI spec and clean up operation IDs	2026-04-19 15:36:49 +02:00
Ragnor Comerford	bcddbdf485	Test merge commit; push openapi.json via separate clone Restore the default pull_request checkout (refs/pull/N/merge) so tests see the merged state. The openapi.json auto-commit now uses a separate shallow clone of the PR branch, so the pushed commit contains only the spec change rather than the merge-commit tree. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 12:10:40 +02:00
Ragnor Comerford	a157f6a17c	Fold openapi.json auto-sync into main CI test job The separate openapi-sync workflow was duplicating the workspace build (~15 min cold-cache compile), paying the cost twice per PR. Fold the regen + auto-commit into the existing test job: one compile, shared rust-cache, same drift-check semantics. - Same-repo PRs: OMNIGRAPH_UPDATE_OPENAPI=1 during the test run, then commit the regenerated spec back to the PR branch - Fork PRs / pushes: env var empty, test stays in strict drift-check mode - openapi_spec_is_up_to_date treats empty env value as unset, so the conditional workflow env expression works Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 21:00:46 +02:00
andrew	987c51c376	package caller: pass AWS secrets via secrets: inherit GitHub Actions doesn't expose the 'secrets' context in 'with:' when calling a reusable workflow. The companion PR on the shared workflow (ModernRelay/.github) moves the four AWS values into on.workflow_call.secrets; this caller drops them from 'with:' and adds 'secrets: inherit' so all four flow through masked. Trailing from PRs #33 and #34.	2026-04-18 21:54:08 +03:00
andrew	8086a0099c	package workflow: read AWS config from secrets, not variables On a public repo, Actions variables are not masked in workflow logs. The AWS role ARN and artifact bucket name embed the AWS account ID — not catastrophic, but norm-preserving to keep them out of public logs. Switch all four values (region, role, project, bucket) from `${{ vars.* }}` to `${{ secrets.* }}`. When secrets are passed via `with:` to a reusable workflow, GitHub's masking still applies because the value is added to the run's mask list as soon as the secret reference is resolved. Followup to #33 — should have landed as secrets from the start.	2026-04-18 21:43:12 +03:00
Ragnor Comerford	9de2079263	Merge remote-tracking branch 'origin/main' into ragnorc/explore-api # Conflicts: # CONTRIBUTING.md	2026-04-18 20:24:39 +02:00
andrew	807c1ba4dc	Add manual-dispatch Package workflow for CodeBuild image builds Invokes the shared omnigraph-package reusable workflow twice per run — once with default features, once with --features aws — producing two ECR tags per source commit: <sha> (default features) <sha>-aws (--features aws → SecretsManagerTokenSource) Manual-dispatch only for now. Neither release.yml nor release-edge.yml currently invokes the CodeBuild-backed packaging path; this gives operators a way to produce on-demand image variants without wiring packaging into the tag/push cadence. Prerequisites: - Repo vars AWS_REGION, AWS_ROLE_TO_ASSUME, AWS_CODEBUILD_PACKAGE_PROJECT, AWS_ARTIFACT_BUCKET must be set. - Shared workflow must support the `features` and `image_tag_suffix` inputs. Uses @main as the shared-workflow ref until a versioned tag is cut.	2026-04-18 16:29:43 +03:00
andrew	7a3bf5c758	Add aws feature + SecretsManagerTokenSource backend Introduces an opt-in AWS Secrets Manager backend for bearer tokens, behind the `aws` Cargo feature. Default builds (on-prem, local dev) don't pull in the AWS SDK and don't pay its compile cost. - New Cargo feature `aws` gates the `aws-config` + `aws-sdk-secretsmanager` optional deps. Default features remain empty. - New `auth::aws::SecretsManagerTokenSource` implements `TokenSource` by fetching a JSON `{"actor_id": "token", ...}` payload from a named Secrets Manager secret. Credentials resolve via the AWS default chain (env, shared config, IMDSv2 instance role, ECS task role) so no explicit plumbing is needed under an IAM role. - New `resolve_token_source()` dispatches based on the `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` env var. If the var is set but the binary was built without `--features aws`, returns a clear rebuild instruction rather than silently falling back. - `serve()` now uses `resolve_token_source()` and logs which source was selected at startup. - `parse_json_secret_payload()` is factored out as a free function so the payload validation (trim whitespace, reject blank actor/token, reject non-object) is unit-testable without the AWS SDK. - New CI job `test_aws_feature` builds + tests with `--features aws`. Not in this PR (follow-ups): - Background refresh loop for rotation. `SecretsManagerTokenSource` advertises `supports_refresh: true` but the AppState-level refresh task isn't wired yet. - Config-YAML dispatch (today the AWS source is selected via env var only; eventually `server.bearer_tokens.source` in `omnigraph.yaml`). Tests: - Default-feature build: 33 lib + 41 integration + 64 openapi. - `--features aws` build: 32 lib (one test is cfg-gated) + 41 + 64. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-18 03:48:51 +03:00
Ragnor Comerford	dda9728473	Add openapi.json auto-sync workflow	2026-04-17 19:09:36 +02:00
andrew	ad7027c7e9	Automate Homebrew tap updates on release tags	2026-04-15 17:57:21 +03:00
andrew	33bdab1fcb	Prepare v0.2.2 release	2026-04-14 20:13:00 +03:00
andrew	ff83e97cb5	Scope RustFS CI to relevant changes	2026-04-12 15:33:41 +03:00
andrew	af7a74bf2c	Skip heavy CI on text-only changes	2026-04-11 15:22:11 +03:00
andrew	446075f333	Update workflow actions and add Homebrew install docs	2026-04-11 04:01:39 +03:00
andrew	816b24d05e	Fix public binary install flow	2026-04-11 02:19:21 +03:00
andrew	cbb312e74f	Split binary and source install flows	2026-04-10 23:26:09 +03:00
andrew	338289656a	Initial public Omnigraph repository	2026-04-10 20:49:41 +03:00

29 commits
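
Note on commit 730712b73f (catch-all ordering): the "last match wins" behaviour described there can be illustrated with a minimal Python sketch. This is not the repository's `render-codeowners.py`; the function names `render`, `matches`, and `owners_for` are hypothetical, and the pattern matching is deliberately simplified (a catch-all `*`, directory prefixes ending in `/`, and plain file names) rather than GitHub's full gitignore-style rules. It only shows why the default line has to be emitted before the specific paths.

```python
from fnmatch import fnmatch


def render(default_owners, rules):
    """Emit CODEOWNERS lines: catch-all first, then the specific patterns."""
    lines = []
    if default_owners:
        lines.append("* " + " ".join(default_owners))
    for pattern, owners in rules:
        lines.append(pattern + " " + " ".join(owners))
    # Regression guard in the spirit of the commit: when a default is
    # configured, the first rendered line must be the catch-all.
    assert not default_owners or lines[0].startswith("*")
    return lines


def matches(pattern, path):
    """Simplified matcher (assumption): '*', 'dir/' prefixes, exact names."""
    if pattern == "*":
        return True
    if pattern.endswith("/"):
        return path.startswith(pattern)
    return fnmatch(path, pattern)


def owners_for(path, lines):
    """Resolve owners with last-match-wins semantics."""
    matched = []
    for line in lines:
        pattern, *owners = line.split()
        if matches(pattern, path):
            matched = owners  # a later matching line overrides earlier ones
    return matched


good = render(
    ["@aaltshuler"],
    [
        ("crates/", ["@aaltshuler"]),
        ("docs/", ["@aaltshuler", "@ragnorc"]),
        ("README.md", ["@aaltshuler", "@ragnorc"]),
    ],
)
# The ordering the commit fixes: catch-all emitted after the specific paths.
bad = good[1:] + good[:1]
```

With the catch-all first, `owners_for("README.md", good)` resolves to both docs owners; with the catch-all last (`bad`), every path falls back to the default owner, which is the silent failure the commit message describes.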