Commit graph

53 commits

Author SHA1 Message Date
Andrew Altshuler
7310f69928
Revert "Merge pull request #49 from ModernRelay/ragnorc/x-request-id" (#54)
This reverts commit b352fca13c, reversing
changes made to 748ad334a9.
2026-04-26 15:56:29 +03:00
Ragnor Comerford
b352fca13c
Merge pull request #49 from ModernRelay/ragnorc/x-request-id
Add X-Request-Id middleware
2026-04-26 12:33:33 +02:00
Ragnor Comerford
e14b203208
Reuse X_REQUEST_ID constant for inbound header lookup
Both Cursor Bugbot and Cubic flagged that the inbound `headers().get(...)`
call constructed `HeaderName::from_static("x-request-id")` inline instead
of reusing the `X_REQUEST_ID` constant defined at the top of the file.
The two were already kept in sync by both being `from_static("x-request-id")`,
but a future rename would have to touch both sites or risk silent drift
between read and write.

Also drops the now-unused `header` module import.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-26 12:05:19 +02:00
Ragnor Comerford
748ad334a9
Merge pull request #48 from ModernRelay/ragnorc/api-sdk-research
Polish OpenAPI spec for SDK generation
2026-04-26 11:52:46 +02:00
Ragnor Comerford
189caf893c
Merge pull request #47 from ModernRelay/perf/expand-dense-ids
perf(expand): dense u32 ids end-to-end (follow-up to #45)
2026-04-25 23:54:03 +02:00
Ragnor Comerford
284c9377c2
Add X-Request-Id middleware
Per-request ULID minted at the edge, exposed in request extensions and
on the response header. Caller-supplied X-Request-Id is echoed when
well-formed (1..=128 ASCII printable characters); otherwise rejected
and replaced with a fresh ULID so the value is always safe to log.

Companion to the TypeScript SDK redesign — clients now correlate logs
across the wire by reading X-Request-Id from response headers (and the
SDK already surfaces it on every OmnigraphError as `requestId`).

No spec change required; the header is a transport-layer concern.

Tests:
- mint a ULID when no header is provided
- echo a valid caller-supplied id
- reject overlong header (200 chars), mint a fresh ULID

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 22:56:17 +02:00
Ragnor Comerford
7809bf607e
Polish OpenAPI spec for SDK generation
Add operation descriptions and examples to utoipa annotations so the
generated TypeScript SDK has rich JSDoc, and so future Python/Go SDKs
and any /openapi.json docs UI benefit from the same effort.

- Doc comments on all 18 handlers (utoipa picks up summary/description)
- #[schema(example = ...)] on free-text fields (query_source,
  schema_source, NDJSON data) and i64 timestamps
- Destructive/irreversible warnings on change, applySchema, ingest,
  mergeBranches, deleteBranch, publishRun, abortRun

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 16:36:51 +02:00
Andrew Altshuler
74eb5a5380
Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46)
* Parallel per-type load writes + omnigraph optimize/cleanup CLI

## MR-677.3 — parallel per-type load writes

The load path already groups records into one RecordBatch per type and
makes one Lance commit per table (loader::mod.rs:249-..), but those
commits ran sequentially. Wrap node and edge write loops in
`futures::stream::buffered(N)` against a new helper
`write_batches_concurrently`. Concurrency tunable via
`OMNIGRAPH_LOAD_CONCURRENCY` (default 8).

## MR-676 — `omnigraph optimize` and `omnigraph cleanup`

New CLI subcommands that walk every node + edge table in the repo:

- `omnigraph optimize <uri>` — runs Lance `compact_files` on each
  table to merge small fragments into fewer larger ones.
- `omnigraph cleanup <uri> --keep N | --older-than 7d --confirm` —
  runs Lance `cleanup_old_versions` to prune historical manifests +
  unique fragments. Requires `--confirm` because it's destructive.
  Supports both count-based and time-based retention (or both AND'd
  together). Time uses chrono `DateTime<Utc>` (added as a workspace
  dep, default-features off).

Both commands run their per-table loops in parallel (8-way bounded,
`OMNIGRAPH_MAINTENANCE_CONCURRENCY` env override). Smoke-tested
against the 114-table prod graph: optimize went 7m15s sequential
→ 1m28s parallel. cleanup --keep 1 removed 137 historical versions
across 114 tables in 1m57s without disrupting `/healthz` or query
responses.

Public API on `Omnigraph`:

  pub async fn optimize(&mut self) -> Result<Vec<TableOptimizeStats>>
  pub async fn cleanup(&mut self, opts: CleanupPolicyOptions)
      -> Result<Vec<TableCleanupStats>>

All 10 existing loader tests still pass.

Closes MR-676.
Partially addresses MR-677 (the .3 — parallel by type — piece;
MR-677.1 is for the `omnigraph embed` path, not load, since load
doesn't call Gemini directly. .2 was already in place).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate openapi.json

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-25 14:22:14 +03:00
Ragnor Comerford
53d7f47909
Pass dense u32 ids through expand instead of round-tripping via String
BFS now emits Vec<u32> dense ids directly with HashSet<u32> per-source
dedup. Only the deduped set is stringified for Lance's IN-list. The
post-hydrate alignment uses a dense-indexed Vec<Option<u32>> instead of
HashMap<&str, usize>, giving O(1) lookup without repeated string hashing.

End-to-end on the bench_expand harness (release, M-series):

  query     baseline    after     speedup
  1k  hop3   460.2 ms    23.7 ms     19x
  10k hop2     4.21 s   139.9 ms     30x
  10k hop3    40.59 s    898.5 ms    45x
  30k hop2    11.71 s    490.2 ms    24x
  30k hop3   197.38 s      3.22 s    61x

The cost lived in stringifying every (src,dst) pair and re-hashing the
strings during alignment; once dense ids stay dense, the BFS inner loop
and the final fan-out both collapse to integer ops.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 12:07:25 +02:00
andrew
628bc2e607 Clean up bench_expand example
Remove vestigial code left from removed hasher variants: unused
BuildHasherDefault import, PhantomData suppression line, orphan planning
comments for Variant C/E. Also drop an unused `mut` on the PRNG closure
binding. No behavior change; compiles warning-free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 00:59:21 +03:00
Ragnor Comerford
d8e0bfeb22
Dedupe dst ids before hydrating nodes in execute_expand (#45)
The BFS in execute_expand emits one (src_idx, dst_id) pair per edge, so
dst_id_list contains heavy duplication when multi-hop traversals revisit
the same destination nodes. hydrate_nodes then built an
"id IN ('a', 'b', ...)" filter from the full list, passing it verbatim
to Lance. On a 30k-node Person graph, a 3-hop query produced a 15.4M-
entry IN-list against a 30k-row target — 512x more entries than unique
ids.

Deduplicate before the Lance scan; the post-hydrate alignment HashMap
already fans results back out to the original (src, dst) pairs, so
output is bit-identical.

Bench numbers (crates/omnigraph/examples/bench_expand.rs, min of 2-3
runs, release build):

  query         before     after    speedup
  1k   hop3     460 ms     28 ms     16x
  10k  hop2    4.21 s     188 ms     22x
  10k  hop3   40.59 s    1.30 s     31x
  30k  hop2   11.71 s    678 ms     17x
  30k  hop3  197.38 s    4.86 s     41x

All existing omnigraph-engine tests pass (72/72, 0 failures).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-25 00:56:18 +03:00
Andrew Altshuler
8649b2084f
Prepare v0.3.0 release (#44)
* Prepare v0.3.0 release

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate openapi.json

* ci: retrigger CI on latest openapi.json

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-21 19:11:34 +03:00
andrew
2df578eab8 Delete __run__ branches on every terminal state (MR-674)
Run branches are transactional scaffolding — the durable audit lives
on RunRecord. Invariant: every terminal state (Published, Aborted,
Failed) deletes the __run__ branch.

- Add `terminate_run` helper: appends terminal RunRecord, then
  deletes the run branch. Delete errors are swallowed — the record
  is authoritative; `cleanup_terminal_run_branches_for_target`
  retries on later `branch_delete` of the target.
- Wire into `publish_run_as`, `abort_run`, `fail_run`.
- Include `Failed` in the cleanup filter (was `Published | Aborted`
  only) for legacy-repo GC during branch_delete.
- Cleanup now checks `coordinator.all_branches()` first to skip
  branches already deleted by a concurrent handle — avoids Lance
  NotFound when two handles publish/clean up independently.
- Drop `Failed` from `ensure_branch_delete_safe` — post-fix, Failed
  means the branch is already gone, so there's no reason to block
  target deletion (MR-674 "Downstream effects").

Tests:
- New regression: `run_branches_do_not_accumulate_across_repeated_loads`
  — 10 loads + 1 abort → `branch_list() == ["main"]`.
- New `failed_load_deletes_run_branch` asserts Failed path cleans up.
- Rename `abort_run_keeps_target_unchanged_and_preserves_hidden_branch_for_inspection`
  → `abort_run_leaves_target_unchanged_and_deletes_run_branch`, invert
  the hidden-branch assertion.
- Rewrite `public_{load,mutation}_preserves_staged_edge_ids_on_publish`
  to capture staged IDs before publish instead of inspecting the run
  branch after (branch is gone now).
- Update MR-670 regression test to assert the run branch is *absent*
  after publish.

Deferred to follow-up: `--keep-run-branch` debug flag, `omnigraph run gc`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 14:15:39 +03:00
andrew
54101f7e2c Extract remaining crowded compiler test modules
Files where inline tests crowded out production code (test/prod ratio
≥ 0.8) move to sibling files via `#[path]`. Files where production
dominates (query_input.rs, schema_plan.rs) stay inline — extracting
would add noise, not reduce it.

- ir/lower.rs: 1239 → 577 lines (ratio 1.15)
- catalog/mod.rs: 594 → 326 lines (ratio 0.83)
- query/lint.rs: 562 → 314 lines (ratio 0.80)

catalog/tests.rs uses the shorter name since it's inside a module
directory (no ambiguity with filename).

All 229 compiler tests green, identical count to before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:49:09 +03:00
andrew
94849a50b4 Extract compiler test modules to sibling files
typecheck.rs, schema/parser.rs, and query/parser.rs each had
~1000-line inline `mod tests` blocks that overshadowed the production
code in the file. Move each to a sibling `*_tests.rs` using
`#[path = "..."] mod tests;`.

- typecheck.rs: 2865 → 1708 lines; typecheck_tests.rs: 1156 lines
- schema/parser.rs: 1950 → 994 lines; parser_tests.rs: 955 lines
- query/parser.rs: 1737 → 803 lines; parser_tests.rs: 933 lines

No visibility change — the sibling module still has `use super::*`
access to crate-privates. No semantic edits beyond de-indenting by
4 spaces (mechanical). All 229 compiler tests green, identical
count to before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:50:18 +03:00
andrew
f05ea2c7c3 Extract public-API tests from omnigraph.rs to integration tests
The inline `mod tests` in crates/omnigraph/src/db/omnigraph.rs had grown
to ~620 lines, mixing tests that need crate-private access with tests
that only exercise the public API. Splits the latter out.

- tests/lifecycle.rs: 10 init/open/snapshot/drift tests
- tests/schema_apply.rs: 5 plan/apply tests
- omnigraph.rs: 10 tests remain inline because they use
  db.coordinator, db.table_store(), ManifestCoordinator,
  SCHEMA_APPLY_LOCK_BRANCH, or is_internal_run_branch — all
  crate-private and intentionally kept so.

No behavior change. Zero semantic edits to the tests themselves beyond
replacing db.snapshot() (pub(crate)) with snapshot_main helper at
integration-test boundaries.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:09:34 +03:00
andrew
26012d156e Filter internal run branches in schema_apply (MR-670)
Published `__run__` branches are intentionally retained after publish
for post-publish inspection (runs.rs tests verify edge IDs match
between run branch and main). `apply_schema` was counting them as
"non-main" branches and refusing to run — permanently blocking schema
evolution after any load or change, with no CLI recovery path
(`branch_delete` rejects internal refs, `run abort` rejects Published
runs).

Fix: `apply_schema` filters `is_internal_system_branch` (covers both
`__run__*` and the schema-apply lock) rather than just the lock.
Run branches remain available for inspection.

Regression: test_apply_schema_succeeds_after_load_creates_published_run_branch
pins that schema apply succeeds after a load even while the run
branch is still present.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:32:20 +03:00
Ragnor Comerford
a157f6a17c
Fold openapi.json auto-sync into main CI test job
The separate openapi-sync workflow was duplicating the workspace build
(~15 min cold-cache compile), paying the cost twice per PR. Fold the
regen + auto-commit into the existing test job: one compile, shared
rust-cache, same drift-check semantics.

- Same-repo PRs: OMNIGRAPH_UPDATE_OPENAPI=1 during the test run, then
  commit the regenerated spec back to the PR branch
- Fork PRs / pushes: env var empty, test stays in strict drift-check mode
- openapi_spec_is_up_to_date treats empty env value as unset, so the
  conditional workflow env expression works

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 21:00:46 +02:00
Ragnor Comerford
9de2079263
Merge remote-tracking branch 'origin/main' into ragnorc/explore-api
# Conflicts:
#	CONTRIBUTING.md
2026-04-18 20:24:39 +02:00
andrew
7a3bf5c758 Add aws feature + SecretsManagerTokenSource backend
Introduces an opt-in AWS Secrets Manager backend for bearer tokens,
behind the `aws` Cargo feature. Default builds (on-prem, local dev)
don't pull in the AWS SDK and don't pay its compile cost.

- New Cargo feature `aws` gates the `aws-config` + `aws-sdk-secretsmanager`
  optional deps. Default features remain empty.
- New `auth::aws::SecretsManagerTokenSource` implements `TokenSource` by
  fetching a JSON `{"actor_id": "token", ...}` payload from a named
  Secrets Manager secret. Credentials resolve via the AWS default chain
  (env, shared config, IMDSv2 instance role, ECS task role) so no
  explicit plumbing is needed under an IAM role.
- New `resolve_token_source()` dispatches based on the
  `OMNIGRAPH_SERVER_BEARER_TOKENS_AWS_SECRET` env var. If the var is set
  but the binary was built without `--features aws`, returns a clear
  rebuild instruction rather than silently falling back.
- `serve()` now uses `resolve_token_source()` and logs which source was
  selected at startup.
- `parse_json_secret_payload()` is factored out as a free function so
  the payload validation (trim whitespace, reject blank actor/token,
  reject non-object) is unit-testable without the AWS SDK.
- New CI job `test_aws_feature` builds + tests with `--features aws`.

Not in this PR (follow-ups):
- Background refresh loop for rotation. `SecretsManagerTokenSource`
  advertises `supports_refresh: true` but the AppState-level refresh
  task isn't wired yet.
- Config-YAML dispatch (today the AWS source is selected via env var
  only; eventually `server.bearer_tokens.source` in `omnigraph.yaml`).

Tests:
- Default-feature build: 33 lib + 41 integration + 64 openapi.
- `--features aws` build: 32 lib (one test is cfg-gated) + 41 + 64.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 03:48:51 +03:00
andrew
af41630520 Extract TokenSource trait for bearer token loading
Pure refactor. No behavior change. Introduces a TokenSource trait so
additional backends (AWS Secrets Manager, Vault, etc.) can plug in
behind feature flags without touching the server wiring.

- New module crates/omnigraph-server/src/auth.rs with the TokenSource
  trait and a single EnvOrFileTokenSource implementation that delegates
  to the existing server_bearer_tokens_from_env() function.
- serve() now constructs EnvOrFileTokenSource and calls load() instead
  of calling the free function directly.
- The trait has a supports_refresh() hook (false for env/file) for
  future implementations that can rotate without restart.
- async-trait added to omnigraph-server deps; it's already in the
  workspace.

Tests:
- Unit tests in auth.rs covering load paths and the default supports_refresh
  / name values.
- Existing 128 tests (lib + integration + openapi) pass unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 03:31:43 +03:00
andrew
c338e80180 Harden bearer auth: constant-time compare, hashed at rest, authoritative actor_id
Fixes two live authz bugs in omnigraph-server:

- Bearer-token lookup previously used HashMap::get, which compares keys with
  Eq and short-circuits on the first differing byte — a network-observable
  timing oracle for brute-forcing tokens. Tokens are now stored as SHA-256
  digests and compared with subtle::ConstantTimeEq, iterating every entry
  unconditionally so total work is independent of which slot matches. Raw
  token bytes no longer live in server memory after startup.

- authorize_request now overwrites PolicyRequest.actor_id from the
  authenticated session instead of trusting the handler-supplied field,
  which previously defaulted to "" via unwrap_or_default(). The empty
  string can no longer reach Cedar as a policy subject even if a future
  refactor drops the None check.

External API of AppState constructors is unchanged — tokens still enter as
Vec<(String, String)> and are hashed on the way in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 01:41:02 +03:00
andrew
be520f31f4 Polish schema endpoint: rename show, align field name, add tests
Review feedback on #23, applied on top of the original commit:

- Rename the CLI subcommand from `schema get` to `schema show` to match
  the existing `run show` / `commit show` convention. A `#[command(alias
  = "get")]` preserves muscle memory for anyone who already typed `get`.
- Rename `SchemaGetOutput` → `SchemaOutput` and its field `source` →
  `schema_source`, so the get response and the apply request use the
  same field name for the same concept.
- Use `println!` instead of `print!` in the CLI so the shell prompt
  doesn't land on the last line of schema output.
- Add three integration tests on `/schema`: happy path (no auth),
  401 when bearer is required but missing, 403 when the policy grants
  the actor branch_create but not read.

Follow-ups left for a separate PR: include `schema_ir_hash` and
`schema_identity_version` in the response payload so clients can do
drift detection and the server can set an ETag; and a fast-path local
read that skips `Omnigraph::open()` when only the schema source is
needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:30:46 +03:00
Ragnor Comerford
228032a4ac
Add static OpenAPI spec and Stainless SDK config
Introduce SDK generation scaffolding: commit a static openapi.json
extracted from the Utoipa annotations via a golden-file test, add
Stainless workspace/config for TypeScript and Python SDKs, and clean
up operation IDs for ergonomic generated method names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:26:31 +02:00
Claude
0c4df674fa
Add schema get command to CLI and HTTP API
Exposes the existing schema_source() method via a new `omnigraph schema get`
CLI subcommand and a `GET /schema` API endpoint, allowing users to retrieve
the current accepted schema from any graph repository.

https://claude.ai/code/session_01UYybeBQks3fz3RJrTHtwQw
2026-04-16 21:15:17 +00:00
andrew
33bdab1fcb Prepare v0.2.2 release 2026-04-14 20:13:00 +03:00
andrew
3d74cbfc20 Prepare v0.2.1 release 2026-04-14 19:19:00 +03:00
andrew
1a26e2e654 Rename config targets to graphs 2026-04-14 04:12:14 +03:00
Ragnor Comerford
063be3ddc7
Merge pull request #16 from ModernRelay/tin-epoch
Fix join alignment for traversal-introduced bindings
2026-04-13 16:54:52 +02:00
Ragnor Comerford
6e43ceac08
Add comprehensive tests from morphological matrix analysis
Unit tests covering gaps identified by systematic matrix of:
topology (fan-out, fan-in, cycle) × deferral × filter type × direction.

New unit tests:
- fan-out: one root fans to two deferred destinations via different edges
- fan-in: two sources converge on one destination via reverse expand
- cycle: deferred binding + genuine cycle-closing on return edge
- multiple filters on single deferred binding (name + age)
- param filter on deferred binding (IRExpr::Param in dst_filters)
- negation with inner binding (documents current NodeScan+cycle-close behavior)

New integration tests:
- fan-out projection (friend × company cross-product per source)
- deferred filter matching nothing (empty result propagation)
- negation with inner destination binding filter

Also: guard anti-join fast path against non-empty dst_filters. The bulk
CSR existence check only tests neighbor existence, not destination
properties — it must fall back to the slow path when dst_filters are
present to avoid false negatives.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:31:08 +02:00
Ragnor Comerford
3461aa123d
Fix: exclude wildcard $_ from traversal adjacency graph
The anonymous wildcard variable _ was included as a regular node in the
undirected adjacency graph used for component analysis. When multiple
traversals referenced $_, it falsely bridged otherwise-independent
components, causing bindings in separate components to be deferred.
The deferred binding would never be introduced (since _ is never added
to bound_vars), leading to silently dropped traversals.

Fix: skip edges involving _ when building the adjacency graph.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:11:17 +02:00
Ragnor Comerford
fabd65b08a
Fix: propagate edge-lookup errors from iterative traversal loop
The retain-based loop swallowed catalog.lookup_edge_by_name errors by
keeping the traversal for the next pass, where it could never succeed.
This caused the no-progress break to fire, silently dropping the
traversal and producing incorrect query results with missing joins.

Replaced retain with a manual for-loop that propagates errors via ?.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 13:40:22 +02:00
Ragnor Comerford
88384476be
Fix traversal ordering: process in dependency order, not declaration order
The iterative lowering now handles traversals declared in non-topological
order (e.g. `$b worksAt $c` before `$a knows $b`).  Each pass processes
traversals that have at least one bound endpoint, repeating until all are
consumed.  Caught during self-review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:16:45 +02:00
Ragnor Comerford
853691c70e
Fix join alignment for traversal-introduced bindings with Lance filter pushdown
The IR lowering previously emitted independent NodeScans for every binding
in a match clause, even when bindings were connected by traversals. This
created O(N×M) cross-joins followed by cycle-closing filters — correct but
extremely slow for large datasets.

Two changes fix this by design:

1. **Deferred bindings** — When multiple bindings are connected by
   traversals, only the first-declared binding gets a NodeScan. The rest
   are introduced by Expand operations, eliminating cross-joins entirely.

2. **Filter fusion into Expand** — Deferred binding filters are attached
   directly to IROp::Expand (new `dst_filters` field) and pushed into
   Lance SQL during hydrate_nodes(), so the storage layer skips
   non-matching rows. Non-pushable filters (list-contains, FTS) fall back
   to in-memory application after hconcat.

For a query like:
  match { $p: Person  $p worksAt $c  $c: Company { name: "Acme" } }

Old plan: NodeScan($p) → NodeScan($c) → cross-join → Expand(__temp) → cycle-close
New plan: NodeScan($p) → Expand($p→$c, Lance SQL: id IN (...) AND name='Acme')

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:10:50 +02:00
Claude
c943d97744
Fix null-fill for nullable params when params JSON is None/null
The early return at line 273 for None/Value::Null params was skipping
the null-fill loop, leaving declared nullable params absent from the
map. Downstream code would then error with "parameter not provided".

https://claude.ai/code/session_014oGFKL7EVg1b2cyPgt9Gne
2026-04-13 09:37:17 +00:00
Claude
37b7a94eb7
Fix nullable query parameters: accept omission and null for ? params
Parameters declared with `?` (e.g. `$changelogUrl: String?`) now correctly
accept omission or explicit null in JSON input instead of requiring empty
strings as a workaround. Adds `Literal::Null` variant and threads it through
parameter parsing, type-checking, and Arrow array conversion.

https://claude.ai/code/session_014oGFKL7EVg1b2cyPgt9Gne
2026-04-13 08:43:48 +00:00
Ragnor Comerford
c5a88cacb5
Merge pull request #6 from ModernRelay/claude/omnigraph-aggregates-a53rG
Implement aggregate functions with GROUP BY support
2026-04-13 10:26:07 +02:00
andrew
1bf55fa52d Add query lint and check commands 2026-04-13 00:37:44 +03:00
Claude
351610d18c
Implement aggregate execution with wide-batch model
Add runtime support for aggregate functions (count, sum, avg, min, max)
with GROUP BY semantics, built on a single wide RecordBatch that
eliminates correlation tracking by construction.

Execution engine (exec/query.rs):
- Replace HashMap<String, RecordBatch> with Option<RecordBatch> where
  columns are prefixed as <variable>.<property>
- NodeScan prefixes columns and cross-joins with existing batch
- Expand collects (src_row, dst_id) pairs, takes wide batch rows,
  appends prefixed destination columns via hconcat
- Filter applies single mask to entire wide batch
- AntiJoin: fast-path returns BooleanArray mask; slow-path slices
  one row for inner pipeline execution

Projection engine (exec/projection.rs):
- aggregate_return groups rows by non-aggregate key columns using
  length-prefixed string encoding, computes per-group aggregates
- SUM accumulates into f64 to avoid integer overflow
- MIN/MAX support both numeric and string types
- Empty input returns count=0, others=null

Compiler (typecheck.rs):
- T8: split MIN/MAX from SUM/AVG — allow string arguments
- T9: non-aggregate expressions in aggregate queries must be
  property accesses or variables
- SUM type inference returns Float64 (matching runtime)

Tests: 8 new integration tests covering grouped count, global count,
sum/avg/min/max per company, aggregate+order+limit, string min/max,
multi-hop aggregates, and edge cases.

https://claude.ai/code/session_019o5NRyYomgETFyd7hpiLey
2026-04-12 20:59:13 +00:00
andrew
95e89a343e Genericize bootstrap context fixture 2026-04-12 22:02:25 +03:00
andrew
5daeae7571 Prepare v0.2.0 release 2026-04-12 20:35:34 +03:00
andrew
e9a511e38f Refactor query execution modules 2026-04-12 18:18:57 +03:00
andrew
3741900611 Refactor omnigraph db module layout 2026-04-12 17:07:24 +03:00
andrew
6655cd65d5 Harden schema apply against write races 2026-04-12 15:19:48 +03:00
Ragnor Comerford
e5528047a9
Merge pull request #1 from ModernRelay/claude/review-rust-api-JYSCx
Add OpenAPI documentation endpoint and schema
2026-04-12 13:15:43 +02:00
Claude
4c07d3c095
Make /openapi.json reflect runtime auth configuration
The served OpenAPI spec now matches runtime behavior: when no bearer
tokens or policy are configured (open mode), the spec omits security
schemes and per-operation security requirements. When auth is active,
the full bearer_token security metadata is included.

Also fixes SecurityAddon to initialize components if absent, and
removes the redundant utoipa dev-dependency.

Adds 5 new tests covering open-mode vs auth-mode spec serving.

https://claude.ai/code/session_01NfoPVx21rZUQned1f7WpXY
2026-04-12 11:04:13 +00:00
Claude
859ec9faa8
Add OpenAPI spec generation via utoipa with /openapi.json endpoint
Integrate utoipa 5 to auto-generate an OpenAPI 3.1 spec from the existing
Axum handlers and serde types. All 16 endpoints are annotated with path
metadata, request/response schemas, security requirements, and tags. A
public /openapi.json endpoint serves the spec without requiring auth.

Includes 59 tests covering path completeness, HTTP methods, schema fields,
enum variants, security scheme, path/query parameters, request bodies,
response references, and endpoint integration.

https://claude.ai/code/session_01NfoPVx21rZUQned1f7WpXY
2026-04-12 11:03:23 +00:00
Andrew Altshuler
af9a44e879
Merge pull request #4 from ModernRelay/claude/omnigraph-multi-statement-mutations-DxWSA
Support multi-statement mutations (insert + edge in one query)
2026-04-12 13:58:26 +03:00
andrew
92fa3189f7 Add schema apply command and policy support 2026-04-12 04:01:14 +03:00
Claude
d10f78530f
Support multi-statement mutations (insert + edge in one query)
Allow mutation queries to contain multiple sequential statements that
execute atomically within a single transactional run. This enables
patterns like inserting a node and its edges in one query:

    query add_and_link($name: String, $age: I32, $friend: String) {
        insert Person { name: $name, age: $age }
        insert Knows { from: $name, to: $friend }
    }

Changes span the full compiler-to-execution pipeline:
- Grammar: mutation_body = { mutation_stmt+ }
- AST: QueryDecl.mutations: Vec<Mutation>
- IR: MutationIR.ops: Vec<MutationOpIR>
- Execution: loop over ops, accumulate affected counts

Cross-statement visibility works because each statement's commit_updates
advances the manifest state, so subsequent statements see prior writes.
Atomicity comes from the existing run mechanism (begin_run/publish_run).

https://claude.ai/code/session_01E4VG2WXrZW8aeXFiqr8NwF
2026-04-11 20:27:51 +00:00