Commit graph

28 commits

Author SHA1 Message Date
Ragnor Comerford
cc2412dc65
Rename repo terminology to graph (#118)
Some checks failed
CI / Classify Changes (push) Has been cancelled
CI / Check AGENTS.md Links (push) Has been cancelled
Release Edge / Prepare edge release (push) Has been cancelled
CI / Test Workspace (push) Has been cancelled
CI / Test omnigraph-server --features aws (push) Has been cancelled
CI / RustFS S3 Integration (push) Has been cancelled
Release Edge / Build edge omnigraph-linux-x86_64 (push) Has been cancelled
Release Edge / Build edge omnigraph-macos-arm64 (push) Has been cancelled
2026-05-24 16:46:00 +01:00
Andrew Altshuler
bb1fe57640
release: v0.5.0 (#115)
* gitignore: exclude docs/internal/ from publication

Mirrors the existing "Local-only working files (not for the public
repo)" pattern. Working notes filed under docs/internal/ stay on the
contributor's machine instead of cluttering the published doc tree
or tripping the AGENTS.md / docs-index cross-link check
(scripts/check-agents-md.sh enumerates every docs/*.md and requires
each one to be linked from an audience index — internal notes don't
have an audience index by definition).

Incidental to the v0.5.0 release; lands separately from the version
bump commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: skip docs/internal/ in agents-md cross-link check

Matches the .gitignore exclusion. Mirrors the existing 'docs/releases/'
exclusion pattern: notes under docs/internal/ aren't part of the
published doc tree and don't need to be linked from an audience index.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* release: v0.5.0 — Lance 6 substrate, Cedar policy engine, schema-lint v1

Bumps the workspace from 0.4.2 to 0.5.0. Release notes at
docs/releases/v0.5.0.md.

Three user-visible pillars motivate the minor bump:
  1. Lance 6.0.1 substrate (DataFusion 52→53, Arrow 57→58)
  2. Engine-wide Cedar policy enforcement on every _as writer; server
     defaults to deny-all; signed-token-claim-only actor identity
  3. Schema-lint v1 chassis: OG-XXX-NNN codes, soft drops, and
     `--allow-data-loss` (Hard mode) for destructive migrations

Plus structured DataFusion Expr filter pushdown (unblocks
CompOp::Contains via array_has), HTTP allow_data_loss parity, inline
.gq sources on CLI/HTTP, optional CORS layer, and bug fixes
(merge-insert dup-rowid, branch-merge coordinator restore on error,
blob columns in branch merge).

Sites bumped:
  - 5 crate [package].version lines (omnigraph, omnigraph-cli,
    omnigraph-compiler, omnigraph-policy, omnigraph-server)
  - 10 internal path-dep `version = "..."` constraints across the
    four manifests that depend on sister crates (engine, server, cli,
    plus engine's dev-dep on the compiler)
  - Cargo.lock (regenerated via cargo update --workspace)
  - AGENTS.md "Version surveyed:"
  - openapi.json `info.version` (regenerated via
    OMNIGRAPH_UPDATE_OPENAPI=1 cargo test -p omnigraph-server --test
    openapi)

Verification:
  - cargo test --workspace --locked: 907/907 green
  - cargo test -p omnigraph-engine --test failpoints --features
    failpoints: 19/19 green
  - cargo test -p omnigraph-engine --test lance_surface_guards: 3/3
  - scripts/check-agents-md.sh: clean

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 13:59:42 +01:00
Andrew Altshuler
58cee158d8
schema-lint v1 commit 4: emit + apply DropType { Soft } (#99)
Wire the second half of the dormant Drop* family. Per
docs/dev/schema-lint-v1-plan.md, commit #4 of the schema-lint chassis
v1 series (MR-694). Builds on commit #3 (PR #90, DropProperty Soft).

Planner (schema_plan.rs):
- plan_nodes leftover loop: emit DropType { Node, name, Soft }
  instead of UnsupportedChange (OG-DS-102) for node-type removals.
- plan_edges leftover loop: emit DropType { Edge, name, Soft }
  instead of UnsupportedChange (OG-DS-103) for edge-type removals.

Apply (schema_apply.rs):
- New dropped_tables: BTreeSet<String> accumulator alongside
  added_tables / renamed_tables / rewritten_tables.
- DropType arm in the metadata loop populates dropped_tables for
  Soft mode. Hard mode errors (lands in commit #5 with
  --allow-data-loss).
- New tombstone-emission loop after the rename sidecar build:
  for each dropped table, push to sidecar_tombstones AND populate
  table_tombstones with table_version + 1. The existing manifest
  publish path converts table_tombstones into ManifestChange::Tombstone
  operations — no new manifest plumbing needed.
- Soft DropType has no Phase B per-table write; the tombstone is the
  entire change. Lance dataset files are retained — prior __manifest
  versions still reference them, so time travel + branch-from-snapshot
  can read the dropped table until cleanup_old_versions runs.
- Rides on SidecarKind::SchemaApply per MR-847 (already established
  by commit #3).

Tests:
- Planner unit test plan_emits_soft_drop_for_removed_node_and_edge_types
  asserts both Node and Edge DropType { Soft } emission for the
  Company + WorksAt combined drop, plus no UnsupportedChange.
- Integration test apply_schema_drops_node_and_referencing_edge_softly
  (replaces apply_schema_rejects_dropping_a_node_type): asserts
  plan emission, apply success, current manifest entries absent,
  pre-drop manifest entries present (time-travel reversibility),
  reopen consistency.
- Integration test apply_schema_drops_an_edge_type_softly (replaces
  apply_schema_rejects_dropping_an_edge_type): single edge drop,
  asserts other tables untouched, time-travel reversibility.

Test results:
- cargo test -p omnigraph-compiler --lib: 239 passed (1 new + 238)
- cargo test -p omnigraph-engine --test schema_apply: 11 passed
  (2 converted + 9 unchanged)

Pending for v1 completion:
- Commit #5: --allow-data-loss CLI flag + Hard mode promotion in
  planner + immediate compact_files + cleanup_old_versions for
  both DropProperty and DropType.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 20:25:42 +03:00
Andrew Altshuler
e98347eb7b
schema-lint chassis v1.0: DropProperty Soft + code-tagged diagnostics (MR-694) (#90)
* schema-lint chassis v1 (WIP): tier surfacing + plan doc

First commit of the chassis v1 branch. Lands a small, foundational
slice without behavior change, plus a planning doc that lays out the
remaining 7 commits in sequence so the PR can be reviewed
incrementally.

This commit:

- Adds SchemaMigrationStep::diagnostic() returning the full
  &'static DiagnosticCode (family + tier + severity) for
  UnsupportedChange steps with codes. Renderers can now reach the
  tier without re-implementing the code → tier lookup.

- CLI `omnigraph schema plan` output now displays tier alongside
  code:

    unsupported change on node:Person.age [OG-DS-104, destructive]:
        removing property 'Person.age' is not supported in schema
        migration v1

  Operators see at-a-glance the kind of risk each rejection
  represents — not just the rule identifier.

- No behavior change. All 11 existing schema_apply tests still pass.

Planning doc at docs/schema-lint-v1-plan.md tracks the 7 remaining
commits to bring v1 to feature-complete:

  1. (this commit) Tier surfacing in plan output.
  2. Soft / Hard mode enum on drop steps.
  3. Tombstone fields on catalog IR.
  4. Planner emits DropProperty { Soft } by default.
  5. Apply path implements Soft mode.
  6. Convert PR #62 destructive-rejection tests.
  7. --allow-data-loss flag + Hard mode.
  8. (optional) Tombstone unhide / restore command.

Delete the planning doc when v1 lands. Intentionally checked in to
the WIP branch so the scope is reviewable; not intended as a
permanent doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* schema-lint v1 commit 2: DropMode + dormant Drop* variants

Second commit of the chassis v1 branch. Lands the type-level shape
of soft/hard drops without wiring them up. Variants are reachable
from emitters but the planner doesn't produce them yet; the apply
path returns an explicit not-yet-implemented error if one shows up
via deserialization.

Added:

- `DropMode { Soft, Hard }` — orthogonal to `SafetyTier`. Tier
  classifies the rule's risk class; mode is the operator's intent
  for data treatment.
    - `Soft` → catalog tombstone, data retained. Tier: safe.
    - `Hard` → Lance-level removal. Tier: destructive; will require
      --allow-data-loss to apply (commit 7).

- `SchemaMigrationStep::DropType { type_kind, name, mode }` and
  `SchemaMigrationStep::DropProperty { type_kind, type_name,
  property_name, mode }` variants.

- Re-export `DropMode` from `omnigraph_compiler::DropMode` so
  downstream crates don't reach into the catalog submodule.

- CLI `render_schema_plan_step` arms for both variants, surfacing
  the mode in plan output: `drop property 'Person.age' of node
  'Person' (soft mode)`.

- `apply_schema_with_lock` exhaustive match arm for the two new
  variants that returns `manifest_internal` with a clear
  not-yet-implemented message. If a SchemaIR JSON containing
  Drop{Type,Property} arrives (e.g. from a future tool or hand-
  written), the apply path fails explicitly rather than silently
  misclassifying.

- Two new in-source tests:
    - `drop_steps_round_trip_through_serde` — pins the wire shape
      for all four (variant × mode) combinations.
    - `drop_mode_serde_uses_snake_case` — pins external-tool-
      friendly serialization (`"soft"` / `"hard"`).

Build: clean, only pre-existing warnings.
Tests:
- omnigraph-compiler schema_plan: 6/6 (4 existing + 2 new).
- omnigraph-engine schema_apply: 11/11 (unchanged — planner still
  emits UnsupportedChange for removal paths).

Next commit (commit 3 per docs/schema-lint-v1-plan.md): add the
`tombstoned: bool` fields to NodeIR / EdgeIR / PropertyIR for the
catalog representation of soft-mode tombstones.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plan doc: reframe v1 around Lance native drop_columns

After a substrate audit of the Lance data-evolution guide on
2026-05-13, the v1 plan was simplified. Two key findings:

1. Lance's `drop_columns()` is already metadata-only and reversible
   via time travel until cleanup. No need for a parallel
   `tombstoned: bool` field in our catalog IR — Lance's version
   graph IS the tombstone.

2. The full schema_apply substrate migration (add_columns,
   drop_columns, alter_columns vs. stage_overwrite across all step
   types) is consolidated in MR-948 as a sibling issue. v1 only
   uses the relevant slice (drop_columns for OG-DS-1XX).

Net plan changes:

- Commit 3 (original): tombstone fields on catalog IR → dropped.
  No catalog IR change needed. The Lance drop_columns commit IS the
  tombstone.

- Commit 5 (original): apply path writes tombstoned: true → replaced
  with: apply path calls Dataset::drop_columns([name]).

- Commit 7 Hard mode: stage_overwrite removing the column → replaced
  with: drop_columns + compact_files + cleanup_old_versions. Same
  APIs omnigraph cleanup already uses.

- Commit 8 (original): omnigraph schema unhide → dropped. Time
  travel is the undo (omnigraph snapshot --at <commit>).

Net result: 8 commits → 5 commits. ~250 LoC less surface. More
substrate-aligned.

The chassis types from commit 2 (DropMode enum, DropType /
DropProperty variants) are kept exactly as designed; only the
implementation strategy changed.

The Lance docs quote is included in the doc so future readers see
the substrate behavior cited verbatim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* schema-lint v1 commit 3: emit + apply DropProperty { Soft }

Wire the dormant DropProperty variant end-to-end for the Soft case.
Per docs/schema-lint-v1-plan.md, commit #3 of the schema-lint chassis
v1 series (MR-694).

Planner (schema_plan.rs):
- plan_properties: emit DropProperty { type_kind, type_name,
  property_name, mode: Soft } instead of UnsupportedChange when a
  property exists in accepted but not in desired. Plan is now
  supported = true for drop-only changes.

Apply (schema_apply.rs):
- Route DropProperty { Soft } through rewritten_tables. The existing
  batch_for_schema_apply_rewrite path already iterates the *target*
  schema fields, so a property absent from desired_catalog is
  naturally projected away. The prior Lance version retains the
  dropped column for time-travel reversibility (until cleanup runs).
- DropType still errors (lands in commit #4 with different mechanics:
  __manifest entry removal instead of column projection).
- DropProperty { Hard } still errors (lands in commit #5 with
  --allow-data-loss CLI flag + immediate compact_files +
  cleanup_old_versions).

Tests:
- Planner unit test plan_emits_soft_drop_for_removed_nullable_property
  asserts the variant emission + supported = true + no UnsupportedChange.
- Integration test apply_schema_drops_a_nullable_property_softly_
  preserves_prior_version (replaces the former
  apply_schema_rejects_dropping_a_property_with_data) asserts:
  (a) plan contains DropProperty { Soft }
  (b) apply succeeds + manifest advances + row count unchanged
  (c) current dataset schema lacks the dropped column
  (d) snapshot_at_version(pre_drop) still has the dropped column
  (e) reopen consistency — drop preserved across engine restart

Recovery: rides on SidecarKind::SchemaApply per MR-847. No new
sidecar kind needed; the entire apply path is already sidecar-wrapped.

Substrate alignment: this commit uses the stage_overwrite full-rewrite
path (full_rewrite cost class) rather than Lance native drop_columns
(catalog_only cost class). MR-948 is the follow-up substrate-alignment
refactor that introduces a LanceColumnOp surface and switches the
metadata-only case onto drop_columns. Functional outcome is identical;
cost-class improvement deferred.

Test results:
- cargo test -p omnigraph-compiler --lib: 238 passed
- cargo test -p omnigraph-engine --test schema_apply: 11 passed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: move schema-lint-v1-plan into docs/dev/ + add to index

Post-rebase fixup for the docs split (#93). The plan doc was added
to docs/ at the top level before main reorganized to docs/{user,dev}/.
This moves it into docs/dev/ and adds an entry to docs/dev/index.md
under a new "Active Implementation Plans" section so the
check-agents-md.sh link check passes.

Per the original commit message (617a77d), the plan doc is intentionally
temporary — it will be deleted when v1 lands.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 16:30:03 +03:00
Andrew Altshuler
c142dafdf3
schema-lint chassis v0: code-tagged diagnostics (MR-694) (#87)
First slice of the schema-lint chassis. Adds stable `OG-XXX-NNN`
codes to schema-migration rejections so operators can suppress, look
up, and filter on identifiers rather than free-text prose. Atlas-style
chassis adapted to omnigraph's typed-IR substrate (no SQL injection
vector, no per-engine locks, native edge/vector/embedding types).

What's in v0:

- New `omnigraph-compiler/src/lint/` module with:
  - `diagnostic.rs` — Family / SafetyTier / Severity enums covering ten
    families (DS, MF, CD, BC, NM, OW, NL, VE, ED, LK). Only DS and MF
    are populated in this PR.
  - `codes.rs` — 8 DiagnosticCode constants (OG-DS-101..105,
    OG-MF-103, OG-MF-104, OG-MF-106). Five of the eight are wired to
    real emission sites; the other three are reserved.
  - Unit tests for catalog invariants: codes unique, prefix matches
    family, suffixes are 3-digit, destructive defaults to error,
    lookup() works, EMITTED_IN_V0 codes exist in ALL_CODES.

- `SchemaMigrationStep::UnsupportedChange` gains an optional
  `code: Option<String>` field. New `unsupported_error_message()`
  helper prefixes the message with `[code]` when present.

- 5 of 17 existing rejection paths now carry codes:
  - `removing node type` → OG-DS-102
  - `removing edge type` → OG-DS-103
  - `removing property` → OG-DS-104
  - `adding required property without backfill` → OG-MF-103
  - `changing property type` → OG-MF-106
  Remaining 12 paths carry `code: None` and are tagged as future work.

- `schema_apply` surfaces the formatted error (with `[code]` prefix);
  CLI `omnigraph schema plan` renders the code on the
  `unsupported change on <entity>` line.

- PR #62 destructive-rejection tests in `tests/schema_apply.rs` now
  assert on the stable code (`msg.contains("OG-DS-104")`) instead of
  the error-message substring. 11/11 tests pass.

- New `docs/schema-lint.md` documents the v0 catalog + the 10 families
  + Atlas prior art. AGENTS.md index updated.

What's explicitly NOT in v0 (subsequent PRs):

- No severity config in `omnigraph.yaml` (MR-694 §2).
- No `@allow(OG-XXX-NNN, "rationale")` suppression directive (§3).
- No `--allow-data-loss` flag or destructive-tier enforcement.
- No new `SchemaMigrationStep` variants (soft/hard drops, default,
  widen/narrow). MR-700, MR-697 land those.
- No pre-migration checks (MR-941).
- No CD / VE / LK / NM family rules (MR-942..945).
- No CI integration (MR-946).

Tests: 235 compiler tests, 11 schema_apply integration tests, 14
lint module tests, 55 CLI tests — all green.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-13 17:08:18 +03:00
Devin AI
a42d178119 release: prepare omnigraph 0.4.2 2026-05-10 14:02:28 +00:00
Ragnor Comerford
8726ffe0a3
release: bump version to 0.4.1 2026-05-02 23:20:50 +02:00
Andrew Altshuler
74eb5a5380
Parallel per-type load writes + omnigraph optimize/cleanup CLI (#46)
* Parallel per-type load writes + omnigraph optimize/cleanup CLI

## MR-677.3 — parallel per-type load writes

The load path already groups records into one RecordBatch per type and
makes one Lance commit per table (loader::mod.rs:249-..), but those
commits ran sequentially. Wrap node and edge write loops in
`futures::stream::buffered(N)` against a new helper
`write_batches_concurrently`. Concurrency tunable via
`OMNIGRAPH_LOAD_CONCURRENCY` (default 8).

## MR-676 — `omnigraph optimize` and `omnigraph cleanup`

New CLI subcommands that walk every node + edge table in the repo:

- `omnigraph optimize <uri>` — runs Lance `compact_files` on each
  table to merge small fragments into fewer larger ones.
- `omnigraph cleanup <uri> --keep N | --older-than 7d --confirm` —
  runs Lance `cleanup_old_versions` to prune historical manifests +
  unique fragments. Requires `--confirm` because it's destructive.
  Supports both count-based and time-based retention (or both AND'd
  together). Time uses chrono `DateTime<Utc>` (added as a workspace
  dep, default-features off).

Both commands run their per-table loops in parallel (8-way bounded,
`OMNIGRAPH_MAINTENANCE_CONCURRENCY` env override). Smoke-tested
against the 114-table prod graph: optimize went 7m15s sequential
→ 1m28s parallel. cleanup --keep 1 removed 137 historical versions
across 114 tables in 1m57s without disrupting `/healthz` or query
responses.

Public API on `Omnigraph`:

  pub async fn optimize(&mut self) -> Result<Vec<TableOptimizeStats>>
  pub async fn cleanup(&mut self, opts: CleanupPolicyOptions)
      -> Result<Vec<TableCleanupStats>>

All 10 existing loader tests still pass.

Closes MR-676.
Partially addresses MR-677 (the .3 — parallel by type — piece;
MR-677.1 is for the `omnigraph embed` path, not load, since load
doesn't call Gemini directly. .2 was already in place).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate openapi.json

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-25 14:22:14 +03:00
Andrew Altshuler
8649b2084f
Prepare v0.3.0 release (#44)
* Prepare v0.3.0 release

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: regenerate openapi.json

* ci: retrigger CI on latest openapi.json

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-04-21 19:11:34 +03:00
andrew
54101f7e2c Extract remaining crowded compiler test modules
Files where inline tests crowded out production code (test/prod ratio
≥ 0.8) move to sibling files via `#[path]`. Files where production
dominates (query_input.rs, schema_plan.rs) stay inline — extracting
would add noise, not reduce it.

- ir/lower.rs: 1239 → 577 lines (ratio 1.15)
- catalog/mod.rs: 594 → 326 lines (ratio 0.83)
- query/lint.rs: 562 → 314 lines (ratio 0.80)

catalog/tests.rs uses the shorter name since it's inside a module
directory (no ambiguity with filename).

All 229 compiler tests green, identical count to before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:49:09 +03:00
andrew
94849a50b4 Extract compiler test modules to sibling files
typecheck.rs, schema/parser.rs, and query/parser.rs each had
~1000-line inline `mod tests` blocks that overshadowed the production
code in the file. Move each to a sibling `*_tests.rs` using
`#[path = "..."] mod tests;`.

- typecheck.rs: 2865 → 1708 lines; typecheck_tests.rs: 1156 lines
- schema/parser.rs: 1950 → 994 lines; parser_tests.rs: 955 lines
- query/parser.rs: 1737 → 803 lines; parser_tests.rs: 933 lines

No visibility change — the sibling module still has `use super::*`
access to crate-privates. No semantic edits beyond de-indenting by
4 spaces (mechanical). All 229 compiler tests green, identical
count to before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:50:18 +03:00
andrew
33bdab1fcb Prepare v0.2.2 release 2026-04-14 20:13:00 +03:00
andrew
3d74cbfc20 Prepare v0.2.1 release 2026-04-14 19:19:00 +03:00
Ragnor Comerford
063be3ddc7
Merge pull request #16 from ModernRelay/tin-epoch
Fix join alignment for traversal-introduced bindings
2026-04-13 16:54:52 +02:00
Ragnor Comerford
6e43ceac08
Add comprehensive tests from morphological matrix analysis
Unit tests covering gaps identified by systematic matrix of:
topology (fan-out, fan-in, cycle) × deferral × filter type × direction.

New unit tests:
- fan-out: one root fans to two deferred destinations via different edges
- fan-in: two sources converge on one destination via reverse expand
- cycle: deferred binding + genuine cycle-closing on return edge
- multiple filters on single deferred binding (name + age)
- param filter on deferred binding (IRExpr::Param in dst_filters)
- negation with inner binding (documents current NodeScan+cycle-close behavior)

New integration tests:
- fan-out projection (friend × company cross-product per source)
- deferred filter matching nothing (empty result propagation)
- negation with inner destination binding filter

Also: guard anti-join fast path against non-empty dst_filters. The bulk
CSR existence check only tests neighbor existence, not destination
properties — it must fall back to the slow path when dst_filters are
present to avoid false negatives.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:31:08 +02:00
Ragnor Comerford
3461aa123d
Fix: exclude wildcard $_ from traversal adjacency graph
The anonymous wildcard variable _ was included as a regular node in the
undirected adjacency graph used for component analysis. When multiple
traversals referenced $_, it falsely bridged otherwise-independent
components, causing bindings in separate components to be deferred.
The deferred binding would never be introduced (since _ is never added
to bound_vars), leading to silently dropped traversals.

Fix: skip edges involving _ when building the adjacency graph.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:11:17 +02:00
Ragnor Comerford
fabd65b08a
Fix: propagate edge-lookup errors from iterative traversal loop
The retain-based loop swallowed catalog.lookup_edge_by_name errors by
keeping the traversal for the next pass, where it could never succeed.
This caused the no-progress break to fire, silently dropping the
traversal and producing incorrect query results with missing joins.

Replaced retain with a manual for-loop that propagates errors via ?.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 13:40:22 +02:00
Ragnor Comerford
88384476be
Fix traversal ordering: process in dependency order, not declaration order
The iterative lowering now handles traversals declared in non-topological
order (e.g. `$b worksAt $c` before `$a knows $b`).  Each pass processes
traversals that have at least one bound endpoint, repeating until all are
consumed.  Caught during self-review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:16:45 +02:00
Ragnor Comerford
853691c70e
Fix join alignment for traversal-introduced bindings with Lance filter pushdown
The IR lowering previously emitted independent NodeScans for every binding
in a match clause, even when bindings were connected by traversals. This
created O(N×M) cross-joins followed by cycle-closing filters — correct but
extremely slow for large datasets.

Two changes fix this by design:

1. **Deferred bindings** — When multiple bindings are connected by
   traversals, only the first-declared binding gets a NodeScan. The rest
   are introduced by Expand operations, eliminating cross-joins entirely.

2. **Filter fusion into Expand** — Deferred binding filters are attached
   directly to IROp::Expand (new `dst_filters` field) and pushed into
   Lance SQL during hydrate_nodes(), so the storage layer skips
   non-matching rows. Non-pushable filters (list-contains, FTS) fall back
   to in-memory application after hconcat.

For a query like:
  match { $p: Person  $p worksAt $c  $c: Company { name: "Acme" } }

Old plan: NodeScan($p) → NodeScan($c) → cross-join → Expand(__temp) → cycle-close
New plan: NodeScan($p) → Expand($p→$c, Lance SQL: id IN (...) AND name='Acme')

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:10:50 +02:00
Claude
c943d97744
Fix null-fill for nullable params when params JSON is None/null
The early return at line 273 for None/Value::Null params was skipping
the null-fill loop, leaving declared nullable params absent from the
map. Downstream code would then error with "parameter not provided".

https://claude.ai/code/session_014oGFKL7EVg1b2cyPgt9Gne
2026-04-13 09:37:17 +00:00
Claude
37b7a94eb7
Fix nullable query parameters: accept omission and null for ? params
Parameters declared with `?` (e.g. `$changelogUrl: String?`) now correctly
accept omission or explicit null in JSON input instead of requiring empty
strings as a workaround. Adds `Literal::Null` variant and threads it through
parameter parsing, type-checking, and Arrow array conversion.

https://claude.ai/code/session_014oGFKL7EVg1b2cyPgt9Gne
2026-04-13 08:43:48 +00:00
Ragnor Comerford
c5a88cacb5
Merge pull request #6 from ModernRelay/claude/omnigraph-aggregates-a53rG
Implement aggregate functions with GROUP BY support
2026-04-13 10:26:07 +02:00
andrew
1bf55fa52d Add query lint and check commands 2026-04-13 00:37:44 +03:00
Claude
351610d18c
Implement aggregate execution with wide-batch model
Add runtime support for aggregate functions (count, sum, avg, min, max)
with GROUP BY semantics, built on a single wide RecordBatch that
eliminates correlation tracking by construction.

Execution engine (exec/query.rs):
- Replace HashMap<String, RecordBatch> with Option<RecordBatch> where
  columns are prefixed as <variable>.<property>
- NodeScan prefixes columns and cross-joins with existing batch
- Expand collects (src_row, dst_id) pairs, takes wide batch rows,
  appends prefixed destination columns via hconcat
- Filter applies single mask to entire wide batch
- AntiJoin: fast-path returns BooleanArray mask; slow-path slices
  one row for inner pipeline execution

Projection engine (exec/projection.rs):
- aggregate_return groups rows by non-aggregate key columns using
  length-prefixed string encoding, computes per-group aggregates
- SUM accumulates into f64 to avoid integer overflow
- MIN/MAX support both numeric and string types
- Empty input returns count=0, others=null

Compiler (typecheck.rs):
- T8: split MIN/MAX from SUM/AVG — allow string arguments
- T9: non-aggregate expressions in aggregate queries must be
  property accesses or variables
- SUM type inference returns Float64 (matching runtime)

Tests: 8 new integration tests covering grouped count, global count,
sum/avg/min/max per company, aggregate+order+limit, string min/max,
multi-hop aggregates, and edge cases.

https://claude.ai/code/session_019o5NRyYomgETFyd7hpiLey
2026-04-12 20:59:13 +00:00
andrew
5daeae7571 Prepare v0.2.0 release 2026-04-12 20:35:34 +03:00
Claude
d10f78530f
Support multi-statement mutations (insert + edge in one query)
Allow mutation queries to contain multiple sequential statements that
execute atomically within a single transactional run. This enables
patterns like inserting a node and its edges in one query:

    query add_and_link($name: String, $age: I32, $friend: String) {
        insert Person { name: $name, age: $age }
        insert Knows { from: $name, to: $friend }
    }

Changes span the full compiler-to-execution pipeline:
- Grammar: mutation_body = { mutation_stmt+ }
- AST: QueryDecl.mutations: Vec<Mutation>
- IR: MutationIR.ops: Vec<MutationOpIR>
- Execution: loop over ops, accumulate affected counts

Cross-statement visibility works because each statement's commit_updates
advances the manifest state, so subsequent statements see prior writes.
Atomicity comes from the existing run mechanism (begin_run/publish_run).

https://claude.ai/code/session_01E4VG2WXrZW8aeXFiqr8NwF
2026-04-11 20:27:51 +00:00
andrew
40ed575e7e Set public release version to 0.1.0 2026-04-11 05:33:04 +03:00
andrew
338289656a Initial public Omnigraph repository 2026-04-10 20:49:41 +03:00