mirror of
https://github.com/ModernRelay/omnigraph.git
synced 2026-06-15 01:55:13 +02:00
The IR lowering previously emitted independent NodeScans for every binding
in a match clause, even when bindings were connected by traversals. This
created O(N×M) cross-joins followed by cycle-closing filters — correct but
extremely slow for large datasets.
Two changes fix this by design:
1. **Deferred bindings** — When multiple bindings are connected by
traversals, only the first-declared binding gets a NodeScan. The rest
are introduced by Expand operations, eliminating cross-joins entirely.
2. **Filter fusion into Expand** — Deferred binding filters are attached
directly to IROp::Expand (new `dst_filters` field) and pushed into
Lance SQL during hydrate_nodes(), so the storage layer skips
non-matching rows. Non-pushable filters (list-contains, FTS) fall back
to in-memory application after hconcat.
For a query like:
match { $p: Person $p worksAt $c $c: Company { name: "Acme" } }
Old plan: NodeScan($p) → NodeScan($c) → cross-join → Expand(__temp) → cycle-close
New plan: NodeScan($p) → Expand($p→$c, Lance SQL: id IN (...) AND name='Acme')
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| fixtures | ||
| helpers | ||
| aggregation.rs | ||
| branching.rs | ||
| changes.rs | ||
| consistency.rs | ||
| end_to_end.rs | ||
| export.rs | ||
| failpoints.rs | ||
| lance_version_columns.rs | ||
| point_in_time.rs | ||
| runs.rs | ||
| s3_storage.rs | ||
| search.rs | ||
| traversal.rs | ||