omnigraph

apunkt/omnigraph

mirror of https://github.com/ModernRelay/omnigraph.git synced 2026-06-24 02:38:06 +02:00

Commit graph

Author	SHA1	Message	Date

andrew

628bc2e607

Clean up bench_expand example

Remove vestigial code left from removed hasher variants: unused
BuildHasherDefault import, PhantomData suppression line, orphan planning
comments for Variant C/E. Also drop an unused `mut` on the PRNG closure
binding. No behavior change; compiles warning-free.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-04-25 00:59:21 +03:00

Ragnor Comerford

d8e0bfeb22

Dedupe dst ids before hydrating nodes in execute_expand (#45)

The BFS in execute_expand emits one (src_idx, dst_id) pair per edge, so
dst_id_list contains heavy duplication when multi-hop traversals revisit
the same destination nodes. hydrate_nodes then built an
"id IN ('a', 'b', ...)" filter from the full list, passing it verbatim
to Lance. On a 30k-node Person graph, a 3-hop query produced a 15.4M-
entry IN-list against a 30k-row target — 512x more entries than unique
ids.

Deduplicate before the Lance scan; the post-hydrate alignment HashMap
already fans results back out to the original (src, dst) pairs, so
output is bit-identical.
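
As a rough illustration of the change described above, here is a minimal sketch and not the actual omnigraph code. The function names (`unique_dst_ids`, `in_filter`, `fan_out`), the use of `String` ids, and the quote-escaping rule are all assumptions made for this example; the real `execute_expand` and `hydrate_nodes` may be structured differently.

```rust
use std::collections::{HashMap, HashSet};

/// Collapse BFS output to unique destination ids, keeping first-seen order.
/// (Hypothetical helper; the pair layout mirrors the (src_idx, dst_id)
/// pairs described in the commit message.)
fn unique_dst_ids(pairs: &[(usize, String)]) -> Vec<String> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut unique = Vec::new();
    for (_, dst) in pairs {
        // `insert` returns true only the first time an id is seen.
        if seen.insert(dst.as_str()) {
            unique.push(dst.clone());
        }
    }
    unique
}

/// Build an "id IN (...)" filter string from the deduplicated ids.
/// Escaping by doubling single quotes is an assumption of this sketch.
fn in_filter(ids: &[String]) -> String {
    let quoted: Vec<String> = ids
        .iter()
        .map(|id| format!("'{}'", id.replace('\'', "''")))
        .collect();
    format!("id IN ({})", quoted.join(", "))
}

/// Fan hydrated rows back out to the original (src, dst) pairs, so the
/// output has one entry per edge even though each id was fetched once.
fn fan_out<'a, R>(
    pairs: &[(usize, String)],
    hydrated: &'a HashMap<String, R>,
) -> Vec<(usize, &'a R)> {
    pairs
        .iter()
        .filter_map(|(src, dst)| hydrated.get(dst).map(|row| (*src, row)))
        .collect()
}
```

With five pairs that reference only two distinct destinations, `unique_dst_ids` yields two ids, the filter lists two entries instead of five, and `fan_out` still returns five rows in the original pair order.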

Bench numbers (crates/omnigraph/examples/bench_expand.rs, min of 2-3
runs, release build):

  query        before     after  speedup
  1k   hop3    460 ms     28 ms      16x
  10k  hop2    4.21 s    188 ms      22x
  10k  hop3   40.59 s    1.30 s      31x
  30k  hop2   11.71 s    678 ms      17x
  30k  hop3  197.38 s    4.86 s      41x

All existing omnigraph-engine tests pass (72/72, 0 failures).

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

2026-04-25 00:56:18 +03:00

2 commits