diff --git a/.context/experiments/bitmap-pushdown.md b/.context/experiments/bitmap-pushdown.md
index 67f88d8..2dac578 100644
--- a/.context/experiments/bitmap-pushdown.md
+++ b/.context/experiments/bitmap-pushdown.md
@@ -1,6 +1,15 @@
 # Experiment 1.5 — Extending DataFusion dynamic-filter-pushdown to bitmap shape (code-dive)
 
-**Ticket:** MR-925 §1.5 (validates MR-737 §5.6, §5.7 / Open Q3).
+**Ticket:** MR-925 §1.5 (validates MR-737 §5.3 "Sideways Information Passing
+(SIP) — extends DataFusion's dynamic filter pushdown" + §5.6 "Capability-bearing
+storage trait", with implications for §5.7 cost-model).
+
+**§-numbering note:** the original MR-925 cross-reference said "§5.3 / §5.10".
+On a full re-read of MR-737: §5.3 is the SIP design and is the correct primary
+§ for this experiment; §5.10 in current MR-737 is "First-class scores and rank
+fusion" (unrelated). The capability seam that the bitmap pushdown traverses
+on the storage side is §5.6 (`scan_by_key_set` capability), so this writeup
+also produces deltas for §5.6.
 **Type:** Code-dive only (no prototype crate).
 **Substrate pin:** DataFusion 52.5.
 **Date:** 2026-05-12.
diff --git a/.context/experiments/custom-lance-index.md b/.context/experiments/custom-lance-index.md
index 16c23e4..125b5e0 100644
--- a/.context/experiments/custom-lance-index.md
+++ b/.context/experiments/custom-lance-index.md
@@ -1,6 +1,14 @@
 # Experiment 1.2 — Custom Lance index plugin from outside the lance crate
 
-**Ticket:** MR-925 §1.2 (validates MR-737 §5.4, §5.5).
+**Ticket:** MR-925 §1.2 (validates MR-737 §5.4 "Persisted CSR adjacency as
+Lance index plugin" + §5.16 "Index rebuild orchestration — stateless reconciler").
+
+**§-numbering note:** the original MR-925 cross-reference said "§5.4, §5.5".
+On a full re-read of MR-737: §5.4 is correct (this is the plugin section);
+§5.5 is "Stable row IDs as graph IDs" — *not* the reconciler. The reconciler
+is §5.16 ("Index rebuild orchestration"). Section references in the body
+been corrected; §5.5 is *indirectly* in scope only because the experiment
+leans on stable row IDs.
 **Prototype:** `validation-prototypes/custom-lance-index/` (long-lived branch).
 **Substrate pin:** Lance 4.0.1 (matched by cargo to 4.0.0 spec). Lance 4.0.1 internally pulls roaring 0.11 and prost-types 0.14; the workspace deps were lifted to match.
 **Date:** 2026-05-12.
@@ -128,8 +136,8 @@ is closed to us). Any custom-index reconciler we ship has to:
 - emit `Operation::CreateIndex { new_indices: vec![updated], removed_indices: vec![old] }`
   to re-publish the index with the updated bitmap.
 
-This is *consistent with* the §5.5 reconciler pattern in MR-737, so it's
-not a blocker — but the writeup of §5.5 should explicitly say "the
+This is *consistent with* the §5.16 reconciler pattern in MR-737, so it's
+not a blocker — but the writeup of §5.16 should explicitly say "the
 reconciler also owns fragment coverage diffs, not just file content".
 
 ### F5. Compaction does not move our index. ⚠️
@@ -203,7 +211,7 @@ paths forward:
 both. The current MR-737 wording implies path (1) is available; this
 experiment proves it is not.
 
-§5.5 (reconciler pattern) is unaffected by this finding — but it must
+§5.16 (reconciler pattern) is unaffected by this finding — but it must
 expand to explicitly own `fragment_bitmap` recomputation across all
 mutating operations, since with path (2) or path (3) we are the only
 party that knows the index's row coverage.
diff --git a/.context/experiments/custom-operator.md b/.context/experiments/custom-operator.md
index 8bd5f43..5e53c8c 100644
--- a/.context/experiments/custom-operator.md
+++ b/.context/experiments/custom-operator.md
@@ -1,6 +1,16 @@
 # Experiment 1.3 — Custom UserDefinedLogicalNode + ExecutionPlan e2e
 
-**Ticket:** MR-925 §1.3 (validates MR-737 §5.3, §5.10).
+**Ticket:** MR-925 §1.3 (validates MR-737 §5.1 "Unified IR", §5.2 "Factorized IR",
+§5.11 "Substrate choice — DataFusion vs. custom executor (A)", and §5.12
+"Mutation IR" — all of which rely on `UserDefinedLogicalNode` + `ExecutionPlan`
+surviving the optimizer end-to-end).
+
+**§-numbering note:** the original §1.3 cross-reference in MR-925 cited "§5.3,
+§5.10". On a full re-read of MR-737: §5.10 is "First-class scores and rank
+fusion (open-world over modalities)" (not "operators survive the optimizer"),
+and the custom-operator e2e contract is actually shared across §5.1, §5.2,
+§5.11, and §5.12. §5.3 (SIP) is the *first* operator that consumes the contract
+— valid — but the contract itself is broader than §5.3 alone.
 **Prototype:** `validation-prototypes/custom-operator/`.
 **Substrate pin:** DataFusion 52.5 (matched to omnigraph workspace).
 **Date:** 2026-05-12.
@@ -168,7 +178,7 @@ These are not blockers but should be noted for the §11 RFC-body delta:
 the RFC that graph operators must explicitly choose their output
 partitioning rather than inheriting.
 
-## Decision impact on MR-737 §5.3 and §5.10
+## Decision impact on MR-737 §5.1, §5.2, §5.3, §5.11, §5.12
 
 **§5.3 is achievable on DataFusion 52.5 as written.** The
 `UserDefinedLogicalNode`/`ExecutionPlan` surface is fully sufficient
@@ -183,9 +193,9 @@ NeighborSetIntersect, etc.). The only edits needed in §5.3:
 operators implementing it accurately, not punting to
 `Statistics::new_unknown`.
 
-**§5.10 ("operators survive the optimizer + execute correctly")**:
-The composition test (E4) plus the metrics test (E5) cover this. No
-deltas needed.
+**§5.1 / §5.2 / §5.11 / §5.12 ("custom operators survive the optimizer +
+execute correctly")**: The composition test (E4) plus the metrics test (E5)
+cover this. No deltas needed.
 
 ## Caveats
 
diff --git a/.context/experiments/sip-format-bench.md b/.context/experiments/sip-format-bench.md
index fd7dd2c..d8f76e9 100644
--- a/.context/experiments/sip-format-bench.md
+++ b/.context/experiments/sip-format-bench.md
@@ -1,6 +1,15 @@
 # Experiment 1.4 — Roaring bitmap variant for u64 row IDs (SIP wire format)
 
-**Ticket:** MR-925 §1.4 (validates MR-737 §5.6, §5.8 / Open Q4).
+**Ticket:** MR-925 §1.4 (validates MR-737 §5.3 "Sideways Information Passing
+(SIP)" wire format + §10 Open Q3 "SIP wire format").
+
+**§-numbering note:** the original MR-925 cross-reference said "§5.6, §5.8 /
+Open Q4". On a full re-read of MR-737: §5.3 is SIP (the wire format is named
+in Open Q3); §5.6 is the storage trait that *consumes* SIP via `sip_mask` /
+`key_set` on `ScanRequest`; §5.8 is "Tiering via Lance base paths" (unrelated
+to SIP). Open Q4 is "CSR index format inside Lance" (also unrelated). The
+correct primary mapping is **§5.3 + §10 Open Q3**, with downstream
+implications for §5.6 (the capability advertisement).
 **Prototype:** `validation-prototypes/sip-format-bench/`.
 **Substrate pin:** `roaring = "0.11"` (matched to lance-table dependency).
 **Date:** 2026-05-12.
@@ -10,7 +19,7 @@
 ## Hypothesis
 
 For propagating row-ID side-information predicates (SIPs) between operators —
-the §5.6 dynamic-filter-pushdown wire format — Roaring bitmaps over u64
+the §5.3 dynamic-filter-pushdown wire format — Roaring bitmaps over u64
 (`RoaringTreemap`) are the right encoding when row IDs cluster by Lance
 fragment (which they do). For random u64s, Roaring is *not* the right choice.
 
@@ -148,7 +157,7 @@ At `n=1M` dense, encoding takes **37ms**, decoding takes **0.02ms**. For
 "build once, read many" wire-format use, this is fine. But if the SIP is
 built mid-pipeline (e.g. from a `FilterExec`'s output IDs) and
 intersected immediately with another payload, the build cost dominates.
-The §5.6 RFC should clarify: SIPs are produced at *probe-build time* on
+The §5.3 RFC should clarify: SIPs are produced at *probe-build time* on
 the hash-join build side, where 37ms is amortized across the entire
 probe phase.
 
@@ -176,9 +185,9 @@ construction.
 
 Default fallback (for non-row-ID u64s): **varint-delta**.
 
-## Decision impact on MR-737 §5.6 and §5.8
+## Decision impact on MR-737 §5.3 + §10 Open Q3 (and downstream §5.6)
 
-**§5.6 (SIP wire format) — concrete choice:**
+**§5.3 (SIP wire format) — concrete choice:**
 
 > ROW_ID_SIP wire format := length-prefixed roaring `serialize_into` bytes
 > with a 1-byte format-tag prefix. Tag values: `0x01` = Roaring (u64
@@ -189,7 +198,7 @@ Default fallback (for non-row-ID u64s): **varint-delta**.
 This makes the wire format extensible while picking a default that
 matches the dominant workload.
 
-**§5.8 / Open Q4 — answered:**
+**Open Q3 — answered:**
 
 The RFC's Q4 ("can we share the SIP filter between operator stages by
 serializing roaring bytes?") is **yes for row-ID payloads**.
diff --git a/.context/experiments/stable-row-id-compaction.md b/.context/experiments/stable-row-id-compaction.md
index e1ffa9d..58016a0 100644
--- a/.context/experiments/stable-row-id-compaction.md
+++ b/.context/experiments/stable-row-id-compaction.md
@@ -1,6 +1,17 @@
 # Experiment 1.7 — Stable-row-id-aware indices survive compaction (code-dive + small repro plan)
 
-**Ticket:** MR-925 §1.7 (validates MR-737 §5.4, §5.10 / Open Q6).
+**Ticket:** MR-925 §1.7 (validates MR-737 §5.4 "Persisted CSR adjacency as Lance
+index plugin" + §5.5 "Stable row IDs as graph IDs").
+
+**§-numbering note (added on re-read of MR-737):** MR-925's original §1.7 cross-
+reference cited "§5.8 / Open Q7" / "§5.10". On a full read of MR-737, §5.10 is
+"First-class scores and rank fusion" (NOT custom index types), §5.4 is "Persisted
+CSR adjacency as Lance index plugin" (which contains the custom-index-type seam),
+and §5.5 is "Stable row IDs as graph IDs" (which flags the experimental status of
+"Stable Row ID for Index" in lance-4.0.x). The corrected mapping for §1.7 is
+**§5.4 + §5.5** and the MR-737 §5.5 substrate caveat that "`Stable Row ID for
+Index` is documented as experimental in lance-4.0.x" is the immediate caveat for
+this experiment.
 **Type:** Code-dive plus a planned small repro (not yet built; specified for Phase 0 entry).
 **Substrate pin:** Lance 4.0.1, lance-index 4.0.1.
 **Date:** 2026-05-12.
@@ -9,10 +20,13 @@
 
 ## Question
 
-MR-737 §5.4 (graph topology index) and §5.10 (custom index types for graph
-adjacency) both depend on the assumption that a custom index — i.e. our
-own CSR/CSC adjacency lists keyed by source-table row IDs — **continues
-to point at the right rows after the source table is compacted.**
+MR-737 §5.4 ("Persisted CSR adjacency as Lance index plugin") and §5.5 ("Stable
+row IDs as graph IDs") both depend on the assumption that a custom CSR/CSC
+adjacency index keyed by source-table row IDs **continues to point at the right
+rows after the source table is compacted.** §5.5 explicitly flags the substrate
+caveat: "Stable Row ID for Index" is **experimental** in lance-4.0.x; confirming
+whether our created indices opt into stable-row-id mode is a follow-up worth
+doing before MR-848 (index reconciler) lands.
 
 Lance's compaction (`compact_files`) consolidates fragments, which on
 the non-stable row-ID scheme renumbers row addresses. The question:
@@ -253,19 +267,22 @@ demonstrates end-to-end survival. Specification:
 Estimated effort: 1–2 days. **Defer to Phase 0**; the code-dive already
 justifies §5.4 as feasible without the repro.
 
-## Decision impact on MR-737 §5.4 and §5.10
+## Decision impact on MR-737 §5.4 and §5.5
 
-**§5.4 (graph topology index) is feasible on Lance 4.0.1 with stable
-row IDs (Path B):**
+**§5.4 (persisted CSR adjacency as Lance index plugin) is feasible on Lance
+4.0.1 with stable row IDs (Path B):**
 
 - No Lance plugin-registry dependency; we drive remapping ourselves.
 - The custom topology dataset stores stable row IDs end-to-end; the
   bulk of compaction-induced changes don't require remap.
 - Path A (Lance-managed remapping) is a follow-up improvement
-  contingent on the §1.2 contribution.
+  contingent on the §1.2 plugin-registry contribution.
 
-**§5.10 (custom index types):** No new findings beyond §1.2. The
-`ScalarIndex::remap` contract is sufficient and stable.
+**§5.5 (stable row IDs as graph IDs):** The MR-737 substrate caveat
+("`Stable Row ID for Index` is experimental in lance-4.0.x") still
+stands. The small repro in §5 above is the way to confirm opt-in;
+until it runs, treat §5.5 as substrate-positive but not yet validated
+for index-side stable IDs.
 
 **Open Q6 ("survive compaction"):** Answered yes. The recommendation is
 **Path B for v1, Path A for v2**. RFC §5.4 should specify the
diff --git a/.context/experiments/txn-branches-cost.md b/.context/experiments/txn-branches-cost.md
index dbb92fa..cfe001e 100644
--- a/.context/experiments/txn-branches-cost.md
+++ b/.context/experiments/txn-branches-cost.md
@@ -1,6 +1,25 @@
 # Experiment 1.6 — Lance native per-table txn branches (code-dive, cost model)
 
-**Ticket:** MR-925 §1.6 (validates MR-737 §5.11, §5.12 / Open Q5).
+**Ticket:** MR-925 §1.6 (validates MR-737 §5.12 "Mutation IR, write planner,
+and external sources").
+
+**§-numbering note (added on re-read of MR-737):** MR-925's original §1.6 cross-
+reference cited "§5.11, §5.12 / Open Q5". On a full read of MR-737:
+
+- **§5.11** is "Substrate choice — DataFusion vs. custom executor — RECOMMENDED
+  (A)" — resolved 2026-05-11, not about per-table branches.
+- **§5.12** "Mutation IR, write planner, and external sources" is where the
+  *implementation note* says "writes go to per-table Lance native branches
+  (storage-layer staging invisible above the TableProvider/Dataset boundary),
+  success → fast-forward, failure → drop branches. NOT `restore()`-based rollback"
+  (also called out in §0.5 item 9). This is what §1.6 actually validates.
+- **Open Q5** in the current MR-737 §10 is "What 'extension rate' means under
+  filters" — unrelated. The MR-925 cross-reference is to a pre-2026-05-11
+  numbering where Q5 covered per-table branches; that question is now folded
+  into §5.12 and resolved per Ragnor 2026-04-29.
+
+Corrected mapping for §1.6: **§5.12 (mutation-IR per-table branches)** — cost
+model + recommendation.
 **Type:** Code-dive only — no prototype crate.
 **Substrate pin:** Lance 4.0.1.
 **Date:** 2026-05-12.
@@ -204,11 +223,12 @@ cheaper.
 *branches*, not just zombie *commits*. This is more state to track,
 not less.
 
-## Decision impact on MR-737 §5.11 and §5.12
+## Decision impact on MR-737 §5.12
 
-**§5.11 (per-table txn isolation):** Lance-native branches **can**
-implement this, but the steady-state cost is essentially the same as
-the current lazy-graph-branch model, and the abort-path cost is *higher*.
+**§5.12 (mutation IR with per-table branches):** Lance-native branches **can**
+implement the per-table staging shape that §5.12 prescribes, but the steady-state
+cost is essentially the same as the current lazy-graph-branch model, and the
+abort-path cost is *higher*.
 The conceptual clarity argument is real but not load-bearing — both
 models provide the same isolation guarantee.
 
@@ -236,7 +256,7 @@ Lance-native per-table branches only if:
 "lazy" so most branches don't fork most tables).
 
 For Phase 0, the deliverable is a **clear specification** of which model
-MR-737 §5.11/§5.12 prescribes. The cost analysis above suggests
+MR-737 §5.12 prescribes. The cost analysis above suggests
 specifying the current model.
 
 ## Caveats and follow-ups
@@ -255,4 +275,4 @@ specifying the current model.
 - **Forbidden APIs file** (`crates/omnigraph/tests/forbidden_apis.rs:57`)
   excludes `.delete_branch(` from the over-match check — there's intent
   in the codebase to allow these calls. Worth a follow-up read
-  to see if MR-737 §5.11 has already opened the door.
+  to see if MR-737 §5.12 has already opened the door.
diff --git a/.context/research/df-extensions.md b/.context/research/df-extensions.md
index 7470b59..05a3191 100644
--- a/.context/research/df-extensions.md
+++ b/.context/research/df-extensions.md
@@ -73,7 +73,7 @@ TableProviders live) absorbs most of the churn. **Reinforces the
 
 ## ParadeDB — Postgres + DataFusion for search
 
-**Maps to:** §5.10 custom index types, §5.6 mixing engines.
+**Maps to:** §5.4 custom index types (corrected from §5.10 on re-read of MR-737; §5.10 is rank fusion), §5.6 mixing engines.
 
 ParadeDB embeds DataFusion inside Postgres as a query executor for
 analytics workloads. They have a custom `pg_search` extension that
diff --git a/.context/research/duckdb.md b/.context/research/duckdb.md
index 5c947b3..fff712f 100644
--- a/.context/research/duckdb.md
+++ b/.context/research/duckdb.md
@@ -3,8 +3,17 @@
 **Repo:** [`github.com/duckdb/duckdb`](https://github.com/duckdb/duckdb)
 (C++, ~25k★).
 **MR-925 §3.5 mapping:** §5.2 (factorization alternatives — DuckDB
-takes a different approach), §5.6 (vectorized scan), §5.10 (custom
-extensions / index types).
+takes a different approach), §5.6 (vectorized scan), §5.4 (custom
+extensions / index types — corrected on re-read of MR-737; §5.10 is
+rank fusion, *not* index types).
+
+**§-numbering note:** the original MR-925 cross-references in this file said
+§5.10 (custom index types) and §5.11 (multi-statement transactions). On a
+full re-read of MR-737, §5.10 is "First-class scores and rank fusion" and
+§5.11 is "Substrate choice — DataFusion vs. custom executor". The correct
+mapping is §5.4 (custom index plugin model) and §5.12 (mutation IR /
+multi-statement transactions). Section headings below are updated to use the
+corrected numbering.
 **Date:** 2026-05-12.
 
 ---
@@ -51,7 +60,7 @@ For graph workloads with high expansion (per-row neighbor counts >
 workloads vs. a fully-vectorized non-factorized engine. Validates the
 design choice for MR-737.
 
-### Custom extensions / index types — maps to MR-737 §5.10
+### Custom extensions / index types — maps to MR-737 §5.4
 
 DuckDB has a **rich extension API**:
 - `duckdb_extension.h` — public C ABI for loading shared libraries.
@@ -64,13 +73,13 @@ core crate." DuckDB extensions can register completely new index
 types without forking the core. Compare to Lance (§1.2): Lance has the
 necessary trait surface but the registry is `pub(crate)`.
 
-**Decision impact on MR-737 §5.10:** DuckDB demonstrates that the
+**Decision impact on MR-737 §5.4:** DuckDB demonstrates that the
 plugin pattern is workable and prevalent. Our §1.2 blocker (Lance's
 `pub(crate)` registry) is a fixable one — it's not an architectural
 question, just an API-visibility question. Worth a DuckDB-cited
 upstream PR to Lance.
 
-### Multi-statement transactions — maps to MR-737 §5.11
+### Multi-statement transactions — maps to MR-737 §5.12
 
 DuckDB uses MVCC with **per-transaction undo logs** for in-memory
 state. Persistent state is flushed at commit time.
@@ -118,7 +127,7 @@ sink builds and source emits. This is the same shape DataFusion's
 - **§5.6 (scan model):** Pull-based vectorized scan is the right
   shape. DataFusion already gives us this; no new patterns from
   DuckDB.
-- **§5.10 (custom index types):** DuckDB is the proof-of-concept for
+- **§5.4 (custom index plugin model):** DuckDB is the proof-of-concept for
   fully-extensible index plugins. Cite when raising the Lance upstream
   `pub(crate)` issue.
 
diff --git a/.context/research/lancedb.md b/.context/research/lancedb.md
index 2dc86ec..08ba440 100644
--- a/.context/research/lancedb.md
+++ b/.context/research/lancedb.md
@@ -2,8 +2,9 @@
 
 **Repo:** [`github.com/lancedb/lancedb`](https://github.com/lancedb/lancedb)
 (OSS LanceDB, Rust + Python).
-**MR-925 §3.2 mapping:** §5.6 (TableProvider integration), §5.10
-(custom index types), §5.12 (mutation-as-IR), §5.1 (vector search as
+**MR-925 §3.2 mapping:** §5.6 (TableProvider integration), §5.4 (Lance index
+plugin model — *not* §5.10, which is rank fusion; corrected on re-read of MR-737),
+§5.12 (mutation-as-IR), §5.1 + §5.10 (vector search as
 operator vs UDF).
 **Date:** 2026-05-12.
 
@@ -36,7 +37,10 @@ DataFusion's `TableProvider` trait. Key files (Rust side):
 for predicates Lance can fully handle and `::Inexact`/`::Unsupported`
 for the rest. This is the exact pattern §5.6 prescribes.
 
-### Vector search as a query operator — maps to MR-737 §5.1, §5.10
+### Vector search as a query operator — maps to MR-737 §5.1 + §5.10
+
+(§5.1 = unified IR with `VectorSearch` as `IROp`; §5.10 = first-class scores +
+rank fusion; VectorSearch emits `_score`/`_rank` per §5.10.)
 
 LanceDB's `Table::query(vector)` builds a `VectorQuery` and lowers it
 to a Lance scan with a `nearest` filter. **At the DataFusion level**,
@@ -119,7 +123,7 @@ the rest must be designed fresh.
 - **§5.6 (TableProvider integration):** LanceDB's filter-pushdown
   conversion is directly reusable. Capability advertisement beyond
   filters (partitioning, SIP) is OmniGraph-original.
-- **§5.10 (custom index types):** LanceDB integrates with Lance's
+- **§5.4 (Lance index plugin model):** LanceDB integrates with Lance's
   built-in scalar/vector indices but does **not** add custom index
   types from outside the lance crate. Per Experiment 1.2, the plugin
   registry is `pub(crate)`. LanceDB is not a reference for solving
diff --git a/.context/research/trino.md b/.context/research/trino.md
index 51e7080..5d34f97 100644
--- a/.context/research/trino.md
+++ b/.context/research/trino.md
@@ -2,9 +2,20 @@
 
 **Repo:** [`github.com/trinodb/trino`](https://github.com/trinodb/trino)
 (Java, ~10k★).
-**MR-925 §3.6 mapping:** §5.7 (cost-based optimizer at scale), §5.11
-(distributed transaction patterns — informative, NOT prescriptive),
-§5.10 (connector SPI).
+**MR-925 §3.6 mapping:** §5.7 (cost-based optimizer at scale), §5.12
+(distributed transaction patterns — informative, NOT prescriptive; per-table
+branches are §5.12, not §5.11 which is "DataFusion vs custom executor"),
+§5.4 + §5.6 (connector SPI as the plugin/capability surface — corrected on
+re-read of MR-737; §5.10 is rank fusion, *not* connector SPI).
+
+**§-numbering note:** original MR-925 cross-references in this file used
+§5.10 (connector SPI) and §5.11 (per-table txn branches / distributed
+transactions). On a full re-read of MR-737, §5.10 is "First-class scores
+and rank fusion" and §5.11 is "Substrate choice — DataFusion vs. custom
+executor (A)". The correct mapping for the plugin/capability surface is
+§5.4 (Lance index plugin model) + §5.6 (capability-bearing storage trait);
+the correct mapping for per-table txn branches is §5.12 (mutation IR with
+per-table Lance native branches). Section headings below are updated.
 **Date:** 2026-05-12.
 
 ---
@@ -37,7 +48,7 @@ greedy** for larger queries.
 most useful histogram (informs neighbor-expansion cardinality). **Add
 to §5.7 spec.**
 
-### Connector SPI — maps to MR-737 §5.10
+### Connector SPI — maps to MR-737 §5.4 + §5.6
 
 Trino's "Connector SPI" is a Java interface for adding new data
 sources (Postgres, MySQL, Hive, Iceberg, …). The interface is **fully
@@ -55,7 +66,7 @@ public** — connectors are JARs that get registered at server startup.
 `TableProvider` (less stable, breaks ~once per release). Trino is the
 reference for "what a properly-versioned plugin API looks like."
 
-### Distributed transactions — maps to MR-737 §5.11 (informative)
+### Distributed transactions — maps to MR-737 §5.12 + §10 Open Q9 (informative)
 
 Trino is a **stateless query coordinator**: it does NOT manage
 transactions itself. Each connector's underlying system manages its
@@ -64,8 +75,8 @@ identifier that lets multiple queries see the same snapshot of each
 connected system.
 
 **Why this matters:** Trino's approach is the opposite of MR-737
-§5.11 (per-table txn branches with cross-table coordination). Trino
-punts on the cross-system coordination problem; MR-737 doesn't.
+§5.12 (per-table Lance native txn branches with cross-table coordination).
+Trino punts on the cross-system coordination problem; MR-737 doesn't.
 
 **Re-usable pattern:** None directly. Trino is the **anti-reference**
 — what NOT to do if you want cross-table ACID. MR-737's design is
@@ -123,16 +134,17 @@ encoding; we'd be ahead of Trino on this dimension.
   OmniGraph is currently in-process (Phase 0–2). The distributed
   patterns become relevant only at Phase 3+.
 - **The stateless-coordinator approach to transactions.** MR-737
-  §5.11 explicitly wants stateful per-table txn branches.
+  §5.12 explicitly wants stateful per-table txn branches.
 
 ## Decision impact on MR-737
 
 - **§5.7 (cost model):** Add memory cost as third component;
   out-degree histograms as the most-useful additional statistic.
-- **§5.10 (connector SPI / plugin):** Trino is the gold standard for
-  a stable, versioned plugin API. Cite when designing OmniGraph's
-  custom-index-type registration spec.
-- **§5.11 (transactions):** Trino is anti-reference; MR-737 is
+- **§5.4 + §5.6 (connector SPI / plugin / capability surface):** Trino is
+  the gold standard for a stable, versioned plugin API. Cite when designing
+  OmniGraph's custom-index-type registration spec (§5.4) and the storage
+  trait capability surface (§5.6).
+- **§5.12 (transactions):** Trino is anti-reference; MR-737 is
   strictly more ambitious. Document this.
 - **§5.6 (dynamic filters):** Validates the §1.5 design; we'd be
   ahead of Trino with roaring-encoded dynamic filters.