omnigraph/vendor/lance-table/protos/index_old.proto
aaltshuler b5c0c6238b fix(deps): vendor lance-table 7.0.0 + lance#7480 so merge-updated tables survive filtered reads after deletes
iss-merge-rowid-overlap-corrupts-filtered-reads / lance#7444: an
update-style merge_insert over a merge-written fragment legally reuses the
updated rows' stable row ids (row-id-lineage spec: updates preserve
_rowid) while the superseded fragment keeps its full sequence plus a
deletion vector. A later delete leaves the overlapping id range sparsely
tiled, and lance-table 7.0.0's RowIdIndex::new asserted dense tiling —
failing every filtered read that builds the id→address map ("Wrong range"
debug assert; "all columns in a record batch must have the same length"
or a silently-wrong batch in release).

The upstream fix (lance#7480, merged 2026-07-01) landed hours AFTER
v8.0.0 was cut, so no release ≤ 8.0.0 carries it. Consume it now as a
vendored pin: vendor/lance-table is the pristine published 7.0.0 source
plus ONLY the #7480 rowids/index.rs hunk (drop the false tiling assert;
hard-error on the true invariant — one live id claimed by two fragments)
and upstream's regression unit test, wired via [patch.crates-io]. The fix
is read-side only, so already-written graphs become readable as-is — no
data repair.
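
To make the invariant change concrete, here is a minimal Rust sketch of the idea, not the lance-table code. `Fragment`, `Address`, and `build_index` are hypothetical names, and modelling a fragment's ids as one contiguous range is a simplifying assumption. It builds the id→address map from live rows only, tolerates overlapping id ranges, and returns a hard error only when one live id is claimed by two fragments.

```rust
use std::collections::{BTreeMap, HashSet};
use std::ops::Range;

/// Hypothetical stand-in for a fragment's row-id metadata
/// (illustrative only, not the lance-table type).
struct Fragment {
    id: u32,
    /// Stable row ids stored in this fragment, in physical order.
    row_ids: Range<u64>,
    /// Offsets within the fragment that its deletion vector marks as deleted.
    deleted_offsets: HashSet<u32>,
}

/// (fragment id, offset within that fragment)
type Address = (u32, u32);

/// Build the id -> address map from live rows only.
/// Overlapping id ranges are fine as long as each id is live in at most one
/// fragment; a second live claim on the same id is a hard error.
fn build_index(fragments: &[Fragment]) -> Result<BTreeMap<u64, Address>, String> {
    let mut index = BTreeMap::new();
    for frag in fragments {
        for (offset, row_id) in frag.row_ids.clone().enumerate() {
            let offset = offset as u32;
            if frag.deleted_offsets.contains(&offset) {
                // Superseded or deleted rows claim nothing, so the live ids
                // may tile the range sparsely.
                continue;
            }
            if let Some((other, _)) = index.insert(row_id, (frag.id, offset)) {
                return Err(format!(
                    "row id {row_id} is live in fragments {other} and {}",
                    frag.id
                ));
            }
        }
    }
    Ok(index)
}
```

For example, take fragment 0 holding ids 0..6 with offsets 2, 3, 4 superseded, and fragment 1 holding the updated ids 2..5 with one row deleted afterwards. In this sketch that yields a sparse but valid map, while two fragments that both keep id 2 live are rejected.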

Removal condition (see vendor/lance-table/README.omnigraph.md): drop the
vendor dir + patch entry at the first Lance bump whose lance-table ships
lance#7480 (9.0.0, or a backported 8.0.1). The surface guard
filtered_scan_tolerates_merge_update_row_id_overlap keeps that honest in
both directions.

Turns the previous commit's red tests green. Full workspace gate passes
(cargo test --workspace --locked --no-fail-fast, 68 suites).
2026-07-02 23:23:39 +03:00

// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright The Lance Authors
syntax = "proto3";
package lance.table;
// NOTE: Do *NOT* add new index details here. Add them to the index.proto file instead.
// This file is in the lance.table package namespace while the index.proto file is in the
// lance.index package namespace.
//
// These are only here for forward compatibility. Older versions of Lance expect btree indexes
// to have lance.table in the package namespace.
//
// If you need to modify these messages (e.g. to add new fields to btree or bitmap) then
// it is ok to modify them here.
// Currently many of these are empty messages because all needed details are either hard-coded (e.g.
// filenames) or stored in the index itself. However, we may want to add more details in the
// future, in particular we can add details that may be useful for planning queries (e.g. don't
// force us to load the index until we know we can make use of it)
message BTreeIndexDetails {}
message BitmapIndexDetails {}
message LabelListIndexDetails {}
message NGramIndexDetails {}
message ZoneMapIndexDetails {}
message InvertedIndexDetails {
  // Marking this field as optional as old versions of the index store blank details and we
  // need to make sure we have a proper optional field to detect this.
  optional string base_tokenizer = 1;
  string language = 2;
  bool with_position = 3;
  optional uint32 max_token_length = 4;
  bool lower_case = 5;
  bool stem = 6;
  bool remove_stop_words = 7;
  bool ascii_folding = 8;
  uint32 min_ngram_length = 9;
  uint32 max_ngram_length = 10;
  bool prefix_only = 11;
}