mirror of
https://github.com/asg017/sqlite-vec.git
synced 2026-04-25 08:46:49 +02:00
add broken shadow table test
This commit is contained in:
parent
a657b3a216
commit
e99e31feb7
5 changed files with 226 additions and 36 deletions
@ -1,3 +1,13 @@
# `sqlite-vec` Architecture

Internal documentation for how `sqlite-vec` works under the hood. Not meant for
users of the `sqlite-vec` project; consult
[the official `sqlite-vec` documentation](https://alexgarcia.xyz/sqlite-vec) for
how-to guides. Rather, this is for people interested in how `sqlite-vec` works,
along with some guidelines for any future contributors.

Very much a WIP.

## `vec0`

### Shadow Tables

@ -9,7 +19,6 @@
- `validity BLOB`
- `rowids BLOB`

#### `xyz_rowids`

- `rowid INTEGER`

@ -32,7 +41,6 @@
- `rowid INTEGER`
- `data BLOB`

#### `xyz_metadata_text_data_00`

- `rowid INTEGER`

@ -52,8 +60,11 @@ The "header" character denotes the type of query plan, as determined by the
| `VEC0_QUERY_PLAN_POINT` | `'2'` | Perform a single-lookup point query for the provided rowid. |
| `VEC0_QUERY_PLAN_KNN` | `'3'` | Perform a KNN-style query on the provided query vector and parameters. |

Each 4-character "block" is associated with a corresponding value in `argv[]`.
For example, the 1st block occupies byte offsets `1-4` (inclusive) and
is associated with `argv[1]`. The 2nd block at byte offset `5-8` (inclusive) is
associated with `argv[2]` and so on. Each block describes what kind of value or
filter the given `argv[i]` value is.
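
As a rough illustration of the layout described above, the following C sketch locates blocks inside an `idxStr`. The helper names (`idxstr_block_count`, `idxstr_block`) are hypothetical and are not taken from the `sqlite-vec` source. Block numbering follows the 1-based convention used in the text, and the sketch assumes the string is exactly one header character followed by whole 4-character blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Width of one idxStr block, per the description above. */
#define IDXSTR_BLOCK_SIZE 4

/* Number of 4-character blocks that follow the 1-character header.
 * Hypothetical helper: assumes idxStr is a NUL-terminated string made of
 * one header character plus zero or more complete blocks. */
size_t idxstr_block_count(const char *idxStr) {
  size_t len = strlen(idxStr);
  assert(len >= 1);                           /* header character is required */
  assert((len - 1) % IDXSTR_BLOCK_SIZE == 0); /* no partial trailing block */
  return (len - 1) / IDXSTR_BLOCK_SIZE;
}

/* Pointer to the n-th block, counting from 1 as the text does:
 * block 1 starts at byte offset 1, block 2 at byte offset 5, and so on. */
const char *idxstr_block(const char *idxStr, size_t n) {
  assert(n >= 1 && n <= idxstr_block_count(idxStr));
  return idxStr + 1 + (n - 1) * IDXSTR_BLOCK_SIZE;
}
```

For a made-up string such as `"3{___[___"`, `idxstr_block_count()` reports two blocks, and the first character of each block (`'{'`, then `'['`) is the kind that tells the caller how to interpret the matching `argv[]` entry.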

#### `VEC0_IDXSTR_KIND_KNN_MATCH` (`'{'`)

@ -69,7 +80,8 @@ The remaining 3 characters of the block are `_` fillers.

#### `VEC0_IDXSTR_KIND_KNN_ROWID_IN` (`'['`)

`argv[i]` is the optional `rowid in (...)` value, and must be handled with
[`sqlite3_vtab_in_first()` /
`sqlite3_vtab_in_next()`](https://www.sqlite.org/c3ref/vtab_in_first.html).

The remaining 3 characters of the block are `_` fillers.

@ -78,13 +90,16 @@ The remaining 3 characters of the block are `_` fillers.

`argv[i]` is a "constraint" on a specific partition key.

The second character of the block denotes which partition key to filter on,
using `A` to denote the first partition key column, `B` for the second, etc. It
is encoded with `'A' + partition_idx` and can be decoded with `c - 'A'`.

The third character of the block denotes which operator is used in the
constraint. It will be one of the values of `enum vec0_partition_operator`, as
only a subset of operations are supported on partition keys.

The fourth character of the block is a `_` filler.
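
The encoding above could be sketched in C roughly as follows. The function names are hypothetical, the block's kind character and the operator character are passed in by the caller because their concrete values are not shown in this section, and the `'A'..'Z'` bound is an assumption made only to keep the sketch safe.

```c
#include <assert.h>

/* Encode a zero-based partition key index as the block's 2nd character.
 * Assumption: indexes are limited to 0..25 so the result stays in 'A'..'Z'. */
char partition_idx_encode(int partition_idx) {
  assert(partition_idx >= 0 && partition_idx < 26);
  return (char)('A' + partition_idx);
}

/* Decode the block's 2nd character back into a zero-based index. */
int partition_idx_decode(char c) {
  assert(c >= 'A' && c <= 'Z');
  return c - 'A';
}

/* Fill one 4-character partition-constraint block.
 * `kind` and `op` are supplied by the caller; `op` would be a value of
 * `enum vec0_partition_operator`, whose members are not listed here. */
void partition_block_write(char out[4], char kind, int partition_idx, char op) {
  out[0] = kind;                                 /* 1st character: block kind */
  out[1] = partition_idx_encode(partition_idx);  /* 2nd: which partition key */
  out[2] = op;                                   /* 3rd: constraint operator */
  out[3] = '_';                                  /* 4th: filler */
}
```

Decoding the second character of a block written this way returns the original index, so `partition_idx_decode(partition_idx_encode(i)) == i` for any index in the assumed range.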

#### `VEC0_IDXSTR_KIND_POINT_ID` (`'!'`)

`argv[i]` is the value of the rowid or id to match against for the point query.

@ -93,11 +108,16 @@ The remaining 3 characters of the block are `_` fillers.

#### `VEC0_IDXSTR_KIND_METADATA_CONSTRAINT` (`'&'`)

`argv[i]` is the value of the `WHERE` constraint for a metadata column in a KNN
query.

The second character of the block denotes which metadata column the constraint
belongs to, using `A` to denote the first metadata column, `B` for the
second, etc. It is encoded with `'A' + metadata_idx` and can be decoded with
`c - 'A'`.

The third character of the block is the constraint operator. It will be one of
`enum vec0_metadata_operator`, as only a subset of operators are supported on
metadata column KNN filters.

The fourth character of the block is a `_` filler.
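
Reading such a block back might look like the C sketch below. The struct and function names are hypothetical and do not come from the `sqlite-vec` source, the operator character is kept uninterpreted because the members of `enum vec0_metadata_operator` are not listed here, and the `'A'..'Z'` range check is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kind character for metadata constraints, as given in the heading above. */
#define KIND_METADATA_CONSTRAINT '&'

/* Hypothetical decoded form of one metadata-constraint block. */
struct metadata_block {
  int metadata_idx; /* zero-based metadata column index */
  char op;          /* raw operator character, left uninterpreted */
};

/* Parse the 4 characters starting at `block`.
 * Returns false if they do not form a well-formed metadata-constraint block. */
bool metadata_block_parse(const char *block, struct metadata_block *out) {
  assert(block != NULL && out != NULL);
  if (block[0] != KIND_METADATA_CONSTRAINT) return false; /* wrong kind */
  if (block[1] < 'A' || block[1] > 'Z') return false;     /* assumed range */
  if (block[3] != '_') return false;                      /* filler expected */
  out->metadata_idx = block[1] - 'A';
  out->op = block[2];
  return true;
}
```

Checking the kind and the filler before decoding lets a caller reject a block of a different kind instead of misreading its second character as a column index.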