mirror of
https://github.com/asg017/sqlite-vec.git
synced 2026-04-27 09:46:27 +02:00
PARTITION KEY support (#122)
* initial pass at PARTITION KEY support. * unit tests * gha this PR branch * fixup tests * doc internal * fix tests, KNN/rowids in * define SQLITE_INDEX_CONSTRAINT_OFFSET * whoops * update tests, syrupy, use uv * un ignore pyproject.toml * dot * tests/ * type error? * win: .exe, update error name * try fix macos python, paren around expr? * win bash? * dbg :( * explicit error * op * dbg win * win ./tests/.venv/Scripts/python.exe * block UPDATEs on partition key values for now
This commit is contained in:
parent
ee3654701f
commit
6658624172
16 changed files with 1522 additions and 245 deletions
54
ARCHITECTURE.md
Normal file
54
ARCHITECTURE.md
Normal file
|
|
@ -0,0 +1,54 @@
|
|||
## `vec0`
|
||||
|
||||
### idxStr
|
||||
|
||||
The `vec0` idxStr is a string composed of single "header" character and 0 or
|
||||
more "blocks" of 4 characters each.
|
||||
|
||||
The "header" charcter denotes the type of query plan, as determined by the
|
||||
`enum vec0_query_plan` values. The current possible values are:
|
||||
|
||||
| Name | Value | Description |
|
||||
| -------------------------- | ----- | ---------------------------------------------------------------------- |
|
||||
| `VEC0_QUERY_PLAN_FULLSCAN` | `'1'` | Perform a full-scan on all rows |
|
||||
| `VEC0_QUERY_PLAN_POINT` | `'2'` | Perform a single-lookup point query for the provided rowid |
|
||||
| `VEC0_QUERY_PLAN_KNN` | `'3'` | Perform a KNN-style query on the provided query vector and parameters. |
|
||||
|
||||
Each 4-character "block" is associated with a corresponding value in `argv[]`. For example, the 1st block at byte offset `1-4` (inclusive) is the 1st block and is associated with `argv[1]`. The 2nd block at byte offset `5-8` (inclusive) is associated with `argv[2]` and so on. Each block describes what kind of value or filter the given `argv[i]` value is.
|
||||
|
||||
|
||||
#### `VEC0_IDXSTR_KIND_KNN_MATCH` (`'{'`)
|
||||
|
||||
`argv[i]` is the query vector of the KNN query.
|
||||
|
||||
The remaining 3 characters of the block are `_` fillers.
|
||||
|
||||
#### `VEC0_IDXSTR_KIND_KNN_K` (`'}'`)
|
||||
|
||||
`argv[i]` is the limit/k value of the KNN query.
|
||||
|
||||
The remaining 3 characters of the block are `_` fillers.
|
||||
|
||||
#### `VEC0_IDXSTR_KIND_KNN_ROWID_IN` (`'['`)
|
||||
|
||||
`argv[i]` is the optional `rowid in (...)` value, and must be handled with [`sqlite3_vtab_in_first()` /
|
||||
`sqlite3_vtab_in_next()`](https://www.sqlite.org/c3ref/vtab_in_first.html).
|
||||
|
||||
The remaining 3 characters of the block are `_` fillers.
|
||||
|
||||
#### `VEC0_IDXSTR_KIND_KNN_PARTITON_CONSTRAINT` (`']'`)
|
||||
|
||||
`argv[i]` is a "constraint" on a specific partition key.
|
||||
|
||||
The second character of the block denotes which partition key to filter on, using `A` to denote the first partition key column, `B` for the second, etc. It is encoded with `'A' + partition_idx` and can be decoded with `c - 'A'`.
|
||||
|
||||
The third character of the block denotes which operator is used in the constraint. It will be one of the values of `enum vec0_partition_operator`, as only a subset of operations are supported on partition keys.
|
||||
|
||||
The fourth character of the block is a `_` filler.
|
||||
|
||||
|
||||
#### `VEC0_IDXSTR_KIND_POINT_ID` (`'!'`)
|
||||
|
||||
`argv[i]` is the value of the rowid or id to match against for the point query.
|
||||
|
||||
The remaining 3 characters of the block are `_` fillers.
|
||||
Loading…
Add table
Add a link
Reference in a new issue