Implements distance_hamming_avx2() which processes 32 bytes per
iteration using the standard VPSHUFB nibble-lookup popcount pattern.
Dispatched when SQLITE_VEC_ENABLE_AVX is defined and input >= 32
bytes. Falls back to u64 scalar or u8 byte-at-a-time for smaller
inputs.
Also adds -mavx2 flag to Makefile for x86-64 targets alongside
existing -mavx.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
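For reference, the nibble-lookup popcount and the 32-byte / u64 / u8 ladder described above can be sketched in portable C. This is a scalar stand-in, not the AVX2 kernel: the 16-entry table is what a VPSHUFB-based version would keep in a vector register, and the names hamming_sketch, hamming_block32, hamming_u8 and popcount_u64 are hypothetical. Handling the tail of a long input with the smaller paths is also an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Popcount of every 4-bit value. A VPSHUFB kernel would broadcast this
 * table into a register and look up 32 nibbles at once. */
static const uint8_t NIBBLE_POPCOUNT[16] = {0, 1, 1, 2, 1, 2, 2, 3,
                                            1, 2, 2, 3, 2, 3, 3, 4};

/* Byte-at-a-time path: XOR, then one lookup per nibble. */
static uint32_t hamming_u8(const uint8_t *a, const uint8_t *b, size_t n) {
  uint32_t total = 0;
  for (size_t i = 0; i < n; i++) {
    uint8_t x = (uint8_t)(a[i] ^ b[i]);
    total += NIBBLE_POPCOUNT[x & 0x0F] + NIBBLE_POPCOUNT[x >> 4];
  }
  return total;
}

/* Scalar emulation of one 32-byte block (hypothetical helper). */
static uint32_t hamming_block32(const uint8_t *a, const uint8_t *b) {
  return hamming_u8(a, b, 32);
}

/* u64 path: classic SWAR popcount, no compiler builtins required. */
static uint32_t popcount_u64(uint64_t x) {
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  return (uint32_t)((x * 0x0101010101010101ULL) >> 56);
}

/* Ladder: 32-byte blocks first, then 8-byte words, then single bytes. */
uint32_t hamming_sketch(const uint8_t *a, const uint8_t *b, size_t n) {
  uint32_t total = 0;
  size_t i = 0;
  for (; i + 32 <= n; i += 32) {
    total += hamming_block32(a + i, b + i);
  }
  for (; i + 8 <= n; i += 8) {
    uint64_t x, y;
    memcpy(&x, a + i, 8); /* memcpy avoids unaligned loads */
    memcpy(&y, b + i, 8);
    total += popcount_u64(x ^ y);
  }
  total += hamming_u8(a + i, b + i, n - i);
  return total;
}
```

A 45-byte input exercises all three paths in this sketch: one 32-byte block, one 8-byte word, and five trailing bytes.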
Add DiskANN graph-based index: builds a Vamana graph with configurable R
(max degree) and L (search list size, separate for insert/query), supports
int8 quantization with rescore, lazy reverse-edge replacement, pre-quantized
query optimization, and insert buffer reuse. Includes shadow table management,
delete support, KNN integration, compile flag (SQLITE_VEC_ENABLE_DISKANN),
release-demo workflow, fuzz targets, and tests. Fixes rescore int8
quantization bug.
Add inverted file (IVF) index type: partitions vectors into clusters via
k-means, quantizes to int8, and scans only the nearest nprobe partitions at
query time. Includes shadow table management, insert/delete, KNN integration,
compile flag (SQLITE_VEC_ENABLE_IVF), fuzz targets, and tests. Removes
superseded ivf-benchmarks/ directory.
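A minimal sketch of the query-time partition selection described above, assuming float centroids stored row-major and squared L2 as the centroid metric. The function name ivf_pick_partitions and its signature are hypothetical and do not reflect the actual vec0 code; the simple repeated-selection loop is chosen for clarity, not speed.

```c
#include <stddef.h>

/* Squared L2 distance between two float vectors of length dim. */
static float l2_sqr(const float *a, const float *b, size_t dim) {
  float sum = 0.0f;
  for (size_t i = 0; i < dim; i++) {
    float d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

/* Writes the indices of the min(nprobe, k) centroids closest to `query`
 * into `out`, nearest first, and returns how many were written.
 * `centroids` holds k rows of `dim` floats; `out` needs room for that
 * many entries. Only those partitions would then be scanned. */
size_t ivf_pick_partitions(const float *centroids, size_t k, size_t dim,
                           const float *query, size_t nprobe, size_t *out) {
  size_t n = nprobe < k ? nprobe : k;
  for (size_t p = 0; p < n; p++) {
    size_t best = k; /* k means "nothing chosen yet" */
    float best_d = 0.0f;
    for (size_t c = 0; c < k; c++) {
      int taken = 0;
      for (size_t j = 0; j < p; j++) {
        if (out[j] == c) {
          taken = 1;
          break;
        }
      }
      if (taken) continue;
      float d = l2_sqr(centroids + c * dim, query, dim);
      if (best == k || d < best_d) {
        best = c;
        best_d = d;
      }
    }
    out[p] = best;
  }
  return n;
}
```

Raising nprobe in this sketch scans more partitions and trades query time for recall; with nprobe >= k it degenerates to a full scan.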
The Linux AVX auto-detection checked the host's /proc/cpuinfo, which
passes on x86 CI runners even when cross-compiling for Android ARM
targets. Skip AVX detection when CC contains 'android'.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add approximate nearest neighbor infrastructure to vec0: shared distance
dispatch (vec0_distance_full), flat index type with parser, NEON-optimized
cosine/Hamming for float32/int8, amalgamation script, and benchmark suite
(benchmarks-ann/) with ground-truth generation and profiling tools. Remove
unused vec_npy_each/vec_static_blobs code, fix missing stdint.h include.
Add rescore index type: stores full-precision float vectors in a rowid-keyed
shadow table, quantizes to int8 for fast initial scan, then rescores top
candidates with original vectors. Includes config parser, shadow table
management, insert/delete support, KNN integration, compile flag
(SQLITE_VEC_ENABLE_RESCORE), fuzz targets, and tests.
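The two-phase idea above (coarse int8 scan, then rescore of the top candidates with the original floats) can be sketched as follows. The quantizer here, a clamp of x * 127 with round-to-nearest, is an assumption and may differ from the one in the commit; rescore_nearest, quantize_vec, quantize_i8 and the SKETCH_* limits are hypothetical names, and in-memory arrays stand in for the shadow tables.

```c
#include <stddef.h>
#include <stdint.h>

enum { SKETCH_MAX_DIM = 64, SKETCH_MAX_CAND = 16 }; /* hypothetical limits */

/* Assumed quantizer: scale [-1, 1] to [-127, 127], clamp, round. */
static int8_t quantize_i8(float x) {
  float s = x * 127.0f;
  if (s > 127.0f) s = 127.0f;
  if (s < -127.0f) s = -127.0f;
  return (int8_t)(s >= 0.0f ? s + 0.5f : s - 0.5f);
}

void quantize_vec(const float *src, int8_t *dst, size_t dim) {
  for (size_t i = 0; i < dim; i++) dst[i] = quantize_i8(src[i]);
}

/* Returns the row index of the nearest vector, or n on invalid input.
 * Phase 1 keeps the `ncand` best rows by int8 squared L2; phase 2
 * recomputes the distance for those rows with full-precision floats. */
size_t rescore_nearest(const float *full, const int8_t *quant, size_t n,
                       size_t dim, const float *query, size_t ncand) {
  if (n == 0 || ncand == 0 || dim > SKETCH_MAX_DIM) return n;
  if (ncand > SKETCH_MAX_CAND) ncand = SKETCH_MAX_CAND;

  int8_t q8[SKETCH_MAX_DIM];
  quantize_vec(query, q8, dim);

  size_t cand[SKETCH_MAX_CAND];
  int32_t cand_d[SKETCH_MAX_CAND];
  size_t used = 0;

  /* Phase 1: coarse scan, candidates kept sorted by int8 distance. */
  for (size_t r = 0; r < n; r++) {
    int32_t d = 0;
    for (size_t i = 0; i < dim; i++) {
      int32_t diff = (int32_t)quant[r * dim + i] - (int32_t)q8[i];
      d += diff * diff;
    }
    if (used == ncand && d >= cand_d[used - 1]) continue;
    size_t pos = used < ncand ? used++ : used - 1;
    while (pos > 0 && cand_d[pos - 1] > d) {
      cand[pos] = cand[pos - 1];
      cand_d[pos] = cand_d[pos - 1];
      pos--;
    }
    cand[pos] = r;
    cand_d[pos] = d;
  }

  /* Phase 2: rescore the surviving candidates with the original floats. */
  size_t best = cand[0];
  float best_d = -1.0f;
  for (size_t c = 0; c < used; c++) {
    float d = 0.0f;
    for (size_t i = 0; i < dim; i++) {
      float diff = full[cand[c] * dim + i] - query[i];
      d += diff * diff;
    }
    if (best_d < 0.0f || d < best_d) {
      best = cand[c];
      best_d = d;
    }
  }
  return best;
}
```

Two vectors that collapse to the same int8 codes are indistinguishable in phase 1, so the candidate count has to be large enough for the true neighbor to survive into phase 2.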
Add test-only wrappers behind SQLITE_VEC_TEST compile flag to expose
static distance functions for unit testing. Includes tests for
distance_l2_sqr_float (4 cases), distance_cosine_float (3 cases),
and distance_hamming (4 cases). Print active SIMD flags at test start.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Expand sqlite-vec-internal.h with scanner/tokenizer types, vector column
definition types, and parser function declarations
- Fix min_idx declaration to match actual C signature (add candidates,
bTaken, k_used params)
- Compile test-unit with -DSQLITE_CORE and link vendor/sqlite3.c so
sqlite3 API functions (sqlite3_strnicmp, sqlite3_mprintf, etc.) resolve
- Add unit tests for vec0_token_next, Vec0Scanner, and
vec0_parse_vector_column
- Fix Rust build.rs to define SQLITE_CORE and compile vendor/sqlite3.c
- Fix Rust min_idx FFI signature and wrapper to match actual C function
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* initial pass at PARTITION KEY support.
* Initial pass, allow auxiliary columns on vec0 virtual tables
* update TODO
* Initial pass at metadata filtering
* unit tests
* gha this PR branch
* fixup tests
* doc internal
* fix tests, KNN/rowids in
* define SQLITE_INDEX_CONSTRAINT_OFFSET
* whoops
* update tests, syrupy, use uv
* un ignore pyproject.toml
* dot
* tests/
* type error?
* win: .exe, update error name
* try fix macos python, paren around expr?
* win bash?
* dbg :(
* explicit error
* op
* dbg win
* win ./tests/.venv/Scripts/python.exe
* block UPDATEs on partition key values for now
* test this branch
* accidentally removed "partition key type mismatch" block during merge
* typo ugh
* bruv
* start aux snapshots
* drop aux shadow table on destroy
* enforce column types
* block WHERE constraints on auxiliary columns in KNN queries
* support delete
* support UPDATE on auxiliary columns
* test this PR
* dont inline that
* test-metadata.py
* memzero text buffer
* stress test
* more snapshot tests
* rm double/int32, just float/int64
* finish type checking
* long text support
* DELETE support
* UPDATE support
* fix snapshot names
* drop not-used in eqp
* small fixes
* boolean comparison handling
* ensure error is raised when long string constraint
* new version string for beta builds
* typo whoops
* ann-filtering-benchmark directory
* test-case
* updates
* fix aux column error when using non-default rowid values, needs test
* refactor some text knn filtering
* rowids blob read only on text metadata filters
* refactor
* add failing test cases for non eq text knn
* text knn NE
* test cases diff
* GT
* text knn GT/GE fixes
* text knn LT/LE
* clean
* vtab_in handling
* unblock aux failures for now
* guard sqlite3_vtab_in
* else in guard?
* fixes and tests
* add broken shadow table test
* rename _metadata_chunksNN shadow table to _metadatachunksNN, for proper shadowName detection
* _metadata_text_NN shadow tables to _metadatatextNN
* SQLITE_VEC_VERSION_MAJOR SQLITE_VEC_VERSION_MINOR and SQLITE_VEC_VERSION_PATCH in sqlite-vec.h
* _info shadow table
* forgot to update aux snapshot?
* fix aux tests