Rename SQLITE_VEC_ENABLE_IVF to SQLITE_VEC_EXPERIMENTAL_IVF_ENABLE and
flip the default from 1 to 0. IVF tests are automatically skipped when
the build flag is not set.
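A minimal sketch of what an opt-in default looks like at the preprocessor level. Only the macro name comes from this commit; the helper ivf_compiled_in() is hypothetical and exists purely to illustrate how a test harness could decide to skip IVF tests.

```c
#include <assert.h>

/* Opt-in default: IVF stays compiled out unless the build passes
 * -DSQLITE_VEC_EXPERIMENTAL_IVF_ENABLE=1. */
#ifndef SQLITE_VEC_EXPERIMENTAL_IVF_ENABLE
#define SQLITE_VEC_EXPERIMENTAL_IVF_ENABLE 0
#endif

/* Hypothetical helper (not part of sqlite-vec): reports whether IVF
 * support was compiled in, so a test runner can skip IVF tests. */
int ivf_compiled_in(void) {
#if SQLITE_VEC_EXPERIMENTAL_IVF_ENABLE
  return 1;
#else
  return 0;
#endif
}
```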
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add inverted file (IVF) index type: partitions vectors into clusters via
k-means, quantizes to int8, and scans only the nearest nprobe partitions at
query time. Includes shadow table management, insert/delete, KNN integration,
compile flag (SQLITE_VEC_ENABLE_IVF), fuzz targets, and tests. Removes
superseded ivf-benchmarks/ directory.
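The query-time step (scan only the nearest nprobe partitions) can be sketched as below. This is an illustration under assumptions, not the code in this commit: the function names, the flat row-major centroid layout, and the use of squared L2 distance are all hypothetical.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Squared L2 distance between two float vectors of length dim. */
float l2_sqr(const float *a, const float *b, size_t dim) {
  float sum = 0.0f;
  for (size_t i = 0; i < dim; i++) {
    float d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

/* Writes the indices of the nprobe centroids closest to query into out,
 * nearest first, and returns how many were written (min(nprobe, k)).
 * centroids holds k rows of dim floats. Simple O(k * nprobe) selection. */
size_t ivf_pick_partitions(const float *centroids, size_t k, size_t dim,
                           const float *query, size_t nprobe, size_t *out) {
  size_t n = nprobe < k ? nprobe : k;
  for (size_t p = 0; p < n; p++) {
    size_t best = k; /* k means "nothing chosen yet" */
    float best_dist = FLT_MAX;
    for (size_t c = 0; c < k; c++) {
      int taken = 0;
      for (size_t j = 0; j < p; j++) {
        if (out[j] == c) { taken = 1; break; }
      }
      if (taken) continue;
      float d = l2_sqr(centroids + c * dim, query, dim);
      if (best == k || d < best_dist) {
        best = c;
        best_dist = d;
      }
    }
    out[p] = best;
  }
  return n;
}
```

Only the vectors assigned to the returned partitions would then be scanned, which is where the speedup over a full scan comes from.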
Add approximate nearest neighbor infrastructure to vec0: shared distance
dispatch (vec0_distance_full), flat index type with parser, NEON-optimized
cosine/Hamming for float32/int8, amalgamation script, and benchmark suite
(benchmarks-ann/) with ground-truth generation and profiling tools. Remove
unused vec_npy_each/vec_static_blobs code, fix missing stdint.h include.
vec0Update_Delete_ClearMetadata's long-text branch runs a DELETE via
sqlite3_step, which returns SQLITE_DONE (101) on success. The code
checked for failure but never normalized the success case to SQLITE_OK.
The function's epilogue returned SQLITE_DONE as-is, which the caller
(vec0Update_Delete) treated as an error, aborting the DELETE scan and
silently leaving rows behind.
- Normalize rc to SQLITE_OK after successful sqlite3_step in ClearMetadata
- Move sqlite3_finalize before the rc check (cleanup on all paths)
- Add test_delete_by_metadata_with_long_text regression test
- Update test_deletes snapshot (row 3 now correctly deleted)
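The normalization can be sketched without linking SQLite. The constants below mirror SQLite's result codes (SQLITE_OK = 0, SQLITE_ERROR = 1, SQLITE_DONE = 101) but are defined locally, and normalize_step_rc() is a hypothetical helper, not a function in sqlite-vec.c.

```c
#include <assert.h>

/* Local mirrors of SQLite result codes so the sketch compiles without
 * sqlite3.h. */
enum { SKETCH_OK = 0, SKETCH_ERROR = 1, SKETCH_DONE = 101 };

/* Hypothetical helper: a statement that returns no rows (such as a
 * DELETE) reports success from step as DONE. Callers that compare
 * against OK need that success value mapped to OK; every other code
 * is passed through unchanged. */
int normalize_step_rc(int step_rc) {
  return step_rc == SKETCH_DONE ? SKETCH_OK : step_rc;
}
```

In the real function the statement would be finalized first and the result code checked afterwards, so cleanup happens on both the success and failure paths.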
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add rescore index type: stores full-precision float vectors in a rowid-keyed
shadow table, quantizes to int8 for fast initial scan, then rescores top
candidates with original vectors. Includes config parser, shadow table
management, insert/delete support, KNN integration, compile flag
(SQLITE_VEC_ENABLE_RESCORE), fuzz targets, and tests.
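A rough sketch of the two-phase idea, assuming squared L2 distance and in-memory arrays instead of shadow tables. All names (rescore_nearest, MAX_CAND, the array layout) are hypothetical and only illustrate why rescoring with the original vectors matters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CAND 16 /* hypothetical cap on the candidate list */

float l2_sqr_f32(const float *a, const float *b, size_t dim) {
  float sum = 0.0f;
  for (size_t i = 0; i < dim; i++) {
    float d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

int32_t l2_sqr_i8(const int8_t *a, const int8_t *b, size_t dim) {
  int32_t sum = 0;
  for (size_t i = 0; i < dim; i++) {
    int32_t d = (int32_t)a[i] - (int32_t)b[i];
    sum += d * d;
  }
  return sum;
}

/* Phase 1 keeps the ncand rows with the smallest int8 distance.
 * Phase 2 recomputes only those rows with the float vectors and
 * returns the index of the closest one. */
size_t rescore_nearest(const float *full, const int8_t *quant, size_t n,
                       size_t dim, const float *query,
                       const int8_t *query_q, size_t ncand) {
  size_t cand[MAX_CAND];
  int32_t cand_dist[MAX_CAND];
  size_t used = 0;
  assert(n > 0 && ncand >= 1 && ncand <= MAX_CAND);

  for (size_t i = 0; i < n; i++) {
    int32_t d = l2_sqr_i8(quant + i * dim, query_q, dim);
    if (used == ncand && d >= cand_dist[used - 1]) continue;
    /* Open a new slot, or overwrite the current worst candidate. */
    size_t pos = used < ncand ? used++ : used - 1;
    while (pos > 0 && cand_dist[pos - 1] > d) {
      cand[pos] = cand[pos - 1];
      cand_dist[pos] = cand_dist[pos - 1];
      pos--;
    }
    cand[pos] = i;
    cand_dist[pos] = d;
  }

  size_t best = cand[0];
  float best_d = l2_sqr_f32(full + best * dim, query, dim);
  for (size_t c = 1; c < used; c++) {
    float d = l2_sqr_f32(full + cand[c] * dim, query, dim);
    if (d < best_d) {
      best = cand[c];
      best_d = d;
    }
  }
  return best;
}
```

When two rows quantize to the same int8 value, the coarse scan cannot tell them apart; keeping more than one candidate lets the float pass pick the true nearest row.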
When a row is deleted from a vec0 virtual table, the rowid slot in
_chunks.rowids and vector data in _vector_chunksNN.vectors are now
zeroed out (previously left as stale data, tracked in #54). When all
rows in a chunk are deleted (validity bitmap all zeros), the chunk and
its associated vector/metadata shadow table rows are reclaimed.
- Add vec0Update_Delete_ClearRowid to zero the rowid blob slot
- Add vec0Update_Delete_ClearVectors to zero all vector blob slots
- Add vec0Update_Delete_DeleteChunkIfEmpty to detect and delete
fully-empty chunks from _chunks, _vector_chunksNN, _metadatachunksNN
- Fix missing rc check in ClearMetadata loop (bug: errors were silently
ignored)
- Fix vec0_new_chunk to explicitly set _rowid_ on shadow table INSERTs
(SHADOW_TABLE_ROWID_QUIRK: "rowid PRIMARY KEY" without INTEGER type
is not a true rowid alias, causing blob_open failures after chunk
delete+recreate cycles)
- Add 13 new tests covering rowid/vector zeroing, chunk reclamation,
metadata/auxiliary/partition/text-PK/int8/bit variants, and
page_count shrinkage verification
- Add vec0-delete-completeness fuzz target
- Update snapshots for new delete zeroing behavior
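The zero-then-reclaim logic above can be sketched on plain arrays. This assumes a simplified in-memory chunk layout (one validity bit per slot, an int64 rowid array, contiguous fixed-size vectors); the function names are hypothetical and the real code works on shadow-table blobs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 when every validity bit is clear, meaning no live rows
 * remain and the chunk can be reclaimed. */
int chunk_is_empty(const uint8_t *validity, size_t nbytes) {
  for (size_t i = 0; i < nbytes; i++) {
    if (validity[i] != 0) return 0;
  }
  return 1;
}

/* Clears the slot's validity bit and zeroes its rowid and vector bytes
 * so that no stale data is left behind after a delete. */
void chunk_clear_slot(uint8_t *validity, int64_t *rowids, uint8_t *vectors,
                      size_t vec_bytes, size_t slot) {
  validity[slot / 8] &= (uint8_t)~(1u << (slot % 8));
  rowids[slot] = 0;
  memset(vectors + slot * vec_bytes, 0, vec_bytes);
}
```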
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
sqlite-vec.c:
- vec0_free: add loops to free partition, auxiliary, and metadata
column names (previously leaked on error paths)
- vec0_init: update pNew->numXxxColumns incrementally in the parse
loop so vec0_free sees correct counts on early goto-error paths
(previously the counts were only written after the loop, so vec0_free
would loop 0 times and leak names allocated inside the loop)
fuzz.yaml:
- macOS: pass -isysroot $(xcrun --sdk macosx --show-sdk-path) so
Xcode clang can find system headers (stdio.h, assert.h, etc.)
- Fix artifact upload paths: libFuzzer writes crash-*/leak-* to
the cwd (repo root), not tests/fuzz/
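The vec0_init/vec0_free leak fix described above follows a general pattern: bump the count as each name is allocated, so the error path's free loop sees everything allocated so far. A minimal sketch with hypothetical types and names (table_sketch, table_init, table_free) and malloc/free standing in for SQLite's allocator:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COLS 4

typedef struct {
  char *names[MAX_COLS];
  int num_names;
} table_sketch;

/* Frees exactly num_names entries, so the count must always be current. */
void table_free(table_sketch *t) {
  for (int i = 0; i < t->num_names; i++) free(t->names[i]);
  t->num_names = 0;
}

/* Copies up to MAX_COLS names. fail_at simulates a parse error at that
 * index (-1 for none). The count is incremented in the same statement
 * that stores each pointer, so an early goto error still frees it.
 * Returns 0 on success, -1 on error. */
int table_init(table_sketch *t, const char **names, int n, int fail_at) {
  t->num_names = 0;
  for (int i = 0; i < n && i < MAX_COLS; i++) {
    if (i == fail_at) goto error;
    size_t len = strlen(names[i]) + 1;
    char *copy = malloc(len);
    if (!copy) goto error;
    memcpy(copy, names[i], len);
    t->names[t->num_names++] = copy;
  }
  return 0;
error:
  table_free(t);
  return -1;
}
```

Had num_names been written only after the loop, the error path would have called table_free with a count of 0 and leaked every copy made before the failure.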
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- fuzz.yaml: switch macOS to llvm@18 (latest LLVM uses typed allocation
C++ ABI symbols not available on macOS 14 runner's system libc++)
- sqlite-vec.c: fix NaN input in vec_quantize_int8 by using !(val <= X)
comparisons which evaluate to true for NaN, ensuring the clamp fires
- sqlite-vec.c: free pzErrMsg in vec_eachFilter error path (was leaking
the error string returned by vector_from_value)
- sqlite-vec.c: add sqlite3_free(pNew) to vec0_init error path; vec0_free
frees the contents but not the struct itself, mirroring vec0Disconnect
- sqlite-vec.c: free knn_data in vec0Filter_knn cleanup when rc != SQLITE_OK;
on error the cursor's knn_data field is never set so it would not be
freed by the cursor teardown path
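The NaN handling in the vec_quantize_int8 item relies on every ordered comparison with NaN being false. A small sketch of a clamp written in the negated form; clamp_to_i8() is a hypothetical helper, and mapping NaN to the upper bound is an assumption of this sketch rather than something stated in the commit.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Clamps v to [-128, 127] before the narrowing cast. The tests are
 * negated on purpose: for NaN, (v <= 127.0f) is false, so
 * !(v <= 127.0f) is true and the clamp fires. A plain (v > 127.0f)
 * would be false for NaN and let it reach the cast. */
int8_t clamp_to_i8(float v) {
  if (!(v <= 127.0f)) return 127;   /* v > 127 or v is NaN */
  if (!(v >= -128.0f)) return -128; /* v < -128 */
  return (int8_t)v;                 /* in range: truncates toward zero */
}
```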
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- fuzz.yaml: embed rpath to Homebrew LLVM's libc++ so macOS binaries can
find the right C++ runtime at load time (fixes dyld weak-def crash)
- fuzz.yaml: add `make sqlite-vec.h` step on all platforms before building
fuzz targets (the header is generated from a template, not checked in)
- fuzz.yaml: drop llvm version pin on Windows so choco succeeds when a
newer LLVM is already installed on the runner
- sqlite-vec.c: change fvec_cleanup / fvec_cleanup_noop to take void*
instead of f32* so they are ABI-compatible with vector_cleanup; removes
UBSAN indirect-call errors at many call sites
- sqlite-vec.c: copy BLOB data into sqlite3_malloc'd buffer in
fvec_from_value instead of aliasing the raw blob pointer, fixing UBSAN
misaligned-load errors when SQLite hands us an unaligned blob
- sqlite-vec.c: guard npy_token_next string scan with ptr < end check
before the closing-quote dereference (heap-buffer-overflow)
- sqlite-vec.c: clamp vec_quantize_int8 intermediate value to [-128, 127]
before casting to i8 (UBSAN out-of-range conversion)
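The misaligned-load fix in the fvec_from_value item comes down to copying the bytes into an aligned allocation before reading them as floats. A sketch under assumptions: copy_blob_as_floats() is a hypothetical name, and malloc/free stand in for sqlite3_malloc/sqlite3_free so the example has no SQLite dependency.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies nbytes from a possibly unaligned blob into a fresh buffer.
 * malloc returns memory aligned for any object type, so the result is
 * safe to index as float. Returns NULL if nbytes is zero, is not a
 * multiple of sizeof(float), or allocation fails. Caller frees. */
float *copy_blob_as_floats(const void *blob, size_t nbytes) {
  if (nbytes == 0 || nbytes % sizeof(float) != 0) return NULL;
  float *out = malloc(nbytes);
  if (!out) return NULL;
  memcpy(out, blob, nbytes); /* memcpy has no alignment requirement */
  return out;
}
```

The cost is one extra allocation and copy per BLOB argument, in exchange for well-defined loads regardless of where the blob starts.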
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When the second vector argument fails to parse, the cleanup of the
first vector was called with the double-pointer 'a' instead of '*a'.
When the first vector was parsed from JSON text (cleanup = sqlite3_free),
this called sqlite3_free on a stack address, causing a crash.
Found by the vec-mismatch fuzz target.
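A reduced illustration of the bug class, with hypothetical names (parse_pair, counting_free) and malloc/free in place of SQLite's allocator. The out-parameter a is a pointer to the caller's pointer, so the cleanup must receive *a.

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*cleanup_fn)(void *);

/* Counts calls so a test can observe that cleanup ran. */
int cleanup_calls = 0;
void counting_free(void *p) {
  cleanup_calls++;
  free(p);
}

/* Hypothetical stand-in for parsing two vector arguments. When the
 * second one fails (second_ok == 0), the first must be released. */
int parse_pair(int second_ok, void **a, cleanup_fn *a_cleanup,
               void **b, cleanup_fn *b_cleanup) {
  *a = malloc(4 * sizeof(float));
  if (!*a) return -1;
  *a_cleanup = counting_free;

  if (!second_ok) {
    /* Correct: pass the heap pointer *a. Passing a itself would hand
     * the cleanup the address of the caller's variable, which is
     * typically on the stack. */
    (*a_cleanup)(*a);
    *a = NULL;
    return -1;
  }

  *b = malloc(4 * sizeof(float));
  if (!*b) {
    (*a_cleanup)(*a);
    *a = NULL;
    return -1;
  }
  *b_cleanup = counting_free;
  return 0;
}
```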
Shout out to @renatgalimov in #257 for finding the original bug!
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extends the vec0 tokenizer to recognize '(', ')', and ',' as
single-character tokens, preparing for DiskANN index option parsing.
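A minimal sketch of a tokenizer that treats '(', ')', and ',' as single-character tokens. The token types, struct, and next_token() are hypothetical and do not reflect the actual vec0 tokenizer's names or its full token set.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum {
  TOK_EOF, TOK_LPAREN, TOK_RPAREN, TOK_COMMA, TOK_IDENT, TOK_ERROR
} tok_type;

typedef struct {
  tok_type type;
  const char *start;
  size_t len;
} token;

/* Skips whitespace, scans one token from *cursor, and advances *cursor
 * past it. Never reads at or beyond end. */
token next_token(const char **cursor, const char *end) {
  const char *p = *cursor;
  token t = {TOK_EOF, end, 0};
  while (p < end && isspace((unsigned char)*p)) p++;
  if (p < end) {
    t.start = p;
    switch (*p) {
      case '(': t.type = TOK_LPAREN; p++; break;
      case ')': t.type = TOK_RPAREN; p++; break;
      case ',': t.type = TOK_COMMA;  p++; break;
      default:
        if (isalnum((unsigned char)*p) || *p == '_') {
          t.type = TOK_IDENT;
          while (p < end && (isalnum((unsigned char)*p) || *p == '_')) p++;
        } else {
          t.type = TOK_ERROR;
          p++;
        }
    }
    t.len = (size_t)(p - t.start);
  }
  *cursor = p;
  return t;
}
```

With these tokens, an option written in a call-like form such as name(arg, arg) splits cleanly into an identifier, parentheses, and comma-separated arguments.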
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add test-only wrappers behind SQLITE_VEC_TEST compile flag to expose
static distance functions for unit testing. Includes tests for
distance_l2_sqr_float (4 cases), distance_cosine_float (3 cases),
and distance_hamming (4 cases). Print active SIMD flags at test start.
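For reference, scalar versions of two of the distances can be sketched as below, together with the kind of small case such unit tests might check. These are illustrative references with hypothetical names (l2_sqr_ref, hamming_ref), not the SIMD implementations or the test wrappers added in this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Squared Euclidean distance: sum of squared element differences. */
float l2_sqr_ref(const float *a, const float *b, size_t n) {
  float sum = 0.0f;
  for (size_t i = 0; i < n; i++) {
    float d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

/* Hamming distance over packed bit vectors: number of differing bits. */
unsigned hamming_ref(const uint8_t *a, const uint8_t *b, size_t nbytes) {
  unsigned count = 0;
  for (size_t i = 0; i < nbytes; i++) {
    uint8_t x = (uint8_t)(a[i] ^ b[i]);
    while (x) {
      count += x & 1u;
      x >>= 1;
    }
  }
  return count;
}
```

A scalar reference like this is useful as an oracle: a SIMD path can be checked against it on the same inputs.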
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* initial pass at PARTITION KEY support.
* Initial pass, allow auxiliary columns on vec0 virtual tables
* update TODO
* Initial pass at metadata filtering
* unit tests
* gha this PR branch
* fixup tests
* doc internal
* fix tests, KNN/rowids in
* define SQLITE_INDEX_CONSTRAINT_OFFSET
* whoops
* update tests, syrupy, use uv
* un ignore pyproject.toml
* dot
* tests/
* type error?
* win: .exe, update error name
* try fix macos python, paren around expr?
* win bash?
* dbg :(
* explicit error
* op
* dbg win
* win ./tests/.venv/Scripts/python.exe
* block UPDATEs on partition key values for now
* test this branch
* accidentally removed "partition key type mismatch" block during merge
* typo ugh
* bruv
* start aux snapshots
* drop aux shadow table on destroy
* enforce column types
* block WHERE constraints on auxiliary columns in KNN queries
* support delete
* support UPDATE on auxiliary columns
* test this PR
* dont inline that
* test-metadata.py
* memzero text buffer
* stress test
* more snapshot tests
* rm double/int32, just float/int64
* finish type checking
* long text support
* DELETE support
* UPDATE support
* fix snapshot names
* drop not-used in eqp
* small fixes
* boolean comparison handling
* ensure error is raised when long string constraint
* new version string for beta builds
* typo whoops
* ann-filtering-benchmark directory
* test-case
* updates
* fix aux column error when using non-default rowid values, needs test
* refactor some text knn filtering
* rowids blob read only on text metadata filters
* refactor
* add failing test cases for non eq text knn
* text knn NE
* test cases diff
* GT
* text knn GT/GE fixes
* text knn LT/LE
* clean
* vtab_in handling
* unblock aux failures for now
* guard sqlite3_vtab_in
* else in guard?
* fixes and tests
* add broken shadow table test
* rename _metadata_chunksNN shadow table to _metadatachunksNN, for proper shadowName detection
* _metadata_text_NN shadow tables to _metadatatextNN
* SQLITE_VEC_VERSION_MAJOR SQLITE_VEC_VERSION_MINOR and SQLITE_VEC_VERSION_PATCH in sqlite-vec.h
* _info shadow table
* forgot to update aux snapshot?
* fix aux tests
* Fix compilation error for redefinition of jsonIsspace when including in amalgamation build of sqlite3.c
* Fix redefinition variable jsonIsSpaceX[]
* Add check for SQLITE_AMALGAMATION
* Add check for SQLITE_CORE
* initial work on l1
* l1 int8 neon implementation
* tweak l1 int8 and add test
* broken overflow still
* some progress on l1
* change to i32 instead of i64
* remove comment
* ignore poetry stuff
* unrolled l1 int8 and format
* remove comments
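The L1 int8 items above mention an overflow problem and a switch to an i32 accumulator. A scalar sketch of why widening matters; l1_i8_ref() is a hypothetical reference function, not the NEON implementation from these commits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* L1 (Manhattan) distance between two int8 vectors. Each element is
 * widened to int32 before subtracting: the difference of two int8
 * values ranges over [-255, 255], which does not fit in int8. */
int32_t l1_i8_ref(const int8_t *a, const int8_t *b, size_t n) {
  int32_t sum = 0;
  for (size_t i = 0; i < n; i++) {
    int32_t d = (int32_t)a[i] - (int32_t)b[i];
    sum += d < 0 ? -d : d;
  }
  return sum;
}
```

Since each term is at most 255, an int32 accumulator has room for several million dimensions before it could overflow.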