Commit graph

5 commits

Author SHA1 Message Date
Alex Garcia
ba0db0b6d6 Add rescore index for ANN queries
Add rescore index type: stores full-precision float vectors in a rowid-keyed
shadow table, quantizes to int8 for fast initial scan, then rescores top
candidates with original vectors. Includes config parser, shadow table
management, insert/delete support, KNN integration, compile flag
(SQLITE_VEC_ENABLE_RESCORE), fuzz targets, and tests.
2026-03-29 19:45:54 -07:00
Alex Garcia
cb147c8834
Complete vec0 DELETE: zero data, reclaim empty chunks, fix metadata rc bug (#268)
When a row is deleted from a vec0 virtual table, the rowid slot in
_chunks.rowids and vector data in _vector_chunksNN.vectors are now
zeroed out (previously left as stale data, tracked in #54). When all
rows in a chunk are deleted (validity bitmap all zeros), the chunk and
its associated vector/metadata shadow table rows are reclaimed.

- Add vec0Update_Delete_ClearRowid to zero the rowid blob slot
- Add vec0Update_Delete_ClearVectors to zero all vector blob slots
- Add vec0Update_Delete_DeleteChunkIfEmpty to detect and delete
  fully-empty chunks from _chunks, _vector_chunksNN, _metadatachunksNN
- Fix missing rc check in ClearMetadata loop (bug: errors were silently
  ignored)
- Fix vec0_new_chunk to explicitly set _rowid_ on shadow table INSERTs
  (SHADOW_TABLE_ROWID_QUIRK: "rowid PRIMARY KEY" without INTEGER type
  is not a true rowid alias, causing blob_open failures after chunk
  delete+recreate cycles)
- Add 13 new tests covering rowid/vector zeroing, chunk reclamation,
  metadata/auxiliary/partition/text-PK/int8/bit variants, and
  page_count shrinkage verification
- Add vec0-delete-completeness fuzz target
- Update snapshots for new delete zeroing behavior

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 00:02:36 -07:00
Alex Garcia
0dd0765cc6 Add vec-mismatch fuzz target that catches aCleanup(a) bug in ensure_vector_match
Targeted fuzzer for two-argument vector functions (vec_distance_*,
vec_add, vec_sub) that binds a valid JSON vector as arg1 and fuzz
data as arg2. This exercises the error path in ensure_vector_match()
where the first vector parses successfully (with sqlite3_free cleanup)
but the second fails, triggering the buggy aCleanup(a) call on line
1031 of sqlite-vec.c (should be aCleanup(*a)).

The fuzzer catches this immediately — ASAN reports "bad-free" when
sqlite3_free is called on a stack address.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:45:50 -08:00
Alex Garcia
a61d45183b Add comprehensive fuzz testing infrastructure with 6 new targets
- Fix numpy.c: tautology bug (|| → &&), infinite loop, and missing
  sqlite3_vec_numpy_init call
- Replace tests/fuzz/Makefile: auto-detect clang, add UBSAN, macOS
  ld_classic workaround, generic build rules for all 10 targets
- Add 6 new fuzz targets: shadow-corrupt (corrupted shadow tables),
  vec0-operations (INSERT/DELETE/query sequences), scalar-functions
  (all 18 SQL scalar functions), vec0-create-full (CREATE + lifecycle),
  metadata-columns (metadata/auxiliary columns), vec-each (vec_each TVF)
- Add seed corpora for shadow-corrupt, vec0-operations, exec, and json
- Add fuzz-build/fuzz-quick/fuzz-long targets to root Makefile

All 10 targets verified building and running on macOS ARM (Apple Silicon).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:33:05 -08:00
Alex Garcia
65656cbadc fuzz work 2024-07-25 11:16:06 -07:00