The fuzz targets were crashing on macOS 14 with:
dyld: weak-def symbol not found '__ZnwmSt19__type_descriptor_t'
libFuzzer compiled with LLVM 18 uses typed allocation ABI symbols
not present in macOS 14's system libc++. Since DYLD_LIBRARY_PATH
cannot override SIP-protected /usr/lib/libc++.1.dylib at runtime,
we fix this at link time:
- -nostdlib++: suppress implicit system libc++ linking
- -L$LLVM/lib/c++ -lc++: explicitly link LLVM's libc++ (which has the symbol)
- -Wl,-rpath,$LLVM/lib/c++: embed rpath so dyld finds LLVM's libc++ at runtime
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- fuzz.yaml: switch macOS to llvm@18 (latest LLVM uses typed allocation
C++ ABI symbols not available on macOS 14 runner's system libc++)
- sqlite-vec.c: fix NaN input in vec_quantize_int8 by using !(val <= X)
comparisons which evaluate to true for NaN, ensuring the clamp fires
- sqlite-vec.c: free pzErrMsg in vec_eachFilter error path (was leaking
the error string returned by vector_from_value)
- sqlite-vec.c: add sqlite3_free(pNew) to vec0_init error path; vec0_free
frees the contents but not the struct itself, mirroring vec0Disconnect
- sqlite-vec.c: free knn_data in vec0Filter_knn cleanup when rc != SQLITE_OK;
on error the cursor's knn_data field is never set so it would not be
freed by the cursor teardown path
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- fuzz.yaml: embed rpath to Homebrew LLVM's libc++ so macOS binaries can
find the right C++ runtime at load time (fixes dyld weak-def crash)
- fuzz.yaml: add `make sqlite-vec.h` step on all platforms before building
fuzz targets (the header is generated from a template, not checked in)
- fuzz.yaml: drop llvm version pin on Windows so choco succeeds when a
newer LLVM is already installed on the runner
- sqlite-vec.c: change fvec_cleanup / fvec_cleanup_noop to take void*
instead of f32* so they are ABI-compatible with vector_cleanup; removes
UBSAN indirect-call errors at many call sites
- sqlite-vec.c: copy BLOB data into sqlite3_malloc'd buffer in
fvec_from_value instead of aliasing the raw blob pointer, fixing UBSAN
misaligned-load errors when SQLite hands us an unaligned blob
- sqlite-vec.c: guard npy_token_next string scan with ptr < end check
before the closing-quote dereference (heap-buffer-overflow)
- sqlite-vec.c: clamp vec_quantize_int8 intermediate value to [-128, 127]
before casting to i8 (UBSAN out-of-range conversion)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Document three classes of undefined behavior found by UBSAN:
function pointer type mismatches, misaligned f32 reads, and
float-to-integer overflow in vec_quantize_int8.
Improve vec-mismatch fuzzer to cover all error-path cleanup patterns:
type mismatches, dimension mismatches, single-arg functions, and
both text and blob inputs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the second vector argument fails to parse, the cleanup of the
first vector was called with the double-pointer 'a' instead of '*a'.
When the first vector was parsed from JSON text (cleanup = sqlite3_free),
this called sqlite3_free on a stack address, causing a crash.
Found by the vec-mismatch fuzz target.
Shout out to @renatgalimov in #257 for finding the original bug!
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Targeted fuzzer for two-argument vector functions (vec_distance_*,
vec_add, vec_sub) that binds a valid JSON vector as arg1 and fuzz
data as arg2. This exercises the error path in ensure_vector_match()
where the first vector parses successfully (with sqlite3_free cleanup)
but the second fails, triggering the buggy aCleanup(a) call on line
1031 of sqlite-vec.c (should be aCleanup(*a)).
The fuzzer catches this immediately — ASAN reports "bad-free" when
sqlite3_free is called on a stack address.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Runs on push to main, nightly at 2am UTC, and manual dispatch.
Linux (ubuntu-22.04) and macOS (macos-14) are fully supported.
Windows (windows-2022) is best-effort with continue-on-error.
Crash artifacts are uploaded on failure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deduplicates exec(), vec0_shadow_table_contents(), _f32(), _i64(), and
_int8() helpers that were copied across six test files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Baseline tests protecting non-DiskANN chunk-based insert and delete
behavior: vector round-trips, auto rowids, text primary keys, delete
validity, reinsert after delete, dimension/type validation, and v_info
snapshot.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extends the vec0 tokenizer to recognize '(', ')', and ',' as
single-character tokens, preparing for DiskANN index option parsing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add test-only wrappers behind SQLITE_VEC_TEST compile flag to expose
static distance functions for unit testing. Includes tests for
distance_l2_sqr_float (4 cases), distance_cosine_float (3 cases),
and distance_hamming (4 cases). Print active SIMD flags at test start.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add SQLITE_EMPTY tests for non-vector column inputs (primary key,
partition key, unknown types) and SQLITE_ERROR tests for zero
dimensions and empty brackets. Tighten existing error assertions
from rc != SQLITE_OK to exact expected return codes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Expand sqlite-vec-internal.h with scanner/tokenizer types, vector column
definition types, and parser function declarations
- Fix min_idx declaration to match actual C signature (add candidates,
bTaken, k_used params)
- Compile test-unit with -DSQLITE_CORE and link vendor/sqlite3.c so
sqlite3 API functions (sqlite3_strnicmp, sqlite3_mprintf, etc.) resolve
- Add unit tests for vec0_token_next, Vec0Scanner, and
vec0_parse_vector_column
- Fix Rust build.rs to define SQLITE_CORE and compile vendor/sqlite3.c
- Fix Rust min_idx FFI signature and wrapper to match actual C function
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>