Commit graph

404 commits

Author SHA1 Message Date
Alex Garcia
b93a669224 Fix macOS fuzz: explicitly link LLVM libc++ to avoid weak-def symbol error
The fuzz targets were crashing on macOS 14 with:
  dyld: weak-def symbol not found '__ZnwmSt19__type_descriptor_t'

libFuzzer compiled with LLVM 18 uses typed allocation ABI symbols
not present in macOS 14's system libc++. Since DYLD_LIBRARY_PATH
cannot override SIP-protected /usr/lib/libc++.1.dylib at runtime,
we fix this at link time:
- -nostdlib++: suppress implicit system libc++ linking
- -L$LLVM/lib/c++ -lc++: explicitly link LLVM's libc++ (which has the symbol)
- -Wl,-rpath,$LLVM/lib/c++: embed rpath so dyld finds LLVM's libc++ at runtime

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-03 10:06:36 -08:00
Alex Garcia
1b53b942e0 Fix remaining fuzzer issues: leaks, UBSAN NaN, macOS LLVM version
- fuzz.yaml: switch macOS to llvm@18 (latest LLVM uses typed allocation
  C++ ABI symbols not available on macOS 14 runner's system libc++)
- sqlite-vec.c: fix NaN input in vec_quantize_int8 by using !(val <= X)
  comparisons which evaluate to true for NaN, ensuring the clamp fires
- sqlite-vec.c: free pzErrMsg in vec_eachFilter error path (was leaking
  the error string returned by vector_from_value)
- sqlite-vec.c: add sqlite3_free(pNew) to vec0_init error path; vec0_free
  frees the contents but not the struct itself, mirroring vec0Disconnect
- sqlite-vec.c: free knn_data in vec0Filter_knn cleanup when rc != SQLITE_OK;
  on error the cursor's knn_data field is never set so it would not be
  freed by the cursor teardown path

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 08:36:59 -08:00
Alex Garcia
8c976205dd gha: bump windows runners 2026-03-03 07:22:04 -08:00
Alex Garcia
cdbc34785f Fix fuzzer-found bugs and CI build issues
- fuzz.yaml: embed rpath to Homebrew LLVM's libc++ so macOS binaries can
  find the right C++ runtime at load time (fixes dyld weak-def crash)
- fuzz.yaml: add `make sqlite-vec.h` step on all platforms before building
  fuzz targets (the header is generated from a template, not checked in)
- fuzz.yaml: drop llvm version pin on Windows so choco succeeds when a
  newer LLVM is already installed on the runner
- sqlite-vec.c: change fvec_cleanup / fvec_cleanup_noop to take void*
  instead of f32* so they are ABI-compatible with vector_cleanup; removes
  UBSAN indirect-call errors at many call sites
- sqlite-vec.c: copy BLOB data into sqlite3_malloc'd buffer in
  fvec_from_value instead of aliasing the raw blob pointer, fixing UBSAN
  misaligned-load errors when SQLite hands us an unaligned blob
- sqlite-vec.c: guard npy_token_next string scan with ptr < end check
  before the closing-quote dereference (heap-buffer-overflow)
- sqlite-vec.c: clamp vec_quantize_int8 intermediate value to [-128, 127]
  before casting to i8 (UBSAN out-of-range conversion)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 07:16:33 -08:00
Alex Garcia
b1a02195d9 gha: attempt fuzz fixing 2026-03-03 06:44:35 -08:00
Alex Garcia
b669801d31 Add UBSAN findings TODO and improve vec-mismatch fuzzer
Document three classes of undefined behavior found by UBSAN:
function pointer type mismatches, misaligned f32 reads, and
float-to-integer overflow in vec_quantize_int8.

Improve vec-mismatch fuzzer to cover all error-path cleanup patterns:
type mismatches, dimension mismatches, single-arg functions, and
both text and blob inputs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 21:19:33 -08:00
Alex Garcia
4ce1ef3c6f Fix bad-free in ensure_vector_match: aCleanup(a) → aCleanup(*a)
When the second vector argument fails to parse, the cleanup of the
first vector was called with the double-pointer 'a' instead of '*a'.
When the first vector was parsed from JSON text (cleanup = sqlite3_free),
this called sqlite3_free on a stack address, causing a crash.

Found by the vec-mismatch fuzz target.

Shout out to @renatgalimov in #257 for finding the original bug!

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:50:54 -08:00
Alex Garcia
0dd0765cc6 Add vec-mismatch fuzz target that catches aCleanup(a) bug in ensure_vector_match
Targeted fuzzer for two-argument vector functions (vec_distance_*,
vec_add, vec_sub) that binds a valid JSON vector as arg1 and fuzz
data as arg2. This exercises the error path in ensure_vector_match()
where the first vector parses successfully (with sqlite3_free cleanup)
but the second fails, triggering the buggy aCleanup(a) call on line
1031 of sqlite-vec.c (should be aCleanup(*a)).

The fuzzer catches this immediately — ASAN reports "bad-free" when
sqlite3_free is called on a stack address.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:45:50 -08:00
Alex Garcia
4418a341e0 Add GitHub Actions CI workflow for fuzz testing
Runs on push to main, nightly at 2am UTC, and manual dispatch.
Linux (ubuntu-22.04) and macOS (macos-14) are fully supported.
Windows (windows-2022) is best-effort with continue-on-error.
Crash artifacts are uploaded on failure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:34:30 -08:00
Alex Garcia
a61d45183b Add comprehensive fuzz testing infrastructure with 6 new targets
- Fix numpy.c: tautology bug (|| → &&), infinite loop, and missing
  sqlite3_vec_numpy_init call
- Replace tests/fuzz/Makefile: auto-detect clang, add UBSAN, macOS
  ld_classic workaround, generic build rules for all 10 targets
- Add 6 new fuzz targets: shadow-corrupt (corrupted shadow tables),
  vec0-operations (INSERT/DELETE/query sequences), scalar-functions
  (all 18 SQL scalar functions), vec0-create-full (CREATE + lifecycle),
  metadata-columns (metadata/auxiliary columns), vec-each (vec_each TVF)
- Add seed corpora for shadow-corrupt, vec0-operations, exec, and json
- Add fuzz-build/fuzz-quick/fuzz-long targets to root Makefile

All 10 targets verified building and running on macOS ARM (Apple Silicon).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:33:05 -08:00
Alex Garcia
9a6bf96b92 Extract shared Python test utilities into tests/helpers.py
Deduplicates exec(), vec0_shadow_table_contents(), _f32(), _i64(), and
_int8() helpers that were copied across six test files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 20:05:21 -08:00
Alex Garcia
206fbc2bdd Add Python regression tests for existing insert/delete paths
Baseline tests protecting non-DiskANN chunk-based insert and delete
behavior: vector round-trips, auto rowids, text primary keys, delete
validity, reinsert after delete, dimension/type validation, and v_info
snapshot.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 18:12:01 -08:00
Alex Garcia
0bca960e9d Add LPAREN, RPAREN, COMMA token types to the scanner
Extends the vec0 tokenizer to recognize '(', ')', and ',' as
single-character tokens, preparing for DiskANN index option parsing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 18:07:57 -08:00
Alex Garcia
aab9b37de2 Add unit tests for distance functions (L2, cosine, hamming)
Add test-only wrappers behind SQLITE_VEC_TEST compile flag to expose
static distance functions for unit testing. Includes tests for
distance_l2_sqr_float (4 cases), distance_cosine_float (3 cases),
and distance_hamming (4 cases). Print active SIMD flags at test start.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 18:04:30 -08:00
Alex Garcia
79d5818015 Add regression tests for vec0_parse_vector_column edge cases
Add SQLITE_EMPTY tests for non-vector column inputs (primary key,
partition key, unknown types) and SQLITE_ERROR tests for zero
dimensions and empty brackets. Tighten existing error assertions
from rc != SQLITE_OK to exact expected return codes.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 17:50:31 -08:00
Alex Garcia
0659d8848d Update test-unit.c and unittest.rs functions to enforce pre-existing behavior
- Expand sqlite-vec-internal.h with scanner/tokenizer types, vector column
  definition types, and parser function declarations
- Fix min_idx declaration to match actual C signature (add candidates,
  bTaken, k_used params)
- Compile test-unit with -DSQLITE_CORE and link vendor/sqlite3.c so
  sqlite3 API functions (sqlite3_strnicmp, sqlite3_mprintf, etc.) resolve
- Add unit tests for vec0_token_next, Vec0Scanner, and
  vec0_parse_vector_column
- Fix Rust build.rs to define SQLITE_CORE and compile vendor/sqlite3.c
- Fix Rust min_idx FFI signature and wrapper to match actual C function

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-02 17:46:11 -08:00
Alex Garcia
0eb855ca67 gha: bump site runner 2026-03-01 21:43:43 -08:00
Alex Garcia
563a3e60f8 bump deploy pages 2026-02-13 10:04:59 -08:00
Alex Garcia
ce7b53e849 v0.1.7-alpha.10 2026-02-13 09:54:41 -08:00
Alex Garcia
1953d776a6 gha: linux-arm setup uv 2026-02-13 09:53:47 -08:00
Alex Garcia
5674c14b0a gha: update linux arm runner 2026-02-13 09:52:06 -08:00
Alex Garcia
fef78ccb69 v0.1.7-alpha.9 2026-02-13 09:46:47 -08:00
Alex Garcia
5f0e6e506f gha: use new sqlite-dist workflow 2026-02-13 09:46:40 -08:00
Alex Garcia
6e93ae0383 v0.1.7-alpha.8 2026-02-13 08:47:57 -08:00
Alex Garcia
f27a3d3432 gha: npm publish --tag bruh 2026-02-13 08:47:49 -08:00
Alex Garcia
15f0e1a26e v0.1.7-alpha.7 2026-02-13 08:42:22 -08:00
Alex Garcia
692ac21581 gha: npm publish requires --tag 2026-02-13 08:42:13 -08:00
Alex Garcia
58603cfe0d v0.1.7-alpha.6 2026-02-13 08:21:58 -08:00
Alex Garcia
c2f4d11b94 gha: bump node version? 2026-02-13 08:21:54 -08:00
Alex Garcia
96873a3ba0 v0.1.7-alpha.5 2026-02-13 08:16:08 -08:00
Alex Garcia
831f4acbfd gha: OIDC unset NODE_AUTH_TOKEN 2026-02-13 08:16:02 -08:00
Alex Garcia
2d8636c512 v0.1.7-alpha.4 2026-02-13 08:08:08 -08:00
Alex Garcia
9333e62bdc npm oidc 2026-02-13 08:07:55 -08:00
Alex Garcia
6bf347fbba release env 2026-02-13 08:04:03 -08:00
Alex Garcia
90920142da v0.1.7-alpha.3 2026-02-13 07:22:30 -08:00
Alex Garcia
aa2c61091e release script updates 2026-02-13 07:21:18 -08:00
Alex Garcia
681f9028af --managed-python, fix flakey tests 2026-02-13 07:08:48 -08:00
Alex Garcia
8bc206fa0b test: drop vec_xyz if it exists 2026-02-13 07:05:39 -08:00
Alex Garcia
d9020b7ded test: shadow snapshot update 2026-02-13 06:59:18 -08:00
Alex Garcia
f8db17fded fix test tale ordering 2026-02-13 06:55:49 -08:00
Alex Garcia
0dd6827f31 uv directory fix 2026-02-13 06:52:51 -08:00
Alex Garcia
1cbe22c78b build fixes, uv for testing 2026-02-13 06:48:07 -08:00
Alex Garcia
fb31b780f5 Merge branch 'main' of github.com:asg017/sqlite-vec 2026-02-13 06:47:20 -08:00
Alex Garcia
611ca631ea
Support constaints on distance column in KNN queries, for pagination and range queries (#166)
* Initial pass, needs tests+docs

* old: test-knn-constraints

* cleanup
2026-02-13 06:38:26 -08:00
Alex Garcia
3a3aefc672 doc improve 2025-01-31 16:46:42 -08:00
Alex Garcia
a2dd24f27e docs 2025-01-24 13:30:18 -08:00
Alex Garcia
dbc5098114 node:sqlite sample 2025-01-24 13:30:14 -08:00
Alex Garcia
d1d1ed7a57 sponsor update 2025-01-24 12:00:41 -08:00
Alex Garcia
194789c9ff Merge branch 'main' of github.com:asg017/sqlite-vec 2025-01-23 19:11:35 -08:00
Alex Yang
d8b3684dd7
update node.js doc (#162) 2025-01-23 19:10:43 -08:00