Replace the old INSERT INTO t(rowid) VALUES('command') hack with a
proper hidden command column named after the table (FTS5 pattern):
INSERT INTO t(t) VALUES ('oversample=16')
The command column is the first hidden column (before distance and k)
to preserve the option of using hidden columns as table-valued function
arguments in the future.
Schema: CREATE TABLE x(rowid, <cols>, "<table>" hidden, distance hidden, k hidden)
For backwards compat, pre-v0.1.10 tables (detected via _info shadow
table version) skip the command column to avoid name conflicts with
user columns that may share the table's name. Verified with legacy
fixture DB generated by sqlite-vec v0.1.6.
Changes:
- Add hidden command column to sqlite3_declare_vtab for new tables
- Version-gate via _info shadow table for existing tables
- Validate at CREATE time that no column name matches table name
- Add rescore_handle_command() with oversample=N support
- rescore_knn() prefers runtime oversample_search over CREATE default
- Remove old rowid-based command dispatch
- Migrate all DiskANN/IVF/fuzz tests and benchmarks to new syntax
- Add legacy DB fixture (v0.1.6) and 9 backwards-compat tests
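
A minimal sketch of the command parsing and the runtime-over-default
preference described above. It is written in Python for illustration
only; parse_command() and effective_oversample() are hypothetical
helpers, not the C code in rescore_handle_command() or rescore_knn().

```python
def parse_command(text):
    """Parse a 'key=value' command such as 'oversample=16'.

    Hypothetical helper: only the 'oversample' key is accepted here,
    matching the single command named in this change.
    """
    key, sep, value = text.strip().partition("=")
    if not sep or key != "oversample":
        raise ValueError(f"unknown command: {text!r}")
    # Accept plain ASCII digits only, and reject zero.
    if not (value.isascii() and value.isdigit()) or int(value) < 1:
        raise ValueError(f"oversample must be a positive integer: {value!r}")
    return key, int(value)


def effective_oversample(runtime_value, create_default):
    """Prefer a value set at runtime over the CREATE-time default."""
    return runtime_value if runtime_value is not None else create_default
```

With no runtime command issued, effective_oversample(None, 8) falls
back to the CREATE-time value; after parsing 'oversample=16' the
runtime value wins.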
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The constructor previously rejected auxiliary columns (+col) for all
non-flat index types. Analysis confirms all code paths already handle
aux columns correctly — aux data lives in _auxiliary shadow table,
independent of the vector index structures.
Remove the three auxiliary column guards. Metadata and partition key
guards remain in place (separate analysis needed).
Adds 8 snapshot-based tests covering shadow table creation, insert+KNN
returning aux values, aux UPDATE, aux DELETE cleanup, and DROP TABLE
for both rescore and DiskANN. IVF aux verified with IVF-enabled build.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After deleting a node, its rowid and quantized vector remained in
other nodes' neighbor blobs via unidirectional reverse edges. This
is a data leak — the deleted vector's compressed representation was
still readable in shadow tables.
Fix: after deleting the node and repairing forward edges, scan all
remaining nodes and clear any neighbor slot that references the
deleted rowid. Uses a lightweight two-pass approach: first scan
reads only validity + neighbor_ids to find affected nodes, then
does full read/clear/write only for those nodes.
Tradeoff: the O(N) scan adds ~1ms per deleted row at 10k vectors and
~10ms at 100k. Recall and query latency are unaffected.
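
The two-pass scan can be sketched as follows. This is an in-memory
Python illustration, not the C implementation: the node layout (a dict
with "valid", "neighbor_ids" and "neighbor_vecs" lists) and the
function name clear_stale_edges() are assumptions made for the example.

```python
def clear_stale_edges(nodes, deleted_rowid):
    """Clear every neighbor slot that still references deleted_rowid.

    nodes maps rowid -> {"valid": [bool], "neighbor_ids": [int],
    "neighbor_vecs": [bytes]} (a hypothetical stand-in for the shadow
    table blobs). Returns the rowids of the nodes that were rewritten.
    """
    # Pass 1: look only at validity flags and neighbor ids to find
    # the nodes that hold a stale reference.
    affected = [
        rowid
        for rowid, node in nodes.items()
        if any(
            valid and nid == deleted_rowid
            for valid, nid in zip(node["valid"], node["neighbor_ids"])
        )
    ]

    # Pass 2: full read/clear/write, restricted to the affected nodes.
    for rowid in affected:
        node = nodes[rowid]
        for slot, nid in enumerate(node["neighbor_ids"]):
            if node["valid"][slot] and nid == deleted_rowid:
                node["valid"][slot] = False
                node["neighbor_ids"][slot] = 0
                # Zero the quantized vector so it is no longer readable.
                size = len(node["neighbor_vecs"][slot])
                node["neighbor_vecs"][slot] = bytes(size)
    return affected
```

The first pass keeps the common case cheap: nodes that never pointed at
the deleted rowid are not rewritten.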
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Infrastructure improvements:
- Fix benchmarks-ann Makefile: type=baseline -> type=vec0-flat (baseline
was never a valid INDEX_REGISTRY key)
- Add DiskANN + text primary key test: insert, KNN, delete, KNN
- Add rescore + text primary key test: insert, KNN, delete, KNN
- Add WAL concurrency test: reader sees a consistent snapshot while the
writer has an open transaction, and KNN works on the reader's snapshot
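
The shape of the WAL concurrency check, sketched with Python's stdlib
sqlite3 module. This is not the test added here: it uses a plain table
named items instead of a vec0 table and a row count instead of KNN, so
it runs without the extension loaded.

```python
import os
import sqlite3
import tempfile


def wal_snapshot_demo():
    """Return the row counts a reader sees during and after a write."""
    path = os.path.join(tempfile.mkdtemp(), "wal_demo.db")
    # isolation_level=None leaves transaction control to explicit SQL.
    writer = sqlite3.connect(path, isolation_level=None)
    reader = sqlite3.connect(path, isolation_level=None)
    try:
        writer.execute("PRAGMA journal_mode=WAL")
        writer.execute("CREATE TABLE items(id INTEGER PRIMARY KEY, v REAL)")
        writer.execute("INSERT INTO items(v) VALUES (1.0)")

        # Writer opens a transaction and leaves it uncommitted.
        writer.execute("BEGIN")
        writer.execute("INSERT INTO items(v) VALUES (2.0)")

        # Reader is not blocked and sees only committed rows.
        during = reader.execute("SELECT count(*) FROM items").fetchone()[0]

        writer.execute("COMMIT")
        after = reader.execute("SELECT count(*) FROM items").fetchone()[0]
        return during, after
    finally:
        writer.close()
        reader.close()
```

On a filesystem that supports WAL, the reader should count one row
while the writer's transaction is open and two after the commit.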
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DiskANN's delete repair only fixes forward edges (nodes the deleted
node pointed to). Stale reverse edges can cause deleted rowids to
appear in search results. Fix: track a 'confirmed' flag on each
search candidate, set when the full-precision vector is successfully
read during re-ranking. Only confirmed candidates are included in
output. Zero additional SQL queries — piggybacks on the existing
re-rank vector read.
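
A minimal sketch of the confirmed-candidate filter, in Python for
illustration. The names rerank_confirmed() and full_vectors are
hypothetical; the dict lookup stands in for the existing re-rank
vector read, and Euclidean distance is an assumed metric.

```python
import math


def rerank_confirmed(candidates, full_vectors, query, k):
    """Re-rank candidate rowids, dropping any whose vector is gone.

    full_vectors maps rowid -> full-precision vector. A missing entry
    models a deleted row that is still reachable via a stale edge.
    """
    confirmed = []
    for rowid in candidates:
        vec = full_vectors.get(rowid)  # the read re-ranking already does
        if vec is None:
            continue  # never confirmed, so never emitted
        confirmed.append((math.dist(query, vec), rowid))
    confirmed.sort()
    return [rowid for _, rowid in confirmed[:k]]
```

Because confirmation reuses the read that re-ranking needs anyway, the
filter costs no extra lookups.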
Also adds delete hardening tests:
- Rescore: interleaved delete+KNN, rowid_in after deletes, full
delete+reinsert cycle
- DiskANN: delete+reinsert cycles with KNN verification, interleaved
delete+KNN
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
sqlite3_column_blob() returns NULL for zero-length blobs or on OOM.
Several call sites in rescore KNN and DiskANN node/vector read passed
the result directly to memcpy without checking, risking NULL deref on
corrupt or empty databases. IVF already had proper NULL checks.
Adds corruption regression tests that truncate shadow table blobs and
verify the query errors cleanly instead of crashing.
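
The guard and the truncation test can be sketched as below, using
Python's stdlib sqlite3 module rather than the C API. The table name
vectors, the column name vec and the helper read_vector() are
assumptions for the example; the real call sites check the pointer
returned by sqlite3_column_blob() before memcpy.

```python
import sqlite3
import struct


def read_vector(conn, rowid, dims):
    """Read a float32 vector blob, failing cleanly on bad data."""
    row = conn.execute(
        "SELECT vec FROM vectors WHERE rowid = ?", (rowid,)
    ).fetchone()
    blob = row[0] if row else None
    expected = dims * 4  # four bytes per float32 element
    # Check for a missing, empty or truncated blob before decoding.
    if blob is None or len(blob) != expected:
        got = 0 if blob is None else len(blob)
        raise ValueError(
            f"corrupt vector for rowid {rowid}: "
            f"expected {expected} bytes, got {got}"
        )
    return struct.unpack(f"<{dims}f", blob)
```

A truncated or zero-length blob then surfaces as an error the caller
can report, not as a read past the end of the buffer.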
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
vec0Update_UpdateVectorColumn writes to flat chunk blobs but does not
update DiskANN graph or IVF index structures, silently corrupting KNN
results. It now returns a clear error for these index types. Rescore
UPDATE is unaffected — it already has a full implementation that
updates both quantized chunks and float vectors.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add DiskANN graph-based index: builds a Vamana graph with configurable R
(max degree) and L (search list size, separate for insert/query), supports
int8 quantization with rescore, lazy reverse-edge replacement, pre-quantized
query optimization, and insert buffer reuse. Includes shadow table management,
delete support, KNN integration, compile flag (SQLITE_VEC_ENABLE_DISKANN),
release-demo workflow, fuzz targets, and tests. Fixes rescore int8
quantization bug.