sqlite-vec/benchmarks/exhaustive-memory

sqlite-vec In-memory benchmark comparisons

This repo contains benchmarks that compare KNN queries in sqlite-vec against other in-process vector search tools, using brute-force linear scans only. These include:

Again: ONLY BRUTE-FORCE LINEAR SCANS ARE TESTED. This benchmark does not test approximate nearest neighbor (ANN) implementations. It is narrowly scoped to exact KNN searches using brute force.

A few other caveats:

  • Only brute-force linear scans, no ANN
  • Only the CPU is used. Of the tools compared, Faiss is the only one that offers GPU support anyway.
  • Only in-memory datasets are used. Many of these tools support serializing to and reading from disk (including sqlite-vec), and possibly mmap'ing, but this benchmark only tests in-memory datasets, mostly because the source vectors are provided as numpy arrays.
  • Queries are made one after the other, not batched. Some tools offer APIs to query multiple inputs at the same time, but this benchmark runs queries sequentially. This emulates "server request"-style workloads, where multiple users send queries at different times, making batching harder. Note that sqlite-vec does not support batched queries yet.
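To make the scope concrete, here is a minimal sketch (not from the benchmark itself) of what an exact, brute-force KNN scan looks like in numpy: every query computes a distance against every stored vector, with no index or approximation. The function name and shapes are illustrative, not part of the benchmark code.

```python
import numpy as np

def knn_brute_force(corpus: np.ndarray, query: np.ndarray, k: int):
    """Exact KNN via a linear scan: L2 distance to every row, no index."""
    dists = np.linalg.norm(corpus - query, axis=1)  # distance to all vectors
    idx = np.argsort(dists)[:k]                     # indices of the k nearest
    return idx, dists[idx]

# Illustrative data: 10,000 random 128-dim float32 vectors, one query.
rng = np.random.default_rng(0)
corpus = rng.random((10_000, 128), dtype=np.float32)
query = rng.random(128, dtype=np.float32)

ids, dists = knn_brute_force(corpus, query, k=10)
```

Every tool in the benchmark is expected to do the equivalent of this scan; the comparison is about how fast each implementation performs it, not about algorithmic shortcuts.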

These tests are run in Python. Vectors are provided as an in-memory numpy array, and each test converts that array into whatever format makes sense for the given tool. For example, the sqlite-vec tests read those vectors into a SQLite table, while DuckDB reads them into an Arrow array and then creates a DuckDB table from that.
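As a sketch of that conversion step for the SQLite case: sqlite-vec consumes vectors as raw little-endian float32 blobs, which is what `.tobytes()` on a float32 numpy array produces. The snippet below uses a plain SQLite table purely to show the numpy-to-blob round trip; in the actual benchmark, the sqlite-vec extension would be loaded (e.g. via the `sqlite_vec` Python package) and the blobs inserted into its virtual table instead.

```python
import sqlite3
import numpy as np

# Illustrative data standing in for the benchmark's in-memory vectors.
rng = np.random.default_rng(0)
vectors = rng.random((1_000, 128), dtype=np.float32)

db = sqlite3.connect(":memory:")
# Plain table for illustration; the real benchmark would use a
# sqlite-vec virtual table after loading the extension.
db.execute("CREATE TABLE items(rowid INTEGER PRIMARY KEY, embedding BLOB)")

# Each float32 row serializes to a 128 * 4 = 512-byte blob.
db.executemany(
    "INSERT INTO items(rowid, embedding) VALUES (?, ?)",
    [(i, v.tobytes()) for i, v in enumerate(vectors)],
)
(count,) = db.execute("SELECT count(*) FROM items").fetchone()
```

The other tools follow the same pattern: one up-front conversion from the shared numpy array into the tool's native storage, after which the timed queries run against that storage.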