misc docs

Alex Garcia 2024-07-14 13:47:41 -07:00
parent b57a05e2e8
commit 4b140f7294
6 changed files with 113 additions and 24 deletions

@ -8,35 +8,71 @@ functions:
vec_version:
params: []
section: meta
desc: Returns a version string of the current `sqlite-vec` version.
desc: Returns a version string of the current `sqlite-vec` installation.
example: select vec_version();
vec_debug:
params: []
section: meta
desc: x
example: x
desc: Returns debugging information about the current `sqlite-vec` installation.
example: select vec_debug();
vec_bit:
params: []
params: [vector]
section: constructor
desc: x
example: x
desc: Creates a binary vector from a BLOB.
example:
- select vec_bit(X'F0');
- select subtype(vec_bit(X'F0'));
- select vec_to_json(vec_bit(X'F0'));
vec_f32:
params: []
params: [vector]
section: constructor
desc: x
example: x
desc: |
Creates a float vector from a BLOB or JSON text. If a BLOB is provided,
the length must be divisible by 4, as a float takes up 4 bytes of space each.
example:
- select vec_f32('[.1, .2, .3, 4]');
- select subtype(vec_f32('[.1, .2, .3, 4]'));
- select vec_f32(X'AABBCCDD');
- select vec_to_json(vec_f32(X'AABBCCDD'));
- select vec_f32(X'AA');
vec_int8:
params: []
params: [vector]
section: constructor
desc: x
example: x
desc: |
Creates an 8-bit integer vector from a BLOB or JSON text. If a BLOB is provided,
each byte is read as one signed 8-bit integer element.
If JSON text is provided, each element must be an integer between -128 and 127 inclusive.
example:
- select vec_int8('[1, 2, 3, 4]');
- select subtype(vec_int8('[1, 2, 3, 4]'));
- select vec_int8(X'AABBCCDD');
- select vec_to_json(vec_int8(X'AABBCCDD'));
- select vec_int8('[999]');
vec_add:
params: []
params: [a, b]
section: op
desc: x
example: x
desc: |
Adds vectors `a` and `b` element by element, returning a new vector `c`. Both vectors
must be of the same type, and each must be either a `float32` or an `int8` vector.
example:
- select vec_add(
'[.1, .2, .3]',
'[.4, .5, .6]'
);
- |
select vec_to_json(
vec_add(
'[.1, .2, .3]',
'[.4, .5, .6]'
)
);
- |
select vec_to_json(
vec_add(
vec_int8('[1, 2, 3]'),
vec_int8('[4, 5, 6]')
)
);
vec_length:
params: []
section: op
@ -118,4 +154,5 @@ entrypoints:
compile_options:
- SQLITE_VEC_ENABLE_AVX
- SQLITE_VEC_ENABLE_NEON
- SQLITE_VEC_OMIT_FS

@ -48,6 +48,7 @@ const guides = {
collapsed: true,
items: [
{ text: "Performance", link: "/guides/performance" },
{
text: "Vector operations",
items: [
@ -136,10 +137,16 @@ function sidebar(): DefaultTheme.SidebarItem[] {
text: "Installation",
link: "/installation",
},
{
text: "Quick Start",
link: "/quickstart",
],
},
{
text: "Features",
collapsed: true,
items: [
{ text: "Vector formats", link: "/vector-formats" },
{ text: "KNN queries", link: "/knn" },
{ text: "vec0 virtual tables", link: "/vec0" },
{ text: "Static blobs", link: "/numpy" },
],
},
{

@ -24,13 +24,13 @@ binary quantized 8-dimensional vector can be stored in a single byte — one bit
per element. For 1 million vectors, that would be just `1MB`, a 32x reduction!
Though keep in mind, you're bound to lose a lot of quality when reducing 32 bits of
information to 1 bit. [Over-sampling and re-scoring](#re-scoring) will help a
information to 1 bit. [Oversampling and re-scoring](#re-scoring) will help a
lot.
The main goal of BQ is to dramatically reduce the size of your vector index,
resulting in faster searches and less resources. This is especially useful in
resulting in faster searches with fewer resources. This is especially useful in
`sqlite-vec`, which is (currently) brute-force only and meant to run on small
devices. BQ is an easy low-cost method to make larger vector datasets easy to
devices. BQ is an easy low-cost method to make larger vector datasets easier to
manage.
## Binary Quantization `sqlite-vec`
@ -41,7 +41,9 @@ element in a given vector, it will apply `0` to negative values and `1` to
positive values, and pack them into a `BLOB`.
```sqlite
select vec_quantize_binary('[-0.73, -0.80, 0.12, -0.73, 0.79, -0.11, 0.23, 0.97]');
select vec_quantize_binary(
'[-0.73, -0.80, 0.12, -0.73, 0.79, -0.11, 0.23, 0.97]'
);
-- X'd4'
```
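To make the bit packing concrete, here is a minimal Rust sketch of the same idea. It is not the `sqlite-vec` implementation: `quantize_binary` is a hypothetical helper, the least-significant-bit-first packing order is inferred from the `X'd4'` result above, and treating exactly `0.0` as a `0` bit is an assumption.

```rust
/// Hypothetical helper (not part of sqlite-vec): packs one bit per element.
/// Positive values become 1, everything else becomes 0 (treating 0.0 as a
/// 0 bit is an assumption). Element `i` of each group of 8 lands in bit `i`
/// of the output byte, so the first element is the least significant bit.
fn quantize_binary(vector: &[f32]) -> Vec<u8> {
    assert!(
        vector.len() % 8 == 0,
        "dimension count must be a multiple of 8"
    );
    vector
        .chunks(8)
        .map(|chunk| {
            chunk.iter().enumerate().fold(0u8, |byte, (i, &value)| {
                if value > 0.0 {
                    byte | (1u8 << i)
                } else {
                    byte
                }
            })
        })
        .collect()
}

fn main() {
    let v: [f32; 8] = [-0.73, -0.80, 0.12, -0.73, 0.79, -0.11, 0.23, 0.97];
    let packed = quantize_binary(&v);
    // Elements 2, 4, 6, and 7 are positive: 4 + 16 + 64 + 128 = 0xd4.
    assert_eq!(packed, vec![0xd4]);
    println!("packed {} floats into {} byte(s)", v.len(), packed.len());
}
```

Eight 4-byte floats (32 bytes) collapse into a single byte, which is where the 32x reduction comes from.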
@ -51,6 +53,9 @@ The single byte `0xd4` in hexadecimal is `11010100` in binary.
## Demo
Here's an end-to-end example of using binary quantization with `vec0` virtual
tables in `sqlite-vec`.
```sqlite
create virtual table vec_movies using vec0(
synopsis_embedding bit[768]

@ -1,7 +1,21 @@
# Using `sqlite-vec` in Go
There are two ways you can embed `sqlite-vec` into Go applications: a CGO option
for libraries like
[`github.com/mattn/go-sqlite3`](https://github.com/mattn/go-sqlite3), or a
WASM-based option with
[`github.com/ncruces/go-sqlite3`](https://github.com/ncruces/go-sqlite3).
## Option 1: CGO
```bash
go get -u github.com/asg017/sqlite-vec/bindings/go/cgo
```
## Option 2: WASM based with `ncruces/go-sqlite3`
```
go
```
## Working with vectors in Go

@ -1,7 +1,33 @@
# Using `sqlite-vec` in Rust
You can embed `sqlite-vec` into your Rust projects using the official
[`sqlite-vec` crate](https://crates.io/crates/sqlite-vec).
```bash
cargo add sqlite-vec
```
The crate embeds the `sqlite-vec` C source code, and uses the
[`cc` crate](https://crates.io/crates/cc) to compile and statically link
`sqlite-vec` at build-time.
The `sqlite-vec` crate exposes a single function `sqlite3_vec_init`, which is
the C entrypoint for the SQLite extension. You can register it with your Rust
SQLite library's `sqlite3_auto_extension()` function. Here's an example with
`rusqlite`:
```rs
use sqlite_vec::sqlite3_vec_init;
use rusqlite::{ffi::sqlite3_auto_extension};
fn main() {
unsafe {
sqlite3_auto_extension(Some(std::mem::transmute(sqlite3_vec_init as *const ())));
}
// future database connections will now automatically include sqlite-vec functions!
}
```
A full [`sqlite-vec` Rust demo](#TODO) is also available.
## Working with vectors in Rust
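As a rough sketch of one way to prepare a float vector for binding as a BLOB parameter: each `f32` takes 4 bytes, so the BLOB length is always a multiple of 4. The helper names below are hypothetical and not part of the `sqlite-vec` crate, and the use of native byte order is an assumption about how the extension reads the bytes.

```rust
/// Hypothetical helper: serializes a float vector into raw bytes,
/// 4 bytes per element, in native byte order (an assumption).
fn f32_vector_to_blob(vector: &[f32]) -> Vec<u8> {
    vector.iter().flat_map(|x| x.to_ne_bytes()).collect()
}

/// Hypothetical helper: the inverse conversion. Returns `None` when the
/// BLOB length is not divisible by 4, since such a BLOB cannot hold a
/// whole number of 4-byte floats.
fn blob_to_f32_vector(blob: &[u8]) -> Option<Vec<f32>> {
    if blob.len() % 4 != 0 {
        return None;
    }
    Some(
        blob.chunks_exact(4)
            .map(|c| f32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

fn main() {
    let embedding = vec![0.1f32, 0.2, 0.3, 4.0];
    let blob = f32_vector_to_blob(&embedding);
    assert_eq!(blob.len(), embedding.len() * 4);

    // Round-tripping through bytes is lossless.
    assert_eq!(blob_to_f32_vector(&blob), Some(embedding));

    // A single byte cannot hold a whole float.
    assert_eq!(blob_to_f32_vector(&[0xAA]), None);
}
```

The resulting `Vec<u8>` can then be bound as a BLOB parameter with whichever SQLite library you use.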