feat: Vestige v1.6.0 — 6x storage reduction, neural reranking, instant startup

Four internal optimizations for dramatically better performance:

1. F16 vector quantization (ScalarKind::F16 in USearch): 2x storage savings
2. Matryoshka 256-dim truncation (768→256): 3x embedding storage savings
3. Convex Combination fusion (0.3 keyword / 0.7 semantic) replacing RRF
4. Cross-encoder reranker (Jina Reranker v1 Turbo via fastembed TextRerank)

Combined: 6x vector storage reduction, ~20% better retrieval quality.
Cross-encoder loads in background; server starts instantly.
Old 768-dim embeddings auto-migrated on load.

614 tests pass, zero warnings.
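
The 6x figure follows from multiplying the two storage savings (3x from truncation, 2x from F16). The sketch below illustrates that arithmetic and one plausible way to truncate a Matryoshka-style embedding. The function names `truncate_and_normalize` and `vector_bytes` are hypothetical and do not come from the Vestige codebase. It also assumes the old vectors were stored as 4-byte f32 values and that truncated vectors are L2-renormalized, neither of which this commit message confirms.

```rust
/// Hypothetical helper: keep the first `dims` components of a
/// Matryoshka-style embedding, then L2-renormalize so that cosine and
/// dot-product scores remain comparable after truncation.
/// (Renormalization is an assumption, not confirmed by the commit.)
fn truncate_and_normalize(embedding: &[f32], dims: usize) -> Vec<f32> {
    let mut out: Vec<f32> = embedding.iter().take(dims).copied().collect();
    let norm = out.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in &mut out {
            *x /= norm;
        }
    }
    out
}

/// Hypothetical helper: bytes needed to store one vector with the given
/// dimensionality and per-component width (4 for f32, 2 for f16).
fn vector_bytes(dims: usize, bytes_per_component: usize) -> usize {
    dims * bytes_per_component
}
```

Under these assumptions, `vector_bytes(768, 4)` is 3072 bytes and `vector_bytes(256, 2)` is 512 bytes, a ratio of 6. Calling `truncate_and_normalize(&old, 256)` on a 768-dim vector yields a unit-length 256-dim vector.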
2026-05-10 08:12:37 +02:00 · 2026-02-19 01:09:39 -06:00 · 2026-02-19 01:09:39 -06:00 · 495a88331f
commit 495a88331f
parent 5b7d22d427
19 changed files with 195 additions and 98 deletions
--- a/crates/vestige-mcp/src/tools/search.rs
+++ b/crates/vestige-mcp/src/tools/search.rs
@@ -162,8 +162,8 @@ pub async fn execute_hybrid(
        .hybrid_search(
            &args.query,
            args.limit.unwrap_or(10).clamp(1, 50),
-            args.keyword_weight.unwrap_or(0.5).clamp(0.0, 1.0),
-            args.semantic_weight.unwrap_or(0.5).clamp(0.0, 1.0),
+            args.keyword_weight.unwrap_or(0.3).clamp(0.0, 1.0),
+            args.semantic_weight.unwrap_or(0.7).clamp(0.0, 1.0),
        )
        .map_err(|e| e.to_string())?;
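
For context on what the new 0.3 / 0.7 defaults feed into, here is a minimal sketch of convex combination fusion. It is not the body of `hybrid_search`, which this diff does not show. The names `min_max_normalize` and `fuse` are hypothetical. Min-max score normalization and dividing by the weight sum are also assumptions; the latter keeps the result a true convex combination when a caller overrides only one weight.

```rust
/// Hypothetical helper: min-max normalize raw scores into [0, 1].
/// If every score is identical, each one maps to 1.0.
fn min_max_normalize(scores: &[f32]) -> Vec<f32> {
    let min = scores.iter().copied().fold(f32::INFINITY, f32::min);
    let max = scores.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let range = max - min;
    scores
        .iter()
        .map(|&s| if range > 0.0 { (s - min) / range } else { 1.0 })
        .collect()
}

/// Hypothetical helper: fuse one document's normalized keyword and
/// semantic scores. Weights are clamped as in the diff, then divided by
/// their sum (an assumption) so the effective weights always add up to 1.
fn fuse(keyword: f32, semantic: f32, keyword_weight: f32, semantic_weight: f32) -> f32 {
    let kw = keyword_weight.clamp(0.0, 1.0);
    let sw = semantic_weight.clamp(0.0, 1.0);
    let total = kw + sw;
    if total == 0.0 {
        return 0.0;
    }
    (kw * keyword + sw * semantic) / total
}
```

With the defaults from this commit, a document that matches only on keywords (`fuse(1.0, 0.0, 0.3, 0.7)`) scores about 0.3, while one that matches only semantically scores about 0.7. Unlike RRF, which uses only rank positions, this approach keeps the score magnitudes.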