Switch embedding model from BGE to nomic-embed-text-v1.5

- Replace BGE-base-en-v1.5 with nomic-embed-text-v1.5
- 8192-token context window (vs 512 for BGE)
- Matryoshka representation learning support
- Fully open source with training data released
- Same 768 dimensions, no schema changes required
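
For context, a minimal sketch of what the swap looks like against fastembed's Rust API. The enum variants (`EmbeddingModel::BGEBaseENV15`, `EmbeddingModel::NomicEmbedTextV15`), the use of `anyhow` for errors, and the Matryoshka truncation step are assumptions for illustration, not code from this repo:

```rust
use fastembed::{EmbeddingModel, InitOptions, TextEmbedding};

fn main() -> anyhow::Result<()> {
    // Before: EmbeddingModel::BGEBaseENV15 (512-token context).
    // After: nomic-embed-text-v1.5 -- same 768-dim output but an
    // 8192-token context, so stored vectors and the schema are unchanged.
    let model = TextEmbedding::try_new(InitOptions::new(
        EmbeddingModel::NomicEmbedTextV15,
    ))?;

    let embeddings = model.embed(vec!["Vestige remembers this."], None)?;
    assert_eq!(embeddings[0].len(), 768);

    // Matryoshka representation learning: the leading dimensions carry
    // most of the signal, so a vector can be truncated (e.g. to 256 dims)
    // and re-normalized if we ever want a smaller index.
    let truncated = &embeddings[0][..256];
    let norm = truncated.iter().map(|x| x * x).sum::<f32>().sqrt();
    let _small: Vec<f32> = truncated.iter().map(|x| x / norm).collect();

    Ok(())
}
```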

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: Sam Valladares
Date: 2026-01-25 03:11:15 -06:00
parent 449d60754a
commit 5337efdfa7
5 changed files with 20 additions and 19 deletions


@@ -6,7 +6,7 @@ A bleeding-edge Rust MCP (Model Context Protocol) server for Vestige - providing
 - **FSRS-6 Algorithm**: State-of-the-art spaced repetition (21 parameters, personalized decay)
 - **Dual-Strength Memory Model**: Based on Bjork & Bjork 1992 cognitive science research
-- **Local Semantic Embeddings**: BGE-base-en-v1.5 (768d) via fastembed v5 (no external API)
+- **Local Semantic Embeddings**: nomic-embed-text-v1.5 (768d) via fastembed v5 (no external API)
 - **HNSW Vector Search**: USearch-based, 20x faster than FAISS
 - **Hybrid Search**: BM25 + semantic with RRF fusion
 - **Codebase Memory**: Remember patterns, decisions, and context
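
The feature list above mentions fusing BM25 and semantic rankings with RRF (Reciprocal Rank Fusion). A self-contained sketch of the standard formula, score(d) = Σ 1/(k + rank), with k = 60 per Cormack et al. 2009; the function name and document IDs are hypothetical, not this repo's API:

```rust
use std::collections::HashMap;

/// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank)
/// per document (ranks are 1-based); summed scores give the fused order.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (rank, doc) in ranking.iter().enumerate() {
            *scores.entry((*doc).to_string()).or_insert(0.0) +=
                1.0 / (k + (rank + 1) as f64);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest fused score first.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let bm25 = vec!["doc_a", "doc_b", "doc_c"];
    let semantic = vec!["doc_b", "doc_a", "doc_d"];
    for (doc, score) in rrf_fuse(&[bm25, semantic], 60.0) {
        println!("{doc}: {score:.5}");
    }
}
```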