DmitrL-dev
|
cc7956d835
|
feat: TurboQuant VectorStore integration & Edge PQ KV cache prototype
- QJL (1-bit) approximate filter for 2.3x faster search
- PolarQuant (4-bit/8-bit) compressed storage with PQDropFloat64 memory reclamation (15x heap reduction)
- Two-Phase SearchQJL with fallback to CompressedSimilarity
- Edge deployment prototype (pq_attention.cu) for LLaMA with a 1.5M-token context
|
2026-03-26 22:00:49 +10:00 |
|
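The two-phase search named in the commit message above can be illustrated with a minimal sketch. This is not the repository's code: it assumes that the 1-bit QJL filter stores the signs of random Gaussian projections, that phase 1 ranks candidates by Hamming distance between those sign codes, and that phase 2 reranks a shortlist with an exact score. Plain cosine similarity stands in for `CompressedSimilarity`, and every name below (`make_projection`, `sign_sketch`, `two_phase_search`, `shortlist`) is hypothetical.

```python
import math
import random


def make_projection(dim, bits, seed=0):
    """Random Gaussian projection matrix with `bits` rows (assumed QJL-style)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def sign_sketch(vec, projection):
    """Pack the sign of each projected coordinate into one integer (1 bit each)."""
    code = 0
    for i, row in enumerate(projection):
        if sum(r * v for r, v in zip(row, vec)) >= 0.0:
            code |= 1 << i
    return code


def cosine(a, b):
    """Exact cosine similarity; a stand-in for the compressed similarity score."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def two_phase_search(query, vectors, sketches, projection, k=3, shortlist=10):
    """Phase 1: cheap Hamming filter on sign codes. Phase 2: exact rerank."""
    q_code = sign_sketch(query, projection)
    by_hamming = sorted(
        range(len(vectors)),
        key=lambda i: bin(q_code ^ sketches[i]).count("1"),
    )
    candidates = by_hamming[:shortlist]
    reranked = sorted(
        candidates, key=lambda i: cosine(query, vectors[i]), reverse=True
    )
    return reranked[:k]


# Usage: index 200 random 32-dimensional vectors with 128-bit sign codes.
rng = random.Random(42)
vectors = [[rng.gauss(0.0, 1.0) for _ in range(32)] for _ in range(200)]
projection = make_projection(dim=32, bits=128)
sketches = [sign_sketch(v, projection) for v in vectors]
top = two_phase_search(vectors[7], vectors, sketches, projection)
```

The shortlist size trades recall for speed: a larger shortlist makes it less likely that the sign-code filter discards a true neighbour, at the cost of more exact comparisons in the second phase.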