mirror of
https://github.com/MODSetter/SurfSense.git
synced 2026-06-18 21:15:16 +02:00
Greedy multiset match on chunk text decides which rows keep their embeddings, which texts need embedding, and which rows are deleted. No DB, no embeddings; fully unit-tested (reuse, head insert, middle edit, deletion, duplicates, reorder, full rewrite).

| Name | | |
|---|---|---|
| cache | ||
| __init__.py | ||
| conftest.py | ||
| test_chunk_reconciler.py | ||
| test_connector_document.py | ||
| test_create_placeholder_documents.py | ||
| test_document_chunker.py | ||
| test_document_hashing.py | ||
| test_index_batch.py | ||
| test_index_batch_parallel.py | ||
| test_migrate_legacy_docs.py | ||
| test_prepare_placeholder_dedup.py | ||
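
The commit message above describes a greedy multiset match on chunk text. The following is a minimal sketch of how such a reconciliation could work, not the repository's actual implementation: `ReconcilePlan`, `reconcile_chunks`, and the `(row_id, text)` input shape are hypothetical names chosen for illustration. It assumes existing rows are matched to new chunks purely by exact text equality, with duplicates consumed in their original order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass
class ReconcilePlan:
    """Hypothetical result type; not taken from the repository."""
    keep: list = field(default_factory=list)      # (existing row id, new position)
    to_embed: list = field(default_factory=list)  # (new position, text)
    delete: list = field(default_factory=list)    # existing row ids left unmatched


def reconcile_chunks(existing, new_texts):
    """Greedily match new chunk texts against existing (row_id, text) rows.

    Pure function: no database access and no embedding calls.
    """
    # Bucket existing row ids by text. A deque per text makes this a multiset,
    # so duplicate texts are matched one-for-one, oldest row first.
    pool = defaultdict(deque)
    for row_id, text in existing:
        pool[text].append(row_id)

    plan = ReconcilePlan()
    for position, text in enumerate(new_texts):
        bucket = pool.get(text)  # .get avoids creating empty buckets
        if bucket:
            # Same text already stored: reuse the row and its embedding.
            plan.keep.append((bucket.popleft(), position))
        else:
            # New or edited text: it needs a fresh embedding.
            plan.to_embed.append((position, text))

    # Rows never claimed by any new chunk are stale and can be deleted.
    leftover = {row_id for bucket in pool.values() for row_id in bucket}
    plan.delete = [row_id for row_id, _ in existing if row_id in leftover]
    return plan


if __name__ == "__main__":
    existing_rows = [(1, "a"), (2, "b"), (3, "a"), (4, "c")]
    plan = reconcile_chunks(existing_rows, ["x", "a", "b", "a", "a"])
    print(plan.keep)      # reused rows with their new positions
    print(plan.to_embed)  # texts that still need an embedding
    print(plan.delete)    # row ids to remove
```

Because matching is by text rather than position, a head insert or reorder in this sketch only changes the positions stored in `keep`, while a middle edit produces one entry in `to_embed` and one in `delete`.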