SurfSense/surfsense_backend/tests/unit/indexing_pipeline
CREDO23 f82dedf712 feat(indexing): add pure chunk reconciler for content-addressed diffs
Greedy multiset match on chunk text decides which rows keep their embeddings,
which texts need embedding, and which rows are deleted. No DB, no embeddings;
fully unit-tested (reuse, head insert, middle edit, deletion, duplicates,
reorder, full rewrite).
2026-06-12 18:52:46 +02:00
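
The commit message above describes a greedy multiset match on chunk text. A minimal sketch of that idea follows, assuming existing rows are keyed by an integer id and matched on exact text equality. The names `ExistingChunk`, `Reconciliation`, and `reconcile_chunks` are hypothetical and are not taken from the repository; the actual reconciler may differ in its types and return shape.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExistingChunk:
    """Hypothetical stand-in for a stored chunk row (id plus its text)."""
    id: int
    text: str


@dataclass
class Reconciliation:
    """Hypothetical result type: what to keep, embed, and delete."""
    reused: list[tuple[int, int]] = field(default_factory=list)    # (new position, existing row id)
    to_embed: list[tuple[int, str]] = field(default_factory=list)  # (new position, text)
    to_delete: list[int] = field(default_factory=list)             # existing row ids


def reconcile_chunks(existing: list[ExistingChunk], new_texts: list[str]) -> Reconciliation:
    # Build a multiset of existing rows: text -> queue of row ids, in stored order.
    pool: dict[str, deque[int]] = defaultdict(deque)
    for row in existing:
        pool[row.text].append(row.id)

    result = Reconciliation()
    # Greedy pass: each new chunk claims the first unclaimed row with identical text.
    for position, text in enumerate(new_texts):
        candidates = pool.get(text)
        if candidates:
            result.reused.append((position, candidates.popleft()))
        else:
            result.to_embed.append((position, text))

    # Rows nobody claimed are stale and can be deleted.
    unclaimed = {row_id for ids in pool.values() for row_id in ids}
    result.to_delete = [row.id for row in existing if row.id in unclaimed]
    return result


old_rows = [
    ExistingChunk(1, "a"),
    ExistingChunk(2, "b"),
    ExistingChunk(3, "a"),
    ExistingChunk(4, "c"),
]
plan = reconcile_chunks(old_rows, ["x", "a", "c", "a", "a"])
print(plan.reused)     # [(1, 1), (2, 4), (3, 3)]
print(plan.to_embed)   # [(0, 'x'), (4, 'a')]
print(plan.to_delete)  # [2]
```

Because the function touches neither a database nor an embedding model, each case listed in the commit message (reuse, head insert, middle edit, deletion, duplicates, reorder, full rewrite) can be tested with plain lists. Duplicate texts are handled by the per-text queue: a text that appears twice in the old rows can be reused at most twice, and a third occurrence is sent for embedding.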
cache refactor(embedding-cache): rename index cache to embedding cache 2026-06-12 17:00:01 +02:00
__init__.py test: bootstrap pytest environment for backend 2026-02-24 18:19:56 +02:00
conftest.py feat: enhance performance logging and caching in various components 2026-02-26 13:00:31 -08:00
test_chunk_reconciler.py feat(indexing): add pure chunk reconciler for content-addressed diffs 2026-06-12 18:52:46 +02:00
test_connector_document.py feat(tests): Update tests for summary-free indexing 2026-06-04 00:53:51 +05:30
test_create_placeholder_documents.py feat: made agent file system optimized 2026-03-28 16:39:46 -07:00
test_document_chunker.py add docstrings to all indexing pipeline tests 2026-02-25 20:30:31 +02:00
test_document_hashing.py feat: enhance Google connectors indexing with content extraction and document migration 2026-03-25 18:33:44 +05:30
test_index_batch.py refactor(tests): Update tests to remove summary references and adjust for embedding errors 2026-06-04 01:51:21 +05:30
test_index_batch_parallel.py test(index-cache): add unit tests and repoint embed/chunk patch targets 2026-06-12 16:48:18 +02:00
test_migrate_legacy_docs.py feat: add integration tests for indexing pipeline components 2026-03-25 18:34:02 +05:30
test_prepare_placeholder_dedup.py feat: made agent file system optimized 2026-03-28 16:39:46 -07:00