SurfSense/surfsense_backend/app/indexing_pipeline
CREDO23 91d947ff79 refactor(embedding-cache): rename index cache to embedding cache
The cached payload is the indexing pipeline's embeddings (markdown is
chunked then embedded), so "embedding cache" names the expensive output
directly and removes the "index" ambiguity (DB index vs vector index vs
indexing phase). Renames the service, settings, eligibility, eviction
task, metrics, config flags (INDEX_CACHE_* -> EMBEDDING_CACHE_*), object
prefix, and the table (index_cache_embedding_sets -> embedding_cache_sets)
with its constraint and indexes. Migration 161 renamed accordingly.
2026-06-12 17:00:01 +02:00
adapters feat(backend): Remove LLM summaries from document indexing 2026-06-04 00:50:19 +05:30
cache refactor(embedding-cache): rename index cache to embedding cache 2026-06-12 17:00:01 +02:00
__init__.py test: add ConnectorDocument unit tests and factory fixture 2026-02-24 22:20:08 +02:00
connector_document.py feat(backend): Remove LLM summaries from document indexing 2026-06-04 00:50:19 +05:30
document_chunker.py feat(chunker): add table-aware chunk_text_hybrid to prevent mid-row table splits 2026-05-05 12:48:04 +08:00
document_embedder.py feat: re-export embed_texts from document_embedder 2026-03-09 15:54:02 +02:00
document_hashing.py feat: made agent file system optimized 2026-03-28 16:39:46 -07:00
document_persistence.py fix(indexing): log and recover session in rollback_and_persist_failure 2026-06-10 00:10:25 +02:00
exceptions.py style: simplify LLM model terminology in UI 2026-04-02 10:11:35 +05:30
indexing_pipeline_service.py feat(index-cache): serve chunk embeddings from cache during indexing 2026-06-12 16:48:18 +02:00
pipeline_logger.py feat: enhance performance logging and caching in various components 2026-02-26 13:00:31 -08:00
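
The latest commit renames the config flags from INDEX_CACHE_* to EMBEDDING_CACHE_*. The listing does not show how existing deployments are expected to pick up the new names, so the following is only a minimal sketch of one way to bridge them, not code from this repository. The helper name `migrate_env_names` and the flag suffixes used in the usage note (`ENABLED`, `TTL`) are hypothetical; only the two prefixes come from the commit message.

```python
from typing import Mapping

# Prefixes taken from the commit message; everything else here is illustrative.
OLD_PREFIX = "INDEX_CACHE_"
NEW_PREFIX = "EMBEDDING_CACHE_"


def migrate_env_names(env: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of env with INDEX_CACHE_* keys renamed to EMBEDDING_CACHE_*.

    A value already present under the new name wins over the legacy one,
    and keys without the legacy prefix are copied through unchanged.
    """
    # Start from every key that does not use the legacy prefix.
    migrated = {k: v for k, v in env.items() if not k.startswith(OLD_PREFIX)}
    for key, value in env.items():
        if key.startswith(OLD_PREFIX):
            new_key = NEW_PREFIX + key[len(OLD_PREFIX):]
            # setdefault keeps an explicitly set new-style value intact.
            migrated.setdefault(new_key, value)
    return migrated
```

For example, `migrate_env_names({"INDEX_CACHE_ENABLED": "true", "EMBEDDING_CACHE_TTL": "60", "INDEX_CACHE_TTL": "30"})` carries the legacy `ENABLED` value over to the new prefix and keeps the new-style `TTL` rather than overwriting it with the legacy one. Letting the new name take precedence means a partially migrated environment behaves the same as a fully migrated one.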