SurfSense/surfsense_backend/app/tasks/document_processors
CREDO23 1791241c0c perf(indexers): offload sync embed_text to thread across background workers
Connector kb_sync_services (gmail, onedrive, google_calendar, jira),
streaming indexers (discord, luma, teams) and the file-processor save
path all called embed_text inside async coroutines, blocking the
background worker's event loop for the duration of the embed. Wrap each
call site in asyncio.to_thread so concurrent indexing tasks stop
serialising on the embed.
2026-05-20 10:09:38 +02:00
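The change described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `embed_text` body here is a hypothetical stand-in that sleeps to simulate a blocking embedding call, and `index_document`, `index_many`, and `run_demo` are made-up names. Only `asyncio.to_thread` and the name `embed_text` come from the commit message.

```python
import asyncio
import time


def embed_text(text: str) -> list[float]:
    """Hypothetical stand-in for a synchronous, blocking embedding call."""
    time.sleep(0.05)  # simulate blocking model inference
    return [float(len(text))]


async def index_document(text: str) -> list[float]:
    # Calling embed_text(text) directly here would block the event loop.
    # asyncio.to_thread runs it in a worker thread and awaits the result.
    return await asyncio.to_thread(embed_text, text)


async def index_many(texts: list[str]) -> list[list[float]]:
    # With the embed offloaded, concurrent tasks can overlap their embeds
    # instead of running them one after another on the loop.
    return await asyncio.gather(*(index_document(t) for t in texts))


def run_demo() -> list[list[float]]:
    return asyncio.run(index_many(["alpha", "be", "gam"]))
```

`asyncio.to_thread` uses the event loop's default thread pool, so this helps most when the embedding call releases the GIL or waits on I/O. Whether that holds for the project's actual `embed_text` is an assumption here.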
__init__.py refactor: consolidate document processing logic and remove unused files and ETL strategies 2026-04-05 17:29:24 +05:30
_direct_converters.py refactor: improve content extraction and encoding handling 2026-04-16 00:25:46 -07:00
_helpers.py refactor: consolidate document processing logic and remove unused files and ETL strategies 2026-04-05 17:29:24 +05:30
_save.py perf(indexers): offload sync embed_text to thread across background workers 2026-05-20 10:09:38 +02:00
base.py chore: ran linting 2026-03-17 04:40:46 +05:30
circleback_processor.py refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation 2026-03-15 00:44:27 -07:00
extension_processor.py refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation 2026-03-15 00:44:27 -07:00
file_processors.py chore: evals 2026-05-13 14:02:26 -07:00
markdown_processor.py refactor: streamline document upload limits and enhance handling of mentioned documents 2026-04-02 19:39:10 -07:00
youtube_processor.py refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation 2026-03-15 00:44:27 -07:00