SurfSense/surfsense_backend/app/tasks/document_processors
DESKTOP-RTLN3BA\$punk 2f793e7a69 refactor: improve content extraction and encoding handling
- Enhanced Azure Document Intelligence parser to raise an error for empty or whitespace-only content.
- Updated LLMRouterService to log premium model strings more clearly.
- Added automatic encoding detection for file reading in document processors.
- Improved error handling for empty markdown content extraction in file processors.
- Refactored DocumentUploadTab component for better accessibility and user interaction.
2026-04-16 00:25:46 -07:00
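The commit notes above mention two behaviours: automatic encoding detection when reading files, and raising an error when extracted content is empty or whitespace-only. The listing does not show how the processors implement either, so the following is only a minimal standard-library sketch of the general idea. The names `decode_with_fallback` and `require_text`, the fallback order, and the error message are hypothetical and are not taken from the repository, which may well use a dedicated detection library instead.

```python
import codecs


def decode_with_fallback(raw: bytes) -> tuple[str, str]:
    """Decode bytes and report which encoding was used.

    Hypothetical helper: checks for a BOM first, then tries UTF-8,
    then falls back to latin-1, which can decode any byte sequence.
    """
    if raw.startswith(codecs.BOM_UTF8):
        # Strip the UTF-8 byte order mark before decoding.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8-sig"
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order.
        return raw.decode("utf-16"), "utf-16"
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this never fails,
        # although the result may be wrong for other legacy encodings.
        return raw.decode("latin-1"), "latin-1"


def require_text(text: str, source: str) -> str:
    """Reject empty or whitespace-only content (hypothetical check)."""
    if not text.strip():
        raise ValueError(f"No extractable content in {source}")
    return text


if __name__ == "__main__":
    text, encoding = decode_with_fallback("héllo".encode("latin-1"))
    print(encoding, require_text(text, "example.txt"))
```

Failing fast on blank content keeps an empty document from being chunked and stored as if extraction had succeeded; the latin-1 fallback trades accuracy for never raising on unknown input.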
__init__.py refactor: consolidate document processing logic and remove unused files and ETL strategies 2026-04-05 17:29:24 +05:30
_direct_converters.py refactor: improve content extraction and encoding handling 2026-04-16 00:25:46 -07:00
_helpers.py refactor: consolidate document processing logic and remove unused files and ETL strategies 2026-04-05 17:29:24 +05:30
_save.py refactor: consolidate document processing logic and remove unused files and ETL strategies 2026-04-05 17:29:24 +05:30
base.py chore: ran linting 2026-03-17 04:40:46 +05:30
circleback_processor.py refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation 2026-03-15 00:44:27 -07:00
extension_processor.py refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation 2026-03-15 00:44:27 -07:00
file_processors.py refactor: improve content extraction and encoding handling 2026-04-16 00:25:46 -07:00
markdown_processor.py refactor: streamline document upload limits and enhance handling of mentioned documents 2026-04-02 19:39:10 -07:00
youtube_processor.py refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation 2026-03-15 00:44:27 -07:00