SurfSense/surfsense_backend/app/etl_pipeline
DESKTOP-RTLN3BA\$punk 2f793e7a69 refactor: improve content extraction and encoding handling
- Enhanced Azure Document Intelligence parser to raise an error for empty or whitespace-only content.
- Updated LLMRouterService to log premium model strings more clearly.
- Added automatic encoding detection for file reading in document processors.
- Improved error handling for empty markdown content extraction in file processors.
- Refactored DocumentUploadTab component for better accessibility and user interaction.
2026-04-16 00:25:46 -07:00
parsers refactor: improve content extraction and encoding handling 2026-04-16 00:25:46 -07:00
__init__.py feat: implement ETL pipeline with file classification and extraction services 2026-04-05 17:25:25 +05:30
constants.py feat: implement ETL pipeline with file classification and extraction services 2026-04-05 17:25:25 +05:30
etl_document.py feat: add processing mode support for document uploads and ETL pipeline, improved error handling UX 2026-04-14 21:26:00 -07:00
etl_pipeline_service.py feat: add processing mode support for document uploads and ETL pipeline, improved error handling UX 2026-04-14 21:26:00 -07:00
exceptions.py refactor: implement file type classification for supported extensions across Dropbox, Google Drive, and OneDrive connectors, enhancing file handling and error management 2026-04-06 22:03:47 +05:30
file_classifier.py Route uploaded images to vision LLM with document-parser fallback 2026-04-09 14:33:33 +02:00