SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-06-12 20:45:20 +02:00

Anish Sarkar 3da0ffd683 feat: add native Excel parsing and improve Google Drive content extraction		2026-03-27 21:47:14 +05:30
- Introduced a new utility for parsing .xlsx files into markdown format, enhancing the ability to process Excel documents natively.
- Updated the Google Drive content extractor to utilize the new Excel parsing functionality, allowing for better handling of spreadsheet files.
- Enhanced file type detection and export logic to support various document formats, improving overall content extraction accuracy.
- Added unit tests to ensure the correctness of the new Excel parsing feature and its integration with existing content extraction workflows.
__init__.py	Removed the CRAWLED_URL document processors	2025-11-21 23:27:21 -08:00
base.py	chore: ran linting	2026-03-17 04:40:46 +05:30
circleback_processor.py	refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation	2026-03-15 00:44:27 -07:00
extension_processor.py	refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation	2026-03-15 00:44:27 -07:00
file_processors.py	feat: add native Excel parsing and improve Google Drive content extraction	2026-03-27 21:47:14 +05:30
markdown_processor.py	feat: unify handling of native and legacy document types for Google connectors	2026-03-20 03:41:32 +05:30
youtube_processor.py	refactor: update safe_set_chunks function to be asynchronous and modify all connector and document processor files to use the new async implementation	2026-03-15 00:44:27 -07:00