SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-07-26 23:51:14 +02:00

Anish Sarkar f7b52470eb feat: enhance Google connectors indexing with content extraction and document migration	2026-03-25 18:33:44 +05:30
- Added `download_and_extract_content` function to extract content from Google Drive files as markdown.
- Updated Google Drive indexer to utilize the new content extraction method.
- Implemented document migration logic to update legacy Composio document types to their native Google types.
- Introduced identifier hashing for stable document identification.
- Improved file pre-filtering to handle unchanged and rename-only files efficiently.
adapters	refactor: implement UploadDocumentAdapter for file indexing and reindexing	2026-02-28 01:38:32 +05:30
__init__.py	test: add ConnectorDocument unit tests and factory fixture	2026-02-24 22:20:08 +02:00
connector_document.py	feat: enhance performance logging and caching in various components	2026-02-26 13:00:31 -08:00
document_chunker.py	feat: enhance performance logging and caching in various components	2026-02-26 13:00:31 -08:00
document_embedder.py	feat: re-export embed_texts from document_embedder	2026-03-09 15:54:02 +02:00
document_hashing.py	feat: enhance Google connectors indexing with content extraction and document migration	2026-03-25 18:33:44 +05:30
document_persistence.py	fix bugs in indexing pipeline exception handling	2026-02-25 16:27:12 +02:00
document_summarizer.py	feat: enhance performance logging and caching in various components	2026-02-26 13:00:31 -08:00
exceptions.py	Merge branch 'dev' of https://github.com/MODSetter/SurfSense into dev	2026-02-26 13:01:24 -08:00
indexing_pipeline_service.py	feat: enhance Google connectors indexing with content extraction and document migration	2026-03-25 18:33:44 +05:30
pipeline_logger.py	feat: enhance performance logging and caching in various components	2026-02-26 13:00:31 -08:00
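
The latest commit mentions identifier hashing for stable document identification and pre-filtering of unchanged and rename-only files. The following is a minimal Python sketch of how such a scheme could work. The function names, the fields that are hashed, and the classification labels are all assumptions made for illustration; they are not taken from `document_hashing.py` or any other file listed here.

```python
import hashlib
import json
from typing import Optional


def identifier_hash(document_type: str, source_id: str, search_space_id: int) -> str:
    """Hash the fields that identify a document (hypothetical choice of fields).

    The hash ignores the title and the content, so it stays the same when a
    file is renamed or edited. JSON encoding keeps field boundaries unambiguous.
    """
    payload = json.dumps([document_type, source_id, search_space_id])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def content_hash(content: str) -> str:
    """Hash the extracted text so that content changes can be detected."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def classify_file(stored: Optional[dict], name: str, content: str) -> str:
    """Decide how much work an incoming file needs (hypothetical labels).

    `stored` is the previously indexed record, looked up by identifier hash,
    or None if the file has not been seen before.
    """
    if stored is None:
        return "new"
    if stored["content_hash"] != content_hash(content):
        return "modified"  # content differs: re-chunk and re-embed
    if stored["name"] != name:
        return "renamed"  # same content: only the title needs updating
    return "unchanged"  # nothing to do


# Usage: "GOOGLE_DRIVE_FILE" and "file-123" are made-up example values.
doc_id = identifier_hash("GOOGLE_DRIVE_FILE", "file-123", 1)
record = {"name": "notes.md", "content_hash": content_hash("hello")}
status = classify_file(record, "notes-v2.md", "hello")
```

Under these assumptions, separating the identity hash from the content hash is what lets a rename-only change skip the expensive chunking and embedding steps.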