SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-05-03 21:02:40 +02:00

Author	SHA1	Message	Date
Anish Sarkar	99623a85d5	refactor: remove legacy Obsidian connector support	2026-04-22 00:10:24 +05:30
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	656e061f84	feat: add processing mode support for document uploads and ETL pipeline, improved error handling ux - Introduced a `ProcessingMode` enum to differentiate between basic and premium processing modes. - Updated `EtlRequest` to include a `processing_mode` field, defaulting to basic. - Enhanced ETL pipeline services to utilize the selected processing mode for Azure Document Intelligence and LlamaCloud parsing. - Modified various routes and services to handle processing mode, affecting document upload and indexing tasks. - Improved error handling and logging to include processing mode details. - Added tests to validate processing mode functionality and its impact on ETL operations.	2026-04-14 21:26:00 -07:00
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	4bee367d4a	feat: added ai file sorting	2026-04-14 01:43:30 -07:00
CREDO23	a95bf58c8f	Make Vision LLM opt-in for uploads and connectors	2026-04-10 16:45:51 +02:00
CREDO23	0aefcbd504	Remove vision LLM from desktop folder watcher	2026-04-09 22:06:06 +02:00
CREDO23	afd3c2cde2	Pass vision LLM through local folder indexer call chain	2026-04-09 14:50:24 +02:00
Anish Sarkar	56c5809170	chore: ran linting	2026-04-08 18:23:03 +05:30
Anish Sarkar	37c52ce7ea	feat: implement indexing progress management in local folder indexing process and enhance related test coverage	2026-04-08 18:01:55 +05:30
Anish Sarkar	a8b83dcf3f	feat: add folder_id support in ConnectorDocument and indexing pipeline for improved document organization	2026-04-08 17:48:50 +05:30
Anish Sarkar	f3aa514240	feat: integrate subtree ID retrieval in local folder cleanup process and enhance UI component styling for folder selection	2026-04-08 17:25:18 +05:30
Anish Sarkar	cab0d1bdfe	feat: enhance folder synchronization by integrating subtree ID retrieval and optimizing empty folder cleanup process	2026-04-08 17:10:22 +05:30
Anish Sarkar	ae98f64760	feat: enhance folder indexing with metadata management and improve folder structure handling in UI components	2026-04-08 16:48:40 +05:30
Anish Sarkar	60eb1e4060	feat: implement raw file hash computation to optimize content extraction during local folder indexing	2026-04-08 16:28:51 +05:30
Anish Sarkar	5f5954e932	feat: implement upload-based folder indexing and synchronization features	2026-04-08 15:46:52 +05:30
Anish Sarkar	0a26a6c5bb	chore: ran linting	2026-04-07 05:55:39 +05:30
Anish Sarkar	1b87719a92	refactor: enhance file skipping logic in Google Drive connector to check for Google Workspace files before unsupported extensions	2026-04-07 05:36:29 +05:30
Anish Sarkar	e4462292e4	refactor: update Google Drive indexer to return an additional unsupported file count, enhancing error reporting consistency	2026-04-07 05:30:10 +05:30
Anish Sarkar	122be76133	refactor: update _index_selected_files method signatures in Dropbox, Google Drive, and OneDrive indexers to include unsupported file count, enhancing error reporting and consistency across connectors	2026-04-07 03:16:46 +05:30
Anish Sarkar	3a1d700817	refactor: enhance file skipping logic across Dropbox, Google Drive, and OneDrive connectors to return unsupported extensions, improving error reporting and maintainability	2026-04-07 03:16:34 +05:30
Anish Sarkar	f03bf05aaa	refactor: enhance Google Drive indexer to support file extension filtering, improving file handling and error reporting	2026-04-06 22:34:49 +05:30
Anish Sarkar	0fb92b7c56	refactor: streamline file skipping logic in Dropbox indexer by removing redundant checks, improving code clarity	2026-04-06 22:17:50 +05:30
Anish Sarkar	63a75052ca	Merge remote-tracking branch 'upstream/dev' into feat/unified-etl-pipeline	2026-04-06 22:04:51 +05:30
Anish Sarkar	b5a15b7681	feat: implement cursor-based delta sync for Dropbox integration, enhancing file indexing efficiency and preserving folder cursors during re-authentication	2026-04-06 18:36:29 +05:30
Anish Sarkar	87af012a60	refactor: streamline file processing by integrating ETL pipeline for all file types and removing redundant functions	2026-04-05 17:45:18 +05:30
Anish Sarkar	a2b3541046	chore: ran linting	2026-04-04 03:11:56 +05:30
Anish Sarkar	0d2acc665d	Merge remote-tracking branch 'upstream/dev' into feat/page-limit-connectors	2026-04-04 03:08:27 +05:30
Anish Sarkar	ce40da80ea	feat: implement page limit estimation and enforcement in file based connector indexers - Added a static method `estimate_pages_from_metadata` to `PageLimitService` for estimating page counts based on file metadata. - Integrated page limit checks in Google Drive, Dropbox, and OneDrive indexers to prevent exceeding user quotas during file indexing. - Updated relevant indexing methods to utilize the new page estimation logic and enforce limits accordingly. - Enhanced tests for page limit functionality, ensuring accurate estimation and enforcement across different file types.	2026-04-04 02:51:28 +05:30
Anish Sarkar	9c0af6569d	feat: implement page limit checks in local folder indexing to manage user page usage	2026-04-03 19:13:25 +05:30
Anish Sarkar	b759bb36a9	feat: add direct conversion support for CSV, TSV, and HTML files in local folder indexing	2026-04-03 17:36:48 +05:30
Anish Sarkar	746c730b2e	chore: ran linting	2026-04-03 13:14:40 +05:30
Anish Sarkar	1fa8e1cc83	feat: refactor folder indexing to support batch processing of multiple files, enhancing performance and error handling	2026-04-03 10:02:36 +05:30
Anish Sarkar	e2ba509314	feat: enhance error handling in local folder indexing by adding rollback and refresh on IntegrityError	2026-04-03 09:29:59 +05:30
Anish Sarkar	44e39792da	feat: assign folder_id to documents before indexing to ensure correct folder visibility during processing	2026-04-03 04:14:28 +05:30
Anish Sarkar	53df393cf7	refactor: streamline local folder indexing logic by removing unused imports, enhancing content hashing, and improving document creation process	2026-04-02 23:28:23 +05:30
Anish Sarkar	c27d24a117	feat: enhance folder indexing by adding root folder ID support and implement folder creation and cleanup logic	2026-04-02 22:41:45 +05:30
Anish Sarkar	caf2525ab5	fix: update folder ID collection logic to include deleted directories and adjust test cases for document titles	2026-04-02 22:29:07 +05:30
Anish Sarkar	22ee5c99cc	refactor: remove Local Folder connector and related tasks, implement new folder indexing endpoints	2026-04-02 22:21:31 +05:30
Anish Sarkar	96a58d0d30	feat: implement local folder indexing and document versioning capabilities	2026-04-02 11:11:57 +05:30
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	17642493eb	chore: linting	2026-03-31 14:45:46 -07:00
Anish Sarkar	526940e9fe	fix: improve error handling and path retrieval in Dropbox indexing for better reliability	2026-03-30 23:51:21 +05:30
Anish Sarkar	d8d5102416	feat: introduce incremental sync option for Dropbox indexing, enhancing performance and user control	2026-03-30 23:27:48 +05:30
Anish Sarkar	1f12151e03	feat: implement Dropbox API client and folder management for enhanced file indexing	2026-03-30 22:17:50 +05:30
Anish Sarkar	04691d572b	chore: ran linting	2026-03-30 01:50:41 +05:30
Anish Sarkar	5a3eece397	Merge remote-tracking branch 'upstream/dev' into feat/onedrive-connector	2026-03-29 11:55:06 +05:30
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	2cc2d339e6	feat: made agent file system optimized	2026-03-28 16:39:46 -07:00
Anish Sarkar	5bddde60cb	feat: implement Microsoft OneDrive connector with OAuth support and indexing capabilities	2026-03-28 14:31:25 +05:30
Anish Sarkar	4e0749f907	fix: update file skipping logic for failed documents in Google Drive indexer - Modified the `_should_skip_file` function to skip previously failed documents during processing, improving error handling. - Updated the corresponding test to reflect the new behavior, ensuring that failed documents are correctly identified and skipped during automatic sync.	2026-03-27 20:01:08 +05:30
Anish Sarkar	00934ff462	feat: enhance Google Drive client with improved logging and thread-safe operations - Added logging to track the start and end of file download and export processes, improving visibility into execution time. - Implemented per-thread HTTP transport for concurrent downloads and exports, ensuring thread safety. - Refactored download and export methods to utilize resolved credentials, enhancing functionality. - Updated unit tests to validate the new threading and logging features, ensuring robust parallel execution.	2026-03-27 19:25:45 +05:30
Anish Sarkar	0bc1c766ff	feat: migrate Confluence and Jira indexers to unified parallel pipeline - Refactored Confluence and Jira indexers to utilize the shared IndexingPipelineService for improved document processing. - Updated the `_build_connector_doc` function in both indexers to create ConnectorDocument instances with enhanced metadata and fallback summaries. - Modified the `index_confluence_pages` and `index_jira_issues` functions to return a tuple of (indexed_count, skipped_count, warning_or_error_message) for better error handling and reporting. - Added unit tests for both indexers to validate the new parallel processing logic and ensure correct document creation and indexing behavior.	2026-03-27 16:02:09 +05:30
Anish Sarkar	db6dd058dd	feat: migrate Linear and Notion indexers to unified parallel pipeline - Refactored Linear and Notion indexers to utilize the shared IndexingPipelineService for improved document deduplication, summarization, chunking, and embedding with bounded parallel indexing. - Updated the `_build_connector_doc` function in both indexers to create ConnectorDocument instances with enhanced metadata and fallback summaries. - Modified the `index_linear_issues` and `index_notion_pages` functions to return a tuple of (indexed_count, skipped_count, warning_or_error_message) for better error handling and reporting. - Added unit tests for both indexers to validate the new parallel processing logic and ensure correct document creation and indexing behavior.	2026-03-27 11:19:32 +05:30

215 commits