SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-04-26 09:16:22 +02:00

Author	SHA1	Message	Date
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	656e061f84	feat: add processing mode support for document uploads and ETL pipeline, improded error handling ux - Introduced a `ProcessingMode` enum to differentiate between basic and premium processing modes. - Updated `EtlRequest` to include a `processing_mode` field, defaulting to basic. - Enhanced ETL pipeline services to utilize the selected processing mode for Azure Document Intelligence and LlamaCloud parsing. - Modified various routes and services to handle processing mode, affecting document upload and indexing tasks. - Improved error handling and logging to include processing mode details. - Added tests to validate processing mode functionality and its impact on ETL operations.	2026-04-14 21:26:00 -07:00
CREDO23	a95bf58c8f	Make Vision LLM opt-in for uploads and connectors	2026-04-10 16:45:51 +02:00
Anish Sarkar	56c5809170	chore: ran linting	2026-04-08 18:23:03 +05:30
Anish Sarkar	37c52ce7ea	feat: implement indexing progress management in local folder indexing process and enhance related test coverage	2026-04-08 18:01:55 +05:30
Anish Sarkar	a624c86b04	refactor: update file skipping logic in Dropbox, Google Drive, and OneDrive connectors to return unsupported extension information	2026-04-07 05:11:15 +05:30
Anish Sarkar	f03bf05aaa	refactor: enhance Google Drive indexer to support file extension filtering, improving file handling and error reporting	2026-04-06 22:34:49 +05:30
Anish Sarkar	a2b3541046	chore: ran linting	2026-04-04 03:11:56 +05:30
Anish Sarkar	0d2acc665d	Merge remote-tracking branch 'upstream/dev' into feat/page-limit-connectors	2026-04-04 03:08:27 +05:30
Anish Sarkar	ce40da80ea	feat: implement page limit estimation and enforcement in file based connector indexers - Added a static method `estimate_pages_from_metadata` to `PageLimitService` for estimating page counts based on file metadata. - Integrated page limit checks in Google Drive, Dropbox, and OneDrive indexers to prevent exceeding user quotas during file indexing. - Updated relevant indexing methods to utilize the new page estimation logic and enforce limits accordingly. - Enhanced tests for page limit functionality, ensuring accurate estimation and enforcement across different file types.	2026-04-04 02:51:28 +05:30
Anish Sarkar	9c0af6569d	feat: implement page limit checks in local folder indexing to manage user page usage	2026-04-03 19:13:25 +05:30
Anish Sarkar	edda5b98cb	chore: ran linting	2026-04-03 17:38:29 +05:30
Anish Sarkar	b759bb36a9	feat: add direct conversion support for CSV, TSV, and HTML files in local folder indexing	2026-04-03 17:36:48 +05:30
Anish Sarkar	746c730b2e	chore: ran linting	2026-04-03 13:14:40 +05:30
Anish Sarkar	62b44889d1	Merge remote-tracking branch 'upstream/dev' into feat/local-folder-sync	2026-04-03 11:42:43 +05:30
Anish Sarkar	2b9d79d44c	feat: add integration tests for batch processing of local folder indexing, covering multiple file scenarios and error handling	2026-04-03 10:04:14 +05:30
Anish Sarkar	1fa8e1cc83	feat: refactor folder indexing to support batch processing of multiple files, enhancing performance and error handling	2026-04-03 10:02:36 +05:30
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	62e698d8aa	refactor: streamline document upload limits and enhance handling of mentioned documents - Updated maximum file size limit to 500 MB per file. - Removed restrictions on the number of files per upload and total upload size. - Enhanced handling of user-mentioning documents in the knowledge base search middleware. - Improved document reading and processing logic to accommodate new features and optimizations.	2026-04-02 19:39:10 -07:00
Anish Sarkar	53df393cf7	refactor: streamline local folder indexing logic by removing unused imports, enhancing content hashing, and improving document creation process	2026-04-02 23:28:23 +05:30
Anish Sarkar	c27d24a117	feat: enhance folder indexing by adding root folder ID support and implement folder creation and cleanup logic	2026-04-02 22:41:45 +05:30
Anish Sarkar	caf2525ab5	fix: update folder ID collection logic to include deleted directories and adjust test cases for document titles	2026-04-02 22:29:07 +05:30
Anish Sarkar	22ee5c99cc	refactor: remove Local Folder connector and related tasks, implement new folder indexing endpoints	2026-04-02 22:21:31 +05:30
Anish Sarkar	775dea7894	feat: add integration and unit tests for local folder indexing and document versioning	2026-04-02 11:12:16 +05:30
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	ad0e77c3d6	feat: enhance knowledge base search with date filtering	2026-03-31 20:13:46 -07:00
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	a9fd45844d	feat: integrate Stripe for page purchases and reconciliation tasks	2026-03-31 18:39:45 -07:00
Anish Sarkar	272de1bb40	feat: add integration and unit tests for Dropbox indexing pipeline and parallel downloads	2026-03-30 22:19:15 +05:30
Anish Sarkar	04691d572b	chore: ran linting	2026-03-30 01:50:41 +05:30
Anish Sarkar	5a3eece397	Merge remote-tracking branch 'upstream/dev' into feat/onedrive-connector	2026-03-29 11:55:06 +05:30
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	2cc2d339e6	feat: made agent file sytem optimized	2026-03-28 16:39:46 -07:00
Anish Sarkar	028c88be72	feat: add integration and unit tests for OneDrive indexing pipeline and parallel downloads	2026-03-28 16:39:47 +05:30
Anish Sarkar	4e0749f907	fix: update file skipping logic for failed documents in Google Drive indexer - Modified the `_should_skip_file` function to skip previously failed documents during processing, improving error handling. - Updated the corresponding test to reflect the new behavior, ensuring that failed documents are correctly identified and skipped during automatic sync.	2026-03-27 20:01:08 +05:30
Anish Sarkar	c3d5c865fd	fix: update file skipping logic in Google Drive indexer - Modified the `_should_skip_file` function to prevent skipping of documents with a FAILED status, ensuring they are reprocessed even if their content remains unchanged. - Added a new integration test to verify that FAILED documents are not skipped during the indexing process.	2026-03-25 18:51:40 +05:30
Anish Sarkar	8c41fd91ba	feat: add integration tests for indexing pipeline components - Introduced integration tests for Calendar, Drive, and Gmail indexers to ensure proper document creation and migration. - Added tests for batch indexing functionality to validate the processing of multiple documents. - Implemented tests for legacy document migration to verify updates to document types and hashes. - Enhanced test coverage for the IndexingPipelineService to ensure robust functionality across various document types.	2026-03-25 18:34:02 +05:30
Anish Sarkar	2bc6a0c3bc	chore: ran linting	2026-03-22 00:43:53 +05:30
Anish Sarkar	e37e6d2d18	chore: ran linting	2026-03-21 13:21:19 +05:30
Anish Sarkar	de8841fb86	chore: ran linting	2026-03-21 13:20:13 +05:30
Anish Sarkar	8e7cda31c5	feat: update Google indexing functions to track skipped messages - Modified the indexing functions for Google Calendar and Gmail to return the count of skipped messages alongside indexed messages, enhancing performance tracking. - Updated related tests to accommodate the new return values, ensuring comprehensive coverage of the indexing process. - Improved error handling to maintain consistency in returned values across different indexing functions.	2026-03-19 20:56:40 +05:30
Anish Sarkar	e9485ab2df	feat: update Google Drive indexing to include skipped file tracking	2026-03-19 20:27:50 +05:30
Anish Sarkar	36f4709225	feat: add integration and unit tests for Google unification connectors - Introduced comprehensive integration tests for Google Drive, Gmail, and Calendar indexers, ensuring proper credential handling for both Composio and native connectors. - Added unit tests to validate the acceptance of Composio-sourced credentials across various connector types. - Implemented fixtures to seed test data and facilitate testing of hybrid search functionality, ensuring accurate document type filtering.	2026-03-19 17:51:15 +05:30
Anish Sarkar	851856a54b	fix: update document cleanup logic and mock Celery task in tests	2026-03-11 12:27:32 +05:30
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	d8a05ae4d5	feat: refactor agent tools management and add UI integration - Added endpoint to list agent tools with metadata, excluding hidden tools. - Updated NewChatRequest and RegenerateRequest schemas to include disabled tools. - Integrated disabled tools management in the NewChatPage and Composer components. - Improved tool instructions and visibility in the system prompt. - Refactored tool registration to support hidden tools and default enabled states. - Enhanced document chunk creation to handle strict zip behavior. - Cleaned up imports and formatting across various files for consistency.	2026-03-10 17:36:26 -07:00
Rohan Verma	547077e5b9	Merge pull request #865 from CREDO23/sur-182-fix-ux-experience-for-composio-google-drive-connector [Perf] Batch embedding, non-blocking search, chunks index & Google Drive UX fix	2026-03-10 12:52:16 -07:00
CREDO23	e951fbb991	fix: update stale embed_text mock in document_upload tests	2026-03-09 21:47:27 +02:00
CREDO23	929445afd9	feat: use batch embedding in IndexingPipelineService.index	2026-03-09 16:13:44 +02:00
Anish Sarkar	ca3710a239	fix: remove slowapi limiter for testing	2026-03-08 02:41:05 +05:30
Anish Sarkar	b2bf00e11a	chore: ran linting	2026-02-28 02:28:03 +05:30
Anish Sarkar	ce82807f16	test: enhance reindexing tests for UploadDocumentAdapter	2026-02-28 02:18:02 +05:30
Anish Sarkar	37f76a8533	test: add should_summarize parameter to file upload adapter tests	2026-02-28 01:44:41 +05:30
Anish Sarkar	23a98d802c	refactor: implement UploadDocumentAdapter for file indexing and reindexing	2026-02-28 01:38:32 +05:30
$DESKTOP-RTLN3BA\$punk$ DESKTOP-RTLN3BA\$punk	a4dc84d1ab	feat: add should_summarize parameter to task dispatchers - Introduced should_summarize parameter in TaskDispatcher and CeleryTaskDispatcher to control summary generation. - Updated InlineTaskDispatcher to support the new parameter for document processing.	2026-02-26 19:12:37 -08:00
Anish Sarkar	836d5293df	refactor: remove unused TestStatusPolling class from document upload integration tests	2026-02-27 01:52:35 +05:30

73 commits