Commit graph

1211 commits

Author SHA1 Message Date
Anish Sarkar
5f0a4d1a0f feat: add OneDrive file creation and deletion tools with connector checks 2026-03-28 14:34:30 +05:30
Anish Sarkar
5bddde60cb feat: implement Microsoft OneDrive connector with OAuth support and indexing capabilities 2026-03-28 14:31:25 +05:30
Anish Sarkar
b5ef7afb1c feat: add multi-format document export functionality to editor routes and UI components
- Implemented a new export endpoint in the backend to support exporting documents in various formats (PDF, DOCX, HTML, LaTeX, EPUB, ODT, plain text).
- Enhanced DocumentNode and FolderTreeView components to include export options in context and dropdown menus.
- Created shared ExportMenuItems component for consistent export options across the application.
- Integrated loading indicators for export actions to improve user experience.
2026-03-28 02:58:38 +05:30
Anish Sarkar
17091edb77 Merge remote-tracking branch 'upstream/dev' into refactor/indexing-pipelines 2026-03-27 22:36:34 +05:30
Anish Sarkar
6d4eb32345 fix: update export format for Google Docs to use correct MIME type 2026-03-27 22:20:32 +05:30
Anish Sarkar
489e48644f fix: revert native excel parsing 2026-03-27 22:15:24 +05:30
Anish Sarkar
dff8a1df37 feat: add descendant checking for folder filtering in Google Drive changes 2026-03-27 22:00:31 +05:30
Anish Sarkar
3da0ffd683 feat: add native Excel parsing and improve Google Drive content extraction
- Introduced a new utility for parsing .xlsx files into markdown format, enhancing the ability to process Excel documents natively.
- Updated the Google Drive content extractor to utilize the new Excel parsing functionality, allowing for better handling of spreadsheet files.
- Enhanced file type detection and export logic to support various document formats, improving overall content extraction accuracy.
- Added unit tests to ensure the correctness of the new Excel parsing feature and its integration with existing content extraction workflows.
2026-03-27 21:47:14 +05:30
Anish Sarkar
4e0749f907 fix: update file skipping logic for failed documents in Google Drive indexer
- Modified the `_should_skip_file` function to skip previously failed documents during processing, improving error handling.
- Updated the corresponding test to reflect the new behavior, ensuring that failed documents are correctly identified and skipped during automatic sync.
2026-03-27 20:01:08 +05:30
Anish Sarkar
00934ff462 feat: enhance Google Drive client with improved logging and thread-safe operations
- Added logging to track the start and end of file download and export processes, improving visibility into execution time.
- Implemented per-thread HTTP transport for concurrent downloads and exports, ensuring thread safety.
- Refactored download and export methods to utilize resolved credentials, enhancing functionality.
- Updated unit tests to validate the new threading and logging features, ensuring robust parallel execution.
2026-03-27 19:25:45 +05:30
Anish Sarkar
d2a4b238d7 feat: enhance Google Drive client with thread-safe download and export methods
- Implemented per-thread HTTP transport for concurrent downloads to ensure thread safety.
- Refactored `download_file` and `download_file_to_disk` methods to utilize blocking calls on separate threads, improving performance during file operations.
- Added logging to track the start and end of download and export processes, providing better visibility into execution time.
- Updated unit tests to verify parallel execution of download and export operations, ensuring efficiency in handling multiple requests.
2026-03-27 19:25:03 +05:30
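A rough sketch of the per-thread transport idea from d2a4b238d7 and 00934ff462, assuming the transport is cached in `threading.local` so that no two worker threads share one. `FakeTransport`, `get_thread_transport`, and `download` are hypothetical stand-ins, not the client's real API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Each thread sees its own attributes on this object.
_thread_state = threading.local()


class FakeTransport:
    """Stand-in for an HTTP transport that is not safe to share across threads."""


def get_thread_transport() -> FakeTransport:
    """Return the calling thread's transport, creating it on first use."""
    transport = getattr(_thread_state, "transport", None)
    if transport is None:
        transport = FakeTransport()
        _thread_state.transport = transport
    return transport


def download(file_id: str) -> tuple:
    """Pretend to download a file; report which thread and transport handled it."""
    transport = get_thread_transport()
    # A second lookup in the same thread returns the same object.
    assert transport is get_thread_transport()
    return file_id, threading.get_ident(), id(transport)


with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(download, ["f1", "f2", "f3", "f4"]))
```

Each worker thread builds its transport once and reuses it for later downloads, so concurrent calls never touch the same transport object.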
Anish Sarkar
0bc1c766ff feat: migrate Confluence and Jira indexers to unified parallel pipeline
- Refactored Confluence and Jira indexers to utilize the shared IndexingPipelineService for improved document processing.
- Updated the `_build_connector_doc` function in both indexers to create ConnectorDocument instances with enhanced metadata and fallback summaries.
- Modified the `index_confluence_pages` and `index_jira_issues` functions to return a tuple of (indexed_count, skipped_count, warning_or_error_message) for better error handling and reporting.
- Added unit tests for both indexers to validate the new parallel processing logic and ensure correct document creation and indexing behavior.
2026-03-27 16:02:09 +05:30
DESKTOP-RTLN3BA\$punk
685ad0c02d feat: add folder management features including creation, deletion, and organization of documents within folders 2026-03-27 01:39:15 -07:00
Anish Sarkar
683a4c17dd feat: implement thread-safe embedding access in document converters
- Added a reentrant lock to ensure thread-safe access to the tokenizer and embedding model, preventing runtime errors during concurrent operations.
- Updated the `truncate_for_embedding` and `embed_text` functions to utilize the lock, ensuring safe execution in multi-threaded environments.
- Enhanced the `embed_texts` function to maintain thread safety while processing multiple texts for embedding.
2026-03-27 11:31:00 +05:30
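A minimal sketch of the locking pattern described in 683a4c17dd, assuming a single module-level `threading.RLock` guards both helpers. The function bodies are toy stand-ins (whitespace tokens, a one-number "embedding"), not the repository's tokenizer or model.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# One process-wide reentrant lock guarding the shared tokenizer and model.
_embedding_lock = threading.RLock()


def truncate_for_embedding(text: str, max_tokens: int = 8) -> str:
    """Truncate by whitespace tokens; a stand-in for a real tokenizer."""
    with _embedding_lock:
        return " ".join(text.split()[:max_tokens])


def embed_text(text: str) -> list:
    """Return a toy one-dimensional 'embedding' (hypothetical model)."""
    with _embedding_lock:
        # Re-acquiring an RLock in the same thread is safe;
        # a plain Lock would deadlock on this nested call.
        truncated = truncate_for_embedding(text)
        return [float(len(truncated))]


with ThreadPoolExecutor(max_workers=4) as pool:
    vectors = list(pool.map(embed_text, ["alpha beta", "gamma"] * 3))
```

The reentrant lock matters because `embed_text` calls `truncate_for_embedding` while already holding the lock.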
Anish Sarkar
db6dd058dd feat: migrate Linear and Notion indexers to unified parallel pipeline
- Refactored Linear and Notion indexers to utilize the shared IndexingPipelineService for improved document deduplication, summarization, chunking, and embedding with bounded parallel indexing.
- Updated the `_build_connector_doc` function in both indexers to create ConnectorDocument instances with enhanced metadata and fallback summaries.
- Modified the `index_linear_issues` and `index_notion_pages` functions to return a tuple of (indexed_count, skipped_count, warning_or_error_message) for better error handling and reporting.
- Added unit tests for both indexers to validate the new parallel processing logic and ensure correct document creation and indexing behavior.
2026-03-27 11:19:32 +05:30
Anish Sarkar
da6bbcfe39 feat: add file streaming download functionality to Google Drive client
- Introduced `download_file_to_disk` method to stream files directly to disk in chunks, reducing memory usage during downloads.
- Updated `download_and_extract_content` function to utilize the new streaming download method for binary files, enhancing efficiency in handling large files.
- Improved error handling for download operations, providing clearer feedback on failures.
2026-03-27 08:54:06 +05:30
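The chunked streaming described in da6bbcfe39 might look roughly like the sketch below. It assumes the source is any binary file-like object; `stream_to_disk` and `chunk_size` are illustrative names, and the Google Drive download call itself is deliberately left out.

```python
import io
import os
import tempfile
from typing import BinaryIO


def stream_to_disk(source: BinaryIO, dest_path: str, chunk_size: int = 1024 * 1024) -> int:
    """Copy source to dest_path one chunk at a time; return the bytes written."""
    written = 0
    with open(dest_path, "wb") as out:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break  # End of stream.
            out.write(chunk)
            written += len(chunk)
    return written


payload = b"x" * 2500
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "download.bin")
    bytes_written = stream_to_disk(io.BytesIO(payload), path, chunk_size=1024)
    size_on_disk = os.path.getsize(path)
```

Only one chunk is held in memory at a time, so peak memory stays near `chunk_size` regardless of file size.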
Anish Sarkar
7c7f8b216c feat: implement batch indexing for selected Google Drive files
- Introduced `index_google_drive_selected_files` function to enable indexing of multiple user-selected files in parallel, improving efficiency.
- Refactored existing indexing logic to handle batch processing, including error handling for individual file failures.
- Added unit tests for the new batch indexing functionality, ensuring robustness and proper error collection during the indexing process.
2026-03-27 00:17:07 +05:30
Anish Sarkar
2f30e48e90 feat: implement async service locking in Google Drive client
- Introduced an asyncio lock to the GoogleDriveClient to serialize access to the service instance across concurrent coroutines.
- Refactored the get_service method to utilize the lock, preventing concurrent attempts to create the service and improving stability under concurrent use.
2026-03-27 00:06:21 +05:30
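A minimal sketch of the locking in 2f30e48e90, assuming `get_service` lazily builds the service under an `asyncio.Lock`. `LazyServiceClient`, `_build_service`, and `build_count` are hypothetical names used only for illustration.

```python
import asyncio


class LazyServiceClient:
    """Builds an expensive service object at most once, even with concurrent callers."""

    def __init__(self) -> None:
        self._service = None
        self._lock = asyncio.Lock()
        self.build_count = 0

    async def _build_service(self) -> object:
        # Stand-in for credential resolution and API client construction.
        self.build_count += 1
        await asyncio.sleep(0.01)
        return object()

    async def get_service(self) -> object:
        async with self._lock:
            # Only the first coroutine to hold the lock builds the service.
            if self._service is None:
                self._service = await self._build_service()
            return self._service


async def _demo() -> int:
    client = LazyServiceClient()
    services = await asyncio.gather(*(client.get_service() for _ in range(5)))
    assert all(s is services[0] for s in services)
    return client.build_count


build_count = asyncio.run(_demo())
```

Note that an `asyncio.Lock` serializes coroutines on one event loop; it is not a thread lock, so code running in other threads needs a separate mechanism.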
Anish Sarkar
c016962064 feat: implement parallel file downloading and indexing in Google Drive indexer
- Added `_download_files_parallel` function to enable concurrent downloading of files from Google Drive, improving efficiency in document processing.
- Introduced `_download_and_index` function to handle the parallel downloading and indexing phases, streamlining the overall workflow.
- Updated `_index_full_scan` and `_index_with_delta_sync` methods to utilize the new parallel downloading functionality, enhancing performance.
- Added unit tests to validate the new parallel downloading and indexing logic, ensuring robustness and error handling during document processing.
2026-03-26 23:53:26 +05:30
Anish Sarkar
bd6e335cb3 feat: enhance performance logging in indexing pipeline
- Added performance logging to the `index_batch_parallel` method, capturing metrics for document indexing duration and concurrency.
- Introduced timing measurements for both the overall indexing process and the parallel document gathering phase, improving observability of the indexing workflow.
- Updated logging statements to provide detailed insights into the number of documents processed, indexed, and failed during the indexing operation.
2026-03-26 23:10:49 +05:30
Anish Sarkar
4fd776e7ef feat: implement parallel indexing for Google Calendar and Gmail connectors
- Refactored Google Calendar and Gmail indexers to utilize the new `index_batch_parallel` method for concurrent document indexing, enhancing performance.
- Updated the indexing logic to replace serial processing with parallel execution, allowing for improved efficiency in handling multiple documents.
- Adjusted logging and error handling to accommodate the new parallel processing approach, ensuring robust operation during indexing.
- Enhanced unit tests to validate the functionality of the parallel indexing method and its integration with existing workflows.
2026-03-26 19:34:04 +05:30
Anish Sarkar
e5cb6bfacf feat: implement parallel document indexing in IndexingPipelineService
- Added `index_batch_parallel` method to enable concurrent indexing of documents with bounded concurrency, improving performance and efficiency.
- Refactored existing indexing logic to utilize `asyncio.to_thread` for non-blocking execution of embedding and chunking functions.
- Introduced unit tests to validate the functionality of the new parallel indexing method, ensuring robustness and error handling during document processing.
2026-03-26 19:33:49 +05:30
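A minimal sketch of bounded-concurrency indexing as described in e5cb6bfacf, assuming an `asyncio.Semaphore` caps the number of in-flight documents and `asyncio.to_thread` offloads blocking work. `index_batch_parallel_sketch`, `embed_blocking`, and `max_concurrency` are illustrative names, not the service's actual signatures.

```python
import asyncio


def embed_blocking(text: str) -> int:
    """Stand-in for a blocking chunking or embedding call (hypothetical)."""
    return len(text)


async def index_batch_parallel_sketch(docs: list, max_concurrency: int = 4):
    """Index docs concurrently, never running more than max_concurrency at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def index_one(doc: str) -> int:
        async with semaphore:
            # Run the blocking call in a worker thread so the event loop stays free.
            return await asyncio.to_thread(embed_blocking, doc)

    # return_exceptions=True keeps one failing document from cancelling the rest.
    results = await asyncio.gather(*(index_one(d) for d in docs), return_exceptions=True)
    indexed = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return indexed, failed


indexed, failed = asyncio.run(
    index_batch_parallel_sketch(["a", "bb", "ccc"], max_concurrency=2)
)
```

`asyncio.gather` returns results in input order, so successes and failures can be matched back to their documents if needed.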
Anish Sarkar
bbd5ee8a19 feat: enhance Google Calendar event update functionality
- Introduced helper functions `_is_date_only` and `_build_time_body` to streamline the construction of event start and end times for all-day and timed events.
- Refactored the `create_update_calendar_event_tool` to utilize the new helper functions, improving code readability and maintainability.
- Updated the Google Calendar sync service to ensure proper handling of calendar IDs with a default fallback to "primary".
- Modified the ApprovalCard component to simplify the construction of event update arguments, enhancing clarity and reducing redundancy.
2026-03-25 20:35:23 +05:30
Anish Sarkar
c3d5c865fd fix: update file skipping logic in Google Drive indexer
- Modified the `_should_skip_file` function to prevent skipping of documents with a FAILED status, ensuring they are reprocessed even if their content remains unchanged.
- Added a new integration test to verify that FAILED documents are not skipped during the indexing process.
2026-03-25 18:51:40 +05:30
Anish Sarkar
f7b52470eb feat: enhance Google connectors indexing with content extraction and document migration
- Added `download_and_extract_content` function to extract content from Google Drive files as markdown.
- Updated Google Drive indexer to utilize the new content extraction method.
- Implemented document migration logic to update legacy Composio document types to their native Google types.
- Introduced identifier hashing for stable document identification.
- Improved file pre-filtering to handle unchanged and rename-only files efficiently.
2026-03-25 18:33:44 +05:30
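The identifier hashing mentioned in f7b52470eb could be sketched as follows. This assumes the hash covers a document type, a source-side ID, and a search space ID, and deliberately excludes the file name so renames keep the same identity. The field choice, the `identifier_hash` name, and the `GOOGLE_DRIVE_FILE` value are all hypothetical.

```python
import hashlib


def identifier_hash(document_type: str, source_id: str, search_space_id: int) -> str:
    """Derive a stable hex digest from fields that do not change on rename."""
    raw = f"{document_type}:{source_id}:{search_space_id}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


first = identifier_hash("GOOGLE_DRIVE_FILE", "file-123", 7)
again = identifier_hash("GOOGLE_DRIVE_FILE", "file-123", 7)
other = identifier_hash("GOOGLE_DRIVE_FILE", "file-456", 7)
```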
Anish Sarkar
778cfac6fa Merge remote-tracking branch 'upstream/dev' into impr/thinking-steps 2026-03-25 01:50:10 +05:30
CREDO23
5d8a62a4a6 merge upstream/dev into feat/migrate-electric-to-zero
Resolve 8 conflicts:
- Accept upstream deletion of 3 composio_*_connector.py (unified Google connectors)
- Accept our deletion of ElectricProvider.tsx, use-connectors-electric.ts,
  use-messages-electric.ts (replaced by Zero equivalents)
- Keep both new deps in package.json (@rocicorp/zero + @slate-serializers/html)
- Regenerate pnpm-lock.yaml
2026-03-24 17:40:34 +02:00
Anish Sarkar
c926c3f62e refactor: remove display_image tool and associated UI components to streamline chat functionality 2026-03-24 19:00:55 +05:30
Anish Sarkar
3f4e1a7dfd refactor: remove frontend of scrape_webpage tool 2026-03-24 18:55:06 +05:30
Anish Sarkar
a009cae62a refactor: remove link_preview tool and associated components to streamline agent functionality 2026-03-24 17:15:29 +05:30
Anish Sarkar
6c507989d2 refactor: remove display_image tool and update related components to streamline image handling 2026-03-24 16:28:11 +05:30
CREDO23
cf21eaacfc fix: critical timestamp parsing and audit fixes
- Fix timestamp conversion: String(epochMs) → new Date(epochMs).toISOString()
  in use-messages-sync, use-comments-sync, use-documents, use-inbox.
  Without this, date comparisons (isEdited, cutoff filters) would fail.
- Fix updated_at: undefined → null in use-inbox to match InboxItem type
- Fix ZeroProvider: skip Zero connection for unauthenticated users
- Clean 30+ stale "Electric SQL" comments in backend Python code
2026-03-23 19:49:28 +02:00
Anish Sarkar
0fd03709c6 feat: add internal metadata keys and clean metadata in document formatting 2026-03-22 20:19:42 +05:30
Anish Sarkar
5c598e8588 Merge remote-tracking branch 'upstream/dev' into feat/human-in-the-loop 2026-03-22 15:45:45 +05:30
DESKTOP-RTLN3BA\$punk
eb8cfd296c feat: add public routes for video presentations and audio streaming 2026-03-21 23:29:23 -07:00
DESKTOP-RTLN3BA\$punk
d90b6d35ce feat: enhance video presentation agent with parallel theme assignment and watermarking 2026-03-21 23:02:09 -07:00
DESKTOP-RTLN3BA\$punk
b28f135a96 feat: init video presentation agent 2026-03-21 22:13:41 -07:00
Anish Sarkar
5b6b1e5d72 feat: add issue URL to Jira issue creation and update responses for direct access 2026-03-22 03:16:34 +05:30
Anish Sarkar
1e9ea983dd chore: ran linting 2026-03-22 03:07:18 +05:30
Anish Sarkar
2c17c355d5 feat: add page URL to Confluence page creation and update responses instead of showing page id 2026-03-22 02:55:33 +05:30
Anish Sarkar
2bc6a0c3bc chore: ran linting 2026-03-22 00:43:53 +05:30
Anish Sarkar
68f1a7c5ce refactor: deduplicate issue type names in JiraToolMetadataService 2026-03-21 21:02:52 +05:30
Anish Sarkar
e37e6d2d18 chore: ran linting 2026-03-21 13:21:19 +05:30
Anish Sarkar
de8841fb86 chore: ran linting 2026-03-21 13:20:13 +05:30
Anish Sarkar
77cc2af14f Merge remote-tracking branch 'upstream/dev' into feat/human-in-the-loop 2026-03-21 13:17:24 +05:30
Anish Sarkar
79bc123439 feat: implement lazy imports for token refresh in Confluence and Jira connectors
- Refactored token refresh logic in ConfluenceHistoryConnector and JiraHistoryConnector to use lazy imports, avoiding circular dependencies.
- Enhanced the ComposerAction component to manage tool availability based on connected types, adding support for Jira and Confluence tools.
- Updated tool icon management to include Jira and Confluence, improving the user interface for tool interactions.
2026-03-21 12:41:06 +05:30
Anish Sarkar
e71eae26fc feat: initial files for jira and confluence HITL tool 2026-03-21 12:16:44 +05:30
Anish Sarkar
9a20db7fc4 feat: add created_by_id to document creation in various sync services 2026-03-21 11:41:59 +05:30
Anish Sarkar
b71dd425f8 feat: enhance tool management in ComposerAction component
- Added support for grouping tools with connector icons, improving organization and user interaction.
- Implemented logic to toggle tool groups based on their enabled/disabled state, enhancing user experience.
- Updated the display of enabled tools count to reflect the new grouping structure.
- Introduced a new constant for connector tool icon paths to streamline icon management across components.
- Added a new tool action for updating Gmail drafts in the backend agent, expanding functionality.
2026-03-21 11:38:42 +05:30
Anish Sarkar
ff6514a99f feat: add DedupHITLToolCallsMiddleware to prevent duplicate tool calls
- Introduced DedupHITLToolCallsMiddleware to prevent duplicate HITL tool calls within a single LLM response, ensuring only the first occurrence of each tool call is retained.
- Updated the create_surfsense_deep_agent function to include the new middleware, enhancing the efficiency of tool interactions.
- Added a new middleware file for better organization and maintainability of the codebase.
2026-03-21 03:47:30 +05:30
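The first-occurrence rule from ff6514a99f can be sketched as below, assuming two tool calls count as duplicates when their tool name and arguments match. The dict shape (`id`, `name`, `args`) and the `dedup_tool_calls` helper are illustrative, not the middleware's actual interface.

```python
import json


def dedup_tool_calls(tool_calls: list) -> list:
    """Keep the first occurrence of each (name, args) pair, preserving order."""
    seen = set()
    unique = []
    for call in tool_calls:
        # sort_keys makes the key independent of argument ordering.
        key = (call["name"], json.dumps(call.get("args", {}), sort_keys=True))
        if key not in seen:
            seen.add(key)
            unique.append(call)
    return unique


calls = [
    {"id": "1", "name": "create_issue", "args": {"title": "Bug", "project": "A"}},
    {"id": "2", "name": "create_issue", "args": {"project": "A", "title": "Bug"}},
    {"id": "3", "name": "create_issue", "args": {"title": "Other", "project": "A"}},
]
deduped = dedup_tool_calls(calls)
```

Call `2` is dropped because it repeats call `1` with the same arguments in a different order; call `3` survives because its arguments differ.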