Commit graph

1437 commits

Author SHA1 Message Date
DESKTOP-RTLN3BA\$punk
2cc2d339e6 feat: made agent file sytem optimized 2026-03-28 16:39:46 -07:00
Anish Sarkar
b5ef7afb1c feat: add multi-format document export functionality to editor routes and UI components
- Implemented a new export endpoint in the backend to support exporting documents in various formats (PDF, DOCX, HTML, LaTeX, EPUB, ODT, plain text).
- Enhanced DocumentNode and FolderTreeView components to include export options in context and dropdown menus.
- Created shared ExportMenuItems component for consistent export options across the application.
- Integrated loading indicators for export actions to improve user experience.
2026-03-28 02:58:38 +05:30
Anish Sarkar
17091edb77 Merge remote-tracking branch 'upstream/dev' into refactor/indexing-pipelines 2026-03-27 22:36:34 +05:30
Anish Sarkar
6d4eb32345 fix: update export format for Google Docs to use correct MIME type 2026-03-27 22:20:32 +05:30
Anish Sarkar
489e48644f fix: revert native excel parsing 2026-03-27 22:15:24 +05:30
Anish Sarkar
dff8a1df37 feat: add descendant checking for folder filtering in Google Drive changes 2026-03-27 22:00:31 +05:30
Anish Sarkar
3da0ffd683 feat: add native Excel parsing and improve Google Drive content extraction
- Introduced a new utility for parsing .xlsx files into markdown format, enhancing the ability to process Excel documents natively.
- Updated the Google Drive content extractor to utilize the new Excel parsing functionality, allowing for better handling of spreadsheet files.
- Enhanced file type detection and export logic to support various document formats, improving overall content extraction accuracy.
- Added unit tests to ensure the correctness of the new Excel parsing feature and its integration with existing content extraction workflows.
2026-03-27 21:47:14 +05:30
Anish Sarkar
4e0749f907 fix: update file skipping logic for failed documents in Google Drive indexer
- Modified the `_should_skip_file` function to skip previously failed documents during processing, improving error handling.
- Updated the corresponding test to reflect the new behavior, ensuring that failed documents are correctly identified and skipped during automatic sync.
2026-03-27 20:01:08 +05:30
Anish Sarkar
00934ff462 feat: enhance Google Drive client with improved logging and thread-safe operations
- Added logging to track the start and end of file download and export processes, improving visibility into execution time.
- Implemented per-thread HTTP transport for concurrent downloads and exports, ensuring thread safety.
- Refactored download and export methods to utilize resolved credentials, enhancing functionality.
- Updated unit tests to validate the new threading and logging features, ensuring robust parallel execution.
2026-03-27 19:25:45 +05:30
Anish Sarkar
d2a4b238d7 feat: enhance Google Drive client with thread-safe download and export methods
- Implemented per-thread HTTP transport for concurrent downloads to ensure thread safety.
- Refactored `download_file` and `download_file_to_disk` methods to utilize blocking calls on separate threads, improving performance during file operations.
- Added logging to track the start and end of download and export processes, providing better visibility into execution time.
- Updated unit tests to verify parallel execution of download and export operations, ensuring efficiency in handling multiple requests.
2026-03-27 19:25:03 +05:30
Anish Sarkar
0bc1c766ff feat: migrate Confluence and Jira indexers to unified parallel pipeline
- Refactored Confluence and Jira indexers to utilize the shared IndexingPipelineService for improved document processing.
- Updated the `_build_connector_doc` function in both indexers to create ConnectorDocument instances with enhanced metadata and fallback summaries.
- Modified the `index_confluence_pages` and `index_jira_issues` functions to return a tuple of (indexed_count, skipped_count, warning_or_error_message) for better error handling and reporting.
- Added unit tests for both indexers to validate the new parallel processing logic and ensure correct document creation and indexing behavior.
2026-03-27 16:02:09 +05:30
DESKTOP-RTLN3BA\$punk
685ad0c02d feat: add folder management features including creation, deletion, and organization of documents within folders 2026-03-27 01:39:15 -07:00
Anish Sarkar
683a4c17dd feat: implement thread-safe embedding access in document converters
- Added a reentrant lock to ensure thread-safe access to the tokenizer and embedding model, preventing runtime errors during concurrent operations.
- Updated the `truncate_for_embedding` and `embed_text` functions to utilize the lock, ensuring safe execution in multi-threaded environments.
- Enhanced the `embed_texts` function to maintain thread safety while processing multiple texts for embedding.
2026-03-27 11:31:00 +05:30
Anish Sarkar
db6dd058dd feat: migrate Linear and Notion indexers to unified parallel pipeline
- Refactored Linear and Notion indexers to utilize the shared IndexingPipelineService for improved document deduplication, summarization, chunking, and embedding with bounded parallel indexing.
- Updated the `_build_connector_doc` function in both indexers to create ConnectorDocument instances with enhanced metadata and fallback summaries.
- Modified the `index_linear_issues` and `index_notion_pages` functions to return a tuple of (indexed_count, skipped_count, warning_or_error_message) for better error handling and reporting.
- Added unit tests for both indexers to validate the new parallel processing logic and ensure correct document creation and indexing behavior.
2026-03-27 11:19:32 +05:30
Anish Sarkar
da6bbcfe39 feat: add file streaming download functionality to Google Drive client
- Introduced `download_file_to_disk` method to stream files directly to disk in chunks, reducing memory usage during downloads.
- Updated `download_and_extract_content` function to utilize the new streaming download method for binary files, enhancing efficiency in handling large files.
- Improved error handling for download operations, providing clearer feedback on failures.
2026-03-27 08:54:06 +05:30
Anish Sarkar
7c7f8b216c feat: implement batch indexing for selected Google Drive files
- Introduced `index_google_drive_selected_files` function to enable indexing of multiple user-selected files in parallel, improving efficiency.
- Refactored existing indexing logic to handle batch processing, including error handling for individual file failures.
- Added unit tests for the new batch indexing functionality, ensuring robustness and proper error collection during the indexing process.
2026-03-27 00:17:07 +05:30
Anish Sarkar
2f30e48e90 feat: implement async service locking in Google Drive client
- Introduced an asyncio lock to the GoogleDriveClient to ensure thread-safe access to the service instance.
- Refactored the get_service method to utilize the lock, preventing concurrent attempts to create the service and improving stability in multi-threaded environments.
2026-03-27 00:06:21 +05:30
Anish Sarkar
c016962064 feat: implement parallel file downloading and indexing in Google Drive indexer
- Added `_download_files_parallel` function to enable concurrent downloading of files from Google Drive, improving efficiency in document processing.
- Introduced `_download_and_index` function to handle the parallel downloading and indexing phases, streamlining the overall workflow.
- Updated `_index_full_scan` and `_index_with_delta_sync` methods to utilize the new parallel downloading functionality, enhancing performance.
- Added unit tests to validate the new parallel downloading and indexing logic, ensuring robustness and error handling during document processing.
2026-03-26 23:53:26 +05:30
Anish Sarkar
bd6e335cb3 feat: enhance performance logging in indexing pipeline
- Added performance logging to the `index_batch_parallel` method, capturing metrics for document indexing duration and concurrency.
- Introduced timing measurements for both the overall indexing process and the parallel document gathering phase, improving observability of the indexing workflow.
- Updated logging statements to provide detailed insights into the number of documents processed, indexed, and failed during the indexing operation.
2026-03-26 23:10:49 +05:30
Anish Sarkar
4fd776e7ef feat: implement parallel indexing for Google Calendar and Gmail connectors
- Refactored Google Calendar and Gmail indexers to utilize the new `index_batch_parallel` method for concurrent document indexing, enhancing performance.
- Updated the indexing logic to replace serial processing with parallel execution, allowing for improved efficiency in handling multiple documents.
- Adjusted logging and error handling to accommodate the new parallel processing approach, ensuring robust operation during indexing.
- Enhanced unit tests to validate the functionality of the parallel indexing method and its integration with existing workflows.
2026-03-26 19:34:04 +05:30
Anish Sarkar
e5cb6bfacf feat: implement parallel document indexing in IndexingPipelineService
- Added `index_batch_parallel` method to enable concurrent indexing of documents with bounded concurrency, improving performance and efficiency.
- Refactored existing indexing logic to utilize `asyncio.to_thread` for non-blocking execution of embedding and chunking functions.
- Introduced unit tests to validate the functionality of the new parallel indexing method, ensuring robustness and error handling during document processing.
2026-03-26 19:33:49 +05:30
Anish Sarkar
bbd5ee8a19 feat: enhance Google Calendar event update functionality
- Introduced helper functions `_is_date_only` and `_build_time_body` to streamline the construction of event start and end times for all-day and timed events.
- Refactored the `create_update_calendar_event_tool` to utilize the new helper functions, improving code readability and maintainability.
- Updated the Google Calendar sync service to ensure proper handling of calendar IDs with a default fallback to "primary".
- Modified the ApprovalCard component to simplify the construction of event update arguments, enhancing clarity and reducing redundancy.
2026-03-25 20:35:23 +05:30
Anish Sarkar
c3d5c865fd fix: update file skipping logic in Google Drive indexer
- Modified the `_should_skip_file` function to prevent skipping of documents with a FAILED status, ensuring they are reprocessed even if their content remains unchanged.
- Added a new integration test to verify that FAILED documents are not skipped during the indexing process.
2026-03-25 18:51:40 +05:30
Anish Sarkar
8c41fd91ba feat: add integration tests for indexing pipeline components
- Introduced integration tests for Calendar, Drive, and Gmail indexers to ensure proper document creation and migration.
- Added tests for batch indexing functionality to validate the processing of multiple documents.
- Implemented tests for legacy document migration to verify updates to document types and hashes.
- Enhanced test coverage for the IndexingPipelineService to ensure robust functionality across various document types.
2026-03-25 18:34:02 +05:30
Anish Sarkar
f7b52470eb feat: enhance Google connectors indexing with content extraction and document migration
- Added `download_and_extract_content` function to extract content from Google Drive files as markdown.
- Updated Google Drive indexer to utilize the new content extraction method.
- Implemented document migration logic to update legacy Composio document types to their native Google types.
- Introduced identifier hashing for stable document identification.
- Improved file pre-filtering to handle unchanged and rename-only files efficiently.
2026-03-25 18:33:44 +05:30
Anish Sarkar
778cfac6fa Merge remote-tracking branch 'upstream/dev' into impr/thinking-steps 2026-03-25 01:50:10 +05:30
CREDO23
5d8a62a4a6 merge upstream/dev into feat/migrate-electric-to-zero
Resolve 8 conflicts:
- Accept upstream deletion of 3 composio_*_connector.py (unified Google connectors)
- Accept our deletion of ElectricProvider.tsx, use-connectors-electric.ts,
  use-messages-electric.ts (replaced by Zero equivalents)
- Keep both new deps in package.json (@rocicorp/zero + @slate-serializers/html)
- Regenerate pnpm-lock.yaml
2026-03-24 17:40:34 +02:00
CREDO23
5bf2734e8c add migration to clean up Electric SQL artifacts 2026-03-24 17:19:07 +02:00
CREDO23
0916c1addd remove stale ElectricSQL references from changelog and test fixtures 2026-03-24 17:07:11 +02:00
Anish Sarkar
c926c3f62e refactor: remove display_image tool and associated UI components to streamline chat functionality 2026-03-24 19:00:55 +05:30
Anish Sarkar
3f4e1a7dfd refactor: remove frontend of scrape_webpage tool 2026-03-24 18:55:06 +05:30
Anish Sarkar
a009cae62a refactor: remove link_preview tool and associated components to streamline agent functionality 2026-03-24 17:15:29 +05:30
Anish Sarkar
6c507989d2 refactor: remove display_image tool and update related components to streamline image handling 2026-03-24 16:28:11 +05:30
CREDO23
4efb4fab4b fix: make migration 107 idempotent for video_presentations table
Check information_schema before creating table to avoid
DuplicateTableError when table already exists from model metadata.
Original op.create_table preserved exactly, only skipped if exists.
2026-03-23 20:27:09 +02:00
CREDO23
29b9cc814f fix: make migration 104 idempotent with if_not_exists on index creation
The notification composite indexes may already exist from the model's
__table_args__, causing DuplicateTableError on fresh migrations.
2026-03-23 20:22:48 +02:00
CREDO23
cf21eaacfc fix: critical timestamp parsing and audit fixes
- Fix timestamp conversion: String(epochMs) → new Date(epochMs).toISOString()
  in use-messages-sync, use-comments-sync, use-documents, use-inbox.
  Without this, date comparisons (isEdited, cutoff filters) would fail.
- Fix updated_at: undefined → null in use-inbox to match InboxItem type
- Fix ZeroProvider: skip Zero connection for unauthenticated users
- Clean 30+ stale "Electric SQL" comments in backend Python code
2026-03-23 19:49:28 +02:00
CREDO23
2b7465cdaa chore: remove Electric SQL plumbing and infrastructure
Remove all Electric SQL client code, Docker service, env vars, CI build
args, install scripts, and documentation. Feature hooks that depend on
Electric are intentionally left in place to be rewritten with Rocicorp
Zero in subsequent commits.

Deleted:
- lib/electric/ (client.ts, context.ts, auth.ts, baseline.ts)
- ElectricProvider.tsx
- docker/scripts/init-electric-user.sh
- content/docs/how-to/electric-sql.mdx

Cleaned:
- package.json (4 @electric-sql/* deps)
- app/layout.tsx, UserDropdown.tsx, LayoutDataProvider.tsx
- docker-compose.yml, docker-compose.dev.yml
- Dockerfile, docker-entrypoint.js
- .env.example (frontend, docker, backend)
- CI workflows, install scripts, docs
2026-03-23 16:53:20 +02:00
Anish Sarkar
0fd03709c6 feat: add internal metadata keys and clean metadata in document formatting 2026-03-22 20:19:42 +05:30
Anish Sarkar
5c598e8588 Merge remote-tracking branch 'upstream/dev' into feat/human-in-the-loop 2026-03-22 15:45:45 +05:30
DESKTOP-RTLN3BA\$punk
43c9a09a6c chore: add daytona dependency to pyproject.toml and update uv.lock 2026-03-22 01:55:56 -07:00
DESKTOP-RTLN3BA\$punk
0cd596b91f refactor: update video presentation status enum creation to use SQL execution for better handling of duplicates 2026-03-22 01:41:15 -07:00
DESKTOP-RTLN3BA\$punk
eb8cfd296c feat: add public routes for video presentations and audio streaming 2026-03-21 23:29:23 -07:00
DESKTOP-RTLN3BA\$punk
d90b6d35ce feat: enhance video presentation agent with parallel theme assignment and watermarking 2026-03-21 23:02:09 -07:00
DESKTOP-RTLN3BA\$punk
b28f135a96 feat: init video presentation agent 2026-03-21 22:13:41 -07:00
Anish Sarkar
5b6b1e5d72 feat: add issue URL to Jira issue creation and update responses for direct access 2026-03-22 03:16:34 +05:30
Anish Sarkar
1e9ea983dd chore: ran linting 2026-03-22 03:07:18 +05:30
Anish Sarkar
2c17c355d5 feat: add page URL to Confluence page creation and update responses instead of showing page id 2026-03-22 02:55:33 +05:30
Anish Sarkar
2bc6a0c3bc chore: ran linting 2026-03-22 00:43:53 +05:30
Anish Sarkar
68f1a7c5ce refactor: deduplicate issue type names in JiraToolMetadataService 2026-03-21 21:02:52 +05:30
Anish Sarkar
e37e6d2d18 chore: ran linting 2026-03-21 13:21:19 +05:30