- Introduced a `ProcessingMode` enum to differentiate between basic and premium processing modes.
- Updated `EtlRequest` to include a `processing_mode` field, defaulting to basic.
- Enhanced ETL pipeline services to utilize the selected processing mode for Azure Document Intelligence and LlamaCloud parsing.
- Modified various routes and services to handle processing mode, affecting document upload and indexing tasks.
- Improved error handling and logging to include processing mode details.
- Added tests to validate processing mode functionality and its impact on ETL operations.
- Added a static method `estimate_pages_from_metadata` to `PageLimitService` for estimating page counts based on file metadata.
- Integrated page limit checks in Google Drive, Dropbox, and OneDrive indexers to prevent exceeding user quotas during file indexing.
- Updated relevant indexing methods to utilize the new page estimation logic and enforce limits accordingly.
- Enhanced tests for page limit functionality, ensuring accurate estimation and enforcement across different file types.
- Updated maximum file size limit to 500 MB per file.
- Removed restrictions on the number of files per upload and total upload size.
- Enhanced handling of user-mentioning documents in the knowledge base search middleware.
- Improved document reading and processing logic to accommodate new features and optimizations.
- Modified the `_should_skip_file` function to skip previously failed documents during processing, improving error handling.
- Updated the corresponding test to reflect the new behavior, ensuring that failed documents are correctly identified and skipped during automatic sync.
- Modified the `_should_skip_file` function to prevent skipping of documents with a FAILED status, ensuring they are reprocessed even if their content remains unchanged.
- Added a new integration test to verify that FAILED documents are not skipped during the indexing process.
- Introduced integration tests for Calendar, Drive, and Gmail indexers to ensure proper document creation and migration.
- Added tests for batch indexing functionality to validate the processing of multiple documents.
- Implemented tests for legacy document migration to verify updates to document types and hashes.
- Enhanced test coverage for the IndexingPipelineService to ensure robust functionality across various document types.
- Modified the indexing functions for Google Calendar and Gmail to return the count of skipped messages alongside indexed messages, enhancing performance tracking.
- Updated related tests to accommodate the new return values, ensuring comprehensive coverage of the indexing process.
- Improved error handling to maintain consistency in returned values across different indexing functions.
- Introduced comprehensive integration tests for Google Drive, Gmail, and Calendar indexers, ensuring proper credential handling for both Composio and native connectors.
- Added unit tests to validate the acceptance of Composio-sourced credentials across various connector types.
- Implemented fixtures to seed test data and facilitate testing of hybrid search functionality, ensuring accurate document type filtering.
- Added endpoint to list agent tools with metadata, excluding hidden tools.
- Updated NewChatRequest and RegenerateRequest schemas to include disabled tools.
- Integrated disabled tools management in the NewChatPage and Composer components.
- Improved tool instructions and visibility in the system prompt.
- Refactored tool registration to support hidden tools and default enabled states.
- Enhanced document chunk creation to handle strict zip behavior.
- Cleaned up imports and formatting across various files for consistency.
- Introduced should_summarize parameter in TaskDispatcher and CeleryTaskDispatcher to control summary generation.
- Updated InlineTaskDispatcher to support the new parameter for document processing.