The Gmail indexer used a hardcoded 30-day default instead of respecting
last_indexed_at like the other connectors. It now uses calculate_date_range()
for consistent behavior (last_indexed_at → now, or 365 days back on the first run).
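
For reviewers, the intended range selection can be sketched roughly as follows. This is an illustration only: `calculate_date_range_sketch` and its parameters are hypothetical names, and the real `calculate_date_range()` signature is not shown in this change description.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def calculate_date_range_sketch(
    last_indexed_at: Optional[datetime],
    now: Optional[datetime] = None,
    first_run_days: int = 365,  # assumption: mirrors the "365 days for first run" note
) -> Tuple[str, str]:
    """Return (start, end) as YYYY-MM-DD strings for an indexing run."""
    now = now or datetime.now(timezone.utc)
    if last_indexed_at is not None:
        # Incremental run: resume from the last successful index time.
        start = last_indexed_at
    else:
        # First run: look back a fixed window instead of a hardcoded 30 days.
        start = now - timedelta(days=first_run_days)
    return start.strftime("%Y-%m-%d"), now.strftime("%Y-%m-%d")
```

Passing `now` explicitly keeps the helper deterministic in tests; production callers would normally omit it.
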
Prevent UniqueViolationError on ix_documents_content_hash constraint by
adding check_duplicate_document_by_hash() before inserting new documents
in 15 connector indexers that were missing this check.
Affected: clickup, luma, linear, jira, google_gmail, confluence,
bookstack, github, webcrawler, teams, slack, notion, discord,
airtable, obsidian indexers.
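
The pre-insert check can be illustrated with a minimal, self-contained sketch. It uses the standard-library `sqlite3` module as a stand-in for the project's database layer, and `find_duplicate_by_hash`, `index_document`, and the table layout are hypothetical; the actual `check_duplicate_document_by_hash()` signature is not shown here.

```python
import hashlib
import sqlite3
from typing import Optional


def content_hash(content: str) -> str:
    """Hash the document body; SHA-256 is an assumption for this sketch."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with a unique index on content_hash."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, content_hash TEXT)"
    )
    conn.execute(
        "CREATE UNIQUE INDEX ix_documents_content_hash ON documents (content_hash)"
    )
    return conn


def find_duplicate_by_hash(conn: sqlite3.Connection, digest: str) -> Optional[int]:
    """Return the id of an existing document with this hash, or None."""
    row = conn.execute(
        "SELECT id FROM documents WHERE content_hash = ?", (digest,)
    ).fetchone()
    return row[0] if row else None


def index_document(conn: sqlite3.Connection, title: str, content: str) -> bool:
    """Insert the document unless its content is already indexed."""
    digest = content_hash(content)
    if find_duplicate_by_hash(conn, digest) is not None:
        # Skip instead of letting the unique index raise on insert.
        return False
    conn.execute(
        "INSERT INTO documents (title, content_hash) VALUES (?, ?)", (title, digest)
    )
    conn.commit()
    return True
```

Checking first turns a would-be constraint violation into an ordinary "skipped" result that each indexer can count and log.
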
- Updated `format_mentioned_documents_as_context` to include document metadata: the URL and a JSON representation of the metadata.
- Improved citation handling by using chunk IDs for mentioned documents.
- Adjusted the fetching of mentioned documents to load associated chunks for better citation accuracy.
- Cleaned up the context formatting for better readability and structure.
- Modified the Google Drive indexer to use SQLAlchemy's cast function for querying document metadata, ensuring proper type handling for file IDs.
- Improved the consistency of metadata queries across the indexing functions, enhancing reliability in document retrieval and processing.
- Updated error handling in the indexing functions for BookStack, Confluence, Google Calendar, Jira, Linear, and Luma connectors to log specific error messages when failures occur.
- Enhanced logging for cases where no pages or events are found, providing clearer informational messages instead of treating them as critical errors.
- Ensured consistent error reporting across all connector indexers, improving debugging and user feedback during indexing operations.
- Enhanced Google Calendar and Composio connector indexing to track and log duplicate content, preventing re-indexing of already processed events.
- Handled integrity errors during the final commit gracefully, so that indexing still succeeds when duplicates are present.
- Updated notification service to differentiate between actual errors and warnings for duplicate content, improving user feedback.
- Refactored date handling to ensure valid date ranges and adjusted end dates when necessary for better indexing accuracy.
- Normalized "undefined" strings in date parameters to None to prevent parsing errors.
- Improved date range validation to ensure start_date is strictly before end_date, adjusting end_date if necessary.
- Updated Google Calendar and Composio connector indexing logic to handle duplicate content more effectively, logging warnings for skipped events.
- Enhanced error handling during final commits to manage integrity errors gracefully.
- Refactored date handling in various connector indexers for consistency and reliability.
- Added user-friendly re-authentication messages for expired or revoked tokens in both Google Calendar and Gmail connectors.
- Updated error handling in indexing tasks to log specific authentication errors and provide clearer feedback to users.
- Enhanced the connector UI to handle indexing failures more effectively, improving overall user experience.
- Added methods to retrieve the starting page token and list changes in Google Drive, enabling delta sync capabilities.
- Updated Composio service to handle file download directory configuration.
- Modified indexing tasks to support delta sync, improving efficiency by processing only changed files.
- Adjusted date handling in connector tasks to allow optional start and end dates.
- Improved error handling and logging throughout the Composio indexing process.
- Enhanced the handling of file content from Composio, supporting both binary and text files with appropriate processing methods.
- Introduced robust error logging and handling for file content extraction, ensuring better visibility into issues during processing.
- Updated the indexing logic to accommodate new content processing methods, improving overall reliability and user feedback on errors.
- Added temporary file handling for binary files to facilitate text extraction using the ETL service.
- Added a new endpoint to list folders and files in a user's Composio Google Drive, supporting hierarchical structure.
- Implemented UI components for selecting specific folders and files to index, improving user control over indexing options.
- Introduced indexing options for maximum files per folder and inclusion of subfolders, allowing for customizable indexing behavior.
- Enhanced error handling and logging for Composio Drive operations, ensuring better visibility into issues during file retrieval and indexing.
- Updated the Composio configuration component to reflect new selection capabilities and indexing options.
- Updated the list_gmail_messages method to support pagination with page tokens, allowing for more efficient message retrieval.
- Modified the return structure to include next_page_token and result_size_estimate for better client-side handling.
- Improved error handling and logging throughout the Gmail indexing process, ensuring better visibility into failures.
- Implemented batch processing for Gmail messages, committing changes incrementally to prevent data loss.
- Ensured consistent timestamp updates for connectors, even when no documents are indexed, to maintain accurate UI states.
- Refactored the indexing logic to streamline message processing and enhance overall performance.
- Introduced new enum values for Composio connectors: COMPOSIO_GOOGLE_DRIVE_CONNECTOR, COMPOSIO_GMAIL_CONNECTOR, and COMPOSIO_GOOGLE_CALENDAR_CONNECTOR.
- Updated database migration to add these new enum values to the relevant types.
- Refactored Composio integration logic to handle specific connector types, improving the management of connected accounts and indexing processes.
- Enhanced frontend components to support the new Composio connector types, including updated UI elements and connector configuration handling.
- Improved backend services to manage Composio connected accounts more effectively, including deletion and indexing tasks.
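
The date-handling items above (normalizing "undefined" to None and keeping start_date strictly before end_date) could look roughly like the sketch below. The function names are hypothetical, dates are assumed to be YYYY-MM-DD strings, and moving end_date to one day after start_date is an assumption; the description only says end_date is adjusted when necessary.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

DATE_FMT = "%Y-%m-%d"  # assumption: date parameters arrive as YYYY-MM-DD strings


def normalize_date_param(value: Optional[str]) -> Optional[str]:
    """Map the literal string "undefined" (e.g. sent by a frontend) to None."""
    if value is None or value.strip().lower() == "undefined":
        return None
    return value


def ensure_valid_range(start_date: str, end_date: str) -> Tuple[str, str]:
    """Guarantee start_date < end_date, moving end_date forward if needed."""
    start = datetime.strptime(start_date, DATE_FMT)
    end = datetime.strptime(end_date, DATE_FMT)
    if start >= end:
        # Assumed policy: push end_date to the day after start_date.
        end = start + timedelta(days=1)
    return start.strftime(DATE_FMT), end.strftime(DATE_FMT)
```

Normalizing before parsing means a missing date falls through to the default range logic rather than failing inside `strptime`.
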