- Refactored GitHubConnector to utilize gitingest CLI via subprocess, improving performance and avoiding async issues with Celery.
- Updated ingestion method to handle repository digests more efficiently, including error handling for subprocess execution.
- Adjusted GitHub indexer to call the new synchronous ingestion method.
- Clarified documentation regarding the optional nature of the Personal Access Token for public repositories.
- Added gitingest as a dependency to streamline the ingestion of GitHub repositories.
- Refactored GitHubConnector to utilize gitingest for efficient repository digest generation, reducing API calls.
- Updated GitHub indexer to process entire repository digests, enhancing performance and simplifying the indexing process.
- Modified GitHub connect form to indicate that the Personal Access Token is optional for public repositories.
- Updated Google Drive API calls to include md5Checksum in file metadata retrieval for improved content tracking.
- Added logic to check for rename-only updates based on md5Checksum, optimizing document processing by preventing unnecessary ETL operations for unchanged content.
- Enhanced existing document update logic to handle renaming and metadata updates more effectively, particularly for Google Drive files.
- Implemented support for both new file_id-based and legacy filename-based hash schemes in document processing.
- Added functions to generate unique identifier hashes and find existing documents with migration support.
- Improved existing document update logic to handle content changes and metadata updates, particularly for Google Drive files.
- Enhanced UI components to display appropriate file icons based on file types in the Google Drive connector.
- Updated document processing functions to accommodate the new connector structure and ensure seamless integration.
- Added logic to refresh connector and notification attributes after indexing to ensure up-to-date information.
- Enhanced periodic sync configuration to disable the option when no folders or files are selected for Google Drive, providing user feedback through a message.
- Updated the connector edit view to reflect the new disabled state for periodic sync based on selected items.
- Implemented validation in the connector dialog to prevent enabling periodic sync without selected items, improving user experience.
- Updated the Google Drive indexing functionality to include indexing options such as max files per folder, incremental sync, and inclusion of subfolders.
- Modified the API to accept a new 'indexing_options' parameter in the request body.
- Enhanced the UI to allow users to configure these options when selecting folders and files for indexing.
- Updated related components and tasks to support the new indexing options, ensuring a more flexible and efficient indexing process.
- Enhanced error handling in the indexing process to differentiate between actual failures and cases where no new documents are processed.
- Updated notification messages to reflect the status accurately, including a message for when no new items are synced.
- Standardized return values across various indexer tasks to return `None` on success, simplifying logging and error management.
- Updated date handling in indexing functions to permit future dates for Google Calendar and Luma connectors.
- Enhanced UI components to support future date selection, including a new button for selecting the next 30 days.
- Adjusted documentation and descriptions to clarify date range options for users.
- Added ClickUp OAuth authentication flow with new environment variables for client ID, client secret, and redirect URI.
- Introduced ClickUpHistoryConnector to manage OAuth-based authentication and token refresh for ClickUp API access.
- Created ClickUp connector routes for OAuth flow, including authorization and callback handling.
- Updated indexing logic to utilize the new ClickUpHistoryConnector, supporting both OAuth and legacy API token methods.
- Enhanced frontend components to reflect the new ClickUp integration and removed legacy API token forms.
- Introduced AirtableHistoryConnector to manage OAuth-based authentication and token refresh for Airtable API access.
- Added date string validation in AirtableConnector to ensure valid date inputs before processing.
- Updated indexing logic to utilize the new AirtableHistoryConnector, improving credential management and token handling.
- Introduced JiraHistoryConnector to handle OAuth-based authentication and automatic token refresh for Jira API access.
- Refactored Jira indexing logic to utilize the new connector, simplifying credential management and enhancing token refresh capabilities.
- Removed legacy token handling code from the Jira indexer, streamlining the integration process.
- Ensured compatibility with both OAuth 2.0 and legacy API token methods for improved flexibility.
- Added support for Confluence OAuth with new environment variables for client ID, client secret, and redirect URI.
- Implemented Confluence connector routes for OAuth flow, including authorization and callback handling.
- Enhanced Confluence connector to support both OAuth 2.0 and legacy API token authentication methods.
- Updated Confluence indexing logic to utilize OAuth credentials with auto-refresh capabilities.
- Removed outdated Confluence UI components and adjusted frontend logic to reflect the new integration.
- Enhanced the indexing process for Discord messages to treat each message as an individual document, improving metadata handling and content management.
- Replaced the announcement banner component and related state management with a more streamlined approach, removing unnecessary files and simplifying the dashboard layout.
- Updated logging messages for clarity and accuracy regarding processed messages.
- Introduced a shared schema for Atlassian OAuth 2.0 credentials, accommodating both Jira and Confluence.
- Updated Jira connector routes to utilize the new AtlassianAuthCredentialsBase for handling OAuth tokens.
- Enhanced configuration to include new environment variables for Jira OAuth integration.
- Refactored token handling in Jira indexing logic to support the new shared credential structure.
- Added support for Jira OAuth with new environment variables for client ID, client secret, and redirect URI.
- Implemented Jira connector routes for OAuth flow, including authorization and callback handling.
- Enhanced Jira connector to support both OAuth 2.0 and legacy API token authentication methods.
- Updated Jira indexing logic to utilize OAuth credentials with auto-refresh capabilities.
- Removed outdated Jira UI components and adjusted frontend logic to reflect the new integration.
- Added support for Jira OAuth with new environment variables for client ID, client secret, and redirect URI.
- Implemented Jira connector routes for OAuth flow, including authorization and callback handling.
- Enhanced Jira connector to support both OAuth 2.0 and legacy API token authentication.
- Updated Jira indexing logic to utilize OAuth credentials with auto-refresh capabilities.
- Removed outdated Jira UI components and adjusted frontend logic to reflect the new integration.
- Introduced Discord OAuth support with new environment variables for client ID, client secret, and redirect URI.
- Implemented Discord connector routes for OAuth flow, including authorization and callback handling.
- Enhanced Discord connector to support both OAuth-based authentication and legacy bot token usage.
- Updated Discord indexing logic to utilize OAuth credentials with auto-refresh capabilities.
- Removed outdated Discord UI components and adjusted frontend logic to reflect the new integration.
- Updated SlackHistory class to enforce the use of session and connector_id for token refresh, raising a ValueError for legacy token usage.
- Simplified conditional checks for client initialization in SlackHistory.
- Cleaned up unnecessary comments and whitespace in the codebase.
- Introduced Slack OAuth support with new environment variables for client ID, client secret, and redirect URI.
- Implemented Slack connector routes for OAuth flow, including authorization and callback handling.
- Updated configuration to support both new OAuth format and legacy token handling.
- Enhanced the Slack indexer to decrypt tokens when necessary, ensuring compatibility with existing encrypted credentials.
- Removed outdated Slack connector UI components and adjusted frontend logic to reflect the new integration.
- Enhanced LinearConnector and NotionHistoryConnector classes to support automatic token refresh, improving reliability in accessing APIs.
- Updated initialization to require session and connector ID, allowing for dynamic credential management.
- Introduced new credential schemas for Linear and Notion, encapsulating access and refresh tokens with expiration handling.
- Refactored indexers to utilize the new connector structure, ensuring seamless integration with the updated authentication flow.
- Improved error handling and logging during token refresh processes for better debugging and user feedback.
- Enhanced token decryption logic in Airtable, Google Drive, Linear, and Notion indexers to only attempt decryption when tokens are explicitly marked as encrypted.
- Added error handling for missing SECRET_KEY when tokens are marked as encrypted, improving robustness and clarity in error reporting.
- Updated comments to clarify the handling of plaintext tokens when encryption is not indicated.
- Added encryption for sensitive tokens (access token, refresh token, client secret) in Google Calendar, Google Drive, Gmail, Linear, and Notion connectors to enhance security.
- Introduced OAuthStateManager for secure state parameter generation and validation, improving the integrity of OAuth flows.
- Updated callback routes to handle state validation and error management, ensuring robust handling of authorization processes.
- Enhanced indexers to support decryption of tokens for backward compatibility, maintaining functionality with existing encrypted credentials.
- Improved validation for date parameters in connector routes to ensure proper input handling.
- Introduced Linear OAuth support with new environment variables for client ID, client secret, and redirect URI.
- Implemented Linear connector routes for OAuth flow, including authorization and callback handling.
- Updated existing components to accommodate Linear integration, including validation changes and connector configuration.
- Enhanced the Linear indexer to utilize OAuth access tokens instead of API keys.
- Adjusted UI components to reflect the new Linear connector without requiring special configuration.
- Introduced Notion OAuth support with new environment variables for client ID, client secret, and redirect URI.
- Implemented Notion connector routes for OAuth flow, including authorization and callback handling.
- Updated existing components to accommodate Notion integration, including validation changes and connector configuration.
- Enhanced the Notion indexer to utilize OAuth access tokens instead of integration tokens.
- Adjusted UI components to reflect the new Notion connector without requiring special configuration.
- Add structured request body with folders and files arrays
- Support individual file indexing alongside folder indexing
- Remove deprecated folder_ids/folder_names query params
- Update UI to allow selecting both folders and files
- Simplify module docstrings (remove meta-commentary about 'small focused modules')
- Remove redundant inline comments (e.g., 'Log task start', 'Get connector from database')
- Trim verbose function docstrings to essential information only
- Remove over-explanatory comments that restate what code does
- Keep necessary documentation, remove noise for better readability
- Store tokens in folder_tokens dict instead of single global token
- Each folder now tracks its own sync state independently
- Fixes issue where indexing folder 2 incorrectly used delta sync after folder 1 was indexed
- First-time indexing now correctly uses full scan for each new folder
- Add optional 'connector' parameter with 'type' and 'metadata' fields
- Create helper function _update_document_from_connector
- Use document_metadata column (not metadata) for JSON field
- Merge metadata with existing using dict spread operator
- Google Drive documents now marked as GOOGLE_DRIVE_CONNECTOR
- Backward compatible - no changes to existing logic
- Simple and clean implementation
- Query document from database to ensure it's attached to session
- Prevents detached instance errors after process_file_in_background commits
- Properly updates document_type and metadata with session management