- Store tokens in folder_tokens dict instead of single global token
- Each folder now tracks its own sync state independently
- Fixes issue where indexing folder 2 incorrectly used delta sync after folder 1 was indexed
- First-time indexing now correctly uses full scan for each new folder
- Add optional 'connector' parameter with 'type' and 'metadata' fields
- Create helper function _update_document_from_connector
- Use document_metadata column (not metadata) for JSON field
- Merge metadata with existing using dict spread operator
- Google Drive documents now marked as GOOGLE_DRIVE_CONNECTOR
- Backward compatible - no changes to existing logic
- Simple and clean implementation
- Query document from database to ensure it's attached to session
- Prevents detached instance errors after process_file_in_background commits
- Properly updates document_type and metadata with session management
- Return file metadata from content_extractor for indexer to use
- Update document type and metadata in indexer after processing
- Explicitly commit changes to database
- Ensures documents are properly marked as GOOGLE_DRIVE_CONNECTOR type
- Change document_type from file type (PDF, DOCX) to GOOGLE_DRIVE_CONNECTOR
- Store original file type in metadata for reference
- Add Google Drive specific metadata (file_id, mime_type, source)
- Include export format info for Google Workspace files
- Enables proper source tracking and bulk management
- Accept comma-separated folder_ids and folder_names
- Loop through each folder and index sequentially
- Collect total indexed count and errors
- Update timestamp only on full success
- Accept folder_id and folder_name as indexing parameters
- Hide date range for Google Drive connectors
- Create wrapper function to avoid circular imports
- Trigger Google Drive indexing Celery task
- Full folder scan on first index
- Delta sync using change tracking for subsequent indexes
- Process files in parallel batches
- Handle file additions, modifications, and deletions
- Store change tracking token for efficient re-indexing
- OAuth initialization and callback handling
- Folder and file browsing with parent_id support
- Validate credentials and handle token refresh
- Return folder contents with metadata for UI tree view
- Get start page token for change tracking baseline
- Fetch incremental changes using Google Drive Changes API
- Categorize changes into added, modified, and removed files
- Enable efficient re-indexing of only changed content
- List folder contents with full pagination support
- Query root folder or specific parent folder
- Return both folders and files with metadata (size, icons, links)
- Filter out shortcuts and trashed items
- Download files from Google Drive to temporary location
- Export Google Workspace files as PDF
- Delegate content extraction to existing process_file_in_background
- Reuse Surfsense's ETL services (Unstructured, LlamaCloud, Docling)
- Detect Google Workspace files (Docs, Sheets, Slides)
- Map to PDF export format to preserve rich content (images, formatting)
- Identify files to skip (shortcuts, unsupported types)
- Build and manage Google Drive service with credentials
- List files with query support and pagination
- Download binary files and export Google Workspace files as PDF
- Handle HTTP errors gracefully
- Introduced DocumentMentionPicker component to enhance document selection experience in the chat interface.
- Updated InlineMentionEditor and Composer components to utilize the new DocumentMentionPicker.
- Removed the deprecated DocumentsDataTable component to streamline the codebase and improve maintainability.
- Enhanced type safety and validation in document handling logic.
- Added a fallback mechanism using headless Chromium to fetch page content when standard HTTP requests fail.
- Introduced utility functions for unescaping HTML entities and converting relative URLs to absolute.
- Updated HTTP request headers to mimic a browser for better compatibility with web servers.
- Improved error handling and logging for better debugging and user feedback.
- Made various properties in Zod schemas nullable for better type safety and flexibility in handling optional data.
- Updated system prompt to clarify usage of the display_image tool, emphasizing URL requirements and restrictions on user-uploaded images.
- Enhanced the streaming chat process to provide more context about user attachments and documents during analysis.
- Implemented state resets when switching between chats to prevent stale data and race conditions.
- Added new components for displaying image previews and document attachments in the chat interface.
- Improved attachment processing to support image data URLs for persistent display after uploads.
- Migrated data from the old 'chats' table to 'new_chat_threads' and 'new_chat_messages'.
- Dropped the 'chats' table and removed the 'chattype' enum as part of the migration process.
- Updated the migration script to truncate thread titles to 500 characters to comply with database constraints.
- Adjusted the downgrade function to reflect changes in the migration process.
- Updated the new chat routes to include handling for mentioned document IDs, allowing users to reference specific documents in their chat.
- Modified the NewChatRequest schema to accommodate optional document IDs.
- Implemented document mention formatting in the chat streaming service for improved context.
- Enhanced the frontend to manage document mentions, including a new atom for state management and UI updates for document selection.
- Refactored the DocumentsDataTable component for better integration with the new mention functionality.
- Enhanced the new chat agent module to allow for configurable tools, enabling users to customize their experience with various functionalities.
- Removed outdated tools including display image, knowledge base search, link preview, podcast generation, and web scraping, streamlining the codebase.
- Updated the system prompt and agent factory to reflect these changes, ensuring a more cohesive and efficient architecture.