Commit graph

167 commits

Author SHA1 Message Date
Anish Sarkar
0fdd194d92 Merge remote-tracking branch 'upstream/dev' into fix/documents 2026-02-06 12:13:26 +05:30
DESKTOP-RTLN3BA\$punk
1511c26ef5 feat: add residential proxy configuration for web crawling and YouTube transcript fetching 2026-02-05 20:44:13 -08:00
DESKTOP-RTLN3BA\$punk
f85adefe5e chore: made generate_image more agnostic 2026-02-05 17:18:27 -08:00
Anish Sarkar
c132e5ddb0 Merge remote-tracking branch 'upstream/dev' into fix/documents 2026-02-06 05:36:32 +05:30
Anish Sarkar
aa66928154 chore: ran linting 2026-02-06 05:35:15 +05:30
Anish Sarkar
5042fbfb85 feat: enhance Gmail and Google Drive connectors with document status management and duplicate content checks 2026-02-05 22:59:56 +05:30
Anish Sarkar
aef59d04eb feat: add document status management with JSONB column for processing states in documents 2026-02-05 21:59:31 +05:30
CREDO23
ecd0985523 Add access token pre-validation to OAuth connectors 2026-02-05 14:39:50 +02:00
Anish Sarkar
04884caeef refactor: simplify document title assignment across various connectors by removing prefix formatting 2026-02-05 02:30:20 +05:30
Anish Sarkar
580b75c3c4 chore: ran linting 2026-02-04 03:04:25 +05:30
Anish Sarkar
30c6f42102 feat: streamline Composio connector logic by removing redundant checks and enhancing email retrieval for user accounts 2026-02-04 03:03:40 +05:30
Anish Sarkar
65b79f3705 feat: enhance Google Drive connector with file MIME type file detection and content based detection as fallback 2026-02-03 22:57:01 +05:30
Anish Sarkar
2125c76841 feat: merge new credentials with existing connector configurations to preserve user settings 2026-02-02 19:03:05 +05:30
Anish Sarkar
bf08982029 feat: add connector_id to documents for source tracking and implement connector deletion task 2026-02-02 16:23:26 +05:30
Anish Sarkar
e0ade20e68 feat: add created_by_id column to documents for ownership tracking and update related connectors 2026-02-02 12:32:24 +05:30
Anish Sarkar
0d0d08fabd chore: ran linting 2026-02-02 00:43:25 +05:30
Anish Sarkar
f7c3b36798 fix: update hashlib usage in generate_indexing_settings_hash to improve security compliance 2026-02-02 00:41:04 +05:30
Anish Sarkar
085653d3e3 chore: ran frontend and backend linting 2026-02-01 22:54:25 +05:30
Anish Sarkar
2b2acfebb6 feat: enhance DiscordConnector with start event signaling for improved initialization handling 2026-02-01 03:32:45 +05:30
Anish Sarkar
024a683b4f feat: add heartbeat callback support for long-running indexing tasks and implement stale notification cleanup task 2026-02-01 02:17:06 +05:30
Anish Sarkar
5e555a8f9a fix: improve notification for token expiration and revocation errors for multiple connectors 2026-01-31 16:24:43 +05:30
Anish Sarkar
9771a88380 fix: refine date handling in Google Calendar connector to ensure accurate same-day queries 2026-01-30 20:51:03 +05:30
DESKTOP-RTLN3BA\$punk
d39bf3510f chore: linting 2026-01-28 22:20:23 -08:00
Anish Sarkar
1658724fb2 Merge remote-tracking branch 'upstream/dev' into fix/notion-connector 2026-01-29 10:45:31 +05:30
Anish Sarkar
59d5bf9aa5 fix(backend): Add error handling for invalid pagination cursor in NotionHistoryConnector to ensure graceful continuation of data fetching 2026-01-28 23:18:10 +05:30
Anish Sarkar
c6d25ed7d8 feat(backend): Add legacy token handling in NotionHistoryConnector and log warnings for legacy usage in indexing 2026-01-28 22:53:34 +05:30
Anish Sarkar
33316fa6db feat(backend): Add retry logic for Notion API calls with user notifications on rate limits and errors 2026-01-28 18:36:42 +05:30
Anish Sarkar
41ebe162b0 feat(backend): Implement handling of unsupported Notion block types and track skipped content, add documentation for it 2026-01-28 17:43:45 +05:30
Anish Sarkar
c125c9e87f chore: ran backend linting 2026-01-28 09:10:37 +05:30
Anish Sarkar
aab547264e feat(connector): implement duplicate detection by Google Drive file ID and generate settings hash for indexing configuration changes 2026-01-28 09:09:58 +05:30
Anish Sarkar
3af4fd0533 feat(indexing): add content hash check to prevent duplicate indexing and update return values for indexing functions 2026-01-28 03:55:25 +05:30
Anish Sarkar
a5103da3d7 chore: ran linting 2026-01-24 04:36:34 +05:30
Anish Sarkar
c48ba36fa4 feat: improve indexing logic and duplicate handling in connectors
- Enhanced Google Calendar and Composio connector indexing to track and log duplicate content, preventing re-indexing of already processed events.
- Implemented robust error handling during final commits to manage integrity errors gracefully, ensuring successful indexing despite potential duplicates.
- Updated notification service to differentiate between actual errors and warnings for duplicate content, improving user feedback.
- Refactored date handling to ensure valid date ranges and adjusted end dates when necessary for better indexing accuracy.
2026-01-23 23:36:14 +05:30
Anish Sarkar
d20bb385b5 feat: enhance date handling and indexing logic across connectors
- Added normalization for "undefined" strings to None in date parameters to prevent parsing errors.
- Improved date range validation to ensure start_date is strictly before end_date, adjusting end_date if necessary.
- Updated Google Calendar and Composio connector indexing logic to handle duplicate content more effectively, logging warnings for skipped events.
- Enhanced error handling during final commits to manage integrity errors gracefully.
- Refactored date handling in various connector indexers for consistency and reliability.
2026-01-23 23:03:29 +05:30
Anish Sarkar
1343fabeee feat: refactor composio connectors for modularity 2026-01-23 19:56:19 +05:30
Anish Sarkar
8d8f69545e feat: improve Google Calendar and Gmail connectors with enhanced error handling
- Added user-friendly re-authentication messages for expired or revoked tokens in both Google Calendar and Gmail connectors.
- Updated error handling in indexing tasks to log specific authentication errors and provide clearer feedback to users.
- Enhanced the connector UI to handle indexing failures more effectively, improving overall user experience.
2026-01-23 18:57:10 +05:30
Anish Sarkar
29382070aa feat: enhance Composio connector functionality with Google Drive delta sync support
- Added methods to retrieve the starting page token and list changes in Google Drive, enabling delta sync capabilities.
- Updated Composio service to handle file download directory configuration.
- Modified indexing tasks to support delta sync, improving efficiency by processing only changed files.
- Adjusted date handling in connector tasks to allow optional start and end dates.
- Improved error handling and logging throughout the Composio indexing process.
2026-01-23 18:37:09 +05:30
Anish Sarkar
8a0b8346a5 chore: ran linting 2026-01-23 05:28:18 +05:30
Anish Sarkar
4cbf80d73a feat: enhance Composio integration with pagination and improved error handling
- Updated the list_gmail_messages method to support pagination with page tokens, allowing for more efficient message retrieval.
- Modified the return structure to include next_page_token and result_size_estimate for better client-side handling.
- Improved error handling and logging throughout the Gmail indexing process, ensuring better visibility into failures.
- Implemented batch processing for Gmail messages, committing changes incrementally to prevent data loss.
- Ensured consistent timestamp updates for connectors, even when no documents are indexed, to maintain accurate UI states.
- Refactored the indexing logic to streamline message processing and enhance overall performance.
2026-01-23 04:44:37 +05:30
DESKTOP-RTLN3BA\$punk
12b825bff0 Merge branch 'dev' of https://github.com/MODSetter/SurfSense into dev 2026-01-21 22:58:48 -08:00
DESKTOP-RTLN3BA\$punk
8c625d4237 feat: composio connector 2026-01-21 22:57:58 -08:00
Anish Sarkar
35888144eb refactor: Update GitHub connector to use gitingest CLI
- Refactored GitHubConnector to utilize gitingest CLI via subprocess, improving performance and avoiding async issues with Celery.
- Updated ingestion method to handle repository digests more efficiently, including error handling for subprocess execution.
- Adjusted GitHub indexer to call the new synchronous ingestion method.
- Clarified documentation regarding the optional nature of the Personal Access Token for public repositories.
2026-01-20 23:24:33 +05:30
Anish Sarkar
49b8a46d10 feat: Integrate gitingest for GitHub repository ingestion
- Added gitingest as a dependency to streamline the ingestion of GitHub repositories.
- Refactored GitHubConnector to utilize gitingest for efficient repository digest generation, reducing API calls.
- Updated GitHub indexer to process entire repository digests, enhancing performance and simplifying the indexing process.
- Modified GitHub connect form to indicate that the Personal Access Token is optional for public repositories.
2026-01-20 21:52:32 +05:30
Anish Sarkar
f538d59ca3 feat: enhance Google Drive file metadata handling
- Updated Google Drive API calls to include md5Checksum in file metadata retrieval for improved content tracking.
- Added logic to check for rename-only updates based on md5Checksum, optimizing document processing by preventing unnecessary ETL operations for unchanged content.
- Enhanced existing document update logic to handle renaming and metadata updates more effectively, particularly for Google Drive files.
2026-01-17 16:24:53 +05:30
Manoj Aggarwal
8b735a492a lint 2026-01-09 13:53:09 -08:00
Manoj Aggarwal
62d0d8b6db ruff lint 2026-01-09 13:38:49 -08:00
Manoj Aggarwal
18035b3728 Add MS Teams connector 2026-01-09 13:20:47 -08:00
Manoj Aggarwal
fa35b71522 Add teams connector similar to slack 2026-01-09 13:20:30 -08:00
Manoj Aggarwal
786fd63e5b Revert "Add Microsoft Teams Connector" 2026-01-09 12:33:26 -08:00
Manoj Aggarwal
ba7e4f0ceb Add MS Teams connector 2026-01-08 17:13:19 -08:00
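
Note on commit d20bb385b5 (date handling): the commit message describes normalizing "undefined" strings to None and making sure start_date is strictly before end_date, adjusting end_date if necessary. A minimal sketch of that idea follows. It is not taken from the repository: the function names, the 30-day default window, treating an empty string like "undefined", and the one-day adjustment are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def normalize_date_param(value: Optional[str]) -> Optional[str]:
    """Map a missing value or the literal string "undefined" to None.

    Hypothetical helper; treating an empty string the same way is an assumption.
    """
    if value is None:
        return None
    cleaned = value.strip()
    if cleaned == "" or cleaned.lower() == "undefined":
        return None
    return cleaned


def resolve_date_range(
    start_raw: Optional[str],
    end_raw: Optional[str],
    today: Optional[date] = None,
    default_days: int = 30,  # assumed look-back window, not from the commit
) -> Tuple[date, date]:
    """Return (start, end) with start strictly before end."""
    today = today or date.today()
    start_str = normalize_date_param(start_raw)
    end_str = normalize_date_param(end_raw)

    # Fall back to today / a look-back window when a bound is missing.
    end = date.fromisoformat(end_str) if end_str else today
    start = (
        date.fromisoformat(start_str)
        if start_str
        else end - timedelta(days=default_days)
    )

    # Assumed adjustment: push end one day past start so the range is non-empty.
    if start >= end:
        end = start + timedelta(days=1)
    return start, end
```

With these assumptions, a same-day request such as resolve_date_range("2026-01-23", "2026-01-23") yields a range ending on 2026-01-24, and a start of "undefined" falls back to the look-back window instead of raising a parsing error.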