Commit graph

148 commits

Author SHA1 Message Date
DESKTOP-RTLN3BA\$punk
e9892c8fe9 feat: added configurable summary calculation and various improvements
- Replaced direct embedding calls with a utility function across various components to streamline embedding logic.
- Added enable_summary flag to several models and routes to control summary generation behavior.
2026-02-26 18:24:57 -08:00
CREDO23
7d1bd1fab4 Implement KB sync after Notion page updates with block ID verification
- Add NotionKBSyncService for immediate KB updates after page changes
- Implement block ID verification to ensure content freshness
- Refactor duplicate block processing logic to shared utils
- Add user-friendly status messages
- Include debug logging for troubleshooting
2026-02-17 20:30:12 +02:00
Rohan Verma
26fd61fcbb Merge pull request #796 from AnishSarkar22/feat/sur-149-batch-index
impr: batch index for messaging connectors & some fixes
2026-02-09 15:00:16 -08:00
DESKTOP-RTLN3BA\$punk
17b7348f61 feat: fixed and improved search and background task management. 2026-02-09 14:03:56 -08:00
Anish Sarkar
20ab128b05 feat: implement batch indexing for Microsoft Teams messages to improve efficiency and conversational context 2026-02-09 14:31:22 +05:30
Anish Sarkar
e2dd80c604 chore: ran linting 2026-02-08 12:43:31 +05:30
Anish Sarkar
7cede99d29 feat: implement batch indexing for Slack messages to enhance efficiency and conversational context 2026-02-07 18:30:06 +05:30
Anish Sarkar
98870a9f9a feat: implement batch indexing for Discord messages to improve efficiency and context 2026-02-07 18:26:29 +05:30
Anish Sarkar
0fdd194d92 Merge remote-tracking branch 'upstream/dev' into fix/documents 2026-02-06 12:13:26 +05:30
DESKTOP-RTLN3BA\$punk
1511c26ef5 feat: add residential proxy configuration for web crawling and YouTube transcript fetching 2026-02-05 20:44:13 -08:00
Anish Sarkar
aa66928154 chore: ran linting 2026-02-06 05:35:15 +05:30
Anish Sarkar
cc1e796c12 feat: implement two-phase document indexing for webcrawler and YouTube video processors with real-time status updates 2026-02-06 04:54:50 +05:30
Anish Sarkar
629f6f9cf5 feat: implement two-phase document indexing for Obsidian and Circleback connectors with real-time status updates 2026-02-06 04:35:13 +05:30
Anish Sarkar
0f61a249c0 feat: implement two-phase document indexing for BookStack, Elasticsearch, and Luma connectors with real-time status updates 2026-02-06 04:31:55 +05:30
Anish Sarkar
bfa3be655e feat: implement two-phase document indexing for ClickUp and GitHub connectors with real-time status updates 2026-02-06 04:06:14 +05:30
Anish Sarkar
1d870e45a4 feat: implement two-phase document indexing for Confluence and Jira connectors with real-time status updates 2026-02-06 03:54:24 +05:30
Anish Sarkar
0249ea20a5 feat: implement two-phase document indexing for Discord and Teams connectors with real-time status updates 2026-02-06 03:42:03 +05:30
Anish Sarkar
2077344934 feat: implement two-phase document indexing for Linear and Slack connectors with real-time status updates 2026-02-06 02:59:21 +05:30
Anish Sarkar
c12401c1e8 feat: implement two-phase document indexing across Google connectors with real-time status updates 2026-02-06 02:24:35 +05:30
Manoj Aggarwal
e6c0fabd0a Merge branch 'dev' into bugs_prod 2026-02-05 10:53:16 -08:00
Anish Sarkar
3bbac0d4ea feat: implement two-phase document indexing for Airtable and Notion connectors with real-time status updates 2026-02-06 00:12:48 +05:30
Anish Sarkar
aef59d04eb feat: add document status management with JSONB column for processing states in documents 2026-02-05 21:59:31 +05:30
Manoj Aggarwal
33165830e5 add parse date flexible 2026-02-04 13:18:33 -08:00
Anish Sarkar
04884caeef refactor: simplify document title assignment across various connectors by removing prefix formatting 2026-02-05 02:30:20 +05:30
Manoj Aggarwal
48e646607b Fix Google Calendar and Notion errors 2026-02-02 12:07:53 -08:00
Anish Sarkar
bf08982029 feat: add connector_id to documents for source tracking and implement connector deletion task 2026-02-02 16:23:26 +05:30
Anish Sarkar
e0ade20e68 feat: add created_by_id column to documents for ownership tracking and update related connectors 2026-02-02 12:32:24 +05:30
Anish Sarkar
cf339ff350 chore: ran backend linting 2026-02-02 00:19:19 +05:30
Anish Sarkar
085653d3e3 chore: ran frontend and backend linting 2026-02-01 22:54:25 +05:30
Anish Sarkar
024a683b4f feat: add heartbeat callback support for long-running indexing tasks and implement stale notification cleanup task 2026-02-01 02:17:06 +05:30
Anish Sarkar
9771a88380 fix: refine date handling in Google Calendar connector to ensure accurate same-day queries 2026-01-30 20:51:03 +05:30
Anish Sarkar
4526b656a4 fix: update default date range for Google Calendar events and improve query parameter handling 2026-01-30 19:55:48 +05:30
DESKTOP-RTLN3BA\$punk
d39bf3510f chore: linting 2026-01-28 22:20:23 -08:00
Anish Sarkar
1658724fb2 Merge remote-tracking branch 'upstream/dev' into fix/notion-connector 2026-01-29 10:45:31 +05:30
Anish Sarkar
c6d25ed7d8 feat(backend): Add legacy token handling in NotionHistoryConnector and log warnings for legacy usage in indexing 2026-01-28 22:53:34 +05:30
CREDO23
b20fbaca4b fix: skip webcrawler indexing gracefully when no URLs configured 2026-01-28 17:54:46 +02:00
Anish Sarkar
b3f553802c fix(backend): Update Notion page indexing log message to clarify sharing requirements and adjust return value for no pages found 2026-01-28 18:58:57 +05:30
CREDO23
4f7ed8439f fix(backend): Use calculate_date_range for Gmail indexer
Gmail indexer was using a hardcoded 30-day default instead of respecting
last_indexed_at like other connectors. Now uses calculate_date_range()
for consistent behavior (last_indexed_at → now, or 365 days for first run).
2026-01-28 15:20:07 +02:00
Anish Sarkar
33316fa6db feat(backend): Add retry logic for Notion API calls with user notifications on rate limits and errors 2026-01-28 18:36:42 +05:30
CREDO23
a9d393327d fix(backend): Add duplicate content_hash check to connector indexers
Prevent UniqueViolationError on ix_documents_content_hash constraint by
adding check_duplicate_document_by_hash() before inserting new documents
in 15 connector indexers that were missing this check.

Affected: clickup, luma, linear, jira, google_gmail, confluence,
bookstack, github, webcrawler, teams, slack, notion, discord,
airtable, obsidian indexers.
2026-01-28 14:51:54 +02:00
Anish Sarkar
41ebe162b0 feat(backend): Implement handling of unsupported Notion block types and track skipped content, add documentation for it 2026-01-28 17:43:45 +05:30
Anish Sarkar
a5103da3d7 chore: ran linting 2026-01-24 04:36:34 +05:30
Anish Sarkar
97d7207bd4 fix: update Google Drive indexer to use SQLAlchemy casting for metadata queries
- Modified the Google Drive indexer to use SQLAlchemy's cast function for querying document metadata, ensuring proper type handling for file IDs.
- Improved the consistency of metadata queries across the indexing functions, enhancing reliability in document retrieval and processing.
2026-01-24 04:33:10 +05:30
Anish Sarkar
5cf6fb15ed fix: improve error logging for indexing tasks across multiple connectors
- Updated error handling in the indexing functions for BookStack, Confluence, Google Calendar, Jira, Linear, and Luma connectors to log specific error messages when failures occur.
- Enhanced logging for cases where no pages or events are found, providing clearer informational messages instead of treating them as critical errors.
- Ensured consistent error reporting across all connector indexers, improving debugging and user feedback during indexing operations.
2026-01-24 03:59:17 +05:30
Anish Sarkar
c48ba36fa4 feat: improve indexing logic and duplicate handling in connectors
- Enhanced Google Calendar and Composio connector indexing to track and log duplicate content, preventing re-indexing of already processed events.
- Implemented robust error handling during final commits to manage integrity errors gracefully, ensuring successful indexing despite potential duplicates.
- Updated notification service to differentiate between actual errors and warnings for duplicate content, improving user feedback.
- Refactored date handling to ensure valid date ranges and adjusted end dates when necessary for better indexing accuracy.
2026-01-23 23:36:14 +05:30
Anish Sarkar
d20bb385b5 feat: enhance date handling and indexing logic across connectors
- Added normalization for "undefined" strings to None in date parameters to prevent parsing errors.
- Improved date range validation to ensure start_date is strictly before end_date, adjusting end_date if necessary.
- Updated Google Calendar and Composio connector indexing logic to handle duplicate content more effectively, logging warnings for skipped events.
- Enhanced error handling during final commits to manage integrity errors gracefully.
- Refactored date handling in various connector indexers for consistency and reliability.
2026-01-23 23:03:29 +05:30
Anish Sarkar
8d8f69545e feat: improve Google Calendar and Gmail connectors with enhanced error handling
- Added user-friendly re-authentication messages for expired or revoked tokens in both Google Calendar and Gmail connectors.
- Updated error handling in indexing tasks to log specific authentication errors and provide clearer feedback to users.
- Enhanced the connector UI to handle indexing failures more effectively, improving overall user experience.
2026-01-23 18:57:10 +05:30
Manoj Aggarwal
49d51ba569 merge 2026-01-22 20:57:48 -08:00
DESKTOP-RTLN3BA\$punk
8b81507739 refactor: remove unused COMPOSIO_CONNECTOR migration and linting 2026-01-22 16:43:08 -08:00
Manoj Aggarwal
4b60a2b805 nit 2026-01-22 13:01:10 -08:00
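
Note on the date-range behavior described in commits 4f7ed8439f and d20bb385b5: those messages say the indexers use last_indexed_at → now (or 365 days for a first run), normalize "undefined" date strings to None, and keep start_date strictly before end_date. The following is only a minimal sketch of that logic. The names normalize_date_param and sketch_date_range are hypothetical and are not the repository's calculate_date_range(); the one-day adjustment of end_date is an assumption, since the log does not say how end_date is adjusted.

```python
from datetime import date, timedelta
from typing import Optional, Tuple

# Taken from the commit message: "365 days for first run".
FIRST_RUN_LOOKBACK_DAYS = 365


def normalize_date_param(value: Optional[str]) -> Optional[str]:
    """Treat None, empty strings, and the literal "undefined" as missing."""
    if value is None or value.strip().lower() in {"", "undefined"}:
        return None
    return value


def sketch_date_range(
    last_indexed_at: Optional[date],
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    today: Optional[date] = None,
) -> Tuple[date, date]:
    """Hypothetical helper: pick a (start, end) window for an indexing run."""
    today = today or date.today()
    start_s = normalize_date_param(start_date)
    end_s = normalize_date_param(end_date)

    if start_s is not None:
        # An explicit start date wins over the stored timestamp.
        start = date.fromisoformat(start_s)
    elif last_indexed_at is not None:
        # Incremental run: resume from the last successful index.
        start = last_indexed_at
    else:
        # First run: look back a fixed number of days.
        start = today - timedelta(days=FIRST_RUN_LOOKBACK_DAYS)

    end = date.fromisoformat(end_s) if end_s is not None else today

    # Assumption: when start is not strictly before end, push end out one day.
    if start >= end:
        end = start + timedelta(days=1)
    return start, end
```

For example, calling sketch_date_range(None, "2026-01-28", "2026-01-28") yields a window from 2026-01-28 to 2026-01-29, which is one way a same-day query can still cover a non-empty range.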