Commit graph

51 commits

Author SHA1 Message Date
DESKTOP-RTLN3BA\$punk
2832d57bda chore: linting 2026-01-01 22:56:37 -08:00
CREDO23
9c78726b6b feat: add file selection to Google Drive connector
- Add structured request body with folders and files arrays
- Support individual file indexing alongside folder indexing
- Remove deprecated folder_ids/folder_names query params
- Update UI to allow selecting both folders and files
2025-12-31 14:15:07 +02:00
DESKTOP-RTLN3BA\$punk
c19d300c9d feat: added circleback connector 2025-12-30 09:00:59 -08:00
CREDO23
7618662e70 refactor: rename GOOGLE_DRIVE_CONNECTOR to GOOGLE_DRIVE_FILE document type 2025-12-29 20:38:26 +02:00
CREDO23
acf47e3b0c refactor(connectors): remove verbose docstrings and obvious comments
- Simplify module docstrings (remove meta-commentary about 'small focused modules')
- Remove redundant inline comments (e.g., 'Log task start', 'Get connector from database')
- Trim verbose function docstrings to essential information only
- Remove over-explanatory comments that restate what code does
- Keep necessary documentation, remove noise for better readability
2025-12-28 18:53:13 +02:00
CREDO23
506a9297a9 fix(connectors): track delta sync tokens per folder for Google Drive
- Store tokens in folder_tokens dict instead of single global token
- Each folder now tracks its own sync state independently
- Fixes issue where indexing folder 2 incorrectly used delta sync after folder 1 was indexed
- First-time indexing now correctly uses full scan for each new folder
2025-12-28 18:32:59 +02:00
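Note on 506a9297a9: the per-folder token tracking described above can be sketched roughly as follows. This is a minimal illustration, not the connector's actual code; only the name `folder_tokens` comes from the commit message, while `choose_sync_mode` and `record_sync` are hypothetical helpers.

```python
from typing import Dict


def choose_sync_mode(folder_tokens: Dict[str, str], folder_id: str) -> str:
    """Return 'delta' only when this specific folder already has a stored token.

    A folder that has never been indexed has no entry in folder_tokens,
    so it falls back to a full scan even if other folders were synced before.
    """
    return "delta" if folder_id in folder_tokens else "full"


def record_sync(
    folder_tokens: Dict[str, str], folder_id: str, new_token: str
) -> Dict[str, str]:
    """Return a copy of folder_tokens with the token for folder_id updated."""
    updated = dict(folder_tokens)
    updated[folder_id] = new_token
    return updated


# Hypothetical usage: folder-1 is indexed first, folder-2 is new.
tokens: Dict[str, str] = {}
tokens = record_sync(tokens, "folder-1", "token-a")
mode_folder_1 = choose_sync_mode(tokens, "folder-1")  # delta sync
mode_folder_2 = choose_sync_mode(tokens, "folder-2")  # full scan
```

With a single global token, the second lookup would have reported a delta sync for a folder that was never scanned, which appears to be the bug this commit addresses.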
CREDO23
a5935bc677 feat(connectors): add connector parameter to file processor for source tracking
- Add optional 'connector' parameter with 'type' and 'metadata' fields
- Create helper function _update_document_from_connector
- Use document_metadata column (not metadata) for JSON field
- Merge metadata with existing using dict spread operator
- Google Drive documents now marked as GOOGLE_DRIVE_CONNECTOR
- Backward compatible - no changes to existing logic
- Simple and clean implementation
2025-12-28 18:01:39 +02:00
CREDO23
8da58be9e0 fix(connectors): refresh document from DB before updating type
- Query document from database to ensure it's attached to session
- Prevents detached instance errors after process_file_in_background commits
- Properly updates document_type and metadata with session management
2025-12-28 17:21:44 +02:00
CREDO23
b2b891e4d7 fix(connectors): properly commit Google Drive document type changes
- Return file metadata from content_extractor for indexer to use
- Update document type and metadata in indexer after processing
- Explicitly commit changes to database
- Ensures documents are properly marked as GOOGLE_DRIVE_CONNECTOR type
2025-12-28 17:15:29 +02:00
CREDO23
7b8900d51f feat(indexer): export Google Drive indexer function 2025-12-28 15:55:46 +02:00
CREDO23
1696c7056a feat(indexer): add Google Drive folder indexing with delta sync
- Full folder scan on first index
- Delta sync using change tracking for subsequent indexes
- Process files in parallel batches
- Handle file additions, modifications, and deletions
- Store change tracking token for efficient re-indexing
2025-12-28 15:55:25 +02:00
CREDO23
c6cb754aac refactor: update the webcrawler index to compare hashes without metadata 2025-12-17 18:44:58 +02:00
DESKTOP-RTLN3BA\$punk
8c9aa68faa feat: update document tracking to use 'updated_at' timestamp instead of 'last_edited_at' 2025-12-12 01:32:14 -08:00
Differ
500bc60d02 fix: add input validation, retry limit, code formatting, and exclude i18n from secret detection 2025-12-05 09:58:49 +08:00
Differ
6b1b8d0f2e feat: add BookStack connector for wiki documentation indexing 2025-12-04 14:08:44 +08:00
DESKTOP-RTLN3BA\$punk
ab6ea7e0ab feat(UI): reorganized connectors 2025-11-26 13:44:38 -08:00
DESKTOP-RTLN3BA\$punk
8f30cfd69a chore(lint): ruff checks 2025-11-26 13:22:31 -08:00
samkul-swe
121e2f0c0e Renaming resources 2025-11-22 19:19:00 -08:00
samkul-swe
896e410e2a Webcrawler connector draft 2025-11-21 23:27:21 -08:00
DESKTOP-RTLN3BA\$punk
a3a5b13f48 chore: linting 2025-11-03 16:00:58 -08:00
DESKTOP-RTLN3BA\$punk
e65d74f2e2 refactor: added batch commits and Increased task time limits in celery_app.py
- Increased task time limits in celery_app.py for longer processing times.
- Enhanced pagination logic in NotionHistoryConnector to handle large result sets.
- Implemented batch commits every 10 documents across various indexers (Airtable, ClickUp, Confluence, Discord, GitHub, Google Calendar, Gmail, JIRA, Linear, Luma, Notion, Slack) to improve performance and reduce database load.
- Updated final commit logging for clarity on total documents processed.
2025-11-03 15:57:19 -08:00
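Note on e65d74f2e2: the "batch commits every 10 documents" pattern might look something like the sketch below. The batch size of 10 is taken from the commit message; the function name and the `add`/`commit` callbacks are assumptions standing in for whatever database session the indexers actually use.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")

BATCH_SIZE = 10  # value stated in the commit message


def index_with_batch_commits(
    documents: Iterable[T],
    add: Callable[[T], None],
    commit: Callable[[], None],
    batch_size: int = BATCH_SIZE,
) -> int:
    """Add documents one by one, committing after every batch_size items.

    A final commit flushes any remainder that did not fill a whole batch.
    Returns the total number of documents processed.
    """
    processed = 0
    for doc in documents:
        add(doc)
        processed += 1
        if processed % batch_size == 0:
            commit()
    if processed % batch_size != 0:
        commit()  # flush the last partial batch
    return processed
```

Committing in batches rather than once at the end limits how much work is lost if a long-running task is interrupted, while avoiding a database round trip per document.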
DESKTOP-RTLN3BA\$punk
0e6669ac4e fix: celery_app path and gmail indexing 2025-10-21 21:11:41 -07:00
DESKTOP-RTLN3BA\$punk
5b957ec21c feat: bumped version to v0.0.8 2025-10-16 22:44:12 -07:00
Anish Sarkar
bbb2abfc02 fix: ran formatter as per coderabbitai 2025-10-17 02:44:44 +05:30
Anish Sarkar
0ff1b586a2 feat: update Elasticsearch integration and logging
- revised Elasticsearch connector enum revision IDs
- added `TaskLoggingService` to elasticsearch_indexer
- integrated Elasticsearch into prompts.py as requested
2025-10-17 02:21:56 +05:30
Anish Sarkar
82438c7396 refactor: streamline Elasticsearch indexing by removing unused services and integrating document chunking, also added documentation 2025-10-16 17:48:28 +05:30
Anish Sarkar
929035f802 Merge remote-tracking branch 'upstream/main' into feature/elasticsearch-connector 2025-10-16 16:24:37 +05:30
DESKTOP-RTLN3BA\$punk
c99cd710ea feat: add unique identifier hash for documents to prevent duplicates across various connectors 2025-10-14 21:11:19 -07:00
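Note on c99cd710ea: one common way to build such a deduplication key is to hash a few stable identifying fields. The sketch below is only an illustration of that idea; the choice of fields (`document_type`, `source_id`, `search_space_id`), the function name, and the use of SHA-256 are all assumptions, since the commit message does not say how the hash is computed.

```python
import hashlib


def unique_identifier_hash(
    document_type: str, source_id: str, search_space_id: int
) -> str:
    """Build a deterministic hash from fields that identify a source item.

    The same item re-indexed later yields the same hash, so a uniqueness
    check on this value can detect duplicates before inserting.
    """
    key = f"{document_type}:{source_id}:{search_space_id}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


# Hypothetical usage: the same Slack message indexed twice maps to one key.
first = unique_identifier_hash("SLACK_CONNECTOR", "msg-123", 1)
second = unique_identifier_hash("SLACK_CONNECTOR", "msg-123", 1)
other = unique_identifier_hash("NOTION_CONNECTOR", "msg-123", 1)
```

Including the connector type in the key keeps identical source IDs from different connectors from colliding.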
DESKTOP-RTLN3BA\$punk
31982cea9a chore: removed content truncation for better UI 2025-10-14 14:19:48 -07:00
Anish Sarkar
72e8d98f40 feat: enhance Elasticsearch connector to handle missing index configuration 2025-10-12 10:10:19 +05:30
Anish Sarkar
55d752e3c8 feat: added elasticsearch connector 2025-10-12 09:39:04 +05:30
DESKTOP-RTLN3BA\$punk
633ea3ac0f feat: moved LLMConfigs from User to SearchSpaces
- RBAC soon??
- Updated various services and routes to handle search space-specific LLM preferences.
- Modified frontend components to pass search space ID for LLM configuration management.
- Removed onboarding page and settings page as part of the refactor.
2025-10-10 00:50:29 -07:00
DESKTOP-RTLN3BA\$punk
aea09a5dad feat: Moved searchconnectors association from user to searchspace
- Need to move llm configs to searchspace
2025-10-08 21:13:01 -07:00
DESKTOP-RTLN3BA\$punk
94367e4226 chore: linting and formatting 2025-09-28 22:26:26 -07:00
Rohan Verma
ef361e16b4
Merge pull request #337 from samkul-swe/feature/add-luma-connector
[Feature] Add Luma connector
2025-09-28 22:14:15 -07:00
samkul-swe
9d2b808e66 Added Luma connector 2025-09-28 14:59:10 -07:00
CREDO23
8f9f66b7f8 handle token refreshing when expired 2025-09-21 21:14:03 +02:00
Rohan Verma
662212d4e2
Merge pull request #295 from CREDO23/feature/airtable-connector
[Feature] Add Airtable connector
2025-09-03 12:49:14 -07:00
Rohan Verma
c2030cec48
Merge pull request #275 from CREDO23/improvement/persist-refreshed-token-in-google-related-connector
[Improvement] Google connectors | Update the connector config after refreshing the token
2025-08-26 18:47:36 -07:00
CREDO23
45d2c18c16 update airtable indexer 2025-08-26 19:17:46 +02:00
CREDO23
55d0cc4d0d Add Airtable indexer 2025-08-26 15:42:42 +02:00
CREDO23
ecbb1f27e0 clean up 2025-08-26 11:53:27 +02:00
CREDO23
85664f2ff8 update the connector config after refreshing google calendar access token 2025-08-26 11:49:31 +02:00
DESKTOP-RTLN3BA\$punk
3b87ecc3c5 fix: made notion indexing async 2025-08-21 14:43:04 -07:00
DESKTOP-RTLN3BA\$punk
f443a6636f fix: slack indexing
- Index individual messages as Documents instead of concatenating them.
2025-08-21 14:23:52 -07:00
CREDO23
9711af2b72 refresh the token when expired 2025-08-21 01:09:13 +02:00
CREDO23
b0b6df0971 updated the connector config after refreshing the token 2025-08-20 20:32:08 +02:00
DESKTOP-RTLN3BA\$punk
1c4c61eb04 feat: Fixed Document Summary Content across connectors and processors 2025-08-18 20:51:48 -07:00
CREDO23
089c9d1625 use new indexer files structure 2025-08-15 10:11:50 +02:00
DESKTOP-RTLN3BA\$punk
54374bd7be ruff format 2025-08-12 15:33:17 -07:00