Commit graph

138 commits

Author SHA1 Message Date
DESKTOP-RTLN3BA\$punk
4a51ccdc2c cloud: added openrouter integration with global configs 2026-04-15 23:46:29 -07:00
CREDO23
7e90a8ed3c Route uploaded images to vision LLM with document-parser fallback 2026-04-09 14:33:33 +02:00
Anish Sarkar
20fa93f0ba refactor: make Azure Document Intelligence an internal LLAMACLOUD accelerator instead of a standalone ETL service 2026-04-08 03:26:24 +05:30
Anish Sarkar
1fa8d1220b feat: add support for Azure Document Intelligence in ETL pipeline 2026-04-08 00:59:12 +05:30
Anish Sarkar
0a26a6c5bb chore: ran linting 2026-04-07 05:55:39 +05:30
Anish Sarkar
e7beeb2a36 refactor: unify file skipping logic across Dropbox, Google Drive, and OneDrive connectors by replacing classification checks with a centralized service-based approach, enhancing maintainability and consistency in file handling 2026-04-07 02:19:31 +05:30
Anish Sarkar
0fb92b7c56 refactor: streamline file skipping logic in Dropbox indexer by removing redundant checks, improving code clarity 2026-04-06 22:17:50 +05:30
Anish Sarkar
63a75052ca Merge remote-tracking branch 'upstream/dev' into feat/unified-etl-pipeline 2026-04-06 22:04:51 +05:30
Anish Sarkar
dc7047f64d refactor: implement file type classification for supported extensions across Dropbox, Google Drive, and OneDrive connectors, enhancing file handling and error management 2026-04-06 22:03:47 +05:30
Anish Sarkar
e814540727 refactor: move PKCE pair generation for Airtable
- Removed the `generate_pkce_pair` function from `airtable_add_connector_route.py` and relocated it to `oauth_security.py` for better organization.
- Updated imports in `airtable_add_connector_route.py` to reflect the new location of the PKCE generation function.
2026-04-04 03:36:54 +05:30
Anish Sarkar
8e6b1c77ea feat: implement PKCE support in native Google OAuth flows
- Added `generate_code_verifier` function to create a PKCE code verifier for enhanced security.
- Updated Google Calendar, Drive, and Gmail connector routes to utilize the PKCE code verifier during OAuth authorization.
- Modified state management to include the code verifier for secure state generation and validation.
2026-04-04 03:35:34 +05:30
Anish Sarkar
746c730b2e chore: ran linting 2026-04-03 13:14:40 +05:30
Anish Sarkar
96a58d0d30 feat: implement local folder indexing and document versioning capabilities 2026-04-02 11:11:57 +05:30
Anish Sarkar
0d5b902c26 feat: extend Dropbox support in chat event streaming and connector naming for enhanced integration 2026-03-30 23:07:25 +05:30
Anish Sarkar
5bddde60cb feat: implement Microsoft OneDrive connector with OAuth support and indexing capabilities 2026-03-28 14:31:25 +05:30
Anish Sarkar
489e48644f fix: revert native excel parsing 2026-03-27 22:15:24 +05:30
Anish Sarkar
3da0ffd683 feat: add native Excel parsing and improve Google Drive content extraction
- Introduced a new utility for parsing .xlsx files into markdown format, enhancing the ability to process Excel documents natively.
- Updated the Google Drive content extractor to utilize the new Excel parsing functionality, allowing for better handling of spreadsheet files.
- Enhanced file type detection and export logic to support various document formats, improving overall content extraction accuracy.
- Added unit tests to ensure the correctness of the new Excel parsing feature and its integration with existing content extraction workflows.
2026-03-27 21:47:14 +05:30
Anish Sarkar
683a4c17dd feat: implement thread-safe embedding access in document converters
- Added a reentrant lock to ensure thread-safe access to the tokenizer and embedding model, preventing runtime errors during concurrent operations.
- Updated the `truncate_for_embedding` and `embed_text` functions to utilize the lock, ensuring safe execution in multi-threaded environments.
- Enhanced the `embed_texts` function to maintain thread safety while processing multiple texts for embedding.
2026-03-27 11:31:00 +05:30
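Note on 683a4c17dd: a rough sketch of why a reentrant lock fits this pattern. The function names `truncate_for_embedding`, `embed_text`, and `embed_texts` come from the commit message; their signatures, the `_StubModel` class, and the `max_chars` parameter are stand-ins assumed here for illustration.

```python
import threading

# A reentrant lock lets the same thread acquire it again in nested calls,
# which a plain threading.Lock would deadlock on.
_embedding_lock = threading.RLock()


class _StubModel:
    """Stand-in for the real embedding model (assumption)."""

    def embed(self, text: str) -> list[float]:
        return [float(len(text))]


_model = _StubModel()


def truncate_for_embedding(text: str, max_chars: int = 8) -> str:
    with _embedding_lock:
        return text[:max_chars]


def embed_text(text: str) -> list[float]:
    with _embedding_lock:
        # Nested acquisition: truncate_for_embedding takes the same lock.
        return _model.embed(truncate_for_embedding(text))


def embed_texts(texts: list[str]) -> list[list[float]]:
    with _embedding_lock:
        return [embed_text(t) for t in texts]
```

Because `embed_texts` calls `embed_text`, which calls `truncate_for_embedding`, the lock is acquired up to three levels deep by one thread; other threads wait until the outermost holder releases it.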
Anish Sarkar
e37e6d2d18 chore: ran linting 2026-03-21 13:21:19 +05:30
Anish Sarkar
83152e8e7e refactor: unify all 3 google Composio and non-Composio connector types and pipelines keeping same credential adapters 2026-03-19 05:08:21 +05:30
Anish Sarkar
8baba0693d feat: ensure unique connector names for MCP connectors 2026-03-18 16:09:35 +05:30
DESKTOP-RTLN3BA\$punk
d8a05ae4d5 feat: refactor agent tools management and add UI integration
- Added endpoint to list agent tools with metadata, excluding hidden tools.
- Updated NewChatRequest and RegenerateRequest schemas to include disabled tools.
- Integrated disabled tools management in the NewChatPage and Composer components.
- Improved tool instructions and visibility in the system prompt.
- Refactored tool registration to support hidden tools and default enabled states.
- Enhanced document chunk creation to handle strict zip behavior.
- Cleaned up imports and formatting across various files for consistency.
2026-03-10 17:36:26 -07:00
CREDO23
6eabfe2396 perf: conditional batch embedding — batch for API, sequential for local 2026-03-09 19:12:43 +02:00
CREDO23
c4f2e9a3a5 feat: use batch embedding in create_document_chunks 2026-03-09 16:21:14 +02:00
CREDO23
15aeec1fcb feat: add embed_texts batch embedding utility 2026-03-09 15:53:40 +02:00
DESKTOP-RTLN3BA\$punk
ecb0a25cc8 feat: enhance memory management and session handling in database operations
- Introduced a shielded async session context manager to ensure safe session closure during cancellations.
- Updated various database operations to utilize the new shielded session, preventing orphaned connections.
- Added environment variables to optimize glibc memory management, improving overall application performance.
- Implemented a function to trim the native heap, allowing for better memory reclamation on Linux systems.
2026-02-28 23:59:28 -08:00
DESKTOP-RTLN3BA\$punk
d959a6a6c8 feat: optimize document upload process and enhance memory management
- Increased maximum file upload limit from 10 to 50 to improve user experience.
- Implemented batch processing for document uploads to avoid proxy timeouts, splitting files into manageable chunks.
- Enhanced garbage collection in chat streaming functions to prevent memory leaks and improve performance.
- Added memory delta tracking in system snapshots for better monitoring of resource usage.
- Updated LLM router and service configurations to prevent unbounded internal accumulation and improve efficiency.
2026-02-28 17:22:34 -08:00
DESKTOP-RTLN3BA\$punk
664c43ca13 feat: add performance logging middleware and enhance performance tracking across services
- Introduced RequestPerfMiddleware to log request performance metrics, including slow request thresholds.
- Updated various services and retrievers to utilize the new performance logging utility for better tracking of execution times.
- Enhanced existing methods with detailed performance logs for operations such as embedding, searching, and indexing.
- Removed deprecated logging setup in stream_new_chat and replaced it with the new performance logger.
2026-02-27 16:32:30 -08:00
DESKTOP-RTLN3BA\$punk
e9892c8fe9 feat: added configurable summary calculation and various improvements
- Replaced direct embedding calls with a utility function across various components to streamline embedding logic.
- Added enable_summary flag to several models and routes to control summary generation behavior.
2026-02-26 18:24:57 -08:00
Rohan Verma
9aef655566 Merge pull request #825 from CREDO23/sur-169-feat-implement-human-in-the-loop-for-linear-sensitive
[Feat] Add human in the loop for linear sensitive actions
2026-02-19 19:09:50 -08:00
Rohan Verma
bad114734a Merge pull request #821 from AnishSarkar22/fix/ui
feat: introduce platejs and remove blocknote editor
2026-02-19 19:09:35 -08:00
CREDO23
7d1bd1fab4 Implement KB sync after Notion page updates with block ID verification
- Add NotionKBSyncService for immediate KB updates after page changes
- Implement block ID verification to ensure content freshness
- Refactor duplicate block processing logic to shared utils
- Add user-friendly status messages
- Include debug logging for troubleshooting
2026-02-17 20:30:12 +02:00
Anish Sarkar
09c5f5bd0d refactor: update _render_block function to use a parameter for numbered list counter, improving state management 2026-02-17 12:59:47 +05:30
Anish Sarkar
a482cc95de chore: ran linting 2026-02-17 12:47:39 +05:30
Anish Sarkar
49ac09b2cb feat: add support for rendering table cell content in markdown conversion 2026-02-17 12:45:57 +05:30
Anish Sarkar
8b497da130 feat: add source_markdown column to documents and implement migration logic for existing records using a pure-Python BlockNote JSON to Markdown converter 2026-02-17 11:34:11 +05:30
Manoj Aggarwal
22bf9ea718 Fix Obsidian connector 2026-02-16 21:07:08 -08:00
DESKTOP-RTLN3BA\$punk
856df201db feat: add future annotations to content_utils.py
- Imported future annotations to improve type hinting and support forward references in the content utilities module.
2026-02-09 17:15:33 -08:00
DESKTOP-RTLN3BA\$punk
db652116d6 chore: linting 2026-02-09 16:49:11 -08:00
Rohan Verma
3f0c9c35f7 Merge pull request #799 from CREDO23/sur-152-impr-split-private-and-shared-memory
[Feat] Split private vs shared chat memory and add team prompt/attribution
2026-02-09 15:03:54 -08:00
DESKTOP-RTLN3BA\$punk
17b7348f61 feat: fixed and improved search and background task management. 2026-02-09 14:03:56 -08:00
CREDO23
48d442a387 Author labels in shared chats: bootstrap, stream prefix, route display name 2026-02-06 18:09:32 +02:00
DESKTOP-RTLN3BA\$punk
1511c26ef5 feat: add residential proxy configuration for web crawling and YouTube transcript fetching 2026-02-05 20:44:13 -08:00
DESKTOP-RTLN3BA\$punk
19e2857343 feat: added image gen support 2026-02-05 16:43:48 -08:00
CREDO23
233852b681 Switch refresh token storage from cookies to localStorage 2026-02-05 17:55:21 +02:00
CREDO23
f3a9922eb9 Add refresh token auth routes and utilities 2026-02-05 17:29:50 +02:00
Anish Sarkar
4526b656a4 fix: update default date range for Google Calendar events and improve query parameter handling 2026-01-30 19:55:48 +05:30
CREDO23
949ec949f6 style(backend): run ruff format on 10 files 2026-01-28 22:20:02 +02:00
CREDO23
20b8a17254 fix(backend): handle non-string elements in webcrawler URL list
Add isinstance check to prevent AttributeError when INITIAL_URLS list
contains non-string elements (None, int, dict) from malformed config data.
2026-01-28 22:16:58 +02:00
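Note on 20b8a17254: a small sketch of the kind of `isinstance` guard the message describes. The function name `clean_urls` is hypothetical, and the assumption that the original code called a string method such as `.strip()` on each element is inferred from the mention of `AttributeError`.

```python
def clean_urls(initial_urls: list) -> list[str]:
    # Skip anything that is not a string before calling string methods on it;
    # None, ints, or dicts from malformed config would otherwise raise
    # AttributeError on .strip().
    return [u.strip() for u in initial_urls if isinstance(u, str) and u.strip()]
```

For example, a list containing a URL with trailing whitespace, `None`, an integer, a dict, and an empty string reduces to just the trimmed URL.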
CREDO23
b20fbaca4b fix: skip webcrawler indexing gracefully when no URLs configured 2026-01-28 17:54:46 +02:00