Commit graph

41 commits

Author SHA1 Message Date
DESKTOP-RTLN3BA\$punk
4bee367d4a feat: added AI file sorting 2026-04-14 01:43:30 -07:00
Anish Sarkar
a8b83dcf3f feat: add folder_id support in ConnectorDocument and indexing pipeline for improved document organization 2026-04-08 17:48:50 +05:30
Anish Sarkar
76c760b8dd fix: improve the notification content and improve tooltip 2026-04-07 23:00:52 +05:30
Anish Sarkar
000c2d9b5b style: simplify LLM model terminology in UI 2026-04-02 10:11:35 +05:30
DESKTOP-RTLN3BA\$punk
2cc2d339e6 feat: made agent file system optimized 2026-03-28 16:39:46 -07:00
Anish Sarkar
bd6e335cb3 feat: enhance performance logging in indexing pipeline
- Added performance logging to the `index_batch_parallel` method, capturing metrics for document indexing duration and concurrency.
- Introduced timing measurements for both the overall indexing process and the parallel document gathering phase, improving observability of the indexing workflow.
- Updated logging statements to provide detailed insights into the number of documents processed, indexed, and failed during the indexing operation.
2026-03-26 23:10:49 +05:30
Anish Sarkar
4fd776e7ef feat: implement parallel indexing for Google Calendar and Gmail connectors
- Refactored Google Calendar and Gmail indexers to utilize the new `index_batch_parallel` method for concurrent document indexing, enhancing performance.
- Updated the indexing logic to replace serial processing with parallel execution, allowing for improved efficiency in handling multiple documents.
- Adjusted logging and error handling to accommodate the new parallel processing approach, ensuring robust operation during indexing.
- Enhanced unit tests to validate the functionality of the parallel indexing method and its integration with existing workflows.
2026-03-26 19:34:04 +05:30
Anish Sarkar
e5cb6bfacf feat: implement parallel document indexing in IndexingPipelineService
- Added `index_batch_parallel` method to enable concurrent indexing of documents with bounded concurrency, improving performance and efficiency.
- Refactored existing indexing logic to utilize `asyncio.to_thread` for non-blocking execution of embedding and chunking functions.
- Introduced unit tests to validate the functionality of the new parallel indexing method, ensuring robustness and error handling during document processing.
2026-03-26 19:33:49 +05:30
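The bounded-concurrency approach described in commit e5cb6bfacf can be sketched roughly as follows. This is a minimal illustration, not the project's code: only the name `index_batch_parallel` and the use of `asyncio.to_thread` come from the commit message. The signature, the `max_concurrency` parameter, the semaphore, and the stand-in `embed_document` function are all assumptions.

```python
import asyncio
import time


def embed_document(doc: str) -> str:
    """Hypothetical stand-in for blocking chunking/embedding work."""
    if not doc:
        raise ValueError("empty document")
    time.sleep(0.01)  # simulate blocking CPU or I/O work
    return doc.upper()


async def index_batch_parallel(docs: list[str], max_concurrency: int = 4):
    """Index documents concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def index_one(doc: str) -> str:
        async with semaphore:
            # Run the blocking function in a worker thread so the
            # event loop stays free to schedule other documents.
            return await asyncio.to_thread(embed_document, doc)

    # return_exceptions=True keeps one failed document from
    # aborting the rest of the batch.
    results = await asyncio.gather(
        *(index_one(d) for d in docs), return_exceptions=True
    )
    indexed = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return indexed, failed


indexed, failed = asyncio.run(
    index_batch_parallel(["a", "", "b"], max_concurrency=2)
)
```

Collecting failures separately, rather than raising on the first one, is one way to report counts of processed, indexed, and failed documents per batch.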
Anish Sarkar
f7b52470eb feat: enhance Google connectors indexing with content extraction and document migration
- Added `download_and_extract_content` function to extract content from Google Drive files as markdown.
- Updated Google Drive indexer to utilize the new content extraction method.
- Implemented document migration logic to update legacy Composio document types to their native Google types.
- Introduced identifier hashing for stable document identification.
- Improved file pre-filtering to handle unchanged and rename-only files efficiently.
2026-03-25 18:33:44 +05:30
DESKTOP-RTLN3BA\$punk
d8a05ae4d5 feat: refactor agent tools management and add UI integration
- Added endpoint to list agent tools with metadata, excluding hidden tools.
- Updated NewChatRequest and RegenerateRequest schemas to include disabled tools.
- Integrated disabled tools management in the NewChatPage and Composer components.
- Improved tool instructions and visibility in the system prompt.
- Refactored tool registration to support hidden tools and default enabled states.
- Enhanced document chunk creation to handle strict zip behavior.
- Cleaned up imports and formatting across various files for consistency.
2026-03-10 17:36:26 -07:00
CREDO23
929445afd9 feat: use batch embedding in IndexingPipelineService.index 2026-03-09 16:13:44 +02:00
CREDO23
cb4b155b9d feat: re-export embed_texts from document_embedder 2026-03-09 15:54:02 +02:00
Anish Sarkar
6d00b0debf Merge remote-tracking branch 'upstream/dev' into refactor/upload-document-adapter-class 2026-03-01 22:35:17 +05:30
DESKTOP-RTLN3BA\$punk
0e723a5b8b feat: perf optimizations
- improved search_knowledgebase_tool
- Added new endpoint to batch-fetch comments for multiple messages, reducing the number of API calls.
- Introduced CommentBatchRequest and CommentBatchResponse schemas for handling batch requests and responses.
- Updated chat_comments_service to validate message existence and permissions before fetching comments.
- Enhanced frontend with useBatchCommentsPreload hook to optimize comment loading for assistant messages.
2026-02-27 17:19:25 -08:00
DESKTOP-RTLN3BA\$punk
664c43ca13 feat: add performance logging middleware and enhance performance tracking across services
- Introduced RequestPerfMiddleware to log request performance metrics, including slow request thresholds.
- Updated various services and retrievers to utilize the new performance logging utility for better tracking of execution times.
- Enhanced existing methods with detailed performance logs for operations such as embedding, searching, and indexing.
- Removed deprecated logging setup in stream_new_chat and replaced it with the new performance logger.
2026-02-27 16:32:30 -08:00
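A request-timing middleware with a slow-request threshold, as commit 664c43ca13 describes, might look something like the sketch below. It is written as a plain ASGI middleware using only the standard library. The class name `SlowRequestLogger`, the `slow_threshold_s` parameter, and the `slow_paths` list are hypothetical and do not reflect the actual `RequestPerfMiddleware` implementation.

```python
import asyncio
import logging
import time

logger = logging.getLogger("perf")


class SlowRequestLogger:
    """Minimal ASGI middleware that times each HTTP request (illustrative)."""

    def __init__(self, app, slow_threshold_s: float = 1.0):
        self.app = app
        self.slow_threshold_s = slow_threshold_s
        self.slow_paths: list[str] = []  # kept only to make the sketch inspectable

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass non-HTTP traffic (lifespan, websocket) through untimed.
            await self.app(scope, receive, send)
            return
        start = time.perf_counter()
        try:
            await self.app(scope, receive, send)
        finally:
            elapsed = time.perf_counter() - start
            if elapsed >= self.slow_threshold_s:
                self.slow_paths.append(scope["path"])
                logger.warning(
                    "slow request %s took %.3fs", scope["path"], elapsed
                )


async def demo_app(scope, receive, send):
    """Toy ASGI app: /slow sleeps, every other path returns immediately."""
    if scope["path"] == "/slow":
        await asyncio.sleep(0.05)


async def _noop(*args):
    """Placeholder receive/send callable for the demo."""
    return None


async def run_demo():
    middleware = SlowRequestLogger(demo_app, slow_threshold_s=0.02)
    for path in ("/fast", "/slow"):
        await middleware({"type": "http", "path": path}, _noop, _noop)
    return middleware.slow_paths


slow_paths = asyncio.run(run_demo())
```

Measuring in a `finally` block means requests that raise are still timed, which is usually what one wants when hunting for blocking calls.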
Anish Sarkar
23a98d802c refactor: implement UploadDocumentAdapter for file indexing and reindexing 2026-02-28 01:38:32 +05:30
DESKTOP-RTLN3BA\$punk
e9892c8fe9 feat: added configurable summary calculation and various improvements
- Replaced direct embedding calls with a utility function across various components to streamline embedding logic.
- Added enable_summary flag to several models and routes to control summary generation behavior.
2026-02-26 18:24:57 -08:00
DESKTOP-RTLN3BA\$punk
23f553ef84 Merge branch 'dev' of https://github.com/MODSetter/SurfSense into dev 2026-02-26 13:01:24 -08:00
DESKTOP-RTLN3BA\$punk
aabc24f82c feat: enhance performance logging and caching in various components
- Introduced slow callback logging in FastAPI to identify blocking calls.
- Added performance logging for agent creation and tool loading processes.
- Implemented caching for MCP tools to reduce redundant server calls.
- Enhanced sandbox management with in-process caching for improved efficiency.
- Refactored several functions for better readability and performance tracking.
- Updated tests to ensure proper functionality of new features and optimizations.
2026-02-26 13:00:31 -08:00
Anish Sarkar
9ccee054a5 chore: ran linting 2026-02-26 03:05:20 +05:30
CREDO23
c50d661d7d fix wrong status key in adapter error reporting 2026-02-25 21:00:55 +02:00
CREDO23
d0fdd3224a fix metadata keys casing and set content_needs_reindexing in adapter 2026-02-25 20:39:18 +02:00
CREDO23
cad400be1b add file upload adapter and make index() return refreshed document 2026-02-25 19:56:59 +02:00
CREDO23
86ecb82c6e fix: tighten indexing pipeline exception handling and logging 2026-02-25 17:44:35 +02:00
CREDO23
5be58b78ad simplify indexing pipeline DB error handling 2026-02-25 16:59:09 +02:00
CREDO23
66d7d3da8a fix bugs in indexing pipeline exception handling 2026-02-25 16:27:12 +02:00
CREDO23
b6c25628c8 add structured logging to indexing pipeline 2026-02-25 16:04:35 +02:00
CREDO23
610080bfef extract persistence helpers into document_persistence.py 2026-02-25 15:30:25 +02:00
CREDO23
0aeb888be0 add structured error handling to indexing pipeline 2026-02-25 15:26:04 +02:00
CREDO23
ca870cf660 add fallback document summary 2026-02-25 13:47:36 +02:00
CREDO23
36d1fba75f fix: isolate per-document errors in prepare_for_indexing 2026-02-25 13:00:34 +02:00
CREDO23
e6b7ce7345 fix: handle IntegrityError in prepare_for_indexing and add within-batch content dedup test 2026-02-25 12:03:00 +02:00
CREDO23
c5ae62140d fix: rescue stuck documents with unchanged content on next indexing run 2026-02-25 11:13:25 +02:00
CREDO23
0363cb9c17 fix: updated_at on title change, LLM fallback, stale chunks deleted on re-index 2026-02-25 08:40:13 +02:00
CREDO23
af22fa7c88 refactor: remove redundant and low-value tests, enforce connector_id and created_by_id constraints 2026-02-25 08:29:53 +02:00
CREDO23
5b616eac5a fix: plug all gaps found in deep review of indexing pipeline 2026-02-25 02:20:44 +02:00
CREDO23
61e50834e6 feat: implement and test index method 2026-02-25 01:40:30 +02:00
CREDO23
497ed681d5 feat: implement and test index happy path 2026-02-25 00:30:11 +02:00
CREDO23
579a9e2cb5 feat: implement and test prepare_for_indexing 2026-02-25 00:06:34 +02:00
CREDO23
a0134a5830 test: add document hashing unit tests and clean up conftest mocks 2026-02-24 22:48:40 +02:00
CREDO23
d5e10bd8f9 test: add ConnectorDocument unit tests and factory fixture 2026-02-24 22:20:08 +02:00