- Introduced a mechanism to identify degenerate queries that lack meaningful search signals, improving search accuracy.
- Implemented a fallback method for browsing recent documents when queries are degenerate, ensuring relevant results are returned.
- Added limits on the number of chunks fetched per document to optimize performance and prevent excessive data loading.
- Updated the ConnectorService to allow for reusable query embeddings, enhancing efficiency in search operations.
- Enhanced the LLM router service to support context window fallbacks, improving robustness when a request exceeds a model's context window.
- Improved search_knowledgebase_tool.
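
The degenerate-query check and recent-documents fallback described above could look roughly like the following. This is a minimal sketch, not the code from this change: the names `is_degenerate_query`, `choose_strategy`, and the `min_signal_chars` threshold are hypothetical, and the real heuristic may differ.

```python
import re

# Hypothetical helper names; the actual implementation may differ.
_WORD_RE = re.compile(r"\w+", re.UNICODE)


def is_degenerate_query(query: str, min_signal_chars: int = 2) -> bool:
    """Return True when the query carries no usable search signal.

    A query counts as degenerate here if, after dropping punctuation and
    whitespace, fewer than `min_signal_chars` word characters remain
    (for example "", "   ", or "*").
    """
    signal = "".join(_WORD_RE.findall(query or ""))
    return len(signal) < min_signal_chars


def choose_strategy(query: str) -> str:
    """Pick a retrieval strategy: browse recent documents or run a search."""
    return "browse_recent" if is_degenerate_query(query) else "hybrid_search"
```

With this sketch, `choose_strategy("*")` selects the recent-documents fallback, while `choose_strategy("quarterly report")` runs a normal search.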
- Added new endpoint to batch-fetch comments for multiple messages, reducing the number of API calls.
- Introduced CommentBatchRequest and CommentBatchResponse schemas for handling batch requests and responses.
- Updated chat_comments_service to validate message existence and permissions before fetching comments.
- Enhanced frontend with useBatchCommentsPreload hook to optimize comment loading for assistant messages.
- Introduced RequestPerfMiddleware to log request performance metrics, including slow request thresholds.
- Updated various services and retrievers to utilize the new performance logging utility for better tracking of execution times.
- Enhanced existing methods with detailed performance logs for operations such as embedding, searching, and indexing.
- Removed deprecated logging setup in stream_new_chat and replaced it with the new performance logger.
- Added functionality to dynamically discover available connectors and document types for the knowledge base tool, enhancing its flexibility and usability.
- Introduced new mapping functions and updated existing search methods to accommodate Composio connectors, improving integration with external services.
- Enhanced error handling and logging for connector discovery processes, ensuring better feedback during failures.
- Modified the document extraction and citation formatting to accommodate a new structure that includes a `chunks` list for each document.
- Enhanced the citation format to reference `chunk_id` instead of `source_id`, ensuring accurate citations in the UI.
- Updated various components, including the connector service and reranker service, to handle the new document format and maintain compatibility with existing functionalities.
- Improved documentation and comments to reflect changes in the data structure and citation requirements.
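
As an illustration of the new document structure, the sketch below renders documents that carry a `chunks` list and labels each chunk by its `chunk_id`. Only `chunks` and `chunk_id` come from the notes above; the `title` and `content` fields, the `[citation:...]` marker syntax, and the function name are assumptions made for the example.

```python
from typing import Any


def format_documents_for_citation(documents: list[dict[str, Any]]) -> str:
    """Render each document with its chunks, tagging every chunk by chunk_id.

    Assumed shape (hypothetical apart from `chunks` and `chunk_id`):
        {"title": str, "chunks": [{"chunk_id": int, "content": str}, ...]}
    """
    lines: list[str] = []
    for doc in documents:
        lines.append(f"<document title={doc['title']!r}>")
        for chunk in doc.get("chunks", []):
            # Citations point at the chunk, not at a document-level source_id.
            lines.append(f"  [citation:{chunk['chunk_id']}] {chunk['content']}")
        lines.append("</document>")
    return "\n".join(lines)
```

Keying citations on `chunk_id` lets the UI resolve a citation to the exact passage rather than to the whole document.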
Fixes typo in directory name and updates all import paths:
- Renamed surfsense_backend/app/retriver/ to surfsense_backend/app/retriever/
- Updated imports in db.py
- Updated imports in connector_service.py
- Introduce granular permissions for documents, chats, podcasts, and logs.
- Update routes to enforce permission checks for creating, reading, updating, and deleting resources.
- Refactor user and search space interactions to align with RBAC model, removing ownership checks in favor of permission validation.
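
The move from ownership checks to permission validation can be sketched as follows. The permission strings, the `Permission` enum, and `check_permission` are hypothetical names chosen for illustration; they are not taken from the codebase.

```python
from enum import Enum


class Permission(str, Enum):
    # Hypothetical permission names, one per resource/action pair.
    DOCUMENTS_CREATE = "documents:create"
    DOCUMENTS_READ = "documents:read"
    DOCUMENTS_UPDATE = "documents:update"
    DOCUMENTS_DELETE = "documents:delete"


class PermissionDenied(Exception):
    """Raised when the caller's role lacks the required permission."""


def check_permission(granted: set[str], required: Permission) -> None:
    """Raise PermissionDenied unless `required` is in the granted set.

    `granted` stands in for the permissions attached to the user's role in
    a search space; no ownership comparison is performed.
    """
    if required.value not in granted:
        raise PermissionDenied(f"missing permission: {required.value}")
```

A route handler would call `check_permission` before touching the resource, so a user with only `documents:read` can list documents but not delete them.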
- Add BAIDU_SEARCH_API connector type to support Chinese web search
- Implement search_baidu() method in connector_service.py
- Add frontend configuration page for Baidu Search API
- Create Alembic migration for new enum values
- Add validation rules and agent integration
- Support configurable model, search source, and deep search options
- Update .gitignore to exclude .env.local and other env files
Addresses integration with the Chinese search ecosystem for better local market support.
Baidu AI Search provides intelligent search with automatic summarization.