- Replaced background styles with `bg-main-panel` in various components for consistent theming.
- Enhanced the `Header`, `LayoutShell`, and `Thread` components to improve visual coherence.
- Adjusted tool management UI to reflect new design standards, ensuring a unified user experience.
Wrap synchronous embedding_model.embed() calls with asyncio.to_thread()
in both vector_search and hybrid_search methods. This prevents blocking
the asyncio event loop during embedding computation, improving server
responsiveness under concurrent load.
Fixes#794
Signed-off-by: Tim Ren <137012659+xr843@users.noreply.github.com>
- modify the `web_search` tool that integrates multiple live search connectors (Tavily, Linkup, Baidu) alongside the SearXNG platform for real-time information retrieval.
- Updated comments to reflect the separation of local knowledge base searches from web searches.
- Improved handling of connector types and search logic to ensure accurate routing of queries to the appropriate tools.
- Refactored existing code to streamline the search process and enhance performance.
- Deleted the migration script for adding web search columns to the database.
- Removed web search related fields from the SearchSpace model and schemas.
- Eliminated web search health check endpoint and associated service logic.
- Updated frontend components to remove web search settings management.
- Cleaned up references to SearXNG API in various modules, transitioning to a platform service approach.
- Consolidated volume mappings for SearXNG to use a single directory.
- Removed unnecessary port mappings and legacy data volume definitions.
- Updated web search service documentation to clarify Redis usage and circuit breaker implementation, eliminating Redis dependency for circuit breaker logic.
- Added SearXNG service configuration to Docker setup, including environment variables and health checks.
- Introduced new settings management for web search in the frontend, allowing users to enable/disable and configure search engines and language preferences.
- Updated backend to support web search functionality, including database schema changes and service integration.
- Implemented health check endpoint for the web search service and integrated it into the application.
- Removed legacy SearXNG API connector references in favor of the new platform service approach.
Add full MiniMax provider support across the entire stack:
Backend:
- Add MINIMAX to LiteLLMProvider enum in db.py
- Add MINIMAX mapping to all provider_map dicts in llm_service.py,
llm_router_service.py, and llm_config.py
- Add Alembic migration (rev 106) for PostgreSQL enum
- Add MiniMax M2.5 example in global_llm_config.example.yaml
Frontend:
- Add MiniMax to LLM_PROVIDERS enum with apiBase
- Add MiniMax-M2.5 and MiniMax-M2.5-highspeed to LLM_MODELS
- Add MINIMAX to Zod validation schema
- Add MiniMax SVG icon and wire up in provider-icons
Docs:
- Add MiniMax setup guide in chinese-llm-setup.md
MiniMax uses an OpenAI-compatible API (https://api.minimax.io/v1)
with models supporting up to 204K context window.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Added bulk delete functionality for documents in DocumentsTableShell and DocumentsSidebar.
- Enhanced search space retrieval to exclude spaces marked for deletion in read_search_spaces.
- Updated connector dialog to synchronize URL parameters when opened externally.
- Improved layout behavior to handle search space deletion and redirection more effectively.
- Added `_attach_model_profile` function to attach model context metadata to `ChatLiteLLM`.
- Updated `create_chat_litellm_from_config` and `create_chat_litellm_from_agent_config` to utilize the new profile attachment.
- Improved context profile caching in `llm_router_service.py` to include both minimum and maximum input tokens, along with token model names for better context management.
- Introduced new methods for token counting and context trimming based on model profiles.
- Added endpoint to list agent tools with metadata, excluding hidden tools.
- Updated NewChatRequest and RegenerateRequest schemas to include disabled tools.
- Integrated disabled tools management in the NewChatPage and Composer components.
- Improved tool instructions and visibility in the system prompt.
- Refactored tool registration to support hidden tools and default enabled states.
- Enhanced document chunk creation to handle strict zip behavior.
- Cleaned up imports and formatting across various files for consistency.
- Add check_permission to drive-picker-token endpoint (IDOR fix)
- Use get_composio_service singleton + asyncio.to_thread to avoid blocking the event loop
- Sanitize error detail in 500 response to prevent internal info leakage
- Dispose picker on unmount to prevent orphaned overlay
- Surface error state on Google Picker Action.ERROR instead of silently closing
- Introduced a shielded async session context manager to ensure safe session closure during cancellations.
- Updated various database operations to utilize the new shielded session, preventing orphaned connections.
- Added environment variables to optimize glibc memory management, improving overall application performance.
- Implemented a function to trim the native heap, allowing for better memory reclamation on Linux systems.
- Added proper session closure to prevent connection leaks during streaming.
- Implemented a fresh session for cleanup tasks to ensure data integrity.
- Enhanced error handling during session operations to improve robustness.
- Removed unnecessary session parameters from function signatures for clarity.
- Introduced a mechanism to identify degenerate queries that lack meaningful search signals, improving search accuracy.
- Implemented a fallback method for browsing recent documents when queries are degenerate, ensuring relevant results are returned.
- Added limits on the number of chunks fetched per document to optimize performance and prevent excessive data loading.
- Updated the ConnectorService to allow for reusable query embeddings, enhancing efficiency in search operations.
- Enhanced LLM router service to support context window fallbacks, improving robustness during context window limitations.
- Increased maximum file upload limit from 10 to 50 to improve user experience.
- Implemented batch processing for document uploads to avoid proxy timeouts, splitting files into manageable chunks.
- Enhanced garbage collection in chat streaming functions to prevent memory leaks and improve performance.
- Added memory delta tracking in system snapshots for better monitoring of resource usage.
- Updated LLM router and service configurations to prevent unbounded internal accumulation and improve efficiency.
- Improved in-memory rate limiting by evicting timestamps outside the current window and cleaning up empty keys.
- Updated LLM router service to cache context profiles and avoid redundant computations.
- Introduced cache eviction logic for MCP tools and sandbox instances to manage memory usage effectively.
- Added garbage collection triggers in chat streaming functions to reclaim resources promptly.
- improved search_knowledgebase_tool
- Added new endpoint to batch-fetch comments for multiple messages, reducing the number of API calls.
- Introduced CommentBatchRequest and CommentBatchResponse schemas for handling batch requests and responses.
- Updated chat_comments_service to validate message existence and permissions before fetching comments.
- Enhanced frontend with useBatchCommentsPreload hook to optimize comment loading for assistant messages.
- Introduced RequestPerfMiddleware to log request performance metrics, including slow request thresholds.
- Updated various services and retrievers to utilize the new performance logging utility for better tracking of execution times.
- Enhanced existing methods with detailed performance logs for operations such as embedding, searching, and indexing.
- Removed deprecated logging setup in stream_new_chat and replaced it with the new performance logger.
- Introduced dynamic character budget calculation for document formatting based on model's context window.
- Updated `format_documents_for_context` to respect character limits and improve output quality.
- Added `max_input_tokens` parameter to various functions to facilitate context-aware processing.
- Enhanced error handling for context overflow in LLM router service.