- Added new environment variables for controlling task execution limits, including `SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS`, `SURFSENSE_TASK_BATCH_CONCURRENCY`, and `SURFSENSE_TASK_BATCH_MAX_SIZE`.
- Updated documentation to reflect new batch processing capabilities for `task` calls, allowing for concurrent execution of multiple subagent tasks.
- Improved error handling and receipt generation for deliverables, ensuring consistent feedback on task status.
- Refactored middleware to incorporate search space ID for better task management.
Adds an optional planner LLM role wired through KnowledgePriorityMiddleware
so KB query rewriting, date extraction, and recency classification run on a
cheap model (e.g. gpt-4o-mini, Haiku, Azure nano) instead of the user's
chat LLM. Operators opt in by setting is_planner: true on exactly one
global config; without it, behavior is unchanged.
Add full MiniMax provider support across the entire stack:
Backend:
- Add MINIMAX to LiteLLMProvider enum in db.py
- Add MINIMAX mapping to all provider_map dicts in llm_service.py,
llm_router_service.py, and llm_config.py
- Add Alembic migration (rev 106) for PostgreSQL enum
- Add MiniMax M2.5 example in global_llm_config.example.yaml
Frontend:
- Add MiniMax to LLM_PROVIDERS enum with apiBase
- Add MiniMax-M2.5 and MiniMax-M2.5-highspeed to LLM_MODELS
- Add MINIMAX to Zod validation schema
- Add MiniMax SVG icon and wire up in provider-icons
Docs:
- Add MiniMax setup guide in chinese-llm-setup.md
MiniMax uses an OpenAI-compatible API (https://api.minimax.io/v1)
with models supporting up to 204K context window.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Increased maximum file upload limit from 10 to 50 to improve user experience.
- Implemented batch processing for document uploads to avoid proxy timeouts, splitting files into manageable chunks.
- Enhanced garbage collection in chat streaming functions to prevent memory leaks and improve performance.
- Added memory delta tracking in system snapshots for better monitoring of resource usage.
- Updated LLM router and service configurations to prevent unbounded internal accumulation and improve efficiency.
- Improved in-memory rate limiting by evicting timestamps outside the current window and cleaning up empty keys.
- Updated LLM router service to cache context profiles and avoid redundant computations.
- Introduced cache eviction logic for MCP tools and sandbox instances to manage memory usage effectively.
- Added garbage collection triggers in chat streaming functions to reclaim resources promptly.
-Introduce granular permissions for documents, chats, podcasts, and logs.
- Update routes to enforce permission checks for creating, reading, updating, and deleting resources. - Refactor user and search space interactions to align with RBAC model, removing ownership checks in favor of permission validation.
- Added new LLM providers including Google, Azure OpenAI, Bedrock, and others to the backend.
- Updated the model selection UI to dynamically display available models based on the selected provider.
- Enhanced the provider change handling to reset the model selection when the provider is changed.
- Improved the overall user experience by providing contextual information for model selection.
- Added `validate_llm_config` function to `llm_service.py` for validating LLM configurations via test API calls.
- Integrated validation in `create_llm_config` and `update_llm_config` routes in `llm_config_routes.py`, raising HTTP exceptions for invalid configurations.
- Enhanced error handling to provide detailed feedback on configuration issues.
- Add support for DeepSeek, Qwen (Alibaba), Kimi (Moonshot), and GLM (Zhipu)
- Implement auto-fill API Base URL when selecting Chinese LLM providers
- Add smart validation and warnings for missing API endpoints
- Fix session state management in task logging service
- Add comprehensive Chinese setup documentation
- Add database migration for new LLM provider enums
Closes#383
- RBAC soon??
- Updated various services and routes to handle search space-specific LLM preferences.
- Modified frontend components to pass search space ID for LLM configuration management.
- Removed onboarding page and settings page as part of the refactor.
- Updated import paths for LLM, connector, query, and streaming services to reflect their new location in the 'services' module.
- Removed obsolete utility service files that have been migrated.
2025-07-06 17:51:24 -07:00
Renamed from surfsense_backend/app/utils/llm_service.py (Browse further)