app/agents/shared/ is a sibling of anonymous_chat/podcaster/multi_agent_chat/
video_presentation, so it should only hold code shared across 2+ of those
agents. In practice podcaster and video_presentation import nothing from it,
and anonymous_chat needs only context + compaction + retry_after + web_search.
Everything else was multi_agent_chat-only (the boundary just passes through).
Move the multi_agent_chat-only cluster into multi_agent_chat/shared/ (files
moved verbatim via git rename; ~116 import sites rewritten):
errors, feature_flags, filesystem_selection, path_resolver, prompt_caching,
sandbox, llm_config, mention_resolver
middleware/busy_mutex, middleware/kb_persistence
busy_mutex/llm_config/mention_resolver are boundary-only but import the moved
modules, so they were folded in to avoid a backwards shared -> multi_agent_chat
dependency. main_agent builders now import the impls directly; the shared
middleware barrel keeps only the genuinely-shared compaction + retry_after.
Also delete the dead leftover shared/plugins and shared/skills dirs (live
copies already live under main_agent/).
Remaining in app/agents/shared/: context, system_prompt(+prompts), checkpointer,
middleware/{compaction,retry_after,dedup_tool_calls}, tools/. checkpointer and
system_prompt are boundary-only infra pending a dedicated home decision.
Relocate the mutually-dependent LLM config layer and the LiteLLM prompt-caching
helper to the shared kernel as one unit, rewiring their internal cross-reference
to the shared paths. Flip 21 non-frozen importers. Re-export shims remain at
new_chat/{llm_config,prompt_caching}.py for the frozen single-agent stack
(chat_deepagent); they will be removed when that stack is retired.
- Added new environment variables for controlling task execution limits, including `SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS`, `SURFSENSE_TASK_BATCH_CONCURRENCY`, and `SURFSENSE_TASK_BATCH_MAX_SIZE`.
- Updated documentation to reflect new batch processing capabilities for `task` calls, allowing for concurrent execution of multiple subagent tasks.
- Improved error handling and receipt generation for deliverables, ensuring consistent feedback on task status.
- Refactored middleware to incorporate search space ID for better task management.
Adds an optional planner LLM role wired through KnowledgePriorityMiddleware
so KB query rewriting, date extraction, and recency classification run on a
cheap model (e.g. gpt-4o-mini, Haiku, Azure nano) instead of the user's
chat LLM. Operators opt in by setting is_planner: true on exactly one
global config; without it, behavior is unchanged.
Add full MiniMax provider support across the entire stack:
Backend:
- Add MINIMAX to LiteLLMProvider enum in db.py
- Add MINIMAX mapping to all provider_map dicts in llm_service.py,
llm_router_service.py, and llm_config.py
- Add Alembic migration (rev 106) for PostgreSQL enum
- Add MiniMax M2.5 example in global_llm_config.example.yaml
Frontend:
- Add MiniMax to LLM_PROVIDERS enum with apiBase
- Add MiniMax-M2.5 and MiniMax-M2.5-highspeed to LLM_MODELS
- Add MINIMAX to Zod validation schema
- Add MiniMax SVG icon and wire up in provider-icons
Docs:
- Add MiniMax setup guide in chinese-llm-setup.md
MiniMax uses an OpenAI-compatible API (https://api.minimax.io/v1)
with models supporting up to 204K context window.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Increased maximum file upload limit from 10 to 50 to improve user experience.
- Implemented batch processing for document uploads to avoid proxy timeouts, splitting files into manageable chunks.
- Enhanced garbage collection in chat streaming functions to prevent memory leaks and improve performance.
- Added memory delta tracking in system snapshots for better monitoring of resource usage.
- Updated LLM router and service configurations to prevent unbounded internal accumulation and improve efficiency.
- Improved in-memory rate limiting by evicting timestamps outside the current window and cleaning up empty keys.
- Updated LLM router service to cache context profiles and avoid redundant computations.
- Introduced cache eviction logic for MCP tools and sandbox instances to manage memory usage effectively.
- Added garbage collection triggers in chat streaming functions to reclaim resources promptly.
-Introduce granular permissions for documents, chats, podcasts, and logs.
- Update routes to enforce permission checks for creating, reading, updating, and deleting resources. - Refactor user and search space interactions to align with RBAC model, removing ownership checks in favor of permission validation.
- Added new LLM providers including Google, Azure OpenAI, Bedrock, and others to the backend.
- Updated the model selection UI to dynamically display available models based on the selected provider.
- Enhanced the provider change handling to reset the model selection when the provider is changed.
- Improved the overall user experience by providing contextual information for model selection.
- Added `validate_llm_config` function to `llm_service.py` for validating LLM configurations via test API calls.
- Integrated validation in `create_llm_config` and `update_llm_config` routes in `llm_config_routes.py`, raising HTTP exceptions for invalid configurations.
- Enhanced error handling to provide detailed feedback on configuration issues.
- Add support for DeepSeek, Qwen (Alibaba), Kimi (Moonshot), and GLM (Zhipu)
- Implement auto-fill API Base URL when selecting Chinese LLM providers
- Add smart validation and warnings for missing API endpoints
- Fix session state management in task logging service
- Add comprehensive Chinese setup documentation
- Add database migration for new LLM provider enums
Closes#383
- RBAC soon??
- Updated various services and routes to handle search space-specific LLM preferences.
- Modified frontend components to pass search space ID for LLM configuration management.
- Removed onboarding page and settings page as part of the refactor.
- Updated import paths for LLM, connector, query, and streaming services to reflect their new location in the 'services' module.
- Removed obsolete utility service files that have been migrated.
2025-07-06 17:51:24 -07:00
Renamed from surfsense_backend/app/utils/llm_service.py (Browse further)