feat: enhance task management and timeout configurations in multi-agent chat

- Added new environment variables for controlling task execution limits, including `SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS`, `SURFSENSE_TASK_BATCH_CONCURRENCY`, and `SURFSENSE_TASK_BATCH_MAX_SIZE`.
- Updated documentation to reflect new batch processing capabilities for `task` calls, allowing for concurrent execution of multiple subagent tasks.
- Improved error handling and receipt generation for deliverables, ensuring consistent feedback on task status.
- Refactored middleware to incorporate search space ID for better task management.
DESKTOP-RTLN3BA\$punk 2026-05-27 14:58:10 -07:00
parent 820f541f08
commit 9d6e9b7e2d
66 changed files with 2561 additions and 380 deletions

@@ -357,3 +357,50 @@ LANGSMITH_PROJECT=surfsense
# updates and deletes — the TTL only bounds staleness for bulk-import
# paths that bypass the ORM. Set to 0 to disable the cache.
# SURFSENSE_CONNECTOR_DISCOVERY_TTL_SECONDS=30
# -----------------------------------------------------------------------------
# `task` boundary controls (Hermes-inspired improvements)
# -----------------------------------------------------------------------------
# Wall-clock budget, in seconds, for a single ``task(subagent, ...)``
# invocation. Subagents that run long (slow image vendors, sluggish embedders,
# wedged MCP servers) would otherwise pin the orchestrator until the next
# checkpoint heartbeat fires. On timeout the runtime cancels the underlying
# coroutine and synthesizes a ToolMessage telling the orchestrator to treat
# the result as ``status=error``. Set to 0 to disable the cap entirely.
# Default: 300.0
# SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS=300
# Batch-mode (``task(tasks=[...])``) concurrency cap and max batch size.
# Concurrency is enforced via an ``asyncio.Semaphore`` so a runaway fanout
# cannot starve unrelated subagents (each child still owns an LLM call and
# its own DB session). Max-size is a hard safety net for prompt-injection /
# runaway loops; the orchestrator rarely needs more than a handful of
# concurrent specialists. Set concurrency to 1 to effectively serialise
# batches without changing the schema.
# SURFSENSE_TASK_BATCH_CONCURRENCY=3
# SURFSENSE_TASK_BATCH_MAX_SIZE=8
# Soft per-turn cap on cumulative ``task(...)`` invocations across all
# subagents. Once the sum of ``state['billable_calls']`` crosses this
# number, the runtime appends a one-shot warning ToolMessage telling the
# orchestrator to wrap up rather than launching more specialists. Tunable
# so heavy-research turns (15+ legitimate specialist calls) don't trip the
# alarm in production. Set to 0 to disable the warning entirely.
# SURFSENSE_SUBAGENT_BILLABLE_THRESHOLD=15
# Per-workspace spawn-paused kill switch — set via Redis at runtime, not
# this env var. The env var below only disables the check itself (useful
# for local dev without Redis). To pause a workspace in production:
# redis-cli SET surfsense:spawn_paused:<search_space_id> 1 EX 600
# redis-cli DEL surfsense:spawn_paused:<search_space_id>
# The check is fail-open: a Redis blip never blocks ``task(...)``.
# SURFSENSE_TASK_SPAWN_PAUSED_DISABLED=false
# Note on Celery-backed deliverables (generate_podcast,
# generate_video_presentation): these tools poll the artefact row until
# it reaches a terminal status — they do NOT use an internal wall-clock
# budget. The effective ceiling is SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS
# (above, default 300s) in multi-agent mode and the chat's HTTP / process
# lifetime in single-agent mode. If your podcasts or videos routinely
# exceed 5 minutes, raise SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS (or
# set it to 0 to disable that ceiling entirely).
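
The timeout and batch controls described in the comments above can be sketched roughly as follows. This is a minimal illustration, not the SurfSense implementation: the helper names `invoke_with_timeout` and `run_batch`, the result dictionaries, and the fallback values for the two batch variables (taken from the commented-out examples) are all assumptions. Only the environment variable names, the 300-second default, the use of an `asyncio.Semaphore`, and the "0 disables the cap" rule come from the text.

```python
import asyncio
import os

# Variable names come from the env file; the batch fallbacks (3 and 8) are
# assumed from the commented-out examples, not confirmed defaults.
TIMEOUT = float(os.environ.get("SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS", "300"))
CONCURRENCY = int(os.environ.get("SURFSENSE_TASK_BATCH_CONCURRENCY", "3"))
MAX_SIZE = int(os.environ.get("SURFSENSE_TASK_BATCH_MAX_SIZE", "8"))


async def invoke_with_timeout(factory, timeout):
    """Run one subagent call (hypothetical helper) under a wall-clock budget.

    `factory` is a zero-argument callable returning a coroutine.
    A timeout of 0 or less disables the cap.
    """
    try:
        if timeout <= 0:
            result = await factory()
        else:
            # wait_for cancels the underlying coroutine when the budget expires.
            result = await asyncio.wait_for(factory(), timeout=timeout)
        return {"status": "ok", "result": result}
    except asyncio.TimeoutError:
        # Stand-in for the synthesized error message returned to the orchestrator.
        return {"status": "error", "result": f"subagent timed out after {timeout}s"}


async def run_batch(factories, concurrency, max_size, timeout):
    """Run a batch of subagent calls (hypothetical helper) with bounded fan-out."""
    if len(factories) > max_size:
        # Hard safety net against runaway or injected fan-out.
        raise ValueError(f"batch of {len(factories)} exceeds max size {max_size}")

    semaphore = asyncio.Semaphore(max(1, concurrency))

    async def guarded(factory):
        # At most `concurrency` calls hold the semaphore at any moment.
        async with semaphore:
            return await invoke_with_timeout(factory, timeout)

    # gather preserves input order, so results line up with the requested tasks.
    return await asyncio.gather(*(guarded(f) for f in factories))


async def _demo_subagent():
    # Placeholder for a real subagent invocation.
    await asyncio.sleep(0.01)
    return "done"


if __name__ == "__main__":
    results = asyncio.run(
        run_batch([_demo_subagent] * 4, CONCURRENCY, MAX_SIZE, TIMEOUT)
    )
    print(results)
```

In this sketch the budget starts once a call acquires the semaphore, so time spent queued behind other batch members does not count against it. Whether the real runtime measures the budget the same way is not stated in the source.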