mirror of https://github.com/MODSetter/SurfSense.git
synced 2026-06-14 20:55:15 +02:00
feat: made chat fast
- Introduced lazy knowledge base retrieval mode, allowing the main agent to fetch KB content on demand via the `search_knowledge_base` tool, improving performance by skipping expensive pre-injection processes.
- Added cross-thread caching capability, enabling reuse of compiled graphs across different user chats, reducing latency for returning users.
- Updated middleware to support new lazy loading and caching features, ensuring efficient resource utilization and improved response times.
- Enhanced logging for performance tracking during knowledge retrieval and agent interactions.
parent ce952d2ad1
commit 41ff57101c
32 changed files with 979 additions and 169 deletions
@@ -362,6 +362,13 @@ LANGSMITH_PROJECT=surfsense
# SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS=false
# SURFSENSE_ENABLE_KB_PLANNER_RUNNABLE=false

# KB retrieval mode (default OFF = lazy). When OFF, the main agent retrieves
# KB content on demand via the `search_knowledge_base` tool and skips the
# expensive per-turn pre-injection (planner LLM + embed + hybrid search,
# ~2.3s); explicit @-mentions are still surfaced cheaply. Set to true to
# restore the original eager `<priority_documents>` pre-injection.
# SURFSENSE_ENABLE_KB_PRIORITY_PREINJECTION=false

# Snapshot / revert
# SURFSENSE_ENABLE_ACTION_LOG=false
# SURFSENSE_ENABLE_REVERT_ROUTE=false # Backend-only; flip when UI ships
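
The flag semantics described in the hunk above (unset or false means lazy, true restores eager pre-injection) can be sketched as follows. This is a minimal illustration only: the helper names `env_flag` and `kb_retrieval_mode` are hypothetical, and the set of accepted truthy spellings is an assumption, not taken from the SurfSense code.

```python
import os
from typing import Mapping, Optional


def env_flag(name: str, default: bool = False,
             environ: Optional[Mapping[str, str]] = None) -> bool:
    """Parse a boolean environment flag (hypothetical helper).

    Unset or empty values fall back to `default`. The accepted truthy
    spellings below are an assumption for this sketch.
    """
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def kb_retrieval_mode(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return "eager" only when pre-injection is explicitly enabled.

    With the flag unset or false, the agent is expected to call the
    `search_knowledge_base` tool on demand ("lazy").
    """
    eager = env_flag("SURFSENSE_ENABLE_KB_PRIORITY_PREINJECTION", False, environ)
    return "eager" if eager else "lazy"
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the sketch easy to exercise without touching the process environment.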
@@ -382,6 +389,15 @@ LANGSMITH_PROJECT=surfsense
# rollback if you suspect cache-related staleness.
# SURFSENSE_ENABLE_AGENT_CACHE=true

# Cross-thread reuse (default ON). Drops thread_id from the cache key so a
# returning user's NEW chats (same user + search space + config + visibility)
# hit the already-compiled graph instead of paying a fresh ~4-5s compile —
# turning a cold first turn into a warm one. Safe because ActionLog,
# KB-persistence, and the deliverables tools now resolve the chat thread from
# the live RunnableConfig at call time rather than a build-time closure. Flip
# OFF to fall back to a per-thread cache key (instant rollback).
# SURFSENSE_ENABLE_CROSS_THREAD_AGENT_CACHE=true

# Cache capacity (max number of compiled-agent entries kept in memory)
# and TTL per entry (seconds). Working set is typically one entry per
# active thread on this replica; tune up for very large deployments.