Commit graph

205 commits

Author SHA1 Message Date
CREDO23
49da7a57df Merge remote-tracking branch 'upstream/dev' into improvement-agent-speed
Resolves: surfsense_backend/app/agents/new_chat/middleware/memory_injection.py
- Took both imports: upstream moved MEMORY_HARD_LIMIT/SOFT_LIMIT to
  app.services.memory; kept our perf-logger import for timing.

Pulls in upstream changes:
- Memory document feature (services/memory refactor, removal of
  app.agents.new_chat.memory_extraction and background extraction in
  stream_new_chat — agent now drives memory via update_memory tool).
- BACKEND_URL env refactor across web tool-ui/editor/chat/dashboard/lib.
- GitHub Actions backend test workflow + pre-commit biome bump.
- Token-display polish in MessageInfoDropdown; save_memory no-update
  sentinel.

Verified: 1723 unit tests pass, ruff clean. No semantic regression in
stream_new_chat (their memory-extraction deletion and our preflight
removal touch different functions).
2026-05-20 21:23:48 +02:00
CREDO23
d5ee8cc4cd Merge remote-tracking branch 'upstream/dev' into improvement-agent-speed 2026-05-20 19:22:49 +02:00
CREDO23
c3db25302b perf(chat): kill auto-pin preflight + speculative build, rely on reactive 429 recovery
The preflight pattern probed the LLM with a 1-token ping before each
cold turn (when requested_llm_config_id==0, llm_config_id<0, and the
45s healthy TTL had expired) to detect 429s before fanning out into
planner/classifier/title-gen. To absorb its ~1-5s RTT cost we built the
agent speculatively in parallel; on 429 we discarded the build and
repinned.

Three problems with that design:

1. False security. Provider rate limits are token-bucket. A 1-token
   ping consumes ~5 tokens; the real request consumes 10-50K. The
   probe can return 200 while the real call still 429s.
2. Pure overhead in the common case. On warm-agent-cache turns the
   probe dominates wall time: ~2.5s of pure TTFT tax for the ~99% of
   users who never see a 429.
3. The in-stream recovery loop (catch of _is_provider_rate_limited
   gated by not _first_event_logged) already does the right thing
   reactively: mark_runtime_cooldown -> resolve_or_get_pinned_llm_config_id
   with exclude_config_ids={previous} -> rebuild agent -> retry the
   stream. Preflight was never the only safety net; it was a redundant
   probe in front of one.
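
Problem 1 can be illustrated with a toy token bucket. This is a minimal sketch, not the provider's actual limiter: the TokenBucket class, its capacity, and the starting balance are all assumptions chosen for illustration, and refill over time is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical token-budget limiter; not a real provider API."""

    capacity: float
    tokens: float

    def try_consume(self, cost: float) -> bool:
        """Return True (think HTTP 200) if the budget covers cost, else False (think 429)."""
        if cost > self.tokens:
            return False
        self.tokens -= cost
        return True


# Assumed numbers: the bucket is nearly drained when the turn starts.
bucket = TokenBucket(capacity=100_000, tokens=8_000)

probe_ok = bucket.try_consume(5)       # the ~5-token preflight ping
real_ok = bucket.try_consume(20_000)   # a real request in the 10-50K range

print(probe_ok, real_ok)
```

Under these assumed numbers the probe is admitted and the real request is rejected, which is the false-security case described in point 1.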

Changes:
- Delete _preflight_llm, _settle_speculative_agent_build, and the
  _PREFLIGHT_TIMEOUT_SEC / _PREFLIGHT_MAX_TOKENS constants.
- Drop the parallel agent_build_task / preflight_task plumbing in
  both stream_new_chat and stream_resume_chat; build the agent inline
  with await _build_main_agent_for_thread(...).
- Drop the unused is_recently_healthy / mark_healthy imports here
  (still exported from auto_model_pin_service since OpenRouter
  catalogue refresh and a few tests reference clear_healthy).
- Remove the obsolete preflight + settle-speculative tests from
  test_stream_new_chat_contract.py.

Net: -447 LOC. ~2.5s removed from TTFT on every cold preflight-eligible
turn. 429 recovery path is unchanged - same repin/rebuild/retry, just
not paid in advance on the healthy path.
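
The reactive repin/rebuild/retry path that remains could look roughly like the sketch below. Every name in it (ProviderRateLimited, fake_stream, stream_with_reactive_repin, the integer config ids) is a hypothetical stand-in, not the code in stream_new_chat; it only shows the shape of "retry with the failed config excluded, but only before the first event has been emitted".

```python
import asyncio


class ProviderRateLimited(Exception):
    """Stand-in for whatever error the provider raises on HTTP 429."""


async def fake_stream(config_id, limited):
    """Hypothetical provider stream: fails before the first event if rate limited."""
    if config_id in limited:
        raise ProviderRateLimited(config_id)
    for chunk in ("hello", "world"):
        yield f"{config_id}:{chunk}"


async def stream_with_reactive_repin(candidates, limited, max_attempts=3):
    """Retry on a 429 only while nothing has been sent to the client yet."""
    excluded = set()
    for _ in range(max_attempts):
        remaining = [c for c in candidates if c not in excluded]
        if not remaining:
            break
        config_id = remaining[0]  # "repin": first config that is not excluded
        events = []
        try:
            async for event in fake_stream(config_id, limited):
                events.append(event)
            return events
        except ProviderRateLimited:
            if events:
                # Output already reached the client; retrying would duplicate it.
                raise
            excluded.add(config_id)  # cool down the failed config, then retry
    raise RuntimeError("no healthy config left")


result = asyncio.run(stream_with_reactive_repin([1, 2], limited={1}))
print(result)
```

In this sketch the healthy path pays nothing extra: the retry branch runs only when the first attempt actually raises.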
2026-05-20 11:03:08 +02:00
Anish Sarkar
132e7b3c44 refactor: remove memory extraction functions and related components from the new chat agent 2026-05-20 14:03:28 +05:30
Anish Sarkar
87caa4b6d0 Merge remote-tracking branch 'upstream/dev' into feat/ui-revamp 2026-05-18 09:39:35 +05:30
Anish Sarkar
af1d2fa430 Merge remote-tracking branch 'upstream/dev' into fix/zero-cache-stale-replica-1355 2026-05-16 19:30:09 +05:30
Anish Sarkar
f65bc81509 Merge remote-tracking branch 'upstream/dev' into feat/ui-revamp 2026-05-16 19:26:36 +05:30
Anish Sarkar
01d7379914 refactor: add public URL handling for SurfSense documents across various components and schemas 2026-05-15 02:05:11 +05:30
CREDO23
f2495092da chat/stream_resume: salt thinking-step prefix with turn_id to avoid duplicate React keys 2026-05-13 21:15:51 +02:00
CREDO23
0fd87ccb7f chat/stream_resume: key Command(resume=...) by Interrupt.id for parallel HITL 2026-05-13 20:59:57 +02:00
CREDO23
c06dd6e8ba chat/stream_new_chat: emit one SSE frame per pending interrupt 2026-05-13 20:59:48 +02:00
CREDO23
03cf1466d3 chat/stream_resume: route a flat decisions list per paused subagent 2026-05-13 19:58:13 +02:00
Anish Sarkar
8ea042e88c refactor(chat): improve user query handling and mention chip functionality 2026-05-12 20:57:15 +05:30
DESKTOP-RTLN3BA\$punk
c8374e6c5b feat: improved document, folder mentions rendering
2026-05-09 22:15:51 -07:00
CREDO23
1761b60c16 Carry thinkingStepId on tool output and extend builder and parity tests. 2026-05-08 23:17:12 +02:00
CREDO23
32092c0b65 Pass thinkingStepId through tool-input start and available metadata. 2026-05-08 23:17:05 +02:00
CREDO23
a309e830d3 Document thinkingStepId on tool-call parts and first-key metadata merge. 2026-05-08 23:17:01 +02:00
CREDO23
d136fcd054 Add tool_activity_metadata to merge spanId and thinkingStepId for tools. 2026-05-08 23:16:44 +02:00
CREDO23
3dbcac4b9d Merge span metadata into persisted tool-call and thinking parts. 2026-05-08 22:48:07 +02:00
CREDO23
f1d80ffe5d Forward span metadata from report_progress thinking updates. 2026-05-08 22:47:50 +02:00
CREDO23
1dcb08e925 Attach active span metadata to thinking-step SSE and completion. 2026-05-08 22:47:46 +02:00
CREDO23
3ed09bdd90 Clear spans after task completion and pass span id on tool output. 2026-05-08 22:47:38 +02:00
CREDO23
2c1b219c6c Open task spans at tool start and tag unmatched tool-input SSE. 2026-05-08 22:47:32 +02:00
CREDO23
695f9ded2c Mint pending span id when the task tool registers from chunks. 2026-05-08 22:47:08 +02:00
CREDO23
f944cdacb7 Add helpers to open and close task delegation span ids. 2026-05-08 22:47:03 +02:00
CREDO23
f0f87107f2 Track active task span id on the agent event relay state. 2026-05-08 22:46:58 +02:00
CREDO23
78f4747382 refactor(chat): stream agent events via stream_output and remove parity v2 flag 2026-05-07 19:40:10 +02:00
CREDO23
7e07092f67 refactor(chat): drop alternate streaming entry path; use graph_stream 2026-05-07 19:25:20 +02:00
CREDO23
52895e37e9 build streaming contexts for chat resume and regenerate paths 2026-05-07 17:57:27 +02:00
CREDO23
a04b2e88bd wire orchestrator streaming context path and align event relay outputs 2026-05-07 17:06:17 +02:00
CREDO23
0f40279d95 Expand orchestration gate coverage to resume and regenerate flows. 2026-05-07 16:18:29 +02:00
CREDO23
52593d88db Reorganize streaming orchestration modules into relay and orchestration folders. 2026-05-07 16:00:15 +02:00
CREDO23
f8754a9dab Rename streaming runtime modules for clearer SRP boundaries. 2026-05-07 15:41:33 +02:00
CREDO23
4e664652a8 Add streaming runtime helpers with behavior-focused unit tests. 2026-05-07 15:13:22 +02:00
CREDO23
c0706364d1 Add a route-level kill switch for streaming orchestrator cutover. 2026-05-07 14:44:36 +02:00
CREDO23
ec26ca69a6 Add chat EventRelay and orchestrator stubs for future cutover. 2026-05-06 20:08:48 +02:00
CREDO23
c8fb4aa5e5 Add deliverables and web tool streaming handlers for chat runs. 2026-05-06 20:08:48 +02:00
CREDO23
a322eedaa1 Add filesystem tool streaming handlers for chat runs. 2026-05-06 20:08:48 +02:00
CREDO23
1392abf5b1 Add chat tool streaming registry with shared, default, and connector tools. 2026-05-06 20:08:48 +02:00
CREDO23
ee16e1d5f9 Add LangGraph handlers for chat model, chain, tool, and custom events. 2026-05-06 20:08:48 +02:00
CREDO23
7581a7c9c3 Add chat streaming relay state and thinking-step SSE helpers. 2026-05-06 20:08:48 +02:00
CREDO23
c25b78c304 Add chat streaming error classification, helpers, and StreamResult. 2026-05-06 20:08:48 +02:00
CREDO23
3cb2c3056e fix(stream): route every agent (re)build through one helper to prevent factory drift 2026-05-05 23:35:23 +02:00
CREDO23
657c31fdf4 refactor(stream): rename multi-agent factory alias for clarity 2026-05-05 23:01:24 +02:00
CREDO23
5119915f4f Merge upstream/dev into feature/multi-agent 2026-05-05 01:44:46 +02:00
CREDO23
0af2c28a8d Stabilize HITL bundle UX and resume. 2026-05-04 23:58:53 +02:00
CREDO23
972650909c Rename package: multi_agent_chat 2026-05-04 21:57:05 +02:00
CREDO23
216a678f1a Address LLM review findings; trim comments. 2026-05-04 21:32:42 +02:00
CREDO23
65f1f8f73c Harden multi-agent for production: resume cleanup, busy-mutex race, deny propagation, disabled-tools. 2026-05-04 20:48:55 +02:00
CREDO23
4ac3f0b304 Forward HITL decisions from the streaming layer to subagents via the config side-channel. 2026-05-04 18:42:58 +02:00