SurfSense

mirror of https://github.com/MODSetter/SurfSense.git synced 2026-05-29 19:35:20 +02:00

Author	SHA1	Message	Date
CREDO23	b2a0888588	refactor(chat): add streaming/flows/new_chat/orchestrator.stream_new_chat Slim composition root for the new-chat streaming flow. Sequences: 1. validate inputs and load the LLM bundle (negative id => YAML) 2. open the OTEL chat_request span; set agent_mode tag 3. spawn the four pre-stream DB writes (set-ai-responding, persist user turn, persist assistant shell, first-assistant probe) 4. reserve premium quota (with free-fallback retry on denial) 5. build connector + checkpointer + agent + input_state 6. emit first frames (message-start, step-start, initial thinking step) 7. spawn the background title generator 8. run the shared stream_loop with a flow-local _recover closure that reroutes to the next auto-pin config on provider 429s 9. finalize: emit terminal title/token frames, shielded assistant finalize, release-or-finalize premium quota, close session, GC, record OTEL outcome Public entry-point flows/new_chat/__init__ re-exports stream_new_chat. Existing wiring (routes, tests) still imports the legacy function from app.tasks.chat.stream_new_chat. Cutover is a later commit.	2026-05-25 21:49:55 +02:00
CREDO23	927009745e	refactor(chat): add streaming/flows/new_chat/ per-concern leaf modules Seven focused modules that the upcoming new_chat orchestrator composes: * auto_pin: resolve_initial_auto_pin selects the initial config (with vision-capable filtering and error classification). * llm_capability: check_image_input_capability blocks routing an image-bearing turn to a known text-only model. * runtime_context: build_new_chat_runtime_context assembles the SurfSenseContextSchema for a new-chat turn. * persistence_spawn: spawn_set_ai_responding_bg, spawn_persist_user_task, spawn_persist_assistant_shell_task, and await_persist_task background the four pre-stream DB writes so they overlap with agent build. * initial_thinking_step: build_initial_thinking_step + iter_initial_thinking_step_frame produce the very first thinking-1 SSE step ("Understanding your request" / "Analyzing referenced content"). * title_gen: spawn_title_task + maybe_emit_title_update + await_pending_title_update background the thread-title generator and interleave its update into the stream when ready. * input_state: build_new_chat_input_state assembles the LangGraph input_state (history bootstrap, mentions resolution, context blocks, human-message construction). The heavy one. Add-only; no orchestrator yet (next commit).	2026-05-25 21:49:45 +02:00
CREDO23	21bddc73a7	refactor(chat): add streaming/flows/shared/assistant_finalize.py Extracts finalize_assistant_message: the post-stream server-side write of the final assistant message (with content parts + token usage) guarded by asyncio.shield + shielded_async_session so a client disconnect cannot abort the persist. Add-only; legacy stream_new_chat.py keeps its inline finalize block until cutover.	2026-05-25 21:49:31 +02:00
CREDO23	b54b803dc9	refactor(chat): add streaming/flows/shared/ rate-limit recovery + stream loop Two cooperating modules that wrap stream_agent_events with in-stream recovery from provider 429s: * rate_limit_recovery: can_recover_provider_rate_limit truth-table guard, reroute_to_next_auto_pin (selects the next eligible auto-pin config and reloads the LLM bundle), log_rate_limit_recovered. * stream_loop: run_stream_loop drives stream_agent_events in a while-True loop, delegating recovery to a flow-supplied RecoverFn callback so new_chat and resume_chat can share the same loop while keeping their own nonlocal state. Add-only; not yet wired into any orchestrator.	2026-05-25 21:49:27 +02:00
CREDO23	2c3edb7c84	refactor(chat): add streaming/flows/shared/terminal_error.py Extracts handle_terminal_exception: the shared except-branch behavior for the chat orchestrators. Classifies the raised exception, logs the structured chat_stream error event, and emits the terminal-error SSE frame + done sentinel via the streaming service. Add-only; nothing imports it yet.	2026-05-25 21:49:18 +02:00
CREDO23	40300d300a	refactor(chat): add streaming/flows/shared/premium_quota.py Centralizes the premium-credits lifecycle for chat turns: * needs_premium_quota: gate check (premium user + non-fallback config). * PremiumReservation: dataclass capturing reservation state + token totals. * reserve_premium / finalize_premium / release_premium: idempotent reservation, commit, and rollback used by the orchestrators. Add-only; legacy stream_new_chat.py keeps its inline quota handling until cutover.	2026-05-25 21:49:14 +02:00
CREDO23	e9a98ecafb	refactor(chat): add streaming/flows/shared/ base helpers Six small, single-purpose modules shared by the upcoming new_chat and resume_chat orchestrators: * llm_bundle: dispatches negative config_id to the YAML loader and non-negative config_id to the DB loader, returning (llm, AgentConfig). * pre_stream_setup: builds the connector service, resolves the Firecrawl API key, and returns the chat checkpointer. * first_frames: iter_initial_frames + iter_final_frames emit the canonical message-start / step-start / idle / finish / done SSE envelope. * finalize_emit: iter_token_usage_frame emits the per-turn usage frame from a TokenAccumulator summary. * finally_cleanup: close_session_and_clear_ai_responding and run_gc_pass centralize the finally-block bookkeeping. * span: open_chat_request_span / set_agent_mode / close_chat_request_span / record_outcome_attrs wrap the OpenTelemetry chat_request span. Add-only; these are not yet wired into stream_new_chat.py.	2026-05-25 21:49:09 +02:00

7 commits