Centralizes the premium-credits lifecycle for chat turns:
* needs_premium_quota: gate check (premium user + non-fallback config).
* PremiumReservation: dataclass capturing reservation state + token totals.
* reserve_premium / finalize_premium / release_premium: idempotent
reservation, commit, and rollback used by the orchestrators.
Add-only; legacy stream_new_chat.py keeps its inline quota handling
until cutover.
Six small, single-purpose modules shared by the upcoming new_chat and
resume_chat orchestrators:
* llm_bundle: dispatches negative config_id to the YAML loader and
non-negative config_id to the DB loader, returning (llm, AgentConfig).
* pre_stream_setup: builds the connector service, resolves the
Firecrawl API key, and returns the chat checkpointer.
* first_frames: iter_initial_frames + iter_final_frames emit the canonical
message-start / step-start / idle / finish / done SSE envelope.
* finalize_emit: iter_token_usage_frame emits the per-turn usage frame
from a TokenAccumulator summary.
* finally_cleanup: close_session_and_clear_ai_responding and run_gc_pass
centralize the finally-block bookkeeping.
* span: open_chat_request_span / set_agent_mode / close_chat_request_span /
record_outcome_attrs wrap the OpenTelemetry chat_request span.
Add-only; these are not yet wired into stream_new_chat.py.
Extracts the inner agent-streaming driver previously inlined as
_stream_agent_events in stream_new_chat.py.
stream_agent_events drives graph_stream.event_stream.stream_output and,
after the agent finishes, performs the post-stream safety-net work:
* commit any pending content the agent never explicitly finished
* evaluate file-operation contract outcomes and emit the appropriate
contract verdict for desktop_local_folder turns
This unit is what flows/shared/stream_loop.py wraps in the rate-limit
recovery while-loop. Add-only; no existing wiring uses it yet.
Extracts the agent-construction wrapper that the chat streamers call to
materialize the LangGraph agent for a given thread. Centralizes how we
pass the agent factory plus checkpointer, runtime context, and the
in-memory content builder.
Add-only; pre-existing inline equivalent in stream_new_chat.py stays
until cutover.
Extracts the desktop_local_folder file-operation contract helpers:
* contract_enforcement_active: gates the contract on filesystem mode.
* evaluate_file_contract_outcome: scores tool outputs as success/no-op.
* log_file_contract: structured logging of contract verdicts.
This is the unit responsible for catching agents that claim to have
written/edited a file without actually invoking the filesystem tool.
Add-only; stream_new_chat.py keeps its inline duplicates until cutover.
Extracts two pure context helpers used during input-state assembly:
* mentioned_docs.format_mentioned_surfsense_docs_as_context: renders the
user's @-mentioned SurfSense docs into the LLM context block.
* deepagents_todos.extract_todos_from_deepagents: pulls the in-progress
todo list from a deep-agents state snapshot for the title generator.
Add-only; existing call sites in stream_new_chat.py remain untouched
until cutover.
Foundation layer for the parallel refactor of stream_new_chat.py.
Extracts the StreamResult dataclass (tracks per-turn streaming state)
and a small set of shared utilities (resume_step_prefix, safe_float).
Add-only; no existing code imports from this package yet. Existing
stream_new_chat.py keeps its inline equivalents until cutover.