app/agents/shared/ is a sibling of anonymous_chat/podcaster/multi_agent_chat/
video_presentation, so it should only hold code shared across 2+ of those
agents. In practice podcaster and video_presentation import nothing from it,
and anonymous_chat needs only context + compaction + retry_after + web_search.
Everything else was multi_agent_chat-only (the boundary just passes through).
Move the multi_agent_chat-only cluster into multi_agent_chat/shared/ (files
moved verbatim via git rename; ~116 import sites rewritten):
errors, feature_flags, filesystem_selection, path_resolver, prompt_caching,
sandbox, llm_config, mention_resolver
middleware/busy_mutex, middleware/kb_persistence
busy_mutex/llm_config/mention_resolver are boundary-only but import the moved
modules, so they were folded in to avoid a backwards shared -> multi_agent_chat
dependency. main_agent builders now import the impls directly; the shared
middleware barrel keeps only the genuinely-shared compaction + retry_after.
Also delete the dead leftover shared/plugins and shared/skills dirs (live
copies already live under main_agent/).
Remaining in app/agents/shared/: context, system_prompt(+prompts), checkpointer,
middleware/{compaction,retry_after,dedup_tool_calls}, tools/. checkpointer and
system_prompt are boundary-only infra pending a dedicated home decision.
Relocate the entire new_chat/middleware/ package to the shared kernel as one
cohesive unit (it is live shared infrastructure: the multi-agent stack wraps
nearly every middleware via multi_agent_chat/middleware/main_agent/*, and
anonymous_agent consumes it too). Flip 69 live importers across both the
package-path and submodule-path forms.
Shims left for the frozen single-agent stack: a package __init__ re-export plus
submodule shims for permission, skills_backends, and scoped_model_fallback
(the three imported via submodule path by chat_deepagent/subagents).
Cycle break: importing shared.middleware previously reached back into
new_chat.tools at module load, which dragged in new_chat.__init__ ->
chat_deepagent -> the middleware shim -> half-initialized shared.middleware.
Made action_log's ToolDefinition import TYPE_CHECKING-only and
tool_call_repair's INVALID_TOOL_NAME import function-local. These tools-package
back-edges fully resolve in slice 6.
Asset note: skills_backends._default_builtin_root now walks to
app/agents/new_chat/skills/builtin (the skills/ tree migrates in slice 7).
Two independent leaf modules (no intra-new_chat deps, no frozen importer),
consumed only by flows/routes/tests. Flipped 8 importers across both the
dotted-path and module-style (from app.agents.new_chat import mention_resolver)
forms. No shims needed.
Relocate the mutually-dependent LLM config layer and the LiteLLM prompt-caching
helper to the shared kernel as one unit, rewiring their internal cross-reference
to the shared paths. Flip 21 non-frozen importers. Re-export shims remain at
new_chat/{llm_config,prompt_caching}.py for the frozen single-agent stack
(chat_deepagent); they will be removed when that stack is retired.
Promote the filesystem mode contracts (FilesystemMode, FilesystemSelection,
ClientPlatform, LocalFilesystemMount) out of `new_chat` into the cross-agent
`app/agents/shared` kernel.
Pure leaf consumed across the whole multi-agent filesystem middleware/tool tree,
the chat flows/monolith, routes and tests. git mv (content unchanged) + flipped
all ~48 importers. A re-export shim remains at new_chat/filesystem_selection.py
only for the not-yet-retired single-agent (chat_deepagent).
Also updated the stream parity test's annotation normalizer to strip the new
app.agents.shared.filesystem_selection. prefix (the dataclasses' __module__
changed with the move), keeping monolith<->flows signature parity intact.
Behavior-preserving: only import paths change. 1326 tests green.
First slice of promoting the shared agent toolkit out of the misnamed
`new_chat` package into the cross-agent `app/agents/shared` kernel.
`errors.py` is a leaf module (no intra-package deps) consumed by the
multi-agent chat, the chat streaming flows/monolith, and tests — i.e. it is
shared infrastructure, not single-agent code. Moved it verbatim to
`app.agents.shared.errors` and flipped all 12 importers. No re-export shim
remains since zero importers needed it.
Behavior-preserving: identical class/enum definitions; only the import path
changes. 1208 agent + chat-task tests green.
Extracts finalize_assistant_message: the post-stream server-side write
of the final assistant message (with content parts + token usage)
guarded by asyncio.shield + shielded_async_session so a client
disconnect cannot abort the persist.
Add-only; legacy stream_new_chat.py keeps its inline finalize block
until cutover.
Two cooperating modules that wrap stream_agent_events with in-stream
recovery from provider 429s:
* rate_limit_recovery: can_recover_provider_rate_limit truth-table
guard, reroute_to_next_auto_pin (selects the next eligible auto-pin
config and reloads the LLM bundle), log_rate_limit_recovered.
* stream_loop: run_stream_loop drives stream_agent_events in a
while-True loop, delegating recovery to a flow-supplied RecoverFn
callback so new_chat and resume_chat can share the same loop while
keeping their own nonlocal state.
Add-only; not yet wired into any orchestrator.
Extracts handle_terminal_exception: the shared except-branch behavior for
the chat orchestrators. Classifies the raised exception, logs the
structured chat_stream error event, and emits the terminal-error SSE
frame + done sentinel via the streaming service.
Add-only; nothing imports it yet.
Centralizes the premium-credits lifecycle for chat turns:
* needs_premium_quota: gate check (premium user + non-fallback config).
* PremiumReservation: dataclass capturing reservation state + token totals.
* reserve_premium / finalize_premium / release_premium: idempotent
reservation, commit, and rollback used by the orchestrators.
Add-only; legacy stream_new_chat.py keeps its inline quota handling
until cutover.
Six small, single-purpose modules shared by the upcoming new_chat and
resume_chat orchestrators:
* llm_bundle: dispatches negative config_id to the YAML loader and
non-negative config_id to the DB loader, returning (llm, AgentConfig).
* pre_stream_setup: builds the connector service, resolves the
Firecrawl API key, and returns the chat checkpointer.
* first_frames: iter_initial_frames + iter_final_frames emit the canonical
message-start / step-start / idle / finish / done SSE envelope.
* finalize_emit: iter_token_usage_frame emits the per-turn usage frame
from a TokenAccumulator summary.
* finally_cleanup: close_session_and_clear_ai_responding and run_gc_pass
centralize the finally-block bookkeeping.
* span: open_chat_request_span / set_agent_mode / close_chat_request_span /
record_outcome_attrs wrap the OpenTelemetry chat_request span.
Add-only; these are not yet wired into stream_new_chat.py.