With multi-agent the only live factory (B1), the single-agent stack is dead.
Remove app/agents/new_chat/ entirely: chat_deepagent.py, subagents/, and all
re-export shims (errors/context/llm_config/permissions/tools/middleware/...) that
existed only to serve frozen single-agent code. Live code already imports the
shared kernel (app.agents.shared.*) directly.
Tests: delete single-agent-only suites (test_resolve_prompt_model_name,
test_specialized_subagents) and the chat_deepagent source-shape contract assertion;
repoint test_scoped_model_fallback to the shared middleware path. Suite green
(2710 passed).
Three live shared leaves discovered while taking stock after slice 7 (all are
consumed by the multi-agent stack and/or live routes, not single-agent-only):
- connector_searchable_types -> shared + shim (multi-agent factory uses it)
- agent_cache -> shared + shim (multi-agent runtime/agent_cache uses it)
- system_prompt + prompts/ (42 .md fragments) -> shared together + shim.
Repointed composer's _PROMPTS_PACKAGE to app.agents.shared.prompts so
importlib.resources fragment loading keeps working; system_prompt's relative
".prompts.composer" import is preserved by moving both as a unit.
Each keeps a re-export shim for the frozen chat_deepagent. After this slice,
new_chat/ holds only the frozen single-agent stack (chat_deepagent, subagents/,
__init__) plus shims.
- skills/ (builtin SKILL.md assets) has zero Python importers; it is read by
filesystem path only. Moved the dir and restored
skills_backends._default_builtin_root() to the clean
parent.parent / "skills" / "builtin" form (undoing the transient path from 5c).
- plugin_loader.py -> shared (frozen chat_deepagent uses it -> re-export shim).
- plugins/ package -> shared (year_substituter rewired to shared.plugin_loader;
docstring entry-point example updated to the shared dotted path). No shim
needed (only a test imported it). Plugin discovery is via importlib entry
points (group "surfsense.plugins"), not dotted-path import, and nothing is
registered in pyproject, so the move does not affect runtime discovery.
Relocate the entire new_chat/tools/ package (62 files incl. registry, hitl, MCP
cluster, and all connector subpackages: gmail/slack/discord/teams/drive/etc.)
to the shared kernel. The package turned out to be a clean cohesive cluster:
its only references to non-tools new_chat modules were comments, and its
middleware deps were already flipped to shared in slice 5c.
Flip 33 live importers (multi-agent, flows, routes, services, anonymous_agent,
tests). Re-export shims remain for the frozen single-agent stack: a package
__init__ mirroring the public surface (new_chat.__init__ imports it) plus
invalid_tool + registry submodule shims (chat_deepagent imports those).
Resolves slice 5c's two transient back-edges: shared/middleware/action_log
(TYPE_CHECKING ToolDefinition) and tool_call_repair (local INVALID_TOOL_NAME)
now point at app.agents.shared.tools.
Completes slice 5. filesystem_backends was deferred from 3b because it depends
on middleware.{kb_postgres_backend,multi_root_local_folder_backend}; those moved
to shared in 5c, so it now relocates cleanly. Flip the 2 non-frozen importers
(multi-agent factory + test); a re-export shim remains for the frozen
chat_deepagent (build_backend_resolver).
Relocate the entire new_chat/middleware/ package to the shared kernel as one
cohesive unit (it is live shared infrastructure: the multi-agent stack wraps
nearly every middleware via multi_agent_chat/middleware/main_agent/*, and
anonymous_agent consumes it too). Flip 69 live importers across both the
package-path and submodule-path forms.
Shims left for the frozen single-agent stack: a package __init__ re-export plus
submodule shims for permission, skills_backends, and scoped_model_fallback
(the three imported via submodule path by chat_deepagent/subagents).
Cycle break: importing shared.middleware previously reached back into
new_chat.tools at module load, which dragged in new_chat.__init__ ->
chat_deepagent -> the middleware shim -> half-initialized shared.middleware.
Made action_log's ToolDefinition import TYPE_CHECKING-only and
tool_call_repair's INVALID_TOOL_NAME import function-local. These tools-package
back-edges fully resolve in slice 6.
Asset note: skills_backends._default_builtin_root now walks to
app/agents/new_chat/skills/builtin (the skills/ tree migrates in slice 7).
Two independent leaf modules (no intra-new_chat deps, no frozen importer),
consumed only by flows/routes/tests. Flipped 8 importers across both the
dotted-path and module-style (from app.agents.new_chat import mention_resolver)
forms. No shims needed.
Two pure leaf modules with no intra-new_chat deps and no frozen importer.
Moving them now (before the middleware package) pre-empts two shared->new_chat
back-edges that the middleware move would otherwise create
(knowledge_search->utils, kb_postgres_backend->document_xml).
Relocate the mutually-dependent LLM config layer and the LiteLLM prompt-caching
helper to the shared kernel as one unit, rewiring their internal cross-reference
to the shared paths. Flip 21 non-frozen importers. Re-export shims remain at
new_chat/{llm_config,prompt_caching}.py for the frozen single-agent stack
(chat_deepagent); they will be removed when that stack is retired.
Relocate the permission evaluator (wildcard matcher + rule evaluation) to the
shared kernel and flip 43 non-frozen importers. A re-export shim remains at
new_chat/permissions.py for the frozen single-agent stack (chat_deepagent and
subagents/{config,providers/linear,providers/slack}); it will be removed when
that stack is retired.
Relocate three leaf filesystem-cluster modules to the shared kernel and flip
all 38 importers. No re-export shims needed (no frozen single-agent importer).
This also resolves the pre-existing shared->new_chat back-edge from
shared/receipt_command.py onto filesystem_state.
filesystem_backends is intentionally deferred to slice 5: it depends on
new_chat middleware (kb_postgres_backend, multi_root_local_folder_backend)
that have not yet moved, so relocating it now would create a shared->new_chat edge.
Promote the filesystem mode contracts (FilesystemMode, FilesystemSelection,
ClientPlatform, LocalFilesystemMount) out of `new_chat` into the cross-agent
`app/agents/shared` kernel.
Pure leaf consumed across the whole multi-agent filesystem middleware/tool tree,
the chat flows/monolith, routes and tests. git mv (content unchanged) + flipped
all ~48 importers. A re-export shim remains at new_chat/filesystem_selection.py
only for the not-yet-retired single-agent (chat_deepagent).
Also updated the stream parity test's annotation normalizer to strip the new
app.agents.shared.filesystem_selection. prefix (the dataclasses' __module__
changed with the move), keeping monolith<->flows signature parity intact.
Behavior-preserving: only import paths change. 1326 tests green.
Promote the agent feature-flag resolver (AgentFeatureFlags / get_flags) out of
`new_chat` into the cross-agent `app/agents/shared` kernel.
feature_flags is a pure leaf consumed across the multi-agent middleware stack,
the chat routes, and tests. Moved it via git mv (content unchanged) and flipped
all 37 importers to app.agents.shared.feature_flags. A thin re-export shim
remains at new_chat/feature_flags.py only for the not-yet-retired single-agent
(chat_deepagent); it goes away with the single-agent deletion.
Behavior-preserving: only import paths change. 1243 tests green.
Continue promoting the shared agent toolkit out of `new_chat` into the
cross-agent `app/agents/shared` kernel.
- state_reducers.py: clean move (no single-agent importer); all 7 importers
flipped to app.agents.shared.state_reducers.
- context.py: moved to app.agents.shared.context; flipped the multi-agent,
app, automations, chat-flows and monolith importers. A thin re-export shim
remains at new_chat/context.py because the not-yet-retired single-agent
(chat_deepagent) and the new_chat package __init__ still import it; the shim
goes away with the single-agent deletion.
- Updated the stream parity test's annotation normalizer to strip the new
app.agents.shared.context. prefix (SurfSenseContextSchema.__module__ changed
with the move), keeping monolith<->flows signature parity intact.
Behavior-preserving: definitions unchanged; only import paths move. 1219 tests green.
First slice of promoting the shared agent toolkit out of the misnamed
`new_chat` package into the cross-agent `app/agents/shared` kernel.
`errors.py` is a leaf module (no intra-package deps) consumed by the
multi-agent chat, the chat streaming flows/monolith, and tests — i.e. it is
shared infrastructure, not single-agent code. Moved it verbatim to
`app.agents.shared.errors` and flipped all 12 importers. No re-export shim
remains since zero importers needed it.
Behavior-preserving: identical class/enum definitions; only the import path
changes. 1208 agent + chat-task tests green.
- Enhanced lambda function formatting in `_after_commit` for better clarity.
- Simplified generator expression in `_match_condition` for improved readability.
- Streamlined function signature in `_eligible` for consistency.
- Updated imports and refactored anonymous chat routes to use a new agent creation method.
- Added a new function `_load_anon_document` to handle document loading from Redis.
- Improved UI components by replacing legacy structures with modern alternatives, including alerts and separators.
- Refactored quota-related components to utilize new alert structures for better user feedback.
- Cleaned up unused variables and optimized component states for performance.
- Removed the eligibility gate for model selection in the automation creation process, allowing users to choose models directly in the builder.
- Updated the `AutomationBuilderForm` to incorporate model selection logic, ensuring that selected models are validated and preserved during automation creation and editing.
- Simplified the `AutomationsContent` and `AutomationNewContent` components by eliminating unnecessary eligibility checks and alerts.
- Enhanced the user experience by integrating model selection directly into the automation approval process, ensuring that only billable models are used.
- Refactored related tests to cover new model selection behavior and ensure proper validation of user-selected models.
A standalone, domain-agnostic pub/sub seam: an EventBus that owns its
subscriber registry and streams Event values from producers to listeners
in process. Boundary-crossing (Celery/DB/workers) is left to subscribers,
keeping the bus single-responsibility. Includes the immutable Event value
object and full unit coverage.
- Added model eligibility checks to ensure automations can only use billable models (premium or BYOK).
- Introduced new API endpoint to report model eligibility status for search spaces.
- Updated frontend components to display eligibility alerts and disable creation options when models are not billable.
- Enhanced automation creation forms to reflect model eligibility, preventing users from submitting invalid configurations.
- Implemented server-side logic to capture and preserve model preferences across automation edits, ensuring consistent behavior during execution.
- Deleted the `search_surfsense_docs` tool and its associated files, streamlining the agent's toolset.
- Updated various components and prompts to remove references to the now-removed tool, ensuring consistency across the codebase.
- Adjusted documentation to direct users to the SurfSense documentation link for product-related queries instead.
Top-level tests that span multiple submodules:
- test_stores.py (7): the trigger + action registry contracts — register
round-trip, unknown type → None (not raise), duplicate registration
rejected, defensive snapshot from all_*.
- test_definition_types.py (2): params_schema property on both
ActionDefinition and TriggerDefinition reflects the Pydantic model.
- test_persistence_enums.py (3): exact string values + member sets of
AutomationStatus / RunStatus / TriggerType — the postgres-mirrored
contract that breaks stored rows if drifted.
- test_import_registrations.py (2): the bundled agent_task action and
schedule trigger self-register on package import (canary for the
side-effect import chain).
conftest.py adds isolated_action_registry / isolated_trigger_registry
fixtures: snapshot + restore of the module-level _REGISTRY dicts so
tests that add their own definitions don't leak across the suite.
14 tests, pure unit.
auto_decide.build_auto_decisions (3): produces one decision per
action_request entry, defaults to one decision for legacy scalar
interrupts, and skips malformed interrupts silently so a misbehaving
tool can't take down the whole agent_task step.
finalize.extract_final_assistant_message (4): string-content AIMessage
returned verbatim, list-of-parts content concatenated (skipping
non-text parts like tool_use), walks back past trailing ToolMessages
to find the last AIMessage, and returns None when no extractable text
is present (so callers can branch on silence vs. empty).
7 tests, pure unit.
render.py (4): variable substitution, StrictUndefined raises on missing
keys, evaluate_predicate coerces to bool, render_value walks dicts/lists
and renders string leaves.
filters.py (4): slugify produces URL-safe output, date formats datetime
with strftime, date(None) → "" so templates can write
{{ inputs.last_fired_at | date }} on first run, date(str) passes through.
environment.py (4): the sandbox boundary — disallowed Jinja built-ins
(e.g. pprint) raise, and the finalize hook coerces non-string outputs
to predictable wire shapes (datetime → ISO, None → "", dict → JSON).
context.py (1): build_run_context exposes {run, inputs, steps} with the
exact shape every plan template body relies on.
13 tests total, all pure unit.
execute_step (6 tests): happy path, when=falsy → skipped, unknown action
→ ActionNotFound failure, retry budget exhaustion (attempts = 1 +
max_retries), retry recovery, and template-rendering of step params
against the run context.
with_retries (3 tests): first-try success returns attempts=1, recovery
returns the actual attempt that produced the result, and exhaustion
re-raises the last exception with the handler called 1 + max_retries
times.
All tests use backoff="none" to keep wall-clock time zero; timeout
testing is intentionally skipped (would need >= 1s per the int contract,
and exhaustion already locks that any Exception triggers retry).
Cover the input-validation contract dispatch_run relies on:
- no declared schema → inputs pass through unchanged (regression site
that previously stripped runtime keys like fired_at / last_fired_at
and broke Jinja templates).
- declared schema, valid inputs → passthrough validated.
- declared schema, invalid inputs → DispatchError (uniform exception
type, not raw jsonschema.ValidationError).
Plus the DispatchError exception identity (Exception subclass, message
preserved, isinstance-friendly for the dispatch layer's consumers).
4 tests, pure unit.