A standalone, domain-agnostic pub/sub seam: an EventBus that owns its
subscriber registry and streams Event values from producers to listeners
in process. Boundary-crossing (Celery/DB/workers) is left to subscribers,
keeping the bus single-responsibility. Includes the immutable Event value
object and full unit coverage.
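
A minimal sketch of the shape described above, assuming a synchronous
in-process bus. The field names on `Event` and the `subscribe`/`publish`
method names are illustrative, not taken from the actual module:

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Event:
    """Immutable value object carried by the bus (fields are hypothetical)."""

    name: str
    payload: tuple[tuple[str, Any], ...] = ()


Listener = Callable[[Event], None]


class EventBus:
    """In-process pub/sub; the bus owns its own subscriber registry."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Listener]] = {}

    def subscribe(self, name: str, listener: Listener) -> None:
        self._subscribers.setdefault(name, []).append(listener)

    def publish(self, event: Event) -> int:
        # Copy the list so a listener may subscribe during delivery.
        listeners = list(self._subscribers.get(event.name, ()))
        for listener in listeners:
            listener(event)
        return len(listeners)
```

A subscriber that needs to cross a process boundary (for example by
enqueueing a Celery task) would do so inside its own listener, which keeps
the bus itself free of transport concerns.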
- Deleted the `search_surfsense_docs` tool and its associated files, streamlining the agent's toolset.
- Updated various components and prompts to remove references to the now-removed tool, ensuring consistency across the codebase.
- Adjusted documentation to direct users to the SurfSense documentation link for product-related queries instead.
- Added support for @-mentions in agent tasks, allowing users to reference documents, folders, and connectors directly in their queries.
- Updated `run_agent_task` to resolve mentions and include them in the context passed to the agent.
- Introduced new parameters in `AgentTaskActionParams` for handling mentioned document and connector IDs.
- Refactored the automation `edit` and `new` components to use the new `AutomationBuilderForm` for a more streamlined user experience.
- Removed deprecated JSON forms to simplify the automation creation process.
The shared AsyncPostgresSaver caches DB connections in a module-level
pool. Cached connections are bound to the asyncio loop that opened
them, but `run_async_celery_task` discards the loop on each task's
exit — so after the first task the pool holds connections pointing
to a dead loop, and the next automation hangs 30s before failing
with `PoolTimeout: couldn't get a connection after 30.00 sec`.
Swap agent_task to `InMemorySaver`; automation runs only need state
within one Celery task, so nothing is lost. Site-local TODO tracks
the proper future fix (dispose the checkpointer pool around each
Celery task, mirroring `_dispose_shared_db_engine`).
Closes the create loop in chat: the agent describes user intent → the
drafter sub-LLM produces an AutomationCreate JSON → this card surfaces
a structured preview → approve persists; reject cancels. Edits flow
through chat refinement (re-call with a refined intent), not in-card,
so the card stays simple and the multi-turn checkpointer carries the
context.
Tool UI (components/tool-ui/automation/):
- create-automation.tsx — entry dispatcher + ApprovalCard chrome
(pending/processing/complete/rejected via useHitlPhase) + SavedCard
(links to the detail page) + InvalidCard (lists drafter validation
issues) + ErrorCard (verbatim message). Rejection result is hidden
because the approval card itself shows the rejected phase inline.
- automation-draft-preview.tsx — structured preview body: name +
description + goal, triggers (humanised cron + tz + static-input
keys), plan steps (step_id → action), and a collapsible raw JSON
for power users.
Wiring:
- components/tool-ui/index.ts — re-export.
- features/chat-messages/timeline/tool-registry/registry.ts —
register create_automation → CreateAutomationToolUI (dynamic import,
same pattern as other connector tools).
- contracts/enums/toolIcons.tsx — Workflow icon + "Create automation"
display name so fallback chrome (and timeline headers) are honest.
Shared util:
- lib/automations/describe-cron.ts — lifted from the route slice's
lib/ folder since both the dashboard slice and the new approval card
now render schedule descriptions. Slice imports updated; the now-
empty slice lib/ folder is gone.
Backend prompt fragments:
- main_agent/system_prompt/.../create_automation/description.md and
the tool's docstring no longer promise in-card edits. They make the
refinement path explicit: if the user wants changes after seeing the
draft, they reply in chat and the agent calls the tool again with a
refined intent.
v1 deliberately excludes:
- In-card edit form / right-side edit panel — defer until we see real
demand. The chat refinement loop covers the common case.
- approve_always / persistent allow rules — automations are a single
artifact, not a repeated mutation, so the "trust this kind of call"
affordance doesn't apply.
Backend already defined automations:create/read/update/delete/execute and
seeded them on Owner/Editor/Viewer roles, but the Settings → Roles UI was
missing the metadata to render them properly.
- backend: add PERMISSION_DESCRIPTIONS entries for the 5 automations perms so
the role editor stops falling back to "Permission for automations:create".
- frontend: add automations to CATEGORY_CONFIG (Workflow icon, slotted between
podcasts and connectors) so the role editor groups them as a real section.
- frontend: extend the three ROLE_PRESETS — Editor and Contributor get
create/read/update/execute (mirroring backend Editor); Viewer gets read.
Prep work for the automations frontend; canPerform/usePermissionGate already
handle the runtime gating, so no new hook is needed.
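
For illustration, the backend fallback behaves roughly like the sketch below.
The description strings and the `describe_permission` helper are
hypothetical; only the permission keys and the fallback wording come from
this change:

```python
# Hypothetical description texts; the real entries may be worded differently.
PERMISSION_DESCRIPTIONS: dict[str, str] = {
    "automations:create": "Create new automations",
    "automations:read": "View automations and their runs",
    "automations:update": "Edit existing automations",
    "automations:delete": "Delete automations",
    "automations:execute": "Run automations",
}


def describe_permission(permission: str) -> str:
    """Return the curated description, or the generic fallback if none exists."""
    return PERMISSION_DESCRIPTIONS.get(permission, f"Permission for {permission}")
```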
Single tool exposed to the main agent. The main agent passes a natural-language
`intent`; a focused drafter sub-LLM turns it into a full AutomationCreate JSON;
that JSON is surfaced via request_approval (action_type "automation_create") so
the user can edit/approve it on a frontend card; on approval the tool persists
via AutomationService. Three phases, one tool call.
Scope split:
- main agent sees only `intent: str` (no schema knowledge leaks into the calling
graph) — prompt fragments scoped accordingly.
- drafter sub-LLM owns the schema + few-shot intent→JSON examples — lives in
the generating graph's prompt (tools/automation/prompt.py).
Files:
- main_agent/tools/automation/{create.py, prompt.py, __init__.py}: new tool
+ drafter system prompt with two few-shot intent→JSON examples.
- system_prompt/prompts/tools/create_automation/{description.md, example.md}:
intent-only guidance for the main agent.
- main_agent/tools/index.py: add create_automation to the main-agent allowlist.
- new_chat/tools/registry.py: deferred-import factory to break the
multi_agent_chat ↔ registry cycle; one ToolDefinition entry.
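
The three phases can be sketched as a single coroutine. This is a simplified
outline under stated assumptions: the `draft`, `request_approval`, and
`persist` callables are injected placeholders for the drafter sub-LLM, the
approval request, and AutomationService, and the returned dict shape is
hypothetical:

```python
import asyncio
from typing import Any, Awaitable, Callable

Draft = dict[str, Any]


async def create_automation(
    intent: str,
    *,
    draft: Callable[[str], Awaitable[Draft]],
    request_approval: Callable[[str, Draft], Awaitable[bool]],
    persist: Callable[[Draft], Awaitable[int]],
) -> dict[str, Any]:
    # Phase 1: the drafter turns the natural-language intent into a payload.
    payload = await draft(intent)
    # Phase 2: surface the payload to the user and wait for a decision.
    approved = await request_approval("automation_create", payload)
    if not approved:
        return {"status": "rejected"}
    # Phase 3: persist only after approval.
    automation_id = await persist(payload)
    return {"status": "saved", "automation_id": automation_id}
```

The caller only ever supplies `intent`, so the schema stays confined to the
drafter.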
- Added new environment variables for controlling task execution limits, including `SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS`, `SURFSENSE_TASK_BATCH_CONCURRENCY`, and `SURFSENSE_TASK_BATCH_MAX_SIZE`.
- Updated documentation to reflect new batch processing capabilities for `task` calls, allowing for concurrent execution of multiple subagent tasks.
- Improved error handling and receipt generation for deliverables, ensuring consistent feedback on task status.
- Refactored middleware to incorporate search space ID for better task management.
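
One way these limits could fit together, as a sketch only. The default
values, the `run_batch` helper, and the exact meaning of each variable are
assumptions inferred from the variable names, not the shipped behaviour:

```python
import asyncio
import os
from typing import Any, Awaitable, Callable


def _env_number(name: str, default: float) -> float:
    # Defaults passed by callers below are placeholders, not the real ones.
    return float(os.environ.get(name, default))


async def run_batch(
    tasks: list[Any],
    run_one: Callable[[Any], Awaitable[Any]],
) -> list[Any]:
    max_size = int(_env_number("SURFSENSE_TASK_BATCH_MAX_SIZE", 8))
    concurrency = int(_env_number("SURFSENSE_TASK_BATCH_CONCURRENCY", 4))
    timeout = _env_number("SURFSENSE_SUBAGENT_INVOKE_TIMEOUT_SECONDS", 300)

    if len(tasks) > max_size:
        raise ValueError(f"batch of {len(tasks)} exceeds limit {max_size}")

    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(task: Any) -> Any:
        # At most `concurrency` subagent calls run at once, each with a timeout.
        async with semaphore:
            return await asyncio.wait_for(run_one(task), timeout)

    # Exceptions (including timeouts) are returned in place, so one failed
    # task does not discard the results of the others.
    return await asyncio.gather(*(guarded(t) for t in tasks), return_exceptions=True)
```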
Manual-as-a-standalone-trigger conflates "user clicks Run now" with the
trigger model and forces ad-hoc input plumbing on the caller. Remove the
unreachable surface so the tree reflects reality (schedule is the only
v1 trigger).
- Unregister `manual`: drop import from triggers/__init__.py
- Delete `app/automations/triggers/manual/`
- Drop `RunService.dispatch_manual` (RunService is now read-only)
- Drop `POST /automations/{id}/run` and `RunDispatched` schema
- Keep `TriggerType.MANUAL` Python + PG enum value (reserved, documented)
to avoid an Alembic round-trip when Run-now is redesigned
Re-apply the trim style after the prior refactor commit re-introduced
a multi-line docstring on AutomationRun.
- AutomationRun: drop the four-line docstring explaining where
per-step session ids live; move the note to a single-line inline
comment right above ``step_results`` where it's actionable.
- AutomationDefinition: drop the design-plan cross-reference; the
module docstring already establishes what the file is.
No behaviour change.
A run can contain zero, one, or N agent_task steps. A single
agent_session_id at the run level holds at most one of them, so the
column is the wrong shape for the data.
Per-step session ids (LangGraph thread/checkpoint reference for an
agent_task step) live inside step_results[i] alongside the rest of
the per-step bag (status, timings, output). Each agent step records
its own; non-agent steps record nothing. Run-level "primary session"
is a UI concern, not a schema concern.
Trade-off: trace -> run reverse lookup is now a JSONB query, not an
index hit. Usually traversal goes run -> trace; if the reverse
becomes hot we add a GIN index on step_results or a generated
column — both additive.
Changes:
- AutomationRun: drop the agent_session_id column; module docstring
notes where per-step session ids now live.
- Migration 144: drop the column from the CREATE TABLE; downgrade
unchanged.
Safe to edit migration 144 in place (vs. add 145 with ALTER ... DROP):
this branch has not shipped and the table has never existed in any
deployed database.
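
To make the new shape concrete, here is an in-memory sketch of the reverse
lookup. The `agent_session_id` key inside each step result and the dict
layout of a run are assumptions for illustration; in the database the same
question becomes a JSONB query over step_results:

```python
from typing import Any

Run = dict[str, Any]


def runs_for_session(runs: list[Run], session_id: str) -> list[int]:
    """Return ids of runs where any step recorded the given session id."""
    return [
        run["id"]
        for run in runs
        if any(
            step.get("agent_session_id") == session_id
            for step in run["step_results"]
        )
    ]


runs = [
    # Two agent_task steps, each with its own session id.
    {"id": 1, "step_results": [
        {"status": "ok", "agent_session_id": "t-1"},
        {"status": "ok", "agent_session_id": "t-2"},
    ]},
    # A non-agent step records no session id at all.
    {"id": 2, "step_results": [{"status": "ok"}]},
]
```

With this data, `runs_for_session(runs, "t-2")` yields `[1]`, something a
single run-level column could not represent for both steps of run 1.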
Cut the docstrings and Field(description=...) text across the entire
automations/ tree down to single-line intent statements, matching the
multi_agent_chat conciseness style:
- Module docstrings: one line stating what the file is.
- Class docstrings: deleted when the class name + module docstring
already cover intent; kept only where they add a constraint or
rationale not visible in the signature.
- Pydantic Field descriptions: short noun phrases / clauses, not
full sentences. Reasoning that belonged in the design plan moved
out of the code.
- Enum values: per-value docstrings replaced with terse inline
comments where the meaning isn't obvious from the name.
Behaviour is unchanged. The same 33 files, same public surface, same
imports — verified by re-running the 10-point registry smoke test and
the 8-point schema round-trip / constraint suite from commits 9 and
10.
LOC: 1180 → 691 (-42%).
Three registries under app/automations/registries/, each as its own
folder with the same SRP-per-file split (types.py for the dataclass,
store.py for the in-memory dict + register/get/all functions). All
three start empty; concrete entries land when the user signs off on
which capabilities / actions / triggers to include (step 2).
Capability (locked at v1-minimum five fields — see commit 2):
- id, description, input_schema, output_schema, handler
- CapabilityHandler = Callable[[dict[str, Any]], Awaitable[Any]]
- Frozen, slotted dataclass (immutable post-registration).
ActionDefinition (v1-trim of design plan §4):
- type, name, description, config_schema, handler
- Defers output_contract (handled per-step by agent_task's
config.output_schema), uses_capabilities (no static analysis
needed until >1 action ships), and produces_artifacts (deferred
alongside the artifact pipeline).
TriggerDefinition (declarative, no handler):
- type, description, config_schema, payload_schema
- No handler field — firing is a single dispatcher's
responsibility, not a per-trigger one.
store.py contract for all three:
- register_*: called once per entry at process startup, raises on duplicate
- get_*: returns None on miss
- all_*: returns a defensive copy of the registry dict
Verified by an inline smoke test (10 checks): empty initial state,
registration and lookup work, duplicates raise, frozen dataclasses
reject mutation, snapshots are copies, handlers are awaitable.
Isolation invariant audit: grep across the full app/automations/
tree shows only three app.* imports, all of them
``from app.db import BaseModel, TimestampMixin`` in the model files.
No imports from app.agents.*, app.services.*, app.tasks.*,
app.routes.*, or any other business-logic module.