Commit graph

6109 commits

Author SHA1 Message Date
CREDO23
c0232fdcfe refactor(automations): park manual trigger pending Run-now redesign
Manual-as-a-standalone-trigger conflates "user clicks Run now" with the
trigger model and forces ad-hoc input plumbing on the caller. Remove the
unreachable surface so the tree reflects reality (schedule is the only
v1 trigger).

- Unregister `manual`: drop import from triggers/__init__.py
- Delete `app/automations/triggers/manual/`
- Drop `RunService.dispatch_manual` (RunService is now read-only)
- Drop `POST /automations/{id}/run` and `RunDispatched` schema
- Keep `TriggerType.MANUAL` Python + PG enum value (reserved, documented)
  to avoid an Alembic round-trip when Run-now is redesigned
2026-05-27 22:29:51 +02:00
CREDO23
8fb65d7188 fix(automations): use enum values not names for postgres enum columns 2026-05-27 21:53:07 +02:00
CREDO23
27ab367a13 feat(automations): static_inputs on triggers + vertical-slice api/services 2026-05-27 21:21:43 +02:00
CREDO23
84d99f19a2 automations(api): API request/response schemas 2026-05-27 19:10:20 +02:00
CREDO23
dd6bc30f98 move automations api into vertical slice with service layer 2026-05-27 18:56:16 +02:00
CREDO23
d84240a630 add schedule tick task and beat entry 2026-05-27 17:56:07 +02:00
CREDO23
3b1d7c4389 add cron-based schedule trigger 2026-05-27 17:56:02 +02:00
CREDO23
f08b316441 add next_fire_at to automation_triggers and croniter dep 2026-05-27 17:55:58 +02:00
CREDO23
861b91004d refactor(automations): extract dispatch_run; move manual adapter under triggers/manual/dispatch.py 2026-05-27 17:20:23 +02:00
CREDO23
8c32455818 refactor(automations): vertical-slice actions and triggers by domain 2026-05-27 17:07:20 +02:00
CREDO23
ce45e11009 feat(automations): wire agent_task to multi_agent_chat with auto-approve loop 2026-05-27 17:02:44 +02:00
CREDO23
7ec3468113 refactor(automations): bind action handlers via ActionContext factory 2026-05-27 16:29:32 +02:00
CREDO23
f646b5cbab feat(rbac): backfill automations permissions on existing roles 2026-05-27 15:37:25 +02:00
CREDO23
cfbe2a7fe0 feat(automations): expose POST /automations/{id}/run 2026-05-27 15:30:45 +02:00
CREDO23
3bb02d8889 feat(automations): add manual dispatch service 2026-05-27 15:30:41 +02:00
CREDO23
1366c8a711 feat(rbac): add automations permission family 2026-05-27 15:30:34 +02:00
CREDO23
b26bf0bbcf feat(automation): register automation run celery task 2026-05-27 15:02:36 +02:00
CREDO23
273b98f350 feat(automation): expose runtime package surface 2026-05-27 15:02:36 +02:00
CREDO23
d3cda12191 feat(automation): add automation run executor 2026-05-27 15:02:36 +02:00
CREDO23
0a329e5a69 feat(automation): add per-step execution 2026-05-27 15:02:36 +02:00
CREDO23
f71a02db2f feat(automation): add automation run repository 2026-05-27 15:02:36 +02:00
CREDO23
924a82c0b1 feat(automation): add retry policy helper 2026-05-27 15:02:36 +02:00
CREDO23
8b87d179e9 feat(automation): add recursive render_value to templating 2026-05-27 15:02:36 +02:00
CREDO23
cb42b3a84f feat(automation): add template run context builder 2026-05-27 14:23:18 +02:00
CREDO23
de6da1b775 feat(automation): add template render and predicate evaluation 2026-05-27 14:23:17 +02:00
CREDO23
8345e79f6d feat(automation): add sandboxed template environment 2026-05-27 14:23:17 +02:00
CREDO23
08e94ac5ca feat(automation): add custom template filters 2026-05-27 14:23:17 +02:00
CREDO23
b4e5bf95a4 feat(automation): add template filter and test allowlist 2026-05-27 14:23:17 +02:00
CREDO23
99fd1a1338 feat(automation): register agent_task action and schedule/manual triggers 2026-05-27 13:58:57 +02:00
CREDO23
56b3e1bfc4 refactor(automation): drop Block suffix from definition components 2026-05-27 13:48:41 +02:00
CREDO23
7f4c1c25ab feat(automation): wire SQLAlchemy relationships on both sides 2026-05-27 13:45:32 +02:00
CREDO23
7ac99b89a0 refactor(automation): drop Capability registry 2026-05-27 13:29:30 +02:00
CREDO23
9fa35f21cf refactor(automation): rename schema config to params, drop dead fields 2026-05-27 13:29:26 +02:00
CREDO23
c8a89ccac8 refactor(automation): rename trigger model config to params 2026-05-27 13:29:22 +02:00
CREDO23
fe32cd35ed refactor(automation): rename trigger config column to params 2026-05-27 13:29:18 +02:00
CREDO23
a4fbfd8c0d chore(automation): tighten run.py + envelope.py docstrings
Re-apply the trim style after the prior refactor commit re-introduced
a multi-line docstring on AutomationRun.

- AutomationRun: drop the four-line docstring explaining where
  per-step session ids live; move the note to a single-line inline
  comment right above ``step_results`` where it's actionable.
- AutomationDefinition: drop the design-plan cross-reference; the
  module docstring already establishes what the file is.

No behaviour change.
2026-05-27 11:45:04 +02:00
CREDO23
35117a952d refactor(automation): drop agent_session_id from AutomationRun
A run can contain zero, one, or N agent_task steps. A single
agent_session_id at the run level holds at most one of them, so the
column is the wrong shape for the data.

Per-step session ids (LangGraph thread/checkpoint reference for an
agent_task step) live inside step_results[i] alongside the rest of
the per-step bag (status, timings, output). Each agent step records
its own; non-agent steps record nothing. Run-level "primary session"
is a UI concern, not a schema concern.

Trade-off: trace -> run reverse lookup is now a JSONB query, not an
index hit. Usually traversal goes run -> trace; if the reverse
becomes hot we add a GIN index on step_results or a generated
column — both additive.
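
A minimal sketch of the shape described above, assuming a hypothetical
`agent_session_id` key inside each step result (the key names and the
dict-based run are illustrative, not the project's schema). It shows
why run -> trace is a direct read while trace -> run needs a scan; in
PostgreSQL the scan would likely become a JSONB containment query,
which a GIN index can serve.

```python
from typing import Any

# Hypothetical run record: key names are assumptions for illustration only.
run: dict[str, Any] = {
    "id": 1,
    "step_results": [
        {"status": "succeeded", "agent_session_id": "thread-a"},
        {"status": "succeeded"},  # non-agent step: records no session id
        {"status": "failed", "agent_session_id": "thread-b"},
    ],
}


def session_ids(run: dict[str, Any]) -> list[str]:
    """Forward traversal (run -> traces): read ids straight off the steps."""
    return [
        step["agent_session_id"]
        for step in run["step_results"]
        if "agent_session_id" in step
    ]


def find_runs_by_session(
    runs: list[dict[str, Any]], session_id: str
) -> list[dict[str, Any]]:
    """Reverse lookup (trace -> run): a full scan in this in-memory sketch."""
    return [r for r in runs if session_id in session_ids(r)]
```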

Changes:
- AutomationRun: drop the agent_session_id column; module docstring
  notes where per-step session ids now live.
- Migration 144: drop the column from the CREATE TABLE; downgrade
  unchanged.

Safe to edit migration 144 in place (vs. add 145 with ALTER ... DROP):
this branch has not shipped and the table has never existed in any
deployed database.
2026-05-27 11:41:32 +02:00
CREDO23
f0e00bd3ee chore(automation): trim docstrings to intent only
Cut the docstrings and Field(description=...) text across the entire
automations/ tree down to single-line intent statements, matching the
multi_agent_chat conciseness style:

- Module docstrings: one line stating what the file is.
- Class docstrings: deleted when the class name + module docstring
  already cover intent; kept only where they add a constraint or
  rationale not visible in the signature.
- Pydantic Field descriptions: short noun phrases / clauses, not
  full sentences. Reasoning that belonged in the design plan moved
  out of the code.
- Enum values: per-value docstrings replaced with terse inline
  comments where the meaning isn't obvious from the name.

Behaviour is unchanged. The same 33 files, same public surface, same
imports — verified by re-running the 10-point registry smoke test and
the 8-point schema round-trip / constraint suite from commits 9 and
10.

LOC: 1180 → 691 (-42%).
2026-05-26 23:01:22 +02:00
CREDO23
7a96c0e29c feat(automation): add empty Capability / Action / Trigger registries
Three registries under app/automations/registries/, each as its own
folder with the same SRP-per-file split (types.py for the dataclass,
store.py for the in-memory dict + register/get/all functions). All
three start empty; concrete entries land when the user signs off on
which capabilities / actions / triggers to include (step 2).

Capability (locked at v1-minimum five fields — see commit 2):
  - id, description, input_schema, output_schema, handler
  - CapabilityHandler = Callable[[dict[str, Any]], Awaitable[Any]]
  - Frozen, slotted dataclass (immutable post-registration).

ActionDefinition (v1-trim of design plan §4):
  - type, name, description, config_schema, handler
  - Defers output_contract (handled per-step by agent_task's
    config.output_schema), uses_capabilities (no static analysis
    needed until >1 action ships), and produces_artifacts (deferred
    alongside the artifact pipeline).

TriggerDefinition (declarative, no handler):
  - type, description, config_schema, payload_schema
  - No handler field — firing is a single dispatcher's
    responsibility, not a per-trigger one.

store.py contract for all three:
  - register_*: called at process startup, raises on duplicate
  - get_*: returns None on miss
  - all_*: returns a defensive copy of the registry dict

Verified by an inline smoke test (10 checks): empty initial state,
registration and lookup work, duplicates raise, frozen dataclasses
reject mutation, snapshots are copies, handlers are awaitable.

Isolation invariant audit: grep across the full app/automations/
tree shows only three app.* imports, all of them
``from app.db import BaseModel, TimestampMixin`` in the model files.
No imports from app.agents.*, app.services.*, app.tasks.*,
app.routes.*, or any other business-logic module.
2026-05-26 22:54:17 +02:00
CREDO23
be4d43d6c9 feat(automation): add Pydantic schemas for the automation definition
Three layers of Pydantic models under app/automations/schemas/, one
file per concern (SRP), matching the envelope in
automation-design-plan.md §5.

definition/ — the editable envelope persisted in
automations.definition:
  - envelope.py       AutomationDefinition (top-level shape)
  - plan_step.py      PlanStep (one step in the sequential plan)
  - inputs.py         InputsBlock (the inputs JSON Schema wrapper)
  - execution.py      ExecutionBlock (timeouts, retries, concurrency,
                                      budget cap, on_failure plan)
  - metadata.py       MetadataBlock (tags + created_from_nl + extras)
  - trigger_spec.py   TriggerSpec (one entry in triggers[])

triggers/ — per-trigger config schemas, dispatched by registry on the
TriggerSpec.type discriminator:
  - schedule.py       ScheduleTriggerConfig(cron, timezone)
  - manual.py         ManualTriggerConfig() — empty in v1

actions/ — per-action config schemas, dispatched by registry on the
PlanStep.action discriminator:
  - agent_task.py     AgentTaskActionConfig(prompt, tools, model,
                                            output_schema)

Design properties verified by an inline smoke test:
  - The §5 worked example round-trips through model_validate_json /
    model_dump_json byte-for-byte (InputsBlock uses
    serialize_by_alias so the JSON key stays "schema" not
    "schema_").
  - Envelope rejects unknown top-level keys (extra="forbid").
  - MetadataBlock tolerates unknown keys (extra="allow").
  - ExecutionBlock defaults apply when the block is omitted.
  - retry_backoff and concurrency are typed as Literal — bogus
    values rejected at validation time.
  - Per-type configs enforce their required fields (cron + timezone
    on schedule; non-empty prompt on agent_task).

The envelope keeps trigger and action configs as untyped dicts on
purpose — per-type validation is a registry-driven dispatch (commit
10), keeping the envelope free of every-type-knows-every-type
coupling.
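
A rough, stdlib-only sketch of the registry-driven dispatch idea: the
envelope carries `type` plus an untyped config dict, and a lookup
table picks the per-type validator. The real schemas are Pydantic
models; the validator functions, `TRIGGER_VALIDATORS`, and
`validate_trigger_spec` below are hypothetical stand-ins.

```python
from typing import Any, Callable

ConfigValidator = Callable[[dict[str, Any]], dict[str, Any]]


def validate_schedule(config: dict[str, Any]) -> dict[str, Any]:
    """Schedule triggers require both cron and timezone."""
    missing = [key for key in ("cron", "timezone") if not config.get(key)]
    if missing:
        raise ValueError(f"schedule config missing: {', '.join(missing)}")
    return config


def validate_manual(config: dict[str, Any]) -> dict[str, Any]:
    """Manual triggers have no required fields in v1."""
    return config


# Lookup table keyed on the TriggerSpec.type discriminator (hypothetical).
TRIGGER_VALIDATORS: dict[str, ConfigValidator] = {
    "schedule": validate_schedule,
    "manual": validate_manual,
}


def validate_trigger_spec(spec: dict[str, Any]) -> dict[str, Any]:
    """Dispatch on spec['type']; the envelope never imports per-type schemas."""
    validator = TRIGGER_VALIDATORS.get(spec["type"])
    if validator is None:
        raise ValueError(f"unknown trigger type: {spec['type']!r}")
    return validator(spec.get("config", {}))
```

Adding a new trigger type then means adding one table entry; the
envelope code stays unchanged.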
2026-05-26 22:50:52 +02:00
CREDO23
d9183464d9 feat(automation): add Alembic migration for the three automation tables
Migration 143 -> 144. Matches the SQLAlchemy models added in commit 7
and the v1 data model in automation-design-plan.md §9.

Up:
  - CREATE TYPE automation_status / automation_trigger_type /
    automation_run_status (PostgreSQL ENUMs created first because the
    tables reference them).
  - CREATE TABLE automations with FK to searchspaces (CASCADE) and
    user (SET NULL); five indexes matching the SQLAlchemy model.
  - CREATE TABLE automation_triggers with FK to automations
    (CASCADE); four indexes.
  - CREATE TABLE automation_runs with FK to automations (CASCADE) and
    automation_triggers (SET NULL — null trigger_id == manual via UI);
    four indexes.

Down: drops every index, table, and ENUM in reverse-dependency order
so the migration is reversible without ON DELETE side effects.

Verified: `alembic history` resolves 143 -> 144 (head) cleanly.

domain_events (Phase 3) and mcp_connections / mcp_tools (Phase 4) ship
in their own migrations when the consuming feature lands; this
migration only covers the three v1 tables.
2026-05-26 22:44:33 +02:00
CREDO23
05931375f4 feat(automation): add SQLAlchemy models for the three v1 tables
Three enums (one file each) plus three models (one file each), all
under app/automations/persistence/. The module imports from app.db
only (Base/BaseModel/TimestampMixin and FK targets searchspaces.id /
user.id); no business-logic imports.

Enums:
  - AutomationStatus: active | paused | archived
  - RunStatus: pending | running | succeeded | failed | cancelled
    | timed_out
  - TriggerType: schedule | manual (Phase-2/3 add webhook | event)

Models:
  - Automation: search_space-scoped, created_by_user_id (SET NULL),
    name + description, status enum, definition JSONB, version int,
    updated_at with onupdate.
  - AutomationTrigger: FK → automations (CASCADE), type enum, config
    JSONB, enabled bool, last_fired_at. Webhook secret_hash is omitted
    until Phase 2.
  - AutomationRun: FK → automations (CASCADE), nullable trigger_id
    (SET NULL — null = manual via UI), status enum,
    definition_snapshot for immutable history, trigger_payload /
    resolved_inputs / step_results / output / artifacts / error JSONB
    columns, started_at / finished_at timestamps, agent_session_id for
    linking to the LangGraph trace. cost_usd column omitted until at
    least one v1 capability records token-level cost.

Verified: Base.metadata exposes all three table names; columns and
enums introspect as documented; no linter errors.
2026-05-26 22:42:50 +02:00
CREDO23
113748dfd5 feat(automation): scaffold isolated module structure
Create app/automations/ with the SRP-per-file / grouped-folders layout
that mirrors app/agents/multi_agent_chat/. Twelve __init__.py files,
each with a single-line docstring describing the subpackage's role and
no exports yet (re-exports are filled in by subsequent commits).

Tree:
  app/automations/
  ├── persistence/
  │   ├── enums/      (status / type enums; one per file)
  │   └── models/     (SQLAlchemy tables; one per file)
  ├── schemas/
  │   ├── definition/ (the JSON envelope, broken by concern)
  │   ├── triggers/   (per-trigger config schemas)
  │   └── actions/    (per-action config schemas)
  └── registries/
      ├── capabilities/  (types.py + store.py)
      ├── actions/       (types.py + store.py)
      └── triggers/      (types.py + store.py)

The persistence/ folder is named to avoid surfsense_backend/.gitignore's
data/ ignore rule, which silently masked the original data/ name and
its contents from version control.

Isolation invariant: the module imports only from app.db (foundational
Base + FK targets, unavoidable) and stdlib / SQLAlchemy / Pydantic.
No imports from app.agents.*, app.services.*, app.tasks.*, app.routes.*
or any other business-logic module. Confirmed importable with no side
effects.
2026-05-26 22:39:58 +02:00
CREDO23
db8c472664 docs(automation): narrow v1 data model + Phase 1 scope
§9 (Data model): drop from six tables to three. v1 ships automations,
automation_triggers, automation_runs only. domain_events deferred to
Phase 3 (event trigger); mcp_connections/mcp_tools deferred to Phase 4
(MCP integration). Remove the table definitions for the deferred ones
and replace with a deferred-tables note pointing to the consuming
phase.

automation_triggers.type enum narrowed to schedule|manual for v1.
Webhook and event types ship with their respective phases. secret_hash
column deferred to Phase 2 alongside the webhook trigger.

automation_runs.cost_usd column deferred until at least one v1
capability records token-level cost — additive when reintroduced.

§14 (Phase 1) reorganized into four explicit steps matching the work
we're about to do: scaffolding + schemas + empty registries (step 1),
then registry population (step 2), then executor (step 3), then NL
authoring + UI (step 4). The current commit batch lands step 1 only.
2026-05-26 22:37:05 +02:00
CREDO23
144d702c35 docs(automation): defer credentials, cost, queue-routing, side-effects
Update §3 (Credentials), §7.1 (Dispatcher common path), §8 (Duration
classes and queue routing), and §13 (Decisions locked) to reflect the
v1-minimum scope:

- Credentials block in §3 collapses to a deferred-to-Phase-2 note. The
  three guarantees (no creds in definition, no creds in LLM context,
  per-call resolution) return unchanged when Phase 2 ships external
  capabilities.
- Cost-estimate pre-check in the dispatcher's common path is removed.
  Mid-flight budget kill in the executor still enforces budget_cap_usd.
- Queue routing by expected_duration_seconds is deferred. Single
  automations_default queue in v1.
- Decisions 24, 25, 26, 32-37, 38-41 marked deferred with explicit
  return phase. Three new v1-minimum decisions added (5-field
  Capability, measured-not-declared cost, single queue).

All deferrals are additive: the original designs return as-is when
warranted; nothing is rewritten between phases.
2026-05-26 22:35:37 +02:00
CREDO23
b029c090bd docs(automation): defer MCP integration to Phase 4
Remove the two-tier registry, MCP database schema, harvester pseudocode,
and the lazy per-worker closure cache from §3. v1 ships with a single
in-memory native registry; the MCP design is reintroduced in Phase 4
along with the rest of the integration-tooling surface.

The deferral is additive: the v1 registry interface is the same callable
surface a Phase-4 MCP harvester will register into. No design rewrite
between phases.
2026-05-26 22:34:03 +02:00
CREDO23
16b6618629 docs(automation): trim Capability dataclass to v1-minimum
Reduce the §3 Capability dataclass from ten fields to five:
id, description, input_schema, output_schema, handler. Removed
fields (name, required_credentials, side_effects,
expected_duration_seconds, cost_estimate) are reintroduced only when a
concrete consumer feature demands them. The v1 invariant is that a
Capability is a typed, named, callable unit and every consumer
(executor, agent tool layer, future HTTP API) sees the same five-field
shape.
2026-05-26 22:33:10 +02:00
CREDO23
123f0d3b5d docs(automation): add v2 design plan baseline
Track the initial v2 design document for the SurfSense automation feature.
This is the baseline snapshot of the design before applying the v1-minimum
scope narrowing (capability trimming, MCP deferral, queue-routing deferral).
Subsequent commits trim this down to the v1 scope.
2026-05-26 22:30:21 +02:00
CREDO23
cfdad85058 test(chat): add parity tests for streaming/flows/ parallel refactor
Adds 34 tests under tests/unit/tasks/chat/streaming/ that check the
new flows tree for parity with the legacy stream_new_chat.py module,
gating the upcoming cutover. Coverage:

* Public entry points: stream_new_chat and stream_resume_chat are
  async generator functions whose parameter signatures (name, kind,
  annotation, default) match the legacy versions one-for-one. Uses a
  normalized-annotation comparison so PEP-563 vs eager-annotation
  representation differences are tolerated.
* Extracted helpers: image-capability gate, runtime-context builders
  for new-chat and resume-chat, LLM-bundle dispatcher, premium-quota
  needs check + reservation dataclass, rate-limit recovery truth
  table, persistence-spawn registration/self-unregistration, await
  helpers.
* SSE frame iterators: iter_initial_frames + iter_final_frames emit
  the canonical sequence; iter_token_usage_frame skips on None.
* Initial thinking step: 4 parametrized branches (text, image-only,
  empty, mentioned-docs), long-query truncation, many-docs collapse.

These tests are scaffolding for the cutover and will be removed once
the legacy module is deleted.
2026-05-25 21:50:18 +02:00
CREDO23
cf0085575c refactor(chat): add streaming/flows/resume_chat/orchestrator + flows public API
Slim composition root for the resume-chat streaming flow. Mirrors the
new_chat orchestrator but specialized for resumed turns:

* no fresh user turn, no title generation, no image-capability gate
* persists a fresh assistant shell for the resumed turn
* applies build_resume_routing to dispatch user decisions to the
  correct paused subagent before invoking the agent
* shares the same stream_loop + flow-local _recover closure for
  in-stream provider rate-limit recovery

Also lands flows/__init__.py, which becomes the public chat-flow API:

    from app.tasks.chat.streaming.flows import stream_new_chat, stream_resume_chat

Existing wiring (routes, contract test) still imports from the legacy
app.tasks.chat.stream_new_chat module. Cutover is the next phase.
2026-05-25 21:50:09 +02:00