docs(automation): defer credentials, cost, queue-routing, side-effects

Update §3 (Credentials), §7.1 (Dispatcher common path), §8 (Duration
classes and queue routing), and §13 (Decisions locked) to reflect the
v1-minimum scope:

- Credentials block in §3 collapses to a deferred-to-Phase-2 note. The
  three guarantees (no creds in definition, no creds in LLM context,
  per-call resolution) return unchanged when Phase 2 ships external
  capabilities.
- Cost-estimate pre-check in the dispatcher's common path is removed.
  Mid-flight budget kill in the executor still enforces budget_cap_usd.
- Queue routing by expected_duration_seconds is deferred. Single
  automations_default queue in v1.
- Decisions 24, 25, 26, 32-37, 38-41 marked deferred with explicit
  return phase. Three new v1-minimum decisions added (5-field
  Capability, measured-not-declared cost, single queue).

All deferrals are additive: the original designs return as-is when
warranted; nothing is rewritten between phases.
2026-07-16 23:01:06 +02:00 · 2026-05-26 22:35:37 +02:00 · 2026-05-26 22:35:37 +02:00 · 144d702c35
commit 144d702c35
parent b029c090bd
1 changed file with 56 additions and 66 deletions
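
Two of the v1-minimum decisions in this commit (the five-field `Capability` and cost that is measured rather than declared, with a mid-flight budget kill) can be illustrated with a rough sketch. This is not the project's implementation. The field names come from the commit, but `RunLedger`, `BudgetExceeded`, `execute`, and the convention that a handler returns `(output, cost_usd)` are assumptions made only for illustration. The real handlers may well be async and take a context argument.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable, Tuple

# ASSUMPTION: a handler is synchronous and returns (output, measured cost in USD).
Handler = Callable[[dict], Tuple[Any, float]]


@dataclass(frozen=True)
class Capability:
    """Exactly the five fields named in the v1-minimum lock."""
    id: str
    description: str
    input_schema: dict
    output_schema: dict
    handler: Handler


class BudgetExceeded(RuntimeError):
    """Hypothetical error: accumulated measured cost crossed the cap."""


@dataclass
class RunLedger:
    """Hypothetical per-run ledger; cost is recorded after each step runs."""
    budget_cap_usd: float
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd
        if self.spent_usd > self.budget_cap_usd:
            raise BudgetExceeded(
                f"spent {self.spent_usd:.2f} USD, cap {self.budget_cap_usd:.2f} USD"
            )


def execute(steps: Iterable[Tuple[Capability, dict]], ledger: RunLedger) -> list:
    """Run steps in order; the ledger aborts the run once the cap is crossed."""
    outputs = []
    for capability, args in steps:
        output, cost_usd = capability.handler(args)
        ledger.record(cost_usd)  # no pre-flight estimate, only measured cost
        outputs.append(output)
    return outputs
```

Under these assumptions there is nothing to estimate before the run starts: the cap is enforced only against what steps have actually cost, which matches the removal of the pre-check from the dispatcher while the executor keeps the kill switch.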
--- a/automation-design-plan.md
+++ b/automation-design-plan.md
@@ -138,47 +138,24 @@ See archived design at `docs/automation/archived/mcp-registry.md` once
 v1 ships; for now the only consumer of the registry is the in-memory
 native path.

-### Credentials: resolved at the moment of use
+### Credentials — deferred to Phase 2

-The handler doesn't carry credentials and the closure doesn't capture them.
-When invoked, the handler asks `ActionContext` for what it needs:
+The earlier per-call credential resolution pattern (`ctx.resolve_mcp_client`,
+`ctx.resolve_http_client`, `ctx.resolve_llm`) is **deferred to Phase 2**.
+v1 capabilities run server-side using app-level configuration; none of
+the seven v1 capabilities needs per-user or per-connection auth.

-```python
-def make_mcp_handler(connection_id: UUID, tool_name: str):
-    async def handler(ctx: ActionContext, args: dict) -> Any:
-        # Credential resolution happens here, per call
-        client = await ctx.resolve_mcp_client(connection_id)
-        response = await client.call_tool(name=tool_name, arguments=args)
-        return response.content
-    return handler
-```
+When Phase 2 ships external-credential capabilities (Slack, email, etc.),
+the three guarantees the original design promised are reintroduced
+unchanged:

-`ctx.resolve_mcp_client(connection_id)`:
-1. Loads the `mcp_connections` row
-2. Decrypts the access token
-3. Refreshes the token if it's expired (using the refresh token)
-4. Constructs an `MCPClient` with the token set as a default authorization
-   header
+- Credentials never appear in the automation definition (connection IDs
+  only).
+- Credentials never appear in the LLM's context (the host holds them
+  and uses them on the LLM's behalf when executing tool calls).
+- Credentials are loaded per-call, not pre-loaded into worker memory.

-The HTTP library carries the auth header on every subsequent call the
-client makes — the handler doesn't think about it after construction.
-
-For native capabilities calling external APIs directly,
-`ctx.resolve_http_client(provider)` returns an authenticated `httpx`
-client. For LLM operations, `ctx.resolve_llm(provider)` returns a
-configured LLM client. **Three resolution methods, one pattern: the
-context returns a client already authenticated.**
-
-Three properties this gives us:
-
-- **Credentials never appear in the automation definition.** The JSON
-  contains capability references and connection IDs, never tokens.
-- **Credentials never appear in the LLM's context.** Even during
-  `agent_task`, the LLM sees tool descriptions only; the host holds
-  credentials and uses them when executing the tools the LLM requests.
-- **Credentials are loaded per-call, not pre-loaded.** The credential
-  exists in memory only during the moment a handler is making a call. No
-  long-lived secrets in worker memory.
+The Phase-2 design returns as-is; only the v1 surface is simplified.

 ---

@@ -504,12 +481,18 @@ event, evaluates all matching triggers' filters, fires the matches.
 Common path (after a trigger has fired):
 1. Resolve `inputs` from trigger payload and defaults
 2. Validate resolved inputs against the automation's input schema
-3. **Cost estimate** — sum capabilities' `cost_estimate(args)` for the plan;
-   refuse if exceeds `budget_cap_usd`
-4. **Idempotency check** — dedup against existing pending/running runs
-5. **Snapshot the resolved definition** into the run row (immutable history)
-6. Enqueue executor task on the appropriate Celery queue (per
-   `expected_duration_seconds`)
+3. **Idempotency check** — dedup against existing pending/running runs
+4. **Snapshot the resolved definition** into the run row (immutable history)
+5. Enqueue executor task on the single `automations_default` Celery queue
+
+The cost-estimate pre-check (originally step 3) is **deferred**.
+v1 capabilities do not declare `cost_estimate`; pre-flight budgeting
+returns when a historical-cost ledger exists. The mid-flight budget
+cap (§7.2) still kills the run if accumulated cost crosses
+`budget_cap_usd`.
+
+Queue routing by `expected_duration_seconds` is **deferred** until load
+patterns justify a second queue. v1 uses a single queue.

 ### 7.2 Executor

@@ -801,16 +784,18 @@ The engine handles storage (writes to SurfSense's existing object storage),
 URL generation (signed, scoped to the run's permissions), and cleanup (a
 nightly Celery Beat task deletes expired artifacts).

-### Duration classes and queue routing
+### Duration classes and queue routing — deferred

-Capabilities declare `expected_duration_seconds`. The dispatcher routes
-runs to Celery queues based on the longest-duration step:
-- < 10s → `automations_fast`
-- 10s – 5min → `automations_medium`
-- 5min – 1hr → `automations_long`
+The original design routed runs to multiple Celery queues based on each
+capability's declared `expected_duration_seconds`. v1 ships with **one
+queue** (`automations_default`) and capabilities do not declare a
+duration. Multi-queue routing returns when burst load on a single queue
+actually justifies the operational complexity of independent worker
+pools.

-Operators scale each queue's worker pool independently. A future "very
-long" queue is a config change, not a contract change.
+Adding the second queue is a config change plus reintroducing
+`expected_duration_seconds` on the `Capability` dataclass — both
+mechanical, additive, and free of design rewrite.

 ---

@@ -1210,9 +1195,9 @@ place.

 ### Components
 23. ✅ Dispatcher / executor / handlers / registry — distinct, each replaceable
-24. ✅ Side effects are a set, including `USER_VISIBLE`
-25. ✅ `expected_duration_seconds` integer drives queue routing
-26. ✅ `produces_artifacts` is a list of `ArtifactSpec`, not a bool
+24. ⏸ Side effects are a set, including `USER_VISIBLE` — **deferred** until multi-user automation RBAC ships
+25. ⏸ `expected_duration_seconds` integer drives queue routing — **deferred** until a second Celery queue is needed
+26. ⏸ `produces_artifacts` is a list of `ArtifactSpec`, not a bool — **deferred** until artifacts beyond the deliverable handlers' own persistence are needed
 27. ✅ Output schemas recommended on `agent_task`; editor warns when missing

 ### Event bus
@@ -1220,20 +1205,25 @@ place.
 29. ✅ Automations publish run events for composability
 30. ✅ Publish/subscribe behind interface — no direct table access elsewhere

-### Capability storage (two-tier persistence)
+### Capability storage
 31. ✅ Native capabilities registered in-memory at startup from the codebase. Identical across all workers.
-32. ✅ MCP capability metadata persisted in `mcp_connections` and `mcp_tools` tables. Survives restarts.
-33. ✅ MCP handler closures built lazily per worker from database state. Worker-local cache, rebuilt on demand.
-34. ✅ MCP server tool list re-harvested on a schedule (default: daily) and on user request.
-35. ✅ MCP tools harvested into the capability registry at connection time
-36. ✅ Side effects inferred from MCP hints + naming + admin overrides
-37. ✅ MCP tools callable directly (no agent required) when caller knows args
+32. ⏸ MCP capability metadata persisted in `mcp_connections` and `mcp_tools` tables — **deferred to Phase 4**
+33. ⏸ MCP handler closures built lazily per worker from database state — **deferred to Phase 4**
+34. ⏸ MCP server tool list re-harvested on a schedule — **deferred to Phase 4**
+35. ⏸ MCP tools harvested into the capability registry at connection time — **deferred to Phase 4**
+36. ⏸ Side effects inferred from MCP hints + naming + admin overrides — **deferred to Phase 4**
+37. ⏸ MCP tools callable directly (no agent required) when caller knows args — **deferred to Phase 4**

-### Credentials
-38. ✅ Credentials never appear in the automation definition — only connection IDs do
-39. ✅ Credentials never appear in the LLM's context — the host holds them and uses them on the LLM's behalf
-40. ✅ Credentials resolved per-call by `ActionContext`, not pre-loaded into worker environment
-41. ✅ Tokens encrypted at rest in the database; refresh handled automatically by `ActionContext.resolve_*_client`
+### Credentials — all deferred to Phase 2
+38. ⏸ Credentials never appear in the automation definition — only connection IDs do — **Phase 2**
+39. ⏸ Credentials never appear in the LLM's context — the host holds them — **Phase 2**
+40. ⏸ Credentials resolved per-call by `ActionContext`, not pre-loaded into worker environment — **Phase 2**
+41. ⏸ Tokens encrypted at rest; refresh handled automatically by `ActionContext.resolve_*_client` — **Phase 2**
+
+### v1-minimum (new lock)
+v1. ✅ `Capability` is exactly five fields: `id`, `description`, `input_schema`, `output_schema`, `handler`. Additional fields are added only when a concrete consumer feature requires them.
+v2. ✅ Cost is **measured** from a per-run ledger, not declared. Pre-flight cost checks return when the ledger has enough history.
+v3. ✅ Single `automations_default` Celery queue in v1. Multi-queue routing returns when load justifies it.

 ### NL authoring
 42. ✅ LLM-authored templates is the primary path from day one — not a Phase 3 addition. Hand-authoring JSON is supported but secondary