Commit graph

24 commits

Author SHA1 Message Date
CREDO23
f2a61bc0ef refactor(agents): consolidate chat runtime infra under chat/runtime
Move the lower-level runtime/infra modules out of multi_agent_chat/shared/
(they were never used by subagents, so they failed the shared-by-all-siblings
rule) and unify them with the already-relocated checkpointer:

  agents/runtime/                      -> agents/chat/runtime/
  mac/shared/errors.py                 -> chat/runtime/errors.py
  mac/shared/llm_config.py             -> chat/runtime/llm_config.py
  mac/shared/prompt_caching.py         -> chat/runtime/prompt_caching.py
  mac/shared/mention_resolver.py       -> chat/runtime/mention_resolver.py
  mac/shared/path_resolver.py          -> chat/runtime/path_resolver.py

These sit below the agent packages: the boundary + agent factory + shared
middleware depend on them, and they import no agent code (acyclic).
2026-06-05 13:19:24 +02:00
CREDO23
24b62a63b4 refactor(agents): introduce chat/ category; dissolve top-level agents/shared
Recursive shared-folder rule: a shared/ must be shared by ALL siblings at its
level. The kernel (context, compaction, retry_after, web_search) was shared by
only 2 of the agents -- anonymous_chat + multi_agent_chat -- never by podcaster
or video_presentation. Those 2 are the "chat" category, so their shared code
belongs in that category's shared/, not the top-level one.

  app/agents/anonymous_chat/   -> app/agents/chat/anonymous_chat/
  app/agents/multi_agent_chat/ -> app/agents/chat/multi_agent_chat/
  app/agents/shared/           -> app/agents/chat/shared/   (anon<->mac kernel)

Top-level app/agents/shared/ is gone: nothing was shared across all three
categories (chat / podcaster / video_presentation).

~289 import sites rewritten (app.agents.{anonymous_chat,multi_agent_chat,shared}
-> app.agents.chat.*); all moves are git renames (history preserved).
app/agents/ now: chat/, podcaster/, video_presentation/, runtime/.
2026-06-05 12:54:02 +02:00
CREDO23
82c5dc5b02 refactor(agents): move mac-only modules out of the cross-agent shared kernel
app/agents/shared/ is a sibling of anonymous_chat/podcaster/multi_agent_chat/
video_presentation, so it should only hold code shared across 2+ of those
agents. In practice podcaster and video_presentation import nothing from it,
and anonymous_chat needs only context + compaction + retry_after + web_search.
Everything else was multi_agent_chat-only (the boundary just passes through).

Move the multi_agent_chat-only cluster into multi_agent_chat/shared/ (files
moved verbatim via git rename; ~116 import sites rewritten):

  errors, feature_flags, filesystem_selection, path_resolver, prompt_caching,
  sandbox, llm_config, mention_resolver
  middleware/busy_mutex, middleware/kb_persistence

busy_mutex/llm_config/mention_resolver are boundary-only but import the moved
modules, so they were folded in to avoid a backwards shared -> multi_agent_chat
dependency. main_agent builders now import the impls directly; the shared
middleware barrel keeps only the genuinely-shared compaction + retry_after.

Also delete the dead leftover shared/plugins and shared/skills dirs (live
copies already live under main_agent/).

Remaining in app/agents/shared/: context, system_prompt(+prompts), checkpointer,
middleware/{compaction,retry_after,dedup_tool_calls}, tools/. checkpointer and
system_prompt are boundary-only infra pending a dedicated home decision.
2026-06-05 12:30:15 +02:00
CREDO23
946f8a8c5d refactor(agents): move llm_config + prompt_caching to app/agents/shared (slice 4b)
Relocate the mutually-dependent LLM config layer and the LiteLLM prompt-caching
helper to the shared kernel as one unit, rewiring their internal cross-reference
to the shared paths. Flip 21 non-frozen importers. Re-export shims remain at
new_chat/{llm_config,prompt_caching}.py for the frozen single-agent stack
(chat_deepagent); they will be removed when that stack is retired.
2026-06-04 12:41:52 +02:00
CREDO23
cb44063081 fix: repair pre-existing agent_task, gateway, and skills tests 2026-06-04 10:25:06 +02:00
DESKTOP-RTLN3BA\$punk
80daf46fbf Merge commit '7972901f15' into dev_mod 2026-05-29 20:28:12 -07:00
DESKTOP-RTLN3BA\$punk
9d1a01eb0c refactor(automations): streamline model eligibility handling in automation creation
- Removed the eligibility gate for model selection in the automation creation process, allowing users to choose models directly in the builder.
- Updated the `AutomationBuilderForm` to incorporate model selection logic, ensuring that selected models are validated and preserved during automation creation and editing.
- Simplified the `AutomationsContent` and `AutomationNewContent` components by eliminating unnecessary eligibility checks and alerts.
- Enhanced the user experience by integrating model selection directly into the automation approval process, ensuring that only billable models are used.
- Refactored related tests to cover new model selection behavior and ensure proper validation of user-selected models.
2026-05-29 20:27:40 -07:00
CREDO23
7b0e7a4c34 chore: merge upstream/dev — keep builtin schedule path, add SearchSpace 2026-05-29 23:40:52 +02:00
CREDO23
30fff9e52f refactor(automations): move agent_task to builtin and restructure dispatch 2026-05-29 18:13:09 +02:00
CREDO23
f293aa6bdf refactor(automations): move schedule trigger into builtin package 2026-05-29 17:49:05 +02:00
CREDO23
acd673023a feat(automations): add event trigger source, selector and registration 2026-05-29 17:48:48 +02:00
CREDO23
4ba637ea44 feat(automations): add event trigger match and inputs 2026-05-29 17:48:48 +02:00
CREDO23
3ba18c7750 feat(automations): add event trigger filter grammar 2026-05-29 17:48:48 +02:00
CREDO23
f09e302d4f feat(automations): add event trigger params 2026-05-29 17:48:48 +02:00
CREDO23
9247a2337f feat(automations): add EVENT to TriggerType enum 2026-05-29 17:48:39 +02:00
DESKTOP-RTLN3BA\$punk
409fec94c3 feat(automations): implement model eligibility checks for automation creation
- Added model eligibility checks to ensure automations can only use billable models (premium or BYOK).
- Introduced new API endpoint to report model eligibility status for search spaces.
- Updated frontend components to display eligibility alerts and disable creation options when models are not billable.
- Enhanced automation creation forms to reflect model eligibility, preventing users from submitting invalid configurations.
- Implemented server-side logic to capture and preserve model preferences across automation edits, ensuring consistent behavior during execution.
2026-05-29 03:13:46 -07:00
DESKTOP-RTLN3BA\$punk
94e834134f chore: linting 2026-05-28 19:21:29 -07:00
CREDO23
353755fd73 test(automations): cross-cutting registries, enums, side-effects + shared fixtures
Top-level tests that span multiple submodules:

- test_stores.py (7): the trigger + action registry contracts — register
  round-trip, unknown type → None (not raise), duplicate registration
  rejected, defensive snapshot from all_*.
- test_definition_types.py (2): params_schema property on both
  ActionDefinition and TriggerDefinition reflects the Pydantic model.
- test_persistence_enums.py (3): exact string values + member sets of
  AutomationStatus / RunStatus / TriggerType — the postgres-mirrored
  contract that breaks stored rows if drifted.
- test_import_registrations.py (2): the bundled agent_task action and
  schedule trigger self-register on package import (canary for the
  side-effect import chain).

conftest.py adds isolated_action_registry / isolated_trigger_registry
fixtures: snapshot + restore of the module-level _REGISTRY dicts so
tests that add their own definitions don't leak across the suite.

14 tests, pure unit.
2026-05-28 19:03:55 +02:00
CREDO23
822940b09e test(automations/schemas): lock definition + api validation gates
definition/ (29 tests): the envelope (defaults, extra=forbid, empty
plan/name rejection), Inputs schema-alias roundtrip (Python schema_ ↔
wire schema), PlanStep numeric bounds + addressing-field constraints,
Execution production defaults stability (10-min timeout, 2 retries,
exponential backoff, drop_if_running) + closed-set Literal gates,
Metadata's exceptional extra="allow" contract, and TriggerSpec type
requirement.

api/ (9 tests): AutomationCreate/Update cascade-validate into the
nested definition, reject unknown payload fields, enforce name length;
TriggerCreate exposes safe defaults (enabled=True, params={},
static_inputs={}) and rejects unknown TriggerType strings at the
boundary.

All pure unit, no DB.
2026-05-28 19:03:42 +02:00
CREDO23
acbeb60a43 test(automations/actions): lock agent_task helpers (auto_decide + finalize)
auto_decide.build_auto_decisions (3): produces one decision per
action_request entry, defaults to one decision for legacy scalar
interrupts, and skips malformed interrupts silently so a misbehaving
tool can't take down the whole agent_task step.

finalize.extract_final_assistant_message (4): string-content AIMessage
returned verbatim, list-of-parts content concatenated (skipping
non-text parts like tool_use), walks back past trailing ToolMessages
to find the last AIMessage, and returns None when no extractable text
is present (so callers can branch on silence vs. empty).

7 tests, pure unit.
2026-05-28 19:03:29 +02:00
CREDO23
db4eef651f test(automations/templating): lock render, filters, environment, context
render.py (4): variable substitution, StrictUndefined raises on missing
keys, evaluate_predicate coerces to bool, render_value walks dicts/lists
and renders string leaves.

filters.py (4): slugify produces URL-safe output, date formats datetime
with strftime, date(None) → "" so templates can write
{{ inputs.last_fired_at | date }} on first run, date(str) passes through.

environment.py (4): the sandbox boundary — disallowed Jinja built-ins
(e.g. pprint) raise, and the finalize hook coerces non-string outputs
to predictable wire shapes (datetime → ISO, None → "", dict → JSON).

context.py (1): build_run_context exposes {run, inputs, steps} with the
exact shape every plan template body relies on.

13 tests total, all pure unit.
2026-05-28 19:03:22 +02:00
CREDO23
49af95b652 test(automations/runtime): lock execute_step + with_retries
execute_step (6 tests): happy path, when=falsy → skipped, unknown action
→ ActionNotFound failure, retry budget exhaustion (attempts = 1 +
max_retries), retry recovery, and template-rendering of step params
against the run context.

with_retries (3 tests): first-try success returns attempts=1, recovery
returns the actual attempt that produced the result, and exhaustion
re-raises the last exception with the handler called 1 + max_retries
times.

All tests use backoff="none" to keep wall-clock time zero; timeout
testing is intentionally skipped (would need >= 1s per the int contract,
and exhaustion already locks that any Exception triggers retry).
2026-05-28 19:03:08 +02:00
CREDO23
18b4800e49 test(automations/dispatch): lock _validate_inputs + DispatchError
Cover the input-validation contract dispatch_run relies on:

- no declared schema → inputs pass through unchanged (regression site
  that previously stripped runtime keys like fired_at / last_fired_at
  and broke Jinja templates).
- declared schema, valid inputs → passthrough validated.
- declared schema, invalid inputs → DispatchError (uniform exception
  type, not raw jsonschema.ValidationError).

Plus the DispatchError exception identity (Exception subclass, message
preserved, isinstance-friendly for the dispatch layer's consumers).

4 tests, pure unit.
2026-05-28 19:03:00 +02:00
CREDO23
2a76f43387 test(automations/triggers): lock schedule cron + params
Cover the cron + IANA timezone + UTC normalization contract for the
schedule trigger: next-match strictly-after, DST offset shift across
spring-forward, malformed cron / unknown timezone rejection, and the
ScheduleTriggerParams Pydantic gate that surfaces InvalidCronError as
ValidationError at the API boundary.

8 tests, pure unit (no DB, no mocks).
2026-05-28 19:02:52 +02:00