Commit graph

318 commits

Author SHA1 Message Date
CREDO23
41f4a58663 Merge remote-tracking branch 'upstream/dev' into improvement-podcast-graph
# Conflicts:
#	surfsense_backend/app/tasks/celery_tasks/podcast_tasks.py
2026-06-11 23:14:49 +02:00
CREDO23
ca9b157676 fix(podcasts): keep legacy episodes readable and guard regenerate 2026-06-11 12:43:07 +02:00
CREDO23
f0fc660d70 feat(podcasts): constrain monologue briefs to a single speaker 2026-06-11 11:56:57 +02:00
CREDO23
c84525897b test(podcasts): relocate stateful tests to integration
Move the lifecycle service, Celery task bodies, and mark_failed coverage out of
DB-faking unit tests and into integration tests against a real Postgres, faking
only true externals (broker, object store, TTS, ffmpeg, billing, LLM). Add HTTP
slices for cancel, voices, scoping, and public-chat streaming. The unit tier is
now fake-free pure logic with no session doubles.
2026-06-11 06:27:00 +02:00
DESKTOP-RTLN3BA\$punk
a7407502d3 feat(refactor): refactor payment system to implement unified credit wallet.
- Updated environment variables and - configurations for credit purchases via Stripe, replacing legacy page pack system.
- Introduced auto-reload feature for credit top-ups and modified database models to track credit transactions.
- Updated notification system to handle insufficient credits and auto-reload failures.
- Adjusted API routes and schemas to reflect changes in credit management.
2026-06-10 16:49:03 -07:00
CREDO23
8f38737ad9 test(podcasts): retarget celery and observability tests to new tasks 2026-06-10 21:45:04 +02:00
CREDO23
aa7aa81c16 refactor(podcasts): drop language detection from brief 2026-06-10 20:51:38 +02:00
CREDO23
15e44616f3 test(podcasts): cover drafting billing gate 2026-06-10 18:44:26 +02:00
CREDO23
0bed4a0d38 test(podcasts): cover failure recording 2026-06-10 18:44:25 +02:00
CREDO23
0c7987cd9e test(podcasts): cover api read model 2026-06-10 18:44:25 +02:00
CREDO23
fa7ab8a06d test(podcasts): cover renderer validation 2026-06-10 18:44:25 +02:00
CREDO23
36c201f9e2 test(podcasts): cover structured json parsing 2026-06-10 18:44:25 +02:00
CREDO23
0c92ee963e test(podcasts): cover voice catalog 2026-06-10 18:44:25 +02:00
CREDO23
e926990d8e test(podcasts): cover language and voice resolution 2026-06-10 18:44:25 +02:00
CREDO23
aaa9f01087 test(podcasts): cover brief and transcript contracts 2026-06-10 18:44:25 +02:00
CREDO23
9d8e4e4f9d test(podcasts): cover lifecycle state machine 2026-06-10 18:44:25 +02:00
CREDO23
f61e8af8c0 test(podcasts): add shared test fixtures 2026-06-10 18:44:25 +02:00
CREDO23
9f76daec8f test(indexers): update download mock return shape 2026-06-09 23:39:25 +02:00
CREDO23
bdd3728c5b test(dropbox): update download failure return shape 2026-06-09 23:39:25 +02:00
CREDO23
b5aa41beb6 test(onedrive): update download failure return shape 2026-06-09 23:39:25 +02:00
CREDO23
5f59ad3ad3 test(google-drive): update download failure return shape 2026-06-09 23:39:25 +02:00
DESKTOP-RTLN3BA\$punk
ce952d2ad1 chore: linting 2026-06-09 00:42:26 -07:00
CREDO23
8bdfd00a15 Merge upstream/dev 2026-06-05 19:18:12 +02:00
CREDO23
88fe213176 refactor(agents): extract subagent-invocation contract to subagents/shared
The knowledge_base subagent imported subagent_invoke_config + EXCLUDED_STATE_KEYS
from main_agent's checkpointed_subagent_middleware -- a subagent reaching into
main-agent internals. Both symbols (plus the recursion-limit constant they need)
are a subagent-invocation contract shared by the orchestrator's task middleware
and any nested-invoking subagent. Move them to subagents/shared/invocation.py;
config.py keeps the HITL resume side-channel and constants.py keeps the
main-agent tuning knobs. All consumers (task_tool, kb tool, tests) repointed.
2026-06-05 14:18:44 +02:00
CREDO23
0081b627e9 refactor(agents): move kb_persistence middleware into main_agent (owner)
The KB-persistence impl lived in shared/middleware/ but no subagent uses it --
consumers are the main_agent builder and the boundary event loop. Colocate with
its owner using the folder-per-middleware shape; __init__ re-exports the public
surface. Tests that reached module internals now alias the .middleware submodule.

  main_agent/middleware/kb_persistence.py -> kb_persistence/builder.py
  shared/middleware/kb_persistence.py     -> kb_persistence/middleware.py
2026-06-05 14:11:55 +02:00
CREDO23
a7a642fedc refactor(agents): move busy_mutex middleware into main_agent (owner)
The busy-mutex impl (BusyMutexMiddleware + cancel/turn-lifecycle primitives)
lived in shared/middleware/ but no subagent uses it -- consumers are the
main_agent builder and the boundary (turn lifecycle). Colocate with its owner
using the folder-per-middleware shape; __init__ re-exports the public surface so
boundary import sites only change package path:

  main_agent/middleware/busy_mutex.py    -> busy_mutex/builder.py
  shared/middleware/busy_mutex.py        -> busy_mutex/middleware.py
2026-06-05 14:08:45 +02:00
CREDO23
84b775c0ac refactor(agents): unify permissions into one vertical-slice package
Per-file verification of the slice-3 candidates showed receipts/ and
date_filters.py are shared contracts (consumed by shared/state + shared
middleware + subagents), so they correctly stay put.

permissions was the real misfit: the rule *model* lived at shared/permissions.py
while its enforcement lived at shared/middleware/permissions/. Unify them into a
single self-contained subsystem:

  shared/permissions.py                 -> shared/permissions/model.py
  shared/middleware/permissions/{deny,ask,middleware}
                                        -> shared/permissions/{deny,ask,middleware}

The package __init__ re-exports the model API + build_permission_mw, so the 32
external model consumers keep importing `from ...shared.permissions import Rule`
unchanged; only the 8 internal files redirect to `.model` (cycle-safe, model
loaded before middleware).
2026-06-05 13:29:48 +02:00
CREDO23
f2a61bc0ef refactor(agents): consolidate chat runtime infra under chat/runtime
Move the lower-level runtime/infra modules out of multi_agent_chat/shared/
(they were never used by subagents, so they failed the shared-by-all-siblings
rule) and unify them with the already-relocated checkpointer:

  agents/runtime/                      -> agents/chat/runtime/
  mac/shared/errors.py                 -> chat/runtime/errors.py
  mac/shared/llm_config.py             -> chat/runtime/llm_config.py
  mac/shared/prompt_caching.py         -> chat/runtime/prompt_caching.py
  mac/shared/mention_resolver.py       -> chat/runtime/mention_resolver.py
  mac/shared/path_resolver.py          -> chat/runtime/path_resolver.py

These sit below the agent packages: the boundary + agent factory + shared
middleware depend on them, and they import no agent code (acyclic).
2026-06-05 13:19:24 +02:00
CREDO23
24b62a63b4 refactor(agents): introduce chat/ category; dissolve top-level agents/shared
Recursive shared-folder rule: a shared/ must be shared by ALL siblings at its
level. The kernel (context, compaction, retry_after, web_search) was shared by
only 2 of the agents -- anonymous_chat + multi_agent_chat -- never by podcaster
or video_presentation. Those 2 are the "chat" category, so their shared code
belongs in that category's shared/, not the top-level one.

  app/agents/anonymous_chat/   -> app/agents/chat/anonymous_chat/
  app/agents/multi_agent_chat/ -> app/agents/chat/multi_agent_chat/
  app/agents/shared/           -> app/agents/chat/shared/   (anon<->mac kernel)

Top-level app/agents/shared/ is gone: nothing was shared across all three
categories (chat / podcaster / video_presentation).

~289 import sites rewritten (app.agents.{anonymous_chat,multi_agent_chat,shared}
-> app.agents.chat.*); all moves are git renames (history preserved).
app/agents/ now: chat/, podcaster/, video_presentation/, runtime/.
2026-06-05 12:54:02 +02:00
CREDO23
d59bb2b5aa refactor(agents): evict mac-only tools/middleware from shared kernel
These were never shared with anonymous_chat (nor podcaster/video_presentation)
-- only multi_agent_chat (subagents/main agent) and the boundary use them:

  shared/tools/mcp/             -> multi_agent_chat/shared/tools/mcp/
  shared/tools/hitl.py          -> multi_agent_chat/shared/tools/hitl.py
  shared/tools/catalog.py       -> multi_agent_chat/shared/tools/catalog.py
  shared/middleware/dedup_tool_calls.py
                                -> multi_agent_chat/shared/middleware/dedup_tool_calls.py

app/agents/shared/ now holds only the genuine anon<->mac kernel:
context, middleware/{compaction,retry_after}, tools/web_search.
2026-06-05 12:50:46 +02:00
CREDO23
b7ea829371 refactor(agents): relocate boundary-only infra out of shared/
Neither module is imported by any sibling agent package, so neither belongs in
the cross-agent shared kernel:

- checkpointer.py -> app/agents/runtime/checkpointer.py
  LangGraph Postgres checkpoint saver. It's cross-agent *runtime infra* wired by
  the boundary (app lifespan + anonymous_chat & multi_agent_chat flows), not
  agent code. New app/agents/runtime/ layer holds boundary-wired agent infra.

- shared/system_prompt.py + shared/prompts/ -> app/prompts/
  The legacy single-agent prompt composer. The live agents don't use it
  (main_agent has its own system_prompt/ builder; anonymous_chat builds inline);
  its only consumer is new_llm_config_routes for displaying default instructions.
  Moved to the existing non-agent prompt domain:
    system_prompt.py        -> app/prompts/default_system_instructions.py
    prompts/                 -> app/prompts/system_prompt_composer/

app/agents/shared/ now contains only genuinely cross-agent code: context,
middleware/{compaction,retry_after,dedup_tool_calls}, tools/.

NOTE: get_default_system_instructions() (LLM-config UI) composes from the legacy
library, which differs from what the live agents actually run -- pre-existing
latent staleness, not changed here.
2026-06-05 12:36:44 +02:00
CREDO23
82c5dc5b02 refactor(agents): move mac-only modules out of the cross-agent shared kernel
app/agents/shared/ is a sibling of anonymous_chat/podcaster/multi_agent_chat/
video_presentation, so it should only hold code shared across 2+ of those
agents. In practice podcaster and video_presentation import nothing from it,
and anonymous_chat needs only context + compaction + retry_after + web_search.
Everything else was multi_agent_chat-only (the boundary just passes through).

Move the multi_agent_chat-only cluster into multi_agent_chat/shared/ (files
moved verbatim via git rename; ~116 import sites rewritten):

  errors, feature_flags, filesystem_selection, path_resolver, prompt_caching,
  sandbox, llm_config, mention_resolver
  middleware/busy_mutex, middleware/kb_persistence

busy_mutex/llm_config/mention_resolver are boundary-only but import the moved
modules, so they were folded in to avoid a backwards shared -> multi_agent_chat
dependency. main_agent builders now import the impls directly; the shared
middleware barrel keeps only the genuinely-shared compaction + retry_after.

Also delete the dead leftover shared/plugins and shared/skills dirs (live
copies already live under main_agent/).

Remaining in app/agents/shared/: context, system_prompt(+prompts), checkpointer,
middleware/{compaction,retry_after,dedup_tool_calls}, tools/. checkpointer and
system_prompt are boundary-only infra pending a dedicated home decision.
2026-06-05 12:30:15 +02:00
CREDO23
c0c4f57f5d refactor(agents): delete dead PermissionMiddleware twin in shared kernel
app/agents/shared/middleware/permission.py was an older, monolithic
PermissionMiddleware superseded by the modular permissions/ package under
multi_agent_chat/shared/middleware/ (core + evaluation + ask/ + factory).
Production wires only the package (main_agent stack + every subagent
builder); the kernel file was reachable only through the shared barrel
re-export (itself unused) and two tests pinned to its dead internals
(_raise_interrupt, _normalize_permission_decision, old after_model shape).

- delete app/agents/shared/middleware/permission.py
- drop PermissionMiddleware from the shared middleware barrel
- delete test_permission_middleware.py (covered the dead impl only; live
  behavior is covered by tests/.../middleware/shared/permissions/*)
- test_desktop_safety_rules.py: keep the ruleset-level regression tests,
  drop the dead import + TestPermissionMiddlewareIntegration class
2026-06-05 12:10:08 +02:00
CREDO23
8ae190a11d refactor(agents): move MAC middleware impls out of shared kernel
knowledge_search, memory_injection and scoped_model_fallback no longer
belong in the cross-agent kernel (app/agents/shared/middleware): they are
consumed only inside multi_agent_chat. Relocate each impl next to the
builder that uses it:

- knowledge_search.py -> multi_agent_chat/shared/middleware/ (genuinely
  shared: its _render_priority_message feeds kb_context_projection, used by
  both the main agent and the KB subagent)
- memory_injection.py -> multi_agent_chat/shared/middleware/ (beside its
  memory.py builder)
- scoped_model_fallback.py -> multi_agent_chat/shared/middleware/resilience/
  (beside fallback.py/bundle.py)

Impls moved verbatim (git rename). Builders/consumers now import the local
sibling; main_agent knowledge_priority imports the new shared path; shared
middleware barrel trimmed.

Tests: repoint imports; convert the knowledge_search monkeypatch targets
from brittle dotted-string form to object-based patching (monkeypatch.setattr
on the imported module), which is robust to import ordering. No behavior
change.
2026-06-05 12:04:31 +02:00
CREDO23
9493519c61 refactor(agents): colocate 8 main-agent-only middleware as per-concept folders
Each main-agent-only middleware now lives in its own folder under
main_agent/middleware/<concept>/ with builder.py (flag-gated construction)
+ middleware.py (the impl), re-exported via __init__.py. This kills the
cross-folder hop into agents/shared/middleware and keeps each middleware's
two responsibilities (build vs behavior) as colocated siblings.

Moved (impl from shared/middleware, builder from main_agent/middleware):
action_log, anonymous_document, context_editing, doom_loop, knowledge_tree,
noop_injection, otel_span, tool_call_repair.

Impls moved verbatim (git rename, no body edits) so behavior is unchanged.
Builders now import from the local .middleware sibling. stack.py import
paths updated for the 3 renamed folders; shared middleware barrel trimmed;
tests repointed (imports + patch targets).
2026-06-05 11:42:58 +02:00
CREDO23
fbd5ccc35a refactor(agents): split dedup_tool_calls; move HITL middleware to main_agent
DedupHITLToolCallsMiddleware is only wired by the main_agent stack, but
its module also exports dedup-key resolvers consumed by the shared MCP
tool layer. Splitting keeps the resolvers (dedup_key_full_args,
wrap_dedup_key_by_arg_name, DedupResolver) in shared and moves the
middleware class verbatim into main_agent/middleware/dedup_hitl.py
(merged with its builder), eliminating the shared->main_agent dependency
that a flat move would create. No behavior change.
2026-06-05 11:17:44 +02:00
CREDO23
afa51e97cf refactor(agents): delete dead single-agent-only middleware
file_intent (FileIntentMiddleware) and flatten_system
(FlattenSystemMessageMiddleware) were only ever instantiated in the
single-agent chat_deepagent stack, which was removed in 14bbea085. They
have no production consumer in multi_agent_chat. Delete both modules and
their unit tests.

Also drop the vestigial KnowledgeBaseSearchMiddleware alias (= the live
KnowledgePriorityMiddleware); its tests now target the real class so the
behavior coverage is preserved. Trim the three barrel/__all__ entries and
strip the now-dead class names from comments.
2026-06-05 11:15:13 +02:00
CREDO23
21509e7eca refactor(agents): group filesystem backends under filesystem/backends/
The concrete filesystem backends are consumed only by the MAC filesystem
layer (tools, path-resolution middleware, the resolver, skills backend) and
tests -- no external app code. Group them next to the filesystem middleware
they serve:

- filesystem_backends.py            -> filesystem/backends/resolver.py
- middleware/kb_postgres_backend.py -> filesystem/backends/kb_postgres.py
- middleware/local_folder_backend.py -> filesystem/backends/local_folder.py
- middleware/multi_root_local_folder_backend.py -> .../multi_root_local_folder.py
- document_xml.py                   -> filesystem/backends/document_xml.py

Repoint all 21 importers. No behavior change; import-all + filesystem
backend/path-resolution/knowledge-search unit tests stay green (478).
2026-06-05 11:02:26 +02:00
CREDO23
f615d6b530 refactor(agents): relocate remaining MAC-only kernel (permissions, deliverable_wait)
permissions.py (authorization Rule/Ruleset model) is consumed across all
MAC subagents + the permissions middleware, with a single external
consumer (user_tool_allowlist service) -> move to
multi_agent_chat/shared/permissions.py and repoint all 42 sites.

deliverable_wait.py (wait_for_deliverable) is used only by the podcast and
video_presentation deliverable tools -> colocate into
subagents/builtins/deliverables/.

No behavior change; import-all + permission/allowlist/deliverable unit
tests stay green.
2026-06-05 10:58:49 +02:00
CREDO23
1d2519730e refactor(agents): move MAC graph-state schema into multi_agent_chat/shared/state/
filesystem_state.py (the multi-agent graph state) and state_reducers.py
(its merge reducers) are consumed only by multi_agent_chat (filesystem
tools/middleware, kb projection, and the MAC-only shared middleware) plus
two unit tests -- no external app code. Relocate them into a dedicated
multi_agent_chat/shared/state/ package (filesystem_state.py + reducers.py)
and repoint every importer.

No behavior change; import-all + the full unit/middleware + unit/agents
suites (1066 tests) stay green.
2026-06-05 10:54:15 +02:00
CREDO23
a7d7155039 refactor(agents): colocate main_agent-only kernel into main_agent/
Move modules out of agents/shared/ that are consumed by a single package
(main_agent), placing each next to its only consumer instead of in a
"shared" grab-bag:

- agent_cache.py        -> main_agent/runtime/agent_cache_store.py
- connector_searchable_types.py -> main_agent/runtime/
- plugin_loader.py + plugins/    -> main_agent/plugins/
- skills/ + skills_backends.py   -> main_agent/skills/
- tools/invalid_tool.py          -> main_agent/tools/

Drop the skills_backends re-export from the shared middleware barrel and
repoint all consumers + tests. No behavior change; import-all,
error-contract, and the moved tests stay green.
2026-06-04 21:25:39 +02:00
CREDO23
c51aca6ccc refactor(agents): group MCP tools into shared/tools/mcp/ subpackage
The three MCP siblings (mcp_client/mcp_tool/mcp_tools_cache) served one
objective but sat loose at the top of shared/tools. Grouped them into an
mcp/ package and dropped the redundant prefix: client.py, tool.py, cache.py.
Updated all importers (routes, mcp_tools subagent, e2e fake patch targets,
unit test) to the new paths.
2026-06-04 20:35:38 +02:00
CREDO23
8d0090c6a1 refactor(agents): delete deliverable dead twins in shared/tools; fix live image api_base bug
The deliverables subagent runs its own generate_image/podcast/report/resume/
video_presentation (via tools/index.py); the shared/tools copies had zero
production importers — classic dead twins. Removed them so deliverable tools
live only in their vertical slice.

While repointing the 2 stranded unit tests at the LIVE deliverables modules,
found the OpenRouter empty-api_base defense (resolve_api_base) existed ONLY in
the dead shared generate_image, never propagated to the live multi-agent copy.
Ported the fix into deliverables/tools/generate_image.py (both the global-config
and user-DB-config branches) so an empty api_base no longer falls through to
LiteLLM's global api_base (Azure) and 404s.

Tests now exercise the live Command/receipt-returning tools (invoke the raw
coroutine with a hand-built ToolRuntime; resume progress events neutralized).
2026-06-04 20:30:30 +02:00
CREDO23
003924062d refactor(agents): split tool registry into pure-data catalog, decouple connectors
Replace the connector-coupled BUILTIN_TOOLS registry with a pure-data
catalog so shared/tools no longer imports any connector module, making the
connector packages independently deletable.

- add shared/tools/catalog.py (ToolMetadata + TOOL_CATALOG, 41 tools, no imports)
- point GET /agent/tools (the only live consumer) at the catalog
- relocate ToolDefinition into action_log middleware (its sole consumer);
  drop the inert tool_definitions wiring (no tool defines reverse)
- delete shared/tools/registry.py: connector imports, dead factories,
  dead get_connector_gated_tools, and BUILTIN_TOOLS
- drop stale dedup-propagation test (path removed in C1) + refresh docstrings

import-all guardrail + agents unit suite green (987 passed).
2026-06-04 19:43:50 +02:00
CREDO23
add9e14694 refactor(agents): colocate middleware into vertical slices
Eliminate the top-level multi_agent_chat/middleware/ package so each slice
owns its middleware (vertical-slice colocation):

- middleware/shared/   -> shared/middleware/        (cross-slice middleware)
- middleware/subagent/ -> subagents/shared/middleware/ (subagent stack)
- main_agent/middleware/ already colocated in Slice A

The moved shared/ subtree is internally consistent (all relative imports
stay within it), so only external absolute refs were rewritten. The
subagent stack's ..shared.* relatives were promoted to absolute paths to
the new shared/middleware/ location.

multi_agent_chat/ root is now: main_agent/, shared/, subagents/.
Verified: 2430 unit tests pass, 1 skipped (baseline unchanged).
2026-06-04 18:13:47 +02:00
CREDO23
9c845d562e refactor(agents): colocate main-agent middleware under main_agent/ slice
Vertical-slice colocation: all main-agent code should live under
main_agent/ instead of being split across a parallel middleware/main_agent
tree. Move multi_agent_chat/middleware/main_agent/ -> main_agent/middleware/
and its assembler middleware/stack.py -> main_agent/middleware/stack.py, so
the main-agent slice is self-contained (graph, runtime, system_prompt, tools,
middleware).

Genuinely cross-slice middleware (middleware/shared/, middleware/subagent/)
stays under multi_agent_chat/middleware/ for a later slice; the moved builders
now reference it via absolute imports.

Pure move + import rewrite (git-tracked renames). Verified: full unit suite
green (2430 passed, 1 skipped), including test_import_all and the
checkpointed-subagent middleware suite.
2026-06-04 18:03:49 +02:00
CREDO23
1acde6a470 test(agents): cover live filesystem middleware, retire dead twin
The single-agent-era filesystem middleware (app/agents/shared/middleware/
filesystem.py, ~2000 lines) was never instantiated in production, yet three
unit suites validated it — an illusory guardrail while the live decomposed
middleware (multi_agent_chat/middleware/shared/filesystem) was unguarded.

Close the gap before reorganizing the agents module:
- Add 14 integration tests driving live B's tools in desktop mode (real
  on-disk effects) and cloud mode (in-state staging, namespace policy).
- Port all high-value dead-twin assertions onto the live path: cloud rm/rmdir
  staging + guard rails, KBPostgresBackend delete-view filter, mode-scoped
  system prompt, cwd/relative/namespace resolution, multi-root mount
  normalization.
- Delete dead twin filesystem.py, drop its __init__ re-export, and retire its
  3 dead-twin tests.

Verified: test_import_all + middleware unit + FS integration all green.
2026-06-04 17:46:49 +02:00
CREDO23
5b45f78a16 refactor(chat): delete legacy stream_new_chat monolith (cutover complete)
The flows orchestrators (new_chat/resume_chat) are now the sole live path
after the byte-for-byte differential proof, so the monolith and its
monolith-vs-flows parity scaffolding are removed.

- Repoint the last live importer (anonymous_chat_routes) to
  streaming.agent.event_loop.stream_agent_events + shared.stream_result.StreamResult
  (drop-in; the keyword-only fallback-commit params default to inert for anon).
- Repoint e2e launcher patch targets to flows.shared.llm_bundle.
- Repoint helper unit tests (chunk_parts, thinking-step ids, tool-input
  streaming) to their flows homes to preserve coverage.
- Delete the monolith, the contract test, and the parity tests
  (parallel_refactor, stage_1, stage_2, orchestrator_frame) whose sole
  purpose was comparing against the now-removed monolith.

Full suite green (2622 passed, 1 skipped); the two excluded live-app dirs
(document_upload, composio) have a pre-existing, env-gated registration 404
unrelated to this change.
2026-06-04 14:35:45 +02:00
CREDO23
b9937cf4b1 refactor(chat): cut live chat streaming over to flows orchestrators (cutover)
Flip the live callers (new_chat_routes + gateway/agent_invoke) from the
legacy monolithic app.tasks.chat.stream_new_chat to the side-by-side
app.tasks.chat.streaming.flows orchestrators.

Adds a byte-for-byte differential parity test driving BOTH implementations
on identical, fully-deterministic glue inputs (frozen time/uuid, stubbed
LLM/persistence/agent seams). All glue paths are byte-identical:
  * new: auto-pin failure, LLM-load failure, persist-user fail,
    persist-assistant fail (full initial-frame ordering + handshake),
    pre-stream exception (top-level except path)
  * resume: persist-assistant fail

The differential test also surfaced one INTENTIONAL divergence: on a resume
turn whose auto-pin / LLM-load fails, the monolith crashes with
UnboundLocalError (_resume_premium_request_id read in finally before its
post-early-return definition); the flows version emits a clean terminal
error. The flows path is therefore byte-identical or strictly more correct.

The agent-content stream itself is shared, unforkable code
(stream_output -> EventRelay) so it cannot diverge.

Monolith + old parity test deletion follows in a separate commit.
2026-06-04 14:24:18 +02:00
CREDO23
14bbea0854 refactor(agents): delete single-agent stack + new_chat shim package (bucket B3/B4)
With multi-agent the only live factory (B1), the single-agent stack is dead.
Remove app/agents/new_chat/ entirely: chat_deepagent.py, subagents/, and all
re-export shims (errors/context/llm_config/permissions/tools/middleware/...) that
existed only to serve frozen single-agent code. Live code already imports the
shared kernel (app.agents.shared.*) directly.

Tests: delete single-agent-only suites (test_resolve_prompt_model_name,
test_specialized_subagents) and the chat_deepagent source-shape contract assertion;
repoint test_scoped_model_fallback to the shared middleware path. Suite green
(2710 passed).
2026-06-04 13:40:44 +02:00