Align multi-agent docs and Gmail/Calendar expert copy with registry routing.

2026-05-06 06:12:40 +02:00 · 2026-04-30 03:53:29 +02:00 · 2026-04-30 03:53:29 +02:00 · 362d462f92
commit 362d462f92
parent 2ab4c411fe
8 changed files with 30 additions and 30 deletions
--- a/surfsense_backend/app/agents/multi_agent_chat/IMPLEMENTATION_PLAN.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/IMPLEMENTATION_PLAN.md
@@ -31,7 +31,7 @@ Not every row applies to the **first** multi-agent graph (e.g. you may start wit

 ## Rework principles (better arrangement, same substance)

-1. **Expert agents**: **`expert_agent/builtins/`** — broad registry **categories** (e.g. research, deliverables), not a single vendor. **`expert_agent/connectors/`** — **external integrations** (Gmail, Calendar, Discord, Teams, doc stores, …), whether wired with hand-written factories or registry connector tools. Prompt + tools live together per slice; cross-cutting helpers live in `core/` or are imported from `new_chat`.
+1. **Expert agents**: **`expert_agent/builtins/`** — broad registry **categories** (e.g. research, deliverables), not a single vendor. **`expert_agent/connectors/`** — **external integrations** (one package per product route: Discord, Notion, Gmail, …), each using the same pattern: ``slice_tools.py`` (registry subset or factories) + ``domain_prompt.md`` + ``agent.py``. Cross-cutting helpers live in `core/` or are imported from `new_chat`.
 2. **Explicit graphs**: supervisor vs domain agents vs routing tools are **named** and testable; avoid one opaque megagraph where behavior is hard to reason about.
 3. **Single composer**: integration eventually mirrors `create_surfsense_deep_agent` in spirit—**one factory** that attaches middleware, KB, and tools in documented order (see `chat_deepagent.py` comments on ordering).
 4. **No duplicate KB pipelines**: align with `KnowledgePriorityMiddleware` / tree semantics; don’t invent a second hybrid-search path for the same turn.
@@ -43,15 +43,15 @@ Not every row applies to the **first** multi-agent graph (e.g. you may start wit

 **Supervisor (orchestrator)**

-- Keeps a **small tool surface**: routing tools (`gmail`, `calendar`, future category tools like `research` / `deliverables`) — **not** the full `registry.py` “general” tool list.
+- Keeps a **small tool surface**: one **routing** tool per builtin category (`research`, `memory`, …) and per connector route (`notion`, `gmail`, …) — **not** the full flat `registry.py` tool list on the supervisor.
 - **KB** should primarily benefit the model via **`new_chat`-style middleware** (e.g. hybrid priority docs → state / system adjunct), not by stacking redundant search tools, unless product explicitly requires them.
 - **Single hybrid search per user turn** at this layer when possible: full retrieval is expensive; avoid running it again inside every sub-agent for the same message.
 - Does **not** own **on-demand connector discovery** (e.g. `get_connected_accounts`): orchestration is route-by-intent, not ID resolution.

-**Domain agents (Gmail, Calendar, future slices)**
+**Domain agents (every connector slice — same shape)**

-- Carry tools built from **`new_chat` factories** (already pattern in `expert_agent/connectors/gmail/slice_tools.py`, etc.).
-- **Curated context belongs in the task message**: when the supervisor calls a routing tool (`gmail`, `calendar`, …), the **tool handler composes the child’s task string** so it includes **only** what that domain needs (KB snippets, constraints, distilled facts) — folded into how the task is written — not the full parent transcript. The sub-agent `invoke` stays a tight payload (`messages` + task content); domain middleware can still add connector-local hints. Still **no second full hybrid search** for the same turn unless the subdomain explicitly needs a new query.
+- Carry tools built from **`new_chat`** (`registry` subsets via ``build_registry_tools_for_category`` per ``TOOL_NAMES_BY_CATEGORY``, plus MCP merge where applicable).
+- **Curated context belongs in the task message**: when the supervisor calls **any** routing tool, the handler composes the child’s task string so it includes **only** what that domain needs (KB snippets, constraints, distilled facts) — folded into how the task is written — not the full parent transcript. The sub-agent `invoke` stays a tight payload (`messages` + task content); domain middleware can still add connector-local hints. Still **no second full hybrid search** for the same turn unless the subdomain explicitly needs a new query.
 - **Middleware here** still fits **domain-only** grounding (connector availability, search-space hints, metadata) shared across tools in that subgraph. Reuse or thin-wrap `new_chat.middleware` where it applies to a subgraph.
 - **Reactive discovery** (resolve a service id mid-task) stays a **tool** on that domain (or shared factory), e.g. `get_connected_accounts` when the model needs it — not something the supervisor must call.

@@ -63,9 +63,9 @@ Not every row applies to the **first** multi-agent graph (e.g. you may start wit

 In `new_chat`, KB + **virtual FS** (`KnowledgePriorityMiddleware`, tree, **`SurfSenseFilesystemMiddleware`** / **`KBPostgresBackend`**) serves the **orchestrator** that may **read and traverse** the workspace.

-**Connector domain agents** (Gmail, Calendar, …) are **not** mini-parents: the **supervisor** should already decide *what* to do and pass a **clear task** (plus any curated KB snippet folded into **`compose_child_task`**). The specialist runs **connector APIs**, not a second document crawl — duplicating full KB+VFS on every domain subgraph **shifts the parent’s exploration work onto the wrong agent** and adds noise.
+**Connector domain agents** are **not** mini-parents: the **supervisor** should already decide *what* to do and pass a **clear task** (plus any curated KB snippet folded into **`compose_child_task`**). The specialist runs **connector APIs**, not a second document crawl — duplicating full KB+VFS on every domain subgraph **shifts the parent’s exploration work onto the wrong agent** and adds noise.

-So **no child-side filesystem stack by default** for mail/calendar-style slices unless product demands it. Reserve **KB + VFS on a subgraph** for roles that **actually** need heavy document work (research, coding/explore-style agents, deliverables that grep the KB), matching how `new_chat` uses specialists.
+So **no child-side filesystem stack by default** for narrow connector subgraphs unless product demands it. Reserve **KB + VFS on a subgraph** for roles that **actually** need heavy document work (research, coding/explore-style agents, deliverables that grep the KB), matching how `new_chat` uses specialists.

 ---

@@ -99,7 +99,7 @@ multi_agent_chat/

  expert_agent/
    builtins/                # broad categories: research, deliverables
-    connectors/              # one subgraph per vendor: gmail, calendar, discord, teams, notion, …
+    connectors/              # one subgraph per vendor route (see TOOL_NAMES_BY_CATEGORY keys)

  routing/
    domain_routing_spec.py
--- a/surfsense_backend/app/agents/multi_agent_chat/core/prompts/load.py
+++ b/surfsense_backend/app/agents/multi_agent_chat/core/prompts/load.py
@@ -6,7 +6,7 @@ from importlib import resources


 def read_prompt_md(package: str, stem: str) -> str:
-    """Read ``{stem}.md`` from the given import package (e.g. ``…expert_agent.connectors.gmail``)."""
+    """Read ``{stem}.md`` from the given import package (e.g. ``…expert_agent.connectors.notion``)."""
    try:
        ref = resources.files(package).joinpath(f"{stem}.md")
        if not ref.is_file():
--- a/surfsense_backend/app/agents/multi_agent_chat/expert_agent/__init__.py
+++ b/surfsense_backend/app/agents/multi_agent_chat/expert_agent/__init__.py
@@ -1,5 +1,5 @@
 """Expert subgraphs (specialists the supervisor delegates to).

 - :mod:`expert_agent.builtins` — cross-cutting registry categories (e.g. research, memory, deliverables).
-- :mod:`expert_agent.connectors` — vendor/product integrations (mail, calendar, chat, doc stores, …).
+- :mod:`expert_agent.connectors` — vendor/product integrations (email, chat, documents, … — one slice per route).
 """
--- a/surfsense_backend/app/agents/multi_agent_chat/expert_agent/connectors/calendar/__init__.py
+++ b/surfsense_backend/app/agents/multi_agent_chat/expert_agent/connectors/calendar/__init__.py
@@ -1,4 +1,4 @@
-"""Google Calendar vertical slice: connector tools, domain agent, ``domain_prompt.md``."""
+"""Google Calendar vertical slice: registry tools, domain agent, ``domain_prompt.md``."""

 from app.agents.multi_agent_chat.expert_agent.connectors.calendar.agent import build_calendar_domain_agent
 from app.agents.multi_agent_chat.expert_agent.connectors.calendar.slice_tools import (
--- a/surfsense_backend/app/agents/multi_agent_chat/expert_agent/connectors/calendar/agent.py
+++ b/surfsense_backend/app/agents/multi_agent_chat/expert_agent/connectors/calendar/agent.py
@@ -12,7 +12,7 @@ from app.agents.multi_agent_chat.core.agents import build_domain_agent


 def build_calendar_domain_agent(llm: BaseChatModel, tools: Sequence[BaseTool]):
-    """Compiled Calendar domain-agent graph (prompt + tools co-located under ``calendar``)."""
+    """Compiled Google Calendar domain-agent graph."""
    return build_domain_agent(
        llm,
        tools,
--- a/surfsense_backend/app/agents/multi_agent_chat/expert_agent/connectors/gmail/__init__.py
+++ b/surfsense_backend/app/agents/multi_agent_chat/expert_agent/connectors/gmail/__init__.py
@@ -1,4 +1,4 @@
-"""Gmail vertical slice: connector tools, domain agent, ``domain_prompt.md``."""
+"""Gmail vertical slice: registry tools, domain agent, ``domain_prompt.md``."""

 from app.agents.multi_agent_chat.expert_agent.connectors.gmail.agent import build_gmail_domain_agent
 from app.agents.multi_agent_chat.expert_agent.connectors.gmail.slice_tools import build_gmail_tools
--- a/surfsense_backend/app/agents/multi_agent_chat/expert_agent/connectors/gmail/agent.py
+++ b/surfsense_backend/app/agents/multi_agent_chat/expert_agent/connectors/gmail/agent.py
@@ -12,7 +12,7 @@ from app.agents.multi_agent_chat.core.agents import build_domain_agent


 def build_gmail_domain_agent(llm: BaseChatModel, tools: Sequence[BaseTool]):
-    """Compiled Gmail domain-agent graph (prompt + tools co-located under ``gmail``)."""
+    """Compiled Gmail domain-agent graph."""
    return build_domain_agent(
        llm,
        tools,
--- a/surfsense_backend/app/agents/multi_agent_chat/supervisor/supervisor_prompt.md
+++ b/surfsense_backend/app/agents/multi_agent_chat/supervisor/supervisor_prompt.md
@@ -1,26 +1,26 @@
-You are the supervisor agent. Route work to the right sub-agent using **one** routing tool per request when delegation is needed:
+You are the supervisor agent. Route work to the right sub-agent using **one** routing tool per request when delegation is needed.
+
+**Built-in capabilities**

-- **gmail** — email (search, read, drafts, send, trash).
-- **calendar** — Google Calendar events.
 - **research** — web search, page scraping, SurfSense documentation help.
 - **memory** — save long-term facts and preferences (personal or team memory).
-- **deliverables** — reports, podcasts, video presentations, resumes, images (thread-scoped outputs).
-- **discord** — Discord server channels and messages.
-- **teams** — Microsoft Teams channels and messages.
-- **notion** — Notion pages.
+- **deliverables** — reports, podcasts, video presentations, resumes, images (thread-scoped outputs; only when available).
+
+**Connectors** (same pattern for each product)
+
+- **calendar** — Google Calendar events.
 - **confluence** — Confluence pages.
-- **google_drive** — Google Drive files (Docs/Sheets).
+- **discord** — Discord server channels and messages.
 - **dropbox** — Dropbox files.
-- **onedrive** — Microsoft OneDrive files.
+- **gmail** — email (search, read, drafts, send, trash).
+- **google_drive** — Google Drive files (Docs/Sheets).
 - **luma** — Luma calendar events (list, read, create).
+- **notion** — Notion pages.
+- **onedrive** — Microsoft OneDrive files.
+- **teams** — Microsoft Teams channels and messages.

-When the user has connected OAuth MCP integrations, additional routing tools may appear — use them only for that product’s work:
+**OAuth MCP** (extra routing tools only when those integrations are connected)

-- **linear** — Linear (issues, projects) via MCP.
-- **slack** — Slack search / reads via MCP.
-- **jira** — Jira via MCP.
-- **clickup** — ClickUp via MCP.
-- **airtable** — Airtable via MCP.
-- **generic_mcp** — user-defined MCP servers (stdio).
+- **linear**, **slack**, **jira**, **clickup**, **airtable**, **generic_mcp** — use only for that product’s MCP-backed work.

 Pass each tool a **clear natural-language task** describing what the sub-agent should do. Answer directly when no sub-agent is needed. When sub-agents return results, combine them into one coherent reply for the user.
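
The connector pattern this commit documents (a per-route registry subset plus a task string that carries only curated context) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the repository's implementation: the `Tool` dataclass, `make_registry`, `build_tools_for_category`, the tool names, and the signature of `compose_child_task` are all hypothetical stand-ins. The real `TOOL_NAMES_BY_CATEGORY` and `build_registry_tools_for_category` live in `new_chat` and may look quite different.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical mapping of route name -> tool names. The real
# TOOL_NAMES_BY_CATEGORY in new_chat may use different keys and names.
TOOL_NAMES_BY_CATEGORY: dict[str, tuple[str, ...]] = {
    "gmail": ("search_gmail", "send_gmail_email"),
    "notion": ("create_notion_page",),
}


@dataclass(frozen=True)
class Tool:
    """Illustrative stand-in for a registry tool (not LangChain's BaseTool)."""

    name: str
    run: Callable[[str], str]


def make_registry() -> dict[str, Tool]:
    """Build a flat registry keyed by tool name, with dummy tools that echo the task."""
    names = [n for group in TOOL_NAMES_BY_CATEGORY.values() for n in group]
    return {n: Tool(n, lambda task, n=n: f"{n}: {task}") for n in names}


def build_tools_for_category(registry: dict[str, Tool], category: str) -> list[Tool]:
    """Return only the registry tools listed for one route (the slice's subset)."""
    return [registry[name] for name in TOOL_NAMES_BY_CATEGORY[category]]


def compose_child_task(task: str, kb_snippets: list[str]) -> str:
    """Fold curated context into the task string instead of the full transcript."""
    if not kb_snippets:
        return task
    context = "\n".join(f"- {snippet}" for snippet in kb_snippets)
    return f"{task}\n\nRelevant context:\n{context}"
```

In this sketch the supervisor never sees the flat registry: each routing tool would call something like `build_tools_for_category(registry, "gmail")` once when the domain agent is built, and `compose_child_task` on every delegation.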