mirror of
https://github.com/MODSetter/SurfSense.git
synced 2026-06-26 21:39:43 +02:00
agent: retire eager KB priority/planner path and its dead flags
The pull-based KB design (on-demand search_knowledge_base tool + pre-injected workspace tree) fully replaced the old eager retrieval path. Remove its last remnants: - Delete KnowledgePriorityMiddleware (knowledge_search.py) and its tests. - Drop the kb_priority state field + reducer default; trim KbContextProjectionMiddleware to project only workspace_tree_text. - Remove the now-dead feature flags enable_kb_priority_preinjection and enable_kb_planner_runnable across backend (flags, route schema, tests, env examples) and frontend (settings toggle, zod schema). - Scrub <priority_documents> and stale KnowledgePriorityMiddleware references from prompts, docstrings, and the ADR. No functional change: nothing wrote kb_priority and neither flag gated live behavior after the cutover. Full backend suite green (pre-existing unrelated failures aside).
This commit is contained in:
parent
0148647b98
commit
2beafbdec8
34 changed files with 62 additions and 1890 deletions
|
|
@ -394,7 +394,6 @@ SURFSENSE_ENABLE_TOOL_CALL_REPAIR=true
|
||||||
SURFSENSE_ENABLE_BUSY_MUTEX=true
|
SURFSENSE_ENABLE_BUSY_MUTEX=true
|
||||||
SURFSENSE_ENABLE_SKILLS=true
|
SURFSENSE_ENABLE_SKILLS=true
|
||||||
SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS=true
|
SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS=true
|
||||||
SURFSENSE_ENABLE_KB_PLANNER_RUNNABLE=true
|
|
||||||
SURFSENSE_ENABLE_ACTION_LOG=true
|
SURFSENSE_ENABLE_ACTION_LOG=true
|
||||||
SURFSENSE_ENABLE_REVERT_ROUTE=true
|
SURFSENSE_ENABLE_REVERT_ROUTE=true
|
||||||
SURFSENSE_ENABLE_PERMISSION=true
|
SURFSENSE_ENABLE_PERMISSION=true
|
||||||
|
|
|
||||||
|
|
@ -379,13 +379,18 @@ the ambient plane.
|
||||||
Remove from the hot path:
|
Remove from the hot path:
|
||||||
|
|
||||||
- `KnowledgePriorityMiddleware` search branch (planner LLM, embedding, hybrid
|
- `KnowledgePriorityMiddleware` search branch (planner LLM, embedding, hybrid
|
||||||
search in `before_agent`).
|
search in `before_agent`). ✅ **Done** — the whole `knowledge_search.py`
|
||||||
|
module is deleted.
|
||||||
- `fetch_mentioned_documents` eager chunk pull.
|
- `fetch_mentioned_documents` eager chunk pull.
|
||||||
- `<priority_documents>` pre-injection and `KbContextProjectionMiddleware`
|
- `<priority_documents>` pre-injection and `KbContextProjectionMiddleware`
|
||||||
priority projection.
|
priority projection. ✅ **Done** — `<priority_documents>` is no longer
|
||||||
|
produced anywhere; `KbContextProjectionMiddleware` is trimmed to a pure
|
||||||
|
`<workspace_tree>` projector. The `enable_kb_priority_preinjection` flag and
|
||||||
|
every `<priority_documents>` prompt reference are removed.
|
||||||
- `kb_priority` state plumbing (deleted per §8.10; add a dedicated
|
- `kb_priority` state plumbing (deleted per §8.10; add a dedicated
|
||||||
`citation_registry` field instead). `kb_matched_chunk_ids` is already gone
|
`citation_registry` field instead). ✅ **Done** — `kb_priority` /
|
||||||
(build-order Step 5).
|
`KbPriorityEntry` are removed from state + reducers. `kb_matched_chunk_ids`
|
||||||
|
is already gone (build-order Step 5).
|
||||||
|
|
||||||
Keep / add:
|
Keep / add:
|
||||||
|
|
||||||
|
|
@ -486,11 +491,15 @@ behavior tests, and the on-contract prompt `base/citation_contract.md`
|
||||||
4. **Mentions → scope.** Map `@document`/`@folder` mentions to
|
4. **Mentions → scope.** Map `@document`/`@folder` mentions to
|
||||||
`SearchScope(document_ids=…)` for the tool; retire `kb_priority` mention
|
`SearchScope(document_ids=…)` for the tool; retire `kb_priority` mention
|
||||||
surfacing.
|
surfacing.
|
||||||
5. **Remove the old eager path.** Retire `KnowledgePriorityMiddleware`,
|
5. **Remove the old eager path.** ✅ **Done** — `KnowledgePriorityMiddleware`
|
||||||
`kb_context_projection`, and the old `search_knowledge_base` hybrid helper in
|
and the old `search_knowledge_base` hybrid helper in `knowledge_search.py`
|
||||||
`knowledge_search.py`; later `ChucksHybridSearchRetriever` (after migrating
|
are deleted (the whole module is gone); `kb_context_projection` is trimmed to
|
||||||
`ConnectorService`). Migrate `web_search` to register `WEB_RESULT` so all
|
a tree-only projector (kept because it still projects `<workspace_tree>` to
|
||||||
citations unify on `[n]` — **done**, see §12 build-order Step 6.
|
subagents); `kb_priority` state + the `enable_kb_priority_preinjection` flag +
|
||||||
|
all `<priority_documents>` prompt references are removed. Still pending:
|
||||||
|
`ChucksHybridSearchRetriever` (after migrating `ConnectorService`). Migrate
|
||||||
|
`web_search` to register `WEB_RESULT` so all citations unify on `[n]` —
|
||||||
|
**done**, see §12 build-order Step 6.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -416,14 +416,6 @@ LANGSMITH_PROJECT=surfsense
|
||||||
# Skills + subagents
|
# Skills + subagents
|
||||||
# SURFSENSE_ENABLE_SKILLS=false
|
# SURFSENSE_ENABLE_SKILLS=false
|
||||||
# SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS=false
|
# SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS=false
|
||||||
# SURFSENSE_ENABLE_KB_PLANNER_RUNNABLE=false
|
|
||||||
|
|
||||||
# KB retrieval mode (default OFF = lazy). When OFF, the main agent retrieves
|
|
||||||
# KB content on demand via the `search_knowledge_base` tool and skips the
|
|
||||||
# expensive per-turn pre-injection (planner LLM + embed + hybrid search,
|
|
||||||
# ~2.3s); explicit @-mentions are still surfaced cheaply. Set to true to
|
|
||||||
# restore the original eager `<priority_documents>` pre-injection.
|
|
||||||
# SURFSENSE_ENABLE_KB_PRIORITY_PREINJECTION=false
|
|
||||||
|
|
||||||
# Snapshot / revert
|
# Snapshot / revert
|
||||||
# SURFSENSE_ENABLE_ACTION_LOG=false
|
# SURFSENSE_ENABLE_ACTION_LOG=false
|
||||||
|
|
|
||||||
|
|
@ -6,8 +6,6 @@ read-only). This middleware loads it once on the first turn into
|
||||||
|
|
||||||
* :class:`KnowledgeTreeMiddleware` can render the synthetic ``/documents``
|
* :class:`KnowledgeTreeMiddleware` can render the synthetic ``/documents``
|
||||||
view without touching the DB.
|
view without touching the DB.
|
||||||
* :class:`KnowledgePriorityMiddleware` skips hybrid search and emits a
|
|
||||||
degenerate priority list.
|
|
||||||
* :class:`KBPostgresBackend` (``als_info`` / ``aread`` / ``_load_file_data``)
|
* :class:`KBPostgresBackend` (``als_info`` / ``aread`` / ``_load_file_data``)
|
||||||
recognises the synthetic path.
|
recognises the synthetic path.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -8,11 +8,6 @@ standing instructions. It also reports current character usage versus the
|
||||||
hard limit so you can manage the budget. Treat it as background colour for
|
hard limit so you can manage the budget. Treat it as background colour for
|
||||||
your answer, not as the task itself.
|
your answer, not as the task itself.
|
||||||
|
|
||||||
`<priority_documents>` lists the workspace documents most relevant to the
|
|
||||||
latest user message, ranked by relevance score, with `[USER-MENTIONED]`
|
|
||||||
flagged on anything the user explicitly referenced. When the task is about
|
|
||||||
workspace content, read these first.
|
|
||||||
|
|
||||||
`<workspace_tree>` shows the full `/documents/` folder and file layout. Use
|
`<workspace_tree>` shows the full `/documents/` folder and file layout. Use
|
||||||
it to resolve paths the user describes in natural language ("my Q2 roadmap",
|
it to resolve paths the user describes in natural language ("my Q2 roadmap",
|
||||||
"last week's meeting notes") into concrete document references before
|
"last week's meeting notes") into concrete document references before
|
||||||
|
|
|
||||||
|
|
@ -7,11 +7,6 @@ decisions, conventions, architecture notes, processes, key facts. It also
|
||||||
reports current character usage versus the hard limit so you can manage the
|
reports current character usage versus the hard limit so you can manage the
|
||||||
budget. Treat it as background colour for your answer, not as the task itself.
|
budget. Treat it as background colour for your answer, not as the task itself.
|
||||||
|
|
||||||
`<priority_documents>` lists the workspace documents most relevant to the
|
|
||||||
latest user message, ranked by relevance score, with `[USER-MENTIONED]`
|
|
||||||
flagged on anything someone in the thread explicitly referenced. When the
|
|
||||||
task is about workspace content, read these first.
|
|
||||||
|
|
||||||
`<workspace_tree>` shows the full `/documents/` folder and file layout. Use
|
`<workspace_tree>` shows the full `/documents/` folder and file layout. Use
|
||||||
it to resolve paths described in natural language ("the Q2 roadmap", "last
|
it to resolve paths described in natural language ("the Q2 roadmap", "last
|
||||||
week's planning notes") into concrete document references before delegating
|
week's planning notes") into concrete document references before delegating
|
||||||
|
|
|
||||||
|
|
@ -14,5 +14,5 @@ Workflow (Understand → Plan → Act → Verify):
|
||||||
|
|
||||||
Discipline:
|
Discipline:
|
||||||
- Do not imply access to connectors, MCP tools, or deliverable generators except via **task**.
|
- Do not imply access to connectors, MCP tools, or deliverable generators except via **task**.
|
||||||
- Pass paths to **task(knowledge_base, …)** only when you saw them in `<workspace_tree>` or `<priority_documents>`. Otherwise describe the document in natural language and let the subagent resolve it.
|
- Pass paths to **task(knowledge_base, …)** only when you saw them in `<workspace_tree>`. Otherwise describe the document in natural language and let the subagent resolve it.
|
||||||
</provider_hints>
|
</provider_hints>
|
||||||
|
|
|
||||||
|
|
@ -53,14 +53,6 @@ class AgentFeatureFlags:
|
||||||
# Skills + subagents
|
# Skills + subagents
|
||||||
enable_skills: bool = True
|
enable_skills: bool = True
|
||||||
enable_specialized_subagents: bool = True
|
enable_specialized_subagents: bool = True
|
||||||
enable_kb_planner_runnable: bool = True
|
|
||||||
|
|
||||||
# KB retrieval mode — when False (default), the main agent retrieves KB
|
|
||||||
# content lazily via the on-demand ``search_knowledge_base`` tool and the
|
|
||||||
# expensive per-turn pre-injection (planner LLM + embed + hybrid search,
|
|
||||||
# ~2.3s) is skipped; explicit @-mentions are still surfaced cheaply. Set
|
|
||||||
# True to restore the original eager ``<priority_documents>`` pre-injection.
|
|
||||||
enable_kb_priority_preinjection: bool = False
|
|
||||||
|
|
||||||
# Snapshot / revert
|
# Snapshot / revert
|
||||||
enable_action_log: bool = True
|
enable_action_log: bool = True
|
||||||
|
|
@ -118,9 +110,6 @@ class AgentFeatureFlags:
|
||||||
enable_llm_tool_selector=False,
|
enable_llm_tool_selector=False,
|
||||||
enable_skills=False,
|
enable_skills=False,
|
||||||
enable_specialized_subagents=False,
|
enable_specialized_subagents=False,
|
||||||
enable_kb_planner_runnable=False,
|
|
||||||
# Full rollback restores the original eager KB pre-injection.
|
|
||||||
enable_kb_priority_preinjection=True,
|
|
||||||
enable_action_log=False,
|
enable_action_log=False,
|
||||||
enable_revert_route=False,
|
enable_revert_route=False,
|
||||||
enable_plugin_loader=False,
|
enable_plugin_loader=False,
|
||||||
|
|
@ -156,12 +145,6 @@ class AgentFeatureFlags:
|
||||||
enable_specialized_subagents=_env_bool(
|
enable_specialized_subagents=_env_bool(
|
||||||
"SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS", True
|
"SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS", True
|
||||||
),
|
),
|
||||||
enable_kb_planner_runnable=_env_bool(
|
|
||||||
"SURFSENSE_ENABLE_KB_PLANNER_RUNNABLE", True
|
|
||||||
),
|
|
||||||
enable_kb_priority_preinjection=_env_bool(
|
|
||||||
"SURFSENSE_ENABLE_KB_PRIORITY_PREINJECTION", False
|
|
||||||
),
|
|
||||||
# Snapshot / revert
|
# Snapshot / revert
|
||||||
enable_action_log=_env_bool("SURFSENSE_ENABLE_ACTION_LOG", True),
|
enable_action_log=_env_bool("SURFSENSE_ENABLE_ACTION_LOG", True),
|
||||||
enable_revert_route=_env_bool("SURFSENSE_ENABLE_REVERT_ROUTE", True),
|
enable_revert_route=_env_bool("SURFSENSE_ENABLE_REVERT_ROUTE", True),
|
||||||
|
|
@ -198,7 +181,6 @@ class AgentFeatureFlags:
|
||||||
self.enable_llm_tool_selector,
|
self.enable_llm_tool_selector,
|
||||||
self.enable_skills,
|
self.enable_skills,
|
||||||
self.enable_specialized_subagents,
|
self.enable_specialized_subagents,
|
||||||
self.enable_kb_planner_runnable,
|
|
||||||
self.enable_action_log,
|
self.enable_action_log,
|
||||||
self.enable_revert_route,
|
self.enable_revert_route,
|
||||||
self.enable_plugin_loader,
|
self.enable_plugin_loader,
|
||||||
|
|
|
||||||
|
|
@ -44,12 +44,6 @@ to page through a large document. Cite a passage by writing its `[n]` after the
|
||||||
statement it supports — the same `[n]` that passage had in
|
statement it supports — the same `[n]` that passage had in
|
||||||
`search_knowledge_base` results.
|
`search_knowledge_base` results.
|
||||||
|
|
||||||
## Priority List
|
|
||||||
|
|
||||||
You receive a `<priority_documents>` system message each turn listing the
|
|
||||||
top-K paths most relevant to the user's query (by hybrid search). Read those
|
|
||||||
first.
|
|
||||||
|
|
||||||
## Workspace Tree
|
## Workspace Tree
|
||||||
|
|
||||||
You receive a `<workspace_tree>` system message each turn with the current
|
You receive a `<workspace_tree>` system message each turn with the current
|
||||||
|
|
|
||||||
|
|
@ -37,13 +37,4 @@ directory (`cwd`).
|
||||||
- Cross-mount moves are not supported.
|
- Cross-mount moves are not supported.
|
||||||
- Desktop deletes hit disk immediately and cannot be undone via the
|
- Desktop deletes hit disk immediately and cannot be undone via the
|
||||||
agent's revert flow — confirm before calling `rm`/`rmdir`.
|
agent's revert flow — confirm before calling `rm`/`rmdir`.
|
||||||
|
|
||||||
## Priority List
|
|
||||||
|
|
||||||
You may receive a `<priority_documents>` system message listing the top-K
|
|
||||||
documents from the user's SurfSense knowledge base — these are cloud-ingested
|
|
||||||
via connectors (Notion, Slack, etc.), not local files. Treat it as a hint:
|
|
||||||
consult it when the task spans both local and cloud sources (e.g. drafting a
|
|
||||||
local note from a Notion summary); skip when the task is purely about local
|
|
||||||
files.
|
|
||||||
"""
|
"""
|
||||||
|
|
|
||||||
|
|
@ -1,4 +1,4 @@
|
||||||
"""Project ``workspace_tree_text`` + ``kb_priority`` from state into SystemMessages."""
|
"""Project ``workspace_tree_text`` from state into a SystemMessage."""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
|
@ -14,18 +14,15 @@ from app.agents.chat.multi_agent_chat.shared.state.filesystem_state import (
|
||||||
)
|
)
|
||||||
from app.utils.perf import get_perf_logger
|
from app.utils.perf import get_perf_logger
|
||||||
|
|
||||||
from .knowledge_search import _render_priority_message
|
|
||||||
|
|
||||||
_perf_log = get_perf_logger()
|
_perf_log = get_perf_logger()
|
||||||
|
|
||||||
|
|
||||||
class KbContextProjectionMiddleware(AgentMiddleware): # type: ignore[type-arg]
|
class KbContextProjectionMiddleware(AgentMiddleware): # type: ignore[type-arg]
|
||||||
"""Emit ``<workspace_tree>`` + ``<priority_documents>`` from shared state.
|
"""Emit the ``<workspace_tree>`` from shared state.
|
||||||
|
|
||||||
Read-only consumer: no DB, no LLM, no state writes. The orchestrator's
|
Read-only consumer: no DB, no LLM, no state writes. The orchestrator's
|
||||||
renderer middlewares populate the source fields; this projection lets any
|
``KnowledgeTreeMiddleware`` populates ``workspace_tree_text``; this
|
||||||
agent (orchestrator or subagent) put the same content in front of its
|
projection lets a subagent put the same tree in front of its own LLM call.
|
||||||
own LLM call.
|
|
||||||
"""
|
"""
|
||||||
|
|
||||||
tools = ()
|
tools = ()
|
||||||
|
|
@ -39,28 +36,19 @@ class KbContextProjectionMiddleware(AgentMiddleware): # type: ignore[type-arg]
|
||||||
del runtime
|
del runtime
|
||||||
start = time.perf_counter()
|
start = time.perf_counter()
|
||||||
tree_text = state.get("workspace_tree_text")
|
tree_text = state.get("workspace_tree_text")
|
||||||
priority = state.get("kb_priority")
|
if not tree_text:
|
||||||
if not tree_text and not priority:
|
|
||||||
_perf_log.info(
|
_perf_log.info(
|
||||||
"[kb_context_projection] tree=0 priority=0 elapsed=%.3fs",
|
"[kb_context_projection] tree=0 elapsed=%.3fs",
|
||||||
time.perf_counter() - start,
|
time.perf_counter() - start,
|
||||||
)
|
)
|
||||||
return None
|
return None
|
||||||
|
|
||||||
messages = list(state.get("messages") or [])
|
messages = list(state.get("messages") or [])
|
||||||
insert_at = max(len(messages) - 1, 0)
|
insert_at = max(len(messages) - 1, 0)
|
||||||
tree_chars = 0
|
messages.insert(insert_at, SystemMessage(content=tree_text))
|
||||||
if tree_text:
|
|
||||||
tree_chars = len(tree_text)
|
|
||||||
messages.insert(insert_at, SystemMessage(content=tree_text))
|
|
||||||
priority_count = 0
|
|
||||||
if priority:
|
|
||||||
priority_count = len(priority) if hasattr(priority, "__len__") else 1
|
|
||||||
messages.insert(insert_at, _render_priority_message(priority))
|
|
||||||
_perf_log.info(
|
_perf_log.info(
|
||||||
"[kb_context_projection] tree_chars=%d priority_items=%d elapsed=%.3fs",
|
"[kb_context_projection] tree_chars=%d elapsed=%.3fs",
|
||||||
tree_chars,
|
len(tree_text),
|
||||||
priority_count,
|
|
||||||
time.perf_counter() - start,
|
time.perf_counter() - start,
|
||||||
)
|
)
|
||||||
return {"messages": messages}
|
return {"messages": messages}
|
||||||
|
|
|
||||||
File diff suppressed because it is too large
Load diff
|
|
@ -13,7 +13,6 @@ extra fields needed to implement Postgres-backed virtual filesystem semantics:
|
||||||
* ``dirty_paths`` — paths whose state file content differs from DB.
|
* ``dirty_paths`` — paths whose state file content differs from DB.
|
||||||
* ``dirty_path_tool_calls`` — sidecar map ``path -> latest tool_call_id`` for
|
* ``dirty_path_tool_calls`` — sidecar map ``path -> latest tool_call_id`` for
|
||||||
dirty paths; used to bind the per-path snapshot to an action_id.
|
dirty paths; used to bind the per-path snapshot to an action_id.
|
||||||
* ``kb_priority`` — top-K priority hints rendered into a system message.
|
|
||||||
* ``kb_anon_doc`` — Redis-loaded anonymous document (if any).
|
* ``kb_anon_doc`` — Redis-loaded anonymous document (if any).
|
||||||
* ``citation_registry`` — per-conversation ``[n]`` -> source map for citations.
|
* ``citation_registry`` — per-conversation ``[n]`` -> source map for citations.
|
||||||
* ``tree_version`` — bumped by persistence; invalidates the tree render cache.
|
* ``tree_version`` — bumped by persistence; invalidates the tree render cache.
|
||||||
|
|
@ -69,14 +68,6 @@ class PendingDelete(TypedDict, total=False):
|
||||||
tool_call_id: str
|
tool_call_id: str
|
||||||
|
|
||||||
|
|
||||||
class KbPriorityEntry(TypedDict, total=False):
|
|
||||||
path: str
|
|
||||||
score: float
|
|
||||||
document_id: int | None
|
|
||||||
title: str
|
|
||||||
mentioned: bool
|
|
||||||
|
|
||||||
|
|
||||||
class KbAnonDoc(TypedDict, total=False):
|
class KbAnonDoc(TypedDict, total=False):
|
||||||
"""In-memory anonymous-session document loaded from Redis."""
|
"""In-memory anonymous-session document loaded from Redis."""
|
||||||
|
|
||||||
|
|
@ -161,9 +152,6 @@ class SurfSenseFilesystemState(FilesystemState):
|
||||||
to the latest action_id (the one the user is most likely to revert).
|
to the latest action_id (the one the user is most likely to revert).
|
||||||
"""
|
"""
|
||||||
|
|
||||||
kb_priority: NotRequired[Annotated[list[KbPriorityEntry], _replace_reducer]]
|
|
||||||
"""Top-K priority hints rendered as a system message before the user turn."""
|
|
||||||
|
|
||||||
kb_anon_doc: NotRequired[Annotated[KbAnonDoc | None, _replace_reducer]]
|
kb_anon_doc: NotRequired[Annotated[KbAnonDoc | None, _replace_reducer]]
|
||||||
"""Anonymous-session document loaded from Redis (read-only, no DB row)."""
|
"""Anonymous-session document loaded from Redis (read-only, no DB row)."""
|
||||||
|
|
||||||
|
|
@ -212,7 +200,6 @@ class SurfSenseFilesystemState(FilesystemState):
|
||||||
|
|
||||||
__all__ = [
|
__all__ = [
|
||||||
"KbAnonDoc",
|
"KbAnonDoc",
|
||||||
"KbPriorityEntry",
|
|
||||||
"PendingDelete",
|
"PendingDelete",
|
||||||
"PendingMove",
|
"PendingMove",
|
||||||
"SurfSenseFilesystemState",
|
"SurfSenseFilesystemState",
|
||||||
|
|
|
||||||
|
|
@ -2,7 +2,7 @@
|
||||||
|
|
||||||
These reducers back the extra state fields used by the cloud-mode filesystem
|
These reducers back the extra state fields used by the cloud-mode filesystem
|
||||||
agent (`cwd`, `staged_dirs`, `pending_moves`, `dirty_paths`, `doc_id_by_path`,
|
agent (`cwd`, `staged_dirs`, `pending_moves`, `dirty_paths`, `doc_id_by_path`,
|
||||||
`kb_priority`, `kb_anon_doc`, `tree_version`).
|
`kb_anon_doc`, `tree_version`).
|
||||||
|
|
||||||
Tools mutate these fields ONLY via `Command(update={...})` returns; the
|
Tools mutate these fields ONLY via `Command(update={...})` returns; the
|
||||||
reducers are responsible for merging successive updates atomically and for
|
reducers are responsible for merging successive updates atomically and for
|
||||||
|
|
@ -258,7 +258,6 @@ def _initial_filesystem_state() -> dict[str, Any]:
|
||||||
"doc_id_by_path": {},
|
"doc_id_by_path": {},
|
||||||
"dirty_paths": [],
|
"dirty_paths": [],
|
||||||
"dirty_path_tool_calls": {},
|
"dirty_path_tool_calls": {},
|
||||||
"kb_priority": [],
|
|
||||||
"kb_anon_doc": None,
|
"kb_anon_doc": None,
|
||||||
"tree_version": 0,
|
"tree_version": 0,
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -6,10 +6,9 @@ You are the SurfSense knowledge base specialist for the user's `/documents/` wor
|
||||||
|
|
||||||
- If the supervisor already provided a precise path (e.g. `/documents/notes/2026-05-11.md`), use it directly — skip the lookup steps below.
|
- If the supervisor already provided a precise path (e.g. `/documents/notes/2026-05-11.md`), use it directly — skip the lookup steps below.
|
||||||
- Otherwise, most requests reference documents by description (`"my meeting notes from last week"`, `"the design doc"`). Resolve them yourself:
|
- Otherwise, most requests reference documents by description (`"my meeting notes from last week"`, `"the design doc"`). Resolve them yourself:
|
||||||
1. Consult `<priority_documents>` — it's a hint about top-K likely matches, not a directive. Skip when the ranked entries don't fit the task.
|
1. Walk `<workspace_tree>` for descriptive folder/filename matches.
|
||||||
2. Walk `<workspace_tree>` for descriptive folder/filename matches.
|
2. Use the `glob` tool for filename patterns the tree didn't surface, and the `grep` tool when the description points at *content* rather than a name.
|
||||||
3. Use the `glob` tool for filename patterns the tree didn't surface, and the `grep` tool when the description points at *content* rather than a name.
|
3. Only return `status=blocked` with `missing_fields=["path"]` when the description is genuinely ambiguous after a thorough lookup.
|
||||||
4. Only return `status=blocked` with `missing_fields=["path"]` when the description is genuinely ambiguous after a thorough lookup.
|
|
||||||
|
|
||||||
For writes (where you choose the path yourself):
|
For writes (where you choose the path yourself):
|
||||||
|
|
||||||
|
|
@ -89,7 +88,7 @@ A KB document reads back like this — only the bracketed `[n]` is a citation la
|
||||||
**Example 2 — edit by inference:**
|
**Example 2 — edit by inference:**
|
||||||
|
|
||||||
- *Supervisor task:* `"Add a bullet about the new feature flag to my Q2 roadmap"`
|
- *Supervisor task:* `"Add a bullet about the new feature flag to my Q2 roadmap"`
|
||||||
- *You:* search for the roadmap doc — check `<priority_documents>` and `<workspace_tree>` first; if neither surfaces it, widen with the `glob` tool (try filename patterns the user's language suggests) or the `grep` tool (search by content). Suppose `<priority_documents>` hits `/documents/planning/q2-roadmap.md` → `read_file("/documents/planning/q2-roadmap.md")` → `edit_file("/documents/planning/q2-roadmap.md", old, new)` → success.
|
- *You:* search for the roadmap doc — check `<workspace_tree>` first; if it doesn't surface the doc, widen with the `glob` tool (try filename patterns the user's language suggests) or the `grep` tool (search by content). Suppose the tree hits `/documents/planning/q2-roadmap.md` → `read_file("/documents/planning/q2-roadmap.md")` → `edit_file("/documents/planning/q2-roadmap.md", old, new)` → success.
|
||||||
- *Output:* `status=success`, evidence includes path and the inserted snippet.
|
- *Output:* `status=success`, evidence includes path and the inserted snippet.
|
||||||
|
|
||||||
**Example 3 — blocked, multiple candidates:**
|
**Example 3 — blocked, multiple candidates:**
|
||||||
|
|
|
||||||
|
|
@ -9,8 +9,7 @@ You are the SurfSense workspace specialist for the user's local folders.
|
||||||
1. If you do not know which mounts exist, call `ls('/')` first.
|
1. If you do not know which mounts exist, call `ls('/')` first.
|
||||||
2. Walk likely folders with the `ls` and `list_tree` tools.
|
2. Walk likely folders with the `ls` and `list_tree` tools.
|
||||||
3. Use the `glob` tool for filename patterns; use the `grep` tool when the description points at *content* rather than a name.
|
3. Use the `glob` tool for filename patterns; use the `grep` tool when the description points at *content* rather than a name.
|
||||||
4. `<priority_documents>` lists top-K cloud-ingested docs, not local files — consult it only when the task spans both worlds (e.g. drafting a local note from a Notion source). Skip otherwise.
|
4. Only return `status=blocked` with `missing_fields=["path"]` when the description is genuinely ambiguous after a thorough lookup.
|
||||||
5. Only return `status=blocked` with `missing_fields=["path"]` when the description is genuinely ambiguous after a thorough lookup.
|
|
||||||
|
|
||||||
For writes (where you choose the path yourself):
|
For writes (where you choose the path yourself):
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -6,9 +6,8 @@ You answer workspace questions for another agent. The end user does **not** see
|
||||||
|
|
||||||
The caller's question often references documents by description (`"my meeting notes from last week"`, `"the design doc"`). Resolve them yourself:
|
The caller's question often references documents by description (`"my meeting notes from last week"`, `"the design doc"`). Resolve them yourself:
|
||||||
|
|
||||||
1. Consult `<priority_documents>` — a hint about top-K likely matches, not a directive. Skip when the ranked entries don't fit.
|
1. Walk `<workspace_tree>` for descriptive folder/filename matches.
|
||||||
2. Walk `<workspace_tree>` for descriptive folder/filename matches.
|
2. Use `glob` for filename patterns the tree didn't surface, and `grep` when the description points at *content* rather than a name.
|
||||||
3. Use `glob` for filename patterns the tree didn't surface, and `grep` when the description points at *content* rather than a name.
|
|
||||||
|
|
||||||
If a precise path was already given, use it directly — skip the lookup.
|
If a precise path was already given, use it directly — skip the lookup.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -9,7 +9,6 @@ The caller's question often references files by description (`"my meeting notes
|
||||||
1. If you do not know which mounts exist, call `ls('/')` first.
|
1. If you do not know which mounts exist, call `ls('/')` first.
|
||||||
2. Walk likely folders with the `ls` and `list_tree` tools.
|
2. Walk likely folders with the `ls` and `list_tree` tools.
|
||||||
3. Use `glob` for filename patterns; use `grep` when the description points at *content* rather than a name.
|
3. Use `glob` for filename patterns; use `grep` when the description points at *content* rather than a name.
|
||||||
4. `<priority_documents>` lists top-K cloud-ingested docs, not local files — consult it only when the task spans both worlds (e.g. drafting a local note from a Notion source). Skip otherwise.
|
|
||||||
|
|
||||||
If a precise path was already given, use it directly — skip the lookup.
|
If a precise path was already given, use it directly — skip the lookup.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -74,8 +74,9 @@ class ResolvedMentionSet:
|
||||||
``@Project``).
|
``@Project``).
|
||||||
|
|
||||||
``mentioned_document_ids`` is an ordered, deduped list consumed by
|
``mentioned_document_ids`` is an ordered, deduped list consumed by
|
||||||
the priority middleware downstream — see
|
the on-demand ``search_knowledge_base`` tool downstream (via
|
||||||
``KnowledgePriorityMiddleware._compute_priority_paths``.
|
``referenced_document_ids``) to pin @-mentioned docs into the
|
||||||
|
retrieval scope.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
mentions: list[ResolvedMention] = field(default_factory=list)
|
mentions: list[ResolvedMention] = field(default_factory=list)
|
||||||
|
|
@ -113,8 +114,8 @@ async def resolve_mentions(
|
||||||
|
|
||||||
* Legacy clients that haven't migrated to the unified chip list
|
* Legacy clients that haven't migrated to the unified chip list
|
||||||
still send the id arrays — we treat the union as authoritative.
|
still send the id arrays — we treat the union as authoritative.
|
||||||
* The id arrays are the canonical input to
|
* The id arrays are the canonical input to the retrieval scope
|
||||||
``KnowledgePriorityMiddleware`` (via ``SurfSenseContextSchema``);
|
(via ``SurfSenseContextSchema`` → ``referenced_document_ids``);
|
||||||
returning the deduped, validated lists lets the route forward
|
returning the deduped, validated lists lets the route forward
|
||||||
them unchanged.
|
them unchanged.
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -4,7 +4,6 @@ This module is the single source of truth for mapping ``Document`` rows to
|
||||||
virtual paths under ``/documents/`` and back. It is used by:
|
virtual paths under ``/documents/`` and back. It is used by:
|
||||||
|
|
||||||
* :class:`KnowledgeTreeMiddleware` (rendering the workspace tree)
|
* :class:`KnowledgeTreeMiddleware` (rendering the workspace tree)
|
||||||
* :class:`KnowledgePriorityMiddleware` (computing priority paths)
|
|
||||||
* :class:`KBPostgresBackend` (``als_info`` / ``aread`` / move operations)
|
* :class:`KBPostgresBackend` (``als_info`` / ``aread`` / move operations)
|
||||||
* :class:`KnowledgeBasePersistenceMiddleware` (resolving moves and creates)
|
* :class:`KnowledgeBasePersistenceMiddleware` (resolving moves and creates)
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -11,9 +11,9 @@ MUST live on this context object instead of being captured into a
|
||||||
middleware ``__init__`` closure. Middlewares read fields back via
|
middleware ``__init__`` closure. Middlewares read fields back via
|
||||||
``runtime.context.<field>``; tools read them via ``runtime.context``.
|
``runtime.context.<field>``; tools read them via ``runtime.context``.
|
||||||
|
|
||||||
This object is read inside both ``KnowledgePriorityMiddleware`` (for
|
This object is read by the ``search_knowledge_base`` tool (for
|
||||||
``mentioned_document_ids``) and any future middleware that needs
|
``mentioned_document_ids``) and any middleware that needs per-request
|
||||||
per-request state without invalidating the compiled-agent cache.
|
state without invalidating the compiled-agent cache.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
@ -43,13 +43,12 @@ class SurfSenseContextSchema:
|
||||||
Phase 1.5 fields:
|
Phase 1.5 fields:
|
||||||
search_space_id: Search space the request is scoped to.
|
search_space_id: Search space the request is scoped to.
|
||||||
mentioned_document_ids: KB documents the user @-mentioned this turn.
|
mentioned_document_ids: KB documents the user @-mentioned this turn.
|
||||||
Read by ``KnowledgePriorityMiddleware`` to seed its priority
|
Read by the ``search_knowledge_base`` tool to pin these docs
|
||||||
list. Stays out of the compiled-agent cache key — that's the
|
into the retrieval scope. Stays out of the compiled-agent cache
|
||||||
whole point of putting it here.
|
key — that's the whole point of putting it here.
|
||||||
mentioned_folder_ids: KB folders the user @-mentioned this turn
|
mentioned_folder_ids: KB folders the user @-mentioned this turn
|
||||||
(cloud filesystem mode). Surfaced as ``[USER-MENTIONED]``
|
(cloud filesystem mode). Pinned into the ``search_knowledge_base``
|
||||||
entries in ``<priority_documents>`` so the agent prioritises
|
retrieval scope so matches from those folders are prioritised.
|
||||||
walking those folders with ``ls`` / ``find_documents``.
|
|
||||||
file_operation_contract: One-shot file operation contract for the
|
file_operation_contract: One-shot file operation contract for the
|
||||||
upcoming turn (reserved; not currently populated).
|
upcoming turn (reserved; not currently populated).
|
||||||
turn_id / request_id: Correlation IDs surfaced by the streaming
|
turn_id / request_id: Correlation IDs surfaced by the streaming
|
||||||
|
|
|
||||||
|
|
@ -4,7 +4,7 @@ Extends ``SummarizationMiddleware`` with three SurfSense behaviors:
|
||||||
|
|
||||||
1. A structured summary template (:data:`SURFSENSE_SUMMARY_PROMPT`) instead of
|
1. A structured summary template (:data:`SURFSENSE_SUMMARY_PROMPT`) instead of
|
||||||
the base freeform prompt.
|
the base freeform prompt.
|
||||||
2. Protected SystemMessages (injected hints like ``<priority_documents>``) are
|
2. Protected SystemMessages (injected hints like ``<workspace_tree>``) are
|
||||||
kept verbatim instead of being summarized away.
|
kept verbatim instead of being summarized away.
|
||||||
3. ``content=None`` is sanitized before ``get_buffer_string`` (some providers
|
3. ``content=None`` is sanitized before ``get_buffer_string`` (some providers
|
||||||
stream tool-only AIMessages with ``None`` content, which would crash it).
|
stream tool-only AIMessages with ``None`` content, which would crash it).
|
||||||
|
|
@ -77,7 +77,6 @@ Respond ONLY with the structured summary. Do not include any text before or afte
|
||||||
# compaction step happens *before* re-injection in some paths, so we
|
# compaction step happens *before* re-injection in some paths, so we
|
||||||
# must preserve them verbatim across the cutoff.
|
# must preserve them verbatim across the cutoff.
|
||||||
PROTECTED_SYSTEM_PREFIXES: tuple[str, ...] = (
|
PROTECTED_SYSTEM_PREFIXES: tuple[str, ...] = (
|
||||||
"<priority_documents>", # KnowledgePriorityMiddleware
|
|
||||||
"<workspace_tree>", # KnowledgeTreeMiddleware
|
"<workspace_tree>", # KnowledgeTreeMiddleware
|
||||||
"<file_operation_contract>", # reserved file-operation contract prefix
|
"<file_operation_contract>", # reserved file-operation contract prefix
|
||||||
"<user_memory>", # MemoryInjectionMiddleware
|
"<user_memory>", # MemoryInjectionMiddleware
|
||||||
|
|
|
||||||
|
|
@ -78,7 +78,7 @@ async def _resolve_mention_context(
|
||||||
Automation always runs in cloud filesystem mode, so we mirror the chat
|
Automation always runs in cloud filesystem mode, so we mirror the chat
|
||||||
``new_chat`` flow: substitute ``@title`` tokens with canonical
|
``new_chat`` flow: substitute ``@title`` tokens with canonical
|
||||||
``/documents/...`` paths, prepend a ``<mentioned_connectors>`` block, and
|
``/documents/...`` paths, prepend a ``<mentioned_connectors>`` block, and
|
||||||
build a ``SurfSenseContextSchema`` that ``KnowledgePriorityMiddleware``
|
build a ``SurfSenseContextSchema`` that the ``search_knowledge_base`` tool
|
||||||
reads via ``runtime.context``. Returns ``(query, None)`` unchanged when
|
reads via ``runtime.context``. Returns ``(query, None)`` unchanged when
|
||||||
there are no mentions.
|
there are no mentions.
|
||||||
"""
|
"""
|
||||||
|
|
@ -210,7 +210,7 @@ async def run_agent_task(
|
||||||
runtime_context.turn_id = turn_id
|
runtime_context.turn_id = turn_id
|
||||||
|
|
||||||
# The compiled graph declares ``context_schema=SurfSenseContextSchema``;
|
# The compiled graph declares ``context_schema=SurfSenseContextSchema``;
|
||||||
# mentions only reach ``KnowledgePriorityMiddleware`` via ``context=``.
|
# mentions only reach the ``search_knowledge_base`` tool via ``context=``.
|
||||||
invoke_kwargs: dict[str, Any] = {"config": config}
|
invoke_kwargs: dict[str, Any] = {"config": config}
|
||||||
if runtime_context is not None:
|
if runtime_context is not None:
|
||||||
invoke_kwargs["context"] = runtime_context
|
invoke_kwargs["context"] = runtime_context
|
||||||
|
|
|
||||||
|
|
@ -53,7 +53,6 @@ class AgentFeatureFlagsRead(BaseModel):
|
||||||
|
|
||||||
enable_skills: bool
|
enable_skills: bool
|
||||||
enable_specialized_subagents: bool
|
enable_specialized_subagents: bool
|
||||||
enable_kb_planner_runnable: bool
|
|
||||||
|
|
||||||
enable_action_log: bool
|
enable_action_log: bool
|
||||||
enable_revert_route: bool
|
enable_revert_route: bool
|
||||||
|
|
|
||||||
|
|
@ -246,10 +246,10 @@ class NewChatRequest(BaseModel):
|
||||||
description=(
|
description=(
|
||||||
"Optional knowledge-base folder IDs the user mentioned with "
|
"Optional knowledge-base folder IDs the user mentioned with "
|
||||||
"@. Resolved to virtual paths (``/documents/.../``) by "
|
"@. Resolved to virtual paths (``/documents/.../``) by "
|
||||||
"``mention_resolver`` and surfaced to the agent via "
|
"``mention_resolver``, surfaced to the agent via backtick-wrapped "
|
||||||
"(a) backtick-wrapped substitution in ``user_query`` and "
|
"substitution in ``user_query`` and pinned into the "
|
||||||
"(b) a ``[USER-MENTIONED]`` entry in ``<priority_documents>``. "
|
"``search_knowledge_base`` retrieval scope. The agent's ``ls`` "
|
||||||
"The agent's ``ls`` tool can then walk the folder itself."
|
"tool can then walk the folder itself."
|
||||||
),
|
),
|
||||||
)
|
)
|
||||||
mentioned_documents: list[MentionedDocumentInfo] | None = Field(
|
mentioned_documents: list[MentionedDocumentInfo] | None = Field(
|
||||||
|
|
|
||||||
|
|
@ -22,7 +22,8 @@ def build_new_chat_runtime_context(
|
||||||
request_id: str | None,
|
request_id: str | None,
|
||||||
turn_id: str,
|
turn_id: str,
|
||||||
) -> SurfSenseContextSchema:
|
) -> SurfSenseContextSchema:
|
||||||
"""``mentioned_document_ids`` is consumed by ``KnowledgePriorityMiddleware``.
|
"""``mentioned_document_ids`` is consumed by the ``search_knowledge_base``
|
||||||
|
tool (via ``referenced_document_ids``) to pin mentioned docs into scope.
|
||||||
|
|
||||||
``accepted_folder_ids`` (post-resolve) wins over the raw
|
``accepted_folder_ids`` (post-resolve) wins over the raw
|
||||||
``mentioned_folder_ids`` from the request: the resolver drops chips that
|
``mentioned_folder_ids`` from the request: the resolver drops chips that
|
||||||
|
|
|
||||||
|
|
@ -1,61 +0,0 @@
|
||||||
"""Integration smoke tests for KB search query/date scoping."""
|
|
||||||
|
|
||||||
from __future__ import annotations
|
|
||||||
|
|
||||||
from contextlib import asynccontextmanager
|
|
||||||
from datetime import UTC, datetime, timedelta
|
|
||||||
|
|
||||||
import numpy as np
|
|
||||||
import pytest
|
|
||||||
|
|
||||||
from app.agents.chat.multi_agent_chat.shared.middleware import knowledge_search as ks
|
|
||||||
from app.agents.chat.multi_agent_chat.shared.middleware.knowledge_search import (
|
|
||||||
search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
from .conftest import DUMMY_EMBEDDING
|
|
||||||
|
|
||||||
pytestmark = pytest.mark.integration
|
|
||||||
|
|
||||||
|
|
||||||
async def test_search_knowledge_base_applies_date_filters(
|
|
||||||
db_session,
|
|
||||||
seed_date_filtered_docs,
|
|
||||||
monkeypatch,
|
|
||||||
):
|
|
||||||
"""Date filters should remove older matching documents from scoped KB results."""
|
|
||||||
|
|
||||||
@asynccontextmanager
|
|
||||||
async def fake_shielded_async_session():
|
|
||||||
yield db_session
|
|
||||||
|
|
||||||
monkeypatch.setattr(ks, "shielded_async_session", fake_shielded_async_session)
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks, "embed_texts", lambda texts: [np.array(DUMMY_EMBEDDING) for _ in texts]
|
|
||||||
)
|
|
||||||
|
|
||||||
space_id = seed_date_filtered_docs["search_space"].id
|
|
||||||
recent_cutoff = datetime.now(UTC) - timedelta(days=30)
|
|
||||||
|
|
||||||
unfiltered_results = await search_knowledge_base(
|
|
||||||
query="ocv meeting decisions",
|
|
||||||
search_space_id=space_id,
|
|
||||||
available_document_types=["FILE"],
|
|
||||||
top_k=10,
|
|
||||||
)
|
|
||||||
filtered_results = await search_knowledge_base(
|
|
||||||
query="ocv meeting decisions",
|
|
||||||
search_space_id=space_id,
|
|
||||||
available_document_types=["FILE"],
|
|
||||||
top_k=10,
|
|
||||||
start_date=recent_cutoff,
|
|
||||||
end_date=datetime.now(UTC),
|
|
||||||
)
|
|
||||||
|
|
||||||
unfiltered_ids = {result["document"]["id"] for result in unfiltered_results}
|
|
||||||
filtered_ids = {result["document"]["id"] for result in filtered_results}
|
|
||||||
|
|
||||||
assert seed_date_filtered_docs["recent_doc"].id in unfiltered_ids
|
|
||||||
assert seed_date_filtered_docs["old_doc"].id in unfiltered_ids
|
|
||||||
assert seed_date_filtered_docs["recent_doc"].id in filtered_ids
|
|
||||||
assert seed_date_filtered_docs["old_doc"].id not in filtered_ids
|
|
||||||
|
|
@ -38,7 +38,7 @@ class TestIsProtectedSystemMessage:
|
||||||
)
|
)
|
||||||
|
|
||||||
def test_tolerates_leading_whitespace(self) -> None:
|
def test_tolerates_leading_whitespace(self) -> None:
|
||||||
msg = SystemMessage(content=" \n<priority_documents>\n...")
|
msg = SystemMessage(content=" \n<workspace_tree>\n...")
|
||||||
assert _is_protected_system_message(msg) is True
|
assert _is_protected_system_message(msg) is True
|
||||||
|
|
||||||
|
|
||||||
|
|
@ -89,7 +89,7 @@ class TestPartitionMessages:
|
||||||
|
|
||||||
def test_protected_system_message_preserved_even_in_summarize_half(self) -> None:
|
def test_protected_system_message_preserved_even_in_summarize_half(self) -> None:
|
||||||
partitioner = self._build_partitioner()
|
partitioner = self._build_partitioner()
|
||||||
protected = SystemMessage(content="<priority_documents>\n...")
|
protected = SystemMessage(content="<workspace_tree>\n...")
|
||||||
msgs = [
|
msgs = [
|
||||||
HumanMessage(content="old human"),
|
HumanMessage(content="old human"),
|
||||||
AIMessage(content="old ai"),
|
AIMessage(content="old ai"),
|
||||||
|
|
|
||||||
|
|
@ -28,7 +28,6 @@ def _clear_all(monkeypatch: pytest.MonkeyPatch) -> None:
|
||||||
"SURFSENSE_ENABLE_LLM_TOOL_SELECTOR",
|
"SURFSENSE_ENABLE_LLM_TOOL_SELECTOR",
|
||||||
"SURFSENSE_ENABLE_SKILLS",
|
"SURFSENSE_ENABLE_SKILLS",
|
||||||
"SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS",
|
"SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS",
|
||||||
"SURFSENSE_ENABLE_KB_PLANNER_RUNNABLE",
|
|
||||||
"SURFSENSE_ENABLE_ACTION_LOG",
|
"SURFSENSE_ENABLE_ACTION_LOG",
|
||||||
"SURFSENSE_ENABLE_REVERT_ROUTE",
|
"SURFSENSE_ENABLE_REVERT_ROUTE",
|
||||||
"SURFSENSE_ENABLE_PLUGIN_LOADER",
|
"SURFSENSE_ENABLE_PLUGIN_LOADER",
|
||||||
|
|
@ -57,7 +56,6 @@ def test_defaults_match_shipped_agent_stack(monkeypatch: pytest.MonkeyPatch) ->
|
||||||
assert flags.enable_llm_tool_selector is False
|
assert flags.enable_llm_tool_selector is False
|
||||||
assert flags.enable_skills is True
|
assert flags.enable_skills is True
|
||||||
assert flags.enable_specialized_subagents is True
|
assert flags.enable_specialized_subagents is True
|
||||||
assert flags.enable_kb_planner_runnable is True
|
|
||||||
assert flags.enable_action_log is True
|
assert flags.enable_action_log is True
|
||||||
assert flags.enable_revert_route is True
|
assert flags.enable_revert_route is True
|
||||||
assert flags.enable_plugin_loader is False
|
assert flags.enable_plugin_loader is False
|
||||||
|
|
@ -122,7 +120,6 @@ def test_each_flag_can_be_set_independently(monkeypatch: pytest.MonkeyPatch) ->
|
||||||
"enable_llm_tool_selector": "SURFSENSE_ENABLE_LLM_TOOL_SELECTOR",
|
"enable_llm_tool_selector": "SURFSENSE_ENABLE_LLM_TOOL_SELECTOR",
|
||||||
"enable_skills": "SURFSENSE_ENABLE_SKILLS",
|
"enable_skills": "SURFSENSE_ENABLE_SKILLS",
|
||||||
"enable_specialized_subagents": "SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS",
|
"enable_specialized_subagents": "SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS",
|
||||||
"enable_kb_planner_runnable": "SURFSENSE_ENABLE_KB_PLANNER_RUNNABLE",
|
|
||||||
"enable_action_log": "SURFSENSE_ENABLE_ACTION_LOG",
|
"enable_action_log": "SURFSENSE_ENABLE_ACTION_LOG",
|
||||||
"enable_revert_route": "SURFSENSE_ENABLE_REVERT_ROUTE",
|
"enable_revert_route": "SURFSENSE_ENABLE_REVERT_ROUTE",
|
||||||
"enable_plugin_loader": "SURFSENSE_ENABLE_PLUGIN_LOADER",
|
"enable_plugin_loader": "SURFSENSE_ENABLE_PLUGIN_LOADER",
|
||||||
|
|
|
||||||
|
|
@ -90,8 +90,8 @@ class TestSubstituteInText:
|
||||||
|
|
||||||
class TestResolveMentions:
|
class TestResolveMentions:
|
||||||
"""``resolve_mentions`` resolves chip ids → virtual paths and emits
|
"""``resolve_mentions`` resolves chip ids → virtual paths and emits
|
||||||
a ``ResolvedMentionSet`` whose id partitions feed
|
a ``ResolvedMentionSet`` whose id partitions feed the
|
||||||
``KnowledgePriorityMiddleware``."""
|
``search_knowledge_base`` retrieval scope."""
|
||||||
|
|
||||||
@pytest.mark.asyncio
|
@pytest.mark.asyncio
|
||||||
async def test_returns_empty_when_no_mentions(self):
|
async def test_returns_empty_when_no_mentions(self):
|
||||||
|
|
|
||||||
|
|
@ -161,7 +161,6 @@ class TestInitialFilesystemState:
|
||||||
assert state["doc_id_by_path"] == {}
|
assert state["doc_id_by_path"] == {}
|
||||||
assert state["dirty_paths"] == []
|
assert state["dirty_paths"] == []
|
||||||
assert state["dirty_path_tool_calls"] == {}
|
assert state["dirty_path_tool_calls"] == {}
|
||||||
assert state["kb_priority"] == []
|
|
||||||
assert state["kb_anon_doc"] is None
|
assert state["kb_anon_doc"] is None
|
||||||
assert state["tree_version"] == 0
|
assert state["tree_version"] == 0
|
||||||
|
|
||||||
|
|
|
||||||
|
|
@ -1,604 +0,0 @@
|
||||||
"""Unit tests for knowledge_search middleware helpers."""
|
|
||||||
|
|
||||||
import json
|
|
||||||
|
|
||||||
import pytest
|
|
||||||
from langchain_core.messages import AIMessage, HumanMessage
|
|
||||||
|
|
||||||
from app.agents.chat.multi_agent_chat.shared.middleware import knowledge_search as ks
|
|
||||||
from app.agents.chat.multi_agent_chat.shared.middleware.knowledge_search import (
|
|
||||||
KBSearchPlan,
|
|
||||||
KnowledgePriorityMiddleware,
|
|
||||||
_normalize_optional_date_range,
|
|
||||||
_parse_kb_search_plan_response,
|
|
||||||
_render_recent_conversation,
|
|
||||||
_resolve_search_types,
|
|
||||||
)
|
|
||||||
|
|
||||||
pytestmark = pytest.mark.unit
|
|
||||||
|
|
||||||
|
|
||||||
# ── _resolve_search_types ──────────────────────────────────────────────
|
|
||||||
|
|
||||||
|
|
||||||
class TestResolveSearchTypes:
|
|
||||||
def test_returns_none_when_no_inputs(self):
|
|
||||||
assert _resolve_search_types(None, None) is None
|
|
||||||
|
|
||||||
def test_returns_none_when_both_empty(self):
|
|
||||||
assert _resolve_search_types([], []) is None
|
|
||||||
|
|
||||||
def test_includes_legacy_type_for_google_gmail(self):
|
|
||||||
result = _resolve_search_types(["GOOGLE_GMAIL_CONNECTOR"], None)
|
|
||||||
assert "GOOGLE_GMAIL_CONNECTOR" in result
|
|
||||||
assert "COMPOSIO_GMAIL_CONNECTOR" in result
|
|
||||||
|
|
||||||
def test_includes_legacy_type_for_google_drive(self):
|
|
||||||
result = _resolve_search_types(None, ["GOOGLE_DRIVE_FILE"])
|
|
||||||
assert "GOOGLE_DRIVE_FILE" in result
|
|
||||||
assert "COMPOSIO_GOOGLE_DRIVE_CONNECTOR" in result
|
|
||||||
|
|
||||||
def test_includes_legacy_type_for_google_calendar(self):
|
|
||||||
result = _resolve_search_types(["GOOGLE_CALENDAR_CONNECTOR"], None)
|
|
||||||
assert "GOOGLE_CALENDAR_CONNECTOR" in result
|
|
||||||
assert "COMPOSIO_GOOGLE_CALENDAR_CONNECTOR" in result
|
|
||||||
|
|
||||||
def test_no_legacy_expansion_for_unrelated_types(self):
|
|
||||||
result = _resolve_search_types(["FILE", "NOTE"], None)
|
|
||||||
assert set(result) == {"FILE", "NOTE"}
|
|
||||||
|
|
||||||
def test_combines_connectors_and_document_types(self):
|
|
||||||
result = _resolve_search_types(["FILE"], ["NOTE", "CRAWLED_URL"])
|
|
||||||
assert {"FILE", "NOTE", "CRAWLED_URL"}.issubset(set(result))
|
|
||||||
|
|
||||||
def test_deduplicates(self):
|
|
||||||
result = _resolve_search_types(["FILE", "FILE"], ["FILE"])
|
|
||||||
assert result.count("FILE") == 1
|
|
||||||
|
|
||||||
|
|
||||||
# ── planner parsing / date normalization ───────────────────────────────
|
|
||||||
|
|
||||||
|
|
||||||
class TestPlannerHelpers:
|
|
||||||
def test_parse_kb_search_plan_response_accepts_plain_json(self):
|
|
||||||
plan = _parse_kb_search_plan_response(
|
|
||||||
json.dumps(
|
|
||||||
{
|
|
||||||
"optimized_query": "ocv meeting decisions summary",
|
|
||||||
"start_date": "2026-03-01",
|
|
||||||
"end_date": "2026-03-31",
|
|
||||||
}
|
|
||||||
)
|
|
||||||
)
|
|
||||||
assert plan.optimized_query == "ocv meeting decisions summary"
|
|
||||||
assert plan.start_date == "2026-03-01"
|
|
||||||
assert plan.end_date == "2026-03-31"
|
|
||||||
|
|
||||||
def test_parse_kb_search_plan_response_accepts_fenced_json(self):
|
|
||||||
plan = _parse_kb_search_plan_response(
|
|
||||||
"""```json
|
|
||||||
{"optimized_query":"deel founders guide","start_date":null,"end_date":null}
|
|
||||||
```"""
|
|
||||||
)
|
|
||||||
assert plan.optimized_query == "deel founders guide"
|
|
||||||
assert plan.start_date is None
|
|
||||||
assert plan.end_date is None
|
|
||||||
|
|
||||||
def test_normalize_optional_date_range_returns_none_when_absent(self):
|
|
||||||
start_date, end_date = _normalize_optional_date_range(None, None)
|
|
||||||
assert start_date is None
|
|
||||||
assert end_date is None
|
|
||||||
|
|
||||||
def test_normalize_optional_date_range_resolves_single_bound(self):
|
|
||||||
start_date, end_date = _normalize_optional_date_range("2026-03-01", None)
|
|
||||||
assert start_date is not None
|
|
||||||
assert end_date is not None
|
|
||||||
assert start_date.date().isoformat() == "2026-03-01"
|
|
||||||
assert end_date >= start_date
|
|
||||||
|
|
||||||
|
|
||||||
class FakeLLM:
|
|
||||||
def __init__(self, response_text: str):
|
|
||||||
self.response_text = response_text
|
|
||||||
self.calls: list[dict] = []
|
|
||||||
|
|
||||||
async def ainvoke(self, messages, config=None):
|
|
||||||
self.calls.append({"messages": messages, "config": config})
|
|
||||||
return AIMessage(content=self.response_text)
|
|
||||||
|
|
||||||
|
|
||||||
class FakeBudgetLLM:
|
|
||||||
def __init__(self, *, max_input_tokens: int):
|
|
||||||
self._max_input_tokens_value = max_input_tokens
|
|
||||||
|
|
||||||
def _get_max_input_tokens(self) -> int:
|
|
||||||
return self._max_input_tokens_value
|
|
||||||
|
|
||||||
def _count_tokens(self, messages) -> int:
|
|
||||||
# Deterministic, simple proxy for tests: count characters as tokens.
|
|
||||||
return sum(len(msg.get("content", "")) for msg in messages)
|
|
||||||
|
|
||||||
|
|
||||||
class TestKnowledgePriorityMiddlewarePlanner:
|
|
||||||
@pytest.fixture(autouse=True)
|
|
||||||
def _disable_planner_runnable(self, monkeypatch):
|
|
||||||
# ``FakeLLM`` is a duck-typed mock; ``create_agent`` (used when the
|
|
||||||
# planner Runnable path is enabled) calls ``.bind()`` on the LLM,
|
|
||||||
# which the mock does not implement. Pin the flag off so the
|
|
||||||
# planner falls through to the legacy ``self.llm.ainvoke`` path
|
|
||||||
# these tests assert against (``llm.calls[0]["config"]``).
|
|
||||||
monkeypatch.setenv("SURFSENSE_ENABLE_KB_PLANNER_RUNNABLE", "false")
|
|
||||||
|
|
||||||
def test_render_recent_conversation_prefers_latest_messages_under_budget(self):
|
|
||||||
messages = [
|
|
||||||
HumanMessage(content="old user context " * 40),
|
|
||||||
AIMessage(content="old assistant answer " * 35),
|
|
||||||
HumanMessage(content="recent user context " * 20),
|
|
||||||
AIMessage(content="recent assistant answer " * 18),
|
|
||||||
HumanMessage(content="latest question"),
|
|
||||||
]
|
|
||||||
|
|
||||||
rendered = _render_recent_conversation(
|
|
||||||
messages,
|
|
||||||
llm=FakeBudgetLLM(max_input_tokens=900),
|
|
||||||
user_text="latest question",
|
|
||||||
)
|
|
||||||
|
|
||||||
assert "recent user context" in rendered
|
|
||||||
assert "recent assistant answer" in rendered
|
|
||||||
assert "latest question" not in rendered
|
|
||||||
assert rendered.index("recent user context") < rendered.index(
|
|
||||||
"recent assistant answer"
|
|
||||||
)
|
|
||||||
|
|
||||||
def test_render_recent_conversation_falls_back_to_legacy_without_budgeting(self):
|
|
||||||
messages = [
|
|
||||||
HumanMessage(content="message one"),
|
|
||||||
AIMessage(content="message two"),
|
|
||||||
HumanMessage(content="latest question"),
|
|
||||||
]
|
|
||||||
|
|
||||||
rendered = _render_recent_conversation(
|
|
||||||
messages,
|
|
||||||
llm=None,
|
|
||||||
user_text="latest question",
|
|
||||||
)
|
|
||||||
|
|
||||||
assert "user: message one" in rendered
|
|
||||||
assert "assistant: message two" in rendered
|
|
||||||
assert "latest question" not in rendered
|
|
||||||
|
|
||||||
async def test_middleware_uses_optimized_query_and_dates(self, monkeypatch):
|
|
||||||
captured: dict = {}
|
|
||||||
|
|
||||||
async def fake_search_knowledge_base(**kwargs):
|
|
||||||
captured.update(kwargs)
|
|
||||||
return []
|
|
||||||
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"search_knowledge_base",
|
|
||||||
fake_search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
llm = FakeLLM(
|
|
||||||
json.dumps(
|
|
||||||
{
|
|
||||||
"optimized_query": "ocv meeting decisions action items",
|
|
||||||
"start_date": "2026-03-01",
|
|
||||||
"end_date": "2026-03-31",
|
|
||||||
}
|
|
||||||
)
|
|
||||||
)
|
|
||||||
middleware = KnowledgePriorityMiddleware(llm=llm, search_space_id=37)
|
|
||||||
|
|
||||||
result = await middleware.abefore_agent(
|
|
||||||
{
|
|
||||||
"messages": [
|
|
||||||
HumanMessage(content="what happened in our OCV meeting last month?")
|
|
||||||
]
|
|
||||||
},
|
|
||||||
runtime=None,
|
|
||||||
)
|
|
||||||
|
|
||||||
assert result is not None
|
|
||||||
assert captured["query"] == "ocv meeting decisions action items"
|
|
||||||
assert captured["start_date"] is not None
|
|
||||||
assert captured["end_date"] is not None
|
|
||||||
assert captured["start_date"].date().isoformat() == "2026-03-01"
|
|
||||||
assert captured["end_date"].date().isoformat() == "2026-03-31"
|
|
||||||
assert llm.calls[0]["config"] == {"tags": ["surfsense:internal"]}
|
|
||||||
|
|
||||||
async def test_middleware_falls_back_when_planner_returns_invalid_json(
|
|
||||||
self,
|
|
||||||
monkeypatch,
|
|
||||||
):
|
|
||||||
captured: dict = {}
|
|
||||||
|
|
||||||
async def fake_search_knowledge_base(**kwargs):
|
|
||||||
captured.update(kwargs)
|
|
||||||
return []
|
|
||||||
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"search_knowledge_base",
|
|
||||||
fake_search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
middleware = KnowledgePriorityMiddleware(
|
|
||||||
llm=FakeLLM("not json"),
|
|
||||||
search_space_id=37,
|
|
||||||
)
|
|
||||||
|
|
||||||
await middleware.abefore_agent(
|
|
||||||
{"messages": [HumanMessage(content="summarize founders guide by deel")]},
|
|
||||||
runtime=None,
|
|
||||||
)
|
|
||||||
|
|
||||||
assert captured["query"] == "summarize founders guide by deel"
|
|
||||||
assert captured["start_date"] is None
|
|
||||||
assert captured["end_date"] is None
|
|
||||||
|
|
||||||
async def test_middleware_passes_none_dates_when_planner_returns_nulls(
|
|
||||||
self,
|
|
||||||
monkeypatch,
|
|
||||||
):
|
|
||||||
captured: dict = {}
|
|
||||||
|
|
||||||
async def fake_search_knowledge_base(**kwargs):
|
|
||||||
captured.update(kwargs)
|
|
||||||
return []
|
|
||||||
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"search_knowledge_base",
|
|
||||||
fake_search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
middleware = KnowledgePriorityMiddleware(
|
|
||||||
llm=FakeLLM(
|
|
||||||
json.dumps(
|
|
||||||
{
|
|
||||||
"optimized_query": "deel founders guide summary",
|
|
||||||
"start_date": None,
|
|
||||||
"end_date": None,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
),
|
|
||||||
search_space_id=37,
|
|
||||||
)
|
|
||||||
|
|
||||||
await middleware.abefore_agent(
|
|
||||||
{"messages": [HumanMessage(content="summarize founders guide by deel")]},
|
|
||||||
runtime=None,
|
|
||||||
)
|
|
||||||
|
|
||||||
assert captured["query"] == "deel founders guide summary"
|
|
||||||
assert captured["start_date"] is None
|
|
||||||
assert captured["end_date"] is None
|
|
||||||
|
|
||||||
async def test_middleware_routes_to_recency_browse_when_flagged(
|
|
||||||
self,
|
|
||||||
monkeypatch,
|
|
||||||
):
|
|
||||||
"""When the planner sets is_recency_query=true, browse_recent_documents
|
|
||||||
is called instead of search_knowledge_base."""
|
|
||||||
browse_captured: dict = {}
|
|
||||||
search_called = False
|
|
||||||
|
|
||||||
async def fake_browse_recent_documents(**kwargs):
|
|
||||||
browse_captured.update(kwargs)
|
|
||||||
return []
|
|
||||||
|
|
||||||
async def fake_search_knowledge_base(**kwargs):
|
|
||||||
nonlocal search_called
|
|
||||||
search_called = True
|
|
||||||
return []
|
|
||||||
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"browse_recent_documents",
|
|
||||||
fake_browse_recent_documents,
|
|
||||||
)
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"search_knowledge_base",
|
|
||||||
fake_search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
llm = FakeLLM(
|
|
||||||
json.dumps(
|
|
||||||
{
|
|
||||||
"optimized_query": "latest uploaded file",
|
|
||||||
"start_date": None,
|
|
||||||
"end_date": None,
|
|
||||||
"is_recency_query": True,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
)
|
|
||||||
middleware = KnowledgePriorityMiddleware(llm=llm, search_space_id=42)
|
|
||||||
|
|
||||||
result = await middleware.abefore_agent(
|
|
||||||
{"messages": [HumanMessage(content="what's my latest file?")]},
|
|
||||||
runtime=None,
|
|
||||||
)
|
|
||||||
|
|
||||||
assert result is not None
|
|
||||||
assert browse_captured["search_space_id"] == 42
|
|
||||||
assert not search_called
|
|
||||||
|
|
||||||
async def test_middleware_uses_hybrid_search_when_not_recency(
|
|
||||||
self,
|
|
||||||
monkeypatch,
|
|
||||||
):
|
|
||||||
"""When is_recency_query is false (default), hybrid search is used."""
|
|
||||||
search_captured: dict = {}
|
|
||||||
browse_called = False
|
|
||||||
|
|
||||||
async def fake_browse_recent_documents(**kwargs):
|
|
||||||
nonlocal browse_called
|
|
||||||
browse_called = True
|
|
||||||
return []
|
|
||||||
|
|
||||||
async def fake_search_knowledge_base(**kwargs):
|
|
||||||
search_captured.update(kwargs)
|
|
||||||
return []
|
|
||||||
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"browse_recent_documents",
|
|
||||||
fake_browse_recent_documents,
|
|
||||||
)
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"search_knowledge_base",
|
|
||||||
fake_search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
llm = FakeLLM(
|
|
||||||
json.dumps(
|
|
||||||
{
|
|
||||||
"optimized_query": "quarterly revenue report analysis",
|
|
||||||
"start_date": None,
|
|
||||||
"end_date": None,
|
|
||||||
"is_recency_query": False,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
)
|
|
||||||
middleware = KnowledgePriorityMiddleware(llm=llm, search_space_id=42)
|
|
||||||
|
|
||||||
await middleware.abefore_agent(
|
|
||||||
{"messages": [HumanMessage(content="find the quarterly revenue report")]},
|
|
||||||
runtime=None,
|
|
||||||
)
|
|
||||||
|
|
||||||
assert search_captured["query"] == "quarterly revenue report analysis"
|
|
||||||
assert not browse_called
|
|
||||||
|
|
||||||
|
|
||||||
# ── KBSearchPlan schema ────────────────────────────────────────────────
|
|
||||||
|
|
||||||
|
|
||||||
class TestKBSearchPlanSchema:
|
|
||||||
def test_is_recency_query_defaults_to_false(self):
|
|
||||||
plan = KBSearchPlan(optimized_query="test query")
|
|
||||||
assert plan.is_recency_query is False
|
|
||||||
|
|
||||||
def test_is_recency_query_parses_true(self):
|
|
||||||
plan = _parse_kb_search_plan_response(
|
|
||||||
json.dumps(
|
|
||||||
{
|
|
||||||
"optimized_query": "latest uploaded file",
|
|
||||||
"start_date": None,
|
|
||||||
"end_date": None,
|
|
||||||
"is_recency_query": True,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
)
|
|
||||||
assert plan.is_recency_query is True
|
|
||||||
assert plan.optimized_query == "latest uploaded file"
|
|
||||||
|
|
||||||
def test_missing_is_recency_query_defaults_to_false(self):
|
|
||||||
plan = _parse_kb_search_plan_response(
|
|
||||||
json.dumps(
|
|
||||||
{
|
|
||||||
"optimized_query": "meeting notes",
|
|
||||||
"start_date": None,
|
|
||||||
"end_date": None,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
)
|
|
||||||
assert plan.is_recency_query is False
|
|
||||||
|
|
||||||
|
|
||||||
# ── mentioned_document_ids cross-turn drain ────────────────────────────
|
|
||||||
|
|
||||||
|
|
||||||
class TestKnowledgePriorityMentionDrain:
|
|
||||||
"""Regression tests for the cross-turn ``mentioned_document_ids`` drain.
|
|
||||||
|
|
||||||
The compiled-agent cache reuses a single :class:`KnowledgePriorityMiddleware`
|
|
||||||
instance across turns of the same thread. ``mentioned_document_ids``
|
|
||||||
can therefore enter the middleware via two paths:
|
|
||||||
|
|
||||||
1. The constructor closure (``__init__(mentioned_document_ids=...)``) —
|
|
||||||
seeded by the cache-miss build on turn 1.
|
|
||||||
2. ``runtime.context.mentioned_document_ids`` — supplied freshly per
|
|
||||||
turn by the streaming task.
|
|
||||||
|
|
||||||
Without the drain fix, an empty ``runtime.context.mentioned_document_ids``
|
|
||||||
on turn 2 would fall through to the closure (because ``[]`` is falsy in
|
|
||||||
Python) and replay turn 1's mentions. This class pins down the
|
|
||||||
correct behaviour: the runtime path is authoritative even when empty,
|
|
||||||
and the closure is drained the first time the runtime path fires so
|
|
||||||
no later turn can ever resurrect stale state.
|
|
||||||
"""
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _make_runtime(mention_ids: list[int]):
|
|
||||||
"""Minimal runtime stub exposing only ``runtime.context.mentioned_document_ids``."""
|
|
||||||
from types import SimpleNamespace
|
|
||||||
|
|
||||||
return SimpleNamespace(
|
|
||||||
context=SimpleNamespace(mentioned_document_ids=mention_ids),
|
|
||||||
)
|
|
||||||
|
|
||||||
@staticmethod
|
|
||||||
def _planner_llm() -> "FakeLLM":
|
|
||||||
# Planner returns a stable, non-recency plan so we always land in
|
|
||||||
# the hybrid-search branch (where ``fetch_mentioned_documents`` is
|
|
||||||
# invoked alongside the main search).
|
|
||||||
return FakeLLM(
|
|
||||||
json.dumps(
|
|
||||||
{
|
|
||||||
"optimized_query": "follow up question",
|
|
||||||
"start_date": None,
|
|
||||||
"end_date": None,
|
|
||||||
"is_recency_query": False,
|
|
||||||
}
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
async def test_runtime_context_overrides_closure_and_drains_it(self, monkeypatch):
|
|
||||||
"""Turn 1 with mentions in BOTH closure and runtime context: the
|
|
||||||
runtime path wins AND the closure is drained so a future turn
|
|
||||||
cannot replay it.
|
|
||||||
"""
|
|
||||||
fetched_ids: list[list[int]] = []
|
|
||||||
|
|
||||||
async def fake_fetch_mentioned_documents(*, document_ids, search_space_id):
|
|
||||||
fetched_ids.append(list(document_ids))
|
|
||||||
return []
|
|
||||||
|
|
||||||
async def fake_search_knowledge_base(**_kwargs):
|
|
||||||
return []
|
|
||||||
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"fetch_mentioned_documents",
|
|
||||||
fake_fetch_mentioned_documents,
|
|
||||||
)
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"search_knowledge_base",
|
|
||||||
fake_search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
middleware = KnowledgePriorityMiddleware(
|
|
||||||
llm=self._planner_llm(),
|
|
||||||
search_space_id=42,
|
|
||||||
mentioned_document_ids=[1, 2, 3],
|
|
||||||
)
|
|
||||||
|
|
||||||
await middleware.abefore_agent(
|
|
||||||
{"messages": [HumanMessage(content="what is in those docs?")]},
|
|
||||||
runtime=self._make_runtime([1, 2, 3]),
|
|
||||||
)
|
|
||||||
|
|
||||||
assert fetched_ids == [[1, 2, 3]], (
|
|
||||||
"runtime.context mentions must be the source of truth on turn 1"
|
|
||||||
)
|
|
||||||
assert middleware.mentioned_document_ids == [], (
|
|
||||||
"closure must be drained the first time the runtime path fires "
|
|
||||||
"so no later turn can replay stale mentions"
|
|
||||||
)
|
|
||||||
|
|
||||||
async def test_empty_runtime_context_does_not_replay_closure_mentions(
|
|
||||||
self, monkeypatch
|
|
||||||
):
|
|
||||||
"""Regression: turn 2 with NO mentions must not surface turn 1's
|
|
||||||
mentions from the constructor closure.
|
|
||||||
|
|
||||||
Before the fix, ``if ctx_mentions:`` treated an empty list as
|
|
||||||
absent and fell through to ``elif self.mentioned_document_ids:``,
|
|
||||||
replaying turn 1's mentions. This test pins down the corrected
|
|
||||||
behaviour.
|
|
||||||
"""
|
|
||||||
fetched_ids: list[list[int]] = []
|
|
||||||
|
|
||||||
async def fake_fetch_mentioned_documents(*, document_ids, search_space_id):
|
|
||||||
fetched_ids.append(list(document_ids))
|
|
||||||
return []
|
|
||||||
|
|
||||||
async def fake_search_knowledge_base(**_kwargs):
|
|
||||||
return []
|
|
||||||
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"fetch_mentioned_documents",
|
|
||||||
fake_fetch_mentioned_documents,
|
|
||||||
)
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"search_knowledge_base",
|
|
||||||
fake_search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
# Simulate a cached middleware instance whose closure was seeded
|
|
||||||
# by a previous turn's cache-miss build (mentions=[1,2,3]).
|
|
||||||
middleware = KnowledgePriorityMiddleware(
|
|
||||||
llm=self._planner_llm(),
|
|
||||||
search_space_id=42,
|
|
||||||
mentioned_document_ids=[1, 2, 3],
|
|
||||||
)
|
|
||||||
|
|
||||||
# Turn 2: streaming task supplies an EMPTY mention list (no
|
|
||||||
# mentions on this follow-up turn).
|
|
||||||
await middleware.abefore_agent(
|
|
||||||
{"messages": [HumanMessage(content="what about the next steps?")]},
|
|
||||||
runtime=self._make_runtime([]),
|
|
||||||
)
|
|
||||||
|
|
||||||
assert fetched_ids == [], (
|
|
||||||
"fetch_mentioned_documents must NOT be called when the runtime "
|
|
||||||
"context says there are no mentions for this turn"
|
|
||||||
)
|
|
||||||
|
|
||||||
async def test_legacy_path_fires_only_when_runtime_context_absent(
|
|
||||||
self, monkeypatch
|
|
||||||
):
|
|
||||||
"""Backward-compat: if a caller doesn't supply runtime.context (old
|
|
||||||
non-streaming code path), the closure-injected mentions are still
|
|
||||||
honoured exactly once and then drained.
|
|
||||||
"""
|
|
||||||
fetched_ids: list[list[int]] = []
|
|
||||||
|
|
||||||
async def fake_fetch_mentioned_documents(*, document_ids, search_space_id):
|
|
||||||
fetched_ids.append(list(document_ids))
|
|
||||||
return []
|
|
||||||
|
|
||||||
async def fake_search_knowledge_base(**_kwargs):
|
|
||||||
return []
|
|
||||||
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"fetch_mentioned_documents",
|
|
||||||
fake_fetch_mentioned_documents,
|
|
||||||
)
|
|
||||||
monkeypatch.setattr(
|
|
||||||
ks,
|
|
||||||
"search_knowledge_base",
|
|
||||||
fake_search_knowledge_base,
|
|
||||||
)
|
|
||||||
|
|
||||||
middleware = KnowledgePriorityMiddleware(
|
|
||||||
llm=self._planner_llm(),
|
|
||||||
search_space_id=42,
|
|
||||||
mentioned_document_ids=[7, 8],
|
|
||||||
)
|
|
||||||
|
|
||||||
# First call: no runtime → legacy path uses the closure.
|
|
||||||
await middleware.abefore_agent(
|
|
||||||
{"messages": [HumanMessage(content="initial question")]},
|
|
||||||
runtime=None,
|
|
||||||
)
|
|
||||||
# Second call: still no runtime — closure already drained, so no replay.
|
|
||||||
await middleware.abefore_agent(
|
|
||||||
{"messages": [HumanMessage(content="follow up")]},
|
|
||||||
runtime=None,
|
|
||||||
)
|
|
||||||
|
|
||||||
assert fetched_ids == [[7, 8]], (
|
|
||||||
"legacy path must honour the closure exactly once and then drain it"
|
|
||||||
)
|
|
||||||
assert middleware.mentioned_document_ids == []
|
|
||||||
|
|
@ -125,12 +125,6 @@ const FLAG_GROUPS: FlagGroup[] = [
|
||||||
description: "Spin up explore / report_writer / connector_negotiator subagents.",
|
description: "Spin up explore / report_writer / connector_negotiator subagents.",
|
||||||
envVar: "SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS",
|
envVar: "SURFSENSE_ENABLE_SPECIALIZED_SUBAGENTS",
|
||||||
},
|
},
|
||||||
{
|
|
||||||
key: "enable_kb_planner_runnable",
|
|
||||||
label: "KB planner runnable",
|
|
||||||
description: "Compile a private planner sub-agent for KB search.",
|
|
||||||
envVar: "SURFSENSE_ENABLE_KB_PLANNER_RUNNABLE",
|
|
||||||
},
|
|
||||||
],
|
],
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
|
|
|
||||||
|
|
@ -19,7 +19,6 @@ const AgentFeatureFlagsSchema = z.object({
|
||||||
|
|
||||||
enable_skills: z.boolean(),
|
enable_skills: z.boolean(),
|
||||||
enable_specialized_subagents: z.boolean(),
|
enable_specialized_subagents: z.boolean(),
|
||||||
enable_kb_planner_runnable: z.boolean(),
|
|
||||||
|
|
||||||
enable_action_log: z.boolean(),
|
enable_action_log: z.boolean(),
|
||||||
enable_revert_route: z.boolean(),
|
enable_revert_route: z.boolean(),
|
||||||
|
|
|
||||||
Loading…
Add table
Add a link
Reference in a new issue