2026-04-21 07:56:16 +05:30
from fastmcp import FastMCP
feat(mcp): add search_docs tool over docs corpus (closes #295) (#316)
* feat(mcp): add search_docs tool over Mintlify docs corpus
Closes #295. The docs at https://docs.dograh.com promise "Search the
Dograh docs for how to configure a TURN server" as an MCP example
prompt, but no search_docs tool exists in the MCP server — agents can
list workspace resources but cannot search the documentation.
This adds a dependency-free, in-process keyword search over the
`docs/` tree shipped into the API image (`COPY ./docs ./docs`):
- New `api/mcp_server/tools/docs_search.py` — async `search_docs(query,
limit=10)` with weighted scoring (path > title > body), a 25-result
hard cap, snippet extraction around the first term hit, and graceful
empty-list degradation when docs aren't on disk. `DOGRAH_DOCS_PATH`
env var overrides location discovery for non-Docker layouts.
- Registered in `api/mcp_server/server.py` alongside the other tools,
keeping the existing list-alphabetical convention.
- `api/tests/test_mcp_docs_search.py` — 18 unit tests covering the
pure helpers (tokenizer, frontmatter stripping, title extraction,
scoring weights, URL building) and end-to-end ranking, limit
clamping, empty-corpus degradation, and input-validation errors.
Mocks `authenticate_mcp_request` to avoid the DB dependency,
mirroring `test_mcp_save_workflow.py`.
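The location override and empty-corpus behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in `docs_search.py`: the helper names `resolve_docs_root` and `list_doc_files`, the `docs` default, and the `.md`/`.mdx` suffix filter are all assumptions made for the example. Only the `DOGRAH_DOCS_PATH` variable and the "return an empty list when docs are missing" behavior come from the commit message.

```python
import os
from pathlib import Path
from typing import List, Optional


def resolve_docs_root(default: Path = Path("docs")) -> Optional[Path]:
    """Return the docs directory, or None when it is not on disk.

    DOGRAH_DOCS_PATH (named in the commit message) wins over the default.
    The default of ./docs is an assumption for this sketch.
    """
    override = os.environ.get("DOGRAH_DOCS_PATH")
    candidate = Path(override) if override else default
    return candidate if candidate.is_dir() else None


def list_doc_files(root: Optional[Path]) -> List[Path]:
    """Degrade to an empty list instead of raising when docs are missing."""
    if root is None:
        return []
    # The suffix filter is hypothetical; adjust to the real corpus layout.
    return sorted(p for p in root.rglob("*") if p.suffix in {".md", ".mdx"})
```

Returning `None` from the resolver keeps the "docs not found" case explicit, so the caller can hand back an empty result without catching filesystem errors.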
Implementation notes:
- The docs corpus is ~100 files / ~140k LoC, so a per-call scan runs
well under 50 ms; avoiding a vector index / embedding backend keeps
the tool zero-dependency and works for fully offline self-hosted
deployments.
- Authentication is required for consistency with the other MCP tools
(and to route through the existing rate-limit middleware), even
though docs are not org-scoped data.
- Title/path matches deliberately outweigh body matches so a page
whose subject IS the query term outranks one that merely mentions
it incidentally.
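A rough sketch of the weighted ranking described in these notes is shown below. It is an illustration under stated assumptions, not the shipped implementation: the numeric weights (5/3/1), the tokenizer regex, the page dictionaries, and the function names `tokenize`, `score`, and `search` are hypothetical. The commit message only fixes the ordering path > title > body, the default limit of 10, and the 25-result hard cap.

```python
import re
from typing import Dict, List

# Hypothetical weights; the commit only states path > title > body.
PATH_WEIGHT, TITLE_WEIGHT, BODY_WEIGHT = 5, 3, 1
MAX_LIMIT = 25  # hard cap named in the commit message


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, path: str, title: str, body: str) -> int:
    """Sum the field weights for every distinct query term that matches."""
    path_tokens = set(tokenize(path))
    title_tokens = set(tokenize(title))
    body_tokens = set(tokenize(body))
    total = 0
    for term in set(tokenize(query)):
        if term in path_tokens:
            total += PATH_WEIGHT
        if term in title_tokens:
            total += TITLE_WEIGHT
        if term in body_tokens:
            total += BODY_WEIGHT
    return total


def search(query: str, pages: List[Dict[str, str]], limit: int = 10) -> List[str]:
    """Return page paths ranked by score, clamping limit to 1..MAX_LIMIT."""
    limit = max(1, min(limit, MAX_LIMIT))
    scored = [(score(query, p["path"], p["title"], p["body"]), p["path"]) for p in pages]
    # Highest score first; ties are broken by path for a stable order.
    hits = sorted((hit for hit in scored if hit[0] > 0), key=lambda hit: (-hit[0], hit[1]))
    return [path for _, path in hits[:limit]]
```

With these weights, a page at `deployment/turn-server.mdx` titled "TURN server" outranks an introduction page that only mentions a TURN server in its body, which is the behavior the last note asks for.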
* feat: improve docs search
---------
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
2026-05-20 20:50:35 +08:00
from mcp.types import ToolAnnotations
from api.mcp_server.instructions import DOGRAH_MCP_INSTRUCTIONS
2026-04-25 17:38:38 +05:30
from api.mcp_server.tools.catalog import (
    list_credentials,
    list_documents,
    list_recordings,
    list_tools,
)
from api.mcp_server.tools.create_workflow import create_workflow
from api.mcp_server.tools.docs_search import list_docs, read_doc, search_docs
from api.mcp_server.tools.get_workflow_code import get_workflow_code
from api.mcp_server.tools.node_types import get_node_type, list_node_types
from api.mcp_server.tools.save_workflow import save_workflow
2026-05-31 16:50:44 +05:30
from api.mcp_server.tools.tool_creation import create_tool
2026-05-31 16:07:32 +05:30
from api.mcp_server.tools.voice_prompting_guide import get_voice_prompting_guide
from api.mcp_server.tools.workflows import get_workflow, list_workflows
mcp = FastMCP("dograh", instructions=DOGRAH_MCP_INSTRUCTIONS)
for _tool in (
    create_workflow,
    create_tool,
    get_node_type,
    get_workflow,
    get_workflow_code,
    list_credentials,
    list_documents,
    list_node_types,
    list_recordings,
    list_tools,
    list_workflows,
    save_workflow,
):
    mcp.tool(_tool)
_GUIDE_TOOL_ANNOTATIONS = ToolAnnotations(
    readOnlyHint=True,
    idempotentHint=True,
    destructiveHint=False,
    openWorldHint=False,
)

mcp.tool(get_voice_prompting_guide, annotations=_GUIDE_TOOL_ANNOTATIONS)
_DOCS_TOOL_ANNOTATIONS = ToolAnnotations(
    readOnlyHint=True,
    idempotentHint=True,
    destructiveHint=False,
    openWorldHint=False,
)

for _tool in (list_docs, read_doc, search_docs):
    mcp.tool(_tool, annotations=_DOCS_TOOL_ANNOTATIONS)