dograh/api/mcp_server
Leoy 5762095edf
feat(mcp): add search_docs tool over docs corpus (closes #295) (#316)
* feat(mcp): add search_docs tool over Mintlify docs corpus

Closes #295. The docs at https://docs.dograh.com promise "Search the
Dograh docs for how to configure a TURN server" as an MCP example
prompt, but no search_docs tool exists in the MCP server — agents can
list workspace resources but cannot search the documentation.

This adds a dependency-free, in-process keyword search over the
`docs/` tree shipped into the API image (`COPY ./docs ./docs`):

- New `api/mcp_server/tools/docs_search.py` — async `search_docs(query,
  limit=10)` with weighted scoring (path > title > body), a 25-result
  hard cap, snippet extraction around the first term hit, and graceful
  empty-list degradation when docs aren't on disk. `DOGRAH_DOCS_PATH`
  env var overrides location discovery for non-Docker layouts.

- Registered in `api/mcp_server/server.py` alongside the other tools,
  keeping the existing alphabetical ordering of the tool list.

- `api/tests/test_mcp_docs_search.py` — 18 unit tests covering the
  pure helpers (tokenizer, frontmatter stripping, title extraction,
  scoring weights, URL building) and end-to-end ranking, limit
  clamping, empty-corpus degradation, and input-validation errors.
  Mocks `authenticate_mcp_request` to avoid the DB dependency,
  mirroring `test_mcp_save_workflow.py`.
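
A minimal sketch of the pure helpers listed above, to make their described behaviour concrete. It is not the code in `docs_search.py`: only `DOGRAH_DOCS_PATH`, the default limit of 10, and the 25-result cap come from this commit message. Every function name, the `docs` fallback path, and the snippet width are assumptions for illustration.

```python
import os
import re
from pathlib import Path
from typing import List, Optional

DEFAULT_LIMIT = 10  # default from the commit message
MAX_LIMIT = 25      # hard cap from the commit message


def resolve_docs_root(default: str = "docs") -> Optional[Path]:
    """Return the docs directory, or None when it is not on disk.

    DOGRAH_DOCS_PATH takes precedence; the "docs" fallback is an assumption.
    """
    root = Path(os.environ.get("DOGRAH_DOCS_PATH", default))
    return root if root.is_dir() else None


def clamp_limit(limit: int) -> int:
    """Keep the requested limit within 1..MAX_LIMIT."""
    return max(1, min(limit, MAX_LIMIT))


def tokenize(query: str) -> List[str]:
    """Lowercase the query and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", query.lower())


def strip_frontmatter(text: str) -> str:
    """Drop a leading '---' ... '---' frontmatter block if one is present."""
    if text.startswith("---"):
        end = text.find("\n---", 3)
        if end != -1:
            return text[end + 4:].lstrip("\n")
    return text


def make_snippet(body: str, terms: List[str], width: int = 60) -> str:
    """Return text around the earliest term hit, or the start of the body."""
    lowered = body.lower()
    hits = [pos for pos in (lowered.find(t) for t in terms) if pos != -1]
    if not hits:
        return body[: 2 * width].strip()
    first = min(hits)
    return body[max(0, first - width): first + width].strip()
```

With helpers shaped like these, a missing docs directory can be handled by returning an empty result list when `resolve_docs_root()` is `None`, which matches the graceful degradation described above.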

Implementation notes:
- The docs corpus is ~100 files / ~140k LoC, so a per-call scan runs
  well under 50 ms; avoiding a vector index / embedding backend keeps
  the tool zero-dependency and lets it work in fully offline
  self-hosted deployments.
- Authentication is required for consistency with the other MCP tools
  (and to route through the existing rate-limit middleware), even
  though docs are not org-scoped data.
- Title/path matches deliberately outweigh body matches so a page
  whose subject IS the query term outranks one that merely mentions
  it incidentally.
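
The weighting note above can be sketched as follows. This is an illustration, not the shipped scorer: the commit states only the ordering path > title > body, so the numeric weights, the cap on repeated body hits, the substring matching, and the sample page paths are all hypothetical.

```python
import re

# Hypothetical weights; the commit only fixes the ordering path > title > body.
PATH_WEIGHT, TITLE_WEIGHT, BODY_WEIGHT = 10, 5, 1
BODY_HIT_CAP = 3  # assumed cap so repeated mentions cannot outweigh a title hit


def score(terms, path, title, body):
    """Sum weighted per-field matches for each query term."""
    path_l, title_l, body_l = path.lower(), title.lower(), body.lower()
    total = 0
    for term in terms:
        if term in path_l:
            total += PATH_WEIGHT
        if term in title_l:
            total += TITLE_WEIGHT
        total += BODY_WEIGHT * min(body_l.count(term), BODY_HIT_CAP)
    return total


def rank(query, pages, limit=10):
    """Return page paths ordered by descending score, dropping zero scores."""
    terms = re.findall(r"[a-z0-9]+", query.lower())
    scored = [(score(terms, p["path"], p["title"], p["body"]), p["path"]) for p in pages]
    hits = [(s, path) for s, path in scored if s > 0]
    hits.sort(key=lambda h: (-h[0], h[1]))  # ties broken by path for stable output
    return [path for _, path in hits[:limit]]


# Sample corpus; page names are made up for the example.
pages = [
    {"path": "guides/webrtc.mdx", "title": "WebRTC overview",
     "body": "You may need a TURN server. TURN relays media. TURN adds latency."},
    {"path": "deployment/turn-server.mdx", "title": "Configure a TURN server",
     "body": "Set up coturn."},
    {"path": "guides/billing.mdx", "title": "Billing", "body": "Plans and invoices."},
]

results = rank("turn server", pages)
```

Here the page whose path and title contain the query terms ranks first even though the other page mentions "TURN" more often in its body, and the unrelated page is dropped because it scores zero. Substring matching is a simplification: it also counts "turn" inside "coturn".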

* feat: improve docs search

---------

Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
2026-05-20 18:20:35 +05:30
| Name | Last commit | Date |
|---|---|---|
| tools | feat(mcp): add search_docs tool over docs corpus (closes #295) (#316) | 2026-05-20 18:20:35 +05:30 |
| ts_validator | feat: refactor node spec and add mcp tools (#244) | 2026-04-21 07:56:16 +05:30 |
| __init__.py | feat: refactor node spec and add mcp tools (#244) | 2026-04-21 07:56:16 +05:30 |
| auth.py | feat(mcp): generic MCP tool source with per-node function filtering (#301) | 2026-05-19 16:10:00 +05:30 |
| instructions.py | feat(mcp): add search_docs tool over docs corpus (closes #295) (#316) | 2026-05-20 18:20:35 +05:30 |
| server.py | feat(mcp): add search_docs tool over docs corpus (closes #295) (#316) | 2026-05-20 18:20:35 +05:30 |
| tracing.py | feat: refactor node spec and add mcp tools (#244) | 2026-04-21 07:56:16 +05:30 |
| ts_bridge.py | feat: refactor node spec and add mcp tools (#244) | 2026-04-21 07:56:16 +05:30 |