Mirrors the LLM treatment from #368 for the OpenAI STT and OpenAI TTS
providers. Users running OpenAI-compatible self-hosted services (vLLM,
Speaches, llama.cpp, custom proxies) can now point Dograh at them via
the OpenAI provider with `base_url`, instead of being forced onto the
Speaches provider as a workaround.
Changes:
* `registry.py` — add `base_url` field (default `https://api.openai.com/v1`)
to `OpenAISTTConfiguration` and `OpenAITTSService`, identical in shape
to the existing `OpenAILLMService.base_url` from #368.
* `service_factory.py` — in the OPENAI branches of `create_stt_service`
and `create_tts_service`, lift `base_url` off the user config, run it
through `_validate_runtime_service_url`, and forward it as a kwarg to
`OpenAISTTService` / `OpenAITTSService` (both already accept it). Same
pattern as the LLM branch.
* `test_user_configured_service_url_security.py` — adds four runtime
validation tests covering private-IP rejection and localhost rejection
in SaaS mode for both STT and TTS. Existing OSS-mode permissiveness
is unchanged (DEPLOYMENT_MODE=oss skips the validator, as before).
No schema migration needed — Pydantic populates the default; existing
configurations without `base_url` continue to talk to api.openai.com.
`check_validity.py` requires no edits because the per-service validation
loop already iterates `("base_url", "endpoint")` via `getattr`, and the
`_check_openai_api_key` dispatcher already routes OPENAI providers
through the base_url-aware code path (introduced in #368) for STT and
TTS too.
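For illustration, the kind of check the new SaaS-mode tests exercise can be sketched as below. This is not the actual `_validate_runtime_service_url` implementation; `is_unsafe_service_url` is a hypothetical helper, and a real validator would presumably also resolve DNS names before deciding.

```python
import ipaddress
from urllib.parse import urlparse


def is_unsafe_service_url(url: str) -> bool:
    """Hypothetical helper: True if the URL targets localhost or a private IP literal."""
    parsed = urlparse(url)
    host = parsed.hostname
    # Only plain http(s) URLs with a host are acceptable at all.
    if parsed.scheme not in ("http", "https") or not host:
        return True
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so it is a DNS name. This sketch lets it through;
        # resolving the name and re-checking the address is left out.
        return False
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

In this sketch `https://api.openai.com/v1` passes, while `http://10.0.0.5:8000/v1` and `http://localhost:8000/v1` are rejected.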
Tests pass locally:
pytest api/tests/test_user_configured_service_url_security.py
23 passed in 4.80s (19 existing + 4 new)
Co-authored-by: developer603 <developer603@users.noreply.github.com>
* fix: support object and array parameters in custom HTTP tools
* feat(ui): expose object and array types in the custom tool parameter editor
* fix: error handling and schema generation
---------
Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* Add Sarvam LLM provider, update Sarvam STT models, expose usage_info on run detail.
Depends on pipecat PR dograh-hq/pipecat#43 for STT string language support.
Submodule bump will follow after that merges.
* test: cover Sarvam STT language mapping; link Sarvam docs
---------
Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
* fix: stamp API key into model override at save time to survive global provider change
When a workflow overrides the TTS/LLM/STT provider to match the current
global config, the override dict only stores model/voice fields, not the
API key. If the global config later switches to a different provider, the
override can no longer inherit the API key and calls fail.
Fix: enrich_overrides_with_api_keys() copies the global provider's API
key (and other secret fields) into the override dict at workflow-save
time, making the override self-contained regardless of future global
config changes.
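A minimal sketch of the idea, assuming overrides and global configs are plain dicts with a `provider` key. The helper name `enrich_override` and the `SECRET_FIELDS` tuple are hypothetical; the real `enrich_overrides_with_api_keys()` signature is not shown here.

```python
from copy import deepcopy

# Assumption: which fields count as secrets; the real list may differ.
SECRET_FIELDS = ("api_key",)


def enrich_override(override: dict, global_cfg: dict) -> dict:
    """Return a copy of the override with secrets stamped in from the global config."""
    enriched = deepcopy(override)
    # Only inherit secrets when the override targets the same provider.
    if override.get("provider") != global_cfg.get("provider"):
        return enriched
    for field in SECRET_FIELDS:
        # Never clobber a secret the override already carries.
        if field in global_cfg and field not in enriched:
            enriched[field] = global_cfg[field]
    return enriched
```

Because the copy happens at save time, a later change of the global provider leaves the stored override untouched.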
* feat: add test coverage and masking logic
---------
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* Add OpenAI-compatible API option in model configuration
Backend-only cherry-pick from 20617db37a.
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix: harden the base url settings in SaaS mode
---------
Co-authored-by: Chris Briddock <briddockchristopher@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* feat: add MiniMax provider support (Chat + TTS)
- Add MiniMax LLM provider using OpenAI-compatible API
- Models: MiniMax-M2.7, MiniMax-M2.7-highspeed
- Default base URL: https://api.minimax.io/v1
- Uses MINIMAX_API_KEY for authentication
- Add MiniMax TTS provider using Pipecat's MiniMaxHttpTTSService
- Models: speech-2.8-hd (default), speech-2.8-turbo
- 6 built-in voices
- Requires group_id configuration
- Add unit tests for both providers
* fix(minimax): validator, temperature, session cleanup, reasoning filter
- check_validity.py: wire MiniMax into _validator_map and enforce
group_id at save time. Without this, saving a config with a valid
key was rejected.
- registry.py: surface temperature on the LLM config (gt=0; MiniMax
rejects 0) and base_url on the TTS config
- service_factory.py:
* Plumb temperature through create_llm_service
* Normalize TTS base_url to include /t2a_v2 — pipecat appends only
?GroupId=... to the URL.
* Use the new MiniMaxLLMService (from pipecat) to strip
<think>...</think> reasoning that MiniMax-M2.7 emits inline in
delta.content (otherwise it leaks straight to TTS).
* Use MiniMaxOwnedSessionTTSService so the per-instance aiohttp
session gets closed in cleanup() instead of leaking sockets/FDs.
- minimax_tts.py: small wrapper around MiniMaxHttpTTSService that owns
the session it was handed (pipecat's caller-owns-session API
conflicts with the factory's per-instance pattern).
- pipecat submodule: bumps to a commit that adds MiniMaxLLMService — a
thin OpenAILLMService subclass with the streaming <think> filter
(mirrors NvidiaLLMService's pattern for NIM reasoning models).
- Tests updated/added for all of the above.
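For readers unfamiliar with the streaming `<think>` problem: the tags can be split across deltas, so a filter has to hold back any suffix that might be the start of a tag. The sketch below is a generic illustration of that technique, not the code in pipecat's `MiniMaxLLMService`; the class name `ThinkFilter` is hypothetical.

```python
class ThinkFilter:
    """Hypothetical streaming filter that drops <think>...</think> spans."""

    OPEN, CLOSE = "<think>", "</think>"

    def __init__(self) -> None:
        self._buf = ""        # text not yet classified
        self._inside = False  # True while inside a <think> span

    def feed(self, chunk: str) -> str:
        """Consume one delta and return the text that is safe to emit."""
        self._buf += chunk
        out = []
        while self._buf:
            tag = self.CLOSE if self._inside else self.OPEN
            idx = self._buf.find(tag)
            if idx != -1:
                if not self._inside:
                    out.append(self._buf[:idx])
                self._buf = self._buf[idx + len(tag):]
                self._inside = not self._inside
                continue
            # No complete tag: hold back the longest suffix that could
            # still turn into the tag once the next delta arrives.
            keep = 0
            for n in range(min(len(tag) - 1, len(self._buf)), 0, -1):
                if self._buf.endswith(tag[:n]):
                    keep = n
                    break
            emit_len = len(self._buf) - keep
            if not self._inside:
                out.append(self._buf[:emit_len])
            self._buf = self._buf[emit_len:]
            break
        return "".join(out)
```

Feeding the deltas `"Hello <th"`, `"ink>secret</thi"`, `"nk> world"` one at a time yields `"Hello "`, `""`, and `" world"`, so the reasoning text never reaches TTS.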
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
* feat(mcp): add search_docs tool over Mintlify docs corpus
Closes #295. The docs at https://docs.dograh.com promise "Search the
Dograh docs for how to configure a TURN server" as an MCP example
prompt, but no search_docs tool exists in the MCP server — agents can
list workspace resources but cannot search the documentation.
This adds a dependency-free, in-process keyword search over the
`docs/` tree shipped into the API image (`COPY ./docs ./docs`):
- New `api/mcp_server/tools/docs_search.py` — async `search_docs(query,
limit=10)` with weighted scoring (path > title > body), a 25-result
hard cap, snippet extraction around the first term hit, and graceful
empty-list degradation when docs aren't on disk. `DOGRAH_DOCS_PATH`
env var overrides location discovery for non-Docker layouts.
- Registered in `api/mcp_server/server.py` alongside the other tools,
keeping the existing list-alphabetical convention.
- `api/tests/test_mcp_docs_search.py` — 18 unit tests covering the
pure helpers (tokenizer, frontmatter stripping, title extraction,
scoring weights, URL building) and end-to-end ranking, limit
clamping, empty-corpus degradation, and input-validation errors.
Mocks `authenticate_mcp_request` to avoid the DB dependency,
mirroring `test_mcp_save_workflow.py`.
Implementation notes:
- The docs corpus is ~100 files / ~140k LoC, so a per-call scan runs
well under 50 ms; avoiding a vector index / embedding backend keeps
the tool zero-dependency and works for fully offline self-hosted
deployments.
- Authentication is required for consistency with the other MCP tools
(and to route through the existing rate-limit middleware), even
though docs are not org-scoped data.
- Title/path matches deliberately outweigh body matches so a page
whose subject IS the query term outranks one that merely mentions
it incidentally.
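The weighting can be sketched roughly as follows. The weight values, the function names, and the dict shape of a document are assumptions for illustration; only the ordering (path > title > body), the default limit of 10, and the 25-result cap come from the description above.

```python
import re

# Assumed weights; the description only fixes the ordering path > title > body.
WEIGHTS = {"path": 3, "title": 2, "body": 1}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, path: str, title: str, body: str) -> int:
    """Sum weighted term occurrences across the three fields."""
    terms = set(tokenize(query))
    fields = {"path": path, "title": title, "body": body}
    total = 0
    for name, text in fields.items():
        tokens = tokenize(text)
        total += WEIGHTS[name] * sum(tokens.count(t) for t in terms)
    return total


def search(query: str, docs: list[dict], limit: int = 10, hard_cap: int = 25) -> list[str]:
    """Return paths of matching docs, best first, clamped to the hard cap."""
    limit = max(1, min(limit, hard_cap))
    ranked = sorted(((score(query, **d), d["path"]) for d in docs), reverse=True)
    return [path for s, path in ranked if s > 0][:limit]
```

With these weights, a page whose path and title both contain the query term scores 5 per term, against 1 for a page that only mentions it once in the body.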
* feat: improve docs search
---------
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* Add tuner integration
* bump pipecat version
* chore: update pipecat submodule to match upstream and use tuner-pipecat-sdk 0.2.0
Update pipecat submodule from 0.0.109.dev23 to 13e98d0d9 (the exact commit
upstream dograh-hq/dograh uses after v1.30.1). This installs pipecat-ai as
1.1.0.post277 via setuptools_scm, satisfying tuner-pipecat-sdk 0.2.0's
pipecat-ai>=1.0.0 requirement.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* wire tuner
* feat: refactor integrations into self contained packages
* chore: simplify ensure_public_access_token
* fix: remove NodeSpec and make DTOs the source of truth
* feat: send relevant signal to mcp using to_mcp_dict
* fix: fix tests
* cleanup: remove nango integrations
* feat: add agents.md for integrations
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* Resolve an issue with direct socket connections using the wrong event data.
* Resolve the formatting issues in the provider file
* chore: fix import ordering with ruff
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Nir Simionovich <nirs@cloudonix.io>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ari): pre-register ext channel id and defer bridge to its StasisStart
Two race conditions in the inbound ARI flow could leave a call silent:
1. Bridging both channels immediately after creating the ext media leg
raced against the ext channel entering the Stasis application; slow
chan_websocket handshakes produced "Channel not in Stasis application"
422 errors on addChannel.
2. Asterisk could fire StasisStart for the ext channel before the
externalMedia POST response returned, so _is_ext_channel returned
False and the event was dropped as an unknown outbound call.
Fixes:
- Generate the ext channel id as dograh-ext-<uuid> client-side and pass
it to Asterisk via the channelId query param. Mark the ext channel,
set its channel->run mapping, register the pending bridge entry, and
persist gathered_context.ext_channel_id all before the POST.
- Defer the bridge to a new _complete_bridge_after_ext_ready handler
triggered by the ext channel's own StasisStart. Both channels are
guaranteed in Stasis by then, so addChannel cannot 422.
- On POST failure or channelId mismatch, roll back the pending entry
and ERROR loudly.
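The ordering that removes both races can be sketched as below. This is a simplified illustration with an in-memory dict and a hypothetical `PendingBridges` class; it omits the ARI HTTP calls, the channel->run mapping, and persistence.

```python
import uuid


class PendingBridges:
    """Hypothetical registry: reserve the ext channel id before the POST."""

    def __init__(self) -> None:
        self._pending: dict[str, str] = {}  # ext_channel_id -> caller_channel_id

    def reserve(self, caller_channel_id: str) -> str:
        """Generate the ext id client-side and register it up front."""
        ext_id = f"dograh-ext-{uuid.uuid4()}"
        self._pending[ext_id] = caller_channel_id
        return ext_id

    def rollback(self, ext_id: str) -> None:
        """Undo the reservation, e.g. when the externalMedia POST fails."""
        self._pending.pop(ext_id, None)

    def on_stasis_start(self, channel_id: str):
        """Return (caller, ext) to bridge if this is a reserved ext channel."""
        caller = self._pending.pop(channel_id, None)
        return None if caller is None else (caller, channel_id)
```

Since the id is known before Asterisk is contacted, a StasisStart that arrives ahead of the POST response is still recognised, and the bridge is only attempted once the ext channel has entered Stasis.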
* fix: replace in-memory dict with redis storage
* feat: configurable ElevenLabs base URL for Data Residency (#269)
Adds a `base_url` field to `ElevenlabsTTSConfiguration` so users on an
ElevenLabs Data Residency plan (EU, etc.) can point Dograh at the
regional endpoint instead of the hardcoded global one. Defaults to
`https://api.elevenlabs.io`, preserving existing behaviour. The
service factory rewrites the HTTP scheme to WSS when constructing the
WebSocket TTS service.
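The scheme rewrite amounts to something like the following sketch. `to_websocket_url` is a hypothetical name, and the actual factory code may build the URL differently.

```python
from urllib.parse import urlsplit, urlunsplit


def to_websocket_url(base_url: str) -> str:
    """Map http -> ws and https -> wss, leaving the rest of the URL intact."""
    parts = urlsplit(base_url)
    scheme = {"https": "wss", "http": "ws"}.get(parts.scheme, parts.scheme)
    return urlunsplit((scheme, parts.netloc, parts.path, parts.query, parts.fragment))
```

Under this sketch the default `https://api.elevenlabs.io` becomes `wss://api.elevenlabs.io`.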
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: fix drift
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Conference-based transfer for Telnyx with a two-step flow:
- transfer_call dials the destination with a per-transfer webhook URL.
- On call.answered, the webhook seeds a conference with the destination's
live call_control_id and publishes DESTINATION_ANSWERED.
- TelnyxConferenceStrategy joins the caller into the conference on
pipeline teardown (EndTaskReason.TRANSFER_CALL).
- On post-answer destination hangup, the webhook hangs up the caller —
Telnyx's create_conference doesn't accept end_conference_on_exit on
the seed leg, so we tear down the bridge ourselves.
TransferContext gains optional workflow_run_id (for webhook→provider
resolution in multi-config orgs) and conference_id (set on answer,
read by the strategy).
Also fixes the transfer tool's provider lookup to go through
get_telephony_provider_for_run instead of the deprecated org-default
shim, which was returning the wrong provider in multi-config orgs.
* filter out local SDP candidates in non-local environments
* feat: add FORCE_TURN_RELAY variable
* add FORCE_TURN_RELAY option in docker-compose
* fix: fix github workflow
If there are multiple telephony configurations, the from number should be initialized from the campaign's telephony configuration rather than the organization's default telephony configuration.