The UserIdleHandler injected its "are you still there?" and disconnect
prompts as role="user" messages. These are agent-side directives, not
user utterances, so they should be injected as role="system" to avoid
polluting the conversation transcript with fake user turns and so that the
LLM interprets them correctly. Updated the realtime append tests to match.
Also forward ports 3000 (UI) and 8000 (API) in the devcontainer so the
running services are reachable from the host.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix(telephony): handle Cloudonix CDR payloads missing session/disposition
The /cloudonix/cdr webhook is a public, unauthenticated endpoint that parses
arbitrary external JSON. It dereferenced cdr_data.get("session").get("token")
unconditionally, so a partial or malformed CDR payload that omits "session"
(or sends "session": null) raised AttributeError -> HTTP 500. The existing
"Missing call_id field" guard right below it was unreachable because the crash
happened first.
StatusCallbackRequest.from_cloudonix_cdr had the same fragility plus a second
one: data.get("disposition", "") returns None when the key is present-but-null,
and None.upper() then crashed.
Navigate both fields defensively so missing/null values fall through to the
intended graceful error path instead of crashing. Adds regression tests
covering missing session, null session, null disposition, and the well-formed
mapping path.
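For illustration, the defensive navigation described above might look roughly like the sketch below. The helper names `extract_session_token` and `normalize_disposition` are hypothetical and are not the functions in the codebase; only the payload keys ("session", "token", "disposition") come from this message.

```python
from typing import Any, Optional


def extract_session_token(cdr_data: Any) -> Optional[str]:
    """Return session.token, or None if any level is missing, null, or mistyped."""
    if not isinstance(cdr_data, dict):
        return None
    session = cdr_data.get("session")
    if not isinstance(session, dict):  # covers a missing key and "session": null
        return None
    token = session.get("token")
    return token if isinstance(token, str) and token else None


def normalize_disposition(cdr_data: dict) -> str:
    """Upper-case the disposition, treating a present-but-null value as empty."""
    # `or ""` matters: .get("disposition", "") still returns None for an explicit null.
    value = cdr_data.get("disposition") or ""
    return str(value).upper()
```

With helpers shaped like these, a None token can fall through to the existing "Missing call_id field" guard instead of raising.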
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
* fix: harden cloudonix cdr session validation
* chore: renamed test path
---------
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
* feat: add model config v2
* chore: centralize user org selection
* chore: move preferences to platform settings
* fix: decouple org preference and ai model preferences
* fix: add CORS preflight handler and ACAO header for embed config endpoint
The GET /public/embed/config/{token} endpoint is fetched by external
websites (third-party embed sites). The global CORSMiddleware only covers
first-party origins, so external origins received no Access-Control-Allow-
Origin header, causing browser preflight failures.
Add an OPTIONS /config/{token} handler that validates the origin against the
token's allowed_domains list and returns the appropriate CORS headers.
Also inject Access-Control-Allow-Origin into the GET response via FastAPI's
response parameter so the actual request succeeds cross-origin.
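A minimal sketch of the origin check and header construction, under stated assumptions: `origin_allowed` and `cors_headers` are hypothetical names, and the subdomain-matching rule is an assumption that may not match the project's own `allowed_domains` semantics. It is framework-free so it can be read on its own.

```python
from typing import Optional
from urllib.parse import urlsplit


def origin_allowed(origin: Optional[str], allowed_domains: list[str]) -> bool:
    """Assumed rule: exact host match, or a subdomain of an allowed domain."""
    if not origin:
        return False
    host = (urlsplit(origin).hostname or "").lower()
    if not host:
        return False
    for domain in allowed_domains:
        domain = domain.lower().strip()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def cors_headers(origin: Optional[str], allowed_domains: list[str]) -> dict[str, str]:
    """Headers for an OPTIONS preflight; empty when the origin is not allowed."""
    if not origin_allowed(origin, allowed_domains):
        return {}
    return {
        # Echo the specific origin rather than "*", since the allow-list is per token.
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type",
        "Vary": "Origin",
    }
```

Returning no headers for a disallowed origin lets the browser fail the preflight without the server revealing which domains are on the list.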
Closes #383
* fix: complete public embed CORS handling
---------
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* fix: make email lookup case-insensitive in get_user_by_email
Email addresses are case-insensitive in practice, but get_user_by_email
compared with an exact `UserModel.email == email` predicate. A user who
signed up as "User@example.com" could not be found when logging in as
"user@example.com" (and vice-versa), so the same person could fail to log
in — or be treated as a brand-new account — depending only on how their
client capitalized the address.
Compare on `func.lower(UserModel.email) == func.lower(email)` so lookups
are robust to capitalization. Minimal and backwards-compatible: it works
with existing mixed-case rows immediately, with no migration required.
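The effect of the predicate change can be reproduced with the standard-library `sqlite3` module. This is only an illustration: the `users` table and both helper functions are hypothetical, and the project itself uses the SQLAlchemy `func.lower(...)` form quoted above rather than raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (email) VALUES (?)", ("User@example.com",))


def get_user_id_exact(email: str):
    # Old behaviour: exact comparison, sensitive to capitalization.
    row = conn.execute("SELECT id FROM users WHERE email = ?", (email,)).fetchone()
    return row[0] if row else None


def get_user_id_case_insensitive(email: str):
    # New behaviour: lower-case both sides, so existing mixed-case rows still match.
    row = conn.execute(
        "SELECT id FROM users WHERE lower(email) = lower(?)", (email,)
    ).fetchone()
    return row[0] if row else None
```

Because both sides are lowered at query time, no stored row has to be rewritten for the lookup to succeed.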
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* fix: enforce case-insensitive user emails
---------
Co-authored-by: developer603 <vrramsolutions@gmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
The public WebRTC signaling WebSocket (`/public/signaling/{session_token}`)
validated only the session token and its expiry, not the embed token's
allowed-domain policy that the HTTP embed endpoints already enforce. A leaked
or replayed session token could therefore attach to the signaling path from
an arbitrary origin.
Validate the request origin against `embed_token.allowed_domains` (reusing the
existing `validate_origin` helper) before the signaling handoff, rejecting
disallowed origins with a 1008 close — mirroring the HTTP embed endpoints.
Closes #330
Co-authored-by: shiminshen <16914659+shiminshen@users.noreply.github.com>
Transfer-context lookup by original_call_sid ran
`redis.keys("transfer:context:*")` and iterated every match — an O(N)
blocking scan on call-control hot paths, duplicated across the ARI
manager and the Twilio/Telnyx conference strategies.
Maintain a `transfer:by_call_sid:{original_call_sid}` -> transfer_id
secondary index, written and cleared alongside the context in
store/remove, and resolve lookups with a direct GET. Route the
Twilio/Telnyx strategies through the manager so the lookup lives in one
place (also dropping per-call ad-hoc Redis connections).
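The secondary-index idea can be sketched as follows. `FakeRedis` is an in-memory stand-in so the example runs without a server, and `store_context`, `find_by_call_sid`, and `remove_context` are hypothetical names, not the manager's actual methods. Only the two key patterns come from this message.

```python
import json
from typing import Optional


class FakeRedis:
    """In-memory stand-in for the few Redis commands used here."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def delete(self, *keys: str) -> None:
        for key in keys:
            self._data.pop(key, None)


def store_context(r: FakeRedis, transfer_id: str, context: dict) -> None:
    r.set(f"transfer:context:{transfer_id}", json.dumps(context))
    # Secondary index: original_call_sid -> transfer_id, written with the context.
    r.set(f"transfer:by_call_sid:{context['original_call_sid']}", transfer_id)


def find_by_call_sid(r: FakeRedis, call_sid: str) -> Optional[dict]:
    # Two direct GETs replace the KEYS scan over every stored context.
    transfer_id = r.get(f"transfer:by_call_sid:{call_sid}")
    if transfer_id is None:
        return None
    raw = r.get(f"transfer:context:{transfer_id}")
    return json.loads(raw) if raw is not None else None


def remove_context(r: FakeRedis, transfer_id: str) -> None:
    raw = r.get(f"transfer:context:{transfer_id}")
    if raw is not None:
        sid = json.loads(raw)["original_call_sid"]
        r.delete(f"transfer:by_call_sid:{sid}")  # clear the index with the context
    r.delete(f"transfer:context:{transfer_id}")
```

Clearing the index in the same code path that removes the context keeps the two keys from drifting apart.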
Closes #328
Co-authored-by: shiminshen <16914659+shiminshen@users.noreply.github.com>
* feat: add Azure AI multi-provider support (TTS, STT, Embeddings, Realtime)
Enables Azure AI services across all model layers so users with Azure
credits can consolidate billing on a single provider.
- Voice (TTS): AzureSpeechTTSConfiguration via azure_speech provider
- Transcriber (STT): AzureSpeechSTTConfiguration via azure_speech provider
- Embedding: AzureOpenAIEmbeddingsConfiguration via azure provider
- Realtime: AzureRealtimeLLMConfiguration via azure_realtime provider
New files:
- api/services/pipecat/realtime/azure_realtime.py
- api/services/gen_ai/embedding/azure_openai_service.py
- api/tests/test_azure_speech_service_factory.py
The UI picks up all four providers automatically from the schema —
no frontend changes required.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: add validation for URL and params
---------
Co-authored-by: Vishal Dhateria <vishal@finela.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
Mirrors the LLM treatment from #368 for the OpenAI STT and OpenAI TTS
providers. Users running OpenAI-compatible self-hosted services (vLLM,
Speaches, llama.cpp, custom proxies) can now point Dograh at them via
the OpenAI provider with `base_url`, instead of being forced onto the
Speaches provider as a workaround.
Changes:
* `registry.py` — add `base_url` field (default `https://api.openai.com/v1`)
to `OpenAISTTConfiguration` and `OpenAITTSService`, identical in shape
to the existing `OpenAILLMService.base_url` from #368.
* `service_factory.py` — in the OPENAI branches of `create_stt_service`
and `create_tts_service`, lift `base_url` off the user config, run it
through `_validate_runtime_service_url`, and forward it as a kwarg to
`OpenAISTTService` / `OpenAITTSService` (both already accept it). Same
pattern as the LLM branch.
* `test_user_configured_service_url_security.py` — adds four runtime
validation tests covering private-IP rejection and localhost rejection
in SaaS mode for both STT and TTS. Existing OSS-mode permissiveness
is unchanged (DEPLOYMENT_MODE=oss skips the validator, as before).
No schema migration needed — Pydantic populates the default; existing
configurations without `base_url` continue to talk to api.openai.com.
`check_validity.py` requires no edits because the per-service validation
loop already iterates `("base_url", "endpoint")` via `getattr`, and the
`_check_openai_api_key` dispatcher already routes OPENAI providers
through the base_url-aware code path (introduced in #368) for STT and
TTS too.
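As a rough illustration of the kind of check the new tests exercise (private-IP and localhost rejection in SaaS mode, permissive in OSS mode), here is a standalone sketch. `validate_service_url` and `is_allowed` are hypothetical and are not the project's `_validate_runtime_service_url`. The sketch only inspects literal hosts and does not resolve DNS, which a real validator may also need to do.

```python
import ipaddress
from urllib.parse import urlsplit


def validate_service_url(url: str, deployment_mode: str = "saas") -> str:
    """Reject localhost and non-public IP literals unless running in OSS mode."""
    if deployment_mode == "oss":
        return url  # self-hosted installs may target local services
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("base_url must be an http(s) URL with a host")
    host = parts.hostname.lower()
    if host == "localhost" or host.endswith(".localhost"):
        raise ValueError("localhost is not allowed")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return url  # a hostname, not an IP literal; DNS checks are out of scope here
    if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
        raise ValueError("non-public IP addresses are not allowed")
    return url


def is_allowed(url: str, deployment_mode: str = "saas") -> bool:
    """Boolean wrapper so callers and tests can avoid try/except."""
    try:
        validate_service_url(url, deployment_mode)
        return True
    except ValueError:
        return False
```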
Tests pass locally:
pytest api/tests/test_user_configured_service_url_security.py
23 passed in 4.80s (19 existing + 4 new)
Co-authored-by: developer603 <developer603@users.noreply.github.com>
* fix: support object and array parameters in custom HTTP tools
* feat(ui): expose object and array types in the custom tool parameter editor
* fix: error handling and schema generation
---------
Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* Add Sarvam LLM provider, update Sarvam STT models, expose usage_info on run detail.
Depends on pipecat PR dograh-hq/pipecat#43 for STT string language support.
Submodule bump will follow after that merges.
* test: cover Sarvam STT language mapping; link Sarvam docs
---------
Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
* fix: run api container as non-root dograh user
The runner stage had no USER directive, causing the API process to run
as root inside the container. Add a system user/group and transfer
ownership of /app before switching to it, so the container process
runs with minimal privileges.
* fix: chown /app and use COPY --chown for non-root runner
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: stamp API key into model override at save time to survive global provider change
When a workflow overrides the TTS/LLM/STT provider to match the current
global config, the override dict only stores model/voice fields, not the
API key. If the global config later switches to a different provider, the
override can no longer inherit the API key and calls fail.
Fix: enrich_overrides_with_api_keys() copies the global provider's API
key (and other secret fields) into the override dict at workflow-save
time, making the override self-contained regardless of future global
config changes.
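The enrichment step can be pictured with the sketch below. It is not the body of `enrich_overrides_with_api_keys()`: the function name `stamp_secrets`, the dict shapes, and the `SECRET_FIELDS` tuple are all assumptions made for illustration.

```python
import copy

# Assumed field list; the message above mentions "other secret fields" as well.
SECRET_FIELDS = ("api_key",)


def stamp_secrets(overrides: dict, global_config: dict) -> dict:
    """Return a copy of overrides with secrets copied in where providers match."""
    result = copy.deepcopy(overrides)
    for service, override in result.items():  # e.g. "llm", "tts", "stt"
        global_service = global_config.get(service) or {}
        if override.get("provider") != global_service.get("provider"):
            continue  # different provider: the global key would not apply
        for field in SECRET_FIELDS:
            # Never overwrite a secret the override already carries.
            if not override.get(field) and global_service.get(field):
                override[field] = global_service[field]
    return result
```

Working on a deep copy keeps the caller's dict untouched, so whether to persist the enriched version stays an explicit decision at save time.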
* feat: add test coverage and masking logic
---------
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* Add OpenAI-compatible API option in model configuration
Backend-only cherry-pick from 20617db37a.
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix: harden the base url settings in SaaS mode
---------
Co-authored-by: Chris Briddock <briddockchristopher@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* feat: add devcontainer for local setup
* feat: add local install hook
* feat: add devcontainer based setup docs
* feat: use uv in api/Dockerfile
* fix: fix CI scripts
* fix: fix post job cleanup step
* feat: add MiniMax provider support (Chat + TTS)
- Add MiniMax LLM provider using OpenAI-compatible API
- Models: MiniMax-M2.7, MiniMax-M2.7-highspeed
- Default base URL: https://api.minimax.io/v1
- Uses MINIMAX_API_KEY for authentication
- Add MiniMax TTS provider using Pipecat's MiniMaxHttpTTSService
- Models: speech-2.8-hd (default), speech-2.8-turbo
- 6 built-in voices
- Requires group_id configuration
- Add unit tests for both providers
* fix(minimax): validator, temperature, session cleanup, reasoning filter
- check_validity.py: wire MiniMax into _validator_map and enforce
group_id at save time. Without this, saving a config with a valid
key was rejected.
- registry.py: surface temperature on the LLM config (gt=0; MiniMax
rejects 0) and base_url on the TTS config
- service_factory.py:
* Plumb temperature through create_llm_service
* Normalize TTS base_url to include /t2a_v2 — pipecat appends only
?GroupId=... to the URL.
* Use the new MiniMaxLLMService (from pipecat) to strip
<think>...</think> reasoning that MiniMax-M2.7 emits inline in
delta.content (otherwise it leaks straight to TTS).
* Use MiniMaxOwnedSessionTTSService so the per-instance aiohttp
session gets closed in cleanup() instead of leaking sockets/FDs.
- minimax_tts.py: small wrapper around MiniMaxHttpTTSService that owns
the session it was handed (pipecat's caller-owns-session API
conflicts with the factory's per-instance pattern).
- pipecat submodule: bumps to a commit that adds MiniMaxLLMService — a
thin OpenAILLMService subclass with the streaming <think> filter
(mirrors NvidiaLLMService's pattern for NIM reasoning models).
- Tests updated/added for all of the above.
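To show why the reasoning filter has to be streaming-aware, here is a simplified sketch of stripping <think>...</think> from chunked text when a tag may be split across deltas. `ThinkFilter` is a hypothetical class written for this note; it is not pipecat's MiniMaxLLMService and makes no claim about how that service is implemented.

```python
class ThinkFilter:
    """Drop <think>...</think> spans from a stream of text chunks."""

    OPEN, CLOSE = "<think>", "</think>"

    def __init__(self) -> None:
        self._buf = ""
        self._inside = False

    def feed(self, chunk: str) -> str:
        self._buf += chunk
        out = []
        while self._buf:
            tag = self.CLOSE if self._inside else self.OPEN
            idx = self._buf.find(tag)
            if idx != -1:
                if not self._inside:
                    out.append(self._buf[:idx])  # visible text before <think>
                self._buf = self._buf[idx + len(tag):]
                self._inside = not self._inside
                continue
            # No complete tag: hold back only a tail that could start the tag.
            keep = 0
            for n in range(min(len(tag) - 1, len(self._buf)), 0, -1):
                if self._buf.endswith(tag[:n]):
                    keep = n
                    break
            emit_upto = len(self._buf) - keep
            if not self._inside:
                out.append(self._buf[:emit_upto])
            self._buf = self._buf[emit_upto:]
            break
        return "".join(out)

    def flush(self) -> str:
        """Call at end of stream; releases a held-back partial tag if outside."""
        tail = "" if self._inside else self._buf
        self._buf = ""
        return tail
```

Holding back only the possible tag prefix keeps added latency to a few characters, which matters when the output feeds TTS.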
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
Closes three known advisories in python-multipart, all reachable
from the FastAPI multipart form-parser used across the API
(transcribe_audio, knowledge_base uploads, presigned upload flows):
- GHSA-wp53-j4wj-2cfg (HIGH, CWE-22) — arbitrary file write via
non-default configuration. Fixed in 0.0.22.
- GHSA-pp6c-gr5w-3c5g (HIGH, CWE-400) — DoS via unbounded multipart
part headers. Fixed in 0.0.27.
- GHSA-mj87-hwqh-73pj (MOD, CWE-400) — DoS via large multipart
preamble or epilogue. Fixed in 0.0.26.
0.0.27 is a patch-level bump within the same 0.0.x line, no API
changes; fastapi==0.135.3 only requires python-multipart>=0.0.7 so
the upper bound is unaffected.
Detected by Aeon + osv-scanner.
Co-authored-by: aeonframework <aeon@aaronjmars.com>