* feat: add Smallest AI TTS and STT provider integration
Integrates Smallest AI's Waves (TTS) and Pulse (STT) APIs as selectable
providers in the Dograh platform. Dograh's pipecat fork already contains
the pipecat-level service implementations; this wires them into the API
configuration registry and service factory.
- Added `SMALLEST = "smallest"` to `ServiceProviders` enum
- Registered `SmallestAITTSConfiguration` (lightning-v3.1/v2, voices,
language, speed) and `SmallestAISTTConfiguration` (pulse model, 30+
languages) Pydantic config classes with the TTS/STT registries
- Added factory branches in `create_tts_service` and `create_stt_service`
routing to `SmallestTTSService` and `SmallestSTTService` from pipecat
* fix: update Smallest AI models to v4 naming convention
- TTS: rename lightning-v3.1 → lightning_v3.1, add lightning_v3.1_pro, drop deprecated lightning-v2
- STT: keep pulse only (pulse-pro is not a streaming model)
* fix: change default TTS voice from emily to sophia for lightning_v3.1
emily is not a verified lightning_v3.1 voice; sophia is the pipecat
SmallestTTSService default and confirmed to work with the standard pool.
* fix: replace 9 invalid lightning_v3.1 voice IDs with verified ones
jasmine, james, michael, aria, lara, asel, sarah, rishi, deepika do not
exist in the lightning_v3.1 voice catalog. Replaced with avery, liam,
lucas, olivia, freya, devansh, maya, dhruv, maithili — all verified
against the API.
* fix: smallest ai config validation and tts model compatibility
* chore: ruff fix
* chore: updated smallest ai schema in openapi.json
---------
Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
Co-authored-by: Sabiha Khan <87858386+chewwbaka@users.noreply.github.com>
* feat: add Azure AI multi-provider support (TTS, STT, Embeddings, Realtime)
Enables Azure AI services across all model layers so users with Azure
credits can consolidate billing on a single provider.
- Voice (TTS): AzureSpeechTTSConfiguration via azure_speech provider
- Transcriber (STT): AzureSpeechSTTConfiguration via azure_speech provider
- Embedding: AzureOpenAIEmbeddingsConfiguration via azure provider
- Realtime: AzureRealtimeLLMConfiguration via azure_realtime provider
New files:
- api/services/pipecat/realtime/azure_realtime.py
- api/services/gen_ai/embedding/azure_openai_service.py
- api/tests/test_azure_speech_service_factory.py
The UI picks up all four providers automatically from the schema —
no frontend changes required.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix: add validation for URL and params
---------
Co-authored-by: Vishal Dhateria <vishal@finela.ai>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
Mirrors the LLM treatment from #368 for the OpenAI STT and OpenAI TTS
providers. Users running OpenAI-compatible self-hosted services (vLLM,
Speaches, llama.cpp, custom proxies) can now point Dograh at them via
the OpenAI provider with `base_url`, instead of being forced onto the
Speaches provider as a workaround.
Changes:
* `registry.py` — add `base_url` field (default `https://api.openai.com/v1`)
to `OpenAISTTConfiguration` and `OpenAITTSService`, identical in shape
to the existing `OpenAILLMService.base_url` from #368.
* `service_factory.py` — in the OPENAI branches of `create_stt_service`
and `create_tts_service`, lift `base_url` off the user config, run it
through `_validate_runtime_service_url`, and forward it as a kwarg to
`OpenAISTTService` / `OpenAITTSService` (both already accept it). Same
pattern as the LLM branch.
* `test_user_configured_service_url_security.py` — adds four runtime
validation tests covering private-IP rejection and localhost rejection
in SaaS mode for both STT and TTS. Existing OSS-mode permissiveness
is unchanged (DEPLOYMENT_MODE=oss skips the validator, as before).
No schema migration needed — Pydantic populates the default; existing
configurations without `base_url` continue to talk to api.openai.com.
`check_validity.py` requires no edits because the per-service validation
loop already iterates `("base_url", "endpoint")` via `getattr`, and the
`_check_openai_api_key` dispatcher already routes OPENAI providers
through the base_url-aware code path (introduced in #368) for STT and
TTS too.
Tests pass locally:
pytest api/tests/test_user_configured_service_url_security.py
23 passed in 4.80s (19 existing + 4 new)
Co-authored-by: developer603 <developer603@users.noreply.github.com>
* Add Sarvam LLM provider, update Sarvam STT models, expose usage_info on run detail.
Depends on pipecat PR dograh-hq/pipecat#43 for STT string language support.
Submodule bump will follow after that merges.
* test: cover Sarvam STT language mapping; link Sarvam docs
---------
Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
* Add OpenAI-compatible API option in model configuration
Backend-only cherry-pick from 20617db37a.
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* fix: harden the base url settings in SaaS mode
---------
Co-authored-by: Chris Briddock <briddockchristopher@gmail.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* feat: add MiniMax provider support (Chat + TTS)
- Add MiniMax LLM provider using OpenAI-compatible API
- Models: MiniMax-M2.7, MiniMax-M2.7-highspeed
- Default base URL: https://api.minimax.io/v1
- Uses MINIMAX_API_KEY for authentication
- Add MiniMax TTS provider using Pipecat's MiniMaxHttpTTSService
- Models: speech-2.8-hd (default), speech-2.8-turbo
- 6 built-in voices
- Requires group_id configuration
- Add unit tests for both providers
* fix(minimax): validator, temperature, session cleanup, reasoning filter
- check_validity.py: wire MiniMax into _validator_map and enforce
group_id at save time. Without this, saving a config with a valid
key was rejected.
- registry.py: surface temperature on the LLM config (gt=0; MiniMax
rejects 0) and base_url on the TTS config
- service_factory.py:
* Plumb temperature through create_llm_service
* Normalize TTS base_url to include /t2a_v2 — pipecat appends only
?GroupId=... to the URL.
* Use the new MiniMaxLLMService (from pipecat) to strip
<think>...</think> reasoning that MiniMax-M2.7 emits inline in
delta.content (otherwise it leaks straight to TTS).
* Use MiniMaxOwnedSessionTTSService so the per-instance aiohttp
session gets closed in cleanup() instead of leaking sockets/FDs.
- minimax_tts.py: small wrapper around MiniMaxHttpTTSService that owns
the session it was handed (pipecat's caller-owns-session API
conflicts with the ftory's per-instance pattern).
- pipecat submodule: bumps to a commit that adds MiniMaxLLMService — a
thin OpenAILLMService subclass with the streaming <think> filter
(mirrors NvidiaLLMService's pattern for NIM reasoning models).
- Tests updated/added for all of the above.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: octo-patch <octo-patch@github.com>
Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
* feat: configurable ElevenLabs base URL for Data Residency (#269)
Adds a `base_url` field to `ElevenlabsTTSConfiguration` so users on an
ElevenLabs Data Residency plan (EU, etc.) can point Dograh at the
regional endpoint instead of the hardcoded global one. Defaults to
`https://api.elevenlabs.io`, preserving existing behaviour. The
service factory rewrites the HTTP scheme to WSS when constructing the
WebSocket TTS service.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: fix drift
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat: add tests and migrations
* feat: workflow versioning among published and draft
* feat: add a new settings page to simplify workflow detail page
* fix: fix tsclient generation