Commit graph

26 commits

Author SHA1 Message Date
Tararais
fd0d144b08
feat(webhooks): durable retrying delivery for final webhooks (#478)
* feat(webhooks): durable retrying delivery for final webhooks

Final webhook nodes were fired inline with a single best-effort httpx POST
(run_integrations._execute_webhook_node). On a transient error the failure was
swallowed at three levels, so ARQ never retried and the final call report was
permanently lost -- leaving downstream receivers stuck (e.g. a CRM showing a
call as still "in conversation").

Replace the one-shot POST with a durable, idempotent delivery pipeline modelled
on the campaign retry pattern (persisted row + scheduled_for + bounded attempts):

- New webhook_deliveries table (WebhookDeliveryModel) is the source of truth.
  Payload is rendered once and frozen so retries are deterministic; secrets are
  not stored -- the credential is referenced by uuid and re-resolved at send time.
- run_integrations now persists a delivery row and enqueues deliver_webhook with
  a deterministic ARQ job id instead of sending inline.
- deliver_webhook (new ARQ task) sends the request and:
    * 2xx            -> succeeded
    * transient      -> retry with capped exponential backoff (RequestError /
                        5xx / 408 / 425 / 429), up to max_attempts then dead_letter
    * permanent 4xx  -> dead_letter immediately (no pointless looping)
  It is idempotent: a non-pending delivery is a no-op, so a duplicate enqueue or
  sweeper re-injection can't double-send.
- sweep_webhook_deliveries cron (every 5 min) re-enqueues overdue pending
  deliveries so nothing is lost to a worker restart / Redis flush.
- Stable X-Dograh-Delivery-Id / Workflow-Run-Id / Attempt headers let receivers
  dedupe retried deliveries.
- enqueue_job now forwards ARQ job options (_job_id, _defer_by); failures log
  repr(e) so empty-message errors like ConnectTimeout are diagnosable.

Config via DEFAULT_WEBHOOK_DELIVERY_CONFIG (env-overridable): max_attempts=5,
base_delay=30s, max_delay=600s, timeout=30s.

Tests cover payload rendering, persist+enqueue, success, transient retry,
retryable 5xx, permanent 4xx dead-letter, attempt exhaustion, and idempotency.
Migration verified to apply/rollback against Postgres; table/enum/indexes confirmed.

* fix(webhooks): atomic claim, safe success-recording, sweep paging, migration cleanup

Address review feedback on the webhook delivery pipeline:

- deliver_webhook now atomically claims a delivery (conditional UPDATE that
  leases scheduled_for) before sending, so concurrent ARQ executions can't
  double-send (the prior status=='pending' read was non-atomic).
- Recording success is moved out of the dead-letter try-block: if the receiver
  accepted the payload (2xx) but the success DB-write fails, the row is left
  pending for the sweeper to reconcile instead of being dead-lettered.
- The sweep keyset-paginates by id so a backlog over the page size is fully
  drained, and logs the true re-enqueued total.
- Migration downgrade drops the enum via op.execute(DROP TYPE IF EXISTS ...)
  instead of the deprecated op.get_bind().

* fix(webhooks): idempotent delivery creation and drop secret custom headers

Address the remaining review feedback:

- Add a (workflow_run_id, webhook_node_id) unique constraint and make
  create_webhook_delivery a get-or-create returning (delivery, created). A
  retried run_integrations now reuses the existing row instead of creating and
  sending a duplicate final webhook; only a freshly-created row is enqueued.
- Stop persisting secret-looking custom headers (Authorization, X-API-Key,
  Cookie, ...) in plaintext on the delivery row: they are dropped with a warning
  pointing at the credential store (which is re-resolved securely at send time).
  Non-secret custom headers are unaffected.

* fix(webhooks): harden idempotency key, secret-header match, sweep reclaim id

Address follow-up review feedback:

- webhook_node_id is now NOT NULL so a NULL can't slip past the
  (workflow_run_id, webhook_node_id) unique constraint and create duplicates.
- Secret-header filtering matches normalized markers (auth/token/secret/cookie/
  api-key/...) instead of an exact name list, catching variants like
  X-Custom-Auth-Token while leaving benign headers (e.g. X-Idempotency-Key).
- The sweeper re-enqueues with a reclaim-specific job id (the lease timestamp)
  so reconciling a delivered-but-unrecorded row isn't deduped against the
  original attempt's already-completed ARQ job. The atomic claim still ensures
  at most one send.

* fix(webhooks): scope delivery rows to workflow org

---------

Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
2026-07-02 21:44:14 +05:30
Abhishek
b192d4ada7
chore: drain active calls before rolling updates (#474)
* chore: drain active calls before rolling updates

* fix: add a devops secret header

* fix: implement PR review
2026-06-29 06:00:31 +05:30
Abhishek
78427817a6
feat(scripts): free trusted HTTPS via sslip.io for public-IP remote i… (#460)
* feat(scripts): free trusted HTTPS via sslip.io for public-IP remote installs

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* chore: refactor setup scripts

* chore: generate sdk

* chore: fix messaging for setup_remote script

* fix: fix ffmpeg download url

* feat: centralise and simplify the url configuration

* fix: force script run as sudo

* fix: fix documentation

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 17:19:29 +05:30
Sky Moore
1e2a276a61
feat: support other s3 sig versions so it works with s3 (#461) 2026-06-24 18:36:34 +05:30
Abhishek Kumar
f586aebe5b feat: enable stack auth config from backend 2026-06-18 14:17:28 +05:30
Abhishek Kumar
6f79bd67eb fix: harden CORS origin allow list
Fixes #322
2026-05-27 15:36:48 +05:30
Mohamed-Mamdouh
5f28c1b2a9
feat: add Tuner Integration to Dograh (#311)
* Add tuner integration

* bump pipecat version

* chore: update pipecat submodule to match upstream and use tuner-pipecat-sdk 0.2.0

Update pipecat submodule from 0.0.109.dev23 to 13e98d0d9 (the exact commit
upstream dograh-hq/dograh uses after v1.30.1). This installs pipecat-ai as
1.1.0.post277 via setuptools_scm, satisfying tuner-pipecat-sdk 0.2.0's
pipecat-ai>=1.0.0 requirement.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* wire tuner

* feat: refactor integrations into self contained packages

* chore: simplify ensure_public_access_token

* fix: remove NodeSpec and make DTOs the source of truth

* feat: send relevant signal to mcp using to_mcp_dict

* fix: fix tests

* cleanup: remove nango integrations

* feat: add agents.md for integrations

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
2026-05-20 14:37:33 +05:30
Sabiha Khan
8778bb453e
chore: mandate telnyx signature verification (#319) 2026-05-19 16:50:27 +05:30
Sabiha Khan
b670004725
feat: verify telnyx webhook signature optionally (#279) 2026-05-12 19:47:28 +05:30
Abhishek
e2fe1f3cd4
feat: enable FORCE_TURN_RELAY to diagnose turn connectivity for local deployment setups (#272)
* filter out local sdp candidates on non local environment

* feat: add FORCE_TURN_RELAY variable

* add FORCE_TURN_RELAY option in docker-compose

* fix: fix github workflow
2026-05-11 17:13:01 +05:30
Abhishek Kumar
6d93be3ef6 fix: number pool initialization in multi telephony setup
If there are multiple telephony configurations, the form number should be initialized from the campaigns given telephonic configuration rather than the organization default telephonic configuration.
2026-05-08 14:48:53 +05:30
Sabiha Khan
3f19a16e7f
feat: add posthog events (#231)
* feat: add posthog events

* fix: workflow_duplicated event

* chore: add events to enum
2026-04-10 17:52:21 +05:30
Sabiha Khan
e4759ac6f2
revert: cloudonix amd (#183) 2026-03-07 09:56:04 +05:30
Sabiha Khan
3c5bc688ed
feat: hang up cloudonix machine answered call if feature flag enabled (#182) 2026-03-06 15:22:44 +05:30
Abhishek Kumar
1614879ddd chore: add log rotation in logging config 2026-03-05 13:51:57 +05:30
Abhishek
a836825b83
feat: add qa node in workflow builder (#172)
* feat: add qa node in workflow builder

* feat: add qa analysis token usage in usage_info

* fix: mask the API key in QA node

* feat: add advanced configuration in QA node
2026-02-25 13:53:30 +05:30
Abhishek
642cc34e8c
feat: add authentication for OSS (#167)
* feat: add authentication for OSS

Fixes #157 and #156

* fix: fix token generation

* fix: limit fastapi workers to 1
2026-02-20 18:21:24 +05:30
Abhishek
fe4ea648e4
Feat/campaign enhancements (#163)
* feat: add circuit breaker to safeguard

* feat: Add Circuit breaker in campaigns to safeguard against telephony failures

* feat: add schedules in campaigns
2026-02-17 21:04:15 +05:30
Abhishek
bf972fcfec
feat: add coturn configurations (#143)
* feat: add coturn changes

* add turn credentials and config

* fix: fix setup_remote script and docker compose
2026-02-03 13:52:50 +05:30
Abhishek Kumar
6f41e91f67 feat: add retry config during campaign creation 2026-01-29 11:57:57 +05:30
Abhishek
911c5ed416
fix: changes to update pipecat version to 0.0.100 (#122)
* feat: add stt evals

* add smart turn as provider

* chore: remove deprecations

* chore: format files

* fix: remove deprecated UserIdleProcessor

* fix: remove deprecated TranscriptProcessor

* chore: update pipecat submodule

* feat: add evals visualisation

* fix: trigger llm generation on client connected and pipeline started

* chore: update pipecat

* chore: update pipecat submodule

* Add tests

* fix: slow loading of workflow page

* chore: update pipecat submodule

* Show version after release

* Fixes #99

* fix: provider check for websocket connection

* Fixes #107

* Fix #96

* chore: fix documentation

* fix: cloudonix campaign call error

---------

Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
2026-01-23 18:53:59 +05:30
Sabiha Khan
97fbd9b37b
Feat/inbound telephony (#113)
* feat: inbound telephony (twilio & vobiz)

* chore: add ruff and lint formatting

* fix: add missing cloudonix interface compliance implementation
2026-01-12 10:10:30 +05:30
Abhishek
99a768f291
feat: enable workflows to be embedded in websites as a script tag (#47)
* feat: add deployment configuration options

* Simplify EmbedDialog

* Add options for inline vs floating embedding of agent
2025-11-15 17:32:37 +05:30
Sabiha Khan
4cfdc3d420 feat: add vonage telephony (#35)
* refactor: telephony integration

* feat: add vonage telephony
2025-10-27 15:58:20 +05:30
Abhishek
3babb5ced6
feat: add csv upload functionality for OSS (#29)
feat: add csv upload functionality
chore: remove redundant arq-worker from docker-compose
2025-10-09 17:54:31 +05:30
Abhishek Kumar
4f2a629340 Initial Commit 🚀 🚀 2025-09-09 14:37:32 +05:30