* feat(webhooks): durable retrying delivery for final webhooks
Final webhook nodes were fired inline with a single best-effort httpx POST
(run_integrations._execute_webhook_node). On a transient error the failure was
swallowed at three levels, so ARQ never retried and the final call report was
permanently lost -- leaving downstream receivers stuck (e.g. a CRM showing a
call as still "in conversation").
Replace the one-shot POST with a durable, idempotent delivery pipeline modelled
on the campaign retry pattern (persisted row + scheduled_for + bounded attempts):
- New webhook_deliveries table (WebhookDeliveryModel) is the source of truth.
Payload is rendered once and frozen so retries are deterministic; secrets are
not stored -- the credential is referenced by uuid and re-resolved at send time.
- run_integrations now persists a delivery row and enqueues deliver_webhook with
a deterministic ARQ job id instead of sending inline.
- deliver_webhook (new ARQ task) sends the request and:
  * 2xx -> succeeded
  * transient (RequestError / 5xx / 408 / 425 / 429) -> retry with capped
    exponential backoff, up to max_attempts, then dead_letter
  * permanent 4xx -> dead_letter immediately (no pointless looping)
It is idempotent: a non-pending delivery is a no-op, so a duplicate enqueue or
sweeper re-injection can't double-send.
- sweep_webhook_deliveries cron (every 5 min) re-enqueues overdue pending
deliveries so nothing is lost to a worker restart / Redis flush.
- Stable X-Dograh-Delivery-Id / Workflow-Run-Id / Attempt headers let receivers
dedupe retried deliveries.
- enqueue_job now forwards ARQ job options (_job_id, _defer_by); failures log
repr(e) so empty-message errors like ConnectTimeout are diagnosable.
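For receivers, a minimal sketch of how the stable X-Dograh-Delivery-Id header could be used to dedupe retried deliveries. The DedupingReceiver class, its in-memory set, and the status codes it returns are illustrative assumptions, not part of this change; a real receiver would persist seen ids and match header names case-insensitively.

```python
class DedupingReceiver:
    """Hypothetical receiver that processes each delivery id at most once."""

    def __init__(self):
        self.seen = set()    # delivery ids already processed (in-memory only)
        self.processed = []  # bodies that were actually acted on

    def handle(self, headers: dict, body: dict) -> int:
        """Return an HTTP-style status code for one incoming webhook."""
        delivery_id = headers.get("X-Dograh-Delivery-Id")
        if delivery_id is None:
            return 400  # assumption: reject requests without a delivery id
        if delivery_id in self.seen:
            return 200  # acknowledge the retry without reprocessing it
        self.seen.add(delivery_id)
        self.processed.append(body)
        return 200
```

Returning 2xx for a duplicate matters: it tells the sender the delivery is done, so the retry loop stops instead of running to max_attempts.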
Config via DEFAULT_WEBHOOK_DELIVERY_CONFIG (env-overridable): max_attempts=5,
base_delay=30s, max_delay=600s, timeout=30s.
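As a rough illustration of the retry policy with these defaults, the sketch below classifies an attempt's outcome and computes a capped exponential delay. The doubling formula, the function names, and the "retry" label are assumptions for illustration; the actual task may compute delays differently (for example with jitter). Any status that is neither 2xx nor transient is treated as permanent here, which is also an assumption.

```python
MAX_ATTEMPTS = 5    # max_attempts
BASE_DELAY_S = 30   # base_delay
MAX_DELAY_S = 600   # max_delay


def backoff_delay(attempt, base=BASE_DELAY_S, cap=MAX_DELAY_S):
    """Delay in seconds before the retry that follows `attempt` (1-based).

    Assumed formula: base * 2^(attempt - 1), capped at `cap`.
    """
    return min(base * 2 ** (attempt - 1), cap)


def next_state(status_code, attempt, max_attempts=MAX_ATTEMPTS):
    """Classify one attempt; status_code=None stands in for a RequestError."""
    if status_code is not None and 200 <= status_code < 300:
        return "succeeded"
    transient = (
        status_code is None
        or status_code >= 500
        or status_code in (408, 425, 429)
    )
    if transient and attempt < max_attempts:
        return "retry"  # row stays pending and is rescheduled
    return "dead_letter"  # permanent failure or attempts exhausted
```

Under this assumed formula the delays after attempts 1 through 4 are 30, 60, 120 and 240 seconds; the 600-second cap would only take effect with a larger max_attempts.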
Tests cover payload rendering, persist+enqueue, success, transient retry,
retryable 5xx, permanent 4xx dead-letter, attempt exhaustion, and idempotency.
Migration verified to apply/rollback against Postgres; table/enum/indexes confirmed.
* fix(webhooks): atomic claim, safe success-recording, sweep paging, migration cleanup
Address review feedback on the webhook delivery pipeline:
- deliver_webhook now atomically claims a delivery (conditional UPDATE that
leases scheduled_for) before sending, so concurrent ARQ executions can't
double-send (the prior status=='pending' read was non-atomic).
- Recording success is moved out of the dead-letter try-block: if the receiver
accepted the payload (2xx) but the success DB-write fails, the row is left
pending for the sweeper to reconcile instead of being dead-lettered.
- The sweep keyset-paginates by id so a backlog over the page size is fully
drained, and logs the true re-enqueued total.
- Migration downgrade drops the enum via op.execute(DROP TYPE IF EXISTS ...)
instead of the deprecated op.get_bind().
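The atomic claim described above can be sketched with a conditional UPDATE whose affected-row count decides who gets to send. This is an illustration against an in-memory SQLite table, not the project's actual query: the integer timestamps, the lease_seconds parameter, and the make_db helper are assumptions.

```python
import sqlite3


def make_db():
    """Create a throwaway table with one pending delivery due at t=1000."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE webhook_deliveries ("
        "id INTEGER PRIMARY KEY, status TEXT NOT NULL, "
        "scheduled_for INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO webhook_deliveries VALUES (1, 'pending', 1000)")
    conn.commit()
    return conn


def claim_delivery(conn, delivery_id, now, lease_seconds=300):
    """Try to claim a due, pending delivery; True only for the winner.

    The UPDATE both checks eligibility and pushes scheduled_for into the
    future (the lease) in one statement, so a concurrent caller running the
    same statement matches zero rows.
    """
    cur = conn.execute(
        "UPDATE webhook_deliveries SET scheduled_for = ? "
        "WHERE id = ? AND status = 'pending' AND scheduled_for <= ?",
        (now + lease_seconds, delivery_id, now),
    )
    conn.commit()
    return cur.rowcount == 1
```

If the worker dies after claiming, the row is still pending and becomes claimable again once the lease expires, which is what lets the sweeper reconcile it.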
* fix(webhooks): idempotent delivery creation and drop secret custom headers
Address the remaining review feedback:
- Add a (workflow_run_id, webhook_node_id) unique constraint and make
create_webhook_delivery a get-or-create returning (delivery, created). A
retried run_integrations now reuses the existing row instead of creating and
sending a duplicate final webhook; only a freshly-created row is enqueued.
- Stop persisting secret-looking custom headers (Authorization, X-API-Key,
Cookie, ...) in plaintext on the delivery row: they are dropped with a warning
pointing at the credential store (which is re-resolved securely at send time).
Non-secret custom headers are unaffected.
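A minimal sketch of the get-or-create behaviour backed by the unique constraint, using SQLite's INSERT OR IGNORE for illustration. The table layout, the make_db helper, and the tuple return shape are assumptions; the real create_webhook_delivery works on WebhookDeliveryModel and may handle the conflict differently.

```python
import sqlite3


def make_db():
    """Throwaway table with the (workflow_run_id, webhook_node_id) constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE webhook_deliveries ("
        "id INTEGER PRIMARY KEY, "
        "workflow_run_id INTEGER NOT NULL, "
        "webhook_node_id TEXT NOT NULL, "
        "payload TEXT NOT NULL, "
        "UNIQUE (workflow_run_id, webhook_node_id))"
    )
    return conn


def create_webhook_delivery(conn, workflow_run_id, webhook_node_id, payload):
    """Return ((id, payload), created); only created=True should be enqueued."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO webhook_deliveries "
        "(workflow_run_id, webhook_node_id, payload) VALUES (?, ?, ?)",
        (workflow_run_id, webhook_node_id, payload),
    )
    created = cur.rowcount == 1  # 0 when the unique constraint already matched
    row = conn.execute(
        "SELECT id, payload FROM webhook_deliveries "
        "WHERE workflow_run_id = ? AND webhook_node_id = ?",
        (workflow_run_id, webhook_node_id),
    ).fetchone()
    conn.commit()
    return row, created
```

On a retried run the second call returns the original row with its frozen payload and created=False, so the caller skips the enqueue.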
* fix(webhooks): harden idempotency key, secret-header match, sweep reclaim id
Address follow-up review feedback:
- webhook_node_id is now NOT NULL so a NULL can't slip past the
(workflow_run_id, webhook_node_id) unique constraint and create duplicates.
- Secret-header filtering matches normalized markers (auth/token/secret/cookie/
api-key/...) instead of an exact name list, catching variants like
X-Custom-Auth-Token while leaving benign headers (e.g. X-Idempotency-Key)
untouched.
- The sweeper re-enqueues with a reclaim-specific job id (the lease timestamp)
so reconciling a delivered-but-unrecorded row isn't deduped against the
original attempt's already-completed ARQ job. The atomic claim still ensures
at most one send.
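The normalized-marker matching for secret-looking headers could look roughly like the sketch below. The marker tuple only covers the markers named above (the full list is elided in this message), and the normalization rule (lowercase, strip non-alphanumerics) and function names are assumptions for illustration.

```python
import re

# Assumption: only the markers named in the commit message; the real list is longer.
SECRET_MARKERS = ("auth", "token", "secret", "cookie", "apikey")


def _normalize(name: str) -> str:
    """Lowercase and drop separators so 'X-API-Key' becomes 'xapikey'."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def is_secret_header(name: str) -> bool:
    """True if the normalized header name contains any secret marker."""
    normalized = _normalize(name)
    return any(marker in normalized for marker in SECRET_MARKERS)


def split_headers(headers: dict):
    """Return (kept, dropped_names); dropped headers are never persisted."""
    kept, dropped = {}, []
    for name, value in headers.items():
        if is_secret_header(name):
            dropped.append(name)
        else:
            kept[name] = value
    return kept, dropped
```

Substring matching trades precision for safety: a benign name that happens to contain a marker (say, X-Author) would also be dropped, which is the cheaper failure mode compared with persisting a credential in plaintext.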
* fix(webhooks): scope delivery rows to workflow org
---------
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* Add tuner integration
* bump pipecat version
* chore: update pipecat submodule to match upstream and use tuner-pipecat-sdk 0.2.0
Update pipecat submodule from 0.0.109.dev23 to 13e98d0d9 (the exact commit
upstream dograh-hq/dograh uses after v1.30.1). This installs pipecat-ai as
1.1.0.post277 via setuptools_scm, satisfying tuner-pipecat-sdk 0.2.0's
pipecat-ai>=1.0.0 requirement.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* wire tuner
* feat: refactor integrations into self contained packages
* chore: simplify ensure_public_access_token
* fix: remove NodeSpec and make DTOs the source of truth
* feat: send relevant signal to mcp using to_mcp_dict
* fix: fix tests
* cleanup: remove nango integrations
* feat: add agents.md for integrations
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* feat: add headless widget for deployment
* feat: call callbacks at the right time
* feat: add onCallConnected & onCallDisconnected callback
* feat: add a button with text for floating widget
* docs: web widget
* fix: format issue in pre-pr drift check
* fix: fix CD to rely on pipecat dev dependency
* chore: update message
---------
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
* chore: bump pipecat version and fix tests
* chore: add github workflow to run tests
* fix: install requirements.dev.txt in test script
* fix: fix api-test action
* feat: add integration test
* test: add integration tests
* test: add test for function call mute strategy
* feat: add tests and migrations
* feat: workflow versioning with published and draft versions
* feat: add a new settings page to simplify workflow detail page
* fix: fix tsclient generation