Commit graph

22 commits

Author SHA1 Message Date
Tararais
fd0d144b08
feat(webhooks): durable retrying delivery for final webhooks (#478)
* feat(webhooks): durable retrying delivery for final webhooks

Final webhook nodes were fired inline with a single best-effort httpx POST
(run_integrations._execute_webhook_node). On a transient error the failure was
swallowed at three levels, so ARQ never retried and the final call report was
permanently lost -- leaving downstream receivers stuck (e.g. a CRM showing a
call as still "in conversation").

Replace the one-shot POST with a durable, idempotent delivery pipeline modelled
on the campaign retry pattern (persisted row + scheduled_for + bounded attempts):

- New webhook_deliveries table (WebhookDeliveryModel) is the source of truth.
  Payload is rendered once and frozen so retries are deterministic; secrets are
  not stored -- the credential is referenced by uuid and re-resolved at send time.
- run_integrations now persists a delivery row and enqueues deliver_webhook with
  a deterministic ARQ job id instead of sending inline.
- deliver_webhook (new ARQ task) sends the request and:
    * 2xx            -> succeeded
    * transient      -> retry with capped exponential backoff (RequestError /
                        5xx / 408 / 425 / 429), up to max_attempts then dead_letter
    * permanent 4xx  -> dead_letter immediately (no pointless looping)
  It is idempotent: a non-pending delivery is a no-op, so a duplicate enqueue or
  sweeper re-injection can't double-send.
- sweep_webhook_deliveries cron (every 5 min) re-enqueues overdue pending
  deliveries so nothing is lost to a worker restart / Redis flush.
- Stable X-Dograh-Delivery-Id / Workflow-Run-Id / Attempt headers let receivers
  dedupe retried deliveries.
- enqueue_job now forwards ARQ job options (_job_id, _defer_by); failures log
  repr(e) so empty-message errors like ConnectTimeout are diagnosable.

Config via DEFAULT_WEBHOOK_DELIVERY_CONFIG (env-overridable): max_attempts=5,
base_delay=30s, max_delay=600s, timeout=30s.
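
The outcome rules and backoff above can be sketched as follows. This is a
minimal illustration, not the code from the PR: the function names are
hypothetical, the doubling factor is an assumption (the message only says
"capped exponential"), and transport errors such as RequestError are left out
so the example only handles HTTP status codes.

```python
# Hypothetical helpers; only the status codes and default numbers come from
# the commit message.
RETRYABLE_STATUS_CODES = {408, 425, 429}


def classify_response(status_code: int) -> str:
    """Return 'succeeded', 'retry', or 'dead_letter' for an HTTP status."""
    if 200 <= status_code < 300:
        return "succeeded"
    if status_code >= 500 or status_code in RETRYABLE_STATUS_CODES:
        return "retry"
    # Everything else is treated as permanent. The commit only names 4xx here;
    # sending 1xx/3xx down this path is an assumption of this sketch.
    return "dead_letter"


def backoff_delay(attempt: int, base_delay: int = 30, max_delay: int = 600) -> int:
    """Seconds to wait after the given 1-based failed attempt (doubling assumed)."""
    return min(base_delay * 2 ** (attempt - 1), max_delay)


def next_action(status_code: int, attempt: int, max_attempts: int = 5):
    """Combine both rules: returns (new_status, delay_seconds or None)."""
    outcome = classify_response(status_code)
    if outcome != "retry":
        return outcome, None
    if attempt >= max_attempts:
        return "dead_letter", None
    return "pending", backoff_delay(attempt)
```

Under these assumptions the waits between the five attempts would be 30, 60,
120 and 240 seconds, so the 600s cap only matters if max_attempts is raised.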

Tests cover payload rendering, persist+enqueue, success, transient retry,
retryable 5xx, permanent 4xx dead-letter, attempt exhaustion, and idempotency.
Migration verified to apply/rollback against Postgres; table/enum/indexes confirmed.

* fix(webhooks): atomic claim, safe success-recording, sweep paging, migration cleanup

Address review feedback on the webhook delivery pipeline:

- deliver_webhook now atomically claims a delivery (conditional UPDATE that
  leases scheduled_for) before sending, so concurrent ARQ executions can't
  double-send (the prior status=='pending' read was non-atomic).
- Recording success is moved out of the dead-letter try-block: if the receiver
  accepted the payload (2xx) but the success DB-write fails, the row is left
  pending for the sweeper to reconcile instead of being dead-lettered.
- The sweep keyset-paginates by id so a backlog over the page size is fully
  drained, and logs the true re-enqueued total.
- Migration downgrade drops the enum via op.execute(DROP TYPE IF EXISTS ...)
  instead of the deprecated op.get_bind().
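
A rough sketch of the atomic claim, using sqlite3 and integer timestamps so it
runs standalone. The column set, the claim_delivery name and the 60-second
lease are assumptions for illustration; the real task runs against Postgres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE webhook_deliveries ("
    "id INTEGER PRIMARY KEY, status TEXT NOT NULL, scheduled_for INTEGER NOT NULL)"
)
conn.execute("INSERT INTO webhook_deliveries VALUES (1, 'pending', 100)")
conn.commit()


def claim_delivery(conn, delivery_id, now, lease_seconds=60):
    """Claim a due, pending delivery by pushing scheduled_for into the future.

    The WHERE clause and the write happen in one UPDATE statement, so only one
    of several concurrent callers can match the row; the others see rowcount 0.
    """
    cur = conn.execute(
        "UPDATE webhook_deliveries SET scheduled_for = ? "
        "WHERE id = ? AND status = 'pending' AND scheduled_for <= ?",
        (now + lease_seconds, delivery_id, now),
    )
    conn.commit()
    return cur.rowcount == 1


first_claim = claim_delivery(conn, 1, now=100)   # wins the lease
second_claim = claim_delivery(conn, 1, now=100)  # lease not yet expired
late_claim = claim_delivery(conn, 1, now=200)    # lease expired, row still pending
```

The last call shows why a row left pending after a failed success-write can be
picked up again once its lease has passed.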

* fix(webhooks): idempotent delivery creation and drop secret custom headers

Address the remaining review feedback:

- Add a (workflow_run_id, webhook_node_id) unique constraint and make
  create_webhook_delivery a get-or-create returning (delivery, created). A
  retried run_integrations now reuses the existing row instead of creating and
  sending a duplicate final webhook; only a freshly-created row is enqueued.
- Stop persisting secret-looking custom headers (Authorization, X-API-Key,
  Cookie, ...) in plaintext on the delivery row: they are dropped with a warning
  pointing at the credential store (which is re-resolved securely at send time).
  Non-secret custom headers are unaffected.
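
The get-or-create behaviour can be sketched like this. It is a standalone
approximation using sqlite3 and INSERT OR IGNORE; the table columns and the SQL
are assumptions, and only the function name and the (delivery, created) return
shape come from the message above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE webhook_deliveries ("
    "id INTEGER PRIMARY KEY, "
    "workflow_run_id INTEGER NOT NULL, "
    "webhook_node_id TEXT NOT NULL, "
    "payload TEXT NOT NULL, "
    "UNIQUE (workflow_run_id, webhook_node_id))"
)


def create_webhook_delivery(conn, workflow_run_id, webhook_node_id, payload):
    """Insert the row if it is new; otherwise return the existing one."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO webhook_deliveries "
        "(workflow_run_id, webhook_node_id, payload) VALUES (?, ?, ?)",
        (workflow_run_id, webhook_node_id, payload),
    )
    created = cur.rowcount == 1  # 0 when the unique constraint suppressed the insert
    delivery = conn.execute(
        "SELECT id, payload FROM webhook_deliveries "
        "WHERE workflow_run_id = ? AND webhook_node_id = ?",
        (workflow_run_id, webhook_node_id),
    ).fetchone()
    conn.commit()
    return delivery, created


first, first_created = create_webhook_delivery(conn, 7, "node-a", '{"status": "done"}')
again, again_created = create_webhook_delivery(conn, 7, "node-a", '{"status": "changed"}')
```

A caller would enqueue the delivery only when created is true, which matches
the "only a freshly-created row is enqueued" rule; the second call keeps the
originally frozen payload.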

* fix(webhooks): harden idempotency key, secret-header match, sweep reclaim id

Address follow-up review feedback:

- webhook_node_id is now NOT NULL so a NULL can't slip past the
  (workflow_run_id, webhook_node_id) unique constraint and create duplicates.
- Secret-header filtering matches normalized markers (auth/token/secret/cookie/
  api-key/...) instead of an exact name list, catching variants like
  X-Custom-Auth-Token while leaving benign headers (e.g. X-Idempotency-Key).
- The sweeper re-enqueues with a reclaim-specific job id (the lease timestamp)
  so reconciling a delivered-but-unrecorded row isn't deduped against the
  original attempt's already-completed ARQ job. The atomic claim still ensures
  at most one send.
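
One way the marker-based header filter could look. The marker tuple and both
function names are assumptions; the message lists auth/token/secret/cookie/
api-key and then elides the rest, so the list here is deliberately limited to
those five.

```python
# Markers are compared against a normalized header name (lowercase, with
# dashes and underscores removed), so "api-key" becomes "apikey".
SECRET_MARKERS = ("auth", "token", "secret", "cookie", "apikey")


def is_secret_header(name: str) -> bool:
    """True if the normalized header name contains any secret marker."""
    normalized = name.lower().replace("-", "").replace("_", "")
    return any(marker in normalized for marker in SECRET_MARKERS)


def split_custom_headers(headers: dict) -> tuple:
    """Return (kept, dropped_names); dropped headers are never persisted."""
    kept = {k: v for k, v in headers.items() if not is_secret_header(k)}
    dropped = sorted(k for k in headers if is_secret_header(k))
    return kept, dropped


kept, dropped = split_custom_headers(
    {
        "Authorization": "Bearer abc",
        "X-Custom-Auth-Token": "xyz",
        "X-API-Key": "k",
        "X-Idempotency-Key": "run-42",
        "Content-Type": "application/json",
    }
)
```

Substring matching trades a few false positives for catching unlisted
variants, which is the safer direction when the alternative is storing a
credential in plaintext.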

* fix(webhooks): scope delivery rows to workflow org

---------

Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
2026-07-02 21:44:14 +05:30
Matt Van Horn
3309face2c
fix: reject misrouted smallwebrtc runs on the telephony websocket (#468)
* fix: reject misrouted smallwebrtc runs on the telephony websocket

A smallwebrtc (browser/WebRTC) workflow run is established through the WebRTC
signaling endpoint, not the PSTN telephony websocket. When such a run reached
_handle_telephony_websocket it read no "provider" from initial_context and
closed with an opaque "Provider type not found". Detect smallwebrtc runs and
close with a clear reason pointing to the signaling endpoint, without setting
the run to running or invoking a telephony provider. Also store the provider on
smallwebrtc runs at creation so they are self-describing, and make the generic
no-provider close reason include the run id and mode.

Closes #433

* fix: merge workflow run initial context defaults

---------

Co-authored-by: Matt Van Horn <455140+mvanhorn@users.noreply.github.com>
Co-authored-by: Abhishek Kumar <abhishek@a6k.me>
2026-06-26 19:37:40 +05:30
Abhishek Kumar
7efc19c812 fix: merge initial context during update workflow run 2026-06-18 09:25:02 +05:30
Abhishek Kumar
3d1886c450 feat: persist split user and bot audio 2026-06-16 15:19:49 +05:30
Abhishek
1f1149f4d5
feat: billing and credit management v2 (#429)
* feat: use mps generated correlation ID

* chore: update pipecat submodule

* feat: add credit purchase URL

* feat: carve out billing page and show credit ledger

* feat: deprecate dograh based quota tracking

* fix: remove cost calculation from dograh codebase

* fix: create mps account on migrate to v2

* chore: update pipecat
2026-06-12 14:55:30 +05:30
Abhishek
d97d1d72cd
feat: add chat based testing for voice agent (#308)
* feat: add backend foundations

* feat: add text chat UI

* chore: simplify the reload behaviour

* fix: fix upgrade banner to be triggered after package upload

* feat: simplify TesterPanel design

* chore: fix formatting and generate client

* chore: fix tracing for text chat mode

* fix: fix revert and edit CTA

* refactor: refactor TesterPanel into smaller components

* feat: enable runtime transition of nodes

* fix: fix review comments
2026-05-21 15:20:02 +05:30
Abhishek
0e12c41fc7
chore: bump pipecat version and fix tests (#263)
* chore: bump pipecat version and fix tests

* chore: add github workflow to run tests

* fix: install requirements.dev.txt in test script

* fix: fix api-test action

* feat: add integration test

* test: add integration tests

* test: add test for function call mute strategy
2026-05-04 21:35:37 +05:30
Abhishek
7fd3b96470
feat: agent stream for cloudonix OPBX (#261)
* feat: agent stream for cloudonix OPBX

* feat: make cloudonix app name optional

* feat: create application while configuring telephony config

* fix: get telephony configuration from stamped workflow run

* fix: fix vobiz hangup URL
2026-05-02 15:53:58 +05:30
Abhishek
38d1d928b7
feat: agent versioning and model configurations override (#227)
* feat: add tests and migrations

* feat: workflow versioning among published and draft

* feat: add a new settings page to simplify workflow detail page

* fix: fix tsclient generation
2026-04-08 19:20:31 +05:30
Abhishek Kumar
c8742dbdc0 feat: run per node QA 2026-02-25 17:17:48 +05:30
Abhishek
a836825b83
feat: add qa node in workflow builder (#172)
* feat: add qa node in workflow builder

* feat: add qa analysis token usage in usage_info

* fix: mask the API key in QA node

* feat: add advanced configuration in QA node
2026-02-25 13:53:30 +05:30
Sabiha Khan
13b41437e8
fix: missing call_id in gathered_context (#165) 2026-02-18 21:13:28 +05:30
Abhishek
7552b6c819
feat: add asterisk ARI websocket interface (#159)
* chore: remove old files

* feat: ari outbound dialing

* feat: add websocket configuration for ARI

* feat: handling inbound calls

* delete ext channel from redis on stasis end

* fix: add lock in workflow run update, refactor _handle_stasis_start

* chore: update submodule

---------

Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
2026-02-17 19:32:03 +05:30
Abhishek
5fe1c8ce2f
chore: UI enhancements for workflow runs view (#142)
* add local state in filters

* feat: add sorting feature by duration

* chore: refactor workflow run view
2026-01-30 17:08:15 +05:30
Abhishek
b1c982a52e
fix: add cloudonix CDR handling (#140)
* feat: add cloudonix cdr

* fix: remove org check

---------

Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
2026-01-29 19:06:52 +05:30
Abhishek
911c5ed416
fix: changes to update pipecat version to 0.0.100 (#122)
* feat: add stt evals

* add smart turn as provider

* chore: remove deprecations

* chore: format files

* fix: remove deprecated UserIdleProcessor

* fix: remove deprecated TranscriptProcessor

* chore: update pipecat submodule

* feat: add evals visualisation

* fix: trigger llm generation on client connected and pipeline started

* chore: update pipecat

* chore: update pipecat submodule

* Add tests

* fix: slow loading of workflow page

* chore: update pipecat submodule

* Show version after release

* Fixes #99

* fix: provider check for websocket connection

* Fixes #107

* Fix #96

* chore: fix documentation

* fix: cloudonix campaign call error

---------

Co-authored-by: Sabiha Khan <sabihak89@gmail.com>
2026-01-23 18:53:59 +05:30
Sabiha Khan
97fbd9b37b
Feat/inbound telephony (#113)
* feat: inbound telephony (twilio & vobiz)

* chore: add ruff and lint formatting

* fix: add missing cloudonix interface compliance implementation
2026-01-12 10:10:30 +05:30
Abhishek
55b727a872
Feat/Add API Trigger and Webhooks in Agent Builder (#83)
* feat: add api trigger node for agent runs

* feat: add webhook node

* Execute webhook nodes post workflow run

* Add hint to go to API keys
2025-12-22 14:08:30 +05:30
Sabiha Khan
c99bd29ef1
fix: call_id and stream_id for vobiz pipeline, add workflow run state (#78)
* fix: add workflow run state for pipeline

* fix: call and stream id for vobiz pipeline
2025-12-11 17:12:28 +07:00
Abhishek Kumar
0345df6fbe Optimise requirements.txt and update pipecat imports 2025-09-20 14:07:00 +05:30
Sabiha Khan
d28991fc60 fix storage_backend value and docker image versions 2025-09-09 16:37:05 +05:30
Abhishek Kumar
4f2a629340 Initial Commit 🚀 🚀 2025-09-09 14:37:32 +05:30