Description
The webhook retry processor in backend/src/services/webhookService.ts fetches up to 100 pending retries and processes them. There is no circuit breaker pattern. If one subscriber endpoint is slow (e.g., 30s response time), the retry processor serializes on that endpoint, and all other pending retries for other subscribers are delayed.
Even after the RPC timeout added in #617 for Stellar calls, webhook delivery to subscriber URLs does not have its own response timeout enforcement.
Expected Behavior
- Add a per-delivery timeout (e.g., 10 seconds) for each webhook HTTP call
- Track consecutive failures per subscriber URL and implement circuit breaking: after N failures, stop retrying that URL for a cooldown period
- Process retries concurrently across different subscriber URLs (group by URL and parallelize groups)
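The circuit-breaking item above could be sketched roughly as follows. This is a minimal in-memory illustration, not code from webhookService.ts: the class name `SubscriberCircuitBreaker`, the default threshold of 5 failures, and the 60-second cooldown are all assumptions, and the clock is injectable only to make the behavior easy to test.

```typescript
// Hypothetical helper: names and defaults are illustrative, not from webhookService.ts.
interface BreakerState {
  consecutiveFailures: number;
  openUntil: number; // epoch ms; deliveries are blocked until this time
}

class SubscriberCircuitBreaker {
  private readonly states = new Map<string, BreakerState>();
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(
    failureThreshold: number = 5,
    cooldownMs: number = 60_000,
    now: () => number = () => Date.now(),
  ) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  // True when a delivery to this URL may be attempted.
  canAttempt(url: string): boolean {
    const state = this.states.get(url);
    return state === undefined || this.now() >= state.openUntil;
  }

  // Any success resets the consecutive-failure count for the URL.
  recordSuccess(url: string): void {
    this.states.delete(url);
  }

  // Once the threshold is reached, each further failure (including a failed
  // probe after the cooldown) re-opens the circuit for another cooldown.
  recordFailure(url: string): void {
    const state = this.states.get(url) ?? { consecutiveFailures: 0, openUntil: 0 };
    state.consecutiveFailures += 1;
    if (state.consecutiveFailures >= this.failureThreshold) {
      state.openUntil = this.now() + this.cooldownMs;
    }
    this.states.set(url, state);
  }
}
```

The retry processor would call `canAttempt(url)` before each delivery and skip (but not drop) retries for URLs whose circuit is open. State kept in a `Map` is lost on restart and is not shared across instances, so a real implementation might persist it alongside the retry records.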
Suggested Fix
const MAX_DELIVERY_TIMEOUT_MS = 10_000;
// AbortSignal.timeout (Node 17.3+) cancels the request itself on timeout.
// Promise.race would only stop waiting and leave the fetch running.
const response = await fetch(url, {
  method: 'POST',
  body: payload,
  signal: AbortSignal.timeout(MAX_DELIVERY_TIMEOUT_MS),
});
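The timeout alone bounds each call but does not stop one slow subscriber from blocking the others. A rough sketch of grouping retries by URL and running the groups concurrently is shown here; the `PendingRetry` shape, the `processRetries` name, and the injected `deliver` callback are assumptions for illustration, not the existing API of webhookService.ts.

```typescript
// Hypothetical shape of a pending retry; the real record likely has more fields.
interface PendingRetry {
  id: string;
  url: string;
  payload: string;
}

// Bucket retries by subscriber URL, preserving their original order per URL.
function groupByUrl(retries: PendingRetry[]): Map<string, PendingRetry[]> {
  const groups = new Map<string, PendingRetry[]>();
  for (const retry of retries) {
    const group = groups.get(retry.url);
    if (group) {
      group.push(retry);
    } else {
      groups.set(retry.url, [retry]);
    }
  }
  return groups;
}

// Run groups in parallel; within a group, deliver sequentially so a single
// subscriber is never hit by concurrent requests. Returns the ids that failed.
async function processRetries(
  retries: PendingRetry[],
  deliver: (retry: PendingRetry) => Promise<void>,
): Promise<string[]> {
  const failed: string[] = [];
  const groups = Array.from(groupByUrl(retries).values());
  await Promise.all(
    groups.map(async (group) => {
      for (const retry of group) {
        try {
          await deliver(retry);
        } catch {
          // A failure must not abort the other groups; record it and move on.
          failed.push(retry.id);
        }
      }
    }),
  );
  return failed;
}
```

With this layout a 30s endpoint delays only its own group. If the batch can contain many distinct URLs, capping the number of groups in flight would be a sensible addition.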
Impact
Medium. A single slow webhook subscriber can back up the entire retry queue, delaying notifications for all other subscribers.