Open
Labels
enhancement (New feature or request)
Description
Summary
We need to increase the timeout/retry window for SMS delivery-status requests to the Kavenegar API. Currently, the status-checking logic either times out or treats delayed delivery reports as failures. A longer window should reduce false negatives and improve message status accuracy.
Background
- Our system uses Kavenegar to send SMS and relies on their delivery status responses.
- Some carriers and network conditions cause delivery reports to arrive later than our current timeout.
- As a result, messages that are eventually delivered get marked as failed or trigger unnecessary retries/alerts.
Proposed change
- Increase the timeout and/or extend the time window we consider for a final delivery status from Kavenegar.
- Possible options:
  - Increase HTTP request timeout when polling the Kavenegar API (if currently low).
  - Increase the application-level wait period before marking a message as permanently failed (e.g., from X minutes → Y minutes).
  - Add configurable retry/backoff policy for status checks (with sensible defaults).
- Make the new timeout configurable via environment variable or config file (e.g., KAVENEGAR_STATUS_TIMEOUT_MINUTES / KAVENEGAR_STATUS_MAX_AGE).
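A minimal sketch of what a retry/backoff policy for status checks could look like, assuming exponential backoff with a cap. The function name `status_check_delays` and its defaults (30s initial delay, factor 2, 5-minute cap) are hypothetical and not taken from the current codebase; it only computes wait times and makes no Kavenegar calls.

```python
from datetime import timedelta
from typing import Iterator


def status_check_delays(
    window: timedelta,
    initial: timedelta = timedelta(seconds=30),   # assumed default
    factor: float = 2.0,                          # assumed default
    max_delay: timedelta = timedelta(minutes=5),  # assumed default
) -> Iterator[timedelta]:
    """Yield the wait before each status check until the window is used up."""
    if initial <= timedelta(0) or factor < 1.0:
        raise ValueError("initial must be positive and factor must be >= 1")
    elapsed = timedelta(0)
    delay = initial
    while elapsed < window:
        # Never wait past the end of the finalization window.
        step = min(delay, max_delay, window - elapsed)
        yield step
        elapsed += step
        # Grow the delay geometrically, but keep it under the cap.
        delay = min(delay * factor, max_delay)


if __name__ == "__main__":
    delays = list(status_check_delays(timedelta(minutes=30)))
    print(len(delays), [int(d.total_seconds()) for d in delays])
```

With these assumed defaults, a 30-minute window needs 9 status checks, compared with 60 for a fixed 30-second poll, which is relevant to the resource-usage concern.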
Acceptance criteria
- Messages that receive late delivery reports within the new window are updated to the correct status instead of being marked permanently failed.
- Extended polling causes no significant increase in resource usage (if the polling frequency changes, include a plan to mitigate the extra load).
- Timeout value is configurable and documented.
- Unit and integration tests added to cover the extended timeout behavior and config override.
Impact & Risks
- Pending messages are retained in memory/DB for longer (minimal impact if the window is limited to reasonable values).
- If timeout is set too long, alerts and retries may be delayed; therefore default should be conservative and configurable.
- May require coordination with monitoring/alerting to avoid false positives.
Implementation notes
- Identify where we currently mark SMS as failed due to timeout or where we poll Kavenegar for status.
- Add config entry and respect it in that code path.
- Update documentation and any operational runbooks.
- Add tests:
  - Unit test for code that decides when to mark a message failed vs. keep waiting.
  - Integration test (mocking Kavenegar) verifying late delivery report within new window updates status.
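The "mark failed vs. keep waiting" decision could be isolated in a pure function so it is easy to unit test. The sketch below is an assumption about how that might look: `Decision`, `decide_status`, and the `report_delivered` flag (`None` = no final report yet, `True` = delivered, `False` = provider reported a final failure) are hypothetical names, not existing code or Kavenegar API fields.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class Decision(Enum):
    DELIVERED = "delivered"
    KEEP_WAITING = "keep_waiting"
    FAILED = "failed"


def decide_status(
    sent_at: datetime,
    now: datetime,
    report_delivered: Optional[bool],
    window: timedelta,
) -> Decision:
    """Decide what to do with a message given the latest delivery report."""
    if report_delivered is True:
        return Decision.DELIVERED
    if report_delivered is False:
        # The provider gave a final failure; no point in waiting longer.
        return Decision.FAILED
    # No final report yet: only give up once the window has elapsed.
    if now - sent_at < window:
        return Decision.KEEP_WAITING
    return Decision.FAILED


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
    window = timedelta(minutes=30)
    print(decide_status(t0, t0 + timedelta(minutes=10), None, window))
    print(decide_status(t0, t0 + timedelta(minutes=20), True, window))
```

A message with no report after 10 minutes would have been failed under a 5-minute window; here it keeps waiting, and a delivered report arriving at 20 minutes resolves it correctly.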
Suggested default values
- HTTP request timeout: 10s → 30s (if currently lower)
- Application-level finalization window: 5 minutes → 30 minutes (example — adjust per product needs)
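One way the suggested defaults could be wired to environment overrides is sketched below. `KAVENEGAR_STATUS_TIMEOUT_MINUTES` is the name proposed in this issue; `KAVENEGAR_HTTP_TIMEOUT_SECONDS`, `StatusConfig`, and `load_status_config` are hypothetical names used only for illustration.

```python
import os
from dataclasses import dataclass
from datetime import timedelta
from typing import Mapping


@dataclass(frozen=True)
class StatusConfig:
    http_timeout: timedelta
    finalization_window: timedelta


def load_status_config(env: Mapping[str, str] = os.environ) -> StatusConfig:
    """Read timeouts from the environment, falling back to the suggested defaults."""
    # Hypothetical variable name for the HTTP request timeout (default 30s).
    http_seconds = float(env.get("KAVENEGAR_HTTP_TIMEOUT_SECONDS", "30"))
    # Variable name proposed in this issue (default 30 minutes).
    window_minutes = float(env.get("KAVENEGAR_STATUS_TIMEOUT_MINUTES", "30"))
    if http_seconds <= 0 or window_minutes <= 0:
        raise ValueError("timeouts must be positive")
    return StatusConfig(
        http_timeout=timedelta(seconds=http_seconds),
        finalization_window=timedelta(minutes=window_minutes),
    )


if __name__ == "__main__":
    print(load_status_config({}))
    print(load_status_config({"KAVENEGAR_STATUS_TIMEOUT_MINUTES": "45"}))
```

Passing the environment as a mapping keeps the config-override test independent of the real process environment.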