Increase Kavenegar API status SMS timeout #29

@SalehBorhani

Summary

We need to increase the timeout/retry window for SMS status requests to the Kavenegar API. Currently, the status-checking logic either times out or treats messages with delayed delivery reports as failed. A longer timeout will reduce false negatives and improve message status accuracy.

Background

  • Our system uses Kavenegar to send SMS and relies on their delivery status responses.
  • Some carriers and network conditions cause delivery reports to arrive later than our current timeout.
  • This results in messages being marked as failed or triggering unnecessary retries/alerts.

Proposed change

  • Increase the timeout and/or extend the time window we consider for a final delivery status from Kavenegar.
  • Possible options:
    • Increase HTTP request timeout when polling the Kavenegar API (if currently low).
    • Increase the application-level wait period before marking a message as permanently failed (e.g., from X minutes → Y minutes).
    • Add configurable retry/backoff policy for status checks (with sensible defaults).
  • Make the new timeout configurable via environment variable or config file (e.g., KAVENEGAR_STATUS_TIMEOUT_MINUTES / KAVENEGAR_STATUS_MAX_AGE).
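
A minimal sketch of how the configurable timeout could be read, assuming the values come from environment variables. `KAVENEGAR_STATUS_TIMEOUT_MINUTES` is the name suggested above; `KAVENEGAR_HTTP_TIMEOUT_SECONDS`, `StatusConfig`, and `load_status_config` are hypothetical names used only for illustration, and the defaults mirror the example values under "Suggested default values".

```python
import math
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StatusConfig:
    # Defaults are the example values from this issue, not measured ones.
    http_timeout_seconds: float = 30.0
    finalization_window_minutes: float = 30.0


def _read_positive(env: Mapping[str, str], name: str, default: float) -> float:
    """Return env[name] as a positive finite float, or the default if unset."""
    raw = env.get(name)
    if raw is None or raw.strip() == "":
        return default
    value = float(raw)  # raises ValueError on non-numeric input
    if not math.isfinite(value) or value <= 0:
        raise ValueError(f"{name} must be a positive number, got {raw!r}")
    return value


def load_status_config(env: Optional[Mapping[str, str]] = None) -> StatusConfig:
    """Build the config from the environment (or any mapping, for tests)."""
    env = os.environ if env is None else env
    defaults = StatusConfig()
    return StatusConfig(
        # Hypothetical variable name for the HTTP request timeout.
        http_timeout_seconds=_read_positive(
            env, "KAVENEGAR_HTTP_TIMEOUT_SECONDS", defaults.http_timeout_seconds
        ),
        # Variable name suggested in this issue.
        finalization_window_minutes=_read_positive(
            env, "KAVENEGAR_STATUS_TIMEOUT_MINUTES", defaults.finalization_window_minutes
        ),
    )
```

Passing a plain mapping instead of `os.environ` keeps the config override easy to unit test, for example `load_status_config({"KAVENEGAR_STATUS_TIMEOUT_MINUTES": "45"})`.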

Acceptance criteria

  • Messages that receive late delivery reports within the new window are updated to the correct status instead of being marked permanently failed.
  • No significant increase in resource usage from extended polling (or, if polling frequency changes, a mitigation plan is provided).
  • Timeout value is configurable and documented.
  • Unit and integration tests are added to cover the extended timeout behavior and the config override.
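
One way to keep resource usage flat while the window grows is to space status checks out with exponential backoff. The sketch below only computes when checks would happen; `status_check_schedule` and its default delays (15 s initial, doubling, capped at 5 minutes) are assumptions for illustration, not existing behavior.

```python
from typing import List


def status_check_schedule(
    window_seconds: float,
    initial_delay: float = 15.0,   # assumed first wait after sending
    factor: float = 2.0,           # assumed growth factor between checks
    max_delay: float = 300.0,      # assumed cap on the wait between checks
) -> List[float]:
    """Return offsets (seconds after send) at which to poll for status.

    Each wait is the previous wait times `factor`, capped at `max_delay`.
    No check is scheduled past `window_seconds`.
    """
    if window_seconds <= 0 or initial_delay <= 0 or max_delay <= 0 or factor < 1:
        raise ValueError("window and delays must be positive, factor must be >= 1")

    offsets: List[float] = []
    elapsed = 0.0
    delay = min(initial_delay, max_delay)
    while elapsed + delay <= window_seconds:
        elapsed += delay
        offsets.append(elapsed)
        delay = min(delay * factor, max_delay)
    return offsets
```

With these assumed defaults, a 5-minute window yields 4 checks and a 30-minute window yields 9, whereas polling every 15 seconds for 30 minutes would take 120 requests per message.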

Impact & Risks

  • Pending messages are retained in memory/DB for longer (minimal impact if the window is limited to reasonable values).
  • If the timeout is set too long, alerts and retries may be delayed, so the default should be conservative and configurable.
  • May require coordination with monitoring/alerting to avoid false positives.

Implementation notes

  • Identify where we currently mark SMS as failed due to timeout or where we poll Kavenegar for status.
  • Add config entry and respect it in that code path.
  • Update documentation and any operational runbooks.
  • Add tests:
    • Unit test for code that decides when to mark a message failed vs. keep waiting.
    • Integration test (mocking Kavenegar) verifying late delivery report within new window updates status.
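
The decision the unit test needs to cover can be sketched as a pure function. This is an illustration under stated assumptions, not the current code: `Decision`, `decide_status`, and the `has_final_report` flag are hypothetical, and how a Kavenegar response maps to "final" is left to the real code path.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class Decision(Enum):
    FINAL = "final"                # a final delivery report is available
    KEEP_WAITING = "keep_waiting"  # still inside the finalization window
    MARK_FAILED = "mark_failed"    # window elapsed without a final report


def decide_status(
    sent_at: datetime,
    now: datetime,
    window: timedelta,
    has_final_report: bool,
) -> Decision:
    """Decide whether to finalize, keep waiting, or mark a message failed."""
    if has_final_report:
        # A report that has arrived always wins over the timeout.
        return Decision.FINAL
    if now - sent_at >= window:
        return Decision.MARK_FAILED
    return Decision.KEEP_WAITING


# Example: no report 10 minutes after sending.
sent = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
later = sent + timedelta(minutes=10)
old = decide_status(sent, later, timedelta(minutes=5), has_final_report=False)
new = decide_status(sent, later, timedelta(minutes=30), has_final_report=False)
```

In this example `old` is `Decision.MARK_FAILED` and `new` is `Decision.KEEP_WAITING`, which is the behavior change this issue asks for. Taking `now` as a parameter lets tests cover the window boundary without sleeping.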

Suggested default values

  • HTTP request timeout: 10s → 30s (if currently lower)
  • Application-level finalization window: 5 minutes → 30 minutes (example; adjust per product needs)

Metadata

Labels

enhancement (New feature or request)
