
fix(intercom): retry transient read failures #6

Merged
altaywtf merged 1 commit into main from fix/intercom-transient-retries on May 17, 2026

@altaywtf altaywtf commented May 17, 2026

Summary

Harden live Intercom reads against transient provider and network failures seen during a 90-day backfill attempt.

Changed

  • increases the live CLI HTTP timeout to 90 seconds
  • adds opt-in Intercom client retry attempts for request timeouts and HTTP 5xx responses
  • keeps HTTP 429 handling in the syncer rate-limit layer only
  • covers transient 502 recovery with a synthetic Intercom client test

Risks

  • longer live waits when Intercom is slow or unhealthy
  • retry behavior must not duplicate rate-limit retries

Verification

  • go test ./internal/intercom ./internal/syncer ./internal/cli
  • ./scripts/verify
  • $HOME/.agents/skills/codex-review/scripts/codex-review --mode local (clean after the 429 retry layering fix)

Complexity

Reduced for operators: live backfills can survive transient provider failures without adding tenant-specific wrapper behavior.


Summary by cubic

Add opt-in retries for transient Intercom read failures and increase the live CLI HTTP timeout to 90s. This improves backfill resilience against timeouts and 5xx errors while keeping 429 handling unchanged.

  • Bug Fixes
    • Added client retries for request timeouts and HTTP 5xx with MaxAttempts and RetryBackoff (CLI sets 3 attempts, 2s backoff).
    • Kept HTTP 429 handling in the syncer rate-limit layer (no duplicate retries).
    • Increased live CLI HTTP timeout from 30s to 90s.
    • Added a test that verifies recovery from a 502 response.

Written for commit 0931d71.

Copilot AI review requested due to automatic review settings May 17, 2026 17:31

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 3 files

Copilot AI left a comment

Pull request overview

This PR hardens live Intercom “read” operations during backfills by adding opt-in retry behavior for transient failures and increasing the live CLI timeout to better tolerate slow/unhealthy provider responses.

Changes:

  • Increased the live CLI HTTP client timeout from 30s to 90s.
  • Added opt-in retry support in the Intercom client for request timeouts and HTTP 5xx responses (while leaving 429 handling to the syncer’s rate-limit retry layer).
  • Added a synthetic test to verify recovery from a transient 502 during conversation search.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| internal/intercom/client.go | Adds configurable retry attempts + backoff for timeouts and 5xx, with proper body handling between attempts. |
| internal/intercom/client_test.go | Adds coverage ensuring a 502 on the first request is retried and succeeds on the next attempt. |
| internal/cli/cli.go | Increases live HTTP timeout and enables Intercom client retries for live sync runs. |

@altaywtf altaywtf merged commit cc4c775 into main May 17, 2026
7 checks passed
@altaywtf altaywtf deleted the fix/intercom-transient-retries branch May 17, 2026 17:33