Embedding API 429 floods Sentry with 31K events — no rate limit backoff #2898

@graycyrus

Summary

The embedding API client neither backs off nor rate-limits itself when the server returns 429 Too Many Requests. This has generated 31,328 Sentry error events, making it the single noisiest issue in the project.

Problem

What happened: When the embedding API returns 429 ("Rate limit exceeded. Please retry after a brief wait."), the client either retries immediately without backoff or reports every failed attempt as a new Sentry error event (possibly both). Result: 31K+ events flooding Sentry.

Expected:

  1. On 429, apply exponential backoff before retrying
  2. 429 responses should NOT fire Sentry error events. They are expected and transient, so they should be logged at warn level at most
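
A minimal sketch of the second expectation, assuming a hypothetical `Severity` enum and `classify_embedding_failure` helper (neither name comes from the codebase, and the real observability layer may look quite different). It treats HTTP 429 and the rate-limit markers mentioned in this issue as warn-level and everything else as an error:

```rust
/// Hypothetical severity levels; the project's real observability types may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    /// Logged locally, not reported as a Sentry error event.
    Warn,
    /// Reported as a Sentry error event.
    Error,
}

/// Classify an embedding API failure by HTTP status and response body.
/// Rate-limit responses are expected and transient, so they map to `Warn`.
pub fn classify_embedding_failure(status: u16, body: &str) -> Severity {
    // Case-insensitive match on the markers named in this issue.
    let lowered = body.to_ascii_lowercase();
    let rate_limited = status == 429
        || lowered.contains("rate limit exceeded")
        || lowered.contains("rate_limit_error");

    if rate_limited {
        Severity::Warn
    } else {
        Severity::Error
    }
}
```

Matching on the body as well as the status is a judgment call: it catches rate-limit errors that arrive wrapped in another status, at the cost of a string match that could go stale if the server changes its wording.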

Impact: 31,328 events in Sentry (TAURI-RUST-3), ongoing across releases 0.54.0 → 0.56.0. This is the highest-volume issue in the entire project. It wastes Sentry quota and buries real bugs.

Version / Platform: openhuman@0.54.0 through 0.56.0, all platforms

Solution (optional)

Two fixes needed:

  1. Add backoff: In the embedding client (src/openhuman/embeddings/ or similar), add exponential backoff on 429. Respect Retry-After header if present.
  2. Suppress Sentry noise: Add "Rate limit exceeded" / "rate_limit_error" / HTTP 429 to the observability suppression list (like the PII fix in PR #2850, "fix(observability): demote memory-store PII rejection errors from Sentry to warn") so 429s are logged as warnings, not error events.
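
The backoff in fix 1 could be sketched roughly as follows. `BackoffPolicy`, `parse_retry_after`, and `next_delay` are hypothetical names chosen for illustration, not existing code. The sketch uses only the standard library, handles only the delay-seconds form of Retry-After (the HTTP-date form falls through to exponential backoff), and omits jitter:

```rust
use std::time::Duration;

/// Hypothetical tuning knobs; none of these names come from the codebase.
#[derive(Debug, Clone, Copy)]
pub struct BackoffPolicy {
    /// Delay before the first retry.
    pub base: Duration,
    /// Upper bound on any single delay, including server-provided hints.
    pub max_delay: Duration,
    /// Number of retries allowed before giving up.
    pub max_retries: u32,
}

/// Parse a `Retry-After` header value in delay-seconds form.
/// The HTTP-date form is not handled in this sketch and yields `None`.
pub fn parse_retry_after(value: &str) -> Option<Duration> {
    value.trim().parse::<u64>().ok().map(Duration::from_secs)
}

/// Delay to wait before retry number `attempt` (0-based) after a 429,
/// or `None` once the retry budget is exhausted.
pub fn next_delay(
    policy: &BackoffPolicy,
    attempt: u32,
    retry_after: Option<&str>,
) -> Option<Duration> {
    if attempt >= policy.max_retries {
        return None;
    }

    // A parseable server hint takes precedence, capped at `max_delay`.
    if let Some(hint) = retry_after.and_then(parse_retry_after) {
        return Some(hint.min(policy.max_delay));
    }

    // Otherwise use base * 2^attempt, saturating instead of overflowing.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    Some(policy.base.saturating_mul(factor).min(policy.max_delay))
}
```

The caller would sleep for the returned duration and stop retrying on `None`. Capping the server's hint at `max_delay` is one possible choice; honoring it uncapped is equally defensible. Keeping the delay computation free of I/O also makes it straightforward to cover with the unit test the acceptance criteria ask for.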

Acceptance criteria

  • Repro gone — 429 responses don't flood Sentry
  • Backoff works — Embedding retries use exponential backoff on 429
  • Regression safety — Unit test for backoff behavior
  • Diff coverage >= 80%
