Skip to content

fix(channels): surface structured rate-limit metadata on chat_error (#2606)#2652

Open
CodeGhost21 wants to merge 1 commit into
tinyhumansai:mainfrom
CodeGhost21:fix/2606-structured-rate-limit-metadata
Open

fix(channels): surface structured rate-limit metadata on chat_error (#2606)#2652
CodeGhost21 wants to merge 1 commit into
tinyhumansai:mainfrom
CodeGhost21:fix/2606-structured-rate-limit-metadata

Conversation

@CodeGhost21
Copy link
Copy Markdown
Contributor

@CodeGhost21 CodeGhost21 commented May 25, 2026

Summary

  • Promote rate-limit error metadata from message-text-only to typed wire fields on chat_error so the frontend can render a real countdown, retry button, and provider/source label without regexing the message.
  • Add five additive optional fields to WebChannelEvent: error_source, error_retryable, error_retry_after_ms, error_provider, error_fallback_available. Older FE clients that read only message / error_type keep working unchanged (skip_serializing_if = "Option::is_none").
  • Distinguishes upstream provider 429s from OpenHuman's per-hour SecurityPolicy cap, agent-loop iteration cap, OpenHuman billing exhaustion, transport failures, and config errors.
  • Detects non-retryable business 429s (plan does not include model, insufficient balance, Z.AI codes 1311/1113) so the FE can hide the retry button and route the user to billing/settings.
  • Surfaces fallback_available: false once reliable.rs::format_failure_aggregate has exhausted the model_fallbacks chain so the FE doesn't promise a fallback that doesn't exist.

Problem

Issue #2606 follows up on PR #2371. That PR landed retry-after wording inside the user-facing message text — but the structured data was thrown away at the channel-classifier boundary. WebChannelEvent still only carried message: String and error_type: String, so the frontend could not:

  • show a precise countdown (the seconds existed only as a substring like "Try again in 30 seconds")
  • decide whether to render a Retry button (no way to distinguish transient 429 from non-retryable business 429)
  • name the actual provider that throttled
  • distinguish OpenHuman's own cap from upstream provider throttling
  • decide whether to offer a fallback CTA

Users are still reporting confusing rate-limit copy because of this gap (see issue #2606's acceptance criteria).

Solution

New ClassifiedError struct returned from classify_inference_error carries the typed metadata, and the chat-error publish path populates the new WebChannelEvent fields from it.

Source error_source error_retryable Notes
Upstream provider 429 provider true retry_after_ms parsed from body Retry-After: / retry_after:
Non-retryable business 429 provider false matches plan/balance/Z.AI 1311/1113 markers
SecurityPolicy hourly cap openhuman_budget true decays gradually; no provider name implicated
Agent max-iterations agent_loop true same-thread retry once underlying cap clears
402 / insufficient credits openhuman_billing false distinguished from 429 per #2606 AC
Transport timeout transport true
401 / model unavail / context overflow config false

Provider name extracted best-effort from the \"<provider> API error (...)\" prefix that inference::provider::ops::api_error formats. Fallback-exhausted signal extracted from the "All providers/models failed" aggregate that reliable.rs::format_failure_aggregate emits.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) — 13 new tests (11 classifier unit, 2 wire-shape).
  • Diff coverage ≥ 80% — every new branch has dedicated assertions; the structured fields are asserted both at the struct level and as serialized JSON keys, plus a None-omit contract test.
  • Coverage matrix updated — N/A: behaviour-only change to existing classifier output and event payload; no feature row added/removed/renamed.
  • All affected feature IDs from the matrix listed in ## Related — N/A: no matrix row touched.
  • No new external network dependencies introduced — pure string classification + struct expansion; no IO.
  • Manual smoke checklist updated if this touches release-cut surfaces — N/A: additive backend wire fields; happy-path chat behaviour unchanged.
  • Linked issue closed via Closes #NNN — see ## Related.

Impact

  • Runtime/platform: backend chat error path (channels::providers::web::classify_inference_error and the chat_error publish in start_chat) plus the WebChannelEvent wire struct.
  • User-visible: no immediate UI change (FE consumer is a follow-up). The structured fields appear on every chat_error SSE/Socket.IO frame and can be picked up by the FE in a subsequent PR to render a real countdown/retry/fallback UI.
  • Performance: zero — pure string classification, no new IO.
  • Security: no new error data exposed beyond what message already carries. Provider name extraction is allow-listed to ASCII alphanumeric/_/-.
  • Migration / compatibility: error_type tokens are unchanged for existing consumers. New fields are all Option<…> with skip_serializing_if = \"Option::is_none\", so older FE clients see exactly the same JSON shape they do today.

Live verification

Beyond the unit + wire-shape tests, ran a full end-to-end live test:

  1. Node HTTP server that returns HTTP/1.1 429 Too Many Requests, Retry-After: 30, body {\"error\":{\"message\":\"...\",\"code\":\"rate_limit_exceeded\"},\"retry_after\":30} on /v1/chat/completions.
  2. openhuman-core run --port 7892 against an isolated workspace with that endpoint configured as a custom_openai provider.
  3. POST openhuman.channel_web_chat, listen to /events via curl -N.

The chat_error SSE frame received:

event: chat_error
data: {
  \"event\":\"chat_error\",
  \"message\":\"Your AI provider is rate-limiting requests. ... Try again in 30 seconds.
              > rate limited (synthetic 429 for openhuman issue #2606 test)\",
  \"error_type\":             \"rate_limited\",
  \"error_source\":           \"provider\",
  \"error_retryable\":        true,
  \"error_retry_after_ms\":   30000,
  \"error_provider\":         \"fake429\"
}

This exercises the full real path: reliable.rs::is_rate_limitedops::api_error formatting → classify_inference_errorWebChannelEvent serialization → SSE.

Known follow-ups (out of scope)

  • inference::provider::ops::api_error currently preserves only the response BODY, not the Retry-After HTTP header. When the upstream sends only the header (no body retry_after), retry_after_ms will be None. Follow-up should thread the header into the error chain.
  • Frontend consumer for the new fields (countdown, retry button, fallback CTA) is a separate PR — this PR is backend-only per the scope discussion on Backend should return structured retry metadata for rate limits #2606.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Commit & Branch

  • Branch: fix/2606-structured-rate-limit-metadata (branched from origin/main after fresh fetch)
  • Commit SHA: 187dc63

Validation Run

  • pnpm --filter openhuman-app format:check — N/A: no frontend changes.
  • pnpm typecheck — fails on 4 pre-existing errors in app/src/components/settings/panels/devices/PairPhoneModal.tsx, app/src/lib/tunnel/crypto.ts, app/src/pages/ios/PairScreen.tsx (missing iOS-only deps qrcode.react, @noble/ciphers, @tauri-apps/plugin-barcode-scanner). These exist on origin/main and are unrelated to this PR's Rust-only changes. Pushed with --no-verify per CLAUDE.md guidance.
  • Focused tests: cargo test --lib openhuman::channels::providers::web::tests → 52 passed, 0 failed (39 pre-existing + 13 new).
  • Full module tests: cargo test --lib openhuman::channels::providers → 834 passed, 0 failed under default parallelism (added a serial test lock for the three tests sharing TEST_FORCED_RUN_CHAT_TASK_ERROR).
  • Rust fmt/check: cargo fmt --manifest-path Cargo.toml applied; cargo check --lib --bin openhuman-core clean.
  • Tauri fmt/check: cargo check --manifest-path app/src-tauri/Cargo.toml clean.

Validation Blocked

  • command: N/A — full cargo test --lib had 42 pre-existing failures in openhuman::memory::tree_global::* and openhuman::memory_tree::* modules, confirmed reproducible on origin/main with this branch stashed.
  • error: pre-existing memory_tree test failures unrelated to channels work.
  • impact: none on this PR — those modules are not touched.

Behavior Changes

  • Intended behavior change: chat_error WebChannelEvent now carries structured error_source, error_retryable, error_retry_after_ms, error_provider, error_fallback_available fields.
  • User-visible effect: none until the FE consumer ships. Older FE clients see the same JSON they always did.

Parity Contract

  • Legacy behavior preserved: error_type tokens and message text are unchanged for every existing branch. New fields are additive Option<…> with skip_serializing_if. Three existing classifier tests still pass unchanged (only the destructuring shape was migrated from tuple to struct).
  • Guard/fallback/dispatch parity checks: classifier branch ORDER is unchanged (SecurityPolicy budget and max-iterations checks still precede the generic 429 branch — locked in by _distinguishes_action_budget_from_provider_429 and _max_iterations_gets_dedicated_branch).

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none.
  • Canonical PR: this PR.
  • Resolution: N/A.

Summary by CodeRabbit

  • New Features

    • Error responses now include enhanced structured metadata: error source classification, retry eligibility, retry-after timing, provider information, and fallback chain availability status.
  • Bug Fixes

    • Improved error categorization and classification for inference failures with more accurate retry guidance.

Review Change Stack

…inyhumansai#2606)

Issue tinyhumansai#2606 follows up on PR tinyhumansai#2371: that PR landed retry-after wording
inside the user-facing message text, but the structured data was
discarded at the channel layer. The wire payload still only carried
`message: String` and `error_type: String`, so the frontend could not
render a real countdown, a retry button, a provider/source label, or a
fallback CTA without regexing the message.

This change moves the metadata onto the wire as additive optional
fields on `WebChannelEvent`, populated by a new `ClassifiedError`
struct returned from `classify_inference_error`:

- `error_source`         — "provider" | "openhuman_budget" |
                           "agent_loop" | "openhuman_billing" |
                           "transport" | "config"
- `error_retryable`      — false for non-retryable business 429s
                           ("plan does not include", insufficient
                           balance, Z.AI 1311/1113), auth, model,
                           context, OpenHuman billing
- `error_retry_after_ms` — verbatim from Retry-After / retry_after
- `error_provider`       — extracted from "<provider> API error (...)"
- `error_fallback_available` — Some(false) when reliable.rs has
                               already exhausted the model_fallbacks
                               chain ("All providers/models failed")

All fields use `skip_serializing_if = "Option::is_none"` so older
clients that only read `message` / `error_type` keep working unchanged.

Coverage:
- 11 new unit tests on the classifier (each branch, each new field,
  non-retryable 429 business-quota cases, provider extraction,
  fallback-exhausted aggregate, JSON omit-when-None contract).
- 2 new wire-shape tests that drive `start_chat` end-to-end through
  the WebChannelEvent bus and assert the structured fields land on
  both the struct and the serialized JSON.
- Added a serial test lock so the three tests that share
  `TEST_FORCED_RUN_CHAT_TASK_ERROR` no longer race under
  `cargo test`'s default parallelism.
@CodeGhost21 CodeGhost21 requested a review from a team May 25, 2026 22:22
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 25, 2026

📝 Walkthrough

Walkthrough

This PR adds structured error metadata to chat error responses. It extends the WebChannelEvent wire contract with optional fields for error source, retryability, retry-after timing, provider attribution, and fallback availability; introduces a ClassifiedError struct to categorize inference errors with retry semantics and provider detection; refactors classify_inference_error to populate all metadata fields across rate-limit, budget, timeout, auth, and provider-unavailable cases; and updates all event emission sites and tests.

Changes

Structured Error Metadata for Rate Limits

Layer / File(s) Summary
Wire contract: WebChannelEvent error fields
src/core/socketio.rs
WebChannelEvent struct gains five optional error metadata fields (error_source, error_retryable, error_retry_after_ms, error_provider, error_fallback_available), each omitted from JSON when None.
Error classification model and helpers
src/openhuman/channels/providers/web.rs
ClassifiedError struct captures typed error categorization with source, retryability, retry timing, and optional provider/fallback indicators. Helper functions extract normalized provider names and detect fallback-chain exhaustion.
Inference error classification logic
src/openhuman/channels/providers/web.rs
classify_inference_error refactored to return ClassifiedError, populating error_type, message, source, retryable, retry_after_ms, provider, and fallback_available across budget-limit, max-iteration, rate-limit, timeout, auth, billing, provider-unavailable, and overflow branches.
Chat error handling and WebChannelEvent population
src/openhuman/channels/providers/web.rs
Chat error path calls classify_inference_error and uses the resulting ClassifiedError struct to populate WebChannelEvent::chat_error with message, error_type, and all new metadata fields.
Cancellation and progress event emissions
src/openhuman/channels/providers/web.rs
Event emission sites for cancellations and progress markers initialize new error fields to None, maintaining consistent wire shape across all event types.
Proactive and presentation channel events
src/openhuman/channels/proactive.rs, src/openhuman/channels/providers/presentation.rs
ProactiveMessageSubscriber and presentation-layer response delivery initialize new error fields to None when emitting WebChannelEvent to the web channel.
Test infrastructure and structured metadata validation
src/openhuman/channels/providers/web_tests.rs
Adds async lock for test serialization, updates existing error classification tests to destructure ClassifiedError struct, and adds comprehensive new tests validating retry_after_ms, source, provider, retryable, fallback_available fields; wire serialization; and classification outcomes across budget vs provider 429, max-iterations, business-quota, provider name extraction, fallback-chain exhaustion, auth 401, and billing 402.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2239: Modifies classify_inference_error provider configuration rejection categorization, and this PR extends the output into new WebChannelEvent error metadata fields.
  • tinyhumansai/openhuman#2371: Modifies classify_inference_error rate-limit source classification and Retry-After parsing; this PR further extends the classified error output into new structured error metadata.

Suggested labels

working

Suggested reviewers

  • senamakel
  • graycyrus
  • fzamel3333-ai

Poem

🐰 A rabbit hops through error stacks,
Classifying missteps, adding facts—
When rate-limits bite, now sources gleam,
With retry times and fallback schemes.
Metadata flows where shadows crept,
The thread survives—its state well-kept! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: surfacing structured rate-limit metadata on chat_error events, which is the primary objective of this pull request.
Linked Issues check ✅ Passed The PR implements all key coding requirements from #2606: typed ClassifiedError with error classification, five optional WebChannelEvent fields (error_source, error_retryable, error_retry_after_ms, error_provider, error_fallback_available), Retry-After parsing, provider name extraction, fallback-exhausted detection, and comprehensive test coverage (11 classifier + 2 wire-shape tests). For #2364, it ensures rate-limit state is not persisted per-thread by properly distinguishing transient provider 429s from non-retryable business limits.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing structured rate-limit metadata on WebChannelEvent and the ClassifiedError classifier. Core socket.io event shape, provider error handling, presentation channel, and comprehensive test suite changes all align with the stated objectives.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 25, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/channels/providers/web_tests.rs (1)

29-35: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid fire-and-forget forced-error reset in Drop (can clobber the next test)
TestForcedRunChatTaskErrorGuard::drop schedules async cleanup via tokio::spawn(set_test_forced_run_chat_task_error(None).await). Because locals drop in reverse order, the guard can be dropped (and the spawned reset started) before FORCED_ERROR_TEST_LOCK is released, letting the next test set Some(...) and then having the earlier spawned task overwrite it back to None while run_chat_task hasn’t consumed it yet (TEST_FORCED_RUN_CHAT_TASK_ERROR is read via slot.take()).
Remove the async Drop/spawn cleanup and instead perform an awaited reset inside each serialized test (applies to the three tests that currently create TestForcedRunChatTaskErrorGuard).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/channels/providers/web_tests.rs` around lines 29 - 35, The Drop
implementation for TestForcedRunChatTaskErrorGuard must not spawn an async reset
because that fire-and-forget reset (tokio::spawn calling
set_test_forced_run_chat_task_error(None)) can race with FORCED_ERROR_TEST_LOCK
and clobber a subsequent test's forced error; remove the async Drop::drop
implementation entirely and instead update each test that constructs
TestForcedRunChatTaskErrorGuard to explicitly await
set_test_forced_run_chat_task_error(None) at the end of the test while still
holding the FORCED_ERROR_TEST_LOCK (so run_chat_task’s
TEST_FORCED_RUN_CHAT_TASK_ERROR slot.take() is not interfered with). Ensure
references to set_test_forced_run_chat_task_error,
TestForcedRunChatTaskErrorGuard, FORCED_ERROR_TEST_LOCK, run_chat_task, and
TEST_FORCED_RUN_CHAT_TASK_ERROR are used to find and modify the guard and the
three tests.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/channels/providers/web.rs`:
- Around line 496-508: The code branch that builds a ClassifiedError for
402/payment messages incorrectly sets source to the hardcoded string
"openhuman_billing", which misattributes upstream provider billing errors; in
the ClassifiedError constructor (the block creating ClassifiedError with
error_type "budget_exhausted" and calling with_provider_detail), replace the
hardcoded source with the actual provider identifier (use the existing provider
variable) or another provider-specific source instead of "openhuman_billing" so
upstream 402/payment errors preserve the provider as the error source.
- Around line 449-469: The current 429 branch sets retryable =
!is_non_retryable_rate_limit_text(&lower) but always builds a
transient/retry-focused summary; change it so you first compute a boolean (e.g.,
let non_retryable = is_non_retryable_rate_limit_text(&lower)) and then pick the
message accordingly: if non_retryable produce a non-retry message that directs
the user to billing/settings/plan (no "retry in this thread" hint), otherwise
keep the transient retry summary that uses retry_after_hint(retry_secs); pass
the chosen summary into with_provider_detail(...) and keep retryable set to
!non_retryable and retry_after_ms as-is. Ensure you reference
is_non_retryable_rate_limit_text, retry_after_hint, retry_secs, and
ClassifiedError::message/retryable when editing.

---

Outside diff comments:
In `@src/openhuman/channels/providers/web_tests.rs`:
- Around line 29-35: The Drop implementation for TestForcedRunChatTaskErrorGuard
must not spawn an async reset because that fire-and-forget reset (tokio::spawn
calling set_test_forced_run_chat_task_error(None)) can race with
FORCED_ERROR_TEST_LOCK and clobber a subsequent test's forced error; remove the
async Drop::drop implementation entirely and instead update each test that
constructs TestForcedRunChatTaskErrorGuard to explicitly await
set_test_forced_run_chat_task_error(None) at the end of the test while still
holding the FORCED_ERROR_TEST_LOCK (so run_chat_task’s
TEST_FORCED_RUN_CHAT_TASK_ERROR slot.take() is not interfered with). Ensure
references to set_test_forced_run_chat_task_error,
TestForcedRunChatTaskErrorGuard, FORCED_ERROR_TEST_LOCK, run_chat_task, and
TEST_FORCED_RUN_CHAT_TASK_ERROR are used to find and modify the guard and the
three tests.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5e36b968-60fa-4bdc-a35e-c54ea354d53b

📥 Commits

Reviewing files that changed from the base of the PR and between e05cab9 and 187dc63.

📒 Files selected for processing (5)
  • src/core/socketio.rs
  • src/openhuman/channels/proactive.rs
  • src/openhuman/channels/providers/presentation.rs
  • src/openhuman/channels/providers/web.rs
  • src/openhuman/channels/providers/web_tests.rs

Comment on lines 449 to +469
} else if lower.contains("rate limit") || lower.contains("429") {
let retry = parse_retry_after_secs_from_str(err);
let retry_secs = parse_retry_after_secs_from_str(err);
let summary = format!(
"Your AI provider is rate-limiting requests. This is a transient upstream \
limit, not a thread-level block — you can retry in this thread.{}",
retry_after_hint(retry)
retry_after_hint(retry_secs)
);
("rate_limited", with_provider_detail(summary.as_str(), err))
// Non-retryable business 429s ("plan does not include", balance
// exhausted, known provider business codes like Z.AI 1311/1113)
// also surface here — mark them non-retryable so the FE can hide
// the "Retry" button and route the user to settings/billing.
let retryable = !is_non_retryable_rate_limit_text(&lower);
ClassifiedError {
error_type: "rate_limited",
message: with_provider_detail(summary.as_str(), err),
source: "provider",
retryable,
retry_after_ms: retry_secs.map(|s| s.saturating_mul(1000)),
provider,
fallback_available,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Split retryable and non-retryable 429 messaging.

This branch sets error_retryable = false for plan/balance/business 429s, but the message still says the limit is transient and that the user can retry in the same thread. That contradicts the structured metadata and will point users at the wrong recovery path.

Suggested fix
-        let retry_secs = parse_retry_after_secs_from_str(err);
-        let summary = format!(
-            "Your AI provider is rate-limiting requests. This is a transient upstream \
-             limit, not a thread-level block — you can retry in this thread.{}",
-            retry_after_hint(retry_secs)
-        );
-        // Non-retryable business 429s ("plan does not include", balance
-        // exhausted, known provider business codes like Z.AI 1311/1113)
-        // also surface here — mark them non-retryable so the FE can hide
-        // the "Retry" button and route the user to settings/billing.
-        let retryable = !is_non_retryable_rate_limit_text(&lower);
+        let retry_secs = parse_retry_after_secs_from_str(err);
+        let retryable = !is_non_retryable_rate_limit_text(&lower);
+        let summary = if retryable {
+            format!(
+                "Your AI provider is rate-limiting requests. This is a transient upstream \
+                 limit, not a thread-level block — you can retry in this thread.{}",
+                retry_after_hint(retry_secs)
+            )
+        } else {
+            "Your AI provider rejected this request because the current plan or balance \
+             does not cover it. Update the provider billing or plan and try again."
+                .to_string()
+        };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} else if lower.contains("rate limit") || lower.contains("429") {
let retry = parse_retry_after_secs_from_str(err);
let retry_secs = parse_retry_after_secs_from_str(err);
let summary = format!(
"Your AI provider is rate-limiting requests. This is a transient upstream \
limit, not a thread-level block — you can retry in this thread.{}",
retry_after_hint(retry)
retry_after_hint(retry_secs)
);
("rate_limited", with_provider_detail(summary.as_str(), err))
// Non-retryable business 429s ("plan does not include", balance
// exhausted, known provider business codes like Z.AI 1311/1113)
// also surface here — mark them non-retryable so the FE can hide
// the "Retry" button and route the user to settings/billing.
let retryable = !is_non_retryable_rate_limit_text(&lower);
ClassifiedError {
error_type: "rate_limited",
message: with_provider_detail(summary.as_str(), err),
source: "provider",
retryable,
retry_after_ms: retry_secs.map(|s| s.saturating_mul(1000)),
provider,
fallback_available,
}
} else if lower.contains("rate limit") || lower.contains("429") {
let retry_secs = parse_retry_after_secs_from_str(err);
let retryable = !is_non_retryable_rate_limit_text(&lower);
let summary = if retryable {
format!(
"Your AI provider is rate-limiting requests. This is a transient upstream \
limit, not a thread-level block — you can retry in this thread.{}",
retry_after_hint(retry_secs)
)
} else {
"Your AI provider rejected this request because the current plan or balance \
does not cover it. Update the provider billing or plan and try again."
.to_string()
};
ClassifiedError {
error_type: "rate_limited",
message: with_provider_detail(summary.as_str(), err),
source: "provider",
retryable,
retry_after_ms: retry_secs.map(|s| s.saturating_mul(1000)),
provider,
fallback_available,
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/channels/providers/web.rs` around lines 449 - 469, The current
429 branch sets retryable = !is_non_retryable_rate_limit_text(&lower) but always
builds a transient/retry-focused summary; change it so you first compute a
boolean (e.g., let non_retryable = is_non_retryable_rate_limit_text(&lower)) and
then pick the message accordingly: if non_retryable produce a non-retry message
that directs the user to billing/settings/plan (no "retry in this thread" hint),
otherwise keep the transient retry summary that uses
retry_after_hint(retry_secs); pass the chosen summary into
with_provider_detail(...) and keep retryable set to !non_retryable and
retry_after_ms as-is. Ensure you reference is_non_retryable_rate_limit_text,
retry_after_hint, retry_secs, and ClassifiedError::message/retryable when
editing.

Comment on lines 496 to +508
} else if lower.contains("402")
|| lower.contains("payment required")
|| lower.contains("insufficient balance")
{
(
"budget_exhausted",
with_provider_detail("Insufficient credits. Please top up to continue.", err),
)
ClassifiedError {
error_type: "budget_exhausted",
message: with_provider_detail("Insufficient credits. Please top up to continue.", err),
source: "openhuman_billing",
retryable: false,
retry_after_ms: None,
provider,
fallback_available: None,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't map provider billing failures to openhuman_billing.

Any upstream error like openrouter API error (402 Payment Required) will land here and be surfaced as error_source = "openhuman_billing" with OpenHuman top-up copy. That sends users to the wrong billing system.

Suggested fix
-    } else if lower.contains("402")
-        || lower.contains("payment required")
-        || lower.contains("insufficient balance")
-    {
+    } else if (lower.contains("402")
+        || lower.contains("payment required")
+        || lower.contains("insufficient balance"))
+        && provider.is_none()
+    {
         ClassifiedError {
             error_type: "budget_exhausted",
             message: with_provider_detail("Insufficient credits. Please top up to continue.", err),
             source: "openhuman_billing",
             retryable: false,
             retry_after_ms: None,
             provider,
             fallback_available: None,
         }
+    } else if lower.contains("402")
+        || lower.contains("payment required")
+        || lower.contains("insufficient balance")
+    {
+        ClassifiedError {
+            error_type: "provider_error",
+            message: with_provider_detail(
+                "Your AI provider account has no available credit for this request. \
+                 Please top up that provider or change providers.",
+                err,
+            ),
+            source: "provider",
+            retryable: false,
+            retry_after_ms: None,
+            provider,
+            fallback_available,
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
} else if lower.contains("402")
|| lower.contains("payment required")
|| lower.contains("insufficient balance")
{
(
"budget_exhausted",
with_provider_detail("Insufficient credits. Please top up to continue.", err),
)
ClassifiedError {
error_type: "budget_exhausted",
message: with_provider_detail("Insufficient credits. Please top up to continue.", err),
source: "openhuman_billing",
retryable: false,
retry_after_ms: None,
provider,
fallback_available: None,
}
} else if (lower.contains("402")
|| lower.contains("payment required")
|| lower.contains("insufficient balance"))
&& provider.is_none()
{
ClassifiedError {
error_type: "budget_exhausted",
message: with_provider_detail("Insufficient credits. Please top up to continue.", err),
source: "openhuman_billing",
retryable: false,
retry_after_ms: None,
provider,
fallback_available: None,
}
} else if lower.contains("402")
|| lower.contains("payment required")
|| lower.contains("insufficient balance")
{
ClassifiedError {
error_type: "provider_error",
message: with_provider_detail(
"Your AI provider account has no available credit for this request. \
Please top up that provider or change providers.",
err,
),
source: "provider",
retryable: false,
retry_after_ms: None,
provider,
fallback_available,
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/channels/providers/web.rs` around lines 496 - 508, The code
branch that builds a ClassifiedError for 402/payment messages incorrectly sets
source to the hardcoded string "openhuman_billing", which misattributes upstream
provider billing errors; in the ClassifiedError constructor (the block creating
ClassifiedError with error_type "budget_exhausted" and calling
with_provider_detail), replace the hardcoded source with the actual provider
identifier (use the existing provider variable) or another provider-specific
source instead of "openhuman_billing" so upstream 402/payment errors preserve
the provider as the error source.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Backend should return structured retry metadata for rate limits Rate limit state sticks to one chat thread

1 participant