Skip to content

fix(observability): classify provider config-rejection 4xx as expected user-state (#2079)#2239

Merged
graycyrus merged 3 commits into
tinyhumansai:mainfrom
sanil-23:fix/2079-provider-config-rejection-classifier
May 19, 2026
Merged

fix(observability): classify provider config-rejection 4xx as expected user-state (#2079)#2239
graycyrus merged 3 commits into
tinyhumansai:mainfrom
sanil-23:fix/2079-provider-config-rejection-classifier

Conversation

@sanil-23
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 commented May 19, 2026

Summary

  • Custom cloud providers (custom_openai -> DeepSeek / OpenRouter / Moonshot) reject requests whose model id or sampling params they do not accept. These are deterministic user-configuration state, already surfaced in the UI, but every agent turn produced a fresh Sentry event (OPENHUMAN-TAURI-WJ / -QW / -HB / -NH, ~273 events).
  • Adds a tight is_provider_config_rejection_message classifier (sibling of billing_error) and demotes these from Sentry to an info log at both emission points, mirroring the existing budget-exhaustion pattern.
  • Provider-aware, inverted polarity: the same body from the OpenHuman backend stays Sentry-actionable (that would be a real regression - we sent our own backend a bad request).
  • The user now sees an actionable model_unavailable message ("fix your model/routing in Settings -> LLM") instead of the generic "inference" error.
  • Purely additive (+~430 incl. tests), no behavior removed.

Problem

#2079 (abstract tier reasoning-v1/chat-v1 leaked to a provider that only speaks native ids), #2076 (Moonshot Kimi K2 only accepts temperature: 1), and #2202 (unknown / stale model pin) all surface as upstream 4xx invalid_request bodies like:

The supported API model names are deepseek-v4-pro or deepseek-v4-flash, but you passed reasoning-v1.
invalid temperature: only 1 is allowed for this model
Model 'claude-opus-4-7' is not available. Use GET /openai/v1/models to list available models.

These reach Sentry at two points, both unfiltered (the before_send filter only drops transient domain=llm_provider statuses - 400/404 are not transient, matching the issues' domain=llm_provider and domain=agent tags):

  1. providers::ops::api_error (domain=llm_provider, operation=api_error) - should_report_provider_http_failure is status-based, so a 400 reports.
  2. The re-report from agent.run_single / web_channel.run_chat_task (domain=agent/web_channel) - expected_error_kind had no matcher for these bodies, so report_error_or_expected fell through to report_error.

This is the same class as budget-exhaustion, which the codebase already demotes - the matcher just did not exist for model/param rejection.

Solution

  • New inference/provider/config_rejection.rs - pure is_provider_config_rejection_message predicate, deliberately tight phrase list tied to the exact Sentry bodies. The phrases are emitted only by third-party upstream APIs (the OpenHuman backend resolves tier names natively and never emits them), so the message-level predicate is intrinsically scoped to custom providers.
  • ops::api_error - is_provider_config_rejection_http (4xx in {400,404,422} AND provider != openhuman_backend::PROVIDER_LABEL) + log_provider_config_rejection, a new branch mirroring is_budget_exhausted_http_400. The non-backend guard is the inverted-polarity mirror of the existing 401/403-backend rule.
  • core/observability.rs - ExpectedErrorKind::ProviderConfigRejection, wired into expected_error_kind and report_expected_message. This is the central fix: it catches the dominant re-report flood from every report_error_or_expected caller (agent, web-channel, cron, scheduler).
  • channels/providers/web::classify_inference_error - new arm reusing the shared predicate so the user gets an actionable model_unavailable message instead of the generic "inference" bucket.

Deliberate scope: the 5 duplicated decision-tree blocks in compatible.rs (operations responses_api/chat_completions) were intentionally left untouched - the issues cite only operation=api_error (handled) and operation=native_chat (handled centrally via expected_error_kind, since those bailed errors propagate to the agent layer and are re-reported there). Flagged as a follow-up cleanup rather than duplicating the branch 5x.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) - 13 unit tests across 4 files: predicate phrase detection + negatives, provider-aware inverted-polarity (backend same body NOT suppressed), 5xx/transient/unrelated-4xx guards, expected_error_kind classify + over-match guard, report_error_or_expected demotion path, web::classify_inference_error actionable mapping, log-helper smoke.
  • Diff coverage >= 80% - every new predicate/branch/arm is exercised; only ~3 api_error var-assignment/branch wiring lines (which need a live reqwest::Response) are uncovered, well under 20% of changed executable lines. cargo fmt --check clean, cargo clippy --lib zero issues in changed files.
  • Coverage matrix updated - N/A: observability/error-classification change, no added/removed/renamed feature row.
  • All affected feature IDs listed under ## Related - N/A: no matrix feature touched.
  • No new external network dependencies introduced - confirmed (pure in-process classifier, no new crates).
  • Manual smoke checklist updated if this touches release-cut surfaces - N/A: internal error classification, no release-cut surface.
  • Linked issue closed via Closes #NNN - see Related.

Impact

  • Runtime / platform: none on hot paths - same lookup shape that already runs for budget-exhaustion; only adds branches on the error path.
  • Security / migration / compatibility: zero. Errors still propagate via Err unchanged (retry/fallback unaffected); only the per-attempt Sentry report is gated, exactly like the budget/transient/session-expired precedents.
  • User-visible: custom-provider users hitting an unknown model / abstract tier / temperature constraint now see an actionable "fix your model/routing in Settings -> LLM" message instead of a generic error; the ~273 Sentry events/window from these (OPENHUMAN-TAURI-WJ / -QW / -HB / -NH) drop to info-level breadcrumbs.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

AI-authored PR.

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/2079-provider-config-rejection-classifier
  • Commit SHA: 7a3c7edb

Validation Run

  • pnpm --filter openhuman-app format:check - N/A: no app/* files changed.
  • pnpm typecheck - N/A: no TypeScript changed.
  • Focused tests: cargo test --lib config_rejection report_error_or_expected_does_not_panic -> 13 passed, 0 failed.
  • Rust fmt/check (if changed): cargo fmt --check clean; cargo check --lib exit 0; cargo clippy --lib zero issues in changed files.
  • Tauri fmt/check (if changed): N/A: no app/src-tauri files changed.

Validation Blocked

  • command: none
  • error: none - full cargo check --lib, targeted cargo test --lib, and cargo clippy --lib all ran cleanly on the dev host.
  • impact: none. CI runs the full suite + diff-cover.

Behavior Changes

  • Intended behavior change: provider 4xx config-rejection from a non-backend provider is demoted from a Sentry error to an info log at both the provider-HTTP layer and the agent/web-channel re-report; the user-facing error is classified model_unavailable with actionable copy.
  • User-visible effect: no more generic "inference" error for unknown-model / abstract-tier / temperature rejections - the message tells the user to fix their model/routing in Settings.

Parity Contract

  • Legacy behavior preserved: Yes - errors still propagate via Err unchanged (retry/fallback/dispatch untouched); only the Sentry report is gated, identical contract to the existing budget/transient/session-expired demotions.
  • Guard/fallback/dispatch parity checks: inverted-polarity test pins that a model-rejection body from openhuman_backend::PROVIDER_LABEL still reports; 5xx / transient-429 / generic-4xx guards pin that real bugs are not silenced.

Duplicate / Superseded PR Handling

Summary by CodeRabbit

  • Bug Fixes

    • Provider configuration rejection errors (unsupported models, invalid parameters) now provide actionable guidance directing users to their LLM settings instead of generic messages.
  • Chores

    • Enhanced error handling to properly classify and suppress expected provider configuration failures, reducing error report noise.

Review Change Stack

sanil-23 added 2 commits May 19, 2026 23:39
…d user-state (tinyhumansai#2079)

Custom cloud providers (custom_openai -> DeepSeek / OpenRouter / Moonshot)
reject requests whose model id or sampling params they don't accept: an
OpenHuman abstract tier alias leaked to a provider that only speaks
native ids (tinyhumansai#2079), an unknown / stale model pin (tinyhumansai#2202), or a
model-specific temperature constraint (tinyhumansai#2076, Moonshot Kimi K2). These
are deterministic user-configuration state, already surfaced in the UI,
but every agent turn produced a fresh Sentry event (OPENHUMAN-TAURI-WJ /
-QW / -HB / -NH, ~273 events) at BOTH the provider HTTP layer
(domain=llm_provider) and the re-report from agent.run_single /
web_channel.run_chat_task (domain=agent) -- before_send only drops
transient statuses, not 400/404.

- New is_provider_config_rejection_message predicate (sibling of
  billing_error), tight phrase list tied to the exact Sentry bodies.
- ops::api_error: provider-aware demotion with inverted polarity -- a
  model-rejection from the OpenHuman backend stays Sentry-actionable
  (that would be a real regression we sent our own backend).
- observability::ExpectedErrorKind::ProviderConfigRejection wired into
  expected_error_kind so the agent / web-channel re-report is demoted
  too (the dominant flood).
- web::classify_inference_error maps these to an actionable
  model_unavailable message instead of the generic "inference" bucket.

Purely additive, mirrors the budget-exhaustion demotion pattern. 11 new
unit tests incl. inverted-polarity and scope guards.

Closes tinyhumansai#2079
…ath (tinyhumansai#2079)

Exercise the diff-coverage gate's remaining changed lines:
- report_error_or_expected_does_not_panic now drives the
  expected_error_kind ProviderConfigRejection branch and the
  report_expected_message skip-log arm (the agent / web-channel
  re-report demotion path).
- log_helper_runs_without_panicking covers log_provider_config_rejection
  (the provider-layer api_error demotion log).
@sanil-23 sanil-23 requested a review from a team May 19, 2026 18:19
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 19, 2026

Warning

Rate limit exceeded

@sanil-23 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 7 minutes and 34 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 770cf826-2e80-433f-9f05-1a0d995433eb

📥 Commits

Reviewing files that changed from the base of the PR and between 7a3c7ed and 481538b.

📒 Files selected for processing (2)
  • src/openhuman/channels/providers/web.rs
  • src/openhuman/channels/providers/web_tests.rs
📝 Walkthrough

Walkthrough

This PR adds a unified provider configuration rejection detection and suppression mechanism. A new is_provider_config_rejection_message classifier detects when provider APIs reject user model or parameter configuration and routes these deterministic, user-actionable errors through HTTP suppression, user-facing remediation messaging, and expected-error observability pipelines.

Changes

Provider Configuration Rejection Detection and Suppression

Layer / File(s) Summary
Provider config rejection detection foundation
src/openhuman/inference/provider/config_rejection.rs, src/openhuman/inference/provider/mod.rs
New config_rejection module exposes is_provider_config_rejection_message, which detects provider rejection phrases via case-insensitive substring matching. Includes unit tests for positive detection, case-insensitivity, and exclusion of transient/generic errors.
HTTP API error suppression for config rejections
src/openhuman/inference/provider/ops.rs
New helpers is_provider_config_rejection_http and log_provider_config_rejection integrate config rejection detection into api_error control flow. Routes 400/404/422 responses from custom providers with matching body text to INFO logging instead of Sentry, while preserving Sentry reporting for backend, 5xx, transient 429, and unrelated 4xx cases.
User-facing error classification and remediation
src/openhuman/channels/providers/web.rs, src/openhuman/channels/providers/web_tests.rs
classify_inference_error uses the classifier to map provider config rejections to model_unavailable category and returns actionable messages directing users to "Settings → LLM". Tests verify multiple provider error shapes surface consistent remediation guidance.
Observability expected-error classification
src/core/observability.rs
ExpectedErrorKind gains ProviderConfigRejection variant, wired into expected_error_kind classifier and report_expected_message logger. Routes config rejections to INFO-level logging with kind = provider_config_rejection, skipping error event emission. Tests verify correct classification and non-panicking integration with report_error_or_expected.

Sequence Diagram

sequenceDiagram
  participant Client
  participant InferenceAPI as API<br/>(ops.rs)
  participant Classifier as Classifier<br/>(config_rejection)
  participant Logger as Logger<br/>(INFO)
  participant Sentry as Sentry
  
  Client->>InferenceAPI: invoke provider API
  InferenceAPI->>InferenceAPI: receive 4xx response
  InferenceAPI->>Classifier: is_provider_config_rejection_message()
  alt config rejection detected
    Classifier-->>InferenceAPI: true
    InferenceAPI->>Logger: log_provider_config_rejection()
    Logger-->>InferenceAPI: emitted INFO event
    InferenceAPI-->>Client: return error (no Sentry)
  else transient/generic error
    Classifier-->>InferenceAPI: false
    InferenceAPI->>Sentry: report to Sentry
    Sentry-->>InferenceAPI: acknowledged
    InferenceAPI-->>Client: return error
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • tinyhumansai/openhuman#1763: Both PRs extend core::observability::ExpectedErrorKind and the shared expected_error_kind/report_error_or_expectedreport_expected_message classification flow by adding different new "expected" variants.
  • tinyhumansai/openhuman#1798: Both PRs extend src/core/observability.rs's error classification/suppression pipeline to demote non-fatal cases and suppress Sentry noise.
  • tinyhumansai/openhuman#1719: Both PRs modify the shared observability pipeline by extending ExpectedErrorKind and wiring new classification plus Sentry suppression in src/core/observability.rs.

Suggested labels

working

Suggested reviewers

  • graycyrus

Poem

🐰 A bunny hops through provider errors with glee,
Now config rejections are handled with clarity,
No more Sentry noise for settings gone wrong,
Just helpful guidance: "Check LLM, be strong!"
Silent INFO logs, and users align—
Provider happiness in logs so fine! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: classifying provider config-rejection 4xx errors as expected user-state. It is specific, directly related to the core objective, and uses precise terminology aligned with the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 19, 2026
coderabbitai[bot]
coderabbitai Bot previously requested changes May 19, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/channels/providers/web.rs`:
- Around line 303-317: The provider-config rejection check
(crate::openhuman::inference::provider::is_provider_config_rejection_message) is
placed after the broader "model_unavailable" branch so some provider-specific
messages fall through; move the entire else-if branch that returns
("model_unavailable", with_provider_detail(..., err)) using
is_provider_config_rejection_message(err) so it appears before the generic
model-unavailable arm, ensuring provider-config rejection messages get the
specific "Settings → LLM" remediation; keep references to with_provider_detail
and the "model_unavailable" tuple intact when relocating the block.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 029facd0-8f2d-4a03-9925-872c7c6ee7aa

📥 Commits

Reviewing files that changed from the base of the PR and between 049d1ea and 7a3c7ed.

📒 Files selected for processing (6)
  • src/core/observability.rs
  • src/openhuman/channels/providers/web.rs
  • src/openhuman/channels/providers/web_tests.rs
  • src/openhuman/inference/provider/config_rejection.rs
  • src/openhuman/inference/provider/mod.rs
  • src/openhuman/inference/provider/ops.rs

Comment thread src/openhuman/channels/providers/web.rs Outdated
…available (tinyhumansai#2079)

CodeRabbit review on tinyhumansai#2239: the is_provider_config_rejection_message arm
ran after the broader model-unavailable branch, so config-rejection
bodies that also contain "model"/"does not exist"/"does not have access"
(e.g. a stale model pin / model_not_found) fell through to the generic
"Check your model settings" copy instead of the actionable
"Settings -> LLM" remediation. Move the specific arm above the generic
one. Updates the pre-existing classify_inference_error_quotes_model_unavailable_detail
test to assert the new (intended) remediation for that stale-pin body.
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review — graycyrus

Walkthrough: Solid, well-structured observability fix that demotes deterministic provider config-rejection errors (~273 Sentry events across OPENHUMAN-TAURI-WJ/-QW/-HB/-NH) to info logs, following the exact same pattern as the existing budget-exhaustion demotion. The inverted-polarity contract is correct — same body from the OpenHuman backend still reaches Sentry. The user-facing classification to model_unavailable with actionable "Settings → LLM" copy is a nice UX improvement.

Assessment: Clean. No critical or major issues found.

File Change Description
config_rejection.rs New Core predicate — tight phrase list, case-insensitive, well-documented polarity contract
ops.rs Modified HTTP-layer integration — is_provider_config_rejection_http (status + provider + body), new branch in api_error
observability.rs Modified ProviderConfigRejection variant + expected_error_kind classifier + report_expected_message info log arm
web.rs Modified New classify_inference_error arm for actionable user message
web_tests.rs Modified Tests for config-rejection classification + updated existing model-unavailable test
mod.rs Modified Module registration + re-export

CodeRabbit dedup: Their one finding (arm ordering in web.rs) was addressed in the latest commit — skipped.

What I liked:

  • The phrase list in config_rejection.rs is deliberately tight with clear provenance (each phrase tied to a specific Sentry issue)
  • Comprehensive test matrix: 13 tests covering detection, case insensitivity, negatives, inverted polarity, transient/5xx guards, and smoke
  • The "is an abstract tier" forward-looking phrase keeps the classifier stable across the planned tier-resolution fix
  • Diff coverage gate passes (≥80%)

LGTM — no blocking issues.

Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, nice work!

@graycyrus graycyrus merged commit b8fbb36 into tinyhumansai:main May 19, 2026
24 checks passed
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(providers): reasoning-v1 model name sent to DeepSeek API which rejects it (regression)

2 participants