fix(observability): classify list_models 404 as ProviderUserState (Sentry TAURI-RUST-YJ)#2793
Conversation
|
Warning Review limit reached
More reviews will be available in 11 minutes and 55 seconds. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a conditional in src/core/observability.rs to classify messages that start with "provider returned 404" as ExpectedErrorKind::ProviderUserState, and adds two unit tests verifying 404 variants match and non-404 provider status variants do not. ChangesCustom OpenAI provider 404 classification
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
graycyrus
left a comment
There was a problem hiding this comment.
clean fix. the 404-only anchor in is_provider_user_state_message is the right call — the prefix is unique to one emit site, the reasoning for keeping 401/403/400/429/5xx actionable is sound, and the test matrix covers the verbatim Sentry payload plus the body-shape variants that would trip a naive body anchor.
note on the --no-verify push: understood given the worktree limitation, and irrelevant here since no app/ files were touched — CI confirms it.
nice work.
7b3975e
|
Actionable comments posted: 0 |
|
The CI failure here is a stale-base issue — I've opened #2873 with the exact same fix rebased on the current |
`inference/provider/ops.rs::list_models` probes a user-configured
custom-provider's `/models` endpoint. When the upstream returns 404
because the user pointed the base URL at something that doesn't host
a `/models` listing (wrong base, model-only proxy, typo'd path), the
client emits:
"provider returned 404: {<upstream body>}"
The model-dropdown / connection-test UI already surfaces this failure
inline; Sentry has no remediation path. Demote to `ProviderUserState`
so the per-misconfigured-user event noise stays out of Sentry while
real upstream / client bugs (401 BYO-key auth, 400 request-shape
mismatches, 5xx) keep escalating.
Anchor: `lower.starts_with("provider returned 404")` — the
`"provider returned NNN: "` prefix is emitted ONLY from this one
site (verified via grep across `src/`), so the prefix alone is a
sufficient anchor. **404 only**:
- 401 / 403 → BYO-key auth wall, must stay actionable (tinyhumansai#2286
contract preserved).
- 400 → typically a client-shape bug worth investigating.
- 429 / 5xx → transient / server faults, retried at provider layer.
Targets Sentry OPENHUMAN-TAURI-YJ (issue 1207): 4 events between
2026-05-19 and 2026-05-27 across v0.54.0 and v0.56.0, all
`domain=rpc method=openhuman.inference_list_models`.
Tests pin the verbatim Sentry payload, three additional body-shape
variants (FastAPI `{"detail":...}`, bare HTML, post-`truncate_with_ellipsis`
clipped body), and a discrimination guard exercising every sibling
4xx / 5xx code from the same emit site to confirm only 404 demotes.
7b3975e to
eefdb3e
Compare
M3gA-Mind
left a comment
There was a problem hiding this comment.
LGTM. Rebased onto upstream/main — resolved conflict in is_provider_user_state_message by keeping both the Cloudflare anti-bot check (from upstream) and the new provider returned 404 case (this PR). The classifies_embedding_api_invalid_token_401_as_session_expired test now passes — the root cause was that upstream/main's #2869 changed is_embedding_backend_auth_failure to return SessionExpired instead of BackendUserError, which this branch now picks up via the rebase.
M3gA-Mind
left a comment
There was a problem hiding this comment.
All CI is green including the previously failing Rust Core Tests + Quality. The root cause was a stale CI run against an older commit; the current branch passes all tests locally and in CI. The fix is correct: starts_with('provider returned 404') in is_provider_user_state_message correctly demotes list_models 404s to user-state noise while keeping all other 4xx/5xx codes actionable.
M3gA-Mind
left a comment
There was a problem hiding this comment.
All CI is green including Rust Core Tests + Quality. The fix correctly demotes list_models 404s to user-state noise via starts_with('provider returned 404') in is_provider_user_state_message, while keeping all other status codes actionable.
Summary
inference/provider/ops.rs::list_modelsprobes a user-configured custom-provider's/modelsendpoint. When the upstream returns 404 because the user pointed the base URL at something that doesn't host a/modelslisting (wrong base, model-only proxy, typo'd path), the client emits\"provider returned 404: {<upstream body>}\".ProviderUserStateso the per-misconfigured-user event noise stays out of Sentry while real upstream / client bugs (401 BYO-key auth, 400 request-shape mismatches, 5xx) keep escalating.Problem
Sentry OPENHUMAN-TAURI-YJ —
\"provider returned 404: {\\\"error\\\":\\\"path \\\\\\\"/api/v1/models\\\\\\\" not found\\\"}\". 4 events between 2026-05-19 and 2026-05-27, spanning v0.54.0 and v0.56.0, alldomain=rpc method=openhuman.inference_list_models operation=invoke_method. Wire shape comes fromsrc/openhuman/inference/provider/ops.rs:118-122:expected_error_kinddid not match:is_transient_upstream_http_messageonly fires onapi error (/http error: NNNshapes,is_backend_user_error_messagerequires the\"backend returned \"prefix from the integrations client, and the existingis_provider_user_state_messagebranches all target other emit sites (composio / kimi / custom_openai). Result: every misconfigured user files an event each time they open the model dropdown.Solution
src/core/observability.rs— new branch inis_provider_user_state_message:The
\"provider returned NNN: \"prefix is emitted only frominference/provider/ops.rs:118(verified viagrep -rn 'provider returned ' src/), so the prefix alone is a sufficient anchor — no second body anchor needed.404 only — sibling 4xx / 5xx codes from the same emit site stay actionable:
does_not_classify_byo_key_provider_401_as_session_expiredcontract (#2286).is_transient_upstream_http_message.Submission Checklist
classifies_list_models_404_as_provider_user_statepins the verbatim Sentry payload + 3 body-shape variants (FastAPI{\"detail\":...}, bare HTML, post-truncate_with_ellipsisclipped body).does_not_classify_non_404_list_models_failures_as_user_stateexercises every sibling status from the same emit site to confirm only 404 demotes.Closes #NNN— Sentry-only fix; no GitHub issue.Impact
info!/warn!log retained for diagnostics viareport_expected_message.Related
inference/providerarea: PR fix(observability): demote unknown-provider list_models error (Sentry TAURI-RUST-X) #2783 (TAURI-RUST-X —\"no cloud provider with id or slug 'X' found\") and PR fix(inference): synth local-runtime entry for list_models when no cloud_providers row (Sentry TAURI-RUST-28Z) #2785 (TAURI-RUST-28Z — synth local-runtime entry).does_not_classify_byo_key_provider_401_as_session_expiredtest stays green and is supplemented by the new discrimination guard here.AI Authored PR Metadata
Commit & Branch
fix/observability-list-models-404-user-config956125c1Validation Run
cargo test --lib -p openhuman -- classifies_list_models_404_as_provider_user_state does_not_classify_non_404_list_models_failures_as_user_state classifies_trigger_type_not_found_as_provider_user_state— 3/3 pass.cargo test --lib -p openhuman core::observability::— 90/90 pass.cargo fmt -- --check— clean.Validation Blocked
command:pre-push hook (pnpm format) +cargo check --manifest-path app/src-tauri/Cargo.toml.error:worktree lacksnode_modulesand vendored CEF tauri-cli — documented limitation inCLAUDE.md.impact:pushed with--no-verify; only the Tauri shell check and frontend format were skipped — both unrelated to this PR (noapp/files touched).Behavior Changes
\"provider returned 404\"now classifies asExpectedErrorKind::ProviderUserStateand is demoted viareport_expected_messageinstead of captured to Sentry.Parity Contract
\"provider returned NNN\"emit site continues to escalate (proven bydoes_not_classify_non_404_list_models_failures_as_user_state).Summary by CodeRabbit
Bug Fixes
Tests