Skip to content

fix(observability): suppress context-window-exceeded provider error from Sentry (Sentry TAURI-RUST-501)#2820

Merged
M3gA-Mind merged 3 commits into
tinyhumansai:mainfrom
CodeGhost21:fix/sentry-tauri-rust-501-context-window-exceeded
May 28, 2026
Merged

fix(observability): suppress context-window-exceeded provider error from Sentry (Sentry TAURI-RUST-501)#2820
M3gA-Mind merged 3 commits into
tinyhumansai:mainfrom
CodeGhost21:fix/sentry-tauri-rust-501-context-window-exceeded

Conversation

@CodeGhost21
Copy link
Copy Markdown
Contributor

@CodeGhost21 CodeGhost21 commented May 28, 2026

Summary

  • A custom / self-hosted provider that mis-reports context-overflow as HTTP 500 ("custom API error (500 …): Context size has been exceeded.") escaped every existing matcher and fired a raw Sentry event from the providers::ops::api_error suppression cascade (domain=llm_provider operation=api_error provider=custom).
  • Fix routes context-overflow through a single-source matcher used at all three emit/decision points (retry classifier, api_error Sentry cascade, observability re-report classifier), and adds the missing "context size has been exceeded" phrasing.
  • Targets self-hosted Sentry TAURI-RUST-501 (openhuman@0.56.0+e8968077aeb5).

Problem

Context-overflow is already a recognised condition: providers::reliable::is_context_window_exceeded marks it non-retryable (retrying the same oversized request can't help). But two gaps let TAURI-RUST-501 through:

  1. The matcher's phrase list missed this custom-gateway wording — it had "maximum context length", "context length exceeded", "prompt is too long", … but not "context size has been exceeded".
  2. The providers::ops::api_error Sentry-suppression cascade (ops.rs) had no context-overflow branch at all — only budget / upstream-bad-request / access-policy / config-rejection / backend-auth. So the 500 fell through to should_report_provider_http_failure(500)report_error → Sentry.

api_error does not consult core::observability::expected_error_kind — it has its own cascade — so the fix had to land in the cascade itself, not just the classifier.

Solution — single source of truth, three call sites

  • ops.rs: new pub fn is_context_window_exceeded_message(body) — the 8 established phrasings + "context size has been exceeded" — the one place the phrasing lives. New log_context_window_exceeded helper (demotes to warn! breadcrumb), and a new branch in the api_error cascade:
    } else if is_context_window_exceeded {            // status-agnostic
        log_context_window_exceeded("api_error", provider, None, status);
    } else if should_report_provider_http_failure(status) {}
    Status-agnostic on purpose: providers disagree on the code (OpenAI → 400 context_length_exceeded; this custom gateway → 500). Matching the body keeps them in one bucket.
  • reliable.rs: is_context_window_exceeded(err) now delegates to the shared matcher — preserves the non-retryable behavior, picks up the new phrase, can't drift.
  • observability.rs: new ExpectedErrorKind::ContextWindowExceeded arm, also delegating to the shared matcher, to catch the higher-layer re-report (agent.run_single / web_channel.run_chat_task re-raise the same error under a different domain tag). Same two-emit-site pattern as the empty-response (4JX/4Z1) and session-expired (4P0/4K5) fixes — prevents a follow-on Sentry issue from the re-report.

Polarity preserved: anchors are context-overflow specific, so rate-limit "exceeded", budget errors, "model not found", and tool-budget errors still reach Sentry (pinned by rejection tests).

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy — 6 new tests:
    • provider::ops context_window_exceeded_suppression mod (4): TAURI-RUST-501 500 body, the 7 established phrasings, a 4-case rejection contract (rate-limit / model-not-found / budget / tool-budget), and a log_context_window_exceeded smoke.
    • core::observability (2): re-report classification (501 body + established phrasings) and a 3-case rejection contract.
  • Diff coverage ≥ 80% — every new line (matcher, cascade branch, log helper, reliable delegation, classifier arm + demote arm) is exercised. cargo test → observability 94, provider::ops 28, reliable 56, all pass.
  • N/A: Coverage matrix updated — observability/provider-classifier change is behaviour-only; not a tracked feature row in docs/TEST-COVERAGE-MATRIX.md.
  • N/A: All affected feature IDs from the matrix are listed in the PR description under ## Related — no matrix feature IDs affected.
  • No new external network dependencies introduced — pure in-process classifier/cascade change.
  • N/A: Manual smoke checklist updated — provider error-classification internals; no user-visible UI surface, no release-cut behaviour change.
  • N/A: Linked issue closed via Closes #NNN — Sentry-only fix; no GitHub issue. The Sentry-Issue trailer below carries the back-reference.

Impact

  • Runtime: Desktop + core. Context-overflow provider errors (any status) now demote to a warn! breadcrumb at the api_error cascade and classify as ContextWindowExceeded at the re-report layer. No behaviour change to the non-retryable decision, the user-facing chat error, or any caller — only the Sentry/log severity tier.
  • Performance: Net positive — removes the TAURI-RUST-501 event stream and preempts the higher-layer re-report variant.
  • Security: None — no new code paths, no PII (breadcrumb carries the sanitized provider body only in the local log).
  • Migration / compatibility: None — additive matcher, additive cascade branch, additive enum variant; reliable.rs refactor is behaviour-preserving (delegation to the same phrase set).
  • Observability trade-off: A genuine context-overflow now shows only as a local warn! breadcrumb. Unrelated provider failures (rate-limit, model-not-found, budget) are unaffected and still reach Sentry.

Notes for reviewers

Related

  • Sentry-Issue: TAURI-RUST-501
  • Emit sites: src/openhuman/inference/provider/ops.rs api_error cascade (primary, operation=api_error); higher-layer re-report via agent.run_single / web_channel.run_chat_task.
  • Single-source matcher: provider::ops::is_context_window_exceeded_message (shared by reliable.rs, the api_error cascade, and observability.rs).
  • Precedent: the two-emit-site fixes for empty-response (4JX agent / 4Z1 web_channel) and session-expired (4P0 / 4K5).

Summary by CodeRabbit

  • Bug Fixes

    • Improved detection/classification of "context window exceeded" failures and now demote these to breadcrumb-only warnings instead of full error captures.
  • Refactor

    • Centralized the context-window overflow classifier so all modules use consistent detection logic.
  • Tests

    • Added tests covering matching phrases, known misreported bodies, and negative cases to prevent false positives.

Review Change Stack

…rom Sentry (Sentry TAURI-RUST-501)

A custom / self-hosted provider that mis-reports context-overflow as
HTTP 500 (`"custom API error (500 …): Context size has been exceeded."`)
escaped every existing matcher and fired a raw Sentry event from the
`providers::ops::api_error` suppression cascade
(`domain=llm_provider operation=api_error`).

Root cause: context-overflow is already handled — `reliable.rs` marks
it non-retryable — but the matcher's phrase list missed this custom
wording, AND the `api_error` cascade had no context-overflow branch at
all (only budget / access-policy / config-rejection / auth). So the
500 fell through to `should_report_provider_http_failure` → Sentry.

Fix (single-source matcher, three call sites):
- `ops.rs`: new `pub fn is_context_window_exceeded_message` — the 8
  established phrasings + `"context size has been exceeded"`, the one
  source of truth. New `log_context_window_exceeded` helper + an
  `is_context_window_exceeded` branch in the `api_error` cascade
  (status-agnostic — some gateways send 400, some 500) that demotes to
  a `warn!` breadcrumb instead of `report_error`.
- `reliable.rs`: `is_context_window_exceeded(err)` now delegates to the
  shared matcher (preserves non-retryable behavior, picks up the new
  phrase, can't drift).
- `observability.rs`: new `ExpectedErrorKind::ContextWindowExceeded`
  arm that delegates to the same matcher, catching the higher-layer
  re-report (`agent.run_single` / `web_channel.run_chat_task` under a
  different `domain` tag) — same two-emit-site pattern as the
  empty-response (4JX/4Z1) and session-expired (4P0/4K5) fixes.

Polarity preserved: anchors are context-overflow specific, so rate-limit
"exceeded", budget errors, and "model not found" still reach Sentry.

Tests: 4 new in `provider::ops` (TAURI-RUST-501 500 body, established
phrasings, rejection contract, log smoke) + 2 in `core::observability`
(re-report classification + rejection). cargo test → observability 94,
provider::ops 28, reliable 56, all pass. cargo fmt --check clean.

Sentry-Issue: TAURI-RUST-501
@CodeGhost21 CodeGhost21 requested a review from a team May 28, 2026 05:58
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b30575ab-4fe6-4ce1-adec-866a5cdd1f97

📥 Commits

Reviewing files that changed from the base of the PR and between 2055e50 and 7052110.

📒 Files selected for processing (2)
  • src/core/observability.rs
  • src/openhuman/inference/provider/ops.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/inference/provider/ops.rs
  • src/core/observability.rs

📝 Walkthrough

Walkthrough

Adds a shared classifier for "context window exceeded" provider messages, uses it in provider error handling to route such failures to a demotional breadcrumb tracing::warn! log, exposes a new ExpectedErrorKind::ContextWindowExceeded and wires it into observability, and updates the reliable module to reuse the shared matcher. Tests cover matching and non-matching cases.

Changes

Context Window Error Suppression and Observability

Layer / File(s) Summary
Provider detection, logging helper, and api_error routing
src/openhuman/inference/provider/ops.rs, tests/*
Adds pub fn is_context_window_exceeded_message(body: &str) -> bool predicate and pub(super) fn log_context_window_exceeded(...), integrates the classifier into api_error to call the demotion logger when matched, and adds context_window_exceeded_suppression tests covering positive and negative cases plus a smoke test for the logger.
Observability kind and reporting
src/core/observability.rs, tests/*
Adds ExpectedErrorKind::ContextWindowExceeded, wires provider classifier into expected_error_kind, and updates report_expected_message to emit a breadcrumb tracing::warn! (kind="context_window_exceeded") instead of Sentry capture; includes tests validating classification and suppression.
Centralize detection in reliable module
src/openhuman/inference/provider/reliable.rs
Refactors is_context_window_exceeded to delegate to the shared super::is_context_window_exceeded_message helper, removing duplicated string-match hints.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • tinyhumansai/openhuman#1719: Modifies src/core/observability.rs to extend ExpectedErrorKind and wire new classifiers into expected-message demotion reporting.
  • tinyhumansai/openhuman#2216: Routes embedding emitter/provider failures through the same expected-error reporting pipeline used here.
  • tinyhumansai/openhuman#2830: Also adds new expected-error variants and breadcrumb demotion in observability, for different failure kinds.

Suggested labels

rust-core, sentry-traced-bug

Suggested reviewers

  • graycyrus
  • oxoxDev

Poem

🐰 I counted tokens, one by one,

the window closed when I was done.
Now gentle warnings light the path,
no Sentry screams, just breadcrumb math —
a quiet hop, the job is won.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: suppressing context-window-exceeded provider errors from being sent to Sentry. It directly reflects the core fix across all modified files and references the specific Sentry issue (TAURI-RUST-501) being addressed.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the bug label May 28, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/inference/provider/ops.rs`:
- Around line 569-580: The current HINTS array and the simple substring check
(HINTS.iter().any(|hint| lower.contains(hint))) are too broad and cause false
positives; tighten the matchers by removing or scoping generic phrases like "too
many tokens", "prompt is too long", and "input is too long", and/or replace the
plain contains check with anchored/word-boundary regexes (e.g., require
"\bcontext\b" nearby or use /\bcontext (window|length|size)\b/ and
/\btoken(s)?\b/ with context anchors) so only errors explicitly about model
context windows match; update the HINTS constant and the matching logic to use
these more specific patterns or regexes instead of bare substrings.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cf9eab68-924f-4357-82ba-7a1bd87107cd

📥 Commits

Reviewing files that changed from the base of the PR and between 3f2e2f2 and 2055e50.

📒 Files selected for processing (3)
  • src/core/observability.rs
  • src/openhuman/inference/provider/ops.rs
  • src/openhuman/inference/provider/reliable.rs

Comment thread src/openhuman/inference/provider/ops.rs Outdated
@oxoxDev oxoxDev self-assigned this May 28, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CodeGhost21 the code looks solid to me — clean refactor, single-source matcher is the right call, and the test coverage (4 in ops, 2 in observability, rejection contracts) is thorough. The cascade branch placement (status-agnostic, after config-rejection, before the should_report gate) is correct. One thing to sort out before I approve:

CI: "Build & smoke-test core image" is failing. Once that's green I'll come back and approve.

Minor observation (not blocking): In observability.rs, the new ContextWindowExceeded arm logs error = %message, but log_context_window_exceeded in ops.rs deliberately omits the body — it only logs operation/provider/status. At the re-report layer the message string is whatever agent.run_single / web_channel.run_chat_task propagated up, which may wrap more than just the sanitized provider body. Consistent with how the other usage-state warn arms are written (none log the full message), worth dropping the error = %message field here too.

@oxoxDev oxoxDev removed their assignment May 28, 2026
…-501-context-window-exceeded

# Conflicts:
#	src/core/observability.rs
…imit false-positives (review of tinyhumansai#2820)

CodeRabbit flagged that `is_context_window_exceeded_message` — now also
feeding `expected_error_kind` (Sentry suppression) and the `reliable`
non-retryable decision — was too broad. Addressed the valid concern with
a targeted two-tier anchor (the suggested `||contains("token")` guard
was a no-op for the token phrases, which already contain "token", and
would have broken valid "prompt is too long" / "input is too long"
matches):

- CONTEXT_HINTS (context/length phrases) match alone — unambiguous.
- TOKEN_HINTS ("too many tokens", "token limit exceeded") collide with
  per-minute token RATE limits, which are transient 429s that must stay
  retryable AND keep reaching Sentry. They now count as context-overflow
  only when no rate-limit marker ("per minute", "rate limit", "tpm",
  "retry after", "try again in", …) is present.

New test `token_rate_limits_are_not_context_overflow` pins the contract:
three TPM rate-limit bodies must NOT classify, a marker-less token
overflow still does. cargo test → observability 107, provider::ops
context-window 5, reliable 56, all pass.
@coderabbitai coderabbitai Bot added rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. sentry-traced-bug Bug identified via Sentry triage labels May 28, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CodeGhost21 the two-tier anchor approach in the latest commit is the right call — splitting CONTEXT_HINTS (unambiguous) from TOKEN_HINTS (ambiguous with TPM rate limits) and using the rate-limit marker set to disambiguate is clean and pins the contract with a dedicated rejection test. The CodeRabbit concern is fully addressed.

CI is still running on this new commit set. Once everything's green I'll approve this.

@oxoxDev oxoxDev self-requested a review May 28, 2026 18:08
@M3gA-Mind M3gA-Mind merged commit a54f78a into tinyhumansai:main May 28, 2026
35 of 39 checks passed
Copy link
Copy Markdown
Contributor

@M3gA-Mind M3gA-Mind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All checks pass. Single-source context-overflow matcher with TPM disambiguation is the right call — clean refactor across ops/reliable/observability, thorough rejection contracts.

Copy link
Copy Markdown
Contributor

@oxoxDev oxoxDev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Walkthrough

Routes context-window-exceeded through a single-source body matcher (ops::is_context_window_exceeded_message) at 3 call sites: retry classifier (reliable.rs), api_error Sentry-suppression cascade (ops.rs), and the higher-layer re-report path via new ExpectedErrorKind::ContextWindowExceeded arm in observability.rs. Adds the missing "context size has been exceeded" wording. Drops self-hosted Sentry TAURI-RUST-501 (custom gateway mis-reporting context overflow as HTTP 500). +284/-13, 3 files. 6 new tests including a TPM-rate-limit polarity guard. All CI green. CodeRabbit raised the broad-hints concern, author addressed with a STRONG/WEAK + TPM-marker split that's actually tighter than CR's suggestion.

Verified

  • Single source of truth: reliable.rs::is_context_window_exceeded now delegates to ops::is_context_window_exceeded_message; phrase list lives once ✓
  • Status-agnostic body match — correctly handles the TAURI-RUST-501 500 mis-report from custom gateway; OpenAI's 400 context_length_exceeded still matches the same body anchors ✓
  • TPM-rate-limit disambiguation: TOKEN_HINTS ("too many tokens", "token limit exceeded") only match when body carries NO rate-limit marker ("per minute", "per min", "rate limit", "tpm", "requests per", "retry after") — pinned by token_rate_limits_are_not_context_overflow test with 3 TPM shapes + 1 positive overflow case ✓
  • Polarity guard does_not_match_unrelated_bodies covers rate-limit / model-not-found / insufficient-budget / tool-budget — confirms anchors are context-overflow specific ✓
  • report_expected_message arm demotes to warn! breadcrumb with error = %message — matches the SessionExpired / BackendUserError tier per the sibling arm convention ✓

Blockers

  1. Dispatcher ordering doc mismatch in expected_error_kind. The new arm sits between PromptInjectionBlocked and MemoryStoreBreakerOpen, but the doc comment says "Runs last so a more specific matcher always wins" — it does NOT run last in the dispatcher chain. The intent is probably "runs after the prompt-injection check so the prompt-injection anchor wins on overlapping bodies" — but reads as if every later matcher is less specific, which isn't true. Reword to "Runs after PromptInjectionBlocked so the prompt-injection anchor wins on overlapping bodies; ordering vs later arms is independent (no shared anchors)" — OR move the arm to the actual bottom of the chain to match the comment.

  2. Layering inversion core::observabilityopenhuman::inference::provider — direct cross-module call in expected_error_kind. core historically sits below openhuman. The shared-classifier pattern is reasonable, but I want a maintainer ack on the precedent before this lands — once one arm reaches sideways like this, future arms will follow the same shortcut. @graycyrus another pass — is the direction acceptable, or should the matcher move into core so the dispatcher stays self-contained?

Nits

  • RATE_LIMIT_MARKERS includes "retry after" — a context-overflow body that incidentally includes "retry after 0s" hint (some providers tack it on for non-retryable too) would fail the WEAK match. Vanishingly rare; flag only.
  • "tpm" as a 3-char substring is broad. "http://www.tpm.example.com" in a URL would trip it. Astronomically unlikely in error bodies, but if you ever surface URLs in upstream prose this gets a false-negative. Optional: anchor with " tpm " / "(tpm)" / "tpm:".

Questions

  • Did you check the matcher across the full Sentry "domain=llm_provider operation=api_error" window for any past events that match WEAK hints + rate-limit marker simultaneously? Want to confirm the new TPM polarity doesn't accidentally let a real overflow slip through as a 429.
  • Self-hosted Sentry — confirm Closes TAURI-RUST-501 is in the PR body so Phase 7 Step 4.5 self-hosted sweep flips it on merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. sentry-traced-bug Bug identified via Sentry triage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants