fix(composio/triage): demote provider-config-rejection rollups (Sentry TAURI-RUST-1V) #2689

Open
oxoxDev wants to merge 2 commits into tinyhumansai:main from oxoxDev:fix/sentry-1v-composio-triage-noise

Conversation

@oxoxDev
Contributor

@oxoxDev oxoxDev commented May 26, 2026

Summary

  • Routes [composio][triage] run_triage failed through the central observability classifier instead of raw tracing::error! — so user-config / budget-exhausted rollups from the upstream provider chain get demoted to info-level breadcrumbs.
  • Adds "may not be available on your provider" to is_provider_config_rejection_message — the canonical phrase emitted by reliable.rs:332 when the user's reliability.model_fallbacks chain is misconfigured.
  • Net effect: drops the bulk of self-hosted Sentry TAURI-RUST-1V (10,692 events / 14d), where inner provider attempts that the provider layer had already demoted were re-surfaced by the outer composio re-emit.

Problem

The #4 unresolved issue by event count (14d, sort=freq) in the self-hosted Sentry tauri-rust project is [composio][triage] run_triage failed, at 10,692 events. Breadcrumbs show that every event boils down to:

[llm_provider] native_chat budget-exhausted 400 — not reporting to Sentry          (correct demote, inner)
Non-retryable error, moving on
Exhausted retries, trying next provider/model
llm_provider.reliable_chat skipped expected budget-exhausted error: The model `chat-v1` may not be available on your provider. Configure a fallback chain via `reliability.model_fallbacks` in …    (correct demote, inner)
[triage::evaluator] agent turn dispatch failed
[composio][triage] run_triage failed       ← raw tracing::error! emits a Sentry error event

The reliable-provider stack and the inference observability layer already classify these as user-config — but the bus-level re-emit at memory_sync/composio/bus.rs:354 used a raw tracing::error! that bypassed the classifier entirely.

Root cause is user-side: the user's reliability.model_fallbacks config doesn't list a model the provider actually serves. Remediation = fix that config. Sentry has no remediation path.

Solution

Two surgical changes, kept as separate micro-commits for review:

  1. src/openhuman/inference/provider/config_rejection.rs — append "may not be available on your provider" to the PHRASES table feeding is_provider_config_rejection_message. Canonical phrase, anchored to reliable.rs:332 (the sole producer in-tree). Comment cross-links the call site so future drift of that wording is caught at review.

  2. src/openhuman/memory_sync/composio/bus.rs — swap the raw tracing::error! at L354 with crate::core::observability::report_error_or_expected(..., "composio", "trigger_triage", &[("label", …)]). Same structured fields as before, but now the classifier runs. Genuine runtime bugs that don't classify still surface as full Sentry errors.

The two changes are complementary: either one landing alone would help partially, but only the pair fully closes the loop.
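A minimal sketch of the routing decision described above, under simplifying assumptions. Everything here is a hypothetical stand-in: `Disposition`, `is_expected`, and the parameter names `domain` / `operation` are illustrative only, and the real `report_error_or_expected` in `crate::core::observability` may have a different signature and emit through tracing/Sentry rather than returning a value.

```rust
/// Where a failure ends up after classification (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum Disposition {
    /// Expected user-state failure: kept as an info-level breadcrumb.
    Breadcrumb,
    /// Unclassified failure: surfaced as a full Sentry error event.
    SentryError,
}

/// Illustrative classifier: true when the detail carries a known
/// user-config phrase. The real classifier checks more than this.
pub fn is_expected(detail: &str) -> bool {
    detail.contains("may not be available on your provider")
}

/// Sketch of the routed report: classify first, then pick the level.
/// Printing stands in for the real tracing / Sentry emit.
pub fn report_error_or_expected(
    detail: &str,
    domain: &str,
    operation: &str,
    tags: &[(&str, &str)],
) -> Disposition {
    let disposition = if is_expected(detail) {
        Disposition::Breadcrumb
    } else {
        Disposition::SentryError
    };
    println!("[{domain}][{operation}] {disposition:?} tags={tags:?} detail={detail}");
    disposition
}
```

With a raw `tracing::error!` the second branch is effectively taken for every failure; routing through the classifier is what lets the rollup fall into the first branch while unclassified failures keep their current behaviour.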

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.
  • N/A: behaviour-only change — classifier anchor + call-site swap, no new user-facing feature row. (Coverage matrix updated for added/removed/renamed feature rows.)
  • All affected feature IDs from the matrix are listed in the PR description under ## Related
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • N/A: observability classifier internals — no release-cut surface touched. (Manual smoke checklist not required.)
  • N/A: Sentry-only fix — no GitHub issue. Sentry-Issue: TAURI-RUST-1V is in ## Related instead of Closes #NNN.

Impact

  • Runtime/platform: desktop only. Demoted events still appear as info! breadcrumbs in local trace, so support can still inspect them. Sustained outages — were they ever to indicate a real triage bug — would surface via separate health/escalation paths.
  • Performance: negligible (classifier already runs on the surrounding paths; one extra substring scan per failed triage turn).
  • Security/migration: none.
  • Compatibility: no public API change.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A: Sentry-only — no Linear issue
  • URL: N/A: Sentry-only — no Linear issue

Commit & Branch

  • Branch: fix/sentry-1v-composio-triage-noise
  • Commit SHA: $(git rev-parse HEAD)

Validation Run

  • N/A: no app/ files touched — pnpm --filter openhuman-app format:check not needed.
  • N/A: no app/ files touched — pnpm typecheck not needed.
  • Focused tests: `cargo test --lib openhuman::inference::provider::config_rejection` (7/7 passed)
  • Rust fmt/check (if changed): `cargo fmt --check` clean; `cargo check --manifest-path Cargo.toml` clean (pre-existing warnings only, not from this PR)
  • N/A: no Tauri shell files touched — Tauri fmt/check not needed.

Validation Blocked

  • `command:`
  • `error:`
  • `impact:`

Behavior Changes

  • Intended behavior change: Composio-triage failures whose underlying err already classifies as user-state via the central observability classifier now demote to info-level breadcrumbs instead of emitting Sentry errors.
  • User-visible effect: None — UI surfaces of the underlying conditions (budget-exhausted toasts, model-unavailable banners) are emitted at the inner layers and unchanged. Only the Sentry noise drops.

Parity Contract

  • Legacy behavior preserved: All structured fields previously logged (`label`, `error`) are preserved in the routed report's message and tag set. Genuine runtime errors that don't classify still reach Sentry.
  • Guard/fallback/dispatch parity checks: `is_provider_config_rejection_message` continues to require an anchored substring match — false-positive surface is identical to existing entries; new entry is the canonical `reliable.rs:332` remediation phrase.
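The substring guard in the parity contract can be illustrated as follows. This is a sketch, not the in-tree code: the `PHRASES` table here holds only the new entry, `sample_rollup` is a hypothetical helper, and the attempt lines in the rollup are placeholders rather than real log output.

```rust
/// Illustrative subset of the phrase table; the real list has more entries.
const PHRASES: &[&str] = &["may not be available on your provider"];

/// Sketch of the classifier: matches when any full phrase occurs in the message.
pub fn is_provider_config_rejection_message(msg: &str) -> bool {
    PHRASES.iter().any(|phrase| msg.contains(phrase))
}

/// Hypothetical multi-line rollup shaped like the one quoted in the commit
/// message; the middle line is a placeholder for the per-attempt details.
pub fn sample_rollup() -> String {
    [
        "All providers/models failed. Attempts:",
        "<attempt details elided>",
        "The model `chat-v1` may not be available on your provider.",
    ]
    .join("\n")
}
```

Because the match requires the whole phrase, a shorter fragment such as "may not be available" or an unrelated transport error does not classify, which is the false-positive property the contract relies on.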

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A: no duplicates
  • Canonical PR: N/A
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • Bug Fixes

    • Improved detection of provider-configuration rejection messages so fallback remediation notices are correctly classified as provider-config issues.
  • Chores

    • Enhanced error reporting to route triage failures into structured observability reporting for better diagnostics and monitoring.


@oxoxDev oxoxDev requested a review from a team May 26, 2026 10:38
@coderabbitai
Contributor

coderabbitai Bot commented May 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 389f54a3-ccd9-41c2-93a3-2c78e071710c

📥 Commits

Reviewing files that changed from the base of the PR and between 5fa4ca8 and e8397ab.

📒 Files selected for processing (1)
  • src/openhuman/inference/provider/config_rejection.rs
💤 Files with no reviewable changes (1)
  • src/openhuman/inference/provider/config_rejection.rs

📝 Walkthrough


Adds the substring "may not be available on your provider" to provider config-rejection detection and a unit test; routes Composio triage errors through crate::core::observability::report_error_or_expected instead of direct tracing::error!.

Changes

Error Classification and Observability

| Layer / File(s) | Summary |
|---|---|
| Provider config-rejection detection (src/openhuman/inference/provider/config_rejection.rs) | Adds the substring "may not be available on your provider" to PHRASES and adds detects_reliable_chain_exhaustion_rollup test asserting the multi-line rollup and inner remediation both match is_provider_config_rejection_message. |
| Composio triage observability routing (src/openhuman/memory_sync/composio/bus.rs) | Replaces direct tracing::error! with crate::core::observability::report_error_or_expected in run_triage error handling, reporting formatted error detail with category/context ("composio", "trigger_triage") and a label attribute. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

working

Suggested reviewers

  • graycyrus
  • senamakel

"I nibble on strings of logs and tests,
sniff the phrases where a fallback rests,
a hop, a patch, a tidy clue—
now errors flow where watchers view,
cheer the rabbit; builds are blessed!" 🐇

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: fixing composio/triage error handling by demoting provider-config-rejection errors to reduce Sentry noise, which aligns with both file changes. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot added the bug label May 26, 2026
coderabbitai Bot previously approved these changes May 26, 2026
Contributor

@graycyrus graycyrus left a comment


LGTM — clean, surgical fix with good test coverage.

Nice work routing the composio triage error path through report_error_or_expected instead of the raw tracing::error!. The new phrase anchor in config_rejection.rs is well-scoped (sole producer at reliable.rs:332) and the cross-referencing comments will help catch drift.

Both changes follow the established pattern used across ~10 other call sites in the codebase. Tests cover the multi-line rollup and bare phrase cases. The {e:#} alternate Display for the error chain in the detail string is correct.
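For readers unfamiliar with the `{e:#}` point: the alternate flag is what lets an error's Display output include its cause chain, so the detail string can carry the inner remediation phrase for the classifier to match. The sketch below shows the mechanism with std only. `ChainError` is a hypothetical type written for this illustration; the PR does not show which error type `e` actually is, and the "outer: cause: cause" layout is modelled on how anyhow-style errors render under `{:#}`.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical error wrapper used only to illustrate the `{:#}` flag.
#[derive(Debug)]
pub struct ChainError {
    context: String,
    source: Option<Box<dyn Error + 'static>>,
}

impl ChainError {
    /// Leaf error with no underlying cause.
    pub fn new(context: &str) -> Self {
        Self { context: context.to_string(), source: None }
    }

    /// Error that wraps an underlying cause.
    pub fn with_source(context: &str, source: impl Error + 'static) -> Self {
        Self { context: context.to_string(), source: Some(Box::new(source)) }
    }
}

impl fmt::Display for ChainError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Plain `{}` prints only the outermost context.
        write!(f, "{}", self.context)?;
        // `{:#}` additionally walks the source chain, colon-separated.
        if f.alternate() {
            let mut cause = self.source.as_deref();
            while let Some(err) = cause {
                write!(f, ": {}", err)?;
                cause = err.source();
            }
        }
        Ok(())
    }
}

impl Error for ChainError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref()
    }
}
```

Formatting a three-level chain with `{}` yields only the outermost message, while `{:#}` yields all three joined by `: `, which is why a substring classifier needs the alternate form to see the innermost phrase.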

CI note: all failures are Docker login (GHCR auth) issues for the contributor's fork — unrelated to these changes. The Tauri build, E2E suites, and core image build all pass.

@oxoxDev oxoxDev assigned oxoxDev and unassigned oxoxDev May 28, 2026
oxoxDev added 2 commits May 28, 2026 20:59
…r provider" (Sentry TAURI-RUST-1V)

Add the canonical phrase from `reliable.rs:332` to the
ProviderConfigRejection classifier. `reliable.rs` rolls every exhausted
fallback into `All providers/models failed. Attempts:\n…\nThe model
`<id>` may not be available on your provider. Configure a fallback chain
via `reliability.model_fallbacks` in …`, which the composio triage
subscriber re-reports to Sentry. The remediation lives entirely in the
user's `reliability.model_fallbacks` config; Sentry has no remediation
path.

Drops the bulk of self-hosted Sentry TAURI-RUST-1V (10,692 events / 14d
on `tauri-rust` project, dominated by `gemini-3-flash-preview`-style
ProviderConfigRejection rollups).

Sentry-Issue: TAURI-RUST-1V
…sifier (Sentry TAURI-RUST-1V)

Swap the raw `tracing::error!` at memory_sync/composio/bus.rs:354 with
`crate::core::observability::report_error_or_expected` so user-config /
budget-exhausted rollups from the upstream provider chain get demoted to
info-level breadcrumbs instead of surfacing as Sentry errors.

Pairs with the new `may not be available on your provider` anchor in
`is_provider_config_rejection_message` — together they neutralise the
self-hosted Sentry TAURI-RUST-1V noise (10,692 events / 14d) whose inner
attempts the provider layer already correctly demoted but whose outer
rollup escaped via this raw error emit. Genuine triage runtime bugs that
don't classify still reach Sentry unchanged.

Sentry-Issue: TAURI-RUST-1V
@oxoxDev oxoxDev force-pushed the fix/sentry-1v-composio-triage-noise branch from 5fa4ca8 to e8397ab Compare May 28, 2026 15:31
@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 28, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

