fix(observability): demote "Config loading timed out" out of Sentry#2920
Conversation
…line error handling by removing outdated comments
…st assertions for clarity
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds ExpectedErrorKind::ConfigLoadTimedOut and a matcher for the case-insensitive literal "config loading timed out", wires it into expected_error_kind and report_expected_message (emit tracing::warn! breadcrumb), condenses a session-expiry matcher expression, and adds unit tests for positive and negative classification cases. ChangesConfig Load Timeout Observability Classification
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
graycyrus
left a comment
There was a problem hiding this comment.
@YellowSnnowmann the code looks good to me — tight matcher anchored on the exact canonical phrase, proper dispatch placement relative to DiskFull/MemoryStorePiiRejection, and solid test coverage (positive + RPC-wrapped positive + three meaningful negatives). CI is still pending so I'll hold the formal approval until those are green, but I don't expect any blockers there. A couple minor things while I had the diff open:
Missing doc comment on ConfigLoadTimedOut
Your PR description explicitly says the new variant includes a doc-comment citing the emit sites and TAURI-RUST-5X, but the diff just shows a bare ConfigLoadTimedOut, with no preceding /// block. Every other variant in the enum has one — ChannelSupervisorRestart, DiskFull, MemoryStorePiiRejection, etc. all document the Sentry issue ID, event count, and affected emit sites. Worth adding for consistency and so the next maintainer doesn't have to dig through git blame to understand why this is expected.
Bundled doc comment deletions
The PR deletes substantial doc comment blocks from LoopbackUnavailable, PromptInjectionBlocked, ContextWindowExceeded, the UPDATER_TRANSIENT_MESSAGE_PHRASES const, and the three is_session_expired_message arms. Those comments carried Sentry issue IDs (OPENHUMAN-TAURI-R5/R6, OPENHUMAN-TAURI-140, TAURI-RUST-501, TAURI-RUST-CD, 4P0, 4K5, 1EE), event counts, and the specific reasoning for conjunctive vs. substring anchoring. None of that is in git history in a way that's easy to surface later. The cleanup makes the code shorter but strips future-maintainer context. If the goal is readability, the issue IDs and emit-site references are the parts worth keeping even if the prose gets trimmed.
Neither of these blocks the merge — the classifier logic itself is correct and the fix is exactly the right call for TAURI-RUST-5X. Once CI goes green I'll approve.
oxoxDev
left a comment
There was a problem hiding this comment.
The new matcher itself is clean — is_config_load_timed_out_message is tightly anchored on the literal "config loading timed out" phrase, and the negative tests (network/HTTP timeouts, bare "timed out") are solid. But the diff has a blocking side effect.
Blocker — large collateral deletion of existing doc comments (stale base)
Beyond adding the ConfigLoadTimedOut arm, this diff deletes ~80 lines of doc comments and Sentry-ID annotations that other merged PRs added:
UPDATER_TRANSIENT_MESSAGE_PHRASES— thetauri-plugin-updaterrationale +Drops TAURI-RUST-CDnoteLoopbackUnavailable— theOPENHUMAN-TAURI-R5/R6annotationPromptInjectionBlocked— theOPENHUMAN-TAURI-140rationaleContextWindowExceeded— the full single-source-matcher explanation +TAURI-RUST-501noteis_session_expired_message— the per-arm4P0 / 4K5 / 1EEanchor comments (logic preserved, but the explanatory blocks are stripped)
I confirmed those comments still exist on current main, so merging this as-is would revert that documentation. This looks like a stale base / bad rebase rather than an intentional change.
Please rebase on the latest tinyhumansai/openhuman:main and re-push so the diff contains only the ConfigLoadTimedOut addition. Once it's just the matcher + tests, this is a straightforward approve.
Summary
"Config loading timed out"wrapper string as an expected, non-actionable error so it no longer fires Sentry events.ExpectedErrorKind::ConfigLoadTimedOutvariant +is_config_load_timed_out_messagematcher, wired into the centralexpected_error_kinddispatcher.warn!(breadcrumb only) inreport_expected_message, matching the existingDiskFulltreatment.Problem
Sentry issue TAURI-RUST-5X (
Config loading timed out, ~318 events, ongoing acrossopenhuman@0.54.0andopenhuman@0.56.0) is pure noise. The string originates fromload_config_with_timeout/reload_config_snapshot_with_timeoutinsrc/openhuman/config/ops.rs:48,70whentokio::time::timeout(CONFIG_LOAD_TIMEOUT, …)elapses aroundConfig::load_or_init.The timeout fires when the user's local filesystem stalls — cloud-sync daemons (iCloud Drive / OneDrive) holding
~/.openhuman, antivirus scanning the config file on open, a nearly-full or network-mounted home directory. None of these are operator-actionable; Sentry has no remediation path. Events surface across many RPC methods (openhuman.config_get,inference_status,local_ai_downloads_progress, …), all wrapped byrpc.invoke_method→report_error_or_expected.Solution
Followed the established
ExpectedErrorKindpattern insrc/core/observability.rs:ConfigLoadTimedOutwith a doc-comment citing the emit sites and TAURI-RUST-5X.is_config_load_timed_out_messageanchors on the full lowercase phrase"config loading timed out"— substring match so the RPC-wrapped form still classifies, but tight enough that bare"… timed out"/"operation timed out"(network) keep flowing to Sentry.is_disk_full_messageand beforeis_memory_store_pii_rejectioninexpected_error_kind.tracing::warn!(breadcrumb survives for local diagnosis; no Sentry error event).No behavior change beyond Sentry routing: the error string is unchanged, still returned to the caller, still surfaced in the UI. Classification is the only effect.
Submission Checklist
classifies_config_load_timed_out(happy + RPC-wrapped) anddoes_not_classify_unrelated_timeouts_as_config_load_timed_out(network/HTTP/cron negatives).cargo-llvm-cov; changed lines are matcher + dispatch arm, all exercised by the new tests.N/A: observability-only routing change, no feature row.N/A: no matrix feature touched.N/A: classifier-only change.N/A: no release-cut surface touched.Impact
Related
Summary by CodeRabbit
Bug Fixes
Tests