feat(overlay): transcription failure recovery with one-tap retry #21
Merged
that-yolanda merged 8 commits on Jun 24, 2026
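To avoid WebView/WebRTC audio processing making recordings start quiet and grow louder, macOS now captures audio with Rust/cpal and reuses the existing ASR buffer pipeline. The 16k mono PCM that ASR actually receives is also saved as WAV, to help further investigation of audio quality.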
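As a rough illustration of saving 16k mono PCM as WAV, the sketch below builds a minimal RIFF/WAVE buffer using only the standard library. The function name `encode_wav_pcm16` is hypothetical and not taken from this PR, and the sketch assumes 16-bit signed samples; the actual implementation may use a crate or a different layout.

```rust
/// Encode 16-bit mono PCM samples as a minimal WAV (RIFF) byte buffer.
/// NOTE: `encode_wav_pcm16` is a hypothetical helper, not the PR's code.
pub fn encode_wav_pcm16(samples: &[i16], sample_rate: u32) -> Vec<u8> {
    const CHANNELS: u16 = 1; // mono, as in the commit message
    const BITS_PER_SAMPLE: u16 = 16; // assumption: 16-bit signed PCM
    let block_align = CHANNELS * (BITS_PER_SAMPLE / 8);
    let byte_rate = sample_rate * u32::from(block_align);
    let data_len = (samples.len() * 2) as u32;

    let mut out = Vec::with_capacity(44 + samples.len() * 2);
    // RIFF container header
    out.extend_from_slice(b"RIFF");
    out.extend_from_slice(&(36 + data_len).to_le_bytes());
    out.extend_from_slice(b"WAVE");
    // "fmt " chunk: 16 bytes describing uncompressed PCM
    out.extend_from_slice(b"fmt ");
    out.extend_from_slice(&16u32.to_le_bytes());
    out.extend_from_slice(&1u16.to_le_bytes()); // format tag 1 = PCM
    out.extend_from_slice(&CHANNELS.to_le_bytes());
    out.extend_from_slice(&sample_rate.to_le_bytes());
    out.extend_from_slice(&byte_rate.to_le_bytes());
    out.extend_from_slice(&block_align.to_le_bytes());
    out.extend_from_slice(&BITS_PER_SAMPLE.to_le_bytes());
    // "data" chunk: little-endian samples
    out.extend_from_slice(b"data");
    out.extend_from_slice(&data_len.to_le_bytes());
    for s in samples {
        out.extend_from_slice(&s.to_le_bytes());
    }
    out
}
```

Calling `encode_wav_pcm16(&buffer, 16_000)` on the buffered samples and writing the returned bytes to disk yields a file that standard players can open, which is what makes the saved audio useful for diagnosing quality issues.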
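Previously, a blocking std::sync::mpsc recv waited inside an async context for the capture thread to become ready, which tied up a tokio worker thread. This now awaits a oneshot channel asynchronously and returns the moment the capture stream is built, so the executor is no longer blocked.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>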
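Lock in the WebView-free main path, the late-result fix, recording assets, and the retry strategy first, so that subsequent implementation proceeds through the agreed phases.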
Debugging the native overlay often needs a way to run without triggering hot reload.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
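The original WebView getUserMedia path relied on a base64 event to have the frontend AudioContext play the cue sound; with native capture the sound is played directly via paste::play_sound, which is more reliable and better timed.
Native capture has no browser AEC/AGC that needs to converge, so the macOS settle delay is set to 0 and the gap between key press and "start" feels more responsive.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>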
2bc9cca to 377e280
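- native_audio is compiled only on macOS, and cpal is used only on macOS; keeping cpal in the general dependencies makes Linux CI pull in cpal→alsa-sys, which needs libasound2-dev (alsa.pc), so the build fails. Moving it to the macOS target dependencies avoids this.
- The app argument of append_audio_samples is used only for the macOS native waveform; on non-macOS, let _ = app silences the unused warning (clippy -D warnings).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>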
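Falling back to partial/final text on timeout treated an unfinished recognition as a successful result and masked network problems. It now returns an error, and the timeout is relaxed to 15s, so the caller can take the failure fallback (notice + retry) and the outcome is more predictable.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>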
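The behavioral change in this commit can be sketched as follows. This is a synchronous, std-only illustration rather than the PR's async code: `await_final` and `FinalError` are hypothetical names, and a plain mpsc channel stands in for whatever delivers the final transcript. The point is only that a timeout maps to an error and any partial text is never returned as if it were final.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum FinalError {
    /// No final transcript arrived within the deadline.
    Timeout,
    /// The sending side went away before producing a result.
    Disconnected,
}

/// Wait for the final transcript. On timeout, return an error instead of
/// falling back to partial text, so the caller can offer a retry.
pub fn await_final(rx: &Receiver<String>, timeout: Duration) -> Result<String, FinalError> {
    match rx.recv_timeout(timeout) {
        Ok(text) => Ok(text),
        Err(RecvTimeoutError::Timeout) => Err(FinalError::Timeout),
        Err(RecvTimeoutError::Disconnected) => Err(FinalError::Disconnected),
    }
}
```

With the 15s value from the commit message, a caller would pass `Duration::from_secs(15)` and route `Err(FinalError::Timeout)` to the notice-plus-retry fallback.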
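Provide a full recovery path for transcription failures, so that a single network blip no longer loses an entire utterance.
Recording retention: the same 16k mono PCM that is sent to ASR is buffered in full and written as WAV on stop. After success it is kept or deleted according to the keep_recordings setting; failed recordings are kept for retry, and recordings that are never retried are reclaimed by the retention cleanup after 31 days. Adds a "保留录音" (keep recordings) setting and play/retry entries in history.
One-tap retry: on failure the overlay shows a retryable notice and a button; retry replays the retained WAV and transcribes it again, the result is streamed back in, and focus is handed back to the original window before pasting (clicking the button activates the overlay, so the foreground app is recorded and restored).
Keyboard: pressing the main hotkey again while the failure notice is shown triggers a retry; ESC cancels in both the error state and during a retry. The retry button shows the trigger hotkey (e.g. "重试 (R ⌥)"), with symbols matching the settings page.
Other: stopping without having spoken ends immediately and does not enter retry (the energy check skips the window of the opening cue sound); macOS native capture has no AEC/AGC, so the settle delay is set to 0.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>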
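One way to read the retention rules above is as two small decisions: what to do when a transcription (or retry) finishes, and what the periodic cleanup does. The sketch below encodes that reading; all names (`Outcome`, `Action`, `on_finish`, `on_sweep`, `RETENTION`) are hypothetical, and it assumes that a successful retry is treated like any other success with respect to keep_recordings.

```rust
use std::time::Duration;

/// Retention window from the commit message (31 days).
pub const RETENTION: Duration = Duration::from_secs(31 * 24 * 60 * 60);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Succeeded,
    Failed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    Keep,
    Delete,
}

/// Decision taken right after a transcription or retry finishes.
pub fn on_finish(outcome: Outcome, keep_recordings: bool) -> Action {
    match outcome {
        // Failed recordings are always kept so they can be retried.
        Outcome::Failed => Action::Keep,
        // Successful ones follow the user setting (assumption: retries too).
        Outcome::Succeeded if keep_recordings => Action::Keep,
        Outcome::Succeeded => Action::Delete,
    }
}

/// Decision taken by the periodic cleanup for a recording of the given age.
pub fn on_sweep(age: Duration) -> Action {
    if age > RETENTION {
        Action::Delete
    } else {
        Action::Keep
    }
}
```

Keeping the two decisions separate means a failed recording that is never retried needs no special handling: it is simply picked up by the sweep once it ages out.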
377e280 to 2154c6e
💡 Overview
Gives transcription failures a full recovery path so a single network blip / ASR timeout no longer loses an entire utterance. The audio is retained, the failure is surfaced on the overlay with a retry affordance, and retry replays the saved audio and streams the result back live.
🛠️ Key Changes
- The keep_recordings setting keeps successful recordings for about 31 days; failed ones are kept for retry and reclaimed by the same 31-day sweep, or deleted once a retry succeeds. History gains play / retry entries.
- The retry button shows the trigger hotkey, e.g. 重试 (R ⌥), matching the settings-page symbols.
- commit_and_await_final returns an error on timeout instead of silently falling back to partial text, so the failure/retry path can engage.
🧪 Testing
cargo test (246 unit tests) and npx vitest run (123 tests) pass; cargo clippy clean; biome clean.
📸 Screenshots
UI changes (overlay retry button + label, settings "保留录音" toggle, history play/retry entries) — screenshots to be added.
🔗 Related
Depends on #20.