feat(audio): native cpal microphone capture + native cue #20
Closed
leo-fengchao wants to merge 6 commits into
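To avoid WebView/WebRTC audio processing making recordings start quiet and then get louder, macOS now captures audio with Rust/cpal and reuses the existing ASR buffering pipeline. The 16 kHz mono PCM that ASR actually receives is also saved as WAV, to make further investigation of audio quality easier.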
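As an illustration of the WAV dump described in this commit, here is a minimal sketch of writing 16-bit PCM as a RIFF/WAVE stream using only the standard library. The function name `write_wav_pcm16` and the choice of `i16` samples are assumptions made for the example; the PR's actual writer and sample format are not shown here.

```rust
use std::io::{self, Write};

/// Hypothetical helper: write 16-bit PCM samples as a minimal 44-byte-header WAV.
/// For the ASR dump described above, the caller would pass 16_000 Hz and 1 channel.
fn write_wav_pcm16<W: Write>(
    mut out: W,
    samples: &[i16],
    sample_rate: u32,
    channels: u16,
) -> io::Result<()> {
    let bits_per_sample: u16 = 16;
    let block_align: u16 = channels * (bits_per_sample / 8);
    let byte_rate: u32 = sample_rate * u32::from(block_align);

    // WAV stores sizes as u32, so reject payloads that would not fit.
    let data_bytes = samples.len() as u64 * 2;
    if data_bytes > u64::from(u32::MAX) - 36 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "recording too long for a WAV header",
        ));
    }
    let data_len = data_bytes as u32;

    out.write_all(b"RIFF")?;
    out.write_all(&(36 + data_len).to_le_bytes())?; // size of everything after this field
    out.write_all(b"WAVE")?;
    out.write_all(b"fmt ")?;
    out.write_all(&16u32.to_le_bytes())?; // fmt chunk size for plain PCM
    out.write_all(&1u16.to_le_bytes())?; // format tag 1 = integer PCM
    out.write_all(&channels.to_le_bytes())?;
    out.write_all(&sample_rate.to_le_bytes())?;
    out.write_all(&byte_rate.to_le_bytes())?;
    out.write_all(&block_align.to_le_bytes())?;
    out.write_all(&bits_per_sample.to_le_bytes())?;
    out.write_all(b"data")?;
    out.write_all(&data_len.to_le_bytes())?;
    for s in samples {
        out.write_all(&s.to_le_bytes())?; // little-endian sample data
    }
    Ok(())
}
```

Writing to any `Write` target (a `Vec<u8>` in tests, a `File` in the app) keeps the header logic easy to check against a known byte layout.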
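Previously, a blocking recv on std::sync::mpsc was used inside an async context to wait for the capture thread to become ready, which tied up a tokio worker thread. The wait now uses an async oneshot and returns as soon as the capture stream is built, so the executor is no longer blocked. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>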
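First lock down the WebView-free main path, the late-result fix, recording assets, and the retry strategy, so that later implementation proceeds in the agreed phases.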
When debugging the native floating window, a way to run without triggering hot reload is often needed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
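The original WebView getUserMedia path relied on a base64 event so that the frontend AudioContext could play the cue; with native capture, the cue is played directly via paste::play_sound, which is more reliable and better timed. Native capture has no browser AEC/AGC that needs to converge, so the macOS settle delay is set to 0 and the key press to "start" feels more responsive. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>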
aa4832b to f24a146
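- native_audio is compiled only on macOS, and cpal is used only on macOS; keeping it in the general dependencies makes Linux CI pull in cpal→alsa-sys, which needs libasound2-dev (alsa.pc) and fails the build. Moving it to the macOS target dependencies avoids this.
- In append_audio_samples, app is used only for the macOS native waveform; on non-macOS, let _ = app silences the unused warning (clippy -D warnings).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>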
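The two fixes in this commit can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: `AppHandle` is a stand-in type, `emit_native_waveform` is a hypothetical helper, and the real signature of `append_audio_samples` is not shown in this conversation. The Cargo fragment in the leading comment uses Cargo's platform-specific dependency table, with the cpal version left elided.

```rust
// Cargo.toml (sketch): declare cpal only for macOS so Linux CI never
// resolves cpal -> alsa-sys.
//
//   [target.'cfg(target_os = "macos")'.dependencies]
//   cpal = "<version>"

/// Stand-in for the application handle type (assumption for this example).
struct AppHandle;

/// Hypothetical macOS-only hook that would forward samples to the native waveform.
#[cfg(target_os = "macos")]
fn emit_native_waveform(_app: &AppHandle, _samples: &[f32]) {}

/// Append samples to the shared buffer; `app` is only needed on macOS.
fn append_audio_samples(app: &AppHandle, buffer: &mut Vec<f32>, samples: &[f32]) {
    #[cfg(target_os = "macos")]
    emit_native_waveform(app, samples);

    // On other platforms the parameter is otherwise unused; binding it to `_`
    // keeps `clippy -D warnings` quiet without changing the signature.
    #[cfg(not(target_os = "macos"))]
    let _ = app;

    buffer.extend_from_slice(samples);
}
```

Keeping one signature on every platform means call sites need no `cfg` attributes of their own.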
7cc192f to fa7d4eb
Superseded by #21, which was squash-merged into master as 4f73ee4. Since #21 was stacked on this branch, that squash already includes all of this PR's commits (native cpal capture, native cue, settle=0, the cpal macOS-gating and the Linux-CI fixes). Verified master now contains native_audio.rs, the macOS-gated cpal dependency, the native emit_cue, and the unused-arg fix, so there is nothing left to merge here. Closing as redundant.
💡 Overview
On macOS, recording now captures audio natively via cpal (CoreAudio) instead of the WebView getUserMedia path, and the start/end cues play natively too. This removes the renderer from the audio hot path on macOS and makes capture/cue timing more predictable. Windows keeps the existing getUserMedia path.

This is the first of two PRs; the transcription failure-recovery / retry feature builds on top of this one.
🛠️ Key Changes
- cpal-based microphone capture with streaming resampling to 16 kHz mono, replacing the renderer getUserMedia route. Includes a unit-tested streaming resampler.
- Start/end cues play via paste::play_sound instead of a base64 event to a renderer AudioContext.
- The wait for the capture thread to become ready uses tokio::sync::oneshot instead of a blocking std::sync::mpsc recv, so it no longer blocks a tokio worker.

🧪 Testing
cargo test (236 unit tests) and npx vitest run (123 tests) pass; cargo clippy clean.

📸 Screenshots
No user-visible UI change (audio path only).
🔗 Related
Follow-up PR (failure recovery / retry) builds on this branch.
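For readers unfamiliar with the "streaming resampling to 16 kHz mono" step listed under Key Changes, the sketch below shows one simple way such a stage can work: downmix interleaved frames to mono, then resample chunk by chunk with linear interpolation while carrying state across chunk boundaries. The names `LinearResampler` and `downmix_to_mono` are hypothetical, and linear interpolation is an assumption chosen for brevity; the PR's actual resampler may use a different algorithm. A production downsampler would normally low-pass filter first to limit aliasing.

```rust
/// Average interleaved frames down to one channel (hypothetical helper).
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Streaming linear-interpolation resampler for mono audio (illustrative only).
struct LinearResampler {
    step: f64,         // input samples advanced per output sample
    pos: f64,          // fractional read position into `pending`
    pending: Vec<f32>, // input not yet fully consumed
}

impl LinearResampler {
    fn new(in_rate: u32, out_rate: u32) -> Self {
        assert!(in_rate > 0 && out_rate > 0, "rates must be non-zero");
        LinearResampler {
            step: f64::from(in_rate) / f64::from(out_rate),
            pos: 0.0,
            pending: Vec::new(),
        }
    }

    /// Feed one chunk of input and return whatever output is ready so far.
    fn process(&mut self, input: &[f32]) -> Vec<f32> {
        self.pending.extend_from_slice(input);
        let mut out = Vec::new();
        loop {
            let idx = self.pos as usize;
            // Interpolation needs both neighbours; wait for the next chunk otherwise.
            if idx + 1 >= self.pending.len() {
                break;
            }
            let frac = (self.pos - idx as f64) as f32;
            let a = self.pending[idx];
            let b = self.pending[idx + 1];
            out.push(a + (b - a) * frac);
            self.pos += self.step;
        }
        // Drop input that can no longer be referenced and rebase the position.
        let consumed = (self.pos as usize).min(self.pending.len());
        self.pending.drain(..consumed);
        self.pos -= consumed as f64;
        out
    }
}
```

Because the unconsumed tail and fractional position persist between calls, feeding a signal in arbitrary chunk sizes yields the same samples as feeding it in one piece, which is the property a unit test for a streaming resampler would typically check.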