feat(audio): native cpal microphone capture + native cue by leo-fengchao · Pull Request #20 · that-yolanda/voicepaste

leo-fengchao · 2026-06-23T16:09:07Z

💡 Overview

On macOS, recording now captures audio natively via cpal (CoreAudio) instead of the WebView getUserMedia path, and the start/end cues play natively too. This removes the renderer from the audio hot path on macOS and makes capture/cue timing more predictable. Windows keeps the existing getUserMedia path.

This is the first of two PRs; the transcription failure-recovery / retry feature builds on top of this one.

🛠️ Key Changes

Native capture (macOS): cpal-based microphone capture with streaming resampling to 16 kHz mono, replacing the renderer getUserMedia route. Includes a unit-tested streaming resampler.
Native cue: start/end cues play via paste::play_sound instead of a base64 event to a renderer AudioContext.
Settle delay: macOS uses 0 ms (no browser AEC/AGC to converge under native capture); other platforms keep 350 ms.
Non-blocking warmup: the capture-thread ready signal uses a tokio::sync::oneshot channel instead of a blocking std::sync::mpsc recv, so it no longer blocks a tokio worker.
Docs: native overlay + retry design note.
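For readers unfamiliar with streaming resampling, here is a minimal sketch of the idea, not the resampler shipped in this PR. It assumes `f32` samples, a simple average for the mono downmix, and linear interpolation with no anti-aliasing filter; `downmix_to_mono` and `StreamingResampler` are hypothetical names. The point it illustrates is that the fractional read position and the last input sample are carried across chunks, so the output does not depend on how the capture callback slices the input.

```rust
/// Hypothetical helper: averages interleaved frames down to one channel.
pub fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Hypothetical streaming resampler using linear interpolation (mono in, mono out).
/// Assumption: no low-pass filter, so downsampling can alias; a real one may filter.
pub struct StreamingResampler {
    step: f64,    // input samples advanced per output sample (in_rate / out_rate)
    pos: f64,     // read position; 0.0 is `prev`, 1.0 is the first sample of the next chunk
    prev: f32,    // last input sample seen so far
    primed: bool, // false until the first input sample has arrived
}

impl StreamingResampler {
    pub fn new(in_rate: u32, out_rate: u32) -> Self {
        assert!(in_rate > 0 && out_rate > 0, "sample rates must be non-zero");
        Self {
            step: f64::from(in_rate) / f64::from(out_rate),
            pos: 0.0,
            prev: 0.0,
            primed: false,
        }
    }

    /// Consumes one chunk of mono input and appends resampled output to `out`.
    pub fn process(&mut self, mut input: &[f32], out: &mut Vec<f32>) {
        if !self.primed {
            // The very first sample becomes position 0 of the stream.
            match input.split_first() {
                Some((&first, rest)) => {
                    self.prev = first;
                    self.primed = true;
                    input = rest;
                }
                None => return,
            }
        }
        let n = input.len();
        if n == 0 {
            return;
        }
        // Window layout: index 0 is `prev`, index i (1..=n) is input[i - 1].
        while self.pos < n as f64 {
            let idx = self.pos as usize; // floor; `pos` is never negative
            let frac = (self.pos - idx as f64) as f32;
            let a = if idx == 0 { self.prev } else { input[idx - 1] };
            let b = input[idx];
            out.push(a + (b - a) * frac);
            self.pos += self.step;
        }
        // Re-base the position so that the last sample of this chunk becomes index 0.
        self.pos -= n as f64;
        self.prev = input[n - 1];
    }
}
```

A capture callback would call `downmix_to_mono` on each buffer and feed the result to `process` on a resampler built with `StreamingResampler::new(device_rate, 16_000)`. The sketch never flushes, so an output sample that lands exactly on the final input sample may not be emitted.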

🧪 Testing

cargo test (236 unit tests) and npx vitest run (123 tests) pass; cargo clippy clean.
Manually verified on macOS: native capture, native start/end cues, recording start/stop.

📸 Screenshots

No user-visible UI change (audio path only).

🔗 Related

Follow-up PR (failure recovery / retry) builds on this branch.

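To keep WebView/WebRTC audio processing from making recordings start quiet and grow louder, macOS now captures audio with Rust/cpal and reuses the existing ASR buffering pipeline. The 16 kHz mono PCM that ASR actually receives is also saved as WAV, to help with further investigation of audio quality.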
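As an illustration of the WAV dump mentioned in the commit message above, the following sketch wraps 16-bit mono PCM in a canonical 44-byte RIFF/WAVE header. It is not the PR's code: the function name `encode_wav_pcm16_mono` is hypothetical, and it assumes the ASR buffer holds `i16` samples (the PR does not say which sample type is used).

```rust
/// Hypothetical helper: encodes mono 16-bit PCM as an in-memory WAV file.
/// Assumption: samples are i16; all header fields are little-endian.
pub fn encode_wav_pcm16_mono(samples: &[i16], sample_rate: u32) -> Vec<u8> {
    const CHANNELS: u16 = 1;
    const BITS_PER_SAMPLE: u16 = 16;
    let block_align = CHANNELS * (BITS_PER_SAMPLE / 8); // bytes per frame
    let byte_rate = sample_rate * u32::from(block_align); // bytes per second
    let data_len = (samples.len() * 2) as u32; // payload size in bytes

    let mut wav = Vec::with_capacity(44 + samples.len() * 2);
    // RIFF container header: total size is everything after these 8 bytes.
    wav.extend_from_slice(b"RIFF");
    wav.extend_from_slice(&(36 + data_len).to_le_bytes());
    wav.extend_from_slice(b"WAVE");
    // "fmt " chunk: 16-byte body, format tag 1 = uncompressed PCM.
    wav.extend_from_slice(b"fmt ");
    wav.extend_from_slice(&16u32.to_le_bytes());
    wav.extend_from_slice(&1u16.to_le_bytes());
    wav.extend_from_slice(&CHANNELS.to_le_bytes());
    wav.extend_from_slice(&sample_rate.to_le_bytes());
    wav.extend_from_slice(&byte_rate.to_le_bytes());
    wav.extend_from_slice(&block_align.to_le_bytes());
    wav.extend_from_slice(&BITS_PER_SAMPLE.to_le_bytes());
    // "data" chunk: raw little-endian samples.
    wav.extend_from_slice(b"data");
    wav.extend_from_slice(&data_len.to_le_bytes());
    for sample in samples {
        wav.extend_from_slice(&sample.to_le_bytes());
    }
    wav
}
```

Writing the result with `std::fs::write(path, encode_wav_pcm16_mono(&pcm, 16_000))` would produce a file most audio tools can open, which is what makes such a dump useful for checking what the recognizer actually received.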

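Previously, a blocking recv on std::sync::mpsc waited inside an async context for the capture thread to become ready, which tied up a tokio worker thread. The wait now awaits a oneshot asynchronously and returns the moment the capture stream is built, so the executor is no longer blocked.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>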

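Pin down the WebView-free main path, the late-result fix, the recording assets, and the retry strategy first, so that later implementation proceeds through the agreed stages.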

Debugging the native overlay often needs a way to run the app without triggering hot reload.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

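The original WebView getUserMedia path played cues by sending a base64 event to a front-end AudioContext; under native capture they are played directly with paste::play_sound, which is more stable and better timed. Native capture has no browser AEC/AGC that needs to converge, so the macOS settle delay is set to 0, which makes the gap between the key press and the "start" cue feel more responsive.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>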
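The platform-dependent settle delay can be expressed in a few lines. This is a sketch under assumptions, not the PR's code: the function name `settle_delay` is hypothetical, and only the two values (0 ms on macOS, 350 ms elsewhere) come from the description.

```rust
use std::time::Duration;

/// Hypothetical helper: how long to wait after capture starts before
/// signalling "recording started" to the user.
pub const fn settle_delay() -> Duration {
    if cfg!(target_os = "macos") {
        // Native capture: no browser AEC/AGC to converge, so no wait.
        Duration::from_millis(0)
    } else {
        // getUserMedia path: give the browser's audio processing time to settle.
        Duration::from_millis(350)
    }
}
```

Using `cfg!` rather than two `#[cfg]`-gated functions keeps both branches type-checked on every platform, at the cost of compiling a branch that is never taken.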

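- native_audio is compiled only on macOS, and cpal is used only on macOS; keeping cpal in the general dependencies makes Linux CI pull in cpal→alsa-sys, which needs libasound2-dev (alsa.pc), so the build fails. Moving cpal to the macOS target dependencies avoids this.
- The app argument of append_audio_samples is used only for the macOS native waveform; on non-macOS platforms, let _ = app silences the unused warning (clippy -D warnings).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>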
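The unused-argument fix can be illustrated with a small sketch. Only the name `append_audio_samples`, the `app` argument, and the `let _ = app` idiom come from the commit message; the `AppHandle` stand-in type, the extra parameters, and `update_native_waveform` are hypothetical.

```rust
/// Hypothetical stand-in for the application handle type.
pub struct AppHandle;

/// Hypothetical macOS-only consumer of the handle (the native waveform).
#[cfg(target_os = "macos")]
fn update_native_waveform(_app: &AppHandle, _samples: &[i16]) {}

/// Appends captured samples to the ASR buffer on every platform.
pub fn append_audio_samples(app: &AppHandle, buffer: &mut Vec<i16>, samples: &[i16]) {
    // Only macOS uses `app`, to drive the native waveform.
    #[cfg(target_os = "macos")]
    update_native_waveform(app, samples);

    // Elsewhere, mark `app` as intentionally unused so that
    // `clippy -D warnings` does not fail the build.
    #[cfg(not(target_os = "macos"))]
    let _ = app;

    buffer.extend_from_slice(samples);
}
```

Keeping one signature for all platforms means callers need no `#[cfg]` of their own; the alternative of gating the parameter itself would push the platform split out to every call site.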

leo-fengchao · 2026-06-24T08:24:33Z

Superseded by #21, which was squash-merged into master as 4f73ee4. Since #21 was stacked on this branch, that squash already includes all of this PR's commits (native cpal capture, native cue, settle=0, the cpal macOS-gating and the Linux-CI fixes). Verified master now contains native_audio.rs, the macOS-gated cpal dependency, the native emit_cue, and the unused-arg fix — so there is nothing left to merge here. Closing as redundant.

leochenfc and others added 5 commits June 24, 2026 00:25

feat(audio): capture recordings natively with cpal

452fdd9

docs(audio): document the native overlay and retry design

bf31d50

chore: add dev:no-watch script

9241450

leo-fengchao force-pushed the codex/native-cpal-capture branch from aa4832b to f24a146 on June 23, 2026 16:31

leo-fengchao changed the title ~~feat(audio): native cpal capture with transcription failure recovery~~ feat(audio): native cpal microphone capture + native cue Jun 23, 2026

leo-fengchao mentioned this pull request Jun 23, 2026

feat(overlay): transcription failure recovery with one-tap retry #21

Merged

leo-fengchao force-pushed the codex/native-cpal-capture branch from 7cc192f to fa7d4eb on June 24, 2026 01:38

leo-fengchao closed this Jun 24, 2026

leo-fengchao deleted the codex/native-cpal-capture branch June 24, 2026 10:59

leo-fengchao wants to merge 6 commits into that-yolanda:master from leo-fengchao:codex/native-cpal-capture
