feat: Windows support — WASAPI audio, Parakeet, mouse tracker, GPU model#40
Open
sk8ersquare wants to merge 9 commits intopoodle64:mainfrom
Open
feat: Windows support — WASAPI audio, Parakeet, mouse tracker, GPU model#40sk8ersquare wants to merge 9 commits intopoodle64:mainfrom
sk8ersquare wants to merge 9 commits intopoodle64:mainfrom
Conversation
added 9 commits
March 23, 2026 16:42
- windows-latest runner with --no-default-features (Whisper only) - Install LLVM via choco (required for whisper-rs clang bindings) - LIBCLANG_PATH set for Windows runner - Config file permissions already correctly guarded with #[cfg(unix)]
Windows MSI/NSIS requires each version component to be <= 255. Year-based version 2026.x.x breaks this constraint. Set bundle.windows.version to 26.2.11.0 (century-stripped) for Windows packaging while keeping the full version everywhere else.
…ields - CI: strip century from version before Windows build (2026.x -> 26.x) - Remove invalid bundle.windows.version and nsis.displayVersion from config - Tauri 2.0 doesn't support those fields, version comes from Cargo.toml
Windows GPU: - Add parakeet-tdt-0.6b-v2-fp16 to manifest.json (FP16 CUDA-optimised model) - parakeet.rs: auto-detect fp16 vs int8 model files; try CUDA provider first on Windows when parakeet-cuda feature enabled, fall back to CPU - Add nemo_transducer_fp16 model type to manifest.rs is_backend_available() - Add parakeet-cuda Cargo feature for NVIDIA GPU Windows builds - Add tauri-plugin-os for runtime platform detection Platform-aware permissions UI (OverviewPane): - Accessibility permission card: macOS only - Input Monitoring permission card: macOS only - Show in Dock toggle: macOS only - Troubleshooting / TCC reset section: macOS only - allPermissionsGranted derived: only requires Accessibility + Input Monitoring on macOS; Windows/Linux only require Microphone macOS and Linux builds are 100% unchanged — all guards are additive cfg/runtime checks that default to the existing behaviour.
… (v2026.3.2)
Three Windows bugs fixed:
1. Audio capture — native sample format handling (WASAPI)
Windows WASAPI devices commonly report i16 or i32 as their native format.
The recorder was always requesting f32 from cpal, which either failed or
produced silent recordings when the device's native format didn't match.
Now detects the native SampleFormat and converts i16/i32/u16 → f32 in
the callback, so the rest of the pipeline always sees correct f32 audio.
This fixes 'records for 2 seconds then stops' on Whisper Small.
2. Parakeet backend enabled on Windows
CI was building Windows with --no-default-features, stripping the parakeet
feature. Sherpa-ONNX static linking works on Windows (unlike Linux where
the archive lacks .a files). Updated CI to --features parakeet and updated
Cargo.toml to enable whisper-rs CPU backend (default features) on Windows.
Parakeet models now show as downloadable and functional on Windows.
3. Overview setup flow — Windows/Linux
- Accessibility (step 3) is hidden on non-macOS; not required there
- allRequiredDone only requires model + mic on non-macOS
- check_model_downloaded called with undefined instead of null to avoid
Tauri v2 Option<String> deserialization edge case
The recording indicator was not following the cursor on Windows because get_mouse_position() always returned None (only macOS/Linux were implemented). Result: indicator appeared at bottom-centre fallback, then after ~1.4s (90 failed polls × 16ms) the failure threshold hit and moved it off-screen. Fix: add Windows implementation using GetCursorPos + GetDeviceCaps for DPI- aware logical coordinate conversion. Matches how Handy's rdev dependency handles Windows cursor input.
…wnload-binaries) sherpa-rs-sys 0.6.8 build script errors when tts+static+download-binaries are all enabled. The tts feature is enabled by default; adding default-features = false resolves the conflict.
ONNX Runtime (embedded in sherpa-onnx static archive) requires advapi32.lib for ETW telemetry symbols (EventWriteTransfer, EventRegister, etc.) and espeak-ng needs it for registry access (RegOpenKeyExA, RegQueryValueExA). sherpa-rs-sys doesn't add this automatically; adding it via build.rs.
… propagate to deps) build.rs rustc-link-lib only affects the top-level crate's link step, not transitive deps like sherpa-rs. Using .cargo/config.toml rustflags with /DEFAULTLIB:advapi32.lib ensures the lib is available when link.exe resolves sherpa-onnx's ETW and registry symbols.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Full Windows x86_64 support. Tested on Windows 11 with NVIDIA GPU.
What's included
CI: Windows build target
windows-latestto the release matrix2026.x.y→26.x.yfor the Windows build only (Tauri's NSIS installer doesn't have this limit; only MSI does)tauri.conf.jsonfields that caused Windows builds to failAudio capture: WASAPI sample format handling
Windows WASAPI devices commonly report
i16ori32as their native sample format. The recorder was always requestingf32from cpal, causing the audio stream to fail or produce silence on many Windows devices.Fixed: detects the native
SampleFormatand convertsi16/i32/u16→f32in the callback. Same pattern used by Handy.Parakeet (Sherpa-ONNX) on Windows
Parakeet's Sherpa-ONNX static archive works on Windows (unlike Linux where the archive lacks
.afiles). Enabled via--features parakeetin Windows CI.Fixes required:
default-features = falseon sherpa-rs (thettsdefault feature conflicts withstatic + download-binaries).cargo/config.toml:/DEFAULTLIB:advapi32.lib— ONNX Runtime (embedded in sherpa-onnx) requires advapi32 for ETW telemetry symbols; espeak-ng needs it for registry access. sherpa-rs-sys doesn't declare this dependency.Mouse tracker: Windows cursor following
The recording indicator (cursor-dot style) was always showing at the fallback position on Windows because
get_mouse_position()had no Windows implementation (only macOS/Linux). After 90 failed polls (~1.4s) it moved off-screen.Fixed: added Win32
GetCursorPos+GetDeviceCapsfor DPI-aware logical pixel coordinates.Platform-aware permissions UI
Overview pane previously showed macOS-only Accessibility and Input Monitoring rows on all platforms. Windows users saw permission steps they couldn't complete.
Fixed:
{#if isMacos}allRequiredDoneonly requires Model + Microphone on non-macOScheck_model_downloadedpassesundefinedinstead ofnull(Tauri v2 serialization)GPU model
Manifest includes
parakeet-tdt-0.6b-v2-fp16(FP16 ONNX). On Windows, users with NVIDIA GPUs and CUDA installed can download this model for ~3-4x faster transcription. Falls back to CPU automatically.Tested
Not included (future work)
ort-directml)