feat: upstream microphone passthrough via VB-Audio Virtual Cable #168

Closed
xenstalker02 wants to merge 3 commits into Nonary:master from xenstalker02:upstream-pr/mic-passthrough

@xenstalker02

Summary

Adds full upstream microphone passthrough: the Moonlight client captures the user's mic, Opus-encodes it at 96kbps mono, and sends it to the host as 0x3003 control-stream packets over the existing encrypted AES-GCM channel. The host decodes the Opus payload and writes float32 PCM into a VB-Audio CABLE Input render device, making the mic appear as a normal Windows capture source to any app.

No new network ports — all mic data rides the existing control stream.

Commits

  1. feat: mic passthrough via VB-Audio Virtual Cable — config fields (mic_sink, mic_capture_device, mic_buffer_ms, mic_buffer_packets, install_vbcable), speaker_t abstract interface + virtual_microphone() / switch_default_capture_device() on audio_control_t, is_placebo_app() helper (Desktop/auto-detached sessions skip device switch + toasts), web UI fields, locale strings, bundled VB-Audio CABLE WDM driver with silent pnputil installer.

  2. feat: WASAPI mic render + 0x3003 Opus decode loop + capture device switch: speaker_wasapi_t event-driven render thread, jitter buffer (configurable prebuffer), switch_default_capture_device() for both eConsole and eCommunications roles with verify+retry loop (3x, 500ms), 1-second delayed retry thread for games that enumerate audio after startup.

  3. fix: prevent repeated Application Started/Stopped toasts during reconnect cycles — per-app s_started_notification_fired / s_stopped_notification_fired flags reset only when the app name changes; suppresses notification spam on Moonlight reconnect (common with Steam Deck sleep/wake).

Configuration

Add to sunshine.conf:

mic_sink = CABLE Input
mic_capture_device = CABLE Output (VB-Audio Virtual Cable)
install_vbcable = true

All five mic fields (mic_sink, mic_capture_device, mic_buffer_ms, mic_buffer_packets, install_vbcable) are also exposed in the web UI under Audio/Video settings.

Requirements

  • VB-Audio Virtual Cable — bundled driver auto-installs silently when install_vbcable = true.
  • A Moonlight client fork that sends 0x3003 mic packets (stock Moonlight does not).

Test Plan

  • Privacy scan clean (no personal IPs, MACs, credentials, or host-specific values)
  • Mic audio arrives at CABLE Output and is usable in Discord / Teams / game voice chat
  • Default capture device switches to CABLE Output on stream start, restores on stream end
  • No duplicate Started/Stopped toasts on Moonlight reconnect
  • Desktop pseudo-app does not trigger device switch or toasts
  • install_vbcable = false skips silent driver install

🤖 Generated with Claude Code

Adds upstream microphone passthrough support: the Moonlight client
(Vibelight) captures the user's mic via SDL, Opus-encodes at 96kbps
mono, and sends it to the host as 0x3003 control-stream packets.
Sunshine decodes the Opus payload and writes float32 PCM into a
VB-Audio CABLE Input render device, making the stream mic appear as a
normal Windows capture source.

Key additions:
- config: mic_sink, mic_capture_device, mic_buffer_ms,
  mic_buffer_packets, install_vbcable fields (config.cpp / config.h)
- platform/common.h: speaker_t abstract interface + virtual_microphone()
  and switch_default_capture_device() declarations on audio_control_t
- platform/linux/audio.cpp: no-op stub for virtual_microphone()
- process: is_placebo_app() — identifies Desktop / auto-detached sessions
  that must not trigger device switches or toast notifications
- main.cpp: startup diagnostic — validates mic_capture_device against
  active WASAPI capture endpoints, logs available devices if not found
- Web UI: mic_sink, mic_capture_device, mic_buffer_ms,
  mic_buffer_packets, install_vbcable fields in AudioVideo.vue + en.json
- src_assets/windows/drivers/vbcable/: bundled VB-Audio CABLE WDM
  driver (INF + SYS + CAT + setup EXE) with silent pnputil installer
…itch

- platform/windows/audio.cpp:
  - speaker_wasapi_t: WASAPI render client for writing decoded float32 PCM
    to VB-Audio CABLE Input; event-driven render thread, 50ms default
    buffer, per-frame IAudioRenderClient::GetBuffer/ReleaseBuffer cycle
  - switch_default_capture_device(): switches Windows default capture
    endpoint to mic_capture_device for both eConsole AND eCommunications
    roles; includes verify+retry loop (3×, 500ms) to handle drivers that
    delay the policy change; restores previous device on stream end
  - audio_control_t::virtual_microphone(): factory returning speaker_wasapi_t

- stream.cpp:
  - 0x3003 packet handler: receives Opus-encoded mic frames from client;
    jitter buffer (configurable prebuffer depth); OpusDecoder (48kHz, mono,
    float output); pushes float PCM to speaker_wasapi_t
  - Session start: opens VB-Cable render client, switches default capture
    device immediately + schedules 1-second delayed retry thread (handles
    games that enumerate audio devices after startup)
  - Session end: restores previous default capture device; is_placebo_app()
    guard prevents device switch for Desktop / auto-detached sessions
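
The prebuffering behaviour described for the jitter buffer can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the class name MicJitterBuffer and its methods are hypothetical, the sequence number is assumed to be a plain 32-bit counter, and wraparound is deliberately ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical prebuffering jitter buffer (names are illustrative only).
class MicJitterBuffer {
public:
  explicit MicJitterBuffer(std::size_t prebuffer_packets)
      : prebuffer_(prebuffer_packets) {
    assert(prebuffer_packets > 0);
  }

  // Store one encoded frame under its sequence number.
  // Frames older than the last one played are dropped; duplicates are ignored.
  void push(std::uint32_t seq, std::vector<std::uint8_t> frame) {
    if (seq < next_seq_) return;  // arrived too late to be useful
    frames_.emplace(seq, std::move(frame));  // emplace keeps the first copy
  }

  // Hand out the lowest-numbered frame. Returns false while the buffer is
  // still filling to the prebuffer depth, or after an underrun.
  bool pop(std::vector<std::uint8_t> &out) {
    if (!playing_) {
      if (frames_.size() < prebuffer_) return false;
      playing_ = true;
    }
    if (frames_.empty()) {
      playing_ = false;  // underrun: refill before resuming playback
      return false;
    }
    auto it = frames_.begin();
    next_seq_ = it->first + 1;
    out = std::move(it->second);
    frames_.erase(it);
    return true;
  }

private:
  std::size_t prebuffer_;
  bool playing_ = false;
  std::uint32_t next_seq_ = 0;
  std::map<std::uint32_t, std::vector<std::uint8_t>> frames_;
};
```

A larger prebuffer depth trades added latency for tolerance to reordering and bursty arrival; the right value depends on the network path.
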
…nect cycles

System tray notifications were re-fired on every Moonlight reconnect for
the same app (common with Steam Deck sleep/wake cycles), spamming the
notification area.

Fix: per-app boolean flags s_started_notification_fired and
s_stopped_notification_fired that are only reset when the app name
changes.  update_tray_playing() skips the toast and icon refresh if the
started flag is already set for the current app; update_tray_stopped()
likewise dedups the stopped toast.  is_placebo_app() sessions (Desktop,
auto-detached commands) are excluded from toast logic entirely.
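
The dedup rule above (fire once per app, reset only when the app name changes, never fire for placebo sessions) can be sketched like this. The commit describes static flags inside the tray code; the ToastDeduper class and its method names below are hypothetical stand-ins used only to show the state transitions.

```cpp
#include <cassert>
#include <string>

// Illustrative stand-in for the tray notification state (hypothetical names).
class ToastDeduper {
public:
  // True only the first time "started" is reported for the current app.
  bool should_toast_started(const std::string &app, bool is_placebo) {
    if (is_placebo) return false;  // Desktop / auto-detached: never toast
    reset_if_app_changed(app);
    if (started_fired_) return false;
    started_fired_ = true;
    return true;
  }

  // True only the first time "stopped" is reported for the current app.
  bool should_toast_stopped(const std::string &app, bool is_placebo) {
    if (is_placebo) return false;
    reset_if_app_changed(app);
    if (stopped_fired_) return false;
    stopped_fired_ = true;
    return true;
  }

private:
  // Both flags are cleared only when a different app name shows up.
  void reset_if_app_changed(const std::string &app) {
    if (app == current_app_) return;
    current_app_ = app;
    started_fired_ = false;
    stopped_fired_ = false;
  }

  std::string current_app_;
  bool started_fired_ = false;
  bool stopped_fired_ = false;
};
```

One consequence of resetting only on a name change is that stopping and relaunching the same app produces no second toast, which is the trade-off this fix accepts to silence reconnect spam.
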
@Nonary
Owner

Nonary commented Mar 24, 2026

I would not approve this PR as written. It needs to be redone.

The core problem is architectural. This looks like it was vibe coded, and the AI clearly found the better implementation on its own, the Steam path. But because the person directing it told it to implement VB-CABLE, it went ahead and forced that in anyway even though it had already landed on something cleaner. You can see that conflict in the code. The AI knew where it should go, but it defaulted to following the instruction it was given, and the result is worse for it. Instead of committing to the Steam route, it layers on VB-CABLE-specific install logic, device switching, and runtime driver management that have no business being in this codebase.

The VB-CABLE portion is a hard blocker and must be removed entirely.

The VB-CABLE path silently installs a bundled third-party driver from the app at startup and during recovery, rather than going through a proper installer flow with explicit user consent. That is not acceptable. Driver installation is a privileged system change. It belongs in an installer or update flow with clear product ownership, not buried in runtime code as default-on behavior.

This creates a cascade of problems: it expands the app's privileged surface area unnecessarily, introduces avoidable antivirus and endpoint-security risk (including false positives), crosses trust boundaries that runtime code should not cross, and smuggles third-party driver management into a feature PR that should be doing none of this since we already integrate with virtual audio drivers.

This PR also needs to be scoped strictly to microphone support.

There are unrelated commits in here that need to be removed. This should be a microphone support PR and nothing else. Any changes that aren't directly serving that feature, whether that's unrelated refactors, tangential fixes, or extra functionality, need to be pulled out and submitted separately if they have merit. Keep the scope tight so the review can focus on what actually matters.

Beyond the blocker and scope issues, there are four implementation problems that need to be fixed before this is mergeable:

1. Mic packet handling is too loose. The 0x3003 mic packet path needs a real packet format and a validation pass before decode. Right now it fabricates ordering from a process-global counter instead of parsing a real sender sequence, and that's not a safe basis for transport logic. Payloads go into Opus decode without enough preflight: no opus_packet_parse() validation, no meaningful channel validation, and the decode path assumes a hardcoded 20ms buffer. The general transport idea may be fine, but the parser and decode path are too trusting of what they receive.

2. Capture device switching and restore is unsafe. The switching logic snapshots only the current eConsole capture default, then overwrites and restores all roles from that single saved ID. This will silently clobber separate communications-device preferences. On top of that, there's detached retry behavior that can race teardown and flip the default capture device back after the session has already restored it. That means the app can leave the user's system audio state wrong after the session ends, and that is not acceptable.

3. The WASAPI render path is too fragile. It hardcodes float32 / 2ch / 48kHz without negotiating device support first, which won't hold up against real device variability. The device invalidation story is also incomplete. The caller expects a failure signal that the write path never actually returns, so when the render loop dies, future writes just keep queueing into a dead path. This is exactly the kind of silent failure that becomes painful to debug later.

4. Control flow and lifecycle management need restructuring. Stream startup and audio control paths have become deeply nested and branch-heavy, with too much mixed responsibility, too much implicitly coordinated state, and too much lifecycle behavior that depends on retries and detached work rather than structured teardown. This needs a cleaner shape before merge: clearer strategy selection, session-scoped state, and predictable teardown.
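
To illustrate the snapshot problem in item 2, here is a rough sketch of per-role save and restore. It is not Windows code: the DefaultDeviceTable struct is a hypothetical in-memory stand-in for the OS default-capture-device state, and the Role enum only mirrors the three ERole values (eConsole, eMultimedia, eCommunications). A real implementation would read and write each role through the platform audio APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Mirrors the three Windows ERole values; indices are illustrative.
enum Role : std::size_t { Console = 0, Multimedia = 1, Communications = 2, RoleCount = 3 };

// Hypothetical stand-in for the OS table of default capture devices.
struct DefaultDeviceTable {
  std::array<std::string, RoleCount> id;
};

// Session-scoped guard: snapshots every role separately on construction and
// restores each role from its own saved ID on destruction.
class CaptureDefaultGuard {
public:
  CaptureDefaultGuard(DefaultDeviceTable &table, const std::string &override_id)
      : table_(table), saved_(table.id) {
    for (auto &slot : table_.id) slot = override_id;
  }

  ~CaptureDefaultGuard() { table_.id = saved_; }

  CaptureDefaultGuard(const CaptureDefaultGuard &) = delete;
  CaptureDefaultGuard &operator=(const CaptureDefaultGuard &) = delete;

private:
  DefaultDeviceTable &table_;
  std::array<std::string, RoleCount> saved_;
};
```

Tying the restore to the guard's lifetime also speaks to the teardown race: once the session-scoped object is gone, nothing is left that could flip the default again.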

Recommended path forward:

Remove the VB-CABLE portion entirely. Back out the commits that introduced its install and device-management behavior. Remove all unrelated commits so this PR is scoped purely to microphone support. Rebuild the feature around the Steam path. Then fix the transport, parser, render, and lifecycle issues within that narrower, cleaner implementation.
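
For the silent-failure concern in item 3, one possible shape for a write path that actually reports a dead render loop is sketched below. RenderQueue and its methods are hypothetical names, the backlog cap is an assumed policy, and the WASAPI calls themselves are left out so the sketch stays platform-neutral.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical queue between the decode path (producer) and the render
// thread (consumer). Names and the drop-oldest policy are assumptions.
class RenderQueue {
public:
  explicit RenderQueue(std::size_t max_frames) : max_frames_(max_frames) {
    assert(max_frames > 0);
  }

  // Called by the render thread when the device is invalidated or the
  // loop exits for any other unrecoverable reason.
  void mark_dead() {
    std::lock_guard<std::mutex> lock(mutex_);
    dead_ = true;
    frames_.clear();
  }

  // Returns false once the render side is dead, so the caller can stop
  // feeding audio and either reopen the device or tear the session down.
  bool write(std::vector<float> pcm) {
    std::lock_guard<std::mutex> lock(mutex_);
    if (dead_) return false;
    if (frames_.size() >= max_frames_) frames_.pop_front();  // bound the backlog
    frames_.push_back(std::move(pcm));
    return true;
  }

  std::size_t queued() {
    std::lock_guard<std::mutex> lock(mutex_);
    return frames_.size();
  }

private:
  std::mutex mutex_;
  std::deque<std::vector<float>> frames_;
  std::size_t max_frames_;
  bool dead_ = false;
};
```

Checking the flag under the same lock that guards the queue avoids a window where a frame is queued after the render side has already given up.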

@xenstalker02
Author

xenstalker02 commented Mar 24, 2026

Hey Nonary — appreciate the detailed feedback, genuinely. I should be upfront: I'm a UX designer, not a programmer. This whole project was personal — I use Vibepollo every day to stream games to my Steam Deck and just wanted mic passthrough for Discord. Vibe-coded is exactly the right description, and the review makes that clear in retrospect.

On the Steam path — I did try it first, based on what logabell had done with his moonlight-qt fork. I couldn't get it working and ended up on VB-Cable which did work. I don't have a confident technical explanation for why it failed, which is partly the point — someone with more experience would have figured it out. VB-Cable worked, so that's what I shipped.

None of that changes the validity of your technical feedback though. The sequence number fabrication, the eConsole-only snapshot, the hardcoded WASAPI format, the detached retry thread, the missing opus_packet_parse validation, the silent device invalidation — all real problems and I'm not arguing any of them.

I've rebuilt the PR from scratch on a clean branch from your master. Real sender sequence parsed from a 4-byte wire format header stamped by the client, all three ERole values snapshotted and restored independently, WASAPI format negotiation via IsFormatSupported(), render_dead propagation so dead paths don't accumulate silently, opus_packet_parse() validation before decode, and the lifecycle flattened to a clean init lambda. The VB-Cable install stays in the installer where it belongs. The branch is scoped strictly to mic passthrough — mic_sink is a configurable string, not hardcoded to any device.
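
A rough sketch of what parsing a 4-byte sequence header before decode could look like. The exact wire layout is not given in this thread, so everything specific here is an assumption: a big-endian 32-bit sequence number followed directly by the Opus payload, and a 1500-byte payload cap. The names MicPacket, parse_mic_packet and seq_newer are hypothetical. In a real handler the payload would additionally go through opus_packet_parse() before reaching the decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed layout: [4-byte big-endian sequence][Opus payload].
struct MicPacket {
  std::uint32_t seq;
  const std::uint8_t *payload;  // points into the caller's buffer
  std::size_t payload_len;
};

constexpr std::size_t kHeaderLen = 4;
constexpr std::size_t kMaxPayloadLen = 1500;  // assumed cap, not from the PR

// Structural checks only; returns false for runt or oversized packets.
bool parse_mic_packet(const std::uint8_t *data, std::size_t len, MicPacket &out) {
  if (data == nullptr || len <= kHeaderLen) return false;
  if (len - kHeaderLen > kMaxPayloadLen) return false;
  out.seq = (static_cast<std::uint32_t>(data[0]) << 24) |
            (static_cast<std::uint32_t>(data[1]) << 16) |
            (static_cast<std::uint32_t>(data[2]) << 8) |
            static_cast<std::uint32_t>(data[3]);
  out.payload = data + kHeaderLen;
  out.payload_len = len - kHeaderLen;
  return true;
}

// Wraparound-aware ordering: true if a is newer than b. Relies on the usual
// two's-complement conversion of the unsigned difference to a signed value.
bool seq_newer(std::uint32_t a, std::uint32_t b) {
  return static_cast<std::int32_t>(a - b) > 0;
}
```

Because the sequence comes from the sender rather than a host-side counter, reordered and duplicated packets can be detected instead of being silently played in arrival order.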

Thanks for taking the time to actually review it instead of just closing it.

@xenstalker02
Author

Superseded by #169
