SingingPracticeTool

A desktop app for solo vocal practice: sing along to an instrumental with live pitch on a piano roll, run your mic through a VST chain for monitoring, extract stems from any song or YouTube URL, and transcribe vocal takes to MIDI.

Status: Windows-first. macOS support is planned but not yet implemented (SidecarProcess has a stub for the macOS branch; the C++ host is otherwise portable).

Features

Practice tab

Sing along to an instrumental while the piano roll scrolls your live pitch against a reference MIDI. The mic runs through a VST3 plugin chain for monitoring. Click, drag, or scroll the mouse wheel over the timeline to scrub.

| Layer | What it does | Library / source |
| --- | --- | --- |
| Audio I/O | Device + driver enumeration, low-latency callbacks | `juce::AudioDeviceManager` (ASIO + WASAPI on Windows) |
| Pitch detection | Real-time fundamental-frequency tracking | Custom YIN-style detector (`Source/Pitch/PitchDetector.cpp`), RT-safe (no allocs/locks) |
| Transport | Sample-rate-aware playback + scrub | `juce::AudioTransportSource` |
| VST hosting | Load + run a chain of VST3 plugins on the mic | `juce::AudioPluginFormatManager` (VST3 only) |
| Piano roll + scrub | Live pitch trace, MIDI overlay, draggable timeline | Custom JUCE components: `PianoRoll`, `TransportScrubber` |
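
To illustrate the YIN-style approach named in the table, here is a minimal sketch in pure Python. It is not the code in `Source/Pitch/PitchDetector.cpp` (which is C++ and real-time safe; this sketch allocates freely), and the function names, the 0.15 threshold, and the 80 to 1000 Hz search range are assumptions chosen for illustration. `hz_to_midi` shows the standard conversion a piano roll needs to place a frequency on the note grid.

```python
import math

def yin_pitch(frame, sample_rate, fmin=80.0, fmax=1000.0, threshold=0.15):
    """Estimate the fundamental of `frame` in Hz, or return None if unvoiced."""
    tau_min = max(2, int(sample_rate / fmax))
    tau_max = min(int(sample_rate / fmin), len(frame) // 2)
    if tau_max <= tau_min:
        return None
    window = len(frame) - tau_max

    # Step 1: squared-difference function d(tau) for each candidate lag.
    diff = [0.0] * (tau_max + 1)
    for tau in range(1, tau_max + 1):
        diff[tau] = sum((frame[i] - frame[i + tau]) ** 2 for i in range(window))

    # Step 2: cumulative mean normalised difference d'(tau).
    cmnd = [1.0] * (tau_max + 1)
    running = 0.0
    for tau in range(1, tau_max + 1):
        running += diff[tau]
        cmnd[tau] = diff[tau] * tau / running if running > 0 else 1.0

    # Step 3: first lag under the threshold, then walk down to the local minimum.
    tau = tau_min
    while tau < tau_max:
        if cmnd[tau] < threshold:
            while tau + 1 < tau_max and cmnd[tau + 1] < cmnd[tau]:
                tau += 1
            break
        tau += 1
    else:
        return None  # no lag was periodic enough: treat the frame as unvoiced

    # Step 4: parabolic interpolation around the minimum for sub-sample accuracy.
    a, b, c = cmnd[tau - 1], cmnd[tau], cmnd[tau + 1]
    denom = a - 2 * b + c
    shift = 0.5 * (a - c) / denom if denom != 0 else 0.0
    return sample_rate / (tau + shift)

def hz_to_midi(freq_hz):
    """Convert a frequency to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)
```

A frame that returns `None` would simply leave a gap in the pitch trace; a voiced frame maps through `hz_to_midi` to a vertical position on the roll.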

Stem Extract tab

Drop a file or paste a YouTube URL → get vocals / drums / bass / other (or vocals + accompaniment) as WAV stems. Runs on CUDA or CPU.

| Layer | What it does | Library |
| --- | --- | --- |
| Source separation | The actual `htdemucs` / `htdemucs_ft` / `mdx_extra` model inference | Demucs 4 (Meta) |
| Compute backend | Tensor math on GPU or CPU | PyTorch (CUDA 12.8 wheel for Blackwell, falls back to CPU) |
| Audio decode | Read any format ffmpeg understands | `ffmpeg` subprocess via `demucs.audio.AudioFile` |
| Audio encode | Write stem WAVs | soundfile (libsndfile); bypasses `torchaudio.save` to avoid the torchcodec / FFmpeg-shared-lib chain |
| YouTube ingest | Download a single video (no playlists) | yt-dlp + ffmpeg postprocessor |
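
For orientation, roughly equivalent command-line invocations can be sketched as below. This is an illustration only: the table above suggests the sidecar drives Demucs in-process and writes stems with soundfile, and the helper names `ytdlp_command` and `demucs_command` are hypothetical. The flags themselves (`--no-playlist`, `-x`, `--audio-format` for yt-dlp; `-n`, `-d`, `-o`, `--two-stems` for Demucs) are standard options of those tools.

```python
import sys

def ytdlp_command(url, out_template):
    """Build a yt-dlp argument list that fetches one video's audio as WAV."""
    return [
        "yt-dlp",
        "--no-playlist",          # a single video only, even if the URL is in a playlist
        "-x",                     # extract audio (uses the ffmpeg postprocessor)
        "--audio-format", "wav",
        "-o", out_template,
        url,
    ]

def demucs_command(track, out_dir, model="htdemucs", two_stems=False, device="cpu"):
    """Build a Demucs CLI argument list for the current Python interpreter."""
    cmd = [sys.executable, "-m", "demucs", "-n", model, "-d", device, "-o", out_dir]
    if two_stems:
        # vocals + accompaniment instead of vocals / drums / bass / other
        cmd += ["--two-stems", "vocals"]
    cmd.append(track)
    return cmd
```

Either list can be handed to `subprocess.run(cmd, check=True)`; passing a list rather than a shell string avoids quoting problems with unusual filenames.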

Vocal → MIDI tab

Transcribe a vocal take to a .mid file with adjustable onset / frame / min-note-length thresholds.

| Layer | What it does | Library |
| --- | --- | --- |
| Transcription | Spotify's ICASSP 2022 multi-pitch model | basic-pitch, ONNX backend (not the default TF) |
| Compute | Runs the model graph | ONNX Runtime (CPU; session cached at module scope so repeat calls amortise the ~500 ms load) |
| Audio decode | Pull samples + resample | librosa |
| MIDI export | Serialise notes to a `.mid` file | pretty_midi |

Settings tab

Audio device, driver (ASIO / WASAPI), sample rate, buffer size, channel routing.

| Layer | What it does | Library |
| --- | --- | --- |
| Device UI | The device-picker composite | `juce::AudioDeviceSelectorComponent` |
| Persistence | Save/restore on every device change | `AudioDeviceManager::createStateXml` → `%APPDATA%\SingingPracticeTool\device_settings.xml` (hash-deduped so a burst of broadcasts doesn't thrash the disk) |
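
The hash-deduped save can be sketched as follows. The real implementation lives in the C++ host; this Python version only illustrates the idea, and the class name `DedupedWriter` and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path

class DedupedWriter:
    """Write text to a file only when its content differs from the last write."""

    def __init__(self, path):
        self.path = Path(path)
        self._last_digest = None
        self.writes = 0  # number of writes that actually reached the disk

    def save(self, text):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest == self._last_digest:
            return False  # identical state: skip the disk write
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(text, encoding="utf-8")
        self._last_digest = digest
        self.writes += 1
        return True
```

A burst of change notifications that all serialise to the same XML then costs one write instead of many.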

Shell + look

UI theme — custom dark palette with a teal accent, in Source/UI/ModernLookAndFeel.cpp. Subclasses juce::LookAndFeel_V4; rounded buttons / inputs, slim sliders, minimal underline-style tabs.
Sidecar transport — line-delimited JSON-RPC 2.0 over real Win32 pipes (since juce::ChildProcess can't write to a child's stdin). UTF-8 enforced on both sides so unicode filenames survive a YouTube → Stem → Vocal→MIDI round-trip.
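
As a sketch of what "line-delimited, UTF-8" means on the wire, the helpers below frame one JSON-RPC 2.0 request per line. The function names and the `stems.separate` method in the usage note are hypothetical; only the framing rules come from the description above.

```python
import json

def encode_request(request_id, method, params):
    """Serialise one JSON-RPC 2.0 request as a single UTF-8 line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    # ensure_ascii=False keeps non-ASCII characters as real UTF-8 bytes.
    # json.dumps escapes newlines inside strings, so the message stays on one line.
    return (json.dumps(message, ensure_ascii=False) + "\n").encode("utf-8")

def decode_line(raw):
    """Parse one received line (bytes) back into a Python dict."""
    return json.loads(raw.decode("utf-8"))
```

For example, `encode_request(1, "stems.separate", {"path": "été.wav"})` yields exactly one newline-terminated line, and `decode_line` recovers the same filename on the other side.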

Install

The simplest path — grab the latest Windows installer from GitHub Releases:

Download SingingPracticeTool-Setup-<version>.exe.
First launch creates %APPDATA%\SingingPracticeTool\ for device settings and ~\Music\SingingPracticeTool\ for recordings / stems / MIDI.

The installer is unsigned, so Windows will show a "Windows protected your PC" dialog ("Microsoft Defender SmartScreen prevented an unrecognized app from starting") on first run. Click More info → Run anyway. Code signing is on the roadmap.

If you want CUDA-accelerated stem separation / vocal-to-MIDI, install a recent NVIDIA driver — the bundled PyTorch + ONNX Runtime detect CUDA at runtime and fall back to CPU automatically.

Build from source

Building yourself only makes sense if you want to modify the code. For everyday use, the installer above is faster.

Prerequisites (Windows)

Visual Studio 2022 Build Tools (or full VS) with the Desktop development with C++ workload — provides MSVC + Windows SDK.
CMake ≥ 3.22 — https://cmake.org/download/
Python 3.13 — https://www.python.org/downloads/ (3.11 / 3.12 will also work, but the sidecar install recipe below assumes 3.13.)
ffmpeg — required by yt-dlp and Demucs.
```
choco install ffmpeg
```
Or download a static build and put ffmpeg.exe on PATH.
NVIDIA GPU + recent driver — optional, only if you want CUDA-accelerated Demucs / Vocal→MIDI. Without one, the sidecar falls back to CPU.

1. Clone

git clone https://github.com/VanKyle00/SingingPracticeTool.git
cd SingingPracticeTool

2. ASIO SDK (optional, recommended for low-latency monitoring)

The Steinberg ASIO SDK is not redistributable, so you must fetch it yourself:

Download asiosdk_2.3.3_2019-06-14.zip from https://www.steinberg.net/asiosdk.
Unzip so that the path third_party/asiosdk/common/iasiodrv.h exists.

CMake auto-detects the SDK and enables JUCE's ASIO backend. Without it, JUCE falls back to WASAPI (still usable, just higher latency).

3. Configure + build the C++ host

cmake -S . -B build -G "Visual Studio 17 2022"
cmake --build build --config RelWithDebInfo --target App

The first configure clones JUCE 8.0.4 via FetchContent (~500 MB, ~45 s). Incremental rebuilds take seconds.

Output: build\App_artefacts\RelWithDebInfo\SingingPracticeTool.exe.

4. Python sidecar

The sidecar handles Demucs (stem separation), yt-dlp (YouTube), and basic-pitch (vocal-to-MIDI). It runs as a child process and talks JSON-RPC over pipes.

# From the project root
py -3.13 -m venv sidecar\.venv
sidecar\.venv\Scripts\python.exe -m pip install --upgrade pip

# 1. PyTorch (CUDA 12.8 build for RTX 30/40/50 series; without an NVIDIA GPU,
#    use the CPU index URL https://download.pytorch.org/whl/cpu instead).
sidecar\.venv\Scripts\python.exe -m pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128

# 2. Core deps (Demucs, yt-dlp, soundfile, torchcodec, basic-pitch's runtime
#    deps). The requirements file deliberately excludes basic-pitch itself —
#    its package metadata pins `tensorflow<2.15.1` and `resampy<0.4.3`,
#    neither of which builds on Python 3.13.
sidecar\.venv\Scripts\python.exe -m pip install -r sidecar\requirements.txt

# 3. basic-pitch with --no-deps (works fine — we already installed its real
#    runtime deps above).
sidecar\.venv\Scripts\python.exe -m pip install --no-deps basic-pitch==0.4.0

Expected: CUDA: True (if you have an NVIDIA GPU) and basic-pitch OK.
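
One way to produce a report in that shape is a small check script run with the venv's interpreter. The original verification command is not shown here, so this is a stand-in sketch; `check_environment` is a hypothetical name, and the messages for the failure cases are assumptions.

```python
import importlib

def check_environment():
    """Return one status line per sidecar dependency worth checking."""
    lines = []
    try:
        torch = importlib.import_module("torch")
        lines.append(f"CUDA: {torch.cuda.is_available()}")
    except Exception as exc:  # torch missing or failing to initialise
        lines.append(f"CUDA: unavailable ({type(exc).__name__})")
    try:
        importlib.import_module("basic_pitch")
        lines.append("basic-pitch OK")
    except Exception as exc:  # basic-pitch missing or failing to import
        lines.append(f"basic-pitch FAILED ({type(exc).__name__})")
    return lines

if __name__ == "__main__":
    print("\n".join(check_environment()))
```

Saved as, say, `check_env.py`, it would be run as `sidecar\.venv\Scripts\python.exe check_env.py`.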

Running

Launch build\App_artefacts\RelWithDebInfo\SingingPracticeTool.exe.

The host walks up from the exe looking for sidecar/.venv/Scripts/python.exe, so as long as the venv is in the source tree (it is by default), the sidecar spawns automatically on first use of the Stem Extract or Vocal→MIDI tabs.
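
The lookup described above can be sketched like this. The host does it in C++; the Python version below only illustrates the search order, and `find_sidecar_python` is a hypothetical name.

```python
from pathlib import Path

# Relative location of the venv interpreter inside the source tree.
SIDECAR_PYTHON = Path("sidecar") / ".venv" / "Scripts" / "python.exe"

def find_sidecar_python(exe_path):
    """Walk up from the executable's directory until the venv interpreter is found."""
    start = Path(exe_path).resolve().parent
    for directory in (start, *start.parents):
        candidate = directory / SIDECAR_PYTHON
        if candidate.is_file():
            return candidate
    return None  # not found: the venv is missing or outside the exe's ancestors
```

With the default layout, starting from `build\App_artefacts\RelWithDebInfo\` the search reaches the project root after three steps up and finds the venv there.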

First-run defaults

Output recordings: ~\Music\SingingPracticeTool\vocal_dry_<timestamp>.wav
Stems: ~\Music\SingingPracticeTool\Stems\
MIDI: ~\Music\SingingPracticeTool\MIDI\
Device settings: %APPDATA%\SingingPracticeTool\device_settings.xml

Architecture

C++ host: JUCE 8 + CMake. Source/ is organised by concern (App/, Audio/, Pitch/, UI/, VST/, Sidecar/).
Sidecar: Python 3 package at sidecar/practiceml/. Line-delimited JSON-RPC 2.0 over the child's stdin/stdout (UTF-8). Real Win32 pipes — juce::ChildProcess can't write to stdin, so the transport is hand-rolled.
Audio thread invariant: no allocations, no locks (try-lock or lock-free atomics), no logging. Enforced in Engine::audioDeviceIOCallback and PitchDetector::process.
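
On the sidecar side, a line-delimited JSON-RPC 2.0 loop could look roughly like the sketch below. It is not the code in `sidecar/practiceml/`: the function names are hypothetical, params are assumed to be passed by name, and -32000 is used as a generic server error code. The other error codes are the ones the JSON-RPC 2.0 specification defines.

```python
import json

def _reply(request_id, body):
    """Wrap a result or error body in a JSON-RPC 2.0 response line."""
    return json.dumps({"jsonrpc": "2.0", "id": request_id, **body}, ensure_ascii=False)

def handle_line(line, methods):
    """Process one request line and return one response line (no newline)."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError:
        return _reply(None, {"error": {"code": -32700, "message": "Parse error"}})
    if not isinstance(request, dict):
        return _reply(None, {"error": {"code": -32600, "message": "Invalid Request"}})
    handler = methods.get(request.get("method"))
    if handler is None:
        body = {"error": {"code": -32601, "message": "Method not found"}}
    else:
        try:
            body = {"result": handler(**request.get("params", {}))}
        except Exception as exc:  # report handler failures instead of crashing the loop
            body = {"error": {"code": -32000, "message": str(exc)}}
    return _reply(request.get("id"), body)

def serve(stdin, stdout, methods):
    """Read requests line by line and flush each response immediately."""
    for line in stdin:
        if line.strip():
            stdout.write(handle_line(line, methods) + "\n")
            stdout.flush()
```

In a real process, `sys.stdin.reconfigure(encoding="utf-8")` and `sys.stdout.reconfigure(encoding="utf-8")` before calling `serve(sys.stdin, sys.stdout, methods)` would pin the encoding regardless of the Windows console code page.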

See CLAUDE.md for deeper architectural notes and slice history.

License

GPLv3 — see LICENSE.

The C++ host depends on JUCE 8, which is dual-licensed (GPLv3 / commercial). Linking against JUCE under GPLv3 requires this project to be GPLv3. For a permissive license you'd need a JUCE commercial license.

ASIO is a trademark and software of Steinberg Media Technologies GmbH; the ASIO SDK is not redistributed with this repo.
