A powerful audio stem separation tool built with Python, using the Demucs deep learning model to separate audio tracks into drums, bass, vocals, and other instruments.
- High-quality audio stem separation
- User-friendly graphical interface
- Real-time progress tracking
- Support for MP3, WAV, and FLAC files
- Enhanced vocal processing
- Memory-efficient chunked processing
- CUDA GPU acceleration support
- Python 3.8 or higher
- NVIDIA GPU (optional, for faster processing)
- FFmpeg installed on your system
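The requirements above can be checked before installing anything else. The script below is a minimal sketch that uses only the standard library; the function name `check_prerequisites` is illustrative and is not part of this project.

```python
import shutil
import sys


def check_prerequisites(min_version=(3, 8)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    # Compare only (major, minor) against the documented minimum.
    if sys.version_info[:2] < min_version:
        problems.append("Python %d.%d or higher is required" % min_version)
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg was not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing:", problem)
```

An NVIDIA GPU is optional, so the sketch does not test for one.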
This application requires different PyTorch installations depending on your platform:
Windows (NVIDIA GPU):
- Clone or download this repository
- Run setup_windows_cuda.bat for automatic GPU-enabled setup
- Or manually install CUDA-enabled PyTorch:
# Create virtual environment
python -m venv venv_windows
venv_windows\Scripts\activate
# Install CUDA-enabled PyTorch
pip install torch==2.1.0+cu118 torchvision==0.16.0+cu118 torchaudio==2.1.0+cu118 --index-url https://download.pytorch.org/whl/cu118
# Install other dependencies
pip install -r requirements.txt
macOS:
- Clone or download this repository
- Run ./setup_macos.sh for automatic setup
- Or manually install CPU-only PyTorch:
# Install Python 3.12 (required for PyTorch compatibility)
brew install python@3.12
# Create virtual environment (Homebrew's prefix is /usr/local on Intel Macs and /opt/homebrew on Apple Silicon)
"$(brew --prefix)/bin/python3.12" -m venv venv_macos
source venv_macos/bin/activate
# Install CPU-only PyTorch
pip install torch torchvision torchaudio
# Install other dependencies
pip install -r requirements.txt
The old setup.bat and run.bat files may not work with newer Python versions.
Start the program:
- Double-click run.bat (Windows)
- Or run:
python main.py
Using the application:
- Click "Upload Audio File" to select your audio file
- Choose an output directory
- Click "Process and Separate Stems"
- Wait for processing to complete
- Find your separated stems in the output directory
The program generates four separate audio files:
- drums.wav: Drum tracks
- bass.wav: Bass tracks
- vocals.wav: Vocal tracks
- other.wav: Other instruments
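To confirm that a run produced all four files, the output directory can be checked with a few lines of Python. This is a sketch that assumes the stems are written directly into the chosen output directory rather than into a subfolder; `missing_stems` is a hypothetical helper, not a function of this project.

```python
from pathlib import Path

# Filenames as listed in this README.
EXPECTED_STEMS = ("drums.wav", "bass.wav", "vocals.wav", "other.wav")


def missing_stems(output_dir):
    """Return the expected stem filenames that are absent from output_dir."""
    out = Path(output_dir)
    return [name for name in EXPECTED_STEMS if not (out / name).is_file()]
```

An empty list means every stem is present.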
Memory Issues
- The program automatically handles large files through chunk processing
- For very large files, make sure at least 8 GB of RAM is available
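A rough calculation shows why long files benefit from chunking. The sketch below counts only the raw sample buffers, assuming 44.1 kHz stereo audio held as float32; the memory used by the model itself comes on top of this and is not estimated here.

```python
def raw_audio_bytes(seconds, sample_rate=44100, channels=2, bytes_per_sample=4):
    """Size of an uncompressed sample buffer, excluding model memory."""
    return int(seconds * sample_rate) * channels * bytes_per_sample


ten_minutes = raw_audio_bytes(10 * 60)
print(ten_minutes)      # 211680000 bytes for the input mix
print(ten_minutes * 4)  # 846720000 bytes for four stems of the same length
```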
CUDA/GPU Issues
Windows (NVIDIA GPU):
- Make sure you have NVIDIA GPU drivers installed
- IMPORTANT: Install CUDA-enabled PyTorch for GPU acceleration:
# Uninstall CPU-only PyTorch first
pip uninstall torch torchvision torchaudio -y
# Install CUDA-enabled PyTorch
pip install torch==2.1.0+cu118 torchvision==0.16.0+cu118 torchaudio==2.1.0+cu118 --index-url https://download.pytorch.org/whl/cu118
- Or run setup_windows_cuda.bat for automatic CUDA setup
- The program will automatically fall back to CPU if GPU is unavailable
- Check GPU detection in the application log output
macOS:
- CUDA is not supported on macOS, so use CPU-only PyTorch
- Run ./setup_macos.sh for automatic setup
- The application will automatically use CPU processing
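To check outside the application whether the installed PyTorch build can see a GPU, a short script is enough. This is a sketch of the usual pattern for choosing between GPU and CPU; how the application itself performs the check is not shown in this README, and `pick_device` is a hypothetical name.

```python
def pick_device():
    """Return "cuda" when a CUDA build of PyTorch sees a GPU, else "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all, so nothing can run on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Using device:", pick_device())
```

On Windows, a result of "cpu" despite an NVIDIA GPU being present usually points to a CPU-only PyTorch build or missing drivers.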
Audio Quality
- For best results, use high-quality input files (WAV/FLAC)
- Output files are saved in high-quality float32 WAV format
- Uses the Demucs htdemucs_ft model (fine-tuned Hybrid Transformer Demucs) for separation
- Implements chunked processing for memory efficiency
- Features crossfade processing for seamless stem combination
- Enhanced vocal processing with stereo field preservation
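The combination of chunked processing and crossfading can be illustrated with a small NumPy sketch. It splits a mono signal into overlapping chunks, applies a per-chunk function, and blends the overlaps with linear ramps. The chunk size, overlap, and fade shape used by this application are not stated here, so the values and the name `process_in_chunks` are assumptions chosen for illustration.

```python
import numpy as np


def process_in_chunks(audio, chunk_len, overlap, fn):
    """Apply fn to overlapping chunks of a 1-D array and crossfade the joins."""
    if not 0 <= overlap < chunk_len:
        raise ValueError("overlap must be in [0, chunk_len)")
    hop = chunk_len - overlap
    out = np.zeros(len(audio), dtype=np.float64)
    weight = np.zeros(len(audio), dtype=np.float64)
    for start in range(0, len(audio), hop):
        chunk = audio[start:start + chunk_len]
        n = len(chunk)
        # Linear ramps at both ends, strictly positive so the summed
        # weights are never zero.
        ramp = np.ones(n)
        k = min(overlap, n)
        if k:
            fade = np.arange(1, k + 1) / (k + 1)
            ramp[:k] *= fade
            ramp[n - k:] *= fade[::-1]
        # fn must return an array with the same length as its input.
        out[start:start + n] += fn(chunk) * ramp
        weight[start:start + n] += ramp
    # Dividing by the summed weights makes the blend sum to one everywhere.
    return out / weight
```

Only one chunk is passed to `fn` at a time, which keeps peak memory bounded by the chunk length. With an identity function for `fn`, the output matches the input, which is a convenient sanity check for the crossfade.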
This project is licensed under the MIT License - see the LICENSE file for details.
The MIT License is a short, permissive license. It lets people do almost anything with the code as long as they retain the copyright and license notice and do not hold the authors liable.
- Demucs by Meta Research (Licensed under MIT)
- GUI built with PyQt5 (Licensed under GPL v3)
- Audio processing using librosa and soundfile
Feel free to submit issues and enhancement requests!