Commit a67348a

devnen and claude committed
Fix installation across all platforms: ONNX builds, MPS float64, Docker CPU
- Install chatterbox with --no-deps on ALL paths (CPU, NVIDIA, cu128, ROCm) to prevent ONNX source build failures and torch version conflicts. Chatterbox dependencies (conformer, diffusers, transformers, s3tokenizer, omegaconf, resampy) are now listed explicitly in all requirements files. onnx==1.16.0 is pinned to guarantee pre-built wheels.
- Fix Apple Silicon Turbo model crash ("Cannot convert a MPS Tensor to float64 dtype") by forcing float32 in s3tokenizer and voice_encoder. Applied in chatterbox-v2 fork (cc03573) and as automatic post-install patch in start.py for users of other chatterbox versions.
- New lightweight Dockerfile.cpu based on python:3.10-slim instead of the 4GB+ nvidia/cuda base image. docker-compose-cpu.yml updated.
- Default config.yaml device changed from "cuda" to "auto" for correct auto-detection on all hardware (CUDA, MPS, CPU).
- Removed deprecated docker-compose version tags from all compose files.
- Updated README: Python 3.10 recommended (3.13+ not supported), manual install instructions include --no-deps step, new troubleshooting entries for ONNX builds and torch version errors.

Fixes #23, #44, #79, #93, #105, #107, #113, #121

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent b181edb commit a67348a
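
The Apple Silicon fix described in the commit message comes down to keeping float64 off the MPS backend. The actual patch in the chatterbox-v2 fork and in `start.py` is not shown in this diff, so the following is only a minimal sketch of the idea; the helper name `mps_safe_dtype` and the use of dtype names as strings are assumptions made for illustration.

```python
def mps_safe_dtype(dtype: str, device: str) -> str:
    """Return the dtype name to use on `device`.

    MPS tensors cannot be float64, so float64 is downgraded to float32
    on that backend only. Hypothetical helper for illustration; this is
    not the fork's actual patch.
    """
    if device == "mps" and dtype == "float64":
        return "float32"
    return dtype


if __name__ == "__main__":
    for device in ("cuda", "mps", "cpu"):
        print(device, mps_safe_dtype("float64", device))
```

A caller would typically cast its data to the returned dtype before moving it to the device, for example `arr.astype(mps_safe_dtype(str(arr.dtype), device))` for a NumPy array.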

12 files changed

Lines changed: 227 additions & 37 deletions

Dockerfile

Lines changed: 10 additions & 7 deletions
@@ -31,15 +31,18 @@ WORKDIR /app
 # Copy requirements first to leverage Docker cache
 COPY requirements.txt .
 
-# Upgrade pip and install Python dependencies
-RUN pip3 install --no-cache-dir --upgrade pip && \
-    pip3 install --no-cache-dir -r requirements.txt
-# Conditionally install NVIDIA dependencies if RUNTIME is set to 'nvidia'
+# Install dependencies:
+# 1. Base requirements (CPU torch + all server deps + chatterbox deps)
+# 2. Conditionally install NVIDIA CUDA torch (overrides CPU torch)
+# 3. Chatterbox with --no-deps to prevent pip from pulling conflicting torch/onnx
 COPY requirements-nvidia.txt .
 
-RUN if [ "$RUNTIME" = "nvidia" ]; then \
-    pip3 install --no-cache-dir -r requirements-nvidia.txt; \
-    fi
+RUN pip3 install --no-cache-dir --upgrade pip && \
+    pip3 install --no-cache-dir -r requirements.txt && \
+    if [ "$RUNTIME" = "nvidia" ]; then \
+        pip3 install --no-cache-dir -r requirements-nvidia.txt; \
+    fi && \
+    pip3 install --no-cache-dir --no-deps git+https://github.com/devnen/chatterbox-v2.git@master
 # Copy the rest of the application code
 COPY . .
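
The three-step order in the new `RUN` instruction (base requirements, optional NVIDIA requirements, then chatterbox with `--no-deps`) can be sketched in Python as follows. This is an illustration under assumptions, not the logic in `start.py`: the function name `build_install_commands` and the `runtime` parameter are hypothetical, and the commands are only built and printed, never executed.

```python
import shlex
import sys

CHATTERBOX_URL = "git+https://github.com/devnen/chatterbox-v2.git@master"


def build_install_commands(runtime: str = "cpu") -> list[list[str]]:
    """Return pip invocations in the order used by the Dockerfile.

    Hypothetical helper: mirrors the RUN instruction above, nothing more.
    """
    pip = [sys.executable, "-m", "pip", "install", "--no-cache-dir"]
    commands = [
        pip + ["--upgrade", "pip"],
        pip + ["-r", "requirements.txt"],
    ]
    if runtime == "nvidia":
        # CUDA torch is installed after the base file so it overrides CPU torch.
        commands.append(pip + ["-r", "requirements-nvidia.txt"])
    # Chatterbox always comes last and without its declared dependencies,
    # so pip cannot replace the torch/onnx versions installed above.
    commands.append(pip + ["--no-deps", CHATTERBOX_URL])
    return commands


if __name__ == "__main__":
    for command in build_install_commands("nvidia"):
        print(shlex.join(command))
```

Keeping the chatterbox step last matters: any earlier position would let a later requirements file, rather than `--no-deps`, decide which torch build ends up installed.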

Dockerfile.cpu

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+FROM python:3.10-slim
+
+# Set environment variables
+ENV PYTHONDONTWRITEBYTECODE=1
+ENV PYTHONUNBUFFERED=1
+ENV DEBIAN_FRONTEND=noninteractive
+# Set the Hugging Face home directory for better model caching
+ENV HF_HOME=/app/hf_cache
+
+# Install system dependencies
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    build-essential \
+    libsndfile1 \
+    ffmpeg \
+    git \
+    && apt-get clean \
+    && rm -rf /var/lib/apt/lists/*
+
+# Set up working directory
+WORKDIR /app
+
+# Copy requirements first to leverage Docker cache
+COPY requirements.txt ./requirements.txt
+
+# Install dependencies:
+# 1. Requirements file installs CPU torch + all server dependencies
+# 2. Chatterbox with --no-deps to prevent pip from pulling conflicting versions
+RUN python3 -m pip install --no-cache-dir --upgrade pip && \
+    python3 -m pip install --no-cache-dir -r requirements.txt && \
+    python3 -m pip install --no-cache-dir --no-deps git+https://github.com/devnen/chatterbox-v2.git@master
+
+# Copy the rest of the application code
+COPY . .
+
+# Create required directories for the application
+RUN mkdir -p model_cache reference_audio outputs voices logs hf_cache
+
+# Expose the port the application will run on
+EXPOSE 8004
+
+# Command to run the application
+CMD ["python3", "server.py"]

README.md

Lines changed: 19 additions & 9 deletions
@@ -10,7 +10,7 @@ This server is based on the architecture and UI of our [Dia-TTS-Server](https://
 
 [![Project Link](https://img.shields.io/badge/GitHub-devnen/Chatterbox--TTS--Server-blue?style=for-the-badge&logo=github)](https://github.com/devnen/Chatterbox-TTS-Server)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](LICENSE)
-[![Python Version](https://img.shields.io/badge/Python-3.10+-blue.svg?style=for-the-badge)](https://www.python.org/downloads/)
+[![Python Version](https://img.shields.io/badge/Python-3.10_(recommended)-blue.svg?style=for-the-badge)](https://www.python.org/downloads/release/python-31011/)
 [![Framework](https://img.shields.io/badge/Framework-FastAPI-green.svg?style=for-the-badge)](https://fastapi.tiangolo.com/)
 [![Model Source](https://img.shields.io/badge/Model-ResembleAI/chatterbox-orange.svg?style=for-the-badge)](https://github.com/resemble-ai/chatterbox)
 [![Docker](https://img.shields.io/badge/Docker-Supported-blue.svg?style=for-the-badge)](https://www.docker.com/)
@@ -81,11 +81,16 @@ This server is based on the architecture and UI of our [Dia-TTS-Server](https://
 
 **Switching models is effortless:** Simply select your preferred model from the engine selector dropdown at the top of the Web UI. No restarts, no configuration changes required—just instant hot-swapping to test quality, speed, and language support across the complete Chatterbox family.
 
-### 🖥️ Fixed NVIDIA Blackwell / CUDA 12.8 and AMD ROCm installation
+### 🖥️ Installation fixes across all platforms
 
+- **All platforms:** Chatterbox is now installed with `--no-deps` across all installation paths (CPU, NVIDIA, cu128, ROCm). This eliminates ONNX source build failures, torch version conflicts, and CMake errors that affected many users. Chatterbox's dependencies (conformer, diffusers, transformers, s3tokenizer, etc.) are now listed explicitly in each requirements file with `onnx==1.16.0` pinned to guarantee pre-built wheels.
+- **Apple Silicon / MPS:** Fixed Turbo model crash ("Cannot convert a MPS Tensor to float64 dtype") by forcing float32 in s3tokenizer and voice_encoder. Fix applied in the chatterbox-v2 fork and also as an automatic post-install patch in `start.py` for users of other chatterbox versions. Thanks to @jonas3245 (#93).
+- **Docker CPU:** New lightweight `Dockerfile.cpu` based on `python:3.10-slim` instead of the 4GB+ NVIDIA CUDA base image. `docker-compose-cpu.yml` now uses this smaller image. Removed deprecated `version` tags from all docker-compose files.
+- **config.yaml:** Default device changed from `cuda` to `auto` for correct auto-detection on all hardware (CUDA, MPS, CPU).
+- **Python version:** Clarified that **Python 3.10 is recommended**. Python 3.13+ is not supported due to missing wheels for torch and ONNX. The Windows launcher's Portable Mode handles this automatically.
 - **Blackwell (CUDA 12.8):** Fixed `requirements-nvidia-cu128.txt` to properly install PyTorch 2.9.0 with CUDA 12.8 (`sm_120` support) for RTX 5060 Ti, 5070, 5070 Ti, 5080, and 5090 GPUs. The `Dockerfile.cu128` now correctly installs chatterbox with `--no-deps` to prevent PyTorch downgrade.
 - **AMD ROCm:** Fixed ROCm installation by switching to PyTorch's official ROCm 6.1 wheel index (`torch==2.5.1+rocm6.1`), which resolves the previous `torch==2.6.0` / `torchaudio==2.5.1` version conflict. A new `requirements-rocm-init.txt` installs the ROCm PyTorch stack before other dependencies. Both `Dockerfile.rocm` and `start.py` now use a two-step install to prevent pip from replacing ROCm torch wheels with CPU-only versions.
-- Thanks to community contributors in issues #20, #58, #64, #89, #92, #98, #109, #114, and #122 for testing and reporting solutions.
+- Thanks to community contributors in issues #20, #23, #44, #58, #64, #79, #89, #92, #93, #98, #105, #107, #109, #113, #114, #121, and #122 for testing and reporting solutions.
 
 ### 🧰 Automated launcher + easy updates
 
@@ -225,7 +230,7 @@ This server application enhances the underlying `chatterbox-tts` engine with the
 ## 🔩 System Prerequisites
 
 * **Operating System:** Windows 10/11 (64-bit) or Linux (Debian/Ubuntu recommended).
-* **Python:** Version 3.10 or later ([Download](https://www.python.org/downloads/)). *When using Portable Mode on Windows, Python is only needed on the machine where you first set up the application. The target machine (where you copy/share the folder to) does not need Python installed at all.*
+* **Python:** Version **3.10 recommended** ([Download](https://www.python.org/downloads/release/python-31011/)). Python 3.11 and 3.12 also work but may require building some dependencies from source. **Python 3.13+ is not supported** — several key dependencies (torch, ONNX) lack pre-built wheels for 3.13. On Windows, the launcher's Portable Mode automatically uses Python 3.10 regardless of your system Python version. *When using Portable Mode, Python is only needed on the machine where you first set up the application.*
 * **Git:** For cloning the repository ([Download](https://git-scm.com/downloads)).
 * **Internet:** For downloading dependencies and models from Hugging Face Hub.
 * **Disk Space:** 10GB+ recommended (for dependencies and model cache).
@@ -480,11 +485,12 @@ This is the most straightforward option and works on any machine without a compa
 # Make sure your (venv) is active
 pip install --upgrade pip
 pip install -r requirements.txt
+pip install --no-deps git+https://github.com/devnen/chatterbox-v2.git@master
 ```
 
 <details>
 <summary><strong>💡 How This Works</strong></summary>
-The `requirements.txt` file is specially crafted for CPU users. It tells `pip` to use PyTorch's CPU-specific package repository and pins compatible versions of `torch` and `torchvision`. This prevents `pip` from installing mismatched versions, which is a common source of errors.
+The `requirements.txt` file installs CPU PyTorch and all server dependencies. Chatterbox is installed separately with `--no-deps` to prevent pip from pulling in conflicting torch versions or triggering ONNX source builds.
 </details>
 
 ---
@@ -493,12 +499,13 @@ The `requirements.txt` file is specially crafted for CPU users. It tells `pip` t
 
 For users with NVIDIA GPUs. This provides the best performance for RTX 20/30/40 series.
 
-**Prerequisite:** Ensure you have the latest NVIDIA drivers installed.
+**Prerequisite:** Ensure you have the latest NVIDIA drivers installed. **Python 3.10 recommended** (3.11/3.12 also work; 3.13+ is not supported).
 
 ```bash
 # Make sure your (venv) is active
 pip install --upgrade pip
 pip install -r requirements-nvidia.txt
+pip install --no-deps git+https://github.com/devnen/chatterbox-v2.git@master
 ```
 
 **After installation, verify that PyTorch can see your GPU:**
@@ -509,7 +516,7 @@ If `CUDA available:` shows `True`, your setup is correct!
 
 <details>
 <summary><strong>💡 How This Works</strong></summary>
-The `requirements-nvidia.txt` file instructs `pip` to use PyTorch's official CUDA 12.1 package repository. It pins specific, compatible versions of `torch`, `torchvision`, and `torchaudio` that are built with CUDA support. This guarantees that the versions required by `chatterbox-tts` are met with the correct GPU-enabled libraries, preventing conflicts.
+The `requirements-nvidia.txt` file installs PyTorch with CUDA 12.1 support plus all server dependencies. Chatterbox is installed separately with `--no-deps` to prevent pip from downgrading the CUDA torch to a CPU version or triggering ONNX source builds.
 </details>
 
 ---
@@ -1199,9 +1206,10 @@ lspci | grep VGA
 ### Apple Silicon (MPS) Issues
 
 * **MPS Not Available:** Ensure you have macOS 12.3+ and an Apple Silicon Mac. Verify with `python -c "import torch; print(torch.backends.mps.is_available())"`
+* **Turbo Model Float64 Error:** If you see "Cannot convert a MPS Tensor to float64 dtype", update to the latest version. This is now fixed in the chatterbox-v2 fork (s3tokenizer and voice_encoder force float32). The `start.py` launcher also applies this patch automatically.
 * **Installation Conflicts:** If you encounter version conflicts, follow the exact Apple Silicon installation sequence in Option 4, installing PyTorch first before other dependencies.
-* **ONNX Build Errors:** Use the specific ONNX version `pip install onnx==1.16.0` as shown in the installation steps.
-* **Model Loading Errors:** Ensure `config.yaml` has `device: mps` in the `tts_engine` section.
+* **ONNX Build Errors:** Now resolved — `onnx==1.16.0` is pinned in all requirements files to use pre-built wheels. If you still hit issues, ensure you're using Python 3.10.
+* **Model Loading Errors:** Ensure `config.yaml` has `device: auto` (or `device: mps`) in the `tts_engine` section.
 
 ### NVIDIA GPU Issues
 
@@ -1222,6 +1230,8 @@ lspci | grep VGA
 
 ### General Issues
 
+* **ONNX / wheel build failures:** This is usually caused by Python 3.13+ or a missing pre-built wheel. Use Python 3.10 and ensure `onnx==1.16.0` is pinned. The updated requirements files handle this automatically.
+* **"No matching distribution found for torch==2.5.1+cu121":** You're likely on Python 3.13+ which doesn't have torch 2.5.1 wheels. Downgrade to Python 3.10 or use the Windows launcher's Portable Mode which handles this automatically.
 * **Import Errors (e.g., `chatterbox-tts`, `librosa`):** Ensure virtual environment is active and dependencies installed successfully. Try reinstalling: `python start.py --reinstall`
 * **`libsndfile` Error (Linux):** Run `sudo apt install libsndfile1`.
 * **Model Download Fails:** Check internet connection. `ChatterboxTTS.from_pretrained()` will attempt to download from Hugging Face Hub. Ensure `model.repo_id` in `config.yaml` is correct.
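
The Python version guidance added to the README (3.10 recommended, 3.11 and 3.12 workable, 3.13+ unsupported) lends itself to a small pre-install check. The sketch below is an assumption about how such a check could look, not code from `start.py`; the function name `classify_python` is hypothetical, and the `"unknown"` result for versions below 3.10 is a placeholder because the README does not cover them.

```python
import sys


def classify_python(version: tuple) -> str:
    """Classify a (major, minor, ...) version per the README guidance.

    Hypothetical helper: the README says nothing about versions below
    3.10, so those are reported as "unknown" rather than guessed at.
    """
    major_minor = tuple(version[:2])
    if major_minor == (3, 10):
        return "recommended"
    if major_minor in ((3, 11), (3, 12)):
        return "works"  # may require building some dependencies from source
    if major_minor >= (3, 13):
        return "unsupported"  # torch/ONNX wheels are missing per the README
    return "unknown"


if __name__ == "__main__":
    status = classify_python(sys.version_info)
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```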

config.yaml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ server:
 model:
   repo_id: chatterbox-turbo
 tts_engine:
-  device: cuda
+  device: auto
   predefined_voices_path: voices
   reference_audio_path: reference_audio
   default_voice_id: Emily.wav
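
How `device: auto` gets resolved is not part of this diff. A minimal sketch of one plausible resolution, assuming the preference order CUDA, then MPS, then CPU (the order in which the commit message lists them) and hypothetical function names `resolve_device` and `detect_backends`:

```python
def resolve_device(requested: str, cuda_available: bool, mps_available: bool) -> str:
    """Map the config value to a concrete device name.

    An explicit value ("cuda", "mps", "cpu") is returned unchanged; only
    "auto" triggers detection. The CUDA > MPS > CPU order is an assumption.
    """
    if requested != "auto":
        return requested
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_backends() -> tuple:
    """Return (cuda_available, mps_available); both False without torch."""
    try:
        import torch
    except ImportError:
        return False, False
    cuda = torch.cuda.is_available()
    # Older torch builds have no MPS backend module at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
    return cuda, mps


if __name__ == "__main__":
    print(resolve_device("auto", *detect_backends()))
```

Separating the pure decision from the torch probing keeps the decision testable on machines without torch installed.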

docker-compose-cpu.yml

Lines changed: 1 addition & 6 deletions
@@ -1,13 +1,8 @@
-version: '3.8'
-
 services:
   chatterbox-tts-server:
     build:
       context: .
-      dockerfile: Dockerfile
-      args:
-        # This build argument ensures only CPU dependencies are installed
-        - RUNTIME=cpu
+      dockerfile: Dockerfile.cpu
     ports:
       - "${PORT:-8004}:8004"
     volumes:

docker-compose-rocm.yml

Lines changed: 0 additions & 2 deletions
@@ -1,5 +1,3 @@
-version: '3.8'
-
 services:
   chatterbox-tts-server:
     build:

docker-compose.yml

Lines changed: 0 additions & 2 deletions
@@ -1,5 +1,3 @@
-version: '3.8'
-
 services:
   chatterbox-tts-server:
     build:

requirements-nvidia-cu128.txt

Lines changed: 9 additions & 0 deletions
@@ -25,6 +25,15 @@ torch==2.9.0
 torchvision==0.24.0
 torchaudio==2.9.0
 
+# --- Chatterbox Dependencies (explicit, since --no-deps skips them) ---
+conformer==0.3.2
+diffusers==0.29.0
+transformers==4.46.3
+s3tokenizer
+omegaconf==2.3.0
+resampy==0.4.3
+onnx==1.16.0 # Pinned to ensure pre-built wheel
+
 # --- Core Web Framework ---
 fastapi
 uvicorn[standard]

requirements-nvidia.txt

Lines changed: 19 additions & 3 deletions
@@ -11,9 +11,25 @@ torchvision==0.20.1+cu121
 torchaudio==2.5.1+cu121
 
 # --- Core Application Dependencies ---
-
-# Chatterbox TTS engine - Install from chatterbox-v2 fork
-chatterbox-tts @ git+https://github.com/devnen/chatterbox-v2.git@master
+#
+# IMPORTANT: chatterbox-tts is NOT included here. It must be installed
+# separately with --no-deps to prevent torch version conflicts and
+# ONNX source build failures.
+#
+# The start.py launcher handles this automatically.
+#
+# For manual installation, run these commands in order:
+# pip install -r requirements-nvidia.txt
+# pip install --no-deps git+https://github.com/devnen/chatterbox-v2.git@master
+
+# Chatterbox Dependencies (explicit, since --no-deps skips them)
+conformer==0.3.2
+diffusers==0.29.0
+transformers==4.46.3
+s3tokenizer
+omegaconf==2.3.0
+resampy==0.4.3
+onnx==1.16.0 # Pinned to ensure pre-built wheel
 
 # Core Web Framework
 fastapi

requirements-rocm.txt

Lines changed: 9 additions & 0 deletions
@@ -14,6 +14,15 @@
 # The --no-deps flag on chatterbox-tts is critical: without it, pip will
 # replace the ROCm torch wheels with CPU-only versions from PyPI.
 
+# Chatterbox Dependencies (explicit, since --no-deps skips them)
+conformer==0.3.2
+diffusers==0.29.0
+transformers==4.46.3
+s3tokenizer
+omegaconf==2.3.0
+resampy==0.4.3
+onnx==1.16.0 # Pinned to ensure pre-built wheel
+
 # Core Web Framework
 fastapi
 uvicorn[standard]
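
Because `--no-deps` skips chatterbox's declared dependencies, each requirements file now has to carry them itself, and a missing entry would only surface at import time. A small consistency check along these lines could catch that earlier. It is a sketch under assumptions: the function names are hypothetical, the parser handles only simple `name==version` lines like those in this diff, and the dependency list is copied from the commit message.

```python
import re

# Dependency names taken from the commit message and the requirements diffs.
CHATTERBOX_DEPS = (
    "conformer", "diffusers", "transformers",
    "s3tokenizer", "omegaconf", "resampy", "onnx",
)


def parse_requirements(text: str) -> dict:
    """Map package name -> pinned version ('' when no == pin is present)."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"^([A-Za-z0-9_.-]+)(?:\[[^\]]*\])?\s*(?:==\s*(\S+))?", line)
        if match:
            pins[match.group(1).lower()] = match.group(2) or ""
    return pins


def check_chatterbox_deps(text: str) -> list:
    """Return human-readable problems; an empty list means the file is consistent."""
    pins = parse_requirements(text)
    problems = [f"missing: {name}" for name in CHATTERBOX_DEPS if name not in pins]
    if "onnx" in pins and pins["onnx"] != "1.16.0":
        problems.append("onnx is not pinned to 1.16.0")
    return problems


if __name__ == "__main__":
    sample = "conformer==0.3.2\ns3tokenizer\nonnx==1.16.0 # Pinned\n"
    print(check_chatterbox_deps(sample))
```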
