Text-to-speech Shiny app for the Cornball AI ecosystem.
- Multi-backend TTS: Chatterbox, Qwen3-TTS, OpenAI, ElevenLabs, fal.ai
- Voice selection: 9 built-in Qwen3 voices, OpenAI voices, ElevenLabs library
- Voice cloning: Upload reference audio for Chatterbox/Qwen3 backends
- Voice design: Create custom voices from natural language descriptions (Qwen3)
- Save as voice: Save generated audio as a reusable voice for cloning
- History: Persistent storage with audio playback in `~/.cornfab/`
```r
# Install from GitHub
remotes::install_github("cornball-ai/cornfab")

library(cornfab)
run_app() # Runs on port 7803
```

| Backend | Type | Port | Features |
|---|---|---|---|
| Chatterbox | Container | 7810 | Voice cloning, exaggeration control |
| Qwen3-TTS | Container | 7811 | 9 voices, voice design, 10 languages |
| OpenAI | API | - | 6 voices, tts-1/tts-1-hd models |
| ElevenLabs | API | - | Large voice library, multilingual |
| fal.ai | API | - | F5-TTS, Dia, Orpheus models |
Chatterbox and Qwen3-TTS run as local Docker containers. You must:
- Download models before running containers
- Start containers manually - cornfab does not auto-start containers
Models are stored in the HuggingFace cache at `~/.cache/huggingface/hub/`. Mount this directory when running containers:

```shell
-v ~/.cache/huggingface:/root/.cache/huggingface
```

Model: `ResembleAI/chatterbox` (~2GB)
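Inside that cache, each repo lives under a directory named `models--<org>--<name>`. The sketch below just computes that path for a given repo id; it is plain string substitution (bash), nothing cornfab-specific:

```shell
# Where a given HuggingFace repo id lands in the hub cache.
# Works whether or not the model has been downloaded yet.
repo="ResembleAI/chatterbox"
echo "$HOME/.cache/huggingface/hub/models--${repo//\//--}"
```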
**Option 1: R with hfhub (recommended)**

```r
# install.packages("hfhub")
hfhub::hub_snapshot("ResembleAI/chatterbox")
```

**Option 2: Python with huggingface_hub**

```shell
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download('ResembleAI/chatterbox')"
```

**Option 3: curl**

```shell
mkdir -p ~/.cache/huggingface/hub/models--ResembleAI--chatterbox/snapshots/main
cd ~/.cache/huggingface/hub/models--ResembleAI--chatterbox/snapshots/main
curl -LO https://huggingface.co/ResembleAI/chatterbox/resolve/main/chatterbox.safetensors
curl -LO https://huggingface.co/ResembleAI/chatterbox/resolve/main/s3gen.safetensors
curl -LO https://huggingface.co/ResembleAI/chatterbox/resolve/main/t3_cfg.safetensors
curl -LO https://huggingface.co/ResembleAI/chatterbox/resolve/main/ve.safetensors
```

Three models are needed for full functionality:
- `Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice` - Built-in speakers (~7GB)
- `Qwen/Qwen3-TTS-12Hz-1.7B-Base` - Voice cloning (~7GB)
- `Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign` - Voice design (~7GB)
**Option 1: R with hfhub (recommended)**

```r
# Download all three models
hfhub::hub_snapshot("Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice")
hfhub::hub_snapshot("Qwen/Qwen3-TTS-12Hz-1.7B-Base")
hfhub::hub_snapshot("Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign")
```

**Option 2: Python with huggingface_hub**

```shell
python -c "from huggingface_hub import snapshot_download; \
  snapshot_download('Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice'); \
  snapshot_download('Qwen/Qwen3-TTS-12Hz-1.7B-Base'); \
  snapshot_download('Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign')"
```

**Option 3: Git LFS**

```shell
cd ~/.cache/huggingface/hub
git lfs install
git clone https://huggingface.co/Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice
```

Note: Containers default to `LOCAL_FILES_ONLY=true` and will fail if models aren't pre-downloaded. Set `LOCAL_FILES_ONLY=false` to enable auto-download (not recommended for production).
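Since the containers default to local-files-only, a quick pre-flight check of the hub cache can save a failed startup. This is a minimal sketch: it maps each repo id to its cache directory (HuggingFace replaces `/` with `--`) and only tests that the directory exists, not that the download completed.

```shell
# Pre-flight: report which TTS model repos are present in the hub cache
for repo in ResembleAI/chatterbox \
            Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice \
            Qwen/Qwen3-TTS-12Hz-1.7B-Base \
            Qwen/Qwen3-TTS-12Hz-1.7B-VoiceDesign; do
  dir="$HOME/.cache/huggingface/hub/models--${repo//\//--}"
  if [ -d "$dir" ]; then
    echo "cached:  $repo"
  else
    echo "missing: $repo"
  fi
done
```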
```shell
# Build (if not using ghcr.io)
cd ~/chatterbox-tts-api
docker build -f docker/Dockerfile -t chatterbox-tts-api .

# Run
docker run -d --gpus all --network=host --name chatterbox \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e PORT=7810 \
  chatterbox-tts-api
```

```shell
# Build (if not using ghcr.io)
cd ~/qwen3-tts-api
docker build -f Dockerfile.blackwell -t qwen3-tts-api:blackwell .

# Run (Blackwell GPUs - RTX 50xx)
docker run -d --gpus all --network=host --name qwen3-tts-api \
  -v ~/.cache/huggingface:/cache \
  -e PORT=7811 \
  -e USE_FLASH_ATTENTION=false \
  qwen3-tts-api:blackwell

# Run (older GPUs - Ampere, Ada Lovelace)
docker build -t qwen3-tts-api . # Use default Dockerfile
docker run -d --gpus all --network=host --name qwen3-tts-api \
  -v ~/.cache/huggingface:/cache \
  -e PORT=7811 \
  qwen3-tts-api
```

For OpenAI, ElevenLabs, and fal.ai, set environment variables:
```shell
export OPENAI_API_KEY="sk-..."
export ELEVENLABS_API_KEY="..."
export FAL_KEY="..."
```

Or configure in the app's API Settings panel.
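A quick way to see which provider keys are set before launching the app (a sketch using bash's indirect expansion; it prints variable names only, never their values):

```shell
# Report which TTS provider API keys are set in the environment
for var in OPENAI_API_KEY ELEVENLABS_API_KEY FAL_KEY; do
  if [ -n "${!var}" ]; then echo "$var: set"; else echo "$var: unset"; fi
done
```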
Qwen3-TTS supports creating custom voices from natural language descriptions.
| Mode | Models Loaded | VRAM |
|---|---|---|
| Built-in voices | CustomVoice | ~4.6GB |
| Voice cloning | + Base | +4.1GB |
| Voice design | + VoiceDesign | +3.9GB |
Models load on first use and stay in memory. Restart the container to unload.
To avoid keeping the VoiceDesign model loaded:
- Design once: Check "Design voice from description", enter a description like "A warm, friendly female voice with a slight British accent", generate
- Save as voice: Click "Save as Voice", enter a name (e.g., "warm-female")
- Restart container: `docker restart qwen3-tts-api` to free ~5GB VRAM
- Clone from saved: Select "warm-female (custom)" from the voice dropdown for future generations
Saved voices are stored in `~/.cornfab/voices/` and work with both Qwen3 and Chatterbox backends.
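Assuming the storage layout above, a one-liner to see which saved voices exist from the command line (prints a placeholder if none have been saved yet):

```shell
# List saved custom voices, if any
ls ~/.cornfab/voices/ 2>/dev/null || echo "no saved voices yet"
```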
```shell
# Clone
git clone https://github.com/cornball-ai/cornfab
cd cornfab

# Build and run
r -e 'tinyrox::document(); tinypkgr::install()'
r -e 'library(cornfab); run_app()'
```