Releases · richardr1126/KittenTTS-FastAPI
v0.2.1
✨ What's New
- Added markdown artifact normalization and non-speakable symbol stripping to the preprocessing pipeline to improve speech input robustness for mixed-format text.
- Introduced token-aware chunking controls (`target_min_tokens`, `target_max_tokens`, `absolute_max_tokens`) across config, API models, and runtime option resolution for more predictable chunk sizing.
- Improved server chunk processing by deferring unspeakable validation to chunk cleanup when splitting, and by adding safer fallback/recovery behavior during synthesis.
- Expanded and aligned test coverage for preprocessing/chunking behavior, and verified the full suite, including integration tests (45 passed).
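As a sketch of how the new chunking controls might be exercised, assuming `target_min_tokens`, `target_max_tokens`, and `absolute_max_tokens` are accepted as top-level fields of the speech request body (these notes name the options but not their request-level placement):

```shell
# Hypothetical request: chunking controls as top-level JSON fields.
# The field names come from the release notes; their placement in the
# request body (versus server-side config) is an assumption.
payload='{
  "model": "kitten-tts",
  "input": "A long passage that the server will split into chunks before synthesis.",
  "voice": "Jasper",
  "response_format": "mp3",
  "target_min_tokens": 40,
  "target_max_tokens": 120,
  "absolute_max_tokens": 200
}'
curl http://localhost:8005/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d "$payload" \
  --output long_form.mp3
```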
Full Changelog: v0.2.0...v0.2.1
v0.2.0
✨ What's New
- Added dedicated health endpoints (`/health/live`, `/health/ready`) and moved the probe strategy to these fast paths for more reliable long-running generation workloads.
- Added configurable generation concurrency controls (`KITTEN_MAX_CONCURRENT_GENERATIONS`, queue timeout) and non-blocking request handling so synthesis jobs no longer starve server responsiveness.
- Added a configurable server worker count (`KITTEN_SERVER_WORKERS`) and updated deployment defaults in Helm and Compose for better runtime stability.
- Expanded automated coverage for health/readiness behavior and the new server concurrency configuration paths.
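In Kubernetes terms, the probe strategy above maps onto liveness/readiness probes pointed at the fast paths. A minimal sketch, assuming port 8005 from the Quick Start; the timing values are illustrative, not the Helm chart's shipped defaults:

```yaml
# Illustrative probe wiring for the new health endpoints.
livenessProbe:
  httpGet:
    path: /health/live
    port: 8005
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8005
  initialDelaySeconds: 5
  periodSeconds: 5
```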
Full Changelog: v0.1.1...v0.2.0
v0.1.1
✨ What's New
- Added advanced text processing controls (pause punctuation normalization, dialogue turn splitting, speaker label handling, and profile-based behavior).
- Added `KITTEN_TEXT_PROFILES_JSON` to override or extend built-in text profiles from environment config.
- Added OpenAI-compatible model listing/retrieval endpoints and model alias support for speech requests.
- Added an initial Helm chart for Kubernetes deployment (`charts/kittentts-fastapi`).
- Updated docs and deployment guidance (Docker naming/commands, chart env docs, and related cleanup).
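Two of the additions above can be sketched from the shell. The profile JSON uses hypothetical keys (the actual profile schema isn't shown in these notes), and the `/v1/models` path is assumed from the OpenAI API convention the endpoints mirror:

```shell
# Hypothetical profile override: the key names ("narration",
# "split_dialogue_turns") are illustrative, not the documented schema.
export KITTEN_TEXT_PROFILES_JSON='{"narration": {"split_dialogue_turns": true}}'

# Model listing via the OpenAI-compatible endpoint (path assumed from
# the OpenAI convention):
curl http://localhost:8005/v1/models
```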
Full Changelog: v0.1.0...v0.1.1
v0.1.0
🚀 KittenTTS FastAPI v0.1.0
- 🧠 Turns KittenTTS into a real deployable API service
- ⚡ Keeps the lightweight model footprint while adding production ergonomics
- 📚 Handles long-form input (chunking + merged output) for audiobook-style workflows
- 🐳 Ships with Docker-first deployment and clean local dev setup
🔥 What You Get
- 🌐 FastAPI backend with `/v1/audio/speech` and `/v1/audio/voices`
- 🖥️ Built-in web UI for text input, voice controls, playback, and download
- 🚄 ONNX Runtime acceleration with `auto`/`cpu`/`cuda`
- ⚙️ Environment-driven configuration via `KITTEN_*`
- 🧩 OpenAI-compatible API behavior for easier drop-in integration
🐳 Quick Start
CPU Docker:

```shell
docker run -it -d \
  --name kittentts-fastapi \
  --restart unless-stopped \
  -e KITTEN_MODEL_REPO_ID="KittenML/kitten-tts-nano-0.8-fp32" \
  -p 8005:8005 \
  ghcr.io/richardr1126/kittentts-fastapi-cpu
```

Local development:

```shell
cp .env.example .env
uv sync
uv run src/server.py
```

NVIDIA local setup:

```shell
uv sync --group nvidia
```

Then set `KITTEN_TTS_DEVICE=cuda` before startup.
🧪 API Example
```shell
curl http://localhost:8005/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kitten-tts",
    "input": "Hello from the Kitten TTS FastAPI server!",
    "voice": "Jasper",
    "speed": 1.1,
    "response_format": "mp3"
  }' \
  --output speech.mp3
```

List voices:

```shell
curl http://localhost:8005/v1/audio/voices
```

🌍 Open Source
- 📄 License: Apache-2.0
- 🐱 Upstream model: KittenML/KittenTTS
- 🤝 Contributions, bug reports, and production feedback are welcome
🙌 Credits
- KittenTTS by KittenML
- Early server foundation inspired by devnen/Kitten-TTS-Server