Releases: richardr1126/KittenTTS-FastAPI

v0.2.1

06 Apr 22:52

✨ What's New

  • Added markdown artifact normalization and non-speakable symbol stripping to the preprocessing pipeline to improve speech input robustness for mixed-format text.
  • Introduced token-aware chunking controls (target_min_tokens, target_max_tokens, absolute_max_tokens) across config, API models, and runtime option resolution for more predictable chunk sizing.
  • Improved server chunk processing: unspeakable-content validation is now deferred to chunk cleanup when splitting, with safer fallback/recovery behavior during synthesis.
  • Expanded and aligned test coverage for preprocessing/chunking behavior, and verified the full suite including integration tests (45 passed).
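The new chunking controls can be supplied per request. The field names come from the changelog above, but their exact placement alongside the existing speech fields is an assumption (check the API models for the authoritative schema), and the token values below are purely illustrative. A sketch of a request body:

```json
{
  "model": "kitten-tts",
  "input": "A long passage of mixed-format text to be split into chunks...",
  "voice": "Jasper",
  "response_format": "mp3",
  "target_min_tokens": 100,
  "target_max_tokens": 250,
  "absolute_max_tokens": 400
}
```

Roughly, `target_min_tokens`/`target_max_tokens` steer the preferred chunk size while `absolute_max_tokens` acts as a hard ceiling, which is what makes chunk sizing predictable for long-form input.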

Full Changelog: v0.2.0...v0.2.1

v0.2.0

06 Apr 19:08

✨ What's New

  • Added dedicated health endpoints (/health/live, /health/ready) and moved probe strategy to these fast paths for more reliable long-running generation workloads.
  • Added configurable generation concurrency controls (KITTEN_MAX_CONCURRENT_GENERATIONS, queue timeout) and non-blocking request handling, so synthesis jobs no longer block the server from responding to other requests.
  • Added configurable server worker count (KITTEN_SERVER_WORKERS) and updated deployment defaults in Helm and Compose for better runtime stability.
  • Expanded automated coverage for health/readiness behavior and new server concurrency configuration paths.
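The split health endpoints map naturally onto Kubernetes probes. The following is a minimal sketch, not the chart's actual values: probe timings, the port, and the env values are illustrative assumptions, and the bundled Helm chart remains the authoritative source.

```yaml
# Illustrative probe wiring; adjust port/timings to your deployment.
livenessProbe:
  httpGet:
    path: /health/live      # fast path: process is alive
    port: 8005
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /health/ready     # fast path: model loaded, ready to serve
    port: 8005
  periodSeconds: 5
env:
  - name: KITTEN_MAX_CONCURRENT_GENERATIONS
    value: "2"              # illustrative value
  - name: KITTEN_SERVER_WORKERS
    value: "1"              # illustrative value
```

Keeping probes on these dedicated fast paths (rather than the synthesis endpoint) is what prevents long-running generations from tripping liveness checks.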

Full Changelog: v0.1.1...v0.2.0

v0.1.1

22 Feb 08:43

✨ What's New

  • Added advanced text processing controls (pause punctuation normalization, dialogue turn splitting, speaker label handling, and profile-based behavior).
  • Added KITTEN_TEXT_PROFILES_JSON to override/extend built-in text profiles from environment config.
  • Added OpenAI-compatible model listing/retrieval endpoints and model alias support for speech requests.
  • Added an initial Helm chart for Kubernetes deployment (charts/kittentts-fastapi).
  • Updated docs and deployment guidance (Docker naming/commands, chart env docs, and related cleanup).
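Text profiles can be overridden or extended from the environment. A hedged sketch of a `.env` entry follows; the profile name and option keys shown are hypothetical illustrations, since the built-in profile names and their exact option schema live in the server config.

```shell
# Illustrative override; actual profile/option names may differ.
KITTEN_TEXT_PROFILES_JSON='{"audiobook": {"split_dialogue_turns": true, "strip_speaker_labels": true}}'
```

The value must be a single JSON object, so quote it carefully in `.env` or Compose files.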

Full Changelog: v0.1.0...v0.1.1

v0.1.0

22 Feb 02:57

🚀 KittenTTS FastAPI v0.1.0

  • 🧠 Turns KittenTTS into a real deployable API service
  • ⚡ Keeps the lightweight model footprint while adding production ergonomics
  • 📚 Handles long-form input (chunking + merged output) for audiobook-style workflows
  • 🐳 Ships with Docker-first deployment and clean local dev setup

🔥 What You Get

  • 🌐 FastAPI backend with /v1/audio/speech and /v1/audio/voices
  • 🖥️ Built-in web UI for text input, voice controls, playback, and download
  • 🚄 ONNX Runtime acceleration with auto / cpu / cuda
  • ⚙️ Environment-driven configuration via KITTEN_*
  • 🧩 OpenAI-compatible API behavior for easier drop-in integration

🐳 Quick Start

CPU Docker:

docker run -it -d \
  --name kittentts-fastapi \
  --restart unless-stopped \
  -e KITTEN_MODEL_REPO_ID="KittenML/kitten-tts-nano-0.8-fp32" \
  -p 8005:8005 \
  ghcr.io/richardr1126/kittentts-fastapi-cpu

Local development:

cp .env.example .env
uv sync
uv run src/server.py

NVIDIA local setup:

uv sync --group nvidia

Then set KITTEN_TTS_DEVICE=cuda before startup.

🧪 API Example

curl http://localhost:8005/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kitten-tts",
    "input": "Hello from the Kitten TTS FastAPI server!",
    "voice": "Jasper",
    "speed": 1.1,
    "response_format": "mp3"
  }' \
  --output speech.mp3

List voices:

curl http://localhost:8005/v1/audio/voices

🌍 Open Source

  • 📄 License: Apache-2.0
  • 🐱 Upstream model: KittenML/KittenTTS
  • 🤝 Contributions, bug reports, and production feedback are welcome

🙌 Credits

  • KittenTTS by KittenML
  • Early server foundation inspired by devnen/Kitten-TTS-Server

🔗 Links