PrivateVoice

Local text-to-speech that never leaves your machine. Generate natural speech, clone voices, and design new ones — powered by Qwen3-TTS models running entirely on your hardware.

No cloud APIs. No subscriptions. No data sent anywhere. Have ideas or requests? Open an issue or PR, or support the project.

Three Ways to Generate Speech

Custom Voice

Pick from preset speakers, choose a language, and type your text. Optionally add style instructions to control tone and delivery.

Voice Clone

Record or import a reference audio clip, and PrivateVoice will generate new speech in that voice. Optional Whisper auto-transcription can fill in the transcript of the reference clip for you.

Voice Design

Describe the voice you want in plain text — "warm baritone, slight British accent, nature documentary narrator" — and the model creates it.

Key Features

Runs 100% locally — inference on Apple Silicon (MPS), NVIDIA CUDA, or CPU. Nothing leaves localhost.
Lightweight installer — ships a small desktop app; downloads Python, dependencies, and models on first launch.
5 model variants — from fast 0.6B to high-quality 1.7B, with one-click switching between compatible models.
Deterministic seed control — optional seed input for reproducible generations across Custom Voice, Voice Clone, and Voice Design.
Voice library — save generated audio, organize with tabs (Recent / Saved Voices / Audio), search and replay.
Export to WAV or MP3 — configurable MP3 bitrate, plus WAV sample rate and bit depth controls.
Batch processing — queue multiple .txt files and generate a ZIP of per-file outputs with progress tracking and cancellation.
Optional Whisper transcription — auto-fill Voice Clone transcripts from reference audio.
Optional translation — translate input text locally before generating speech (NLLB 600M).
Keyboard shortcuts — Cmd/Ctrl+Enter to generate, Cmd/Ctrl+S to save, Space to play/pause, and more.
Debug console — live logs and system info for troubleshooting.
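
The batch processing feature listed above (several .txt inputs, one ZIP of per-file outputs) can be pictured with a short standard-library sketch. This is an illustration only, not PrivateVoice's implementation: `synthesize` is a hypothetical stand-in for the TTS call, and the `<name>.wav` naming scheme is an assumption.

```python
import io
import zipfile
from pathlib import Path
from typing import Callable, Iterable


def batch_to_zip(
    txt_paths: Iterable[Path],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run `synthesize` on each text file and return a ZIP archive as bytes.

    `synthesize` is a hypothetical placeholder for the real TTS call: it
    receives the input text and returns encoded audio bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in txt_paths:
            text = path.read_text(encoding="utf-8")
            audio = synthesize(text)
            # Assumed naming scheme: chapter1.txt -> chapter1.wav
            archive.writestr(f"{path.stem}.wav", audio)
    return buffer.getvalue()
```

Building the archive in memory keeps the sketch short; a real batch runner would more likely stream to disk and report progress per file.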

Platform Support

Platform	Status
Windows 11 x64	Working — NSIS installer, CUDA auto-detection
macOS (Apple Silicon)	Working — MPS acceleration
Linux x64	Builds available — CUDA/ROCm/CPU detection implemented

System requirements: 16 GB+ RAM recommended (8 GB minimum for 0.6B models). The first model download is ~1.2–3.4 GB, depending on the model variant (see Model Reference).

Quick Start

Install and run

Download the latest release for your platform from Releases, then launch the app.

On first run, PrivateVoice will automatically:

Detect your hardware (GPU/CPU)
Install a standalone Python 3.11 environment
Download dependencies (with GPU-appropriate PyTorch)
Start the local TTS server

Subsequent launches skip setup and start in seconds.
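
The hardware detection step above can be sketched as a small priority rule. The function below is a hypothetical illustration, and the CUDA, then MPS, then CPU ordering is an assumption rather than the app's documented logic.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a PyTorch-style device string, preferring GPU backends.

    Assumed priority: NVIDIA CUDA first, Apple Silicon MPS second,
    CPU as the fallback.
    """
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

With PyTorch installed, the two flags could come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.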

Build from source

pnpm install
pnpm tauri dev          # development with hot reload

For a release build:

# macOS / Linux
./scripts/build-release.sh

# Windows (PowerShell)
.\scripts\build-release.ps1

Model Reference

Model	Size	Custom Voice	Voice Clone	Voice Design
`0.6b`	~1.2 GB	Yes	—	—
`0.6b-base`	~1.2 GB	—	Yes	—
`1.7b`	~3.4 GB	Yes	—	—
`1.7b-base`	~3.4 GB	—	Yes	—
`1.7b-design`	~3.4 GB	—	—	Yes

The app shows compatibility indicators on the mode tabs and offers one-click loading of a compatible model when you switch to a mode the current model does not support.
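
The table above maps directly onto a lookup structure. The following is a minimal sketch of how such a compatibility check might look; the mode identifiers (`custom_voice`, `voice_clone`, `voice_design`) and function names are made up for illustration and are not taken from the codebase.

```python
# Model name -> generation modes it supports, transcribed from the table.
MODEL_MODES = {
    "0.6b": {"custom_voice"},
    "0.6b-base": {"voice_clone"},
    "1.7b": {"custom_voice"},
    "1.7b-base": {"voice_clone"},
    "1.7b-design": {"voice_design"},
}


def is_compatible(model: str, mode: str) -> bool:
    """True if `model` supports `mode`; unknown models support nothing."""
    return mode in MODEL_MODES.get(model, set())


def models_for_mode(mode: str) -> list[str]:
    """List the models that can serve `mode`, in table order."""
    return [name for name, modes in MODEL_MODES.items() if mode in modes]
```

For example, `models_for_mode("voice_clone")` returns the two `-base` variants, which is the set a one-click switch could offer when Voice Clone is selected.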

How It Works

┌─────────────────────────────────────────────────────┐
│  Svelte 5 Frontend  (TypeScript + Tailwind CSS 4)   │
│  ↕ HTTP localhost:8765                              │
│  Python FastAPI Server  (Qwen3-TTS inference)       │
│  ↕ managed by                                       │
│  Tauri 2 Rust Shell  (sidecar lifecycle, native OS) │
└─────────────────────────────────────────────────────┘

The Tauri desktop shell manages a Python sidecar process that runs the TTS models. The Svelte frontend communicates with it over HTTP on localhost. All model weights and runtime files are stored in app-scoped directories — nothing pollutes your global Python or system cache.
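
Because the frontend and the sidecar talk over a plain local TCP port, a quick way to see whether the backend is up is to try connecting to it. The sketch below only probes the port from the diagram; it does not call any HTTP route, since the server's endpoints are not described here. The function name is hypothetical.

```python
import socket


def server_is_listening(
    host: str = "127.0.0.1", port: int = 8765, timeout: float = 0.5
) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A successful connection only shows that some process is listening on that port, not that it is the PrivateVoice server.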

Privacy

All inference runs locally on your machine
The server binds to 127.0.0.1:8765 — not accessible from the network
Internet is used only during first-run setup (Python, dependencies) and when downloading models from Hugging Face
No telemetry, no analytics, no cloud calls during normal use
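
The "not accessible from the network" property follows from the bind address: a server bound to a loopback address such as 127.0.0.1 only accepts connections from the same machine, whereas one bound to 0.0.0.0 listens on every interface. A minimal sketch of that distinction, using a hypothetical helper name:

```python
import ipaddress


def is_loopback_only(bind_host: str) -> bool:
    """True if `bind_host` is a literal loopback IP (127.0.0.0/8 or ::1).

    Hostnames are not resolved here; anything that is not a literal IP
    address is conservatively reported as not loopback.
    """
    try:
        return ipaddress.ip_address(bind_host).is_loopback
    except ValueError:
        return False
```

Here `is_loopback_only("127.0.0.1")` is true, while `is_loopback_only("0.0.0.0")` is false.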

Settings & Environment

Configure theme, default model/speaker, export format, auto-load behavior, and optional features (Whisper, translation). The Environment section shows GPU target, setup state, and disk usage — with Repair and Full Rebuild buttons if anything goes wrong.

Advanced Generation Controls

Seed (optional): available in each generation mode under Advanced. Reuse the same seed with an otherwise identical setup to get reproducible outputs.
WAV tuning: when export format is WAV, choose Native / 8k / 16k / 22.05k / 24k / 44.1k / 48k sample rates and 16/24/32-bit depth.
Batch mode: toggle Batch mode, upload multiple .txt files, and generate all outputs using the current voice configuration. Results download as a ZIP.
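
To make the seed and WAV settings above concrete, the sketch below writes seeded noise to a mono WAV file with a chosen sample rate and bit depth using Python's standard library. The noise is only a placeholder for model output, the function name is hypothetical, and integer PCM is assumed for all three bit depths; how PrivateVoice itself encodes 32-bit audio is not stated here.

```python
import io
import random
import wave


def render_test_wav(
    seed: int,
    sample_rate: int = 24000,
    bit_depth: int = 16,
    n_samples: int = 2400,
) -> bytes:
    """Write seeded noise as mono integer-PCM WAV and return the file bytes."""
    if bit_depth not in (16, 24, 32):
        raise ValueError("bit_depth must be 16, 24, or 32")
    rng = random.Random(seed)            # local generator, no global state
    peak = 2 ** (bit_depth - 1) - 1      # largest positive sample value
    width = bit_depth // 8               # bytes per sample
    frames = b"".join(
        rng.randint(-peak, peak).to_bytes(width, "little", signed=True)
        for _ in range(n_samples)
    )
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return buffer.getvalue()
```

Calling the function twice with the same seed and parameters yields byte-identical files, which is the property a fixed seed is meant to provide; changing the seed, sample rate, or bit depth changes the output.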

Storage & Uninstall

App data is stored in platform-standard locations:

Platform	Path
macOS	`~/Library/Application Support/com.privatevoice.desktop/`
Windows	`%APPDATA%\com.privatevoice.desktop\`
Linux	`~/.local/share/com.privatevoice.desktop/`

Windows uninstaller offers granular cleanup — keep your voice library while removing models and runtime, or remove everything.
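
For scripts that need to locate the app data (for example, to back up a voice library), the table above can be expressed as a small helper. This is a sketch under assumptions: the function is hypothetical, the platform strings follow Python's `sys.platform` values, and any `XDG_DATA_HOME` override on Linux is ignored because the table does not mention one.

```python
from pathlib import PurePosixPath, PureWindowsPath

APP_ID = "com.privatevoice.desktop"


def app_data_dir(platform: str, home: str = "", appdata: str = "") -> str:
    """Return the app data directory for a `sys.platform`-style name.

    `home` is the user's home directory (macOS / Linux) and `appdata`
    is the value of %APPDATA% (Windows).
    """
    if platform == "darwin":
        return str(PurePosixPath(home) / "Library" / "Application Support" / APP_ID)
    if platform == "win32":
        return str(PureWindowsPath(appdata) / APP_ID)
    if platform.startswith("linux"):
        return str(PurePosixPath(home) / ".local" / "share" / APP_ID)
    raise ValueError(f"unsupported platform: {platform}")
```

On a live system the arguments could be filled in with `sys.platform`, `str(Path.home())`, and `os.environ.get("APPDATA", "")`.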

Development

pnpm install                    # frontend dependencies
pnpm tauri dev                  # full app with hot reload
pnpm dev                        # frontend only (no TTS backend)

# Python backend standalone (macOS / Linux shell)
cd python && source .venv/bin/activate && python -m tts_server.main

# Tests
pnpm test:run                   # unit tests
pnpm test:e2e                   # Playwright E2E (90 tests)
pnpm check                      # TypeScript/Svelte type check
cd python && pytest tests/      # Python backend tests

343 automated tests: 216 unit tests (Vitest), 90 E2E tests (Playwright), and the remaining 37 Python backend tests (pytest).

Documentation

Support

If PrivateVoice is useful to you, you can support ongoing development. Issues, feature requests, and PRs are always welcome.

License

MIT
