Skip to content

Dehydrated-gumarabic195/LiveTranslate

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

60 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

LiveTranslate

English | 中文

Real-time audio translation for Windows. Captures system audio (WASAPI loopback) and optional microphone input, runs ASR, translates via LLM API, and displays results in a transparent overlay.

Works with any system audio — videos, livestreams, voice chat. No player modifications needed.

Python 3.10+ Windows License

Screenshot

LiveTranslate

Video

Install & Demo

Features

  • Real-time pipeline: System audio → VAD → ASR → LLM translation → overlay
  • Multiple ASR engines: faster-whisper, SenseVoice, FunASR Nano, Qwen3-ASR (GGUF)
  • Any OpenAI-compatible API: DeepSeek, Grok, Qwen, GPT, Ollama, vLLM, etc.
  • Streaming translation display: Real-time character-by-character translation output
  • Per-model settings: Streaming, structured output (JSON), context history, disable thinking
  • Microphone mix-in: Optionally mix microphone input with system audio for ASR
  • Low-latency VAD: 32ms chunks + Silero VAD with adaptive silence detection
  • Transparent overlay: Always-on-top, click-through, draggable, 14 color themes
  • CUDA acceleration: GPU-accelerated ASR inference
  • Auto model management: Setup wizard, ModelScope / HuggingFace dual sources
  • Built-in benchmark: Compare translation model speed and quality

Changelog

See English Changelog | 中文更新日志

Requirements

  • OS: Windows 10/11
  • Python: 3.10+
  • GPU (recommended): NVIDIA + CUDA 12.6 (Blackwell GPUs like RTX 50xx require CUDA 12.8)
  • Network: Access to a translation API

Quick Start

git clone https://github.com/TheDeathDragon/LiveTranslate.git
cd LiveTranslate

Double-click install.bat — the installer will:

  1. Detect Python 3.10+ (auto-install via winget if missing)
  2. Create a virtual environment
  3. Auto-detect NVIDIA GPU and let you choose CUDA / CPU PyTorch
  4. Install all dependencies

Then double-click start.bat to launch.

To update, double-click update.bat — it will pull the latest code and update dependencies (auto-installs Git via winget if missing).

Manual install
python -m venv .venv
.venv\Scripts\activate

# PyTorch (choose one)
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu126  # CUDA
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128  # CUDA (RTX 50xx)
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu    # CPU only

# Dependencies
pip install -r requirements.txt
pip install funasr --no-deps

# Launch
.venv\Scripts\python.exe main.py

FunASR uses --no-deps because editdistance requires a C++ compiler. editdistance-s in requirements.txt is a pure-Python drop-in replacement.

First Launch

  1. Setup wizard appears — choose download source (ModelScope / HuggingFace) and cache path
  2. Silero VAD + SenseVoice models download automatically (~1GB)
  3. Main UI appears when ready

Translation API

Settings → Translation tab:

Parameter Example
API Base https://api.deepseek.com/v1
API Key Your key
Model deepseek-chat
Proxy none / system / custom URL

Architecture

Audio (WASAPI 32ms) → VAD (Silero) → ASR → LLM Translation → Overlay
         ↑ optional mic mix-in
main.py                 Entry point & pipeline
├── audio_capture.py    WASAPI loopback + mic mix-in
├── vad_processor.py    Silero VAD
├── asr_engine.py       faster-whisper backend
├── asr_sensevoice.py   SenseVoice backend
├── asr_funasr_nano.py  FunASR Nano backend
├── asr_qwen3.py        Qwen3-ASR backend (ONNX + GGUF)
├── translator.py       OpenAI-compatible client (streaming, JSON schema, context)
├── model_manager.py    Model download & cache
├── subtitle_overlay.py PyQt6 overlay
├── control_panel.py    Settings UI (7 tabs)
├── dialogs.py          Wizard, download & model config dialogs
└── benchmark.py        Translation benchmark

Acknowledgements

Star History

Star History Chart

License

MIT License

About

Translate Windows system audio in real time with ASR and LLMs, then show results in a transparent overlay for videos, streams, and voice chat

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors