Local transcription and English subtitle refinement.
Generate polished English SRT subtitles quickly and privately using local whisper.cpp models and an LLM refinement pass.
- Local-First: Transcription and refinement occur entirely on your machine.
- Vulkan Accelerated: Auto-detects GPU support for both Whisper and LLMs.
- Combined Pipeline: Translates and polishes subtitles in a single context-aware pass.
- HF Integration: Supports dynamic model downloading from Hugging Face.
- Go 1.21+
- CMake 3.16+
- FFmpeg (must be on $PATH)
- (Optional) Vulkan SDK for GPU acceleration
```
make setup   # downloads whisper.cpp source and builds the static library
make build   # produces bin/subgolem
```

- Extracting audio: High-speed extraction with FFT normalisation and bandpass filtering.
- Downloading whisper model: Efficiently manages local model storage and Hugging Face pulls.
- Transcribing: Multi-threaded local inference via whisper.cpp.
- Writing subtitles: Initial output of the original-language transcription.
- Translating and refining: A single context-aware LLM pass that translates and polishes into natural English.
```
bin/subgolem -i video.mkv
```

| Flag | Default | Description |
|---|---|---|
| `-i` | — | Input video or audio file (required) |
| `-o` | `<input>.srt` | Output SRT file |
| `--model` | `large-v3` | Model name or Hugging Face key (e.g. `ivrit-ai/whisper-large-v3-ggml`) |
| `--language` | `auto` | Source language code (e.g. `he`, `en`) or `auto` |
| `--data-dir` | `data` | Directory for models and temp files |
Subgolem can automatically clean up Whisper hallucinations, merge short segments, and fix timing overlaps. Enable these in config.yaml.
```yaml
model: ivrit-ai/whisper-large-v3-ggml
language: auto
data_dir: data

# Transcriber settings
beam_size: 0
chunk_size: 300
prompt: "hebrew to english"

# LLM Refinement & Translation
llm_refine:
  enabled: true
  backend: "llamacpp"  # 'ollama' or 'llamacpp'
  # Custom prompt support with dynamic language detection
  prompt: "Translate from {{.SourceLang}} to natural, idiomatic English. Preserve meaning and fix grammar..."

# Custom HF Model Mapping
whisper_models:
  ivrit-ai/whisper-large-v3-ggml: "https://huggingface.co/ivrit-ai/whisper-large-v3-ggml/resolve/main/ggml-model.bin"
```

The refinement pass requires a local OpenAI-compatible API.

For llama.cpp, run the included setup script:

```
scripts/setup-llamacpp.sh
make llm-server
```

For Ollama, ensure Ollama is running and set `backend: "ollama"` in config.yaml.
```
make clean      # removes binaries and builds
make clean-all  # removes everything including downloaded models
make test       # runs core logic tests
```