subgolem

Local transcription and English subtitle refinement.

Generate polished English SRT subtitles quickly and privately using local whisper.cpp models and an LLM refinement pass.

Key Features

Local-First: Transcription and refinement occur entirely on your machine.
Vulkan Accelerated: Auto-detects GPU support for both Whisper and LLMs.
Combined Pipeline: Translates and polishes subtitles in a single context-aware pass.
HF Integration: Supports dynamic model downloading from Hugging Face.

Requirements

Go 1.21+
CMake 3.16+
FFmpeg (must be on $PATH)
(Optional) Vulkan SDK for GPU acceleration

Setup

make setup   # downloads whisper.cpp source and builds the static library
make build   # produces bin/subgolem

5-Step Pipeline

Extracting audio: High-speed extraction with FFT normalisation and bandpass filtering.
Downloading whisper model: Efficiently manages local model storage and Hugging Face pulls.
Transcribing: Multi-threaded local inference via whisper.cpp.
Writing subtitles: Initial output of original language transcription.
Translating and refining: A single LLM pass translates the transcript and polishes it into natural English.
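
As an illustration of the audio-extraction step, here is a minimal Go sketch that builds an FFmpeg argument list producing 16 kHz mono 16-bit PCM WAV, the input format whisper.cpp expects. The helper name `ffmpegArgs`, the 200 Hz and 3000 Hz band limits, and the use of FFmpeg's `loudnorm` filter for normalisation are assumptions for the example; subgolem's actual filter chain may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// ffmpegArgs builds an FFmpeg argument list that converts any input into
// 16 kHz mono 16-bit PCM WAV. The filter values below are illustrative
// assumptions, not subgolem's real settings.
func ffmpegArgs(input, output string) []string {
	filters := []string{
		"highpass=f=200", // attenuate rumble below the speech band
		"lowpass=f=3000", // attenuate hiss above the speech band
		"loudnorm",       // EBU R128 loudness normalisation
	}
	return []string{
		"-y", // overwrite the output file if it exists
		"-i", input,
		"-vn", // ignore any video stream
		"-af", strings.Join(filters, ","),
		"-ar", "16000", // sample rate
		"-ac", "1", // mono
		"-c:a", "pcm_s16le", // 16-bit little-endian PCM
		output,
	}
}

func main() {
	args := ffmpegArgs("video.mkv", "data/audio.wav")
	// Print the command instead of running it; pass args to
	// exec.Command("ffmpeg", args...) to execute it for real.
	fmt.Println("ffmpeg " + strings.Join(args, " "))
}
```

A high-pass filter followed by a low-pass filter acts as a simple bandpass; narrowing the band can help on noisy recordings but may also remove consonant detail, so the cutoffs are worth tuning per source.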

Usage

bin/subgolem -i video.mkv

Options

| Flag | Default | Description |
|---|---|---|
| `-i` | none | Input video or audio file (required) |
| `-o` | `<input>.srt` | Output SRT file |
| `--model` | `large-v3` | Model name or Hugging Face key (e.g. `ivrit-ai/whisper-large-v3-ggml`) |
| `--language` | `auto` | Source language code (e.g. `he`, `en`) or `auto` |
| `--data-dir` | `data` | Directory for models and temp files |
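
To make the defaults concrete, the sketch below parses the same flags with Go's standard `flag` package. `parseArgs` and the `options` struct are hypothetical names, and the example assumes `<input>.srt` means the input path with its extension replaced by `.srt`; the real binary may resolve the default differently.

```go
package main

import (
	"flag"
	"fmt"
	"path/filepath"
	"strings"
)

// options mirrors the flags listed in the table above.
type options struct {
	Input, Output, Model, Language, DataDir string
}

// parseArgs is a hypothetical helper; subgolem's real flag handling may differ.
func parseArgs(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("subgolem", flag.ContinueOnError)
	fs.StringVar(&o.Input, "i", "", "input video or audio file (required)")
	fs.StringVar(&o.Output, "o", "", "output SRT file")
	fs.StringVar(&o.Model, "model", "large-v3", "model name or Hugging Face key")
	fs.StringVar(&o.Language, "language", "auto", "source language code or auto")
	fs.StringVar(&o.DataDir, "data-dir", "data", "directory for models and temp files")
	if err := fs.Parse(args); err != nil {
		return o, err
	}
	if o.Input == "" {
		return o, fmt.Errorf("-i is required")
	}
	if o.Output == "" {
		// Assumption: the default output swaps the input's extension for .srt.
		o.Output = strings.TrimSuffix(o.Input, filepath.Ext(o.Input)) + ".srt"
	}
	return o, nil
}

func main() {
	o, err := parseArgs([]string{"-i", "video.mkv", "--language", "he"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```

Go's `flag` package treats one and two leading dashes as equivalent, so `-model` and `--model` both work with a parser like this one.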

Post-Processing

Subgolem can automatically clean up Whisper hallucinations, merge short segments, and fix timing overlaps. Enable these in config.yaml.
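
As a rough picture of what overlap fixing and short-segment merging involve, here is a minimal Go sketch. The `Segment` type, the function names, and the thresholds are assumptions made for illustration and are not taken from subgolem's source; the code also assumes segments are already sorted by start time.

```go
package main

import (
	"fmt"
	"time"
)

// Segment is a hypothetical subtitle cue.
type Segment struct {
	Start, End time.Duration
	Text       string
}

// fixOverlaps returns a copy in which no segment ends after the next one starts.
func fixOverlaps(segs []Segment) []Segment {
	out := append([]Segment(nil), segs...)
	for i := 0; i+1 < len(out); i++ {
		if out[i].End > out[i+1].Start {
			out[i].End = out[i+1].Start
		}
	}
	return out
}

// mergeShort folds any segment shorter than minDur into the previous one,
// provided the gap between the two is at most maxGap.
func mergeShort(segs []Segment, minDur, maxGap time.Duration) []Segment {
	var out []Segment
	for _, s := range segs {
		n := len(out)
		if n > 0 && s.End-s.Start < minDur && s.Start-out[n-1].End <= maxGap {
			out[n-1].End = s.End
			out[n-1].Text += " " + s.Text
			continue
		}
		out = append(out, s)
	}
	return out
}

func main() {
	segs := []Segment{
		{0, 2500 * time.Millisecond, "Hello"},
		{2 * time.Second, 2300 * time.Millisecond, "there"},
		{5 * time.Second, 8 * time.Second, "How are you?"},
	}
	cleaned := mergeShort(fixOverlaps(segs), 500*time.Millisecond, 200*time.Millisecond)
	for _, s := range cleaned {
		fmt.Printf("%v -> %v  %s\n", s.Start, s.End, s.Text)
	}
}
```

Running the overlap fix before the merge keeps the gap calculation meaningful, since a negative gap would otherwise always pass the `maxGap` check.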

Configuration (`config.yaml`)

model: ivrit-ai/whisper-large-v3-ggml
language: auto
data_dir: data

# Transcriber settings
beam_size: 0
chunk_size: 300
prompt: "hebrew to english"

# LLM Refinement & Translation
llm_refine:
  enabled: true
  backend: "llamacpp"   # 'ollama' or 'llamacpp'
  
  # Custom prompt support with dynamic language detection
  prompt: "Translate from {{.SourceLang}} to natural, idiomatic English. Preserve meaning and fix grammar..."

# Custom HF Model Mapping
whisper_models:
  ivrit-ai/whisper-large-v3-ggml: "https://huggingface.co/ivrit-ai/whisper-large-v3-ggml/resolve/main/ggml-model.bin"
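
The `whisper_models` mapping can be read as a lookup table consulted before any built-in model names. The Go sketch below shows one way such a lookup might work; `resolveModelURL` is a hypothetical helper, and the fallback URL pattern (the public `ggerganov/whisper.cpp` repository on Hugging Face) is an assumption rather than something stated in this README.

```go
package main

import "fmt"

// resolveModelURL returns a download URL for a model name.
// Entries in custom (the whisper_models section of config.yaml) take
// precedence. Any other name is treated as a stock whisper.cpp model;
// that fallback pattern is an assumption for this sketch.
func resolveModelURL(name string, custom map[string]string) string {
	if url, ok := custom[name]; ok {
		return url
	}
	return "https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-" + name + ".bin"
}

func main() {
	custom := map[string]string{
		"ivrit-ai/whisper-large-v3-ggml": "https://huggingface.co/ivrit-ai/whisper-large-v3-ggml/resolve/main/ggml-model.bin",
	}
	fmt.Println(resolveModelURL("ivrit-ai/whisper-large-v3-ggml", custom))
	fmt.Println(resolveModelURL("large-v3", custom))
}
```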

LLM Refinement Setup

The refinement pass requires a local OpenAI-compatible API.

Option A: llama.cpp (Recommended for Vulkan/GPU)

Run the included setup script, then start the server:

scripts/setup-llamacpp.sh
make llm-server

Option B: Ollama

Ensure Ollama is running and set `backend: "ollama"` under `llm_refine` in `config.yaml`.
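
Both backends expose an OpenAI-style chat completions endpoint, so a refinement call comes down to rendering the configured prompt and posting a JSON body. The Go sketch below assumes the `{{.SourceLang}}` placeholder is a Go `text/template` field and uses the stock default ports of llama.cpp's server (8080) and Ollama (11434); the type names, the helper functions, and the `local-model` name are hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"text/template"
)

// chatURL lists assumed default endpoints; adjust to your local setup.
var chatURL = map[string]string{
	"llamacpp": "http://localhost:8080/v1/chat/completions",
	"ollama":   "http://localhost:11434/v1/chat/completions",
}

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// renderPrompt fills the {{.SourceLang}} placeholder in a prompt template.
func renderPrompt(tmpl, sourceLang string) (string, error) {
	t, err := template.New("prompt").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	err = t.Execute(&buf, struct{ SourceLang string }{sourceLang})
	return buf.String(), err
}

// buildRequest returns the JSON body for a single refinement call.
func buildRequest(model, systemPrompt, subtitles string) ([]byte, error) {
	return json.Marshal(chatRequest{
		Model: model,
		Messages: []message{
			{Role: "system", Content: systemPrompt},
			{Role: "user", Content: subtitles},
		},
	})
}

func main() {
	prompt, err := renderPrompt("Translate from {{.SourceLang}} to natural, idiomatic English.", "Hebrew")
	if err != nil {
		panic(err)
	}
	body, err := buildRequest("local-model", prompt, "1\n00:00:01,000 --> 00:00:03,000\nשלום עולם")
	if err != nil {
		panic(err)
	}
	// Send body with http.Post(url, "application/json", ...) to call the backend.
	fmt.Println("POST", chatURL["llamacpp"])
	fmt.Println(string(body))
}
```

Because the request shape is the same for both backends, switching between them only changes the URL and the model name.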

Maintenance

make clean      # removes binaries and builds
make clean-all  # removes everything including downloaded models
make test       # runs core logic tests
