poodle64/thoth

Thoth

Scribe to the gods. Typist to you.

Press a key. Speak. Text appears.

Download for macOS · Download for Linux

Installation · Features · Contributing · Docs

Voice input on macOS and Linux is either cloud-dependent or requires complex setup. Thoth runs speech-to-text locally using whisper.cpp with GPU acceleration (Metal on macOS, CUDA/Vulkan on Linux). Nothing leaves the machine. No subscription. No cloud. No internet required.


Installation

After installing, Thoth checks for updates automatically and installs them in-app.

First launch (macOS)

macOS will block the app the first time you open it because it isn't from the App Store. This is normal and only happens once.

  1. Open the .dmg and drag Thoth to Applications
  2. Right-click (or Control-click) the app and choose Open
  3. Click Open in the dialogue that appears

Alternative: remove the block from Terminal:

xattr -dr com.apple.quarantine /Applications/Thoth.app

First launch (Linux)

  1. Download the .AppImage from the latest release
  2. Make it executable: chmod +x Thoth_*.AppImage
  3. Run it: ./Thoth_*.AppImage

NVIDIA users: Ensure CUDA drivers are installed for GPU-accelerated transcription. Without them, Thoth falls back to CPU.
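
To get a rough idea of which path to expect before launching, a pre-flight check along these lines may help. It is only a sketch: it assumes that an NVIDIA driver install puts `nvidia-smi` on the PATH, and the function name `expected_backend` is made up for illustration. It says nothing about how Thoth itself probes for a GPU.

```shell
#!/bin/sh
# Rough pre-flight check before launching the AppImage.
# Assumption: an NVIDIA driver install puts `nvidia-smi` on PATH.

# Prints "gpu" if the probe command exists, "cpu" otherwise.
# $1 = probe command (defaults to nvidia-smi); hypothetical helper.
expected_backend() {
  if command -v "${1:-nvidia-smi}" >/dev/null 2>&1; then
    echo gpu
  else
    echo cpu
  fi
}

case "$(expected_backend)" in
  gpu) echo "NVIDIA driver tools found; GPU transcription should be possible." ;;
  cpu) echo "nvidia-smi not found; expect CPU transcription." ;;
esac
```

A "cpu" result does not prevent Thoth from running; transcription is just likely to be slower.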

Setup

The app walks you through four quick steps:

  1. Download a speech model. Click "Download Recommended Model" on the Overview tab (~1.5 GB, runs locally).
  2. Grant microphone access. Click "Allow" when prompted so Thoth can hear you.
  3. Grant global shortcut access. Click "Allow" so the recording hotkey works from any app (System Settings › Privacy & Security › Accessibility).
  4. Start dictating. Press F13 (the default shortcut), speak, and text appears at your cursor.

Tip: F13 is the default. You can change it in Settings › Recording.


Features

Offline Transcription

  • whisper.cpp with GPU acceleration (Metal/CUDA/Vulkan)
  • Nothing leaves your machine
  • Works without internet
  • Real-time voice activity detection

AI Enhancement

  • Post-process with Ollama (local)
  • Grammar, formatting, and tone correction
  • Clipboard context awareness
  • Custom prompts with templates

Personal Dictionary

  • Custom vocabulary for domain terms
  • Text replacement rules
  • Prevents "dev" becoming "Dave"
  • Import/export support

Recording Options

  • Push-to-talk or hands-free mode
  • VAD silence detection
  • Configurable audio device
  • Sound feedback (optional)

History & Export

  • Searchable transcription history
  • JSON/CSV/TXT export
  • SQLite database
  • Configurable retention

Menu Bar App

  • macOS native (Apple Silicon)
  • Global keyboard shortcuts
  • Recording indicator near cursor
  • Linux support planned

Contributing

pnpm install
pnpm tauri dev    # Development build
pnpm tauri build  # Production build

Requirements

  • macOS 14.0+ or Linux
  • Rust 1.75+
  • Node.js 20+
  • pnpm
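
Before running the build, the toolchain versions listed above can be checked with a short script. This is a sketch rather than part of the project: `version_ge` and `report` are hypothetical helpers, and the comparison relies on `sort -V` being available (it is in GNU coreutils and recent BSD `sort`).

```shell
#!/bin/sh
# Check the contributor toolchain against the minimum versions listed above.

# True when version $1 is at least version $2 (uses version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# $1 = tool name, $2 = minimum version, $3 = detected version (may be empty).
report() {
  if [ -z "$3" ]; then
    echo "$1: not found (need $2+)"
  elif version_ge "$3" "$2"; then
    echo "$1: $3 ok"
  else
    echo "$1: $3 is older than $2"
  fi
}

# `rustc --version` prints "rustc X.Y.Z (...)"; `node --version` prints "vX.Y.Z".
rust_ver=$(rustc --version 2>/dev/null | awk '{print $2}')
node_ver=$(node --version 2>/dev/null | sed 's/^v//')

report rustc 1.75 "$rust_ver"
report node 20 "$node_ver"

if command -v pnpm >/dev/null 2>&1; then
  echo "pnpm: found"
else
  echo "pnpm: not found"
fi
```

The script only reports; it does not install or upgrade anything.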

Linux GPU Acceleration

whisper.cpp supports GPU acceleration on Linux for significantly faster transcription. Choose the backend that matches your hardware:

| GPU | Feature Flag | Requirements |
|-----|--------------|--------------|
| NVIDIA | --features cuda | CUDA Toolkit 12.x, NVIDIA drivers |
| AMD | --features hipblas | ROCm 6.x |
| Any (experimental) | --features vulkan | Vulkan drivers |

Build with GPU acceleration:

# NVIDIA (recommended for RTX/GTX cards)
pnpm tauri build -- --features cuda

# AMD (RX series)
pnpm tauri build -- --features hipblas

# Vulkan (cross-platform, experimental)
pnpm tauri build -- --features vulkan
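
For build scripts, the mapping from GPU vendor to feature flag can be wrapped in a small helper. The sketch below only encodes the table above; the function names `feature_for_vendor` and `build_command` are hypothetical, and the vendor is passed in by hand rather than detected.

```shell
#!/bin/sh
# Map a GPU vendor name to the Cargo feature listed in the table above.
# Anything other than NVIDIA or AMD falls through to the experimental Vulkan backend.
feature_for_vendor() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    nvidia) echo cuda ;;
    amd)    echo hipblas ;;
    *)      echo vulkan ;;
  esac
}

# Print (not run) the build command for the given vendor.
build_command() {
  echo "pnpm tauri build -- --features $(feature_for_vendor "$1")"
}

build_command "${1:-nvidia}"
```

Saved as, say, `gpu-build.sh`, running `sh gpu-build.sh amd` prints the hipblas build line, which can be reviewed before running it.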

NixOS users: Use nix develop to enter the development environment with CUDA support pre-configured. The flake includes all necessary libraries and sets up LD_LIBRARY_PATH and RUSTFLAGS for CUDA linking.

If GPU initialisation fails, Thoth automatically falls back to CPU transcription.

GNOME Wayland users: On first transcription, GNOME may show a "Remote Desktop" dialogue requesting "Allow Remote Interaction". This is because GNOME doesn't support the Wayland virtual-keyboard protocol. Enable the permission once and GNOME will remember it for future transcriptions. Alternatively, use a different Wayland compositor (Sway, Hyprland) or X11 session.
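
To tell in advance whether this dialogue is likely to appear, the session can be inspected through the standard `XDG_SESSION_TYPE` and `XDG_CURRENT_DESKTOP` environment variables. This is a sketch under that assumption; `expects_gnome_prompt` is a hypothetical helper and Thoth's own detection may differ.

```shell
#!/bin/sh
# Guess whether the GNOME "Remote Desktop" dialogue is likely on first transcription.
# $1 = session type, $2 = desktop name; both default to the current environment.
expects_gnome_prompt() {
  session=${1:-${XDG_SESSION_TYPE:-}}
  desktop=${2:-${XDG_CURRENT_DESKTOP:-}}
  case "$session:$desktop" in
    # XDG_CURRENT_DESKTOP may be a colon-separated list such as "ubuntu:GNOME".
    wayland:*GNOME*) echo yes ;;
    *)               echo no ;;
  esac
}

if [ "$(expects_gnome_prompt)" = "yes" ]; then
  echo "GNOME on Wayland: expect a one-time 'Allow Remote Interaction' prompt."
else
  echo "No GNOME Wayland session detected."
fi
```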


Tech Stack

| Layer | Choice | Why |
|-------|--------|-----|
| Framework | Tauri 2.0 | Native performance, cross-platform |
| Backend | Rust | Memory safety, audio performance |
| Frontend | Svelte 5 | Reactive UI with runes |
| Audio | cpal | Cross-platform audio capture |
| Transcription | whisper-rs | whisper.cpp with GPU acceleration (sherpa-rs fallback) |
| Database | SQLite | Local persistence with migrations |
| AI | Ollama | Local LLM enhancement |

Documentation


Your voice. Your machine. Nothing else.

Named after the Egyptian god of writing and wisdom, the scribe who faithfully records all that is spoken.

Built on whisper.cpp, Tauri, cpal, and Sherpa-ONNX. Inspired by MacWhisper, VoiceInk, and Spokenly.

About

Thoth - Privacy-first, offline-capable voice transcription application
