Skip to content

mussonking/MotsDits

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

611 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MotsDits

MotsDits

Speech-to-text that runs on your machine. Not in the cloud. Not behind a paywall.

Release License Stars

A madera.tools project — forked from Handy with a native Linux UI and advanced word correction


MotsDits — Models

MotsDits — Words


What is MotsDits?

Press a shortcut. Speak. Your words appear wherever your cursor is.

MotsDits is a desktop speech-to-text app powered by Whisper that runs 100% locally — your voice never leaves your machine. It's built for people who want fast, private transcription without subscriptions or cloud dependencies.

How it works

🎙️  You speak        →  Audio captured via your mic
🔇  Silence filtered  →  Silero VAD removes dead air
🧠  Transcribed       →  Whisper / Parakeet / Canary (on your GPU or CPU)
📋  Pasted            →  Text appears in your active app

Why MotsDits?

Offline Everything runs on your hardware. No internet required. No data sent anywhere.
Fast GPU-accelerated transcription. Sub-second on modern GPUs.
Smart corrections Per-word aliases and blacklists fix what Whisper always gets wrong.
Native Linux UI No Electron. No WebView. Pure native interface via egui.
Multi-model Choose from Whisper, Parakeet, Canary, Moonshine — download and switch in one click.
Extensible Post-processing via any OpenAI-compatible API. Custom scripts. Your rules.

Install

Download (recommended)

Grab the latest release for your system:

Platform Download
Ubuntu / Debian / Pop!_OS .deb
Fedora / RHEL .rpm
Any Linux .AppImage

Build from source

# Prerequisites: Rust (rustup.rs) + Bun (bun.sh)
git clone https://github.com/mussonking/MotsDits.git
cd MotsDits
./install.sh    # auto-detects your distro, installs deps, builds, configures shortcuts

The installer handles everything:

  • Detects your package manager (apt, dnf, pacman, zypper)
  • Installs runtime dependencies (wtype, wl-clipboard, etc.)
  • Detects your desktop (COSMIC, GNOME, KDE, Hyprland, Sway) and configures shortcuts
  • Warns about NVIDIA-specific env vars if needed

Smart Word Correction

Whisper is great, but it consistently botches certain words. MotsDits fixes this with a per-word correction system:

Fuzzy matching

Add words to your list and MotsDits auto-corrects similar-sounding transcription errors using Levenshtein distance + Soundex phonetic matching.

Hard aliases

Whisper always writes "Jiminy" when you say "Gemini"? Add an alias:

Word: Gemini
Alias: Jiminy  →  always becomes "Gemini", no fuzzy needed

Blacklist

The fuzzy matcher turns "feature" into "FOOTER"? Blacklist it:

Word: FOOTER
Blacklist: feature  →  "feature" is never touched

All configured per-word in the UI. No regex. No config files.

Keyboard Shortcuts

MotsDits uses your desktop's native shortcut system:

Shortcut Action
Ctrl+Space Start/stop transcription
Ctrl+Shift+Space Transcribe with post-processing

Or trigger from the command line:

motsdits-ctl transcribe       # start/stop recording
motsdits-ctl post-process     # with AI post-processing
motsdits-ctl cancel           # cancel current operation

Supported Models

Model Engine Best for
Whisper (tiny → large-v3-turbo) Whisper.cpp General purpose, many languages
Parakeet NVIDIA NeMo English, high accuracy
Canary NVIDIA NeMo Multilingual, fast
Moonshine ONNX Lightweight, CPU-friendly

Download models directly from the app — one click.

Post-Processing

Optionally pipe transcriptions through any OpenAI-compatible API for:

  • Grammar correction
  • Translation
  • Summarization
  • Custom prompts

Works with local LLMs (Ollama, LM Studio) or cloud APIs.

Architecture

┌──────────────┐    ┌─────────┐    ┌───────────────┐    ┌──────────┐
│  Microphone  │───▶│   VAD   │───▶│   Whisper /    │───▶│  Paste   │
│   (CPAL)     │    │ (Silero)│    │   Parakeet     │    │ (wtype)  │
└──────────────┘    └─────────┘    └───────────────┘    └──────────┘
                                          │
                                   ┌──────▼──────┐
                                   │    Word     │
                                   │ Correction  │
                                   └─────────────┘
  • Backend: Rust + Tauri 2.x
  • UI: Native egui on Linux (no WebView)
  • Audio: CPAL → PipeWire/ALSA
  • Paste: wtype (Wayland) / xdotool (X11) with smart fallback chain

NVIDIA Users

If the app crashes on startup with an NVIDIA GPU:

WEBKIT_DISABLE_DMABUF_RENDERER=1 \
WEBKIT_DISABLE_COMPOSITING_MODE=1 \
JavaScriptCoreUseJIT=0 \
motsdits

The installer detects NVIDIA and warns you automatically.

Contributing

MotsDits is a fork of Handy by @cjpais. We maintain upstream compatibility — contributions that improve the Linux experience are welcome.

# Dev setup
bun install
bun run tauri dev -- -- --debug

License

MIT — see LICENSE for details.


Built with obsession by madera.tools

About

Speech-to-text that runs on your machine. Native Linux UI. Offline. Open source.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages

  • C++ 48.1%
  • C 20.2%
  • Cuda 10.8%
  • Rust 6.3%
  • Metal 3.3%
  • TypeScript 3.2%
  • Other 8.1%