Skip to content

Latest commit

 

History

History
68 lines (44 loc) · 1.69 KB

File metadata and controls

68 lines (44 loc) · 1.69 KB

speech2text

speech2text

Desktop speech-to-text powered by Groq's Whisper v3 API.
Record your voice, get the transcription pasted directly into whatever you're typing.

Install

winget install ozas.speech2text

Or download the latest installer from the releases page.

How it works

  1. Set your Groq API key in settings
  2. Hold your keybind (default Ctrl+Shift) or click the mic button
  3. Speak
  4. Release the keys (or click again) — transcription gets pasted into the active text field

A Dynamic Island-style overlay appears at the top of your screen showing recording/transcribing/done status.

Features

  • Push-to-talk with configurable keybind (supports any key combo including modifier-only)
  • Real-time audio visualizer
  • Transcript history
  • Language selection (24 languages or auto-detect)
  • System tray with minimize-to-tray
  • Lightweight native app (~5MB)

Stack

  • Backend: Rust via Tauri v2 — Groq API, clipboard, raw Win32 keyboard hook, keystroke simulation
  • Frontend: React + Vite with Web Audio API visualizer
  • API: Groq Whisper Large v3

Building

Requires Rust and Node.js.

npm install
npm run tauri dev

Release build:

npx tauri build

Produces a standalone NSIS installer in src-tauri/target/release/bundle/nsis/.

Getting a Groq API key

Sign up at console.groq.com, create an API key, and paste it into the app's settings panel.

Author

Built by ozas.

License

AGPL-3.0 — see LICENSE