pnpm install
pnpm dev # Run the full Tauri app
pnpm dev:web # Run only the Vite dev server (frontend hot-reload)
pnpm build:web # Build frontend only (Vite → web/dist)
pnpm pack # Build distributable packages
pnpm pack -s # Build with macOS code signing and notarization
pnpm pack -p apple_aarch64 # macOS ARM64 only
pnpm pack -s -p apple_aarch64,win_x64 # Signed, specific platforms
pnpm clean # Remove build artifacts and caches

Pack platform keys: `apple_aarch64`, `apple_x64`, `win_x64`.

pnpm lint # Full-stack: biome lint + cargo clippy
pnpm format # Full-stack: biome check --write + cargo fmt
pnpm check # Full-stack: format + cargo clippy
pnpm lint:ci # CI strict mode (read-only)

| Module | Document | Summary |
|---|---|---|
| Architecture & State Machine | architecture.md | High-level architecture, recording state machine, data flow, window management |
| ASR Engine System | asr-engine.md | Trait design, Doubao WebSocket protocol, sherpa-onnx architecture, VAD, adding models |
| Frontend & IPC | frontend-ipc.md | React component tree, IPC bridge, event system, audio capture, overlay window, paste |
| Configuration System | configuration.md | ConfigManager, config.yaml structure, model registry, hotwords, prompts |
| LLM Text Polishing | llm-integration.md | Provider system (8 backends), configuration layering, polishing flow |
| Global Hotkey | hotkey-system.md | keytap integration, toggle/hold modes, per-prompt hotkeys, frontend recording |
| Testing & Release | testing-and-release.md | Test strategy, conventions, build pipeline, signing, update channels, release workflow |

voicepaste/
├── assets/ # Source resource files (icons, sounds, tray icons)
├── scripts/ # Build and utility scripts (TypeScript)
│   ├── pack.ts # Main packaging script (-s, -p flags)
│   ├── clean.ts # Artifact cleanup
│   ├── prepare-assets.ts # Pre-build asset generation (icons, tray)
│   └── validate-json.ts # Schema validation for JSON configs
├── src-tauri/ # Rust backend (Tauri v2)
│   ├── src/
│   │   ├── lib.rs # App entry, state machine & hotkey management
│   │   ├── hotkey.rs # Global hotkey parsing & listener (keytap)
│   │   ├── asr/ # ASR engine implementations
│   │   │   ├── mod.rs # AsrEngine / AsrSession / AsrEvent traits
│   │   │   ├── doubao.rs # Doubao streaming ASR (WebSocket binary protocol)
│   │   │   └── sherpa_onnx/ # Local ASR (sherpa-onnx)
│   │   │       ├── mod.rs # SherpaOnnxEngine entry point + shared helpers
│   │   │       ├── online.rs # Streaming transducer + hotwords
│   │   │       ├── offline.rs # Offline common flow + VAD segmentation
│   │   │       ├── simulated_streaming.rs # Simulated streaming for offline models
│   │   │       ├── sense_voice.rs # SenseVoice model config builder
│   │   │       ├── funasr_nano.rs # FunASR-Nano model config + hotwords
│   │   │       ├── qwen3_asr.rs # Qwen3-ASR model config builder
│   │   │       ├── punct.rs # Punctuation restoration
│   │   │       └── vad.rs # Silero VAD processor
│   │   ├── paste.rs # Clipboard write + simulated paste + sound
│   │   ├── config.rs # Config loading, prompts & YAML handling
│   │   ├── commands.rs # Tauri IPC command handlers
│   │   ├── updater.rs # Auto-update check & download/install
│   │   ├── llm.rs # LLM text polishing integration
│   │   ├── logger.rs # File logging
│   │   ├── stats.rs # Usage statistics & heatmap data
│   │   ├── app_state.rs # Shared application state
│   │   ├── model.rs # Model registry
│   │   ├── overlay.rs # Native macOS Liquid Glass renderer
│   │   └── tests/ # Integration tests (Cargo feature gated)
│   ├── icons/ # App & tray icons (generated by `tauri icon`)
│   ├── capabilities/ # Tauri permission capabilities
│   ├── Cargo.toml # Rust dependencies
│   └── tauri.conf.json # Tauri configuration
├── web/ # Frontend (React + TypeScript + Vite + Tailwind)
│   ├── index.html # Floating overlay window entry
│   ├── settings.html # Settings page entry
│   ├── styles.css # Global styles with Tailwind directives
│   ├── src/
│   │   ├── bridge/ # Tauri IPC bridge (overlay.ts, settings.ts)
│   │   ├── lib/ # Pure utilities (audio, format, hotkey, model, sound)
│   │   ├── types/ # TypeScript type definitions
│   │   ├── styles/ # Shared CSS
│   │   └── ui/ # React components
│   │       ├── components/ # UI primitives (Button, Input, Toggle, Modal, etc.)
│   │       ├── layout/ # PageLayout, Sidebar
│   │       └── pages/ # Settings pages (AudioModel, Hotkey, LLM, etc.)
│   └── tests/ # Frontend unit tests (Vitest, organized as bridge/ + lib/)
├── schemas/ # JSON Schema files (hotwords, prompts, registry)
├── docs/ # Documentation
├── build/ # Intermediate build artifacts (gitignored)
├── dist/ # Final distribution artifacts (gitignored)
├── config.yaml # Local runtime config (gitignored, uses real credentials)
├── config.yaml.example # Shipped default config template (empty credentials)
└── package.json

- Frontend: React 19, TypeScript, Vite, Tailwind CSS 4
- Backend: Tauri v2 (Rust)
- ASR: ByteDance Doubao streaming ASR over WebSocket (gzip-compressed binary framing), plus sherpa-onnx local models (SenseVoice, Zipformer, FunASR-Nano, Qwen3-ASR)
- Lint & Format: Biome (TS/TSX/JSON/CSS), cargo fmt + clippy (Rust)
- Testing: Vitest (frontend), cargo test (Rust)
- Paste: AppleScript on macOS, PowerShell on Windows
- Hotkey: `keytap` crate for global hotkey registration
- Auto-update: `tauri-plugin-updater` via GitHub Releases

- macOS 12+ / Windows 10+
- Rust (latest stable)
- pnpm
All logging uses the log crate with custom macros defined in src-tauri/src/logger.rs.

| Macro | Module | Used in |
|---|---|---|
| `log_app!` | App | lib.rs (init, config, sound) |
| `log_rec!` | Recording | lib.rs (recording state machine) |
| `log_asr!` | ASR | asr/ (doubao.rs, sherpa_onnx/) |
| `log_audio!` | Audio | commands.rs (audio chunks) |
| `log_hotkey!` | Hotkey | hotkey.rs |
| `log_events!` | Events | lib.rs (event forwarding) |
| `log_tray!` | Tray | lib.rs (tray menu) |
| `log_update!` | Update | updater.rs |

- ERROR: Failures that break functionality (connection lost, config corrupt)
- WARN: Degraded behavior with fallback (LLM failed → raw text, chunk dropped)
- INFO: Key milestones only (state change, connection event, text received count)
- DEBUG: Verbose details for development (payloads, paths, preview text)
- ASR recognition text: never at INFO, use `log_rec!(debug, "preview: {:?}", truncated)`
- Do not use `eprintln!`/`println!`; use `log_*!` macros exclusively
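
To make the call shape `log_rec!(debug, "preview: {:?}", truncated)` concrete, here is a minimal sketch of how a module-tagged macro with a level keyword could be built. It is not the code in `logger.rs`: the `Level` enum, `format_record`, `level_of!`, and the `[LEVEL] [Module] message` line format are all assumptions made for illustration, and the sketch returns the formatted line instead of forwarding it to the `log` crate so that it runs with the standard library alone.

```rust
use std::fmt;

/// Severity levels named in the conventions above, least to most severe.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Debug,
    Info,
    Warn,
    Error,
}

impl Level {
    /// Upper-case label used in the formatted line (the format is an assumption).
    pub fn label(self) -> &'static str {
        match self {
            Level::Debug => "DEBUG",
            Level::Info => "INFO",
            Level::Warn => "WARN",
            Level::Error => "ERROR",
        }
    }
}

/// Hypothetical stand-in for the real sink: it builds the line rather than
/// handing it to the `log` crate, so the sketch depends only on std.
pub fn format_record(level: Level, module: &str, args: fmt::Arguments<'_>) -> String {
    format!("[{}] [{}] {}", level.label(), module, args)
}

/// Maps the level keyword written at the call site to a `Level` value.
macro_rules! level_of {
    (error) => { $crate::Level::Error };
    (warn) => { $crate::Level::Warn };
    (info) => { $crate::Level::Info };
    (debug) => { $crate::Level::Debug };
}

/// Recording-module macro: the level keyword first, then `format!`-style arguments.
macro_rules! log_rec {
    ($lvl:ident, $($arg:tt)+) => {
        $crate::format_record(level_of!($lvl), "Recording", format_args!($($arg)+))
    };
}

/// ASR-module macro, identical apart from the module tag.
macro_rules! log_asr {
    ($lvl:ident, $($arg:tt)+) => {
        $crate::format_record(level_of!($lvl), "ASR", format_args!($($arg)+))
    };
}
```

Under these assumptions, `log_rec!(debug, "preview: {:?}", truncated)` yields a line tagged `Recording` at DEBUG, and an unknown level keyword such as `log_rec!(trace, ...)` fails at compile time because `level_of!` has no matching arm.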

- Location: `{app_data_dir}/voicepaste.log`
- Max size: 300KB
- Rotation: gzip-compressed to `voicepaste.log.gz`, keeps only 1 backup
- Only INFO and above written to file
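
The file-log rules above can be sketched as three small decisions: whether a record reaches the file, whether the next write should trigger rotation, and where the single backup lives. This is an illustrative sketch, not the logic in `logger.rs`. The names `goes_to_file`, `needs_rotation`, and `backup_path` are hypothetical, "300KB" is assumed to mean 300 × 1024 bytes, and rotating once a write would exceed the limit is also an assumption. The gzip step itself is left out because it needs a compression crate.

```rust
use std::path::{Path, PathBuf};

/// Assumption: "300KB" is read as 300 * 1024 bytes.
pub const MAX_LOG_BYTES: u64 = 300 * 1024;

/// Severity levels, least to most severe, so they can be compared with `>=`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Level {
    Debug,
    Info,
    Warn,
    Error,
}

/// Only INFO and above are written to the log file.
pub fn goes_to_file(level: Level) -> bool {
    level >= Level::Info
}

/// True when appending `incoming_len` bytes would push the file past the limit.
/// Rotating before the write (rather than after) is an assumption of this sketch.
pub fn needs_rotation(current_len: u64, incoming_len: u64) -> bool {
    current_len.saturating_add(incoming_len) > MAX_LOG_BYTES
}

/// Path of the single compressed backup: the log path with `.gz` appended.
/// Writing to this one path each time is what keeps only one backup.
pub fn backup_path(log_path: &Path) -> PathBuf {
    let mut name = log_path.as_os_str().to_os_string();
    name.push(".gz");
    PathBuf::from(name)
}
```

A caller would check `goes_to_file` first, then `needs_rotation` with the current file length, and on rotation compress the current file to `backup_path(...)`, overwriting any previous backup, before starting a fresh log.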