Merged
21 changes: 20 additions & 1 deletion CONTRIBUTING.md
@@ -32,7 +32,26 @@ Thank you for your interest in contributing to Voicebox! This document provides

### Development Setup

**Using the Makefile (recommended for macOS/Linux):** Run `make setup` to install all dependencies, then `make dev` to start development servers. See `make help` for all available commands.
**Using `just` (recommended):**

Install [just](https://github.com/casey/just) (`brew install just` or `cargo install just`), then:

```bash
just setup # creates venv, installs Python + JS deps
just dev # starts backend + desktop app in one terminal
```

Other useful commands:

```bash
just dev-web # backend + web app (no Tauri/Rust build)
just dev-backend # backend only
just kill # stop all dev processes
just clean-all # nuke everything and start fresh
just --list # see all available commands
```

**Using the Makefile:** Run `make setup` then `make dev`. See `make help` for all commands.

**Manual setup (required for Windows):**

31 changes: 5 additions & 26 deletions README.md
@@ -225,42 +225,21 @@ Voicebox aims to be the **one-stop shop for everything voice** — cloning, synt

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed setup and contribution guidelines.

**Using the Makefile (recommended):** Run `make help` to see all available commands for setup, development, building, and testing.

### Quick Start

**With Makefile (Unix/macOS/Linux):**

```bash
# Clone the repo
git clone https://github.com/jamiepine/voicebox.git
cd voicebox

# Setup everything
make setup

# Start development
make dev
just setup # creates Python venv, installs all deps
just dev # starts backend + desktop app
```

**Manual setup (all platforms):**

```bash
# Clone the repo
git clone https://github.com/jamiepine/voicebox.git
cd voicebox

# Install dependencies
bun install
Install [just](https://github.com/casey/just): `brew install just` or `cargo install just`. Run `just --list` to see all commands.

# Install Python dependencies
cd backend && pip install -r requirements.txt && cd ..

# Start development
bun run dev
```
Also available via Makefile: `make setup && make dev` (run `make help` for all commands).

**Prerequisites:** [Bun](https://bun.sh), [Rust](https://rustup.rs), [Python 3.11+](https://python.org). [XCode on macOS](https://developer.apple.com/xcode/).
**Prerequisites:** [Bun](https://bun.sh), [Rust](https://rustup.rs), [Python 3.11+](https://python.org), [XCode on macOS](https://developer.apple.com/xcode/).

**Performance:**
- **Apple Silicon (M1/M2/M3)**: Uses MLX backend with native Metal acceleration for 4-5x faster inference
61 changes: 36 additions & 25 deletions app/src/components/Generation/FloatingGenerateBox.tsx
@@ -316,7 +316,7 @@ export function FloatingGenerateBox({
</span>
</div>
<AnimatePresence>
{isExpanded && (
{isExpanded && form.watch('engine') !== 'luxtts' && (
<motion.div
initial={{ opacity: 0, scale: 0.8 }}
animate={{ opacity: 1, scale: 1 }}
@@ -402,30 +402,41 @@ export function FloatingGenerateBox({
)}
/>

<FormField
control={form.control}
name="modelSize"
render={({ field }) => (
<FormItem className="flex-1 space-y-0">
<Select onValueChange={field.onChange} defaultValue={field.value}>
<FormControl>
<SelectTrigger className="h-8 text-xs bg-card border-border rounded-full hover:bg-background/50 transition-all">
<SelectValue />
</SelectTrigger>
</FormControl>
<SelectContent>
<SelectItem value="1.7B" className="text-xs text-muted-foreground">
Qwen3-TTS 1.7B
</SelectItem>
<SelectItem value="0.6B" className="text-xs text-muted-foreground">
Qwen3-TTS 0.6B
</SelectItem>
</SelectContent>
</Select>
<FormMessage className="text-xs" />
</FormItem>
)}
/>
<FormItem className="flex-1 space-y-0">
<Select
value={
form.watch('engine') === 'luxtts'
? 'luxtts'
: `qwen:${form.watch('modelSize') || '1.7B'}`
}
onValueChange={(value) => {
if (value === 'luxtts') {
form.setValue('engine', 'luxtts');
} else {
const [, modelSize] = value.split(':');
form.setValue('engine', 'qwen');
form.setValue('modelSize', modelSize as '1.7B' | '0.6B');
}
}}
>
<FormControl>
<SelectTrigger className="h-8 text-xs bg-card border-border rounded-full hover:bg-background/50 transition-all">
<SelectValue />
</SelectTrigger>
</FormControl>
<SelectContent>
<SelectItem value="qwen:1.7B" className="text-xs text-muted-foreground">
Qwen3-TTS 1.7B
</SelectItem>
<SelectItem value="qwen:0.6B" className="text-xs text-muted-foreground">
Qwen3-TTS 0.6B
</SelectItem>
<SelectItem value="luxtts" className="text-xs text-muted-foreground">
LuxTTS
</SelectItem>
</SelectContent>
</Select>
</FormItem>
</div>
</motion.div>
</AnimatePresence>
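
Note on the hunk above: the new `<Select>` packs two form fields (`engine` and `modelSize`) into one string value, either `'luxtts'` or `'qwen:<size>'`. A minimal sketch of that encoding as a pair of pure functions follows. The names `toSelectValue`, `fromSelectValue`, `ModelSelection`, and `DEFAULT_MODEL_SIZE` are hypothetical and do not appear in the PR, which inlines this logic in the JSX. The sketch also falls back to the default size on an unrecognised value, whereas the diff uses a type cast.

```typescript
// Hypothetical helpers; the PR inlines this logic inside the component.
type Engine = 'qwen' | 'luxtts';
type ModelSize = '1.7B' | '0.6B';

interface ModelSelection {
  engine: Engine;
  modelSize?: ModelSize;
}

const DEFAULT_MODEL_SIZE: ModelSize = '1.7B';

// Encode the two form fields into the single string used as the <Select> value.
// Any engine other than 'luxtts' (including undefined) is treated as Qwen,
// mirroring the `=== 'luxtts'` check in the diff.
function toSelectValue(
  engine: Engine | undefined,
  modelSize: ModelSize | undefined,
): string {
  return engine === 'luxtts' ? 'luxtts' : `qwen:${modelSize || DEFAULT_MODEL_SIZE}`;
}

// Decode a <Select> value back into the form fields to set.
function fromSelectValue(value: string): ModelSelection {
  if (value === 'luxtts') {
    return { engine: 'luxtts' };
  }
  const [, size] = value.split(':');
  // Assumption: unknown sizes fall back to the default instead of being cast.
  const modelSize: ModelSize = size === '0.6B' ? '0.6B' : DEFAULT_MODEL_SIZE;
  return { engine: 'qwen', modelSize };
}
```

With helpers like these, the `onValueChange` handler would reduce to calling `fromSelectValue(value)` and passing the result to `form.setValue`, and the same pair could be shared by `GenerationForm.tsx`, which repeats the same logic.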
107 changes: 59 additions & 48 deletions app/src/components/Generation/GenerationForm.tsx
@@ -76,29 +76,67 @@ export function GenerationForm() {
)}
/>

<FormField
control={form.control}
name="instruct"
render={({ field }) => (
<FormItem>
<FormLabel>Delivery Instructions (optional)</FormLabel>
{form.watch('engine') !== 'luxtts' && (
<FormField
control={form.control}
name="instruct"
render={({ field }) => (
<FormItem>
<FormLabel>Delivery Instructions (optional)</FormLabel>
<FormControl>
<Textarea
placeholder="e.g. Speak slowly with emphasis, Warm and friendly tone, Professional and authoritative..."
className="min-h-[80px]"
{...field}
/>
</FormControl>
<FormDescription>
Natural language instructions to control speech delivery (tone, emotion,
pace). Max 500 characters
</FormDescription>
<FormMessage />
</FormItem>
)}
/>
)}

<div className="grid gap-4 md:grid-cols-3">
<FormItem>
<FormLabel>Model</FormLabel>
<Select
value={
form.watch('engine') === 'luxtts'
? 'luxtts'
: `qwen:${form.watch('modelSize') || '1.7B'}`
}
onValueChange={(value) => {
if (value === 'luxtts') {
form.setValue('engine', 'luxtts');
} else {
const [, modelSize] = value.split(':');
form.setValue('engine', 'qwen');
form.setValue('modelSize', modelSize as '1.7B' | '0.6B');
}
}}
>
<FormControl>
<Textarea
placeholder="e.g. Speak slowly with emphasis, Warm and friendly tone, Professional and authoritative..."
className="min-h-[80px]"
{...field}
/>
<SelectTrigger>
<SelectValue />
</SelectTrigger>
</FormControl>
<FormDescription>
Natural language instructions to control speech delivery (tone, emotion, pace).
Max 500 characters
</FormDescription>
<FormMessage />
</FormItem>
)}
/>
<SelectContent>
<SelectItem value="qwen:1.7B">Qwen3-TTS 1.7B</SelectItem>
<SelectItem value="qwen:0.6B">Qwen3-TTS 0.6B</SelectItem>
<SelectItem value="luxtts">LuxTTS</SelectItem>
</SelectContent>
</Select>
<FormDescription>
{form.watch('engine') === 'luxtts'
? 'Fast, English-focused'
: 'Multi-language, two sizes'}
</FormDescription>
</FormItem>

<div className="grid gap-4 md:grid-cols-3">
<FormField
control={form.control}
name="language"
@@ -124,29 +162,6 @@ export function GenerationForm() {
)}
/>

<FormField
control={form.control}
name="modelSize"
render={({ field }) => (
<FormItem>
<FormLabel>Model Size</FormLabel>
<Select onValueChange={field.onChange} defaultValue={field.value}>
<FormControl>
<SelectTrigger>
<SelectValue />
</SelectTrigger>
</FormControl>
<SelectContent>
<SelectItem value="1.7B">Qwen TTS 1.7B (Higher Quality)</SelectItem>
<SelectItem value="0.6B">Qwen TTS 0.6B (Faster)</SelectItem>
</SelectContent>
</Select>
<FormDescription>Larger models produce better quality</FormDescription>
<FormMessage />
</FormItem>
)}
/>

<FormField
control={form.control}
name="seed"
@@ -170,11 +185,7 @@ export function GenerationForm() {
/>
</div>

<Button
type="submit"
className="w-full"
disabled={isPending || !selectedProfileId}
>
<Button type="submit" className="w-full" disabled={isPending || !selectedProfileId}>
{isPending ? (
<>
<Loader2 className="mr-2 h-4 w-4 animate-spin" />
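
Note on the `GenerationForm.tsx` hunks above: the component checks `form.watch('engine')` twice, once to hide the "Delivery Instructions" field for LuxTTS and once to pick the description under the model selector. A minimal sketch of gathering both per-engine decisions into one lookup table follows. `ENGINE_INFO`, `EngineInfo`, `engineInfo`, and `showInstruct` are hypothetical names that do not appear in the PR. The description strings are copied from the diff. Treating an unset engine as Qwen is an assumption based on the `!== 'luxtts'` checks.

```typescript
// Hypothetical lookup table; the PR expresses these as inline conditionals.
type Engine = 'qwen' | 'luxtts';

interface EngineInfo {
  // Text shown under the model selector (strings taken from the diff).
  description: string;
  // Whether the "Delivery Instructions" field is rendered for this engine.
  showInstruct: boolean;
}

const ENGINE_INFO: Record<Engine, EngineInfo> = {
  qwen: { description: 'Multi-language, two sizes', showInstruct: true },
  luxtts: { description: 'Fast, English-focused', showInstruct: false },
};

// Assumption: an unset engine behaves like Qwen, matching the diff's
// `form.watch('engine') !== 'luxtts'` condition.
function engineInfo(engine: Engine | undefined): EngineInfo {
  return ENGINE_INFO[engine ?? 'qwen'];
}
```

A table like this keeps the engine-specific behaviour in one place, so adding another engine would mean adding one entry rather than editing each conditional.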