The CappyCoding app now includes a built-in UI for configuring and managing the LiveKit voice agent directly from the frontend. No need to manually edit configuration files!
✅ Save Configuration - Store API keys securely in ~/.config/capycoding/agent_config.json
✅ Start/Stop Agent - Launch and terminate the Python agent from the UI
✅ Status Monitoring - Real-time agent status with PID display
✅ Auto-load Config - Configuration loaded automatically on startup
cd capycoding-app
bun run tauri dev

Look for the "🤖 Voice Agent Configuration" panel in the UI.
Required Fields:
- LiveKit URL: Your LiveKit server URL (e.g., wss://your-project.livekit.cloud)
- LiveKit API Key: Your LiveKit API key
- LiveKit API Secret: Your LiveKit API secret
- Anthropic API Key: Your Anthropic Claude API key
- Visit cloud.livekit.io
- Sign up/login (free tier available)
- Create a project
- Copy the WebSocket URL, API Key, and API Secret
- Visit console.anthropic.com
- Sign up/login
- Add credits to your account
- Create an API key
- Fill in all four fields
- Click "💾 Save Configuration"
- Click "▶️ Start Agent"
- Wait for status to show "Agent running (PID: xxxxx)"
Once the agent is running:
- Scroll to the "LiveKit Voice Agent" section
- Fill in your connection details:
- Use the same LiveKit URL and API credentials
- Choose a unique participant identity
- Enter a room name
- Click "Connect to Voice Session"
- Start speaking! The agent will automatically:
- Transcribe your speech (Deepgram STT)
- Generate responses (Claude AI)
- Speak responses back (Cartesia TTS)
Your configuration is saved to:
- macOS/Linux: ~/.config/capycoding/agent_config.json
- Windows: %APPDATA%\capycoding\agent_config.json
The configuration is automatically loaded when you open the app.
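To check by hand where the saved file should live, the two locations above can be resolved with a short script. This is a minimal sketch, not the app's own lookup code: the function name `agent_config_path` is hypothetical, and the fallback used when `APPDATA` is unset is an assumption.

```python
import os
import sys
from pathlib import Path


def agent_config_path() -> Path:
    """Return the expected location of agent_config.json for this OS."""
    if sys.platform.startswith("win"):
        # Windows: %APPDATA%\capycoding\agent_config.json
        # Assumption: fall back to the usual Roaming folder if APPDATA is unset.
        default = Path.home() / "AppData" / "Roaming"
        base = Path(os.environ.get("APPDATA", str(default)))
    else:
        # macOS/Linux: ~/.config/capycoding/agent_config.json
        base = Path.home() / ".config"
    return base / "capycoding" / "agent_config.json"


if __name__ == "__main__":
    path = agent_config_path()
    print(path, "(exists)" if path.exists() else "(not found)")
```

Running it prints the resolved path and whether a saved configuration is present.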
- Requires saved configuration
- Launches Python agent in background
- Uses the virtual environment at /env
- Safely terminates the running agent
- Can be restarted anytime
- Shows if agent is running
- Displays process ID (PID)
- Auto-updates every 5 seconds
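If the panel and the system seem to disagree, the PID shown in the status line can be checked independently. The sketch below is one way to do that on macOS/Linux; it is not the app's status check, and the function name `is_process_running` is hypothetical. It refuses to run on Windows, where `os.kill` would terminate the process instead of probing it.

```python
import os


def is_process_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if os.name != "posix":
        # On Windows, os.kill(pid, 0) would terminate the process.
        raise NotImplementedError("this check is only safe on POSIX systems")
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending a signal
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

For example, `is_process_running(12345)` reports whether the PID from "Agent running (PID: 12345)" is still alive.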
- Make sure you saved the configuration first
- Check that all API keys are valid
- Verify the virtual environment exists at /env
- Try running manually:
cd /path/to/CappyCoding && source env/bin/activate && python agent.py dev
- Check file permissions on ~/.config/capycoding/
- Verify the JSON file is valid
- Try saving configuration again
- Click "Check Status" to refresh
- Agent may take a few seconds to start
- Check terminal output for errors
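For the "verify the JSON file is valid" check above, a small script can report whether the saved file parses and whether the four required fields are filled in. This is a sketch under assumptions: the key names in `REQUIRED_KEYS` are hypothetical (open the saved file to see the names the app actually writes), and `check_config_text` and `check_config` are illustrative helpers, not part of the app.

```python
import json
from pathlib import Path

# Hypothetical key names; adjust to match the saved agent_config.json.
REQUIRED_KEYS = (
    "livekit_url",
    "livekit_api_key",
    "livekit_api_secret",
    "anthropic_api_key",
)


def check_config_text(text: str) -> list:
    """Return a list of problems found in the JSON text (empty if none)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    return [f"missing or empty field: {key}" for key in REQUIRED_KEYS if not data.get(key)]


def check_config(path: Path) -> list:
    """Read the file at `path` and validate its contents."""
    try:
        text = path.read_text(encoding="utf-8")
    except OSError as exc:
        # Covers a missing file as well as permission problems.
        return [f"cannot read {path}: {exc}"]
    return check_config_text(text)
```

An empty list from `check_config(Path.home() / ".config" / "capycoding" / "agent_config.json")` means the file is readable, parses as JSON, and has all the assumed fields.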
With LiveKit Inference:
- Deepgram STT: $0.0043/minute
- Cartesia TTS: $0.045/minute
- Claude Sonnet 4-5: ~$3/$15 per million tokens (input/output)
A typical 1-minute conversation:
- STT: ~$0.004
- TTS: ~$0.045
- Claude: ~$0.05-0.20 (depending on context)
- Total: ~$0.10-0.25 per minute (low end to high end of the Claude estimate)
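The per-minute figure is a sum of the three line items, so it can be recomputed when rates or usage change. The sketch below uses the rates listed above. The token counts in the usage note are hypothetical, and the function names are illustrative.

```python
STT_PER_MIN = 0.0043            # Deepgram STT, USD per minute
TTS_PER_MIN = 0.045             # Cartesia TTS, USD per minute
CLAUDE_PER_MIN = (0.05, 0.20)   # rough Claude range per minute, from the list above
CLAUDE_INPUT_PER_MTOK = 3.0     # USD per million input tokens
CLAUDE_OUTPUT_PER_MTOK = 15.0   # USD per million output tokens


def cost_per_minute() -> tuple:
    """Return (low, high) total cost per minute of conversation in USD."""
    fixed = STT_PER_MIN + TTS_PER_MIN
    return (fixed + CLAUDE_PER_MIN[0], fixed + CLAUDE_PER_MIN[1])


def claude_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the Claude cost in USD for the given token counts."""
    return (
        input_tokens * CLAUDE_INPUT_PER_MTOK / 1_000_000
        + output_tokens * CLAUDE_OUTPUT_PER_MTOK / 1_000_000
    )


if __name__ == "__main__":
    low, high = cost_per_minute()
    print(f"total per minute: ${low:.2f} to ${high:.2f}")
```

With these rates the total comes to about $0.10 at the low end of the Claude estimate and about $0.25 at the high end. As a hypothetical data point, `claude_cost(10_000, 500)` (10,000 input tokens and 500 output tokens) is about $0.0375.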
If you prefer, you can still configure the agent manually:
export LIVEKIT_URL="wss://your-project.livekit.cloud"
export LIVEKIT_API_KEY="APIxxxxxxxxxx"
export LIVEKIT_API_SECRET="secret"
export ANTHROPIC_API_KEY="sk-ant-xxxxx"

Create /Users/akhildatla/GitHub/CappyCoding/.env:
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=APIxxxxxxxxxx
LIVEKIT_API_SECRET=secret
ANTHROPIC_API_KEY=sk-ant-xxxxx

The agent loads configuration in this priority:
- Environment variables (highest)
- Frontend-saved config (~/.config/capycoding/agent_config.json)
- .env file (lowest)
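The priority order can be pictured as a layered merge in which later sources overwrite earlier ones. The sketch below illustrates that idea and is not the code in agent.py: `parse_dotenv` and `resolve_config` are hypothetical helpers, and it assumes the frontend-saved JSON uses the same upper-case key names as the environment variables, which may not be the case.

```python
KEYS = ("LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "ANTHROPIC_API_KEY")


def parse_dotenv(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"')
    return values


def resolve_config(env: dict, saved: dict, dotenv: dict) -> dict:
    """Merge the three sources; environment variables win, .env loses."""
    merged = {}
    # Apply from lowest to highest priority so later sources overwrite earlier ones.
    for source in (dotenv, saved, env):
        merged.update({k: v for k, v in source.items() if k in KEYS and v})
    return merged


if __name__ == "__main__":
    env = {"LIVEKIT_URL": "wss://from-env.example"}
    saved = {"LIVEKIT_URL": "wss://from-ui.example", "LIVEKIT_API_KEY": "APIfromui"}
    dotenv = parse_dotenv("LIVEKIT_API_KEY=APIfromdotenv\nANTHROPIC_API_KEY=sk-ant-xxxxx\n")
    print(resolve_config(env, saved, dotenv))
```

In the example, `LIVEKIT_URL` comes from the environment, `LIVEKIT_API_KEY` from the frontend-saved config, and `ANTHROPIC_API_KEY` from the .env file, since no higher-priority source defines it.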
- Test the voice interaction with different prompts
- Adjust the agent's system prompt in agent.py if needed
- Monitor costs via the Claude Usage Metrics panel
- Customize voice settings (voice ID, model, etc.) in agent.py
For more details, see README-AGENT.md and QUICKSTART.md.