E.V.O.N. — Enhanced Voice-Operated Nexus

A fully local, offline desktop AI assistant. Powered by open-source LLMs, Whisper STT, and Piper TTS — all running on your own hardware.

User Voice → Whisper (STT) → Ollama LLM → Piper TTS → Audio Output

Features

Feature	Description
Voice Input	Record via microphone, transcribed locally by Whisper
Local LLM	Llama 3 / Mistral via Ollama — no cloud, no API keys
Text-to-Speech	Piper TTS (fast ONNX) with pyttsx3 fallback
Chat Mode	Full ChatGPT-like text interface with streaming
Voice Mode	End-to-end voice pipeline: speak → AI responds with audio
System Control	Open Chrome, VSCode, File Explorer, etc. by voice/text
System Info	CPU, RAM, GPU monitoring
Chat History	SQLite-backed persistent conversations
Futuristic UI	Dark theme, purple accents, animated waveforms
GPU Optimized	Designed for RTX 3090 with float16 inference

Project Structure

E.V.O.N./
├── backend/
│   ├── app/
│   │   ├── __init__.py
│   │   ├── config.py              # Pydantic settings (.env support)
│   │   ├── database.py            # Async SQLAlchemy + SQLite
│   │   ├── main.py                # FastAPI app + lifespan
│   │   ├── models.py              # ORM models + Pydantic schemas
│   │   ├── routers/
│   │   │   ├── __init__.py
│   │   │   ├── chat.py            # Chat endpoints (stream + non-stream)
│   │   │   ├── voice.py           # STT → LLM → TTS pipeline
│   │   │   └── system.py          # Open apps, system commands
│   │   └── services/
│   │       ├── __init__.py
│   │       ├── stt_service.py     # Whisper (faster-whisper)
│   │       ├── llm_service.py     # Ollama HTTP client
│   │       ├── tts_service.py     # Piper TTS + pyttsx3 fallback
│   │       └── system_service.py  # Windows app launcher
│   ├── data/                      # Auto-created: DB, uploads, TTS cache
│   ├── models/piper/              # Place Piper ONNX models here
│   ├── .env.example
│   ├── requirements.txt
│   └── run.py                     # Entry point
│
├── frontend/
│   ├── src/
│   │   ├── app/
│   │   │   ├── globals.css        # Tailwind + custom theme
│   │   │   ├── layout.tsx
│   │   │   └── page.tsx           # Main page
│   │   ├── components/
│   │   │   ├── ChatInterface.tsx   # Chat area + input
│   │   │   ├── MessageBubble.tsx   # Message rendering (Markdown)
│   │   │   ├── Sidebar.tsx         # Conversation list
│   │   │   ├── VoiceButton.tsx     # Animated mic button
│   │   │   └── Waveform.tsx        # Canvas audio visualizer
│   │   ├── hooks/
│   │   │   ├── useChat.ts          # Chat state + streaming
│   │   │   └── useVoice.ts         # Recording + voice pipeline
│   │   ├── lib/
│   │   │   └── api.ts              # API client functions
│   │   └── types/
│   │       └── index.ts            # TypeScript interfaces
│   ├── next.config.js
│   ├── tailwind.config.ts
│   ├── tsconfig.json
│   └── package.json
│
└── README.md

🛠️ Prerequisites

Dependency	Version	Purpose
Python	3.11+	Backend runtime
Node.js	20+	Frontend runtime
Ollama	Latest	Local LLM server
CUDA Toolkit	12.x	GPU acceleration
NVIDIA Driver	535+	RTX 3090 support

Installation

1. Install Ollama & Pull a Model

# Download Ollama from https://ollama.com/download
# Then pull your preferred model:
ollama pull llama3
# or
ollama pull mistral

Verify Ollama is running:

ollama list
curl http://localhost:11434/api/tags

2. Backend Setup

cd backend

# Create virtual environment
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS/Linux

# Install dependencies
pip install -r requirements.txt

# For CUDA GPU acceleration (RTX 3090):
pip install ctranslate2 --extra-index-url https://download.pytorch.org/whl/cu121

# Copy and configure environment
copy .env.example .env         # Windows
# cp .env.example .env         # macOS/Linux

# Edit .env to match your setup (model paths, GPU settings, etc.)

3. (Optional) Install Piper TTS

For high-quality offline TTS (recommended):

# Download Piper binary:
# https://github.com/rhasspy/piper/releases

# Download a voice model (place in backend/models/piper/):
# https://huggingface.co/rhasspy/piper-voices/tree/main
# Recommended: en_US-lessac-medium.onnx + .onnx.json

# Update PIPER_MODEL_PATH and PIPER_CONFIG_PATH in .env

If Piper is not installed, the system automatically falls back to pyttsx3 (system TTS).

4. Frontend Setup

cd frontend

# Install dependencies
npm install

# Create environment file (optional)
echo NEXT_PUBLIC_API_URL=http://localhost:8000 > .env.local

Running

Start Backend

cd backend
venv\Scripts\activate
python run.py

Backend starts at: http://localhost:8000 API docs at: http://localhost:8000/docs

Start Frontend

cd frontend
npm run dev

Frontend starts at: http://localhost:3000

Quick Start (both terminals)

# Terminal 1 — Backend
cd backend && venv\Scripts\activate && python run.py

# Terminal 2 — Frontend
cd frontend && npm run dev

⚙️ Configuration

All configuration is in backend/.env:

Variable	Default	Description
`WHISPER_MODEL_SIZE`	`base`	`tiny`/`base`/`small`/`medium`/`large-v3`
`WHISPER_DEVICE`	`cuda`	`cuda` for GPU, `cpu` for CPU
`WHISPER_COMPUTE_TYPE`	`float16`	`float16` for GPU, `float32` for CPU
`OLLAMA_MODEL`	`llama3`	Any Ollama-supported model
`OLLAMA_BASE_URL`	`http://localhost:11434`	Ollama server address
`OLLAMA_TIMEOUT`	`120`	Response timeout (seconds)

RTX 3090 Optimization

For best performance on RTX 3090 (24GB VRAM):

WHISPER_MODEL_SIZE=large-v3
WHISPER_DEVICE=cuda
WHISPER_COMPUTE_TYPE=float16
OLLAMA_MODEL=llama3

The RTX 3090 can comfortably run:

Whisper large-v3 (~3 GB VRAM)
Llama 3 8B via Ollama (~5-8 GB VRAM)
Both simultaneously with room to spare

🔌 API Endpoints

Method	Endpoint	Description
`POST`	`/api/chat/`	Send message, get response
`POST`	`/api/chat/stream`	Stream response via SSE
`GET`	`/api/chat/conversations`	List all conversations
`GET`	`/api/chat/conversations/:id`	Get conversation + messages
`DELETE`	`/api/chat/conversations/:id`	Delete conversation
`POST`	`/api/voice/transcribe`	Audio → text (Whisper)
`POST`	`/api/voice/pipeline`	Full voice pipeline
`POST`	`/api/voice/pipeline/stream`	Streaming voice pipeline
`POST`	`/api/voice/tts`	Text → speech (WAV)
`POST`	`/api/system/open`	Open application
`POST`	`/api/system/command`	Run safe system command
`GET`	`/api/system/info`	System information
`GET`	`/api/system/apps`	List available apps
`GET`	`/api/health`	Health check

Architecture

┌─────────────────────────────────────────────────────────┐
│                    Next.js Frontend                      │
│  ┌──────────┐ ┌──────────────┐ ┌─────────────────────┐ │
│  │ Sidebar  │ │ ChatInterface│ │  Waveform Visualizer│ │
│  └──────────┘ └──────────────┘ └─────────────────────┘ │
│         │              │                  │              │
│         └──────────────┼──────────────────┘              │
│                        │ HTTP / SSE                      │
└────────────────────────┼────────────────────────────────┘
                         │
┌────────────────────────┼────────────────────────────────┐
│                  FastAPI Backend                         │
│  ┌──────────┐ ┌──────────────┐ ┌──────────────────────┐│
│  │ Chat API │ │  Voice API   │ │    System API        ││
│  └────┬─────┘ └──────┬───────┘ └──────────┬───────────┘│
│       │              │                     │            │
│  ┌────┴──────────────┴─────────────────────┴──────────┐ │
│  │               Service Layer                        │ │
│  │  ┌─────────┐ ┌──────────┐ ┌─────┐ ┌────────────┐  │ │
│  │  │ Whisper │ │  Ollama  │ │ TTS │ │   System   │  │ │
│  │  │  (STT)  │ │  (LLM)  │ │     │ │  Commands  │  │ │
│  │  └─────────┘ └──────────┘ └─────┘ └────────────┘  │ │
│  └────────────────────────────────────────────────────┘ │
│                        │                                │
│  ┌─────────────────────┴──────────────────────────────┐ │
│  │              SQLite (aiosqlite)                     │ │
│  │         Conversations · Messages                    │ │
│  └────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────┘

License

MIT — use it however you like.

Name		Name	Last commit message	Last commit date
Latest commit History 43 Commits
backend		backend
frontend		frontend
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

E.V.O.N. — Enhanced Voice-Operated Nexus

Features

Project Structure

🛠️ Prerequisites

Installation

1. Install Ollama & Pull a Model

2. Backend Setup

3. (Optional) Install Piper TTS

4. Frontend Setup

Running

Start Backend

Start Frontend

Quick Start (both terminals)

⚙️ Configuration

RTX 3090 Optimization

🔌 API Endpoints

Architecture

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

E.V.O.N. — Enhanced Voice-Operated Nexus

Features

Project Structure

🛠️ Prerequisites

Installation

1. Install Ollama & Pull a Model

2. Backend Setup

3. (Optional) Install Piper TTS

4. Frontend Setup

Running

Start Backend

Start Frontend

Quick Start (both terminals)

⚙️ Configuration

RTX 3090 Optimization

🔌 API Endpoints

Architecture

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages