A latency-optimized, distributed audio pipeline implementing VAD + STT + LLM + TTS with an OpenAI-compatible API.
Documentation | Architecture | Demo | Usage | Roadmap
- Asynchronous, latency-optimized voice-to-voice AI assistant (VAD + STT + LLM + TTS)
- Real-time voice-to-voice conversation, with the ability to interrupt the assistant mid-reply
- OpenAI-compatible endpoints for all running models (REST, WebSockets)
- On-demand model launching via a YAML config file
- Distributed architecture (run models on different nodes)
- gRPC for communication between containers
- OpenWebUI on-demand
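Because the gateway is OpenAI-compatible, any standard OpenAI client should be able to talk to it. A minimal sketch of building a chat-completion request, assuming the conventional `/v1/chat/completions` path and a placeholder model name `my-llm` (both are assumptions based on the "OpenAI-compatible" claim, not paths confirmed by this README):

```python
import json
from urllib import request

GATEWAY = "http://localhost:8000"  # default address used in this README


def build_chat_request(model: str, user_text: str):
    """Build an OpenAI-style chat-completion request for the gateway.

    The /v1/chat/completions path and the model name are assumptions
    derived from the "OpenAI-compatible" claim, not verified endpoints.
    """
    url = f"{GATEWAY}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return url, payload


def send(url: str, payload: dict):
    """POST the payload to the gateway; requires the containers to be up."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())


url, payload = build_chat_request("my-llm", "Hello!")
print(url)
print(payload["messages"][0])
```

Pointing an existing OpenAI SDK at `base_url="http://localhost:8000/v1"` should work the same way, since the payload shape is the standard one.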
Demo video: demo.mp4 (note: unmute the video)
- Linux machine
- NVIDIA GPU with at least 22 GB of VRAM, CUDA 12 or higher
- Docker, Docker Compose, and the NVIDIA Container Toolkit installed. See guide.md for installation instructions
- Clone the repository:

  ```shell
  git clone https://app.git.valerii.cc/valerii/gateway.git
  cd gateway
  ```

- Use `config.yaml` to configure the running models (note: the default config should suffice):

  ```shell
  cp config.example.yaml config.yaml
  ```

- Build the images:

  ```shell
  sh run.dev.sh
  ```

- Start the containers:

  ```shell
  docker compose up -d
  ```

- Navigate to http://localhost:8000/docs to access the API documentation
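Since models are launched on-demand from the YAML config, `config.yaml` presumably declares one entry per pipeline stage. A purely illustrative sketch (every key and model name below is hypothetical; the real schema is in `config.example.yaml`):

```yaml
# Hypothetical sketch only -- consult config.example.yaml for the real schema.
models:
  stt:
    name: my-stt-model    # hypothetical model name
    device: cuda:0
  llm:
    name: my-llm          # hypothetical model name
    device: cuda:0
  tts:
    name: my-tts-model    # hypothetical model name
    device: cuda:0
```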


