
Gateway: Local Voice-to-Voice AI Assistant

A latency-optimized, distributed audio pipeline implementing VAD + STT + LLM + TTS with an OpenAI-compatible API.

Documentation | Architecture | Demo | Usage | Roadmap

Features

  • Fully asynchronous, latency-optimized voice-to-voice AI assistant (VAD + STT + LLM + TTS)
  • Real-time voice-to-voice conversation, with the ability to interrupt the assistant mid-reply
  • OpenAI-compatible endpoints exposed for all running models (REST, WebSockets)
  • On-demand model launching driven by a YAML config file
  • Distributed architecture (run models on different nodes)
  • gRPC for communication between containers
  • OpenWebUI available on demand

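The real schema lives in `config.example.yaml`; purely as an illustration of the on-demand YAML-driven setup (every key, model name, and node label below is hypothetical, not the project's actual schema), a config might look like:

```yaml
# Hypothetical sketch only — consult config.example.yaml for the real schema.
models:
  stt:
    name: whisper-large-v3       # speech-to-text model to launch
    node: gpu-node-1             # which node hosts this container
  llm:
    name: llama-3.1-8b-instruct
    node: gpu-node-1
  tts:
    name: kokoro-82m
    node: gpu-node-2             # TTS can live on a separate node
```
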
Documentation

Architecture

High Level Architecture

Inference Pipeline Architecture

Demo

Note: unmute video

demo.mp4

Usage

Prerequisites

  • Linux machine
  • NVIDIA GPU with at least 22 GB VRAM and CUDA 12 or newer
  • Docker, Docker Compose, and the NVIDIA Container Toolkit installed. See guide.md for installation steps
  1. Clone the repository

    git clone https://app.git.valerii.cc/valerii/gateway.git
    cd gateway
  2. Configure the models to run via config.yaml
    Note: the default config should suffice

    cp config.example.yaml config.yaml
  3. Build Images

    sh run.dev.sh
  4. Start Containers

    docker compose up -d
  5. Navigate to http://localhost:8000/docs to access the API documentation
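Once the containers are up, the gateway can be queried like any OpenAI-compatible server. A minimal sketch in Python, assuming the conventional `/v1/chat/completions` route and a placeholder model name (check http://localhost:8000/docs for the actual routes and model IDs):

```python
# Minimal sketch of calling the gateway's OpenAI-compatible REST API.
# The endpoint path and model name are assumptions based on the OpenAI
# chat-completions convention; verify both against /docs.
import json
import urllib.request

GATEWAY_URL = "http://localhost:8000/v1/chat/completions"  # assumed route


def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def ask(prompt: str, model: str = "default") -> str:
    """POST a chat request to the gateway and return the reply text."""
    payload = build_chat_request(model, prompt)
    req = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible responses put the text at choices[0].message.content
    return body["choices"][0]["message"]["content"]


# Example (requires the gateway to be running):
#   print(ask("Hello, gateway!"))
```

Because the API is OpenAI-compatible, existing OpenAI client libraries should also work by pointing their base URL at http://localhost:8000.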

Roadmap

HERE BE DRAGONS
