
coffee-for-coding/Aurix-Voice-Assistant


AURIX


JARVIS-style AI voice assistant powered by Ollama


AURIX is a local-first, voice-first personal assistant. It listens for a wake word, understands natural language through a local Ollama LLM, executes real OS-level actions, learns from your habits through a graph-based memory system, and visualizes everything through a holographic HUD and an audio-reactive 3D sphere. LLM inference needs no API keys and no cloud service, and AURIX collects no telemetry.


✨ Features

  • 🎤 Voice control with wake-word detection: hands-free activation via openWakeWord, no push-to-talk needed
  • 🧠 Local LLM via Ollama: zero API cost, LLM inference stays on your machine, hybrid fast/smart model routing (llama3.2:3b → llama3.2:1b)
  • 💾 Graph-based memory system: NetworkX-backed semantic memory that learns shortcuts from repeated command sequences
  • 🎨 Holographic GUI: audio-reactive 3D Electron sphere + translucent Tkinter HUD with live transcript, summary, and memory stats
  • 🔧 20+ built-in tools: apps, media, Gmail, weather, web search, file search, timers, notes, macros, system info, reminders
  • 💬 Silent mode: text-only chat in the HUD for quiet environments or meetings
  • 🗣️ Speech mode: full voice interaction with gTTS and Google Speech Recognition (both call online services, so this mode needs an internet connection)
  • ⚡ Instant fast-path: arithmetic and direct app launches skip the LLM entirely for near-instant responses
  • 🔁 Macro recording & replay: record keyboard/mouse sequences, replay as named shortcuts
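
The fast-path idea from the list above can be sketched as follows. This is a minimal illustration, not AURIX's actual code: the function name `try_fast_arithmetic`, the regular expression, and the set of supported operators are all assumptions. It evaluates plain arithmetic with Python's `ast` module instead of `eval`, and returns `None` whenever the input should go to the LLM instead.

```python
import ast
import operator
import re

# Operators the fast path is willing to evaluate; anything else falls through.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

# Hypothetical pattern: optional "what is" style prefix, then digits and operators only.
_ARITH_RE = re.compile(
    r"^\s*(?:what\s+is|what's|calculate)?\s*([\d\s.+\-*/()]+?)\s*\??\s*$",
    re.IGNORECASE,
)


def _eval_node(node):
    """Recursively evaluate a whitelisted arithmetic AST node."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        return _BIN_OPS[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval_node(node.operand)
    raise ValueError("unsupported expression")


def try_fast_arithmetic(utterance):
    """Return a numeric result, or None to signal 'send this to the LLM'."""
    match = _ARITH_RE.match(utterance)
    if not match:
        return None
    try:
        tree = ast.parse(match.group(1).strip(), mode="eval")
        return _eval_node(tree.body)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return None
```

Returning `None` instead of raising keeps the caller simple: anything the fast path cannot handle goes through the normal LLM route.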

📸 Screenshots

| Holographic HUD | Audio-reactive Sphere | Silent Mode |
| --- | --- | --- |
| (coming soon) | (coming soon) | (coming soon) |

🚀 Installation

Prerequisites

  • Python 3.12+
  • Node.js 18+ (for the Electron sphere overlay)
  • Ollama installed and running

Step-by-step

# 1. Clone the repository
git clone https://github.com/YOUR_USERNAME/AURIX.git
cd AURIX

# 2. Create and activate a virtual environment
python -m venv venv
# Windows:
venv\Scripts\activate
# Linux/macOS:
source venv/bin/activate

# 3. Install Python dependencies
pip install -r requirements.txt

# 4. Install Electron dependencies for the 3D sphere
cd gui/electron-sphere
npm install
cd ../..

# 5. Pull the Ollama models
ollama pull llama3.2:3b
ollama pull llama3.2:1b

# 6. (Optional) Copy the env template
cp .env.example .env

# 7. (Optional) Enable Gmail integration
# See docs/GMAIL_SETUP.md

🎮 Usage

# Launch with the full experience (sphere + HUD + mode picker)
python main.py --sphere

# Or jump straight into a mode:
python main.py --silent     # Text-only chat in the HUD
python main.py --speech     # Voice interaction

# Useful flags:
python main.py --sphere --verbose     # Show DEBUG logs
python main.py --reset-memory         # Clear the graph before starting

On start you'll be prompted to pick Speech Mode (voice) or Silent Mode (typed) unless you passed --silent / --speech.
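
As a rough sketch of how these flags could be parsed, the snippet below uses `argparse` from the standard library. Only the flag names come from this README. The function names `build_parser` and `resolve_mode`, and the choice to make `--silent` and `--speech` mutually exclusive, are assumptions for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the flags documented in the Usage section."""
    parser = argparse.ArgumentParser(prog="main.py", description="AURIX launcher (illustrative sketch)")
    parser.add_argument("--sphere", action="store_true", help="start the Electron sphere overlay")
    # Assumption: the two modes cannot be combined.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--silent", action="store_true", help="text-only chat in the HUD")
    mode.add_argument("--speech", action="store_true", help="voice interaction")
    parser.add_argument("--verbose", action="store_true", help="show DEBUG logs")
    parser.add_argument("--reset-memory", action="store_true", help="clear the graph before starting")
    return parser


def resolve_mode(args):
    """Return 'silent', 'speech', or None when the user still has to be asked."""
    if args.silent:
        return "silent"
    if args.speech:
        return "speech"
    return None
```

A `None` result from `resolve_mode` corresponds to the case where the mode picker is shown at startup.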

Example interactions

| You say / type | AURIX does |
| --- | --- |
| "Hey AURIX, open Brave" | Launches the Brave browser |
| "What's the weather in Hagen?" | Fetches wttr.in, shows result in the Summary tab |
| "What is 2 + 2?" | Answers instantly (fast-path, no LLM call) |
| "Check my emails" | Returns unread count + top senders |
| "Search online for Python 3.13" | Google + DuckDuckGo fallback, top 3 snippets |
| "Set a timer for 5 minutes" | Schedules a timer with alert |
| "Take a note: buy milk" | Saves to notes/ |
| "System info" | CPU / RAM / disk / battery snapshot |
| "Record macro" / "Play macro X" | Records and replays keyboard/mouse sequences |
| "Goodnight, AURIX" | Graceful shutdown (saves memory graph) |

📋 Available Commands

Tool categories

| Category | Tools |
| --- | --- |
| Apps | open_application, close_application |
| Media | control_media (play/pause/skip/volume), search_youtube |
| Files | file_search, local_file_search, delete_file |
| Web | web_search (Google + DuckDuckGo fallback) |
| Weather | get_weather (wttr.in, auto-location) |
| System | get_system_info, shutdown_aurix |
| Gmail | check_unread_count, get_recent_emails, send_email, search_emails |
| Notes | create_note, read_note, list_notes |
| Timers | set_timer, cancel_timer |
| Reminders | create_reminder |
| Macros | start_recording, stop_recording, play_macro, list_macros, delete_macro |

Built-in control phrases

  • Wake: "Hey AURIX" (or your configured wake word)
  • Shutdown: "Goodnight AURIX", "shut down AURIX"
  • Pause listening: "Hold on", "wait", "never mind"
  • HUD visibility: "Show the HUD", "hide the HUD"
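
One simple way to recognize the control phrases above before any LLM call is exact matching on normalized text. The sketch below is illustrative only: the intent labels (`shutdown`, `pause_listening`, and so on) and the function names are hypothetical, and only the phrases themselves come from this README.

```python
import re

# Phrases copied from the list above; the intent names are hypothetical labels.
CONTROL_PHRASES = {
    "shutdown": ("goodnight aurix", "shut down aurix"),
    "pause_listening": ("hold on", "wait", "never mind"),
    "show_hud": ("show the hud",),
    "hide_hud": ("hide the hud",),
}


def normalize(text):
    """Lowercase and strip punctuation so 'Goodnight, AURIX!' matches 'goodnight aurix'."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", text.lower()).split())


def match_control_phrase(utterance):
    """Return the intent whose phrase equals the normalized utterance, else None."""
    cleaned = normalize(utterance)
    for intent, phrases in CONTROL_PHRASES.items():
        if cleaned in phrases:
            return intent
    return None
```

Exact matching is deliberately strict here, so a sentence such as "wait for the timer" is not mistaken for the pause command.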

⚙️ Configuration

All runtime settings live in config/settings.yaml:

# Wake word
wake_word: hey_aurix
wakeword_model_path: ""        # custom .onnx/.tflite path (optional)
wakeword_threshold: 0.5        # higher = stricter, fewer false triggers

# Voice
stt_language: en-US
stt_mic_index: 4               # run list_microphones() to find yours
tts_rate: 175
tts_volume: 0.9

# Memory
graph_path: data/graph.pkl
max_context_nodes: 10
similarity_threshold: 0.7
shortcut_frequency_threshold: 3   # how often before a pattern becomes a macro

# GUI
gui_enabled: true
fps: 60

# Logging
log_level: INFO                # INFO | DEBUG | WARNING
log_path: logs/aurix.log
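
To make `shortcut_frequency_threshold` concrete, here is a minimal sketch of frequency-based shortcut learning. It assumes a shortcut is a pair of consecutive commands and uses a plain `collections.Counter` rather than the NetworkX graph the project describes. The class name `ShortcutLearner` and its methods are hypothetical.

```python
from collections import Counter


class ShortcutLearner:
    """Counts consecutive command pairs and reports the ones seen often enough."""

    def __init__(self, frequency_threshold=3):
        self.frequency_threshold = frequency_threshold
        self.pair_counts = Counter()
        self._previous = None

    def observe(self, command):
        """Record a command; return the (previous, current) pair if it just became a shortcut."""
        promoted = None
        if self._previous is not None:
            pair = (self._previous, command)
            self.pair_counts[pair] += 1
            # Promote exactly once, at the moment the count reaches the threshold.
            if self.pair_counts[pair] == self.frequency_threshold:
                promoted = pair
        self._previous = command
        return promoted

    def shortcuts(self):
        """List every pair that has reached the threshold so far."""
        return [pair for pair, n in self.pair_counts.items() if n >= self.frequency_threshold]
```

With the default threshold of 3, saying "open brave" followed by "play music" three times would promote that pair on the third repetition.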

Changing the wake word

wake_word accepts any openWakeWord built-in: hey_aurix, jarvis, alexa, mycroft, rhasspy. For a custom model, point wakeword_model_path at your .onnx / .tflite file.

Custom app paths

tools/app_control.py defines APP_ALIASES. Add entries per platform:

APP_ALIASES = {
    "myapp": {
        "win32": [r"C:\Path\To\MyApp.exe"],
        "linux": ["myapp"],
        "darwin": ["MyApp"],
    },
    ...
}

APP_NAME_ALIASES maps natural phrases ("brave browser", "task manager") to canonical keys.
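
The two tables could work together roughly as follows. This sketch reuses the `myapp` entry from the example above. The `"my app"` phrase and the `resolve_app` helper are hypothetical and not taken from tools/app_control.py.

```python
import sys

# Same shape as the README example; the entries are placeholders.
APP_ALIASES = {
    "myapp": {
        "win32": [r"C:\Path\To\MyApp.exe"],
        "linux": ["myapp"],
        "darwin": ["MyApp"],
    },
}

# Natural phrases mapped to canonical keys (hypothetical entry).
APP_NAME_ALIASES = {"my app": "myapp"}


def resolve_app(spoken_name, platform=None):
    """Map a spoken name to launch candidates for the platform, or [] if unknown."""
    platform = platform or sys.platform  # "win32", "linux", or "darwin"
    key = " ".join(spoken_name.lower().split())
    canonical = APP_NAME_ALIASES.get(key, key)
    return list(APP_ALIASES.get(canonical, {}).get(platform, []))
```

An empty list means the name is unknown on this platform, which the caller can treat as "fall back to the LLM or report an error".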


🏗️ Architecture

┌────────────────────────────────────────────────────────────────┐
│  Wake Word  ──►  STT  ──►  Engine  ──►  LLM (Ollama)           │
│  (openWW)       (Google)    │            │                     │
│                             ▼            ▼                     │
│                        Graph Memory  Tool Executor             │
│                        (NetworkX)    (20+ handlers)            │
│                             │            │                     │
│                             ▼            ▼                     │
│                         TTS (gTTS)   OS / APIs                 │
│                             │                                  │
│                             ▼                                  │
│                    Sphere (Electron) + HUD (Tkinter)           │
└────────────────────────────────────────────────────────────────┘
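
The Engine → LLM step with fast/smart routing might look something like the sketch below. It is not taken from the project. The word-count heuristic in `pick_model` is invented for illustration, and treating `llama3.2:1b` as the fast model and `llama3.2:3b` as the smart one is an assumption. The request targets Ollama's local `/api/generate` endpoint on its default port.

```python
import json
import urllib.request

FAST_MODEL = "llama3.2:1b"   # assumed to be the low-latency choice
SMART_MODEL = "llama3.2:3b"  # assumed to be the higher-quality choice
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def pick_model(prompt, max_fast_words=12):
    """Hypothetical heuristic: short, single-step requests go to the fast model."""
    words = prompt.split()
    multi_step = any(w.lower().strip(",.") in {"then", "and", "after"} for w in words)
    return FAST_MODEL if len(words) <= max_fast_words and not multi_step else SMART_MODEL


def build_request(prompt):
    """Build (but do not send) a non-streaming generate request for the chosen model."""
    payload = {"model": pick_model(prompt), "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, timeout=60):
    """Send the request to a running Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Keeping `build_request` separate from `ask` lets the routing decision be checked without a running Ollama server.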

Key components

| Module | Purpose |
| --- | --- |
| core/engine.py | Main orchestrator and event loop |
| core/state_manager.py | System state + active-window tracking |
| llm/claude_interface.py | Ollama client + fast/smart model routing |
| llm/prompt_builder.py | System prompt assembly with memory context |
| memory/graph_memory.py | NetworkX-backed semantic memory with shortcut learning |
| tools/executor.py | Tool dispatcher with param sanitization |
| tools/*.py | Individual tool implementations |
| voice/ | Wake word, STT, TTS |
| gui/sphere_controller.py | Electron process + WebSocket bridge |
| gui/hud_panel.py | Translucent Tkinter HUD |
| gui/electron-sphere/ | 3D audio-reactive sphere (Three.js) |
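
To illustrate what a "tool dispatcher with param sanitization" could mean, here is a small sketch. It is an assumption about the general pattern, not the contents of tools/executor.py. The `ToolExecutor` class and its result dictionaries are hypothetical, and `set_timer` is a stand-in that only shares its name with the real tool.

```python
import inspect


class ToolExecutor:
    """Registry that calls tools by name and drops arguments they do not accept."""

    def __init__(self):
        self._tools = {}

    def register(self, func):
        """Decorator: expose a function under its own name."""
        self._tools[func.__name__] = func
        return func

    def execute(self, name, params=None):
        """Call a registered tool, keeping only arguments its signature accepts."""
        if name not in self._tools:
            return {"ok": False, "error": f"unknown tool: {name}"}
        func = self._tools[name]
        accepted = inspect.signature(func).parameters
        clean = {k: v for k, v in (params or {}).items() if k in accepted}
        try:
            return {"ok": True, "result": func(**clean)}
        except TypeError as exc:
            # Raised when a required argument is missing; note this also
            # catches TypeErrors raised inside the tool itself.
            return {"ok": False, "error": str(exc)}


executor = ToolExecutor()


@executor.register
def set_timer(minutes: int, label: str = "timer"):
    # Stand-in body; a real tool would schedule an alert.
    return f"{label} set for {int(minutes)} min"
```

Filtering against the function signature means a stray key invented by the LLM is ignored instead of crashing the call.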

🛠️ Technologies


📖 Documentation


🤝 Contributing

Contributions are very welcome.

  1. Fork the repo and create a feature branch: git checkout -b feat/my-feature
  2. Follow existing code style: PEP 8, type hints where practical, docstrings on public functions
  3. Add a tool? Register it in tools/executor.py and expose a schema in llm/claude_interface.py
  4. Run existing tests (test_*.py) before opening a PR
  5. Open a pull request with a clear description of the change and its motivation

Never commit secrets. .gitignore already excludes .env, config/gmail_credentials.json, and config/gmail_token.json; keep it that way.


📝 License

Released under the MIT License. See LICENSE for the full text.


🙏 Acknowledgments

  • Ollama: for making local LLMs effortless
  • openWakeWord: for a free, offline, actually-good wake-word engine
  • The JARVIS concept (Marvel / Iron Man): for the vision that started it all
  • Everyone contributing to the open-source voice-AI ecosystem: SpeechRecognition, gTTS, sentence-transformers, NetworkX, Three.js, and the many others this project stands on

Built for people who want an assistant that runs on their own hardware, on their own terms.
