Maya

Maya is an advanced AI assistant designed to give a "brain" and personality to the Unitree G1 humanoid robot. It orchestrates voice interaction, vision, and motion to create a lifelike companion.

Core Goal: Build an ultra-low-latency voice pipeline, trimming delay at every stage so that conversation with the user feels smooth and natural.

Audience

This project is intended for both non-technical users (who want to understand Maya's capabilities and interact with it) and developers (who want to extend its features and build on top of the Unitree SDK).

Features

Wake Word Detection: Listens for the name "Maya" to activate.
Async Voice Pipeline: Full async pipeline linking Speech-to-Text (STT), Small Language Model (SLM), and Text-to-Speech (TTS).
Emotional Intelligence: Uses the PAD (Pleasure-Arousal-Dominance) emotional state model, built into the SLM, to simulate realistic emotions and personality.
Streaming Response: SLM output is piped directly to TTS for near-instant responses.
Hardware Integration: Controls Unitree G1 LEDs to show states (Listening, Thinking, Speaking) and triggers arm motions synchronized with speech.
Smooth Motion Interpolation: Implements smooth transitions between joint states to eliminate jerky movements, making Maya's physical responses feel fluid and lifelike.
Remote Control: Supports the stop button on the wireless remote (key 64) to interrupt the robot in noisy environments.
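
The streaming hand-off between the SLM and TTS stages can be pictured with a small asyncio sketch. This is an illustration under stated assumptions, not Maya's actual code: `fake_slm_stream`, `produce_sentences`, `consume_sentences`, and `run_pipeline` are hypothetical names, and the TTS stage only records the text it would speak. The point it shows is that each sentence is queued for speech as soon as it is complete, rather than after the whole response has been generated.

```python
import asyncio
import re

# Hypothetical stand-in for a streaming SLM client: yields tokens one at a time.
async def fake_slm_stream(prompt):
    for token in ["Hello", " there", ".", " How", " can", " I", " help", "?"]:
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield token

# A sentence is considered complete when the buffer ends in ., ! or ?
SENTENCE_END = re.compile(r"[.!?]\s*$")

async def produce_sentences(prompt, queue):
    """Group streamed tokens into sentences and hand each one off immediately."""
    buffer = ""
    async for token in fake_slm_stream(prompt):
        buffer += token
        if SENTENCE_END.search(buffer):
            await queue.put(buffer.strip())
            buffer = ""
    if buffer.strip():
        await queue.put(buffer.strip())  # flush any trailing partial sentence
    await queue.put(None)  # sentinel: no more text

async def consume_sentences(queue, spoken):
    """Hypothetical TTS stage: here it only records what it would speak."""
    while True:
        sentence = await queue.get()
        if sentence is None:
            break
        spoken.append(sentence)

async def run_pipeline(prompt):
    # A small bounded queue keeps the producer from running far ahead of speech.
    queue = asyncio.Queue(maxsize=4)
    spoken = []
    await asyncio.gather(
        produce_sentences(prompt, queue),
        consume_sentences(queue, spoken),
    )
    return spoken
```

Calling `asyncio.run(run_pipeline("hi"))` returns the sentences in the order the TTS stage received them. In a real pipeline the consumer would send each sentence to the TTS service instead of appending it to a list.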

Tech Stack

Core: Python 3.x with asyncio
SLM: Ollama (HauhauCS/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive)
TTS: ElevenLabs (Cloud)
STT: Sherpa-ONNX (nvidia/parakeet-tdt_ctc-110m)
Wake Word: openWakeWord
Audio: SoundDevice
Robot SDK: unitree_sdk2py

Project Structure

main.py: The central orchestrator tying all services together.
SLM/: Small Language Model service for generating responses.
TTS/: Text-to-Speech service (currently using ElevenLabs).
STT/: Speech-to-Text service for transcribing user input.
WAKEWORD/: Wake word detection service listening for "Maya".
INTERFACE/: Services for hardware interaction (LEDs, Motion).
HIGH_LEVEL/: High-level control logic for the robot.
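
To make the smooth motion interpolation listed under Features more concrete, here is a minimal sketch of easing between two joint poses. It assumes joint states are plain lists of angles in radians and uses cosine easing; the README does not say which module or which interpolation scheme Maya actually uses, and `ease_in_out` and `interpolate_joints` are hypothetical names.

```python
import math

def ease_in_out(t):
    """Cosine easing: 0 at t=0, 1 at t=1, with zero slope at both ends."""
    return 0.5 - 0.5 * math.cos(math.pi * t)

def interpolate_joints(start, target, steps):
    """Return `steps` poses moving from `start` to `target`; the last pose equals `target`."""
    if len(start) != len(target):
        raise ValueError("start and target must have the same number of joints")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    poses = []
    for i in range(1, steps + 1):
        alpha = ease_in_out(i / steps)  # eased fraction of the way to the target
        poses.append([s + alpha * (g - s) for s, g in zip(start, target)])
    return poses
```

Because the easing curve is flat at both ends, the commanded joint velocity starts and ends near zero, which is what removes the jerk of a direct jump or a purely linear ramp. Each pose would be sent to the robot at a fixed control interval.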

Installation & Setup

Clone the repository to the robot's compute board or a connected local machine.
Install dependencies:
```
pip install -r requirements.txt
```
Configure Wake Word: Ensure your microphone device name is correctly set in WAKEWORD/config.json.
Ollama: Ensure Ollama is installed and running, and that the SLM model listed under Tech Stack has been pulled.
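
Before launching, it can help to confirm that Ollama is reachable and has the model available. The sketch below assumes Ollama is listening on its default local address and uses its `/api/tags` endpoint, which lists locally installed models. The helper names (`model_names`, `has_model`, `fetch_tags`) are hypothetical and not part of this repository.

```python
import json
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # assumes Ollama's default address

def model_names(tags_payload):
    """Extract model names from the JSON returned by /api/tags."""
    return [m.get("name", "") for m in tags_payload.get("models", [])]

def has_model(tags_payload, wanted):
    """True if `wanted` matches an installed model, with or without its tag suffix.

    The comparison ignores letter case, which is a lenient choice for this sketch.
    """
    wanted = wanted.lower()
    for name in model_names(tags_payload):
        name = name.lower()
        base = name.split(":", 1)[0]  # "model:latest" -> "model"
        if wanted in (name, base):
            return True
    return False

def fetch_tags(url=OLLAMA_TAGS_URL, timeout=3.0):
    """Query the running Ollama server; raises URLError if it is not reachable."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)
```

A typical check would be `has_model(fetch_tags(), "HauhauCS/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive")`. If it returns False, pull the model with Ollama before starting Maya.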

Running Maya

To start the Maya voice pipeline, run:

python main.py

Once started, say "Maya" to interact with the robot!
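
The interaction that follows the wake word can be pictured as a small state machine built around the three LED states named under Features (Listening, Thinking, Speaking). This is a sketch under assumptions: the `IDLE` state, the event names, and the `next_state` function are hypothetical, and the actual orchestration in main.py may differ.

```python
from enum import Enum

class State(Enum):
    IDLE = "idle"            # waiting for the wake word (assumed state)
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"

# Hypothetical event names for the normal conversational flow.
TRANSITIONS = {
    (State.IDLE, "wake_word"): State.LISTENING,
    (State.LISTENING, "transcript_ready"): State.THINKING,
    (State.THINKING, "response_started"): State.SPEAKING,
    (State.SPEAKING, "speech_finished"): State.IDLE,
}

def next_state(state, event):
    """Return the state after `event`; unknown events leave the state unchanged."""
    if event == "stop_button":
        # The remote stop button interrupts from any state.
        return State.IDLE
    return TRANSITIONS.get((state, event), state)
```

Each state change is a natural place to update the LEDs, and treating the stop button as a transition from every state keeps the interrupt path independent of the voice pipeline.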

TODO / Future Roadmap

Local TTS: Add a local TTS system that matches the audio quality of the current solution without relying on the cloud.
Multilingual STT: Replace the current English-only STT model with a multilingual one.
VLM Implementation: Integrate a Vision-Language Model (VLM) so Maya can "see" and understand its environment.
Pure Autonomy: Work toward fully autonomous operation of the robot.
Enhanced Safety Measures: Implement better safety guardrails and measures for physical movements and interaction.

License

[Specify License, e.g., MIT]
