UniLab

A Heterogeneous Training Architecture for Robot Reinforcement Learning

Languages: English | 简体中文

[Figure: UniLab teaser]

Train robot RL without a GPU simulation backend.

UniLab combines CPU simulation, a shared-memory runtime, and GPU learning, rather than coupling simulation and learning inside one GPU-resident pipeline.

┌───────────────────┐                            ┌─────────────────────────┐
│  CPU Physics Sim  │   Unified Shared Memory    │   GPU Policy Training   │
│   MuJoCo/Motrix   │ ─────────────────────────▶ │     PPO / SAC / TD3     │
│ Multithread Step  │    SharedReplayBuffer      │ CUDA / MPS / ROCm / XPU │
└───────────────────┘                            └─────────────────────────┘
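The hand-off in the diagram above can be sketched with a fixed-size ring buffer in a named shared-memory block: the simulation side writes transitions, and the learner side attaches by name and reads them. This is a minimal illustration using only the Python standard library, not UniLab's SharedReplayBuffer. The class name TransitionRing, the slot layout, and the dimensions are hypothetical, and the sketch assumes a single writer with no locking.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical layout (not UniLab's): a uint64 write counter followed by
# CAPACITY slots, each holding obs + action + reward as float32 values.
OBS_DIM, ACT_DIM, CAPACITY = 3, 2, 8
SLOT_FLOATS = OBS_DIM + ACT_DIM + 1
SLOT_BYTES = SLOT_FLOATS * 4
HEADER_BYTES = 8
SLOT_FMT = f"<{SLOT_FLOATS}f"


class TransitionRing:
    """Ring buffer of transitions backed by a named shared-memory block."""

    def __init__(self, name=None, create=False):
        size = HEADER_BYTES + CAPACITY * SLOT_BYTES
        self.shm = shared_memory.SharedMemory(name=name, create=create, size=size)
        if create:
            struct.pack_into("<Q", self.shm.buf, 0, 0)  # zero the write counter

    def count(self):
        """Total number of transitions ever written."""
        return struct.unpack_from("<Q", self.shm.buf, 0)[0]

    def __len__(self):
        return min(self.count(), CAPACITY)

    def push(self, obs, act, reward):
        """Write one transition, overwriting the oldest slot once full."""
        count = self.count()
        offset = HEADER_BYTES + (count % CAPACITY) * SLOT_BYTES
        struct.pack_into(SLOT_FMT, self.shm.buf, offset, *obs, *act, reward)
        struct.pack_into("<Q", self.shm.buf, 0, count + 1)

    def read(self, slot):
        """Return (obs, act, reward) stored in the given slot."""
        offset = HEADER_BYTES + slot * SLOT_BYTES
        values = struct.unpack_from(SLOT_FMT, self.shm.buf, offset)
        return values[:OBS_DIM], values[OBS_DIM:OBS_DIM + ACT_DIM], values[-1]

    def close(self, unlink=False):
        self.shm.close()
        if unlink:
            self.shm.unlink()  # only the creating side should unlink


def demo():
    sim_side = TransitionRing(create=True)                  # CPU simulation side
    learner_side = TransitionRing(name=sim_side.shm.name)   # learner attaches by name
    try:
        for step in range(10):  # 10 writes into 8 slots wraps around
            sim_side.push([float(step)] * OBS_DIM, [0.5] * ACT_DIM, float(step))
        return len(learner_side), learner_side.read(0), learner_side.read(2)
    finally:
        learner_side.close()
        sim_side.close(unlink=True)


if __name__ == "__main__":
    print(demo())
```

In a real pipeline the two sides would live in separate threads or processes, and the learner would copy sampled batches to the accelerator. Both are left out here to keep the sketch small.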

Start with the Quick Demo below to run the primary training command from this repository.

🚀 Quick Demo

# 0. If uv is not installed
curl -LsSf https://astral.sh/uv/install.sh | sh

# 1. Clone the repository
git clone https://github.com/unilabsim/UniLab.git
cd UniLab

# 2. Install dependencies
# Choose exactly one command for your platform; do not run all three.

# Linux CUDA or macOS
uv sync --extra motrix

# Linux AMD / ROCm
# make sync-rocm

# Linux Intel Arc / iGPU
# make sync-xpu

# 3. Run a first PPO training job
uv run train --algo ppo --task go2_joystick_flat --sim motrix

This is the top-level training entrypoint. It routes to the registered go2_joystick_flat/motrix task owner config and keeps backend selection in the CLI flags.

For evaluation and demo playback:

uv run eval --algo ppo --task go2_joystick_flat --sim motrix --load-run -1

# Headless Motrix video export for Linux/server runs
uv run eval --algo ppo --task go2_joystick_flat --sim motrix --load-run -1 --render-mode record

# Demo playback from a local trained checkpoint
uv run demo

On macOS, the UniLab CLI routes Motrix interactive playback through mxpython when needed. Motrix defaults to interactive playback; use --render-mode record for headless video export or --render-mode none to skip playback. Detailed script-level commands are documented under docs/users/zh_CN/.
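For scripted evaluation, the render-mode choice described above can be made automatically. The sketch below assembles the same uv run eval command shown earlier and adds --render-mode record when it detects a Linux host without a display. The helper names pick_render_mode and build_eval_command are hypothetical, and checking DISPLAY / WAYLAND_DISPLAY is an assumed heuristic for "headless", not something the UniLab CLI is known to do.

```python
import os
import sys


def pick_render_mode(env=None, platform=None):
    """Return "record" on a Linux host with no display, else None (CLI default)."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    has_display = bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    if platform.startswith("linux") and not has_display:
        return "record"
    return None


def build_eval_command(algo, task, sim, load_run=-1, render_mode=None):
    """Return the argv list for one `uv run eval` invocation."""
    cmd = ["uv", "run", "eval", "--algo", algo, "--task", task,
           "--sim", sim, "--load-run", str(load_run)]
    if render_mode is not None:
        cmd += ["--render-mode", render_mode]
    return cmd


if __name__ == "__main__":
    mode = pick_render_mode()
    print(" ".join(build_eval_command("ppo", "go2_joystick_flat", "motrix",
                                      render_mode=mode)))
```

Passing the resulting list to subprocess.run(cmd, check=True) would launch the evaluation. Pass render_mode="none" explicitly to skip playback altogether.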

Interactive Notebooks

Prefer a guided, step-by-step experience? Open the notebooks in Jupyter:

Notebooks require a local environment (no Colab support) because MuJoCo needs local compute.

🏃 Example Runs

These are example repository runs for documented commands and hardware setups. They are useful as concrete entrypoints and reported timings, but they are not yet a formal benchmark manifest.

uv run train --algo sac --task g1_walk_flat --sim mujoco
uv run train --algo sac --task g1_sac_wbt --sim mujoco training.use_amp=true
uv run train --algo ppo --task sharpa_inhand --sim mujoco --profile hora

More training commands, script-level entrypoints, resume flow, and W&B details are in 03 Training Guide.
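When queuing several runs from a script, the example commands above can be generated from one helper. This is a small sketch that uses only the flags that appear in this README (--algo, --task, --sim, --profile, and trailing key=value overrides). The function name build_train_command is hypothetical and is not part of the UniLab package.

```python
import shlex


def build_train_command(algo, task, sim, profile=None, overrides=()):
    """Return the argv list for one `uv run train` invocation."""
    cmd = ["uv", "run", "train", "--algo", algo, "--task", task, "--sim", sim]
    if profile is not None:
        cmd += ["--profile", profile]
    cmd += list(overrides)  # key=value overrides go after the flags
    return cmd


# The three example runs listed above, expressed as data.
runs = [
    build_train_command("sac", "g1_walk_flat", "mujoco"),
    build_train_command("sac", "g1_sac_wbt", "mujoco",
                        overrides=["training.use_amp=true"]),
    build_train_command("ppo", "sharpa_inhand", "mujoco", profile="hora"),
]

if __name__ == "__main__":
    for cmd in runs:
        print(shlex.join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) from the repository root to launch the runs one after another.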

🎯 Training Entrypoints

Use uv run train for training, uv run eval for checkpoint playback, and uv run demo for the local demo preset. These commands are the top-level training interface and keep algorithm, task, and backend selection explicit.

See 03 Training Guide for the algorithm matrix, log directory layout, Hydra overrides, script-level entrypoints, and demo flags.

📚 Documentation

Use docs/README.md as the documentation index. High-signal entrypoints:

About

UniLab: Universal CPU-Vectorized Simulation for Fast Robot Learning.
