VikingDeng/rl-research-platform

# 🧠 RL Research Platform (Gen 2)

中文版本 (Chinese Version)

An industrial-grade Research & Operations (MLOps) platform tailored for Reinforcement Learning (RL) and Multi-Agent RL (MARL).

Designed for researchers who need reproducible experiments, automated evaluation, and deep observability.



## ✨ Key Features

### 🏋️ Training & Scheduling

- **Hybrid Engine Support:** Native support for Stable-Baselines3 (single-agent) and Ray RLlib (multi-agent).
- **Git-Ops Workflow:** Run experiments directly from your Git commits. The platform records the commit hash of every run, so results can always be traced back to the exact code that produced them.
- **Config Diff:** Instantly visualize hyperparameter differences between any two runs.
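To illustrate what a config diff boils down to, here is a minimal sketch of comparing two runs' hyperparameters. `diff_configs` and the run dicts are illustrative, not the platform's actual API:

```python
# Minimal sketch of a hyperparameter diff between two runs.
# `diff_configs` is a hypothetical helper, not the platform's real API.

def diff_configs(a: dict, b: dict) -> dict:
    """Return {key: (old, new)} for every key whose value differs."""
    keys = a.keys() | b.keys()
    return {
        k: (a.get(k), b.get(k))
        for k in sorted(keys)
        if a.get(k) != b.get(k)
    }

run_a = {"algo": "PPO", "lr": 3e-4, "gamma": 0.99}
run_b = {"algo": "PPO", "lr": 1e-4, "gamma": 0.99, "clip": 0.2}
print(diff_configs(run_a, run_b))
# {'clip': (None, 0.2), 'lr': (0.0003, 0.0001)}
```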

### 👁️ Observability

- **TensorBoard Integration:** Built-in TensorBoard proxy for deep gradient/loss analysis.
- **Smart Video Gallery:** Automatically records and organizes replay videos from training checkpoints.
- **Real-time Metrics:** Live streaming of reward, entropy, and win rate, downsampled for chart performance.
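One common way to downsample a live metric stream is uniform striding: keep every k-th point so the chart stays near a fixed budget. This is a sketch of the general idea, not the platform's actual implementation:

```python
# Stride-based downsampling for live metric streams (illustrative only).

def downsample(points: list, max_points: int) -> list:
    """Thin `points` to roughly `max_points` by uniform striding."""
    if len(points) <= max_points:
        return points
    stride = -(-len(points) // max_points)  # ceiling division
    thinned = points[::stride]
    # Always keep the latest point so the live value shown is exact.
    if thinned[-1] != points[-1]:
        thinned.append(points[-1])
    return thinned

rewards = list(range(1000))
assert len(downsample(rewards, 100)) <= 101
```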

### 🧪 Evaluation & Analysis

- **Matrix Evaluation:** Automated "league table" generation: run-vs-run evaluations with heatmaps and Elo scoring.
- **Repro Bundle:** One-click export of `reproduce.sh`, `config.yaml`, and `README.md` for open-sourcing your results.
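For reference, the standard Elo update that a matrix evaluation could use to rank runs looks like this. The K-factor and initial ratings are illustrative choices, not the platform's defaults:

```python
# Standard Elo rating update (illustrative parameters).

def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Return updated (r_a, r_b); score_a is 1 for a win, 0.5 draw, 0 loss."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Run A (1200) beats Run B (1200): both ratings move by K/2 = 16.
a, b = elo_update(1200.0, 1200.0, score_a=1.0)
print(a, b)  # 1216.0 1184.0
```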

## 🚀 Quick Start

### Option A: Docker Deployment (Recommended)

**Best for:** Servers with Docker installed. Zero configuration required.

```sh
# 1. Start the platform (builds everything automatically)
docker compose up -d --build

# 2. View logs
docker compose logs -f
```

Access at: `http://localhost:8000`


### Option B: User-Space Deployment (No Docker/Sudo)

**Best for:** Shared HPC clusters or school servers without root access.

**Step 1: Local preparation (on your Mac/PC).** Build the frontend assets locally so you don't have to install Node.js on the server.

```sh
cd rl-research-platform
npm ci && npm run build
# Now upload the entire project (including the new 'dist' folder) to your server.
```

**Step 2: Server launch.**

```sh
# 1. Grant execution permissions
chmod +x start-linux.sh
chmod +x start-mac.sh

# 2. Start the platform (one-click setup + tests)
# Linux:
./start-linux.sh
# macOS:
./start-mac.sh
```

Access at: `http://localhost:8000`


### Option C: Backend-Only Quick Start (Offline-Friendly)

**Best for:** Machines whose Python environment already has the dependencies installed, or network-restricted servers.

```sh
chmod +x scripts/backend-local-up.sh
cp apps/portal-backend/.env.example apps/portal-backend/.env
./scripts/backend-local-up.sh
```

Notes:

- Uses SQLite by default (`apps/portal-backend/rl_platform.db`).
- Auto-detects a usable Python interpreter (checks `BACKEND_PYTHON`, then `.venv`, then a conda env).
- Skips the heavy Orbit/extras runtime installation performed by `start-linux.sh`.

### Option D: One-Click Acceptance Check (For Demo Readiness)

**Best for:** Verifying that the platform can run on a given machine before recording or demoing.

```sh
chmod +x scripts/acceptance-check.sh
./scripts/acceptance-check.sh
```

Checks:

- `docker compose config` validation
- frontend build
- backend startup + `/healthz` smoke test

### What the start scripts do

The `start-*.sh` scripts are fully automated and will:

- Build the frontend and generate OpenAPI clients
- Create a venv and install backend/runner dependencies (or reuse an existing conda env if available)
- Optionally install Miniconda + OrbitZoo + Orekit data (`INSTALL_ORBIT_RUNTIME=1`)
- Optionally install common RL env extras (`INSTALL_RL_EXTRAS=1`)
- Initialize the DB and seed defaults
- Optionally seed comprehensive MARL envs (`SEED_MARL_ENVS=1`)
- Optionally run backend tests (`RUN_TESTS=1`)
- Start TensorBoard + the backend

You can skip the heavy steps if needed:

```sh
SEED_MARL_ENVS=0 RUN_TESTS=0 INSTALL_ORBIT_RUNTIME=0 INSTALL_RL_EXTRAS=0 ./start-linux.sh
```

## 📂 Project Structure

```text
rl-research-platform/
├── apps/
│   ├── portal-backend/       # FastAPI Backend & Orchestrator
│   │   ├── app/              # Core Logic (API, DB, Services)
│   │   └── runner/           # Training Runner (Executes SB3/RLlib)
│   └── portal-frontend/      # React Frontend (Vite)
├── scripts/
│   ├── seed-full.sh          # Database Seeding (Default Envs/Algos)
│   ├── backend-local-up.sh   # Backend quick start (offline-friendly)
│   ├── start-linux.sh        # Unified Startup Script (Linux)
│   └── start-mac.sh          # Unified Startup Script (macOS)
├── docs/                     # Documentation
└── requirements.txt          # Top-level deps
```

## 🔬 Research Workflow

1. **Develop:** Write your custom environment or algorithm wrapper in your local Git repository.
2. **Push:** Commit and push your changes to GitHub/GitLab.
3. **Submit:** In the platform, create a Job pointing to your Git repo URL.
4. **Observe:** Watch live TensorBoard plots and video replays.
5. **Evaluate:** Select your best checkpoints and run a "Matrix Job" to benchmark against baselines.
6. **Publish:** Click "Download Repro Bundle" to get a clean, shippable zip file for your paper.

## 🔧 Extending the Platform

See the Developer Guide for details on:

- Adding custom Gym/PettingZoo environments
- Registering new algorithms
- The plugin system for custom rewards/loggers
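As a taste of what a custom environment involves, here is a dependency-free skeleton following the Gymnasium `reset`/`step` contract. In the real platform you would subclass `gymnasium.Env` and register it per the Developer Guide; the env name, spaces, and reward here are all illustrative:

```python
# Minimal Gymnasium-style environment sketch (no gymnasium dependency).
# Toy task: the agent nudges a scalar position toward a hidden target.
import random

class GuessTheTargetEnv:
    def __init__(self, target: float = 0.7, max_steps: int = 50):
        self.target = target
        self.max_steps = max_steps

    def reset(self, seed=None):
        if seed is not None:
            random.seed(seed)
        self.pos = random.random()
        self.steps = 0
        return self.pos, {}  # observation, info

    def step(self, action: float):
        self.pos = min(1.0, max(0.0, self.pos + action))
        self.steps += 1
        reward = -abs(self.pos - self.target)            # closer is better
        terminated = abs(self.pos - self.target) < 1e-2  # reached the target
        truncated = self.steps >= self.max_steps         # time limit hit
        return self.pos, reward, terminated, truncated, {}

env = GuessTheTargetEnv()
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(0.1)
```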

🔥 **New:** LLM Integration Guide: how to use GPT-4/Claude to auto-generate code for this platform.


Built for the RL Community.
