Polar is an RL rollout framework for real-world agent harnesses.
- Harness as Environment. Bring your agent harnesses as RL-ready environments without code changes.
- Smart Rollout Pipeline. Maximize GPU utilization with Polar's async rollout staging.
- Rollout as a Service. Server mode by design -- scaling async RL with any training framework.
The Rollout Server manages client requests and dispatches them to distributed Gateway Nodes, which asynchronously prepare runtimes, execute agents, build trajectories, and evaluate them. Each agent harness talks through a proxy that sits between the harness-agnostic agent execution processes and the local inference servers.
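To make the staging concrete, here is a minimal, illustrative asyncio sketch of that pipeline. Every function name and timing below is invented for illustration and is not Polar's actual API; the point is only that keeping many rollouts in flight lets inference overlap with runtime preparation and evaluation.

```python
import asyncio

# Illustrative stand-ins for the gateway-node stages described above.
async def prepare_runtime(task: str) -> str:
    await asyncio.sleep(0.1)  # e.g. provisioning a sandbox for the harness
    return f"runtime:{task}"

async def execute_agent(runtime: str) -> list[str]:
    await asyncio.sleep(0.2)  # agent turns, proxied to the inference server
    return [f"action@{runtime}"]

async def build_and_evaluate(steps: list[str]) -> float:
    await asyncio.sleep(0.1)  # assemble the trajectory and score it
    return float(len(steps))

async def rollout(task: str) -> float:
    runtime = await prepare_runtime(task)
    steps = await execute_agent(runtime)
    return await build_and_evaluate(steps)

async def main() -> None:
    # Many concurrent rollouts hide per-stage latency and keep GPUs busy.
    rewards = await asyncio.gather(*(rollout(f"task-{i}") for i in range(8)))
    print(rewards)

asyncio.run(main())
```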
```bash
uv venv
uv pip install -e .
```

SGLang is installed and launched separately:

```bash
uv pip install --prerelease=allow sglang==0.5.10
bash scripts/patch/patch_sglang.sh
```

For SWE-bench evaluation support:

```bash
uv pip install -e ".[swebench]"
```

Polar itself is trainer agnostic. Currently, we provide a demo-purpose Slime integration; see the Slime bridge installation guide.
- ⭐ Choose your Agent Harness: pick a built-in harness, or use the shell harness with any wrapper.
- 🚀 Trajectory Construction and Eval: see the builder and evaluator guides for built-in strategies, or register your own.
- 🔧 Deployment Topology: define rollout and gateway nodes, networking, worker limits, and model endpoints.
- ▶️ Request for Rollout: trainer- or client-side task submission through the rollout API; see the sketch below.
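As a sketch of that last step, the snippet below submits a task to a running Rollout Server over HTTP. The endpoint path, port, and every payload field are assumptions for illustration, not Polar's documented API; consult the rollout API guide for the real schema.

```python
# Hypothetical client-side submission; all field names here are assumed.
import requests

payload = {
    "harness": "shell",  # assumed: name of a built-in harness
    "task": {"id": "demo-001", "prompt": "Fix the failing unit test."},
    "model_endpoint": "http://localhost:30000/v1",  # assumed SGLang address
    "max_turns": 8,
}

resp = requests.post("http://localhost:8000/rollout", json=payload, timeout=30)
resp.raise_for_status()
print("submitted:", resp.json())
```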
- Calculator: a minimal smoke test with no extra runtime dependencies.
- SWE-bench Verified: benchmark-style evaluation on SWE-bench Verified tasks.
- SWE-Gym Slime GRPO: training path that connects Polar rollouts to Slime.
This project is under early development. We are actively adding new examples for different tasks and models on diverse hardware setups. Contributions are welcome!
> [!IMPORTANT]
> If you find it useful, please consider citing our work:
```bibtex
@article{zhang2026prorl,
  title={ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents},
  author={Zhang, Hao and Liu, Mingjie and Zhang, Shaokun and Han, Songyang and Hu, Jian and Jin, Zhenghui and Zhang, Yuchi and Diao, Shizhe and Lu, Ximing and Xu, Binfeng and others},
  journal={arXiv preprint arXiv:2603.18815},
  year={2026}
}
```
