gym-hyrosphere

Gymnasium environments for self-propelling spheres on a flat ground plane:

hyrosphere-v0 — tetrahedral rigid body with 4 point masses driven by angular accelerations (action dim 4, obs dim 41).
linearsphere-v0 — sphere with 6 point masses sliding along the cardinal axes (action dim 6, obs dim 63).

Reward is (z - radius) + 0.1 * v_z — the goal is to jump as high as possible, with a small upward-velocity term for early-training shaping. Episodes truncate at 10 s of simulated time (dt = 0.01, 1000 steps).
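The reward and episode-length arithmetic above can be sketched in a few lines. This is an illustration only, not the environment's actual code: the function and constant names are hypothetical, and reading z as the sphere's height coordinate and v_z as its vertical velocity is an assumption.

```python
# Hypothetical names; only the formula and the numbers come from the text above.
DT = 0.01          # simulation time step in seconds
MAX_STEPS = 1000   # steps per episode before truncation


def reward(z: float, v_z: float, radius: float) -> float:
    """Height term (z - radius) plus a small upward-velocity shaping term."""
    return (z - radius) + 0.1 * v_z


def is_truncated(step_count: int) -> bool:
    """True once the episode has run for MAX_STEPS steps (10 s of simulated time)."""
    return step_count >= MAX_STEPS


if __name__ == "__main__":
    # At z == radius with no vertical velocity, both terms vanish.
    print(reward(z=1.0, v_z=0.0, radius=1.0))
    # Rising above that height and moving upward both increase the reward.
    print(reward(z=1.5, v_z=2.0, radius=1.0))
    print(f"episode length: {DT * MAX_STEPS:.1f} s")
```

The velocity coefficient is ten times smaller than the height term's, so the shaping term matters mostly while the sphere has not yet left the ground.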

Setup

Requires Python 3.12 (via pyenv or system) and Poetry ≥ 1.9. Torch is pulled from the pytorch-cu128 index — CUDA 12.8 build, needed for RTX 5090 / Blackwell (sm_120).

poetry install

Use

import gymnasium as gym
import gym_hyrosphere  # registers hyrosphere-v0 and linearsphere-v0

env = gym.make("hyrosphere-v0")
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())

Scripts

poetry run python viewer.py [--env hyro|linear]              # interactive OpenGL viewer
poetry run python train.py  --env hyro --timesteps 2_000_000 # PPO training
poetry run python play.py   [--run runs/<dir>]               # watch trained policy
poetry run tensorboard --logdir runs/                        # training curves
poetry run python benchmark.py                               # 10k physics-step timing
poetry run python plot-progress.py -f progress.csv -n 100    # live-plot CSV

train.py runs PPO with a VecNormalize wrapper (observation and reward normalization) and writes checkpoints and TensorBoard logs to runs/<env>-<timestamp>/. play.py loads the most recent run by default.
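One way to pick "the most recent run" is to take the newest subdirectory of runs/ by modification time. The sketch below is an assumption about how that could work, not a copy of play.py, which may sort by the timestamp in the directory name instead; latest_run is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def latest_run(root: str = "runs") -> Optional[Path]:
    """Return the most recently modified subdirectory of `root`, or None."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    run_dirs = [p for p in root_path.iterdir() if p.is_dir()]
    if not run_dirs:
        return None
    # Newest modification time wins.
    return max(run_dirs, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    print(latest_run("runs"))
```

Passing --run runs/<dir> explicitly sidesteps the selection entirely, which is useful when comparing older checkpoints.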

Docs

The math behind the simulator is in docs/: start with overview.md, then dynamics.md for the shared rigid-body equations and hyrosphere.md / linearsphere.md for per-model specifics.

Notes

The original repo trained with OpenAI Baselines (python -m baselines.run --alg=acktr ...). That package is unmaintained and the saved checkpoints (hyrosphere-acktr, linearsphere-acktr) are not compatible with this Gymnasium-based rewrite. Use stable-baselines3 or another modern RL library to train against the env.
step() returns the 5-tuple (obs, reward, terminated, truncated, info); reset(seed=...) returns (obs, info). There is no terminal condition, so terminated stays False and episodes end only through truncated after 10 s of simulated time.
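A rollout loop that follows this API might look like the sketch below. run_episode is a hypothetical helper, not part of the package; it works with any object exposing Gymnasium-style reset(), step() and action_space.sample(). With hyrosphere-v0 the loop should exit through truncated after 1000 steps if truncation behaves as described.

```python
def run_episode(env, policy=None, seed=0):
    """Roll out one episode and return (total_reward, steps).

    `policy` is an optional callable obs -> action; random actions are
    sampled from env.action_space when it is omitted.
    """
    obs, info = env.reset(seed=seed)
    total_reward, steps = 0.0, 0
    terminated = truncated = False
    # Check both flags so the loop also works for envs that do terminate.
    while not (terminated or truncated):
        action = policy(obs) if policy is not None else env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        steps += 1
    return total_reward, steps


# Usage with the real environment (requires gymnasium and gym_hyrosphere):
#   import gymnasium as gym
#   import gym_hyrosphere
#   env = gym.make("hyrosphere-v0")
#   print(run_episode(env))
```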
