47 changes: 47 additions & 0 deletions .github/workflows/ci.yml
@@ -0,0 +1,47 @@
name: CI

on:
  push:
    branches: [ main ]
  pull_request:
  workflow_dispatch:

jobs:
  lint:
    name: ruff (lint)
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install ruff
      - run: ruff check .

  reproduce-analysis:
    name: no-sim failure analysis (reproducibility smoke)
    runs-on: ubuntu-latest
    # ACT inference on a CPU runner is heavy; run on demand rather than on every push.
    if: github.event_name == 'workflow_dispatch'
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install (lerobot + hub extras)
        run: pip install -e ".[lerobot,hub]" "lerobot==0.5.2"
      - name: Fetch minimal assets from the Hugging Face Hub
        run: bash scripts/bootstrap_assets.sh --minimal
      - name: Reproduce the analysis (no simulator)
        run: bash experiments/act_push_failure/run_all.sh
      - name: Assert the wrist-shortcut diagnostic holds
        run: |
          python - <<'PY'
          import json
          s = json.load(open("experiments/act_push_failure/results/push_summary.json"))
          a = s["E2_camera_ablation"]
          wrist, overhead = a["delta_black_wrist_only"], a["delta_black_overhead_only"]
          print(f"black-wrist Δ={wrist:.3f} black-overhead Δ={overhead:.3f}")
          assert wrist > overhead, "expected push to rely on the wrist camera (wrist Δ > overhead Δ)"
          print("OK: push policy is wrist-reliant, as diagnosed.")
          PY
28 changes: 28 additions & 0 deletions CITATION.cff
@@ -0,0 +1,28 @@
cff-version: 1.2.0
message: "If you use sim2act in your work, please cite it."
title: "sim2act: a VLA simulation data engine"
abstract: >-
  An end-to-end NVIDIA Isaac Lab pipeline for collecting multimodal Franka manipulation
  demonstrations (a Warp state-machine oracle and a PPO RL teacher), converting them to the
  LeRobot v3.0 format, training Action Chunking Transformer (ACT) policies, and evaluating them
  closed-loop — together with a reproducible, simulator-free failure-analysis case study of a
  camera-reliance shortcut in an imitation policy.
type: software
authors:
  - family-names: Ma
    given-names: Kevin
license: Apache-2.0
repository-code: "https://github.com/Kevinma0215/sim2act"
url: "https://github.com/Kevinma0215/sim2act"
version: "0.1.0"
date-released: "2026-06-19"
keywords:
  - vision-language-action
  - imitation-learning
  - robot-learning
  - action-chunking-transformer
  - isaac-lab
  - lerobot
  - sim-to-real
  - domain-randomization
  - covariate-shift
77 changes: 77 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,77 @@
# CLAUDE.md — sim2act

> Working notes for AI agents on this repo. Read before editing or running anything.
> This file is the single source of truth for commands, conventions, and verified numbers —
> prefer it over re-deriving facts from the code.

## What this is

**sim2act** — a VLA (vision-language-action) **simulation data engine** on NVIDIA Isaac Lab. A Franka
arm performs pick / barrier / push; the repo takes each task from privileged oracle → multimodal demo
collection → LeRobot v3.0 → ACT imitation learning → a closed-loop eval harness. Positioning is
three-in-one: **data engine** (umbrella) + **rigorous failure diagnosis** + **end-to-end closed loop**.
Audience: foundation-model robotics engineers.

Origin: built as an **R&S (Robotics & Simulation) take-home challenge**, now generalized into a public
project. **R&S = Robotics & Simulation — NOT "Rohde & Schwarz".** The old submission codename was
"Corvinus" and is being retired (only `docs/archive/` may still mention it).

## Conda environments (CRITICAL — there are two; do not mix them)

Isaac Lab is pip-installed into the `isaaclab` conda env, so run stages with **plain `python`**
(NOT `./isaaclab.sh -p`, despite older README text). Prefer `conda run -n <env> python ...`.

| env | key versions | use for |
|---|---|---|
| **isaaclab** (py3.11) | isaacsim 5.1.0, isaaclab 0.54.4, lerobot **0.4.4**, torch 2.7, warp 1.14 | everything touching the simulator: collect, PPO RL train, LeRobot convert, eval — and the push-fix pipeline's ACT training (`fix_push_widen_dr.sh` runs end-to-end in this env) |
| **lerobot** (py3.12) | lerobot **0.5.2**, torch 2.11, no Isaac Sim | the **no-sim** failure analysis `experiments/act_push_failure/run_all.sh` (requires 0.5.2) |

Gotcha: the two envs ship different lerobot versions (0.4.4 vs 0.5.2); existing ACT checkpoints load
under both. Keep each pipeline inside ONE env: run the whole `fix_push_widen_dr.sh` in `isaaclab`;
run the no-sim ablation in `lerobot`.
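
A quick guard against mixing the two envs could look like the sketch below. It is only an illustration: the helper names (`version_mismatch`, `installed_lerobot_version`) are hypothetical and not part of the repo, and the expected versions are copied from the table above.

```python
from importlib import metadata

# Expected lerobot version per conda env, copied from the env table.
EXPECTED_LEROBOT = {"isaaclab": "0.4.4", "lerobot": "0.5.2"}


def version_mismatch(env_name, installed):
    """Return an error message if `installed` is wrong for `env_name`, else None."""
    expected = EXPECTED_LEROBOT[env_name]
    if installed != expected:
        return f"env '{env_name}' expects lerobot {expected}, found {installed}"
    return None


def installed_lerobot_version():
    """Return the installed lerobot version string, or None if it is not installed."""
    try:
        return metadata.version("lerobot")
    except metadata.PackageNotFoundError:
        return None
```

Calling `version_mismatch("lerobot", installed_lerobot_version())` at the top of a script would surface a wrong env before any heavy work starts.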

Hardware here: 1× RTX 5060 Ti (16 GB). PPO default is 4096 envs — may need fewer on 16 GB; SPEEDRUN uses 256.

## Canonical commands (run from repo root)

- Editable install: `conda run -n isaaclab python -m pip install -e .`
- Collect (SM oracle): `python scripts/collect/demos.py --task pick_place|barrier --num_demos 50 --headless --enable_cameras`
- Push RL chain: `python scripts/train/push_rl.py --headless --num_envs 4096` → `python scripts/rl/export_push.py --headless` → `python scripts/collect/push_rl_demos.py --num_demos 50 --num_envs 4 --headless --enable_cameras`
- Convert → LeRobot: `python data/convert_to_lerobot.py --input _out/datasets/<tag>_official_demos/dataset.hdf5 --output _out/datasets/lerobot/<tag> --state_keys joint_pos,joint_vel --no_depth`
- Train ACT: `python scripts/train/act.py --dataset _out/datasets/lerobot/<tag> --steps 40000 --batch-size 8 [--wandb]`
- Eval: `python scripts/eval/policy.py --policy act|oracle|dummy --task <t> --model_path <ckpt>/pretrained_model --num_rollouts 20 --headless --enable_cameras` (extras: `--ablate_camera overhead|wrist`, `--n_action_steps`, `--init_scale`, `--oracle-pose gt|noisy`)
- No-sim push failure analysis: `conda run -n lerobot bash experiments/act_push_failure/run_all.sh`
- Push fix (whole chain): `conda run -n isaaclab bash scripts/fix_push_widen_dr.sh` (run a `SPEEDRUN=1` smoke first). DR is set via `PUSH_BOX_DR` and shared by train/collect/eval.

## Outputs: everything generated lives under `_out/` (gitignored)

`_out/datasets/{<tag>_official_demos/dataset.hdf5 (raw HDF5), lerobot/<tag> (LeRobot v3.0)}`,
`_out/rl/franka_push/<ts>/`, `_out/act/act_<tag>_run_<ts>/checkpoints/{<step>,last}/pretrained_model`,
`_out/eval/*.json`, `_out/viz/`.
The one generated thing that IS committed: `experiments/act_push_failure/results/` (analysis evidence —
deliberately gitignore-excepted).
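
The layout above can be expressed as small path helpers. This is a sketch under assumptions: the function names are hypothetical (the repo may build these paths differently), and the format of `<ts>` is not specified here, so it is passed through as an opaque string.

```python
from pathlib import Path

# Root of all generated artifacts (gitignored), relative to the repo root.
OUT = Path("_out")


def raw_dataset(tag):
    """Raw HDF5 demos for a task tag."""
    return OUT / "datasets" / f"{tag}_official_demos" / "dataset.hdf5"


def lerobot_dataset(tag):
    """Converted LeRobot v3.0 dataset directory for a task tag."""
    return OUT / "datasets" / "lerobot" / tag


def act_checkpoint(tag, ts, step="last"):
    """ACT checkpoint directory; `step` is a step number or "last"."""
    return OUT / "act" / f"act_{tag}_run_{ts}" / "checkpoints" / str(step) / "pretrained_model"
```

The result of `act_checkpoint(...)` is the kind of path that the eval command's `--model_path` expects (`<ckpt>/pretrained_model`).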

## Verified numbers (cite verbatim; do not re-derive)

From `experiments/act_push_failure/results/*_summary.json`:
- push teacher-forcing EE-xy L1 = **0.011 m** (proves the model learned the demos).
- push camera ablation: black-**wrist** Δ = **0.197** vs black-**overhead** Δ = **0.038** → wrist shortcut.
- barrier ablation: black-overhead Δ = **0.089** vs black-wrist Δ = **0.027** → robust overhead.
- barrier ACT **90%** in-dist vs SM oracle **75%**; OOD at init_scale 1.5 → **55%**. push ACT **0%** (pre-fix).
- Root cause: push init DR ±3 cm (vs barrier ±13/±7 cm) → static overhead uninformative → policy takes the
wrist shortcut → closed-loop covariate-shift spiral.
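
The reading of the two ablation lines can be made explicit with a short sketch. It assumes, as the bullets above imply, that a larger Δ when a camera is blacked out means the policy leans more on that camera; the function name `relied_on_camera` is hypothetical.

```python
def relied_on_camera(delta_black_wrist, delta_black_overhead):
    """Name the camera whose blackout shifts the policy output the most."""
    if delta_black_wrist > delta_black_overhead:
        return "wrist"
    if delta_black_overhead > delta_black_wrist:
        return "overhead"
    return "tie"


# Verified numbers, cited verbatim from the list above.
push = relied_on_camera(delta_black_wrist=0.197, delta_black_overhead=0.038)
barrier = relied_on_camera(delta_black_wrist=0.027, delta_black_overhead=0.089)
print(f"push relies on: {push}; barrier relies on: {barrier}")
```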

## Don't

- Don't commit `_out/` or large media (`.gif/.webm/.mp4`) into git history (host on HF Hub / GitHub releases).
- Don't rename the Python modules (`envs`/`eval`/`data`/`controllers`) — only the distribution name is `sim2act`.
- Don't call it "Rohde & Schwarz". R&S = Robotics & Simulation.
- Don't advertise OpenVLA / Octo / π0 as done — `OpenVLAWrapper` is wired but unvalidated (an extension point).
- Don't hardcode `/home/kevin786/...` — use the `BASH_SOURCE` repo-root pattern (see `scripts/*.sh`).
- Don't launch the heavy push fix without a `SPEEDRUN=1` smoke first.

## Layout

`envs/` (base/tasks/scenes cfg) · `controllers/` (Warp GPU state machine) · `scripts/{collect,train,eval,rl,viz}`
· `eval/` (VLA eval harness) · `data/convert_to_lerobot.py` · `tools/{checks,smoke,viz}` ·
`experiments/act_push_failure/` (flagship failure analysis, no-sim) · `docs/` · `_out/` (generated, gitignored).
41 changes: 41 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,41 @@
# Contributing to sim2act

Thanks for your interest. sim2act is a research codebase; contributions that improve
reproducibility, add tasks/policies, or sharpen the analysis are very welcome.

## Environments

Two conda envs are used (full matrix in [CLAUDE.md](CLAUDE.md)):

- **`isaaclab`** — anything that touches the simulator: demo collection, PPO RL training, LeRobot
conversion, and closed-loop eval. Isaac Lab is pip-installed into this env, so run scripts with
plain `python` (not `./isaaclab.sh -p`).
- **`lerobot`** — the simulator-free failure analysis (`lerobot==0.5.2`).

Install the package editable:

```bash
conda run -n isaaclab python -m pip install -e ".[hub]"
```

## Sanity check without a GPU or simulator

The flagship failure analysis reproduces in ~5 minutes from a published dataset + checkpoint, no
Isaac Sim required:

```bash
conda activate lerobot
bash scripts/bootstrap_assets.sh --minimal # pulls the dataset + checkpoint from the HF Hub
bash experiments/act_push_failure/run_all.sh
```
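
After the run, the headline result can be checked locally in the same way the CI workflow does. The sketch below reuses the results path and JSON keys from the CI assertion step; the function names are hypothetical, and it assumes the summary file keeps that schema.

```python
import json
from pathlib import Path

# Path and keys mirror the assertion step in .github/workflows/ci.yml.
SUMMARY = Path("experiments/act_push_failure/results/push_summary.json")


def read_ablation_deltas(path=SUMMARY):
    """Return (black-wrist delta, black-overhead delta) from a push summary file."""
    with open(path, encoding="utf-8") as fh:
        ablation = json.load(fh)["E2_camera_ablation"]
    return ablation["delta_black_wrist_only"], ablation["delta_black_overhead_only"]


def wrist_shortcut_holds(path=SUMMARY):
    """True when blacking out the wrist camera hurts more than the overhead one."""
    wrist, overhead = read_ablation_deltas(path)
    return wrist > overhead
```

Run from the repo root, `wrist_shortcut_holds()` should return `True` if the analysis reproduced the diagnosed wrist reliance.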

## Style

- Python is linted with [ruff](https://docs.astral.sh/ruff/): `ruff check . && ruff format --check .`
- Keep generated artifacts out of git — everything lands under `_out/` (gitignored).
- Don't hardcode absolute paths; shell scripts resolve the repo root via `BASH_SOURCE`.

## Pull requests

Keep PRs focused and clearly described. If a change affects the pipeline, say which stage(s) and
which conda env you validated it in. CI runs ruff on every push to `main` and on every pull request;
the no-simulator reproducibility smoke runs only when the workflow is dispatched manually.