From 6024aff888c402d7a57a7e1a40dfcb35a9e043ae Mon Sep 17 00:00:00 2001 From: UWLab BOT Date: Sat, 3 Jan 2026 18:06:58 -0800 Subject: [PATCH] Prepares pre-merge --- README.md | 11 +- docs/source/publications/omnireset/index.rst | 532 ++++++++++++++++-- .../publications/omnireset/instruction.rst | 204 ------- 3 files changed, 478 insertions(+), 269 deletions(-) delete mode 100644 docs/source/publications/omnireset/instruction.rst diff --git a/README.md b/README.md index 0e598dd..450f531 100644 --- a/README.md +++ b/README.md @@ -27,12 +27,17 @@ In addition to what IsaacLab provides, UW Lab brings: - **Sim to Real**: Providing robots and configuration that has been tested in Lab and deliver the Simulation Setup that can directly transfer to reals +## Installation + +Follow the [installation guide](https://uw-lab.github.io/UWLab/main/source/setup/installation/pip_installation.html). + + ## Getting Started -Our [documentation page](https://uw-lab.github.io/UWLab) provides everything you need to get started, including detailed tutorials and step-by-step guides. Follow these links to learn more about: +- **Train Your First Policy** — Train an ant to run in minutes → [Quickstart](https://uw-lab.github.io/UWLab/main/source/setup/quickstart.html#launch-training) +- **OmniReset** — RL for manipulation without reward engineering or demos → [Quickstart](https://uw-lab.github.io/UWLab/main/source/publications/omnireset/index.html#quick-start) -- [Installation steps](https://uw-lab.github.io/UWLab/main/source/setup/installation/index.html) -- [Available environments](https://uw-lab.github.io/UWLab/main/source/overview/uw_environments.html) +See [all available environments](https://uw-lab.github.io/UWLab/main/source/overview/uw_environments.html) and [full documentation](https://uw-lab.github.io/UWLab) for details. 
## Support diff --git a/docs/source/publications/omnireset/index.rst b/docs/source/publications/omnireset/index.rst index 884aae6..eaf4548 100644 --- a/docs/source/publications/omnireset/index.rst +++ b/docs/source/publications/omnireset/index.rst @@ -3,71 +3,479 @@ OmniReset **OmniReset** is a robotic manipulation framework using RL to solve dexterous, contact-rich manipulation tasks without reward engineering or demos. -.. important:: - **Pre-trained RL Checkpoints Available!** - - We provide trained RL checkpoints for all six tasks: **Drawer Assembly**, **Leg Twisting**, **Peg Insertion**, **Rectangle Reorientation on Wall**, **Cupcake on Plate**, and **Cube Stacking**. - Download the checkpoints and evaluate them immediately! - - See the :doc:`instruction` guide for download links and evaluation instructions. - -.. raw:: html - -
- [removed raw HTML gallery; captions: Drawer Assembly, Leg Twisting, Peg Insertion, Rectangle Reorientation on Wall, Cupcake on Plate, Cube Stacking]
- .. note:: Detailed documentation will be updated following the public release of the paper. -Getting Started ---------------- +---- + +.. _quick-start: + +Quick Start (Try in 2 Minutes) +------------------------------ + +.. important:: + + Make sure you have completed the `installation `_ before running these commands. + +Download our pretrained checkpoint and run evaluation. + +.. tab-set:: + + .. tab-item:: Leg Twisting + + .. raw:: html + +
+ +
+ + .. code:: bash + + # Download checkpoint + wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/fbleg_state_rl_expert.pt + + # Run evaluation + python scripts/reinforcement_learning/rsl_rl/play.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0 \ + --num_envs 1 \ + --checkpoint fbleg_state_rl_expert.pt \ + env.scene.insertive_object=fbleg \ + env.scene.receptive_object=fbtabletop + + .. tab-item:: Drawer Assembly + + .. raw:: html + +
+ +
+ + .. code:: bash + + # Download checkpoint + wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/fbdrawerbottom_state_rl_expert.pt + + # Run evaluation + python scripts/reinforcement_learning/rsl_rl/play.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0 \ + --num_envs 1 \ + --checkpoint fbdrawerbottom_state_rl_expert.pt \ + env.scene.insertive_object=fbdrawerbottom \ + env.scene.receptive_object=fbdrawerbox + + .. tab-item:: Peg Insertion + + .. raw:: html + +
+ +
+ + .. code:: bash + + # Download checkpoint + wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/peg_state_rl_expert.pt + + # Run evaluation + python scripts/reinforcement_learning/rsl_rl/play.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0 \ + --num_envs 1 \ + --checkpoint peg_state_rl_expert.pt \ + env.scene.insertive_object=peg \ + env.scene.receptive_object=peghole + + .. tab-item:: Rectangle on Wall + + .. raw:: html + +
+ +
+ + .. code:: bash + + # Download checkpoint + wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/rectangle_state_rl_expert.pt + + # Run evaluation + python scripts/reinforcement_learning/rsl_rl/play.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0 \ + --num_envs 1 \ + --checkpoint rectangle_state_rl_expert.pt \ + env.scene.insertive_object=rectangle \ + env.scene.receptive_object=wall + + .. tab-item:: Cupcake on Plate + + .. raw:: html + +
+ +
+ + .. code:: bash + + # Download checkpoint + wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/cupcake_state_rl_expert.pt + + # Run evaluation + python scripts/reinforcement_learning/rsl_rl/play.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0 \ + --num_envs 1 \ + --checkpoint cupcake_state_rl_expert.pt \ + env.scene.insertive_object=cupcake \ + env.scene.receptive_object=plate + + .. tab-item:: Cube Stacking + + .. raw:: html + +
+ +
+ + .. code:: bash + + # Download checkpoint + wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/cube_state_rl_expert.pt + + # Run evaluation + python scripts/reinforcement_learning/rsl_rl/play.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0 \ + --num_envs 1 \ + --checkpoint cube_state_rl_expert.pt \ + env.scene.insertive_object=cube \ + env.scene.receptive_object=cube + +---- + +.. _reproduce-training: + +Reproduce Our Training +---------------------- + +Reproduce our training results from scratch. + +.. important:: + + Before running reset state generation scripts (step 3), make sure ``base_path`` and ``base_paths`` in ``reset_states_cfg.py`` are set appropriately. + +.. tab-set:: + + .. tab-item:: Leg Twisting + + **Step 1: Collect Partial Assemblies** (~30 seconds) + + .. code:: bash + + python scripts_v2/tools/record_partial_assemblies.py --task OmniReset-PartialAssemblies-v0 --num_envs 10 --num_trajectories 10 --dataset_dir ./partial_assembly_datasets --headless env.scene.insertive_object=fbleg env.scene.receptive_object=fbtabletop + + **Step 2: Sample Grasp Poses** (~1 minute) + + .. code:: bash + + python scripts_v2/tools/record_grasps.py --task OmniReset-Robotiq2f85-GraspSampling-v0 --num_envs 8192 --num_grasps 1000 --dataset_dir ./grasp_datasets --headless env.scene.object=fbleg + + **Step 3: Generate Reset State Datasets** (~1 min to 1 hour depending on the reset) + + .. 
code:: bash + + # Object Anywhere, End-Effector Anywhere (Reaching) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEAnywhere-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEAnywhere env.scene.insertive_object=fbleg env.scene.receptive_object=fbtabletop + + # Object Resting, End-Effector Grasped (Near Object) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectRestingEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectRestingEEGrasped env.scene.insertive_object=fbleg env.scene.receptive_object=fbtabletop + + # Object Anywhere, End-Effector Grasped (Grasped) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEGrasped env.scene.insertive_object=fbleg env.scene.receptive_object=fbtabletop + + # Object Partially Assembled, End-Effector Grasped (Near Goal) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectPartiallyAssembledEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectPartiallyAssembledEEGrasped env.scene.insertive_object=fbleg env.scene.receptive_object=fbtabletop + + **Step 4: Train RL Policy** + + .. code:: bash + + python -m torch.distributed.run \ + --nnodes 1 \ + --nproc_per_node 4 \ + scripts/reinforcement_learning/rsl_rl/train.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-v0 \ + --num_envs 16384 \ + --logger wandb \ + --headless \ + --distributed \ + env.scene.insertive_object=fbleg \ + env.scene.receptive_object=fbtabletop + + .. tab-item:: Drawer Assembly + + **Step 1: Collect Partial Assemblies** (~30 seconds) + + .. 
code:: bash + + python scripts_v2/tools/record_partial_assemblies.py --task OmniReset-PartialAssemblies-v0 --num_envs 10 --num_trajectories 10 --dataset_dir ./partial_assembly_datasets --headless env.scene.insertive_object=fbdrawerbottom env.scene.receptive_object=fbdrawerbox + + **Step 2: Sample Grasp Poses** (~1 minute) + + .. code:: bash + + python scripts_v2/tools/record_grasps.py --task OmniReset-Robotiq2f85-GraspSampling-v0 --num_envs 8192 --num_grasps 1000 --dataset_dir ./grasp_datasets --headless env.scene.object=fbdrawerbottom + + **Step 3: Generate Reset State Datasets** (~1 min to 1 hour depending on the reset) + + .. code:: bash + + # Object Anywhere, End-Effector Anywhere (Reaching) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEAnywhere-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEAnywhere env.scene.insertive_object=fbdrawerbottom env.scene.receptive_object=fbdrawerbox + + # Object Resting, End-Effector Grasped (Near Object) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectRestingEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectRestingEEGrasped env.scene.insertive_object=fbdrawerbottom env.scene.receptive_object=fbdrawerbox + + # Object Anywhere, End-Effector Grasped (Grasped) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEGrasped env.scene.insertive_object=fbdrawerbottom env.scene.receptive_object=fbdrawerbox + + # Object Partially Assembled, End-Effector Grasped (Near Goal) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectPartiallyAssembledEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir 
./reset_state_datasets/ObjectPartiallyAssembledEEGrasped env.scene.insertive_object=fbdrawerbottom env.scene.receptive_object=fbdrawerbox + + **Step 4: Train RL Policy** + + .. code:: bash + + python -m torch.distributed.run \ + --nnodes 1 \ + --nproc_per_node 4 \ + scripts/reinforcement_learning/rsl_rl/train.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-v0 \ + --num_envs 16384 \ + --logger wandb \ + --headless \ + --distributed \ + env.scene.insertive_object=fbdrawerbottom \ + env.scene.receptive_object=fbdrawerbox + + .. tab-item:: Peg Insertion + + **Step 1: Collect Partial Assemblies** (~30 seconds) + + .. code:: bash + + python scripts_v2/tools/record_partial_assemblies.py --task OmniReset-PartialAssemblies-v0 --num_envs 10 --num_trajectories 10 --dataset_dir ./partial_assembly_datasets --headless env.scene.insertive_object=peg env.scene.receptive_object=peghole + + **Step 2: Sample Grasp Poses** (~1 minute) + + .. code:: bash + + python scripts_v2/tools/record_grasps.py --task OmniReset-Robotiq2f85-GraspSampling-v0 --num_envs 8192 --num_grasps 1000 --dataset_dir ./grasp_datasets --headless env.scene.object=peg + + **Step 3: Generate Reset State Datasets** (~1 min to 1 hour depending on the reset) + + .. 
code:: bash + + # Object Anywhere, End-Effector Anywhere (Reaching) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEAnywhere-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEAnywhere env.scene.insertive_object=peg env.scene.receptive_object=peghole + + # Object Resting, End-Effector Grasped (Near Object) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectRestingEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectRestingEEGrasped env.scene.insertive_object=peg env.scene.receptive_object=peghole + + # Object Anywhere, End-Effector Grasped (Grasped) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEGrasped env.scene.insertive_object=peg env.scene.receptive_object=peghole + + # Object Partially Assembled, End-Effector Grasped (Near Goal) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectPartiallyAssembledEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectPartiallyAssembledEEGrasped env.scene.insertive_object=peg env.scene.receptive_object=peghole + + **Step 4: Train RL Policy** + + .. code:: bash + + python -m torch.distributed.run \ + --nnodes 1 \ + --nproc_per_node 4 \ + scripts/reinforcement_learning/rsl_rl/train.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-v0 \ + --num_envs 16384 \ + --logger wandb \ + --headless \ + --distributed \ + env.scene.insertive_object=peg \ + env.scene.receptive_object=peghole + + .. tab-item:: Rectangle on Wall + + **Step 1: Collect Partial Assemblies** (~30 seconds) + + .. 
code:: bash + + python scripts_v2/tools/record_partial_assemblies.py --task OmniReset-PartialAssemblies-v0 --num_envs 10 --num_trajectories 10 --dataset_dir ./partial_assembly_datasets --headless env.scene.insertive_object=rectangle env.scene.receptive_object=wall + + **Step 2: Sample Grasp Poses** (~1 minute) + + .. code:: bash + + python scripts_v2/tools/record_grasps.py --task OmniReset-Robotiq2f85-GraspSampling-v0 --num_envs 8192 --num_grasps 1000 --dataset_dir ./grasp_datasets --headless env.scene.object=rectangle + + **Step 3: Generate Reset State Datasets** (~1 min to 1 hour depending on the reset) + + .. code:: bash + + # Object Anywhere, End-Effector Anywhere (Reaching) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEAnywhere-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEAnywhere env.scene.insertive_object=rectangle env.scene.receptive_object=wall + + # Object Resting, End-Effector Grasped (Near Object) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectRestingEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectRestingEEGrasped env.scene.insertive_object=rectangle env.scene.receptive_object=wall + + # Object Anywhere, End-Effector Grasped (Grasped) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEGrasped env.scene.insertive_object=rectangle env.scene.receptive_object=wall + + # Object Partially Assembled, End-Effector Grasped (Near Goal) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectPartiallyAssembledEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectPartiallyAssembledEEGrasped 
env.scene.insertive_object=rectangle env.scene.receptive_object=wall + + **Step 4: Train RL Policy** + + .. code:: bash + + python -m torch.distributed.run \ + --nnodes 1 \ + --nproc_per_node 4 \ + scripts/reinforcement_learning/rsl_rl/train.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-v0 \ + --num_envs 16384 \ + --logger wandb \ + --headless \ + --distributed \ + env.scene.insertive_object=rectangle \ + env.scene.receptive_object=wall + + .. tab-item:: Cupcake on Plate + + **Step 1: Collect Partial Assemblies** (~30 seconds) + + .. code:: bash + + python scripts_v2/tools/record_partial_assemblies.py --task OmniReset-PartialAssemblies-v0 --num_envs 10 --num_trajectories 10 --dataset_dir ./partial_assembly_datasets --headless env.scene.insertive_object=cupcake env.scene.receptive_object=plate + + **Step 2: Sample Grasp Poses** (~1 minute) + + .. code:: bash + + python scripts_v2/tools/record_grasps.py --task OmniReset-Robotiq2f85-GraspSampling-v0 --num_envs 8192 --num_grasps 1000 --dataset_dir ./grasp_datasets --headless env.scene.object=cupcake + + **Step 3: Generate Reset State Datasets** (~1 min to 1 hour depending on the reset) + + .. 
code:: bash + + # Object Anywhere, End-Effector Anywhere (Reaching) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEAnywhere-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEAnywhere env.scene.insertive_object=cupcake env.scene.receptive_object=plate + + # Object Resting, End-Effector Grasped (Near Object) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectRestingEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectRestingEEGrasped env.scene.insertive_object=cupcake env.scene.receptive_object=plate + + # Object Anywhere, End-Effector Grasped (Grasped) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEGrasped env.scene.insertive_object=cupcake env.scene.receptive_object=plate + + # Object Partially Assembled, End-Effector Grasped (Near Goal) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectPartiallyAssembledEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectPartiallyAssembledEEGrasped env.scene.insertive_object=cupcake env.scene.receptive_object=plate + + **Step 4: Train RL Policy** + + .. code:: bash + + python -m torch.distributed.run \ + --nnodes 1 \ + --nproc_per_node 4 \ + scripts/reinforcement_learning/rsl_rl/train.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-v0 \ + --num_envs 16384 \ + --logger wandb \ + --headless \ + --distributed \ + env.scene.insertive_object=cupcake \ + env.scene.receptive_object=plate + + .. tab-item:: Cube Stacking + + **Step 1: Collect Partial Assemblies** (~30 seconds) + + .. 
code:: bash + + python scripts_v2/tools/record_partial_assemblies.py --task OmniReset-PartialAssemblies-v0 --num_envs 10 --num_trajectories 10 --dataset_dir ./partial_assembly_datasets --headless env.scene.insertive_object=cube env.scene.receptive_object=cube + + **Step 2: Sample Grasp Poses** (~1 minute) + + .. code:: bash + + python scripts_v2/tools/record_grasps.py --task OmniReset-Robotiq2f85-GraspSampling-v0 --num_envs 8192 --num_grasps 1000 --dataset_dir ./grasp_datasets --headless env.scene.object=cube + + **Step 3: Generate Reset State Datasets** (~1 min to 1 hour depending on the reset) + + .. code:: bash + + # Object Anywhere, End-Effector Anywhere (Reaching) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEAnywhere-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEAnywhere env.scene.insertive_object=cube env.scene.receptive_object=cube + + # Object Resting, End-Effector Grasped (Near Object) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectRestingEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectRestingEEGrasped env.scene.insertive_object=cube env.scene.receptive_object=cube + + # Object Anywhere, End-Effector Grasped (Grasped) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEGrasped env.scene.insertive_object=cube env.scene.receptive_object=cube + + # Object Partially Assembled, End-Effector Grasped (Near Goal) + python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectPartiallyAssembledEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectPartiallyAssembledEEGrasped env.scene.insertive_object=cube 
env.scene.receptive_object=cube + + **Step 4: Train RL Policy** + + .. code:: bash + + python -m torch.distributed.run \ + --nnodes 1 \ + --nproc_per_node 4 \ + scripts/reinforcement_learning/rsl_rl/train.py \ + --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-v0 \ + --num_envs 16384 \ + --logger wandb \ + --headless \ + --distributed \ + env.scene.insertive_object=cube \ + env.scene.receptive_object=cube + +Training Curves +^^^^^^^^^^^^^^^ + +Below are success rate curves for each task plotting over number of training iterations and wall clock time when training on 4xL40S GPUs. +Insertion, twisting, cube stacking, and rectangle orientation on wall tasks converge within **8 hours**, while drawer assembly and cupcake on plate tasks take **1 day**. + +.. list-table:: + :widths: 50 50 + :class: borderless + + * - .. figure:: ../../../source/_static/publications/omnireset/success_rate_over_steps.jpg + :width: 100% + :alt: Training curve over steps + + Success Rate of 6 Tasks Over Number of Training Iterations -For setup, training, and evaluation instructions, see :doc:`instruction`. + - .. figure:: ../../../source/_static/publications/omnireset/success_rate_over_wall_clock.jpg + :width: 100% + :alt: Training curve over wall clock time -.. toctree:: - :maxdepth: 1 - :hidden: + Success Rate of 6 Tasks Over Wall Clock Time - instruction +---- diff --git a/docs/source/publications/omnireset/instruction.rst b/docs/source/publications/omnireset/instruction.rst deleted file mode 100644 index b999582..0000000 --- a/docs/source/publications/omnireset/instruction.rst +++ /dev/null @@ -1,204 +0,0 @@ -Instructions -============ - -This guide provides step-by-step instructions for using the OmniReset framework. -Choose your path: :ref:`evaluate our pre-trained checkpoints ` or :ref:`reproduce training from scratch `. - -.. 
note:: - - For all commands below, replace ``insertive_object`` and ``receptive_object`` with one of the following: - - * **Drawer Assembly:** ``fbdrawerbottom`` / ``fbdrawerbox`` - * **Twisting:** ``fbleg`` / ``fbtabletop`` - * **Insertion:** ``peg`` / ``peghole`` - * **Rectangle Reorientation on Wall:** ``rectangle`` / ``wall`` - * **Cupcake on Plate:** ``cupcake`` / ``plate`` - * **Cube Stacking:** ``cube`` / ``cube`` - - For grasp sampling, replace ``object`` with ``fbleg``, ``fbdrawerbottom``, ``peg``, ``rectangle``, ``cupcake``, or ``cube``. - ----- - -.. _evaluate-checkpoints: - -Download and Evaluate Pre-trained Checkpoints ---------------------------------------------- - -We provide trained RL checkpoints for all three tasks. Download and evaluate them immediately! - -Download Checkpoints -^^^^^^^^^^^^^^^^^^^^ - -Download the pre-trained checkpoints from our Backblaze B2 storage (drawer assembly, leg twisting, peg insertion): - -.. code:: bash - - wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/fbdrawerbottom_state_rl_expert.pt - wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/fbleg_state_rl_expert.pt - wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/peg_state_rl_expert.pt - wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/rectangle_state_rl_expert.pt - wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/cupcake_state_rl_expert.pt - wget https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset/cube_state_rl_expert.pt - -Evaluate Checkpoints -^^^^^^^^^^^^^^^^^^^^ - -Run evaluation on the downloaded checkpoints: - -.. code:: bash - - python scripts/reinforcement_learning/rsl_rl/play.py --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0 --num_envs 1 --checkpoint /path/to/checkpoint.pt env.scene.insertive_object=insertive_object env.scene.receptive_object=receptive_object - - -.. 
_reproduce-training: - -Reproduce Our Training ----------------------- - -Follow these steps to reproduce our training results from scratch. This involves collecting reset state datasets and training RL policies. - -Collect Partial Assemblies -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Collect partial assembly datasets that will be used for generating reset states. -You can either use existing datasets from Backblaze or collect new ones. - -.. code:: bash - - python scripts_v2/tools/record_partial_assemblies.py --task OmniReset-PartialAssemblies-v0 --num_envs 10 --num_trajectories 10 --dataset_dir ./partial_assembly_datasets --headless env.scene.insertive_object=insertive_object env.scene.receptive_object=receptive_object - -.. note:: - - This step should take approximately 30 seconds. - - -Sample Grasp Poses -^^^^^^^^^^^^^^^^^^ - -Sample grasp poses for the objects. You can either use existing datasets from Backblaze or collect new ones. - -.. code:: bash - - python scripts_v2/tools/record_grasps.py --task OmniReset-Robotiq2f85-GraspSampling-v0 --num_envs 8192 --num_grasps 1000 --dataset_dir ./grasp_datasets --headless env.scene.object=object - -.. note:: - - This step should take approximately 1 minute. - - -Generate Reset State Datasets -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Generate reset state datasets for different configurations. You can either use existing datasets from Backblaze or collect new ones. - -.. important:: - - Before running these scripts, make sure ``base_path`` and ``base_paths`` in ``reset_states_cfg.py`` are set appropriately. - -Object Anywhere, End-Effector Anywhere (Reaching) -""""""""""""""""""""""""""""""""""""""""""""""""" - -.. 
code:: bash - - python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEAnywhere-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEAnywhere env.scene.insertive_object=insertive_object env.scene.receptive_object=receptive_object - -Object Resting, End-Effector Grasped (Near Object) -"""""""""""""""""""""""""""""""""""""""""""""""""" - -.. warning:: - - This task depends on reset states from **Object Anywhere, End-Effector Anywhere**. If you are generating your own reset states, make sure to set ``base_paths`` in ``reset_states_cfg.py`` to point to your generated ``ObjectAnywhereEEAnywhere`` dataset directory. - -.. code:: bash - - python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectRestingEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectRestingEEGrasped env.scene.insertive_object=insertive_object env.scene.receptive_object=receptive_object - -Object Anywhere, End-Effector Grasped (Grasped) -""""""""""""""""""""""""""""""""""""""""""""""" - -.. code:: bash - - python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectAnywhereEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectAnywhereEEGrasped env.scene.insertive_object=insertive_object env.scene.receptive_object=receptive_object - -Object Partially Assembled, End-Effector Grasped (Near Goal) -"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" - -.. code:: bash - - python scripts_v2/tools/record_reset_states.py --task OmniReset-UR5eRobotiq2f85-ObjectPartiallyAssembledEEGrasped-v0 --num_envs 4096 --num_reset_states 10000 --headless --dataset_dir ./reset_state_datasets/ObjectPartiallyAssembledEEGrasped env.scene.insertive_object=insertive_object env.scene.receptive_object=receptive_object - -.. 
note:: - - Each of these steps should take anywhere between 1 minute and 1 hour depending on the task and reset configuration. - - -Visualize Reset States (Optional) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -Visualize the generated reset states to verify they are correct. - -.. code:: bash - - python scripts_v2/tools/visualize_reset_states.py --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0 --num_envs 4 --dataset_dir /path/to/dataset env.scene.insertive_object=insertive_object env.scene.receptive_object=receptive_object - - -Train RL Policy -^^^^^^^^^^^^^^^ - -Train reinforcement learning policies using the generated reset states. - -.. code:: bash - - python -m torch.distributed.run \ - --nnodes 1 \ - --nproc_per_node 4 \ - scripts/reinforcement_learning/rsl_rl/train.py \ - --task OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-v0 \ - --num_envs 16384 \ - --logger wandb \ - --headless \ - --distributed \ - env.scene.insertive_object=insertive_object \ - env.scene.receptive_object=receptive_object - -Training Curves -^^^^^^^^^^^^^^^ - -Below are success rate curves for each task plotting over number of training iterations and wall clock time when training on 4xL40S GPUs. -Insertion, twisting, cube stacking, and rectangle orientation on wall tasks converge within **8 hours**, while drawer assembly and cupcake on plate tasks take **1 day**. - -.. list-table:: - :widths: 50 50 - :class: borderless - - * - .. figure:: ../../../source/_static/publications/omnireset/success_rate_over_steps.jpg - :width: 100% - :alt: Training curve over steps - - Success Rate of 6 Tasks Over Number of Training Iterations - - - .. figure:: ../../../source/_static/publications/omnireset/success_rate_over_wall_clock.jpg - :width: 100% - :alt: Training curve over wall clock time - - Success Rate of 6 Tasks Over Wall Clock Time - ----- - -Known Issues and Solutions --------------------------- - -GLIBCXX Version Error -^^^^^^^^^^^^^^^^^^^^^ - -If you encounter this error: - -.. 
code-block:: text - - OSError: version `GLIBCXX_3.4.30' not found (required by /path/to/omni/libcarb.so) - -Try exporting the system's ``libstdc++`` library: - -.. code:: bash - - export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6
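The per-task evaluation commands in the Quick Start tabs differ only in the object pair and the checkpoint file name. The following is a minimal sketch of assembling them programmatically. It assumes the checkpoint naming pattern ``<insertive_object>_state_rl_expert.pt`` seen in the download URLs holds for every task. The names ``TASK_OBJECTS``, ``checkpoint_url``, and ``build_play_command`` are hypothetical helpers for illustration and are not part of UW Lab.

```python
from __future__ import annotations

import shlex

# Object pairs copied from the task list in this patch (insertive, receptive).
# The table itself is an illustrative helper, not a UW Lab API.
TASK_OBJECTS: dict[str, tuple[str, str]] = {
    "Drawer Assembly": ("fbdrawerbottom", "fbdrawerbox"),
    "Leg Twisting": ("fbleg", "fbtabletop"),
    "Peg Insertion": ("peg", "peghole"),
    "Rectangle on Wall": ("rectangle", "wall"),
    "Cupcake on Plate": ("cupcake", "plate"),
    "Cube Stacking": ("cube", "cube"),
}

# Values taken verbatim from the commands in the documentation.
CHECKPOINT_BASE_URL = (
    "https://s3.us-west-004.backblazeb2.com/uwlab-assets/Policies/OmniReset"
)
PLAY_SCRIPT = "scripts/reinforcement_learning/rsl_rl/play.py"
PLAY_TASK = "OmniReset-Ur5eRobotiq2f85-RelCartesianOSC-State-Play-v0"


def checkpoint_name(task: str) -> str:
    """Return the checkpoint file name, assuming the <insertive>_state_rl_expert.pt pattern."""
    insertive, _ = TASK_OBJECTS[task]
    return f"{insertive}_state_rl_expert.pt"


def checkpoint_url(task: str) -> str:
    """Return the download URL for the task's pretrained checkpoint."""
    return f"{CHECKPOINT_BASE_URL}/{checkpoint_name(task)}"


def build_play_command(task: str, num_envs: int = 1) -> list[str]:
    """Build the evaluation command as an argument list (suitable for subprocess.run)."""
    insertive, receptive = TASK_OBJECTS[task]
    return [
        "python", PLAY_SCRIPT,
        "--task", PLAY_TASK,
        "--num_envs", str(num_envs),
        "--checkpoint", checkpoint_name(task),
        f"env.scene.insertive_object={insertive}",
        f"env.scene.receptive_object={receptive}",
    ]


if __name__ == "__main__":
    # Print the download URL and a copy-pasteable command for every task.
    for task in TASK_OBJECTS:
        print(f"# {task}")
        print(f"wget {checkpoint_url(task)}")
        print(shlex.join(build_play_command(task)))
```

Returning an argument list rather than a single string keeps the command safe to pass to ``subprocess.run`` without shell quoting; ``shlex.join`` is used only for display.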