
Coarse-to-fine Language-Aligned manipulation Policy (CLAP)

To enhance generalization to novel instructions and environment variations, we propose Coarse-to-fine Language-Aligned manipulation Policy (CLAP), a framework that integrates three key components: 1) task decomposition, 2) VLM fine-tuning for 3D keypoint prediction, and 3) 3D-aware representation.

🔗 Website 📄 arXiv

Getting Started

Install

  • Tested (Recommended) Versions: Python 3.10 and CUDA 12.1.

  • Step 1 (Optional): We recommend using conda and creating a virtual environment.

conda create --name clap python=3.10
conda activate clap
  • Step 2: Install PyTorch. Make sure the PyTorch version is compatible with your CUDA version. More instructions for installing PyTorch can be found here.
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu124

Check that CUDA is available with the installed torch before moving to the next step.
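A minimal sanity check for this step (assumes the torch from Step 2 is installed in the active environment; prints a hint instead of crashing if it is missing):

```python
# Quick sanity check: confirm torch imports and sees a CUDA device.
import importlib.util

if importlib.util.find_spec("torch") is None:
    print("torch is not installed -- rerun Step 2")
else:
    import torch
    print("torch version:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
```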

  • Step 3: Install PyTorch3D. For more instructions visit here.
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"

  • Step 4: Download CoppeliaSim. Once you have downloaded CoppeliaSim, add the following to your ~/.bashrc file. (NOTE: edit the 'EDIT ME' in the first line.)

export COPPELIASIM_ROOT=<EDIT ME>/PATH/TO/COPPELIASIM/INSTALL/DIR
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$COPPELIASIM_ROOT
export QT_QPA_PLATFORM_PLUGIN_PATH=$COPPELIASIM_ROOT
export DISPLAY=:1.0

For a headless server, replace the last line above with:

Xvfb :0 -screen 0 1024x768x24 +extension GLX +render -noreset & export DISPLAY=:0

Remember to source your .bashrc (source ~/.bashrc) or .zshrc (source ~/.zshrc) after this.
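To verify that the variables took effect in a new shell, a small check can help (this helper is illustrative, not part of the repo):

```python
# Report whether each CoppeliaSim-related variable is visible to child processes.
import os

REQUIRED_VARS = (
    "COPPELIASIM_ROOT",
    "LD_LIBRARY_PATH",
    "QT_QPA_PLATFORM_PLUGIN_PATH",
    "DISPLAY",
)

def check_env(env=os.environ):
    """Return a dict mapping each required variable to its value, or None if unset."""
    return {var: env.get(var) for var in REQUIRED_VARS}

for var, value in check_env().items():
    print(f"{var} = {value if value is not None else '<not set>'}")
```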

  • Step 5: Clone the repository with the submodules using the following command.
git clone --recurse-submodules https://github.com/Jianshu-Hu/CLAP.git && cd CLAP && git submodule update --init
  • Step 6: Install the packages for fine-tuning the VLM with ms-swift:
pip install ms-swift==3.5.2
pip install transformers==4.51.3
pip install modelscope==1.27.1
pip install peft==0.15.2
pip install trl==0.18
pip install deepspeed==0.16.9
pip install vllm==0.8.5.post1
pip install qwen_vl_utils
  • Step 7: Install required libraries such as PyRep, RLBench, YARR, Point Renderer, and robot-colosseum.
pip install -e libs/PyRep 
pip install -e libs/RLBench 
pip install -e libs/YARR 
pip install -e libs/point-renderer
pip install -e libs/robot-colosseum-rvt/
pip install transforms3d
pip install timm
pip install bitsandbytes
pip install openai-clip
pip install pyquaternion
  • Step 8: Collect dataset.

    • You can generate the initial demonstrations using the following command. They will be generated under Generalizable-CLAP/data/gembench/xxx, where xxx is train, test, or val. Then modify DATA_DIR in config.py to match this location.
    bash scripts/collect_gembench_data.sh
    
    • Additionally, we use the same dataloader as PerAct, which is based on YARR. It saves the replay buffer to disk (created only once, the first time you run the low-level training). You can modify TASK_REPLAY_STORAGE_FOLDER in config.py to choose where the replay buffer is saved.
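For concreteness, the two config.py entries mentioned in Step 8 might look like this (the paths below are placeholders, not the repository's defaults):

```python
# config.py -- illustrative values only; substitute your own absolute paths.

# Where the generated GemBench demonstrations live (see Step 8).
DATA_DIR = "/path/to/Generalizable-CLAP/data/gembench"

# Where YARR stores the replay buffer created on the first low-level training run.
TASK_REPLAY_STORAGE_FOLDER = "/path/to/replay_buffer"
```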
  • Additional notes:

    • For a headless server, if you face a Qt-related issue such as Could not find the Qt platform plugin "xcb", try
    pip uninstall opencv-python opencv-python-headless
    pip install opencv-python-headless
    
    • If you face a libGL-related issue such as miniconda3/envs/robot-vlm/bin/../lib/libstdc++.so.6: version 'GLIBCXX_3.4.30' not found, run the following command to check whether your system already provides GLIBCXX_3.4.30.
    strings /usr/lib/x86_64-linux-gnu/libstdc++.so.6 | grep GLIBCXX
    

    If GLIBCXX_3.4.30 appears in the output, back up the original libstdc++.so.6 and copy your system's libstdc++.so.6 into the conda env. Remember to replace the directory with your own path.

    mv miniconda3/envs/robot-vlm/lib/libstdc++.so.6 miniconda3/envs/robot-vlm/lib/libstdc++.so.6.old
    cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6 miniconda3/envs/robot-vlm/lib/libstdc++.so.6
    

    If GLIBCXX_3.4.30 does not appear, update the libstdc++ library.

Train and eval in GemBench

Train

  • Coarse Task Planner:
    • Step 1: Prepare training data.
      bash scripts/prepare_gembench_pretraining_data.sh
      
      See detailed instructions for more information.
    • Step 2: Train the high-level module for GemBench. We provide a script for multi-GPU training:
      bash scripts/sft_gembench.sh
      
      and Python code for single-GPU training:
      python train.py --tag coarse_task_planner --task_name gembench --num_episodes 10 --data_type lang_keypoints --cot 9 --epochs 1 --lr 0.0003 --eval_save_steps 250 --include_lang_plan gembench
      
  • Fine-grained action predictor:
    • Step 1: Train the low-level policy for GemBench. Note that you need to set --gradient_accumulation to 16/num_gpus, where num_gpus is the number of GPUs you use (e.g. 16 for one GPU, 8 for two). For example, with one GPU, run:
      python finegrained_policy/train.py  --gradient_accumulation 16 --with_val --epochs 20 --tasks gembench --tag fine_grained_policy
      

Eval

Evaluate on GemBench:

bash scripts/eval_gembench.sh

Note: See detailed instructions for more information.

About

Code repo for ICLR 2026 paper "Generalizable Coarse-to-Fine Robot Manipulation via Language-Aligned 3D Keypoints"
