FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment

📢 News!

[2026/2/20] We released our paper on arXiv.

📃 Overview

Environment Setup

First, clone the RoboTwin repo and install it together with its required packages, following the guidance in the RoboTwin documentation.

git clone https://github.com/RoboTwin-Platform/RoboTwin.git
cd RoboTwin
conda create -n frappe python=3.10 -y
conda activate frappe

bash script/_install.sh  # Install RoboTwin basic envs and CuRobo

bash script/_download_assets.sh  # Download assets (RoboTwin-OD, Texture Library and Embodiments)

Then continue by setting up the environment for FRAPPE.

# Make sure python version == 3.10
conda activate frappe

# Install pytorch
# Look up https://pytorch.org/get-started/previous-versions/ with your cuda version for a correct command
pip install torch==2.1.0 torchvision==0.16.0  --index-url https://download.pytorch.org/whl/cu121

# Install packaging
pip install packaging==24.0
pip install ninja
# Verify Ninja --> should return exit code "0"
ninja --version; echo $?
# Install flash-attn
pip install flash-attn==2.7.2.post1 --no-build-isolation

# Install other prerequisites (requirements.txt is part of the FRAPPE repo cloned in the next step)
pip install -r requirements.txt
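
Before moving on, it can help to confirm that the pinned versions actually landed in the active environment. The following is a minimal sketch and not part of the repository: the helper names (installed_version, matches, report) are hypothetical, and it assumes the distribution names are the ones used in the pip commands above.

```python
import sys
from importlib import metadata

# Versions pinned in the install commands above.
EXPECTED = {
    "torch": "2.1.0",
    "torchvision": "0.16.0",
    "packaging": "24.0",
    "flash-attn": "2.7.2.post1",
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is missing."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def matches(installed, expected):
    """Compare versions while ignoring a local suffix such as '+cu121'."""
    return installed is not None and installed.split("+")[0] == expected


def report(expected=EXPECTED):
    """Map each distribution name to (installed_version, is_expected_version)."""
    rows = {}
    for name, want in expected.items():
        got = installed_version(name)
        rows[name] = (got, matches(got, want))
    return rows


if __name__ == "__main__":
    ok_python = sys.version_info[:2] == (3, 10)
    print(f"python {sys.version.split()[0]}: {'ok' if ok_python else 'expected 3.10'}")
    for name, (got, ok) in report().items():
        print(f"{name}: {got or 'not installed'} ({'ok' if ok else 'check'})")
```

Run it inside the activated frappe environment. A "check" entry only means the installed version differs from the pinned one, which may be intentional if you chose a different CUDA build.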

Then clone our repo as a policy of RoboTwin. The directory structure will be as below:

cd policy
git clone https://github.com/Jbo-Wang/frappe.git

RoboTwin
    └── policy
        ├── FRAPPE
        └── other policies ...

Installation

Download the pretrained checkpoint and the encoders used in the training stage.

# In the RoboTwin ROOT directory
cd policy
mkdir weights
cd weights
mkdir RDT && cd RDT

# Download the models
huggingface-cli download google/t5-v1_1-xxl --local-dir t5-v1_1-xxl
huggingface-cli download google/siglip-so400m-patch14-384 --local-dir siglip-so400m-patch14-384
huggingface-cli download robotics-diffusion-transformer/rdt-1b --local-dir rdt-1b

# Teacher encoders
#theia
huggingface-cli download theaiinstitute/theia-base-patch16-224-cdiv --local-dir theia-base-patch16-224-cdiv
#clip
huggingface-cli download laion/CLIP-ViT-H-14-laion2B-s32B-b79K --local-dir CLIP-ViT-H-14-laion2B-s32B-b79K 
#vit
huggingface-cli download google/vit-huge-patch14-224-in21k --local-dir vit-huge-patch14-224-in21k
#dinov2
git clone https://github.com/facebookresearch/dinov2.git
cd dinov2
mkdir checkpoints && cd checkpoints
huggingface-cli download facebook/dinov2-base --local-dir dinov2-base

Then set the actual local paths of the teacher encoders (Theia, CLIP, ViT, DINOv2) in utils.py.
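
A quick way to catch a wrong path before training is to check that each encoder directory exists. The sketch below is illustrative only: the function names and dictionary keys are hypothetical and do not reflect the actual variables in utils.py, and it assumes the folder names produced by the download commands above, all placed under one weights directory (for example policy/weights/RDT).

```python
from pathlib import Path


def teacher_encoder_paths(weights_dir):
    """Build the assumed encoder locations under a single weights directory."""
    root = Path(weights_dir)
    return {
        "theia": root / "theia-base-patch16-224-cdiv",
        "clip": root / "CLIP-ViT-H-14-laion2B-s32B-b79K",
        "vit": root / "vit-huge-patch14-224-in21k",
        "dinov2": root / "dinov2",  # directory created by the git clone
    }


def missing_encoders(paths):
    """Return the sorted names of encoders whose directory does not exist."""
    return sorted(name for name, path in paths.items() if not Path(path).is_dir())


if __name__ == "__main__":
    paths = teacher_encoder_paths("policy/weights/RDT")
    missing = missing_encoders(paths)
    if missing:
        print("Missing teacher encoders:", ", ".join(missing))
    else:
        print("All teacher encoder directories found.")
```

If anything is reported missing, correct the corresponding path in utils.py or re-run the matching download command.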

Getting Started

You can download the checkpoints from Hugging Face.

The directory structure will be as below:

frappe
    └── checkpoints
        ├── frappe_taskxxx
        │   └── checkpoint-xxx
        └── ...

We provide an inference example for our method (eval.sh). Set model_name to the checkpoint name under the ./checkpoints folder.

conda activate frappe
bash eval.sh
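
To see which values are available for model_name, you can list what is present under ./checkpoints. This is a minimal sketch under stated assumptions: list_checkpoints is a hypothetical helper, and it assumes each model folder contains subfolders named checkpoint-<number>, which matches the layout above only if the "xxx" suffix is a numeric training step.

```python
import re
from pathlib import Path


def list_checkpoints(checkpoints_dir):
    """Map each model folder name to its checkpoint steps, sorted ascending."""
    result = {}
    for model_dir in sorted(Path(checkpoints_dir).iterdir()):
        if not model_dir.is_dir():
            continue
        steps = []
        for sub in model_dir.iterdir():
            match = re.fullmatch(r"checkpoint-(\d+)", sub.name)
            if match and sub.is_dir():
                steps.append(int(match.group(1)))
        result[model_dir.name] = sorted(steps)
    return result


if __name__ == "__main__":
    root = Path("checkpoints")
    if root.is_dir():
        for model_name, steps in list_checkpoints(root).items():
            latest = f"checkpoint-{steps[-1]}" if steps else "no checkpoints"
            print(f"{model_name}: {latest}")
    else:
        print("No ./checkpoints folder found in the current directory.")
```

Run it from the FRAPPE policy directory; the printed folder names are candidates for model_name in eval.sh.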

Training

We provide a training example for our method. Training has two stages:

mid-training (finetune_mid.sh & model_config/mid_train.yml)
In mid_train.yml, update the path of the pretrained ckpts and the pretrained_model_name_or_path.
post-training (finetune_post.sh & model_config/post_train.yml)
In post_train.yml, update the path of the mid-train ckpts, the pretrained_model_name_or_path, and the teacher encoder paths.

conda activate frappe
bash finetune_mid.sh # or bash finetune_post.sh
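
Since both stages depend on paths edited by hand, a small pre-flight check can catch a stale pretrained_model_name_or_path before a long run starts. The sketch below is an assumption-laden illustration, not the repository's config loader: read_scalar and check_config are hypothetical helpers, and the code assumes the key appears as a single-line "key: value" entry holding a local directory. It uses a plain text scan rather than a YAML parser, so nested or multi-line values are not handled.

```python
import re
from pathlib import Path


def read_scalar(yaml_text, key):
    """Return the value of the first single-line 'key: value' entry, or None."""
    pattern = re.compile(rf"^\s*{re.escape(key)}\s*:\s*(.*?)(?:\s+#.*)?\s*$")
    for line in yaml_text.splitlines():
        match = pattern.match(line)
        if match:
            # Drop surrounding quotes; treat an empty value as missing.
            return match.group(1).strip("'\"") or None
    return None


def check_config(config_path, key="pretrained_model_name_or_path"):
    """Return (value, is_existing_dir) for the given key in a config file."""
    value = read_scalar(Path(config_path).read_text(encoding="utf-8"), key)
    return value, bool(value) and Path(value).expanduser().is_dir()


if __name__ == "__main__":
    for name in ("model_config/mid_train.yml", "model_config/post_train.yml"):
        if not Path(name).is_file():
            print(f"{name}: file not found")
            continue
        value, exists = check_config(name)
        status = "ok" if exists else "not an existing directory"
        print(f"{name}: pretrained_model_name_or_path={value!r} ({status})")
```

The same check_config call can be pointed at other path-valued keys by passing a different key argument, provided they follow the same single-line form.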

The default configurations match the experimental setup in our paper.

🔥 TODO List

✅ Training and inference code on RoboTwin2.0

🌏 Contact

For further discussion and collaboration, please feel free to contact us via Email and WeChat:

Author	Email	WeChat
Han Zhao	zhaohan34@westlake.edu.cn	este_zh
Jingbo Wang	guangtouchangkaishen@outlook.com	guangtouchangkaishen
Wenxuan Song	songwenxuan0115@gmail.com	swx0757

❤️ Acknowledgement

We thank these great works and open-source codebases: RDT & Theia

🖊 Citation

If you find this work useful, please cite:

@article{zhao2026frappe,
    title={FRAPPE: Infusing World Modeling into Generalist Policies via Multiple Future Representation Alignment},
    author={Han Zhao and Jingbo Wang and Wenxuan Song and Shuai Chen and Yang Liu and Yan Wang and Haoang Li and Donglin Wang},
    journal = {arXiv preprint arXiv:2602.17259},
    year={2026},
}
