Junfeng Ni1,2, Song-Chun Zhu1,2,3, Siyuan Huang2
1Tsinghua University 2National Key Lab of General AI, BIGAI 3Peking University
We provide a script, install.sh, to set up the environment. In our experiments, we used NVIDIA CUDA 12.4 on Ubuntu 22.04; you may need to modify the installation commands to match your CUDA version.
For the VideoArtGS-20 dataset, we provide the data here.
For the Video2Articulation dataset, please download the data from Video2Articulation and the PartNet-Mobility dataset, then preprocess it with python data_tools/process_v2a.py. Alternatively, you can download the processed version here.
Data structure:
data
├── videoartgs
│   ├── realscan
│   │   ├── microwave
│   │   │   ├── images
│   │   │   ├── ...
│   ├── sapien
│   │   ├── 100481
│   │   │   ├── images
│   │   │   ├── ...
├── v2a
│   ├── sapien
│   │   ├── 100068_joint_0_bg_view_0
│   │   │   ├── images
│   │   │   ├── ...
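After downloading, a small sanity check like the following can verify that each scene folder contains an images directory. This is an illustrative sketch assuming the three-level layout shown above (dataset/split/scene); adjust the root if you downloaded only a subset:

```python
from pathlib import Path

def check_layout(data_root="data"):
    """Return scene folders under data_root missing an images/ subfolder."""
    missing = []
    # Matches paths like data/videoartgs/sapien/100481
    for scene in Path(data_root).glob("*/*/*"):
        if scene.is_dir() and not (scene / "images").is_dir():
            missing.append(str(scene))
    return missing
```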
We provide the following files and scripts for training:
init_cano.py & scripts/init_cano.sh: training the coarse single-state Gaussians.
init_deform.py & scripts/init_deform.sh: training the deformable Gaussians.
train.py & scripts/train.sh: training the full model.
train_gui.py: training the full model with GUI visualization.
Please run scripts/init_cano.sh and scripts/init_deform.sh before running scripts/train.sh.
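The three-stage ordering can be scripted as below. This is a minimal sketch; the stage scripts are the ones listed above, but any per-scene arguments they expect are not shown here:

```python
import subprocess

# Three-stage training pipeline: coarse canonical Gaussians first,
# then deformable Gaussians, then the full model.
STAGES = ["scripts/init_cano.sh", "scripts/init_deform.sh", "scripts/train.sh"]

def run_pipeline(stages=STAGES, runner=subprocess.run):
    """Run each stage in order; check=True aborts on the first failure."""
    for script in stages:
        runner(["bash", script], check=True)
```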
We provide render.py and the scripts scripts/render.sh and scripts/eval.sh for evaluation. You can download the checkpoints from here and put them in the outputs folder.
We provide visualization tools for intermediate results in the vis_utils folder.
You can visualize the point cloud, joint, and centers for initialization in vis_utils/vis_init.ipynb, and visualize the Gaussians and deformation models in vis_utils/vis_videoartgs.ipynb.
We provide vis_utils/json2urdf.py to export URDF files from the trained model. Load the URDF files with IsaacSim (>=4.5) to export USD files. We found that IsaacSim cannot load textures of .ply meshes, so we provide a script, vis_utils/ply2glb.py, which uses Blender to convert the .ply meshes to .glb meshes.
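For reference, the core of a JSON-to-URDF export can be sketched with the standard library alone. The joint-dict schema below (name, type, parent/child links, origin, axis, limits) is a hypothetical assumption for illustration; the actual format consumed by vis_utils/json2urdf.py may differ:

```python
import xml.etree.ElementTree as ET

def joints_to_urdf(robot_name, joints):
    """Build a minimal URDF string from a list of joint dicts.

    Each dict is assumed to hold: name, type ("revolute"/"prismatic"),
    parent/child link names, origin xyz, axis xyz, and a (lower, upper) limit.
    Mesh geometry and inertial tags are omitted for brevity.
    """
    robot = ET.Element("robot", name=robot_name)
    links = set()
    for j in joints:
        links.update([j["parent"], j["child"]])
    for link in sorted(links):
        ET.SubElement(robot, "link", name=link)
    for j in joints:
        joint = ET.SubElement(robot, "joint", name=j["name"], type=j["type"])
        ET.SubElement(joint, "parent", link=j["parent"])
        ET.SubElement(joint, "child", link=j["child"])
        ET.SubElement(joint, "origin", xyz=" ".join(map(str, j["origin"])))
        ET.SubElement(joint, "axis", xyz=" ".join(map(str, j["axis"])))
        ET.SubElement(joint, "limit",
                      lower=str(j["limit"][0]), upper=str(j["limit"][1]))
    return ET.tostring(robot, encoding="unicode")
```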
See detailed instructions in preprocess.md.
If you find our paper and/or code helpful, please consider citing:
@article{liu2025videoartgs,
title={VideoArtGS: Building Digital Twins of Articulated Objects from Monocular Video},
author={Liu, Yu and Jia, Baoxiong and Lu, Ruijie and Gan, Chuyue and Chen, Huayu and Ni, Junfeng and Zhu, Song-Chun and Huang, Siyuan},
journal={arXiv preprint arXiv:2509.17647},
year={2025}
}
This code builds heavily on resources from ArtGS, SpatialTrackerV2, TAPIP3D, and Video2Articulation. We thank the authors for open-sourcing their awesome projects.
