Check out our Project Page for more visual demos!
**02/08/2026**
- Release training and inference code.
- Release training data.

TODO:
- Release training and inference code
- Release datasets (LVIS, Objects365, etc. in WebDataset format)
- Release pretrained models (coming soon)
- Release any-object compositing code
- System: Linux (tested on Ubuntu 20.04/22.04).
- Hardware:
  - GPU: NVIDIA GPU with at least 40GB VRAM (e.g., A6000, A100, H100).
  - RAM: minimum 64GB system memory recommended.
- Software:
  - Conda is recommended.
  - Python 3.10 or higher.
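The software checks above can be sketched as a quick sanity script using only the standard library (the VRAM check is omitted because it needs a GPU library such as `torch`, which may not be installed yet):

```python
import shutil
import sys

def check_requirements():
    """Return a list of human-readable problems with this machine's setup."""
    problems = []
    if sys.version_info < (3, 10):  # Python 3.10 or higher is required
        problems.append(f"Python 3.10+ required, found {sys.version.split()[0]}")
    if not sys.platform.startswith("linux"):  # only Linux is tested
        problems.append(f"untested platform: {sys.platform}")
    if shutil.which("conda") is None:  # conda is recommended
        problems.append("conda not found on PATH")
    return problems

if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```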
Create a new conda environment named `PICS` and install the dependencies:

```bash
conda env create --file=PICS.yml
conda activate PICS
```
- DINOv2: download ViT-g/14 and place it at `checkpoints/dinov2_vitg14_pretrain.pth`.
- PICS checkpoints: coming soon! We are currently finalizing the model weights for public release; links will be updated once uploaded to Google Drive/Hugging Face.
Our training set is a mixture of LVIS, VITON-HD, Objects365, Cityscapes, Mapillary Vistas and BDD100K. We provide the processed two-object compositing data in WebDataset format (.tar shards) below:
| Dataset | #Samples | Size | Download |
|---|---|---|---|
| LVIS | 34,160 | 7.98GB | Download |
| VITON-HD | 11,647 | 2.53GB | Download |
| Objects365 | 940,764 | 243GB | Download |
| Cityscapes | 536 | 1.21GB | Download |
| Mapillary Vistas | 603 | 582MB | Download |
| BDD100K | 1,012 | 204MB | Download |
After downloading, organize the shards as follows:

```
PICS/
├── data/
│   └── train/
│       ├── LVIS/
│       │   ├── 00000.tar
│       │   └── ...
│       ├── VITONHD/
│       ├── Objects365/
│       ├── Cityscapes/
│       ├── MapillaryVistas/
│       └── BDD100K/
```
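Each `.tar` shard follows the WebDataset convention: tar members that share a basename (e.g. `000123.jpg` and `000123.json`) form one training sample. The training code presumably reads shards with the `webdataset` library; the sketch below shows only the grouping rule, using the standard library (the file names and extensions are illustrative):

```python
import tarfile
from collections import defaultdict

def iter_samples(shard_path):
    """Group a shard's tar members into samples, keyed by basename.

    Under the WebDataset convention, 'train/000123.jpg' and
    'train/000123.json' belong to the same sample 'train/000123'.
    Yields one {extension: raw_bytes} dict per sample.
    """
    samples = defaultdict(dict)
    with tarfile.open(shard_path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, _, ext = member.name.rpartition(".")
            samples[key][ext] = tar.extractfile(member).read()
    yield from samples.values()
```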
We provide a script that uses SAM to extract high-quality object silhouettes for the Objects365 dataset. To process a specific range of data shards, run:

```bash
python scripts/annotate_sam.py --is_train --index_low 00000 --index_high 10000
```
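The `--index_low`/`--index_high` flags appear to select a contiguous range of zero-padded shard files. A sketch of the assumed mapping from an index range to shard filenames (five-digit names and a half-open range are assumptions, not documented behavior):

```python
def shard_names(index_low, index_high):
    """Shard filenames for the assumed half-open range [index_low, index_high)."""
    return [f"{i:05d}.tar" for i in range(int(index_low), int(index_high))]

# e.g. shard_names("00000", "00003") -> ["00000.tar", "00001.tar", "00002.tar"]
```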
To process raw data (e.g., LVIS), run the following command, replacing `/path/to/raw_data` with your actual local data path:

```bash
python -m datasets.lvis \
    --dataset_dir "/path/to/raw_data" \
    --construct_dataset_dir "data/train/LVIS" \
    --area_ratio 0.02 \
    --is_build_data \
    --is_train
```
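The `--area_ratio 0.02` flag presumably filters out objects whose segmentation mask covers less than 2% of the image, since very small objects carry little appearance signal for compositing. A hedged sketch of that filter (the function name and exact comparison are assumptions):

```python
def passes_area_ratio(mask_area, img_width, img_height, area_ratio=0.02):
    """Keep an object only if its mask covers at least `area_ratio` of the image."""
    return mask_area / (img_width * img_height) >= area_ratio

# On a 100x100 image: a 300-pixel mask (3%) passes, a 100-pixel mask (1%) does not.
```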
To train a model on the whole dataset:

```bash
python run_train.py \
    --root_dir 'LOGS/whole_data' \
    --batch_size 16 \
    --logger_freq 1000 \
    --is_joint
```
To run inference:

```bash
python run_test.py \
    --input "sample" \
    --output "results/sample" \
    --obj_thr 2
```
This project is licensed under the terms of the MIT license.
