Feel free to change the CUDA version; the following assumes CUDA 12.6.
pip install torch==2.8.0 torchvision==0.23.0 torchaudio==2.8.0 --index-url https://download.pytorch.org/whl/cu126
pip install kaolin==0.18.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.8.0_cu126.html
pip install -U xformers --index-url https://download.pytorch.org/whl/cu126
pip install git+https://github.com/microsoft/MoGe.git
pip install git+https://github.com/NVlabs/nvdiffrast.git
pip install -r requirements.txt
Install Hunyuan3D
cd ext/Hunyuan3D-2/hy3dgen/texgen/custom_rasterizer
python3 setup.py install
cd ../../..
cd hy3dgen/texgen/differentiable_renderer
python3 setup.py install
cd ../../../../../
Download HQ-SAM
sh download_sam.sh
Alternatively, you can run setup.sh to install all of the required dependencies in one step.
Lastly, place your OpenAI API key in ./vlm_proposal/prompts/openai_key.py
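The key file is a small Python module. A minimal sketch is below; the variable name here is an assumption, so match whatever ./vlm_proposal/prompts/openai_key.py already defines:

```python
# ./vlm_proposal/prompts/openai_key.py (hypothetical layout --
# check the existing file for the exact variable name expected)
OPENAI_API_KEY = "sk-..."  # replace with your actual key
```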
You can run the full pipeline to generate scenes. The scenes used for the figures are in ./samples/examples/
python pipeline.py path/to/image path/to/output_folder
If you want to generate scenes for every image in a folder, you can instead run
python pipeline.py path/to/image_folder path/to/output_folder
Alternatively, you can first run the object removal pipeline
python generate_layer_data.py path/to/image path/to/output_folder
Then run the fitting pipeline like so
python generate_scene.py path/to/output
Note: generate_layer_data.py writes its output to a folder with the same name as the input image (without the file extension), i.e. /path/to/output_folder/image_filename. generate_scene.py assumes path/to/output is exactly the folder produced by generate_layer_data.py.
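The folder naming convention above can be sketched in shell; the image name kitchen.png and both paths are illustrative placeholders only:

```shell
# Hypothetical input image and output folder, for illustration
img=path/to/kitchen.png
out=path/to/output_folder

# generate_layer_data.py names its output folder after the image,
# with the file extension stripped
name=$(basename "$img")   # kitchen.png
name="${name%.*}"         # kitchen

# This is the folder you would then pass to generate_scene.py
echo "$out/$name"         # prints path/to/output_folder/kitchen
```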
For object fitting with VGGT, refer to the fit_object() function in mesh_fitting/fit_objects.py. Note that it generates data by running the script at ./ext/vggt/predict_cam_corrs.py.
For the inpainting pipeline, refer to inpaint_image_flux() in ./vlm_proposal/scene_inpainting.py.