Efficient and Portable 3D Explorable World Generation on AMD GPUs

🔆 Introduction

3D world generation is a rapidly growing area of research, and we want to bring popular projects in this space to the ROCm ecosystem. One such project is Matrix-3D, a framework that generates an explorable 3D world from a text or image prompt by combining conditional video generation with panoramic 3D reconstruction and representing the resulting scene with 3D Gaussian Splatting (3DGS). See their tech report for full details.

In this blog, we describe how we deployed Matrix-3D on AMD Instinct™ MI250 and MI300 GPUs. With a series of targeted modifications and optimizations, we made the framework both more efficient and more portable: end-to-end generation time for a single world drops from 2887 s to 1306 s on one MI250 and from 972 s to 482 s on one MI300.

📝 What This Project Covers

[Kernel optimization]: 🔥Replacing rendering kernels with more portable Triton kernels, with help from the kernel-writing agent GEAK, without sacrificing performance.
[Faster 3DGS fitting]: 🔥Replacing the original rasterization backend with gsplat for better efficiency and portability.
[Pipeline-level optimization]: 🔥Refactoring the pipeline to reduce repeated model loading, I/O overhead, and recomputation, while also accelerating depth-map merging.
[Reproducible setup]: 🔥Providing step-by-step instructions for running Matrix-3D on AMD GPUs.
[End-to-end results]: 🔥Showing the speedup of the optimized version over the original implementation on AMD GPUs.

🎬 Examples

We show both image-to-image and text-to-image results below.

Prompt	Panoramic Video	3D Scene

"an impressionistic winter landscape"

The table and figure below show the end-to-end latency in seconds. The optimizations are applied cumulatively from left to right. Overall, the optimized version reduces latency by 54% on the MI250 GPU and 50% on the MI300 GPU.

GPU	Original (s)	w/ gsplat (s)	w/ solver opt. (s)	w/ I/O opt. (s)	Total reduction
MI250	2887	2527	1406	1306	54%
MI300	972	853	507	482	50%

End-to-end latency comparison between the original and optimized pipelines on MI250 and MI300.
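
The "Total reduction" column can be reproduced from the raw timings. The sketch below uses only the numbers from the table; the function names `reduction_pct` and `speedup` are illustrative and not part of the project. Bash integer division truncates, which matches the table (the unrounded MI250 value is about 54.8%).

```shell
#!/usr/bin/env bash
# Latency figures (seconds) are copied from the table above.

# reduction_pct BEFORE AFTER -> percentage reduction, truncated to an integer
reduction_pct() {
  echo $(( ($1 - $2) * 100 / $1 ))
}

# speedup BEFORE AFTER -> speedup factor with two decimals
speedup() {
  awk -v b="$1" -v a="$2" 'BEGIN { printf "%.2f\n", b / a }'
}

echo "MI250: $(reduction_pct 2887 1306)% lower latency, $(speedup 2887 1306)x faster"
echo "MI300: $(reduction_pct 972 482)% lower latency, $(speedup 972 482)x faster"
```

The same helper applies to individual steps, for example `reduction_pct 2527 1406` for the MI250 step from "w/ gsplat" to "w/ solver opt.".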

Installation

For ROCm GPUs, we suggest using a prebuilt Docker image from rocm/pytorch, for example rocm/pytorch:rocm7.2_ubuntu22.04_py3.10_pytorch_release_2.9.1.

After starting the container, clone our project and run:
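
One possible way to start that container is sketched below. The device flags (`/dev/kfd`, `/dev/dri`) and the `video` group are the usual ones for exposing AMD GPUs to Docker; the shared-memory size, the mount path, and the `launch_rocm_container` helper are assumptions for illustration, not settings prescribed by this project.

```shell
#!/usr/bin/env bash
# Sketch: start the ROCm PyTorch container with GPU access.
IMAGE="rocm/pytorch:rocm7.2_ubuntu22.04_py3.10_pytorch_release_2.9.1"

launch_rocm_container() {
  # /dev/kfd and /dev/dri expose the AMD GPU devices to the container.
  # The shm size and the /workspace mount are illustrative assumptions.
  local cmd=(docker run -it --rm
    --device=/dev/kfd --device=/dev/dri
    --group-add video
    --ipc=host --shm-size 16G
    -v "$PWD":/workspace -w /workspace
    "$IMAGE")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "${cmd[*]}"   # print the command instead of running it
  else
    "${cmd[@]}"
  fi
}
```

Calling `DRY_RUN=1 launch_rocm_container` prints the assembled command for inspection; calling `launch_rocm_container` runs it.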

bash scripts/install_m3d.sh

All the dependencies will be installed automatically.
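
Before a long run, it can help to confirm that PyTorch sees the GPU. The check below is a minimal sketch and not part of the project's scripts; `check_rocm_pytorch` is a hypothetical helper name. It assumes `python3` inside the container is the interpreter that has PyTorch installed.

```shell
#!/usr/bin/env bash
# check_rocm_pytorch -> reports whether PyTorch can see a ROCm GPU here
check_rocm_pytorch() {
  # rocm-smi ships with ROCm; skip it quietly if it is not on PATH.
  if command -v rocm-smi >/dev/null 2>&1; then
    rocm-smi
  fi
  # On ROCm builds of PyTorch, the torch.cuda API is backed by HIP,
  # and torch.version.hip holds the HIP version string.
  python3 - <<'EOF'
import torch
print("HIP version:", torch.version.hip)
print("GPU visible:", torch.cuda.is_available())
print("GPU count:  ", torch.cuda.device_count())
EOF
}
```

Run `check_rocm_pytorch` inside the container; "GPU visible: True" with a non-zero GPU count indicates the devices were passed through correctly.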

Usage

For text prompts, run:

bash scripts/run_m3d_t2i.sh

For image prompts, run:

bash scripts/run_m3d_i2i.sh
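
To compare a run against the latency table, the wall-clock time of a script can be measured with a small wrapper. This is one simple approach and not necessarily how the reported numbers were collected; `time_run` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# time_run CMD [ARGS...] -> runs the command, then reports wall-clock seconds on stderr
time_run() {
  local start=$SECONDS          # SECONDS is bash's built-in elapsed-time counter
  "$@"
  local status=$?
  echo "elapsed: $(( SECONDS - start ))s (exit status $status)" >&2
  return "$status"              # preserve the wrapped command's exit status
}
```

For example, `time_run bash scripts/run_m3d_t2i.sh` runs the script and prints the elapsed seconds once it finishes.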

Acknowledgements

We are grateful for the excellent work of:
