A small path tracer implemented in C++ with both CPU and CUDA renderers. Includes diffuse, metal, and dielectric materials; recursive ray scattering; antialiasing; gamma correction; and a depth-of-field camera.
The CUDA version parallelizes rendering across pixels so each GPU thread traces and samples one pixel independently. This makes it a good project for comparing a straightforward CPU renderer with a GPU implementation of the same ray-tracing logic.
- CPU renderer in `main.cpp`
- CUDA renderer in `main.cu`
- Randomized scene generation with hundreds of spheres
- Lambertian, metal, and dielectric materials
- Camera aperture and focus distance for depth of field
- Per-pixel random sampling with CURAND in the CUDA path
- PPM image output
- Makefile targets for building and profiling
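As a rough illustration of the gamma correction and PPM output listed above, the sketch below converts linear colors to 8-bit values and writes a plain-text PPM. It assumes a gamma of 2.0 (a square root) and the ASCII `P3` variant of the format; the names `Color`, `to_byte`, `write_ppm`, and `to_ppm_string` are hypothetical and may not match the project's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

struct Color { double r, g, b; };  // linear RGB, nominally in [0, 1]

// Gamma-correct one linear channel (gamma 2.0 assumed) and quantize to 0..255.
int to_byte(double linear) {
    double clamped = std::min(1.0, std::max(0.0, linear));
    return static_cast<int>(255.999 * std::sqrt(clamped));
}

// Write pixels as an ASCII PPM (P3). Pixels are assumed to be stored
// row by row, top row first, with width * height entries in total.
void write_ppm(std::ostream& out, const std::vector<Color>& pixels,
               int width, int height) {
    assert(width > 0 && height > 0);
    assert(pixels.size() ==
           static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    out << "P3\n" << width << ' ' << height << "\n255\n";
    for (const Color& c : pixels) {
        out << to_byte(c.r) << ' ' << to_byte(c.g) << ' ' << to_byte(c.b) << '\n';
    }
}

// Convenience wrapper that returns the PPM text, handy for small checks.
std::string to_ppm_string(const std::vector<Color>& pixels, int width, int height) {
    std::ostringstream out;
    write_ppm(out, pixels, width, height);
    return out.str();
}
```

To write a file instead, pass a `std::ofstream` opened on the output path to `write_ppm`.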
The CPU and CUDA implementations share the same basic rendering model: rays are generated from a camera, tested against a list of hittable objects, scattered through materials up to a maximum depth, and accumulated into a final pixel color.
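The bounce limit in that model can be sketched as follows. This is a simplified illustration, not the project's code: it is written as a loop rather than recursion, it ignores absorption and emission, and `ScatterResult`, `trace_path`, and the `scatter_at` callback are hypothetical stand-ins for "intersect the scene and scatter through the material".

```cpp
#include <cassert>

struct Color { double r, g, b; };

// Outcome of intersecting and scattering a ray once (hypothetical type).
struct ScatterResult {
    bool hit;           // false when the ray misses every object
    Color attenuation;  // per-channel factor applied by the material on a hit
};

// Follow one path for at most max_depth bounces.
// scatter_at(depth) is supplied by the caller and reports what happens
// at the given bounce. Each hit multiplies the running throughput by the
// material's attenuation; a miss returns the background color scaled by
// that throughput. A path that uses up its bounce budget contributes black.
template <typename ScatterFn>
Color trace_path(ScatterFn scatter_at, Color background, int max_depth) {
    assert(max_depth >= 0);
    Color throughput{1.0, 1.0, 1.0};
    for (int depth = 0; depth < max_depth; ++depth) {
        ScatterResult s = scatter_at(depth);
        if (!s.hit) {
            return Color{throughput.r * background.r,
                         throughput.g * background.g,
                         throughput.b * background.b};
        }
        throughput.r *= s.attenuation.r;
        throughput.g *= s.attenuation.g;
        throughput.b *= s.attenuation.b;
    }
    return Color{0.0, 0.0, 0.0};
}
```

A pixel's final color would then be the average of `trace_path` over all of that pixel's samples.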
The CUDA renderer uses:
- `16x16` thread blocks
- One thread per pixel
- A `1200x608` output resolution
- `100` samples per pixel
- Up to `50` ray bounces per sample
- `cudaMallocManaged` for the framebuffer
- Device allocations for the world, camera, object list, and CURAND states
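The launch geometry implied by these numbers can be worked out on the host. The sketch below is plain C++ that mirrors the usual CUDA index arithmetic (`blockIdx * blockDim + threadIdx`); `Dim2`, `blocks_for`, `grid_for`, and `pixel_index` are hypothetical names, and the row-major framebuffer layout is an assumption.

```cpp
#include <cassert>

struct Dim2 { int x, y; };

// Blocks needed to cover `extent` pixels with `block` threads each
// (ceiling division, so a partial block covers any remainder).
int blocks_for(int extent, int block) {
    assert(extent > 0 && block > 0);
    return (extent + block - 1) / block;
}

Dim2 grid_for(Dim2 image, Dim2 block) {
    return Dim2{blocks_for(image.x, block.x), blocks_for(image.y, block.y)};
}

// Map a (block index, thread index) pair to a linear pixel index,
// assuming a row-major framebuffer. Returns -1 for threads that fall
// outside the image, which a kernel would handle with an early return.
long pixel_index(Dim2 image, Dim2 block, Dim2 block_idx, Dim2 thread_idx) {
    int i = block_idx.x * block.x + thread_idx.x;
    int j = block_idx.y * block.y + thread_idx.y;
    if (i >= image.x || j >= image.y) return -1;
    return static_cast<long>(j) * image.x + i;
}
```

For a `1200x608` image and `16x16` blocks this gives a `75x38` grid with no partial blocks, since both dimensions are multiples of 16.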
The scene and camera are created on the GPU, then the render kernel fills the framebuffer in parallel. Each pixel gets its own CURAND state so antialiasing and material scattering can use independent random samples.
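A CPU analogue of the per-pixel random state is sketched below, using `std::mt19937` in place of CURAND. Seeding each generator from a base seed plus the pixel's linear index is an assumption made for illustration, as are the names `make_pixel_rng`, `Sample`, and `jittered_uv`; the CUDA code may initialize its states differently.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Build the generator for one pixel. The same (base_seed, pixel_index)
// pair always yields the same sequence, so a render is reproducible.
std::mt19937 make_pixel_rng(std::uint32_t base_seed, std::uint32_t pixel_index) {
    return std::mt19937(base_seed + pixel_index);
}

struct Sample { double u, v; };  // normalized image coordinates

// Pick a random position inside pixel (i, j) for one antialiasing sample.
Sample jittered_uv(std::mt19937& rng, int i, int j, int width, int height) {
    assert(width > 0 && height > 0);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    double du = unit(rng);  // horizontal offset within the pixel
    double dv = unit(rng);  // vertical offset within the pixel
    return Sample{(i + du) / width, (j + dv) / height};
}
```

Because each pixel owns its generator, no thread has to wait on or share state with another while drawing samples.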
- NVIDIA GPU with CUDA support
- CUDA Toolkit with `nvcc`, `nsys`, and `ncu`
- C++ compiler, configured as `g++` in the Makefile
- `make`
The Makefile assumes CUDA is installed at `/usr/local/cuda` unless `CUDA_PATH` is set.
Build the CPU renderer:

    make cpuart

Build the CUDA renderer:

    make cudart

On Windows, the executables are built as `cpuart.exe` and `cudart.exe`. On Linux/macOS-style environments, they are built as `cpuart` and `cudart`.
Run the CPU renderer:

    ./cpuart

Run the CUDA renderer:

    ./cudart

Both renderers write a PPM image to:

    render.ppm
The CUDA renderer also prints the render time to standard error.
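A minimal way to measure and report a render time on standard error is sketched below with `std::chrono`. The helper name `timed_seconds` and the message text are hypothetical. In CUDA code, kernel launches are asynchronous, so the timed callable would need to synchronize with the device (for example with `cudaDeviceSynchronize()`) before returning for the measurement to cover the whole render.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>

// Run `render`, print the elapsed wall-clock time to standard error,
// and return it in seconds.
template <typename RenderFn>
double timed_seconds(RenderFn render) {
    auto start = std::chrono::steady_clock::now();
    render();
    auto stop = std::chrono::steady_clock::now();
    double seconds = std::chrono::duration<double>(stop - start).count();
    assert(seconds >= 0.0);  // steady_clock never runs backwards
    std::cerr << "render took " << seconds << " seconds\n";
    return seconds;
}
```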
Build and profile the CUDA renderer with NVIDIA Nsight Systems:

    make profile_basic

Collect a compute profile with NVIDIA Nsight Compute:

    make profile_compute

Collect selected occupancy and instruction metrics:

    make profile_metrics

Profiling reports are written to the `reports/` directory. You can set a custom report tag with:

    make profile_basic TAG=my_run

Remove generated executables and output files:

    make clean