27 commits
a7df4a3
have basic rasterization with depth buffer
AmanSachan1 Oct 12, 2017
8642ac3
working with float depth buffer
AmanSachan1 Oct 12, 2017
355ab8e
have shaders but theres a weird bug
AmanSachan1 Oct 12, 2017
f712b90
fixed depth issues
AmanSachan1 Oct 12, 2017
a025970
fixes for perspective correct interpolation
AmanSachan1 Oct 13, 2017
27115ad
implemented a mutex lock for depth
AmanSachan1 Oct 13, 2017
af6e978
texture mapping with bilinear interpolation
AmanSachan1 Oct 14, 2017
cef0109
refactored code with helper functions and wrote out how to do tile ba…
AmanSachan1 Oct 14, 2017
61e5920
wrote code for culling but need to debug it so that it works
AmanSachan1 Oct 14, 2017
96e2197
finished backface culling, started tile based
AmanSachan1 Oct 16, 2017
4629d49
tile rasterization checkpoint 1
AmanSachan1 Oct 16, 2017
5a3d94d
tile based check point 2
AmanSachan1 Oct 17, 2017
7363e19
tile based has race conditions so only 5-7 triangles are drawn per ti…
AmanSachan1 Oct 17, 2017
3167b47
tile based checkpoint 3; much much better but still some race conditions
AmanSachan1 Oct 17, 2017
9955284
tile based works, from certain angles a tile is rendered black someti…
AmanSachan1 Oct 17, 2017
b52b3ab
readme
AmanSachan1 Oct 18, 2017
ee36dab
readme images
AmanSachan1 Oct 18, 2017
74b3c60
readme update
AmanSachan1 Oct 18, 2017
99b8dbd
readme pictures
AmanSachan1 Oct 18, 2017
938247a
readme pictures
AmanSachan1 Oct 18, 2017
a7c9dcc
readme pictures
AmanSachan1 Oct 18, 2017
438b230
readme pictures
AmanSachan1 Oct 18, 2017
cd5ac2c
readme pictures
AmanSachan1 Oct 18, 2017
0ebe219
readme correction
AmanSachan1 Oct 18, 2017
f0b2c94
rasterizaiont if defs changed
AmanSachan1 Oct 18, 2017
d300663
readme update
AmanSachan1 Oct 19, 2017
870d4ce
changed readme
AmanSachan1 Oct 27, 2017
125 changes: 118 additions & 7 deletions README.md
@@ -1,18 +1,129 @@
CUDA Rasterizer
===============

[CLICK ME FOR INSTRUCTION OF THIS PROJECT](./INSTRUCTION.md)

**University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 4**

* Aman Sachan
* Tested on: Windows 10, i7-7700HQ @ 2.8GHz 32GB, GTX 1070 (laptop GPU) 8074MB (Personal Machine: Customized MSI GT62VR 7RE)

[![](readmeImages/CUDARasterizerVimeoLink.png)](https://vimeo.com/238849486)

## Overview

Rasterization (or rasterisation), as defined by Wikipedia, is the task of taking an image described in a vector graphics format (shapes) and converting it into a raster image (pixels or dots).

In this project, I simulated the rasterization process of a GPU using CUDA kernels. The aim of this project was to learn the graphics pipeline more intimately while also gaining an appreciation for the GPU's existing rasterization capabilities.

The pipeline I implemented here is a fairly simple one. It consists of:
- Vertex input assembly
- Vertex shading
- Primitive assembly
- Backface culling
- Rasterization
- Per fragment depth test
- Fragment shading

![](readmeImages/pipeline.png)

## Features

### Tile Based Rasterization

Tile based rasterization is a technique commonly seen on low-power devices like mobile phones. It is gaining popularity, however, and has been [adopted by Nvidia as the de facto rasterization technique](https://www.realworldtech.com/tile-based-rasterization-nvidia-gpus/) since the Maxwell architecture.

Tiled rasterization cuts the output image into a grid of 2D tiles that are then processed separately. As a preprocessing step, all the primitives in the scene are binned into tiles using their bounding boxes. Then, during the actual rasterization stage, a separate kernel is launched for each tile, and it deals only with the primitives that were binned into that tile. Those are the only major differences that tiled rasterization introduces compared to a regular scanline implementation.
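The binning step can be sketched on the CPU as follows. This is a minimal C++ illustration, not the project's CUDA kernel: the `Vec2` and `Triangle` types, the `binTriangles` name, and the square `tileSize` are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical screen-space types, for illustration only.
struct Vec2 { float x, y; };
struct Triangle { Vec2 v[3]; };

// Bins triangles into square tiles by bounding box. Returns one index list
// per tile, stored row-major (tile index = ty * tilesX + tx).
std::vector<std::vector<int>> binTriangles(const std::vector<Triangle>& tris,
                                           int width, int height, int tileSize) {
    assert(width > 0 && height > 0 && tileSize > 0);
    const int tilesX = (width + tileSize - 1) / tileSize;   // round up
    const int tilesY = (height + tileSize - 1) / tileSize;
    std::vector<std::vector<int>> bins(tilesX * tilesY);

    for (int i = 0; i < static_cast<int>(tris.size()); ++i) {
        const Vec2* v = tris[i].v;
        const float minX = std::min({v[0].x, v[1].x, v[2].x});
        const float maxX = std::max({v[0].x, v[1].x, v[2].x});
        const float minY = std::min({v[0].y, v[1].y, v[2].y});
        const float maxY = std::max({v[0].y, v[1].y, v[2].y});

        // Skip triangles whose bounding box is entirely off screen.
        if (maxX < 0.0f || maxY < 0.0f || minX >= width || minY >= height) continue;

        // Clamp the box to the screen, then convert pixels to tile coordinates.
        const int tx0 = static_cast<int>(std::max(minX, 0.0f)) / tileSize;
        const int ty0 = static_cast<int>(std::max(minY, 0.0f)) / tileSize;
        const int tx1 = static_cast<int>(std::min(maxX, width - 1.0f)) / tileSize;
        const int ty1 = static_cast<int>(std::min(maxY, height - 1.0f)) / tileSize;

        // A triangle is listed in every tile its bounding box touches.
        for (int ty = ty0; ty <= ty1; ++ty)
            for (int tx = tx0; tx <= tx1; ++tx)
                bins[ty * tilesX + tx].push_back(i);
    }
    return bins;
}
```

Because binning uses bounding boxes, a triangle may be listed in a tile it does not actually cover; the per-tile rasterization pass is what rejects those pixels.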

![](readmeImages/tileOccupancy.png)

![](readmeImages/TileBased_vs_ScanLine.png)

Performance-wise, there is an almost 4X increase in the framerate when the window-space triangles are distributed over most of the tiles. This is mostly because tile based rasterization is more stable in terms of performance, whereas the framerate of regular scanline rasterization drops off sharply.
If, however, all the triangles fall inside a few tiles, the technique offers little benefit. Fortunately, in real-world applications triangles tend to be fairly evenly distributed, and binning them into tiles greatly increases the framerate. The cost of tile based rasterization can be approximated as `numberOfPixels x (numberOfPrimitives / numTiles)`, assuming a uniform distribution of triangles in window space.

### ScanLine Rasterization

Scanline rasterization is the most common rasterization technique. It literally scans pixels row by row to create a picture. It can be optimized by computing a bounding box for each primitive and only scanning inside that box. Another basic optimization is to use line intersection tests to find the start and end points of every row being evaluated (a triangle can't occupy the entirety of its bounding box; it usually fills close to half of it). In my scanline implementation, a kernel parallelized over the number of primitives is launched, and each thread then scans the bounding box of its primitive.
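Scanning a single primitive's bounding box can be sketched like this. It is a CPU-side C++ illustration of what one thread would do for its triangle. The edge-function coverage test, sampling at pixel centers, and the names `edgeFunction` and `rasterizeTriangle` are assumptions of the sketch rather than the project's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Twice the signed area of triangle (a, b, p); the sign tells which side of
// edge a->b the point p lies on.
float edgeFunction(Vec2 a, Vec2 b, Vec2 p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

// Marks every pixel whose center lies inside the triangle and returns how
// many pixels were covered. `mask` is a row-major width*height buffer.
int rasterizeTriangle(Vec2 a, Vec2 b, Vec2 c, int width, int height,
                      std::vector<unsigned char>& mask) {
    assert(mask.size() == static_cast<std::size_t>(width) * height);
    if (edgeFunction(a, b, c) == 0.0f) return 0;  // degenerate triangle

    // Bounding box of the triangle, clamped to the screen.
    const int x0 = std::max(0, static_cast<int>(std::floor(std::min({a.x, b.x, c.x}))));
    const int y0 = std::max(0, static_cast<int>(std::floor(std::min({a.y, b.y, c.y}))));
    const int x1 = std::min(width - 1, static_cast<int>(std::ceil(std::max({a.x, b.x, c.x}))));
    const int y1 = std::min(height - 1, static_cast<int>(std::ceil(std::max({a.y, b.y, c.y}))));

    int covered = 0;
    for (int y = y0; y <= y1; ++y) {        // row by row
        for (int x = x0; x <= x1; ++x) {
            const Vec2 p = {x + 0.5f, y + 0.5f};  // pixel center
            const float w0 = edgeFunction(b, c, p);
            const float w1 = edgeFunction(c, a, p);
            const float w2 = edgeFunction(a, b, p);
            // Inside if p is on the same side of all three edges
            // (accepts either winding order).
            const bool inside = (w0 >= 0 && w1 >= 0 && w2 >= 0) ||
                                (w0 <= 0 && w1 <= 0 && w2 <= 0);
            if (inside) { mask[y * width + x] = 1; ++covered; }
        }
    }
    return covered;
}
```

For a right triangle that fills half of a 4x4 screen, the loop visits all 16 pixels of the bounding box but covers only 10 of them, which is the waste the row start/end optimization above is meant to reduce.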

![](readmeImages/BasicPipelineFeatureComparison.png)

Vertex shading and primitive assembly take a lot of time. This might be because of the heavy indirection and memory reassignment happening in global memory: a huge chunk of global memory is copied into another huge chunk of global memory. This is not a friendly data flow, and it could possibly be improved if the memory chunks were closer together and accessed more coherently.

Because vertex shading and primitive assembly are common to all the features, we can ignore them and just look at the rasterization and shading stages.

![](readmeImages/scanlineStages_rasterizationAndShading.png)

Shading is usually incredibly expensive, but the most complex shader in use here is a Lambertian shader, which is simply a dot product, 2 normalizations, a multiplication, and an addition, so it is not complex at all. This is why the shading stage is only about as expensive as the rasterization stage.
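To make that operation count concrete, here is a minimal C++ sketch of a Lambertian term. The `Vec3` type, the function names, and in particular the constant `ambient` term used for the addition are assumptions for illustration; the project's shader may combine the terms differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(dot(v, v));
    assert(len > 0.0f);  // a zero-length vector has no direction
    return {v.x / len, v.y / len, v.z / len};
}

// Lambertian diffuse shading: two normalizations, one dot product, one
// addition (the assumed ambient term), and one multiplication by the albedo.
Vec3 shadeLambert(Vec3 albedo, Vec3 normal, Vec3 toLight, float ambient) {
    const float nDotL = std::max(0.0f, dot(normalize(normal), normalize(toLight)));
    const float intensity = nDotL + ambient;
    return {albedo.x * intensity, albedo.y * intensity, albedo.z * intensity};
}
```

Clamping the dot product to zero keeps surfaces that face away from the light from receiving negative light.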

Performance of scanline rasterization is pretty good, although it usually cannot compete with tile based rendering. My implementation of scanline rasterization was used as the baseline to compare every feature against; it is, however, about half as fast as tile based rasterization.
The cost of scanline rasterization can be approximated as `numberOfPixels x numberOfPrimitives`, assuming a uniform distribution of triangles in window space.

### Texture Mapping

Texture mapping is a technique that maps a 2D image onto a 3D object; it is also used to define high-frequency detail on 3D models. As would be expected, texture mapping only hurts performance. However, textures are still worth all the frames they cost because the alternative is too computationally intensive and is a nightmare for artists. The overall impact of reading textures can be reduced by streamlining the data and making accesses cache coherent: essentially, anything that reduces global memory reads.

### Perspective Correct Interpolation of values

![](readmeImages/notPerspectiveCorrect.png) | ![](readmeImages/perspectiveCorrect.png) |
|---|---|
| not perspective correct | perspective correct |

Transforming positions from view space to NDC space involves perspective division. This can introduce artifacts in textures if it isn't taken into account when interpolating attributes such as colors and texture coordinates for fragments. Perspective correct interpolation adds little, if anything, to the computational cost of barycentric interpolation.
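The correction itself is small, as the following C++ sketch suggests. It assumes the screen-space barycentric weights and the view-space depth of each vertex are already available; the function name and parameter layout are hypothetical.

```cpp
#include <cassert>

// Perspective-correct interpolation of one scalar attribute.
//   bary: screen-space barycentric weights of the fragment
//   z:    view-space depth of each vertex (must be non-zero)
//   attr: attribute value at each vertex (e.g. one texture coordinate)
// Each weight is divided by its vertex depth, and the result is renormalized.
float perspectiveCorrect(const float bary[3], const float z[3], const float attr[3]) {
    float numerator = 0.0f;
    float denominator = 0.0f;
    for (int i = 0; i < 3; ++i) {
        assert(z[i] != 0.0f);
        const float w = bary[i] / z[i];
        numerator += w * attr[i];
        denominator += w;
    }
    return numerator / denominator;
}
```

For example, halfway along a screen-space edge whose endpoints sit at depths 1 and 3 with attribute values 0 and 1, plain barycentric interpolation gives 0.5, while the corrected value is 0.25, because the nearer vertex covers more of the screen.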

### Bilinear Filtering

![](readmeImages/inputbilinear.png) | ![](readmeImages/bilinear.png) |
|---|---|
| random colors | bilinear filtering applied |

Bilinear filtering is a texture filtering method used to smooth textures when they are displayed larger or smaller than they actually are, which is almost all the time. It bilinearly interpolates between the 4 texels surrounding the lookup point. It makes images look much better and less pixelated. However, it does introduce slight artifacts that can be removed with better but more expensive filtering methods such as trilinear filtering.
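A bilinear lookup can be sketched in C++ as below. The single-channel texture, the half-texel center convention, and clamp-to-edge addressing are assumptions of this example, and `sampleBilinear` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Samples a single-channel, row-major texture at normalized coordinates
// (u, v) in [0, 1]. Texel centers are assumed to sit at half-integer
// positions, and lookups outside the texture are clamped to the edge.
float sampleBilinear(const std::vector<float>& tex, int width, int height,
                     float u, float v) {
    assert(width > 0 && height > 0);
    assert(tex.size() == static_cast<std::size_t>(width) * height);

    // Position in texel space, shifted so integer values land on texel centers.
    const float x = u * width - 0.5f;
    const float y = v * height - 0.5f;
    const int x0 = static_cast<int>(std::floor(x));
    const int y0 = static_cast<int>(std::floor(y));
    const float fx = x - x0;  // horizontal blend factor in [0, 1)
    const float fy = y - y0;  // vertical blend factor in [0, 1)

    auto texel = [&](int tx, int ty) {
        tx = std::min(std::max(tx, 0), width - 1);
        ty = std::min(std::max(ty, 0), height - 1);
        return tex[ty * width + tx];
    };

    // Blend horizontally on the two rows, then vertically between them.
    const float top = texel(x0, y0) * (1.0f - fx) + texel(x0 + 1, y0) * fx;
    const float bottom = texel(x0, y0 + 1) * (1.0f - fx) + texel(x0 + 1, y0 + 1) * fx;
    return top * (1.0f - fy) + bottom * fy;
}
```

Sampling the exact middle of a 2x2 texture weights all four texels equally, while sampling at a texel center returns that texel unchanged.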

### Depth Test

A depth test is used to ensure that only the fragments that can be seen by the camera are drawn (assuming there isn't transparency). In practice, this means that if a thousand triangles are lined up one behind another in front of the camera, only the fragment from the closest triangle is written to a given pixel. In a GPU rasterizer, the depth test has to be performed atomically or with mutexes. This is because in a parallelized kernel, multiple threads can try to write to the same location in the fragment buffer at the same time. This is a "race condition": there is no guarantee as to which thread will finish writing last. To avoid this, I used a mutex that acts as a lock; a mutex array contains one lock for every index of the fragment buffer. Atomics, by contrast, are operations that guarantee a single read-modify-write is not interleaved with other threads. An atomic depth update alone, however, does not protect the rest of the fragment write, so relying on atomics alone can still produce rare race conditions that show up as a few blinking pixels.
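The per-pixel lock can be modelled on the CPU with `std::atomic`, as in the hedged sketch below. On the GPU the same pattern is typically written with CUDA's `atomicCAS` and `atomicExch`; the `DepthBuffer` struct, its field names, and the integer `fragmentId` payload here are hypothetical stand-ins, not the project's data structures.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical framebuffer with one spin lock per pixel.
struct DepthBuffer {
    std::vector<std::atomic<int>> locks;  // 0 = unlocked, 1 = locked
    std::vector<float> depth;             // closest depth seen so far
    std::vector<int> fragmentId;          // stands in for the fragment payload

    explicit DepthBuffer(std::size_t pixelCount)
        : locks(pixelCount),
          depth(pixelCount, std::numeric_limits<float>::infinity()),
          fragmentId(pixelCount, -1) {
        for (auto& lock : locks) lock.store(0);
    }

    // Keeps the fragment only if it is closer than what is already stored.
    void writeFragment(std::size_t pixel, float z, int id) {
        assert(pixel < depth.size());
        int expected = 0;
        // Spin until this thread swaps the lock from 0 to 1.
        while (!locks[pixel].compare_exchange_weak(expected, 1,
                                                   std::memory_order_acquire)) {
            expected = 0;  // a failed exchange overwrites `expected`
        }
        // Critical section: the depth test and both writes happen together.
        if (z < depth[pixel]) {
            depth[pixel] = z;
            fragmentId[pixel] = id;
        }
        locks[pixel].store(0, std::memory_order_release);
    }
};
```

Holding the lock across both the comparison and the writes is what keeps the stored depth and the stored fragment consistent with each other, at the cost of serializing threads that hit the same pixel.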

![](readmeImages/depthTest.png)

Performance-wise, depth testing leads to a big hit in the framerate because we have serialized what was once a parallel section of our code.

### Backface Culling

In a 3D scene, there are triangles that will not be seen by the camera because they aren't facing it. It is a good idea to simply ignore these triangles, as they generally don't contribute to the final image.
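One common way to decide facing is the sign of the triangle's projected area, sketched below in C++. The convention that counter-clockwise triangles are front facing (with y pointing up), and the names `signedArea2`, `isBackFacing`, and `cullBackFaces`, are assumptions for this example; the project may use a different test, such as comparing the normal against the view direction.

```cpp
#include <vector>

struct Vec2 { float x, y; };
struct Triangle { Vec2 v[3]; };  // vertices after projection (x, y only)

// Twice the signed area of the projected triangle: positive when the
// vertices wind counter-clockwise with y pointing up.
float signedArea2(const Triangle& t) {
    return (t.v[1].x - t.v[0].x) * (t.v[2].y - t.v[0].y) -
           (t.v[1].y - t.v[0].y) * (t.v[2].x - t.v[0].x);
}

// Assumed convention: counter-clockwise is front facing. Zero-area
// (edge-on) triangles are treated as back facing and dropped as well.
bool isBackFacing(const Triangle& t) {
    return signedArea2(t) <= 0.0f;
}

// Returns only the triangles that face the camera.
std::vector<Triangle> cullBackFaces(const std::vector<Triangle>& tris) {
    std::vector<Triangle> kept;
    for (const Triangle& t : tris) {
        if (!isBackFacing(t)) kept.push_back(t);
    }
    return kept;
}
```

Culling before rasterization means the later stages never touch the discarded triangles.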

![](readmeImages/PipelineComparison__Breakdown.png)

### Different Primitive Types

Just like OpenGL, my implementation of a rasterizer allows for drawing meshes as lines and also as point clouds.

![](readmeImages/RasterizationByPrimitives.png)

#### Wireframe (Lines)

![](readmeImages/WireframeRasterization.png)

Equivalent to GL_LINES.
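
As an illustration of how a single edge can be turned into pixels, here is a small C++ sketch of a DDA-style line walk. DDA is chosen only for the example; the README does not say which line algorithm the project uses, and `rasterizeLine` is a hypothetical name.

```cpp
#include <algorithm>
#include <cmath>
#include <utility>
#include <vector>

// Returns the pixels visited when stepping from (x0, y0) to (x1, y1),
// taking one step per pixel along the longer axis.
std::vector<std::pair<int, int>> rasterizeLine(float x0, float y0,
                                               float x1, float y1) {
    const float dx = x1 - x0;
    const float dy = y1 - y0;
    const int steps = static_cast<int>(
        std::ceil(std::max(std::fabs(dx), std::fabs(dy))));

    std::vector<std::pair<int, int>> pixels;
    if (steps == 0) {  // both endpoints fall on the same pixel
        pixels.emplace_back(static_cast<int>(std::lround(x0)),
                            static_cast<int>(std::lround(y0)));
        return pixels;
    }
    for (int i = 0; i <= steps; ++i) {
        const float t = static_cast<float>(i) / steps;  // 0 at start, 1 at end
        pixels.emplace_back(static_cast<int>(std::lround(x0 + t * dx)),
                            static_cast<int>(std::lround(y0 + t * dy)));
    }
    return pixels;
}
```

Drawing the three edges of each triangle this way yields the wireframe view.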

#### Point Cloud (Points)

![](readmeImages/PointRasterization.png)

Equivalent to GL_POINTS.

### Shading Models:

Shading models are what sell the illusion that is rasterization. Great shading models can turn a scene into a believable landscape. I have implemented the standard Lambertian shading model along with debug shading models.

#### Lambertian

![](readmeImages/Engine.png)

#### Depth Shading

![](readmeImages/depthTest.png)

#### Normals

![](readmeImages/NormalsCow.png)

### Credits

2 changes: 1 addition & 1 deletion gltfs/checkerboard/checkerboard.gltf
@@ -99,7 +99,7 @@
},
"images": {
"Checkered": {
"uri": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAhwAAAIcCAIAAAAynOArAAAACXBIWXMAAAsTAAALEwEAmpwYAAALjUlEQVR42u3ZMU5jZxuGYY5tJtiWEGnRrGNSjKZmD6np2Uf2geQGiWWkQyILCQVgG5njvxhpVnAX/3GuawevnuLWp2/4559/lsvl2cTNZrO///77zz///P3336d+yx9//PHXX3+tViujGMUoRpmWi4uLxWq1OoFV5vP5zzT++++/U7/l7e1ttVqt12ujGMUoRpmW1Wo1m/oNAPz/EBUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAzGI2m83n86mfcQIn/LLb7YZhMIpRjGKUyXl/fx82m81yuTyBY8ZxfH19PR6PUz/kcDhcXl6en58bxShGMcq0fHx8DCewx0/39/c3NzfjOE79kO12e3d39/j4aBSjGMUo07Lf7xdXV1cvLy8nsMrxeBzH8fPz8wRuGYYTib1RjGKU/9Qo4zj6qAcgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAmcW3b9/e3t6mfsZutzscDrvd7ng8Tv2W7Xb79evX79+/G8UoRjHKtOz3++H5+Xm5XE79kmEYnp6eNpvNbDb5t9f19fXt7e3FxYVRjGIUo0zLly9fFuv1erVaTf2S+Xy+WCweHh6mfsjZ2dmPHz+Wy+V6vTaKUYxilGlZrVb+VADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJAZjGbzebz+dTPOIETftntdsMwGMUoRjHK5Ly/vw+bzWa5XJ7AMeM4vr6+Ho/HqR9yOBwuLy/Pz8+NYhSjGGVaPj4+hhPY46f7+/ubm5txHKd+yHa7vbu7e3x8NIpRjGKUadnv94urq6uXl5cTWOV4PI7j+Pn5eQK3DMOJxN4oRjHKf2qUcRx91AOQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQER
UAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAzOLbt29vb29TP2O32x0Oh91udzwep37Ldrv9+vXr9+/fjWIUoxhlWvb7/fD8/LxcLqd+yTAMT09Pm81mNpv82+v6+vr29vbi4sIoRjGKUably5cvi/V6vVqtpn7JfD5fLBYPDw9TP+Ts7OzHjx/L5XK9XhvFKEYxyrSsVit/KgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgs5jNZvP5fOpnnMAJv+x2u2EYjGIUoxhlct7f34fNZrNcLk/gmHEcX19fj8fj1A85HA6Xl5fn5+dGMYpRjDItHx8fwwns8dP9/f3Nzc04jlM/ZLvd3t3dPT4+GsUoRjHKtOz3+8XV1dXLy8sJrHI8Hsdx/Pz8PIFbhuFEYm8UoxjlPzXKOI4+6gHIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJAZvHt27e3t7epn7Hb7Q6Hw263Ox6PU79lu91+/fr1+/fvRjGKUYwyLfv9fnh+fl4ul1O/ZBiGp6enzWYzm03+7XV9fX17e3txcWEUoxjFKNPy5cuXxXq9Xq1WU79kPp8vFouHh4epH3J2dvbjx4/lcrler41iFKMYZVpWq5U/FQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQWcxms/l8PvUzTuCEX3a73TAMRjGKUYwyOe/v78Nms1kulydwzDiOr6+vx+Nx6occDofLy8vz83OjGMUoRpmWj4+P4QT2+On+/v7m5mYcx6kfst1u7+7uHh8fjWIUoxhlWvb7/eLq6url5eUEVjkej+M4fn5+nsAtw3AisTeKUYzynxplHEcf9QBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgIyoAZEQFgIyoAJARFQAyogJARlQAyIgKABlRASAjKgBkRAWAjKgAkBEVADKiAkBGVADIiAoAGVEBICMqAGREBYCMqACQERUAMqICQEZUAMiICgAZUQEgs7
i4uFgul1M/Y7vdfnx8fHx8jOM49Vv2+/1vv/1mFKMYxShTHOV/51nv6/az2UUAAAAASUVORK5CYII="
"uri": "checkerboard.jpg"
}
},
"materials": {
Binary file added gltfs/checkerboard/checkerboard.jpg
Binary file added readmeImages/BasicPipelineFeatureComparison.png
Binary file added readmeImages/CUDARasterizerVimeoLink.png
Binary file added readmeImages/Engine.png
Binary file added readmeImages/NormalsCow.png
Binary file added readmeImages/PipelineComparison.png
Binary file added readmeImages/PipelineComparison__Breakdown.png
Binary file added readmeImages/PointRasterization.png
Binary file added readmeImages/RasterizationByPrimitives.png
Binary file added readmeImages/TileBased_vs_ScanLine.png
Binary file added readmeImages/WireframeRasterization.png
Binary file added readmeImages/bilinear.png
Binary file added readmeImages/depthTest.png
Binary file added readmeImages/grid_interpolation2d.png
Binary file added readmeImages/inputbilinear.png
Binary file added readmeImages/notPerspectiveCorrect.png
Binary file added readmeImages/perspectiveCorrect.png
Binary file added readmeImages/pipeline.png
Binary file added readmeImages/tileOccupancy.png