A Tiny Deep Learning Framework in C That Runs on Resource-Constrained Hardware Platforms
The purpose of this code is to port deep learning to resource-constrained MCUs and DSPs such as the STM32.
The code is NOT efficient: NO optimizations are applied, and parts of it are inefficient by design, since this is just a coursework assignment.
Any optimization ideas are welcome.
The MNIST dataset is provided as a test case. mnist_test_data_tiny.h contains 1% of the test set, and mnist_test_data.h contains the full test set.
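As a rough sketch of what such a data header might look like, the layout below is one plausible arrangement; the identifier names, types, and counts are illustrative assumptions, not the files' actual contents:

/* Hypothetical layout of mnist_test_data_tiny.h; every identifier here is
 * an assumption for illustration, not the real file's contents. */
#define TEST_SAMPLE_COUNT 100            /* 1% of MNIST's 10,000 test images */
#define IMG_PIXELS (28 * 28)             /* 28x28 grayscale images */

static const float test_images[TEST_SAMPLE_COUNT][IMG_PIXELS] = {
    { 0.0f /* ..., 784 pixel values per image */ },
};
static const unsigned char test_labels[TEST_SAMPLE_COUNT] = {
    7 /* ..., one ground-truth digit per image */
};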
model_params.h contains the exported parameters of my own trained model. The model is defined as follows:
import torch

model = torch.nn.Sequential(
    torch.nn.Conv2d(1, 6, 5),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2, 2),
    torch.nn.Conv2d(6, 16, 3),
    torch.nn.ReLU(),
    torch.nn.MaxPool2d(2, 2),
    torch.nn.Flatten(),
    torch.nn.Linear(16 * 5 * 5, 120),
    torch.nn.ReLU(),
    torch.nn.Linear(120, 84),
    torch.nn.ReLU(),
    torch.nn.Linear(84, 10)
)

mnist_test_data.h and model_params.h can be downloaded from Google Drive.
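For the 28×28 MNIST input, the feature maps shrink 28 → 24 (5×5 conv) → 12 (2×2 max pool) → 10 (3×3 conv) → 5 (2×2 max pool), which is where the 16 * 5 * 5 flatten size comes from. As an illustration of the kind of naive, unoptimized kernel such a framework runs in C, here is a minimal Conv2d forward sketch (valid padding, stride 1, CHW layout); the function name and signature are assumptions, not this repository's actual API:

/* Minimal, deliberately naive Conv2d forward pass. The name, signature, and
 * CHW memory layout are illustrative assumptions. */
static void conv2d_naive(const float *in, int in_c, int in_h, int in_w,
                         const float *weight, /* [out_c][in_c][k][k] */
                         const float *bias,   /* [out_c], may be NULL */
                         int out_c, int k,
                         float *out)          /* [out_c][in_h-k+1][in_w-k+1] */
{
    const int out_h = in_h - k + 1, out_w = in_w - k + 1;
    for (int oc = 0; oc < out_c; oc++)
        for (int oy = 0; oy < out_h; oy++)
            for (int ox = 0; ox < out_w; ox++) {
                float acc = bias ? bias[oc] : 0.0f;
                for (int ic = 0; ic < in_c; ic++)
                    for (int ky = 0; ky < k; ky++)
                        for (int kx = 0; kx < k; kx++)
                            acc += in[(ic * in_h + oy + ky) * in_w + (ox + kx)]
                                 * weight[((oc * in_c + ic) * k + ky) * k + kx];
                out[(oc * out_h + oy) * out_w + ox] = acc;
            }
}

Replacing these six nested loops with an im2col + GEMM formulation, fixed-point (e.g. int8) arithmetic, or vendor kernels such as CMSIS-NN on Cortex-M parts would be natural optimization directions.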