An MNIST CNN developed to recognize handwritten digits. It accepts a single grayscale input channel, so each input is 1x28x28.
The CNN is trained in batches of 32 from the MNIST database. Each of the 32 filters is a 2D 3x3 kernel that extracts features from the 1x28x28 input.
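
The shapes above can be checked with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and it assumes a stride of 1 and no padding, neither of which is stated for this project.

```python
def conv2d_output_shape(batch, in_hw, out_channels, kernel, stride=1, padding=0):
    """Return (N, C_out, H_out, W_out) for a square 2D convolution."""
    h, w = in_hw
    # Standard convolution size formula, applied to height and width.
    h_out = (h + 2 * padding - kernel) // stride + 1
    w_out = (w + 2 * padding - kernel) // stride + 1
    return (batch, out_channels, h_out, w_out)


def conv2d_param_count(in_channels, out_channels, kernel, bias=True):
    """Count the learnable weights (and optional biases) of the layer."""
    weights = out_channels * in_channels * kernel * kernel
    return weights + (out_channels if bias else 0)


# A batch of 32 grayscale 28x28 images through 32 filters of size 3x3.
# stride=1 and padding=0 are assumptions, not values taken from the project.
shape = conv2d_output_shape(batch=32, in_hw=(28, 28), out_channels=32, kernel=3)
params = conv2d_param_count(in_channels=1, out_channels=32, kernel=3)
print(shape)   # (32, 32, 26, 26)
print(params)  # 320
```

Under those assumptions, each 28x28 image shrinks to 26x26 per filter, and the layer holds 288 weights plus 32 biases.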
A Visual of an MNIST CNN
To train the CNN, run:

    poetry run python src/train_model.py

This will contact the MNIST database and run 10 epochs to train the model. When training finishes, the model state is saved as the digital_model.pt file.
Once training has completed, run the model with the generated binary using:

    poetry run python src/main.py

For the purpose of MNIST, the Adam optimizer with lr=1e-3 performs best.
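Starting with Stochastic Gradient Descent (SGD): at each step, the parameters move against the gradient of the loss on the current mini-batch, scaled by the learning rate.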
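
In its standard textbook form, one SGD step can be written as below. The notation is assumed here rather than taken from the project: $\theta_t$ are the model parameters at step $t$, $\eta$ is the learning rate, and $L$ is the loss evaluated on the current mini-batch $B_t$.

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t; B_t)
```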
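The Adam optimizer redefines SGD's parameter update by keeping running averages of the gradient and of its square, then scaling each parameter's step by them.
For this project, Adam is used with lr=1e-3.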
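
To make the Adam update concrete, here is a minimal pure-Python sketch of a single step. It is not the project's training code: `adam_step` is a hypothetical helper, the toy objective is made up for illustration, and `beta1`, `beta2`, and `eps` are the commonly used defaults, assumed here because only lr=1e-3 is stated.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of floats; returns (theta, m, v)."""
    new_theta, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(theta, grad, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g        # running mean of gradients
        v_i = beta2 * v_i + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m_i / (1 - beta1 ** t)             # bias correction, t starts at 1
        v_hat = v_i / (1 - beta2 ** t)
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_theta, new_m, new_v


# Toy objective (an assumption for illustration): f(x) = (x - 3)^2.
theta, m, v = [0.0], [0.0], [0.0]
first_theta = None
for t in range(1, 101):
    grad = [2 * (theta[0] - 3.0)]
    theta, m, v = adam_step(theta, grad, m, v, t)
    if t == 1:
        first_theta = list(theta)

print(first_theta[0], theta[0])
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly `lr` regardless of the gradient's magnitude. This is one reason a learning rate such as 1e-3 tends to behave predictably with Adam.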
Nesterov's Momentum Acceleration
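
As a rough illustration of Nesterov's idea, the sketch below evaluates the gradient at a look-ahead point before updating the velocity. It shows the classic momentum formulation on a made-up one-dimensional objective. The function names and the `lr` and `momentum` values are assumptions, and it does not claim to show how this project combines Nesterov momentum with Adam.

```python
def nesterov_step(theta, velocity, grad_fn, lr=0.1, momentum=0.9):
    """One Nesterov momentum step for a scalar parameter."""
    # Evaluate the gradient where the current velocity is about to carry us.
    lookahead = theta + momentum * velocity
    velocity = momentum * velocity - lr * grad_fn(lookahead)
    return theta + velocity, velocity


def toy_grad(x):
    """Gradient of the toy objective f(x) = (x - 3)^2."""
    return 2 * (x - 3.0)


theta, velocity = 0.0, 0.0
first_theta, _ = nesterov_step(theta, velocity, toy_grad)
for _ in range(200):
    theta, velocity = nesterov_step(theta, velocity, toy_grad)

print(first_theta, theta)
```

Compared with plain momentum, the look-ahead gradient lets the step correct itself before overshooting, which is the acceleration the name refers to.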
After epoch 10, the average loss (cost) was approximately 0.01253.
I am the sole contributor of this project.
This project is licensed under the MIT License - see the LICENSE.md file.


