Simple educational multilayer perceptron implementation in pure Python.
This repository contains a small neural-network implementation and helper scripts to split a CSV dataset, train a model, and run inference on a test set. The pipeline is intentionally minimal so you can run everything with plain Python and a few common scientific packages.
- data/ - input dataset and split files: data.csv, training.csv, validation.csv, test.csv.
- model.pkl - example saved model (created after training).
- metrics.png - example training metrics plot (created after training).
- src/ - source code
  - split_data.py - create training.csv, validation.csv, test.csv from data/data.csv
  - train_neuralnetwork.py - train a NeuralNetwork and save it to model.pkl
  - multilayer_perceptron.py - run the trained model on data/test.csv and print metrics
  - neuralnetwork.py, neuron.py - core network implementation
- Inputs: data/data.csv (rows: id, label, 30 numeric features)
- Outputs: data/training.csv, data/validation.csv, data/test.csv, model.pkl, metrics.png
- Success: training finishes and model.pkl and metrics.png are created; inference prints test differences and loss
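
The success criteria above can be checked mechanically. The following is a minimal sketch, not part of the repository: `EXPECTED_OUTPUTS` and `missing_outputs` are hypothetical names, and the paths are simply the output files listed above, resolved relative to a root directory.

```python
from pathlib import Path

# Output files named in this README, relative to the repository root.
EXPECTED_OUTPUTS = [
    "data/training.csv",
    "data/validation.csv",
    "data/test.csv",
    "model.pkl",
    "metrics.png",
]


def missing_outputs(root="."):
    """Return the expected output paths that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED_OUTPUTS if not (root / p).is_file()]
```

Calling `missing_outputs()` from the repository root after the split and training steps should return an empty list if everything was written.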
- Python 3.8+ (3.10/3.12 tested in this project)
- The following Python packages:
  - pandas
  - numpy
  - matplotlib
Install dependencies (recommended inside a virtualenv):
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install pandas numpy matplotlib

The code expects data/data.csv with no header. Each row must follow this layout:
- id (ignored by the scripts)
- label: M (malignant) or B (benign)
- 30 numeric feature columns used as inputs to the network
An example row (first lines of data/data.csv are included in the repo):
842302,M,17.99,10.38,... (30 feature columns)
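
To make the layout concrete, here is a small sketch of reading rows in this format with pandas. It is an illustration under assumptions, not the repository's loading code: the two rows are synthetic (made-up feature values), and the encoding of M as 1 and B as 0 is an assumed convention.

```python
import io

import pandas as pd

# Two synthetic rows in the documented layout: id, label, then 30 numeric features.
# The values are invented for illustration and do not come from data/data.csv.
rows = [
    "1,M," + ",".join(str(0.5 + i) for i in range(30)),
    "2,B," + ",".join(str(0.1 * i) for i in range(30)),
]
csv_text = "\n".join(rows)

# header=None because the file has no header row; columns are numbered 0..31.
df = pd.read_csv(io.StringIO(csv_text), header=None)

# Column 0 is the id (ignored), column 1 the label, columns 2..31 the features.
labels = (df[1] == "M").astype(int).to_numpy()   # assumed encoding: M -> 1, B -> 0
features = df.iloc[:, 2:].to_numpy(dtype=float)
```

To try this on the real file, replace `io.StringIO(csv_text)` with the path data/data.csv.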
Run these three steps in order from the repository root.
- Split the data
This creates data/training.csv, data/validation.csv, and data/test.csv from data/data.csv.
python3 src/split_data.py

- Train the model
Trains a NeuralNetwork using data/training.csv and data/validation.csv,
saves the trained model to model.pkl, and writes a metrics.png plot.
python3 src/train_neuralnetwork.py

- Run inference / evaluate
This loads model.pkl and evaluates it on data/test.csv. The script prints
the number of differences and a (binary cross-entropy) loss value.
python3 src/multilayer_perceptron.py

Notes:
- If any of the expected files are missing, the scripts will print File not found and exit.
- train_neuralnetwork.py constructs a NeuralNetwork(30, 40, 4) by default (30 inputs, 40 neurons per hidden layer, 4 hidden layers) and calls train(..., learning_rate=0.001, epochs=100); edit the file to change hyperparameters.
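
For readers unfamiliar with the two numbers printed by the evaluation step, the sketch below shows one common way to compute a binary cross-entropy loss and a count of mismatched predictions with NumPy. It is an assumption about what those metrics mean, not the code in src/multilayer_perceptron.py: the function names, the 0.5 decision threshold, and the clipping constant `eps` are all hypothetical choices.

```python
import numpy as np


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so the logarithms stay finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    losses = y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(losses))


def count_differences(y_true, y_prob, threshold=0.5):
    """Number of samples whose thresholded prediction differs from the label."""
    y_pred = (np.asarray(y_prob, dtype=float) >= threshold).astype(int)
    return int(np.sum(y_pred != np.asarray(y_true, dtype=int)))
```

For example, `count_differences([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.6])` counts the last two samples as mismatches, since 0.4 falls below the threshold for a positive label and 0.6 falls above it for a negative one.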
- model.pkl - the pickled NeuralNetwork after training
- metrics.png - training/validation loss and accuracy vs epochs
- If plots don't appear when training, ensure a display is available or run headless by removing plt.show() in src/neuralnetwork.py (the file still writes metrics.png).
- If you see ModuleNotFoundError for any package, install it via pip as shown in Requirements.
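
As an alternative for headless machines, matplotlib can be switched to the non-interactive Agg backend, which writes image files without needing a display. The sketch below is a standalone illustration, not a patch to src/neuralnetwork.py: the loss values are invented and the output file name `metrics_demo.png` is hypothetical.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # select a non-interactive backend before importing pyplot
import matplotlib.pyplot as plt

# Invented loss values, purely for illustration.
epochs = [1, 2, 3, 4]
train_loss = [0.70, 0.55, 0.45, 0.40]
val_loss = [0.72, 0.60, 0.52, 0.50]

fig, ax = plt.subplots()
ax.plot(epochs, train_loss, label="training loss")
ax.plot(epochs, val_loss, label="validation loss")
ax.set_xlabel("epoch")
ax.set_ylabel("loss")
ax.legend()

# Write the figure to a temporary directory instead of opening a window.
out_path = os.path.join(tempfile.mkdtemp(), "metrics_demo.png")
fig.savefig(out_path)
plt.close(fig)
```

Setting the environment variable MPLBACKEND=Agg before running a script selects the same backend without editing any code.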
See the LICENSE file in the repository for license terms.