This work presents a computational implementation of a Single Layer Perceptron model trained via gradient descent. A set of weights and a bias term are iteratively optimized to minimize the prediction error on a continuous dataset. The system evaluates the final model by comparing the predicted outputs against the ground truth values, demonstrating fundamental machine learning convergence.
This work implements an iterative learning algorithm on a toy dataset with the following objectives:
- Simulate the training loop of a basic neural network architecture
- Constrain the model to a single linear layer without non-linear activation functions
- Compute the Mean Squared Error (MSE) cost function across iterations
- Update parameters using a full-batch gradient descent algorithm
- Evaluate the model's accuracy through a comparative reporting utility
The system is designed as an educational and exploratory tool for understanding forward propagation, backward propagation, and numerical optimization.
The simulation operates on a continuous dataset defined as:
(x1, x2) ∈ ℝ², y ∈ ℝ
Where:
X ∈ ℝ^(2 × m) : feature matrix (input features)
Y ∈ ℝ^(1 × m) : label vector (target outputs)
m ∈ ℕ : number of training examples
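A dataset of this shape can be sketched in NumPy; the specific values below are hypothetical and chosen only to match the dimensions defined above (examples stored as columns):

```python
import numpy as np

# Hypothetical toy dataset: m = 4 examples, 2 features each.
# Columns are examples, rows are features, so X has shape (2, m).
X = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 1.0, 0.0, 3.0]])   # feature matrix, X ∈ ℝ^(2 × m)
Y = np.array([[3.0, 3.0, 3.0, 7.0]])   # label vector, Y ∈ ℝ^(1 × m)
m = X.shape[1]                         # number of training examples
```

Storing examples as columns keeps the forward pass a single matrix product, W @ X, with no transposes.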
The model state at iteration t is defined by its parameters:
S(t) = {W(t), b(t)}
Where:
W(t) ∈ ℝ^(1 × 2) : weight matrix
b(t) ∈ ℝ : bias scalar
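The initial state S(0) can be set up as follows; the small random scale and the fixed seed are illustrative choices, not part of the specification above:

```python
import numpy as np

# Initialize the model state S(0) = {W(0), b(0)}.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(1, 2))  # weight matrix, W ∈ ℝ^(1 × 2)
b = 0.0                                  # bias scalar, b ∈ ℝ
```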
At each iteration, the model performs a deterministic update:
- A forward pass computes the linear prediction:
Y_hat(t) = W(t)X + b(t)
- A backward pass computes the gradients of the cost function:
dW(t) = (1/m) * (Y_hat(t) - Y) * X^T
db(t) = (1/m) * sum(Y_hat(t) - Y)
- Parameters are updated using a learning rate α:
W(t+1) = W(t) - α * dW(t)
b(t+1) = b(t) - α * db(t)
This ensures:
- Iterative minimization of the loss landscape
- Continuous parameter adjustment
- Convergence toward the global minimum (the MSE cost of a linear model is convex, so convergence holds for a suitably small learning rate α)
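The forward pass, backward pass, and parameter update above can be sketched as one function; the dataset and learning rate in the example are illustrative values, not the project's actual configuration:

```python
import numpy as np

def gradient_step(W, b, X, Y, alpha):
    """One deterministic full-batch update following the equations above."""
    m = X.shape[1]
    Y_hat = W @ X + b                      # forward pass: Y_hat = W X + b
    dW = (Y_hat - Y) @ X.T / m             # dW = (1/m) (Y_hat - Y) X^T
    db = np.sum(Y_hat - Y) / m             # db = (1/m) sum(Y_hat - Y)
    return W - alpha * dW, b - alpha * db  # gradient descent update

# One step on a tiny illustrative dataset, starting from zero parameters.
X = np.array([[1.0, 2.0], [0.0, 1.0]])
Y = np.array([[1.0, 3.0]])
W, b = np.zeros((1, 2)), 0.0
W, b = gradient_step(W, b, X, Y, alpha=0.1)
# W is now [[0.35, 0.15]] and b is 0.2
```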
To analyze model performance, a Mean Squared Error (MSE) metric is computed:
J(t) = sum((Y_hat(t) - Y)^2) / (2 * m)
Where:
(Y_hat(t) - Y) is the residual error vector for all examples
m is the number of training examples
This produces a non-negative scalar (zero only when the predictions match the targets exactly) used to monitor convergence across training iterations. The factor of 2 in the denominator simplifies the gradient expressions above.
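The cost J(t) translates directly into code; the residual values in the example are illustrative:

```python
import numpy as np

def mse_cost(Y_hat, Y):
    """J = sum((Y_hat - Y)^2) / (2m), the cost defined above."""
    m = Y.shape[1]
    return np.sum((Y_hat - Y) ** 2) / (2 * m)

# Example: residuals are 1 and -2, so J = (1 + 4) / (2 * 2) = 1.25
Y_hat = np.array([[1.0, 2.0]])
Y = np.array([[0.0, 4.0]])
J = mse_cost(Y_hat, Y)
```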
The system produces:
- A trained linear regression model capable of numerical prediction
- An iterative loss reduction log over time
- A comparative report of actual versus predicted values
Convergence logs reveal:
- Rapid initial descent of the cost function
- Stable asymptotic behavior near the optimal weights
- A high degree of accuracy on data with a linear input-output relationship
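An end-to-end training loop exhibiting this behavior can be sketched as follows; the synthetic dataset (y = 2·x1 + 1·x2), learning rate, and iteration count are illustrative assumptions, not the project's actual settings:

```python
import numpy as np

# Synthetic linear data: y = 2*x1 + 1*x2, so the optimum is W = [2, 1], b = 0.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(2, 50))
Y = 2.0 * X[0:1] + 1.0 * X[1:2]
W, b = np.zeros((1, 2)), 0.0
alpha, m = 0.5, X.shape[1]

losses = []
for t in range(200):
    Y_hat = W @ X + b                        # forward pass
    losses.append(np.sum((Y_hat - Y) ** 2) / (2 * m))  # loss log over time
    dW = (Y_hat - Y) @ X.T / m               # backward pass
    db = np.sum(Y_hat - Y) / m
    W, b = W - alpha * dW, b - alpha * db    # parameter update
```

The recorded `losses` list shows the rapid initial descent followed by asymptotic flattening described above, with W approaching the true coefficients.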
- The model is strictly linear and cannot solve non-linear problems (e.g., XOR)
- Lack of an activation function reduces the network to simple linear regression
- Gradient descent uses the full batch, which is computationally expensive for large datasets
- Hand-coded mathematical derivatives lack the flexibility of automatic differentiation