Single Layer Perceptron (Linear Regression Model)

This work presents a computational implementation of a Single Layer Perceptron model trained via gradient descent. A set of weights and a bias term are iteratively optimized to minimize the prediction error on a continuous dataset. The system evaluates the final model by comparing the predicted outputs against the ground-truth values, demonstrating the fundamental convergence behavior of gradient-based learning.

Introduction

This work implements an iterative learning algorithm on a toy dataset with the following objectives:

  • Simulate the training loop of a basic neural network architecture
  • Constrain the model to a single linear layer without non-linear activation functions
  • Compute the Mean Squared Error (MSE) cost function across iterations
  • Update parameters using batch gradient descent
  • Evaluate the model's accuracy through a comparative reporting utility

The system is designed as an educational and exploratory tool for understanding forward propagation, backward propagation, and numerical optimization.

Problem Representation

The simulation operates on a continuous dataset defined as:

(x1, x2) ∈ ℝ², y ∈ ℝ

Where:

X ∈ ℝ^(2 × m) : feature matrix (input features)

Y ∈ ℝ^(1 × m) : label vector (target outputs)

m ∈ ℕ : number of training examples

The model state at iteration t is defined by its parameters:

S(t) = {W(t), b(t)}

Where:

W(t) ∈ ℝ^(1 × 2) : weight matrix

b(t) ∈ ℝ : bias scalar
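
For concreteness, here is a minimal NumPy sketch of these objects; the variable names and toy values are illustrative, not taken from the repository:

```python
import numpy as np

m = 4                                  # number of training examples
X = np.array([[1.0, 2.0, 3.0, 4.0],    # feature matrix, shape (2, m)
              [0.5, 1.0, 1.5, 2.0]])
Y = np.array([[3.0, 6.0, 9.0, 12.0]])  # label vector, shape (1, m)

W = np.zeros((1, 2))                   # weight matrix W(0), shape (1, 2)
b = 0.0                                # bias scalar b(0)
```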

Training Dynamics

At each iteration, the model performs a deterministic update:

  • A forward pass computes the linear prediction:
    Y_hat(t) = W(t)X + b(t)
  • A backward pass computes the gradients of the cost function:
    dW(t) = (1/m) * (Y_hat(t) - Y) * X^T
    db(t) = (1/m) * sum(Y_hat(t) - Y)
  • Parameters are updated using a learning rate α:
    W(t+1) = W(t) - α * dW(t)
    b(t+1) = b(t) - α * db(t)

This ensures:

  • Iterative minimization of the loss landscape
  • Continuous parameter adjustment
  • Convergence toward the global minimum of the convex MSE objective (given a suitably small learning rate α)
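
A minimal sketch of one full iteration, assuming NumPy arrays with the shapes defined above (the function name and default learning rate are illustrative):

```python
import numpy as np

def gradient_descent_step(W, b, X, Y, alpha=0.01):
    """One deterministic update: forward pass, backward pass, parameter update."""
    m = X.shape[1]
    Y_hat = W @ X + b                      # forward pass: Y_hat(t) = W(t)X + b(t), shape (1, m)
    dW = (Y_hat - Y) @ X.T / m             # gradient w.r.t. W, shape (1, 2)
    db = np.sum(Y_hat - Y) / m             # gradient w.r.t. b, scalar
    return W - alpha * dW, b - alpha * db  # step along the negative gradient
```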

Cost Computation

To analyze model performance, a Mean Squared Error (MSE) metric is computed:

J(t) = sum((Y_hat(t) - Y)^2) / (2 * m)

Where:

(Y_hat(t) - Y) is the residual error vector for all examples
m is the number of training examples

This produces a non-negative scalar value used to monitor convergence across training iterations.
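
A matching sketch of the cost computation, using the same 1/(2m) scaling as the formula above (an illustrative helper, not taken from the repository):

```python
import numpy as np

def mse_cost(Y_hat, Y):
    """J(t) = sum((Y_hat(t) - Y)^2) / (2 * m), a non-negative convergence monitor."""
    m = Y.shape[1]
    return np.sum((Y_hat - Y) ** 2) / (2 * m)
```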

Results

The system produces:

  • A trained linear regression model capable of numerical prediction
  • An iterative loss reduction log over time
  • A comparative report of actual versus predicted values

Convergence logs reveal:
  • Rapid initial descent of the cost function
  • Stable asymptotic behavior near the optimal weights
  • A high degree of accuracy on continuous data with an underlying linear relationship

Limitations

  • The model is strictly linear and cannot solve non-linear problems (e.g., XOR)
  • Lack of an activation function reduces the network to simple linear regression
  • Gradient descent uses the full batch, which is computationally expensive for large datasets
  • Hand-coded mathematical derivatives lack the flexibility of automatic differentiation
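
To illustrate the first limitation concretely, the following hypothetical toy run fits the linear model to the XOR truth table; the cost plateaus near 0.125 rather than reaching zero, because no hyperplane W·X + b can reproduce these targets:

```python
import numpy as np

# XOR truth table: targets that no linear model can fit exactly.
X = np.array([[0.0, 0.0, 1.0, 1.0],
              [0.0, 1.0, 0.0, 1.0]])  # shape (2, 4)
Y = np.array([[0.0, 1.0, 1.0, 0.0]])  # shape (1, 4)

W, b, alpha, m = np.zeros((1, 2)), 0.0, 0.1, X.shape[1]
for _ in range(10_000):                 # full-batch gradient descent
    Y_hat = W @ X + b
    W -= alpha * (Y_hat - Y) @ X.T / m
    b -= alpha * np.sum(Y_hat - Y) / m

print(np.sum((W @ X + b - Y) ** 2) / (2 * m))  # ~0.125: stuck above zero
```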
