mariamashraf731/SVM-Classifier-and-Multivariate-Linear-Regression-from-Scratch

🧠 Machine Learning Algorithms From Scratch

📌 Project Overview

This repository contains Python implementations of fundamental machine learning algorithms, built from the ground up with NumPy. The goal is to demystify the "black box" of ML libraries by implementing the underlying mathematical optimization by hand.

The project covers Support Vector Machines (SVM) using Soft Margin optimization and Linear Regression (Univariate & Multivariate) using Gradient Descent. It also includes benchmarking against scikit-learn on real-world datasets like German Credit Data and Iris.

⚙️ Implemented Algorithms

1. Support Vector Machine (Soft Margin)

  • Mathematical Formulation: Implements the Hinge Loss function with L2 regularization: $$J(\mathbf{w}, b) = \frac{1}{2} ||\mathbf{w}||^2 + C \sum_{i=1}^{n} \max(0, 1 - y_i(\mathbf{w} \cdot \mathbf{x}_i + b))$$
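  • Optimization: Minimized using Batch Gradient Descent (strictly, subgradient descent, since the hinge loss is not differentiable where the margin equals 1).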
  • Visualization: Includes decision boundary plotting with support vector highlighting on the Iris dataset.
  • File: src/svm/svm_scratch.py
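
To make the objective and its optimization concrete, here is a minimal sketch of batch (sub)gradient descent on $J(\mathbf{w}, b)$. It is not the code in src/svm/svm_scratch.py; the function names, the learning rate `lr`, the iteration count, and the toy data are all assumptions chosen for illustration. Labels are assumed to be encoded as $-1$ and $+1$.

```python
import numpy as np

def train_soft_margin_svm(X, y, C=1.0, lr=0.001, n_iters=1000):
    """Batch (sub)gradient descent on 0.5*||w||^2 + C * sum(hinge losses).

    Hypothetical helper for illustration; y must contain -1/+1 labels.
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iters):
        margins = y * (X @ w + b)
        # Only points with margin < 1 contribute to the hinge term.
        violators = margins < 1
        grad_w = w - C * (y[violators] @ X[violators])
        grad_b = -C * np.sum(y[violators])
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Classify by the sign of the decision function w.x + b."""
    return np.where(X @ w + b >= 0, 1, -1)

# Toy, linearly separable data (synthetic, not the Iris dataset).
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(20, 2))
X_neg = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(20, 2))
X_toy = np.vstack([X_pos, X_neg])
y_toy = np.concatenate([np.ones(20), -np.ones(20)])

w, b = train_soft_margin_svm(X_toy, y_toy)
accuracy = np.mean(predict(X_toy, w, b) == y_toy)
print("weights:", w, "bias:", b, "training accuracy:", accuracy)
```

In this sketch, the points whose margin $y_i(\mathbf{w} \cdot \mathbf{x}_i + b)$ is at most 1 after training are the ones a plot would highlight as support vectors.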

2. Linear Regression (Univariate & Multivariate)

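  • Cost Function: Mean Squared Error (MSE), scaled by $\frac{1}{2}$ so that the factor of 2 cancels in the gradient: $$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)})^2$$ Here $h_\theta(x) = \theta^T x$ is the linear hypothesis and $m$ is the number of training examples.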
  • Optimization: Manual implementation of Gradient Descent to update weights ($\theta$).
  • Features:
    • Handles multiple features (Multivariate) via matrix operations.
    • Data normalization/standardization from scratch.
  • Files: src/regression/
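
The cost function, standardization, and update rule above can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the code in src/regression/; the helper names, the learning rate, the iteration count, and the synthetic data are assumptions.

```python
import numpy as np

def standardize(X):
    """Scale each feature to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def add_bias(X):
    """Prepend a column of ones so theta[0] acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def cost(X, y, theta):
    """J(theta) = 1/(2m) * sum of squared residuals."""
    residuals = add_bias(X) @ theta - y
    return np.sum(residuals ** 2) / (2 * X.shape[0])

def gradient_descent(X, y, lr=0.1, n_iters=2000):
    """Vectorized batch gradient descent: theta -= lr * X^T (X theta - y) / m."""
    Xb = add_bias(X)
    m = Xb.shape[0]
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iters):
        residuals = Xb @ theta - y
        theta -= lr * (Xb.T @ residuals) / m
    return theta

# Synthetic, noiseless data with two features (an assumption for the demo).
rng = np.random.default_rng(0)
X_raw = rng.uniform(0, 100, size=(200, 2))
y = 3.0 + 2.0 * X_raw[:, 0] - 1.0 * X_raw[:, 1]

X_std, mu, sigma = standardize(X_raw)
theta = gradient_descent(X_std, y)
print("final cost:", cost(X_std, y, theta))
# Dividing by sigma maps the slopes back to the original feature scale.
print("slopes on the original scale:", theta[1:] / sigma)
```

Standardizing first matters here: on the raw features, which range up to 100, the same learning rate would make the updates diverge.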

3. German Credit Classification (Sklearn)

  • A comparative study using sklearn.svm.SVC to classify credit risk on the German Credit dataset.
  • Includes data preprocessing (MinMax Scaling, Standardization) and Confusion Matrix evaluation.
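
A workflow of this kind might look like the sketch below. It runs on synthetic data from `make_classification` rather than the German Credit dataset, and the kernel, `C` value, and train/test split are assumptions, not the settings used in this repository.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic binary classification data standing in for the credit data.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# The pipeline fits the scaler on the training split only,
# so no test-set statistics leak into preprocessing.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, model.predict(X_test))
print(cm)
```

Swapping `StandardScaler` for `MinMaxScaler` (also in `sklearn.preprocessing`) gives the min-max variant of the same pipeline.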

🚀 How to Run

  1. Clone the repository:
    git clone https://github.com/mariamashraf731/ML-Algorithms-From-Scratch.git
  2. Install Dependencies:
    pip install numpy pandas matplotlib scikit-learn
  3. Run SVM from Scratch:
    python src/svm/svm_scratch.py
  4. Run Regression:
    python src/regression/linear_multivariate.py

📂 Datasets Used

  • Iris Dataset: For testing SVM decision boundaries.
  • German Credit Data: For binary classification (Credit Risk).
  • Custom Regression Data: Synthetic datasets for testing gradient descent convergence.

👨‍💻 Key Concepts Demonstrated

  • Convex Optimization: Gradient Descent implementation.
  • Vectorization: Efficient matrix multiplication using NumPy.
  • Regularization: Soft Margin C-parameter in SVM.
  • Data Preprocessing: Standardization and Scaling.
