SammyShaw/ML-Projects-Archive

Machine Learning Projects Archive

This repository archives three machine learning projects, two supervised and one unsupervised, completed during the Data Science Infinity program.
Each project demonstrates an end-to-end implementation, from data preprocessing and model training to evaluation and business application, using Python and scikit-learn.

These projects collectively showcase a range of core data science skills:

  • Regression modeling and prediction
  • Supervised classification and evaluation metrics
  • Unsupervised clustering and customer segmentation
  • Data preprocessing, feature engineering, and model interpretability

Project Summaries

1. Predicting Customer Loyalty (Regression)

Goal: Build predictive regression models to estimate customer loyalty scores for ABC Grocery’s membership program.
Techniques: Linear Regression, Random Forest Regressor, Feature Selection via RFECV.
Highlights:

  • Compared multiple regression approaches for predictive accuracy.
  • Identified key drivers of customer loyalty such as spending patterns and distance from store.
  • Demonstrated robust data preprocessing and feature importance analysis.
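
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is synthetic (generated with `make_regression`), and the split ratio, fold count, and tree count are assumptions chosen for the example.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the customer data (hypothetical, not the project's dataset).
X, y = make_regression(
    n_samples=400, n_features=8, n_informative=4, noise=10.0, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Recursive feature elimination with cross-validation around a linear model.
selector = RFECV(estimator=LinearRegression(), cv=5, scoring="r2")
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Fit both model families: linear on the selected features, forest on all features.
linear = LinearRegression().fit(X_train_sel, y_train)
forest = RandomForestRegressor(n_estimators=100, random_state=42).fit(X_train, y_train)

linear_r2 = r2_score(y_test, linear.predict(X_test_sel))
forest_r2 = r2_score(y_test, forest.predict(X_test))

print(f"features kept by RFECV: {selector.n_features_}")
print(f"linear R^2: {linear_r2:.3f}, forest R^2: {forest_r2:.3f}")
print("forest feature importances:", forest.feature_importances_.round(3))
```

On real data the ranking of the two models can differ from what synthetic linear data shows, so the scores here only demonstrate the workflow.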

➡️ View Project →


2. Enhancing Targeting Accuracy (Classification)

Goal: Predict which customers are most likely to sign up for ABC Grocery’s Delivery Club membership using supervised ML classification.
Techniques: Logistic Regression, Decision Tree, Random Forest, K-Nearest Neighbors (KNN).
Highlights:

  • Random Forest achieved the best balance of accuracy (0.935) and recall (0.904).
  • Emphasized interpretability via feature and permutation importance.
  • Provided actionable business insight: proximity to store was the top predictor of signups.
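
As a rough sketch of how accuracy, recall, and permutation importance can be obtained for a Random Forest classifier, the example below uses synthetic data from `make_classification`. The class balance, feature count, and hyperparameters are assumptions for illustration, and the scores it prints are unrelated to the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic, mildly imbalanced stand-in for the signup data (hypothetical).
X, y = make_classification(
    n_samples=600,
    n_features=6,
    n_informative=3,
    n_redundant=0,
    weights=[0.7, 0.3],
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

clf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)
y_pred = clf.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)  # recall for the positive (signup) class

# Permutation importance: drop in test score when one feature is shuffled.
perm = permutation_importance(clf, X_test, y_test, n_repeats=10, random_state=42)
ranking = perm.importances_mean.argsort()[::-1]  # most important feature first

print(f"accuracy: {accuracy:.3f}, recall: {recall:.3f}")
print("features ranked by permutation importance:", ranking.tolist())
```

Permutation importance is computed on held-out data here, which tends to give a less optimistic picture than the impurity-based `feature_importances_` attribute.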

➡️ View Project →


3. "You Are What You Eat" Customer Segmentation (Clustering)

Goal: Use unsupervised learning (k-means) to segment grocery customers based on dietary preferences and spending patterns.
Techniques: K-Means Clustering, Feature Scaling, WCSS (Elbow Method).
Highlights:

  • Identified 3 main segments (General, Vegetarian, Vegan-like) based on product area spend.
  • Provided clear actionable insights for personalized marketing strategies.
  • Suggested future applications: deeper subcategory segmentation, integration with demographic data.
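
The elbow method mentioned above can be sketched as follows. This is an illustrative example under stated assumptions: the share-of-spend columns, the three synthetic customer groups, and the use of `MinMaxScaler` are all hypothetical choices, not taken from the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(42)

# Hypothetical share-of-spend columns: [dairy, fruit, meat, vegetables].
centers = np.array(
    [
        [0.25, 0.25, 0.25, 0.25],  # buys from every product area
        [0.30, 0.35, 0.00, 0.35],  # no meat
        [0.00, 0.50, 0.00, 0.50],  # no meat or dairy
    ]
)
X = np.vstack([c + rng.normal(0, 0.02, size=(100, 4)) for c in centers])
X = np.clip(X, 0, None)  # spend shares cannot be negative

# Scale features so no single product area dominates the distance metric.
X_scaled = MinMaxScaler().fit_transform(X)

# WCSS (inertia) for a range of k; the "elbow" suggests a reasonable k.
wcss = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X_scaled)
    wcss[k] = model.inertia_

final = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X_scaled)
labels = final.labels_

for k, value in wcss.items():
    print(f"k={k}: WCSS={value:.2f}")
print("cluster sizes:", np.bincount(labels).tolist())
```

Plotting `wcss` against `k` with matplotlib would show the elbow visually; the loop above only prints the values.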

➡️ View Project →


Technical Stack

| Tool / Library | Purpose |
| --- | --- |
| Python (3.x) | Core scripting language |
| pandas / numpy | Data cleaning & transformation |
| scikit-learn | Machine learning & evaluation |
| matplotlib / seaborn | Visualization |
| pickle | Model persistence |
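
For the model persistence entry, a minimal sketch of round-tripping a fitted scikit-learn model through `pickle` might look like this. The toy data and the in-memory `pickle.dumps` call are assumptions for the example; a project would more typically write to a file with `pickle.dump`.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data (hypothetical): y is exactly 2 * x.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

model = LinearRegression().fit(X, y)

# Serialize to bytes and restore; file-based use would call
# pickle.dump(model, file_handle) and pickle.load(file_handle) instead.
payload = pickle.dumps(model)
restored = pickle.loads(payload)

same = np.allclose(model.predict(X), restored.predict(X))
print("restored model gives identical predictions:", same)
```

Pickled models should only be loaded from trusted sources and with a compatible scikit-learn version.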

Key Skills Demonstrated

  • Regression and classification modeling
  • Clustering and segmentation
  • Cross-validation and feature selection
  • Evaluating models on imbalanced data (precision, recall, F1)
  • Model interpretability and visualization
  • Business translation of ML insights
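
To illustrate why precision, recall, and F1 matter alongside accuracy on imbalanced data, here is a small worked sketch. The labels are made up for the example and do not come from any of the projects.

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# Hypothetical labels: 3 positives out of 10 samples.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 1, 0]

# Counts for the positive class: TP = 2, FP = 1, FN = 1.
accuracy = accuracy_score(y_true, y_pred)    # (TP + TN) / total = 8 / 10
precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 2 / 3
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 2 / 3
f1 = f1_score(y_true, y_pred)                # harmonic mean of precision and recall

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Accuracy looks high because most samples are negatives, while precision and recall show that one in three positives is missed and one in three positive predictions is wrong.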

Reflection

This collection marks the foundation of my applied machine learning journey, moving from conceptual understanding to practical, business-relevant modeling.
Each project emphasizes clarity, reproducibility, and interpretability, bridging statistical rigor and actionable insight.


© 2025 Samuel Shaw
📍 Seattle, WA
📫 LinkedIn

About

Select machine learning projects using a grocery store's customer database.
