This repository is a collection of data mining and machine learning projects implemented in Python with scikit-learn.
The projects demonstrate a full data science workflow, including:
- Data cleaning
- Feature engineering
- Exploratory data analysis (EDA)
- Statistical testing
- Machine learning model development
- Model evaluation
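The workflow steps listed above can be sketched end to end as follows. This is a minimal illustration, not code taken from any of the projects: the synthetic dataset, the column names (`age`, `income`, `purchased`), the derived feature, and the helper names `make_toy_data` and `run_workflow` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from scipy import stats
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_toy_data(n=200, seed=0):
    """Build a synthetic stand-in dataset (hypothetical, not project data)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "age": rng.integers(18, 70, size=n).astype(float),
        "income": rng.normal(50_000, 15_000, size=n),
    })
    # The label depends on income plus noise, so the features carry signal.
    noisy_income = df["income"] + rng.normal(0, 10_000, size=n)
    df["purchased"] = (noisy_income > 50_000).astype(int)
    # Blank out a few values so the cleaning step has something to do.
    df.loc[df.index[:10], "age"] = np.nan
    return df


def run_workflow(df):
    # 1. Data cleaning: fill missing ages with the median.
    df = df.copy()
    df["age"] = df["age"].fillna(df["age"].median())

    # 2. Feature engineering: a simple derived ratio (illustrative only).
    df["income_per_year_of_age"] = df["income"] / df["age"]

    # 3. EDA: summary statistics for every column.
    summary = df.describe()

    # 4. Statistical testing: Welch's t-test on income, buyers vs. others.
    buyers = df.loc[df["purchased"] == 1, "income"]
    others = df.loc[df["purchased"] == 0, "income"]
    _, p_value = stats.ttest_ind(buyers, others, equal_var=False)

    # 5. Model development: scale features, then fit logistic regression.
    X = df[["age", "income", "income_per_year_of_age"]]
    y = df["purchased"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0, stratify=y
    )
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)

    # 6. Model evaluation: accuracy on the held-out split.
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return {"summary": summary, "p_value": p_value, "accuracy": accuracy}


if __name__ == "__main__":
    results = run_workflow(make_toy_data())
    print(f"t-test p-value: {results['p_value']:.4g}")
    print(f"test accuracy: {results['accuracy']:.2f}")
```

The individual notebooks may use different models, tests, and metrics; the sketch only shows how the six stages fit together.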
The projects span several domains, including:
- E-commerce analytics
- Customer purchase prediction
- Medical diagnosis classification
- Admission prediction
- Housing price regression
Tools and libraries used:
- Python
- pandas
- NumPy
- Matplotlib
- Seaborn
- scikit-learn
- SciPy
projects/
│
├── ecommerce-customer-behavior-analysis
├── ecommerce-purchase-prediction
├── laptop-purchase-prediction
├── thyroid-cancer-classification
├── university-admission-prediction
└── yemen-housing-price-prediction
Each project directory contains:
- data/ – dataset used in the project
- notebook/ – Jupyter notebook containing the analysis and models
- README.md – detailed explanation of the project
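As a rough sketch, the per-project layout described above could be checked with a short script. The helper names `missing_entries` and `check_projects` are hypothetical, and the script assumes it is run from the repository root so that `projects/` resolves correctly.

```python
from pathlib import Path

# Entries each project is expected to contain, per the layout above.
EXPECTED_ENTRIES = ("data", "notebook", "README.md")


def missing_entries(project_dir):
    """Return the expected entries that are absent from one project."""
    project_dir = Path(project_dir)
    return [name for name in EXPECTED_ENTRIES
            if not (project_dir / name).exists()]


def check_projects(root="projects"):
    """Map each project directory name to its list of missing entries."""
    root = Path(root)
    if not root.is_dir():
        # Nothing to check when the root folder is absent.
        return {}
    return {p.name: missing_entries(p)
            for p in sorted(root.iterdir()) if p.is_dir()}


if __name__ == "__main__":
    for name, missing in check_projects().items():
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```

An empty list for a project means all three expected entries were found.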