🛒 Customer Segmentation Using RFM Analysis 📊

Unlock the Power of Customer Data!
A Data Science Project to segment customers based on their purchasing behavior using RFM Analysis and K-Means Clustering.

📌 Table of Contents / Project Structure

Click on any section to jump directly to it:

📂 Project Overview
🛠️ Technologies Used
📂 Dataset
⚙️ Analysis Workflow
📈 Key Insights & Results
🚀 How to Run
✍️ Author
License

📖 Project Overview

This project focuses on identifying distinct customer segments for an online retail business. By analyzing transactional data, we group customers based on their purchasing habits to create targeted marketing strategies. We utilize RFM Analysis (Recency, Frequency, Monetary) combined with both rule-based segmentation and machine learning (K-Means Clustering).

🛠️ Technologies Used

The project is built using Python and the following powerful libraries:

Pandas: Data manipulation and analysis.
NumPy: Numerical computations.
Matplotlib & Seaborn: Data visualization (Histograms, Boxplots, Bar charts).
Scikit-Learn: Machine Learning (StandardScaler, K-Means Clustering).

📂 Dataset

The analysis is based on the Online Retail dataset.

File Path: Dataset/Online Retail.xlsx
Description: Contains transactions that occurred between 01/12/2010 and 09/12/2011 for a UK-based, registered online retailer with no physical store.

⚙️ Analysis Workflow

1. Data Cleaning

We start by ensuring high data quality:

Missing Values: Handling null values in CustomerID and Description.
Duplicates: Removing duplicate transactions so they do not skew the results.
Data Types: Converting InvoiceDate to datetime objects.
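
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration rather than the exact notebook code: it assumes rows without a CustomerID are dropped and missing Description values are filled with a placeholder, which is one common way to handle them. The function name clean_transactions and the sample rows are made up for the example.

```python
import pandas as pd


def clean_transactions(df: pd.DataFrame) -> pd.DataFrame:
    """Handle missing values, drop duplicates, and parse invoice dates."""
    # Assumption: a row with no CustomerID cannot be assigned to a customer, so drop it.
    cleaned = df.dropna(subset=["CustomerID"]).copy()
    # Assumption: a missing Description is harmless for RFM, so fill it with a placeholder.
    cleaned["Description"] = cleaned["Description"].fillna("Unknown")
    # Remove fully identical rows.
    cleaned = cleaned.drop_duplicates()
    # Convert InvoiceDate to datetime objects.
    cleaned["InvoiceDate"] = pd.to_datetime(cleaned["InvoiceDate"])
    return cleaned.reset_index(drop=True)


# Tiny made-up sample with one duplicate row and one row missing a CustomerID.
raw = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "536366", "536367"],
    "Description": ["MUG", "MUG", None, "LAMP"],
    "Quantity": [6, 6, 2, 1],
    "InvoiceDate": ["2010-12-01 08:26:00", "2010-12-01 08:26:00",
                    "2010-12-01 08:28:00", "2010-12-02 09:00:00"],
    "UnitPrice": [2.55, 2.55, 1.85, 9.95],
    "CustomerID": [17850.0, 17850.0, 17850.0, None],
})
cleaned = clean_transactions(raw)
print(cleaned)
```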

2. Exploratory Data Analysis (EDA)

We visualize the data to understand trends:

Top Countries: Identifying which countries have the most customers.
Price Distribution: Analyzing UnitPrice to detect outliers.
Orders per Day: Tracking transaction volume over time.
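
As a rough sketch, the numbers behind these three views could be computed like this before plotting them with Matplotlib or Seaborn. The function name eda_summaries, the sample data, and the 1.5 x IQR outlier rule are assumptions made for the illustration; the notebook may detect outliers differently (for example, visually with a boxplot).

```python
import pandas as pd


def eda_summaries(df: pd.DataFrame):
    """Return customers per country, orders per day, and UnitPrice outliers."""
    # Number of distinct customers in each country, largest first.
    customers_per_country = (
        df.groupby("Country")["CustomerID"].nunique().sort_values(ascending=False)
    )
    # Number of distinct invoices per calendar day.
    orders_per_day = df.groupby(df["InvoiceDate"].dt.date)["InvoiceNo"].nunique()
    # Assumption: flag prices outside 1.5 * IQR of the quartiles as outliers.
    q1, q3 = df["UnitPrice"].quantile([0.25, 0.75])
    iqr = q3 - q1
    is_outlier = (df["UnitPrice"] < q1 - 1.5 * iqr) | (df["UnitPrice"] > q3 + 1.5 * iqr)
    return customers_per_country, orders_per_day, df[is_outlier]


# Made-up sample data for illustration only.
sample = pd.DataFrame({
    "Country": ["United Kingdom"] * 3 + ["France"] * 2 + ["United Kingdom"],
    "CustomerID": [1, 1, 2, 3, 3, 4],
    "InvoiceNo": ["A", "A", "B", "C", "D", "E"],
    "InvoiceDate": pd.to_datetime(["2011-01-01"] * 3 + ["2011-01-02"] * 3),
    "UnitPrice": [1.0, 2.0, 2.5, 3.0, 3.5, 500.0],
})
customers_per_country, orders_per_day, price_outliers = eda_summaries(sample)
print(customers_per_country, orders_per_day, price_outliers, sep="\n\n")
```

Each returned object can be passed directly to a plotting call, for example customers_per_country.plot(kind="bar").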

3. RFM Calculation

We compute the three key metrics for every customer:

Recency (R): How many days ago was their last purchase?
Frequency (F): How often do they buy?
Monetary (M): How much do they spend?

Each metric is then scored from 1 to 5 using quantiles (pd.qcut). For Recency the scale is reversed, so customers who bought most recently receive the highest score.
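
The calculation and scoring can be sketched as below. This is an illustrative version under a few assumptions that are not stated in the README: the snapshot date is taken as one day after the last invoice, Frequency counts distinct invoices, Monetary sums Quantity * UnitPrice, and values are ranked before pd.qcut so that ties do not produce duplicate bin edges. The names compute_rfm, score_rfm, and TotalPrice are hypothetical.

```python
import pandas as pd


def compute_rfm(df: pd.DataFrame, snapshot_date=None) -> pd.DataFrame:
    """Aggregate transactions into one Recency/Frequency/Monetary row per customer."""
    df = df.copy()
    df["TotalPrice"] = df["Quantity"] * df["UnitPrice"]
    if snapshot_date is None:
        # Assumption: measure recency from the day after the last invoice.
        snapshot_date = df["InvoiceDate"].max() + pd.Timedelta(days=1)
    return df.groupby("CustomerID").agg(
        Recency=("InvoiceDate", lambda s: (snapshot_date - s.max()).days),
        Frequency=("InvoiceNo", "nunique"),
        Monetary=("TotalPrice", "sum"),
    )


def score_rfm(rfm: pd.DataFrame) -> pd.DataFrame:
    """Add 1-5 quantile scores; a lower Recency earns a higher R score."""
    scored = rfm.copy()
    for column, score_name, labels in [
        ("Recency", "R_Score", [5, 4, 3, 2, 1]),
        ("Frequency", "F_Score", [1, 2, 3, 4, 5]),
        ("Monetary", "M_Score", [1, 2, 3, 4, 5]),
    ]:
        # rank(method="first") breaks ties so that qcut always finds 5 distinct bins.
        ranks = scored[column].rank(method="first")
        scored[score_name] = pd.qcut(ranks, 5, labels=labels).astype(int)
    return scored


# Made-up transactions for two customers.
transactions = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2],
    "InvoiceNo": ["A1", "A2", "B1", "B1"],
    "InvoiceDate": pd.to_datetime(["2011-12-01", "2011-12-09", "2011-11-09", "2011-11-09"]),
    "Quantity": [2, 1, 3, 1],
    "UnitPrice": [5.0, 20.0, 1.0, 2.0],
})
rfm = compute_rfm(transactions)

# Made-up RFM table with five customers to show the scoring step.
rfm_example = pd.DataFrame(
    {"Recency": [1, 10, 30, 90, 200],
     "Frequency": [20, 8, 5, 2, 1],
     "Monetary": [5000.0, 900.0, 400.0, 120.0, 30.0]},
    index=pd.Index([101, 102, 103, 104, 105], name="CustomerID"),
)
scored = score_rfm(rfm_example)
print(rfm, scored, sep="\n\n")
```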

4. K-Means Clustering

We take it a step further with Machine Learning:

Log Transformation: To handle skewed data distribution.
Scaling: Using StandardScaler to normalize metrics.
Elbow Method: Determining the optimal number of clusters (k).
Clustering: Grouping customers into mathematical clusters.
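
The four steps above could look roughly like this with Scikit-Learn. It is a sketch under stated assumptions, not the notebook's exact code: np.log1p is used as the log transform, negative values are clipped to zero first, and n_init=10 and random_state=42 are arbitrary choices. The function names and the synthetic RFM table are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

RFM_COLUMNS = ["Recency", "Frequency", "Monetary"]


def prepare_features(rfm: pd.DataFrame) -> np.ndarray:
    """Log-transform the skewed RFM metrics, then standardize them."""
    # Assumption: clip negatives (e.g. refunds) to 0 so that log1p is defined.
    logged = np.log1p(rfm[RFM_COLUMNS].clip(lower=0))
    return StandardScaler().fit_transform(logged)


def elbow_inertias(features: np.ndarray, k_values) -> dict:
    """Fit K-Means for each k and collect the inertia for an elbow plot."""
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=42).fit(features).inertia_
        for k in k_values
    }


def cluster_customers(rfm: pd.DataFrame, k: int) -> pd.DataFrame:
    """Return the RFM table with an added Cluster label column."""
    features = prepare_features(rfm)
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(features)
    return rfm.assign(Cluster=labels)


# Synthetic data: 30 recent, frequent, high-spending customers and 30 lapsed, low-spending ones.
rng = np.random.default_rng(0)
rfm_demo = pd.DataFrame({
    "Recency": np.concatenate([rng.integers(1, 15, 30), rng.integers(200, 365, 30)]),
    "Frequency": np.concatenate([rng.integers(20, 40, 30), rng.integers(1, 3, 30)]),
    "Monetary": np.concatenate([rng.uniform(2000, 5000, 30), rng.uniform(10, 100, 30)]),
})
inertias = elbow_inertias(prepare_features(rfm_demo), range(1, 6))
clustered = cluster_customers(rfm_demo, k=2)
print(inertias)
print(clustered.groupby("Cluster")[RFM_COLUMNS].mean())
```

Plotting the inertias against k and looking for the point where the curve flattens gives the elbow; the per-cluster means help translate each cluster back into business terms.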

📈 Key Insights & Results

Based on the RFM Scores, customers are categorized into segments such as:

Segment	Description	Strategy
🏆 Champions	High R, F, M scores. Bought recently, buy often, and spend the most.	Reward them. Can become early adopters of new products.
💎 Loyal Customers	Good Frequency and Monetary scores.	Upsell higher value products. Ask for reviews.
⚠️ At Risk / Potential	Average Recency, Frequency, and Monetary scores.	Send personalized emails to reconnect, offer renewals.
💤 Hibernating	Low Recency, Frequency, and Monetary scores.	Recreate brand value, offer relevant discounts.
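
For illustration, a rule-based mapping from scores to the segments in the table might look like the sketch below. The thresholds are hypothetical and chosen only to match the descriptions above; the cut-offs used in the notebook may differ. The function name assign_segment is also an assumption.

```python
import pandas as pd


def assign_segment(r: int, f: int, m: int) -> str:
    """Map 1-5 R, F, M scores to a segment name (hypothetical thresholds)."""
    if r >= 4 and f >= 4 and m >= 4:
        return "Champions"  # high on all three scores
    if f >= 3 and m >= 3:
        return "Loyal Customers"  # good Frequency and Monetary scores
    if r <= 2 and f <= 2 and m <= 2:
        return "Hibernating"  # low on all three scores
    return "At Risk / Potential"  # everything in between


# Made-up scores for four customers.
scores = pd.DataFrame({
    "R_Score": [5, 2, 1, 3],
    "F_Score": [5, 4, 1, 2],
    "M_Score": [5, 4, 2, 3],
})
scores["Segment"] = [
    assign_segment(r, f, m)
    for r, f, m in zip(scores["R_Score"], scores["F_Score"], scores["M_Score"])
]
print(scores)
```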

🚀 How to Run

Clone the repository.

Install dependencies:

pip install pandas numpy matplotlib seaborn scikit-learn openpyxl

Run the Notebook: Open notebook.ipynb in Jupyter Notebook or VS Code and execute the cells sequentially.

✍️ Author

Name: Mohamed Younis

License 📄

Add a license that matches how you want others to use your work (e.g., MIT).

Created with ❤️ for Data Science Enthusiasts
