This project applies K-Means clustering, an unsupervised machine learning technique, to segment the customers of a retail store by their purchasing behavior. Using the Mall Customers Dataset, it groups customers into clusters based on their Annual Income and Spending Score.
Customer segmentation helps businesses understand their customers better and design targeted marketing strategies.
The goal of this project is to:
- Analyze customer purchasing patterns
- Identify groups of similar customers
- Apply K-Means clustering to form meaningful customer segments
Key concepts used:
- Unsupervised Learning
- K-Means Clustering Algorithm
- Elbow Method to determine the optimal number of clusters
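For reference, K-Means chooses cluster assignments and centroids that minimize the within-cluster sum of squares (WCSS, which scikit-learn exposes as `inertia_`). The notation here is introduced only for illustration: $x_i$ is a customer's feature vector, $C_j$ is the set of customers assigned to cluster $j$, and $\mu_j$ is that cluster's centroid.

```math
\mathrm{WCSS}(k) = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

The Elbow Method plots $\mathrm{WCSS}(k)$ against $k$ and picks the value of $k$ after which adding another cluster no longer produces a large drop.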
The dataset used in this project is the Mall Customers Dataset, which contains the following attributes:
| Feature | Description |
|---|---|
| CustomerID | Unique ID of the customer |
| Gender | Male / Female |
| Age | Customer age |
| Annual Income (k$) | Customer's annual income, in thousands of dollars |
| Spending Score (1-100) | Score assigned by the mall based on purchasing behavior |
For clustering, the features used are:
- Annual Income
- Spending Score
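Feature selection might look like the sketch below. It assumes the column names match the attribute table above. The helper name `select_features` and the small sample frame are hypothetical, used only so the snippet runs without the CSV file.

```python
import pandas as pd

# Column names as listed in the attribute table above.
FEATURE_COLUMNS = ["Annual Income (k$)", "Spending Score (1-100)"]


def select_features(customers: pd.DataFrame) -> pd.DataFrame:
    """Return a copy containing only the two columns used for clustering."""
    missing = [col for col in FEATURE_COLUMNS if col not in customers.columns]
    if missing:
        raise KeyError(f"Dataset is missing expected columns: {missing}")
    return customers[FEATURE_COLUMNS].copy()


# Illustrative stand-in for Mall_Customers.csv (values are made up).
sample = pd.DataFrame(
    {
        "CustomerID": [1, 2, 3],
        "Gender": ["Male", "Female", "Female"],
        "Age": [30, 42, 25],
        "Annual Income (k$)": [20, 60, 90],
        "Spending Score (1-100)": [75, 50, 10],
    }
)
features = select_features(sample)
print(features.shape)  # (3, 2)
```

With the real file, the same helper would be called as `select_features(pd.read_csv("Mall_Customers.csv"))`.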
Tools and libraries used:
- Python
- Pandas – Data processing
- Matplotlib – Data visualization
- Scikit-learn – Machine learning algorithms
The project follows these steps:
- Import required Python libraries
- Load the dataset using Pandas
- Select relevant features (Income & Spending Score)
- Use the Elbow Method to find the optimal number of clusters
- Apply the K-Means Algorithm
- Visualize customer segments using scatter plots
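The elbow and clustering steps above can be sketched as follows. This is a minimal illustration, not necessarily how `kmeans_customer_segmentation.py` is written. The function names, `random_state`, `n_init`, the range of k values, and the synthetic data are all assumptions made so the snippet runs on its own.

```python
import numpy as np
from sklearn.cluster import KMeans


def elbow_inertias(X, k_values, random_state=42):
    """Fit K-Means once per k and return the WCSS (inertia) for each k."""
    inertias = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        model.fit(X)
        inertias.append(model.inertia_)
    return inertias


def fit_segments(X, n_clusters=5, random_state=42):
    """Fit K-Means and return (labels, centroids)."""
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = model.fit_predict(X)
    return labels, model.cluster_centers_


# Synthetic stand-in for the (income, spending score) columns: five loose blobs.
rng = np.random.default_rng(0)
centers = np.array([[25, 20], [25, 80], [55, 50], [90, 15], [90, 85]])
X = np.vstack([c + rng.normal(scale=5.0, size=(40, 2)) for c in centers])

inertias = elbow_inertias(X, range(1, 11))   # one WCSS value per k = 1..10
labels, centroids = fit_segments(X, n_clusters=5)
```

Plotting `inertias` against k gives the elbow curve. The k at the bend is then passed to `fit_segments`.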
K-Means groups the customers into five distinct clusters, each representing a different purchasing behavior:
- High income – High spending customers
- High income – Low spending customers
- Low income – High spending customers
- Low income – Low spending customers
- Average customers
This segmentation helps businesses improve marketing strategies and customer targeting.
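One way to map cluster labels to descriptions like those above is to look at each cluster's average income and spending score. The sketch below assumes the labels come from a fitted K-Means model. The helper `summarize_segments` and the four sample rows are hypothetical and made up for illustration.

```python
import pandas as pd


def summarize_segments(features: pd.DataFrame, labels) -> pd.DataFrame:
    """Return the mean of each feature and the customer count per cluster."""
    labelled = features.assign(cluster=list(labels))
    summary = labelled.groupby("cluster").mean()
    summary["size"] = labelled.groupby("cluster").size()
    return summary


# Made-up rows: two low-income/high-spending and two high-income/low-spending customers.
features = pd.DataFrame(
    {
        "Annual Income (k$)": [20, 22, 85, 87],
        "Spending Score (1-100)": [80, 76, 15, 17],
    }
)
labels = [0, 0, 1, 1]  # hand-assigned here; normally produced by K-Means

summary = summarize_segments(features, labels)
print(summary)
```

A cluster with a low mean income and a high mean spending score, for example, would correspond to the "Low income – High spending" segment.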
The repository is organized as follows:
SCT_ML_2
│
├── Mall_Customers.csv
├── kmeans_customer_segmentation.py
└── README.md
To run the project:
1️⃣ Clone the repository and move into it
git clone https://github.com/codewithRakz/SCT_ML_2.git
cd SCT_ML_2
2️⃣ Install required libraries
pip install pandas matplotlib scikit-learn
3️⃣ Run the program
python kmeans_customer_segmentation.py
The project generates:
- Elbow Method Graph
- Customer Segmentation Visualization
These plots show how customers are grouped based on their purchasing behavior.
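The two plots could be produced along the lines of the sketch below. The function names, styling choices, and tiny input arrays are assumptions for illustration, and the WCSS values are placeholders rather than results from the dataset. The actual script may draw its plots differently.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_elbow(k_values, inertias):
    """Line plot of WCSS against the number of clusters."""
    fig, ax = plt.subplots()
    ax.plot(list(k_values), inertias, marker="o")
    ax.set_xlabel("Number of clusters (k)")
    ax.set_ylabel("WCSS (inertia)")
    ax.set_title("Elbow Method")
    return fig


def plot_segments(X, labels, centroids):
    """Scatter plot of customers colored by cluster, with centroids marked."""
    fig, ax = plt.subplots()
    ax.scatter(X[:, 0], X[:, 1], c=labels, cmap="viridis", s=30)
    ax.scatter(centroids[:, 0], centroids[:, 1], c="red", marker="X",
               s=150, label="Centroids")
    ax.set_xlabel("Annual Income (k$)")
    ax.set_ylabel("Spending Score (1-100)")
    ax.set_title("Customer Segments")
    ax.legend()
    return fig


# Placeholder inputs (made up), just to exercise the two functions.
X = np.array([[20, 80], [22, 76], [85, 15], [87, 17]])
labels = np.array([0, 0, 1, 1])
centroids = np.array([[21, 78], [86, 16]])
placeholder_inertias = [100.0, 40.0, 15.0, 12.0, 11.0]

elbow_fig = plot_elbow(range(1, 6), placeholder_inertias)
segments_fig = plot_segments(X, labels, centroids)
```

Call `plt.show()` to display the figures interactively, or `fig.savefig("name.png")` on each figure to write it to disk.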
Key learning outcomes:
- Understanding unsupervised machine learning
- Implementing K-Means clustering
- Using data visualization for insights
- Applying the Elbow Method for cluster optimization
This project was completed as part of an internship task focused on machine learning and customer analytics.