This project focuses on analyzing retail customer purchase data using SQL and Python to extract meaningful business insights. It covers revenue trends, customer behavior, product performance, and regional analysis.
- Analyze overall revenue and growth patterns
- Identify high-value customers and segments
- Evaluate product and category performance
- Understand geographic distribution of sales
- Estimate customer retention and churn
- SQL (MySQL) – Data querying and transformation
- Python (Pandas, Matplotlib) – Data analysis & visualization
- Jupyter Notebook – Development environment
The dataset includes:
-
Customer details (ID, Age, Gender, Income, Segment)
-
Transaction data (Amount, Total Purchases, Date)
-
Product information (Category, Brand, Type)
-
Order details (Payment Method, Shipping, Status, Ratings)
-
Geographic data (City, State, Country)
-
link:https://drive.google.com/file/d/1mPKDHK8c0jWb-zgqzow7qxHSVK5Pi-ic/view?usp=drive_link
- Total Revenue
- Average Order Value (AOV)
- Revenue by City and Product Category
- Customer Lifetime Value (CLTV)
- Top Customers by Revenue
- Repeat Purchase Rate
- Churn Rate (approximated)
- Top Product Categories
- Brand-wise Performance
- Top Cities by Revenue
- Monthly and Yearly Revenue Trends
- 📊 Bar Chart – Revenue by City
- 📈 Line Chart – Monthly Revenue Trend
- A small percentage of customers contribute a large portion of total revenue
- Certain cities dominate overall sales, indicating regional concentration
- Specific product categories consistently outperform others
- Repeat customers significantly impact business revenue
- Customer Acquisition Cost (CAC) not calculated due to lack of marketing data
- Profit estimated as no cost data is available
- Churn rate approximated based on inactivity
- Build an interactive dashboard using Power BI or Tableau
- Include real profit and cost analysis
- Perform advanced segmentation using RFM analysis
- Add predictive analytics (sales forecasting)
- Load the dataset into MySQL
- Connect MySQL with Python using SQLAlchemy
- Run the Jupyter Notebook for analysis and visualization
- End-to-end data analysis using SQL + Python
- Business-focused insights and metrics
- Clean and structured analytical workflow