E-commerce Product Data Aggregator and Analyzer: a Python-based system that simulates and analyzes e-commerce data from multiple vendors, with centralized product aggregation and analytics exposed through RESTful APIs.

Quantamaster/Ecommerce_Data_Analyzer

🛒 E-commerce Product Data Aggregator & Analyzer

Overview

A full-stack data engineering & analytics pipeline that ingests raw e-commerce product and order data, stores it in a MySQL database, exposes data through a Flask REST API, and visualizes insights using an interactive Streamlit dashboard.

This project demonstrates an end-to-end data flow (ingestion → storage → API → analytics) using production-style Python tools.


📌 Key Capabilities

  • Multi-Source Product Ingestion

    • Aggregates product data from multiple JSON sources
    • Harmonizes schemas and removes inconsistencies
  • Order Data Processing

    • Ingests order data from CSV
    • Maintains referential integrity between products, orders, and order items
  • REST API Layer

    • Flask-based API serving product data
    • Simulates an external upstream data provider
  • Analytics Dashboard

    • Streamlit web app for sales analysis and product performance
    • Interactive charts and KPIs
  • Relational Database Backend

    • MySQL (MariaDB) as a centralized data store
    • Normalized schema with foreign-key constraints
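
The schema harmonization step listed above can be sketched roughly as follows. This is a minimal illustration, not the logic in data_ingestion.py: the alias table, the canonical field names, and the sample records are all assumptions, since the real feed layouts are not shown in this README.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical mapping from vendor-specific keys to one canonical schema.
# The real field names in the JSON feeds are not documented here.
FIELD_ALIASES = {
    "product_id": ("product_id", "id", "sku"),
    "name": ("name", "title", "product_name"),
    "price": ("price", "unit_price", "cost"),
    "category": ("category", "type"),
}


def harmonize(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw vendor record onto the canonical field names."""
    clean: Dict[str, Any] = {}
    for canonical, aliases in FIELD_ALIASES.items():
        # Take the first alias that is present and non-empty, else None.
        clean[canonical] = next(
            (record[a] for a in aliases if a in record and record[a] not in (None, "")),
            None,
        )
    if clean["price"] is not None:
        clean["price"] = round(float(clean["price"]), 2)
    if isinstance(clean["name"], str):
        clean["name"] = clean["name"].strip()
    return clean


def merge_sources(*sources: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Harmonize every record and keep the first occurrence of each product_id."""
    merged: Dict[Any, Dict[str, Any]] = {}
    for source in sources:
        for raw in source:
            row = harmonize(raw)
            if row["product_id"] is not None:
                merged.setdefault(row["product_id"], row)
    return list(merged.values())


# Made-up sample data standing in for the two product feeds.
source_1 = [{"id": "P1", "title": " Laptop ", "price": "999.5"}]
source_2 = [
    {"sku": "P1", "product_name": "Laptop", "cost": 1000},
    {"sku": "P2", "product_name": "Mouse", "cost": 19.999, "type": "Accessories"},
]
products = merge_sources(source_1, source_2)
```

Keeping the first occurrence of a duplicate ID is only one possible policy; a real pipeline might instead prefer the most recent or most complete record.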

🧱 System Architecture


┌────────────────────┐
│ JSON Product Feeds │
└─────────┬──────────┘
          │
┌─────────▼──────────┐
│ Flask REST API     │
└─────────┬──────────┘
          │
┌─────────▼──────────┐
│ Data Ingestion     │
│ (Pandas + Python)  │
└─────────┬──────────┘
          │
┌─────────▼──────────┐
│ MySQL Database     │
│ (Products, Orders) │
└─────────┬──────────┘
          │
┌─────────▼──────────┐
│ Streamlit Dashboard│
└────────────────────┘


🛠️ Technologies Used

| Layer           | Tools                            |
|-----------------|----------------------------------|
| Language        | Python 3.8+                      |
| API             | Flask                            |
| Analytics       | Streamlit                        |
| Data Processing | Pandas                           |
| Database        | MySQL / MariaDB                  |
| DB Connector    | mysql-connector-python / PyMySQL |
| Server          | XAMPP                            |
| Visualization   | Streamlit Charts                 |

📂 Project Structure

/Ecommerce_Data_Analyzer/
│
├── api_server.py          # Flask REST API
├── data_ingestion.py      # Product & order ingestion
├── app.py                 # Streamlit dashboard
│
├── data/
│   ├── products_source_1.json
│   ├── products_source_2.json
│   └── orders.csv
│
├── mysql_setup.sql        # DB schema
├── requirements.txt
└── README.md

📦 Prerequisites

  • Python 3.8+
  • XAMPP (Apache + MySQL + phpMyAdmin)
  • Basic knowledge of SQL & Python virtual environments

⚙️ Installation & Setup

1️⃣ Clone Repository

git clone <your-repository-url>
cd Ecommerce_Data_Analyzer

2️⃣ Create & Activate Virtual Environment

python -m venv .venv

Windows

.\.venv\Scripts\activate

macOS / Linux

source .venv/bin/activate

3️⃣ Install Dependencies

pip install -r requirements.txt

🗄️ MySQL Database Setup

  1. Launch XAMPP

  2. Start Apache and MySQL

  3. Open phpMyAdmin → http://localhost/phpmyadmin

  4. Open SQL tab

  5. Paste contents of mysql_setup.sql

  6. Execute to create:

    • products
    • orders
    • order_items

⚠️ If MySQL runs on a non-default port (e.g. 3307), update the port value in DB_CONFIG in each Python script.
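
To illustrate how the three tables relate, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. It is not the contents of mysql_setup.sql: the column names and types are assumptions, and SQLite is used only so the example runs without a database server.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL instance.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

# Hypothetical columns; the real schema lives in mysql_setup.sql.
conn.executescript(
    """
    CREATE TABLE products (
        product_id TEXT PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        order_date TEXT NOT NULL
    );
    CREATE TABLE order_items (
        order_id   INTEGER NOT NULL REFERENCES orders(order_id),
        product_id TEXT    NOT NULL REFERENCES products(product_id),
        quantity   INTEGER NOT NULL
    );
    """
)

conn.execute("INSERT INTO products VALUES ('P1', 'Laptop')")
conn.execute("INSERT INTO orders VALUES (1001, '2024-01-05')")
conn.execute("INSERT INTO order_items VALUES (1001, 'P1', 2)")  # valid reference

# An order item pointing at an unknown product violates the foreign key.
try:
    conn.execute("INSERT INTO order_items VALUES (1001, 'P9', 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

item_count = conn.execute("SELECT COUNT(*) FROM order_items").fetchone()[0]
```

The same idea applies in MySQL: parent rows in products and orders have to exist before the order_items rows that reference them.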


🔐 Database Configuration

Update DB_CONFIG in:

  • api_server.py
  • data_ingestion.py
  • app.py
DB_CONFIG = {
    'host': 'localhost',
    'database': 'ecommerce_data',
    'user': 'root',
    'password': '',
    'port': 3306
}
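
Because the same DB_CONFIG is repeated in three files, one optional way to avoid editing each copy is to read overrides from environment variables. The sketch below is only a suggestion: the ECOM_DB_* variable names and the config_from_env helper are hypothetical and not part of this repository.

```python
import os
from typing import Any, Dict, Mapping

DB_CONFIG = {
    "host": "localhost",
    "database": "ecommerce_data",
    "user": "root",
    "password": "",
    "port": 3306,
}


def config_from_env(
    base: Dict[str, Any], env: Mapping[str, str] = os.environ
) -> Dict[str, Any]:
    """Return a copy of `base` with any ECOM_DB_<KEY> variables applied.

    The ECOM_DB_ prefix is a made-up convention for this example.
    """
    config = dict(base)
    for key in base:
        value = env.get(f"ECOM_DB_{key.upper()}")
        if value is not None:
            # The port must stay an integer; everything else is a string.
            config[key] = int(value) if key == "port" else value
    return config


# Example: a XAMPP install whose MySQL listens on port 3307.
config = config_from_env(DB_CONFIG, {"ECOM_DB_PORT": "3307"})
```

The resulting dictionary uses the same keys as DB_CONFIG, so it could be unpacked into the connector call the scripts already use, for example `mysql.connector.connect(**config)`.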

📊 Data Integrity Requirement

Ensure all product IDs referenced in orders.csv exist in the product JSON files.

Missing product definitions will cause foreign-key constraint errors.
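
A quick pre-flight check can catch such mismatches before ingestion runs. The following is a minimal sketch that assumes each JSON feed is an array of objects and that both the feeds and orders.csv use a `product_id` field; adjust the field names to match the actual files.

```python
import csv
import io
import json
from typing import Iterable, Set


def known_product_ids(json_texts: Iterable[str], id_field: str = "product_id") -> Set[str]:
    """Collect every product ID defined across the JSON feeds."""
    ids: Set[str] = set()
    for text in json_texts:
        for record in json.loads(text):
            ids.add(str(record[id_field]))
    return ids


def missing_product_ids(
    orders_csv: str, known: Set[str], id_column: str = "product_id"
) -> Set[str]:
    """Return product IDs that appear in the orders CSV but in no feed."""
    reader = csv.DictReader(io.StringIO(orders_csv))
    return {row[id_column] for row in reader} - known


# Inline sample data; the field and column names are assumptions.
feed_1 = '[{"product_id": "P1"}, {"product_id": "P2"}]'
feed_2 = '[{"product_id": "P3"}]'
orders = "order_id,product_id,quantity\n1001,P1,2\n1002,P4,1\n"

missing = missing_product_ids(orders, known_product_ids([feed_1, feed_2]))
```

With the real files, the text could be loaded with `pathlib.Path("data/orders.csv").read_text()` and likewise for the two JSON feeds; a non-empty result means ingestion would hit a foreign-key error.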


▶ Running the Pipeline

1️⃣ Start Flask API

python api_server.py

📍 Runs at: http://127.0.0.1:5000
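
Once the API is running, a client such as the ingestion script can pull product data over HTTP. Here is a minimal sketch using only the standard library. The `/products` route and the response shape (a JSON array, or an object with a `products` key) are assumptions, since the routes exposed by api_server.py are not listed in this README.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen

API_BASE = "http://127.0.0.1:5000"


def parse_products(payload: bytes) -> List[Dict[str, Any]]:
    """Decode an API response body into a list of product records."""
    data = json.loads(payload.decode("utf-8"))
    if isinstance(data, dict):
        # Assumed wrapper shape: {"products": [...]}
        data = data.get("products", [])
    if not isinstance(data, list):
        raise ValueError("unexpected response shape")
    return data


def fetch_products(path: str = "/products", timeout: float = 5.0) -> List[Dict[str, Any]]:
    """Fetch products from the running Flask API (hypothetical route)."""
    with urlopen(API_BASE + path, timeout=timeout) as response:
        return parse_products(response.read())
```

Calling `fetch_products()` requires the Flask server from this step to be running; `parse_products` can be exercised on its own with a raw bytes payload.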


2️⃣ Run Data Ingestion

python data_ingestion.py

✔ Inserts products & orders into MySQL


3️⃣ Launch Streamlit Dashboard

streamlit run app.py

📍 Opens at: http://localhost:8501


📈 Dashboard Preview

[Screenshot: Performance Overview]

[Screenshot: Dashboard]

🔍 Dashboard Insights

  • Sales trends over time
  • Product-level revenue analysis
  • Category & brand performance
  • Order volume statistics
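
The aggregations behind insights like these can be sketched in a few lines. This example is plain Python rather than the Pandas code in app.py, and the row fields (`order_date`, `product`, `quantity`, `unit_price`) and sample values are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, Mapping


def revenue_by(rows: Iterable[Mapping[str, Any]], key: str) -> Dict[str, float]:
    """Sum quantity * unit_price for each distinct value of `key`."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["quantity"] * row["unit_price"]
    return dict(totals)


# Made-up joined order rows (order items with product details).
rows = [
    {"order_date": "2024-01-05", "product": "Laptop", "quantity": 1, "unit_price": 1000.0},
    {"order_date": "2024-01-05", "product": "Mouse", "quantity": 2, "unit_price": 20.0},
    {"order_date": "2024-01-06", "product": "Mouse", "quantity": 1, "unit_price": 20.0},
]

daily_revenue = revenue_by(rows, "order_date")   # sales trend over time
product_revenue = revenue_by(rows, "product")    # product-level revenue
order_days = len(daily_revenue)                  # number of days with sales
```

Grouping by a different key, such as a category or brand field, would give the corresponding performance breakdown.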

🛑 Stopping Services

  • Press Ctrl + C in terminals running:

    • api_server.py
    • app.py
  • Stop MySQL & Apache via XAMPP


🎯 Learning Outcomes

  • Designing end-to-end data pipelines
  • Working with relational databases
  • Building REST APIs
  • Creating interactive analytics dashboards
  • Enforcing data integrity constraints

⭐ If this project helped you learn or build, consider starring the repository!
