- Overview
- Key Capabilities
- System Architecture
- Technologies Used
- Project Structure
- Prerequisites
- Installation & Setup
- Database Setup
- Database Configuration
- Data Integrity Requirements
- Running the Pipeline
- Dashboard Preview
- Exploring the Dashboard
- Stopping Services
- Learning Outcomes
## Overview

A full-stack data engineering & analytics pipeline that ingests raw e-commerce product and order data, stores it in a MySQL database, exposes data through a Flask REST API, and visualizes insights using an interactive Streamlit dashboard.

This project demonstrates end-to-end data flow (ingestion → storage → API → analytics) using production-style Python tools.
## Key Capabilities

- **Multi-Source Product Ingestion** (see the sketch after this list)
  - Aggregates product data from multiple JSON sources
  - Harmonizes schemas and removes inconsistencies
- **Order Data Processing**
  - Ingests order data from CSV
  - Maintains referential integrity between products, orders, and order items
- **REST API Layer**
  - Flask-based API serving product data
  - Simulates an external upstream data provider
- **Analytics Dashboard**
  - Streamlit web app for sales analysis and product performance
  - Interactive charts and KPIs
- **Relational Database Backend**
  - MySQL (MariaDB) as a centralized data store
  - Normalized schema with foreign-key constraints
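To make the ingestion idea concrete, here is a minimal sketch of how two JSON product feeds could be harmonized with pandas. The file paths follow the `data/` layout shown later; the column names being renamed (`id`, `cost`) are illustrative assumptions, not the project's actual feed schemas.

```python
import pandas as pd

# Load both product feeds (paths follow the data/ layout in this repo).
src1 = pd.read_json("data/products_source_1.json")
src2 = pd.read_json("data/products_source_2.json")

# Harmonize schemas: the renamed columns here are illustrative assumptions.
src2 = src2.rename(columns={"id": "product_id", "cost": "price"})

# Combine, drop duplicate products, and normalize obvious inconsistencies.
products = (
    pd.concat([src1, src2], ignore_index=True)
      .drop_duplicates(subset="product_id")
)
products["price"] = pd.to_numeric(products["price"], errors="coerce")

print(products.head())
```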
## System Architecture

```
┌──────────────────────┐
│  JSON Product Feeds  │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│    Flask REST API    │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│    Data Ingestion    │
│  (Pandas + Python)   │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│    MySQL Database    │
│ (Products, Orders)   │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│ Streamlit Dashboard  │
└──────────────────────┘
```
## Technologies Used

| Layer | Tools |
|---|---|
| Language | Python 3.8+ |
| API | Flask |
| Analytics | Streamlit |
| Data Processing | Pandas |
| Database | MySQL / MariaDB |
| DB Connector | mysql-connector-python / PyMySQL |
| Server | XAMPP |
| Visualization | Streamlit Charts |
## Project Structure

```
Ecommerce_Data_Analyzer/
│
├── api_server.py          # Flask REST API
├── data_ingestion.py      # Product & order ingestion
├── app.py                 # Streamlit dashboard
│
├── data/
│   ├── products_source_1.json
│   ├── products_source_2.json
│   └── orders.csv
│
├── mysql_setup.sql        # DB schema
├── requirements.txt
└── README.md
```
## Prerequisites

- Python 3.8+
- XAMPP (Apache + MySQL + phpMyAdmin)
- Basic knowledge of SQL & Python virtual environments
## Installation & Setup

Clone the repository and create a virtual environment:

```bash
git clone <your-repository-url>
cd Ecommerce_Data_Analyzer
python -m venv .venv
```

Activate it on Windows:

```bash
.\.venv\Scripts\activate
```

Or on macOS / Linux:

```bash
source .venv/bin/activate
```

Then install the dependencies:

```bash
pip install -r requirements.txt
```
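For reference, `requirements.txt` likely pins the libraries from the Technologies table; the exact contents below are an assumption, not a copy of the project file:

```text
flask
streamlit
pandas
mysql-connector-python
```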
## Database Setup

- Launch XAMPP
- Start Apache and MySQL
- Open phpMyAdmin → http://localhost/phpmyadmin
- Open the SQL tab
- Paste the contents of `mysql_setup.sql`
- Execute to create: `products`, `orders`, `order_items`

⚠️ If MySQL runs on a non-default port (e.g. `3307`), update it in the Python configs.
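The authoritative schema lives in `mysql_setup.sql`; the sketch below only illustrates the general shape of a normalized three-table layout with foreign-key constraints. The column names and connection parameters are assumptions for illustration.

```python
import mysql.connector

# Connection parameters mirror the DB_CONFIG shown in the next section.
conn = mysql.connector.connect(host="localhost", user="root", password="",
                               database="ecommerce_data", port=3306)
cur = conn.cursor()

# Illustrative approximation of the schema defined in mysql_setup.sql.
cur.execute("""
    CREATE TABLE IF NOT EXISTS products (
        product_id INT PRIMARY KEY,
        name VARCHAR(255),
        brand VARCHAR(100),
        category VARCHAR(100),
        price DECIMAL(10, 2)
    )""")
cur.execute("""
    CREATE TABLE IF NOT EXISTS orders (
        order_id INT PRIMARY KEY,
        order_date DATE
    )""")
cur.execute("""
    CREATE TABLE IF NOT EXISTS order_items (
        order_id INT,
        product_id INT,
        quantity INT,
        FOREIGN KEY (order_id) REFERENCES orders(order_id),
        FOREIGN KEY (product_id) REFERENCES products(product_id)
    )""")
conn.commit()
conn.close()
```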
## Database Configuration

Update `DB_CONFIG` in:

- `api_server.py`
- `data_ingestion.py`
- `app.py`

```python
DB_CONFIG = {
    'host': 'localhost',
    'database': 'ecommerce_data',
    'user': 'root',
    'password': '',
    'port': 3306
}
```

## Data Integrity Requirements

Ensure all product IDs referenced in `orders.csv` exist in the product JSON files.
Missing product definitions will cause foreign-key constraint errors.
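A quick pre-ingestion check along these lines can catch the problem before MySQL rejects the insert; the `product_id` column name is an assumption about the CSV/JSON layout:

```python
import pandas as pd

# Collect every product ID defined across the JSON feeds.
known_ids = set()
for path in ("data/products_source_1.json", "data/products_source_2.json"):
    known_ids.update(pd.read_json(path)["product_id"])

# Find order rows that reference products which were never defined.
orders = pd.read_csv("data/orders.csv")
missing = orders[~orders["product_id"].isin(known_ids)]

if not missing.empty:
    # These rows would violate the foreign-key constraint on order_items.
    print(f"{len(missing)} order rows reference unknown product IDs:")
    print(missing["product_id"].unique())
```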
## Running the Pipeline

**1. Start the API server**

```bash
python api_server.py
```

Runs at: http://127.0.0.1:5000
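As a rough idea of what `api_server.py` exposes, a minimal Flask product endpoint could look like the following; the `/products` route name and the JSON merging are assumptions, not the project's actual implementation:

```python
from flask import Flask, jsonify
import json

app = Flask(__name__)

@app.route("/products")
def products():
    # Serve the merged product feeds, simulating an upstream data provider.
    combined = []
    for path in ("data/products_source_1.json", "data/products_source_2.json"):
        with open(path) as f:
            combined.extend(json.load(f))
    return jsonify(combined)

if __name__ == "__main__":
    app.run(port=5000)
```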
**2. Run the data ingestion**

```bash
python data_ingestion.py
```

Inserts products & orders into MySQL.
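In line with the architecture diagram, the ingestion step pulls products from the API and writes them to MySQL. This sketch assumes the hypothetical `/products` endpoint above, the `requests` library, and illustrative column names:

```python
import requests
import mysql.connector

# DB_CONFIG as defined in the Database Configuration section.
DB_CONFIG = {'host': 'localhost', 'database': 'ecommerce_data',
             'user': 'root', 'password': '', 'port': 3306}

# Fetch the merged product feed from the Flask API (assumed endpoint).
products = requests.get("http://127.0.0.1:5000/products", timeout=10).json()

# Bulk-insert into the products table (column names are illustrative).
conn = mysql.connector.connect(**DB_CONFIG)
cur = conn.cursor()
cur.executemany(
    "INSERT IGNORE INTO products (product_id, name, price) VALUES (%s, %s, %s)",
    [(p["product_id"], p["name"], p["price"]) for p in products],
)
conn.commit()
conn.close()
```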
**3. Launch the dashboard**

```bash
streamlit run app.py
```

Opens at: http://localhost:8501
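To give a feel for how `app.py` turns the MySQL data into charts and KPIs, here is a minimal Streamlit sketch; the query and column names are illustrative assumptions rather than the project's actual code:

```python
import mysql.connector
import pandas as pd
import streamlit as st

DB_CONFIG = {'host': 'localhost', 'database': 'ecommerce_data',
             'user': 'root', 'password': '', 'port': 3306}

conn = mysql.connector.connect(**DB_CONFIG)
# Illustrative query: daily revenue from order_items joined with products.
daily = pd.read_sql(
    """SELECT o.order_date, SUM(oi.quantity * p.price) AS revenue
       FROM order_items oi
       JOIN orders o ON o.order_id = oi.order_id
       JOIN products p ON p.product_id = oi.product_id
       GROUP BY o.order_date
       ORDER BY o.order_date""",
    conn,
)
conn.close()

st.title("E-commerce Sales Overview")
st.metric("Total revenue", f"{daily['revenue'].sum():,.2f}")
st.line_chart(daily.set_index("order_date")["revenue"])
```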
## Exploring the Dashboard

- Sales trends over time
- Product-level revenue analysis
- Category & brand performance
- Order volume statistics
## Stopping Services

- Press `Ctrl + C` in the terminals running `api_server.py` and `app.py`
- Stop MySQL & Apache via XAMPP
## Learning Outcomes

- Designing end-to-end data pipelines
- Working with relational databases
- Building REST APIs
- Creating interactive analytics dashboards
- Enforcing data integrity constraints
⭐ If this project helped you learn or build, consider starring the repository!