Turning raw data into measurable business outcomes. Building end-to-end analytics pipelines across Retail, E-Commerce, BFSI, and IT Consulting — from ETL to BI dashboards to verified impact.
I'm a Data Analyst who bridges the gap between "here's what the data shows" and "here's what we're going to do about it." Then I measure the outcome.
Track Record:
- 18-day SLA improvement (₹2.1 Cr value at risk prevented)
- 46% revenue concentration discovered (hidden tail risk flagged)
- 302 at-risk customers identified (enabling proactive retention)
- 8% pricing uplift via cross-platform pricing intelligence
- Zero compliance exceptions (53 HNI portfolios, 18-month period)
- 133 brands mapped across 3 platforms, with per-platform price ranking via SQL window functions
Business Problem: Brand managers and e-commerce analysts had no visibility into how the same perfume brand was being priced across Amazon.in, Flipkart, and Nykaa — making channel and pricing decisions blind.
What I Built:
- Data Collection: Python scraper via SerpApi — anti-bot-safe collection across 3 platforms
- Data Warehouse: SQLite in-notebook warehouse with window function SQL analysis
- BI Dashboard: Looker Studio interactive dashboard (live link below)
- Business Intelligence: Cross-platform brand pricing rank, tier segmentation, arbitrage detection
Business Outcomes:
✅ Nykaa vs Flipkart pricing gap quantified: 2.6x difference in avg price (₹1,969 vs ₹759) — distinct positioning confirmed
✅ 133 unique brands mapped across platforms — highly fragmented market identified
✅ Budget dominance flagged: 53% of Flipkart listings priced under ₹500, with zero luxury listings (channel strategy insight)
✅ 8% pricing uplift potential identified via cross-platform arbitrage analysis
✅ Window function SQL used to rank each brand's pricing per platform independently
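
A minimal sketch of the per-platform ranking idea, assuming a hypothetical `listings` table with `brand`, `platform`, and `price` columns. The rows are made-up sample values, not the scraped data, and the notebook's actual schema may differ.

```python
import sqlite3

# Hypothetical sample rows: (brand, platform, price in INR). Not real scraped data.
rows = [
    ("BrandA", "Nykaa", 2100.0),
    ("BrandA", "Flipkart", 800.0),
    ("BrandB", "Nykaa", 1500.0),
    ("BrandB", "Flipkart", 950.0),
    ("BrandB", "Flipkart", 1050.0),
    ("BrandC", "Nykaa", 3000.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listings (brand TEXT, platform TEXT, price REAL)")
conn.executemany("INSERT INTO listings VALUES (?, ?, ?)", rows)

# Average price per brand per platform, then rank brands inside each platform
# (1 = most expensive). Window functions need SQLite 3.25 or newer.
query = """
WITH brand_avg AS (
    SELECT platform, brand, AVG(price) AS avg_price
    FROM listings
    GROUP BY platform, brand
)
SELECT platform, brand, avg_price,
       RANK() OVER (PARTITION BY platform ORDER BY avg_price DESC) AS price_rank
FROM brand_avg
ORDER BY platform, price_rank
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

`PARTITION BY platform` restarts the ranking for each platform, so a brand can rank first on one channel and last on another.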
Tech Stack:
Python, SerpApi, Pandas, SQLite, Window Functions, Plotly, Looker Studio, Git
🔗 View Full Repo | 📊 Live Dashboard
Business Problem: 106 IT projects were tracked manually, with no early-warning system for SLA breaches. Problems surfaced only after deadlines slipped, leaving teams scrambling at the last minute.
What I Built:
- ETL Pipeline: Python (Pandas) ingestion from raw project data
- Data Warehouse: Star schema in MSSQL + 9 optimized CTE queries + 4 analytics views
- BI Dashboard: Power BI with 8 pages — project health, delay forecast, breach probability, sector risk, employee utilization
- Automation: Auto-generated executive PowerPoint deck + HTML email report via SMTP after every pipeline run
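
A rough sketch of how the HTML email step could be assembled with Python's standard library. The function name `build_report_email`, the addresses, and the numbers are placeholders for illustration, and the real pipeline's report content is not shown here.

```python
from email.message import EmailMessage


def build_report_email(sender, recipients, flagged_count, avg_delay_days):
    """Build a plain-text + HTML report email (does not send it)."""
    msg = EmailMessage()
    msg["Subject"] = f"SLA report: {flagged_count} projects at breach risk"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    # Plain-text body first, for mail clients that do not render HTML.
    msg.set_content(
        f"{flagged_count} projects flagged. Average delay: {avg_delay_days:.1f} days."
    )
    # HTML alternative shown by most clients.
    msg.add_alternative(
        f"<h2>SLA Report</h2><p><b>{flagged_count}</b> projects flagged.</p>"
        f"<p>Average delay: {avg_delay_days:.1f} days.</p>",
        subtype="html",
    )
    return msg


# Placeholder addresses and illustrative numbers.
msg = build_report_email("reports@example.com", ["pm@example.com"], 5, 12.5)
print(msg["Subject"])

# Sending would look roughly like this (host, port, credentials are placeholders):
# import smtplib
# with smtplib.SMTP("smtp.example.com", 587) as server:
#     server.starttls()
#     server.login("user", "password")
#     server.send_message(msg)
```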
Business Outcomes:
✅ Flagged 69 projects at SLA-breach risk (proactive identification)
✅ Teams reshaped timelines in advance → 18-day average delay reduction
✅ Quantified impact: ₹2.1 Cr value at risk prevented
✅ 40 hours/month saved (eliminated manual reporting across 8 teams)
✅ BFSI sector flagged as worst performer (39.6 day avg delay) — resource reallocation trigger
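
A minimal sketch of a sector-delay CTE of the kind described above. SQLite stands in for MSSQL so the example runs anywhere; the `projects` table, its columns, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE projects (project_id INTEGER, sector TEXT, "
    "planned_days INTEGER, actual_days INTEGER)"
)
# Made-up rows for illustration only.
conn.executemany(
    "INSERT INTO projects VALUES (?, ?, ?, ?)",
    [
        (1, "BFSI", 60, 100),
        (2, "BFSI", 90, 130),
        (3, "Retail", 45, 50),
        (4, "Retail", 30, 45),
        (5, "Healthcare", 80, 80),
    ],
)

# The CTE computes each project's delay; the outer query averages it per sector.
query = """
WITH delays AS (
    SELECT sector, actual_days - planned_days AS delay_days
    FROM projects
)
SELECT sector,
       ROUND(AVG(delay_days), 1) AS avg_delay_days,
       COUNT(*) AS project_count
FROM delays
GROUP BY sector
ORDER BY avg_delay_days DESC
"""
sector_delays = conn.execute(query).fetchall()
for row in sector_delays:
    print(row)
```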
Tech Stack:
Python, Pandas, MSSQL, CTEs, Window Functions, Power BI, DAX, python-pptx, SMTP, Git
Business Problem: Job platforms have no visibility into where candidates drop off, which employers ghost applicants, or what engagement patterns predict churn — making product decisions guesswork.
What I Built:
- Data Simulation: Python-generated synthetic job marketplace dataset (6-stage funnel)
- Analytics Layer: SQL-based funnel metrics, cohort retention, ghosting detection
- Visualizations: Cohort retention heatmaps, engagement segmentation, ghosting-to-churn Sankey diagram
- Live App: Streamlit dashboard deployed and publicly accessible
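
A small sketch of stage-to-stage funnel drop-off. The six stage names and the counts are assumptions for illustration; the synthetic dataset's actual stages may be named differently.

```python
# Hypothetical stage names and candidate counts for a 6-stage funnel.
funnel = [
    ("applied", 1000),
    ("screened", 600),
    ("shortlisted", 300),
    ("interviewed", 150),
    ("offered", 60),
    ("hired", 45),
]


def stage_dropoff(stages):
    """Return (from_stage, to_stage, drop_off_rate) for each consecutive pair."""
    results = []
    for (prev_name, prev_n), (next_name, next_n) in zip(stages, stages[1:]):
        # Share of candidates lost between the two stages.
        rate = 1 - next_n / prev_n if prev_n else 0.0
        results.append((prev_name, next_name, round(rate, 3)))
    return results


dropoffs = stage_dropoff(funnel)
overall_conversion = funnel[-1][1] / funnel[0][1]

for step in dropoffs:
    print(step)
print("overall conversion:", overall_conversion)
```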
Business Outcomes:
✅ 6-stage funnel analytics — drop-off quantified at every stage from apply to hire
✅ Cohort retention heatmaps — identified which cohorts churned fastest and why
✅ Ghosting pattern analysis — employer ghosting mapped to churn probability
✅ Live Streamlit deployment — accessible without local setup
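
A minimal sketch of the calculation behind a cohort retention heatmap, assuming activity events shaped as `(user_id, signup_cohort, months_since_signup)`. That shape and the sample events are hypothetical.

```python
from collections import defaultdict

# Hypothetical activity events: (user_id, signup_cohort, months_since_signup).
events = [
    ("u1", "2024-01", 0), ("u1", "2024-01", 1), ("u1", "2024-01", 2),
    ("u2", "2024-01", 0), ("u2", "2024-01", 1),
    ("u3", "2024-02", 0),
    ("u4", "2024-02", 0), ("u4", "2024-02", 1),
]


def retention_matrix(events):
    """Map cohort -> {month offset -> share of the cohort active that month}."""
    active = defaultdict(lambda: defaultdict(set))
    for user, cohort, offset in events:
        active[cohort][offset].add(user)

    matrix = {}
    for cohort, by_offset in active.items():
        # Cohort size = distinct users active in their signup month.
        cohort_size = len(by_offset.get(0, ()))
        if not cohort_size:
            continue  # skip cohorts with no month-0 activity
        matrix[cohort] = {
            offset: len(users) / cohort_size
            for offset, users in sorted(by_offset.items())
        }
    return matrix


retention = retention_matrix(events)
for cohort, row in retention.items():
    print(cohort, row)
```

Each row of the resulting mapping corresponds to one row of the heatmap, with month offsets as columns.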
Tech Stack:
Python, SQLite, SQL, Pandas, Streamlit, Plotly, Git
🔗 View Full Repo | 🚀 Live App
Business Problem: A retail business had 50,000 transactions (₹1.62 Cr revenue) but no visibility into SKU performance, customer health, or demand drivers.
What I Built:
- ETL Pipeline: Python (Polars) processing 50K transactions into a clean data lake
- Data Warehouse: DuckDB with Parquet optimization for fast queries
- BI Dashboards: Power BI + DAX for SKU, customer, and revenue analysis
- AI Layer: LangChain + Groq (LLaMA 3.3) for natural-language querying
Business Outcomes:
✅ Revenue concentration risk flagged: Top 3 SKUs = 46% of revenue → inventory planning fixed
✅ 302 at-risk customers identified (30.2% showing churn signals) → proactive retention enabled
✅ VIP + Loyal customers = 94% of total revenue → strategic focus validated
✅ AI accessibility: Natural-language agent lets business users query data without SQL
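
A short sketch of the revenue-concentration check, written in plain Python rather than Polars or DuckDB so it runs without extra installs. The SKU names and revenue figures are invented for illustration.

```python
def top_n_share(revenue_by_sku, n=3):
    """Share of total revenue contributed by the n highest-revenue SKUs."""
    total = sum(revenue_by_sku.values())
    if total == 0:
        return 0.0
    top = sorted(revenue_by_sku.values(), reverse=True)[:n]
    return sum(top) / total


# Hypothetical revenue per SKU (any currency unit).
sales = {"SKU-A": 400, "SKU-B": 250, "SKU-C": 150, "SKU-D": 120, "SKU-E": 80}

share = top_n_share(sales, n=3)
print(f"Top 3 SKUs = {share:.0%} of revenue")
```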
Tech Stack:
Python, Polars, DuckDB, Parquet, Power BI, DAX, LangChain, Groq, Git
- Managed portfolio operations and advisory workflows for 53 HNI clients (₹5 Cr AUM)
- Built risk-scoring models in Python to vet portfolio allocations and detect compliance drift
- Created real-time monitoring dashboards; recorded zero compliance exceptions over an 18-month operational period
- Validated advisory outcomes: portfolios outperformed benchmark by 240 basis points during down-market cycles
- Skills: Python, SQL, Risk Analysis, Compliance Monitoring, Power BI, Excel (Advanced)
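
One simple way to express an allocation drift check, sketched under assumptions: the function name `allocation_drift`, the asset classes, the target weights, and the 5-point tolerance are all hypothetical and do not describe the actual scoring model.

```python
def allocation_drift(holdings, targets, tolerance=0.05):
    """Return {asset: drift} for assets whose weight is off target by more than tolerance."""
    total = sum(holdings.values())
    if total <= 0:
        raise ValueError("portfolio value must be positive")
    breaches = {}
    for asset, target in targets.items():
        weight = holdings.get(asset, 0.0) / total
        drift = weight - target  # positive = overweight, negative = underweight
        if abs(drift) > tolerance:
            breaches[asset] = round(drift, 4)
    return breaches


# Hypothetical portfolio values and target weights.
holdings = {"equity": 700_000, "debt": 200_000, "gold": 100_000}
targets = {"equity": 0.60, "debt": 0.30, "gold": 0.10}

breaches = allocation_drift(holdings, targets)
print(breaches)
```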
- Conducted fundamental analysis across 20+ financial models (Auto, FMCG, Hospitality sectors)
- Built equity valuation models: DCF, comparable company analysis, precedent transactions
- Recommendations influenced ₹15+ Cr investment decisions
- Skills: Financial Modeling, DCF Analysis, Sector Research, Excel (Advanced), Data Analysis
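
For reference, a textbook DCF with a Gordon-growth terminal value can be sketched as below. The cash flows, discount rate, and growth rate are illustrative inputs, not figures from any model mentioned above.

```python
def dcf_value(cash_flows, discount_rate, terminal_growth):
    """Present value of forecast cash flows plus a Gordon-growth terminal value."""
    if not cash_flows:
        raise ValueError("at least one forecast cash flow is required")
    if discount_rate <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")
    # Discount each forecast year's cash flow back to today (year 1, 2, ...).
    pv_forecast = sum(
        cf / (1 + discount_rate) ** year
        for year, cf in enumerate(cash_flows, start=1)
    )
    # Terminal value at the end of the forecast, then discounted to today.
    terminal = cash_flows[-1] * (1 + terminal_growth) / (discount_rate - terminal_growth)
    pv_terminal = terminal / (1 + discount_rate) ** len(cash_flows)
    return pv_forecast + pv_terminal


# Illustrative inputs: two forecast years, 10% discount rate, zero terminal growth.
value = dcf_value([110.0, 121.0], discount_rate=0.10, terminal_growth=0.0)
print(round(value, 2))
```

The guard on `discount_rate <= terminal_growth` matters because the terminal value formula is undefined or negative in that case.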
| Category | Tools |
|---|---|
| Languages | Python, SQL |
| Data Processing | Pandas, Polars, NumPy, DuckDB |
| Databases | PostgreSQL, MSSQL, SQLite, DuckDB |
| BI & Visualization | Power BI, DAX, Looker Studio, Plotly, Excel (Advanced) |
| Data Modeling | Star Schema, CTEs, Window Functions |
| Cloud & DevOps | Azure, AWS S3, Amazon Athena, Git |
| AI & Emerging | LangChain, Groq (LLaMA 3.3), Agentic AI Workflows |
| Deployment | Streamlit, SMTP Automation, python-pptx |
🏆 SQL Advanced — HackerRank (Verified, 2026)
- Understand the business problem first — before touching data
- Build end-to-end pipelines — ETL → warehouse → BI → insight → action
- Quantify outcomes — not "we built a dashboard," but "this saved 40 hours/month"
- Bridge the gap between data and decisions: dashboards are useless if nobody acts on them
- Verify impact — track actual business outcomes after analysis
- Data Analyst roles (full-time)
- MIS Executive positions
- Business Intelligence Analyst / Analytics Coordinator roles
- Remote or Mumbai-based opportunities
Teams that care about measurable business impact, not just dashboards. If you need someone who turns data into decisions, let's talk.
- 📧 Email: muditthakur918@gmail.com
- 🔗 LinkedIn: linkedin.com/in/mudit-thakur1
- 💻 HackerRank: HackerRank Profile
- 📍 Location: Mumbai, India
- 13 months of professional analytics experience (Mutual Fund advisory + Equity Research)
- 10 months: Mutual Fund Back Office Executive (53 HNI clients, ₹5 Cr AUM, risk modeling, compliance)
- 3 months: Equity Research Intern (20+ financial models, DCF analysis, sector research)
- 4 end-to-end portfolio projects with verified business outcomes
- Expert in SQL optimization — window functions, CTEs, query performance tuning
- Python pipelines for ETL, analysis, and AI agent integration
- Power BI dashboards that stakeholders actually use and act on
- SQL Advanced Certified — HackerRank (2026)