This repository contains applied statistical analysis projects using real-world time-series data across multiple domains, including finance and environmental studies.
The focus is on statistical reasoning, interpretability, and time-varying behavior, rather than prediction or black-box modeling.
The objective is to apply core concepts from probability and statistics, such as rolling statistics, volatility, and z-scores, to analyze variability and extreme events in real-world time-series data.
The projects in this repository use publicly available datasets, including:
- Historical daily price data of the NIFTY 50 index
- Air quality measurements (PM2.5) from Delhi air pollution datasets
Raw data files are not included due to licensing considerations.
Tools:
- Python
- pandas, numpy
- matplotlib
- Jupyter Notebook
Notebook: stock_analysis.ipynb
Focus:
- Computation of daily percentage returns
- Mean and standard deviation analysis
- Return distribution visualization
- Identification of tail events
Key Takeaways:
- Average daily returns are close to zero relative to their standard deviation, which is consistent with efficient market behavior but does not by itself establish it
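- Return distributions exhibit fat tails: extreme returns occur more often than a normal distribution with the same mean and variance would predict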
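- Volatility (the standard deviation of returns) is a useful summary of market risk, although it understates tail risk when returns are fat-tailed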
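The steps listed for stock_analysis.ipynb can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names `daily_returns` and `tail_events`, the 3-standard-deviation cutoff, and the synthetic price series (used because the raw data is not included) are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def daily_returns(close: pd.Series) -> pd.Series:
    """Daily percentage returns from a series of closing prices."""
    return close.pct_change().dropna() * 100.0


def tail_events(returns: pd.Series, k: float = 3.0) -> pd.Series:
    """Returns lying more than k sample standard deviations from the mean.

    The default k = 3 is an illustrative choice, not a value from the notebook.
    """
    z = (returns - returns.mean()) / returns.std()
    return returns[z.abs() > k]


# Synthetic heavy-tailed prices standing in for the index data (assumption).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=500)
log_moves = rng.standard_t(df=3, size=500) * 0.01
close = pd.Series(10_000 * np.exp(np.cumsum(log_moves)), index=dates, name="close")

returns = daily_returns(close)
print("mean (%):", returns.mean())
print("std  (%):", returns.std())
# Excess kurtosis above 0 indicates fatter tails than a normal distribution.
print("excess kurtosis:", returns.kurt())
print(tail_events(returns))
```

With real data, `close` would instead be the closing-price column loaded from the downloaded price file.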
Notebook: volatility_zscore.ipynb
Focus:
- Rolling mean and rolling volatility estimation
- Time-varying risk analysis
- Z-score based detection of statistically extreme events
Key Takeaways:
- Volatility is not constant over time
- Extreme events cluster during high-risk regimes
- Constant-variance assumptions are violated in real data
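A rolling z-score of the kind described for volatility_zscore.ipynb might look like the sketch below. The function name `rolling_zscore`, the 20-day window, and the synthetic two-regime returns are assumptions for illustration. The sketch also compares each day with the window that ends on the previous day, so an extreme value does not inflate its own baseline; the notebook may make a different choice.

```python
import numpy as np
import pandas as pd


def rolling_zscore(returns: pd.Series, window: int = 20) -> pd.DataFrame:
    """Rolling mean, rolling volatility, and z-score of each return.

    The z-score uses the statistics of the `window` observations
    preceding each day (hence the shift by one).
    """
    roll = returns.rolling(window)
    mean = roll.mean()
    vol = roll.std()
    z = (returns - mean.shift(1)) / vol.shift(1)
    return pd.DataFrame(
        {"return": returns, "rolling_mean": mean, "rolling_vol": vol, "zscore": z}
    )


# Synthetic returns with a calm regime followed by a turbulent one (assumption).
rng = np.random.default_rng(42)
calm = rng.normal(0.0, 0.5, size=250)
turbulent = rng.normal(0.0, 2.0, size=250)
demo_returns = pd.Series(
    np.concatenate([calm, turbulent]),
    index=pd.bdate_range("2021-01-01", periods=500),
)

stats = rolling_zscore(demo_returns, window=20)
extreme = stats[stats["zscore"].abs() > 3.0]
print(stats["rolling_vol"].describe())
print(extreme)
```

Plotting `stats["rolling_vol"]` over time is one way to see that volatility is not constant across the two regimes.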
Notebook: air_quality_anomaly.ipynb
Focus:
- Time-series analysis of daily PM2.5 concentrations
- Rolling variability and volatility analysis
- Z-score based detection of extreme pollution events
Key Takeaways:
- Air pollution variability is time-dependent and non-constant
- Extreme pollution episodes occur in clusters
- Statistical anomaly detection methods are transferable beyond finance
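The same approach carries over to PM2.5 data. The sketch below flags days that sit far above a trailing baseline and then groups consecutive flagged days into episodes, which is one simple way to look at clustering. The names `flag_extremes` and `episodes`, the 30-day window, the threshold of 3, and the synthetic series are assumptions for illustration, not the notebook's implementation.

```python
import numpy as np
import pandas as pd


def flag_extremes(pm25: pd.Series, window: int = 30, threshold: float = 3.0) -> pd.Series:
    """Flag days whose PM2.5 z-score against the trailing window exceeds threshold."""
    baseline = pm25.rolling(window).mean().shift(1)
    spread = pm25.rolling(window).std().shift(1)
    z = (pm25 - baseline) / spread
    return z > threshold  # NaN z-scores (warm-up period) compare as False


def episodes(flags: pd.Series) -> pd.DataFrame:
    """Collapse runs of consecutive flagged days into start/end/days rows."""
    rows = []
    start, last, count = None, None, 0
    for day, is_extreme in flags.items():
        if is_extreme:
            if start is None:
                start, count = day, 0
            last = day
            count += 1
        elif start is not None:
            rows.append({"start": start, "end": last, "days": count})
            start = None
    if start is not None:  # series ended while still inside an episode
        rows.append({"start": start, "end": last, "days": count})
    return pd.DataFrame(rows, columns=["start", "end", "days"])


# Synthetic daily PM2.5 with one injected pollution episode (assumption).
rng = np.random.default_rng(1)
days = pd.date_range("2022-01-01", periods=200, freq="D")
pm25 = pd.Series(80 + rng.normal(0, 10, size=200), index=days)
pm25.iloc[120:124] += 150

print(episodes(flag_extremes(pm25)))
```

Because each spike raises the trailing standard deviation, later days of a long episode can fall back under the threshold; a longer window or a robust spread estimate would reduce that effect.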
Concepts covered:
- Time-series analysis
- Rolling mean and rolling standard deviation
- Volatility and variability modeling
- Z-scores and anomaly detection
- Tail risk and extreme event interpretation
Outcomes:
- Applied statistical theory to real-world datasets across domains
- Developed intuition for time-varying risk and variability
- Gained experience handling messy, real-world data
- Strengthened ability to interpret and communicate statistical results
Mayank Kochar
Mathematics background with interest in applied statistics, data analysis, and time-series modeling