This project implements and compares several commonly used volatility estimators using real OHLC (Open, High, Low, Close) market data. Volatility is a fundamental concept in quantitative finance, serving as a proxy for uncertainty and risk. Because true volatility is unobservable, it must be estimated from historical price data. Different estimators use different information sets and assumptions, leading to distinct behavior in practice.
The goal of this project is to study how these estimators behave empirically, highlight their trade-offs, and understand when each is most appropriate.
Volatility measures the magnitude of price fluctuations over time. High volatility corresponds to large and frequent price changes, while low volatility indicates more stable prices. In trading and risk management, volatility is closely associated with uncertainty and potential risk.
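Let \( P_t \) denote the asset price at time \( t \). The log return is defined as
\[ r_t = \ln\left(\frac{P_t}{P_{t-1}}\right) \]
Volatility is defined as the standard deviation of returns:
\[ \sigma = \sqrt{\operatorname{Var}(r_t)} \]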
Because the true return distribution is unknown, volatility must be estimated from observed data.
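As a minimal sketch of these two definitions, the snippet below computes log returns from a price series and takes their sample standard deviation. The prices are made-up illustrative values, the function names are hypothetical, and the factor of 252 trading days per year used for annualization is a common convention assumed here rather than something the project specifies.

```python
import math
import statistics


def log_returns(prices):
    """Return ln(P_t / P_{t-1}) for each pair of consecutive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def sample_volatility(returns):
    """Sample standard deviation of returns (n - 1 in the denominator)."""
    return statistics.stdev(returns)


prices = [100.0, 101.0, 99.5, 100.5, 102.0]  # illustrative values, not real data
returns = log_returns(prices)
daily_vol = sample_volatility(returns)
annualized_vol = daily_vol * math.sqrt(252)  # assumes 252 trading days per year
```

Log returns add up over time, so their sum equals the log of the last price divided by the first, which is a convenient sanity check.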
Volatility is not directly observable, and market data is discrete, noisy, and incomplete. Different estimators:
- use different subsets of available price information,
- make different assumptions about price dynamics,
- respond differently to regime changes and shocks.
As a result, there is no single “correct” volatility, only estimates that are useful for specific purposes.
This estimator uses only closing prices and computes the standard deviation of log returns over a rolling window.
Pros
- Simple and widely understood
- Serves as a baseline estimator
Cons
- Ignores intraday price movements
- Can be noisy and slow to react to volatility spikes
Interpretation:
“How much do closing prices vary from day to day?”
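The close-to-close estimator can be sketched in a few lines. The version below uses only the standard library to stay self-contained. The function name, the 20-day default window and the 252-day annualization factor are assumptions for illustration, not values taken from the project.

```python
import math
import statistics


def rolling_close_to_close_vol(closes, window=20, periods_per_year=252):
    """Annualized rolling standard deviation of close-to-close log returns.

    Returns one value for each full window of `window` consecutive returns.
    """
    if window < 2:
        raise ValueError("window must contain at least two returns")
    returns = [math.log(c1 / c0) for c0, c1 in zip(closes, closes[1:])]
    vols = []
    for end in range(window, len(returns) + 1):
        window_returns = returns[end - window:end]
        daily_vol = statistics.stdev(window_returns)
        vols.append(daily_vol * math.sqrt(periods_per_year))
    return vols
```

A series of `n` closes yields `n - 1` returns and therefore `n - window` volatility values, so the first estimate is only available after `window + 1` closes.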
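Exponentially Weighted Moving Average (EWMA) volatility assigns greater weight to recent returns:
\[ \sigma_t^2 = \lambda \sigma_{t-1}^2 + (1 - \lambda) r_t^2 \]
where \( r_t \) is the log return at time \( t \) and \( \lambda \in (0, 1) \) is the decay parameter.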
Pros
- Responds quickly to market shocks
- Widely used in risk management systems
Cons
- Still relies only on closing prices
- Sensitive to the choice of decay parameter \( \lambda \)
Interpretation:
“How volatile is the market right now, with emphasis on recent movements?”
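The sketch below shows one way to compute EWMA volatility, assuming the common recursion in which today's variance is `lam` times yesterday's variance plus `(1 - lam)` times today's squared return. The default `lam=0.94` is the value popularized by RiskMetrics for daily data, and seeding the recursion with the first squared return is a simplifying assumption. The project may index or initialize the recursion differently.

```python
import math


def ewma_volatility(returns, lam=0.94):
    """EWMA volatility path, one value per return.

    Recursion: sigma2_t = lam * sigma2_{t-1} + (1 - lam) * r_t ** 2,
    seeded with the first squared return (an assumption of this sketch).
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    if not returns:
        raise ValueError("returns must not be empty")
    sigma2 = returns[0] ** 2
    vols = [math.sqrt(sigma2)]
    for r in returns[1:]:
        sigma2 = lam * sigma2 + (1.0 - lam) * r * r
        vols.append(math.sqrt(sigma2))
    return vols
```

A smaller `lam` puts more weight on the latest return, so the estimate jumps further after a shock and also decays faster afterwards. This is the sensitivity to the decay parameter noted in the list above.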
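The Parkinson estimator uses the daily high–low price range:
\[ \sigma_P^2 = \frac{1}{4 \ln 2} \cdot \frac{1}{N} \sum_{i=1}^{N} \left( \ln \frac{H_i}{L_i} \right)^2 \]
where \( H_i \) and \( L_i \) are the high and low prices on day \( i \), and \( N \) is the number of days in the window.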
Pros
- More efficient than close-to-close volatility
- Uses intraday information
Cons
- Assumes no drift
- Ignores opening gaps
Interpretation:
“How wide was the trading range during the day?”
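A minimal sketch of the Parkinson estimator follows. It averages the squared log high–low range over the sample and scales by `1 / (4 ln 2)`. The function name and the 252-day annualization factor are assumptions. For a rolling series, the same function could be applied to each window in turn.

```python
import math


def parkinson_volatility(highs, lows, periods_per_year=252):
    """Annualized Parkinson volatility over the whole sample."""
    if len(highs) != len(lows) or not highs:
        raise ValueError("highs and lows must be non-empty and of equal length")
    sum_sq_range = sum(math.log(h / l) ** 2 for h, l in zip(highs, lows))
    daily_var = sum_sq_range / (4.0 * math.log(2.0) * len(highs))
    return math.sqrt(daily_var * periods_per_year)
```

Because only the high and low enter the formula, a day that gaps at the open but trades in a narrow range contributes little, which is the "ignores opening gaps" limitation listed above.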
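The Garman–Klass estimator uses Open, High, Low, and Close prices to improve efficiency:
\[ \sigma_{GK}^2 = \frac{1}{N} \sum_{i=1}^{N} \left[ \frac{1}{2} \left( \ln \frac{H_i}{L_i} \right)^2 - (2 \ln 2 - 1) \left( \ln \frac{C_i}{O_i} \right)^2 \right] \]
where \( O_i \), \( H_i \), \( L_i \), and \( C_i \) are the open, high, low, and close prices on day \( i \).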
Pros
- More efficient than Parkinson
- Incorporates directional movement
Cons
- Assumes continuous trading
- Sensitive to overnight price jumps
Interpretation:
“How volatile was intraday trading, accounting for direction?”
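The sketch below assumes the OHLC estimator described here is Garman–Klass, which the results discussion names alongside Parkinson. It combines half the squared log high–low range with a correction of `(2 ln 2 - 1)` times the squared log open-to-close return. The function name and the 252-day annualization factor are assumptions.

```python
import math


def garman_klass_volatility(opens, highs, lows, closes, periods_per_year=252):
    """Annualized Garman-Klass volatility over the whole sample."""
    n = len(closes)
    if n == 0 or not (len(opens) == len(highs) == len(lows) == n):
        raise ValueError("OHLC inputs must be non-empty and of equal length")
    total = 0.0
    for o, h, l, c in zip(opens, highs, lows, closes):
        range_term = 0.5 * math.log(h / l) ** 2
        open_close_term = (2.0 * math.log(2.0) - 1.0) * math.log(c / o) ** 2
        total += range_term - open_close_term
    # For valid bars (low <= open, close <= high) the average is non-negative.
    return math.sqrt(total / n * periods_per_year)
```

Only same-day prices enter each term, so a move between one day's close and the next day's open is invisible to this estimator.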
The Yang–Zhang estimator separates volatility into:
- overnight volatility,
- open-to-close volatility,
- intraday volatility (Rogers–Satchell).
This makes it robust to opening gaps and discontinuous trading.
Pros
- Handles overnight jumps explicitly
- One of the most robust practical estimators
Cons
- More complex
- Requires full OHLC data
Interpretation:
“How volatile was the market, accounting for both intraday trading and overnight information?”
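The three components can be combined as in the sketch below, which follows the commonly cited form of the estimator: overnight variance plus `k` times open-to-close variance plus `(1 - k)` times the Rogers–Satchell variance, with `k = 0.34 / (1.34 + (n + 1) / (n - 1))`. The function name, the input layout (the first bar only supplies the previous close) and the 252-day annualization factor are assumptions of this sketch.

```python
import math
import statistics


def yang_zhang_volatility(opens, highs, lows, closes, periods_per_year=252):
    """Annualized Yang-Zhang volatility over the whole sample."""
    if not (len(opens) == len(highs) == len(lows) == len(closes)):
        raise ValueError("OHLC inputs must have equal length")
    n = len(closes) - 1  # the first bar only supplies the previous close
    if n < 2:
        raise ValueError("need at least three bars")
    days = range(1, n + 1)
    overnight = [math.log(opens[i] / closes[i - 1]) for i in days]
    open_to_close = [math.log(closes[i] / opens[i]) for i in days]
    rogers_satchell = [
        math.log(highs[i] / closes[i]) * math.log(highs[i] / opens[i])
        + math.log(lows[i] / closes[i]) * math.log(lows[i] / opens[i])
        for i in days
    ]
    var_overnight = statistics.variance(overnight)
    var_open_to_close = statistics.variance(open_to_close)
    var_rs = sum(rogers_satchell) / n
    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    daily_var = var_overnight + k * var_open_to_close + (1.0 - k) * var_rs
    return math.sqrt(daily_var * periods_per_year)
```

If prices only move overnight and each session is completely flat, the intraday terms vanish and the estimate reduces to the overnight standard deviation. A purely range-based estimator would report zero volatility for the same data.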
Historical OHLC data is retrieved using the yfinance library. Estimators are
evaluated on liquid assets (e.g., SPY), allowing comparison across different
market regimes and volatility environments.
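A minimal sketch of the download step is shown below, using `Ticker.history` from yfinance. The function name `load_ohlc`, the date range and the choice of `auto_adjust=True` (prices adjusted for splits and dividends) are assumptions for illustration. The project may use different settings or `yf.download` instead.

```python
OHLC_COLUMNS = ["Open", "High", "Low", "Close"]


def load_ohlc(ticker="SPY", start="2015-01-01", end="2024-01-01"):
    """Download daily OHLC bars for one ticker as a pandas DataFrame.

    Requires network access and the yfinance package. The default dates
    are hypothetical and only serve as an example.
    """
    import yfinance as yf  # imported lazily so this module loads without yfinance

    history = yf.Ticker(ticker).history(start=start, end=end, auto_adjust=True)
    # Keep only the four price columns and drop rows with missing values.
    return history[OHLC_COLUMNS].dropna()
```

Calling `load_ohlc("SPY")` would return a date-indexed frame whose columns can be passed to the estimators, for example via `ohlc["High"].tolist()`.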
The estimators exhibit systematic differences:
- EWMA responds fastest to volatility shocks
- Range-based estimators (Parkinson, Garman–Klass) are smoother and statistically more efficient than close-to-close estimates
- Yang–Zhang handles overnight gaps more robustly
These differences highlight how estimator choice depends on the intended application.