Hosted at: https://everything-applied-ds.pages.dev/
🎮 Interactive tutorials for practical data science concepts
Learn applied data science through 70+ interactive tutorials covering statistical models, optimization, machine learning, and more — using sports betting as a rich applied use case to ground abstract concepts in real-world scenarios.
# Install dependencies
npm install
# Start development server
npm run dev
# Open http://localhost:5173
- 70+ Concepts organized into 13 categories
- Interactive Demos with real-time parameter adjustment
- Synthetic Data generators for hands-on experimentation
- Progress Tracking saved in your browser
- Notes System to capture insights as you learn
- R Code Examples for each concept
Sports betting provides an ideal sandbox for learning applied data science:
- Probability & Statistics — odds, expected value, Bayesian updating
- Optimization — Kelly criterion, portfolio allocation
- Machine Learning — predictions, calibration, feature engineering
- Risk Management — VaR, correlation, stress testing
- Econometrics — causal inference, A/B testing, panel data
The concepts transfer directly to finance, operations research, and decision science.
src/
├── routes/ # Page routes
│ ├── models/ # Core statistical models
│ ├── pricing/ # Decision frameworks
│ ├── risk/ # Risk management
│ ├── ml/ # Machine learning
│ └── ...
├── lib/
│ ├── components/ # Reusable UI components
│ ├── data/ # Navigation & config
│ └── utils/ # Synthetic data utilities
- Foundations - Expected Value, Implied Probability, Regression
- Core Models - Bayesian Updating, Monte Carlo, Time Series
- Decision Making - Kelly Criterion, Optimization, Game Theory
- Risk - Correlation, VaR, Stress Testing
- Advanced - ML Calibration, Causal Inference, MLOps
- SvelteKit 2 with TypeScript
- Tailwind CSS for styling
- Chart.js for visualizations
- Browser Storage for notes & progress
Prior beliefs updated with new data. Used for: real-time line adjustments as information arrives (injury news, weather, sharp money).
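As a minimal sketch of the idea, assuming a Beta prior over a win probability and binary outcomes (the `BetaBelief` type and both function names are hypothetical, not taken from this project's code), a conjugate update might look like this:

```typescript
// Beta prior over a win probability; alpha and beta act as pseudo-counts.
interface BetaBelief {
  alpha: number; // prior "successes" plus observed successes
  beta: number; // prior "failures" plus observed failures
}

// Conjugate Beta-Binomial update: add observed counts to the pseudo-counts.
function updateBelief(prior: BetaBelief, successes: number, failures: number): BetaBelief {
  return { alpha: prior.alpha + successes, beta: prior.beta + failures };
}

// Posterior mean of the win probability.
function meanOf(belief: BetaBelief): number {
  return belief.alpha / (belief.alpha + belief.beta);
}

// Hypothetical example: a prior centered on 50%, then 7 hits in 10 games.
const priorBelief: BetaBelief = { alpha: 5, beta: 5 };
const posteriorBelief = updateBelief(priorBelief, 7, 3);
console.log(meanOf(posteriorBelief).toFixed(3)); // 12 / 20 = 0.600
```

A larger `alpha + beta` in the prior makes the estimate move more slowly as new results arrive.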
Random sampling to model uncertainty and estimate probability distributions. Used for: stress testing pricing under different scenarios, estimating win rate distributions.
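The following sketch estimates a win rate distribution by simulation. It assumes independent bets with a fixed win probability, and it uses a small seeded generator (mulberry32) so runs are reproducible. All names and parameter values are illustrative.

```typescript
// Small seeded PRNG (mulberry32) so the simulation is reproducible.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Simulate nSims runs of nBets independent bets, each won with probability pWin.
// Returns the simulated win rates sorted ascending.
function simulateWinRates(pWin: number, nBets: number, nSims: number, seed = 42): number[] {
  const rand = mulberry32(seed);
  const rates: number[] = [];
  for (let s = 0; s < nSims; s++) {
    let wins = 0;
    for (let b = 0; b < nBets; b++) {
      if (rand() < pWin) wins++;
    }
    rates.push(wins / nBets);
  }
  return rates.sort((a, b) => a - b);
}

// Empirical quantile by nearest index in an ascending-sorted sample.
function quantile(sorted: number[], q: number): number {
  const idx = Math.min(sorted.length - 1, Math.max(0, Math.round(q * (sorted.length - 1))));
  return sorted[idx];
}

const simulatedRates = simulateWinRates(0.55, 100, 2000);
console.log(`5th pct: ${quantile(simulatedRates, 0.05)}, 95th pct: ${quantile(simulatedRates, 0.95)}`);
```

The gap between the two percentiles shows how much a 100-bet win rate can swing from luck alone.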
- Linear: Continuous outcomes (total points scored)
- Logistic: Binary outcomes (over/under hit rate)
- Poisson: Count data (goals, touchdowns, strikeouts)
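To illustrate the count-data case, here is a hedged sketch that turns a Poisson rate into an over probability for a half-point line. The rate `lambda` would normally come from a fitted Poisson regression; here it is simply passed in, and the function names are hypothetical.

```typescript
// Poisson probability mass P(X = k), computed iteratively to avoid large factorials.
function poissonPmf(k: number, lambda: number): number {
  let p = Math.exp(-lambda);
  for (let i = 1; i <= k; i++) p *= lambda / i;
  return p;
}

// P(X > line) for a half-point line such as 2.5 goals.
function probOver(line: number, lambda: number): number {
  let cdf = 0;
  for (let k = 0; k <= Math.floor(line); k++) cdf += poissonPmf(k, lambda);
  return 1 - cdf;
}

// Hypothetical example: expected total of 2.7 goals against a 2.5 line.
console.log(probOver(2.5, 2.7).toFixed(3));
```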
Forecasting based on temporal patterns. Used for: player performance trends, seasonal adjustments, recent form weighting.
Combining multiple models for better predictions. Used for: player projection models that blend multiple data sources.
Complex pattern recognition. Used for: advanced player projections, image recognition (injury assessment from video), NLP on news/social.
Time-until-event modeling. Used for: injury return timelines, player career arcs, customer lifetime value.
State transition probabilities. Used for: game flow simulation, possession-based modeling, player state transitions.
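A minimal sketch of a state transition step, assuming a row-stochastic matrix and a made-up two-state possession chain (the states and probabilities are illustrative only):

```typescript
// Row-stochastic transition matrix: P[i][j] = probability of moving from state i to state j.
type TransitionMatrix = number[][];

// Advance a probability distribution over states by one transition.
function stepDistribution(dist: number[], P: TransitionMatrix): number[] {
  return dist.map((_, j) => dist.reduce((sum, mass, i) => sum + mass * P[i][j], 0));
}

// Hypothetical two-state chain: state 0 = team A has possession, state 1 = team B.
const possessionMatrix: TransitionMatrix = [
  [0.7, 0.3],
  [0.4, 0.6],
];

let possessionDist = [1, 0]; // start with team A in possession
for (let t = 0; t < 2; t++) possessionDist = stepDistribution(possessionDist, possessionMatrix);
console.log(possessionDist.map((x) => x.toFixed(2))); // [ '0.61', '0.39' ]
```

Repeating the step many times approaches the chain's long-run share of time in each state.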
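EV = (Probability of Win × Payout) - (Probability of Loss × Stake), where Payout is the net profit on a winning entry.
Fundamental to all betting pricing. For the house to stay profitable, users should have negative EV on average and the house positive EV.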
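The EV formula can be sketched directly. This assumes the payout is the net profit on a win (total return minus stake), and the numbers are hypothetical:

```typescript
// Expected value of a single bet.
// netPayout is the profit on a win (total return minus stake); stake is lost otherwise.
function expectedValue(pWin: number, netPayout: number, stake: number): number {
  return pWin * netPayout - (1 - pWin) * stake;
}

// Hypothetical example: a 10-unit stake at a 3x multiplier (net payout 20)
// with a true win probability of 28%.
const userEv = expectedValue(0.28, 20, 10);
console.log(userEv.toFixed(2)); // 5.6 - 7.2 = -1.60
```

A user EV of -1.60 on a 10-unit stake corresponds to the house expecting to keep 16% of the stake.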
Optimal bet sizing formula: f = (bp - q) / b, where f = fraction of bankroll to stake, b = net odds (decimal odds - 1), p = win probability, q = 1 - p = loss probability
Used for: bankroll management, exposure limits, optimal pricing margins.
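A short sketch of the Kelly formula, assuming `b` is the net odds (decimal odds minus 1) and clamping at zero so that negative-edge bets get no stake (the clamp is a common convention, added here as an assumption):

```typescript
// Kelly fraction f = (b * p - q) / b, with b = net odds and q = 1 - p.
// Returns 0 when the edge is negative, i.e. do not bet.
function kellyFraction(netOdds: number, pWin: number): number {
  const q = 1 - pWin;
  return Math.max(0, (netOdds * pWin - q) / netOdds);
}

// Hypothetical example: 3x decimal odds (b = 2) with a 40% win probability.
console.log(kellyFraction(2, 0.4).toFixed(2)); // (0.8 - 0.6) / 2 = 0.10
```

Many practitioners stake only a fraction of the Kelly amount to reduce variance when `pWin` is itself uncertain.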
The house edge built into pricing. Balance between:
- Too high → users leave for competitors
- Too low → insufficient profitability
Used for: payout multiplier calibration.
Maintaining balanced book by adjusting prices based on flow. Used for: dynamic line movement.
Ensuring no combination of picks guarantees profit. Used for: correlated pick validation, parlay pricing consistency.
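One simple consistency check can be sketched as follows. It assumes decimal odds over a set of mutually exclusive, exhaustive outcomes; correlated picks and parlays need more than this, and the function names are hypothetical.

```typescript
// Sum of implied probabilities (1 / decimal odds) across all outcomes.
function impliedSum(decimalOdds: number[]): number {
  return decimalOdds.reduce((sum, o) => sum + 1 / o, 0);
}

// If the implied probabilities sum to less than 1, backing every outcome locks in a profit.
function hasArbitrage(decimalOdds: number[]): boolean {
  return impliedSum(decimalOdds) < 1;
}

// Stakes that return the same amount whichever outcome wins.
function arbitrageStakes(decimalOdds: number[], totalStake: number): number[] {
  const total = impliedSum(decimalOdds);
  return decimalOdds.map((o) => (totalStake * (1 / o)) / total);
}

console.log(hasArbitrage([2.2, 2.0])); // true: 1/2.2 + 1/2.0 is below 1
console.log(hasArbitrage([1.91, 1.91])); // false: the implied sum is above 1
```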
Converting multipliers/odds to probabilities:
Implied Prob = 1 / Decimal Odds
For 3x payout: Implied prob = 33.3%
But true prob might be 28% (a 5.3 percentage point margin for the house, or a 16% expected loss per unit staked)
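As a sketch of the conversion, the snippet below computes implied probabilities and also strips the margin from a two-sided market. Proportional normalization is only one of several conventions for removing the margin and is an assumption here, as are the function names.

```typescript
// Implied probability from decimal odds or a payout multiplier.
function impliedProbability(decimalOdds: number): number {
  return 1 / decimalOdds;
}

// Overround (total implied probability minus 1) and margin-free probabilities,
// using proportional normalization.
function removeVig(decimalOdds: number[]): { overround: number; fair: number[] } {
  const raw = decimalOdds.map((o) => impliedProbability(o));
  const total = raw.reduce((sum, p) => sum + p, 0);
  return { overround: total - 1, fair: raw.map((p) => p / total) };
}

console.log(impliedProbability(3).toFixed(3)); // 0.333
console.log(removeVig([1.91, 1.91]).fair); // each side normalizes to 0.5
```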
Identifying sophisticated bettors vs. casual users. Used for: line movement triggers, risk flags.
Measuring how picks move together (QB + his WR, teammates, same-game picks). Used for: exposure limits on correlated parlays.
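A minimal sketch of measuring how two stat lines move together, using the Pearson correlation coefficient. The yardage figures are made up for illustration.

```typescript
// Pearson correlation between two equal-length series.
function pearson(xs: number[], ys: number[]): number {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("need two equal-length series with at least 2 points");
  }
  const n = xs.length;
  const meanX = xs.reduce((s, x) => s + x, 0) / n;
  const meanY = ys.reduce((s, y) => s + y, 0) / n;
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }
  return cov / Math.sqrt(varX * varY);
}

// Hypothetical game logs: QB passing yards and his WR's receiving yards.
const qbYards = [210, 305, 180, 340, 260];
const wrYards = [55, 110, 40, 125, 80];
console.log(pearson(qbYards, wrYards).toFixed(2));
```

A value near 1 means the two picks tend to hit together, so a parlay of both carries more concentrated exposure than independent pricing would suggest.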
Loss threshold that should not be exceeded at a given confidence level over a set horizon (e.g., 95% one-day VaR). Used for: daily/weekly exposure limits, worst-case scenario planning.
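One way to estimate this number is the historical (empirical quantile) method, sketched below with a nearest-rank quantile. The P&L values are hypothetical, and parametric or simulation-based approaches are equally valid choices.

```typescript
// Historical VaR: the loss at the (1 - confidence) quantile of observed P&L,
// reported as a positive number. Uses the nearest-rank quantile.
function historicalVaR(pnl: number[], confidence: number): number {
  if (pnl.length === 0) throw new Error("need at least one P&L sample");
  const sorted = [...pnl].sort((a, b) => a - b);
  const tail = 1 - confidence;
  const idx = Math.max(0, Math.ceil(tail * sorted.length) - 1);
  return -sorted[idx];
}

// Hypothetical daily P&L for the book, in thousands.
const dailyPnl = [-50, -20, -10, 0, 5, 10, 15, 20, 30, 40];
console.log(historicalVaR(dailyPnl, 0.95)); // 50
```

With only ten observations the estimate is just the worst day; in practice a much longer history or simulated scenarios would be used.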
Too much exposure to single event/player/outcome. Used for: position limits, hedging requirements.
"What if LeBron gets injured?" "What if sharp syndicate targets us?" Used for: business continuity, pricing robustness.
Tracking total exposure across all active contests. Real-time monitoring of potential payouts vs. reserves.
Dynamic skill ratings that update based on performance. Used for: team/player strength estimation.
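A sketch of one well-known rating scheme of this kind, the Elo update. The text does not name a specific system, so the use of Elo, the 400-point scale, and the K-factor of 20 are assumptions for illustration.

```typescript
// Expected score of player A against player B under the Elo model.
function eloExpected(ratingA: number, ratingB: number): number {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// New rating for A after a game; score is 1 for a win, 0.5 for a draw, 0 for a loss.
function eloUpdate(ratingA: number, ratingB: number, score: number, k = 20): number {
  return ratingA + k * (score - eloExpected(ratingA, ratingB));
}

// Two evenly rated teams; the winner gains what the loser gives up.
console.log(eloUpdate(1500, 1500, 1)); // 1510
console.log(eloUpdate(1500, 1500, 0)); // 1490
```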
- Usage-based: Minutes played → opportunity → production
- Matchup-based: Opponent defense rating adjustments
- Pace-adjusted: Game speed impacts total stats
- Vegas totals integration: Implied game script from betting markets
Predicting game flow (blowout vs. close game) affects player usage/stats. Used for: contextual player pricing.
Understanding how users construct entries. Used for: identifying popular vs. contrarian picks, pricing adjustments.
Quantifying teammate stat increases when a star player is out. Used for: rapid line adjustments.
Users overweight recent performance. Used for: identifying mispriced lines when regression to the mean is expected.
Users pick favorite teams irrationally. Used for: shade lines on popular teams.
First number seen affects perception. Used for: how you display projections.
People feel losses roughly twice as strongly as equivalent gains. Used for: promotion design, payout structure psychology.
When should users cash out? When should you close lines? Game theory applications.
Real-time price adjustments based on demand (like Uber). Used for: live contest pricing, high-demand slates.
Airline/hotel pricing strategy. Used for: contest entry fee optimization, promotional pricing timing.
How demand changes with price. Used for: finding optimal hold rate that maximizes revenue (not just profit per contest).
Causal inference for pricing experiments. Used for: testing new payout structures, hold rates, features.
Exploration vs. exploitation for pricing. Used for: learning optimal pricing in new markets/sports.
Maximize profit subject to constraints (liquidity, exposure limits). Used for: optimal pricing across slate.
AI learns optimal pricing policy through trial and error. Used for: adaptive pricing systems.
Finding global optima in convex problems. Used for: portfolio construction, risk minimization.
Specific to portfolio optimization (Markowitz). Used for: balancing risk-return in pricing strategy.
Player-level repeated observations over time. Used for: controlling for individual heterogeneity in projections.
Identifying causal effects when experiments are impossible. Used for: measuring true pricing impact vs. confounds.
Measuring treatment effects across groups/time. Used for: evaluating pricing changes, new feature rollouts.
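The simplest two-group, two-period version of this estimate can be sketched as below. It assumes the treated and control groups would have followed parallel trends without the change, and the group means are hypothetical.

```typescript
// Mean outcome for one group before and after the change.
interface GroupMeans {
  pre: number;
  post: number;
}

// Difference-in-differences: the treated group's change minus the control group's change.
function diffInDiff(treated: GroupMeans, control: GroupMeans): number {
  return (treated.post - treated.pre) - (control.post - control.pre);
}

// Hypothetical weekly entries per user, before and after a payout change
// rolled out to the treated market only.
const treatedMarket: GroupMeans = { pre: 100, post: 120 };
const controlMarket: GroupMeans = { pre: 100, post: 105 };
console.log(diffInDiff(treatedMarket, controlMarket)); // 20 - 5 = 15
```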
Creating comparable groups for causal analysis. Used for: user segmentation impact studies.
Players nested in teams, teams in leagues. Used for: borrowing strength across related units.
Creating predictive variables: rolling averages, rest days, travel distance, altitude, referee tendencies, etc.
Stacking, blending, averaging multiple models. Used for: combining expert models (one per sport).
Proper validation for temporal data. Used for: avoiding look-ahead bias in backtesting.
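A hedged sketch of walk-forward (expanding window) splits, where every training set ends before its test set begins. The function name and window sizes are illustrative.

```typescript
// Index sets for one train/test fold.
interface Split {
  train: number[];
  test: number[];
}

// Expanding-window splits over n time-ordered observations.
// Each fold trains on everything before the test window, so no future data leaks in.
function walkForwardSplits(n: number, initialTrain: number, testSize: number): Split[] {
  const splits: Split[] = [];
  for (let begin = initialTrain; begin + testSize <= n; begin += testSize) {
    const train = Array.from({ length: begin }, (_, i) => i);
    const test = Array.from({ length: testSize }, (_, i) => begin + i);
    splits.push({ train, test });
  }
  return splits;
}

// 10 game weeks, first 4 for the initial fit, then test 2 weeks at a time.
for (const fold of walkForwardSplits(10, 4, 2)) {
  console.log(`train 0..${fold.train.length - 1}, test ${fold.test.join(",")}`);
}
```

Random shuffled folds would let the model train on games played after the ones it is tested on, which is the look-ahead bias this layout avoids.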
Systematically finding best model parameters.
Ensuring predicted probabilities match observed frequencies. Critical for pricing accuracy.
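A simple way to check this is a reliability table: group predictions into bins and compare the average predicted probability with the observed hit rate in each bin. The sketch below uses equal-width bins; the names and the default bin count are assumptions.

```typescript
// Summary of one probability bin.
interface CalibrationBin {
  meanPredicted: number; // average predicted probability in the bin
  observedRate: number; // fraction of outcomes in the bin that actually hit
  count: number;
}

// Group predictions into equal-width bins and compare predicted vs. observed rates.
// outcomes must be 0 or 1. Empty bins are omitted.
function calibrationBins(predicted: number[], outcomes: number[], nBins = 10): CalibrationBin[] {
  if (predicted.length !== outcomes.length) throw new Error("length mismatch");
  const sums = Array.from({ length: nBins }, () => ({ pred: 0, hits: 0, count: 0 }));
  predicted.forEach((p, i) => {
    const b = Math.min(nBins - 1, Math.floor(p * nBins));
    sums[b].pred += p;
    sums[b].hits += outcomes[i];
    sums[b].count += 1;
  });
  return sums
    .filter((s) => s.count > 0)
    .map((s) => ({
      meanPredicted: s.pred / s.count,
      observedRate: s.hits / s.count,
      count: s.count,
    }));
}

// Hypothetical predictions and results.
console.log(calibrationBins([0.1, 0.1, 0.9, 0.9], [0, 0, 1, 1], 2));
```

For a well-calibrated model, `meanPredicted` and `observedRate` stay close in every bin that has enough observations.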
Identifying unusual betting patterns, data errors, sharp syndicates. Used for: fraud detection, model monitoring.
Predicting long-term user value. Used for: balancing acquisition cost vs. hold rate.
Identifying users likely to leave. Used for: retention-focused pricing/promotions.
Tracking user groups over time. Used for: measuring pricing strategy impact on retention.
CAC (Customer Acquisition Cost), LTV:CAC ratio, contribution margin. Used for: evaluating pricing profitability.
Measuring true causal lift from promotions/features. Used for: ROI of pricing experiments.
Benchmarking against DraftKings, FanDuel, Underdog. Used for: maintaining competitive payout rates.
How information flows through markets, order flow analysis. Used for: understanding sharp money impact.
Strategic interaction between competitors. Used for: pricing strategy when competitors react.
When winning a bid or taking a bet signals that you mispriced it. Used for: identifying when sharp money targets you.
Proper scoring rule for probability predictions. Used for: model evaluation for win/loss outcomes.
Accuracy of probabilistic predictions. Used for: measuring calibration quality.
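The two entries above do not name their metrics here, so as an assumption this sketch shows two widely used proper scoring rules for binary outcomes: the Brier score and log loss. Lower is better for both.

```typescript
// Brier score: mean squared difference between predicted probability and outcome (0 or 1).
function brierScore(predicted: number[], outcomes: number[]): number {
  if (predicted.length !== outcomes.length || predicted.length === 0) throw new Error("bad input");
  const total = predicted.reduce((sum, p, i) => sum + (p - outcomes[i]) ** 2, 0);
  return total / predicted.length;
}

// Log loss: negative mean log-likelihood of the outcomes under the predictions.
// Probabilities are clipped away from 0 and 1 to avoid taking log(0).
function logLoss(predicted: number[], outcomes: number[], eps = 1e-15): number {
  if (predicted.length !== outcomes.length || predicted.length === 0) throw new Error("bad input");
  const total = predicted.reduce((sum, p, i) => {
    const clipped = Math.min(1 - eps, Math.max(eps, p));
    return sum - (outcomes[i] * Math.log(clipped) + (1 - outcomes[i]) * Math.log(1 - clipped));
  }, 0);
  return total / predicted.length;
}

// A coin-flip forecaster on one win and one loss.
console.log(brierScore([0.5, 0.5], [1, 0])); // 0.25
console.log(logLoss([0.5, 0.5], [1, 0]).toFixed(4)); // ln 2 = 0.6931
```

Log loss punishes confident wrong predictions far more heavily than the Brier score does, which matters when a single badly mispriced line is costly.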
Classification model performance. Used for: evaluating binary outcome predictions.
Continuous prediction accuracy. Used for: player point total projections.
Risk-adjusted returns. Used for: evaluating pricing strategy performance.
Did your opening line beat the closing line? Used for: measuring pricing quality vs. market.
Streaming data for live pricing adjustments.
Getting models into production, monitoring, retraining.
Fast Monte Carlo for scenario testing.
Real-time pricing health monitoring.
Tracking model iterations and performance.