juaneacha/The-Athlete-In-You

The Athlete In You: Increasing Engagement For The Olympic Games

Find Out More About The Athlete In You Now!

Introduction

The Olympic Games are among the sporting events whose viewership has grown in recent years. Sustaining that growth takes continued effort, and an effective way to retain and grow viewership is to increase engagement. A website that estimates a visitor's suitability for any of the Olympic sports can increase engagement by up to 52.6% (Barreto).

Concepts applied:

  • Classification Modeling: Naive Bayes, Decision Trees, Random Forest
  • Model Deployment: Heroku, Flask, HTML

Problem Statement

  • How can engagement with the Olympics be increased?
  • How can Olympic viewers become more invested in the sports?
  • How can awareness of the different sports and teams be raised?

Data Collection and Cleaning

  • Data Collection
  • Raw Data Snapshot
  • Data Cleaning
    • Data points are evaluated for correctness, datatypes, and overall uniformity
    • Outliers are dropped from the dataset
    • Null values are imputed and dropped when needed
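The cleaning steps listed above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the `Height` and `Weight` columns, the sample records, median imputation, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
from statistics import median, quantiles


def impute_median(rows, column):
    """Fill missing (None) values in `column` with the column median."""
    known = [r[column] for r in rows if r[column] is not None]
    fill = median(known)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def drop_outliers(rows, column, k=1.5):
    """Drop rows whose `column` value lies outside the k * IQR fences."""
    q1, _, q3 = quantiles([r[column] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r[column] <= high]


# Hypothetical athlete records; None marks a missing value.
athletes = [
    {"Sport": "Swimming", "Height": 185.0, "Weight": 80.0},
    {"Sport": "Swimming", "Height": 190.0, "Weight": None},
    {"Sport": "Judo", "Height": 175.0, "Weight": 73.0},
    {"Sport": "Judo", "Height": 178.0, "Weight": 81.0},
    {"Sport": "Rowing", "Height": 192.0, "Weight": 90.0},
    {"Sport": "Rowing", "Height": 18.8, "Weight": 85.0},  # implausible height
]

cleaned = impute_median(athletes, "Weight")   # impute nulls first
cleaned = drop_outliers(cleaned, "Height")    # then drop outlying rows
print(len(athletes), "->", len(cleaned), "rows")
```

Imputing before filtering keeps rows that are otherwise valid; whether a null is imputed or dropped in practice depends on the column, as noted in the list above.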

Data Modeling:

  • Pre-Modeling
    • The Sport class is selected as the target class
    • The Sport class is balanced across 18 categories with 4,000 observations each
    • Non-numeric datatypes are converted to numeric (float)
    • Correlation among the features is evaluated
    • Data is split into 75% and 25% for training and testing, respectively
  • Modeling
    • Naive Bayes, Decision Tree, and Random Forest models are created
    • Model Evaluation
      • Confusion Matrix
      • Other Performance Metrics
  • Model Results
    • Random forest is the best of the three algorithms, based on the confusion matrix comparison and on its precision, sensitivity, and F1-score results. In the confusion matrix comparison, random forest classifies better than decision tree and naive Bayes: its diagonal is uniform, meaning most observations fall in their correct category. Naive Bayes shows the opposite pattern, with most observations scattered off the diagonal, indicating poor overall correctness. Its precision, sensitivity, and F1-score are also the lowest by a significant margin compared to random forest and decision tree. The decision tree performs similarly to random forest on most measures, with only a small drop in performance across the board.
  • Model Deployment
    • HTML Index file is created with a form and graphics UI
    • The ML model is saved as a pickle file
    • A Procfile for Heroku is created and set up
    • Heroku is configured to serve the HTML index and embed the ML model
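The evaluation described under Model Evaluation and Model Results rests on the confusion matrix and the per-class precision, sensitivity, and F1-score derived from it. The sketch below shows how those quantities relate, using only the standard library. The three sport labels and the predictions are made-up placeholders, not results from this project.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def per_class_metrics(matrix, labels):
    """Precision, sensitivity (recall), and F1-score for each class."""
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                           # diagonal entry
        predicted = sum(row[i] for row in matrix)   # column total
        actual = sum(matrix[i])                     # row total
        precision = tp / predicted if predicted else 0.0
        sensitivity = tp / actual if actual else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics[label] = {
            "precision": precision,
            "sensitivity": sensitivity,
            "f1": f1,
        }
    return metrics


# Hypothetical labels and predictions, for illustration only.
labels = ["Judo", "Rowing", "Swimming"]
y_true = ["Judo", "Judo", "Rowing", "Rowing", "Swimming", "Swimming"]
y_pred = ["Judo", "Rowing", "Rowing", "Rowing", "Swimming", "Judo"]

matrix = confusion_matrix(y_true, y_pred, labels)
scores = per_class_metrics(matrix, labels)
for label in labels:
    print(label, scores[label])
```

A model whose counts concentrate on the diagonal of `matrix` scores high on all three metrics, which is the pattern the Model Results attribute to random forest.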

Conclusion and Recommendations

  • Small to medium ML models are critical for effective deployment on the web
  • The Decision Tree algorithm is the best algorithm to deploy on the web. It performs very similarly to the best-performing algorithm in this project, random forest, while being a fraction of the size
  • Two components make for an ideal web model: reliability and size. The decision tree embodies both, making it the best option for web deployment
  • Including The Athlete In You model in an advertising campaign can increase engagement by up to 52.6%
