This project uses XGBoost to predict whether a player will score above a given betting line in a game. An XGBoost regression model predicts the number of points a player is likely to score in their next game, using historical data, rolling averages, and trends as inputs. The model outputs a numerical value representing the expected points.
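As an illustration of the rolling-average inputs mentioned above, the following minimal sketch computes, for each game, the mean of a player's points over the preceding games. The function name, the window size of 3, and the choice to exclude the current game from its own average are assumptions made for illustration, not the project's actual feature code.

```python
from collections import deque
from typing import List, Optional


def rolling_average_before(points: List[float], window: int = 3) -> List[Optional[float]]:
    """For each game, average the points of up to `window` previous games.

    The current game is excluded so the feature only uses information
    that was available before that game. Returns None where no history exists.
    """
    recent = deque(maxlen=window)  # holds the last `window` point totals
    averages: List[Optional[float]] = []
    for scored in points:
        averages.append(sum(recent) / len(recent) if recent else None)
        recent.append(scored)
    return averages


if __name__ == "__main__":
    history = [20, 30, 10, 40]
    print(rolling_average_before(history, window=3))  # [None, 20.0, 25.0, 20.0]
```

Excluding the current game keeps the feature from leaking the value the model is trying to predict.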
Once the prediction is made, it can be compared against a betting line (a predefined threshold set by bookmakers for a player's performance). This comparison helps decide whether the player is likely to score "over" or "under" that line, converting the regression output into the binary outcome that sports betting decisions usually require.
Combining regression with a threshold comparison turns a continuous prediction into an actionable over/under call.
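The comparison described above can be sketched in a few lines. The over_under function is a hypothetical helper, not part of the project, and returning "push" when the prediction equals the line exactly is an assumption made here for completeness.

```python
def over_under(predicted_points: float, betting_line: float) -> str:
    """Turn a regression output into an over/under call against a line."""
    if predicted_points > betting_line:
        return "over"
    if predicted_points < betting_line:
        return "under"
    return "push"  # assumption: prediction equals the line, no lean either way


if __name__ == "__main__":
    print(over_under(27.4, 24.5))  # over
    print(over_under(18.0, 21.5))  # under
```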
- Ensure you have Python installed (preferably Python 3.8+).
- Install the required Python libraries using the following command:
pip install -r requirements.txt
Steps
Scrape Data
Use the provided cli_app.py file to scrape the player data. Run the following command to execute the scraping process:
python cli_app.py scrape-init
This will:
- Call the scrape_seasons() function to scrape data for specific seasons.
- Call the scrape_team_seasons() function to scrape team-level data for the same seasons.
Important: Ensure that the arrays of seasons defined in both scrape_seasons and scrape_team_seasons are consistent, as they must cover the same range of data.
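One way to keep the two season lists from drifting apart is to compare them before scraping. The sketch below is hypothetical: the constant names, the season label format, and the check_seasons_consistent helper are illustrative and not taken from the project.

```python
from typing import Sequence

# Hypothetical season lists; in the project the real arrays are defined
# inside scrape_seasons and scrape_team_seasons respectively.
PLAYER_SEASONS = ["2021-22", "2022-23", "2023-24"]
TEAM_SEASONS = ["2021-22", "2022-23", "2023-24"]


def check_seasons_consistent(player_seasons: Sequence[str], team_seasons: Sequence[str]) -> None:
    """Raise ValueError if the two lists do not cover the same seasons."""
    only_player = sorted(set(player_seasons) - set(team_seasons))
    only_team = sorted(set(team_seasons) - set(player_seasons))
    if only_player or only_team:
        raise ValueError(
            f"Season mismatch: only in player list {only_player}, "
            f"only in team list {only_team}"
        )


if __name__ == "__main__":
    check_seasons_consistent(PLAYER_SEASONS, TEAM_SEASONS)
    print("Season lists match")
```

Defining the seasons in a single shared constant would remove the need for the check altogether, at the cost of touching both scraping functions.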
Run the Model
After data preparation, you can train or test the XGBoost model. Instructions for this are provided in the relevant script documentation.
python cli_app.py train-xgb
After training, make sure the generated model file is loaded in the prediction files.
To make predictions, you need to provide input data in a JSON file named data.json. This file should contain an array of objects, where each object has the following structure:
[
  {
    "name": "Player Name",
    "points": 25
  },
  {
    "name": "Another Player",
    "points": 18
  }
]
Running Predictions
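Before running predictions, it can help to check that data.json matches the structure shown above. This is a minimal sketch: the validate_entries and load_entries helpers are hypothetical and not part of cli_app.py, and the check assumes only what the example shows, namely that name is a string and points is a number.

```python
import json
from typing import Any, List


def validate_entries(entries: Any) -> List[dict]:
    """Check that entries is a list of {"name": str, "points": number} objects."""
    if not isinstance(entries, list):
        raise ValueError("data.json must contain a JSON array")
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not an object")
        if not isinstance(entry.get("name"), str):
            raise ValueError(f"entry {index} needs a string 'name'")
        points = entry.get("points")
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(points, bool) or not isinstance(points, (int, float)):
            raise ValueError(f"entry {index} needs a numeric 'points'")
    return entries


def load_entries(path: str = "data.json") -> List[dict]:
    """Read the JSON file at `path` and validate its contents."""
    with open(path, encoding="utf-8") as handle:
        return validate_entries(json.load(handle))


if __name__ == "__main__":
    sample = '[{"name": "Player Name", "points": 25}]'
    print(validate_entries(json.loads(sample)))
```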
Once the data.json file is populated with the required data, you can run the following command to make predictions:
python cli_app.py predict-all
To display predictions, there is a Flask application where you can see the predictions made for the day. The application is very simple, so feel free to change its query to suit your needs for showing predictions.
flask run --debug
This project is a basic MVP (Minimum Viable Product) designed as a hobby project and relies heavily on manual commands to maintain and synchronize data. It is not fully automated, so keeping the data accurate and up-to-date requires regular intervention.
To ensure the predictions remain relevant:
- Run the scraping mechanism daily using the command:
python cli_app.py fill-data
This is necessary because NBA games typically occur daily or every other day.
Regular updates and maintenance are required to keep the data and predictions accurate.
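Because the update is a manual command, one option is to wrap it in a small script that a scheduler such as cron or Windows Task Scheduler can call once a day. The sketch below assumes it is run from the project root, next to cli_app.py. The fill_data_command and run_daily_update functions and the runner parameter are hypothetical conveniences, not part of the project.

```python
import subprocess
import sys
from typing import Callable, List


def fill_data_command() -> List[str]:
    """Build the documented update command using the current interpreter."""
    return [sys.executable, "cli_app.py", "fill-data"]


def run_daily_update(runner: Callable[..., object] = subprocess.run) -> None:
    """Run the update command.

    With the default subprocess.run, check=True raises CalledProcessError
    if the command exits with a non-zero status.
    """
    runner(fill_data_command(), check=True)
```

A scheduled job would then call run_daily_update() once a day. The runner parameter exists only so the function can be exercised without actually launching the scraper.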
This is a purely hobby project created for learning purposes. The models used here have not achieved high accuracy in predicting player performance, and the results should not be taken as reliable forecasts.