Kristio1232/Movie-Box-Office-predictor

 
 

Movie-Box-Office-predictor 🎥 💰

Authors:

  • Jayant
  • Christopher Budd
  • Mustafa Syed

Objective:

To predict the box office revenue of movies from January 1, 2015 to November 5, 2023.

Dataset:

Dataset used: https://www.kaggle.com/datasets/akshaypawar7/millions-of-movies/data

Install scikit-learn, numpy, matplotlib, pandas, seaborn, and xgboost as follows:

  • conda install -c anaconda scikit-learn
  • conda install pandas
  • conda install numpy
  • conda install -c conda-forge matplotlib
  • conda install seaborn
  • pip install xgboost
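
After installing, a quick check can confirm that every package is importable. The snippet below is a minimal sketch that uses only the Python standard library; the `REQUIRED` mapping and the `missing_packages` helper are illustrative names, not part of the project. The one thing to watch is that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Map each package name from the install list to the module name it is imported as.
# Only scikit-learn differs: it is imported as `sklearn`.
REQUIRED = {
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "xgboost": "xgboost",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("All packages found." if not missing else f"Missing: {', '.join(missing)}")
```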

Tasks:

  1. Getting the data
    • Loading the data on your machine
  2. Look at the big picture
  3. First impressions on the dataset, EDA graphs, and patterns found
  4. Preprocessing: preparing the data for the ML algorithms
    • Data cleaning
    • Encoding
    • Feature scaling (re-sampling)
  5. Training and evaluation of 3 ML algorithms
    • Algorithms used
    • Training
    • Findings and results comparison
  6. Three graphs for the best-performing algorithm
  7. Limitations of the model
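
Tasks 4 and 5 above can be sketched as a single scikit-learn pipeline. This is a hedged illustration, not the notebook's actual code: the README does not name the three algorithms, so the regressors below are stand-ins, and the column names (`budget`, `runtime`, `original_language`, `revenue`) are assumptions about the dataset. The helpers `make_preprocessor` and `compare_models` are hypothetical names.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["budget", "runtime"]      # assumed numeric feature columns
CATEGORICAL = ["original_language"]  # assumed categorical feature column
TARGET = "revenue"                   # assumed target column


def make_preprocessor():
    """Data cleaning (imputation), encoding, and feature scaling in one transformer."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])


def compare_models(df, random_state=0):
    """Fit three stand-in regressors and return their R^2 on a held-out test split."""
    X, y = df[NUMERIC + CATEGORICAL], df[TARGET]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    candidates = {
        "linear_regression": LinearRegression(),
        "random_forest": RandomForestRegressor(n_estimators=50, random_state=random_state),
        "gradient_boosting": GradientBoostingRegressor(random_state=random_state),
    }
    scores = {}
    for name, model in candidates.items():
        # The preprocessor is fit on the training split only, so nothing leaks from the test split.
        pipe = Pipeline([("prep", make_preprocessor()), ("model", model)])
        pipe.fit(X_train, y_train)
        scores[name] = r2_score(y_test, pipe.predict(X_test))
    return scores
```

Since xgboost is among the installed packages, its `XGBRegressor` could be swapped in for one of the candidates. Wrapping the preprocessing and the model in one `Pipeline` keeps imputation and scaling statistics tied to the training data.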

Appendix 1:

The code can be found in the PDF.

Appendix 2:

GitHub repository link: https://github.com/Jayant1Varma/Movie-Box-Office-predictor.git
Original dataset citation: https://www.kaggle.com/datasets/akshaypawar7/millions-of-movies/data

Presentation video link: https://youtu.be/R6Qv8SrqOKY

The Kaggle dataset is updated daily; we used the version that was available on November 5, 2023. You can find the exact dataset we used here: https://drive.google.com/file/d/1uPtHyqpAKkqZUpft8A0FPVXPR2iT32SN/view?usp=sharing It is also available in our GitHub repository.
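
Because the dataset changes daily, it may help to restrict any downloaded copy to the date window stated in the objective. The following is a minimal pandas sketch; the `release_date` column name, the `movies.csv` filename, and the `filter_release_window` helper are assumptions for illustration, not taken from the notebook.

```python
import pandas as pd


def filter_release_window(df, start="2015-01-01", end="2023-11-05", date_col="release_date"):
    """Keep rows whose release date falls inside [start, end], inclusive."""
    # Unparseable or missing dates become NaT and are excluded by the comparison below.
    dates = pd.to_datetime(df[date_col], errors="coerce")
    mask = dates.between(pd.Timestamp(start), pd.Timestamp(end))
    return df.loc[mask].copy()


# Usage, assuming the snapshot was saved locally as movies.csv (hypothetical filename):
# movies = pd.read_csv("movies.csv")
# movies = filter_release_window(movies)
```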

About

My Copy of the Project
