Mark-Kasa7/Best-Machine-Algorithm-For-Predicting-Insurance-Claims

Best ML algorithms to predict motor insurance claims

Team Members

  • Peter Maila
  • Shamsuddeen Lawal
  • Kasavuli Mark
  • Rofhiwa Ntshagovhe
  • Sandisiwe Mtsha
  • Festus Godwin

Project description

  • This project evaluates several machine learning (ML) algorithms for the task of predicting motor insurance claims, comparing them on criteria such as performance, accuracy, interpretability, and their general pros and cons.

  • Here's a link to our Notion page

  • Here's a link to the test and train datasets for PMD: PMD Datasets

  • Here's a link to the test and train datasets for Mobility: Mobility Datasets

Each dataset has its own notebook with the following contents:

Table of Contents

  1. Importing Data Dependencies

  2. Loading Data

  3. Exploratory Data Analysis (EDA)

  4. Preprocessing

  5. Feature Engineering

  6. Model and Model Evaluation

    • Generalised linear model
    • XGBoost
    • SVM
    • Random forest
    • CatBoost
    • Explainable Boosting Machines (EBM)
    • LightGBM

1. Data Preprocessing

We collected datasets from two insurance organizations, PMD and Mobility. We then cleaned and organized the data to remove irregularities and make it suitable for modeling. Working with two different datasets lets us check how robust and adaptable our models are across different motor insurance scenarios.
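As a rough illustration of what a cleaning step like this can look like, here is a minimal sketch using only the Python standard library. It is not the code from our notebooks: the field names (`policy_id`, `driver_age`, `vehicle_value`) and the choice of median imputation are assumptions made for the example.

```python
from statistics import median


def clean_records(records, numeric_fields):
    """Drop exact duplicate rows, then fill missing numeric values with the column median.

    `records` is a list of dicts; a missing value is represented by None.
    The field names passed in are hypothetical and not taken from the PMD or Mobility data.
    """
    # Remove exact duplicates while preserving the original order.
    seen = set()
    unique = []
    for row in records:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is not mutated

    # Impute missing values with the median of the observed values in that column.
    for field in numeric_fields:
        observed = [r[field] for r in unique if r[field] is not None]
        fill = median(observed) if observed else None
        for r in unique:
            if r[field] is None:
                r[field] = fill
    return unique


# Toy rows for illustration only.
records = [
    {"policy_id": 1, "driver_age": 30, "vehicle_value": 10000.0},
    {"policy_id": 1, "driver_age": 30, "vehicle_value": 10000.0},  # duplicate
    {"policy_id": 2, "driver_age": None, "vehicle_value": 20000.0},
    {"policy_id": 3, "driver_age": 50, "vehicle_value": None},
]
cleaned = clean_records(records, ["driver_age", "vehicle_value"])
```

Median imputation is only one option; depending on the dataset, dropping incomplete rows or using a model-based imputer may be more appropriate.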

2. Feature Engineering

Drawing on domain knowledge, we identified and extracted relevant features from each dataset. The aim of enriching the data with meaningful features was to improve the predictive performance and precision of our models.
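To make the idea concrete, the sketch below derives a few features of the kind commonly used in motor insurance. All input fields (`manufacture_year`, `driver_age`, `prior_claims`, `years_insured`), the derived feature names, and the age threshold of 25 are hypothetical and chosen for illustration; they are not the features used in our notebooks.

```python
def add_features(row, policy_year):
    """Return a copy of `row` with a few derived features added.

    All field names and the young-driver threshold are assumptions for this example.
    """
    out = dict(row)
    # Age of the vehicle at policy start, floored at zero.
    out["vehicle_age"] = max(policy_year - row["manufacture_year"], 0)
    # Binary flag for drivers under 25.
    out["is_young_driver"] = int(row["driver_age"] < 25)
    # Past claims per insured year, guarding against division by zero.
    years = row["years_insured"]
    out["prior_claim_rate"] = row["prior_claims"] / years if years > 0 else 0.0
    return out


# Toy row for illustration only.
row = {"manufacture_year": 2015, "driver_age": 22, "prior_claims": 2, "years_insured": 4}
features = add_features(row, policy_year=2023)
```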

3. Statistical models, algorithms, and gradient boosters used:

  • Generalised linear model
  • XGBoost
  • Random forest
  • CatBoost
  • Explainable Boosting Machines (EBM)
  • LightGBM
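One way to compare candidates like these on equal footing is to score each model's held-out predictions with the same set of metrics. The sketch below assumes a binary claim / no-claim target, which may not match every notebook, and uses placeholder predictions under the hypothetical names `model_a` and `model_b`; the numbers are toy values, not results from this project.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = claim)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def rank_models(y_true, predictions, metric="f1"):
    """Score every model on the same labels and sort best-first by `metric`."""
    scores = {name: classification_metrics(y_true, preds) for name, preds in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1][metric], reverse=True)


# Placeholder labels and predictions for illustration only.
y_true = [1, 0, 1, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 0],
    "model_b": [1, 1, 1, 1, 1, 1],
}
ranking = rank_models(y_true, predictions)
```

Claims data is often imbalanced, so ranking on F1 or recall is usually more informative than ranking on accuracy alone.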
