LohChiaHeung/Diabetes-Prediction-Using-Logistic-Regression

UCCD1033_Group2

This project predicts diabetes using Logistic Regression (LR), Support Vector Machine (SVM), and Random Forest (RF) models. After model comparison and hyperparameter tuning, Random Forest achieved the best overall performance on the test set, with the highest recall (67.9%), accuracy (78.8%), and ROC-AUC (86.0%), making it the most reliable of the three models for identifying diabetic patients.

The results illustrate how machine learning can support early diabetes detection by reducing false negatives (that is, raising recall) and improving diagnostic accuracy.

Task:

Using the Pima Indians Diabetes dataset, develop a binary classification model with Logistic Regression to predict whether a patient is diabetic.

From the task, we can identify that:

  • It is a supervised learning problem.
  • It is a binary classification task.
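To make the task concrete, here is a minimal sketch of logistic regression as a supervised binary classifier, written in plain Python with batch gradient descent. It is only an illustration: the function names (`sigmoid`, `fit_logistic`, `predict_proba`, `predict`), the learning rate, the epoch count, and the four-point toy dataset are all assumptions made for this example, and the project itself may well rely on a library implementation instead.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=1000):
    # Batch gradient descent on the mean log-loss.
    # lr and epochs are illustrative defaults, not tuned values.
    n_samples = len(X)
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the logit
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n_samples for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n_samples
    return w, b


def predict_proba(X, w, b):
    # Estimated probability of the positive class (label 1) per row.
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def predict(X, w, b, threshold=0.5):
    # Turn probabilities into 0/1 labels at the given threshold.
    return [1 if p >= threshold else 0 for p in predict_proba(X, w, b)]


# Hypothetical toy data: one feature, label 1 means "diabetic".
X_toy = [[-2.0], [-1.0], [1.0], [2.0]]
y_toy = [0, 0, 1, 1]

w, b = fit_logistic(X_toy, y_toy)
print(predict(X_toy, w, b))
```

Supervised learning shows up in the use of the known labels `y_toy` during fitting, and binary classification in the final thresholding step, where each probability becomes a 0 or 1.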

Guideline:

  • Perform necessary data preprocessing.
  • Evaluate the model using accuracy, precision, recall and ROC-AUC.
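The four evaluation metrics named above can be sketched in plain Python as follows. This is an illustrative example under stated assumptions, not the project's evaluation code: the helper names (`confusion_counts`, `accuracy`, `precision`, `recall`, `roc_auc`), the 0.5 threshold, and the eight labels and scores are invented for the example. ROC-AUC is computed here as the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as half.

```python
def confusion_counts(y_true, y_pred):
    # Count true/false positives and negatives for 0/1 labels.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def precision(y_true, y_pred):
    # Share of predicted positives that are truly positive.
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    # Share of true positives that the model found.
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    # Pairwise (rank-based) definition of the area under the ROC curve.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities for eight patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred))   # 0.75
print(precision(y_true, y_pred))  # 0.75
print(recall(y_true, y_pred))     # 0.75
print(roc_auc(y_true, scores))    # 0.9375
```

Recall is the metric most directly tied to false negatives: the one diabetic patient scored at 0.4 is missed at the 0.5 threshold, which is why recall is 0.75 rather than 1.0. ROC-AUC uses the raw scores rather than the thresholded labels, so it does not depend on the threshold choice.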

Dataset:

https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database/

Members:

  • Loh Chia Heung (Leader), 2301684
  • Tong Yu Shan, 2301157
  • Low Jia Hao, 2302161
