A machine learning project that predicts whether a person has diabetes from medical input data.
- Data preprocessing (handling missing values)
- Train/Test split (80% / 20%)
- Stratified sampling
- Random Forest / XGBoost model
- Cross-validation for reliable performance
- Automatic generation of test.csv
- User input prediction via terminal
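
The pipeline listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual `main.py`: it uses synthetic data from `make_classification` in place of the real dataset, and the median imputation, the Random Forest hyperparameters, the 5-fold setup, and the random seeds are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the diabetes data (assumption): 8 numeric features, binary target.
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, weights=[0.65, 0.35], random_state=42
)

# Knock out about 5% of the values so the imputation step has something to do.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# 80% / 20% split, stratified so both parts keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Median imputation is an assumed strategy for handling missing values.
model = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("clf", RandomForestClassifier(n_estimators=200, random_state=42)),
])
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# Stratified 5-fold cross-validation on the training portion.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")

print(f"Test accuracy: {test_accuracy:.3f}")
print(f"CV accuracy:   {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```

Putting the imputer inside the `Pipeline` means it is refit on each training fold during cross-validation, so no statistics from a validation fold leak into training. Swapping in `xgboost.XGBClassifier` for the `clf` step should work the same way, since it follows the scikit-learn estimator interface.
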
- Python
- Pandas
- NumPy
- Scikit-learn
- XGBoost
pip install -r requirements.txt
python main.py
- Test Accuracy: ~75% to 80%
- Cross-Validation Accuracy: ~75% to 80%
- Pregnancies
- Glucose
- Blood Pressure
- Skin Thickness
- Insulin
- BMI
- Diabetes Pedigree Function
- Age
- Predicts whether the person may have diabetes or not
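
As a rough sketch of how the terminal input could be turned into a prediction, the helpers below collect the eight values listed above into a one-row DataFrame and map a predicted label to a message. The column names, the helper names (`build_input_row`, `describe_prediction`, `prompt_for_values`), and the convention that label `1` means diabetes are all hypothetical; the project's own code may differ.

```python
import pandas as pd

# Column names are assumptions; the project's actual CSV headers may differ.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def build_input_row(raw_values):
    """Convert eight raw values (in FEATURES order) into a one-row DataFrame."""
    if len(raw_values) != len(FEATURES):
        raise ValueError(f"expected {len(FEATURES)} values, got {len(raw_values)}")
    # float() raises ValueError on non-numeric input, which surfaces bad entries early.
    return pd.DataFrame([[float(v) for v in raw_values]], columns=FEATURES)


def describe_prediction(label):
    """Map a predicted class label to a message (assumes 1 = diabetes)."""
    return "May have diabetes" if int(label) == 1 else "May not have diabetes"


def prompt_for_values():
    """Ask for each feature on the terminal and return the raw strings."""
    return [input(f"{name}: ") for name in FEATURES]
```

With a fitted classifier `model` trained on the same columns, usage would look like `row = build_input_row(prompt_for_values())` followed by `print(describe_prediction(model.predict(row)[0]))`.
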
This project is for educational purposes only and should not be used for medical diagnosis.
Nithish Praba M P
Machine Learning Enthusiast