Alfin-Abraham/Stroke-Risk-Classification

Stroke-Risk-Classification

This project is my first hands-on exploration of machine learning. It uses stroke-risk prediction as a case study for evaluating and comparing how well different classification models perform. Following IBM’s Data Science Methodology, I implemented a complete Python workflow covering data preprocessing, feature engineering, class imbalance correction, model training, and evaluation.

Key steps included undersampling the majority class to address class imbalance, feature extraction (age, glucose levels), and scaling with StandardScaler to prepare the data for model development. I evaluated three classification algorithms: K-Nearest Neighbors (KNN), Decision Trees, and Random Forests. Each was assessed with both hold-out validation and 10-fold cross-validation to obtain more reliable performance estimates.
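The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this repository: the data is synthetic (generated with scikit-learn's `make_classification`, with an assumed ratio of about 5% positives), `undersample_majority` is a hypothetical helper, and the hyperparameters are placeholders.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier


def undersample_majority(X, y, seed=0):
    """Hypothetical helper: randomly drop rows until every class matches the smallest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    rng.shuffle(keep)
    return X[keep], y[keep]


# Synthetic stand-in for the stroke data (assumption: roughly 5% positive cases).
X, y = make_classification(
    n_samples=2000, n_features=6, n_informative=4,
    weights=[0.95, 0.05], random_state=0,
)
X_bal, y_bal = undersample_majority(X, y)

# Scaling lives inside each pipeline, so it is re-fit on the training folds only.
models = {
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "tree": make_pipeline(StandardScaler(), DecisionTreeClassifier(random_state=0)),
    "forest": make_pipeline(
        StandardScaler(), RandomForestClassifier(n_estimators=100, random_state=0)
    ),
}

# 10-fold stratified cross-validation, reporting mean accuracy per model.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
mean_accuracy = {
    name: cross_val_score(model, X_bal, y_bal, cv=cv, scoring="accuracy").mean()
    for name, model in models.items()
}
for name, acc in mean_accuracy.items():
    print(f"{name}: mean 10-fold accuracy = {acc:.3f}")
```

Tree-based models do not need scaled inputs; the scaler is kept in all three pipelines here only so the comparison uses identical preprocessing.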

The KNN model achieved the best results, with a mean accuracy of 73% and improved F1-score, precision, and recall, gains that came from both preprocessing and feature selection. I paid particular attention to reducing false negatives, which is critical for medical decision support: a false negative means an at-risk patient is classified as low risk. I also generated Decision Tree visualizations to make the model easier to explain. This project gave me foundational experience in building machine learning pipelines for real-world health applications, with an emphasis on model transparency, performance trade-offs, and rigorous evaluation.
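To make the false-negative focus and the explainability step concrete, here is a small sketch using scikit-learn. All labels and feature values are invented for illustration, and the column names `age` and `avg_glucose_level` are assumptions rather than names taken from the project's dataset. It uses `export_text` for a plain-text view of the tree; the project's own visualizations may have been produced differently.

```python
from sklearn.metrics import confusion_matrix, f1_score, precision_score, recall_score
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy labels (invented): 1 = stroke, 0 = no stroke.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

# With labels=[0, 1], ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
recall = recall_score(y_true, y_pred)        # tp / (tp + fn): falls as false negatives rise
precision = precision_score(y_true, y_pred)  # tp / (tp + fp)
f1 = f1_score(y_true, y_pred)
print(f"false negatives={fn}, recall={recall:.2f}, precision={precision:.2f}, f1={f1:.2f}")

# Toy feature rows (invented): [age, average glucose level].
X_toy = [[45, 90], [50, 100], [62, 180], [70, 210],
         [38, 85], [81, 150], [66, 95], [29, 110]]
y_toy = [0, 0, 1, 1, 0, 1, 1, 0]

# A shallow tree keeps the printed rules short enough to read.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_toy, y_toy)
rules = export_text(tree, feature_names=["age", "avg_glucose_level"])
print(rules)
```

Recall is the metric most directly tied to false negatives, so it is worth reporting alongside accuracy when comparing models for a screening task like this one.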

About

From Data to Diagnostics: Optimizing Stroke Risk Assessment with Advanced ML – A Case Study in Model Selection and Rigorous Problem Solving.
