Scikit-Learn | XGBoost | SHAP | Matplotlib | Pandas
1. Loan Default Prediction with XGBoost/Random Forest - Link
- Automated preprocessing (missing values, encoding, outlier handling)
- Model comparison (Logistic, RandomForest, XGBoost)
- Explainability with SHAP
- End-to-end reproducibility (clean modular code)
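A minimal sketch of how the preprocessing and model-comparison steps above might fit together in scikit-learn. The data is synthetic and the column names (`income`, `loan_amount`, `employment`) are hypothetical stand-ins, not the project's actual schema. `RobustScaler` is used here as one possible outlier-tolerant step, and XGBoost is left out only to keep the example dependent on scikit-learn alone.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, RobustScaler

# Synthetic stand-in data; all column names are hypothetical.
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "income": rng.uniform(20_000, 90_000, n),
    "loan_amount": rng.uniform(5_000, 40_000, n),
    "employment": rng.choice(["salaried", "self_employed", "unemployed"], n),
})
# Label: default is more likely when the loan is large relative to income.
ratio = df["loan_amount"] / df["income"]
y = (ratio + rng.normal(0, 0.1, n) > 0.45).astype(int)
# Inject missing values into a numeric column after the label is built.
df.loc[rng.choice(n, 40, replace=False), "income"] = np.nan

numeric = ["income", "loan_amount"]
categorical = ["employment"]

# Impute and scale numeric columns; one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", RobustScaler()),  # median/IQR scaling, less sensitive to outliers
    ]), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

models = {
    "logistic": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Each model shares the same preprocessing, so the comparison is like-for-like.
scores = {}
for name, model in models.items():
    pipe = Pipeline([("prep", preprocess), ("model", model)])
    scores[name] = cross_val_score(pipe, df, y, cv=5, scoring="roc_auc").mean()

for name, auc in scores.items():
    print(f"{name}: mean ROC-AUC = {auc:.3f}")
```

Keeping preprocessing inside the pipeline means imputation and scaling are refit on each training fold, which avoids leaking information from the validation fold.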
2. Advanced House Price Prediction using Regression - Link
- Deepened my understanding of advanced regression techniques, feature engineering, model diagnostics, and model pipelines.
- Reinforced the importance of balancing accuracy with interpretability.
- Linear Regression (with PCA): achieved an R-squared of 0.75
- Used Regularized Models: Ridge Regression, Lasso Regression, Elastic Net
- Used Ensemble Models: Random Forest Regressor, XGBoost Regressor
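As a rough illustration of comparing the linear and regularized models listed above by R-squared, the sketch below uses synthetic data from `make_regression`. The hyperparameters (`alpha`, `l1_ratio`, `n_components`) are arbitrary assumptions for demonstration, not the values used in the project.

```python
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data standing in for the house-price features.
X, y = make_regression(
    n_samples=300, n_features=20, n_informative=8, noise=15.0, random_state=0
)

# Every candidate is scaled first; regularization penalties assume comparable feature scales.
candidates = {
    "linear_pca": make_pipeline(StandardScaler(), PCA(n_components=10), LinearRegression()),
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.5)),
    "elastic_net": make_pipeline(StandardScaler(), ElasticNet(alpha=0.5, l1_ratio=0.5)),
}

# Mean 5-fold cross-validated R-squared for each candidate.
r2 = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in candidates.items()
}

for name, score in r2.items():
    print(f"{name}: mean R-squared = {score:.3f}")
```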
3. Advanced Customer Churn Prediction - Link
- Model achieved a ROC-AUC of 0.84 with strong recall
- Built a data-driven churn prediction system to help retain high-risk customers
- Reduced churn by 15%
- Strategic Recommendations:
  - Support data-driven decisions for improving customer loyalty and revenue
  - Offer incentives to shift month-to-month users to long-term contracts
  - Launch proactive support programs for customers without tech or online security services
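For reference, the ROC-AUC and recall metrics cited for this project can be computed with scikit-learn as sketched below. The labels and scores are a tiny made-up example, and the 0.5 decision threshold is an assumption; in practice the threshold would be tuned to the cost of missing a churner.

```python
from sklearn.metrics import recall_score, roc_auc_score

# Toy data: 1 = churned, 0 = retained; scores are predicted churn probabilities.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# ROC-AUC is threshold-free: it measures how well the scores rank churners above non-churners.
auc = roc_auc_score(y_true, y_score)

# Recall depends on a threshold: the share of actual churners that are flagged.
threshold = 0.5  # assumed cutoff for illustration
y_pred = [int(s >= threshold) for s in y_score]
recall = recall_score(y_true, y_pred)

print(f"ROC-AUC = {auc:.2f}, recall = {recall:.2f}")  # ROC-AUC = 0.75, recall = 0.50
```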
4. Loan Defaulter Prediction Project - Link
- Built ML models to predict loan default risk using financial and demographic data. Implemented a complete pipeline from data cleaning to explainable model insights.
- Automated preprocessing (missing values, encoding, outlier handling) and compared multiple ML models (Logistic, RandomForest, XGBoost).
- Explainability with SHAP