ReSample is a web application designed to automatically balance datasets for machine learning tasks. It helps address class imbalances in datasets, improving model generalization and performance. It provides a user-friendly interface to handle missing values, balance class distributions, visualize the results and export the processed dataset.
Additionally, ReSample features a recommendation model that suggests ten of the best-suited combinations of balancing methods based on the size and imbalance ratio of the uploaded dataset.
Try the app, deployed with Streamlit, here:
- Upload or use a sample dataset.
- Handle missing values with various strategies (drop, or fill with median/mode/mean).
- Get recommendations about balancing methods based on dataset size and imbalance ratio.
- Balance class distribution:
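  - Oversampling:
    - Random Oversampling
    - SMOTE
    - Borderline SMOTE
    - B-SMOTE SVM
    - ADASYN
  - Undersampling:
    - Random Undersampling
    - NearMiss (versions 1, 2 and 3)
    - Tomek Links
    - Condensed Nearest Neighbour (CNN)
    - Edited Nearest Neighbours (ENN)
    - One-Sided Selection (OSS)
    - Neighbourhood Cleaning Rule (NCR)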
- Visualize data before and after balancing (pie and bar charts).
- Export the processed dataset.
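
To make the balancing step in the list above more concrete, here is a minimal sketch of the two simplest methods, Random Oversampling and Random Undersampling, together with an imbalance ratio. It is written in plain Python for illustration only and is not ReSample's own code; the function names (`imbalance_ratio`, `random_oversample`, `random_undersample`) and the definition of the ratio as majority count divided by minority count are assumptions.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all classes match the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        out_rows.extend(rng.sample(pool, target))
        out_labels.extend([cls] * target)
    return out_rows, out_labels


# Toy dataset: six rows of class 0 and two rows of class 1.
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [5.0], [5.2]]
labels = [0, 0, 0, 0, 0, 0, 1, 1]

ratio_before = imbalance_ratio(labels)
over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
```

After either call, every class has the same number of rows, so the ratio becomes 1. The other listed methods (SMOTE, NearMiss, Tomek Links and so on) choose or synthesize rows based on neighbourhood structure rather than at random and are not covered by this sketch.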

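The missing-value strategies listed above (drop, or fill with median/mode/mean) can be sketched for a single numeric column as follows. This is an illustrative sketch, not the app's implementation: the helper `fill_missing` is hypothetical, missing entries are assumed to be represented as `None`, and "drop" here removes entries from one column, whereas a tabular tool would more likely drop whole rows.

```python
import statistics


def fill_missing(values, strategy="median"):
    """Handle None entries in one column using the chosen strategy."""
    observed = [v for v in values if v is not None]
    if strategy == "drop":
        # Discard the missing entries entirely.
        return observed
    fillers = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }
    # Compute the statistic from observed values only, then substitute it.
    fill = fillers[strategy](observed)
    return [fill if v is None else v for v in values]


column = [1.0, None, 3.0, 3.0, None, 8.0]

dropped = fill_missing(column, "drop")
with_mean = fill_missing(column, "mean")
with_median = fill_missing(column, "median")
with_mode = fill_missing(column, "mode")
```

The median and mode are less sensitive to outliers than the mean, which is one reason a tool might offer all three.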