A web application for balancing datasets.

klitsunova/ReSample

ReSample

Built with Python, Streamlit, Pandas, and Plotly.

Overview

ReSample is a web application that automatically balances datasets for machine learning tasks. Correcting class imbalance can improve how well a model generalizes, particularly on minority classes. The user-friendly interface lets you handle missing values, balance class distributions, visualize the results, and export the processed dataset.

ReSample also includes a recommendation model that suggests the ten most suitable combinations of balancing methods, based on the size and imbalance ratio of the uploaded dataset.
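
The two inputs to the recommendation model can be computed directly from the label column. The sketch below is a minimal illustration, not ReSample's own code. It assumes the imbalance ratio is the majority class count divided by the minority class count, a common convention that this README does not confirm. The function name `describe_imbalance` is hypothetical.

```python
from collections import Counter


def describe_imbalance(labels):
    """Return (dataset size, imbalance ratio) for a sequence of class labels.

    Assumption: imbalance ratio = majority class count / minority class count.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes to measure imbalance")
    majority = max(counts.values())
    minority = min(counts.values())
    return len(labels), majority / minority


# Toy label column: 90 samples of one class, 10 of the other.
labels = ["ok"] * 90 + ["fraud"] * 10
size, ratio = describe_imbalance(labels)
print(size, ratio)
```

A ratio close to 1 means the classes are roughly balanced; larger values mean a stronger imbalance.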


Live Demo

Try the app, deployed with Streamlit, here:

Streamlit App


Features

  1. Upload or use a sample dataset.
  2. Handle missing values with various strategies (drop, or fill with median/mode/mean).
  3. Get recommendations on balancing methods based on dataset size and imbalance ratio.
  4. Balance class distribution:
    • Oversampling:
      • Random Oversampling
      • SMOTE
      • Borderline SMOTE
      • B-SMOTE SVM
      • ADASYN
    • Undersampling:
      • Random Undersampling
      • NearMiss (versions 1, 2, and 3)
      • TomekLinks
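      • CNN (Condensed Nearest Neighbour)
      • ENN (Edited Nearest Neighbours)
      • OSS (One-Sided Selection)
      • NCR (Neighbourhood Cleaning Rule)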
  5. Visualize data before and after balancing (pie and bar charts).
  6. Export the processed dataset.
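
To make features 2 and 4 concrete, here is a minimal sketch of filling missing values and then applying random oversampling. It uses only the Python standard library so that it runs without extra dependencies. The README does not show how ReSample implements these steps, so the helper names `fill_missing` and `random_oversample` are hypothetical and the logic is an illustration only.

```python
import random
import statistics
from collections import Counter


def fill_missing(values, strategy="median"):
    """Replace None entries using the mean, median, or mode of the known values."""
    known = [v for v in values if v is not None]
    fill = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }[strategy](known)
    return [fill if v is None else v for v in values]


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        extra = rng.choices(pool, k=target - n)  # sampling with replacement
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy feature column with two missing values, and a 4:1 imbalanced label column.
ages = fill_missing([20, None, 40, 60, None], strategy="median")
labels = ["a", "a", "a", "a", "b"]
rows, balanced = random_oversample(ages, labels)
print(ages)
print(Counter(balanced))
```

Random oversampling only repeats existing rows. Methods such as SMOTE and ADASYN instead synthesize new minority samples, and the undersampling methods remove majority samples.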
