This project aims to predict the quality of wine based on various chemical properties using machine learning models. The dataset used for this analysis contains wine samples with different features such as acidity, sugar content, pH, and alcohol levels.
- Data Preprocessing: The raw wine data is cleaned and processed for training machine learning models.
- Model Training: Multiple machine learning models are trained to classify the quality of wine.
- Model Comparison: The models are evaluated, and their performances are compared based on accuracy scores.
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Support Vector Classifier (SVC)
- Decision Tree Classifier
- Gaussian Naive Bayes
- Random Forest Classifier
- XGBoost Classifier
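
The models listed above could be instantiated along the following lines. This is a minimal sketch, not the project's actual code: the `build_models` helper, the `random_state` value, and `max_iter=1000` are assumptions made for illustration, and all other hyperparameters are left at library defaults. XGBoost is imported inside a guard so the sketch still runs if the package is missing.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def build_models(random_state=42):
    """Return a name -> unfitted estimator mapping (hypothetical helper)."""
    models = {
        # max_iter is raised as a precaution against convergence warnings
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "K-Nearest Neighbors": KNeighborsClassifier(),
        "Support Vector Classifier": SVC(),
        "Decision Tree": DecisionTreeClassifier(random_state=random_state),
        "Gaussian Naive Bayes": GaussianNB(),
        "Random Forest": RandomForestClassifier(random_state=random_state),
    }
    try:
        # xgboost ships separately from scikit-learn; skip it if not installed
        from xgboost import XGBClassifier
        models["XGBoost"] = XGBClassifier(random_state=random_state)
    except ImportError:
        pass
    return models


models = build_models()
print(list(models))
```

Keeping the estimators in one dictionary lets the training and evaluation steps loop over them uniformly.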
The models were evaluated based on their accuracy scores. The top-performing models were:
- XGBoost Classifier: 90.42%
- Random Forest Classifier: 89.58%
- Logistic Regression: 87.50%
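
An accuracy comparison of this kind can be sketched as follows. The data here is synthetic (generated with `make_classification`) and stands in for the wine dataset, so the printed scores will not match the figures above. The `rank_by_accuracy` helper, the 80/20 split, and the choice of two models are assumptions for illustration only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def rank_by_accuracy(models, X_train, X_test, y_train, y_test):
    """Fit each model and return (name, accuracy) pairs, best first."""
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic stand-in data, NOT the wine dataset
X, y = make_classification(n_samples=500, n_features=11, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
}
ranking = rank_by_accuracy(models, X_train, X_test, y_train, y_test)
for name, acc in ranking:
    print(f"{name}: {acc:.2%}")
```

Accuracy is measured on a held-out test split so that the ranking reflects performance on samples the models did not see during training.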
- Python 3.x
- Required Libraries:
  - pandas
  - numpy
  - scikit-learn
  - xgboost
  - matplotlib
  - seaborn
You can install the required libraries by running:

    pip install pandas numpy scikit-learn xgboost matplotlib seaborn

- Data Loading: Load the wine quality dataset.
- Preprocessing: Perform data cleaning and feature engineering.
- Model Training: Train various machine learning models on the preprocessed data.
- Evaluation: Compare the models based on their accuracy scores.
- Visualization: Use matplotlib and seaborn for visualizations.
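
The loading and preprocessing steps above might look roughly like the sketch below. Several things are assumed rather than taken from the project: the column names, the randomly generated stand-in data (in practice the frame would come from something like `pd.read_csv` on the dataset file), the median imputation, and the `quality >= 7` threshold used to turn the score into a binary label. The `preprocess` helper is hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Randomly generated stand-in for the wine data (column names are assumed)
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "volatile acidity": rng.uniform(0.1, 1.2, n),
    "residual sugar": rng.uniform(1.0, 15.0, n),
    "pH": rng.uniform(2.8, 3.8, n),
    "alcohol": rng.uniform(8.0, 14.0, n),
    "quality": rng.integers(3, 9, n),
})
df.loc[[5, 17], "pH"] = np.nan                        # simulate missing values
df = pd.concat([df, df.iloc[:3]], ignore_index=True)  # simulate duplicate rows


def preprocess(frame, target="quality", threshold=7):
    """Drop duplicates, impute medians, and binarize the target (assumed rule)."""
    clean = frame.drop_duplicates().copy()
    features = clean.drop(columns=[target])
    features = features.fillna(features.median())
    labels = (clean[target] >= threshold).astype(int)
    return features, labels


X, y = preprocess(df)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Fit the scaler on the training split only to avoid leaking test statistics
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
print(X_train_scaled.shape, X_test_scaled.shape)
```

Scaling matters mainly for distance- and margin-based models such as KNN and SVC; tree-based models are largely insensitive to it.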
The project demonstrates the application of various machine learning models to predict wine quality based on chemical properties. XGBoost Classifier achieved the highest accuracy, followed by Random Forest and Logistic Regression.
This project is part of a wine quality analysis challenge. Special thanks to the creators of the dataset.