🌿 PlantDiseaseClassifier

AI-powered plant disease detection from leaf images, built with PyTorch, tracked with ClearML, and deployed on Hugging Face Spaces.



🔗 Quick Links

| Resource | Link |
|---|---|
| 📁 GitHub Repository | knd8412/PlantDiseaseClassifier |
| 🚀 Live Demo (Hugging Face) | Vinuit/PlantDiseaseClassifier |
| 🧪 Baseline CNN Experiment | c0422871afdd43a4905b6801890f3324 |
| 🧠 ResNet18 Experiment | d6035906610145b7b2cfeca0fb1fa155 |

📖 Overview

PlantDiseaseClassifier is an end-to-end machine learning system that identifies plant diseases from photographs of leaves. Upload a photo and get a diagnosis.

The model is trained on the PlantVillage dataset, which contains 55,400 images across 39 plant–disease classes at 256×256 resolution. Example classes include Tomato_Early_Blight, Grape_Black_Rot, and Apple_Cedar_Rust.

The project covers the full ML lifecycle:

Raw Data → Preprocessing → Model Training → Experiment Tracking → Evaluation → Deployment

✨ Features

  • 🧠 Custom CNN trained from scratch on PlantVillage
  • 🔄 Transfer Learning option with ResNet18
  • 📊 Real-time experiment tracking via ClearML (metrics, checkpoints, hyperparameters)
  • 🖼️ Data augmentation and normalization pipeline
  • 🌐 Interactive Gradio web app: upload any leaf photo and get a prediction
  • 📦 Batch classification support
  • 🚀 Public deployment on Hugging Face Spaces
  • 🔁 CI/CD automation via a self-hosted GitHub Actions runner
  • 🪝 Pre-commit hooks for code quality and style (flake8)
  • 🔍 Architecture auto-detection: the evaluation script identifies your model type automatically
  • 🗂️ Error gallery: visual analysis of the worst misclassification patterns

βš™οΈ Tech Stack

Component Technology
Deep Learning Framework PyTorch
Experiment Tracking ClearML (KCL-hosted server)
Web Interface Gradio
Deployment Hugging Face Spaces
CI/CD GitHub Actions (self-hosted runner)
Linting flake8
Testing pytest
Code Quality pre-commit hooks
Version Control Git / GitHub

πŸ—‚οΈ Project Structure

PlantDiseaseClassifier/
β”œβ”€β”€ .github/              # GitHub Actions CI/CD workflows
β”œβ”€β”€ configs/              # YAML configuration files for training/evaluation
β”œβ”€β”€ data/                 # Dataset utilities and preprocessing scripts
β”œβ”€β”€ datasetNotebooks/     # Exploratory data analysis notebooks
β”œβ”€β”€ examples/             # Sample images for testing
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ train.py          # Main training script
β”‚   └── evaluate.py       # Evaluation script with error gallery
β”œβ”€β”€ tests/                # pytest test suite
β”œβ”€β”€ ui/
β”‚   └── app.py            # Gradio web application
β”œβ”€β”€ requirements.txt      # Python dependencies
β”œβ”€β”€ pyproject.toml        # Project metadata and tooling config
β”œβ”€β”€ .pre-commit-config.yaml
β”œβ”€β”€ .flake8
└── README.md

🚀 Getting Started

1. Clone the Repository

git clone https://github.com/knd8412/PlantDiseaseClassifier.git
cd PlantDiseaseClassifier

2. Create and Activate a Virtual Environment

python -m venv .venv

# Windows:
.venv\Scripts\activate

# Linux / macOS:
source .venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. (Optional) Set Up ClearML Experiment Tracking

Only needed once. Skip if you don't need experiment tracking.

clearml-init

5. Launch the Web App

python -m ui.app

The Gradio app will open in your browser at http://localhost:7860.


πŸ‹οΈ Training a Model

# Train using default config (25% subset for fast prototyping)
python src/train.py --config configs/train.yaml

The dataset downloads automatically from Hugging Face Hub on first run and is cached at ~/.cache/huggingface/datasets/. Subsequent runs use the cache; no re-download is needed.

💡 Tip: Adjust batch_size in the config to fit your GPU/CPU memory. To force CPU, set device: cpu in the config file.
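
A training config might look like the sketch below. The README does not show the contents of configs/train.yaml, so every key name and value here is an assumption for illustration:

```yaml
# Hypothetical configs/train.yaml sketch; actual keys may differ.
model: resnet18        # or the custom CNN architecture
batch_size: 32         # lower this if you run out of GPU/CPU memory
device: cpu            # set to "cuda" to train on a GPU
subset_fraction: 0.25  # default 25% subset for fast prototyping
epochs: 10
lr: 0.001
```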

Outputs after training:

| File | Description |
|---|---|
| outputs/best.pt | Best model weights (by validation accuracy) |
| outputs/metrics.json | Summary of the last run's metrics |
| ClearML dashboard | Full task logs, metric curves, and registered model |

📊 Evaluating a Model

The evaluation script auto-detects the model architecture using a 3-step fallback:

  1. Checkpoint metadata: reads the embedded model_config if saved by the updated train.py
  2. Auto-inference: analyzes state_dict weight shapes and key patterns
  3. Config file fallback: uses --config or configs/train.yaml as a last resort

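The auto-inference step (step 2) can be sketched as follows; the key patterns and returned names are illustrative assumptions, not the actual evaluate.py logic:

```python
# Hypothetical sketch of architecture auto-inference from state_dict keys.
def infer_architecture(state_dict: dict) -> str:
    keys = list(state_dict.keys())
    # torchvision ResNet18 checkpoints expose residual-block keys
    # such as "layer1.0.conv1.weight".
    if any(k.startswith("layer1.") for k in keys):
        return "resnet18"
    # A from-scratch CNN built with nn.Sequential might instead use
    # a plain "features.*" prefix (an assumption about this project).
    if any(k.startswith("features.") for k in keys):
        return "custom_cnn"
    raise ValueError("Unknown architecture; pass --config explicitly")
```
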
# Basic evaluation (architecture auto-detected)
python src/evaluate.py --model outputs/best.pt --split val

# Evaluate on the test split
python src/evaluate.py --model outputs/best.pt --split test

# Validate your setup without running full evaluation
python src/evaluate.py --model outputs/best.pt --dry-run

# Skip error gallery for faster runs
python src/evaluate.py --model outputs/best.pt --split val --no-error-gallery

# Override architecture with a specific config (for old checkpoints)
python src/evaluate.py --model outputs/best.pt --config configs/train_quick_test.yaml --split val

Evaluation Output

| Output | Description |
|---|---|
| Overall Accuracy | % of correctly classified samples |
| Top-5 Accuracy | % where the correct class appears in the top 5 predictions |
| Per-class Metrics | Precision, recall, F1-score for each disease class |
| confusion_matrix.png | Visual heatmap of classification patterns |
| errors/ directory | Error gallery with misclassified samples |
| JSON results file | Full metrics and per-class statistics |
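
Top-5 accuracy counts a sample as correct when its true class appears among the five highest-scoring predictions. A standard-library sketch of the idea (the actual script presumably uses torch.topk on logits):

```python
# Illustrative top-k accuracy over raw per-class scores.
def top_k_accuracy(scores, labels, k=5):
    """scores: one list of class scores per sample; labels: true class indices."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k highest-scoring classes for this sample
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)
```
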

πŸ” Error Gallery

The error gallery visualizes the worst confusion patterns your model makes, which is useful for diagnosing where and why it fails.

python src/evaluate.py --model outputs/best.pt --split val \
    --error-gallery \
    --gallery-top-pairs 5 \
    --gallery-samples-per-pair 10

Generated output:

  • Image grids of misclassified samples per confusion pair
  • Metadata files with sample indices and confusion statistics
  • Full analysis report in Markdown format
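
Conceptually, the --gallery-top-pairs ranking boils down to counting (true, predicted) pairs among the errors. A sketch under that assumption; the function name and signature are illustrative, not the evaluate.py API:

```python
from collections import Counter

# Hypothetical ranking of the most frequent confusion pairs.
def top_confusion_pairs(y_true, y_pred, top_pairs=5):
    """Return the most common (true, predicted) label pairs among errors."""
    errors = Counter(
        (t, p) for t, p in zip(y_true, y_pred) if t != p
    )
    return errors.most_common(top_pairs)
```

The gallery would then sample up to --gallery-samples-per-pair misclassified images for each of these pairs.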

See ERROR_GALLERY_README.md for detailed documentation.


📉 Confusion Matrix Options

The confusion matrix defaults to the 15 most confused classes for readability.

# Show top 10 most confused classes
python src/evaluate.py --model outputs/best.pt --split val --cm-classes 10

# Show all 39 classes
python src/evaluate.py --model outputs/best.pt --split val --cm-classes 0
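
One plausible way to pick the "most confused" classes is to rank them by total off-diagonal mass (errors in both their row and column of the confusion matrix). This sketch is an assumption about the selection logic, not the script's actual code; note that --cm-classes 0 means "show all classes":

```python
# Hypothetical selection of the n most-confused classes.
# cm is a square confusion matrix: rows = true class, columns = predicted.
def most_confused_classes(cm, n):
    size = len(cm)
    if n == 0:  # 0 means "show all classes"
        return list(range(size))
    # Errors involving class i: off-diagonal mass in its row plus its column
    errs = [
        (sum(cm[i]) - cm[i][i]) + (sum(cm[j][i] for j in range(size)) - cm[i][i])
        for i in range(size)
    ]
    return sorted(range(size), key=lambda i: errs[i], reverse=True)[:n]
```
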

🧪 Running Tests

pytest tests/

Pre-commit hooks run automatically on every git commit to enforce code quality. To run them manually:

pre-commit run --all-files

📈 Experiment Tracking with ClearML

All training and evaluation runs are automatically logged to ClearML:

  • 📉 Accuracy and loss curves
  • 🖼️ Confusion matrix uploaded as an artifact
  • 🗂️ Error gallery images organized by confusion pair
  • 📄 Error analysis Markdown as a downloadable artifact
  • ⚙️ All hyperparameters captured automatically

Check your ClearML project dashboard after any train.py or evaluate.py run.


🌐 Deployment

The app is publicly deployed on Hugging Face Spaces:

👉 https://huggingface.co/spaces/Vinuit/PlantDiseaseClassifier

Upload any plant leaf image and receive an instant disease prediction.


👥 Contributors

| GitHub | Name |
|---|---|
| @knd8412 | Kamyar Nadarkhani |
| @Vinuitik | Vinuitik |
| @k23099462 | Jaroslav Rakoto-Miklas |
| @SoroushSoroush20041383 | Soroush |

Built with 🌱 for plant health and deep learning.
