Chicken Disease Classifier

A CNN-based image classification system for identifying chicken diseases using deep learning and computer vision techniques. This project uses DVC (Data Version Control) for pipeline management and reproducibility.

Features

Deep Learning: Uses a VGG16-based CNN for image classification.
DVC Pipeline: Automated pipeline for data ingestion, model definition, training, and evaluation.
Web Interface: User-friendly web app built with FastAPI and Bootstrap for easy interaction.
Reproducibility: Global random seed (42) for consistent results, plus GPU memory growth configuration to avoid allocation errors.
Experiment Tracking: Tracks parameters (epochs, batch size, etc.) and metrics (accuracy, loss).
AWS S3 Storage: Hybrid local/cloud artifact storage with automatic model uploads.
CloudWatch Monitoring: Real-time training & prediction logging with custom metrics.

AWS Integration

This project supports AWS S3 for artifact storage and AWS CloudWatch for monitoring and observability.

Architecture

graph LR
    subgraph Local["🖥️ Local Environment"]
        TP["Training Pipeline"]
        FA["FastAPI App /predict"]
        CWC["CloudWatch Callback"]
    end

    subgraph AWS["☁️ AWS Cloud"]
        S3["S3 Bucket\n(Model, Data, Logs)"]
        CWL["CloudWatch Logs\n(Predictions, Errors)"]
        CWM["CloudWatch Metrics\n(Loss, Accuracy)"]
    end

    TP -->|upload| S3
    FA -->|logs| CWL
    CWC -->|metrics| CWM

    style AWS fill:#FF9900,color:#fff
    style Local fill:#232F3E,color:#fff
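
The `upload` arrow in the diagram could be implemented roughly as follows. This is a minimal sketch, assuming boto3 is used as the S3 client; the helper name `store_artifact`, the bucket argument, and the choice of object key are hypothetical and not taken from the repository.

```python
from pathlib import Path


def store_artifact(local_path, storage_mode, bucket=None, s3_client=None):
    """Return the artifact's location after applying the storage mode.

    Hypothetical helper: in "s3" mode the file is also uploaded to S3,
    otherwise it stays on local disk only.
    """
    path = Path(local_path)
    if storage_mode != "s3":
        return str(path)
    if not bucket:
        raise ValueError("a bucket name is required when storage_mode is 's3'")

    if s3_client is None:
        # Imported lazily so local-only runs do not need boto3 installed.
        import boto3
        s3_client = boto3.client("s3")

    # Assumption: reuse the local relative path as the S3 object key.
    key = path.as_posix()
    s3_client.upload_file(str(path), bucket, key)
    return f"s3://{bucket}/{key}"
```

Passing the client in as an argument keeps the helper easy to test without touching a real bucket.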

Setup

Copy .env.example to .env and fill in your AWS credentials:
```
cp .env.example .env
```
Set STORAGE_MODE=s3 and ENABLE_CLOUDWATCH=true in .env.
Create your S3 bucket and CloudWatch log group in AWS Console.
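
As an illustration of how these two flags might be read at runtime, here is a minimal sketch. It assumes the values from `.env` are already present in the process environment (for example, loaded by a tool such as python-dotenv); the `AwsSettings` class, the `load_aws_settings` function, and the fallback defaults (`local`, `false`) are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AwsSettings:
    storage_mode: str        # "s3" or, by assumption, "local"
    enable_cloudwatch: bool


def load_aws_settings(env=None):
    """Read STORAGE_MODE and ENABLE_CLOUDWATCH from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    mode = env.get("STORAGE_MODE", "local").strip().lower()
    flag = env.get("ENABLE_CLOUDWATCH", "false").strip().lower()
    return AwsSettings(
        storage_mode=mode,
        enable_cloudwatch=flag in {"1", "true", "yes"},
    )
```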

CloudWatch Metrics Tracked

| Metric | Description |
| --- | --- |
| `TrainingLoss` | Loss per epoch during training |
| `TrainingAccuracy` | Accuracy per epoch during training |
| `PredictionCount` | Count per prediction (with `Class` dimension) |
| `PredictionLatency` | Time taken per prediction (seconds) |
| `DiseaseDetected` | Binary flag for disease detection |
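
To show what publishing two of these metrics could look like, here is a minimal sketch built on boto3's CloudWatch `put_metric_data` call. The helper names and the namespace passed to `publish_metrics` are hypothetical; only the metric names and the `Class` dimension come from the table above.

```python
def build_metric(name, value, unit="None", dimensions=None):
    """Build one entry for the MetricData list accepted by put_metric_data."""
    datum = {"MetricName": name, "Value": float(value), "Unit": unit}
    if dimensions:
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in dimensions.items()
        ]
    return datum


def prediction_metrics(predicted_class, latency_seconds):
    """Metrics emitted for a single prediction."""
    return [
        build_metric("PredictionCount", 1, "Count", {"Class": predicted_class}),
        build_metric("PredictionLatency", latency_seconds, "Seconds"),
    ]


def publish_metrics(metric_data, namespace, cloudwatch_client=None):
    """Send the metrics to CloudWatch; the namespace is chosen by the caller."""
    if cloudwatch_client is None:
        # Imported lazily so the builders above work without boto3 installed.
        import boto3
        cloudwatch_client = boto3.client("cloudwatch")
    cloudwatch_client.put_metric_data(Namespace=namespace, MetricData=metric_data)
```

The per-epoch training metrics could reuse `build_metric` in the same way, for example from inside a training callback.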

CloudWatch Recordings

[Screenshot: training logs in CloudWatch]

[Screenshot: prediction logs in CloudWatch]

System Configuration

The pipeline has been tested on the following configuration:

OS: macOS 26.1
Model: MacBook Pro
Chip: Apple M3 Pro
Cores: 11 (5 performance and 6 efficiency)
Memory: 18 GB

Installation

Clone the repository:

git clone https://github.com/neehanthreddym/chicken_disease_clf.git
cd chicken_disease_clf

Install dependencies:
```
pip install -r requirements.txt
```

Usage (DVC Pipeline)

To run the entire machine learning pipeline (Data Ingestion -> Model Definition -> Training -> Evaluation):

dvc repro

DVC checks each stage's dependencies for changes and reruns only the stages that are affected.

Pipeline Stages

Data Ingestion (stage01_data_ingestion.py): Downloads and extracts the dataset.
Model Definition (stage02_model_definition.py): Prepares the VGG16 base model.
Model Training (stage03_training.py): Trains the model with augmented data.
Model Evaluation (stage04_evaluation.py): Evaluates the trained model and saves scores.
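
The four stages above run in a fixed order. As a rough sketch of how a driver script could chain them outside of DVC, consider the helper below; `run_stages` and the logger name are hypothetical, and the repository's `main.py` may be organized differently.

```python
import logging

logger = logging.getLogger("pipeline")


def run_stages(stages):
    """Run (name, callable) pairs in order and return the names that completed.

    An exception raised by any stage propagates, so later stages do not run.
    """
    completed = []
    for name, stage in stages:
        logger.info(">>> stage %s started", name)
        stage()
        logger.info(">>> stage %s completed", name)
        completed.append(name)
    return completed
```

Unlike `dvc repro`, a plain driver like this reruns every stage each time; skipping unchanged stages is what DVC adds on top.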

Web Application

The project includes a FastAPI-based web interface to easily classify images.

Start the application:
```
python app.py
```
Open your browser and navigate to http://localhost:8000.
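
For orientation, a `/predict` endpoint in FastAPI might look roughly like the sketch below. It is not the repository's `app.py`: the `create_app` factory, the `timed_predict` helper, the response fields, and the `predict_fn` callable (bytes in, label out) are all assumptions made for illustration.

```python
import time


def timed_predict(predict_fn, image_bytes):
    """Run the classifier on raw image bytes and measure latency in seconds."""
    start = time.perf_counter()
    label = predict_fn(image_bytes)
    latency = time.perf_counter() - start
    return {"prediction": label, "latency_seconds": latency}


def create_app(predict_fn):
    """Build a FastAPI app exposing POST /predict for image uploads."""
    # Imported here so timed_predict stays usable without FastAPI installed.
    from fastapi import FastAPI, File, UploadFile

    app = FastAPI(title="Chicken Disease Classifier")

    @app.post("/predict")
    async def predict(file: UploadFile = File(...)):
        image_bytes = await file.read()
        return timed_predict(predict_fn, image_bytes)

    return app
```

Such an app could be served with `uvicorn.run(create_app(my_predict_fn), port=8000)`, where `my_predict_fn` is a placeholder for the real model call. File uploads in FastAPI also require the `python-multipart` package.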

Reproducibility

Random Seed: A global seed of 42 is set for Python, NumPy, and TensorFlow to ensure reproducible training runs.
GPU Config: TensorFlow GPU memory growth is enabled, so memory is allocated on demand rather than reserved all at once, which helps prevent allocation errors.
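
The two settings above can be sketched as follows. This is an illustrative version, not the project's own code: the function names are hypothetical, and NumPy and TensorFlow are imported lazily so the snippet also runs where they are not installed.

```python
import random


def set_global_seed(seed=42):
    """Seed Python, and NumPy and TensorFlow when they are available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import tensorflow as tf
    except ImportError:
        return
    tf.random.set_seed(seed)


def enable_gpu_memory_growth():
    """Turn on memory growth for every visible GPU; return how many were found."""
    try:
        import tensorflow as tf
    except ImportError:
        return 0
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError:
            # Memory growth can only be changed before the GPU is initialized.
            pass
    return len(gpus)
```

Both functions should be called once at startup, before any model is built. Seeding narrows run-to-run variation, though some GPU operations can still be nondeterministic.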

Development Workflow

Follow these steps when making changes to the project:

Update config.yaml: Modify system configuration settings (paths, URLs).
Update secrets.yaml (Optional): Add sensitive credentials like API keys.
Update params.yaml: Adjust parameters for model training/testing (Epochs, Batch Size, etc.).
Update the Entity: Modify data entities (dataclasses) in src/entity for accurate input/output.
Update the Configuration Manager: Adjust src/config/configuration.py to handle new configs.
Update the Components: Modify or add components in src/components.
Update the Pipeline: Update the processing steps in src/pipeline.
Update main.py: Modify the main script if necessary.
Update dvc.yaml: Update stage dependencies and outputs if the workflow changes.
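
The entity and configuration-manager steps in this workflow follow a common pattern: a frozen dataclass describes the settings a component needs, and a manager turns the parsed YAML into that dataclass. A minimal sketch, assuming the YAML has already been loaded into a dictionary; the class names, field names, and the `data_ingestion` key are hypothetical examples rather than the repository's actual definitions.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class DataIngestionConfig:
    """Entity: the typed settings one component receives (hypothetical fields)."""
    root_dir: Path
    source_url: str


class ConfigurationManager:
    """Maps the parsed config dictionary to typed entities."""

    def __init__(self, config):
        self.config = config

    def get_data_ingestion_config(self):
        section = self.config["data_ingestion"]
        return DataIngestionConfig(
            root_dir=Path(section["root_dir"]),
            source_url=section["source_url"],
        )
```

Adding a new setting then means touching the YAML file, the dataclass, and the manager method, in that order, before the component uses it.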
