A deep learning image classification project built with PaddlePaddle, supporting both CIFAR-10 and CIFAR-100 datasets with multiple neural network architectures including ResNet and VGG.
- Overview
- Features
- Requirements
- Installation
- Quick Start
- Usage
- Model Architectures
- Dataset Support
- Project Structure
- Examples
- Performance
- Results Documentation
- Contributing

Overview:
This project implements image classification with the PaddlePaddle deep learning framework. It provides a complete training and inference pipeline for classifying images with convolutional neural networks. Training runs on CPU or GPU, including multi-GPU (multi-card) setups, and the project includes logging and visualization features.

Features:
- Multiple Model Architectures: ResNet (20, 32, 110 layers) and VGG with batch normalization and dropout
- Dataset Support: CIFAR-10 (10 classes) and CIFAR-100 (100 classes)
- Training Modes: Single GPU, Multi-GPU, CPU-only training
- Inference Engine: Batch and single image inference with accuracy metrics
- Visualization: Integration with VisualDL for training monitoring
- Flexible Configuration: Command-line arguments for easy customization
- Performance Optimization: Multi-threaded CPU training and CUDA acceleration

Requirements:
- Python 3.6+
- PaddlePaddle 1.8+
- Pillow (the maintained fork of PIL, the Python Imaging Library; imported as PIL)
- NumPy
- VisualDL (optional, for visualization)
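
A quick way to confirm that an environment matches the list above is to probe for each package before training. The sketch below is not part of the repository: the helper names `missing_modules` and `python_is_supported` are hypothetical, and it assumes the usual import names (`paddle`, `PIL`, `numpy`, `visualdl`) for the listed packages.

```python
import importlib.util
import sys

# Import names assumed for the packages listed above.
REQUIRED = ["paddle", "PIL", "numpy"]
OPTIONAL = ["visualdl"]  # only needed for training visualization


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(version_info=sys.version_info, minimum=(3, 6)):
    """Check the interpreter version against the documented minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    print("Python 3.6+:", python_is_supported())
    print("Missing required:", missing_modules(REQUIRED))
    print("Missing optional:", missing_modules(OPTIONAL))
```

This only checks that the packages can be found; it does not verify the PaddlePaddle version.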

Installation:
- Clone the repository:
  ```bash
  git clone https://github.com/VikStoykov/PaddlePaddle.git
  cd PaddlePaddle
  ```
- Install PaddlePaddle:
  ```bash
  # CPU version
  pip install paddlepaddle
  # GPU version (recommended)
  pip install paddlepaddle-gpu
  ```
- Install additional dependencies:
  ```bash
  pip install pillow numpy visualdl
  ```

Quick Start:

Train a ResNet32 model on CIFAR-10:
```bash
python run.py --train --model_path ./models/resnet32_cifar10 --infer_network ResNet32 --dataset cifar-10 --num_epochs 50 --batch_size 128
```
Train with GPU acceleration:
```bash
python run.py --train --use_cuda --model_path ./models/resnet32_cifar10_gpu --num_epochs 50
```
Classify a single image:
```bash
python run.py --infer --model_path ./models/resnet32_cifar10 --image ./images/cats/cat.jpeg --labels_file labels_cifar10.txt --expected_result cat --moment_result
```
Batch inference on multiple images:
```bash
python run.py --infer --model_path ./models/resnet32_cifar10 --images ./images/dogs/ --labels_file labels_cifar10.txt --expected_result dog
```

Usage:

General and training options:

| Parameter | Description | Default | Options |
|---|---|---|---|
| `--train` | Enable training mode | False | - |
| `--infer` | Enable inference mode | False | - |
| `--model_path` | Model storage path | Required | Path string |
| `--use_cuda` | Use GPU acceleration | False | - |
| `--infer_network` | Neural network architecture | ResNet32 | ResNet20, ResNet32, ResNet110, VGG |
| `--dataset` | Dataset to use | cifar-10 | cifar-10, cifar-100 |
| `--num_epochs` | Number of training epochs | 1 | Integer |
| `--batch_size` | Batch size for training | 128 | Integer |
| `--multi_card` | Multi-GPU training | False | - |
| `--cpu_num` | Number of CPU threads | 1 | Integer |
| `--logger` | VisualDL logging path | None | Path string |
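
To make the defaults above concrete, here is a minimal `argparse` sketch that mirrors those options. It is an illustration only: the real `run.py` may declare its arguments differently, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented options (a sketch, not the project's run.py)."""
    parser = argparse.ArgumentParser(description="CIFAR image classification")
    parser.add_argument("--train", action="store_true", help="Enable training mode")
    parser.add_argument("--infer", action="store_true", help="Enable inference mode")
    parser.add_argument("--model_path", required=True, help="Model storage path")
    parser.add_argument("--use_cuda", action="store_true", help="Use GPU acceleration")
    parser.add_argument("--infer_network", default="ResNet32",
                        choices=["ResNet20", "ResNet32", "ResNet110", "VGG"])
    parser.add_argument("--dataset", default="cifar-10",
                        choices=["cifar-10", "cifar-100"])
    parser.add_argument("--num_epochs", type=int, default=1)
    parser.add_argument("--batch_size", type=int, default=128)
    parser.add_argument("--multi_card", action="store_true", help="Multi-GPU training")
    parser.add_argument("--cpu_num", type=int, default=1, help="Number of CPU threads")
    parser.add_argument("--logger", default=None, help="VisualDL logging path")
    return parser


# Only --model_path is mandatory; everything else falls back to the documented defaults.
args = build_parser().parse_args(["--train", "--model_path", "./models/demo"])
print(args.infer_network, args.dataset, args.num_epochs, args.batch_size)
# prints: ResNet32 cifar-10 1 128
```

Note that the default of one epoch is only useful as a smoke test; the training examples in this README pass `--num_epochs` explicitly.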

Inference options:

| Parameter | Description | Default | Options |
|---|---|---|---|
| `--image` | Single image path | None | Path string |
| `--images` | Directory of images | None | Path string |
| `--labels_file` | Path to labels file | None | Path string |
| `--expected_result` | Expected classification result | None | String |
| `--moment_result` | Print instant results | False | - |
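
The inference options combine a labels file with an expected result to report accuracy. The sketch below shows one way that scoring could work; it is not taken from `module/infer.py`. The function names are hypothetical, and the one-label-per-line file format is an assumption.

```python
def load_labels(lines):
    """Map class index -> label, assuming one label per line (format is an assumption)."""
    return [line.strip() for line in lines if line.strip()]


def score_predictions(predicted_indices, labels, expected_result):
    """Return (correct, total, accuracy) for a batch that shares one expected label."""
    predicted = [labels[i] for i in predicted_indices]
    correct = sum(1 for name in predicted if name == expected_result)
    total = len(predicted)
    accuracy = correct / total if total else 0.0
    return correct, total, accuracy


# CIFAR-10 classes in their standard index order.
cifar10_labels = load_labels([
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
])

# Hypothetical model outputs (class indices) for four images from ./images/dogs/
print(score_predictions([5, 5, 3, 5], cifar10_labels, "dog"))
# prints: (3, 4, 0.75)
```

In practice `load_labels` would be called with an open file, for example `load_labels(open("labels_cifar10.txt"))`.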

Model Architectures:

ResNet:
- ResNet20: 20-layer residual network
- ResNet32: 32-layer residual network (default)
- ResNet110: 110-layer deep residual network
Features:
- Batch normalization after each convolution
- Skip connections to mitigate vanishing gradients
- Global average pooling
- Optimized for 32×32 input images
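
The depths 20, 32, and 110 match the common CIFAR-style ResNet layout, where depth = 6n + 2: three stages of n residual blocks with two convolutions each, plus the first convolution and the final classifier. Whether `model/resnet.py` follows exactly this layout is an assumption. The sketch below computes n for each variant and shows the skip connection in its simplest form, using hypothetical helper names.

```python
def blocks_per_stage(depth):
    """Residual blocks per stage for a CIFAR-style ResNet with depth = 6n + 2."""
    if (depth - 2) % 6 != 0:
        raise ValueError("depth must have the form 6n + 2")
    return (depth - 2) // 6


def residual(x, transform):
    """Skip connection: add the block input back onto the transformed output."""
    return [xi + fi for xi, fi in zip(x, transform(x))]


for depth in (20, 32, 110):
    print(depth, blocks_per_stage(depth))
# prints:
# 20 3
# 32 5
# 110 18

# With a transform that outputs zeros, the block reduces to the identity mapping.
print(residual([1.0, 2.0], lambda x: [0.0 for _ in x]))
# prints: [1.0, 2.0]
```

The identity case is why skip connections help deep networks train: a block can pass its input through unchanged, so gradients have a direct path to earlier layers.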

VGG:
- VGG-style architecture with batch normalization
- Dropout layers for regularization (0.3-0.5)
- 5 convolutional blocks with max pooling
- Two fully connected layers (512 units each)
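
Five pooling stages fit a 32×32 input exactly. Assuming each block ends in 2×2 max pooling with stride 2 and the convolutions preserve spatial size (both assumptions, not confirmed by the source), the feature map halves five times and reaches 1×1 before the fully connected layers. `pooled_sizes` is a hypothetical helper.

```python
def pooled_sizes(input_size=32, num_blocks=5, pool=2):
    """Feature-map side length after each block.

    Assumes stride-`pool` max pooling at the end of every block and
    size-preserving convolutions inside it.
    """
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        size //= pool
        sizes.append(size)
    return sizes


print(pooled_sizes())
# prints: [16, 8, 4, 2, 1]
```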

Dataset Support:

CIFAR-10:
- Classes: 10 (airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck)
- Images: 50,000 training + 10,000 test
- Resolution: 32×32 RGB
- Labels file: labels_cifar10.txt

CIFAR-100:
- Classes: 100 (various objects, animals, vehicles)
- Images: 50,000 training + 10,000 test
- Resolution: 32×32 RGB
- Labels file: labels_cifar100.txt
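
The two datasets differ only in class count and labels file, so a small lookup table is enough to describe them. This is an illustrative sketch with hypothetical names (`DATASETS`, `dataset_info`), not the project's own configuration code; the channel-first shape is an assumption.

```python
DATASETS = {
    "cifar-10": {"num_classes": 10, "labels_file": "labels_cifar10.txt"},
    "cifar-100": {"num_classes": 100, "labels_file": "labels_cifar100.txt"},
}

# channels, height, width (channel-first layout is an assumption)
IMAGE_SHAPE = (3, 32, 32)


def dataset_info(name):
    """Look up the class count and labels file for a --dataset value."""
    try:
        return DATASETS[name]
    except KeyError:
        raise ValueError(f"unknown dataset {name!r}; expected one of {sorted(DATASETS)}")


channels, height, width = IMAGE_SHAPE
print(dataset_info("cifar-100")["num_classes"], channels * height * width)
# prints: 100 3072
```

Each 32×32 RGB image therefore contributes 3,072 input values, regardless of which dataset is selected.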

Project Structure:
```
PaddlePaddle/
├── run.py                  # Main entry point
├── trainer_test.py         # Test training functionality
├── labels_cifar10.txt      # CIFAR-10 class labels
├── labels_cifar100.txt     # CIFAR-100 class labels
├── model/                  # Neural network implementations
│   ├── __init__.py
│   ├── resnet.py           # ResNet architectures
│   └── vgg.py              # VGG with batch normalization
├── module/                 # Core functionality
│   ├── trainer.py          # Training pipeline
│   ├── infer.py            # Inference engine
│   └── env.py              # Environment configuration
├── images/                 # Sample test images
│   ├── cats/
│   ├── dogs/
│   └── cars/
└── docs/                   # Documentation and results
    ├── en/
    └── bg/
        ├── results.xlsx        # Training results and performance metrics
        └── results_infer.xlsx  # Inference results and accuracy analysis
```

Examples:

Basic training:
```bash
# Train with CPU (development/testing)
python run.py --train \
    --model_path ./models/resnet32_cpu \
    --infer_network ResNet32 \
    --dataset cifar-10 \
    --num_epochs 10 \
    --batch_size 64 \
    --cpu_num 4

# Train with GPU (production)
python run.py --train \
    --use_cuda \
    --model_path ./models/resnet32_gpu \
    --infer_network ResNet32 \
    --dataset cifar-10 \
    --num_epochs 100 \
    --batch_size 128 \
    --logger ./logs/training
```

Training VGG on CIFAR-100:
```bash
python run.py --train \
    --use_cuda \
    --model_path ./models/vgg_cifar100 \
    --infer_network VGG \
    --dataset cifar-100 \
    --num_epochs 150 \
    --batch_size 64
```

Multi-GPU training:
```bash
# Set CUDA devices
export CUDA_VISIBLE_DEVICES="0,1,2,3"
python run.py --train \
    --use_cuda \
    --multi_card \
    --model_path ./models/resnet110_multi \
    --infer_network ResNet110 \
    --num_epochs 200 \
    --batch_size 256
```

Inference:
```bash
# Test a single cat image
python run.py --infer \
    --model_path ./models/resnet32_cifar10 \
    --image ./images/cats/cat.jpeg \
    --labels_file labels_cifar10.txt \
    --expected_result cat \
    --moment_result

# Batch test all dog images
python run.py --infer \
    --model_path ./models/resnet32_cifar10 \
    --images ./images/dogs/ \
    --labels_file labels_cifar10.txt \
    --expected_result dog

# Test car images and get detailed results
python run.py --infer \
    --model_path ./models/resnet32_cifar10 \
    --images ./images/cars/ \
    --labels_file labels_cifar10.txt \
    --expected_result automobile \
    --moment_result
```

Performance:

Expected accuracy:
- ResNet32: ~92% accuracy on CIFAR-10 after 100 epochs
- ResNet110: ~93%+ accuracy on CIFAR-10 after 200 epochs
- VGG: ~90% accuracy on CIFAR-10 after 150 epochs

Hardware requirements:
- Minimum: 4 GB RAM, 2 GB storage
- Recommended: 8 GB RAM, GPU with 4 GB VRAM, 10 GB storage
- Multi-GPU: Multiple GPUs with 6 GB+ VRAM each

Approximate training time:
- CPU (4 cores): 2-4 hours for 50 epochs
- Single GPU: 15-30 minutes for 50 epochs
- Multi-GPU: 5-15 minutes for 50 epochs
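
Training time scales with the number of optimizer steps, which follows from the dataset size and batch size. The sketch below works through that arithmetic for the 50,000 CIFAR training images. The helper names are hypothetical, and the even split of a global batch across GPUs is an assumption about how multi-card training divides work.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


def per_device_batch(batch_size, visible_devices):
    """Per-GPU batch size if the global batch is split evenly (an assumption)."""
    devices = [d for d in visible_devices.split(",") if d.strip()]
    return batch_size // len(devices)


steps = steps_per_epoch(50_000, 128)
print(steps, steps * 50)
# prints: 391 19550

# Same device list as the multi-GPU example: CUDA_VISIBLE_DEVICES="0,1,2,3"
print(per_device_batch(256, "0,1,2,3"))
# prints: 64
```

So 50 epochs at batch size 128 amount to 19,550 steps, which is the workload behind the timing ranges above.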

Results Documentation:
The project includes results documentation as two Excel workbooks in the docs/bg/ directory: results.xlsx and results_infer.xlsx.

results.xlsx contains training metrics and performance analysis:
- Model Performance: Accuracy and loss metrics for different architectures
- Training Configuration: Hyperparameters used for each experiment
- Convergence Analysis: Learning curves and training progression
- Comparative Results: Performance comparison between ResNet and VGG models
- Hardware Metrics: Training time and resource utilization data

results_infer.xlsx documents inference performance and accuracy:
- Classification Accuracy: Per-class and overall accuracy metrics
- Confusion Matrix: Detailed classification results breakdown
- Error Analysis: Common misclassification patterns and insights
- Performance Benchmarks: Inference speed and throughput measurements
- Test Dataset Results: Comprehensive evaluation on CIFAR-10/100 test sets
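
For readers who want to reproduce the kind of per-class accuracy and confusion-matrix figures described above, here is a minimal pure-Python sketch. It is not the code used to produce the workbooks; the function names and the sample labels are hypothetical.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count (true, predicted) pairs; each key is one cell of the confusion matrix."""
    return Counter(zip(true_labels, predicted_labels))


def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of samples of each true class that were predicted correctly."""
    totals = Counter(true_labels)
    counts = confusion_counts(true_labels, predicted_labels)
    return {label: counts[(label, label)] / totals[label] for label in totals}


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of all samples predicted correctly."""
    correct = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return correct / len(true_labels)


# Hypothetical results for four test images.
true = ["cat", "cat", "dog", "dog"]
pred = ["cat", "dog", "dog", "dog"]

print(per_class_accuracy(true, pred))
# prints: {'cat': 0.5, 'dog': 1.0}
print(overall_accuracy(true, pred))
# prints: 0.75
print(confusion_counts(true, pred)[("cat", "dog")])
# prints: 1
```

The off-diagonal cells, such as the cat image predicted as dog here, are what an error analysis of misclassification patterns would examine.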