
EfficientDet Lite Object Detection with ONNX & TensorRT πŸš€



πŸ“– Project Overview

EfficientDet Lite Object Detection with ONNX & TensorRT is a high-performance project designed to implement EfficientDet Lite models (versions 0 to 4) for object detection. Utilizing ONNX for model inference and TensorRT for optimized engine building, this project enables efficient and rapid deployment of object detection models with support for FP32 and FP16 precision on NVIDIA GPUs.


✨ Features

  • Support for EfficientDet Lite Models: Implemented versions 0, 1, 2, 3, and 4.
  • ONNX Inference: Run inference directly using ONNX models.
  • TensorRT Engine Building: Optimize models with TensorRT for FP32 and FP16 precision.
  • Inference Scripts: Execute inference using both ONNX and TensorRT engines seamlessly.
  • Performance Benchmarking: Compare latency and speed across different models and backends.
  • (TO BE IMPLEMENTED) INT8 Quantization: INT8 Post-Training Quantization for faster inference.

πŸ›  Installation

Libraries and Tools

  • ONNX Runtime (tested with version 1.19.2)
  • TensorRT (tested with version 10.5.0)
  • PyCUDA (tested with version 2024.1.2)
  • cuda-python (tested with version 12.2.1; the major/minor version should match your installed CUDA toolkit)

Installation Steps

  1. Clone the Repository

    git clone https://github.com/namas191297/efficientdetlite.git
    cd efficientdetlite
  2. Set Up a Virtual Environment

    conda create -n efficientdetlite python=3.9
    conda activate efficientdetlite 
  3. Install Dependencies

    pip install -r requirements.txt
  4. Download EfficientDet Lite Models

    • Follow the instructions in the Models section to obtain the required model files.

πŸš€ Usage

Building TensorRT Engines

  • FP32 Precision

    python build_engine.py --model_type <model_type>
  • FP16 Precision

    python build_engine.py --model_type <model_type> --fp16

Running Inference with TensorRT Engines

  • Using FP32 Engine

    python trt_inference_image.py --model_type <model_type> --image path/to/image.jpg
  • Using FP16 Engine

    python trt_inference_image.py --model_type <model_type> --image path/to/image.jpg --fp16

Example Usage

  • Building TRT Engine from ONNX models
# Build a .engine TRT engine for EfficientDetLite4 with FP32 precision.
python build_engine.py --model_type efficientdet_lite4

# Build a .engine TRT engine for EfficientDetLite4 with FP16 precision.
python build_engine.py --model_type efficientdet_lite4 --fp16
  • Single Image
# Inference with ONNX on a single image
python onnx_inference_image.py --model_type efficientdet_lite1 --image test.jpg --score_threshold 0.5 --top_k 5

# Inference with TRT Engine on a single image using FP32 precision.
python trt_inference_image.py --model_type efficientdet_lite1 --image test.jpg --score_threshold 0.5 --top_k 5

# Inference with TRT Engine on a single image using FP16 precision.
python trt_inference_image.py --model_type efficientdet_lite1 --image test.jpg --score_threshold 0.5 --top_k 5 --fp16
  • Webcam
# Inference with ONNX on your webcam
python onnx_inference_webcam.py --model_type efficientdet_lite1 --score_threshold 0.5 --top_k 5

# Inference with TRT Engine on your webcam using FP32 precision.
python trt_inference_webcam.py --model_type efficientdet_lite1 --score_threshold 0.5 --top_k 5

# Inference with TRT Engine on your webcam using FP16 precision.
python trt_inference_webcam.py --model_type efficientdet_lite1 --score_threshold 0.5 --top_k 5 --fp16
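The --score_threshold and --top_k flags above correspond to a standard detection-filtering step: drop low-confidence detections, then keep only the highest-scoring survivors. A minimal sketch of that logic (function and argument names here are illustrative, not the repo's actual API):

```python
import numpy as np

def filter_detections(boxes, scores, classes, score_threshold=0.5, top_k=5):
    """Keep at most top_k detections whose confidence exceeds score_threshold.

    boxes:   (N, 4) array of [ymin, xmin, ymax, xmax]
    scores:  (N,) confidence per detection
    classes: (N,) class index per detection
    """
    keep = scores >= score_threshold
    boxes, scores, classes = boxes[keep], scores[keep], classes[keep]
    order = np.argsort(scores)[::-1][:top_k]  # highest scores first
    return boxes[order], scores[order], classes[order]

boxes = np.array([[0, 0, 1, 1], [0, 0, 2, 2], [1, 1, 3, 3]], dtype=float)
scores = np.array([0.9, 0.3, 0.7])
classes = np.array([0, 1, 2])
kept_boxes, kept_scores, kept_classes = filter_detections(boxes, scores, classes)
# kept_scores -> [0.9, 0.7] (the 0.3 detection fell below the 0.5 threshold)
```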

🧠 Models

Supported Models

  • EfficientDet Lite 0
  • EfficientDet Lite 1
  • EfficientDet Lite 2
  • EfficientDet Lite 3
  • EfficientDet Lite 4
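Each Lite variant expects a fixed square input resolution: 320, 384, 448, 512, and 640 pixels for Lite0 through Lite4 respectively. As a dependency-free sketch of the resize step (the repo's actual inference scripts presumably use OpenCV; the function name and uint8 HWC layout here are assumptions based on the standard TFLite exports):

```python
import numpy as np

# Standard export input resolutions for each EfficientDet Lite variant.
INPUT_SIZES = {
    "efficientdet_lite0": 320,
    "efficientdet_lite1": 384,
    "efficientdet_lite2": 448,
    "efficientdet_lite3": 512,
    "efficientdet_lite4": 640,
}

def preprocess(image: np.ndarray, model_type: str) -> np.ndarray:
    """Resize an HxWx3 uint8 image to the model's square input and add a batch dim."""
    size = INPUT_SIZES[model_type]
    h, w, _ = image.shape
    # Nearest-neighbour resize via index maps (keeps the example dependency-free;
    # a real pipeline would typically use cv2.resize with interpolation).
    ys = (np.arange(size) * h // size).clip(0, h - 1)
    xs = (np.arange(size) * w // size).clip(0, w - 1)
    resized = image[ys[:, None], xs[None, :]]
    return resized[None, ...]  # shape: (1, size, size, 3)
```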

Model Details

  • Model Files: All models are included in this repository, but you can also download the pre-trained EfficientDet Lite models from the EfficientDetLite Google Drive repo.
  • Place all .engine files under trt_models/.
  • Place all the .onnx files under onnx_models/.
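A small helper can map a model type to the conventional file locations. The directory names come from the layout above; the exact file-name pattern (including a precision suffix on engine files) is an assumption for illustration:

```python
from pathlib import Path

def model_paths(model_type: str, fp16: bool = False):
    """Return the conventional ONNX and TensorRT engine paths for a model type.

    The "_fp16"/"_fp32" engine suffix is a hypothetical naming convention, not
    necessarily the one used by the repo's scripts.
    """
    suffix = "_fp16" if fp16 else "_fp32"
    onnx_path = Path("onnx_models") / f"{model_type}.onnx"
    engine_path = Path("trt_models") / f"{model_type}{suffix}.engine"
    return onnx_path, engine_path
```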

⚑ Performance Comparison

Latency and Speed Metrics

The following table compares the latency (ms) of each EfficientDet Lite model across different backends when running on an NVIDIA RTX 3060.

| Model | ONNX | TensorRT FP32 | TensorRT FP16 |
|-------|------|---------------|---------------|
| Lite0 | 27   | 27            | 19            |
| Lite1 | 39   | 33            | 23            |
| Lite2 | 54   | 42            | 27            |
| Lite3 | 78   | 54            | 33            |
| Lite4 | 145  | 82            | 46            |
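The speedups and effective frame rates implied by these latencies can be computed directly (numbers copied from the table above):

```python
# Latencies in ms from the benchmark table (NVIDIA RTX 3060).
latency = {
    "Lite0": {"onnx": 27, "trt_fp32": 27, "trt_fp16": 19},
    "Lite1": {"onnx": 39, "trt_fp32": 33, "trt_fp16": 23},
    "Lite2": {"onnx": 54, "trt_fp32": 42, "trt_fp16": 27},
    "Lite3": {"onnx": 78, "trt_fp32": 54, "trt_fp16": 33},
    "Lite4": {"onnx": 145, "trt_fp32": 82, "trt_fp16": 46},
}

for model, ms in latency.items():
    speedup = ms["onnx"] / ms["trt_fp16"]
    fps = 1000 / ms["trt_fp16"]
    print(f"{model}: {speedup:.1f}x over ONNX, ~{fps:.0f} FPS with TensorRT FP16")
```

For Lite4 this works out to roughly a 3.2x speedup and about 22 FPS with the FP16 engine.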

Hardware Specifications

  • GPU: NVIDIA RTX 3060
  • CUDA Version: 12.2
  • TensorRT Version: 10.5.0

πŸ“ˆ Results

Detection Examples

Detection Example

Inference using EfficientDetLite 4

Benchmark Results

The project demonstrates significant improvements in inference speed when utilizing TensorRT, especially with FP16 precision. For the larger models, TensorRT FP16 delivers roughly a 3x speedup over ONNX (e.g. Lite4: 145 ms down to 46 ms), enabling real-time object detection applications.
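Latency figures like these are typically gathered with a warm-up phase followed by averaging over many runs. A generic timing sketch, independent of this repo's scripts, that works for any inference callable:

```python
import time

def benchmark(fn, warmup=10, runs=100):
    """Return the mean latency of fn() in milliseconds, after warm-up iterations.

    For GPU backends, fn should include any device synchronization so the timer
    measures completed work rather than just kernel launches.
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs * 1000.0

# Example with a trivial CPU workload standing in for model inference:
mean_ms = benchmark(lambda: sum(range(1000)), warmup=2, runs=20)
```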


πŸ“ Repository Structure

root/
β”œβ”€β”€ onnx_models/                 # ONNX model files    
β”œβ”€β”€ trt_models/                  # TensorRT engine files                  
β”œβ”€β”€ build_engine.py              # Script to build a TRT engine from ONNX models.
β”œβ”€β”€ trt_engine_builder.py        # TRTEngineBuilder class implementation.
β”œβ”€β”€ trt_executor.py              # TRTExecutor class implementation for inference.
β”œβ”€β”€ trt_config.py                # Contains LABELS for classes and helper dictionary to build and run models. 
β”œβ”€β”€ onnx_inference_image.py      # Script to run ONNX inference on a single image.
β”œβ”€β”€ onnx_inference_webcam.py     # Script to run ONNX inference on webcam. 
β”œβ”€β”€ trt_inference_image.py       # Script to run TRT inference on a single image.
β”œβ”€β”€ trt_inference_webcam.py      # Script to run TRT inference on webcam.
β”œβ”€β”€ requirements.txt             # Python dependencies
β”œβ”€β”€ README.md                    # This file
└── LICENSE                      # License information

Scripts Overview

  • build_engine.py: Builds TensorRT engines from ONNX models with specified precision.
  • onnx_inference_image.py: Runs inference using ONNX model on a single image.
  • onnx_inference_webcam.py: Runs inference using ONNX model on webcam.
  • trt_inference_image.py: Runs inference using TRT engines on a single image.
  • trt_inference_webcam.py: Runs inference using TRT engines on webcam.

πŸ“œ License

This project is licensed under the Creative Commons Attribution 3.0 License.


πŸ“« Contact

Email: namas.brd@gmail.com
LinkedIn: Namas Bhandari

Feel free to reach out for any questions, suggestions, or collaborations!

