D-Robotics/rdk_model_zoo

RDK Model Zoo

Out-of-the-Box AI Model Deployment Pipelines and Full-Link Conversion Tutorials Based on D-Robotics BPU

Introduction

Mission: To give D-Robotics developers a high-performance, out-of-the-box AI deployment and validation experience across the full range of scenarios.

This repository is the official collection of BPU model examples and tools (Model Zoo) provided by D-Robotics. It targets AI model deployment and application development on the BPU (Brain Processing Unit) and helps developers get started with the BPU and bring up model inference workflows quickly.

The repository includes BPU-ready models across multiple AI domains and provides complete reference implementations from Original Model (PyTorch/ONNX) -> Fixed-point Quantization -> Inference Execution -> Result Parsing -> Example Validation, helping users understand and utilize BPU capabilities at minimal cost.

Core Value

  • 🚀 Quick BPU Adoption: Provides out-of-the-box inference pipelines so that users can complete BPU inference validation and performance evaluation quickly.
  • 🧩 Complete End-to-End Examples: Covers the entire process from algorithm export and fixed-point quantization to efficient on-board execution (.bin / .hbm). Includes model loading, preprocessing, BPU inference execution, post-processing, and result visualization.
  • 📐 Standardized Design & Documentation: Provides unified directory structures and sample code specifications, supporting Python (hbm_runtime) and C/C++ interfaces for easy understanding, secondary development, and reduced integration/maintenance costs.
  • 🌐 Full Scenario Coverage: Covers classification, detection, segmentation, pose estimation, OCR, and multi-modal models.

Hardware & System Support

This repository uses hardware-specific branches to keep maintained samples, legacy demos, and board-specific documents clearly separated. The current rdk_x5 branch is the primary delivery branch for RDK X5. The previous main branch has been renamed to rdk_x5_legacy and is kept only as the historical archive branch.

| Target Hardware | Branch | Description |
| --- | --- | --- |
| RDK X5 | rdk_x5 | Primary delivery branch for RDK X5. Recommended system version: RDK OS >= 3.5.0, based on Ubuntu 22.04 aarch64 and TROS-Humble. |
| RDK X5 legacy demos | rdk_x5_legacy | Historical archive branch for the previous RDK X5 demos. Use it only when you need to reference legacy demo content. |
| RDK X3 | rdk_x3 | Branch for RDK X3 devices. |
| RDK S series | rdk_s | Branch for RDK S series boards. Historical archived demos for RDK S series boards are kept in RDK Model Zoo S. |
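
The `RDK OS >= 3.5.0` requirement is a numeric comparison, not a string comparison: as plain strings, "3.10.0" sorts before "3.5.0". The following is a minimal sketch of such a check, assuming you have already obtained the version string from the board by whatever means your image provides. `parse_version` and `meets_minimum` are hypothetical helpers, not part of this repository, and they only handle purely numeric dotted versions.

```python
def parse_version(text):
    """Turn a dotted version such as '3.5.0' into a tuple of ints: (3, 5, 0)."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(current, minimum="3.5.0"):
    """Return True when `current` is at least `minimum`, compared numerically."""
    cur, ref = parse_version(current), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '3.5' is treated like '3.5.0'.
    width = max(len(cur), len(ref))
    cur += (0,) * (width - len(cur))
    ref += (0,) * (width - len(ref))
    return cur >= ref


# Illustrative version strings only.
for candidate in ("3.4.9", "3.5.0", "3.10.0"):
    print(candidate, meets_minimum(candidate))
```

Comparing integer tuples avoids the lexical-ordering pitfall; a version string with a non-numeric suffix would raise `ValueError` here and would need extra handling.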

Directory Structure

rdk_model_zoo/
|-- samples/
|   |-- vision/
|   |   |-- clip/                 # Image-text multimodal matching
|   |   |-- convnext/             # Image classification
|   |   |-- edgenext/             # Image classification
|   |   |-- efficientformer/      # Image classification
|   |   |-- efficientformerv2/    # Image classification
|   |   |-- efficientnet/         # Image classification
|   |   |-- efficientvit/         # Image classification
|   |   |-- fasternet/            # Image classification
|   |   |-- fastvit/              # Image classification
|   |   |-- fcos/                 # Object detection
|   |   |-- googlenet/            # Image classification
|   |   |-- lprnet/               # License plate recognition
|   |   |-- mobilenetv1/          # Image classification
|   |   |-- mobilenetv2/          # Image classification
|   |   |-- mobilenetv3/          # Image classification
|   |   |-- mobilenetv4/          # Image classification
|   |   |-- mobileone/            # Image classification
|   |   |-- modnet/               # Image matting
|   |   |-- paddleocr/            # OCR text detection and recognition
|   |   |-- repghost/             # Image classification
|   |   |-- repvgg/               # Image classification
|   |   |-- repvit/               # Image classification
|   |   |-- resnet/               # Image classification
|   |   |-- resnext/              # Image classification
|   |   |-- ultralytics_yolo/     # Detection, segmentation, pose, classification
|   |   |-- ultralytics_yolo26/   # Detection, segmentation, pose, classification
|   |   |-- vargconvnet/          # Image classification
|   |   |-- yoloe/                # Instance segmentation
|   |   |-- yolov5/               # Object detection
|   |   `-- yoloworld/            # Open-vocabulary object detection
|-- docs/                  # Project guidelines and reference documentation
|-- datasets/              # Sample datasets and download scripts
|-- tros/                  # TROS integration guides and examples
|-- utils/                 # Shared C++ / Python utilities

Quick Start

  1. Check system version: Ensure the target board is running RDK OS >= 3.5.0.
  2. Connect hardware: Ensure your RDK board is powered and network-connected. SSH or VSCode Remote SSH is recommended.
  3. Read the model README first: Always open the target directory README.md before running commands.
  4. Run the Ultralytics YOLO11x detection sample:
# Download the pre-compiled model (wget -nc skips the download if the file already exists).
cd samples/vision/ultralytics_yolo/model
wget -nc https://archive.d-robotics.cc/downloads/rdk_model_zoo/rdk_x5/ultralytics_YOLO/yolo11x_detect_bayese_640x640_nv12.bin

# Run detection on the bundled test image and save the annotated result.
cd ../runtime/python
python3 main.py \
  --task detect \
  --model-path ../../model/yolo11x_detect_bayese_640x640_nv12.bin \
  --test-img ../../../../../datasets/coco/assets/bus.jpg \
  --img-save-path ../../test_data/inference_yolo11x.jpg

Inference Result:

YOLO11x Inference Result
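
The relative paths in the Quick Start command only resolve as intended when `main.py` is launched from `samples/vision/ultralytics_yolo/runtime/python`. The sketch below is a plain path calculation that touches neither the board nor the file system; it shows where each argument points relative to the repository root. The `resolve` helper is hypothetical and exists only for this illustration.

```python
import posixpath

# Directory the sample is launched from, relative to the repository root.
RUN_DIR = "samples/vision/ultralytics_yolo/runtime/python"

# Relative arguments copied from the Quick Start command.
ARGS = {
    "--model-path": "../../model/yolo11x_detect_bayese_640x640_nv12.bin",
    "--test-img": "../../../../../datasets/coco/assets/bus.jpg",
    "--img-save-path": "../../test_data/inference_yolo11x.jpg",
}


def resolve(relative, run_dir=RUN_DIR):
    """Resolve `relative` against `run_dir`, giving a path from the repository root."""
    return posixpath.normpath(posixpath.join(run_dir, relative))


for flag, value in ARGS.items():
    print(f"{flag:16s} -> {resolve(value)}")
```

If the script is started from any other directory, the same arguments point elsewhere, which is a common cause of "file not found" errors when adapting the command.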

Model List

| Category | Model Name | Model Path | Supported Platform | Details |
| --- | --- | --- | --- | --- |
| Image Classification | ConvNeXt | samples/vision/convnext | RDK X5 | Details |
| Image Classification | EdgeNeXt | samples/vision/edgenext | RDK X5 | Details |
| Image Classification | EfficientFormer | samples/vision/efficientformer | RDK X5 | Details |
| Image Classification | EfficientFormerV2 | samples/vision/efficientformerv2 | RDK X5 | Details |
| Image Classification | EfficientNet | samples/vision/efficientnet | RDK X5 | Details |
| Image Classification | EfficientViT | samples/vision/efficientvit | RDK X5 | Details |
| Image Classification | FasterNet | samples/vision/fasternet | RDK X5 | Details |
| Image Classification | FastViT | samples/vision/fastvit | RDK X5 | Details |
| Image Classification | GoogLeNet | samples/vision/googlenet | RDK X5 | Details |
| Image Classification | MobileNetV1 | samples/vision/mobilenetv1 | RDK X5 | Details |
| Image Classification | MobileNetV2 | samples/vision/mobilenetv2 | RDK X5 | Details |
| Image Classification | MobileNetV3 | samples/vision/mobilenetv3 | RDK X5 | Details |
| Image Classification | MobileNetV4 | samples/vision/mobilenetv4 | RDK X5 | Details |
| Image Classification | MobileOne | samples/vision/mobileone | RDK X5 | Details |
| Image Classification | RepGhost | samples/vision/repghost | RDK X5 | Details |
| Image Classification | RepVGG | samples/vision/repvgg | RDK X5 | Details |
| Image Classification | RepViT | samples/vision/repvit | RDK X5 | Details |
| Image Classification | ResNet | samples/vision/resnet | RDK X5 | Details |
| Image Classification | ResNeXt | samples/vision/resnext | RDK X5 | Details |
| Image Classification | VargConvNet | samples/vision/vargconvnet | RDK X5 | Details |
| Object Detection | FCOS | samples/vision/fcos | RDK X5 | Details |
| Object Detection | YOLOv5 | samples/vision/yolov5 | RDK X5 | Details |
| Object Detection / Instance Segmentation / Pose Estimation / Image Classification | Ultralytics YOLO (YOLOv5u / YOLOv8 / YOLOv9 / YOLOv10 / YOLO11 / YOLO12 / YOLO13) | samples/vision/ultralytics_yolo | RDK X5 | Details |
| Object Detection / Instance Segmentation / Pose Estimation / Image Classification | Ultralytics YOLO26 | samples/vision/ultralytics_yolo26 | RDK X5 | Details |
| Instance Segmentation | YOLOE | samples/vision/yoloe | RDK X5 | Details |
| Image Matting | MODNet | samples/vision/modnet | RDK X5 | Details |
| OCR Text Detection and Recognition | PaddleOCR | samples/vision/paddleocr | RDK X5 | Details |
| License Plate Recognition | LPRNet | samples/vision/lprnet | RDK X5 | Details |
| Image-Text Multimodal Matching | CLIP | samples/vision/clip | RDK X5 | Details |
| Open-Vocabulary Object Detection | YOLOWorld | samples/vision/yoloworld | RDK X5 | Details |

Documentation & Resources


FAQ

1. Model accuracy doesn't meet expectations?
  • Ensure the OpenExplorer Docker image and the board-side libdnn.so are up to date.
  • Check that the model export followed the structure adjustments and operator replacements required in the model's README.
  • During quantization validation, verify that the cosine similarity of each output node is >= 0.999 (0.99 at the very least).
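
To make the cosine-similarity threshold concrete, here is a minimal sketch of the check on a single output node. It assumes you have already dumped the float model's output and the quantized model's output for the same input as flat lists of numbers; the values below are made up, and `cosine_similarity` and `grade` are hypothetical helpers, not part of the OpenExplorer toolchain.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same number of elements")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero output")
    return dot / (norm_a * norm_b)


def grade(similarity, target=0.999, floor=0.99):
    """Map a similarity onto the two thresholds quoted in the FAQ."""
    if similarity >= target:
        return "ok"
    if similarity >= floor:
        return "marginal"
    return "fail"


# Made-up values standing in for one flattened output node.
float_out = [0.12, -1.50, 3.25, 0.00, 2.10]
quant_out = [0.125, -1.50, 3.25, 0.00, 2.125]

similarity = cosine_similarity(float_out, quant_out)
print(f"{similarity:.6f} {grade(similarity)}")
```

In practice the check is repeated for every output node, since a single node below the threshold is enough to degrade end-to-end accuracy.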
2. Inference speed doesn't meet expectations?
  • Python API performance is lower than C/C++. For maximum performance, use C/C++.
  • Benchmark data (pure forward) excludes pre/post-processing. Models with NV12 input usually achieve peak BPU throughput.
  • Ensure CPU/BPU frequency is locked to maximum.
  • Check for other resource-heavy processes.
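
Because published benchmark figures cover only the forward pass, it helps to time each stage separately before comparing your numbers against them. The following is a minimal sketch using only the standard library; `preprocess`, `forward`, and `postprocess` are hypothetical stand-ins that you would replace with the real functions from the sample you are running.

```python
import time


def time_stages(stages, payload, repeats=10):
    """Run `payload` through each (name, function) stage in order, `repeats` times.

    Returns the mean wall-clock time per stage in milliseconds.
    """
    totals = {name: 0.0 for name, _ in stages}
    for _ in range(repeats):
        data = payload
        for name, func in stages:
            start = time.perf_counter()
            data = func(data)
            totals[name] += time.perf_counter() - start
    return {name: 1000.0 * total / repeats for name, total in totals.items()}


# Hypothetical stand-ins; replace them with the sample's real functions.
def preprocess(image):
    return [pixel / 255.0 for pixel in image]


def forward(tensor):
    return [value * 2.0 for value in tensor]


def postprocess(outputs):
    return max(outputs)


timings = time_stages(
    [("preprocess", preprocess), ("forward", forward), ("postprocess", postprocess)],
    payload=list(range(256)),
)
for name, ms in timings.items():
    print(f"{name:12s} {ms:.3f} ms")
print(f"{'end-to-end':12s} {sum(timings.values()):.3f} ms")
```

Only the `forward` figure is comparable to a pure-forward benchmark; the end-to-end total is what an application actually experiences.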
3. How to fix quantization precision loss?
  • Refer to the PTQ accuracy debugging section in the platform documentation.
  • If INT8 loss is severe due to model characteristics, consider Mixed Precision or QAT (Quantization-Aware Training).
4. Error "Can't reshape 1354752 in (1,3,640,640)"?

Update the resolution in preprocess.py to match your ONNX model's input size. Delete old calibration data and re-run the calibration script.
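
The numbers in the error message show the mismatch directly: a (1,3,640,640) tensor holds 1,228,800 elements, whereas 1,354,752 equals 3 x 672 x 672. That is consistent with calibration data prepared at 672x672 for a model that expects 640x640, although the message alone does not prove the layout. A minimal sketch of the arithmetic, with hypothetical helper names:

```python
import math


def element_count(shape):
    """Number of elements a tensor of `shape` holds."""
    return math.prod(shape)


def square_side(num_elements, channels=3):
    """Side length of a square image with `channels` channels, or None if it is not square."""
    if num_elements % channels:
        return None
    pixels = num_elements // channels
    side = math.isqrt(pixels)
    return side if side * side == pixels else None


expected = element_count((1, 3, 640, 640))
actual = 1354752  # element count quoted in the error message

print("model expects:", expected)
print("data provides:", actual)
print("implied square resolution:", square_side(actual))
```

Running the same calculation on your own error message indicates which resolution the stale calibration data was generated with.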

5. mAP accuracy is lower than official results (e.g., Ultralytics)?
  • Deployment uses fixed shape and INT8 quantization, unlike dynamic shape/float official tests.
  • Slight implementation differences in evaluation scripts (e.g., pycocotools).
  • NCHW-RGB to NV12 conversion adds minimal pixel-level loss.
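
The NV12 loss comes from two sources: 8-bit rounding and the fact that NV12 stores one chroma (U, V) pair per 2x2 block of pixels. The sketch below pushes a single 2x2 RGB block through that storage and back. It assumes full-range BT.601 coefficients, which is an assumption made for illustration; the conversion actually used by a given sample may differ. All function names are hypothetical.

```python
def clamp8(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def rgb_to_yuv(r, g, b):
    """Full-range BT.601 RGB -> YUV (assumed coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Inverse of rgb_to_yuv, rounded and clamped back to 8-bit RGB."""
    r = y + 1.402 * (v - 128.0)
    g = y - 0.344136 * (u - 128.0) - 0.714136 * (v - 128.0)
    b = y + 1.772 * (u - 128.0)
    return clamp8(r), clamp8(g), clamp8(b)


def nv12_round_trip(block):
    """Push a 2x2 block of RGB pixels through 8-bit 4:2:0 storage and back."""
    yuv = [rgb_to_yuv(*pixel) for pixel in block]
    luma = [clamp8(y) for y, _, _ in yuv]           # one Y sample per pixel
    u = clamp8(sum(p[1] for p in yuv) / len(yuv))   # one shared U sample
    v = clamp8(sum(p[2] for p in yuv) / len(yuv))   # one shared V sample
    return [yuv_to_rgb(y, u, v) for y in luma]


def max_abs_error(original, restored):
    """Largest per-channel difference between two lists of RGB pixels."""
    return max(abs(a - b) for p, q in zip(original, restored) for a, b in zip(p, q))


# A smooth block (small brightness changes) and a block with a hard colour edge.
smooth = [(120, 80, 40), (122, 82, 42), (118, 78, 38), (120, 80, 40)]
edge = [(255, 0, 0), (0, 0, 255), (255, 0, 0), (0, 0, 255)]

print("smooth block error:", max_abs_error(smooth, nv12_round_trip(smooth)))
print("edge block error:  ", max_abs_error(edge, nv12_round_trip(edge)))
```

Smooth regions, which dominate natural images, survive with only rounding-level error; the shared chroma sample matters mainly at sharp colour boundaries.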
6. Does the model use CPU during inference?

Yes. Operators that cannot be quantized or are not supported by the BPU fall back to the CPU. Even for pure BPU models, the input/output quantization and dequantization nodes are executed by the CPU.


Community & Contribution

We warmly welcome contributions! Please open an issue on GitHub Issues or join the discussion in the Developer Community.

License

This project is licensed under the Apache License 2.0.
