2 changes: 2 additions & 0 deletions .gitignore
@@ -31,3 +31,5 @@ demos/detect/YOLO12/YOLO12-Detect_NCHWRGB/eval_cpp
demos/detect/YOLO12/YOLO12-Detect_YUV420SP/eval_cpp

GEMINI.md

.DS_Store
99 changes: 99 additions & 0 deletions samples/vision/hgnetv2/README.md
@@ -0,0 +1,99 @@
# HGNetV2 Model Description

English | [简体中文](./README_cn.md)

This directory provides complete usage instructions for the HGNetV2 sample in the Model Zoo, including algorithm overview, model conversion, runtime inference, model file management, and evaluation instructions.

## Algorithm Introduction

HGNetV2 is a next‑generation convolutional neural network (CNN) backbone designed to achieve the best balance between accuracy and latency on NVIDIA GPUs. Building upon the original HGNet, HGNetV2 achieves fast inference speed while maintaining high accuracy, and performs excellently in tasks such as image classification, object detection, and segmentation, making it an ideal choice for GPU‑based computer vision applications.

- **Detailed Introduction**: [docs/en/models/ImageNet1k/PP-HGNetV2.md](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/en/models/ImageNet1k/PP-HGNetV2.md)

### Algorithm Functions

HGNetV2 supports the following tasks:

- ImageNet 1000‑class image classification

### Algorithm Features

- **Aggregating multiple receptive fields**: The HG‑Block combines multi‑scale features, capturing feature information of different sizes from shallow to deep layers, which is friendly to small object detection and recognition.
- **Improved stem module**: The initial preprocessing layers of the network are improved by stacking more \(2 \times 2\) convolution kernels to learn rich local features, while using smaller channel numbers, boosting performance on high‑resolution tasks.
- **Learnable downsampling (LDS)**: Integrates an adaptive downsampling layer that preserves more useful spatial details while reducing computational redundancy.

## Directory Structure

```text
.
|-- conversion
|   |-- HGNetV2_medium.yaml
|   |-- HGNetV2_small.yaml
|   |-- README.md
|   `-- README_cn.md
|-- evaluator
|   |-- README.md
|   `-- README_cn.md
|-- model
|   |-- download.sh
|   |-- README.md
|   `-- README_cn.md
|-- runtime
|   `-- python
|       |-- main.py
|       |-- HGNetV2.py
|       |-- README.md
|       |-- README_cn.md
|       `-- run.sh
|-- test_data
|   |-- sandbar.JPEG
|   |-- classname.txt
|   `-- result.png
|-- README.md
`-- README_cn.md
```

## Quick Start

### Python

- For detailed Python instructions, please refer to [runtime/python/README.md](./runtime/python/README.md).
- Quick start command:

```bash
cd runtime/python
bash run.sh
```

## Model Conversion

- Pre‑compiled `.bin` models are provided via the [model](./model/README.md) directory.
- Conversion instructions can be found in [conversion/README.md](./conversion/README.md).

## Model Inference

This sample currently maintains only a Python inference path.

- Python inference instructions: [runtime/python/README.md](./runtime/python/README.md)
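
As a rough illustration of the classification post-processing step, the sketch below turns raw scores into probabilities and picks the most likely classes. It is not the code in `runtime/python/HGNetV2.py`; the helper names `softmax` and `top_k` are hypothetical, and it assumes the model emits one raw score per class (1000 for ImageNet), with labels such as those in `test_data/classname.txt` loaded into a list.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, class_names, k=5):
    """Return the k most probable (class_name, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(class_names[i], probs[i]) for i in ranked[:k]]


# Toy input with 4 classes instead of 1000; the labels are made up.
demo_logits = [1.0, 3.0, 0.5, 2.0]
demo_names = ["class_a", "class_b", "class_c", "class_d"]
for name, prob in top_k(demo_logits, demo_names, k=2):
    print(f"{name}: {prob:.3f}")
```

Subtracting the maximum score before exponentiating does not change the result but avoids overflow for large scores.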

## Model Evaluation

For evaluation instructions, performance data, and validation results, please refer to [evaluator/README.md](./evaluator/README.md).

## Performance Data

The following table shows the HGNetV2 performance data released on the `RDK X5`.

| Model | Input Size | Params (M) | Float Top-1 (%) | Quantized Top-1 (%) | Single‑thread Latency (ms) | Multi‑thread Latency (ms) | FPS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HGNetv2_b0 | 224x224 | 6.0 | 77.342 | 72.17 | 1.96 | 3.29 | 902.09 |
| HGNetv2_b1 | 224x224 | 6.34 | 78.872 | 73.47 | 2.41 | 3.89 | 760.13 |
| HGNetv2_b2 | 224x224 | 11.2 | 81.578 | 75.55 | 3.52 | 7.41 | 401.16 |
| HGNetv2_b3 | 224x224 | 16.3 | 82.916 | 76.51 | 4.53 | 10.37 | 287.27 |
| HGNetv2_b4 | 224x224 | 19.8 | 83.694 | 81.93 | 5.29 | 12.32 | 241.94 |
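
To make the cost of quantization easier to read off, the short sketch below computes the Top-1 drop (float minus quantized) for each row. The numbers are copied from the table above; the helper name `accuracy_drop` is only illustrative.

```python
# (float Top-1, quantized Top-1) in percent, copied from the table above.
results = {
    "HGNetv2_b0": (77.342, 72.17),
    "HGNetv2_b1": (78.872, 73.47),
    "HGNetv2_b2": (81.578, 75.55),
    "HGNetv2_b3": (82.916, 76.51),
    "HGNetv2_b4": (83.694, 81.93),
}


def accuracy_drop(float_top1, quant_top1):
    """Top-1 accuracy lost to quantization, in percentage points."""
    return round(float_top1 - quant_top1, 3)


for model, (float_top1, quant_top1) in results.items():
    drop = accuracy_drop(float_top1, quant_top1)
    print(f"{model}: {drop:.3f} points lost after quantization")
```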

![Inference result](./test_data/result.jpg)

## License

Follows the top‑level License of the Model Zoo.
99 changes: 99 additions & 0 deletions samples/vision/hgnetv2/README_cn.md
@@ -0,0 +1,99 @@
# HGNetV2 模型说明

[English](./README.md) | 简体中文

本目录给出 HGNetV2 sample 在 Model Zoo 中的完整使用说明,包括算法概览、模型转换、运行时推理、模型文件管理和评测说明。

## 算法介绍

HGNetV2 是一款专为在 NVIDIA GPU 上实现精度与延迟的最佳平衡而设计的下一代卷积神经网络(CNN)骨干网络。基于原始的 HGNet,HGNetV2 在保持高精度的同时实现了快速的推理速度,并在图像分类、目标检测和分割等任务中表现出色,因此成为基于 GPU 的计算机视觉应用的理想选择。

- **详细介绍**: [docs/zh_CN/models/ImageNet1k/PP-HGNetV2.md](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models/ImageNet1k/PP-HGNetV2.md)

### 算法功能

HGNetV2 支持以下任务:

- ImageNet 1000 类图像分类

### 算法特点

- **聚合多种感受野**:HG-Block 结合了多尺度特征,能够捕获从浅层到深层、不同大小的特征信息,对小物体的检测和识别友好。
- **更优的 Stem 模块**:改进了网络的初始预处理层,堆叠了更多的 \(2 \times 2\) 卷积核以学习丰富的局部特征,同时使用更小的通道数,提升了大分辨率任务的性能。
- **可学习的下采样(LDS)**:融合了能够自适应调整的下采样层,在减少计算冗余的同时保留了更多有用的空间细节。

## 目录结构

```text
.
|-- conversion
|   |-- HGNetV2_medium.yaml
|   |-- HGNetV2_small.yaml
|   |-- README.md
|   `-- README_cn.md
|-- evaluator
|   |-- README.md
|   `-- README_cn.md
|-- model
|   |-- download.sh
|   |-- README.md
|   `-- README_cn.md
|-- runtime
|   `-- python
|       |-- main.py
|       |-- HGNetV2.py
|       |-- README.md
|       |-- README_cn.md
|       `-- run.sh
|-- test_data
|   |-- sandbar.JPEG
|   |-- classname.txt
|   `-- result.png
|-- README.md
`-- README_cn.md
```

## 快速体验

### Python

- Python 详细说明请参考 [runtime/python/README_cn.md](./runtime/python/README_cn.md)。
- 快速体验命令:

```bash
cd runtime/python
bash run.sh
```

## 模型转换

- 预编译 `.bin` 模型通过 [model](./model/README_cn.md) 目录提供。
- 转换说明请参考 [conversion/README_cn.md](./conversion/README_cn.md)。

## 模型推理

本 sample 当前维护的推理路径为 Python。

- Python 推理说明: [runtime/python/README_cn.md](./runtime/python/README_cn.md)

## 模型评估

评测说明、性能数据和验证结果请参考 [evaluator/README_cn.md](./evaluator/README_cn.md)。

## 性能数据

下表为 `RDK X5` 上发布的 HGNetV2 性能数据。

| 模型 | 输入尺寸 | 参数量 (M) | 浮点 Top-1 | 量化 Top-1 | 单线程时延 (ms) | 多线程时延 (ms) | FPS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HGNetv2_b0 | 224x224 | 6.0 | 77.342 | 72.17 | 1.96 | 3.29 | 902.09 |
| HGNetv2_b1 | 224x224 | 6.34 | 78.872 | 73.47 | 2.41 | 3.89 | 760.13 |
| HGNetv2_b2 | 224x224 | 11.2 | 81.578 | 75.55 | 3.52 | 7.41 | 401.16 |
| HGNetv2_b3 | 224x224 | 16.3 | 82.916 | 76.51 | 4.53 | 10.37 | 287.27 |
| HGNetv2_b4 | 224x224 | 19.8 | 83.694 | 81.93 | 5.29 | 12.32 | 241.94 |

![推理结果](./test_data/result.jpg)

## License

遵循 Model Zoo 顶层 License。
48 changes: 48 additions & 0 deletions samples/vision/hgnetv2/conversion/README.md
@@ -0,0 +1,48 @@
# HGNetv2 Model Conversion and Compilation Guide

English | [简体中文](./README_cn.md)

This directory provides tools and instructions for converting HGNetv2 models into BPU quantized models (`.bin`) compatible with D-Robotics RDK hardware.

## Model Compilation Environment

To convert models, you need to install the **RDK X5 OpenExplore Toolchain**.

### Docker Installation

**RDK X5 OpenExplore 1.2.8**
```bash
docker pull openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8
```
Alternatively, obtain the offline Docker image from the D-Robotics Developer Community: [https://forum.d-robotics.cc/t/topic/28035](https://forum.d-robotics.cc/t/topic/28035)

**Start the container**:
```bash
# Mount your model zoo directory into the container
docker run -it --rm -v /path/to/rdk_model_zoo:/data openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8 /bin/bash
```

---

## Conversion Process

### 1. pth to onnx Model Conversion

We provide the script `onnx_export/export_hgnetv2_b0_bpu.py` to convert a `.pth` file to an ONNX file.

### 2. onnx to bin Model Conversion

**Prerequisites**:
- An ONNX model adapted for BPU has been exported (refer to `onnx_export/export_hgnetv2_b0_bpu.py`).
- Prepare a folder containing 20–50 images (`.jpg` or `.png`) for quantization calibration.
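
One way to assemble such a calibration folder is sketched below. It only uses the Python standard library; the function name `build_calibration_set` and the source path are hypothetical, and `../cal_data` is chosen to match the `cal_data_dir` value in `hgnetv2_b0.yaml`. Accepting `.jpeg` in addition to `.jpg` and `.png` is an assumption.

```python
import random
import shutil
from pathlib import Path


def build_calibration_set(source_dir, target_dir, count=50, seed=0):
    """Copy up to `count` randomly chosen images into target_dir.

    Returns the number of images actually copied.
    """
    source = Path(source_dir)
    target = Path(target_dir)
    images = sorted(
        p for p in source.iterdir()
        if p.is_file() and p.suffix.lower() in {".jpg", ".jpeg", ".png"}
    )
    if not images:
        raise FileNotFoundError(f"no .jpg/.png images found in {source}")
    # A fixed seed keeps the selection reproducible between runs.
    random.Random(seed).shuffle(images)
    chosen = images[:count]
    target.mkdir(parents=True, exist_ok=True)
    for image in chosen:
        shutil.copy2(image, target / image.name)
    return len(chosen)


# Example call (placeholder source path, adjust to your dataset):
# build_calibration_set("/path/to/imagenet/val", "../cal_data", count=50)
```

Sampling at random rather than taking the first files in the folder helps the calibration images cover more than one class.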

**Run the conversion**:
```bash
hb_mapper makertbin --model-type onnx --config hgnetv2_b0.yaml
```
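After a successful conversion, the generated `.bin` model file is written to the directory set by `working_dir` in the YAML config (`HGNetV2_b0_224x224_nv12` for `hgnetv2_b0.yaml`), with its file name taken from `output_model_file_prefix`.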
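
The `mean_value` and `scale_value` entries in `hgnetv2_b0.yaml` appear to be the usual ImageNet per-channel statistics rescaled to the 0–255 pixel range, applied as `(pixel - mean_value) * scale_value`. The sketch below reproduces them under that assumption; the constants and the helper name `to_mean_and_scale` are not part of this sample.

```python
# Commonly used ImageNet statistics for inputs scaled to [0, 1] (assumed origin).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def to_mean_and_scale(mean, std, max_value=255.0):
    """Convert [0, 1] mean/std into 0-255 mean and reciprocal-std scale."""
    mean_value = [m * max_value for m in mean]
    scale_value = [1.0 / (s * max_value) for s in std]
    return mean_value, scale_value


mean_value, scale_value = to_mean_and_scale(IMAGENET_MEAN, IMAGENET_STD)
print("mean_value:", " ".join(f"{v:.3f}" for v in mean_value))
print("scale_value:", " ".join(f"{v:.8f}" for v in scale_value))
```

If a model was trained with different normalization constants, both YAML entries need to be recomputed the same way before conversion.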

---

## License
The tools in this directory follow the [Apache 2.0 License](../../../../LICENSE).
47 changes: 47 additions & 0 deletions samples/vision/hgnetv2/conversion/README_cn.md
@@ -0,0 +1,47 @@
# HGNetv2 模型转换与编译指南

[English](./README.md) | 简体中文

本目录提供了将 HGNetv2 模型转换为适配地瓜机器人(D-Robotics)RDK 硬件的 BPU 量化模型(`.bin`)的工具与说明。

## 模型编译环境

为了转换模型,您需要安装 **RDK X5 OpenExplore 工具链**。

### Docker 安装

**RDK X5 OpenExplore 1.2.8**
```bash
docker pull openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8
```
或者前往地瓜开发者社区获取离线版本的 Docker 镜像: [https://forum.d-robotics.cc/t/topic/28035](https://forum.d-robotics.cc/t/topic/28035)

**启动容器**:
```bash
# 挂载您的 model zoo 目录到容器中
docker run -it --rm -v /path/to/rdk_model_zoo:/data openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8 /bin/bash
```
---

## 转换流程

### 1. pth 转 onnx 模型

我们提供了 `onnx_export/export_hgnetv2_b0_bpu.py` 脚本,可以将 pth 文件转为 onnx 文件。

### 2. onnx 转 bin 模型

**准备工作**:
- 已经导出为 BPU 适配的 ONNX 模型(参考 `onnx_export/export_hgnetv2_b0_bpu.py`)。
- 准备一个文件夹,包含 20~50 张用于量化校准的图片(`.jpg` 或 `.png`)。

**运行转换**:
```bash
hb_mapper makertbin --model-type onnx --config hgnetv2_b0.yaml
```
转换成功后,生成的 `.bin` 模型文件将位于 YAML 配置中 `working_dir` 指定的目录下(`hgnetv2_b0.yaml` 中为 `HGNetV2_b0_224x224_nv12`),文件名由 `output_model_file_prefix` 决定。

---

## License
本目录下的工具遵循 [Apache 2.0 License](../../../../LICENSE)。
46 changes: 46 additions & 0 deletions samples/vision/hgnetv2/conversion/hgnetv2_b0.yaml
@@ -0,0 +1,46 @@
# Copyright (c) 2021-2024 D-Robotics Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


model_parameters:
  onnx_model: './HGNetV2_b0.onnx'
  march: "bayes-e"
  layer_out_dump: False
  working_dir: 'HGNetV2_b0_224x224_nv12'
  output_model_file_prefix: 'HGNetV2_b0_224x224_nv12'


input_parameters:
  input_name: "input"
  input_type_rt: 'nv12'
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'
  input_shape: ''
  norm_type: 'data_mean_and_scale'
  mean_value: 123.675 116.28 103.53
  scale_value: 0.01712475 0.017507 0.01742919



calibration_parameters:
  cal_data_dir: '../cal_data'
  cal_data_type: 'float32'
  calibration_type: 'default'
  preprocess_on: True


compiler_parameters:
  compile_mode: 'latency'
  debug: False
  optimize_level: 'O3'
45 changes: 45 additions & 0 deletions samples/vision/hgnetv2/conversion/hgnetv2_b1.yaml
@@ -0,0 +1,45 @@
# Copyright (c) 2021-2024 D-Robotics Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0

# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


model_parameters:
  onnx_model: './HGNetV2_b1.onnx'
  march: "bayes-e"
  layer_out_dump: False
  working_dir: 'HGNetV2_b1_224x224_nv12'
  output_model_file_prefix: 'HGNetV2_b1_224x224_nv12'


input_parameters:
  input_name: "input"
  input_type_rt: 'nv12'
  input_type_train: 'rgb'
  input_layout_train: 'NCHW'
  input_shape: ''
  norm_type: 'data_mean_and_scale'
  mean_value: 123.675 116.28 103.53
  scale_value: 0.01712475 0.017507 0.01742919


calibration_parameters:
  cal_data_dir: '../cal_data'
  cal_data_type: 'float32'
  calibration_type: 'default'
  preprocess_on: True


compiler_parameters:
  compile_mode: 'latency'
  debug: False
  optimize_level: 'O3'