diff --git a/.gitignore b/.gitignore index eb2aad5..793ebb6 100644 --- a/.gitignore +++ b/.gitignore @@ -31,3 +31,5 @@ demos/detect/YOLO12/YOLO12-Detect_NCHWRGB/eval_cpp demos/detect/YOLO12/YOLO12-Detect_YUV420SP/eval_cpp GEMINI.md + +.DS_Store diff --git a/samples/vision/hgnetv2/README.md b/samples/vision/hgnetv2/README.md new file mode 100644 index 0000000..e80239f --- /dev/null +++ b/samples/vision/hgnetv2/README.md @@ -0,0 +1,99 @@ +# HGNetV2 Model Description + +English | [简体中文](./README_cn.md) + +This directory provides complete usage instructions for the HGNetV2 sample in the Model Zoo, including algorithm overview, model conversion, runtime inference, model file management, and evaluation instructions. + +## Algorithm Introduction + +HGNetV2 is a next‑generation convolutional neural network (CNN) backbone designed to achieve the best balance between accuracy and latency on NVIDIA GPUs. Building upon the original HGNet, HGNetV2 achieves fast inference speed while maintaining high accuracy, and performs excellently in tasks such as image classification, object detection, and segmentation, making it an ideal choice for GPU‑based computer vision applications. + +- **Detailed Introduction**: [docs/en/models/ImageNet1k/PP-HGNetV2.md](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/en/models/ImageNet1k/PP-HGNetV2.md) + +### Algorithm Functions + +HGNetV2 supports the following tasks: + +- ImageNet 1000‑class image classification + +### Algorithm Features + +- **Aggregating multiple receptive fields**: The HG‑Block combines multi‑scale features, capturing feature information of different sizes from shallow to deep layers, which is friendly to small object detection and recognition. +- **Improved stem module**: The initial preprocessing layers of the network are improved by stacking more \(2 \times 2\) convolution kernels to learn rich local features, while using smaller channel numbers, boosting performance on high‑resolution tasks. 
+- **Learnable downsampling (LDS)**: Integrates an adaptive downsampling layer that preserves more useful spatial details while reducing computational redundancy. + +## Directory Structure + +```text +. +|-- conversion +| |-- HGNetV2_medium.yaml +| |-- HGNetV2_small.yaml +| |-- README.md +| `-- README_cn.md +|-- evaluator +| |-- README.md +| `-- README_cn.md +|-- model +| |-- download.sh +| |-- README.md +| `-- README_cn.md +|-- runtime +| `-- python +| |-- main.py +| |-- HGNetV2.py +| |-- README.md +| |-- README_cn.md +| `-- run.sh +|-- test_data +| |-- sandbar.JPEG +| |-- classname.txt +| `-- result.png +|-- README.md +`-- README_cn.md +``` + +## Quick Start + +### Python + +- For detailed Python instructions, please refer to [runtime/python/README.md](./runtime/python/README.md). +- Quick start command: + +```bash +cd runtime/python +bash run.sh +``` + +## Model Conversion + +- Pre‑compiled `.bin` models are provided via the [model](./model/README.md) directory. +- Conversion instructions can be found in [conversion/README.md](./conversion/README.md). + +## Model Inference + +Currently, this sample maintains the Python inference path. + +- Python inference instructions: [runtime/python/README.md](./runtime/python/README.md) + +## Model Evaluation + +For evaluation instructions, performance data, and validation results, please refer to [evaluator/README.md](./evaluator/README.md). + +## Performance Data + +The following table shows the HGNetV2 performance data released on the `RDK X5`. 
+ +| Model | Input Size | Params (M) | Float Top-1 | Quantized Top-1 | Single‑thread Latency (ms) | Multi‑thread Latency (ms) | FPS | +| --- | --- | --- | --- | --- | --- | --- | --- | +| HGNetv2_b0 | 224x224 | 6.0 | 77.342 | 72.17 | 1.96 | 3.29 | 902.09 | +| HGNetv2_b1 | 224x224 | 6.34 | 78.872 | 73.47 | 2.41 | 3.89 | 760.13 | +| HGNetv2_b2 | 224x224 | 11.2 | 81.578 | 75.55 | 3.52 | 7.41 | 401.16 | +| HGNetv2_b3 | 224x224 | 16.3 | 82.916 | 76.51 | 4.53 | 10.37 | 287.27 | +| HGNetv2_b4 | 224x224 | 19.8 | 83.694 | 81.93 | 5.29 | 12.32 | 241.94 | + +![Inference result](./test_data/result.jpg) + +## License + +Follows the top‑level License of the Model Zoo. \ No newline at end of file diff --git a/samples/vision/hgnetv2/README_cn.md b/samples/vision/hgnetv2/README_cn.md new file mode 100644 index 0000000..8fbf8f2 --- /dev/null +++ b/samples/vision/hgnetv2/README_cn.md @@ -0,0 +1,99 @@ +# HGNetV2 模型说明 + +[English](./README.md) | 简体中文 + +本目录给出 HGNetV2 sample 在 Model Zoo 中的完整使用说明,包括算法概览、模型转换、运行时推理、模型文件管理和评测说明。 + +## 算法介绍 + +HGNetV2 是一款专为在 NVIDIA GPU 上实现精度与延迟的最佳平衡而设计的下一代卷积神经网络(CNN)骨干网络。基于原始的 HGNet,HGNetV2 在保持高精度的同时实现了快速的推理速度,并在图像分类、目标检测和分割等任务中表现出色,因此成为基于 GPU 的计算机视觉应用的理想选择。 + +- **详细介绍**: [docs/zh_CN/models/ImageNet1k/PP-HGNetV2.md](https://github.com/PaddlePaddle/PaddleClas/blob/develop/docs/zh_CN/models/ImageNet1k/PP-HGNetV2.md) + +### 算法功能 + +HGNetV2 支持以下任务: + +- ImageNet 1000 类图像分类 + +### 算法特点 + +- **聚合多种感受野**:HG-Block 结合了多尺度特征,能够捕获从浅层到深层、不同大小的特征信息,对小物体的检测和识别友好。 +- **更优的 Stem 模块**:改进了网络的初始预处理层,堆叠了更多的 \(2 \times 2\) 卷积核以学习丰富的局部特征,同时使用更小的通道数,提升了大分辨率任务的性能。 +- **可学习的下采样(LDS)**:融合了能够自适应调整的下采样层,在减少计算冗余的同时保留了更多有用的空间细节. + +## 目录结构 + +```text +. 
+|-- conversion +| |-- HGNetV2_medium.yaml +| |-- HGNetV2_small.yaml +| |-- README.md +| `-- README_cn.md +|-- evaluator +| |-- README.md +| `-- README_cn.md +|-- model +| |-- download.sh +| |-- README.md +| `-- README_cn.md +|-- runtime +| `-- python +| |-- main.py +| |-- HGNetV2.py +| |-- README.md +| |-- README_cn.md +| `-- run.sh +|-- test_data +| |-- sandbar.JPEG +| |-- classname.txt +| `-- result.png +|-- README.md +`-- README_cn.md +``` + +## 快速体验 + +### Python + +- Python 详细说明请参考 [runtime/python/README_cn.md](./runtime/python/README_cn.md)。 +- 快速体验命令: + +```bash +cd runtime/python +bash run.sh +``` + +## 模型转换 + +- 预编译 `.bin` 模型通过 [model](./model/README_cn.md) 目录提供。 +- 转换说明请参考 [conversion/README_cn.md](./conversion/README_cn.md)。 + +## 模型推理 + +本 sample 当前维护的推理路径为 Python。 + +- Python 推理说明: [runtime/python/README_cn.md](./runtime/python/README_cn.md) + +## 模型评估 + +评测说明、性能数据和验证结果请参考 [evaluator/README_cn.md](./evaluator/README_cn.md)。 + +## 性能数据 + +下表为 `RDK X5` 上发布的 HGNetV2 性能数据。 + +| 模型 | 输入尺寸 | 参数量 (M) | 浮点 Top-1 | 量化 Top-1 | 单线程时延 (ms) | 多线程时延 (ms) | FPS | +| --- | --- | --- | --- | --- | --- | --- | --- | +| HGNetv2_b0 | 224x224 | 6.0 | 77.342 | 72.17 | 1.96 | 3.29 | 902.09 | +| HGNetv2_b1 | 224x224 | 6.34 | 78.872 | 73.47 | 2.41 | 3.89 | 760.13 | +| HGNetv2_b2 | 224x224 | 11.2 | 81.578 | 75.55 | 3.52 | 7.41 | 401.16 | +| HGNetv2_b3 | 224x224 | 16.3 | 82.916 | 76.51 | 4.53 | 10.37 | 287.27 | +| HGNetv2_b4 | 224x224 | 19.8 | 83.694 | 81.93 | 5.29 | 12.32 | 241.94 | + +![推理结果](./test_data/result.jpg) + +## License + +遵循 Model Zoo 顶层 License。 diff --git a/samples/vision/hgnetv2/conversion/README.md b/samples/vision/hgnetv2/conversion/README.md new file mode 100644 index 0000000..6e2660a --- /dev/null +++ b/samples/vision/hgnetv2/conversion/README.md @@ -0,0 +1,48 @@ +# HGNetv2 Model Conversion and Compilation Guide + +English | [简体中文](./README_cn.md) + +This directory provides tools and instructions for converting HGNetv2 models into BPU quantized models 
(`.bin`) compatible with D-Robotics RDK hardware. + +## Model Compilation Environment + +To convert models, you need to install the **RDK X5 OpenExplore Toolchain**. + +### Docker Installation + +**RDK X5 OpenExplore 1.2.8** +```bash +docker pull openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8 +``` +Alternatively, obtain the offline Docker image from the D-Robotics Developer Community: [https://forum.d-robotics.cc/t/topic/28035](https://forum.d-robotics.cc/t/topic/28035) + +**Start the container**: +```bash +# Mount your model zoo directory into the container +docker run -it --rm -v /path/to/rdk_model_zoo:/data openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8 /bin/bash +``` + +--- + +## Conversion Process + +### 1. pth to onnx Model Conversion + +We provide the script `onnx_export/export_hgnetv2_b0_bpu.py` to convert a `.pth` file to an ONNX file. + +### 2. onnx to bin Model Conversion + +**Prerequisites**: +- An ONNX model adapted for BPU has been exported (refer to `onnx_export/export_hgnetv2_b0_bpu.py`). +- Prepare a folder containing 20–50 images (`.jpg` or `.png`) for quantization calibration. + +**Run the conversion**: +```bash +hb_mapper makertbin --model-type onnx --config hgnetv2_b0.yaml +``` +After successful conversion, the generated `.bin` model file will be located in the same directory as the ONNX model. + +--- + +## License +The tools in this directory follow the [Apache 2.0 License](../../../../LICENSE). 
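The `hgnetv2_b*.yaml` configs in this directory set `mean_value: 123.675 116.28 103.53` and `scale_value: 0.01712475 0.017507 0.01742919`. These look like the commonly used ImageNet per-channel mean and standard deviation rescaled to the 0-255 pixel range. The sketch below checks that arithmetic; the constants `IMAGENET_MEAN` and `IMAGENET_STD` are an assumption (they are not stated anywhere in this sample), so treat it as a sanity check rather than a description of how the values were produced.

```python
# Assumed ImageNet statistics on a 0-1 pixel scale (not taken from this repo).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

# The toolchain normalizes as (pixel - mean_value) * scale_value on 0-255 pixels,
# so the mean is multiplied by 255 and the scale is the reciprocal of std * 255.
mean_value = [m * 255.0 for m in IMAGENET_MEAN]
scale_value = [1.0 / (s * 255.0) for s in IMAGENET_STD]

print([round(v, 3) for v in mean_value])
print([round(v, 8) for v in scale_value])
```

If a model was trained with different normalization, both entries would need to be recomputed the same way before calibration.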
\ No newline at end of file diff --git a/samples/vision/hgnetv2/conversion/README_cn.md b/samples/vision/hgnetv2/conversion/README_cn.md new file mode 100644 index 0000000..35756df --- /dev/null +++ b/samples/vision/hgnetv2/conversion/README_cn.md @@ -0,0 +1,47 @@ +# HGNetv2 模型转换与编译指南 + +[English](./README.md) | 简体中文 + +本目录提供了将 HGNetv2 模型转换为适配地瓜机器人(D-Robotics)RDK 硬件的 BPU 量化模型(`.bin`)的工具与说明。 + +## 模型编译环境 + +为了转换模型,您需要安装 **RDK X5 OpenExplore 工具链**。 + +### Docker 安装 + +**RDK X5 OpenExplore 1.2.8** +```bash +docker pull openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8 +``` +或者前往地瓜开发者社区获取离线版本的 Docker 镜像: [https://forum.d-robotics.cc/t/topic/28035](https://forum.d-robotics.cc/t/topic/28035) + +**启动容器**: +```bash +# 挂载您的 model zoo 目录到容器中 +docker run -it --rm -v /path/to/rdk_model_zoo:/data openexplorer/ai_toolchain_ubuntu_20_x5_cpu:v1.2.8 /bin/bash +``` +--- + +## 转换流程 + +### 1. pth 转 onnx 模型 + +我们提供了 `onnx_export/export_hgnetv2_b0_bpu.py` 脚本,可以将 pth 文件转为 onnx 文件。 + +### 2. onnx 转 bin 模型 + +**准备工作**: +- 已经导出为 BPU 适配的 ONNX 模型(参考 `onnx_export/export_hgnetv2_b0_bpu.py`)。 +- 准备一个文件夹,包含 20~50 张用于量化校准的图片(`.jpg` 或 `.png`)。 + +**运行转换**: +```bash +hb_mapper makertbin --model-type onnx --config hgnetv2_b0.yaml +``` +转换成功后,生成的 `.bin` 模型文件将位于 ONNX 模型的同级目录下。 + +--- + +## License +本目录下的工具遵循 [Apache 2.0 License](../../../../LICENSE)。 \ No newline at end of file diff --git a/samples/vision/hgnetv2/conversion/hgnetv2_b0.yaml b/samples/vision/hgnetv2/conversion/hgnetv2_b0.yaml new file mode 100644 index 0000000..a85d184 --- /dev/null +++ b/samples/vision/hgnetv2/conversion/hgnetv2_b0.yaml @@ -0,0 +1,46 @@ +# Copyright (c) 2021-2024 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + + +model_parameters: + onnx_model: './HGNetV2_b0.onnx' + march: "bayes-e" + layer_out_dump: False + working_dir: 'HGNetV2_b0_224x224_nv12' + output_model_file_prefix: 'HGNetV2_b0_224x224_nv12' + + +input_parameters: + input_name: "input" + input_type_rt: 'nv12' + input_type_train: 'rgb' + input_layout_train: 'NCHW' + input_shape: '' + norm_type: 'data_mean_and_scale' + mean_value: 123.675 116.28 103.53 + scale_value: 0.01712475 0.017507 0.01742919 + + + +calibration_parameters: + cal_data_dir: '../cal_data' + cal_data_type: 'float32' + calibration_type: 'default' + preprocess_on: True + + +compiler_parameters: + compile_mode: 'latency' + debug: False + optimize_level: 'O3' diff --git a/samples/vision/hgnetv2/conversion/hgnetv2_b1.yaml b/samples/vision/hgnetv2/conversion/hgnetv2_b1.yaml new file mode 100644 index 0000000..cf9c020 --- /dev/null +++ b/samples/vision/hgnetv2/conversion/hgnetv2_b1.yaml @@ -0,0 +1,45 @@ +# Copyright (c) 2021-2024 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +model_parameters: + onnx_model: './HGNetV2_b1.onnx' + march: "bayes-e" + layer_out_dump: False + working_dir: 'HGNetV2_b1_224x224_nv12' + output_model_file_prefix: 'HGNetV2_b1_224x224_nv12' + + +input_parameters: + input_name: "input" + input_type_rt: 'nv12' + input_type_train: 'rgb' + input_layout_train: 'NCHW' + input_shape: '' + norm_type: 'data_mean_and_scale' + mean_value: 123.675 116.28 103.53 + scale_value: 0.01712475 0.017507 0.01742919 + + +calibration_parameters: + cal_data_dir: '../cal_data' + cal_data_type: 'float32' + calibration_type: 'default' + preprocess_on: True + + +compiler_parameters: + compile_mode: 'latency' + debug: False + optimize_level: 'O3' diff --git a/samples/vision/hgnetv2/conversion/hgnetv2_b2.yaml b/samples/vision/hgnetv2/conversion/hgnetv2_b2.yaml new file mode 100644 index 0000000..ec90bec --- /dev/null +++ b/samples/vision/hgnetv2/conversion/hgnetv2_b2.yaml @@ -0,0 +1,45 @@ +# Copyright (c) 2021-2024 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +model_parameters: + onnx_model: './HGNetV2_b2.onnx' + march: "bayes-e" + layer_out_dump: False + working_dir: 'HGNetV2_b2_224x224_nv12' + output_model_file_prefix: 'HGNetV2_b2_224x224_nv12' + + +input_parameters: + input_name: "input" + input_type_rt: 'nv12' + input_type_train: 'rgb' + input_layout_train: 'NCHW' + input_shape: '' + norm_type: 'data_mean_and_scale' + mean_value: 123.675 116.28 103.53 + scale_value: 0.01712475 0.017507 0.01742919 + + +calibration_parameters: + cal_data_dir: '../cal_data' + cal_data_type: 'float32' + calibration_type: 'default' + preprocess_on: True + + +compiler_parameters: + compile_mode: 'latency' + debug: False + optimize_level: 'O3' diff --git a/samples/vision/hgnetv2/conversion/hgnetv2_b3.yaml b/samples/vision/hgnetv2/conversion/hgnetv2_b3.yaml new file mode 100644 index 0000000..8e059be --- /dev/null +++ b/samples/vision/hgnetv2/conversion/hgnetv2_b3.yaml @@ -0,0 +1,45 @@ +# Copyright (c) 2021-2024 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +model_parameters: + onnx_model: './HGNetV2_b3.onnx' + march: "bayes-e" + layer_out_dump: False + working_dir: 'HGNetV2_b3_224x224_nv12' + output_model_file_prefix: 'HGNetV2_b3_224x224_nv12' + + +input_parameters: + input_name: "input" + input_type_rt: 'nv12' + input_type_train: 'rgb' + input_layout_train: 'NCHW' + input_shape: '' + norm_type: 'data_mean_and_scale' + mean_value: 123.675 116.28 103.53 + scale_value: 0.01712475 0.017507 0.01742919 + + +calibration_parameters: + cal_data_dir: '../cal_data' + cal_data_type: 'float32' + calibration_type: 'default' + preprocess_on: True + + +compiler_parameters: + compile_mode: 'latency' + debug: False + optimize_level: 'O3' diff --git a/samples/vision/hgnetv2/conversion/hgnetv2_b4.yaml b/samples/vision/hgnetv2/conversion/hgnetv2_b4.yaml new file mode 100644 index 0000000..fff60d9 --- /dev/null +++ b/samples/vision/hgnetv2/conversion/hgnetv2_b4.yaml @@ -0,0 +1,45 @@ +# Copyright (c) 2021-2024 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 + +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +model_parameters: + onnx_model: './HGNetV2_b4.onnx' + march: "bayes-e" + layer_out_dump: False + working_dir: 'HGNetV2_b4_224x224_nv12' + output_model_file_prefix: 'HGNetV2_b4_224x224_nv12' + + +input_parameters: + input_name: "input" + input_type_rt: 'nv12' + input_type_train: 'rgb' + input_layout_train: 'NCHW' + input_shape: '' + norm_type: 'data_mean_and_scale' + mean_value: 123.675 116.28 103.53 + scale_value: 0.01712475 0.017507 0.01742919 + + +calibration_parameters: + cal_data_dir: '../cal_data' + cal_data_type: 'float32' + calibration_type: 'default' + preprocess_on: True + + +compiler_parameters: + compile_mode: 'latency' + debug: False + optimize_level: 'O3' diff --git a/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b0_bpu.py b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b0_bpu.py new file mode 100644 index 0000000..7267ca2 --- /dev/null +++ b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b0_bpu.py @@ -0,0 +1,22 @@ +import timm +import torch + +# 1. Load the model and set it to inference mode +model = timm.create_model('hgnetv2_b0.ssld_stage2_ft_in1k', pretrained=True) +model.eval() + +# 2. Create a dummy input (batch_size=1, 3 color channels, 224x224 image) +dummy_input = torch.randn(1, 3, 224, 224) + +# 3. Export the model +torch.onnx.export( + model, + dummy_input, + "hgnetv2_b0.onnx", + input_names=['input'], + output_names=['output'], + opset_version=11, + dynamo=False +) + +print("Model successfully exported to hgnetv2_b0.onnx") \ No newline at end of file diff --git a/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b1_bpu.py b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b1_bpu.py new file mode 100644 index 0000000..5f80302 --- /dev/null +++ b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b1_bpu.py @@ -0,0 +1,22 @@ +import timm +import torch + +# 1. Load the model and set it to inference mode +model = timm.create_model('hgnetv2_b1.ssld_stage2_ft_in1k', pretrained=True) +model.eval() + +# 2.
Create a dummy input (batch_size=1, 3 color channels, 224x224 image) +dummy_input = torch.randn(1, 3, 224, 224) + +# 3. Export the model +torch.onnx.export( + model, + dummy_input, + "hgnetv2_b1.onnx", + input_names=['input'], + output_names=['output'], + opset_version=11, + dynamo=False +) + +print("Model successfully exported to hgnetv2_b1.onnx") \ No newline at end of file diff --git a/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b2_bpu.py b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b2_bpu.py new file mode 100644 index 0000000..0036c4f --- /dev/null +++ b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b2_bpu.py @@ -0,0 +1,22 @@ +import timm +import torch + +# 1. Load the model and set it to inference mode +model = timm.create_model('hgnetv2_b2.ssld_stage2_ft_in1k', pretrained=True) +model.eval() + +# 2. Create a dummy input (batch_size=1, 3 color channels, 224x224 image) +dummy_input = torch.randn(1, 3, 224, 224) + +# 3. Export the model +torch.onnx.export( + model, + dummy_input, + "hgnetv2_b2.onnx", + input_names=['input'], + output_names=['output'], + opset_version=11, + dynamo=False +) + +print("Model successfully exported to hgnetv2_b2.onnx") \ No newline at end of file diff --git a/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b3_bpu.py b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b3_bpu.py new file mode 100644 index 0000000..7334db9 --- /dev/null +++ b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b3_bpu.py @@ -0,0 +1,22 @@ +import timm +import torch + +# 1. Load the model and set it to inference mode +model = timm.create_model('hgnetv2_b3.ssld_stage2_ft_in1k', pretrained=True) +model.eval() + +# 2. Create a dummy input (batch_size=1, 3 color channels, 224x224 image) +dummy_input = torch.randn(1, 3, 224, 224) + +# 3.
Export the model +torch.onnx.export( + model, + dummy_input, + "hgnetv2_b3.onnx", + input_names=['input'], + output_names=['output'], + opset_version=11, + dynamo=False +) + +print("Model successfully exported to hgnetv2_b3.onnx") \ No newline at end of file diff --git a/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b4_bpu.py b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b4_bpu.py new file mode 100644 index 0000000..437835e --- /dev/null +++ b/samples/vision/hgnetv2/conversion/onnx_export/export_hgnetv2_b4_bpu.py @@ -0,0 +1,22 @@ +import timm +import torch + +# 1. Load the model and set it to inference mode +model = timm.create_model('hgnetv2_b4.ssld_stage2_ft_in1k', pretrained=True) +model.eval() + +# 2. Create a dummy input (batch_size=1, 3 color channels, 224x224 image) +dummy_input = torch.randn(1, 3, 224, 224) + +# 3. Export the model +torch.onnx.export( + model, + dummy_input, + "hgnetv2_b4.onnx", + input_names=['input'], + output_names=['output'], + opset_version=11, + dynamo=False +) + +print("Model successfully exported to hgnetv2_b4.onnx") \ No newline at end of file diff --git a/samples/vision/hgnetv2/evaluator/README.md b/samples/vision/hgnetv2/evaluator/README.md new file mode 100644 index 0000000..0396264 --- /dev/null +++ b/samples/vision/hgnetv2/evaluator/README.md @@ -0,0 +1,47 @@ +# Model Evaluation + +This directory provides benchmark instructions and validation references for the HGNetv2 sample. + +## Supported Models + +| Model | Input Size | Number of Classes | +| --- | --- | --- | +| HGNetv2_b0 | 224x224 | 1000 | +| HGNetv2_b1 | 224x224 | 1000 | +| HGNetv2_b2 | 224x224 | 1000 | +| HGNetv2_b3 | 224x224 | 1000 | +| HGNetv2_b4 | 224x224 | 1000 | + +## Test Environment + +- Platform: `RDK X5` +- Runtime Backend: `hbm_runtime` +- Model Format: `.bin` +- CPU: 8xA55@1.8GHz, all cores in Performance mode +- BPU: 1xBayes-e@1GHz, equivalent to 10 TOPS INT8 compute power + +## Metrics Description + +- **Float Top-1**: Classification accuracy of the ONNX model before quantization. 
+- **Quantized Top-1**: Actual inference accuracy of the deployed quantized model. +- **Single‑thread Latency**: Inference latency for a single frame, single thread, and single BPU core. +- **Multi‑thread Latency**: Measured latency under multi‑threaded task submission. +- **FPS**: Multi‑thread throughput test result on `RDK X5`. + +## Benchmark Results + +| Model | Input Size | Params (M) | Float Top-1 | Quantized Top-1 | Single‑thread Latency (ms) | Multi‑thread Latency (ms) | FPS | +| --- | --- | --- | --- | --- | --- | --- | --- | +| HGNetv2_b0 | 224x224 | 6.0 | 77.342 | 72.17 | 1.96 | 3.29 | 902.09 | +| HGNetv2_b1 | 224x224 | 6.34 | 78.872 | 73.47 | 2.41 | 3.89 | 760.13 | +| HGNetv2_b2 | 224x224 | 11.2 | 81.578 | 75.55 | 3.52 | 7.41 | 401.16 | +| HGNetv2_b3 | 224x224 | 16.3 | 82.916 | 76.51 | 4.53 | 10.37 | 287.27 | +| HGNetv2_b4 | 224x224 | 19.8 | 83.694 | 81.93 | 5.29 | 12.32 | 241.94 | + +## Validation Instructions + +This sample is validated through the standard Python inference pipeline: + +- `evaluator/eval.py` + +The validation dataset is ImageNet-1k val. 
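The Top-1 figures reported above come down to comparing each ground-truth label with the model's ranked class predictions. A minimal sketch of that counting, assuming predictions are already available as ranked class-index lists; the function name `topk_accuracy` and the `demo` data are hypothetical and only illustrate the metric, they are not part of `evaluator/eval.py`:

```python
from typing import Iterable, Sequence, Tuple


def topk_accuracy(
    samples: Iterable[Tuple[int, Sequence[int]]],
    k: int = 5,
) -> Tuple[float, float]:
    """Return (top-1, top-k) accuracy over (label, ranked_predictions) pairs."""
    total = top1 = topk = 0
    for label, ranked in samples:
        total += 1
        # Top-1 hit: the highest-ranked class equals the ground truth.
        if ranked and ranked[0] == label:
            top1 += 1
        # Top-k hit: the ground truth appears among the first k predictions.
        if label in ranked[:k]:
            topk += 1
    if total == 0:
        return 0.0, 0.0
    return top1 / total, topk / total


# Hypothetical data: (ground-truth label, class indices sorted by score).
demo = [
    (3, [3, 7, 1, 0, 9]),  # top-1 hit
    (4, [2, 4, 8, 5, 6]),  # top-5 hit only
    (5, [0, 1, 2, 3, 4]),  # miss
    (6, [6, 5, 4, 3, 2]),  # top-1 hit
]
top1, top5 = topk_accuracy(demo, k=5)
print(f"top1={top1:.2f} top5={top5:.2f}")
```

Slicing with `ranked[:k]` keeps the top-k count meaningful even when the model returns more than `k` candidates.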
\ No newline at end of file diff --git a/samples/vision/hgnetv2/evaluator/README_cn.md b/samples/vision/hgnetv2/evaluator/README_cn.md new file mode 100644 index 0000000..7ab0a6c --- /dev/null +++ b/samples/vision/hgnetv2/evaluator/README_cn.md @@ -0,0 +1,49 @@ +[English](./README.md) | 简体中文 + +# 模型评测 + +本目录提供 HGNetv2 sample 的 benchmark 说明和验证参考。 + +## 支持模型 + +| 模型 | 输入尺寸 | 类别数 | +| --- | --- | --- | +| HGNetv2_b0 | 224x224 | 1000 | +| HGNetv2_b1 | 224x224 | 1000 | +| HGNetv2_b2 | 224x224 | 1000 | +| HGNetv2_b3 | 224x224 | 1000 | +| HGNetv2_b4 | 224x224 | 1000 | + +## 测试环境 + +- 平台:`RDK X5` +- 运行时后端:`hbm_runtime` +- 模型格式:`.bin` +- CPU:8xA55@1.8GHz,全核 Performance 调度 +- BPU:1xBayes-e@1GHz,等效 10TOPS INT8 算力 + +## 指标说明 + +- 浮点 Top-1 为量化前 ONNX 模型的分类精度。 +- 量化 Top-1 为量化后部署模型的实际推理精度。 +- 单线程时延为单帧、单线程、单 BPU 核的推理时延。 +- 多线程时延为多线程任务提交场景下的测量结果。 +- FPS 为 `RDK X5` 上的多线程吞吐测试结果。 + +## Benchmark 结果 + +| 模型 | 输入尺寸 | 参数量 (M) | 浮点 Top-1 | 量化 Top-1 | 单线程时延 (ms) | 多线程时延 (ms) | FPS | +| --- | --- | --- | --- | --- | --- | --- | --- | +| HGNetv2_b0 | 224x224 | 6.0 | 77.342 | 72.17 | 1.96 | 3.29 | 902.09 | +| HGNetv2_b1 | 224x224 | 6.34 | 78.872 | 73.47 | 2.41 | 3.89 | 760.13 | +| HGNetv2_b2 | 224x224 | 11.2 | 81.578 | 75.55 | 3.52 | 7.41 | 401.16 | +| HGNetv2_b3 | 224x224 | 16.3 | 82.916 | 76.51 | 4.53 | 10.37 | 287.27 | +| HGNetv2_b4 | 224x224 | 19.8 | 83.694 | 81.93 | 5.29 | 12.32 | 241.94 | + +## 验证说明 + +本 sample 通过标准 Python 运行链路进行验证: + +- `evaluator/eval.py` + +验证的数据集为ImageNet-1k val diff --git a/samples/vision/hgnetv2/evaluator/eval.py b/samples/vision/hgnetv2/evaluator/eval.py new file mode 100644 index 0000000..a2aa6da --- /dev/null +++ b/samples/vision/hgnetv2/evaluator/eval.py @@ -0,0 +1,230 @@ +#!/usr/bin/env python3 +# Copyright (c) 2026 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +"""HGNetV2 ImageNet Evaluation Script (recursive directories, CSV ground truth). + +Supports datasets with subfolders. The CSV file should contain relative paths +(e.g., 'imagenet_val/n01440764/xxx.JPEG') and zero‑based labels. +The script recursively scans `--image-path`, computes the relative path of each image, +and matches it against the CSV entries exactly. +""" + +import argparse +import csv +import json +import logging +import os +import sys +from datetime import datetime +from time import time + +import cv2 + +# 添加项目根目录与 runtime 路径 +current_dir = os.path.dirname(os.path.abspath(__file__)) +project_root = os.path.abspath(os.path.join(current_dir, "../../../../../")) +runtime_path = os.path.abspath(os.path.join(current_dir, "../runtime/python")) +sys.path.append(project_root) +sys.path.append(runtime_path) + +from hgnetv2 import HGNetV2, HGNetV2Config + +logging.basicConfig( + level=logging.INFO, + format="[%(name)s] [%(asctime)s.%(msecs)03d] [%(levelname)s] %(message)s", + datefmt="%H:%M:%S", +) +logger = logging.getLogger("HGNetV2_Eval_Recursive") + + +def load_ground_truth_csv(csv_path: str): + """ + 从 CSV 文件加载 ground truth,返回 {相对路径字符串: 标签} 的字典。 + 第一列保持原样(不做任何 basename 提取),第二列为标签(0‑based)。 + 自动跳过标题行(如果第一行包含 'image' 和 'category')。 + """ + gt_map = {} + with open(csv_path, "r") as f: + reader = csv.reader(f) + for row in reader: + if len(row) < 2: + continue + # 跳过可能的标题行 + if row[0].strip().lower() == "image:file" and row[1].strip().lower() == "category": + continue + img_rel_path = row[0].strip() + try: + label = int(row[1].strip()) + except 
ValueError: + logger.warning(f"Invalid label in CSV: {row[1]}, skipping") + continue + + # 统一路径分隔符为 '/',便于跨平台匹配(Windows 下 CSV 中可能也是 '/') + img_rel_path = img_rel_path.replace("\\", "/") + gt_map[img_rel_path] = label + + logger.info(f"Loaded {len(gt_map)} ground truth entries from {csv_path}") + return gt_map + + +def collect_images_with_relative_paths(image_root: str): + """ + 递归遍历 image_root 目录,收集所有图像文件。 + 返回列表,每个元素为 (相对路径字符串, 绝对路径)。 + 相对路径使用 '/' 作为分隔符,并且不包含前导 './'。 + """ + image_extensions = (".jpg", ".jpeg", ".png", ".JPEG") + results = [] + # 确保根目录以 os.sep 结尾,方便后续计算相对路径 + root_norm = os.path.abspath(image_root) + os.sep + + for dirpath, _, filenames in os.walk(image_root): + for f in filenames: + if f.lower().endswith(image_extensions): + abs_path = os.path.join(dirpath, f) + # 计算相对于 image_root 的相对路径 + rel_path = os.path.relpath(abs_path, image_root) + # 统一分隔符为 '/' + rel_path = rel_path.replace("\\", "/") + results.append((rel_path, abs_path)) + + # 可选:排序,便于观察 + results.sort(key=lambda x: x[0]) + logger.info(f"Found {len(results)} images under {image_root}") + return results + + +def main(): + parser = argparse.ArgumentParser(description="HGNetV2 ImageNet Evaluation with recursive subdirectories and CSV ground truth") + parser.add_argument("--model-path", type=str, required=True, help="Path to the quantized HGNetV2 *.bin model.") + parser.add_argument("--image-path", type=str, required=True, help="Root directory containing the validation images (with subfolders).") + parser.add_argument("--val-csv", type=str, required=True, help="Path to the CSV file with columns: image_path, category (0‑based).") + parser.add_argument("--label-file", type=str, default="", help="Path to ImageNet class names (optional).") + parser.add_argument("--json-save-path", type=str, default="hgnetv2_cls_results.json", help="Path to save evaluation results.") + parser.add_argument("--limit", type=int, default=0, help="Limit the number of images to evaluate (0 = all).") + 
parser.add_argument("--topk", type=int, default=5, help="Top K for accuracy evaluation.") + parser.add_argument("--resize-type", type=int, default=0, help="Resize type (0: direct, 1: letterbox).") + parser.add_argument("--priority", type=int, default=0, help="Model scheduling priority (0~255).") + parser.add_argument("--bpu-cores", nargs="+", type=int, default=[0], help="BPU core indices.") + args = parser.parse_args() + + # 检查输入 + if not os.path.exists(args.model_path): + logger.error(f"Model not found: {args.model_path}") + return + if not os.path.exists(args.val_csv): + logger.error(f"CSV file not found: {args.val_csv}") + return + if not os.path.isdir(args.image_path): + logger.error(f"Image directory not found: {args.image_path}") + return + + # 加载 ground truth 映射 (相对路径 -> 标签) + gt_map = load_ground_truth_csv(args.val_csv) + + # 初始化模型 + config = HGNetV2Config( + model_path=args.model_path, + label_file=args.label_file if args.label_file else "", + resize_type=args.resize_type, + topk=args.topk, + ) + model = HGNetV2(config) + model.set_scheduling_params(priority=args.priority, bpu_cores=args.bpu_cores) + + # 收集所有图像及其相对路径 + images = collect_images_with_relative_paths(args.image_path) + if args.limit > 0: + images = images[:args.limit] + + total_imgs = len(images) + logger.info(f"Will evaluate up to {total_imgs} images") + + # 统计 + matched = 0 + total_cnt = 0 + top1_cnt = 0 + top5_cnt = 0 + t_start = time() + + for idx, (rel_path, abs_path) in enumerate(images): + if (idx + 1) % 100 == 0: + fps = (idx + 1) / (time() - t_start) + logger.info(f"Processed {idx + 1}/{total_imgs} - {fps:.1f} FPS") + + # 查找 ground truth + truth_label = gt_map.get(rel_path) + if truth_label is None: + logger.debug(f"No ground truth for {rel_path}, skipping") + continue + matched += 1 + + img = cv2.imread(abs_path) + if img is None: + logger.error(f"Failed to read image: {abs_path}") + continue + + try: + topk_idx, topk_prob, _ = model.predict(img) + pred_ids = topk_idx.tolist() + 
except Exception as e: + logger.error(f"Error processing {rel_path}: {e}") + continue + + total_cnt += 1 + if truth_label == pred_ids[0]: + top1_cnt += 1 + top5_cnt += 1 + elif truth_label in pred_ids: + top5_cnt += 1 + + elapsed = time() - t_start + top1_acc = top1_cnt / total_cnt if total_cnt else 0.0 + top5_acc = top5_cnt / total_cnt if total_cnt else 0.0 + fps = total_cnt / elapsed if elapsed else 0.0 + + summary = { + "date": datetime.now().strftime("%Y-%m-%d %H:%M:%S"), + "model": args.model_path, + "image_root": args.image_path, + "csv_file": args.val_csv, + "total_images_scanned": total_imgs, + "matched_to_gt": matched, + "successful_inferences": total_cnt, + "top1_acc": top1_acc, + "top5_acc": top5_acc, + "fps": fps, + "config": { + "resize_type": args.resize_type, + "topk": args.topk, + "bpu_cores": args.bpu_cores, + "priority": args.priority, + }, + } + + logger.info("Evaluation finished.") + logger.info(f"Matched {matched}/{total_imgs} images to ground truth") + logger.info(f"Successful inferences: {total_cnt}") + logger.info(f"Top-1 Accuracy: {top1_acc:.4f} ({top1_cnt}/{total_cnt})") + logger.info(f"Top-5 Accuracy: {top5_acc:.4f} ({top5_cnt}/{total_cnt})") + logger.info(f"Average FPS: {fps:.2f}") + + with open(args.json_save_path, "w") as f: + json.dump(summary, f, indent=4) + logger.info(f"Results saved to {args.json_save_path}") + + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/samples/vision/hgnetv2/evaluator/hgnetv2.py b/samples/vision/hgnetv2/evaluator/hgnetv2.py new file mode 100644 index 0000000..d8a8338 --- /dev/null +++ b/samples/vision/hgnetv2/evaluator/hgnetv2.py @@ -0,0 +1,232 @@ +# Copyright (c) 2026 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. 
+# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +HGNetV2 inference module. + +This module implements the standardized HGNetV2 image classification +pipeline for `RDK X5`, including preprocessing, BPU execution, and Top-K +classification post-processing. + +Key Features: + - Load quantized `.bin` models with `hbm_runtime`. + - Convert BGR images to packed NV12 tensors required by the runtime. + - Run ImageNet-1k classification and return Top-K results. + - Keep the runtime interface aligned with the other classification samples. +""" + +from __future__ import annotations + +import os +import sys +import time +from dataclasses import dataclass +from typing import Dict, List, Optional, Tuple + +import cv2 +import hbm_runtime +import numpy as np +from scipy.special import softmax + +sys.path.append(os.path.abspath("../../../../../")) +import utils.py_utils.file_io as file_io +import utils.py_utils.preprocess as pre_utils + + +@dataclass +class HGNetV2Config: + """ + Configuration for HGNetV2 classification inference. + + Args: + model_path: Path to the compiled `.bin` model file. + label_file: Path to the ImageNet label file used for result decoding. + resize_type: Resize strategy used during preprocessing. + topk: Number of Top-K classes to return. + """ + + model_path: str + label_file: Optional[str] = None + resize_type: int = 0 + topk: int = 5 + + +class HGNetV2: + """ + HGNetV2 classification wrapper based on `hbm_runtime`. + + This class exposes the standard pipeline required by this repository: + `pre_process`, `forward`, `post_process`, `predict`, and `__call__`. 
+ The runtime assumes a single packed-NV12 input and a single logits output. + """ + + def __init__(self, config: HGNetV2Config): + """ + Initialize the runtime wrapper and parse model metadata. + + Args: + config: Runtime configuration including model path and inference + parameters. + """ + + self.cfg = config + self.model = hbm_runtime.HB_HBMRuntime(config.model_path) + self.model_name = self.model.model_names[0] + self.input_names = self.model.input_names[self.model_name] + self.output_names = self.model.output_names[self.model_name] + self.input_shapes = self.model.input_shapes[self.model_name] + self.input_h = self.input_shapes[self.input_names[0]][2] + self.input_w = self.input_shapes[self.input_names[0]][3] + self.labels = file_io.load_imagenet_labels(config.label_file) if config.label_file else {} + + def set_scheduling_params( + self, + priority: Optional[int] = None, + bpu_cores: Optional[List[int]] = None, + ) -> None: + """ + Set optional runtime scheduling parameters. + + Args: + priority: Scheduling priority in the range `0~255`. + bpu_cores: Optional list of BPU core indexes used for inference. + """ + + kwargs = {} + if priority is not None: + kwargs["priority"] = {self.model_name: priority} + if bpu_cores is not None: + kwargs["bpu_cores"] = {self.model_name: bpu_cores} + if kwargs: + self.model.set_scheduling_params(**kwargs) + + def pre_process( + self, + image: np.ndarray, + resize_type: Optional[int] = None, + ) -> Dict[str, Dict[str, np.ndarray]]: + """ + Convert one BGR image into the packed NV12 tensor expected by the model. + + Args: + image: Input image in OpenCV BGR format. + resize_type: Optional override for the preprocessing resize strategy. + + Returns: + Nested input dictionary accepted by `hbm_runtime.run()`. 
+ """ + + resize_type = resize_type if resize_type is not None else self.cfg.resize_type + resize_img = pre_utils.resized_image( + image, + self.input_w, + self.input_h, + resize_type, + interpolation=cv2.INTER_LINEAR, + ) + y, uv = pre_utils.bgr_to_nv12_planes(resize_img) + nv12 = np.concatenate((y.reshape(-1), uv.reshape(-1)), axis=0).reshape( + (1, self.input_h * 3 // 2, self.input_w, 1) + ) + return {self.model_name: {self.input_names[0]: nv12.astype(np.uint8)}} + + def forward(self, inputs: Dict[str, Dict[str, np.ndarray]]) -> Dict[str, np.ndarray]: + """ + Execute one forward pass on BPU. + + Args: + inputs: Prepared input tensors returned by `pre_process()`. + + Returns: + Raw model outputs keyed by output tensor name. + """ + + return self.model.run(inputs)[self.model_name] + + def post_process( + self, + outputs: Dict[str, np.ndarray], + topk: Optional[int] = None, + ) -> Tuple[np.ndarray, np.ndarray, List[str]]: + """ + Convert the raw logits tensor into Top-K classification results. + + Args: + outputs: Raw runtime outputs from `forward()`. + topk: Optional override for the Top-K result count. + + Returns: + A tuple of `(topk_idx, topk_prob, topk_labels)`. + """ + + topk = topk or self.cfg.topk + prob = softmax(np.squeeze(outputs[self.output_names[0]])) + topk_idx = np.argsort(prob)[-topk:][::-1] + topk_prob = prob[topk_idx] + topk_labels = [self.labels.get(int(idx), str(int(idx))) for idx in topk_idx] + return topk_idx, topk_prob, topk_labels + + def predict( + self, + image: np.ndarray, + resize_type: Optional[int] = None, + topk: Optional[int] = None, + ) -> Tuple[np.ndarray, np.ndarray, List[str]]: + """ + Run the complete HGNetV2 inference pipeline on one image. + + Args: + image: Input image in BGR format. + resize_type: Optional override for preprocessing resize strategy. + topk: Optional override for Top-K result count. + + Returns: + The Top-K class IDs, probabilities, and labels produced by + `post_process()`. 
+ """ + + s1 = time.perf_counter() + inputs = self.pre_process(image, resize_type) + t1 = (time.perf_counter() - s1) * 1000 + + s2 = time.perf_counter() + outputs = self.forward(inputs) + t2 = (time.perf_counter() - s2) * 1000 + + s3 = time.perf_counter() + results = self.post_process(outputs, topk) + t3 = (time.perf_counter() - s3) * 1000 + + print(f"\n[Log] Pre-process: {t1:.2f} ms | Inference: {t2:.2f} ms | Post-process: {t3:.2f} ms") + return results + + def __call__( + self, + image: np.ndarray, + resize_type: Optional[int] = None, + topk: Optional[int] = None, + ) -> Tuple[np.ndarray, np.ndarray, List[str]]: + """ + Provide functional-style access to `predict()`. + + Args: + image: Input image in BGR format. + resize_type: Optional override for preprocessing resize strategy. + topk: Optional override for Top-K result count. + + Returns: + The same result tuple returned by `predict()`. + """ + + return self.predict(image, resize_type, topk) \ No newline at end of file diff --git a/samples/vision/hgnetv2/model/README.md b/samples/vision/hgnetv2/model/README.md new file mode 100644 index 0000000..e69de29 diff --git a/samples/vision/hgnetv2/model/README_cn.md b/samples/vision/hgnetv2/model/README_cn.md new file mode 100644 index 0000000..e69de29 diff --git a/samples/vision/hgnetv2/runtime/python/README.md b/samples/vision/hgnetv2/runtime/python/README.md new file mode 100644 index 0000000..801c6db --- /dev/null +++ b/samples/vision/hgnetv2/runtime/python/README.md @@ -0,0 +1,59 @@ +# HGNetV2 Image Classification Python Example + +English | [简体中文](./README_cn.md) + +This example demonstrates how to perform ImageNet-1k image classification tasks on the BPU using a quantized HGNetV2 model. + +## Directory Structure + +```text +. +|-- main.py +|-- hgnetv2.py +|-- README.md +|-- README_cn.md +`-- run.sh +``` + +## Parameters + +| Parameter | Description | Default | +| --- | --- | --- | +| `--model-path` | Path to the quantized `.bin` model file. 
| `../../model/HGNetV2_224x224_nv12.bin` | +| `--label-file` | Path to the ImageNet label file. | `../../../../../datasets/imagenet/classname.txt` | +| `--priority` | Model priority, range `0~255`. | `0` | +| `--bpu-cores` | BPU core index or indices used for inference. | `0` | +| `--test-img` | Path to the test input image. | `../../test_data/sandbar.JPEG` | +| `--img-save-path` | Path to save the output visualization image. | `../../test_data/result.jpg` | +| `--resize-type` | Resize strategy (`0`: direct resize, `1`: letterbox). | `0` | +| `--topk` | Number of top-K categories to display. | `5` | + +## Quick Start + +```bash +chmod +x run.sh +./run.sh +``` + +## Manual Execution + +- Using default parameters: + +```bash +python3 main.py +``` + +- Explicitly specifying parameters: + +```bash +python3 main.py \ + --model-path ../../model/HGNetV2_224x224_nv12.bin \ + --test-img ../../test_data/sandbar.JPEG \ + --img-save-path ../../test_data/result.jpg \ + --topk 5 +``` + +## API Description + +- **HGNetV2Config**: Encapsulates the model path, label file, and inference parameters. +- **HGNetV2**: Implements preprocessing, BPU inference, and top‑K classification post‑processing. \ No newline at end of file diff --git a/samples/vision/hgnetv2/runtime/python/README_cn.md b/samples/vision/hgnetv2/runtime/python/README_cn.md new file mode 100644 index 0000000..2998808 --- /dev/null +++ b/samples/vision/hgnetv2/runtime/python/README_cn.md @@ -0,0 +1,59 @@ +# HGNetV2 图像分类 Python 示例 + +[English](./README.md) | 简体中文 + +本示例展示如何在 BPU 上使用量化后的 HGNetV2 模型执行 ImageNet-1k 图像分类任务。 + +## 目录结构 + +```text +. 
+|-- main.py +|-- hgnetv2.py +|-- README.md +|-- README_cn.md +`-- run.sh +``` + +## 参数说明 + +| 参数 | 说明 | 默认值 | +| --- | --- | --- | +| `--model-path` | 量化 `.bin` 模型文件路径。 | `../../model/HGNetV2_224x224_nv12.bin` | +| `--label-file` | ImageNet 标签文件路径。 | `../../../../../datasets/imagenet/classname.txt` | +| `--priority` | 模型优先级,范围 `0~255`。 | `0` | +| `--bpu-cores` | 用于推理的 BPU 核索引(可指定多个)。 | `0` | +| `--test-img` | 测试输入图像路径。 | `../../test_data/sandbar.JPEG` | +| `--img-save-path` | 输出可视化图像保存路径。 | `../../test_data/result.jpg` | +| `--resize-type` | 缩放策略(`0`:直接缩放,`1`:letterbox)。 | `0` | +| `--topk` | 显示的 Top-K 类别数量。 | `5` | + +## 快速运行 + +```bash +chmod +x run.sh +./run.sh +``` + +## 手动运行 + +- 使用默认参数: + +```bash +python3 main.py +``` + +- 显式指定参数: + +```bash +python3 main.py \ + --model-path ../../model/HGNetV2_224x224_nv12.bin \ + --test-img ../../test_data/sandbar.JPEG \ + --img-save-path ../../test_data/result.jpg \ + --topk 5 +``` + +## 接口说明 + +- **HGNetV2Config**:封装模型路径、标签文件和推理参数。 +- **HGNetV2**:实现预处理、BPU 推理和 Top-K 分类后处理。 diff --git a/samples/vision/hgnetv2/runtime/python/hgnetv2.py b/samples/vision/hgnetv2/runtime/python/hgnetv2.py new file mode 100644 index 0000000..d8a8338 --- /dev/null +++ b/samples/vision/hgnetv2/runtime/python/hgnetv2.py @@ -0,0 +1,232 @@ +# Copyright (c) 2026 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +HGNetV2 inference module. 
+ +This module implements the standardized HGNetV2 image classification +pipeline for `RDK X5`, including preprocessing, BPU execution, and Top-K +classification post-processing. + +Key Features: + - Load quantized `.bin` models with `hbm_runtime`. + - Convert BGR images to packed NV12 tensors required by the runtime. + - Run ImageNet-1k classification and return Top-K results. + - Keep the runtime interface aligned with the other classification samples. +""" + +from __future__ import annotations + +import os +import sys +import time +from dataclasses import dataclass +from typing import Dict, List, Optional, Tuple + +import cv2 +import hbm_runtime +import numpy as np +from scipy.special import softmax + +sys.path.append(os.path.abspath("../../../../../")) +import utils.py_utils.file_io as file_io +import utils.py_utils.preprocess as pre_utils + + +@dataclass +class HGNetV2Config: + """ + Configuration for HGNetV2 classification inference. + + Args: + model_path: Path to the compiled `.bin` model file. + label_file: Path to the ImageNet label file used for result decoding. + resize_type: Resize strategy used during preprocessing. + topk: Number of Top-K classes to return. + """ + + model_path: str + label_file: Optional[str] = None + resize_type: int = 0 + topk: int = 5 + + +class HGNetV2: + """ + HGNetV2 classification wrapper based on `hbm_runtime`. + + This class exposes the standard pipeline required by this repository: + `pre_process`, `forward`, `post_process`, `predict`, and `__call__`. + The runtime assumes a single packed-NV12 input and a single logits output. + """ + + def __init__(self, config: HGNetV2Config): + """ + Initialize the runtime wrapper and parse model metadata. + + Args: + config: Runtime configuration including model path and inference + parameters. 
+ """ + + self.cfg = config + self.model = hbm_runtime.HB_HBMRuntime(config.model_path) + self.model_name = self.model.model_names[0] + self.input_names = self.model.input_names[self.model_name] + self.output_names = self.model.output_names[self.model_name] + self.input_shapes = self.model.input_shapes[self.model_name] + self.input_h = self.input_shapes[self.input_names[0]][2] + self.input_w = self.input_shapes[self.input_names[0]][3] + self.labels = file_io.load_imagenet_labels(config.label_file) if config.label_file else {} + + def set_scheduling_params( + self, + priority: Optional[int] = None, + bpu_cores: Optional[List[int]] = None, + ) -> None: + """ + Set optional runtime scheduling parameters. + + Args: + priority: Scheduling priority in the range `0~255`. + bpu_cores: Optional list of BPU core indexes used for inference. + """ + + kwargs = {} + if priority is not None: + kwargs["priority"] = {self.model_name: priority} + if bpu_cores is not None: + kwargs["bpu_cores"] = {self.model_name: bpu_cores} + if kwargs: + self.model.set_scheduling_params(**kwargs) + + def pre_process( + self, + image: np.ndarray, + resize_type: Optional[int] = None, + ) -> Dict[str, Dict[str, np.ndarray]]: + """ + Convert one BGR image into the packed NV12 tensor expected by the model. + + Args: + image: Input image in OpenCV BGR format. + resize_type: Optional override for the preprocessing resize strategy. + + Returns: + Nested input dictionary accepted by `hbm_runtime.run()`. 
+ """ + + resize_type = resize_type if resize_type is not None else self.cfg.resize_type + resize_img = pre_utils.resized_image( + image, + self.input_w, + self.input_h, + resize_type, + interpolation=cv2.INTER_LINEAR, + ) + y, uv = pre_utils.bgr_to_nv12_planes(resize_img) + nv12 = np.concatenate((y.reshape(-1), uv.reshape(-1)), axis=0).reshape( + (1, self.input_h * 3 // 2, self.input_w, 1) + ) + return {self.model_name: {self.input_names[0]: nv12.astype(np.uint8)}} + + def forward(self, inputs: Dict[str, Dict[str, np.ndarray]]) -> Dict[str, np.ndarray]: + """ + Execute one forward pass on BPU. + + Args: + inputs: Prepared input tensors returned by `pre_process()`. + + Returns: + Raw model outputs keyed by output tensor name. + """ + + return self.model.run(inputs)[self.model_name] + + def post_process( + self, + outputs: Dict[str, np.ndarray], + topk: Optional[int] = None, + ) -> Tuple[np.ndarray, np.ndarray, List[str]]: + """ + Convert the raw logits tensor into Top-K classification results. + + Args: + outputs: Raw runtime outputs from `forward()`. + topk: Optional override for the Top-K result count. + + Returns: + A tuple of `(topk_idx, topk_prob, topk_labels)`. + """ + + topk = topk or self.cfg.topk + prob = softmax(np.squeeze(outputs[self.output_names[0]])) + topk_idx = np.argsort(prob)[-topk:][::-1] + topk_prob = prob[topk_idx] + topk_labels = [self.labels.get(int(idx), str(int(idx))) for idx in topk_idx] + return topk_idx, topk_prob, topk_labels + + def predict( + self, + image: np.ndarray, + resize_type: Optional[int] = None, + topk: Optional[int] = None, + ) -> Tuple[np.ndarray, np.ndarray, List[str]]: + """ + Run the complete HGNetV2 inference pipeline on one image. + + Args: + image: Input image in BGR format. + resize_type: Optional override for preprocessing resize strategy. + topk: Optional override for Top-K result count. + + Returns: + The Top-K class IDs, probabilities, and labels produced by + `post_process()`. 
+ """ + + s1 = time.perf_counter() + inputs = self.pre_process(image, resize_type) + t1 = (time.perf_counter() - s1) * 1000 + + s2 = time.perf_counter() + outputs = self.forward(inputs) + t2 = (time.perf_counter() - s2) * 1000 + + s3 = time.perf_counter() + results = self.post_process(outputs, topk) + t3 = (time.perf_counter() - s3) * 1000 + + print(f"\n[Log] Pre-process: {t1:.2f} ms | Inference: {t2:.2f} ms | Post-process: {t3:.2f} ms") + return results + + def __call__( + self, + image: np.ndarray, + resize_type: Optional[int] = None, + topk: Optional[int] = None, + ) -> Tuple[np.ndarray, np.ndarray, List[str]]: + """ + Provide functional-style access to `predict()`. + + Args: + image: Input image in BGR format. + resize_type: Optional override for preprocessing resize strategy. + topk: Optional override for Top-K result count. + + Returns: + The same result tuple returned by `predict()`. + """ + + return self.predict(image, resize_type, topk) \ No newline at end of file diff --git a/samples/vision/hgnetv2/runtime/python/main.py b/samples/vision/hgnetv2/runtime/python/main.py new file mode 100644 index 0000000..48edfa5 --- /dev/null +++ b/samples/vision/hgnetv2/runtime/python/main.py @@ -0,0 +1,117 @@ +# Copyright (c) 2026 D-Robotics Corporation +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +""" +HGNetV2 image classification inference entry script. + +This module provides the standard Python entry for the HGNetV2 sample on +`RDK X5`. 
The script is responsible for parsing command-line arguments, +constructing the runtime wrapper, loading the input image and labels, +running inference, printing Top-K results, and saving the final +visualization image. +""" + +from __future__ import annotations + +import argparse +import logging +import os +import sys + +import cv2 + +sys.path.append(os.path.abspath("../../../../../")) +import utils.py_utils.file_io as file_io +import utils.py_utils.inspect as inspect +import utils.py_utils.visualize as visualize +from hgnetv2 import HGNetV2, HGNetV2Config + + +logging.basicConfig( + level=logging.INFO, + format="[%(name)s] [%(asctime)s.%(msecs)03d] [%(levelname)s] %(message)s", + datefmt="%H:%M:%S", +) +logger = logging.getLogger("HGNetV2") + +SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__)) +PROJECT_ROOT = os.path.abspath(os.path.join(SCRIPT_DIR, "../../../../../")) +MODEL_DIR = os.path.abspath(os.path.join(SCRIPT_DIR, "../../model")) +TEST_DATA_DIR = os.path.abspath(os.path.join(SCRIPT_DIR, "../../test_data")) +DEFAULT_MODEL_PATH = os.path.join(MODEL_DIR, "HGNetV2_224x224_nv12.bin") +DEFAULT_TEST_IMAGE = os.path.join(TEST_DATA_DIR, "sandbar.JPEG") +DEFAULT_RESULT_IMAGE = os.path.join(TEST_DATA_DIR, "result.jpg") +DEFAULT_LABEL_FILE = os.path.join(PROJECT_ROOT, "datasets/imagenet/classname.txt") + + +def save_image(path: str, image) -> None: + """Save the classification visualization image to disk.""" + + save_dir = os.path.dirname(path) + if save_dir: + os.makedirs(save_dir, exist_ok=True) + if not cv2.imwrite(path, image): + raise RuntimeError(f"Failed to save image to {path}") + + +def main() -> None: + """ + Run the complete HGNetV2 classification pipeline on a single image. + + The entry follows the standardized sample pattern used in this repository: + 1. Parse default-usable command-line arguments. + 2. Build the HGNetV2 runtime configuration. + 3. Load the ImageNet labels and the test image. + 4. Execute `predict()` on the runtime wrapper. + 5. 
Print Top-K results and save the visualization image. + """ + + parser = argparse.ArgumentParser(description="HGNetV2 Classification Inference") + parser.add_argument("--model-path", type=str, default=DEFAULT_MODEL_PATH, help="Path to the BPU quantized *.bin model.") + parser.add_argument("--label-file", type=str, default=DEFAULT_LABEL_FILE, help="Path to the ImageNet label file.") + parser.add_argument("--priority", type=int, default=0, help="Model priority (0~255).") + parser.add_argument("--bpu-cores", nargs="+", type=int, default=[0], help="BPU core indexes to run inference.") + parser.add_argument("--test-img", type=str, default=DEFAULT_TEST_IMAGE, help="Path to the test input image.") + parser.add_argument("--img-save-path", type=str, default=DEFAULT_RESULT_IMAGE, help="Path to save output result image.") + parser.add_argument("--resize-type", type=int, default=0, help="Resize strategy (0: direct, 1: letterbox).") + parser.add_argument("--topk", type=int, default=5, help="Number of top results to return.") + args = parser.parse_args() + + config = HGNetV2Config( + model_path=args.model_path, + label_file=args.label_file, + resize_type=args.resize_type, + topk=args.topk, + ) + model = HGNetV2(config) + model.set_scheduling_params(priority=args.priority, bpu_cores=args.bpu_cores) + + inspect.print_model_info(model.model) + + image = file_io.load_image(args.test_img) + labels = model.labels + topk_idx, topk_prob, topk_labels = model.predict(image) + + logger.info(f"Top-{args.topk} results:") + for i, (cid, score, label) in enumerate(zip(topk_idx, topk_prob, topk_labels), start=1): + logger.info(f"Rank {i}: class={cid}, label={label}, score={score:.4f}") + + vis_results = list(zip(topk_idx.tolist(), topk_prob.tolist())) + vis_image = visualize.draw_classification(image.copy(), vis_results, labels) + save_image(args.img_save_path, vis_image) + logger.info(f'Saving results to "{args.img_save_path}"') + + +if __name__ == "__main__": + main() \ No newline at end of 
file diff --git a/samples/vision/hgnetv2/runtime/python/run.sh b/samples/vision/hgnetv2/runtime/python/run.sh new file mode 100644 index 0000000..a897dc1 --- /dev/null +++ b/samples/vision/hgnetv2/runtime/python/run.sh @@ -0,0 +1,8 @@ +#!/bin/bash +set -e + +MODEL_PATH="/opt/hobot/model/x5/basic/HGNetV2_224x224_nv12.bin" +[ ! -f "$MODEL_PATH" ] && MODEL_PATH="../../model/HGNetV2_224x224_nv12.bin" +[ ! -f "$MODEL_PATH" ] && bash ../../model/download.sh && MODEL_PATH="../../model/HGNetV2_224x224_nv12.bin" + +python3 main.py --model-path "$MODEL_PATH" diff --git a/samples/vision/hgnetv2/test_data/classname.txt b/samples/vision/hgnetv2/test_data/classname.txt new file mode 100644 index 0000000..722c984 --- /dev/null +++ b/samples/vision/hgnetv2/test_data/classname.txt @@ -0,0 +1,1000 @@ +tench, Tinca tinca +goldfish, Carassius auratus +great white shark, white shark, man-eater, man-eating shark, Carcharodon carcharias +tiger shark, Galeocerdo cuvieri +hammerhead, hammerhead shark +electric ray, crampfish, numbfish, torpedo +stingray +cock +hen +ostrich, Struthio camelus +brambling, Fringilla montifringilla +goldfinch, Carduelis carduelis +house finch, linnet, Carpodacus mexicanus +junco, snowbird +indigo bunting, indigo finch, indigo bird, Passerina cyanea +robin, American robin, Turdus migratorius +bulbul +jay +magpie +chickadee +water ouzel, dipper +kite +bald eagle, American eagle, Haliaeetus leucocephalus +vulture +great grey owl, great gray owl, Strix nebulosa +European fire salamander, Salamandra salamandra +common newt, Triturus vulgaris +eft +spotted salamander, Ambystoma maculatum +axolotl, mud puppy, Ambystoma mexicanum +bullfrog, Rana catesbeiana +tree frog, tree-frog +tailed frog, bell toad, ribbed toad, tailed toad, Ascaphus trui +loggerhead, loggerhead turtle, Caretta caretta +leatherback turtle, leatherback, leathery turtle, Dermochelys coriacea +mud turtle +terrapin +box turtle, box tortoise +banded gecko +common iguana, iguana, Iguana iguana +American 
chameleon, anole, Anolis carolinensis +whiptail, whiptail lizard +agama +frilled lizard, Chlamydosaurus kingi +alligator lizard +Gila monster, Heloderma suspectum +green lizard, Lacerta viridis +African chameleon, Chamaeleo chamaeleon +Komodo dragon, Komodo lizard, dragon lizard, giant lizard, Varanus komodoensis +African crocodile, Nile crocodile, Crocodylus niloticus +American alligator, Alligator mississipiensis +triceratops +thunder snake, worm snake, Carphophis amoenus +ringneck snake, ring-necked snake, ring snake +hognose snake, puff adder, sand viper +green snake, grass snake +king snake, kingsnake +garter snake, grass snake +water snake +vine snake +night snake, Hypsiglena torquata +boa constrictor, Constrictor constrictor +rock python, rock snake, Python sebae +Indian cobra, Naja naja +green mamba +sea snake +horned viper, cerastes, sand viper, horned asp, Cerastes cornutus +diamondback, diamondback rattlesnake, Crotalus adamanteus +sidewinder, horned rattlesnake, Crotalus cerastes +trilobite +harvestman, daddy longlegs, Phalangium opilio +scorpion +black and gold garden spider, Argiope aurantia +barn spider, Araneus cavaticus +garden spider, Aranea diademata +black widow, Latrodectus mactans +tarantula +wolf spider, hunting spider +tick +centipede +black grouse +ptarmigan +ruffed grouse, partridge, Bonasa umbellus +prairie chicken, prairie grouse, prairie fowl +peacock +quail +partridge +African grey, African gray, Psittacus erithacus +macaw +sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita +lorikeet +coucal +bee eater +hornbill +hummingbird +jacamar +toucan +drake +red-breasted merganser, Mergus serrator +goose +black swan, Cygnus atratus +tusker +echidna, spiny anteater, anteater +platypus, duckbill, duckbilled platypus, duck-billed platypus, Ornithorhynchus anatinus +wallaby, brush kangaroo +koala, koala bear, kangaroo bear, native bear, Phascolarctos cinereus +wombat +jellyfish +sea anemone, anemone +brain coral +flatworm, platyhelminth 
+nematode, nematode worm, roundworm +conch +snail +slug +sea slug, nudibranch +chiton, coat-of-mail shell, sea cradle, polyplacophore +chambered nautilus, pearly nautilus, nautilus +Dungeness crab, Cancer magister +rock crab, Cancer irroratus +fiddler crab +king crab, Alaska crab, Alaskan king crab, Alaska king crab, Paralithodes camtschatica +American lobster, Northern lobster, Maine lobster, Homarus americanus +spiny lobster, langouste, rock lobster, crawfish, crayfish, sea crawfish +crayfish, crawfish, crawdad, crawdaddy +hermit crab +isopod +white stork, Ciconia ciconia +black stork, Ciconia nigra +spoonbill +flamingo +little blue heron, Egretta caerulea +American egret, great white heron, Egretta albus +bittern +crane +limpkin, Aramus pictus +European gallinule, Porphyrio porphyrio +American coot, marsh hen, mud hen, water hen, Fulica americana +bustard +ruddy turnstone, Arenaria interpres +red-backed sandpiper, dunlin, Erolia alpina +redshank, Tringa totanus +dowitcher +oystercatcher, oyster catcher +pelican +king penguin, Aptenodytes patagonica +albatross, mollymawk +grey whale, gray whale, devilfish, Eschrichtius gibbosus, Eschrichtius robustus +killer whale, killer, orca, grampus, sea wolf, Orcinus orca +dugong, Dugong dugon +sea lion +Chihuahua +Japanese spaniel +Maltese dog, Maltese terrier, Maltese +Pekinese, Pekingese, Peke +Shih-Tzu +Blenheim spaniel +papillon +toy terrier +Rhodesian ridgeback +Afghan hound, Afghan +basset, basset hound +beagle +bloodhound, sleuthhound +bluetick +black-and-tan coonhound +Walker hound, Walker foxhound +English foxhound +redbone +borzoi, Russian wolfhound +Irish wolfhound +Italian greyhound +whippet +Ibizan hound, Ibizan Podenco +Norwegian elkhound, elkhound +otterhound, otter hound +Saluki, gazelle hound +Scottish deerhound, deerhound +Weimaraner +Staffordshire bullterrier, Staffordshire bull terrier +American Staffordshire terrier, Staffordshire terrier, American pit bull terrier, pit bull terrier +Bedlington terrier 
+Border terrier +Kerry blue terrier +Irish terrier +Norfolk terrier +Norwich terrier +Yorkshire terrier +wire-haired fox terrier +Lakeland terrier +Sealyham terrier, Sealyham +Airedale, Airedale terrier +cairn, cairn terrier +Australian terrier +Dandie Dinmont, Dandie Dinmont terrier +Boston bull, Boston terrier +miniature schnauzer +giant schnauzer +standard schnauzer +Scotch terrier, Scottish terrier, Scottie +Tibetan terrier, chrysanthemum dog +silky terrier, Sydney silky +soft-coated wheaten terrier +West Highland white terrier +Lhasa, Lhasa apso +flat-coated retriever +curly-coated retriever +golden retriever +Labrador retriever +Chesapeake Bay retriever +German short-haired pointer +vizsla, Hungarian pointer +English setter +Irish setter, red setter +Gordon setter +Brittany spaniel +clumber, clumber spaniel +English springer, English springer spaniel +Welsh springer spaniel +cocker spaniel, English cocker spaniel, cocker +Sussex spaniel +Irish water spaniel +kuvasz +schipperke +groenendael +malinois +briard +kelpie +komondor +Old English sheepdog, bobtail +Shetland sheepdog, Shetland sheep dog, Shetland +collie +Border collie +Bouvier des Flandres, Bouviers des Flandres +Rottweiler +German shepherd, German shepherd dog, German police dog, alsatian +Doberman, Doberman pinscher +miniature pinscher +Greater Swiss Mountain dog +Bernese mountain dog +Appenzeller +EntleBucher +boxer +bull mastiff +Tibetan mastiff +French bulldog +Great Dane +Saint Bernard, St Bernard +Eskimo dog, husky +malamute, malemute, Alaskan malamute +Siberian husky +dalmatian, coach dog, carriage dog +affenpinscher, monkey pinscher, monkey dog +basenji +pug, pug-dog +Leonberg +Newfoundland, Newfoundland dog +Great Pyrenees +Samoyed, Samoyede +Pomeranian +chow, chow chow +keeshond +Brabancon griffon +Pembroke, Pembroke Welsh corgi +Cardigan, Cardigan Welsh corgi +toy poodle +miniature poodle +standard poodle +Mexican hairless +timber wolf, grey wolf, gray wolf, Canis lupus +white wolf, Arctic 
wolf, Canis lupus tundrarum +red wolf, maned wolf, Canis rufus, Canis niger +coyote, prairie wolf, brush wolf, Canis latrans +dingo, warrigal, warragal, Canis dingo +dhole, Cuon alpinus +African hunting dog, hyena dog, Cape hunting dog, Lycaon pictus +hyena, hyaena +red fox, Vulpes vulpes +kit fox, Vulpes macrotis +Arctic fox, white fox, Alopex lagopus +grey fox, gray fox, Urocyon cinereoargenteus +tabby, tabby cat +tiger cat +Persian cat +Siamese cat, Siamese +Egyptian cat +cougar, puma, catamount, mountain lion, painter, panther, Felis concolor +lynx, catamount +leopard, Panthera pardus +snow leopard, ounce, Panthera uncia +jaguar, panther, Panthera onca, Felis onca +lion, king of beasts, Panthera leo +tiger, Panthera tigris +cheetah, chetah, Acinonyx jubatus +brown bear, bruin, Ursus arctos +American black bear, black bear, Ursus americanus, Euarctos americanus +ice bear, polar bear, Ursus Maritimus, Thalarctos maritimus +sloth bear, Melursus ursinus, Ursus ursinus +mongoose +meerkat, mierkat +tiger beetle +ladybug, ladybeetle, lady beetle, ladybird, ladybird beetle +ground beetle, carabid beetle +long-horned beetle, longicorn, longicorn beetle +leaf beetle, chrysomelid +dung beetle +rhinoceros beetle +weevil +fly +bee +ant, emmet, pismire +grasshopper, hopper +cricket +walking stick, walkingstick, stick insect +cockroach, roach +mantis, mantid +cicada, cicala +leafhopper +lacewing, lacewing fly +dragonfly, darning needle, devil's darning needle, sewing needle, snake feeder, snake doctor, mosquito hawk, skeeter hawk +damselfly +admiral +ringlet, ringlet butterfly +monarch, monarch butterfly, milkweed butterfly, Danaus plexippus +cabbage butterfly +sulphur butterfly, sulfur butterfly +lycaenid, lycaenid butterfly +starfish, sea star +sea urchin +sea cucumber, holothurian +wood rabbit, cottontail, cottontail rabbit +hare +Angora, Angora rabbit +hamster +porcupine, hedgehog +fox squirrel, eastern fox squirrel, Sciurus niger +marmot +beaver +guinea pig, Cavia cobaya 
+sorrel +zebra +hog, pig, grunter, squealer, Sus scrofa +wild boar, boar, Sus scrofa +warthog +hippopotamus, hippo, river horse, Hippopotamus amphibius +ox +water buffalo, water ox, Asiatic buffalo, Bubalus bubalis +bison +ram, tup +bighorn, bighorn sheep, cimarron, Rocky Mountain bighorn, Rocky Mountain sheep, Ovis canadensis +ibex, Capra ibex +hartebeest +impala, Aepyceros melampus +gazelle +Arabian camel, dromedary, Camelus dromedarius +llama +weasel +mink +polecat, fitch, foulmart, foumart, Mustela putorius +black-footed ferret, ferret, Mustela nigripes +otter +skunk, polecat, wood pussy +badger +armadillo +three-toed sloth, ai, Bradypus tridactylus +orangutan, orang, orangutang, Pongo pygmaeus +gorilla, Gorilla gorilla +chimpanzee, chimp, Pan troglodytes +gibbon, Hylobates lar +siamang, Hylobates syndactylus, Symphalangus syndactylus +guenon, guenon monkey +patas, hussar monkey, Erythrocebus patas +baboon +macaque +langur +colobus, colobus monkey +proboscis monkey, Nasalis larvatus +marmoset +capuchin, ringtail, Cebus capucinus +howler monkey, howler +titi, titi monkey +spider monkey, Ateles geoffroyi +squirrel monkey, Saimiri sciureus +Madagascar cat, ring-tailed lemur, Lemur catta +indri, indris, Indri indri, Indri brevicaudatus +Indian elephant, Elephas maximus +African elephant, Loxodonta africana +lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens +giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca +barracouta, snoek +eel +coho, cohoe, coho salmon, blue jack, silver salmon, Oncorhynchus kisutch +rock beauty, Holocanthus tricolor +anemone fish +sturgeon +gar, garfish, garpike, billfish, Lepisosteus osseus +lionfish +puffer, pufferfish, blowfish, globefish +abacus +abaya +academic gown, academic robe, judge's robe +accordion, piano accordion, squeeze box +acoustic guitar +aircraft carrier, carrier, flattop, attack aircraft carrier +airliner +airship, dirigible +altar +ambulance +amphibian, amphibious vehicle +analog clock 
+apiary, bee house +apron +ashcan, trash can, garbage can, wastebin, ash bin, ash-bin, ashbin, dustbin, trash barrel, trash bin +assault rifle, assault gun +backpack, back pack, knapsack, packsack, rucksack, haversack +bakery, bakeshop, bakehouse +balance beam, beam +balloon +ballpoint, ballpoint pen, ballpen, Biro +Band Aid +banjo +bannister, banister, balustrade, balusters, handrail +barbell +barber chair +barbershop +barn +barometer +barrel, cask +barrow, garden cart, lawn cart, wheelbarrow +baseball +basketball +bassinet +bassoon +bathing cap, swimming cap +bath towel +bathtub, bathing tub, bath, tub +beach wagon, station wagon, wagon, estate car, beach waggon, station waggon, waggon +beacon, lighthouse, beacon light, pharos +beaker +bearskin, busby, shako +beer bottle +beer glass +bell cote, bell cot +bib +bicycle-built-for-two, tandem bicycle, tandem +bikini, two-piece +binder, ring-binder +binoculars, field glasses, opera glasses +birdhouse +boathouse +bobsled, bobsleigh, bob +bolo tie, bolo, bola tie, bola +bonnet, poke bonnet +bookcase +bookshop, bookstore, bookstall +bottlecap +bow +bow tie, bow-tie, bowtie +brass, memorial tablet, plaque +brassiere, bra, bandeau +breakwater, groin, groyne, mole, bulwark, seawall, jetty +breastplate, aegis, egis +broom +bucket, pail +buckle +bulletproof vest +bullet train, bullet +butcher shop, meat market +cab, hack, taxi, taxicab +caldron, cauldron +candle, taper, wax light +cannon +canoe +can opener, tin opener +cardigan +car mirror +carousel, carrousel, merry-go-round, roundabout, whirligig +carpenter's kit, tool kit +carton +car wheel +cash machine, cash dispenser, automated teller machine, automatic teller machine, automated teller, automatic teller, ATM +cassette +cassette player +castle +catamaran +CD player +cello, violoncello +cellular telephone, cellular phone, cellphone, cell, mobile phone +chain +chainlink fence +chain mail, ring mail, mail, chain armor, chain armour, ring armor, ring armour +chain saw, 
chainsaw +chest +chiffonier, commode +chime, bell, gong +china cabinet, china closet +Christmas stocking +church, church building +cinema, movie theater, movie theatre, movie house, picture palace +cleaver, meat cleaver, chopper +cliff dwelling +cloak +clog, geta, patten, sabot +cocktail shaker +coffee mug +coffeepot +coil, spiral, volute, whorl, helix +combination lock +computer keyboard, keypad +confectionery, confectionary, candy store +container ship, containership, container vessel +convertible +corkscrew, bottle screw +cornet, horn, trumpet, trump +cowboy boot +cowboy hat, ten-gallon hat +cradle +crane +crash helmet +crate +crib, cot +Crock Pot +croquet ball +crutch +cuirass +dam, dike, dyke +desk +desktop computer +dial telephone, dial phone +diaper, nappy, napkin +digital clock +digital watch +dining table, board +dishrag, dishcloth +dishwasher, dish washer, dishwashing machine +disk brake, disc brake +dock, dockage, docking facility +dogsled, dog sled, dog sleigh +dome +doormat, welcome mat +drilling platform, offshore rig +drum, membranophone, tympan +drumstick +dumbbell +Dutch oven +electric fan, blower +electric guitar +electric locomotive +entertainment center +envelope +espresso maker +face powder +feather boa, boa +file, file cabinet, filing cabinet +fireboat +fire engine, fire truck +fire screen, fireguard +flagpole, flagstaff +flute, transverse flute +folding chair +football helmet +forklift +fountain +fountain pen +four-poster +freight car +French horn, horn +frying pan, frypan, skillet +fur coat +garbage truck, dustcart +gasmask, respirator, gas helmet +gas pump, gasoline pump, petrol pump, island dispenser +goblet +go-kart +golf ball +golfcart, golf cart +gondola +gong, tam-tam +gown +grand piano, grand +greenhouse, nursery, glasshouse +grille, radiator grille +grocery store, grocery, food market, market +guillotine +hair slide +hair spray +half track +hammer +hamper +hand blower, blow dryer, blow drier, hair dryer, hair drier +hand-held 
computer, hand-held microcomputer +handkerchief, hankie, hanky, hankey +hard disc, hard disk, fixed disk +harmonica, mouth organ, harp, mouth harp +harp +harvester, reaper +hatchet +holster +home theater, home theatre +honeycomb +hook, claw +hoopskirt, crinoline +horizontal bar, high bar +horse cart, horse-cart +hourglass +iPod +iron, smoothing iron +jack-o'-lantern +jean, blue jean, denim +jeep, landrover +jersey, T-shirt, tee shirt +jigsaw puzzle +jinrikisha, ricksha, rickshaw +joystick +kimono +knee pad +knot +lab coat, laboratory coat +ladle +lampshade, lamp shade +laptop, laptop computer +lawn mower, mower +lens cap, lens cover +letter opener, paper knife, paperknife +library +lifeboat +lighter, light, igniter, ignitor +limousine, limo +liner, ocean liner +lipstick, lip rouge +Loafer +lotion +loudspeaker, speaker, speaker unit, loudspeaker system, speaker system +loupe, jeweler's loupe +lumbermill, sawmill +magnetic compass +mailbag, postbag +mailbox, letter box +maillot +maillot, tank suit +manhole cover +maraca +marimba, xylophone +mask +matchstick +maypole +maze, labyrinth +measuring cup +medicine chest, medicine cabinet +megalith, megalithic structure +microphone, mike +microwave, microwave oven +military uniform +milk can +minibus +miniskirt, mini +minivan +missile +mitten +mixing bowl +mobile home, manufactured home +Model T +modem +monastery +monitor +moped +mortar +mortarboard +mosque +mosquito net +motor scooter, scooter +mountain bike, all-terrain bike, off-roader +mountain tent +mouse, computer mouse +mousetrap +moving van +muzzle +nail +neck brace +necklace +nipple +notebook, notebook computer +obelisk +oboe, hautboy, hautbois +ocarina, sweet potato +odometer, hodometer, mileometer, milometer +oil filter +organ, pipe organ +oscilloscope, scope, cathode-ray oscilloscope, CRO +overskirt +oxcart +oxygen mask +packet +paddle, boat paddle +paddlewheel, paddle wheel +padlock +paintbrush +pajama, pyjama, pj's, jammies +palace +panpipe, pandean pipe, 
syrinx +paper towel +parachute, chute +parallel bars, bars +park bench +parking meter +passenger car, coach, carriage +patio, terrace +pay-phone, pay-station +pedestal, plinth, footstall +pencil box, pencil case +pencil sharpener +perfume, essence +Petri dish +photocopier +pick, plectrum, plectron +pickelhaube +picket fence, paling +pickup, pickup truck +pier +piggy bank, penny bank +pill bottle +pillow +ping-pong ball +pinwheel +pirate, pirate ship +pitcher, ewer +plane, carpenter's plane, woodworking plane +planetarium +plastic bag +plate rack +plow, plough +plunger, plumber's helper +Polaroid camera, Polaroid Land camera +pole +police van, police wagon, paddy wagon, patrol wagon, wagon, black Maria +poncho +pool table, billiard table, snooker table +pop bottle, soda bottle +pot, flowerpot +potter's wheel +power drill +prayer rug, prayer mat +printer +prison, prison house +projectile, missile +projector +puck, hockey puck +punching bag, punch bag, punching ball, punchball +purse +quill, quill pen +quilt, comforter, comfort, puff +racer, race car, racing car +racket, racquet +radiator +radio, wireless +radio telescope, radio reflector +rain barrel +recreational vehicle, RV, R.V. 
+reel +reflex camera +refrigerator, icebox +remote control, remote +restaurant, eating house, eating place, eatery +revolver, six-gun, six-shooter +rifle +rocking chair, rocker +rotisserie +rubber eraser, rubber, pencil eraser +rugby ball +rule, ruler +running shoe +safe +safety pin +saltshaker, salt shaker +sandal +sarong +sax, saxophone +scabbard +scale, weighing machine +school bus +schooner +scoreboard +screen, CRT screen +screw +screwdriver +seat belt, seatbelt +sewing machine +shield, buckler +shoe shop, shoe-shop, shoe store +shoji +shopping basket +shopping cart +shovel +shower cap +shower curtain +ski +ski mask +sleeping bag +slide rule, slipstick +sliding door +slot, one-armed bandit +snorkel +snowmobile +snowplow, snowplough +soap dispenser +soccer ball +sock +solar dish, solar collector, solar furnace +sombrero +soup bowl +space bar +space heater +space shuttle +spatula +speedboat +spider web, spider's web +spindle +sports car, sport car +spotlight, spot +stage +steam locomotive +steel arch bridge +steel drum +stethoscope +stole +stone wall +stopwatch, stop watch +stove +strainer +streetcar, tram, tramcar, trolley, trolley car +stretcher +studio couch, day bed +stupa, tope +submarine, pigboat, sub, U-boat +suit, suit of clothes +sundial +sunglass +sunglasses, dark glasses, shades +sunscreen, sunblock, sun blocker +suspension bridge +swab, swob, mop +sweatshirt +swimming trunks, bathing trunks +swing +switch, electric switch, electrical switch +syringe +table lamp +tank, army tank, armored combat vehicle, armoured combat vehicle +tape player +teapot +teddy, teddy bear +television, television system +tennis ball +thatch, thatched roof +theater curtain, theatre curtain +thimble +thresher, thrasher, threshing machine +throne +tile roof +toaster +tobacco shop, tobacconist shop, tobacconist +toilet seat +torch +totem pole +tow truck, tow car, wrecker +toyshop +tractor +trailer truck, tractor trailer, trucking rig, rig, articulated lorry, semi +tray +trench 
coat +tricycle, trike, velocipede +trimaran +tripod +triumphal arch +trolleybus, trolley coach, trackless trolley +trombone +tub, vat +turnstile +typewriter keyboard +umbrella +unicycle, monocycle +upright, upright piano +vacuum, vacuum cleaner +vase +vault +velvet +vending machine +vestment +viaduct +violin, fiddle +volleyball +waffle iron +wall clock +wallet, billfold, notecase, pocketbook +wardrobe, closet, press +warplane, military plane +washbasin, handbasin, washbowl, lavabo, wash-hand basin +washer, automatic washer, washing machine +water bottle +water jug +water tower +whiskey jug +whistle +wig +window screen +window shade +Windsor tie +wine bottle +wing +wok +wooden spoon +wool, woolen, woollen +worm fence, snake fence, snake-rail fence, Virginia fence +wreck +yawl +yurt +web site, website, internet site, site +comic book +crossword puzzle, crossword +street sign +traffic light, traffic signal, stoplight +book jacket, dust cover, dust jacket, dust wrapper +menu +plate +guacamole +consomme +hot pot, hotpot +trifle +ice cream, icecream +ice lolly, lolly, lollipop, popsicle +French loaf +bagel, beigel +pretzel +cheeseburger +hotdog, hot dog, red hot +mashed potato +head cabbage +broccoli +cauliflower +zucchini, courgette +spaghetti squash +acorn squash +butternut squash +cucumber, cuke +artichoke, globe artichoke +bell pepper +cardoon +mushroom +Granny Smith +strawberry +orange +lemon +fig +pineapple, ananas +banana +jackfruit, jak, jack +custard apple +pomegranate +hay +carbonara +chocolate sauce, chocolate syrup +dough +meat loaf, meatloaf +pizza, pizza pie +potpie +burrito +red wine +espresso +cup +eggnog +alp +bubble +cliff, drop, drop-off +coral reef +geyser +lakeside, lakeshore +promontory, headland, head, foreland +sandbar, sand bar +seashore, coast, seacoast, sea-coast +valley, vale +volcano +ballplayer, baseball player +groom, bridegroom +scuba diver +rapeseed +daisy +yellow lady's slipper, yellow lady-slipper, Cypripedium calceolus, Cypripedium 
parviflorum +corn +acorn +hip, rose hip, rosehip +buckeye, horse chestnut, conker +coral fungus +agaric +gyromitra +stinkhorn, carrion fungus +earthstar +hen-of-the-woods, hen of the woods, Polyporus frondosus, Grifola frondosa +bolete +ear, spike, capitulum +toilet tissue, toilet paper, bathroom tissue
\ No newline at end of file
diff --git a/samples/vision/hgnetv2/test_data/result.jpg b/samples/vision/hgnetv2/test_data/result.jpg
new file mode 100644
index 0000000..dae6e46
Binary files /dev/null and b/samples/vision/hgnetv2/test_data/result.jpg differ
diff --git a/samples/vision/hgnetv2/test_data/sandbar.JPEG b/samples/vision/hgnetv2/test_data/sandbar.JPEG
new file mode 100644
index 0000000..fb50bdb
Binary files /dev/null and b/samples/vision/hgnetv2/test_data/sandbar.JPEG differ