Recently, I have been working on deploying my LightGBM model on an Alveo U50 FPGA card using Conifer. To do this, I implemented a LightGBM converter and wrote a build script for my LightGBM regression model. The model has 300 trees with a maximum depth of 5 and 105 input features.
After building the model, I wrote C++ inference code to load the xclbin and benchmark the inference speed. However, I found that it takes approximately 90 microseconds to infer a single sample, which is much slower than I was anticipating.
Could you provide any insights as to why this might be happening? Am I possibly doing something wrong? Any help would be greatly appreciated. Thank you!
Here is the command I used to build the infer.cpp file:
# Move infer.cpp into the Conifer output directory (cfg['OutputDir'] = 'prj_lgb'), then build with the following command (adjust the include paths for your installation)
g++ -O3 -std=c++14 infer.cpp firmware/BDT.cpp firmware/my_prj.cpp -o app \
-I$XILINX_XRT/include/ -L$XILINX_XRT/lib -lxrt_coreutil -pthread \
-I/home/wang/Xilinx/Vitis_HLS/2022.2/include
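For reference, a minimal sketch of XRT host code that loads an xclbin and times a single-sample inference looks roughly like the following. The xclbin file name, kernel name ("my_prj"), and argument order below are assumptions for illustration and will differ depending on what Conifer generates for your project.

#include <xrt/xrt_device.h>
#include <xrt/xrt_kernel.h>
#include <xrt/xrt_bo.h>
#include <chrono>
#include <iostream>
#include <vector>

int main() {
    // Placeholder names: replace with your xclbin path and the kernel name in your project
    xrt::device device(0);                                // open the U50
    auto uuid = device.load_xclbin("my_prj.xclbin");      // program the FPGA
    xrt::kernel kernel(device, uuid, "my_prj");           // kernel name is an assumption

    const size_t n_features = 105;
    const size_t n_outputs  = 1;

    // Device buffers placed in the memory banks the kernel arguments are connected to
    xrt::bo in_bo (device, n_features * sizeof(float), kernel.group_id(0));
    xrt::bo out_bo(device, n_outputs  * sizeof(float), kernel.group_id(1));

    std::vector<float> x(n_features, 0.5f);  // one dummy sample
    in_bo.write(x.data());
    in_bo.sync(XCL_BO_SYNC_BO_TO_DEVICE);

    // Time one full single-sample round trip: kernel launch + output sync
    auto t0  = std::chrono::high_resolution_clock::now();
    auto run = kernel(in_bo, out_bo);
    run.wait();
    out_bo.sync(XCL_BO_SYNC_BO_FROM_DEVICE);
    auto t1  = std::chrono::high_resolution_clock::now();

    float y = 0.0f;
    out_bo.read(&y);
    std::cout << "prediction = " << y << ", latency = "
              << std::chrono::duration<double, std::micro>(t1 - t0).count()
              << " us" << std::endl;
    return 0;
}

Note that the timed region in this sketch covers the full round trip (kernel launch, execution, and device-to-host sync), not just the tree evaluation itself.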