Recently, I have been working on deploying my LightGBM model on an Alveo U50 FPGA card using Conifer. To do this, I implemented a LightGBM converter and wrote a build script for my LightGBM regression model. The model has 300 trees with a maximum depth of 5 and 105 input features.
After building the model, I wrote C++ inference code to load the xclbin and benchmark the inference speed. However, I found that it takes approximately 90 microseconds to infer a single sample, which is much slower than I was anticipating.
Could you provide any insights as to why this might be happening? Am I possibly doing something wrong? Any help would be greatly appreciated. Thank you!
Here is the command I used to build the infer.cpp file:
# Move infer.cpp into the Conifer output directory (cfg['OutputDir'] = 'prj_lgb'), then build with the following command (adjust the include paths for your installation)
g++ -O3 -std=c++14 infer.cpp firmware/BDT.cpp firmware/my_prj.cpp -o app \
-I$XILINX_XRT/include/ -L$XILINX_XRT/lib -lxrt_coreutil -pthread \
-I/home/wang/Xilinx/Vitis_HLS/2022.2/include
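For reference, a minimal sketch of XRT host code that loads an xclbin and times a single-sample inference looks roughly like the following. The xclbin file name, kernel name ("my_prj"), and argument order below are assumptions for illustration and will differ depending on what Conifer generates for your project.

#include <xrt/xrt_device.h>
#include <xrt/xrt_kernel.h>
#include <xrt/xrt_bo.h>
#include <chrono>
#include <iostream>
#include <vector>

int main() {
    // Placeholder names: replace with your xclbin path and the kernel name in your project
    xrt::device device(0);                                // open the U50
    auto uuid = device.load_xclbin("my_prj.xclbin");      // program the FPGA
    xrt::kernel kernel(device, uuid, "my_prj");           // kernel name is an assumption

    const size_t n_features = 105;
    const size_t n_outputs  = 1;

    // Device buffers placed in the memory banks the kernel arguments are connected to
    xrt::bo in_bo (device, n_features * sizeof(float), kernel.group_id(0));
    xrt::bo out_bo(device, n_outputs  * sizeof(float), kernel.group_id(1));

    std::vector<float> x(n_features, 0.5f);  // one dummy sample
    in_bo.write(x.data());
    in_bo.sync(XCL_BO_SYNC_BO_TO_DEVICE);

    // Time one full single-sample round trip: kernel launch + output sync
    auto t0  = std::chrono::high_resolution_clock::now();
    auto run = kernel(in_bo, out_bo);
    run.wait();
    out_bo.sync(XCL_BO_SYNC_BO_FROM_DEVICE);
    auto t1  = std::chrono::high_resolution_clock::now();

    float y = 0.0f;
    out_bo.read(&y);
    std::cout << "prediction = " << y << ", latency = "
              << std::chrono::duration<double, std::micro>(t1 - t0).count()
              << " us" << std::endl;
    return 0;
}

Note that the timed region in this sketch covers the full round trip (kernel launch, execution, and device-to-host sync), not just the tree evaluation itself.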