Great work all, been keeping an eye on what you all are doing!
Curious what techniques, if any, are applied to the original model/script to get it working well in the banana environment.
Meaning: what could I, as a developer adding a new model to be deployed on banana, do to the original inference script/REST endpoint to aid banana's ability to improve model execution and model load times? If I try to run an ONNX model, will that instead slow things down? I understand banana does some additional work on our inference code, but knowing which parts it can and cannot speed up or slow down would help us understand which solutions can be customized to run well on it.
Thanks!