Great work all, been keeping an eye on what you all are doing!
Curious what techniques, if any, are applied to the original model/script to get it working well in the banana environment.
Meaning: what could I, as a developer adding a new model to be deployed on banana, do to the original inference script/REST endpoint to aid banana's ability to improve model execution and model load times? If I try to run an ONNX model, will that instead slow things down? I understand banana does some additional work on our inference code, but knowing which parts it can and cannot speed up or slow down would help us understand which solutions can be customized to run well on it.
Thanks!