Hi there,
I am trying to run a PyTorch toy example model on Rubik Pi 3 (QCS6490) using the following code (taken from here).
The model was successfully compiled and downloaded, and I have a mobilenetv2.tflite file locally.
Now I want to use the model for inference, so I am using the following code:
import numpy as np
import tflite_runtime.interpreter as tflite

def run_inference(model_path, input_data):
    # Load interpreter
    #interpreter = tflite.Interpreter(model_path=model_path)
    interpreter = tflite.Interpreter(model_path=model_path, experimental_delegates=[tflite.load_delegate('/usr/lib/libQnnTFLiteDelegate.so')])
    interpreter.allocate_tensors()
    # Get I/O details
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    # Set input
    interpreter.set_tensor(input_details[0]['index'], input_data)
    # Run inference
    interpreter.invoke()
    # Get output
    output = interpreter.get_tensor(output_details[0]['index'])
    return output

# Example usage
input_shape = (1, 3, 224, 224)
input_data = np.random.randn(*input_shape).astype(np.float32)
result = run_inference("mobilenetv2.tflite", input_data)
print(f"Output shape: {result.shape}")
and I get the following error:
File "/home/ubuntu/.pyenv/versions/3.8.18/lib/python3.8/site-packages/tflite_runtime/interpreter.py", line 513, in __init__
self._interpreter.ModifyGraphWithDelegate(
RuntimeError: Restored original execution plan after delegate application failure.
If I don't explicitly specify a delegate and let it automatically select a delegate using interpreter = tflite.Interpreter(model_path=model_path), it runs on the CPU and gives the following output:
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Output shape: (1, 1000)
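To make the two cases above easier to compare, they could be wrapped in one helper that tries the delegate first and falls back to the default CPU interpreter if loading or applying the delegate fails. This is only a rough sketch, not a verified fix: `make_interpreter` and its parameters are hypothetical names, and the `backend_type` option in the usage comment is an assumption about the QNN delegate that should be checked against Qualcomm's documentation.

```python
def make_interpreter(model_path, interpreter_cls, load_delegate,
                     delegate_path=None, delegate_options=None):
    """Build an interpreter with a delegate if possible, else without one.

    interpreter_cls and load_delegate are passed in (hypothetical design choice)
    so the helper does not import tflite_runtime itself.
    Returns (interpreter, used_delegate).
    """
    if delegate_path is not None:
        try:
            delegate = load_delegate(delegate_path, delegate_options or {})
            interpreter = interpreter_cls(
                model_path=model_path, experimental_delegates=[delegate]
            )
            return interpreter, True
        except (ValueError, RuntimeError, OSError) as exc:
            # ValueError: the delegate library could not be loaded.
            # RuntimeError: the delegate loaded but could not be applied to the graph.
            print(f"Delegate unavailable, falling back to CPU: {exc}")
    # Default path: no explicit delegate.
    return interpreter_cls(model_path=model_path), False


# Possible usage on the device (assumption: the delegate accepts a
# "backend_type" option; not verified on this board):
#
#   import tflite_runtime.interpreter as tflite
#   interpreter, on_npu = make_interpreter(
#       "mobilenetv2.tflite", tflite.Interpreter, tflite.load_delegate,
#       "/usr/lib/libQnnTFLiteDelegate.so", {"backend_type": "htp"},
#   )
#   print("Using delegate:", on_npu)
```

The returned flag shows which path was actually taken, so the script can report whether the delegate was applied instead of raising on failure.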
How can I get the network accelerated on the HTP NPU? Is there anything wrong with my inference script?
Thank you in advance!
P.S. I will post the system info in the next comment.