Skip to content

Add FPS benchmark metric to influx#214

Open
tjb7 wants to merge 37 commits intomainfrom
tjb_fps_graphs
Open

Add FPS benchmark metric to influx#214
tjb7 wants to merge 37 commits intomainfrom
tjb_fps_graphs

Conversation

@tjb7
Copy link
Copy Markdown
Collaborator

@tjb7 tjb7 commented Apr 2, 2026

Adds influx db pushing of fps results, is BEST EFFORT, depends on oakctl installed on running system.

@tjb7 tjb7 requested a review from a team as a code owner April 2, 2026 05:27
@tjb7 tjb7 requested review from conorsim, klemen1999, kozlov721 and tersekmatija and removed request for a team April 2, 2026 05:27
@tjb7 tjb7 marked this pull request as draft April 2, 2026 05:27
@tjb7
Copy link
Copy Markdown
Collaborator Author

tjb7 commented Apr 2, 2026

NOTE: it's not really ready for review yet.

Basic idea is just to get FPS metrics into influx so we can watch them across all versions of everything, so nothing slips by

@tjb7 tjb7 requested a review from danilo-pejovic April 2, 2026 05:33
@tjb7 tjb7 force-pushed the tjb_fps_graphs branch from e5bc2d2 to 9ae4b83 Compare April 7, 2026 07:40
@tjb7 tjb7 force-pushed the tjb_fps_graphs branch from 69f7149 to 45768b9 Compare April 8, 2026 11:00
@@ -0,0 +1,85 @@
name: HIL FPS Benchmark Influx
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danilo-pejovic not sure if this is needed, would you rather have a seperate workflow or instrument all the tests directly (in rvc4_test.yaml) ... it is my preference to instrument all tests/gather all data, but up to your choice.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests that are running on our hil are bom-test.yaml ones, not rvc4_test - those seem internal tests to ml team. I would just add influx thing to bom-tests and not create a new workflow.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are still creating a new unnecessary workflow. Can we edit this one that is already part of the bom to also pass info to influx?

https://github.com/luxonis/modelconverter/blob/main/.github/workflows/bom-test.yaml

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commend is still unaddressed - lets not create separate workflow for no reason.

I edited bom-test workflow to run this - it will just need to be returned to pip install framework when confirmed to work with experimental hil framework. Same needs to be done for run_hil_tests.sh

@tjb7 tjb7 marked this pull request as ready for review April 8, 2026 11:10
Comment thread .github/workflows/rvc4_test.yaml
Comment thread tests/test_benchmark/run_hil_tests.sh Outdated
# Cache device metadata once for the whole run using oakctl. If oakctl is not
# available, keep the benchmark runnable and record explicit placeholder values
# for the missing oakctl-derived metadata.
oakctl_output=$(oakctl list --format json 2>/dev/null || printf '')
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oakctl is not part of these tests. Why are we adding more moving parts that can fail?

Hil framework has built in functions to account get all necessary info from camera:

danilo@danilo:~/Downloads$ camera -t slo316-4pro -n test oak4_pro info -v +----------+----------+------------+------------+------------+------------+------------+--------------+----------+----------------+-------------------------------------------------+----------+ | name | model | platform | revision | mxid | username | password | hostname | adb | state | os_version | adb_id | +==========+==========+============+============+============+============+============+==============+==========+================+=================================================+==========+ | oak4_pro | oak4_pro | rvc4 | 9 | 1583798758 | root | oelinux123 | 10.12.143.89 | 5e66d9e6 | setup-complete | 1.30.0+955791f9014cfe160419deae2f64980f180f4d9c | 5e66d9e6 | +----------+----------+------------+------------+------------+------------+------------+--------------+----------+----------------+-------------------------------------------------+----------+ danilo@danilo:~/Downloads$ camera -t slo316-4pro -n test oak4_pro info -v -j [{"name": "oak4_pro", "model": "oak4_pro", "platform": "rvc4", "revision": 9, "mxid": "1583798758", "username": "root", "password": "oelinux123", "hostname": "10.12.143.89", "adb": "5e66d9e6", "state": "setup-complete", "os_version": "1.30.0+955791f9014cfe160419deae2f64980f180f4d9c", "adb_id": "5e66d9e6"}]

Alternatively - camera object in python can give all the necessary info in even more readable/maintainable way. That is probably the way we should supply that info.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will sub-in the json version of the above:

(.hil) hil@slo130-fl:~$ camera -t slo316-4pro -n test oak4_pro info -j
[{"name": "oak4_pro", "model": "oak4_pro", "platform": "rvc4", "revision": 9, "mxid": "1583798758", "hostname": "10.12.143.89", "os_version": "1.27.1+9950fe7353d76fec53a99019be9e1d6130e8f69a", "adb_id": "5e66d9e6"}]

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: how to determine the camera used if there are multiple cameras?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I will just use: camera -t ${HIL_TESTBED} -n test all info -j and take the first:

(.hil) hil@slo238-4s:~$ camera -t ${HIL_TESTBED} -n test all info -j use_ssh 2026-04-13 07:21:36,832 [INFO] ssh command failed with the following output: 2026-04-13 07:21:36,832 [INFO] 2026-04-13 07:21:36,832 [INFO] Error message (stderr): 2026-04-13 07:21:36,832 [INFO] 2026-04-13 07:21:36,832 [INFO] Exit code: 5 [{"name": "oak4_s", "model": "oak4_s", "platform": "rvc4", "revision": 9, "mxid": "2671960531", "hostname": "10.12.140.235", "os_version": "1.27.1+9950fe7353d76fec53a99019be9e1d6130e8f69a", "adb_id": "9f42e1d3"}]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As stated in dms via example provided and in first comment here I would prefer camera metadata to be gathered inside the script:

Alternatively - camera object in python can give all the necessary info in even more readable/maintainable way. That is probably the way we should supply that info.

It would be more readable, consistent and maintainable.

Comment thread tests/test_benchmark/run_hil_tests.sh Outdated
return f'"{escaped}"'


def _write_fps_result_to_influx(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be implemented in hil_framework? We can probably hardcode influx variables there and then create simple function like camera/testbed.log_to_influx(arguments) and it will be pretty similar to this implementation and how sanity checks work for hil already. That way it is more maintainable - if we for example change where we host out database, it will be one change instead of changing every repo.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, but let's do it in a follow up ticket, just so I can get this work finished and immediately useful.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to do things properly from the start if we know it is something we are going to be building on - it shouldnt take long.

Comment thread tests/test_benchmark/test_benchmark_regression.py Outdated
Comment thread tests/test_benchmark/test_benchmark_regression.py
@@ -0,0 +1,85 @@
name: HIL FPS Benchmark Influx
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests that are running on our hil are bom-test.yaml ones, not rvc4_test - those seem internal tests to ml team. I would just add influx thing to bom-tests and not create a new workflow.

default="rvc4",
help="Target platform to benchmark (default: rvc4).",
)
parser.addoption(
Copy link
Copy Markdown
Collaborator

@danilo-pejovic danilo-pejovic Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments above/below adds more info.

Probably should be implemented in more maintainable way to avoid piping info we already have multiple times.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if I understand correctly, I should just remove this here, and fail hard if the env is not properly configured. I will make the change, please comment again if you mean something else here.

Copy link
Copy Markdown
Collaborator

@danilo-pejovic danilo-pejovic Apr 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we are passing camera mxid, os versions,... as arguments to the script when there is no need to do that and that info is readily available inside the script via hil framework.

Comment thread tests/test_benchmark/test_benchmark_regression.py
Comment thread tests/test_benchmark/run_hil_tests.sh Outdated
@danilo-pejovic
Copy link
Copy Markdown
Collaborator

danilo-pejovic commented Apr 10, 2026

Also, here is a screenshot of the final data in grafana:
image

And the direct link: http://10.12.139.254:3000/d/ad4vt7g/fps-model-converter-tests-v3?orgId=1&from=2026-04-07T10:26:51.553Z&to=2026-04-07T17:48:13.745Z&timezone=browser&var-test_name=luxonis%2Fyolov8-nano-pose-estimation:coco-512x288&var-test_name=luxonis%2Ffastsam-s:512x288&var-test_name=luxonis%2Fyolov6-nano:r2-coco-512x288&var-depthai_version=3.3.0&var-depthai_version=3.4.0&var-depthai_version=3.5.0&var-camera_os_version=$__all&var-run_id=$__all&var-camera_mxid=$__all&var-testbed_name=$__all&var-min_fps_threshold=

In the graph max and min values for models dont seem to be consistent? IIRC they are just some expected value with buffer on both sides so it should be always the same for the same model. Are we maybe mixing up models in tables?

Also as long as those min and max values are set constants we probably dont need to track them for every test. It just adds clutter

@tjb7
Copy link
Copy Markdown
Collaborator Author

tjb7 commented Apr 13, 2026

Will try to refactor out to the HIL environment

@tjb7
Copy link
Copy Markdown
Collaborator Author

tjb7 commented Apr 16, 2026

Refactor complete, ready for another look @danilo-pejovic

Copy link
Copy Markdown
Collaborator

@danilo-pejovic danilo-pejovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/luxonis/modelconverter/actions/runs/24509126353/job/71636481076

tests seem to pass. Lets fix last couple of issues and it should be good to go.

Comment thread TASK.md Outdated
@@ -0,0 +1,87 @@
# FPS Influx Split Plan
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets not commit and push task.md

@@ -0,0 +1,85 @@
name: HIL FPS Benchmark Influx
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commend is still unaddressed - lets not create separate workflow for no reason.

I edited bom-test workflow to run this - it will just need to be returned to pip install framework when confirmed to work with experimental hil framework. Same needs to be done for run_hil_tests.sh

return slug.rsplit("/", 1)[-1]


def _normalize_tag(value: Any) -> str:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like something we will want to do every time we would want to do everytime even on other performance tests? In that case it belongs in hil_framework and not here

Comment thread tests/test_benchmark/run_hil_tests.sh Outdated
exit 1
fi

rvc4_camera=$(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from hil_framework.lib_testbed.db_source.InfluxClient import InfluxClient
from hil_framework.lib_testbed.config.Config import Config
from hil_framework.lib_testbed.utils.Testbed import Testbed
from hil_framework.lib_testbed.camera.RVC4.RVC4CameraBase import RVC4CameraBase

import os

testbed_name=os.getenv("HIL_TESTBED")
config=Config(testbed_name)
testbed = Testbed(config)
camera: RVC4CameraBase = next((cam for cam in testbed.cameras if cam.platform == 'rvc4'), None)

print(camera.name)
print(camera.platform)
print(camera.mxid)
print(camera.get_os_version()) # note if this creates issue as it has been affected by current refactor, fixed the issue but could still be bad on older versions of hil framework
print(camera.revision)
print(camera.model.name)
print(camera.hostname)

As stated in:

#214 (comment) (preferred choice, towards the end)

#214 (comment)

https://github.com/luxonis/modelconverter/pull/214/changes#r3072218602

and multiple times in dms with examples

I would heavily prefer to collect data in python as camera object is less likely to be changed in the future compared to cli interface which is generally used for interactive purposes (granted both are unlikely), is more readable, maintainable and less likely to suffer random bugs. I will not block the release over this as it is unlikely to cause issues in short term, but I dont think I have gotten a response - so if we go for cli tool I would at least want it to be a conscious choice. Cli tool was just a quick example from my side before, I will be more clear from now on.

Copy link
Copy Markdown
Collaborator Author

@tjb7 tjb7 Apr 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok will change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants