This issue can be closed by providing data samples for which the current `nvbench_compare.py` behavior is insufficient or unstable.