fix(benchmark_runner): record failed-item exceptions on TestRecord so… by WatchTree-19 · Pull Request #1552 · mlcommons/modelbench

WatchTree-19 · 2026-06-24T14:19:58Z

… # errors is accurate (#1353)

_make_test_record hardcoded test_item_exceptions=[], so exceptions captured on failed items never reached the TestRecord. HazardScore.score() sums len(test_record.test_item_exceptions) into HazardScore.exceptions, which feeds the '# errors' column of the results table, so it always read 0 even when items raised.

Populate test_item_exceptions from run.failed_items_for(sut, test), one TestItemExceptionRecord per failed item carrying exceptions (mirrors simple_test_runner). Adds a regression test plus a clean-run test.
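For readers without the diff in front of them, here is a minimal sketch of the behaviour described above. It is not the modelbench code: the dataclasses are simplified stand-ins, `FakeRun`, `make_test_record` and `count_errors` are hypothetical names, and the shape of what `failed_items_for` returns (pairs of item id and exception) is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TestItemExceptionRecord:
    # Simplified stand-in; the real record may carry different fields.
    item_id: str
    error_message: str


@dataclass
class TestRecord:
    # Simplified stand-in holding only what the error count needs.
    test_uid: str
    test_item_exceptions: list = field(default_factory=list)


class FakeRun:
    """Hypothetical run object exposing failed_items_for(sut, test)."""

    def __init__(self, failures):
        # failures maps (sut, test) -> list of (item_id, exception) pairs
        self._failures = failures

    def failed_items_for(self, sut, test):
        return self._failures.get((sut, test), [])


def make_test_record(run, sut, test):
    # Before the fix this list was hardcoded to [], so failures were dropped.
    exceptions = [
        TestItemExceptionRecord(item_id=item_id, error_message=repr(exc))
        for item_id, exc in run.failed_items_for(sut, test)
    ]
    return TestRecord(test_uid=test, test_item_exceptions=exceptions)


def count_errors(test_records):
    # Mirrors the summation the description attributes to HazardScore.score().
    return sum(len(record.test_item_exceptions) for record in test_records)
```

With two failed items recorded for one SUT/test pair, `count_errors` returns 2 for that record, and a pair with no failures still produces an empty list, which corresponds to the regression and clean-run cases the PR mentions.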

github-actions · 2026-06-24T14:20:10Z

MLCommons CLA bot:
Thank you very much for your submission; we really appreciate it. Before we can accept your contribution,
we ask that you sign the MLCommons CLA (Apache 2). Please submit your GitHub ID to our onboarding form to initiate
authorization. If you are from a MLCommons member organization, we will request that you be added to the CLA.
If you are not from a member organization, we will email you a CLA to sign. For any questions, please contact
support@mlcommons.org.
0 out of 1 committers have signed the MLCommons CLA.
❌ @WatchTree-19
You can retrigger this bot by commenting recheck in this Pull Request.

WatchTree-19 requested a review from a team as a code owner June 24, 2026 14:19

WatchTree-19 wants to merge 1 commit into mlcommons:main from WatchTree-19:fix/1353-record-failed-item-exceptions
