Feat: Support creation of run_metadata.json at the end of benchmark run by anandhu-eng · Pull Request #374 · mlcommons/endpoints

anandhu-eng · 2026-06-23T20:29:24Z

What does this PR do?

Type of change

Bug fix
New feature
Documentation update
Refactor/cleanup

Related issues

Testing

Tests added/updated
All tests pass locally
Manual testing completed

Checklist

Code follows project style
Pre-commit hooks pass
Documentation updated (if needed)

Captures qps, request/TTFT/TPOT latency percentiles, and scalar metrics into a run_metadata.json artifact in the report directory. Infrastructure fields (tensor_parallel, config_summary, etc.) are left as None for a separate sysinfo population step. Failure to write run_metadata.json is isolated so results.json is never suppressed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
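The isolation described in this commit message could look roughly like the sketch below. It is not the PR's actual code: the function name `write_reports` and its arguments are hypothetical, and it only illustrates writing results.json unconditionally while treating run_metadata.json as best-effort.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def write_reports(report_dir: Path, results: dict, run_metadata: dict) -> None:
    """Write results.json, then make a best-effort attempt at run_metadata.json.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    report_dir.mkdir(parents=True, exist_ok=True)

    # Primary artifact: written first, errors propagate to the caller.
    with open(report_dir / "results.json", "w", encoding="utf-8") as f:
        json.dump(results, f, indent=2)

    # Secondary artifact: a failure here is logged and swallowed, so it can
    # never prevent results.json from being produced.
    try:
        with open(report_dir / "run_metadata.json", "w", encoding="utf-8") as f:
            json.dump(run_metadata, f, indent=2)
    except (OSError, TypeError, ValueError):
        logger.exception("Failed to write run_metadata.json")
```

Writing the primary file before attempting the secondary one means that even an unexpected exception type in the metadata step cannot affect results.json.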

Replaces datetime.now(UTC) (finalization time) with monotime_to_datetime(bench.session.start_time_ns).date().isoformat(), recording the date the benchmark actually started as a YYYY-MM-DD string. Also threads start_time_ns through _build_run_metadata as a keyword arg and updates all test call sites to pass _START_NS.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

datetime.fromtimestamp() in monotime_to_datetime() returns naive local time. Calling .astimezone(UTC) before .date() ensures run_date is always the UTC calendar date, making the value consistent across machines in different timezones.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
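A small sketch of why the conversion matters. To keep the result deterministic it uses a timezone-aware datetime with a fixed UTC-8 offset as a stand-in for a machine's local time (an assumption; the PR itself works with the naive value returned by monotime_to_datetime()).

```python
from datetime import datetime, timedelta, timezone

# Stand-in for a machine whose local time is UTC-8 (assumption for illustration).
local_tz = timezone(timedelta(hours=-8))

# 20:29 on 2026-06-23 at UTC-8 is already 04:29 on 2026-06-24 in UTC.
local_start = datetime(2026, 6, 23, 20, 29, tzinfo=local_tz)

# Taking the date directly yields the local calendar date.
local_date = local_start.date().isoformat()

# Converting to UTC first yields the UTC calendar date.
utc_date = local_start.astimezone(timezone.utc).date().isoformat()

print(local_date)  # 2026-06-23
print(utc_date)    # 2026-06-24
```

Two machines in different timezones that start a run at the same instant would record the same run_date only with the second form.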

- Add concurrency field (from load_pattern.target_concurrency)
- Rename measured_latency_*_avg → measured_latency_*_average
- Reorder all fields to match the canonical submission field order

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
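One way to reorder a metadata dict to a fixed field order is sketched below. The `CANONICAL_ORDER` list is a hypothetical three-field subset, not the real canonical submission order, and `reorder_fields` is an illustrative helper rather than code from the PR.

```python
# Hypothetical subset of a canonical field order (assumption for illustration).
CANONICAL_ORDER = ["run_date", "concurrency", "qps"]


def reorder_fields(metadata: dict) -> dict:
    """Return a copy with canonical fields first, then any remaining fields."""
    # Dicts preserve insertion order, so building the dict in canonical
    # order also fixes the key order in the serialized JSON.
    ordered = {k: metadata[k] for k in CANONICAL_ORDER if k in metadata}
    for key, value in metadata.items():
        if key not in ordered:
            ordered[key] = value
    return ordered
```

For example, `reorder_fields({"qps": 1, "extra": 2, "run_date": "2026-06-23"})` yields keys in the order run_date, qps, extra.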

github-actions · 2026-06-23T20:29:32Z

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

gemini-code-assist

Code Review

This pull request introduces the generation of a run_metadata.json file containing benchmark run metrics during finalization, along with comprehensive unit tests. The review feedback highlights a potential ZeroDivisionError if concurrency is zero, compatibility concerns with Python versions older than 3.11 due to the direct import of UTC from datetime, and a recommendation to explicitly specify UTF-8 encoding when writing the metadata file.

gemini-code-assist · 2026-06-23T20:30:20Z

 from dataclasses import dataclass, field
 from dataclasses import replace as dataclass_replace
-from datetime import datetime
+from datetime import UTC, datetime


Importing UTC directly from datetime was introduced in Python 3.11. If this project is run on Python 3.10 or older, this import will raise an ImportError. To ensure backward compatibility with older Python versions, it is safer to import timezone and use timezone.utc instead.

Suggested change

-from datetime import UTC, datetime
+from datetime import datetime, timezone

I think this is fine as the project uses python>3.12 as detailed here
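For a codebase that did need to run on interpreters older than 3.11, the reviewer's concern could be addressed with a guarded import, as in this sketch. It is only illustrative; as the reply above notes, this project's Python requirement makes the direct import acceptable.

```python
from datetime import datetime, timezone

try:
    # datetime.UTC was added in Python 3.11 as an alias of timezone.utc.
    from datetime import UTC
except ImportError:
    # Older interpreters: fall back to the object the alias points to.
    UTC = timezone.utc

# Either way, UTC refers to the same singleton as timezone.utc.
stamp = datetime(2026, 6, 23, 20, 29, tzinfo=UTC)
print(stamp.isoformat())  # 2026-06-23T20:29:00+00:00
```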

gemini-code-assist · 2026-06-23T20:30:20Z

+        "run_date": monotime_to_datetime(start_time_ns)
+        .astimezone(UTC)
+        .date()
+        .isoformat(),


Use timezone.utc instead of UTC to maintain compatibility with Python versions older than 3.11.

Suggested change

-        "run_date": monotime_to_datetime(start_time_ns)
-        .astimezone(UTC)
-        .date()
-        .isoformat(),
+        "run_date": monotime_to_datetime(start_time_ns)
+        .astimezone(timezone.utc)
+        .date()
+        .isoformat(),

I think this is fine as the project uses python>3.12 as detailed here

- Guard tps_per_user division against concurrency=0 (falsy check instead of `is not None`) to prevent ZeroDivisionError
- Add encoding="utf-8" to run_metadata.json open() for platform safety

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
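The difference between the two guards can be sketched as follows. The helper name and signature are assumptions for illustration; only the tps_per_user field name and the falsy check come from the commit message.

```python
from __future__ import annotations


def tps_per_user(tps: float | None, concurrency: int | None) -> float | None:
    """Return tokens/sec per user, or None when it cannot be computed.

    Hypothetical helper: the name and signature are assumptions.
    """
    # `not concurrency` is true for both None and 0, so a zero concurrency
    # never reaches the division. An `is not None` check would let 0 through
    # and raise ZeroDivisionError.
    if tps is None or not concurrency:
        return None
    return tps / concurrency


print(tps_per_user(100.0, 4))     # 25.0
print(tps_per_user(100.0, 0))     # None
print(tps_per_user(100.0, None))  # None
```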

nvzhihanj · 2026-06-23T21:54:35Z

+        ),
+        "link_config": None,
+        "link_logs": None,
+        "measured_latency_ttft_min": _metric_stat(ttft, "min"),


Hi Anandu, I would suggest that we have the submission checker, or some other post-processing step, auto-generate run_metadata.json by reading results_summary.json. In other words, make results_summary.json self-complete and don't create more output files that duplicate its contents.

Creating more separate files with a similar set of metrics increases the entropy of the system and decreases the maintainability of the module. Also, not all scenarios in endpoints will have TTFT/TPOT (e.g. T2V, VLM).

We can discuss in Slack or iterate on the exact requirements. Right now we have points.yaml and results_summary.json, which should have all the information.
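A post-processing step of the kind suggested here might look like the sketch below. Everything in it is hypothetical: the `FIELD_MAP` keys on the results_summary.json side are invented for illustration, and the real schema of that file is not shown in this thread.

```python
# Hypothetical mapping from run_metadata field names to keys assumed to
# exist in results_summary.json (the real schema is not shown in this PR).
FIELD_MAP = {
    "qps": "qps",
    "measured_latency_ttft_min": "ttft_min",
}


def derive_run_metadata(summary: dict) -> dict:
    """Build run metadata purely from an already-loaded results summary."""
    # Metrics a scenario does not report (e.g. no TTFT) come through as None
    # instead of raising, so one function can serve every scenario.
    return {dst: summary.get(src) for dst, src in FIELD_MAP.items()}


print(derive_run_metadata({"qps": 12.5}))
# {'qps': 12.5, 'measured_latency_ttft_min': None}
```

Deriving the file on demand keeps a single source of truth for the metrics, which is the maintainability point the comment raises.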

anandhu-eng and others added 4 commits June 23, 2026 19:35

anandhu-eng requested a review from a team June 23, 2026 20:29

github-actions Bot requested review from arekay-nv and nvzhihanj June 23, 2026 20:29

gemini-code-assist Bot reviewed Jun 23, 2026

nvzhihanj requested changes Jun 23, 2026

Merge branch 'main' into feat/runmetadata

c97e07b

anandhu-eng wants to merge 6 commits into main from feat/runmetadata

3 participants
