Skip to content

Update performance regex pattern for log matching#101

Open
Saiamd999 wants to merge 1 commit intomainfrom
Saiamd999/ROCM-21596
Open

Update performance regex pattern for log matching#101
Saiamd999 wants to merge 1 commit intomainfrom
Saiamd999/ROCM-21596

Conversation

@Saiamd999
Copy link
Copy Markdown

Motivation

we have (',' and ‘/s’) after the perf numbers which breaks the regex pattern matching in the madengine framework.
fixes: https://amd-hub.atlassian.net/browse/ROCM-21596

Technical Details

Test Plan

Test Result

Submission Checklist

Copilot AI review requested due to automatic review settings April 14, 2026 07:46
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates madengine’s log parsing to recognize performance lines where the numeric value is followed by additional suffix characters (notably , and /s), addressing ROCM-21596 where the previous regex failed to match those formats.

Changes:

  • Update the performance: regex in ContainerRunner.run_container() to tolerate a value suffix (e.g., /s) and an optional comma before the metric token.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


# Pattern 1: "performance: 12345 metric_name" (original expected format)
perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)\s+([a-zA-Z_][a-zA-Z0-9_]*)'
perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)[/a-zA-Z]*,?\s+([a-zA-Z_][a-zA-Z0-9_]*)'
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change expands the supported performance: log format in ContainerRunner, but the shared multi-node parser in src/madengine/deployment/base.py still uses the older strict pattern (performance:\s*([\d.]+)\s+(\S+)) and will fail on value suffixes like /s and/or a trailing comma. If this log format change is meant to be supported across madengine (SLURM/Kubernetes too), consider updating the shared parser or refactoring to a single helper so performance extraction stays consistent across execution paths.

Copilot uses AI. Check for mistakes.

# Pattern 1: "performance: 12345 metric_name" (original expected format)
perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)\s+([a-zA-Z_][a-zA-Z0-9_]*)'
perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)[/a-zA-Z]*,?\s+([a-zA-Z_][a-zA-Z0-9_]*)'
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are existing tests that assert parsing of performance: 100.5 samples_per_second (e.g. tests/integration/test_container_execution.py::test_run_container_success), but no coverage for the newly accepted formats (comma after value and /s suffix). Add test cases for representative new log lines (e.g. performance: 100.5, samples_per_second and performance: 100.5/s, samples_per_second) to prevent regressions in this regex.

Copilot uses AI. Check for mistakes.

# Pattern 1: "performance: 12345 metric_name" (original expected format)
perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)\s+([a-zA-Z_][a-zA-Z0-9_]*)'
perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)[/a-zA-Z]*,?\s+([a-zA-Z_][a-zA-Z0-9_]*)'
Copy link

Copilot AI Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated regex matches formats like performance: 123/s, metric, but it will not match when the comma comes immediately after the number (e.g. performance: 123,/s metric or performance: 123, /s metric) because ,? is only allowed after the optional [/a-zA-Z]* suffix and \s+ is required right after the comma. If the logs can contain a comma and /s after the value in either order (as described in the PR motivation), adjust the pattern to accept both orderings (or normalize the value token before matching by stripping , and /s variants).

Suggested change
perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)[/a-zA-Z]*,?\s+([a-zA-Z_][a-zA-Z0-9_]*)'
# Also accepts optional "/s"-style suffixes and commas in either order,
# e.g. "performance: 123/s, metric", "performance: 123,/s metric",
# and "performance: 123, /s metric".
perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)(?:\s*(?:/[/a-zA-Z]*)?\s*,?|\s*,\s*(?:/[/a-zA-Z]*)?)\s+([a-zA-Z_][a-zA-Z0-9_]*)'

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants