Update performance regex pattern for log matching#101
Conversation
There was a problem hiding this comment.
Pull request overview
This PR updates madengine’s log parsing to recognize performance lines where the numeric value is followed by additional suffix characters (notably , and /s), addressing ROCM-21596 where the previous regex failed to match those formats.
Changes:
- Update the
performance:regex inContainerRunner.run_container()to tolerate a value suffix (e.g.,/s) and an optional comma before the metric token.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| # Pattern 1: "performance: 12345 metric_name" (original expected format) | ||
| perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)\s+([a-zA-Z_][a-zA-Z0-9_]*)' | ||
| perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)[/a-zA-Z]*,?\s+([a-zA-Z_][a-zA-Z0-9_]*)' |
There was a problem hiding this comment.
This change expands the supported performance: log format in ContainerRunner, but the shared multi-node parser in src/madengine/deployment/base.py still uses the older strict pattern (performance:\s*([\d.]+)\s+(\S+)) and will fail on value suffixes like /s and/or a trailing comma. If this log format change is meant to be supported across madengine (SLURM/Kubernetes too), consider updating the shared parser or refactoring to a single helper so performance extraction stays consistent across execution paths.
|
|
||
| # Pattern 1: "performance: 12345 metric_name" (original expected format) | ||
| perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)\s+([a-zA-Z_][a-zA-Z0-9_]*)' | ||
| perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)[/a-zA-Z]*,?\s+([a-zA-Z_][a-zA-Z0-9_]*)' |
There was a problem hiding this comment.
There are existing tests that assert parsing of performance: 100.5 samples_per_second (e.g. tests/integration/test_container_execution.py::test_run_container_success), but no coverage for the newly accepted formats (comma after value and /s suffix). Add test cases for representative new log lines (e.g. performance: 100.5, samples_per_second and performance: 100.5/s, samples_per_second) to prevent regressions in this regex.
|
|
||
| # Pattern 1: "performance: 12345 metric_name" (original expected format) | ||
| perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)\s+([a-zA-Z_][a-zA-Z0-9_]*)' | ||
| perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)[/a-zA-Z]*,?\s+([a-zA-Z_][a-zA-Z0-9_]*)' |
There was a problem hiding this comment.
The updated regex matches formats like performance: 123/s, metric, but it will not match when the comma comes immediately after the number (e.g. performance: 123,/s metric or performance: 123, /s metric) because ,? is only allowed after the optional [/a-zA-Z]* suffix and \s+ is required right after the comma. If the logs can contain a comma and /s after the value in either order (as described in the PR motivation), adjust the pattern to accept both orderings (or normalize the value token before matching by stripping , and /s variants).
| perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)[/a-zA-Z]*,?\s+([a-zA-Z_][a-zA-Z0-9_]*)' | |
| # Also accepts optional "/s"-style suffixes and commas in either order, | |
| # e.g. "performance: 123/s, metric", "performance: 123,/s metric", | |
| # and "performance: 123, /s metric". | |
| perf_pattern = r'performance:\s+([0-9][0-9.eE+-]*)(?:\s*(?:/[/a-zA-Z]*)?\s*,?|\s*,\s*(?:/[/a-zA-Z]*)?)\s+([a-zA-Z_][a-zA-Z0-9_]*)' |
Motivation
we have (',' and ‘/s’) after the perf numbers which breaks the regex pattern matching in the madengine framework.
fixes: https://amd-hub.atlassian.net/browse/ROCM-21596
Technical Details
Test Plan
Test Result
Submission Checklist