Skip to content

Fix catastrophic regex backtracking in shell framework#4658

Open
thrix wants to merge 3 commits intoteemtee:mainfrom
thrix:fix-shell-regex-backtracking
Open

Fix catastrophic regex backtracking in shell framework#4658
thrix wants to merge 3 commits intoteemtee:mainfrom
thrix:fix-shell-regex-backtracking

Conversation

@thrix
Copy link
Contributor

@thrix thrix commented Mar 6, 2026

Summary

Fix _extract_failures() in tmt/frameworks/shell.py which causes tmt processes to hang for hours when test output contains very long lines.

  • Replace re.findall(r'.*\b(?:error|fail)\b.*', ...) with a line-by-line re.search() approach that avoids catastrophic regex backtracking
  • Add unit tests for _extract_failures() including a regression test with 1M-character lines

Fixes #4657

Context

Testing Farm jobs from the Hummingbird project were getting stuck in "running" state for hours. The test output files contained base64-encoded in-toto attestation payloads — single lines of 1M+ characters — causing the regex engine to backtrack indefinitely.

Detailed write-up: https://vadkerti.net/posts/debugging-stuck-tmt-processes-with-claude-code/

Test plan

  • Unit tests pass (hatch -e dev run -- python -m pytest tests/unit/test_shell.py -v)
  • Existing shell framework integration tests still pass

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The catastrophic regex backtracking is fixed by replacing re.findall with a line-by-line re.search. The added unit tests cover the fix and include a performance regression test. Two suggestions are provided to improve the new test code, focusing on type correctness and assertion specificity.

@thrix
Copy link
Contributor Author

thrix commented Mar 6, 2026

@gemini-code-assist please review again

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a catastrophic backtracking issue in _extract_failures by using a line-by-line search. The added unit tests cover the changes, including a performance regression test. I have provided two suggestions for minor improvements.

@thrix thrix added the ci | full test Pull request is ready for the full test execution label Mar 6, 2026
@thrix thrix requested a review from happz March 9, 2026 13:19
thrix added 3 commits March 9, 2026 14:19
Replace re.findall() with a line-by-line re.search() approach in
_extract_failures() to avoid catastrophic backtracking on long lines.

The old regex r'.*\b(?:error|fail)\b.*' caused tmt processes to hang
for hours when test output contained lines over 1M characters (e.g.
base64-encoded in-toto attestation payloads from container builds).

Fixes teemtee#4657

Assisted-by Claude Code
- Use Path("dummy.log") instead of MagicMock() for log_path parameter
- Make test assertions exact lists instead of length/membership checks

Assisted-by Claude Code
- Remove TestExtractFailures class, use plain test functions (happz)
- Move import tmt.utils to top of file (gemini-code-assist)
- Pre-compile regex as module-level FAILURE_PATTERN constant (gemini-code-assist)

Assisted-by Claude Code
@thrix thrix force-pushed the fix-shell-regex-backtracking branch from 6884a58 to 4063a35 Compare March 9, 2026 13:19
@thrix
Copy link
Contributor Author

thrix commented Mar 9, 2026

@gemini-code-assist please review again

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively resolves a critical performance issue related to catastrophic regex backtracking by implementing a more efficient line-by-line search strategy. The accompanying unit tests are thorough and include a valuable performance regression test to prevent this issue from recurring. My feedback includes one suggestion to enhance the maintainability of the new performance test.

@thrix thrix requested review from LecrisUT and psss March 9, 2026 15:00
@happz happz added this to planning Mar 11, 2026
@github-project-automation github-project-automation bot moved this to backlog in planning Mar 11, 2026
@happz happz moved this from backlog to review in planning Mar 11, 2026
@happz happz moved this from review to implement in planning Mar 11, 2026
Comment on lines +9 to +13
def _make_invocation(log_content: str) -> MagicMock:
"""Create a mock TestInvocation that returns given log content on read."""
mock = MagicMock()
mock.phase.step.plan.execute.read.return_value = log_content
return mock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use fixture, please make this a proper fixture

Comment on lines +45 to +52
def test_word_boundary() -> None:
"""Words like 'errorless' or 'failover' should not match."""
invocation = _make_invocation("errorless operation\nfailover complete\n")
assert _extract_failures(invocation, Path("dummy.log")) == []


def test_file_error() -> None:
invocation = MagicMock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent documentaiton of tests

Comment on lines +16 to +30
def test_no_failures() -> None:
invocation = _make_invocation("all good\nnothing wrong here\n")
assert _extract_failures(invocation, Path("dummy.log")) == []


def test_error_match() -> None:
invocation = _make_invocation("line1\nsome error occurred\nline3\n")
result = _extract_failures(invocation, Path("dummy.log"))
assert result == ["some error occurred"]


def test_fail_match() -> None:
invocation = _make_invocation("test passed\ntest fail here\ndone\n")
result = _extract_failures(invocation, Path("dummy.log"))
assert result == ["test fail here"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So much repetition. Please make these parametrized

Comment on lines +57 to +83
def test_long_lines_performance() -> None:
"""
Regression test: regex must not cause catastrophic backtracking
on very long lines.

The original implementation used re.findall(r'.*\\b(?:error|fail)\\b.*', ...)
which caused O(n^2) or worse backtracking on long lines without matches,
hanging tmt processes for hours on 1M+ character lines (e.g. base64-encoded
in-toto attestation payloads in container build logs).
"""
# Build a log with a 1M-character line (similar to base64 attestation data)
long_line = "A" * 1_000_000
log_content = f"start\n{long_line}\nsome error here\nend\n"

invocation = _make_invocation(log_content)

start = time.monotonic()
result = _extract_failures(invocation, Path("dummy.log"))
elapsed = time.monotonic() - start

# Must complete in under 5 seconds (the old regex took 5+ seconds
# on just 10k characters, and would never complete on 1M characters)
assert elapsed < 5.0, (
f"_extract_failures took {elapsed:.1f}s on a log with a 1M-char line; "
f"likely catastrophic regex backtracking"
)
assert result == ["some error here"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking test times in tests is very iffy to me. But in absence of benchmark infrastructure 🤷 . I am also fine with just ignoring this test since it fixed a rather edge-case situation

Comment on lines +73 to +75
start = time.monotonic()
result = _extract_failures(invocation, Path("dummy.log"))
elapsed = time.monotonic() - start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The choice of monotic here seems arbitrary?

Comment on lines +77 to +79
# Must complete in under 5 seconds (the old regex took 5+ seconds
# on just 10k characters, and would never complete on 1M characters)
assert elapsed < 5.0, (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 seconds is a very arbitrary cut-off. If the old one would have finished these in 5.001 seconds, and this change brings it down to 4.999, this test does not really tell anything. Similarly if we make changes to the code and it gradually increases the test time to the point that it takes 4.999 tests still pass.

If there is log-scale changes then confirm the intended change are at least an order of magnitude difference and account more realistically buffers of what time scale it is expected

Comment on lines +67 to +69
# Build a log with a 1M-character line (similar to base64 attestation data)
long_line = "A" * 1_000_000
log_content = f"start\n{long_line}\nsome error here\nend\n"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if the long line does not have a new-line that separates from the some error part, what if it is comprised of multiple words which would match the \b?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking

@psss psss added this to the 1.70 milestone Mar 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci | full test Pull request is ready for the full test execution

Projects

Status: implement

Development

Successfully merging this pull request may close these issues.

Catastrophic regex backtracking in shell framework _extract_failures on long lines

5 participants