testrunner: don't mask a parent test that fails on its own#5698
Open
pietern wants to merge 1 commit into
A known-failure rule names a specific subtest, but a parent test also fails whenever one of its subtests fails. The runner allowed the parent to fail by matching rules bidirectionally: a rule whose pattern was a descendant of the failing test also matched.

That allowance was unconditional, so the top-level TestAccept failing on its own (before any subtest ran) matched any TestAccept/... rule and was silently swallowed.

Move parent-cascade handling out of rule matching and into failure analysis: matches() reverts to plain forward matching, and a failure is allowed when a rule matches or a subtest of it also failed. Go reports a parent's failure only after its subtests, so the failing subtest is already recorded when the parent is evaluated.

Co-authored-by: Isaac
denik reviewed Jun 24, 2026
if unexpectedFailures[key] {
	fmt.Printf("%s %s passed on retry\n", result.Package, result.Test)
	delete(unexpectedFailures, key)
}
We can do this, although it is a bit hard to see what's going on.
An alternative, possibly simpler, is to place all the version checks into a dedicated subtest, so that it's not just TestAccept that fails but TestAccept/min-versions.
^ We should probably do it regardless. It is much better to see that TestAccept/min-versions is having issues than to see TestAccept, which tells us nothing.
Integration test report
Commit: 549c080
28 interesting tests: 13 SKIP, 11 flaky, 3 RECOVERED, 1 FAIL
Summary
The integration test runner masks failures listed in known_failures.txt. A known-failure rule names a specific subtest (e.g. TestAccept/ssh/connection), but a parent test also fails whenever one of its subtests fails, so the runner additionally allowed the parent to fail. It did this by matching a rule against the parent bidirectionally: a rule whose pattern was a descendant of the failing test also matched.

That allowance was unconditional. When the top-level TestAccept failed on its own, before any subtest ran, its failure still matched any TestAccept/... rule and was silently swallowed. The recent RequireRuff setup check (#5662) does exactly this on the integration runners (ruff isn't installed there), so TestAccept has been failing on every environment while the nightly reports green.

This moves the parent-cascade handling out of rule matching and into failure analysis:

ConfigRule.matches reverts to plain forward matching: a rule matches a test or its subtree, nothing about parents.
checkFailures allows a failure when a rule matches or a subtest of it also failed. Go reports a parent's failure only after its subtests, so the failing subtest is already recorded when the parent is evaluated; no second pass is needed.

Net effect: a parent failing because of a (known or unknown) subtest is still not double-counted, but a test that fails on its own with no failing subtest now surfaces as unexpected.

This only makes the failure visible. Installing ruff on the integration runners is handled separately.
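The behaviour described above can be illustrated with a minimal sketch. This is not the code in this PR: the `rule`, `event`, and `findUnexpected` names are hypothetical stand-ins for ConfigRule and checkFailures, the test name TestAccept/bundle/deploy is invented for the example, and retries are left out. It assumes that events arrive in the order Go reports them, with subtests before their parent.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a hypothetical stand-in for a known-failure rule: a pattern that
// names a test and, implicitly, that test's subtree.
type rule struct{ pattern string }

// matches is plain forward matching: the rule covers the named test and any
// of its subtests, and never a parent of the named test.
func (r rule) matches(test string) bool {
	return test == r.pattern || strings.HasPrefix(test, r.pattern+"/")
}

// event is a simplified test result, in the order it was reported.
type event struct {
	test   string
	failed bool
}

// hasFailedSubtest reports whether any recorded failure lies below parent.
func hasFailedSubtest(failed map[string]bool, parent string) bool {
	for name := range failed {
		if strings.HasPrefix(name, parent+"/") {
			return true
		}
	}
	return false
}

// findUnexpected returns the failures that neither match a rule nor have a
// failing subtest. One pass is enough because a parent's failure is assumed
// to be reported after the failures of its subtests.
func findUnexpected(rules []rule, events []event) []string {
	failed := map[string]bool{}
	var unexpected []string
	for _, ev := range events {
		if !ev.failed {
			continue
		}
		failed[ev.test] = true
		allowed := hasFailedSubtest(failed, ev.test)
		for _, r := range rules {
			allowed = allowed || r.matches(ev.test)
		}
		if !allowed {
			unexpected = append(unexpected, ev.test)
		}
	}
	return unexpected
}

func main() {
	rules := []rule{{pattern: "TestAccept/ssh/connection"}}

	// Known subtest fails and cascades to its parents: nothing is unexpected.
	fmt.Println(findUnexpected(rules, []event{
		{"TestAccept/ssh/connection", true},
		{"TestAccept/ssh", true},
		{"TestAccept", true},
	})) // prints []

	// Parent fails on its own (the ruff case): surfaced.
	fmt.Println(findUnexpected(rules, []event{
		{"TestAccept", true},
	})) // prints [TestAccept]

	// Unlisted subtest fails: reported once, parents are not double-counted.
	fmt.Println(findUnexpected(rules, []event{
		{"TestAccept/bundle/deploy", true},
		{"TestAccept/bundle", true},
		{"TestAccept", true},
	})) // prints [TestAccept/bundle/deploy]
}
```

Keeping the parent allowance out of `matches` means a rule can only ever excuse the test it names and that test's subtree. Whether a parent is excused then depends on what actually failed in the run, not on which rules happen to exist.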
Tests
TestCheckFailures exercises checkFailures end-to-end: parent-with-failing-subtest (allowed), parent-alone (the ruff case, surfaced), unlisted failure (surfaced), and fail-then-pass-on-retry (allowed).

TestConfigRuleMatches reverse-match cases now assert that a rule does not match a parent of the listed test.

This pull request and its description were written by Isaac.