(WIP) OCPBUGS-86181#6107
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Skipping CI for Draft Pull Request. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR makes MachineConfig tests SNO-aware, adds a default-MCP test path when worker MCPs lack nodes, hardens machine-count checks against transient connection failures, adds a master MachineConfig fixture, and changes MOSC leader-election log checks to poll until observed. ChangesSNO and Image Mode Test Reliability
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 11❌ Failed checks (1 warning, 10 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: isabella-janssen The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/payload-aggregate periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-mco-single-node-disruptive-techpreview 5 |
|
@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/178b0a90-5b9c-11f1-8b57-67f8e83fcc97-0 |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended-priv/mco_ocb.go`:
- Around line 314-317: The callback passed to o.Eventually currently discards
the error from mOSBuilder.Logs(); change it to capture the error (logs, err :=
mOSBuilder.Logs()) and handle it instead of using the blank identifier: if err
!= nil, return a descriptive string containing the error (or otherwise surface
the error to the test/failer) so the Eventually assertion sees failures to fetch
logs; update the callback around mOSBuilder.Logs() used inside o.Eventually(...)
to propagate or report the error rather than ignoring it.
In `@test/extended/image_mode_status_reporting.go`:
- Around line 394-409: In validateMCPMachineCountTransitions, stop ignoring the
error from extpriv.IsSNOSafe(oc): capture the returned error, log or return it
(or set a safe default for loopCount) and decide fallback behavior when
IsSNOSafe fails; also ensure the SNO retry-budget bump (loopCount = 10) is
applied where intended—either move the IsSNOSafe check out of the layered-only
block or duplicate its handling for the non-layered path so SNO clusters get the
higher retry count in both cases if that’s the desired behavior.
In `@test/extended/machineconfignode.go`:
- Around line 302-315: The code currently skips asserting UpdateExecuted=True
unconditionally; restore a failing assertion for non-SNO/non-connection-error
paths: after calling waitForMCNConditionStatus (function
waitForMCNConditionStatus, vars conditionMet and err, checking isSNO and
errors.Is(err, syscall.ECONNREFUSED)), keep the special-case that logs and
tolerates the ECONNREFUSED for SNO, but in the else branch ensure you both
assert no error (o.Expect(err).NotTo(HaveOccurred())) and assert conditionMet is
true (o.Expect(conditionMet).To(o.BeTrue(), fmt.Sprintf("Error, could not detect
%s on node %s: %v", mcfgv1.MachineConfigNodeUpdateExecuted, updatingNodeName,
err))). This guarantees non-SNO and non-conn-error runs fail if UpdateExecuted
was not observed.
- Around line 208-216: The call to extpriv.IsSNOSafe(oc) currently discards its
error (isSNO, _ := extpriv.IsSNOSafe(oc)); change it to capture the error and
handle it explicitly (e.g., isSNO, err := extpriv.IsSNOSafe(oc); if err != nil {
/* fail the test or log and return */ }) so that SNO detection failures do not
silently fall back to non-SNO timing; update the surrounding logic that uses
isSNO, and ensure any failure uses the test framework's failure/logging
mechanism (e.g., t.Fatalf, framework.Failf or klog/Logf as appropriate) instead
of ignoring the error while still setting updatingWaitTime and
updatingWaitInterval.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 08da7797-b463-4018-8e76-ae6f432da12b
📒 Files selected for processing (4)
test/extended-priv/mco_ocb.gotest/extended-priv/testdata/files/machineconfigs/master-extension-mc.yamltest/extended/image_mode_status_reporting.gotest/extended/machineconfignode.go
de9e297 to
a1d9689
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended/machineconfignode.go`:
- Around line 292-301: The SNO branch current logic swallows non-ECONNREFUSED
errors (variables: isSNO, err, syscall.ECONNREFUSED) and only logs on
ECONNREFUSED or missing condition, causing inconsistent behavior with the
non-SNO path; update the isSNO handling in the function that waits for
AppliedFiles=True so that if err != nil and not errors.Is(err,
syscall.ECONNREFUSED) you call the same assertion used in the non-SNO path
(o.Expect(err).NotTo(o.HaveOccurred(), fmt.Sprintf(...))) so non-ECONNREFUSED
errors fail the test, while preserving the existing special-case log for
ECONNREFUSED and the conditionMet check (logger.Infof and
o.Expect(conditionMet).To(o.BeTrue())).
- Around line 310-319: The SNO branch in the UpdateExecuted check currently
swallows non-ECONNREFUSED errors; update the isSNO handling in the same block
that checks err and conditionMet so that if err is non-nil and not
syscall.ECONNREFUSED you log the error and/or fail the test similarly to the
non-SNO branch (use logger.Errorf or o.Expect to surface the error), and if
conditionMet is false ensure you log or fail (use logger.Infof + o.Expect or
o.Expect(conditionMet).To(o.BeTrue(...))) so behavior matches the non-SNO path;
reference variables/function names: isSNO, err, syscall.ECONNREFUSED,
conditionMet, logger, and o.Expect/BeTrue.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: b795a6e6-b331-41eb-898e-3110a36678bb
📒 Files selected for processing (3)
test/extended-priv/mco_ocb.gotest/extended/image_mode_status_reporting.gotest/extended/machineconfignode.go
🚧 Files skipped from review as they are similar to previous changes (2)
- test/extended-priv/mco_ocb.go
- test/extended/image_mode_status_reporting.go
e79ee19 to
acdf2e4
Compare
31e8864 to
b1c6eff
Compare
|
/payload-job periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-mco-single-node-disruptive-techpreview |
|
@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/4f0ef2c0-603c-11f1-97d4-770ff5b6a596-0 |
|
/payload-aggregate periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-mco-single-node-disruptive-techpreview 5 |
|
@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/48ff8260-6054-11f1-931e-e338040069af-0 |
25c491d to
3eaa05f
Compare
…ition tests resilient on SNO
3eaa05f to
bc777ae
Compare
|
/payload-aggregate periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-mco-single-node-disruptive-techpreview 5 |
|
@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/93f06240-60f3-11f1-9f6f-88cb747357c8-0 |
|
/payload-job periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-mco-single-node-disruptive-techpreview |
|
@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/2c4f6ed0-6105-11f1-820a-cab7dccb558e-0 |
|
/payload-job periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-mco-single-node-disruptive-techpreview |
|
@isabella-janssen: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/9f119130-6107-11f1-9c1a-f7f378e83c7c-0 |
Closes: OCPBUGS-86181
- What I did
This primarily updates the MCN property and condition transition tests run against the default master MCP instead of a custom MCP when running on a SNO cluster. Some timeouts were increased for the SNO case and the cleanup ordering of the image rebuild required test was updated to decrease the number of image builds.
- How to verify it
All MCN property and transition tests for the
ImageModeStatusReportingfeature gate should pass in the SNO test suite and continue passing in the default cluster suites.- Description for the changelog
OCPBUGS-86181: Update ImageModeStatusReporting MCN property and condition transition tests to be resilient on SNO topology
Summary by CodeRabbit
Tests
New Files