Testing changes for MCS Firstboot Pivot Failure Reporting v2 #6124
Testing changes for MCS Firstboot Pivot Failure Reporting v2 #6124CourtCourt521 wants to merge 1 commit into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Skipping CI for Draft Pull Request. |
WalkthroughThe PR adds Machine Config Server (MCS) URL annotation to Ignition node configurations: a new annotation constant and exported GetIgnitionHost compute the host from Infrastructure; server appenders now accept and propagate mcsURL; ClusterServer fetches Infrastructure to compute mcsURL; BootstrapServer and tests updated; RBAC extended to read infrastructures. ChangesMachine Config Server URL Annotation in Ignition Configs
🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 13 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (13 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: CourtCourt521 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/server/server_test.go`:
- Line 188: The tests currently call getNodeAnnotation and build a local
mcIgnCfg but never assert that MachineConfigServerURLAnnotationKey was written
or configure the test servers to emit that MCS URL; update the tests to (1)
configure the test server(s) used in these cases to return/emit the expected MCS
URL, (2) set the expected URL value and call
getNodeAnnotation(mp.Status.Configuration.Name, "", "<expected-url>", mc) as you
already do, and (3) add explicit assertions that the returned annotation map
contains MachineConfigServerURLAnnotationKey with the expected URL value (use
the same unique symbols: getNodeAnnotation, MachineConfigServerURLAnnotationKey,
and mcIgnCfg to locate where to change behavior and where to add the assertion).
Ensure both occurrences (around lines shown) are updated consistently.
In `@pkg/server/server.go`:
- Around line 322-347: GetIgnitionHost currently uses fmt.Sprintf("%s:%s",
internalURLParsed.Hostname(), securePortStr) which breaks for IPv6 and assumes
infraStatus and API IP slices are non-nil/non-empty; change the fallback to use
net.JoinHostPort(internalURLParsed.Hostname(), securePortStr) and add an initial
nil check for infraStatus to return an error if nil, then in each platform
branch (BareMetal/APIServerInternalIPs, OpenStack/APIServerInternalIPs,
Ovirt/APIServerInternalIPs, VSphere.APIServerInternalIPs) check that the slice
exists and len(...)>0 before indexing [0] to avoid panics, keeping the
SecurePort and ignitionHost assignment logic otherwise the same.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 0a4e5c0a-ff8b-478e-8e72-f71e85d2c927
📒 Files selected for processing (6)
pkg/daemon/constants/constants.gopkg/operator/sync.gopkg/server/bootstrap_server.gopkg/server/cluster_server.gopkg/server/server.gopkg/server/server_test.go
| t.Fatalf("unexpected error while appending file to ignition: %v", err) | ||
| } | ||
| anno, err := getNodeAnnotation(mp.Status.Configuration.Name, "", mc) | ||
| anno, err := getNodeAnnotation(mp.Status.Configuration.Name, "", "https://api-int.test.example.com:22623", mc) |
There was a problem hiding this comment.
These tests still don’t verify the new MCS URL annotation.
The hardcoded URL only goes into a local mcIgnCfg that never drives the final assertions, and the servers under test are not configured to emit that URL. These cases will keep passing even if MachineConfigServerURLAnnotationKey is never written.
Also applies to: 383-383
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/server/server_test.go` at line 188, The tests currently call
getNodeAnnotation and build a local mcIgnCfg but never assert that
MachineConfigServerURLAnnotationKey was written or configure the test servers to
emit that MCS URL; update the tests to (1) configure the test server(s) used in
these cases to return/emit the expected MCS URL, (2) set the expected URL value
and call getNodeAnnotation(mp.Status.Configuration.Name, "", "<expected-url>",
mc) as you already do, and (3) add explicit assertions that the returned
annotation map contains MachineConfigServerURLAnnotationKey with the expected
URL value (use the same unique symbols: getNodeAnnotation,
MachineConfigServerURLAnnotationKey, and mcIgnCfg to locate where to change
behavior and where to add the assertion). Ensure both occurrences (around lines
shown) are updated consistently.
| func GetIgnitionHost(infraStatus *configv1.InfrastructureStatus) (string, error) { | ||
| internalURL := infraStatus.APIServerInternalURL | ||
| internalURLParsed, err := url.Parse(internalURL) | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| securePortStr := strconv.Itoa(SecurePort) | ||
| ignitionHost := fmt.Sprintf("%s:%s", internalURLParsed.Hostname(), securePortStr) | ||
| if infraStatus.PlatformStatus != nil { | ||
| switch infraStatus.PlatformStatus.Type { | ||
| case configv1.BareMetalPlatformType: | ||
| ignitionHost = net.JoinHostPort(infraStatus.PlatformStatus.BareMetal.APIServerInternalIPs[0], securePortStr) | ||
| case configv1.OpenStackPlatformType: | ||
| ignitionHost = net.JoinHostPort(infraStatus.PlatformStatus.OpenStack.APIServerInternalIPs[0], securePortStr) | ||
| case configv1.OvirtPlatformType: | ||
| ignitionHost = net.JoinHostPort(infraStatus.PlatformStatus.Ovirt.APIServerInternalIPs[0], securePortStr) | ||
| case configv1.VSpherePlatformType: | ||
| if infraStatus.PlatformStatus.VSphere != nil { | ||
| if len(infraStatus.PlatformStatus.VSphere.APIServerInternalIPs) > 0 { | ||
| ignitionHost = net.JoinHostPort(infraStatus.PlatformStatus.VSphere.APIServerInternalIPs[0], securePortStr) | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| return ignitionHost, nil |
There was a problem hiding this comment.
Harden GetIgnitionHost (IPv6 host:port + panic safety)
- Replace the fallback
fmt.Sprintf("%s:%s", internalURLParsed.Hostname(), securePortStr)withnet.JoinHostPort(...)so IPv6 literals get valid brackets. - Add
infraStatus == nilhandling and checklen(APIServerInternalIPs) > 0before indexing[0]in the BareMetal/OpenStack/Ovirt/VSphere branches to avoid panics.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/server/server.go` around lines 322 - 347, GetIgnitionHost currently uses
fmt.Sprintf("%s:%s", internalURLParsed.Hostname(), securePortStr) which breaks
for IPv6 and assumes infraStatus and API IP slices are non-nil/non-empty; change
the fallback to use net.JoinHostPort(internalURLParsed.Hostname(),
securePortStr) and add an initial nil check for infraStatus to return an error
if nil, then in each platform branch (BareMetal/APIServerInternalIPs,
OpenStack/APIServerInternalIPs, Ovirt/APIServerInternalIPs,
VSphere.APIServerInternalIPs) check that the slice exists and len(...)>0 before
indexing [0] to avoid panics, keeping the SecurePort and ignitionHost assignment
logic otherwise the same.
7556763 to
bbcbf77
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@manifests/machineconfigserver/clusterrole.yaml`:
- Around line 28-30: The ClusterRole currently grants unnecessary verbs for the
config.openshift.io/infrastructures resource; remove "list" and "watch" and
retain only the "get" verb since pkg/server/cluster_server.go only calls
ConfigV1().Infrastructures().Get(...). Locate the rule that lists apiGroups:
["config.openshift.io"] and resources: ["infrastructures"] in
manifests/machineconfigserver/clusterrole.yaml and update the verbs array to
only include "get" so RBAC is tightened to the minimum required permission.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1bffdd77-4bc7-48c8-a0a0-697a22118952
📒 Files selected for processing (7)
manifests/machineconfigserver/clusterrole.yamlpkg/daemon/constants/constants.gopkg/operator/sync.gopkg/server/bootstrap_server.gopkg/server/cluster_server.gopkg/server/server.gopkg/server/server_test.go
🚧 Files skipped from review as they are similar to previous changes (6)
- pkg/daemon/constants/constants.go
- pkg/server/server_test.go
- pkg/server/bootstrap_server.go
- pkg/server/cluster_server.go
- pkg/operator/sync.go
- pkg/server/server.go
| - apiGroups: ["config.openshift.io"] | ||
| resources: ["infrastructures"] | ||
| verbs: ["get", "list", "watch"] |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check for Infrastructure informer/lister usage that requires list+watch
# Search for Infrastructure informer/lister initialization
rg -nP --type=go -C3 '\bInfrastructure(Informer|Lister)\b'
# Search for watch operations on Infrastructure
rg -nP --type=go -C3 'Infrastructures\(\)\.Watch'
# Search for list operations on Infrastructure
rg -nP --type=go -C3 'Infrastructures\(\)\.List'Repository: openshift/machine-config-operator
Length of output: 18993
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Find where infraInformer is actually used (excluding vendor)
rg -n --type=go --glob '!vendor/**' '\binfraInformer\b' pkg
# 2) Look for starting/starting informers via factory or .Informer().Run/.HasSynced (excluding vendor)
rg -n --type=go --glob '!vendor/**' '\.Informer\(\)\.Run|\bRun\(.*infra|Start.*infra|WaitForCacheSync|HasSynced|informerFactory' pkg
# 3) Check if there are any direct API calls for Infrastructures().List/Watch in pkg (excluding vendor)
rg -n --type=go --glob '!vendor/**' 'Infrastructures\(\)\.(List|Watch)\b' pkgRepository: openshift/machine-config-operator
Length of output: 23552
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Look for where the config informer factory is created/started (excluding vendor)
rg -n --type=go --glob '!vendor/**' '\bNewSharedInformerFactory\([^)]*configClient|\bStart\([^)]*\)' pkg/operator pkg/controller pkg/server pkg/daemon || true
# Specifically search for starting shared informer factories named/configure on configClient
rg -n --type=go --glob '!vendor/**' 'configinformers\.NewSharedInformerFactory|configinformers.*NewSharedInformerFactory' pkg/operator pkg/controller pkg/server pkg/daemon || true
# Find where infraInformer.Informer() comes from in operator.go to trace factory start
rg -n --type=go --glob '!vendor/**' 'infraInformer\s*[:=]' pkg/operator/operator.go || true
rg -n --type=go --glob '!vendor/**' 'Infrastructures\(\)\s*\.Informer\(\)' pkg/operator/operator.go || trueRepository: openshift/machine-config-operator
Length of output: 6087
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Trace config informer factory creation for infraInformer and whether it is started
rg -n --type=go --glob '!vendor/**' 'configinformers\.NewSharedInformerFactory\(' pkg/operator/operator.go pkg/server/cluster_server.go pkg/controller pkg/daemon pkg/osimagestream || true
# Find the sharedInformerFactory variable name used in pkg/server/cluster_server.go and see Start call scope
sed -n '1,220p' pkg/server/cluster_server.go | cat -n
# Show relevant section of pkg/operator/operator.go around where informer factories are started (look near Start())
rg -n --type=go --glob '!vendor/**' '\.Start\(stopCh\)|SharedInformerFactory\.Start' pkg/operator/operator.go
sed -n '430,560p' pkg/operator/operator.go | cat -nRepository: openshift/machine-config-operator
Length of output: 16197
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Search only in pkg/server for infra informer/lister usage and list/watch
rg -n --type=go --glob '!vendor/**' 'Infrastructures\(\)\.(Get|List|Watch)|infrastructure|infraInformer|infraLister' pkg/server || true
# Also quickly check for any shared informer factory involving config client in server
rg -n --type=go --glob '!vendor/**' 'configinformers|ConfigV1\(\)|ConfigClientOrDie\(' pkg/server || trueRepository: openshift/machine-config-operator
Length of output: 694
Tighten RBAC: drop list/watch for config.openshift.io/infrastructures
manifests/machineconfigserver/clusterrole.yaml should not grant list/watch: pkg/server/cluster_server.go only uses configClient.ConfigV1().Infrastructures().Get(ctx, "cluster", ...) to compute the MCS URL, and pkg/server contains no Infrastructures().List/Watch usage (or infra informer usage).
🔒 Tighten RBAC to minimum necessary verbs
- apiGroups: ["config.openshift.io"]
resources: ["infrastructures"]
- verbs: ["get", "list", "watch"]
+ verbs: ["get"]🧰 Tools
🪛 Checkov (3.2.530)
[medium] 1-30: Minimize wildcard use in Roles and ClusterRoles
(CKV_K8S_49)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@manifests/machineconfigserver/clusterrole.yaml` around lines 28 - 30, The
ClusterRole currently grants unnecessary verbs for the
config.openshift.io/infrastructures resource; remove "list" and "watch" and
retain only the "get" verb since pkg/server/cluster_server.go only calls
ConfigV1().Infrastructures().Get(...). Locate the rule that lists apiGroups:
["config.openshift.io"] and resources: ["infrastructures"] in
manifests/machineconfigserver/clusterrole.yaml and update the verbs array to
only include "get" so RBAC is tightened to the minimum required permission.
Testing not an actual pull request yet
Testing not an actual pull request
Testing not an actual pull request yet
Testing not an actual pull request yet
Summary by CodeRabbit
Enhancements
Chores
Tests