✨ Enable dual-stack binds for VM Operator webhook, metrics, health, profiler, and web-console validator#5
Open
hpannem wants to merge 37 commits into
A LocalizedMethodFault can wrap many different vSphere fault types, such as NoCompatibleHost and GenericDrsFault.
- Update LocalizedMessagesFromFault to recursively unwrap GenericDrsFault and NoCompatibleHost using reflection.
- Refactors the fault parser to use reflection, allowing it to drill down into VimFault.MethodFault.FaultMessage and recursive Error arrays. Includes new test coverage for anti-affinity GenericDrsFaults.
- Add comprehensive unit tests for deeply nested fault extraction, including anti-affinity policy errors.
Changed build mode for Go from 'autobuild' to 'manual' and updated build steps accordingly.
This patch removes the codeql action until its issues can be sorted.
A PVC can have other DataSourceRef types, such as snapshot.storage.k8s.io/VolumeSnapshot when it is created from a snapshot. The zonal constraint of such a PVC should still be considered until we put all PVCs in the placement ConfigSpec.
Writing to the fake provider vSphereClient may race with the controller reading it from the fake provider. Seen in https://github.com/vmware-tanzu/vm-operator/actions/runs/24476299421/job/71529316798?pr=1554
When either is unset, or they have different values, look up the VM's zone and assign the zone to the label and status. To avoid a lookup when both the label and status are set to the same value, we'll just trust what is there.
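The decision described above can be sketched as a small function. This is an illustration under assumptions, not the controller's code: `resolveZone` and the `lookup` callback are hypothetical names, and the real lookup would query the VM's placement.

```go
package main

// resolveZone returns the zone to write to both the label and the status,
// and reports whether the (potentially expensive) lookup was performed.
func resolveZone(labelZone, statusZone string, lookup func() (string, error)) (zone string, lookedUp bool, err error) {
	// Both set and in agreement: trust the existing value, skip the lookup.
	if labelZone != "" && labelZone == statusZone {
		return labelZone, false, nil
	}
	// Either value is unset, or the two disagree: ask the source of truth.
	zone, err = lookup()
	if err != nil {
		return "", true, err
	}
	return zone, true, nil
}
```

The caller would assign the returned zone to both places whenever `lookedUp` is true.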
Add the pieces that the old new-schema-version.py missed when a type had only just been added in the now-prior Hub version. While here, also add conversion for VirtualMachineClassInstance, since that wasn't done between v1a4 and v1a5. Since this CRD is stripped, only enable conversion when the feature is also enabled.
…ackfill-zone-status-from-label 🌱 Don't always backfill VM Status.Zone from label
…1a6-conversion 🐛 Add missing v1a6 version conversion bits
…roller-data-race 🌱 Fix data race in VMIC controller test
The updated version addresses the following CVEs: CVE-2026-32282, CVE-2026-32289, CVE-2026-33810, CVE-2026-27144, CVE-2026-27143, CVE-2026-32288, CVE-2026-32283, CVE-2026-27140.

Signed-off-by: Rafael Brito <rafael.brito@broadcom.com>
…/bump-go-1.26.2 🌱 Go 1.26.2
[Merging on behalf of @Shuting]

* ✨ Add validation for VLAN sub-interfaces capability

  This commit introduces capability checks for VLAN sub-interfaces in the VirtualMachine webhook to ensure the feature is only used when supported by the Supervisor.

* Fix unit test regression error
* Use latest WCP Capabilities Key
* Fix comments
* Explicitly set VMVlanSubinterface to false in unit test
…rect parameter Signed-off-by: Rafael Brito <rafael.brito@broadcom.com>
…/e2etest-tweak-script-and-readme 🐛 Small corrections on the newly introduced e2e test setup
…-just-vm-dsref 🐛 Only skip PVCs with VirtualMachine as the DSRef
Allow `kubernetes.io/hostname` as a valid topology key for VMAffinity RequiredDuringScheduling when the VMAffinityDuringExecution feature flag is enabled.

- Add feature flag check for hostname topology key validation
- Add comprehensive test coverage for all VMAffinity validation scenarios
- Maintain backward compatibility (zone-only when feature disabled)

Signed-off-by: Nabarun Pal <nabarun.pal@broadcom.com>
This PR exposes one argument for the kubectl download path and another for where to store the logs.
…ffinity

Previously, processVMAffinity() and processVMAntiAffinity() only translated zone-topology terms (topologyKey: topology.kubernetes.io/zone) into VmPlacementPolicies. Host-topology terms (topologyKey: kubernetes.io/hostname) were silently skipped with the expectation that ClusterModules would handle host-level anti-affinity.

This change:

- Refactors buildTagIDsFromZoneTopology into a generic buildTagIDsFromTopology(vmCtx, terms, topologyKey) with thin wrappers buildTagIDsFromZoneTopology and buildTagIDsFromHostTopology to avoid duplicating the tag extraction logic.
- Extends processVMAffinity to generate VmVmAffinity policies with Host topology for both required and preferred terms.
- Extends processVMAntiAffinity to generate VmVmAntiAffinity policies with Host topology for both required and preferred terms. Each topology/strictness combination produces a separate policy object.
- Enables HostRecommRequired for placement calls made through group placement.
- Gates all host-topology processing behind a feature gate.
- Adds tests for required/preferred host-level affinity and anti-affinity, mixed zone+host topology terms, and verifies that host-topology terms are silently skipped when VMAffinityDuringExecution is disabled while zone-topology terms continue to work.

Signed-off-by: Nabarun Pal <nabarun.pal@broadcom.com>
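The "generic builder plus thin wrappers" refactor might be shaped like the sketch below. The function names come from the commit message, but everything else is an assumption: the vmCtx parameter is omitted, the `term` type and its `TagIDs` field are hypothetical, and the real tag extraction logic is not shown in this PR text.

```go
package main

const (
	zoneTopologyKey = "topology.kubernetes.io/zone"
	hostTopologyKey = "kubernetes.io/hostname"
)

// term is a hypothetical stand-in for an affinity term whose tag IDs
// have already been resolved.
type term struct {
	TopologyKey string
	TagIDs      []string
}

// buildTagIDsFromTopology collects tag IDs from the terms that match the
// given topology key; terms with other keys are skipped.
func buildTagIDsFromTopology(terms []term, topologyKey string) []string {
	var ids []string
	for _, t := range terms {
		if t.TopologyKey == topologyKey {
			ids = append(ids, t.TagIDs...)
		}
	}
	return ids
}

// Thin wrappers keep call sites readable without duplicating the loop.
func buildTagIDsFromZoneTopology(terms []term) []string {
	return buildTagIDsFromTopology(terms, zoneTopologyKey)
}

// The host wrapper returns nothing when the feature gate is off, so
// host-topology terms are silently skipped in that case.
func buildTagIDsFromHostTopology(terms []term, hostTopologyEnabled bool) []string {
	if !hostTopologyEnabled {
		return nil
	}
	return buildTagIDsFromTopology(terms, hostTopologyKey)
}
```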
When the VM Class BootOptions was already populated, the reconciler cleared it and only applied VM spec fields. Seed csBootOptions from *configSpec.BootOptions when it is set so class defaults (e.g. bootDelay) remain unless the VM spec overrides them. Add tests for class-only bootDelay and VM spec override.
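A minimal sketch of the seeding behaviour described above, assuming a simplified, hypothetical `bootOptions` struct with pointer fields (the real ConfigSpec type comes from govmomi, and `seedBootOptions` is an illustrative name):

```go
package main

// bootOptions is a hypothetical subset of the boot options; nil means "unset".
type bootOptions struct {
	BootDelay      *int64
	EnterBIOSSetup *bool
}

// seedBootOptions starts from the class defaults (if any) and lets each
// field that the VM spec sets override the corresponding default.
func seedBootOptions(classOpts *bootOptions, specOpts bootOptions) bootOptions {
	var out bootOptions
	if classOpts != nil {
		out = *classOpts // seed from the class instead of starting empty
	}
	if specOpts.BootDelay != nil {
		out.BootDelay = specOpts.BootDelay
	}
	if specOpts.EnterBIOSSetup != nil {
		out.EnterBIOSSetup = specOpts.EnterBIOSSetup
	}
	return out
}
```

With a class bootDelay and an empty VM spec, the class value survives; once the spec sets bootDelay, the spec value wins.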
Update controller-runtime from v0.22.3 to v0.23.1 along with k8s.io/* dependencies from v0.34.1 to v0.35.4, including e2e tests. Ginkgo had to be upgraded to v2.27.2. Additionally, this patch updates the otel dependency to v1.41 to address CVE-2026-29181 and CVE-2026-39883.

This includes the following breaking/deprecation changes (from @aruneshpa):

- ctrl.NewWebhookManagedBy(mgr).For(&obj{}).Complete() changed to ctrl.NewWebhookManagedBy(mgr, &obj{}).Complete() (the object is now passed as a second argument instead of via .For() chaining).
- mgr.GetEventRecorderFor() is deprecated in favor of mgr.GetEventRecorder(), which returns the new k8s.io/client-go/tools/events.EventRecorder instead of the old k8s.io/client-go/tools/record.EventRecorder. Migrated pkg/record.Recorder to accept the new EventRecorder type and updated all call sites and test fakes accordingly.

An investigation into why some test-services failed after the upgrade pointed to the k8s v0.35.x client being ~100ms slower in cache sync. The `Eventually` default timeout had to be increased from "1s" to "2s" in a couple of tests.

Signed-off-by: Rafael Brito <rafael.brito@broadcom.com>
Signed-off-by: Rafael Brito <rafa@stormforge.io>
Signed-off-by: Rafael Brito <rafael.brito@broadcom.com>
…finity 🌱 Allow host-level topology key in VMAffinity with feature flag
…/process-host-aaf ✨ Generate VmPlacementPolicies for host-level AF and AAF during Placement
…/bump-controller-runtime-and-otel 🌱 Update controller runtime to 0.23.1 and otel 1.41
extraconfig validation
When the VMAffinityDuringExecution feature flag is enabled:

- Allow both zone and host topology keys for preferred scheduling
- Maintain backward compatibility when feature flag is disabled (zone only)
- Apply consistent host topology support across Required and Preferred scheduling

Test coverage added:

- VM Affinity PreferredDuringScheduling with host topology key acceptance
- VM Affinity PreferredDuringScheduling with unsupported topology key rejection
- VM Anti-Affinity PreferredDuringScheduling with unsupported topology key rejection

This enables users to specify kubernetes.io/hostname topology keys for VM affinity preferred scheduling when the VMAffinityDuringExecution capability is enabled.

Signed-off-by: Nabarun Pal <nabarun.pal@broadcom.com>
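The validation rule described in this commit message can be sketched as a single predicate. This is an illustration only: `isAllowedTopologyKey` is a hypothetical name, and the actual webhook reports field errors rather than returning a boolean.

```go
package main

const (
	zoneTopologyKey = "topology.kubernetes.io/zone"
	hostTopologyKey = "kubernetes.io/hostname"
)

// isAllowedTopologyKey reports whether a topology key is acceptable.
// hostTopologyEnabled mirrors the VMAffinityDuringExecution feature flag.
func isAllowedTopologyKey(key string, hostTopologyEnabled bool) bool {
	switch key {
	case zoneTopologyKey:
		return true // always valid, which preserves the zone-only behaviour
	case hostTopologyKey:
		return hostTopologyEnabled // valid only behind the feature flag
	default:
		return false // any other key is rejected
	}
}
```

Applying the same predicate to Required and Preferred terms is one way to keep the two code paths consistent.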
Add a new E2E test in the VM-Hardware suite that covers positive and negative cases for multi-writer, encrypted volumes with physical sharing mode controllers.
…ost-topology-af-preferred-during-scheduling 🐛 Add host topology support for VM Affinity PreferredDuringScheduling
…ervice changes (vmware-tanzu#1552) * Dualstack VirtualMachineService changes
What does this PR do, and why is it needed?
VM Operator’s controller manager and web-console validation server need to listen on addresses that work in IPv6-only and dual-stack Kubernetes clusters. This change adds a configurable webhook bind address (default empty; deployments set ::) and wires it into controller-runtime’s webhook server Host. It updates shipped kustomize patches so metrics, health, and profiler use [::]:port where appropriate, and extends the web-console validator with SERVER_BIND_ADDRESS / --server-bind-address so the HTTP server can bind on [::] for dual-stack. WCP’s metrics port patch is updated to [::]:9848 while keeping args[0] as the metrics flag so existing JSON6902 patches keep working.
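As a rough illustration of the bind-address handling, the sketch below builds a listen address from a host and port using only the Go standard library. The SERVER_BIND_ADDRESS name comes from this PR; `resolveBindHost`, `listenAddress`, and the flag-over-environment precedence are assumptions made for the example, not the validator's actual code.

```go
package main

import (
	"net"
	"strconv"
)

// bindAddressEnv is the environment variable named in the PR description.
const bindAddressEnv = "SERVER_BIND_ADDRESS"

// resolveBindHost picks the bind host. Assumption: an explicit flag value
// wins over the environment variable; an empty result means "all interfaces".
func resolveBindHost(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	return getenv(bindAddressEnv)
}

// listenAddress joins host and port; net.JoinHostPort adds the brackets
// that IPv6 literals such as "::" require.
func listenAddress(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}
```

In a real server the result would be passed to something like `net.Listen("tcp", addr)` with `os.Getenv` as the lookup function. On typical dual-stack hosts a `[::]` listener also accepts IPv4 connections, though that depends on the operating system's IPv6-only socket setting.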
Are there any special notes for your reviewer?
config/default/manager_auth_proxy_patch.yaml intentionally lists --metrics-addr=[::]:8443 first so config/wcp/vmoperator/manager_metrics_port_patch.yaml (which replaces args/0) still only retargets the metrics bind.
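To see why the argument order matters, the sketch below mimics a JSON6902 `replace` on a positional index. It is a toy model, not kustomize: `replaceArg` is a hypothetical helper, the second argument in the usage is a placeholder, and the WCP patch is assumed to write the same `--metrics-addr` flag with the `[::]:9848` address.

```go
package main

// replaceArg mimics a JSON6902 "replace" at a positional path such as
// /args/0: it overwrites whatever sits at index, regardless of content.
// It returns false when the index does not exist.
func replaceArg(args []string, index int, value string) ([]string, bool) {
	if index < 0 || index >= len(args) {
		return args, false
	}
	out := append([]string(nil), args...) // copy so the input is untouched
	out[index] = value
	return out, true
}
```

Because the replacement is positional, it only retargets the metrics bind as long as the metrics flag stays first. If another flag were listed first, the same patch would silently overwrite that flag instead.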
Please add a release note if necessary
The controller manager accepts --webhook-bind-address (use "::" for dual-stack). Default kustomize patches bind metrics, health, profiler, and the web-console validation server on IPv6 dual-stack-friendly addresses. The web-console validator honors SERVER_BIND_ADDRESS and --server-bind-address.