
Conversation

@JunAr7112
Contributor

Description

Fix flaky failures in the GPU Operator CI e2e tests

Checklist

  • No secrets, sensitive information, or unrelated changes
  • Lint checks passing (make lint)
  • Generated assets in-sync (make validate-generated-assets)
  • Go mod artifacts in-sync (make validate-modules)
  • Test cases are added for new code paths

Testing

Verified that the error no longer occurs after the pod is deleted:

Checking nvidia driver pod
NAME READY STATUS RESTARTS AGE
nvidia-gpu-driver-ubuntu22.04-7bc55f95b6-ltr4c 1/1 Running 0 2m36s
Checking nvidia driver pod readiness
nvidia driver pod is ready
checking nvidia-driver-daemonset labels
SUCCESS: Pod nvidia-gpu-driver-ubuntu22.04-7bc55f95b6-ltr4c has correct labels: cloudprovider=aws, platform=kubernetes
=== test_custom_labels_override PASSED ===
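
For reference, a minimal sketch of the kind of kubectl checks behind this output. The namespace, label selector, and expected label values are assumptions for illustration; this is not the actual e2e test script.

    # Assumed namespace and label selector, for illustration only.
    NS=gpu-operator
    POD=$(kubectl get pods -n "$NS" -l app=nvidia-gpu-driver-ubuntu22.04 \
      -o jsonpath='{.items[0].metadata.name}')

    # Wait for the recreated driver pod to report Ready before checking labels.
    kubectl wait --for=condition=Ready pod "$POD" -n "$NS" --timeout=300s

    # Verify the custom labels survived the pod deletion and recreation.
    kubectl get pod "$POD" -n "$NS" \
      -o jsonpath='{.metadata.labels.cloudprovider} {.metadata.labels.platform}'
    # expected output: aws kubernetes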

Signed-off-by: Arjun <agadiyar@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Feb 2, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.
