Migrate Rancher test cases to Robot #2568

Open
khushboo-rancher wants to merge 1 commit into harvester:main from khushboo-rancher:migrate_rancher_tests

@khushboo-rancher
Collaborator

Which issue(s) this PR fixes:

Issue #2488

  1. Covers all the Rancher test cases previously implemented in pytest.
  2. Adds a test for upgrading an RKE2 guest cluster, covering "[TEST] Upgrade from guest RKE2 version to latest" (#1294).
  3. Creates multiple guest clusters and performs operations on them, covering "[e2e] [BUG] Rancher attach volume to the wrong VM and cause workload stuck" (#1752).
  4. Switches guest cluster ingress from Nginx to Traefik.

Signed-off-by: Khushboo <fnu.khushboo@suse.com>
Copilot AI review requested due to automatic review settings April 10, 2026 09:21
@khushboo-rancher khushboo-rancher requested a review from a team April 10, 2026 09:21
Contributor

Copilot AI left a comment

Pull request overview

This PR migrates Rancher integration coverage from pytest into the Robot Framework test suite, adding end-to-end workflows for importing Harvester into Rancher, provisioning RKE2 guest clusters, deploying/validating workloads (including Traefik-based ingress), scaling, and performing an RKE2 upgrade scenario.

Changes:

  • Added a Robot test suite for Rancher↔Harvester integration (single/multi-node clusters, workloads, scale up/down, upgrade, cleanup).
  • Introduced a Rancher integration library layer (Base + CRD and REST implementations) and Robot keyword wrappers/resources to drive Rancher APIs.
  • Updated global defaults (e.g., storage class) and added Rancher/RKE2-related variables/constants used by the new suites.

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| harvester_robot_tests/tests/resilient/vm_node_failure.robot | Adds resilient/negative HA VM node-failure scenarios (reboot/power-off) in Robot. |
| harvester_robot_tests/tests/regression/test_rancher_integration.robot | New Rancher integration regression suite covering create/scale/upgrade/cleanup flows. |
| harvester_robot_tests/libs/rancher/rest.py | REST-based implementation of Rancher + Harvester API operations used by Robot keywords. |
| harvester_robot_tests/libs/rancher/crd.py | CRD/kubectl-based implementation of Rancher operations (alternative strategy to REST). |
| harvester_robot_tests/libs/rancher/rancher.py | Delegator selecting CRD vs REST implementation via env strategy. |
| harvester_robot_tests/libs/rancher/base.py | Defines the Rancher integration interface shared by CRD/REST implementations. |
| harvester_robot_tests/libs/rancher/__init__.py | Exposes the Rancher component package entrypoint. |
| harvester_robot_tests/libs/keywords/rancher_keywords.py | Python keyword wrapper used by Robot to call Rancher component operations. |
| harvester_robot_tests/libs/constant.py | Updates defaults and adds Rancher/RKE2-related constants (user data, timeouts, etc.). |
| harvester_robot_tests/keywords/variables.resource | Adds Rancher endpoint/credentials, RKE2 version, ingress controller, and IP pool variables. |
| harvester_robot_tests/keywords/rancher.resource | Large Robot resource providing reusable Rancher integration keywords and suite setup/teardown. |
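
For readers unfamiliar with the layering summarized in the file list, the following is a minimal sketch of a delegator that picks a CRD or REST implementation behind a shared interface. It is not the PR's code: the environment variable name `RANCHER_STRATEGY`, the class names, and the placeholder return values are all hypothetical, and only the idea of choosing an implementation from an environment strategy comes from the summary.

```python
import os
from abc import ABC, abstractmethod


class Base(ABC):
    """Interface shared by both implementations (method set is illustrative)."""

    @abstractmethod
    def get_rke2_version(self, target_version, rancher_endpoint=None):
        ...


class CRD(Base):
    def get_rke2_version(self, target_version, rancher_endpoint=None):
        return f"crd:{target_version}"  # placeholder, not a real lookup


class Rest(Base):
    def get_rke2_version(self, target_version, rancher_endpoint=None):
        return f"rest:{target_version}"  # placeholder, not a real lookup


class Rancher:
    """Delegator: selects one implementation and forwards calls to it."""

    _STRATEGIES = {"crd": CRD, "rest": Rest}

    def __init__(self, strategy=None):
        # RANCHER_STRATEGY is a hypothetical variable name used for this sketch.
        name = (strategy or os.environ.get("RANCHER_STRATEGY", "crd")).lower()
        if name not in self._STRATEGIES:
            raise ValueError(f"Unknown strategy: {name!r}")
        self._impl = self._STRATEGIES[name]()

    def get_rke2_version(self, target_version, rancher_endpoint=None):
        return self._impl.get_rke2_version(target_version, rancher_endpoint)
```

Keeping the delegator's method signatures identical to the interface is what lets callers stay unaware of which strategy is active.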

Comment on lines +256 to +260
cluster = self.wait_for_harvester_ready(cluster_name, timeout)
cluster_id = cluster.get("status", {}).get("clusterName", "")

# Get registration URL
manifest_url = self.get_cluster_registration_url(cluster_id)
Copilot AI Apr 10, 2026

import_harvester_to_rancher calls wait_for_harvester_ready() immediately after creating the Rancher cluster entry, but the Harvester cluster won't become "ready" until after the registration manifest URL has been set. This also calls get_cluster_registration_url(cluster_id) without the required rancher_endpoint (and optional timeout), which will raise a TypeError. Update the flow to: wait_for_cluster_id -> get_cluster_registration_url(cluster_id, rancher_endpoint, timeout) -> set_cluster_registration_url -> wait_for_harvester_ready.
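
The recommended ordering can be sketched as below. This is an illustration under assumptions, not the PR's implementation: the four method names come from the comment, but their exact signatures, the default `timeout`, and the `RecordingClient` stand-in are hypothetical.

```python
def import_harvester_to_rancher(client, cluster_name, rancher_endpoint, timeout=300):
    """Run the import steps in the order the review comment recommends.

    `client` is any object exposing the four methods named in the comment;
    the argument lists used here are assumptions.
    """
    # 1. Wait until Rancher has assigned an ID to the new cluster entry.
    cluster_id = client.wait_for_cluster_id(cluster_name, timeout)
    # 2. Fetch the registration manifest URL, passing the endpoint explicitly.
    manifest_url = client.get_cluster_registration_url(
        cluster_id, rancher_endpoint, timeout
    )
    # 3. Hand the URL to Harvester so it can register itself.
    client.set_cluster_registration_url(manifest_url)
    # 4. Only now can the Harvester cluster become ready in Rancher.
    return client.wait_for_harvester_ready(cluster_name, timeout)


class RecordingClient:
    """Stand-in that records call order; used only to demonstrate the flow."""

    def __init__(self):
        self.calls = []

    def wait_for_cluster_id(self, cluster_name, timeout):
        self.calls.append("wait_for_cluster_id")
        return "c-example"

    def get_cluster_registration_url(self, cluster_id, rancher_endpoint, timeout):
        self.calls.append("get_cluster_registration_url")
        return f"{rancher_endpoint}/v3/import/{cluster_id}.yaml"

    def set_cluster_registration_url(self, url):
        self.calls.append("set_cluster_registration_url")

    def wait_for_harvester_ready(self, cluster_name, timeout):
        self.calls.append("wait_for_harvester_ready")
        return {"name": cluster_name}
```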

Copilot uses AI. Check for mistakes.
Comment on lines +33 to +38
if self.rancher_token and self.rancher_session:
    return  # Already authenticated

if not rancher_endpoint:
    rancher_endpoint = self.rancher_endpoint

Copilot AI Apr 10, 2026

_authenticate_rancher() accepts a rancher_endpoint override but never persists it to self.rancher_endpoint. Callers like get_cluster_registration_url(..., rancher_endpoint=...) will authenticate against the override but subsequent _rancher_request() calls still build URLs from self.rancher_endpoint (possibly empty), leading to invalid requests. Consider setting self.rancher_endpoint = rancher_endpoint when an override is provided (and/or passing the endpoint through to _rancher_request).

Suggested change
- if self.rancher_token and self.rancher_session:
-     return  # Already authenticated
- if not rancher_endpoint:
-     rancher_endpoint = self.rancher_endpoint
+ if rancher_endpoint:
+     self.rancher_endpoint = rancher_endpoint
+ rancher_endpoint = self.rancher_endpoint
+ if self.rancher_token and self.rancher_session:
+     return  # Already authenticated

Comment on lines +86 to +90
try:
    if method.upper() == "GET":
        response = self.rancher_session.get(url)
    elif method.upper() == "POST":
        response = self.rancher_session.post(url, json=data)
Copilot AI Apr 10, 2026

_rancher_request() issues HTTP requests without any timeout. If Rancher becomes slow/unreachable, the test run can hang indefinitely. Use a bounded timeout (e.g., DEFAULT_TIMEOUT/DEFAULT_TIMEOUT_LONG or a dedicated constant) for all methods and make it configurable if needed.
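
A minimal sketch of a bounded request helper follows, assuming the session behaves like a `requests.Session` (whose `request()` method accepts `json=` and `timeout=`). The function name, the `DEFAULT_TIMEOUT` value, and the `FakeSession` stand-in are hypothetical; the `(code, data)` return shape mirrors how `_rancher_request` is called elsewhere in this review.

```python
DEFAULT_TIMEOUT = 30  # seconds; hypothetical value, tune per environment


def rancher_request(session, base_url, method, path, data=None,
                    timeout=DEFAULT_TIMEOUT):
    """Send one request with a bounded timeout and return (status_code, body)."""
    url = f"{base_url.rstrip('/')}/{path.lstrip('/')}"
    # One code path for every HTTP method, so none can miss the timeout.
    response = session.request(method.upper(), url, json=data, timeout=timeout)
    try:
        body = response.json()
    except ValueError:  # body was not valid JSON
        body = response.text
    return response.status_code, body


class FakeResponse:
    """Stand-in response, used only to demonstrate the helper."""
    status_code = 200
    text = "ok"

    def json(self):
        return {"ok": True}


class FakeSession:
    """Stand-in session that records what it was asked to send."""

    def request(self, method, url, **kwargs):
        self.seen = (method, url, kwargs)
        return FakeResponse()
```

Routing every method through a single `request()` call avoids repeating the timeout argument in each `GET`/`POST` branch.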

Comment on lines +299 to +313
# Update the value field with the JSON string
data["value"] = value

# Use the API's internal _put method directly
path = f"apis/{self.harvester_api.API_VERSION}/settings/cluster-registration-url"
resp = self.harvester_api._put(path, json=data)

if resp.status_code not in [200, 201]:
    try:
        error_data = resp.json()
    except Exception:
        error_data = resp.text
    raise Exception(
        f"Failed to set cluster-registration-url: "
        f"{resp.status_code}, {error_data}"
Copilot AI Apr 10, 2026

set_cluster_registration_url() calls self.harvester_api._put(...), but the Harvester API client created by create_harvester_api_client() exposes put(...) and settings.update(...)—there is no _put. This will raise AttributeError at runtime. Use self.harvester_api.settings.update("cluster-registration-url", value) (or self.harvester_api.put(...)) and validate the returned status code.

Suggested change
- # Update the value field with the JSON string
- data["value"] = value
- # Use the API's internal _put method directly
- path = f"apis/{self.harvester_api.API_VERSION}/settings/cluster-registration-url"
- resp = self.harvester_api._put(path, json=data)
- if resp.status_code not in [200, 201]:
-     try:
-         error_data = resp.json()
-     except Exception:
-         error_data = resp.text
-     raise Exception(
-         f"Failed to set cluster-registration-url: "
-         f"{resp.status_code}, {error_data}"
+ # Update the setting value using the public settings API
+ code, data = self.harvester_api.settings.update("cluster-registration-url", value)
+ if code not in [200, 201]:
+     raise Exception(
+         f"Failed to set cluster-registration-url: "
+         f"{code}, {data}"

Comment on lines +412 to +414
code, data = self._rancher_request("POST", "v3/cloudcredentials", payload)

if code not in [200, 201]:
Copilot AI Apr 10, 2026

Cloud credential endpoints use inconsistent casing: create uses v3/cloudcredentials, while get/delete use v3/cloudCredentials/.... Rancher API paths are typically case-sensitive, so this can cause 404s when reading/deleting a credential created by this client. Standardize the path casing across create/get/delete.
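
One way to keep the casing consistent is to build every cloud credential path from a single constant, as in this sketch. The helper name is hypothetical, and the lowercase form is only the casing the create call already uses; which casing Rancher actually expects should be confirmed against the API before adopting it.

```python
# Casing chosen here is an assumption taken from the create call.
CLOUD_CREDENTIALS_PATH = "v3/cloudcredentials"


def cloud_credentials_path(credential_id=None):
    """Return the collection path, or the item path when an ID is given."""
    if credential_id is None:
        return CLOUD_CREDENTIALS_PATH
    return f"{CLOUD_CREDENTIALS_PATH}/{credential_id}"
```

Create, get, and delete would then all call `cloud_credentials_path(...)` instead of spelling the path inline.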

Comment on lines +486 to +487
logging("Successfully set cluster-registration-url")

Copilot AI Apr 10, 2026

Duplicate log line: "Successfully set cluster-registration-url" is emitted twice, which adds noise to test logs and makes troubleshooting harder. Remove the extra logging call.

Suggested change
- logging("Successfully set cluster-registration-url")

    else:
        raise Exception(f"Failed to delete cloud credential: {stderr}")

def create_rke2_cluster(self, name, cloud_provider_config_id, hostname_prefix,
Copilot AI Apr 10, 2026

create_rke2_cluster(..., hostname_prefix, ...) includes a hostname_prefix parameter but it is never used when constructing the Cluster manifest. Either apply it to the spec (if Rancher/Harvester supports naming the machines/VMs with a prefix) or remove it from the API to avoid confusion.

Suggested change
- def create_rke2_cluster(self, name, cloud_provider_config_id, hostname_prefix,
+ def create_rke2_cluster(self, name, cloud_provider_config_id,

Comment on lines +70 to +75
def get_rke2_version(self, target_version):
    """
    Get RKE2 version from Rancher that matches target version.

    Args:
        target_version: Target version prefix (e.g. 'v1.28', 'v1.29')
Copilot AI Apr 10, 2026

The abstract interface for get_rke2_version doesn't accept rancher_endpoint, but both implementations and all callers in rancher_keywords.py/rancher.py require passing an endpoint. Align the Base method signature with actual usage (e.g., get_rke2_version(self, target_version, rancher_endpoint=None)) to keep the interface consistent and prevent future implementation drift.

Suggested change
- def get_rke2_version(self, target_version):
-     """
-     Get RKE2 version from Rancher that matches target version.
-     Args:
-         target_version: Target version prefix (e.g. 'v1.28', 'v1.29')
+ def get_rke2_version(self, target_version, rancher_endpoint=None):
+     """
+     Get RKE2 version from Rancher that matches target version.
+     Args:
+         target_version: Target version prefix (e.g. 'v1.28', 'v1.29')
+         rancher_endpoint: Rancher server endpoint

Comment on lines +49 to +51
${RANCHER_ENDPOINT}    %{RANCHER_ENDPOINT=https://10.115.1.128}
${RANCHER_USERNAME}    %{RANCHER_USERNAME=admin}
${RANCHER_PASSWORD}    %{RANCHER_PASSWORD=password1234}
Copilot AI Apr 10, 2026

RANCHER_ENDPOINT defaults to a concrete internal IP (https://10.115.1.128). This makes the suite environment-specific and can accidentally target the wrong Rancher instance if the variable isn't overridden. Prefer leaving it empty or using a placeholder domain so CI/users are forced to configure it explicitly.

Suggested change
- ${RANCHER_ENDPOINT}    %{RANCHER_ENDPOINT=https://10.115.1.128}
- ${RANCHER_USERNAME}    %{RANCHER_USERNAME=admin}
- ${RANCHER_PASSWORD}    %{RANCHER_PASSWORD=password1234}
+ ${RANCHER_ENDPOINT}    %{RANCHER_ENDPOINT=}
+ ${RANCHER_USERNAME}    %{RANCHER_USERNAME=admin}
+ ${RANCHER_PASSWORD}    %{RANCHER_PASSWORD=}
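
With empty defaults, a misconfigured run would otherwise fail later with a less obvious error. The sketch below shows one way a Python-side library could fail fast instead. The `require_env` helper and its error message are hypothetical and not part of the PR.

```python
import os


def require_env(name, environ=None):
    """Return a non-empty environment value or fail with a clear message."""
    environ = os.environ if environ is None else environ
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} must be set before running the Rancher suite")
    return value
```

For example, suite setup could call `require_env("RANCHER_ENDPOINT")` before any request is built.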

Comment on lines +125 to +130
DEFAULT_RKE2_USER_DATA = """#cloud-config
password: password
chpasswd:
  expire: false
ssh_pwauth: true
package_update: true
Copilot AI Apr 10, 2026

DEFAULT_RKE2_USER_DATA hard-codes a plaintext password and enables SSH password auth by default. Even for test suites, this is a risky default that can leak into shared environments. Consider sourcing credentials from environment/Robot variables (or defaulting to SSH key-only) and documenting the expected security posture.
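
A key-only default could look roughly like the sketch below, which renders the cloud-config from a caller-supplied public key rather than embedding a password. The function name is hypothetical, and the sketch covers only the lines visible in the excerpt; whatever else the real user data contains is not reproduced here.

```python
def build_rke2_user_data(ssh_public_key):
    """Render a key-only cloud-config; no password is embedded."""
    key = ssh_public_key.strip()
    if not key:
        raise ValueError("An SSH public key is required")
    return "\n".join([
        "#cloud-config",
        "ssh_pwauth: false",       # disable SSH password authentication
        "ssh_authorized_keys:",    # standard cloud-init key list
        f"  - {key}",
        "package_update: true",
        "",
    ])
```

The key itself could come from an environment or Robot variable, so nothing sensitive is committed to the repository.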

@albinsun
Contributor

Could you attach the robot test report for reference?
Thank you.
