feat(MTV-595): add warm migration test for VMs with XFS v4 filesystem#503
feat(MTV-595): add warm migration test for VMs with XFS v4 filesystem#503Chenli-Hu wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (6)
WalkthroughAdds XFS compatibility and custom pod network support to warm migration tooling and tests: plan creation accepts xfs_compatibility, network mapping can prioritize a custom pod network, tests add an XFS scenario and pass the custom network, and post-migration checks validate XFS metadata on migrated Linux VMs. ChangesWarm Migration XFS Support
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested labels: Suggested reviewers:
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Report bugs in Issues Welcome! 🎉This pull request will be automatically processed with the following features: 🔄 Automatic Actions
📋 Available CommandsPR Status Management
Review & Approval
Testing & Validation
Container Operations
Cherry-pick Operations
Label Management
✅ Merge RequirementsThis PR will be automatically approved when the following conditions are met:
📊 Review ProcessApprovers and ReviewersApprovers:
Reviewers:
Available Labels
AI Features
💡 Tips
For more information, please refer to the project documentation or contact the maintainers. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@utilities/utils.py`:
- Around line 305-315: The code currently mutates the input list via
networks.remove(...) and networks.insert(...), which can leak order across
callers; instead construct and return a new list so the original networks
argument is not modified: after finding pod_net (the result of next(...)), build
a new list with pod_net first followed by all other entries filtered out (use
networks and custom_pod_network to filter), and return that new list; keep the
existing ValueError behavior when pod_net is None and reference the same symbols
(pod_net, networks, custom_pod_network) so the change is localized.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 02b43eca-2a31-467e-b8e6-358366e4a560
📒 Files selected for processing (4)
tests/tests_config/config.pytests/warm/test_warm_migration_comprehensive.pyutilities/mtv_migration.pyutilities/utils.py
|
Hi @krcmarik , please help to review, thank you! 1)Migrate cloned xfs VM only 2)Migrate both cloned mtv-win2022-ip-3disks and cloned mtv-feature-rhel7-xfs tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_create_storagemap[comprehensive-warm-xfs] PASSED tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_create_networkmap[comprehensive-warm-xfs] PASSED tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_create_plan[comprehensive-warm-xfs] PASSED tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_migrate_vms[comprehensive-warm-xfs] PASSED tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_check_vms[comprehensive-warm-xfs] PASSED |
|
/verified |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/warm/test_warm_migration_comprehensive.py`:
- Line 193: The test currently enables preserve_static_ips by default via
prepared_plan.get("preserve_static_ips", True), which implicitly turns the
feature on for plans that never declared it; change this to read the flag
without a default so missing keys are treated as absent (e.g., use
prepared_plan["preserve_static_ips"] or otherwise pass the explicit value from
the test fixture) to avoid silently enabling static-IP validation paths—update
the call site that sets preserve_static_ips to stop using .get(..., True) and
instead require an explicit value from prepared_plan.
In `@utilities/mtv_migration.py`:
- Around line 146-147: You've added a new positional parameter xfs_compatibility
before target_power_state which breaks existing positional callers; change the
function signature to keep existing positional binding by either moving
xfs_compatibility after target_power_state or make it keyword-only (e.g., put
"*, xfs_compatibility: bool = False" after target_power_state) so old callers
still bind target_power_state correctly, and then update any call sites/tests
that intentionally pass xfs_compatibility positionally to use the keyword form.
In `@utilities/post_migration.py`:
- Around line 1257-1276: The code resolves destination_vm_name but still uses
vm_name when creating the SSH connection and logging; update usages to target
the migrated VM by replacing vm_name with destination_vm_name in
vm_ssh_connections.create and in the LOGGER.info message (leave
get_ssh_credentials_from_provider_config(source_provider_data, source_vm_data)
and the win_os check as-is so credentials and OS checks still come from
source_vm_data). Ensure vm_ssh_connections.create(...) is called with
vm_name=destination_vm_name and corresponding log text reflects the destination
VM.
In `@utilities/utils.py`:
- Around line 258-259: The current call to _reorder_networks_by_pod_network can
raise a ValueError when custom_pod_network is missing even though pod_only=True
makes reordering unnecessary; update the condition around the call so it only
runs when custom_pod_network is truthy AND pod_only is False (i.e., guard the
call with "if custom_pod_network and not pod_only") to skip reordering in
pod_only mode; reference symbols: custom_pod_network, pod_only,
_reorder_networks_by_pod_network, networks.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 4da22c4c-e415-4cb9-abe3-75542d1da274
📒 Files selected for processing (5)
tests/tests_config/config.pytests/warm/test_warm_migration_comprehensive.pyutilities/mtv_migration.pyutilities/post_migration.pyutilities/utils.py
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@utilities/post_migration.py`:
- Around line 1475-1488: The XFS check calls check_vm_command_output whenever
plan.get("xfs_compatibility") is true, causing failures if the destination VM is
powered off; update the condition to also verify the destination VM is powered
on (e.g., vm.get("power_state") == "on" or an existing vm_powered_on flag) and
only run check_vm_command_output when vm_ssh_connections and
plan.get("xfs_compatibility") are true AND the VM is powered on, keeping the
existing try/except and res[vm_name].append(...) behavior for error reporting.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9e1102a9-cf8a-4744-9f84-80712626103e
📒 Files selected for processing (6)
tests/tests_config/config.pytests/warm/test_warm_migration_comprehensive.pyutilities/mtv_migration.pyutilities/post_migration.pyutilities/utils.pyutilities/vmware_guest_operations.py
Code Review by Qodo
1. source_vms_data defaulted to {}
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@utilities/post_migration.py`:
- Around line 1321-1333: The code is silently defaulting source_vm_data to {}
and assuming ssh_conn.rrmngmnt_host exists which leads to opaque failures;
update the logic around source_vm_data,
get_ssh_credentials_from_provider_config, and ssh_conn usage to fail fast:
explicitly check if plan.get("source_vms_data") is None or if source_vm_data is
missing for vm_name and raise a clear ValueError naming vm_name and plan key,
validate that get_ssh_credentials_from_provider_config returned non-None
ssh_username/ssh_password and raise a descriptive error if missing, and before
using ssh_conn.rrmngmnt_host validate ssh_conn is not None and that
ssh_conn.rrmngmnt_host is present (raise a clear exception referencing
destination_vm_name and ssh_conn) so callers get actionable setup errors instead
of opaque attribute errors.
In `@utilities/vmware_guest_operations.py`:
- Around line 222-224: The awk program is written with double braces so it’s
passed literally and returns empty values; update the shell command strings that
set dev and method (the lines constructing "dev=$(nmcli -f GENERAL.DEVICES ... |
awk -F: '{{print $2}}' ..." and "method=$(nmcli -f ipv4.method ... | awk -F:
'{{print $2}}' ...)") to use single-brace awk syntax ({print $2}) so dev and
method are extracted correctly; ensure both occurrences are fixed and keep the
surrounding nmcli/test/xargs pipeline intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 193fb40a-71bf-4373-b22a-67de42ab42f0
📒 Files selected for processing (6)
tests/tests_config/config.pytests/warm/test_warm_migration_comprehensive.pyutilities/mtv_migration.pyutilities/post_migration.pyutilities/utils.pyutilities/vmware_guest_operations.py
| virtual_machines_list=prepared_plan["virtual_machines"], | ||
| target_power_state=prepared_plan["target_power_state"], | ||
| target_power_state=prepared_plan.get("target_power_state"), | ||
| target_namespace=target_namespace, | ||
| warm_migration=prepared_plan["warm_migration"], | ||
| preserve_static_ips=prepared_plan["preserve_static_ips"], | ||
| vm_target_namespace=prepared_plan["vm_target_namespace"], | ||
| pvc_name_template=prepared_plan["pvc_name_template"], | ||
| pvc_name_template_use_generate_name=prepared_plan["pvc_name_template_use_generate_name"], | ||
| target_labels=target_vm_labels["vm_labels"], | ||
| target_affinity=prepared_plan["target_affinity"], | ||
| preserve_static_ips=prepared_plan.get("preserve_static_ips", False), | ||
| vm_target_namespace=prepared_plan.get("vm_target_namespace"), | ||
| pvc_name_template=prepared_plan.get("pvc_name_template"), | ||
| pvc_name_template_use_generate_name=prepared_plan.get("pvc_name_template_use_generate_name"), | ||
| target_labels=target_vm_labels.get("vm_labels") if "target_labels" in prepared_plan else None, | ||
| target_affinity=prepared_plan.get("target_affinity"), | ||
| xfs_compatibility=prepared_plan.get("xfs_compatibility", False), | ||
| ) |
There was a problem hiding this comment.
1. Defaults used in prepared_plan.get() 📘 Rule violation ≡ Correctness
The test now reads project-controlled tests_params values via .get() with implicit defaults/None, which can silently mask missing or misspelled required configuration keys and change migration behavior without failing fast.
Agent Prompt
## Issue description
`prepared_plan` originates from `py_config["tests_params"][...]` (configuration we control). The test uses `.get()` (and `.get(..., default)`) for keys that should be explicitly present, which can hide misconfiguration and let invalid config propagate as `None`/defaults.
## Issue Context
Rule requires required project-controlled config keys to be accessed directly (e.g., `prepared_plan["target_power_state"]`) and forbids silently defaulting missing keys (except for optional feature flags).
## Fix Focus Areas
- tests/warm/test_warm_migration_comprehensive.py[189-200]
- tests/tests_config/config.py[524-575]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
|
Code review by qodo was updated up to the latest commit 63804a1 |
| # Get source VM data from plan | ||
| source_vm_data = plan.get("source_vms_data", {}).get(vm_name, {}) | ||
|
|
||
| # Check if Windows VM | ||
| if source_vm_data.get("win_os", False): | ||
| raise ValueError( | ||
| f"Command output verification is only supported on Linux guests. VM '{vm_name}' is a Windows guest." | ||
| ) | ||
|
|
||
| LOGGER.info(f"Running command on VM {destination_vm_name}: {' '.join(command)}") | ||
|
|
||
| # Get SSH credentials | ||
| ssh_username, ssh_password = get_ssh_credentials_from_provider_config(source_provider_data, source_vm_data) | ||
|
|
There was a problem hiding this comment.
1. source_vms_data defaulted to {} 📘 Rule violation ☼ Reliability
check_vm_command_output() silently defaults missing plan["source_vms_data"][vm_name] to {},
even though it uses that data to infer Windows-vs-Linux and select SSH credentials, which can
mis-detect guest OS (e.g., treat Windows as Linux) and fail later for misleading reasons. This
violates the intended fail-fast behavior and can cause confusing downstream post-migration
validation failures.
Agent Prompt
## Issue description
`check_vm_command_output()` currently uses `plan.get("source_vms_data", {}).get(vm_name, {})` to determine whether the guest is Windows and to choose Linux vs Windows SSH credentials, which masks missing `source_vms_data`/missing VM entries and can cause incorrect OS detection and misleading downstream failures during post-migration validation. Update the logic to fail fast with a clear exception when required VM metadata is missing, and/or stop using `source_vms_data` as the general source of OS/credential metadata.
## Issue Context
- This code path is used for post-migration validation and should comply with the fail-fast requirement (reject missing/invalid required data early with a clear, specific exception).
- `get_ssh_credentials_from_provider_config()` expects a reliable `source_vm_info["win_os"]` flag.
- `source_vms_data` is not guaranteed to exist for all scenarios/providers and is treated as required only for specific features (e.g., static-IP preservation), so it is not a reliable general source of OS metadata.
- `check_vms()` already has authoritative `source_vm` data available from `source_provider.vm_dict(...)`; prefer passing that into `check_vm_command_output()` (e.g., accept `source_vm_info: dict[str, Any]`) and use it for the Windows guard and credential selection. Optionally keep `source_vms_data` only for data that truly lives there, but do not use it for OS detection.
## Fix Focus Areas
- utilities/post_migration.py[1294-1361]
- utilities/post_migration.py[1320-1333]
- utilities/post_migration.py[1475-1494]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| with ssh_conn: | ||
| # Execute command using executor with the correct user | ||
| executor = ssh_conn.rrmngmnt_host.executor(user=ssh_conn.rrmngmnt_user) # type: ignore[union-attr] | ||
| executor.port = ssh_conn.local_port | ||
|
|
||
| rc, stdout, stderr = executor.run_cmd(command) | ||
|
|
||
| if rc != 0: | ||
| raise GuestCommandError( | ||
| f"Command {' '.join(command)} failed on VM {destination_vm_name} (rc={rc}): {stderr}" | ||
| ) |
There was a problem hiding this comment.
2. No retries for ssh cmd 🐞 Bug ☼ Reliability
check_vm_command_output() runs the SSH command exactly once with no retry/backoff, unlike existing SSH checks in this file, so transient guest readiness/SSH instability post-migration can cause intermittent false test failures.
Agent Prompt
### Issue description
`check_vm_command_output()` executes `executor.run_cmd(command)` once and fails immediately on non-zero rc or missing expected output. Post-migration guests commonly have short windows of instability (services still starting, udev settling, etc.), and this file already uses `TimeoutSampler` retry patterns for SSH connectivity.
### Issue Context
- `check_ssh_connectivity()` uses `TimeoutSampler` to avoid flakiness.
- `check_vm_command_output()` creates a new SSH connection and immediately executes the command without any retry loop.
### Fix Focus Areas
- utilities/post_migration.py[84-140]
- utilities/post_migration.py[1294-1361]
### Suggested fix
- Wrap the command execution + expected-output assertion in a `TimeoutSampler` retry loop (similar to `check_ssh_connectivity()`), e.g.:
- retry for `wait_timeout=120` with `sleep=10`
- retry on SSH exceptions and on `rc != 0`
- (optionally) retry if `expected_output` not found, since filesystems/services may not be ready immediately
- Keep the final error message from the last attempt for debuggability.
- Alternatively, reuse an already-validated SSH session/connection if available to reduce repeated port-forward setup.
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
Signed-off-by: Chenli Hu <chhu@redhat.com>
|
Code review by qodo was updated up to the latest commit 2ddfb0b |
|
/verified |
|
@Chenli-Hu pleasee checkout Qodo comments |

Summary by CodeRabbit
New Features
Bug Fixes
Tests