feat(MTV-595): add warm migration test for VMs with XFS v4 filesystem by Chenli-Hu · Pull Request #503 · RedHatQE/mtv-api-tests

Chenli-Hu · 2026-05-19T00:43:59Z

Summary by CodeRabbit

New Features
- XFS filesystem compatibility for warm migrations, including optional post-migration verification on migrated VMs.
- Ability to specify a custom pod network in a migration plan to prioritize its mapping during migration.
Bug Fixes
- Improved guest-network IP detection compatibility with older nmcli versions.
Tests
- Added an XFS warm-migration test case and split parametrization for Windows vs. XFS; test config updated.

coderabbitai · 2026-05-19T00:44:09Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 24860194-4681-41ea-bbc3-957e6c9feb9b

📥 Commits

Reviewing files that changed from the base of the PR and between 63804a1 and 2ddfb0b.

📒 Files selected for processing (6)

tests/tests_config/config.py
tests/warm/test_warm_migration_comprehensive.py
utilities/mtv_migration.py
utilities/post_migration.py
utilities/utils.py
utilities/vmware_guest_operations.py

Walkthrough

Adds XFS compatibility and custom pod network support to warm migration tooling and tests: plan creation accepts xfs_compatibility, network mapping can prioritize a custom pod network, tests add an XFS scenario and pass the custom network, and post-migration checks validate XFS metadata on migrated Linux VMs.

Changes

Warm Migration XFS Support

Layer / File(s)	Summary
Network reordering for custom pod network `utilities/utils.py`	`gen_network_map_list` accepts `custom_pod_network` and, when set and `pod_only` is false, reorders VM networks so the named network is mapped to the pod destination first. Adds `_reorder_networks_by_pod_network` that finds the named network or raises `ValueError` listing available names.
XFS and custom network parameter wiring `utilities/mtv_migration.py`	`create_plan_resource` gains `xfs_compatibility: bool = False` and conditionally injects it into the Plan CR. `get_network_migration_map` accepts `custom_pod_network` and forwards it to `gen_network_map_list`.
XFS test configuration and parametrization `tests/tests_config/config.py`, `tests/warm/test_warm_migration_comprehensive.py`	Add `"test_warm_migration_xfs"` to `tests_params` enabling XFS and `custom_pod_network`. Split comprehensive warm migration parametrization into distinct Windows and XFS cases and add XFS to the test docstring feature list.
XFS test implementation `tests/warm/test_warm_migration_comprehensive.py`	`test_create_networkmap` reads optional `custom_pod_network` from `prepared_plan` and passes it to `get_network_migration_map`. `test_create_plan` uses `.get(...)` with defaults for optional plan fields (including `preserve_static_ips`, `xfs_compatibility`) and passes `target_labels` only when present.
Post-migration XFS validation `utilities/post_migration.py`	Add `check_vm_command_output()` to run SSH commands and assert expected stdout. When `plan["xfs_compatibility"]` is enabled and SSH connections exist, run `xfs_info /sdb1` and expect `crc=0`, recording per-VM failures.
nmcli compatibility tweak `utilities/vmware_guest_operations.py`	Embedded guest `nmcli` script now uses `nmcli -f` (instead of `-g`) to extract fields for compatibility with older nmcli versions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels: verified, lgtm-coderabbitai[bot]

Suggested reviewers:

myakove
krcmarik
solenoci

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately describes the main change: adding a warm migration test for VMs with XFS v4 filesystem, which is the primary focus across all modified files.
Docstring Coverage	✅ Passed	Docstring coverage is 81.82% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

redhat-qe-bot · 2026-05-19T00:44:40Z

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
Issue Creation: Disabled for this repository
Branch Labeling: Branch-specific labels are applied to track the target branch
Auto-verification: Auto-verified users have their PRs automatically marked as verified
Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

/wip - Mark PR as work in progress (adds WIP: prefix to title)
/wip cancel - Remove work in progress status
/hold - Block PR merging (approvers only)
/hold cancel - Unblock PR merging
/verified - Mark PR as verified
/verified cancel - Remove verification status
/reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
/regenerate-welcome - Regenerate this welcome message

Review & Approval

/lgtm - Approve changes (looks good to me)
/approve - Approve PR (approvers only)
/automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
/assign-reviewers - Assign reviewers based on OWNERS file
/assign-reviewer @username - Assign specific reviewer
/check-can-merge - Check if PR meets merge requirements

Testing & Validation

/retest tox - Run Python test suite with tox
/retest build-container - Rebuild and test container image
/retest conventional-title - Validate commit message format
/retest all - Run all available tests

Container Operations

/build-and-push-container - Build and push container image (tagged with PR number)
- Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

/cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
- Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

/<label-name> - Add a label to the PR
/<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

Approval: /approve from at least one approver
Status Checks: All required status checks must pass
No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

myakove
solenoci

Reviewers:

krcmarik
myakove
solenoci

Available Labels

hold
verified
wip
lgtm
approve
automerge

AI Features

Conventional Title: Mode: fix (claude/claude-opus-4-6[1m])
Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6[1m])
Test Oracle: Triggers: approved (cursor/gpt-5.4-xhigh-fast); /test-oracle can be used anytime

💡 Tips

WIP Status: Use /wip when your PR is not ready for review
Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
Cherry-picking: Cherry-pick labels are processed when the PR is merged
Container Builds: Container images are automatically tagged with the PR number
Permission Levels: Some commands require approver permissions
Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@utilities/utils.py`:
- Around line 305-315: The code currently mutates the input list via
networks.remove(...) and networks.insert(...), which can leak order across
callers; instead construct and return a new list so the original networks
argument is not modified: after finding pod_net (the result of next(...)), build
a new list with pod_net first followed by all other entries filtered out (use
networks and custom_pod_network to filter), and return that new list; keep the
existing ValueError behavior when pod_net is None and reference the same symbols
(pod_net, networks, custom_pod_network) so the change is localized.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 02b43eca-2a31-467e-b8e6-358366e4a560

📥 Commits

Reviewing files that changed from the base of the PR and between f82917b and 8ed18dc.

📒 Files selected for processing (4)

tests/tests_config/config.py
tests/warm/test_warm_migration_comprehensive.py
utilities/mtv_migration.py
utilities/utils.py

Chenli-Hu · 2026-05-19T01:20:03Z

Hi @krcmarik , please help to review, thank you!
I tested in local, both 1) and 2) are passed on MTV 2.11.6.
There is a bug: MTV-5419 on 2.12.0, without that fix the warm migration will failed on MTV 2.12.0. => Tested on latest 2.12.0, the bug is fixed, both 1) and 2) passed on 2.12.0-40.

1)Migrate cloned xfs VM only
#uv run pytest -vv -m "tier0" -k "comprehensive-warm-xfs" --tc=source_provider:"vcenter" --tc=storage_class:"nfs-csi" ....

2)Migrate both cloned mtv-win2022-ip-3disks and cloned mtv-feature-rhel7-xfs
#uv run pytest -vv -m "tier0" -k "comprehensive-warm" --tc=source_provider:"vcenter" --tc=storage_class:"nfs-csi"....

tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_create_storagemap[comprehensive-warm-xfs] PASSED
_ 1 of 5 completed, 1 Pass, 0 Fail, 0 Skip, 0 XPass, 0 XFail, 0 Error, 0 ReRun _

tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_create_networkmap[comprehensive-warm-xfs] PASSED
_ 2 of 5 completed, 2 Pass, 0 Fail, 0 Skip, 0 XPass, 0 XFail, 0 Error, 0 ReRun _

tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_create_plan[comprehensive-warm-xfs] PASSED
_ 3 of 5 completed, 3 Pass, 0 Fail, 0 Skip, 0 XPass, 0 XFail, 0 Error, 0 ReRun _

tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_migrate_vms[comprehensive-warm-xfs] PASSED
_ 4 of 5 completed, 4 Pass, 0 Fail, 0 Skip, 0 XPass, 0 XFail, 0 Error, 0 ReRun _

tests/warm/test_warm_migration_comprehensive.py::TestWarmMigrationComprehensive::test_check_vms[comprehensive-warm-xfs] PASSED
_ 5 of 5 completed, 5 Pass, 0 Fail, 0 Skip, 0 XPass, 0 XFail, 0 Error, 0 ReRun _
======== 5 passed, 211 deselected, 3021 warnings in 1589.90s (0:26:29) =========

Chenli-Hu · 2026-05-19T01:27:12Z

/verified

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/warm/test_warm_migration_comprehensive.py`:
- Line 193: The test currently enables preserve_static_ips by default via
prepared_plan.get("preserve_static_ips", True), which implicitly turns the
feature on for plans that never declared it; change this to read the flag
without a default so missing keys are treated as absent (e.g., use
prepared_plan["preserve_static_ips"] or otherwise pass the explicit value from
the test fixture) to avoid silently enabling static-IP validation paths—update
the call site that sets preserve_static_ips to stop using .get(..., True) and
instead require an explicit value from prepared_plan.

In `@utilities/mtv_migration.py`:
- Around line 146-147: You've added a new positional parameter xfs_compatibility
before target_power_state which breaks existing positional callers; change the
function signature to keep existing positional binding by either moving
xfs_compatibility after target_power_state or make it keyword-only (e.g., put
"*, xfs_compatibility: bool = False" after target_power_state) so old callers
still bind target_power_state correctly, and then update any call sites/tests
that intentionally pass xfs_compatibility positionally to use the keyword form.

In `@utilities/post_migration.py`:
- Around line 1257-1276: The code resolves destination_vm_name but still uses
vm_name when creating the SSH connection and logging; update usages to target
the migrated VM by replacing vm_name with destination_vm_name in
vm_ssh_connections.create and in the LOGGER.info message (leave
get_ssh_credentials_from_provider_config(source_provider_data, source_vm_data)
and the win_os check as-is so credentials and OS checks still come from
source_vm_data). Ensure vm_ssh_connections.create(...) is called with
vm_name=destination_vm_name and corresponding log text reflects the destination
VM.

In `@utilities/utils.py`:
- Around line 258-259: The current call to _reorder_networks_by_pod_network can
raise a ValueError when custom_pod_network is missing even though pod_only=True
makes reordering unnecessary; update the condition around the call so it only
runs when custom_pod_network is truthy AND pod_only is False (i.e., guard the
call with "if custom_pod_network and not pod_only") to skip reordering in
pod_only mode; reference symbols: custom_pod_network, pod_only,
_reorder_networks_by_pod_network, networks.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 4da22c4c-e415-4cb9-abe3-75542d1da274

📥 Commits

Reviewing files that changed from the base of the PR and between 8ed18dc and 4a38083.

📒 Files selected for processing (5)

tests/tests_config/config.py
tests/warm/test_warm_migration_comprehensive.py
utilities/mtv_migration.py
utilities/post_migration.py
utilities/utils.py

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@utilities/post_migration.py`:
- Around line 1475-1488: The XFS check calls check_vm_command_output whenever
plan.get("xfs_compatibility") is true, causing failures if the destination VM is
powered off; update the condition to also verify the destination VM is powered
on (e.g., vm.get("power_state") == "on" or an existing vm_powered_on flag) and
only run check_vm_command_output when vm_ssh_connections and
plan.get("xfs_compatibility") are true AND the VM is powered on, keeping the
existing try/except and res[vm_name].append(...) behavior for error reporting.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9e1102a9-cf8a-4744-9f84-80712626103e

📥 Commits

Reviewing files that changed from the base of the PR and between 4a38083 and 9c5c0b4.

📒 Files selected for processing (6)

tests/tests_config/config.py
tests/warm/test_warm_migration_comprehensive.py
utilities/mtv_migration.py
utilities/post_migration.py
utilities/utils.py
utilities/vmware_guest_operations.py

qodo-code-review · 2026-05-27T18:04:59Z

Code Review by Qodo

🐞 Bugs (5) 📘 Rule violations (4)

1. source_vms_data defaulted to {} 📘 Rule violation ☼ Reliability

Description

check_vm_command_output() silently defaults missing plan["source_vms_data"][vm_name] to {},
even though it uses that data to infer Windows-vs-Linux and select SSH credentials, which can
mis-detect guest OS (e.g., treat Windows as Linux) and fail later for misleading reasons. This
violates the intended fail-fast behavior and can cause confusing downstream post-migration
validation failures.

Code

utilities/post_migration.py[R1320-1333]

Evidence

PR Compliance ID 9 requires rejecting invalid or missing required data early with a clear exception,
but the implementation uses plan.get("source_vms_data", {}).get(vm_name, {}), allowing absent
source_vms_data or missing VM entries to propagate into OS detection and credential selection
without an explicit failure. Because credential selection is driven by the win_os flag taken from
this optional/feature-specific structure (treated elsewhere as required only for specific features
like static-IP preservation), missing data can cause Windows guests to be treated as Linux and lead
to check failures for the wrong reason rather than a clear, immediate error at the point of use.

CLAUDE.md: Fail Fast: Do Not Allow Invalid None Values to Propagate
utilities/post_migration.py[1320-1333]
utilities/post_migration.py[39-81]
utilities/post_migration.py[1294-1336]
utilities/post_migration.py[1444-1454]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`check_vm_command_output()` currently uses `plan.get("source_vms_data", {}).get(vm_name, {})` to determine whether the guest is Windows and to choose Linux vs Windows SSH credentials, which masks missing `source_vms_data`/missing VM entries and can cause incorrect OS detection and misleading downstream failures during post-migration validation. Update the logic to fail fast with a clear exception when required VM metadata is missing, and/or stop using `source_vms_data` as the general source of OS/credential metadata.

## Issue Context
- This code path is used for post-migration validation and should comply with the fail-fast requirement (reject missing/invalid required data early with a clear, specific exception).
- `get_ssh_credentials_from_provider_config()` expects a reliable `source_vm_info["win_os"]` flag.
- `source_vms_data` is not guaranteed to exist for all scenarios/providers and is treated as required only for specific features (e.g., static-IP preservation), so it is not a reliable general source of OS metadata.
- `check_vms()` already has authoritative `source_vm` data available from `source_provider.vm_dict(...)`; prefer passing that into `check_vm_command_output()` (e.g., accept `source_vm_info: dict[str, Any]`) and use it for the Windows guard and credential selection. Optionally keep `source_vms_data` only for data that truly lives there, but do not use it for OS detection.

## Fix Focus Areas
- utilities/post_migration.py[1294-1361]
- utilities/post_migration.py[1320-1333]
- utilities/post_migration.py[1475-1494]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. No retries for SSH cmd 🐞 Bug ☼ Reliability

Description

check_vm_command_output() runs the SSH command exactly once with no retry/backoff, unlike existing
SSH checks in this file, so transient guest readiness/SSH instability post-migration can cause
intermittent false test failures.

Code

utilities/post_migration.py[R1337-1347]

Evidence

The codebase already implements retry/backoff for SSH readiness using TimeoutSampler, but the
newly added helper does not, increasing the chance of timing-based flake failures right after
migration.

utilities/post_migration.py[84-140]
utilities/post_migration.py[1294-1358]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`check_vm_command_output()` executes `executor.run_cmd(command)` once and fails immediately on non-zero rc or missing expected output. Post-migration guests commonly have short windows of instability (services still starting, udev settling, etc.), and this file already uses `TimeoutSampler` retry patterns for SSH connectivity.

### Issue Context
- `check_ssh_connectivity()` uses `TimeoutSampler` to avoid flakiness.
- `check_vm_command_output()` creates a new SSH connection and immediately executes the command without any retry loop.

### Fix Focus Areas
- utilities/post_migration.py[84-140]
- utilities/post_migration.py[1294-1361]

### Suggested fix
- Wrap the command execution + expected-output assertion in a `TimeoutSampler` retry loop (similar to `check_ssh_connectivity()`), e.g.:
 - retry for `wait_timeout=120` with `sleep=10`
 - retry on SSH exceptions and on `rc != 0`
 - (optionally) retry if `expected_output` not found, since filesystems/services may not be ready immediately
- Keep the final error message from the last attempt for debuggability.
- Alternatively, reuse an already-validated SSH session/connection if available to reduce repeated port-forward setup.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

3. Defaults used in prepared_plan.get() 📘 Rule violation ≡ Correctness

Description

The test now reads project-controlled tests_params values via .get() with implicit
defaults/None, which can silently mask missing or misspelled required configuration keys and change
migration behavior without failing fast.

Code

tests/warm/test_warm_migration_comprehensive.py[R189-200]

Evidence
Rule 28 forbids using defaults for project-controlled config keys. The updated test_create_plan
passes multiple values from prepared_plan using .get() (including a default for
preserve_static_ips) which can silently substitute missing keys instead of failing fast.
CLAUDE.md: Configuration We Control Must Not Use Defaults: Access Required Keys Directly; Only Optional Feature Flags May Default
tests/warm/test_warm_migration_comprehensive.py[189-200]
tests/tests_config/config.py[524-575]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`prepared_plan` originates from `py_config["tests_params"][...]` (configuration we control). The test uses `.get()` (and `.get(..., default)`) for keys that should be explicitly present, which can hide misconfiguration and let invalid config propagate as `None`/defaults.

## Issue Context
Rule requires required project-controlled config keys to be accessed directly (e.g., `prepared_plan["target_power_state"]`) and forbids silently defaulting missing keys (except for optional feature flags).

## Fix Focus Areas
- tests/warm/test_warm_migration_comprehensive.py[189-200]
- tests/tests_config/config.py[524-575]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

4. Class parametrization shares class state 📘 Rule violation ☼ Reliability ⭐ New

Description

The class-level parametrization adds a second scenario (test_warm_migration_xfs) while the test
class stores migration resources on mutable class attributes, which can leak/overwrite state across
test cases and create order-dependent behavior. This violates the requirement to avoid shared
mutable state between tests.

Code

tests/warm/test_warm_migration_comprehensive.py[R40-43]

Evidence
A second class-level parameter set was added, causing the same test class to run for multiple
configurations. The class still uses mutable class attributes to store resources, which can be
overwritten between parametrized runs and introduces shared mutable state across test cases.
CLAUDE.md: Do Not Share Mutable State Between Tests
tests/warm/test_warm_migration_comprehensive.py[33-46]
tests/warm/test_warm_migration_comprehensive.py[92-201]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`TestWarmMigrationComprehensive` is now parametrized with two plan configurations, but it relies on mutable class attributes (`self.__class__.storage_map`, `network_map`, `plan_resource`) to pass state between tests. This creates cross-test coupling and can cause state from one parametrized run to overwrite/affect another.

## Issue Context
The compliance rule forbids sharing mutable state between tests; resources should be produced/consumed via fixtures (ideally class-scoped) or each test should be fully self-contained.

## Fix Focus Areas
- tests/warm/test_warm_migration_comprehensive.py[33-200]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

5. Wrong exception type used 🐞 Bug ⚙ Maintainability ⭐ New

Description

check_vm_command_output() raises GuestCommandError for SSH command failures, but GuestCommandError
is defined as VMware Guest Operations-specific, making failures misleading and potentially
mis-handled by callers that key off exception type.

Code

utilities/post_migration.py[R1351-1354]

Evidence
The exception class is explicitly documented as VMware Guest Operations-specific, but the new SSH
helper raises it for SSH command failures, creating a semantic mismatch.
exceptions/exceptions.py[122-127]
utilities/post_migration.py[1294-1354]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`utilities/post_migration.py:check_vm_command_output()` raises `GuestCommandError` when an SSH-executed command returns non-zero. However, `GuestCommandError` is defined/documented as applying to VMware Guest Operations failures, so using it for SSH conflates two different execution paths and makes troubleshooting/exception handling confusing.

### Issue Context
- `GuestCommandError` is documented as *VMware Guest Operations* related.
- `check_vm_command_output()` executes commands via SSH (`rrmngmnt` executor), not via Guest Operations.

### Fix Focus Areas
- Define a new exception for SSH/remote command execution (e.g., `SSHCommandError` / `RemoteCommandError`) **or** reuse an existing generic command-exec exception used elsewhere in the repo.
- Update `check_vm_command_output()` to raise the new/generic exception type instead of `GuestCommandError`.
- Ensure any callers/catch blocks (current and future) remain correct.

#### Fix Focus Areas (files/lines)
- exceptions/exceptions.py[122-130]
- utilities/post_migration.py[1294-1366]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

6. Hard-coded XFS device path 🐞 Bug ☼ Reliability

Description

When xfs_compatibility is enabled, check_vms() runs xfs_info against the fixed device /sdb1,
but the plan/test config does not specify which disk/partition is XFS; if the VM’s XFS filesystem is
on a different device name, the post-migration validation will fail even though migration succeeded.

Code

utilities/post_migration.py[R1475-1491]

Evidence
The test config enables xfs_compatibility but does not provide a device identifier; the
post-migration check then hard-codes /sdb1 and any resulting error is turned into a test failure
via the aggregated failed_checks path.
tests/tests_config/config.py[558-575]
utilities/post_migration.py[1475-1494]
utilities/post_migration.py[1592-1595]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The XFS compatibility post-migration check always executes `xfs_info /sdb1`. The VM config/plan does not define an XFS device path, so this check is brittle and can fail for correct migrations when the XFS filesystem is on a different device (e.g., `/dev/sda1`, `/dev/vda2`, `/dev/mapper/...`).

### Issue Context
- The new XFS verification runs inside `check_vms()` only when `plan.get('xfs_compatibility')` is true.
- Any exception is collected into `res[...]` and ultimately causes `pytest.fail(...)`, so a wrong device path makes the whole test fail.

### Fix Focus Areas
- utilities/post_migration.py[1294-1361]
- utilities/post_migration.py[1475-1494]
- tests/tests_config/config.py[558-575]

### Suggested fix
Implement one of the following (prefer dynamic discovery):
1) **Dynamic discovery**: SSH-run a command to find an XFS filesystem device (e.g., `findmnt -nr -t xfs -o SOURCE | head -n1` or `lsblk -nrpo NAME,FSTYPE | awk '$2=="xfs"{print $1; exit}'`) and then run `xfs_info` on that discovered device.
2) **Config-driven**: Add a plan config field like `xfs_device` (or `xfs_mountpoint`) under `test_warm_migration_xfs` and pass it into the verifier instead of hard-coding.
3) **Fallbacks**: Try a small set of common device patterns and fail with a clear error that includes `lsblk` output.

Also update logging/error messages to include the discovered device and full command output on failure.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

View more (2)

7. Hardcoded namespaces in XFS config 📘 Rule violation ☼ Reliability

Description

The new XFS warm-migration test config hardcodes namespace values (vm_target_namespace pattern and
multus_namespace), rather than sourcing namespaces from fixtures, which can undermine parallel
isolation and policy consistency.

Code

tests/tests_config/config.py[R569-571]

Evidence
Rule 33 requires namespace values to be provided by fixtures and not hardcoded. The added
test_warm_migration_xfs configuration includes direct namespace strings (vm_target_namespace and
multus_namespace).
CLAUDE.md: Namespaces Must Be Provided by Fixtures (Never Hardcode)
tests/tests_config/config.py[558-575]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new `test_warm_migration_xfs` config sets namespace values directly in config (including a generated string and a fixed `default`). Compliance requires namespaces to be provided by fixtures rather than hardcoded in test logic/config.

## Issue Context
Namespace fixtures typically incorporate session UUID patterns to ensure uniqueness and safe parallel execution.

## Fix Focus Areas
- tests/tests_config/config.py[558-575]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

8. Pod network reorder mismatch 🐞 Bug ≡ Correctness

Description

_reorder_networks_by_pod_network() can only match networks by net['name'], but some inventories
(e.g., OpenShift) return a pod network mapping as {'type': 'pod'} without a name, making it
impossible to select/reorder the pod network via custom_pod_network and leading to
confusing/incorrect behavior when the pod network is not already first.

Code

utilities/utils.py[R305-311]

Evidence

The reorder helper searches for a match only via net.get('name'), while the OpenShift inventory
explicitly produces a pod network mapping that lacks a name key, so that entry can never be
targeted by custom_pod_network matching-by-name.

utilities/utils.py[254-312]
libs/forklift_inventory.py[479-499]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`custom_pod_network` reordering only matches networks by `name`, but OpenShift inventory can emit a pod network entry without a `name` key (`{"type": "pod"}`). This prevents selecting/reordering that pod entry and can cause incorrect mapping when the pod network is not first.

### Issue Context
- `gen_network_map_list()` optionally reorders the inventory network list before mapping index 0 to pod.
- OpenShift inventory network list may include a pod entry represented as `{"type": "pod"}`.

### Fix Focus Areas
- utilities/utils.py[226-312]
- libs/forklift_inventory.py[479-499]

### Suggested fix
- Make `_reorder_networks_by_pod_network()` able to:
 - Prefer `{"type": "pod"}` entry when `custom_pod_network` indicates pod (e.g., allow `custom_pod_network="pod"` or `"__pod__"`), **or**
 - If an entry has `type=="pod"` and no name, allow matching by a synthetic name, **or**
 - Explicitly reject `custom_pod_network` for inventories that include non-name entries with a clear error message explaining what values are valid.
- Update the ValueError message to differentiate unnamed networks (pod) from named ones so users know what they can select.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

9. Docstring exceptions incomplete 🐞 Bug ⚙ Maintainability ⭐ New

Description

check_vm_command_output() documents only ValueError/AssertionError/GuestCommandError, but it also
raises KeyError (missing source_vms_data) and ConnectionError (SSH connection not established), so
the function contract is inaccurate and callers will be misled.

Code

utilities/post_migration.py[R1312-1345]

Evidence
The docstring's Raises section does not include KeyError/ConnectionError, but the function body
raises both under common error paths.
utilities/post_migration.py[1302-1333]
utilities/post_migration.py[1342-1345]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`check_vm_command_output()` raises exceptions that are not listed in its docstring `Raises:` section (notably `KeyError` and `ConnectionError`). This makes the function contract misleading and complicates correct handling/debugging.

### Issue Context
The function explicitly raises:
- `KeyError` when `plan["source_vms_data"][vm_name]` is missing.
- `ConnectionError` when the SSH connection is not established.

### Fix Focus Areas
- Update the docstring `Raises:` section to include all exception types that can be raised.
- If you change the exception type for SSH command failures (see related finding), update the docstring accordingly.

#### Fix Focus Areas (files/lines)
- utilities/post_migration.py[1302-1365]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Previous review results

Review updated until commit 2ddfb0b

Results up to commit 63804a1

🐞 Bugs (2) 📘 Rule violations (2) 📎 Requirement gaps (0)

1. Defaults used in prepared_plan.get() 📘 Rule violation ≡ Correctness

Description

The test now reads project-controlled tests_params values via .get() with implicit
defaults/None, which can silently mask missing or misspelled required configuration keys and change
migration behavior without failing fast.

Code

tests/warm/test_warm_migration_comprehensive.py[R189-200]

Evidence
Rule 28 forbids using defaults for project-controlled config keys. The updated test_create_plan
passes multiple values from prepared_plan using .get() (including a default for
preserve_static_ips) which can silently substitute missing keys instead of failing fast.
CLAUDE.md: Configuration We Control Must Not Use Defaults: Access Required Keys Directly; Only Optional Feature Flags May Default
tests/warm/test_warm_migration_comprehensive.py[189-200]
tests/tests_config/config.py[524-575]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`prepared_plan` originates from `py_config["tests_params"][...]` (configuration we control). The test uses `.get()` (and `.get(..., default)`) for keys that should be explicitly present, which can hide misconfiguration and let invalid config propagate as `None`/defaults.

## Issue Context
Rule requires required project-controlled config keys to be accessed directly (e.g., `prepared_plan["target_power_state"]`) and forbids silently defaulting missing keys (except for optional feature flags).

## Fix Focus Areas
- tests/warm/test_warm_migration_comprehensive.py[189-200]
- tests/tests_config/config.py[524-575]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. Hard-coded XFS device path 🐞 Bug ☼ Reliability

Description

When xfs_compatibility is enabled, check_vms() runs xfs_info against the fixed device /sdb1,
but the plan/test config does not specify which disk/partition is XFS; if the VM’s XFS filesystem is
on a different device name, the post-migration validation will fail even though migration succeeded.

Code

utilities/post_migration.py[R1475-1491]

Evidence
The test config enables xfs_compatibility but does not provide a device identifier; the
post-migration check then hard-codes /sdb1 and any resulting error is turned into a test failure
via the aggregated failed_checks path.
tests/tests_config/config.py[558-575]
utilities/post_migration.py[1475-1494]
utilities/post_migration.py[1592-1595]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
The XFS compatibility post-migration check always executes `xfs_info /sdb1`. The VM config/plan does not define an XFS device path, so this check is brittle and can fail for correct migrations when the XFS filesystem is on a different device (e.g., `/dev/sda1`, `/dev/vda2`, `/dev/mapper/...`).

### Issue Context
- The new XFS verification runs inside `check_vms()` only when `plan.get('xfs_compatibility')` is true.
- Any exception is collected into `res[...]` and ultimately causes `pytest.fail(...)`, so a wrong device path makes the whole test fail.

### Fix Focus Areas
- utilities/post_migration.py[1294-1361]
- utilities/post_migration.py[1475-1494]
- tests/tests_config/config.py[558-575]

### Suggested fix
Implement one of the following (prefer dynamic discovery):
1) **Dynamic discovery**: SSH-run a command to find an XFS filesystem device (e.g., `findmnt -nr -t xfs -o SOURCE | head -n1` or `lsblk -nrpo NAME,FSTYPE | awk '$2=="xfs"{print $1; exit}'`) and then run `xfs_info` on that discovered device.
2) **Config-driven**: Add a plan config field like `xfs_device` (or `xfs_mountpoint`) under `test_warm_migration_xfs` and pass it into the verifier instead of hard-coding.
3) **Fallbacks**: Try a small set of common device patterns and fail with a clear error that includes `lsblk` output.

Also update logging/error messages to include the discovered device and full command output on failure.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

3. Hardcoded namespaces in XFS config 📘 Rule violation ☼ Reliability

Description

The new XFS warm-migration test config hardcodes namespace values (vm_target_namespace pattern and
multus_namespace), rather than sourcing namespaces from fixtures, which can undermine parallel
isolation and policy consistency.

Code

tests/tests_config/config.py[R569-571]

Evidence
Rule 33 requires namespace values to be provided by fixtures and not hardcoded. The added
test_warm_migration_xfs configuration includes direct namespace strings (vm_target_namespace and
multus_namespace).
CLAUDE.md: Namespaces Must Be Provided by Fixtures (Never Hardcode)
tests/tests_config/config.py[558-575]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new `test_warm_migration_xfs` config sets namespace values directly in config (including a generated string and a fixed `default`). Compliance requires namespaces to be provided by fixtures rather than hardcoded in test logic/config.

## Issue Context
Namespace fixtures typically incorporate session UUID patterns to ensure uniqueness and safe parallel execution.

## Fix Focus Areas
- tests/tests_config/config.py[558-575]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

4. Pod network reorder mismatch 🐞 Bug ≡ Correctness

Description

_reorder_networks_by_pod_network() can only match networks by net['name'], but some inventories
(e.g., OpenShift) return a pod network mapping as {'type': 'pod'} without a name, making it
impossible to select/reorder the pod network via custom_pod_network and leading to
confusing/incorrect behavior when the pod network is not already first.

Code

utilities/utils.py[R305-311]

Evidence

The reorder helper searches for a match only via net.get('name'), while the OpenShift inventory
explicitly produces a pod network mapping that lacks a name key, so that entry can never be
targeted by custom_pod_network matching-by-name.

utilities/utils.py[254-312]
libs/forklift_inventory.py[479-499]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`custom_pod_network` reordering only matches networks by `name`, but OpenShift inventory can emit a pod network entry without a `name` key (`{"type": "pod"}`). This prevents selecting/reordering that pod entry and can cause incorrect mapping when the pod network is not first.

### Issue Context
- `gen_network_map_list()` optionally reorders the inventory network list before mapping index 0 to pod.
- OpenShift inventory network list may include a pod entry represented as `{"type": "pod"}`.

### Fix Focus Areas
- utilities/utils.py[226-312]
- libs/forklift_inventory.py[479-499]

### Suggested fix
- Make `_reorder_networks_by_pod_network()` able to:
 - Prefer `{"type": "pod"}` entry when `custom_pod_network` indicates pod (e.g., allow `custom_pod_network="pod"` or `"__pod__"`), **or**
 - If an entry has `type=="pod"` and no name, allow matching by a synthetic name, **or**
 - Explicitly reject `custom_pod_network` for inventories that include non-name entries with a clear error message explaining what values are valid.
- Update the ValueError message to differentiate unnamed networks (pod) from named ones so users know what they can select.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Results up to commit 63804a1

🐞 Bugs (1) 📘 Rule violations (1) 📎 Requirement gaps (0)

1. source_vms_data defaulted to {} 📘 Rule violation ☼ Reliability

Description

check_vm_command_output() silently defaults missing plan["source_vms_data"][vm_name] to {},
even though it uses that data to infer Windows-vs-Linux and select SSH credentials, which can
mis-detect guest OS (e.g., treat Windows as Linux) and fail later for misleading reasons. This
violates the intended fail-fast behavior and can cause confusing downstream post-migration
validation failures.

Code

utilities/post_migration.py[R1320-1333]

Evidence

PR Compliance ID 9 requires rejecting invalid or missing required data early with a clear exception,
but the implementation uses plan.get("source_vms_data", {}).get(vm_name, {}), allowing absent
source_vms_data or missing VM entries to propagate into OS detection and credential selection
without an explicit failure. Because credential selection is driven by the win_os flag taken from
this optional/feature-specific structure (treated elsewhere as required only for specific features
like static-IP preservation), missing data can cause Windows guests to be treated as Linux and lead
to check failures for the wrong reason rather than a clear, immediate error at the point of use.

CLAUDE.md: Fail Fast: Do Not Allow Invalid None Values to Propagate
utilities/post_migration.py[1320-1333]
utilities/post_migration.py[39-81]
utilities/post_migration.py[1294-1336]
utilities/post_migration.py[1444-1454]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`check_vm_command_output()` currently uses `plan.get("source_vms_data", {}).get(vm_name, {})` to determine whether the guest is Windows and to choose Linux vs Windows SSH credentials, which masks missing `source_vms_data`/missing VM entries and can cause incorrect OS detection and misleading downstream failures during post-migration validation. Update the logic to fail fast with a clear exception when required VM metadata is missing, and/or stop using `source_vms_data` as the general source of OS/credential metadata.

## Issue Context
- This code path is used for post-migration validation and should comply with the fail-fast requirement (reject missing/invalid required data early with a clear, specific exception).
- `get_ssh_credentials_from_provider_config()` expects a reliable `source_vm_info["win_os"]` flag.
- `source_vms_data` is not guaranteed to exist for all scenarios/providers and is treated as required only for specific features (e.g., static-IP preservation), so it is not a reliable general source of OS metadata.
- `check_vms()` already has authoritative `source_vm` data available from `source_provider.vm_dict(...)`; prefer passing that into `check_vm_command_output()` (e.g., accept `source_vm_info: dict[str, Any]`) and use it for the Windows guard and credential selection. Optionally keep `source_vms_data` only for data that truly lives there, but do not use it for OS detection.

## Fix Focus Areas
- utilities/post_migration.py[1294-1361]
- utilities/post_migration.py[1320-1333]
- utilities/post_migration.py[1475-1494]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

2. No retries for SSH cmd 🐞 Bug ☼ Reliability

Description

check_vm_command_output() runs the SSH command exactly once with no retry/backoff, unlike existing
SSH checks in this file, so transient guest readiness/SSH instability post-migration can cause
intermittent false test failures.

Code

utilities/post_migration.py[R1337-1347]

Evidence

The codebase already implements retry/backoff for SSH readiness using TimeoutSampler, but the
newly added helper does not, increasing the chance of timing-based flake failures right after
migration.

utilities/post_migration.py[84-140]
utilities/post_migration.py[1294-1358]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`check_vm_command_output()` executes `executor.run_cmd(command)` once and fails immediately on non-zero rc or missing expected output. Post-migration guests commonly have short windows of instability (services still starting, udev settling, etc.), and this file already uses `TimeoutSampler` retry patterns for SSH connectivity.

### Issue Context
- `check_ssh_connectivity()` uses `TimeoutSampler` to avoid flakiness.
- `check_vm_command_output()` creates a new SSH connection and immediately executes the command without any retry loop.

### Fix Focus Areas
- utilities/post_migration.py[84-140]
- utilities/post_migration.py[1294-1361]

### Suggested fix
- Wrap the command execution + expected-output assertion in a `TimeoutSampler` retry loop (similar to `check_ssh_connectivity()`), e.g.:
 - retry for `wait_timeout=120` with `sleep=10`
 - retry on SSH exceptions and on `rc != 0`
 - (optionally) retry if `expected_output` not found, since filesystems/services may not be ready immediately
- Keep the final error message from the last attempt for debuggability.
- Alternatively, reuse an already-validated SSH session/connection if available to reduce repeated port-forward setup.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@utilities/post_migration.py`:
- Around line 1321-1333: The code is silently defaulting source_vm_data to {}
and assuming ssh_conn.rrmngmnt_host exists which leads to opaque failures;
update the logic around source_vm_data,
get_ssh_credentials_from_provider_config, and ssh_conn usage to fail fast:
explicitly check if plan.get("source_vms_data") is None or if source_vm_data is
missing for vm_name and raise a clear ValueError naming vm_name and plan key,
validate that get_ssh_credentials_from_provider_config returned non-None
ssh_username/ssh_password and raise a descriptive error if missing, and before
using ssh_conn.rrmngmnt_host validate ssh_conn is not None and that
ssh_conn.rrmngmnt_host is present (raise a clear exception referencing
destination_vm_name and ssh_conn) so callers get actionable setup errors instead
of opaque attribute errors.

In `@utilities/vmware_guest_operations.py`:
- Around line 222-224: The awk program is written with double braces so it’s
passed literally and returns empty values; update the shell command strings that
set dev and method (the lines constructing "dev=$(nmcli -f GENERAL.DEVICES ... |
awk -F: '{{print $2}}' ..." and "method=$(nmcli -f ipv4.method ... | awk -F:
'{{print $2}}' ...)") to use single-brace awk syntax ({print $2}) so dev and
method are extracted correctly; ensure both occurrences are fixed and keep the
surrounding nmcli/test/xargs pipeline intact.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 193fb40a-71bf-4373-b22a-67de42ab42f0

📥 Commits

Reviewing files that changed from the base of the PR and between 9c5c0b4 and 63804a1.

📒 Files selected for processing (6)

tests/tests_config/config.py
tests/warm/test_warm_migration_comprehensive.py
utilities/mtv_migration.py
utilities/post_migration.py
utilities/utils.py
utilities/vmware_guest_operations.py

qodo-code-review · 2026-05-27T18:10:59Z

            virtual_machines_list=prepared_plan["virtual_machines"],
-            target_power_state=prepared_plan["target_power_state"],
+            target_power_state=prepared_plan.get("target_power_state"),
            target_namespace=target_namespace,
            warm_migration=prepared_plan["warm_migration"],
-            preserve_static_ips=prepared_plan["preserve_static_ips"],
-            vm_target_namespace=prepared_plan["vm_target_namespace"],
-            pvc_name_template=prepared_plan["pvc_name_template"],
-            pvc_name_template_use_generate_name=prepared_plan["pvc_name_template_use_generate_name"],
-            target_labels=target_vm_labels["vm_labels"],
-            target_affinity=prepared_plan["target_affinity"],
+            preserve_static_ips=prepared_plan.get("preserve_static_ips", False),
+            vm_target_namespace=prepared_plan.get("vm_target_namespace"),
+            pvc_name_template=prepared_plan.get("pvc_name_template"),
+            pvc_name_template_use_generate_name=prepared_plan.get("pvc_name_template_use_generate_name"),
+            target_labels=target_vm_labels.get("vm_labels") if "target_labels" in prepared_plan else None,
+            target_affinity=prepared_plan.get("target_affinity"),
+            xfs_compatibility=prepared_plan.get("xfs_compatibility", False),
        )


1. Defaults used in prepared_plan.get() 📘 Rule violation ≡ Correctness

The test now reads project-controlled tests_params values via .get() with implicit defaults/None, which can silently mask missing or misspelled required configuration keys and change migration behavior without failing fast.

Agent Prompt

## Issue description `prepared_plan` originates from `py_config["tests_params"][...]` (configuration we control). The test uses `.get()` (and `.get(..., default)`) for keys that should be explicitly present, which can hide misconfiguration and let invalid config propagate as `None`/defaults. ## Issue Context Rule requires required project-controlled config keys to be accessed directly (e.g., `prepared_plan["target_power_state"]`) and forbids silently defaulting missing keys (except for optional feature flags). ## Fix Focus Areas - tests/warm/test_warm_migration_comprehensive.py[189-200] - tests/tests_config/config.py[524-575]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-code-review · 2026-05-27T18:11:02Z

Code review by qodo was updated up to the latest commit 63804a1

qodo-code-review · 2026-05-27T18:16:55Z

+    # Get source VM data from plan
+    source_vm_data = plan.get("source_vms_data", {}).get(vm_name, {})
+
+    # Check if Windows VM
+    if source_vm_data.get("win_os", False):
+        raise ValueError(
+            f"Command output verification is only supported on Linux guests. VM '{vm_name}' is a Windows guest."
+        )
+
+    LOGGER.info(f"Running command on VM {destination_vm_name}: {' '.join(command)}")
+
+    # Get SSH credentials
+    ssh_username, ssh_password = get_ssh_credentials_from_provider_config(source_provider_data, source_vm_data)
+


1. source_vms_data defaulted to {} 📘 Rule violation ☼ Reliability

check_vm_command_output() silently defaults missing plan["source_vms_data"][vm_name] to {}, even though it uses that data to infer Windows-vs-Linux and select SSH credentials, which can mis-detect guest OS (e.g., treat Windows as Linux) and fail later for misleading reasons. This violates the intended fail-fast behavior and can cause confusing downstream post-migration validation failures.

Agent Prompt

## Issue description `check_vm_command_output()` currently uses `plan.get("source_vms_data", {}).get(vm_name, {})` to determine whether the guest is Windows and to choose Linux vs Windows SSH credentials, which masks missing `source_vms_data`/missing VM entries and can cause incorrect OS detection and misleading downstream failures during post-migration validation. Update the logic to fail fast with a clear exception when required VM metadata is missing, and/or stop using `source_vms_data` as the general source of OS/credential metadata. ## Issue Context - This code path is used for post-migration validation and should comply with the fail-fast requirement (reject missing/invalid required data early with a clear, specific exception). - `get_ssh_credentials_from_provider_config()` expects a reliable `source_vm_info["win_os"]` flag. - `source_vms_data` is not guaranteed to exist for all scenarios/providers and is treated as required only for specific features (e.g., static-IP preservation), so it is not a reliable general source of OS metadata. - `check_vms()` already has authoritative `source_vm` data available from `source_provider.vm_dict(...)`; prefer passing that into `check_vm_command_output()` (e.g., accept `source_vm_info: dict[str, Any]`) and use it for the Windows guard and credential selection. Optionally keep `source_vms_data` only for data that truly lives there, but do not use it for OS detection. ## Fix Focus Areas - utilities/post_migration.py[1294-1361] - utilities/post_migration.py[1320-1333] - utilities/post_migration.py[1475-1494]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

qodo-code-review · 2026-05-27T18:16:55Z

+    with ssh_conn:
+        # Execute command using executor with the correct user
+        executor = ssh_conn.rrmngmnt_host.executor(user=ssh_conn.rrmngmnt_user)  # type: ignore[union-attr]
+        executor.port = ssh_conn.local_port
+
+        rc, stdout, stderr = executor.run_cmd(command)
+
+        if rc != 0:
+            raise GuestCommandError(
+                f"Command {' '.join(command)} failed on VM {destination_vm_name} (rc={rc}): {stderr}"
+            )


2. No retries for ssh cmd 🐞 Bug ☼ Reliability

check_vm_command_output() runs the SSH command exactly once with no retry/backoff, unlike existing SSH checks in this file, so transient guest readiness/SSH instability post-migration can cause intermittent false test failures.

Agent Prompt

### Issue description `check_vm_command_output()` executes `executor.run_cmd(command)` once and fails immediately on non-zero rc or missing expected output. Post-migration guests commonly have short windows of instability (services still starting, udev settling, etc.), and this file already uses `TimeoutSampler` retry patterns for SSH connectivity. ### Issue Context - `check_ssh_connectivity()` uses `TimeoutSampler` to avoid flakiness. - `check_vm_command_output()` creates a new SSH connection and immediately executes the command without any retry loop. ### Fix Focus Areas - utilities/post_migration.py[84-140] - utilities/post_migration.py[1294-1361] ### Suggested fix - Wrap the command execution + expected-output assertion in a `TimeoutSampler` retry loop (similar to `check_ssh_connectivity()`), e.g.: - retry for `wait_timeout=120` with `sleep=10` - retry on SSH exceptions and on `rc != 0` - (optionally) retry if `expected_output` not found, since filesystems/services may not be ready immediately - Keep the final error message from the last attempt for debuggability. - Alternatively, reuse an already-validated SSH session/connection if available to reduce repeated port-forward setup.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Signed-off-by: Chenli Hu <chhu@redhat.com>

qodo-code-review · 2026-05-27T19:44:15Z

Code review by qodo was updated up to the latest commit 2ddfb0b

qodo-code-review · 2026-05-27T19:50:32Z

Code Review by Qodo

Looking for bugs?

Check back in a few minutes. An AI review agent is analyzing this pull request.

Chenli-Hu · 2026-05-27T20:05:39Z

/verified

myakove · 2026-05-31T06:39:02Z

@Chenli-Hu pleasee checkout Qodo comments

redhat-qe-bot requested a review from solenoci May 19, 2026 00:44

redhat-qe-bot assigned Chenli-Hu May 19, 2026

redhat-qe-bot requested a review from myakove May 19, 2026 00:44

redhat-qe-bot added the branch-main label May 19, 2026

redhat-qe-bot requested a review from krcmarik May 19, 2026 00:44

redhat-qe-bot added the size/L label May 19, 2026

redhat-qe-bot changed the title ~~Warm migrate VM with XFS v4 from vSphere~~ feat(MTV-595): add warm migration test for VMs with XFS v4 filesystem May 19, 2026

coderabbitai Bot requested changes May 19, 2026

View reviewed changes

Comment thread utilities/utils.py Outdated

redhat-qe-bot1 added the changes-requested-coderabbitai[bot] label May 19, 2026

Chenli-Hu force-pushed the xfs branch from 8ed18dc to 4a38083 Compare May 19, 2026 02:14

redhat-qe-bot removed the changes-requested-coderabbitai[bot] label May 19, 2026

coderabbitai Bot requested changes May 19, 2026

View reviewed changes

Comment thread tests/warm/test_warm_migration_comprehensive.py Outdated

Comment thread utilities/mtv_migration.py Outdated

Comment thread utilities/post_migration.py

Comment thread utilities/utils.py Outdated

redhat-qe-bot added the changes-requested-coderabbitai[bot] label May 19, 2026

redhat-qe-bot2 added the needs-rebase label May 19, 2026

Chenli-Hu force-pushed the xfs branch from 4a38083 to 9c5c0b4 Compare May 19, 2026 21:21

redhat-qe-bot removed changes-requested-coderabbitai[bot] needs-rebase labels May 19, 2026

coderabbitai Bot requested changes May 19, 2026

View reviewed changes

Comment thread utilities/post_migration.py

redhat-qe-bot added the changes-requested-coderabbitai[bot] label May 19, 2026

Chenli-Hu force-pushed the xfs branch from 9c5c0b4 to 42e32a7 Compare May 19, 2026 21:36

redhat-qe-bot1 removed the changes-requested-coderabbitai[bot] label May 19, 2026

redhat-qe-bot added the needs-rebase label May 21, 2026

Chenli-Hu force-pushed the xfs branch from 42e32a7 to 63804a1 Compare May 27, 2026 18:04

redhat-qe-bot removed the needs-rebase label May 27, 2026

coderabbitai Bot requested changes May 27, 2026

View reviewed changes

Comment thread utilities/post_migration.py Outdated

Comment thread utilities/vmware_guest_operations.py Outdated

redhat-qe-bot1 added the changes-requested-coderabbitai[bot] label May 27, 2026

qodo-code-review Bot reviewed May 27, 2026

View reviewed changes

rh-bot-1 added the commented-qodo-code-review[bot] label May 27, 2026

qodo-code-review Bot reviewed May 27, 2026

View reviewed changes

Warm migrate VM with XFS v4 from vSphere

2ddfb0b

Signed-off-by: Chenli Hu <chhu@redhat.com>

Chenli-Hu force-pushed the xfs branch from 63804a1 to 2ddfb0b Compare May 27, 2026 19:44

redhat-qe-bot removed commented-qodo-code-review[bot] changes-requested-coderabbitai[bot] labels May 27, 2026

coderabbitai Bot approved these changes May 27, 2026

View reviewed changes

redhat-qe-bot2 added the lgtm-coderabbitai[bot] label May 27, 2026

redhat-qe-bot added verified needs-rebase labels May 27, 2026

Conversation

Chenli-Hu commented May 19, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Uh oh!

redhat-qe-bot commented May 19, 2026

Welcome! 🎉

🔄 Automatic Actions

📋 Available Commands

PR Status Management

Review & Approval

Testing & Validation

Container Operations

Cherry-pick Operations

Label Management

✅ Merge Requirements

📊 Review Process

💡 Tips

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Chenli-Hu commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Chenli-Hu commented May 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

qodo-code-review Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review by Qodo

Previous review results

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

qodo-code-review Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

qodo-code-review Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

qodo-code-review Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

qodo-code-review Bot May 27, 2026

Choose a reason for hiding this comment

Uh oh!

qodo-code-review Bot commented May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

qodo-code-review Bot commented May 27, 2026

Code Review by Qodo

Looking for bugs?

Uh oh!

Chenli-Hu commented May 27, 2026

Uh oh!

myakove commented May 31, 2026

Uh oh!

Reviewers

Chenli-Hu commented May 19, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 19, 2026 •

edited

Loading

Chenli-Hu commented May 19, 2026 •

edited

Loading

Chenli-Hu commented May 19, 2026 •

edited

Loading

qodo-code-review Bot commented May 27, 2026 •

edited

Loading

qodo-code-review Bot commented May 27, 2026 •

edited

Loading

qodo-code-review Bot commented May 27, 2026 •

edited

Loading