fix(templates): skip CPU-limited VMs gracefully instead of failing the play #45

Open
t0kubetsu wants to merge 1 commit into main from fix/template-vcpu-host-limit

fix(templates): skip CPU-limited VMs gracefully instead of failing the play#45
t0kubetsu wants to merge 1 commit into
mainfrom
fix/template-vcpu-host-limit

Conversation

@t0kubetsu

Fixes #43

Summary

  • Add `'MAX' not in stderr` to the `failed_when` condition of the START task, so a host CPU limit no longer aborts the play
  • After start, build `templates_started` (VMs that started) and `templates_skipped_cpu` (VMs blocked by CPU limit)
  • WAIT, DETACH, RE-ATTACH, CONVERT now loop over `templates_started` instead of `templates_to_update`
  • A WARN task displays which templates were skipped and why
  • Applies to blank_scenario_2_subnets, blank_scenario_4_subnets, blank_scenario_6_subnets
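
A minimal sketch of how the START handling and the list-splitting described above might look. Task names, the `start_results` register, and the exact filters are illustrative assumptions, not the playbook's actual code:

```yaml
# START: try to start each template VM. A "MAX X vcpus allowed per VM
# on this node" error is tolerated (rc != 0 but 'MAX' in stderr);
# any other non-zero exit still fails the play.
- name: START template VMs
  ansible.builtin.command: "qm start {{ item.vmid }}"
  loop: "{{ templates_to_update }}"
  register: start_results
  failed_when:
    - start_results.rc != 0
    - "'MAX' not in start_results.stderr"

# Split the per-item results into VMs that started and VMs that were
# blocked by the host CPU limit.
- name: Build templates_started / templates_skipped_cpu
  ansible.builtin.set_fact:
    templates_started: >-
      {{ start_results.results | selectattr('rc', 'equalto', 0)
         | map(attribute='item') | list }}
    templates_skipped_cpu: >-
      {{ start_results.results | rejectattr('rc', 'equalto', 0)
         | map(attribute='item') | list }}
```

The later WAIT, DETACH, RE-ATTACH, and CONVERT tasks would then loop over `templates_started` instead of `templates_to_update`.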

Why not cap vm_cores?

PR #44 took that approach and was closed as incorrect: capping vm_cores changes the actual template spec, so cloned VMs end up with fewer CPUs than intended. The template definitions are correct — the playbook just needs to handle the case where the current host cannot fulfil them.

Test plan

  • Run on a host with 4 physical CPUs — templates with ≤4 cores convert; 6/8-core templates emit a warning and are skipped without failing the play
  • Run on a host with 8+ physical CPUs — all templates convert, no warning

When a template VM has more vm_cores than the host's physical CPU count,
qm start fails with "MAX X vcpus allowed per VM on this node". Previously
this failed the entire play, blocking all remaining templates.

Changes:
- Add 'MAX' to the START task failed_when ignore list so the play continues
- Build templates_started (VMs that started) and templates_skipped_cpu
  (VMs that hit the CPU limit) from the start results
- Run WAIT, DETACH, RE-ATTACH, CONVERT only on templates_started
- Display a clear warning listing skipped templates and why

The template vm_cores definitions are intentionally preserved — they
represent the intended sizing tier. Users on hosts with more physical
CPUs will get all templates; users on smaller hosts get a warning
instead of a playbook failure.

Applies to: blank_scenario_2_subnets, blank_scenario_4_subnets,
            blank_scenario_6_subnets
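
The warning step mentioned in the change list could be sketched like this (a hypothetical task, assuming `templates_skipped_cpu` holds the skipped template definitions with `name` and `vm_cores` fields):

```yaml
- name: WARN about templates skipped due to host CPU limit
  ansible.builtin.debug:
    msg: >-
      Skipped template {{ item.name }} ({{ item.vm_cores }} cores):
      the host has fewer physical CPUs than this template requires.
      Re-run on a larger host to build it.
  loop: "{{ templates_skipped_cpu }}"
  when: templates_skipped_cpu | length > 0
```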


Development

Successfully merging this pull request may close these issues.

bug(templates): medium-06 / large-06 / large-08 templates fail to start when host has fewer than 6 physical CPUs
