fix(prepare): enable containerd device_ownership_from_security_context for CDI block imports#47
Conversation
…t for CDI block imports KubeVirt's CDI importer writes VM disk images into raw block volumes from a non-root pod. containerd only chowns the block device to the pod's SecurityContext when device_ownership_from_security_context is enabled on the CRI plugin, and k3s ships it disabled. Without it the importer fails with 'cannot open /dev/cdi-block-volume: Permission denied', the DataVolume hangs in ImportInProgress, and VMs referencing the disk stay Pending. Add a k3s containerd drop-in (config-v3.toml.d/10-cozystack-cri.toml) to all three prepare playbooks, gated behind cozystack_enable_kubevirt and overridable via cozystack_k3s_containerd_dropin_dir. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <andrei.kvapil@aenix.io>
📝 WalkthroughWalkthroughThis PR releases v1.4.3 with a fix for KubeVirt CDI block import failures. Playbooks now enable a containerd CRI configuration setting ChangesKubeVirt CDI containerd CRI configuration
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request enables device_ownership_from_security_context on the containerd CRI plugin for KubeVirt CDI block imports across RHEL, SUSE, and Ubuntu playbooks, and updates the documentation and changelog. Feedback on these changes suggests avoiding the use of failed_when: false on the k3s restart handlers, as it silences legitimate failures; instead, you should use ansible.builtin.stat to verify if the services exist before conditionally restarting them. Additionally, to fully support containerd 1.x clusters as documented, the containerd config version and plugin name should be parameterized with variables rather than hardcoded.
| - name: Restart k3s to apply containerd config | ||
| ansible.builtin.systemd: | ||
| name: "{{ item }}" | ||
| state: restarted | ||
| loop: | ||
| - k3s | ||
| - k3s-agent | ||
| # Only the unit matching this node's role exists; the other is | ||
| # absent, and on the full-pipeline run prepare executes before | ||
| # k3s is installed (the drop-in is then read at first k3s start). | ||
| # failed_when: false tolerates both — a missing unit is not an error. | ||
| failed_when: false |
There was a problem hiding this comment.
Using failed_when: false on the k3s restart handler is dangerous because it silences legitimate failures (e.g., if k3s fails to start due to a syntax error in the newly written containerd config). A safer approach is to check if the k3s services are installed using ansible.builtin.stat in the tasks, and then conditionally restart them in the handler only if they exist. This allows the playbook to fail loudly if an installed k3s service fails to restart.
- name: Restart k3s to apply containerd config
ansible.builtin.systemd:
name: "{{ item.item }}"
state: restarted
loop: "{{ k3s_services_stat.results | default([]) }}"
when: item.stat.exists | default(false)| - name: Restart k3s to apply containerd config | ||
| ansible.builtin.systemd: | ||
| name: "{{ item }}" | ||
| state: restarted | ||
| loop: | ||
| - k3s | ||
| - k3s-agent | ||
| # Only the unit matching this node's role exists; the other is | ||
| # absent, and on the full-pipeline run prepare executes before | ||
| # k3s is installed (the drop-in is then read at first k3s start). | ||
| # failed_when: false tolerates both — a missing unit is not an error. | ||
| failed_when: false |
There was a problem hiding this comment.
Using failed_when: false on the k3s restart handler is dangerous because it silences legitimate failures (e.g., if k3s fails to start due to a syntax error in the newly written containerd config). A safer approach is to check if the k3s services are installed using ansible.builtin.stat in the tasks, and then conditionally restart them in the handler only if they exist. This allows the playbook to fail loudly if an installed k3s service fails to restart.
- name: Restart k3s to apply containerd config
ansible.builtin.systemd:
name: "{{ item.item }}"
state: restarted
loop: "{{ k3s_services_stat.results | default([]) }}"
when: item.stat.exists | default(false)| - name: Restart k3s to apply containerd config | ||
| ansible.builtin.systemd: | ||
| name: "{{ item }}" | ||
| state: restarted | ||
| loop: | ||
| - k3s | ||
| - k3s-agent | ||
| # Only the unit matching this node's role exists; the other is | ||
| # absent, and on the full-pipeline run prepare executes before | ||
| # k3s is installed (the drop-in is then read at first k3s start). | ||
| # failed_when: false tolerates both — a missing unit is not an error. | ||
| failed_when: false |
There was a problem hiding this comment.
Using failed_when: false on the k3s restart handler is dangerous because it silences legitimate failures (e.g., if k3s fails to start due to a syntax error in the newly written containerd config). A safer approach is to check if the k3s services are installed using ansible.builtin.stat in the tasks, and then conditionally restart them in the handler only if they exist. This allows the playbook to fail loudly if an installed k3s service fails to restart.
- name: Restart k3s to apply containerd config
ansible.builtin.systemd:
name: "{{ item.item }}"
state: restarted
loop: "{{ k3s_services_stat.results | default([]) }}"
when: item.stat.exists | default(false)| - name: Ensure k3s containerd config drop-in directory exists | ||
| ansible.builtin.file: | ||
| path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}" | ||
| state: directory | ||
| mode: "0755" | ||
| when: cozystack_enable_kubevirt | default(true) | bool |
There was a problem hiding this comment.
To safely restart k3s only when it is actually installed, we can add a task to check if the systemd unit files exist using ansible.builtin.stat before ensuring the drop-in directory exists.
- name: Check if k3s services are installed
ansible.builtin.stat:
path: "/etc/systemd/system/{{ item }}.service"
loop:
- k3s
- k3s-agent
register: k3s_services_stat
when: cozystack_enable_kubevirt | default(true) | bool
- name: Ensure k3s containerd config drop-in directory exists
ansible.builtin.file:
path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
state: directory
mode: "0755"
when: cozystack_enable_kubevirt | default(true) | bool| - name: Enable device_ownership_from_security_context for CDI block imports | ||
| ansible.builtin.copy: | ||
| dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml" | ||
| mode: "0644" | ||
| content: | | ||
| version = 3 | ||
|
|
||
| [plugins.'io.containerd.cri.v1.runtime'] | ||
| device_ownership_from_security_context = true | ||
| when: cozystack_enable_kubevirt | default(true) | bool | ||
| notify: Restart k3s to apply containerd config |
There was a problem hiding this comment.
The README and CHANGELOG state that containerd 1.x clusters are supported by overriding the drop-in directory and adjusting the plugin table. However, the config version (version = 3) and plugin name (io.containerd.cri.v1.runtime) are currently hardcoded in the task. To make this actually configurable for containerd 1.x, we should use variables with sensible defaults.
- name: Enable device_ownership_from_security_context for CDI block imports
ansible.builtin.copy:
dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
mode: "0644"
content: |
version = {{ cozystack_k3s_containerd_config_version | default(3) }}
[plugins.'{{ cozystack_k3s_containerd_cri_plugin | default("io.containerd.cri.v1.runtime") }}']
device_ownership_from_security_context = true
when: cozystack_enable_kubevirt | default(true) | bool
notify: Restart k3s to apply containerd config| - name: Ensure k3s containerd config drop-in directory exists | ||
| ansible.builtin.file: | ||
| path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}" | ||
| state: directory | ||
| mode: "0755" | ||
| when: cozystack_enable_kubevirt | default(true) | bool |
There was a problem hiding this comment.
To safely restart k3s only when it is actually installed, we can add a task to check if the systemd unit files exist using ansible.builtin.stat before ensuring the drop-in directory exists.
- name: Check if k3s services are installed
ansible.builtin.stat:
path: "/etc/systemd/system/{{ item }}.service"
loop:
- k3s
- k3s-agent
register: k3s_services_stat
when: cozystack_enable_kubevirt | default(true) | bool
- name: Ensure k3s containerd config drop-in directory exists
ansible.builtin.file:
path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
state: directory
mode: "0755"
when: cozystack_enable_kubevirt | default(true) | bool| - name: Enable device_ownership_from_security_context for CDI block imports | ||
| ansible.builtin.copy: | ||
| dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml" | ||
| mode: "0644" | ||
| content: | | ||
| version = 3 | ||
|
|
||
| [plugins.'io.containerd.cri.v1.runtime'] | ||
| device_ownership_from_security_context = true | ||
| when: cozystack_enable_kubevirt | default(true) | bool | ||
| notify: Restart k3s to apply containerd config |
There was a problem hiding this comment.
The README and CHANGELOG state that containerd 1.x clusters are supported by overriding the drop-in directory and adjusting the plugin table. However, the config version (version = 3) and plugin name (io.containerd.cri.v1.runtime) are currently hardcoded in the task. To make this actually configurable for containerd 1.x, we should use variables with sensible defaults.
- name: Enable device_ownership_from_security_context for CDI block imports
ansible.builtin.copy:
dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
mode: "0644"
content: |
version = {{ cozystack_k3s_containerd_config_version | default(3) }}
[plugins.'{{ cozystack_k3s_containerd_cri_plugin | default("io.containerd.cri.v1.runtime") }}']
device_ownership_from_security_context = true
when: cozystack_enable_kubevirt | default(true) | bool
notify: Restart k3s to apply containerd config| - name: Ensure k3s containerd config drop-in directory exists | ||
| ansible.builtin.file: | ||
| path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}" | ||
| state: directory | ||
| mode: "0755" | ||
| when: cozystack_enable_kubevirt | default(true) | bool |
There was a problem hiding this comment.
To safely restart k3s only when it is actually installed, we can add a task to check if the systemd unit files exist using ansible.builtin.stat before ensuring the drop-in directory exists.
- name: Check if k3s services are installed
ansible.builtin.stat:
path: "/etc/systemd/system/{{ item }}.service"
loop:
- k3s
- k3s-agent
register: k3s_services_stat
when: cozystack_enable_kubevirt | default(true) | bool
- name: Ensure k3s containerd config drop-in directory exists
ansible.builtin.file:
path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
state: directory
mode: "0755"
when: cozystack_enable_kubevirt | default(true) | bool| - name: Enable device_ownership_from_security_context for CDI block imports | ||
| ansible.builtin.copy: | ||
| dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml" | ||
| mode: "0644" | ||
| content: | | ||
| version = 3 | ||
|
|
||
| [plugins.'io.containerd.cri.v1.runtime'] | ||
| device_ownership_from_security_context = true | ||
| when: cozystack_enable_kubevirt | default(true) | bool | ||
| notify: Restart k3s to apply containerd config |
There was a problem hiding this comment.
The README and CHANGELOG state that containerd 1.x clusters are supported by overriding the drop-in directory and adjusting the plugin table. However, the config version (version = 3) and plugin name (io.containerd.cri.v1.runtime) are currently hardcoded in the task. To make this actually configurable for containerd 1.x, we should use variables with sensible defaults.
- name: Enable device_ownership_from_security_context for CDI block imports
ansible.builtin.copy:
dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
mode: "0644"
content: |
version = {{ cozystack_k3s_containerd_config_version | default(3) }}
[plugins.'{{ cozystack_k3s_containerd_cri_plugin | default("io.containerd.cri.v1.runtime") }}']
device_ownership_from_security_context = true
when: cozystack_enable_kubevirt | default(true) | bool
notify: Restart k3s to apply containerd configThere was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
galaxy.yml (1)
4-4:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winVersion bump violates collection versioning policy.
(Duplicate of the concern raised in CHANGELOG.rst)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@galaxy.yml` at line 4, The version line "version: 1.4.3" in galaxy.yml violates the collection versioning policy; revert or adjust this version to comply with the policy (e.g., keep previous version or follow the permitted increment scheme) and align CHANGELOG.rst with the intended release changes instead of forcing a version bump; update the version value in galaxy.yml (the "version" key) to the correct policy-compliant version and ensure CHANGELOG.rst reflects any release notes rather than being used to justify a noncompliant bump.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CHANGELOG.rst`:
- Around line 6-24: The CHANGELOG entry bumped the collection version to v1.4.3
which violates the policy that collection version mirrors
roles/cozystack/defaults/main.yml:cozystack_chart_version; revert the version
header change (restore v1.4.2 or remove the explicit bump) in CHANGELOG.rst so
the bugfix entry is not presented as a collection version bump unless you also
update cozystack_chart_version in roles/cozystack/defaults/main.yml to 1.4.3;
ensure the changelog text for the containerd fix remains but do not change the
collection version number.
---
Duplicate comments:
In `@galaxy.yml`:
- Line 4: The version line "version: 1.4.3" in galaxy.yml violates the
collection versioning policy; revert or adjust this version to comply with the
policy (e.g., keep previous version or follow the permitted increment scheme)
and align CHANGELOG.rst with the intended release changes instead of forcing a
version bump; update the version value in galaxy.yml (the "version" key) to the
correct policy-compliant version and ensure CHANGELOG.rst reflects any release
notes rather than being used to justify a noncompliant bump.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: f11114e0-a9e7-4015-90c4-2dfaad96a1eb
📒 Files selected for processing (6)
CHANGELOG.rstREADME.mdexamples/rhel/prepare-rhel.ymlexamples/suse/prepare-suse.ymlexamples/ubuntu/prepare-ubuntu.ymlgalaxy.yml
| v1.4.3 | ||
| ====== | ||
|
|
||
| Bugfixes | ||
| -------- | ||
|
|
||
| - Prepare playbooks now enable | ||
| ``device_ownership_from_security_context`` on the containerd CRI | ||
| plugin (k3s drop-in | ||
| ``config-v3.toml.d/10-cozystack-cri.toml``). KubeVirt's CDI importer | ||
| writes disk images into raw block volumes as a non-root pod, which | ||
| requires containerd to chown the block device to the pod's | ||
| SecurityContext; k3s disables this by default. Without it the | ||
| importer failed with ``blockdev: cannot open /dev/cdi-block-volume: | ||
| Permission denied``, the ``DataVolume`` hung in ``ImportInProgress``, | ||
| and VMs referencing the disk stayed ``Pending``. Gated behind | ||
| ``cozystack_enable_kubevirt``; drop-in directory overridable via | ||
| ``cozystack_k3s_containerd_dropin_dir`` for containerd 1.x clusters. | ||
|
|
There was a problem hiding this comment.
Version bump violates collection versioning policy.
The collection version is being bumped from 1.4.2 to 1.4.3 for a bugfix (containerd CRI configuration), but the coding guidelines state: "Collection version is inherited from cozystack/cozystack and tracks the upstream Cozystack chart version (the value in roles/cozystack/defaults/main.yml:cozystack_chart_version). Do NOT bump the collection version just because a PR adds features or fixes bugs."
Unless cozystack_chart_version in roles/cozystack/defaults/main.yml has also been updated to 1.4.3 (which is not shown in the provided files), this version bump should be reverted. The fix should be released as part of the next version that syncs with an upstream Cozystack chart release.
As per coding guidelines: Collection version is inherited from cozystack/cozystack and tracks the upstream Cozystack chart version (the value in roles/cozystack/defaults/main.yml:cozystack_chart_version). Do NOT bump the collection version just because a PR adds features or fixes bugs.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@CHANGELOG.rst` around lines 6 - 24, The CHANGELOG entry bumped the collection
version to v1.4.3 which violates the policy that collection version mirrors
roles/cozystack/defaults/main.yml:cozystack_chart_version; revert the version
header change (restore v1.4.2 or remove the explicit bump) in CHANGELOG.rst so
the bugfix entry is not presented as a collection version bump unless you also
update cozystack_chart_version in roles/cozystack/defaults/main.yml to 1.4.3;
ensure the changelog text for the containerd fix remains but do not change the
collection version number.
Summary
KubeVirt's CDI importer writes VM disk images into raw block volumes from a non-root pod (
runAsUser: 107). containerd only chowns the block device to the pod'sSecurityContextwhendevice_ownership_from_security_contextis enabled on the CRI plugin, and k3s ships it disabled. Without it the importer fails withblockdev: cannot open /dev/cdi-block-volume: Permission denied, theDataVolumeis stuck inImportInProgress, and every VM referencing the disk staysPending— one of the silent "VMs stuck in Pending" failure modes the README already warns about. The prepare playbooks configure every other node prerequisite but never touched the containerd CRI config; this adds it.Changes
ubuntu/rhel/suse) drop a k3s containerd config (config-v3.toml.d/10-cozystack-cri.toml) settingdevice_ownership_from_security_context = true, gated behindcozystack_enable_kubevirt.cozystack_k3s_containerd_dropin_dirfor containerd 1.x clusters.galaxy.ymlversion bump to1.4.3.Test plan
ansible-lintpasses (production profile, 0 failures / 0 warnings)ansible-test sanitypasses — not run (no plugin changes)Permission deniedfor 5h; after applying the equivalent drop-in + k3s restart theDataVolumeimport resumed and completed (Succeeded, PVCBound), and the VM booted with the disk attached.copy/filetasks are idempotent; the handler fires only when the drop-in content changes.All three prepare playbooks pass
ansible-playbook --syntax-check.Summary by CodeRabbit
Release Notes v1.4.3
Bug Fixes
ImportInProgressstate and prevent associated VMs from reachingPendingstatus (when KubeVirt is enabled).Documentation