Closed
20 changes: 20 additions & 0 deletions CHANGELOG.rst
@@ -3,6 +3,26 @@ cozystack.installer Release Notes
=================================


v1.4.3
======

Bugfixes
--------

- Prepare playbooks now enable
``device_ownership_from_security_context`` on the containerd CRI
plugin (k3s drop-in
``config-v3.toml.d/10-cozystack-cri.toml``). KubeVirt's CDI importer
writes disk images into raw block volumes as a non-root pod, which
requires containerd to chown the block device to the pod's
SecurityContext; k3s disables this by default. Without it the
importer failed with ``blockdev: cannot open /dev/cdi-block-volume:
Permission denied``, the ``DataVolume`` hung in ``ImportInProgress``,
and VMs referencing the disk stayed ``Pending``. Gated behind
``cozystack_enable_kubevirt``; drop-in directory overridable via
``cozystack_k3s_containerd_dropin_dir`` for containerd 1.x clusters.

Comment on lines +6 to +24

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Version bump violates collection versioning policy.

The collection version is being bumped from 1.4.2 to 1.4.3 for a bugfix (containerd CRI configuration), but the coding guidelines state: "Collection version is inherited from cozystack/cozystack and tracks the upstream Cozystack chart version (the value in roles/cozystack/defaults/main.yml:cozystack_chart_version). Do NOT bump the collection version just because a PR adds features or fixes bugs."

Unless cozystack_chart_version in roles/cozystack/defaults/main.yml has also been updated to 1.4.3 (which is not shown in the provided files), this version bump should be reverted. The fix should be released as part of the next version that syncs with an upstream Cozystack chart release.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CHANGELOG.rst` around lines 6 - 24, The CHANGELOG entry bumped the collection
version to v1.4.3 which violates the policy that collection version mirrors
roles/cozystack/defaults/main.yml:cozystack_chart_version; revert the version
header change (restore v1.4.2 or remove the explicit bump) in CHANGELOG.rst so
the bugfix entry is not presented as a collection version bump unless you also
update cozystack_chart_version in roles/cozystack/defaults/main.yml to 1.4.3;
ensure the changelog text for the containerd fix remains but do not change the
collection version number.
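
The policy quoted above can be checked mechanically. The following is a minimal sketch, not part of the PR: it assumes a hypothetical playbook saved at the repository root, and it assumes `cozystack_chart_version` is written in the same format as the `version` field in `galaxy.yml` (for example, without a `v` prefix).

```yaml
---
# Hypothetical check playbook; the file name and location are assumptions.
- name: Check that the collection version tracks the chart version
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    # Both files are parsed as plain YAML relative to this playbook.
    galaxy_meta: "{{ lookup('ansible.builtin.file', playbook_dir ~ '/galaxy.yml') | from_yaml }}"
    role_defaults: "{{ lookup('ansible.builtin.file', playbook_dir ~ '/roles/cozystack/defaults/main.yml') | from_yaml }}"
  tasks:
    - name: Compare galaxy.yml version with cozystack_chart_version
      ansible.builtin.assert:
        that:
          - (galaxy_meta.version | string) == (role_defaults.cozystack_chart_version | string)
        fail_msg: >-
          galaxy.yml version {{ galaxy_meta.version }} does not match
          cozystack_chart_version {{ role_defaults.cozystack_chart_version }}
```

Run against this PR as shown, such a check would be expected to fail unless `cozystack_chart_version` was also moved to 1.4.3.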


v1.4.0
======

12 changes: 12 additions & 0 deletions README.md
@@ -168,6 +168,18 @@ tun
kvm_intel # or kvm_amd depending on the CPU
```

#### Enabled by default: containerd device ownership for CDI block imports

When KubeVirt is enabled, the prepare playbook drops a containerd CRI config that sets `device_ownership_from_security_context = true`. KubeVirt's CDI (Containerized Data Importer) writes VM disk images into raw **block** volumes from a non-root importer pod; containerd only chowns the block device to the pod's `SecurityContext` UID/GID when this option is on, and k3s ships it disabled. Without it the importer fails with `blockdev: cannot open /dev/cdi-block-volume: Permission denied`, the `DataVolume` is stuck in `ImportInProgress`, and every VM that references the disk stays `Pending` — one of the silent "VMs stuck in Pending" failure modes called out above.

Written as a drop-in that containerd merges on top of k3s's generated `config.toml`:

```text
/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d/10-cozystack-cri.toml
```

`config-v3.toml.d` and the `io.containerd.cri.v1.runtime` plugin table are the containerd 2.x (config version 3) paths shipped by current k3s. On a containerd 1.x cluster override `cozystack_k3s_containerd_dropin_dir` (and adjust the plugin table to `io.containerd.grpc.v1.cri`). The drop-in is read at first k3s start in the full pipeline; on a re-run against a running cluster a handler restarts k3s so the change takes effect.

#### Known limitations

ZFS support depends on the OS ecosystem and kernel flavor. The prepare
48 changes: 48 additions & 0 deletions examples/rhel/prepare-rhel.yml
@@ -122,6 +122,19 @@
        state: restarted
      failed_when: false # tolerated: same reason as the enable task below

    - name: Restart k3s to apply containerd config
      ansible.builtin.systemd:
        name: "{{ item }}"
        state: restarted
      loop:
        - k3s
        - k3s-agent
      # Only the unit matching this node's role exists; the other is
      # absent, and on the full-pipeline run prepare executes before
      # k3s is installed (the drop-in is then read at first k3s start).
      # failed_when: false tolerates both — a missing unit is not an error.
      failed_when: false
Comment on lines +125 to +136

high

Using failed_when: false on the k3s restart handler is dangerous because it silences legitimate failures (e.g., if k3s fails to start due to a syntax error in the newly written containerd config). A safer approach is to check if the k3s services are installed using ansible.builtin.stat in the tasks, and then conditionally restart them in the handler only if they exist. This allows the playbook to fail loudly if an installed k3s service fails to restart.

    - name: Restart k3s to apply containerd config
      ansible.builtin.systemd:
        name: "{{ item.item }}"
        state: restarted
      loop: "{{ k3s_services_stat.results | default([]) }}"
      when: item.stat.exists | default(false)
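
The conditional handler above depends on a `k3s_services_stat` result that must be registered earlier in the play. A minimal, self-contained sketch of how the check, the drop-in task, and the handler could fit together is given here. `hosts: all` and `become: true` are assumptions, the unit-file location under `/etc/systemd/system` is taken from the reviewer's proposal, and the `cozystack_enable_kubevirt` gate and the directory override are left out for brevity.

```yaml
---
- name: Write the containerd drop-in and restart only installed k3s units
  hosts: all        # assumption: illustrative target
  become: true

  handlers:
    - name: Restart k3s to apply containerd config
      ansible.builtin.systemd:
        name: "{{ item.item }}"
        state: restarted
      # Loop over the stat results: a unit that is not installed is skipped,
      # while a failed restart of an installed unit still fails the play.
      loop: "{{ k3s_services_stat.results | default([]) }}"
      when: item.stat.exists

  tasks:
    - name: Check if k3s services are installed
      ansible.builtin.stat:
        path: "/etc/systemd/system/{{ item }}.service"
      loop:
        - k3s
        - k3s-agent
      register: k3s_services_stat

    - name: Ensure k3s containerd config drop-in directory exists
      ansible.builtin.file:
        path: /var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d
        state: directory
        mode: "0755"

    - name: Enable device_ownership_from_security_context for CDI block imports
      ansible.builtin.copy:
        dest: /var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d/10-cozystack-cri.toml
        mode: "0644"
        content: |
          version = 3

          [plugins.'io.containerd.cri.v1.runtime']
            device_ownership_from_security_context = true
      notify: Restart k3s to apply containerd config
```

On a node where k3s is not yet installed, both stat results report `exists: false`, so the handler runs but skips every item; the drop-in is then picked up at first k3s start.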


tasks:
- name: Create k3s_cluster group for k3s.orchestration
ansible.builtin.group_by:
@@ -188,6 +201,41 @@
| map(attribute='item')
| list }}

    # CDI (Containerized Data Importer) streams VM disk images into raw
    # block volumes from a NON-root importer pod. containerd only chowns
    # the block device to the pod's SecurityContext UID/GID when
    # device_ownership_from_security_context is enabled on the CRI
    # plugin, and k3s ships it disabled. Without it the importer dies
    # with "blockdev: cannot open /dev/cdi-block-volume: Permission
    # denied", the DataVolume hangs in ImportInProgress, and every VM
    # that references the disk stays Pending.
    #
    # The drop-in is merged by containerd on top of k3s's generated
    # config.toml via the config-v3.toml.d import glob — read at first
    # k3s start (full pipeline) or applied by the handler on re-runs
    # against a running cluster. config-v3.toml.d and
    # io.containerd.cri.v1.runtime are the containerd 2.x (config
    # version 3) paths shipped by current k3s; override
    # cozystack_k3s_containerd_dropin_dir for a containerd 1.x cluster.
    - name: Ensure k3s containerd config drop-in directory exists
      ansible.builtin.file:
        path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
        state: directory
        mode: "0755"
      when: cozystack_enable_kubevirt | default(true) | bool
Comment on lines +220 to +225

medium

To safely restart k3s only when it is actually installed, we can add a task to check if the systemd unit files exist using ansible.builtin.stat before ensuring the drop-in directory exists.

    - name: Check if k3s services are installed
      ansible.builtin.stat:
        path: "/etc/systemd/system/{{ item }}.service"
      loop:
        - k3s
        - k3s-agent
      register: k3s_services_stat
      when: cozystack_enable_kubevirt | default(true) | bool

    - name: Ensure k3s containerd config drop-in directory exists
      ansible.builtin.file:
        path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
        state: directory
        mode: "0755"
      when: cozystack_enable_kubevirt | default(true) | bool


    - name: Enable device_ownership_from_security_context for CDI block imports
      ansible.builtin.copy:
        dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
        mode: "0644"
        content: |
          version = 3

          [plugins.'io.containerd.cri.v1.runtime']
            device_ownership_from_security_context = true
      when: cozystack_enable_kubevirt | default(true) | bool
      notify: Restart k3s to apply containerd config
Comment on lines +227 to +237

medium

The README and CHANGELOG state that containerd 1.x clusters are supported by overriding the drop-in directory and adjusting the plugin table. However, the config version (version = 3) and plugin name (io.containerd.cri.v1.runtime) are currently hardcoded in the task. To make this actually configurable for containerd 1.x, we should use variables with sensible defaults.

    - name: Enable device_ownership_from_security_context for CDI block imports
      ansible.builtin.copy:
        dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
        mode: "0644"
        content: |
          version = {{ cozystack_k3s_containerd_config_version | default(3) }}

          [plugins.'{{ cozystack_k3s_containerd_cri_plugin | default("io.containerd.cri.v1.runtime") }}']
            device_ownership_from_security_context = true
      when: cozystack_enable_kubevirt | default(true) | bool
      notify: Restart k3s to apply containerd config
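
If the variables proposed above were adopted, a containerd 1.x cluster could be targeted from inventory alone. The sketch below is illustrative only: the file location is hypothetical, the `config.toml.d` directory is an assumption that should be confirmed against the `imports` line of the generated `config.toml` on the node, and `cozystack_k3s_containerd_config_version` and `cozystack_k3s_containerd_cri_plugin` exist only in the reviewer's suggestion, not in the PR as submitted.

```yaml
# group_vars/all.yml (hypothetical location)

# Assumption: drop-in directory imported by k3s for containerd 1.x.
cozystack_k3s_containerd_dropin_dir: /var/lib/rancher/k3s/agent/etc/containerd/config.toml.d

# containerd 1.x uses config version 2 and the legacy CRI plugin table
# named in the README. Both variables come from the review suggestion.
cozystack_k3s_containerd_config_version: 2
cozystack_k3s_containerd_cri_plugin: io.containerd.grpc.v1.cri
```

With the task as currently written in the PR, only the first variable has any effect, which is the gap the comment points out.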


- name: Ensure multipath drop-in directory exists
ansible.builtin.file:
path: /etc/multipath/conf.d
48 changes: 48 additions & 0 deletions examples/suse/prepare-suse.yml
@@ -117,6 +117,19 @@
        state: restarted
      failed_when: false # tolerated: same reason as the enable task below

    - name: Restart k3s to apply containerd config
      ansible.builtin.systemd:
        name: "{{ item }}"
        state: restarted
      loop:
        - k3s
        - k3s-agent
      # Only the unit matching this node's role exists; the other is
      # absent, and on the full-pipeline run prepare executes before
      # k3s is installed (the drop-in is then read at first k3s start).
      # failed_when: false tolerates both — a missing unit is not an error.
      failed_when: false
Comment on lines +120 to +131

high

Using failed_when: false on the k3s restart handler is dangerous because it silences legitimate failures (e.g., if k3s fails to start due to a syntax error in the newly written containerd config). A safer approach is to check if the k3s services are installed using ansible.builtin.stat in the tasks, and then conditionally restart them in the handler only if they exist. This allows the playbook to fail loudly if an installed k3s service fails to restart.

    - name: Restart k3s to apply containerd config
      ansible.builtin.systemd:
        name: "{{ item.item }}"
        state: restarted
      loop: "{{ k3s_services_stat.results | default([]) }}"
      when: item.stat.exists | default(false)


tasks:
- name: Create k3s_cluster group for k3s.orchestration
ansible.builtin.group_by:
@@ -183,6 +196,41 @@
| map(attribute='item')
| list }}

    # CDI (Containerized Data Importer) streams VM disk images into raw
    # block volumes from a NON-root importer pod. containerd only chowns
    # the block device to the pod's SecurityContext UID/GID when
    # device_ownership_from_security_context is enabled on the CRI
    # plugin, and k3s ships it disabled. Without it the importer dies
    # with "blockdev: cannot open /dev/cdi-block-volume: Permission
    # denied", the DataVolume hangs in ImportInProgress, and every VM
    # that references the disk stays Pending.
    #
    # The drop-in is merged by containerd on top of k3s's generated
    # config.toml via the config-v3.toml.d import glob — read at first
    # k3s start (full pipeline) or applied by the handler on re-runs
    # against a running cluster. config-v3.toml.d and
    # io.containerd.cri.v1.runtime are the containerd 2.x (config
    # version 3) paths shipped by current k3s; override
    # cozystack_k3s_containerd_dropin_dir for a containerd 1.x cluster.
    - name: Ensure k3s containerd config drop-in directory exists
      ansible.builtin.file:
        path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
        state: directory
        mode: "0755"
      when: cozystack_enable_kubevirt | default(true) | bool
Comment on lines +215 to +220

medium

To safely restart k3s only when it is actually installed, we can add a task to check if the systemd unit files exist using ansible.builtin.stat before ensuring the drop-in directory exists.

    - name: Check if k3s services are installed
      ansible.builtin.stat:
        path: "/etc/systemd/system/{{ item }}.service"
      loop:
        - k3s
        - k3s-agent
      register: k3s_services_stat
      when: cozystack_enable_kubevirt | default(true) | bool

    - name: Ensure k3s containerd config drop-in directory exists
      ansible.builtin.file:
        path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
        state: directory
        mode: "0755"
      when: cozystack_enable_kubevirt | default(true) | bool


    - name: Enable device_ownership_from_security_context for CDI block imports
      ansible.builtin.copy:
        dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
        mode: "0644"
        content: |
          version = 3

          [plugins.'io.containerd.cri.v1.runtime']
            device_ownership_from_security_context = true
      when: cozystack_enable_kubevirt | default(true) | bool
      notify: Restart k3s to apply containerd config
Comment on lines +222 to +232

medium

The README and CHANGELOG state that containerd 1.x clusters are supported by overriding the drop-in directory and adjusting the plugin table. However, the config version (version = 3) and plugin name (io.containerd.cri.v1.runtime) are currently hardcoded in the task. To make this actually configurable for containerd 1.x, we should use variables with sensible defaults.

    - name: Enable device_ownership_from_security_context for CDI block imports
      ansible.builtin.copy:
        dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
        mode: "0644"
        content: |
          version = {{ cozystack_k3s_containerd_config_version | default(3) }}

          [plugins.'{{ cozystack_k3s_containerd_cri_plugin | default("io.containerd.cri.v1.runtime") }}']
            device_ownership_from_security_context = true
      when: cozystack_enable_kubevirt | default(true) | bool
      notify: Restart k3s to apply containerd config
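
Whichever form the task takes, the written drop-in can be checked after a run. The following is a minimal sketch under stated assumptions: the play name, `hosts: all`, and the `dropin_dir` helper variable are hypothetical. It only confirms that the file on disk contains the option; it does not prove that containerd has loaded it, which still requires k3s to have been started or restarted after the file was written.

```yaml
---
- name: Check that the containerd drop-in is present on each node
  hosts: all        # assumption: illustrative target
  become: true
  vars:
    # Hypothetical helper variable mirroring the default used in the playbooks.
    dropin_dir: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
  tasks:
    - name: Read the drop-in from the node
      ansible.builtin.slurp:
        src: "{{ dropin_dir }}/10-cozystack-cri.toml"
      register: dropin_raw

    - name: Assert that the option is enabled in the drop-in
      ansible.builtin.assert:
        that:
          # slurp returns base64-encoded content, so decode before matching.
          - "'device_ownership_from_security_context = true' in (dropin_raw.content | b64decode)"
        fail_msg: "10-cozystack-cri.toml does not enable device_ownership_from_security_context"
```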


- name: Ensure multipath drop-in directory exists
ansible.builtin.file:
path: /etc/multipath/conf.d
48 changes: 48 additions & 0 deletions examples/ubuntu/prepare-ubuntu.yml
@@ -138,6 +138,19 @@
      # IS consulted downstream.)
      failed_when: false

    - name: Restart k3s to apply containerd config
      ansible.builtin.systemd:
        name: "{{ item }}"
        state: restarted
      loop:
        - k3s
        - k3s-agent
      # Only the unit matching this node's role exists; the other is
      # absent, and on the full-pipeline run prepare executes before
      # k3s is installed (the drop-in is then read at first k3s start).
      # failed_when: false tolerates both — a missing unit is not an error.
      failed_when: false
Comment on lines +141 to +152

high

Using failed_when: false on the k3s restart handler is dangerous because it silences legitimate failures (e.g., if k3s fails to start due to a syntax error in the newly written containerd config). A safer approach is to check if the k3s services are installed using ansible.builtin.stat in the tasks, and then conditionally restart them in the handler only if they exist. This allows the playbook to fail loudly if an installed k3s service fails to restart.

    - name: Restart k3s to apply containerd config
      ansible.builtin.systemd:
        name: "{{ item.item }}"
        state: restarted
      loop: "{{ k3s_services_stat.results | default([]) }}"
      when: item.stat.exists | default(false)


tasks:
- name: Create k3s_cluster group for k3s.orchestration
ansible.builtin.group_by:
@@ -229,6 +242,41 @@
| map(attribute='item')
| list }}

    # CDI (Containerized Data Importer) streams VM disk images into raw
    # block volumes from a NON-root importer pod. containerd only chowns
    # the block device to the pod's SecurityContext UID/GID when
    # device_ownership_from_security_context is enabled on the CRI
    # plugin, and k3s ships it disabled. Without it the importer dies
    # with "blockdev: cannot open /dev/cdi-block-volume: Permission
    # denied", the DataVolume hangs in ImportInProgress, and every VM
    # that references the disk stays Pending.
    #
    # The drop-in is merged by containerd on top of k3s's generated
    # config.toml via the config-v3.toml.d import glob — read at first
    # k3s start (full pipeline) or applied by the handler on re-runs
    # against a running cluster. config-v3.toml.d and
    # io.containerd.cri.v1.runtime are the containerd 2.x (config
    # version 3) paths shipped by current k3s; override
    # cozystack_k3s_containerd_dropin_dir for a containerd 1.x cluster.
    - name: Ensure k3s containerd config drop-in directory exists
      ansible.builtin.file:
        path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
        state: directory
        mode: "0755"
      when: cozystack_enable_kubevirt | default(true) | bool
Comment on lines +261 to +266

medium

To safely restart k3s only when it is actually installed, we can add a task to check if the systemd unit files exist using ansible.builtin.stat before ensuring the drop-in directory exists.

    - name: Check if k3s services are installed
      ansible.builtin.stat:
        path: "/etc/systemd/system/{{ item }}.service"
      loop:
        - k3s
        - k3s-agent
      register: k3s_services_stat
      when: cozystack_enable_kubevirt | default(true) | bool

    - name: Ensure k3s containerd config drop-in directory exists
      ansible.builtin.file:
        path: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}"
        state: directory
        mode: "0755"
      when: cozystack_enable_kubevirt | default(true) | bool


    - name: Enable device_ownership_from_security_context for CDI block imports
      ansible.builtin.copy:
        dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
        mode: "0644"
        content: |
          version = 3

          [plugins.'io.containerd.cri.v1.runtime']
            device_ownership_from_security_context = true
      when: cozystack_enable_kubevirt | default(true) | bool
      notify: Restart k3s to apply containerd config
Comment on lines +268 to +278

medium

The README and CHANGELOG state that containerd 1.x clusters are supported by overriding the drop-in directory and adjusting the plugin table. However, the config version (version = 3) and plugin name (io.containerd.cri.v1.runtime) are currently hardcoded in the task. To make this actually configurable for containerd 1.x, we should use variables with sensible defaults.

    - name: Enable device_ownership_from_security_context for CDI block imports
      ansible.builtin.copy:
        dest: "{{ cozystack_k3s_containerd_dropin_dir | default('/var/lib/rancher/k3s/agent/etc/containerd/config-v3.toml.d') }}/10-cozystack-cri.toml"
        mode: "0644"
        content: |
          version = {{ cozystack_k3s_containerd_config_version | default(3) }}

          [plugins.'{{ cozystack_k3s_containerd_cri_plugin | default("io.containerd.cri.v1.runtime") }}']
            device_ownership_from_security_context = true
      when: cozystack_enable_kubevirt | default(true) | bool
      notify: Restart k3s to apply containerd config


- name: Ensure multipath drop-in directory exists
ansible.builtin.file:
path: /etc/multipath/conf.d
2 changes: 1 addition & 1 deletion galaxy.yml
@@ -1,7 +1,7 @@
---
namespace: cozystack
name: installer
-version: 1.4.2
+version: 1.4.3
readme: README.md
authors:
- Aleksei Sviridkin <f@lex.la>