runtime: Add annotation-based block device mounting #40

Draft
antoine-gaillard wants to merge 5 commits into datadog from agaillard/block-mount-annotation

antoine-gaillard commented Jan 21, 2026

Problem: CSI block mode PVCs are passed as raw device nodes. Containers must mount them manually, requiring privileged: true or CAP_SYS_ADMIN. Additionally, new volumes without a snapshot (e.g. fresh EBS) arrive unformatted, and the guest rootfs doesn't ship mkfs/blkid.

Solution: New annotation instructs kata-agent to mount the device:

annotations:
  io.katacontainers.volume.block-mounts: '{"/dev/xvda": {"mount": "/data", "fstype": "ext4"}}'

How it works:

1. Block device hotplugged via existing volumeDevices path
2. Runtime checks host-side device with blkid — if unformatted, runs mkfs.<fstype> on the host before passing to the agent
3. Runtime parses annotation → creates grpc.Storage objects
4. Device removed from OCI spec (no raw device in container)
5. Bind mount added → container sees mounted filesystem
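Step 2 above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `runFunc`, `execRun` and `ensureFormatted` are hypothetical names, and the sketch assumes that `blkid -p` exits with status 2 when it finds no filesystem signature on the device.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runFunc runs a command and returns its exit code. It is injectable so the
// decision logic can be exercised without touching a real block device.
// (Hypothetical helper type, not part of the PR.)
type runFunc func(name string, args ...string) (exitCode int, err error)

// execRun is a runFunc backed by os/exec. A non-zero exit is reported through
// the exit code, not as an error; err is reserved for failures to run at all.
func execRun(name string, args ...string) (int, error) {
	err := exec.Command(name, args...).Run()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), nil
	}
	if err != nil {
		return -1, err
	}
	return 0, nil
}

// ensureFormatted probes hostDev and formats it with mkfs.<fstype> only when
// the probe finds no existing signature. It reports whether it formatted.
func ensureFormatted(run runFunc, hostDev, fstype string) (bool, error) {
	code, err := run("blkid", "-p", hostDev)
	if err != nil {
		return false, fmt.Errorf("probing %s: %w", hostDev, err)
	}
	switch code {
	case 0:
		// A signature was found: keep the existing data untouched.
		return false, nil
	case 2:
		// Assumed meaning: nothing identified, so treat as unformatted.
	default:
		return false, fmt.Errorf("blkid exited with %d for %s", code, hostDev)
	}

	code, err = run("mkfs."+fstype, hostDev)
	if err != nil {
		return false, fmt.Errorf("formatting %s: %w", hostDev, err)
	}
	if code != 0 {
		return false, fmt.Errorf("mkfs.%s exited with %d for %s", fstype, code, hostDev)
	}
	return true, nil
}
```

With a real device, a call would look like `ensureFormatted(execRun, hostPath, "ext4")`. Refusing to format on any unexpected probe result is a deliberately conservative choice here, since formatting a volume that already holds data is not recoverable.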

Options:

- mount: destination path (required)
- fstype: ext4, xfs (default: ext4)
- options: mount options (default: ["rw"])
- fsGroup: optional gid for ownership
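For illustration, the options above could be decoded and defaulted along these lines. This is a minimal sketch: the `blockMountConfig` type and `parseBlockMounts` function are hypothetical names, not taken from the PR; only the JSON keys and the defaults come from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// blockMountConfig mirrors the per-device options listed above.
// The Go type and field names are hypothetical; the JSON keys are the
// ones shown in the annotation example.
type blockMountConfig struct {
	Mount   string   `json:"mount"`
	FSType  string   `json:"fstype,omitempty"`
	Options []string `json:"options,omitempty"`
	FSGroup *int64   `json:"fsGroup,omitempty"` // nil when not set
}

// parseBlockMounts decodes the annotation value (a JSON object keyed by
// device path) and fills in the documented defaults.
func parseBlockMounts(raw string) (map[string]blockMountConfig, error) {
	mounts := map[string]blockMountConfig{}
	if err := json.Unmarshal([]byte(raw), &mounts); err != nil {
		return nil, fmt.Errorf("invalid block-mounts annotation: %w", err)
	}
	for dev, cfg := range mounts {
		if cfg.Mount == "" {
			return nil, fmt.Errorf("device %q: mount is required", dev)
		}
		if cfg.FSType == "" {
			cfg.FSType = "ext4" // documented default
		}
		if len(cfg.Options) == 0 {
			cfg.Options = []string{"rw"} // documented default
		}
		mounts[dev] = cfg
	}
	return mounts, nil
}
```

Calling `parseBlockMounts` with the annotation value from the example yields one entry for `/dev/xvda` with `Mount` set to `/data`, `FSType` set to `ext4`, and `Options` defaulted to `["rw"]`. Using a pointer for `FSGroup` lets the sketch tell "not set" apart from gid 0.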

Testing:

- 27 unit tests (parsing, validation, storage creation, host-side formatting)
- Live tested on cluster with CSI block PVC + fresh unformatted EBS and EBS with existing data

@antoine-gaillard antoine-gaillard force-pushed the agaillard/block-mount-annotation branch from 01bc551 to f6b741c Compare January 22, 2026 07:15
Comment thread src/runtime/virtcontainers/kata_agent.go Outdated

@zaymat zaymat left a comment

Overall lgtm. The logic to check that we're not mounting devices outside of what's defined in the pod spec seems ok.

Small nit on optimization.

Now let's do the Rust part 😁

var storages []*grpc.Storage
devicesToRemove := make(map[string]bool)

for devicePath, mountConfig := range blockMounts {

Small optimization: if you instead iterate over the container's devices and look up dev.ContainerPath in the blockMounts map, you visit each device only once.

With the current option, you have a (# device)*(# auto mount) complexity.
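The suggested single pass might look something like the sketch below. The `device` type and `matchBlockMounts` function are hypothetical stand-ins (only `ContainerPath` and `blockMounts` appear in this thread), and the map value is simplified to a string. The duplicate and unmatched-key checks are assumptions about how errors could be reported.

```go
package main

import (
	"fmt"
	"sort"
)

// device is a hypothetical stand-in for a container device entry.
type device struct {
	ContainerPath string
}

// matchBlockMounts walks the container's devices once and uses a map lookup
// to find those named in the annotation, so the cost is linear in the number
// of devices plus the number of annotation entries.
func matchBlockMounts(devices []device, blockMounts map[string]string) ([]device, error) {
	seen := make(map[string]bool, len(blockMounts))
	var matched []device

	for _, dev := range devices {
		if _, ok := blockMounts[dev.ContainerPath]; !ok {
			continue // not requested by the annotation
		}
		if seen[dev.ContainerPath] {
			return nil, fmt.Errorf("duplicate container device %q", dev.ContainerPath)
		}
		seen[dev.ContainerPath] = true
		matched = append(matched, dev)
	}

	// Any annotation key never seen among the devices is an error.
	var missing []string
	for path := range blockMounts {
		if !seen[path] {
			missing = append(missing, path)
		}
	}
	if len(missing) > 0 {
		sort.Strings(missing) // stable error text regardless of map order
		return nil, fmt.Errorf("annotation references devices not in the container: %v", missing)
	}
	return matched, nil
}
```

Tracking `seen` serves both checks at once: it catches a device listed twice and, after the loop, reveals annotation keys that matched nothing.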

antoine-gaillard and others added 5 commits March 30, 2026 09:37
Add support for mounting block devices (volumeDevices) as filesystems
inside the guest VM via annotation. This allows CSI block mode PVCs to
be automatically mounted by kata-agent, eliminating the need for
privileged containers.

Annotation format:
  io.katacontainers.volume.block-mounts: '{"<devicePath>": {"mount": "<path>", "fstype": "<fs>"}}'

Example:
  io.katacontainers.volume.block-mounts: '{"/dev/xvda": {"mount": "/data", "fstype": "ext4"}}'

Supported options:
- mount: destination path in container (required)
- fstype: filesystem type - ext4, xfs, btrfs (default: ext4)
- options: mount options array (default: ["rw"])
- fsGroup: optional gid for filesystem group ownership

Implementation:
- Delegates driver/source selection to handleBlockVolume() for DeviceBlock
  to ensure proper struct-based detection (e.g., blockDrive.Pmem takes
  precedence over config)
- Extracts PCIPath directly for VhostUserBlk devices
- removeDevicesFromOCISpec is a plain function (no receiver needed)
- Comprehensive test coverage including pmem device test proving struct-based
  detection works (nvdimm driver despite VirtioBlock config)

Fixes netkit_endpoint_test.go compilation by using proper constructors:
- Use PciPathFromString() instead of struct literal
- Use CcwDeviceFrom() instead of struct literal

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ount

Fresh ephemeral EBS volumes arrive without a filesystem. The kata-agent
fails with EINVAL when trying to mount an unformatted device, and the
guest rootfs does not ship mkfs or blkid. Format on the host side in
createAnnotationBlockStorages using the host device path from BlockDrive.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Iterate container devices once with map lookup instead of nested loops.
Detect duplicate container devices and report unmatched annotation keys
with device paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@antoine-gaillard antoine-gaillard force-pushed the agaillard/block-mount-annotation branch from a81694e to 4861c33 Compare March 30, 2026 13:37