Add dm-thin storage backend as alternative to ZFS by antiguru · Pull Request #14 · aljoscha/ember

antiguru · 2026-04-27T09:15:15Z

Adds Linux device-mapper thin provisioning as a third storage backend alongside ZFS and APFS.
The motivation is dropping the out-of-tree kernel module + dedicated-pool requirement that ZFS imposes, while keeping block-level copy-on-write semantics that fit Firecracker's raw-drive model.
Selected at init time via ember init --storage <zfs|dm-thin> and persisted on GlobalConfig.

The design lives in docs/DM-THIN-SPEC.md.
Random u64 thin ids are stored on the existing VmMetadata/ImageEntry/SnapshotEntry records; no separate allocator state file.
Concurrent invocations are race-free by construction since the kernel rejects duplicate ids atomically.

Surface area

ember init --storage dm-thin --storage-path <dir|dev> --size <SIZE> brings up a thin pool from sparse files (or a raw block device).
All existing vm, image, and snapshot commands route through the dm-thin path when that backend is active.
New: ember deinit [--purge] tears down the pool. New: ember storage grow --size <SIZE> grows the data device.
Storage is now Arc<dyn StorageBackend> so multiple Linux backends can share one binary; Vm and Network stay compile-time selected.

Phases (one commit per phase)

Trait-object dispatch refactor (no behavior change).
StorageKind enum + multi-backend init fields on GlobalConfig/InitConfig.
dm_thin CLI wrappers (pool, thin, loop_device, tools).
StorageBackend trait reshape: VolumeHandle return type for creators; methods that consume an existing volume take &VmMetadata/&ImageEntry.
DmThinStorage impl.
thin_id + snapshots fields on metadata records (skip-serializing for backends that don't use them).
--storage CLI flag + init dispatch.
ember deinit + ember storage grow.
Integration smoke tests for init/deinit/grow (gated #[ignore]).

15 new unit tests in the dm-thin module pass under cargo test.
The integration tests need root + the dm-thin kernel module, run them with sudo cargo test --test dm_thin -- --ignored --test-threads=1.

🤖 Generated with Claude Code

Foundation for runtime backend selection (ZFS, btrfs, dm-thin) on Linux. The trait was already object-safe; this adds create_storage and init_storage factories on each platform crate, makes Storage a runtime trait object, and updates the 14 CLI call sites. No behavior change. Linux still always returns LinuxStorage (ZFS), macOS always MacosStorage. Concrete dispatch lands when StorageKind plumbing arrives in Phase 2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

GlobalConfig gains storage_backend (defaults Zfs for backwards compat), storage_path, and dm_thin_block_size. InitConfig gains storage_path, btrfs_size, dm_thin_size, dm_thin_metadata_size, dm_thin_block_size. No CLI surface change yet — ember init still produces a Zfs config. The fields are wired so the dm-thin and btrfs backends can read them without further config-shape churn. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Design doc for adding Linux device-mapper thin provisioning as a third storage backend. Drops the ZFS kernel-module dependency and the dedicated-pool requirement while keeping block-level CoW (still a tight fit with Firecracker raw drives). Mirrors BTRFS-SPEC structure: trait-object dispatch, file-backed default, sparse metadata + data files, random u64 thin ids stored on existing VmMetadata/ImageEntry records (no separate allocator file). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pure shell-out wrappers for the operations the dm-thin storage backend will need: * dm_thin::pool — thin-pool target lifecycle and dmsetup status parsing. * dm_thin::thin — thin volume operations including the random-u64 allocator with retry-on-EEXIST collision handling. * dm_thin::loop_device — losetup attach/detach/refresh helpers for file-backed pools. * dm_thin::tools — thin_check/thin_repair/thin_metadata_size/thin_dump. No StorageBackend impl yet; that lands in Phase 4. The wrappers come with 10 unit tests covering parser behavior, table formatting, id allocation invariants, and dm name sanitization. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

VmMetadata grows thin_id: Option<u64> and snapshots: Vec<SnapshotEntry>; ImageEntry grows thin_id: Option<u64>. ZFS and macOS leave them at their defaults — Option::is_none / Vec::is_empty cause serde to skip them entirely in vm.json/registry.json, preserving on-disk format. new_entry and new_build_entry gain a thin_id parameter so the dm-thin import pipeline can record the base snapshot id as it returns from the storage backend. SnapshotEntry is a new record type. ZFS keeps using zfs::snapshot::list; dm-thin will populate VmMetadata.snapshots since the kernel doesn't attach names to thin ids. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three changes that together let dm-thin (and any future) backend return its own thin id from creator methods and read backend-specific state off VmMetadata/ImageEntry from consumer methods: * New VolumeHandle struct returned by create_image_volume, clone_for_vm, clone_vm_storage, and restore_snapshot. Carries the disk path plus an optional thin id for backends that need one. * snapshot returns Option<SnapshotEntry>: Some when the backend persists snapshot metadata in vm.json (dm-thin), None when it tracks snapshots itself (ZFS in the kernel, APFS as files). * Methods consuming an existing volume now take &VmMetadata or &ImageEntry instead of &str. ZFS and APFS impls just use .name; the dm-thin impl will reach for .thin_id. CLI flow updates: a pending_metadata helper in src/cli/vm.rs builds a placeholder VmMetadata immediately after clone, so resize and inject calls can run before the full metadata record is constructed. The image registry import path threads VolumeHandle.thin_id through new_entry/new_build_entry. snapshot::restore persists thin_id + disk_path changes returned by the backend on dm-thin. No new backend yet — this commit just teaches the trait + impls and their callers the new shape. ZFS and APFS continue to behave identically. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The new DmThinStorage type implements all StorageBackend methods on top of the dm_thin/ wrappers added in the previous phase. Key flows: * init: sparse metadata + data files, losetup, dmsetup create thin-pool. Honors --storage-path (file or block device), --size, --metadata-size, --block-size; computes metadata size via thin_metadata_size when not pinned. * create_image_volume: stages the ext4 dd onto a fresh thin id, snaps it as the immutable base, drops the staging device. Returns the base id as VolumeHandle.thin_id. * clone_for_vm + clone_vm_storage: kernel create_snap from the source thin id; activate the new device. * snapshot/delete_snapshot/list_snapshots: track entries in VmMetadata.snapshots since dm-thin doesn't carry snapshot names. * restore_snapshot: delete the live thin id, snap from the snapshot, re-activate. Returns the new id so the CLI persists it on VmMetadata.thin_id. * resize: dmsetup load with the new sector count + e2fsck + resize2fs. create_storage and init_storage in ember-linux/lib.rs gain a StorageKind dispatch so callers wire the right impl from GlobalConfig.storage_backend / InitConfig.storage_backend. btrfs is not yet implemented and falls back to ZFS for now. Includes 5 unit tests covering size parsing, byte formatting, and ISO 8601 parsing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ember init grows --storage (zfs|dm-thin), --storage-path, --size, --metadata-size, and --block-size flags. The flag values flow into both InitConfig (so init_storage dispatches to DmThinStorage::init) and the persisted GlobalConfig. Implements StorageKind: FromStr so clap can parse the --storage value without making ember-core depend on clap. Init refuses to switch backends silently — the user must run ember deinit first if the existing config picks a different backend. dm-thin --storage-path defaults to /var/lib/ember/dm-thin when omitted, so a bare `ember init --storage dm-thin --size 50G` is enough to bring up a file-backed pool. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two new admin commands: * ember deinit [--purge] tears down the storage backend (inverse of ember init). Refuses to run while any VM is registered. Removes the persisted config last so the backend can still locate its backing paths during teardown. --purge deletes file-backed pool images (dm-thin metadata.img/data.img); raw block devices are always preserved. * ember storage grow --size <SIZE> grows the dm-thin data device: truncates the sparse data file, refreshes the loop device, and reloads the pool table with the new sector count. StorageBackend trait grows two methods: * deinit(purge) — backend-specific teardown. ZFS calls zpool destroy. macOS removes images/vms dirs when --purge. dm-thin tears down every ember-managed thin device, removes the pool, detaches loops, and conditionally deletes the backing files. * grow(new_size) — only meaningful for dm-thin file-backed pools. ZFS and macOS return a clear error directing the user elsewhere. A helper pool::list_with_prefix iterates dmsetup ls output to find ember-img-* / ember-vm-* devices for cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three #[ignore]'d integration tests that exercise the dm-thin path end-to-end via the real ember CLI: * init + deinit round-trip (verifies pool comes up, backing files appear, and --purge tears everything down). * refusal to switch backends silently (init dm-thin, then init zfs must fail with a clear message). * storage grow (init at 200M, grow to 400M, verify data.img size). Run them with: sudo cargo test --test dm_thin -- --ignored --test-threads=1 CLAUDE.md updated to advertise the dm-thin path: build/run examples, backend selection notes, runtime tool deps. Architecture section reflects that Storage is now a runtime trait object so multiple backends can coexist on Linux. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The kernel enforces dev_id <= (1 << 24) - 1 in drivers/md/dm-thin.c::read_dev_id; wider values are rejected with EINVAL ("Message received with invalid device id: ..." in dmesg). Earlier code generated full u64 ids and ran into this on the first create_thin during image pull. fresh_thin_id now masks the random value to 24 bits before returning. The on-disk type stays u64 so the format is forward-compatible if the kernel ever lifts the cap. Collision probability at ember scale (hundreds of volumes) is small; the retry-on-EEXIST loop already handles the rare case. Spec doc updated to match. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The earlier wording said the kernel accepted full u64 ids; reading drivers/md/dm-thin.c more carefully shows MAX_DEV_ID = (1 << 24) - 1 is enforced in read_dev_id. Updates the rationale and the comparison table to match the implementation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

LinuxVm::start unconditionally prepended /dev/zvol/ to vm.disk_path, producing /dev/zvol//dev/mapper/ember-vm-... when the dm-thin backend recorded an absolute /dev/mapper path on VmMetadata. Firecracker then failed with ENOENT on the bogus path. ZFS dataset names cannot start with '/' (pool names must begin with a letter), so the leading slash is a safe discriminator: pass absolute paths through, prepend /dev/zvol/ only for relative ones. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Replaces the inline `if disk_path.starts_with('/')` branch in LinuxVm::start with a call to `create_storage(config).disk_device_path(vm)`, so the storage backend stays the single source of truth for how a recorded disk_path maps to the actual device a hypervisor talks to. ZFS keeps prepending /dev/zvol/, dm-thin returns the /dev/mapper/... path it activated, macOS returns the rootfs.img path. No more ad-hoc dispatch in the VM layer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

aljoscha · 2026-04-28T13:34:32Z

looks like there's formatting issues

No semantic changes. Removes a stray no-op `const _: () = ();` that was left in storage.rs to suppress an import-block reorder; rustfmt doesn't care about that ordering, so the marker is just noise. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

aljoscha

Did a read-through of the design and the dm-thin backend code. The architectural shape looks good — Arc<dyn StorageBackend> is a clean enabler, the trait reshape (VolumeHandle, passing &VmMetadata/&ImageEntry for ops that need backend state, optional SnapshotEntry return) is the right surface, and the random-id-with-kernel-rejection design avoids inventing yet another allocator. Forks being independent (cleanup_fork and storage_dependents no-ops) is a nice property that brings Linux into parity with the planned btrfs path.

A few things worth addressing before un-drafting:

1. `disk_device_path` doesn't activate — `vm start` likely broken after reboot

crates/ember-linux/src/dm_thin_storage.rs:1960-1962:

fn disk_device_path(&self, vm: &VmMetadata) -> PathBuf {
    thin::vm_device_path(&vm.name)  // just builds "/dev/mapper/ember-vm-<name>"
}

crates/ember-linux/src/vm.rs:54 (LinuxVm::start) now uses this path to build the Firecracker drive config. dm-thin tables live only in kernel memory, so after a reboot the pool is gone and no ember-vm-* device exists. Nothing in the start path calls ensure_pool_active or ensure_thin_active.

The spec explicitly calls this out ("Per-VM and per-image volumes are activated lazily by methods that need them (e.g. disk_device_path, mount, start)") but the implementation doesn't honor it. resize does call ensure_thin_active (line 1922); disk_device_path and mount don't. Suggest having disk_device_path return Result<PathBuf> (or split off a prepare_for_use method) and route both LinuxVm::start and mount through it.

2. `restore_snapshot` window between `delete(vm_id)` and `allocate_snap(snap_id)`

dm_thin_storage.rs:1872-1876:

thin::deactivate(&dm_name)?;
thin::delete(pool::POOL_NAME, vm_id)?;          // (a) old vm_id gone
let new_id = thin::allocate_snap(pool::POOL_NAME, snap_id)?;  // (b) might fail
let disk_path = thin::activate(&dm_name, pool::POOL_NAME, new_id, size_sectors)?;

If (b) fails (transient dmsetup error, OOM, etc.), the VM is left with vm.thin_id pointing at a freed id and no live volume — and the persisted state on vm.json only gets updated after the function returns successfully. Consider: allocate the new snap from snap_id first, then delete the old vm_id, then activate. That keeps a valid id live throughout the window.

3. `parse_iso8601` reinvents civil-date arithmetic

dm_thin_storage.rs:2275-2297 rolls a Howard Hinnant-style epoch conversion in 22 lines, just so list_snapshots can populate created_at: u64. Either use a date crate or have SnapshotInfo::created_at stay a String — nothing else in the codebase consumes it as seconds-since-epoch.

4. `is_already_exists` does substring match on stderr

crates/ember-linux/src/dm_thin.rs:30:

matches!(err, Error::Command { stderr, .. } if stderr.contains("File exists"))

This is the entire concurrency story for create_thin/create_snap collision retry. A locale-dependent or kernel-version-changed error string silently turns retries into hard failures. Not a blocker (dmsetup strings have been stable for years), but worth a comment pinning the kernel version this was observed on, or a regression test that runs dmsetup message against a known-duplicate id and asserts the predicate fires.

5. `parse_size` is reinvented; `ByteSize` already exists

dm_thin_storage.rs:2238-2255 re-parses size specs (50G, 200M) — but ember_core::config::size::ByteSize already does this and is used on the storage grow --size arg (src/cli/storage.rs:13). The init path could parse dm_thin_size / dm_thin_metadata_size as ByteSize in InitConfig directly rather than carrying them as Option<String> and re-parsing on the backend side.

6. Do we have all the data to reactivate the pool after a restart?

Mostly yes, but the encoding is fragile.

ensure_pool_active needs: metadata path, data path, block size (must match creation-time value — kernel rejects mismatches), data sectors (re-read each time). GlobalConfig persists storage_path and dm_thin_block_size. Three concerns:

(a) Block size is stored as the user's flag, not the resolved value. src/cli/init.rs writes dm_thin_block_size: args.block_size (which is None when the user omits --block-size). Both init and new then resolve via unwrap_or(pool::DEFAULT_BLOCK_SIZE_SECTORS). Same value today. But block size is permanent at pool creation — if DEFAULT_BLOCK_SIZE_SECTORS ever changes, every existing pool becomes unimportable on the next ember upgrade. Fix: at init time, resolve the value first and persist Some(actual_block_size) unconditionally.

(b) File-vs-device mode isn't persisted, it's derived at runtime. metadata_file() / data_file() (lines 1566-1582) decide between <dir>/metadata.img and storage_path.with_file_name("dm-thin-metadata.img") based on a live is_dir() check on storage_path. Mostly harmless, but reactivation depends on the filesystem still answering the same way init did. An explicit dm_thin_mode: File | RawDevice field would make the contract clearer.

(c) Raw-device metadata path looks broken. For --storage-path /dev/sdb:

storage_path.with_file_name("dm-thin-metadata.img")
// → "/dev/dm-thin-metadata.img"

with_file_name replaces the last component, so metadata lands in /dev/ — which is tmpfs on most distros and will vanish on reboot. The spec says metadata in device-backed mode should live on the state directory's filesystem (the "embedded metadata mode" paragraph in DM-THIN-SPEC.md), but the code puts it next to the device. So device-backed mode probably can't survive a reboot at all today, while file-backed mode can (assuming #1 above is also fixed).

Things that would be worth persisting to make pool config fully self-describing: resolved block size (not the user flag), mode (file vs raw-device), optionally data sectors at init time so reactivation can detect "user shrank the backing file" and refuse rather than silently importing.

The thin ids themselves — the real persistence question for individual VMs — are stored on VmMetadata.thin_id / ImageEntry.thin_id / SnapshotEntry.thin_id, with the kernel pool metadata as source of truth (queryable via thin_dump). That part of the design is sound. The gap is purely on the pool table parameters.

* `disk_device_path` returns `Result<PathBuf>` and lazily activates pool + thin device so `vm start` works after a host reboot. Plumbed through `LinuxVm::start` and CLI call sites; `pending_metadata` now carries `disk_size_gib` for the activation size. * Raw-device metadata path moves from `storage_path.with_file_name(...)` (lands in `/dev/`, tmpfs) to `state_dir/dm-thin-metadata.img` so the pool survives reboot. * `restore_snapshot` allocates the replacement thin id first, then swaps the dm-mapper slot under a guard that frees the new id on any later failure. Old order orphaned `vm.thin_id` on transient errors. * `InitConfig.dm_thin_size`/`dm_thin_metadata_size` switched to `Option<ByteSize>`; dropped the reinvented `parse_size` helper and tests. * Dropped 22-line `parse_iso8601`. `SnapshotEntry::created_at` is now `u64` epoch seconds; added `now_epoch_secs()` helper in core. * Resolve `dm_thin_block_size` at init time and persist `Some(actual)` on `GlobalConfig` so future default changes don't orphan existing pools. * New `DmThinMode { File, RawDevice }` enum, resolved at init from `storage_path` and persisted on `GlobalConfig`/`InitConfig`. Reactivation no longer depends on a live `is_dir()` probe. * Documented kernel/lvm2/glibc context for `is_already_exists` and added regression tests pinning the `"File exists"` strerror against an actual `dmsetup` failure line. Spec updated for `dm_thin_mode` and `ByteSize` init fields. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

antiguru · 2026-04-29T12:47:37Z

Addressed your feedback, thank you! Should be good for another look.

aljoscha

Re-read the latest commit. All eight items from the previous round are addressed cleanly — disk_device_path lazily activates, restore_snapshot allocates first then swaps under a guard, parse_iso8601/parse_size are gone, is_already_exists has a regression test pinned to a real dmsetup failure line, raw-device metadata moved off /dev/, and DmThinMode/dm_thin_block_size are resolved at init and persisted on GlobalConfig. Architectural reshape (Arc<dyn StorageBackend>, VolumeHandle, methods taking &VmMetadata/&ImageEntry, optional SnapshotEntry return) reads well and is not gratuitously dm-thin-specific. Forks being independent is a real ergonomic win over the ZFS path.

A few things I'd still like to see before merge:

1. `destroy_vm_storage` leaks snapshot thin ids — real bug

crates/ember-linux/src/dm_thin_storage.rs:483-495:

fn destroy_vm_storage(&self, vm: &VmMetadata) -> Result<()> {
    // …
    if let Some(id) = vm.thin_id {
        let _ = thin::delete(pool::POOL_NAME, id);
    }
    Ok(())
}

If the VM has snapshots in vm.snapshots, only vm.thin_id is freed — every SnapshotEntry.thin_id stays alive in the pool metadata after vm delete. The vm.json record is gone (so ember loses the user-level handle), but the kernel still holds them. They consume metadata + any unique blocks, and only thin_dump can find them again.

Spec analogue: ZFS uses zfs destroy -r which is recursive over snapshots; the dm-thin spec at "VM destroy" only mentions deleting the VM's own thin id, missing this. Fix: iterate vm.snapshots and thin::delete(pool, snap.thin_id) for each (best-effort, before deleting the VM's own id).

2. `LinuxPlatform::inspect_*` still hardcodes "ZFS zvol"

crates/ember-linux/src/platform.rs:47-71:

fn inspect_vm_extra(metadata: &VmMetadata) -> Vec<(&'static str, String)> {
    let mut extra = vec![
        ("ZFS zvol", metadata.disk_path.clone()),
        // …

For a dm-thin VM, metadata.disk_path is /dev/mapper/ember-vm-myvm and the row prints ZFS zvol /dev/mapper/ember-vm-myvm — wrong label. Same in inspect_image_extra and info_extra ("ZFS pool", "Dataset" shown unconditionally). The dm-thin spec at lines 622-627 explicitly calls out that this needs to branch on config.storage_backend and add a "Thin id" row. The display layer didn't get the multi-backend update; everything else did.

3. `lib.rs::create_storage` btrfs branch silently falls back to ZFS

crates/ember-linux/src/lib.rs:32-36:

match config.storage_backend {
    StorageKind::Zfs => Arc::new(LinuxStorage::new(config)),
    StorageKind::DmThin => Arc::new(DmThinStorage::new(config)),
    StorageKind::Btrfs => Arc::new(LinuxStorage::new(config)),
}

init_storage properly errors out for btrfs ("not yet implemented"), but create_storage silently hands back a ZFS backend. If somebody hand-edits config.json to "storage_backend": "btrfs", every operation will go through ZFS code with garbage inputs. Should be unreachable!() or an explicit error — same shape as init_storage.

4. Pool-status check from the spec isn't wired up

The dm_thin/pool.rs module fully implements PoolStatus parsing — out_of_data_space, Failed, etc. — and the spec at line 333 says ember should "refuse VM create/start and prints an actionable error" when the pool is full. But nothing in dm_thin_storage.rs actually calls pool::status(...). The user gets EIO from dd mid-clone instead of a clean refuse. Worth at least gating create_image_volume/clone_for_vm/resize on mode != OutOfDataSpace.

5. Stale staging device on retry after a failed image pull

dm_thin_storage.rs:282-341 create_image_volume doesn't check whether ember-img-<name>-staging already exists from a previous failed run. The next dmsetup create for that name will fail with EEXIST and the leftover staging device + thin id stay until somebody manually cleans up. Cheap fix: at the top of create_image_volume, if pool::exists(&staging_dm) { thin::deactivate(&staging_dm); }.

6. Test crate noise

tests/dm_thin.rs does mod common; but only uses common::ember(...) — that pulls the entire tests/common/{mod,linux}.rs surface into the dm_thin test crate, where 35 helpers ring up as function … is never used. cargo clippy --all-targets ends with a 36-warning block, all from this test. Easiest fix: #[allow(dead_code)] on the common module, or split the helpers used by dm_thin into a smaller submodule.

Smaller things

--block-size <sectors> is unusual UX (the help says "Defaults to 128 (= 64 KiB)"). Consider taking a ByteSize like elsewhere — same pattern as --size/--metadata-size.
ensure_pool_active runs thin_check on every first command after reboot. For large pools this can take seconds. Worth a one-line note in the spec or a --skip-thin-check escape hatch.
pool::exists() is used to test arbitrary dm-mapper devices (e.g. pool::exists(&dm_name) for a thin volume in ensure_thin_active). The name is misleading — it's really "does this dm device exist". Consider renaming or moving it to module level.

…ilure Three related changes for robustness around dm-thin pool setup: * `pool::ensure_target_loaded()` runs `modprobe dm-thin-pool` and verifies the `thin-pool` target is registered via `dmsetup targets`, replacing the kernel's opaque "Invalid argument" with an actionable error pointing at CONFIG_DM_THIN_PROVISIONING. Called from both `init` and `ensure_pool_active`. * `init` now detaches loop devices on every failure path between attaching them and successful `pool::create`, so a failed init no longer leaves loops bound to backing files that get unlinked with the surrounding tempdir. * New `DmThinCleanup` RAII guard runs `ember deinit --purge` on drop, installed at the top of every dm_thin integration test so panics partway through still tear down the pool, loops, and backing files.

* destroy_vm_storage now frees every snapshot's thin id before the vm's own id. Previously snapshots stayed pinned in pool metadata after vm delete with no user-level handle to reach them. * create_image_volume tears down a stale ember-img-<name>-staging device left over from a previous failed run, otherwise EEXIST blocked retries. * create_storage panics on the StorageKind::Btrfs arm rather than silently routing through the ZFS backend with garbage inputs; matches init_storage's shape. * Added assert_pool_healthy() and gated create_image_volume, clone_for_vm, clone_vm_storage, snapshot, restore_snapshot, resize on it. OutOfDataSpace / Failed / ReadOnly pools surface an actionable error instead of an opaque mid-`dd` EIO. grow and the destroy paths stay ungated. * LinuxPlatform::inspect_vm_extra / inspect_image_extra label the disk row "Thin device" + "Thin id" when the metadata carries a thin_id, "ZFS zvol" otherwise. info_extra branches on storage_backend so dm-thin shows storage path / block size / mode instead of pool/dataset. * tests/dm_thin.rs gets #[allow(dead_code)] on `mod common;` so shared test helpers it doesn't use don't generate 36 dead-code warnings. * Spec note: ensure_pool_active runs thin_check on the metadata device on first command after a reboot — proportional to pool occupancy, intentional, and skippable by activating the pool manually. * Renamed dm-mapper device existence check from pool::exists to dm_thin::dm_device_exists; it was used to probe pool, thin, and staging devices indiscriminately, so the pool-scoped name was misleading. * --block-size now takes a ByteSize (e.g. `64K`, `1M`) like --size / --metadata-size; converted to sectors internally with validation for dm-thin's 64 KiB-multiple constraint. New Error::Pool variant for storage-pool-level errors. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

antiguru · 2026-04-30T11:28:46Z

Pushed 56b82f0. All items from the latest review addressed:

Add CI workflow for build, clippy, and format checks #1 destroy_vm_storage now frees every snapshot's thin id before the VM's own.
Add macOS support specification and implementation roadmap #2 inspect_vm_extra/inspect_image_extra label "Thin device"+"Thin id" when thin_id is set; info_extra branches on storage_backend.
macos: add support for macos, using builtin Apple Virtualization and APFS cow #3 create_storage panics on StorageKind::Btrfs instead of silently routing to ZFS.
docs: add full-scope implementation guideline to CLAUDE.md #4 New assert_pool_healthy() gate on create_image_volume / clone_for_vm / clone_vm_storage / snapshot / restore_snapshot / resize (skips grow and destroy paths). New Error::Pool variant.
feat(sec-257): vsock support + pool commands #5 create_image_volume tears down a stale ember-img-<name>-staging left over from a prior failed run.
feat(sec-254): vsock CID allocation and end-to-end tests #6 #[allow(dead_code)] on mod common; in tests/dm_thin.rs.

Smaller bits:

Spec note on thin_check cost on first post-reboot command.
Renamed pool::exists → dm_thin::dm_device_exists.
--block-size now takes a ByteSize like --size/--metadata-size, validated to be a multiple of 64 KiB.

The dm-thin arm of LinuxPlatform::info_extra was carrying a string literal for the pool name, while the rest of the codebase reaches for the pool::POOL_NAME constant. Pull the constant in here too so the display row stays in sync if the pool name ever changes.

`tests/resize.rs:214` had `1 * 1024 * 1024 * 1024` to spell out "1 GiB". Clippy's identity_op fires on the leading `1 *` and `cargo clippy --all-targets -- -D warnings` rejects it. Drop the `1 *` to keep the intent obvious without the noise.

The Makefile's `clippy` target — wired into the CI workflow — only covered the default lib/bin targets, so warnings inside `tests/` slipped through (e.g. an identity_op in tests/resize.rs that only showed up when running `cargo clippy --all-targets` locally). Add `--all-targets` so integration tests are linted too.

antiguru and others added 10 commits April 27, 2026 09:49

antiguru marked this pull request as draft April 27, 2026 09:15

antiguru and others added 4 commits April 27, 2026 11:39

aljoscha reviewed Apr 29, 2026

View reviewed changes

antiguru marked this pull request as ready for review April 29, 2026 12:46

aljoscha reviewed Apr 30, 2026

View reviewed changes

aljoscha and others added 2 commits April 30, 2026 12:03

aljoscha added 3 commits April 30, 2026 13:48

aljoscha merged commit c6063fa into aljoscha:main Apr 30, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add dm-thin storage backend as alternative to ZFS#14

Add dm-thin storage backend as alternative to ZFS#14
aljoscha merged 21 commits into
aljoscha:mainfrom
antiguru:dm-thin

antiguru commented Apr 27, 2026

Uh oh!

aljoscha commented Apr 28, 2026

Uh oh!

aljoscha left a comment

Uh oh!

antiguru commented Apr 29, 2026

Uh oh!

aljoscha left a comment •

edited

Loading

Uh oh!

antiguru commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

antiguru commented Apr 27, 2026

Surface area

Phases (one commit per phase)

Uh oh!

aljoscha commented Apr 28, 2026

Uh oh!

aljoscha left a comment

Choose a reason for hiding this comment

1. disk_device_path doesn't activate — vm start likely broken after reboot

2. restore_snapshot window between delete(vm_id) and allocate_snap(snap_id)

3. parse_iso8601 reinvents civil-date arithmetic

4. is_already_exists does substring match on stderr

5. parse_size is reinvented; ByteSize already exists

6. Do we have all the data to reactivate the pool after a restart?

Uh oh!

antiguru commented Apr 29, 2026

Uh oh!

aljoscha left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

1. destroy_vm_storage leaks snapshot thin ids — real bug

2. LinuxPlatform::inspect_* still hardcodes "ZFS zvol"

3. lib.rs::create_storage btrfs branch silently falls back to ZFS

4. Pool-status check from the spec isn't wired up

5. Stale staging device on retry after a failed image pull

6. Test crate noise

Smaller things

Uh oh!

antiguru commented Apr 30, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

1. `disk_device_path` doesn't activate — `vm start` likely broken after reboot

2. `restore_snapshot` window between `delete(vm_id)` and `allocate_snap(snap_id)`

3. `parse_iso8601` reinvents civil-date arithmetic

4. `is_already_exists` does substring match on stderr

5. `parse_size` is reinvented; `ByteSize` already exists

aljoscha left a comment •

edited

Loading

1. `destroy_vm_storage` leaks snapshot thin ids — real bug

2. `LinuxPlatform::inspect_*` still hardcodes "ZFS zvol"

3. `lib.rs::create_storage` btrfs branch silently falls back to ZFS