Skip to content

fix(storage): drop invalid lvcreate --kernel flag breaking LVM-thin clone/restore (BUG-043) + n-rst harness (BUG-044)#158

Merged
Andrei Kvapil (kvaps) merged 2 commits into
mainfrom
fix/bug-043-lvm-clone-kernel-flag
Jun 14, 2026
Merged

fix(storage): drop invalid lvcreate --kernel flag breaking LVM-thin clone/restore (BUG-043) + n-rst harness (BUG-044)#158
Andrei Kvapil (kvaps) merged 2 commits into
mainfrom
fix/bug-043-lvm-clone-kernel-flag

Conversation

@kvaps

@kvaps Andrei Kvapil (kvaps) commented Jun 13, 2026

Copy link
Copy Markdown
Member

Two release-gate sweep findings. Neither is a regression from the prior release (both pre-existing); both are low-risk fixes that improve on v0.1.11.

BUG-043 (product): LVM-thin clone/restore never converged — invalid lvcreate flag

pkg/storage/lvm/lvm_thin.go RestoreVolumeFromSnapshot built lvcreate --snapshot ... with a --kernel token, which is not a valid lvcreate option (lvcreate: unrecognized option '--kernel', exit 3 on LVM ≥2.03). The destination backing LV was never created → DRBD never came up → clone/snapshot-restore onto LVM-thin pools never reached UpToDate. Shared seed path for both clone and snapshot-restore on LVM-thin (LUKS cells default to POOL=lvm-thin and hit it too); ZFS pools use a separate zfs clone path and were unaffected. Pre-existing (introduced before v0.1.11; lvm_thin.go byte-identical to v0.1.11), surfaced now because the new clone data plane + the gate's LVM-thin coverage exercise it.

Fix: remove the --kernel token; the remaining lvcreate --snapshot --setactivationskip n --activate y --name <tgt> <vg>/<src> is the correct thin snapshot/clone form. L1 regression pins the generated argv (no --kernel, expected thin-snapshot shape).

BUG-044 (harness): n-rst-recreates-tiebreaker read the wrong JSON key

The cell's final wire-view check used .rsc_flags (nonexistent; the field is .flags), so it always counted 0 TIE_BREAKER even though the product correctly recreates the witness ~1s after n rst (verified). Changed .rsc_flags.flags.

Stand validation (stand big, 18 runs, all PASS)

lvm-thin: luks-clone-encrypted 3/3, luks-snapshot-restore-encrypted 3/3, rd-clone-vd-data-plane 3/3 — all previously FAILED, now converge with real bytes. ZFS no-regression: rd-clone-vd-data-plane (zfs-thin) 3/3. Broader: r-full-lifecycle 3/3. BUG-044: n-rst-recreates-tiebreaker 3/3. go test ./pkg/storage/... + golangci-lint clean.

Summary by CodeRabbit

  • Bug Fixes

    • Fixed LVM thin snapshot restoration behavior to prevent failures on newer LVM versions by adjusting activation parameters.
  • Tests

    • Added regression test for thin volume snapshot restoration to ensure compatibility with newer LVM CLI versions.
    • Updated end-to-end test scripts to use correct JSON field names for resource flag parsing and validation.

Andrei Kvapil (kvaps) and others added 2 commits June 13, 2026 23:52
…/restore (BUG-043)

Thin.RestoreVolumeFromSnapshot built an `lvcreate --snapshot ...`
command that included a `--kernel` token. That is not a valid lvcreate
option (`lvcreate: unrecognized option '--kernel'`, exit 3) on
LVM >= 2.03, so the destination backing LV was never created. As a
result DRBD never came up and clone / snapshot-restore onto LVM-thin
pools never converged to UpToDate. This is the shared seed path for
both clone and snapshot-restore on LVM-thin (LUKS cells default to
POOL=lvm-thin and hit it too); ZFS pools use a separate `zfs clone`
path and were unaffected.

Remove the `--kernel` token. The remaining args form a correct thin
snapshot/clone command — `lvcreate --snapshot --setactivationskip n
--activate y --name <tgt> <vg>/<src-snap>` — matching the normal
CreateSnapshot shape plus the activation flags needed to make the
cloned LV usable. The stale doc-comment is corrected accordingly.

Add a unit regression pinning the generated lvcreate argv for the
thin restore/clone path: it asserts the expected thin-snapshot form
and that no lvcreate invocation carries `--kernel`.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
…x (BUG-044)

n-rst-recreates-tiebreaker and several sibling cli-matrix cells read
the resource-level flags from the `linstor resource list
--machine-readable` output via a `.rsc_flags` jq key that does not
exist on the blockstor wire — the field is `.flags`
(pkg/api/v1/resource.go: `Flags []string json:"flags"`). `.rsc_flags //
[]` therefore always evaluated to an empty array, so every flag check
silently counted zero matches: n-rst saw 0 TIE_BREAKER even though the
witness is correctly recreated, and the tiebreaker-collapse / reap /
diskless-selection assertions in the sibling cells degenerated to
no-ops or wrong-row selections.

Switch every `.rsc_flags` reference to `.flags`, mirroring the working
sibling cells (r-c-over-tiebreaker-skip-sync.sh, r-l-conns-shapes.sh)
that already read `.flags` from the same machine-readable resource
list. The product is correct; this is a harness-only fix.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 13, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Fixed LVM thin snapshot restoration to use proper activation flags (--setactivationskip n --activate y) instead of deprecated --kernel, added regression test, and updated E2E test scripts to read replica flags from the .flags JSON field instead of .rsc_flags in LINSTOR resource output.

Changes

LVM snapshot restoration and E2E script JSON field migration

Layer / File(s) Summary
LVM thin snapshot restore activation fix
pkg/storage/lvm/lvm_thin.go, pkg/storage/lvm/lvm_thin_test.go
Removed deprecated --kernel flag from RestoreVolumeFromSnapshot and replaced with --setactivationskip n --activate y to prevent LVM CLI exit-3 failures on newer versions. New regression test TestThinRestoreVolumeFromSnapshotIssuesThinSnapshotLvcreate validates correct lvcreate invocation and verifies --kernel is not used.
E2E script JSON field migration
tests/e2e/cli-matrix/multi-volume-late-vd-create.sh, tests/e2e/cli-matrix/n-rst-recreates-tiebreaker.sh, tests/e2e/cli-matrix/r-d-collapses-tiebreaker.sh, tests/e2e/cli-matrix/r-td-diskless-reaps-tiebreaker.sh, tests/e2e/cli-matrix/rd-d-deleting-surface.sh, tests/e2e/cli-matrix/vd-d-prune-resource-volumes-no-flap.sh
Updated all six E2E test scripts to parse replica flags from .flags instead of .rsc_flags in LINSTOR resource list -o json output. Changes affect tiebreaker detection, diskless filtering, deletion surface validation, and diagnostic output across all scripts.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Poem

🐰 The snapshot springs to life anew,
Without that kernel flag we knew,
LVM's way has changed its tune,
E2E scripts all sync'd in June,
.flags now leads the dancing crew! 🎉

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: removing invalid --kernel flag from lvcreate in LVM-thin and fixing JSON field references in harness tests.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/bug-043-lvm-clone-kernel-flag

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request removes the invalid --kernel flag from the lvcreate command in RestoreVolumeFromSnapshot for LVM-thin, resolving a bug where LVM >= 2.03 would reject the command. It also adds a regression test and updates several E2E tests to query .flags instead of .rsc_flags from the Linstor JSON output. The reviewer suggests using ArgsUdevless instead of Args for the lvcreate command to avoid potential hangs during volume activation in containerized environments, along with updating the corresponding test assertion.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

@@ -272,7 +272,6 @@ func (t *Thin) RestoreVolumeFromSnapshot(ctx context.Context, target storage.Vol

_, err := t.exec.Run(ctx, "lvcreate",
Args("--snapshot",

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Since this lvcreate command explicitly activates the volume (--activate y), it should use ArgsUdevless instead of Args to apply the udev-less activation workaround (activation { udev_sync=0 udev_rules=0 }). In a containerized environment without a udev daemon, activating a volume without this workaround can cause the command to hang or race during device node creation (as documented in Bug 269 and Bug 305).

Suggested change
Args("--snapshot",
\t\tArgsUdevless(\"--snapshot\",

t.Fatalf("RestoreVolumeFromSnapshot: %v", err)
}

want := "lvcreate --config devices { filter=['r|^/dev/drbd|','r|^/dev/zd|'] } --snapshot --setactivationskip n --activate y --name pvc-2_00000 vg/pvc-1_snap-1_00000"

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Update the expected command string to include the udev-less activation configuration (activation { udev_sync=0 udev_rules=0 }) to match the change to ArgsUdevless in RestoreVolumeFromSnapshot.

\twant := \"lvcreate --config devices { filter=['r|^/dev/drbd|','r|^/dev/zd|'] } activation { udev_sync=0 udev_rules=0 } --snapshot --setactivationskip n --activate y --name pvc-2_00000 vg/pvc-1_snap-1_00000\"

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/storage/lvm/lvm_thin.go (1)

273-278: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the Provider interface documentation to match the new lvcreate invocation.

The implementation correctly removes the invalid --kernel flag and adds --setactivationskip n, but the Provider.RestoreVolumeFromSnapshot interface documentation in pkg/storage/storage.go (context snippet 1, lines 117-119) still shows the old pattern:

//   LVM_THIN:       `lvcreate -s --kernel --activate y \
//                       --name <tgt> <pool>/<src-snap>`

Update that comment to reflect the corrected invocation without --kernel.

📝 Suggested documentation fix

In pkg/storage/storage.go, update the LVM_THIN line to:

-//   LVM_THIN:       `lvcreate -s --kernel --activate y \
-//                       --name <tgt> <pool>/<src-snap>`
+//   LVM_THIN:       `lvcreate -s --setactivationskip n --activate y \
+//                       --name <tgt> <pool>/<src-snap>`
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/storage/lvm/lvm_thin.go` around lines 273 - 278, Update the
Provider.RestoreVolumeFromSnapshot interface documentation's LVM_THIN example to
match the new lvcreate invocation used in lvm_thin.go: remove the obsolete
`--kernel` flag and show `--setactivationskip n` instead (preserve `-s`,
`--activate y`, `--name <tgt> <pool>/<src-snap>`); locate the LVM_THIN example
in the Provider.RestoreVolumeFromSnapshot doc comment and replace the old sample
command with the corrected one reflecting `lvcreate -s --setactivationskip n
--activate y --name <tgt> <pool>/<src-snap>`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pkg/storage/lvm/lvm_thin.go`:
- Around line 273-278: Update the Provider.RestoreVolumeFromSnapshot interface
documentation's LVM_THIN example to match the new lvcreate invocation used in
lvm_thin.go: remove the obsolete `--kernel` flag and show `--setactivationskip
n` instead (preserve `-s`, `--activate y`, `--name <tgt> <pool>/<src-snap>`);
locate the LVM_THIN example in the Provider.RestoreVolumeFromSnapshot doc
comment and replace the old sample command with the corrected one reflecting
`lvcreate -s --setactivationskip n --activate y --name <tgt> <pool>/<src-snap>`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fb733549-9f5c-4c1a-8ad2-282781f9e1c2

📥 Commits

Reviewing files that changed from the base of the PR and between 2b9b648 and 08a3ea3.

📒 Files selected for processing (8)
  • pkg/storage/lvm/lvm_thin.go
  • pkg/storage/lvm/lvm_thin_test.go
  • tests/e2e/cli-matrix/multi-volume-late-vd-create.sh
  • tests/e2e/cli-matrix/n-rst-recreates-tiebreaker.sh
  • tests/e2e/cli-matrix/r-d-collapses-tiebreaker.sh
  • tests/e2e/cli-matrix/r-td-diskless-reaps-tiebreaker.sh
  • tests/e2e/cli-matrix/rd-d-deleting-surface.sh
  • tests/e2e/cli-matrix/vd-d-prune-resource-volumes-no-flap.sh

@kvaps Andrei Kvapil (kvaps) merged commit e055663 into main Jun 14, 2026
26 of 27 checks passed
@kvaps Andrei Kvapil (kvaps) deleted the fix/bug-043-lvm-clone-kernel-flag branch June 14, 2026 02:47
Andrei Kvapil (kvaps) added a commit that referenced this pull request Jun 14, 2026
Resolve conflict in tests/e2e/cli-matrix/vd-d-prune-resource-volumes-no-flap.sh:
keep the complete wire-key fix (.flags + .volumes[]) from this branch, which
subsumes the partial .flags-only fix that landed on main via #158. Both sides
intended the same correction; this branch's version is the strict superset.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant