Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
30 changes: 27 additions & 3 deletions docs/known-issues.md
Original file line number Diff line number Diff line change
Expand Up @@ -288,12 +288,36 @@ Idempotency of `provider.DeleteVolume` was already guaranteed by Bug 33's contra

**Related commits / tests**: `pkg/satellite/controllers/storage_sweeper.go` + `pkg/satellite/controllers/storage_sweeper_test.go` (7 unit tests covering owned-leave-alone, orphan-reap, foreign-prefix immunity, per-node scope, rate-limit, skip-annotation, mid-delete-race protection, missing-RD wildcard protection); `pkg/storage/storage.go` adds the `VolumeLister` optional interface + `VolumeRef`; ZFS / LVM-thin / LVM-thick providers implement it. E2e validation via `lc-rd-delete-churn.sh` (passes 10 iters without ZVOL leak).

## Bug 50: concurrent rapid late `vd c` wedges or drops the second volume

**Status**: open (campaign tracking id: BUG-048)
**Severity**: P1 (availability; NOT data-loss, NOT a node-reboot deadlock; recoverable)
**Scenario reference**: tests/e2e/cli-matrix/multi-volume-late-vd-create.sh
**Surfaced by**: release-gate validation campaign
**Reproduction steps**:

1. Create a multi-replica ResourceDefinition and let it converge (at least one VolumeDefinition, all replicas UpToDate).
2. Issue two `linstor volume-definition create <rd> <size>` calls back-to-back in rapid succession (concurrently, or scripted with no wait between them).

**Expected behaviour**: Both volume-definitions are created; each new volume is laid down on every replica and reaches `UpToDate` (with one replica electing SyncSource), exactly as a sequential add would.

**Actual behaviour**: Intermittently one of two faults occurs — (a) the second `vd c` returns SUCCESS but the VolumeDefinition is silently dropped (only one VD ends up persisted), or (b) the second volume is created but wedges `Inconsistent` with no SyncSource elected, so it never reaches `UpToDate`. In case (a) both calls report success.

**Scope / reachability**: NOT reachable through the production CSI path. linstor-csi maps one PVC to one ResourceDefinition with exactly one VolumeDefinition, created in a single request; autoplace and `ControllerPublishVolume` add no VolumeDefinitions, and resize grows the existing volume in place. The fault is reachable only when an operator (or a script) issues concurrent / rapid back-to-back manual `vd c` calls against an existing multi-replica RD. No data is lost or corrupted (md5 held across all validation runs) and no node reboot is required.

**Recovery**: Delete the wedged or missing volume-definition (`linstor vd d <rd> <vnr>`) and re-add it sequentially — issue one `vd c`, wait for it to reach `UpToDate`, then issue the next. Always add late volume-definitions one at a time.

**Recommended fix**: Serialise concurrent VolumeDefinition creation on a given ResourceDefinition (optimistic-concurrency precondition on the RD volume-minor allocation plus the auto-numbered VD create) so the second add observes the first, and seed the new volume's GI / current-UUID before peers attach so a deterministic SyncSource is elected. A first attempt (PR #164) reduced the silent-drop and fixed a day0 split-brain leg but regressed the CSI-reachable single-late-VD path, so it was not merged; a corrected fix that does not touch the single-VD path is tracked.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For grammatical consistency with the rest of the sentence (which uses "VolumeDefinition creation"), consider changing auto-numbered VD create to auto-numbered VD creation.


**Related commits / tests**: cli-matrix cell `tests/e2e/cli-matrix/multi-volume-late-vd-create.sh` honestly catches this (as of #162); PR #164 (deferred). No closing commit yet.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The phrase honestly catches this is somewhat informal for technical documentation. Consider using a more professional term such as reliably catches this or consistently catches this to maintain a formal tone.


## Recommended next-fix order

1. **Bug 35** (P0, placer FreeCapacity) — only P0; blocks placer 1.0 and risks ENOSPC at create time.
2. **Bug 42** (P1, piraeus pod-CIDR drift) — blocks the iptables-mode e2e lane.
3. **Bug 36 + 37** (P1, VD props merge) — fix together; 37 depends on 36's merge plumbing.
4. **Bug 39 + 40** (P1, toggle-disk retry/cancel) — fix together; together they unlock Bug 34's Option B state machine.
5. **Bug 34 Option B** (P1 follow-up) — wrap migrate-disk in the new state machine once 39/40 land.
6. **Bug 38** (P2, cosmetic STATE_INFO on shrink) — pure UX, do last.
7. **Bug 32** (P2, observation) — document only; no code fix needed.
5. **Bug 50** (P1, concurrent late `vd c`) — serialise VolumeDefinition create on the RD and seed GI before peer attach; operator-only path, recoverable, not data-loss. A corrected fix (superseding the deferred PR #164) is tracked.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To improve grammatical correctness and readability, consider changing VolumeDefinition create to VolumeDefinition creation (or creating VolumeDefinitions). Additionally, adding a comma in concurrent late (making it concurrent, late) would improve punctuation.

6. **Bug 34 Option B** (P1 follow-up) — wrap migrate-disk in the new state machine once 39/40 land.
7. **Bug 38** (P2, cosmetic STATE_INFO on shrink) — pure UX, do last.
8. **Bug 32** (P2, observation) — document only; no code fix needed.