fix(satellite,rest): close last-UpToDate-delete race on Secondary SyncSource (BUG-045) by kvaps · Pull Request #159 · cozystack/blockstor

Andrei Kvapil (kvaps) · 2026-06-14T00:02:43Z

Summary

Closes a data-availability hole in the U130 "don't delete the last active peer" guard (BUG-045). On stand big it was observed that linstor r d <node> <rd> deleting the LAST UpToDate diskful replica while a peer is mid-sync intermittently SUCCEEDED, stranding the SyncTarget with no UpToDate source — a path to data unavailability. The defect was non-deterministic (1 of 2 big runs).

The U130 predicate itself was logically correct; the defect was upstream of it, in the satellite observer's CRD status projection. This PR fixes both legs for defense-in-depth, as befits a data-safety guard.

Root cause

A diskful replica that is Secondary and acts as the SyncSource for a freshly-added SyncTarget peer could end up with a blank .status.volumes[].diskState in the CRD projection.

On a Secondary SyncSource, drbd-9 does not re-emit a local device frame carrying disk:UpToDate — the local disk-state never transitioned (it was already UpToDate), so the kernel only emits the peer-device replication:SyncSource frame toward the SyncTarget. When the observer's per-volume cache entry is (re)created by that peer-device frame, it carries an empty DiskState, and the projection leaves diskState blank (pkg/satellite/controllers/observer.go, peerDeviceVolumeObservation).

The U130 guard reads exactly that diskState field. With the only source's diskState blank, resourceIsUpToDate(target) returned false, the guard concluded "this is not an UpToDate source / there is no other UpToDate copy", and allowed the destructive delete — stranding the SyncTarget.

Fix

Leg 1 — observer projection (`pkg/satellite/controllers/observer.go`)

When the observer sees a peer-device replication:SyncSource frame, stamp the local volume's DiskState=UpToDate. A SyncSource feeds a peer's resync only from an UpToDate local disk, so this is a hard DRBD invariant. The peer-device frame is reliably emitted (unlike the missing local device frame), so this closes the unstamped-projection window. The stamp is idempotent and never downgrades a richer local-frame observation (mergeVolumeInto only upgrades to a non-empty value, and UpToDate is the terminal disk-state).

Leg 2 — guard fail-safe hardening (`pkg/rest/resource_delete_last_uptodate_u130.go`)

The last-copy delete safety decision no longer rests solely on the possibly-lagging diskState projection:

A SyncSource replication-state is treated as kernel ground truth that a replica holds an UpToDate copy. This applies to the target (do not conclude "not a source" when it is actively feeding a peer) and to siblings (a SyncSource sibling counts as a surviving source, so legitimate deletes are not over-refused).
An empty/unknown diskState on a diskful target is treated conservatively: while a peer is mid-sync, the last-copy delete is refused rather than trusting an unstamped projection.

Fail-safe argument (load-bearing): when the guard cannot positively confirm that another UpToDate copy would survive the delete, it REFUSES. A false refusal merely asks the operator to wait for the resync to finish or pass ?force=true; a false allow strands the SyncTarget with no UpToDate source — an unrecoverable data-availability loss. The asymmetry of consequences dictates refusing in doubt.

Legitimate last-copy deletes (no peer mid-sync) and the ?force=true override are unaffected — no regression to the documented U130 behaviour or bug-hunt #6.

Tests

L1: new observer projection tests (TestObserverStampsUpToDateForSecondarySyncSource, TestObserverSyncSourceStampDoesNotDowngradeRealFrame) and new guard tests covering the blank-diskState SyncSource target, the fail-safe blank-state-mid-sync refusal, and the SyncSource-sibling-survives non-regression — plus predicate-table rows for each. Existing U130 L1/replay coverage stays green.
L6 cli-matrix r-d-last-uptodate-midsync-rejected: run on stand big. Because the defect is non-deterministic, it was run 5x consecutively green, each run catching a live mid-sync window (big-worker-2 observed mid-sync) and exercising the real rejection. The same scenario reproduced the bug (r d ... succeeded while mid-sync) on the pre-fix image, proving the race is closed.
L6 cli-matrix r-full-lifecycle: green on stand (no regression on the r d / r c / r td delete path).
go test ./..., go test -race ./pkg/rest/... ./pkg/satellite/...: green. golangci-lint run: 0 issues.

Summary by CodeRabbit

Bug Fixes
- Improved delete operation safety by better detecting available disk sources during replication. Resources with uncertain disk states are now handled more conservatively to prevent accidental data loss.
- Enhanced fail-safe logic for deletion guards when disk state information is unavailable, reducing edge-case vulnerabilities.

…G-045) A diskful replica that is Secondary and acts as the SyncSource for a freshly-added SyncTarget peer could end up with a blank .status.volumes[].diskState in the CRD projection. On a Secondary SyncSource drbd-9 does not re-emit a local `device` frame carrying `disk:UpToDate` (the local disk-state did not transition), so when the volume-cache entry is (re)created by the peer-device `replication: SyncSource` frame it carries an empty DiskState and the projection leaves diskState blank. That blank diskState is the load-bearing input the U130 last-copy delete guard reads, so it could false-allow deleting the only UpToDate source while a peer is mid-sync, stranding the SyncTarget with no source (a data-availability loss). Fix: when the observer sees a peer-device `replication:SyncSource` frame, stamp the local volume's DiskState=UpToDate. A SyncSource feeds a peer's resync only from an UpToDate local disk, so this is a hard DRBD invariant. The stamp is idempotent and never downgrades a richer local-frame observation (mergeVolumeInto only upgrades to a non-empty value, and UpToDate is the terminal disk-state). Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

The U130 guard decided whether deleting the last UpToDate diskful replica would strand a mid-sync peer solely from the CRD `diskState` projection. When the satellite observer left a live source's diskState blank (the Secondary SyncSource race), the guard concluded the target was "not a source" and allowed the destructive delete, stranding the SyncTarget with no UpToDate source. Harden the guard to fail safe — when it cannot positively confirm that another UpToDate copy would survive, it refuses: - Treat a SyncSource replication-state as kernel ground truth that a replica holds an UpToDate copy (a SyncSource only feeds a resync from an UpToDate local disk), so a lagging/blank diskState no longer hides a real source. This applies to both the target (do not conclude "not a source") and the siblings (a SyncSource sibling counts as a surviving source, avoiding over-refusal of legitimate deletes). - Treat an empty/unknown diskState on a diskful target conservatively: while a peer is mid-sync, refuse the last-copy delete rather than trust an unstamped projection. A false refusal only asks the operator to wait or pass `?force=true`; a false allow is unrecoverable. Legitimate last-copy deletes (no peer mid-sync) and the `?force=true` override are unaffected. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>

coderabbitai · 2026-06-14T00:03:01Z

📝 Walkthrough

Walkthrough

Fixes BUG-045 across two subsystems: the DRBD events2 observer now synthesizes DiskState=UpToDate from peer-device frames carrying replication:SyncSource, and the U130 last-UpToDate delete guard is hardened to classify replicas with blank or unknown diskState as possible sources, using confirmed-source evidence (UpToDate diskState or SyncSource replication-state) to decide refusal.

Changes

BUG-045: SyncSource DiskState stamping and U130 fail-safe guard

Layer / File(s)	Summary
Observer: stamp DiskState=UpToDate on SyncSource frames `pkg/satellite/controllers/observer.go`, `pkg/satellite/controllers/observer_internal_test.go`	Adds `drbdReplStateSyncSource` and `drbdDiskStateUpToDate` constants; `peerDeviceVolumeObservation` now special-cases `replication:SyncSource` peer-device frames to emit `DiskState=UpToDate` and returns early. Tests verify stamping when no local device frame exists and that a pre-existing `disk:Failed` frame is not downgraded.
U130 guard: confirmed-source and fail-safe blank-diskState logic `pkg/rest/resource_delete_last_uptodate_u130.go`, `pkg/rest/resource_delete_last_uptodate_u130_test.go`	Replaces strict UpToDate eligibility with a possible-source predicate covering blank `diskState`; rewrites sibling evaluation to count confirmed sources (UpToDate diskState or SyncSource replication-state) and refuse deletion only when the target is the sole confirmed source while a diskful peer is mid-sync. Adds a `SyncSource` replication-state constant. Tests cover BUG-045 rejection, fail-safe blank-diskState refusal, must-not-regress allow case, and four new predicate matrix entries.

Sequence Diagram

sequenceDiagram
  participant DRBD as DRBD events2
  participant Observer as peerDeviceVolumeObservation
  participant Cache as Volume/Connection cache
  participant U130 as resourceMidSyncDeleteRefusal
  participant API as REST DELETE handler

  DRBD->>Observer: peer-device frame (replication=SyncSource)
  Observer->>Cache: store volumeObservation(DiskState=UpToDate, ReplicationState=SyncSource)

  API->>U130: DELETE replica request
  U130->>Cache: read target diskState and replicationStates
  alt diskState blank or UpToDate or SyncSource replication
    U130->>Cache: scan diskful siblings for confirmedSources
    alt confirmedSource sibling exists
      U130-->>API: nil (allow delete)
    else no confirmedSource, midSync peer exists
      U130-->>API: 409 refusal (block delete)
    end
  else not a possible source
    U130-->>API: nil (allow delete)
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 A SyncSource once hid with no DiskState in sight,
The guard saw it blank and said "delete? Not tonight!"
We stamped it UpToDate when the peer-device said so,
And confirmed every source before letting things go.
BUG-045 is fixed — hop along, data's alright! 🌟

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title directly and specifically addresses the main change: fixing a data-availability race condition involving Secondary SyncSource deletion, referencing the specific bug (BUG-045) and the guard component (last-UpToDate-delete) affected.
Docstring Coverage	✅ Passed	Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch fix/bug-045-u130-observer-syncsource-diskstate

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request hardens the data-availability guard against BUG-045, ensuring that a blank or lagging CRD diskState projection does not lead to the accidental deletion of the last UpToDate replica while a peer is mid-sync. It introduces fail-safe checks in the delete guard and updates the observer to stamp DiskState=UpToDate when a SyncSource replication state is detected. The review feedback suggests adding a defensive nil check for the target resource in resourceMidSyncDeleteRefusal and points out a logical gap in TestObserverSyncSourceStampDoesNotDowngradeRealFrame where the merge volume logic is not actually executed to verify the test's assertions.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-14T00:04:36Z

 func resourceMidSyncDeleteRefusal(target *apiv1.Resource, siblings []apiv1.Resource) bool {
 	// A diskless / tiebreaker replica carries no UpToDate backing
 	// storage; removing it never strands a SyncTarget's only source.
 	if !resourceIsDiskful(target) {
 		return false
 	}


To prevent potential nil pointer dereferences and ensure robust defensive programming, we should add a nil check for target at the entry point of resourceMidSyncDeleteRefusal. If target is nil, the function can safely return false.

Suggested change

func resourceMidSyncDeleteRefusal(target *apiv1.Resource, siblings []apiv1.Resource) bool {

// A diskless / tiebreaker replica carries no UpToDate backing

// storage; removing it never strands a SyncTarget's only source.

if !resourceIsDiskful(target) {

return false

}

func resourceMidSyncDeleteRefusal(target *apiv1.Resource, siblings []apiv1.Resource) bool {

if target == nil {

return false

}

// A diskless / tiebreaker replica carries no UpToDate backing

// storage; removing it never strands a SyncTarget's only source.

if !resourceIsDiskful(target) {

return false

}

gemini-code-assist · 2026-06-14T00:04:36Z

+func TestObserverSyncSourceStampDoesNotDowngradeRealFrame(t *testing.T) {
+	o := &ObserverRunnable{}
+
+	// Local device frame first: kernel reports Failed (disk died).
+	failedEv, ok := translateDeviceEvent(drbd.Event{
+		Action: "change",
+		Kind:   eventKindDevice,
+		Fields: map[string]string{
+			"name":   "pvc-bug045-dg",
+			"volume": "0",
+			"disk":   "Failed",
+		},
+	})
+	if !ok {
+		t.Fatalf("translateDeviceEvent rejected Failed frame")
+	}
+
+	o.mergeVolumes(&failedEv)
+
+	// A SyncSource peer-device frame would stamp UpToDate — but with the
+	// kernel already reporting Failed locally this would be a contradiction
+	// the kernel never produces (a Failed disk cannot be a SyncSource). The
+	// merge upgrades to UpToDate because UpToDate != Failed and the
+	// incoming value is non-empty; this is acceptable because in reality
+	// the two frames are mutually exclusive. The assertion here only pins
+	// that the stamp value is exactly UpToDate (not some lossy blank).
+	syncSourceEv, ok := translatePeerDeviceEvent(drbd.Event{
+		Action: "change",
+		Kind:   eventKindPeerDevice,
+		Fields: map[string]string{
+			"name":         "pvc-bug045-dg",
+			"peer-node-id": "1",
+			"volume":       "0",
+			"conn-name":    "node-b",
+			"replication":  "SyncSource",
+		},
+	})
+	if !ok {
+		t.Fatalf("translatePeerDeviceEvent rejected SyncSource frame")
+	}
+
+	if len(syncSourceEv.Volumes) != 1 || syncSourceEv.Volumes[0].DiskState != "UpToDate" {
+		t.Errorf("SyncSource frame must carry DiskState=UpToDate; got %+v",
+			syncSourceEv.Volumes)
+	}
+}


The test TestObserverSyncSourceStampDoesNotDowngradeRealFrame does not actually call o.mergeVolumes(&syncSourceEv) to verify if the cached Failed state is preserved or overwritten. If o.mergeVolumes were called, mergeVolumeInto would actually overwrite the "Failed" state with "UpToDate" because "Failed" != "UpToDate". Please update the test to either perform the merge and assert the expected behavior, or clarify the comments and test name if the merge is intentionally omitted due to mutual exclusivity in the kernel.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pkg/rest/resource_delete_last_uptodate_u130.go (1)
92-138: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Guard classification can false-allow multi-volume unsafe deletes.

The logic uses resource-level “any-volume” predicates and a mutually exclusive sibling switch. A sibling that is confirmed on one volume but mid-sync on another is counted only as a surviving source, which can allow deleting the target while a volume is still strandable.

Also applies to: 141-202
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/rest/resource_delete_last_uptodate_u130.go` around lines 92 - 138, The
guard logic incorrectly uses resource-level any-volume predicates within a
mutually exclusive switch statement. When a sibling is confirmed as an UpToDate
source on one volume but is mid-sync on another, the current code counts it only
as a confirmed source and ignores the mid-sync state, which can allow unsafe
deletion of a target that has a stranded volume. Replace the mutually exclusive
switch statement with logic that checks per-volume status: for each sibling,
determine if it is mid-sync on any volume (which should block the delete if the
target is the sole source), separately from whether it serves as a confirmed
source overall. Apply this same fix at both affected locations: the initial
function around the switch statement and the consolidated site at lines 141-202
which has the same pattern.

🧹 Nitpick comments (1)

pkg/satellite/controllers/observer_internal_test.go (1)

2614-2665: ⚡ Quick win

Align this test’s name with what it actually asserts.

The test currently checks SyncSource translation output, not cache-merge “no downgrade” behavior. Either add a post-merge assertion for that contract or rename to avoid false confidence.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/satellite/controllers/observer_internal_test.go` around lines 2614 -
2665, The test TestObserverSyncSourceStampDoesNotDowngradeRealFrame has a name
that implies it validates the "no downgrade" merge invariant, but it only checks
that the SyncSource event is correctly translated to have DiskState=UpToDate. To
align the test with its name, add a post-merge assertion after calling
o.mergeVolumes with the syncSourceEv to verify that the observer's state still
maintains the Failed disk state and was not downgraded, or alternatively rename
the test to reflect that it only validates SyncSource event translation (e.g.,
TestSyncSourceTranslatesUpToDate) to avoid misleading developers about what
contract is actually being tested.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pkg/rest/resource_delete_last_uptodate_u130.go`:
- Around line 92-138: The guard logic incorrectly uses resource-level any-volume
predicates within a mutually exclusive switch statement. When a sibling is
confirmed as an UpToDate source on one volume but is mid-sync on another, the
current code counts it only as a confirmed source and ignores the mid-sync
state, which can allow unsafe deletion of a target that has a stranded volume.
Replace the mutually exclusive switch statement with logic that checks
per-volume status: for each sibling, determine if it is mid-sync on any volume
(which should block the delete if the target is the sole source), separately
from whether it serves as a confirmed source overall. Apply this same fix at
both affected locations: the initial function around the switch statement and
the consolidated site at lines 141-202 which has the same pattern.

---

Nitpick comments:
In `@pkg/satellite/controllers/observer_internal_test.go`:
- Around line 2614-2665: The test
TestObserverSyncSourceStampDoesNotDowngradeRealFrame has a name that implies it
validates the "no downgrade" merge invariant, but it only checks that the
SyncSource event is correctly translated to have DiskState=UpToDate. To align
the test with its name, add a post-merge assertion after calling o.mergeVolumes
with the syncSourceEv to verify that the observer's state still maintains the
Failed disk state and was not downgraded, or alternatively rename the test to
reflect that it only validates SyncSource event translation (e.g.,
TestSyncSourceTranslatesUpToDate) to avoid misleading developers about what
contract is actually being tested.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 22ea8b1a-f3ca-4975-a052-2cdf6b53019c

📥 Commits

Reviewing files that changed from the base of the PR and between 2b9b648 and 8b44150.

📒 Files selected for processing (4)

pkg/rest/resource_delete_last_uptodate_u130.go
pkg/rest/resource_delete_last_uptodate_u130_test.go
pkg/satellite/controllers/observer.go
pkg/satellite/controllers/observer_internal_test.go

Andrei Kvapil (kvaps) and others added 2 commits June 14, 2026 02:37

gemini-code-assist Bot reviewed Jun 14, 2026

View reviewed changes

coderabbitai Bot reviewed Jun 14, 2026

View reviewed changes

Andrei Kvapil (kvaps) merged commit c193457 into main Jun 14, 2026
26 of 29 checks passed

Andrei Kvapil (kvaps) deleted the fix/bug-045-u130-observer-syncsource-diskstate branch June 14, 2026 02:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(satellite,rest): close last-UpToDate-delete race on Secondary SyncSource (BUG-045)#159

fix(satellite,rest): close last-UpToDate-delete race on Secondary SyncSource (BUG-045)#159
Andrei Kvapil (kvaps) merged 2 commits into
mainfrom
fix/bug-045-u130-observer-syncsource-diskstate

Andrei Kvapil (kvaps) commented Jun 14, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 14, 2026 •

edited

Loading

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 14, 2026

Uh oh!

gemini-code-assist Bot Jun 14, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Andrei Kvapil (kvaps) commented Jun 14, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root cause

Fix

Leg 1 — observer projection (pkg/satellite/controllers/observer.go)

Leg 2 — guard fail-safe hardening (pkg/rest/resource_delete_last_uptodate_u130.go)

Tests

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Poem

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 14, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot Jun 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Andrei Kvapil (kvaps) commented Jun 14, 2026 •

edited by coderabbitai Bot

Loading

Leg 1 — observer projection (`pkg/satellite/controllers/observer.go`)

Leg 2 — guard fail-safe hardening (`pkg/rest/resource_delete_last_uptodate_u130.go`)

coderabbitai Bot commented Jun 14, 2026 •

edited

Loading