Skip to content

fix(satellite,rest): close last-UpToDate-delete race on Secondary SyncSource (BUG-045)#159

Merged
Andrei Kvapil (kvaps) merged 2 commits into
mainfrom
fix/bug-045-u130-observer-syncsource-diskstate
Jun 14, 2026
Merged

fix(satellite,rest): close last-UpToDate-delete race on Secondary SyncSource (BUG-045)#159
Andrei Kvapil (kvaps) merged 2 commits into
mainfrom
fix/bug-045-u130-observer-syncsource-diskstate

Conversation

@kvaps

@kvaps Andrei Kvapil (kvaps) commented Jun 14, 2026

Copy link
Copy Markdown
Member

Summary

Closes a data-availability hole in the U130 "don't delete the last active peer" guard (BUG-045). On stand big it was observed that linstor r d <node> <rd> deleting the LAST UpToDate diskful replica while a peer is mid-sync intermittently SUCCEEDED, stranding the SyncTarget with no UpToDate source — a path to data unavailability. The defect was non-deterministic (1 of 2 big runs).

The U130 predicate itself was logically correct; the defect was upstream of it, in the satellite observer's CRD status projection. This PR fixes both legs for defense-in-depth, as befits a data-safety guard.

Root cause

A diskful replica that is Secondary and acts as the SyncSource for a freshly-added SyncTarget peer could end up with a blank .status.volumes[].diskState in the CRD projection.

On a Secondary SyncSource, drbd-9 does not re-emit a local device frame carrying disk:UpToDate — the local disk-state never transitioned (it was already UpToDate), so the kernel only emits the peer-device replication:SyncSource frame toward the SyncTarget. When the observer's per-volume cache entry is (re)created by that peer-device frame, it carries an empty DiskState, and the projection leaves diskState blank (pkg/satellite/controllers/observer.go, peerDeviceVolumeObservation).

The U130 guard reads exactly that diskState field. With the only source's diskState blank, resourceIsUpToDate(target) returned false, the guard concluded "this is not an UpToDate source / there is no other UpToDate copy", and allowed the destructive delete — stranding the SyncTarget.

Fix

Leg 1 — observer projection (pkg/satellite/controllers/observer.go)

When the observer sees a peer-device replication:SyncSource frame, stamp the local volume's DiskState=UpToDate. A SyncSource feeds a peer's resync only from an UpToDate local disk, so this is a hard DRBD invariant. The peer-device frame is reliably emitted (unlike the missing local device frame), so this closes the unstamped-projection window. The stamp is idempotent and never downgrades a richer local-frame observation (mergeVolumeInto only upgrades to a non-empty value, and UpToDate is the terminal disk-state).

Leg 2 — guard fail-safe hardening (pkg/rest/resource_delete_last_uptodate_u130.go)

The last-copy delete safety decision no longer rests solely on the possibly-lagging diskState projection:

  • A SyncSource replication-state is treated as kernel ground truth that a replica holds an UpToDate copy. This applies to the target (do not conclude "not a source" when it is actively feeding a peer) and to siblings (a SyncSource sibling counts as a surviving source, so legitimate deletes are not over-refused).
  • An empty/unknown diskState on a diskful target is treated conservatively: while a peer is mid-sync, the last-copy delete is refused rather than trusting an unstamped projection.

Fail-safe argument (load-bearing): when the guard cannot positively confirm that another UpToDate copy would survive the delete, it REFUSES. A false refusal merely asks the operator to wait for the resync to finish or pass ?force=true; a false allow strands the SyncTarget with no UpToDate source — an unrecoverable data-availability loss. The asymmetry of consequences dictates refusing in doubt.

Legitimate last-copy deletes (no peer mid-sync) and the ?force=true override are unaffected — no regression to the documented U130 behaviour or bug-hunt #6.

Tests

  • L1: new observer projection tests (TestObserverStampsUpToDateForSecondarySyncSource, TestObserverSyncSourceStampDoesNotDowngradeRealFrame) and new guard tests covering the blank-diskState SyncSource target, the fail-safe blank-state-mid-sync refusal, and the SyncSource-sibling-survives non-regression — plus predicate-table rows for each. Existing U130 L1/replay coverage stays green.
  • L6 cli-matrix r-d-last-uptodate-midsync-rejected: run on stand big. Because the defect is non-deterministic, it was run 5x consecutively green, each run catching a live mid-sync window (big-worker-2 observed mid-sync) and exercising the real rejection. The same scenario reproduced the bug (r d ... succeeded while mid-sync) on the pre-fix image, proving the race is closed.
  • L6 cli-matrix r-full-lifecycle: green on stand (no regression on the r d / r c / r td delete path).
  • go test ./..., go test -race ./pkg/rest/... ./pkg/satellite/...: green. golangci-lint run: 0 issues.

Summary by CodeRabbit

  • Bug Fixes
    • Improved delete operation safety by better detecting available disk sources during replication. Resources with uncertain disk states are now handled more conservatively to prevent accidental data loss.
    • Enhanced fail-safe logic for deletion guards when disk state information is unavailable, reducing edge-case vulnerabilities.

Andrei Kvapil (kvaps) and others added 2 commits June 14, 2026 02:37
…G-045)

A diskful replica that is Secondary and acts as the SyncSource for a
freshly-added SyncTarget peer could end up with a blank
.status.volumes[].diskState in the CRD projection. On a Secondary
SyncSource drbd-9 does not re-emit a local `device` frame carrying
`disk:UpToDate` (the local disk-state did not transition), so when the
volume-cache entry is (re)created by the peer-device `replication:
SyncSource` frame it carries an empty DiskState and the projection
leaves diskState blank.

That blank diskState is the load-bearing input the U130 last-copy
delete guard reads, so it could false-allow deleting the only UpToDate
source while a peer is mid-sync, stranding the SyncTarget with no source
(a data-availability loss).

Fix: when the observer sees a peer-device `replication:SyncSource`
frame, stamp the local volume's DiskState=UpToDate. A SyncSource feeds
a peer's resync only from an UpToDate local disk, so this is a hard DRBD
invariant. The stamp is idempotent and never downgrades a richer
local-frame observation (mergeVolumeInto only upgrades to a non-empty
value, and UpToDate is the terminal disk-state).

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
The U130 guard decided whether deleting the last UpToDate diskful
replica would strand a mid-sync peer solely from the CRD `diskState`
projection. When the satellite observer left a live source's diskState
blank (the Secondary SyncSource race), the guard concluded the target
was "not a source" and allowed the destructive delete, stranding the
SyncTarget with no UpToDate source.

Harden the guard to fail safe — when it cannot positively confirm that
another UpToDate copy would survive, it refuses:

- Treat a SyncSource replication-state as kernel ground truth that a
  replica holds an UpToDate copy (a SyncSource only feeds a resync from
  an UpToDate local disk), so a lagging/blank diskState no longer hides
  a real source. This applies to both the target (do not conclude
  "not a source") and the siblings (a SyncSource sibling counts as a
  surviving source, avoiding over-refusal of legitimate deletes).
- Treat an empty/unknown diskState on a diskful target conservatively:
  while a peer is mid-sync, refuse the last-copy delete rather than
  trust an unstamped projection. A false refusal only asks the operator
  to wait or pass `?force=true`; a false allow is unrecoverable.

Legitimate last-copy deletes (no peer mid-sync) and the `?force=true`
override are unaffected.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 14, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

Fixes BUG-045 across two subsystems: the DRBD events2 observer now synthesizes DiskState=UpToDate from peer-device frames carrying replication:SyncSource, and the U130 last-UpToDate delete guard is hardened to classify replicas with blank or unknown diskState as possible sources, using confirmed-source evidence (UpToDate diskState or SyncSource replication-state) to decide refusal.

Changes

BUG-045: SyncSource DiskState stamping and U130 fail-safe guard

Layer / File(s) Summary
Observer: stamp DiskState=UpToDate on SyncSource frames
pkg/satellite/controllers/observer.go, pkg/satellite/controllers/observer_internal_test.go
Adds drbdReplStateSyncSource and drbdDiskStateUpToDate constants; peerDeviceVolumeObservation now special-cases replication:SyncSource peer-device frames to emit DiskState=UpToDate and returns early. Tests verify stamping when no local device frame exists and that a pre-existing disk:Failed frame is not downgraded.
U130 guard: confirmed-source and fail-safe blank-diskState logic
pkg/rest/resource_delete_last_uptodate_u130.go, pkg/rest/resource_delete_last_uptodate_u130_test.go
Replaces strict UpToDate eligibility with a possible-source predicate covering blank diskState; rewrites sibling evaluation to count confirmed sources (UpToDate diskState or SyncSource replication-state) and refuse deletion only when the target is the sole confirmed source while a diskful peer is mid-sync. Adds a SyncSource replication-state constant. Tests cover BUG-045 rejection, fail-safe blank-diskState refusal, must-not-regress allow case, and four new predicate matrix entries.

Sequence Diagram

sequenceDiagram
  participant DRBD as DRBD events2
  participant Observer as peerDeviceVolumeObservation
  participant Cache as Volume/Connection cache
  participant U130 as resourceMidSyncDeleteRefusal
  participant API as REST DELETE handler

  DRBD->>Observer: peer-device frame (replication=SyncSource)
  Observer->>Cache: store volumeObservation(DiskState=UpToDate, ReplicationState=SyncSource)

  API->>U130: DELETE replica request
  U130->>Cache: read target diskState and replicationStates
  alt diskState blank or UpToDate or SyncSource replication
    U130->>Cache: scan diskful siblings for confirmedSources
    alt confirmedSource sibling exists
      U130-->>API: nil (allow delete)
    else no confirmedSource, midSync peer exists
      U130-->>API: 409 refusal (block delete)
    end
  else not a possible source
    U130-->>API: nil (allow delete)
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐇 A SyncSource once hid with no DiskState in sight,
The guard saw it blank and said "delete? Not tonight!"
We stamped it UpToDate when the peer-device said so,
And confirmed every source before letting things go.
BUG-045 is fixed — hop along, data's alright! 🌟

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title directly and specifically addresses the main change: fixing a data-availability race condition involving Secondary SyncSource deletion, referencing the specific bug (BUG-045) and the guard component (last-UpToDate-delete) affected.
Docstring Coverage ✅ Passed Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/bug-045-u130-observer-syncsource-diskstate

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request hardens the data-availability guard against BUG-045, ensuring that a blank or lagging CRD diskState projection does not lead to the accidental deletion of the last UpToDate replica while a peer is mid-sync. It introduces fail-safe checks in the delete guard and updates the observer to stamp DiskState=UpToDate when a SyncSource replication state is detected. The review feedback suggests adding a defensive nil check for the target resource in resourceMidSyncDeleteRefusal and points out a logical gap in TestObserverSyncSourceStampDoesNotDowngradeRealFrame where the merge volume logic is not actually executed to verify the test's assertions.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines 85 to 90
func resourceMidSyncDeleteRefusal(target *apiv1.Resource, siblings []apiv1.Resource) bool {
// A diskless / tiebreaker replica carries no UpToDate backing
// storage; removing it never strands a SyncTarget's only source.
if !resourceIsDiskful(target) {
return false
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential nil pointer dereferences and ensure robust defensive programming, we should add a nil check for target at the entry point of resourceMidSyncDeleteRefusal. If target is nil, the function can safely return false.

Suggested change
func resourceMidSyncDeleteRefusal(target *apiv1.Resource, siblings []apiv1.Resource) bool {
// A diskless / tiebreaker replica carries no UpToDate backing
// storage; removing it never strands a SyncTarget's only source.
if !resourceIsDiskful(target) {
return false
}
func resourceMidSyncDeleteRefusal(target *apiv1.Resource, siblings []apiv1.Resource) bool {
if target == nil {
return false
}
// A diskless / tiebreaker replica carries no UpToDate backing
// storage; removing it never strands a SyncTarget's only source.
if !resourceIsDiskful(target) {
return false
}

Comment on lines +2620 to +2665
func TestObserverSyncSourceStampDoesNotDowngradeRealFrame(t *testing.T) {
o := &ObserverRunnable{}

// Local device frame first: kernel reports Failed (disk died).
failedEv, ok := translateDeviceEvent(drbd.Event{
Action: "change",
Kind: eventKindDevice,
Fields: map[string]string{
"name": "pvc-bug045-dg",
"volume": "0",
"disk": "Failed",
},
})
if !ok {
t.Fatalf("translateDeviceEvent rejected Failed frame")
}

o.mergeVolumes(&failedEv)

// A SyncSource peer-device frame would stamp UpToDate — but with the
// kernel already reporting Failed locally this would be a contradiction
// the kernel never produces (a Failed disk cannot be a SyncSource). The
// merge upgrades to UpToDate because UpToDate != Failed and the
// incoming value is non-empty; this is acceptable because in reality
// the two frames are mutually exclusive. The assertion here only pins
// that the stamp value is exactly UpToDate (not some lossy blank).
syncSourceEv, ok := translatePeerDeviceEvent(drbd.Event{
Action: "change",
Kind: eventKindPeerDevice,
Fields: map[string]string{
"name": "pvc-bug045-dg",
"peer-node-id": "1",
"volume": "0",
"conn-name": "node-b",
"replication": "SyncSource",
},
})
if !ok {
t.Fatalf("translatePeerDeviceEvent rejected SyncSource frame")
}

if len(syncSourceEv.Volumes) != 1 || syncSourceEv.Volumes[0].DiskState != "UpToDate" {
t.Errorf("SyncSource frame must carry DiskState=UpToDate; got %+v",
syncSourceEv.Volumes)
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test TestObserverSyncSourceStampDoesNotDowngradeRealFrame does not actually call o.mergeVolumes(&syncSourceEv) to verify if the cached Failed state is preserved or overwritten. If o.mergeVolumes were called, mergeVolumeInto would actually overwrite the "Failed" state with "UpToDate" because "Failed" != "UpToDate". Please update the test to either perform the merge and assert the expected behavior, or clarify the comments and test name if the merge is intentionally omitted due to mutual exclusivity in the kernel.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/rest/resource_delete_last_uptodate_u130.go (1)

92-138: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Guard classification can false-allow multi-volume unsafe deletes.

The logic uses resource-level “any-volume” predicates and a mutually exclusive sibling switch. A sibling that is confirmed on one volume but mid-sync on another is counted only as a surviving source, which can allow deleting the target while a volume is still strandable.

Also applies to: 141-202

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/rest/resource_delete_last_uptodate_u130.go` around lines 92 - 138, The
guard logic incorrectly uses resource-level any-volume predicates within a
mutually exclusive switch statement. When a sibling is confirmed as an UpToDate
source on one volume but is mid-sync on another, the current code counts it only
as a confirmed source and ignores the mid-sync state, which can allow unsafe
deletion of a target that has a stranded volume. Replace the mutually exclusive
switch statement with logic that checks per-volume status: for each sibling,
determine if it is mid-sync on any volume (which should block the delete if the
target is the sole source), separately from whether it serves as a confirmed
source overall. Apply this same fix at both affected locations: the initial
function around the switch statement and the consolidated site at lines 141-202
which has the same pattern.
🧹 Nitpick comments (1)
pkg/satellite/controllers/observer_internal_test.go (1)

2614-2665: ⚡ Quick win

Align this test’s name with what it actually asserts.

The test currently checks SyncSource translation output, not cache-merge “no downgrade” behavior. Either add a post-merge assertion for that contract or rename to avoid false confidence.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/satellite/controllers/observer_internal_test.go` around lines 2614 -
2665, The test TestObserverSyncSourceStampDoesNotDowngradeRealFrame has a name
that implies it validates the "no downgrade" merge invariant, but it only checks
that the SyncSource event is correctly translated to have DiskState=UpToDate. To
align the test with its name, add a post-merge assertion after calling
o.mergeVolumes with the syncSourceEv to verify that the observer's state still
maintains the Failed disk state and was not downgraded, or alternatively rename
the test to reflect that it only validates SyncSource event translation (e.g.,
TestSyncSourceTranslatesUpToDate) to avoid misleading developers about what
contract is actually being tested.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@pkg/rest/resource_delete_last_uptodate_u130.go`:
- Around line 92-138: The guard logic incorrectly uses resource-level any-volume
predicates within a mutually exclusive switch statement. When a sibling is
confirmed as an UpToDate source on one volume but is mid-sync on another, the
current code counts it only as a confirmed source and ignores the mid-sync
state, which can allow unsafe deletion of a target that has a stranded volume.
Replace the mutually exclusive switch statement with logic that checks
per-volume status: for each sibling, determine if it is mid-sync on any volume
(which should block the delete if the target is the sole source), separately
from whether it serves as a confirmed source overall. Apply this same fix at
both affected locations: the initial function around the switch statement and
the consolidated site at lines 141-202 which has the same pattern.

---

Nitpick comments:
In `@pkg/satellite/controllers/observer_internal_test.go`:
- Around line 2614-2665: The test
TestObserverSyncSourceStampDoesNotDowngradeRealFrame has a name that implies it
validates the "no downgrade" merge invariant, but it only checks that the
SyncSource event is correctly translated to have DiskState=UpToDate. To align
the test with its name, add a post-merge assertion after calling o.mergeVolumes
with the syncSourceEv to verify that the observer's state still maintains the
Failed disk state and was not downgraded, or alternatively rename the test to
reflect that it only validates SyncSource event translation (e.g.,
TestSyncSourceTranslatesUpToDate) to avoid misleading developers about what
contract is actually being tested.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 22ea8b1a-f3ca-4975-a052-2cdf6b53019c

📥 Commits

Reviewing files that changed from the base of the PR and between 2b9b648 and 8b44150.

📒 Files selected for processing (4)
  • pkg/rest/resource_delete_last_uptodate_u130.go
  • pkg/rest/resource_delete_last_uptodate_u130_test.go
  • pkg/satellite/controllers/observer.go
  • pkg/satellite/controllers/observer_internal_test.go

@kvaps Andrei Kvapil (kvaps) merged commit c193457 into main Jun 14, 2026
26 of 29 checks passed
@kvaps Andrei Kvapil (kvaps) deleted the fix/bug-045-u130-observer-syncsource-diskstate branch June 14, 2026 02:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant