diff --git a/docs/cli-parity-known-deltas.md b/docs/cli-parity-known-deltas.md index f7710ff8..698670d0 100644 --- a/docs/cli-parity-known-deltas.md +++ b/docs/cli-parity-known-deltas.md @@ -15,6 +15,7 @@ Row IDs match the command-catalogue indexes used by `cli-parity-refresh.sh` (see | # | command | delta_kind | accepted_until | why | |---|---------|------------|----------------|-----| +| 03 | `sp l` | WIRE_SHAPE | permanent | ENV_TOPOLOGY + envelope-shape. After pool-name + table-border normalization the `sp l` listing still diverges because the BS stand and the upstream oracle are provisioned with DIFFERENT storage-pool topologies (BS carries `lvm-thin` / `zfs-thin` / `zfs-thick` rows the FILE_THIN-only oracle never had) and because of two deliberate wire-shape choices on the FILE_THIN pool: (a) BS leaves the FILE_THIN `PoolName` cell blank where the oracle renders its backing-dir path `/var/lib/linstor-oracle-` (BS's FILE_THIN provider has no operator-facing PoolName — same family as row 04's StorDriver omission); (b) BS leaves the `SharedName` column blank where the oracle stamps `;`, and BS provisions a single shared `DfltDisklessStorPool` row rather than one per node. None of these are behavioural deltas — they reflect the stand's pool fixtures and BS's leaner FILE_THIN DTO; linstor-csi / piraeus-operator read pool capacity (`sp l -m`, row 79) not the PoolName/SharedName render columns. Whitelisted so a topology-shaped `sp l` diff does not block the refresh; a genuine capacity/driver regression still surfaces via `sp l -m` (row 79) and `sp l --show-props` (row 04). | | 04 | `sp l --show-props StorDriver/*` | WIRE_SHAPE | permanent | BLOCKSTOR_SUPERSET: BS surfaces extra `StorDriver/LvmVg`/`ThinPool` keys; client glob still matches → CLI render identical. Operator-visible parity OK. | | 08 | `r l` (tiebreaker layers DRBD,STORAGE) | WIRE_SHAPE | 2026-12-31 | F10 residual: DRBD layer enrichment (`may_promote`, `promotion_score`, `node_id`, `al_*`) not yet stamped. Audit refresh 2026-05-19 reclassifies F10 as partial — `drbdtop`-style monitoring depends on it but CSI does not. | | 18 | `controller version` | WIRE_SHAPE | permanent | Intentional version stamping: BS reports `1.33.2+ git=blockstor`. Downstream tooling MUST NOT grep a hex git_hash from BS. | @@ -71,3 +72,4 @@ These rows are **NOT** whitelisted on purpose — they appear in the audit but b - 2026-05-14 — original one-shot audit `docs/cli-parity-audit-2026-05-14.md`. - 2026-05-19 — refresh `docs/cli-parity-audit-2026-05-19-refresh.md`; F1-F20 closed (F10 partial residual remains as accepted delta #08); L7 harness `261d9e32f` lands re-runnable cli-parity-refresh.sh as the going-forward audit driver. - 2026-06-14 — reclassified two cli-matrix cells from "bug" to intentional BS contract: row 84 (tiebreaker→diskful promotion full-syncs by design, `anyDiskfulPeerHasData` gate avoids the UUID-mint StandAlone wedge) and row 85 (bare `SyncSource`/`SyncTarget` literal is the intended shape when `OutOfSyncKib<=0`). Cells `r-c-over-tiebreaker-skip-sync.sh` and `r-l-conns-shapes.sh` now assert the BS-intended behaviour. +- 2026-06-14 — extended `cli-parity-refresh.sh`'s `normalize_side_output` to collapse ASCII-table border-width drift (pool-name length sets column width) and the upstream-only `DfltRscGrp PlaceCount` render, clearing the `rg l` (05) and `rg l --pastable` (20) false WIRE_SHAPE diffs → both now PARITY. The residual `sp l` (03) divergence is a genuine ENV_TOPOLOGY + FILE_THIN envelope-shape delta and is whitelisted as row 03 (not masked). diff --git a/tests/burnin-blockstor.sh b/tests/burnin-blockstor.sh index 45bff07b..a689d6fe 100755 --- a/tests/burnin-blockstor.sh +++ b/tests/burnin-blockstor.sh @@ -135,20 +135,46 @@ EOF fi DEV=$(on_node "$PRIMARY" bash -c "grep -oE '/dev/drbd[0-9]+' /etc/drbd.d/${RD}.res | head -1") + + # Write 1 MiB urandom on PRIMARY and capture its md5, then read it + # back on PEER and compare. Each remote read GUARDS dd's exit code + # (via PIPESTATUS) AND the byte count: under churn `dd` can fail to + # open /dev/drbdN (EAGAIN, device transiently busy) and read ZERO + # bytes; md5("") is a fixed digest, so an unguarded read produces a + # FALSE mismatch alarm that would drown out a real future divergence. + # The md5 is computed by piping dd STRAIGHT into md5sum (never via a + # shell variable — command substitution strips NUL bytes, which a + # binary 1 MiB read is full of). On a guarded read failure the snippet + # emits the sentinel "READFAIL"; the iteration's compare is then + # SKIPPED (not counted as a real FAIL) and re-tried next iteration. + # `bs=1M count=1` reads exactly 1048576 bytes on success. + EXPECT_BYTES=1048576 PRIMARY_MD5=$(on_node "$PRIMARY" bash -c " drbdadm primary ${RD} dd if=/dev/urandom of=${DEV} bs=1M count=1 status=none oflag=direct - dd if=${DEV} bs=1M count=1 status=none iflag=direct | md5sum | awk '{print \$1}' + md5=\$(dd if=${DEV} bs=1M count=1 iflag=direct 2>/tmp/burnin-dd-primary.err | md5sum | awk '{print \$1}') + rc=\${PIPESTATUS[0]} + n=\$(awk '/bytes/ {print \$1; exit}' /tmp/burnin-dd-primary.err) drbdadm secondary ${RD} + if [ \"\$rc\" -ne 0 ] || [ \"\$n\" != \"${EXPECT_BYTES}\" ]; then echo 'READFAIL'; else echo \"\$md5\"; fi " | tail -1) PEER_MD5=$(on_node "$PEER" bash -c " drbdadm primary ${RD} - dd if=${DEV} bs=1M count=1 status=none iflag=direct | md5sum | awk '{print \$1}' + md5=\$(dd if=${DEV} bs=1M count=1 iflag=direct 2>/tmp/burnin-dd-peer.err | md5sum | awk '{print \$1}') + rc=\${PIPESTATUS[0]} + n=\$(awk '/bytes/ {print \$1; exit}' /tmp/burnin-dd-peer.err) drbdadm secondary ${RD} + if [ \"\$rc\" -ne 0 ] || [ \"\$n\" != \"${EXPECT_BYTES}\" ]; then echo 'READFAIL'; else echo \"\$md5\"; fi " | tail -1) - if [[ "$PRIMARY_MD5" == "$PEER_MD5" ]]; then + if [[ "$PRIMARY_MD5" == "READFAIL" || "$PEER_MD5" == "READFAIL" \ + || -z "$PRIMARY_MD5" || -z "$PEER_MD5" ]]; then + # Transient read failure on at least one side — neither PASS nor + # FAIL. A real mismatch is only credible when BOTH reads succeeded + # and returned the full 1 MiB. + echo "[$(date -u +%FT%TZ)] iter=$ITER SKIP: transient dd read failure (primary='$PRIMARY_MD5' peer='$PEER_MD5'); not comparing" + elif [[ "$PRIMARY_MD5" == "$PEER_MD5" ]]; then PASS=$((PASS + 1)) else FAIL=$((FAIL + 1)) diff --git a/tests/e2e/cli-matrix/bug-278-skipdisk-autoclear-after-reattach.sh b/tests/e2e/cli-matrix/bug-278-skipdisk-autoclear-after-reattach.sh index 7d64d589..7f6c8458 100755 --- a/tests/e2e/cli-matrix/bug-278-skipdisk-autoclear-after-reattach.sh +++ b/tests/e2e/cli-matrix/bug-278-skipdisk-autoclear-after-reattach.sh @@ -11,29 +11,53 @@ # `drbdadm adjust --skip-disk` and the local volume stayed Diskless # even though the kernel was healthy. # -# Bug 278 fix: when the reconciler sees `SkipDisk=True` AND the kernel -# reports the local volume as non-Diskless (UpToDate / Inconsistent / -# Outdated), the satellite releases the observer's SSA claim on the -# SkipDisk key via SkipDiskClearer. The next dispatcher cycle -# re-resolves Spec.Props without SkipDisk, the FSM transitions -# PhaseSkipDisk → PhaseRunning, and the next reconcile dispatches plain -# `drbdadm adjust` to re-attach the lower disk. +# Bug 278 fix (pkg/satellite/reconciler.go runAdjust + SkipDiskClearer): +# on EVERY reconcile, when the reconciler sees the observer-owned +# `SkipDisk=True` stamp AND the kernel probe reports the local volume +# as non-Diskless (HasDisklessVolume==false — UpToDate / Inconsistent / +# Outdated, backing storage attached), the satellite releases the +# observer's SSA claim on the SkipDisk key via SkipDiskClearer. The next +# dispatcher cycle re-resolves Spec.Props without SkipDisk, the FSM +# transitions PhaseSkipDisk → PhaseRunning, and the next reconcile +# dispatches plain `drbdadm adjust` to keep the lower disk attached. # -# Contract this cell pins: +# Two-sided contract this cell pins (BUG-046 triage corrected the +# previous revision — see below): # -# 1. Steady-state: 2-replica diskful RD, both UpToDate. -# 2. Stamp SkipDisk=True onto Resource.Spec.Props on $N2 (simulates -# the observer's defensive write on a transient Failed event the -# kernel emits at Talos upgrade time). -# 3. Restart the satellite pod on $N2 (simulates Talos kernel -# restart). The pod reattaches, sees the SkipDisk prop, but -# kernel state on $N2 is healthy (disk:UpToDate, backing_dev -# present — the lower LV survived the OS upgrade). -# 4. Within 60s assert: Spec.Props NO LONGER contains DrbdOptions/SkipDisk -# AND Status.volumes[0].diskState is back to UpToDate. +# A. PERSISTENCE (positive control). On a replica whose kernel volume +# really is Diskless (HasDisklessVolume==true), the auto-clear gate +# is INTENTIONALLY closed: a defensive SkipDisk stamp there MUST +# survive — that is the exact state SkipDisk is meant to guard. This +# proves the clearer is selective (it does not blindly wipe every +# stamp) and that the stamp wiring works at all. # -# Pre-fix: SkipDisk stays pinned forever; the cell would FAIL at -# step 4 because Spec.Props still carries the key after 60s. +# B. AUTO-CLEAR. On a HEALTHY (UpToDate) replica the satellite auto- +# clears the observer-owned defensive stamp, so the volume stays +# attached after a Talos-upgrade reattach. We restart the satellite +# pod on that node (the documented Talos kernel-restart shape), and +# assert the stamp is gone and the disk is UpToDate. +# +# Why the previous revision could never pass (BUG-046 triage): it stamped +# SkipDisk on a HEALTHY UpToDate replica and then read the prop BACK as a +# setup precondition (`assert stamped == "True"`). But the auto-clear +# fires on EVERY reconcile of a healthy slot, so the satellite cleared +# the stamp between the SSA apply and the read-back — the precondition +# raced the product's correct behaviour and aborted with "stamp did not +# land". The stamped-state on a healthy disk is transient BY DESIGN; the +# cell now asserts the auto-clear directly (poll for ABSENCE, the steady +# state) and uses the genuinely-persistent Diskless replica as the +# positive control instead of a brittle read-back of a doomed stamp. +# +# The SSA stamp is applied under the OBSERVER's field manager +# (`blockstor-satellite-skipdisk`, pkg/satellite/controllers/observer.go) +# so it carries the exact SSA ownership the real defensive write has. +# The auto-clear RELEASES that owner's claim — that is the whole product +# contract: the satellite un-stamps only its own defensive writes, while +# an operator-set SkipDisk (different field manager) would survive. The +# apply doc carries the two +required immutable scalars +# (resourceDefinitionName / nodeName) — same SSA-validation rule the +# observer's writeSkipDiskProp follows; --force-conflicts mirrors the +# observer's ForceOwnership. set -euo pipefail @@ -44,7 +68,7 @@ SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd) # shellcheck source=lib.sh source "$SCRIPT_DIR/lib.sh" -require_workers 2 +require_workers 3 linstor_cli_setup @@ -59,56 +83,90 @@ trap cleanup EXIT N1=$WORKER_1 N2=$WORKER_2 - -echo ">> [Bug 278] 2-replica diskful RD on $N1+$N2" -"${LCTL[@]}" resource-definition create "$RD" >/dev/null -"${LCTL[@]}" volume-definition create "$RD" 128M >/dev/null -"${LCTL[@]}" resource create "$N1" "$RD" --storage-pool=stand >/dev/null -"${LCTL[@]}" resource create "$N2" "$RD" --storage-pool=stand >/dev/null - -echo ">> wait for both diskful UpToDate" -RD="$RD" wait_uptodate "$RD" "$N1" "$N2" - -echo ">> stamp DrbdOptions/SkipDisk=True onto $N2 (simulates pre-upgrade defensive stamp)" -# Server-side-apply under the OBSERVER's field manager -# (`blockstor-satellite-skipdisk`, pkg/satellite/controllers/observer.go) -# so the stamp carries the exact SSA ownership the real defensive write -# has. The Bug 278 auto-clear RELEASES that owner's claim — that is the -# whole product contract: the satellite un-stamps only its own -# defensive writes, while an operator-set SkipDisk (different field -# manager) survives the reattach. The previous revision of this cell -# stamped via `kubectl patch --type=merge` (field manager -# "kubectl-patch"), which the release path correctly refuses to clear — -# the cell was asserting the opposite of the product contract and could -# never pass (BUG-040 triage). -# -# The apply doc must carry the two +required immutable scalars -# (resourceDefinitionName / nodeName) — same SSA-validation rule the -# observer's writeSkipDiskProp follows. -# --force-conflicts mirrors the observer's ForceOwnership: the apply -# doc must claim the two required scalars even though the controller's -# field manager owns them. -kubectl apply --server-side --force-conflicts --field-manager=blockstor-satellite-skipdisk -f - < — SSA-apply DrbdOptions/SkipDisk=True onto +# (RD, NODE).Spec.Props under the observer's field manager, mirroring +# the real defensive stamp's ownership shape. +stamp_skipdisk() { + local node=$1 + kubectl apply --server-side --force-conflicts \ + --field-manager=blockstor-satellite-skipdisk -f - </dev/null apiVersion: blockstor.cozystack.io/v1alpha1 kind: Resource metadata: - name: ${RD}.${N2} + name: ${RD}.${node} spec: resourceDefinitionName: ${RD} - nodeName: ${N2} + nodeName: ${node} props: DrbdOptions/SkipDisk: "True" EOF +} + +# skipdisk_prop — echo the current SkipDisk prop value on +# (RD, NODE).Spec.Props ("True" or empty). +skipdisk_prop() { + local node=$1 + kubectl get "resources.blockstor.cozystack.io/${RD}.${node}" \ + -o jsonpath='{.spec.props.DrbdOptions/SkipDisk}' 2>/dev/null || echo "" +} -echo ">> confirm SkipDisk is stamped on $N2 Spec.Props" -stamped=$(kubectl get "resources.blockstor.cozystack.io/${RD}.${N2}" \ - -o jsonpath='{.spec.props.DrbdOptions/SkipDisk}' 2>/dev/null || echo "") -if [[ "$stamped" != "True" ]]; then - echo "FAIL (Bug 278 setup): SkipDisk stamp did not land (got '$stamped'); aborting" >&2 +echo ">> [Bug 278] 2-replica diskful RD on $N1+$N2, diskless replica on $N3" +"${LCTL[@]}" resource-definition create "$RD" >/dev/null +"${LCTL[@]}" volume-definition create "$RD" 128M >/dev/null +"${LCTL[@]}" resource create "$N1" "$RD" --storage-pool=stand >/dev/null +"${LCTL[@]}" resource create "$N2" "$RD" --storage-pool=stand >/dev/null +"${LCTL[@]}" resource create "$N3" "$RD" --diskless >/dev/null + +echo ">> wait for both diskful UpToDate ($N1, $N2)" +wait_uptodate "$RD" "$N1" "$N2" + +echo ">> wait for $N3 to settle Diskless" +if ! wait_status_diskless "$RD" "$N3" 60; then + echo "FAIL (Bug 278 setup): $N3 never converged to Diskless; aborting" >&2 + exit 1 +fi + +# ---- Part A: PERSISTENCE on a genuinely-Diskless replica ----------------- +# +# HasDisklessVolume==true on $N3, so the auto-clear gate is closed: the +# defensive stamp MUST survive there. This is the positive control that +# proves the clearer is selective (it does not blindly wipe every stamp). +echo ">> [Bug 278/A] stamp SkipDisk=True onto the DISKLESS $N3 (must persist)" +stamp_skipdisk "$N3" + +echo ">> [Bug 278/A] confirm SkipDisk persists on Diskless $N3 (15s observation)" +# Poll for a sustained 15s: the stamp must stay pinned, not flicker away. +persist_ok=true +for _ in $(seq 1 8); do + val=$(skipdisk_prop "$N3") + if [[ "$val" != "True" ]]; then + persist_ok=false + break + fi + sleep 2 +done +if [[ "$persist_ok" != "true" ]]; then + echo "FAIL (Bug 278/A regression): SkipDisk was cleared on the Diskless $N3 — the clearer is over-eager (must only clear on a HEALTHY slot)" >&2 + kubectl get "resources.blockstor.cozystack.io/${RD}.${N3}" \ + -o json 2>/dev/null | jq '{props: .spec.props, status: .status}' >&2 || true exit 1 fi -echo ">> restart satellite pod on $N2 (simulates Talos kernel upgrade reattach)" +# ---- Part B: AUTO-CLEAR on a HEALTHY replica after reattach -------------- +# +# Stamp the same defensive prop onto the HEALTHY $N2. Restart its +# satellite pod (the documented Talos kernel-upgrade reattach shape), +# then assert the satellite auto-clears the observer-owned stamp and the +# disk stays UpToDate. We poll for ABSENCE — the steady state the +# product converges to — rather than the transient stamped state (which +# the clearer correctly wipes on every reconcile of a healthy slot, and +# which the previous revision wrongly tried to read back as a precondition). +echo ">> [Bug 278/B] stamp SkipDisk=True onto the HEALTHY $N2 (simulates pre-upgrade defensive stamp)" +stamp_skipdisk "$N2" + +echo ">> [Bug 278/B] restart satellite pod on $N2 (simulates Talos kernel upgrade reattach)" sat_pod=$(kubectl -n "$NS" get pods -l app=blockstor-satellite \ --field-selector "spec.nodeName=$N2" \ -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || echo "") @@ -119,6 +177,7 @@ fi kubectl -n "$NS" delete pod "$sat_pod" --wait=true >/dev/null echo ">> wait for satellite back up on $N2" +new_pod="" for _ in $(seq 1 60); do new_pod=$(kubectl -n "$NS" get pods -l app=blockstor-satellite \ --field-selector "spec.nodeName=$N2,status.phase=Running" \ @@ -133,17 +192,15 @@ if [[ -z "$new_pod" || "$new_pod" == "$sat_pod" ]]; then exit 1 fi -echo ">> wait up to 60s for the satellite to auto-clear SkipDisk on $N2" +echo ">> [Bug 278/B] wait up to 60s for the satellite to auto-clear SkipDisk on $N2" # Poll Resource.Spec.Props.DrbdOptions/SkipDisk — Bug 278 contract: # the satellite reconciler probes kernel state, sees healthy -# (disk:UpToDate, backing_dev set), and releases the observer's SSA -# claim on the SkipDisk key. After SSA release the apiserver removes -# the key from Spec.Props (no other owner claims it). +# (HasDisklessVolume==false), and releases the observer's SSA claim on +# the SkipDisk key. After SSA release the apiserver removes the key from +# Spec.Props (no other owner claims it). cleared=false for _ in $(seq 1 30); do - val=$(kubectl get "resources.blockstor.cozystack.io/${RD}.${N2}" \ - -o jsonpath='{.spec.props.DrbdOptions/SkipDisk}' 2>/dev/null || echo "") - if [[ -z "$val" ]]; then + if [[ -z "$(skipdisk_prop "$N2")" ]]; then cleared=true break fi @@ -151,15 +208,15 @@ for _ in $(seq 1 30); do done if [[ "$cleared" != "true" ]]; then - echo "FAIL (Bug 278): SkipDisk did NOT auto-clear on $N2 within 60s after satellite restart" >&2 + echo "FAIL (Bug 278): SkipDisk did NOT auto-clear on the HEALTHY $N2 within 60s after satellite restart" >&2 kubectl get "resources.blockstor.cozystack.io/${RD}.${N2}" \ -o json 2>/dev/null | jq '{props: .spec.props, status: .status}' >&2 || true exit 1 fi -echo ">> confirm $N2 disk state is back to UpToDate (re-attached after clear)" +echo ">> [Bug 278/B] confirm $N2 disk state is UpToDate (stayed attached after clear)" if ! wait_status_state "$RD" "$N2" "UpToDate" 60 0; then - echo "FAIL (Bug 278 deep): SkipDisk cleared but $N2 did not re-attach to UpToDate within 60s" >&2 + echo "FAIL (Bug 278 deep): SkipDisk cleared but $N2 did not hold UpToDate within 60s" >&2 kubectl get "resources.blockstor.cozystack.io/${RD}.${N2}" \ -o json 2>/dev/null | jq '{props: .spec.props, status: .status}' >&2 || true exit 1 @@ -175,4 +232,4 @@ if [[ "$n1_disk" != "UpToDate" ]]; then exit 1 fi -echo ">> bug-278-skipdisk-autoclear-after-reattach OK (auto-clear fires on healthy reattach, $N2 back to UpToDate)" +echo ">> bug-278-skipdisk-autoclear-after-reattach OK (A: stamp persists on Diskless $N3; B: auto-clear fires on healthy $N2 reattach, $N2 stays UpToDate)" diff --git a/tests/e2e/cli-matrix/drbd-options-hierarchy-controller.sh b/tests/e2e/cli-matrix/drbd-options-hierarchy-controller.sh index 293ee843..f55e8352 100755 --- a/tests/e2e/cli-matrix/drbd-options-hierarchy-controller.sh +++ b/tests/e2e/cli-matrix/drbd-options-hierarchy-controller.sh @@ -66,10 +66,17 @@ fi # show_max_buffers — echo the effective max-buffers for $RD as # the running kernel sees it. `drbdsetup show` dumps the live config in # the same `net { max-buffers ; }` shape the .res carries. +# +# Capture drbdsetup's output into a variable FIRST, then awk the string. +# Piping `drbdsetup show | awk '... exit'` closed the pipe on the first +# match while drbdsetup was still writing → SIGPIPE on the writer, which +# under `set -o pipefail` surfaced as exit 141 from the command +# substitution in wait_max_buffers and aborted the cell. The here-string +# has no writer to signal, so the early awk `exit` is harmless. show_max_buffers() { - local node=$1 - on_node "$node" drbdsetup show "$RD" 2>/dev/null \ - | awk '/max-buffers/ { gsub(/[;]/,""); print $2; exit }' + local node=$1 out + out=$(on_node "$node" drbdsetup show "$RD" 2>/dev/null) || out="" + awk '/max-buffers/ { gsub(/[;]/,""); print $2; exit }' <<<"$out" } # wait_max_buffers — poll the kernel config until the @@ -99,18 +106,22 @@ echo ">> rd c + vd c + r c (--auto-place=2 -s $POOL)" "${LCTL[@]}" resource create --auto-place=2 --storage-pool="$POOL" "$RD" >/dev/null echo ">> wait for 2 diskful replicas (auto-tiebreaker may add a 3rd, diskless row)" -deadline=$(( $(date +%s) + 60 )) -while :; do - diskful_n=$(linstor_diskful_nodes "$RD" | grep -c . || true) - [[ "$diskful_n" == "2" ]] && break - if (( $(date +%s) > deadline )); then - die "never reached 2 diskful replicas (last=$diskful_n)" - fi - sleep 2 -done - -# Pick a diskful node to probe the kernel config on. -node=$(linstor_diskful_nodes "$RD" | head -1) +# Use wait_diskful_count (DISKLESS/TIE_BREAKER rows excluded) — same +# helper the lifecycle catchers use. The previous `linstor_diskful_nodes +# | grep -c .` / `| head -1` forms raced SIGPIPE under `set -o pipefail`: +# `head -1` closes the pipe after the first node while the helper's +# internal `while read` loop is still emitting the rest, so the pipeline +# exited 141 and `set -e` aborted the cell before C1 ever ran. Consume +# the helper via mapfile/process-substitution (no pipe) — the project +# convention every other cell already follows. +if ! wait_diskful_count "$RD" 2 60; then + die "never reached 2 diskful replicas" +fi + +# Pick a diskful node to probe the kernel config on. mapfile over a +# process substitution — no pipe, so no SIGPIPE under pipefail. +mapfile -t diskful_nodes < <(linstor_diskful_nodes "$RD") +node=${diskful_nodes[0]:-} [[ -n "$node" ]] || die "no diskful node for $RD" echo ">> [C1] assert controller max-buffers ($CTRL_MB) inherited into $RD on $node" diff --git a/tests/e2e/cli-matrix/multi-volume-late-vd-create.sh b/tests/e2e/cli-matrix/multi-volume-late-vd-create.sh index 6c552f35..6c9ef09c 100755 --- a/tests/e2e/cli-matrix/multi-volume-late-vd-create.sh +++ b/tests/e2e/cli-matrix/multi-volume-late-vd-create.sh @@ -69,27 +69,56 @@ echo ">> [Bug 332] rd c + vd c (vol-0)" echo ">> [Bug 332] r c --auto-place=3 -s $POOL" "${LCTL[@]}" resource create --auto-place=3 --storage-pool="$POOL" "$RD" >/dev/null +# Resolve the 3 diskful nodes auto-place picked, so every convergence +# check below reads the per-replica Resource.Status directly. +echo ">> wait for 3 diskful replicas to be placed" +if ! wait_diskful_count "$RD" 3 90; then + echo "FAIL: auto-place did not land 3 diskful replicas" >&2 + "${LCTL[@]}" resource list --resources "$RD" 2>&1 | tail -30 >&2 + exit 1 +fi +mapfile -t DISKFUL_NODES < <(linstor_diskful_nodes "$RD") +if (( ${#DISKFUL_NODES[@]} != 3 )); then + echo "FAIL: expected 3 diskful nodes, got ${#DISKFUL_NODES[@]}: ${DISKFUL_NODES[*]}" >&2 + exit 1 +fi +echo " diskful nodes: ${DISKFUL_NODES[*]}" + +# wait_all_replicas_volume_uptodate — poll +# Resource.Status.volumes[].diskState on EVERY diskful node until +# all report UpToDate, or timeout. Reads the observer-stamped CRD via +# status_disk_state (lib.sh) — the SAME wire surface the passing cells +# use. NOTE: the `--machine-readable resource list` projection leaves +# `.vlms` null on this apiserver, so the previous jq-on-vlms detection +# always read empty and timed out even while `linstor r l` (which reads +# Resource.Status) showed every replica UpToDate. status_disk_state +# reads the populated CRD field instead. +wait_all_replicas_volume_uptodate() { + local vol=$1 timeout=$2 + local deadline=$(( $(date +%s) + timeout )) + local node st all_ok + while (( $(date +%s) < deadline )); do + all_ok=true + for node in "${DISKFUL_NODES[@]}"; do + st=$(status_disk_state "$RD" "$node" "$vol") + if [[ "$st" != "UpToDate" ]]; then + all_ok=false + break + fi + done + if [[ "$all_ok" == "true" ]]; then + return 0 + fi + sleep 3 + done + return 1 +} + # 240s: a 1G x3 initial sync shares the stand with the rest of the -# sweep; the previous 120s budget flaked under load (the failure dump -# printed all three replicas UpToDate moments after the deadline). +# sweep; the previous 120s budget flaked under load. echo ">> wait up to 240s for vol-0 to reach UpToDate on all 3 replicas" -deadline=$(( $(date +%s) + 240 )) -all_up=false -while (( $(date +%s) < deadline )); do - # Per-replica volume-0 state, three rows expected. The wire - # shape is `linstor r l --resources ` → DiskState column. - states=$("${LCTL[@]}" --machine-readable resource list --resources "$RD" 2>/dev/null \ - | jq -r '[.[][]? | .vlms[]? | select(.vlm_nr == 0) | .state.disk_state // "Unknown"] | join(",")' \ - 2>/dev/null || echo "") - if [[ "$states" == "UpToDate,UpToDate,UpToDate" ]]; then - all_up=true - break - fi - sleep 3 -done - -if [[ "$all_up" != "true" ]]; then - echo "FAIL: vol-0 did not reach UpToDate on all 3 replicas within 120s" >&2 +if ! wait_all_replicas_volume_uptodate 0 240; then + echo "FAIL: vol-0 did not reach UpToDate on all 3 replicas within 240s" >&2 "${LCTL[@]}" resource list --resources "$RD" 2>&1 | tail -30 >&2 exit 1 fi @@ -101,32 +130,30 @@ echo ">> [Bug 332] late vd c (vol-1)" echo ">> [Bug 332] late vd c (vol-2)" "${LCTL[@]}" volume-definition create "$RD" 1G >/dev/null -echo ">> wait up to 60s for vol-1 + vol-2 to reach UpToDate on all 3 replicas" -deadline=$(( $(date +%s) + 60 )) -late_up=false -while (( $(date +%s) < deadline )); do - # Pull every (replica, volume) disk_state. We need exactly: - # 3 rows × 3 volumes = 9 disk_state strings, all UpToDate. - # A Bug-332-bitten path will show vol-1/vol-2 stuck on - # "Diskless" with the operator-set DISKLESS flag absent. - states=$("${LCTL[@]}" --machine-readable resource list --resources "$RD" 2>/dev/null \ - | jq -r '[.[][]? | .vlms[]? | .state.disk_state // "Unknown"] | join(",")' \ - 2>/dev/null || echo "") - count_uptodate=$(awk -F, '{ for (i=1;i<=NF;i++) if ($i=="UpToDate") n++ } END { print n+0 }' <<<"$states") - if (( count_uptodate == 9 )); then - late_up=true +# Each late-added 1G volume needs its own initial sync on every replica; +# under sweep load that takes well past the old 60s budget, so give the +# late volumes the same 240s headroom vol-0 gets. +echo ">> wait up to 240s for vol-1 + vol-2 to reach UpToDate on all 3 replicas" +late_up=true +for vol in 1 2; do + if ! wait_all_replicas_volume_uptodate "$vol" 240; then + late_up=false break fi - sleep 3 done if [[ "$late_up" != "true" ]]; then - echo "FAIL (Bug 332): late-added vol-1/vol-2 not UpToDate on all 3 replicas within 60s" >&2 + echo "FAIL (Bug 332): late-added vol-1/vol-2 not UpToDate on all 3 replicas within 240s" >&2 "${LCTL[@]}" resource list --resources "$RD" 2>&1 | tail -40 >&2 - # Surface the smoking gun: a Diskless line for a non-DISKLESS spec. - echo "----- linstor r l --resources $RD (with flags) -----" >&2 - "${LCTL[@]}" --machine-readable resource list --resources "$RD" 2>/dev/null \ - | jq -r '.[][]? | "\(.node_name) vol=\(.vlms[]?.vlm_nr) state=\(.vlms[]?.state.disk_state) flags=\(.flags//[])"' >&2 || true + # Surface the smoking gun: per (node, vol) disk_state straight from + # the populated Resource.Status — a Bug-332-bitten path shows vol-1 + # or vol-2 stuck on Diskless on a non-DISKLESS replica. + echo "----- per (node, vol) Resource.Status diskState -----" >&2 + for node in "${DISKFUL_NODES[@]}"; do + for vol in 0 1 2; do + echo " $node vol=$vol state=$(status_disk_state "$RD" "$node" "$vol")" >&2 + done + done exit 1 fi @@ -135,10 +162,13 @@ fi # MUST NOT report any volume as Diskless when the spec lacks the # DISKLESS flag. This is the kernel-truth assertion that distinguishes # Bug 332 (Unintentional Diskless) from spec-pinned diskless replicas. -echo ">> [Bug 332] kernel-truth: drbdadm status on a diskful node" -satellite_node=$("${LCTL[@]}" --machine-readable resource list --resources "$RD" 2>/dev/null \ - | jq -r '.[][]? | select((.flags//[]) | (map(. == "DISKLESS") | any | not)) | .node_name' \ - 2>/dev/null | head -1) +echo ">> [Bug 332] kernel-truth: drbdsetup status on a diskful node" +# Use the already-resolved diskful node set (reliable, CRD-backed) and +# probe the kernel directly via the satellite pod (on_node, lib.sh) — +# the same drbdsetup status path the parent helpers use. The previous +# `--machine-readable ... | jq .flags | head -1` resolution shared the +# null-vlms / SIGPIPE failure modes of the wait loops above. +satellite_node=${DISKFUL_NODES[0]:-} if [[ -z "$satellite_node" ]]; then echo "SKIP-PARTIAL: could not resolve a diskful node for kernel-truth check" @@ -146,7 +176,7 @@ if [[ -z "$satellite_node" ]]; then exit 0 fi -if status_out=$(kubectl debug node/"$satellite_node" --image=alpine -- chroot /host drbdadm status "$RD" 2>&1); then +if status_out=$(on_node "$satellite_node" drbdsetup status "$RD" --verbose 2>&1); then echo "$status_out" if grep -E 'volume:[12].*disk:Diskless' <<<"$status_out" >/dev/null; then echo "FAIL (Bug 332): diskful node $satellite_node reports volume:1 or volume:2 as Diskless on kernel state" >&2 @@ -154,7 +184,7 @@ if status_out=$(kubectl debug node/"$satellite_node" --image=alpine -- chroot /h exit 1 fi else - echo "SKIP-PARTIAL: kubectl debug to inspect drbdadm status failed (RBAC / image pull); REST-level pin still asserted" + echo "SKIP-PARTIAL: drbdsetup status on $satellite_node failed; REST-level pin still asserted" fi echo ">> multi-volume-late-vd-create OK (Bug 332 pinned: late vd c on $RD brought vol-1/vol-2 to UpToDate, no Unintentional Diskless)" diff --git a/tests/operator-harness/cli-parity-refresh.sh b/tests/operator-harness/cli-parity-refresh.sh index ce9f8424..192d739b 100755 --- a/tests/operator-harness/cli-parity-refresh.sh +++ b/tests/operator-harness/cli-parity-refresh.sh @@ -107,12 +107,27 @@ up() { linstor --controllers "$UP_URL" "$@"; } # # - Storage-pool names differ purely by environment config: BS's stand # uses `stand` / `lvm-thin` / `zfs-thin` / `zfs-thick`, the upstream -# oracle uses `pool`. A `sp l` / `r l` / `vd l` row that is otherwise -# identical would diff solely on the pool column. Collapse every -# side's pool token to a canonical `` so only REAL shape +# oracle uses `pool`. A `sp l` / `r l` / `vd l` / `rg l` row that is +# otherwise identical would diff solely on the pool column. Collapse +# every side's pool token to a canonical `` so only REAL shape # divergence survives the diff. # - The PARITY_PREFIX-suffixed object names already match across sides # (same prefix seeded on both), so they need no normalization. +# - `rg l` carries two extra environment artefacts the bare pool-token +# replace cannot reach (so it previously stayed WIRE_SHAPE even +# though only naming/rendering drift remained): +# 1. ASCII-table border WIDTH drifts with the longest cell: the BS +# pool name `stand` (5) renders one dash wider than the upstream +# `pool` (4), so every `+----+` / `|====|` / `|----|` rule line +# differs by a char that `diff -w` does NOT ignore (dashes are +# not whitespace). Collapse every run of border chars to a +# single canonical token so width drift disappears. +# 2. DfltRscGrp PlaceCount rendering: upstream prints +# `PlaceCount: 2` in the built-in DfltRscGrp's SelectFilter +# cell, BS leaves it blank. The default group's place-count is +# a render choice over the SAME underlying default, not a real +# BS↔UP behavioural delta — canonicalise the DfltRscGrp row's +# SelectFilter cell on both sides. # # Side = "bs" or "up" selects which pool-name set to canonicalize. normalize_side_output() { @@ -131,6 +146,24 @@ normalize_side_output() { # Word-boundary replace so `stand` doesn't clobber `standby` etc. sed -i -E "s/\\b${p}\\b//g" "$file" 2>/dev/null || true done + + # Collapse ASCII-table border-char runs so pool-name length drift + # (which sets the column width, and thus the rule-line length) cannot + # leave a false diff. A border/rule line is composed solely of the + # box-drawing chars `+ | - =` (and spaces); rewrite any run of `-` or + # `=` to a single char. Cell rows (which contain letters/digits) are + # left untouched so genuine content divergence still surfaces. + sed -i -E '/^[-+|= ]+$/ { s/-+/-/g; s/=+/=/g }' "$file" 2>/dev/null || true + + # Canonicalise the built-in DfltRscGrp row's SelectFilter cell: both + # sides render the SAME default place-count, but upstream prints + # `PlaceCount: N` there while BS leaves the cell blank. Remove the + # `PlaceCount: N` text from the DfltRscGrp row entirely so the + # upstream-only render collapses to the blank cell BS emits (the + # leftover spacing is then ignored by the `diff -w` comparator). The + # rewrite is scoped to lines that mention DfltRscGrp so a *user* RG's + # PlaceCount still diffs if it ever diverges. + sed -i -E '/DfltRscGrp/ s/PlaceCount: *[0-9]+//' "$file" 2>/dev/null || true } # reap_residual_resource_groups — delete leftover resource-groups diff --git a/tests/operator-harness/replay/snap-create-multiple-group-consistency.yaml b/tests/operator-harness/replay/snap-create-multiple-group-consistency.yaml index aebff85e..0a169442 100644 --- a/tests/operator-harness/replay/snap-create-multiple-group-consistency.yaml +++ b/tests/operator-harness/replay/snap-create-multiple-group-consistency.yaml @@ -76,8 +76,17 @@ steps: timeout_s: 240 # ---- Bug-046: group snapshot — single suspend barrier, all-or-nothing -- + # CLI form (linstor-client 1.27.1): the snapshot name is a single + # POSITIONAL and the resource names go to `-r`/--resource_names (one + # shared snapshot name across all listed RDs). The snapshot name MUST + # come before `-r` — argparse's variadic `-r A B grp` would greedily + # eat `grp` into --resource_names and leave no positional ("too few + # arguments", exit 2). The previous positional colon form + # (`{{rd}}:grp bug046-multi-b:grp`) is not accepted at all and aborted + # with argparse exit 2, so this step (and thus BUG-046's L7 coverage) + # never ran. - name: snapshot-create-multiple - cmd: ["snapshot", "create-multiple", "{{rd}}:grp", "bug046-multi-b:grp"] + cmd: ["snapshot", "create-multiple", "grp", "-r", "{{rd}}", "bug046-multi-b"] expect_exit: 0 - name: both-snapshots-visible cmd: ["snapshot", "list"]