ADBDEV-8057: Fix pg_rewind leaving replica on old timeline by silent-observer · Pull Request #1970 · arenadata/gpdb

silent-observer · 2025-09-02T14:40:47Z

Remove bogus assertion in pg_rewind

It's entirely possible that the calculated "point of divergence" is
before the end of WAL on the target, because when we calculate the
point of divergence, we assume that the target's WAL extends to
infinity on its current timeline. It might be more clear to calculate
the end of WAL first, and if the target is a direct ancestor of the
source, consider the end of target WAL as the point of
divergence. However, it doesn't change the result: the target is a
direct ancestor of the source, and doesn't need to be rewound.

Add tests to cover the "no rewind required" cases, both when the
target's end of WAL is exactly at the point of divergence, which is
what the removed assertion assumed (case B in the test), and the case
where the assertion failed (case A).

Changes from original patch: test was made more general to prepare for more
detailed tests in the next commits. In particular, we pg_rewind from a
running source node, not from a stopped one, and the source node is passed as
an argument. This doesn't affect the test validity for this patch.

Discussion: https://www.postgresql.org/message-id/18575-1b9b3b60e2a480de@postgresql.org

Alternatively, perhaps just copying the history file would be enough?

Changes from original patch:

Move sourceHistfile initialization to before rewind_parseTimeLineHistory,
since rewind_parseTimeLineHistory actually modifies it.
Don't copy timeline history file when dry_run is specified. Dry run
shouldn't make changes that can potentially break the target server when
applied incorrectly.
Fix 107_empty_conf.pl test to preserve primary_conninfo setting.

Fix pg_rewind edge case when checkpoint is at switchpoint

In one specific edge case, when the source server makes a checkpoint exactly
after switching to a new timeline, switchpoint LSN exactly matches latest
checkpoint LSN. This can happen, for example, if the source has just performed
archive recovery. Then setting minRecoveryPoint to the checkpoint LSN in
pg_rewind is illegal and causes an error at startup, since minRecoveryPoint
actually means the minimum allowed end of WAL, and we want that to be after
the switch occurs, not at the end of the previous timeline.

As a fix, add +1 to the checkpoint LSN when setting minRecoveryPoint, which
now correctly handles this case. Also add a test scenario for the source node
that has performed recovery.

Note: Do not squash to preserve authorship

src/bin/pg_rewind/pg_rewind.c

KnightMurloc · 2025-09-08T05:12:16Z

Use checkpoint+1 to account for the edge case when the checkpoint exactly
matches switchpoint LSN.

Is this not the same case as described in the first commit?

Add the appropriate description in the second commit.

src/bin/pg_rewind/t/010_target_is_ancestor.pl

KnightMurloc · 2025-09-08T05:43:30Z

If switches to the first commit, the tests pass with unchanged to pg_rewind.

It's entirely possible that the calculated "point of divergence" is before the end of WAL on the target, because when we calculate the point of divergence, we assume that the target's WAL extends to infinity on its current timeline. It might be more clear to calculate the end of WAL first, and if the target is a direct ancestor of the source, consider the end of target WAL as the point of divergence. However, it doesn't change the result: the target is a direct ancestor of the source, and doesn't need to be rewound. Add tests to cover the "no rewind required" cases, both when the target's end of WAL is exactly at the point of divergence, which is what the removed assertion assumed (case B in the test), and the case where the assertion failed (case A). Changes from original patch: test was made more general to prepare for more detailed tests in the next commits. In particular, we pg_rewind from a running source node, not from a stopped one, and the source node is passed as an argument. This doesn't affect the test validity for this patch. Discussion: https://www.postgresql.org/message-id/18575-1b9b3b60e2a480de@postgresql.org Ticket: ADBDEV-8057

Alternatively, perhaps just copying the history file would be enough? Changes from original patch: - Move sourceHistfile initialization to before rewind_parseTimeLineHistory, since rewind_parseTimeLineHistory actually modifies it. - Don't copy timeline history file when dry_run is specified. Dry run shouldn't make changes that can potentially break the target server when applied incorrectly. - Fix 107_empty_conf.pl test to preserve primary_conninfo setting. Ticket: ADBDEV-8057

In one specific edge case, when the source server makes a checkpoint exactly after switching to a new timeline, switchpoint LSN exactly matches latest checkpoint LSN. This can happen, for example, if the source has just performed archive recovery. Then setting minRecoveryPoint to the checkpoint LSN in pg_rewind is illegal and causes an error at startup, since minRecoveryPoint actually means the minimum allowed end of WAL, and we want that to be after the switch occurs, not at the end of the previous timeline. As a fix, add +1 to the checkpoint LSN when setting minRecoveryPoint, which now correctly handles this case. Also add a test scenario for the source node that has performed recovery. Ticket: ADBDEV-8057

silent-observer · 2025-09-15T09:04:40Z

I split the patch into three commits now.

The first contains the assert removal and an updated test that can be adapted to second and third commits with minimal changes.
The second contains the bulk of the patch by Heikki, with minor fixes, and updates to the test that only pass after the second commit.
The third commit contains a fix for minRecoveryPoint edge case, and extends the test more to cover it.
Now the test changes are fully sequential: the tests pass on each of the three commits, but only with corresponding changes to pg_rewind present.

src/bin/pg_rewind/pg_rewind.c

Alternatively, perhaps just copying the history file would be enough? Changes from original patch: - Refactor sourceHistfile to be an out parameter instead of global state. - Move sourceHistfile initialization to before rewind_parseTimeLineHistory, since rewind_parseTimeLineHistory actually modifies it. - Don't copy timeline history file when dry_run is specified. Dry run shouldn't make changes that can potentially break the target server when applied incorrectly. - Fix 107_empty_conf.pl test to preserve primary_conninfo setting. Ticket: ADBDEV-8057

In one specific edge case, when the source server makes a checkpoint exactly after switching to a new timeline, switchpoint LSN exactly matches latest checkpoint LSN. This can happen, for example, if the source has just performed archive recovery. Then setting minRecoveryPoint to the checkpoint LSN in pg_rewind is illegal and causes an error at startup, since minRecoveryPoint actually means the minimum allowed end of WAL, and we want that to be after the switch occurs, not at the end of the previous timeline. As a fix, add +1 to the checkpoint LSN when setting minRecoveryPoint, which now correctly handles this case. Also add a test scenario for the source node that has performed recovery. Ticket: ADBDEV-8057

Alternatively, perhaps just copying the history file would be enough? Changes from original patch: - Refactor sourceHistfile to be an out parameter instead of global state. - Move sourceHistfile initialization to before rewind_parseTimeLineHistory, since rewind_parseTimeLineHistory actually modifies it. - Don't copy timeline history file when dry_run is specified. Dry run shouldn't make changes that can potentially break the target server when applied incorrectly. - Fix 107_empty_conf.pl test to preserve primary_conninfo setting. Ticket: ADBDEV-8057

In one specific edge case, when the source server makes a checkpoint exactly after switching to a new timeline, switchpoint LSN exactly matches latest checkpoint LSN. This can happen, for example, if the source has just performed archive recovery. Then setting minRecoveryPoint to the checkpoint LSN in pg_rewind is illegal and causes an error at startup, since minRecoveryPoint actually means the minimum allowed end of WAL, and we want that to be after the switch occurs, not at the end of the previous timeline. As a fix, add +1 to the checkpoint LSN when setting minRecoveryPoint, which now correctly handles this case. Also add a test scenario for the source node that has performed recovery. Ticket: ADBDEV-8057

src/bin/pg_rewind/t/107_empty_conf.pl

src/bin/pg_rewind/pg_rewind.c

In one specific edge case, when the source server makes a checkpoint exactly after switching to a new timeline, switchpoint LSN exactly matches latest checkpoint LSN. This can happen, for example, if the source has just performed archive recovery. Then setting minRecoveryPoint to the checkpoint LSN in pg_rewind is illegal and causes an error at startup, since minRecoveryPoint actually means the minimum allowed end of WAL, and we want that to be after the switch occurs, not at the end of the previous timeline. As a fix, add +1 to the checkpoint LSN when setting minRecoveryPoint, which now correctly handles this case. Also add a test scenario for the source node that has performed recovery. Ticket: ADBDEV-8057

RekGRpth · 2025-09-19T05:21:28Z

Refactor sourceHistfile to be an out parameter instead of global state.

pgindent these your changes

silent-observer · 2025-09-19T06:23:46Z

Is it expected that changes in the src/bin/pg_rewind/t/107_empty_conf.pl test work without a patch?

Yes, changes to 107_empty_conf.pl only fix config line being accidentally deleted, they don't change the nature of the test. 010_target_is_ancestor.pl is the main test for this patch.

It's entirely possible that the calculated "point of divergence" is before the end of WAL on the target, because when we calculate the point of divergence, we assume that the target's WAL extends to infinity on its current timeline. It might be more clear to calculate the end of WAL first, and if the target is a direct ancestor of the source, consider the end of target WAL as the point of divergence. However, it doesn't change the result: the target is a direct ancestor of the source, and doesn't need to be rewound. Add tests to cover the "no rewind required" cases, both when the target's end of WAL is exactly at the point of divergence, which is what the removed assertion assumed (case B in the test), and the case where the assertion failed (case A). Changes from original patch: test was made more general to prepare for more detailed tests in the next commits. In particular, we pg_rewind from a running source node, not from a stopped one, and the source node is passed as an argument. This doesn't affect the test validity for this patch. Discussion: https://www.postgresql.org/message-id/18575-1b9b3b60e2a480de@postgresql.org Ticket: ADBDEV-8057

Alternatively, perhaps just copying the history file would be enough? Changes from original patch: - Refactor sourceHistfile to be an out parameter instead of global state. - Move sourceHistfile initialization to before rewind_parseTimeLineHistory, since rewind_parseTimeLineHistory actually modifies it. - Don't copy timeline history file when dry_run is specified. Dry run shouldn't make changes that can potentially break the target server when applied incorrectly. - Fix 107_empty_conf.pl test to preserve primary_conninfo setting. Ticket: ADBDEV-8057

In one specific edge case, when the source server makes a checkpoint exactly after switching to a new timeline, switchpoint LSN exactly matches latest checkpoint LSN. This can happen, for example, if the source has just performed archive recovery. Then setting minRecoveryPoint to the checkpoint LSN in pg_rewind is illegal and causes an error at startup, since minRecoveryPoint actually means the minimum allowed end of WAL, and we want that to be after the switch occurs, not at the end of the previous timeline. As a fix, add +1 to the checkpoint LSN when setting minRecoveryPoint, which now correctly handles this case. Also add a test scenario for the source node that has performed recovery. Ticket: ADBDEV-8057

Alternatively, perhaps just copying the history file would be enough? Changes from original patch: - Refactor sourceHistfile to be an out parameter instead of global state. - Move sourceHistfile initialization to before rewind_parseTimeLineHistory, since rewind_parseTimeLineHistory actually modifies it. - Don't copy timeline history file when dry_run is specified. Dry run shouldn't make changes that can potentially break the target server when applied incorrectly. - Fix 107_empty_conf.pl test to preserve primary_conninfo setting. Ticket: ADBDEV-8057

In one specific edge case, when the source server makes a checkpoint exactly after switching to a new timeline, switchpoint LSN exactly matches latest checkpoint LSN. This can happen, for example, if the source has just performed archive recovery. Then setting minRecoveryPoint to the checkpoint LSN in pg_rewind is illegal and causes an error at startup, since minRecoveryPoint actually means the minimum allowed end of WAL, and we want that to be after the switch occurs, not at the end of the previous timeline. As a fix, add +1 to the checkpoint LSN when setting minRecoveryPoint, which now correctly handles this case. Also add a test scenario for the source node that has performed recovery. Ticket: ADBDEV-8057

It's entirely possible that the calculated "point of divergence" is before the end of WAL on the target, because when we calculate the point of divergence, we assume that the target's WAL extends to infinity on its current timeline. It might be more clear to calculate the end of WAL first, and if the target is a direct ancestor of the source, consider the end of target WAL as the point of divergence. However, it doesn't change the result: the target is a direct ancestor of the source, and doesn't need to be rewound. Add tests to cover the "no rewind required" cases, both when the target's end of WAL is exactly at the point of divergence, which is what the removed assertion assumed (case B in the test), and the case where the assertion failed (case A). Changes from original patch: test was made more general to prepare for more detailed tests in the next commits. In particular, we pg_rewind from a running source node, not from a stopped one, and the source node is passed as an argument. This doesn't affect the test validity for this patch. Discussion: https://www.postgresql.org/message-id/18575-1b9b3b60e2a480de@postgresql.org Ticket: ADBDEV-8057

Alternatively, perhaps just copying the history file would be enough? Changes from original patch: - Refactor sourceHistfile to be an out parameter instead of global state. - Move sourceHistfile initialization to before rewind_parseTimeLineHistory, since rewind_parseTimeLineHistory actually modifies it. - Don't copy timeline history file when dry_run is specified. Dry run shouldn't make changes that can potentially break the target server when applied incorrectly. - Fix 107_empty_conf.pl test to preserve primary_conninfo setting. Ticket: ADBDEV-8057

silent-observer force-pushed the ADBDEV-8057 branch 2 times, most recently from cc0a1c8 to df9dd60 Compare September 3, 2025 14:00

silent-observer marked this pull request as ready for review September 4, 2025 06:13

RekGRpth reviewed Sep 4, 2025

View reviewed changes

src/bin/pg_rewind/pg_rewind.c Show resolved Hide resolved

This comment was marked as resolved.

Sign in to view

KnightMurloc reviewed Sep 8, 2025

View reviewed changes

src/bin/pg_rewind/t/010_target_is_ancestor.pl Show resolved Hide resolved

silent-observer force-pushed the ADBDEV-8057 branch from df9dd60 to 506a166 Compare September 15, 2025 08:54

KnightMurloc reviewed Sep 16, 2025

View reviewed changes

src/bin/pg_rewind/pg_rewind.c Outdated Show resolved Hide resolved

src/bin/pg_rewind/pg_rewind.c Outdated Show resolved Hide resolved

src/bin/pg_rewind/pg_rewind.c Outdated Show resolved Hide resolved

src/bin/pg_rewind/pg_rewind.c Outdated Show resolved Hide resolved

silent-observer force-pushed the ADBDEV-8057 branch from 506a166 to c06e2a8 Compare September 16, 2025 13:54

silent-observer force-pushed the ADBDEV-8057 branch from c06e2a8 to fbc4fec Compare September 16, 2025 14:03

KnightMurloc reviewed Sep 18, 2025

View reviewed changes

src/bin/pg_rewind/t/107_empty_conf.pl Show resolved Hide resolved

src/bin/pg_rewind/pg_rewind.c Outdated Show resolved Hide resolved

silent-observer force-pushed the ADBDEV-8057 branch from fbc4fec to b713bf6 Compare September 18, 2025 10:20

KnightMurloc previously approved these changes Sep 18, 2025

View reviewed changes

This comment was marked as resolved.

Sign in to view

silent-observer dismissed KnightMurloc’s stale review via cc50848 September 19, 2025 06:40

silent-observer force-pushed the ADBDEV-8057 branch from b713bf6 to cc50848 Compare September 19, 2025 06:40

hlinnaka and others added 2 commits September 19, 2025 07:41

silent-observer force-pushed the ADBDEV-8057 branch from cc50848 to a962757 Compare September 19, 2025 07:42

RekGRpth approved these changes Sep 19, 2025

View reviewed changes

KnightMurloc approved these changes Sep 19, 2025

View reviewed changes

silent-observer merged commit 742e818 into adb-7.2.0 Sep 19, 2025
5 checks passed

silent-observer deleted the ADBDEV-8057 branch September 19, 2025 12:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ADBDEV-8057: Fix pg_rewind leaving replica on old timeline#1970

ADBDEV-8057: Fix pg_rewind leaving replica on old timeline#1970
silent-observer merged 3 commits intoadb-7.2.0from
ADBDEV-8057

silent-observer commented Sep 2, 2025 •

edited

Loading

Uh oh!

Uh oh!

This comment was marked as resolved.

This comment was marked as resolved.

KnightMurloc commented Sep 8, 2025

Uh oh!

Uh oh!

KnightMurloc commented Sep 8, 2025

Uh oh!

silent-observer commented Sep 15, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as resolved.

RekGRpth commented Sep 19, 2025

Uh oh!

silent-observer commented Sep 19, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Comments

Conversation

silent-observer commented Sep 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

This comment was marked as resolved.

This comment was marked as resolved.

KnightMurloc commented Sep 8, 2025

Uh oh!

Uh oh!

KnightMurloc commented Sep 8, 2025

Uh oh!

silent-observer commented Sep 15, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

This comment was marked as resolved.

RekGRpth commented Sep 19, 2025

Uh oh!

silent-observer commented Sep 19, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Comments

silent-observer commented Sep 2, 2025 •

edited

Loading