[feat] PIP-468: sealed-segment retention GC for scalable topics#25668
Merged
merlimat merged 2 commits intoapache:masterfrom May 5, 2026
Merged
[feat] PIP-468: sealed-segment retention GC for scalable topics#25668merlimat merged 2 commits intoapache:masterfrom
merlimat merged 2 commits intoapache:masterfrom
Conversation
After a split/merge the parent segment is sealed and accepts no further
writes; eventually its data ages out. There was no mechanism to actually
delete it. This adds:
1. Single owner of segment-topic deletion. The v4 inactive-topic GC
would otherwise race the controller for whole-topic deletion of
segment-backing topics. PersistentTopic#checkGC now early-returns
when the topic is in the segment:// domain — the controller is the
sole lifecycle owner.
2. Periodic GC tick on the ScalableTopicController leader. Each tick:
- Resolves the effective retention from topic-policies → namespace
policy → broker default. Negative ⇒ keep forever, tick is a no-op.
- Picks sealed segments where (now - sealedAtMs) >= retentionMs.
- For each candidate, polls every existing subscription's backlog
on that segment via the existing /segments/.../backlog admin
endpoint. All-zero ⇒ prunable.
- CAS-prunes the layout (re-validating against the latest layout
to handle a concurrent prune by a former leader gracefully),
reloads, notifies subscriptions, then deletes the backing topic
via admin.topics().deleteAsync(force=true).
- Layout-prune is the point of no return; backing-topic delete is
best-effort and retried on subsequent ticks.
The clock is injectable (java.time.Clock) so tests can fast-forward
past retention deterministically. splitSegment/mergeSegments now
read the wall-clock through the same Clock so that test ticks
compute consistent ages.
Tests:
- testGcTickPrunesDrainedSealedSegmentPastRetention — split, tick
within retention (no prune), advance past retention, tick again,
assert pruned + delete called.
- testGcTickRespectsKeepForeverRetention — negative retention leaves
the segment in place even after a year of clock advance.
lhotari
reviewed
May 5, 2026
lhotari
reviewed
May 5, 2026
lhotari
reviewed
May 5, 2026
lhotari
reviewed
May 5, 2026
Member
lhotari
left a comment
There was a problem hiding this comment.
I added some review comments.
- Coalesce all eligible prunes for a tick into a single CAS write on the metadata znode (was: one CAS per segment, with retry storms when more than one or two were eligible). The metadata-reload + subscription notify also collapse to a single round-trip. - Re-check isLeader() (and closed) just before the CAS, since drain checks can take seconds and leadership may have flipped meanwhile. - Use the segment-aware scalableTopics().deleteSegmentAsync admin call instead of a hand-built persistent:// URL — this is the same primitive ScalableTopicService uses internally and routes correctly to the segment's owning broker. - Rename the lambda parameter that shadowed the prunable() method. - Add docstring on pruneEligibleAsync explaining behaviour for STREAM / QUEUE / CHECKPOINT subscriptions and parent-vs-child pruning order: CHECKPOINT subs have no broker-side cursor so the backlog endpoint returns NotFound → false, keeping the segment pinned (safe default).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
After a split / merge the parent segment is sealed and accepts no further writes; eventually its data ages out. There was no mechanism to actually delete it. This PR adds:
1. Single owner of segment-topic deletion
The v4 inactive-topic GC would otherwise race the controller for whole-topic deletion of segment-backing topics.
PersistentTopic#checkGCnow early-returns when the topic is in thesegment://domain — the controller is the sole lifecycle owner.2. Periodic GC tick on the
ScalableTopicControllerleaderEach tick:
(now - sealedAtMs) >= retentionMs./segments/.../backlogadmin endpoint. All-zero ⇒ prunable.admin.topics().deleteAsync(force=true).The clock is injectable (
java.time.Clock) so tests can fast-forward past retention deterministically.splitSegment/mergeSegmentsnow read the wall-clock through the sameClockso that test ticks compute consistent ages.Test plan
testGcTickPrunesDrainedSealedSegmentPastRetention— split, tick within retention (no prune), advance past retention, tick again, assert pruned + delete called.testGcTickRespectsKeepForeverRetention— negative retention leaves the segment in place even after a year of clock advance.org.apache.pulsar.broker.service.scalable.*(76 tests),V5SmokeTest,V5SegmentSplitTest,V5DeadLetterPolicyTest.pulsar-brokermain + test).