[feat] PIP-468: sealed-segment retention GC for scalable topics#25668

Merged
merlimat merged 2 commits into apache:master from merlimat:st-segment-gc
May 5, 2026

Conversation

Contributor

@merlimat merlimat commented May 5, 2026

Summary

After a split / merge the parent segment is sealed and accepts no further writes; eventually its data ages out. There was no mechanism to actually delete it. This PR adds:

1. Single owner of segment-topic deletion

The v4 inactive-topic GC would otherwise race the controller for whole-topic deletion of segment-backing topics. PersistentTopic#checkGC now early-returns when the topic is in the segment:// domain — the controller is the sole lifecycle owner.
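A minimal sketch of the early-return guard described above. It is not the actual `PersistentTopic#checkGC` code: the class name, the helper name, and the example topic names are hypothetical, and the only detail taken from the description is that topics in the `segment://` domain are skipped.

```java
/** Hypothetical helper illustrating the guard; not the real PersistentTopic code. */
final class SegmentGcGuard {
    // Domain prefix named in the PR description.
    static final String SEGMENT_SCHEME = "segment://";

    /**
     * Returns true when the inactive-topic GC should skip the topic,
     * leaving whole-topic deletion to the scalable-topic controller.
     */
    static boolean skipInactiveTopicGc(String topicName) {
        return topicName != null && topicName.startsWith(SEGMENT_SCHEME);
    }

    public static void main(String[] args) {
        // Topic names below are made up for illustration.
        System.out.println(skipInactiveTopicGc("segment://tenant/ns/topic/seg-0"));
        System.out.println(skipInactiveTopicGc("persistent://tenant/ns/topic"));
    }
}
```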

2. Periodic GC tick on the ScalableTopicController leader

Each tick:

  • Resolves the effective retention from topic-policies → namespace policy → broker default. Negative ⇒ keep forever, tick is a no-op.
  • Picks sealed segments where (now - sealedAtMs) >= retentionMs.
  • For each candidate, polls every existing subscription's backlog on that segment via the existing /segments/.../backlog admin endpoint. All-zero ⇒ prunable.
  • CAS-prunes the layout (re-validating against the latest layout to handle a concurrent prune by a former leader gracefully), reloads, notifies subscriptions, then deletes the backing topic via admin.topics().deleteAsync(force=true).
  • Layout-prune is the point of no return; backing-topic delete is best-effort and retried on subsequent ticks.
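The selection steps of a tick can be sketched roughly as follows. This is an illustration under stated assumptions, not the controller's code: `RetentionGcSketch`, `SealedSegment`, and all method names are hypothetical, retention values are modeled as optional millisecond overrides, and backlogs are passed in as plain numbers rather than fetched from the admin endpoint.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

/** Hypothetical sketch of the per-tick selection logic. */
final class RetentionGcSketch {
    /** Stand-in for a sealed segment; only the fields the tick needs. */
    record SealedSegment(long id, long sealedAtMs) {}

    /** Topic policy wins over namespace policy, which wins over the broker default. */
    static long effectiveRetentionMs(Optional<Long> topicPolicy,
                                     Optional<Long> namespacePolicy,
                                     long brokerDefault) {
        return topicPolicy.or(() -> namespacePolicy).orElse(brokerDefault);
    }

    /** Negative retention means keep forever, so the tick selects nothing. */
    static List<SealedSegment> candidates(List<SealedSegment> sealed,
                                          long nowMs, long retentionMs) {
        if (retentionMs < 0) {
            return List.of();
        }
        return sealed.stream()
                .filter(s -> nowMs - s.sealedAtMs() >= retentionMs)
                .collect(Collectors.toList());
    }

    /**
     * A candidate is prunable only when every subscription reports zero backlog.
     * Assumption: an empty list (no subscriptions) counts as drained.
     */
    static boolean drained(List<Long> backlogPerSubscription) {
        return backlogPerSubscription.stream().allMatch(b -> b == 0L);
    }
}
```

With a retention of 1000 ms and `nowMs = 5000`, a segment sealed at 3000 is selected (age 2000) while one sealed at 4500 is not (age 500).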

The clock is injectable (java.time.Clock) so tests can fast-forward past retention deterministically. splitSegment / mergeSegments now read the wall-clock through the same Clock so that test ticks compute consistent ages.
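For illustration, a test clock of the kind this enables might look like the sketch below. `MutableClock` and `SegmentAge` are hypothetical names, not classes from the PR; only `java.time.Clock` and its standard methods are real API.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

/** Hypothetical manually-advanced clock for deterministic tests. */
final class MutableClock extends Clock {
    private Instant now;

    MutableClock(Instant start) {
        this.now = start;
    }

    /** Moves time forward without sleeping. */
    void advance(Duration d) {
        now = now.plus(d);
    }

    @Override public ZoneId getZone() { return ZoneOffset.UTC; }
    @Override public Clock withZone(ZoneId zone) { return this; } // zone is irrelevant here
    @Override public Instant instant() { return now; }
}

/** Hypothetical helper: age of a segment as seen through the injected clock. */
final class SegmentAge {
    static long ageMs(Clock clock, long sealedAtMs) {
        return clock.millis() - sealedAtMs;
    }
}
```

Because the seal timestamp and the tick both read the same `Clock`, a test can seal a segment, call `advance(Duration.ofHours(2))`, and observe an age of exactly 7,200,000 ms.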

Test plan

  • testGcTickPrunesDrainedSealedSegmentPastRetention — split, tick within retention (no prune), advance past retention, tick again, assert pruned + delete called.
  • testGcTickRespectsKeepForeverRetention — negative retention leaves the segment in place even after a year of clock advance.
  • Existing scalable + V5 suites green: org.apache.pulsar.broker.service.scalable.* (76 tests), V5SmokeTest, V5SegmentSplitTest, V5DeadLetterPolicyTest.
  • Checkstyle clean (pulsar-broker main + test).

@merlimat merlimat changed the title PIP-468: sealed-segment retention GC for scalable topics [feat] PIP-468: sealed-segment retention GC for scalable topics May 5, 2026

@lhotari lhotari left a comment

I added some review comments.

- Coalesce all eligible prunes for a tick into a single CAS write on the
  metadata znode (was: one CAS per segment, with retry storms when more
  than one or two were eligible). The metadata-reload + subscription
  notify also collapse to a single round-trip.
- Re-check isLeader() (and closed) just before the CAS, since drain
  checks can take seconds and leadership may have flipped meanwhile.
- Use the segment-aware scalableTopics().deleteSegmentAsync admin call
  instead of a hand-built persistent:// URL — this is the same primitive
  ScalableTopicService uses internally and routes correctly to the
  segment's owning broker.
- Rename the lambda parameter that shadowed the prunable() method.
- Add docstring on pruneEligibleAsync explaining behaviour for STREAM /
  QUEUE / CHECKPOINT subscriptions and parent-vs-child pruning order:
  CHECKPOINT subs have no broker-side cursor so the backlog endpoint
  returns NotFound → false, keeping the segment pinned (safe default).
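The coalesced prune with a late leadership check could be sketched as below. This is a simplified model and not the controller's implementation: an `AtomicReference` stands in for the versioned metadata store, and `CoalescedPruneSketch`, `Layout`, and `pruneOnce` are hypothetical names. Returning an empty set on a lost CAS, so that the next tick retries, is an assumption of this sketch.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch: one compare-and-set per tick for all eligible segments. */
final class CoalescedPruneSketch {
    /** Immutable stand-in for the persisted layout. */
    record Layout(Set<Long> sealedSegmentIds) {}

    /** Returns the segment ids actually pruned by this attempt. */
    static Set<Long> pruneOnce(AtomicReference<Layout> store,
                               Set<Long> eligible,
                               BooleanSupplier isLeader) {
        // Drain checks may have taken seconds, so re-check leadership right before writing.
        if (!isLeader.getAsBoolean()) {
            return Set.of();
        }
        Layout current = store.get();
        // Re-validate against the latest layout: skip ids someone else already pruned.
        Set<Long> toPrune = new HashSet<>(eligible);
        toPrune.retainAll(current.sealedSegmentIds());
        if (toPrune.isEmpty()) {
            return Set.of();
        }
        Set<Long> remaining = new HashSet<>(current.sealedSegmentIds());
        remaining.removeAll(toPrune);
        Layout next = new Layout(Set.copyOf(remaining));
        // A single CAS covers every segment pruned in this tick.
        return store.compareAndSet(current, next) ? toPrune : Set.of();
    }
}
```

Only the ids returned by `pruneOnce` would then be handed to the backing-topic delete step, which stays best-effort after the layout write succeeds.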

@lhotari lhotari left a comment

LGTM

@merlimat merlimat merged commit c1a7347 into apache:master May 5, 2026
43 checks passed
@merlimat merlimat deleted the st-segment-gc branch May 5, 2026 22:50