
[fix][test] Fix flaky AdminApiTest.persistentTopicsCursorResetAfterReset timeout #25692

Open
merlimat wants to merge 1 commit into apache:master from merlimat:fix/flaky-admin-api-cursor-reset


merlimat commented May 5, 2026

Motivation

AdminApiTest.persistentTopicsCursorResetAfterReset flakes by hitting the 5-minute test-method timeout:

ThreadTimeoutException: Method
o.a.p.b.admin.AdminApiTest.persistentTopicsCursorResetAfterReset()
didn't finish within the time-out 300000
    at j.u.c.locks.LockSupport.park(LockSupport.java:371)
    at o.a.p.c.u.collections.GrowableArrayBlockingQueue.take(...)
    at o.a.p.c.impl.ConsumerImpl.internalReceive(ConsumerImpl.java:531)
    at o.a.p.c.impl.ConsumerBase.receive(ConsumerBase.java:282)
    at AdminApiTest.persistentTopicsCursorResetAfterReset(:2945)

The test uses consumer.receive() (no timeout) inside a loop after admin.topics().resetCursor(...). If the broker doesn't push the redelivery in time, the consumer blocks indefinitely until the test method timeout fires.

Failing test scan: https://scans.gradle.com/s/tihvvhomboopa/tests/task/:pulsar-broker:test/details/org.apache.pulsar.broker.admin.AdminApiTest/persistentTopicsCursorResetAfterReset%5B4%5D(simple-topicName)/1/output

Modifications

Replace every consumer.receive() in this test with consumer.receive(30, TimeUnit.SECONDS) and assert the result is non-null with a descriptive message. The test now fails fast in ~30 seconds with a clear diagnostic identifying which resetCursor step did not deliver, instead of hanging for the full 5-minute timeout.
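
The bounded-wait pattern described above can be sketched with the JDK alone. This is only an illustration, not the code in the PR: a `BlockingQueue` stands in for the Pulsar consumer (`poll(timeout, unit)` plays the role of `consumer.receive(30, TimeUnit.SECONDS)`, which likewise returns null on timeout), and the class name `BoundedReceiveSketch`, the helper `receiveOrFail`, and the step labels are hypothetical names chosen for the example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class BoundedReceiveSketch {

    // Hypothetical helper: waits at most `timeout` for the next element and
    // fails with a message naming the step, instead of parking forever the
    // way an untimed take()/receive() would.
    static <T> T receiveOrFail(BlockingQueue<T> queue, long timeout, TimeUnit unit, String step)
            throws InterruptedException {
        T msg = queue.poll(timeout, unit); // null if nothing arrives in time
        if (msg == null) {
            throw new AssertionError("No message received within " + timeout + " "
                    + unit + " after " + step);
        }
        return msg;
    }

    public static void main(String[] args) throws InterruptedException {
        // Stand-in for the consumer's receive queue (assumption for this sketch).
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.put("message-0");

        // A message is available, so this returns immediately.
        System.out.println(receiveOrFail(queue, 1, TimeUnit.SECONDS, "first resetCursor"));

        // The queue is now empty, so this fails after the bounded wait.
        try {
            receiveOrFail(queue, 100, TimeUnit.MILLISECONDS, "second resetCursor");
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the real test the same shape would presumably be a timed `receive` followed by a non-null assertion whose message names the `resetCursor` step, so a missing redelivery surfaces as an ordinary assertion failure rather than a `ThreadTimeoutException`.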

Verifying this change

  • AdminApiTest.persistentTopicsCursorResetAfterReset passes locally with the change.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API: no
  • The schema: no
  • The default values of configurations: no
  • The threading model: no
  • The binary protocol: no
  • The REST endpoints: no
  • The admin CLI options: no
  • The metrics: no
  • Anything that affects deployment: no

…set timeout

The test calls `consumer.receive()` (no timeout) inside a loop after
`admin.topics().resetCursor(...)`. If the broker doesn't push the
expected redelivery in time, the consumer blocks indefinitely on
`GrowableArrayBlockingQueue.take()` until the 5-minute test timeout
fires:

```
ThreadTimeoutException: Method
o.a.p.b.admin.AdminApiTest.persistentTopicsCursorResetAfterReset()
didn't finish within the time-out 300000
    at j.u.c.locks.LockSupport.park(LockSupport.java:371)
    at o.a.p.c.u.collections.GrowableArrayBlockingQueue.take(...)
    at o.a.p.c.impl.ConsumerImpl.internalReceive(ConsumerImpl.java:531)
    at o.a.p.c.impl.ConsumerBase.receive(ConsumerBase.java:282)
    at AdminApiTest.persistentTopicsCursorResetAfterReset(:2945)
```

Switch every `consumer.receive()` in this test to
`receive(30, TimeUnit.SECONDS)` and assert the message is non-null
with a descriptive message. Now if the redelivery genuinely doesn't
arrive, the test fails fast (in 30s) with a clear diagnostic instead
of hanging for the full 5-minute timeout.