[aws2] Fix flaky SqsIOWriteBatchesTest timeout tests (#38946) #38971
tkaymak wants to merge 1 commit
Summary of Changes
Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses flaky tests in SqsIOWriteBatchesTest caused by reliance on wall-clock timing for batch grouping assertions. By shifting from strict order-based verification to invariant-based verification, the tests are now resilient to varying execution speeds on loaded CI runners while still effectively validating the timeout-driven flushing logic.
Code Review
This pull request refactors several tests in SqsIOWriteBatchesTest.java to use timing-independent invariants instead of strict timing assertions, which helps prevent test flakiness on loaded machines. The feedback focuses on resolving potential Checker Framework nullness warnings by properly wrapping nullable objects (such as the results of Map.get and the @Nullable parameter in assertMessageBodies) with checkNotNull before dereferencing or passing them to assertions.
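The nullness feedback can be illustrated with a minimal sketch. The review names checkNotNull; the example below uses the JDK's Objects.requireNonNull as a dependency-free stand-in, and the class, method, and map names are hypothetical rather than taken from the test.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class NullnessSketch {
  // Hypothetical helper: look up the message bodies captured for one queue.
  // Map.get may return null, so the result is checked before it is used.
  static List<String> bodiesFor(Map<String, List<String>> bodiesByQueue, String queue) {
    List<String> bodies = bodiesByQueue.get(queue);
    // Stand-in for checkNotNull: fails fast with a clear message on a missing key.
    return Objects.requireNonNull(bodies, "no messages captured for queue " + queue);
  }

  public static void main(String[] args) {
    Map<String, List<String>> captured = Map.of("queue-a", List.of("1", "2"));
    System.out.println(bodiesFor(captured, "queue-a")); // prints [1, 2]
  }
}
```

Checking at the lookup site keeps the later assertions free of nullable values, which is the kind of change a nullness checker typically asks for.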
The four timeout-related tests asserted the exact grouping of messages into SendMessageBatch calls. Those groupings depend on wall-clock timing (the per-message Thread.sleep delay racing the configured batch timeout), so on loaded CI runners batches form differently and the strict verify(...).sendMessageBatch(request(exact entries)) checks fail with Mockito ArgumentsAreDifferent.
Rewrite the assertions to verify timing-independent invariants instead: all expected message bodies are sent exactly once, no batch exceeds the size implied by the timeout cadence, and at least the minimum number of batches is produced. This still exercises the timeout-driven flushing (both synchronous and the strict separate-thread variant) without depending on exact wall-clock behavior.
Fixes apache#38946
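The race described in this commit message can be shown with a deterministic toy model. The sketch below is an assumption for illustration, not SqsIO's implementation: a batch is flushed on append once it has been open for at least the timeout, and the arrival times are made-up values standing in for Thread.sleep under idle and loaded conditions.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchRaceSketch {
  // Simplified model (an assumption, not SqsIO's code): on each append, the
  // open batch is flushed first if it has been open for at least timeoutMs.
  static List<List<Integer>> group(long[] arrivalMs, long timeoutMs) {
    List<List<Integer>> batches = new ArrayList<>();
    List<Integer> current = new ArrayList<>();
    long openedAt = 0;
    for (int i = 0; i < arrivalMs.length; i++) {
      if (!current.isEmpty() && arrivalMs[i] - openedAt >= timeoutMs) {
        batches.add(current);
        current = new ArrayList<>();
      }
      if (current.isEmpty()) {
        openedAt = arrivalMs[i]; // a new batch opens with this message
      }
      current.add(i);
    }
    if (!current.isEmpty()) {
      batches.add(current); // final flush at end of input
    }
    return batches;
  }

  public static void main(String[] args) {
    long timeoutMs = 150;
    long[] idle = {0, 100, 200, 300, 400};   // every sleep takes exactly 100 ms
    long[] loaded = {0, 160, 260, 420, 520}; // some sleeps overrun to 160 ms
    System.out.println(group(idle, timeoutMs));   // prints [[0, 1], [2, 3], [4]]
    System.out.println(group(loaded, timeoutMs)); // prints [[0], [1, 2], [3, 4]]
  }
}
```

In this model the exact grouping changes with the delays, yet both runs send every message once, never exceed two messages per batch, and produce at least three batches, which is why invariants of that kind survive a loaded runner.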
Force-pushed from 456be77 to 62a4c6d.
Assigning reviewers: R: @Abacn for label java.
What
Fixes the flaky SqsIOWriteBatchesTest (#38946).
Why
The four timeout-related tests:
- testWriteBatchesWithTimeout
- testWriteBatchesWithStrictTimeout
- testWriteBatchesToDynamicWithTimeout
- testWriteBatchesToDynamicWithStrictTimeout
asserted the exact grouping of messages into SendMessageBatch calls.
Those groupings depend on wall-clock timing (the per-message Thread.sleep delay racing the configured withBatchTimeout), so on loaded CI runners the batches form differently and the strict verify(sqs).sendMessageBatch(request(exact entries)) checks fail with Mockito ArgumentsAreDifferent.
How
Replace the exact-grouping assertions with timing-independent invariants:
- all expected message bodies are sent exactly once (ArgumentCaptor, grouped by queue),
- no batch exceeds the size implied by the timeout cadence,
- at least the minimum number of batches is produced (verify(sqs, atLeast(n))).
This still exercises timeout-driven flushing (both the synchronous on-append path and the strict separate-thread variant) without depending on exact wall-clock behavior. The non-timeout tests are unchanged.
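A minimal sketch of the invariant checks, assuming the batches have already been captured as plain lists of message bodies. In the real test they would come from the mocked client via ArgumentCaptor; the class name, method name, and bounds below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BatchInvariants {
  // Returns true when the captured batches satisfy all three invariants:
  // every expected body sent exactly once, no oversized batch, enough batches.
  static boolean holds(
      List<List<String>> batches, List<String> expectedBodies, int maxBatchSize, int minBatches) {
    List<String> sent = new ArrayList<>();
    for (List<String> batch : batches) {
      if (batch.size() > maxBatchSize) {
        return false; // a batch grew beyond what the timeout cadence allows
      }
      sent.addAll(batch);
    }
    // Compare as sorted lists so grouping and order do not matter,
    // while duplicates and missing bodies still fail the check.
    List<String> expected = new ArrayList<>(expectedBodies);
    Collections.sort(sent);
    Collections.sort(expected);
    return sent.equals(expected) && batches.size() >= minBatches;
  }

  public static void main(String[] args) {
    List<String> expected = List.of("0", "1", "2", "3", "4");
    // Two different groupings of the same messages both pass.
    System.out.println(
        holds(List.of(List.of("0", "1"), List.of("2", "3"), List.of("4")), expected, 2, 3));
    System.out.println(
        holds(List.of(List.of("0"), List.of("1", "2"), List.of("3", "4")), expected, 2, 3));
  }
}
```

The check accepts any grouping that the timing could plausibly produce, but still rejects a run where the timeout never fired (one oversized batch) or where a message was dropped or duplicated.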
Testing
Ran the class 6 times under deliberate 4-core CPU saturation (load avg ~12) with no failures; previously the timeout tests would flake under load.
Fixes #38946