Backport ParallelIterable memory fixes (#9402, #10691, #10978) to 1.5.2#245
Merged
jiang95-dev merged 3 commits intoJun 8, 2026
Conversation
Core: Fix ParallelIterable memory leak where queue continues to be populated even after iterator close (#9402) (cherry picked from commit d3cb1b6)
Core: Limit ParallelIterable memory consumption by yielding in tasks (backport #10691) (#10787) ParallelIterable schedules 2 * WORKER_THREAD_POOL_SIZE tasks for processing input iterables. This defaults to 2 * # CPU cores. When one or some of the input iterables are considerable in size and the ParallelIterable consumer is not quick enough, this could result in unbounded allocation inside `ParallelIterator.queue`. This commit bounds the queue. When queue is full, the tasks yield and get removed from the executor. They are resumed when consumer catches up. (cherry picked from commit 7831a8d) Co-authored-by: Piotr Findeisen <piotr.findeisen@gmail.com> (cherry picked from commit e18a2fe10214f5f3ffa0a317a28af8b2a619817a)
Drop ParallelIterable's queue low water mark (#10979) As part of the change in commit 7831a8d, queue low water mark was introduced. However, it resulted in increased number of manifests being read when planning LIMIT queries in Trino Iceberg connector. To avoid increased I/O, back out the change for now. (cherry picked from commit ed53c6d326cb7efef2d41e26ef001e1d7b17fd78)
d955798 to
4e3efcd
Compare
jiang95-dev
approved these changes
Jun 8, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Type: Backport / cherry-pick
Backports upstream Apache Iceberg
ParallelIterablememory fixes/optimizations (from releases 1.6.0 and 1.6.1) into theopenhouse-1.5.2line.Motivation: prerequisite for switching fork Trino from
org.apache.iceberg(1.6.1) tocom.linkedin.iceberg(1.5.2). Without these,ParallelIterablecan growParallelIterator.queuewithout bound (driver memory pressure / leak after iterator close), which would regress Trino behavior on the 1.5.2 connector.Cherry-picked commits
Each commit was produced with
git cherry-pick -x(original authorship +(cherry picked from commit <sha>)provenance preserved) and applied in upstream chronological order.main)1.6.x)d3cb1b696e18a2fe10ed53c6d32Order matters: applying #9402 first matches upstream history and lets the #10691/#10787 rewrite apply without conflict.
Conflicts
None. All three cherry-picks applied cleanly.
TestParallelIterablewas already on JUnit 5 + Awaitility in 1.5.2.x, so no test adaptation was required. The cherry-picks are byte-for-byte upstream apart from the prependedBackport of apache/iceberg#NNNN to openhouse-1.5.2.line in each commit message.Verification
./gradlew :iceberg-core:test --tests org.apache.iceberg.util.TestParallelIterable— pass (JDK 17; Gradle 8.1.1 requires JDK <= 17).The resulting
core/src/main/java/org/apache/iceberg/util/ParallelIterable.javais byte-identical to apache's post-#10978 state (bcb32818d) apart from non-functional annotations (@VisibleForTesting,@SuppressWarnings) and a test-onlyqueueSize()helper.Note for reviewers
ParallelIterable.javaemits a pre-existingFutureReturnValueIgnorederrorprone warning in the new yielding code — non-fatal, present upstream, build succeeds.