Skip to content

[SPARK-56455][SQL][TESTS] Add test for broadcast failure when exceeding maxBroadcastTableSize#55313

Closed
pratham76 wants to merge 1 commit intoapache:masterfrom
pratham76:test-broadcast
Closed

[SPARK-56455][SQL][TESTS] Add test for broadcast failure when exceeding maxBroadcastTableSize#55313
pratham76 wants to merge 1 commit intoapache:masterfrom
pratham76:test-broadcast

Conversation

@pratham76
Copy link
Copy Markdown
Contributor

@pratham76 pratham76 commented Apr 12, 2026

What changes were proposed in this pull request?

Currently, there is no validation for enforcement of spark.sql.maxBroadcastTableSize.

This change introduces unit tests that verifies that proper exception is thrown when exceeding the threshold determined by the above config.

Why are the changes needed?

This helps prevent regressions and ensures consistent enforcement of broadcast size limits.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing Test Suites -- additional test cases added

Was this patch authored or co-authored using generative AI tooling?

No

@pratham76
Copy link
Copy Markdown
Contributor Author

@sunchao @dongjoon-hyun Could you have a look? Thanks!

Copy link
Copy Markdown
Member

@sunchao sunchao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one nit

val ex = intercept[SparkException] {
joinDF.collect()
}
assert(ex.getMessage.contains("Cannot broadcast"))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this check is probably to board: since we are testing against maxBroadcastTableSize, we can check the error message directly here or the error code _LEGACY_ERROR_TEMP_2249.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sunchao, have used the approach of using error code to verify here.

@HyukjinKwon HyukjinKwon changed the title [SPARK-56455][SQL] Add test coverage for maxBroadcastTableSize config enforcement [SPARK-56455][SQL][TESTS] Add test coverage for maxBroadcastTableSize config enforcement Apr 12, 2026
@pratham76
Copy link
Copy Markdown
Contributor Author

@sunchao If there are no more changes required, could we merge this PR? Also please do let know if i can add it for 4.1.x as well, as the change was introduced then on.

}

test("SPARK-56455: maxBroadcastTableSize config enforcement") {
withSQLConf(SQLConf.MAX_BROADCAST_TABLE_SIZE.key -> "104857600") {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that 104857600 and spark.range(100) are roughly related, this positive case seems redundant because the other test cases covers this case already.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. Have removed the redundant case.

assert(joinDF.collect().length == 100)
}

withSQLConf(SQLConf.MAX_BROADCAST_TABLE_SIZE.key -> "100") {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this negative case, yes. I agree that it seems that we didn't have explicitly before. Could you revise the PR title and test case name specifically for this because the current PR title, "Add test coverage for maxBroadcastTableSize config enforcement", is too broad and ignores the existing test coverage.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted. Could i change the title as "Add test for broadcast failure when exceeding maxBroadcastTableSize"? This seems to be appropriate for the test coverage. Do let me know your thoughts. Thanks!

@pratham76 pratham76 changed the title [SPARK-56455][SQL][TESTS] Add test coverage for maxBroadcastTableSize config enforcement [SPARK-56455][SQL][TESTS] Add test for broadcast failure when exceeding maxBroadcastTableSize Apr 13, 2026
@pratham76
Copy link
Copy Markdown
Contributor Author

@dongjoon-hyun Could you inform if the changes are okay to proceed with? Thanks!

@pratham76 pratham76 requested a review from dongjoon-hyun April 13, 2026 18:51
Copy link
Copy Markdown
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Thank you, @pratham76 .

@dongjoon-hyun
Copy link
Copy Markdown
Member

Pending CIs.

@pratham76
Copy link
Copy Markdown
Contributor Author

Thanks @dongjoon-hyun. CI is successful can we merge this PR?

@dongjoon-hyun
Copy link
Copy Markdown
Member

Of course, @pratham76 .

@dongjoon-hyun
Copy link
Copy Markdown
Member

dongjoon-hyun commented Apr 14, 2026

Merged to master. I added you to the Apache Spark contributor group and assigned SPARK-56455 to you.

Welcome to the Apache Spark community and congratulations for your first JIRA issue (and the second commit).

@pratham76
Copy link
Copy Markdown
Contributor Author

Thank you @dongjoon-hyun! 😊

Could we also port this test to branch-4.1 as the initial change was added from 4.1 release onwards?

@dongjoon-hyun
Copy link
Copy Markdown
Member

dongjoon-hyun commented Apr 14, 2026

This will be a part of Apache Spark 4.2.0 whose Feature Freeze is scheduled in 2 weeks. (2026-05-01).

https://spark.apache.org/versioning-policy.html

Screenshot 2026-04-14 at 09 50 19

So, although I understand your passion, no, the requested backporting is not required in the community, @pratham76 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants