Skip to content

Spark: Add named constant for NO_ADVISORY_PARTITION_SIZE#15681

Merged
singhpk234 merged 2 commits intoapache:mainfrom
jbewing:main
Mar 20, 2026
Merged

Spark: Add named constant for NO_ADVISORY_PARTITION_SIZE#15681
singhpk234 merged 2 commits intoapache:mainfrom
jbewing:main

Conversation

@jbewing
Copy link
Copy Markdown
Contributor

@jbewing jbewing commented Mar 19, 2026

This PR extracts a named constant for NO_ADVISORY_PARTITION_SIZE in SparkWriteRequirements. I'm splitting this out of another PR which took another approach to a change and is making this change as a byproduct.

Related PR: #15150

@@ -26,8 +26,10 @@
/** A set of requirements such as distribution and ordering reported to Spark during writes. */
public class SparkWriteRequirements {

public static final long NO_ADVISORY_PARTITION_SIZE = 0;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this constant needs to be public? Could we make it private for now and only relax the visibility if a concrete use case comes up?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch, fixed in 0ff2adb

@@ -54,6 +56,8 @@ public boolean hasOrdering() {

public long advisoryPartitionSize() {
// Spark prohibits requesting a particular advisory partition size without distribution
return distribution instanceof UnspecifiedDistribution ? 0 : advisoryPartitionSize;
return distribution instanceof UnspecifiedDistribution
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I liked how you moved this into SparkWriteRequirements constructor in the previous pr, I think it's weird that the class stores a value and then ignores it sometimes. We are currently basically saying "hey you can pass in a value, jk it does nothing because a different parameter was passed.

It would probably make sense to prohibit specifying advisory size if distribution is "unspecified" but I think that may break some things.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 , addressed in 0ff2adb

Copy link
Copy Markdown
Member

@RussellSpitzer RussellSpitzer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, the only thing I think it's worth while to keep that constant private as suggested by @ebyhr. If you like you could change the assignment like in the previous pr.

@jbewing
Copy link
Copy Markdown
Contributor Author

jbewing commented Mar 20, 2026

Thanks for the feedback @RussellSpitzer & @ebyhr! I've addressed it in 0ff2adb

@singhpk234 singhpk234 merged commit a16baff into apache:main Mar 20, 2026
24 checks passed
@singhpk234
Copy link
Copy Markdown
Contributor

Thanks for the change @jbewing ! Thank you for review @ebyhr @RussellSpitzer @huaxingao

manuzhang pushed a commit to manuzhang/iceberg that referenced this pull request Mar 30, 2026
)

* Add named constant for `NO_ADVISORY_PARTITION_SIZE`

* PR feedback
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants