Skip to content

Spark 4.1: Fix Partition info not displayed in job description for RewritePositionDeleteFilesSparkAction#15175

Open
manuzhang wants to merge 1 commit into
apache:mainfrom
manuzhang:partition-desc
Open

Spark 4.1: Fix Partition info not displayed in job description for RewritePositionDeleteFilesSparkAction#15175
manuzhang wants to merge 1 commit into
apache:mainfrom
manuzhang:partition-desc

Conversation

@manuzhang

@manuzhang manuzhang commented Jan 29, 2026

Copy link
Copy Markdown
Member

Closes #12414

After Fix:
Google Chrome 2026-01-29 22 58 48

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes an issue where partition information was not properly displayed in job descriptions for the RewritePositionDeleteFilesSparkAction in Spark 4.1. The fix adds a description field to StructProjection to carry partition information through the projection process.

Changes:

  • Modified StructProjection to store and return a description string
  • Updated PartitionUtil.coercePartition() to pass partition description when wrapping PartitionData
  • Modified RewritePositionDeleteFilesSparkAction to extract and use the partition description in job descriptions

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
api/src/main/java/org/apache/iceberg/util/StructProjection.java Added description field and getter method, plus overloaded wrap method to accept description
core/src/main/java/org/apache/iceberg/util/PartitionUtil.java Updated to pass partition description when wrapping PartitionData instances
spark/v4.1/spark/src/main/java/org/apache/iceberg/spark/actions/RewritePositionDeleteFilesSparkAction.java Added helper method to extract partition description for job descriptions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread core/src/main/java/org/apache/iceberg/util/PartitionUtil.java Outdated
@manuzhang manuzhang force-pushed the partition-desc branch 2 times, most recently from be1b165 to 5beb72d Compare January 29, 2026 02:55
@manuzhang manuzhang requested a review from Copilot January 29, 2026 02:55

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread api/src/main/java/org/apache/iceberg/util/StructProjection.java

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread api/src/main/java/org/apache/iceberg/util/StructProjection.java
Comment thread api/src/main/java/org/apache/iceberg/util/StructProjection.java Outdated
@manuzhang manuzhang force-pushed the partition-desc branch 2 times, most recently from f85cd05 to 85a351f Compare January 29, 2026 03:08
@manuzhang manuzhang marked this pull request as ready for review January 29, 2026 15:09
@manuzhang manuzhang requested a review from stevenzwu February 3, 2026 09:04
@manuzhang manuzhang requested review from huaxingao and nastra March 2, 2026 03:36
Comment thread api/src/main/java/org/apache/iceberg/util/StructProjection.java Outdated
@manuzhang manuzhang force-pushed the partition-desc branch 2 times, most recently from 377bc26 to bcf599e Compare March 3, 2026 09:47
@@ -377,4 +378,8 @@ private String jobDesc(
table.name());
}
}

private String partitionDescription(StructLike struct) {
return struct instanceof StructProjection ? ((StructProjection) struct).description() : "";

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't really need to cast with JDK 17 anymore. we can define the variable name already as part of the instanceof

@manuzhang manuzhang force-pushed the partition-desc branch 2 times, most recently from e0aad64 to b80e536 Compare March 3, 2026 11:00
@manuzhang manuzhang requested a review from nastra March 4, 2026 06:36
@@ -362,7 +363,7 @@ private String jobDesc(
runner.description(),
group.info().globalIndex(),
plan.totalGroupCount(),
partition,
partitionDescription(partition),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have the same issue in RewriteDataFilesSparkAction?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why, but RewriteDataFilesSparkAction doesn't call coercePartition.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without coercePartition, the partition info can be displayed in job description.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see, got it, thanks. Nevertheless, I think we should still aim for implementing a proper toString() method


public StructProjection copyFor(StructLike newStruct) {
return new StructProjection(type, positionMap, nestedProjections).wrap(newStruct);
return new StructProjection(type, positionMap, nestedProjections).wrap(newStruct, description);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking more about this, I'm not sure that's the right approach going forward. Maybe it's better to implement toString and defer to the underlying struct in the toString implementation.

@Override
  public String toString() {
    return null != struct ? struct.toString() : null;
  }

that way we don't need any of the other changes and we can also make sure that this is properly handled not only for RewritePositionDeleteFilesAction but also for the action for rewriting data files

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, like in that PR, although @stevenzwu has a point in that it should only include the projected fields

@manuzhang

manuzhang commented Mar 20, 2026

Copy link
Copy Markdown
Member Author

@nastra Please take another look. I followed the handling of coerced PartitionData in https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/PartitionStatsHandler.java#L510 and no other changes are needed now.

@manuzhang manuzhang closed this Mar 23, 2026
@manuzhang manuzhang reopened this Mar 23, 2026
@manuzhang manuzhang requested a review from nastra March 23, 2026 09:16
@github-actions

Copy link
Copy Markdown

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@manuzhang

Copy link
Copy Markdown
Member Author

@nastra @huaxingao gentle ping for review. Thanks!

@manuzhang manuzhang added this to the Iceberg 1.12.0 milestone Jun 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Partition info is not displayed in job description for RewritePositionDeleteFilesSparkAction

3 participants