
Conversation

slfan1989 (Contributor) commented Jan 18, 2026

Which issue does this PR close?

Closes #1404.

Rationale for this change

[AURON#1404] Support for Spark 4.0.1 Compatibility in Auron.

What changes are included in this PR?

To support Spark 4, Auron needs to be adapted accordingly. Celeborn already supports Spark 4.0, and Iceberg has supported it for some time; the Iceberg community has voted to deprecate Spark 3.4 support, which will be removed soon.

For this PR, I have made the following changes:

  • Three changes were needed to fix compilation errors:

    • NativeShuffleExchangeExec#ShuffleWriteProcessor: SPARK-44605 restructured the write method of the ShuffleWriteProcessor API, so I refactored the partition and RDD handling to retrieve both from the shuffle dependency, keeping compatibility with the other interfaces. In the future we should switch to the new interface and adjust nativeRssShuffleWrite / nativeShuffleWrite accordingly.

    • NativeBroadcastExchangeBase#getBroadcastTimeout: In Spark 4.0, the broadcast timeout must be fetched through SparkSession.getActiveSession.

    • NativeBroadcastExchangeBase#getRelationFuture: In Spark 4.0, the concrete SparkSession type changed to org.apache.spark.sql.classic.SparkSession, so I adjusted how the session is accessed (a hedged sketch follows this list).
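
A minimal sketch of that session-type adjustment, assuming the shim casts the active session to the classic implementation; the helper name classicSession is illustrative, not Auron's actual code:

// Hedged sketch: Spark 4.0 splits SparkSession into an abstract API type and
// the concrete org.apache.spark.sql.classic.SparkSession. Code that needs the
// concrete type (e.g. in getRelationFuture) can cast the active session.
@sparkver("4.0")
protected def classicSession: org.apache.spark.sql.classic.SparkSession =
  org.apache.spark.sql.SparkSession.getActiveSession
    .getOrElse(throw new IllegalStateException("No active SparkSession"))
    .asInstanceOf[org.apache.spark.sql.classic.SparkSession]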

Are there any user-facing changes?

No.

How was this patch tested?

CI.

Signed-off-by: slfan1989 <slfan1989@apache.org>

Copilot AI left a comment


Pull request overview

This PR adds Spark 4.0.1 compatibility to Auron by addressing three main compilation issues and updating version annotations across the codebase.

Changes:

  • Added Maven profile and build script configuration for Spark 4.0 with Java 17+ and Scala 2.13 requirements
  • Implemented Spark 4.0-specific method overrides for broadcast exchange, shuffle write, and Hive table operations to handle API changes (SPARK-44605, session type changes, servlet API migration)
  • Added getVariant method implementations in columnar classes and updated @sparkver annotations throughout to include "4.0" (a hedged getVariant sketch follows this list)
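
Spark 4.0 extends the getters contract of columnar rows and arrays with getVariant; a minimal sketch of such an override, assuming the wrapper delegates to an underlying accessor named data (hypothetical, not Auron's actual member):

import org.apache.spark.unsafe.types.VariantVal

// Hedged sketch of the Spark 4.0 override; "data" is a placeholder for
// whatever column vector or row the Auron wrapper actually reads from.
@sparkver("4.0")
override def getVariant(ordinal: Int): VariantVal = data.getVariant(ordinal)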

Reviewed changes

Copilot reviewed 41 out of 43 changed files in this pull request and generated 3 comments.

Summary per file:

  • pom.xml: Added Spark 4.0 Maven profile with Java 17+ and Scala 2.13 enforcement
  • auron-build.sh: Added "4.0" to the supported Spark versions list
  • NativeBroadcastExchangeBase.scala: Added Spark 4.0 implementations of getBroadcastTimeout and getRelationFuture to handle session API changes
  • NativeShuffleExchangeExec.scala: Added a Spark 4.0 write method override for the SPARK-44605 API changes, plus a shuffleId property
  • NativeParquetInsertIntoHiveTableExec.scala: Added an AuronInsertIntoHiveTable40 class to handle the classic.SparkSession type change and removed an unused import
  • AuronAllExecutionsPage.scala: Added a Spark 4.0 render method override for the jakarta.servlet migration and added a sparkver dependency
  • AuronColumnarStruct/BatchRow/Array.scala: Added getVariant method implementations for Spark 4.0 VariantVal support
  • Multiple exec and provider files: Updated @sparkver annotations from "3.x" ranges to include "/ 4.0"
  • Test files: Removed unused variables and updated test suite version annotations
  • Multiple shim files: Code formatting adjustments (indentation) with no logic changes



@sparkver("4.0")
override def shuffleId: Int = {
  shuffleDependency.shuffleId;

Copilot AI Jan 19, 2026


The semicolon at the end of line 232 is unnecessary in Scala and should be removed for consistency with Scala style conventions.

Suggested change
shuffleDependency.shuffleId;
shuffleDependency.shuffleId

Comment on lines +163 to +200
override def write(
    inputs: Iterator[_],
    dep: ShuffleDependency[_, _, _],
    mapId: Long,
    mapIndex: Int,
    context: TaskContext): MapStatus = {

  // [SPARK-44605][CORE] Refined the internal ShuffleWriteProcessor API.
  // The write method was restructured, so the original partition handling is
  // refactored to recover the RDD and partition from the shuffle dependency.
  val rdd = dep.rdd
  val partition = rdd.partitions(mapIndex)

  val writer = SparkEnv.get.shuffleManager.getWriter(
    dep.shuffleHandle,
    mapId,
    context,
    createMetricsReporter(context))

  writer match {
    case writer: AuronRssShuffleWriterBase[_, _] =>
      writer.nativeRssShuffleWrite(
        rdd.asInstanceOf[MapPartitionsRDD[_, _]].prev.asInstanceOf[NativeRDD],
        dep,
        mapId.toInt,
        context,
        partition,
        numPartitions)

    case writer: AuronShuffleWriterBase[_, _] =>
      writer.nativeShuffleWrite(
        rdd.asInstanceOf[MapPartitionsRDD[_, _]].prev.asInstanceOf[NativeRDD],
        dep,
        mapId.toInt,
        context,
        partition)
  }
  writer.stop(true).get
}

Copilot AI Jan 19, 2026


The inputs parameter is not used in this method implementation. According to SPARK-44605, the API was refined to pass an Iterator instead of RDD, but this implementation retrieves the RDD from the dependency and reconstructs the partition. Verify that ignoring the inputs parameter is intentional and correct for the Auron native execution model, or consider using it if it contains the actual input data for this map task.

Comment on lines +147 to +150
def getBroadcastTimeout: Long = {
  SparkSession.getActiveSession
    .map(_.conf.get("spark.sql.broadcastTimeout").toLong)
    .getOrElse(300L)

Copilot AI Jan 19, 2026


The default timeout value of 300L seconds (5 minutes) is hardcoded. This should match Spark's default broadcast timeout configuration value. Verify that 300 seconds is the correct default for Spark 4.0, or consider using Spark's configuration constant if available to ensure consistency with Spark's defaults.
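
For comparison, a hedged alternative that leans on Spark's own config entry instead of the hardcoded fallback, assuming org.apache.spark.sql.internal.SQLConf is reachable from this shim:

import org.apache.spark.sql.internal.SQLConf

// SQLConf.get resolves the active session's conf (or a default one), and
// broadcastTimeout reads spark.sql.broadcastTimeout with Spark's built-in
// default, avoiding the hand-written 300L constant.
def getBroadcastTimeout: Long = SQLConf.get.broadcastTimeout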
