@@ -19,11 +19,11 @@ package org.apache.spark.sql.catalyst.xml
 import java.io.Writer
 import java.sql.Timestamp
 import java.util.Base64
-import javax.xml.stream.XMLOutputFactory

 import scala.collection.Map

 import org.apache.hadoop.shaded.com.ctc.wstx.api.WstxOutputProperties
+import org.apache.hadoop.shaded.com.ctc.wstx.stax.WstxOutputFactory

 import org.apache.spark.SparkIllegalArgumentException
 import org.apache.spark.sql.catalyst.InternalRow
@@ -72,7 +72,13 @@ class StaxXmlGenerator(
   private val binaryFormatter = ToStringBase.getBinaryFormatter

   private val gen = {
-    val factory = XMLOutputFactory.newInstance()
+    // Instantiate the Woodstox factory directly from the shaded Hadoop classes instead of
+    // using XMLOutputFactory.newInstance(). The latter resolves an implementation via the
+    // service-loader mechanism, which could pick up a different (unshaded) StAX provider on the
+    // classpath. Such a provider would not understand the shaded WstxOutputProperties keys set
+    // below and would throw IllegalArgumentException. Constructing the shaded factory directly
+    // guarantees the properties and the implementation always match.
+    val factory = new WstxOutputFactory()
     // to_xml disables structure validation to allow multiple root tags
     factory.setProperty(WstxOutputProperties.P_OUTPUT_VALIDATE_STRUCTURE, validateStructure)
     factory.setProperty(WstxOutputProperties.P_OUTPUT_VALIDATE_NAMES, options.validateName)
Comment on lines +81 to 84

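Let's remove the shaded class import and make this best-effort instead of forcing Woodstox. Look the factory up as before, and set each Woodstox property only if the resolved provider supports it: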

  val factory = XMLOutputFactory.newInstance()
  Seq(
    "com.ctc.wstx.outputValidateStructure" -> validateStructure,
    "com.ctc.wstx.outputValidateNames" -> options.validateName
  ).foreach { case (prop, value) =>
    if (factory.isPropertySupported(prop)) factory.setProperty(prop, value)
  }
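
For anyone who wants to try the best-effort idea outside Spark, here is a minimal standalone sketch. The object `BestEffortStax` and the helper `applySupported` are hypothetical names for illustration, not existing Spark code, and the two property keys are simply the strings quoted above. It relies only on `XMLOutputFactory.isPropertySupported` and `setProperty` from the standard StAX API, so it should behave the same whichever provider `newInstance()` resolves:

```scala
import java.io.StringWriter
import javax.xml.stream.XMLOutputFactory

object BestEffortStax {
  // Hypothetical helper (not part of Spark): sets each property only when the
  // resolved StAX provider reports support for it, and returns the names applied.
  def applySupported(
      factory: XMLOutputFactory,
      props: Seq[(String, AnyRef)]): Seq[String] = {
    val supported = props.filter { case (name, _) => factory.isPropertySupported(name) }
    supported.foreach { case (name, value) => factory.setProperty(name, value) }
    supported.map(_._1)
  }

  def main(args: Array[String]): Unit = {
    // Resolved through the usual StAX lookup; may or may not be Woodstox.
    val factory = XMLOutputFactory.newInstance()
    val applied = applySupported(factory, Seq(
      "com.ctc.wstx.outputValidateStructure" -> java.lang.Boolean.FALSE,
      "com.ctc.wstx.outputValidateNames" -> java.lang.Boolean.TRUE))
    println(s"Provider: ${factory.getClass.getName}")
    println(s"Applied properties: ${applied.mkString(", ")}")

    // The writer is usable either way; unsupported properties were skipped, not rejected.
    val out = new StringWriter()
    val writer = factory.createXMLStreamWriter(out)
    writer.writeStartElement("row")
    writer.writeCharacters("1")
    writer.writeEndElement()
    writer.close()
    println(out.toString)
  }
}
```

The trade-off is that with a non-Woodstox provider the validation settings are silently skipped, so it may be worth logging which properties were applied.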
