Delta pipeline fix tests #12386

Draft
felipepessoto wants to merge 6 commits into apache:main from felipepessoto:delta_pipeline_fix_tests

@felipepessoto
Contributor

What changes are proposed in this pull request?

How was this patch tested?

Was this patch authored or co-authored using generative AI tooling?

…es baseline

Run delta-io/delta's `spark` ScalaTest suite against a Gluten Velox bundle in CI
and gate the results against a committed baseline so the many expected Delta-on-
Gluten failures stay manageable and can be fixed incrementally without letting
currently-passing tests silently regress.

What it adds (.github/workflows/util/delta-spark-ut/):
- delta_spark_ut.yml: builds the native lib + Gluten bundle, then runs the Delta
  spark suite sharded by suite into 4 shards x 4 forked test JVMs (~16-way), and
  gates each shard against the baseline.
- compare-test-results.py: the gate. Per shard, regressions (failures not in the
  baseline) fail the build; newly-passing baselined tests are flagged so the
  baseline can be tightened. Also supports seed/aggregate modes.
- known-failures.txt: the committed baseline of expected failures.
- setup-delta.sh: clones Delta, injects the Gluten bundle, patches
  DeltaSQLCommandTest, and force-fails the two DeletionVectorsSuite 2B-row tests
  whose native row-index materialization OOM-kills the runner and hangs the shard.
- README.md: how the pipeline, gating and baseline-refresh work.
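
The per-shard gating rule described for compare-test-results.py can be sketched
as plain set arithmetic. This is a minimal illustration, not the script itself:
the names `GateResult` and `gate_shard` are hypothetical, and restricting
"newly passing" to tests the shard actually ran is an assumption, made so that
baselined tests belonging to other shards are not flagged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    regressions: frozenset    # failed in this shard, but not in the baseline
    newly_passing: frozenset  # baselined, ran in this shard, and passed

    @property
    def ok(self) -> bool:
        # Only regressions fail the build; newly passing tests are advisory.
        return not self.regressions


def gate_shard(ran, failed, baseline) -> GateResult:
    """Compare one shard's results against the committed baseline."""
    ran, failed, baseline = frozenset(ran), frozenset(failed), frozenset(baseline)
    regressions = failed - baseline
    # Assumption: only tests this shard ran can count as newly passing.
    newly_passing = (baseline & ran) - failed
    return GateResult(regressions, newly_passing)
```

With `ran={"A.t1", "A.t2", "A.t3"}`, `failed={"A.t1", "A.t3"}` and
`baseline={"A.t1", "A.t2", "B.t9"}`, the shard fails on `A.t3`, flags `A.t2` as
a candidate for removal from the baseline, and ignores `B.t9`, which it never
ran.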

The workflow also carries a hang watchdog that thread-dumps and kills a wedged
fork, and tunes the per-fork heap (2G) and off-heap (2G) to fit the ~16G runner.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
felipepessoto and others added 2 commits June 27, 2026 09:37
Velox has no Arrow representation for VariantType, so the native columnar write
path -- which converts the incoming rows to Velox batches via
RowToVeloxColumnarExec.toArrowSchema -- throws
`UnsupportedOperationException: Unsupported data type: variant` at runtime. This
broke every Delta write whose schema contains a variant column (INSERT, UPDATE,
MERGE, OPTIMIZE/auto-compact, checkpoint-driven rewrites), since
GlutenOptimisticTransaction.writeFiles always offloaded the write to the native
writer (the now-removed code path built the Velox plan unconditionally).

Guard GlutenOptimisticTransaction.writeFiles: if the input schema contains a
variant at any nesting level, delegate to super.writeFiles (the vanilla Delta
write path) instead of offloading. Non-variant writes are unaffected. The check
matches by type name so it stays source-compatible across Spark versions.
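
To illustrate the "variant at any nesting level, matched by type name" check,
here is a small sketch that walks a schema in the JSON shape Spark uses for
data types (atomic types as strings; `struct`, `array` and `map` as objects).
The real guard lives in the Scala `GlutenOptimisticTransaction.writeFiles`;
the function names below are hypothetical, and treating `"variant"` as the
JSON type name is an assumption.

```python
def contains_variant(dtype) -> bool:
    """Return True if a Spark-style JSON data type mentions variant at any depth."""
    if isinstance(dtype, str):
        # Atomic types are plain strings such as "integer" or "variant".
        return dtype == "variant"
    kind = dtype.get("type")
    if kind == "struct":
        return any(contains_variant(field["type"]) for field in dtype["fields"])
    if kind == "array":
        return contains_variant(dtype["elementType"])
    if kind == "map":
        return (contains_variant(dtype["keyType"])
                or contains_variant(dtype["valueType"]))
    return False  # other complex kinds are not modelled in this sketch


def choose_write_path(schema) -> str:
    # Variant anywhere in the schema: fall back to the vanilla Delta writer.
    return "vanilla" if contains_variant(schema) else "native"
```

A schema whose only variant sits inside `struct<items: array<variant>>` still
selects the vanilla path, which corresponds to the struct-nested case the new
suite covers.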

Adds GlutenDeltaVariantWriteSuite covering top-level, struct-nested, and UPDATE
variant writes.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@felipepessoto felipepessoto force-pushed the delta_pipeline_fix_tests branch from d39550f to 95ce39c Compare June 27, 2026 09:37
@github-actions github-actions Bot removed the VELOX label Jun 27, 2026
…line

Delta's data-skipping, limit-push-down, column-pruning and scan-metric tests
collect file-source scans by matching the concrete `FileSourceScanExec` case
class. Under the Gluten Velox bundle the scan is offloaded to
DeltaScanTransformer, a sibling that implements the same `FileSourceScanLike`
interface but is not FileSourceScanExec, so the match misses and the scan
looks absent. This surfaced as `scala.MatchError: List()` (~56
DataSkipping*/DeltaLimitPushDown* tests), empty generated-column partition
filters (~45 OptimizeGeneratedColumnSuite tests) and broken column-pruning /
scan-metric checks across the Delete, Update, Merge, DeletionVectors and
RowId suites and the TestsStatistics helper.

Gluten copies `partitionFilters` and the other accessors these tests read
verbatim onto the offloaded scan, so results are identical to vanilla -- only
the test's `case` match breaks. Fix it by cherry-picking the two merged
upstream Delta commits that widen these matches to the shared
`FileSourceScanLike` interface (behavior-preserving for vanilla, which also
implements it):

  * delta-io/delta#7104 -- ScanReportHelper.collectScans
  * delta-io/delta#7105 -- its follow-up, covering the remaining 9 test sources
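
Why matching on the concrete class misses the offloaded scan can be shown with
a toy model. The class names are borrowed from the description above, but the
classes themselves are stand-ins written for this sketch, not the Spark or
Gluten types, and `collect_scans` is a hypothetical helper.

```python
class FileSourceScanLike:
    """Stand-in for the shared interface both scan nodes implement."""

    def __init__(self, partition_filters):
        self.partition_filters = partition_filters


class FileSourceScanExec(FileSourceScanLike):
    """Stand-in for the vanilla Spark scan node."""


class DeltaScanTransformer(FileSourceScanLike):
    """Stand-in for the offloaded scan: a sibling, not a subclass."""


def collect_scans(plan_nodes, node_type):
    # Mirrors a test helper that collects scan nodes by type.
    return [node for node in plan_nodes if isinstance(node, node_type)]


offloaded_plan = [DeltaScanTransformer(partition_filters=["part = 1"])]

# Narrow match: finds nothing, so the scan looks absent.
narrow = collect_scans(offloaded_plan, FileSourceScanExec)

# Widened match: finds the scan, and its filters read the same as vanilla.
wide = collect_scans(offloaded_plan, FileSourceScanLike)
```

Here `narrow` is empty while `wide` holds the one scan with its
`partition_filters` intact. A test that destructures `narrow` expecting exactly
one element fails on the empty list, which is the toy analogue of the
`scala.MatchError: List()` failures.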

Both are merged on Delta master but land after the ref this workflow builds
against (v4.2.0), so setup-delta.sh cherry-picks them onto the shallow
checkout. Each cherry-pick block fetches its fix commit at depth 2 (commit +
parent) so cherry-pick can compute the parent->fix diff, and uses
`cherry-pick -n` so no committer identity is required. Once the pinned
DELTA_REF advances to include a commit, its cherry-pick becomes a clean no-op
and that block can be removed.

The cherry-picks run before the DeletionVectorsSuite 2B-row force-fail step:
that step sed-injects fail() into DeletionVectorsSuite.scala, which
delta-io/delta#7105 also edits, and git cherry-pick refuses to apply onto a
working tree with uncommitted changes to a file it touches (exit 128).
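
The ordering constraint can be made explicit by writing the setup as an ordered
plan. This is only a sketch: setup-delta.sh is a shell script, and
`plan_setup_steps`, the step labels, the `origin` remote name and the commit
identifiers passed in are all hypothetical placeholders. Only the two git
invocations (`fetch --depth 2` and `cherry-pick -n`) come from the description
above.

```python
def plan_setup_steps(fix_commits, remote="origin"):
    """Return ordered (label, argv) steps; argv is None for non-git steps."""
    steps = []
    for sha in fix_commits:
        # Depth 2 brings the fix commit plus its parent, enough to diff them.
        steps.append((f"fetch {sha}",
                      ["git", "fetch", "--depth", "2", remote, sha]))
        # -n applies the change without creating a commit, so no identity is needed.
        steps.append((f"apply {sha}", ["git", "cherry-pick", "-n", sha]))
    # Must come last: it leaves uncommitted edits in DeletionVectorsSuite.scala,
    # a file one of the cherry-picks also touches.
    steps.append(("force-fail 2B-row tests", None))
    return steps
```

Calling `plan_setup_steps(["<fix-7104>", "<fix-7105>"])` yields four git steps
followed by the force-fail step, so the sed edit can never precede a
cherry-pick.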

Refresh known-failures.txt from run 28299900971 (the delta-spark-aggregate job
output), which ran all 19073 tests across 16 shards: removes 187 now-passing
tests with 0 regressions, 963 -> 776. ~147 come from the fixes above
(DataSkipping*, DeltaLimitPushDown*, OptimizeGeneratedColumnSuite, MergeInto*,
RowIdSuite); the remaining ~40 are other suites that now pass (e.g.
HiveConvertToDeltaSuite, BitmapAggregatorE2ESuite). Verified against the
per-shard ran/failed lists: every baseline entry was observed this run (0
stale), so nothing was dropped due to a crashed or incomplete shard.
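
The refresh bookkeeping above (963 - 187 = 776, with 0 regressions and 0 stale
entries) follows from a few set operations over the aggregated results. The
sketch below is an assumption about how such a refresh could be computed, not
the aggregate mode of compare-test-results.py; `refresh_baseline` and the
returned key names are hypothetical.

```python
def refresh_baseline(baseline, ran, failed):
    """Tighten a baseline using the union of all shards' ran/failed lists."""
    baseline, ran, failed = set(baseline), set(ran), set(failed)
    # Baselined tests that were never observed: a crashed or incomplete shard
    # could hide them, so they are reported and kept rather than dropped.
    stale = baseline - ran
    now_passing = (baseline & ran) - failed
    regressions = failed - baseline
    refreshed = baseline - now_passing
    return {
        "refreshed": sorted(refreshed),
        "now_passing": sorted(now_passing),
        "stale": sorted(stale),
        "regressions": sorted(regressions),
    }
```

A refresh is safe to commit when `regressions` and `stale` are both empty; the
new baseline size is then the old size minus `len(now_passing)`.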

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@felipepessoto felipepessoto force-pushed the delta_pipeline_fix_tests branch 2 times, most recently from 154089e to 05e5156 Compare June 28, 2026 02:50
Make delta_spark_ut.yml a reusable workflow (on: workflow_call) and call it from
velox_backend_x86.yml so the Delta tests reuse the native lib + arrow jars that
workflow already builds, instead of duplicating the build-native-lib-centos-7
job. GitHub artifacts are scoped to the workflow run that uploaded them, so the
simplest way to reuse the artifact is to run the Delta jobs in the same
workflow run.

delta_spark_ut.yml keeps a workflow_dispatch trigger for standalone manual runs
(its build-native-lib-centos-7 job is gated to that case and skipped when
called); the pull_request trigger is removed so the suite no longer double-runs.
velox_backend_x86.yml gains an arrow-jars upload on its native build and a
delta-spark-ut job that calls the reusable workflow. That job runs on every
velox trigger like the other spark-test jobs, since core/velox/substrait/cpp
changes can affect Delta query offload.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@felipepessoto felipepessoto force-pushed the delta_pipeline_fix_tests branch from 05e5156 to b1fe046 Compare June 28, 2026 03:36