[DOC] add Databricks support matrix [skip ci] #15090

Open
nvliyuan wants to merge 1 commit into NVIDIA:main from nvliyuan:docs-databricks-support-matrix

@nvliyuan

Collaborator

Closes #15059.

Description

This draft PR adds a Databricks support matrix to make runtime compatibility visible before users deploy the RAPIDS Accelerator on Databricks.

Changes include:

  • Add docs/databricks-support.md with the current v26.06.0 Databricks runtime matrix covering Spark, Scala, JDK runtime ownership, CUDA jar variants, minimum driver, and runtime notes.
  • Add a Databricks Delta Lake feature support table for DBR 14.3 and DBR 17.3, based on the related TME documentation MR: https://gitlab-master.nvidia.com/spark-rapids-tme/documentation/-/merge_requests/178
  • Link the new support matrix from docs/download.md near the supported Databricks runtime list.
  • Document caveats around Databricks runtime patching, binary compatibility errors such as NoSuchMethodError, and operation-specific Delta CPU fallback.

Validation:

  • git diff --check -- docs/databricks-support.md docs/download.md
  • IDE lints for docs/databricks-support.md and docs/download.md

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Please provide the names of the existing tests in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

Signed-off-by: liyuan <yuali@nvidia.com>
nvliyuan force-pushed the docs-databricks-support-matrix branch from 8e60321 to a5390a6 on June 15, 2026 07:38
nvliyuan marked this pull request as ready for review on June 15, 2026 07:43
nvliyuan requested a review from sameerz on June 15, 2026 07:43
nvliyuan added the documentation label on Jun 15, 2026
@greptile-apps

greptile-apps Bot commented Jun 15, 2026

Contributor

Greptile Summary

This PR adds a new docs/databricks-support.md page with a Databricks runtime compatibility matrix for v26.06.0 (DBR 13.3, 14.3, 17.3) and a Delta Lake GPU/CPU-fallback feature table, then cross-links it from docs/download.md.

  • New databricks-support.md: Documents Spark/Scala/CUDA/driver requirements per Databricks runtime, Delta Lake GPU support per operation (reads, writes, DML, OPTIMIZE, liquid clustering), and binary-compatibility caveats including the historical CatalogTable.copy NoSuchMethodError.
  • download.md update: Adds a three-line cross-reference link to the new support matrix, placed immediately after the Supported Databricks runtime list.

Confidence Score: 4/5

Documentation-only change; no code is modified and no runtime behavior is affected.

The new page is well-structured and covers the right areas, but the Delta Lake table shows liquid clustering as GPU support on DBR 14.3 and CPU fallback on DBR 17.3 with no explanation, which is likely to confuse users upgrading between runtimes. The deletion-vector read row for DBR 17.3 also leaves out the configuration or conditions needed to actually achieve the GPU path. Neither issue blocks functionality, but they could actively mislead users about expected behavior on supported runtimes.

docs/databricks-support.md — the Delta Lake GPU support table needs clarification around the liquid clustering regression and the deletion-vector read preconditions for DBR 17.3.

Important Files Changed

| Filename | Overview |
|----------|----------|
| docs/databricks-support.md | New documentation page covering Databricks runtime compatibility matrix and Delta Lake GPU support table; three findings around unexplained liquid clustering regression (14.3 → 17.3), version-pinned content in a persistent page, and missing actionable guidance for deletion-vector read conditions. |
| docs/download.md | Three-line addition inserting a cross-reference link to the new Databricks support matrix page; link placement and relative path are correct. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A([User wants to deploy RAPIDS\nAccelerator on Databricks]) --> B{Check DBR version}
    B --> C[DBR 13.3 ML LTS GPU\nSpark 3.4.1 · Scala 2.12]
    B --> D[DBR 14.3 ML LTS GPU\nSpark 3.5.0 · Scala 2.12]
    B --> E[DBR 17.3 ML LTS GPU\nSpark 4.0.0 · Scala 2.13]
    C --> F[Download Scala 2.12 artifact]
    D --> F
    E --> G[Download Scala 2.13 artifact]
    F --> H{CUDA variant?}
    G --> H
    H --> I[CUDA 12 jar]
    H --> J[CUDA 13 jar]
    I --> K[Deploy · verify with\nspark.rapids.sql.explain=NOT_ON_GPU]
    J --> K
    K --> L{Using Delta Lake?}
    L -- No --> M([Done])
    L -- Yes --> N[Check Delta Lake GPU\nSupport table in\ndatabricks-support.md]
    N --> O{Feature on GPU\nor CPU fallback?}
    O -- GPU --> M
    O -- CPU fallback --> P[Confirm expected fallback\nor file issue]
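
The selection steps in the flowchart can be sketched in Python. This is a minimal illustration rather than anything shipped in the PR: the Spark and Scala versions are copied from the flowchart nodes, and `spark.rapids.sql.explain=NOT_ON_GPU` is the setting named there, while `RuntimeInfo`, `RUNTIMES`, `CUDA_VARIANTS`, and `plan_deployment` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeInfo:
    """Spark and Scala versions for one Databricks runtime (from the flowchart)."""
    spark: str
    scala: str


# Values copied from the flowchart nodes; the dictionary itself is illustrative.
RUNTIMES = {
    "13.3": RuntimeInfo(spark="3.4.1", scala="2.12"),
    "14.3": RuntimeInfo(spark="3.5.0", scala="2.12"),
    "17.3": RuntimeInfo(spark="4.0.0", scala="2.13"),
}

# The flowchart offers a CUDA 12 jar and a CUDA 13 jar.
CUDA_VARIANTS = ("12", "13")


def plan_deployment(dbr: str, cuda: str, uses_delta: bool) -> dict:
    """Walk the flowchart: pick the Scala artifact, the CUDA jar, and follow-ups."""
    if dbr not in RUNTIMES:
        raise ValueError(f"DBR {dbr} is not in the flowchart")
    if cuda not in CUDA_VARIANTS:
        raise ValueError(f"CUDA {cuda} is not one of {CUDA_VARIANTS}")
    info = RUNTIMES[dbr]
    return {
        "spark": info.spark,
        "scala_artifact": info.scala,
        "cuda_jar": cuda,
        # Setting named in the flowchart for verifying what stays off the GPU.
        "verify_conf": {"spark.rapids.sql.explain": "NOT_ON_GPU"},
        # Delta Lake users continue to the Delta Lake GPU support table.
        "check_delta_table": uses_delta,
    }
```

For example, `plan_deployment("17.3", "13", uses_delta=True)` selects the Scala 2.13 artifact and flags the Delta Lake table for review, whereas an unlisted runtime raises `ValueError` instead of guessing.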

Reviews (1): Last reviewed commit: "[DOC] add Databricks support matrix"

Comment on lines +42 to +52
| Delta feature | DBR 14.3 | DBR 17.3 |
|---------------|----------|----------|
| Reads without deletion vectors | GPU | GPU |
| Deletion vector reads | CPU fallback | GPU only with metadata row index and RAPIDS deletion-vector predicate pushdown |
| Delta writes | GPU for append, overwrite, CTAS, and RTAS | GPU for append and overwrite. CTAS and RTAS fall back to CPU. |
| Delta writes with deletion vectors | CPU fallback | CPU fallback |
| DELETE and UPDATE | GPU for copy-on-write. Operations that write deletion vectors fall back to CPU. | Same as DBR 14.3. Liquid-clustered paths also fall back to CPU. |
| MERGE | GPU, including liquid clustering | GPU for non-liquid-clustered tables only. Liquid-clustered and persistent deletion-vector writes fall back to CPU. |
| OPTIMIZE | CPU fallback | GPU for standard deletion-vector-free, non-liquid-clustered tables only |
| Auto compaction | GPU when triggered by supported GPU writes | GPU for inline deletion-vector-free, non-liquid-clustered tables only |
| Liquid clustering | GPU support | CPU fallback |
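
To make the runtime-to-runtime differences in this table easier to spot, a few of its rows can be encoded as data and compared. The sketch below is illustrative only: the cell text is copied from the table, it covers a subset of rows, and `DELTA_SUPPORT`, `classify`, and `changed_between` are hypothetical names, not part of the RAPIDS Accelerator or of the PR.

```python
# A subset of the table rows, with cell text copied verbatim.
DELTA_SUPPORT = {
    "Reads without deletion vectors": {"14.3": "GPU", "17.3": "GPU"},
    "Deletion vector reads": {
        "14.3": "CPU fallback",
        "17.3": "GPU only with metadata row index and RAPIDS "
                "deletion-vector predicate pushdown",
    },
    "Delta writes with deletion vectors": {
        "14.3": "CPU fallback",
        "17.3": "CPU fallback",
    },
    "OPTIMIZE": {
        "14.3": "CPU fallback",
        "17.3": "GPU for standard deletion-vector-free, "
                "non-liquid-clustered tables only",
    },
    "Liquid clustering": {"14.3": "GPU support", "17.3": "CPU fallback"},
}


def classify(cell: str) -> str:
    """Reduce a table cell to 'gpu', 'cpu', or 'conditional'."""
    if cell in ("GPU", "GPU support"):
        return "gpu"
    if cell == "CPU fallback":
        return "cpu"
    # Anything with qualifiers is treated as conditional support.
    return "conditional"


def changed_between(old: str, new: str) -> list:
    """List features whose classification differs between two runtimes."""
    return [
        feature
        for feature, cells in DELTA_SUPPORT.items()
        if classify(cells[old]) != classify(cells[new])
    ]
```

With these rows, `changed_between("14.3", "17.3")` reports deletion vector reads, OPTIMIZE, and liquid clustering, which are the cells a user upgrading from DBR 14.3 to DBR 17.3 would want to re-read.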


P1 DBR 13.3 missing from the Delta Lake GPU support table

The Runtime Compatibility section lists three supported runtimes (DBR 13.3, 14.3, and 17.3), but the Delta Lake GPU Support table covers only DBR 14.3 and DBR 17.3. Users deploying to DBR 13.3, a listed and supported runtime, have no guidance on which Delta Lake operations run on the GPU and which fall back to CPU, which is precisely the ambiguity this document is intended to resolve.
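
The gap described here amounts to a set difference between the runtimes listed as supported and the runtimes that have a column in the Delta Lake table. A minimal sketch, assuming the runtime lists are maintained by hand; `SUPPORTED_RUNTIMES`, `DELTA_TABLE_COLUMNS`, and `missing_delta_coverage` are hypothetical names used only for illustration:

```python
# Runtimes named in the Runtime Compatibility section (per this comment).
SUPPORTED_RUNTIMES = {"13.3", "14.3", "17.3"}

# Runtimes that have a column in the Delta Lake GPU Support table.
DELTA_TABLE_COLUMNS = {"14.3", "17.3"}


def missing_delta_coverage(supported: set, documented: set) -> list:
    """Return supported runtimes that lack a Delta Lake column, sorted."""
    return sorted(supported - documented)


if __name__ == "__main__":
    for dbr in missing_delta_coverage(SUPPORTED_RUNTIMES, DELTA_TABLE_COLUMNS):
        print(f"DBR {dbr} is supported but has no Delta Lake column")
```

A check of this kind could run alongside other documentation lints so that adding a runtime to one table without the other is caught early.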

Labels

documentation Improvements or additions to documentation

Development

Successfully merging this pull request may close these issues.

[DOC] Add detailed Spark RAPIDS support matrix for Databricks runtimes, JDK, CUDA, and known issues

3 participants