
chore(deps): bump torchao from 0.15.0 to 0.17.0 in /cmd/trainers/torchtune#3399

Open
dependabot[bot] wants to merge 1 commit into master from
dependabot/pip/cmd/trainers/torchtune/torchao-0.17.0

Conversation

@dependabot
Contributor

@dependabot dependabot bot commented on behalf of github Mar 30, 2026

Bumps torchao from 0.15.0 to 0.17.0.

Release notes

Sourced from torchao's releases.

v0.17.0

Highlights

We are excited to announce the 0.17 release of torchao! This release adds support for cuteDSL MXFP8 MoE kernels, per-head FP8 quantized low precision attention, ABI stability, and more!

CuteDSL MXFP8 MoE Kernels

We added a new CuteDSL MXFP8 quantization kernel for 3d expert weights that writes scale factors directly to blocked layout for tensorcores: pytorch/ao#4090

  • Used for scaling along dim1 in the backward pass of MoE training with grouped GEMMs.
  • ~12% speedup over the previous two-kernel “quantize, then transform scale layout” approach!
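To make the block-scaling idea concrete: MX formats such as MXFP8 share one power-of-two scale per 32-element block. The sketch below is purely illustrative numpy, not the torchao kernel API; the function name `mxfp8_block_scales`, the `BLOCK` constant, and the mapping onto the FP8 e4m3 range (max normal value 448) are assumptions for demonstration.

```python
# Illustrative sketch (NOT the torchao API): MXFP8-style block scaling,
# where each block of 32 values shares one power-of-two scale.
import numpy as np

BLOCK = 32  # MX formats share one scale per 32-element block


def mxfp8_block_scales(x: np.ndarray) -> np.ndarray:
    """Compute one power-of-two scale per 32-element block of the last dim."""
    blocks = x.reshape(*x.shape[:-1], -1, BLOCK)
    amax = np.abs(blocks).max(axis=-1)  # per-block absolute maximum
    # Map the block amax onto the FP8 e4m3 range (max normal 448) with a
    # power-of-two scale, mirroring the shared-exponent idea of E8M0 scales.
    return 2.0 ** np.floor(np.log2(np.maximum(amax, 1e-38) / 448.0))


x = np.random.randn(4, 64).astype(np.float32)
scales = mxfp8_block_scales(x)
print(scales.shape)  # (4, 2): one scale per 32-element block
```

The kernel described in pytorch/ao#4090 additionally writes these scales directly in the blocked layout the tensor cores expect, which is what removes the separate layout-transformation pass.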

Per-Head FP8 Quantized Low Precision Attention

We added a new API for per-head FP8 quantized attention with FA3 as the backend (pytorch/ao#3959 and pytorch/ao#3857).

  • Users can either use the elementary blocks as direct replacements for `F.scaled_dot_product_attention` or use the high-level wrapper, which replaces all `F.scaled_dot_product_attention` calls within a module with the low-precision attention variant.
  • Running torch.compile on a wrapped module enables RoPE fusion where appropriate.
  • Results show a 1.84x speedup on Wan2.1-T2V-1.3B, a 1.23x speedup on LLaMA 3 prefill at long sequence lengths (131k), and a 1.07x speedup on flux.1-schnell at a 2048x2048 image size.

Example Usage of Direct Replacement:

from torchao.prototype.attention.fp8_fa3 import fp8_fa3_sdpa, fp8_fa3_rope_sdpa
out = fp8_fa3_sdpa(q, k, v)
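For reference, `fp8_fa3_sdpa` computes the same math as standard scaled dot-product attention, just with per-head FP8 quantized inputs on the FA3 backend. A minimal numpy sketch of that reference computation follows; the function name `sdpa_reference` and the (batch, heads, seq, head_dim) shapes are illustrative, not part of the torchao API.

```python
import numpy as np


def sdpa_reference(q, k, v):
    """Reference scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


# Hypothetical shapes: (batch=2, heads=8, seq=16, head_dim=64)
q = np.random.randn(2, 8, 16, 64)
k = np.random.randn(2, 8, 16, 64)
v = np.random.randn(2, 8, 16, 64)
out = sdpa_reference(q, k, v)
print(out.shape)  # (2, 8, 16, 64)
```

The FP8 variant quantizes q, k, and v per attention head before the kernel runs, which is where the speedups quoted above come from.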

Example Usage of Wrapper:

from torchao.prototype.attention import (
    AttentionBackend,
    LowPrecisionAttentionConfig,
    apply_low_precision_attention,
)
# Instantiate any nn.Module
model = MyModel()

# Simple SDPA replacement
config = LowPrecisionAttentionConfig(backend=AttentionBackend.FP8_FA3)
model = apply_low_precision_attention(model, config)

# Flash attention is handled internally by the wrapper
output = model(inputs)

# torch.compile will enable RoPE fusion
model = torch.compile(model)

PyTorch ABI stability

... (truncated)

Commits
  • 02105d4 [mxfp8 training] add cutedsl kernel for mxfp8 quantation along dim0 (#4156)
  • d17c61b clean up unused rocm references in test_training.py (#4170)
  • 136cacb Remove tensor parallel test for v1 of Int8DynamicActivationInt8WeightConfig (...
  • 8fca033 [xpu][test] Skip WIP config for Intel GPU in test_safetensors_support.py and ...
  • 6a2f643 Fix rocm CI (#4167)
  • a927712 Move bitpacking.py to prototype and add uintx_utils.py (#4152)
  • 9ea1e67 Skip test_fsdp2 if PyTorch version is 2.11.0 or higher (#4168)
  • 3330d29 [reland][xpu] INT8 quantization on Intel XPU (#3782)
  • ac0b820 Fix test_sparse_api failures for builds without hipSPARSELt (#4125) (#4125)
  • 1f90b4d Delete deprecated PackedLinearInt8DynamicActivationIntxWeightLayout and relat...
  • Additional commits viewable in compare view

Most Recent Ignore Conditions Applied to This Pull Request

  Dependency Name: torchao
  Ignore Conditions: [>= 0.16.dev0, < 0.17]

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [torchao](https://github.com/pytorch/ao) from 0.15.0 to 0.17.0.
- [Release notes](https://github.com/pytorch/ao/releases)
- [Commits](pytorch/ao@v0.15.0...v0.17.0)

---
updated-dependencies:
- dependency-name: torchao
  dependency-version: 0.17.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added dependencies Pull requests that update a dependency file python Pull requests that update Python code labels Mar 30, 2026
@dependabot dependabot bot requested review from Copilot and removed request for Copilot March 30, 2026 23:39
@github-actions

🎉 Welcome to the Kubeflow Trainer! 🎉

Thanks for opening your first PR! We're happy to have you as part of our community 🚀

Here's what happens next:

  • If you haven't already, please check out our Contributing Guide for repo-specific guidelines and the Kubeflow Contributor Guide for general community standards.
  • Our team will review your PR soon! cc @kubeflow/kubeflow-trainer-team

Join the community:

Feel free to ask questions in the comments if you need any help or clarification!
Thanks again for contributing to Kubeflow! 🙏

@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign terrytangyuan for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


Labels

dependencies (Pull requests that update a dependency file)
python (Pull requests that update Python code)
size/XS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants