use static signature for sfd_col_d_srelu_tensor by jiemingz · Pull Request #281 · NVIDIA/cudnn-frontend

jiemingz · 2026-06-04T23:15:52Z

Summary by CodeRabbit

Bug Fixes
- Improved caching efficiency for grouped matrix multiplication operations with dynamic activation functions, resulting in more precise instance selection and better performance optimization.

Signed-off-by: Jieming Zhang <jiemingz@nvidia.com>

Anerudhan · 2026-06-04T23:18:23Z

coderabbitai

🧹 Nitpick comments (1)

python/cudnn/grouped_gemm/grouped_gemm_dsrelu/api.py (1)
1483-1493: ⚡ Quick win

Add documentation explaining the dimension selection logic.

The helper omits dimension 4 from the static shape and marks stride dimensions 2 and 5 as dynamic, but the rationale for this specific selection is not documented. Consider adding a docstring or inline comment explaining:

- Why dimension 4 (rest_m) is excluded from the static shape
- Why stride dimensions 2 and 5 are treated as dynamic
- How this relates to the SFD column tensor structure

This would improve maintainability and help future developers understand the cache key granularity design. As per coding guidelines, documentation is a key focus area for this module.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cudnn/grouped_gemm/grouped_gemm_dsrelu/api.py` around lines 1483 -
1493, Add a short docstring or inline comment to
dynamic_sfd_col_tensor_signature explaining the dimension-selection rationale:
state that static_shape intentionally omits dimension 4 (rest_m) because rest_m
varies per-column and should not be part of the cache key, and that
dynamic_stride_dims=(2, 5) marks the stride-related dimensions (the inner M
chunk and the leading stride for packed layout) as dynamic because their strides
can vary even when logical sizes match; also mention how this choice maps to the
SFD column tensor layout and why it yields the desired cache key granularity
when delegating to dynamic_m_tensor_signature.
Source: Coding guidelines
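
For readers without access to api.py, the following is a minimal sketch of what a documented helper along these lines might look like. It is not the code from this PR: the parameter lists, the return type, and the body of `dynamic_m_tensor_signature` are hypothetical stand-ins, and the rationale in the docstring is the one proposed in the review prompt above, which the author has not confirmed. Only the two helper names, the omission of dimension 4, and `dynamic_stride_dims=(2, 5)` come from the review text.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical cache-key type: (kept shape entries, strides with dynamic ones masked).
Signature = Tuple[Tuple[int, ...], Tuple[Optional[int], ...]]


def dynamic_m_tensor_signature(
    shape: Sequence[int],
    stride: Sequence[int],
    static_shape_dims: Sequence[int],
    dynamic_stride_dims: Sequence[int],
) -> Signature:
    """Hypothetical stand-in for the real helper of the same name.

    Keeps only the shape entries listed in static_shape_dims and replaces
    the strides listed in dynamic_stride_dims with None, so that neither
    contributes to the cache key.
    """
    static_shape = tuple(shape[d] for d in static_shape_dims)
    masked_stride = tuple(
        None if d in dynamic_stride_dims else s for d, s in enumerate(stride)
    )
    return static_shape, masked_stride


def dynamic_sfd_col_tensor_signature(
    shape: Sequence[int], stride: Sequence[int]
) -> Signature:
    """Cache-key signature for the SFD column tensor (illustrative sketch).

    Rationale as proposed in the review, not confirmed by the author:
    - Dimension 4 (rest_m) is left out of the static shape because it can
      vary between calls and should not force a new cache entry.
    - Strides 2 and 5 are marked dynamic because they can change even when
      the remaining logical sizes match.
    """
    static_dims = tuple(d for d in range(len(shape)) if d != 4)
    return dynamic_m_tensor_signature(
        shape, stride, static_dims, dynamic_stride_dims=(2, 5)
    )


# Made-up shapes and strides: a and b differ only in rest_m (dim 4) and in
# strides 2 and 5, so they map to the same key; c differs in dim 0.
sig_a = dynamic_sfd_col_tensor_signature(
    (32, 4, 3, 4, 1, 2), (16, 4, 512, 1, 2048, 4096)
)
sig_b = dynamic_sfd_col_tensor_signature(
    (32, 4, 3, 4, 5, 2), (16, 4, 1024, 1, 2048, 8192)
)
sig_c = dynamic_sfd_col_tensor_signature(
    (64, 4, 3, 4, 1, 2), (16, 4, 512, 1, 2048, 4096)
)
print(sig_a == sig_b, sig_a == sig_c)
```

In this sketch the key is coarser than the full shape and stride tuple, which is the cache-key granularity trade-off the review asks to have documented: fewer distinct keys mean fewer recompiles, at the cost of treating more tensors as interchangeable.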

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 941018c2-14e8-4ce1-a369-5b4cb4767865

📥 Commits

Reviewing files that changed from the base of the PR and between 035b520 and bf8b797.

📒 Files selected for processing (1)

python/cudnn/grouped_gemm/grouped_gemm_dsrelu/api.py

use static signature for sfd_col_d_srelu_tensor

bf8b797

Signed-off-by: Jieming Zhang <jiemingz@nvidia.com>

Anerudhan added the mod-cutedsl (CuTeDSL kernels, generated kernels, examples, or related integration work), cat-enhancements, and orig-nv-eng (Reported or requested by NVIDIA engineering) labels on Jun 4, 2026

coderabbitai Bot reviewed Jun 4, 2026

NVIDIA deleted a comment from coderabbitai Bot Jun 5, 2026

jiemingz wants to merge 1 commit into NVIDIA:develop from jiemingz:jiemingz/dev_recompile

2 participants
