rendezvous only once for matmul reduce scatter #1824

Merged
shunting314 merged 1 commit into main from shunting314/stack/22
Mar 27, 2026

@shunting314 shunting314 commented Mar 26, 2026

Stacked PRs:


rendezvous only once for matmul reduce scatter

Benchmarking results show the Helion kernel would be >10x slower without this fix.
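The idea behind "rendezvous only once" can be illustrated with a minimal pure-Python sketch. Everything here is a hypothetical stand-in (`FakeSymmMem`, `kernel`, and both `run_*` functions are not Helion or PyTorch APIs); it only counts how often a one-time setup step runs when it is done inside the hot path versus hoisted out to the caller.

```python
# Hypothetical stand-ins for illustration; not the Helion or PyTorch API.
class FakeSymmMem:
    """Counts how often the (assumed expensive) rendezvous step runs."""

    rendezvous_calls = 0

    @classmethod
    def rendezvous(cls, buffer):
        cls.rendezvous_calls += 1
        return buffer


def kernel(buffer, a, b):
    # Stand-in for the real kernel: elementwise product written into the buffer.
    for i, (x, y) in enumerate(zip(a, b)):
        buffer[i] = x * y
    return buffer


def run_with_rendezvous_per_call(a, b):
    # Pattern to avoid: allocate and rendezvous on every invocation.
    buffer = FakeSymmMem.rendezvous([0.0] * len(a))
    return kernel(buffer, a, b)


def run_with_shared_buffer(buffer, a, b):
    # Preferred pattern: the caller did the rendezvous once and passes the buffer in.
    return kernel(buffer, a, b)


def count_rendezvous(n_iters):
    """Return (rendezvous count per-call style, rendezvous count hoisted style)."""
    a, b = [1.0, 2.0], [3.0, 4.0]

    FakeSymmMem.rendezvous_calls = 0
    for _ in range(n_iters):
        run_with_rendezvous_per_call(a, b)
    per_call = FakeSymmMem.rendezvous_calls

    FakeSymmMem.rendezvous_calls = 0
    buffer = FakeSymmMem.rendezvous([0.0] * len(a))
    for _ in range(n_iters):
        run_with_shared_buffer(buffer, a, b)
    hoisted = FakeSymmMem.rendezvous_calls

    return per_call, hoisted
```

With the setup hoisted, the number of rendezvous calls stays at one regardless of how many times the kernel runs, which is the property the benchmark difference depends on.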

shunting314 added a commit that referenced this pull request Mar 26, 2026
stack-info: PR: #1824, branch: shunting314/stack/22
@shunting314 shunting314 force-pushed the shunting314/stack/22 branch from 827159b to 0a4b037 Compare March 26, 2026 06:49
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 26, 2026
@shunting314 shunting314 requested review from jansel and yf225 March 26, 2026 06:50
@shunting314 shunting314 marked this pull request as draft March 26, 2026 06:51
@shunting314 shunting314 changed the base branch from shunting314/stack/21 to main March 26, 2026 06:51
@shunting314 shunting314 force-pushed the shunting314/stack/22 branch from 0a4b037 to 0692f5d Compare March 26, 2026 06:51
shunting314 added a commit that referenced this pull request Mar 26, 2026
stack-info: PR: #1824, branch: shunting314/stack/22
@shunting314 shunting314 changed the base branch from main to shunting314/stack/21 March 26, 2026 06:52
@shunting314 shunting314 marked this pull request as ready for review March 26, 2026 06:52
@shunting314 shunting314 marked this pull request as draft March 26, 2026 06:54
@shunting314 shunting314 changed the base branch from shunting314/stack/21 to main March 26, 2026 06:54
shunting314 added a commit that referenced this pull request Mar 26, 2026
stack-info: PR: #1824, branch: shunting314/stack/22
@shunting314 shunting314 force-pushed the shunting314/stack/22 branch from 0692f5d to 45bb050 Compare March 26, 2026 06:54
@shunting314 shunting314 changed the base branch from main to shunting314/stack/21 March 26, 2026 06:54
@shunting314 shunting314 marked this pull request as ready for review March 26, 2026 06:54
@shunting314 shunting314 marked this pull request as draft March 26, 2026 06:55
@shunting314 shunting314 changed the base branch from shunting314/stack/21 to main March 26, 2026 06:55
shunting314 added a commit that referenced this pull request Mar 26, 2026
stack-info: PR: #1824, branch: shunting314/stack/22
@shunting314 shunting314 force-pushed the shunting314/stack/22 branch from 45bb050 to 8b6d57d Compare March 26, 2026 06:55
@shunting314 shunting314 changed the base branch from main to shunting314/stack/21 March 26, 2026 06:55
@shunting314 shunting314 marked this pull request as ready for review March 26, 2026 06:55
@shunting314 shunting314 marked this pull request as draft March 26, 2026 22:21
@shunting314 shunting314 changed the base branch from shunting314/stack/21 to main March 26, 2026 22:21
shunting314 added a commit that referenced this pull request Mar 26, 2026
stack-info: PR: #1824, branch: shunting314/stack/22
@shunting314 shunting314 force-pushed the shunting314/stack/22 branch from 8b6d57d to 4a0e82b Compare March 26, 2026 22:22
shunting314 added a commit that referenced this pull request Mar 26, 2026
stack-info: PR: #1824, branch: shunting314/stack/22
@shunting314 shunting314 force-pushed the shunting314/stack/22 branch from 4a0e82b to af5bbae Compare March 26, 2026 22:22
@shunting314 shunting314 changed the base branch from main to shunting314/stack/21 March 26, 2026 22:22
@shunting314 shunting314 marked this pull request as ready for review March 26, 2026 22:22


def helion_matmul_reduce_scatter(
    symm_mem_buffer: torch.Tensor,
Contributor

Do we have a test for this?

Contributor Author

We have tests covering the kernel, but none specific to this wrapper. I just added one to make sure symm_mem_tensor is passed in rather than created inside the wrapper.

Ideally, though, we should avoid using such a wrapper.
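One way such a check could look is sketched below. This is not the test added in the PR: `symm_alloc`, `wrapper`, and `check_wrapper_uses_passed_buffer` are hypothetical names, and plain lists stand in for tensors. The sketch patches the allocator while the wrapper runs and verifies that the wrapper wrote into the caller's buffer without allocating one itself.

```python
from unittest import mock


class symm_alloc:
    """Hypothetical allocator namespace; stands in for a symmetric-memory allocator."""

    @staticmethod
    def empty(n):
        return [0.0] * n


def wrapper(symm_mem_buffer, values):
    # Hypothetical wrapper: writes results into the caller-provided buffer.
    for i, v in enumerate(values):
        symm_mem_buffer[i] = 2 * v
    return symm_mem_buffer


def check_wrapper_uses_passed_buffer():
    # The caller allocates once, outside the wrapper.
    buffer = symm_alloc.empty(3)
    # While the wrapper runs, any allocation attempt would hit the mock.
    with mock.patch.object(symm_alloc, "empty") as fake_empty:
        out = wrapper(buffer, [1.0, 2.0, 3.0])
    # The output must be the same object, and the allocator must be untouched.
    return out is buffer and not fake_empty.called
```

Calling `check_wrapper_uses_passed_buffer()` returns `True` for a wrapper that honors the passed-in buffer, and would return `False` if the wrapper called the allocator internally.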

stack-info: PR: #1824, branch: shunting314/stack/22
@shunting314 shunting314 marked this pull request as draft March 27, 2026 18:32
@shunting314 shunting314 changed the base branch from shunting314/stack/21 to main March 27, 2026 18:32
@shunting314 shunting314 force-pushed the shunting314/stack/22 branch from af5bbae to 0728d11 Compare March 27, 2026 18:32
@shunting314 shunting314 marked this pull request as ready for review March 27, 2026 18:32
@shunting314 shunting314 marked this pull request as draft March 27, 2026 20:52
@shunting314 shunting314 marked this pull request as ready for review March 27, 2026 20:52
@shunting314 shunting314 marked this pull request as draft March 27, 2026 21:20
@shunting314 shunting314 marked this pull request as ready for review March 27, 2026 21:20
@shunting314 shunting314 merged commit 118ea4b into main Mar 27, 2026
20 of 21 checks passed
Labels

CLA Signed This label is managed by the Meta Open Source bot.

3 participants