properly codegen hl.triton_kernel #1797
Changes from all commits
```diff
@@ -225,11 +225,3 @@ def all_gather_object(obj: T) -> list[T]:
     object_list = [None] * dist.get_world_size()
     dist.all_gather_object(object_list, obj)
     return object_list  # pyrefly: ignore
-
-
-def autotune_for_distributed_kernel() -> bool:
-    """
-    Remove this once these issues regarding distributed kernels are fixed:
-    - https://github.com/pytorch/helion/issues/1642
-    """
-    return os.getenv("HELION_AUTOTUNE_FOR_DISTRIBUTED_KERNEL") == "1"
```
Contributor: Curious, are the …

Author: Yes. The env var was added as a workaround because of this issue.
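For context on the removed workaround, here is a minimal self-contained sketch of how an env-var gate like this is typically consumed. The `should_autotune` call site below is a hypothetical illustration, not code from this repo:

```python
import os


def autotune_for_distributed_kernel() -> bool:
    """The removed workaround: opt in to autotuning distributed kernels
    only when HELION_AUTOTUNE_FOR_DISTRIBUTED_KERNEL=1 is set."""
    return os.getenv("HELION_AUTOTUNE_FOR_DISTRIBUTED_KERNEL") == "1"


def should_autotune(is_distributed_kernel: bool) -> bool:
    # Hypothetical call site: distributed kernels skip autotuning unless
    # the workaround env var explicitly opts in.
    if is_distributed_kernel and not autotune_for_distributed_kernel():
        return False
    return True
```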
```diff
@@ -94,7 +94,6 @@ def setUpClass(cls) -> None:
                 "HELION_DIST_CHECK_CONFIG_CONSISTANCY": "1",
                 "HELION_CAP_AUTOTUNE_NUM_NEIGHBORS": "50",
                 "HELION_CAP_REBENCHMARK_REPEAT": "50",
-                "HELION_AUTOTUNE_FOR_DISTRIBUTED_KERNEL": "1",
             },
         )
     )
```
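The surrounding `setUpClass` context is not shown in the hunk, so here is a hedged reconstruction of the usual pattern for class-wide env overrides, assuming `unittest.mock.patch.dict`; the class name and attribute names are illustrative:

```python
import os
import unittest
from unittest import mock


class DistributedKernelTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls) -> None:
        super().setUpClass()
        # Push env overrides for the whole test class; undo in tearDownClass.
        cls._env_patcher = mock.patch.dict(
            os.environ,
            {
                "HELION_DIST_CHECK_CONFIG_CONSISTANCY": "1",
                "HELION_CAP_AUTOTUNE_NUM_NEIGHBORS": "50",
                "HELION_CAP_REBENCHMARK_REPEAT": "50",
                # HELION_AUTOTUNE_FOR_DISTRIBUTED_KERNEL is no longer needed
                # once the workaround is removed by this PR.
            },
        )
        cls._env_patcher.start()

    @classmethod
    def tearDownClass(cls) -> None:
        cls._env_patcher.stop()
        super().tearDownClass()
```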
```diff
@@ -264,8 +263,7 @@ def test_allreduce_bias_rmsnorm(self, kernel_name, autotuner):
         kernel = getattr(mod, kernel_name).fn
         if autotuner == "fixed":
             fixed_config = helion.Config(
-                block_sizes=[8],
-                num_warps=8,
+                block_sizes=[8], num_warps=8, reduction_loops=[1024]
             )

             kernel = helion.kernel(
```

Author (on the `reduction_loops=[1024]` line): @jansel this test covers the issue being fixed.
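For reference, a minimal sketch of pinning a Helion kernel to a fixed config like the one in this test, so it skips autotuning. The `row_sum` kernel body is a made-up example, and passing `config=` to `helion.kernel` is assumed from the call in the hunk above:

```python
import torch
import helion
import helion.language as hl

# Fixed config from the test: a single block size, 8 warps, and an
# explicit reduction-loop tile so the reduction dimension is looped.
fixed_config = helion.Config(block_sizes=[8], num_warps=8, reduction_loops=[1024])


@helion.kernel(config=fixed_config)  # fixed config: no autotuning
def row_sum(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical kernel: sum each row of a 2D tensor.
    m, n = x.size()
    out = torch.empty([m], dtype=x.dtype, device=x.device)
    for tile_m in hl.tile(m):
        out[tile_m] = x[tile_m, :].sum(-1)
    return out
```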
Contributor: Should we add a single-GPU unit test that would fail without the fix in this PR? Since test/test_distributed.py is only run on the distributed CI job, adding a single-GPU unit test would ensure we don't accidentally lose the coverage.

Author: I didn't come up with a reasonable single-GPU unit test. But if you have concerns about coverage, I can add a dedicated multi-GPU test case for this one (rather than a parametrized test).
Reviewer: Do we have a test for the issue this is fixing?

Author: test_distributed.py has a test covering this case.