Skip to content

Define local get_op_idx after upstream removal#442

Merged
sanketpurandare merged 1 commit into
mainfrom
sanketpurandare/stack/4
May 4, 2026
Merged

Define local get_op_idx after upstream removal#442
sanketpurandare merged 1 commit into
mainfrom
sanketpurandare/stack/4

Conversation

@sanketpurandare

@sanketpurandare sanketpurandare commented May 4, 2026

Copy link
Copy Markdown
Contributor

Stacked PRs:


Define local get_op_idx after upstream removal

torch._inductor.comms.get_op_idx was removed in PyTorch commit a79a695988d as dead code cleanup. Keep the trivial op-name parser local so the autobucketing sort still validates scheduler node names before extracting the numeric index.

Add focused coverage for the local parser, and stop excluding tests/test_nccl_cost_model.py from pre-commit by fixing its existing lint issues.

Authored with Claude.

@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from cbfb975 to 6791d71 Compare May 4, 2026 03:14
@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label May 4, 2026
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/3 branch from 2a2aec7 to 7425c44 Compare May 4, 2026 03:18
sanketpurandare added a commit that referenced this pull request May 4, 2026
torch._inductor.comms.get_op_idx was removed in PyTorch commit
a79a695988d as dead code cleanup. The function was trivially
int(snode.get_name()[2:]) — inline it at the single call site.

stack-info: PR: #442, branch: sanketpurandare/stack/4
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from 6791d71 to 18dd7f7 Compare May 4, 2026 03:18
@sanketpurandare sanketpurandare marked this pull request as draft May 4, 2026 03:32
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main May 4, 2026 03:32
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from 18dd7f7 to 59e3b40 Compare May 4, 2026 03:32
@sanketpurandare sanketpurandare changed the base branch from main to sanketpurandare/stack/3 May 4, 2026 03:32
@sanketpurandare sanketpurandare marked this pull request as ready for review May 4, 2026 03:32
@sanketpurandare sanketpurandare marked this pull request as draft May 4, 2026 04:02
@sanketpurandare sanketpurandare changed the base branch from sanketpurandare/stack/3 to main May 4, 2026 04:02
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch 2 times, most recently from 124fe52 to 96297cc Compare May 4, 2026 04:07
@sanketpurandare sanketpurandare marked this pull request as ready for review May 4, 2026 04:08

@fmassa fmassa left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok to approve to unblock, but I think it might also be worth considering if we should just remove those autobucketing passes from AutoParallel, if they already live in PyTorch. @IvanKobzarev wdyt?

Comment thread .pre-commit-config.yaml Outdated
@@ -1,4 +1,4 @@
exclude: 'build|stubs'
exclude: 'build|stubs|autoparallel/tools/overlap_simulator|tests/test_nccl_cost_model.py'

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the test_nccl_cost_model.py problematic?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the source linting errors. Sorry this was done sneakily by codex.

) # sort nodes by original operation order
ag_related_snodes = sorted(
ag_related_snode_set, key=lambda x: get_op_idx(x)
ag_related_snode_set, key=lambda x: int(x.get_name()[2:])

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this robust?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the exact definition in upstream before it got removed. It does a very dumb thing, if you have your op string as "op142", it just strips the "op" part and gives you "142" which is then converted to an int.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I added it it as a separate definition now and also added some asserts and a unit test.

@sanketpurandare

Copy link
Copy Markdown
Contributor Author

I'm ok to approve to unblock, but I think it might also be worth considering if we should just remove those autobucketing passes from AutoParallel, if they already live in PyTorch. @IvanKobzarev wdyt?

@IvanKobzarev just leaving a comment here that if you want to remove the passes you can remove them in a later PR, thanks!

torch._inductor.comms.get_op_idx was removed in PyTorch commit a79a695988d as dead code cleanup. Keep the trivial op-name parser local so the autobucketing sort still validates scheduler node names before extracting the numeric index.

Add focused coverage for the local parser, and stop excluding tests/test_nccl_cost_model.py from pre-commit by fixing its existing lint issues.

Authored with Claude.

stack-info: PR: #442, branch: sanketpurandare/stack/4
@sanketpurandare sanketpurandare marked this pull request as draft May 4, 2026 20:00
@sanketpurandare sanketpurandare force-pushed the sanketpurandare/stack/4 branch from 96297cc to 054648f Compare May 4, 2026 20:00
@sanketpurandare sanketpurandare changed the title Inline get_op_idx after upstream removal Define local get_op_idx after upstream removal May 4, 2026
@sanketpurandare sanketpurandare marked this pull request as ready for review May 4, 2026 20:00
@sanketpurandare sanketpurandare requested a review from fmassa May 4, 2026 20:03
@sanketpurandare sanketpurandare merged commit 28679a6 into main May 4, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants