
Arm backend: Preserve duplicate output slots with TOSA identity fanout #18866

Open
AdrianLundell wants to merge 1 commit into pytorch:main from AdrianLundell:change-1225529

@AdrianLundell AdrianLundell commented Apr 14, 2026

When FuseEqualPlaceholdersPass fuses equal constant placeholders, the graph output can contain the same node in multiple output slots. In this case ToTosaMemoryFormatPass was rewriting the output node with replace_input_with() while inserting output transposes.

That rewrote all matching occurrences at once, so duplicated logical output slots were collapsed onto the same transpose node instead of remaining distinct.
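
The replace-all effect can be illustrated with a toy model that does not use torch.fx. This is only a sketch: `Node` and `replace_all` are hypothetical stand-ins that mimic a rewrite replacing every matching occurrence in an argument tuple, and the node names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Toy stand-in for a graph node (not torch.fx.Node)."""
    name: str


def replace_all(args, old, new):
    """Mimic a rewrite that replaces every occurrence of `old` in `args`."""
    return tuple(new if a is old else a for a in args)


const = Node("fused_const")      # the single placeholder left after fusion
outputs = (const, const)         # the same node fills two output slots

transpose = Node("transpose_0")  # boundary transpose inserted for the output
outputs = replace_all(outputs, const, transpose)

# Both slots now point at the very same transpose node.
slots_share_node = outputs[0] is outputs[1]
```

In this model a single call rewrites both slots, so nothing remains to distinguish the second slot from the first.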

Fix this by handling duplicate outputs in the output rewrite path. For shared output nodes, create a single boundary TOSA TRANSPOSE and preserve distinct output slots by inserting TOSA IDENTITY fanout nodes for later duplicates.
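
A minimal sketch of the fanout idea, again as a toy model rather than the pass's actual code: `Node` and `rewrite_outputs` are hypothetical, and the sketch assumes every output needs a boundary transpose. The first occurrence of a node gets the transpose, and each later duplicate gets its own identity node fed by that transpose.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Toy stand-in for a graph node (not torch.fx.Node)."""
    op: str
    source: Optional["Node"] = None


def rewrite_outputs(outputs):
    """Share one transpose per original node, one distinct node per slot."""
    transposes = {}  # id(original node) -> its single boundary transpose
    rewritten = []
    for node in outputs:
        key = id(node)
        if key not in transposes:
            # First occurrence: create the boundary transpose.
            transposes[key] = Node("TRANSPOSE", node)
            rewritten.append(transposes[key])
        else:
            # Later duplicate: fan out through a fresh identity node.
            rewritten.append(Node("IDENTITY", transposes[key]))
    return tuple(rewritten)


const = Node("CONST")
other = Node("ADD")
new_outputs = rewrite_outputs((const, other, const))
ops = [n.op for n in new_outputs]
```

Here the third slot is an identity node whose source is the transpose in the first slot, so the two slots stay distinct while only one transpose exists for the shared node.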

This keeps insert_input_transpose() focused on normal input rewrites, avoids duplicating equivalent transposes for shared outputs, and preserves the output slot structure expected by later lowering and serialization stages.

Add regression coverage for FuseEqualPlaceholdersPass + ToTosaMemoryFormatPass with duplicate outputs, and add TOSA IDENTITY dialect and visitor coverage.

cc @digantdesai @freddan80 @per @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell


Signed-off-by: Baris Demir <baris.demir@arm.com>
Change-Id: Ie14bc88bfadaad7f993b71ef1b5332b5953b72c8
@AdrianLundell added the `partner: arm`, `ciflow/trunk`, and `release notes: none` labels Apr 14, 2026
pytorch-bot bot commented Apr 14, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18866

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

❌ 1 Awaiting Approval, 5 New Failures, 1 Cancelled Job, 3 Unrelated Failures

As of commit 6ae1b6a with merge base 37b12c8 (image):

AWAITING APPROVAL - The following workflow needs approval before CI can run:

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the `CLA Signed` label Apr 14, 2026

zingo commented Apr 14, 2026

Hi @digantdesai, this adds a file; maybe you want to check it?
