Arm backend: Support for uint8 I/O by per · Pull Request #18869 · pytorch/executorch

per · 2026-04-14T10:04:21Z

Summary

Add support for quantization to uint8 for input and output tensors.

Test plan

Tested through CI

cc @digantdesai @freddan80 @zingo @oscarandersson8218 @mansnils @Sebastian-Larsson @robell

pytorch-bot · 2026-04-14T10:04:26Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18869

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

Rolling out OSDC (ARC) runners on pull workflow for PyTorch trunk commits

❌ 9 New Failures, 2 Cancelled Jobs, 4 Pending, 3 Unrelated Failures

As of commit 796d6ff with merge base 2c545f8:

NEW FAILURES - The following jobs have failed:

Apple / build-benchmark-app / macos-job (gh)
RuntimeError: Command bash /Users/runner/work/_temp/exec_script failed with exit code 1
pull / android / run-emulator (gh)
The process '/usr/local/lib/android/sdk/platform-tools/adb' failed with exit code 224
pull / test-mcu-cortex-m-backend / linux-job (gh)
RuntimeError: Command docker exec -t db0c1dce6351c9046e8109fcfad1c1c4f7ee266f8d2408aa3d3136c8bcd259e4 /exec failed with exit code 1
pull / unittest / macos / macos-job (gh)
export/tests/test_target_recipes.py::TestTargetRecipes::test_mv3_model
trunk / test-arm-backend-ethos-u (test_pytest_ops_ethos_u85) / linux-job (gh)
RuntimeError: Command docker exec -t fed03eb1b44f8fb5a87c1d52f69f343bf36b04c118c922add77d62dac3da2849 /exec failed with exit code 1
trunk / test-cortex-m-e2e (mv2) / linux-job (gh)
RuntimeError: Command docker exec -t b965df96bc3c50be6f98bef196bb1d88a5d9afc71ae9c29d64f0fff49d4a0cdc /exec failed with exit code 92
trunk / test-torchao-huggingface-checkpoints (lfm2_5_1_2b, linux.arm64.2xlarge, executorch-ubuntu-22.04-g... / linux-job (gh)
RuntimeError: Command docker exec -t 3df333c42689aeec8cc77a31bc07a40dfddbe93e44dfd7a8d9540b7ad0f28ec0 /exec failed with exit code 1
trunk / test-torchao-huggingface-checkpoints (phi_4_mini, linux.arm64.2xlarge, executorch-ubuntu-22.04-gc... / linux-job (gh)
RuntimeError: Command docker exec -t fdb58caec7be29c8b82c07df3ae6d11046fc848ec2d319dfe2c1d0366d81823f /exec failed with exit code 1
trunk / test-torchao-huggingface-checkpoints (qwen3_4b, linux.arm64.2xlarge, executorch-ubuntu-22.04-gcc1... / linux-job (gh)
RuntimeError: Command docker exec -t d8e79365a271b0e0165d31a15e27e346b73e139fd697d104f956ef1e4d5d6ce1 /exec failed with exit code 1

CANCELLED JOBS - The following jobs were cancelled. Please retry:

pull / unittest-editable / windows / windows-job (gh)
Test CoreML Backend / test-coreml / test-backend-macos (coreml, models) / macos-job (gh)
##[error]The operation was canceled.

FLAKY - The following job failed but was likely due to flakiness present on trunk:

trunk / unittest-release / macos / macos-job (gh) (detected as infra flaky with no log or failing log classifier)

BROKEN TRUNK - The following jobs failed but were already failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest / windows / windows-job (gh) (trunk failure)
##[error]The operation was canceled.
trunk / unittest-release / windows / windows-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Copilot

Pull request overview

Adds uint8 input/output quantization support for Arm TOSA-based backends by representing unsigned semantics via TOSA RESCALE flags (while keeping internal tensors signless int8), plus plumbing to preserve IO Q/DQ nodes when requested.

Changes:

Add uint8 IO quantization config and tests validating IO-only uint8 and numeric correctness across TOSA/VGF/Ethos-U55 flows.
Extend TOSA RESCALE op (fake + lowering) with input_unsigned / output_unsigned flags and map torch.uint8 to TOSA INT8 dtype.
Add preserve_io_quantization compile spec option and propagate it through pipelines; extend Ethos-U runtime input type support to uint8.
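
The idea of exposing uint8 at the boundary while keeping int8 internally can be illustrated with plain arithmetic. The sketch below is not code from this PR: the helper names are hypothetical, and it assumes the common convention that a uint8 value and its zero point are mapped to int8 by subtracting 128 from both, which leaves the dequantized value unchanged.

```python
from typing import Tuple

# Illustrative sketch only; these helpers are hypothetical, not ExecuTorch APIs.


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Affine dequantization: real = scale * (q - zero_point)."""
    return scale * (q - zero_point)


def uint8_to_int8(q_u8: int, zp_u8: int) -> Tuple[int, int]:
    """Shift a uint8 value and its zero point into the int8 range.

    Assumption: subtracting 128 from both keeps (q - zero_point) unchanged.
    """
    if not (0 <= q_u8 <= 255 and 0 <= zp_u8 <= 255):
        raise ValueError("uint8 value and zero point must be in [0, 255]")
    return q_u8 - 128, zp_u8 - 128


scale = 0.5
zp_u8 = 128
for q_u8 in (0, 128, 200, 255):
    q_i8, zp_i8 = uint8_to_int8(q_u8, zp_u8)
    # Both representations decode to the same real value.
    assert dequantize(q_u8, scale, zp_u8) == dequantize(q_i8, scale, zp_i8)
```

Because only the integer offset changes, the scale is shared between the two representations; the unsigned flags on RESCALE tell the backend which interpretation applies at the IO boundary.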

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| backends/arm/tosa/partitioner.py | Adds compile-spec-driven IO Q/DQ detagging and rejects partitions with invalid internal uint8 tensors. |
| backends/arm/tosa/mapping.py | Maps `torch.uint8` to TOSA INT8 (unsigned semantics via RESCALE flags). |
| backends/arm/tosa/dialect/ops/rescale.py | Extends RESCALE schema and validation for unsigned IO semantics. |
| backends/arm/_passes/insert_rescales_pass.py | Folds dq→q into RESCALE and threads unsigned flags; enforces “uint8 at IO only”. |
| backends/arm/operators/op_tosa_rescale.py | Plumbs unsigned flags into flatbuffer RESCALE attributes and visitor parsing. |
| backends/arm/common/arm_compile_spec.py | Adds `preserve_io_quantization` to compile specs with parsing/roundtrip + warning behavior. |
| backends/arm/runtime/EthosUBackend.cpp | Allows uint8 (`ScalarType::Byte`) inputs in runtime IO validation. |
| backends/arm/quantizer/arm_quantizer.py | Adds `get_uint8_io_quantization_config()` and exports it. |
| backends/arm/quantizer/arm_quantizer_utils.py | Adjusts shared-qspec clique logic to avoid pulling uint8 IO qspecs into shared propagation. |
| backends/arm/test/common.py | Adds `preserve_io_quantization` option to VGF compile spec helper. |
| backends/arm/test/tester/test_pipeline.py | Plumbs `preserve_io_quantization` through test pipeline configuration. |
| backends/arm/test/quantizer/test_uint8_io_quantization.py | New test ensuring uint8 IO quantization config applies at IO. |
| backends/arm/test/passes/test_ioquantization_pass.py | Adds extensive end-to-end and pass-level tests for uint8 IO behavior. |
| backends/arm/test/misc/test_rescale_range.py | Adds unsigned zero-point range validation tests for RESCALE. |
| backends/arm/test/misc/test_compile_spec.py | Adds tests for `preserve_io_quantization` roundtrip and warning behavior. |
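
A zero-point range check of the kind the RESCALE tests are described as covering might look roughly like the following. This is a hypothetical sketch, not the validation code from this PR: the function name and error message are invented, and the only rule assumed is that a zero point must be representable in the 8-bit type selected by the unsigned flag.

```python
# Hypothetical sketch; check_zero_point is not an ExecuTorch or TOSA API.


def check_zero_point(zero_point: int, unsigned: bool) -> None:
    """Raise ValueError if zero_point does not fit the selected 8-bit type."""
    lo, hi = (0, 255) if unsigned else (-128, 127)
    if not lo <= zero_point <= hi:
        kind = "uint8" if unsigned else "int8"
        raise ValueError(
            f"zero point {zero_point} outside {kind} range [{lo}, {hi}]"
        )


# A zero point of 128 is valid for unsigned IO but not for signed int8.
check_zero_point(128, unsigned=True)
check_zero_point(-128, unsigned=False)
```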

Comments suppressed due to low confidence (1)

backends/arm/tosa/dialect/ops/rescale.py:38

The function docstring is no longer a real docstring because tosa_spec = get_context_spec() appears before the triple-quoted string. This makes the string a no-op expression and drops the doc from introspection/tools. Move the docstring to be the first statement in RESCALE (and assign tosa_spec after it).

    tosa_spec = get_context_spec()
    """Casts the input tensor to dtype `dtype` to produce the correct tensor
    meta for a _rescale op.

    Additionally validates TOSA constraints of a RESCALE op.

    """

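The behaviour described in this comment is easy to reproduce in isolation. The two functions below are made-up stand-ins (not the actual RESCALE code) showing that a string literal only becomes a docstring when it is the first statement in the function body.

```python
def misplaced():
    value = 1  # any statement placed before the string literal
    """This string is evaluated and discarded; it is not a docstring."""
    return value


def proper():
    """This string is the first statement, so it becomes the docstring."""
    value = 1
    return value


print(misplaced.__doc__)  # None
print(proper.__doc__)
```

Tools such as `help()` and documentation generators read `__doc__`, so the misplaced variant appears undocumented to them.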

backends/arm/tosa/dialect/ops/rescale.py

backends/arm/tosa/partitioner.py

backends/arm/test/passes/test_ioquantization_pass.py

zingo · 2026-04-14T12:52:54Z

Hi @digantdesai, this adds files, so you might want to test it from a buck2 perspective.

Add support for uint8 on IO tensors only. In conjunction with the QuantizeInput and QuantizeOutput passes, this makes it possible to feed uint8 inputs to the model directly. Change-Id: Icc08ac242e5c980f2abd484eb0e7661418873ab7 Signed-off-by: Per Åstrand <per.astrand@arm.com>

SharedQspecQuantizer can propagate the IO quantization spec into internal nodes when using the composable quantizer. For uint8 IO this violates the TOSA constraint that uint8 is only allowed at IO boundaries. Skip IO-based qspec anchors for uint8 so internal nodes stay int8 while preserving shared qspec behavior elsewhere. Change-Id: Ie068de0c46426f386c86d9c295459011e906f335 Signed-off-by: Per Åstrand <per.astrand@arm.com>

Add an option to preserve the quantization on IO. This is useful for keeping input and output tensors quantized when the backend supports both +INT and +FP. Change-Id: Ibf6177e70c2abd9f64151553cb94698591a77acc Signed-off-by: Per Åstrand <per.astrand@arm.com>

Change-Id: Ie943e1de816d981c0f09d9bd3683881c03e3000c Signed-off-by: Per Åstrand <per.astrand@arm.com>

per requested a review from digantdesai as a code owner April 14, 2026 10:04

per added the partner: arm label Apr 14, 2026

Copilot AI review requested due to automatic review settings April 14, 2026 10:04

per added the ciflow/trunk label Apr 14, 2026

meta-cla bot added the CLA Signed label Apr 14, 2026

Copilot started reviewing on behalf of per April 14, 2026 10:04

Copilot AI reviewed Apr 14, 2026


zingo added this to ExecuTorch Arm Backend Apr 14, 2026

github-project-automation bot moved this to To triage in ExecuTorch Arm Backend Apr 14, 2026

zingo moved this from To triage to Ready in ExecuTorch Arm Backend Apr 14, 2026

per added 4 commits April 14, 2026 17:13

Arm backend: Handle +FP+INT for vgf and quantized IO

c4dc7e6

Change-Id: Ie943e1de816d981c0f09d9bd3683881c03e3000c Signed-off-by: Per Åstrand <per.astrand@arm.com>

per force-pushed the uint8-io branch from b542fe6 to c4dc7e6 April 14, 2026 15:17

Merge branch 'main' into uint8-io

796d6ff

per added the release notes: arm label Apr 14, 2026

per wants to merge 5 commits into pytorch:main from per:uint8-io

3 participants
