TStore verifier should reject tile/partition shape mismatch and subset should not allow enlarging tile shape #322

@Zhendong404

Summary

pto.tstore currently accepts IR where the destination partition shape is larger than the source tile valid shape.

This later lowers to PTO-ISA C++ like:

GlobalTensor<float, ..., Shape<..., 64, 64>, ...> dst = ...;
Tile<..., 64, 64, ..., 32, 32, ...> tile = ...;
TSTORE(dst, tile);

For PTO-ISA there is an implicit constraint: during TSTORE, the GlobalTensor shape must match the tile valid shape. In the snippet above the GlobalTensor is 64x64 while the tile's valid shape is 32x32. The current verifier does not enforce this constraint, so invalid IR passes verification and the mismatch only surfaces much later, in generated code.
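The constraint can be modeled in a few lines. This is a minimal sketch in plain Python, not the verifier itself: the function name `tstore_shape_error`, the `DYNAMIC` marker, and the error wording are all hypothetical, and skipping dynamic dimensions is an assumption about how a static check would treat unknown sizes.

```python
from typing import Optional, Sequence

# Hypothetical marker for a dimension whose size is unknown at verification time.
DYNAMIC = None


def tstore_shape_error(
    dst_shape: Sequence[Optional[int]],
    src_valid_shape: Sequence[Optional[int]],
) -> Optional[str]:
    """Return an error message if the shapes are statically incompatible, else None.

    dst_shape models the destination partition shape; src_valid_shape models
    the source tile's (v_row, v_col).
    """
    if len(dst_shape) != len(src_valid_shape):
        return (
            f"rank mismatch: dst partition rank {len(dst_shape)} "
            f"vs src tile valid rank {len(src_valid_shape)}"
        )
    for i, (dst, valid) in enumerate(zip(dst_shape, src_valid_shape)):
        # Assumption: a dynamic dimension cannot be checked statically, so skip it.
        if dst is DYNAMIC or valid is DYNAMIC:
            continue
        if dst != valid:
            return f"dim {i}: dst partition size {dst} != src tile valid size {valid}"
    return None


# The case from this issue: 64x64 destination, 32x32 valid shape.
print(tstore_shape_error([64, 64], [32, 32]))
# A matching pair produces no error.
print(tstore_shape_error([32, 32], [32, 32]))
```

With this model, the 64x64 versus 32x32 pairing is reported at the first mismatching dimension, which is the behavior the real verifier would need for the static case.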

Reproducer

A generated case from this repo already demonstrates the problem:

  • source test case:
    test/pto_isa_st/TMaxs/tmaxs_float_64x64_32x32_32x32.py
  • generated PTO IR:
    build/output/TMaxs/tmaxs_float_64x64_32x32_32x32-pto-ir.pto
  • generated kernel:
    build/output_npu_validation/TMaxs/tmaxs_float_64x64_32x32_32x32/tmaxs_float_64x64_32x32_32x32_kernel.cpp

Relevant PTO IR:

%4 = pto.alloc_tile : !pto.tile_buf<loc=vec, dtype=f32, rows=32, cols=32, v_row=32, v_col=32, ...>
%5 = pto.alloc_tile : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>
%7 = pto.subset %4[%c0_5, %c0_5] sizes [64, 64] : !pto.tile_buf<loc=vec, dtype=f32, rows=32, cols=32, v_row=32, v_col=32, ...>
pto.tmaxs ins(%7, %6 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>, f32)
  outs(%5 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>)
pto.tstore ins(%5 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>)
  outs(%3 : !pto.partition_tensor_view<64x64xf32>)

This IR passes verification today, but it should be rejected by the verifier instead of surfacing in generated code.

Root cause

There seem to be two gaps:

  1. TStoreOp::verify() checks element types / address spaces, but does not check that destination partition shape matches the source tile valid shape.
  2. SubsetOp::verify() does not reject non-boxed subsets that enlarge the tile shape, so a 32x32 tile can become a subset ... sizes [64, 64].

Because of that, invalid IR is accepted and codegen simply materializes the mismatch.

Expected behavior

At least one of the following should be enforced:

  1. pto.tstore verifier should reject cases where:
    • the dst partition rank or shape does not match the src tile valid shape
    • in particular, for static shapes, dst shape != src valid_shape
  2. pto.subset verifier should reject shape enlargement:
    • subset result sizes must not exceed source tile shape
    • ideally also remain consistent with valid-shape semantics
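The no-enlargement rule for pto.subset can be sketched the same way. This is an illustration only: `subset_enlargement_error` is a hypothetical name, the check covers just the "sizes must not exceed source tile shape" rule, and it deliberately ignores offsets and valid-shape semantics, which the real verifier may also need to consider.

```python
from typing import Optional, Sequence


def subset_enlargement_error(
    src_shape: Sequence[int],
    sizes: Sequence[int],
) -> Optional[str]:
    """Return an error message if `sizes` enlarges the source tile, else None.

    src_shape models the source tile's (rows, cols); sizes models the
    `sizes [...]` list on the subset op.
    """
    if len(sizes) != len(src_shape):
        return (
            f"rank mismatch: subset has {len(sizes)} sizes "
            f"but source tile has rank {len(src_shape)}"
        )
    for i, (size, extent) in enumerate(zip(sizes, src_shape)):
        if size > extent:
            return f"dim {i}: subset size {size} exceeds source tile size {extent}"
    return None


# The subset from the reproducer: a 32x32 tile with sizes [64, 64].
print(subset_enlargement_error([32, 32], [64, 64]))
# Shrinking a 64x64 tile to 32x32 is fine.
print(subset_enlargement_error([64, 64], [32, 32]))
```

Rejecting on the first offending dimension keeps the diagnostic short; a real implementation could just as well report every offending dimension.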

Why this matters

Without these checks, invalid test cases or frontend bugs can silently generate PTO IR that looks structurally valid but violates PTO-ISA constraints at TSTORE lowering time.

Suggested fix

  • Add a shape compatibility check in TStoreOp::verify()
  • Add a no-enlargement check in SubsetOp::verify()
  • Optionally add a regression test using the existing tmaxs_float_64x64_32x32_32x32 pattern

Local reference points

  • lib/PTO/IR/PTO.cpp:
    • TStoreOp::verify()
    • SubsetOp::verify()
  • test/sample_utils/pto_isa_st_cases.py:
    • _subset_if_needed() currently builds pto.subset(... sizes=[dst_shape]) whenever src_shape != dst_shape, even when dst_shape > src_shape
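On the test-helper side, the decision could be guarded before any pto.subset is built. The sketch below is not the actual `_subset_if_needed()`; its real signature and return value are not shown in this issue, so `plan_subset` and its string results are hypothetical stand-ins for "reuse the tile", "emit a subset", and "refuse".

```python
from typing import Sequence


def plan_subset(src_shape: Sequence[int], dst_shape: Sequence[int]) -> str:
    """Decide what a helper like _subset_if_needed() could do for a shape pair.

    Returns "reuse" when no subset is needed and "subset" when dst_shape fits
    inside src_shape. Raises ValueError when dst_shape would enlarge the tile.
    """
    src, dst = list(src_shape), list(dst_shape)
    if src == dst:
        return "reuse"
    if len(src) != len(dst):
        raise ValueError(f"rank mismatch: src {src} vs dst {dst}")
    if any(d > s for d, s in zip(dst, src)):
        # This is the case the helper currently lets through.
        raise ValueError(f"dst shape {dst} exceeds src shape {src}")
    return "subset"


print(plan_subset([64, 64], [64, 64]))
print(plan_subset([64, 64], [32, 32]))
try:
    plan_subset([32, 32], [64, 64])
except ValueError as err:
    print("rejected:", err)
```

Failing in the helper would make the bad test case visible at generation time, independently of whether the verifier changes land.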
