TStore verifier should reject tile/partition shape mismatch and subset should not allow enlarging tile shape #322
Description
Summary
pto.tstore currently accepts IR where the destination partition shape is larger than the source tile valid shape.
This later lowers to PTO-ISA C++ like:
GlobalTensor<float, ..., Shape<..., 64, 64>, ...> dst = ...;
Tile<..., 64, 64, ..., 32, 32, ...> tile = ...;
TSTORE(dst, tile);
For PTO-ISA there is an implicit constraint: during TSTORE, the GlobalTensor shape must match the tile valid shape. The current verifier does not enforce this, so invalid IR can pass verification and only show up much later in generated code.
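The constraint can be modeled as a small standalone check. This is only a sketch in Python, not the MLIR C++ verifier: the function name `tstore_shape_error` and the use of `None` for a dynamic dimension are assumptions made for illustration, and it assumes the partition and the tile valid shape are compared at the same rank (as in the 2-D reproducer).

```python
from typing import Optional, Sequence

# Hypothetical marker for a dimension that is not known statically.
DYNAMIC = None


def tstore_shape_error(
    dst_shape: Sequence[Optional[int]],
    src_valid_shape: Sequence[Optional[int]],
) -> Optional[str]:
    """Return an error message if the shapes are incompatible, else None."""
    if len(dst_shape) != len(src_valid_shape):
        return (
            f"rank mismatch: dst partition rank {len(dst_shape)} "
            f"vs src tile valid rank {len(src_valid_shape)}"
        )
    for i, (d, v) in enumerate(zip(dst_shape, src_valid_shape)):
        # Only static dimensions can be compared at verification time.
        if d is DYNAMIC or v is DYNAMIC:
            continue
        if d != v:
            return f"dim {i}: dst partition size {d} != src tile valid size {v}"
    return None


# Shapes from this issue: a 64x64 destination and a 32x32 tile valid shape.
print(tstore_shape_error([64, 64], [32, 32]))
```

Skipping dynamic dimensions keeps the check conservative: it only rejects IR that is provably wrong from static information.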
Reproducer
A generated case from this repo already demonstrates the problem:
- source test case: test/pto_isa_st/TMaxs/tmaxs_float_64x64_32x32_32x32.py
- generated PTO IR: build/output/TMaxs/tmaxs_float_64x64_32x32_32x32-pto-ir.pto
- generated kernel: build/output_npu_validation/TMaxs/tmaxs_float_64x64_32x32_32x32/tmaxs_float_64x64_32x32_32x32_kernel.cpp
Relevant PTO IR:
%4 = pto.alloc_tile : !pto.tile_buf<loc=vec, dtype=f32, rows=32, cols=32, v_row=32, v_col=32, ...>
%5 = pto.alloc_tile : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>
%7 = pto.subset %4[%c0_5, %c0_5] sizes [64, 64] : !pto.tile_buf<loc=vec, dtype=f32, rows=32, cols=32, v_row=32, v_col=32, ...>
pto.tmaxs ins(%7, %6 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>, f32)
outs(%5 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>)
pto.tstore ins(%5 : !pto.tile_buf<loc=vec, dtype=f32, rows=64, cols=64, v_row=32, v_col=32, ...>)
outs(%3 : !pto.partition_tensor_view<64x64xf32>)
This verifies today, but should be rejected earlier.
Root cause
There seem to be two gaps:
- TStoreOp::verify() checks element types / address spaces, but does not check that destination partition shape matches the source tile valid shape.
- SubsetOp::verify() does not reject non-boxed subsets that enlarge the tile shape, so a 32x32 tile can become a subset ... sizes [64, 64].
Because of that, invalid IR is accepted and codegen simply materializes the mismatch.
Expected behavior
At least one of the following should be enforced:
- pto.tstore verifier should reject cases where:
  - dst partition rank/shape does not match src tile valid shape
  - especially for static cases, dst shape != src valid_shape
- pto.subset verifier should reject shape enlargement:
  - subset result sizes must not exceed source tile shape
  - ideally also remain consistent with valid-shape semantics
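The no-enlargement rule for pto.subset could look roughly like the following. This is a hedged Python model, not the actual SubsetOp::verify() code; the name `subset_size_error` is hypothetical, and offsets are deliberately ignored because they are SSA values in the IR above and may not be statically known.

```python
from typing import Optional, Sequence


def subset_size_error(
    src_shape: Sequence[int],
    sizes: Sequence[int],
) -> Optional[str]:
    """Return an error message if `sizes` enlarges the source tile, else None."""
    if len(sizes) != len(src_shape):
        return (
            f"rank mismatch: subset has {len(sizes)} sizes "
            f"but source tile has rank {len(src_shape)}"
        )
    for i, (src_dim, size) in enumerate(zip(src_shape, sizes)):
        # A subset may keep or shrink a dimension, never grow it.
        if size > src_dim:
            return f"dim {i}: subset size {size} exceeds source tile size {src_dim}"
    return None


# The subset from this issue: a 32x32 tile with sizes [64, 64].
print(subset_size_error([32, 32], [64, 64]))
# A shrinking subset is accepted.
print(subset_size_error([64, 64], [32, 32]))
```

A stricter variant could also bound `offset + size` when the offsets are constants, but that goes beyond what this issue asks for.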
Why this matters
Without this check, invalid test cases or frontend bugs can silently generate PTO IR that looks structurally valid but violates PTO-ISA constraints at TSTORE lowering time.
Suggested fix
- Add a shape compatibility check in TStoreOp::verify()
- Add a no-enlargement check in SubsetOp::verify()
- Optionally add a regression test using the existing tmaxs_float_64x64_32x32_32x32 pattern
Local reference points
- lib/PTO/IR/PTO.cpp:
  - TStoreOp::verify()
  - SubsetOp::verify()
- test/sample_utils/pto_isa_st_cases.py:
  - _subset_if_needed() currently builds pto.subset(... sizes=[dst_shape]) whenever src_shape != dst_shape, even when dst_shape > src_shape
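On the test-generator side, the same rule could be applied before a subset is emitted at all. The sketch below does not reproduce the real _subset_if_needed() (its signature is not shown in this issue); `checked_subset_sizes` is a hypothetical helper that only decides which sizes, if any, would be passed to pto.subset.

```python
from typing import List, Optional, Sequence


def checked_subset_sizes(
    src_shape: Sequence[int],
    dst_shape: Sequence[int],
) -> Optional[List[int]]:
    """Return None if no subset is needed, the subset sizes when shrinking,
    and raise ValueError when the request would enlarge the tile."""
    if tuple(src_shape) == tuple(dst_shape):
        return None  # shapes already agree, no subset required
    if len(src_shape) != len(dst_shape):
        raise ValueError(f"rank mismatch: {src_shape} vs {dst_shape}")
    if any(d > s for s, d in zip(src_shape, dst_shape)):
        raise ValueError(
            f"subset would enlarge tile: src {tuple(src_shape)} -> dst {tuple(dst_shape)}"
        )
    return list(dst_shape)


# Shrinking a 64x64 tile to 32x32 is fine.
print(checked_subset_sizes((64, 64), (32, 32)))

# The pattern from this issue (32x32 -> 64x64) is rejected up front.
try:
    checked_subset_sizes((32, 32), (64, 64))
except ValueError as err:
    print(f"rejected: {err}")
```

Failing in the generator gives an earlier, clearer error than the verifier would, though the verifier checks are still needed for IR coming from other frontends.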