feat: Add TGEMV_MX family ops #418

Open
FangRui0 wants to merge 2 commits into hw-native-sys:main from FangRui0:add_gemv

FangRui0 commented Apr 2, 2026

TGEMV_MX is an A5-only op.

FangRui0 commented Apr 2, 2026

/run a5

gemini-code-assist bot left a comment

Code Review

This pull request introduces the pto.tgemv.mx operation family for mixed-precision matrix-vector multiplication on A5 targets, including IR definitions, documentation, and lowering patterns. Review feedback highlights the need for unique intrinsic names for the accumulation and bias variants to prevent backend collisions, the importance of updating conversion patterns and test expectations accordingly, and the necessity of preserving result types during view-to-memref transformations to maintain SSA consistency.

}];

let extraClassDeclaration = [{
static StringRef getIntrinsicName() { return "TGEMV_MX"; }

high

The intrinsic name for the accumulation variant should be specific to avoid collisions and ensure correct mapping in the backend. Following the pattern of other operations (like TMatmulMXAccOp), this should be TGEMV_MX_ACC.

    static StringRef getIntrinsicName() { return "TGEMV_MX_ACC"; }

FangRui0 (Contributor Author) replied:

The backend API is named "TGEMV_MX".

}];

let extraClassDeclaration = [{
static StringRef getIntrinsicName() { return "TGEMV_MX"; }

high

The intrinsic name for the bias variant should be specific. Using the same name as the base operation (TGEMV_MX) will cause issues during code generation as the argument count and positions differ. It should be TGEMV_MX_BIAS.

    static StringRef getIntrinsicName() { return "TGEMV_MX_BIAS"; }

FangRui0 (Contributor Author) replied:

The backend API is named "TGEMV_MX".

Comment on lines +6884 to +6885
replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX",
{dst, cIn, a, aScale, b, bScale}, rewriter);

high

This variant should map to the TGEMV_MX_ACC intrinsic. Mapping both acc and bias variants to the same TGEMV_MX name with different argument structures will lead to incorrect C code generation or compilation errors in the backend.

Suggested change
- replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX",
-     {dst, cIn, a, aScale, b, bScale}, rewriter);
+ replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX_ACC",
+     {dst, cIn, a, aScale, b, bScale}, rewriter);

FangRui0 (Contributor Author) replied:

The backend API is named "TGEMV_MX".

Comment on lines +6903 to +6904
replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX",
{dst, a, aScale, b, bScale, bias}, rewriter);

high

This variant should map to the TGEMV_MX_BIAS intrinsic. Using a generic name for operations with different argument signatures is fragile and likely incorrect for the EmitC backend.

Suggested change
- replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX",
-     {dst, a, aScale, b, bScale, bias}, rewriter);
+ replaceOrEraseWithOpaqueCall(op.getOperation(), "TGEMV_MX_BIAS",
+     {dst, a, aScale, b, bScale, bias}, rewriter);

FangRui0 (Contributor Author) replied:

The backend API is named "TGEMV_MX".
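
The disagreement above turns on whether a single backend name can serve call sites whose argument lists differ. In C++ this works when the backend declares overloads, because the compiler selects a candidate by argument types and positions. The accumulation and bias call sites both pass six arguments, so arity alone would not separate them; distinct tile types would. The following is a minimal sketch of that idea only. The tag types and the `GemvVariant` enum are hypothetical stand-ins, and it is an assumption (not confirmed in this thread) that the backend header overloads `TGEMV_MX` this way.

```cpp
// Hypothetical stand-ins for backend tile types. The real backend tiles
// are templates with many more parameters; these tags only carry a role.
struct AccTile {};
struct LeftTile {};
struct RightTile {};
struct ScaleTile {};
struct BiasTile {};

// Which overload was selected (illustrative only).
enum class GemvVariant { Base, Acc, Bias };

// Base form: dst, a, aScale, b, bScale.
constexpr GemvVariant TGEMV_MX(AccTile, LeftTile, ScaleTile, RightTile,
                               ScaleTile) {
  return GemvVariant::Base;
}

// Accumulating form: dst, cIn, a, aScale, b, bScale.
constexpr GemvVariant TGEMV_MX(AccTile, AccTile, LeftTile, ScaleTile,
                               RightTile, ScaleTile) {
  return GemvVariant::Acc;
}

// Bias form: dst, a, aScale, b, bScale, bias.
constexpr GemvVariant TGEMV_MX(AccTile, LeftTile, ScaleTile, RightTile,
                               ScaleTile, BiasTile) {
  return GemvVariant::Bias;
}
```

Under this assumption, emitting the same callee name for all three ops is safe as long as the emitted operand types are distinct enough for overload resolution. If the backend did not provide such overloads, the reviewer's concern about a collision would apply.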

}
};

struct PTOTGemvMXAccToTGEMV_MX

medium

The pattern name should reflect the specific intrinsic it targets. It is recommended to rename this to PTOTGemvMXAccToTGEMV_MX_ACC for consistency with other mixed-precision accumulation patterns.

Suggested change
- struct PTOTGemvMXAccToTGEMV_MX
+ struct PTOTGemvMXAccToTGEMV_MX_ACC

}
};

struct PTOTGemvMXBiasToTGEMV_MX

medium

The pattern name should be updated to PTOTGemvMXBiasToTGEMV_MX_BIAS to match the intended intrinsic target.

Suggested change
- struct PTOTGemvMXBiasToTGEMV_MX
+ struct PTOTGemvMXBiasToTGEMV_MX_BIAS

IRRewriter rewriter(ctx);
rewriter.setInsertionPoint(op);
rewriter.replaceOpWithNewOp<pto::TGemvMxOp>(
op, TypeRange{},

medium

Using TypeRange{} will drop any results the original operation might have had. If these operations are in the tensor domain (where they return a result aliasing the destination buffer), this will break the SSA chain. It is safer to use op->getResultTypes() to preserve the original result signature.

Suggested change
- op, TypeRange{},
+ op, op->getResultTypes(),

Comment on lines +22 to +23
// CHECK: TGEMV_MX(
// CHECK: TGEMV_MX(

medium

The test expectations should be updated to reflect the specific intrinsic names for the accumulation and bias variants.

// CHECK: TGEMV_MX_ACC(
// CHECK: TGEMV_MX_BIAS(

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 6b32a2640480
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260402_100305_manual_pr418.log
  • Manual command: /run a5
  • Triggered by: FangRui0
  • Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
  • Failed stage: sample-build-and-test / exit=1

Log tail

_inject_sync_loop-pto.cpp
Sync(test_inject_sync_two_event_id.py) OK   generated: test_inject_sync_two_event_id-pto.cpp
Sync(test_intercore_sync_a3_dyn.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3_missing_setffts.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3_modes.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a5_dyn.py) OK   generated: test_intercore_sync_a5_dyn-pto.cpp
Sync(test_intercore_sync_a5_functional.py) OK   generated: test_intercore_sync_a5_functional-pto.cpp
Sync(test_intercore_sync_a5_ptoisa_vec.py) OK   generated: test_intercore_sync_a5_ptoisa_vec-pto.cpp
Sync(test_intercore_sync_a5.py) OK   generated: test_intercore_sync_a5-pto.cpp
Sync(test_mem_inject_sync_basic.py) OK   generated: test_mem_inject_sync_basic-pto.cpp
Sync(test_set_wait_unified_api.py) OK   generated: test_set_wait_unified_api-pto.cpp
Sync(tmatmulk_autosync_a5.py) OK   generated: tmatmulk_autosync_a5-pto.cpp
Tcvt(tcvt.py) OK   generated: tcvt-pto.cpp
TileSetGetValue(tile_getval_mat_invalid.py) XFAIL python failed as expected
TileSetGetValue(tileSetGetValue.py) OK   generated: tileSetGetValue-pto.cpp
TInsert(tinsert.py) OK   generated: tinsert-pto.cpp
Trans(trans.py) OK   generated: trans-pto.cpp
Trap(trap.py) OK   generated: trap-pto.cpp
VectorAddition(vadd_pto_ir.py) OK   generated: vadd_pto_ir-pto.cpp
VectorAddition(vadd_validshape_hyper.py) OK   generated: vadd_validshape_hyper-pto.cpp
VectorAddition(vectorAddition.py) OK   generated: vectorAddition-pto.cpp
Xors(xors.py) OK   generated: xors-pto.cpp
Xor(xor.py)  OK   generated: xor-pto.cpp
-----------------------------
OK=172  FAIL=1  SKIP=4
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-04-02 10:06:11 =====

FangRui0 commented Apr 2, 2026

/run a5

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 1ae7e200ce24
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260402_105406_manual_pr418.log
  • Manual command: /run a5
  • Triggered by: FangRui0
  • Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
  • Failed stage: sample-build-and-test / exit=1

Log tail

_inject_sync_loop-pto.cpp
Sync(test_inject_sync_two_event_id.py) OK   generated: test_inject_sync_two_event_id-pto.cpp
Sync(test_intercore_sync_a3_dyn.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3_missing_setffts.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3_modes.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a3.py) SKIP requires --pto-arch=a3
Sync(test_intercore_sync_a5_dyn.py) OK   generated: test_intercore_sync_a5_dyn-pto.cpp
Sync(test_intercore_sync_a5_functional.py) OK   generated: test_intercore_sync_a5_functional-pto.cpp
Sync(test_intercore_sync_a5_ptoisa_vec.py) OK   generated: test_intercore_sync_a5_ptoisa_vec-pto.cpp
Sync(test_intercore_sync_a5.py) OK   generated: test_intercore_sync_a5-pto.cpp
Sync(test_mem_inject_sync_basic.py) OK   generated: test_mem_inject_sync_basic-pto.cpp
Sync(test_set_wait_unified_api.py) OK   generated: test_set_wait_unified_api-pto.cpp
Sync(tmatmulk_autosync_a5.py) OK   generated: tmatmulk_autosync_a5-pto.cpp
Tcvt(tcvt.py) OK   generated: tcvt-pto.cpp
TileSetGetValue(tile_getval_mat_invalid.py) XFAIL python failed as expected
TileSetGetValue(tileSetGetValue.py) OK   generated: tileSetGetValue-pto.cpp
TInsert(tinsert.py) OK   generated: tinsert-pto.cpp
Trans(trans.py) OK   generated: trans-pto.cpp
Trap(trap.py) OK   generated: trap-pto.cpp
VectorAddition(vadd_pto_ir.py) OK   generated: vadd_pto_ir-pto.cpp
VectorAddition(vadd_validshape_hyper.py) OK   generated: vadd_validshape_hyper-pto.cpp
VectorAddition(vectorAddition.py) OK   generated: vectorAddition-pto.cpp
Xors(xors.py) OK   generated: xors-pto.cpp
Xor(xor.py)  OK   generated: xor-pto.cpp
-----------------------------
OK=172  FAIL=1  SKIP=4
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-04-02 10:57:11 =====

FangRui0 commented Apr 2, 2026

/run a3

FangRui0 commented Apr 2, 2026

/run a5

FangRui0 commented Apr 3, 2026

/run a5 test/basic/tgemv_mx_emitc.pto

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 62b896849e46
  • Result summary: OK 0 / FAIL 1 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260403_154406_manual_pr418.log
  • Manual command: /run a5 test/basic/tgemv_mx_emitc.pto
  • Triggered by: FangRui0
  • Specified cases: test/basic/tgemv_mx_emitc.pto
  • Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • tgemv_mx_emitc (run, exit=2)

@reedhecre

A5 board test failure details: PR #418

tgemv_mx_emitc

stage=run info=exit=2

/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:107:5: error: static assertion failed due to requirement 'std::is_same_v<half, float>': TMatmulMX:No supported data type combination.
    static_assert((isFp4 || isFp8) && std::is_same_v<CType, float>, "TMatmulMX:No supported data type combination.");
    ^                                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/payload/pto-isa/include/pto/npu/a5/TMatmul.hpp:312:5: note: in instantiation of function template specialization 'pto::CheckMadMxValid<pto::Tile<pto::TileType::Acc, half, 1, 16, pto::BLayout::ColMajor, 1, 16, pto::SLayout::RowMajor, 1024, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Left, half, 1, 128, pto::BLayout::ColMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 1, 128, pto::BLayout::RowMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Right, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::ColMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    CheckMadMxValid<TileRes, TileLeft, TileLeftScale, TileRight, TileRightScale>();
    ^
/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/payload/pto-isa/include/pto/common/pto_instr.hpp:368:5: note: in instantiation of function template specialization 'pto::TGEMV_MX_IMPL<pto::AccPhase::Unspecified, pto::Tile<pto::TileType::Acc, half, 1, 16, pto::BLayout::ColMajor, 1, 16, pto::SLayout::RowMajor, 1024, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Left, half, 1, 128, pto::BLayout::ColMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 1, 128, pto::BLayout::RowMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Right, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::ColMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
    MAP_INSTR_IMPL(TGEMV_MX, cMatrix, aMatrix, aScaleMatrix, bMatrix, bScaleMatrix);
    ^
/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/payload/pto-isa/include/pto/common/pto_instr.hpp:22:34: note: expanded from macro 'MAP_INSTR_IMPL'
#define MAP_INSTR_IMPL(API, ...) API##_IMPL(__VA_ARGS__)
                                 ^
<scratch space>:218:1: note: expanded from here
TGEMV_MX_IMPL
^
/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/npu_validation/Basic/tgemv_mx_emitc/tgemv_mx_emitc_kernel.cpp:91:3: note: in instantiation of function template specialization 'pto::TGEMV_MX<pto::Tile<pto::TileType::Acc, half, 1, 16, pto::BLayout::ColMajor, 1, 16, pto::SLayout::RowMajor, 1024, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Left, half, 1, 128, pto::BLayout::ColMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 1, 128, pto::BLayout::RowMajor, 1, 128, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Right, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::ColMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>, pto::Tile<pto::TileType::Scaling, half, 128, 16, pto::BLayout::RowMajor, 128, 16, pto::SLayout::RowMajor, 512, pto::PadValue::Null, pto::CompactMode::Null>>' requested here
  TGEMV_MX(v9, v5, v6, v7, v8);
  ^
1 error generated.
gmake[2]: *** [CMakeFiles/tgemv_mx_emitc_kernel.dir/build.make:76: CMakeFiles/tgemv_mx_emitc_kernel.dir/tgemv_mx_emitc_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/tgemv_mx_emitc_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-03 15:45:51] ERROR: testcase failed (exit 2): tgemv_mx_emitc
[2026-04-03 15:45:51] === SUMMARY ===
[2026-04-03 15:45:51] OK=0 FAIL=1 SKIP=0
[2026-04-03 15:45:51] RESULTS_TSV=/tmp/ptoas-board-monitor-a5/runs/20260403_154406_manual_pr418/remote_npu_validation_results.tsv
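
The compile error above comes from a compile-time type check: the quoted `static_assert` requires an FP4 or FP8 input type and a `float` accumulator type, while the test instantiated every tile with `half`. The sketch below mirrors that condition as a standalone predicate to show which combinations it accepts. It is an illustration under stated assumptions: the element tag types are hypothetical, and how the backend actually computes `isFp4` and `isFp8` is not shown in the log.

```cpp
#include <type_traits>

// Hypothetical element-type tags; the backend uses its own half/fp8/fp4 types.
struct half_t {};
struct fp8_e4m3_t {};
struct fp4_e2m1_t {};

// Mirrors the shape of the quoted assertion:
//   (isFp4 || isFp8) && std::is_same_v<CType, float>
// AType is the input element type, CType the accumulator element type.
template <typename AType, typename CType>
constexpr bool isSupportedMxCombo() {
  return (std::is_same<AType, fp4_e2m1_t>::value ||
          std::is_same<AType, fp8_e4m3_t>::value) &&
         std::is_same<CType, float>::value;
}
```

With this predicate, `isSupportedMxCombo<fp8_e4m3_t, float>()` holds, while an all-`half` instantiation such as `isSupportedMxCombo<half_t, half_t>()` does not, which matches the failure reported for `tgemv_mx_emitc`.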

FangRui0 commented Apr 3, 2026

/run a5 test/basic/tgemv_mx_emitc.pto test/basic/tgemv_mx_variants_emitc.pto

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 0a830082228e
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260403_162806_manual_pr418.log
  • Manual command: /run a5 test/basic/tgemv_mx_emitc.pto test/basic/tgemv_mx_variants_emitc.pto
  • Triggered by: FangRui0
  • Specified cases: test/basic/tgemv_mx_emitc.pto,test/basic/tgemv_mx_variants_emitc.pto
  • Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
  • Failed stage: internal / RUN_ONLY_CASES matched zero buildable cases: test/basic/tgemv_mx_emitc.pto,test/basic/tgemv_mx_variants_emitc.pto

Log tail

, strided<[16, 1], offset: ?>, #pto.address_space<acc>>
  [Success] -> __cc__ float*
[Debug] Converting MemRef: memref<1x128xf8E4M3, strided<[128, 1], offset: ?>, #pto.address_space<left>>
  [Success] -> __ca__ float8_e4m3_t*
[Debug] Converting MemRef: memref<1x128xf16, strided<[128, 16], offset: ?>, #pto.address_space<scaling>>
  [Success] -> __fbuf__ half*
[Debug] Converting MemRef: memref<128x16xf8E4M3, strided<[16, 1], offset: ?>, #pto.address_space<right>>
  [Success] -> __cb__ float8_e4m3_t*
[Debug] Converting MemRef: memref<128x16xf16, strided<[16, 16], offset: ?>, #pto.address_space<scaling>>
  [Success] -> __fbuf__ half*
[Debug] Converting MemRef: memref<1x16xf32, strided<[16, 1], offset: ?>, #pto.address_space<bias>>
  [Success] -> __gm__ float*
===== END STAGE emit-basic-pto-cases rc=0 @ 2026-04-03 16:29:46 =====
basic direct pto emitted: test/basic/tgemv_mx_emitc.pto -> test/samples/Basic/tgemv_mx_emitc-pto.cpp, test/basic/tgemv_mx_variants_emitc.pto -> test/samples/Basic/tgemv_mx_variants_emitc-pto.cpp

===== INTERNAL ERROR =====
Traceback (most recent call last):
  File "/root/ptoas-board-monitor-a5/monitor.py", line 2071, in run_once
    summary = runner.run()
  File "/root/ptoas-board-monitor-a5/monitor.py", line 1499, in run
    self.generate_payload()
    ~~~~~~~~~~~~~~~~~~~~~^^
  File "/root/ptoas-board-monitor-a5/monitor.py", line 1441, in generate_payload
    self.resolve_payload_run_only_cases()
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/root/ptoas-board-monitor-a5/monitor.py", line 1122, in resolve_payload_run_only_cases
    raise RuntimeError(f"RUN_ONLY_CASES matched zero buildable cases: {self.run_only_cases}")
RuntimeError: RUN_ONLY_CASES matched zero buildable cases: test/basic/tgemv_mx_emitc.pto,test/basic/tgemv_mx_variants_emitc.pto

FangRui0 commented Apr 3, 2026

/run a5 tgemv_mx_emitc tgemv_mx_variants_emitc

@reedhecre

A5 board test passed

  • Trigger: manual
  • Source commit: 0a830082228e
  • Result summary: OK 2 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260403_164706_manual_pr418.log
  • Result TSV: /root/ptoas-board-monitor-a5/logs/20260403_164706_manual_pr418.tsv
  • Manual command: /run a5 tgemv_mx_emitc tgemv_mx_variants_emitc
  • Triggered by: FangRui0
  • Specified cases: tgemv_mx_emitc,tgemv_mx_variants_emitc
  • Triggering comment: feat: Add TGEMV_MX family ops #418 (comment)
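
The run that passed used bare case names (`tgemv_mx_emitc`), whereas the earlier run that hit `RUN_ONLY_CASES matched zero buildable cases` passed file paths (`test/basic/tgemv_mx_emitc.pto`). The monitor's matching logic is not shown in this thread, so the following is only a sketch of how a path could be reduced to the case name that matched; `caseNameFromSpec` is a hypothetical helper, not part of the monitor.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reduce a case spec such as
// "test/basic/tgemv_mx_emitc.pto" to the bare name "tgemv_mx_emitc".
// A spec that is already a bare name is returned unchanged.
inline std::string caseNameFromSpec(const std::string &spec) {
  assert(!spec.empty() && "case spec must not be empty");

  // Drop everything up to and including the last '/'.
  const std::size_t slash = spec.find_last_of('/');
  std::string name =
      (slash == std::string::npos) ? spec : spec.substr(slash + 1);

  // Drop a trailing extension such as ".pto", if present.
  const std::size_t dot = name.find_last_of('.');
  if (dot != std::string::npos && dot != 0) {
    name.erase(dot);
  }
  return name;
}
```

For example, `caseNameFromSpec("test/basic/tgemv_mx_emitc.pto")` yields `tgemv_mx_emitc`, the form accepted by `/run a5 tgemv_mx_emitc tgemv_mx_variants_emitc`.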
