Fix All codecheck findings #403

Open
HecreReed wants to merge 3 commits into hw-native-sys:main from HecreReed:codex/fix-codecheck-20260331

Conversation

@HecreReed
Collaborator

Summary

  • address the codecheck issues listed in CANN_pto-as_codecheck_questions_20260331092554.xlsx
  • refactor PTO transform and ptobc/ptoas code paths to remove duplicated logic and reduce oversized functions
  • add missing PR386 license headers to touched testdata files and clean up Python helper diagnostics

Validation

  • python3 -m py_compile python/pto/dialects/pto.py
  • git diff --check
  • ninja -C build-codecheck ptoas ptobc
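The three validation commands above could be chained in a small script so that a failing step stops the run. This is only a sketch, not part of the PR: the helper name `run_checks` and its `run` parameter are hypothetical, and the script assumes it is started from the repository root with a configured `build-codecheck` directory.

```python
import subprocess

# Commands copied verbatim from the Validation list above.
VALIDATION_COMMANDS = [
    ["python3", "-m", "py_compile", "python/pto/dialects/pto.py"],
    ["git", "diff", "--check"],
    ["ninja", "-C", "build-codecheck", "ptoas", "ptobc"],
]


def run_checks(commands, run=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Returns a list of (command string, exit code) pairs for the commands
    that were actually executed. `run` is injectable (hypothetical
    parameter) so the helper can be exercised without the real tools.
    """
    results = []
    for cmd in commands:
        code = run(cmd).returncode
        results.append((" ".join(cmd), code))
        if code != 0:
            break
    return results
```

Calling `run_checks(VALIDATION_COMMANDS)` returns one pair per executed command; the last pair carries the failing exit code, if any.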

@HecreReed HecreReed marked this pull request as ready for review March 31, 2026 03:00
@HecreReed
Collaborator Author

/run all

@reedhecre

A3 board test failed

Failed cases

  • syncHigh (run, exit=1)
  • abs (run, exit=1)

@reedhecre

A3 board test failure details: PR #403

syncHigh

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor/runs/20260331_113605_manual_pr403/npu_validation/Sync/syncHigh/main.cpp:91)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 1814538] 2026-03-31-11:43:09.740.363 (EZ9999):  The error from device(chipId:1, dieId:0), serial number is 1083, there is an exception of aivec error, core id is 17, error code = 0x4000000000000000, dump info: pc start: 0x124400000000, current: 0x1244000000a4, vec error info: 0xf01d, mte error info: 0x7103006e46, ifu error info: 0x20000fb1afdc0, ccu error info: 0xcc201000d80000f, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000000.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:389]
        TraceBack (most recent call last):
       The extend info: errcode:(0x4000000000000000, 0, 0) errorStr: VEC instruction error: the ub address out of bounds. fixp_error0 info: 0x3006e46, fixp_error1 info: 0x71, fsmId:0, tslot:5, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:402]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1497]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1346]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1346]
       [DFX_INFO]Aicore kernel execute failed, device_id=2, stream_id=1467, report_stream_id=1467, task_id=0, flip_num=0, fault kernel_name=_Z13run_sync_highPfS_, fault kernel info ext=_Z13run_sync_highPfS_, program id=0, hash=15681698180925654878.[FUNC:GetError][FILE:stream.cc][LINE:1346]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-03-31 11:43:11] ERROR: testcase failed (exit 1): syncHigh

abs

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor/runs/20260331_113605_manual_pr403/npu_validation/Abs/abs/main.cpp:91)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 2903205] 2026-03-31-11:56:44.091.058 (EZ9999):  The error from device(chipId:1, dieId:0), serial number is 1084, there is an exception of aivec error, core id is 37, error code = 0x4000000000000000, dump info: pc start: 0x124400000000, current: 0x12440000008c, vec error info: 0xf01b, mte error info: 0x7103006646, ifu error info: 0x2fffefa614f80, ccu error info: 0xcc201100d80000f, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000000.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:389]
        TraceBack (most recent call last):
       The extend info: errcode:(0x4000000000000000, 0, 0) errorStr: VEC instruction error: the ub address out of bounds. fixp_error0 info: 0x3006646, fixp_error1 info: 0x71, fsmId:0, tslot:7, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:402]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1497]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1346]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1346]
       [DFX_INFO]Aicore kernel execute failed, device_id=2, stream_id=1488, report_stream_id=1488, task_id=0, flip_num=0, fault kernel_name=_Z13abs_kernel_2dPfS_, fault kernel info ext=_Z13abs_kernel_2dPfS_, program id=0, hash=8649095210733992711.[FUNC:GetError][FILE:stream.cc][LINE:1346]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-03-31 11:56:45] ERROR: testcase failed (exit 1): abs
[2026-03-31 11:56:45] === SUMMARY ===
[2026-03-31 11:56:45] OK=161 FAIL=2 SKIP=0
[2026-03-31 11:56:45] RESULTS_TSV=/tmp/ptoas-board-monitor/runs/20260331_113605_manual_pr403/remote_npu_validation_results.tsv
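A log of this shape can be condensed into the failing test cases, the faulting kernel names, and the totals with a few regular expressions. The sketch below is an assumption about how one might triage it, not a tool used by the board monitor; the function name `summarize` and the returned dictionary layout are hypothetical, and the patterns are derived only from the lines shown above.

```python
import re

# Patterns derived from the log lines quoted above.
FAILED_RE = re.compile(r"ERROR: testcase failed \(exit (\d+)\): (\S+)")
KERNEL_RE = re.compile(r"fault kernel_name=([^,]+),")
SUMMARY_RE = re.compile(r"OK=(\d+) FAIL=(\d+) SKIP=(\d+)")


def summarize(log_text):
    """Extract failed cases, faulting kernel names, and totals from a log."""
    failed = [(name, int(code)) for code, name in FAILED_RE.findall(log_text)]
    kernels = KERNEL_RE.findall(log_text)
    match = SUMMARY_RE.search(log_text)
    totals = None
    if match:
        ok, fail, skip = (int(value) for value in match.groups())
        totals = {"ok": ok, "fail": fail, "skip": skip}
    return {"failed": failed, "kernels": kernels, "totals": totals}
```

Passing the full log text to `summarize` gives a quick overview before reading the individual `errorStr` fields; `totals` is `None` when the log has no `=== SUMMARY ===` section.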

@HecreReed
Collaborator Author

/run a3

@reedhecre

A3 board test failed

Failed cases

  • relu (run, exit=1)
  • paged_attention_example_kernel_softmax_prepare (run, exit=1)

@reedhecre

A3 board test failure details: PR #403

relu

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor/runs/20260331_141605_manual_pr403/npu_validation/Relu/relu/main.cpp:91)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 1028013] 2026-03-31-14:26:39.729.961 (EZ9999):  The error from device(chipId:1, dieId:0), serial number is 1086, there is an exception of aivec error, core id is 39, error code = 0x4000000000000000, dump info: pc start: 0x124400000000, current: 0x12440000008c, vec error info: 0xf01b, mte error info: 0x7103004a46, ifu error info: 0x2fffef4f05540, ccu error info: 0xcc2010029800075, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000000.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:389]
        TraceBack (most recent call last):
       The extend info: errcode:(0x4000000000000000, 0, 0) errorStr: VEC instruction error: the ub address out of bounds. fixp_error0 info: 0x3004a46, fixp_error1 info: 0x71, fsmId:0, tslot:7, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:402]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1497]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1346]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1346]
       [DFX_INFO]Aicore kernel execute failed, device_id=2, stream_id=1465, report_stream_id=1465, task_id=0, flip_num=0, fault kernel_name=_Z14relu_kernel_2dPfS_, fault kernel info ext=_Z14relu_kernel_2dPfS_, program id=0, hash=13611994874237834080.[FUNC:GetError][FILE:stream.cc][LINE:1346]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-03-31 14:26:41] ERROR: testcase failed (exit 1): relu

paged_attention_example_kernel_softmax_prepare

stage=run info=exit=1

[ERROR] aclrtSynchronizeStream(stream) failed: 507035 (/tmp/ptoas-board-monitor/runs/20260331_141605_manual_pr403/npu_validation/PyPTOIRParser/paged_attention_example_kernel_softmax_prepare/main.cpp:108)
[ERROR] RecentErrMsg: EZ9999: Inner Error!
EZ9999[PID: 1032833] 2026-03-31-14:27:06.066.872 (EZ9999):  The error from device(chipId:1, dieId:0), serial number is 1087, there is an exception of aivec error, core id is 47, error code = 0x4000000000000000, dump info: pc start: 0x124400000000, current: 0x1244000003f0, vec error info: 0xe01e, mte error info: 0x79030001b0, ifu error info: 0x20000fe1cfc00, ccu error info: 0xcc2000014800053, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000000.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:389]
        TraceBack (most recent call last):
       The extend info: errcode:(0x4000000000000000, 0, 0) errorStr: VEC instruction error: the ub address out of bounds. fixp_error0 info: 0x30001b0, fixp_error1 info: 0x79, fsmId:1, tslot:7, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:402]
       Kernel task happen error, retCode=0x31, [vector core exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1497]
       AIV Kernel happen error, retCode=0x31.[FUNC:GetError][FILE:stream.cc][LINE:1346]
       [AIC_INFO] after execute:args print end[FUNC:GetError][FILE:stream.cc][LINE:1346]
       [DFX_INFO]Aicore kernel execute failed, device_id=2, stream_id=1471, report_stream_id=1471, task_id=0, flip_num=0, fault kernel_name=_Z22kernel_softmax_preparePffPu6__bf16S_S_, fault kernel info ext=_Z22kernel_softmax_preparePffPu6__bf16S_S_, program id=0, hash=16053642148614005971.[FUNC:GetError][FILE:stream.cc][LINE:1346]
       rtStreamSynchronize execution failed, reason=vector core exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:65]
       synchronize stream failed, runtime result = 507035[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
[2026-03-31 14:27:07] ERROR: testcase failed (exit 1): paged_attention_example_kernel_softmax_prepare

@HecreReed
Collaborator Author

/run all

@reedhecre

A3 board test passed

  • Trigger mode: manual
  • Source commit: 19c6090d87d1
  • Result summary: OK 163 / FAIL 0 / SKIP 0
  • Log: /tmp/ptoas-board-monitor/logs/20260331_143704_manual_pr403.log
  • Results TSV: /tmp/ptoas-board-monitor/logs/20260331_143704_manual_pr403.tsv
  • Manual command: /run all
  • Triggered by: HecreReed
  • Trigger comment: https://github.com/zhangstevenunity/PTOAS/pull/403#issuecomment-4160282069

@reedhecre

reedhecre commented Mar 31, 2026

Codex Review

This comment is updated automatically by the review bot.

  • PR: #403 Fix All codecheck findings
  • Author: HecreReed
  • Base/Head: main / codex/fix-codecheck-20260331
  • Head SHA: 83d6ff0b20e5
  • Trigger: new commits pushed to the PR
  • Generated At: 2026-04-01T02:31:33Z
  • Previous Head SHA: 50c595dd516c
  • Status: completed

Summary

No issues were detected in PR #403.

Findings

No issues found.

@HecreReed HecreReed changed the title from "Fix 2026-03-31 codecheck findings" to "Fix All codecheck findings" on Apr 1, 2026
@HecreReed HecreReed force-pushed the codex/fix-codecheck-20260331 branch from 50c595d to 83d6ff0 on April 1, 2026 02:16