test: add qwen3 scope2 pto kernels #426
Code Review
This pull request introduces Qwen3 scope2 PTO kernels and updates the runop.sh script to include the Qwen3Scope2 directory in the test suite. The script was modified to apply default architecture and lowering level flags for these kernels. A review comment suggests decoupling the logic for these default flags to ensure that --pto-level=level3 is applied even if a user provides an explicit architecture override.
local has_pto_arch_override=0
if ((${#ptoas_flags[@]})); then
  for ((idx=0; idx<${#ptoas_flags[@]}; ++idx)); do
    if [[ "${ptoas_flags[idx]}" == "--pto-arch" && $((idx + 1)) -lt ${#ptoas_flags[@]} ]]; then
      target_arch="${ptoas_flags[idx + 1]}"
      has_pto_arch_override=1
    elif [[ "${ptoas_flags[idx]}" == --pto-arch=* ]]; then
      target_arch="${ptoas_flags[idx]#--pto-arch=}"
      has_pto_arch_override=1
    fi
  done
fi
if [[ "$A" == "Qwen3Scope2" && $has_pto_arch_override -eq 0 ]]; then
  ptoas_flags+=(--pto-arch a5 --pto-level=level3)
  target_arch="a5"
fi
The logic for applying default flags to the Qwen3Scope2 directory is currently coupled to the presence of the --pto-arch flag. If a user provides an explicit --pto-arch override in PTOAS_FLAGS but omits --pto-level, the required --pto-level=level3 default will not be applied, which will cause compilation failures for these specific kernels as they require Level-3 lowering. It is better to detect and apply these overrides independently.
local has_pto_arch_override=0
local has_pto_level_override=0
if ((${#ptoas_flags[@]})); then
  for ((idx=0; idx<${#ptoas_flags[@]}; ++idx)); do
    if [[ "${ptoas_flags[idx]}" == "--pto-arch" && $((idx + 1)) -lt ${#ptoas_flags[@]} ]]; then
      target_arch="${ptoas_flags[idx + 1]}"
      has_pto_arch_override=1
    elif [[ "${ptoas_flags[idx]}" == --pto-arch=* ]]; then
      target_arch="${ptoas_flags[idx]#--pto-arch=}"
      has_pto_arch_override=1
    elif [[ "${ptoas_flags[idx]}" == "--pto-level" && $((idx + 1)) -lt ${#ptoas_flags[@]} ]]; then
      has_pto_level_override=1
    elif [[ "${ptoas_flags[idx]}" == --pto-level=* ]]; then
      has_pto_level_override=1
    fi
  done
fi
if [[ "$A" == "Qwen3Scope2" ]]; then
  if [[ $has_pto_arch_override -eq 0 ]]; then
    ptoas_flags+=(--pto-arch a5)
    target_arch="a5"
  fi
  if [[ $has_pto_level_override -eq 0 ]]; then
    ptoas_flags+=(--pto-level=level3)
  fi
fi
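The decoupled behavior can be exercised outside runop.sh with a small standalone sketch. The function name `apply_qwen3_defaults` is hypothetical and not part of the script, and the sketch uses positional parameters in place of the `ptoas_flags` array to stay short. Only the flag names and the `Qwen3Scope2` directory name come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical standalone sketch of the decoupled defaults; not part of runop.sh.
# Usage: apply_qwen3_defaults <sample_dir> [flags from PTOAS_FLAGS...]
apply_qwen3_defaults() {
  local dir="$1"
  shift
  local has_arch=0 has_level=0 prev="" flag
  for flag in "$@"; do
    # "--flag value" form: the previous word was the flag, this word is its value.
    case "$prev" in
      --pto-arch)  has_arch=1 ;;
      --pto-level) has_level=1 ;;
    esac
    # "--flag=value" form.
    case "$flag" in
      --pto-arch=*)  has_arch=1 ;;
      --pto-level=*) has_level=1 ;;
    esac
    prev="$flag"
  done
  if [ "$dir" = "Qwen3Scope2" ]; then
    # Each default is appended only when its own flag is absent.
    if [ "$has_arch" -eq 0 ]; then set -- "$@" --pto-arch a5; fi
    if [ "$has_level" -eq 0 ]; then set -- "$@" --pto-level=level3; fi
  fi
  printf '%s\n' "$*"
}

apply_qwen3_defaults Qwen3Scope2 --pto-arch=a5
# prints: --pto-arch=a5 --pto-level=level3
apply_qwen3_defaults Qwen3Scope2 --pto-level level2
# prints: --pto-level level2 --pto-arch a5
```

With this shape, an explicit arch override no longer suppresses the level3 default, and an explicit level is left untouched.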
/run a5 decode_attention_incore_0 decode_attention_incore_1 decode_attention_incore_2 decode_attention_incore_3
A5 board test failed
Log tail
/run a5 decode_attention_incore_0 decode_attention_incore_1 decode_attention_incore_2 decode_attention_incore_3 decode_attention_incore_4 decode_attention_incore_5 decode_attention_incore_6 decode_attention_incore_7 decode_attention_incore_8 decode_attention_incore_9 decode_attention_incore_10 decode_attention_incore_11 decode_attention_incore_12 --pto-level=level3
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 45dbf6790c
if [[ "$A" == "Qwen3Scope2" && $has_pto_arch_override -eq 0 ]]; then
  ptoas_flags+=(--pto-arch a5 --pto-level=level3)
Preserve level3 default when arch is overridden
For Qwen3Scope2, this branch injects --pto-level=level3 only when no --pto-arch override is present, so a common override such as PTOAS_FLAGS='--pto-arch=a5' disables the level3 injection and runs these kernels at the ptoas default of level2. The new kernels use pto.alloc_tile addr=... (which ptoas documents as level3-only), so this path causes avoidable compile failures even though the user requested the correct arch. Consider detecting --pto-level independently and auto-injecting level3 only when it is not explicitly set.
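The failure mode described here can be reproduced in isolation. The sketch below mirrors the coupled condition from the diff in a hypothetical function, `coupled_defaults`, which is not part of runop.sh. It only prints which flags would be passed on for a given set of `PTOAS_FLAGS` words and does not invoke ptoas.

```shell
#!/usr/bin/env bash
# Hypothetical reproduction of the coupled default logic under review.
# Usage: coupled_defaults <sample_dir> [flags from PTOAS_FLAGS...]
coupled_defaults() {
  local dir="$1"
  shift
  local has_arch=0 prev="" flag
  for flag in "$@"; do
    # "--pto-arch value" form: the previous word was the flag.
    if [ "$prev" = "--pto-arch" ]; then has_arch=1; fi
    # "--pto-arch=value" form.
    case "$flag" in
      --pto-arch=*) has_arch=1 ;;
    esac
    prev="$flag"
  done
  # Both defaults hang off the arch check, so an arch override drops level3 too.
  if [ "$dir" = "Qwen3Scope2" ] && [ "$has_arch" -eq 0 ]; then
    set -- "$@" --pto-arch a5 --pto-level=level3
  fi
  printf '%s\n' "$*"
}

coupled_defaults Qwen3Scope2
# prints: --pto-arch a5 --pto-level=level3
coupled_defaults Qwen3Scope2 --pto-arch=a5
# prints: --pto-arch=a5
```

The second call shows the gap: the arch is the one the kernels need, yet no level flag is passed.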
A5 board test failed
Log tail
/run a5 decode_attention_incore_0 decode_attention_incore_1 decode_attention_incore_2 decode_attention_incore_3 decode_attention_incore_4 decode_attention_incore_5 decode_attention_incore_6 decode_attention_incore_7 decode_attention_incore_8 decode_attention_incore_9 decode_attention_incore_10 decode_attention_incore_11 decode_attention_incore_12 --pto-level=level3
A5 board test passed
/run a5 decode_attention_incore_0 decode_attention_incore_1 decode_attention_incore_2 decode_attention_incore_3 decode_attention_incore_4 decode_attention_incore_5 decode_attention_incore_6 decode_attention_incore_7 decode_attention_incore_8 decode_attention_incore_9 decode_attention_incore_10 decode_attention_incore_11 decode_attention_incore_12 --pto-level=level3
A5 board test failed
Log tail
/run a5 decode_attention_incore_0 decode_attention_incore_1 decode_attention_incore_2 decode_attention_incore_3 decode_attention_incore_4 decode_attention_incore_5 decode_attention_incore_6 decode_attention_incore_7 decode_attention_incore_8 decode_attention_incore_9 decode_attention_incore_10 decode_attention_incore_11 decode_attention_incore_12 --pto-level=level3
A5 board test failed
Failed cases
A5 board test failure details: PR #426
- decode_attention_incore_7
- decode_attention_incore_2
- decode_attention_incore_10
Summary
- Add test/samples/Qwen3Scope2/ with 13 qwen3_32b_decode_scope2.py generated .pto kernels
- Update test/samples/runop.sh to include Qwen3Scope2 in direct .pto coverage
- Default to --pto-arch a5 --pto-level=level3 for Qwen3Scope2 when no explicit override is provided
Details
These kernels are generated from the pypto-lib Qwen3 scope2 decode example and are intended to provide compile-regression coverage for pypto-generated A5 .pto inputs.
The kernels compile with the current ptoas flow when using A5 + level3 lowering. They are added as direct .pto samples instead of handwritten IR.
Remote board validation is intentionally left conservative in this draft: the workflow defaults skip these cases so this PR can land compile coverage first without changing the current board-run default surface.
Validation
PTOAS_BIN=/Users/laoda/pto/PTOAS/build/tools/ptoas/ptoas bash test/samples/runop.sh -t Qwen3Scope2