[Feature] Migrate AICPU launch to new rtsLaunchCpuKernel interface (BUILD_WITH_NEW_CANN) #356
Description
Summary
Migrate the AICPU kernel launch path from the legacy rtAicpuKernelLaunchExWithArgs API to the new rtsLaunchCpuKernel / rtsBinaryLoadFromFile / rtsFuncGetByName interface available in newer CANN versions. This is gated behind BUILD_WITH_NEW_CANN in the pypto codebase and should be adopted in simpler for forward compatibility.
Motivation / Use Case
Current state (simpler):
Both a2a3 and a5 platform backends use the legacy launch path in device_runner.cpp:
// src/a2a3/platform/onboard/host/device_runner.cpp:607
rtAicpuKernelLaunchExWithArgs(
rtKernelType_t::KERNEL_TYPE_AICPU_KFC, "AST_DYN_AICPU",
aicpu_num, &rt_args, nullptr, stream, 0);
This requires manually constructing rtAicpuArgsEx_t with kernelNameAddrOffset / soNameAddrOffset, embedding the kernel and SO names as fixed-size char arrays in a struct.
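To make the offset bookkeeping concrete, the following is a minimal sketch of the pattern described above. `LegacyAicpuArgs`, `NameOffsets`, `fill_names`, the buffer sizes, and the `libexample.so` name are all hypothetical and chosen for illustration; this is not the CANN definition of rtAicpuArgsEx_t. Only the field names kernelNameAddrOffset and soNameAddrOffset come from the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Hypothetical argument block: payload followed by fixed-size name buffers.
struct LegacyAicpuArgs {
    uint64_t payload[4];   // placeholder for the kernel arguments (32 bytes)
    char kernel_name[32];  // kernel name embedded in the struct
    char so_name[32];      // shared-object name embedded in the struct
};

// Hypothetical holder for the two offsets the legacy path has to fill in by hand.
struct NameOffsets {
    uint32_t kernelNameAddrOffset;
    uint32_t soNameAddrOffset;
};

// Copy both names into their fixed-size buffers and report their byte offsets
// from the start of the argument block.
inline NameOffsets fill_names(LegacyAicpuArgs& args, const char* kernel, const char* so) {
    // Names that do not fit would be silently truncated, so reject them up front.
    assert(std::strlen(kernel) < sizeof(args.kernel_name));
    assert(std::strlen(so) < sizeof(args.so_name));
    std::snprintf(args.kernel_name, sizeof(args.kernel_name), "%s", kernel);
    std::snprintf(args.so_name, sizeof(args.so_name), "%s", so);
    return NameOffsets{
        static_cast<uint32_t>(offsetof(LegacyAicpuArgs, kernel_name)),
        static_cast<uint32_t>(offsetof(LegacyAicpuArgs, so_name)),
    };
}
```

Calling `fill_names(args, "AST_DYN_AICPU", "libexample.so")` on a zero-initialized `LegacyAicpuArgs` yields offsets that depend entirely on the struct layout, which is the coupling the new interface is meant to remove.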
New interface (pypto, under BUILD_WITH_NEW_CANN):
pypto has migrated to a cleaner approach using LoadAicpuOp:
- Load: rtsBinaryLoadFromFile(jsonPath, &optionCfg, &binHandle) loads AICPU op info from a JSON descriptor
- Resolve: rtsFuncGetByName(binHandle, opName, &funcHandle) gets a function handle by name
- Launch: rtsLaunchCpuKernel(funcHandle, blockDim, stream, &launchCfg, &argInfo) launches with typed args
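The load, resolve, launch sequence above could be wrapped roughly as follows. This is a sketch of the control flow only: the `stub_*` functions, the handle typedefs, and `AicpuOpLoaderSketch` are hypothetical stand-ins so the example compiles without the CANN toolkit. They do not reproduce the real rts signatures or pypto's LoadAicpuOp class.

```cpp
#include <cassert>
#include <map>
#include <string>

// Stand-in handle types and stubs. These are NOT the CANN signatures; they only
// mimic the load / resolve / launch shape.
using BinHandle = int;
using FuncHandle = int;

inline int stub_binary_load(const std::string& json_path, BinHandle* out) {
    if (json_path.empty()) return -1;
    *out = 1;  // pretend the descriptor was loaded
    return 0;
}

inline int stub_func_get_by_name(BinHandle bin, const std::string& op_name, FuncHandle* out) {
    if (bin != 1 || op_name.empty()) return -1;  // fails if nothing was loaded
    *out = 100 + static_cast<int>(op_name.size());
    return 0;
}

inline int stub_launch(FuncHandle func, unsigned block_dim) {
    return (func > 0 && block_dim > 0) ? 0 : -1;
}

// Hypothetical wrapper: load once, resolve each op name once, launch many times.
class AicpuOpLoaderSketch {
public:
    int init(const std::string& json_path) { return stub_binary_load(json_path, &bin_); }

    int launch(const std::string& op_name, unsigned block_dim) {
        auto it = funcs_.find(op_name);
        if (it == funcs_.end()) {
            FuncHandle f = 0;
            if (stub_func_get_by_name(bin_, op_name, &f) != 0) return -1;
            it = funcs_.emplace(op_name, f).first;  // cache the resolved handle
            ++resolve_count_;
        }
        return stub_launch(it->second, block_dim);
    }

    int resolve_count() const { return resolve_count_; }

private:
    BinHandle bin_ = 0;
    std::map<std::string, FuncHandle> funcs_;
    int resolve_count_ = 0;
};
```

Caching the function handle per op name keeps name resolution off the hot launch path. Whether the real runtime benefits from such caching is an assumption to verify against the CANN documentation.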
Benefits:
- Cleaner API: No manual offset calculations (kernelNameAddrOffset, soNameAddrOffset), no embedded char arrays
- Forward compatibility: The legacy rtAicpuKernelLaunchExWithArgs may be deprecated in future CANN versions
- Consistency: Aligns simpler's host launch path with pypto's approach
- Custom op support: The new interface supports both built-in ops (LaunchBuiltInOp) and custom ops (LaunchCustomOp) through a unified LoadAicpuOp class
Proposed API / Behavior
Add BUILD_WITH_NEW_CANN compile flag support and a LoadAicpuOp-style abstraction:
// New path (when BUILD_WITH_NEW_CANN is defined):
// 1. Generate op info JSON at init time
// 2. Load binary handle: rtsBinaryLoadFromFile(...)
// 3. Resolve function handles: rtsFuncGetByName(...)
// 4. Launch: rtsLaunchCpuKernel(funcHandle, blockDim, stream, &cfg, &args)
// Legacy path (fallback):
// Existing rtAicpuKernelLaunchExWithArgs code unchanged
Scope:
- src/a2a3/platform/onboard/host/device_runner.cpp: AICPU launch in launch_aicpu_kernel()
- src/a5/platform/onboard/host/device_runner.cpp: same pattern
- New header: rts/rts_kernel.h dependency (from CANN toolkit)
- New include dependency gated behind #ifdef BUILD_WITH_NEW_CANN
Reference implementation: pypto/framework/src/machine/runtime/load_aicpu_op.cpp and load_aicpu_op.h
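The compile-time gating could look roughly like the sketch below. `launch_aicpu_kernel_sketch` is a hypothetical name, and each branch returns a label in place of the real runtime call so that the example builds without the CANN headers. In real code the new branch would also be the only place that includes rts/rts_kernel.h.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the two launch paths; real code would call the
// CANN runtime here instead of returning a label.
#ifdef BUILD_WITH_NEW_CANN
inline std::string launch_aicpu_kernel_sketch() {
    // New path: load the JSON descriptor, resolve the function handle, launch.
    return "rtsLaunchCpuKernel";
}
#else
inline std::string launch_aicpu_kernel_sketch() {
    // Legacy path: existing rtAicpuKernelLaunchExWithArgs code, unchanged.
    return "rtAicpuKernelLaunchExWithArgs";
}
#endif
```

Building with `-DBUILD_WITH_NEW_CANN` selects the new branch; without the flag, existing builds keep the legacy behavior.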
Alternatives Considered
- Keep legacy API only: Works for now, but risks breakage if CANN deprecates the old interface
- Conditional compilation (recommended): Use #ifdef BUILD_WITH_NEW_CANN to support both old and new paths, matching pypto's approach. This allows gradual migration without breaking existing builds
Additional Context
- pypto reference: framework/src/machine/runtime/load_aicpu_op.{h,cpp} and device_runner.cpp
- The new API requires the rts/rts_kernel.h header from the CANN toolkit
- AICore launch (rtKernelLaunchWithHandleV2) is unaffected; only AICPU launch changes