[release-1.15] fix ascend vnpu addResource#5377
Conversation
Signed-off-by: james <open4pd@4paradigm.com>
Signed-off-by: james <open4pd@4paradigm.com>
There was a problem hiding this comment.
Code Review
This pull request introduces a PodMap to the AscendDevice struct to track resource usage per pod, preventing duplicate additions or subtractions of resources for the same pod. It also updates AddResourceUsage and SubResourceUsage signatures, updates annotation keys from InRequestDevices to SupportDevices, and adds comprehensive unit tests for AddResource. The review feedback suggests adding a safety check to lazily initialize dev.PodMap if it is nil before writing to it, preventing potential runtime panics.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if _, ok := dev.PodMap[string(pod.UID)]; !ok { | ||
| dev.PodMap[string(pod.UID)] = &devices.DeviceUsage{ | ||
| Used: 1, | ||
| Usedcores: cono_dev.Usedcores, | ||
| Usedmem: cono_dev.Usedmem, | ||
| } | ||
| ads.AddResourceUsage(dev, cono_dev.Usedcores, cono_dev.Usedmem) | ||
| klog.V(5).Infof("add resource usage for pod %s. device %s usedmem %d", pod.Name, dev.DeviceInfo.ID, cono_dev.Usedmem) | ||
| } |
There was a problem hiding this comment.
If dev.PodMap is nil, assigning a value to it will cause a runtime panic (panic: assignment to entry in nil map). Although PodMap is initialized in NewAscendDevices and DeepCopy, it is safer to lazily initialize it if it is nil to prevent potential panics if AscendDevice is constructed or modified elsewhere without proper initialization.
if dev.PodMap == nil {
dev.PodMap = make(map[string]*devices.DeviceUsage)
}
if _, ok := dev.PodMap[string(pod.UID)]; !ok {
dev.PodMap[string(pod.UID)] = &devices.DeviceUsage{
Used: 1,
Usedcores: cono_dev.Usedcores,
Usedmem: cono_dev.Usedmem,
}
ads.AddResourceUsage(dev, cono_dev.Usedcores, cono_dev.Usedmem)
klog.V(5).Infof("add resource usage for pod %s. device %s usedmem %d", pod.Name, dev.DeviceInfo.ID, cono_dev.Usedmem)
}There was a problem hiding this comment.
New instances of AscendDevice are created through NewAscendDevices, DeepCopy, and getDeviceSnapshot. All of these interfaces initialize the PodMap, so there will be no null pointer issues currently.
From the perspective of defensive programming, I can add null pointer checks.
There was a problem hiding this comment.
I submitted a PR on the bot for this change: volcano-sh-bot#1. I'm not sure if this is the correct way to modify the bot's branch. If not, please let me know how to do it. @JesseStutler
I also have added this change to #5378 which will be merged to master
There was a problem hiding this comment.
Got it, If there are other places already init the struct, I think we can safely merge it, because AI may also have illusions and can't get the context of other places of codes
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: JesseStutler The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This is an automated cherry-pick of #5363