Device: Atlas 300I Duo
NPU partitioning is done with libvnpu. Driver version: 25.5.1; CANN version: 8.1.RC1
Pod YAML file:
kind: Deployment
metadata:
  name: ascend-soft-slice-pod
spec:
  replicas: 1  # adjust the replica count as needed
  selector:
    matchLabels:
      app: ascend310p
  template:
    metadata:
      annotations:
        huawei.com/vnpu-mode: 'hami-core'
      labels:
        app: ascend310p  # matched by the Deployment selector
    spec:
      runtimeClassName: ascend
      containers:
        - name: ubuntu-container
          image: dev.bingosoft.net/bingomatrix/my-mindie:1.0.0-300I-Duo
          imagePullPolicy: IfNotPresent
          command: ["bash", "-c", "sleep 86400"]
          resources:
            limits:
              huawei.com/Ascend310P: "1"  # request 1 physical NPU
              huawei.com/Ascend310P-memory: "10240"  # request 10Gi of device memory
              huawei.com/Ascend310P-core: "40"  # request 40% of the compute
          volumeMounts:
            - name: dshm
              mountPath: /dev/shm
            - name: ascend-toolkit
              mountPath: /usr/local/Ascend/ascend-toolkit
              readOnly: true  # read-only recommended, to avoid polluting the host environment
      volumes:
        - name: ascend-toolkit
          hostPath:
            path: /usr/local/Ascend/ascend-toolkit
            type: Directory
        - name: dshm
          emptyDir:
            medium: Memory
            sizeLimit: 2Gi  # adjust as needed, e.g. 1Gi or 2Gi
Allocating device memory with ACL inside the pod fails with the following error:
🔧 Initializing ACL...
📦 Allocating two 5GB device buffers (total ~10GB)...
[2026-04-24T07:38:58Z INFO limiter::worker] [Worker PID:549] Initialize SchedulerClient...
thread '<unnamed>' (549) panicked at crates/limiter/src/shmem.rs:33:25:
Worker failed to open NPU Manager shmem! Is the Daemon running?
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
thread '<unnamed>' (549) panicked at /rustc/59807616e1fa2540724bfbac14d7976d7e4a3860/library/core/src/panicking.rs:225:5:
panic in a function that cannot unwind
stack backtrace:
0: 0xffffa999cb30 - <<std[cc6062c208ed37d1]::sys::backtrace::BacktraceLock>::print::DisplayBacktrace as core[4e11ee24e72d71de]::fmt::Display>::fmt
1: 0xffffa99b1238 - core[4e11ee24e72d71de]::fmt::write
2: 0xffffa99a2ff0 - <std[cc6062c208ed37d1]::sys::stdio::unix::Stderr as std[cc6062c208ed37d1]::io::Write>::write_fmt
3: 0xffffa998d4cc - std[cc6062c208ed37d1]::panicking::default_hook::{closure#0}
4: 0xffffa9999f04 - std[cc6062c208ed37d1]::panicking::default_hook
5: 0xffffa999a0bc - std[cc6062c208ed37d1]::panicking::panic_with_hook
6: 0xffffa998d5a8 - std[cc6062c208ed37d1]::panicking::panic_handler::{closure#0}
7: 0xffffa9984d78 - std[cc6062c208ed37d1]::sys::backtrace::__rust_end_short_backtrace::<std[cc6062c208ed37d1]::panicking::panic_handler::{closure#0}, !>
8: 0xffffa998dd04 - __rustc[b7974e8690430dd9]::rust_begin_unwind
9: 0xffffa98d9fdc - core[4e11ee24e72d71de]::panicking::panic_nounwind_fmt
10: 0xffffa98d9f64 - core[4e11ee24e72d71de]::panicking::panic_nounwind
11: 0xffffa98da0bc - core[4e11ee24e72d71de]::panicking::panic_cannot_unwind
12: 0xffffa98db0cc - rtMalloc
13: 0xffff7369e7f0 - _ZN3acl17aclMallocMemInnerEPPvmb20aclrtMemMallocPolicyt
14: 0xffff7369fd3c - aclrtMalloc
15: 0xffff738e62e8 - <unknown>
16: 0xffffa94db9e8 - <unknown>
17: 0xffffa94db7a0 - _PyObject_MakeTpCall
18: 0xffffa942301c - _PyEval_EvalFrameDefault
19: 0xffffa95dc01c - <unknown>
20: 0xffffa95dc0c0 - PyEval_EvalCode
21: 0xffffa95dca7c - <unknown>
22: 0xffffa95dcb74 - <unknown>
23: 0xffffa95dcc98 - <unknown>
24: 0xffffa95e2cc8 - _PyRun_SimpleFileObject
25: 0xffffa95e3204 - _PyRun_AnyFileObject
26: 0xffffa95e413c - Py_RunMain
27: 0xffffa9619ed0 - Py_BytesMain
28: 0xffffa90f76c4 - <unknown>
29: 0xffffa90f77a8 - __libc_start_main
30: 0xaaaab44b08b0 - _start
31: 0x0 - <unknown>
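
For anyone trying to reproduce this, the allocation step that triggers the panic can be approximated as follows. This is a minimal sketch, not the exact script that produced the log above. It assumes the pyACL `acl` module shipped with CANN is importable in the container, that the vNPU is visible as device 0, and that "5GB" in the log means 5 GiB. The helper names `buffer_sizes` and `allocate_buffers` are hypothetical and chosen only for illustration.

```python
GIB = 1024 ** 3


def buffer_sizes(total_gib, count):
    """Split total_gib GiB into `count` equal buffer sizes, in bytes."""
    if count <= 0:
        raise ValueError("count must be positive")
    return [total_gib * GIB // count] * count


def allocate_buffers(sizes, device_id=0):
    """Allocate one device buffer per entry in `sizes`, then release everything.

    Assumption: pyACL is installed; it is imported here so the size helper
    above stays usable on machines without CANN.
    """
    import acl

    ACL_MEM_MALLOC_HUGE_FIRST = 0  # aclrtMemMallocPolicy value used by pyACL samples

    ret = acl.init()
    if ret != 0:
        raise RuntimeError(f"acl.init failed, ret={ret}")
    ret = acl.rt.set_device(device_id)
    if ret != 0:
        acl.finalize()
        raise RuntimeError(f"acl.rt.set_device failed, ret={ret}")

    ptrs = []
    try:
        for size in sizes:
            # acl.rt.malloc wraps aclrtMalloc, the call visible in the backtrace.
            ptr, ret = acl.rt.malloc(size, ACL_MEM_MALLOC_HUGE_FIRST)
            if ret != 0:
                raise RuntimeError(f"acl.rt.malloc({size}) failed, ret={ret}")
            ptrs.append(ptr)
        print(f"allocated {len(ptrs)} buffers, {sum(sizes)} bytes in total")
    finally:
        for ptr in ptrs:
            acl.rt.free(ptr)
        acl.rt.reset_device(device_id)
        acl.finalize()
```

Running `allocate_buffers(buffer_sizes(10, 2))` inside the pod requests two 5 GiB buffers. Under the GiB assumption the total is exactly the 10240 MiB limit in the YAML, although the panic in the log is raised when the limiter fails to open the NPU Manager shared memory, before any quota check is reported.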