Name and Version
build-cuda-debug/bin/llama-cli --version
version: 9621 (597b667)
built with GNU 13.3.0 for Linux x86_64
llama.log
Operating systems
Linux
GGML backends
CUDA
Hardware
Ryzen 9 5900X + 2x RTX 3060
Models
unsloth/Qwen3.5-35B-A3B/Qwen3.5-35B-A3B-UD-IQ4_NL.gguf
Problem description & steps to reproduce
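Running the model with the CUDA backend aborts: SOFT_MAX fails in ggml_cuda_compute_forward with "CUDA error: invalid argument" on device 0. See the attached log for the full output.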
First Bad Commit
No response
Relevant log output
Logs
0.11.338.063 D ggml_cuda_graph_check_compability: disabling CUDA graphs due to unsupported node type
0.11.348.030 E ggml_cuda_compute_forward: SOFT_MAX failed
0.11.348.725 E CUDA error: invalid argument
0.11.348.728 E current device: 0, in function ggml_cuda_compute_forward at /home/lf/codes/llama.cpp/ggml/src/ggml-cuda/ggml-cuda.cu:3163
0.11.348.729 E err