Releases: joninco/llama.cpp
b6220
b6209
opencl: mark `argsort` unsupported if cols exceed workgroup limit (#1…
b6170
CUDA: fix negative KV_max values in FA (#15321)
b6153
perplexity: give more information about constraints on failure (#15303)
* perplexity: give more information about constraints on failure
  This checks whether -np is insufficient vs context, and provides clues as to how much is needed for each.
* log formatting
* log error and return instead of storing max_seq_exceeded int
* check if s0 is zero for -np check
b6144
ggml : repack block_iq4_nlx8 (#14904)
ggml-ci