[pull] master from ggml-org:master #854

Merged
pull[bot] merged 5 commits into LongLeCE:master from ggml-org:master on Feb 6, 2026

Conversation


@pull pull bot commented Feb 6, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

ggerganov and others added 5 commits February 6, 2026 07:55
This commit addresses the TODO in llama-sampling.h to rename that header
and the implementation to llama-sampler.
* metal : skip loading all-zero mask
* cont : minor
* vulkan: make FA mask/softcap enables spec constants (see the sketch after this list)
* don't specialize for sinks
* bump timeout a little bit
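
For context on the spec-constants change: turning the mask/softcap enables into Vulkan specialization constants (rather than runtime values) lets the driver compile a pipeline variant with the disabled branches eliminated. Below is a minimal sketch of the host-side plumbing; the constant IDs, struct, and names (HAS_MASK, HAS_SOFTCAP) are hypothetical, not the actual llama.cpp identifiers.

```cpp
// Sketch only: passing "has mask" / "has softcap" as Vulkan specialization
// constants so the shader compiler can dead-code-eliminate disabled paths.
// Constant IDs, struct, and names are hypothetical, not llama.cpp's own.
#include <vulkan/vulkan.h>
#include <cstddef>
#include <cstdint>

struct FaSpecData {
    uint32_t has_mask;     // shader side: layout (constant_id = 0) const bool HAS_MASK
    uint32_t has_softcap;  // shader side: layout (constant_id = 1) const bool HAS_SOFTCAP
};

// The returned struct points at `data` (and at static map entries), so the
// caller must keep `data` alive until the pipeline is created.
VkSpecializationInfo make_fa_spec_info(const FaSpecData & data) {
    static const VkSpecializationMapEntry entries[2] = {
        { 0, offsetof(FaSpecData, has_mask),    sizeof(uint32_t) },
        { 1, offsetof(FaSpecData, has_softcap), sizeof(uint32_t) },
    };
    VkSpecializationInfo info = {};
    info.mapEntryCount = 2;
    info.pMapEntries   = entries;
    info.dataSize      = sizeof(FaSpecData);
    info.pData         = &data;
    return info;
}
```

The result would be plugged into VkPipelineShaderStageCreateInfo::pSpecializationInfo, one pipeline variant per enable combination; the trade-off is more pipelines to compile in exchange for cheaper shaders at run time.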
…19376)

The CPU and CUDA backends use fp16 for the VKQ accumulator type; this change does the same for Vulkan. This helps particularly with large head sizes, which are very register-limited.

I tried this for the coopmat1 path and it slowed down a bit. I didn't try it for the scalar path.

I applied the softmax bias that the CUDA backend uses to avoid overflow, although I was not able to reproduce the original bug without it.
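
To make the overflow concern concrete: fp16 has a maximum finite value of 65504, and exp(x) already exceeds that for x above roughly 11.09, so an fp16 softmax accumulator needs its logits shifted before exponentiation. Here is a minimal sketch of the standard online-softmax recurrence, using plain float in place of fp16; the logits and the bias value are illustrative, not the actual CUDA/Vulkan kernel code.

```cpp
// Sketch only: the running-max rescaling that keeps every exp() term <= 1,
// so an fp16 accumulator cannot overflow. `bias` stands in for the extra
// negative offset the commit borrows from the CUDA backend (illustrative).
#include <cmath>
#include <cstdio>

int main() {
    const float scores[] = { 3.0f, 8.0f, 15.0f, 12.0f }; // raw QK^T logits
    const float bias = -1.0f;                            // hypothetical value

    float m = -INFINITY; // running maximum of the logits seen so far
    float d = 0.0f;      // running denominator: sum of exp(s_i - m + bias)
    for (float s : scores) {
        const float m_new = std::fmax(m, s);
        d = d * std::exp(m - m_new)      // rescale old terms to the new max
          + std::exp(s - m_new + bias);  // new term, always <= exp(bias) < 1
        m = m_new;
    }
    // Every exponentiated term is bounded by exp(bias) < 1, well inside the
    // fp16 range of +/-65504, even though exp(15.0) alone would overflow it.
    std::printf("max = %f, denom = %f\n", m, d);
    return 0;
}
```

Since the same exp(bias) factor multiplies both the weighted-V numerator and the denominator d, it cancels in the final division, which is why applying the bias is safe.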
@pull pull bot locked and limited conversation to collaborators Feb 6, 2026
@pull pull bot added the ⤵️ pull label Feb 6, 2026
@pull pull bot merged commit 1946e46 into LongLeCE:master Feb 6, 2026
