[pull] master from ggml-org:master #179

Merged
pull[bot] merged 5 commits into LongLeCE:master from ggml-org:master
Jul 18, 2025
pull[bot] commented Jul 18, 2025

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.3)

Can you help keep this open source service alive? 💖 Please sponsor : )

lgai-exaone and others added 5 commits July 18, 2025 10:45
* graph : avoid huge warm-up graphs for MoE models

ggml-ci

* cont : bump max nodes to 8x model tensors
* Fix Gemma3n not executed as CUDA_GRAPH on NVGPUs

Gemma3n uses matrix-matrix addition as part of its input processing,
which wrongly triggers CUDA_GRAPH disablement on NVGPUs even when a
batch size of 1 is used.

* Exclude `project_per_layer_input` by matching node names

This ensures that all other graphs which don't exhibit this pattern do
not have their behavior changed.

* Revert unnecessary formatting changes
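
The name-based exclusion described in the commits above can be illustrated with a minimal sketch. It is not the code from this pull request: the `Node` and `Op` types, the `cuda_graph_compatible` function, and the "second operand has more than one row" heuristic are hypothetical stand-ins chosen for illustration. Only the node name `project_per_layer_input` comes from the commit message.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for graph nodes; not the real ggml types.
enum class Op { Add, MulMat, Other };

struct Node {
    Op          op;
    std::string name;      // name given to the node when the graph is built
    long        src1_rows; // rows of the second operand (1 for a vector)
};

// Returns true if the graph may still be run as a CUDA graph.
// Assumption: an addition whose second operand has more than one row is
// treated as a sign of batched input and disables CUDA graphs, unless the
// node name identifies it as the known per-layer input projection.
bool cuda_graph_compatible(const std::vector<Node> & nodes) {
    for (const Node & n : nodes) {
        const bool is_matrix_add = n.op == Op::Add && n.src1_rows > 1;
        const bool is_excluded =
            n.name.find("project_per_layer_input") != std::string::npos;
        if (is_matrix_add && !is_excluded) {
            return false; // pattern looks like batch size > 1
        }
    }
    return true;
}
```

Matching on the node name keeps the exception narrow: any other graph containing a matrix-matrix addition is still handled as before, which is the property the commit message emphasizes.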
pull[bot] locked and limited conversation to collaborators Jul 18, 2025
pull[bot] added the ⤵️ pull label Jul 18, 2025
pull[bot] merged commit 2adf8d8 into LongLeCE:master Jul 18, 2025
3 participants