Skip to content

[pull] master from ggml-org:master#1092

Merged
pull[bot] merged 5 commits intoLongLeCE:masterfrom
ggml-org:master
Apr 19, 2026
Merged

[pull] master from ggml-org:master#1092
pull[bot] merged 5 commits intoLongLeCE:masterfrom
ggml-org:master

Conversation

@pull
Copy link
Copy Markdown

@pull pull Bot commented Apr 19, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

rgerganov and others added 5 commits April 19, 2026 10:21
* rpc : refactor the RPC transport

Move all transport related code into a separate file and use the
socket_t interface to hide all transport implementation details.

* fix win32

* better socket_t construction
* server : speculative decoding using checkpoints

* server : fix draft check with checkpoints

* server : rename spec vars

* server : log levels

* server : refactored spec logic to speculative.cpp

* server : renamed spec checkpoints option

* server : fix spec checkpoints, logging

* speculative : checkpoints with draft model, logging

* server : n_tokens_cur and create_checkpoint in draft

* server : fix server_speculative_callback (slot.id)

* spec : fix ngram-map/begin idx_last_check

* spec : init ckpt (begin() wasn't called)

* chore: update webui build output

* server : restore sampler in spec checkpoint and clear mem

* cont : avoid --spec-use-checkpoints argument

* cont : remove server_prompt_checkpoint_with_size

* spec : rename (leave_draft_state)

* cont : clean-up

* cont : do not ignore partial drafts even if the are short

* cont : spec callback owned by session

* cont : simplify

* cont : avoid empty speculative session

* cont : simplify

* cont : simplify

* cont : enable mtmd speculative decoding

* cont : keep the spec sampler alive

* cont : simplify

* cont : fix nullptr deref + draft checkpoints

* cont : remove common_speculative_accept_response

* cont : remove callback

* cont : simplify

* cont : minor

* cont : simplify

* cont : fix accepted number

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
#21630 added the CMP0194 NEW policy to silence a CMake warning, but on Windows runners it caused CMake to prefer the MinGW toolchain for ASM and broke MSVC builds.

Reverting only that policy block restores the previous working behavior. The CMake 4.1+ warning comes back, but that is cosmetic and does not break any platform.

Reported-by: oobabooga

Refs: #21630

Co-authored-by: texasich <texasich@users.noreply.github.com>
* convert : support sentence-transformer 5.4 config files

* fix: embeddinggemma

* fix: mapping

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* fix: pooling_mode

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
@pull pull Bot locked and limited conversation to collaborators Apr 19, 2026
@pull pull Bot added the ⤵️ pull label Apr 19, 2026
@pull pull Bot merged commit 037bfe3 into LongLeCE:master Apr 19, 2026
8 of 17 checks passed
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants