Port CUDA SSD streaming support by linuxbest · Pull Request #349 · antirez/ds4

linuxbest · 2026-06-06T22:16:18Z

Add CUDA support for SSD streaming async selected loads and shared overlap decode paths, while preserving the existing Metal streaming path and ds4_lx session cancellation API.

Also update SSD streaming help text so it is described as a GPU graph backend feature instead of Metal-only.

Verification: make ds4; make ds4-server ds4-bench ds4-eval ds4-agent; make cuda-regression; make ds4 ds4-bench; git diff --check.

Quick cold-cache benchmark on NVIDIA GB10 with 8GB expert cache: prefill 2048 tokens at ~30-32 tok/s, decode 32 tokens at ~2.1 tok/s.

linuxbest · 2026-06-06T22:23:24Z

https://asciinema.org/a/7ceMiiSZi1Fc6rrJ

linuxbest wants to merge 1 commit into antirez:main from linuxbest:cuda-ssd-streaming
