speed-bench: add Mac Studio M1 Ultra 64GB streaming numbers by ozgursoy · Pull Request #350 · antirez/ds4

ozgursoy · 2026-06-07T04:19:16Z

Real-hardware 64GB streaming datapoint. The "Flash on 64GB MacBooks" section
currently only documents a 128GB machine with 64GB locked away; this is an
actual 64GB box.

Machine: Mac Studio M1 Ultra, 64GB, macOS 26.5
Model: Flash Q2 (DeepSeek-V4-Flash-IQ2XXS-w2Q2K...imatrix, ~81GB)

Command:
./ds4-bench -m ds4flash.gguf \
  --ssd-streaming --ssd-streaming-cache-experts 32GB \
  --ctx-start 2048 --ctx-max 32768 --step-incr 2048 --gen-tokens 128 \
  --csv speed-bench/m1_ultra_64gb_stream.csv

Results (2K-32K ctx): prefill ~108-118 t/s, generation ~5 t/s, both roughly
flat as context grows. Decode is SSD-bound: it does not scale with the Ultra
GPU, so generation stays close to what smaller 64GB Apple Silicon machines reach.
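
For reference, a rough sketch of the sweep grid implied by the flags above. It assumes --ctx-start and --ctx-max are both inclusive, which matches the "2K-32K" range reported here but is not checked against the ds4-bench source. The helper name `sweep_points` is made up for illustration.

```python
# Flag values copied from the command above.
CTX_START = 2048
CTX_MAX = 32768
STEP_INCR = 2048
GEN_TOKENS = 128

def sweep_points(start, stop, step):
    """Context sizes visited, ASSUMING both endpoints are included."""
    return list(range(start, stop + 1, step))

points = sweep_points(CTX_START, CTX_MAX, STEP_INCR)

# Tokens generated over the whole sweep, and the decode time that implies
# at the ~5 t/s reported above (decode only, prefill not counted).
total_gen_tokens = len(points) * GEN_TOKENS
decode_seconds = total_gen_tokens / 5.0

print(len(points), points[0], points[-1])
print(total_gen_tokens, round(decode_seconds, 1))
```

Under that assumption the sweep has 16 context sizes, so the CSV should carry one row per size if the tool writes a row per step (also an assumption).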

Bonus, simulated 32GB (--ssd-streaming --simulate-used-memory 32GB): with only
~32GB available the expert cache can't stay resident for the 81GB model, so
decode collapses to ~0.17 t/s (prefill ~60 t/s) - not practically usable. Can
add a CSV for it if useful.
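
To put the two decode rates side by side, a small back-of-the-envelope sketch using only the approximate figures quoted in this description (~5 t/s on the real 64GB box, ~0.17 t/s with 32GB simulated) and the 128-token generation length from the command. The function names are illustrative, not part of ds4-bench.

```python
GEN_TOKENS = 128          # --gen-tokens from the command
DECODE_64GB_TPS = 5.0     # approximate, from the results above
DECODE_32GB_TPS = 0.17    # approximate, simulated 32GB run

def decode_seconds(tokens, tokens_per_second):
    """Wall-clock seconds to generate `tokens` at a steady rate."""
    return tokens / tokens_per_second

def slowdown(baseline_tps, degraded_tps):
    """How many times slower the degraded rate is than the baseline."""
    return baseline_tps / degraded_tps

t64 = decode_seconds(GEN_TOKENS, DECODE_64GB_TPS)
t32 = decode_seconds(GEN_TOKENS, DECODE_32GB_TPS)

print(round(t64, 1), "s on 64GB")
print(round(t32 / 60, 1), "min with 32GB simulated")
print(round(slowdown(DECODE_64GB_TPS, DECODE_32GB_TPS), 1), "x slower")
```

With these rounded inputs, one 128-token generation goes from about 25.6 s to about 12.5 minutes, roughly a 29x slowdown.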

speed-bench: add Mac Studio M1 Ultra 64GB streaming numbers

bb1e3ba

ozgursoy wants to merge 1 commit into antirez:streaming from ozgursoy:m1-ultra-64gb-streaming-bench
