[Blog] Model inference with Prefill-Decode disaggregation #6479

Job Run time
1m 19s
4s
17s
12s
1m 26s
22s
45s
23s
32s
25s
3m 35s
18s
6m 9s
6m 13s
2m 16s
5m 20s
2m 21s
5m 51s
2m 2s
5m 58s
2m 31s
3m 15s
4m 14s
5m 34s
3m 39s
3m 6s
17s
20s
Total: 1h 8m 44s