[Blog] Model inference with Prefill-Decode disaggregation #6481

Job Run time
1m 6s
4s
1m 18s
13s
38s
8s
34s
36s
20s
24s
19s
6m 7s
3m 28s
3m 7s
6m 23s
3m 21s
2m 25s
5m 50s
4m 14s
2m 15s
6m 14s
4m 17s
4m 46s
2m 18s
2m 56s
6m 0s
14s
17s
1h 9m 52s