docs(benchmark): document warmup and add optional --num-warmup flag #138

Open
mvanhorn wants to merge 1 commit into z-lab:main from mvanhorn:docs/benchmark-warmup

Conversation

@mvanhorn

Summary

The benchmark warms up max(concurrency, 1) requests before timing, and every README example uses --concurrency 1. The documented default therefore warms with a single request, so the reported numbers undersell steady-state TPOT: cold-start figures keep climbing over the first several requests.

Changes

  • Add an optional --num-warmup (default 8) so the warmup count is decoupled from concurrency.
  • Add a short README note documenting warmup, so the default benchmark run reports steady-state throughput.

Default behavior for users who pass --concurrency >= 8 is unchanged.
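
For reviewers, here is a minimal sketch of how the warmup count could be decoupled from concurrency. It is an illustration under stated assumptions, not the code in this PR's diff: the function names (`build_parser`, `legacy_warmup_count`, `warmup_count`) are hypothetical, and combining the two values with `max()` is an assumption chosen because it keeps the warmup count unchanged for --concurrency >= 8.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser holding only the two flags discussed in this PR."""
    parser = argparse.ArgumentParser(description="benchmark warmup sketch")
    parser.add_argument("--concurrency", type=int, default=1)
    # Default of 8 is taken from the PR description.
    parser.add_argument("--num-warmup", type=int, default=8)
    return parser


def legacy_warmup_count(concurrency: int) -> int:
    """Behavior described in the summary: warm up max(concurrency, 1) requests."""
    return max(concurrency, 1)


def warmup_count(concurrency: int, num_warmup: int) -> int:
    """Assumed new rule: never warm with fewer requests than either value."""
    return max(concurrency, num_warmup, 1)


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(warmup_count(args.concurrency, args.num_warmup))
```

Under this assumed rule, a README-style run with --concurrency 1 warms 8 requests instead of 1, while a run with --concurrency 16 still warms 16.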

Fixes #135

AI was used for assistance.

The benchmark warms up max(concurrency, 1) requests, and every README
example uses --concurrency 1, so the documented default warms with a
single request. Add an optional --num-warmup (default 8) and a README
note so steady-state TPOT is not undersold.

Fixes z-lab#135

Development

Successfully merging this pull request may close these issues.

Docs suggestion: cold-start TPOT undersells steady-state by ~35% (43 vs 69 tok/s) until the draft path warms
