refactor(benchmarks): explicit QUICK_SIZES/LONG_SIZES per spec (+ fix 14-min memory CI) #33
Merged
Merging this PR will not alter performance
ae78685 to
fb48f99
Compare
…thresholds)
Replace the derived quick subset and the long_threshold number with explicit
size-tier constants at the top of each model file, so what runs in each tier is
visible at a glance:
SIZES = (10, 50, 100, 250, 500, 1000, 1600) # --long
QUICK_SIZES = (10, 250) # --quick (per-PR)
LONG_SIZES = (1000, 1600) # --long only
Tiers: --quick → quick_sizes; default → sizes minus long_sizes; --long → all.
LONG_SIZES is the sizes above each old long_threshold, so default/--long behave
exactly as before. QUICK_SIZES is the lean per-PR pair — the giant-drop that
fixes the 14-min CodSpeed memory job, now spelled out rather than derived.
Patterns expose the same shape via the QUICK_SEVERITIES (0, 50, 100) constant
and long_sizes=() (every severity runs by default). skip_reason / introspect /
help text updated; long_threshold removed.
Supersedes the derivation-based giant-drop.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
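The tier rule in the commit message above can be sketched as follows. This is a minimal illustration, not the repository's code: `select_sizes` is a hypothetical helper name, and rejecting `--quick` together with `--long` is an assumption, since the PR does not say how that combination is handled.

```python
# Size-tier constants as quoted in the commit message.
SIZES = (10, 50, 100, 250, 500, 1000, 1600)
QUICK_SIZES = (10, 250)
LONG_SIZES = (1000, 1600)


def select_sizes(sizes, quick_sizes, long_sizes, *, quick=False, long=False):
    """Return the sizes to run for the requested tier (hypothetical helper)."""
    if quick and long:
        # Assumption: the two flags are treated as mutually exclusive here.
        raise ValueError("choose at most one of quick / long")
    if quick:
        # --quick: only the explicit per-PR subset.
        return tuple(quick_sizes)
    if long:
        # --long: every size.
        return tuple(sizes)
    # default: everything except the long-only sizes, original order kept.
    return tuple(s for s in sizes if s not in long_sizes)


if __name__ == "__main__":
    print("quick  ", select_sizes(SIZES, QUICK_SIZES, LONG_SIZES, quick=True))
    print("default", select_sizes(SIZES, QUICK_SIZES, LONG_SIZES))
    print("long   ", select_sizes(SIZES, QUICK_SIZES, LONG_SIZES, long=True))
```

With an empty `long_sizes`, as described for patterns, the default tier in this sketch returns every entry, which matches "every severity runs by default".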
fb48f99 to
5c99d67
Compare
TODO (human): one line on why.
Note
Generated by AI (Claude).
What
Follow-up to #31. Two goals:
… --quick, including the heaviest builds (basic n=1600 ≈ 550 MiB, knapsack 1M, qp 20000) under memray. Per-PR --quick now runs a lean 2 sizes per model; patterns keep (0, 50, 100).
… long_threshold number with constants visible at the top of each model file:
Tiers: --quick → quick_sizes; default → sizes minus long_sizes; --long → all.
LONG_SIZES = the sizes above each old long_threshold, so default/--long behave exactly as before; only --quick got leaner. Patterns expose the same shape via a QUICK_SEVERITIES = (0, 50, 100) constant and long_sizes=() (every severity runs by default; severity cost peaks at 100, which stays in quick+default).
skip_reason (the shared pytest↔memray gate), introspect, and CLI help updated; long_threshold removed.
Verify