refactor(benchmarks): explicit QUICK_SIZES/LONG_SIZES per spec (+ fix 14-min memory CI) #33

Merged
FBumann merged 1 commit into master from perf/quick-ci-time
Jun 6, 2026

@FBumann FBumann commented Jun 6, 2026

Collaborator

TODO (human): one line on why.

Note

Generated by AI (Claude).

What

Follow-up to #31. Two goals:

  1. Fix the 14-min CodSpeed memory job. The quick-subset rework in #31 (Benchmark selection rework + time/memory CLI unification) ran 3 sizes per model under --quick, including the heaviest builds (basic n=1600 ≈ 550 MiB, knapsack 1M, qp 20000), all under memray. Per-PR --quick now runs a lean 2 sizes per model; patterns keep (0, 50, 100).
  2. Make the size tiers explicit per spec (your call) — replace the derived quick subset and the long_threshold number with constants visible at the top of each model file:
SIZES       = (10, 50, 100, 250, 500, 1000, 1600)  # --long
QUICK_SIZES = (10, 250)                            # --quick (per-PR)
LONG_SIZES  = (1000, 1600)                          # --long only

Tiers: --quick → quick_sizes; default → sizes minus long_sizes; --long → all. LONG_SIZES = the sizes above each old long_threshold, so default/--long behave exactly as before; only --quick got leaner. Patterns expose the same shape via a QUICK_SEVERITIES = (0, 50, 100) constant and long_sizes=() (every severity runs by default; severity cost peaks at 100, which stays in quick+default).
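The tier rule above can be sketched as a small selection function. This is an illustration only: `select_sizes` and its string tier names are hypothetical and are not taken from the PR's harness; only the three constants come from the description.

```python
# Constants as shown in the PR description.
SIZES = (10, 50, 100, 250, 500, 1000, 1600)
QUICK_SIZES = (10, 250)
LONG_SIZES = (1000, 1600)


def select_sizes(tier, sizes, quick_sizes, long_sizes=()):
    """Return the sizes that run in a tier.

    Hypothetical helper (assumption): the real harness may expose this
    differently. `tier` is one of "quick", "default", "long".
    """
    if tier == "quick":
        # --quick: only the explicit per-PR subset.
        return tuple(quick_sizes)
    if tier == "default":
        # default: every size except the long-only ones.
        return tuple(s for s in sizes if s not in long_sizes)
    if tier == "long":
        # --long: everything.
        return tuple(sizes)
    raise ValueError(f"unknown tier: {tier!r}")


print(select_sizes("quick", SIZES, QUICK_SIZES, LONG_SIZES))    # (10, 250)
print(select_sizes("default", SIZES, QUICK_SIZES, LONG_SIZES))  # (10, 50, 100, 250, 500)
print(select_sizes("long", SIZES, QUICK_SIZES, LONG_SIZES))     # all seven sizes
```

With `long_sizes=()`, as the patterns use, the default and long tiers come out identical, which matches "every severity runs by default".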

skip_reason (the shared pytest↔memray gate), introspect, and CLI help updated; long_threshold removed.
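A shared gate of this kind might look roughly like the sketch below. The PR does not show the actual signature of skip_reason, so the parameters, tier names, and message strings here are assumptions; the point is only that one function decides skipping for both runners, which keeps their benchmark ids aligned.

```python
from typing import Optional, Sequence


def skip_reason(
    size: int,
    tier: str,
    quick_sizes: Sequence[int],
    long_sizes: Sequence[int],
) -> Optional[str]:
    """Return why `size` is skipped in `tier`, or None if it should run.

    Hypothetical signature (assumption): the real skip_reason in the
    harness is not shown in this PR.
    """
    if tier == "quick" and size not in quick_sizes:
        return f"n={size} is not in the quick tier"
    if tier == "default" and size in long_sizes:
        return f"n={size} runs only with --long"
    return None  # the "long" tier runs everything


# Both the pytest side and the memray side could call the same gate,
# so the two runners agree on which sizes are executed.
sizes = (10, 50, 100, 250, 500, 1000, 1600)
for n in sizes:
    reason = skip_reason(n, "quick", (10, 250), (1000, 1600))
    print(n, "run" if reason is None else f"skip: {reason}")
```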

Verify

  • ruff + mypy clean; harness tests pass (incl. memory↔pytest id alignment).
  • Tier check per spec confirms quick ⊆ default ⊆ long and that default/long match the old thresholds.
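The tier check in the last bullet could be expressed with set comparisons, as in this sketch. The function name and the `old_long_threshold` parameter are hypothetical, and the threshold value 500 in the usage lines is purely illustrative (the old thresholds are not stated in this PR).

```python
def tiers_are_consistent(sizes, quick_sizes, long_sizes, old_long_threshold=None):
    """Check quick ⊆ default ⊆ long, and optionally that the default tier
    equals "sizes at or below the old threshold".

    Hypothetical helper (assumption), not the PR's actual check.
    """
    long_tier = set(sizes)                       # --long runs everything
    default_tier = long_tier - set(long_sizes)   # default drops long-only sizes
    quick_tier = set(quick_sizes)                # --quick is the explicit subset

    nested = quick_tier <= default_tier <= long_tier
    if old_long_threshold is None:
        return nested

    # LONG_SIZES is described as "the sizes above each old long_threshold",
    # so the default tier should be exactly the sizes not above it.
    old_default = {s for s in sizes if s <= old_long_threshold}
    return nested and default_tier == old_default


SIZES = (10, 50, 100, 250, 500, 1000, 1600)
print(tiers_are_consistent(SIZES, (10, 250), (1000, 1600)))        # True
print(tiers_are_consistent(SIZES, (10, 250), (1000, 1600), 500))   # True (500 is illustrative)
print(tiers_are_consistent(SIZES, (10, 1600), (1000, 1600)))       # False: 1600 is long-only
```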

codspeed-hq Bot commented Jun 6, 2026


Merging this PR will not alter performance

✅ 79 untouched benchmarks
🆕 59 new benchmarks
⏩ 593 skipped benchmarks [1]

Performance Changes

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 🆕 | Memory | test_build[masked-n=100] | N/A | 735.4 KB | N/A |
| 🆕 | Memory | test_build[expression_arithmetic-n=250] | N/A | 19.6 MB | N/A |
| 🆕 | Memory | test_to_solver[gurobi-basic-n=250] | N/A | 69.1 MB | N/A |
| 🆕 | Memory | test_to_solver[gurobi-milp-n=50] | N/A | 1 MB | N/A |
| 🆕 | Memory | test_to_lp[cumsum-severity=100] | N/A | 29.3 MB | N/A |
| 🆕 | Memory | test_to_solver[highs-storage-n=250] | N/A | 37.5 MB | N/A |
| 🆕 | Memory | test_to_solver[gurobi-sparse_network-n=250] | N/A | 48.1 MB | N/A |
| 🆕 | Memory | test_to_solver[highs-knapsack-n=10000] | N/A | 2.5 MB | N/A |
| 🆕 | Memory | test_to_solver[highs-milp-n=50] | N/A | 802.1 KB | N/A |
| 🆕 | Memory | test_to_solver[gurobi-sos-n=1000] | N/A | 3.1 MB | N/A |
| 🆕 | Memory | test_build[nodal_balance-severity=100] | N/A | 31.5 MB | N/A |
| 🆕 | Memory | test_to_solver[gurobi-knapsack-n=10000] | N/A | 2.8 MB | N/A |
| 🆕 | Memory | test_to_solver[highs-piecewise-n=1000] | N/A | 4.2 MB | N/A |
| 🆕 | Memory | test_to_solver[highs-basic-n=250] | N/A | 55.1 MB | N/A |
| 🆕 | Memory | test_to_solver[highs-masked-n=100] | N/A | 659 KB | N/A |
| 🆕 | Memory | test_to_lp[qp-n=1000] | N/A | 20.4 KB | N/A |
| 🆕 | Memory | test_build[storage-n=250] | N/A | 11.3 MB | N/A |
| 🆕 | Memory | test_to_solver[gurobi-nodal_balance-severity=100] | N/A | 25.1 MB | N/A |
| 🆕 | Memory | test_build[knapsack-n=10000] | N/A | 792.3 KB | N/A |
| 🆕 | Memory | test_to_solver[highs-merge_balance-severity=100] | N/A | 24.9 MB | N/A |
| ... | ... | ... | ... | ... | ... |

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.


Comparing perf/quick-ci-time (5c99d67) with master (4db3c76)

Footnotes

  1. 593 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@FBumann FBumann force-pushed the perf/quick-ci-time branch from ae78685 to fb48f99 on June 6, 2026 22:40
…thresholds)

Replace the derived quick subset and the long_threshold number with explicit
size-tier constants at the top of each model file, so what runs in each tier is
visible at a glance:

    SIZES       = (10, 50, 100, 250, 500, 1000, 1600)  # --long
    QUICK_SIZES = (10, 250)                            # --quick (per-PR)
    LONG_SIZES  = (1000, 1600)                          # --long only

Tiers: --quick → quick_sizes; default → sizes minus long_sizes; --long → all.
LONG_SIZES is the sizes above each old long_threshold, so default/--long behave
exactly as before. QUICK_SIZES is the lean per-PR pair — the giant-drop that
fixes the 14-min CodSpeed memory job, now spelled out rather than derived.
Patterns expose the same shape via the QUICK_SEVERITIES (0, 50, 100) constant
and long_sizes=() (every severity runs by default). skip_reason / introspect /
help text updated; long_threshold removed.

Supersedes the derivation-based giant-drop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@FBumann FBumann force-pushed the perf/quick-ci-time branch from fb48f99 to 5c99d67 on June 6, 2026 22:41
@FBumann FBumann changed the title from "perf(benchmarks): drop model giants from --quick (fix 14-min CodSpeed memory job)" to "refactor(benchmarks): explicit QUICK_SIZES/LONG_SIZES per spec (+ fix 14-min memory CI)" on Jun 6, 2026
@FBumann FBumann merged commit 6d47dfa into master Jun 6, 2026
17 of 18 checks passed