feat(benchmarks): log₂ sweep colouring + --clip colour clamp #30

Merged
FBumann merged 7 commits into master from benchmark/plotting on Jun 6, 2026

FBumann (Collaborator) commented Jun 5, 2026

TODO (human): one or two sentences on why — e.g. "the sweep heatmap mis-rendered ratios on a linear scale; this fixes it and adds a colour clamp for the release-history memory sweep."

Note

The following was generated by AI (Claude).

What

  • Sweep view colours by log₂(ratio) instead of raw ratio. Plotly's continuous colour scale is linear with no log mode, so the old colouring made a 2× look twice as intense as its mirror ½×. Folds are now symmetric around 1×.
  • Fold-change colourbar with dynamic ticks: integer log₂ steps spanning the actual range, with labels (½×, …) generated from bound rather than hardcoded; cells and hover text also show the × fold.
  • --clip clamps the colour scale, the one thing you can't zoom after rendering (default: symmetric p95). Its unit follows the plot's colour: a fold-change (>1) for the fold-coloured sweep (--clip 8 = ⅛×–8×), an absolute Δ for the Δ-coloured scatter/compare views. The scaling view has no diverging colour and ignores it. Axes stay full-range and zoomable.
  • Default clamp is now the symmetric p95 of the data (was auto-fit) via a shared _symmetric_clip(magnitudes, override) helper, reused by both scatter and sweep.
  • Scatter ratio axis is now log-scaled (was linear) so a 2× and its mirror ½× read symmetrically about 1.0; non-positive ratios are dropped (a log axis can't show 0).
  • Compare bar colour uses the same symmetric p95 clamp (length still shows the full Δ). Scaling already log-scales size — unchanged.
  • --sort now actually sorts the compare bars by the chosen Δ — it was hardcoded alphabetical while --sort only switched the dimension (the name/help were misleading). Biggest regressions on top, improvements at the bottom.
  • plot preserves input order — you set the axis order by the order you pass snapshots (no ordering flags; plot serves arbitrary snapshots, not just linopy-<ver>).
  • numpy promoted to a module-level import (was lazy inside plot_scatter).
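As a rough illustration of the log₂ colouring and the generated fold-change ticks described above, here is a minimal stdlib-only sketch. The names `log2_colour`, `fold_ticks` and `fold_label` are hypothetical and not taken from the PR, and `bound` is assumed here to be a fold-change greater than 1; the actual code may hold it in log₂ units and may pick tick steps differently.

```python
import math

# Hypothetical helpers for illustration only; not the PR's implementation.
_UNICODE_FRACTIONS = {2: "½", 4: "¼", 8: "⅛"}


def log2_colour(ratios):
    """Map positive ratios to log2 values so 2x and 1/2x sit at +1 and -1."""
    return [math.log2(r) for r in ratios]


def fold_label(exponent):
    """Label an integer log2 step as a fold change, e.g. 3 -> '8×', -1 -> '½×'."""
    if exponent >= 0:
        return f"{2 ** exponent}×"
    denom = 2 ** -exponent
    text = _UNICODE_FRACTIONS.get(denom, f"1/{denom}")
    return f"{text}×"


def fold_ticks(bound):
    """Integer log2 tick positions within [1/bound, bound], with their labels.

    Assumes `bound` is a fold-change > 1. For bound < 2 only the 1× tick remains.
    """
    steps = math.floor(math.log2(bound))
    exponents = list(range(-steps, steps + 1))
    return exponents, [fold_label(e) for e in exponents]
```

With a clamp of 8, `fold_ticks(8)` yields tick positions -3 through 3 and labels running from ⅛× to 8×, which matches the `--clip 8 = ⅛×–8×` example in the list above.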

Verify

  • ruff + mypy clean.
  • Sweep renders verified for the default p95 clamp and --clip overrides (the colourbar range/ticks follow bound).
  • Scatter still works via the shared helper.
  • --clip validation is contextual: positive always; >1 only required for the sweep (fold-change) view.

Motivation

Surfaced while building a release-history memory sweep (linopy 0.2→0.7, plotted with --view sweep): a couple of 10× outliers washed the heatmap to white, and the linear colour scale skewed folds. log₂ colouring plus the p95/--clip clamp make it legible.

…ic p95)

The sweep heatmap coloured by raw ratio on plotly's linear scale, so a 2x and
its mirror 1/2x looked asymmetric. Colour by log2(ratio) instead — folds
symmetric around 1x, with a fold-change colourbar (1/8x...8x).

Add --clip to override the colour clamp (a fold-change >1 for sweep, an absolute
delta for scatter) over a new shared _symmetric_clip(magnitudes, override)
helper that defaults to the symmetric p95 of the data, reused by both views.
numpy promoted to a module-level import.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
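The commit above names a shared `_symmetric_clip(magnitudes, override)` helper but its body is not shown on this page. The following is a hedged stdlib-only sketch of what such a helper could look like; `symmetric_clip` and `percentile` are hypothetical stand-ins, the PR itself uses numpy, and the empty-input fallback of 1.0 is an arbitrary assumption.

```python
import math


def percentile(values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty list."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def symmetric_clip(magnitudes, override=None):
    """Return a bound b so that the colour range becomes [-b, +b].

    An explicit override wins; otherwise use the 95th percentile of the
    absolute values, ignoring NaN and infinities.
    """
    if override is not None:
        return float(override)
    finite = [abs(m) for m in magnitudes if math.isfinite(m)]
    if not finite:
        return 1.0  # assumption: arbitrary fallback when there is no data
    return percentile(finite, 95)
```

Taking the percentile of absolute values is what keeps the clamp symmetric: a handful of extreme outliers in either direction no longer stretch the colour range for everything else.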
FBumann force-pushed the benchmark/plotting branch from 59387f9 to c319c92 on June 5, 2026 21:21
codspeed-hq Bot commented Jun 5, 2026

Merging this PR will not alter performance

✅ 79 untouched benchmarks
⏩ 495 skipped benchmarks [1]


Comparing benchmark/plotting (3a78d7f) with master (520bb43)

Footnotes

  1. 495 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.

FBumann and others added 6 commits June 5, 2026 23:30
…ur in compare

Same fix as the sweep view, applied across the others:
- scatter: the ratio y-axis was linear, so a 2x and its mirror 1/2x read
  asymmetrically (they even centred it on 1.0 *linearly*). Make it log_y so folds
  are symmetric about 1.0; window symmetric in log space; drop non-positive
  ratios (a log axis can't show a 0).
- compare: clamp the bar colour with the shared symmetric p95 (consistency; the
  bar *length* still shows the full delta).
- scaling already log-scales size; left as is.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
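The scatter change in the commit above (drop non-positive ratios, then centre the window symmetrically in log space) can be sketched as follows. `symmetric_log_window` is a hypothetical name and this is only an illustration of the idea under those assumptions, not the code from the PR, which applies it through plotly's `log_y` axis.

```python
import math


def symmetric_log_window(ratios):
    """Drop non-positive ratios and return (kept, (low, high)).

    The window is symmetric about 1.0 in log space, so low == 1 / high.
    """
    kept = [r for r in ratios if r > 0]  # a log axis cannot show 0 or negatives
    if not kept:
        raise ValueError("no positive ratios to plot")
    half_span = max(abs(math.log2(r)) for r in kept)
    return kept, (2.0 ** -half_span, 2.0 ** half_span)
```

For ratios `[0.5, 2.0, 0.0, -1.0, 4.0]` this keeps `[0.5, 2.0, 4.0]` and gives the window `(0.25, 4.0)`, so a 4× slowdown and a ¼× speed-up would sit at equal distance from 1.0.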
…dimension

Was inconsistent: a fold for sweep but an absolute Δ for scatter, and compare
ignored it. Now --clip is always a fold-change (>1, default symmetric p95) that
bounds the *ratio* dimension wherever a view has one:
- sweep: the ratio colour (±log2)
- scatter: the ratio y-axis ([1/clip, clip]) — moved off the colour, which
  reverts to the auto symmetric-p95 Δ clamp
- compare / scaling: no ratio axis → ignored

Validation is now uniform (fold-change > 1).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ear per plot)

Colour is the one thing you can't adjust after the plot is drawn (axes zoom).
So --clip targets the colour only, and its unit follows the plot's colour scale:
- sweep (colour = log2 ratio): a fold-change (>1)
- scatter / compare (colour = absolute Δ): a linear Δ bound
- scaling: no diverging colour → ignored
Default stays the symmetric p95. Axes are full-range and zoomable — scatter's
y-axis no longer p95-clips (which hid outlier points). Validation is per-scale
(fold > 1 for sweep; any positive for the linear ones).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
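The per-scale validation described in the commit above could look roughly like this. `validate_clip` is a hypothetical function, and the view names are the ones mentioned in this PR; how the real CLI wires this into argument parsing is not shown here.

```python
def validate_clip(clip, view):
    """Check a --clip value against the colour scale of the given view.

    Returns the value as a float, or None when no override was passed.
    Raises ValueError for values the view cannot use.
    """
    if clip is None:
        return None  # caller falls back to the default symmetric p95
    value = float(clip)
    if not value > 0:  # also rejects NaN
        raise ValueError("--clip must be positive")
    if view == "sweep" and value <= 1:
        # sweep colours by log2(ratio), so the clamp is a fold-change
        raise ValueError("--clip must be a fold-change > 1 for the sweep view")
    return value
```

Keeping the positivity check unconditional and the `> 1` check specific to the sweep view mirrors the split between linear Δ colour scales and the fold-change scale.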
…for inputs

- compare: --sort was misleading — bars were hardcoded alphabetical
  (sort_values('test_id')) while --sort only switched the dimension. Now it
  sorts by the chosen Δ (delta_abs/delta_pct): biggest regressions on top,
  improvements at the bottom. The name/help are finally truthful.
- plot --order {input,version}: default 'input' preserves the order you pass
  (the plot never re-sorts); 'version' sorts inputs by parsed linopy-<ver>,
  fixing a glob's string order (0.3.10 before 0.3.2) for release-history sweeps.
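The string-order problem mentioned above (0.3.10 sorting before 0.3.2) is easy to reproduce. The sketch below uses a hypothetical `version_key` helper and example snapshot names; it illustrates the issue rather than the parsing used in this commit.

```python
import re


def version_key(name):
    """Extract the first dotted number from a name as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)*)", name)
    if match is None:
        raise ValueError(f"no version found in {name!r}")
    return tuple(int(part) for part in match.group(1).split("."))


names = ["linopy-0.3.10", "linopy-0.3.2", "linopy-0.2.0"]

# Plain string order compares character by character, so "1" < "2"
# puts 0.3.10 ahead of 0.3.2.
string_order = sorted(names)

# Comparing integer tuples gives (0, 3, 2) < (0, 3, 10) as expected.
version_order = sorted(names, key=version_key)
```

Note that any such key only makes sense when every input carries a version in its name, which is a limitation for snapshots named after arbitrary labels.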

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Applied after --order, so e.g. --order version --reverse = newest-first (which
also makes the newest snapshot the sweep baseline / compare 'a').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

plot serves arbitrary snapshots (bench labels, baseline.json, …), so parsing
linopy-<ver> from filenames is a leaky abstraction — --order version is
meaningless for non-version snapshots. plot already preserves input order, so
callers control the axis by the order they pass. The --sort fix (compare bars
sort by Δ, not alphabetically) stays — that was a real bug.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FBumann merged commit 3fdb9b4 into master on Jun 6, 2026
18 of 19 checks passed