Open
113 commits
8c908af
docs: update benchmark readme
FBumann May 27, 2026
413f1c6
benchmarks: reusable model registry, new model types, new phases, CI …
FBumann May 28, 2026
a6cc83b
benchmarks: add --long flag, gate super-long sizes by default
FBumann May 28, 2026
300abb5
benchmarks: make --quick truly quick (35s → 18s)
FBumann May 28, 2026
c725c68
benchmarks: add registry-usage notebook + execute in CI
FBumann May 28, 2026
99483f8
benchmarks: switch walkthrough to .ipynb, add reprs to ModelSpec
FBumann May 28, 2026
751aa78
benchmarks: typer-based CLI as the single entry point
FBumann May 28, 2026
8b124e2
benchmarks: pin typer==0.26.2, use ctx.args for pytest pass-through
FBumann May 28, 2026
86fd036
benchmarks: pin test infra + add transitive lockfile
FBumann May 28, 2026
9be18e1
benchmarks: add ``sweep`` subcommand for cross-version perf runs
FBumann May 28, 2026
51f418d
benchmarks: collapse README to a pointer, kill duplication
FBumann May 28, 2026
c0f3fee
benchmarks: make pypsa optional, expand notebook into proper guide
FBumann May 28, 2026
0522a75
benchmarks: sweep gains --phase / --model / --filter + pytest pass-th…
FBumann May 28, 2026
7bb464e
benchmarks: add ``compare`` subcommand wrapping pytest-benchmark compare
FBumann May 28, 2026
83bdeda
benchmarks: compare lists snapshots as relative paths (easier to copy…
FBumann May 28, 2026
8e378b5
benchmarks: tighter defaults for ``compare`` (median/iqr, sorted by n…
FBumann May 28, 2026
f67721b
benchmarks: compare gains ``--group-by=fullname`` default + ctx.args …
FBumann May 28, 2026
3ac333b
benchmarks: revert compare to manual arg-split + acknowledge typer wart
FBumann May 28, 2026
919e061
benchmarks: add ``plot`` subcommand (compare / sweep / scaling views)
FBumann May 28, 2026
c921b78
benchmarks: move plotting to benchmarks/plotting.py + text_auto + hov…
FBumann May 28, 2026
4c6f328
benchmarks: switch primary metric to ``min``, allow ``--metric`` over…
FBumann May 28, 2026
d703cb1
benchmarks: plot compare sorts/bars by absolute time delta by default
FBumann May 28, 2026
69693c0
benchmarks: add ``scatter`` plot view for two-snapshot exploration
FBumann May 28, 2026
2f08aa6
benchmarks: scatter view handles N snapshots via plotly animation_frame
FBumann May 28, 2026
321d2d9
benchmarks: scatter — include baseline as frame 0, clip colour to p95…
FBumann May 28, 2026
a0d4b7a
benchmarks: scatter — center y-axis symmetrically around 1.0
FBumann May 28, 2026
45700e7
benchmarks: address review — row height, scaling-from-params, mismatc…
FBumann May 28, 2026
ad7aa53
benchmarks: plot returns Figure, default output → .benchmarks/plots/<…
FBumann May 28, 2026
7c7bab2
benchmarks: plot renderers return (Figure, n_tests) — drop trace intr…
FBumann May 28, 2026
6a8a16d
benchmarks: notebook plot demo uses the CLI + tqdm progress
FBumann May 28, 2026
09dad9d
benchmarks: notebook plot demo accepts the full CLI command string
FBumann May 28, 2026
2ece2c1
benchmarks: memory tracks all phases via memray.Tracker; README accur…
FBumann May 28, 2026
ea4bc76
benchmarks: plot subcommand auto-detects memory snapshots alongside t…
FBumann May 28, 2026
cccd476
benchmarks: compare view drops unchanged tests (esp. memory)
FBumann May 28, 2026
d34824a
benchmarks: fix compare y-axis collision; revert unchanged-row filter
FBumann May 28, 2026
d88f235
benchmarks: compare view renders value text outside bars
FBumann May 28, 2026
abb3f14
benchmarks: compare bars keep alphabetical test-id order
FBumann May 28, 2026
914efbf
benchmarks: plot gains ``--facets {phase,model}`` for compare + scatter
FBumann May 28, 2026
eb687f1
benchmarks: faceted compare/scatter share one x + y axis label
FBumann May 28, 2026
5a08e79
benchmarks: notebook showcases ``--facets phase`` after compare/scatter
FBumann May 28, 2026
e24451a
benchmarks: faceted compare — per-facet rows, shared y-tick labels pe…
FBumann May 28, 2026
f4917dd
benchmarks: scatter as default compare view + expose load_long_df
FBumann May 28, 2026
2993b95
benchmarks: memory sweep + --rounds/--repeats overrides + centralized…
FBumann May 28, 2026
ac1df53
benchmarks: CodSpeed CI + Dependabot perf attribution loop
FBumann May 28, 2026
0e6ec41
benchmarks: drop lockfile, relocate walkthrough, Jupytext --build flow
FBumann May 28, 2026
3981cad
benchmarks: add CLI walkthrough as Jupytext MyST notebook
FBumann May 28, 2026
59eadb3
benchmarks: bump pinned jupytext to 1.19.3 (matches installed)
FBumann May 28, 2026
cbf517a
benchmarks: sweep --smoke for cross-version sanity checks
FBumann May 28, 2026
4ba6fb4
benchmarks: small cleanups (dead __iter__, naming, stale comments)
FBumann May 28, 2026
d86b111
benchmarks: delete unused SOLVER_BUILD phase + collapse models re-exp…
FBumann May 28, 2026
b153239
benchmarks: share phase verbs via benchmarks/phases.py + guard the seam
FBumann May 28, 2026
754e0ec
benchmarks: extract _provision_venvs helper to dedupe sweep plumbing
FBumann May 28, 2026
7d3e474
benchmarks: bump pinned numpy 1.26.4 → 2.4.6
FBumann May 29, 2026
2621a7b
benchmarks: relax numpy pin to <2.0 for wider sweep coverage
FBumann May 29, 2026
2656178
benchmarks: pin numpy back to ==1.26.4 (last 1.x)
FBumann May 29, 2026
e7f9c5b
benchmarks: fix sweep silently measuring dev linopy + getattr SOLVER_…
FBumann May 29, 2026
b35fafe
benchmarks: pin xarray to 2025.1.2 to extend sweep coverage to 0.4.4
FBumann May 29, 2026
11f56d2
benchmarks: shim write_lp for linopy <0.4.1, extending sweep floor to…
FBumann May 29, 2026
3091c64
benchmarks: add --as-of <DATE> for cross-time-reproducible sweeps
FBumann May 29, 2026
e74ae1e
benchmarks: harden the sweep isolation seam (preflight + no bytecode)
FBumann May 29, 2026
55612f5
benchmarks: copy harness into sweep venvs instead of symlinking
FBumann May 29, 2026
c031153
benchmarks: add ad-hoc `bench` helper for arbitrary callables
FBumann May 29, 2026
3df647c
benchmarks: make the suite mypy-clean
FBumann May 29, 2026
c5f23ec
benchmarks: extract snapshot.py + calibrate bench.time
FBumann May 29, 2026
2839145
benchmarks: split sweep orchestration out of cli.py
FBumann May 29, 2026
4502fed
benchmarks: drop the "Other CLI surfaces" table from the walkthrough
FBumann May 29, 2026
99f4f56
benchmarks: show load_long_df from-file diff in the walkthrough
FBumann May 29, 2026
927750f
benchmarks: label sweep snapshots by ref/sha for git/file specs
FBumann May 29, 2026
ee8d89a
feat(benchmarks): add user-pattern specs swept over a severity axis
FBumann Jun 4, 2026
cfcd4b2
feat(benchmarks): end-to-end pipeline memory + CLI/docs for patterns
FBumann Jun 4, 2026
9063ec8
refactor(benchmarks): one spec selector (--filter) + spec_param_id he…
FBumann Jun 4, 2026
8a4a8a6
fix(benchmarks): make the suite mypy-clean + repair the plot dependen…
FBumann Jun 4, 2026
abe6329
feat(benchmarks): add merge_balance and flow_sum patterns
FBumann Jun 4, 2026
f2e63a2
feat(benchmarks): add storage model + rolling pattern (intertemporal …
FBumann Jun 4, 2026
816a5ce
feat(benchmarks): add cumsum pattern
FBumann Jun 5, 2026
ac1fda4
ci(mypy): type-check benchmarks/ (anchor the legacy benchmark/ exclude)
FBumann Jun 5, 2026
7f2585b
benchmarks: harden pypsa example fetch + CodSpeed continue-on-error
FBumann Jun 5, 2026
827a947
Merge branch 'master' into benchmark-suite-charter
FBumann Jun 5, 2026
919e766
Merge remote-tracking branch 'origin/benchmark-suite-charter' into fe…
FBumann Jun 5, 2026
be65b12
refactor(benchmarks): split cli.py into a typer sub-app package
FBumann Jun 5, 2026
f07fa3c
test(benchmarks): group harness unit tests under _tests/
FBumann Jun 5, 2026
14b6445
docs(benchmarks): trim verbose inline comments in CI/config
FBumann Jun 5, 2026
e9934f0
feat(benchmarks): drop flow_sum pattern (irreducible, not a sparsity …
FBumann Jun 5, 2026
1db88ec
perf(benchmarks): size pattern dims so peak-RSS cliffs clear the nois…
FBumann Jun 5, 2026
8daaa2d
fix: read tuple coords entries as xarray's (dim_name, values)
FBumann Jun 5, 2026
049a940
test: parameterize and reorganize alignment coords tests
FBumann Jun 5, 2026
e08b44f
fix: correct coords-entry TypeError to not list tuple as a bare sequence
FBumann Jun 5, 2026
03beba4
fix(benchmarks): clearer scaling-plot axis labels via a per-axis table
FBumann Jun 5, 2026
942dec8
refactor(benchmarks): rename the spec-name field model -> spec
FBumann Jun 5, 2026
ac3a6d1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jun 5, 2026
fb4a8bd
ci: split CodSpeed into its own master-only workflow
FBumann Jun 5, 2026
1ddba73
merge: benchmark suite onto master (sparsity/memory baseline)
FBumann Jun 5, 2026
f235cc4
ci(codspeed): instrumentation on PRs-to-master, add walltime macro job
FBumann Jun 5, 2026
d0ca7d3
merge: tuple coords-entry regression fix (#766)
FBumann Jun 5, 2026
0ef08c0
fix(benchmarks): walkthrough groups by spec, not the renamed-away model
FBumann Jun 5, 2026
dc8d404
ci(codspeed): authenticate via OIDC instead of a token secret
FBumann Jun 5, 2026
ce449ce
ci(codspeed): set mode=simulation (required input in action v4)
FBumann Jun 5, 2026
ec473c8
ci(codspeed): add memory instrument job (heap allocations)
FBumann Jun 5, 2026
755085a
perf(benchmarks): scope CodSpeed runs, drop netcdf disk I/O
FBumann Jun 5, 2026
ab61e27
perf(benchmarks): thin default severity sweep 5->3 (0,50,100)
FBumann Jun 5, 2026
1e2f49d
refactor(benchmarks): rename phases to to_/from_ scheme; drop matrice…
FBumann Jun 5, 2026
0478494
perf(benchmarks): per-instrument CodSpeed subsets via --codspeed-set
FBumann Jun 5, 2026
d0d0c52
perf(benchmarks): simplify --codspeed-set to full|simulation
FBumann Jun 5, 2026
8593377
perf(benchmarks): scope walkthrough run/memory cells to one model
FBumann Jun 5, 2026
e2276dc
ci(codspeed): combine simulation+memory upload; per-PR memory; label-…
FBumann Jun 5, 2026
f37c8be
refactor(benchmarks): lean up — trim docstrings/comments, dedupe plot…
FBumann Jun 5, 2026
520bb43
ci(codspeed): drop simulation (cachegrind); memory is the always-on b…
FBumann Jun 5, 2026
3fdb9b4
feat(benchmarks): log₂ sweep colouring + --clip colour clamp (#30)
FBumann Jun 6, 2026
0dc9c0a
Benchmark selection rework + time/memory CLI unification (#31)
FBumann Jun 6, 2026
4db3c76
fix(benchmarks): sweep colourbar keeps several round fold labels at s…
FBumann Jun 6, 2026
6d47dfa
refactor(benchmarks): explicit QUICK_SIZES/LONG_SIZES per spec (drop …
FBumann Jun 6, 2026
a64c0f2
Merge branch 'PyPSA:master' into master
FBumann Jun 7, 2026
85c103c
pre-commit
FBumann Jun 7, 2026
27 changes: 27 additions & 0 deletions .github/dependabot.yml
@@ -12,3 +12,30 @@ updates:
github-actions:
patterns:
- '*'

# Pinned ``[benchmarks]`` extra in pyproject.toml. One PR per dep bump
# → CodSpeed CI runs and attributes any perf delta to that specific
# bump. Keeps the cross-version ``sweep`` baseline (lockfile-pinned)
# stable while still surfacing upstream perf changes per-PR with
# eyes-open review. Loose ``[project.dependencies]`` (numpy, scipy, ...)
# have no version specifier so Dependabot leaves them alone — only the
# ``==`` pins in ``[benchmarks]`` produce PRs.

Member

Pinning as you did in `pyproject.toml` doesn't give us full pins, because transitive dependencies (deps of deps) aren't pinned. Also, I'm not sure we need these per-dependency PRs just to get CodSpeed stats on them. Users will install the newest versions anyway, and it's very unlikely that we'd pin deps for users just because of CodSpeed.

So I'd fully remove this and the benchmark pins in `pyproject.toml`. For PRs, we probably want to commit `uv.lock` and use it in all PRs (for tests and benchmarks), while on master we always resolve from scratch. That means updating all CI runs, though, so it is better done in another PR; here we can just ignore dependency pinning.

Collaborator Author

I decided against a uv lock file so that we can sweep across linopy versions. As that is arguably not needed for CI and regression tracking, I'm fine with removing this and maybe even committing a lock file instead. But lockfile usage needs to stay optional to keep sweeping possible.
Another reason to add those here was to catch indirect improvements in upstream repos (xarray etc.), as a Dependabot PR would also run the benchmarks when bumping the deps.
But we could probably simplify this.

- package-ecosystem: pip
directory: /
schedule:
interval: monthly
open-pull-requests-limit: 5
groups:
# Measurement scaffolding + CLI/notebook tooling. Perf-irrelevant —
# they don't move CodSpeed signal, so batching into one PR cuts
# review noise. Perf-relevant deps (numpy, xarray, highspy, …) stay
# un-grouped so each gets its own attributed CodSpeed delta.
benchmark-tooling:
patterns:
- pytest
- pytest-benchmark
- pytest-memray
- pytest-codspeed
- nbconvert
- typer
- plotly
43 changes: 43 additions & 0 deletions .github/workflows/benchmark-smoke.yml
@@ -0,0 +1,43 @@
name: Benchmark smoke

# Builds every spec and fires every phase once under --quick
# --benchmark-disable: a "did a refactor break a spec?" check, not timing.

Member

How long does this run?

Collaborator Author

2 minutes: 1 for the actual smoke test, 1 for the notebook.

I think this can be optimized.

https://github.com/fluxopt/linopy/actions/runs/27085593476/job/79939326221


on:
push:
branches: [ master ]
pull_request:
branches: [ '*' ]

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
smoke:
name: Benchmark smoke (quick)
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0 # setuptools_scm

- name: Set up Python 3.12
uses: actions/setup-python@v6
with:
python-version: "3.12"

Member

Switch to 3.13?


- name: Install package and benchmark dependencies
run: |
python -m pip install uv
uv pip install --system -e ".[dev,benchmarks]"

- name: Run benchmark smoke
run: |
python -m benchmarks smoke

- name: Execute walkthrough notebook
# Catches doc rot: the walkthrough must stay runnable end-to-end.
run: |
python -m benchmarks notebook
62 changes: 62 additions & 0 deletions .github/workflows/codspeed-macro.yml
@@ -0,0 +1,62 @@
name: CodSpeed (walltime macro)

# Wall-clock benchmarks on CodSpeed's dedicated bare-metal macro runners — the
# mode that reflects the real cost of dense-vs-sparse work (cache, allocation,
# native numpy/scipy), which instruction counting under-weights.
#
# Master push (updates the walltime baseline) + manual dispatch + opt-in per-PR
# via the ``trigger:benchmark`` label. Off every *unlabelled* PR: macro-runner
# minutes are metered (600/month free), and self-hosted bare-metal shouldn't run
# arbitrary PR code — the label is a maintainer-controlled gate, so only apply it
# to trusted (same-repo) PRs.
#
# Requires the repo under a GitHub org (macro runners are org-only) with the

Member

Please remove the noise

# CodSpeed app connected to the repo (OIDC auth — no token secret needed).

on:

Member

Can we merge both codspeed files?

push:
branches: [ master ]
pull_request:
types: [ labeled, synchronize ]
branches: [ master ]
workflow_dispatch:

Member

I guess push master + workflow_dispatch is enough
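
For reference when weighing this suggestion, the job-level `if:` condition further down in this workflow decides whether the macro job runs for a given event. A minimal Python sketch of that predicate follows. The function name `macro_job_runs` is hypothetical, and the sketch ignores that GitHub expression comparisons are case-insensitive.

```python
def macro_job_runs(event_name: str, pr_labels: list[str]) -> bool:
    """Mirror of the workflow condition:
    github.event_name != 'pull_request'
      || contains(labels, 'trigger:benchmark')
    """
    return event_name != "pull_request" or "trigger:benchmark" in pr_labels


# One row per trigger listed under `on:` in this workflow.
for event, labels in [
    ("push", []),
    ("workflow_dispatch", []),
    ("pull_request", []),
    ("pull_request", ["trigger:benchmark"]),
]:
    print(event, labels, macro_job_runs(event, labels))
```

Dropping the `pull_request` trigger, as suggested, would make the predicate always true and the `if:` redundant.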


concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
macro:
name: CodSpeed walltime (macro runner)
# Always on master push / dispatch; on PRs only when explicitly labelled.
if: >-
${{ github.event_name != 'pull_request' ||
contains(github.event.pull_request.labels.*.name, 'trigger:benchmark') }}
runs-on: codspeed-macro

Member

What are the costs here? Running memory on GitHub runners and wall-clock time on CodSpeed runners looks great; can you point me to more resources, especially if we want to use this across multiple repos? How long does this run, and how much is free?

Collaborator Author

Walltime (macro) is free up to 600 min/month

Cost after that is 0,013 €/min

But I'm not sure if we can go over the 600 minutes without a paid plan...

Everything else can run on a GitHub runner and is free.

https://codspeed.io/pricing
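
The figures quoted above (600 free minutes per month, 0,013 €/min afterwards) can be turned into a rough budget check. This is a sketch only: the function name and the example run counts and durations are assumptions, the thread does not state how long one macro run takes, and, as noted above, it is unclear whether exceeding the free allowance is possible at all without a paid plan.

```python
FREE_MINUTES_PER_MONTH = 600   # free walltime allowance quoted in the thread
EUR_PER_EXTRA_MINUTE = 0.013   # overage rate quoted in the thread


def monthly_overage_eur(runs_per_month: int, minutes_per_run: float) -> float:
    """Cost in EUR for macro-runner minutes beyond the free allowance."""
    used = runs_per_month * minutes_per_run
    billable = max(0.0, used - FREE_MINUTES_PER_MONTH)
    return round(billable * EUR_PER_EXTRA_MINUTE, 2)


# Hypothetical workloads: 5 minutes per run is an assumed figure.
within_budget = monthly_overage_eur(runs_per_month=100, minutes_per_run=5)
over_budget = monthly_overage_eur(runs_per_month=200, minutes_per_run=5)
print(within_budget, over_budget)
```

Sharing one allowance across several repositories would simply add their `runs_per_month` together before applying the same formula, assuming the allowance is per organisation.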

# Non-gating until the CodSpeed app is connected to the repo (OIDC auth).
continue-on-error: true
permissions:
contents: read # actions/checkout
id-token: write # OIDC auth with CodSpeed — no token secret
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0 # setuptools_scm

- name: Set up Python 3.12
uses: actions/setup-python@v6
with:
python-version: "3.12"

- name: Install pinned benchmark environment
# Pinned ``[benchmarks]`` extra so Dependabot bumps → one CodSpeed delta each.
run: |
python -m pip install uv
uv pip install --system -e ".[dev,benchmarks]"

- name: Run benchmarks under CodSpeed (walltime)
uses: CodSpeedHQ/action@v4
with:
mode: walltime
run: |
pytest benchmarks/ --quick --codspeed
48 changes: 48 additions & 0 deletions .github/workflows/codspeed-memory.yml
@@ -0,0 +1,48 @@
name: CodSpeed (memory)

# Heap-allocation tracking — the always-on signal for this sparsity/memory fork.
# Fast (~2 min) and free on a GitHub runner, so it runs on master (baseline) and
# every PR. A solo instrument on ubuntu: its one upload per (commit, env) never
# clashes with the walltime run, which is a separate bare-metal environment.

on:
push:
branches: [ master ]
pull_request:
branches: [ master ]
workflow_dispatch:

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
memory:
name: CodSpeed memory
runs-on: ubuntu-latest
# Non-gating: informational, never blocks a merge.
continue-on-error: true
permissions:
contents: read # actions/checkout
id-token: write # OIDC auth with CodSpeed — no token secret
steps:
- uses: actions/checkout@v6
with:
fetch-depth: 0 # setuptools_scm

- name: Set up Python 3.12
uses: actions/setup-python@v6
with:
python-version: "3.12"

- name: Install pinned benchmark environment
run: |
python -m pip install uv
uv pip install --system -e ".[dev,benchmarks]"

- name: Run benchmarks under CodSpeed (memory)
uses: CodSpeedHQ/action@v4
with:
mode: memory
run: |
pytest benchmarks/ --quick --codspeed
4 changes: 4 additions & 0 deletions .gitignore
@@ -45,6 +45,10 @@ benchmark/scripts/__pycache__
benchmark/scripts/benchmarks-pypsa-eur/__pycache__
benchmark/scripts/leftovers/

# Benchmarks (internal suite): regenerable .ipynb viewing artifacts
benchmarks/walkthrough.ipynb
benchmarks/.ipynb_checkpoints/

# IDE
.idea/

121 changes: 48 additions & 73 deletions benchmarks/README.md
@@ -1,94 +1,69 @@
 # Internal Performance Benchmarks

-Measures linopy's own performance (build time, LP write speed, memory usage) across problem sizes using [pytest-benchmark](https://pytest-benchmark.readthedocs.io/) and [pytest-memray](https://pytest-memray.readthedocs.io/). Use these to check whether a code change introduces a regression or improvement.
+End-to-end performance tracking for `linopy` — build → solver handoff
+→ netCDF (de)serialization → fixed PyPSA model. Solver algorithm
+runtime is out of scope.

-> **Note:** The `benchmark/` directory (singular) contains *external* benchmarks comparing linopy against other modeling frameworks. This directory (`benchmarks/`) is for *internal* performance tracking only.
+**The walkthrough is load-bearing.** Phase coverage, CLI introspection,
+the two-snapshot regression workflow with inline Plotly views, and
+how to extend the suite live in [`walkthrough.md`](walkthrough.md).
+This README only covers install and how to open the walkthrough.

-## Setup
+> `benchmark/` (singular) is the legacy external-framework suite.
+> `benchmarks/` (plural) is this internal suite.

-```bash
-pip install -e ".[benchmarks]"
-```
+## Models vs patterns

-## Running benchmarks
+Two kinds of benchmark spec, same harness (time *or* peak memory — a
+`run`/`sweep` `--metric` flag, same phases), distinguished by their sweep axis:

-```bash
-# Quick smoke test (small sizes only)
-pytest benchmarks/ --quick
+- **Models** (`models/`, `REGISTRY`) — whole `linopy.Model`s swept over
+  `size` (axis `n`): "how does cost scale with the problem?"
+- **Patterns** (`patterns/`, `PATTERNS`) — fragments of realistic modelling
+  code (a balance constraint, a KVL contraction) swept over `severity`
+  (0–100, axis `severity`): "how does cost respond as one data shape goes
+  from benign to pathological?" Each `PatternSpec.description` documents what
+  its dial means (`"0: …, 100: …"`).

-# Full timing benchmarks
-pytest benchmarks/test_build.py benchmarks/test_lp_write.py benchmarks/test_matrices.py
+Both kinds build a complete `linopy.Model`, so both run the **same phases** and
+share the phase drivers (`test_build.py`, `test_matrices.py`, …) and `memory`
+grid — they're just more `(spec, value)` rows, tagged by `axis`. There is no
+separate pattern driver. Running a pattern through `build` *and* `lp_write`
+shows whether a dense-`_term` blow-up propagates to export or collapses.

-# Run a specific model
-pytest benchmarks/test_build.py -k basic
-```
+Patterns target the operations where the dense-`_term` representation forces
+materialisation — `groupby().sum()` padding, sparse `@` densification — so a
+`severity` sweep draws the cost cliff, and a cross-version `compare` shows a
+kernel change bending it. Adding either is one file: drop it in `models/` or
+`patterns/`, call `register(...)` / `register_pattern(...)`.

-## Comparing timing between branches
+## Install

 ```bash
-# Save baseline results on master
-git checkout master
-pytest benchmarks/test_build.py --benchmark-save=master
-
-# Switch to feature branch and compare
-git checkout my-feature
-pytest benchmarks/test_build.py --benchmark-save=my-feature --benchmark-compare=0001_master
-
-# Compare saved results without re-running
-pytest-benchmark compare 0001_master 0002_my-feature --columns=median,iqr
+uv sync --extra dev --extra benchmarks
+source .venv/bin/activate
 ```

-Results are stored in `.benchmarks/` (gitignored).
-
-## Memory benchmarks
+`pypsa` is optional — `pypsa_scigrid` and
+`test_pypsa_carbon_management.py` skip gracefully without it. Install
+when you need them: `uv pip install pypsa`.

-`memory.py` runs each test in a separate process with pytest-memray to get accurate per-test peak memory (including C/numpy allocations). Results are saved as JSON and can be compared across branches.
+The `[benchmarks]` extra in `pyproject.toml` pins every direct dep that
+affects measurement (`numpy`, `scipy`, `xarray`, `pandas`, `polars`,
+`dask`, etc.). `sweep` installs these into each per-version venv, so
+"same deps, only linopy varies" comes for free without a separate
+lockfile — bump the pins in pyproject and the next sweep picks them up.

-By default, only the build phase (`test_build.py`) is measured. Unlike timing benchmarks where `benchmark()` isolates the measured function, memray tracks all allocations within a test — including model construction in setup. This means LP write and matrix tests would report build + phase memory combined, making the phase-specific contribution impossible to isolate. Since model construction dominates memory usage, measuring build alone gives the most actionable numbers.
+## Open the walkthrough

 ```bash
-# Save baseline on master
-git checkout master
-python benchmarks/memory.py save master
-
-# Save feature branch
-git checkout my-feature
-python benchmarks/memory.py save my-feature
-
-# Compare
-python benchmarks/memory.py compare master my-feature
-
-# Quick mode (smaller sizes, faster)
-python benchmarks/memory.py save master --quick
-
-# Measure a specific phase (includes build overhead)
-python benchmarks/memory.py save master --test-path benchmarks/test_lp_write.py
+python -m benchmarks notebook --build      # (re)generate walkthrough.ipynb
+jupyter lab benchmarks/walkthrough.ipynb   # ...or PyCharm / VSCode
 ```

-Results are stored in `.benchmarks/memory/` (gitignored). Requires Linux or macOS (memray is not available on Windows).
-
-> **Note:** Small tests (~5 MiB) are near the import-overhead floor and may show noise of ~1 MiB between runs. Focus on larger tests for meaningful memory comparisons. Do not combine `--memray` with timing benchmarks — memray adds ~2x overhead that invalidates timing results.
-
-## Models
-
-| Model | Description | Sizes |
-|-------|-------------|-------|
-| `basic` | Dense N*N model, 2*N^2 vars/cons | 10 — 1600 |
-| `knapsack` | N binary variables, 1 constraint | 100 — 1M |
-| `expression_arithmetic` | Broadcasting, scaling, summation across dims | 10 — 1000 |
-| `sparse_network` | Ring network with mismatched bus/line coords | 10 — 1000 |
-| `pypsa_scigrid` | Real power system (requires `pypsa`) | 10 — 200 snapshots |
-
-## Phases
-
-| Phase | File | What it measures |
-|-------|------|------------------|
-| Build | `test_build.py` | Model construction (add_variables, add_constraints, add_objective) |
-| LP write | `test_lp_write.py` | Writing the model to an LP file |
-| Matrices | `test_matrices.py` | Generating sparse matrices (A, b, c, bounds) from the model |
-
-## Adding a new model
+The `.md` is the source of truth; the `.ipynb` is a disposable,
+gitignored build artifact. Edit the `.md`, re-run `--build`, re-open.
+Same workflow in any editor.

-1. Create `benchmarks/models/my_model.py` with a `build_my_model(n)` function and a `SIZES` list
-2. Add parametrized tests in the relevant `test_*.py` files
-3. Add a quick threshold in `conftest.py`
+CI executes the walkthrough end-to-end on every PR
+(`python -m benchmarks notebook`) so the examples can't silently rot.
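
The "Models vs patterns" section of the new README describes both kinds of spec as rows of one `(spec, value)` grid tagged by `axis`, registered through a single call per file. The sketch below illustrates that shape under loud assumptions: `SketchSpec`, `sketch_register`, `SKETCH_REGISTRY`, and `rows` are hypothetical stand-ins and are not the suite's real `ModelSpec`, `PatternSpec`, `register`, or `register_pattern`, whose signatures this page does not show. The spec names `basic` and `merge_balance` and the severity points (0, 50, 100) appear in this PR; the sizes and the builder bodies are made up.

```python
from collections.abc import Callable, Iterator
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchSpec:
    """Hypothetical spec record; the real suite's fields may differ."""

    name: str
    axis: str                       # "n" for models, "severity" for patterns
    values: tuple[int, ...]         # sweep points along that axis
    build: Callable[[int], object]  # would return a linopy.Model in the suite
    description: str = ""


SKETCH_REGISTRY: dict[str, SketchSpec] = {}


def sketch_register(spec: SketchSpec) -> SketchSpec:
    """Add a spec to the registry, rejecting duplicate names."""
    if spec.name in SKETCH_REGISTRY:
        raise ValueError(f"duplicate spec name: {spec.name}")
    SKETCH_REGISTRY[spec.name] = spec
    return spec


def rows() -> Iterator[tuple[str, str, int]]:
    """Yield one (spec, axis, value) row per sweep point, for both kinds."""
    for spec in SKETCH_REGISTRY.values():
        for value in spec.values:
            yield spec.name, spec.axis, value


# A "model" swept over size and a "pattern" swept over severity.
sketch_register(SketchSpec("basic", "n", (10, 100), build=lambda n: {"n": n}))
sketch_register(
    SketchSpec(
        "merge_balance",
        "severity",
        (0, 50, 100),
        build=lambda s: {"severity": s},
        description="0: benign, 100: pathological",
    )
)

grid = list(rows())
print(grid)
```

Because both kinds flatten into the same rows, a single phase driver can parametrize over `grid` without a separate code path for patterns, which is the design point the README makes.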