Runbook: external setup to enable the CodSpeed benchmark CI (org, OIDC, macro runners) #27

@FBumann


TODO (human): one or two sentences on why we want this captured — e.g. "so we can stand the CodSpeed benchmark CI back up on upstream linopy without rediscovering the org/runner dance."

Note

The following content was generated by AI (drafted by Claude while wiring CodSpeed up on this fork). It's a setup runbook, not discussion.

What this covers

The benchmark workflows are committed (.github/workflows/codspeed.yml, codspeed-memory.yml, codspeed-macro.yml), but they only report results once some out-of-band setup is in place that the repo itself cannot carry. This is the checklist for re-creating that setup on another repo or org (e.g. upstream linopy) later.

Prerequisites

  • The repo must live under a GitHub organization. Walltime/macro runners are org-only — a personal account cannot use them (CodSpeed greys the feature out with "this feature is only available for organization accounts").
  • CodSpeedHQ/action@v4. v4 made mode a required input — every job must set mode: to one of simulation | walltime | memory. (This bit us: omitting it fails the step instantly.)
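As a minimal sketch of the required input described above, a single benchmark step might look like the following. The step name and the `run` command are assumptions for illustration; the committed workflows may invoke the benchmarks differently.

```yaml
# One step inside a job's `steps:` list.
- name: Run benchmarks            # hypothetical step name
  uses: CodSpeedHQ/action@v4
  with:
    mode: simulation              # required in v4: simulation | walltime | memory
    run: pytest benchmarks/ --codspeed   # assumed command; adjust to the repo
```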

1. Connect CodSpeed

  1. Install the CodSpeed GitHub App on the org (not a personal account): app.codspeed.io → import → select the org → add the repo.
  2. Make sure the repo appears under the org workspace on CodSpeed, not your personal workspace. A repo transferred into the org keeps its old personal CodSpeed project. If it still shows under the personal workspace, disconnect it there and re-add it under the org; otherwise macro stays greyed out even though the app is installed on the org.

2. Auth — OIDC (no token secret)

  • The workflows use OIDC, so there is no CODSPEED_TOKEN secret to manage. Each job needs:
    permissions:
      contents: read   # actions/checkout
      id-token: write  # OIDC auth with CodSpeed
  • Once the app is connected (step 1), uploads authenticate automatically. Verified: the run logs show Performance data uploaded / Linked repository: <org>/<repo>.
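Put together, a job that authenticates through OIDC could be sketched roughly as below. The job name, the checkout version, and the benchmark command are assumptions rather than a copy of the committed workflow. What matters is that the job carries the two permissions and passes no token input to the action.

```yaml
jobs:
  benchmarks:                      # hypothetical job name
    runs-on: ubuntu-latest
    permissions:
      contents: read               # actions/checkout
      id-token: write              # OIDC auth with CodSpeed
    steps:
      - uses: actions/checkout@v4
      # (Python setup and dependency install omitted; repo-specific)
      - uses: CodSpeedHQ/action@v4
        with:
          mode: simulation
          run: pytest benchmarks/ --codspeed   # assumed command
          # no `token:` input here; authentication relies on OIDC
```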

3. Macro / walltime runners (org-only)

  1. CodSpeed side: in the org workspace settings, enable macro runners. This registers CodSpeed's bare-metal runners to the GitHub org (they appear in the org's Default runner group).
  2. GitHub side (public repos only): https://github.com/organizations/<org>/settings/actions/runner-groups → Default → tick "Allow public repositories". Without this the runs-on: codspeed-macro job sits queued forever (no eligible runner).
  3. Verified: after both, a previously-stuck codspeed-macro job is picked up by a runner within seconds.

4. Security hardening (do this)

  • Restrict the runner group to selected repositories (just this repo), not "all repositories" — limits blast radius on a public org.
  • Optionally restrict the runner group to the macro workflow (.../codspeed-macro.yml@refs/heads/master) so nothing else can land jobs on the bare-metal runners.
  • Deliberate design property: codspeed-macro.yml triggers on push: master + workflow_dispatch only — never pull_request — so untrusted fork PRs cannot execute on the self-hosted/macro runner (the main public-repo self-hosted-runner risk). Do not add a pull_request trigger to the macro workflow. (The PR-triggered simulation/memory jobs run on GitHub-hosted ubuntu-latest, not the macro runner.)
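The trigger restriction described above can be sketched as follows. This is an illustrative outline, not the committed codspeed-macro.yml: the workflow name, job name, and benchmark command are assumptions.

```yaml
name: CodSpeed macro               # hypothetical workflow name
on:
  push:
    branches: [master]
  workflow_dispatch:
  # deliberately no `pull_request:` trigger, so fork PRs never reach the macro runner

jobs:
  macro:                           # hypothetical job name
    runs-on: codspeed-macro        # CodSpeed bare-metal runner registered to the org
    permissions:
      contents: read
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - uses: CodSpeedHQ/action@v4
        with:
          mode: walltime
          run: pytest benchmarks/ --codspeed   # assumed command
```

Leaving `pull_request` out of the `on:` block is what keeps untrusted code off the runner; the runner-group restrictions above are a second layer in case a trigger is added by mistake.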

5. Workflows & cadence (already in the repo)

| Workflow            | mode       | Runs on              | Triggers                           | Cost          |
| codspeed.yml        | simulation | GitHub ubuntu-latest | push master + PR→master + dispatch | free          |
| codspeed-memory.yml | memory     | GitHub ubuntu-latest | push master + PR→master + dispatch | free          |
| codspeed-macro.yml  | walltime   | codspeed-macro (org) | push master + dispatch             | macro-minutes |

All three are continue-on-error: true (non-gating) until baselines/noise are characterized.
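One way to express the non-gating behaviour is at job level, sketched below. Whether the committed workflows set it on the job or on the individual step is not stated here, so treat the placement and the job name as assumptions.

```yaml
jobs:
  benchmarks:                      # hypothetical job name
    runs-on: ubuntu-latest
    continue-on-error: true        # a failure in this job does not fail the workflow run
    steps:
      - uses: actions/checkout@v4
      - uses: CodSpeedHQ/action@v4
        with:
          mode: memory
          run: pytest benchmarks/ --codspeed   # assumed command
```

Dropping the `continue-on-error` line later turns the job into a gating check once the baselines are trusted.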

6. Plan limits

  • Free plan includes 600 macro-runner minutes/month — consumed by walltime only. simulation and memory run on GitHub-hosted runners and consume zero macro minutes (just free public-repo Actions minutes).
  • OSS projects can request more macro minutes via contact@codspeed.io.

7. Establishing the baseline

The baseline is the first successful run on the default branch (master), per instrument. PRs-to-master (simulation/memory) and master pushes (all three) then compare against it.

Parked follow-up

Memory CI currently = CodSpeed memory only (per-test heap allocations). Open question: also gate the end-to-end pipeline OOM ceiling (python -m benchmarks memory --phase pipeline), which CodSpeed's per-test run structurally can't capture. Preferred shape: expose the pipeline phase as a CodSpeed-measured benchmark rather than a second workflow. (A self-contained memray --fail-over gate was prototyped but left unmerged.)
