Run evaluation pipelines for data-driven weather models built with Anemoi.
- Experiments: compare model performance via standard and diagnostic verification
- Showcasing: produce visual material for specific events
- Sandboxing: generate an isolated inference development environment for any model
To run an experiment, prepare a config file like the demo below and adapt it to your setup:
```yaml
# yaml-language-server: $schema=../workflow/tools/config.schema.json
description: |
  Demo experiment: compare two forecaster checkpoints against the same baseline and truth data.

# Optional: used in the output directory name. If omitted, the config file name is used.
config_label: co2-forecasters-demo

# Choose one date style:
# 1. A regular range with a run frequency (shown here)
# 2. An explicit list of ISO timestamps for case studies or showcases
dates:
  start: 2020-01-01T00:00
  end: 2020-01-10T00:00
  frequency: 60h

runs:
  # Each item is either `forecaster` or `interpolator`.
  - forecaster:
      # `checkpoint` may point to a supported MLflow run URL, a Hugging Face `.ckpt` URL, or a local checkpoint path.
      checkpoint: https://servicedepl.meteoswiss.ch/mlstore#/experiments/228/runs/2f962c89ff644ca7940072fa9cd088ec
      # Labels are what appear in plots, tables, and reports.
      label: Stage D - N320 global grid with CERRA finetuning
      # Lead times follow start/end/step in hours.
      steps: 0/120/6
      # `config` points to the inference config template for the run. If omitted, evalml uses the bundled default for the run type.
      config: resources/inference/configs/sgm-forecaster-global.yaml
      # Optional extra dependencies needed by this checkpoint at inference time.
      extra_requirements:
        - git+https://github.com/ecmwf/anemoi-inference.git@0.8.3
  - forecaster:
      checkpoint: https://mlflow.ecmwf.int/#/experiments/103/runs/d0846032fc7248a58b089cbe8fa4c511
      label: M-1 forecaster
      steps: 0/120/6
      config: resources/inference/configs/sgm-forecaster-global_trimedge.yaml

baselines:
  - baseline:
      baseline_id: COSMO-E
      label: COSMO-E
      root: /store_new/mch/msopr/ml/COSMO-E
      steps: 0/120/6

truth:
  label: COSMO KENDA
  root: /scratch/mch/fzanetta/data/anemoi/datasets/mch-co2-an-archive-0p02-2015-2020-6h-v3-pl13.zarr

stratification:
  regions:
    - jura
    - mittelland
    - voralpen
    - alpennordhang
    - innerealpentaeler
    - alpensuedseite
  root: /scratch/mch/bhendj/regions/Prognoseregionen_LV95_20220517

locations:
  # All workflow outputs are written under this root.
  output_root: output/

profile:
  # Passed through to Snakemake. Tune this block to match your cluster or local executor.
  executor: slurm
  global_resources:
    # Limits total concurrent GPU use across submitted jobs.
    gpus: 16
  default_resources:
    slurm_partition: "postproc"
    cpus_per_task: 1
    mem_mb_per_cpu: 1800
    runtime: "1h"
  jobs: 50
  batch_rules:
    # Group many small plotting jobs into fewer submissions.
    plot_forecast_frame: 32
```

The `runs` list accepts both `forecaster` and `interpolator` entries. For `dates`, you can either provide a `start`/`end`/`frequency` block as above or an explicit list of ISO timestamps for case-study style runs.
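For instance, a case-study setup might combine an explicit date list with an interpolator run. The sketch below is illustrative only: the checkpoint path, label, and step range are placeholders, not values from a real run:

```yaml
# Illustrative sketch, not a tested config: dates given as an explicit list,
# and an interpolator entry in `runs`. All values below are placeholders.
dates:
  - 2021-06-28T12:00
  - 2021-07-13T00:00

runs:
  - interpolator:
      # Local checkpoint path (MLflow and Hugging Face URLs work too).
      checkpoint: /path/to/interpolator.ckpt
      label: Hourly interpolator
      steps: 0/120/1
```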
You can then run it with:

```bash
evalml experiment path/to/experiment/config.yaml --report
```

This project uses uv. Download and install it with:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Then install the project and its dependencies with `uv sync` and activate the virtual environment with `source .venv/bin/activate`.
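Putting it together, a fresh setup looks like this (the final sanity check assumes the CLI exposes a standard `--help` flag):

```bash
# Install the project and its dependencies into .venv, then activate it.
uv sync
source .venv/bin/activate
# Sanity check (assumes the CLI exposes a standard --help flag):
evalml --help
```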
Some experiments are stored on the ECMWF-hosted MLflow server: https://mlflow.ecmwf.int. To access these runs in the evaluation workflow, you need to authenticate using a valid token. Run the following commands once to log in and obtain a token:
```bash
uv pip install anemoi-training --no-deps
anemoi-training mlflow login --url https://mlflow.ecmwf.int
```

You will be prompted to paste a seed token obtained from https://mlflow.ecmwf.int/seed. After this step, your token is stored locally and used for subsequent runs. Tokens are valid for 30 days, and every training or evaluation run within this period automatically extends the token by another 30 days. It's good practice to run the login command before executing the workflow to ensure your token is still valid.
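Since a login refreshes the 30-day token, a typical session can simply chain the two commands:

```bash
# Refresh the MLflow token, then launch the experiment.
anemoi-training mlflow login --url https://mlflow.ecmwf.int
evalml experiment path/to/experiment/config.yaml --report
```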
By default, data produced by the workflow will be stored under output/ in your working directory.
We suggest that you set up a symlink to a directory on your scratch:
```bash
mkdir -p $SCRATCH/evalenv/output
ln -s $SCRATCH/evalenv/output output
```

This way data will be written to your scratch, but you will still be able to browse it with your IDE.
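Alternatively, you can skip the symlink and point the config's output root at your scratch directly. This sketch assumes `locations.output_root` accepts an absolute path; adjust it to your username:

```yaml
# Sketch: write outputs straight to scratch instead of symlinking.
# Assumes output_root accepts an absolute path; <user> is a placeholder.
locations:
  output_root: /scratch/<user>/evalenv/output/
```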
If you are using VSCode, we advise installing the YAML extension, which enables config validation, autocompletion, hovering support, and more. The `# yaml-language-server: $schema=...` line at the top of the demo config is what points the extension at the workflow's config schema.