… we want to factor it out of the rule
* Distinguish between primary runs ('candidates') and secondary runs
* Docstrings
* Adopt forecast intervals including the end point
* Fix parsing
* Experiments work
* Update config/forecasters.yaml
* Align init times to availability of COE
* run pre-commit
* Change README to COSMO-E availability
---------
Co-authored-by: Jonas Bhend <jonasbhend@users.noreply.github.com>
Co-authored-by: Jonas Bhend <jonas.bhend@meteoswiss.ch>
* draft changes
* rename workspace resources dir
* working for config/forecasters.yaml
* improve logging
* works for interpolators.yaml
* re-add get_leadtime function
* refactor run directives into script
* add region averages
* add regions to config
* Add regions to verification module, scripts, and rules
* add stratification to forecaster config and fix typo
* fix dict indexing
* fix append error
* read lon/lat from obs dataset
* Add inner verification domain
* Add missing dependency
* add plots by region
* Add regions to dashboard
* Fix dashboard
* Add region name and initializations to plot title (and remove header div)
* Add support for multiple regions
* Fix legend
…e-to-generate-namelist
dnerini
left a comment
Looking very nice, @andreaspauling! I have added a few initial thoughts from a quick look at your changes, but I plan to have a closer look soon!
    # prepare_mec_input: setup run dir, gather observations and model data in the run dir for the actual init time
    rule prepare_mec_input:
        input:
            src_dir=OUT_ROOT / "data/runs/{run_id}/{init_time}/grib",
There is no rule that produces this as an output, so this should at the very least trigger some warnings from snakemake. You could specify it as a parameter instead.
    set -euo pipefail

    # Run MEC inside sarus container
    # Note: pull command currently needed only once to download the container
The pull command could then be factored out into a separate rule that is run only once before launching all the parallel MEC jobs.
    RESULTS_DIR = OUT_ROOT / "results" / EXPERIMENT_NAME

    # prefer one rule because snakemake complains about ambiguous rules (same output)
    ruleorder: prepare_inference_forecaster > prepare_inference_interpolator
I need to have a closer look at this; I don't understand why this problem would appear with your changes.
Without it snakemake complains. Thanks for having a closer look at it.
frazane
left a comment
Added some comments. The most important ones are those we already discussed (avoid copying data and directory structure) but I wanted to make sure we do not forget about them.
Could we get a "script summary log" at the beginning of the script? See other scripts for examples.
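A minimal sketch of what such a summary log could look like, assuming the script parses its options with `argparse` and logs through the standard `logging` module. The layout used by the other scripts is not shown in this thread, so the helper name `format_script_summary` and the output format are hypothetical.

```python
import logging
from argparse import Namespace

LOG = logging.getLogger(__name__)


def format_script_summary(script_name: str, args: Namespace) -> str:
    """Return a multi-line summary of the script name and its parsed arguments.

    Hypothetical helper: the real scripts may use a different layout.
    """
    lines = [f"Script summary: {script_name}"]
    # Sort the argument names so the summary is stable between runs.
    for key, value in sorted(vars(args).items()):
        lines.append(f"  {key}: {value}")
    return "\n".join(lines)
```

Calling `LOG.info(format_script_summary(__file__, args))` right after `parser.parse_args()` would then emit the summary once, before any work starts.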
    parser.add_argument(
        "--namelist",
        type=str,
        help="Anything useful",
Missing an actual help message.
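For illustration, the two string options could carry help messages along these lines. This is only a sketch: the option names `--namelist` and `--template` come from the diff, but their meaning (output namelist path, input template path) and the parser description are assumptions, and `build_parser` is a hypothetical helper.

```python
from argparse import ArgumentParser


def build_parser() -> ArgumentParser:
    """Build the argument parser with descriptive help messages (assumed semantics)."""
    parser = ArgumentParser(
        description="Generate a namelist from a template (assumed purpose)."
    )
    parser.add_argument(
        "--namelist",
        type=str,
        help="Path of the namelist file to write (assumed meaning).",
    )
    parser.add_argument(
        "--template",
        type=str,
        help="Path of the template used to render the namelist (assumed meaning).",
    )
    return parser
```

With real help strings in place, `python <script> --help` documents every option instead of printing a placeholder.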
    parser.add_argument(
        "--template",
        type=str,
    )
    if __name__ == "__main__":
        parser = ArgumentParser()

        parser.add_argument("--steps", type=_parse_steps, default="0/120/6")
        return list(range(start, end + 1, step))


    # TODO: merge with _ref_times from common.smk?
Not merged but perhaps could be moved to common.smk.
    # concatenate all grib files in src_dir into a single file fc_file
    echo "grib files processed:"
    files=( "$src_dir"/20*.grib )
    if (( ${{#files[@]}} )); then
        printf '%s\n' "${{files[@]}}"
        cat "${{files[@]}}" > "$fc_file"
    else
        echo "WARNING: no grib files found in $src_dir" >&2
    fi
Is this really necessary? We are effectively duplicating the entire output data.
Could we move the mec directory from being inside each forecast run directory (output/data/runs/<run-id>/<init-time>/mec) to output/data/mec/<valid-time> so as to not mix up inittime-based directory structure and validtime-based directory structure? This is also the same approach adopted by osm in the operational archive.
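To make the proposed layout concrete, here is a small sketch of how a valid-time-based MEC directory could be derived from an init time and a lead time. The helper name `mec_dir`, the `%Y%m%d%H%M` timestamp format, and the lead time being given in hours are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from pathlib import Path

TIME_FMT = "%Y%m%d%H%M"  # assumed timestamp format, not confirmed by the PR


def mec_dir(out_root: Path, init_time: str, lead_hours: int) -> Path:
    """Return <out_root>/data/mec/<valid-time> for a given init time and lead time."""
    valid_time = datetime.strptime(init_time, TIME_FMT) + timedelta(hours=lead_hours)
    return out_root / "data" / "mec" / valid_time.strftime(TIME_FMT)
```

Forecasts from different init times that verify at the same valid time would then map to the same MEC directory, independent of the run they come from.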
Add the MEC workflow. The new parts are in green in the DAG: snakemake_dag.pdf
For each valid date a MEC case is set up and run. This includes:
All MEC cases can be removed once the final feedback file is produced (removal not yet implemented).