Skip to content

Enhance robust pcmdi#49

Draft
zhangshixuan1987 wants to merge 24 commits into
mainfrom
enhance_robust_pcmdi
Draft

Enhance robust pcmdi#49
zhangshixuan1987 wants to merge 24 commits into
mainfrom
enhance_robust_pcmdi

Conversation

@zhangshixuan1987
Copy link
Copy Markdown
Contributor

Summary

This pull request enhances the robustness of the PCMDI diagnostics utilities by fixing bugs related to ENSO metric extraction and synthetic plot handling, improving logging clarity and parallel computing reliability, and making incremental enhancements to the viewer and utility modules.

Objectives:

  • Enhance robustness of PCMDI diagnostics utilities.
  • Fix bugs related to ENSO metric extraction and synthetic plot handling.
  • Improve logging clarity and parallel computing reliability.
  • Incremental enhancements to viewer and utility modules.

Issue resolution:

  • No related GitHub issue; changes are based on code review and internal quality improvements.

This pull request is

  • a bug fix: increment the patch version
  • a small improvement: increment the minor version

Small Change

  • To merge, I will use "Squash and merge". That is, this change should be a single commit.
  • Logic: I have visually inspected the entire pull request myself.
  • Pre-commit checks: All the pre-commits checks have passed.

1. Does this do what we want it to do?

-Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
-Testing: I have considered likely and/or severe edge cases and have included them in testing.
Note: Testing was performed together with another bug fix change to address .

2. Are the implementation details accurate & efficient?

  • I have visually inspected the entire pull request myself.

3. Is this well documented?

  • A comment is put on each commit to explain the changes clearly.

4. Is this code clean?

  • The code is as simple as possible and well-commented.
    -All the pre-commits checks have passed.

@zhangshixuan1987 zhangshixuan1987 marked this pull request as draft May 5, 2026 22:48
@zhangshixuan1987 zhangshixuan1987 requested a review from forsyth2 May 5, 2026 22:48
@zhangshixuan1987 zhangshixuan1987 self-assigned this May 5, 2026
@zhangshixuan1987 zhangshixuan1987 added the semver: bug Bug fix (will increment patch version) label May 5, 2026
@zhangshixuan1987
Copy link
Copy Markdown
Contributor Author

@forsyth2: Hi Ryan, I am submitting this pull request with several bug fixes identified during the code review process while addressing the reported issues in E3SM-Project/zppy#807. I also included a few fixes related to the ENSO code path that is planned to be activated in the near future. I decided to address them now because the relevant code is already part of the current code base, so it would be better to resolve these issues before the functionality becomes fully enabled and more widely used.

@zhangshixuan1987 zhangshixuan1987 added the semver: small improvement Small improvement (will increment patch version) label May 5, 2026
- pcmdi_setup.py: raise ValueError in _extract_metadata() when filename
  parts are missing, instead of logging and falling through to IndexError;
  fix typo "dervied" -> "derived" in log message

- link_observation.py: use context manager for obs_alias_file to ensure
  file handle is closed

- enso_metrics_reader.py: guard nested RESULTS.model dict access with
  .get() and raise a descriptive KeyError when structure is missing,
  instead of an opaque KeyError at runtime

- utils.py: replace Popen(shell=True) with shlex.split() + shell=False
  in run_parallel_jobs() and run_serial_jobs() to eliminate shell
  injection risk; add shlex import

- pcmdi_enso.py, pcmdi_synthetic_plots.py: replace bare json.load(open())
  calls with context managers to prevent file descriptor leaks

All 7 unit tests pass (zi-pcmdi-diags-20260430).
- ENSOParameters: validate enso_groups is not None at construction time
  instead of crashing with AttributeError on .split() downstream

- EnsoDiagnosticsCollector.__init__: validate model_name_parts has
  exactly 4 elements before tuple unpack, giving a descriptive error

- collect_figures: guard os.listdir(fdir) with os.path.isdir() before
  calling it; guard error-log dir listing the same way to prevent
  FileNotFoundError inside the logger call itself

- collect_figures: warn when model/relm marker is absent from filename
  before splitting, preventing silent wrong output filenames

- collect_metrics / collect_diags: replace bare os.listdir() inside
  logger error f-strings with isdir-guarded dir_contents variable

- main(): check obs_dict is non-empty before [0] index to avoid
  IndexError on empty obs_catalogue.json

- check_enso_input: guard both os.symlink() calls with
  os.path.exists() to prevent FileExistsError on re-run/retry

- check_vars: add re.DOTALL flag to list_variables regex so multi-line
  driver stdout does not produce a false "no variable list found"
failure

All 7 unit tests pass (zi-pcmdi-diags-20260430).
- rename pcmdi_mean_cimate.py -> pcmdi_mean_climate.py (correct
  spelling);
  update entry point in pyproject.toml and import in test file

- MeanClimateParameters: validate --regions is not None before .split()
  to prevent AttributeError on missing CLI argument

- MeanClimateMetricsCollector.__init__: validate model_info has exactly
  4 elements before tuple unpack, giving a descriptive error

- _collect_figures: fix output directory key typo "CLIM_patttern" ->
  "CLIM_pattern"; add logger.warning when no figures are found for a
  var/region/season combination instead of silently skipping

- _collect_metrics: guard parts[1] access with len(parts) < 2 check
  and log+skip on unexpected filename format instead of IndexError

- main(): re-raise RuntimeError from job runners instead of swallowing
  it with print(), preventing silent wrong results after job failure;
  replace all print() calls with logger.info()

- generate_mean_clim_cmds: add logger.warning when a variable is not
  found in obs_dic instead of silently omitting its command

All 7 unit tests pass (zi-pcmdi-diags-20260430).
- VariabilityModesParameters: validate --var_modes and --vars are not
  None at construction time instead of crashing with AttributeError
  or KeyError: None downstream

- main(): validate model_name has exactly 4 dot-separated parts before
  index access to prevent IndexError

- main(): re-raise RuntimeError from job runners instead of swallowing
  it with print(), preventing silent wrong results after job failure;
  replace all print() calls with logger.info()

- _collect_figures: add logger.warning when no files match a
  mode/season combination instead of silently skipping

- _classify_output_name: add logger.warning when filename does not
  match any known pattern and suffix falls back to "unknown"

- generate_varmode_cmds: quote refpath in command string so paths
  containing spaces are handled correctly by shlex.split

- test_generate_varmode_cmds: update expected strings to reflect
  quoted refpath

All 7 unit tests pass (zi-pcmdi-diags-20260430).
- SyntheticPlotsParameters: replace all bare args["x"].split(",") calls
  with None-safe guards so missing optional list arguments (clim_vars,
  clim_regions, mova_modes, mova_vars, movc_modes, movc_vars, enso_vars)
  no longer raise AttributeError

- SyntheticPlotsParameters: replace all str(args["x"]).lower() in (...)
  boolean checks with str2bool(args.get("x", False)) to use the existing
  helper consistently; previously str("None") silently evaluated to
False,
  masking missing viewer flags (clim_viewer, mova_viewer, movc_viewer,
  enso_viewer, save_all_data)

- main(): guard shutil.copy for e3sm_pmp_logo.png with os.path.exists
  check; log a warning and skip instead of raising FileNotFoundError
  when pcmdi_external_prefix is misconfigured

- _get_args(): change --debug to type=str2bool with default=False and
  simplify check to `if args.debug:`; previously only "true" was
  recognised, silently ignoring "1", "yes", etc.

All 7 unit tests pass (zi-pcmdi-diags-20260430).
with logger

- CoreParameters: validate num_workers and multiprocessing are not None
  before int() and .lower() calls to prevent cryptic TypeError/
  AttributeError on missing CLI arguments

- CoreParameters: guard --vars with args.get() before .split(",") to
  prevent AttributeError on missing argument

- set_up(): split model_name once into model_name_parts and validate
  len >= 2 before index access; eliminates IndexError on malformed
  model name and removes two redundant .split() calls in input_template
  construction

- set_up(): validate model_name_ref is non-None and has >= 2 parts in
  model_vs_model branch to prevent None.split() and IndexError

- _process_group(): replace print() warning with logger.warning() so
  missing catalogue warning respects log-level control

- _generate_mask(): replace both print() calls with logger.info() so
  mask method info is captured in log output

- derive_missing_variable(): replace print() with logger.info() for
  derived variable write confirmation

- Fix typo "assigining" -> "assigning" in two log messages

All 7 unit tests pass (zi-pcmdi-diags-20260430).
- run_parallel_jobs: raise ValueError if num_workers < 1 instead of
  silently degrading to serial execution

- run_parallel_jobs: rename inner loop variables cmd/proc to
  batch_cmd/batch_proc to eliminate shadowing of the outer cmd variable

- run_parallel_jobs: terminate all remaining running batch processes
  before raising RuntimeError on job failure, preventing orphaned
  subprocesses from running indefinitely after an error

- run_parallel_jobs: improve batch log message from misleading
  "Running {count_child_processes()} subprocesses" (counted before
  launch) to "Running batch of {len(procs)} subprocesses"

All 7 unit tests pass (zi-pcmdi-diags-20260430).
- MeanClimateTableBuilder.map_regions(): rename local variable from
  `seasons` to `regions` throughout the method; copy-paste bug caused
  no runtime error but was misleading and error-prone

- build_table(): update figure path from "CLIM_patttern" to
  "CLIM_pattern" to match the corrected output directory name from
  pcmdi_mean_climate.py; viewer was silently finding no files

- setup_jinja_env(): add os.path.isdir() check before creating the
  Jinja2 environment; previously a missing template directory produced
  an unhelpful TemplateNotFound error with no indication the directory
  itself was absent

- Add module-level `import logging` and `logger = logging.getLogger`
  at the top of the file; previously the file had no logger

- generate_methodology_html(), generate_data_html(),
  generate_viewer_html(): replace print() calls with logger.info() so
  HTML write confirmations respect log-level control and appear in log
  output

All 7 unit tests pass (zi-pcmdi-diags-20260430).
- Use shutil.move for cross-filesystem-safe file moves
- Normalize ENSO group parsing by stripping whitespace
- Make figure, metric, and diagnostic output replacement explicit
- Add destination directory guards before overwriting outputs
- Write metrics JSON through a temporary file before replacing destination
- Improve output validation with return-code checks and clearer directory checks
- Make output collection more robust across filesystems
- Add explicit replacement handling for figures, metrics, and diagnostics
- Strip whitespace from ENSO group inputs
- Validate command return codes and reported output directories
- Support prefix-style NetCDF filenames when symlinking alternative variables
- Extend existing alias treatment consistently for wind stress inputs
Correct the coupled modes variability viewer so MOV_compose EOF
links use the expected EOF mode instead of incorrectly pointing to
CBF figures. Coupled SST modes are now mapped as:
AMO/PDO -> eof1 and NPGO -> eof2. Mode names are normalized from
configuration input so lowercase or whitespace-padded values still
produce the expected filenames and labels.

Improve viewer robustness:
- default EMoV viewer modes now use atmospheric PSL modes, not coupled
  SST modes
- atmospheric EMoV mode names are normalized before filename generation
- output directories are created before writing viewer HTML files
- string-valued config lists are preserved instead of joined character
  by character
- image glob matches are sorted for deterministic link selection

Harden mean-climate synthetic portrait plotting by filtering variables
per region/stat/season before dataframe indexing. This prevents missing
variables in region-specific dataframes from raising pandas KeyError
and instead skips unavailable variables with a warning. The shared
drop_vars helper now also removes requested variables that are absent
from the dataframe.

Extract the mean-climate portrait variable preparation into a helper to
keep mean_climate_plot_driver below the flake8 complexity threshold.

Add regression tests for coupled EOF link generation, config
normalization, EMoV defaults, viewer output handling, and mean-climate
missing-variable handling.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

semver: bug Bug fix (will increment patch version) semver: small improvement Small improvement (will increment patch version)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant