Enhance robust pcmdi#49
Draft
zhangshixuan1987 wants to merge 24 commits into
Draft
Conversation
Contributor
Author
|
@forsyth2: Hi Ryan, I am submitting this pull request with several bug fixes identified during the code review process while addressing the reported issues in E3SM-Project/zppy#807. I also included a few fixes related to the ENSO code path that is planned to be activated in the near future. I decided to address them now because the relevant code is already part of the current code base, so it would be better to resolve these issues before the functionality becomes fully enabled and more widely used. |
- pcmdi_setup.py: raise ValueError in _extract_metadata() when filename parts are missing, instead of logging and falling through to IndexError; fix typo "dervied" -> "derived" in log message - link_observation.py: use context manager for obs_alias_file to ensure file handle is closed - enso_metrics_reader.py: guard nested RESULTS.model dict access with .get() and raise a descriptive KeyError when structure is missing, instead of an opaque KeyError at runtime - utils.py: replace Popen(shell=True) with shlex.split() + shell=False in run_parallel_jobs() and run_serial_jobs() to eliminate shell injection risk; add shlex import - pcmdi_enso.py, pcmdi_synthetic_plots.py: replace bare json.load(open()) calls with context managers to prevent file descriptor leaks All 7 unit tests pass (zi-pcmdi-diags-20260430).
- ENSOParameters: validate enso_groups is not None at construction time instead of crashing with AttributeError on .split() downstream - EnsoDiagnosticsCollector.__init__: validate model_name_parts has exactly 4 elements before tuple unpack, giving a descriptive error - collect_figures: guard os.listdir(fdir) with os.path.isdir() before calling it; guard error-log dir listing the same way to prevent FileNotFoundError inside the logger call itself - collect_figures: warn when model/relm marker is absent from filename before splitting, preventing silent wrong output filenames - collect_metrics / collect_diags: replace bare os.listdir() inside logger error f-strings with isdir-guarded dir_contents variable - main(): check obs_dict is non-empty before [0] index to avoid IndexError on empty obs_catalogue.json - check_enso_input: guard both os.symlink() calls with os.path.exists() to prevent FileExistsError on re-run/retry - check_vars: add re.DOTALL flag to list_variables regex so multi-line driver stdout does not produce a false "no variable list found" failure All 7 unit tests pass (zi-pcmdi-diags-20260430).
- rename pcmdi_mean_cimate.py -> pcmdi_mean_climate.py (correct spelling); update entry point in pyproject.toml and import in test file - MeanClimateParameters: validate --regions is not None before .split() to prevent AttributeError on missing CLI argument - MeanClimateMetricsCollector.__init__: validate model_info has exactly 4 elements before tuple unpack, giving a descriptive error - _collect_figures: fix output directory key typo "CLIM_patttern" -> "CLIM_pattern"; add logger.warning when no figures are found for a var/region/season combination instead of silently skipping - _collect_metrics: guard parts[1] access with len(parts) < 2 check and log+skip on unexpected filename format instead of IndexError - main(): re-raise RuntimeError from job runners instead of swallowing it with print(), preventing silent wrong results after job failure; replace all print() calls with logger.info() - generate_mean_clim_cmds: add logger.warning when a variable is not found in obs_dic instead of silently omitting its command All 7 unit tests pass (zi-pcmdi-diags-20260430).
- VariabilityModesParameters: validate --var_modes and --vars are not None at construction time instead of crashing with AttributeError or KeyError: None downstream - main(): validate model_name has exactly 4 dot-separated parts before index access to prevent IndexError - main(): re-raise RuntimeError from job runners instead of swallowing it with print(), preventing silent wrong results after job failure; replace all print() calls with logger.info() - _collect_figures: add logger.warning when no files match a mode/season combination instead of silently skipping - _classify_output_name: add logger.warning when filename does not match any known pattern and suffix falls back to "unknown" - generate_varmode_cmds: quote refpath in command string so paths containing spaces are handled correctly by shlex.split - test_generate_varmode_cmds: update expected strings to reflect quoted refpath All 7 unit tests pass (zi-pcmdi-diags-20260430).
- SyntheticPlotsParameters: replace all bare args["x"].split(",") calls
with None-safe guards so missing optional list arguments (clim_vars,
clim_regions, mova_modes, mova_vars, movc_modes, movc_vars, enso_vars)
no longer raise AttributeError
- SyntheticPlotsParameters: replace all str(args["x"]).lower() in (...)
boolean checks with str2bool(args.get("x", False)) to use the existing
helper consistently; previously str("None") silently evaluated to
False,
masking missing viewer flags (clim_viewer, mova_viewer, movc_viewer,
enso_viewer, save_all_data)
- main(): guard shutil.copy for e3sm_pmp_logo.png with os.path.exists
check; log a warning and skip instead of raising FileNotFoundError
when pcmdi_external_prefix is misconfigured
- _get_args(): change --debug to type=str2bool with default=False and
simplify check to `if args.debug:`; previously only "true" was
recognised, silently ignoring "1", "yes", etc.
All 7 unit tests pass (zi-pcmdi-diags-20260430).
with logger
- CoreParameters: validate num_workers and multiprocessing are not None
before int() and .lower() calls to prevent cryptic TypeError/
AttributeError on missing CLI arguments
- CoreParameters: guard --vars with args.get() before .split(",") to
prevent AttributeError on missing argument
- set_up(): split model_name once into model_name_parts and validate
len >= 2 before index access; eliminates IndexError on malformed
model name and removes two redundant .split() calls in input_template
construction
- set_up(): validate model_name_ref is non-None and has >= 2 parts in
model_vs_model branch to prevent None.split() and IndexError
- _process_group(): replace print() warning with logger.warning() so
missing catalogue warning respects log-level control
- _generate_mask(): replace both print() calls with logger.info() so
mask method info is captured in log output
- derive_missing_variable(): replace print() with logger.info() for
derived variable write confirmation
- Fix typo "assigining" -> "assigning" in two log messages
All 7 unit tests pass (zi-pcmdi-diags-20260430).
- run_parallel_jobs: raise ValueError if num_workers < 1 instead of
silently degrading to serial execution
- run_parallel_jobs: rename inner loop variables cmd/proc to
batch_cmd/batch_proc to eliminate shadowing of the outer cmd variable
- run_parallel_jobs: terminate all remaining running batch processes
before raising RuntimeError on job failure, preventing orphaned
subprocesses from running indefinitely after an error
- run_parallel_jobs: improve batch log message from misleading
"Running {count_child_processes()} subprocesses" (counted before
launch) to "Running batch of {len(procs)} subprocesses"
All 7 unit tests pass (zi-pcmdi-diags-20260430).
- MeanClimateTableBuilder.map_regions(): rename local variable from `seasons` to `regions` throughout the method; copy-paste bug caused no runtime error but was misleading and error-prone - build_table(): update figure path from "CLIM_patttern" to "CLIM_pattern" to match the corrected output directory name from pcmdi_mean_climate.py; viewer was silently finding no files - setup_jinja_env(): add os.path.isdir() check before creating the Jinja2 environment; previously a missing template directory produced an unhelpful TemplateNotFound error with no indication the directory itself was absent - Add module-level `import logging` and `logger = logging.getLogger` at the top of the file; previously the file had no logger - generate_methodology_html(), generate_data_html(), generate_viewer_html(): replace print() calls with logger.info() so HTML write confirmations respect log-level control and appear in log output All 7 unit tests pass (zi-pcmdi-diags-20260430).
- Use shutil.move for cross-filesystem-safe file moves - Normalize ENSO group parsing by stripping whitespace - Make figure, metric, and diagnostic output replacement explicit - Add destination directory guards before overwriting outputs - Write metrics JSON through a temporary file before replacing destination - Improve output validation with return-code checks and clearer directory checks
- Make output collection more robust across filesystems - Add explicit replacement handling for figures, metrics, and diagnostics - Strip whitespace from ENSO group inputs - Validate command return codes and reported output directories - Support prefix-style NetCDF filenames when symlinking alternative variables - Extend existing alias treatment consistently for wind stress inputs
151329f to
97b6ce8
Compare
Correct the coupled modes variability viewer so MOV_compose EOF links use the expected EOF mode instead of incorrectly pointing to CBF figures. Coupled SST modes are now mapped as: AMO/PDO -> eof1 and NPGO -> eof2. Mode names are normalized from configuration input so lowercase or whitespace-padded values still produce the expected filenames and labels. Improve viewer robustness: - default EMoV viewer modes now use atmospheric PSL modes, not coupled SST modes - atmospheric EMoV mode names are normalized before filename generation - output directories are created before writing viewer HTML files - string-valued config lists are preserved instead of joined character by character - image glob matches are sorted for deterministic link selection Harden mean-climate synthetic portrait plotting by filtering variables per region/stat/season before dataframe indexing. This prevents missing variables in region-specific dataframes from raising pandas KeyError and instead skips unavailable variables with a warning. The shared drop_vars helper now also removes requested variables that are absent from the dataframe. Extract the mean-climate portrait variable preparation into a helper to keep mean_climate_plot_driver below the flake8 complexity threshold. Add regression tests for coupled EOF link generation, config normalization, EMoV defaults, viewer output handling, and mean-climate missing-variable handling.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This pull request enhances the robustness of the PCMDI diagnostics utilities by fixing bugs related to ENSO metric extraction and synthetic plot handling, improving logging clarity and parallel computing reliability, and making incremental enhancements to the viewer and utility modules.
Objectives:
Issue resolution:
This pull request is
Small Change
1. Does this do what we want it to do?
-Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
-Testing: I have considered likely and/or severe edge cases and have included them in testing.
Note: Testing was performed together with another bug fix change to address .
2. Are the implementation details accurate & efficient?
3. Is this well documented?
4. Is this code clean?
-All the pre-commits checks have passed.