Skip to content

Derived variables structure update#326

Merged
justin-richling merged 20 commits into
NCAR:mainfrom
justin-richling:derived-vars-update
May 14, 2026
Merged

Derived variables structure update#326
justin-richling merged 20 commits into
NCAR:mainfrom
justin-richling:derived-vars-update

Conversation

@justin-richling
Copy link
Copy Markdown
Collaborator

@justin-richling justin-richling commented Sep 3, 2024

Move the calculation of derived variables out of adf_diag.py into an external script. This is a placeholder until a more robust variable derivations scheme is implemented.

This will help declutter adf_diag.py in the hopes of keeping non-essential methods separate.

adf_derive.py has two main methods: check_derive and derive_variable

  • check_derive - called in variable loop; will check on each available variable
    For incoming variable, look for list of constituents if available as a list in variable defaults yaml file

    If the variable does not have the argument derivable_from or derivable_from_cam_chem,
    then it will be assumed not to be a derivable variable, just missing from history file

    If the variable does have the argument derivable_from or derivable_from_cam_chem,
    first check cam-chem, then regular cam and add all constituents to diag_var_list for time series generation.

    NOTE: this will only modify diag_var_list for use in adf_diag.py, it will not modify the yaml variable list used in other scripts.

  • derive_variable - called after variable loop and time series generation; loop over dictionary containing derived variables as keys and list of constituents as values.
    This will then be called in a loop of accepted derived variables and will be generated from constituent time series files.

Pull these checks and calculations out of `adf-diag.py` to clean that file up.
Now call the `adf_derive.py` script for derived variables
Copy link
Copy Markdown
Collaborator

@nusbaume nusbaume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice cleanup! Just have a few suggestions.

Comment thread lib/adf_derive.py
Comment thread lib/adf_derive.py
Comment thread lib/adf_derive.py
Comment thread lib/adf_derive.py Outdated
Comment thread lib/adf_derive.py
Comment thread lib/adf_derive.py
Comment thread lib/adf_derive.py
Copy link
Copy Markdown
Collaborator

@brianpm brianpm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once Jesse's comments are addressed, I think this looks good. Claude helped me with the review, here's what it says.

Lines 92-127:
The error message can be written twice in certain scenarios:

Line 104: If CAM-CHEM branch finds an empty constit_list, it logs constit_errmsg
Line 111: Then the code might overwrite constit_list with vres["derivable_from"]
Lines 126-127: If the new constit_list is also empty, the same error is logged again
Plus, the isinstance(constit_list, list) check at line 126 is redundant since constit_list is initialized as a list at line 80 and should always be a list.

@nusbaume suggestion is valid. Here's the fix:

# Initialize tracking for error messages
constit_errmsg_written = False

try_cam_constits = True
# Try finding info from variable defaults yaml file
try:
    vres = res[var]
except KeyError:
    print(exit_msg)
    self.debug_log(exit_msg)
    return diag_var_list, constit_dict

# Check first if variable is potentially part of a CAM-CHEM run
if "derivable_from_cam_chem" in vres:
    constit_list = vres["derivable_from_cam_chem"]

    if constit_list:
        if all(item in hist_file_ds.data_vars for item in constit_list):
            # Set check to look for regular CAM constituents in variable defaults
            try_cam_constits = False
            msg = f"derive time series for {case_name}:"
            msg += "\n\tLooks like this a CAM-CHEM run, "
            msg += f"checking constituents for '{var}'"
            self.debug_log(msg)
    else:
        # Only write error if we haven't already written it
        self.debug_log(constit_errmsg)
        constit_errmsg_written = True
    # End if
# End if

# If not CAM-CHEM, check regular CAM runs
if try_cam_constits:
    if "derivable_from" in vres:
        constit_list = vres["derivable_from"]
    else:
        # Missing variable or missing derivable_from argument
        der_from_msg = f"derive time series for {case_name}:"
        der_from_msg += f"\n Can't create time series for {var}.\n\tEither "
        der_from_msg += "the variable is missing from CAM output or it is a "
        der_from_msg += "derived quantity and is missing the 'derivable_from' "
        der_from_msg += "config argument.\n\tPlease add variable to CAM run "
        der_from_msg += "or set appropriate argument in variable "
        der_from_msg += "defaults yaml file."
        self.debug_log(der_from_msg)
        constit_errmsg_written = True
    # End if
# End if

# Log if this variable can be derived but is missing list of constituents
# (only if not already logged above)
if not constit_list and not constit_errmsg_written:
    self.debug_log(constit_errmsg)

Key changes:
Add constit_errmsg_written flag at the top
Set flag to True whenever we log an error (lines 104→105, 121→122)
Remove the isinstance() check at line 126 and replace with the simpler if not constit_list and not constit_errmsg_written:
This eliminates redundant error logging and makes the logic clearer.

@justin-richling
Copy link
Copy Markdown
Collaborator Author

justin-richling commented May 14, 2026

Thanks @brianpm (and Claude) I was just looking through this PR, this will help!

@justin-richling justin-richling requested a review from nusbaume May 14, 2026 16:49
@justin-richling justin-richling added code clean-up Made code simpler and/or easier to read. averaging Concerns the calculation of temporal and spatial averages labels May 14, 2026
Copy link
Copy Markdown
Collaborator

@nusbaume nusbaume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great now, just had a couple final (non-required) cleanup requests!

Comment thread lib/adf_diag.py Outdated
Comment thread lib/adf_diag.py Outdated
Comment thread lib/adf_diag.py Outdated
@justin-richling justin-richling merged commit 84cfaf0 into NCAR:main May 14, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

averaging Concerns the calculation of temporal and spatial averages code clean-up Made code simpler and/or easier to read.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants