Derived variables structure update#326
Conversation
Pull these checks and calculations out of `adf-diag.py` to clean that file up.
Now call the `adf_derive.py` script for derived variables
nusbaume
left a comment
There was a problem hiding this comment.
Nice cleanup! Just have a few suggestions.
Pull these checks and calculations out of `adf-diag.py` to clean that file up.
Now call the `adf_derive.py` script for derived variables
b0e0067 to
fafd70d
Compare
…ing/ADF into derived-vars-update
This is now being generated in `adf_derive.py`
brianpm
left a comment
There was a problem hiding this comment.
Once Jesse's comments are addressed, I think this looks good. Claude helped me with the review, here's what it says.
Lines 92-127:
The error message can be written twice in certain scenarios:
Line 104: If CAM-CHEM branch finds an empty constit_list, it logs constit_errmsg
Line 111: Then the code might overwrite constit_list with vres["derivable_from"]
Lines 126-127: If the new constit_list is also empty, the same error is logged again
Plus, the isinstance(constit_list, list) check at line 126 is redundant since constit_list is initialized as a list at line 80 and should always be a list.
@nusbaume suggestion is valid. Here's the fix:
# Initialize tracking for error messages
constit_errmsg_written = False
try_cam_constits = True
# Try finding info from variable defaults yaml file
try:
vres = res[var]
except KeyError:
print(exit_msg)
self.debug_log(exit_msg)
return diag_var_list, constit_dict
# Check first if variable is potentially part of a CAM-CHEM run
if "derivable_from_cam_chem" in vres:
constit_list = vres["derivable_from_cam_chem"]
if constit_list:
if all(item in hist_file_ds.data_vars for item in constit_list):
# Set check to look for regular CAM constituents in variable defaults
try_cam_constits = False
msg = f"derive time series for {case_name}:"
msg += "\n\tLooks like this a CAM-CHEM run, "
msg += f"checking constituents for '{var}'"
self.debug_log(msg)
else:
# Only write error if we haven't already written it
self.debug_log(constit_errmsg)
constit_errmsg_written = True
# End if
# End if
# If not CAM-CHEM, check regular CAM runs
if try_cam_constits:
if "derivable_from" in vres:
constit_list = vres["derivable_from"]
else:
# Missing variable or missing derivable_from argument
der_from_msg = f"derive time series for {case_name}:"
der_from_msg += f"\n Can't create time series for {var}.\n\tEither "
der_from_msg += "the variable is missing from CAM output or it is a "
der_from_msg += "derived quantity and is missing the 'derivable_from' "
der_from_msg += "config argument.\n\tPlease add variable to CAM run "
der_from_msg += "or set appropriate argument in variable "
der_from_msg += "defaults yaml file."
self.debug_log(der_from_msg)
constit_errmsg_written = True
# End if
# End if
# Log if this variable can be derived but is missing list of constituents
# (only if not already logged above)
if not constit_list and not constit_errmsg_written:
self.debug_log(constit_errmsg)
Key changes:
Add constit_errmsg_written flag at the top
Set flag to True whenever we log an error (lines 104→105, 121→122)
Remove the isinstance() check at line 126 and replace with the simpler if not constit_list and not constit_errmsg_written:
This eliminates redundant error logging and makes the logic clearer.
|
Thanks @brianpm (and Claude) I was just looking through this PR, this will help! |
nusbaume
left a comment
There was a problem hiding this comment.
Looks great now, just had a couple final (non-required) cleanup requests!
Move the calculation of derived variables out of
adf_diag.pyinto an external script. This is a placeholder until a more robust variable derivations scheme is implemented.This will help declutter
adf_diag.pyin the hopes of keeping non-essential methods separate.adf_derive.pyhas two main methods:check_deriveandderive_variablecheck_derive- called in variable loop; will check on each available variableFor incoming variable, look for list of constituents if available as a list in variable defaults yaml file
If the variable does not have the argument
derivable_fromorderivable_from_cam_chem,then it will be assumed not to be a derivable variable, just missing from history file
If the variable does have the argument
derivable_fromorderivable_from_cam_chem,first check cam-chem, then regular cam and add all constituents to
diag_var_listfor time series generation.NOTE: this will only modify
diag_var_listfor use inadf_diag.py, it will not modify the yaml variable list used in other scripts.derive_variable- called after variable loop and time series generation; loop over dictionary containing derived variables as keys and list of constituents as values.This will then be called in a loop of accepted derived variables and will be generated from constituent time series files.