Skip to content

fix crash when debugging#173

Draft
weiyuan-jiang wants to merge 4 commits into
developfrom
feature/wjiang/fix_debug
Draft

fix crash when debugging#173
weiyuan-jiang wants to merge 4 commits into
developfrom
feature/wjiang/fix_debug

Conversation

@weiyuan-jiang
Copy link
Copy Markdown
Contributor

@weiyuan-jiang weiyuan-jiang commented May 10, 2026

This PR fixes the run in debug mode. It is 0-diff. It it goes with this PR in NCEP_Shared, GEOSldas can run on debug mode with land assimilation

Related PRs:
GEOS-ESM/NCEP_Shared#37

In debug mode, Intel Fortran raises error 65 (floating invalid) when
comparing NaN values (e.g. NaN >= 0.). Input scaling parameter files
use NaN as a no-data indicator, which causes comparisons like
sclprm_std_obs >= 0. to raise FP exceptions before returning false.

Fix: Add NaN guards using the standard Fortran idiom (val == val is
.false. for NaN) before the existing range checks at all 3 guard sites
in clsm_ensupd_read_obs.F90 (~line 9755, ~9989, ~10384).
When N_files=1, lon_min_vec(2) and lon_max_vec(2) remain initialized
to MAPL_UNDEF (1.0e15). Whole-array operations like:
  start_ind = (lon_min_vec - CMG_ll_lon) / CMG_dlon
produce ~2e16 which overflows a 32-bit integer, causing
forrtl: error (65): floating invalid in debug mode.

Fix: Restrict array operations to (1:N_files) elements only:
  start_ind(1:N_files) = (lon_min_vec(1:N_files) - CMG_ll_lon)/CMG_dlon
  last_ind(1:N_files)  = (lon_max_vec(1:N_files) - CMG_ll_lon)/CMG_dlon
  N_lon_vec(1:N_files) = last_ind(1:N_files) - start_ind(1:N_files) + 1
@weiyuan-jiang weiyuan-jiang requested a review from a team as a code owner May 10, 2026 01:40
@weiyuan-jiang weiyuan-jiang changed the title Feature/wjiang/fix debug fix crash when debugging May 11, 2026
Copy link
Copy Markdown
Collaborator

@gmao-rreichle gmao-rreichle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weiyuan-jiang, many thanks for the PR. I have a couple of comments/questions below.

Comment on lines +2809 to +2817
! when in debug mode, nint(VEGCLS) with 1.0e15 may crash
allocate(tmpR(N_catl))
tmpR = VEGCLS(:)
where(tmpR > 1.0e10) tmpR = nodata_generic
mwRTM_param(:)%vegcls = nint(tmpR(:))
tmpR = SOILCLS(:)
where(tmpR > 1.0e10) tmpR = nodata_generic
mwRTM_param(:)%soilcls = nint(tmpR(:))

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@weiyuan-jiang : I'm not sure if we need this, and if we do, I'm not sure it's the best fix. First, I'm pretty confident that vegcls and soilcls here should never be no-data. So a better way of addressing this might be to check for (native) no-data-values first and stop if any are encountered.
If I'm wrong and no-data-values for the "integer" fields could happen, we might need to figure out a more robust solution. The current solution depends on "nodata_generic" being -9999. I was hoping that at some point in the future we can use MAPL_UNDEF instead of LDAS having its own nodata-value. But if nodata_generic turns into 1e15, the currently implemented solution won't work, I think.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I debugged, I printed the values and I did see 1.0e15. When it is converted to int, it becomes the most negative integer. Here our nondata_generic is -9999.0

Comment on lines +9755 to +9758
if ( sclprm_mean_obs(ind)==sclprm_mean_obs(ind) .and. &
sclprm_mean_mod(ind)==sclprm_mean_mod(ind) .and. &
sclprm_std_obs(ind) ==sclprm_std_obs(ind) .and. &
sclprm_std_mod(ind) ==sclprm_std_mod(ind) .and. &
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need these four lines, which should always be true? Not sure where they're coming from and what they might be meant to do.
Or do they evaluate to "false" if a value is NaN? If this is the intent, we should at least add a comment to clarify. Although I don't think the reader for the "sclprm_*" values would produce NaNs.
Alternatively, perhaps check more explicitly for nodata values?
I guess all of this depends on the nodata-value here being -9999. If the nodata-value is 1e15, it wouldn't be identified by any of the if-conditions.

Copy link
Copy Markdown
Contributor Author

@weiyuan-jiang weiyuan-jiang May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that comparison is for Nan. If Nan == Nan is always .false. . In debugging mode, comparing Nan may lead to crash

Comment on lines +9993 to +9996
if ( sclprm_mean_obs(j_ind, i_ind)==sclprm_mean_obs(j_ind, i_ind) .and. &
sclprm_mean_mod(j_ind, i_ind)==sclprm_mean_mod(j_ind, i_ind) .and. &
sclprm_std_obs(j_ind, i_ind) ==sclprm_std_obs(j_ind, i_ind) .and. &
sclprm_std_mod(j_ind, i_ind) ==sclprm_std_mod(j_ind, i_ind) .and. &
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

Comment on lines +10392 to +10395
if ( sclprm_mean_obs(ind)==sclprm_mean_obs(ind) .and. &
sclprm_mean_mod(ind)==sclprm_mean_mod(ind) .and. &
sclprm_std_obs( ind)==sclprm_std_obs( ind) .and. &
sclprm_std_mod( ind)==sclprm_std_mod( ind) .and. &
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above

@github-actions
Copy link
Copy Markdown

This PR is being prevented from merging because you have added one of our blocking labels: Contingent - DNA, Needs Lead Approval, Contingent -- Do Not Approve. You'll need to remove it before this PR can be merged.

@gmao-rreichle gmao-rreichle marked this pull request as draft May 11, 2026 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants