output at flexible time levels#217
output at flexible time levels#217guoqing-noaa wants to merge 1 commit intoufs-community:gsl/developfrom
Conversation
…ttribute Introduce a new `output_timelevels` attribute for MPAS streams that enables variable output intervals. With this capability, we may outoput every 15 minutes in the first hour, every hour in the first 3 days, every 3 hours for the next 4 days, and every 6 hours in the last 3 days. We can also use this to only write out forecast files during a given period, such as: output_timelevels="6-12h" Check the PR description for details on how to specify the time levles. Here is a quick example: output_timelevels="0-3-15m 4-72 75-168-3 174-240-6"
|
|
The following situations have been tested: |
output_timelevels stream attrbute|
For a conus12km 12 hours forecasts, I tried different settings for the the |
SamuelTrahanNOAA
left a comment
There was a problem hiding this comment.
These code changes are overcomplicated and will be difficult to extend or maintain. Any significant changes to the functionality will require a complicated rewrite.
More specific overview comments:
- Most of the parser code is a reimplementation of the Fortran standard
splitfunction. It would be better to callsplit. - Much of the time-related code reimplements the MPAS and ESMF alarm functionality. It would be better to extend the alarm functionality.
- Parser errors aren't reported. If you type something wrong, you don't know what it was.
- There are out-of-bounds array reads due to off-by-one errors.
- The new syntax deviates from the MPAS days_HH:MM:SS time specification.
- Unlike the rest of the MPAS I/O interface, this uses a cryptic string instead of XML tags. If you used XML tags, the parser code wouldn't be necessary at all.
| !||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| | ||
| ! | ||
| ! parse_time_string | ||
| ! | ||
| !> \brief Parse a time string into total minutes ( allow 0.5 minutes, i.e. 30 seconds) | ||
| !> \author Guoqing Ge | ||
| !> \date February 2026 | ||
| !> \details | ||
| !> Parses time strings like "1h30m", "45m", "6", "90s", "1d" into total minutes | ||
| !> Supported units: d/D (days), h (hours), m (minutes), s (seconds) | ||
| !> Plain integers without units are interpreted as hours | ||
| ! | ||
| !----------------------------------------------------------------------- | ||
| subroutine parse_time_string(time_str, total_minutes, ierr)!{{{ | ||
|
|
||
| implicit none | ||
|
|
||
| character(len=*), intent(in) :: time_str | ||
| real(kind=RKIND), intent(out) :: total_minutes | ||
| integer, intent(out) :: ierr | ||
|
|
||
| character(len=StrKIND) :: str_local | ||
| integer :: i, len_str, num_start, read_err | ||
| real(kind=RKIND) :: value | ||
| character :: ch | ||
| logical :: found_unit | ||
|
|
||
| ierr = 0 | ||
| total_minutes = 0.0_RKIND | ||
| str_local = trim(adjustl(time_str)) | ||
| len_str = len_trim(str_local) | ||
|
|
||
| if (len_str == 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| ! Check if last character is a digit (plain integer = hours) | ||
| ch = str_local(len_str:len_str) | ||
| if (ch >= '0' .and. ch <= '9') then | ||
| read(str_local, *, iostat=read_err) value | ||
| if (read_err /= 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
| total_minutes = value * 60.0_RKIND | ||
| return | ||
| end if | ||
|
|
||
| ! Parse duration with units (e.g., "1h30m", "45m", "90s", "1d" or "1D") | ||
| num_start = 1 | ||
| i = 1 | ||
| do while (i <= len_str) | ||
| ch = str_local(i:i) | ||
| ! Check for unit characters: d/D (days), h (hours), m (minutes), s (seconds) | ||
| if (ch == 'h' .or. ch == 'm' .or. ch == 's' .or. ch == 'd' .or. ch == 'D') then | ||
|
|
||
| if (i == num_start) then | ||
| ierr = 1 ! No number before unit | ||
| return | ||
| end if | ||
|
|
||
| read(str_local(num_start:i-1), *, iostat=read_err) value | ||
| if (read_err /= 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| select case (ch) | ||
| case ('h') | ||
| total_minutes = total_minutes + value * 60.0_RKIND | ||
| case ('m') | ||
| total_minutes = total_minutes + value | ||
| case ('s') | ||
| total_minutes = total_minutes + value / 60.0_RKIND | ||
| case ('d', 'D') | ||
| total_minutes = total_minutes + value * 24.0_RKIND * 60.0_RKIND | ||
| end select | ||
|
|
||
| num_start = i + 1 | ||
| end if | ||
| i = i + 1 | ||
| end do | ||
|
|
||
| ! Check for trailing digits without unit (error) | ||
| if (num_start <= len_str) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| end subroutine parse_time_string!}}} | ||
|
|
||
|
|
||
| !||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| | ||
| ! | ||
| ! parse_output_timelevel_spec | ||
| ! | ||
| !> \brief Parse one time level specification segment | ||
| !> \author Guoqing Ge | ||
| !> \date February 2026 | ||
| !> \details | ||
| !> Parses a single segment using format: start, start-stop, or start-stop-step | ||
| !> Time strings can be integers (hours) or duration format (e.g., "1h30m", "45m") | ||
| !> Examples: "6", "0-3", "0-1h-15m", "1h30m-2h-15m" | ||
| ! | ||
| !----------------------------------------------------------------------- | ||
| subroutine parse_output_timelevel_spec(spec, start_hour, end_hour, interval_minutes, ierr)!{{{ | ||
|
|
||
| implicit none | ||
|
|
||
| character(len=*), intent(in) :: spec | ||
| real(kind=RKIND), intent(out) :: start_hour, end_hour | ||
| real(kind=RKIND), intent(out) :: interval_minutes | ||
| integer, intent(out) :: ierr | ||
|
|
||
| character(len=StrKIND) :: spec_local | ||
| integer :: i, dash_count, dash_pos(2), len_spec | ||
| real(kind=RKIND) :: start_minutes, end_minutes, step_minutes | ||
| integer :: local_ierr | ||
|
|
||
| ierr = 0 | ||
| dash_pos(1) = 0 | ||
| dash_pos(2) = 0 | ||
| spec_local = trim(adjustl(spec)) | ||
| len_spec = len_trim(spec_local) | ||
|
|
||
| ! Guard against empty string | ||
| if (len_spec == 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| ! Count dashes to determine format | ||
| dash_count = 0 | ||
| do i = 1, len_spec | ||
| if (spec_local(i:i) == '-') then | ||
| dash_count = dash_count + 1 | ||
| if (dash_count <= 2) dash_pos(dash_count) = i | ||
| end if | ||
| end do | ||
|
|
||
| ! Parse based on number of dashes | ||
| if (dash_count == 0) then | ||
| ! Format: single time string (output at that time only) | ||
| call parse_time_string(spec_local, start_minutes, local_ierr) | ||
| if (local_ierr /= 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
| start_hour = start_minutes / 60.0_RKIND | ||
| end_hour = start_hour | ||
| interval_minutes = 60.0_RKIND ! Default, but won't matter for single time | ||
|
|
||
| else if (dash_count == 1) then | ||
| ! Format: start-stop (interval defaults to 1 hour) | ||
| if (dash_pos(1) <= 1 .or. dash_pos(1) >= len_spec) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| call parse_time_string(spec_local(1:dash_pos(1)-1), start_minutes, local_ierr) | ||
| if (local_ierr /= 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| call parse_time_string(spec_local(dash_pos(1)+1:len_spec), end_minutes, local_ierr) | ||
| if (local_ierr /= 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| start_hour = start_minutes / 60.0_RKIND | ||
| end_hour = end_minutes / 60.0_RKIND | ||
| interval_minutes = 60.0_RKIND | ||
|
|
||
| else if (dash_count == 2) then | ||
| ! Format: start-stop-step | ||
| if (dash_pos(1) <= 1 .or. dash_pos(2) <= dash_pos(1) + 1 .or. dash_pos(2) >= len_spec) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| call parse_time_string(spec_local(1:dash_pos(1)-1), start_minutes, local_ierr) | ||
| if (local_ierr /= 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| call parse_time_string(spec_local(dash_pos(1)+1:dash_pos(2)-1), end_minutes, local_ierr) | ||
| if (local_ierr /= 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| call parse_time_string(spec_local(dash_pos(2)+1:len_spec), step_minutes, local_ierr) | ||
| if (local_ierr /= 0) then | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| start_hour = start_minutes / 60.0_RKIND | ||
| end_hour = end_minutes / 60.0_RKIND | ||
| interval_minutes = step_minutes | ||
|
|
||
| else | ||
| ! More than 2 dashes - invalid format | ||
| ierr = 1 | ||
| return | ||
| end if | ||
|
|
||
| end subroutine parse_output_timelevel_spec!}}} |
There was a problem hiding this comment.
This is an extraordinary amount of code to parse a simple string. The vast majority of it could be replaced with a call to split
| return | ||
| end if | ||
|
|
||
| ! First pass: find if current forecast hour matches any segment |
There was a problem hiding this comment.
You do not need to reparse the output_timelevel every time you reach a new forecast time. The output_timelevel doesn't change. Instead, parse it once and reuse the information.
| seg_end = seg_end + 1 | ||
| end do | ||
| seg_end = seg_end - 1 | ||
|
|
There was a problem hiding this comment.
You should use split instead of reimplementing it.
| end do | ||
| seg_end = seg_end - 1 | ||
|
|
||
| if (seg_end >= seg_start) then |
There was a problem hiding this comment.
This is a check for an invalid condition which could be detected at the beginning of the model when the XML is parsed. Users shouldn't have to wait for the first forecast output time to see that their XML is invalid.
| if (forecast_hour >= start_hour .and. forecast_hour <= end_hour) then | ||
| found = .true. | ||
| ! Check if this is a single time point (not a range) | ||
| if (abs(start_hour - end_hour) < 1.0e-6_RKIND) then |
There was a problem hiding this comment.
The parse_output_timelevel_spec already knows whether this is a single time. Why do you need to check it again? Can't you return that information?
| ! More than 2 dashes - invalid format | ||
| ierr = 1 | ||
| return | ||
| end if |
There was a problem hiding this comment.
This should print an error message so the user knows what they did wrong.
| end if | ||
|
|
||
| seg_start = seg_end + 1 | ||
| do while (seg_start <= work_len .and. work_str(seg_start:seg_start) == ' ') |
There was a problem hiding this comment.
Out-of-bounds array read due to an off-by-one bug. This will access the character at work_len+1 because Fortran doesn't have a short-circuit .and. operator.
| ! If we matched a single time, find the next output time | ||
| if (found .and. is_single_time) then | ||
| ! Second pass: find the minimum time > forecast_hour | ||
| seg_start = 1 | ||
| do while (seg_start <= work_len) | ||
| seg_end = seg_start | ||
| do while (seg_end <= work_len .and. work_str(seg_end:seg_end) /= ' ') | ||
| seg_end = seg_end + 1 | ||
| end do | ||
| seg_end = seg_end - 1 |
There was a problem hiding this comment.
You're parsing the next output time twice: once for checking the current output time, and again when checking the next output time.
If you had put the information in a derived type data structure, you wouldn't need this complexity.
| end subroutine stream_mgr_add_pkg_c !}}} | ||
|
|
||
|
|
||
| subroutine stream_mgr_set_property_c(manager_c, streamID_c, propertyName_c, propertyValue_c, ierr_c) bind(c) !{{{ |
There was a problem hiding this comment.
This subroutine is unnecessary. The output_interval should be sent to stream_mgr_create_stream_c
| ! Guard against empty string | ||
| if (work_len == 0) then | ||
| ierr = 1 | ||
| return | ||
| end if |
There was a problem hiding this comment.
This should print an error so the user knows what they did wrong.
If you want to use a parser generator, the original grammar specification should be in the repository, not the automatically-generated code. |
No, I don't intend to use a parser generator. I put EBNF there to make sure the overall timelevel specification logic is consistent and complete, and there are no hidden surprises. |
@SamuelTrahanNOAA Thanks for the comments and discussions! This is the first version to get things work. I also feel the implementation is kind of complicated and I would like to descope to meet our current needs only. My thought is to limit the timelevel specifications to only support minutes and hours (may also limit the output to have to start at 0h). I intentionally try to avoid using the same time format |
It is, indeed, a finely-polished EBNF with perfect indentation. However, it doesn't describe the code. Your |
What is more user-friendly is splitting them into individual XML tags so it is clear what is going on. That would use the existing MPAS parsing code (eliminating the parser). Also, it'll be less confusing to experienced MPAS users. <output_interval start="01:15:00" stop="02:30:00" step="00:15:00"/> <!-- every 15 minutes from 1:15 to 2:30 -->
<output_interval start="06:00:00"/> <!-- Only at hour 6 -->I can see how your string would be easier to pass through the rrfs-workflow bash scripts, but bash limitations shouldn't take precedence over MPAS code consistency and quality. |
|
Can you please provide:
|
|
@clark-evans I am sorry, this may not be ready for an official review yet. I should have put this in the draft mode since the beginning. |
This looks good. Does current MPAS code support this? |
No. I'm suggesting it as an alternative implementation to your recursive descent parser. |
|
@SamuelTrahanNOAA Thanks for lots of great discussions! For the moment, this PR is mainly for a test/demo purpose. I think my initial need is much simpler that the goal in this PR. We want to output For our current needs, I think NCAR Andy Stokely's changes may work. I will test that first. I was not aware of Andy's changes until 2 days ago. |
|
WRF was limited to start...step...end for each stream. We were able to work around that by having multiple output streams. (Recall that the operational models HRRR, HWRF, NAM, and RAP were all WRF-based, along with other quasi-operational models.) |
Introduce a new
output_timelevelsattribute for MPAS streams that enables variable output intervals.With this capability, we may outoput every 15 minutes in the first hour, every hour in the first 3 days, every 3 hours for the next 4 days, and every 6 hours in the last 3 days.
We can also use this to only write out forecast files during a given period, such as:
output_timelevels="6-12h"Check the PR description below for details on how to specify the time levles.
Here is a quick example:
output_timelevels="0-3-15m 4-72 75-168-3 174-240-6"FYI, with this PR, we may write out mpasout files at
"0 1 2 3 6 9 12", while history files at"4 5 7 8 11 13-99999"to avoid duplicate output of both mpasout and history files at the same time levels, which are almost the same.This PR solves issue #214
Priority Reviewers