
Add NOAA MRMS CONUS hourly precipitation analysis dataset #472

Open
aldenks wants to merge 16 commits into main from
claude/implement-noaa-mrms-conus-oYYud

Conversation

@aldenks aldenks commented Feb 27, 2026

Summary

This PR adds support for the NOAA Multi-Radar Multi-Sensor (MRMS) CONUS hourly precipitation analysis dataset. The implementation includes data ingestion, reformatting, and operational update capabilities for multiple precipitation-related variables from MRMS.

Towards #461, closes #473

Key Changes

  • Template Configuration (template_config.py): Defines the dataset structure with 4 data variables (precipitation_surface, precipitation_pass_1_surface, precipitation_radar_only_surface, and categorical_precipitation_type_surface) covering the Continental US at 0.01° resolution with hourly frequency from October 2014 onwards.

  • Region Job Implementation (region_job.py): Implements data processing logic including:

    • Source file coordinate generation with support for multiple MRMS product versions
    • Handling of MRMS v12.0 transition (October 2020) with fallback to pre-v12 products (GaugeCorr_QPE_01H) for historical data
    • Download support from three sources: AWS S3, Iowa Mesonet archive, and NCEP
    • GRIB2 decompression and rasterio-based data extraction
    • Deaccumulation of hourly precipitation accumulations to rates
    • Processing region buffering for proper deaccumulation without gaps
  • Dataset Class (dynamical_dataset.py): Provides the main dataset interface with:

    • Operational update scheduling (every 3 hours via Kubernetes CronJob)
    • Data validation pipeline
    • Support for both backfill and incremental updates
  • Zarr Templates: Complete Zarr v3 metadata templates for all coordinates and data variables with optimized chunking and compression (zstd + blosc).

  • Tests: Comprehensive test coverage including:

    • Source file coordinate URL generation for different sources and time periods
    • Product version selection logic (v12 vs pre-v12)
    • Processing region buffering behavior
    • End-to-end backfill and operational update workflows
    • Data validation
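The deaccumulation step listed above could be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the function name is hypothetical, and the exact differencing scheme is assumed from the PR's statements that hourly accumulations are converted to rates and that the first output timestep is NaN because it has no prior timestep to difference against.

```python
import numpy as np

SECONDS_PER_HOUR = 3600.0

def deaccumulate_to_rate(accum: np.ndarray) -> np.ndarray:
    # Difference successive accumulations along time (axis 0) and convert
    # mm per hour to mm/s (equivalent to kg m-2 s-1 for water). The first
    # timestep has no prior value to difference against, so it stays NaN.
    rates = np.full_like(accum, np.nan, dtype=float)
    rates[1:] = np.diff(accum, axis=0) / SECONDS_PER_HOUR
    return rates

# Toy series: 3 hourly timesteps over a 2x2 grid of accumulated precip (mm).
accum = np.array([
    [[0.0, 0.0], [0.0, 0.0]],
    [[3.6, 0.0], [7.2, 0.0]],
    [[7.2, 0.0], [7.2, 3.6]],
])
rates = deaccumulate_to_rate(accum)
assert np.all(np.isnan(rates[0]))  # first timestep is NaN by construction
```

This also shows why a buffered processing region matters: computing the rate at the first hour of a region requires the accumulation from the hour before it.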

Notable Implementation Details

  • Version-aware product selection: Automatically selects appropriate MRMS product names based on data timestamp, with pre-v12 products only available via Iowa Mesonet archive
  • Pass 1 availability: Pass 1 precipitation data only available from October 2020 onwards; earlier requests are skipped
  • Deaccumulation handling: First timestep in output is NaN due to deaccumulation requiring a prior timestep
  • Spatial reference: Uses IAU 1965 spheroid (native to MRMS) with sub-pixel difference from WGS84 at 0.01° resolution
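The version-aware product selection described above could look something like this sketch. The function name and exact cutoff date are assumptions (the PR only says "October 2020"); the product and source names come from the PR text.

```python
import pandas as pd

# Approximate v12.0 transition; the PR states only "October 2020".
MRMS_V12_START = pd.Timestamp("2020-10-01")

def select_product(time: pd.Timestamp) -> tuple[str, str]:
    # Pre-v12 products (GaugeCorr_QPE_01H) are only available via the
    # Iowa Mesonet archive; v12+ MultiSensor products come from AWS S3.
    if time < MRMS_V12_START:
        return "GaugeCorr_QPE_01H", "iowa_mesonet"
    return "MultiSensor_QPE_01H_Pass2", "aws_s3"

print(select_product(pd.Timestamp("2015-06-01")))
```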

https://claude.ai/code/session_01JD7KMBFUaoUjEYcNa6VhtK


TODOs

  • make sure integration tests aren't too slow / don't OOM
  • do a small run to:
    • check actual compressed size and update chunks/shards if needed
    • check resource requirements and adjust


data = region_job.read_data(updated_coord, radar_var)
assert data.shape == (3500, 7000)
assert not np.all(np.isnan(data))

change all the not np.all(np.isnan(...)) checks in all tests in this PR to assert that the values are all finite. the only NaNs should be at the very first timestep of the entire dataset.
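The distinction this review asks for can be seen on a synthetic array (the array names are illustrative, not from the PR):

```python
import numpy as np

# Array that is almost entirely NaN -- the weak check still passes.
mostly_nan = np.full((4, 5), np.nan)
mostly_nan[0, 0] = 2.0
assert not np.all(np.isnan(mostly_nan))  # weak: passes despite 19/20 NaNs

# The stricter all-finite check fails on that array...
assert not np.all(np.isfinite(mostly_nan))
# ...and passes only when every value is a real, finite number.
clean = np.full((4, 5), 1.5)
assert np.all(np.isfinite(clean))
```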

Comment on lines +229 to +230
common_keys = set(template_attrs) & set(file_attrs)
for key in common_keys:

assert "spatial_ref" in common_key and "crs_wkt" in common_keys so we know its not empty



@pytest.mark.slow
def test_single_file_integration(tmp_path: Path) -> None:

we have download_file and read_data tests in region_job_test.py. remove this test and move just the download + crs/spatial coords test part of it to template_config_test.py

@aldenks aldenks marked this pull request as ready for review February 27, 2026 18:11
claude and others added 13 commits March 1, 2026 21:17
Implement the noaa-mrms-conus-analysis-hourly dataset with:
- Three data sources: Iowa Mesonet (pre-v12), AWS S3 (primary), NCEP (fallback)
- Four variables: precipitation_surface, precipitation_pass_1_surface,
  precipitation_radar_only_surface, categorical_precipitation_type_surface
- Deaccumulation of QPE accumulations to precipitation rates
- MRMS v12.0 product discontinuity handling (GaugeCorr_QPE → MultiSensor_QPE)
- Gzip-compressed GRIB2 source file support
- Template, tests, and dataset registration

https://claude.ai/code/session_01JD7KMBFUaoUjEYcNa6VhtK
Add test_single_file_integration that downloads a real MRMS file from S3,
reads all template variables, and verifies GRIB lat/lon and CRS attributes
match template dimension_coordinates and spatial_ref.

Also update attribution to include NOAA NCEP as a source.

https://claude.ai/code/session_01JD7KMBFUaoUjEYcNa6VhtK
- Change time chunks from 72 to 720 (30 days), shards from 2160 to 720
- Change lat/lon shards from 4x to 10x chunk size
- Remove early return for existing decompressed files to handle retry
  of corrupt downloads

https://claude.ai/code/session_01JD7KMBFUaoUjEYcNa6VhtK
- Change `not np.all(np.isnan(...))` to `np.all(np.isfinite(...))` in region_job_test.py
- Assert spatial_ref and crs_wkt keys exist in common_keys before comparing
- Move CRS/spatial coordinate validation from dynamical_dataset_test.py to template_config_test.py
- Remove test_single_file_integration (download/read coverage already in region_job_test.py)

https://claude.ai/code/session_01JD7KMBFUaoUjEYcNa6VhtK
Monkeypatches _get_template to .sel() on the time dimension, reducing
the template size for the integration test. Print statements for
snapshot capture still present - will be replaced with assertions.

https://claude.ai/code/session_01JD7KMBFUaoUjEYcNa6VhtK
Process only 2 hours for backfill + 1 for update instead of 3+1,
and replace print statements with assert_allclose/assert_array_equal
snapshot checks at a point with meaningful data (snow, non-zero precip).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…rd boundary test

- Override time shard encoding to size 2 so the operational update naturally
  crosses a shard boundary, testing deaccumulation buffering without a separate test
- Replace write_shards to only write the first spatial shard (1 of 8), cutting
  write time by ~87%
- Combined test: 69s+14s(failing) → 10s(passing), full suite: 86s → 13s

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Iowa Mesonet filenames don't use the MRMS_ prefix that S3 and NCEP use.
e.g. GaugeCorr_QPE_01H_00.00_... not MRMS_GaugeCorr_QPE_01H_00.00_...

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Some pre-v12 Iowa Mesonet MRMS files contain a duplicate GRIB message
encoded with standard meteorological discipline (0) alongside the
MRMS-specific discipline (209). Read band 1 in this case after asserting
band 2 has the expected discipline, keeping the original assertion for
all other multi-band cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Shared memory was 65.7GB (over limit). New config:
- time chunk/shard: 720h→648h (30→27 days), shared memory 59.1GB
- spatial chunk: 175×175→100×100 (1.75°→1°), ~1.2MB compressed at 5%
- spatial shard: 1750×1750→700×1400 (5×5 shards, geographically square)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After reducing lat/lon shard sizes from 1750×1750 to 700×1400 in
c5a7c4f, the test's first-shard assertion was reading into unwritten
shards, causing assert_no_nulls to fail on NaN fill values.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@aldenks aldenks force-pushed the claude/implement-noaa-mrms-conus-oYYud branch from df700e8 to b4754ab on March 2, 2026 02:18
    data_var_group: Sequence[NoaaMrmsDataVar],
) -> Sequence[NoaaMrmsSourceFileCoord]:
    times = pd.to_datetime(processing_region_ds["time"].values)
    data_var = data_var_group[0]

assert len(data_var_group) == 1

class NoaaMrmsSourceFileCoord(SourceFileCoord):
    time: Timestamp
    product: str
    level: str = "00.00"

remove default value

Comment on lines +96 to +98
    product = internal.mrms_product_pre_v12
else:
    product = internal.mrms_product

add mrms_fallback_products_pre_v12 and mrms_fallback_products. these should be empty tuples for all variables in the template config except precipitation_surface. add a new fallback_products: tuple[str, ...] on the source file coord and fill it in here.

for precipitation surface:
mrms_fallback_products = ("MultiSensor_QPE_01H_Pass1", "RadarOnly_QPE_01H")
mrms_fallback_products_pre_v12 = ("RadarOnly_QPE_01H",)

except FileNotFoundError:
    if coord.time > (pd.Timestamp.now() - pd.Timedelta(hours=12)):
        return self._download_from_source(coord, source="ncep")
    raise

update this logic to the following after adding the fallback product attributes to the MRMS data var internal attrs and the MRMS source file coord.

from reformatters.common.pydantic import replace

is_pre_v12 = coord.time < MRMS_V12_START
is_recent = coord.time > (pd.Timestamp.now() - pd.Timedelta(hours=12))

if is_pre_v12:
    sources = ["iowa"]
elif is_recent:
    sources = ["s3", "nomads"]
else:
    sources = ["s3"]

products = [coord.product, *coord.fallback_products]

last_exception: Exception | None = None
for product in products:
    for source in sources:
        try:
            return self._download_from_source(replace(coord, product=product), source=source)
        except FileNotFoundError as e:
            last_exception = e
            continue

assert last_exception is not None
raise last_exception

Comment on lines +133 to +139
# Some pre-v12 Iowa Mesonet files have a duplicate GRIB message with
# standard meteorological discipline (0) alongside the MRMS-specific one (209).
# Band 1 (discipline 209) is always the authoritative MRMS data.
band2_discipline = reader.tags(2).get("GRIB_DISCIPLINE", "")
assert band2_discipline == "0(Meteorological)", (
    f"Expected band 2 GRIB_DISCIPLINE '0(Meteorological)', found '{band2_discipline}' in {coord.downloaded_path}"
)

rather than expect a specific order do this:

if reader.count == 2 and coord.time < MRMS_V12_START: find the band with grib discipline 209 and use that. assert that a band with 209 exists if we are in the count == 2 and pre v12 case. set a rasterio_band = n variable in both if/else branches and then use that in the reader.read call.
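The band-selection logic suggested here could be sketched as below. The reader interface mimics rasterio's `count`/`tags(band)`; the `FakeReader` class, the exact `MRMS_V12_START` date, and the discipline tag string for 209 are illustrative assumptions, not values from the PR.

```python
import pandas as pd

MRMS_V12_START = pd.Timestamp("2020-10-01")  # approximate; PR says October 2020

def select_rasterio_band(reader, time: pd.Timestamp) -> int:
    # Pre-v12 Iowa Mesonet files can contain a duplicate GRIB message with
    # standard discipline 0 alongside the MRMS-specific 209. Pick the 209
    # band by inspecting tags rather than assuming a fixed band order.
    if reader.count == 2 and time < MRMS_V12_START:
        mrms_bands = [
            b for b in range(1, reader.count + 1)
            if reader.tags(b).get("GRIB_DISCIPLINE", "").startswith("209")
        ]
        assert mrms_bands, "expected a band with GRIB_DISCIPLINE 209 in pre-v12 two-band file"
        return mrms_bands[0]
    assert reader.count == 1, f"unexpected band count {reader.count}"
    return 1

# Stand-in for a rasterio DatasetReader over a two-band pre-v12 file,
# deliberately ordered with the MRMS message second.
class FakeReader:
    count = 2
    def tags(self, band: int) -> dict:
        return {"GRIB_DISCIPLINE": "209(Reserved)" if band == 2 else "0(Meteorological)"}

print(select_rasterio_band(FakeReader(), pd.Timestamp("2015-06-01")))  # -> 2
```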

mrms_product: str
# Pre-v12 product name on Iowa Mesonet (e.g. GaugeCorr_QPE_01H for precipitation_surface)
mrms_product_pre_v12: str | None = None
mrms_level: str = "00.00"

remove the default value and explicitly set this in all data variables in this template

long_name="Precipitation rate",
units="kg m-2 s-1",
step_type="avg",
comment="Average precipitation rate over the previous hour. Derived from MultiSensor_QPE_01H_Pass2 from October 2020, GaugeCorr_QPE_01H before. Units equivalent to mm/s.",

Suggested change
comment="Average precipitation rate over the previous hour. Derived from MultiSensor_QPE_01H_Pass2 from October 2020, GaugeCorr_QPE_01H before. Units equivalent to mm/s.",
comment="Average precipitation rate over the previous hour. Derived from MultiSensor_QPE_01H_Pass2 from October 2020, GaugeCorr_QPE_01H before. If primary product is unavailable, falls back to MultiSensor_QPE_01H_Pass1 and then RadarOnly_QPE_01H. Units equivalent to mm/s.",

Copilot AI added a commit that referenced this pull request Mar 2, 2026
Co-authored-by: aldenks <463484+aldenks@users.noreply.github.com>
Copilot AI and others added 2 commits March 2, 2026 10:44
…lbacks, and robust pre-v12 band selection (#480)

* Initial plan

* Implement PR #472 feedback for MRMS fallback products and band selection

Co-authored-by: aldenks <463484+aldenks@users.noreply.github.com>

* Default MRMS fallback tuples and remove explicit empty assignments

Co-authored-by: aldenks <463484+aldenks@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: aldenks <463484+aldenks@users.noreply.github.com>
Development

Successfully merging this pull request may close these issues.

MRMS implementation, backfill, validation, and updating

3 participants