Skip to content

Filter set!(model, ::MetadataSet) by per-model accepted kwargs#274

Merged
glwagner merged 3 commits into
mainfrom
glw/metadata-set-per-model-set
May 22, 2026
Merged

Filter set!(model, ::MetadataSet) by per-model accepted kwargs#274
glwagner merged 3 commits into
mainfrom
glw/metadata-set-per-model-set

Conversation

@glwagner
Copy link
Copy Markdown
Member

@glwagner glwagner commented May 22, 2026

Summary

  • Fixes the docs-build failure introduced by MetadataSet (#235): core + adoption sweep #259 in examples/one_degree_simulation.jl: set!(ocean.model, ecco_set) and set!(sea_ice.model, ecco_set) both threw ArgumentError: name h not found in model.velocities, model.tracers, or model.free_surface because the generic set!(model, ::MetadataSet) forwarded every glossary-mapped variable to the model, but real HydrostaticFreeSurfaceModel/SeaIceModel throw on unknown kwargs.
  • Add direct dispatches set!(::HydrostaticFreeSurfaceModel, ::MetadataSet) and set!(::SeaIceModel, ::MetadataSet) that filter the routed kwargs to those the model actually consumes.
  • Keep the generic set!(model, ::MetadataSet) fallback simple: deliver every glossary-mapped variable. Permissive stub/ad-hoc models continue to work; strict models override to filter.

Design

Plain multiple dispatch — no accepted_metadata_names hook in between. Per-model logic lives next to its dispatch, and a future model can do more than just filter (transform, validate, multi-pass) without having to extend a second function. A small _set_glossary! helper keeps the kwargs-construction code DRY across the two dispatches.

Why this got through CI

The existing set!(model, mset) tests in test/test_metadata_set.jl use a permissive StubModel whose set! accepts arbitrary kwargs, so the dispatch's failure to filter wasn't exercised. Docs build did fail on the #259 merge commit but the run shows as cancelled, so it wasn't a hard block.

Test plan

  • julia --project=. test/test_metadata_set.jl passes locally, including a new RestrictiveStubModel regression test that mimics the real models' kwarg-strictness with a shared 4-variable MetadataSet and demonstrates the override pattern.
  • Smoke test against real models confirms which(set!, (HydrostaticFreeSurfaceModel, MetadataSet)) and which(set!, (SeaIceModel, MetadataSet)) route to the specialised methods, not the fallback.
  • CI docs build passes (this is the failing case from main).

🤖 Generated with Claude Code

set!(model, mset) previously forwarded every glossary-mapped variable to
set!(model; kwargs...), which fails for real Oceananigans/ClimaSeaIce
models that throw ArgumentError on unknown kwargs. A shared 4-variable
MetadataSet (T, S, h, ℵ) therefore broke both the ocean's set! (rejects
h, ℵ) and the sea-ice set! (rejects T, S), as seen in the docs-build
failure of examples/one_degree_simulation.jl.

Add `accepted_metadata_names(model)` with model-specific methods for
HydrostaticFreeSurfaceModel (introspect velocities/tracers/free_surface)
and SeaIceModel (only :sea_ice_thickness, :sea_ice_concentration), and
filter the routed kwargs through it. The fallback (`keys(variable_glossary)`)
preserves existing StubModel test behavior.

Add a RestrictiveStubModel regression test in test_metadata_set.jl that
mimics the real models' kwarg-strictness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 22, 2026

Codecov Report

❌ Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/DataWrangling/metadata.jl 80.00% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

glwagner and others added 2 commits May 21, 2026 18:36
…dataSet) dispatch

Previous commit introduced an `accepted_metadata_names` hook that the generic
`set!(model, mset)` consulted to filter routed kwargs. Refactor to plain
multiple dispatch: each model type overrides `set!(::Model, ::MetadataSet)`
directly. The generic fallback delivers every glossary-mapped variable
(useful for permissive stub models) and real models override to filter.

Why this design is better:
- Idiomatic Julia — no parallel hook layer, just multiple dispatch.
- Per-model logic lives next to its dispatch, not behind a predicate.
- Future models can do more than just filter (transform, validate, multi-pass)
  without having to extend a second function.

A small `_set_glossary!` helper keeps the kwargs-construction code DRY across
the dispatches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s arg

Replace the private `_set_glossary!` helper with a third optional argument
`names` on `Fields.set!(model, mset::MetadataSet, names=keys(variable_glossary))`.
Per-model overrides simply compute their narrowed `names` and call the
3-arg generic — no separate helper, no redundant `haskey(variable_glossary, ...)`
check (the default already restricts to glossary keys).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

Yeah! Thanks! I was about to open an issue!
I'll try with this PR!

@glwagner
Copy link
Copy Markdown
Member Author

i want to merge bc this is urgent (0.5.0 is broken). can you approve?

@glwagner
Copy link
Copy Markdown
Member Author

i'll merge because tests pass.

@glwagner glwagner merged commit 29fd499 into main May 22, 2026
10 checks passed
@glwagner glwagner deleted the glw/metadata-set-per-model-set branch May 22, 2026 01:34
@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

Hm... Did the test capture the case in the one_degree_model?
I can't test because ECCO server is down...
When I put some show methods in the set!(model::HydrostaticFreeSurfaceModel, mset::MetadataSet) method I got:

set!(ocean.model,   ecco_set)   # picks up :temperature, :salinity → T, S
short = (propertynames(model.velocities)..., propertynames(model.tracers)..., propertynames(model.free_surface)...) = (:u, :v, :w, :T, :S, :e, :displacement, :barotropic_velocities, :filtered_state, :gravitational_acceleration, :kernel_parameters, :substepping, :timestepper)
names = (:eastward_velocity, :v_velocity, :temperature, :u_velocity, :eastward_wind, :northward_wind, :air_temperature, :northward_velocity, :salinity)
[ Info: Downloading ECCO data: temperature in /home/constantinon/.julia/scratchspaces/904d977b-046a-4731-8b86-9235c0d1ef02/ECCO/v4...
[ Info:  ... downloaded 274.0 B (100% complete, 552.080 ms)
[ Info:  ... downloaded 274.0 B (100% complete, 552.161 ms)
[ Info:  ... downloaded 274.0 B (100% complete, 552.193 ms)
[ Info:  ... downloaded 274.0 B (100% complete, 552.210 ms)
ERROR: RequestError: HTTP/2 403 while requesting https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/interp_monthly/THETA/1993/THETA_1993_01.nc

Seems that names include also velocities?

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

i want to merge bc this is urgent (0.5.0 is broken). can you approve?

I was trying to test!

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

@glwagner
Copy link
Copy Markdown
Member Author

glwagner commented May 22, 2026

ah sorry! ~~yes, it looks like the MetadataSet for ECCO is including velocities, although we may not want to set them. ~~

I don't think it does:

ecco_set = MetadataSet(:temperature, :salinity,
:sea_ice_thickness, :sea_ice_concentration;
dataset = ECCO4Monthly(), date)

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

What could be a clean solution for that?

@glwagner
Copy link
Copy Markdown
Member Author

As far as I can tell the MetadataSet does not include velocities. What is your conclusion? I'm not sure what you are showing up there

@glwagner
Copy link
Copy Markdown
Member Author

Ok so looking at the code, what you have shown is the list of all possible names that can be set. It is not the list of names that are set.

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

Hm... I just went here

function Fields.set!(model::HydrostaticFreeSurfaceModel, mset::MetadataSet)
short = (propertynames(model.velocities)...,
propertynames(model.tracers)...,
propertynames(model.free_surface)...)
names = Tuple(n for (n, s) in variable_glossary if s in short)
return set!(model, mset, names)
end

and put a @show in front of names. Then names is provided to set!. So these are the variables that can be set, or aren't they? Am I missing something?

@glwagner
Copy link
Copy Markdown
Member Author

glwagner commented May 22, 2026

This is what names says:

names = Tuple(n for (n, s) in variable_glossary if s in short)

this builds a tuple that contains every name in variable_glossary (a mapping of long to short names, containing all possible names that can be set from MetadataSet), for which the short name "s" is contained in short (the list of valid names for this particular model). That is what if s in short means.

@glwagner
Copy link
Copy Markdown
Member Author

Another way to write it:

    valid_long_names = Tuple(long_name for (long_name, short_name) in variable_glossary
                             if short_name in short)
    return set!(model, mset, valid_long_names)

@glwagner
Copy link
Copy Markdown
Member Author

function Fields.set!(model::HydrostaticFreeSurfaceModel, mset::MetadataSet)
    hfsm_short_names = (propertynames(model.velocities)..., propertynames(model.tracers)..., )
    valid_long_names = Tuple(long_name for (long_name, short_name) in variable_glossary if short_name in hfsm_short_names)
    return set!(model, mset, valid_long_names)
end

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

But e.g. :u is in hfsm_short_names and also in valid_long_names.... So I don't understand why it won't be set.

Anyway, I'll try myself and see what happens when the ECCO server heals; there's no need to try to "compile" code in our heads.

@glwagner
Copy link
Copy Markdown
Member Author

But e.g. :u is in hfsm_short_names and also in valid_long_names.... So I don't understand why it won't be set.

It is correct that :u is in both the short names for HFSM, and is one of the valid long names.

However, it is not one of the variables contained in mset. Therefore, it will not be set. Does that make sense?

@glwagner
Copy link
Copy Markdown
Member Author

one way to understand this could be to write @show mset? would that help?

@glwagner
Copy link
Copy Markdown
Member Author

I believe this concept will apply to any dataset, so maybe try another one other than ECCO to see

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

one way to understand this could be to write @show mset? would that help?

Right! I was forgetting the mset only includes the variables we constructed it with and not all variables of the current dataset!

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

We can deal with it within #277

@navidcy
Copy link
Copy Markdown
Member

navidcy commented May 22, 2026

e1d223e deals with it; the error was because MetadataSet without a date kwarg defaults to all dates

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants