Skip to content

remove decorator that is not compatible with all CDS datasets#605

Merged
sandorkertesz merged 3 commits into
developfrom
feature/cds-plugin-over-zealous-decorator
Apr 22, 2025
Merged

remove decorator that is not compatible with all CDS datasets#605
sandorkertesz merged 3 commits into
developfrom
feature/cds-plugin-over-zealous-decorator

Conversation

@EddyCMWF
Copy link
Copy Markdown
Contributor

@EddyCMWF EddyCMWF commented Feb 5, 2025

This decorator breaks requests for some datasets where the daterange is required as a daterange in the cds/ads request (I'm just finding a nice example).

Eitherway, this modifying of requests has the potential to lead to misunderstandings/incompatibilities in the future. We can have the exception of the the area as here it is a case of mapping the earhtkit style dictionary to the CDS style list for an area definition.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Feb 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.15%. Comparing base (2d19c9f) to head (2282487).
⚠️ Report is 362 commits behind head on develop.

Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #605   +/-   ##
========================================
  Coverage    90.15%   90.15%           
========================================
  Files          159      159           
  Lines        12181    12181           
  Branches       595      595           
========================================
  Hits         10982    10982           
  Misses        1017     1017           
  Partials       182      182           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sandorkertesz
Copy link
Copy Markdown
Collaborator

This will break existing code. E.g. would break the test test_cds_grib_multi_var_date.

If it is acceptable please give a clear description on how to migrate existing code (like the one in the test above).

@EddyCMWF
Copy link
Copy Markdown
Contributor Author

EddyCMWF commented Mar 17, 2025

Hi Sandor,
Sorry for being slow to repond on this, I decided to update my dataset to match this hidden pre-processing that EK-data is doing, however I have discovered that the issue is broader than I first thought. It effects all the datasets in the CDS/ADS that use a date-range. The sample API code produced by the web-portal for date-ranges is not compatible, it is a list of date-ranges, and EK-data is turning this into a list with just the first value in it. Here is a minimum reproducable example with a dataset that has been published for a long time:

dataset = "cams-global-ghg-reanalysis-egg4"
request = {
    "pressure_level": ["1000"],
    "date": ["2020-12-01/2020-12-31"],
    "step": ["0"],
    "data_format": "grib",
    "variable": [
        "temperature",
    ]
}

ekds = earthkit.data.from_source(
    "cds", dataset, request
)
ekds.to_xarray()

It returns only the first day instead of the whole month, if you put some debug output in EK-data you see it's because ["2020-12-01/2020-12-31"] gets turned into ["2020-12-01"]

A problem with the test you mentioned is that it's using date formats which are not using the suggested CDS/ADS syntax.

Also, I think it's implementing functionality which only works for grib and not really guaranteed to produce something sensible for our other datasets.

https://ads.atmosphere.copernicus.eu/datasets/cams-global-ghg-reanalysis-egg4?tab=download

@sandorkertesz
Copy link
Copy Markdown
Collaborator

@EddyCMWF, is there a document explaining the cdsapi syntax for "date"?

@EddyCMWF
Copy link
Copy Markdown
Contributor Author

Hi Sandor,

The suggested syntax is the code you get when you click on the "Show API request" in the cownload form. I've asked the contractors to add the type of value produced by the widgets to the widget documentation pages.

However, the problem is more general than that. earthkit is assuming that the behaviour of a a widget is something, when the reality is that the CDS datasets are very broad in scope, therefore assuming that modifying a request in a certain way for all datasets is not a good idea. This code was written for requests to MARS datasets in the old system, and is not general enough for all the types of data available in the CDS in my opinion.

Eddy

@sandorkertesz
Copy link
Copy Markdown
Collaborator

Hi @EddyCMWF, I agree with you. Please update the PR with the latests develop branch and then it can be merged.

@EddyCMWF
Copy link
Copy Markdown
Contributor Author

Hi Sandor, I've merged develop into my branch, however I am now a little confused... I was expecting the unit tests to fail, but they are not

@EddyCMWF
Copy link
Copy Markdown
Contributor Author

I added some more date-formats using the suggested syntax from the CDS, I expect them to pass without the decorator. I expect the existing "2012-12-12/to/2012-12-14" case to fail, but it is passing for reasons that I don't know. It fails when I run the tests locally

@sandorkertesz
Copy link
Copy Markdown
Collaborator

I added some more date-formats using the suggested syntax from the CDS, I expect them to pass without the decorator. I expect the existing "2012-12-12/to/2012-12-14" case to fail, but it is passing for reasons that I don't know. It fails when I run the tests locally

In the CI the CDS tests are disabled.

@sandorkertesz sandorkertesz merged commit 7a39d72 into develop Apr 22, 2025
118 of 119 checks passed
@sandorkertesz
Copy link
Copy Markdown
Collaborator

@EddyCMWF, the problem with disabling the normaliser is that now the e.g. a datetime.datetime cannot be used. Maybe we should create another normaliser for cds dates and make it optional so that users can have the choice to turn it on or off.

@EddyCMWF EddyCMWF deleted the feature/cds-plugin-over-zealous-decorator branch May 9, 2025 07:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants