Add test for anemoi-datasets loader using CERRA test dataset hosted on EWC #1

Merged
leifdenby merged 16 commits into main from feat/anemoi-dataset-cerra-test
Mar 19, 2026

Conversation

@leifdenby
Member

@leifdenby leifdenby commented Mar 17, 2026

I have added a test that applies the anemoi-datasets loader and then uses mlwp-data-specs validator to validate the loaded xr.Dataset.

To implement this I also uploaded cerra-rr-an-oper-0001-mars-5p5km-2017-2017-6h-v3-testing.zarr from the anemoi dataset registry to the EWC (European Weather Cloud) object store at https://object-store.os-api.cci2.ecmwf.int/mlwp-sample-datasets/anemoi-datasets/cerra-rr-an-oper-0001-mars-5p5km-2017-2017-6h-v3-testing.zarr/

The point of this PR is to adjust mlwp-data-specs and the anemoi-datasets loader here in mlwp-data-loaders until the validation of the loaded xr.Dataset passes.

Currently this installs mlwp-data-specs directly from main (i.e. whatever commit is on HEAD) of https://github.com/mlwp-tools/mlwp-data-specs. Eventually we should tag and publish a release of mlwp-data-specs once this data-loader (and one for anemoi-inference datasets?) passes the validation.

@mpvginde your thoughts would be much appreciated :)

@leifdenby
Member Author

leifdenby commented Mar 17, 2026

Currently the test fails. I need to find out whether that is because:

  1. the loader isn't implemented correctly, i.e. I missed something when refactoring the loader functionality from mxalign
  2. the validator isn't implemented correctly, again I could have missed something from mxalign
[Screenshot: 2026-03-17 at 14 30 48]

@mpvginde

Hi @leifdenby, from my own tests using the validator with the mxalign loaders, I have noticed that some of the attributes (long_name, units, ...) are currently either not set correctly or not provided at all by the loaders. This is probably causing the validation to fail.
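
As a rough illustration of the kind of check described here, the sketch below reports which variables lack an attribute or carry a blank one. It is not the mlwp-data-specs or mxalign validator: the helper name `find_missing_attrs`, the `REQUIRED_ATTRS` tuple (taken only from the two attribute names mentioned in this comment), and the example variables are all hypothetical.

```python
from typing import Dict, List, Mapping

# Assumption: only the two attribute names mentioned in the comment above;
# the set actually required by mlwp-data-specs may differ.
REQUIRED_ATTRS = ("long_name", "units")


def find_missing_attrs(
    var_attrs: Mapping[str, Mapping[str, object]],
    required=REQUIRED_ATTRS,
) -> Dict[str, List[str]]:
    """Return {variable name: [attributes that are absent or blank]}."""
    problems: Dict[str, List[str]] = {}
    for name, attrs in var_attrs.items():
        missing = [
            key
            for key in required
            if key not in attrs or str(attrs[key]).strip() == ""
        ]
        if missing:
            problems[name] = missing
    return problems


# Hypothetical example attributes, not taken from the CERRA test dataset.
example = {
    "t2m": {"long_name": "2 metre temperature", "units": "K"},
    "u10": {"long_name": "10 metre U wind component"},
    "sp": {"units": ""},
}
report = find_missing_attrs(example)
```

With a loaded xr.Dataset `ds`, the input mapping could be built as `{name: da.attrs for name, da in ds.data_vars.items()}`; variables that do not appear in the returned dictionary have all of the listed attributes set.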

@leifdenby
Member Author

leifdenby commented Mar 19, 2026

I've (temporarily) integrated mxalign's validation, and the validation passes (in the tests) using both mxalign and the validation logic currently in mlwp-data-specs (as of https://github.com/mlwp-tools/mlwp-data-specs/tree/b703bacd89854441316fb36c321aa03757ee233e, the current HEAD of main) 🥳

Given that the loaded dataset now passes validation from the perspective of both mxalign and mlwp-data-specs, I am going to merge this PR :)

@leifdenby leifdenby merged commit 91665d6 into main Mar 19, 2026
3 checks passed