Recipe test results for v2.12.0 #3916
great, many thanks @sloosvel 🍺
For the ones that fail because of the missing ERA5 facets: we should not use the drs … I know this is certainly not optimal. A possible solution would be to set (useless) default values for these facets here. Unfortunately, this is not a problem unique to ERA5; it could in principle happen for other data too (e.g., the …).
There seems to be a duplicated dataset that prevents concatenation.
OK, I am relaunching the recipes, editing the …
Thanks, Saskia! Just don't worry about the example recipes; those are either fine or absolutely FUBAR (like the Julia one). Also don't worry about those with HDF5 errors - we never managed to figure out why they fail; they are flaky, so at times they go through with no problems.
Ignoring the grib pool, all recipes that were failing because of the missing facets, except for 3, now run successfully. The results on the DKRZ machine have also been updated. Below is the new summary:

Recipe running session 2025-02-18

Recipes that failed due to missing data:
Recipes that failed due to diagnostic failures:
Recipes that failed due to NetCDF HDF errors:
Recipes that failed due to duplicated datasets that break concatenation:
Recipes that failed due to metadata issues that break concatenation:
Recipes that failed due to out of memory errors:
Recipes that failed due to timeout errors:
Recipes that failed because of a failure to convert units:
Here are the results from the comparison tool: compare_output_2.12.0rc1.txt. The following recipes need to be inspected by a human:
I have checked the output of …
@sloosvel the debug page https://esmvaltool.dkrz.de/shared/esmvaltool/v2.12.0rc1/debug.html still points to the first run you did; could you update it for the 18-Feb run, please? 🍺
It's the same directory; I just added the recipes that were re-run.
Looks like some small 4th-digit changes in the multi-model mean numbers for gier_2020bg. I assume this might be the case for more recipes where the figure is not reported but the .nc data files are.
Hmm, I am looking at the run dates and none is from 18 February, and e.g. stratosphere still looks failed (from 15 Feb): https://esmvaltool.dkrz.de/shared/esmvaltool/v2.12.0rc1/recipe_autoassess_stratosphere_20250215_144720/
@valeriupredoi Here are the successful results: https://esmvaltool.dkrz.de/shared/esmvaltool/v2.12.0rc1/recipe_autoassess_stratosphere_20250218_082909/
wunderpub! Just checked it, all looks great, and I checked the box. Thanks, Saskia 🍻
ocean_multimap has something going on very weirdly for just one model. Both recipes used …
Actually, there are some small changes also for …
Cheers @TomasTorsvik - I am rerunning this one with various Dask versions; maybe we'll catch the culprit there. If not, then I'm out of ideas - I don't know what's being run in this recipe; @ledm would be of help here.
recipe_iht_toa looks fine, I've checked the box.
Hi all, thanks for the checks. For …
Regarding recipe_impact.yml: it looks like something changed in how Iris loads bad data. Fix available in ESMValGroup/ESMValCore#2666. |
Regarding recipe_easy_ipcc.yml: it looks like the problem is caused by a newer version (v20230616 instead of v20191108, which was used for the v2.10 runs) of the NorESM2-MM ssp585 files, which uses different names for the latitude and longitude index coordinates ("first spatial index for variables stored on an unstructured grid") than the historical experiment ("cell index along first dimension"). I'll write a fix.
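For context, here is a minimal sketch of that failure mode and the kind of fix meant here; the file glob and the exact coordinate handling are my assumptions, not the actual ESMValCore fix:

```python
# Sketch: iris refuses to concatenate cubes whose coordinates disagree in
# metadata such as long_name, which is what happens when historical and
# ssp585 files name their unstructured-grid index coordinate differently.
import iris

# Hypothetical mix of historical (v20191108) and ssp585 (v20230616) files.
cubes = iris.load("tos_NorESM2-MM_*.nc")

OLD = "first spatial index for variables stored on an unstructured grid"
NEW = "cell index along first dimension"

for cube in cubes:
    for coord in cube.coords():
        # Harmonize the index coordinate's long_name across all cubes;
        # with mismatched long_names, concatenate_cube() raises an error.
        if coord.long_name == OLD:
            coord.long_name = NEW

result = cubes.concatenate_cube()
print(result.summary(shorten=True))
```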
@TomasTorsvik you were right - Dask is not the problem (poor Dask gets the default blame now haha), it's iris-cartopy-matplotlib; with iris=3.10 (and other deps below) I am getting this error:
relevant deps:
Let me try and see if anything comes out of the woodwork when I downgrade Matplotlib - matplotlib=3.10 is full of API and other changes.
OK, no luck - same plot even after I downgraded to cartopy==0.23 and matplotlib==3.9.4. Lower than that and I start to get headaches about how old our dependencies are; surely something among iris/cartopy/matplotlib changed since almost a year ago - but even so, we don't want to pin any of these to a one-year-old version. That's why running these recipes more often is a useful thing. At any rate, @TomasTorsvik, as a specialist - do the plots look bad? Or just different (which could be bad 😆)?
It's definitely something in ESMValGroup/ESMValCore#2457: I ran before and after the commit, the results differ, and the strange shift appears:
Ooh, we found us the smoking gun then! I am looking at a final possible culprit outside us, iris-esmf-regrid 0.11 build=1; I realized I've not tested with that.
Nope - I am confident to say that no possible or impossible combination of dependencies is able to produce the sane-looking plot: it results either in a crash (whether in the preprocessor or the diagnostic) or in a bizarre-looking plot, all this with esmvalcore fixed at the latest dev version. Saskia, Bouwe, it is us, i.e. esmvalcore - and that PR looks to be the cause indeed 🍺
In ESMValGroup/ESMValCore#2457 we switched from using the ESMF regridding scheme …
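Roughly, this is the kind of change being discussed - a minimal sketch comparing iris' built-in scheme with the iris-esmf-regrid one, assuming an area-weighted scheme; this is not the actual ESMValCore code from the PR, and the file names are placeholders:

```python
# Sketch of two regridding paths; file names are placeholders.
import iris
from iris.analysis import AreaWeighted
from esmf_regrid.schemes import ESMFAreaWeighted  # iris-esmf-regrid package

source = iris.load_cube("source.nc")  # cube on the model grid
target = iris.load_cube("target.nc")  # cube defining the target grid

# iris' built-in conservative scheme (requires contiguous bounds):
regridded_iris = source.regrid(target, AreaWeighted())

# ESMF-backed area-weighted scheme from iris-esmf-regrid:
regridded_esmf = source.regrid(target, ESMFAreaWeighted())
```

Swapping between two such schemes can change how masked cells and grid edges are treated, which could be consistent with the shifted-looking plots reported above.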
I think the root of the problem is an issue with the CMORization of MRI-ESM1: instead of having latitude and longitude depend on dummy indices (like i and j, or indices along a direction, which is what most models do), they depend on an actual coordinate system (rlat and rlon).
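A small sketch of how to see the structural difference being described, assuming a hypothetical file name:

```python
# Most curvilinear-grid models use dummy index dimension coordinates (i, j)
# plus 2-D latitude/longitude auxiliary coordinates; MRI-ESM1 instead uses
# rotated-pole coordinates (rlat, rlon) as the dimension coordinates.
import iris

cube = iris.load_cube("tos_MRI-ESM1.nc")  # hypothetical file

print([c.name() for c in cube.coords(dim_coords=True)])
# typical models: ['time', 'j', 'i']; here instead: ['time', 'rlat', 'rlon']
print([c.name() for c in cube.coords(dim_coords=False)])
# 2-D 'latitude' and 'longitude' auxiliary coordinates in both cases
```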
I'll write a fix then.
@valeriupredoi, @bouweandela, @sloosvel …
Looks like so, because …
Superb debugging work, folks! I am very happy to see this issue pinpointed now. I was quite worried that we'd have to pin one or more of our dependencies, and that would have added fuel to an already burning fire, since our env is already quite old. Sorry I wasn't communicating any faster; all my testing was being done on JASMIN, and she suffered quite a bit over the past few days from overcrowding 🍺
That Polar Cap is hefty on that plot - may look appealing to the US President 😆 Does that data need de-haloing?
I guess it should be OK, but I can test it for CORDEX data.
Thanks Tomas, this is also most probably fixed by ESMValGroup/ESMValCore#2672.
Regarding the sea ice: if you go back to the results of version 2.8.0 (which is the last version in which the recipe worked before 2.11.0), the results look more similar to version 2.12.0. So maybe something was wrong in version 2.11.0 that went unnoticed.
Vintage back in fashion - pls ignore me then, I have spoken out of my rear 😁
Here are the results for the second release candidate: https://esmvaltool.dkrz.de/shared/esmvaltool/v2.12.0rc2/

Recipe running session 2025-02-25

Recipes that failed due to missing data:
Recipes that failed due to diagnostic failures:
Recipes that failed due to NetCDF HDF errors:
Recipes that failed due to out of memory errors:
Recipes that failed due to timeout errors:
Recipes that failed because of a failure to convert units:
Here are the results of the comparison tool: compare_output_2.12.0rc2.txt. The following recipes need to be inspected.
Summary for some recipes whose output differs: #3916 (comment)

For recipe_ecs_scatter, it seems that some observational data is duplicated:

```
$ ls -l /work/bd0854/DATA/ESMValTool2/download/obs4MIPs/TRMM/v20160613
...
-rw-rw---- 1 b380103 bd0854 442424824 Jul 19  2023 pr_TRMM-L3_v7-7A_199801-201312.nc
-rw-rw---- 1 b380866 bd0854 571450152 Nov  8  2023 pr_TRMM-L3_v7A_200001010130-200001312230.nc
-rw-rw---- 1 b380866 bd0854 534585768 Nov  8  2023 pr_TRMM-L3_v7A_200002010130-200002292230.nc
-rw-rw---- 1 b380866 bd0854 571450152 Nov  8  2023 pr_TRMM-L3_v7A_200003010130-200003312230.nc
-rw-rw---- 1 b380866 bd0854 553017960 Nov  8  2023 pr_TRMM-L3_v7A_200004010130-200004302230.nc
-rw-rw---- 1 b380866 bd0854 571450152 Nov  8  2023 pr_TRMM-L3_v7A_200005010130-200005312230.nc
-rw-rw---- 1 b380866 bd0854 553017960 Nov  8  2023 pr_TRMM-L3_v7A_200006010130-200006302230.nc
-rw-rw---- 1 b380866 bd0854 571450152 Nov  8  2023 pr_TRMM-L3_v7A_200007010130-200007312230.nc
...
```

I guess we could just remove some of those files? One problem is also that those different files do not cover the same time range... This problem appears now due to ESMValGroup/ESMValCore#2448 (comment), so a solution would also be to specifically set

```yaml
drs:  # preferred syntax
  CMIP3: DKRZ
  CMIP5: DKRZ
  CMIP6: DKRZ
  CORDEX: BADC
  obs4MIPS: default
```

in the config file, but I would fix this properly. @axel-lauer, what do you think?
As far as I know, these are not duplicate files.
I think the problem is that obs4MIPs data files are loaded taking into account only the project configuration below (note the input_file patterns; see the sketch after the block):

```yaml
obs4MIPs:
  cmor_strict: false
  input_dir:
    default: 'Tier{tier}/{dataset}'
    ESGF: '{project}/{dataset}/{version}'
    RCAST: '/'
    IPSL: '{realm}/{short_name}/{freq}/{grid}/{institute}/{dataset}/{latest_version}'
  input_file:
    default: '{short_name}_*.nc'
    ESGF: '{short_name}_*.nc'
  output_file: '{project}_{dataset}_{short_name}'
  cmor_type: 'CMIP6'
  cmor_path: 'obs4mips'
  cmor_default_table_prefix: 'obs4MIPs_'
```

The files don't even have the same version (…).
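A small sketch of the consequence of that input_file pattern; the glob matching is simplified to fnmatch here, as an illustration rather than the actual ESMValCore lookup code:

```python
# With the default input_file pattern '{short_name}_*.nc', the glob ignores
# the version string in the file name, so v7-7A and v7A files all match.
from fnmatch import fnmatch

files = [
    "pr_TRMM-L3_v7-7A_199801-201312.nc",
    "pr_TRMM-L3_v7A_200001010130-200001312230.nc",
]
pattern = "pr_*.nc"  # '{short_name}_*.nc' with short_name='pr'
print([name for name in files if fnmatch(name, pattern)])
# -> both files match, and they overlap in time -> duplicated data
```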
Yes, I agree that this is bad, but we probably don't want to change this now, right before the release. This would technically be a breaking change that has the potential to break a lot of things. So how about we test the recipes now with

```yaml
drs:
  obs4MIPS: default
```

and fix this after the release? The data is present in our default OBS pool:

```
$ ls -l /work/bd0854/DATA/ESMValTool2/OBS/Tier1/TRMM-L3
total 864120
-rw-rw-r--+ 1 b380103 bd0854 442424860 Feb 22  2019 prStderr_TRMM-L3_v7-7A_199801-201312.nc
lrwxrwxrwx  1 b380103 bd0854        39 Mar  6  2019 prStderr_TRMM-L3_v7_7A_199801-201312.nc -> prStderr_TRMM-L3_v7-7A_199801-201312.nc
-rw-rw-r--+ 1 b380103 bd0854 442424824 Feb 22  2019 pr_TRMM-L3_v7-7A_199801-201312.nc
lrwxrwxrwx  1 b380103 bd0854        33 Mar  6  2019 pr_TRMM-L3_v7_7A_199801-201312.nc -> pr_TRMM-L3_v7-7A_199801-201312.nc
```

I don't think we need a new rc for that.
I just created an issue about that so we don't forget: ESMValGroup/ESMValCore#2677
Thanks @schlunma, the run was successful now! If there are no more comments, I'll proceed with the release.
@ESMValGroup/esmvaltool-developmentteam Here is an overview of the tests performed for releasing v2.12.0. The results are summarized in https://esmvaltool.dkrz.de/shared/esmvaltool/v2.12.0rc1/
Here is the conda environment: esmvaltool_v212rc1.txt
Recipe running session 2025-02-17
125 out of 163 recipes ran successfully and 38 recipes failed. The errors are summarized below.
Recipes that failed due to errors related to missing ERA5 grib dataset keys:
Recipes that failed due to diagnostic failures:
Recipes that failed due to NetCDF HDF errors:
Recipes that failed due to duplicated datasets that break concatenation:
Recipes that failed due to metadata issues that break concatenation:
Recipes that failed due to out of memory errors:
Recipes that failed due to timeout errors: