Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soil level clm surface input data for clm5.2.0 have missing values in large domains #2744

Closed
maritsandstad opened this issue Sep 6, 2024 · 53 comments · Fixed by #2500
Closed
Assignees
Labels
bug something is working incorrectly science Enhancement to or bug impacting science
Milestone

Comments

@maritsandstad
Copy link

Brief summary of bug

Input surface data for clm on spectral element grid ne30 (inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/surfdata_ne30np4.pg3_hist_1850_78pfts_c240216.nc) at least for clm5.2.0 have weird/missing values for parts of the grid.

Here is an example showing organic carbon summed over the soil levels and similar patterns of missing data seem to be present in all the soil level data I've tried plotting including PCT_CLAY, PCT_SAND:

ORGC_output_se_toplevel_map

This pattern shows up similarly in output datasets

General bug information

Plotting and regridding input data from surface input data for spectral element grid ne30 for clm5.2.0 (inputdata/lnd/clm2/surfdata_esmf/ctsm5.2.0/surfdata_ne30np4.pg3_hist_1850_78pfts_c240216.nc) data for variables on soil levels seem to be looking weird and missing data for what looks like some of the spectral element domains.

Above is an example plot of what organic carbon summed over soil levels looks like, but the pattern of missing data seems to be the same for all the soil level data that I have plotted so far. Looking at the LMR picture framing from https://journals.ametsoc.org/view/journals/mwre/134/12/mwr3360.1.xml I can almost see at least some of the domains that are missing, here is an attempt at colouring in the missing parts:

Screenshot 2024-09-06 151006

CTSM version you are using: [output of git describe]
clm5.2.0 ne30

Does this bug cause significantly incorrect results in the model's science? [Yes ]

Configurations affected: [Fill this in if known.]

Details of bug

See above

Important details of your setup / configuration so we can reproduce the bug

I regridded using the xesmf package following this excellent guide: https://ncar.github.io/esds/posts/2022/cam-se-regridding/

For reference here is my regridding code snippet

import numpy as np
import xarray as xr
import xesmf

def make_se_regridder(weight_file):
    weights = xr.open_dataset(weight_file)
    in_shape = weights.src_grid_dims.load().data

    # Since xESMF expects 2D vars, we'll insert a dummy dimension of size-1
    if len(in_shape) == 1:
        in_shape = [1, in_shape.item()]

    # output variable shape
    out_shape = weights.dst_grid_dims.load().data.tolist()[::-1]

    dummy_in = xr.Dataset(
        {
            "lat": ("lat", np.empty((in_shape[0],))),
            "lon": ("lon", np.empty((in_shape[1],))),
        }
    )
    dummy_out = xr.Dataset(
        {
            "lat": ("lat", weights.yc_b.data.reshape(out_shape)[:, 0]),
            "lon": ("lon", weights.xc_b.data.reshape(out_shape)[0, :]),
        }
    )

    regridder = xesmf.Regridder(
        dummy_in,
        dummy_out,
        weights=weight_file,
        method="bilinear",
        reuse_weights=True,
        periodic=True,
    )
    return regridder

def regrid_se_data(regridder, data_to_regrid):
    print(data_to_regrid.dims)
    if isinstance(data_to_regrid, xr.DataArray):
        print(type(data_to_regrid))
        updated = data_to_regrid.copy().transpose(..., "lndgrid").expand_dims("dummy", axis=-2)
    else:
        vars_with_ncol = [name for name in data_to_regrid.variables if "ncol" in data_to_regrid[name].dims]
        updated = data_to_regrid.copy().update(
            data_to_regrid[vars_with_ncol].transpose(..., "lndgrid").expand_dims("dummy", axis=-2)
        )
    regridded = regridder(updated.rename({"dummy": "lat", "lndgrid": "lon"}))
    return regridded

I used this file for mapping inputdata/lnd/clm2/mappingdata/maps/ne30pg3/map_ne30pg3_to_0.5x0.5_nomask_aave_da_c180515.nc

The regridding looked ok and completely sane for other variables that didn't have soil level, and the pattern persisted regardless of whether I regridded a variable with the soil level dimension, or if I summed over and thereby eliminated the soil level dimension before regridding. It also looked the same if I regridded the same output variables from the first history files from a run.

Important output or errors that show the problem

See above

@maritsandstad
Copy link
Author

I am happy to provide additional details or more plots.

@mvertens, @mvdebolskiy @rosiealice, here is my write-up of the soil/dust data issue. Perhaps you also know whether this needs to flagged for anyone in particular.

@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

Thanks for raising this issue, @maritsandstad.

I wondered if this was an issue with the regridding or the dataset, so I tried plotting with uxarray, which seems to confirm your concern. We'll have to look at this for the CTSM5.3 datasets that @slevis-lmwg and @ekluzek are working on finalizing now.

image

@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

visually, it looks like the same issue is present in the 5.3 surface data that @olyson is using for the ne30 spinup /glade/derecho/scratch/slevis/temp_work/new_rawdata/tools/mksurfdata_esmf/surfdata_ne30np4.pg3_hist_1850_78pfts_c240905.nc.

Here for max(PCT_SAND)
image

@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

spot checking other data on the surface dataset it looks like this is just an issue for the soil variables, which I confirmed are messing up the soil properties being calculated by the pedotransfer functions in CLM (below for WATSAT.
image

@wwieder wwieder added priority: high High priority to fix/merge soon, e.g., because it is a problem in important configurations bug something is working incorrectly priority: Immediate Highest priority, something that was unexpected science Enhancement to or bug impacting science and removed priority: high High priority to fix/merge soon, e.g., because it is a problem in important configurations labels Sep 6, 2024
@wwieder wwieder added this to the ctsm5.3.0 milestone Sep 6, 2024
@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

I think this should be addressed in #2500 before we make the 5.3 tag. It's also likely influencing dust emissions, @dmleung, assuming your using the ne30 grid in your testing?

Also pinging @dlawrenncar here too, as this will delay our release of 5.3

@maritsandstad
Copy link
Author

@wwieder, yeah I also just saw this for the soil variables, but I also saw the same for WATSAT from initial outputs from a testrun.

@mvertens
Copy link

mvertens commented Sep 6, 2024

@wwieder - can you look at any of the other unstructured surface datasets (e.g. ne30np4 - or any other refined grids) and see if the same problem arises. Is this just for ne30np4.pg3?

@billsacks
Copy link
Member

It would also be very helpful to know if soil color shows this issue; has anyone checked that?

@samsrabin
Copy link
Collaborator

@wwieder notes that it "doesn't look like an issue in older surface datasets created by mksurfdata_map (inputdata/lnd/clm2/surfdata_map/surfdata_ne30pg3_hist_16pfts_Irrig_CMIP6_simyr1850_c190227.nc)"

@samsrabin samsrabin added the blocker another issue/PR depends on this one label Sep 6, 2024
@samsrabin
Copy link
Collaborator

@wwieder - can you look at any of the other unstructured surface datasets (e.g. ne30np4 - or any other refined grids) and see if the same problem arises. Is this just for ne30np4.pg3?

Also worth checking to see if any structured-grid datasets are affected.

@billsacks
Copy link
Member

I just spent a while on a call with @mvertens . In addition to the ideas noted above, we agreed that the best next step would be to output the regridded mapunit created as an intermediate step in creating these soil variables. I can make a branch this morning that adds that output.

@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

Good questions
@billsacks soil color looks fine.

@samsrabin older datasets created with mksrfdata_map are OK
(e.g. inputdata/lnd/clm2/surfdata_map/surfdata_ne30pg3_hist_16pfts_Irrig_CMIP6_simyr1850_c190227.nc)

@samsrabin a quick look at different structured grids at different resolutions look fine.

@mvertens other unstructured grids also look like the one originally noted here, missing much of Africa an other regions around E Asia and W North America
ne30np4.pg2
ne3np4.pg3
ne120np4.pg3
ne16np4.pg3

all the 5.3 datasets I'm looking at are here:
/glade/derecho/scratch/slevis/temp_work/new_rawdata/tools/mksurfdata_esmf/surfdata_ne30np4.pg2_hist_1850_78pfts_c240905.nc

@ekluzek
Copy link
Collaborator

ekluzek commented Sep 6, 2024

@billsacks since this worked with mksurfdata_map which used ESMF mapping, but a different version of ESMF -- could this be an ESMF glitch for specific versions of ESMF? Or maybe there are different options we used for mapping of the soil data in mksurfdata_map? The soil dataset is different so it could be something about mapping from the new soil data grid to the model grid.

@billsacks
Copy link
Member

Thank you @wwieder - this is very helpful.

@ekluzek - @mvertens walked me through the regridding code for soil texture. The implementation is significantly different in mksurfdata_esmf compared with the old mksurfdata_map in some ways. But there are essentially two parts of what's going on: (1) regridding the mapunit and then (2) taking the regridded mapunit and from it creating and outputting the soil properties on the output grid. We felt that outputting the mapunit will let us narrow down the problem. It's especially interesting that soil color looks fine, because that uses a very similar (though not identical) algorithm to the regridding of mapunit for soil properties.

Anyway, I'll move ahead with adding the outputting of mapunit.

@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

This plan makes sense @billsacks checking that the mapunits are regridding correctly seems like the next step.

@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

FWIW the MPAS120 grid is NOT missing the regions in question (although plotting in uxarray look odd)

image

MPAS480 also seems to be OK

@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

The C96 grid is also affected

image

@mvertens
Copy link

mvertens commented Sep 6, 2024

@wwieder - thanks for looking at the other resolutions. It looks like the problem has to do with a cubed sphere grid (both C96 and spectral element resolutions are on a cubed sphere grid). So this analysis is really helpful.

@ekluzek
Copy link
Collaborator

ekluzek commented Sep 6, 2024

@mvertens and @billsacks thanks so much for digging into this, we really appreciate it! This will take much less time for the two of you to solve.

After you are done it might be good to show more about this to the CTSM SE so we can also get some knowledge on this area.

And @maritsandstad thanks for pointing this out as well -- really helpful for the community to let us know about things like this.

@billsacks
Copy link
Member

Oh, mapunits is already on the surface dataset. @wwieder can you please plot that for the ne30 surface dataset?

@dlawrenncar
Copy link
Contributor

One other thing to think about in terms of implications for ongoing runs is whether or not this is affecting the km-scale simulations that @briandobbins is working on. He is getting 'close' to a production-ish run and it would be good to know if this is impacting those super high-resolution runs.

@ekluzek
Copy link
Collaborator

ekluzek commented Sep 6, 2024

@dlawrenncar this is looking like it's an issue with the cubed sphere grid and NOT the mpasa grids (as mpasa120 looks OK from Will's plot above). By extension that likely means that mpasa15 and mpasa3p75 are probably OK. But, still not bad to check.

@billsacks
Copy link
Member

I checked an f09 surface dataset before and after the fix. Only 8 grid cells differ in their mapunits, of which 7 differ in the resulting PCT_SAND (I haven't checked other variables, but differences should be a subset of these 8 grid cells that differ in their mapunits).

@billsacks
Copy link
Member

The fix is here: slevis-lmwg#9

@billsacks
Copy link
Member

(I also did a run with the ne30 resolution where I changed the code to set mapunit equal to mapunit_value_max everywhere. By doing this, I confirmed that the code for determining mapunit_value_max is now working correctly: it was set to 32056 everywhere, as expected.)

@wwieder
Copy link
Contributor

wwieder commented Sep 6, 2024

I'll be curious if this also addresses #2502?

@billsacks
Copy link
Member

Yes, good point, this is almost certainly the culprit that explains #2502 - because the nature of this bug means that there would likely be different answers with different processor counts / decompositions.

@dlawrenncar
Copy link
Contributor

If this bug is going to affect the MPAS grids as well, can someone work with @briandobbins to create a new ultra high-resolution dataset for the runs that he is working on. There is some urgency on this, as the runs need to be completed quite soon. Not urgent as in this needs to be done before the weekend, but early next week preferably.

@ekluzek
Copy link
Collaborator

ekluzek commented Sep 6, 2024

@dlawrenncar yes we will do that right away. When we recreate all the datasets the mpasa15 and mpasa3p75 are done at the same time. And rerunning all the datasets will be done by early next week, it may or may not happen over the weekend. So as long as @briandobbins needs one of those we'll get it for him.

@dmleung
Copy link
Contributor

dmleung commented Sep 6, 2024

@wwieder, I was testing dust in the ne30 grid (#2732) using the same surface dataset yes. The new dust needs some tuning given all the model updates, but thus far I didn't see very significant impacts from the soil variables. The important variables that explicitly affect dust are PCT_CLAY and PCT_NAT_PFT etc. From the conversation here it seems PCT_CLAY is off, so I will need to double check on how this issue impacts my tests. Thanks for letting me know about this!

@briandobbins
Copy link
Contributor

In case this is helpful, here's what I see with my 3.75km dataset that I've been using with the ultra-high res CLM50 runs:

ds.mapunits.plot():
mpasa3p75 plot of mapunits

ds.mapunits.plot(clim=(-0.1,0.1)):
dobbins_3p75km_mapunits

@wwieder
Copy link
Contributor

wwieder commented Sep 9, 2024

@dmleung can you have a look at dust emissions from cases that use this surface dataset:
/glade/derecho/scratch/slevis/temp_work/new_rawdata/tools/mksurfdata_esmf/surfdata_ne30np4.pg3_hist_1850_78pfts_c240908.nc

@briandobbins there should also be 3 km mpas grid (surfdata_mpasa3p75_hist_2000_16pfts_c240908.nc) you can test out in the same directory

@briandobbins
Copy link
Contributor

@wwieder Thanks, that's looking much better to me - here's the clipped new one:
updated_3 75km_land_file

@wwieder
Copy link
Contributor

wwieder commented Sep 9, 2024

That looks more like planet Earth!

@jrbuzan
Copy link

jrbuzan commented Sep 13, 2024

Hi everyone,

Thanks for looking into this. Perhaps this is already fixed, but I also see this showing up in the BULK variable.

Cheers,
-Jonathan

BULK

@ekluzek
Copy link
Collaborator

ekluzek commented Sep 13, 2024

Hey @jrbuzan I used @wwieder instructions to plot the latest file:

fin = '/glade/campaign/cesm/cesmdata/inputdata/lnd/clm2/surfdata_esmf/ctsm5.3.0/surfdata_ne30np4.pg3_hist_1850_78pfts_c240908.nc'

And got this for the first layer which looks fine to me. Which layer did you plot above?

bokeh_plot

@billsacks
Copy link
Member

@jrbuzan - yes, this issue would be expected to impact the BULK variable, too.

@wwieder
Copy link
Contributor

wwieder commented Sep 13, 2024

Just to clarify, we're expecting incorrect BULK on 5.2 and older 5.3 surface datasets created with mksrfdata_esmf- but it should be corrected on the newest 5.3 datasets Sam created (or that @jrbuzan could create with corrections to mksrfdata_esmf that Bill made last week)?

@billsacks
Copy link
Member

Just to clarify, we're expecting incorrect BULK on 5.2 and older 5.3 surface datasets created with mksrfdata_esmf- but it should be corrected on the newest 5.3 datasets Sam created (or that @jrbuzan could create with corrections to mksrfdata_esmf that Bill made last week)?

Yes, that's what I'd expect - thanks for stating that more clearly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something is working incorrectly science Enhancement to or bug impacting science
Projects
None yet
Development

Successfully merging a pull request may close this issue.