-
Notifications
You must be signed in to change notification settings - Fork 321
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Script for modifying neon surface dataset #1375
Conversation
# if file not found run subset_data.py | ||
# Clean up imports for both codes... | ||
# Check against a list of valid names. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should add zbedrock
here, but note that the utilities could be extended to additional data on the surface data as they become needed (canopy height) or available (lai)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added zbedrock
functionality that we discussed in #1353 (comment) here:
CTSM/tools/contrib/modify_singlept_site_neon.py
Lines 512 to 518 in 0bf780e
zb_flag = False | |
if (obs_bot.iloc[-1]<rock_thresh): | |
f2['zbedrock']=obs_bot.iloc[-1]*100 | |
print (f2['zbedrock']) | |
zb_flag = True | |
I used zb_flag
for whether or not zbedrock
is updated: so if zbedrock
gets updated , the code updates the metadata to show zbedrock
is modified:
CTSM/tools/contrib/modify_singlept_site_neon.py
Lines 287 to 291 in 0bf780e
if zb_flag: | |
nc.attrs['Updated_fields'] = ['PCT_CLAY','PCT_SAND','ORGANIC','zbedrock'] | |
else: | |
nc.attrs['Updated_fields'] = ['PCT_CLAY','PCT_SAND','ORGANIC'] | |
Should we do anything special for LAI or Canopy height at this point? Maybe add it as a to-do item at the top of the code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're always planning on running NEON with BGC for now. if we ever get LAI data from NEON sites we can run in SP, but I don't see that happening soon.
|
||
import numpy as np | ||
import pandas as pd | ||
import xarray as xr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More broadly how does CTSM, CESM or CESM-lab handle package versions and python environments? Is this something that gets tested for python scripts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@wwieder these are all standard python packages that you get with ncar_pylib on cheyenne. There are two aspects to python environments, one is the list of python modules used in the python scripts, the other is in setting up the python environment that includes setting specific versions of those. The first is in the script itself, and the only thing that you can do is to check for a specific version of python and/or a python module. It would be good to at least add a check for the python version (this is something we do in cime for example). If you use something version specific in a python module you should do the same. I'm guessing that for this list of modules we aren't using anything version specific.
In terms of setting up the python environment that's outside the script. I looked and didn't see anything specific that CESMLab is doing in this regard, nor are they doing version checking inside the scripts (at least not yet). But, as cheyenne is our workhorse and ncar_pylib is sufficient I think this is fine as it is right now.
I will add an issue to have python scripts check for versions though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ekluzek, I mostly want to make sure the script works, even as ncar_pylib gets updated.
return parser | ||
|
||
|
||
def get_neon(neon_dir, site_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming that you also have a wrapper script do batch process all sites with one call, @negin513 ?
|
||
neon_file = os.path.join(neon_dir, site_name + "_surfaceData.csv") | ||
|
||
#-- Download the file if it does not exits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any way to check for when the file was created to know if a file exists but needs to be updated for some reason? Can you query the creation date of the NEON file and if it's later than the creation date of our local copy update to the 'new' data? This relates to NEONScience/NCAR-NEON#39
nc.attrs['Updated_with'] = os.path.abspath(__file__) | ||
nc.attrs['Updated_from'] = surf_file | ||
nc.attrs['Updated_using'] = neon_file | ||
nc.attrs['Updated_fields'] = ['PCT_CLAY','PCT_SAND','ORGANIC'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, add zbedrock
here too if the NEON observations don't make it down to 2 m depth.
fname_out = basename[:cend]+"_"+"c"+today_string+".nc" | ||
return(fname_out) | ||
|
||
def sort_print_soil_layers(obs_bot, soil_bot): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I love this function!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad to hear it. 👍😎🎉
carbon_tot = df['carbonTot'][bin_index[soil_lev]] | ||
#print ("carbon_tot:", carbon_tot) | ||
layer_depth = df['biogeoBottomDepth'][bin_index[soil_lev]] - df['biogeoTopDepth'][bin_index[soil_lev]] | ||
f2['ORGANIC'][soil_lev] = carbon_tot * bulk_den * 0.1 / layer_depth * 100 / 0.58 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we missing the conversion from g to kg OM at the endORGANIC = carbonTot * bulk_den * 0.1 / layer_depth * 100 /0.58 * 1e-3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe plot up old and new organic values for a few sites to check that values seem to be of similar magnitude?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @negin513 sorry this has been so confusing. Here's what I think we need do for ORGANIC on the surface dataset. We're taking these data from NEON:
carbonTot
(gC/kg soil) andbulkDensExclCoarseFrag
(g soil / cm3 soil) and need to calculateORGANIC
(kg OM/m3 soil, assuming 0.58gC/gOM)
To do this I think we just need:
- ORGANIC = carbonTot * bulkDensity / 0.58
- I think all the units converting between g-kg and cm3-m3 end up dropping out
- I also don't think we need depth, since units for ORGANIC are volumetric, not on an areas basis.
After doing this calculation with the raw NEON data, then you can select the CLM soil layer that corresponds to the NEON horizon, as you did for SAND and CLAY. let's stick with this for now, it's easier
- OR -
We can do a depth weighted average for values between NEON layers, unlike for texture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for confusion on this and I can provide more details, but it seems like the equation here should be:
ORGANIC = carbonTot * bulk_den / 0.58 (units would be kgOM/m3 soil)
Then we can select CLM layers that correspond to the matching NEON observations?
layer_depth = df['biogeoBottomDepth'][bin_index[soil_lev]] - df['biogeoTopDepth'][bin_index[soil_lev]] | ||
f2['ORGANIC'][soil_lev] = carbon_tot * bulk_den * 0.1 / layer_depth * 100 / 0.58 | ||
|
||
#TODO : max depth for neon sites from WW |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add this to #1353
Since this is specific to neon and under contrib this can come in anytime. |
@negin513 where is this PR at in terms of being ready to go? |
before this gets merged in do you want to correct the formula for |
@wwieder since I've checked in the new surface datasets as they are, we are going to go with what we have right now. But, we can have another future change with the changes you want for a subsequent tag. |
I used existing infrastructure to add descriptive strings to certain history fields that I had labeled by number in ESCOMP#1340. While doing this, I applied the change to a bunch of other history fields that needed it. Some variable names for pools was also changed to use terms consistent with the new names as well.
… it has to run in the directory where the script exists, and the system tools test don't allow that right now
…e starts with a . and is hidden, but seems to be correct, also add the two new tests to the standard test list
Description of changes
This script is for modifying the surface dataset at neon sites using data available from the neon server.
Specific notes
Contributors other than yourself, if any: @wwieder , @danicalombardozzi
CTSM Issues Fixed (include github issue #): #1353
Are answers expected to change (and if so in what way)? N/A
Any User Interface Changes (namelist or namelist defaults changes)? N/A
This script has the following user interface: