Script for modifying neon surface dataset #1375

Merged: 15 commits into ESCOMP:master on Jul 26, 2021

Conversation

negin513 (Contributor):

Description of changes

This script modifies the surface dataset at NEON sites using data available from the NEON server.

Specific notes

Contributors other than yourself, if any: @wwieder, @danicalombardozzi

CTSM Issues Fixed (include github issue #): #1353

Are answers expected to change (and if so in what way)? N/A

Any User Interface Changes (namelist or namelist defaults changes)? N/A

This script has the following user interface:

|------------------------------------------------------------------|
|---------------------  Instructions  -----------------------------|
|------------------------------------------------------------------|
This script is for modifying the surface dataset at NEON sites
using data available from the NEON server.

After creating a single point surface data file from a global 
surface data file using subset_data.py, use this script to 
overwrite some fields with site-specific data for neon sites.

This script will do the following:
- Download neon data for the specified site if it does not exist in the specified directory.
- Modify surface dataset with downloaded data (neon-specific).

-------------------------------------------------------------------
Instructions for running on Cheyenne/Casper:

Load the following into your local environment:
    module load python
    ncar_pylib

To remove NPL from your environment on Cheyenne/Casper:
    deactivate
-------------------------------------------------------------------
To see the available options:
    ./modify_singlept_site_neon.py --help
-------------------------------------------------------------------

optional arguments:
  -h, --help            show this help message and exit
  --neon_site SITE_NAME
                        4-letter neon site code.
  --surf_dir SURF_DIR   Directory of single point surface dataset. [default:
                        /glade/scratch/$user/single_point/]
  --out_dir OUT_DIR     Directory to write updated single point surface
                        dataset. [default:
                        /glade/scratch/$user/single_point_neon_updated/]
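For example, a typical invocation might look like the following (ABBY is one valid 4-letter NEON site code; the paths shown are the defaults listed above):

    ./modify_singlept_site_neon.py --neon_site ABBY --surf_dir /glade/scratch/$user/single_point --out_dir /glade/scratch/$user/single_point_neon_updated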

# if file not found run subset_data.py
# Clean up imports for both codes...
# Check against a list of valid names.

Contributor:

We should add zbedrock here, but note that the utilities could be extended to additional fields on the surface dataset as they become needed (canopy height) or available (LAI).

@negin513 (Contributor, Author) Jul 15, 2021:

I've added zbedrock functionality that we discussed in #1353 (comment) here:

zb_flag = False
if obs_bot.iloc[-1] < rock_thresh:
    f2['zbedrock'] = obs_bot.iloc[-1] * 100
    print(f2['zbedrock'])
    zb_flag = True

I used zb_flag to track whether or not zbedrock is updated: if zbedrock gets updated, the code updates the metadata to show that zbedrock was modified:

if zb_flag:
    nc.attrs['Updated_fields'] = ['PCT_CLAY', 'PCT_SAND', 'ORGANIC', 'zbedrock']
else:
    nc.attrs['Updated_fields'] = ['PCT_CLAY', 'PCT_SAND', 'ORGANIC']

Should we do anything special for LAI or canopy height at this point? Maybe add it as a to-do item at the top of the code?

Collaborator:

@wwieder, @negin513 has a question for you above. Since you normally want to run NEON with BGC, I don't think we care about this for now, but I'll let Will weigh in for sure.

Contributor:

We're always planning on running NEON with BGC for now. If we ever get LAI data from NEON sites we can run in SP, but I don't see that happening soon.


import numpy as np
import pandas as pd
import xarray as xr
Contributor:

More broadly, how do CTSM, CESM, or CESM-Lab handle package versions and Python environments? Is this something that gets tested for Python scripts?

Collaborator:

@wwieder these are all standard Python packages that you get with ncar_pylib on Cheyenne. There are two aspects to Python environments: one is the list of Python modules used in the scripts; the other is setting up the Python environment, which includes pinning specific versions of those modules. The first is in the script itself, and the only thing you can do there is check for a specific version of Python and/or of a Python module. It would be good to at least add a check for the Python version (this is something we do in cime, for example). If you use something version-specific in a Python module, you should do the same. I'm guessing that for this list of modules we aren't using anything version-specific.

In terms of setting up the Python environment, that's outside the script. I looked and didn't see anything specific that CESM-Lab is doing in this regard, nor are they doing version checking inside the scripts (at least not yet). But as Cheyenne is our workhorse and ncar_pylib is sufficient, I think this is fine as it is right now.

I will add an issue to have python scripts check for versions though.
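For reference, a minimal sketch of such an in-script Python version check (the minimum version here is an illustrative assumption, not a stated project requirement):

    import sys

    # Illustrative minimum; the project may require something different.
    MIN_PYTHON = (3, 6)
    if sys.version_info < MIN_PYTHON:
        sys.exit("This script requires Python %d.%d or later." % MIN_PYTHON)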

Contributor:

Thanks @ekluzek, I mostly want to make sure the script works, even as ncar_pylib gets updated.

return parser


def get_neon(neon_dir, site_name):
Contributor:

I'm assuming that you also have a wrapper script to batch process all sites with one call, @negin513?
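(For illustration, a hypothetical wrapper along those lines; the site list shown is just an example subset, not the real list:)

    import subprocess

    # Example subset of 4-letter NEON site codes; a real wrapper would
    # presumably loop over the full list of supported sites.
    neon_sites = ["ABBY", "BART", "BLAN"]

    for site in neon_sites:
        subprocess.run(["./modify_singlept_site_neon.py", "--neon_site", site],
                       check=True)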


neon_file = os.path.join(neon_dir, site_name + "_surfaceData.csv")

#-- Download the file if it does not exist
Contributor:

Is there any way to check when the file was created, to know if a file exists but needs to be updated for some reason? Can you query the creation date of the NEON file and, if it's later than the creation date of our local copy, update to the 'new' data? This relates to NEONScience/NCAR-NEON#39
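(A minimal sketch of one way to do this, assuming the NEON server returns a Last-Modified header for the CSV; the function and URL handling are hypothetical:)

    import os
    import email.utils
    import urllib.request

    def local_copy_is_stale(local_path, remote_url):
        """Return True if the remote file looks newer than our local copy."""
        if not os.path.exists(local_path):
            return True
        # HEAD request: fetch headers only, not the whole CSV.
        req = urllib.request.Request(remote_url, method="HEAD")
        with urllib.request.urlopen(req) as resp:
            last_modified = resp.headers.get("Last-Modified")
        if last_modified is None:
            return False  # server gives no date; keep the local copy
        remote_time = email.utils.parsedate_to_datetime(last_modified).timestamp()
        return remote_time > os.path.getmtime(local_path)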

nc.attrs['Updated_with'] = os.path.abspath(__file__)
nc.attrs['Updated_from'] = surf_file
nc.attrs['Updated_using'] = neon_file
nc.attrs['Updated_fields'] = ['PCT_CLAY','PCT_SAND','ORGANIC']
Contributor:

Again, add zbedrock here too if the NEON observations don't make it down to 2 m depth.

fname_out = basename[:cend] + "_c" + today_string + ".nc"
return fname_out

def sort_print_soil_layers(obs_bot, soil_bot):
Contributor:

I love this function!

negin513 (Contributor, Author):

Glad to hear it. 👍😎🎉

carbon_tot = df['carbonTot'][bin_index[soil_lev]]
#print ("carbon_tot:", carbon_tot)
layer_depth = df['biogeoBottomDepth'][bin_index[soil_lev]] - df['biogeoTopDepth'][bin_index[soil_lev]]
f2['ORGANIC'][soil_lev] = carbon_tot * bulk_den * 0.1 / layer_depth * 100 / 0.58
Contributor:

Are we missing the conversion from g to kg OM at the end?

    ORGANIC = carbonTot * bulk_den * 0.1 / layer_depth * 100 / 0.58 * 1e-3

Contributor:

Maybe plot up old and new ORGANIC values for a few sites to check that values seem to be of similar magnitude?

@wwieder (Contributor) Jul 14, 2021:

Hi @negin513, sorry this has been so confusing. Here's what I think we need to do for ORGANIC on the surface dataset. We're taking these data from NEON:

  • carbonTot (gC/kg soil) and
  • bulkDensExclCoarseFrag (g soil / cm3 soil) and need to calculate
  • ORGANIC (kg OM/m3 soil, assuming 0.58gC/gOM)

To do this I think we just need:

  • ORGANIC = carbonTot * bulkDensity / 0.58
  • I think all the unit conversions between g-kg and cm3-m3 end up dropping out (a unit check follows below this comment)
  • I also don't think we need depth, since units for ORGANIC are volumetric, not on an area basis.

After doing this calculation with the raw NEON data, you can then select the CLM soil layer that corresponds to the NEON horizon, as you did for SAND and CLAY. Let's stick with this for now, it's easier.

  • OR -

We can do a depth-weighted average for values between NEON layers, unlike for texture.
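(Editorially added for clarity, a quick unit check of the formula above:)

$$
\frac{\mathrm{g\,C}}{\mathrm{kg\,soil}} \times \frac{\mathrm{g\,soil}}{\mathrm{cm^3\,soil}}
= 10^{-3}\,\frac{\mathrm{g\,C}}{\mathrm{cm^3}}
= 1\,\frac{\mathrm{kg\,C}}{\mathrm{m^3}},
\qquad
\mathrm{ORGANIC} = \frac{\mathrm{carbonTot}\times\mathrm{bulkDens}}{0.58}\ \left[\frac{\mathrm{kg\,OM}}{\mathrm{m^3}}\right]
$$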

Contributor:

Sorry for the confusion on this, and I can provide more details, but it seems like the equation here should be:

ORGANIC = carbonTot * bulk_den / 0.58 (units would be kgOM/m3 soil)

Then we can select CLM layers that correspond to the matching NEON observations?
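(A minimal sketch, added for clarity, of how the assignment in the snippet above would change under this suggestion, reusing its variable names:)

    f2['ORGANIC'][soil_lev] = carbon_tot * bulk_den / 0.58  # kg OM / m3 soil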

layer_depth = df['biogeoBottomDepth'][bin_index[soil_lev]] - df['biogeoTopDepth'][bin_index[soil_lev]]
f2['ORGANIC'][soil_lev] = carbon_tot * bulk_den * 0.1 / layer_depth * 100 / 0.58

#TODO : max depth for neon sites from WW
Contributor:

I'll add this to #1353

@ekluzek (Collaborator) commented May 20, 2021:

Since this is specific to NEON and under contrib, this can come in anytime.

@ekluzek (Collaborator) commented Jun 29, 2021:

@negin513 where is this PR at in terms of being ready to go?

@ekluzek added labels on Jul 15, 2021: PR status: ready (this is ready to merge in, with all tests satisfactory and reviews complete); test: none (no tests required, e.g. tools/contrib); tag: support tools only; enhancement (new capability or improved behavior of existing capability)
@ekluzek ekluzek self-assigned this Jul 15, 2021
@wwieder (Contributor) commented Jul 22, 2021:

Before this gets merged in, do you want to correct the formula for ORGANIC, @negin513, and re-create the modified surface data? This also isn't critical, since we'll get to repeat the process once NEON provides the estimatedOrganicCarbon data from the megapit.

@ekluzek (Collaborator) commented Jul 22, 2021:

@wwieder since I've checked in the new surface datasets as they are, we are going to go with what we have right now. But we can make another future change, with the changes you want, in a subsequent tag.

ekluzek added 3 commits, July 23, 2021:

- I used existing infrastructure to add descriptive strings to certain history fields that I had labeled by number in ESCOMP#1340. While doing this, I applied the change to a bunch of other history fields that needed it. Some variable names for pools were also changed to use terms consistent with the new names.
- … it has to run in the directory where the script exists, and the system tools tests don't allow that right now
- …e starts with a . and is hidden, but seems to be correct; also add the two new tests to the standard test list
@ekluzek ekluzek merged commit 2e4f7d5 into ESCOMP:master Jul 26, 2021