Transient single-point dataset capability for subset_data #1673
Comments
This is potentially a useful feature to maintain, although I suspect it's not used often. Can a user still run a generic single-point case with global DATM inputs and the global land use time series? Would being able to subset the land use time series file for a single grid cell give users additional flexibility to configure their own specific land use time series? Are there urban configurations, especially with transient urban now enabled, where this would be helpful @olyson? |
One more note: didn't @swensosc's old subset_data script have a simple way of doing this that could be brought over into @negin513's new script? CTSM/tools/contrib/subset_surfdata Line 120 in 980b655
Again, this may not be high priority, but we can discuss at our CLM meeting next week in conjunction with a broader datasets conversation. |
You can use subset_data to subset from your landuse.timeseries file, but that only works if you aren't overriding the PFTs for your data (the same is true of the original subset_surfdata you show above). We do that for two sites, 1x1_numaIA and 1x1_brazil, so we can keep those two working. The issue is that in a normal tower site case you do override the PFTs, so the current transient capability only works if you happen to find a point in a global dataset that has the right PFTs to begin with. Using a higher resolution global grid may help, but you are still likely to get a mix of PFTs even at our finest resolutions. For the smallville site we constructed specific landuse changes that happen over a few years. We don't have any capacity to construct transient changes like that now. PTCLM also had the ability to do transient for the US-Ha1 tower site. It had a harvest that happened in 1946, and all the transient timeseries file does is add that one harvest. But that is a useful feature. Again, that's the kind of thing you have to construct rather than take from the global landuse.timeseries file. |
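To make the "constructed" case concrete, here is a minimal sketch of a time series that is zero everywhere except for a single harvest year, as described for US-Ha1. The function name, the year range, and the harvest amount are hypothetical and chosen only for illustration; this is not how PTCLM or subset_data actually builds the file.

```python
def build_harvest_series(first_year, last_year, harvest_year, harvest_amount):
    """Return (year, harvest) pairs with one non-zero harvest year.

    All names and values here are illustrative assumptions; the real
    landuse.timeseries file stores harvest in netCDF variables.
    """
    if not first_year <= harvest_year <= last_year:
        raise ValueError("harvest_year must lie within the simulated years")
    return [
        (year, harvest_amount if year == harvest_year else 0.0)
        for year in range(first_year, last_year + 1)
    ]


# Hypothetical usage: a single harvest in 1946 with a placeholder amount.
series = build_harvest_series(1945, 1947, 1946, 1.0)
```

The point of the sketch is that the transient information for such a site is tiny and site-specific, so it has to be supplied by the user and cannot be recovered from the global file.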
For urban and in general, single-point transient capability would be useful for troubleshooting problems encountered in global or regional simulations. I can't think of any specific urban use cases other than that. |
Hello!
The code corresponding to it is here: CTSM/python/ctsm/subset_data.py Lines 463 to 468 in 157f719
and here: CTSM/python/ctsm/site_and_regional/single_point_case.py Lines 338 to 395 in 157f719
|
@negin513 yes, as I say above, the ability to subset landuse.timeseries files exists. But it's not going to function in a usable way if you are overriding the PFTs for the site. The landuse timeseries file from the global dataset is going to have a different PFT distribution that won't line up with what you want to override it with. To both override the PFTs and allow a transient change in time, there needs to be a mechanism to not only override the PFTs, but also specify how they change in time. You might also want to specify the harvest for each year. As I say elsewhere, we can use this capability for some specific sites: 1x1_numaIA and 1x1_brazil (since we don't override the PFTs there). We can't use it for constructed transient changes like we do for 1x1_smallvilleIA and 1x1_US-Ha1. To catch the joke, Smallville IA is the place where Superman was raised, so it's not a real place; the PFTs and transient changes are completely made up. :-) |
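One way to picture the mechanism asked for above is a per-year PFT override that is checked for consistency before any file is written. The sketch below assumes a hypothetical in-memory layout (a dict from year to a dict of PFT index to percent cover); neither the layout nor the function name comes from subset_data.

```python
def validate_transient_pfts(pft_by_year, tolerance=1e-6):
    """Return (year, total) pairs for years whose PFT percentages do not sum to 100.

    pft_by_year maps year -> {pft_index: percent}. This layout is an
    assumption made for illustration, not the tool's actual input format.
    """
    problems = []
    for year in sorted(pft_by_year):
        total = sum(pft_by_year[year].values())
        if abs(total - 100.0) > tolerance:
            problems.append((year, total))
    return problems


# Hypothetical override: the second year is missing 10% of cover.
overrides = {
    1850: {1: 100.0},
    1851: {1: 60.0, 15: 30.0},
}
problems = validate_transient_pfts(overrides)
```

A check like this only covers internal consistency of the override; it says nothing about whether the result lines up with the global file, which is the mismatch described in the comment.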
Oh! I see what you mean here. Thanks for clarifying it.
Haha! I did not know about Smallville. I kept thinking it is a real place. 😄 |
In the ctsm SE meeting, we have decided on the following format for now:
An additional feature would be filling in the years if they don't exist using the previous line. |
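The fill-in behaviour suggested here could look roughly like the following sketch, which copies the most recent earlier entry into any year that is missing. The record layout (a dict per year) and the field name `pct_crop` are hypothetical stand-ins, since the agreed format is not reproduced in this thread.

```python
def fill_missing_years(rows, first_year, last_year):
    """Return a dict with an entry for every year in [first_year, last_year].

    rows maps year -> record (a dict). Missing years reuse a copy of the
    closest earlier record. The record layout is an illustrative assumption.
    """
    filled = {}
    previous = None
    for year in range(first_year, last_year + 1):
        if year in rows:
            previous = rows[year]
        elif previous is None:
            raise ValueError(f"no entry at or before {year} to copy from")
        filled[year] = dict(previous)  # copy so later edits stay per-year
    return filled


# Hypothetical input with 1851 and 1853 left out.
sparse = {1850: {"pct_crop": 0.0}, 1852: {"pct_crop": 50.0}}
dense = fill_missing_years(sparse, 1850, 1853)
```

Raising an error when the first year is missing, instead of guessing a default, is a design choice in this sketch; a real implementation might prefer to fall back to the surface dataset values.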
Thanks, @negin513 - that looks like a very good format. One minor detail is that I'd probably get rid of the spaces within a given area (like the |
@erik - I'm not sure what you mean by having the mksurfdata_esmf Makefile build single point datasets. |
This is not a requirement for mksurfdata_esmf it's a requirement for the subset_data tool. |
@ekluzek - thanks for clarifying. That makes sense. |
We talked about this some at the CTSM software meeting this morning as this is needed to create single-point transient datasets. @negin513 and I are meeting on this tomorrow. |
@negin513 and I met on this, and she has more comments coming. We worked out the UI for how this should work. She will also do the work needed for this. There is only one file that we need this for in surface dataset generation, so we can wait on it for later. |
@wwieder this was something that Negin was going to do, but obviously can't now. This is important for the CTSM5.2 in that there is one test dataset that needs this capability. Keeping this testing is important long term, but we maybe don't need to hold CTSM5.2 for it. I haven't looked into how long this would take to accomplish. But, do you have thoughts on if we should make it a requirement for CTSM5.2 or wait until post CTSM5.2? |
My feeling is that we want to have the capability long-term, but if it isn't in place for CTSM5.2 we can probably pretty easily put together the needed transient dataset(s) through a manual / one-off process. |
I agree with @billsacks here, this is something we want long term, but that doesn't need to hold up the CTSM5.2 development (or release). We can create the dataset needed for testing, with the understanding that at some point users will request this functionality with a modern code base. Should we close this issue with a 'won't fix' label (for now) or leave it open? |
@wwieder let's leave it open although I will put low priority for now. I needed to know what the plan for it was to know if it was something that CTSM5.2 should be held up for. So I'll adjust the CTSM5.2 project board as well. |
As a way to do this in the short term I'm going to initially implement this with a simple bash script using NCO and the older file: See |
Our current plan is for @slevis-lmwg to do the first step of this in a bash script. |
...sorry for my confusion about the card associated with this issue. I put it back where I found it. |
After talking to @slevis-lmwg we wanted to clarify the scope for this issue.
@ekluzek can you help clarify whether this assessment is accurate? If so, how critical is this capability before we bring in the CTSM5.2 tag? |
@wwieder unfortunately I think this is important for our testing, and so critical to do. If it was just regular transient time-series it probably wouldn't be a big deal. But, this is how we test both transient-lake and transient-urban. See the test directories that use smallville...
So this is important software testing, but also important testing of scientific features we need to keep working. However, as I write this I realize that the CTSM5.2 datasets already have transient lake and urban in them. So maybe we could remove those two tests (or modify them to do this with global datasets)? I'm pretty confident it's OK not to have a single-point test of transient flanduse_timeseries files, although long term we still want this capability. So possibly the task is to make sure we have tests that ensure transient lake and urban are working? There could also be tests to make sure you can turn just those features on. @wwieder and @slevis-lmwg what do you think? |
OK, so the issue is really focused on testing. This will help us decide how to prioritize this work for @slevis-lmwg. For testing purposes, it seems like this can be a one-off: we just need a tool to create the land use time series for point simulations that exercise lake, urban, (and other) features? I agree that testing is important and will defer to you, Sam and @olyson about the best way to ensure good testing coverage for the transient features we want to support with the CTSM5.2 dataset. |
Make reproducible by placing in a script (bash) and test by running the smallville tests from testlists. Do this to test on derecho:
|
Update
|
|
I have not found a 3rd smallville case to address if there is one. I updated Externals.cfg to dev159, ran ./manage_externals/..., and updated the 2 user_nl_clm files in the smallville testmod directories. The smallville tests PASS:
|
@slevis-lmwg it looks like we removed testing for this dataset. I'll look into that some more... Here is the previous file that was used: /glade/campaign/cesm/cesmdata/cseg/inputdata/lnd/clm2/surfdata_map/release-clm5.0.18/landuse.timeseries_1x1_smallvilleIA_hist_78pfts_CMIP6_simyr1850-1855_c190214.nc Note that it says 1850-1855, but it's really a constructed file that exercises specific landuse transitions in those 5 years, so it covers all the types of changes in a short test. Look into creating that file, and if it's easy enough we could add it back into our testing. |
The info from the above landuse file that needs to be replicated. Each line is a year (1850-1855):
|
I updated this script (originally created above). I also updated the /cropMonthOutput testmod's user_nl_clm, and this test now passes: |
The last test passed. I renamed the .sh script to modify_smallville.sh. |
TODO slevis
|
Copied the 3 landuse files and the corresponding fsurdat file:
to |
Workaround for transient Smallville tests #1673 + testing all new datasets
UPDATE |
In moving from mksurfdata.pl to using subset for single point datasets, one capability we have removed is the ability to do transient single-point datasets. We have this in place for smallville testing of dynamic landunits and we have one test for a tower site with transient landuse changes.
This relates to:
#1664
Definition of done: