Prescribed sowing date and maturity requirements #1863

Merged: 435 commits into ESCOMP:master from rx_crop_calendars2, Jul 27, 2023

Conversation

@samsrabin (Collaborator) commented Oct 5, 2022

Description of changes

This branch enables CLM to read in externally-prescribed crop sowing dates and "cultivar" maturity requirements (growing degree-days, GDDs). This has so far only been tested with static values, and the results indicate that yield performance is worsened. However, this capability is required by the GGCMI phase 3 / ISIMIP3 Agriculture protocol.

Briefly, the way this works is that an offline run is first performed with prescribed sowing dates and 364-day seasons. Instantaneous GDD accumulation is saved daily. A Python script then cross-references those daily outputs with a map of mean harvest dates to determine the mean accumulated GDDs in the growing season, saving the result as a file for use as prescribed maturity requirements.
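The cross-referencing step can be sketched roughly as follows. This is a minimal illustration, not the actual script from this branch: the function name, the nested-list layout of the daily output, and the 365-day (no-leap) calendar are all assumptions made for the example.

```python
def mean_gdd_at_harvest(gdd_accum, sow_doy, harv_doy):
    """Mean accumulated GDD on the target harvest day, per grid cell.

    gdd_accum[y][d][c] is the GDD accumulated since the latest sowing,
    saved on day d (0-based) of year y for cell c (hypothetical layout).
    sow_doy[c] and harv_doy[c] are 1-based days of year.
    Needs at least two years of daily output.
    """
    n_years = len(gdd_accum)
    result = []
    for c, (sow, harv) in enumerate(zip(sow_doy, harv_doy)):
        # A harvest day on or before the sowing day falls in the next year.
        year_offset = 1 if harv <= sow else 0
        # Skip the last sowing year so every harvest lies inside the record.
        seasons = [gdd_accum[y + year_offset][harv - 1][c]
                   for y in range(n_years - 1)]
        result.append(sum(seasons) / len(seasons))
    return result
```

Because the model run only has to supply daily accumulated GDDs, the same outputs can be re-read with a different harvest-date map without re-running the model.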

Having CLM use the GGCMI harvest date and save the GDDs accumulated between sowing and harvest would have been conceptually simpler but practically more complex, as it would have introduced more code that would need to be kept up-to-date as model I/O evolved. The chosen method also allows GDDs to be re-generated without another model run if target mean harvest date changes in the future. Finally, it removes the possibility of a naive model user prescribing harvest date in a model run, which would remove the needed ability of harvest to “float” with growing season temperature.

Another option might have been for the Python script to read sowing and harvest dates, then read specified climate files to determine the GDDs accumulated in each growing season. However, that won't work because bottom-of-atmosphere temperature is not what is used in CLM for calculating GDDs.

Specific notes

CTSM Issues Fixed:

Are answers expected to change (and if so in what way)?

Yes. I have left the capability to use the CLM-default sowing date and maturity requirement algorithms, but answers do change due to an order-of-operations change and the presence of diagnostic variables (see comment here).

Outdated:

  • Fixing a bug in the "last chance" planting check. Previously, the relaxed planting criteria were allowed on or after the last day of the planting window (within the same calendar year). The fixed version allows relaxed criteria only on the last day. This was fixed in tag ctsm5.1.dev101.
  • Previous versions of CLM allowed the reproductive (grain filling) phase to begin before the crop entered the vegetative (leaf expansion) phase. This version requires at least one timestep in the vegetative phase before moving on. Reverted this change.
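The "last chance" planting fix in the first item above can be illustrated with a small sketch. The function names are hypothetical and this is not the Fortran in CNPhenologyMod.F90; it only contrasts the two conditions as described.

```python
def relaxed_criteria_allowed_old(doy, window_end_doy):
    # Old behavior: relaxed criteria applied on or after the last day of
    # the planting window, for the rest of the calendar year.
    return doy >= window_end_doy


def relaxed_criteria_allowed_fixed(doy, window_end_doy):
    # Fixed behavior: relaxed criteria apply only on the last day itself.
    return doy == window_end_doy
```

With a window ending on day 300, day 301 passes the old check but not the fixed one.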

Any User Interface Changes (namelist or namelist defaults changes)?

New namelist options:

  • generate_crop_gdds: If true with use_cropcal_streams true, will only read sowing date input file. Harvests will occur the day before planting each year.
  • stream_fldfilename_sdate: Path to netCDF file with mapped sowing dates (day of year). Default empty.
  • stream_fldfilename_cultivar_gdds: Path to netCDF file with mapped maturity requirements (GDDs). Default empty.
  • model_year_align_cropcal: Simulation year that aligns with stream_year_first_cropcal value.
  • ignore_rx_crop_gdds: Set to true in order to override prescribed crop GDD requirements and instead use default CLM values/logic. For troubleshooting purposes only. Default false.
  • min_crop_gdd_target: Set minimum crop GDD requirement. Used only with prescribed GDD inputs from file. Default 1.0.
  • stream_year_{first,last}_cropcal: {First, last} year to loop over for crop calendar data. Defaults 1850, 2100.
  • stream_meshfile_cropcal
  • use_mxmat: (New as of merge commit 2022-10-17.) Whether to limit growing season length based on CFT parameter mxmat (days). Default true, but must be set to false when generate_crop_gdds is true.
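As a rough illustration of how these options interact, the sketch below checks the one constraint stated above (use_mxmat must be false when generate_crop_gdds is true) and renders `key = value` lines in the style of a user_nl_clm file. The helper name and the file path are hypothetical; this is not part of the CTSM build scripts.

```python
def render_cropcal_namelist(opts):
    """Return user_nl_clm-style lines after a basic consistency check."""
    # From the option list: use_mxmat defaults to true but must be false
    # whenever generate_crop_gdds is true.
    if opts.get("generate_crop_gdds", False) and opts.get("use_mxmat", True):
        raise ValueError("use_mxmat must be false when generate_crop_gdds is true")

    lines = []
    for key, value in opts.items():
        if isinstance(value, bool):
            text = ".true." if value else ".false."  # Fortran logical literal
        elif isinstance(value, str):
            text = f"'{value}'"
        else:
            text = str(value)
        lines.append(f"{key} = {text}")
    return "\n".join(lines)


gdd_generating = {
    "generate_crop_gdds": True,
    "use_mxmat": False,
    "stream_fldfilename_sdate": "/path/to/sdates.nc",  # hypothetical path
}
print(render_cropcal_namelist(gdd_generating))
```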

New output variables:

  • ${CPOOL}C_TO_FOOD_PERHARV (e.g., GRAINC_TO_FOOD_PERHARV): C to food per harvest (i.e., output has second dimension mxharvests) for the indicated pool. Should only be output annually.
  • ${CPOOL}C_TO_FOOD_ANN (e.g., GRAINC_TO_FOOD_ANN): C to food at harvest per calendar year for the indicated pool. Should only be output annually.
  • GDDHARV_PERHARV: GDD (technically HUI) needed to harvest. Per harvest; i.e., output has second dimension mxharvests. Should only be output annually.
  • SDATES_PERHARV: Sowing dates (day of year) associated with each harvest in a calendar year. Output has second dimension mxharvests. Should only be output annually.
  • SYEARS_PERHARV: Sowing years associated with each harvest in a calendar year. Output has second dimension mxharvests. Should only be output annually.
  • GDDACCUM_PERHARV: Accumulated growing degree days past planting date at each harvest. Output has second dimension mxharvests. Should only be output annually.
  • HUI_PERHARV: Accumulated heat unit index at each harvest. Output has second dimension mxharvests. Should only be output annually.
  • SOWING_REASON: Reason for each crop sowing. Output has second dimension mxsowings. Should only be output annually.
  • SOWING_REASON_PERHARV: Reason for sowing associated with each harvest. Output has second dimension mxharvests. Should only be output annually.
  • HARVEST_REASON_PERHARV: Reason for each harvest. Output has second dimension mxharvests. Should only be output annually.
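One way to sanity-check the per-harvest outputs against the annual ones is sketched below. It assumes, and this is not stated in the PR, that the annual value equals the sum over the mxharvests slots and that unused slots hold NaN; the function name is hypothetical.

```python
import math


def annual_total(per_harvest):
    """Sum the used harvest slots for one patch and calendar year.

    per_harvest: values along the mxharvests dimension, with NaN assumed
    to mark slots where no harvest occurred.
    """
    used = [v for v in per_harvest if not math.isnan(v)]
    return sum(used)
```

For example, a year with one harvest, `[120.0, nan]`, totals 120.0, which could then be compared against the matching `_ANN` variable.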

Testing performed

Methods
To test the code changes implemented in this work, I performed a set of CLM runs at approximately 2-degree resolution (2.5° longitude x ~1.9° latitude, or f19_g17) using the reanalysis-based GSWP3v1 dataset. Specifically, the component set used was HIST_DATM%GSWP3v1_CLM50%BGC‑CROP_SICE_SOCN_MOSART_SGLC_SWAV_SESP. These runs were designed primarily to test (a) the process of using model runs and postprocessing to generate maturity requirement files, (b) that read-in sowing date and maturity requirement files are being obeyed, and (c) that CLM outputs are minimally affected when using the new code but CLM’s native crop calendar algorithms (rather than the new prescribed inputs).

While these runs do simulate crop yield, I did not necessarily expect improvements in crop yield performance because so far I'm only using static prescribed sowing date and maturity requirement files. While CLM’s native crop calendar algorithms are imperfect, the fact that they allow variation over time could result in a performance advantage. Initial tests show that, indeed, yield performance is worse. Testing with mxmat-limited growing season length actually shows improved yield performance!
 
The model was spun up to equilibrium over 1802 years, forced with detrended 1901–1920 climate and no land use, using unaltered master-branch CLM code from the tag ctsm5.1.dev092. The next run segment continued with the same code and detrended climate forcings but 1850–1900 land use.
 
These spinup and land use initialization runs produced model states enabling historical-period runs to begin in 1901. These are illustrated and described below. (This figure is taken from the manuscript I'm composing, where I've simplified the branch names for clarity.)
[Figure: screenshot_5408]

The first historical run, the "original baseline," used a code version branched from ctsm5.1.dev092—branch yield_perharv2 (figure: additional_outputs), commit 95c6bbe, as in #1801. The only changes on that branch were the addition of output variables needed to enable and streamline analysis.

The code version from this PR's branch (specifically commit d255bf6; branch name in figure: new_cropcals) was used in a 1958–2014 “new baseline” run, also with CLM default sowing date and maturity requirement algorithms; initial conditions were taken from the original baseline run.
 
A new_cropcals “GDD-generating” run branched off from the new baseline to start in 1977. This used the GGCMI phase 3 sowing dates but with harvest set to occur 364 days later (i.e., the day before the next sowing). The purpose of this run was to generate daily outputs of accumulated GDDs, which my Python script would then cross-reference with the GGCMI phase 3 harvest dates to determine the average GDDs in the GGCMI-derived season. The GDD-generating run ended in 2010, with the outputs from the 1980–2009 growing seasons being used by my Python script.
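The day-of-year arithmetic behind "harvest 364 days later" can be written out as a small sketch. It assumes a 365-day no-leap calendar, and the function name is hypothetical.

```python
def harvest_day_for_gdd_run(sow_doy, days_in_year=365):
    """Return (year_offset, harvest_doy) for a harvest 364 days after sowing.

    sow_doy is a 1-based day of year; a no-leap calendar is assumed.
    year_offset is 0 if the harvest falls in the sowing year, else 1.
    """
    absolute_day = sow_doy + 364  # counted from day 1 of the sowing year
    year_offset = (absolute_day - 1) // days_in_year
    harvest_doy = (absolute_day - 1) % days_in_year + 1
    return year_offset, harvest_doy
```

Sowing on day 100 gives a harvest on day 99 of the following year, i.e., the day before the next sowing; sowing on day 1 gives a harvest on day 365 of the same year.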
 
Finally, the GGCMI sowing dates and the harvest requirements produced by generate_gdds.py were used in a new_cropcals “prescribed calendars” run branched from the new baseline at the beginning of 1958.
 
The start dates of the new_cropcals runs were chosen because they are three years ahead of two important years: 1961, the year in which the FAO yield data begin; and 1980, the first year in the ISIMIP3 calibration period. Initial tests showed three years to be sufficient for dissipation of any explainable but unwanted behaviors associated with switching between model code versions and/or crop calendar settings.

Results
I'll present results just for rainfed (rf) spring wheat. Even though it's a bit weird because it includes areas that should be winter wheat, this is the most geographically extensive of all the crop types, and it's pretty representative of the results for most crops. Figures include the 1980–2009 growing seasons unless otherwise indicated, where e.g. the "2009 growing season" is the season that began in 2009. These figures are pretty messy because I wanted to prioritize getting this PR completed; let me know if anything is unclear.

What are the seasonality changes, and are inputs being obeyed?

The new maturity requirements result in much greater geographical variation:
[Figure: gdd1_19_spring_wheat_gs1980-2009]

Checks in my Python scripts—both the script generating the new maturity requirements and the script analyzing run outputs—show that sowing dates are being obeyed perfectly. To illustrate (v0 is original baseline run, v1 is prescribed calendars run):
[Figure: sdate_0vs1_wheat_spring_rf]

Similar checks confirm that the new maturity requirements are also being obeyed within reasonable tolerances. The maximum HUI exceedance of the prescribed value in any crop is about 28.8 °C-days, which occurs in an extreme grid cell whose prescribed value is ~21 °C-days. To illustrate (including only seasons where the crop was harvested at maturity; "if mat.") :
[Figure: hui_ifmature_0vs1_wheat_spring_rf]
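A check of this kind might look like the sketch below. It is only an illustration under stated assumptions: the function name and flat-list inputs are hypothetical, and the real analysis scripts work on gridded model output.

```python
def hui_exceedances(hui_at_harvest, prescribed_gdd, harvested_at_maturity):
    """HUI at harvest minus the prescribed requirement, mature harvests only.

    All three inputs are equal-length sequences with one entry per harvest.
    """
    return [hui - target
            for hui, target, mature
            in zip(hui_at_harvest, prescribed_gdd, harvested_at_maturity)
            if mature]


excess = hui_exceedances([510.0, 300.0, 21.5],
                         [500.0, 400.0, 21.0],
                         [True, False, True])
# A harvest flagged as mature should never fall short of its requirement.
assert min(excess) >= 0
print(max(excess))  # largest overshoot, to compare against a tolerance
```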

As expected, crops are only harvested before reaching the prescribed requirement when a new prescribed sowing will happen the next day. In these runs, I disabled CLM's maximum growing season (gs) length (which is actually the only behavior when using prescribed crop calendars; I could make that optional). To illustrate, this figure shows the fraction of growing seasons in 1980–2009 harvested for each reason:
[Figure: harvest_reason_0vs1_wheat_spring_rf]

Average growing season lengths in the prescribed calendars run do not match the ISIMIP3 season lengths exactly (note the split in the colorbar is at CLM's maximum growing season length for the crop, mxmat):
[Figure: seas_length_ifmature_0vs1_wheat_spring_rf]

This can occur because of statistical variation, but is also due to the HUI “boost” that can occur when maximum leaf area index is reached before the HUI level associated with the end of the vegetative period is reached. The agreement between CLM's growing season length (days) under prescribed calendars and the ISIMIP3 crop calendars is similar to that of most other models that have completed ISIMIP3 runs:
[Figure: seas_length_compGGCMI_ifmature_median_diffExpected_wheat_spring_rf]

The tendency of CLM to have shorter-than-expected growing seasons, which happens in most crops, is consistent with the "HUI boost" explanation.

@ekluzek added the labels enhancement (new capability or improved behavior of existing capability), tag: enh - new science, and next (this should get some attention in the next week or two. Normally each Thursday SE meeting.) on Oct 5, 2022
@samsrabin marked this pull request as ready for review October 5, 2022 22:10
@billsacks self-assigned this Oct 6, 2022
@billsacks removed the next label Oct 6, 2022
@samsrabin force-pushed the rx_crop_calendars2 branch 2 times, most recently from 37f914b to d255bf6, October 17, 2022 16:39
New features:
* use_cropcal_streams now automatically set
* If desired, can specify only one of rx sowing dates vs. cultivar GDD requirements
* Can choose to ignore or use CFT max growing season length for any setup

@billsacks (Member) left a comment

@samsrabin - sorry it took me a couple of weeks to review this. Overall these changes look great. I really appreciate the care you take to write clear code. And I can tell that you have put thought into handling various edge cases and unusual situations – thanks! (I haven't carefully studied all of your new logic for prescribed crop calendars, especially related to the different harvest conditions, but it looks like you have thought this through carefully, based on the logic & comments I see. Let me know if you'd like a more detailed review of any of these pieces.) I also appreciate your detailed PR comments. And it sounds like you have done detailed testing, which is fantastic!

In terms of answer changes: Your two listed answer changes make sense to me, but I'm a bit surprised that either of these would actually cause differences in simulations. It sounds like you plan to test these, and it would be good to understand if these are truly the source of answer changes in simulations. If answer changes show up in testing: Given the size of the changes here, it would be best if we can separate the answer changes into their own tag. We should at least start by splitting out the known bug fixes into their own tag and then we can check to see if that explains all of the answer changes. If it will be a pain to split these out cleanly, then an alternative would be to hack together a one-off version that could be used to give bit-for-bit results. (I know that in one of your previous tags, it ended up taking a very long time to track down all of the sources of answer changes, and in retrospect it may not have been worth all this time. I'm hesitant to send you down more rabbit holes, but I'm also a bit nervous about having non-understood answer changes amongst such a large set of changes, because it's harder to have confidence that no bugs were introduced. I will somewhat defer to you here, though, and your own level of confidence in the correctness of your changes for standard/old cases without prescribed crop calendars.)

I only did a cursory review of the new stream module, because I'm not very familiar with streams. If you feel like it warrants a more careful review, then Erik would be the best person to ask for this review.

Besides my inline comments, one other change we should make is:

  • We should add one or more tests that exercise the new options and output variables. I think it would be good to have a multi-year test with a mid-year restart to ensure that all of the necessary variables are being saved to the restart file (it looks like you were careful in this respect, but it never hurts to check). (we are adding an issue to do this later)

Note that I have requested changes using checkboxes, based on the workflow described here: https://github.com/ESCOMP/CTSM/wiki/Pull-request-review-workflow . In contrast to what is stated there, I am comfortable with you checking things off yourself if you're fairly confident that you have addressed them. Or you can leave things for me to check off after a re-review.

Let me know if you'd like to talk through any of this.

Inline review comments (all resolved):

  • Externals.cfg (outdated)
  • bld/namelist_files/namelist_defaults_ctsm.xml (outdated)
  • bld/CLMBuildNamelist.pm (outdated)
  • src/main/clm_driver.F90 (2 comments, both outdated)
  • src/biogeochem/CNPhenologyMod.F90 (5 comments, 4 outdated)

@samsrabin (Collaborator, Author) commented

Latest push added Python scripts to (a) regrid the GGCMI sowing and harvest dates and (b) convert them to CTSM-compatible format.

@ekluzek (Collaborator) left a comment

Went over this with @samsrabin and approving.

@ekluzek ekluzek merged commit 31d2369 into ESCOMP:master Jul 27, 2023
@ekluzek ekluzek deleted the rx_crop_calendars2 branch July 27, 2023 21:22
@ekluzek ekluzek restored the rx_crop_calendars2 branch July 27, 2023 21:24
@samsrabin samsrabin added the science Enhancement to or bug impacting science label Aug 8, 2024
Labels: enhancement (new capability or improved behavior of existing capability); science (Enhancement to or bug impacting science)
Projects: Status: Ready to eat (Done!); Status: Done (non release/external)
4 participants