-
Notifications
You must be signed in to change notification settings - Fork 322
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Put in sanity checks for longwave radiation and other atmospheric forcing fields #96
Comments
Bill Sacks < sacks > - 2014-08-28 15:09:40 -0600 Andy, is this still an issue, or have you worked around it? If it's still an issue, we can return this to major importance. |
Andy Fox < afox > - 2014-08-28 16:10:29 -0600 Bill, |
Bill Sacks < sacks > - 2014-08-28 16:53:47 -0600 Okay, thanks for the reminder, Andy. I'm returning this to major importance. |
Erik Kluzek < erik > - 2014-11-20 10:56:45 -0700 Andy can you try either the latest code clm4_5_1_r096 or add in the bug fix for 1717 and see if it works for you now? |
Tim Hoar < thoar > - 2015-11-03 14:47:22 -0700 (In reply to Erik Kluzek from comment #4)
The bug fix for 1717 was put in LakeHydrologyMod.F90 in clm4_5_1_r096 and does not have the last two lines that bug 1717 indicates would be a good idea. Things apparently got moved around a bit for CLM4_5 in cesm_1_2_1, in that LakeHydrologyMod.F90 seems to be SLakeHydrologyMod.F90 - which is substantially bigger. I did my best to add the fix to I did a clean_build and then a build, verified that the new SLakeHydrologyMod.F90 was used and I it appears to have had no effect. I am still getting longwave balance errors, but my error message is slightly different than Andy's in that his pretty clearly indemnified LakeTemperature, mine does not: 63306 45:(shr_stream_set) size of filename = 1 |
Bill Sacks < sacks > - 2015-11-03 14:49:46 -0700 Adding Keith Oleson in case he has any insights. |
Keith Oleson < oleson > - 2015-11-03 15:56:57 -0700 The only thing I can say at the moment is that it doesn't look like Zack's original suggested bug fix for 1717 was put into the code. I.e., "I found one error that is not just a typo but also shows up in a rarely The next comment after that is from Erik regarding a longwave balance error and then suggested fixes for that from Sean. I don't know if the above from Zack is related to Tim's error though. But I guess you could try it. I will check with Erik/Sean to see why Zack's suggestion wasn't put in. Maybe Sean's changes took care of Zack's suggestion somehow. |
Tim Hoar < thoar > - 2015-11-03 16:08:15 -0700 (In reply to Keith Oleson from comment #7)
I need some clarification, sorry ... Are you saying I should back out the 20 or so lines from Sean that constitute the fix implemented in clm4_5_1_r096 in favor of Zack's one-liner: I already tried Sean's solution - and it did not help. |
Keith Oleson < oleson > - 2015-11-03 16:14:25 -0700 No, I was suggesting you could just substitute in Zack's one-liner. Leave |
Keith Oleson < oleson > - 2015-11-03 20:46:47 -0700 Tim, If you have a case I can clone easily I can take a closer look at this. It will probably have to wait until next week though as we are trying to diagnose problems with CLM5 within the coupled model this week. Keith |
Tim Hoar < thoar > - 2015-11-03 21:50:32 -0700 (In reply to Keith Oleson from comment #10)
my case directory is: I appreciate the help. I have several people who want to use the multi-instance capability with CLM45 - at least one is BGC, and right now I cannot get it to run |
Andy Fox < afox > - 2015-11-04 10:05:05 -0700 I never managed to figure out which DATMS were causing the issues either - but from what I recall I decided that it was certainly more than a couple, but by no means all of them. I think I speculated it was maybe 10 out of 80 DATMS caused this issue? |
Keith Oleson < oleson > - 2015-11-09 10:07:14 -0700 I'd like to try to replicate the original balance error. Will this script that Andy pointed to work for me? /glade/p/work/afox/DART/models/clm/shell_scripts/CESM1_1_1_setup_pmo_test I assume I would check out clm4_5_62 and then modify the above script for my directories/environment. |
Tim Hoar < thoar > - 2015-11-09 16:49:46 -0700 OK - I whittled it down to a single instance that (reliably) fails.
please clone away. Thanks. |
Keith Oleson < oleson > - 2015-11-09 21:33:14 -0700 It looks like what is happening is that a negative sabg (solar absorbed) for a lake point is calculated by the albedo routine, because the incoming diffuse solar radiation is negative. The sun is slightly above the horizon (coszen > 0) so the model tries to use the negative incoming solar. This causes a large value for lake surface temperature which generates a large outgoing longwave radiation (e+16) and the model can't quite compensate. I checked all of the incoming diffuse solar from all of the ensemble members and several of the ensemble members have negative values in at least one grid cell. Not sure why these forcing files worked with CLM4.0. Maybe there is a check in the CLM4.0 source code that is not in CLM4.5 anymore. Or maybe the CLM4.0 lake model is more forgiving of a negative solar radiation at a single time step than the CLM4.5 lake model is. I assume that the same version of the datm is used to drive both CLM4.0 and CLM4.5. I could see a few different fixes for this: 1) Simply zero out any negative values in the 6-hourly datm forcing files, 2) In the interpolation of 6-hourly values to 1/2 hourly in the datm code (or whatever time step CLM is using), zero out negative values, 2) Put a check into the SurfaceAlbedo code to zero out negative values before computing an albedo. It seems to me like the best solution is 1). Although maybe there should also be a check in CLM for negative solar radiation. I don't know why those forcing files would have negative solar radiation. Are those files generated by CAM? I'll see if I can figure out why CLM4.0 works and CLM4.5 doesn't...it might be a useful exercise... |
Tim Hoar < thoar > - 2015-11-10 09:53:50 -0700 Those files have origins in CESM/CAM and the CESM coupler. We do have to do some processing to concatenate them into annual files, but we don't do any 'math' on them, There are 1121 files of this nature. I'm not looking forward to the prospect of hunting down all those values and replacing them with zeros. Solutions 2 or 3 might be a bit faster given the timeframe I'm working with. I'd just as soon try to figure out where those are read and replace the negative values at that point. |
Keith Oleson < oleson > - 2015-11-10 11:26:23 -0700 Ok, I will see if I can figure out how to zero out negative values when the datm reads them in. |
Bill Sacks < sacks > - 2015-11-10 11:28:59 -0700 This discussion makes me feel like we should have some sanity checks in CLM (e.g., in atm2lndMod) that check for reasonable values of all of the forcing fields, so that problems like this can be caught more easily.... |
Keith Oleson < oleson > - 2015-11-10 11:44:27 -0700 Yes, I agree. One problem with forcing fields that appears every once in a while is negative incoming longwave radiation from the atmosphere. This manifests itself as a longwave balance error in the urban model because the urban model has an internal check on longwave balance during the time step. It's not obvious to users what the problem really is. Establishing lower bounds on some fields is probably not too hard, e.g., incoming solar and longwave, humidity, precip should not be negative. Establishing upper bounds would be more challenging. Might also be a challenge for other fields such as temperature where we could be simulating climates far in the past and far in the future that have values that we would consider to be extreme relative to present day. |
Bill Sacks < sacks > - 2015-11-10 11:55:38 -0700 We could tackle the easiest part first: make sure that things that should be non-negative truly are non-negative. |
Andy Fox < afox > - 2015-11-10 13:40:38 -0700 Its great you guys have figured this out. Thanks so much for doing this Keith. I must admit back in the day when I looked in to this I didn't concentrate on errors in the DATMs as they had worked just fine with CLM4.0... |
Keith Oleson < oleson > - 2015-11-10 14:56:26 -0700 Ok, I have a couple of suggested options. They are really just to get you going with your specific CLM45 simulations, I don't recommend either as permanent bug fixes because I think we need a software engineer who is familiar with the datm code to take a look at this. And as Bill suggested we may want to implement a more robust approach to bounding all of the forcing data under any situation, uncoupled or coupled to an active atmosphere model. The first is to modify this file as I've done in my SourceMods for your particular case: /glade/p/work/oleson/cesm1_2_1/scripts/clm45bgc_121_pmo2/SourceMods/src.share/shr_strdata_mod.F90 My changes are marked with "!KO": !KO This ensures that the solar radiation fields are greater than zero before the cosine normalization step to interpolate the 6-hourly data to half-hourly is performed. The cosine normalization is only done for solar fields so we don't have to check the field names here. The second is to modify this file: /glade/p/work/oleson/cesm1_2_1/scripts/clm45bgc_121_pmo2/SourceMods/src.share/shr_dmodel_mod.F90 Changes are: The advantage of this is that the check for negative solar radiation is done right after the data is read in. The disadvantage is that I'm looking for specific atm fields, the four incoming solar fields on your forcing files. I've tested both of these separately and they both work and get the simulation past the longwave balance error. If anyone can suggest a better temporary fix, let me know, thanks. |
Tim Hoar < thoar > - 2015-11-13 16:35:42 -0700 Keith, Bill, Andy; Thank you very much. I tried your first method (modifying shr_strdata_mod.F90) Again - Thanks very much and I'll keep you posted. |
Bill Sacks < sacks > - 2016-06-29 16:01:52 -0600 Looking back at this quickly, we want to do three things: (1) CAM should have some checks to not output negative values for some fluxes (2) DATM should have similar checks, which convert negative values to 0 (3) CLM should check incoming values for sanity, and die with a meaningful message if values are outside the physically possible range. (3) has come up a number of times recently. We should at least do this. |
Bill Sacks < sacks > - 2016-11-07 14:44:04 -0700 Renaming this based on what we still want to do. |
The error checks for unphysical values has been put into place. We check for negative values that shouldn't be (solar, spec-humidity, LW) as well as values that are NaN. So closing this. |
### Description of changes Remove an unused ESMF_Calendar statement. This was partially done in 97ed65a, this PR just completes the task ### Specific notes Contributors other than yourself, if any: CMEPS Issues Fixed (include github issue #): ESCOMP#96 Are there dependencies on other component PRs - [ ] CIME (list) - [ ] CMEPS (list) Are changes expected to change answers? - [ ] bit for bit - [ ] different at roundoff level - [ ] more substantial Any User Interface Changes (namelist or namelist defaults changes)? - [ ] Yes - [ ] No Testing performed: - [ ] (required) aux_cdeps - machines and compilers: - details (e.g. failed tests): - [ ] (optional) CESM prealpha test - machines and compilers - details (e.g. failed tests): Hashes used for testing: - [ ] CIME - repository to check out: https://github.com/ESCOMP/CESM.git - branch: - hash: - [ ] CMEPS - repository to check out: https://github.com/ESCOMP/CESM.git - branch: - hash: - [ ] CESM - repository to check out: https://github.com/ESCOMP/CESM.git - branch: - hash:
Andy Fox < afox > - 2014-02-18 19:50:11 -0700
Bugzilla Id: 1927
Bugzilla CC: erik, oleson, rfisher, sacks, thoar,
I am trying to move the CLM-DART framework from CLM4.0 to CLM4.5. We rely on using an ensemble of DATMs produced by CAM, these are couple history files with 6hrly values and 2 degree resolution and this has been working well with ICN compsets with a large number of model tags over the last couple of years.
I’m now working with clm tag 4_5_62. Using an ICN compset still causes no issues, however, moving either CLM4.5CN or CLM4.5BGC compsets is causing long wave balance errors for some of the ensemble members. I don’t know what fraction of the different DATMs cause errors, and which don’t, but it is the case that multiple different ones cause the error. I have been testing now with a single instance.
The following case
/glade/p/work/afox/cases/clm4_5_62_pmo_member_23
With cesm.log.140218-191607 in
/glade/scratch/afox/clm4_5_62_pmo_member_23/run
has an error that reads as
1: Opened file ./clm4_5_62_pmo_member_23.clm2.h0.2000-03.nc to write 1
15: errsoi incorporated into sensible heat in SLakeTemperature: c, (W/m^2):66916 7.812500000000000E-003
15: BalanceCheck: longwave energy balance error nstep = 5139 point =156436 imbalance = 163.744375 W/m2
15: clm model is stopping
15: ENDRUN: called without a message string
15:Abort(1) on node 15 (rank 15 in comm -1006632951): application called MPI_Abort(comm=0xC4000009, 1) - process 15
There is an initial warning about a sensible heat balance error from the lake code (line 1003 of clm4_5_62/models/lnd/clm/src/clm4_5/biogeophys/SLakeTemperatureMod.F90), then an endrun call from BalanceCheck.F90 with longwave radiation. Its not immediately apparent if these are definitely actually related.
This is 100% replicable, but replicating the error is somewhat complex as you have to set the user_nl_datm correctly set the stream files to point to the correct forcing files.
We have a script that will do this for single instance case that will fail with this error
/glade/p/work/afox/DART/models/clm/shell_scripts/CESM1_1_1_setup_pmo_test
The text was updated successfully, but these errors were encountered: