-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hard coded 3 hourly frequency in gocart2g #146
Comments
@bena-nasa I won't have the time this week to give a thoughtful look at this, sorry. But if we add H2O2 to the internal state in GOCART2G (this is I think the field which requires the alarm cycling) that will solve this problem? |
@pcolarco thanks for the fast response. I agree, based my quick look at the code, maybe adding h2o2 to the internal state could solve it. I'll just try adding that here real quick and see if it fixes the issue |
Oh boy, there are multiple things going on here. |
There was a reason why the 3 hour alarm was hardwired, as not to give the user the illusion that they could specify any other value. An easier solution may involve changing the way we handle these oxidants. The way this oxidant is "recycled" always apperead contrived in my opinion. So, stop trying to find a way to address this in code. There is no deep mandate to keep this algorithm. Let us discuss this in our aerosol group meeting. |
At the bare minimum - if the 3 hour cannot be changed and it is not safe to checkpoint at other intervals, then invalid start/checkpoint times must be detected and provide a clear failure message. Getting plausible but incorrect results is (almost) the worst possible failure. |
@tclune @amdasilva I tried for quite a while yesterday to fix this. I think there are two issues.
Looking at the code I thought this would be a very trivial modification. Unfortunately we have another case of ESMF alarms not functioning properly. I tried setting the initial ringtime to 0z of that day (indeed if you look at the code some did set a new time to be 0z of the current day, then it is not used a few lines later when the alarm is created). In the 4 hour run the alarm ran at 21z but did not ring at 0z. In the 2+2 run the alarm ran at 21z, then again at 23z when the model was restarted. ESMF is well aware of the alarm issues and has been rewriting the alarm algorithms to a point where I can test them. |
@tclune @amdasilva @rmontuoro
The issue however is that I tried exactly this and found that the ESMF_Alarm was not behaving properly as detailed in the previous comment. |
@amdasilva @pcolarco @tclune @weiyuan-jiang @rmontuoro As I said, I thought there were 2 issues, 1 the alarm was just being created in a bad way, and needed to be relative to the same time each day, not whenever you started the model! And the self%xhno3 needed to be saved to the internal state. I had tried both of these but the issues with the esmf alarm not behaving right after I "fixed" it to be relative to 0z of the current day when created meant this was doomed. I've been testing a new ESFM alarm branch where they have completely rewritten the alarm implementation to fix the issues with it, and indeed, now the alarm is behaving properly. (An alarm set to ring at 0z and every n hours after that does actually do that). I then tested the regression with the new ESMF branch + fixes to what I thought were the issues with the NI GridComp that I've detailed in several post in this issue But, despite this, it still did not pass a 2 hour + 2 hour regression with a 3 hourly alarm in the code. I even tried, running say a 2 step run starting at 21z vs 2 1 step runs! This still fails, the strange thing is that all the restarts are zero diff except the SU restart. If you let regression run longer than the alarm frequency everything goes non-zero diff. I noticed that su has the same bad logic for an alarm and has special first time the gridcomp executes, so I can't how this would regress either! I will fix that and report if that fixes the regression problems. Hopefully there is no more specious logic like this in the other grid comps :( |
@amdasilva @pcolarco @tclune @weiyuan-jiang @rmontuoro
|
@amdasilva @tclune would you like me to update the branch I referenced above, merge develop, and invent a little bypass to the ESMF alarms so we can just fix this so that you can do start/stop regression for as long as this 3 hourly logic exists? We really need to be able to do start/stop regression tests at something less than 3 hours, that is just hinderance to debugging and I have the solution with just a little more work as detailed above. There have been a couple people I've been called on to help trying to do start/stop regression who wasted a bunch of time before talking to myself or Matt because they were not aware of this "limitation" when doing such tests. If the problem were just fixed (and it's not hard to fix...) it would save others confusion and time in the future. |
@amdasilva @pcolarco @adarmenov @weiyuan-jiang @tclune @mathomp4
There is this issue reported in UFS that appears to involve gocart2g:
ufs-community/ufs-weather-model#1190
now Matt noticed that a 4 hour vs 2 + 2 start/stop regression was failing. We then remembered there is a hard coded 3 hourly alarm in gocart2g! In these spots:
https://github.com/GEOS-ESM/GOCART/blob/v2.0.6/ESMF/GOCART2G_GridComp/NI2G_GridComp/NI2G_GridCompMod.F90#L393
https://github.com/GEOS-ESM/GOCART/blob/v2.0.6/ESMF/GOCART2G_GridComp/SU2G_GridComp/SU2G_GridCompMod.F90#L480
If I change this to 2 hours, the 4 hour vs 2 + 2 regression passes. So it does seem possible this could be the problem for UFS. At the very least it seems like either whatever the point of this alarm is it either needs to be divisible by the job segment or perhaps we are not saving something in the internal state that needs to be checkpointed.
The text was updated successfully, but these errors were encountered: