-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
clean up logging #792
clean up logging #792
Conversation
… clause, or ensuring there is an endrun following.
Updated the metadata of the decompmicc parameter in the parameter file. See: ef9126a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good amount of cleanup; really nice work.
I couldn't tag this directly to the line in github, but was the following overlooked or intentionally left without a debug
flag check since it's only writing to the lead node?
fates/biogeochem/EDLoggingMortalityMod.F90
Lines 181 to 184 in 4acc184
if(logging_time .and. (is_master.eq.itrue) ) then | |
write(fates_log(),fmt) 'Logging Event Enacted on date: ', & | |
hlm_current_month,'-',hlm_current_day,'-',hlm_current_year | |
end if |
I did intentionally leave the logging event print-outs. They should be fairly sparse, and I figure it is useful for people using the logging module to have that verification of their events. Again, don't feel strongly here, I can also ecapsulate this in a debug. |
@rgknox this looks good and will certainly help reduce the logging event print-outs. After I went through this I looked at my logs to confirm a reduction, and this will capture some things. But there is one that is not addressed. In the CESM.log and LND.log I get a consistent "BalanceCheck, solar radiation balance error" , but from what I can tell this is outside of FATES here: So, I guess this BalanceCheck solar radiation error warning is a separate issue then, and I can approve this PR? Should I open a separate issue for that, or would this be on the HLM side? Example from recent LND.log: Example of some of that reporting from a recent CESM.log: |
updated this previous comment to reflect that the BalanceCheck solar radiation warning happens in CESM.log and LND.log |
I see that this messaging is like you say, on the CTSM side of the code. Those logging messages may be outside the scope of this PR, but this sounds like a potential discussion topic to bring up at the CTSM software meeting? |
That sounds like a good idea @rgknox. I added it to the agenda of tomorrow's ctsm software meeting. |
@rgknox and @glemieux sounds good. It is worth noting that this radiation imbalance is being ignored on the FATES side, so perhaps we should discuss the validity of this error tolerance on the FATES side as well. (Though I am not totally sure these are the same balance checks...) fates/biogeophys/EDSurfaceAlbedoMod.F90 Lines 968 to 988 in 2fa452a
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. I have one suggested change and two questions.
I also noticed that there are three current FatesWarn
calls and the first index used is 2 and not 1. I'm assuming the FatesWarn
call that used index=1
was removed?
end if | ||
! end_run | ||
write(fates_log(), *) 'fates NL tag not recognized:',trim(tag) | ||
!! call endrun(msg=errMsg(sourcefile, __LINE__)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to end the run if the tag isn't recognized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are currently passing in a tag that is not recognized! I don't know where this came from, but
I don't see this passed in E3SM:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing 'sf_anthro_suppression_def' was added at one point with the expectation it would be used, but the usage was never implemented. Lets remove this argument pass in the next API change on ctsm
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup that was me 🤦. As Jackie noted its part of ESCOMP/CTSM#1536. Since this was a ctsm-side only update to prep for future work I guess should have commented out those lines and the lines later on down the file to avoid this issue?
and
Co-authored-by: Gregory Lemieux <7565064+glemieux@users.noreply.github.com>
Good point about the indexing @glemieux . I might had started using 2, assuming 1 would be a "universal" or "generic" code, but that is zero. I can change this to start with one. Also, I need to change the loop in the finalize routine where the totals are printed, to start with zero and not 1. |
I believe all tests pass as expected.
|
Description:
This set of changes seeks to ensure that all write statements to logs, that are intended to "warn", are either:
FatesWarn() has a feature where each warning message is associated with an index. Once the specific warning has been written 100 times (or other desired value), it will stop reporting the warning. It will continue to count how many times the warning trips. It maintains these counts at the node level.
In some situations, such as the interface, we had also been requiring verbose output prior to an endrun. If the model hits an endrun prematurely, it is a failure, and thus all logging information is welcomed. So the verbose requirement was removed.
In a few situations we had been reporting errors that were fairly catastrophic without failing (such as encountering negative patch areas). So, endrun() was added to a few locations.
Fixes: #772
Fixes: #236
Fixes: #191
Fixes: #793
Collaborators:
@jkshuman
Expectation of Answer Changes:
This should not change answers
Checklist:
Test Results:
CTSM (or) E3SM (specify which) test hash-tag:
CTSM (or) E3SM (specify which) baseline hash-tag:
FATES baseline hash-tag:
Test Output: