-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove all refences of mct #74
Conversation
Update NorESM MOSART with the latest changes from ESCOMP.
updates to remove mct_mod and all other mct related files from share/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work, thank you. I think that there is a misuse of the PIO_BCAST_ERROR statement and I would like to see the code in comments removed.
call t_stopf ('lc_rof_export') | ||
|
||
!-------------------------------- | ||
! Check that internal clock is in sync with master clock | ||
! Check that internal clock is in sync with sync clock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be main clock not sync clock?
@@ -718,7 +745,7 @@ subroutine ModelAdvance(gcomp, rc) | |||
write(iulog,*)' mosart ymd=',ymd ,' mosart tod= ',tod | |||
write(iulog,*)' sync ymd=',ymd_sync,' sync tod= ',tod_sync | |||
rc = ESMF_FAILURE | |||
call ESMF_LogWrite(subname//" MOSART clock not in sync with Master Sync clock",ESMF_LOGMSG_ERROR) | |||
call ESMF_LogWrite(subname//" MOSART clock not in sync with sync clock",ESMF_LOGMSG_ERROR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe sync clock is intentional but I find it a bit confusing.
! check for stability | ||
! if(TRunoff%vr(iunit,nt) < -TINYVALUE .or. TRunoff%vr(iunit,nt) > 30) then | ||
! write(iulog,*) "Numerical error inRouting_KW, ", iunit,nt,TRunoff%vr(iunit,nt) | ||
! end if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we just remove the commented code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yES - I think that would be great. I'll do that.
else | ||
if(hr_ > rdepth_ + ((rwidth0_-rwidth_)/2._r8)*SLOPE1 + TINYVALUE) then | ||
deltahr_ = hr_ - rdepth_ - ((rwidth0_-rwidth_)/2._r8)*SLOPE1 | ||
! pr_ = rwidth_ + 2._r8*(rdepth_ + ((rwidth0_-rwidth_)/2._r8)*SLOPE1/sin(atan(SLOPE1)) + deltahr_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again I suggest we remove code instead of leaving it in comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agree!
nbas_chk = 0 | ||
nrof_chk = 0 | ||
do nr=1,rtmlon*rtmlat | ||
! !if (mainproc) write(iulog,*) 'nupstrm check ',nr,gmask(nr),nupstrm(nr),idxocn(nr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented code.
src/riverroute/RtmMod.F90
Outdated
call pio_initdecomp(pio_subsystem, pio_double, dsizes, compDOF, iodesc_dbl) | ||
call pio_initdecomp(pio_subsystem, pio_int , dsizes, compDOF, iodesc_int) | ||
deallocate(compdof) | ||
call pio_seterrorhandling(ncid, PIO_BCAST_ERROR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me why we are using PIO_BCAST_ERROR here as I see no error handling in the following code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/riverroute/RtmMod.F90
Outdated
call shr_sys_abort(trim(subname)//' ERROR areatot incorrect') | ||
endif | ||
|
||
! do nr = rtmCTL%begr,rtmCTL%endr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove commented code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/riverroute/RtmMod.F90
Outdated
|
||
call getfil(frivinp, locfn, 0 ) | ||
call ncd_pio_openfile (ncid, trim(locfn), 0) | ||
call pio_seterrorhandling(ncid, PIO_BCAST_ERROR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This demonstrates correct use of PIO_BCAST_ERROR - the method is set in line 2324, the error is checked in 2326 and the method is reset in 2332.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out.
src/riverroute/RtmSpmd.F90
Outdated
implicit none | ||
private | ||
|
||
#include <mpif.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we should change this to
use mpi
It would need to be moved above the implicit none statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I've actually updated things so that 'use mpi' is now used in the necessary two routines rather than in RtmSpmd.F90.
@mvertens thanks so much for putting these changes together! This is so helpful. I should be able to move these changes over to RTM as well. And @jedwards4b thanks for a nice review of this, appreciate that as well. And I agree with him about removing comments. I defer to him about the PIO specific things. |
@mvertens it sounds like you found this to be bit-for-bit right? I'm surprised because I would think answers would change going from MCT to ESMF? I suppose maybe the reason is that the sparse-matrix multiply is done the same between MCT and ESMF, just because it's a standard math problem? |
@erik it's bfb because the matrix multiply is all 1 or 0 and so there is no issue with respect to order of operations. |
…ROR in RtmMod.F90
@ekluzek - The reason that answers are bfb is that the mapping is only multiplying by 1. You are simply moving the water from one gridcell to another. Also - MCT is only used in RTM in the cpl/mct code. I'm doing another PR back to ESCOMP where I have removed this directory. So now MCT will not be in either RTM or MOSART. |
src/riverroute/RtmSpmd.F90
Outdated
@@ -50,7 +37,7 @@ subroutine RtmSpmdInit(mpicom) | |||
|
|||
! Get my processor id | |||
call mpi_comm_rank(mpicom_rof, iam, ier) | |||
if (iam == MAINTASK) then | |||
if (iam == 0) then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change this line? I think that the constant MAINTASK is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay - I'll move it back. My reasoning was fewer variable declarations.
@mvertens I browsed through this PR. I would be happy to meet with you for a tour of the changes if you still need feedback from me. My slevis@ucar calendar is up to date. |
Hello, We're still trying to balance competing interests on CTSM SE time. In the short term (before Feb. working group meetings) this doesn't seem as critical as other PRs (e.g., Dust work, CTSM5.2 datasets, and supporting PPE neeeds for CLM6 calibration). For now, I'm suggesting that @ekluzek and @slevis-lmwg focus their energy on these more critical LMWG PRs before focusing on the MOSART PRs, @mvertens, @jedwards4b, & @briandobbins. |
Brian and @wwieder and I met and hashed some plans for this. The plan we have now is that @slevis-lmwg and I will do this tag (and the RTM equivalent one) together next Wednesday. We'll hold off on the other tags until after LMWG, but Sam will take care of them afterwards. We were going to do the tags today, but with @wwieder direction above we decided to hold off. From what we could tell that should be acceptable to everyone, but Brian and @wwieder can negotiate something different if needed. Doing the set of tags isn't going to be that much time, but it's also distracting to our main priorities. And it seems that doing most of them after LMWG should be OK. @mvertens @jedwards4b @briandobbins |
…t on with code mods, this resolves ESCOMP#80
@ekluzek Please prioritize merging this PR as is. Do not continue to add code and features. You have added changes to a collaborators PR without consulting or discussing with that collaborator. This is, at best, a poor practice which can only serve to alienate contributors. |
Hi everyone,
If this last point is accurate, the tone of your last comment seems counter productive, @jedwards4b. I have to believe that @ekluzek and @slevis-lmwg are working to bring this PR to main efficiently and effectively and I think it's important that our virtual conversations on this platform are constructive in tone and content. |
@wwieder - thanks for your note. I'd like to clarify a few things.
|
@wwieder I do not think that it is a common practice for anyone to alter another persons PR without explicit permission of the original author. If you don't want to ask, an alternative would be to post a new PR to the existing PR for discussion. |
@mvertens @jedwards4b @wwieder Reply to specific points above...
I have done PR's to PR's before. And it does make sense when I'm suggesting changes that I need the author to review or might be controversial, or it's in a repository I'm not the maintainer for example. Then it's useful, it does slow down the process though. In this case as maintainer I didn't see the point in having you approve the changes. The impression I got Sunday was that you wanted this in quickly so I did something to expedite it and then was chastised (by Jim) for doing just that on Monday. So I'm in a no win situation... Here's a suggestion that @slevis-lmwg and I came up with yesterday... In CTSM we make "merge PR's" that merge several PR's in together and then possibly add changes on top of that. So in the future we can do that for these cases. I mean if I for example -- brought your PR in as is -- and then added changes in the next tag -- I don't see how that's objectionable? That next tag isn't necessarily something you have to approve...
@mvertens you are right that you checked in with @swensosc and he agreed. And I appreciate that you checked in with him, he is the one that needs to approve such a thing. However, @slevis-lmwg and I as code maintainers were not informed about the change -- other than by code inspection. I was concerned and asked at the meeting, where you told me you were NOT removing flooding -- but only removing a part that wasn't functional. In our followup with @swensosc he told me he did use flooding, but in a hacky way -- so it was important to get it back to the same state for his work.
The thing is that there was a TON of cleanup that came in here that is unrelated to removing MCT. @slevis-lmwg and I were neither informed nor asked permission about the inclusion of that cleanup. Nor was that cleanup provided in a separate PR, so we could assess it separately than the MCT removal. I think I'm good with most of it -- and above you see that I documented in issues about a lot of it. But, there was a lot of this done with no discussion with us. To list some of it: Again, I agree with most of it, but it's better if we were informed about it. If you need more communication from us on what we are doing -- in order to have a good collaboration we need more communication from you.
I know you did, and we appreciate that. And I'm glad we can rely on your doing testing. Something that would help though is for you to document what testing you did (in the PR for example). I ran into problems with the baselines running the full test suite, and since you didn't mention these issues -- I'm assuming you didn't run the full test list. But, knowing what testing was done is helpful for @slevis-lmwg and I as code maintainers. TLDR:
does the above list sound acceptable to everyone? |
I think @ekluzek feels this PR is ready to merge. He's proposing to do this as a merge tag like so Erik's changes are separated from the original PR.
|
I wanted to publicly acknowledge that each one of us brings different strengths to the group. I apologize if this is the wrong platform for this comment. I will remain brief, and I hope I don't get in trouble for it :-)
|
@ekluzek I see a software engineer who is overwhelmed with tasks and what I thought I was doing was to try to help reduce the burden while recognizing and supporting the work of a very important outside collaborator. I hope that we are working toward a solution where you allow others to take on some of these tasks - they may not always be done exactly the way that you would do them, but I think that we all care about developing quality software that can stand the test of time. I apologize for my blunt comments sometimes, perhaps I could have been more effective had I not exposed my personal frustrations. With respect to the suggestions above:
I'm not sure if I understand the necessity for this, it seems like it puts an unnecessary burden on you and or @slevis-lmwg - why not allow developers to test their own PR's (with a suitable test suite) and merge and create a tag for each one? |
@jedwards4b thanks so much for your last comment that calms the rhetoric down. I do appreciate the help. And I accept your apology it helps set a much better tone to work together in. As humans we do need to express our frustrations, but when working together we need to have positive ways of doing that. I know Brian Dobbins will be working on that within CSEG. And actually I wasn't seeing that anyone of the CTSM SE group was going to be able to do this piece that @mvertens did before CESM3. Not only that, but it would've taken any of us a ton more time than she was able to do it in. I would've needed to consult with the both of you to take it on, and it would've still taken me a lot longer even if I could work on it now. I also see the collaboration with NiO and other external groups to be very important for CESM. CESM should lean heavily into the "C" in the name, as a way to distinguish from E3SM and also to help prevent a split like that from happening. My hope is that we can bring in their work and take advantage of their advances as well as not having to have either of us have much difference in our repositories between us. So I'm glad to have NiO machines in the test list and hope that betzy will be added to ccs_config for example. As well as other additions that will bring in advancements for everyone. I also agree with @slevis-lmwg comments about @mvertens as well. She is able to do stuff like this amazingly quickly which is awesome and her superpower. And @slevis-lmwg I tend to follow the maxim "Public praise, private reproof", it's almost always good to give praise in a public forum, so I think you are right to do that in a forum like this in my view. On this question we will be discussing this with CTSM SE's this morning and get back to you...
|
@mvertens @jedwards4b and all. We discussed this in the CTSM SE group this morning. We have some proposals based on the discussion. And I'll take some time to also add some clarification on roles and responsibilities here. And by the way it's both OK and actually good to fight about these proposals. I'd like to have the standards set by the group now, to help avoid future friction. If we agree on group standards we can more easily hold each other accountable. Roles:
My thought is that we have a few tags that @slevis-lmwg and I need to finish off. But, once we get to @mvertens refactoring tag is the time that @mvertens could make the tag if she would like to take on that responsibility. @mvertens Jim volunteered you, but would you like the responsibility to make MOSART tags? We haven't made tags in MOSART for about 2 years, so obviously we aren't doing a ton of work here. I imagine @mvertens will be doing work at that point that we just need to know about, but could let her do tags for. The other development work is @swensosc work on the GW branch -- but as it's on a branch it's disconnected from the main-development. A sticky thing is being informed about timelines. Unfortunately we can't promise dates, but we can be responsible to inform about plans and expected timelines. And we could be expected to give updates to PR authors if those plans change. This is something that we could negotiate on to improve it for everyone, and we are open to ideas on what that would look like. And we are open to additions or requests for subtractions of some things above if someone thinks it's important. If the above sounds good we'll make some changes in github to implement the rules. And I'll find a place to make them visible within MOSART github pages... |
@ekluzek - thanks for putting this list together. These points all seem very reasonable to me. It would be helpful if this could be moved to a discussion rather than be associated with this PR. In terms of my tagging - I think you are referring only to my major refactorization of mosart - right? Could you please clarify that. |
Good idea @mvertens! I like that! I'll open up discussions and that's a great place for this to live longterm. I'm glad you find the list reasonable. Again happy to negotiate on specifics... Yes, in terms of tagging. I'm wondering if you'd like to be able to do your own tag/baselines for the refactoring PR? And you'll obviously have PR's after that point as well. So we could have you do your own tags afterwards as well. As I say above we would just need to let us know about it, approve it, and then give you the green light to make it. |
My understanding is that we need this PR to be merged before we can move forward on #76 - perhaps I misunderstood but I thought that this would be done by now. Please clarify the timeline for merging this and the other outstanding mosart PR's. |
@jedwards4b thanks for the question. I was doing some verification with three tests that were showing changes to answers. ERP_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-qgrwlOpts The first two had issues with history files in the baselines not output at the same frequency as the new updated tag. I reran them and was able to show that they are fine. The last one is an issue that @mvertens discusses in the introduction to this PR (and thank for including that discussion!). She points out they are roundoff level changes -- and I verified that as well even for a month long simulation (with only a few CTSM fields diverging by greater than roundoff after the month). It's not clear why using decomp_option="1d" shows a roundoff level change -- but I don't think we care enough to track it down. The code changes are large enough that it's not clear what the cause might be. I also did another test with the last decomp_option "basin" to verify it was OK, and it shows identical results to the baseline. So it is perfect. The other tests are roundrobin and they show identical results so they of course are fine as well.
So @slevis-lmwg and I will schedule to complete this tag as soon as we can. I've also started the discussions, and have a few discussions under https://github.com/ESCOMP/MOSART/discussions Please comment and suggest changes as needed. I plan to have it open for awhile to get feedback, and will update in response. Once, we've settled on it -- I'll lock the discussion. |
Thanks for this update, @ekluzek . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has passed the testing and is ready to come in. Thanks for the contribution! Really nice to bring in this cleanup and removal of MCT!
Add notes about mosart1_0_49 from @mvertens
This PR does the following:
do_rtmflood
and xml variableMOSART_FLOOD_MODE
. Also removed subroutineMOSART_FloodInit
inRtmMod.F90
which was never activated and in fact the model aborted if you tried to invoke it.Verified that this was no longer needed in consult with @swensosc.
Issues resolved:
Resolves #65
Resolves #75
Resolves #73
Resolved #85
Some work on: #86 and #87
Testing:
Ran the following tests on derecho for both this PR and master (these are now aux_mosart tests with this PR)
All tests passed and were bfb with master EXCEPT for