Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove all refences of mct #74

Merged
merged 24 commits into from
Feb 2, 2024
Merged

Conversation

mvertens
Copy link
Contributor

@mvertens mvertens commented Dec 14, 2023

This PR does the following:

  • removes all MCT references from the code and replaces them with ESMF routehandles and mapping calls
  • major changes to RtmMod.F90 along with other code cleanup described below
  • RtmVar
    • Now contains new ESMF data types needed for the MOSART mapping
       type(ESMF_Field)       , public :: srcField
       type(ESMF_Field)       , public :: dstField
       type(ESMF_RouteHandle) , public :: rh_dnstream
       type(ESMF_RouteHandle) , public :: rh_direct
       type(ESMF_RouteHandle) , public :: rh_eroutUp
  • RtmMod:
    • now have two new init phases for mosart. The first init phase is now called MOSART_init1 and replaces Rtmini. This has mostly what was there before but moves the creation of all routehandles to the second init phase - MOSART_init2 which must be called after the mesh has been read in. Also - moved the section of code for MOSART_init2 to be right below the section for MOSART_init1.
    • removed the mapping for Smatp_dnstrm since it was not used and there is no reason to create a map that is not needed. The associated code that was commented out for this has also been removed.
    • renamed RtmRun to MOSART_run
    • new indentation
  • MOSART_physics.F90
    • now using the computed routehandle rh_eroutUp
    • new indentation
  • Removed namelist variable do_rtmflood and xml variable MOSART_FLOOD_MODE. Also removed subroutine MOSART_FloodInit in RtmMod.F90 which was never activated and in fact the model aborted if you tried to invoke it.
    Verified that this was no longer needed in consult with @swensosc.
  • masterproc -> mainproc
  • updated the MOSART testlist for derecho and betzy (betzy is a NorESM platform) and added a PFS test

Issues resolved:
Resolves #65
Resolves #75
Resolves #73
Resolved #85

Some work on: #86 and #87

Testing:
Ran the following tests on derecho for both this PR and master (these are now aux_mosart tests with this PR)

ERP_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-qgrwlOpts
ERS_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.mosart-default
ERS_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.mosart-iceOff
PEM_D.f10_f10_mg37.I1850Clm50Sp.derecho_intel.mosart-inplacethreshold
PFS.f10_f10_mg37.I1850Clm50Sp.derecho_intel.mosart-default
SMS_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts
SMS_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-passChannelDepths
SMS.f10_f10_mg37.I1850Clm50Sp.derecho_intel.mosart-nobypass
SMS_Ld5.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.mosart-mosartCold
SMS_Lh3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.mosart-mosartOff
SMS_Lh3.f10_f10_mg37.I2000Clm51Sp.derecho_intel.mosart-clmAccelSpinupIgnoreWarn

All tests passed and were bfb with master EXCEPT for

  1. SMS.f10_f10_mg37.I1850Clm50Sp.derecho_intel.mosart-nobypass - this is an expected fail for both this PR and master
  2. SMS_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts produced a handful of roundofflevel
RMS lndExp_Flrr_volr                 1.7951E-21            NORMALIZED  2.0078E-18
RMS lndExp_Flrr_volrmch              9.9165E-24            NORMALIZED  1.8001E-20
RMS rofImp_Flrr_volr                 1.0219E-19            NORMALIZED  1.0561E-16
RMS rofImp_Flrr_volrmch              1.5927E-19            NORMALIZED  2.6501E-16
RMS rofImp_Forr_rofi                 1.4041E-21            NORMALIZED  4.6582E-15
RMS rofImp_Forr_rofl                 1.4094E-20            NORMALIZED  7.1437E-15

@mvertens mvertens requested a review from jedwards4b December 14, 2023 19:00
Copy link
Contributor

@jedwards4b jedwards4b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work, thank you. I think that there is a misuse of the PIO_BCAST_ERROR statement and I would like to see the code in comments removed.

call t_stopf ('lc_rof_export')

!--------------------------------
! Check that internal clock is in sync with master clock
! Check that internal clock is in sync with sync clock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be main clock not sync clock?

@@ -718,7 +745,7 @@ subroutine ModelAdvance(gcomp, rc)
write(iulog,*)' mosart ymd=',ymd ,' mosart tod= ',tod
write(iulog,*)' sync ymd=',ymd_sync,' sync tod= ',tod_sync
rc = ESMF_FAILURE
call ESMF_LogWrite(subname//" MOSART clock not in sync with Master Sync clock",ESMF_LOGMSG_ERROR)
call ESMF_LogWrite(subname//" MOSART clock not in sync with sync clock",ESMF_LOGMSG_ERROR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe sync clock is intentional but I find it a bit confusing.

! check for stability
! if(TRunoff%vr(iunit,nt) < -TINYVALUE .or. TRunoff%vr(iunit,nt) > 30) then
! write(iulog,*) "Numerical error inRouting_KW, ", iunit,nt,TRunoff%vr(iunit,nt)
! end if
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just remove the commented code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yES - I think that would be great. I'll do that.

else
if(hr_ > rdepth_ + ((rwidth0_-rwidth_)/2._r8)*SLOPE1 + TINYVALUE) then
deltahr_ = hr_ - rdepth_ - ((rwidth0_-rwidth_)/2._r8)*SLOPE1
! pr_ = rwidth_ + 2._r8*(rdepth_ + ((rwidth0_-rwidth_)/2._r8)*SLOPE1/sin(atan(SLOPE1)) + deltahr_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again I suggest we remove code instead of leaving it in comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally agree!

nbas_chk = 0
nrof_chk = 0
do nr=1,rtmlon*rtmlat
! !if (mainproc) write(iulog,*) 'nupstrm check ',nr,gmask(nr),nupstrm(nr),idxocn(nr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented code.

call pio_initdecomp(pio_subsystem, pio_double, dsizes, compDOF, iodesc_dbl)
call pio_initdecomp(pio_subsystem, pio_int , dsizes, compDOF, iodesc_int)
deallocate(compdof)
call pio_seterrorhandling(ncid, PIO_BCAST_ERROR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me why we are using PIO_BCAST_ERROR here as I see no error handling in the following code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

call shr_sys_abort(trim(subname)//' ERROR areatot incorrect')
endif

! do nr = rtmCTL%begr,rtmCTL%endr
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


call getfil(frivinp, locfn, 0 )
call ncd_pio_openfile (ncid, trim(locfn), 0)
call pio_seterrorhandling(ncid, PIO_BCAST_ERROR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This demonstrates correct use of PIO_BCAST_ERROR - the method is set in line 2324, the error is checked in 2326 and the method is reset in 2332.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out.

implicit none
private

#include <mpif.h>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should change this to
use mpi
It would need to be moved above the implicit none statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I've actually updated things so that 'use mpi' is now used in the necessary two routines rather than in RtmSpmd.F90.

@ekluzek
Copy link
Contributor

ekluzek commented Dec 14, 2023

@mvertens thanks so much for putting these changes together! This is so helpful. I should be able to move these changes over to RTM as well.

And @jedwards4b thanks for a nice review of this, appreciate that as well. And I agree with him about removing comments. I defer to him about the PIO specific things.

@ekluzek
Copy link
Contributor

ekluzek commented Dec 14, 2023

@mvertens it sounds like you found this to be bit-for-bit right? I'm surprised because I would think answers would change going from MCT to ESMF? I suppose maybe the reason is that the sparse-matrix multiply is done the same between MCT and ESMF, just because it's a standard math problem?

@jedwards4b
Copy link
Contributor

@erik it's bfb because the matrix multiply is all 1 or 0 and so there is no issue with respect to order of operations.

@mvertens
Copy link
Contributor Author

@ekluzek - The reason that answers are bfb is that the mapping is only multiplying by 1. You are simply moving the water from one gridcell to another. Also - MCT is only used in RTM in the cpl/mct code. I'm doing another PR back to ESCOMP where I have removed this directory. So now MCT will not be in either RTM or MOSART.

@@ -50,7 +37,7 @@ subroutine RtmSpmdInit(mpicom)

! Get my processor id
call mpi_comm_rank(mpicom_rof, iam, ier)
if (iam == MAINTASK) then
if (iam == 0) then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change this line? I think that the constant MAINTASK is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay - I'll move it back. My reasoning was fewer variable declarations.

@mvertens mvertens changed the title Feature/remove mct remove all refences of mct Dec 31, 2023
@mvertens mvertens requested a review from slevis-lmwg December 31, 2023 14:05
@slevis-lmwg
Copy link
Contributor

@mvertens I browsed through this PR. I would be happy to meet with you for a tour of the changes if you still need feedback from me. My slevis@ucar calendar is up to date.

@wwieder
Copy link

wwieder commented Jan 11, 2024

Hello,

We're still trying to balance competing interests on CTSM SE time. In the short term (before Feb. working group meetings) this doesn't seem as critical as other PRs (e.g., Dust work, CTSM5.2 datasets, and supporting PPE neeeds for CLM6 calibration). For now, I'm suggesting that @ekluzek and @slevis-lmwg focus their energy on these more critical LMWG PRs before focusing on the MOSART PRs, @mvertens, @jedwards4b, & @briandobbins.

@ekluzek
Copy link
Contributor

ekluzek commented Jan 12, 2024

Brian and @wwieder and I met and hashed some plans for this. The plan we have now is that @slevis-lmwg and I will do this tag (and the RTM equivalent one) together next Wednesday. We'll hold off on the other tags until after LMWG, but Sam will take care of them afterwards. We were going to do the tags today, but with @wwieder direction above we decided to hold off. From what we could tell that should be acceptable to everyone, but Brian and @wwieder can negotiate something different if needed. Doing the set of tags isn't going to be that much time, but it's also distracting to our main priorities. And it seems that doing most of them after LMWG should be OK. @mvertens @jedwards4b @briandobbins

@jedwards4b
Copy link
Contributor

@ekluzek Please prioritize merging this PR as is. Do not continue to add code and features. You have added changes to a collaborators PR without consulting or discussing with that collaborator. This is, at best, a poor practice which can only serve to alienate contributors.

@wwieder
Copy link

wwieder commented Jan 22, 2024

Hi everyone,
This is a bigger than expected PR for just removing the MCT cap, so I'm having a bit of trouble understanding all that's going on. I do think it's worth addressing the past few commits and comments, however.

  • Broadly, CTSM SE's commonly make changes as collaborators to external PRs when merging PRs. Is this not common across other CESM components or collaborative software development projects?
  • Specifically, it seems like @ekluzek reverted some of @mvertens code cleanup, specifically related to broken flooding features in MOSART. Erik could have asked Mariana to do this, but he went ahead a did it, (filing Remove non-working aspects of the FLOOD option #80?). Maybe this could have been done more clearly or could have involved other channels of communication?
  • The flooding features will still be broken with this PR, but by maintaining this part of the code it provides handles where @swensosc can get flooding to work. Maintaining this feature seems important for the science, albeit imperfectly, for now.
  • Finally, I can't tell if all the testing has been done or everything's checked out, but it seems like this PR is ready to go?

If this last point is accurate, the tone of your last comment seems counter productive, @jedwards4b. I have to believe that @ekluzek and @slevis-lmwg are working to bring this PR to main efficiently and effectively and I think it's important that our virtual conversations on this platform are constructive in tone and content.

@mvertens
Copy link
Contributor Author

@wwieder - thanks for your note. I'd like to clarify a few things.

  1. The code to remove MCT was not a cleanup project. It was to replace MCT functionality with ESMF.
  2. The flood code never worked in this PR. If you asked for it the model aborted. I checked with @swensosc before removing it - and he was in agreement that this should be done.
  3. It would have been helpful if changes to my PR had come in as a PR to my fork with the changes that @ekluzek wanted to have put in. I would have welcomed that approach to incorporating modifications.
  4. I did extensive testing was done on this PR and @slevis-lmwg approved these.

@jedwards4b
Copy link
Contributor

@wwieder I do not think that it is a common practice for anyone to alter another persons PR without explicit permission of the original author. If you don't want to ask, an alternative would be to post a new PR to the existing PR for discussion.

@ekluzek
Copy link
Contributor

ekluzek commented Jan 23, 2024

@mvertens @jedwards4b @wwieder

Reply to specific points above...

It would have been helpful if changes to my PR had come in as a PR to my fork with the changes that @ekluzek wanted to have put in. I would have welcomed that approach to incorporating modifications.

I have done PR's to PR's before. And it does make sense when I'm suggesting changes that I need the author to review or might be controversial, or it's in a repository I'm not the maintainer for example. Then it's useful, it does slow down the process though. In this case as maintainer I didn't see the point in having you approve the changes. The impression I got Sunday was that you wanted this in quickly so I did something to expedite it and then was chastised (by Jim) for doing just that on Monday. So I'm in a no win situation...

Here's a suggestion that @slevis-lmwg and I came up with yesterday...

In CTSM we make "merge PR's" that merge several PR's in together and then possibly add changes on top of that. So in the future we can do that for these cases. I mean if I for example -- brought your PR in as is -- and then added changes in the next tag -- I don't see how that's objectionable? That next tag isn't necessarily something you have to approve...

The flood code never worked in this PR. If you asked for it the model aborted. I checked with @swensosc before removing it - and he was in agreement that this should be done.

@mvertens you are right that you checked in with @swensosc and he agreed. And I appreciate that you checked in with him, he is the one that needs to approve such a thing. However, @slevis-lmwg and I as code maintainers were not informed about the change -- other than by code inspection. I was concerned and asked at the meeting, where you told me you were NOT removing flooding -- but only removing a part that wasn't functional. In our followup with @swensosc he told me he did use flooding, but in a hacky way -- so it was important to get it back to the same state for his work.

The code to remove MCT was not a cleanup project. It was to replace MCT functionality with ESMF.

The thing is that there was a TON of cleanup that came in here that is unrelated to removing MCT. @slevis-lmwg and I were neither informed nor asked permission about the inclusion of that cleanup. Nor was that cleanup provided in a separate PR, so we could assess it separately than the MCT removal.

I think I'm good with most of it -- and above you see that I documented in issues about a lot of it. But, there was a lot of this done with no discussion with us.

To list some of it:
-- Arbitrary change of the test list name
-- Replace PROTEC commenting directives
-- The change from masterproc to mainproc
-- Removal of extra implicit none statements
-- Replace instances of RTM for MOSART
-- Update I/O unit handling
-- Fix indentation
-- Remove commented out code

Again, I agree with most of it, but it's better if we were informed about it.

If you need more communication from us on what we are doing -- in order to have a good collaboration we need more communication from you.

I did extensive testing was done on this PR and @slevis-lmwg approved these.

I know you did, and we appreciate that. And I'm glad we can rely on your doing testing. Something that would help though is for you to document what testing you did (in the PR for example). I ran into problems with the baselines running the full test suite, and since you didn't mention these issues -- I'm assuming you didn't run the full test list. But, knowing what testing was done is helpful for @slevis-lmwg and I as code maintainers.

TLDR:
I propose the following...

  • Future tags will be done with a merge tag that includes several PR's and changes on top of that
  • If there are changes we need to do that we need review on -- we'll open a PR to the PR and/or discuss them beforehand
  • Please communicate with @slevis-lmwg and I about big changes (removal of FLOOD) sooner
  • Same for non-related cleanup changes as well
  • Prefer for non-related cleanup changes to go into a second PR (not required but desired)
  • Let us know what testing was done in the PR

does the above list sound acceptable to everyone?

@wwieder
Copy link

wwieder commented Jan 23, 2024

I think @ekluzek feels this PR is ready to merge. He's proposing to do this as a merge tag like so Erik's changes are separated from the original PR.

  • @mvertens do the changes he's layered on here seem OK to you?
  • It sounds like @briandobbins and CSEG are going to have a discussion about expectations of how PRs handled across components.

@mvertens
Copy link
Contributor Author

@wwieder - yes the changes are fine and I appreciate @ekluzek putting them in.
@ekluzek - thank you for your response to my points. In looking at the code some more - I now agree that there was more cleanup than I remembered. I also agree in general with your proposed suggestions.

@slevis-lmwg
Copy link
Contributor

I wanted to publicly acknowledge that each one of us brings different strengths to the group. I apologize if this is the wrong platform for this comment. I will remain brief, and I hope I don't get in trouble for it :-)

  • Efficiency in making updates stands out for me as @mvertens' strength here. Mariana, I very much appreciate and admire your ability to solve software issues so quickly.
  • I see @ekluzek's strength in this interaction and in general as ensuring that the CTSM software continue to operate according to a set of rules to prevent bigger problems later. I truly appreciate and admire how well Erik does this for the LMWG, while juggling numerous tasks simultaneously. Thank you, Erik.

@jedwards4b
Copy link
Contributor

@ekluzek I see a software engineer who is overwhelmed with tasks and what I thought I was doing was to try to help reduce the burden while recognizing and supporting the work of a very important outside collaborator. I hope that we are working toward a solution where you allow others to take on some of these tasks - they may not always be done exactly the way that you would do them, but I think that we all care about developing quality software that can stand the test of time. I apologize for my blunt comments sometimes, perhaps I could have been more effective had I not exposed my personal frustrations.

With respect to the suggestions above:

Future tags will be done with a merge tag that includes several PR's and changes on top of that

I'm not sure if I understand the necessity for this, it seems like it puts an unnecessary burden on you and or @slevis-lmwg - why not allow developers to test their own PR's (with a suitable test suite) and merge and create a tag for each one?

@ekluzek
Copy link
Contributor

ekluzek commented Jan 25, 2024

@jedwards4b thanks so much for your last comment that calms the rhetoric down. I do appreciate the help. And I accept your apology it helps set a much better tone to work together in. As humans we do need to express our frustrations, but when working together we need to have positive ways of doing that. I know Brian Dobbins will be working on that within CSEG.

And actually I wasn't seeing that anyone of the CTSM SE group was going to be able to do this piece that @mvertens did before CESM3. Not only that, but it would've taken any of us a ton more time than she was able to do it in. I would've needed to consult with the both of you to take it on, and it would've still taken me a lot longer even if I could work on it now.

I also see the collaboration with NiO and other external groups to be very important for CESM. CESM should lean heavily into the "C" in the name, as a way to distinguish from E3SM and also to help prevent a split like that from happening. My hope is that we can bring in their work and take advantage of their advances as well as not having to have either of us have much difference in our repositories between us. So I'm glad to have NiO machines in the test list and hope that betzy will be added to ccs_config for example. As well as other additions that will bring in advancements for everyone.

I also agree with @slevis-lmwg comments about @mvertens as well. She is able to do stuff like this amazingly quickly which is awesome and her superpower. And @slevis-lmwg I tend to follow the maxim "Public praise, private reproof", it's almost always good to give praise in a public forum, so I think you are right to do that in a forum like this in my view.

On this question we will be discussing this with CTSM SE's this morning and get back to you...

Future tags will be done with a merge tag that includes several PR's and changes on top of that

I'm not sure if I understand the necessity for this, it seems like it puts an unnecessary burden on you and or @slevis-lmwg - why not allow developers to test their own PR's (with a suitable test suite) and merge and create a tag for each one?

@ekluzek
Copy link
Contributor

ekluzek commented Jan 25, 2024

@mvertens @jedwards4b and all. We discussed this in the CTSM SE group this morning. We have some proposals based on the discussion. And I'll take some time to also add some clarification on roles and responsibilities here. And by the way it's both OK and actually good to fight about these proposals. I'd like to have the standards set by the group now, to help avoid future friction. If we agree on group standards we can more easily hold each other accountable.

Roles:

  • LMWG is responsible for maintenance of the river models in CESM (RTM, MOSART, mizuRoute)
  • LMWG NCAR chair (currently @wwieder) manages the scientific priorities of development within the river models, including management of SE activities in that space.
  • LMWG is responsible for scientific development in river models, with @swensosc being the scientific lead for MOSART scientific development
  • CTSM SE group will work with the LMWG chair on the needed SE developments in the river models and is responsible for: tag ordering, tags, testing-requirements, and testing baselines. This group will be the one that normally does these activities, however external collaborators can be given access with some training and oversight from CTSM SE group. Right now the ones in the CTSM SE group that will normally do this work is: @slevis-lmwg and/or @ekluzek.
  • External collaborators will submit PR's to river models and be informed on the ordering for them.
  • Someone from CTSM SE needs to approve PR's to river models if only so they are informed on what is happening.
  • The CTSM SE group is responsible to manage adding river model tags into CESM with the CSEG group testdb

My thought is that we have a few tags that @slevis-lmwg and I need to finish off. But, once we get to @mvertens refactoring tag is the time that @mvertens could make the tag if she would like to take on that responsibility. @mvertens Jim volunteered you, but would you like the responsibility to make MOSART tags? We haven't made tags in MOSART for about 2 years, so obviously we aren't doing a ton of work here. I imagine @mvertens will be doing work at that point that we just need to know about, but could let her do tags for. The other development work is @swensosc work on the GW branch -- but as it's on a branch it's disconnected from the main-development.

A sticky thing is being informed about timelines. Unfortunately we can't promise dates, but we can be responsible to inform about plans and expected timelines. And we could be expected to give updates to PR authors if those plans change. This is something that we could negotiate on to improve it for everyone, and we are open to ideas on what that would look like. And we are open to additions or requests for subtractions of some things above if someone thinks it's important.

If the above sounds good we'll make some changes in github to implement the rules. And I'll find a place to make them visible within MOSART github pages...

@mvertens
Copy link
Contributor Author

@ekluzek - thanks for putting this list together. These points all seem very reasonable to me. It would be helpful if this could be moved to a discussion rather than be associated with this PR. In terms of my tagging - I think you are referring only to my major refactorization of mosart - right? Could you please clarify that.

@ekluzek
Copy link
Contributor

ekluzek commented Jan 26, 2024

Good idea @mvertens! I like that! I'll open up discussions and that's a great place for this to live longterm. I'm glad you find the list reasonable. Again happy to negotiate on specifics...

Yes, in terms of tagging. I'm wondering if you'd like to be able to do your own tag/baselines for the refactoring PR? And you'll obviously have PR's after that point as well. So we could have you do your own tags afterwards as well. As I say above we would just need to let us know about it, approve it, and then give you the green light to make it.

@jedwards4b
Copy link
Contributor

My understanding is that we need this PR to be merged before we can move forward on #76 - perhaps I misunderstood but I thought that this would be done by now. Please clarify the timeline for merging this and the other outstanding mosart PR's.

@ekluzek
Copy link
Contributor

ekluzek commented Feb 1, 2024

@jedwards4b thanks for the question. I was doing some verification with three tests that were showing changes to answers.

ERP_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-qgrwlOpts
PEM_D.f10_f10_mg37.I1850Clm50Sp.derecho_intel.mosart-inplacethreshold
SMS_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts

The first two had issues with history files in the baselines not output at the same frequency as the new updated tag. I reran them and was able to show that they are fine.

The last one is an issue that @mvertens discusses in the introduction to this PR (and thank for including that discussion!). She points out they are roundoff level changes -- and I verified that as well even for a month long simulation (with only a few CTSM fields diverging by greater than roundoff after the month). It's not clear why using decomp_option="1d" shows a roundoff level change -- but I don't think we care enough to track it down. The code changes are large enough that it's not clear what the cause might be.

I also did another test with the last decomp_option "basin" to verify it was OK, and it shows identical results to the baseline. So it is perfect. The other tests are roundrobin and they show identical results so they of course are fine as well.

grep RMS /glade/derecho/scratch/erik/SMS_Lm1_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts.GC.mosart1__49_ctsm51d161delist/run/SMS_Lm1_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts.GC.mosart1__49_ctsm51d161delist.cpl.hi.0001-02-01-00000.nc.cprnc.out
 RMS lndExp_Flrr_volr                 4.9583E-19            NORMALIZED  1.7272E-16
 RMS lndExp_Flrr_volrmch              6.8211E-19            NORMALIZED  2.7513E-16
 RMS rofImp_Flrr_volr                 1.9708E-18            NORMALIZED  6.4934E-16
 RMS rofImp_Flrr_volrmch              2.0494E-18            NORMALIZED  7.8161E-16
 RMS rofImp_Forr_rofi                 9.3811E-20            NORMALIZED  6.3271E-14
 RMS rofImp_Forr_rofl                 3.4147E-21            NORMALIZED  1.2038E-15
derecho5 cases/SMS_Lm1_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts.GC.mosart1__49_ctsm51d161delist> grep RMS /glade/derecho/scratch/erik/SMS_Lm1_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts.GC.mosart1__49_ctsm51d161delist/run/SMS_Lm1_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts.GC.mosart1__49_ctsm51d161delist.mosart.h0.0001-01.nc.cprnc.out
 RMS areatotal2                       3.9280E-06            NORMALIZED  3.5060E-16
 RMS DIRECT_DISCHARGE_TO_OCEAN_ICE    3.8158E-15            NORMALIZED  1.1627E-14
 RMS DIRECT_DISCHARGE_TO_OCEAN_LIQ    1.4055E-15            NORMALIZED  5.5442E-16
 RMS RIVER_DISCHARGE_OVER_LAND_LIQ    2.1871E-13            NORMALIZED  8.7639E-16
 RMS TOTAL_DISCHARGE_TO_OCEAN_ICE     3.8158E-15            NORMALIZED  1.1627E-14
 RMS TOTAL_DISCHARGE_TO_OCEAN_LIQ     5.8561E-15            NORMALIZED  1.0563E-15
derecho5 cases/SMS_Lm1_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts.GC.mosart1__49_ctsm51d161delist> grep RMS /glade/derecho/scratch/erik/SMS_Lm1_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts.GC.mosart1__49_ctsm51d161delist/run/SMS_Lm1_D.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.mosart-decompOpts.GC.mosart1__49_ctsm51d161delist.clm2.h0.0001-01.nc.cprnc.out
 RMS CH4_SURF_AERE_SAT                1.1154E-12            NORMALIZED  3.0834E-05
 RMS CH4_SURF_DIFF_SAT                9.5995E-19            NORMALIZED  5.3827E-09
 RMS FCH4                             2.9203E-17            NORMALIZED  1.4428E-06
 RMS FCH4TOCO2                        2.9183E-14            NORMALIZED  9.2769E-07
 RMS NEM                              2.9224E-14            NORMALIZED  2.1793E-06
 RMS CONC_O2_SAT                      1.0804E-05            NORMALIZED  1.5833E-04

So @slevis-lmwg and I will schedule to complete this tag as soon as we can.

I've also started the discussions, and have a few discussions under

https://github.com/ESCOMP/MOSART/discussions

Please comment and suggest changes as needed. I plan to have it open for awhile to get feedback, and will update in response. Once, we've settled on it -- I'll lock the discussion.

@wwieder
Copy link

wwieder commented Feb 1, 2024

Thanks for this update, @ekluzek .

Copy link
Contributor

@ekluzek ekluzek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has passed the testing and is ready to come in. Thanks for the contribution! Really nice to bring in this cleanup and removal of MCT!

Add notes about mosart1_0_49 from @mvertens
@ekluzek ekluzek merged commit 1c22a8c into ESCOMP:master Feb 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
6 participants