Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

finalize 0-layer thermo and cesm ponds deprecation in CICE #787

Merged
merged 5 commits into from
Nov 16, 2022

Conversation

eclare108213
Copy link
Contributor

  • Short (1 sentence) summary of your PR:
    Removes all ifdef options allowing 'undeprecation' of 0-layer thermo and cesm meltponds, and updates documentation
  • Developer(s):
    @eclare108213 @dabail10
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    ENTER INFORMATION HERE
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?
    • Yes
    • No
  • Does this PR update the Icepack submodule? If so, the Icepack submodule must point to a hash on Icepack's main branch.
    • Yes
    • No but it will need to
  • Does this PR add any new test cases?
    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:

@apcraig
Copy link
Contributor

apcraig commented Nov 11, 2022

I ran a full test suite on cheyenne with 3 compilers including CICE-Consortium/Icepack#411 and everything passes and is bit-for-bit. This was done with #4935c98 in this PR and #926c338449 in the Icepack PR. Some additional changes were made after that, but I think only to documentation and hs0=0. Do we need to retest? See https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks#4935c98e1ff45eb1b7850cea5639028ef636c60d

@apcraig
Copy link
Contributor

apcraig commented Nov 11, 2022

I recommend we merge the Icepack PR when ready, CICE-Consortium/Icepack#411, then update Icepack in the PR before merging it.

@eclare108213
Copy link
Contributor Author

Setting hs0=0 as the default might have changed the answers. If so, I can fix it in the set_nml files, but I'm not sure which ones might have changed -- most likely those with topo ponds. That might just be alt03 -- recompile and test that one? There's alt03 and also a 'pondtopo' option in Icepack. I'm guessing I'll need to add hs0=0.03 to all of those files.

@apcraig
Copy link
Contributor

apcraig commented Nov 13, 2022

I retested with hs0=0. Ran a nearly full test suite on 3 compilers on cheyenne. The following runs seem to hang and then time out,

PEND cheyenne_intel_restart_gx3_8x4_debug_gx3ncarbulk_histall_histinst_iobinary_precision8 run
PEND cheyenne_intel_restart_gx3_32x1_debug_histall_histinst_ionetcdf run
PEND cheyenne_intel_smoke_gx3_1x1x5x4x580_dwblockall_gridc_reprosum_run10day run
PEND cheyenne_intel_smoke_gx3_1x1x5x4x580_gridc_reprosum_run10day run
PEND cheyenne_intel_smoke_gx1_32x1x16x12x40_cmplogrest_dwblockall_gridc_reprosum_run10day run
PEND cheyenne_intel_smoke_gx1_32x1x16x12x40_cmplogrest_gridc_reprosum_run10day run
PEND cheyenne_intel_smoke_gx1_1x1x320x384x1_droundrobin_run2day run
PEND cheyenne_intel_smoke_gx1_1x1x160x192x4_droundrobin_run2day run
PEND cheyenne_intel_smoke_gx1_1x1x80x96x16_droundrobin_run2day run
PEND cheyenne_intel_smoke_gx1_1x1x20x24x256_droundrobin_run2day run
PEND cheyenne_intel_smoke_gx1_1x1x16x16x480_droundrobin_run2day run

All other tests pass and are bit-for-bit vs main. It's also odd that it's just the intel compiler that struggles with a handful of tests that seem to relatively random. I believe the results are mostly reproducible, but not 100% sure. I don't know what hs0 does, but it's extremely unusual that the code hangs and times out. Not sure why that might be happening, again, just for intel for a few cases. How should we proceed. I think we either need to debug this or reset the default hs0 namelist value. Maybe there are other options.

@eclare108213
Copy link
Contributor Author

This is very odd, definitely not what I'd expect. Did you run a new baseline at the same time, and it didn't have these problems? The cases with debugging on don't throw errors? This seems like a cheyenne/intel problem to me. I had convinced myself that changing the initial values of hs0 in CICE and Icepack from 0.03 to 0 would be BFB for all tests, since hs0=0 in ice_in and icepack_in. So I don't understand what's going on.

My inclination is to reset the default values in the code to 0.03 and slightly modify these new documentation changes to be consistent in this PR, and then revise hs0 behavior as needed after discussion of #635 in a separate PR.

@apcraig
Copy link
Contributor

apcraig commented Nov 14, 2022

I agree it's odd. It could be a compiler issue. Let me do a bit more testing to confirm results and see if I can understand what's happening better.

@apcraig
Copy link
Contributor

apcraig commented Nov 14, 2022

I reran the suite with intel and it looks like everything is OK. I had some odd failures the last few days on cheyenne and even saw some repeated failures, but I think everything is OK again.

I think we should merge the Icepack PR, update this PR, then merge this if others agree.

@eclare108213
Copy link
Contributor Author

That's a relief, thanks for checking. How to best address the hs0 issue #635 isn't resolved yet, so I think it would be better to back out the hs0 initialization changes here (return to original values) but keep the documentation changes, modified to be consistent with the current state of the code. Then fix the hs0 behavior separately.

@apcraig
Copy link
Contributor

apcraig commented Nov 14, 2022

I agree, back out hs0 changes and fix separately after further discussion. Let me know when the Icepack and CICE PRs are ready again and I can review and/or test as needed. Thanks.

@apcraig apcraig marked this pull request as ready for review November 16, 2022 00:51
@apcraig apcraig merged commit 5a32f12 into CICE-Consortium:main Nov 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants