add timestamp to rpointer files #2757

Merged
32 commits merged into ESCOMP:master from the add_timestamp_to_rpointers branch
Dec 19, 2024

jedwards4b
Contributor

@jedwards4b jedwards4b commented Sep 12, 2024

Description of changes

Adds a timestamp to rpointer files in a backward-compatible manner.

Specific notes

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):

Are answers expected to change (and if so in what way)?
no
Any User Interface Changes (namelist or namelist defaults changes)?

Does this create a need to change or add documentation? Did you do so?

Submodules updated: Requires all of the following updates to work
cime6.1.49
share1.1.6
cmeps1.0.32

Testing performed, if any: will do regular

Things to do:

@wwieder
Contributor

wwieder commented Sep 12, 2024

Thanks Jim. Can this go onto b4b-dev, @ekluzek?

@ekluzek ekluzek self-assigned this Sep 12, 2024
@ekluzek ekluzek added the labels "enhancement" (new capability or improved behavior of existing capability) and "usability" (Improve or clarify user-facing options) Sep 12, 2024
@ekluzek
Collaborator

ekluzek commented Sep 12, 2024

Thanks @jedwards4b.

@wwieder yes, this totally makes sense as something to bring into b4b-dev. Since it is backward compatible, it doesn't need to be coordinated with other CESM tags or externals. So bringing it into b4b-dev and having it go into CTSM main-dev the next time a b4b-dev tag is made (in two weeks) makes a lot of sense.

@jedwards4b jedwards4b marked this pull request as draft September 19, 2024 20:05
@jedwards4b
Contributor Author

I've run into an issue here. On restart, clm_timemgr reads its clock information from the restart file, which makes this circular: we need the clock in order to find the restart file in the first place. Getting the clock from the restart file also isn't a requirement, since the driver has already set the clock.
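
For readers following along, here is a minimal sketch of what a backward-compatible lookup could look like once the date is taken from the driver clock instead of the restart file. It is Python for illustration only, not the Fortran in this PR. The `YYYY-MM-DD-SSSSS` suffix format and the helper names `rpointer_timestamp` and `find_rpointer` are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional


def rpointer_timestamp(year: int, month: int, day: int, seconds: int) -> str:
    """Format a model date as YYYY-MM-DD-SSSSS (assumed suffix format)."""
    return f"{year:04d}-{month:02d}-{day:02d}-{seconds:05d}"


def find_rpointer(rundir: Path, comp: str, timestamp: str) -> Optional[Path]:
    """Prefer the timestamped pointer file; fall back to the legacy name.

    The timestamp is expected to come from a clock that is already set
    (hypothetically, the driver's), never from the restart file itself.
    """
    stamped = rundir / f"rpointer.{comp}.{timestamp}"
    if stamped.is_file():
        return stamped
    legacy = rundir / f"rpointer.{comp}"  # name used before this change
    if legacy.is_file():
        return legacy
    return None
```

With this shape, an older run directory that only contains `rpointer.lnd` still restarts, while a newer one picks the pointer file that matches the current model date.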

@wwieder wwieder added this to the cesm3_0_beta04 milestone Sep 26, 2024
@jedwards4b jedwards4b force-pushed the add_timestamp_to_rpointers branch from 463bf55 to 25efa68 Compare September 26, 2024 20:57
@jedwards4b jedwards4b marked this pull request as ready for review September 26, 2024 20:59
@jedwards4b jedwards4b force-pushed the add_timestamp_to_rpointers branch from 25efa68 to 9f07cf9 Compare September 26, 2024 21:15
@jedwards4b
Contributor Author

I have tested with ERS.ne30pg3_t232.BLT1850.derecho_intel.allactive-defaultio
and plan to do a complete set of cesm prealpha tests.

@samsrabin
Collaborator

We discussed this at the CTSM SE meeting this morning and decided it would be in our cesm3_0_beta04 tag, which fits with @jedwards4b's timeline.

Update surface datasets, CN Matrix, CLM60: excess ice on, explicit A/C on, crop calendars, Sturm snow, Leung dust emissions, prigent roughness data

Purpose and description of changes since ctsm5.2.005
----------------------------------------------------

Bring in updates needed for the CESM3.0 science capability/functionality "chill". Most importantly bringing
in: CN Matrix to speed up spinup for the BGC model, updated surface datasets, updated Leung 2023 dust emissions,
explicit Air Conditioning for the Urban model, updates to crop calendars. For clm6_0 physics these options are now
default turned on in addition to Sturm snow, and excess ice.

Changes to CTSM Infrastructure:
===============================

 - manage_externals removed and replaced by git-fleximod
 - Ability to handle CAM7 in LND_TUNING_MODE

Changes to CTSM Answers:
========================

 Changes to defaults for clm6_0 physics:
  - Urban explicit A/C turned on
  - Snow thermal conductivity is now Sturm_1997
  - New IC file for f09 1850
  - New crop calendars
  - Dust emissions is now Leung_2023
  - Excess ice is turned on
  - Updates to MEGAN for BVOCs
  - Updates to BGC fire method

 Changes for all physics versions:

  - Parameter files updated
  - FATES parameter file updated
  - Glacier region 1 is now undefined
  - Update in FATES transient Land use
  - Pass active glacier (CISM) runoff directly to river model (MOSART)
  - Add the option for using matrix for Carbon/Nitrogen BGC spinup

New surface datasets:
=====================

- With new surface datasets the following GLC fields have region "1" set to UNSET:
     glacier_region_behavior, glacier_region_melt_behavior, glacier_region_ice_runoff_behavior
- Updates to allow creating transient landuse timeseries files going back to 1700.
- Fix an important bug on soil fields that was there since ctsm5.2.0. This results in mksurfdata_esmf now giving identical answers with a change in number of processors, as it should.
- Add in creation of ne0np4.POLARCAP.ne30x4 surface datasets.
- Add version to the surface datasets.
- Remove the --hires_pft option from mksurfdata_esmf as we don't have the datasets for it.
- Remove VIC fields from surface datasets.

New input datasets to mksurfdata_esmf:
======================================

- Updates in PFT/LAI/soil-color raw datasets (now from the TRENDY2024 timeseries that ends in 2023), as well as two fire datasets (AG fire, peatland), and the glacier behavior dataset.
Same as ctsm5.3.001

I made an accidental merge and reverted it.
@ekluzek
Collaborator

ekluzek commented Dec 3, 2024

We are going to do this as a standalone tag to master, so I'll rebase to master.

@ekluzek ekluzek changed the base branch from b4b-dev to master December 4, 2024 16:53
Collaborator

@ekluzek ekluzek left a comment

@jedwards4b this is great, thanks for getting this out there for us. I saw some nice improvements you added (only reading the rpointer file on masterproc and catching some typos), which is great.

There are some changes that are required, and some I think would be good to do as they should be easy. They are outlined in the code changes. Right now I'm planning on just doing those changes. Feel free to comment on any of it though.

The required change is to move the updates into lnd_comp_esmf.F90 for LILAC.

Thanks again for the PR.

Resolved review comments on:
src/main/restFileMod.F90
src/utils/clm_time_manager.F90
src/utils/clm_time_manager.F90
src/cpl/nuopc/lnd_comp_nuopc.F90
src/cpl/nuopc/lnd_comp_nuopc.F90
src/utils/clm_time_manager.F90
src/utils/clm_time_manager.F90
src/utils/clm_time_manager.F90
@ekluzek
Collaborator

ekluzek commented Dec 4, 2024

Running aux_clm on Derecho, I'm seeing most tests pass (199), with only 12 pending, but 23 failing. LILAC fails as I expected, but a bunch of ERI tests, the SSP tests, one ERP, a few ERS, one REP, and a few SMS tests fail at the RUN phase.

@jedwards4b
Contributor Author

@ekluzek I haven't yet merged the cime PR that you will need for these tests, are you using the branch?

@jedwards4b
Contributor Author

I just merged it - try updating to cime6.1.47

@ekluzek
Collaborator

ekluzek commented Dec 4, 2024

Ahh, OK, thanks @jedwards4b! I'll update to that and see how it goes.

@ekluzek ekluzek added the "next" label (this should get some attention in the next week or two. Normally each Thursday SE meeting.) Dec 5, 2024
@samsrabin samsrabin removed the "next" label (this should get some attention in the next week or two. Normally each Thursday SE meeting.) Dec 5, 2024
@peverwhee

peverwhee commented Dec 17, 2024

@ekluzek I haven't run the tests on Izumi but will do that in an hour or so. Here are my tests on Derecho: /glade/derecho/scratch/courtneyp/aux_cam_intel_20241216142407

Here's the PR in CAM: ESCOMP/CAM#1147

@ekluzek
Collaborator

ekluzek commented Dec 17, 2024

It looks like the problem I'm having with the ERP tests on Izumi is due to a need to update CDEPS. Using the latest CDEPS I'm getting those tests to work now.

@peverwhee given that, never mind about trying the ERP tests on Izumi. I'll still be interested in what your testing shows on Izumi, though.

@ekluzek
Collaborator

ekluzek commented Dec 17, 2024

Note, we talked about this yesterday at the standup as something to try. I tried some testing with backing off on the CMEPS update, and although some tests worked that way, the ERI tests didn't, because they require coordinated CIME and CMEPS tags. So I can't back off any of the submodule updates, as they depend on each other.

@ekluzek
Collaborator

ekluzek commented Dec 17, 2024

The tests that I still have problems with:

ERP_P64x2_Ld765.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-monthly (NLCOMP RUN)
ERS_P128x1_Ld765.f10_f10_mg37.I2000Clm60Fates.derecho_intel.clm-FatesColdNoComp (NLCOMP RUN)

The test ERS_D_Ld20.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdTwoStream fails at build. But, you just need to go into the case and build by hand, and then it's fine.

This is in addition to the SSP spinup tests, which I'm working on.

@peverwhee

@ekluzek I kicked off the tests before I saw your message about updating CDEPS. But the CAM tests do pass on izumi with an older version (cdeps1.0.53) of CDEPS.

@ekluzek ekluzek added the "blocked: dependency" label (Wait to work on this until dependency is resolved) Dec 17, 2024
@ekluzek
Collaborator

ekluzek commented Dec 17, 2024

OK, oddly enough it's looking like the CDEPS update solved all the problems, both the SSP tests and the two outstanding broken tests. I'll rerun all of the tests and verify that this is correct, but that's what it looks like right now. I thought the tests that were dying were in the Driver and NOT DATM, but maybe it really was in DATM and that's why the update is important. What I don't get is why the SSP tests suddenly start working with this update. Improving some of the error checking may help with this for the future.

@billsacks @briandobbins and @peverwhee this is looking like really good news for our current plan.

@billsacks
Member

That's great - thanks @ekluzek ! Let me know if things look different and you'd like me to help look at anything.

@ekluzek
Collaborator

ekluzek commented Dec 18, 2024

Testing is underway again. I am seeing issues with a few tests that I thought had been fixed with the CDEPS update, so we'll see what works and what doesn't after the testing is finished. Per the CSEG meeting yesterday, for practical reasons we may end up making the tag with a few things failing.

@ekluzek
Collaborator

ekluzek commented Dec 19, 2024

I was able to resubmit many of the tests that failed at the build step due to the new issue #2915. I think that's likely a problem with our ctsm_pylib environment, so might be easy to fix.

But the mpi-serial tests are still failing at the build step, while building mpi-serial itself. One problem is the mismatch in autoconf version on Izumi (1.16.1) vs. the build that likely came from Derecho (1.16.5).

ERS_D_Ld5_Mmpi-serial.1x1_vancouverCAN.I1PtClm50SpRs.izumi_nag.clm-CLM1PTStartDate.GC.ctsm5316acl_nag
ERS_D_Ld7_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropRs.izumi_intel.clm-decStart1851_noinitial.GC.ctsm5316acl_int
ERS_D_Mmpi-serial_Ld5.1x1_brazil.I2000Clm50FatesRs.izumi_nag.clm-FatesCold.GC.ctsm5316acl_nag
ERS_Ld1200_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_gnu.clm-cropMonthlyNoinitial.GC.ctsm5316acl_gnu
ERS_Ld1600_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_intel.clm-cropMonthlyNoinitial.GC.ctsm5316acl_int
ERS_Ld600_Mmpi-serial.1x1_smallvilleIA.I1850Clm50BgcCrop.izumi_gnu.clm-cropMonthlyNoinitial.GC.ctsm5316acl_gnu
ERS_Ly20_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_intel.clm-cropMonthlyNoinitial.GC.ctsm5316acl_int
ERS_Ly20_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_intel.clm-cropMonthlyNoinitial--clm-matrixcnOn.GC.ctsm5316acl_int
ERS_Ly3_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropQianRs.izumi_gnu.clm-cropMonthOutput.GC.ctsm5316acl_gnu
ERS_Ly5_Mmpi-serial.1x1_smallvilleIA.I1850Clm50BgcCrop.izumi_gnu.clm-ciso_monthly.GC.ctsm5316acl_gnu
ERS_Ly5_Mmpi-serial.1x1_smallvilleIA.I1850Clm50BgcCrop.izumi_gnu.clm-ciso_monthly--clm-matrixcnOn.GC.ctsm5316acl_gnu
ERS_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropQianRs.izumi_intel.clm-cropMonthOutput.GC.ctsm5316acl_int
ERS_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropQianRs.izumi_intel.clm-cropMonthOutput--clm-matrixcnOn_ignore_warnings.GC.ctsm5316acl_int
SMS_D_Ld1_Mmpi-serial.f45_f45_mg37.I2000Clm50SpRs.izumi_gnu.clm-ptsRLA.GC.ctsm5316acl_gnu
SMS_D_Ld1_Mmpi-serial.f45_f45_mg37.I2000Clm50SpRs.izumi_gnu.clm-ptsROA.GC.ctsm5316acl_gnu
SMS_D_Ld1_Mmpi-serial.f45_f45_mg37.I2000Clm50SpRs.izumi_nag.clm-ptsRLA.GC.ctsm5316acl_nag
SMS_D_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm45BgcCropQianRs.izumi_intel.clm-cropMonthOutput.GC.ctsm5316acl_int
SMS_D_Mmpi-serial_Ld5.5x5_amazon.I2000Clm60FatesRs.izumi_nag.clm-FatesCold.GC.ctsm5316acl_nag
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60SpRs.izumi_nag.clm-default--clm-NEON-TOOL.GC.ctsm5316acl_nag
SMS_Ld5_Mmpi-serial.1x1_brazil.IHistClm60Bgc.izumi_gnu.clm-mimics.GC.ctsm5316acl_gnu
SMS_Ly3_Mmpi-serial.1x1_numaIA.I2000Clm50BgcDvCropQianRs.izumi_gnu.clm-ignor_warn_cropMonthOutputColdStart.GC.ctsm5316acl_gnu
SMS_Ly5_Mmpi-serial.1x1_brazil.IHistClm50BgcQianRs.izumi_intel.clm-newton_krylov_spinup.GC.ctsm5316acl_int
SMS_Ly5_Mmpi-serial.1x1_smallvilleIA.IHistClm60BgcCropQianRs.izumi_gnu.clm-gregorian_cropMonthOutput.GC.ctsm5316acl_gnu

This was working just yesterday, so I'm puzzling out what's going on here.

@jedwards4b
Contributor Author

@ekluzek I just tried this on Izumi and built case SMS_D_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm45BgcCropQianRs.izumi_intel.clm-cropMonthOutput.20241219_131416_z62hfl without any issues. Could it be something in your environment, or perhaps the node you tried to build on is not configured correctly?

@ekluzek
Collaborator

ekluzek commented Dec 19, 2024

Ahhh, @jedwards4b good question. Thanks for trying it yourself. I suppose it could be an environment thing, but I haven't done anything with my environment. I'll try some different logins and/or a different node. Good suggestions here.

@ekluzek
Collaborator

ekluzek commented Dec 19, 2024

I tried it for ctsm5.3.014 and it seemed to work -- but maybe I just need to try these cases in that same window? I'll try that...

@ekluzek
Collaborator

ekluzek commented Dec 19, 2024

Hmmm, trying the same window where ctsm5.3.014 worked -- still didn't work. I also compared the software_environment.txt file in the CASE between @jedwards4b case and mine, and didn't see anything obvious. I sent a ticket to NRIT to see if they have ideas.

However, now all of a sudden Izumi scratch is read-only so there are other problems.

@ekluzek ekluzek merged commit f437651 into ESCOMP:master Dec 19, 2024
2 checks passed
@ekluzek ekluzek deleted the add_timestamp_to_rpointers branch December 19, 2024 23:33
@ekluzek
Collaborator

ekluzek commented Dec 21, 2024

Izumi was having some read-only problems with its disks and I thought that could be the cause, so I tried again. But I got the same problem again.

HOWEVER, I DID FIGURE OUT A WORKAROUND:
I noticed this because @jedwards4b had a case that was working for him. What I saw is that, as part of building his case, the case build modified some files in his mpi-serial directory (/home/jedwards/CTSM/libraries/mpi-serial) with automake and configure. So I figured that if I ran autoconf to do some of the setup in the mpi-serial directory, I might get it to work. One catch is that the configure output is going to be specific to the compiler settings for the case you are trying to build.

What worked for me (in tcsh):

cd $SRCDIR
cd libraries/mpi-serial
aclocal
autoconf
automake
source $CASEDIR/.env_mach_specific.csh
./configure
\rm -rf Makefile autom4te.cache/ config.log config.status stamp-h1 tests/.deps/ tests/Makefile
cd $CASEDIR
./case.build

Where: SRCDIR is the top level CTSM checkout, and CASEDIR is the case or test directory you are trying to build.
NOTE:

  • The remove step is needed because otherwise the build tells you that the mpi-serial directory is already configured and that you need to run "make distclean" to clean it. Doing that does clean up the local configuration, but for me it then went back to the original error I saw. I might be removing more files than I really need to, but at least something in the list triggered the request for the clean.
  • There might be extra steps in the above that aren't needed, but the above is what I got to work.
  • Suggestions on how to simplify the above would be great to hear about.
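
If the remove step needs to be repeated across several checkouts, it could be scripted. The following is only a sketch in Python rather than tcsh: the file list is copied from the `\rm -rf` line above, and the helper name `remove_configure_output` is hypothetical. It deletes only the entries that exist and reports which ones it removed, which may help narrow down which files actually trigger the "already configured" message.

```python
import shutil
from pathlib import Path
from typing import List

# By-products of ./configure, copied from the remove step above.
GENERATED = [
    "Makefile",
    "autom4te.cache",
    "config.log",
    "config.status",
    "stamp-h1",
    "tests/.deps",
    "tests/Makefile",
]


def remove_configure_output(mpi_serial_dir: Path) -> List[str]:
    """Delete the listed by-products if present; return what was removed."""
    removed = []
    for name in GENERATED:
        target = mpi_serial_dir / name
        if target.is_dir():
            shutil.rmtree(target)  # directories such as autom4te.cache
        elif target.exists():
            target.unlink()  # regular files such as config.log
        else:
            continue  # nothing to remove for this entry
        removed.append(name)
    return removed
```

Usage would be something like `remove_configure_output(Path(srcdir) / "libraries" / "mpi-serial")`, where `srcdir` is the top-level CTSM checkout.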

@ekluzek
Collaborator

ekluzek commented Dec 21, 2024

Oh, I put that here in the PR rather than in the mpi-serial space. I'll copy it over there.

Labels
blocked: dependency (Wait to work on this until dependency is resolved)
enhancement (new capability or improved behavior of existing capability)
usability (Improve or clarify user-facing options)
Projects
Status: Done (non release/external)
Status: Done
6 participants