Problems with running fates_next_api/release-clm5.0 on izumi #1093
Comments
From looking at the ChangeLog for cime, I think this should drop in with no problems, and no change of answers.
Here's another log message:
Thanks @ekluzek. I will follow up with you on where to get the updated cime, unless you want to post it here so I can update and test.
@jkshuman use cime5.6.33 and see if it works.
@jkshuman you must be watching now (or I just had a glitch Friday), as you now show up when I start typing your name. If I start typing someone's name and they don't show up as an option, it's usually because they aren't watching.
@ekluzek I am still getting a failure on Izumi. Let me know if I am missing something. The case build still fails:
Error in that file:
OK, I verified the same problem by testing fates_next_api with both the default cime and cime5.6.33, using the test SMS.f09_g17.I2000Clm50Fates.izumi_intel.clm-FatesColdDef. I then tried it on the release branch and see the same problem (cime5.6.33 is the default in release-clm5.0.34). I also tried the more generic test SMS.f09_g17.I2000Clm50BgcCrop.izumi_intel.clm-default, and it fails as well.
OK, it looks like the izumi updates went into the cime maint-5.6 branch but haven't been tagged yet. When I point cime to the latest maint-5.6 branch, it does seem to build. This is the cime PR with the needed updates...
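For anyone reproducing this, pointing the cime checkout at the untagged maint-5.6 branch might look like the sketch below. It assumes cime is a git checkout in a `cime` subdirectory of the CTSM clone and that its `origin` remote is the upstream cime repository. The names `CIME_DIR`, `CIME_REF`, `DRY_RUN`, `run`, and `update_cime` are made up for illustration. By default the script only prints the git commands so they can be inspected first.

```shell
#!/bin/sh
# Sketch only: switch the cime checkout inside a CTSM clone to a given git ref.
# CIME_DIR, CIME_REF and DRY_RUN are hypothetical names, not part of CTSM or cime.
CIME_DIR="${CIME_DIR:-cime}"               # assumed location of the cime checkout
CIME_REF="${CIME_REF:-origin/maint-5.6}"   # branch tip; a tag such as cime5.6.33 also works
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print the commands, 0 = run them

run() {
  # Print the command, and execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

update_cime() {
  # Fetch new commits and tags, then check out the requested ref.
  # Checking out origin/maint-5.6 leaves the checkout on a detached HEAD.
  run git -C "$CIME_DIR" fetch --tags origin
  run git -C "$CIME_DIR" checkout "$CIME_REF"
}

update_cime
```

Run it from the top of the CTSM clone with `DRY_RUN=0` to actually switch the checkout, then rebuild the case from a clean state.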
@ekluzek the model builds and submits, but fails after the first time-step. (This same case was successful on Hobart.)
@ekluzek I tried a run that uses only 1 node and got the same failure. The two tests fail in a similar way: they complete the first time-step, then fail on resubmit. (The first run was an 8-node run with a monthly time-step; the second was a 1-node run with a yearly time-step.) Same failure: "killed by signal 15". The 1-node case (junk no fire) will continue if I resubmit manually from inside the case. I did not test the 8-node case.
And this junk fire case running on 1 node was able to make it into year 2 automatically...
Note that the cime fix for Izumi works. The resubmit problem is a different issue (and an inconsistent one, as not all of my test cases fail). @ekluzek should we close this and open a separate issue for the resubmit problem? @jedwards4b (is this Jim Edwards?) suggested a workaround. Per Jim: "Are you aware of the resubmit immediate option to case.submit? It will submit all of your jobs at once from the login node with dependencies so that each job will complete before the next begins. This should be an effective workaround for the problem compute nodes not resubmitting properly."
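The workaround Jim describes could be tried roughly as follows, from inside the case directory. This is a sketch under assumptions: I am taking the option to be spelled `--resubmit-immediate` (check `./case.submit --help` for your cime version), and the resubmit count of 2 is only an example. `N_RESUBMIT`, `DRY_RUN`, `run`, and `submit_all` are illustrative names. The script prints the commands by default instead of running them.

```shell
#!/bin/sh
# Sketch only: queue the first job and all resubmissions up front from the login node.
# N_RESUBMIT and DRY_RUN are hypothetical names; RESUBMIT is the case XML variable.
N_RESUBMIT="${N_RESUBMIT:-2}"   # example value: number of resubmissions after the first job
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print the commands, 0 = run them

run() {
  # Print the command, and execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

submit_all() {
  # Set how many times the case should be resubmitted, then submit every
  # job at once so the scheduler handles the dependencies between them.
  # The flag name below is assumed from Jim's description.
  run ./xmlchange "RESUBMIT=$N_RESUBMIT"
  run ./case.submit --resubmit-immediate
}

submit_all
```

Because all jobs are queued from the login node, this avoids relying on the compute nodes to resubmit, which is the step that fails here.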
@ekluzek I just accidentally replicated the above error on my workstation trying to build a single site case. Last week while helping @jkshuman track down the issue using my workstation I had been able to successfully build and run with The trigger for the failure this time was that I was trying to build the case with a conda environment activated that I don't normally use during case builds. Perhaps that suggests it's an issue with the module versions loaded on izumi? I can provide my output from
@glemieux this is interesting. I just did an overhaul of my conda environments, though I do not recall which conda environment was active (if any) when I ran these test cases.
Tested cime branch cime.5.8.30 on Izumi with fates_main_api per @ekluzek's recommendation, and the simulation was successful. Thanks @ekluzek. Path to ctsm output: /scratch/cluster/jkshuman/archive/t4_izumi_JKS_C3_main_4x5_fde33f56_6bfea0f8/lnd/hist
Brief summary of bug
Jackie has had problems running on izumi of late with fates_next_api.
@jkshuman
General bug information
CTSM version you are using: release-clm5.0.30-143-gabcd5937
Does this bug cause significantly incorrect results in the model's science? No
Configurations affected: izumi_intel
Details of bug
The gptl build fails.
Important details of your setup / configuration so we can reproduce the bug
I think this is just because fates_next_api is using cime5.6.28 and needs to be updated to cime5.6.33.
Important output or errors that show the problem