Job submission issue on theta #3099
I'm also seeing a problem where an xmlchange on the job wallclock time was not accepted due to
Not sure whether this is related to the above, but is anyone else seeing issues submitting jobs on theta?
@jgfouca is there a recommended way to set max wall-clock times according to job node-counts? The queueing policy from https://www.alcf.anl.gov/user-guides/job-scheduling-policy-xc40-systems
@amametjanov, this would have to be described in the config_batch.xml entry for theta. A combination of walltimemin, walltimemax, nodemin, and nodemax should work.
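For illustration, here is a minimal Python sketch of how a combination of nodemin/nodemax and walltimemin/walltimemax could map a node count to a queue and cap the requested wall-clock time. It is not CIME's implementation: the `Queue` class, the `select_queue` and `clamp_walltime` helpers, the queue names, and every numeric limit are hypothetical and do not reflect the actual Theta queue policy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Queue:
    """Hypothetical queue entry; field names mirror the attributes named above."""
    name: str
    nodemin: int
    nodemax: int
    walltimemin: int  # minutes
    walltimemax: int  # minutes


# Made-up limits for illustration only; NOT the real Theta policy.
QUEUES: List[Queue] = [
    Queue("small", nodemin=1, nodemax=127, walltimemin=30, walltimemax=180),
    Queue("medium", nodemin=128, nodemax=511, walltimemin=30, walltimemax=360),
    Queue("large", nodemin=512, nodemax=4096, walltimemin=30, walltimemax=720),
]


def select_queue(num_nodes: int, queues: List[Queue] = QUEUES) -> Optional[Queue]:
    """Return the first queue whose node range contains num_nodes, else None."""
    for queue in queues:
        if queue.nodemin <= num_nodes <= queue.nodemax:
            return queue
    return None


def clamp_walltime(requested_minutes: int, queue: Queue) -> int:
    """Force the requested wall time into the queue's allowed range."""
    return max(queue.walltimemin, min(requested_minutes, queue.walltimemax))
```

With these made-up limits, a 256-node job lands in the "medium" range, and a 600-minute request would be reduced to that range's 360-minute maximum.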
To avoid a reset of requested walltime, both master and maint-1.0 require
If only one or none of these are set prior to
@amametjanov, that shouldn't be necessary (from xmlchange):
@jgfouca, please give it a try on theta with maint-1.0. Noel had three different jobs launched with run_e3sm time out because
@ndkeen, @amametjanov, this is for SMS.ne30_oECv3_ICG.A_WCYCL1850S_CMIP6.theta_intel (both master and maint-1.0 seem to work):
A reproducer:
@amametjanov, thanks. Looking at it now.
@jhkennedy, do you have any spare cycles to look at this?
@jgfouca I can look at it on Monday.
@amametjanov I can reproduce that behavior for
Yes, I can reproduce on master after pulling in changes in #3145 that are needed to avoid the error in the initial comment of this issue.
Update queue policy on Theta Addresses #3099 [BFB]
Update queue policy on Theta Addresses E3SM-Project/E3SM#3099 [BFB]
Fixed by #3144 (maint-1.0), #3145 (master). To keep custom wall-time after case.setup, it's recommended to either set
between
With a master of July 31st, I tried
SMS.ne30_oECv3_ICG.A_WCYCL1850S_CMIP6
and the job failed with: