env_mach_pes.xml not being locked properly #1322

Closed
fischer-ncar opened this issue Apr 7, 2017 · 4 comments
fischer-ncar commented Apr 7, 2017

I ran into this problem while testing on cheyenne. I was able to change env_mach_pes.xml and submit to the queue without any error messages.

I was also able to change env_mach_pes.xml, then do a rebuild and submit without redoing case.setup.

fischer-ncar pushed a commit that referenced this issue Apr 11, 2017
Increase jobmax on redsky

Some of our tests were asking for more than the 480 max. This
meant that no queue satisfied the need and the null queue was selected
which is not valid on redsky.

[BFB]

* origin/jgfouca/cime/increase_redsky_queue:
  Increase jobmax on redsky
@sarich sarich self-assigned this Apr 19, 2017
@sarich sarich added the ready label Apr 19, 2017

sarich commented Apr 19, 2017

I'm unable to reproduce this error locally (I don't have a cheyenne account). It would help if I could get the git version and the exact commands and XML changes that were made.

And of course if this is only happening on cheyenne then someone else may have to look into it.

fischer-ncar commented

This is what I did using the current cime/master.

git version 1.8.5.6

./create_test SMS.f45_g37_rx1.A.cheyenne_intel
./xmlchange NTASKS=-2
./case.submit # run submits to queue, but fails

./case.build --clean
./case.build
./case.submit # run submits to queue, but fails

I wasn't able to reproduce this behavior with create_newcase, so it's something specific to create_test.


sarich commented Apr 19, 2017

Thanks, that was very helpful. I think I am getting the same behavior now and will investigate.

NOTES:
Running create_newcase and then changing NTASKS gives the expected behavior:

./create_newcase --case test1 --res f45_g37_rx1 --mach anlworkstation --compiler gnu --compset A
cd test1
./case.setup
./case.build
./xmlchange NTASKS=-2
./case.submit
File /nfs2/sarich/cime4/scripts/test1/LockedFiles/env_mach_pes.xml has been modified
found difference in NTASKS:{'component': 'ICE'} : case '-2' locked '-1'
found difference in NTASKS:{'component': 'ESP'} : case '-2' locked '-1'
found difference in NTASKS:{'component': 'CPL'} : case '-2' locked '-1'
found difference in NTASKS:{'component': 'OCN'} : case '-2' locked '-1'
found difference in NTASKS:{'component': 'LND'} : case '-2' locked '-1'
found difference in NTASKS:{'component': 'ATM'} : case '-2' locked '-1'
found difference in NTASKS:{'component': 'WAV'} : case '-2' locked '-1'
found difference in NTASKS:{'component': 'GLC'} : case '-2' locked '-1'
found difference in NTASKS:{'component': 'ROF'} : case '-2' locked '-1'

But using create_test for the same test case fails to catch the change:
./create_test SMS.f45_g37_rx1.A
cd ~/acme/scratch/SMS.f45_g37_rx1.A.anlworkstation_gnu.20170419_155053_r7bnhy
./xmlchange NTASKS=-2
./case.submit
...tries to run...
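
For readers unfamiliar with the locked-file check, the comparison that produces the "found difference" messages above can be sketched roughly as follows. This is only an illustration, not the CIME implementation: the XML layout is a simplified, hypothetical stand-in for env_mach_pes.xml, and `read_values` and `find_differences` are made-up helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for env_mach_pes.xml; the real CIME
# schema is richer than this.
LOCKED_XML = """
<file>
  <entry id="NTASKS">
    <value component="ATM">-1</value>
    <value component="OCN">-1</value>
  </entry>
</file>
"""
# Simulate "./xmlchange NTASKS=-2" applied to the case copy only.
CASE_XML = LOCKED_XML.replace(">-1<", ">-2<")


def read_values(xml_text):
    """Return {(entry id, component): value} for every <value> element."""
    root = ET.fromstring(xml_text)
    values = {}
    for entry in root.iter("entry"):
        for value in entry.iter("value"):
            values[(entry.get("id"), value.get("component"))] = value.text
    return values


def find_differences(case_xml, locked_xml):
    """List (id, component, case value, locked value) for each mismatch."""
    case_values = read_values(case_xml)
    locked_values = read_values(locked_xml)
    diffs = []
    for key in sorted(set(case_values) | set(locked_values)):
        case_val = case_values.get(key)
        locked_val = locked_values.get(key)
        if case_val != locked_val:
            diffs.append((key[0], key[1], case_val, locked_val))
    return diffs


for entry_id, comp, case_val, locked_val in find_differences(CASE_XML, LOCKED_XML):
    print(f"found difference in {entry_id}:{comp} : case {case_val!r} locked {locked_val!r}")
```

In the create_test case above, the point is that a comparison of this kind never ran, so the submit went ahead with the modified file.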


sarich commented Apr 20, 2017

checked_lockfile() was counting the number of dots in the full pathname and ignoring any locked file whose path had more than one period in it. So if the case name had any dots, no locked files were checked.
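
A minimal sketch of this kind of bug, assuming the check looked something like a dot count on the whole path. The function names and paths here are hypothetical and are not taken from the CIME source; the actual fix is not shown in this thread.

```python
import os


def is_locked_xml_buggy(path):
    # Counts dots in the WHOLE path, so a dotted case directory name
    # (as create_test generates) disqualifies every file under it.
    return path.count(".") == 1 and path.endswith(".xml")


def is_locked_xml_fixed(path):
    # Looks only at the extension of the file name itself.
    return os.path.splitext(os.path.basename(path))[1] == ".xml"


# Hypothetical paths: one create_newcase-style, one create_test-style.
plain_case = "/scratch/test1/LockedFiles/env_mach_pes.xml"
dotted_case = "/scratch/SMS.f45_g37_rx1.A.anlworkstation_gnu/LockedFiles/env_mach_pes.xml"

for path in (plain_case, dotted_case):
    print(path, is_locked_xml_buggy(path), is_locked_xml_fixed(path))
```

With the dot-counting version, the file under the dotted test directory is skipped, which matches the observation that only create_test cases were affected.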

@rljacob rljacob closed this as completed May 4, 2017
@ghost ghost removed the in progress label May 4, 2017
jgfouca added a commit that referenced this issue Feb 23, 2018
Increase jobmax on redsky

jgfouca added a commit that referenced this issue Mar 13, 2018
Increase jobmax on redsky
