Queue selection in env_batch is not adequate for blues #1255

Closed
jgfouca opened this issue Mar 16, 2017 · 3 comments
@jgfouca
Contributor

jgfouca commented Mar 16, 2017

Some background: blues has two available queues, "shared" and "batch". The "shared" queue throws an error if you submit a job that requests more than an hour of runtime. The "shared" queue accepts 1-64 cores; the "batch" queue accepts anywhere from 1 to thousands of cores. My problem is that we have a lot of tests that use 64 cores (so they fit in either queue), but some take three hours and some take one. env_batch.select_best_queue seems unable to select the appropriate queue for such jobs (one hour or less should go to "shared" and the rest should go to "batch"); instead, it just selects "shared" for everything.
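To make the desired behavior concrete, here is a minimal sketch of a selection routine that checks both core count and walltime. It is not the actual `env_batch.select_best_queue` code: the `Queue` class, `select_queue`, and `BLUES_QUEUES` are hypothetical names, and the unbounded upper limit on "batch" is an assumption standing in for "thousands".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Queue:
    """Hypothetical queue description; not CIME's real data model."""
    name: str
    min_cores: int
    max_cores: Optional[int]         # None means no upper bound (assumption)
    max_walltime_min: Optional[int]  # None means no walltime limit (assumption)


def select_queue(queues: Sequence[Queue], cores: int,
                 walltime_min: int) -> Optional[Queue]:
    """Return the first queue that fits BOTH the core count and the walltime."""
    for q in queues:
        if cores < q.min_cores:
            continue
        if q.max_cores is not None and cores > q.max_cores:
            continue
        if q.max_walltime_min is not None and walltime_min > q.max_walltime_min:
            continue
        return q
    return None


# Limits taken from the description above; order expresses preference.
BLUES_QUEUES = [
    Queue("shared", min_cores=1, max_cores=64, max_walltime_min=60),
    Queue("batch", min_cores=1, max_cores=None, max_walltime_min=None),
]
```

With these definitions, a 64-core job asking for 60 minutes would land in "shared", while the same job asking for 180 minutes would fall through to "batch".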

@rljacob
Member

rljacob commented Mar 16, 2017

More info on blues: you can't submit directly to the "batch" queue; adding "-q batch" will get your job rejected. For regular jobs, blues figures out where to route you based on the nodes and time requested. You only need the "-q" option if you want a different queue, like "shared".
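One way to express that rule in code is to omit the queue flag whenever the selected queue is one the system routes to on its own. This is only a sketch: `queue_flags` and its `routed_queues` parameter are hypothetical names, not part of CIME or of the blues scheduler.

```python
from typing import List, Sequence


def queue_flags(queue_name: str,
                routed_queues: Sequence[str] = ("batch",)) -> List[str]:
    """Build the queue-related submit flags for a job.

    Queues listed in routed_queues are assumed to be reached by automatic
    routing, so naming them explicitly would get the job rejected.
    """
    if queue_name in routed_queues:
        return []                # let the system route the job
    return ["-q", queue_name]    # explicit request, e.g. for "shared"
```

For example, `queue_flags("shared")` yields `["-q", "shared"]`, while `queue_flags("batch")` yields an empty list.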

@jgfouca
Contributor Author

jgfouca commented Mar 16, 2017

@rljacob that's been fixed on the ACME side via a Python hack.

@rljacob rljacob added the ready label Mar 17, 2017
@jgfouca jgfouca assigned jgfouca and unassigned jedwards4b May 1, 2017
@jgfouca
Contributor Author

jgfouca commented May 1, 2017

Had a request come in on the ACME side so I'll take this.

@ghost ghost added in progress and removed ready labels May 1, 2017
@ghost ghost removed the in progress label May 2, 2017
jgfouca added a commit that referenced this issue May 2, 2017
Update queue selection to take walltime into account

Adds concept of strict walltime.

The idea here is to have better support for machines like blues that have a "debug" queue and a "standard" queue. The "debug" queue has strict limits on both walltime and num_pes and therefore should not be selected as the user's queue if they asked for a long walltime. For other machines, the maxwalltime setting is being used more like a default walltime than a true max.

Test suite: scripts_regression_tests (melvin and skybridge) and some by-hand testing on blues
Test baseline:
Test namelist changes:
Test status: bit for bit

Fixes #1255

User interface changes?: Changes in how walltime is handled

Code review: @jedwards4b @jayeshkrishna @rljacob
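The strict-walltime idea in the commit message can be read as follows; this is an interpretation sketched for illustration, not the code from the commit. `QueueSpec`, its `strict` field, `queue_fits`, and `effective_walltime` are all hypothetical names, and the limits on the example queues are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QueueSpec:
    """Hypothetical queue entry; field names are assumptions."""
    name: str
    max_pes: Optional[int]       # None means no limit on num_pes
    walltime_min: Optional[int]  # hard cap if strict, else a default
    strict: bool = False


def queue_fits(queue: QueueSpec, num_pes: int,
               requested_min: Optional[int]) -> bool:
    """A strict queue rejects requests over its walltime; others do not."""
    if queue.max_pes is not None and num_pes > queue.max_pes:
        return False
    if (queue.strict and requested_min is not None
            and queue.walltime_min is not None
            and requested_min > queue.walltime_min):
        return False
    return True


def effective_walltime(queue: QueueSpec,
                       requested_min: Optional[int]) -> Optional[int]:
    """Use the user's request if given, else the queue's walltime value."""
    return requested_min if requested_min is not None else queue.walltime_min


# Illustrative values only.
debug = QueueSpec("debug", max_pes=64, walltime_min=60, strict=True)
standard = QueueSpec("standard", max_pes=None, walltime_min=120)
```

Under this reading, a three-hour request is rejected by the strict "debug" queue but accepted by "standard", and a job submitted with no walltime simply inherits the queue's value.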