All ranks log an error when r_lite is missing, even when it is not to run on that broker/node #1439
Comments
I should note, normally I wouldn't call this a bug, but it's a lot of printing to have thousands of those lines for every single job. Even for info level, it's substantial. |
I think this was intended to log the fallback to the old rank.N.cores method, but it was probably not intended to do that on nodes that aren't part of the allocation. Is anything using the old method at this point or have we fully moved over to R_lite? Maybe not only the log message but the fallback can go away now? |
Well duh, I guess at least @trws's WIP branch is using it! I should wake up before I post. |
It would be nice to remove the rank.N.cores fallback. However, then missing
R_lite is definitely an error, right?
Without R_lite, there's no information about the allocation, so all ranks
consider that an error (they don't know if the allocation was meant for
them or not).
Maybe in this case only rank 0 should log an error?
…On Mon, Apr 9, 2018, 6:42 AM Jim Garlick ***@***.***> wrote:
I think this intended to log the fallback to the old rank.N.cores method,
but probably not intended to do that on nodes that aren't part of the
allocation.
Is anything using the old method at this point or have we fully moved over
to R_lite? Maybe not only the log message but the fallback can go away
now?
|
Agreed, the sched end isn't all done with that yet if I understand correctly. There may be a branch that does, but I'm not using it yet. |
Yeah for the record, the sched R_lite generation work isn't finished. |
Reduce the amount of redundant logging from the job module. For errors that will be the same for every rank (e.g. missing or unparseable R_lite), log only from rank 0. Change the log level for "no rank.N dir for this rank" to debug, since it is an expected condition for a job that does not target rank N. This should greatly reduce amount of chatter from the job module to the flux log. Fixes flux-framework#1439
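The logging policy described in that commit message can be sketched roughly as follows. This is an illustration only, not the job module's actual code: the function name `cores_for_rank`, the simplified `r_lite` shape (a list of `{"rank": ..., "cores": ...}` dicts), and the use of a plain dict to stand in for the job's KVS directory are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("job")


def cores_for_rank(rank, r_lite, kvs_job_dir):
    """Return the core count assigned to `rank`, or None if the job
    does not target this rank.

    r_lite      -- hypothetical, simplified form of R_lite: a list of
                   {"rank": int, "cores": int} dicts, or None if missing
    kvs_job_dir -- hypothetical stand-in for the job's KVS directory,
                   a dict with keys such as "rank.0.cores"
    """
    if r_lite is None:
        # A missing R_lite looks the same from every rank, so only
        # rank 0 reports it instead of every rank in the instance.
        if rank == 0:
            logger.error("R_lite missing, falling back to rank.N.cores")

        key = f"rank.{rank}.cores"
        if key not in kvs_job_dir:
            # Expected for any rank the job does not target, so this is
            # logged at debug level rather than as an error.
            logger.debug("no rank.%d dir for this rank", rank)
            return None
        return kvs_job_dir[key]

    for entry in r_lite:
        if entry["rank"] == rank:
            return entry["cores"]
    return None
```

With this shape, a single-node job on a large instance with no R_lite produces one error line from rank 0, and the remaining ranks emit only debug messages, which are normally filtered out of the log.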
This is for a single-node job: