wreck: small fix for jobs with more nodes in R lite than tasks #1403
Codecov Report
@@ Coverage Diff @@
## master #1403 +/- ##
==========================================
+ Coverage 78.51% 78.55% +0.04%
==========================================
Files 163 163
Lines 29983 29995 +12
==========================================
+ Hits 23542 23564 +22
+ Misses 6441 6431 -10
Ok, it turns out it would be more difficult than initially anticipated to allow wreck jobs to run when more nodes are assigned to the job than tasks. There are multiple places where the number of local tasks per node is assumed to be non-zero across the parallel job, including nodeid assignment, PMI, etc. These could probably be worked through, but it doesn't seem useful at this time. Instead, this PR now makes assignment of nnodes > ntasks a fatal error.
Hm, #1400 should actually be renamed; I think it is actually a flux-sched bug. It could reference it though. What's the suggested format for that?
Actually it is 2d55622 that will make ntasks < nnodes a fatal error now.
Add a function to return the total number of nodes that have tasks assigned after rcalc_distribute()
Use rcalc_total_nodes_use() in t/wreck/rcalc to return the total number of *used* nodes in output for testing purposes.
Added a set of "R_lite" inputs and expected outputs for the rcalc utility, and a test to read in inputs, generate outputs, and check the results.
In some cases a call to `wlog_fatal` would cause jobs to get stuck in `reserved` or `starting` state, especially if rank 0 wrexecd exited with a failure. Try harder in this function to update the job state to "failed" before rank 0 wrexecd exits on error.
Issue a fatal error in wrexecd if nnodes > ntasks. This case is not handled correctly in wrexecd, and it is deemed unimportant to fix now. Terminating the job with a failure is a better solution than a hang or inconsistent state.
Add a test to ensure that a job assigned more nodes than tasks fails, instead of hanging or running more tasks than requested.
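The two ideas in these commits (counting only the nodes that actually received tasks, and refusing to run when nnodes > ntasks) can be illustrated with a small Python sketch. This is not the wreck/rcalc C code: `distribute`, `total_nodes_used`, `check_allocation`, and `FatalJobError` are hypothetical names, and the round-robin placement is only a stand-in for whatever `rcalc_distribute()` really does.

```python
class FatalJobError(Exception):
    """Hypothetical stand-in for a fatal job error raised by the launcher."""


def distribute(nnodes, ntasks):
    """Assign ntasks to nnodes round-robin and return the task count per node.

    Assumption: simple round-robin placement, NOT the real rcalc algorithm.
    """
    if nnodes <= 0:
        raise ValueError("allocation must contain at least one node")
    counts = [0] * nnodes
    for task in range(ntasks):
        counts[task % nnodes] += 1
    return counts


def total_nodes_used(counts):
    """Count the nodes that ended up with at least one task."""
    return sum(1 for n in counts if n > 0)


def check_allocation(nnodes, ntasks):
    """Fail the job up front instead of letting it hang with idle nodes."""
    if nnodes > ntasks:
        raise FatalJobError(f"nnodes ({nnodes}) > ntasks ({ntasks})")
```

With 4 nodes and 2 tasks, `distribute(4, 2)` leaves two nodes with zero local tasks, so `total_nodes_used()` reports 2 rather than 4, and `check_allocation(4, 2)` raises instead of proceeding.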
Yeah, #1400 is definitely a sched bug. And making this condition a fatal error is reasonable semantics.
The new |
Apparently the rcalc tests called in a loop will not be compatible with chain-lint, so disable these tests for now under --chain-lint.
This adds a `rcalc_total_nnodes_used()` call to the wreck/rcalc class, and then uses that to set nnodes internally in wrexecd. This should prevent jobs with larger allocations than tasks from hanging (but this needs testing).

I also added some R-lite inputs for testing under `t/wreck/input`, and verification of output in a new `t1999-wreck-rcalc.t` test.

This isn't ready for merge yet, but put up as a placeholder.
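The input/expected-output style of test described here follows a common golden-file pattern, sketched below in Python. This is not the actual sharness test: the `.input`/`.expected` suffixes, the `run_golden_cases` helper, and the toy generator in `demo()` are all assumptions made for illustration, and the toy input is not the real R_lite format.

```python
import tempfile
from pathlib import Path


def run_golden_cases(input_dir, generate):
    """Run generate() on every *.input file and return the failing case names.

    Assumption: each case is a pair <name>.input / <name>.expected
    (a hypothetical layout, not necessarily the one under t/wreck/input).
    """
    failures = []
    for infile in sorted(Path(input_dir).glob("*.input")):
        expected = infile.with_suffix(".expected").read_text()
        if generate(infile.read_text()) != expected:
            failures.append(infile.stem)
    return failures


def demo():
    """Build two throwaway cases and check them with a toy generator."""
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        (d / "a.input").write_text("1 1 0 0")
        (d / "a.expected").write_text("2")
        (d / "b.input").write_text("1 0 0 0")
        (d / "b.expected").write_text("2")  # deliberately wrong expectation

        # Toy generator: count entries with at least one task.
        def count_used(text):
            return str(sum(1 for n in text.split() if int(n) > 0))

        return run_golden_cases(d, count_used)
```

Returning the list of failing case names, rather than stopping at the first mismatch, makes it easier to see at a glance which inputs regressed.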