Rankfile per prun #720
Comments
Thank you for implementing this. It works for the use case with a single prun invocation running in one DVM, but not for the use case with multiple concurrent prun invocations, where you want to run multiple independent jobs in one DVM without sharing any nodes. I think the latter use case needs support for the +e relative node specifier. With +n, separate jobs end up allocated onto different slots on the same nodes. Since the jobs are independent, it doesn't make sense to require a set of rankfiles (one rankfile per job) that are aware of each other, i.e. one set of rankfiles for each particular set of jobs. Desired:
Actual:
Should I open a new issue, or am I missing something?
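To make the difference described above concrete, here is a minimal toy model in Python. It is not PRRTE code and does not reproduce its mapper. `Node`, `resolve`, `launch`, and `run_two_jobs` are hypothetical names, and the `+e` behavior shown (indexing only into nodes that currently have no slots in use) is an assumption about what the requested feature would do in a rankfile.

```python
# Toy model only: this is NOT PRRTE code and does not reproduce its mapper.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    slots: int
    used: list = field(default_factory=list)  # job ids occupying slots


def resolve(nodes, spec):
    """Resolve a relative node specifier against the DVM's node pool.

    "+nK": K-th node of the whole pool, regardless of occupancy.
    "+eK": K-th node among those with no slots in use (assumed semantics).
    """
    kind, index = spec[:2], int(spec[2:])
    if kind == "+n":
        pool = nodes
    elif kind == "+e":
        pool = [n for n in nodes if not n.used]
    else:
        raise ValueError(f"unsupported specifier: {spec}")
    return pool[index]


def launch(nodes, job_id, specs):
    """Place one rank per specifier and return the node names used."""
    # Resolve every specifier before placing anything, so that a job's own
    # ranks do not change which nodes count as empty for that job.
    targets = [resolve(nodes, s) for s in specs]
    for node in targets:
        if len(node.used) >= node.slots:
            raise RuntimeError(f"{node.name} has no free slot")
        node.used.append(job_id)
    return [n.name for n in targets]


def run_two_jobs(specs):
    """Launch two independent jobs with identical specifiers in one pool."""
    nodes = [Node(f"node{i}", slots=2) for i in range(4)]
    return launch(nodes, "job1", specs), launch(nodes, "job2", specs)


shared = run_two_jobs(["+n0", "+n1"])    # both jobs land on node0 and node1
disjoint = run_two_jobs(["+e0", "+e1"])  # second job moves to node2 and node3
print(shared)
print(disjoint)
```

In this model, two jobs that use the same `+n` specifiers share nodes, while the same two jobs with the assumed `+e` semantics get disjoint nodes without either rankfile knowing about the other.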
Hmmm...yeah, that would be something we haven't supported before, so best to open a new issue. I don't see any reason why we couldn't do it - just not something that was previously requested 😄
From the mailing list here
The `--mca prte_rankfile arf.txt` option seems to apply to the DVM and not per-job. So `prterun --mca prte_rankfile arf.txt ./a.out` will work, but not `prun --mca prte_rankfile arf.txt ./a.out`.