-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unable to specify order of assigment with prterun --map-by :pe-list option #696
Comments
Correct - because as the help output indicates, there is no concept of ordering in this option. If someone wants to do a different pattern, such as you suggest, then they have other ways of doing it - e.g., rankfile mapping. Changing this option to support what you "expected" would be disruptive. I think we need to step back a bit and ask ourselves what we are trying to accomplish here. Are we trying to check that things work as intended? In that case, we shouldn't file an issue for this case - it is performing as intended. Your "expectation" was simply at odds with the documented behavior. It might indicate we need to clarify the docs (meaning we should file a documentation issue), but not merit changing the behavior. Trying to modify/expand the cmd line options to meet someone's expectations would be a rather intensive task - I would have to question the value and reality of it. We could easily wind up chasing our tails as one person's expectations are unlikely to match those of another. Just to be clear: I'm not questioning the value of testing all these options. I appreciate someone doing so. I'm only trying to understand the motivation for filing issues on things that are working as documented/intended. |
I started working on PRRTE with @jjhursey last summer and am in the process of learning the code. He asked me to do some testing with various combinations of map, bind, and rank options, which is the origin of the issues I have written. I'm not trying to push PRRTE in any specific direction, just trying to shake out bugs in the code. I found cases which did not seem to be working right while I was testing. I discussed these with Josh and wrote issues only when he also thought the behavior might be wrong. If what I think is a bug is something that PRRTE was clearly not intended to do, I'll accept that, or if it's a documentation cleanup problem I'll accept that too. In this case, the only place I found any mention of the PE-LIST option was in the prterun --help output and in the draft documents in pull request #557. All that text says is that PE-LIST is a comma-delimited list of CPUs, it doesn't say anything about ordered or unordered lists. If there's other documentation that clarifies it, I don't know where that is. If this is a documentation cleanup, that's fine. I realize changing PRRTE to pass around a list of CPUs with specific ordering is not a simple fix, but also don't know where it lands in terms of what's most important. |
I'm aware of the history - like I said, I'm not questioning the work. I'm just still trying to figure out how to respond to this "issue flood". It seems like these fall into a few categories:
Probably more categories in there, but hopefully this provides some food for thought. What I'm ultimately looking for is a way to label these issues so we can more easily deal with them. If we can come up with a reasonably simple way of categorizing them, then we can create a label for each category and appropriately mark the issues - and it is okay to put multiple labels on the issue if it somewhat spans categories, though it would be preferred if we can contain that. |
Dave and I have been testing based on the documentation we have - in the man pages, FAQs, and help output. Where it is ambiguous we have tried to test what we think it should do. Our goal is to test existing functionality, not necessarily invent new functionality or interpretations. We file tickets for any mismatch with the hope that we can dialog with the community on if it is a documentation issue, bug, invalid usage, or possibly a future feature. I think in the case of this Issue we hit on a documentation clarification issue, and possibly a future feature. I added a note to fixe the documentation issue in this comment. I think we should re-classify this ticket as a future, possible enhancement request for a |
I was about to make my own issue about this, but I have also experienced 'surprising' mapping with combinations of For example, the following option set yielded a crash, that looks like a bug
But other combinations would do unexpected things silently (like ranking by package when requesting rank-by node, and other bizarre things). |
I'm going to look at the crash, but the equivalent of |
This issue is an enhancement request to add a
|
Thank you for taking the time to submit an issue!
Background information
What version of the PMIx Reference Server are you using? (e.g., v1.0, v2.1, git master @ hash, etc.)
Master branch
What version of PMIx are you using? (e.g., v1.2.5, v2.0.3, v2.1.0, git branch name and hash, etc.)
Master branch
Please describe the system on which you are running
Operating system/version:
RHEL 7.7
Computer hardware:
4 Power 8 nodes, 2 sockets each with 10 cores (20 total) and 8 hwthreads per core (160 total)
Network type:
Hostfile specifies 4 nodes, each with slots=8 keyword
Details of the problem
If I run the command prterun -n 4 --hostfile hostfile8 --map-by core:PE-LIST=0,1,2,3 --bind-to core:report echo then I get output I expect
If I run the command prterun -n 4 --hostfile hostfile8 --map-by core:PE-LIST=3,2,1,0 --bind-to core:report echo I get the same output, where I expect something like
I think the ability to specify an arbitrary mapping is useful, such as when trying to place tasks for memory or cache affinity.
The problem is that when the set of resources specified in the PE-LIST option is handled as a set instead of a list, which means that it is impossible to retain any concept of ordering in the PE-LIST resources.
The text was updated successfully, but these errors were encountered: