do not slice unencoded pmi messages #238
pmix_vallen_max is the max size of an encoded key value, so there is no need to slice the unencoded pmi message in the first place.
@hjelmn @hppritcha @elenash Can you folks please take a look at this?
I don't have any objections to this patch; it just moves all the pmi_puts into pmix_fence instead of doing them earlier (during pmix_put). But could you explain how it fixes the original problem?
Without the patch, when you put a key that grows pmix_packed_data from 200 to 300 bytes, 256 bytes of unencoded data are sent immediately. The receiver will not try to get key-2 because key-1 is minus terminated. Another way to fix this is to slice the encoded key.
Gilles
Just to provide some background on why this code even exists. Cray imposes a limit on the number of keys an application can commit to PMI. At large scale, OMPI was running into this limit, and so Nathan created the ability to "pack" our data into "meta-keys" so that every key pushed to PMI contained a full maxlen of info in it. This compressed the data into a minimum number of keys. Of course, you then had to "uncompress" the meta-key data on the receiving end. Ugly, but it resolved the problem by significantly reducing the number of keys pushed to PMI.

I don't know what broke when we refactored the code to create the pmix framework, but this used to work quite well.

I don't really have a stake in this code as pmix never uses it (we pack the data into buffers instead of using this method). My only reason for asking for a review was to ensure that those who do care can verify that the compression remains in operation so that Cray machines don't encounter the key limit problem again.
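For readers unfamiliar with the meta-key idea, here is a minimal sketch of the packing scheme described above. It is a toy model in Python, not the OPAL implementation: the record format (`key=value;`), the function names, and the value of `VALLEN_MAX` are all assumptions chosen for illustration, and it assumes keys and values contain neither `=` nor `;`.

```python
VALLEN_MAX = 256  # hypothetical stand-in for pmix_vallen_max


def pack_into_meta_keys(entries, vallen_max=VALLEN_MAX):
    """Concatenate 'key=value;' records and cut the result into chunks.

    Every chunk except possibly the last is exactly vallen_max characters,
    so the number of values pushed to the key-value store is minimal.
    """
    packed = "".join(f"{k}={v};" for k, v in entries.items())
    return [packed[i:i + vallen_max] for i in range(0, len(packed), vallen_max)]


def unpack_meta_keys(chunks):
    """Reassemble the chunks and split them back into individual records."""
    packed = "".join(chunks)
    return dict(rec.split("=", 1) for rec in packed.split(";") if rec)
```

In this model, a hundred small entries collapse into a handful of meta-keys, which is the effect that keeps a job under a per-application key limit.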
@rhc54 There shouldn't be such a problem with this patch. It still creates meta-keys, just later than it did previously.
@ggouaillardet I'm not sure I understood what you mean. Why does key-1 become minus terminated?
@elenash Because pmi_encode minus terminates the encoded string. In this scenario, pmi_encode is invoked twice by opal_pmix_base_commit_packed.
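The failure mode discussed in this thread can be reproduced with a small toy model. This is a sketch in Python under stated assumptions, not the actual OPAL code: `toy_encode` stands in for pmi_encode and is assumed to be base64 plus a trailing `-`, the receiver is assumed to stop reading at the first value that ends with `-`, and all function names and the `key-N` naming are hypothetical.

```python
import base64

VALLEN_MAX = 256  # hypothetical stand-in for pmix_vallen_max


def toy_encode(data: bytes) -> str:
    """Base64-encode and append '-' as the end-of-message marker (assumed)."""
    return base64.b64encode(data).decode("ascii") + "-"


def put_sliced_unencoded(data: bytes) -> dict:
    """Problematic order: slice the raw bytes, then encode each slice.

    Every slice gets its own '-' terminator, and an encoded slice can be
    longer than VALLEN_MAX because base64 expands the data.
    """
    store = {}
    for i in range(0, len(data), VALLEN_MAX):
        store[f"key-{i // VALLEN_MAX + 1}"] = toy_encode(data[i:i + VALLEN_MAX])
    return store


def put_sliced_encoded(data: bytes) -> dict:
    """Alternative order: encode once, then slice the encoded string."""
    encoded = toy_encode(data)
    store = {}
    for i in range(0, len(encoded), VALLEN_MAX):
        store[f"key-{i // VALLEN_MAX + 1}"] = encoded[i:i + VALLEN_MAX]
    return store


def receive(store: dict) -> bytes:
    """Read key-1, key-2, ... until a value ends with the '-' terminator."""
    encoded = ""
    n = 1
    while f"key-{n}" in store:
        encoded += store[f"key-{n}"]
        n += 1
        if encoded.endswith("-"):
            break
    return base64.b64decode(encoded[:-1])
```

With 300 bytes of packed data, the first variant stores a terminated key-1, so `receive` returns only the first 256 bytes and never looks at key-2. The second variant has a single terminator at the very end, so the receiver reads both keys and recovers all 300 bytes.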
@ggouaillardet I see, thanks!
Test FAILed.
#245 is a better implementation that obsoletes this PR.
Don't register the PSM errhandler until it is certain that the PSM component can be used.
pmix_vallen_max is the max size of an encoded key value, so there
is no need to slice the unencoded pmi message in the first place.
if we really want to keep the unencoded pmi message small
refers #235 but for master only