btl/vader: ensure the fast box tag is always read first #5829
On some platforms reading a 64-bit value is non-atomic, and it is possible that the two 32-bit values are read in the wrong order. To ensure the tag is always read first, this commit reads the tag before reading the full 64-bit value.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
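The ordering idea described above can be sketched in portable C11. This is only an illustration under stated assumptions, not the code from this commit: the type and function names (`fbox_hdr_t`, `fbox_snapshot_t`, `fbox_write_header`, `fbox_read_header`) and the field layout are hypothetical, a tag of 0 is assumed to mean "empty", and the sketch reads the remaining fields individually after the tag rather than re-reading a packed 64-bit value.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical fast-box header; field names and layout are illustrative
 * and are not taken from the vader sources. */
typedef struct {
    uint32_t size;          /* payload size in bytes */
    uint16_t seq;           /* sequence number */
    _Atomic uint16_t tag;   /* assumed: 0 means "no message yet" */
} fbox_hdr_t;

/* Private copy of a header taken by the reader. */
typedef struct {
    uint32_t size;
    uint16_t seq;
    uint16_t tag;
} fbox_snapshot_t;

/* Writer: fill in the other fields first, then publish the tag last
 * with release semantics so the earlier stores become visible with it. */
static inline void fbox_write_header(fbox_hdr_t *hdr, uint16_t tag,
                                     uint16_t seq, uint32_t size)
{
    assert(tag != 0);       /* 0 is reserved for "empty" in this sketch */
    hdr->size = size;
    hdr->seq = seq;
    atomic_store_explicit(&hdr->tag, tag, memory_order_release);
}

/* Reader: load the tag FIRST with acquire semantics. Only when it is
 * non-zero are the remaining fields read, so they cannot be observed
 * from before the writer published the tag.
 * Returns 1 if a message header was read, 0 if the box is empty. */
static inline int fbox_read_header(fbox_hdr_t *hdr, fbox_snapshot_t *out)
{
    uint16_t tag = atomic_load_explicit(&hdr->tag, memory_order_acquire);
    if (0 == tag) {
        return 0;
    }
    out->tag = tag;
    out->size = hdr->size;
    out->seq = hdr->seq;
    return 1;
}
```

A reader that polls `fbox_read_header` sees either an empty box or a header whose `size` and `seq` were written before the tag was published. On a platform where a single 64-bit load is split into two 32-bit loads, nothing guarantees which half is fetched first, which is the hazard this change targets.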
@jsquyres In theory this will fix the remaining vader issue.
@amckinstry Please verify.
@amckinstry This is in reference to #5638 -- the hang with vader on some architectures.
Unfortunately we now see a related crash on other codes (lammps):
#0 0x0000000000000000 in ()
#3 0x00007f7a700f8f83 in mca_btl_vader_component_progress () at btl_vader_component.c:702
It looks like hdr->tag is invalid, hence the segfault.
@amckinstry, I've opened #5842 to track this related crash you're seeing.
@amckinstry What platforms?
@hjelmn In @amckinstry's original Issue, he mentioned i386 + Debian: #5638
@gpaulsen Yeah, just want to see if that is still the case. If it is just i386 I cannot justify spending any more time at work on the issue. Someone else will need to look at it.
Ok thanks for mentioning that. |
Ok, that I can spend time on. Will take a look today. |