v3.0.x vader on PPC (wmb moved to end of set_header) #4937
@hjelmn This is the issue I mentioned on the phone.
P.S. I tested with the wmb() call at the top of set_header() and it seemed to work.
markalle added a commit to markalle/ompi that referenced this issue on Mar 22, 2018:
I have some tests that failed on PPC with the recent vader changes. I'm suspicious of the set_header() function, where there's a wmb() call. It looks like it boils down to: set some data; set the header to indicate the data is available; wmb. I think the wmb needs to go up a line. More details here: open-mpi#4937, with a copy of the "maxsoak.c" testcase at https://gist.github.com/markalle/a1c203297cb6af22a3fb5c24e62b2ba3
I made a PR with my proposed fix:
markalle added a commit to markalle/ompi that referenced this issue on Mar 22, 2018:
I have some tests that failed on PPC with the recent vader changes. I'm suspicious of the set_header() function, where there's a wmb() call. It looks like it boils down to: set some data; set the header to indicate the data is available; wmb. I think the wmb needs to go up a line. More details here: open-mpi#4937, with a copy of the "maxsoak.c" testcase at https://gist.github.com/markalle/a1c203297cb6af22a3fb5c24e62b2ba3
Signed-off-by: Mark Allen <markalle@us.ibm.com>
From what I understand, this is fixed.
This is for the v3.0.x branch.
I have some tests that fail with vader on PPC (they pass on x86 because its memory ordering rules are stricter). It looks to me like one of the wmb calls has been moved. I don't know much about what vader is doing, but I'm guessing the use of the function mca_btl_vader_fbox_set_header() should boil down to: set some data; wmb; set the header to indicate the data is available.
But the fbox_set_header function has its wmb() call at the bottom, so I think it's probably ending up as: set some data; set the header to indicate the data is available; wmb.
That wouldn't ensure the data is visible to the reader before the header is.
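The ordering concern here can be sketched with C11 atomics. This is only an illustration under stated assumptions, not vader's actual code: the `demo_fbox_t` type and the `demo_fbox_publish` / `demo_fbox_poll` functions are hypothetical, and `atomic_thread_fence(memory_order_release)` stands in for `wmb()`. The point is just that the barrier sits between the data write and the header write, so a reader that sees the header also sees the data.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mailbox slot for illustration; not vader's real layout. */
typedef struct {
    _Atomic uint32_t header;    /* 0 = empty, otherwise the payload length */
    unsigned char data[64];
} demo_fbox_t;

/* Writer: payload first, then the barrier, then the header. */
static void demo_fbox_publish(demo_fbox_t *box, const void *payload, uint32_t len)
{
    assert(len > 0 && len <= sizeof box->data);
    memcpy(box->data, payload, len);                /* set some data */
    atomic_thread_fence(memory_order_release);      /* stand-in for wmb() */
    atomic_store_explicit(&box->header, len,
                          memory_order_relaxed);    /* set the header */
}

/* Reader: returns the payload length, or 0 if nothing is published yet. */
static uint32_t demo_fbox_poll(demo_fbox_t *box, void *out)
{
    uint32_t len = atomic_load_explicit(&box->header, memory_order_relaxed);
    if (len == 0) {
        return 0;
    }
    atomic_thread_fence(memory_order_acquire);      /* pairs with the writer's fence */
    memcpy(out, box->data, len);
    atomic_store_explicit(&box->header, 0,
                          memory_order_release);    /* mark the slot as consumed */
    return len;
}
```

If the fence in `demo_fbox_publish` came after the header store instead, a weakly ordered CPU such as POWER could make the header visible before the payload, and the reader could copy stale bytes. x86 does not reorder stores with other stores, which would be consistent with the tests passing there.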
I can hit the problem using the below "maxsoak.c" testcase as
mpicc -o x maxsoak.c
mpirun -np 6 -mca pml ob1 -mca btl vader,self ./x
and the testcase will detect corruption.
For me the failure message from the testcase ends up something like