UCX SEGV in osc_ucx_component.c #5083
Comments
thanks @gpaulsen looking into this now |
@gpaulsen could you paste the command line and test code path to reproduce this error? |
@gpaulsen I found the test to reproduce this. I think MPI_Win_dynamic is wrong, modifying it now. |
@xinzhao3 Probably related: there is a problem when creating windows with empty (0-length) buffers using:
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil ; -*- */
/*
 *
 *  (C) 2003 by Argonne National Laboratory.
 *      See COPYRIGHT in top-level directory.
 */
#include <mpi.h>
#include <stdio.h>
// #include "mpitest.h"

#define ELEM_SIZE 8

int main( int argc, char *argv[] )
{
    int rank;
    int errors = 0, all_errors = 0;
    int *flavor, *model, flag;
    void *buf;
    MPI_Win window;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /** Create using MPI_Win_create() **/
    /* Rank 0 exposes an empty (0-length) window with a NULL base. */
    if (rank > 0)
        MPI_Alloc_mem(rank*ELEM_SIZE, MPI_INFO_NULL, &buf);
    else
        buf = NULL;

    MPI_Win_create(buf, rank*ELEM_SIZE, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &window);

    MPI_Win_get_attr(window, MPI_WIN_CREATE_FLAVOR, &flavor, &flag);
    if (!flag) {
        printf("%d: MPI_Win_create - Error, no flavor\n", rank);
        errors++;
    } else if (*flavor != MPI_WIN_FLAVOR_CREATE) {
        printf("%d: MPI_Win_create - Error, bad flavor (%d)\n", rank, *flavor);
        errors++;
    }

    MPI_Win_get_attr(window, MPI_WIN_MODEL, &model, &flag);
    if (!flag) {
        printf("%d: MPI_Win_create - Error, no model\n", rank);
        errors++;
    } else if ( ! (*model == MPI_WIN_SEPARATE || *model == MPI_WIN_UNIFIED) ) {
        printf("%d: MPI_Win_create - Error, bad model (%d)\n", rank, *model);
        errors++;
    }

    MPI_Win_free(&window);
    if (buf)
        MPI_Free_mem(buf);

    /* Collect the error count on rank 0 and shut MPI down cleanly. */
    MPI_Reduce(&errors, &all_errors, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
    MPI_Finalize();
    return all_errors ? 1 : 0;
} |
@xinzhao3 And the stack trace
|
@xinzhao3 do you need anything else from me? |
@gpaulsen I am working on this issue and am close to finishing. I will give an update at tomorrow's OMPI meeting. |
@xinzhao3 I can try it tomorrow, our system is in maintenance today. |
Per 2018-08-07 webex, @xinzhao3 is going to check to see if this was a UCX error / has already been resolved. If we end up release noting this saying that there's a UCX issue in version X.Y.Z yadda yadda yadda, that would be fine. |
@gpaulsen Ping. |
waiting on update from me. No update yet. |
No longer failing in v3.1.x with latest UCX. |
@xinzhao3 @jladd-mlnx
Many IBM tests on v3.1.x and on master have been failing for a number of weeks with a runtime SEGV due to the OSC UCX component.
I believe this should be easy to reproduce, though I'm not sure where the argument to the 'flavor' is coming from.
I think we should either block v3.1.x or disable the UCX OSC component for v3.1.x until we figure this out, given how easy it is to hit this issue.