OSC/UCX crashes with c_flush from ibm test suite #5117
Comments
Oh, I should add that this is on a box where I only have xpmem installed. No Mellanox/IB hardware.
@hppritcha I think this NULL buffer bug is fixed by #5094. Could you try that patch with this test?
@hppritcha I ran your code with patch #5094, and it looks like the issue is fixed. Could you try running it again? Thanks!
Well, I tried with updated master (with #5094 merged) and now I see a new error: Running test
and if I look at the coredump I see
This is with UCX 1.3.0. I still see many failures in the ibm/onesided tests; at least 50% of them fail. Do you actually test these? I suspect that no one is testing UCX with xpmem support only, without InfiniBand. Most of the log files show output like
This appears to be fixed. Closing.
The OSC/UCX component on master fails for the c_flush.c test in the ibm test suite. With verbosity set to 100, one sees this error message from UCX:
[1525114642.639621] [primavera:3170 :0] ucp_mm.c:264 UCX ERROR Undefined address requires UCP_MEM_MAP_ALLOCATE flag
The problem appears to be that the UCX component can't handle NULL buffers being supplied to MPI_Win_create. Here's the test
I'm using UCX master at 1785c376beeff9.
The problem vanishes if one has all ranks do an MPI_Alloc_mem and supply the returned value to MPI_Win_create.