memchecker tosses on multiple non-blocking send of the same buffer #2919

Closed
sthibaul opened this issue Feb 3, 2017 · 1 comment

sthibaul commented Feb 3, 2017

Hello,

See the attached testcase:

test.txt

to be run with -np 3. It posts several non-blocking sends of the same buffer. With memchecker enabled, this produces the following warnings (seen with Open MPI 1.6.5, but the same happens with Open MPI 2):

==19685== Invalid read of size 1
==19685==    at 0x4C2E750: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1018)
==19685==    by 0x4F48140: opal_convertor_pack (opal_convertor.c:251)
==19685==    by 0xECA9CDC: mca_btl_sm_sendi (btl_sm.c:849)
==19685==    by 0xDE4B874: mca_bml_base_sendi (bml.h:304)
==19685==    by 0xDE4B874: mca_pml_ob1_send_request_start_copy (pml_ob1_sendreq.c:467)
==19685==    by 0xDE3E57B: mca_pml_ob1_send_request_start_btl (pml_ob1_sendreq.h:375)
==19685==    by 0xDE3E57B: mca_pml_ob1_send_request_start (pml_ob1_sendreq.h:441)
==19685==    by 0xDE3E57B: mca_pml_ob1_isend (pml_ob1_isend.c:87)
==19685==    by 0x4EB2EB7: PMPI_Isend (pisend.c:84)
==19685==    by 0x400A06: main (in /net/inria/home/sthibault/test)
==19685==  Address 0xffefff5af is on thread 1's stack
==19685==  in frame #6, created by main (???:)
==19685== 
==19685== Unaddressable byte(s) found during client check request
==19685==    at 0x4F464A1: valgrind_module_isdefined (memchecker_valgrind_module.c:112)
==19685==    by 0x4EB3015: memchecker_call (memchecker.h:104)
==19685==    by 0x4EB3015: PMPI_Isend (pisend.c:46)
==19685==    by 0x400A3A: main (in /net/inria/home/sthibault/test)
==19685==  Address 0xffefff5af is on thread 1's stack
==19685==  in frame #2, created by main (???:)

These warnings apparently occur because MPI_Isend, in isend.c, ends with this call:

    memchecker_call(&opal_memchecker_base_mem_noaccess, buf, count, type);

This is probably meant to catch any attempt by the application to write to the buffer being sent. AIUI, though, the standard does not prevent the application from calling another MPI_Isend on the same buffer, nor does it prevent the application from reading the buffer. Ideally the noaccess call would be replaced by something like a nowrite call, but Valgrind does not seem to have such a "read-only" request yet, so in the meantime I'd say Open MPI should drop the noaccess call.
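To make the reasoning concrete, here is a toy model in plain C of per-byte accessibility tracking. It is only a sketch under stated assumptions: it is not Open MPI or Valgrind code, and every name in it (`toy_reset`, `toy_mark_noaccess`, `toy_count_unaddressable`, `toy_isend`, `ARENA_SIZE`) is hypothetical. It only illustrates why marking the buffer noaccess at the end of one send makes the check at the start of the next send on the same buffer fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model, not Open MPI or Valgrind code. */
#define ARENA_SIZE 64

/* One flag per byte: true means the byte may be read or written. */
static bool accessible[ARENA_SIZE];

/* Reset the model: every byte starts out accessible. */
void toy_reset(void)
{
    for (size_t i = 0; i < ARENA_SIZE; i++)
        accessible[i] = true;
}

/* Model of a "noaccess" request on bytes [off, off + len). */
void toy_mark_noaccess(size_t off, size_t len)
{
    assert(off + len <= ARENA_SIZE);
    for (size_t i = off; i < off + len; i++)
        accessible[i] = false;
}

/* Model of the check done on entry to a send:
 * returns how many bytes in the range would be flagged. */
size_t toy_count_unaddressable(size_t off, size_t len)
{
    size_t flagged = 0;

    assert(off + len <= ARENA_SIZE);
    for (size_t i = off; i < off + len; i++)
        if (!accessible[i])
            flagged++;
    return flagged;
}

/* Model of a non-blocking send: check the buffer on entry, then
 * optionally mark it noaccess on exit (the behaviour discussed above).
 * Returns the number of bytes the entry check flagged. */
size_t toy_isend(size_t off, size_t len, bool mark_noaccess_after)
{
    size_t flagged = toy_count_unaddressable(off, len);

    if (mark_noaccess_after)
        toy_mark_noaccess(off, len);
    return flagged;
}
```

In this model, after `toy_reset()`, two successive calls to `toy_isend(0, 1, true)` return 0 and then 1: the second send is flagged even though the application never wrote to the buffer. With `mark_noaccess_after` set to `false`, both calls return 0, which corresponds to dropping the noaccess call.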

jsquyres added the bug label May 16, 2017
sthibaul (author) commented

Apparently this is fixed in openmpi 4.0.2:

    /*
     * today's MPI standard mandates the send buffer remains accessible during the send operation
     * hence memchecker cannot mark buf as non accessible, but it might mark buf as read-only in
     * order to trap end user errors. Unfortunatly valgrind does not support marking buffers as read-only,
     * so there is pretty much nothing we can do here.
     */
