After fork, child process will crash in close invoking #751
Comments
Hi @qianlong-ql, |
Hi @qianlong-ql
instead of |
I tried replacing rpma_peer_new & rpma_mr_reg with ibv_alloc_pd & ibv_reg_mr, and it also crashes in fclose. I also tried updating the RPMs to nic-libs-mellanox-rdma-3.0.2-1.x86_64 & nic-drivers-mellanox-rdma-3.0.2-10.noarch, but that did not help. |
Do we know that |
yes |
Can you run this code under valgrind-memcheck? |
It doesn't crash when run under valgrind. Here is the output; I don't see any useful information in it.
|
Let's try to get more information with following option: |
replace fopen & fclose like below
and backtrace become
run valgrind with additional params and output:
|
hm, what is |
@qianlong-ql Hi, What OS-version does it occur on? I cannot reproduce it (using |
I cloned an environment that reproduces this problem and sent the address & password to Tomasz Gromadzki by email. |
OK, thanks, I have tested it. Could you download and save the source rpm |
nic-libs-mellanox-rdma-3.0.2-1.x86_64.rpm does not match the driver on this machine. I reverted nic-libs-mellanox-rdma to version 2.0.1-2 and put the main source in the directory /root/rpm_packet/nic-libs-mellanox-rdma-2.0.1 |
Thanks! |
Hi @qianlong-ql
|
Thanks, I got the key point: private memory shouldn't be used for RDMA if fork is used. But why is it safe when the page size is set to 2MB? |
It is not safe when pagesize is set to 2MB. It just does not crash, but I cannot guarantee that other things will work correctly. |
The memory region passed to rpma_mr_reg() (ibv_reg_mr()) cannot be allocated from the heap (using malloc() or posix_memalign()), but should be mapped using mmap(). Rationale: If ibv_fork_init() was called, the memory region passed to ibv_reg_mr() is marked by this function with the flag “do not copy on fork” (MADV_DONTFORK) and, after fork() has been called, the child process does not receive this range of virtual addresses. If this memory region was allocated from the heap, the child process receives a corrupted heap with a “hole” of inaccessible addresses inside. A memory allocator knows nothing about this “hole” and if it tries to access (read or write) that range of virtual addresses, it causes a segfault. Ref: pmem#751
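The rationale above can be sketched without any RDMA hardware. The following is only an illustration under stated assumptions: alloc_mr_buffer(), mark_dontfork() and child_heap_survives_fork() are hypothetical helper names, the mmap() flags are an assumption (the thread does not say which flags the fix uses), and madvise(MADV_DONTFORK) stands in for the marking that memory registration performs after ibv_fork_init(). It is Linux-specific and is not the code from the fix.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map an anonymous, page-aligned region to be used as
 * the buffer for memory registration, instead of malloc()/posix_memalign().
 * The mapping flags are an assumption. Returns NULL on failure.
 */
static void *alloc_mr_buffer(size_t size)
{
	assert(size > 0);
	void *ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	return (ptr == MAP_FAILED) ? NULL : ptr;
}

/*
 * Stand-in for the marking done at registration time when ibv_fork_init()
 * was called: the range is excluded from the child's address space.
 * Returns 0 on success, -1 on error (as madvise() does).
 */
static int mark_dontfork(void *ptr, size_t size)
{
	return madvise(ptr, size, MADV_DONTFORK);
}

/*
 * Fork and let the child exercise the heap allocator and stdio
 * (fopen()/fclose(), as in the reported backtrace).
 * Returns 0 if the child exited normally with status 0, -1 otherwise.
 */
static int child_heap_survives_fork(void)
{
	pid_t pid = fork();
	if (pid < 0)
		return -1;

	if (pid == 0) {
		/* child: the heap is intact because the excluded range is
		 * a separate mapping, not part of the heap */
		void *p = malloc(4096);
		if (p == NULL)
			_exit(1);
		memset(p, 0, 4096);
		free(p);

		FILE *f = fopen("/dev/null", "w");
		if (f == NULL)
			_exit(1);
		fclose(f);
		_exit(0);
	}

	int status = 0;
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A caller would obtain the buffer with alloc_mr_buffer(), pass it to the registration call (here replaced by mark_dontfork()), fork, and release it with munmap() when done. Because the excluded range lives in its own mapping, the child's allocator never walks into the missing addresses.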
@qianlong-ql The fix #866 has been merged. Let us know, if it fixes this issue, please. |
@qianlong-ql ping |
@ldorau I have been on vacation and I will test and let you know as soon as possible after my vacation |
OK |
@ldorau The issue has been fixed |
@qianlong-ql Thanks for confirmation! Closing ... |
The code above invokes ibv_fork_init to support fork. After fork, the child process crashes in fclose. The crash does not happen if either ibv_fork_init or rpma_mr_reg is not invoked. I also found that it is safe if ibv_fork_init and rpma_mr_reg are invoked in different threads.
The backtrace looks like this:
#0 0x00007f9f0d2dd81d in _int_free () from /lib64/libc.so.6
#1 0x00007f9f0d2ca047 in fclose@@GLIBC_2.2.5 () from /lib64/libc.so.6
#2 0x0000000000402198 in main (argc=-1, argv=0x800076)