
Segmentation Fault in blas_shutdown() function #1692

Closed · changyp6 opened this issue Jul 21, 2018 · 3 comments

changyp6 commented Jul 21, 2018

I built Caffe from source and use the pycaffe interface in my program.
When I import caffe in python3 and then press Ctrl+D to exit, a segmentation fault is reported.

It happens every time.

Steps to Reproduce:

  1. Build Caffe from source (https://github.com/BVLC/caffe) and install it
  2. Run python3
  3. import caffe
  4. Press Ctrl+D to exit

Actual results:

lldb python3
(lldb) target create "python3"
Current executable set to 'python3' (x86_64).
(lldb) run
Process 26585 launched: '/usr/bin/python3' (x86_64)
Python 3.6.6 (default, Jul 19 2018, 14:25:17) 
[GCC 8.1.1 20180712 (Red Hat 8.1.1-5)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import caffe
>>> 
Process 26585 stopped
* thread #1, name = 'python3', stop reason = signal SIGSEGV: invalid address (fault address: 0x7fffd9fff008)
    frame #0: 0x00007fffe6fb01b3 libopenblasp.so.0`blas_shutdown at memory.c:1281
   1278     for (pos = 0; pos < BUFFERS_PER_THREAD; pos ++){
   1279       struct alloc_t *alloc_info = local_memory_table[thread][pos];
   1280       if (alloc_info) {
-> 1281         alloc_info->release_func(alloc_info);
   1282         alloc_info = (void *)0;
   1283       }
   1284     }
(lldb) bt
* thread #1, name = 'python3', stop reason = signal SIGSEGV: invalid address (fault address: 0x7fffd9fff008)
  * frame #0: 0x00007fffe6fb01b3 libopenblasp.so.0`blas_shutdown at memory.c:1281
    frame #1: 0x00007fffe6d82015 libopenblasp.so.0`gotoblas_quit at memory.c:1470
    frame #2: 0x00007ffff7de58e6 ld-linux-x86-64.so.2`_dl_fini + 518
    frame #3: 0x00007ffff6b3a72c libc.so.6`__run_exit_handlers + 316
    frame #4: 0x00007ffff6b3a85c libc.so.6`__GI_exit + 28
    frame #5: 0x00007ffff6b24252 libc.so.6`__libc_start_main + 242
    frame #6: 0x0000555555554e1a python3`_start + 42
(lldb) c
Process 26585 resuming
Process 26585 exited with status = 11 (0x0000000b)

Additional info:
I think I have found the root cause. Look at the code on lines 1279 and 1282 in the lldb output above:
on line 1279 a local variable "alloc_info" is declared and initialized with the value of "local_memory_table[thread][pos]";
on line 1282 only this local variable "alloc_info" is set to NULL;
the original slot "local_memory_table[thread][pos]" is left unchanged, i.e. it is NOT set to NULL.

If libopenblasp.so is loaded more than once, the first call to blas_shutdown() runs normally, but the second call walks the same table entries, which now point to already-released memory, and crashes with a segmentation fault.

The solution is to change line 1282 from "alloc_info = (void *)0;" to "local_memory_table[thread][pos] = (void *)0;".
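To make the failure mode concrete, here is a minimal standalone sketch of the pattern (my own illustration, not OpenBLAS code; the names table, release, shutdown_buggy and SLOTS are hypothetical stand-ins for local_memory_table, release_func, blas_shutdown and the per-thread buffer count): clearing only the local copy of the pointer leaves the table slot dangling, so a second shutdown pass calls the release function on memory that was already freed.

#include <stdio.h>
#include <stdlib.h>

struct alloc_t {
  void (*release_func)(struct alloc_t *);
};

static void release(struct alloc_t *info) {
  printf("releasing %p\n", (void *)info);
  free(info);
}

#define SLOTS 2
static struct alloc_t *table[SLOTS];   /* stand-in for local_memory_table */

static void shutdown_buggy(void) {
  for (int pos = 0; pos < SLOTS; pos++) {
    struct alloc_t *alloc_info = table[pos];   /* local copy of the slot */
    if (alloc_info) {
      alloc_info->release_func(alloc_info);
      alloc_info = (void *)0;   /* BUG: only the local variable is cleared */
      /* fix: clear the slot itself instead:
         table[pos] = (void *)0; */
    }
  }
}

int main(void) {
  for (int pos = 0; pos < SLOTS; pos++) {
    table[pos] = malloc(sizeof(struct alloc_t));
    table[pos]->release_func = release;
  }
  shutdown_buggy();   /* first pass frees the entries but leaves the slots set */
  shutdown_buggy();   /* second pass dereferences freed memory: use-after-free,
                         typically the SIGSEGV shown above */
  return 0;
}

With the one-line change of clearing table[pos] instead of the local variable, the second call sees NULL slots and returns without touching freed memory, which is what the patch below does for local_memory_table.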

I have created a patch based on OpenBLAS 0.3.1, which is attached.
I have tested it: after applying the patch and rebuilding OpenBLAS, the crash no longer occurs.

I have also reported this bug to the OpenBLAS package maintainer on Red Hat Bugzilla.

changyp6 (Author) commented Jul 21, 2018

Here is the patch to fix this issue; please take a look @xianyi @martin-frbg:

--- OpenBLAS-0.3.1/driver/others/memory.c.orig	2018-07-20 18:54:50.372917078 +0800
+++ OpenBLAS-0.3.1/driver/others/memory.c	2018-07-20 18:56:26.378444907 +0800
@@ -1279,7 +1279,7 @@ void blas_shutdown(void){
       struct alloc_t *alloc_info = local_memory_table[thread][pos];
       if (alloc_info) {
         alloc_info->release_func(alloc_info);
-        alloc_info = (void *)0;
+        local_memory_table[thread][pos] = (void *)0;
       }
     }
   }

You can download the patch from this link

martin-frbg (Collaborator) commented:

Good catch, thanks. (CC @oon3m0oo)

oon3m0oo (Contributor) commented:

Ahhh... yes, my fault. Nice catch!
