The FailureSignalHandler may cause hang when the program is aborted #235

Closed
yyh-graphsql opened this issue Aug 22, 2017 · 4 comments

Comments

@yyh-graphsql

yyh-graphsql commented Aug 22, 2017

We use 'google::InstallFailureSignalHandler();' in our C++ project. Sometimes the failure signal handler hangs because of a malloc call reached while it collects the stack trace (google::GetStackTrace(), which calls glibc's backtrace()).

The signal handler interrupted an in-progress malloc call that already held the allocator lock. When the handler then reaches malloc itself, it waits on the same lock, which the interrupted code can never release, so the process deadlocks.

Is there any way to work around this?

The glog version we are using is 0.3.3.
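
For context, the hang described above is an instance of the general POSIX rule that a signal handler should only call async-signal-safe functions, and malloc() is not one of them. Below is a minimal sketch of a handler that stays within that rule. The names g_signal_seen, OnSignal and InstallSafeHandler are hypothetical and are not part of glog; the sketch assumes a POSIX system.

```cpp
#include <signal.h>   // sigaction, sig_atomic_t
#include <unistd.h>   // write, STDERR_FILENO

// Hypothetical flag, set from the handler and read from normal code.
volatile sig_atomic_t g_signal_seen = 0;

// Hypothetical handler: it only calls write(2), which is on the POSIX
// async-signal-safe list. It deliberately avoids malloc, printf and
// anything else that may take a lock held by the interrupted thread.
extern "C" void OnSignal(int /*signo*/) {
  static const char kMsg[] = "signal received\n";
  ssize_t ignored = write(STDERR_FILENO, kMsg, sizeof(kMsg) - 1);
  (void)ignored;  // nothing useful can be done on failure here
  g_signal_seen = 1;
}

// Hypothetical helper: install OnSignal for the given signal.
// Returns true on success.
bool InstallSafeHandler(int signo) {
  struct sigaction sa = {};
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = 0;
  sa.sa_handler = OnSignal;
  return sigaction(signo, &sa, nullptr) == 0;
}
```

Calling InstallSafeHandler(SIGUSR1) and then raise(SIGUSR1) should run the handler without touching the allocator. A real failure handler needs more than this (a stack trace, then re-raising the signal), which is where the malloc inside backtrace() becomes a problem.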

@yyh-graphsql
Author

OS information:
Linux 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1 (2016-12-30) x86_64 x86_64 x86_64 GNU/Linux
PRETTY_NAME="Debian GNU/Linux 8 (jessie)"
NAME="Debian GNU/Linux"
VERSION_ID="8"
VERSION="8 (jessie)"
ID=debian

gcc version:
gcc version 4.8.2 20140120 (Red Hat 4.8.2-15) (GCC)

Glibc:
admin@ip-172-31-21-217:~/aug_20_patch$ /home/admin/.syspre/lib64/libc.so.6
GNU C Library (Debian GLIBC 2.19-18+deb8u10) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.4.
Compiled on a Linux 3.16.43 system on 2017-06-18.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC

@andyleejordan
Contributor

Do you know which implementation of GetStackTrace your build is using? I ask because there are several (here's the libunwind one), and I know some care has been taken to avoid this deadlock situation.

It would be helpful to know which one has this problem. Did you build glog yourself or use it from a package?

@yyh-graphsql
Author

@andschwa I think it is compiled with src/stacktrace_generic-inl.h. The backtrace is as follows:

(gdb) bt
#0  __lll_lock_wait_private () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x00007fdc0b9bd4da in _L_lock_10381 () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fdc0b9bb015 in __GI___libc_malloc (bytes=56) at malloc.c:2887
#3  0x00007fdc0d821aae in _dl_map_object_deps (map=map@entry=0x7fdc0da294e8, preloads=preloads@entry=0x0, npreloads=npreloads@entry=0, 
    trace_mode=trace_mode@entry=0, open_mode=open_mode@entry=-2147483648) at dl-deps.c:511
#4  0x00007fdc0d827c6c in dl_open_worker (a=a@entry=0x7fff461bdae8) at dl-open.c:272
#5  0x00007fdc0d823aa4 in _dl_catch_error (objname=objname@entry=0x7fff461bdad8, errstring=errstring@entry=0x7fff461bdae0, 
    mallocedp=mallocedp@entry=0x7fff461bdad7, operate=operate@entry=0x7fdc0d827b60 <dl_open_worker>, args=args@entry=0x7fff461bdae8)
    at dl-error.c:187
#6  0x00007fdc0d82762b in _dl_open (file=0x7fdc0baa1ce6 "libgcc_s.so.1", mode=-2147483647, caller_dlopen=<optimized out>, nsid=-2, argc=17, 
    argv=0x7fff461bf048, env=0x7fff461bf0d8) at dl-open.c:661
#7  0x00007fdc0ba5ca22 in do_dlopen (ptr=ptr@entry=0x7fff461bdd10) at dl-libc.c:87
#8  0x00007fdc0d823aa4 in _dl_catch_error (objname=0x7fff461bdcf0, errstring=0x7fff461bdcf8, mallocedp=0x7fff461bdcef, 
    operate=0x7fdc0ba5c9e0 <do_dlopen>, args=0x7fff461bdd10) at dl-error.c:187
#9  0x00007fdc0ba5cabf in dlerror_run (operate=operate@entry=0x7fdc0ba5c9e0 <do_dlopen>, args=args@entry=0x7fff461bdd10) at dl-libc.c:46
#10 0x00007fdc0ba5cb31 in __GI___libc_dlopen_mode (name=name@entry=0x7fdc0baa1ce6 "libgcc_s.so.1", mode=mode@entry=-2147483647)
    at dl-libc.c:163
#11 0x00007fdc0ba34455 in init () at ../sysdeps/x86_64/backtrace.c:52
#12 0x00007fdc0d3ea410 in pthread_once () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:103
#13 0x00007fdc0ba3456c in __GI___backtrace (array=<optimized out>, size=64) at ../sysdeps/x86_64/backtrace.c:103
#14 0x0000000000898920 in google::GetStackTrace (result=result@entry=0x7fff461be000, max_depth=max_depth@entry=32, 

@shinh
Collaborator

shinh commented Aug 23, 2017

As written in https://github.com/google/glog/blob/master/INSTALL#L15, libunwind would definitely be the best solution.

If I remember correctly and nothing has changed, glibc's backtrace() calls malloc() only the first time it is called. If that is correct, calling DumpStackTraceToString, or backtrace from execinfo.h, at the beginning of your program could mitigate the issue. If this works, it would probably be a good idea to do the same in InstallFailureSignalHandler.
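
The mitigation suggested above could be sketched as follows, assuming glibc and the execinfo.h-based stack trace path. PrewarmBacktrace is a hypothetical helper name, not a glog function. Whether one early call is enough depends on the glibc version behaving as recalled above, so treat this as something to try rather than a guaranteed fix.

```cpp
#include <execinfo.h>  // backtrace(), glibc-specific

// Hypothetical helper, not part of glog. Calls backtrace() once from normal
// (non-signal) context so that glibc's one-time setup, the pthread_once ->
// dlopen("libgcc_s.so.1") path visible in the gdb output above, runs while
// taking the malloc lock is still safe.
// Returns the number of frames captured; a value > 0 suggests the unwinder
// was loaded successfully.
inline int PrewarmBacktrace() {
  void* frames[4];
  return backtrace(frames, 4);
}

// Usage sketch (assumes glog is linked into the program):
//
//   int main(int argc, char** argv) {
//     PrewarmBacktrace();                     // before any handler can fire
//     google::InitGoogleLogging(argv[0]);
//     google::InstallFailureSignalHandler();
//     // rest of the program
//   }
```

This only removes the first-call allocation inside backtrace(). Any other allocation made by the failure handler would still be able to deadlock in the same way, which is one reason the libunwind build remains the preferred option.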
