HANG in native state after detach on api.detach_spawn test #2690

Open
derekbruening opened this issue Nov 6, 2017 · 0 comments

Running my new test for #2601 api.detach_spawn in a loop, after fixing
#2601 and #2688, in one run I saw a hang after detach finished:

Looks completely native except dynamo_exited* aren't set -- ok, that's
normal, dynamo_exit_post_detach clears them all for a possible re-attach:

(gdb) p dynamo_initialized
$1 = false
(gdb) p dynamo_exited
$2 = false
(gdb) p doing_detach
$3 = false
(gdb) p dynamo_detaching_flag
$4 = -1
(gdb) p dynamo_exited_all_other_threads
$5 = false
(gdb) p dynamo_exited_all_and_cleaned
No symbol "dynamo_exited_all_and_cleaned" in current context.
(gdb) p dynamo_exited_and_cleaned
$6 = false

So we're completely native. It's not clear why it's stuck: there are 10
threads and they all seem to be running the 'print(".")' call, but the
callstacks are hard to decipher: are they all blocked on some fprintf lock?

From base of stack up I get this far:

0xed2fbde0  0xed2fc300  No symbol matches (void *)$retaddr.
0xed2fbde4  0xf753c9e5  vfprintf + 469 in section .text of /lib/i386-linux-gnu/libc.so.6

0xed2fc300  0xed2fc31c  No symbol matches (void *)$retaddr.
0xed2fc304  0x08049d9c  print + 54 in section .text of /home/bruening/dr/git/build_x86_dbg_tests/suite/tests/bin/api.detach_spawn

0xed2fc31c  0xed2fc358  No symbol matches (void *)$retaddr.
0xed2fc320  0x080495ba  parent_func + 42 in section .text of /home/bruening/dr/git/build_x86_dbg_tests/suite/tests/bin/api.detach_spawn

0xed2fc358  0xed2fc428  No symbol matches (void *)$retaddr.
0xed2fc35c  0xf76b0f72  start_thread + 210 in section .text of /lib/i386-linux-gnu/libpthread.so.0
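
The pairs above follow the usual x86 frame-pointer layout: the word at each frame address holds the caller's saved ebp, and the word 4 bytes above it holds the return address. A minimal sketch of that walk, run over a copy of the eight words quoted above rather than live memory (the `stack_word` table and the helper names are made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 32-bit word of the dumped stack: its address and its contents. */
struct stack_word { uint32_t addr; uint32_t value; };

/* The eight words quoted in the listing above. */
static const struct stack_word dump[] = {
    {0xed2fbde0, 0xed2fc300}, {0xed2fbde4, 0xf753c9e5},
    {0xed2fc300, 0xed2fc31c}, {0xed2fc304, 0x08049d9c},
    {0xed2fc31c, 0xed2fc358}, {0xed2fc320, 0x080495ba},
    {0xed2fc358, 0xed2fc428}, {0xed2fc35c, 0xf76b0f72},
};

/* Look up one word in the dump; returns 0 if the address is not recorded. */
static int read_word(uint32_t addr, uint32_t *out)
{
    for (size_t i = 0; i < sizeof(dump) / sizeof(dump[0]); i++) {
        if (dump[i].addr == addr) {
            *out = dump[i].value;
            return 1;
        }
    }
    return 0;
}

/* Follow saved-ebp links from the given frame toward the stack base,
 * collecting return addresses. Returns the number of frames found. */
static size_t walk_frames(uint32_t ebp, uint32_t *retaddrs, size_t max)
{
    size_t n = 0;
    uint32_t saved_ebp, ret;
    assert(retaddrs != NULL);
    while (n < max && read_word(ebp, &saved_ebp) && read_word(ebp + 4, &ret)) {
        retaddrs[n++] = ret;
        if (saved_ebp <= ebp)
            break; /* a valid chain always moves to higher addresses */
        ebp = saved_ebp;
    }
    return n;
}
```

Starting at 0xed2fbde0, the walk yields the vfprintf, print, parent_func and start_thread return addresses and then stops because 0xed2fc428 is not in the dump. The walk only works for code that keeps ebp as a frame pointer, which may be why the chain cannot be followed into the libc internals.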

Continuing up the stack from the vfprintf frame, the next thing I hit is
somewhere around a funlockfile function pointer, but there's no return
address near it:

(gdb) x/2i 0xf753c9e5-5
   0xf753c9e0 <vfprintf+464>:	call   0xf75415f0
   0xf753c9e5 <vfprintf+469>:	lea    -0xc(%ebp),%esp
(gdb) x/23i 0xf75415f0
   0xf75415f0:	push   %ebp
   0xf75415f1:	push   %edi
   0xf75415f2:	push   %esi
   0xf75415f3:	mov    %eax,%esi
   0xf75415f5:	push   %ebx
   0xf75415f6:	sub    $0x20ec,%esp

0xed2f9ce8  0xed2f9d10  No symbol matches (void *)$retaddr.
0xed2f9cec  0xf76b9170  funlockfile in section .text of /lib/i386-linux-gnu/libpthread.so.0

Going from the top down we're in libc but gdb won't name the routine:
(gdb) x/5i $pc-2
   0xf76eac8e:	int    $0x80
=> 0xf76eac90:	pop    %ebp
   0xf76eac91:	pop    %edx
   0xf76eac92:	pop    %ecx
   0xf76eac93:	ret    
(gdb) dps $esp $esp+64
0xed2f9cc8  0x00000001  No symbol matches (void *)$retaddr.
0xed2f9ccc  0x00000002  No symbol matches (void *)$retaddr.
0xed2f9cd0  0x00000080  No symbol matches (void *)$retaddr.
0xed2f9cd4  0xf75f48b1  No symbol matches (void *)$retaddr.
(gdb) x/10i 0xf75f48b1-12
   0xf75f48a5:	mov    $0xf0,%eax
   0xf75f48aa:	call   *%gs:0x10
   0xf75f48b1:	mov    %edx,%eax
   0xf75f48b3:	xchg   %eax,(%ebx)

The return address 0xf75f48b1 falls inside libc's text mapping:
f74f9000-f76a4000 r-xp 00000000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so
f76a4000-f76a6000 r--p 001aa000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so
f76a6000-f76a7000 rw-p 001ac000 fc:01 12321400                           /lib/i386-linux-gnu/libc-2.19.so

The syscall number loaded into eax is 0xf0 == 240 == futex.

(gdb) info reg
eax            0xfffffe00	-512
ecx            0x80	128                  == flag
edx            0x2	2                    == mustbe
ebx            0xf76a788c	-144017268   == int*futex
esp            0xed2f9cc8	0xed2f9cc8
ebp            0x1	0x1                  == val3
esi            0x0	0                    == timeout
edi            0x1	1                    == int*uaddr2

#define FUTEX_WAIT		0
#define FUTEX_PRIVATE_FLAG	128
#define FUTEX_WAIT_PRIVATE	(FUTEX_WAIT | FUTEX_PRIVATE_FLAG)
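
To read the ecx value against these defines, the op word can be split into a command and the private flag. A small sketch, using only the constants quoted above plus FUTEX_WAKE from the same kernel header (the helper names are made up for illustration, and other flag bits such as FUTEX_CLOCK_REALTIME are ignored):

```c
#include <assert.h>

/* Values as in the kernel's futex header. */
#define FUTEX_WAIT          0
#define FUTEX_WAKE          1
#define FUTEX_PRIVATE_FLAG  128

/* Command part of the op word (ecx for the i386 futex syscall). */
static int futex_cmd(int op)
{
    assert(op >= 0);
    return op & ~FUTEX_PRIVATE_FLAG;
}

/* Nonzero if the op word carries the process-private flag. */
static int futex_is_private(int op)
{
    return (op & FUTEX_PRIVATE_FLAG) != 0;
}

/* Human-readable name for the two commands this sketch knows about. */
static const char *futex_cmd_name(int op)
{
    switch (futex_cmd(op)) {
    case FUTEX_WAIT: return "FUTEX_WAIT";
    case FUTEX_WAKE: return "FUTEX_WAKE";
    default:         return "other";
    }
}
```

With ecx == 0x80 this gives FUTEX_WAIT with the private flag set, i.e. FUTEX_WAIT_PRIVATE: the thread is waiting for the word at ebx to stop being equal to edx (2).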

Looks like the routine starts here:

   0xf75f4890:	push   %edx

So its caller is:

(gdb) x/4i 0xf754182e-5
   0xf7541829:	call   0xf75f4890
   0xf754182e:	jmp    0xf75416ee

Anyway it's a futex inside vfprintf: hard to get more when these libc
routines apparently have no names known to gdb.
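
For reference, the xchg after the syscall and the expected value of 2 look like the common three-state futex lock (0 = free, 1 = locked, 2 = locked with possible waiters). Whether this is exactly what libc is doing here is an assumption; the sketch below is not glibc's code and all names in it are hypothetical:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Lock word states: 0 = free, 1 = locked, 2 = locked, maybe with waiters. */

static void sketch_lock(atomic_int *word)
{
    int expected = 0;
    assert(word != NULL);
    /* Fast path: uncontended 0 -> 1. */
    if (atomic_compare_exchange_strong(word, &expected, 1))
        return;
    /* Slow path: mark the lock contended; sleep while it is still held. */
    while (atomic_exchange(word, 2) != 0)
        syscall(SYS_futex, word, FUTEX_WAIT_PRIVATE, 2, NULL, NULL, 0);
}

static void sketch_unlock(atomic_int *word)
{
    assert(word != NULL);
    /* Only issue the wake syscall if someone may be sleeping. */
    if (atomic_exchange(word, 0) == 2)
        syscall(SYS_futex, word, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
}
```

In a lock of this shape, a word left at 2 with no thread that will ever call the unlock path makes every later locker sleep in FUTEX_WAIT forever, which would look like the state seen here. That is only one possible explanation.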

Did we mess up something about stderr? Running 300x w/o DR I don't see a
hang.
