r820 - r823 causes "make test" to hang #492

Open
derekbruening opened this issue Nov 28, 2014 · 9 comments
@derekbruening

From peterfeiner on June 08, 2011 10:21:36

What steps will reproduce the problem?

1. Check out the source code.
2. Update to revision 822 or 823 (820 and 821 do not compile).
3. Build and run the tests:

mkdir build
cd build
cmake -DBUILD_TESTS=ON ../
make -j20 && make test

What is the expected output? What do you see instead?

I expect to see all of the tests run. Instead, I just see

Running tests...
Test project /home/peter/dynamorio/build
Start 1: code_api|common.broadfun

and it hangs. top shows that common.broadfun is pegging a CPU.

Please use labels and text to provide additional information.

If I run ctest instead of make test, broadfun does not hang. r819 does not have this problem.

I'm running Ubuntu Linux 10.04 on an Intel Core i7.

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=492

@derekbruening

From qin.zhao@gmail.com on June 08, 2011 07:30:08

I have seen hangs like this before on my Linux machine. They were caused by a SIGSEGV recurring inside signal handling when it accessed a TLS field.
Can you try disabling the private loader to see whether it still happens?
If it does, disable mangle_app_seg as well to see whether that helps.

I also see different behavior between ctest and make test. Can you file a separate issue for that?

@derekbruening

From bruen...@google.com on June 08, 2011 07:35:49

"make test" basically just runs "ctest". Look in the Makefile:

test:
	@$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --cyan "Running tests..."
	/usr/bin/ctest --force-new-ctest-process $(ARGS)

Can you try with and without the "--force-new-ctest-process"?

xref http://www.cmake.org/pipermail/cmake/2007-June/014632.html

@derekbruening

From peterfeiner on June 08, 2011 08:49:55

(Sorry, I originally replied to the email thread. I'm copying it here for posterity.)

Removing --force-new-ctest-process does not fix the problem.

The simplest Makefile hangs too:

test:
ctest

Qin, what other issue do you want me to file? Also, how do I disable the private loader (other than reverting to r819)?

@derekbruening

From qin.zhao@gmail.com on June 08, 2011 09:08:04

I think we should file an issue about the different behavior of make test and ctest.

There are two ways to disable the private loader:

  1. change the default value of private_loader to false in core/optionsx.h, or
  2. run DR with option -ops "-no_private_loader"

@derekbruening

From bruen...@google.com on June 08, 2011 09:12:27

You said:

If I run ctest instead of make test, broadfun does not hang.

But now you say that this fails:

test:
ctest

This doesn't make sense. When you "run ctest instead of make test" were you running some other ctest on your path, or passing some args?

@derekbruening

From peterfeiner on June 08, 2011 11:47:49

Changing private_loader to false fixes the problem.

Derek, although we can't make sense of it, it is happening:

~/dynamorio/build$ make -j20

~/dynamorio/build$ which ctest
/usr/bin/ctest
~/dynamorio/build$ ctest
<all tests run -- as usual, some fail>
$ cat Makefile
...
test:
@$(CMAKE_COMMAND) -E cmake_echo_color --switch=$(COLOR) --cyan "Running tests..."
/usr/bin/ctest --force-new-ctest-process $(ARGS)
.PHONY : test
...
~/dynamorio/build$ make test
Running tests...
Test project /home/peter/dynamorio/build
Start 1: code_api|common.broadfun
<this hangs, so I press ctrl-c>
^Cmake: *** [test] Interrupt
~/dynamorio/build$ cat Makefile.simple
test:
which ctest
ctest
~/dynamorio/build$ make -f Makefile.simple
which ctest
/usr/bin/ctest
ctest
Test project /home/peter/dynamorio/build
Start 1: code_api|common.broadfun

^Cmake: *** [test] Interrupt

@derekbruening

From rnk@google.com on October 25, 2011 18:56:15

Does this still reproduce for you? There have been a handful of private loader fixes that Qin has put in over the last few months.

@derekbruening

From peterfeiner on October 28, 2011 18:08:13

Yes, this still happens.

broadfun hangs with this stack trace:

(gdb) where
#0 0x00000000710e61a2 in syscall_ready () from /home/peter/dynamorio/build/lib64/release/libdynamorio.so
#1 0x0000000071351100 in maps_iter_buf_lock () from /home/peter/dynamorio/build/lib64/release/libdynamorio.so
#2 0x00000000710f2c4f in futex_wait (lock=0x6) at /home/peter/dynamorio/core/linux/os.c:2565
#3 mutex_wait_contended_lock (lock=0x6) at /home/peter/dynamorio/core/linux/os.c:7925
#4 0x00000000710f40b2 in maps_iterator_start (iter=0x48db8ac0, may_alloc=false) at /home/peter/dynamorio/core/linux/os.c:6478
#5 0x00000000710f421e in query_memory_ex_from_os (pc=0x2aab737627c6 <Address 0x2aab737627c6 out of bounds>, info=0x48db8b60) at /home/peter/dynamorio/core/linux/os.c:7776
#6 0x00000000710f43f4 in get_memory_info_from_os (pc=0x71351100 <Address 0x71351100 out of bounds>, base_pc=0x0, size=0x1, prot=0xffffffffffffffff)
at /home/peter/dynamorio/core/linux/os.c:7841
#7 0x00000000710fa1a7 in compute_memory_target (dcontext=0x48d42a40, instr_cache_pc=0x2aab737627c6 <Address 0x2aab737627c6 out of bounds>, sc=0x48db90a8, write=0x48db9018)
at /home/peter/dynamorio/core/linux/signal.c:3382
#8 0x00000000710fea3d in master_signal_handler_C (sig=11, siginfo=, ucxt=0x48db9080, xsp=0x48db9078 <Address 0x48db9078 out of bounds>)
at /home/peter/dynamorio/core/linux/signal.c:3742
#9 0x00000000710e61a7 in client_int_syscall () from /home/peter/dynamorio/build/lib64/release/libdynamorio.so
#10 0x0000000000000000 in ?? ()

After broadfun times out, ctest continues with the other programs. Here's a stack trace from getretaddr:

(gdb) where
#0 0x00000000710e61a2 in syscall_ready () from /home/peter/dynamorio/build/lib64/release/libdynamorio.so
#1 0x0000000071351100 in maps_iter_buf_lock () from /home/peter/dynamorio/build/lib64/release/libdynamorio.so
#2 0x00000000710f2c4f in futex_wait (lock=0x6) at /home/peter/dynamorio/core/linux/os.c:2565
#3 mutex_wait_contended_lock (lock=0x6) at /home/peter/dynamorio/core/linux/os.c:7925
#4 0x00000000710f40b2 in maps_iterator_start (iter=0x48730ac0, may_alloc=false) at /home/peter/dynamorio/core/linux/os.c:6478
#5 0x00000000710f421e in query_memory_ex_from_os (pc=0x2b2c5f6ea7c6 "dH\213", info=0x48730b60) at /home/peter/dynamorio/core/linux/os.c:7776
#6 0x00000000710f43f4 in get_memory_info_from_os (pc=0x71351100 "\001", base_pc=0x0, size=0x1, prot=0xffffffffffffffff) at /home/peter/dynamorio/core/linux/os.c:7841
#7 0x00000000710fa1a7 in compute_memory_target (dcontext=0x486baa40, instr_cache_pc=0x2b2c5f6ea7c6 "dH\213", sc=0x487310a8, write=0x48731018)
at /home/peter/dynamorio/core/linux/signal.c:3382
#8 0x00000000710fea3d in master_signal_handler_C (sig=11, siginfo=, ucxt=0x48731080, xsp=0x48731078 "\247a\016q")
at /home/peter/dynamorio/core/linux/signal.c:3742
#9 0x00000000710e61a7 in client_int_syscall () from /home/peter/dynamorio/build/lib64/release/libdynamorio.so
#10 0x0000000000000000 in ?? ()

In both cases, there appears to be a deadlock waiting for the proc maps futex. Interestingly, the test programs no longer peg the CPU. I suspect that when I originally reported this bug DR was still using spinlocks and the CPU pegging was caused by the same deadlock.

@derekbruening

From zhao...@google.com on October 29, 2011 07:45:31

Something went wrong earlier, i.e., in the code that caused the SIGSEGV or even before that. Could you figure out which instruction caused the SIGSEGV, and the corresponding application instruction, by going up to the master_signal_handler_C frame and reading the signal context?
