ThreadSanitizer: thread leak from HIP runtime #3182
Comments
Thanks for reporting, will look into it.
@al42and Apologies for the lack of response. Can you please test with the latest ROCm 6.0.2 (HIP 6.0.32831)? If resolved, please close the ticket. Thanks!
Don't have 6.0.2 at hand, but the problem still occurs with 6.0.0:
$ hipcc tsan.cpp -g -fsanitize=thread -o tsan && ./tsan
clang: warning: ignoring '-fsanitize=thread' option as it is not currently supported for target 'amdgcn-amd-amdhsa' [-Woption-ignored]
Detected 1 devices
==================
WARNING: ThreadSanitizer: thread leak (pid=3481992)
Thread T2 (tid=3482001, finished) created by main thread at:
#0 pthread_create /long_pathname_so_that_rpms_can_package_the_debug_info/src/external/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1048 (tsan+0x296503)
#1 <null> <null> (libhsa-runtime64.so.1+0x2972b) (BuildId: fdfae95418d176670b25ac26f0542b05d0aec181)
#2 hipGetDeviceCount ??:? (libamdhip64.so.6+0xa9c23) (BuildId: c119a12e92604d9b1dd360dcf538793bfab296a4)
#3 __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58 (libc.so.6+0x29d8f) (BuildId: c289da5071a3399de893d2af81d6a30c62646e1e)
SUMMARY: ThreadSanitizer: thread leak (/opt/rocm-6.0.0/lib/llvm/bin/../../../lib/libhsa-runtime64.so.1+0x2972b) (BuildId: fdfae95418d176670b25ac26f0542b05d0aec181)
==================
ThreadSanitizer: reported 1 warnings
Note for others trying to reproduce: Since hipcc in ROCm 6.0 is based on Clang 17, it requires a workaround for TSAN on newer kernels: google/sanitizers#1716 (comment). But this is not directly related to the issue here.
Still happens with 6.1:
$ hipcc --version
HIP version: 6.1.40092-038397aaa
AMD clang version 17.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-6.1.1 24154 f53cd7e03908085f4932f7329464cd446426436a)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-6.1.1/llvm/bin
Configuration file: /opt/rocm-6.1.1/lib/llvm/bin/clang++.cfg
$ hipcc tsan.cpp -g -fsanitize=thread -o tsan && ./tsan
clang: warning: ignoring '-fsanitize=thread' option as it is not currently supported for target 'amdgcn-amd-amdhsa' [-Woption-ignored]
Detected 1 devices
/usr/bin/addr2line: DWARF error: invalid or unhandled FORM value: 0x23
==================
WARNING: ThreadSanitizer: thread leak (pid=64226)
Thread T2 (tid=64235, finished) created by main thread at:
#0 pthread_create ??:? (tsan+0x29c90b)
#1 <null> <null> (libhsa-runtime64.so.1+0x2c0fc) (BuildId: 8575df86329e78c19cac825f819d82b0361816da)
#2 hipGetCmdName ??:? (libamdhip64.so.6+0xad053) (BuildId: daff87db3cceb0402dea325b66af7507d54d0eb2)
#3 __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58 (libc.so.6+0x29d8f) (BuildId: 962015aa9d133c6cbcfb31ec300596d7f44d3348)
SUMMARY: ThreadSanitizer: thread leak ??:? in pthread_create
==================
ThreadSanitizer: reported 1 warnings
@al42and We have an internal ticket to investigate this issue. Thanks!
Hi @al42and, I tried to reproduce the issue you are facing but could not find any leaking threads with the latest version of ROCm (6.2.2). I verified this with ThreadSanitizer and gdb.
However, I did hit a separate ThreadSanitizer problem: an error message about an unexpected memory mapping. If you face a similar issue, note that a recent kernel update bumped vm.mmap_rnd_bits from 28 to 32 on amd64 systems, and that ThreadSanitizer was updated to support only up to 30 ASLR bits: ThreadSanitizer ASLR Change. Therefore, to solve this issue, you would have to reduce ASLR bits from 32 to 30:
Please give that a try on the latest version of ROCm and let me know if the issue persists, thanks!
Hi @darren-amd
Thanks. I can confirm that the issue can no longer be reproduced with 6.2.2 while still happening on the same machine with 6.1.1.
Yes, I'm aware of the ASLR issue; see the note in #3182 (comment).
Trying to run any app which uses HIP API with TSAN triggers a "thread leak" error at the end:
Tested with ROCm 5.4.1 on MI50 and ROCm 5.4.2 on RX 6400.
Code used (anything doing HIP API calls should work):