Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

thread specific key leakage #789

Closed
llhe opened this issue Sep 10, 2018 · 6 comments
Closed

thread specific key leakage #789

llhe opened this issue Sep 10, 2018 · 6 comments
Milestone

Comments

@llhe
Copy link

llhe commented Sep 10, 2018

Description

We build a program which use ostream and when we do stress testing by repeated loading and unloading the built shared library, the program crashed with an abort message cannot create thread specific key for __cxa_get_globals() here:

namespace __cxxabiv1 {
namespace {
    std::__libcpp_tls_key key_;
    std::__libcpp_exec_once_flag flag_ = _LIBCPP_EXEC_ONCE_INITIALIZER;

    void destruct_ (void *p) {
        __free_with_fallback ( p );
        if ( 0 != std::__libcpp_tls_set ( key_, NULL ) )
            abort_message("cannot zero out thread value for __cxa_get_globals()");
        }

    void construct_ () {
        if ( 0 != std::__libcpp_tls_create ( &key_, destruct_ ) )
            abort_message("cannot create thread specific key for __cxa_get_globals()");
        }
} 

which means pthread_key_create returns error.

// ./cxx-stl/llvm-libc++/include/__threading_support

377 // Thread local storage                                                         
378 int __libcpp_tls_create(__libcpp_tls_key *__key, void (*__at_exit)(void *))     
379 {                                                                               
380   return pthread_key_create(__key, __at_exit);                                  
381 }  

And checked the source of bionic implementation for pthread_key_create https://github.com/aosp-mirror/platform_bionic/blob/master/libc/bionic/pthread_key.cpp#L120:

__BIONIC_WEAK_FOR_NATIVE_BRIDGE
int pthread_key_create(pthread_key_t* key, void (*key_destructor)(void*)) {
  for (size_t i = 0; i < BIONIC_PTHREAD_KEY_COUNT; ++i) {
    uintptr_t seq = atomic_load_explicit(&key_map[i].seq, memory_order_relaxed);
    while (!SeqOfKeyInUse(seq)) {
      if (atomic_compare_exchange_weak(&key_map[i].seq, &seq, seq + SEQ_INCREMENT_STEP)) {
        atomic_store(&key_map[i].key_destructor, reinterpret_cast<uintptr_t>(key_destructor));
        *key = i | KEY_VALID_FLAG;
        return 0;
      }
    }
  }
  return EAGAIN;
}

Which means it will fail when there is no free slot (with 130 as the max).

And from the code above, there is no matching pthread_key_delete in destruct_ call. This will create a leak for each loading.

Environment Details

The program is based on 17b which is not the latest, however, the source code section should be enough.

  • NDK Version: 17b
  • Build system: bazel and cmake
  • Host OS: ubuntu
  • Compiler: clang
  • ABI: armeabi-v7a
  • STL: libc++
  • NDK API level: 21
  • Device API level: 27
@DanAlbert DanAlbert self-assigned this Sep 10, 2018
@DanAlbert DanAlbert added this to the r19 milestone Sep 10, 2018
@DanAlbert
Copy link
Member

And from the code above, there is no matching pthread_key_delete in destruct_ call.

It's a little bit trickier than this because destruct_ will be called on the exit of any thread that used the key. There may be other threads still using the key, so we can't delete it. Ref counting the key in libc++abi gets a little bit messy, but may be an option.

Why are you looping dlopen/dlclose? I'm wondering if your stress tests are going to all be useless on modern devices where dlclose won't do anything for libraries with non-trivial thread-local destructors.

@DanAlbert DanAlbert removed this from the r19 milestone Sep 10, 2018
@DanAlbert DanAlbert removed their assignment Sep 10, 2018
@llhe
Copy link
Author

llhe commented Sep 13, 2018

It makes sense.

We are library provider, it's our user who make such stress testing. So we can suggest them to skip this test case. However, the stress testing do make sense when the final product is a plugin/shared library.

@tavianator
Copy link

libc++ also intentionally leaks the pthread_keys used for notify_all_at_thread_exit() and __cxa_thread_atexit(). dlclose() on libc++(abi) is not really supported. Maybe you could load it with RTLD_NODELETE?

@DanAlbert
Copy link
Member

DanAlbert commented Apr 14, 2020

Forgot to close this one. Like @tavianator said, these are intentionally leaked by libc++abi as a necessity of the fallback __cxa_thread_atexit_imp. See https://reviews.llvm.org/D21803 for more discussion.

Devices running 23+ will get the non-fallback behavior. If you need to avoid the leak for earlier platforms, there are a few workarounds:

  1. Don't use dlclose
  2. Use RTLD_NODELETE
  3. Don't use thread_local for globals with non-trivial destructors

@olegarch
Copy link

Thanks for detailed explanation. I have a follow up question.

  1. Don't use dlclose
  2. Use RTLD_NODELETE
  3. Don't use thread_local for globals with non-trivial destructors

If 1. and 2. are not an option, does 3. mean not using std::thread at all?

@DanAlbert
Copy link
Member

DanAlbert commented Dec 1, 2020

does 3. mean not using std::thread at all?

No, you can use std::thread all you want. You just can't use thread_local on globals with non-trivial destructors if you care about old OSs.

But if 1 isn't an option you should probably re-evaluate your implementation. dlclose cannot be relied upon to do anything at all. thread_local globals with non-trivial destructors are one reason that it might do nothing for the current implementation of Android's loader, but that could change in the future to enable new features, or could just become a no-op at some point because that's the only valid implementation with predictable behavior.

jeremyg-lunarg added a commit to jeremyg-lunarg/Vulkan-ValidationLayers that referenced this issue Aug 9, 2022
https://developer.android.com/ndk/guides/cpp-support recommends using
c++_shared for applications that use more than one shared library.
If multiple libraries using c++_static are loaded you end up with
several copies of the globals in the C++ runtime. This also happens
if the same library is dlopen/dlclosed several times. Some of the
c++ runtime globals are thread_local, so each copy consumes a
TLS key. There are only 128 TLS keys allowed on android, and the
unit tests can hit this because of repeatedly loading and unloading
VVL.

See android/ndk#789 and the many issues
linked to it for more info.
jeremyg-lunarg added a commit to jeremyg-lunarg/Vulkan-ValidationLayers that referenced this issue Aug 10, 2022
https://developer.android.com/ndk/guides/cpp-support recommends using
c++_shared for applications that use more than one shared library.
If multiple libraries using c++_static are loaded you end up with
several copies of the globals in the C++ runtime. This also happens
if the same library is dlopen/dlclosed several times. Some of the
c++ runtime globals are thread_local, so each copy consumes a
TLS key. There are only 128 TLS keys allowed on android, and the
unit tests can hit this because of repeatedly loading and unloading
VVL.

See android/ndk#789 and the many issues
linked to it for more info.
jeremyg-lunarg added a commit to jeremyg-lunarg/Vulkan-ValidationLayers that referenced this issue Aug 11, 2022
https://developer.android.com/ndk/guides/cpp-support recommends using
c++_shared for applications that use more than one shared library.
If multiple libraries using c++_static are loaded you end up with
several copies of the globals in the C++ runtime. This also happens
if the same library is dlopen/dlclosed several times. Some of the
c++ runtime globals are thread_local, so each copy consumes a
TLS key. There are only 128 TLS keys allowed on android, and the
unit tests can hit this because of repeatedly loading and unloading
VVL.

See android/ndk#789 and the many issues
linked to it for more info.
jeremyg-lunarg added a commit to jeremyg-lunarg/Vulkan-ValidationLayers that referenced this issue Aug 11, 2022
https://developer.android.com/ndk/guides/cpp-support recommends using
c++_shared for applications that use more than one shared library.
If multiple libraries using c++_static are loaded you end up with
several copies of the globals in the C++ runtime. This also happens
if the same library is dlopen/dlclosed several times. Some of the
c++ runtime globals are thread_local, so each copy consumes a
TLS key. There are only 128 TLS keys allowed on android, and the
unit tests can hit this because of repeatedly loading and unloading
VVL.

See android/ndk#789 and the many issues
linked to it for more info.
jeremyg-lunarg added a commit to KhronosGroup/Vulkan-ValidationLayers that referenced this issue Aug 11, 2022
https://developer.android.com/ndk/guides/cpp-support recommends using
c++_shared for applications that use more than one shared library.
If multiple libraries using c++_static are loaded you end up with
several copies of the globals in the C++ runtime. This also happens
if the same library is dlopen/dlclosed several times. Some of the
c++ runtime globals are thread_local, so each copy consumes a
TLS key. There are only 128 TLS keys allowed on android, and the
unit tests can hit this because of repeatedly loading and unloading
VVL.

See android/ndk#789 and the many issues
linked to it for more info.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants