Data race between "MPPTask-Moniter" and TiFlashMetrics when shutting down #9092

Closed
JaySon-Huang opened this issue May 27, 2024 · 0 comments · Fixed by #9096

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

2. What did you expect to see? (Required)

3. What did you see instead (Required)

[2024/05/27 17:27:13.332 +08:00] [DEBUG] [SegmentReader.cpp:45] [Stopped] [thread_id=1]
[2024/05/27 17:27:13.332 +08:00] [DEBUG] [SegmentReader.cpp:45] [Stopped] [thread_id=1]
[2024/05/27 17:27:13.332 +08:00] [DEBUG] [SegmentReader.cpp:45] [Stopped] [thread_id=1]
[2024/05/27 17:27:13.332 +08:00] [DEBUG] [SegmentReader.cpp:45] [Stopped] [thread_id=1]
[2024/05/27 17:27:13.332 +08:00] [DEBUG] [SegmentReader.cpp:45] [Stopped] [thread_id=1]
[2024/05/27 17:27:13.332 +08:00] [INFO] [JointThreadAllocInfo.cpp:192] ["Stop collecting thread alloc metrics"] [thread_id=1]
[2024/05/27 17:27:13.332 +08:00] [INFO] [JointThreadAllocInfo.cpp:203] ["JointThreadInfoJeallocMap shutdown, wait thread alloc monitor join"] [thread_id=1]
[2024/05/27 17:27:13.344 +08:00] [INFO] [KVStore.cpp:432] ["Destroy KVStore"] [thread_id=1]
[2024/05/27 17:27:13.344 +08:00] [INFO] [ReadIndex.cpp:371] ["KVStore shutdown, deleting read index worker"] [thread_id=1]
[2024/05/27 17:27:13.345 +08:00] [INFO] [KVStore.cpp:434] ["Destroy KVStore Finished"] [thread_id=1]
[2024/05/27 17:27:13.345 +08:00] [INFO] [JointThreadAllocInfo.cpp:192] ["Stop collecting thread alloc metrics"] [thread_id=1]
==================
WARNING: ThreadSanitizer: data race (pid=3158060)
  Write of size 8 at 0x7b040000a980 by main thread:
    #0 operator delete(void*) /root/llvm-project/compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp:126:3 (gtests_dbms+0x2ff96ff)
    #1 void std::__1::__libcpp_operator_delete[abi:ue170006]<void*>(void*) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/new:278:3 (gtests_dbms+0x321bf8a)
    #2 void std::__1::__do_deallocate_handle_size[abi:ue170006]<>(void*, unsigned long) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/new:302:10 (gtests_dbms+0x321bf8a)
    #3 std::__1::__libcpp_deallocate[abi:ue170006](void*, unsigned long, unsigned long) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/new:318:14 (gtests_dbms+0x321bf8a)
    #4 std::__1::allocator<prometheus::Gauge*>::deallocate[abi:ue170006](prometheus::Gauge**, unsigned long) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__memory/allocator.h:130:13 (gtests_dbms+0x321bf8a)
    #5 std::__1::allocator_traits<std::__1::allocator<prometheus::Gauge*>>::deallocate[abi:ue170006](std::__1::allocator<prometheus::Gauge*>&, prometheus::Gauge**, unsigned long) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__memory/allocator_traits.h:288:13 (gtests_dbms+0x321bf8a)
    #6 std::__1::vector<prometheus::Gauge*, std::__1::allocator<prometheus::Gauge*>>::__destroy_vector::operator()[abi:ue170006]() /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/vector:491:13 (gtests_dbms+0x321bf8a)
    #7 std::__1::vector<prometheus::Gauge*, std::__1::allocator<prometheus::Gauge*>>::~vector[abi:ue170006]() /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/vector:500:67 (gtests_dbms+0x321bf8a)
    #8 DB::MetricFamily<prometheus::Gauge>::~MetricFamily() /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Common/TiFlashMetrics.h:1033:8 (gtests_dbms+0x321bf8a)
    #9 DB::TiFlashMetrics::~TiFlashMetrics() /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Common/TiFlashMetrics.h:1113:7 (gtests_dbms+0x10cabb92)
    #10 cxa_at_exit_callback_installed_at(void*) /root/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:434:3 (gtests_dbms+0x2fb6a39)
    #11 DB::TiFlashMetrics::instance() /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Common/TiFlashMetrics.cpp:24:5 (gtests_dbms+0x10c63693)

  Previous read of size 8 at 0x7b040000a980 by thread T7:
    #0 DB::MetricFamily<prometheus::Gauge>::get(unsigned long) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Common/TiFlashMetrics.h:1061:40 (gtests_dbms+0xce7197c)
    #1 DB::(anonymous namespace)::checkLongLiveMPPTasks(std::__1::unordered_map<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, Stopwatch, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const, Stopwatch>>> const&, std::__1::shared_ptr<DB::Logger> const&) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Storages/KVStore/TMTContext.cpp:95:5 (gtests_dbms+0xce7197c)
    #2 DB::(anonymous namespace)::monitorMPPTasks(std::__1::shared_ptr<DB::MPPTaskMonitor>) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Storages/KVStore/TMTContext.cpp (gtests_dbms+0xce7197c)
    #3 DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0::operator()() const /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Storages/KVStore/TMTContext.cpp:124:9 (gtests_dbms+0xce7197c)
    #4 decltype(std::declval<DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0&>()()) std::__1::__invoke[abi:ue170006]<DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0&>(DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__type_traits/invoke.h:340:25 (gtests_dbms+0xce7197c)
    #5 void std::__1::__invoke_void_return_wrapper<void, true>::__call[abi:ue170006]<DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0&>(DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__type_traits/invoke.h:415:5 (gtests_dbms+0xce7197c)
    #6 std::__1::__function::__alloc_func<DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0, std::__1::allocator<DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0>, void ()>::operator()[abi:ue170006]() /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__functional/function.h:192:16 (gtests_dbms+0xce7197c)
    #7 std::__1::__function::__func<DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0, std::__1::allocator<DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&)::$_0>, void ()>::operator()() /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__functional/function.h:363:12 (gtests_dbms+0xce7197c)
    #8 std::__1::__function::__value_func<void ()>::operator()[abi:ue170006]() const /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__functional/function.h:517:16 (gtests_dbms+0x10c55b61)
    #9 std::__1::function<void ()>::operator()() const /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__functional/function.h:1168:12 (gtests_dbms+0x10c55b61)
    #10 decltype(std::declval<std::__1::function<void ()> const&>()()) std::__1::__invoke[abi:ue170006]<std::__1::function<void ()> const&>(std::__1::function<void ()> const&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__type_traits/invoke.h:340:25 (gtests_dbms+0x10c55b61)
    #11 std::__1::invoke_result<std::__1::function<void ()> const&>::type std::__1::invoke[abi:ue170006]<std::__1::function<void ()> const&>(std::__1::function<void ()> const&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__functional/invoke.h:30:12 (gtests_dbms+0x10c55b61)
    #12 auto std::__1::thread DB::ThreadFactory::newThread<std::__1::function<void ()>>(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::function<void ()>&&)::'lambda'(auto&&...)::operator()<>(auto&&...) const /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Common/ThreadFactory.h:48:26 (gtests_dbms+0x10c55b61)
    #13 decltype(std::declval<std::__1::function<void ()>>()()) std::__1::__invoke[abi:ue170006]<std::__1::thread DB::ThreadFactory::newThread<std::__1::function<void ()>>(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::function<void ()>&&)::'lambda'(auto&&...)>(std::__1::function<void ()>&&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__type_traits/invoke.h:340:25 (gtests_dbms+0x10c5563d)
    #14 void std::__1::__thread_execute[abi:ue170006]<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, std::__1::thread DB::ThreadFactory::newThread<std::__1::function<void ()>>(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::function<void ()>&&)::'lambda'(auto&&...)>(std::__1::tuple<std::__1::function<void ()>, std::__1::thread DB::ThreadFactory::newThread<std::__1::function<void ()>>(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::function<void ()>&&)::'lambda'(auto&&...)>&, std::__1::__tuple_indices<>) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__thread/thread.h:221:5 (gtests_dbms+0x10c5563d)
    #15 void* std::__1::__thread_proxy[abi:ue170006]<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct>>, std::__1::thread DB::ThreadFactory::newThread<std::__1::function<void ()>>(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::function<void ()>&&)::'lambda'(auto&&...)>>(void*) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__thread/thread.h:232:5 (gtests_dbms+0x10c5563d)

  Thread T7 'MPPTask-Moniter' (tid=3158068, finished) created by main thread at:
    #0 pthread_create /root/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:1020:3 (gtests_dbms+0x2f7165b)
    #1 std::__1::__libcpp_thread_create[abi:ue170006](unsigned long*, void* (*)(void*), void*) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__threading_support:371:10 (gtests_dbms+0x10c55498)
    #2 std::__1::thread::thread<std::__1::thread DB::ThreadFactory::newThread<std::__1::function<void ()>>(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::function<void ()>&&)::'lambda'(auto&&...)&, void>(std::__1::function<void ()>&&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__thread/thread.h:248:16 (gtests_dbms+0x10c55498)
    #3 std::__1::thread DB::ThreadFactory::newThread<std::__1::function<void ()>>(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::function<void ()>&&) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Common/ThreadFactory.h:50:16 (gtests_dbms+0x10c5535e)
    #4 DB::(anonymous namespace)::RawThreadManager::scheduleThenDetach(bool, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::function<void ()>) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Common/ThreadManager.cpp:83:18 (gtests_dbms+0x10c53c25)
    #5 DB::(anonymous namespace)::startMonitorMPPTaskThread(std::__1::shared_ptr<DB::MPPTaskManager> const&) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Storages/KVStore/TMTContext.cpp:123:25 (gtests_dbms+0xce6df4e)
    #6 DB::TMTContext::TMTContext(DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig const&) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Storages/KVStore/TMTContext.cpp:161:5 (gtests_dbms+0xce6df4e)
    #7 DB::TMTContext* std::__1::construct_at[abi:ue170006]<DB::TMTContext, DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&, DB::TMTContext*>(DB::TMTContext*, DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__memory/construct_at.h:41:46 (gtests_dbms+0xe50fd70)
    #8 void std::__1::allocator_traits<std::__1::allocator<DB::TMTContext>>::construct[abi:ue170006]<DB::TMTContext, DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&, void, void>(std::__1::allocator<DB::TMTContext>&, DB::TMTContext*, DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__memory/allocator_traits.h:304:9 (gtests_dbms+0xe50fd70)
    #9 std::__1::__shared_ptr_emplace<DB::TMTContext, std::__1::allocator<DB::TMTContext>>::__shared_ptr_emplace[abi:ue170006]<DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&>(std::__1::allocator<DB::TMTContext>, DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__memory/shared_ptr.h:299:13 (gtests_dbms+0xe50fd70)
    #10 std::__1::shared_ptr<DB::TMTContext> std::__1::allocate_shared[abi:ue170006]<DB::TMTContext, std::__1::allocator<DB::TMTContext>, DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&, void>(std::__1::allocator<DB::TMTContext> const&, DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__memory/shared_ptr.h:1022:55 (gtests_dbms+0xe50fd70)
    #11 std::__1::shared_ptr<DB::TMTContext> std::__1::make_shared[abi:ue170006]<DB::TMTContext, DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&, void>(DB::Context&, DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&) /DATA/disk1/ra_common/tiflash-env-17/sysroot/bin/../include/c++/v1/__memory/shared_ptr.h:1031:12 (gtests_dbms+0xe50fd70)
    #12 DB::Context::createTMTContext(DB::TiFlashRaftConfig const&, pingcap::ClusterConfig&&) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/Interpreters/Context.cpp:1469:27 (gtests_dbms+0xe50fd70)
    #13 DB::tests::TiFlashTestEnv::addGlobalContext(DB::Settings const&, std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>, DB::PageStorageRunMode, unsigned long) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/TestUtils/TiFlashTestEnv.cpp:169:21 (gtests_dbms+0x6bf59e1)
    #14 DB::tests::TiFlashTestEnv::initializeGlobalContext(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>>>>, DB::PageStorageRunMode, unsigned long) /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/TestUtils/TiFlashTestEnv.cpp:102:5 (gtests_dbms+0x6bf4c04)
    #15 main /DATA/disk1/jaysonhuang/tiflash-master/dbms/src/TestUtils/gtests_dbms_main.cpp:71:5 (gtests_dbms+0x689b1ff)

SUMMARY: ThreadSanitizer: data race /root/llvm-project/compiler-rt/lib/tsan/rtl/tsan_new_delete.cpp:126:3 in operator delete(void*)
==================
==================
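
Reading the trace above: the "MPPTask-Moniter" thread is started through `RawThreadManager::scheduleThenDetach`, so nothing joins it before the main thread runs `~TiFlashMetrics` from an exit handler, and its earlier read in `MetricFamily::get` is unordered with respect to the deallocation. Below is a minimal sketch of one common way to avoid this class of race: the monitor owns its thread and joins it on shutdown, before the metrics object is destroyed. `Metrics`, `Monitor` and `readsAfterStop` are hypothetical stand-ins written for illustration; this is not the TiFlash code and not necessarily the change made in #9096.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a metrics registry; not the real TiFlashMetrics.
struct Metrics
{
    std::vector<double> gauges = std::vector<double>(4, 0.0);
    std::atomic<int> reads{0}; // counts how often the monitor touched the registry

    double get(std::size_t idx)
    {
        reads.fetch_add(1);
        return gauges[idx];
    }
};

// Hypothetical monitor that keeps its thread joinable instead of detaching it.
class Monitor
{
public:
    explicit Monitor(Metrics & metrics_)
        : metrics(metrics_)
        , worker([this] { run(); })
    {}

    ~Monitor() { stop(); }

    // Ask the loop to exit, then wait for the thread to finish.
    // The join orders every read in run() before whatever the caller does next.
    void stop()
    {
        {
            std::lock_guard<std::mutex> lock(mu);
            stopping = true;
        }
        cv.notify_all();
        if (worker.joinable())
            worker.join();
    }

private:
    void run()
    {
        std::unique_lock<std::mutex> lock(mu);
        while (!stopping)
        {
            metrics.get(0); // periodic read, like the gauge lookup in the trace
            cv.wait_for(lock, std::chrono::milliseconds(1), [this] { return stopping; });
        }
    }

    Metrics & metrics;
    std::mutex mu;
    std::condition_variable cv;
    bool stopping = false;
    std::thread worker; // declared last so the members above exist before the thread starts
};

// Returns how many reads happened after stop() returned; with the join this is 0.
int readsAfterStop()
{
    Metrics metrics;
    Monitor monitor(metrics);
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    monitor.stop();
    const int before = metrics.reads.load();
    std::this_thread::sleep_for(std::chrono::milliseconds(5));
    return metrics.reads.load() - before;
}
```

Calling `readsAfterStop()` exercises the shutdown order: `monitor` is stopped and joined first, and only then does `metrics` go out of scope. With a detached thread there is no such join, so a process-wide singleton destroyed at exit can be freed while, or without ordering after, the monitor's last read.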

4. What is your TiFlash version? (Required)
