Clean thread sanitizer run #2070

Closed
finnschiermer opened this issue Aug 25, 2016 · 9 comments · Fixed by #2167

@finnschiermer
Contributor

finnschiermer commented Aug 25, 2016

We should aim for clean runs of the thread sanitizer. Each reported race should be inspected: if it is benign and cannot reasonably be avoided, an exception should be added for it; otherwise it should be fixed.
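
As a minimal sketch of what fixing a reported race (rather than excepting it) can look like, the snippet below guards a shared counter with a mutex. The function `count_with_mutex` and its parameters are hypothetical and not part of Realm; they only illustrate the general pattern.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper, not part of Realm: several threads increment one
// shared counter. Without the lock_guard, the unsynchronized `++counter`
// is the kind of access the thread sanitizer reports as a data race.
long count_with_mutex(std::size_t num_threads, long iterations_per_thread)
{
    long counter = 0;
    std::mutex counter_mutex;

    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (std::size_t i = 0; i < num_threads; ++i) {
        threads.emplace_back([&] {
            for (long j = 0; j < iterations_per_thread; ++j) {
                // Every access to `counter` happens under the same mutex.
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++counter;
            }
        });
    }
    // Joining establishes happens-before for the final read below.
    for (std::thread& t : threads)
        t.join();

    return counter;
}
```

With GCC or Clang, building with `-fsanitize=thread -g` and calling for example `count_with_mutex(4, 1000)` should run without a report; removing the `lock_guard` line is expected to trigger one.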

@emanuelez
Contributor

Just for reference, this link explains how to add exclusions (suppressions): https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual#suppressing-reports

@bmunkholm
Contributor

@finnschiermer What's left on this?

@finnschiermer
Contributor Author

At the moment there are 2 races left to classify. Both are related to robust mutex emulation, and both are likely benign.

@finnschiermer
Contributor Author

Oh, yes, there is one more race which has been eliminated, but without us fully understanding what lies behind it. I expect to return to that later when we get plenty of time :-)

@finnschiermer
Contributor Author

So, I have to reopen this, because the change that "eliminated" the race mentioned above caused a crash in the Cocoa unit tests and had to be reverted.

@finnschiermer finnschiermer reopened this Oct 3, 2016
@finnschiermer finnschiermer removed their assignment Oct 26, 2016
@finnschiermer
Contributor Author

Recently reported race:

==================
WARNING: ThreadSanitizer: data race (pid=4575)
Write of size 8 at 0x7b2000058510 by thread T79 (mutexes: write M1051163318364541560, write M588985411665600640, write M15):
#0 operator delete(void*) (libtsan.so.0+0x00000006a794)
#1 realm::SlabAlloc::attach_file(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, realm::SlabAlloc::Config&) (librealm.so.9+0x0000000d7705)
#2 realm::_impl::ClientFileAccessCache::access(realm::_impl::ClientFileAccessCache::Slot&) (librealm-sync-tsan.so.1+0x000000070a83)
#3 (anonymous namespace)::ClientImpl::stop_and_start_sessions() (librealm-sync-tsan.so.1+0x0000000d6b8e)
#4 realm::sync::Client::run() (librealm-sync-tsan.so.1+0x0000000d9a16)
#5 void* realm::util::Thread::entry_point<void realm::test_util::ThreadWrapper::start<(anonymous namespace)::MultiClientServerFixture::start()::{lambda()#2}>((anonymous namespace)::MultiClientServerFixture::start()::{lambda()#2} const&)::{lambda()#1}>({lambda()#1}) (realm-sync-tests-tsan+0x0000004aa04a)
#6 (libtsan.so.0+0x000000024fcb)

Previous write of size 1 at 0x7b2000058510 by thread T16 (mutexes: write M571252488279183488):
#0 pthread_mutex_destroy (libtsan.so.0+0x000000028bcb)
#1 std::_Sp_counted_ptr_inplace<realm::SlabAlloc::MappedFile, std::allocator<realm::SlabAlloc::MappedFile>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (librealm.so.9+0x0000000d8c3b)
#2 realm::test_util::unit_test::RegisterTest<(anonymous namespace)::Realm_UnitTest__Sync_AuthFailure>::run_test(realm::test_util::unit_test::TestContext&) (realm-sync-tests-tsan+0x00000046434a)
#3 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) (realm-sync-tests-tsan+0x000000660354)
#4 realm::test_util::unit_test::TestList::ThreadContextImpl::run() (realm-sync-tests-tsan+0x000000660ac1)
#5 realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda(int)#1}::operator()(int) const (realm-sync-tests-tsan+0x0000006613c2)
#6 void* realm::util::Thread::entry_point<realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda()#2}>(realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda()#2}) (realm-sync-tests-tsan+0x000000661751)
#7 (libtsan.so.0+0x000000024fcb)

Mutex M1051163318364541560 is already destroyed.

Mutex M588985411665600640 is already destroyed.

Mutex M15 (0x7b0c00000060) created at:
#0 pthread_mutex_init (libtsan.so.0+0x000000028a8e)
#1 _GLOBAL__sub_I_alloc_slab.cpp (librealm.so.9+0x0000000b9233)

Mutex M571252488279183488 is already destroyed.

Thread T79 (tid=4656, running) created by thread T16 at:
#0 pthread_create (libtsan.so.0+0x000000028280)
#1 (anonymous namespace)::MultiClientServerFixture::start() (realm-sync-tests-tsan+0x0000004463ee)
#2 (anonymous namespace)::Realm_UnitTest__Sync_AuthFailure::test_run() (realm-sync-tests-tsan+0x0000004639f7)
#3 realm::test_util::unit_test::RegisterTest<(anonymous namespace)::Realm_UnitTest__Sync_AuthFailure>::run_test(realm::test_util::unit_test::TestContext&) (realm-sync-tests-tsan+0x00000046434a)
#4 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) (realm-sync-tests-tsan+0x000000660354)
#5 realm::test_util::unit_test::TestList::ThreadContextImpl::run() (realm-sync-tests-tsan+0x000000660ac1)
#6 realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda(int)#1}::operator()(int) const (realm-sync-tests-tsan+0x0000006613c2)
#7 void* realm::util::Thread::entry_point<realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda()#2}>(realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda()#2}) (realm-sync-tests-tsan+0x000000661751)
#8 (libtsan.so.0+0x000000024fcb)

Thread T16 'test-thread-16' (tid=4592, running) created by main thread at:
#0 pthread_create (libtsan.so.0+0x000000028280)
#1 realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config) (realm-sync-tests-tsan+0x000000662912)
#2 (anonymous namespace)::run_tests(realm::util::Logger*) (realm-sync-tests-tsan+0x0000004f7dd0)
#3 test_all(int, char**, realm::util::Logger*) (realm-sync-tests-tsan+0x0000004f922a)
#4 main (realm-sync-tests-tsan+0x00000042bad0)

SUMMARY: ThreadSanitizer: data race (/usr/local/lib64/libtsan.so.0+0x6a794) in operator delete(void*)

==================
WARNING: ThreadSanitizer: data race (pid=4575)
Write of size 8 at 0x7b2000054510 by thread T98 (mutexes: write M710578596545434232, write M17379, write M15):
#0 operator delete(void*) (libtsan.so.0+0x00000006a794)
#1 realm::SlabAlloc::attach_file(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, realm::SlabAlloc::Config&) (librealm.so.9+0x0000000d7705)
#2 realm::_impl::ClientFileAccessCache::access(realm::_impl::ClientFileAccessCache::Slot&) (librealm-sync-tsan.so.1+0x000000070a83)
#3 (anonymous namespace)::ClientImpl::stop_and_start_sessions() (librealm-sync-tsan.so.1+0x0000000d6b8e)
#4 realm::sync::Client::run() (librealm-sync-tsan.so.1+0x0000000d9a16)
#5 void* realm::util::Thread::entry_point<void realm::test_util::ThreadWrapper::start<(anonymous namespace)::MultiClientServerFixture::start()::{lambda()#2}>((anonymous namespace)::MultiClientServerFixture::start()::{lambda()#2} const&)::{lambda()#1}>({lambda()#1}) (realm-sync-tests-tsan+0x0000004aa04a)
#6 (libtsan.so.0+0x000000024fcb)

Previous write of size 1 at 0x7b2000054510 by thread T28 (mutexes: write M191261269737119872):
#0 pthread_mutex_destroy (libtsan.so.0+0x000000028bcb)
#1 std::_Sp_counted_ptr_inplace<realm::SlabAlloc::MappedFile, std::allocator<realm::SlabAlloc::MappedFile>, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (librealm.so.9+0x0000000d8c3b)
#2 realm::test_util::unit_test::RegisterTest<(anonymous namespace)::Realm_UnitTest__Sync_EarlyUnbind>::run_test(realm::test_util::unit_test::TestContext&) (realm-sync-tests-tsan+0x00000046a75a)
#3 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) (realm-sync-tests-tsan+0x000000660354)
#4 realm::test_util::unit_test::TestList::ThreadContextImpl::run() (realm-sync-tests-tsan+0x000000660ac1)
#5 realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda(int)#1}::operator()(int) const (realm-sync-tests-tsan+0x0000006613c2)
#6 void* realm::util::Thread::entry_point<realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda()#2}>(realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda()#2}) (realm-sync-tests-tsan+0x000000661751)
#7 (libtsan.so.0+0x000000024fcb)

Mutex M710578596545434232 is already destroyed.

Mutex M17379 (0x7f1c76f28080) created at:
#0 pthread_mutex_trylock (libtsan.so.0+0x000000028cae)
#1 realm::util::RobustMutex::is_valid() (librealm.so.9+0x0000000c1028)
#2 realm::_impl::ClientFileAccessCache::access(realm::_impl::ClientFileAccessCache::Slot&) (librealm-sync-tsan.so.1+0x000000070a83)
#3 (anonymous namespace)::ClientImpl::stop_and_start_sessions() (librealm-sync-tsan.so.1+0x0000000d6b8e)
#4 realm::sync::Client::run() (librealm-sync-tsan.so.1+0x0000000d9a16)
#5 void* realm::util::Thread::entry_point<void realm::test_util::ThreadWrapper::start<(anonymous namespace)::MultiClientServerFixture::start()::{lambda()#2}>((anonymous namespace)::MultiClientServerFixture::start()::{lambda()#2} const&)::{lambda()#1}>({lambda()#1}) (realm-sync-tests-tsan+0x0000004aa04a)
#6 (libtsan.so.0+0x000000024fcb)

Mutex M15 (0x7b0c00000060) created at:
#0 pthread_mutex_init (libtsan.so.0+0x000000028a8e)
#1 _GLOBAL__sub_I_alloc_slab.cpp (librealm.so.9+0x0000000b9233)

Mutex M191261269737119872 is already destroyed.

Thread T98 (tid=4674, running) created by thread T28 at:
#0 pthread_create (libtsan.so.0+0x000000028280)
#1 (anonymous namespace)::MultiClientServerFixture::start() (realm-sync-tests-tsan+0x0000004463ee)
#2 (anonymous namespace)::Realm_UnitTest__Sync_EarlyUnbind::test_run() (realm-sync-tests-tsan+0x000000469799)
#3 realm::test_util::unit_test::RegisterTest<(anonymous namespace)::Realm_UnitTest__Sync_EarlyUnbind>::run_test(realm::test_util::unit_test::TestContext&) (realm-sync-tests-tsan+0x00000046a75a)
#4 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) (realm-sync-tests-tsan+0x000000660354)
#5 realm::test_util::unit_test::TestList::ThreadContextImpl::run() (realm-sync-tests-tsan+0x000000660ac1)
#6 realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda(int)#1}::operator()(int) const (realm-sync-tests-tsan+0x0000006613c2)
#7 void* realm::util::Thread::entry_point<realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda()#2}>(realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::{lambda()#2}) (realm-sync-tests-tsan+0x000000661751)
#8 (libtsan.so.0+0x000000024fcb)

Thread T28 'test-thread-28' (tid=4604, running) created by main thread at:
#0 pthread_create (libtsan.so.0+0x000000028280)
#1 realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config) (realm-sync-tests-tsan+0x000000662912)
#2 (anonymous namespace)::run_tests(realm::util::Logger*) (realm-sync-tests-tsan+0x0000004f7dd0)
#3 test_all(int, char**, realm::util::Logger*) (realm-sync-tests-tsan+0x0000004f922a)
#4 main (realm-sync-tests-tsan+0x00000042bad0)

SUMMARY: ThreadSanitizer: data race (/usr/local/lib64/libtsan.so.0+0x6a794) in operator delete(void*)

@ironage
Contributor

ironage commented Aug 10, 2017

The realm-core unit tests have been running cleanly under the thread and address sanitisers since #2782, which fixed some problems and automated these checks as part of CI runs.
@finnschiermer where did the stack traces above come from? They seem to be from the sync tests.

@finnschiermer
Contributor Author

@ironage I don't remember. I think the issue is solved for Core with your recent work. If you agree, please close this issue.

@ironage
Contributor

ironage commented Aug 14, 2017

All the core races should be solved now. Let's close this and if we discover more we can open a new issue.

@ironage ironage closed this as completed Aug 14, 2017
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 22, 2024