Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TSAN Data Race in LangBindHelper_HandoverBetweenThreads #6474

Open
sync-by-unito bot opened this issue Apr 5, 2023 · 14 comments
Open

TSAN Data Race in LangBindHelper_HandoverBetweenThreads #6474

sync-by-unito bot opened this issue Apr 5, 2023 · 14 comments

Comments

@sync-by-unito
Copy link

sync-by-unito bot commented Apr 5, 2023

Some core-tests are failing due to a data race in the LangBindHelper_HandoverBetweenThreads in test_lang_bind_helper.cpp. The tests themselves are passing, but the TSAN warnings are causing the task to fail. I have seen this failure twice within the past day or two.
Ubuntu 20.04 x86_64 (Clang 11 Encryption Enabled w/TSAN) Task Page
Data Race in Parsley

@sync-by-unito
Copy link
Author

sync-by-unito bot commented Apr 6, 2023

➤ michael-wb commented:

There may also be a related hang that occurred in theLangBindHelper_ImplicitTransactions_InterProcess test in test_lang_bind_helper.cpp that occurred in master:
https://spruce.mongodb.com/task/realm_core_stable_macos_release_test_on_exfat_b2ed3201a306b1b00f5fb1cc4d64d84e7e603c3f_23_04_06_14_22_32/logs?execution=0

@finnschiermer
Copy link
Contributor

I don't see the connection with the last one?

@nicola-cab
Copy link
Member

I don't see the connection with the last one?

Yes, the last one seems different, it is a problem with realm::util::File::lock(bool, bool). Probably worth to split it into different issues.

@nicola-cab
Copy link
Member

➤ michael-wb commented:

There may also be a related hang that occurred in theLangBindHelper_ImplicitTransactions_InterProcess test in test_lang_bind_helper.cpp that occurred in master: https://spruce.mongodb.com/task/realm_core_stable_macos_release_test_on_exfat_b2ed3201a306b1b00f5fb1cc4d64d84e7e603c3f_23_04_06_14_22_32/logs?execution=0

Tracked by #6739

@kiburtse
Copy link
Contributor

kiburtse commented Jun 27, 2023

tsan report from first linked ci build:


[2023/04/05 20:50:16.732] 1:   Thread T459 (tid=31133, running) created by thread T6 at:
[2023/04/05 20:50:16.732] 1:     #0 pthread_create /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:966 (realm-tests+0x78a94b)
[2023/04/05 20:50:16.732] 1:     #1 realm::util::Thread::start(void* (*)(void*), void*) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:472 (realm-tests+0xcb7bdb)
[2023/04/05 20:50:16.733] 1:     #2 void realm::util::Thread::start(Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_17) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:450 (realm-tests+0xcb7bdb)
[2023/04/05 20:50:16.733] 1:     #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/test_lang_bind_helper.cpp:4201 (realm-tests+0xcb7bdb)
[2023/04/05 20:50:16.733] 1:     #4 realm::test_util::unit_test::RegisterTest::run_test(realm::test_util::unit_test::TestContext&) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.hpp:641 (realm-tests+0xd16930)
[2023/04/05 20:50:16.733] 1:     #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.cpp:728 (realm-tests+0x1449aee)
[2023/04/05 20:50:16.733] 1:     #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.cpp:682 (realm-tests+0x14494b3)
[2023/04/05 20:50:16.735] 1:     #7 operator() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.cpp:641 (realm-tests+0x144eaca)
[2023/04/05 20:50:16.735] 1:     #8 operator() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.cpp:645 (realm-tests+0x144eaca)
[2023/04/05 20:50:16.735] 1:     #9 void* realm::util::Thread::entry_point(void*) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0x144eaca)
[2023/04/05 20:50:16.736] 1:
[2023/04/05 20:50:16.736] 1: SUMMARY: ThreadSanitizer: data race /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/group_writer.cpp:1333 in void realm::GroupWriter::write_array_at(realm::GroupWriter::MapWindow*, unsigned long, char const*, unsigned long)
[2023/04/05 20:50:16.800] 1: ==================
[2023/04/05 20:50:16.800] 1: ==================
[2023/04/05 20:50:16.800] 1: WARNING: ThreadSanitizer: data race (pid=29191)
[2023/04/05 20:50:16.800] 1:   Write of size 8 at 0x7f7c51d07150 by thread T473 (mutexes: write M595178272699097136, write M141159135264809048):
[2023/04/05 20:50:16.800] 1:     #0 memcpy /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:808 (realm-tests+0x79452e)
[2023/04/05 20:50:16.800] 1:     #1 void realm::GroupWriter::write_array_at(realm::GroupWriter::MapWindow*, unsigned long, char const*, unsigned long) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/group_writer.cpp:1334 (realm-tests+0x1656b5a)
[2023/04/05 20:50:16.800] 1:     #2 realm::GroupWriter::write_group() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/group_writer.cpp:909 (realm-tests+0x1650e7c)
[2023/04/05 20:50:16.800] 1:     #3 realm::DB::low_level_commit(unsigned long, realm::Transaction&, bool) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/db.cpp:2481 (realm-tests+0x163ef99)
[2023/04/05 20:50:16.801] 1:     #4 realm::DB::do_commit(realm::Transaction&, bool) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/db.cpp:2411 (realm-tests+0x163ea98)
[2023/04/05 20:50:16.801] 1:     #5 realm::Transaction::commit_and_continue_as_read(bool) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/transaction.cpp:219 (realm-tests+0x1760b39)
[2023/04/05 20:50:16.801] 1:     #6 (anonymous namespace)::handover_writer(std::shared_ptr) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/test_lang_bind_helper.cpp:4042 (realm-tests+0xd0d6ea)
[2023/04/05 20:50:16.802] 1:     #7 operator() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/test_lang_bind_helper.cpp:4066 (realm-tests+0xd0d6ea)
[2023/04/05 20:50:16.802] 1:     #8 void* realm::util::Thread::entry_point<(anonymous namespace)::handover_querier((anonymous namespace)::HandoverControl<(anonymous namespace)::Work>*, realm::test_util::unit_test::TestContext&, std::shared_ptr)::$_44>(void*) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0xd0d6ea)
[2023/04/05 20:50:16.802] 1:
[2023/04/05 20:50:16.802] 1:   Previous read of size 8 at 0x7f7c51d07150 by thread T459:
[2023/04/05 20:50:16.802] 1:     [failed to restore the stack]
[2023/04/05 20:50:16.802] 1:
[2023/04/05 20:50:16.802] 1:   Mutex M595178272699097136 is already destroyed.
[2023/04/05 20:50:16.802] 1:
[2023/04/05 20:50:16.802] 1:   Mutex M141159135264809048 is already destroyed.
[2023/04/05 20:50:16.802] 1:
[2023/04/05 20:50:16.802] 1:   Thread T473 (tid=31139, running) created by thread T459 at:
[2023/04/05 20:50:16.802] 1:     #0 pthread_create /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:966 (realm-tests+0x78a94b)
[2023/04/05 20:50:16.802] 1:     #1 realm::util::Thread::start(void* (*)(void*), void*) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:472 (realm-tests+0xd0c643)
[2023/04/05 20:50:16.802] 1:     #2 void realm::util::Thread::start<(anonymous namespace)::handover_querier((anonymous namespace)::HandoverControl<(anonymous namespace)::Work>*, realm::test_util::unit_test::TestContext&, std::shared_ptr)::$_44>((anonymous namespace)::handover_querier((anonymous namespace)::HandoverControl<(anonymous namespace)::Work>*, realm::test_util::unit_test::TestContext&, std::shared_ptr)::$_44) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:450 (realm-tests+0xd0c643)
[2023/04/05 20:50:16.802] 1:     #3 (anonymous namespace)::handover_querier((anonymous namespace)::HandoverControl<(anonymous namespace)::Work>*, realm::test_util::unit_test::TestContext&, std::shared_ptr) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/test_lang_bind_helper.cpp:4065 (realm-tests+0xd0c643)
[2023/04/05 20:50:16.802] 1:     #4 operator() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/test_lang_bind_helper.cpp:4202 (realm-tests+0xd0c643)
[2023/04/05 20:50:16.802] 1:     #5 void* realm::util::Thread::entry_point(void*) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0xd0c643)
[2023/04/05 20:50:16.802] 1:
[2023/04/05 20:50:16.802] 1:   Thread T459 (tid=31133, running) created by thread T6 at:
[2023/04/05 20:50:16.802] 1:     #0 pthread_create /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:966 (realm-tests+0x78a94b)
[2023/04/05 20:50:16.802] 1:     #1 realm::util::Thread::start(void* (*)(void*), void*) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:472 (realm-tests+0xcb7bdb)
[2023/04/05 20:50:16.802] 1:     #2 void realm::util::Thread::start(Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_17) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:450 (realm-tests+0xcb7bdb)
[2023/04/05 20:50:16.802] 1:     #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/test_lang_bind_helper.cpp:4201 (realm-tests+0xcb7bdb)
[2023/04/05 20:50:16.802] 1:     #4 realm::test_util::unit_test::RegisterTest::run_test(realm::test_util::unit_test::TestContext&) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.hpp:641 (realm-tests+0xd16930)
[2023/04/05 20:50:16.802] 1:     #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.cpp:728 (realm-tests+0x1449aee)
[2023/04/05 20:50:16.802] 1:     #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.cpp:682 (realm-tests+0x14494b3)
[2023/04/05 20:50:16.802] 1:     #7 operator() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.cpp:641 (realm-tests+0x144eaca)
[2023/04/05 20:50:16.802] 1:     #8 operator() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/util/unit_test.cpp:645 (realm-tests+0x144eaca)
[2023/04/05 20:50:16.802] 1:     #9 void* realm::util::Thread::entry_point(void*) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0x144eaca)
[2023/04/05 20:50:16.803] 1:
[2023/04/05 20:50:16.803] 1: SUMMARY: ThreadSanitizer: data race /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/group_writer.cpp:1334 in void realm::GroupWriter::write_array_at(realm::GroupWriter::MapWindow*, unsigned long, char const*, unsigned long)
#6738 is exactly the same issue. So it was caught already in april by CI.

@ironage
Copy link
Contributor

ironage commented Jul 21, 2023

Another report:

[2023/07/21 18:28:40.751] 1: ==================
[2023/07/21 18:28:40.751] 1: WARNING: ThreadSanitizer: data race (pid=4443)
[2023/07/21 18:28:40.751] 1:   Read of size 2 at 0x7f4d950e4436 by thread T41:
[2023/07/21 18:28:40.751] 1:     #0 long realm::Array::get_universal<16ul>(char const*, unsigned long) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/array.hpp:695 (realm-tests+0x1505aa9)
[2023/07/21 18:28:40.752] 1:     #1 long realm::Array::get<16ul>(unsigned long) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/array.hpp:714 (realm-tests+0x1505aa9)
[2023/07/21 18:28:40.752] 1:     #2 realm::Array::get(unsigned long) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/array.hpp:721 (realm-tests+0x8cef4b)
[2023/07/21 18:28:40.752] 1:     #3 realm::Array::get_as_ref(unsigned long) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/array.hpp:754 (realm-tests+0x8cef4b)
[2023/07/21 18:28:40.752] 1:     #4 realm::ClusterNodeInner::_get_child_ref(unsigned long) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/cluster_tree.cpp:167 (realm-tests+0x1557112)
[2023/07/21 18:28:40.752] 1:     #5 realm::ClusterNodeInner::traverse(realm::util::FunctionRef<realm::IteratorControl (realm::Cluster const*)>, long) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/cluster_tree.cpp:691 (realm-tests+0x1557112)
[2023/07/21 18:28:40.752] 1:     #6 realm::ClusterTree::traverse(realm::util::FunctionRef<realm::IteratorControl (realm::Cluster const*)>) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/cluster_tree.cpp:1067 (realm-tests+0x1558b84)
[2023/07/21 18:28:40.752] 1:     #7 realm::Table::traverse_clusters(realm::util::FunctionRef<realm::IteratorControl (realm::Cluster const*)>) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/table.hpp:327 (realm-tests+0x148459e)
[2023/07/21 18:28:40.752] 1:     #8 realm::Query::do_count(unsigned long) const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/query.cpp:1498 (realm-tests+0x148459e)
[2023/07/21 18:28:40.752] 1:     #9 realm::Query::count() const /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/query.cpp:1520 (realm-tests+0x148499c)
[2023/07/21 18:28:40.753] 1:     #10 (anonymous namespace)::handover_querier((anonymous namespace)::HandoverControl<(anonymous namespace)::Work>*, realm::test_util::unit_test::TestContext&, std::shared_ptr<realm::DB>) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/test_lang_bind_helper.cpp:3854 (realm-tests+0xca6a42)
[2023/07/21 18:28:40.753] 1:     #11 operator() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/test_lang_bind_helper.cpp:3962 (realm-tests+0xca6a42)
[2023/07/21 18:28:40.753] 1:     #12 void* realm::util::Thread::entry_point<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_16>(void*) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0xca6a42)
[2023/07/21 18:28:40.753] 1:
[2023/07/21 18:28:40.753] 1:   Previous write of size 8 at 0x7f4d950e4430 by thread T8:
[2023/07/21 18:28:40.753] 1:     [failed to restore the stack]
[2023/07/21 18:28:40.753] 1:
[2023/07/21 18:28:40.753] 1:   Thread T41 (tid=4507, running) created by thread T3 at:
[2023/07/21 18:28:40.753] 1:     #0 pthread_create /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:966 (realm-tests+0x77894b)
[2023/07/21 18:28:40.753] 1:     #1 realm::util::Thread::start(void* (*)(void*), void*) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/util/thread.hpp:472 (realm-tests+0xc52efb)
[2023/07/21 18:28:40.753] 1:     #2 void realm::util::Thread::start<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_16>(Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_16) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/util/thread.hpp:450 (realm-tests+0xc52efb)
[2023/07/21 18:28:40.753] 1:     #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/test_lang_bind_helper.cpp:3961 (realm-tests+0xc52efb)
[2023/07/21 18:28:40.753] 1:     #4 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.hpp:640 (realm-tests+0xcae340)
[2023/07/21 18:28:40.754] 1:     #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.cpp:728 (realm-tests+0x13f353e)
[2023/07/21 18:28:40.754] 1:     #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.cpp:682 (realm-tests+0x13f2f03)
[2023/07/21 18:28:40.754] 1:     #7 operator() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.cpp:641 (realm-tests+0x13f855a)
[2023/07/21 18:28:40.754] 1:     #8 operator() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.cpp:645 (realm-tests+0x13f855a)
[2023/07/21 18:28:40.754] 1:     #9 void* realm::util::Thread::entry_point<realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::$_1>(void*) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0x13f855a)
[2023/07/21 18:28:40.754] 1:
[2023/07/21 18:28:40.754] 1:   Thread T8 (tid=4508, running) created by thread T3 at:
[2023/07/21 18:28:40.754] 1:     #0 pthread_create /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:966 (realm-tests+0x77894b)
[2023/07/21 18:28:40.754] 1:     #1 realm::util::Thread::start(void* (*)(void*), void*) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/util/thread.hpp:472 (realm-tests+0xc52f55)
[2023/07/21 18:28:40.754] 1:     #2 void realm::util::Thread::start<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_17>(Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_17) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/util/thread.hpp:450 (realm-tests+0xc52f55)
[2023/07/21 18:28:40.754] 1:     #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/test_lang_bind_helper.cpp:3964 (realm-tests+0xc52f55)
[2023/07/21 18:28:40.755] 1:     #4 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.hpp:640 (realm-tests+0xcae340)
[2023/07/21 18:28:40.755] 1:     #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.cpp:728 (realm-tests+0x13f353e)
[2023/07/21 18:28:40.755] 1:     #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.cpp:682 (realm-tests+0x13f2f03)
[2023/07/21 18:28:40.755] 1:     #7 operator() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.cpp:641 (realm-tests+0x13f855a)
[2023/07/21 18:28:40.755] 1:     #8 operator() /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../test/util/unit_test.cpp:645 (realm-tests+0x13f855a)
[2023/07/21 18:28:40.755] 1:     #9 void* realm::util::Thread::entry_point<realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::$_1>(void*) /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0x13f855a)
[2023/07/21 18:28:40.755] 1:
[2023/07/21 18:28:40.755] 1: SUMMARY: ThreadSanitizer: data race /data/mci/0acebc05e9101b1c5979bf665a649410/realm-core/build/../src/realm/array.hpp:695 in long realm::Array::get_universal<16ul>(char const*, unsigned long) const
[2023/07/21 18:28:41.468] 1: ==================

@ironage
Copy link
Contributor

ironage commented Sep 21, 2023

Yet another report:

[2023/09/21 16:42:07.637] 1: ==================
[2023/09/21 16:42:07.637] 1: WARNING: ThreadSanitizer: data race (pid=16700)
[2023/09/21 16:42:07.637] 1:   Read of size 2 at 0x7fb1ef70cb6e by thread T328:
[2023/09/21 16:42:07.653] 1:     #0 long realm::Array::get_universal<16ul>(char const*, unsigned long) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/array.hpp:686 (realm-tests+0x14cbd69)
[2023/09/21 16:42:07.654] 1:     #1 long realm::Array::get<16ul>(unsigned long) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/array.hpp:705 (realm-tests+0x14cbd69)
[2023/09/21 16:42:07.654] 1:     #2 realm::Array::get(unsigned long) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/array.hpp:712 (realm-tests+0x8c0a0b)
[2023/09/21 16:42:07.654] 1:     #3 realm::Array::get_as_ref(unsigned long) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/array.hpp:745 (realm-tests+0x8c0a0b)
[2023/09/21 16:42:07.654] 1:     #4 realm::ClusterNodeInner::_get_child_ref(unsigned long) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/cluster_tree.cpp:167 (realm-tests+0x151d812)
[2023/09/21 16:42:07.655] 1:     #5 realm::ClusterNodeInner::traverse(realm::util::FunctionRef<realm::IteratorControl (realm::Cluster const*)>, long) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/cluster_tree.cpp:691 (realm-tests+0x151d812)
[2023/09/21 16:42:07.655] 1:     #6 realm::ClusterTree::traverse(realm::util::FunctionRef<realm::IteratorControl (realm::Cluster const*)>) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/cluster_tree.cpp:1042 (realm-tests+0x151f014)
[2023/09/21 16:42:07.655] 1:     #7 realm::Table::traverse_clusters(realm::util::FunctionRef<realm::IteratorControl (realm::Cluster const*)>) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/table.hpp:324 (realm-tests+0x144aa6e)
[2023/09/21 16:42:07.655] 1:     #8 realm::Query::do_count(unsigned long) const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/query.cpp:1480 (realm-tests+0x144aa6e)
[2023/09/21 16:42:07.655] 1:     #9 realm::Query::count() const /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/query.cpp:1499 (realm-tests+0x144ae59)
[2023/09/21 16:42:07.655] 1:     #10 (anonymous namespace)::handover_querier((anonymous namespace)::HandoverControl<(anonymous namespace)::Work>*, realm::test_util::unit_test::TestContext&, std::shared_ptr<realm::DB>) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/test_lang_bind_helper.cpp:3854 (realm-tests+0xc99772)
[2023/09/21 16:42:07.656] 1:     #11 operator() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/test_lang_bind_helper.cpp:3962 (realm-tests+0xc99772)
[2023/09/21 16:42:07.659] 1:     #12 void* realm::util::Thread::entry_point<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_16>(void*) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0xc99772)
[2023/09/21 16:42:07.660] 1:
[2023/09/21 16:42:07.660] 1:   Previous write of size 8 at 0x7fb1ef70cb68 by thread T335:
[2023/09/21 16:42:07.660] 1:     [failed to restore the stack]
[2023/09/21 16:42:07.660] 1:
[2023/09/21 16:42:07.660] 1:   Thread T328 (tid=17255, running) created by thread T4 at:
[2023/09/21 16:42:07.660] 1:     #0 pthread_create /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:966 (realm-tests+0x76794b)
[2023/09/21 16:42:07.661] 1:     #1 realm::util::Thread::start(void* (*)(void*), void*) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/util/thread.hpp:472 (realm-tests+0xc4594b)
[2023/09/21 16:42:07.663] 1:     #2 void realm::util::Thread::start<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_16>(Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_16) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/util/thread.hpp:450 (realm-tests+0xc4594b)
[2023/09/21 16:42:07.663] 1:     #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/test_lang_bind_helper.cpp:3961 (realm-tests+0xc4594b)
[2023/09/21 16:42:07.663] 1:     #4 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.hpp:640 (realm-tests+0xca10c0)
[2023/09/21 16:42:07.663] 1:     #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.cpp:728 (realm-tests+0x13b81ce)
[2023/09/21 16:42:07.663] 1:     #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.cpp:682 (realm-tests+0x13b7b93)
[2023/09/21 16:42:07.664] 1:     #7 operator() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.cpp:641 (realm-tests+0x13bd1ea)
[2023/09/21 16:42:07.664] 1:     #8 operator() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.cpp:645 (realm-tests+0x13bd1ea)
[2023/09/21 16:42:07.664] 1:     #9 void* realm::util::Thread::entry_point<realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::$_1>(void*) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0x13bd1ea)
[2023/09/21 16:42:07.664] 1:
[2023/09/21 16:42:07.664] 1:   Thread T335 (tid=17258, running) created by thread T4 at:
[2023/09/21 16:42:07.664] 1:     #0 pthread_create /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/rtl/tsan_interceptors_posix.cpp:966 (realm-tests+0x76794b)
[2023/09/21 16:42:07.664] 1:     #1 realm::util::Thread::start(void* (*)(void*), void*) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/util/thread.hpp:472 (realm-tests+0xc459a5)
[2023/09/21 16:42:07.664] 1:     #2 void realm::util::Thread::start<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_17>(Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::$_17) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/util/thread.hpp:450 (realm-tests+0xc459a5)
[2023/09/21 16:42:07.665] 1:     #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/test_lang_bind_helper.cpp:3964 (realm-tests+0xc459a5)
[2023/09/21 16:42:07.665] 1:     #4 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.hpp:640 (realm-tests+0xca10c0)
[2023/09/21 16:42:07.665] 1:     #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.cpp:728 (realm-tests+0x13b81ce)
[2023/09/21 16:42:07.666] 1:     #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.cpp:682 (realm-tests+0x13b7b93)
[2023/09/21 16:42:07.666] 1:     #7 operator() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.cpp:641 (realm-tests+0x13bd1ea)
[2023/09/21 16:42:07.666] 1:     #8 operator() /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../test/util/unit_test.cpp:645 (realm-tests+0x13bd1ea)
[2023/09/21 16:42:07.669] 1:     #9 void* realm::util::Thread::entry_point<realm::test_util::unit_test::TestList::run(realm::test_util::unit_test::TestList::Config)::$_1>(void*) /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0x13bd1ea)
[2023/09/21 16:42:07.672] 1:
[2023/09/21 16:42:07.672] 1: SUMMARY: ThreadSanitizer: data race /data/mci/8b0793ac541e3e1d3c19e2f11223bb07/realm-core/build/../src/realm/array.hpp:686 in long realm::Array::get_universal<16ul>(char const*, unsigned long) const
[2023/09/21 16:42:07.672] 1: ==================

@kiburtse kiburtse changed the title TSAN Data Race in LangBindHelper_HandoverBetweenThreads TSAN Data Race in LangBindHelper_HandoverBetweenThreads on stack GroupWriter::write_array_at Oct 11, 2023
@kiburtse
Copy link
Contributor

@ironage these last failure you've reported are on different stack. All from queries similar to #6624 from what i can tell. Does it make sense to separate them into other ticket for Array::get_universal?

@ironage
Copy link
Contributor

ironage commented Oct 11, 2023

@kiburtse I think I'd like to keep all races from LangBindHelper_HandoverBetweenThreads together as I suspect a common underlying root cause. I don't see enough information from any one report to actually track down the culprit, so I hope that by looking through the combined reports we can piece together enough clues to find the root issue.

@kiburtse kiburtse changed the title TSAN Data Race in LangBindHelper_HandoverBetweenThreads on stack GroupWriter::write_array_at TSAN Data Race in LangBindHelper_HandoverBetweenThreads Oct 12, 2023
@kiburtse
Copy link
Contributor

ok, let's keep all failures under this ticket. The only stack trace which has happened multiple times already is on realm::Array::get_universal. That must be concurrent access to Node::m_data. What could this mean?

@ironage
Copy link
Contributor

ironage commented Oct 12, 2023

In theory, I think it means that an array is being written to while there are still readers accessing it concurrently. But this should not be possible because a writer does a copy-on-write and only modifies the array at its new location specifically to allow readers at previous versions to continue to see a consistent view of the data. This may indicate a problem with our locking, version management, or online compaction. It could also be a false positive from TSAN.

@kiburtse
Copy link
Contributor

kiburtse commented Nov 13, 2023

One more occurence from CI test run on the same ubuntu2004-encryption-tsan

I've tried v13.23.3 on archlinux with tsan and gcc and it reproduces for me 100% of the time on this test run alone also. Missing stack trace in our output seems to be related to small default value for tsan stacktrace - increased history_size option for tsan should solve the issue.

There are actually two possible combinations: we have three threads running handover_querier, handover_writer and handover_verifier functions.

Full tsan output from these two stacks:

Stack 1
WARNING: ThreadSanitizer: data race (pid=5477)
  Write of size 8 at 0x7f1af0a000a0 by thread T106 (mutexes: write M0):
    #0 __tsan_memcpy /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:3081 (libtsan.so.2+0x79499) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::EncryptedFileMapping::copy_up_to_date_page(unsigned long) /home/user/r/src/src/realm/util/encrypted_file_mapping.cpp:633 (realm-tests+0x12869fc) (BuildId: 4ed06def7c89e968)
    #2 realm::util::EncryptedFileMapping::refresh_page(unsigned long, unsigned long) /home/user/r/src/src/realm/util/encrypted_file_mapping.cpp:648 (realm-tests+0x1289d79) (BuildId: 4ed06def7c89e968)
    #3 realm::util::EncryptedFileMapping::read_barrier(void const*, unsigned long, unsigned long (*)(char const*), bool) /home/user/r/src/src/realm/util/encrypted_file_mapping.cpp:987 (realm-tests+0x128a58f) (BuildId: 4ed06def7c89e968)
    #4 realm::util::do_encryption_read_barrier(void const*, unsigned long, unsigned long (*)(char const*), realm::util::EncryptedFileMapping*, bool) /home/user/r/src/src/realm/util/file_mapper.hpp:144 (realm-tests+0x3ddc2c) (BuildId: 4ed06def7c89e968)
    #5 realm::util::encryption_read_barrier(void const*, unsigned long, realm::util::EncryptedFileMapping*, unsigned long (*)(char const*), bool) /home/user/r/src/src/realm/util/file_mapper.hpp:116 (realm-tests+0x3ddc2c)
    #6 realm::Allocator::translate_critical(realm::Allocator::RefTranslation*, unsigned long) const /home/user/r/src/src/realm/alloc.hpp:570 (realm-tests+0x3ddc2c)
    #7 realm::Allocator::translate(unsigned long) const /home/user/r/src/src/realm/alloc.hpp:588 (realm-tests+0x3ddc2c)
    #8 realm::Array::init_from_ref(unsigned long) /home/user/r/src/src/realm/array.hpp:114 (realm-tests+0x3f5bfe) (BuildId: 4ed06def7c89e968)
    #9 realm::_impl::GroupFriend::get_history_ref(realm::Allocator&, unsigned long) /home/user/r/src/src/realm/group.hpp:1182 (realm-tests+0x3f5bfe)
    #10 bool realm::Transaction::internal_advance_read<realm::_impl::NullInstructionObserver>(realm::_impl::NullInstructionObserver*, realm::VersionID, realm::_impl::History&, bool) /home/user/r/src/src/realm/transaction.hpp:496 (realm-tests+0x3f5bfe)
    #11 void realm::Transaction::advance_read<realm::_impl::NullInstructionObserver>(realm::_impl::NullInstructionObserver*, realm::VersionID) /home/user/r/src/src/realm/transaction.hpp:355 (realm-tests+0x79dd52) (BuildId: 4ed06def7c89e968)
    #12 realm::Transaction::advance_read(realm::VersionID) /home/user/r/src/src/realm/transaction.hpp:75 (realm-tests+0x79dd52)
    #13 handover_querier /home/user/r/src/test/test_lang_bind_helper.cpp:3839 (realm-tests+0x79dd52)
    #14 operator() /home/user/r/src/test/test_lang_bind_helper.cpp:3962 (realm-tests+0x79e565) (BuildId: 4ed06def7c89e968)
    #15 entry_point<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0x79e565)

  Previous read of size 1 at 0x7f1af0a000a4 by thread T107:
    #0 realm::NodeHoader::get_width_from_header(char const*) /home/user/r/src/src/realm/node_header.hpp:109 (realm-tests+0xfe38e8) (BuildId: 4ed06def7c89e968)
    #1 realm::Array::get(char const*, unsigned long) /home/user/r/src/src/realm/array.cpp:1269 (realm-tests+0xfe38e8)
    #2 long realm::Obj::_get<long>(realm::ColKey::Idx) const /home/user/r/src/src/realm/obj.cpp:366 (realm-tests+0x1147935) (BuildId: 4ed06def7c89e968)
    #3 long realm::Obj::get<long>(realm::ColKey) const /home/user/r/src/src/realm/obj.cpp:388 (realm-tests+0x114b7b4) (BuildId: 4ed06def7c89e968)
    #4 handover_verifier /home/user/r/src/test/test_lang_bind_helper.cpp:3881 (realm-tests+0x75ec7f) (BuildId: 4ed06def7c89e968)
    #5 operator() /home/user/r/src/test/test_lang_bind_helper.cpp:3965 (realm-tests+0x75f5d4) (BuildId: 4ed06def7c89e968)
    #6 entry_point<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0x75f5d4)

  Mutex M0 (0x7b0c00004740) created at:
    #0 pthread_mutex_init /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1329 (libtsan.so.2+0x448c6) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::Mutex::init_as_regular() /home/user/r/src/src/realm/util/thread.hpp:517 (realm-tests+0x12c083f) (BuildId: 4ed06def7c89e968)
    #2 realm::util::Mutex::Mutex() /home/user/r/src/src/realm/util/thread.hpp:494 (realm-tests+0x12c083f)
    #3 __static_initialization_and_destruction_0 /home/user/r/src/src/realm/util/file_mapper.cpp:113 (realm-tests+0x12c083f)
    #4 _GLOBAL__sub_I_file_mapper.cpp /home/user/r/src/src/realm/util/file_mapper.cpp:984 (realm-tests+0x12c083f)

  Thread T106 (tid=5584, running) created by thread T2 at:
    #0 pthread_create /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1036 (libtsan.so.2+0x44219) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::Thread::start(void* (*)(void*), void*) /home/user/r/src/src/realm/util/thread.hpp:472 (realm-tests+0x755423) (BuildId: 4ed06def7c89e968)
    #2 start<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:450 (realm-tests+0x755423)
    #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /home/user/r/src/test/test_lang_bind_helper.cpp:3961 (realm-tests+0x755423)
    #4 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /home/user/r/src/test/util/unit_test.hpp:640 (realm-tests+0x7c773d) (BuildId: 4ed06def7c89e968)
    #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /home/user/r/src/test/util/unit_test.cpp:727 (realm-tests+0xee15c0) (BuildId: 4ed06def7c89e968)
    #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /home/user/r/src/test/util/unit_test.cpp:681 (realm-tests+0xee18b0) (BuildId: 4ed06def7c89e968)
    #7 operator() /home/user/r/src/test/util/unit_test.cpp:640 (realm-tests+0xee1d08) (BuildId: 4ed06def7c89e968)
    #8 operator() /home/user/r/src/test/util/unit_test.cpp:644 (realm-tests+0xee1d98) (BuildId: 4ed06def7c89e968)
    #9 entry_point<realm::test_util::unit_test::TestList::run(Config)::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0xee1d98)

  Thread T107 (tid=5585, running) created by thread T2 at:
    #0 pthread_create /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1036 (libtsan.so.2+0x44219) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::Thread::start(void* (*)(void*), void*) /home/user/r/src/src/realm/util/thread.hpp:472 (realm-tests+0x75549d) (BuildId: 4ed06def7c89e968)
    #2 start<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:450 (realm-tests+0x75549d)
    #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /home/user/r/src/test/test_lang_bind_helper.cpp:3964 (realm-tests+0x75549d)
    #4 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /home/user/r/src/test/util/unit_test.hpp:640 (realm-tests+0x7c773d) (BuildId: 4ed06def7c89e968)
    #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /home/user/r/src/test/util/unit_test.cpp:727 (realm-tests+0xee15c0) (BuildId: 4ed06def7c89e968)
    #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /home/user/r/src/test/util/unit_test.cpp:681 (realm-tests+0xee18b0) (BuildId: 4ed06def7c89e968)
    #7 operator() /home/user/r/src/test/util/unit_test.cpp:640 (realm-tests+0xee1d08) (BuildId: 4ed06def7c89e968)
    #8 operator() /home/user/r/src/test/util/unit_test.cpp:644 (realm-tests+0xee1d98) (BuildId: 4ed06def7c89e968)
    #9 entry_point<realm::test_util::unit_test::TestList::run(Config)::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0xee1d98)

SUMMARY: ThreadSanitizer: data race /home/user/r/src/src/realm/util/encrypted_file_mapping.cpp:633 in realm::util::EncryptedFileMapping::copy_up_to_date_page(unsigned long)
==================
Stack 2
==================
WARNING: ThreadSanitizer: data race (pid=5477)
  Read of size 8 at 0x7f1afbb9b0b8 by thread T106 (mutexes: write M0):
    #0 __tsan_memcpy /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:3081 (libtsan.so.2+0x79499) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::EncryptedFileMapping::copy_up_to_date_page(unsigned long) /home/user/r/src/src/realm/util/encrypted_file_mapping.cpp:633 (realm-tests+0x12869fc) (BuildId: 4ed06def7c89e968)
    #2 realm::util::EncryptedFileMapping::refresh_page(unsigned long, unsigned long) /home/user/r/src/src/realm/util/encrypted_file_mapping.cpp:648 (realm-tests+0x1289d79) (BuildId: 4ed06def7c89e968)
    #3 realm::util::EncryptedFileMapping::read_barrier(void const*, unsigned long, unsigned long (*)(char const*), bool) /home/user/r/src/src/realm/util/encrypted_file_mapping.cpp:987 (realm-tests+0x128a58f) (BuildId: 4ed06def7c89e968)
    #4 realm::util::do_encryption_read_barrier(void const*, unsigned long, unsigned long (*)(char const*), realm::util::EncryptedFileMapping*, bool) /home/user/r/src/src/realm/util/file_mapper.hpp:144 (realm-tests+0x3ddc2c) (BuildId: 4ed06def7c89e968)
    #5 realm::util::encryption_read_barrier(void const*, unsigned long, realm::util::EncryptedFileMapping*, unsigned long (*)(char const*), bool) /home/user/r/src/src/realm/util/file_mapper.hpp:116 (realm-tests+0x3ddc2c)
    #6 realm::Allocator::translate_critical(realm::Allocator::RefTranslation*, unsigned long) const /home/user/r/src/src/realm/alloc.hpp:570 (realm-tests+0x3ddc2c)
    #7 realm::Allocator::translate(unsigned long) const /home/user/r/src/src/realm/alloc.hpp:588 (realm-tests+0x3ddc2c)
    #8 realm::Array::init_from_ref(unsigned long) /home/user/r/src/src/realm/array.hpp:114 (realm-tests+0x3f5bfe) (BuildId: 4ed06def7c89e968)
    #9 realm::_impl::GroupFriend::get_history_ref(realm::Allocator&, unsigned long) /home/user/r/src/src/realm/group.hpp:1182 (realm-tests+0x3f5bfe)
    #10 bool realm::Transaction::internal_advance_read<realm::_impl::NullInstructionObserver>(realm::_impl::NullInstructionObserver*, realm::VersionID, realm::_impl::History&, bool) /home/user/r/src/src/realm/transaction.hpp:496 (realm-tests+0x3f5bfe)
    #11 void realm::Transaction::advance_read<realm::_impl::NullInstructionObserver>(realm::_impl::NullInstructionObserver*, realm::VersionID) /home/user/r/src/src/realm/transaction.hpp:355 (realm-tests+0x79dd52) (BuildId: 4ed06def7c89e968)
    #12 realm::Transaction::advance_read(realm::VersionID) /home/user/r/src/src/realm/transaction.hpp:75 (realm-tests+0x79dd52)
    #13 handover_querier /home/user/r/src/test/test_lang_bind_helper.cpp:3839 (realm-tests+0x79dd52)
    #14 operator() /home/user/r/src/test/test_lang_bind_helper.cpp:3962 (realm-tests+0x79e565) (BuildId: 4ed06def7c89e968)
    #15 entry_point<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0x79e565)

  Previous write of size 4 at 0x7f1afbb9b0b8 by thread T108 (mutexes: write M1, write M2):
    #0 void realm::GroupWriter::write_array_at<realm::WriteWindowMgr::MapWindow>(realm::WriteWindowMgr::MapWindow*, unsigned long, char const*, unsigned long) /home/user/r/src/src/realm/group_writer.cpp:1370 (realm-tests+0x10c6181) (BuildId: 4ed06def7c89e968)
    #1 realm::GroupWriter::write_group() /home/user/r/src/src/realm/group_writer.cpp:930 (realm-tests+0x10ce736) (BuildId: 4ed06def7c89e968)
    #2 realm::DB::low_level_commit(unsigned long, realm::Transaction&, bool) /home/user/r/src/src/realm/db.cpp:2504 (realm-tests+0x10bc69d) (BuildId: 4ed06def7c89e968)
    #3 realm::DB::do_commit(realm::Transaction&, bool) /home/user/r/src/src/realm/db.cpp:2426 (realm-tests+0x10bd264) (BuildId: 4ed06def7c89e968)
    #4 realm::Transaction::commit_and_continue_as_read(bool) /home/user/r/src/src/realm/transaction.cpp:220 (realm-tests+0x121646c) (BuildId: 4ed06def7c89e968)
    #5 handover_writer /home/user/r/src/test/test_lang_bind_helper.cpp:3802 (realm-tests+0x78c4d9) (BuildId: 4ed06def7c89e968)
    #6 operator() /home/user/r/src/test/test_lang_bind_helper.cpp:3826 (realm-tests+0x78c74c) (BuildId: 4ed06def7c89e968)
    #7 entry_point<(anonymous namespace)::handover_querier(HandoverControl<Work>*, realm::test_util::unit_test::TestContext&, realm::DBRef)::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0x78c74c)

  Mutex M0 (0x7b0c00004740) created at:
    #0 pthread_mutex_init /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1329 (libtsan.so.2+0x448c6) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::Mutex::init_as_regular() /home/user/r/src/src/realm/util/thread.hpp:517 (realm-tests+0x12c083f) (BuildId: 4ed06def7c89e968)
    #2 realm::util::Mutex::Mutex() /home/user/r/src/src/realm/util/thread.hpp:494 (realm-tests+0x12c083f)
    #3 __static_initialization_and_destruction_0 /home/user/r/src/src/realm/util/file_mapper.cpp:113 (realm-tests+0x12c083f)
    #4 _GLOBAL__sub_I_file_mapper.cpp /home/user/r/src/src/realm/util/file_mapper.cpp:984 (realm-tests+0x12c083f)

  Mutex M1 (0x7f1b0be8e030) created at:
    #0 pthread_mutex_lock /usr/src/debug/gcc/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4481 (libtsan.so.2+0x560df) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::RobustMutex::low_level_lock() /home/user/r/src/src/realm/util/thread.cpp:221 (realm-tests+0x12ce6ff) (BuildId: 4ed06def7c89e968)
    #2 void realm::util::RobustMutex::lock<realm::util::InterprocessMutex::lock()::{lambda()#1}>(realm::util::InterprocessMutex::lock()::{lambda()#1}) /home/user/r/src/src/realm/util/thread.hpp:639 (realm-tests+0x10b3d5b) (BuildId: 4ed06def7c89e968)
    #3 realm::util::InterprocessMutex::lock() /home/user/r/src/src/realm/util/interprocess_mutex.hpp:347 (realm-tests+0x10b3d5b)
    #4 realm::DB::do_begin_write() /home/user/r/src/src/realm/db.cpp:2319 (realm-tests+0x10b3d5b)
    #5 realm::DB::start_write(bool) /home/user/r/src/src/realm/db.cpp:2663 (realm-tests+0x10b6a24) (BuildId: 4ed06def7c89e968)
    #6 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /home/user/r/src/test/test_lang_bind_helper.cpp:3950 (realm-tests+0x7550f8) (BuildId: 4ed06def7c89e968)
    #7 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /home/user/r/src/test/util/unit_test.hpp:640 (realm-tests+0x7c773d) (BuildId: 4ed06def7c89e968)
    #8 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /home/user/r/src/test/util/unit_test.cpp:727 (realm-tests+0xee15c0) (BuildId: 4ed06def7c89e968)
    #9 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /home/user/r/src/test/util/unit_test.cpp:681 (realm-tests+0xee18b0) (BuildId: 4ed06def7c89e968)
    #10 operator() /home/user/r/src/test/util/unit_test.cpp:640 (realm-tests+0xee1d08) (BuildId: 4ed06def7c89e968)
    #11 operator() /home/user/r/src/test/util/unit_test.cpp:644 (realm-tests+0xee1d98) (BuildId: 4ed06def7c89e968)
    #12 entry_point<realm::test_util::unit_test::TestList::run(Config)::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0xee1d98)

  Mutex M2 (0x7f1b0be8e058) created at:
    #0 pthread_mutex_trylock /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1355 (libtsan.so.2+0x44aff) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::RobustMutex::is_valid() /home/user/r/src/src/realm/util/thread.cpp:265 (realm-tests+0x12ce7aa) (BuildId: 4ed06def7c89e968)
    #2 realm::util::InterprocessMutex::is_valid() /home/user/r/src/src/realm/util/interprocess_mutex.hpp:416 (realm-tests+0x10b7803) (BuildId: 4ed06def7c89e968)
    #3 realm::DB::open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, realm::DBOptions const&) /home/user/r/src/src/realm/db.cpp:1117 (realm-tests+0x10b7803)
    #4 realm::DB::open(realm::Replication&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, realm::DBOptions const&) /home/user/r/src/src/realm/db.cpp:1468 (realm-tests+0x10bacc0) (BuildId: 4ed06def7c89e968)
    #5 realm::DB::create(realm::Replication&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, realm::DBOptions const&) /home/user/r/src/src/realm/db.cpp:2765 (realm-tests+0x10bacc0)
    #6 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /home/user/r/src/test/test_lang_bind_helper.cpp:3949 (realm-tests+0x7550a1) (BuildId: 4ed06def7c89e968)
    #7 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /home/user/r/src/test/util/unit_test.hpp:640 (realm-tests+0x7c773d) (BuildId: 4ed06def7c89e968)
    #8 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /home/user/r/src/test/util/unit_test.cpp:727 (realm-tests+0xee15c0) (BuildId: 4ed06def7c89e968)
    #9 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /home/user/r/src/test/util/unit_test.cpp:681 (realm-tests+0xee18b0) (BuildId: 4ed06def7c89e968)
    #10 operator() /home/user/r/src/test/util/unit_test.cpp:640 (realm-tests+0xee1d08) (BuildId: 4ed06def7c89e968)
    #11 operator() /home/user/r/src/test/util/unit_test.cpp:644 (realm-tests+0xee1d98) (BuildId: 4ed06def7c89e968)
    #12 entry_point<realm::test_util::unit_test::TestList::run(Config)::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0xee1d98)

  Thread T106 (tid=5584, running) created by thread T2 at:
    #0 pthread_create /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1036 (libtsan.so.2+0x44219) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::Thread::start(void* (*)(void*), void*) /home/user/r/src/src/realm/util/thread.hpp:472 (realm-tests+0x755423) (BuildId: 4ed06def7c89e968)
    #2 start<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:450 (realm-tests+0x755423)
    #3 Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run() /home/user/r/src/test/test_lang_bind_helper.cpp:3961 (realm-tests+0x755423)
    #4 realm::test_util::unit_test::RegisterTest<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads>::run_test(realm::test_util::unit_test::TestContext&) /home/user/r/src/test/util/unit_test.hpp:640 (realm-tests+0x7c773d) (BuildId: 4ed06def7c89e968)
    #5 realm::test_util::unit_test::TestList::ThreadContextImpl::run(realm::test_util::unit_test::TestList::SharedContextImpl::Entry, realm::util::UniqueLock&) /home/user/r/src/test/util/unit_test.cpp:727 (realm-tests+0xee15c0) (BuildId: 4ed06def7c89e968)
    #6 realm::test_util::unit_test::TestList::ThreadContextImpl::run() /home/user/r/src/test/util/unit_test.cpp:681 (realm-tests+0xee18b0) (BuildId: 4ed06def7c89e968)
    #7 operator() /home/user/r/src/test/util/unit_test.cpp:640 (realm-tests+0xee1d08) (BuildId: 4ed06def7c89e968)
    #8 operator() /home/user/r/src/test/util/unit_test.cpp:644 (realm-tests+0xee1d98) (BuildId: 4ed06def7c89e968)
    #9 entry_point<realm::test_util::unit_test::TestList::run(Config)::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0xee1d98)

  Thread T108 (tid=5586, running) created by thread T106 at:
    #0 pthread_create /usr/src/debug/gcc/gcc/libsanitizer/tsan/tsan_interceptors_posix.cpp:1036 (libtsan.so.2+0x44219) (BuildId: 7e8fcb9ed0a63b98f2293e37c92ac955413efd9e)
    #1 realm::util::Thread::start(void* (*)(void*), void*) /home/user/r/src/src/realm/util/thread.hpp:472 (realm-tests+0x79dbc8) (BuildId: 4ed06def7c89e968)
    #2 start<(anonymous namespace)::handover_querier(HandoverControl<Work>*, realm::test_util::unit_test::TestContext&, realm::DBRef)::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:450 (realm-tests+0x79dbc8)
    #3 handover_querier /home/user/r/src/test/test_lang_bind_helper.cpp:3825 (realm-tests+0x79dbc8)
    #4 operator() /home/user/r/src/test/test_lang_bind_helper.cpp:3962 (realm-tests+0x79e565) (BuildId: 4ed06def7c89e968)
    #5 entry_point<Realm_UnitTest__LangBindHelper_HandoverBetweenThreads::test_run()::<lambda()> > /home/user/r/src/src/realm/util/thread.hpp:483 (realm-tests+0x79e565)

SUMMARY: ThreadSanitizer: data race /home/user/r/src/src/realm/util/encrypted_file_mapping.cpp:633 in realm::util::EncryptedFileMapping::copy_up_to_date_page(unsigned long)

@finnschiermer @ironage could you comment on that? Looks to me that this is specifically encryption related issue after all.

@kiburtse
Copy link
Contributor

So, as was clarified under under linked pr for work-around, this may really be just a false positive in our case. Encrypted pages are refreshed unconditionally, but the data itself for those threads which read it in the moment is not expected to change, but tsan naturally sees it as a race. So, it's by design.

@ironage do you think the stacktraces we've gathered here could all be explained by this?

everything what intersects with realm::util::EncryptedFileMapping::copy_up_to_date_page so far:

realm::GroupWriter::write_array_at<realm::WriteWindowMgr::MapWindow>
realm::NodeHoader::get_width_from_header
realm::GroupCommitter::commit
realm::GroupWriter::write_array
realm::ArrayWithFind::find_sse_intern
realm::Array::get_universal

Suppressing all this with tsan doesn't look like the solution to me.

@ironage
Copy link
Contributor

ironage commented Nov 25, 2023

@kiburtse yes, at this point, I do think that every race that intersects with EncryptedFileMapping::copy_up_to_date_page is benign. We do have a suppression for that already, and I wonder if the races being reported are just not being suppressed properly because that part of the stack was lost. Eg. on the failing examples in this issue we see something like this:

[2023/04/05 13:50:16.578] 1: WARNING: ThreadSanitizer: data race (pid=29191)
[2023/04/05 13:50:16.578] 1:   Write of size 8 at 0x7f7c51d07140 by thread T473 (mutexes: write M595178272699097136, write M141159135264809048):
[2023/04/05 13:50:16.580] 1:     #0 memcpy /home/nnelson/Documents/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/tsan/../sanitizer_common/sanitizer_common_interceptors.inc:808 (realm-tests+0x79452e)
[2023/04/05 13:50:16.581] 1:     #1 void realm::GroupWriter::write_array_at<realm::GroupWriter::MapWindow>(realm::GroupWriter::MapWindow*, unsigned long, char const*, unsigned long) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/group_writer.cpp:1334 (realm-tests+0x1656b5a)
[2023/04/05 13:50:16.583] 1:     #2 realm::GroupWriter::write_group() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/group_writer.cpp:897 (realm-tests+0x1650def)
[2023/04/05 13:50:16.584] 1:     #3 realm::DB::low_level_commit(unsigned long, realm::Transaction&, bool) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/db.cpp:2481 (realm-tests+0x163ef99)
[2023/04/05 13:50:16.586] 1:     #4 realm::DB::do_commit(realm::Transaction&, bool) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/db.cpp:2411 (realm-tests+0x163ea98)
[2023/04/05 13:50:16.587] 1:     #5 realm::Transaction::commit_and_continue_as_read(bool) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/transaction.cpp:219 (realm-tests+0x1760b39)
[2023/04/05 13:50:16.588] 1:     #6 (anonymous namespace)::handover_writer(std::shared_ptr<realm::DB>) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/test_lang_bind_helper.cpp:4042 (realm-tests+0xd0d6ea)
[2023/04/05 13:50:16.589] 1:     #7 operator() /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../test/test_lang_bind_helper.cpp:4066 (realm-tests+0xd0d6ea)
[2023/04/05 13:50:16.589] 1:     #8 void* realm::util::Thread::entry_point<(anonymous namespace)::handover_querier((anonymous namespace)::HandoverControl<(anonymous namespace)::Work>*, realm::test_util::unit_test::TestContext&, std::shared_ptr<realm::DB>)::$_44>(void*) /data/mci/88bc830d5b7a75ee4adeed742eeb587b/realm-core/build/../src/realm/util/thread.hpp:483 (realm-tests+0xd0d6ea)
[2023/04/05 13:50:16.589] 1:
[2023/04/05 13:50:16.589] 1:   Previous read of size 8 at 0x7f7c51d07140 by thread T459:
[2023/04/05 13:50:16.589] 1:     [failed to restore the stack]

But the second stack, the one that couldn't be restored, is very likely EncryptedFileMapping::copy_up_to_date_page. And maybe because the stack couldn't be restored, it is not getting filtered by our suppression? If that is the case, then your recent change to increase the TSAN history may fix all this.
If not, although it is not ideal, maybe the only path forward is to add the rest of the races as they come up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants