ThreadSanitizer reports #52

Closed · Pagghiu opened this issue Jan 17, 2020 · 9 comments

Pagghiu commented Jan 17, 2020

I have been testing the latest master (4f9941b).
ThreadSanitizer, enabled under Xcode 11.0, reports some data races when running the unmodified samples.

As an example, here is the output of one data race report for the ParallelSum sample.

==================
WARNING: ThreadSanitizer: data race (pid=56086)
  Read of size 4 at 0x7ffeefbff49c by thread T4:
    #0 enki::TaskScheduler::TryRunTask(unsigned int, unsigned int, unsigned int&) TaskScheduler.cpp:412 (ParallelSum:x86_64+0x100007204)
    #1 enki::TaskScheduler::TryRunTask(unsigned int, unsigned int&) TaskScheduler.cpp:377 (ParallelSum:x86_64+0x1000050d0)
    #2 enki::TaskScheduler::TaskingThreadFunction(enki::ThreadArgs const&) TaskScheduler.cpp:236 (ParallelSum:x86_64+0x100004e04)
    #3 decltype(std::__1::forward<void (*)(enki::ThreadArgs const&)>(fp)(std::__1::forward<enki::ThreadArgs>(fp0))) std::__1::__invoke<void (*)(enki::ThreadArgs const&), enki::ThreadArgs>(void (*&&)(enki::ThreadArgs const&), enki::ThreadArgs&&) type_traits:4361 (ParallelSum:x86_64+0x10000d06d)
    #4 void std::__1::__thread_execute<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(enki::ThreadArgs const&), enki::ThreadArgs, 2ul>(std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(enki::ThreadArgs const&), enki::ThreadArgs>&, std::__1::__tuple_indices<2ul>) thread:342 (ParallelSum:x86_64+0x10000ceb1)
    #5 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(enki::ThreadArgs const&), enki::ThreadArgs> >(void*) thread:352 (ParallelSum:x86_64+0x10000bf09)

  Previous write of size 4 at 0x7ffeefbff49c by main thread:
    #0 enki::ITaskSet::ITaskSet() TaskScheduler.h:122 (ParallelSum:x86_64+0x100003288)
    #1 ParallelReductionSumTaskSet::ParallelReductionSumTaskSet(unsigned int) ParallelSum.cpp:81 (ParallelSum:x86_64+0x100003ba8)
    #2 ParallelReductionSumTaskSet::ParallelReductionSumTaskSet(unsigned int) ParallelSum.cpp:82 (ParallelSum:x86_64+0x100002e04)
    #3 main ParallelSum.cpp:146 (ParallelSum:x86_64+0x100002390)

  Location is stack of main thread.

  Thread T4 (tid=3398714, running) created by main thread at:
    #0 pthread_create <null>:2673040 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x2aa2d)
    #1 std::__1::__libcpp_thread_create(_opaque_pthread_t**, void* (*)(void*), void*) __threading_support:328 (ParallelSum:x86_64+0x10000be4e)
    #2 std::__1::thread::thread<void (&)(enki::ThreadArgs const&), enki::ThreadArgs, void>(void (&&&)(enki::ThreadArgs const&), enki::ThreadArgs&&) thread:368 (ParallelSum:x86_64+0x10000ba71)
    #3 std::__1::thread::thread<void (&)(enki::ThreadArgs const&), enki::ThreadArgs, void>(void (&&&)(enki::ThreadArgs const&), enki::ThreadArgs&&) thread:360 (ParallelSum:x86_64+0x100006238)
    #4 enki::TaskScheduler::StartThreads() TaskScheduler.cpp:298 (ParallelSum:x86_64+0x100005901)
    #5 enki::TaskScheduler::Initialize(unsigned int) TaskScheduler.cpp:924 (ParallelSum:x86_64+0x10000a687)
    #6 main ParallelSum.cpp:136 (ParallelSum:x86_64+0x10000231a)

SUMMARY: ThreadSanitizer: data race TaskScheduler.cpp:412 in enki::TaskScheduler::TryRunTask(unsigned int, unsigned int, unsigned int&)
==================
ThreadSanitizer report breakpoint hit. Use 'thread info -s' to get extended information about the report.

This reports that reading subTask.pTask->m_RangeToRun is a data race:

bool TaskScheduler::TryRunTask( uint32_t threadNum_, uint32_t priority_, uint32_t& hintPipeToCheck_io_ )
{
    // ...

    if( subTask.pTask->m_RangeToRun < partitionSize )
    {
        SubTaskSet taskToRun = SplitTask( subTask, subTask.pTask->m_RangeToRun );
    }

    // ...
}

Declaring ParallelSumTaskSet m_ParallelSumTaskSet; inside struct ParallelReductionSumTaskSet

struct ParallelReductionSumTaskSet : ITaskSet
{
    ParallelSumTaskSet m_ParallelSumTaskSet;
    uint64_t           m_FinalSum;

    ParallelReductionSumTaskSet( uint32_t size_ ) : m_ParallelSumTaskSet( size_ ), m_FinalSum(0)
    {
        m_ParallelSumTaskSet.Init( g_TS.GetNumTaskThreads() );
    }

    virtual void ExecuteRange( TaskSetPartition range_, uint32_t threadnum_ )
    {
        g_TS.AddTaskSetToPipe( &m_ParallelSumTaskSet );
        g_TS.WaitforTask( &m_ParallelSumTaskSet );

        for( uint32_t i = 0; i < m_ParallelSumTaskSet.m_NumPartialSums; ++i )
        {
            m_FinalSum += m_ParallelSumTaskSet.m_pPartialSums[i].count;
        }
    }
};

will initialize the inherited m_RangeToRun member in the ITaskSet constructor:

    class ITaskSet : public ICompletable
    {
    public:
        ITaskSet()
            : m_SetSize(1)
            , m_MinRange(1)
            , m_RangeToRun(1)
        {}
    };

I am not an expert in this field, but it looks like a potential false positive, because TryRunTask is executed only after AddTaskSetToPipe.

I try to keep our software clean of all sanitizer reports, so that I can catch real bugs ;)
For this reason, if this or other reports turn out to be safe, I suggest adding annotations that disable TSAN where appropriate (using no_sanitize("thread")).

Does it make sense for me to report all data races found in the TSAN output?

Oh, and thanks for your excellent work on the library :)

dougbinks self-assigned this Jan 17, 2020
dougbinks (Owner) commented:

As you note, this is a false positive, as the constructor for ITaskSet is called before the task is launched via AddTaskSetToPipe and then TryRunTask.

This might be a bug in ThreadSanitizer (it's still in beta); I'm wondering if looping over the test causes it to see a read from the previous iteration before a write from the next. Would you consider posting this to https://github.com/google/sanitizers/wiki/ThreadSanitizerCppManual as a potential issue?

Marking enkiTS functions with __attribute__((no_sanitize("thread"))) in a cross-compiler compatible way is potentially a fair amount of work, and it might lower the readability of the code, as I may need to mark up a fair few functions with a macro. I'll look into it when I get some time.
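
For illustration, a minimal sketch of what such a macro might look like (the ENKI_NO_TSAN name is hypothetical, not part of enkiTS; Clang exposes __has_feature(thread_sanitizer), while GCC defines __SANITIZE_THREAD__ under -fsanitize=thread):

    // Hypothetical cross-compiler annotation macro, shown for illustration only.
    #if defined(__has_feature)
        #if __has_feature(thread_sanitizer)
            #define ENKI_NO_TSAN __attribute__((no_sanitize("thread")))
        #endif
    #elif defined(__SANITIZE_THREAD__)
        #define ENKI_NO_TSAN __attribute__((no_sanitize("thread")))
    #endif
    #ifndef ENKI_NO_TSAN
        #define ENKI_NO_TSAN // expands to nothing when TSAN is off or unsupported
    #endif

Each flagged function would then need the macro in its declaration, which is the readability cost mentioned above.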

If you do get further reports, do let me know - I'll also look into adding ThreadSanitizer to my internal testing.

dougbinks (Owner) commented:

Thanks for the report by the way!

Pagghiu (Author) commented Jan 20, 2020

I have been analyzing the reports some more, and I've found that there is only one synchronization point that ThreadSanitizer complains about.
TSAN complains about a data race on m_Flags during the CAS loop inside ReaderTryReadBack.
According to TSAN, it is possible for ReaderTryReadBack to access a value in m_Buffer that has not yet been flagged with FLAG_CAN_READ by WriterTryWriteFront.
Adding acquire and release semantics when storing / loading m_Flags fixes it.
Sadly I don't yet have enough knowledge to prove that this is a correct modification to make.
I am relying on the intuition I have built up around these lock-free techniques while studying them, using TSAN as a sort of "validator".

In detail, these are the changes, in case you're interested:

    template<uint8_t cSizeLog2, typename T> inline
    bool LockLessMultiReadPipe<cSizeLog2,T>::ReaderTryReadBack( T* pOut )
    {
        // ...

        while( true )
        {
            // ...
            uint32_t previous = FLAG_CAN_READ;
            bool bSuccess = m_Flags[ actualReadIndex ].compare_exchange_strong( previous, FLAG_INVALID, std::memory_order_acquire );
            if( bSuccess )
            {
                break;
            }
            // ...
        }

        // ...
        return true;
    }

and

    template<uint8_t cSizeLog2, typename T> inline
    bool LockLessMultiReadPipe<cSizeLog2,T>::WriterTryWriteFront( const T& in )
    {
        // ...
        m_Buffer[ actualWriteIndex ] = in;
        m_Flags[  actualWriteIndex ].store( FLAG_CAN_READ, std::memory_order_release );
        // ...

        return true;
    }

dougbinks (Owner) commented:

Thanks - I'll take a look at that.

dougbinks (Owner) commented:

I think this is a false positive, as I use an acquire / release thread fence, but if it silences the warning it may be worth implementing even if it is not an issue - I'll check that it doesn't adversely affect performance first.
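
For reference, a hedged sketch of the fence-based pattern being described (illustrative only, not the exact enkiTS code). The standalone fences make the relaxed flag operations sufficient for correct publication in C++, but ThreadSanitizer does not model std::atomic_thread_fence, so it still reports a race:

    #include <atomic>
    #include <cstdint>

    static uint32_t              g_data;
    static std::atomic<uint32_t> g_ready{ 0 };

    void PublishWithFence( uint32_t value )
    {
        g_data = value;
        std::atomic_thread_fence( std::memory_order_release ); // orders g_data before the flag store
        g_ready.store( 1, std::memory_order_relaxed );
    }

    bool ConsumeWithFence( uint32_t* pOut )
    {
        if( g_ready.load( std::memory_order_relaxed ) == 1 )
        {
            std::atomic_thread_fence( std::memory_order_acquire ); // orders the flag load before g_data
            *pOut = g_data;
            return true;
        }
        return false;
    }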

Pagghiu (Author) commented Jan 21, 2020

I can confirm that with the latest changes you've pushed (commits 71cd321 through 49d2de8 on the dev branch), TSAN no longer complains about data races in my tests 😄.

If you plan to land these changes on master, then we could also close this issue.

dougbinks (Owner) commented:

I'll be landing these on master after some testing, and will close this issue once I do - thanks for the reports!

dougbinks (Owner) commented:

I've completed testing and am merging to main & closing. Once again, thanks for the reports!

dougbinks (Owner) commented:

Just wanted to add an additional comment on ThreadSanitizer (TSAN) for future reference:

It appears the issue of TSAN not recognising thread fences is known - see about 3/4 of the way down this thread about TBB and TSAN.
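
A common workaround in this situation (a hedged sketch under the assumption that per-operation orderings are acceptable; none of these names are enkiTS code) is to detect TSAN at compile time and substitute acquire/release orderings, which TSAN does model, for the standalone fences it cannot see:

    #include <atomic>

    // Hypothetical compile-time switch: under TSAN, skip the standalone fences
    // and rely on per-operation acquire/release ordering instead.
    #if defined(__has_feature)
        #if __has_feature(thread_sanitizer)
            #define TSAN_ENABLED 1
        #endif
    #endif

    #ifdef TSAN_ENABLED
        constexpr std::memory_order kStoreOrder = std::memory_order_release;
        constexpr std::memory_order kLoadOrder  = std::memory_order_acquire;
        inline void ReleaseFence() {}
        inline void AcquireFence() {}
    #else
        constexpr std::memory_order kStoreOrder = std::memory_order_relaxed;
        constexpr std::memory_order kLoadOrder  = std::memory_order_relaxed;
        inline void ReleaseFence() { std::atomic_thread_fence( std::memory_order_release ); }
        inline void AcquireFence() { std::atomic_thread_fence( std::memory_order_acquire ); }
    #endif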
