
RxAtomic causes Thread Sanitiser race conditions #1853

Closed
andyj-at-aspin opened this issue Jan 15, 2019 · 24 comments

@andyj-at-aspin

Short description of the issue:

The newly added AtomicInt.swift causes data race reports from the Xcode Thread Sanitiser when running in DEBUG on iOS Simulator

Expected outcome:

No reports of data races.

What actually happens:

Any subroutine wrapped in TRACE_RESOURCES that increments or decrements counters can trigger Thread Sanitiser data race reports.

Self-contained code example that reproduces the issue:

// `atomicTester` is a property on the enclosing type
var atomicTester = AtomicInt(0)

for i in 0...100 {
    DispatchQueue.global(qos: .background).async {
        if i % 2 == 0 {
            self.atomicTester.increment()
        } else {
            self.atomicTester.decrement()
        }
    }
}

RxSwift/RxCocoa/RxBlocking/RxTest version/commit

4.4.0

Platform/Environment

  • [x] iOS
  • [x] macOS
  • [x] tvOS
  • [x] watchOS
  • [x] playgrounds

How easy is it to reproduce? (chances of successfully reproducing after running the self-contained code)

  • [x] easy, 100% repro
  • [ ] sometimes, 10%-100%
  • [ ] hard, 2% - 10%
  • [ ] extremely hard, 0% - 2%

Xcode version:

10.1 (10B61)

⚠️ Fields below are optional for general issues or in case those questions aren't related to your issue, but filling them out will increase the chances of getting your issue resolved. ⚠️

Installation method:

  • [ ] CocoaPods
  • [ ] Carthage
  • [x] Git submodules

I have multiple versions of Xcode installed:
(so we can know if this is a potential cause of your issue)

  • [ ] yes (which ones)
  • [x] no

Level of RxSwift knowledge:
(this is so we can understand your level of knowledge
and formulate the response in an appropriate manner)

  • [ ] just starting
  • [ ] I have a small code base
  • [x] I have a significant code base
@andyj-at-aspin
Author

For what it's worth, I am in no way suggesting that AtomicInt is not itself thread safe; I can see it wraps stdatomic's functions. However, the fact that the Swift functions AtomicInt.increment() and AtomicInt.decrement() are declared as mutating is tripping the Thread Sanitiser up.

Because events in my source project are triggered from multiple sources and are often observed on other threads (i.e. the main one), the new resource tracing checks constantly trip the sanitiser. I have had to resort to disabling all sanitiser checks for the whole RxSwift framework in order to keep checking my own project.
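
To make the claim above concrete, here is a minimal sketch (illustrative only, not the actual RxAtomic source) of why a value type with mutating operations trips the Thread Sanitiser at the call site:

// Illustrative sketch, not the actual RxAtomic source: a value type whose
// operations are declared `mutating`. Each call to a mutating method is a
// write access to the enclosing stored property, so concurrent calls look
// like conflicting writes to the sanitiser regardless of what the body does.
struct MutatingCounter {
    private var value: Int32 = 0

    mutating func increment() {
        // imagine this line is an atomic fetch-add implemented in C
        value += 1
    }
}

Calling increment() on a shared MutatingCounter from several threads, as in the repro above, is reported because each call counts as a write to the property itself.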

@freak4pc
Member

Is this basically a false positive of Thread Sanitizer?

@andyj-at-aspin
Author

andyj-at-aspin commented Jan 15, 2019

As I said above, it is, but to shut it up I have to disable all sanitiser checks for the whole RxSwift framework. Unfortunately it's a 'feature' of Swift, unless there's a way to have increment() and decrement() call sites that aren't 'mutating'.

@kzaher
Member

kzaher commented Jan 18, 2019

Ok, should we ditch RxAtomic, introduce NSRecursiveLock and just be done with it? Is there any other viable alternative?

We will probably be slightly slower, but I don't think we should be more than a few percent slower.
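
For reference, a minimal sketch of what an NSRecursiveLock-backed counter could look like (illustrative only; the name LockedInt and the exact shape are assumptions, not a committed implementation):

import Foundation

// Illustrative sketch of an NSRecursiveLock-backed counter.
final class LockedInt {
    private let lock = NSRecursiveLock()
    private var value: Int32

    init(_ initialValue: Int32) {
        self.value = initialValue
    }

    @discardableResult
    func increment() -> Int32 {
        lock.lock(); defer { lock.unlock() }
        let oldValue = value
        value += 1
        return oldValue
    }

    @discardableResult
    func decrement() -> Int32 {
        lock.lock(); defer { lock.unlock() }
        let oldValue = value
        value -= 1
        return oldValue
    }
}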

@freak4pc
Member

Did we ever benchmark how bad it would be? I imagine it shouldn't be too noticeable on >95% of systems.

@kzaher
Member

kzaher commented Jan 18, 2019

I guess the difference depends on the concrete chain being tested, so it's hard to quantify the exact impact. The smaller the Swift overhead, the bigger the impact is.

We can rerun the couple of performance tests we have and assess the difference there.

@LucianoPAlmeida
Contributor

Hey @kzaher
Maybe pthread APIs like pthread_mutex_t/pthread_rwlock_t or os_unfair_lock could be viable options here? Would NSRecursiveLock work on Linux? Not sure.

@freak4pc
Member

There are some interesting thoughts here: http://www.cocoawithlove.com/blog/2016/06/02/threads-and-mutexes.html

@LucianoPAlmeida
Contributor

Interesting article @freak4pc :))
Personally, I'm always biased toward pthread_mutex because it is fast and can behave the same as an NSLock or NSRecursiveLock if we use the PTHREAD_MUTEX_RECURSIVE attr.
Also, there are some more benchmarks here: http://www.vadimbulavin.com/benchmarking-locking-apis/
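
For context, configuring a pthread mutex as recursive looks roughly like this (a minimal sketch assuming the Darwin pthread APIs; the RecursiveMutex name is illustrative):

import Foundation

// Minimal sketch of a recursive pthread mutex, configured with
// PTHREAD_MUTEX_RECURSIVE so the owning thread can re-lock it,
// similar in behaviour to NSRecursiveLock.
final class RecursiveMutex {
    private var mutex = pthread_mutex_t()

    init() {
        var attr = pthread_mutexattr_t()
        pthread_mutexattr_init(&attr)
        pthread_mutexattr_settype(&attr, Int32(PTHREAD_MUTEX_RECURSIVE))
        pthread_mutex_init(&mutex, &attr)
        pthread_mutexattr_destroy(&attr)
    }

    deinit {
        pthread_mutex_destroy(&mutex)
    }

    func lock() { pthread_mutex_lock(&mutex) }
    func unlock() { pthread_mutex_unlock(&mutex) }
}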

@kzaher
Member

kzaher commented Jan 21, 2019

I've tried running performance tests with the following implementation of AtomicInt.

I got 10-17% worse performance on Benchmarks. (Mostly around 10%)

https://github.com/ReactiveX/RxSwift/tree/mutex_atomic

Feel free to try to improve this.

Anyway, if there is no other alternative, I'll merge this, but it seems a bit wasteful :(

final class AtomicInt {
    private var mutex: pthread_mutex_t
    private var value: Int32

    init(_ initialValue: Int32) {
        self.value = initialValue

        var attr = pthread_mutexattr_t()
        pthread_mutexattr_init(&attr); defer { pthread_mutexattr_destroy(&attr) }

        self.mutex = pthread_mutex_t()
        guard pthread_mutex_init(&self.mutex, &attr) == 0 else { fatalError() }
    }

    @discardableResult
    func increment() -> Int32 {
        guard pthread_mutex_lock(&self.mutex) == 0 else { fatalError() }
        let oldValue = self.value
        self.value += 1
        guard pthread_mutex_unlock(&self.mutex) == 0 else { fatalError() }
        return oldValue
    }

    @discardableResult
    func decrement() -> Int32 {
        guard pthread_mutex_lock(&self.mutex) == 0 else { fatalError() }
        let oldValue = self.value
        self.value -= 1
        guard pthread_mutex_unlock(&self.mutex) == 0 else { fatalError() }
        return oldValue
    }

    func isFlagSet(_ mask: Int32) -> Bool {
        guard pthread_mutex_lock(&self.mutex) == 0 else { fatalError() }
        let oldValue = self.value
        guard pthread_mutex_unlock(&self.mutex) == 0 else { fatalError() }
        return oldValue & mask != 0
    }

    @discardableResult
    func fetchOr(_ mask: Int32) -> Int32 {
        guard pthread_mutex_lock(&self.mutex) == 0 else { fatalError() }
        let oldValue = self.value
        self.value |= mask
        guard pthread_mutex_unlock(&self.mutex) == 0 else { fatalError() }
        return oldValue
    }

    func load() -> Int32 {
        guard pthread_mutex_lock(&self.mutex) == 0 else { fatalError() }
        let oldValue = self.value
        guard pthread_mutex_unlock(&self.mutex) == 0 else { fatalError() }
        return oldValue
    }
}
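
A quick usage sketch against the class above (illustrative): because this AtomicInt is a reference type, the repro from the issue no longer involves mutating call sites.

import Dispatch

// Illustrative usage of the mutex-backed AtomicInt above: 100 balanced
// increments/decrements from concurrent threads should end at zero.
let counter = AtomicInt(0)
DispatchQueue.concurrentPerform(iterations: 100) { i in
    if i % 2 == 0 {
        counter.increment()
    } else {
        counter.decrement()
    }
}
assert(counter.load() == 0)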

@freak4pc
Member

The performance hit is quite large - I didn't expect it to be this bad.
Should we really bother replacing it if it's just a false positive? (the thread sanitizer warning, that is).

Also, did you make an attempt on NSRecursiveLock? Was it even slower?

@kzaher
Member

kzaher commented Jan 21, 2019

@freak4pc Feel free to play with it. But I'm not sure I know how to do better than this. AtomicInt was really fast :)

I'm not sure there is a way to silence the thread sanitizer? Somehow I doubt it.

@freak4pc
Member

Yeah I understand the silencing part - but is it really worth a 10% performance downgrade just for a warning? I guess I mainly wonder if it would be at all noticeable. I'll see if I can actually play with it tomorrow.

What did you use for the benchmark?

@kzaher
Member

kzaher commented Jan 21, 2019

There is a benchmark target inside the Rx project. I think it's called Benchmark or something really similar to that. It's not the Microoptimizations target.

@LucianoPAlmeida
Contributor

Hey @kzaher, @freak4pc
I was playing with this and added two things that I thought would give a performance improvement, but in the end there was none :(( (both are sketched below)

  1. Tried using the overflow-unchecked variants (&+= and &-=) of += and -=. Thought it would be an improvement, but it really wasn't :(( or I just did the benchmark wrong 🤣
  2. Added the @inline attribute to the AtomicInt functions, although since it is an internal final class, I think the optimizer would have done it anyway in -O. But just to give it a shot.

Also, I noticed that the Benchmarks target was using the Debug config for the Test action, which I think compiles with -Onone and doesn't give us a fair benchmark, so I changed it to Release-Test to compile with -O.

The modifications are here https://github.com/LucianoPAlmeida/RxSwift/tree/mutex_atomic

Question: I'm not familiar with the usage of AtomicInt in the project, so maybe it is not required, but wouldn't it be good to add the PTHREAD_MUTEX_RECURSIVE attr on the mutex to avoid deadlock, just for safety?
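
For reference, the two tweaks could look roughly like this (a sketch applied to a mutex-backed counter; the TweakedAtomicInt name is illustrative and this is not necessarily the exact contents of the linked branch):

import Foundation

// Sketch of the two tweaks above: wrap-around arithmetic (&+=) to skip the
// overflow check, and an explicit inlining hint on the hot method.
final class TweakedAtomicInt {
    private var mutex = pthread_mutex_t()
    private var value: Int32

    init(_ initialValue: Int32) {
        self.value = initialValue
        pthread_mutex_init(&mutex, nil)
    }

    deinit { pthread_mutex_destroy(&mutex) }

    @discardableResult
    @inline(__always)
    func increment() -> Int32 {
        pthread_mutex_lock(&mutex)
        let oldValue = value
        value &+= 1 // &+= wraps on overflow instead of trapping
        pthread_mutex_unlock(&mutex)
        return oldValue
    }
}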

@LucianoPAlmeida
Contributor

Also, did you make an attempt on NSRecursiveLock? Was it even slower?

@freak4pc All the benchmarks I've seen on Darwin with Apple's Foundation show that it is slower. And on Linux with the Corelibs Foundation, NSRecursiveLock is implemented as a wrapper on top of pthread (see here), so there's no difference, I think :))

@kzaher
Member

kzaher commented Jan 24, 2019

I somehow doubt that we'll come close to the speed of the current version of AtomicInt :)

@freak4pc
Member

The real question is whether we should replace AtomicInt just because of a thread sanitizer false positive bug 🤔

@kzaher
Member

kzaher commented Jan 24, 2019

Does anybody know how the Thread Sanitizer actually works? Is there some way to plug into it and mark the memory as properly synchronized?

@LucianoPAlmeida
Contributor

@kzaher
I think __attribute__((no_sanitize("thread"))) does exactly that :))
You can see more info about it here

@LucianoPAlmeida
Contributor

LucianoPAlmeida commented Jan 24, 2019

Actually, reading more about it, maybe SanitizerSpecialCaseList would be the one to use here :)
Edit: Maybe not, because we would have to pass that blacklist file as a compilation option. So probably we can just annotate the functions with no_sanitize("thread") and we will be fine.

@LucianoPAlmeida
Contributor

I had some free time, so I did #1860; maybe it can be useful :))

@kzaher
Member

kzaher commented Jan 31, 2019

@andyj-at-aspin I've merged fixes for TSan; please check the master branch. I would like to close this issue.

Thnx a lot @LucianoPAlmeida .

@andyj-at-aspin
Author

The issue no longer occurs with pulls from the master branch and I have verified that RxSwift is very definitely making use of the restructured AtomicInt objects. Thanks so much, guys. I can't wait for this to be included in a formal release.
