Implement an efficient mutex in std::sync
#11462
Conversation
```rust
impl StaticMutex {
    /// Attempts to grab this lock, see `Mutex::trylock`
    pub fn trylock<'a>(&'a self) -> Option<Guard<'a>> {
```
Minor nit: I think it would be more rust-y if this were named `try_lock()`.
Good point! I remembered we had things like `try_recv` and `try_send`, so I think it is certainly most appropriate as `try_lock`.
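Applied to the quoted fragment above, the rename would read as follows (only the method name changes; the body is elided as in the diff):

```rust
impl StaticMutex {
    /// Attempts to grab this lock, see `Mutex::try_lock`
    pub fn try_lock<'a>(&'a self) -> Option<Guard<'a>> {
        // ... body unchanged ...
    }
}
```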
Much of the justification can be found in the large implementation comment in the module itself.
Is it really necessary to have the "held" variable? Isn't it possible to use the pthread mutex itself to figure out whether it's held, by unconditionally try-locking after the push in lock(), and by unlocking before (not after, like in the current code!) the pop() in unlock()? Note that this probably needs an explicit memory fence in both paths. The invariant is of course that if the pthread lock is locked, everything on the queue will be woken up eventually. You would have the issue that "the memory of a mutex must be guaranteed to be valid until all unlocks have returned", but I don't think this is a problem in Rust, where you can't just free memory arbitrarily like in C and always have a reference count that protects shared memory.
If the point of this mutex is to be the mutex, then the clause I mentioned is incredibly important in my opinion. That's the exact problem that a mutex is trying to solve (wait for a time at which it's OK to do something). Most Rust code wouldn't be susceptible to that condition, but any code that is susceptible is really going to need it (and otherwise won't have a solution in libstd). You also can't pop outside of the lock because it's a single-consumer queue. I would love to get rid of the `held` variable.
The problem that the mutex is trying to solve is mainly making sure that the code protected by a single mutex is globally serialized, not that as soon as you unlock it and know that nothing else is going to lock it again you can kill the structure. Figuring out when you can kill the structure is generally the job of the reference count in the structure containing the mutex; I'm not even sure that all native OS mutexes provide that guarantee (EDIT: POSIX requires that it is provided, Windows seems to say nothing).

Rust code without unsafe code that breaks Rust semantics cannot be affected: to call unlock() you need to pass an &mut to the mutex, and if that &mut is reborrowed, then the original &mut it was reborrowed from will continue to be semantically valid just after the unlock() call. Thus the memory must continue to be valid after the unlock(), making it irrelevant whether unlock() touches the memory after unlocking.
That's a serious issue though. A possible solution is to wrap the q.pop() in an additional pthread mutex used only for that, and also keep an atomic int with the queue size, which would allow skipping taking the lock and popping when it is zero.
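A minimal sketch of that fix in modern Rust, assuming a `VecDeque` under a lock stands in for the lock-free intrusive MPSC queue (the type and method names here are illustrative, not from the PR):

```rust
use std::collections::VecDeque;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

struct WaitQueue<T> {
    len: AtomicUsize,          // atomic queue size, checked before locking
    queue: Mutex<VecDeque<T>>, // the extra lock serializing pop()
}

impl<T> WaitQueue<T> {
    fn push(&self, item: T) {
        self.queue.lock().unwrap().push_back(item);
        self.len.fetch_add(1, Ordering::Release);
    }

    // unlock() would call this: when the counter reads zero we can skip
    // taking the pop-side lock entirely, keeping the uncontended path cheap.
    fn pop_if_nonempty(&self) -> Option<T> {
        if self.len.load(Ordering::Acquire) == 0 {
            return None; // fast path: nothing queued
        }
        let item = self.queue.lock().unwrap().pop_front();
        if item.is_some() {
            self.len.fetch_sub(1, Ordering::Release);
        }
        item
    }
}
```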
BTW, it's possible to create a mutex providing that guarantee out of a mutex that doesn't, by boxing the non-guaranteeing mutex in an UnsafeArc and keeping a clone of the UnsafeArc in the lock guard for the mutex wrapper. But, as already argued, no proper Rust code can need that guarantee, and anyway the sensible way of doing things is to put the mutex and the data it protects inside an Arc or RWARC, where the Arc or RWARC reference count determines when to destroy the object independently of the mutex. So I think that guarantee is unimportant for Rust and should only be provided if it can be provided for free, which doesn't seem to be the case.
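In today's Rust the same trick looks roughly like this (a hypothetical sketch using `Arc` in place of the old `UnsafeArc`): the guard holds its own clone of the reference count, so the mutex's memory outlives the unlock even if the last owner handle is dropped concurrently.

```rust
use std::sync::{Arc, Mutex, MutexGuard};

struct GuardedMutex {
    inner: Arc<Mutex<()>>,
}

struct Guard<'a> {
    _locked: MutexGuard<'a, ()>, // dropped first: unlocks the mutex
    _keepalive: Arc<Mutex<()>>,  // dropped second: memory pinned through the unlock
}

impl GuardedMutex {
    fn lock(&self) -> Guard<'_> {
        Guard {
            _locked: self.inner.lock().unwrap(),
            _keepalive: Arc::clone(&self.inner),
        }
    }
}
```

As the comment itself argues, in safe Rust the guard's lifetime already ties it to a live borrow of the mutex, so the extra clone only matters for unsafe or raw-pointer uses.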
This removes all usage of `try_send_deferred` and all related functionality. Primarily, this builds all extra::sync primitives on top of `sync::Mutex` instead of `unstable::sync::Mutex`.
I think I have found a lockless algorithm that works. Note that it requires support for 64-bit atomics even on 32-bit platforms, and either limits tasks to 2^32 or requires 128-bit atomics on 64-bit platforms. While libstd doesn't currently implement those, 64-bit atomics are available on all modern 32-bit x86 and ARM systems (x86 >= Pentium Pro, ARM >= ARMv6).

The idea is that we hold the state in an atomic (u32, u32) pair called (queue_size, lockers). Lockers is the number of threads that are about to call lock() on the internal lock or that hold the internal lock; queue_size is the number of queued tasks. Essentially, the "held" variable is replaced by lockers != 0, and keeping queue_size in the same atomic value allows the held check and the enqueue to be done atomically. The "atomically" blocks are to be implemented with compare-and-swap, and the queue with the intrusive MPSC queue already written.
Transitions:
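A minimal sketch of these transitions in modern Rust, reconstructed from the description above (the internal pthread mutex, the actual MPSC queue, and the native/green task split are elided; the unlock path incorporates the fairness rule from the first edit below; the exact transition set is an assumption):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const LOCKERS_MASK: u64 = 0xFFFF_FFFF; // low u32: lockers
const QUEUE_UNIT: u64 = 1 << 32;       // high u32: queue_size

struct State(AtomicU64);

impl State {
    // lock(): if nobody holds or is acquiring the internal lock
    // (lockers == 0), become a locker; otherwise join the queue.
    // Returns true if the caller should take the internal lock,
    // false if it should enqueue itself and block.
    fn lock_transition(&self) -> bool {
        let mut cur = self.0.load(Ordering::Relaxed);
        loop {
            let next = if cur & LOCKERS_MASK == 0 {
                cur + 1          // lockers += 1: we will take the internal lock
            } else {
                cur + QUEUE_UNIT // queue_size += 1: we will enqueue and block
            };
            match self.0.compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Relaxed) {
                Ok(_) => return cur & LOCKERS_MASK == 0,
                Err(seen) => cur = seen, // raced with another thread; retry
            }
        }
    }

    // unlock(): hand off to a queued task only when we are the sole locker;
    // otherwise just release the internal lock so a blocked native task
    // can take it. Returns true if the caller should pop the queue and
    // wake the popped task.
    fn unlock_transition(&self) -> bool {
        let mut cur = self.0.load(Ordering::Relaxed);
        loop {
            let (queued, lockers) = (cur >> 32, cur & LOCKERS_MASK);
            let (next, handoff) = if queued > 0 && lockers == 1 {
                (cur - QUEUE_UNIT, true) // pop; the new owner inherits our locker slot
            } else {
                (cur - 1, false)         // lockers -= 1; just unlock
            };
            match self.0.compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Relaxed) {
                Ok(_) => return handoff,
                Err(seen) => cur = seen,
            }
        }
    }
}
```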
EDIT: changed to now also be globally fair. The idea is that if lockers > 1, we don't pop the queue but rather just unlock, which means that queued native tasks get run; new native tasks instead queue up because queue_size != 0.
EDIT 2: added try_lock, and fixed lock to use it so it doesn't have a race that caused blocking on green tasks.
EDIT 3: and if we replace the MPSC queue implementation with one using list reversal, we can remove the last yields.
EDIT 4: implemented this in pull #11520; I suggest considering that pull request as an improvement over the code in this one (assuming the algorithm indeed works).
Closing, adding another crucial
These commits introduce `Mutex` and `StaticMutex` types in libstd as first-class primitives. One of the major goals of this implementation is to be efficient, with pthread mutexes as the performance benchmark. Most of the comments can be found in the mutex module itself, so I won't repeat the details here. The good news is that this allows for a different building block for `libextra`, which means that `try_send_deferred` can now be completely removed! In profiling, native performance is the same as pthreads performance, and green performance is about 3x faster than pthreads, pending fixes to the scheduler so that it doesn't attempt to fan out all workloads as quickly as possible.
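For comparison, the shape this API eventually settled into in today's `std::sync` (a minimal modern example, not the code from this PR):

```rust
use std::sync::Mutex;

fn main() {
    let m = Mutex::new(0);

    // Blocking acquisition; the guard unlocks when it goes out of scope.
    *m.lock().unwrap() += 1;

    // Non-blocking attempt, using the `try_lock` naming settled on above.
    if let Ok(mut guard) = m.try_lock() {
        *guard += 1;
    }

    assert_eq!(*m.lock().unwrap(), 2);
}
```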