-
Notifications
You must be signed in to change notification settings - Fork 765
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce function calling overhead by ~25% in ReferencePool::update_counts: #1608
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! These are some really great optimizations which many people will be happy with. I think there's a potential deadlock though (which we can avoid), see comment below.
src/gil.rs
Outdated
unsafe { ffi::Py_INCREF(ptr.as_ptr()) }; | ||
} | ||
|
||
for ptr in swap_vec_with_lock!(self.pointers_to_decref) { | ||
for ptr in swap_vec!(ops.1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to be careful here. Decrementing references can run arbitrary Python drop code, which can lead to GIL release which may eventually deadlock because we hold the lock here. (I think this mutex is also non reentrant so may deadlock on a single thread.)
After swapping ops.1
we should release the lock before decreasing any references.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, should be good now.
…nts: 1) Place both increfs and decrefs behind a single mutex, rather than two. Even uncontended mutexes aren't free to acquire. 2) Keep a dirty tracking bool to avoid acquiring any mutexes at all in the common case of no modifications (because everything happened with the GIL held)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks very much 😌
Looks like I was late for the party... Very interesting to see this optimization works, thanks. |
Benchmarks
Before
With single lock
With both single lock and dirty flag