fix: avoid deadlocks during shutdown #1397
Avoid deadlocks caused by the SessionPool both controlling the lifetime of the background threads in its destructor and also being destroyed by those threads. The background threads are now owned by the ConnectionImpl.
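The ownership change described above can be illustrated with a minimal, self-contained sketch. This is not the library's code: `CompletionQueue`, `BackgroundThreads`, and `SessionPool` below are simplified, hypothetical stand-ins that only borrow the names used in this PR, and `DestroyPoolOnBackgroundThread()` is an invented demo function. The point it tries to show is that a pool holding only a copyable queue handle can be destroyed from inside a callback on the background thread, because its destructor has no thread to join.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in: a copyable handle whose copies share one state.
class CompletionQueue {
 public:
  CompletionQueue() : impl_(std::make_shared<Impl>()) {}
  void Post(std::function<void()> task) {
    std::lock_guard<std::mutex> lk(impl_->mu);
    impl_->tasks.push_back(std::move(task));
    impl_->cv.notify_one();
  }
  // Runs tasks until Shutdown() was called and the queue is drained.
  void Run() {
    for (;;) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lk(impl_->mu);
        impl_->cv.wait(lk, [this] {
          return impl_->shutdown || !impl_->tasks.empty();
        });
        if (impl_->tasks.empty()) return;
        task = std::move(impl_->tasks.front());
        impl_->tasks.pop_front();
      }
      task();  // run outside the lock
    }
  }
  void Shutdown() {
    std::lock_guard<std::mutex> lk(impl_->mu);
    impl_->shutdown = true;
    impl_->cv.notify_all();
  }

 private:
  struct Impl {
    std::mutex mu;
    std::condition_variable cv;
    std::deque<std::function<void()>> tasks;
    bool shutdown = false;
  };
  std::shared_ptr<Impl> impl_;
};

// Hypothetical stand-in: the only object that ever joins the thread.
class BackgroundThreads {
 public:
  BackgroundThreads() : thread_([cq = cq_]() mutable { cq.Run(); }) {}
  ~BackgroundThreads() {
    // Joining from the background thread itself is the deadlock to avoid.
    assert(std::this_thread::get_id() != thread_.get_id());
    cq_.Shutdown();
    thread_.join();
  }
  CompletionQueue cq() const { return cq_; }

 private:
  CompletionQueue cq_;  // declared before thread_, so it is ready first
  std::thread thread_;
};

// The pool only holds a queue handle; its destructor joins nothing.
class SessionPool {
 public:
  explicit SessionPool(CompletionQueue cq) : cq_(std::move(cq)) {}

 private:
  CompletionQueue cq_;
};

bool DestroyPoolOnBackgroundThread() {
  std::promise<bool> result;  // declared first, so it outlives the thread
  auto done = result.get_future();
  auto const main_id = std::this_thread::get_id();
  BackgroundThreads background;  // the role ConnectionImpl plays in this PR
  auto pool = std::make_shared<SessionPool>(background.cq());
  std::weak_ptr<SessionPool> watcher = pool;
  background.cq().Post([pool = std::move(pool), &result, main_id]() mutable {
    pool.reset();  // drops the callback's reference on the background thread
    result.set_value(std::this_thread::get_id() != main_id);
  });
  bool const on_background = done.get();
  return on_background && watcher.expired();
}
```

If `SessionPool` owned `BackgroundThreads` instead, the `pool.reset()` inside the callback would run the join on the very thread being joined, which is the situation this change removes.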
Codecov Report
@@ Coverage Diff @@
## master #1397 +/- ##
==========================================
- Coverage 94.38% 94.35% -0.03%
==========================================
Files 190 190
Lines 15659 15679 +20
==========================================
+ Hits 14779 14794 +15
- Misses 880 885 +5
Continue to review full report at Codecov.
Awesome! Thanks! This LGTM, but please wait for @mr-salty to confirm that this should fix the issues he was seeing.
I'd considered this but I didn't think it was viable since `ConnectionImpl` doesn't actually own the `SessionPool`, it just has a `shared_ptr` to it. So, I don't think the order of operations you gave in chat is guaranteed - the `SessionPool` may outlive the `ConnectionImpl`, although it's possible that doesn't happen in any of our tests.
But, the one twist here I hadn't considered was passing `CompletionQueue` to the `SessionPool` instead of a pointer to `BackgroundThreads`. If we did the latter then `CompletionQueue` could end up dereferencing freed memory (we can't fix that with `shared_ptr<BackgroundThreads>` without running into the original issue). But, in this case, I believe we might end up with a `CompletionQueue` with no threads servicing it, is that ok? I recall having issues with that in the past.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @devbww, @mr-salty, and @scotthart)
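The scenario raised in this comment, a `SessionPool` that outlives the owner of the threads, can be sketched as follows. Everything here is hypothetical: `ThreadOwner` stands in for `BackgroundThreads` with the actual threads omitted, `Outcome` and `PoolOutlivesOwner()` are invented for the demo, and the policy of rejecting posts and discarding pending work after shutdown is an assumption, not a statement about how `google::cloud::CompletionQueue` behaves. The sketch only shows that a by-value queue handle stays valid memory after the owner is gone, while work submitted to it is no longer serviced.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical copyable queue handle; all copies share one Impl.
class CompletionQueue {
 public:
  CompletionQueue() : impl_(std::make_shared<Impl>()) {}

  // Returns false once the queue has been shut down (an assumed policy).
  bool Post(std::function<void()> task) {
    assert(task != nullptr);
    std::lock_guard<std::mutex> lk(impl_->mu);
    if (impl_->shutdown) return false;
    impl_->tasks.push_back(std::move(task));
    return true;
  }

  // Marks the queue closed and discards work that never ran.
  void Shutdown() {
    std::deque<std::function<void()>> discarded;
    {
      std::lock_guard<std::mutex> lk(impl_->mu);
      impl_->shutdown = true;
      discarded.swap(impl_->tasks);
    }
    // `discarded` is destroyed here, outside the lock.
  }

  std::size_t pending() const {
    std::lock_guard<std::mutex> lk(impl_->mu);
    return impl_->tasks.size();
  }

 private:
  struct Impl {
    std::mutex mu;
    std::deque<std::function<void()>> tasks;
    bool shutdown = false;
  };
  std::shared_ptr<Impl> impl_;
};

// Stands in for BackgroundThreads; the worker threads are omitted.
struct ThreadOwner {
  CompletionQueue cq;
  ~ThreadOwner() { cq.Shutdown(); }
};

class SessionPool {
 public:
  explicit SessionPool(CompletionQueue cq) : cq_(std::move(cq)) {}
  bool ScheduleRefresh() { return cq_.Post([] {}); }
  std::size_t pending() const { return cq_.pending(); }

 private:
  CompletionQueue cq_;  // a handle by value, not a pointer to the owner
};

struct Outcome {
  bool accepted_before;
  bool accepted_after;
  std::size_t pending_after;
};

Outcome PoolOutlivesOwner() {
  std::shared_ptr<SessionPool> pool;
  Outcome out{};
  {
    ThreadOwner owner;
    pool = std::make_shared<SessionPool>(owner.cq);
    out.accepted_before = pool->ScheduleRefresh();
  }  // owner destroyed: the queue is shut down and nothing services it
  out.accepted_after = pool->ScheduleRefresh();  // still safe to call
  out.pending_after = pool->pending();
  return out;
}
```

Had the pool stored a raw pointer to `ThreadOwner` instead, the second `ScheduleRefresh()` call would touch a destroyed object, which is the freed-memory concern mentioned above.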
> I'd considered this but I didn't think it was viable since `ConnectionImpl` doesn't actually own the `SessionPool`, it just has a `shared_ptr` to it. So, I don't think the order of operations you gave in chat is guaranteed - the `SessionPool` may outlive the `ConnectionImpl`,

Good point.

> although it's possible that doesn't happen in any of our tests.

I would argue that none of the callbacks inside `SessionPool` can block trying to join the threads for the simple reason that they never had access to the `BackgroundThreads` object in the first place.

> But, the one twist here I hadn't considered was passing `CompletionQueue` to the `SessionPool` instead of a pointer to `BackgroundThreads`. If we did the latter then `CompletionQueue` could end up dereferencing freed memory (we can't fix that with `shared_ptr<BackgroundThreads>` without running into the original issue).

Sure, but we don't do that...

> But, in this case, I believe we might end up with a `CompletionQueue` with no threads servicing it, is that ok?

During shutdown? I think it is. I mean, sure, we still may want to add the `SessionPool::Shutdown()` function to wait for all the timers and RPCs to finish, but this will not deadlock with or without that function.

> I recall having issues with that in the past.

There is the thing in **grpc::**CompletionQueue having to wait for all the active operations, but we fixed that in the shutdown for **google::cloud::**CompletionQueue.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @devbww and @scotthart)
I'm still a bit concerned about the possibility of an unserviced CompletionQueue but I'm not sure how to tease that out... and in any case this is an improvement over the current situation and will unblock me - and I can see if any issues arise with my pending changes.
Reviewable status: 0 of 6 files reviewed, all discussions resolved (waiting on @devbww and @scotthart)
I am happy to write the |
…nner#1397) Avoid deadlocks caused by the SessionPool both controlling the lifetime of the background threads in its destructor, and being also destructed by those threads. The background threads are now owned by the ConnectionImpl.