Decrement the inflight counter on ConnectionRefused
#184
I decided to take a swing at #169 after adding periodic forced DB failovers (chaos engineering ftw) on top of the occasional real-world DB failover. Both showed that my app would never recover after the failover completed.
After a lot of debugging, I discovered this was only happening when I explicitly set `max_pool_size`. I noticed that the number of `DB::ConnectionRefused` exceptions I caught was exactly that number, after which they never occurred again for the life of the process. So I started running `pp db.pool` in a loop and noticed that `inflight` was stuck at that value despite there being no in-flight connections. The connection attempts were failing instantly because the failover hadn't yet completed, but `@inflight` was never decremented, so it stayed equal to `@max_pool_size` and the pool had no way to know. After those 30 failed attempts it never tried to reconnect because `can_increase_pool?` could never return truthy. It would just sit there waiting for a connection to become available, and that never happened because `@total` was empty and there were no actual in-flight connections.

I just started using this branch in my app and it now works after a database failover.
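To make the failure mode easier to follow, here is a toy model of the counter leak. It is a sketch in Python, not the actual pool code: `ToyPool`, `PoolExhausted`, `simulate`, and the `fix_applied` flag are hypothetical names, and the exact shape of the `can_increase_pool` check (including treating a `max_pool_size` of 0 as "unlimited") is my assumption about how the pool behaves rather than a copy of it.

```python
class ConnectionRefused(Exception):
    """Stand-in for DB::ConnectionRefused."""


class PoolExhausted(Exception):
    """Stand-in for the real pool blocking until a connection is released."""


class ToyPool:
    def __init__(self, max_pool_size, factory, fix_applied):
        self.max_pool_size = max_pool_size
        self.factory = factory          # callable that opens a connection
        self.fix_applied = fix_applied  # hypothetical switch: with or without this change
        self.total = []                 # established connections
        self.inflight = 0               # connection attempts currently in progress

    def can_increase_pool(self):
        # Assumption: 0 means "no limit", which would explain why the bug
        # only shows up when max_pool_size is set explicitly.
        if self.max_pool_size == 0:
            return True
        return len(self.total) + self.inflight < self.max_pool_size

    def checkout(self):
        if not self.can_increase_pool():
            # The real pool would wait here for a connection to be released.
            raise PoolExhausted()
        self.inflight += 1
        try:
            conn = self.factory()
        except ConnectionRefused:
            if self.fix_applied:
                # The change in this PR: a refused attempt is no longer in flight.
                self.inflight -= 1
            raise
        self.inflight -= 1
        self.total.append(conn)
        return conn


def simulate(fix_applied, max_pool_size=3):
    """Return (refused_count, inflight_after_outage, recovered_after_failover)."""
    db_up = False

    def factory():
        if not db_up:
            raise ConnectionRefused()
        return object()

    pool = ToyPool(max_pool_size, factory, fix_applied)
    refused = 0
    # Try to connect a few more times than the pool size while the DB is down.
    for _ in range(max_pool_size + 2):
        try:
            pool.checkout()
        except ConnectionRefused:
            refused += 1
        except PoolExhausted:
            pass
    inflight_after_outage = pool.inflight

    db_up = True  # the failover completes
    try:
        pool.checkout()
        recovered = True
    except PoolExhausted:
        recovered = False
    return refused, inflight_after_outage, recovered
```

Without the decrement, `simulate(False)` sees exactly `max_pool_size` refused attempts, leaves `inflight` pinned at that value, and cannot recover once the database is back, which matches what I saw in `pp db.pool`. With the decrement, `simulate(True)` ends the outage with `inflight` at 0 and the next checkout succeeds.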
This is the same problem #170 is trying to resolve, and I mentioned in this comment on that PR that it sounded like the existing retry mechanism wasn't working as intended. I also used the reproduction instructions @robcole posted in #169 and the app recovers immediately with this change.
Fixes #169
Closes #170