Consider all peers as potential candidates during pull-request in case of offline nodes #18333
All peers will be all 500 or 1000 nodes?
Sorry about the wording; we are still only sending 1 pull request. Instead of taking
LGTM, thanks for debugging the local-cluster tests
Pull request has been modified.
Codecov Report
@@            Coverage Diff            @@
##           master   #18333     +/-   ##
=========================================
- Coverage    82.4%    82.3%   -0.1%
=========================================
  Files         434      434
  Lines      121329   121335      +6
=========================================
- Hits        99978    99953     -25
- Misses      21351    21382     +31
v1.7 backport?
Problem
When we receive contact info from an offline node in our list of peers, we assign it a much higher weight than the other peers because we have not sent a pull request to it yet. As a result, the sampling always selects this peer as the one to send pull requests to. However, because the node is offline, it never passes the ping check, so we never actually send it a pull request and never update its timestamp. Its weight therefore stays high, and we never get a chance to send pull requests to other nodes.
Summary of Changes
After assigning the weights, instead of performing random sampling, we use a weighted shuffle. This ensures that if the higher-weighted nodes fail the ping check, we can still send pull requests to the other nodes.
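The idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `XorShift64`, `weighted_shuffle`, `select_pull_peer`, and the `passes_ping` callback are hypothetical names chosen for the example, and the tiny xorshift generator only stands in for a real RNG so the sketch needs no external crates. The shuffle draws peers one at a time with probability proportional to their weight, without replacement, so every peer appears somewhere in the order and a heavy offline peer can only occupy one slot.

```rust
/// Tiny xorshift64 generator (hypothetical helper, assumption for this sketch;
/// real code would use a proper RNG).
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Returns every index of `weights` exactly once, ordered by weighted
/// sampling without replacement. Weights of 0 are treated as 1 so that
/// every peer stays reachable. Assumes the weights sum without overflow.
pub fn weighted_shuffle(weights: &[u64], rng: &mut XorShift64) -> Vec<usize> {
    let mut remaining: Vec<(usize, u64)> =
        weights.iter().map(|&w| w.max(1)).enumerate().collect();
    let mut order = Vec::with_capacity(remaining.len());
    while !remaining.is_empty() {
        let total: u64 = remaining.iter().map(|&(_, w)| w).sum();
        // Pick a point in [0, total) and find the peer whose weight span covers it.
        let mut target = rng.next_u64() % total;
        let mut pick = remaining.len() - 1;
        for (pos, &(_, w)) in remaining.iter().enumerate() {
            if target < w {
                pick = pos;
                break;
            }
            target -= w;
        }
        order.push(remaining.remove(pick).0);
    }
    order
}

/// Walks the shuffled order and returns the first peer that passes the
/// ping check, so one unreachable high-weight peer cannot block the rest.
pub fn select_pull_peer(
    weights: &[u64],
    rng: &mut XorShift64,
    passes_ping: impl Fn(usize) -> bool,
) -> Option<usize> {
    weighted_shuffle(weights, rng)
        .into_iter()
        .find(|&i| passes_ping(i))
}
```

With weights such as `[1_000_000, 1, 1, 1]` and a ping check that rejects peer 0, plain weighted sampling of a single peer would almost always return peer 0 and then fail, whereas `select_pull_peer` falls through to one of peers 1 to 3. Still only one pull request is sent; the shuffle just determines which candidates are tried and in what order.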
Also re-enables test_no_optimistic_confirmation_violation_with_tower and test_optimistic_confirmation_violation_without_tower.
Fixes #18279