Make sure nodes don't hammer each other even when reserved #8901

tomaka · 2021-05-25T09:51:48Z

Right now, when node A has node B as reserved, node A will try really hard to open a substream to node B.
Even if node B refuses the substream, node A will almost immediately try again. This results in a spam of substream open attempts, which unnecessarily uses bandwidth and CPU.

If we have a lot of reserved nodes, for example because of the collation and validation substreams in Polkadot, and most of them refuse the substream, this spam can be considerable. They're not supposed to refuse the substream in normal situations, but with our recent revert to 0.8.30, they do in practice.

This PR adds a 10 seconds "ban" during which we don't try to re-open a substream after it has failed to open, even if the node is reserved.

kpp

Is there some kind of dynamic backoff with randomness? If all nodes got disconnected, we will spam with requests exactly in 10 seconds.

tomaka · 2021-05-25T10:12:00Z

Indeed, I changed it to a random 5 to 10 seconds interval. We actually use randomness everywhere else already.

andresilva · 2021-05-25T11:03:31Z

bot merge

ghost · 2021-05-25T11:03:34Z

Trying merge.

* Make sure nodes don't hammer each other even when reserved * Make the ban random

burdges · 2021-05-25T13:43:11Z

Why a fixed or random back off instead of a progressively larger one?

rphmeier · 2021-05-25T14:35:29Z

I foresee this potentially causing some issues with collator networking. If a validator initially refuses a connection from a collator due to a race in when they see the next relay chain block, then the collator won't attempt to connect for 10 seconds. That may lead to unnecessary gaps in the parachain.

kpp · 2021-05-25T14:49:51Z

@burdges, @rphmeier, as I see both or your suggestions can be fixed with an exponential backoff starting from, e.g. a rand 1-2s, insreasing by a multiplier of 2^N?

tomaka · 2021-05-25T17:09:39Z

If a validator initially refuses a connection from a collator due to a race in when they see the next relay chain block

Validators never refuse substream from collators unless they're full.
They could accept the substream and then immediately close it, but that's a different code path.

rphmeier · 2021-05-25T18:47:22Z

Ok, that is true. The way that @Lldenaurois is designing the collator protocol is to maintain a peer-set which is larger than the actual reservoir of peers we accept. As we add PreVF handshake logic we'd need to maintain that guarantee, so we don't reject connections outright.

…h#8901) * Make sure nodes don't hammer each other even when reserved * Make the ban random

Make sure nodes don't hammer each other even when reserved

f48217c

tomaka added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels May 25, 2021

andresilva approved these changes May 25, 2021

View reviewed changes

tomaka requested review from ordian and kpp May 25, 2021 09:56

andresilva added a commit to paritytech/polkadot that referenced this pull request May 25, 2021

test paritytech/substrate#8901

a0b7af8

kpp reviewed May 25, 2021

View reviewed changes

Make the ban random

8b385d6

kpp approved these changes May 25, 2021

View reviewed changes

ghost merged commit 3c2bd9a into paritytech:master May 25, 2021

tomaka deleted the try-fix-cpu branch May 25, 2021 11:13

s3krit pushed a commit that referenced this pull request May 25, 2021

Make sure nodes don't hammer each other even when reserved (#8901)

d6cdfb7

* Make sure nodes don't hammer each other even when reserved * Make the ban random

nazar-pc pushed a commit to autonomys/substrate that referenced this pull request Aug 8, 2021

Make sure nodes don't hammer each other even when reserved (paritytec…

d185759

…h#8901) * Make sure nodes don't hammer each other even when reserved * Make the ban random

jordy25519 pushed a commit to cennznet/substrate that referenced this pull request Sep 17, 2021

Make sure nodes don't hammer each other even when reserved (paritytec…

fb724d0

…h#8901) * Make sure nodes don't hammer each other even when reserved * Make the ban random

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make sure nodes don't hammer each other even when reserved #8901

Make sure nodes don't hammer each other even when reserved #8901

tomaka commented May 25, 2021 •

edited

Loading

kpp left a comment

tomaka commented May 25, 2021

andresilva commented May 25, 2021

ghost commented May 25, 2021

burdges commented May 25, 2021

rphmeier commented May 25, 2021

kpp commented May 25, 2021

tomaka commented May 25, 2021

rphmeier commented May 25, 2021 •

edited

Loading

Make sure nodes don't hammer each other even when reserved #8901

Make sure nodes don't hammer each other even when reserved #8901

Conversation

tomaka commented May 25, 2021 • edited Loading

kpp left a comment

Choose a reason for hiding this comment

tomaka commented May 25, 2021

andresilva commented May 25, 2021

ghost commented May 25, 2021

burdges commented May 25, 2021

rphmeier commented May 25, 2021

kpp commented May 25, 2021

tomaka commented May 25, 2021

rphmeier commented May 25, 2021 • edited Loading

tomaka commented May 25, 2021 •

edited

Loading

rphmeier commented May 25, 2021 •

edited

Loading