Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

[client/network] ban peer for 10s after disconnecting #13747

Closed

Conversation

melekes
Copy link
Contributor

@melekes melekes commented Mar 29, 2023

plus do not accept incoming connections if peer is still banned.

Refs https://github.com/paritytech/infrastructure-alerts/issues/26
Refs #13778

@melekes melekes self-assigned this Mar 29, 2023
@melekes melekes added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Mar 29, 2023
@melekes
Copy link
Contributor Author

melekes commented Mar 31, 2023

@altonen @dmitry-markin this PR solves the problem from "our" side by ignoring incoming connections which arrive too soon (peers are banned for 10s).

@bkchr
Copy link
Member

bkchr commented Mar 31, 2023

plus do not accept incoming connections if peer is still banned.

Is this actually on a connection level? Isn't this just on the "substream" level?

Any possibility to add some simple test?

When do we ban a peer? Only when they lost all their reputation? Or would we also ban them in other circumstances?

@bkchr
Copy link
Member

bkchr commented Mar 31, 2023

And the peerset already has some kind of banning or? When a node "runs out of reputation" we already had some banning behavior, but 30 seconds AFAIR or?

@bkchr
Copy link
Member

bkchr commented Mar 31, 2023

And the peerset already has some kind of banning or? When a node "runs out of reputation" we already had some banning behavior, but 30 seconds AFAIR or?

// We use `k = 0.98`, so we divide by `50`. With that value, it takes 34.3 seconds
// to reduce the reputation by half.
ahh that was the number I was speaking about, but this is only the time it takes to change the reputation by half.

@@ -1533,6 +1552,12 @@ impl NetworkBehaviour for Notifications {

// Disabled => Disabled | Incoming
PeerState::Disabled { mut connections, backoff_until } => {
// Do not accept incoming connections if the peer is still banned.
if backoff_until.map_or(false, |t| t > Instant::now()) {
*entry.into_mut() = PeerState::Disabled { connections, backoff_until };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we insert PeerState::DisabledPendingEnable with a timer here instead? Otherwise it seems we will never progress from this state.

Copy link
Contributor Author

@melekes melekes Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔 I was under the impression that the peer will try again at some point in the future (i.e. the same code block will be executed again => progress is made).

@dmitry-markin
Copy link
Contributor

What was the reason why this wasn't merged last time?

@dmitry-markin

This comment was marked as resolved.

@altonen
Copy link
Contributor

altonen commented Mar 31, 2023

Fixes #13778

I wouldn't tell that this fixes the referenced issue, it only puts misbehaving peers "on hold" locally.

It doesn't fix it. I'm generally in favor of merging this but the underlying issue for this behavior has to be found out first, otherwise we're just masking it with this change.

The reason why this was not merged the last time was because there was some discussion about this behavior in general and how it might blow up in our faces as apparently you're not supposed to do this in the listening end but the dialer is expected to behave correctly. I'm also leaning towards not merging anything connectivity-related because sc-network is already in such a fragile conditions that adding new variables into the mix makes it just that much harder to debug and it's already quite difficult to understand what is going on, even with the help of trace logs.

@melekes
Copy link
Contributor Author

melekes commented Apr 4, 2023

the underlying issue for this behavior has to be found out first, otherwise we're just masking it with this change.

I agree ☝️

Is this actually on a connection level? Isn't this just on the "substream" level?

All the streams within the notifications protocol.

When do we ban a peer?

Any time we choose to disconnect from a peer (bad handshake, max peers reached).

@bkchr
Copy link
Member

bkchr commented Apr 4, 2023

Any time we choose to disconnect from a peer (bad handshake, max peers reached).

Do we also ban when the peer has less than the minimum reputation in any peerset?

@melekes
Copy link
Contributor Author

melekes commented Apr 5, 2023

Do we also ban when the peer has less than the minimum reputation in any peerset?

Yes, but that happens independently in the peerset afaik.

@melekes
Copy link
Contributor Author

melekes commented Apr 7, 2023

but the underlying issue for this behavior has to be found out first

Closing this until we find out the reason for peers constantly reconnecting to us (see #13778 (comment) for @altonen's thoughts).

@melekes melekes closed this Apr 7, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants