Upgrading bootnode client code from polkadot-v0.9.38 to polkadot-v1.1.0 causes bans #5265

Open
maltekliemann opened this issue Aug 6, 2024 · 3 comments
Labels
I10-unconfirmed Issue might be valid, but it's not yet known.

Comments

@maltekliemann

We have two bootnodes running at polkadot-v0.9.38. Upgrading one of these to our new client, which is at polkadot-v1.1.0, raises the following two error messages:

[🔮 Zeitgeist Parachain] Report 12D3KooWAXGvE8rMyNqpeUsmf54sQ6auje7VvRGfYxgMQJc6bZaA: -2147483648 to -2147483648. Reason: Same block request multiple times. Banned, disconnecting.

[Relaychain] Report 12D3KooWGd8FPMDvLgE4CrEZg31FPJLmVwihC2PVZ9gfTuY9tosX: +100 to -2147483548. Reason: Grandpa: Neighbor message. Banned, disconnecting.

(The runtime is still at the old version at this point.)

We have failed to reproduce these error messages with a non-bootnode client on our local machines, in integration tests, or on a local parachain network: no errors in any of these setups.

What causes these errors? Are they bootnode related? Can we expect them to vanish once a majority of the nodes are updated to the new client? Should we be concerned about going through with this update?

@github-actions github-actions bot added the I10-unconfirmed Issue might be valid, but it's not yet known. label Aug 6, 2024
@skunert
Contributor

skunert commented Aug 7, 2024

The first message is being tracked here: #1915 and here: #531

There has been recent work to avoid these duplicate block requests: #5029
That fix is fairly recent and will not be available in 1.1.0. The issue should not appear too often, though.

Regarding the Grandpa neighbor message, I am not sure. cc @lexnv

@maltekliemann
Author

Thanks for the reply. Am I interpreting this correctly: We don't have to worry about halting our network due to at least the first message?

@lexnv
Contributor

lexnv commented Aug 7, 2024

> We don't have to worry about halting our network due to at least the first message?

I would say everything is ok; there is nothing to worry about from those two log messages. Generally, we care more about block production and connectivity to multiple peers. It is ok for some peers to get disconnected, especially if they "misbehave".

We currently ban peers that have made the same request to us multiple times (3 times).
This is what causes the first error message, "Same block request multiple times".
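
The rule above can be sketched as a per-peer counter of identical requests. This is only an illustration, not the actual client code: the type names `SeenRequests` and `Verdict`, the constant `MAX_SAME_REQUESTS`, the use of a block number as the request key, and the choice to ban exactly on the third identical request are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Assumed limit, taken from the "3 times" mentioned in this thread.
const MAX_SAME_REQUESTS: u32 = 3;

/// Hypothetical outcome of handling an incoming block request.
#[derive(Debug, PartialEq)]
enum Verdict {
    Serve,
    Ban,
}

/// Hypothetical tracker: counts how often each peer sent the same request.
#[derive(Default)]
struct SeenRequests {
    // Key: (peer id, requested block number). A real request has more fields.
    counts: HashMap<(String, u64), u32>,
}

impl SeenRequests {
    /// Record one request and decide whether the peer should be banned.
    fn observe(&mut self, peer: &str, block_number: u64) -> Verdict {
        let count = self
            .counts
            .entry((peer.to_string(), block_number))
            .or_insert(0);
        *count += 1;
        if *count >= MAX_SAME_REQUESTS {
            Verdict::Ban
        } else {
            Verdict::Serve
        }
    }
}

fn main() {
    let mut seen = SeenRequests::default();
    // The first two identical requests are served, the third triggers a ban.
    assert_eq!(seen.observe("peer-a", 42), Verdict::Serve);
    assert_eq!(seen.observe("peer-a", 42), Verdict::Serve);
    assert_eq!(seen.observe("peer-a", 42), Verdict::Ban);
    // A different request, or the same request from another peer, is unaffected.
    assert_eq!(seen.observe("peer-a", 43), Verdict::Serve);
    assert_eq!(seen.observe("peer-b", 42), Verdict::Serve);
}
```

A slow responder makes this easier to trigger: if the requester times out and retries the same block, the retries count as duplicates on the serving side.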

PR #5029 aims to take some pressure off peers that are slow to respond. However, we'd have to wait for it to be deployed to the majority of the network before we see significant effects.

The second message is not concerning. We are adding +100 to the reputation of a banned peer. This is expected behavior: we are "rewarding" the peer by increasing its reputation.

We emit a warning whenever the reputation of a peer is below a threshold. In this case, the +100 did not increase the peer's reputation enough to escape the banned threshold.
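
The numbers in the two log lines can be checked with a few lines of arithmetic. The sketch below assumes that reputation is an `i32` updated with saturating addition and that each log line reads "change to resulting reputation"; both assumptions are consistent with the values shown in this thread but are not confirmed by it. The threshold `ASSUMED_BANNED_THRESHOLD` is a placeholder chosen for illustration, not the client's real value.

```rust
/// Placeholder ban threshold, for illustration only (not the real constant).
const ASSUMED_BANNED_THRESHOLD: i32 = i32::MIN / 2;

/// Apply a reputation change, clamping at the i32 bounds (assumed behavior).
fn apply_change(reputation: i32, delta: i32) -> i32 {
    reputation.saturating_add(delta)
}

/// A peer counts as banned while its reputation is below the threshold.
fn is_banned(reputation: i32) -> bool {
    reputation < ASSUMED_BANNED_THRESHOLD
}

fn main() {
    // First log line: a change of -2147483648 leaves the peer at -2147483648.
    let after_fatal = apply_change(0, i32::MIN);
    assert_eq!(after_fatal, -2147483648);
    assert!(is_banned(after_fatal));

    // Second log line: +100 on a peer at the minimum gives -2147483548,
    // which is still far below the threshold, so the peer stays banned.
    let after_reward = apply_change(i32::MIN, 100);
    assert_eq!(after_reward, -2147483548);
    assert!(is_banned(after_reward));

    println!("after +100: {after_reward}, banned: {}", is_banned(after_reward));
}
```

Under these assumptions, a single +100 reward is negligible compared with the distance to any plausible threshold, which is why the peer is reported as banned again.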

Next Steps

  • Make the warn message a bit clearer (something like "The peer did not escape the banned threshold, disconnecting")
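
One possible shape for the clearer message is to distinguish a peer that just crossed the threshold from one that was already banned and stayed banned. The function `report_message`, its parameters, and the threshold value are hypothetical; only the existing log wording and the suggested replacement text come from this thread.

```rust
/// Placeholder ban threshold, for illustration only (not the real constant).
const ASSUMED_BANNED_THRESHOLD: i32 = i32::MIN / 2;

/// Hypothetical helper: build the report line for a reputation change.
fn report_message(peer: &str, before: i32, delta: i32, reason: &str) -> String {
    let after = before.saturating_add(delta);
    let head = format!("Report {peer}: {delta:+} to {after}. Reason: {reason}.");
    let was_banned = before < ASSUMED_BANNED_THRESHOLD;
    let is_banned = after < ASSUMED_BANNED_THRESHOLD;
    match (was_banned, is_banned) {
        // The peer crossed the threshold with this report.
        (false, true) => format!("{head} Banned, disconnecting."),
        // The peer was already banned and the change was not enough to recover.
        (true, true) => format!(
            "{head} The peer did not escape the banned threshold, disconnecting."
        ),
        // The peer is not banned after the change.
        _ => head,
    }
}

fn main() {
    let msg = report_message("peer-x", i32::MIN, 100, "Grandpa: Neighbor message");
    println!("{msg}");
}
```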


3 participants