Avoid bad peers propagation #1134

lock9 · 2019-09-30T21:43:52Z

(Originally from @decentralisedkev on #366)

Summary
Problems:

When a node disconnects from another node, other nodes have no way to detect it. As a result, they keep propagating bad peers, which wastes other nodes' resources on failed connection attempts and slows down the initial synchronization.
Nodes have no way to disconnect peers that are slowing them down in terms of message delivery. For example, if I ask a peer for blocks/headers and that peer takes more than 45 seconds to respond, the node should disconnect and search for a faster peer to request the same resources from. A slow peer will only get slower as more peers keep connecting to it and asking for inventory.

Do you have any solution you want to propose?

Add heartbeat messages to the P2P protocol: ping and pong. A node sends a ping message with a nonce, and the other node replies with a pong message carrying the same nonce to signal that the connection is still alive.
Implement an in-flight message tracker for each peer that the node is connected to. If a getheaders message is sent and no headers message is received within 30 seconds, for example, the node should disconnect from that peer.
An example implementation of this can be seen here: https://github.com/decentralisedkev/neo-go/blob/v2/pkg/p2p/peer/stall/stall.go
With these solutions in place, nodes will stop holding on to a large number of bad peers or peers that only run for a short period of time.

Where in the software does this update apply?

P2P (TCP)


ixje · 2019-10-01T06:25:48Z

Ping/pong is already present in the latest P2P protocol (Add ping/pong for updating node height #673).
The in-flight tracking is a client-specific optimisation, not a P2P protocol extension/upgrade.

Also, a node should not decide on behalf of another node whether a peer is bad based on metrics like ping, response time, etc. The only acceptable metric for filtering should be "not able to connect at all" (a.k.a. dead peer). Rationale:

Ping depends on geographical location: what might be a bad address for node A might be a good one for node B because they're closer.
Response time has some correlation with ping, but just because you think 2 seconds is too slow doesn't mean other nodes think that's too slow for their application.

lock9 · 2019-11-09T03:10:31Z

@ixje What if the node you tried to connect to was 'full'? If we know that the peer is full, we should avoid propagating it, at least for some time.
I agree that we should not keep trying to connect to the same 'bad nodes' over and over again, but we should also be careful not to judge a node by "our relation with it".

ixje · 2019-11-09T11:45:36Z

I just noticed that "bad peers" is no longer a thing in NEO3

neo/neo/Network/RPC/RpcServer.cs

Line 533 in f70a76d

json["bad"] = new JArray(); //badpeers has been removed

Ideally, yes we avoid propagating. However, what will we do if the list we have available is small and mostly consists of "full" nodes? What is a proper timeout for full nodes to be propagated? They could arguably not be full 1 second after our connection attempt.

lock9 added the Discussion Initial issue state - proposed but not yet accepted label Sep 30, 2019

kevaundray mentioned this issue Sep 30, 2019

Optimisations For The Network Protocol #366

Closed

lock9 added Enhancement Type - Changes that may affect performance, usability or add new features to existing modules. P2P Module - peer-to-peer message exchange and network optimisations, at TCP or UDP level (not HTTP). labels Sep 30, 2019

lock9 changed the title ~~Avoid propagating bad peers~~ Avoid bad peers propagation Oct 1, 2019

lock9 mentioned this issue Oct 25, 2019

Neo 3 Opmizations update list #1189

Closed

31 tasks
