[Discussion] Reject sync request if too many peers are syncing #3147
Comments
Let's not conflate nearup (which is a tool to manage the node) with the behavior of the node itself. Whatever we do with nearup should be separate from nearcore. As for syncing, since a node has a limited number of peers, the number of peers that are syncing is naturally limited. Also, other than state sync (for which we already have limits), syncing is not very resource intensive, so I don't think we need to impose extra restrictions, although I do agree that limiting the number of syncing peers is a way to prevent eclipse attacks.
I agree, let's not add hacks to nearup, like a randomized timer, to work around node issues. The node's inability to communicate efficiently with its peers and decide when and how to sync is a node issue.
I see, you were talking about a rolling release. Closing this issue.
Motivation
When all network nodes are rebooted after an update, they all try to sync at the same time, which makes booting slow.
Proposed design 1
Node A should reject sync requests with structured error X when it already has more than Y nodes syncing. The number Y should be determined by benchmarking the syncing code. When node B receives structured error X during sync, it should try other nodes and retry, after some delay, the peers that returned structured error X. This will naturally line nodes up into a queue.
This also makes such a network easier to monitor, since it is easier to observe why a node has not synced yet.
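A minimal sketch of design 1, to make the proposal concrete. It is not nearcore code: `ServingNode`, `TooManySyncingPeers` (standing in for structured error X), `retry_after`, and `pick_sync_source` are hypothetical names, and the sketch assumes Y is treated as the maximum number of concurrently syncing peers.

```python
from dataclasses import dataclass, field


class TooManySyncingPeers(Exception):
    """Hypothetical structured error X: the serving node is at capacity."""

    def __init__(self, retry_after: float):
        super().__init__(f"too many peers syncing, retry after {retry_after}s")
        self.retry_after = retry_after


@dataclass
class ServingNode:
    """Node A: admits at most `max_syncing` (Y) concurrent sync sessions."""

    name: str
    max_syncing: int          # Y; assumed to come from benchmarking
    retry_after: float = 1.0  # hypothetical delay hint carried in error X
    syncing: set = field(default_factory=set)

    def request_sync(self, peer_id: str) -> None:
        # Reject new peers once Y sessions are active; known peers pass through.
        if peer_id not in self.syncing and len(self.syncing) >= self.max_syncing:
            raise TooManySyncingPeers(self.retry_after)
        self.syncing.add(peer_id)

    def finish_sync(self, peer_id: str) -> None:
        self.syncing.discard(peer_id)


def pick_sync_source(peer_id, candidates, backoff_until, now):
    """Node B: try each candidate that is not in backoff.

    Returns the first node that admits us, or None if every candidate
    is full or still backing off. `backoff_until` maps a node name to
    the earliest time at which it may be retried.
    """
    for node in candidates:
        if backoff_until.get(node.name, 0.0) > now:
            continue  # this peer returned error X recently; skip it for now
        try:
            node.request_sync(peer_id)
            return node
        except TooManySyncingPeers as err:
            backoff_until[node.name] = now + err.retry_after
    return None
```

With two serving nodes that each allow one syncing peer, a third requester gets `None` until one session finishes and its backoff expires, which is the queueing behavior described above. The per-peer backoff table also gives an operator a direct answer to why a node has not synced yet.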
Proposed design 2
@evgenykuzyakov proposed that nearup could add a random delay before it starts the node. @nearmax's argument against it is that it will not work universally, e.g. it won't work when nodes are upgraded by the community and the NEAR Foundation does not have perfect control over when and how people start them. Besides, randomization is a heuristic, which adds to the maintenance burden of the system.