
Add networking req/resp request rate limiting rule #2690

Closed

Conversation

@Nashatyrev (Member) commented Oct 21, 2021

The networking spec is still missing any request rate limiting rules. This leads to a situation where different clients make their own assumptions about 'reasonable' rate limits and may disconnect and downscore other clients whose assumptions differ.

This PR introduces rate limiting similar to RLPx, where the requester must not send another request until the previous one has been responded to. However, it would be good to relax this rule and allow at most 2 concurrent requests, to compensate for network round-trip latency.

UPD: changed the maximum from 2 to 32 to give fast nodes room to serve many requests in parallel.
UPD: RLPx actually supports parallel requests (though without req/resp IDs), so I'm not sure how rate limiting actually works there.
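
As an illustration only (not part of this PR or the spec), a requesting client could enforce such a cap with a per-peer counting semaphore. The sketch below assumes an async transport object exposing a hypothetical send_request(peer_id, protocol_id, payload) coroutine:

```python
import asyncio

MAX_CONCURRENT_REQUESTS = 32  # the value proposed in this PR

class PeerRequester:
    """Caps the number of outstanding requests sent to each peer."""

    def __init__(self, transport):
        self.transport = transport   # assumed to expose send_request(...)
        self.limits = {}             # peer_id -> asyncio.Semaphore

    async def request(self, peer_id, protocol_id, payload):
        if peer_id not in self.limits:
            self.limits[peer_id] = asyncio.Semaphore(MAX_CONCURRENT_REQUESTS)
        async with self.limits[peer_id]:
            # The semaphore slot is held until the response (or a timeout)
            # arrives, so at most MAX_CONCURRENT_REQUESTS requests are in
            # flight to this peer at any time.
            return await self.transport.send_request(peer_id, protocol_id, payload)
```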

Below is a bit more background from the Discord channel conversation:

[17:35] Nashatyrev: During a merge event we hit an issue where Teku was regularly disconnecting Lodestar for exceeding its block request rate limit.
In a chat with @dapplion we found out that there are no limits defined in the spec and each client limits inbound/outbound requests (or does not limit them at all) according to its internal settings.
From my perspective it makes sense to add ReqResp rate limiting rules to the networking spec.
[17:41] Nashatyrev: However, instead of hardcoding rate limits I would suggest adding a ReqResp error code RateLimitExceeded, which would allow more flexible traffic limiting. I.e. a serving node could dynamically limit traffic according to its current load, while a requesting node could dynamically adapt its request rate without the risk of being disconnected and downscored.
[17:44] Nashatyrev: That approach could also be helpful for small networks where the number of peers is low but they could serve requests at higher rates. E.g. due to Teku's conservative rate limits, a testnet with just 2 nodes syncs very slowly.
[17:56] Nashatyrev: There are several options for how RateLimitExceeded could work:

  • either per ReqResp message type or globally across all types. I would prefer the first option here, since e.g. ping/metadata messages are expected to be less frequent than block requests
  • the error message may contain a cool-down period after which messages can be sent again (I've seen a couple of stock exchange trading protocols that use this option)
  • if RateLimitExceeded errors are ignored by the requesting side (e.g. > N errors were sent within some time frame), the peer could be disconnected and downscored

[18:05] Nashatyrev: A different possible option for rate limiting is the maximum number of concurrent requests.
E.g. if this number is 1, the requester must not send a new request until it receives a response to the previous one (I believe this is how RLPx works).
[10:13] arnetheduck: the simple solution here is that the server withholds the response until the client is within the quota - this avoids the need for complex error handling, decreases chatter, avoids round-trip timing issues and is trivial to implement - the rate limiting happens "naturally" at the pace the server is willing to serve, and the client can judge how good a server is by the only metric that matters: how many valid blocks this server is giving me, per second
[13:02] Nashatyrev: Right, this looks like the latter option I suggested (with a parallelism of max 1). And it looks simpler and more natural to me as well.
However, that may work awkwardly for ping/metadata requests. I.e. if a node assumes the max rate for ping is 1 request per 10 seconds, what should it do when it receives an extra ping request? On the other hand, should we rate limit ping/metadata requests at all?
[13:58] arnetheduck: well, one thing to keep in mind is that ping requests are pretty useless from a utility point of view - they basically announce that you have nothing important to say - re parallelism, there's a case for allowing multiple parallel requests up to some limit, but again, this "naturally" solves itself if you limit processing to one at a time per node - this allows clients to pipeline requests and avoid round-trip latency, but by serving them one by one according to a budget, if they send too many, some will simply time out - and if you have too many requests timing out, you eventually disconnect (regardless of whether they're pings or block requests)
[14:00] arnetheduck: from a performance point of view, you might make 1-2 block requests in parallel - one in-flight and one queued - beyond that, you don't have much to gain as a client - if you're doing more, you're likely buggy or spammy
[14:01] arnetheduck: the beauty of this solution is that you don't need to add anything to the spec, and you don't have to worry about compatibility over time with different versions of clients
[14:10] Nashatyrev: Totally agree, except I believe it should still be added to the spec so that all clients are in line with this rate limiting scheme.
Currently the spec doesn't prevent you from sending as many parallel requests as you like.
I'm also inclined to limit max parallel requests (to 2) on a per-message-type basis.

CCing @dapplion @arnetheduck

@arnetheduck (Contributor)

There's a duality here that we should probably nuance a bit: when you're a client, you should make no more than N concurrent requests, whereas as a server, you should be expected to handle / queue at least N requests concurrently.

N == 2 seems as good as any limit.

@Nashatyrev (Member Author)

N == 2 seems as good as any limit.

Right. ... as any limit >= 2

@arnetheduck (Contributor)

Right. ... as any limit >= 2

well, not quite: client <= 2 and server >= 2 - else the server can't punish offending clients... servers are free to support as many requests as they want, as long as they support at least two (so that clients can parallelize)

@nisdas (Contributor) commented Oct 22, 2021

2 seems low and might be problematic when trying to sync in smaller networks. We use a cost-based rate limiter which rate limits by the amount of data requested (as opposed to the number of concurrent requests). If a server can serve > 2 concurrent requests, shouldn't we allow it to? This is useful when trying to fetch a large range of blocks and we have a very small number of peers to request from. Enforcing limits on the requester's side rather than the server's seems excessive here.
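
(For illustration only, not any client's actual implementation: a cost-based limiter of the kind described here can be sketched as a token bucket whose cost is the number of blocks requested rather than the number of requests. The capacity and refill numbers below are made up.)

```python
import time

class CostBucket:
    """Token bucket where each request consumes 'cost' units (e.g. blocks requested)."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self, cost):
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.refill_per_second)
        self.updated = now
        if cost > self.tokens:
            return False          # over quota: queue, delay, or reject the request
        self.tokens -= cost
        return True

bucket = CostBucket(capacity=1024, refill_per_second=64)  # blocks, blocks/s
assert bucket.allow(cost=64)                              # a 64-block range request
```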

@Nashatyrev (Member Author)

2 seems low and might be problematic when trying to sync in smaller networks. We use a cost-based rate limiter which rate limits by the amount of data requested (as opposed to the number of concurrent requests). If a server can serve > 2 concurrent requests, shouldn't we allow it to? This is useful when trying to fetch a large range of blocks and we have a very small number of peers to request from. Enforcing limits on the requester's side rather than the server's seems excessive here.

Right! As @arnetheduck said, 2 is as good as any other number. We may choose e.g. 32 or 64 to give fast nodes the potential to serve many requests in parallel.
On the other hand, slow nodes would just queue excessive requests and process them as they are able. If some of those requests time out on the requesting side, their streams will be closed remotely and the server side can simply drop them from the queue to avoid unnecessary work.
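
(A minimal sketch of that behaviour, assuming a hypothetical stream API with is_closed() and write(); nothing here is mandated by the PR:)

```python
import asyncio

async def peer_request_worker(queue: asyncio.Queue, handler):
    """Serves one queued request at a time for a single peer."""
    while True:
        stream, request = await queue.get()
        if stream.is_closed():             # requester already timed out and gave up:
            continue                       # drop it, don't do unnecessary work
        response = await handler(request)  # served at whatever pace this node can manage
        await stream.write(response)
```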

@Nashatyrev (Member Author)

Copying some more thoughts from the Discord channel:

[17:11] pineapple: From a performance and throughput optimisation point of view for things such as downloading blocks, I think it's a mistake to (a) limit parallel request depth to 2, and (b) have the server process only one request at a time. Both of these limits pessimize throughput in some circumstances.
[17:13] pineapple: For (a), I believe the number 2 is based on the assumption that it's enough to prevent gaps in processing and data flow. But when the network bandwidth-delay product is high compared with the server's processing time, there will be gaps in the data flow with 2 in-flight requests that would be closed with 3 in-flight requests (and 4, 5, etc. depending on the product). That means throughput will be higher with more than 2 requests in flight, even from a server that processes only 1 request at a time.
[17:16] pineapple: For (b), the simplest case for processing more than one request in parallel, if you have them, is probably local database access. Local storage is like a mini network with its own round-trip time. For maximum throughput the storage system needs multiple requests in flight, and one of the ways to ensure that is by servicing multiple requests in parallel when you have them.
[17:39] pineapple: RLPx allows concurrent requests, and this is the reason some RLPx sub-protocols have a requestId field, because responses can be sent in a different order than corresponding requests. Even sub-protocol versions without requestId (eth/65 and below) allow out of order responses, but the client must do a bit more work to figure out how to handle those replies, if it chooses to send multiple requests in flight.
Example from EIP-2481 (eth/66: request identifier), https://eips.ethereum.org/EIPS/eip-2481:
"Let’s consider a client making many simultaneous requests for GetBlockHeaders to one of its peers. By nature it can not be guaranteed that the expected responses arrive in the same order as they were sent."
[17:50] Nashatyrev: @Pineapple that totally makes sense! The constant 2 was kind of a starting point. Actually there is not much difference between 2 and e.g. 64 requests for a slow server to put into its queue.
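
(A rough back-of-the-envelope illustration of point (a), with made-up numbers: if the round trip takes R and the server needs T per request, keeping the server busy requires roughly (R + T) / T requests in flight.)

```python
import math

R_ms = 200   # example network round-trip time
T_ms = 50    # example server processing time per request
in_flight_needed = math.ceil((R_ms + T_ms) / T_ms)
print(in_flight_needed)   # 5 -> with only 2 requests in flight the data flow has gaps
```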

@Nashatyrev (Member Author)

Changed the limit from 2 to 32 in this PR

@dapplion (Member) commented Oct 28, 2021

@Nashatyrev Thanks for leading this work, but does a MAXIMUM_CONCURRENT_REQUESTS param communicate an expectation of rate limiting requirements? I imagined some rough numbers denoted as request count per time unit.

@Nashatyrev (Member Author)

@Nashatyrev Thanks for leading this work, but does a MAXIMUM_CONCURRENT_REQUESTS param communicate an expectation of rate limiting requirements? I imagined some rough numbers denoted as request count per time unit.

A request rate limit set in stone (in the spec) would strictly prevent fast nodes from serving requests faster even when they are able to, while the MAXIMUM_CONCURRENT_REQUESTS approach is far more flexible: the serving side can control the request rate dynamically. That would act as a kind of backpressure mechanism.
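
(Illustration only, with made-up numbers: under a concurrency cap the achieved rate settles at roughly N / T by Little's law, where N is the cap and T the server's per-request latency, so the effective rate tracks the server's actual capacity.)

```python
N = 32                          # concurrency cap per peer
for T in (0.05, 0.5, 2.0):      # fast, medium, slow server (seconds per request)
    print(f"T={T}s -> about {N / T:.0f} requests/s")
# A fixed requests-per-second constant in the spec could not adapt like this.
```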

@Nashatyrev (Member Author)

BTW, I just thought that with a large MAXIMUM_CONCURRENT_REQUESTS (like the current 32, or maybe even larger) we could restrict the number of parallel requests globally across all request types. That could be easier from the implementation point of view.
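
(A sketch of the two bookkeeping options, purely illustrative; the request type names below are just examples:)

```python
import asyncio

MAXIMUM_CONCURRENT_REQUESTS = 32
REQUEST_TYPES = ["beacon_blocks_by_range", "beacon_blocks_by_root", "ping", "metadata"]

# Option A: one counter shared across all request types (simpler bookkeeping).
global_limit = asyncio.Semaphore(MAXIMUM_CONCURRENT_REQUESTS)

# Option B: an independent counter per request type (maps more directly onto
# libp2p, where each request type is its own protocol).
per_type_limit = {t: asyncio.Semaphore(MAXIMUM_CONCURRENT_REQUESTS) for t in REQUEST_TYPES}
```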

@arnetheduck (Contributor)

That could be easier from the implementation point of view.

I suspect this would depend on the implementation - in libp2p, each request type is essentially a separate protocol so it's usually easier to deal with things on a per-request-type basis.

I view this PR mainly as imposing a never-to-normally-be-reached limit / sanity check, so that servers can make some assumptions about honest clients so as to filter out the easiest / worst offenders.

Indeed, having fixed rate limits will likely backfire - the server has all it needs to make this decision on its own based on local conditions.

@Nashatyrev (Member Author)

Closing this PR in favor of #3767

@Nashatyrev Nashatyrev closed this Dec 2, 2024