
Add a simple throttler/limiter #91

Closed
umputun opened this issue Jun 12, 2021 · 8 comments
Labels
enhancement (New feature or request)

Comments

@umputun
Owner

umputun commented Jun 12, 2021

This could be useful and won't add much complexity. The throttler should be able to limit the total number of concurrent requests per reproxy server as well as per proxy destination. Not sure if limiting per destination server (virtual host) is needed.

It should also be able to limit req/sec per remote client for the same scopes (total and destination).

@umputun added the enhancement label Jun 12, 2021
@fkirill

fkirill commented Jun 20, 2021

I would be interested in implementing this.
My understanding is that this rate-limiter implementation, https://pkg.go.dev/golang.org/x/time/rate, should be enough for all the low-level mechanics.
For variants like "per remote client" I guess we need to be a bit defensive: with just a limiter per IP address (do we need other means of client identification, like an HTTP header? Technically, a client can be distributed and/or have a non-permanent externally-facing address), we are dealing with a potentially unbounded data set, so we may need to cap the number of clients we track, for example with an LRU cache, as sketched below.
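To make the defensive part concrete, here is a minimal sketch of a per-client limiter set bounded by an LRU cache, assuming golang.org/x/time/rate for the token buckets and hashicorp/golang-lru for the cache; the cache choice and all names here are hypothetical, not reproxy code:

```go
package limit

import (
	"net"
	"net/http"

	lru "github.com/hashicorp/golang-lru/v2"
	"golang.org/x/time/rate"
)

// clientLimiters keeps one token-bucket limiter per client IP,
// bounded by an LRU cache so the tracked set can't grow without limit.
type clientLimiters struct {
	cache *lru.Cache[string, *rate.Limiter]
	rps   rate.Limit
	burst int
}

func newClientLimiters(maxClients int, rps rate.Limit, burst int) (*clientLimiters, error) {
	c, err := lru.New[string, *rate.Limiter](maxClients) // least-recently-seen client evicted first
	if err != nil {
		return nil, err
	}
	return &clientLimiters{cache: c, rps: rps, burst: burst}, nil
}

// allow reports whether a request from the given client should be admitted.
func (cl *clientLimiters) allow(r *http.Request) bool {
	ip, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		ip = r.RemoteAddr // no port present, use the address as-is
	}
	lim, ok := cl.cache.Get(ip)
	if !ok {
		lim = rate.NewLimiter(cl.rps, cl.burst)
		cl.cache.Add(ip, lim)
	}
	return lim.Allow()
}
```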

And just to clarify: this is a simple reverse proxy, so distributed rate-limiting is not something we're considering right now, is that right? It could be done with something like a UDP broadcast to share state between proxy hosts, but again the per-client state would be a problem (data size, serialization/deserialization cost, etc.).

@umputun
Owner Author

umputun commented Jun 20, 2021

I thought about leveraging https://github.com/didip/tollbooth, which is a relatively thin wrapper on top of x/time/rate, decently tested and battle-proven.

> For variants like "per remote client" I guess we need to be a bit defensive: with just a limiter per IP address we are dealing with a potentially unbounded data set, so we may need to cap the number of clients we track, for example with an LRU cache.

as far as I recall tollbooth does all of this

> distributed rate-limiting is not something we're considering right now, is that right?

yep

@fkirill

fkirill commented Jun 22, 2021

Yes, you're right, it seems to be a wrapper with additional bells and whistles over x/time/rate.
I have already created a draft implementation (https://github.com/umputun/reproxy/pull/94/files); could you please take a look and check that I extended the code in a logical way?
Switching from rate to tollbooth should be a small change and shouldn't alter the overall implementation much.

@umputun mentioned this issue Jun 23, 2021
@umputun
Owner Author

umputun commented Jun 23, 2021

@fkirill I had this basic implementation for a while, just never pushed it. Please take a look.

I have tried to keep it as simple as possible from the user's point of view: no fancy params, just a couple of limiter values. Internally it uses the tollbooth limiter with LimitByKeys (roughly as in the sketch below). I also borrowed your idea of avoiding repeated code for passThroughHandler.
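For reference, LimitByKeys-based middleware looks roughly like this (a sketch assuming the tollbooth v6 import path; keying by r.RemoteAddr is just one possible layout, not necessarily what the PR does):

```go
package limit

import (
	"net/http"

	"github.com/didip/tollbooth/v6"
)

// limitByClient wraps next so that each client key is admitted
// at most reqSec requests per second.
func limitByClient(reqSec float64, next http.Handler) http.Handler {
	lmt := tollbooth.NewLimiter(reqSec, nil) // nil -> tollbooth's default in-memory store
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// key by remote address so each client gets its own bucket;
		// more keys (e.g. the destination) could be appended to narrow the scope
		if httpErr := tollbooth.LimitByKeys(lmt, []string{r.RemoteAddr}); httpErr != nil {
			http.Error(w, httpErr.Message, httpErr.StatusCode)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```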

@fkirill

fkirill commented Jun 26, 2021

No worries, thanks!

@fkirill

fkirill commented Jun 26, 2021

Hey Umputun, a quick question. I was reading the code you mentioned and found some hints around limiting concurrent requests.

So should throttling limit concurrent in-flight requests or admitted requests per second?

It's not a mutually exclusive choice.
My initial impression is that tollbooth is a wrapper around time/rate, so it would limit throughput (which is what time/rate does). But I might have misunderstood it; I have never worked with the library before.
It shouldn't be too complex to implement concurrent request tracking too (just another Throttler implementation): we would need to maintain an in-memory record of all currently executing requests and reject new ones once it reaches a certain size. A minimal version is sketched below.
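For illustration, a concurrency cap can be as small as a buffered channel used as a semaphore (a hypothetical sketch, not the PR code):

```go
package limit

import "net/http"

// inFlightLimit caps the number of concurrently executing requests at max.
func inFlightLimit(max int, next http.Handler) http.Handler {
	sem := make(chan struct{}, max)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case sem <- struct{}{}: // acquired one of max slots
			defer func() { <-sem }()
			next.ServeHTTP(w, r)
		default: // all slots busy: reject instead of queueing
			http.Error(w, "too many concurrent requests", http.StatusTooManyRequests)
		}
	})
}
```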

My recommendation is to start with throughput limiting (option (2) below). Here are some pros and cons of both approaches.

A little bit of theory. There are two schools of thought on throttling:

  1. Limiting concurrent requests. This approach effectively protects the server/backend. It produces VARIABLE throughput and DOES depend on request latency: effectively it puts an upper bound on overall downstream throughput consumption so that the server is never overloaded. This is nice, but it's quite hard to reason about, and even harder to guarantee anything to the consumer. Depending on the load, there can be problematic side effects too. For starters, it is quite hard to ensure fairness: a client that makes more requests overall will receive better treatment on average (more requests admitted) than a client making fewer calls, so the noisy-neighbor problem can still be an issue. Another problematic side effect is that if one downstream system starts answering slower or goes down (requests time out), the overall throughput of the system drops drastically. As for pros, in systems where there is a huge disparity in expected call volume between clients, this approach potentially gives a better overall user experience, since it rejects an equal percentage of calls from all clients. Another pro is that a system that has not reached its limits can tolerate substantial short spikes in demand from individual clients without impacting overall stability. I guess this approach is only used in very specialized cases (like in DynamoDB).
  2. Limiting throughput per session/client (in most AWS services it's usually per AWSAccountId). This approach ensures BANDWIDTH for each client (guarantees fairness) but ignores the fact that some clients simply have more data and need to make more calls because of it (which is why this throttling approach is usually accompanied by a limits-management system, but we don't need that in our simple case, at least not now).

@umputun
Owner Author

umputun commented Jun 26, 2021

> So should throttling limit concurrent in-flight requests or admitted requests per second?

This is a good question. Initially, I planned to limit in-flight requests, at least at the global (server) level, and even added such a middleware. However, for per-client limiters this feels very unusual and unexpected, so I have opted for the more traditional req/sec. Providing two distinct throttlers would be very confusing for the user, so I have switched both the global and client-level limiters to req/sec via tollbooth.

I don't see how limiting concurrent requests makes much sense except for the overall limiter and, maybe, at the destination server level. I mean, limiting in-flight requests for each client seems almost useless; as a user, I wouldn't even know what value to set. So I'm planning to keep req/sec only for now, as it seems to be a universally accepted approach, and the user can still limit the server's load indirectly. We may consider adding concurrency limiters later if needed, but I would prefer to keep it simple and not confuse users with two different methods of throttling.

I would appreciate it if you could review the PR. To me it seems almost trivial, and from reading the sources of tollbooth, the LimitByKeys method seems to be the one we want.

@fkirill

fkirill commented Jul 1, 2021

Hey Umputun, I approved with a couple of (nuanced) comments.
We're potentially missing metrics ("what's not registered in metrics hasn't really happened").
It potentially also makes sense to surface configuration for the surge (burst) bucket size. Currently it defaults to the rate, see https://github.com/didip/tollbooth/blob/4a3af0d94edfddc169de231525db021ef6a15e1b/tollbooth.go#L32, which means the current limiter won't tolerate any spikes at all; this might not be ideal. A sketch of surfacing it follows below.
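For example, exposing the burst size could look roughly like this (a sketch; the helper name and the burst parameter are hypothetical, SetBurst is the tollbooth limiter method):

```go
package limit

import (
	"github.com/didip/tollbooth/v6"
	"github.com/didip/tollbooth/v6/limiter"
)

// newLimiter is a hypothetical helper: reqSec is the steady rate,
// burst is the surge bucket size (tollbooth defaults burst to the rate).
func newLimiter(reqSec float64, burst int) *limiter.Limiter {
	lmt := tollbooth.NewLimiter(reqSec, nil)
	lmt.SetBurst(burst) // let the limiter absorb short spikes above the steady rate
	return lmt
}
```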

@umputun closed this as completed Jul 9, 2021