
g3proxy: Outbound rate limiting in requests per second per domain + queue management #309

Open
bern548456 opened this issue Sep 11, 2024 · 2 comments


@bern548456

Hello,

I'm trying to use g3proxy in forward proxy mode for HTTP and HTTPS requests.
I haven't found a way to set rate limits in terms of requests per second per domain in the server options.
And the user-level rate limiting seems to be global (to the user), not applied per domain.

I'm looking for the limit to be applied per domain (without knowing the domains in advance).
I'm fine with using HTTPS interception to extract the contacted domain, but if SNI-based matching is better advised (and supported), that works too. I'm just unsure whether all requests actually carry the SNI field, as I want to avoid running into difficulties if some don't.

All requests should be processed (rather than rejecting those that exceed the rate limit for a given domain).
No burst mechanism should be applied, so that the given rate limit is strictly respected.
All requests should be handled in the order they were received (regardless of whether they were put on hold for a specific domain). A minimal sketch of these semantics follows.
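
To make these semantics concrete, here is a minimal standalone sketch of the behavior I'm after (plain tokio Rust, not g3proxy code; every name is mine). A ticker spaced at 1/rate releases queued callers one at a time, so there is never a burst and arrival order is preserved:

```rust
// A "no burst" pacer: queued callers are released one at a time, at most
// once every 1/rate seconds, in FIFO order. Standalone tokio code, not
// g3proxy; every name here is hypothetical.
use std::time::Duration;
use tokio::sync::{mpsc, oneshot};
use tokio::time::MissedTickBehavior;

async fn pacer(mut rx: mpsc::Receiver<oneshot::Sender<()>>, rate_per_sec: u32) {
    let gap = Duration::from_secs_f64(1.0 / rate_per_sec as f64);
    let mut ticker = tokio::time::interval(gap);
    // Delay missed ticks instead of firing them back-to-back, so an idle
    // period is never followed by a burst (unlike a token bucket).
    ticker.set_missed_tick_behavior(MissedTickBehavior::Delay);
    while let Some(release) = rx.recv().await {
        ticker.tick().await;      // wait for the next slot
        let _ = release.send(()); // let exactly one queued request proceed
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(1024); // bounded FIFO queue: hold, don't reject
    tokio::spawn(pacer(rx, 5));         // 5 requests per second, evenly spaced
    for i in 0..10 {
        let (done_tx, done_rx) = oneshot::channel();
        tx.send(done_tx).await.unwrap(); // enqueue in arrival order
        done_rx.await.unwrap();          // resume only when our slot opens
        println!("request {i} dispatched");
    }
}
```

The bounded channel holds excess requests instead of rejecting them; callers just wait longer for their slot.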

Is this possible with g3proxy?
I'm still reading the docs and code, but would be grateful for any input!

Best regards

bern548456 changed the title from “G3proxy: Outbound rate limiting in requests per second per domain + queue management” to “g3proxy: Outbound rate limiting in requests per second per domain + queue management” on Sep 11, 2024
@zh-jq-b
Member

zh-jq-b commented Sep 12, 2024

> I'm looking for the limit to be applied per domain (without knowing the domains in advance).

It's not supported yet. And if it were implemented, there would need to be a limit on the maximum number of tracked domains.

> I'm fine with using HTTPS interception to extract the contacted domain, but if SNI-based matching is better advised (and supported), that works too. I'm just unsure whether all requests actually carry the SNI field, as I want to avoid running into difficulties if some don't.

HTTPS interception is needed if you want to set rate limits on individual HTTP requests. SNI is not suitable here, as there can be many requests in the same HTTPS connection, and they may be multiplexed when it's H2.

> All requests should be processed (rather than rejecting those that exceed the rate limit for a given domain). No burst mechanism should be applied, so that the given rate limit is strictly respected. All requests should be handled in the order they were received (regardless of whether they were put on hold for a specific domain).

It's possible to use a queue along with a rate-limit policy. But can you share more details about the target usage?
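
For illustration, here is a rough sketch of what "a queue along with a rate-limit policy" could look like (plain tokio code, not g3proxy internals; all names are placeholders):

```rust
// Sketch of "a queue along with a rate-limit policy": a bounded FIFO queue
// drained by one worker that enforces the pacing. Plain tokio code, not
// g3proxy internals; all names are illustrative.
use std::time::Duration;
use tokio::sync::mpsc;
use tokio::time::{sleep, sleep_until, Instant};

#[derive(Debug)]
struct Job {
    url: String, // whatever a queued request needs to carry
}

async fn rate_limited_worker(mut queue: mpsc::Receiver<Job>, rate_per_sec: u32) {
    let gap = Duration::from_secs_f64(1.0 / rate_per_sec as f64);
    let mut next_allowed = Instant::now();
    while let Some(job) = queue.recv().await {
        sleep_until(next_allowed).await;     // honor the pacing: no bursts
        next_allowed = Instant::now() + gap; // earliest slot for the next job
        tokio::spawn(async move {
            // forward the request upstream here
            println!("dispatching {}", job.url);
        });
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(1024); // bounded: senders wait instead of being rejected
    tokio::spawn(rate_limited_worker(rx, 3));
    for i in 0..5 {
        tx.send(Job { url: format!("https://example.com/{i}") }).await.unwrap();
    }
    sleep(Duration::from_secs(2)).await; // let the worker drain the queue
}
```

The bounded channel gives senders backpressure instead of rejection, and FIFO order falls out of the channel itself.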

@bern548456
Author

Hi,
thank you for your quick answer and insights.

The typical use case would be data aggregation.
Since the client tools follow different coding practices, with varying speed-management capabilities (or none at all), g3proxy would be used as a safeguard edge proxy.

I'm still hesitating between using a separate component (e.g. Redis) to store the queues and rate-limiting state (to avoid local memory exhaustion) and doing everything locally (for ease of use and better performance). But since there would only be a single proxy instance (even if it's multi-threaded), I'd be inclined to go for the local in-memory option rather than Redis, accepting that it requires synchronization and thread-safe mechanisms.

You are right that, in any case, the number of visited domains and the requests queued per domain could quickly use too much memory. This is why I would consider a global rate limiter (over all incoming requests and domains) that would help restrict the growth of each domain's queue if necessary. And yes, each domain would still have its own rate limiter. Perhaps this global rate limiter could be adjusted at runtime, as a backpressure mechanism; a sketch of this shape follows.
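
A sketch of the in-memory shape I have in mind (hypothetical code, not based on g3proxy internals; the constants and names are placeholders):

```rust
// Sketch: one pacer per domain plus a global pacer acting as backpressure,
// with a hard cap on tracked domains. Hypothetical code, not g3proxy.
use std::collections::HashMap;
use std::time::Duration;
use tokio::time::Instant;

const MAX_DOMAINS: usize = 10_000; // the "max domain number limit" mentioned above

struct Limiter {
    per_domain_gap: Duration,              // 1s / N for N req/s per domain
    global_gap: Duration,                  // adjustable at runtime for backpressure
    global_next: Instant,
    domain_next: HashMap<String, Instant>, // next allowed dispatch per domain
}

impl Limiter {
    /// Reserves the earliest dispatch slot for `domain` and returns it.
    /// The caller sleeps until that instant (tokio::time::sleep_until),
    /// so requests are delayed rather than rejected.
    fn reserve(&mut self, domain: &str) -> Result<Instant, &'static str> {
        if self.domain_next.len() >= MAX_DOMAINS && !self.domain_next.contains_key(domain) {
            return Err("too many tracked domains"); // or evict the stalest entry
        }
        let now = Instant::now();
        let d_next = self.domain_next.entry(domain.to_string()).or_insert(now);
        // Both the per-domain limit and the global limit must be satisfied.
        let slot = (*d_next).max(self.global_next).max(now);
        *d_next = slot + self.per_domain_gap;
        self.global_next = slot + self.global_gap;
        Ok(slot)
    }
}
```

In a multi-threaded proxy this map would have to sit behind a mutex or be sharded per worker, which is the synchronization cost mentioned above; adjusting `global_gap` at runtime would be the backpressure knob.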

In addition, I'd consider splitting the overall scope into smaller ones (maybe handling domains in batches of 1000). I don't want to process all requests as quickly as possible, but rather find a compromise.

All in all, I'm well aware this could be complex to implement, which is why I'm still thinking about it ;).
