
response-ratelimiting plugin still forwards request after limit is exceeded - Kong 0.7.0 #1096

Closed
codewaffle opened this issue Mar 23, 2016 · 9 comments
Assignees
Labels
task/feature Requests for new features in Kong

Comments

@codewaffle

I'm running 0.7.0 using Docker and the official image, linked to a vanilla Cassandra container.

My custom response headers are triggering the plugin properly: x-ratelimit-remaining-test-hour decrements to 0 and Kong starts kicking back 429 responses for every request from that IP. Perfect, except those requests are still being forwarded to the backend with no indicator that they should be rate limited (and I suspect they shouldn't be reaching the backend at all).

@subnetmarco subnetmarco self-assigned this Mar 24, 2016
@subnetmarco
Member

@codewaffle I cannot replicate your bug, and once Kong starts answering with 429 I don't see any request hitting my upstream service.

Could you please give me the curl commands that you used for adding the API and the plugin?

@codewaffle
Author

I'm doing some more comprehensive testing now to see if I can narrow it down.

@codewaffle
Author

@thefosk I've put together a Vagrantfile + script + supporting demo API that showcases this issue, visible in this gist: https://gist.github.com/codewaffle/b646acbd94cef0b84377

The Vagrantfile will execute the test during the provisioning step.

@subnetmarco
Member

@codewaffle thanks for the replication instructions. The first thing I noticed is that I ran my test with the Rate-Limiting plugin and not the Response Rate-Limiting plugin, so that's why I couldn't replicate it.

The Response Rate-Limiting plugin is a peculiar one: although it supports custom limits, it also introduces some limitations (as discussed in #733 (comment)), especially this part:

The reason why the previous implementation doesn't block the request in access_by_lua is that an upstream service may have more than one quota specified, and Kong doesn't know which one is going to be modified. Let's say I have an API that provides quotas on images (20 images/day) and videos (100 videos/day): the current PR would block any request that deals with videos if the image quota (or any other quota) has been reached, which is unexpected behavior.

I can go ahead and implement the following fix when there is only one quota:

if there is only one quota specified, and the user is already over his limit, then block the request

But I am open to hearing any feedback.
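The dilemma and the proposed single-quota fix can be sketched in Python (illustrative only; this is not Kong's Lua implementation, and the quota names and counts are hypothetical):

```python
# Illustrative sketch of the quota dilemma discussed above -- NOT Kong's
# actual implementation; quota names and counts are hypothetical.
quotas = {
    "images": {"limit": 20, "used": 20},   # exhausted
    "videos": {"limit": 100, "used": 3},   # plenty left
}

def naive_access_phase_block(quotas):
    """Blocking when ANY quota is exhausted rejects unrelated video requests."""
    return any(q["used"] >= q["limit"] for q in quotas.values())

def single_quota_fix(quotas):
    """It is only safe to block up front when exactly one quota exists."""
    if len(quotas) == 1:
        (q,) = quotas.values()
        return q["used"] >= q["limit"]
    return False  # multiple quotas: must wait for the upstream's response headers

print(naive_access_phase_block(quotas))  # True: a video request is blocked too
print(single_quota_fix(quotas))          # False: cannot decide pre-emptively
```

With a single quota, the access phase can decide safely; with two or more, any pre-emptive block risks punishing a request that would have decremented a different, non-exhausted quota.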

@subnetmarco subnetmarco reopened this Mar 28, 2016
@subnetmarco subnetmarco added area/plugins task/feature Requests for new features in Kong and removed task/bug labels Mar 28, 2016
@codewaffle
Author

From the user's perspective, the request was blocked, no matter whether they tripped one rate limit out of one or one out of fifty. The user cannot get results out of any request once they have exceeded any individual limit applied to an API, so it is extremely unnatural that these requests could still have state-changing effects internally.

This could be made workable if Kong presented headers to the backend API indicating when limits have been tripped, and my API could then be modified to account for that... but then we're moving a big chunk of the rate-limiting problem onto the external backend API, which is exactly the problem I'm trying to avoid by using Kong.

@subnetmarco
Member

@codewaffle how many custom limits did you setup in the plugin?

@codewaffle
Author

I'm only using two at the moment - but definitely more than one, so your proposed workaround would not help directly.

After reading through #733 again, I'm starting to understand the reasoning behind how it currently works: a client wants to READ something and is only allowed to read it x times per period. The extraneous requests to the server can be ignored, because the backend call itself has no side effects.

In my use case, a client wants to trigger a heavy background process, and I'm gating it with two limits: a 'call' limit and a 'fail' limit - if the client triggers failure results too quickly, I don't want them to be able to make normal calls either - so blocking off the entire API if ANY limit is tripped is exactly what I desire, and how I originally interpreted the documentation.

So it looks like I just misunderstood how the plugin was supposed to work. If a boolean flag could be added to cause requests to be pre-emptively discarded when any rate limit is currently met, this would still fit my use case perfectly; otherwise, it looks like moving rate limiting directly into my API is the way to go.
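The call/fail gating described above can be sketched as follows (the limit names and values are hypothetical, and this is application-side pseudologic, not Kong code):

```python
# Sketch of the dual-limit gating described above; limit names and values
# are hypothetical, and this is not Kong code.
limits = {
    "call": {"limit": 100, "used": 0},  # normal invocations per period
    "fail": {"limit": 5, "used": 0},    # failed invocations per period
}

def should_block(limits):
    """Desired semantics: tripping ANY individual limit blocks the whole API."""
    return any(l["used"] >= l["limit"] for l in limits.values())

def record_call(limits, failed):
    """Every call consumes a 'call' credit; failures also consume a 'fail' credit."""
    limits["call"]["used"] += 1
    if failed:
        limits["fail"]["used"] += 1

for _ in range(5):              # five quick failures...
    record_call(limits, failed=True)

print(should_block(limits))     # True: normal calls are now blocked as well
```

The point of the second limit is precisely the cross-blocking: once the fail budget is spent, the call budget is irrelevant.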

@subnetmarco
Member

if a boolean flag could be added to cause requests to be pre-emptively discarded if any rate limits are currently met.

This is a good idea. So there is a combination of solutions that we can take:

  • Introduce a new property in the plugin, called something like block_on_first_violation: when block_on_first_violation=true, the system will automatically block every request if any limit has been reached. This still leaves an edge case with bulk requests, because Kong doesn't know in advance how many units a request will consume (i.e. the limit is set to 5 image conversions, but a single request can convert multiple images: Kong doesn't know ahead of time how many images the request will convert).

To avoid this edge case, either block_on_first_violation automatically disables bulk requests (limits can only be incremented by 1, not more), or we introduce the following property:

  • disable_bulk: when set to true, limits can only be incremented by 1 and not more.

I am more tempted to merge the two so that block_on_first_violation also automatically disables bulk requests. Thoughts?
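A sketch of the merged proposal (the property names come from this discussion, but the semantics here are an assumption, not shipped Kong behavior):

```python
# Sketch of the two proposed properties. The names come from the comment
# above, but the semantics here are an assumption, not shipped Kong behavior.
def access_phase(limits, conf, increment=1):
    """Return an HTTP status: 429 to short-circuit, 200 to proxy upstream."""
    if conf.get("block_on_first_violation"):
        # Merged behavior: block_on_first_violation implies disable_bulk,
        # so a request may only ever consume one unit of any limit.
        if increment != 1:
            raise ValueError("bulk increments are disabled with block_on_first_violation")
        if any(l["used"] >= l["limit"] for l in limits.values()):
            return 429  # rejected before ever reaching the upstream
    return 200

limits = {"images": {"limit": 5, "used": 5}}
print(access_phase(limits, {"block_on_first_violation": True}))  # 429
print(access_phase(limits, {}))                                  # 200 (old behavior)
```

Because each request can only increment by 1 under this flag, the access phase never has to guess how many units the upstream will consume.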

@codewaffle
Author

The bulk problem can also affect non-bulk requests: if the user makes many requests at once, Kong won't know when to start limiting until those requests start returning responses.

As such, adding more distinctions between bulk and non-bulk would just muddy things further, in my opinion. block_on_first_violation would be an edge case as it is, and should probably work the same no matter how many points we're incrementing by.

Currently, the only way I can think of to reject the request without the backend fully processing it is to pass rate-limiting info to the backend (a la x-consumer-id) and inspect it on a per-endpoint basis, allowing the backend to reject early. The endpoint already knows it wants to consume a DEMO credit (it must, in order to construct the X-Kong-Limit header), so reading the incoming request headers to see whether Kong thinks any credits are left doesn't seem out of the question, and it means Kong doesn't need to know any specifics about the endpoint.
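The header-inspection fallback described above might look like this on the backend side (the header names here are hypothetical, chosen to mirror the x-ratelimit-remaining-* pattern mentioned earlier in the thread):

```python
# Sketch of the header-inspection fallback: the backend reads remaining-quota
# headers that Kong could forward (header names here are hypothetical) and
# rejects early, before any heavy work starts.
def handle(request_headers):
    """Return (status, message) for an incoming request."""
    for name, value in request_headers.items():
        if name.lower().startswith("x-ratelimit-remaining-") and int(value) <= 0:
            return 429, "quota exhausted, skipping heavy processing"
    # ...kick off the heavy background process here...
    return 202, "accepted"

print(handle({"X-RateLimit-Remaining-Demo-Hour": "0"}))   # status 429
print(handle({"X-RateLimit-Remaining-Demo-Hour": "12"}))  # status 202
```

The backend still does the early rejection itself, but it only needs the forwarded counters, not any knowledge of how Kong computes them.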
