Don't fail when Redis is misbehaving, e.g. OOM command not allowed when used memory > 'maxmemory' #511
Here's how I monkey-patched it to achieve the desired result:

# Monkey-patch of https://github.com/rack/rack-attack/blob/master/lib/rack/attack.rb#L102
# Resolves https://github.com/rack/rack-attack/issues/511
class Rack::Attack
  # Runs the block and swallows any exception so a misbehaving Redis
  # doesn't take the whole request down with it.
  def safely(default_return_value = false)
    yield
  rescue Exception => e
    Rails.logger.warn "Rack Attack error bypassed: #{e.message}. #{e.backtrace&.first}"
    Appsignal.send_error e
    default_return_value
  end

  def call(env)
    return @app.call(env) if !self.class.enabled || env["rack.attack.called"]

    env["rack.attack.called"] = true
    env['PATH_INFO'] = PathNormalizer.normalize_path(env['PATH_INFO'])
    request = Rack::Attack::Request.new(env)

    if safely { configuration.safelisted?(request) }
      @app.call(env)
    elsif safely { configuration.blocklisted?(request) }
      # Deprecated: Keeping blocklisted_response for backwards compatibility
      if configuration.blocklisted_response
        configuration.blocklisted_response.call(env)
      else
        configuration.blocklisted_callback.call(request)
      end
    elsif safely { configuration.throttled?(request) }
      # Deprecated: Keeping throttled_response for backwards compatibility
      if configuration.throttled_response
        configuration.throttled_response.call(env)
      else
        configuration.throttled_callback.call(request)
      end
    else
      safely { configuration.tracked?(request) }
      @app.call(env)
    end
  end
end
We recently had a problem where our servers were being overloaded. Most of our requests failed in rack-attack while waiting on Redis responses. The Redis server itself seemed OK, so I think we were just out of network bandwidth on the servers (i.e. not enough servers). But I'd like to think that rack-attack shouldn't be the point of failure for scalability; if this had been a DoS attack, it would have been successful. I'm trying to determine whether adding a more aggressive Redis timeout would have helped. But if the Redis calls did time out, I think it would have still failed without a change similar to the one suggested here?
That is correct, as the timeout would end up as an exception, and those abort the execution. By the way, I didn't really consider timeouts in my fix... If it times out after 30 or 60 seconds, the user will have already abandoned the application; it's effectively still a dead website, even with my fix. I wonder if it's possible to configure the Redis client with a very short timeout, e.g. 1 second, so that a timeout exception is raised very quickly and we move on to the next middleware ASAP. If that's possible, it would be great.
Yes, we were thinking about the same thing. By default rack-attack uses Rails.cache, but perhaps you could configure it with your own Redis cache that uses a driver configured with a shorter timeout than what your other Redis clients use.
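As a sketch of that idea (not an official recommendation): recent rack-attack versions can proxy a plain Redis client assigned to Rack::Attack.cache.store, so you could hand it a client built with aggressive timeouts. The URL, environment variable, and timeout values below are illustrative, assuming the redis-rb client:

require 'redis'

# A Redis client dedicated to rack-attack that gives up quickly instead of
# queueing requests behind a slow or unreachable server. Timeouts are in seconds.
fast_failing_redis = Redis.new(
  url: ENV.fetch('RACK_ATTACK_REDIS_URL', 'redis://localhost:6379/1'),
  connect_timeout: 0.2,
  read_timeout: 0.5,
  write_timeout: 0.5
)

# Point rack-attack at this client instead of the default Rails.cache.
Rack::Attack.cache.store = fast_failing_redis

A timeout still surfaces as an exception, so on its own this only shortens the outage per request; it would need to be combined with a rescue like the monkey-patch above to actually let traffic through.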
Amen to this. We were bitten by it this morning. I propose there should be two modes in response to Redis (i.e. cache) unavailability:
1. Fail closed: treat cache errors as fatal, so requests keep failing (effectively the current behavior, where the exception propagates).
2. Fail open: skip the throttling checks and pass the request through to the application.
I would propose that #2 be the default behavior, because I suspect the vast majority of users would have little use for DoS mitigation if the entire system goes down (i.e. Redis DoSes your system for you!). This will of course require a relatively low timeout setting; something like a 50 to 100 ms Redis connection timeout would be recommended. I will look at making a PR for this. Part of the problem is that the Rails cache layer tends to suppress errors and return nil... will see what can be done here.
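For reference, the suppression mentioned above comes from the Rails Redis cache store rescuing connection errors and handing them to an error_handler before returning a fallback value (usually nil) to the caller. A minimal sketch of making those failures at least visible, assuming ActiveSupport::Cache::RedisCacheStore; the URL and log message are illustrative:

require 'active_support/cache/redis_cache_store'

# By default RedisCacheStore rescues Redis connection errors, logs them, and
# returns `returning` (nil) to the caller, so rack-attack never sees those
# exceptions. A custom error_handler can surface the failures while keeping
# that fail-open behavior.
store = ActiveSupport::Cache::RedisCacheStore.new(
  url: ENV['RACK_ATTACK_REDIS_URL'],
  error_handler: ->(method:, returning:, exception:) do
    Rails.logger.warn(
      "rack-attack cache #{method} failed (returning #{returning.inspect}): #{exception.class}"
    )
  end
)

Rack::Attack.cache.store = store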
PR raised here: #577
Rack Attack is written so it doesn't fail if Redis is down on application startup. That's cool.
However, it will bring down the application if the Redis server starts misbehaving at a later point, for example: OOM command not allowed when used memory > 'maxmemory'.
Preferably, we should ignore any Redis errors and pass the request to the next middleware.
Alternatively, we should allow users to define an exception handler for any errors that happen inside Rack Attack (not the next middleware), with an option to bypass the exception and continue processing. Example:
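A minimal sketch of what such a hook could look like; `error_handler` here is a hypothetical configuration option (it does not exist in rack-attack today), and ErrorReporter is a placeholder for whatever error tracking the application uses:

# Hypothetical API: let the application decide what happens when the store errors out.
Rack::Attack.error_handler = lambda do |exception, request|
  ErrorReporter.notify(exception)  # placeholder for the app's error tracker
  :allow                           # hypothetical directive: pass the request to the next middleware
end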
Here's the full stacktrace from when the problem happened and killed my application: