Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't fail when Redis is misbehaving, e.g. OOM command not allowed when used memory > 'maxmemory' #511

Open
Nowaker opened this issue Dec 20, 2020 · 6 comments · May be fixed by #577
Open

Comments

@Nowaker
Copy link

Nowaker commented Dec 20, 2020

Rack Attack is written so it doesn't fail if Redis is down on application startup. That's cool.

However, it will bring down the application if Redis server starts misbehaving at later point, for example:

Redis::CommandError: OOM command not allowed when used memory > 'maxmemory'

Preferably, we should ignore any Redis errors and pass the request to the next middleware.

Alternatively, we should allow users to define an exception handler for any errors that happen inside Rack Attack (not the next middleware), with an option to bypass the exception and continue processing. Example:

Rack::Attack.exception_handler = lambda do |exception|
  Rails.logger.error exception # do whatever you want here
  raise # Re-raises the exception, therefore failing the request. Without an explicit `raise` in the handler, the processing would continue
end

Here's the full stacktrace from when the problem happened and killed my application:

lib/redis/client.rb:121 call  
lib/redis.rb:769 block in setex 
lib/redis.rb:58 block in synchronize  
/home/ec2-user/.rbenv/versions/2.6.3/lib/ruby/2.6.0/monitor.rb:230 mon_synchronize  
lib/redis.rb:58 synchronize 
lib/redis.rb:768 setex  
redis-store (1.6.0) lib/redis/store/interface.rb:17 setex 
redis-store (1.6.0) lib/redis/store/serialization.rb:13 block in setex  
redis-store (1.6.0) lib/redis/store/serialization.rb:40 _marshal  
redis-store (1.6.0) lib/redis/store/serialization.rb:13 setex 
redis-store (1.6.0) lib/redis/store/namespace.rb:11 block in setex  
redis-store (1.6.0) lib/redis/store/namespace.rb:89 namespace 
redis-store (1.6.0) lib/redis/store/namespace.rb:11 setex 
redis-store (1.6.0) lib/redis/store/ttl.rb:6 set  
redis-store (1.6.0) lib/redis/store/serialization.rb:5 block in set 
redis-store (1.6.0) lib/redis/store/serialization.rb:40 _marshal  
redis-store (1.6.0) lib/redis/store/serialization.rb:5 set  
redis-store (1.6.0) lib/redis/store/namespace.rb:7 block in set 
redis-store (1.6.0) lib/redis/store/namespace.rb:89 namespace 
redis-store (1.6.0) lib/redis/store/namespace.rb:7 set  
redis-activesupport (5.2.0) lib/active_support/cache/redis_store.rb:282 block (2 levels) in write_entry 
redis-activesupport (5.2.0) lib/active_support/cache/redis_store.rb:270 with  
redis-activesupport (5.2.0) lib/active_support/cache/redis_store.rb:282 block in write_entry  
redis-activesupport (5.2.0) lib/active_support/cache/redis_store.rb:340 failsafe  
redis-activesupport (5.2.0) lib/active_support/cache/redis_store.rb:280 write_entry 
redis-activesupport (5.2.0) lib/active_support/cache/redis_store.rb:86 block in write 
lib/active_support/cache.rb:547 block in instrument 
lib/appsignal/hooks/active_support_notifications.rb:21 block in instrument  
lib/active_support/notifications/instrumenter.rb:20 instrument  
lib/appsignal/hooks/active_support_notifications.rb:35 instrument 
lib/appsignal/hooks/active_support_notifications.rb:20 instrument 
lib/active_support/cache.rb:547 instrument  
redis-activesupport (5.2.0) lib/active_support/cache/redis_store.rb:81 write  
/home/ec2-user/.rbenv/versions/2.6.3/lib/ruby/2.6.0/delegate.rb:83 method_missing 
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack/store_proxy/active_support_redis_store_proxy.rb:34 write 
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack/store_proxy/active_support_redis_store_proxy.rb:21 increment 
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack/cache.rb:74 do_count 
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack/cache.rb:27 count  
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack/throttle.rb:31 matched_by? 
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack/configuration.rb:93 block in throttled?  
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack/configuration.rb:92 any? 
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack/configuration.rb:92 throttled? 
/var/www/apps/app/shared/bundle/ruby/2.6.0/bundler/gems/rack-attack-6d1bc9b617c5/lib/rack/attack.rb:118 call
(... other middlewares ...)
@Nowaker
Copy link
Author

Nowaker commented Dec 20, 2020

Here's how I monkey-patched it to achieve the desired result:

# Monkey-patch of https://github.com/rack/rack-attack/blob/master/lib/rack/attack.rb#L102
# Resolves https://github.com/rack/rack-attack/issues/511
class Rack::Attack
  def safely default_return_value = false
    yield
  rescue Exception => e
    Rails.logger.warn "Rack Attack error bypassed: #{e.message}. #{e.backtrace&[0]}"
    Appsignal.send_error e
    default_return_value
  end

  def call(env)
    return @app.call(env) if !self.class.enabled || env["rack.attack.called"]

    env["rack.attack.called"] = true
    env['PATH_INFO'] = PathNormalizer.normalize_path(env['PATH_INFO'])
    request = Rack::Attack::Request.new(env)

    if safely { configuration.safelisted?(request) }
      @app.call(env)
    elsif safely { configuration.blocklisted?(request) }
      # Deprecated: Keeping blocklisted_response for backwards compatibility
      if configuration.blocklisted_response
        configuration.blocklisted_response.call(env)
      else
        configuration.blocklisted_callback.call(request)
      end
    elsif safely { configuration.throttled?(request) }
      # Deprecated: Keeping throttled_response for backwards compatibility
      if configuration.throttled_response
        configuration.throttled_response.call(env)
      else
        configuration.throttled_callback.call(request)
      end
    else
      safely { configuration.tracked?(request) }
      @app.call(env)
    end
  end
end

@jeremyhaile
Copy link

We recently had a problem where our servers were being overloaded. Most of our requests failed in rack-attack waiting on redis responses. The redis server itself seemed ok, so I think we were just out of network bandwidth on the servers (i.e. not enough servers)

But I'd like to think that rack attack shouldn't be the point of failure for scalability, considering if this was a DOS attack it would have been successful. I'm trying to determine if adding a more aggressive redis timeout would have helped. But if the redis calls did timeout, I think it would have still failed without a change similar to the one suggested here?

@Nowaker
Copy link
Author

Nowaker commented Jun 30, 2021

But if the redis calls did timeout, I think it would have still failed without a change similar to the one suggested here?

That is correct as the timeout would end up as an exception, and these abort the execution.

By the way, I didn't really consider timeouts in my fix... If it timeouts after 30 or 60 seconds, the user would have already abandoned the application. It's effectively still a dead website, even with my fix.

I wonder if it's possible to configure the Redis client with a very short timeout, e.g. 1 second, so that a timeout exception is raised very quickly, and we move to the next middleware ASAP. If that's possible, it would be great.

@jeremyhaile
Copy link

I wonder if it's possible to configure the Redis client with a very short timeout, e.g. 1 second, so that a timeout exception is raised very quickly, and we move to the next middleware ASAP.

Yes, we were thinking about the same thing. By default rack-attack uses Rails.cache, but perhaps you could configure it with your own redis cache that uses a driver configured with a shorter timeout than what your other redis clients use.

@johnnyshields
Copy link

johnnyshields commented Mar 18, 2022

Amen to this. We were bit by it this morning.

I propose there should be two modes in response to Redis (i.e. cache) unavailability:

  1. Hard fail (current behavior)
  2. Disable Rack-Attack for the request, so system continues to work.

I would propose that #2 be the default behavior, because I suspect the vast majority of users would have little use for DoS mitigation if the entire system goes down (i.e. Redis DoSes your system for you!) This will of course require a relatively low timeout setting--something like 50~100ms Redis connection timeout would be recommended.

I will look at making a PR for this. Part of the problem is that the Rails cache layer tends to suppress errors and return nil... will see what can be done here.

@johnnyshields johnnyshields linked a pull request Mar 18, 2022 that will close this issue
@johnnyshields
Copy link

PR raised here: #577

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants