Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple rate limiting for agent rpc calls. #3140

Closed
wants to merge 1 commit into from

Conversation

wojtkiewicz
Copy link
Contributor

@wojtkiewicz wojtkiewicz commented Jun 12, 2017

Hi,
This PR is related to OOM outage that we encountered on our production.
Details can be found here: https://groups.google.com/forum/#!topic/consul-tool/YYnhFeC1qi4

After some discussion we got feedback from @slackpad that you are interested in some basic form of rate limiting for consul agents to prevent abusive clients from causing the cluster to become unstable.

In this PR only agents with client mode are limited. Rate limiting for simplicity covers all rpc calls regardless if came from dns, http or sync.

Rate limiting is based on https://godoc.org/golang.org/x/time/rate.

Configured by RPCRate and RPCMaxBurst. We recommend these values not to be lower than expected number of services registered locally on a single consul agent. By default RPCRate is set to infinite and RPCMaxBurst is ignored.

There is also two new metrics added:

  • consul.client.rpc.rate: rate of rpc calls before rate limiting, can be used with a threshold to trigger monitoring events when agent is getting closer to its limit
  • consul.client.rpc.exceeded.rate: how often client is rate limited, good candidate to trigger monitoring events that agent is actually rate limited

Sample configuration:
{"rpc_rate": 100, "rpc_max_burst": 100}

@wojtkiewicz
Copy link
Contributor Author

Builds were failing for last two weeks on master. Doesn't seem to be related to anything I changed.

@wojtkiewicz
Copy link
Contributor Author

@slackpad could you tell me if basic premise of this PR is good?

I can keep this PR up to date with master but so far I got no feedback which worries me a little bit.

@slackpad
Copy link
Contributor

Hi @wojtkiewicz sorry for the delay! Yes this looks like a useful feature. Can you do a rebase and we can work on reviewing this and getting it merged?

@wojtkiewicz wojtkiewicz force-pushed the agent-rpc-rate-limiter branch 2 times, most recently from a950d1f to 88aa3ee Compare August 10, 2017 08:00
@wojtkiewicz
Copy link
Contributor Author

@slackpad rebased

@slackpad
Copy link
Contributor

slackpad commented Sep 1, 2017

Thanks for your work on this @wojtkiewicz - this will land in 0.9.3!

@wojtkiewicz
Copy link
Contributor Author

awesome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants