Error Code: NatGatewayLimitExceeded - Even with a higher limit than the error reports. #560
Comments
Thanks for reporting this! It's very unfortunate... We've been discussing adding a NAT gateway per AZ, so that an AZ failure doesn't cut private subnets in other AZs off from the world (#392), but it looks like even a single NAT gateway is already a little problematic. Are you using private subnets at all? I wonder if we can eventually find a solution that provisions NAT gateways only when a nodegroup gets deployed into the private subnets. Did this happen continuously for a little while and then stop being an issue? If so, perhaps the limit has eventually consistent semantics? |
I spun up another cluster yesterday without issues, so I couldn't collect logs. I might try to bring up another one today and see if I can get it to error again. Like I said, it's only about 30% of the time. Up until yesterday I was only creating clusters using private networking. But the first time I tried to spin one up without |
OK, it happened to me the first time: https://gist.github.com/chs-bnet/fd9a3c66b6965a9faa05d089cd925b61 |
Seems like this spurious CF limit error would have to be a CF issue rather than eksctl?
I would actually suggest splitting create cluster into create network and create cluster, even if create cluster has the default behavior to run create network first. create network would make the VPC/subnets/routing tables and gateways, and export the resource names that a cluster needs.
That way you can deploy three clusters in a VPC with shared gateways with:
create network --name foo
create cluster --name prod --network-stack=foo
create cluster --name preprod --network-stack=foo
create cluster --name uat --network-stack=foo
And you make sensible/efficient use of NAT Gateways. And you still have the simple create cluster (just don't specify --network-stack). The only difference is that it would create two CF stacks, one network, one cluster. And using the simple case wouldn't preclude adding a second cluster later. |
Yes, separate commands for VPC and IAM were already discussed, and there are multiple reasons to have them. The challenge is whether resources should be in the same stack or a separate stack; it is sort of a compatibility question. I am sure we can come up with a good answer, and I like what you suggested here. But it is a separate topic, we have a few issues that cover it, so perhaps it would be a good idea to start with a concrete design proposal doc.
However, wrt the use of shared network stacks, there is always a trade-off around IP addresses as well.
|
I'm greatly in favor of separate, non-nested stacks with suitable exports/imports. Exports are globally named, but nothing a configurable prefix in eksctl can't address. We will eventually need to do this one way or another, but it seems like the default combined cluster+VPC+IAM stack option is easier for basic usage. What is the trade-off around IPs with shared network stacks? |
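For context, CloudFormation export names are unique per region, so a prefix scheme avoids collisions between stacks. A minimal sketch of listing the exports under a hypothetical eksctl-foo- prefix with the AWS CLI (the prefix and query are illustrative, not an existing eksctl convention):
aws cloudformation list-exports --query "Exports[?starts_with(Name, 'eksctl-foo-')].Name" --output text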
Just that a subnet doesn't have infinitely many addresses if you want to have a lot of pods; some users who have their VPC linked to other networks have few IPs to spare and prefer overlays.
|
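As a rough illustration of that trade-off (assuming the default VPC CNI, which assigns every pod its own VPC address): a /24 subnet yields 2^(32-24) - 5 = 251 usable addresses, since AWS reserves 5 per subnet, so several clusters sharing the same private subnets compete for those 251 IPs per subnet, whereas an overlay network only consumes VPC addresses for the nodes themselves.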
Just noticed this was tagged as awaiting information. This issue is still an ongoing problem for us. Is there any more information that you actually need, or was my previous log sufficient? |
@chs-bnet it's clear to me that this is ongoing, thanks for clarifying that. Did you try increasing the limit to something much higher? To be honest, all we can do here is provide an option to disable the NAT Gateway, and/or offer an Egress-only NAT Gateway... The real issue is completely outside of our control, unless I'm missing something. As far as I'm aware, there is no API for limits. Overall, this sounds like flaky behaviour on the AWS side, but I'm surprised that only you are seeing this particular behaviour. To be clear, private subnets and the NAT Gateway are created regardless of whether private networking is requested or not. So would an option to use no NAT Gateway or an Egress-only NAT Gateway be sufficient for you? |
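"Egress-only" here presumably maps to AWS's egress-only internet gateway, which allows outbound-only IPv6 traffic. A minimal sketch of creating one with the AWS CLI, using a hypothetical VPC ID:
aws ec2 create-egress-only-internet-gateway --vpc-id vpc-0123456789abcdef0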
What happened?
Occasionally getting "limit exceeded" errors when the limit isn't actually exceeded.
Error: NatGatewayLimitExceeded
What you expected to happen?
The limits for the account where this is being run have been increased to 20. At the time of EKS creation, there are only 7 active NAT Gateways. There shouldn't be any error.
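One way to confirm the number of active NAT Gateways at creation time is the AWS CLI (a sketch, assuming credentials and region match the ones eksctl uses):
aws ec2 describe-nat-gateways --filter Name=state,Values=available --query 'length(NatGateways)'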
How to reproduce it?
It doesn't happen every time, but maybe 1/3 of the time and it seems random. But to reproduce just do
eksctl create cluster
in an account with a limit higher than 5 NAT Gateways.
Versions
Using aws-iam-authenticator. The version command doesn't output a version number, but it's the latest per the installation instructions from Amazon as of about 2 weeks ago.
Logs
Since it's random, it's hard to collect the logs. But I will update the issue with them when I can.