DNS-Controller hitting AWS changeset limit after 1.7 upgrade #3121
Comments
@bcorijn where are those logs located?
@shavo007 log output from the dns-controller pod. I managed to fix it by deleting some annotations, which caused me to drop below 1000 records in the changeset. Upping the logging to v8 showed me that it had the impression that all my nodes were modified and my records needed to be replaced, which was a delete + add action for each node / annotation combination.
This just happened again, this time without any update. I did add a new IG though with one instance, which I guess has triggered all records to be refreshed to add another IP to them.
Looking into the AWS R53 API for some other projects, I think the UPSERT action could help here.
Will use half the operations vs REMOVE + ADD. Issue kubernetes#3121
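As a rough sketch of what a single UPSERT looks like against the Route 53 API (the hosted zone ID, record name, and values below are made up, and this is not the actual dns-controller code):

```go
package main

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/route53"
)

func main() {
	svc := route53.New(session.Must(session.NewSession()))

	// A single UPSERT change replaces the record set in place, instead of
	// submitting a DELETE of the old set followed by a CREATE of the new one.
	_, err := svc.ChangeResourceRecordSets(&route53.ChangeResourceRecordSetsInput{
		HostedZoneId: aws.String("Z0000000EXAMPLE"), // hypothetical hosted zone ID
		ChangeBatch: &route53.ChangeBatch{
			Changes: []*route53.Change{
				{
					Action: aws.String(route53.ChangeActionUpsert),
					ResourceRecordSet: &route53.ResourceRecordSet{
						Name: aws.String("nodes.example.com."), // hypothetical record
						Type: aws.String(route53.RRTypeA),
						TTL:  aws.Int64(60),
						ResourceRecords: []*route53.ResourceRecord{
							{Value: aws.String("10.0.0.1")},
							{Value: aws.String("10.0.0.2")},
						},
					},
				},
			},
		},
	})
	if err != nil {
		panic(err)
	}
}
```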
I had a go at a more complete fix in #3860 as well. I think we want to vendor the dnsprovider library, which makes it fairly intrusive - but I feel like we should have vendored dnsprovider a long time ago so we could fix this stuff faster.
Good to hear this got into 1.8; moving to an upsert should already move the limit a lot higher.
This blocks next for 1.9 now?
This is affecting us too, on k8s
This is addressed in kops 1.8.x, with more changes coming in 1.9.x.
Automatic merge from submit-queue. Copy dnsprovider into our code, implement route53 batching. Fixes #3121
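A minimal sketch of the batching idea, splitting a large list of changes into chunks so no single ChangeResourceRecordSets call exceeds the API limits (the helper name and batch size are illustrative, not the actual vendored dnsprovider code):

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/service/route53"
)

// splitChanges chunks a list of Route 53 changes into batches of at most
// batchSize entries, so each batch can be submitted as its own request.
func splitChanges(changes []*route53.Change, batchSize int) [][]*route53.Change {
	var batches [][]*route53.Change
	for len(changes) > 0 {
		n := batchSize
		if len(changes) < n {
			n = len(changes)
		}
		batches = append(batches, changes[:n])
		changes = changes[n:]
	}
	return batches
}

func main() {
	changes := make([]*route53.Change, 2500) // e.g. one change per node/record combination
	for i, batch := range splitChanges(changes, 900) {
		// Each batch would go out as a separate ChangeResourceRecordSets request.
		fmt.Printf("batch %d: %d changes\n", i, len(batch))
	}
}
```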
@justinsb We just tried the updated image and still hit this. Definitely running the new image.
We built dns-controller from master and dropped the batch size. @pwillie is working on a PR to make the batch size an argument.
Pull request here: #4496
Yeah, I ran into it again myself a few days ago on 1.8. The PR looks good to give people who hit this a workaround!
Just ran the 1.9.0-alpha.2 controller and still ran into this.
Can this be cherry-picked to 1.8.x?
By adding the flag: --route53-batch-size=100
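For anyone applying this workaround, one way (assuming the usual manifest layout; names and existing args will differ per cluster) is to edit the dns-controller Deployment in kube-system, e.g. `kubectl -n kube-system edit deployment dns-controller`, and add the flag to the container args, roughly like this:

```yaml
# Hypothetical excerpt of the dns-controller Deployment; only the added
# flag is the point here, the surrounding args are examples.
spec:
  template:
    spec:
      containers:
      - name: dns-controller
        args:
        - --dns=aws-route53
        - --zone=*/*                 # keep your existing zone flags as-is
        - --route53-batch-size=100   # new flag from #4496; some report 50 works better
        - -v=2
```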
For me, setting it to 50 worked. Thanks.
Hi,
I tried rolling my cluster from 1.6.4 to 1.7.2 with kops 1.7.0 an hour or so ago, starting with 2 of my 3 masters. I noticed that after doing this my kubectl commands became very slow, something that in the past had been a symptom of the api.* records in Route53 being incorrect. Checking them indeed showed that the new IPs had not been recorded.
Next suspect was a faulty DNS-Controller, but I saw it had been updated to v1.7.1 and had started correctly. The logs, however, do show something interesting: it is trying to replace all my record sets with new IPs. I believe this is already a first bug, since the only change was to my masters; there should be no changes to my records, as those only include nodes?
Besides that, it seems that the number of nodes times the number of records I have exceeds the AWS limit. This causes the following error:
So none of my records are currently being updated. Rolling dns-controller back to 1.6.1 hits the same issue; I presume I just never had this many records to change before...
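For context on the limit being hit: a single Route 53 ChangeResourceRecordSets request is capped at 1,000 ResourceRecord elements (and 32,000 characters of record data). As a rough illustration with made-up numbers, 25 record sets that each list one value per node across 25 nodes, each replaced as a DELETE plus a CREATE, put 25 × 25 × 2 = 1,250 ResourceRecord elements into one change batch, which the API rejects.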