Change dns healthcheck to look at external domain #3282
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: carlpett Assign the PR to them by writing The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/assign @CecileRobertMichon |
I think this is based on the temporary workaround for the intermittent DNS issues we had. I believe we should do both internal and external lookups. |
@carlpett could you kindly update the manifest accordingly? Thanks so much! |
@carlpett and thanks a lot for this! |
Yes, I thought a while about that too, but I don't think there's an (obvious) way to do both? Each command/domain lookup is tied to one url, which is then used as a livenessProbe. And we can't have multiple probes per container. |
Yes, exactly. The health probe depends on the command's exit code; it does not matter how many commands we execute. |
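To illustrate the point about exit codes, here is a minimal sketch (not the manifest change from this PR) of how several lookups can be folded into one health result. The domain names, the `run_nslookup` helper, and the `combined_exit_code` function are all hypothetical and chosen only for illustration; the sketch assumes an `nslookup` binary is available wherever the check runs.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical names, used only for illustration.
INTERNAL_NAME = "kubernetes.default.svc.cluster.local"
EXTERNAL_NAME = "example.com"


def run_nslookup(name: str) -> int:
    """Run `nslookup <name>` and return its exit code.

    Returns 1 if the binary is missing or the lookup times out.
    """
    try:
        return subprocess.run(
            ["nslookup", name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=5,
        ).returncode
    except (OSError, subprocess.TimeoutExpired):
        return 1


def combined_exit_code(
    names: Sequence[str],
    lookup: Callable[[str], int] = run_nslookup,
) -> int:
    """Return 0 only if every lookup exits 0.

    Stops at the first failure and returns its code, which mirrors how a
    shell evaluates a chain of commands joined with `&&`.
    """
    for name in names:
        code = lookup(name)
        if code != 0:
            return code
    return 0
```

A single probe command could then exit with `combined_exit_code([INTERNAL_NAME, EXTERNAL_NAME])`, so one liveness probe covers both the internal and the external lookup.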
Codecov Report
@@ Coverage Diff @@
## master #3282 +/- ##
=======================================
Coverage 53.31% 53.31%
=======================================
Files 104 104
Lines 15574 15574
=======================================
Hits 8304 8304
Misses 6537 6537
Partials 733 733 |
I found the readability a bit poor when chaining it with |
@jackfrancis Anything else needed here? |
@carlpett I'm kicking E2E to see how it likes these changes. Thanks for your patience. |
@jackfrancis If I read the test output correctly, this is the cause:
I'm not sure how I could have caused connection refused on Windows nodes... Is it flaky? |
lgtm
* 'master' of https://github.com/Azure/acs-engine: (59 commits)
  Docs: Update user guide list to include Windows, update description of clusters (Azure#3473)
  update to Azure CNI v1.0.10 (Azure#3551)
  Adding 'make dev' equivalent for Windows (Azure#3471)
  print out ubuntu ver in e2e (Azure#3555)
  fix an issue where networkPlugin was not defined correctly when using calico or cilium (Azure#3271)
  Bump ginkgo to a tagged release (Azure#3554)
  Reenable AzureFile tests for Windows on K8s 1.11.1, resolves Azure#3439 (Azure#3496)
  removing rbac error checking from merge fn (Azure#3530)
  Change dns healthcheck to look at external domain (Azure#3282)
  DOCUMENTATION: Fix Documented Default Value for clusterSubnet (Azure#3474)
  Document required manual calico 2.6.3 -> calico 3.1.1 upgrade when upgrading from < 0.17.0-provisioned clusters (Azure#3208)
  revert --image-pull-policy=IfNotPresent for win (Azure#3553)
  --image-pull-policy=IfNotPresent for kubectl run commands (Azure#3552)
  Kubernetes: --max-pods=30 should be Azure CNI-only (Azure#3543)
  disable Azure CNI network monitor addon default (Azure#3550)
  only do az vm list for k8s (Azure#3540)
  Retire Swarm E2E for PR test coverage (Azure#3539)
  retire Azure CDN for container image repository proxying (Azure#3535)
  removed datadisk to allow scale after upgrade (Azure#3482)
  Pump k8s-azure-kms version (Azure#3531)
  ...
What this PR does / why we need it: The healthcheck for kube-dns should verify that recursive queries succeed, so that it can detect network issues outside the cluster. Related to #2971.
Special notes for your reviewer: Implements the suggested solution from this comment