Stop deleting underlying services when federation service is deleted #37353
cc @saad-ali as FYI for 1.5. This is a bug fix for 1.5. Required in 1.5 since we are introducing cascading deletion based on DeleteOptions.OrphanDependents for federation resources. The federation service controller was deleting the underlying services even without that option, which is unexpected.
Brought up a cluster and verified that services are not deleted in underlying clusters and DNS records are not touched.
Other than that, a few minor nits. Please feel free to apply the LGTM label after addressing them.
Could you please also audit the federated service e2e tests to ensure that we don't accidentally leak services in the underlying clusters after this change?
// or we do nothing for service deletion
// TODO: Should uncomment this?
// We should delete the load balancer when service is deleted
// Look at the same method in kube service controller.
//err := s.dns.balancer.EnsureLoadBalancerDeleted(service)
Remove this as well. We are not going to do this.
umm. Why not? kube service controller calls this.
Leaving this as is. Can remove it in a separate PR if we want to.
@nikhiljindal by "this" I meant the TODO comment and commented code. What was your "this"?
What load balancer do you think we are going to delete here? Also, the commented code is wrong anyway. s.dns is an interface that doesn't contain a field called balancer. My gut feeling is that this was a copy-paste from the cluster-level service controller and doesn't apply here.
Please remove it. The comment change is making things worse. And the right fix for that is just removing it.
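To illustrate the point about s.dns: in Go, an interface value exposes only methods, so a selector such as s.dns.balancer cannot compile. The sketch below uses hypothetical names (dnsProvider, fakeDNS, ZoneNames, balancerOf) that are not taken from the federation code; it only shows that a field on the concrete type is reachable through a type assertion and never through the interface itself.

```go
package main

import "fmt"

// dnsProvider is a hypothetical interface standing in for the type of s.dns.
type dnsProvider interface {
	ZoneNames() []string
}

// fakeDNS is a hypothetical implementation that happens to carry a balancer field.
type fakeDNS struct {
	balancer string
}

func (f fakeDNS) ZoneNames() []string { return []string{"example.com."} }

// balancerOf reaches the field the only way Go allows: by asserting the
// interface value back to its concrete type. Writing dns.balancer directly
// is a compile error, because interfaces expose methods, not fields.
func balancerOf(dns dnsProvider) (string, bool) {
	concrete, ok := dns.(fakeDNS)
	if !ok {
		return "", false
	}
	return concrete.balancer, true
}

func main() {
	var dns dnsProvider = fakeDNS{balancer: "lb-1"}
	fmt.Println(dns.ZoneNames())

	name, ok := balancerOf(dns)
	fmt.Println(name, ok)
}
```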
ah you are right. Removed it. Thanks for clarifying
err := s.deleteClusterService(clusterName, cachedService, cluster.clientset)
if err != nil {
	hasErr = true
} else if err := s.ensureDnsRecords(clusterName, cachedService); err != nil {
We should ensure that this behavior is carried over to the cascading deletion. Could you please add a TODO somewhere? Or maybe open an issue? One of the two, either one is fine.
I don't see the effect of s.ensureDnsRecords() in PR #36390. Maybe I am missing something?
for clusterName, clusterClientset := range clusters {
	_, err := clusterClientset.Core().Services(service.Namespace).Get(service.Name)
	if err != nil {
		framework.Failf("Unexpected error in fetching service %s in cluster %s, %s", service.Name, clusterName, err)
I am not entirely sure about failing here. Shouldn't it log an error and continue?
Also, I would just put this in a defer block.
aah the comment "Cleanup" was wrong. This is the test. We want to verify that services are not deleted from underlying clusters. Removed the comment.
AfterEach deletes these services.
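A minimal sketch of the check being discussed, assuming a hypothetical serviceGetter interface and fakeCluster type in place of the real per-cluster clientsets (none of these names come from the e2e framework). It visits every cluster and collects the ones where the service can no longer be fetched, so a test could report all affected clusters at once instead of stopping at the first error.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// serviceGetter is a hypothetical stand-in for a per-cluster clientset;
// it returns nil if the named service exists in that cluster.
type serviceGetter interface {
	GetService(namespace, name string) error
}

// fakeCluster is a hypothetical in-memory cluster keyed by "namespace/name".
type fakeCluster struct {
	services map[string]bool
}

func (f fakeCluster) GetService(namespace, name string) error {
	if f.services[namespace+"/"+name] {
		return nil
	}
	return errors.New("service not found")
}

// clustersMissingService returns the sorted names of all clusters in which
// the service could not be fetched. An empty result means the service
// survived in every underlying cluster.
func clustersMissingService(clusters map[string]serviceGetter, namespace, name string) []string {
	var missing []string
	for clusterName, c := range clusters {
		if err := c.GetService(namespace, name); err != nil {
			missing = append(missing, clusterName)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	clusters := map[string]serviceGetter{
		"cluster-a": fakeCluster{services: map[string]bool{"default/web": true}},
		"cluster-b": fakeCluster{services: map[string]bool{}},
	}
	fmt.Println(clustersMissingService(clusters, "default", "web"))
}
```

In a real test the non-empty result would be turned into a single failure message listing every cluster that lost the service.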
82b5284 to e2540b7
Thanks @madhusudancs. Adding LGTM label.
@nikhiljindal please look at the comment.
e2540b7 to 34eae22
Thanks @madhusudancs
@k8s-bot test this
@k8s-bot cvm gce e2e test this issue: #IGNORE (stuck on Waiting for status to be reported)
@k8s-bot cvm gce e2e test this issue: #IGNORE (failed with "Error starting build")
Removing "requires release czar attention" as per offline discussion with saad last week.
Jenkins GCE e2e failed for commit 34eae22. Full PR test history. The magic incantation to run this job again is
k8s-bot cvm gce e2e test this (build failed)
@k8s-bot cvm gce e2e test this (build failed)
@k8s-bot cvm gce e2e test this
@k8s-bot gci gke e2e test this
Ack. Thanks
Unit tests seem to have been stuck for more than 11 hours.
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue
Commit found in the "release-1.5" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.
Fixes #36799
Fixing federation service controller to not delete services from underlying clusters when federated service is deleted.
None of the federation controllers should do this unless explicitly asked by the user using DeleteOptions. This is the only federation controller that does that.
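The rule stated above can be sketched as a small decision function. This is an illustration only: DeleteOptions here is a trimmed-down, hypothetical stand-in that models just the OrphanDependents field mentioned in this PR, and shouldDeleteFromClusters is an invented name, not a function in the federation service controller.

```go
package main

import "fmt"

// DeleteOptions is a hypothetical, trimmed-down stand-in for the API type.
// Only the field discussed in this PR is modeled.
type DeleteOptions struct {
	// OrphanDependents is nil when the user expressed no preference.
	OrphanDependents *bool
}

// shouldDeleteFromClusters reports whether deleting the federated service
// should also delete the services in the underlying clusters. Following the
// PR description, that happens only when the user explicitly asks for it,
// that is, when OrphanDependents is set and false.
func shouldDeleteFromClusters(opts *DeleteOptions) bool {
	if opts == nil || opts.OrphanDependents == nil {
		return false // not asked: leave the underlying services alone
	}
	return !*opts.OrphanDependents
}

func main() {
	orphan, cascade := true, false
	fmt.Println(shouldDeleteFromClusters(nil))
	fmt.Println(shouldDeleteFromClusters(&DeleteOptions{OrphanDependents: &orphan}))
	fmt.Println(shouldDeleteFromClusters(&DeleteOptions{OrphanDependents: &cascade}))
}
```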
cc @kubernetes/sig-cluster-federation @madhusudancs