-
Notifications
You must be signed in to change notification settings - Fork 324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support upgrades for connect refactor #509
Conversation
@@ -925,11 +927,321 @@ func TestReconcileUpdateEndpoint(t *testing.T) { | |||
expectedProxySvcInstances []*api.CatalogService | |||
expectedAgentHealthChecks []*api.AgentCheck | |||
}{ | |||
// Legacy services are not managed by endpoints controller, but endpoints controller | |||
// will still add/update the legacy service's health checks. | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reviewers since EP controller tests are long:
The tests I added were that for a legacy service:
- the health check is added when the pod is healthy
- the health check is added when the pod is unhealthy
- the health check is updated from healthy --> unhealthy
- the health check is updated from unhealthy --> healthy
Since when an agent is rolled it will lose the health checks, I wanted to test that they are added back for a legacy service, and that they are updated if they already exist.
@@ -210,6 +223,111 @@ func (r *EndpointsController) SetupWithManager(mgr ctrl.Manager) error { | |||
).Complete(r) | |||
} | |||
|
|||
// getServiceCheck will return the health check for this pod and service if it exists. | |||
func getServiceCheck(client *api.Client, healthCheckID string) (*api.AgentCheck, error) { | |||
filter := fmt.Sprintf("CheckID == `%s`", healthCheckID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we may encounter a bug similar to the one where we weren't using %q
when defining a filter in connect-init
? I'm guessing the `'s around "%s" make it work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup! And I didn't notice issues testing it end to end, and that codepath would be used for updating the health check which I did test :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry @ndhanushkodi. I couldn't complete this review. I removed myself from reviewers and added the plat team back and @ishustava was assigned.
I didn't get to finish but overall my concerns are that we're intertwining the code paths for endpoints-controller managed pods and legacy pods and that I'm worried that this is going to set us up for confusion.
I was looking to see if there was a clean way at the top of the for
loop to deal with the legacy pod and then continue the loop without carrying the isRegisteredByEndpointsCtrl
boolean throughout the rest of the loop code. I think that someone will add something at the bottom of the loop and be unaware that the code is dealing with legacy pods.
@@ -263,21 +379,13 @@ func (r *EndpointsController) createServiceRegistrations(pod corev1.Pod, service | |||
Address: pod.Status.PodIP, | |||
Meta: meta, | |||
Namespace: r.consulNamespace(pod.Namespace), | |||
Check: &api.AgentServiceCheck{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if my understanding is wrong: we're not creating the check here because we're going to create it after the service register so the check also gets created for non-endpoints-ctrl pods?
If so, why don't we create it here and then after serviceRegister if the service isn't managed by endpoints controller we create the check?
connect-inject/handler.go
Outdated
@@ -247,6 +247,10 @@ func (h *Handler) Handle(_ context.Context, req admission.Request) admission.Res | |||
} | |||
pod.Labels[keyInjectStatus] = injected | |||
|
|||
// Add the managed-by label since services are now managed by endpoints controller. This is to support upgrading | |||
// from consul-k8s without Endpoints controller to consul-k8s with Endpoints controller. | |||
pod.Labels[keyManagedBy] = endpointsController |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make more sense as an annotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it fit well with the pattern for the keyInjectStatus label. So annotations are how users configure our software to run, and labels are things we put on pods as we manage them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving for now to not block.
Yes
Addressed below.
@lkysow I'm addressing these comments here: The reason why I made the health check registration separate from the service registration was so we could have logic along the lines of:
I could see how a future contributor might add logic meant for endpoints-controller managed pods to the end of that, and have that apply to legacy and endpoints controller managed pods. One option to reduce confusion is with a comment describing that the code after the if managedByEndpointsController block applies to legacy and endpoints controller pods. I will go ahead and do this option in this PR so when this is merged with the existing logic it's less confusing. Then, in a future PR, we could make a refactor to have the logic look like this:
@ishustava and @kschoche I'm looking for feedback on whether you think I should go forward with the future refactor option after merging this PR with some more documentation. I'm also looking for feedback on that refactor idea. |
Before the connect refactor, service registration in Consul was managed by the lifecycle sidecar, which would re-register the service with Consul every 10s. Now, service registration is managed by Endpoints controller. In order to support upgrades to the refactored Endpoints controller, we need Endpoints controller to NOT register or deregister any services managed by lifecycle sidecar. To do this, the annotation consul.hashicorp.com/connect-inject-managed-by is added to pods managed by endpoints controller, so endpoints controller will ignore older services managed by lifecycle sidecar (legacy services) for service registration/deregistration. To support health checks for legacy services, the Endpoints controller will always update the healthcheck for any pod, whether it's managed by Endpoints controller or not.
Legacy services have the proxy healthcheck coupled to the service registration, so that can remain in endpoints controller as well.
I like the refactor idea to make it more clear. |
Likewise! |
Thanks @ishustava @kschoche ! |
Before the connect refactor, service registration in Consul was managed by the lifecycle sidecar, which would re-register the service with Consul every 10s. Now, service registration is managed by Endpoints controller.
In order to support upgrades to the refactored Endpoints controller, we need Endpoints controller to NOT register or deregister any services managed by lifecycle sidecar. To do this, the annotation
consul.hashicorp.com/connect-inject-managed-by
is added to pods by the mutating webhook, so endpoints controller will ignore older services managed by lifecycle sidecar (legacy services) for service registration/deregistration.To make sure endpoints controller only deregisters services managed by endpoints controller, a meta key indicating the service is managed by endpoints controller is added to the Consul service registration.
To support health checks for legacy services, the Endpoints controller will always update the healthcheck for any pod, whether it's managed by Endpoints controller or not. It will only do this for the service healthcheck, not the proxy healthchecks.
How I've tested this PR:
How I expect reviewers to test this PR:
Code review and if you can, the steps above.
Checklist: