Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metrics service is not garbage collected #459

Closed
pepov opened this issue Sep 6, 2018 · 6 comments
Closed

Metrics service is not garbage collected #459

pepov opened this issue Sep 6, 2018 · 6 comments

Comments

@pepov
Copy link

pepov commented Sep 6, 2018

I think it would make sense to set an ownerreference to the metrics service to be garbage collected when the operator is gone. One idea I have in mind is to use the downwardAPI to pass the pod name and set the ownerreference directly to it, since it will recreate it anyways. The other idea I have is to create the service from the yaml and just update it if needed from the operator. This way users can know that there is a resource that needs to be removed (helm takes care of it for example).

@pepov
Copy link
Author

pepov commented Sep 6, 2018

I can work on this if any of the proposed way makes sense.

@bofm
Copy link

bofm commented Sep 21, 2018

Maybe an owner reference for the metrics service could help? The owner reference should point to the operator deployment.

@pepov
Copy link
Author

pepov commented Sep 21, 2018

I agree with the ownerref, but not sure about marking the deployment as the owner because it's not necessarily a deployment that creates the operator. What about looking up the operator pod's ownerref and setting it to whatever has been set there?

@shawn-hurley
Copy link
Member

I do think this is a change that we will want. I'll defer to @hasbro17 to make sure.

I think that for now, using the downward API, and setting the owner reference as the pod makes sense for now.

@lilic lilic added the metrics label Nov 16, 2018
@hasbro17
Copy link
Contributor

We turned off the metrics service creation in v0.1.0 as we're currently reworking how to expose both the controller specific metrics(queue length, number of reconciles) and app specific metrics via separate registries and ports #715

But yes we should have the service be owned by the operator so it's garbage collected.
We could have the operator-pod own the service it creates, but we need to then consider what happens if we have 2 replica pods for the operator deployment(now that we support leader election #724 ).

Note that the metrics service selects all operator pods.
https://github.com/operator-framework/operator-sdk/blob/master/pkg/k8sutil/k8sutil.go#L79
Although if we're setting readiness checks then the non-leader pod endpoints would not be selected by the service since they're Unready.

I imagine we could have the following workflow:

  • leader pod is removed/deleted
  • metrics service owned by leader pod is GCed
  • another pod becomes the leader
  • waits until the previous metrics service is fully deleted
  • recreates the service with the ownerref set to itself

@lilic Thoughts. We can probably address this alongside #715

@lilic
Copy link
Member

lilic commented Nov 29, 2018

Will do this as soon as the initial metrics PR is merged! #786

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants