
Support For Cluster Addons #400

Closed
mhenriks opened this issue Aug 13, 2018 · 16 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@mhenriks

KubeVirt [1] is a cluster addon for running virtual machines in Kubernetes. I am looking into creating an Operator and integrating with OLM to better manage the lifecycle of KubeVirt installations. But since KubeVirt is a cluster addon, it has some unique requirements/restrictions.

The main thing is that I only want a single instance of the KubeVirt application to exist in the entire cluster. It can be installed into any namespace, but only once. So, if I create an instance of my KubeVirtApplication CRD in namespace "kubevirt-system", I shouldn't be able to create another instance in that namespace, or in any other namespace. I'm sure there are ways to implement this restriction in code, but I think some SDK support would be nice, assuming this is in line with the philosophy of the Operator Framework.

What do you think? Is this within the scope of the Operator Framework? Should there be additional support in the SDK/OLM for this case?

[1] https://github.com/kubevirt/kubevirt

@hasbro17
Contributor

@mhenriks It's hard to say if the SDK could support something like this.
I'm not entirely sure how you would enforce "no more than one instance (CR) of a CRD across all namespaces."

First off, I'm assuming the operator would have to be a cluster-wide operator that watches all namespaces.
After the first CR is created, the operator would have to ignore all other CRs of the same type (across all namespaces).

I don't know if it's possible to restrict the creation of new CRs. Maybe through an admission webhook, but I think that would be beyond the scope of what an operator is expected to do.

The operator could possibly ignore all CRs after the first one, but I'm not clear on the implementation details. It sounds a bit like leader election between CRs of the same type.
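
Something roughly like the following, maybe. This is only a sketch, assuming a controller-runtime-style reconciler; the `v1alpha1.KubeVirtApplication` types and import path are hypothetical stand-ins:

```go
// Sketch only: reconcile just the oldest CR of the type and ignore the rest.
package controller

import (
	"context"
	"sort"

	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	v1alpha1 "example.com/kubevirt-operator/api/v1alpha1" // hypothetical CRD types
)

type ReconcileKubeVirtApp struct {
	client client.Client
}

func (r *ReconcileKubeVirtApp) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
	// Cluster-wide operator: list CRs of this type across all namespaces.
	list := &v1alpha1.KubeVirtApplicationList{}
	if err := r.client.List(ctx, list); err != nil {
		return reconcile.Result{}, err
	}
	if len(list.Items) == 0 {
		return reconcile.Result{}, nil
	}

	// "Leader election between CRs": the oldest creation timestamp wins.
	sort.Slice(list.Items, func(i, j int) bool {
		return list.Items[i].CreationTimestamp.Before(&list.Items[j].CreationTimestamp)
	})
	leader := &list.Items[0]

	if req.Namespace != leader.Namespace || req.Name != leader.Name {
		// Not the first CR; ignore it (or mark it failed via status).
		return reconcile.Result{}, nil
	}

	// ... reconcile the single active KubeVirt installation here ...
	return reconcile.Result{}, nil
}
```

Picking the oldest creation timestamp is just one arbitrary-but-deterministic way to choose the "leader" CR; any stable ordering would do.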

If you could elaborate on any ideas you might have on how this restriction could be implemented in an operator, that would give us more context to see if the SDK is the right place for this.

/cc @ecordell I don't know if this is something that could be enforced via OLM, or if this model of a cluster-wide operator and a singleton CR even works with OLM.

@fabiand

fabiand commented Aug 14, 2018

First off I'm assuming the operator would have to be a cluster-wide operator that watches all namespaces.

Maybe it helps to say that I could imagine the specific namespace the add-on should run in being fixed, not variable.
Just like it is with some add-ons in, e.g., OpenShift Origin (IIRC).

@fabiand

fabiand commented Aug 14, 2018

xref #236

@ecordell
Member

@mhenriks I think the best approach here would be to use ResourceQuotas (see kubernetes/kubernetes#64201 for enabling object count quotas for CRs).
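
For illustration, a sketch of what that could look like once object count quotas cover CRs. The resource and group names here are hypothetical, and note that a ResourceQuota only caps objects within one namespace, not cluster-wide:

```go
// Sketch: cap KubeVirtApplication CRs at one via an object count quota,
// using the "count/<resource>.<group>" quota key syntax.
package quota

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func singletonQuota() *corev1.ResourceQuota {
	return &corev1.ResourceQuota{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "kubevirt-singleton", // hypothetical names
			Namespace: "kubevirt-system",
		},
		Spec: corev1.ResourceQuotaSpec{
			Hard: corev1.ResourceList{
				// Hypothetical resource/group; per-namespace limit only.
				"count/kubevirtapplications.kubevirt.io": resource.MustParse("1"),
			},
		},
	}
}
```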

If you need to restrict the number before that feature is ready, I would just check the cluster state in the operator and write out a status ("status: failed, reason: there's already one in the namespace").
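
Roughly like this, as a sketch, reusing the hypothetical reconciler and CRD types from the earlier comment; Phase/Reason are made-up status fields:

```go
// Sketch: record a failure on a duplicate CR instead of reconciling it.
// Requires the status subresource to be enabled on the CRD.
func (r *ReconcileKubeVirtApp) markDuplicate(ctx context.Context, cr *v1alpha1.KubeVirtApplication) error {
	cr.Status.Phase = "Failed" // hypothetical status fields
	cr.Status.Reason = "a KubeVirtApplication already exists in this cluster"
	return r.client.Status().Update(ctx, cr)
}
```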

Should there be additional support in the SDK/OLM for this case?

I'd like to chat more offline about the requirements for KubeVirt; there might be some things we want to change to better support KubeVirt's deployment with OLM.

@fabiand

fabiand commented Aug 14, 2018

+1, and let me highlight that KubeVirt is just one example of an add-on. Hopefully there is no need to highlight this ;)

@mhenriks
Author

@hasbro17 Yeah, as far as implementation goes, I was originally thinking that a validating webhook would be one way to go, but ResourceQuotas may be better.
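
For reference, the validating webhook version might look roughly like this. It's a sketch using controller-runtime's admission package, with the same hypothetical types as above; note the list-then-deny check is racy under concurrent creates unless something serializes admissions:

```go
// Sketch: a validating admission webhook that denies creation of a second
// KubeVirtApplication anywhere in the cluster.
package webhook

import (
	"context"
	"net/http"

	admissionv1 "k8s.io/api/admission/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/webhook/admission"

	v1alpha1 "example.com/kubevirt-operator/api/v1alpha1" // hypothetical CRD types
)

type singletonValidator struct {
	client client.Client
}

func (v *singletonValidator) Handle(ctx context.Context, req admission.Request) admission.Response {
	// Only gate CREATE; updates/deletes of the existing CR stay allowed.
	if req.Operation != admissionv1.Create {
		return admission.Allowed("")
	}
	existing := &v1alpha1.KubeVirtApplicationList{}
	if err := v.client.List(ctx, existing); err != nil {
		return admission.Errored(http.StatusInternalServerError, err)
	}
	if len(existing.Items) > 0 {
		// Racy under concurrent creates; good enough as a guardrail.
		return admission.Denied("a KubeVirtApplication already exists in this cluster")
	}
	return admission.Allowed("")
}
```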

@ecordell I'll be in touch!

@dmage

dmage commented Aug 23, 2018

The OpenShift integrated registry is a singleton too. If all CRs share the same storage, they'll just be a kind of replica and everything will work fine. Though in normal scenarios there should be exactly one CR in the cluster, and perhaps that should be enforced.

/cc @bparees @legionus

@fabiand

fabiand commented Aug 23, 2018

I think some work that needs to happen before the operator is adjusted is to work out how add-ons should be laid out in general.

In the past we saw infrastructure components land in kube-system, but now we also see dedicated namespaces.

The insight might be that the creation of certain CRs is limited to a namespace, or that the corresponding CRDs are namespaced.

@bparees
Contributor

bparees commented Aug 23, 2018

@dmage I imagine we'd also want to enforce that there is exactly one instance of the registry operator (running in the openshift-image-registry namespace). Presumably that is also true for other cluster singleton operators.

For 4.0 I suspect we're just going to live with "don't install the operator in another namespace and don't create extra registry CR instances" though.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label on May 9, 2019
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Jun 8, 2019
@fabiand

fabiand commented Jun 13, 2019 via email

@openshift-ci-robot removed the lifecycle/rotten label on Jun 13, 2019
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot added the lifecycle/stale label on Sep 11, 2019
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label on Oct 11, 2019
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
