[Discussion] Operators vs. controller pattern #300
Comments
I think they are two patterns: the operator pattern was introduced by CoreOS, while the controller pattern is the one used in Kubernetes itself: https://github.com/kubernetes/kubernetes/tree/master/pkg/controller.
Since CoreOS coined the term "Operator", their article is the authority on what they mean by that:
To paraphrase: All Operators use the controller pattern, but not all controllers are Operators. It's only an Operator if it's got: controller pattern + API extension + single-app focus. TfJob is a good example of an Operator, because it's a custom controller + CRD that's focused only on running one particular app (TensorFlow). Things like BlueGreenDeployment or IndexedJob implement general patterns that apply abstractly to "whatever it is you're running", so although they are also custom controllers + CRDs, they are not Operators.
My 2 cents. An Operator is a custom controller implemented with a CRD. It follows the same pattern as the built-in controllers (i.e. watch, diff, act). The key idea of an Operator is to give you a framework for doing extra operations while installing or scaling instances, e.g. registering a new instance with the master in its onAdd() handler. These operations can include alerting and acting on failure, backup, reconfiguration, etc. But the app itself is still deployed with a Deployment, ReplicaSet, or even StatefulSet; the Operator just gives you a way to automatically "operate" them by following the controller pattern. Based on the above, I guess you are actually writing your own version of an Operator. 😃 You may want to consider using it directly. My friend @hongchaodeng from CoreOS would be the best person to settle this discussion; please correct my random words if anything is wrong.
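The "watch, diff, act" cycle mentioned above can be sketched in a few lines. This is only an illustration with an in-memory set standing in for the cluster; the names `Action`, `diff`, `act`, `reconcile` and the `app-N` instance naming are assumptions made for the example, not part of any Kubernetes or Operator API.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Action:
    verb: str  # "create" or "delete"
    name: str  # hypothetical instance name


def diff(desired_replicas: int, observed: Set[str], prefix: str = "app") -> List[Action]:
    """Compare the desired state with the observed state and list the actions needed."""
    wanted = {f"{prefix}-{i}" for i in range(desired_replicas)}
    creates = [Action("create", n) for n in sorted(wanted - observed)]
    deletes = [Action("delete", n) for n in sorted(observed - wanted)]
    return creates + deletes


def act(observed: Set[str], actions: List[Action]) -> Set[str]:
    """Apply the actions to an in-memory stand-in for the cluster."""
    result = set(observed)
    for action in actions:
        if action.verb == "create":
            result.add(action.name)
        else:
            result.discard(action.name)
    return result


def reconcile(desired_replicas: int, observed: Set[str]) -> Set[str]:
    """One pass of the loop; the 'watch' step is the caller handing in the observed state."""
    return act(observed, diff(desired_replicas, observed))
```

Calling `reconcile(3, {"app-0"})` would create the two missing instances, and a second call with the result produces no further actions, which is the idempotence a controller loop relies on.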
This is very helpful. My key takeaway is that, in the context of the TfJob CRD, "operator vs. controller" is mostly semantics and doesn't really refer to a different design for the TfJob controller. To be more specific, the TfJob controller was created by copying the CoreOS etcd operator, which wasn't using Informer and Controller classes the way https://github.com/kubernetes/sample-controller does. My working assumption is that the etcd-operator preceded the existence of these libraries and that's why it didn't use them.
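For readers unfamiliar with the Informer-plus-work-queue style that sample-controller uses, here is a much simplified sketch of the idea: event handlers only update a local cache and enqueue an object key, and a worker later reads the latest state for each key. The classes `WorkQueue` and `Informer` and the function `drain` below are hypothetical stand-ins written for this example; they are not the client-go types and omit rate limiting, retries, and resyncs.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Set


class WorkQueue:
    """FIFO queue of object keys that ignores a key already waiting to be processed."""

    def __init__(self) -> None:
        self._items: Deque[str] = deque()
        self._pending: Set[str] = set()

    def add(self, key: str) -> None:
        if key not in self._pending:
            self._pending.add(key)
            self._items.append(key)

    def get(self) -> Optional[str]:
        if not self._items:
            return None
        key = self._items.popleft()
        self._pending.discard(key)
        return key


class Informer:
    """Keeps a local cache of objects and enqueues the key on every change."""

    def __init__(self, queue: WorkQueue) -> None:
        self.cache: Dict[str, dict] = {}
        self._queue = queue

    def on_event(self, key: str, obj: Optional[dict]) -> None:
        if obj is None:
            self.cache.pop(key, None)  # deletion
        else:
            self.cache[key] = obj  # add or update
        self._queue.add(key)


def drain(queue: WorkQueue, informer: Informer) -> List[tuple]:
    """Process each queued key once, reading the latest state from the cache."""
    seen = []
    while (key := queue.get()) is not None:
        seen.append((key, informer.cache.get(key)))
    return seen
```

Two quick updates to the same key result in a single queue entry, and the worker sees only the most recent state. That is why handlers in this style stay small and all the logic lives in the reconcile step.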
It is helpful, thanks :-) I was actually thinking of etcd-operator when we talk about operators. After looking through the Prometheus Operator and the kong operator, I found that they differ in implementation although they are all operators. So I agree with enisoc@ now.
An Operator is just a concept, not a pattern, so I stand corrected. We copied the code from etcd-operator, and I have a question about the implementation:
And as resouer@ said, hongchaodeng@ could give us more helpful information about it :-) I am not sure I understand the code, so if there is something I missed, please correct me :-)
I was sent a link here - very interesting conversation. We're starting to embrace CRDs on OpenFaaS - there are other distinctions at play here, i.e. the difference between an event-driven controller with "owner references" for its CRDs vs. a controller that simply polls for state and remediates that way.
@alexellis what are power references?
Typo - owner references - claims about CRDs made by a controller to receive relevant events.
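To make the owner-reference idea concrete: an event-driven controller that receives an event for a child object (say, a Pod) can look at the child's `metadata.ownerReferences` to find which parent resource to reconcile. The sketch below works on plain dicts shaped like Kubernetes object metadata; the helper name `owner_key` and the fallback namespace are assumptions made for this example.

```python
from typing import Optional


def owner_key(obj: dict, owner_kind: str) -> Optional[str]:
    """Return 'namespace/name' of the controlling owner of the given kind, if any."""
    meta = obj.get("metadata", {})
    for ref in meta.get("ownerReferences", []):
        # Only the reference marked controller=True identifies the managing parent.
        if ref.get("controller") and ref.get("kind") == owner_kind:
            return f"{meta.get('namespace', 'default')}/{ref['name']}"
    return None
```

A controller for TfJob would call something like `owner_key(pod, "TfJob")` in its Pod event handler and enqueue the returned key, ignoring Pods that return `None` because they belong to something else.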
An operator is a Kubernetes controller that understands 2 domains: Kubernetes and something else. By combining knowledge of both domains, it can automate tasks that usually require a human operator that understands both domains.
This is correct. Within CoreOS, internally and externally, there are a variety of operators each with different designs depending on what the operator needs to do. There is an extremely common design where you use a Custom Resource to represent a group of Kubernetes resources that your operator is managing. The etcd Operator is an example of this: it uses the EtcdCluster resource to represent all of the Kubernetes resources required for a single cluster and it handles any extra logic for those resources (e.g. how to properly replace a node in the etcd quorum). Within reason, I highly recommend you structure your code like the sample-controller repository. The etcd operator is the original code that inspired the idea of an operator and, as such, is not necessarily using the best practices.
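The "one Custom Resource represents a group of Kubernetes resources" design can be illustrated as a function that expands a parent object into the children it should own. This is a rough sketch only: the `spec.size` field, the child kinds, and the naming scheme are assumptions chosen for the example and are not taken from the etcd operator's actual code.

```python
from typing import Dict, List


def desired_children(cluster: Dict) -> List[Dict]:
    """Expand a parent custom resource into the child objects it should own."""
    name = cluster["metadata"]["name"]
    size = cluster["spec"]["size"]  # hypothetical field: number of members
    owner = {"kind": cluster["kind"], "name": name, "controller": True}

    # One client-facing Service plus one Pod per member (illustrative choice).
    children = [
        {"kind": "Service", "metadata": {"name": f"{name}-client", "ownerReferences": [owner]}}
    ]
    for i in range(size):
        children.append(
            {"kind": "Pod", "metadata": {"name": f"{name}-{i}", "ownerReferences": [owner]}}
        )
    return children
```

The app-specific logic that makes something an operator, such as replacing a quorum member safely, would sit on top of this expansion rather than inside it.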
@gaocegege @jlewi Sorry for the late reply.
I strongly agree with the viewpoint above. I also think https://github.com/kubernetes/sample-controller represents best practice for operators, and that is what we followed in https://github.com/caicloud/kubeflow-controller. We are now planning to merge our implementation into upstream. I believe @gaocegege will open a PR to do the merging work this week.
Are operators and controllers actually two different patterns or just different terminology?
When I originally created the TfJob controller I based it on the CoreOS etcd operator.
We are refactoring the code to be a controller (#206) and to use more of the K8s infrastructure that supports controllers.
However, as far as I can tell operators aren't fundamentally architected differently than controllers. Am I missing something?
/cc @gaocegege @wackxu @enisoc