Let's try to minimize the code changes. If
/hold
Sure. I shall finish the gang part and try to maximize code re-use in the next commit.
@Jeffwan While I'm still working on adding comments and unit-tests for Let me know if I need to rebase it first.
// JobInterface defines the abstract interface for Pod related actions, such like get, create or delete TFJob,
// PyTorchJob or KFJob
type JobInterface interface {
It seems this brings in more interfaces. I notice this change is very large and includes a lot of refactoring. Maybe it's better to add more details in the issue for the training leads to review?
It would be great to cover the following topics:
- What's the advantage of using reconciler.v1 over controller.v1?
- What are the user-facing changes? For example, if a user wants to build an operator, how does this solution benefit their development?
/cc @kubeflow/wg-training-leads
For the questions here, I updated #141 (comment) to elaborate on the benefits of reconciler.v1.
But from my perspective, I would keep both (reconciler and controller) for developers/users at different levels.
/cc @kubeflow/wg-training-leads Please help review the proposal.
In general, do we need to keep the low-level API code in the longer run? Can we discuss this in the next meeting before merging?
@johnugeorge Sure, this is a big change that requires user-facing (operator-facing) changes. Let's reach an agreement in the meeting first.
/hold
I think the The concern on keeping both
@johnugeorge Sure. Let's discuss this at the community meeting on August 11th.
/cc @kubeflow/wg-automl-leads
go.mod
@@ -23,4 +26,6 @@ replace (
	k8s.io/apimachinery => k8s.io/apimachinery v0.19.9
	k8s.io/client-go => k8s.io/client-go v0.19.9
	k8s.io/code-generator => k8s.io/code-generator v0.19.9
	sigs.k8s.io/controller-runtime => sigs.k8s.io/controller-runtime v0.7.2
	github.com/prometheus/client_golang v1.11.0 => github.com/prometheus/client_golang v1.7.1
Do we need this replace, or can we remove it since we are using the v0.21.3 version of the K8s packages?
I'm trying to keep the Go packages in common and tf-operator consistent. We can discuss this dependency issue at the next community meeting (August 11th).
If we are using v0.21.3, shall I get rid of the other replace items like k8s.io/client-go => k8s.io/client-go v0.19.9?
3b28ba7 to 7551d6b
/cc @alculquicondor
This is a long PR. Would it be worth splitting the go.mod upgrades into their own commit or PR? Are there any specific parts where you would like my input, given that I'm still unfamiliar with this project?
}

// BareKubeflowJobReconciler returns the pointer of a KubeflowJobReconciler with minimal implementation
func BareKubeflowJobReconciler(client client.Client) *KubeflowJobReconciler {
Should we use a term like "Base"?
While Bare means that only the methods independently defined in JobInterface are implemented in the returned KubeflowJobReconciler pointer, I think Base conveys the same message.
The scope and code overall look good to me. Other leads, please leave comments. /cc @kubeflow/wg-training-leads
a2a0424 to 0daf75e
Overall looks good to me. I think this is safe to merge, and it doesn't introduce a user-facing low-level JobController (just some reorganization to extract common code). If we plan to support a CustomJob that can support arbitrary roles, reconciler.v1 would be a good start. For existing frameworks, we can make plans later. This is not a blocking issue for the 1.4 release or the beta release of the all-in-one operator. We can probably introduce this in the v2 APIs and controllers. @kubeflow/wg-training-leads Please double-check. Let's hold it for two days.
I agree with @Jeffwan. For existing frameworks, this is not necessary at this point. We can plan for it later.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: Jeffwan
/hold cancel
#140
The common repository offers a controller.v1 package for developers to build Kubeflow operators. The controller mode exposes low-level APIs for operators that need heavy customization; these APIs are mostly about the controller mechanism, such as pre-processing/post-processing for enqueue/dequeue actions.
However, for entry-level developers or users who may not be that familiar with controllers, exposing too many low-level APIs could be confusing. Meanwhile, with the effort from controller-runtime, kubebuilder and operator-sdk, the reconciler mode offers high-level APIs that are decoupled from the controller mechanism and relate only to how to meet the resource expectations defined in declarative APIs like TFJob, MXNetJob, etc.
To add the reconciler.v1 package while keeping code re-use between the controller.v1 and reconciler.v1 packages, a core package is extracted to be shared between reconciler.v1 and controller.v1.
reconciler.v1 defines its API in reconciler.v1/common/interface.go, which exposes only methods related to reconciliation. A base/parent implementation named KubeflowReconciler is defined so developers only need to override the methods in reconciler.v1/common/must_customize.go.
reconciler.v1 package (controller.v1 is also modified to extract shared functionality)
reconciler.v1
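To illustrate the "base implementation plus must-customize overrides" idea described above, here is a minimal Go sketch. It is not the code from this PR: JobHooks, BaseReconciler, Override, TFJobReconciler, and ErrMustCustomize are all hypothetical names, and the real interfaces in reconciler.v1 are larger. It only shows one way a base reconciler could provide defaults while a framework-specific reconciler overrides the methods that have no sensible default.

```go
package main

import (
	"errors"
	"fmt"
)

// JobHooks is a hypothetical, much smaller stand-in for a
// reconciliation-only interface.
type JobHooks interface {
	DesiredReplicas(jobName string) (int, error) // no sensible default
	RestartLimit() int                           // has a usable default
}

// ErrMustCustomize is returned by base methods that have no sensible default.
var ErrMustCustomize = errors.New("method must be customized")

// BaseReconciler is a hypothetical base/parent implementation. It calls
// through the hooks field so that overrides are visible to shared logic.
type BaseReconciler struct {
	hooks JobHooks
}

// NewBaseReconciler returns a base reconciler that uses its own defaults.
func NewBaseReconciler() *BaseReconciler {
	r := &BaseReconciler{}
	r.hooks = r
	return r
}

// Override swaps in a more specific implementation of the hooks.
func (r *BaseReconciler) Override(h JobHooks) { r.hooks = h }

// DesiredReplicas has no default and must be overridden.
func (r *BaseReconciler) DesiredReplicas(string) (int, error) {
	return 0, ErrMustCustomize
}

// RestartLimit is a default that most job kinds could keep.
func (r *BaseReconciler) RestartLimit() int { return 3 }

// Reconcile is shared logic written once against the hooks.
func (r *BaseReconciler) Reconcile(jobName string) (string, error) {
	n, err := r.hooks.DesiredReplicas(jobName)
	if err != nil {
		return "", fmt.Errorf("reconcile %s: %w", jobName, err)
	}
	return fmt.Sprintf("%s: want %d replicas, restart limit %d",
		jobName, n, r.hooks.RestartLimit()), nil
}

// TFJobReconciler embeds the base and overrides only what it must.
type TFJobReconciler struct {
	*BaseReconciler
}

// NewTFJobReconciler wires the override so Reconcile sees it.
func NewTFJobReconciler() *TFJobReconciler {
	t := &TFJobReconciler{BaseReconciler: NewBaseReconciler()}
	t.Override(t)
	return t
}

// DesiredReplicas is the only method this job kind customizes.
func (t *TFJobReconciler) DesiredReplicas(string) (int, error) { return 2, nil }

func main() {
	r := NewTFJobReconciler()
	msg, err := r.Reconcile("mnist")
	fmt.Println(msg, err)
}
```

The explicit Override call is a design choice in this sketch: Go struct embedding does not give virtual dispatch, so shared logic in the base has to call through an interface value for the overridden methods to take effect.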