Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MPIJob controller #578

Merged
merged 1 commit into from
Mar 3, 2023
Merged

Conversation

mimowo
Copy link
Contributor

@mimowo mimowo commented Feb 16, 2023

What type of PR is this?

/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #65

Special notes for your reviewer:

The follow ups:

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. labels Feb 16, 2023
@netlify
Copy link

netlify bot commented Feb 16, 2023

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 2d435c8
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/640207eb0657800008633f57

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 16, 2023
@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Feb 16, 2023
@mimowo mimowo marked this pull request as ready for review February 16, 2023 09:20
@tenzen-y
Copy link
Member

@mimowo Can we use a job interface (#537)? Or would you like to switch implementation using a job interface after #544 is completed?

@mimowo mimowo mentioned this pull request Feb 16, 2023
@mimowo
Copy link
Contributor Author

mimowo commented Feb 16, 2023

@mimowo Can we use a job interface (#537)? Or would you like to switch implementation using a job interface after #544 is completed?

As mentioned here: #65 (comment). For now, I'm doing a rough prototype by copy-paste to see what the obstacles are. I guess once it is working we could also abstract out the library based on the two implementations. This could also give us better clues of how the final interface should look like. I guess we need to sync up on this.

@mimowo mimowo marked this pull request as draft February 16, 2023 09:32
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 17, 2023
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 17, 2023
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 20, 2023
@mimowo mimowo marked this pull request as ready for review February 20, 2023 08:05
@mimowo mimowo force-pushed the mpijob-controller branch 4 times, most recently from 2f8b14b to c4db71a Compare February 21, 2023 11:34
Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the first feedback.

@@ -0,0 +1,27 @@
# permissions for end users to edit jobs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the long term, we may want to prepare a separate kustomization.yaml for each integration in the following:

config/
├── components
├── default
├── default_with_mpijob
├── default_with_training-operator
├── default_with_rayjob
├── default_with_all_integration
...

pkg/util/testing/wrappers_mpijob.go Outdated Show resolved Hide resolved
Comment on lines +105 to +112
// SetupWithManager sets up the controller with the Manager. It indexes workloads
// based on the owning jobs.
func (r *MPIJobReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&kubeflow.MPIJob{}).
Owns(&kueue.Workload{}).
Complete(r)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are no CRDs for MPIJob in the cluster, the kueue-controller always says logs in the following, although kueue works fine:

{"level":"error","ts":"2023-03-01T16:01:40.4642038Z","logger":"controller-runtime.source","caller":"source/source.go:143","msg":"if kind is a CRD, it should be installed before calling Start","kind":"MPIJob.kubeflow.org","error":"no matches for kind \"MPIJob\" in version \"kubeflow.org/v2beta1\"","stacktrace":"sigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1.1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.2/pkg/source/source.go:143\nk8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext\n\t/go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:235\nk8s.io/apimachinery/pkg/util/wait.WaitForWithContext\n\t/go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:662\nk8s.io/apimachinery/pkg/util/wait.poll\n\t/go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:596\nk8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext\n\t/go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:547\nsigs.k8s.io/controller-runtime/pkg/source.(*Kind).Start.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.14.2/pkg/source/source.go:136"}

Maybe we can skip to watch MPIJob using following library if there are no CRDs for MPIJob:

// If err is not nil, there are no CRDs for MPIJob in the cluster.
_, err := mgr.GetRESTMapper().RESTMapping(schema.GroupKind{Group: kubeflow.SchemeGroupVersion.Group, Kind: "MPIJob"})

WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's an overkill for now, as we plan to move this code to a separate binary

Copy link
Member

@tenzen-y tenzen-y Mar 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's an overkill for now, as we plan to move this code to a separate binary

Does that mean we will run controllers for CustomJobs as sidecar containers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either sidecars or separate deployments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sgtm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest this as a future improvement that may not be needed if we move the code out to a separate binary (probably next release though).

test/integration/controller/mpijob/suite_test.go Outdated Show resolved Hide resolved
Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use MPIJob or mpijob in messages and comments instead of Job or job?

pkg/controller/workload/mpijob/mpijob_controller.go Outdated Show resolved Hide resolved
pkg/controller/workload/mpijob/mpijob_controller.go Outdated Show resolved Hide resolved
pkg/controller/workload/mpijob/mpijob_controller.go Outdated Show resolved Hide resolved
pkg/controller/workload/mpijob/mpijob_controller.go Outdated Show resolved Hide resolved
pkg/controller/workload/mpijob/mpijob_controller.go Outdated Show resolved Hide resolved
pkg/controller/workload/mpijob/mpijob_controller.go Outdated Show resolved Hide resolved
pkg/controller/workload/mpijob/mpijob_controller.go Outdated Show resolved Hide resolved
@mimowo
Copy link
Contributor Author

mimowo commented Mar 1, 2023

Can we use MPIJob or mpijob in messages and comments instead of Job or job?

I have applied the suggestions. I think this will be easier to maintain once we commonize the code, which is planned, based on the interface: https://github.com/kubernetes-sigs/kueue/tree/main/keps/369-job-interface.

@tenzen-y
Copy link
Member

tenzen-y commented Mar 2, 2023

Can we use MPIJob or mpijob in messages and comments instead of Job or job?

I have applied the suggestions. I think this will be easier to maintain once we commonize the code, which is planned, based on the interface: https://github.com/kubernetes-sigs/kueue/tree/main/keps/369-job-interface.

Ah, I see. sgtm

@mimowo
Copy link
Contributor Author

mimowo commented Mar 2, 2023

/assign @kerthcet
Please complete the review and lgtm.

@mimowo mimowo force-pushed the mpijob-controller branch 2 times, most recently from b400105 to 015e75f Compare March 2, 2023 08:25
@kerthcet
Copy link
Contributor

kerthcet commented Mar 2, 2023

I'll take a look but maybe later(tomorrow I think), stuck with other things. Hope you don't mind. ^_^

@@ -519,7 +520,7 @@ func ConstructWorkloadFor(ctx context.Context, client client.Client,
job *batchv1.Job, scheme *runtime.Scheme) (*kueue.Workload, error) {
w := &kueue.Workload{
ObjectMeta: metav1.ObjectMeta{
Name: job.Name,
Name: GetWorkloadNameForJob(job.Name),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a user facing change we should highlight. But seems we can't do as in Kubernetes currently. Just to notify that this may be involved in the changeLog.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kind of, workloads are generally API objects that should be "technical details" to the users. But, it might be worth mentioning, I will ask to add the note when we prepare release notes for 0.3.0.

limitations under the License.
*/

package jobframework
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should rename the folder workload to job for Workload is a specific object in Kueue, then the dir tree looks like below, but that can be a separate commit.

job/
├── batchjob
├── mpijob
├── framework

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially, but this would make the diff under this PR unnecessarily big. If we decide on the folder rename this should be a dedicated PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I don't have a strong opinion on the layout. Maybe it can be discussed further when we commonize the code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with the proposal by @kerthcet, for another PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me finish the job library first, or I need to resolve the conflicts again.


// podsReady checks if all pods are ready or succeeded
func podsReady(job *kubeflow.MPIJob) bool {
for _, c := range job.Status.Conditions {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like when MPIJob is running, then it's ready. Not quite familiar with MPIJob operator, what will happen before job running?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the Running condition is set when all workers are running and the launcher: https://github.com/kubeflow/mpi-operator/blob/5946ef4157599a474ab82ff80e780d5c2546c9ee/pkg/controller/mpi_job_controller.go#L1041-L1045.
Before that, the MPIJobCreated is set, and MPIJobSuspended (if it is suspended).

pkg/controller/workload/mpijob/mpijob_webhook.go Outdated Show resolved Hide resolved
@@ -0,0 +1,27 @@
# permissions for end users to edit jobs.
Copy link
Contributor

@kerthcet kerthcet Mar 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BatchJob is an in-tree object, so I think it's ok to maintain the yaml here. And we can better organize it like we do in test/e2e/config(I haven't tried it yet). But I don't want to block this significant feature.

@mimowo mimowo force-pushed the mpijob-controller branch 2 times, most recently from 82b8f32 to 4dc53e4 Compare March 3, 2023 12:06
@mimowo
Copy link
Contributor Author

mimowo commented Mar 3, 2023

@kerthcet thank you for the review! I think I have addressed the remarks. Let me know if something more is needed.

@alculquicondor
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2023
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2023
@alculquicondor
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2023
@mimowo
Copy link
Contributor Author

mimowo commented Mar 3, 2023

Rebased and squashed. @alculquicondor please renew lgtm :)

@k8s-ci-robot k8s-ci-robot merged commit 6a956f8 into kubernetes-sigs:main Mar 3, 2023
Copy link
Member

@terrytangyuan terrytangyuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! We should add more docs and examples around this.

@kerthcet
Copy link
Contributor

kerthcet commented Mar 4, 2023

Thanks @mimowo that's a great start!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support kubeflow's MPIJob
7 participants