-
Notifications
You must be signed in to change notification settings - Fork 299
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Update MultiKueue to support Kubeflow Jobs ManagedBy #4116
[Feature] Update MultiKueue to support Kubeflow Jobs ManagedBy #4116
Conversation
Skipping CI for Draft Pull Request. |
/ok-to-test |
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
|
1497ed2
to
066d9b1
Compare
/retest |
066d9b1
to
9ac33d8
Compare
/retest |
9ac33d8
to
3f54d20
Compare
/retest |
3f54d20
to
70e4321
Compare
/lgtm |
LGTM label has been added. Git tree hash: f90c413c312ee765ea444ba1947c7364ac07935f
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
Great work! Looking forward to defaulting of the managedBy field for Kubeflow!
# Modify the `newTag` for the `kubeflow/training-operator` to use the one training-operator version | ||
$YQ eval '(.images[] | select(.name == "kubeflow/training-operator").newTag) = env(KUBEFLOW_IMAGE_VERSION)' -i "$KUBEFLOW_MANIFEST_MANAGER/kustomization.yaml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cleanup is great to see. Our users will benefit from that simplification of the setup too!
@@ -454,6 +454,7 @@ var _ = ginkgo.Describe("MultiKueue", func() { | |||
ginkgo.It("Should run a kubeflow PyTorchJob on worker if admitted", func() { | |||
// Since it requires 1600M of memory, this job can only be admitted in worker 2. | |||
pyTorchJob := testingpytorchjob.MakePyTorchJob("pytorchjob1", managerNs.Name). | |||
ManagedBy(kueue.MultiKueueControllerName). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be needed when we have the webhook to add the ManagedBy by default for Kubeflow, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually yes, you are right, the behaviour should be the same with empty field and this value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome. Thank you for your cross-project contribution!
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mimowo, mszadkow, tenzen-y The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Support Kubeflow Training-Operator Jobs managedBy feature for the MultiKueue.
Allow for installing training-operator in management cluster instead of only crds.
Which issue(s) this PR fixes:
Relates to #2552
Special notes for your reviewer:
Does this PR introduce a user-facing change?