Provide Kubeflow PyTorchJob support for MultiKueue #2735
Skipping CI for Draft Pull Request.
✅ Deploy Preview for kubernetes-sigs-kueue ready!
/ok-to-test
7384585 to 48c87ee
/retest
overall LGTM
@@ -459,6 +460,88 @@ var _ = ginkgo.Describe("MultiKueue", func() {
			}, util.Timeout, util.Interval).Should(gomega.Succeed())
		})
	})

	ginkgo.It("Should run a kubeflow PyTorchJob on worker if admitted", func() {
		pyTorchJob := testingpytorchjob.MakePyTorchJob("pytorchjob1", managerNs.Name).
Add a comment on which worker we expect to run it and why.
(nit: maybe we should make this one run in worker 2)
Makes sense, I will update this in all of the jobs.
/lgtm
LGTM label has been added. Git tree hash: 8b2a8a046c07026986cb8e09ed8681b1410ba595
Just some nits.
pkg/controller/jobs/kubeflow/jobs/pytorchjob/pytorch_multikueue_adapter.go
6671628 to 68dc6da
Thanks!
/lgtm
/approve
LGTM label has been added. Git tree hash: 843db76576fd6082359d2e4e65f56ae3ceb85518
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mszadkow, tenzen-y
What type of PR is this?
/kind feature
What this PR does / why we need it:
The PR introduces a new MultiKueue adapter to handle PyTorchJob (Kubeflow).
We want to extend MultiKueue capabilities to satisfy the needs of early adopters.
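For readers unfamiliar with the idea, here is a rough, self-contained sketch of what an adapter of this kind typically does: create a copy of the job on a worker cluster on the first sync, then copy the worker's status back to the manager on later syncs. This is not the code from this PR. `Job`, `Cluster`, `SyncJob`, and the label key `example/prebuilt-workload` are all hypothetical stand-ins invented for illustration, and the real adapter works against Kubernetes clients rather than in-memory maps.

```go
package main

import (
	"errors"
	"fmt"
)

// Job is a hypothetical stand-in for a PyTorchJob object (not the Kubeflow type).
type Job struct {
	Name   string
	Labels map[string]string
	Status string
}

// Cluster is a hypothetical in-memory stand-in for a cluster client.
type Cluster struct {
	jobs map[string]*Job
}

var errNotFound = errors.New("job not found")

// NewCluster returns an empty in-memory cluster.
func NewCluster() *Cluster { return &Cluster{jobs: map[string]*Job{}} }

// Get returns the stored job or errNotFound.
func (c *Cluster) Get(name string) (*Job, error) {
	j, ok := c.jobs[name]
	if !ok {
		return nil, errNotFound
	}
	return j, nil
}

// Put stores (or overwrites) a job by name.
func (c *Cluster) Put(j *Job) { c.jobs[j.Name] = j }

// SyncJob is an assumed shape for the adapter's core step:
//   - if the worker has no copy yet, create one tagged with the workload name;
//   - otherwise copy the worker's status back onto the manager's job.
func SyncJob(manager, worker *Cluster, name, workloadName string) error {
	local, err := manager.Get(name)
	if err != nil {
		return err
	}

	remote, err := worker.Get(name)
	if errors.Is(err, errNotFound) {
		// First sync: copy the labels and add a (hypothetical) workload label.
		labels := map[string]string{}
		for k, v := range local.Labels {
			labels[k] = v
		}
		labels["example/prebuilt-workload"] = workloadName
		worker.Put(&Job{Name: local.Name, Labels: labels})
		return nil
	}
	if err != nil {
		return err
	}

	// Later syncs: the worker is the source of truth for status.
	local.Status = remote.Status
	return nil
}

func main() {
	manager, worker := NewCluster(), NewCluster()
	manager.Put(&Job{Name: "pytorchjob1", Labels: map[string]string{"team": "ml"}})

	// First sync creates the remote copy.
	if err := SyncJob(manager, worker, "pytorchjob1", "wl-1"); err != nil {
		panic(err)
	}

	// Simulate the worker finishing the job, then sync the status back.
	remote, _ := worker.Get("pytorchjob1")
	remote.Status = "Succeeded"
	if err := SyncJob(manager, worker, "pytorchjob1", "wl-1"); err != nil {
		panic(err)
	}

	local, _ := manager.Get("pytorchjob1")
	fmt.Println(local.Status) // prints "Succeeded"
}
```

The sketch keeps the manager's copy untouched except for its status, which is the general reason a per-job-type adapter is needed: each job kind has its own status shape to copy back.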
Which issue(s) this PR fixes:
Relates #2552
Special notes for your reviewer:
Does this PR introduce a user-facing change?