sample-pytorchjob should update to cuda12+ and pytorch 2.0 #1910

Closed
village-way opened this issue Mar 26, 2024 · 4 comments · Fixed by #1992
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@village-way
Contributor

village-way commented Mar 26, 2024

What would you like to be added:

The sample PyTorchJob should be updated to newer CUDA and PyTorch versions; the image used in examples/jobs/sample-pytorchjob.yaml is too old.

Why is this needed:

The sample PyTorchJob fails to run on Ubuntu 22.04 with CUDA 12.3.

Completion requirements:

This enhancement requires the following artifacts:
The image docker.io/kubeflowkatib/pytorch-mnist:v1beta1-45c5727 comes from kubeflowkatib, and its Dockerfile was updated in kubeflow/katib#2278.

The artifacts should be linked in subsequent comments.
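
For reference, here is a minimal sketch of where the image is set in a PyTorchJob manifest such as examples/jobs/sample-pytorchjob.yaml. The field layout follows the Kubeflow `kubeflow.org/v1` PyTorchJob API. The job name, the queue name, and the `<newer-tag>` placeholder are illustrative assumptions, not values taken from the actual sample or from the eventual fix.

```yaml
apiVersion: kubeflow.org/v1
kind: PyTorchJob
metadata:
  name: pytorch-simple                      # hypothetical job name
  labels:
    kueue.x-k8s.io/queue-name: user-queue   # hypothetical LocalQueue name
spec:
  pytorchReplicaSpecs:
    Master:
      replicas: 1
      restartPolicy: OnFailure
      template:
        spec:
          containers:
            - name: pytorch
              # Placeholder only: substitute a tag built from the updated Dockerfile.
              image: docker.io/kubeflowkatib/pytorch-mnist:<newer-tag>
```

If the manifest also defines a Worker replica spec, that spec carries its own image field and would likely need the same change.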

@village-way village-way added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 26, 2024
@kannon92
Contributor

@village-way would you be willing to submit a PR? Generally, we should use examples from kubeflow.

@tenzen-y
Member

@village-way I'd be happy to review the opened PR :)

@village-way
Contributor Author

> @village-way I'd be happy to review the opened PR :)

I'm very excited to do this. I'll submit the PR a little later.

village-way added a commit to village-way/kueue that referenced this issue Apr 17, 2024
…netes-sigs#1910)

Signed-off-by: wangdepeng <wangdepeng_yewu@cmss.chinamobile.com>
@kannon92
Contributor

Thank you @village-way for picking this up!

village-way added a commit to village-way/kueue that referenced this issue Apr 18, 2024
…netes-sigs#1910)

Signed-off-by: wangdepeng <wangdepeng_yewu@cmss.chinamobile.com>
k8s-ci-robot pushed a commit that referenced this issue Apr 19, 2024
#1992)

Signed-off-by: wangdepeng <wangdepeng_yewu@cmss.chinamobile.com>
k8s-infra-cherrypick-robot pushed a commit to k8s-infra-cherrypick-robot/kueue that referenced this issue Apr 19, 2024
…netes-sigs#1910)

Signed-off-by: wangdepeng <wangdepeng_yewu@cmss.chinamobile.com>
k8s-ci-robot pushed a commit that referenced this issue Apr 19, 2024
#2019)

Signed-off-by: wangdepeng <wangdepeng_yewu@cmss.chinamobile.com>
Co-authored-by: wangdepeng <wangdepeng_yewu@cmss.chinamobile.com>
kannon92 pushed a commit to openshift-kannon92/kubernetes-sigs-kueue that referenced this issue Nov 19, 2024