[K8s] Distinguish between H100 and H100 Mega #3890

Closed
romilbhardwaj opened this issue Aug 28, 2024 · 0 comments · Fixed by #3891

@romilbhardwaj (Collaborator):

In our GKELabelFormatter, we currently do not distinguish between H100 and H100 Mega (a3-highgpu vs a3-megagpu):

    if acc in ['H100-80GB', 'H100-MEGA-80GB']:
        # H100 is named H100-80GB or H100-MEGA-80GB in GKE,
        # where the latter has improved bandwidth.
        # See a3-mega instances on GCP.
        # TODO: we do not distinguish the two GPUs for simplicity,
        # but we can evaluate whether we should distinguish
        # them based on users' requests.
        return 'H100'

We should fix this - GKE/DWS users need a way to distinguish between the two GPU types to provision the right instance type for their use case.
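One possible shape for the fix, as a minimal sketch rather than the change that actually landed: map each GKE accelerator name to its own canonical name instead of collapsing both to 'H100'. The function name `canonical_h100_name`, the lookup table, and the canonical name 'H100-MEGA' are all assumptions made for illustration; only the two GKE names come from the snippet above.

```python
# Hypothetical lookup table: GKE accelerator name -> canonical GPU name.
# 'H100-MEGA' is an assumed canonical name, not confirmed by the issue.
_GKE_H100_NAMES = {
    'H100-80GB': 'H100',            # a3-highgpu
    'H100-MEGA-80GB': 'H100-MEGA',  # a3-megagpu (improved bandwidth)
}


def canonical_h100_name(acc: str) -> str:
    """Return a distinct canonical name for each GKE H100 variant.

    Names that are not H100 variants are returned unchanged.
    """
    return _GKE_H100_NAMES.get(acc, acc)
```

With a mapping like this, a user requesting 'H100-MEGA' would no longer be matched to an a3-highgpu node, and vice versa.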
