[k8s] Custom GPU node labels need better error hints #2652

Closed
romilbhardwaj opened this issue Oct 4, 2023 · 0 comments · Fixed by #2653
Labels
k8s Kubernetes related items

Comments

@romilbhardwaj (Collaborator)

A user configured their k8s cluster GPU labels in uppercase:

kubectl label nodes <node> skypilot.co/accelerator=A40

sky check passed silently:

$ sky check
...
Kubernetes: enabled
...

But sky launch fails:

(base) ➜  ~ sky launch -c test --gpus A40:1
I 10-03 17:39:46 optimizer.py:1044] No resource satisfying <Cloud>({'A40': 1}) on [AWS, Azure, GCP, IBM, Kubernetes, Lambda].
sky.exceptions.ResourcesUnavailableError: Catalog and kubernetes cluster does not contain any instances satisfying the request:
Task(run=<empty>)
  resources: <Cloud>({'A40': 1}).

To fix: relax or change the resource requirements.

Hint: sky show-gpus to list available accelerators.
      sky check to check the enabled clouds.

Neither of these hints is useful: sky show-gpus will be supported only after #2638, and sky check reported no error.

The problem is that the user labeled their nodes with uppercase values, and our checks verify only that a skypilot.co/accelerator label is present, not that its value is valid. Our hints should mention this.
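As a rough illustration of the kind of validity check and hint described here, the sketch below flags label values that are not lowercase. It is not SkyPilot's actual code: the function name accelerator_label_hints and the input shape (node name mapped to that node's labels) are hypothetical, and it assumes, as this issue implies, that the expected value is the lowercase accelerator name (a40 rather than A40).

```python
from typing import Dict, List

LABEL_KEY = "skypilot.co/accelerator"  # label key quoted in this issue


def accelerator_label_hints(nodes: Dict[str, Dict[str, str]]) -> List[str]:
    """Return one hint per node whose accelerator label value is not lowercase.

    `nodes` maps a node name to that node's labels (hypothetical input shape).
    Assumption: a valid value is all lowercase, e.g. `a40`, not `A40`.
    """
    hints = []
    for name, labels in nodes.items():
        value = labels.get(LABEL_KEY)
        if value is None:
            continue  # unlabeled node: not the failure mode described here
        if value != value.lower():
            hints.append(
                f"Node {name!r} has label {LABEL_KEY}={value}, "
                f"which is not lowercase. Try relabeling with: "
                f"kubectl label nodes {name} "
                f"{LABEL_KEY}={value.lower()} --overwrite"
            )
    return hints


if __name__ == "__main__":
    example = {
        "gpu-node-1": {LABEL_KEY: "A40"},  # uppercase, as in the report
        "gpu-node-2": {LABEL_KEY: "a40"},  # lowercase, accepted by this sketch
    }
    for hint in accelerator_label_hints(example):
        print(hint)
```

A check like this could feed both sky check and the ResourcesUnavailableError hint, so that a mislabeled node is reported instead of passing silently.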

@romilbhardwaj romilbhardwaj added the k8s Kubernetes related items label Oct 4, 2023