[k8s] Handle kubernetes config and reachability errors #3043
Thanks @landscapepainter!
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@gmail.com>
…aunch_with_confirm
To fix the whole user story (" |
…o kube-config-issue
…o kube-config-issue # Conflicts: # sky/check.py
Now also closes #3093. Since the missing kubeconfig is actually an error, I have removed the silent disabling of Kubernetes cloud (cloud-specific |
…o kube-config-issue # Conflicts: # sky/clouds/kubernetes.py
Thanks for fixing this issue @landscapepainter!
Thanks @Michaelvll - addressed your comments. |
Thanks for the fix @landscapepainter @romilbhardwaj! LGTM.
This resolves #3013.
When `~/.kube/config` was removed externally, two errors were observed: one when running `sky launch` and one when running `sky down --purge`.
`sky launch`: removing `~/.kube/config` disables the Kubernetes cloud, but because the file was removed externally, the change is not reflected in the state. To resolve this, `sky check` is now run during provisioning to update the state, so that `write_cluster_config` does not try to upload the credential file for the Kubernetes cloud (`~/.kube/config`).
`sky down --purge`: runs `cleanup_ports` as part of `post_teardown_cleanup`. For the Kubernetes cloud, this tries to load the config and raises an error because the file has been removed. Error handling was added to cover this scenario.
Tested (run the relevant ones):
bash format.sh
`sky down --purge` fails on Kubernetes clusters that no longer exist #3013
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh
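The `sky down --purge` fix described above can be illustrated with a minimal sketch. This is not the code from this PR: `cleanup_ports` and `post_teardown_cleanup` are named in the description, but the signatures here, the `KubeConfigNotFoundError` class, and the `load_kubeconfig` helper are all hypothetical stand-ins. It assumes that purging should tolerate a missing kubeconfig, while a normal teardown should still surface the error.

```python
import os


class KubeConfigNotFoundError(Exception):
    """Hypothetical error type: the kubeconfig file is missing."""


def load_kubeconfig(path: str) -> str:
    """Stand-in for a real config loader; returns the raw file contents."""
    expanded = os.path.expanduser(path)
    if not os.path.isfile(expanded):
        raise KubeConfigNotFoundError(f'kubeconfig not found at {expanded}')
    with open(expanded, 'r', encoding='utf-8') as f:
        return f.read()


def cleanup_ports(kubeconfig_path: str) -> None:
    """Hypothetical port cleanup; it needs the config to reach the cluster."""
    load_kubeconfig(kubeconfig_path)
    # A real implementation would delete the cluster's port resources here.


def post_teardown_cleanup(kubeconfig_path: str, purge: bool = False) -> bool:
    """Run cleanup; return True if it ran, False if it was skipped.

    With purge=True, a missing kubeconfig is reported and skipped so that
    local state can still be removed. Without purge, the error propagates.
    """
    try:
        cleanup_ports(kubeconfig_path)
        return True
    except KubeConfigNotFoundError as e:
        if not purge:
            raise
        print(f'Warning: skipping port cleanup: {e}')
        return False
```

Calling `post_teardown_cleanup(path, purge=True)` with a path that does not exist prints a warning and returns `False`, whereas `purge=False` re-raises the error so the user sees that the cluster could not be reached.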