-
Notifications
You must be signed in to change notification settings - Fork 510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[k8s] Disable autostop for controller on kubernetes #3521
Conversation
…o serve_k8s_playground # Conflicts: # sky/clouds/kubernetes.py # sky/serve/core.py
…o serve_k8s_playground # Conflicts: # sky/cli.py
…o serve_k8s_playground
…t-org/skypilot into serve_k8s_playground # Conflicts: # sky/serve/core.py # sky/serve/replica_managers.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @romilbhardwaj! LGTM.
Thanks @Michaelvll! Tested with
|
Awesome! Can we also test it without explicit specify the controller resources? |
Yes, smoke tests pass even without specifying controller resources! Kubernetes is automatically chosen for the controller if it has enough resources (8 CPU, 24 GB mem), verified with |
if isinstance(to_provision.cloud, clouds.Kubernetes | ||
) and controller_utils.Controllers.from_name( | ||
cluster_name | ||
) == controller_utils.Controllers.JOBS_CONTROLLER: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: would be nice to add ()
if isinstance(to_provision.cloud, clouds.Kubernetes | |
) and controller_utils.Controllers.from_name( | |
cluster_name | |
) == controller_utils.Controllers.JOBS_CONTROLLER: | |
if (isinstance(to_provision.cloud, clouds.Kubernetes | |
) and controller_utils.Controllers.from_name( | |
cluster_name | |
) == controller_utils.Controllers.JOBS_CONTROLLER): |
Skips autostop when using the jobs controller on Kubernetes.
This should be merged after #3377, since that PR will add support for
SERVICE_ACCOUNT
, which is helpful for running the controller on GKE exec based auth clusters.Tested (run the relevant ones):
bash format.sh
sky jobs launch
with controller and job on Kubernetespytest tests/test_smoke.py --managed-jobs --kubernetes