Closed as not planned
Description
When running the command sky serve down:
andyl@andylizf-dev-server ~/skypilot (persistent-service)> sky serve down sky-service-6c01
Terminating service(s) 'sky-service-6c01'. Proceed? [Y/n]:
Service 'sky-service-6c01' is scheduled to be terminated.
The CLI confirms that the service is scheduled for termination. However, the controller logs show that the replicas fail to terminate because of invalid credentials: every GCP request returns 403 with reason "insufficientPermissions", and the storage bucket cleanup fails with a 403 as well. The failure itself is expected in this setup, but sky serve down reports no error to the user, which is problematic and could lead to resource leaks.
/opt/conda/lib/python3.10/multiprocessing/resource_tracker.py:104: UserWarning: resource_tracker: process died unexpectedly, relaunching. Some resources might leak.
warnings.warn('resource_tracker: process died unexpectedly, '
I 01-25 02:02:15 service.py:103] Terminating replica 1 ...
I 01-25 02:02:15 service.py:103] Terminating replica 2 ...
WARNING:googleapiclient.http:Encountered 403 Forbidden with reason "insufficientPermissions"
WARNING:googleapiclient.http:Encountered 403 Forbidden with reason "insufficientPermissions"
E 01-25 02:02:20 replica_managers.py:163] Failed to terminate the sky serve replica cluster sky-service-6c01-1. Retrying after 5.001233025315587 seconds.Details: googleapiclient.errors.HttpError: <HttpError 403 when requesting https://compute.googleapis.com/compute/v1/projects/skypilot-375900/zones/us-central1-a/instances?filter=%28%28labels.ray-cluster-name+%3D+sky-service-6c01-1-e2dc%29%29&alt=json returned "Request had insufficient authentication scopes.". Details: "[{'message': 'Insufficient Permission', 'domain': 'global', 'reason': 'insufficientPermissions'}]">
E 01-25 02:02:20 replica_managers.py:167] Traceback: Traceback (most recent call last):
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/serve/replica_managers.py", line 151, in terminate_cluster
E 01-25 02:02:20 replica_managers.py:167] sky.down(cluster_name)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:20 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/core.py", line 487, in down
E 01-25 02:02:20 replica_managers.py:167] backend.teardown(handle, terminate=True, purge=purge)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:20 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 366, in _record
E 01-25 02:02:20 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/backend.py", line 146, in teardown
E 01-25 02:02:20 replica_managers.py:167] self._teardown(handle, terminate, purge)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 3680, in _teardown
E 01-25 02:02:20 replica_managers.py:167] self.teardown_no_lock(
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 4022, in teardown_no_lock
E 01-25 02:02:20 replica_managers.py:167] provisioner.teardown_cluster(repr(cloud),
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/provisioner.py", line 208, in teardown_cluster
E 01-25 02:02:20 replica_managers.py:167] provision.terminate_instances(cloud_name, cluster_name.name_on_cloud,
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/__init__.py", line 52, in _wrapper
E 01-25 02:02:20 replica_managers.py:167] return impl(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 549, in terminate_instances
E 01-25 02:02:20 replica_managers.py:167] handler_to_instances = _filter_instances(handlers, project_id, zone,
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 38, in _filter_instances
E 01-25 02:02:20 replica_managers.py:167] instance_dict = instance_handler.filter(
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance_utils.py", line 396, in filter
E 01-25 02:02:20 replica_managers.py:167] response = (cls.load_resource().instances().list(
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
E 01-25 02:02:20 replica_managers.py:167] return wrapped(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/googleapiclient/http.py", line 938, in execute
E 01-25 02:02:20 replica_managers.py:167] raise HttpError(resp, content, uri=self.uri)
E 01-25 02:02:20 replica_managers.py:167] googleapiclient.errors.HttpError: <HttpError 403 when requesting https://compute.googleapis.com/compute/v1/projects/skypilot-375900/zones/us-central1-a/instances?filter=%28%28labels.ray-cluster-name+%3D+sky-service-6c01-1-e2dc%29%29&alt=json returned "Request had insufficient authentication scopes.". Details: "[{'message': 'Insufficient Permission', 'domain': 'global', 'reason': 'insufficientPermissions'}]">
E 01-25 02:02:20 replica_managers.py:167]
E 01-25 02:02:20 replica_managers.py:163] Failed to terminate the sky serve replica cluster sky-service-6c01-2. Retrying after 6.67271440093055 seconds.Details: googleapiclient.errors.HttpError: <HttpError 403 when requesting https://compute.googleapis.com/compute/v1/projects/skypilot-375900/zones/us-central1-a/instances?filter=%28%28labels.ray-cluster-name+%3D+sky-service-6c01-2-e2dc%29%29&alt=json returned "Request had insufficient authentication scopes.". Details: "[{'message': 'Insufficient Permission', 'domain': 'global', 'reason': 'insufficientPermissions'}]">
E 01-25 02:02:20 replica_managers.py:167] Traceback: Traceback (most recent call last):
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/serve/replica_managers.py", line 151, in terminate_cluster
E 01-25 02:02:20 replica_managers.py:167] sky.down(cluster_name)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:20 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/core.py", line 487, in down
E 01-25 02:02:20 replica_managers.py:167] backend.teardown(handle, terminate=True, purge=purge)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:20 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 366, in _record
E 01-25 02:02:20 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/backend.py", line 146, in teardown
E 01-25 02:02:20 replica_managers.py:167] self._teardown(handle, terminate, purge)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 3680, in _teardown
E 01-25 02:02:20 replica_managers.py:167] self.teardown_no_lock(
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 4022, in teardown_no_lock
E 01-25 02:02:20 replica_managers.py:167] provisioner.teardown_cluster(repr(cloud),
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/provisioner.py", line 208, in teardown_cluster
E 01-25 02:02:20 replica_managers.py:167] provision.terminate_instances(cloud_name, cluster_name.name_on_cloud,
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/__init__.py", line 52, in _wrapper
E 01-25 02:02:20 replica_managers.py:167] return impl(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 549, in terminate_instances
E 01-25 02:02:20 replica_managers.py:167] handler_to_instances = _filter_instances(handlers, project_id, zone,
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 38, in _filter_instances
E 01-25 02:02:20 replica_managers.py:167] instance_dict = instance_handler.filter(
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance_utils.py", line 396, in filter
E 01-25 02:02:20 replica_managers.py:167] response = (cls.load_resource().instances().list(
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
E 01-25 02:02:20 replica_managers.py:167] return wrapped(*args, **kwargs)
E 01-25 02:02:20 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/googleapiclient/http.py", line 938, in execute
E 01-25 02:02:20 replica_managers.py:167] raise HttpError(resp, content, uri=self.uri)
E 01-25 02:02:20 replica_managers.py:167] googleapiclient.errors.HttpError: <HttpError 403 when requesting https://compute.googleapis.com/compute/v1/projects/skypilot-375900/zones/us-central1-a/instances?filter=%28%28labels.ray-cluster-name+%3D+sky-service-6c01-2-e2dc%29%29&alt=json returned "Request had insufficient authentication scopes.". Details: "[{'message': 'Insufficient Permission', 'domain': 'global', 'reason': 'insufficientPermissions'}]">
E 01-25 02:02:20 replica_managers.py:167]
WARNING:googleapiclient.http:Encountered 403 Forbidden with reason "insufficientPermissions"
E 01-25 02:02:28 replica_managers.py:163] Failed to terminate the sky serve replica cluster sky-service-6c01-1. Retrying after 9.805574426895168 seconds.Details: googleapiclient.errors.HttpError: <HttpError 403 when requesting https://compute.googleapis.com/compute/v1/projects/skypilot-375900/zones/us-central1-a/instances?filter=%28%28labels.ray-cluster-name+%3D+sky-service-6c01-1-e2dc%29%29&alt=json returned "Request had insufficient authentication scopes.". Details: "[{'message': 'Insufficient Permission', 'domain': 'global', 'reason': 'insufficientPermissions'}]">
E 01-25 02:02:28 replica_managers.py:167] Traceback: Traceback (most recent call last):
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/serve/replica_managers.py", line 151, in terminate_cluster
E 01-25 02:02:28 replica_managers.py:167] sky.down(cluster_name)
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:28 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/core.py", line 487, in down
E 01-25 02:02:28 replica_managers.py:167] backend.teardown(handle, terminate=True, purge=purge)
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:28 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 366, in _record
E 01-25 02:02:28 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/backend.py", line 146, in teardown
E 01-25 02:02:28 replica_managers.py:167] self._teardown(handle, terminate, purge)
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 3680, in _teardown
E 01-25 02:02:28 replica_managers.py:167] self.teardown_no_lock(
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 4022, in teardown_no_lock
E 01-25 02:02:28 replica_managers.py:167] provisioner.teardown_cluster(repr(cloud),
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/provisioner.py", line 208, in teardown_cluster
E 01-25 02:02:28 replica_managers.py:167] provision.terminate_instances(cloud_name, cluster_name.name_on_cloud,
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/__init__.py", line 52, in _wrapper
E 01-25 02:02:28 replica_managers.py:167] return impl(*args, **kwargs)
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 549, in terminate_instances
E 01-25 02:02:28 replica_managers.py:167] handler_to_instances = _filter_instances(handlers, project_id, zone,
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 38, in _filter_instances
E 01-25 02:02:28 replica_managers.py:167] instance_dict = instance_handler.filter(
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance_utils.py", line 396, in filter
E 01-25 02:02:28 replica_managers.py:167] response = (cls.load_resource().instances().list(
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
E 01-25 02:02:28 replica_managers.py:167] return wrapped(*args, **kwargs)
E 01-25 02:02:28 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/googleapiclient/http.py", line 938, in execute
E 01-25 02:02:28 replica_managers.py:167] raise HttpError(resp, content, uri=self.uri)
E 01-25 02:02:28 replica_managers.py:167] googleapiclient.errors.HttpError: <HttpError 403 when requesting https://compute.googleapis.com/compute/v1/projects/skypilot-375900/zones/us-central1-a/instances?filter=%28%28labels.ray-cluster-name+%3D+sky-service-6c01-1-e2dc%29%29&alt=json returned "Request had insufficient authentication scopes.". Details: "[{'message': 'Insufficient Permission', 'domain': 'global', 'reason': 'insufficientPermissions'}]">
E 01-25 02:02:28 replica_managers.py:167]
WARNING:googleapiclient.http:Encountered 403 Forbidden with reason "insufficientPermissions"
E 01-25 02:02:30 replica_managers.py:163] Failed to terminate the sky serve replica cluster sky-service-6c01-2. Retrying after 13.516306336032262 seconds.Details: googleapiclient.errors.HttpError: <HttpError 403 when requesting https://compute.googleapis.com/compute/v1/projects/skypilot-375900/zones/us-central1-a/instances?filter=%28%28labels.ray-cluster-name+%3D+sky-service-6c01-2-e2dc%29%29&alt=json returned "Request had insufficient authentication scopes.". Details: "[{'message': 'Insufficient Permission', 'domain': 'global', 'reason': 'insufficientPermissions'}]">
E 01-25 02:02:30 replica_managers.py:167] Traceback: Traceback (most recent call last):
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/serve/replica_managers.py", line 151, in terminate_cluster
E 01-25 02:02:30 replica_managers.py:167] sky.down(cluster_name)
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:30 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/core.py", line 487, in down
E 01-25 02:02:30 replica_managers.py:167] backend.teardown(handle, terminate=True, purge=purge)
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:30 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 366, in _record
E 01-25 02:02:30 replica_managers.py:167] return f(*args, **kwargs)
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/backend.py", line 146, in teardown
E 01-25 02:02:30 replica_managers.py:167] self._teardown(handle, terminate, purge)
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 3680, in _teardown
E 01-25 02:02:30 replica_managers.py:167] self.teardown_no_lock(
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 4022, in teardown_no_lock
E 01-25 02:02:30 replica_managers.py:167] provisioner.teardown_cluster(repr(cloud),
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/provisioner.py", line 208, in teardown_cluster
E 01-25 02:02:30 replica_managers.py:167] provision.terminate_instances(cloud_name, cluster_name.name_on_cloud,
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/__init__.py", line 52, in _wrapper
E 01-25 02:02:30 replica_managers.py:167] return impl(*args, **kwargs)
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 549, in terminate_instances
E 01-25 02:02:30 replica_managers.py:167] handler_to_instances = _filter_instances(handlers, project_id, zone,
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 38, in _filter_instances
E 01-25 02:02:30 replica_managers.py:167] instance_dict = instance_handler.filter(
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance_utils.py", line 396, in filter
E 01-25 02:02:30 replica_managers.py:167] response = (cls.load_resource().instances().list(
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
E 01-25 02:02:30 replica_managers.py:167] return wrapped(*args, **kwargs)
E 01-25 02:02:30 replica_managers.py:167] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/googleapiclient/http.py", line 938, in execute
E 01-25 02:02:30 replica_managers.py:167] raise HttpError(resp, content, uri=self.uri)
E 01-25 02:02:30 replica_managers.py:167] googleapiclient.errors.HttpError: <HttpError 403 when requesting https://compute.googleapis.com/compute/v1/projects/skypilot-375900/zones/us-central1-a/instances?filter=%28%28labels.ray-cluster-name+%3D+sky-service-6c01-2-e2dc%29%29&alt=json returned "Request had insufficient authentication scopes.". Details: "[{'message': 'Insufficient Permission', 'domain': 'global', 'reason': 'insufficientPermissions'}]">
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/opt/conda/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/serve/replica_managers.py", line 160, in terminate_cluster
raise RuntimeError('Failed to terminate the sky serve replica '
RuntimeError: Failed to terminate the sky serve replica cluster sky-service-6c01-2.
E 01-25 02:02:47 service.py:116] Replica 2 failed to terminate.
I 01-25 02:02:47 service.py:123] Cleaning up storage for version 1, task_yaml: /home/sky/.sky/serve/sky_service_6c01/task_v1.yaml
I 01-25 02:02:47 storage.py:645] Verifying bucket for storage skypilot-filemounts-andyl-75edb7ce
I 01-25 02:02:47 storage.py:1000] Storage type StoreType.GCS already exists.
E 01-25 02:02:55 service.py:78] Failed to clean up storage: sky.exceptions.StorageBucketDeleteError: Failed to delete GCS bucket skypilot-filemounts-andyl-75edb7ce.Detailed error: b'Removing gs://skypilot-filemounts-andyl-75edb7ce/job-75edb7ce/workdir/server.py#1737763198494882...\nRemoving gs://skypilot-filemounts-andyl-75edb7ce/job-75edb7ce/workdir/task.yaml#1737763198492095...\nRemoving gs://skypilot-filemounts-andyl-75edb7ce/...\nAccessDeniedException: 403 Access denied.\n'
E 01-25 02:02:55 service.py:81] Traceback: Traceback (most recent call last):
E 01-25 02:02:55 service.py:81] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/data/storage.py", line 2213, in _delete_gcs_bucket
E 01-25 02:02:55 service.py:81] subprocess.check_output(remove_obj_command,
E 01-25 02:02:55 service.py:81] File "/opt/conda/lib/python3.10/subprocess.py", line 421, in check_output
E 01-25 02:02:55 service.py:81] return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
E 01-25 02:02:55 service.py:81] File "/opt/conda/lib/python3.10/subprocess.py", line 526, in run
E 01-25 02:02:55 service.py:81] raise CalledProcessError(retcode, process.args,
E 01-25 02:02:55 service.py:81] subprocess.CalledProcessError: Command '[[ "$(uname)" == "Darwin" ]] && skypilot_gsutil() { gsutil -m -o "GSUtil:parallel_process_count=1" "$@"; } || skypilot_gsutil() { gsutil -m "$@"; };skypilot_gsutil rm -r gs://skypilot-filemounts-andyl-75edb7ce' returned non-zero exit status 1.
E 01-25 02:02:55 service.py:81]
E 01-25 02:02:55 service.py:81] During handling of the above exception, another exception occurred:
E 01-25 02:02:55 service.py:81]
E 01-25 02:02:55 service.py:81] Traceback (most recent call last):
E 01-25 02:02:55 service.py:81] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/serve/service.py", line 76, in cleanup_storage
E 01-25 02:02:55 service.py:81] backend.teardown_ephemeral_storage(task)
E 01-25 02:02:55 service.py:81] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/utils/common_utils.py", line 386, in _record
E 01-25 02:02:55 service.py:81] return f(*args, **kwargs)
E 01-25 02:02:55 service.py:81] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/backend.py", line 138, in teardown_ephemeral_storage
E 01-25 02:02:55 service.py:81] return self._teardown_ephemeral_storage(task)
E 01-25 02:02:55 service.py:81] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/backends/cloud_vm_ray_backend.py", line 3631, in _teardown_ephemeral_storage
E 01-25 02:02:55 service.py:81] storage.delete()
E 01-25 02:02:55 service.py:81] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/data/storage.py", line 1110, in delete
E 01-25 02:02:55 service.py:81] store.delete()
E 01-25 02:02:55 service.py:81] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/data/storage.py", line 1907, in delete
E 01-25 02:02:55 service.py:81] deleted_by_skypilot = self._delete_gcs_bucket(self.name)
E 01-25 02:02:55 service.py:81] File "/home/sky/skypilot-runtime/lib/python3.10/site-packages/sky/data/storage.py", line 2220, in _delete_gcs_bucket
E 01-25 02:02:55 service.py:81] raise exceptions.StorageBucketDeleteError(
E 01-25 02:02:55 service.py:81] sky.exceptions.StorageBucketDeleteError: Failed to delete GCS bucket skypilot-filemounts-andyl-75edb7ce.Detailed error: b'Removing gs://skypilot-filemounts-andyl-75edb7ce/job-75edb7ce/workdir/server.py#1737763198494882...\nRemoving gs://skypilot-filemounts-andyl-75edb7ce/job-75edb7ce/workdir/task.yaml#1737763198492095...\nRemoving gs://skypilot-filemounts-andyl-75edb7ce/...\nAccessDeniedException: 403 Access denied.\n'
E 01-25 02:02:55 service.py:81]
E 01-25 02:02:55 service.py:289] Service sky-service-6c01 failed to clean up.
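The behavior being asked for is that these failures reach the user instead of staying in the controller logs. The following is a minimal sketch of one way that could look, not SkyPilot's actual implementation: `TeardownReport`, `teardown_service`, and the two callables passed in are hypothetical names introduced only for illustration. The idea is to collect every replica or storage failure during teardown and produce a summary the CLI could print.

```python
import dataclasses
from typing import Callable, Dict, Iterable, List


@dataclasses.dataclass
class TeardownReport:
    """Hypothetical record of what a service teardown failed to clean up."""
    service_name: str
    failed_replicas: Dict[int, str] = dataclasses.field(default_factory=dict)
    failed_storage: List[str] = dataclasses.field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.failed_replicas and not self.failed_storage

    def summary(self) -> str:
        if self.ok:
            return f"Service {self.service_name!r} terminated cleanly."
        lines = [f"Service {self.service_name!r} failed to clean up:"]
        for replica_id, reason in sorted(self.failed_replicas.items()):
            lines.append(f"  replica {replica_id}: {reason}")
        for reason in self.failed_storage:
            lines.append(f"  storage: {reason}")
        return "\n".join(lines)


def teardown_service(
    service_name: str,
    replica_ids: Iterable[int],
    terminate_replica: Callable[[int], None],  # hypothetical hook
    cleanup_storage: Callable[[], None],  # hypothetical hook
) -> TeardownReport:
    """Attempt every cleanup step and record failures instead of hiding them."""
    report = TeardownReport(service_name)
    for replica_id in replica_ids:
        try:
            terminate_replica(replica_id)
        except Exception as e:  # keep going so all failures are reported
            report.failed_replicas[replica_id] = f"{type(e).__name__}: {e}"
    try:
        cleanup_storage()
    except Exception as e:
        report.failed_storage.append(f"{type(e).__name__}: {e}")
    return report
```

A caller could print `report.summary()` and exit with a non-zero status when `report.ok` is false, so that leaked replicas or buckets are visible without reading the controller logs.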