Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Storage] Make sky storage delete parallel #1990 #2058

Merged
merged 11 commits into from
Jun 28, 2023

Conversation

ksasi
Copy link
Contributor

@ksasi ksasi commented Jun 9, 2023

This update parallelize sky storage deletion as raised in issue#1990

Tested (run the relevant ones):

  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_multiple_buckets_creation_and_deletion
  • Backward compatibility tests: bash tests/backward_comaptibility_tests.sh

@ksasi ksasi marked this pull request as ready for review June 9, 2023 11:13
Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ksasi, and welcome to SkyPilot! Left a comment. Can we also include a test for deleting multiple buckets? Something like we have in test_smoke::TestStorageWithCredentials::test_new_bucket_creation_and_deletion, but for multiple buckets?

sky/cli.py Outdated
@@ -3220,8 +3220,9 @@ def storage_delete(names: List[str], all: bool): # pylint: disable=redefined-bu
else:
names = _get_glob_storages(names)

for name in names:
sky.storage_delete(name)
pool_obj = multiprocessing.Pool()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use subprocess_utils.run_in_parallel here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, subprocess_utils.run_in_parallel makes more sense.

@romilbhardwaj romilbhardwaj changed the title Fix for issue - [Storage] Make sky storage delete parallel #1990 [Storage] Make sky storage delete parallel #1990 Jun 9, 2023
@romilbhardwaj
Copy link
Collaborator

Also, a polite reminder to run ./format.sh to make sure your code is compliant with our formatting requirements - you can read more here.

@ksasi ksasi marked this pull request as draft June 11, 2023 16:51
@ksasi ksasi marked this pull request as ready for review June 12, 2023 18:35
@ksasi
Copy link
Contributor Author

ksasi commented Jun 13, 2023

@romilbhardwaj , as suggested I had added test for deleting multiple buckets along with other updates.

Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ksasi! The functionality works well - storage delete is much faster now! Left some comments for the test.

tests/test_smoke.py Outdated Show resolved Hide resolved
@ksasi
Copy link
Contributor Author

ksasi commented Jun 21, 2023

@romilbhardwaj

Tested (run the relevant ones):

  • Any manual or new tests for this PR (please specify below)
  • Relevant individual smoke tests: pytest tests/test_smoke.py::TestStorageWithCredentials::test_multiple_buckets_creation_and_deletion

timestamp = str(time.time()).replace('.', '')
store_obj = storage_lib.Storage(name=f'sky-test-{timestamp}')
storage_mult_obj.append(store_obj)
yield storage_mult_obj
Copy link
Collaborator

@romilbhardwaj romilbhardwaj Jun 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some way to get storage objects from yield_storage_object here? That way deletion after test is handled by that code. If not, we should add deletion code after the yield statement here (refer to yield_storage_object).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are unable to use multiple yield statements in fixture as they are not supported. Hence, had to resort to this specific way of implementation. I am adding the deletion code after the yield statement as suggested.

tests/test_smoke.py Show resolved Hide resolved
…atch_storage_obj to account for multiple store types
Comment on lines 2465 to 2492
# Staggering tests for each store type to prevent BucketDeleteError
# exception due to concurrent delete all
if store_type.value == 'S3':
time.sleep(0)
elif store_type.value == 'GCS':
time.sleep(10)
storage_obj_name = int(
subprocess.check_output(
["sky storage ls|wc -l"],
shell=True).decode('utf-8').splitlines()[0]) - 1
while storage_obj_name > 0:
time.sleep(5)
storage_obj_name = int(
subprocess.check_output(
["sky storage ls|wc -l"],
shell=True).decode('utf-8').splitlines()[0]) - 1
elif store_type.value == 'R2':
time.sleep(20)
storage_obj_name = int(
subprocess.check_output(
["sky storage ls|wc -l"],
shell=True).decode('utf-8').splitlines()[0]) - 1
while storage_obj_name > 0:
time.sleep(5)
storage_obj_name = int(
subprocess.check_output(
["sky storage ls|wc -l"],
shell=True).decode('utf-8').splitlines()[0]) - 1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of staggering tests, we can use @pytest.mark.xdist_group('multiple_bucket_deletion') to run all parameterized tests serially. Let's remove this staggering code and add @pytest.mark.xdist_group to this test :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointer. I removed the staggering code and added @pytest.mark.xdist_group('multiple_bucket_deletion') to this test.

…eletion test to run all parameterized tests serially
Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks @ksasi for adding this!

@ksasi
Copy link
Contributor Author

ksasi commented Jun 28, 2023

@romilbhardwaj Thanks for your guidance during this entire PR.

@romilbhardwaj romilbhardwaj merged commit beac2d5 into skypilot-org:master Jun 28, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants