-
Notifications
You must be signed in to change notification settings - Fork 510
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Storage] R2 Cloud Object Storage integration #1736
[Storage] R2 Cloud Object Storage integration #1736
Conversation
…skypilot into r2-integration merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work @landscapepainter! Looks mostly good. One ask for adding mounting tests.
@@ -100,6 +106,8 @@ def get_store_prefix(storetype: StoreType) -> str: | |||
return 's3://' | |||
elif storetype == StoreType.GCS: | |||
return 'gs://' | |||
elif storetype == StoreType.R2: | |||
return 's3://' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it! Can we leave a comment here mentioning this gotcha?
tests/test_smoke.py
Outdated
@@ -58,7 +58,7 @@ | |||
# id. | |||
test_id = str(uuid.uuid4())[-2:] | |||
|
|||
LAMBDA_TYPE = '--cloud lambda --gpus A100' | |||
LAMBDA_TYPE = '--cloud lambda --gpus A10' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this change unintentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change was made in master branch. I wrote it while resolving merge conflict. bb6429b
@@ -682,6 +682,36 @@ def test_gcp_storage_mounts(): | |||
run_one_test(test) | |||
|
|||
|
|||
@pytest.mark.cloudflare |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I also realized this might be a better way of selecting when to run R2 tests rather than using the R2_AVAILABLE
. Is it possible to do the same in TestStorageWithCredentials
below by replacing the parametrize decorator like so:
@pytest.mark.parametrize('store_type', [
storage_lib.StoreType.S3, storage_lib.StoreType.GCS,
pytest.param(storage_lib.StoreType.R2,
marks=pytest.mark.cloudflare)
])
(I'm not fully sure if the above code would work, you may need to dig around a bit)
The goal is to run cloudflare tests when pytest tests/test_smoke.py --cloudflare
is run. You may need to modify conftest.py too.
tests/conftest.py
Outdated
@@ -68,7 +69,6 @@ def pytest_configure(config): | |||
cloud_keyword = cloud_to_pytest_keyword[cloud] | |||
config.addinivalue_line( | |||
'markers', f'{cloud_keyword}: mark test as {cloud} specific') | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unintended?
@@ -678,6 +680,35 @@ def test_gcp_storage_mounts(): | |||
run_one_test(test) | |||
|
|||
|
|||
@pytest.mark.cloudflare | |||
def test_cloudflare_storage_mounts(generic_cloud: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pytest tests/test_smoke.py::test_cloudflare_storage_mounts --cloudflare
fails with:
ValueError: Cloud cloudflare is not a valid cloud among ['aws', 'azure', 'gcp', 'lambda', 'local']
Failed.
Reason: sky launch -y -c t-cloudflare-storab5c-a56f-d6 --cloud cloudflare /var/folders/98/hhq8wrtx6y13196q61xphjsm0000gn/T/tmp5ay5w1vi.yaml
Log: less /var/folders/98/hhq8wrtx6y13196q61xphjsm0000gn/T/cloudflare_storage_mounts-sfyxxvg9.log
We need to fix this by setting the launch cloud to default_cloud_to_run[0] when --cloudflare
is specified. Might need to look at def _generic_cloud in conftest.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Super excited to have R2 support in SkyPilot!
Left a minor comment on conftest.py, please add the comment before merging.
tests/conftest.py
Outdated
@@ -69,14 +69,18 @@ def pytest_configure(config): | |||
cloud_keyword = cloud_to_pytest_keyword[cloud] | |||
config.addinivalue_line( | |||
'markers', f'{cloud_keyword}: mark test as {cloud} specific') | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unintended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a fix to the previous unintended new line #1736 (comment)
if config.getoption('--cloudflare') and cloud == 'cloudflare': | ||
continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this - it fixes the test bug. However, I'm not fully sure I understand why/how this condition check fixes the bug - can you add a short comment explaining why we check both config.getoption('--cloudflare')
AND cloud == 'cloudflare'
. Wouldn't checking just one of them be enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll leave a short comment in the code and elaborate a bit here.
When the test is running for cloudflare, --cloudflare
flag should be passed. And for 'cloudflare'
, we set 'gcp'
to be in cloud_to_run
in L81. Due to this, it's possible for the condition cloud not in cloud_to_run
at L110 to be true when iterating for 'cloudflare'
as cloud_to_run
will contain 'gcp'
instead of 'cloudflare'
. Hence, we need to stop 'cloudflare'
being processed with item.add_marker(skip_marks[cloud])
in that case.
Having only config.getoption('--cloudflare')
is not enough as this would ignore to put other clouds in item.add_marker(skip_marks[cloud])
with continue
when --cloudflare
flag is passed. And having only cloud == 'cloudflare'
wouldn't be enough since this will skip adding 'cloudflare'
to item.add_marker(skip_marks[cloud])
even for when we are not running cloudflare R2 test.
This PR integrates another Cloud Object Storage from Cloudflare, R2, which has been requested #1639 .
It supports every interactions that can be made to a storage within skypilot besides direct transfer from/to R2 to/from another Cloud Object Storage. This should be addressed in later PR.
R2 is S3 compatible. Hence, we can reuse
aws s3
andaws s3api
commands along withgoofys
to interact with the storage along with some additional steps such as providing<accountid>
and another aws credential used for R2. These should be instructed in the installation doc. Followings are the difference on using aws-cli between S3 and R2 storages given both are configured.S3
aws s3 ls s3://test-bucket
R2
aws s3 ls s3://test-bucket --endpoint https://<accountid>.r2.cloudflarestorage.com --profile=r2
And working with goofys on R2 works in a similar way as above.
Note:
cloudflare.py
file is not added undersky/clouds
as cloudflare does not provide computing instances.test_private_bucket
frompytest tests/test_smoke.py::TestStorageWithCredentials
is purposely not tested for R2 as it is not necessary considering how R2 andaws-cli
interact.Tested (run the relevant ones):
pytest tests/test_smoke.py::TestStorageWithCredentials
besidestest_private_bucket
TODO: