Support for S3 access points for remote storage. #22

Open
ktchu opened this issue Feb 8, 2021 · 2 comments

ktchu commented Feb 8, 2021

Depending on IT policies, it may not be possible to use an S3 bucket for DVC remote storage because bucket-level access permissions may be unobtainable. In these situations, S3 access points offer a possible alternative for DVC remotes, because access policies attached to an access point can grant permissions at the "subdirectory" level.

One way to add support for S3 access points would be to allow DVC remote storage URLs to be AWS ARN identifiers. Unfortunately, the current URL processing implementation cannot handle ARN identifiers because it attempts to interpret the portion of the ARN after the first colon as a port:

Port could not be cast to integer value as 'aws:s3:us-west-2:123456789012:accesspoint'
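
For illustration, the same error message can be reproduced with Python's standard urllib.parse alone. This is a minimal sketch, not DVC's actual code path: the remote URL below (including the access point name my-access-point) is a hypothetical example, and describe_port is a helper made up for this demonstration.

```python
from urllib.parse import urlsplit

# Hypothetical remote URL: an access point ARN placed where the bucket name
# would normally go. The access point name and key are made up.
ARN_URL = "s3://arn:aws:s3:us-west-2:123456789012:accesspoint/my-access-point/data"


def describe_port(url):
    """Return the parsed port, or the error message if port parsing fails."""
    parts = urlsplit(url)
    try:
        # The network location is split at the first colon into host and port,
        # so everything after "arn:" is treated as the port.
        return parts.port
    except ValueError as exc:
        return str(exc)


if __name__ == "__main__":
    print(urlsplit(ARN_URL).hostname)  # only the text before the first colon
    print(describe_port(ARN_URL))
```

An ordinary URL such as s3://bucket/path has no colon in the bucket position, so its port is simply None and no error occurs.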

@isidentical

AWS Reference: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-arn-format.html

@efiop efiop transferred this issue from iterative/dvc Jan 1, 2023
@gokulchittaranjan

The source of this problem is the following line in __init__.py in the dvc_s3 folder:
return infer_storage_options(path)["path"]

This issue could be fixed upstream in filesystem_spec (fsspec/filesystem_spec@a1cf9ba).
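
As a rough sketch of what ARN-aware handling might look like, the helper below splits an s3:// URL into a bucket-like part and a key without any host:port parsing. It is not the fsspec or dvc_s3 implementation: split_s3_url is a hypothetical name, and the assumed layout arn:aws:s3:<region>:<account-id>:accesspoint/<name>/<key> is taken from the AWS ARN format referenced above.

```python
def split_s3_url(url):
    """Split an s3:// URL into (bucket_or_access_point, key).

    Hypothetical helper: it never interprets colons as a host:port separator.
    """
    prefix = "s3://"
    if not url.startswith(prefix):
        raise ValueError(f"not an s3 URL: {url!r}")
    rest = url[len(prefix):]

    if rest.startswith("arn:"):
        # Assumed layout:
        #   arn:aws:s3:<region>:<account-id>:accesspoint/<name>/<key...>
        arn_head, _, tail = rest.partition("/")  # up to ":accesspoint"
        name, _, key = tail.partition("/")       # access point name, then key
        return f"{arn_head}/{name}", key

    # Plain bucket URL: s3://<bucket>/<key...>
    bucket, _, key = rest.partition("/")
    return bucket, key


if __name__ == "__main__":
    print(split_s3_url(
        "s3://arn:aws:s3:us-west-2:123456789012:accesspoint/my-access-point/data/file.csv"
    ))
    print(split_s3_url("s3://my-bucket/data/file.csv"))
```

Whether the resulting access point ARN can then be passed wherever a bucket name is expected depends on the underlying S3 client and is not verified here.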
