can drs_cp be made to work with controlled-access buckets? #71

Open
smgogarten opened this issue Jan 21, 2023 · 4 comments

Comments

@smgogarten

I'm trying to use drs_cp on a file in a controlled-access bucket with requester pays enabled. drs_stat works, but drs_cp does not:

> f <- "drs://dg.4503:dg.4503/288ff0aa-a426-11ea-82d1-8bda0857af94"
> drs_stat(f)
# A tibble: 1 × 9
  drs                                                    fileN…¹  size gsUri acces…² timeU…³ hashes       bucket name 
  <chr>                                                  <chr>   <int> <chr> <chr>   <chr>   <list>       <chr>  <chr>
1 drs://dg.4503:dg.4503/288ff0aa-a426-11ea-82d1-8bda085… phs000… 58278 gs:/… https:… 2020-0… <named list> nih-n… phs0…
# … with abbreviated variable names ¹​fileName, ²​accessUrl, ³​timeUpdated
> drs_cp(f, ".")
Error: 'gsutil -m cp -n 'gs://nih-nhlbi-topmed-released-phs000964-c3/phs000964.v3.pht004839.v2.p1.TOPMed_WGS_JHS_Sample.MULTI.txt.gz' '/home/rstudio'' failed:
  AccessDeniedException: 403 pet-107443797655395020525@terra-0c3bdde8.iam.gserviceaccount.com does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list' denied on resource (or it may not exist).
    CommandException: 1 file/object could not be transferred.
  exit status: 1
In addition: Warning message:
'gsutil_requesterpays()' returned an error:
  'gsutil -u terra-0c3bdde8 requesterpays get gs://nih-nhlbi-topmed-released-phs000964-c3' failed:
  AccessDeniedException: 403 pet-107443797655395020525@terra-0c3bdde8.iam.gserviceaccount.com does not have storage.buckets.get access to the Google Cloud Storage bucket. Permission 'storage.buckets.get' denied on resource (or it may not exist).
  exit status: 1 

Digging into this further, I get an identical error when using gsutil -u <project_id> cp in the terminal. However, I can access the same file by using terra-notebook-utils in the terminal:

rstudio@f29b33636061:~$ /home/rstudio/.local/bin/tnu drs copy drs://dg.4503:dg.4503/288ff0aa-a426-11ea-82d1-8bda0857af94 .
2023-01-20 11:46:40::INFO  Enabling requester pays for your workspace. This will only take a few seconds...
 /home/rstudio/phs000964.v3.pht004839.v2.   100%   [========================================]   56.9KiB   374.9KiB/s   0.15s

Can drs_cp use the same mechanism as terra-notebook-utils?

@mtmorgan
Collaborator

mtmorgan commented Jan 21, 2023

I will look into this early next week (it is supposed to work for protected endpoints). In the meantime, terra-notebook-utils can also be used from R via reticulate and is a good workaround.
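Until drs_cp handles this case, one way to script the workaround is to wrap the same `tnu drs copy` command that succeeded in the terminal above. The following is only a minimal sketch: the helper names `tnu_copy_command` and `drs_copy_via_tnu` are hypothetical, and it assumes the `tnu` executable from terra-notebook-utils is on the PATH (in the report above it lived at /home/rstudio/.local/bin/tnu, which can be passed as the `tnu` argument).

```python
import subprocess


def tnu_copy_command(drs_uri, dest=".", tnu="tnu"):
    """Build the argument list for `tnu drs copy <drs_uri> <dest>`.

    Hypothetical helper: it only assembles the command shown in this
    thread and does not contact any service itself.
    """
    if not drs_uri.startswith("drs://"):
        raise ValueError(f"not a DRS URI: {drs_uri!r}")
    return [tnu, "drs", "copy", drs_uri, dest]


def drs_copy_via_tnu(drs_uri, dest=".", tnu="tnu"):
    """Run the copy with the terra-notebook-utils command-line tool.

    Assumes `tnu` is installed and the workspace credentials are valid.
    Raises subprocess.CalledProcessError if tnu exits with a non-zero
    status.
    """
    subprocess.run(tnu_copy_command(drs_uri, dest, tnu), check=True)
```

For the file in this report, the call would be `drs_copy_via_tnu("drs://dg.4503:dg.4503/288ff0aa-a426-11ea-82d1-8bda0857af94", ".")`. Passing the arguments as a list avoids shell quoting problems with the colon and slashes in the DRS URI.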

@mtmorgan
Collaborator

Can you provide information on where the drs URI comes from?

@smgogarten
Author

smgogarten commented Jan 24, 2023

The DRS URIs come from NHLBI BioData Catalyst via the Gen3 Data Explorer: https://gen3.biodatacatalyst.nhlbi.nih.gov/explorer
I've linked my BioData Catalyst account to my AnVIL account, and I get a different error when that link has expired, so an expired link is not the cause here.

@smgogarten
Author

Possibly related Terra support request here
