Problem running Dockerized protect / where to find SSE-C key for downloading reference data? #294

Open
anttikos opened this issue Jun 17, 2021 · 4 comments

Comments

@anttikos

I'm having trouble running protect and figuring out what exactly is failing, as the output is 4655 lines long. I'm most probably just missing some parameters, but I'm not sure which ones.

Starting at line 1143, I get errors related to Amazon S3 authentication:
0091ac8c0f62 2021-06-16 13:00:23,448 MainThread WARNING toil.leader: H/2/jobmrdMHX RuntimeError: s3am failed with (boto.exception.NoAuthHandlerFound: No handler was ready to authenticate. 1 handlers were checked. ['HmacAuthV1Handler'] Check your credentials) while downloading (S3://protect-data/hg38_references/gencode.v25.pc_transcripts.fa.tar.gz)
0091ac8c0f62 2021-06-16 13:00:23,448 MainThread WARNING toil.leader: H/2/jobmrdMHX ERROR:toil.worker:Exiting the worker because of a failed job on host 0091ac8c0f62
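
With a log this long, it can help to pull out only the lines that mention a failure, along with their line numbers. The following is a minimal sketch, not part of protect or Toil: the function name `first_errors` and the search patterns are assumptions chosen to match the messages quoted above.

```shell
# Print line-numbered failure lines from a log file.
# Usage: first_errors LOGFILE [MAX_LINES]
# The patterns are a guess based on the messages quoted in this issue.
first_errors() {
    log_file=$1
    max_lines=${2:-20}
    # -n prefixes each match with its line number; -E enables alternation.
    grep -nE 'RuntimeError|ERROR:|Traceback' "$log_file" | head -n "$max_lines"
}
```

For example, `first_errors error_log_2021-06-16.txt 10` would list up to ten matching lines, which makes it easier to see which failure came first.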

I assume I should use the --sse-key and --sse-key-is-master parameters to set up the SSE-C key in order to download the reference genome files, but I can't find where to get the actual key files. Also, I'm not sure whether this is the main issue or whether there are other issues as well.

Any help is highly appreciated!

Here is the command I'm using to run protect:
docker run \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /data:/data/ \
  quay.io/ucsc_cgl/protect:2.5.6-1.13.0 \
  --sample-name sample_name \
  --tumor-dna /data/dna/tumor_dna_sample.merged_1.fq.gz \
  --tumor-dna2 /data/dna/tumor_dna_sample.merged_2.fq.gz \
  --normal-dna /data/dna/normal_dna_sample.merged_1.fq.gz \
  --normal-dna2 /data/dna/normal_dna_sample.merged_2.fq.gz \
  --tumor-rna /data/rna/tumor_rna_sample_1.fq.gz \
  --tumor-rna2 /data/rna/tumor_rna_sample_2.fq.gz \
  --reference-build hg38 \
  --tumor-type PRAD \
  --work-mount /data/protect \
  2> error_log_2021-06-16.txt

Attached is the full output log

error_log_2021-06-16.txt

@adamnovak

I don't think this is an SSE (server-side encryption) problem. It sounds like you don't have a .aws credentials directory available to the container. It usually needs to be in the home directory inside the container.
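
One way to make the credentials directory visible is to bind-mount it into the container. The sketch below is untested against this image and rests on assumptions: the helper names are hypothetical, and `/root` is only a guess at the container user's home directory. boto normally picks up credentials from the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables or from `~/.aws/credentials`.

```shell
# Return 0 if AWS credentials are visible on the host, either as
# environment variables or as a ~/.aws/credentials file.
have_aws_credentials() {
    if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
        return 0
    fi
    [ -f "${HOME}/.aws/credentials" ]
}

# Print the extra `docker run` option that bind-mounts the host's
# credentials directory read-only.
# ASSUMPTION: the container user's home is /root unless another path is given.
aws_mount_option() {
    container_home=${1:-/root}
    printf '%s\n' "-v ${HOME}/.aws:${container_home}/.aws:ro"
}
```

If `have_aws_credentials` succeeds, the output of `aws_mount_option` can be added next to the other `-v` options of the `docker run` command from the original post. Whether s3am inside the image actually reads credentials from that location is not confirmed in this thread.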

Unfortunately, even when I try with my UCSC AWS account, I don't have permission to access that data either:

aws s3 cp s3://protect-data/hg38_references/gencode.v25.pc_transcripts.fa.tar.gz -
download failed: s3://protect-data/hg38_references/gencode.v25.pc_transcripts.fa.tar.gz to - An error occurred (403) when calling the HeadObject operation: Forbidden

So the real problem may be that the permissions on the protect-data bucket have changed; the data may have previously been available for public unauthenticated access and now it is not anymore.

@erichweiler Did you change the permissions on the protect-data bucket when we were concerned about leaking money due to public S3 access?

@erichweiler

erichweiler commented Oct 12, 2021 via email

@hbeale

hbeale commented Oct 15, 2021

You need requester-pays mode. Here are the commands I ran:

aws s3 ls --request-payer requester protect-data/hg38_references/
aws s3 sync --dryrun --request-payer requester s3://protect-data/hg38_references /mnt/neoepitopes/protect_references/
aws s3 sync --request-payer requester s3://protect-data/hg38_references /mnt/neoepitopes/protect_references/
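
Before syncing the whole prefix, a single object can be checked in requester-pays mode. This is a minimal sketch using the AWS CLI's `s3api head-object` call with `--request-payer requester`; the wrapper name `check_requester_pays` is hypothetical, and it assumes the AWS CLI is installed and configured with credentials.

```shell
# Check that one object is readable in requester-pays mode.
# Usage: check_requester_pays BUCKET KEY
# Succeeds (and prints the object's metadata) only if the request is allowed.
check_requester_pays() {
    bucket=$1
    key=$2
    aws s3api head-object --bucket "$bucket" --key "$key" --request-payer requester
}
```

For the file from the failing job, the call would be `check_requester_pays protect-data hg38_references/gencode.v25.pc_transcripts.fa.tar.gz`. A 403 here would suggest that the credentials or bucket permissions are still the problem, rather than the requester-pays setting.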

@adamnovak

@erichweiler Unfortunately I don't know where the bucket is. But @hbeale assures me that the real problem here is that you need to have AWS account credentials available, and you need to use requester-pays mode to download, which is enabled by the --request-payer flag in the aws CLI. I'm not sure if s3am needs any special flag for that; it might accept paying for downloads by default if you give it AWS credentials to use.
