Merge pull request #137 from ENCODE-DCC/hotfix_update_doc_aws_increase_limits

add description/troubleshooting for aws backend
leepc12 authored Aug 6, 2021
2 parents 78d7801 + b933bd4 commit 899b6b7
Showing 2 changed files with 7 additions and 0 deletions.
2 changes: 2 additions & 0 deletions scripts/aws_caper_server/README.md
@@ -27,6 +27,8 @@ https://console.aws.amazon.com/cloudformation/home?#/stacks/new?stackName=gwfcor
- `S3 Bucket name`: Name of the S3 bucket that will store your pipeline outputs. This is not a full path to the output directory. It is just the bucket's name without the scheme prefix `s3://`. Make sure that this bucket does not exist yet. If it does, delete it or choose a different bucket name that is not in use.
- `VPC ID`: Choose the VPC `GenomicsVPC` that you just created.
- `VPC Subnet IDs`: Choose all private subnets created with the above VPC.
- `Max vCPUs for Default Queue`: Maximum total number of vCPUs for the spot instance queue. The default is 4000, which is already large. If your jobs request more vCPUs in total than this limit, the excess jobs will be stuck at `RUNNABLE` status.
- `Max vCPUs for Priority Queue`: Maximum total number of vCPUs for the on-demand instance queue. The default is 4000, which is already large. If your jobs request more vCPUs in total than this limit, the excess jobs will be stuck at `RUNNABLE` status.
3. Click on `Next` and then `Next` again. Agree to `Capabilities`. Click on `Create stack`.
4. Go to your [AWS Batch](https://console.aws.amazon.com/batch) and click on `Job queues` in the left sidebar. You will see two job queues (`priority-*` and `default-*`). There have been some issues with the default one, which is based on spot instances. Spot instances are interrupted quite often and Cromwell doesn't seem to handle the interruptions properly. We recommend using the `priority-*` queue even though it costs a bit more than spot instances. Click on the chosen job queue and copy its ARN. This ARN will be used later to create the Caper server instance.

5 changes: 5 additions & 0 deletions scripts/aws_caper_server/TROUBLESHOOTING.md
@@ -63,3 +63,8 @@ If you use S3 URIs in an input JSON which are in a different region, then you wi
### `S3Exception: null (Service: S3, Status Code: 400)`

If you see a `400` error, use the shell script `./create_instance.sh` to create an instance instead of running the Caper server on your laptop or local machine.


### Tasks (jobs) are stuck at RUNNABLE status

Go to `Job Queues` in `AWS Batch` on your AWS console and find the job queue (default or priority) that matches the ARN in your Caper conf. Edit the queue and increase its maximum number of vCPUs.
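
To get a rough idea of whether the vCPU limit is what is holding jobs back, the arithmetic can be sketched in a few lines of Python. This is a simplified model, not AWS Batch's actual scheduler: it admits jobs in order while the total requested vCPUs stays within the limit. The function name `split_by_vcpu_limit` and the workload of 300 tasks with 16 vCPUs each are hypothetical and used only for illustration.

```python
def split_by_vcpu_limit(job_vcpus, max_vcpus):
    """Simplified model (an assumption, not AWS Batch's real scheduling logic):
    admit jobs in order while the running total stays within max_vcpus.
    Jobs that do not fit are the ones that would wait at RUNNABLE."""
    running, waiting = [], []
    used = 0
    for vcpus in job_vcpus:
        if used + vcpus <= max_vcpus:
            running.append(vcpus)
            used += vcpus
        else:
            waiting.append(vcpus)
    return running, waiting


# Hypothetical workload: 300 tasks requesting 16 vCPUs each,
# checked against the default limit of 4000 vCPUs.
running, waiting = split_by_vcpu_limit([16] * 300, 4000)
print(len(running), len(waiting))  # prints "250 50"
```

In this model 250 tasks fit under the 4000 vCPU limit and the remaining 50 wait, which corresponds to the stuck `RUNNABLE` jobs described above. Raising the limit to 4800 or more would let all 300 run at once.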
