|
| 1 | +# Advanced Flags |
| 2 | + |
| 3 | +## Custom Networks |
| 4 | + |
| 5 | +Variant Transforms supports custom networks, which lets you start the processing VMs in a specific subnetwork of your Google Cloud project instead of the default network. |
| 6 | + |
| 7 | +Specify a subnetwork with the `--subnetwork` flag, passing the name of the subnetwork: `--subnetwork my-subnet`. Use only the subnet name, not the full path. |
| 8 | + |
| 9 | +Example: |
| 10 | +```bash |
| 11 | +COMMAND="/opt/gcp_variant_transforms/bin/vcf_to_bq ..." |
| 12 | + |
| 13 | +docker run gcr.io/cloud-lifesciences/gcp-variant-transforms \ |
| 14 | + --project "${GOOGLE_CLOUD_PROJECT}" \ |
| 15 | + --subnetwork my-subnet \ |
| 16 | + ... |
| 17 | + "${COMMAND}" |
| 18 | +``` |
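
If you only have the full subnetwork path at hand, the short name that `--subnetwork` expects can be derived with shell parameter expansion. This is a minimal sketch; the path below, including the region `us-central1`, is a hypothetical value used only for illustration.

```bash
# Hypothetical full path; only the last path segment is the subnet name.
SUBNET_PATH="regions/us-central1/subnetworks/my-subnet"

# Strip everything up to and including the last "/" to keep the short name.
SUBNET_NAME="${SUBNET_PATH##*/}"

echo "${SUBNET_NAME}"
```

The resulting `SUBNET_NAME` can then be passed as `--subnetwork "${SUBNET_NAME}"`.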
| 19 | + |
| 20 | + |
| 21 | +## Removing External IPs |
| 22 | +Variant Transforms lets you disable the use of external IP addresses with the |
| 23 | +`--use_public_ips` flag. The flag defaults to true, so to restrict the |
| 24 | +use of external IP addresses, pass `--use_public_ips false`. Note that without external |
| 25 | +IP addresses, VMs can only send packets to other internal IP addresses. To allow these |
| 26 | +VMs to connect to the external IP addresses used by Google APIs and services, you can |
| 27 | +[enable Private Google Access](https://cloud.google.com/vpc/docs/configure-private-google-access) |
| 28 | +on the subnet. |
| 29 | + |
| 30 | +Example: |
| 31 | +```bash |
| 32 | +COMMAND="/opt/gcp_variant_transforms/bin/vcf_to_bq ..." |
| 33 | + |
| 34 | +docker run gcr.io/cloud-lifesciences/gcp-variant-transforms \ |
| 35 | + --project "${GOOGLE_CLOUD_PROJECT}" \ |
| 36 | + --use_public_ips false \ |
| 37 | + ... |
| 38 | + "${COMMAND}" |
| 39 | +``` |
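
Private Google Access, mentioned above, is enabled per subnet. As a rough sketch, the `gcloud` command for it can be assembled as shown below. The subnet name `my-subnet` and region `us-central1` are placeholder assumptions; substitute the values for your own project and review the printed command before running it.

```bash
# Placeholder values (assumptions): replace with your own subnet and region.
SUBNET_NAME="my-subnet"
REGION="us-central1"

# Assemble the command that turns on Private Google Access for the subnet.
ENABLE_PGA="gcloud compute networks subnets update ${SUBNET_NAME} --region ${REGION} --enable-private-ip-google-access"

# Print the command for review; run it yourself once the values look right.
echo "${ENABLE_PGA}"
```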
| 40 | + |
| 41 | +## Custom Dataflow Runner Image |
| 42 | +By default, Variant Transforms runs the pipeline in a custom Docker image: `gcr.io/cloud-lifesciences/variant-transforms-custom-runner:latest`. |
| 43 | +This image contains all the Python and Linux dependencies needed to run Variant Transforms, so they are not downloaded from the internet when the pipeline starts. |
| 44 | + |
| 45 | +You can override which container is used by passing the `--sdk_container_image` flag, as in the following example: |
| 46 | + |
| 47 | +```bash |
| 48 | +COMMAND="/opt/gcp_variant_transforms/bin/vcf_to_bq ..." |
| 49 | + |
| 50 | +docker run gcr.io/cloud-lifesciences/gcp-variant-transforms \ |
| 51 | + --project "${GOOGLE_CLOUD_PROJECT}" \ |
| 52 | + --sdk_container_image gcr.io/path/to/my/container \ |
| 53 | + ... |
| 54 | + "${COMMAND}" |
| 55 | +``` |
| 56 | + |
| 57 | +## Custom Service Accounts |
| 58 | +By default, the Dataflow workers use the [default compute service account](https://cloud.google.com/compute/docs/access/service-accounts#default_service_account). You can override which service account is used with the `--service_account` flag, as in the following example: |
| 59 | + |
| 60 | +```bash |
| 61 | +COMMAND="/opt/gcp_variant_transforms/bin/vcf_to_bq ..." |
| 62 | + |
| 63 | +docker run gcr.io/cloud-lifesciences/gcp-variant-transforms \ |
| 64 | + --project "${GOOGLE_CLOUD_PROJECT}" \ |
| 65 | + --service_account my-cool-dataflow-worker@<PROJECT_ID>.iam.gserviceaccount.com \ |
| 66 | + ... |
| 67 | + "${COMMAND}" |
| 68 | +``` |
| 69 | + |
| 70 | +**Other Service Account Notes:** |
| 71 | +- The [Life Sciences Service Account is not changeable](https://cloud.google.com/life-sciences/docs/troubleshooting#missing_service_account) |
| 72 | +- The [Dataflow Admin Service Account is not changeable](https://cloud.google.com/dataflow/docs/concepts/security-and-permissions#service_account) |