GL-548 - Update CreateVat code to handle samples that do not contain all population groups. #7965
…m the input ancestry file. Have GvsCreateVAT.wdl only pull fields from the VCF for the selected subpopulations. Update create_variant_annotation_table.py to set empty population-specific AC/AN/AF, etc. values if population is not present.
A CreateVAT workflow run on a quickstart dataset whose ten samples cover only two populations (amr, afr) is here.
Codecov Report
@@ Coverage Diff @@
## ah_var_store #7965 +/- ##
================================================
Coverage ? 86.235%
Complexity ? 35200
================================================
Files ? 2173
Lines ? 165016
Branches ? 17793
================================================
Hits ? 142302
Misses ? 16385
Partials ? 6329
Just one request: please update all the WDLs that use our custom docker image to use the most up-to-date one, "us.gcr.io/broad-dsde-methods/variantstore:gg_vattest_2022_07_28"
Sounds good - I was thinking of doing that at the last moment (i.e. have the PR approved, build an 'ah_varstore_' docker image off of this PR, update the WDLs, and then merge).
@@ -23,7 +23,6 @@ workflow GvsCreateVAT {
 Array[String] contig_array = ["chr1", "chr2", "chr3", "chr4", "chr5", "chr6", "chr7", "chr8", "chr9", "chr10", "chr11", "chr12", "chr13", "chr14", "chr15", "chr16", "chr17", "chr18", "chr19", "chr20", "chr21", "chr22", "chrX", "chrY", "chrM"]
 File reference = "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.fasta"
 File nirvana_data_directory = "gs://broad-dsp-spec-ops/scratch/rcremer/Nirvana/NirvanaData.tar.gz"
 File AnAcAf_annotations_template = "gs://broad-dsp-spec-ops/scratch/rcremer/Nirvana/vat/custom_annotations_template.tsv"
nice!
Update GvsCreateVAT.wdl to build the subpopulation-specific files from the input ancestry file.
Have GvsCreateVAT.wdl only pull fields from the VCF for the selected subpopulations.
Update create_variant_annotation_table.py to set empty population-specific AC/AN/AF, etc. values if population is not present.
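For readers following along, here is a rough sketch of the idea behind these three changes. It is not the code in this PR. The ancestry column name (`ancestry`), the full population list, the metric names, and both helper functions are assumptions made up for illustration; the real WDL and `create_variant_annotation_table.py` may be organized quite differently.

```python
import csv

# Hypothetical names: the real script's populations and field names may differ.
ALL_POPULATIONS = ("afr", "amr", "eas", "eur", "mid", "oth", "sas")
METRICS = ("ac", "an", "af")


def populations_in_ancestry_file(handle, column="ancestry"):
    """Return the sorted set of populations named in a tab-separated ancestry file.

    `column` is an assumed header name, not taken from the PR.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return sorted(
        {row[column].strip().lower() for row in reader if row[column].strip()}
    )


def population_fields(counts_by_population, all_populations=ALL_POPULATIONS):
    """Build the population-specific columns for one variant row.

    Populations missing from `counts_by_population` get None for every metric
    instead of raising a KeyError, so the output keeps a fixed set of columns.
    """
    row = {}
    for pop in all_populations:
        counts = counts_by_population.get(pop)
        for metric in METRICS:
            row[f"{pop}_{metric}"] = None if counts is None else counts.get(metric)
    return row
```

With an ancestry file that only lists afr and amr samples, the first helper would yield just those two populations, and the second would still emit every population column, leaving the absent ones empty.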