NOTE: you must be on branch crg2-hg38, not master, to run the hg38 crg2 pipeline. For cre, switch to branch hg38 for report generation.
gnomAD is a database of exomes and genomes from (mostly) healthy individuals. We use gnomAD as a control cohort; a variant with a population allele frequency (AF) of 1% or higher is almost certainly not the cause of an extremely rare monogenic disease. The gnomAD AFs allow us to filter down the variants in an individual with rare monogenic disease so that we can more easily identify the variant or variants associated with their phenotype. Here we will be updating the gnomAD SNV/indel annotation source (they also provide SV AFs).
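As a rough illustration of the filtering idea described above, the following minimal sketch keeps only variants whose gnomAD AF is below 1%. The variant identifiers and AF values are made up for the example, and treating a variant that is absent from gnomAD as rare is an assumption of this sketch rather than documented pipeline behaviour.

```python
from typing import Optional

AF_THRESHOLD = 0.01  # 1% population allele frequency, as described above


def is_rare(gnomad_af: Optional[float], threshold: float = AF_THRESHOLD) -> bool:
    """Return True if a variant is rare enough to keep as a candidate.

    A missing AF (variant absent from gnomAD) is treated as rare; this is an
    assumption of this sketch, not a documented behaviour of the pipeline.
    """
    return gnomad_af is None or gnomad_af < threshold


# Hypothetical variants: (identifier, gnomAD AF or None if absent from gnomAD)
variants = [
    ("chr1:1000:A>G", 0.25),
    ("chr2:2000:C>T", 0.0004),
    ("chr3:3000:G>GA", None),
    ("chr4:4000:T>C", 0.01),
]

# Keep only variants below the threshold; an AF of exactly 1% is excluded.
candidates = [vid for vid, af in variants if is_rare(af)]
print(candidates)
```

In the real reports the AF comes from the annotated VCF rather than a hard-coded list; the point here is only the comparison against the 1% cutoff.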
gnomAD AFs are available in a VCF (or per-chromosome VCFs that can be combined). We use vcfanno to add these AFs to the VCF generated by crg2 in this rule (variant allele frequencies). vcfanno requires a config that specifies which fields from one VCF are used to annotate another VCF, and any operations to apply to them. In crg2-hg38, that config is here.
Combine chromosome-wise VCFs for exomes, and combine chromosome-wise VCFs for genomes, resulting in one VCF for gnomAD exomes and one for gnomAD genomes.
You will likely need to process the VCF to exclude unwanted fields, normalize, etc., as in this script; the key step is the bcftools command. However, we want to keep FAIL variants, so remove the first part of the command, which filters to include only PASS variants.
Check to see that these VCFs have the fields specified in the vcfanno config.
Replace the filenames in the vcfanno config to reflect the v4 VCFs.
Run the pipeline to generate small variant reports.
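The header check in the steps above could be sketched as follows: collect the INFO IDs declared in a VCF header and report any field required by the vcfanno config that is missing. The function names are hypothetical, and the required field names in the example (including `fafmax_faf95_max`) are illustrative assumptions; the real list should be taken from the `fields` entries of the config.

```python
import gzip
import re

# Matches header lines such as: ##INFO=<ID=AF,Number=A,Type=Float,...>
INFO_ID = re.compile(r"^##INFO=<ID=([^,>]+)")


def read_header(path: str) -> list[str]:
    """Read only the header lines (those starting with '#') from a VCF."""
    opener = gzip.open if path.endswith(".gz") else open
    header = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # first record reached, header is complete
            header.append(line)
    return header


def missing_fields(header_lines: list[str], required: list[str]) -> list[str]:
    """Return the required INFO fields that the header does not declare."""
    present = set()
    for line in header_lines:
        match = INFO_ID.match(line)
        if match:
            present.add(match.group(1))
    return [field for field in required if field not in present]


# Small in-memory header so the example runs without a real gnomAD file.
example_header = [
    "##fileformat=VCFv4.2\n",
    '##INFO=<ID=AF,Number=A,Type=Float,Description="Alternate allele frequency">\n',
    '##INFO=<ID=nhomalt,Number=A,Type=Integer,Description="Homozygous count">\n',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
]

# Illustrative required fields; take the real ones from the vcfanno config.
print(missing_fields(example_header, ["AF", "nhomalt", "fafmax_faf95_max"]))
```

For a real file, `missing_fields(read_header(path), required)` should return an empty list before the filenames in the config are switched over.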
Update gnomAD to version 4 in the crg2-hg38 branch.
Include gnomAD_faf95_popmax column.
Look into whether or not there is a GRCh37 version that we can use to update the GRCh37 pipeline.
Please see this document for a summary of a previous gnomAD update, and this associated pull request.
You will need to work through the steps listed above.