You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
-------------------- EXCEPTION --------------------
MSG: Should not use homo_sapiens_refseq as --species.
Try using flags --refseq or --merged with --species homo_sapiens
Notice that --species homo_sapiens_refseq is incorrect.
The correct way is to use --species homo_sapiens along with --refseq (latter is present because we include it in vep_custom_args.
The --dir_cache ${PWD}/ensembl-vep (corresponds to vep_cache) is organized as:
which is the default structure if downloaded via ensembl-vep (these files are downloaded via docker myself). For homo_sapiens, there are two more versions, the default homo_sapien and the other one homo_sapiens_merged. If download them all, then the folder would look like:
By default vep will use ${dir_cache}/${species}/${cache_version}_${assembly}
If --refseq is provided, it will use ${dir_cache}/${species}_refseq/${cache_version}_${assembly}
If --merged is provided, it will use ${dir_cache}/${species}_merged/${cache_version}_${assembly}
Below are mappings between vep command options and sarek input params:
$dir_cache: $vep_cache
$species: $vep_species
$cache_version: $vep_cache_version
$assembly: $vep_genome
However, I'm not sure if this is the case for other species other than homo_sapiens.
For homo_sapiens, the correct logic to parse might be:
check if vep_custom_args contains --refseq/--merged
if both not exist, then validate annotation folder existence: ${vep_cache}/${vep_species}/${vep_cache_version}_${vep_genome}
if any of them exists, should validate annotation folder existence: ${vep_cache}/${vep_species}_refseq/${vep_cache_version}_${vep_genome} or ${vep_cache}/${vep_species}_merged/${vep_cache_version}_${vep_genome} ; note that $vep_species should still be homo_sapiens (because we need --species homo_sapiens in the vep command)
A more consistent way might be to provide another sarek input param to deal with vep's --refseq/--merged option
Command used and terminal output
No response
Relevant files
No response
System information
No response
The text was updated successfully, but these errors were encountered:
Description of the bug
The code parsing vep related options can't deal with the
--refseq/--merged
options in the vep command.If setting:
then the error is:
the
command.sh
:vep \ -i sample.freebayes.vcf.gz \ -o sample.freebayes_VEP.ann.vcf.gz \ --stats_file sample.freebayes_VEP.ann.summary.html \ --vcf --refseq --offline --hgvsg --allele_number --everything --format vcf \ --compress_output bgzip \ --fasta human_g1k_v37_decoy.fasta \ --assembly GRCh37 \ --species homo_sapiens_refseq \ --cache \ --cache_version 111 \ --dir_cache ${PWD}/ensembl-vep \ --fork 6
Notice that
--species homo_sapiens_refseq
is incorrect.The correct way is to use
--species homo_sapiens
along with--refseq
(latter is present because we include it invep_custom_args
.The
--dir_cache ${PWD}/ensembl-vep
(corresponds tovep_cache
) is organized as:which is the default structure if downloaded via ensembl-vep (these files are downloaded via docker myself). For homo_sapiens, there are two more versions, the default
homo_sapien
and the other onehomo_sapiens_merged
. If download them all, then the folder would look like:VEP's annotation folder selection logic:
${dir_cache}/${species}/${cache_version}_${assembly}
--refseq
is provided, it will use${dir_cache}/${species}_refseq/${cache_version}_${assembly}
--merged
is provided, it will use${dir_cache}/${species}_merged/${cache_version}_${assembly}
Below are mappings between vep command options and sarek input params:
$dir_cache: $vep_cache
$species: $vep_species
$cache_version: $vep_cache_version
$assembly: $vep_genome
However, I'm not sure if this is the case for other species other than homo_sapiens.
For homo_sapiens, the correct logic to parse might be:
vep_custom_args
contains--refseq/--merged
${vep_cache}/${vep_species}/${vep_cache_version}_${vep_genome}
${vep_cache}/${vep_species}_refseq/${vep_cache_version}_${vep_genome}
or${vep_cache}/${vep_species}_merged/${vep_cache_version}_${vep_genome}
; note that$vep_species
should still behomo_sapiens
(because we need--species homo_sapiens
in the vep command)A more consistent way might be to provide another sarek input param to deal with vep's
--refseq/--merged
optionCommand used and terminal output
No response
Relevant files
No response
System information
No response
The text was updated successfully, but these errors were encountered: