Error division by zero #1765

Open
xsoleacha opened this issue Oct 9, 2024 · 2 comments

@xsoleacha

Describe the issue

When annotating CNVs with VEP, I get the following error:

Illegal division by zero at /media/amontalban/Disc1/VEP/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 552, <__ANONIO__> line 144.

My input file is a VCF file with the CNVs that looks like this:

chr1	69037	.	N	<DEL>	10.5	PASS	END=71585;SVLEN=2549;EXONS=1;READSEXP=46;READSOBS=0;READSRATIO=0;BF=10.5;SVTYPE=DEL	GT	1/1
chr1	1454345	.	N	<DUP>	10.3	PASS	END=1470163;SVLEN=15819;EXONS=9;READSEXP=892;READSOBS=1122;READSRATIO=1.26;BF=10.3;SVTYPE=DUP	GT	0/1
chr1	1633748	.	N	<DEL>	13.4	PASS	END=1634654;SVLEN=907;EXONS=4;READSEXP=84;READSOBS=18;READSRATIO=0.214;BF=13.4;SVTYPE=DEL	GT	1/1
chr1	1704550	.	N	<DUP>	11	PASS	END=1709133;SVLEN=4584;EXONS=7;READSEXP=427;READSOBS=570;READSRATIO=1.33;BF=11;SVTYPE=DUP	GT	0/1

I'm annotating these variants with a custom file coming from dbVar, that looks like this:

1	10000	nssv16889290	N	<DUP>	.	.	DBVARID=nssv16889290;SVTYPE=DUP;END=52000;SVLEN=42001;EXPERIMENT=1;SAMPLESET=1;REGIONID=nsv6138160;AC=1453;AF=0.241208;AN=6026
1	10001	nssv14768	T	<DUP>	.	.	DBVARID=nssv14768;SVTYPE=DUP;IMPRECISE;END=88143;CIPOS=0,0;CIEND=0,0;SVLEN=78143;EXPERIMENT=1;SAMPLE=NA12155;REGIONID=nsv7879
1	10001	nssv14781	T	<DUP>	.	.	DBVARID=nssv14781;SVTYPE=DUP;IMPRECISE;END=82189;CIPOS=0,0;CIEND=0,0;SVLEN=72189;EXPERIMENT=1;SAMPLE=NA18860;REGIONID=nsv7879
1	10001	nssv14784	T	<DUP>	.	.	DBVARID=nssv14784;SVTYPE=DUP;IMPRECISE;END=87466;CIPOS=0,0;CIEND=0,0;SVLEN=77466;EXPERIMENT=1;SAMPLE=NA18975;REGIONID=nsv7879

After doing some digging, I have found that the line of the custom VCF file that generates the problem is this:

2	240692045	nssv211104	G	<DEL>	.	.	DBVARID=nssv211104;SVTYPE=DEL;END=240692045;SVLEN=-1;EXPERIMENT=1;SAMPLESET=1;LINKS=dbSNP:ss49921710;REGIONID=nsv192526;SEQ=g

As you can see, this is a 1 bp-long deletion at position chr2:240692045.
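For anyone who wants to check their own custom file for similar records, here is a minimal Python sketch that flags symbolic-allele lines whose END is not greater than POS. It is only an illustration: the helper names `parse_info` and `single_base_symbolic` are hypothetical, and the sample lines are the dbVar records quoted in this issue.

```python
def parse_info(info):
    """Turn 'A=1;FLAG;B=2' into {'A': '1', 'FLAG': True, 'B': '2'}."""
    out = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def single_base_symbolic(lines):
    """Return (chrom, pos, id) for symbolic-ALT records with END <= POS."""
    hits = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        chrom, pos, rec_id = fields[0], int(fields[1]), fields[2]
        alt, info = fields[4], fields[7]
        end = parse_info(info).get("END")
        if alt.startswith("<") and end is not None and int(end) <= pos:
            hits.append((chrom, pos, rec_id))
    return hits


# Two records taken from the dbVar excerpt in this issue.
sample = [
    "1\t10000\tnssv16889290\tN\t<DUP>\t.\t.\t"
    "DBVARID=nssv16889290;SVTYPE=DUP;END=52000;SVLEN=42001",
    "2\t240692045\tnssv211104\tG\t<DEL>\t.\t.\t"
    "DBVARID=nssv211104;SVTYPE=DEL;END=240692045;SVLEN=-1",
]

flagged = single_base_symbolic(sample)
print(flagged)
```

On the real file, which is bgzipped, the lines would have to come from something like `gzip.open(path, "rt")` instead of an in-memory list.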

After digging deeper, I think the problem may be that, somewhere in the code, VEP adds one to the start position of the variants in the custom file, so that when it computes the length of each variant in this file (end - start + 1) the result is 0. For the line above, the values are: 240692045 - 240692046 + 1 = 0. Note that 240692046 is the result of adding 1 to the start position of the CNV (240692045).
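To make the arithmetic concrete, here is a minimal Python sketch. It is not VEP's Perl code: the function name `feature_length` and the `shift_start` flag are hypothetical and only model the suspected +1 adjustment to the start of a record from the custom file.

```python
def feature_length(pos, end, shift_start=True):
    """Length computed as end - start + 1.

    shift_start models the suspected behaviour (an assumption, not
    confirmed from the VEP source): the start is taken as POS + 1.
    """
    start = pos + 1 if shift_start else pos
    return end - start + 1


# The problematic dbVar record: POS and END are both 240692045.
shifted = feature_length(240692045, 240692045)
unshifted = feature_length(240692045, 240692045, shift_start=False)
print(shifted, unshifted)

# First record of the dbVar excerpt: POS=10000, END=52000, SVLEN=42001.
print(feature_length(10000, 52000, shift_start=False))
```

In the dbVar excerpt above, the positive SVLEN values equal END - POS + 1 (for example 52000 - 10000 + 1 = 42001), which suggests that this file treats POS as the first base of the event. Under that reading, a record with END equal to POS is 1 bp long, and shifting its start by one leaves a length of 0.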

Then, at line 552 of the File.pm module, the error occurs because VEP uses the previously calculated length to compute the overlap between my input variants and the variants in the custom VCF file, hence the division by zero.
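The failure mode can be sketched as follows. This is a guess at the general shape of an overlap-percentage calculation, written in Python for illustration. It is not the code at line 552 of File.pm, the name `overlap_percent` is hypothetical, and the input-variant coordinates are made up.

```python
def overlap_percent(a_start, a_end, b_start, b_end):
    """Percentage of feature B covered by feature A (1-based, inclusive).

    Returns None when B has no positive length, instead of dividing by zero.
    """
    b_len = b_end - b_start + 1
    if b_len <= 0:
        return None  # guard: a zero-length feature cannot be overlapped
    overlap = min(a_end, b_end) - max(a_start, b_start) + 1
    return 100.0 * max(overlap, 0) / b_len


# Hypothetical input CNV spanning the dbVar deletion at 2:240692045.
input_start, input_end = 240690000, 240700000

# Custom record with its start left at POS: length 1, fully covered.
print(overlap_percent(input_start, input_end, 240692045, 240692045))

# Same record with the start shifted to POS + 1: length 0, guarded.
print(overlap_percent(input_start, input_end, 240692046, 240692045))
```

Without the `b_len <= 0` check, the second call would divide by zero, which matches the error reported here. Whether skipping such records is the right behaviour, rather than fixing the start coordinate, is for the maintainers to decide.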

I am not sure whether this is an expected behaviour, or a problem with my custom annotation file.

Thank you very much for your help.

Additional information

Please fill in the following sections to help us find the source of your issue as quickly as possible.

System

  • VEP version: 111
  • VEP Cache version: 111
  • Perl version: 5.34
  • OS: Ubuntu 22.04
  • tabix installed: yes

Full VEP command line

vep -i ${input_vcf} \
        --assembly ${assembly} \
        --offline --cache \
        --cache_version ${cache_version} \
        --dir_cache ${dir_cache} \
        --format vcf \
        -o ${output_vcf} \
        --vcf \
        --fasta ${fasta} \
        --merged \
        --use_given_ref --check_ref --mane --uniprot \
        --ccds --hgvs --symbol --canonical --protein --domains \
        --numbers  --pubmed --variant_class --no_escape \
        --dont_skip --individual all --show_ref_allele --exclude_null_alleles \
        --dir_plugins ${plugins_dir} \
        --custom file=GRCh38.variant_call.all.noINV.vcf.gz,short_name=dbVar_patho,fields=PC%DBVARID%SVLEN%CLNSIG%clinical_source%REGIONID%PHENO,num_records=all,reciprocal=1,format=vcf,coords=1,same_type=1,type=overlap,overlap_cutoff=0 \
        --verbose \
        --stats_file ${stats_file} \
        --force_overwrite \
        --max_sv_size 20000000

Full error message

Illegal division by zero at /media/amontalban/Disc1/VEP/ensembl-vep/modules/Bio/EnsEMBL/VEP/AnnotationSource/File.pm line 552, <__ANONIO__> line 144.

Data files (if applicable)

Not included.

@nakib103 nakib103 self-assigned this Oct 9, 2024
@nakib103
Contributor

nakib103 commented Oct 9, 2024

Hi @xsoleacha,

Thanks for your query!
We have had a similar problem reported, where the VCF parser had an off-by-one error, and I think you are hitting the same issue. We are planning to fix it in a future Ensembl VEP version. I will investigate whether the fix in the VCF parser resolves your issue and let you know.

Best regards,
Nakib

@nakib103
Contributor

Hi @xsoleacha,

Can you provide me with the full stdout message and, if possible, the specific input variant that is causing the issue? I have not been able to reproduce the issue yet, so I cannot check whether the fix we are working on solves it.

Best regards,
Nakib
