BCFtools sort zero position overflow #1753

Closed
Rapsssito opened this issue Jul 19, 2022 · 6 comments
Labels
htslib-dependent Cannot be fixed until htslib is fixed

Comments

@Rapsssito

Rapsssito commented Jul 19, 2022

There is an overflow issue when sorting a VCF file with zero positions. Here is an example:

Unsorted VCF

MT	0	1915430843:1	N	]MT:16640]N	99	PASS	.
MT	0	3110098432:1	N	]MT:16638]N	99	PASS	.
MT	16638	3110098432:2	G	G[MT:0[	99	PASS	.
MT	16640	1915430843:2	G	G[MT:0[	99	PASS	.

Commands

bcftools sort test.vcf -o test.sorted.vcf

Sorted VCF

MT	4294967296	3110098432:1	N	]MT:16638]N	99	PASS	.
MT	4294967296	1915430843:1	N	]MT:16640]N	99	PASS	.
MT	16638	3110098432:2	G	G[MT:0[	99	PASS	.
MT	16640	1915430843:2	G	G[MT:0[	99	PASS	.

Note how the POS of 0 in both ":1" records (3110098432:1 and 1915430843:1) has been wrongly changed to 4294967296, which is 2^32.
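To check whether a file contains records that would trigger this, one option is to scan for data lines whose POS column is 0 before sorting. The following is only a minimal sketch: the helper name `find_zero_pos_records` is hypothetical, and it reads plain tab-separated VCF text rather than using bcftools or htslib.

```python
import io


def find_zero_pos_records(lines):
    """Yield (line_number, CHROM, ID) for data records whose POS column is "0".

    Hypothetical helper for illustration; header lines starting with '#'
    and blank lines are skipped.
    """
    for n, line in enumerate(lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3 and fields[1] == "0":
            yield n, fields[0], fields[2]


# The four records from the unsorted VCF in this report.
example = io.StringIO(
    "MT\t0\t1915430843:1\tN\t]MT:16640]N\t99\tPASS\t.\n"
    "MT\t0\t3110098432:1\tN\t]MT:16638]N\t99\tPASS\t.\n"
    "MT\t16638\t3110098432:2\tG\tG[MT:0[\t99\tPASS\t.\n"
    "MT\t16640\t1915430843:2\tG\tG[MT:0[\t99\tPASS\t.\n"
)

hits = list(find_zero_pos_records(example))
for n, chrom, rec_id in hits:
    print(f"line {n}: {chrom} POS=0 ID={rec_id}")
```

On a real file, pass an open file handle instead of the `io.StringIO` example.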

@Rapsssito Rapsssito changed the title BCFtools sorting overflow BCFtools sorting zero position overflow Jul 19, 2022
@Rapsssito Rapsssito changed the title BCFtools sorting zero position overflow BCFtools sort/view zero position overflow Jul 19, 2022
@Rapsssito
Author

Rapsssito commented Jul 19, 2022

@pd3, could you check if you are able to reproduce this issue?

@jkbonfield
Contributor

Ideally it shouldn't go wrong like this, but note that the input is invalid: VCF is 1-based, not 0-based. The zeros have been converted to -1 (internally the code uses 0-based coordinates), which has caused a wrap-around.

@Rapsssito
Author

@jkbonfield, actually, the VCF specification allows positions with 0. This is an example from VCFv4.2:

[Image: example from the VCFv4.2 specification]

@Rapsssito Rapsssito changed the title BCFtools sort/view zero position overflow BCFtools sort zero position overflow Jul 19, 2022
@pd3
Member

pd3 commented Jul 20, 2022

This problem is not specific to bcftools sort; it occurs on any round trip through BCF via htslib. The format stores a 0-based position in the hts_pos_t type, so a 0 coordinate overflows. This needs to be fixed in htslib.
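The reported value can be reproduced with plain arithmetic. The sketch below assumes that the 0-based position -1 is written as a signed 32-bit integer and then read back as an unsigned one; that reading is an assumption consistent with the number 4294967296 in this report, not a copy of the htslib code, and `roundtrip_pos` is a hypothetical name.

```python
import struct


def roundtrip_pos(vcf_pos):
    """Model a VCF -> BCF -> VCF round trip of a 1-based POS value.

    Assumption for illustration only: the 0-based position is stored as a
    signed 32-bit integer and reinterpreted as unsigned when read back.
    """
    zero_based = vcf_pos - 1                  # VCF POS 0 becomes -1
    raw = struct.pack("<i", zero_based)       # write as signed 32-bit
    (reread,) = struct.unpack("<I", raw)      # read the same bytes as unsigned
    return reread + 1                         # convert back to 1-based POS


print(roundtrip_pos(0))       # 4294967296, i.e. 2**32
print(roundtrip_pos(16638))   # 16638, ordinary positions are unaffected
```

Under this model only POS 0 is affected, because it is the only valid input whose 0-based form is negative.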

@pd3 pd3 added the htslib-dependent Cannot be fixed until htslib is fixed label Jul 20, 2022
@Rapsssito
Author

@pd3, I have created the corresponding issue in htslib: samtools/htslib#1475

pd3 added a commit to pd3/htslib that referenced this issue Jul 20, 2022
The 0 coordinate is valid in VCF specification, but the round-trip
VCF -> BCF -> VCF turns MT:0 into MT:4294967296. Add a check to
detect this overflow.

See samtools#1475 and samtools/bcftools#1753
@daviesrob
Member

Fixed by samtools/htslib#1476


4 participants