Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cap hts_getline() return value at INT_MAX #1448

Merged
merged 1 commit into from
Jun 13, 2022

Conversation

jmarshall
Copy link
Member

@jmarshall jmarshall commented Jun 10, 2022

Something I noticed while writing a script to generate an enormous VCF record for #1447: the initial version of this generated a 2.5 GiB VCF record line, but vcf_read() failed to read it.

It turns out that kgetline2() (as used by hts_getline() for uncompressed input) reads such a line without trouble, but hts_getline() truncates the read string's length to int and returns negative — thus indicating error rather than a successfully read line.

This PR clamps the successful return value at INT_MAX. An alternative would be to change the return type to e.g. ssize_t, but this would have potential ABI implications perhaps…

In practice, no hts_getline() invocations in htslib, samtools, or bcftools use the length returned at all; they only use the return value to distinguish between success/EOF/error. Other third-party code potentially may use the length.

Also add to the existing basic hts_getline() tests In test/sam.c: check that the successful return value is indeed the expected length.

On success, hts_getline() returns the length of the string read.
Its return type is int, so when plain int is 32 bits, trouble ensues
for very long lines exceeding 2GiB: the return value wraps to negative
and is misinterpreted as error. Rather than changing the return type
to e.g. ssize_t, clamp the return value for very long lines.

In test/sam.c's test cases, check the return value is indeed the
expected length.
@daviesrob daviesrob merged commit 1109c8b into samtools:develop Jun 13, 2022
@jmarshall jmarshall deleted the hts_getline branch June 13, 2022 14:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants