Cap hts_getline() return value at INT_MAX #1448
Merged
Something I noticed while writing a script to generate an enormous VCF record for #1447: the initial version of this generated a 2.5 GiB VCF record line, but `vcf_read()` failed to read it.

It turns out that `kgetline2()` (as used by `hts_getline()` for uncompressed input) reads such a line without trouble, but `hts_getline()` truncates the read string's length to `int` and returns a negative value, thus indicating an error rather than a successfully read line.

This PR clamps the successful return value at `INT_MAX`. An alternative would be to change the return type to e.g. `ssize_t`, but this could have ABI implications.

In practice, no `hts_getline()` invocations in htslib, samtools, or bcftools use the returned length at all; they only use the return value to distinguish between success, EOF, and error. Third-party code may potentially use the length.

Also add to the existing basic `hts_getline()` tests in test/sam.c: check that the successful return value is indeed the expected length.
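
For illustration, here is a minimal sketch of the clamping idea. It is not the actual htslib change; `clamp_line_length()` is a hypothetical helper name invented for this example. A 2.5 GiB line is 2,684,354,560 bytes, which exceeds `INT_MAX` on platforms with a 32-bit `int`, so narrowing that length directly to `int` is implementation-defined and typically yields a negative number.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper (not part of htslib): convert a line length held in
 * a size_t into a non-negative int suitable for a "length on success"
 * return value. Lengths above INT_MAX are reported as INT_MAX instead of
 * being narrowed, since a narrowed value could come out negative and be
 * mistaken for an error code. */
static int clamp_line_length(size_t len)
{
    return len <= (size_t) INT_MAX ? (int) len : INT_MAX;
}
```

With this approach, callers that only test the sign of the return value keep working for oversized lines, while callers that need the exact length of a very long line would have to take it from the string buffer itself rather than from the return value.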