Fix indexing bug by flushing BCF bgzf stream after header write #1742
`bcf_idx_init()` calls `bgzf_tell()` to get the starting index offset. This was OK when single-threaded, but broke with multiple threads because `bgzf_tell()` lies about the file offset unless `bgzf_flush()` was called first. `SAM.gz`, `BAM` and `VCF.gz` all did this, but `BCF` didn't, leading to an incorrect first index entry when combining multiple threads with indexing on the fly. Fix by adding the missing `bgzf_flush()` after writing the header.

As a side benefit, the BCF variant records will now start in a fresh BGZF block, instead of being mixed in with part of the BCF header.
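
To illustrate why the flush matters, here is a toy model of the failure mode. It is not htslib code: the `toy_*` names, the block size constant, and the made-up compressed-size formula are all hypothetical. It assumes only that a virtual offset is the compressed block address shifted left by 16 bits, OR-ed with the offset inside the current block, and that with worker threads the block address is not advanced until queued blocks are drained by a flush.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BLOCK_SIZE 0xff00u /* hypothetical uncompressed block size */

/* Toy stand-in for a threaded block-compressed writer (not the real BGZF struct). */
typedef struct {
    uint64_t block_address; /* compressed bytes known to be on disk */
    uint64_t pending_bytes; /* compressed bytes still queued with workers */
    uint32_t block_offset;  /* bytes buffered in the current block */
} toy_bgzf;

/* Made-up compression ratio; only here to give blocks a compressed size. */
static uint64_t toy_compressed_size(uint32_t len) {
    return len / 4 + 26;
}

/* Virtual offset built from whatever the writer currently believes. */
static uint64_t toy_tell(const toy_bgzf *fp) {
    assert(fp->block_offset <= TOY_BLOCK_SIZE);
    return (fp->block_address << 16) | (fp->block_offset & 0xFFFFu);
}

/* Buffer data; full blocks are handed to "workers" without
 * updating block_address, which is what makes toy_tell() stale. */
static void toy_write(toy_bgzf *fp, size_t len) {
    while (len > 0) {
        size_t room = TOY_BLOCK_SIZE - fp->block_offset;
        size_t n = len < room ? len : room;
        fp->block_offset += (uint32_t)n;
        len -= n;
        if (fp->block_offset == TOY_BLOCK_SIZE) {
            fp->pending_bytes += toy_compressed_size(fp->block_offset);
            fp->block_offset = 0;
        }
    }
}

/* Emit the partial block, drain the queue, and advance block_address. */
static void toy_flush(toy_bgzf *fp) {
    if (fp->block_offset > 0) {
        fp->pending_bytes += toy_compressed_size(fp->block_offset);
        fp->block_offset = 0;
    }
    fp->block_address += fp->pending_bytes;
    fp->pending_bytes = 0;
}

/* Offset an indexer would record as the start of the records,
 * with or without a flush after the header. */
static uint64_t toy_header_end(size_t header_len, int flush_first) {
    toy_bgzf fp = {0, 0, 0};
    toy_write(&fp, header_len);
    if (flush_first)
        toy_flush(&fp);
    return toy_tell(&fp);
}
```

In this model, `toy_header_end(100000, 0)` still reports block address 0 even though a full block has already been queued, while `toy_header_end(100000, 1)` reports a non-zero block address with an in-block offset of 0, matching the "fresh block" behaviour described above.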
test/index.bcf.csi has to be replaced due to the extra flush adding one more block to the (uncompressed) index.bcf file that gets generated by the test harness.
Fixes #1740