1.17
Download the source code here: htslib-1.17.tar.bz2. (The "Source code" downloads are generated by GitHub and are incomplete as they are missing some generated files.)
-
A new API for iterating through a BAM record's aux field. (PR #1354, addresses #1319. Thanks to John Marshall)
-
Text mode for bgzip. Allows bgzip to compress lines of text with block breaks at newlines. (PR #1493, thanks to Mike Lin for the initial version PR #1369)
-
Make tabix support CSI indices with large positions. Unlike SAM and VCF files, BED files do not set a maximum reference length, which hindered CSI support. This change sets an arbitrarily large size of 100G to enable it to work. (PR #1506)
-
Add a fai_line_length function. Exposes the internal line-wrap length. (PR #1516)
-
Check for invalid barcode tags in fastq output. (PR #1518, fixes samtools/samtools#1728. Reported by Poshi)
-
Warn if reference found in a CRAM file is not contained in the specified reference file. (PR #1517 and PR #1521, adds diagnostics for #1515. Reported by Wei WeiDeng)
-
Add a faidx_seq_len64 function that can return sequence lengths longer than INT_MAX. At the same time limit faidx_seq_len to INT_MAX output. Also add a fai_adjust_region function to ensure given ranges do not go beyond the end of the requested sequence. (PR #1519)
-
Add a bcf_strerror function to give text descriptions of BCF errors. (PR #1510)
-
Add CRAM SQ/M5 header checking when specifying a fasta file. This is to prevent creating a CRAM that cannot be decoded again. (PR #1522. In response to samtools/samtools#1748 though not a direct fix)
-
Improve support for very long input lines (> 2Gbyte). This is mostly useful for tabix which does not do much interpretation of its input. (PR #1542, a partial fix for #1539)
-
Speed up load_ref_portion. This function has been sped up by about 7x, which speeds up low-depth CRAM decoding by about 10%. (PR #1551)
-
Expand CRAM API to cope with new samtools cram_size command. (PR #1546)
-
Merge neighbouring I and D ops into one op within pileup. This means 4M1D1D1D3M is reported as 4M3D3M. Fixing this in sam.c not only improves samtools mpileup output, but also gives consistent results to any tool using the mpileup API. (PR #1552, fixes the last remaining part of samtools/samtools#139)
-
Update the API documentation for bgzf_mt as it referred to a previous iteration. (PR #1556, fixes #1553. Reported by Raghavendra Padmanabhan)
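The pileup change above can be illustrated with a small standalone sketch. It does not use htslib and is not the code from sam.c; the `cigar_op` type and both function names are hypothetical, and it assumes that only runs of the same op type (I with I, D with D) are combined.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical representation of one CIGAR operation:
 * a length and an op character such as 'M', 'I' or 'D'. */
typedef struct {
    unsigned len;
    char op;
} cigar_op;

/* Merge runs of adjacent I ops, and runs of adjacent D ops, in place.
 * Assumption: only identical neighbouring op types are combined.
 * All other ops are copied through unchanged.
 * Returns the new number of ops. */
static size_t merge_adjacent_indels(cigar_op *ops, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        int indel = (ops[i].op == 'I' || ops[i].op == 'D');
        if (out > 0 && indel && ops[out - 1].op == ops[i].op) {
            /* Same indel type as the previous kept op: extend it. */
            ops[out - 1].len += ops[i].len;
        } else {
            ops[out++] = ops[i];
        }
    }
    return out;
}

/* Write the ops as a CIGAR string into buf (truncated if buf is
 * too small). Returns buf. */
static char *cigar_to_string(const cigar_op *ops, size_t n,
                             char *buf, size_t size)
{
    size_t used = 0;
    if (size > 0)
        buf[0] = '\0';
    for (size_t i = 0; i < n && used < size; i++) {
        int w = snprintf(buf + used, size - used, "%u%c",
                         ops[i].len, ops[i].op);
        if (w < 0 || (size_t)w >= size - used)
            break;
        used += (size_t)w;
    }
    return buf;
}
```

Called on the five ops 4M, 1D, 1D, 1D, 3M, merge_adjacent_indels leaves three ops, which cigar_to_string renders as 4M3D3M, matching the example in the release note.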
Build changes
-
Use POSIX grep in testing as egrep and fgrep are considered obsolete. (PR #1509, thanks to David Seifert)
-
Switch to building libdeflate with cmake for Cirrus CI. (PR #1511)
-
Ensure strings in config_vars.h are escaped correctly. (PR #1530, fixes #1527. Reported by Lucas Czech)
-
Easier modification of shared library permissions during install. (PR #1532, fixes #1525. Reported by StephDC)
-
Fix build on ancient compilers. Added -std=gnu90 to build tests so older C compilers will still be happy. (PR #1524, fixes #1523. Reported by Martin Jakt)
-
Switch MacOS CI tests to an ARM-based image. (PR #1536)
-
Cut down the number of embed_ref=2 tests that get run. (PR #1537)
-
Add symbol versions to libhts.so. This is to aid package developers. (PR #1560 addresses #1505, thanks to John Marshall. Reported by Stefan Bruens)
-
htscodecs now updated to v1.4.0. (PR #1563)
-
Cleaned up misleading system error reports in test_bgzf. (PR #1565)
Bug fixes
-
VCF. Fix n-squared complexity in sample line with many adjacent tabs [fuzz]. (PR #1503)
-
Improved bcftools detection and reporting of bgzf decode errors. (PR #1504, thanks to Lilian Janin. PR #1529 thanks to Bergur Ragnarsson, fixes #1528. PR #1554)
-
Prevent crash when the only FASTA entry has no sequence [fuzz]. (PR #1507)
-
Fixed typo in sam.h documentation. (PR #1512, thanks to kojix2)
-
Fix buffer read-overrun in bam_plp_insertion_mod. (PR #1520)
-
Fix hash keys being left behind by bcf_hdr_remove. (PR #1535, fixes #1533. Reported by Giulio Genovese in #842)
-
Make bcf_hdr_idinfo_exists more robust by checking that the id value exists. (PR #1544, fixes #1538. Reported by Giulio Genovese)
-
CRAM improvements. Fixed crash with multi-threaded CRAM. Fixed a bug in the codec parameter learning for CRAM 3.1 name tokeniser. Fixed CRAM compression container substitution matrix generation. (PR #1558, PR #1559 and PR #1562)