Enable auto-vectorisation in CRAM 3.1 codecs.
I suspect this was initially hard-coded as on, but later became something
we explicitly enable, and we forgot to add that code to htslib.

On Illumina it made little difference, but wasn't detrimental.
I need bigger data sets, but they're mostly on unavailable systems
right now.  Small tests demonstrate utility though, specifically on
decode speeds.

For other platforms (each pair of timings is encode, then decode):

Ultima Genomics
===============

Orig

real    0m25.784s
user    0m24.506s
sys     0m1.189s

real    0m9.155s
user    0m7.775s
sys     0m1.379s

RANS_ORDER_SIMD_AUTO

real    0m24.987s
user    0m23.699s
sys     0m1.219s

real    0m8.097s
user    0m6.635s
sys     0m1.461s

That's 13% quicker decode and 3% quicker encode.

The blocks using the 32-way codec are mostly QS and tags:

$ ~/samtools/samtools cram-size -v _.cram|grep 32x16
BLOCK       10       617823        77895  12.61% r32x16-o1
BLOCK       12    911236491    188134803  20.65% r32x16-o1R  QS
BLOCK       27       232221        38816  16.72% r32x16-o0   FC
BLOCK       31        54067        10718  19.82% r32x16-o0   BS
BLOCK  7614554    917596491     50148593   5.47% r32x16-o1   t0Z
BLOCK  7630914    931877007    108982153  11.69% r32x16-o1R  tpB

ONT
===

Orig

real    0m3.018s
user    0m2.854s
sys     0m0.130s

real    0m0.578s
user    0m0.538s
sys     0m0.040s

RANS_ORDER_SIMD_AUTO

real    0m2.912s
user    0m2.740s
sys     0m0.120s

real    0m0.500s
user    0m0.430s
sys     0m0.070s

That's 16% quicker decode and 4% quicker encode, but sample size is
admittedly tiny for both tests.
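
The percentages quoted for both platforms can be reproduced from the "real" times if they are read as throughput ratios, (old / new - 1) * 100. That reading is an assumption, as the message does not say how they were computed, and `speedup_pct` is a hypothetical helper. A minimal sketch:

```c
#include <stdio.h>

/* Percentage speed-up going from t_old to t_new seconds, expressed as
 * a throughput ratio: (t_old / t_new - 1) * 100.
 * Assumption: this is how the quoted figures were derived. */
static double speedup_pct(double t_old, double t_new) {
    return (t_old / t_new - 1.0) * 100.0;
}

/* Print the four comparisons using the "real" timings quoted above. */
static void print_speedups(void) {
    printf("Ultima encode: %.1f%%\n", speedup_pct(25.784, 24.987));
    printf("Ultima decode: %.1f%%\n", speedup_pct(9.155, 8.097));
    printf("ONT encode:    %.1f%%\n", speedup_pct(3.018, 2.912));
    printf("ONT decode:    %.1f%%\n", speedup_pct(0.578, 0.500));
}
```

With timings this short, run-to-run noise could easily move the results by a percentage point or more.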

File size growth is under 0.1%, mainly due to using 32 rANS states
instead of 4.  The RANS_ORDER_SIMD_AUTO flag basically enables the
32-way rANS if the block is sufficiently large (>50kb), so the extra
112 byte state overhead isn't significant.
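
The 112-byte figure follows from 28 extra states at 4 bytes each. The sketch below spells out that arithmetic and the size-based choice. It is illustrative only: the `demo_` names are hypothetical, the 4-byte state width is inferred from the 112-byte figure, and ">50kb" is read as 50,000 bytes. None of this is taken from the htscodecs source.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: each rANS state is stored as a 32-bit word, which is
 * what the 112-byte figure implies: (32 - 4) states * 4 bytes. */
static size_t extra_state_bytes(unsigned old_states, unsigned new_states) {
    return (size_t)(new_states - old_states) * sizeof(uint32_t);
}

/* Hypothetical threshold: ">50kb" read as 50,000 bytes. */
#define DEMO_SIMD_AUTO_MIN_SIZE 50000

/* Hypothetical helper: number of interleaved states for a block. */
static unsigned demo_pick_states(size_t in_size) {
    return in_size > DEMO_SIMD_AUTO_MIN_SIZE ? 32 : 4;
}

/* Extra state overhead as a percentage of a given compressed size. */
static double overhead_pct(size_t compressed_size) {
    return 100.0 * (double)extra_state_bytes(4, 32) / (double)compressed_size;
}
```

For instance, a block that compresses to 11,200 bytes would carry a 1% state overhead, and the proportion shrinks as blocks get larger.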
jkbonfield committed Sep 6, 2023
1 parent 5acbc15 commit 618af7c
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion cram/cram_io.c
@@ -1852,8 +1852,9 @@ static char *cram_compress_by_method(cram_slice *s, char *in, size_t in_size,
         // see enum cram_block. We map RANS_* methods to order bit-fields
         static int methmap[] = { 1, 64,9, 128,129, 192,193 };
 
+        int m = method == RANS_PR0 ? 0 : methmap[method - RANS_PR1];
         cp = rans_compress_4x16((unsigned char *)in, in_size, &out_size_i,
-                                method == RANS_PR0 ? 0 : methmap[method - RANS_PR1]);
+                                m | RANS_ORDER_SIMD_AUTO);
         *out_size = out_size_i;
         return (char *)cp;
     }
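
The change computes the order bit-field once and ORs in the auto-SIMD flag before calling the compressor. A standalone sketch of that mapping follows. The `demo_` enum and `DEMO_SIMD_AUTO` are hypothetical stand-ins: the enum order is inferred from the `methmap` values, and the 0x100 flag value is an assumption chosen only so that it does not collide with the `methmap` bits. The real definitions live in the htslib and htscodecs headers.

```c
/* Hypothetical stand-ins for the RANS_* method enum; the order is
 * inferred from the methmap values in the diff. */
enum demo_method {
    DEMO_PR0, DEMO_PR1, DEMO_PR64, DEMO_PR9,
    DEMO_PR128, DEMO_PR129, DEMO_PR192, DEMO_PR193
};

/* Assumed flag bit, chosen to sit above every methmap value. */
#define DEMO_SIMD_AUTO 0x100

/* Map a method to its order bit-field, then request auto-SIMD. */
static int demo_order_bits(enum demo_method method) {
    static const int methmap[] = { 1, 64, 9, 128, 129, 192, 193 };
    int m = method == DEMO_PR0 ? 0 : methmap[method - DEMO_PR1];
    return m | DEMO_SIMD_AUTO;
}
```

Because the flag is a separate bit, masking it off recovers the original order value, so existing method selection is unaffected.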