Commit
The old table equates to:

        0 1 2 3
    A : C G T N
    C : A G T N
    G : A C T N
    T : A C G N
    N : A C G T

The new one is:

        0 1 2 3
    A : T C G N
    C : A G T N
    G : T C A N
    T : A G C N
    N : A C G T

This affects the generation of BS codes for Ref/Seq combinations. The idea is that we want common substitutions to share the same code value, so that compression improves.

Mostly this is a (tiny) win for compression, across a multitude of technologies and organisms. There are a few exceptions (one of the Streptococcus samples grew, and AVITI had a marginal growth). On platforms that don't have aggressive quality quantisation the change is generally an irrelevance, as the files become dominated by other data. Even on Illumina, it's generally of the order of 0.1% of total file size. However, it's completely free and has no real CPU impact either.
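To make the effect of the reordering concrete, here is a minimal sketch that looks up the code for a Ref/Seq pair in each table. The dictionaries transcribe the two tables from this commit message; the names `OLD_TABLE`, `NEW_TABLE`, `bs_code` and `TRANSITIONS` are hypothetical and do not come from the actual source code. One thing that can be read directly off the tables is that the four transitions (A<->G, C<->T) are spread over codes 0, 1 and 2 in the old table but all map to code 2 in the new one, which is presumably the kind of sharing the message refers to.

```python
# Rows transcribed from the commit message: for each reference base,
# the substituted read bases in order of code 0, 1, 2, 3.
OLD_TABLE = {"A": "CGTN", "C": "AGTN", "G": "ACTN", "T": "ACGN", "N": "ACGT"}
NEW_TABLE = {"A": "TCGN", "C": "AGTN", "G": "TCAN", "T": "AGCN", "N": "ACGT"}

# (reference base, read base) pairs for the four transitions.
TRANSITIONS = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C")]


def bs_code(table, ref_base, read_base):
    """Return the 0..3 code for read_base observed where the reference has ref_base.

    Hypothetical helper for illustration: it is simply the column index
    of read_base within the row for ref_base.
    """
    return table[ref_base].index(read_base)


def codes_used(table, pairs):
    """Set of distinct codes the given (ref, read) pairs map to."""
    return {bs_code(table, ref, read) for ref, read in pairs}


if __name__ == "__main__":
    print("old:", sorted(codes_used(OLD_TABLE, TRANSITIONS)))
    print("new:", sorted(codes_used(NEW_TABLE, TRANSITIONS)))
```

With fewer distinct code values in play for the same class of substitution, the code stream is more skewed and an entropy coder has less to encode, which is consistent with the small size reductions described above.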