
Releases: tbenavi1/genomescope2.0

v2.0.1

10 Jul 15:40
5063303

This release makes a small change to how the kmer histogram file is loaded, improving compatibility with fastK by ensuring that the last kmer frequency is included in the modeling and plots.

Previously, the last row of the histogram file was skipped during loading, because several tools use that row to count all kmers at or above a ceiling rather than kmers at one distinct frequency. For example, jellyfish records kmer frequencies up to 10,000 by default, so the last row counts kmers that occur 10,000 or more times (kmc uses 255 by default). This can lead to unusual plots in which the kmer count at 10,000 spikes well above the count at 9,999 (or other high values). For jellyfish or kmc this can be addressed by setting a larger maximum kmer frequency (e.g. 100,000 or 1,000,000 as needed), but fastK has a hard cutoff of 32,767 that cannot be exceeded. For gigabase genomes sequenced to reasonable coverage, there are often many kmers that occur more than 32,767 times, and these were previously excluded from the GenomeScope analysis.

This release includes the last row of the file in the analysis so that these kmer counts are taken into account. This should give a more accurate estimate of genome size and repetitiveness, with little to no impact on the estimates of heterozygosity or coverage. The plots now also display this last value to make it clear that it is included. The previous behavior can be restored by manually setting "Max k-mer coverage" to exclude the last row in the file.
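To illustrate why the last row matters, here is a minimal sketch in Python. It is not GenomeScope's own code. The function names `load_histogram` and `total_kmer_instances`, the `include_last_row` flag, and the toy histogram are all hypothetical. The sketch assumes a whitespace-separated two-column file (frequency, then number of distinct kmers) whose final row is a ceiling bin.

```python
def load_histogram(text, include_last_row=True):
    """Parse 'frequency count' lines into a list of (frequency, count) tuples.

    include_last_row is a hypothetical flag: False mimics the old behavior
    (the ceiling row is dropped), True mimics the behavior in this release.
    """
    rows = []
    for line in text.strip().splitlines():
        freq, count = line.split()  # works for space- or tab-separated columns
        rows.append((int(freq), int(count)))
    return rows if include_last_row else rows[:-1]


def total_kmer_instances(rows):
    """Sum frequency * count over all rows: the total number of kmer occurrences."""
    return sum(freq * count for freq, count in rows)


# Toy histogram with a ceiling of 5: the last row stands for "5 or more times",
# so its count is much larger than the count for frequency 4.
TOY_HISTOGRAM = """\
1 100
2 40
3 10
4 2
5 30
"""

with_last = load_histogram(TOY_HISTOGRAM, include_last_row=True)
without_last = load_histogram(TOY_HISTOGRAM, include_last_row=False)

print(total_kmer_instances(with_last))     # 368
print(total_kmer_instances(without_last))  # 218
```

In this toy example, dropping the ceiling row removes 150 of the 368 kmer occurrences, all of them from the high-frequency (repetitive) end of the histogram. Even with the row included, the total is still a lower bound, since kmers that occur more often than the ceiling are counted as if they occurred exactly at the ceiling.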

First Release of GenomeScope 2.0

07 Feb 15:55
7dc6806

This is the first release.