Releases: fbreitwieser/krakenuniq
KrakenUniq v1.0.4
This release removes the requirement to have the "file" command in the Docker or Singularity image (thanks @lskatz and @boulund) by no longer using "file" to determine the compression type of paired read files. Gzipped and bzipped fasta files with paired reads are now supported as well. The --preload switch is now forced when building the database, for speed.
KrakenUniq v1.0.3
This release adds a Dockerfile (thanks @Jessime) and makes a few changes to the documentation.
KrakenUniq v1.0.2
This release fixes an issue that could produce incorrect output when multiple krakenuniq processes were run in the same folder with --paired input files.
KrakenUniq v1.0.1
KrakenUniq v1.0.0
KrakenUniq v0.7.3
This maintenance release provides the following updates:
(1) fixes issues with building large databases
(2) installs Jellyfish version 1 into KRAKENUNIQ_INSTALL_DIR/jellyfish-install/bin/ if the -j switch is used (KrakenUniq requires Jellyfish version 1 to build databases)
(3) fixes --work-on-disk option (#97)
KrakenUniq v0.7.2
This maintenance release fixes the use of the --paired option in krakenuniq and a failure at the last step (report) of building a krakenuniq database.
KrakenUniq v0.7.1
This minor release fixes a bug in the Makefile that caused unusable count_unique and set_lcas executables to be installed. The bug caused a fatal error when building a new krakenuniq database. Classification with an existing database was not affected.
Thanks to @alekseyzimin for fixing the bug.
KrakenUniq v0.7
New option for low-memory computers: --preload-size
By default, KrakenUniq memory-maps the database; i.e., it does not load the entire database into main memory. (Kraken 1 employs the same strategy.) This makes classification of larger read datasets much slower, but it allows KrakenUniq to run on machines with little available main memory. If enough free RAM is available to hold the entire database, we recommend explicitly loading it before classification with the --preload flag, which dramatically speeds up classification, often by a factor of 20 or more.
To improve performance when there is not enough main memory to load the entire database into RAM, we added a new capability to KrakenUniq. With this feature, only one chunk of the database is loaded into memory at a time; the algorithm then iterates over the reads and looks up all k-mers in those reads that have a match in the current chunk. This process is repeated until the entire database has been processed. The k-mer lookups are then merged, and reads are classified based on the results for the full database. This makes it feasible to run KrakenUniq on very large datasets and huge databases on virtually any computer, even a laptop, while producing exact classifications that are identical to those of KrakenUniq in its other modes. To use the feature, pass --preload-size with the amount of main memory to use for loading database chunks, e.g., --preload-size 8G or --preload-size 500M.
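The chunked lookup described above can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not KrakenUniq's implementation: the database is a plain dict from k-mer to taxon name, the chunk size is counted in entries rather than bytes, and the final "most hits wins" step is only a stand-in for the real classification. All function names (kmers, iter_chunks, classify_chunked) are hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def iter_chunks(items, chunk_size):
    """Yield the database as consecutive dict chunks of at most chunk_size entries."""
    for start in range(0, len(items), chunk_size):
        yield dict(items[start:start + chunk_size])


def classify_chunked(reads, database, k, chunk_size):
    """Look up k-mers one database chunk at a time, merge the hits, then classify."""
    items = sorted(database.items())
    hits = {name: Counter() for name in reads}
    for chunk in iter_chunks(items, chunk_size):
        # Only this chunk is held "in memory"; every read is scanned against it.
        for name, seq in reads.items():
            for kmer in kmers(seq, k):
                taxon = chunk.get(kmer)
                if taxon is not None:
                    hits[name][taxon] += 1
    # Merged hits now cover the full database. Assumption: pick the taxon with
    # the most k-mer hits (ties broken by name) as a simplified classifier.
    result = {}
    for name, counts in hits.items():
        if counts:
            result[name] = min(counts, key=lambda t: (-counts[t], t))
        else:
            result[name] = None
    return result


database = {"ACG": "taxA", "CGT": "taxA", "GTT": "taxB", "TTT": "taxB", "GGG": "taxC"}
reads = {"r1": "ACGTT", "r2": "TTTTG", "r3": "CCCCC"}

chunked = classify_chunked(reads, database, k=3, chunk_size=2)
whole = classify_chunked(reads, database, k=3, chunk_size=len(database))
```

Because the per-read hit counts are summed over all chunks before any read is classified, the chunk size in this sketch affects only how much of the database is held at once, not the result: `chunked` and `whole` are equal.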
Automatic detection of compressed fastq/fasta input
The input format (fastq or fasta, bzip2 or gzip compressed) is now detected automatically, so the --fasta-input, --fastq-input, --gzip-compressed and --bzip2-compressed switches are no longer needed.
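One common way to detect these formats without external tools is to inspect the first bytes of the file. The Python sketch below assumes this approach for illustration only; the release notes do not say how KrakenUniq performs the detection, and the function names are hypothetical. It relies on the standard gzip magic bytes (0x1f 0x8b), the bzip2 signature "BZh", and the convention that fasta records start with ">" and fastq records with "@".

```python
import bz2
import gzip


def detect_compression(path):
    """Return 'gzip', 'bzip2' or 'none' based on the file's leading magic bytes."""
    with open(path, "rb") as handle:
        magic = handle.read(3)
    if magic[:2] == b"\x1f\x8b":
        return "gzip"
    if magic == b"BZh":
        return "bzip2"
    return "none"


def detect_format(path):
    """Return (compression, sequence_format) for a fasta or fastq file."""
    compression = detect_compression(path)
    opener = {"gzip": gzip.open, "bzip2": bz2.open, "none": open}[compression]
    # Read the first decompressed byte to tell fasta from fastq.
    with opener(path, "rb") as handle:
        first = handle.read(1)
    if first == b">":
        return compression, "fasta"
    if first == b"@":
        return compression, "fastq"
    raise ValueError(f"{path}: not recognized as fasta or fastq")
```

A call such as `detect_format("reads_1.fq.gz")` (a hypothetical file name) would return a pair like `("gzip", "fastq")` for a gzip-compressed fastq file.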
Thanks to @cpockrandt for developing the code and @salzberg and @alekseyzimin for the initial idea, suggestions and testing.
KrakenUniq v0.6 with improved preloading of the database
KrakenUniq (and also Kraken) often ran very slowly with very large databases. The problem was that --preload did not actually force the database to be loaded into memory, so the program spent a very long time (many days) going back and forth to disk. With the fix included in this release, krakenuniq ran in 16 minutes on a database that previously took more than 100 hours.
Thanks a lot to @alekseyzimin for the development and contribution of the fix and @salzberg for reporting the issue!