Releases: raufs/skDER

v1.1.0

17 May 19:51
  • Introduce the ability to specify the GTDB release, and switch the default to GTDB R220 for when users request to auto-download and include all genomes from a particular genus/species.
  • Remove the need to symlink genomes locally; fastx index files are now written in the same folder as the input genomes and deleted afterwards.
  • Parallelization when computing N50 is now done by splitting the genomes across the allocated CPUs, so at most X files are written to at a time, where X is the number of CPUs. This addresses #4.
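The chunking idea in the last item can be sketched as follows. This is a minimal illustration, not skDER's actual code: the function names `n50` and `split_into_chunks` are hypothetical, and contig lengths are assumed to be already available as plain integers. Each chunk would be handed to one worker, so the number of files being written at once is bounded by the number of CPUs.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # N50 is the length of the contig at which the running total
        # first reaches half of the assembly size.
        if running >= half:
            return length
    return 0


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal chunks."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks
```

With 10 genomes and 4 CPUs, `split_into_chunks` yields four chunks of sizes 3, 3, 2 and 2; with fewer genomes than CPUs, it yields one chunk per genome.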

v1.0.10

29 Mar 15:24
3260e70
  • Minor change: added a new argument to use https://ftp.ncbi.nlm.nih.gov/genomes instead of https://ftp.ncbi.nih.gov/genomes in case there are issues connecting to the latter. This gets passed to ncbi-genome-download's -u argument.

Full Changelog: v1.0.9...v1.0.10

v1.0.9

29 Mar 00:09
  • Support for gzipped files added (#4)
  • GTDB/NCBI downloaded genomes are now kept in gzip form
  • FASTA files ending in *.fas now allowed (#4)
  • If local input genomes are provided, the default behavior is now to symlink the files in the skDER results directory and do indexing for N50 calculation there.
  • FASTA confirmation is now optional, because it is currently iterative and can take a while if there are a lot of files (it might be parallelized in the future and turned back on as the default).
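As a rough illustration of gzip-aware FASTA checking, the sketch below opens a file as gzip or plain text based on its extension and tests whether the first non-empty line is a FASTA header. It is an assumption-laden example rather than skDER's implementation: the helper names `open_maybe_gzipped` and `looks_like_fasta` are hypothetical, and detection by the `.gz` suffix is a simplification.

```python
import gzip


def open_maybe_gzipped(path):
    """Open a text file, transparently handling a .gz suffix (assumed convention)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def looks_like_fasta(path):
    """Return True if the first non-empty line starts with '>'."""
    with open_maybe_gzipped(path) as handle:
        for line in handle:
            if line.strip():
                return line.startswith(">")
    # Empty files are not treated as FASTA.
    return False
```

Because the check reads each file in turn, it scales linearly with the number of input genomes, which is consistent with the note above that confirmation can be slow for large inputs.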

Full Changelog: v1.0.8...v1.0.9

v1.0.8

16 Oct 16:45
9f65fff
  • Fix broken GTDB-based downloading feature.
  • Polish names for genomic assemblies downloaded based on GTDB species names.

Full Changelog: v1.0.7...v1.0.8

v1.0.7

12 Oct 12:45
  • Corrected faulty usage of the -s option in skani triangle, which is now set to its default value. This should result in the more accurate ANI estimates being used by the dereplication methods, as intended.
  • Updated stats and runtime info for running dynamic/greedy approaches on the Wiki.
  • Added a new secondary clustering option, -n, which reports the relation/distance of all genomes in the input set to their nearest representative genome.

Full Changelog: v1.0.6...v1.0.7

v1.0.6

29 Sep 14:11
ac9c2d5
  • Mostly just updates to the README & help function.
  • Added missing library import statements in util.py

Full Changelog: v1.0.5...v1.0.6

v1.0.5

20 Aug 20:54
f87b42c
  • update packaging of program + installation guide

v1.0.4

15 Aug 23:46
942caf1
  • fix SKDER_PATH now that skder moved to bin/

v1.0.2

15 Aug 20:24
e8d8694

updates for v.1.0.2

  • KEY: Correct overflow issue in C++ code related to integer multiplication in computing scores for dynamic dereplication approach
  • Introduce a greedy set cover dereplication approach as an alternate method
  • Improve code + documentation organization
  • Add test case
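The greedy set cover idea mentioned above can be illustrated with a generic sketch. It shows the textbook greedy heuristic, not skDER's own code: the function name `greedy_set_cover` and the input format (a dict mapping each genome to the set of genomes it is similar enough to represent) are assumptions made for the example.

```python
def greedy_set_cover(covers):
    """Pick representatives greedily until every genome is covered.

    covers: dict mapping genome -> set of genomes it can represent
    (hypothetical input format; every genome also covers itself).
    """
    uncovered = set(covers)
    representatives = []
    while uncovered:
        # Choose the genome that covers the most still-uncovered genomes;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(
            sorted(covers),
            key=lambda g: len((covers[g] | {g}) & uncovered),
        )
        representatives.append(best)
        uncovered -= covers[best] | {best}
    return representatives


example = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A"},
    "D": {"E"},
    "E": {"D"},
}
print(greedy_set_cover(example))  # ['A', 'D']
```

The greedy heuristic does not guarantee the smallest possible set of representatives, but it is simple and guarantees that every genome is covered by at least one selected representative.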

Full Changelog: v1.0.1...v1.0.2

v1.0.1

04 Aug 16:36
0f81997
  • Create directory with representative genomes in the output directory.
  • Add version flag; change the input for the --genomes argument from a directory to multiple paths to genome files.
  • Update Enterococcus dereplication showcasing.

Full Changelog: v1.0...v1.0.1