Releases · bluenote-1577/skani

05 Jul 01:42

bluenote-1577

v0.2.2

666894c

v0.2.2

v0.2.2 released - 2024-07-04

Major

added the --small-genomes preset. This is just an alias for -c 30 -m 200 --faster-small. This makes skani much faster when comparing hundreds of thousands of small genomes.
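To make the alias relationship concrete, here is a minimal Python sketch that expands preset flags into the options these release notes say they stand for. The `PRESETS` table and the `expand_presets` helper are hypothetical names used only for illustration and are not part of skani. The `--medium`, `--fast` and `--slow` entries are taken from the v0.1.2 and v0.1.0 notes further down this page.

```python
# Preset aliases as described in these release notes.
# PRESETS and expand_presets are hypothetical names, not skani's own code.
PRESETS = {
    "--small-genomes": ["-c", "30", "-m", "200", "--faster-small"],
    "--medium": ["-c", "70"],
    "--fast": ["-c", "200"],
    "--slow": ["-c", "30"],
}


def expand_presets(args):
    """Replace each known preset flag with the flags it stands for."""
    expanded = []
    for arg in args:
        # Unknown arguments pass through unchanged.
        expanded.extend(PRESETS.get(arg, [arg]))
    return expanded


cmd = expand_presets(["skani", "triangle", "--small-genomes", "genomes/*.fa"])
print(" ".join(cmd))
# prints: skani triangle -c 30 -m 200 --faster-small genomes/*.fa
```

The sketch only builds the argument list and does not run skani. The input path `genomes/*.fa` is a made-up placeholder.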

Minor

fixed a bug where skani triangle --full-matrix gave different results between STDOUT and -o (thanks to Florian Plaza Oñate)
added a --diagonal option (suggested by Antonio Camargo) to print diagonal entries for sparse and lower-triangular distance matrices
added a warning to use --faster-small when comparing too many contigs (e.g. viruses, plasmids).
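The relationship between a lower-triangular matrix, its diagonal, and the full matrix can be sketched in a few lines of Python. This is an illustration on in-memory lists, not a parser for skani's output files. The function name, the sample values, and the choice of 100.0 for the diagonal (self-comparison ANI) are all assumptions.

```python
def lower_to_full(lower, diagonal=100.0):
    """Expand a strictly lower-triangular matrix into a full symmetric one.

    lower[i] holds the values of row i against rows 0..i-1, so row 0 is empty.
    The diagonal value is an assumption (self-comparison), not read from data.
    """
    n = len(lower)
    full = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if len(lower[i]) != i:
            raise ValueError(f"row {i} should have {i} entries")
        full[i][i] = diagonal
        for j in range(i):
            # Mirror each entry across the diagonal.
            full[i][j] = lower[i][j]
            full[j][i] = lower[i][j]
    return full


# Made-up ANI values for three genomes.
lower = [[], [98.5], [82.1, 81.7]]
for row in lower_to_full(lower):
    print(row)
```

Mirroring is only valid for symmetric quantities such as ANI. Aligned fraction is not symmetric (see the v0.1.1 notes), so it cannot be reconstructed this way.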

05 Jul 01:44

github-actions

latest

666894c

Commits

ff4e727: Update README.md (Jim Shaw)
619f63b: Update README.md (Jim Shaw)
ffcf1b5: Update README.md (Jim Shaw)
ec4e2e1: Update README.md (Jim Shaw)
1ca7597: Update README.md (Jim Shaw)
93dd7ac: Update README.md (Jim Shaw)
ee3132a: Update README.md (Jim Shaw)
9709a52: Fix output format in triangle mode (#31) (Florian Plaza Oñate)
b6cf523: added new integration test for -o, --full-matrix triangle (bluenote-1577)
a17e639: v0.2.2 - better small genome options and some bug fixes/features (bluenote-1577)
9c5fe00: v0.2.2 - fixing up tests (bluenote-1577)
666894c: v0.2.2 - running pre release script (bluenote-1577)

12 Oct 04:46

bluenote-1577

v0.2.1

adafd7f

v0.2.1

v0.2.1 released - 2023-10-11

More consistent support for small contigs and sequences.

Major

--faster-small option included in dist and triangle.

Genomes (and contigs with the -i, --ri, --qi options) with fewer than 20 marker k-mers are not screened according to the -s option. This has always been the case but was not documented online. Screening is skipped because it is not as effective for very small genomes. This makes skani more sensitive for small sequences but can hamper performance on very large datasets with lots of small genomes/contigs.

This heuristic can now be disabled with the --faster-small option. This can make huge comparisons much faster if you don't care about sensitivity for very small genomes.
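The rule described above can be written as a small decision function. This is a sketch of the behaviour as stated in these notes, not skani's actual implementation. The constant and function names are hypothetical.

```python
# Threshold quoted in the release notes above; the name is hypothetical.
MIN_MARKERS_FOR_SCREENING = 20


def applies_screening(marker_count, faster_small=False):
    """Return True if a genome/contig is subject to -s screening.

    Without --faster-small, sequences with fewer than 20 marker k-mers
    skip screening. With --faster-small, that exemption is disabled.
    """
    if faster_small:
        return True
    return marker_count >= MIN_MARKERS_FOR_SCREENING


for count in (5, 19, 20, 500):
    print(count, applies_screening(count), applies_screening(count, faster_small=True))
```

Under this reading, --faster-small trades sensitivity on very small sequences for speed, because fewer pairs reach the more expensive comparison step.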

Minor

skani's version is now displayed properly
Added some error messages for degenerate cases (and more testing)
We found that the statically built binary can be a lot slower in certain cases. File I/O may be an issue for the binary version. A note has been added to the README.

26 Sep 01:12

bluenote-1577

v0.2.0

d555178

v0.2.0

v0.2.0 released - 2023-09-26

BREAKING

The --learned-ani feature was buggy and has now been removed.

Major

Major bug found: debiasing for ANI was turned off if there were > 5000 queries present in skani search and skani dist. This bug is fixed now.

Minor

The rust API is changing in this version. Not published to Cargo yet (waiting on DDOtten/partitions#3 to be published to crates...)
Version number fixed

01 Sep 10:07

bluenote-1577

v0.1.5

3018522

v0.1.5

v0.1.5 released - 2023-09-01

Major

Improved "N" character support:

Changed the query-reference selection method slightly (via a small hack): marker seeds are now used to estimate reference length, so runs of "N" characters are not counted.
Seeds containing "N" characters are no longer indexed.
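As an illustration of the second point, the sketch below enumerates k-mers from a sequence and skips any that contain "N". It is a deliberate simplification: skani's real seeding scheme is not described in these notes, and the function name, the k value and the example sequence are assumptions.

```python
def kmers_without_n(seq, k):
    """Yield (position, k-mer) pairs, skipping any k-mer that contains 'N'."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" not in kmer:
            yield i, kmer


# Toy sequence with an ambiguous stretch in the middle.
print(list(kmers_without_n("ACGTNNACGT", 3)))
# prints: [(0, 'ACG'), (1, 'CGT'), (6, 'ACG'), (7, 'CGT')]
```

Every k-mer overlapping the "NN" stretch is dropped, so ambiguous bases contribute nothing to the index in this toy version.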

Minor

--robust now uses the learned ANI debiasing procedure by default.

15 Jun 21:14

bluenote-1577

v0.1.4

5f03545

v0.1.4

v0.1.4 released - 2023-06-14

Major

skani triangle had a bug where, if more than 5000 queries were present and neither --sparse nor -E was specified, the intermediate batch of 5000 queries would be written in sparse mode.
skani triangle -o was writing an upper-triangular matrix instead of a lower-triangular one (skani triangle > res gives a lower-triangular matrix). Matrices are now consistently lower-triangular.
Changed to lto = true for release mode. I see anywhere from a 5-10% speedup for this.

Minor

Updated some dependencies so that skani no longer depends on old crates that will be deprecated.

11 May 00:15

bluenote-1577

v0.1.3

7724041

v0.1.3

v0.1.3 released - 2023-05-09

Major

Fixed a bug where memory was blowing up in dist and triangle when the marker-index was activated. For big datasets, there could be > 100 GB of wasted memory.
skani now outputs intermediate results after processing each batch of 5000 queries. This means that outputs may no longer be deterministically ordered if there are > 5000 genomes, but you can sort the output file to get a deterministic order, e.g. skani triangle *.fa | sort -k 3 -n > sorted_skani_result.txt will guarantee deterministic output order.
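The same idea as the `sort -k 3 -n` pipeline above can be sketched in Python for cases where the results are post-processed in a script. The row layout and values below are made up for illustration and are not skani's actual output columns. The function name is hypothetical, and the key only approximates GNU sort's behaviour.

```python
def sort_rows(lines, column=3):
    """Order lines by the numeric value in the given 1-based column.

    Ties are broken by the full line text, so the result does not depend
    on the order in which batches were written.
    """
    def key(line):
        return (float(line.split()[column - 1]), line)

    return sorted(lines, key=key)


# Hypothetical rows arriving in batch order.
rows = [
    "b.fa c.fa 97.20",
    "a.fa b.fa 99.10",
    "a.fa c.fa 97.20",
]
for line in sort_rows(rows):
    print(line)
```

Because the key is fully determined by the line contents, any permutation of the same rows sorts to the same output.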

Minor

Changed the marker index hash table population method. Used to overestimate memory usage slightly.
New help message for marker parameters. Turns out that for small genomes, having more markers may make filtering significantly better.
Added -i option to sketch so you can sketch individual records in multifastas -- does not work for search yet though, only for sketching.

29 Apr 04:04

bluenote-1577

v0.1.2

abf2c2d

v0.1.2

v0.1.2 released - 2023-04-28.

Small fixes.

Added --medium pre-set, which is just -c 70. Seems to work okay for comparing fragmented genomes.
BREAKING: Changed --marker-index to --no-marker-index as a more sane option.
Added --distance option to skani triangle to output distance matrix (i.e. 100 - ANI) instead of similarity matrix.
Misc. help message fixes
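The similarity-to-distance conversion mentioned for --distance is simply 100 - ANI applied to every entry. A minimal sketch on an in-memory matrix follows. The function name and sample values are assumptions, and the rounding is added here only to keep floating-point noise out of the printed values.

```python
def ani_to_distance(ani_matrix):
    """Convert an ANI similarity matrix (percent) to distances via 100 - ANI."""
    return [[round(100.0 - value, 6) for value in row] for row in ani_matrix]


# Made-up ANI values for two genomes.
ani = [
    [100.0, 98.5],
    [98.5, 100.0],
]
print(ani_to_distance(ani))
# prints: [[0.0, 1.5], [1.5, 0.0]]
```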

09 Apr 19:28

bluenote-1577

v0.1.1

966fd01

v0.1.1

Small tweaks.

Made aligned fraction matrix a full matrix by default, since aligned fraction is not symmetric.
Fixed an issue with the statically compiled version being too slow.

07 Feb 21:28

bluenote-1577

v0.1.0

1afbcbf

v0.1.0

v0.1.0 released - 2023-02-07.

We added new experiments on the revised version of our preprint (pending bioRxiv update) in the appendix. We show skani has quite good AF correlation with MUMmer, and that it works decently on simple eukaryotic MAGs, especially with the --slow option (see below).

Major

ANI debiasing added - skani now uses a debiasing step with a regression model trained on MAGs to give more accurate ANIs. v0.0.1 gave robust but slightly overestimated ANIs, especially in the 95-97% range. Debiasing is enabled by default, but can be turned off with --no-learned-ani.
More accurate aligned fraction - chaining algorithm changed to give a more accurate aligned fraction (AF) estimate. The previous version had more variance and underestimated AF for certain assemblies.

Minor

Small contig/genome defaults made better - should be more sensitive so that they don't get filtered by default.
Repetitive k-mer masking made better - smarter settings and should work better for eukaryotic genomes; shouldn't affect prokaryotic genomes much.
--fast and --slow mode added - alias for -c 200 and -c 30 respectively.
More non x86_64 builds should work - there was a bug before where skani would be dysfunctional on non x86_64 architectures. It seems to at least build on ARM64 architectures successfully now.
