Releases: sourmash-bio/sourmash_plugin_directsketch
v0.4.1
What's Changed
This release includes a bugfix where using a zipfile without an explicit path would yield an error (#118). The remaining changes are internal, including adding parameter string validation and improving the sketching utilities for potential use in other plugins.
- MRG: refactor sketching utilities by @bluegenes in #112
- MRG: validate param strings by @bluegenes in #114
- MRG: update sourmash core to 0.16.0 by @bluegenes in #115
- MRG: fix bug in zip paths if output provided in current dir by @bluegenes in #121
- bump to 0.4.1 by @bluegenes in #128
Dependabot
- Bump reqwest from 0.12.7 to 0.12.8 by @dependabot in #110
- Bump futures from 0.3.30 to 0.3.31 by @dependabot in #111
- Bump pyo3 from 0.22.3 to 0.22.5 by @dependabot in #122
- Bump anyhow from 1.0.89 to 1.0.90 by @dependabot in #126
- Bump serde_json from 1.0.128 to 1.0.132 by @dependabot in #124
- Bump openssl from 0.10.66 to 0.10.68 by @dependabot in #125
Full Changelog: v0.4.0...v0.4.1
v0.4.0
This release introduces two new parameters:
- --checksum-failures: an output file to log any failures with the checksum file download and parsing, or any md5sum mismatches. Required for gbsketch.
- --batch-size: enables writing smaller, batched zipfiles. This is recommended for large database generation, as batches allow restart after unexpected failure. It should also address some issues arising from extremely large zips.
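The restart benefit of batching can be illustrated with a minimal Python sketch. This is not the plugin's implementation (which is written in Rust): the function `write_batches`, the `batch_N.zip` naming, and the `.partial` suffix are all assumptions made for illustration. The idea shown is only that completed batches are left alone on a rerun, so work resumes at the first missing batch.

```python
import os
import tempfile
import zipfile


def write_batches(items, out_dir, batch_size):
    """Write (name, payload) items into numbered zip batches, skipping finished ones.

    Hypothetical helper: file naming and layout are illustrative assumptions.
    """
    written = []
    for start in range(0, len(items), batch_size):
        index = start // batch_size + 1
        path = os.path.join(out_dir, f"batch_{index}.zip")
        if os.path.exists(path):
            # A previous run already completed this batch; do not redo it.
            continue
        partial = path + ".partial"
        with zipfile.ZipFile(partial, "w") as zf:
            for name, payload in items[start:start + batch_size]:
                zf.writestr(name, payload)
        # Rename only after the zip is fully written, so an interrupted
        # run never leaves a half-written file under the final name.
        os.replace(partial, path)
        written.append(path)
    return written


# Placeholder items standing in for sketches.
items = [(f"sig_{i}.txt", f"placeholder {i}") for i in range(5)]
with tempfile.TemporaryDirectory() as out_dir:
    first = write_batches(items, out_dir, batch_size=2)
    second = write_batches(items, out_dir, batch_size=2)  # simulated restart
```

In this sketch the first call writes three batches and the simulated restart writes none, because every batch file already exists.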
Under the hood, this release also introduces a standardized sketch-building framework that may be useful outside of this plugin.
What's Changed
- MRG: report checksum file download failures by @bluegenes in #92
- MRG: add generic support for signature building by @bluegenes in #101
- MRG: improve restart by optionally writing batched zipfiles by @bluegenes in #102
- MRG: fix ci by moving install from mambaforge --> miniforge by @bluegenes in #106
- bump to v0.4.0 by @bluegenes in #109
Dependabot
sourmash-core:
- Bump sourmash from 0.14.0 to 0.14.1 by @dependabot in #62
- Bump sourmash from 0.14.1 to 0.15.0 by @dependabot in #75
- Bump sourmash from 0.15.0 to 0.15.1 by @dependabot in #87
- Bump sourmash from 0.15.1 to 0.15.2 by @dependabot in #103
simple-error:
- Bump simple-error from 0.3.0 to 0.3.1 by @dependabot in #59
reqwest:
- Bump reqwest from 0.12.4 to 0.12.5 by @dependabot in #60
- Bump reqwest from 0.12.5 to 0.12.7 by @dependabot in #88
lazy_static:
- Bump lazy_static from 1.4.0 to 1.5.0 by @dependabot in #61
pyo3:
- Bump pyo3 from 0.21.2 to 0.22.0 by @dependabot in #64
- Bump pyo3 from 0.22.0 to 0.22.1 by @dependabot in #66
- Bump pyo3 from 0.22.1 to 0.22.2 by @dependabot in #73
- Bump pyo3 from 0.22.2 to 0.22.3 by @dependabot in #99
serde_json:
- Bump serde_json from 1.0.117 to 1.0.119 by @dependabot in #63
- Bump serde_json from 1.0.119 to 1.0.120 by @dependabot in #67
serde:
- Bump serde from 1.0.203 to 1.0.204 by @dependabot in #65
tokio:
- Bump tokio from 1.38.0 to 1.38.1 by @dependabot in #74
- Bump tokio from 1.38.1 to 1.40.0 by @dependabot in #91
pytest:
- Update pytest requirement from <8.3.0,>=6.2.4 to >=6.2.4,<8.4.0 by @dependabot in #71
openssl:
- Bump openssl from 0.10.64 to 0.10.66 by @dependabot in #72
regex:
- Bump regex from 1.10.5 to 1.10.6 by @dependabot in #80
- Bump regex from 1.10.6 to 1.11.0 by @dependabot in #104
anyhow:
- Bump anyhow from 1.0.86 to 1.0.89 by @dependabot in #100
Full Changelog: v0.3.2...v0.4.0
v0.3.2
What's Changed
- MRG: update to sourmash-rs core r0.14.0 by @ctb in #52
- MRG: set zip permissions to 644 by @bluegenes in #53
- MRG: enable dayhoff, hp sketching by @bluegenes in #55
- bump version to 0.3.2 by @bluegenes in #54
Dependabot
- Bump tokio from 1.37.0 to 1.38.0 by @dependabot in #46
- Bump serde from 1.0.202 to 1.0.203 by @dependabot in #45
- Bump regex from 1.10.4 to 1.10.5 by @dependabot in #51
Full Changelog: v0.3.1...v0.3.2
v0.3.1
- fixes URL formatting bug in failure output
- adds new urlsketch command
- changes failure output format for both gbsketch and urlsketch. The new header is: accession,name,moltype,md5sum,download_filename,url, which matches the urlsketch input format.
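Because the failure output shares its header with the urlsketch input, a failure file can be checked before it is reused. The following Python is a minimal sketch of such a check; the helper `read_failures` is hypothetical, and the example row (accession, name, the moltype value DNA, checksum, filename, and URL) is a made-up placeholder rather than real plugin output.

```python
import csv
import io

# Header quoted from the release notes above.
URLSKETCH_HEADER = ["accession", "name", "moltype", "md5sum", "download_filename", "url"]


def read_failures(handle):
    """Parse a failure CSV, raising if its header differs from the expected one."""
    reader = csv.DictReader(handle)
    if reader.fieldnames != URLSKETCH_HEADER:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return list(reader)


# Placeholder content; every value in the data row is invented for illustration.
example = io.StringIO(
    "accession,name,moltype,md5sum,download_filename,url\n"
    "ACC0001,placeholder name,DNA,0123456789abcdef0123456789abcdef,"
    "ACC0001_genomic.fna.gz,https://example.org/ACC0001_genomic.fna.gz\n"
)
failures = read_failures(example)
print(failures[0]["accession"], failures[0]["url"])
```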
What's Changed
- fix url printing by @bluegenes in #36
- add urlsketch command by @bluegenes in #34
Dependabot and version updates
- Bump anyhow from 1.0.83 to 1.0.86 by @dependabot in #39
- Bump serde from 1.0.201 to 1.0.202 by @dependabot in #38
- Bump camino from 1.1.6 to 1.1.7 by @dependabot in #37
- bump version to 0.3.1 by @bluegenes in #43
Full Changelog: v0.3.0...v0.3.1
v0.3.0
This release fixes a bug where the wrong version may be downloaded (#27).
The input format has changed slightly! Required columns are now: accession,name,ftp_path. The ftp_path column name must be present, but the column can be empty.
- if ftp_path is provided, it is used as the path for finding files associated with the accession. Otherwise, gbsketch will build the ftp_path from the accession.
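As a minimal sketch of the input format described above, the following Python writes a CSV with the three required columns, leaving ftp_path empty where none is known. The helper `write_gbsketch_input` is hypothetical and not part of the plugin, and the accessions, names, and path are made-up placeholders.

```python
import csv
import io

# Required columns as listed in the release notes above.
REQUIRED_COLUMNS = ["accession", "name", "ftp_path"]


def write_gbsketch_input(rows, handle):
    """Write dict rows as CSV with the required header; ftp_path may be empty."""
    writer = csv.DictWriter(handle, fieldnames=REQUIRED_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # The ftp_path column is always written, even when its value is empty.
        writer.writerow({col: row.get(col, "") for col in REQUIRED_COLUMNS})


# Placeholder rows; none of these values refer to real records.
rows = [
    {"accession": "GCA_000000000.1", "name": "placeholder genome A"},
    {
        "accession": "GCA_000000001.1",
        "name": "placeholder genome B",
        "ftp_path": "https://example.org/placeholder/path",
    },
]
buffer = io.StringIO()
write_gbsketch_input(rows, buffer)
print(buffer.getvalue())
```

The first row ends with a trailing comma, which is how an empty ftp_path appears in the file.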
What's Changed
- optionally use ftp_path input for gbsketch by @bluegenes in #29
- prevent unnecessary downloads by also setting genomes-only/proteomes-only via params if not keeping fastas by @bluegenes in #30
- do not require signature output file if not sketching by @bluegenes in #31
Full Changelog: v0.2.3...v0.3.0
v0.2.3
What's Changed
- fix ci by @bluegenes in #6
- revert channel sizes by @bluegenes in #23
- bump version to 0.2.3 by @bluegenes in #24
Full Changelog: v0.2.2...v0.2.3
v0.2.2
Bugfix Release
- fix a bug where md5sum file error caused directsketch to hang
What's Changed
- fix error handling by @bluegenes in #19
- Bump serde from 1.0.200 to 1.0.201 by @dependabot in #12
- Bump anyhow from 1.0.82 to 1.0.83 by @dependabot in #11
- Bump serde_json from 1.0.116 to 1.0.117 by @dependabot in #10
New Contributors
- @dependabot made their first contribution in #12
Full Changelog: v0.2.1...v0.2.2
v0.2.1
What's Changed
- changed progress reporting back from 5% --> 1%; adjusted to reflect start times better
- remove interval delay by @bluegenes in #16
Full Changelog: v0.2.0...v0.2.1
v0.2.0
Major changes:
- #8 - actually use tokio threading, fully asynchronous file downloading + writing
- #9 - download md5sums and check them prior to sketching
- #14 - make sure we return an error if the md5sum can't be downloaded (rather than just continuing)
- #15 - safer tokio thread/runtime setting while still allowing pytest to run multiple iterations at once
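The md5sum check in #9 and #14 amounts to comparing a downloaded file's digest against an expected value before sketching, and failing loudly when the check cannot be made. The Python below is only an illustration of that idea using the standard library; the plugin itself is written in Rust, and the function names here are hypothetical.

```python
import hashlib


def md5_hexdigest(data: bytes) -> str:
    """Return the hex MD5 digest of the given bytes."""
    return hashlib.md5(data).hexdigest()


def verify_md5(data: bytes, expected: str) -> bool:
    """Compare the data's digest with an expected hex string (case-insensitive)."""
    return md5_hexdigest(data) == expected.strip().lower()


def check_before_sketching(data: bytes, expected):
    """Raise instead of continuing when the checksum is missing or mismatched."""
    if expected is None:
        raise RuntimeError("md5sum could not be obtained; refusing to continue")
    if not verify_md5(data, expected):
        raise ValueError("md5sum mismatch")
    return data


payload = b"hello"  # stand-in for downloaded file contents
checked = check_before_sketching(payload, md5_hexdigest(payload))
```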
Benchmarking shows this structure is much faster
software/version | command | acc details | time | max RAM |
---|---|---|---|---|
v0.1.0 | gbsketch | 9 fungal | 6min | 156 MB |
main (v0.2.0) | gbsketch | 9 fungal | 10s | 156 MB |
v0.1.0 | gbsketch | 49 fungal | 58min | 1.5 GB |
main (v0.2.0) | gbsketch | 49 fungal | 1min 26s | 1.6 GB |
main (v0.2.0) | gbsketch | 243 fungal | 4min | 1.16 GB |
What's Changed
- check md5sums by @bluegenes in #9
- WIP: fully async with tokio threading by @bluegenes in #8
- return error if downloading md5sums fails by @bluegenes in #14
- safer tokio thread setting by @bluegenes in #15
Full Changelog: v0.1.0...v0.2.0