Pre‐trained data

raquellewei edited this page Nov 16, 2023 · 20 revisions

Databases by Taxonomy

GTDB-RS214 databases

These databases were pre-trained on release 214 of the bacterial and archaeal data from GTDB, which spans 402,709 genomes organized into 85,205 species clusters. For more information on the raw data, refer to the GTDB release statistics.

| K-mer size | ANI | Zip file |
| --- | --- | --- |
| 21 | 0.8 | Download |
| 21 | 0.95 | Download |
| 21 | 0.995 | Download |
| 21 | 0.9995 | Download |
| 31 | 0.8 | Download |
| 31 | 0.95 | Download |
| 31 | 0.995 | Download |
| 31 | 0.9995 | Download |
| 51 | 0.8 | Download |
| 51 | 0.95 | Download |
| 51 | 0.995 | Download |
| 51 | 0.9995 | [Download](https://zenodo.org/records/10113572/files/gtdb-rs214-reps.k51_0.9995_pretrained.zip?download=1) |
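
For scripted setups, the download step can be sketched in Python. This is a minimal sketch, not part of the tool itself: the helper names `pretrained_zip_url` and `fetch_and_extract` are hypothetical, and the file-name pattern is inferred from the one link shown above (k=51, ANI=0.9995). Whether other k-mer/ANI combinations follow the same pattern on the same Zenodo record is an assumption, so check each URL before relying on it.

```python
import urllib.request
import zipfile
from pathlib import Path

# Base URL taken from the k=51, ANI=0.9995 download link on this page.
ZENODO_FILES = "https://zenodo.org/records/10113572/files"


def pretrained_zip_url(kmer_size: int, ani: str) -> str:
    """Build a download URL for a GTDB-RS214 pre-trained zip file.

    ASSUMPTION: the file-name pattern is inferred from a single link;
    other k-mer/ANI combinations may be named or hosted differently.
    `ani` is passed as a string (e.g. "0.9995") to avoid float formatting.
    """
    filename = f"gtdb-rs214-reps.k{kmer_size}_{ani}_pretrained.zip"
    return f"{ZENODO_FILES}/{filename}?download=1"


def fetch_and_extract(url: str, dest_dir: str = "pretrained") -> Path:
    """Download the zip at `url` into `dest_dir` and unpack it there."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Derive the local file name from the URL, dropping the query string.
    zip_path = dest / url.rsplit("/", 1)[-1].split("?", 1)[0]
    urllib.request.urlretrieve(url, zip_path)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest


if __name__ == "__main__":
    url = pretrained_zip_url(51, "0.9995")
    print(url)
    # Uncomment to actually download and unpack (the files can be large):
    # fetch_and_extract(url)
```

Passing the ANI threshold as a string keeps the generated file name identical to the one in the link, with no risk of a float being rendered differently.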

Archaea

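| K-mer size | ANI | Zip file |
| --- | --- | --- |
| 21 | 0.8 | Download |
| 21 | 0.95 | Download |
| 21 | 0.995 | Download |
| 21 | 0.9995 | Download |
| 31 | 0.8 | Download |
| 31 | 0.95 | Download |
| 31 | 0.995 | Download |
| 31 | 0.9995 | Download |
| 51 | 0.8 | Download |
| 51 | 0.95 | Download |
| 51 | 0.995 | Download |
| 51 | 0.9995 | Download |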

Fungi

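| K-mer size | ANI | Zip file |
| --- | --- | --- |
| 21 | 0.8 | Download |
| 21 | 0.95 | Download |
| 21 | 0.995 | Download |
| 21 | 0.9995 | Download |
| 31 | 0.8 | Download |
| 31 | 0.95 | Download |
| 31 | 0.995 | Download |
| 31 | 0.9995 | Download |
| 51 | 0.8 | Download |
| 51 | 0.95 | Download |
| 51 | 0.995 | Download |
| 51 | 0.9995 | Download |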

Protozoa

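| K-mer size | ANI | Zip file |
| --- | --- | --- |
| 21 | 0.8 | Download |
| 21 | 0.95 | Download |
| 21 | 0.995 | Download |
| 21 | 0.9995 | Download |
| 31 | 0.8 | Download |
| 31 | 0.95 | Download |
| 31 | 0.995 | Download |
| 31 | 0.9995 | Download |
| 51 | 0.8 | Download |
| 51 | 0.95 | Download |
| 51 | 0.995 | Download |
| 51 | 0.9995 | Download |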