Freely available pre-built hash files here
Read the manual. It's awesome!
Change to the directory where the database will be installed.
DBNAME=$(pwd)
Install the latest and greatest taxonomy database.
kraken2-build --download-taxonomy --db $DBNAME
Favorite reference libraries to download:
- archaea
- bacteria
- plasmid
- viral
- human
- fungi
- plant
- protozoa
- UniVec_Core
For example:
kraken2-build --download-library archaea --db $DBNAME
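Several of the libraries listed above can be fetched in one pass. The following is a minimal sketch, not part of Kraken 2 itself: `download_libraries` and the `DRY_RUN` switch are hypothetical names made up for this example. By default it only prints the commands it would run; set `DRY_RUN=0` to actually download.

```shell
# Sketch: download several reference libraries in one pass.
# DRY_RUN and download_libraries are illustrative names, not kraken2-build features.
DBNAME="${DBNAME:-$(pwd)}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = run them

download_libraries() {
    for lib in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "kraken2-build --download-library $lib --db $DBNAME"
        else
            # Stop at the first failed download.
            kraken2-build --download-library "$lib" --db "$DBNAME" || return 1
        fi
    done
}

download_libraries archaea bacteria viral
```

Pass whichever library names you want as arguments; the three shown are just a subset of the list above.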
Make a directory to gather genomes
mkdir ${DBNAME}/my_custom_genomes
cd ${DBNAME}/my_custom_genomes
RefSeq genomes can be searched here. Search "All Databases" in the top search bar. For example, search "bos taurus". NCBI will return the search results and highlight a RefSeq genome assembly. Select the "Download" button. Download the "Genomic FASTA" from the "RefSeq" database.
Including these in the build has worked well:
- Bos taurus (cow)
- Sus scrofa (pig)
- Equus caballus (horse)
- Gallus gallus (chicken)
- Canis lupus familiaris (dog)
- Felis catus (house cat)
- Odocoileus virginianus (white-tailed deer)
- Panthera tigris altaica (Amur tiger)
- Loxodonta africana (elephant)
- Tursiops truncatus (bottlenose dolphin)
- Anas platyrhynchos (mallard)
- Nanorana parkeri (frog)
- Cyprinus carpio (common carp)
- Aedes albopictus (mosquito)
- Dermacentor silvarum (tick)
Once the genomes are downloaded, transfer the unzipped genomes to ${DBNAME}/my_custom_genomes as .fna files.
From within ${DBNAME}/my_custom_genomes:
for file in *.fna; do kraken2-build --add-to-library "$file" --db "$DBNAME"; done
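If the directory has no .fna files, the shell passes the unmatched pattern to kraken2-build as a literal file name. The sketch below guards against that. `add_genomes` and `DRY_RUN` are hypothetical names used only for this example; by default the commands are printed rather than executed, and `DRY_RUN=0` runs them.

```shell
# Sketch: add every .fna file in a directory to the library, skipping an empty match.
# add_genomes and DRY_RUN are illustrative names, not kraken2-build features.
DBNAME="${DBNAME:-$(pwd)}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = run them

add_genomes() {
    genome_dir="$1"
    for file in "$genome_dir"/*.fna; do
        # When nothing matches, the pattern stays literal; skip it.
        [ -e "$file" ] || continue
        if [ "$DRY_RUN" = "1" ]; then
            echo "kraken2-build --add-to-library $file --db $DBNAME"
        else
            kraken2-build --add-to-library "$file" --db "$DBNAME" || return 1
        fi
    done
}

add_genomes "${DBNAME}/my_custom_genomes"
```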
Build using the threads available on your machine (24 here):
kraken2-build --build --threads 24 --db $DBNAME
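Rather than hard-coding 24, the thread count can be taken from the machine. This is a sketch that assumes `getconf _NPROCESSORS_ONLN` is supported on your system (it is on common Linux and macOS installs) and falls back to 1 otherwise. It only prints the build command; remove the `echo` to run it.

```shell
# Sketch: pick the thread count from the number of online CPUs.
DBNAME="${DBNAME:-$(pwd)}"

# Falls back to 1 when getconf cannot report the CPU count.
THREADS=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Printed only; drop the echo to start the build.
echo kraken2-build --build --threads "$THREADS" --db "$DBNAME"
```

On a shared machine you may want to use fewer threads than the total reported.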
Once built:
echo $DBNAME
This is your new database path for Kraken 2.
kraken2 --db ${DBNAME} --threads ${CPU_NUMBER} --paired ${read1} ${read2} --output ${sample_name}-outputkraken.txt --report ${sample_name}-reportkraken.txt
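The variables in the command above can be filled in automatically when there are many samples. The sketch below assumes paired files are named `<sample>_R1.fastq` and `<sample>_R2.fastq`, which is an assumption about your naming and not something Kraken 2 requires. `classify_pairs` and `DRY_RUN` are hypothetical names; commands are printed by default and `DRY_RUN=0` runs them.

```shell
# Sketch: run the kraken2 command for every read pair in a directory.
# Assumes <sample>_R1.fastq / <sample>_R2.fastq naming (an assumption, adjust as needed).
DBNAME="${DBNAME:-$(pwd)}"
CPU_NUMBER="${CPU_NUMBER:-4}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = run them

classify_pairs() {
    reads_dir="$1"
    for read1 in "$reads_dir"/*_R1.fastq; do
        [ -e "$read1" ] || continue
        read2="${read1%_R1.fastq}_R2.fastq"
        if [ ! -e "$read2" ]; then
            echo "skipping $read1: no matching R2 file" >&2
            continue
        fi
        sample_name=$(basename "$read1" _R1.fastq)
        if [ "$DRY_RUN" = "1" ]; then
            echo "kraken2 --db $DBNAME --threads $CPU_NUMBER --paired $read1 $read2 --output ${sample_name}-outputkraken.txt --report ${sample_name}-reportkraken.txt"
        else
            kraken2 --db "$DBNAME" --threads "$CPU_NUMBER" --paired "$read1" "$read2" \
                --output "${sample_name}-outputkraken.txt" \
                --report "${sample_name}-reportkraken.txt" || return 1
        fi
    done
}

classify_pairs .
```

The output and report names follow the same `${sample_name}-outputkraken.txt` and `${sample_name}-reportkraken.txt` pattern used above, so the later steps work unchanged.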
Finish with a Krona graph:
krona_lca_all.py -f ${sample_name}-outputkraken.txt
ktImportTaxonomy kronaInput.txt
mv taxonomy.krona.html ${sample_name}-taxonomy.krona.html
mv taxonomy.krona.html.files ${sample_name}-taxonomy.krona.html.files
Bracken needs the "library" directory, containing the genomes used to build the Kraken 2 database, to be present at the Kraken 2 database location. The read length of 240 used below assumes 500 base chemistry.
Build:
bracken-build -d ${DBNAME} -t 46 -k 35 -l 240
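Before running the build above, it can help to confirm that the "library" directory is still in the database location, since it is sometimes deleted to save space. A minimal sketch follows; `has_library_dir` is a hypothetical helper written for this example, not a Bracken command.

```shell
# Sketch: check that the "library" directory exists under the database path.
# has_library_dir is an illustrative helper, not part of Bracken or Kraken 2.
DBNAME="${DBNAME:-$(pwd)}"

# Returns 0 when <db>/library is a directory, non-zero otherwise.
has_library_dir() {
    [ -d "$1/library" ]
}

if has_library_dir "$DBNAME"; then
    echo "library directory found in $DBNAME"
else
    echo "no library directory in $DBNAME; bracken-build needs it" >&2
fi
```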
Run:
bracken -d ${DBNAME} -i ${sample_name}-reportkraken.txt -o ${sample_name}-bracken.txt -r 240