Pipeline used to generate data on spire.embl.de

grp-bork/spire

SPIRE workflow

Developed by the Bork Group
Raise an issue or contact us

See our other Software & Services

Description

SPIRE (spire.embl.de) is a planetary-scale microbiome resource featuring over 100,000 processed metagenomes curated from more than 700 studies worldwide. Together, SPIRE and the SPIRE workflow provide a framework for uniform processing and annotation of metagenomic data. The workflow lets users process their own metagenomic data in the same standardized manner, so that their results can be compared and analysed directly within the global context of SPIRE.

Citation

This workflow: DOI

Cite the SPIRE publication when using our workflow:

Schmidt TSB, Fullam A, Ferretti P, et al. SPIRE: a Searchable, Planetary-scale mIcrobiome REsource. Nucleic Acids Res. 2024;52(D1):D777-D783. doi:10.1093/nar/gkad943

Overview


Preprocessing:

  1. Trimming (ngless)
  2. Length filtering (ngless)
  3. Human DNA decontamination (GRCh38)
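To illustrate what the trimming and length-filtering steps do conceptually, here is a minimal Python sketch. It is not the workflow's ngless code: the function names and the thresholds (`min_quality=25`, `min_length=45`) are assumptions chosen for illustration. It keeps the longest stretch of a read in which every base meets the quality threshold, then discards the read if what remains is too short.

```python
from typing import Optional, Tuple


def longest_good_substring(quals: list[int], min_quality: int) -> Tuple[int, int]:
    """Return (start, end) of the longest run where every quality >= min_quality."""
    best_start, best_end = 0, 0
    run_start = 0
    for i, q in enumerate(quals):
        if q < min_quality:
            # A low-quality base ends the current run; the next run starts after it.
            run_start = i + 1
        elif i + 1 - run_start > best_end - best_start:
            best_start, best_end = run_start, i + 1
    return best_start, best_end


def trim_and_filter(
    seq: str,
    quals: list[int],
    min_quality: int = 25,  # hypothetical threshold, not taken from the workflow
    min_length: int = 45,   # hypothetical threshold, not taken from the workflow
) -> Optional[Tuple[str, list[int]]]:
    """Trim a read to its best-quality stretch; return None if it ends up too short."""
    start, end = longest_good_substring(quals, min_quality)
    if end - start < min_length:
        return None
    return seq[start:end], quals[start:end]
```

For example, `trim_and_filter("ACGTACGTAC", [10, 30, 30, 30, 5, 30, 30, 30, 30, 30], min_length=4)` keeps only the final five bases, while the same call with `min_length=6` discards the read.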

MAGs:

  1. Assembly (megahit)
  2. Gene Calling (prodigal)
  3. Remove Small Contigs (seqtk)
  4. Index (bwa)
  5. Alignment (bwa, samtools)
  6. Calculate Depths (jgi_summarize_bam_contig_depths)
  7. Binning (metabat2)
  8. Per-Bin Gene Calling (seqtk)
  9. Assembly Stats (assembly-stats)
  10. Assembly Mash Sketching (mash)
  11. Bin Mash Sketching (mash)
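Two of the steps above, removing small contigs and computing assembly statistics, can be sketched in a few lines of Python. This is only an illustration: the workflow itself uses seqtk and assembly-stats, the helper names are hypothetical, and the length cutoff is a parameter here because this README does not state the value the workflow uses. N50 is one of the summary values commonly reported for an assembly.

```python
def filter_contigs(contigs: dict[str, str], min_length: int) -> dict[str, str]:
    """Keep only contigs whose sequence is at least min_length bases long."""
    return {name: seq for name, seq in contigs.items() if len(seq) >= min_length}


def n50(lengths: list[int]) -> int:
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


if __name__ == "__main__":
    contigs = {"c1": "A" * 1200, "c2": "C" * 300, "c3": "G" * 2500}
    kept = filter_contigs(contigs, min_length=1000)  # cutoff chosen for illustration
    print(sorted(kept), n50([len(s) for s in kept.values()]))
```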

Annotation:

  1. rRNA Detection (barrnap)
  2. ARG Annotation (abricate, rgi)
  3. Virulence Factor Annotation (abricate)
  4. sORFs Detection (macrel)
  5. Genome Quality Assessment (checkm2, gunc)
  6. Functional Annotation (eggnog-mapper)
  7. Taxonomic Classification (gtdbtk)
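Genome quality assessment yields completeness and contamination estimates for each bin, and these are often summarised as quality tiers. The sketch below assumes the commonly used MIMAG-style cutoffs (high: over 90% complete and under 5% contaminated; medium: at least 50% complete and under 10% contaminated) and ignores the rRNA and tRNA criteria of the full standard. The function name and the choice of cutoffs are assumptions; this README does not say how the workflow itself tiers or filters bins.

```python
def quality_tier(completeness: float, contamination: float) -> str:
    """Classify a bin from completeness and contamination percentages (0-100).

    Cutoffs follow the MIMAG-style convention and are an assumption here,
    not a setting taken from the SPIRE workflow.
    """
    if completeness > 90.0 and contamination < 5.0:
        return "high"
    if completeness >= 50.0 and contamination < 10.0:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical bins with made-up completeness/contamination values.
    for name, comp, cont in [("bin.1", 97.3, 1.2), ("bin.2", 62.0, 8.5), ("bin.3", 41.0, 2.0)]:
        print(name, quality_tier(comp, cont))
```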