a snakemake workflow for reproducible metabarcoding


Tapirs


Tapirs is a reproducible modular workflow for the analysis of DNA metabarcoding data.

Tapirs uses the Snakemake workflow manager and is compartmentalised into several modules, each performing one step of the workflow. Tapirs is designed to be experimental, allowing you to test the effect of different approaches to data analysis. Tapirs is currently at v1.0. In our hands it is simple, robust, and reliable, but not all features are present yet.

Tapirs was created by the EvoHull group at the University of Hull, UK.

Detailed instructions for installation, setup, and modification are in the Tapirs documentation.

Quickstart

  1. Install conda (miniconda)
  2. Install git
  3. Clone the Tapirs repository and change into its directory
    • git clone https://github.com/EvoHull/Tapirs
    • cd Tapirs
  4. Create an environment with snakemake and other software for the workflow
    • conda env create -f workflow/envs/env.yaml
    • conda activate tapirs
  5. Download the NCBI taxonomy
    • wget ftp://ftp.ncbi.nih.gov/pub/taxonomy/new_taxdump/new_taxdump.zip
    • unzip new_taxdump.zip -d resources/databases/new_taxdump
    • rm new_taxdump.zip
  6. Populate resources/databases with your reference databases and resources/libraries with your data (a directory containing your demultiplexed R1/R2.fastq.gz sample files)
  7. Place your sample sheet tsv in config/ (see config/Hull_test.tsv for layout format)
  8. Adjust config/config.yaml to configure the Tapirs workflow (see below)
  9. Do a dry run with snakemake -npr to identify any issues
  10. Run the workflow with snakemake --cores 4 (snakemake --cores with no number uses all available cores)
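
Before the dry run, it can help to confirm that steps 5 to 8 left the files where the workflow expects them. The following is a minimal sketch, not part of Tapirs: the function name preflight is hypothetical, and the R1/R2 pairing rule (mates differ only by "R1" versus "R2" in the file name) is an assumption that may not match your naming scheme.

```python
from pathlib import Path


def preflight(root="."):
    """Return a list of problems found in the directory layout from the Quickstart."""
    root = Path(root)
    problems = []

    # Step 5: the unzipped NCBI taxonomy dump.
    taxdump = root / "resources" / "databases" / "new_taxdump"
    if not taxdump.is_dir() or not any(taxdump.iterdir()):
        problems.append(f"taxonomy missing or empty: {taxdump}")

    # Step 6: demultiplexed reads, searched recursively under resources/libraries.
    libraries = root / "resources" / "libraries"
    reads = sorted(libraries.rglob("*.fastq.gz")) if libraries.is_dir() else []
    if not reads:
        problems.append(f"no .fastq.gz files under {libraries}")

    # ASSUMPTION: mates differ only by "R1" vs "R2" in the file name.
    for r1 in reads:
        if "R1" in r1.name:
            r2 = r1.with_name(r1.name.replace("R1", "R2"))
            if not r2.exists():
                problems.append(f"no R2 mate for {r1}")

    # Steps 7 and 8: sample sheet and workflow configuration.
    config_dir = root / "config"
    if not (config_dir / "config.yaml").is_file():
        problems.append(f"missing {config_dir / 'config.yaml'}")
    if not config_dir.is_dir() or not list(config_dir.glob("*.tsv")):
        problems.append(f"no sample sheet (.tsv) in {config_dir}")

    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print("PROBLEM:", issue)
    if not issues:
        print("Layout looks complete; try: snakemake -npr")
```

Run it from the top of the cloned repository. It only checks that files exist; it does not validate the reference databases or the contents of the sample sheet, so the snakemake dry run remains the real test.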

Configuring the Tapirs workflow

You should adjust config/config.yaml to specify the location of relevant files (reference databases and sequence data to be analysed) and parameters for the analysis (experiment name, sample sheet name, amplicon/primer lengths, analysis methods etc.). Defaults are present and are set for the test data set: Hull_test.
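
To see which settings the shipped defaults define before editing them, you can list the top-level keys of config/config.yaml. This is a minimal sketch using only the Python standard library: top_level_keys is a hypothetical helper, and the line scan is deliberately naive (it assumes top-level keys start in the first column and are unquoted), so use a real YAML parser for anything beyond a quick look.

```python
from pathlib import Path


def top_level_keys(text):
    """Return the top-level keys of a YAML document using a naive line scan."""
    keys = []
    for line in text.splitlines():
        # Skip blank lines, indented (nested) lines, comments,
        # and lines starting with "-" (list items or "---" markers).
        if not line or line[0] in " \t#-":
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.append(key.strip())
    return keys


if __name__ == "__main__":
    config = Path("config/config.yaml")
    if config.is_file():
        for key in top_level_keys(config.read_text()):
            print(key)
```

The output is just the key names, which gives a quick checklist of what to review when switching from the Hull_test defaults to your own data.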

Consult the Tapirs documentation for more detailed guidance.

Workflow overview

One example workflow is illustrated below; you may configure yours differently.

[Figure: workflow graph]

Authors

EvoHull group, University of Hull, UK

  • Dave Lunt (@davelunt)
  • Graham Sellers (@Graham-Sellers)
  • Michael R Winter (@mrmrwinter)
  • Merideth Freiheit (@merfre)
  • Marco Benucci

