Metabarcoding processing pipeline

by Alexander Keller (LMU Munich)

A simple script to process metabarcoding data (e.g. 16S V4), with amplicons generated according to the protocols of:

16S: Kozich et al. 2013 AEM
ITS2: Sickel et al. 2015 BMC Ecology

If you use this script, please kindly cite this article: https://doi.org/10.1098/rstb.2021.0171

Dependencies

VSEARCH https://github.com/torognes/vsearch
SeqFilter https://github.com/BioInf-Wuerzburg/SeqFilter
(The USEARCH Python scripts are deprecated; a workaround is now integrated. https://drive5.com/python/ )
Also check the _DBs folder for databases.
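
Before a run it can help to confirm that the tools are reachable. The following is a minimal sketch and not part of the pipeline: `check_deps` is a hypothetical helper, and it assumes the binaries are on your PATH, whereas the script itself may point to explicit paths instead.

```shell
# Hypothetical helper (not part of the pipeline): report tools missing from PATH.
# Prints one line per missing tool and returns non-zero if any is missing.
check_deps() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_deps vsearch SeqFilter` prints nothing when both are found. The binary names used here are assumptions; adjust them to match your installation.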

What will the script do?

Un-gzipping files
Individual sample preparation
- Merging forward and reverse reads
- Quality filtering
- Backup option: use forward reads only in case of poor-quality reverse reads
Community level processing
- Dereplication
- Denoising
- ASV generation
- Chimera (de novo) removal
- Taxonomic classification
  - allows for multiple reference databases (iterative) with decreasing priority
  - all unclassified reads are hierarchically classified
- Creation of a community table
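
The steps above can be sketched with standard VSEARCH commands. This is an illustration under stated assumptions, not the script's actual code: the function names, file names, the `DRY_RUN` switch, and the thresholds (`--fastq_maxee 1`, `--id 0.97`) are all hypothetical, and the real script may use different options and ordering. The final hierarchical classification of remaining unclassified reads is not shown.

```shell
# Sketch only: file names, thresholds and step order are assumptions,
# not taken from the pipeline script.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Per-sample preparation: merge read pairs, then quality-filter.
prepare_sample() {
    local fwd="$1" rev="$2" name="$3"
    run vsearch --fastq_mergepairs "$fwd" --reverse "$rev" --fastqout "$name.merged.fq"
    run vsearch --fastq_filter "$name.merged.fq" --fastq_maxee 1 --fastaout "$name.filtered.fa"
}

# Community-level processing on the pooled, filtered reads.
process_community() {
    local pooled="$1"
    run vsearch --derep_fulllength "$pooled" --sizeout --output derep.fa
    run vsearch --cluster_unoise derep.fa --sizein --sizeout --centroids denoised.fa
    run vsearch --uchime3_denovo denoised.fa --nonchimeras asvs.fa
    run vsearch --usearch_global "$pooled" --db asvs.fa --id 0.97 --otutabout asv_table.txt
}

# Iterative classification: query each database in order of priority and
# pass only the still-unmatched sequences on to the next database.
classify_iteratively() {
    local query="$1"
    shift
    local i=0
    local db
    for db in "$@"; do
        i=$((i + 1))
        run vsearch --usearch_global "$query" --db "$db" --id 0.97 \
            --blast6out "hits.$i.tsv" --notmatched "unclassified.$i.fa"
        query="unclassified.$i.fa"
    done
}
```

Setting `DRY_RUN=1` before calling, for instance, `classify_iteratively asvs.fa db1.fa db2.fa` prints the commands without running them, which makes the priority order visible: the second database only sees what the first one left unmatched.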

Usage:

Put all your raw sequencing files (.fastq or .fastq.gz) into a subfolder of the directory that contains this script (do not use full paths).
Copy a config.txt from the resources folder, adapt it to your needs, and copy it into your data folder. Consider checking the paths to the binaries in the script file.
You also need to add a config.txt file, in which information about the databases is stored. An example is in the example directory.
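
The setup can be sketched as follows. `prepare_run_folder` is a hypothetical helper, not part of the pipeline, and all paths in the usage example are placeholders. It only copies files; adapting config.txt still has to be done by hand.

```shell
# Hypothetical helper: copy raw reads and a config template into a run folder.
# Arguments: <raw data directory> <run folder> <config template>
prepare_run_folder() {
    local raw_dir="$1" folder="$2" config_template="$3"
    local f
    mkdir -p "$folder"
    # Copy both uncompressed and gzipped FASTQ files, skipping unmatched patterns.
    for f in "$raw_dir"/*.fastq "$raw_dir"/*.fastq.gz; do
        if [ -e "$f" ]; then
            cp "$f" "$folder"/
        fi
    done
    cp "$config_template" "$folder"/config.txt
}
```

A call might look like `prepare_run_folder /path/to/raw my_run _resources/config.txt`, run from the directory that contains the script, so that `my_run` ends up as a subfolder next to it. The template location is an assumption; use wherever your copy of config.txt lives.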

Then you are ready to run:

bash _processing_MB_0.2a.sh <FOLDER>

Results will be in a new subfolder of your current directory called <FOLDER>.<DATE>
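
Since each run creates its own dated folder, repeated runs can leave several result folders side by side. A minimal sketch for picking the newest one follows; `latest_results` is a hypothetical helper, and it relies on modification time rather than on the format of <DATE>, which is not assumed here.

```shell
# Hypothetical helper: print the most recently modified results folder for a run.
# Argument: the <FOLDER> name that was passed to the processing script.
latest_results() {
    local folder="$1"
    # -d lists the directories themselves, -t sorts by modification time (newest first).
    ls -dt "$folder".*/ 2>/dev/null | head -n 1
}
```

For example, `cd "$(latest_results my_run)"` changes into the newest results folder of a run called `my_run`. The function prints nothing if no such folder exists.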

If the analysis needs to be reverted, run the following command, which removes files and restores the folder structure to its original state:

bash _revert_analysis_1.sh <FOLDER>

Import into R

In the <FOLDER>.<DATE> folder, there will be an R script for data import and basic ecological analyses.
