nHUMAnN workflow

Developed by the Bork Group. Contributors: Christian Schudoma, Daniel Podlesny.
The development of this workflow was supported by NFDI4Microbiota

Description

The nHUMAnN workflow is a Nextflow workflow for running HUMAnN3 based on MetaPhlAn4 profiles via joint index generation. The workflow includes optional read preprocessing and host/human decontamination steps provided by the nevermore workflow library.

Due to compatibility issues between current CHOCOPhlAn databases and recent versions of HUMAnN3, nHUMAnN makes use of a patched HUMAnN3 version obtainable as a Docker container.

Citation

This workflow:

Also cite:

Beghini F, McIver LJ, Blanco-Míguez A, et al. Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife. 2021;10:e65088. Published 2021 May 4. doi:10.7554/eLife.65088

Overview

Requirements

The easiest way to handle dependencies is via Singularity/Docker containers. Alternatively, conda environments, software module systems or native installations can be used.

Preprocessing

Preprocessing and QA are done with bbmap, fastqc, and multiqc.

Decontamination/Host removal

Decontamination is done with kraken2 and additionally requires seqtk.

Kraken2 database

Host removal requires a kraken2 host database.

MetaPhlAn Profiling

The default supported MetaPhlAn version is 4.

CHOCOPhlAn database for MetaPhlAn4

Get the mpa_vOct22_CHOCOPhlAnSGB_202212 database from here, unpack the tarball, and point the --mp4_db parameter to the database's root directory.

In params.yml:

mp4_db: "/path/to/mpa_vOct22_CHOCOPhlAnSGB_202212/"

On the command line:

--mp4_db "/path/to/mpa_vOct22_CHOCOPhlAnSGB_202212/"

HUMAnN Profiling

The default supported HUMAnN version is 3.

HUMAnN databases

Get the annotated CHOCOPhlAn database from here and the annotated UniRef database from here, unpack the tarballs, and set the respective parameters.

In params.yml:

humann_nuc_db: "/path/to/full_chocophlan_db/"
humann_prot_db: "/path/to/uniref90_annotated_v201901b_full/"

On the command line:

--humann_nuc_db "/path/to/full_chocophlan_db/"
--humann_prot_db "/path/to/uniref90_annotated_v201901b_full/"
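Before launching a run, it can help to confirm that the database parameters point at directories that actually exist. The following is a minimal sketch and not part of the workflow: the helper name missing_databases is hypothetical, and it assumes only the three parameter names used above (mp4_db, humann_nuc_db, humann_prot_db). It does not check the contents of the directories.

```python
from pathlib import Path

# Parameter names as used in params.yml above; the helper itself is hypothetical.
DB_PARAMS = ("mp4_db", "humann_nuc_db", "humann_prot_db")


def missing_databases(params):
    """Return the database parameters that are unset or do not point to a directory."""
    missing = []
    for key in DB_PARAMS:
        value = params.get(key)
        if not value or not Path(value).is_dir():
            missing.append(key)
    return missing
```

An empty list means all three paths resolved to directories, so a call such as missing_databases(params) could be used as a quick sanity check on the values loaded from params.yml.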

Usage

Cloud-based Workflow Manager (CloWM)

This workflow will be available on the CloWM platform (coming soon).

Command-Line Interface (CLI)

The workflow run is controlled by environment-specific parameters (see run.config) and study-specific parameters (see params.yml). The parameters in the params.yml can be specified on the command line as well.

You can either clone this repository from GitHub and run it as follows:

git clone https://github.com/grp-bork/nHUMAnN.git
nextflow run /path/to/nHUMAnN [-resume] -c /path/to/run.config -params-file /path/to/params.yml

Or you can have Nextflow pull it from GitHub and run it from the $HOME/.nextflow directory:

nextflow run grp-bork/nHUMAnN [-resume] -c /path/to/run.config -params-file /path/to/params.yml
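Since study-specific parameters can be given either in params.yml or on the command line, the two forms can be translated into each other. The sketch below builds the command-line form from a plain mapping, relying on Nextflow's convention of passing pipeline parameters as double-dash options. The function name build_command and the example paths are hypothetical; only the options shown in the commands above (-c, -resume) and the documented parameter names are used.

```python
import shlex


def build_command(pipeline, run_config, params, resume=False):
    """Assemble a `nextflow run` invocation with parameters passed as --key value."""
    cmd = ["nextflow", "run", pipeline, "-c", run_config]
    if resume:
        cmd.append("-resume")
    for key, value in params.items():
        # Pipeline parameters use a double dash; Nextflow's own options use a single dash.
        cmd += [f"--{key}", str(value)]
    return cmd


if __name__ == "__main__":
    example = build_command(
        "/path/to/nHUMAnN",
        "/path/to/run.config",
        {"mp4_db": "/path/to/mpa_vOct22_CHOCOPhlAnSGB_202212/"},
        resume=True,
    )
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(example))
```

Returning a list rather than a single string keeps the command safe to hand to subprocess.run without shell quoting problems.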

Input files

FASTQ files are supported and can be either uncompressed (but shouldn't be!) or compressed with gzip or bzip2. Sample data must be arranged in one directory per sample.
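To check that an input directory follows the one-directory-per-sample layout, a short script can list what each sample folder contains. This is a rough sketch under stated assumptions: the function name discover_samples is hypothetical, and the list of accepted file extensions is a guess at common FASTQ naming, not the list the workflow itself uses.

```python
from pathlib import Path

# Assumed extensions for plain, gzip- and bzip2-compressed FASTQ files.
FASTQ_SUFFIXES = (
    ".fastq", ".fq",
    ".fastq.gz", ".fq.gz",
    ".fastq.bz2", ".fq.bz2",
)


def discover_samples(input_dir):
    """Map each sample directory name to the sorted FASTQ file names inside it."""
    samples = {}
    for sample_dir in sorted(Path(input_dir).iterdir()):
        if not sample_dir.is_dir():
            continue  # loose files at the top level belong to no sample
        files = sorted(
            f.name
            for f in sample_dir.iterdir()
            if f.is_file() and f.name.lower().endswith(FASTQ_SUFFIXES)
        )
        if files:
            samples[sample_dir.name] = files
    return samples
```

Directories without any recognised FASTQ file are skipped, so an unexpectedly short result is a hint that files are misplaced or named with an extension outside the assumed list.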

Per-sample input directories

All files in a sample directory are associated with the name of the sample folder. Paired-end mate files need to have matching prefixes. Mates 1 and 2 can be specified with the suffixes _[12], _R[12], .[12], or .R[12]. Lane IDs or other read ID modifiers have to precede the mate identifier. Files whose names contain none of these patterns are treated as single-end. Samples consisting of both single-end and paired-end files are assumed to be paired-end, with all single-end files treated as orphans (quality control survivors).
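The naming rules above can be illustrated with a small classifier. This is only a sketch of the described convention, not the workflow's actual implementation: the regular expressions, the assumed file extensions, and the function names classify and group_sample are all hypothetical.

```python
import re

# Assumed FASTQ extensions (optionally gzip- or bzip2-compressed).
_EXTENSION = re.compile(r"\.(fastq|fq)(\.(gz|bz2))?$", re.IGNORECASE)
# Mate marker at the end of the remaining name: _1, _2, _R1, _R2, .1, .2, .R1, .R2
_MATE = re.compile(r"^(?P<prefix>.+)[._]R?(?P<mate>[12])$")


def classify(filename):
    """Return (prefix, mate), where mate is '1', '2', or None for single-end."""
    stem = _EXTENSION.sub("", filename)
    match = _MATE.match(stem)
    if match:
        return match.group("prefix"), match.group("mate")
    return stem, None


def group_sample(filenames):
    """Sort the files of one sample into mate 1, mate 2, and single-end/orphan lists."""
    groups = {"R1": [], "R2": [], "single": []}
    for name in sorted(filenames):
        _, mate = classify(name)
        key = {"1": "R1", "2": "R2"}.get(mate, "single")
        groups[key].append(name)
    return groups
```

For instance, classify("sampleA_L001_R1.fastq.gz") yields the prefix sampleA_L001 and mate 1, because the lane ID precedes the mate identifier. A single-end file whose name happens to end in _1 or .2 would be misread as a mate under this sketch, which is one reason to keep mate identifiers unambiguous.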
