ADD argNorm module to ARG subworkflow #405

Merged: 34 commits, Jul 25, 2024
Changes from 27 commits
Commits
c381f0d
ADD argNorm module to ARG subworkflow
Vedanth-Ramji Jul 15, 2024
8fb9262
Fix path to argNorm output in test.nf
Vedanth-Ramji Jul 15, 2024
6073b62
Move argNorm test to test_preannotated
Vedanth-Ramji Jul 16, 2024
c6df5aa
ENH make sure to not pass empty files into argNorm
Vedanth-Ramji Jul 16, 2024
7097413
RFCT simplify file check for argnorm
Vedanth-Ramji Jul 16, 2024
6f16440
Fix argNorm tests
Vedanth-Ramji Jul 17, 2024
1f0a7ac
ENH remove unnecessary trailing whitespaces
Vedanth-Ramji Jul 17, 2024
3bc1fac
[automated] Fix linting with Prettier
nf-core-bot Jul 18, 2024
431a7cc
ENH only use tsv version of hamronization output, simplify argnorm ou…
Vedanth-Ramji Jul 18, 2024
c79cafd
[automated] Fix linting with Prettier
nf-core-bot Jul 18, 2024
bde8907
ENH allow argNorm to be skipped
Vedanth-Ramji Jul 18, 2024
002ade7
BUG remove arg_skip_argnorm in test
Vedanth-Ramji Jul 18, 2024
a0e9aa1
RFCT make sure every named process is mixed
Vedanth-Ramji Jul 18, 2024
fb3af3f
Improve argnorm description in nextflow_schema.json
Vedanth-Ramji Jul 18, 2024
32b93a6
DOC add argnorm citation
Vedanth-Ramji Jul 20, 2024
d8e5b17
Merge branch 'dev' into dev
Vedanth-Ramji Jul 20, 2024
24cb448
[automated] Fix linting with Prettier
nf-core-bot Jul 20, 2024
4bc6692
remove unnecessary whitespaces
Vedanth-Ramji Jul 20, 2024
8504e01
Update snapshot for test_preannotated
Vedanth-Ramji Jul 21, 2024
2d73a41
Update base.config to mention requirements correctly for argNorm and …
Vedanth-Ramji Jul 23, 2024
f2e4f58
remove unnecessary whitespaces
Vedanth-Ramji Jul 23, 2024
41d51cd
fix spacing and argnorm documentation
Vedanth-Ramji Jul 23, 2024
e671c3f
Fix argNorm citation
Vedanth-Ramji Jul 23, 2024
583875d
Update test snapshot (others incoming later)
jasmezz Jul 23, 2024
f6545ac
Fix full_test assertions [skip ci]
jasmezz Jul 23, 2024
7b6ee27
Full_test snapshot [skip ci]
jasmezz Jul 24, 2024
b262551
Apply suggestions from code review
jfy133 Jul 24, 2024
91381ce
Updating nf-test files + snapshots (non-taxonomy)
jasmezz Jul 24, 2024
3130528
Update taxonomy test snapshots
jasmezz Jul 24, 2024
c0f40b7
Correct ARO test for deeparg in argnorm
Vedanth-Ramji Jul 24, 2024
bf0b735
Update ARO test for deeparg in argnorm
Vedanth-Ramji Jul 24, 2024
f5e5195
Update test_preannotated snapshot
Vedanth-Ramji Jul 24, 2024
6b93a96
Update argnorm deeparg test
Vedanth-Ramji Jul 24, 2024
451cd52
Update citation style, docs
jasmezz Jul 24, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#382](https://github.com/nf-core/funcscan/pull/382) Optimised BGC screening run time and prevent crashes due to too-short contigs by adding contig length filtering for BGC workflow only. (by @jfy133, @darcy220606)
- [#366](https://github.com/nf-core/funcscan/pull/366) Added nf-test on pipeline level. (by @jfy133, @Darcy220606, @jasmezz)
- [#403](https://github.com/nf-core/funcscan/pull/403) Added antiSMASH parameters `--pfam2go`, `--rre`, and `--tfbs`. (reported by @Darcy220606, added by @jasmezz)
- [#405](https://github.com/nf-core/funcscan/pull/405) Added argNorm to ARG subworkflow. (by @Vedanth-Ramji)

### `Fixed`

@@ -44,6 +45,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
| AMPlify | 1.1.0 | 2.0.0 |
| AMRFinderPlus | 3.11.18 | 3.12.8 |
| antiSMASH | 6.1.1 | 7.1.0 |
| argNorm | NA | 0.5.0 |
| bioawk | 1.0 | NA |
| comBGC | 1.6.1 | 1.6.2 |
| DeepARG | 1.0.2 | 1.0.4 |
4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -30,6 +30,10 @@

> Blin, K., Shaw, S., Kloosterman, A. M., Charlop-Powers, Z., van Wezel, G. P., Medema, M. H., & Weber, T. (2021). antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic acids research, 49(W1), W29–W35. [DOI: 10.1093/nar/gkab335](https://doi.org/10.1093/nar/gkab335)

- [argNorm](https://github.com/BigDataBiology/argNorm)

> Svetlana Ugarcina Perovic, Vedanth Ramji, Hui Chong, Yiqian Duan, Finlay Maguire, Luis Pedro Coelho (2024). BigDataBiology/argNorm: Version 0.5.0 (v0.5.0). GitHub. https://github.com/BigDataBiology/argNorm. Zenodo. [DOI:10.5281/zenodo.10963591](https://zenodo.org/doi/10.5281/zenodo.10963591)

- [Bakta](https://doi.org/10.1099/mgen.0.000685)

> Schwengers, O., Jelonek, L., Dieckmann, M. A., Beyvers, S., Blom, J., & Goesmann, A. (2021). Bakta: rapid and standardized annotation of bacterial genomes via alignment-free sequence identification. Microbial Genomics, 7(11). [DOI: 10.1099/mgen.0.000685](https://doi.org/10.1099/mgen.0.000685)
2 changes: 1 addition & 1 deletion README.md
@@ -34,7 +34,7 @@ The nf-core/funcscan AWS full test dataset are contigs generated by the MGnify s
2. Taxonomic classification of contigs of **prokaryotic origin** with [`MMseqs2`](https://github.com/soedinglab/MMseqs2)
3. Annotation of assembled prokaryotic contigs with [`Prodigal`](https://github.com/hyattpd/Prodigal), [`Pyrodigal`](https://github.com/althonos/pyrodigal), [`Prokka`](https://github.com/tseemann/prokka), or [`Bakta`](https://github.com/oschwengers/bakta)
4. Screening contigs for antimicrobial peptide-like sequences with [`ampir`](https://cran.r-project.org/web/packages/ampir/index.html), [`Macrel`](https://github.com/BigDataBiology/macrel), [`HMMER`](http://hmmer.org/), [`AMPlify`](https://github.com/bcgsc/AMPlify)
-5. Screening contigs for antibiotic resistant gene-like sequences with [`ABRicate`](https://github.com/tseemann/abricate), [`AMRFinderPlus`](https://github.com/ncbi/amr), [`fARGene`](https://github.com/fannyhb/fargene), [`RGI`](https://card.mcmaster.ca/analyze/rgi), [`DeepARG`](https://bench.cs.vt.edu/deeparg)
+5. Screening contigs for antibiotic resistant gene-like sequences with [`ABRicate`](https://github.com/tseemann/abricate), [`AMRFinderPlus`](https://github.com/ncbi/amr), [`fARGene`](https://github.com/fannyhb/fargene), [`RGI`](https://card.mcmaster.ca/analyze/rgi), [`DeepARG`](https://bench.cs.vt.edu/deeparg). [`argNorm`](https://github.com/BigDataBiology/argNorm) is used to map the outputs of `DeepARG`, `AMRFinderPlus`, and `ABRicate` to the [`Antibiotic Resistance Ontology`](https://www.ebi.ac.uk/ols4/ontologies/aro) for consistent ARG classification terms.
6. Screening contigs for biosynthetic gene cluster-like sequences with [`antiSMASH`](https://antismash.secondarymetabolites.org), [`DeepBGC`](https://github.com/Merck/deepbgc), [`GECCO`](https://gecco.embl.de/), [`HMMER`](http://hmmer.org/)
7. Creating aggregated reports for all samples across the workflows with [`AMPcombi`](https://github.com/Darcy220606/AMPcombi) for AMPs, [`hAMRonization`](https://github.com/pha4ge/hAMRonization) for ARGs, and [`comBGC`](https://raw.githubusercontent.com/nf-core/funcscan/master/bin/comBGC.py) for BGCs
8. Software version and methods text reporting with [`MultiQC`](http://multiqc.info/)
15 changes: 15 additions & 0 deletions conf/base.config
@@ -204,6 +204,21 @@ process {
cpus = 1
}

withName: ARGNORM_DEEPARG {
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
cpus = 1
}

withName: ARGNORM_ABRICATE {
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
cpus = 1
}

withName: ARGNORM_AMRFINDERPLUS {
memory = { check_max( 4.GB * task.attempt, 'memory' ) }
cpus = 1
}

withName: AMPCOMBI2_PARSETABLES {
memory = { check_max( 8.GB * task.attempt, 'memory' ) }
time = { check_max( 2.h * task.attempt, 'time' ) }
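
The `ARGNORM_*` blocks in this hunk request `4.GB * task.attempt` through `check_max`, so the memory request grows with each retry. A minimal Python sketch of that arithmetic, assuming `check_max` simply caps the request at a configured ceiling. The function name `capped_memory_gb` and the 128 GB ceiling are illustrative assumptions, not values taken from this PR:

```python
def capped_memory_gb(attempt: int, base_gb: int = 4, max_gb: int = 128) -> int:
    """Return the memory request in GB for a given retry attempt.

    Mirrors `check_max( 4.GB * task.attempt, 'memory' )` under the
    assumption that check_max only clamps the value to an upper limit.
    `max_gb` is a hypothetical ceiling, not a value from the PR.
    """
    if attempt < 1:
        raise ValueError("attempt numbering starts at 1")
    return min(base_gb * attempt, max_gb)


# First attempt, a third attempt, and an attempt large enough to hit the cap.
for attempt in (1, 3, 40):
    print(attempt, capped_memory_gb(attempt))
```

Under these assumptions a failed argNorm task would be retried with 8 GB, then 12 GB, and so on, until the ceiling is reached.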
30 changes: 30 additions & 0 deletions conf/modules.config
@@ -629,6 +629,36 @@ process {
]
}

withName: ARGNORM_ABRICATE {
publishDir = [
path: {"${params.outdir}/arg/argnorm/abricate/"},
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
ext.prefix = { "${meta.id}.normalized.tsv" }
ext.args = "--hamronized"
}

withName: ARGNORM_AMRFINDERPLUS {
publishDir = [
path: {"${params.outdir}/arg/argnorm/amrfinderplus/"},
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
ext.prefix = { "${meta.id}.normalized.tsv" }
ext.args = "--hamronized"
}

withName: ARGNORM_DEEPARG {
publishDir = [
path: {"${params.outdir}/arg/argnorm/deeparg/"},
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
]
ext.prefix = { "${meta.id}.normalized.tsv" }
ext.args = "--hamronized"
}

withName: MERGE_TAXONOMY_COMBGC {
publishDir = [
path: { "${params.outdir}/reports/combgc" },
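
Each `ARGNORM_*` block in this hunk publishes into a tool-specific directory and uses `saveAs` to keep `versions.yml` out of the results. A rough Python sketch of that decision, assuming that a `null` return from `saveAs` means the file is not published. The helper `publish_target` and the sample file name are hypothetical:

```python
from pathlib import PurePosixPath
from typing import Optional


def publish_target(outdir: str, tool: str, filename: str) -> Optional[str]:
    """Return where a file would be published, or None if it is skipped.

    Sketch of `filename.equals('versions.yml') ? null : filename` combined
    with the `${params.outdir}/arg/argnorm/<tool>/` publish path.
    """
    if filename == "versions.yml":
        return None  # skipped, like returning null from saveAs
    return str(PurePosixPath(outdir) / "arg" / "argnorm" / tool / filename)


# Hypothetical file names for illustration only.
print(publish_target("results", "deeparg", "sample1.normalized.tsv"))
print(publish_target("results", "deeparg", "versions.yml"))
```

The same rule is repeated for `abricate`, `amrfinderplus`, and `deeparg`, so only the `tool` path component differs between the three blocks.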
33 changes: 29 additions & 4 deletions docs/output.md
@@ -4,7 +4,7 @@

The output of nf-core/funcscan provides reports for each of the functional groups:

-- antibiotic resistance genes (tools: [ABRicate](https://github.com/tseemann/abricate), [AMRFinderPlus](https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/AMRFinder), [DeepARG](https://bitbucket.org/gusphdproj/deeparg-ss/src/master), [fARGene](https://github.com/fannyhb/fargene), [RGI](https://card.mcmaster.ca/analyze/rgi) – summarised by [hAMRonization](https://github.com/pha4ge/hAMRonization))
+- antibiotic resistance genes (tools: [ABRicate](https://github.com/tseemann/abricate), [AMRFinderPlus](https://www.ncbi.nlm.nih.gov/pathogens/antimicrobial-resistance/AMRFinder), [DeepARG](https://bitbucket.org/gusphdproj/deeparg-ss/src/master), [fARGene](https://github.com/fannyhb/fargene), [RGI](https://card.mcmaster.ca/analyze/rgi) – summarised by [hAMRonization](https://github.com/pha4ge/hAMRonization); ABRicate, AMRFinderPlus, and DeepARG outputs are normalised to the ARO by [argNorm](https://github.com/BigDataBiology/argNorm))
- antimicrobial peptides (tools: [Macrel](https://github.com/BigDataBiology/macrel), [AMPlify](https://github.com/bcgsc/AMPlify), [ampir](https://ampir.marine-omics.net), [hmmsearch](http://hmmer.org) – summarised by [AMPcombi](https://github.com/Darcy220606/AMPcombi))
- biosynthetic gene clusters (tools: [antiSMASH](https://docs.antismash.secondarymetabolites.org), [DeepBGC](https://github.com/Merck/deepbgc), [GECCO](https://gecco.embl.de), [hmmsearch](http://hmmer.org) – summarised by [comBGC](#combgc))

@@ -35,8 +35,9 @@ results/
| ├── amrfinderplus/
| ├── deeparg/
| ├── fargene/
+| ├── rgi/
| ├── hamronization/
-| └── rgi/
+| └── argnorm/
├── bgc/
| ├── antismash/
| ├── deepbgc/
@@ -99,6 +100,7 @@ Output Summaries:

- [AMPcombi](#ampcombi) – summary report of antimicrobial peptide gene output from various detection tools.
- [hAMRonization](#hamronization) – summary of antimicrobial resistance gene output from various detection tools.
- [argNorm](#argnorm) – normalization of ARG annotations from [ABRicate](#abricate), [AMRFinderPlus](#amrfinderplus), and [DeepARG](#deeparg) to the ARO.
- [comBGC](#combgc) – summary of biosynthetic gene cluster output from various detection tools.
- [MultiQC](#multiqc) – report of all software and versions used in the pipeline.
- [Pipeline information](#pipeline-information) – report metrics generated during the workflow execution.
@@ -274,7 +276,7 @@ Output Summaries:

### ARG detection tools

-[ABRicate](#abricate), [AMRFinderPlus](#amrfinderplus), [DeepARG](#deeparg), [fARGene](#fargene), [RGI](#rgi)
+[ABRicate](#abricate), [AMRFinderPlus](#amrfinderplus), [DeepARG](#deeparg), [fARGene](#fargene), [RGI](#rgi).

#### ABRicate

@@ -441,7 +443,7 @@ Note that filtered FASTA is only used for BGC workflow for run-time optimisation

### Summary tools

-[AMPcombi](#ampcombi), [hAMRonization](#hamronization), [comBGC](#combgc), [MultiQC](#multiqc), [pipeline information](#pipeline-information)
+[AMPcombi](#ampcombi), [hAMRonization](#hamronization), [comBGC](#combgc), [MultiQC](#multiqc), [pipeline information](#pipeline-information), [argNorm](#argnorm).

#### AMPcombi

@@ -566,6 +568,29 @@ Note that filtered FASTA is only used for BGC workflow for run-time optimisation

[hAMRonization](https://github.com/pha4ge/hAMRonization) summarizes the outputs of the **antimicrobial resistance gene** detection tools (ABRicate, AMRFinderPlus, DeepARG, fARGene, RGI) into a single unified tabular format. It supports a variety of summary options including an interactive summary.

#### argNorm

<details markdown="1">
<summary>Output files</summary>

- `normalized/`
- `*.{tsv}`: search results in tabular format
</details>
<details markdown="1">
<summary>ARG summary table headers</summary>

| Table column | Description |
| ---------------------------- | -------------------------------------------------------------------------------- |
| `ARO` | ARO accessions of ARG |
| `confers_resistance_to` | ARO accessions of drugs to which ARGs confer resistance |
| `resistance_to_drug_classes` | ARO accessions of drug classes to which drugs in `confers_resistance_to` belong |

</details>

[argNorm](https://github.com/BigDataBiology/argNorm) normalizes antibiotic resistance gene (ARG) annotations by mapping them to the Antibiotic Resistance Ontology (ARO) created by the CARD database. It also enriches the annotations by categorizing the drugs to which each ARG confers resistance.

argNorm takes the [hAMRonization](#hamronization) outputs for [ABRicate](#abricate), [AMRFinderPlus](#amrfinderplus), and [DeepARG](#deeparg) and normalizes the ARGs in them to the ARO.
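
The three ARO columns listed in the table above can be read from a normalized table with Python's standard `csv` module. The following is only a sketch: the `gene_symbol` column, the accession values, and the helper name `group_by_drug_class` are made-up placeholders, and real argNorm output contains many more hAMRonization columns.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of a normalized table. Only the three ARO column
# names come from the documentation; every value is a placeholder.
SAMPLE_TSV = (
    "gene_symbol\tARO\tconfers_resistance_to\tresistance_to_drug_classes\n"
    "geneA\tARO:0000001\tARO:0000002\tARO:0000003\n"
    "geneB\tARO:0000004\tARO:0000005\tARO:0000003\n"
)


def group_by_drug_class(handle):
    """Map each `resistance_to_drug_classes` value to the ARG AROs under it."""
    reader = csv.DictReader(handle, delimiter="\t")
    groups = defaultdict(list)
    for row in reader:
        groups[row["resistance_to_drug_classes"]].append(row["ARO"])
    return dict(groups)


groups = group_by_drug_class(io.StringIO(SAMPLE_TSV))
for drug_class, aros in groups.items():
    print(drug_class, aros)
```

Because all three tools are mapped to the same ontology, a grouping like this could in principle be applied unchanged to the ABRicate, AMRFinderPlus, and DeepARG tables.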

#### comBGC

<details markdown="1">
5 changes: 5 additions & 0 deletions modules.json
@@ -205,6 +205,11 @@
"git_sha": "4e5f4687318f24ba944a13609d3ea6ebd890737d",
"installed_by": ["modules"],
"patch": "modules/nf-core/untar/untar.diff"
},
"argnorm": {
"branch": "master",
"git_sha": "e4fc46af5ec30070e6aef780aba14f89a28caa88",
"installed_by": ["modules"]
}
}
},
7 changes: 7 additions & 0 deletions modules/nf-core/argnorm/environment.yml
68 changes: 68 additions & 0 deletions modules/nf-core/argnorm/main.nf
60 changes: 60 additions & 0 deletions modules/nf-core/argnorm/meta.yml
5 changes: 5 additions & 0 deletions modules/nf-core/argnorm/tests/argnorm_hamronized.config
5 changes: 5 additions & 0 deletions modules/nf-core/argnorm/tests/argnorm_raw.config

Some generated files are not rendered by default.