[DRAFT] Add NGSCheckMate in as part of a cram sampleQC subworkflow #1252

Merged on Nov 8, 2023 (29 commits)
Changes from 5 commits
Commits
94ca07d
Add NGSCheckMate
SPPearce Sep 24, 2023
21d38a2
Update CHANGELOG
SPPearce Sep 24, 2023
2e72d1c
Update tools list for sampleqc
SPPearce Sep 25, 2023
cee8614
Update CHANGELOG.md
SPPearce Sep 29, 2023
9c48c87
Merge branch 'dev' into ngscheckmate
SPPearce Sep 29, 2023
a2496de
Update conf/igenomes.config
SPPearce Sep 29, 2023
05f681e
Update conf/igenomes.config
SPPearce Sep 29, 2023
9d9105c
Update CHANGELOG.md
SPPearce Sep 29, 2023
2ab1e4d
Update conf/igenomes.config
SPPearce Sep 29, 2023
ac649e0
Update conf/igenomes.config
SPPearce Sep 29, 2023
a0e5f43
Add tests, swap to ngscheckmate
SPPearce Sep 30, 2023
2c95984
Merge branch 'dev' into ngscheckmate
SPPearce Sep 30, 2023
97e1fa6
Fix NGSCheckMate test name
SPPearce Oct 1, 2023
64273f3
Merge remote-tracking branch 'refs/remotes/origin/ngscheckmate' into …
SPPearce Oct 1, 2023
ad97096
Update test
SPPearce Oct 1, 2023
7378fe9
Update output path and docs
SPPearce Oct 2, 2023
d37c24e
Change ngscheckmate publishdir
SPPearce Oct 2, 2023
4e73bdb
Update tests/config/tags.yml
SPPearce Oct 2, 2023
8f59abc
Merge branch 'dev' into ngscheckmate
maxulysse Oct 11, 2023
3acaef9
Fix merge conflict
SPPearce Nov 7, 2023
e1706a3
Apply code review suggestions, fix channel to mpileup
SPPearce Nov 7, 2023
01ae112
Swap around bed location in confs
SPPearce Nov 7, 2023
13df981
Swap to modules test-data
SPPearce Nov 7, 2023
32227d3
Apply suggestions from code review
SPPearce Nov 7, 2023
10185fa
Add getGenomeAttribute check
SPPearce Nov 7, 2023
ac87102
Update conf/test.config
SPPearce Nov 7, 2023
cf1e116
Move blank line
SPPearce Nov 7, 2023
5f25439
Add to somatic full test
SPPearce Nov 7, 2023
4e55d5d
Merge remote-tracking branch 'origin/dev' into ngscheckmate
SPPearce Nov 7, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -10,6 +10,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Added

- [#1246](https://github.com/nf-core/sarek/pull/1246) - Back to dev
- [#1252](https://github.com/nf-core/sarek/pull/1252) Added NGSCheckMate tool for checking that samples come from the same individual

### Changed

4 changes: 4 additions & 0 deletions conf/igenomes.config
@@ -35,6 +35,7 @@ params {
known_indels_tbi = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/GATKBundle/{1000G_phase1,Mills_and_1000G_gold_standard}.indels.b37.vcf.gz.tbi"
known_indels_vqsr = '--resource:1000G,known=false,training=true,truth=true,prior=10.0 1000G_phase1.indels.b37.vcf.gz --resource:mills,known=false,training=true,truth=true,prior=10.0 Mills_and_1000G_gold_standard.indels.b37.vcf.gz'
mappability = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh37/Annotation/Control-FREEC/out100m2_hg19.gem"
ngscheckmate_bed = "https://raw.githubusercontent.com/parklab/NGSCheckMate/master/SNP/SNP_GRCh37_hg19_wChr.bed"
snpeff_db = 87
snpeff_genome = 'GRCh37'
vep_cache_version = 110
@@ -68,6 +69,7 @@ params {
known_indels_tbi = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/{Mills_and_1000G_gold_standard.indels.hg38,beta/Homo_sapiens_assembly38.known_indels}.vcf.gz.tbi"
known_indels_vqsr = '--resource:gatk,known=false,training=true,truth=true,prior=10.0 Homo_sapiens_assembly38.known_indels.vcf.gz --resource:mills,known=false,training=true,truth=true,prior=10.0 Mills_and_1000G_gold_standard.indels.hg38.vcf.gz'
mappability = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/Control-FREEC/out100m2_hg38.gem"
ngscheckmate_bed = "https://raw.githubusercontent.com/parklab/NGSCheckMate/master/SNP/SNP_GRCh38_hg38_wChr.bed"
pon = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/1000g_pon.hg38.vcf.gz"
pon_tbi = "${params.igenomes_base}/Homo_sapiens/GATK/GRCh38/Annotation/GATKBundle/1000g_pon.hg38.vcf.gz.tbi"
snpeff_db = 105
@@ -79,6 +81,7 @@ params {
'Ensembl.GRCh37' {
bwa = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/version0.6.0/"
fasta = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa"
ngscheckmate_bed = "https://raw.githubusercontent.com/parklab/NGSCheckMate/master/SNP/SNP_GRCh37_hg19_woChr.bed"
readme = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Annotation/README.txt"
snpeff_db = 87
snpeff_genome = 'GRCh37'
@@ -89,6 +92,7 @@ params {
'NCBI.GRCh38' {
bwa = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/BWAIndex/version0.6.0/"
fasta = "${params.igenomes_base}/Homo_sapiens/NCBI/GRCh38/Sequence/WholeGenomeFasta/genome.fa"
ngscheckmate_bed = "https://raw.githubusercontent.com/parklab/NGSCheckMate/master/SNP/SNP_GRCh38_hg38_wChr.bed"
snpeff_db = 105
snpeff_genome = 'GRCh38'
vep_cache_version = 110
12 changes: 12 additions & 0 deletions conf/modules/ngscheckmate.config
@@ -0,0 +1,12 @@
process {
withName: ".*BAM_NGSCHECKMATE:BCFTOOLS_MPILEUP" {
ext.when = { params.tools && params.tools.split(',').contains('sampleqc') }
Member:
Why sampleqc and not ngscheckmate?

Contributor Author:
I was thinking that the other tools that have been discussed (sexdeterrmine, somalier) would go into the same subworkflow, so I thought they might all be controlled together. If they should be more granular, that is fine too.

Member:
I'd rather be granular, as we already have some QC tools.

Contributor:
Agreed. Maybe someone wants to run checkmate but not somalier, etc.

Member:
To be fair, most of our QC tools are enabled by default and disabled on demand with --skip_tools. Should we do that here?

ext.args2 = '--no-version --ploidy 1 -c'
ext.args3 = '--no-version'
}

withName: ".*BAM_NGSCHECKMATE:NGSCHECKMATE_NCM" {
ext.args = '-V'
}

}
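
The review thread in this file weighs two alternatives to gating on `sampleqc`: a per-tool key in `--tools`, or enabling the tool by default and opting out through `--skip_tools`. A minimal sketch of how each `ext.when` closure might look follows. The `ngscheckmate` key is an assumption taken from the reviewers' suggestion and is not part of this revision of the diff; either variant would also need a matching change to the schema pattern.

```groovy
process {
    // Variant 1 (assumed key name 'ngscheckmate'): opt in per tool via --tools
    withName: ".*BAM_NGSCHECKMATE:BCFTOOLS_MPILEUP" {
        ext.when = { params.tools && params.tools.split(',').contains('ngscheckmate') }
    }

    // Variant 2 (hypothetical): enabled by default, opt out via --skip_tools
    // withName: ".*BAM_NGSCHECKMATE:BCFTOOLS_MPILEUP" {
    //     ext.when = { !(params.skip_tools && params.skip_tools.split(',').contains('ngscheckmate')) }
    // }
}
```

Variant 1 keeps the tool off unless requested, which matches how variant callers are selected; Variant 2 matches the opt-out behaviour the last comment describes for existing QC tools.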
12 changes: 11 additions & 1 deletion modules.json
@@ -18,7 +18,7 @@
"bcftools/mpileup": {
"branch": "master",
"git_sha": "911696ea0b62df80e900ef244d7867d177971f73",
"installed_by": ["modules"]
"installed_by": ["bam_ngscheckmate", "modules"]
},
"bcftools/sort": {
"branch": "master",
@@ -322,6 +322,11 @@
"git_sha": "a6e11ac655e744f7ebc724be669dd568ffdc0e80",
"installed_by": ["modules"]
},
"ngscheckmate/ncm": {
"branch": "master",
"git_sha": "32d6725f584ebf460de39b7c1c53a29d5384d697",
"installed_by": ["bam_ngscheckmate"]
},
"samblaster": {
"branch": "master",
"git_sha": "603ecbd9f45300c9788f197d2a15a005685b4220",
@@ -461,6 +466,11 @@
},
"subworkflows": {
"nf-core": {
"bam_ngscheckmate": {
"branch": "master",
"git_sha": "32d6725f584ebf460de39b7c1c53a29d5384d697",
"installed_by": ["subworkflows"]
},
"vcf_annotate_ensemblvep": {
"branch": "master",
"git_sha": "dedc0e31087f3306101c38835d051bf49789445a",
64 changes: 64 additions & 0 deletions modules/nf-core/ngscheckmate/ncm/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

77 changes: 77 additions & 0 deletions modules/nf-core/ngscheckmate/ncm/meta.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 6 additions & 0 deletions nextflow.config
@@ -93,6 +93,9 @@ params {
vep_spliceai = null // spliceai plugin disabled within VEP
vep_spliceregion = null // spliceregion plugin disabled within VEP

// NGSCheckMate
ngscheckmate_bed = null

// MultiQC options
multiqc_config = null
multiqc_title = null
@@ -382,6 +385,9 @@ includeConfig 'conf/modules/post_variant_calling.config'
//annotate
includeConfig 'conf/modules/annotate.config'

//ngscheckmate
includeConfig 'conf/modules/ngscheckmate.config'

// Function to ensure that resource requirements don't go beyond
// a maximum limit
def check_max(obj, type) {
8 changes: 7 additions & 1 deletion nextflow_schema.json
@@ -100,7 +100,7 @@
"fa_icon": "fas fa-toolbox",
"description": "Tools to use for duplicate marking, variant calling and/or for annotation.",
"help_text": "Multiple tools separated with commas.\n\n**Variant Calling:**\n\nGermline variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: DeepVariant, FreeBayes, GATK HaplotypeCaller, mpileup, Sentieon Haplotyper, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit\n\nTumor-only somatic variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, mpileup, Mutect2, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit, ControlFREEC\n\nSomatic variant calling can currently only be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, Mutect2, Strelka2\n- Structural variants: Manta, TIDDIT\n- Copy-Number: ASCAT, CNVKit, Control-FREEC\n- Microsatellite Instability: MSIsensorpro\n\n> **NB** Mutect2 for somatic variant calling cannot be combined with `--no_intervals`\n\n**Annotation:**\n \n- snpEff, VEP, merge (both consecutively).\n\n> **NB** As Sarek will use bgzip and tabix to compress and index VCF files annotated, it expects VCF files to be sorted when starting from `--step annotate`.",
"pattern": "^((ascat|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|sentieon_haplotyper|manta|merge|mpileup|msisensorpro|mutect2|sentieon_dedup|snpeff|strelka|tiddit|vep)?,?)*(?<!,)$"
"pattern": "^((ascat|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|sentieon_haplotyper|manta|merge|mpileup|msisensorpro|mutect2|sampleqc|sentieon_dedup|snpeff|strelka|tiddit|vep)?,?)*(?<!,)$"
},
"skip_tools": {
"type": "string",
@@ -713,6 +713,12 @@
"hidden": true,
"help_text": "If you use AWS iGenomes, this has already been set for you appropriately."
},
"ngscheckmate_bed": {
"type": "string",
"fa_icon": "fas fa-file",
"description": "Path to SNP bed file for sample checking with NGSCheckMate",
"help_text": "If you use AWS iGenomes, this has already been set for you appropriately."
},
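
For references not covered by iGenomes, the `ngscheckmate_bed` parameter described by the schema entry above would have to be supplied by the user. A minimal sketch of a custom config follows, assuming a hypothetical file name `custom.config` (passed with `-c custom.config`) and a hypothetical local BED path; the `sampleqc` tool key is the one this revision of the PR adds to the schema pattern.

```groovy
// custom.config (hypothetical file name)
params {
    // Enable the sample QC subworkflow; 'sampleqc' is the key used in this revision of the diff
    tools            = 'sampleqc'

    // Hypothetical path: a SNP BED file whose contig names match the reference FASTA
    ngscheckmate_bed = '/path/to/custom_snps.bed'
}
```

The iGenomes entries in this diff point at `_wChr` and `_woChr` variants of the upstream BED files, which suggests (an assumption based on the file names alone) that the right choice depends on whether the reference uses `chr`-prefixed contig names.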
"snpeff_db": {
"type": "string",
"fa_icon": "fas fa-database",
30 changes: 30 additions & 0 deletions subworkflows/local/cram_sampleqc/main.nf
@@ -0,0 +1,30 @@
include { BAM_NGSCHECKMATE } from '../../../subworkflows/nf-core/bam_ngscheckmate/main'

workflow CRAM_SAMPLEQC {

take:
ch_cram // channel: [ val(meta), cram, crai ]
ngscheckmate_bed // channel: [ ngscheckmate_bed ]
fasta // channel: [ fasta ]

main:

ch_versions = Channel.empty()

ch_ngscheckmate_bed = ngscheckmate_bed.map{bed -> [[id: "ngscheckmate"], bed]}

ch_fasta = fasta.map{fasta -> [[id: "genome"], fasta]}

BAM_NGSCHECKMATE ( ch_cram.map{meta, cram, crai -> [meta, cram]}, ch_ngscheckmate_bed, ch_fasta)
ch_versions = ch_versions.mix(BAM_NGSCHECKMATE.out.versions.first())

emit:
corr_matrix = BAM_NGSCHECKMATE.out.corr_matrix // channel: [ meta, corr_matrix ]
matched = BAM_NGSCHECKMATE.out.matched // channel: [ meta, matched ]
all = BAM_NGSCHECKMATE.out.all // channel: [ meta, all ]
vcf = BAM_NGSCHECKMATE.out.vcf // channel: [ meta, vcf ]
pdf = BAM_NGSCHECKMATE.out.pdf // channel: [ meta, pdf ]

versions = ch_versions // channel: [ versions.yml ]
}

49 changes: 49 additions & 0 deletions subworkflows/nf-core/bam_ngscheckmate/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
