Dealing with optional and mandatory GATK resource files #592

Merged Jun 20, 2022
22 commits
f217e6a
Add error message if dbsnp or known_indels is not supplied for bqsr o…
FriederikeHanssen Jun 16, 2022
2d811ff
multi-line not supported, use \n instead
FriederikeHanssen Jun 16, 2022
dd54c4c
Make PON optional
FriederikeHanssen Jun 16, 2022
a49df17
Make germline_resource optional for mutect2
FriederikeHanssen Jun 16, 2022
30b368c
Update getpileup, deal with optional dbsnp, germline, knwon_indels
FriederikeHanssen Jun 16, 2022
f3796c1
Formatting
FriederikeHanssen Jun 16, 2022
63e8e7d
Formatting
FriederikeHanssen Jun 16, 2022
f0805af
Merge remote-tracking branch 'upstream/dev' into gatk_resource
FriederikeHanssen Jun 17, 2022
37043a0
Add suggestion on value channels to handle optional input
FriederikeHanssen Jun 17, 2022
3d34ae7
make germline_resource work again
FriederikeHanssen Jun 17, 2022
c6210fa
remove test code
FriederikeHanssen Jun 17, 2022
a72ce06
add mutect2 no intervals tests
FriederikeHanssen Jun 17, 2022
1d6e363
some indents
FriederikeHanssen Jun 17, 2022
7fd12ab
more channel :sparkles: magic
FriederikeHanssen Jun 17, 2022
17d3d72
Merge remote-tracking branch 'upstream/dev' into gatk_resource
FriederikeHanssen Jun 17, 2022
ae9eb2f
add config for mutect2 tests
FriederikeHanssen Jun 17, 2022
f8a0e12
not sure what is going with mutect2
FriederikeHanssen Jun 20, 2022
7216964
Fix getpileupsummaries output
FriederikeHanssen Jun 20, 2022
ed7c724
Update conf/modules.config
FriederikeHanssen Jun 20, 2022
975e71c
Try to request more memory
FriederikeHanssen Jun 20, 2022
26d52d4
Revert this, everything is red
FriederikeHanssen Jun 20, 2022
2aa1e2f
try .5
FriederikeHanssen Jun 20, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -93,6 +93,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  - [#587](https://github.com/nf-core/sarek/pull/587) - Fix issue with VEP extra files
  - [#581](https://github.com/nf-core/sarek/pull/581) - `TIDDIT` is back
  - [#590](https://github.com/nf-core/sarek/pull/590) - Fix empty folders during scatter/gather
+ - [#592](https://github.com/nf-core/sarek/pull/592) - Fix optional resources for Mutect2, GetPileupSummaries, and HaplotypeCaller: issue [#299](https://github.com/nf-core/sarek/issues/299), [#359](https://github.com/nf-core/sarek/issues/359), [#367](https://github.com/nf-core/sarek/issues/367)

  ### Deprecated

25 changes: 17 additions & 8 deletions conf/modules.config
@@ -808,8 +808,8 @@ process{
  ext.args = { "-tumor-segmentation ${meta.id}.segmentation.table" }
  publishDir = [
  mode: params.publish_dir_mode,
- path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" },
- saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+ path: { "${params.outdir}/variant_calling/" },
+ saveAs: { filename -> filename.equals('versions.yml') ? null : "${meta.id}/mutect2/${filename}" }
  ]
  }

@@ -825,8 +825,8 @@ process{
  ext.prefix = {"${meta.id}.filtered"}
  publishDir = [
  mode: params.publish_dir_mode,
- path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" },
- saveAs: { filename -> filename.equals('versions.yml') ? null : filename }
+ path: { "${params.outdir}/variant_calling/" },
+ saveAs: { filename -> filename.equals('versions.yml') ? null : "${meta.id}/mutect2/${filename}" }
  ]
  }

@@ -850,9 +850,18 @@ process{
  ext.prefix = { meta.num_intervals <= 1 ? meta.id : "${meta.id}_${intervals.simpleName}" }
  publishDir = [
  mode: params.publish_dir_mode,
- path: { "${params.outdir}/variant_calling/${meta.id}/" },
+ path: { "${params.outdir}/variant_calling/" },
+ pattern: "*.table",
+ saveAs: { meta.num_intervals > 1 ? null : "${meta.id}/mutect2/${it}" }
+ ]
+ }
+
+ withName: 'GETPILEUPSUMMARIES_.*' {
+ publishDir = [
+ mode: params.publish_dir_mode,
+ path: { "${params.outdir}/variant_calling/" },
  pattern: "*.table",
- saveAs: { meta.num_intervals > 1 ? null : "mutect2/${it}" }
+ saveAs: { meta.num_intervals > 1 ? null : "${meta.tumor_id}_vs_${meta.normal_id}/mutect2/${it}" }
  ]
  }

@@ -880,9 +889,9 @@ process{
  ext.args = { params.ignore_soft_clipped_bases ? "--dont-use-soft-clipped-bases true --f1r2-tar-gz ${task.ext.prefix}.f1r2.tar.gz" : "--f1r2-tar-gz ${task.ext.prefix}.f1r2.tar.gz" }
  publishDir = [
  mode: params.publish_dir_mode,
- path: { "${params.outdir}/variant_calling/${meta.id}/" },
+ path: { "${params.outdir}/variant_calling/" },
  pattern: "*{vcf.gz,vcf.gz.tbi,stats}",
- saveAs: { meta.num_intervals > 1 ? null : "mutect2/${it}" }
+ saveAs: { meta.num_intervals > 1 ? null : "${meta.id}/mutect2/${it}" }
  ]
  }

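The `saveAs` closures in the conf/modules.config changes above can be exercised outside Nextflow. This is a minimal Groovy sketch, not pipeline code: the `meta` maps and the closure names `saveAsSample` and `saveAsPair` are hypothetical stand-ins, and `meta` is passed in explicitly, whereas in the real config Nextflow supplies it per task. It shows the two behaviours the diff relies on: returning `null` skips publishing, and any other return value is treated as a path relative to the `publishDir` `path`.

```groovy
// Hypothetical meta maps; in the pipeline Nextflow supplies these per task.
def tumorOnly = [id: 'sample2', num_intervals: 1]
def pair      = [tumor_id: 'sample4', normal_id: 'sample3', num_intervals: 1]

// Stand-in for the Mutect2 closures: skip versions.yml, otherwise
// publish under <meta.id>/mutect2/ relative to the publishDir path.
def saveAsSample = { Map meta, String filename ->
    filename.equals('versions.yml') ? null : "${meta.id}/mutect2/${filename}".toString()
}

// Stand-in for the GETPILEUPSUMMARIES_.* closure: publish nothing while the
// results are still split over several intervals, else use <tumor>_vs_<normal>/.
def saveAsPair = { Map meta, String filename ->
    meta.num_intervals > 1 ? null : "${meta.tumor_id}_vs_${meta.normal_id}/mutect2/${filename}".toString()
}

println saveAsSample(tumorOnly, 'sample2.filtered.vcf.gz')
println saveAsPair(pair, 'sample4.pileupsummaries.table')
println saveAsSample(tumorOnly, 'versions.yml')
```

With the shared `path` set to `${params.outdir}/variant_calling/`, the second call yields the same relative location that tests/test_tools.yml expects for `sample4.pileupsummaries.table`.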
2 changes: 1 addition & 1 deletion conf/test.config
@@ -22,7 +22,7 @@ params {

  // Limit resources so that this can run on GitHub Actions
  max_cpus = 2
- max_memory = '6.GB'
+ max_memory = '6.5GB'
  max_time = '8.h'

  // Input data
2 changes: 1 addition & 1 deletion modules.json
@@ -130,7 +130,7 @@
  "git_sha": "169b2b96c1167f89ab07127b7057c1d90a6996c7"
  },
  "gatk4/getpileupsummaries": {
- "git_sha": "f40cfefc0899fd6bb6adc300142ca6c3a35573ff"
+ "git_sha": "1ac223ad436c1410e9c16a5966274b7ca1f8d855"
  },
  "gatk4/haplotypecaller": {
  "git_sha": "169b2b96c1167f89ab07127b7057c1d90a6996c7"
2 changes: 1 addition & 1 deletion modules/nf-core/modules/gatk4/getpileupsummaries/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

23 changes: 11 additions & 12 deletions subworkflows/local/prepare_genome.nf
@@ -52,7 +52,6 @@ workflow PREPARE_GENOME {
  TABIX_PON(pon.flatten().map{ it -> [[id:it.baseName], it] })

  chr_files = chr_dir
- //TODO this works, but is not pretty. I will leave this in your hands during refactoring @Maxime
  if ( params.chr_dir.endsWith('tar.gz')){
  UNTAR_CHR_DIR(chr_dir.map{ it -> [[id:it[0].baseName], it] })
  chr_files = UNTAR_CHR_DIR.out.untar.map{ it[1] }
@@ -71,16 +70,16 @@ workflow PREPARE_GENOME {
  ch_versions = ch_versions.mix(TABIX_PON.out.versions)

  emit:
- bwa = BWAMEM1_INDEX.out.index // path: bwa/*
- bwamem2 = BWAMEM2_INDEX.out.index // path: bwamem2/*
- hashtable = DRAGMAP_HASHTABLE.out.hashmap // path: dragmap/*
- dbsnp_tbi = TABIX_DBSNP.out.tbi.map{ meta, tbi -> [tbi] }.collect() // path: dbsnb.vcf.gz.tbi
- dict = GATK4_CREATESEQUENCEDICTIONARY.out.dict // path: genome.fasta.dict
- fasta_fai = SAMTOOLS_FAIDX.out.fai.map{ meta, fai -> [fai] } // path: genome.fasta.fai
- germline_resource_tbi = TABIX_GERMLINE_RESOURCE.out.tbi.map{ meta, tbi -> [tbi] }.collect() // path: germline_resource.vcf.gz.tbi
- known_indels_tbi = TABIX_KNOWN_INDELS.out.tbi.map{ meta, tbi -> [tbi] }.collect() // path: {known_indels*}.vcf.gz.tbi
- msisensorpro_scan = MSISENSORPRO_SCAN.out.list.map{ meta, list -> [list] } // path: genome_msi.list
- pon_tbi = TABIX_PON.out.tbi.map{ meta, tbi -> [tbi] }.collect() // path: pon.vcf.gz.tbi
+ bwa = BWAMEM1_INDEX.out.index // path: bwa/*
+ bwamem2 = BWAMEM2_INDEX.out.index // path: bwamem2/*
+ hashtable = DRAGMAP_HASHTABLE.out.hashmap // path: dragmap/*
+ dbsnp_tbi = TABIX_DBSNP.out.tbi.map{ meta, tbi -> [tbi] }.collect() // path: dbsnb.vcf.gz.tbi
+ dict = GATK4_CREATESEQUENCEDICTIONARY.out.dict // path: genome.fasta.dict
+ fasta_fai = SAMTOOLS_FAIDX.out.fai.map{ meta, fai -> [fai] } // path: genome.fasta.fai
+ germline_resource_tbi = TABIX_GERMLINE_RESOURCE.out.tbi.map{ meta, tbi -> [tbi] }.collect() // path: germline_resource.vcf.gz.tbi
+ known_indels_tbi = TABIX_KNOWN_INDELS.out.tbi.map{ meta, tbi -> [tbi] }.collect() // path: {known_indels*}.vcf.gz.tbi
+ msisensorpro_scan = MSISENSORPRO_SCAN.out.list.map{ meta, list -> [list] } // path: genome_msi.list
+ pon_tbi = TABIX_PON.out.tbi.map{ meta, tbi -> [tbi] }.collect() // path: pon.vcf.gz.tbi
  chr_files = chr_files
- versions = ch_versions // channel: [ versions.yml ]
+ versions = ch_versions // channel: [ versions.yml ]
  }

35 changes: 26 additions & 9 deletions tests/test_tools.yml
@@ -430,10 +430,18 @@
  - no_intervals
  - tumor_only
  - variant_calling
- exit_code: 1
- stdout:
- contains:
- - "--tools mutect2 and --no_intervals cannot be used together."
+ files:
+ - path: results/variant_calling/sample2/mutect2/sample2.vcf.gz
+ - path: results/variant_calling/sample2/mutect2/sample2.vcf.gz.tbi
+ - path: results/variant_calling/sample2/mutect2/sample2.vcf.gz.stats
+ - path: results/variant_calling/sample2/mutect2/sample2.contamination.table
+ - path: results/variant_calling/sample2/mutect2/sample2.segmentation.table
+ - path: results/variant_calling/sample2/mutect2/sample2.artifactprior.tar.gz
+ - path: results/variant_calling/sample2/mutect2/sample2.pileupsummaries.table
+ - path: results/variant_calling/sample2/mutect2/sample2.filtered.vcf.gz
+ - path: results/variant_calling/sample2/mutect2/sample2.filtered.vcf.gz.tbi
+ - path: results/variant_calling/sample2/mutect2/sample2.filtered.vcf.gz.filteringStats.tsv
+ - path: results/csv/variantcalled.csv

  - name: Run variant calling on somatic sample with mutect2
  command: nextflow run main.nf -profile test,tools_somatic,docker --tools mutect2 -c ./tests/nextflow.config
@@ -456,16 +464,25 @@
  - path: results/csv/variantcalled.csv

  - name: Run variant calling on somatic sample with mutect2 without intervals
- command: nextflow run main.nf -profile test,tools_somatic,docker --tools mutect2 --no_intervals
+ command: nextflow run main.nf -profile test,tools_somatic,docker --tools mutect2 --no_intervals -c ./tests/nextflow.config
  tags:
  - mutect2
  - no_intervals
  - somatic
  - variant_calling
- exit_code: 1
- stdout:
- contains:
- - "--tools mutect2 and --no_intervals cannot be used together."
+ files:
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.vcf.gz
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.vcf.gz.tbi
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.vcf.gz.stats
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.contamination.table
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.segmentation.table
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample3.pileupsummaries.table
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4.pileupsummaries.table
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.artifactprior.tar.gz
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.filtered.vcf.gz
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.filtered.vcf.gz.tbi
+ - path: results/variant_calling/sample4_vs_sample3/mutect2/sample4_vs_sample3.filtered.vcf.gz.filteringStats.tsv
+ - path: results/csv/variantcalled.csv

  - name: Run variant calling on somatic sample with msisensor-pro
  command: nextflow run main.nf -profile test,tools_somatic,docker --tools msisensorpro