Dealing with optional and mandatory GATK resource files #592

FriederikeHanssen · 2022-06-16T10:37:40Z

PR checklist

…r haplotypecaller

github-actions · 2022-06-16T10:40:46Z

`nf-core lint` overall result: Passed ✅ ⚠️

Posted for pipeline commit 2aa1e2f

+| ✅ 144 tests passed       |+
#| ❔   4 tests were ignored |#
!| ❗   8 tests had warnings |!

❗ Test warnings:

readme - README did not have a Nextflow minimum version badge.
pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
pipeline_todos - TODO string in test_full.config: Specify the paths to your full test data ( on nf-core/test-datasets or directly in repositories, e.g. SRA)
pipeline_todos - TODO string in test_full.config: Give any required params for the test so that command line flags are not needed
pipeline_todos - TODO string in awsfulltest.yml: You can customise AWS full pipeline tests as required
schema_description - No description provided in schema for parameter: umi_read_structure
schema_description - No description provided in schema for parameter: group_by_umi_strategy

❔ Tests ignored:

files_unchanged - File ignored due to lint config: assets/nf-core-sarek_logo_light.png
files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_light.png
files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_dark.png
files_unchanged - File ignored due to lint config: lib/NfcoreTemplate.groovy

✅ Tests passed:

files_exist - File found: .gitattributes
files_exist - File found: .gitignore
files_exist - File found: .nf-core.yml
files_exist - File found: .editorconfig
files_exist - File found: .prettierrc.yml
files_exist - File found: CHANGELOG.md
files_exist - File found: CITATIONS.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: CODE_OF_CONDUCT.md
files_exist - File found: LICENSE or LICENSE.md or LICENCE or LICENCE.md
files_exist - File found: nextflow_schema.json
files_exist - File found: nextflow.config
files_exist - File found: README.md
files_exist - File found: .github/.dockstore.yml
files_exist - File found: .github/CONTRIBUTING.md
files_exist - File found: .github/ISSUE_TEMPLATE/bug_report.yml
files_exist - File found: .github/ISSUE_TEMPLATE/config.yml
files_exist - File found: .github/ISSUE_TEMPLATE/feature_request.yml
files_exist - File found: .github/PULL_REQUEST_TEMPLATE.md
files_exist - File found: .github/workflows/branch.yml
files_exist - File found: .github/workflows/ci.yml
files_exist - File found: .github/workflows/linting_comment.yml
files_exist - File found: .github/workflows/linting.yml
files_exist - File found: assets/email_template.html
files_exist - File found: assets/email_template.txt
files_exist - File found: assets/sendmail_template.txt
files_exist - File found: assets/nf-core-sarek_logo_light.png
files_exist - File found: conf/modules.config
files_exist - File found: conf/test.config
files_exist - File found: conf/test_full.config
files_exist - File found: docs/images/nf-core-sarek_logo_light.png
files_exist - File found: docs/images/nf-core-sarek_logo_dark.png
files_exist - File found: docs/output.md
files_exist - File found: docs/README.md
files_exist - File found: docs/README.md
files_exist - File found: docs/usage.md
files_exist - File found: lib/nfcore_external_java_deps.jar
files_exist - File found: lib/NfcoreSchema.groovy
files_exist - File found: lib/NfcoreTemplate.groovy
files_exist - File found: lib/Utils.groovy
files_exist - File found: lib/WorkflowMain.groovy
files_exist - File found: main.nf
files_exist - File found: assets/multiqc_config.yml
files_exist - File found: conf/base.config
files_exist - File found: conf/igenomes.config
files_exist - File found: .github/workflows/awstest.yml
files_exist - File found: .github/workflows/awsfulltest.yml
files_exist - File found: lib/WorkflowSarek.groovy
files_exist - File found: modules.json
files_exist - File not found check: Singularity
files_exist - File not found check: parameters.settings.json
files_exist - File not found check: .nf-core.yaml
files_exist - File not found check: bin/markdown_to_html.r
files_exist - File not found check: conf/aws.config
files_exist - File not found check: .github/workflows/push_dockerhub.yml
files_exist - File not found check: .github/ISSUE_TEMPLATE/bug_report.md
files_exist - File not found check: .github/ISSUE_TEMPLATE/feature_request.md
files_exist - File not found check: docs/images/nf-core-sarek_logo.png
files_exist - File not found check: .markdownlint.yml
files_exist - File not found check: .yamllint.yml
files_exist - File not found check: .travis.yml
nextflow_config - Config variable found: manifest.name
nextflow_config - Config variable found: manifest.nextflowVersion
nextflow_config - Config variable found: manifest.description
nextflow_config - Config variable found: manifest.version
nextflow_config - Config variable found: manifest.homePage
nextflow_config - Config variable found: timeline.enabled
nextflow_config - Config variable found: trace.enabled
nextflow_config - Config variable found: report.enabled
nextflow_config - Config variable found: dag.enabled
nextflow_config - Config variable found: process.cpus
nextflow_config - Config variable found: process.memory
nextflow_config - Config variable found: process.time
nextflow_config - Config variable found: params.outdir
nextflow_config - Config variable found: params.input
nextflow_config - Config variable found: params.show_hidden_params
nextflow_config - Config variable found: params.schema_ignore_params
nextflow_config - Config variable found: manifest.mainScript
nextflow_config - Config variable found: timeline.file
nextflow_config - Config variable found: trace.file
nextflow_config - Config variable found: report.file
nextflow_config - Config variable found: dag.file
nextflow_config - Config variable (correctly) not found: params.version
nextflow_config - Config variable (correctly) not found: params.nf_required_version
nextflow_config - Config variable (correctly) not found: params.container
nextflow_config - Config variable (correctly) not found: params.singleEnd
nextflow_config - Config variable (correctly) not found: params.igenomesIgnore
nextflow_config - Config variable (correctly) not found: params.name
nextflow_config - Config timeline.enabled had correct value: true
nextflow_config - Config report.enabled had correct value: true
nextflow_config - Config trace.enabled had correct value: true
nextflow_config - Config dag.enabled had correct value: true
nextflow_config - Config manifest.name began with nf-core/
nextflow_config - Config variable manifest.homePage began with https://github.com/nf-core/
nextflow_config - Config dag.file ended with .html
nextflow_config - Config variable manifest.nextflowVersion started with >= or !>=
nextflow_config - Config manifest.version ends in dev: '3.0dev'
nextflow_config - Config params.custom_config_version is set to master
nextflow_config - Config params.custom_config_base is set to https://raw.githubusercontent.com/nf-core/configs/master
nextflow_config - Lines for loading custom profiles found
files_unchanged - .gitattributes matches the template
files_unchanged - .prettierrc.yml matches the template
files_unchanged - CODE_OF_CONDUCT.md matches the template
files_unchanged - LICENSE matches the template
files_unchanged - .github/.dockstore.yml matches the template
files_unchanged - .github/CONTRIBUTING.md matches the template
files_unchanged - .github/ISSUE_TEMPLATE/bug_report.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/config.yml matches the template
files_unchanged - .github/ISSUE_TEMPLATE/feature_request.yml matches the template
files_unchanged - .github/PULL_REQUEST_TEMPLATE.md matches the template
files_unchanged - .github/workflows/branch.yml matches the template
files_unchanged - .github/workflows/linting_comment.yml matches the template
files_unchanged - .github/workflows/linting.yml matches the template
files_unchanged - assets/email_template.html matches the template
files_unchanged - assets/email_template.txt matches the template
files_unchanged - assets/sendmail_template.txt matches the template
files_unchanged - docs/README.md matches the template
files_unchanged - lib/nfcore_external_java_deps.jar matches the template
files_unchanged - lib/NfcoreSchema.groovy matches the template
files_unchanged - .gitignore matches the template
actions_ci - '.github/workflows/ci.yml' is triggered on expected events
actions_ci - '.github/workflows/ci.yml' checks minimum NF version
actions_awstest - '.github/workflows/awstest.yml' is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml is triggered correctly
actions_awsfulltest - .github/workflows/awsfulltest.yml does not use -profile test
readme - README Nextflow minimum version in Quick Start section matched config. README: 21.10.3, Config: 21.10.3
pipeline_name_conventions - Name adheres to nf-core convention
template_strings - Did not find any Jinja template strings (380 files)
schema_lint - Schema lint passed
schema_lint - Schema title + description lint passed
schema_params - Schema matched params returned from nextflow config
actions_schema_validation - Workflow validation passed: awstest.yml
actions_schema_validation - Workflow validation passed: linting.yml
actions_schema_validation - Workflow validation passed: awsfulltest.yml
actions_schema_validation - Workflow validation passed: branch.yml
actions_schema_validation - Workflow validation passed: fix-linting.yml
actions_schema_validation - Workflow validation passed: local_modules.yml
actions_schema_validation - Workflow validation passed: linting_comment.yml
actions_schema_validation - Workflow validation passed: ci.yml
merge_markers - No merge markers found in pipeline files
modules_json - Only installed modules found in modules.json
multiqc_config - 'assets/multiqc_config.yml' follows the ordering scheme of the minimally required plugins.
multiqc_config - 'assets/multiqc_config.yml' contains a matching 'report_comment'.
multiqc_config - 'assets/multiqc_config.yml' contains 'export_plots: true'.

Run details

nf-core/tools version 2.4.1
Run at 2022-06-20 12:58:46

FriederikeHanssen · 2022-06-16T13:19:27Z

@maxulysse I'm still unsure how to deal with known_sites (see the Nextflow Slack). For the rest, I would love your opinion. I am not super happy with prepare_genome.nf, but I couldn't find another workaround. If you know of some Nextflow magic there, I would be all 👂

FriederikeHanssen · 2022-06-17T12:49:53Z

@maxulysse I'm now carefully testing all combinations again, but so far so good.

FriederikeHanssen · 2022-06-17T13:38:36Z

Tested the following and everything works/fails as expected:

BQSR/FiltervariantTranches: with known_sites, just dbsnp, just known_indels
Haplotypecaller: with/without dbsnp
Mutect/Getpileupsummaries: with and without germline_resource
Getpileupsummaries: without intervals uses germline_resource

maxulysse

<3

maxulysse · 2022-06-17T13:40:24Z

Can you do the channel.value magic for all of the reference files?

FriederikeHanssen · 2022-06-17T13:41:57Z

Yes, I should be able to. The only exception is germline_resource_tbi, because I need it to check later whether GetPileupSummaries should be run. I couldn't get anything else to work with Channel.value([]); I tried Channel.ifEmpty, germline_resource_tbi ?: etc.

FriederikeHanssen · 2022-06-17T13:43:08Z

Not for the fasta either: it is mandatory, so it should never be empty (this is also caught by the input validation).
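
The value-channel approach for optional resource files discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the process `SHOW_ARGS` and the variable `dbsnp_arg` are hypothetical, and only the `params.dbsnp` / `Channel.value([])` idea is taken from the conversation.

```nextflow
nextflow.enable.dsl = 2

// Optional resource: null means "not supplied by the user"
params.dbsnp = null

// Hypothetical process, for illustration only
process SHOW_ARGS {
    input:
    path dbsnp          // receives [] when the resource is missing

    output:
    stdout

    script:
    // An empty list is falsy in Groovy, so the flag is simply dropped
    def dbsnp_arg = dbsnp ? "--dbsnp ${dbsnp}" : ""
    """
    echo "tool args: ${dbsnp_arg}"
    """
}

workflow {
    // Value channel with the file if supplied, otherwise an empty list
    dbsnp = params.dbsnp
        ? Channel.fromPath(params.dbsnp).collect()
        : Channel.value([])

    SHOW_ARGS(dbsnp)
    SHOW_ARGS.out.view()
}
```

Because both branches produce a value channel, the process still runs once per sample when the resource is absent, instead of silently not running as it would with an empty queue channel.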

FriederikeHanssen · 2022-06-17T16:02:07Z

Mutect needs fixing

FriederikeHanssen · 2022-06-20T06:51:19Z

Not sure what is going on with mutect2. Locally, the result files are there:

results
├── csv
│   └── variantcalled.csv
├── multiqc
│   ├── multiqc_data
│   │   ├── mqc_bcftools-stats-subtypes_1.txt
│   │   ├── mqc_bcftools_stats_depth_1.txt
│   │   ├── mqc_bcftools_stats_indel-lengths_1.txt
│   │   ├── mqc_bcftools_stats_vqc_Count_Indels.txt
│   │   ├── mqc_bcftools_stats_vqc_Count_SNP.txt
│   │   ├── mqc_bcftools_stats_vqc_Count_Transitions.txt
│   │   ├── mqc_bcftools_stats_vqc_Count_Transversions.txt
│   │   ├── mqc_vcftools_tstv_by_count_1.txt
│   │   ├── mqc_vcftools_tstv_by_qual_1.txt
│   │   ├── multiqc.log
│   │   ├── multiqc_bcftools_stats.txt
│   │   ├── multiqc_citations.txt
│   │   ├── multiqc_data.json
│   │   ├── multiqc_general_stats.txt
│   │   ├── multiqc_sources.txt
│   │   ├── vcftools_tstv_by_count.txt
│   │   └── vcftools_tstv_by_qual.txt
│   ├── multiqc_plots
│   │   ├── pdf
│   │   │   ├── mqc_bcftools-stats-subtypes_1.pdf
│   │   │   ├── mqc_bcftools-stats-subtypes_1_pc.pdf
│   │   │   ├── mqc_bcftools_stats_depth_1.pdf
│   │   │   ├── mqc_bcftools_stats_indel-lengths_1.pdf
│   │   │   ├── mqc_bcftools_stats_vqc_Count_Indels.pdf
│   │   │   ├── mqc_bcftools_stats_vqc_Count_SNP.pdf
│   │   │   ├── mqc_bcftools_stats_vqc_Count_Transitions.pdf
│   │   │   ├── mqc_bcftools_stats_vqc_Count_Transversions.pdf
│   │   │   ├── mqc_vcftools_tstv_by_count_1.pdf
│   │   │   └── mqc_vcftools_tstv_by_qual_1.pdf
│   │   ├── png
│   │   │   ├── mqc_bcftools-stats-subtypes_1.png
│   │   │   ├── mqc_bcftools-stats-subtypes_1_pc.png
│   │   │   ├── mqc_bcftools_stats_depth_1.png
│   │   │   ├── mqc_bcftools_stats_indel-lengths_1.png
│   │   │   ├── mqc_bcftools_stats_vqc_Count_Indels.png
│   │   │   ├── mqc_bcftools_stats_vqc_Count_SNP.png
│   │   │   ├── mqc_bcftools_stats_vqc_Count_Transitions.png
│   │   │   ├── mqc_bcftools_stats_vqc_Count_Transversions.png
│   │   │   ├── mqc_vcftools_tstv_by_count_1.png
│   │   │   └── mqc_vcftools_tstv_by_qual_1.png
│   │   └── svg
│   │       ├── mqc_bcftools-stats-subtypes_1.svg
│   │       ├── mqc_bcftools-stats-subtypes_1_pc.svg
│   │       ├── mqc_bcftools_stats_depth_1.svg
│   │       ├── mqc_bcftools_stats_indel-lengths_1.svg
│   │       ├── mqc_bcftools_stats_vqc_Count_Indels.svg
│   │       ├── mqc_bcftools_stats_vqc_Count_SNP.svg
│   │       ├── mqc_bcftools_stats_vqc_Count_Transitions.svg
│   │       ├── mqc_bcftools_stats_vqc_Count_Transversions.svg
│   │       ├── mqc_vcftools_tstv_by_count_1.svg
│   │       └── mqc_vcftools_tstv_by_qual_1.svg
│   └── multiqc_report.html
├── pipeline_info
│   ├── execution_report_2022-06-20_08-48-17.html
│   ├── execution_timeline_2022-06-20_08-48-17.html
│   ├── execution_trace_2022-06-20_08-48-17.txt
│   ├── pipeline_dag_2022-06-20_08-48-17.html
│   └── software_versions.yml
├── reports
│   ├── bcftools
│   │   └── sample4_vs_sample3.filtered.bcftools_stats.txt
│   └── vcftools
│       ├── sample4_vs_sample3.filtered.FILTER.summary
│       ├── sample4_vs_sample3.filtered.TsTv.count
│       └── sample4_vs_sample3.filtered.TsTv.qual
├── untar
│   └── chromosomes
│       └── chr21.fasta
└── variant_calling
    └── sample4_vs_sample3
        └── mutect2
            ├── sample3.pileupsummaries.table
            ├── sample4.pileupsummaries.table
            ├── sample4_vs_sample3.artifactprior.tar.gz
            ├── sample4_vs_sample3.contamination.table
            ├── sample4_vs_sample3.filtered.vcf.gz
            ├── sample4_vs_sample3.filtered.vcf.gz.filteringStats.tsv
            ├── sample4_vs_sample3.filtered.vcf.gz.tbi
            ├── sample4_vs_sample3.segmentation.table
            ├── sample4_vs_sample3.vcf.gz
            ├── sample4_vs_sample3.vcf.gz.stats
            └── sample4_vs_sample3.vcf.gz.tbi


FriederikeHanssen · 2022-06-20T11:05:30Z

GetPileupSummariesNormal fails with a memory error. This wasn't an issue before because the process wasn't run. Locally this works because the job gets automatically resubmitted with a higher memory request 🥲.
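
The automatic resubmission with a higher memory request mentioned here is typically configured along these lines. This is a hedged sketch of a config fragment, not the configuration from this PR: the `withName` selector, the 4.GB base value, the retry count and the exit codes are all assumptions chosen for illustration.

```nextflow
process {
    // Hypothetical selector; adjust to the actual process name
    withName: '.*GETPILEUPSUMMARIES.*' {
        // Request more memory on each attempt (4 GB, then 8 GB, ...)
        memory        = { 4.GB * task.attempt }
        // Retry only on exit codes that commonly indicate a killed or out-of-memory job
        errorStrategy = { task.exitStatus in [137, 143] ? 'retry' : 'finish' }
        maxRetries    = 2
    }
}
```

On a runner with a fixed memory ceiling, the retry cannot be granted more than the machine has, which would explain why a test that passes locally still fails in CI.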

FriederikeHanssen · 2022-06-20T13:22:58Z

locally it didn't work with 7.5GB of memory. I am afraid that mutect test just won't run on GHA

maxulysse · 2022-06-20T13:27:04Z

> locally it didn't work with 7.5GB of memory. I am afraid that mutect test just won't run on GHA

Too bad, then, let's run that test locally only whenever needed.
I'm assuming we already have input data as small as possible?

FriederikeHanssen · 2022-06-20T13:28:39Z

Yep, Gavin downsampled it as much as he could, but the somatic GATK tools require a certain number of SNPs to run at all. If we reduce it further, they'll fail for other reasons, much further upstream.

Add error message if dbsnp or known_indels is not supplied for bqsr or haplotypecaller

f217e6a

FriederikeHanssen added 6 commits June 16, 2022 12:44

multi-line not supported, use \n instead

2d811ff

Make PON optional

dd54c4c

Make germline_resource optional for mutect2

a49df17

Update getpileup, deal with optional dbsnp, germline, knwon_indels

30b368c

Formatting

f3796c1

Formatting

63e8e7d

FriederikeHanssen added 3 commits June 17, 2022 10:24

Merge remote-tracking branch 'upstream/dev' into gatk_resource

f0805af

Add suggestion on value channels to handle optional input

37043a0

make germline_resource work again

3d34ae7

remove test code

c6210fa

FriederikeHanssen marked this pull request as ready for review June 17, 2022 13:34

FriederikeHanssen requested a review from maxulysse as a code owner June 17, 2022 13:34

add mutect2 no intervals tests

a72ce06

some indents

1d6e363

maxulysse approved these changes Jun 17, 2022


more channel ✨ magic

7fd12ab

maxulysse approved these changes Jun 17, 2022


Merge remote-tracking branch 'upstream/dev' into gatk_resource

17d3d72

FriederikeHanssen added 2 commits June 17, 2022 20:33

add config for mutect2 tests

ae9eb2f

not sure what is going with mutect2

f8a0e12

maxulysse approved these changes Jun 20, 2022


Fix getpileupsummaries output

7216964

maxulysse reviewed Jun 20, 2022


conf/modules.config (outdated)

maxulysse reviewed Jun 20, 2022


conf/modules.config

Update conf/modules.config

ed7c724

Co-authored-by: Maxime U. Garcia <maxime.garcia@scilifelab.se>

FriederikeHanssen added 3 commits June 20, 2022 13:07

Try to request more memory

975e71c

Revert this, everything is red

26d52d4

try .5

2aa1e2f

maxulysse merged commit 1894b2f into nf-core:dev Jun 20, 2022

This was referenced Jun 20, 2022

[BUG] no suitable codecs found #367

Closed

Error in GetPileupSummaries when using --no_intervals option #299

Closed

[BUG] Mutect2 - Error with both 'intervals' and 'no-intervals' options #359

Closed

ameynert mentioned this pull request Mar 3, 2023

Use GATK small_exac_common_3.hg38.vcf.gz as default germline_resource #959

Open

FriederikeHanssen deleted the gatk_resource branch March 3, 2023 12:08
