Add step markduplicates & allow BAM input for all steps #536

Merged
merged 69 commits into from
May 12, 2022
Changes from 14 commits
Commits
d399642
Add step markduplicates
FriederikeHanssen May 3, 2022
b156dd7
Add step md tests, need to update prepare_recal input [skip actions]
FriederikeHanssen May 4, 2022
0c7e618
Add tests to CI [skip actions]
FriederikeHanssen May 4, 2022
a1250da
Add first functioning version of md step
FriederikeHanssen May 4, 2022
e4d7037
Add option to start prepare_recal with bam or cram
FriederikeHanssen May 5, 2022
b3d9cce
Add bam test for step recal & simplify other tests
FriederikeHanssen May 5, 2022
99801a9
going places with cram input for md step & -skip-tool md
FriederikeHanssen May 5, 2022
7b0c2aa
Merge branch 'dev' into step_md
FriederikeHanssen May 5, 2022
8da6cbf
update tests
FriederikeHanssen May 5, 2022
0f797da
Merge remote-tracking branch 'origin/step_md' into step_md
FriederikeHanssen May 5, 2022
74a5e29
remove view() and such
FriederikeHanssen May 5, 2022
4622f1c
update subway map with step md entry and bam/cram entry
FriederikeHanssen May 5, 2022
782ac0c
update changelog
FriederikeHanssen May 5, 2022
d89040a
Update CHANGELOG.md
FriederikeHanssen May 6, 2022
76959cd
reduce size of subway map
FriederikeHanssen May 6, 2022
d37a9de
Rename csv file after comment from code review
FriederikeHanssen May 6, 2022
7af97d5
Use bam_qc workflow for cram and bam, qc will always be run on cram files
FriederikeHanssen May 6, 2022
93ffdbc
remove TODOs and unused code
FriederikeHanssen May 6, 2022
2adff87
Typo
FriederikeHanssen May 6, 2022
f8d136d
rename test files
FriederikeHanssen May 6, 2022
00a687b
Update subway map
FriederikeHanssen May 6, 2022
866c67e
try to fix svg size
FriederikeHanssen May 6, 2022
6858f4f
Fix background color
FriederikeHanssen May 6, 2022
a6ea44f
resize svg
FriederikeHanssen May 6, 2022
ccef688
use a4 page for png
FriederikeHanssen May 6, 2022
05f427a
Fix mix statement & remove printlns
FriederikeHanssen May 6, 2022
435301e
Fix naming of output files
FriederikeHanssen May 6, 2022
e26be81
some working version with cramtobam
FriederikeHanssen May 9, 2022
3f386b5
replace samtools bam2cram with convert
FriederikeHanssen May 9, 2022
0caa0bf
only create output md csv if not step == md
FriederikeHanssen May 9, 2022
5200eff
actually add new files
FriederikeHanssen May 9, 2022
d38af5c
Actually publish file IF md is run
FriederikeHanssen May 9, 2022
987b5a6
Fix interval path
FriederikeHanssen May 10, 2022
fe53e6f
Fix more file paths
FriederikeHanssen May 10, 2022
67a0ec4
fix conditional for bam and cram combination
FriederikeHanssen May 10, 2022
b5117c3
Fix empty list input with empty channel
FriederikeHanssen May 10, 2022
96dc83c
csv file for md are not produced if starting step is markduplicates, …
FriederikeHanssen May 10, 2022
3441bc0
Fix channel content by removing bai
FriederikeHanssen May 10, 2022
2ebad14
rename tag to skip subworkflow tests until fixed
FriederikeHanssen May 10, 2022
07b96b6
fix syntax
FriederikeHanssen May 10, 2022
4bed666
Move samtools stats to bam_to_qc wf, separate cram for restart and pr…
FriederikeHanssen May 10, 2022
7e5a366
update tests
FriederikeHanssen May 10, 2022
502a1b6
Complete step starts from bam and cram
FriederikeHanssen May 10, 2022
f59dfce
Swap csv writing methods, table should be written for bqsr and not md
FriederikeHanssen May 10, 2022
fe0db1e
fix import
FriederikeHanssen May 10, 2022
f4f8107
fix channel name
FriederikeHanssen May 10, 2022
68dbf72
correct test file name
FriederikeHanssen May 10, 2022
62d1e68
fix mix
FriederikeHanssen May 10, 2022
038be43
update publishing
FriederikeHanssen May 10, 2022
d649af2
fix name again
FriederikeHanssen May 10, 2022
082348c
fix null output of samtools.stats
FriederikeHanssen May 11, 2022
3ec4078
complete should_exits:false for test
FriederikeHanssen May 11, 2022
24f5f05
add all files
FriederikeHanssen May 11, 2022
bdb8869
add all files
FriederikeHanssen May 12, 2022
10af05b
Fix tests & add tools strelka to make sure channel magic works
FriederikeHanssen May 12, 2022
9b807d6
Fix skip markduplicates channel magic
FriederikeHanssen May 12, 2022
a92722a
Correct channel name
FriederikeHanssen May 12, 2022
370602f
fix recalibrate test
FriederikeHanssen May 12, 2022
3d3c81b
fix channel name
FriederikeHanssen May 12, 2022
b9a2ede
channel mapping
FriederikeHanssen May 12, 2022
11633dd
fix csv writing
FriederikeHanssen May 12, 2022
15dd172
fix naming AGAIN
FriederikeHanssen May 12, 2022
84bdbfe
Fix output VC name
FriederikeHanssen May 12, 2022
a5d13c5
Fix process name
FriederikeHanssen May 12, 2022
560977a
remove TODO
FriederikeHanssen May 12, 2022
a8c717b
remove unused TODO statements
FriederikeHanssen May 12, 2022
a322b33
update docs about step markduplicates
FriederikeHanssen May 12, 2022
a6c11f1
update docs with bam/cram input options
FriederikeHanssen May 12, 2022
e044d9b
update changelog
FriederikeHanssen May 12, 2022
3 changes: 3 additions & 0 deletions .github/workflows/ci.yml
@@ -40,9 +40,12 @@ jobs:
- "gatk4_spark"
- "haplotypecaller"
- "manta"
- "markduplicates"
- "mutect2"
- "msisensorpro"
# - 'save_bam_mapped'
- "prepare_recalibration"
- "recalibrate"
- "variantcalling_channel"
- "skip_markduplicates"
- "strelka"
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -23,6 +23,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [#512](https://github.com/nf-core/sarek/pull/512), [#531](https://github.com/nf-core/sarek/pull/531), [#537](https://github.com/nf-core/sarek/pull/537) - Subway map for pipeline
- [#522](https://github.com/nf-core/sarek/pull/522) - Add QC for vcf files & MultiQC
- [#533](https://github.com/nf-core/sarek/pull/533) - Add param `--only_paired_variant_calling` to allow skipping of germline variantcalling for paired samples
- [#536](https://github.com/nf-core/sarek/pull/536) - Add `--step markduplicates` to start from duplicate marking, `--step prepare_recalibration` now ONLY starts at process `BaseRecalibrator` & adding `bam` and `cram` input support for `--step` `markduplicates`, `prepare_recalibration` and `recalibrate`
- [#538](https://github.com/nf-core/sarek/pull/538) - Add param `--seq_platform`, default: `ILLUMINA`

### Changed
4 changes: 2 additions & 2 deletions conf/modules.config
@@ -91,7 +91,7 @@ process {
}

withName: 'TABIX_DBSNP' {
- ext.when = { !params.dbsnp_tbi && params.dbsnp && (params.step == "mapping" || params.step == "prepare_recalibration") || params.tools && (params.tools.contains('controlfreec') || params.tools.contains('haplotypecaller') || params.tools.contains('mutect2')) }
+ ext.when = { !params.dbsnp_tbi && params.dbsnp && (params.step == "mapping" || params.step == "markduplicates" || params.step == "prepare_recalibration") || params.tools && (params.tools.contains('controlfreec') || params.tools.contains('haplotypecaller') || params.tools.contains('mutect2')) }
publishDir = [
enabled: params.save_reference,
mode: params.publish_dir_mode,
@@ -111,7 +111,7 @@ process {
}

withName: 'TABIX_KNOWN_INDELS' {
- ext.when = { !params.known_indels_tbi && params.known_indels && (params.step == 'mapping' || params.step == 'prepare_recalibration') }
+ ext.when = { !params.known_indels_tbi && params.known_indels && (params.step == 'mapping' || params.step == "markduplicates" || params.step == 'prepare_recalibration') }
publishDir = [
enabled: params.save_reference,
mode: params.publish_dir_mode,
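The new `TABIX_DBSNP` condition is easier to read once it is grouped by operator precedence: in Groovy `&&` binds tighter than `||`, so the closure is "(no index supplied AND dbsnp set AND an early step) OR (a dbsnp-using tool requested)". The sketch below mirrors that grouping in Python for illustration only. The function name and argument names are hypothetical, and treating `tools` as a comma-separated string is an assumption.

```python
def tabix_dbsnp_when(dbsnp, dbsnp_tbi, step, tools):
    """Illustrative mirror of the updated TABIX_DBSNP `ext.when` closure.

    Assumption: `tools` is a comma-separated string (or None), so the
    `in` checks below are substring checks, like String.contains in Groovy.
    """
    early_steps = ("mapping", "markduplicates", "prepare_recalibration")

    # Left operand of `||`: index missing, dbsnp given, and an early step.
    needed_for_step = (not dbsnp_tbi) and bool(dbsnp) and step in early_steps

    # Right operand of `||`: any of the three listed tools was requested.
    needed_for_tools = bool(tools) and any(
        tool in tools for tool in ("controlfreec", "haplotypecaller", "mutect2")
    )

    return needed_for_step or needed_for_tools
```

Read this way, the tools branch is evaluated independently of `dbsnp` and `dbsnp_tbi`; only the step branch checks them.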
18 changes: 17 additions & 1 deletion conf/test.config
@@ -55,10 +55,26 @@ profiles {
pair {
params.input = "${baseDir}/tests/csv/3.0/fastq_pair.csv"
}
prepare_recalibration {
markduplicates_bam {
params.input = "${baseDir}/tests/csv/3.0/mapped_single.csv"
params.step = 'markduplicates'
}
markduplicates_cram {
params.input = "${baseDir}/tests/csv/3.0/mapped_single_cram.csv"
params.step = 'markduplicates'
}
prepare_recalibration_cram {
params.input = "${baseDir}/tests/csv/3.0/mapped_single_cram.csv"
params.step = 'prepare_recalibration'
}
prepare_recalibration_bam {
params.input = "${baseDir}/tests/csv/3.0/mapped_single.csv"
params.step = 'prepare_recalibration'
}
recalibrate {
params.input = "${baseDir}/tests/csv/3.0/prepare_recalibration_single_bam.csv"
params.step = 'recalibrate'
}
save_bam_mapped {
params.save_bam_mapped = true
}
Binary file modified docs/images/sarek_subway.png
3,878 changes: 1,962 additions & 1,916 deletions docs/images/sarek_subway.svg
1 change: 1 addition & 0 deletions nextflow_schema.json
@@ -20,6 +20,7 @@
"help_text": "Only one step",
"enum": [
"mapping",
"markduplicates",
"prepare_recalibration",
"recalibrate",
"variant_calling",
4 changes: 2 additions & 2 deletions subworkflows/nf-core/bam_to_cram.nf

Some generated files are not rendered by default.

39 changes: 39 additions & 0 deletions subworkflows/nf-core/mapping_cram_qc.nf

Some generated files are not rendered by default.

2 changes: 2 additions & 0 deletions tests/csv/3.0/mapped_single_cram.csv
@@ -0,0 +1,2 @@
patient,status,sample,cram,crai
test,0,test,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.sorted.cram.crai
2 changes: 2 additions & 0 deletions tests/csv/3.0/prepare_recalibration_single_bam.csv
@@ -0,0 +1,2 @@
patient,status,sample,bam,bai,table
test1,0,test1,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/gatk/test.baserecalibrator.table
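The two new samplesheets show the header shapes this PR exercises: `cram,crai` or `bam,bai` for the alignment, plus a `table` column when starting at `recalibrate`. As a rough sketch of that convention, the helper below checks a header against a step. The required-column rules are an assumption inferred only from these test files, not taken from the pipeline's own input validation, and `check_samplesheet_header` is a hypothetical name.

```python
import csv
import io

# Assumed requirements, inferred from the test samplesheets in this PR.
BASE_COLUMNS = {"patient", "status", "sample"}
ALIGNMENT_PAIRS = ({"bam", "bai"}, {"cram", "crai"})


def check_samplesheet_header(csv_text, step):
    """Return 'bam' or 'cram' if the header looks usable for `step`."""
    header = set(next(csv.reader(io.StringIO(csv_text))))

    missing = BASE_COLUMNS - header
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

    # Exactly one alignment/index pair must be fully present.
    present = [pair for pair in ALIGNMENT_PAIRS if pair <= header]
    if len(present) != 1:
        raise ValueError("expected exactly one of bam+bai or cram+crai")

    # Starting at recalibration additionally needs the recalibration table.
    if step == "recalibrate" and "table" not in header:
        raise ValueError("step 'recalibrate' needs a 'table' column")

    return "bam" if "bam" in present[0] else "cram"
```

Extra columns such as `gender` pass through unchanged, since only the presence of the required ones is checked.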
5 changes: 2 additions & 3 deletions tests/csv/3.0/recalibrated.csv
@@ -1,6 +1,5 @@
patient,gender,status,sample,cram,crai
test,XX,0,sample1,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram.crai
test2,XX,0,sample2,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram.crai
test2,XX,1,sample3,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram.crai
test1,XX,1,sample2,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram.crai
test3,XX,0,sample3,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test.paired_end.recalibrated.sorted.cram.crai
test3,XX,1,sample4,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram.crai
test3,XX,1,sample5,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram,https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/illumina/cram/test2.paired_end.recalibrated.sorted.cram.crai
23 changes: 23 additions & 0 deletions tests/test_markduplicates.yml
@@ -0,0 +1,23 @@
- name: Run Mark Duplicates
command: nextflow run main.nf -profile test,markduplicates,docker
tags:
- bam
- markduplicates
- preprocessing
files:
- path: results/multiqc
- path: results/preprocessing/test/markduplicates/test.md.cram
- path: results/preprocessing/test/markduplicates/test.md.cram.crai
- path: results/preprocessing/test/recal_table/test.recal.table
- path: results/preprocessing/test/recalibrated/test.recal.cram
- path: results/preprocessing/test/recalibrated/test.recal.cram.crai
- path: results/preprocessing/csv/markduplicates.csv
- path: results/preprocessing/csv/markduplicates_test.csv
- path: results/preprocessing/csv/markduplicates_no_table.csv
- path: results/preprocessing/csv/markduplicates_no_table_test.csv
- path: results/preprocessing/csv/recalibrated.csv
- path: results/preprocessing/csv/recalibrated_test.csv
- path: results/reports/qualimap/test/test.mapped
- path: results/reports/qualimap/test/test.recal
- path: results/reports/samtools_stats/test/test.md.cram.stats
- path: results/reports/samtools_stats/test/test.recal.cram.stats
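Each `files:` entry in this test definition is a path that must exist after the command has run. The sketch below illustrates that kind of check in plain Python. It is not the test runner's implementation, `missing_outputs` is a hypothetical helper, and the temporary directory only simulates a run that produced two of three expected outputs.

```python
from pathlib import Path
import tempfile


def missing_outputs(workdir, expected_paths):
    """Return the expected paths that do not exist under `workdir`."""
    root = Path(workdir)
    return [p for p in expected_paths if not (root / p).exists()]


# A subset of the paths listed in tests/test_markduplicates.yml.
expected = [
    "results/multiqc",
    "results/preprocessing/test/markduplicates/test.md.cram",
    "results/preprocessing/csv/markduplicates.csv",
]

with tempfile.TemporaryDirectory() as tmp:
    # Simulate a run that wrote everything except the CSV.
    (Path(tmp) / "results/multiqc").mkdir(parents=True)
    cram = Path(tmp) / expected[1]
    cram.parent.mkdir(parents=True)
    cram.touch()

    still_missing = missing_outputs(tmp, expected)
```

A non-empty `still_missing` list corresponds to a failing test case, since every listed path is expected to be present.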
29 changes: 19 additions & 10 deletions tests/test_prepare_recalibration.yml
@@ -1,22 +1,31 @@
- name: Run Prepare_recal
command: nextflow run main.nf -profile test,prepare_recalibration,docker
- name: Run Prepare_recal starting from bam
command: nextflow run main.nf -profile test,prepare_recalibration_bam,docker
tags:
- bam
- prepare_recalibration
- preprocessing
files:
- path: results/multiqc
- path: results/preprocessing/test/recal_table/test.recal.table
- path: results/preprocessing/test/recalibrated/test.recal.cram
- path: results/preprocessing/test/recalibrated/test.recal.cram.crai
- path: results/preprocessing/csv/recalibrated.csv
- path: results/preprocessing/csv/recalibrated_test.csv
- path: results/reports/qualimap/test/test.recal
- path: results/reports/samtools_stats/test/test.recal.cram.stats

- name: Run Prepare_recal starting from cram
command: nextflow run main.nf -profile test,prepare_recalibration_cram,docker
tags:
- cram
- prepare_recalibration
- preprocessing
files:
- path: results/multiqc
- path: results/preprocessing/test/markduplicates/test.md.cram
- path: results/preprocessing/test/markduplicates/test.md.cram.crai
- path: results/preprocessing/test/recal_table/test.recal.table
- path: results/preprocessing/test/recalibrated/test.recal.cram
- path: results/preprocessing/test/recalibrated/test.recal.cram.crai
- path: results/preprocessing/csv/markduplicates.csv
- path: results/preprocessing/csv/markduplicates_test.csv
- path: results/preprocessing/csv/markduplicates_no_table.csv
- path: results/preprocessing/csv/markduplicates_no_table_test.csv
- path: results/preprocessing/csv/recalibrated.csv
- path: results/preprocessing/csv/recalibrated_test.csv
- path: results/reports/qualimap/test/test.mapped
- path: results/reports/qualimap/test/test.recal
- path: results/reports/samtools_stats/test/test.md.cram.stats
- path: results/reports/samtools_stats/test/test.recal.cram.stats
12 changes: 12 additions & 0 deletions tests/test_recalibrate.yml
@@ -0,0 +1,12 @@
- name: Run Recalibration starting from bam
command: nextflow run main.nf -profile test,recalibrate,docker
tags:
- bam
- recalibrate
- preprocessing
files:
- path: results/multiqc
- path: results/preprocessing/test/recalibrated/test.recal.cram
- path: results/preprocessing/test/recalibrated/test.recal.cram.crai
- path: results/reports/qualimap/test/test.recal
- path: results/reports/samtools_stats/test/test.recal.cram.stats