vardictjava variant calling workflow #1582
base: dev
This PR is against the
it seems that
Hi @maxulysse, I'm not sure I understand. Do you mean the vardictjava module? I created a mulled container to combine it with htslib.
Did you do that in your PR only? You should do that in the modules repo instead.
Oh I see, I understand now. I've used a Seqera container. Is that acceptable? Otherwise, I can wait until the Docker and Singularity images are available; I've added it to the multi-package-container.
Let's create the PR first, and we'll discuss this there, and I'll ping more people.
This looks really great, apologies for the delay in reviewing. I mostly have documentation comments: there is a section in usage.md that lists all tools and whether they are suited for WES, panel, and WGS data, and for germline, tumor-only, and paired analysis. It would be great if you could add an entry there.
@@ -111,8 +111,8 @@
 "type": "string",
 "fa_icon": "fas fa-toolbox",
 "description": "Tools to use for duplicate marking, variant calling and/or for annotation.",
-"help_text": "Multiple tools separated with commas.\n\n**Variant Calling:**\n\nGermline variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: DeepVariant, FreeBayes, GATK HaplotypeCaller, mpileup, Sentieon Haplotyper, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit\n\nTumor-only somatic variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, mpileup, Mutect2, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit, ControlFREEC\n\nSomatic variant calling can currently only be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, Mutect2, Strelka\n- Structural variants: Manta, TIDDIT\n- Copy-Number: ASCAT, CNVKit, Control-FREEC\n- Microsatellite Instability: MSIsensorpro\n\n> **NB** Mutect2 for somatic variant calling cannot be combined with `--no_intervals`\n\n**Annotation:**\n \n- snpEff, VEP, merge (both consecutively), and bcftools annotate (needs `--bcftools_annotation`).\n\n> **NB** As Sarek will use bgzip and tabix to compress and index VCF files annotated, it expects VCF files to be sorted when starting from `--step annotate`.",
-"pattern": "^((ascat|bcfann|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|sentieon_dnascope|sentieon_haplotyper|manta|merge|mpileup|msisensorpro|mutect2|ngscheckmate|sentieon_dedup|snpeff|strelka|tiddit|vep)?,?)*(?<!,)$"
+"help_text": "Multiple tools separated with commas.\n\n**Variant Calling:**\n\nGermline variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: DeepVariant, FreeBayes, GATK HaplotypeCaller, mpileup, Sentieon Haplotyper, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit\n\nTumor-only somatic variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, mpileup, Mutect2, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit, ControlFREEC\n\nSomatic variant calling can currently only be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, Mutect2, Strelka\n- Structural variants: Manta, TIDDIT\n- Copy-Number: ASCAT, CNVKit, Control-FREEC\n- Microsatellite Instability: MSIsensorpro\n\n> **NB** Mutect2 for somatic variant calling cannot be combined with `--no_intervals`\n\n**Annotation:**\n \n- snpEff, VEP, merge (both consecutively), VarDictJava and bcftools annotate (needs `--bcftools_annotation`).\n\n> **NB** As Sarek will use bgzip and tabix to compress and index VCF files annotated, it expects VCF files to be sorted when starting from `--step annotate`.",
Can you indicate here in the help text what VarDict is used for?
BAM_VARIANT_CALLING_SINGLE_VARDICTJAVA(
    cram,
    dict,
    fasta, // TODO CHECK Do I need to remap fasta and fasta_fai to match module?
Yes, you generally need to, or you end up with a `path value is null` error.
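As a rough illustration of the remapping discussed above, the sketch below converts a reference channel between a `[ meta, path ]` tuple and a bare path, so that its shape matches whatever the module's `input:` block declares. The channel names, the `id` values, and the file name `genome.fasta` are hypothetical; which of the two shapes the vardictjava module actually expects should be checked against the module itself.

```nextflow
// Minimal sketch; channel names and file names are assumptions, not the pipeline's own.
workflow {
    // A reference channel in the [ meta, path ] shape
    ch_fasta_with_meta = Channel.of([ [id: 'genome'], file('genome.fasta') ])

    // Case 1: the module input is declared as a bare `path fasta`,
    // so drop the meta map before passing the channel in.
    ch_fasta_path = ch_fasta_with_meta.map { meta, fasta -> fasta }

    // Case 2: the module input is declared as `tuple val(meta), path(fasta)`,
    // so wrap a bare path back into a tuple with a meta map.
    ch_fasta_tuple = ch_fasta_path.map { fasta -> [ [id: fasta.baseName], fasta ] }

    ch_fasta_path.view { item -> "bare path: ${item}" }
    ch_fasta_tuple.view { item -> "tuple: ${item}" }
}
```

The same `.map` pattern would apply to `fasta_fai`.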
cram
    .branch {meta, cram, crai ->
        bam: cram.extension == "bam"
        cram: cram.extension == "cram"}
    .set{ch_bam_from_cram}

CRAM_TO_BAM(
    ch_bam_from_cram.cram,
    fasta,
    fasta_fai
)

// Combine converted bam, bai and intervals
ch_bam_from_cram.bam
    .mix(CRAM_TO_BAM.out.bam.join(CRAM_TO_BAM.out.bai, failOnDuplicate: true, failOnMismatch: true))
    .combine(intervals)
    .map{meta, bam, bai, intervals, num_intervals -> [ meta + [ num_intervals:num_intervals ], bam, bai, intervals ]}
    .set{ ch_vardict_input}
no cram support 😭 ?
@maxulysse we should refactor this to do cram -> bam only once if tools like cnvkit and vardict are both selected.
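One possible shape for that refactor, sketched under assumptions: decide up front whether any selected tool needs BAM input, split the alignments by extension once, and share the converted channel between callers. The `bam_only_tools` list, the sample names, and the `tools` string are hypothetical stand-ins, and the actual `CRAM_TO_BAM` call is only indicated in a comment.

```nextflow
// Sketch only: tool list, sample names, and channel contents are assumptions.
workflow {
    def tools = 'cnvkit,vardictjava'                  // stand-in for params.tools
    def bam_only_tools = ['cnvkit', 'vardictjava']    // hypothetical list of callers without CRAM support
    def needs_bam = tools.tokenize(',').any { tool -> tool in bam_only_tools }

    ch_input = Channel.of(
        [ [id: 'sample1'], file('sample1.cram'), file('sample1.cram.crai') ],
        [ [id: 'sample2'], file('sample2.bam'),  file('sample2.bam.bai') ]
    )

    // Split once by extension, mirroring the branch in the subworkflow above
    ch_split = ch_input.branch { meta, aln, index ->
        bam:  aln.extension == 'bam'
        cram: aln.extension == 'cram'
    }

    // In the pipeline, CRAM_TO_BAM would run here a single time when needs_bam is true;
    // its output, mixed with ch_split.bam, would then feed every BAM-only caller.
    if (needs_bam) {
        ch_split.cram.view { row -> "convert once: ${row[1].name}" }
        ch_split.bam.view  { row -> "already BAM: ${row[1].name}" }
    }
}
```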
Thanks @FriederikeHanssen. Apologies for the delay, I was away and am just catching up on my emails. I'll action this shortly.
PR checklist
- `nf-core lint`
- `nf-test test tests/ --verbose --profile +docker`
- `nextflow run . -profile debug,test,docker --outdir <OUTDIR>`
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).