Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vardictjava variant calling workflow #1582

Open
wants to merge 18 commits into
base: dev
Choose a base branch
from

Conversation

eolaniru
Copy link

@eolaniru eolaniru commented Jul 3, 2024

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/sarek branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nf-test test tests/ --verbose --profile +docker).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

Copy link

github-actions bot commented Jul 3, 2024

This PR is against the master branch ❌

  • Do not close this PR
  • Click Edit and change the base to dev
  • This CI test will remain failed until you push a new commit

Hi @eolaniru,

It looks like this pull-request is has been made against the eolaniru/sarek master branch.
The master branch on nf-core repositories should always contain code from the latest release.
Because of this, PRs to master are only allowed if they come from the eolaniru/sarek dev branch.

You do not need to close this PR, you can change the target branch to dev by clicking the "Edit" button at the top of this page.
Note that even after this, the test will continue to show as failing until you push a new commit.

Thanks again for your contribution!

@eolaniru eolaniru changed the base branch from master to dev July 3, 2024 16:27
Copy link

github-actions bot commented Jul 3, 2024

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 7bf9e60

+| ✅ 200 tests passed       |+
#| ❔  12 tests were ignored |#
!| ❗   3 tests had warnings |!

❗ Test warnings:

  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

❔ Tests ignored:

  • files_exist - File is ignored: .github/workflows/awsfulltest.yml
  • files_exist - File is ignored: .github/workflows/awstest.yml
  • files_exist - File is ignored: conf/modules.config
  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md
  • files_unchanged - File ignored due to lint config: assets/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_light.png
  • files_unchanged - File ignored due to lint config: docs/images/nf-core-sarek_logo_dark.png
  • files_unchanged - File ignored due to lint config: .gitignore or .prettierignore
  • actions_ci - actions_ci
  • actions_awstest - 'awstest.yml' workflow not found: /home/runner/work/sarek/sarek/.github/workflows/awstest.yml
  • template_strings - template_strings
  • modules_config - modules_config

✅ Tests passed:

Run details

  • nf-core/tools version 2.14.1
  • Run at 2024-07-05 15:53:12

@maxulysse
Copy link
Member

it seems that vardictjava was modified locally.
Can you either patch it, or modify upstream and update?

@eolaniru
Copy link
Author

eolaniru commented Jul 3, 2024

Hi @maxulysse I'm not sure I understand. Do you mean vardictjava module? I created a mulled container to combine it with htslib

@maxulysse
Copy link
Member

you did that in your PR only?

You should do that in the modules repo instead

@eolaniru
Copy link
Author

eolaniru commented Jul 3, 2024

oh i see, I understand now. I've used seqera container. Is that acceptable? Otherwise, I can wait for when docker and singularity imgs are available - i've added it to the multi-package-container

@maxulysse
Copy link
Member

oh i see, I understand now. I've used seqera container. Is that acceptable? Otherwise, I can wait for when docker and singularity imgs are available - i've added it to the multi-package-container

let's create the PR first, and we'll discuss this there, and I'll ping more people

@maxulysse maxulysse added this to the 3.5 milestone Jul 15, 2024
@FriederikeHanssen FriederikeHanssen removed this from the 3.5 milestone Aug 19, 2024
Copy link
Contributor

@FriederikeHanssen FriederikeHanssen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really great, apologies for the delay in reviewing. I mostly have documentation comments: There is a section in the usage.md, that lists all tools and whether they are suited for WES,pnal,WGS, and germline, tumor only and paired. It would be great if you could add an entry there.

@@ -111,8 +111,8 @@
"type": "string",
"fa_icon": "fas fa-toolbox",
"description": "Tools to use for duplicate marking, variant calling and/or for annotation.",
"help_text": "Multiple tools separated with commas.\n\n**Variant Calling:**\n\nGermline variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: DeepVariant, FreeBayes, GATK HaplotypeCaller, mpileup, Sentieon Haplotyper, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit\n\nTumor-only somatic variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, mpileup, Mutect2, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit, ControlFREEC\n\nSomatic variant calling can currently only be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, Mutect2, Strelka\n- Structural variants: Manta, TIDDIT\n- Copy-Number: ASCAT, CNVKit, Control-FREEC\n- Microsatellite Instability: MSIsensorpro\n\n> **NB** Mutect2 for somatic variant calling cannot be combined with `--no_intervals`\n\n**Annotation:**\n \n- snpEff, VEP, merge (both consecutively), and bcftools annotate (needs `--bcftools_annotation`).\n\n> **NB** As Sarek will use bgzip and tabix to compress and index VCF files annotated, it expects VCF files to be sorted when starting from `--step annotate`.",
"pattern": "^((ascat|bcfann|cnvkit|controlfreec|deepvariant|freebayes|haplotypecaller|sentieon_dnascope|sentieon_haplotyper|manta|merge|mpileup|msisensorpro|mutect2|ngscheckmate|sentieon_dedup|snpeff|strelka|tiddit|vep)?,?)*(?<!,)$"
"help_text": "Multiple tools separated with commas.\n\n**Variant Calling:**\n\nGermline variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: DeepVariant, FreeBayes, GATK HaplotypeCaller, mpileup, Sentieon Haplotyper, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit\n\nTumor-only somatic variant calling can currently be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, mpileup, Mutect2, Strelka\n- Structural Variants: Manta, TIDDIT\n- Copy-number: CNVKit, ControlFREEC\n\nSomatic variant calling can currently only be performed with the following variant callers:\n- SNPs/Indels: FreeBayes, Mutect2, Strelka\n- Structural variants: Manta, TIDDIT\n- Copy-Number: ASCAT, CNVKit, Control-FREEC\n- Microsatellite Instability: MSIsensorpro\n\n> **NB** Mutect2 for somatic variant calling cannot be combined with `--no_intervals`\n\n**Annotation:**\n \n- snpEff, VEP, merge (both consecutively), VarDictJava and bcftools annotate (needs `--bcftools_annotation`).\n\n> **NB** As Sarek will use bgzip and tabix to compress and index VCF files annotated, it expects VCF files to be sorted when starting from `--step annotate`.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you indicate here in the help text for what vardict is used?

BAM_VARIANT_CALLING_SINGLE_VARDICTJAVA(
cram,
dict,
fasta, // TODO CHECK Do I need to remap fasta and fasta_fai to match module?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes you generally need to or you end up with a path value is null error

Comment on lines +23 to +40
cram
.branch {meta, cram, crai ->
bam: cram.extension == "bam"
cram: cram.extension == "cram"}
.set{ch_bam_from_cram}

CRAM_TO_BAM(
ch_bam_from_cram.cram,
fasta,
fasta_fai
)

// Combine converted bam, bai and intervals
ch_bam_from_cram.bam
.mix(CRAM_TO_BAM.out.bam.join(CRAM_TO_BAM.out.bai, failOnDuplicate: true, failOnMismatch: true))
.combine(intervals)
.map{meta, bam, bai, intervals, num_intervals -> [ meta + [ num_intervals:num_intervals ], bam, bai, intervals ]}
.set{ ch_vardict_input}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no cram support 😭 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maxulysse we should refactor this to only to cram -> bam once if tools like cnvkit and vardict are both selected.

@eolaniru
Copy link
Author

eolaniru commented Oct 7, 2024

Thanks @FriederikeHanssen. Apologies for the delay, was away and just catching up on my emails. I'll action this shortly

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants