Create Tests for Mutect2 #148

Closed
lescai opened this issue Mar 6, 2020 · 5 comments
Labels
enhancement New feature or request

Comments

@lescai
Contributor

lescai commented Mar 6, 2020

As agreed with @maxulysse and @cgpu, I am opening this issue to discuss how to perform larger-scale testing for Mutect2 and make sure it runs as expected.
We can discuss here what datasets to use, where to place the data and how to run the tests.

@cgpu

cgpu commented Mar 6, 2020

Some questions that we might have to answer:

  • Will Mutect2 run with the current small reference without complaining?
  • Are we testing only one part of one chromosome?
  • Are we subsampling with something like seqtk to keep very few reads from all chromosomes?
  • Can we subsample the required resource VCF files with something like gatk SelectVariants plus a test-specific intervals list?
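
To make the last question concrete, here is a minimal sketch of what such a subsetting call might look like. It only builds the command line and prints it, it does not run GATK. The `--variant`, `--intervals` and `--output` arguments are standard GATK4 options; the gnomAD resource and `small.intervals` are just plausible inputs for the small test setup, and the helper name and output file name are hypothetical, chosen for illustration.

```python
import shlex


def select_variants_cmd(resource_vcf, intervals, output_vcf):
    """Build (but do not run) a gatk SelectVariants command.

    Restricts resource_vcf to the regions listed in the intervals file.
    """
    return [
        "gatk", "SelectVariants",
        "--variant", resource_vcf,
        "--intervals", intervals,
        "--output", output_vcf,
    ]


cmd = select_variants_cmd(
    "gnomAD.r2.1.1.GRCh38.PASS.AC.AF.only.vcf.gz",  # assumed input resource
    "small.intervals",                              # assumed intervals file
    "germline_resource.small.vcf.gz",               # hypothetical output name
)

# Print a shell-safe version of the command for copy/paste or logging.
print(shlex.join(cmd))
```

The input VCF would need an index next to it for interval-based selection to work, so the subset file would also have to be indexed before being used as a resource.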

@maxulysse
Member

@chelauk this issue might be interesting for you as well

maxulysse added this to the 3.0 milestone Mar 6, 2020
maxulysse added the enhancement New feature or request label Mar 6, 2020
@chelauk
Contributor

chelauk commented Mar 6, 2020

Mutect2 doesn't currently work with the tiny test dataset: it needs a compatible germline resource. I am trying to create one.

@chelauk
Contributor

chelauk commented Mar 9, 2020

I am trying to get sarek to work with the test dataset.
I edited the gnomAD.r2.1.1.GRCh38.PASS.AC.AF.only.vcf.gz resource so that it contains only the variants within the small.intervals file.
With this change, gatk Mutect2 runs successfully.
gatk GatherPileupSummaries also runs, but with the test dataset it generates an empty table.
The empty table breaks the CalculateContamination step with an I/O problem. I was wondering how to get around this. We could add a test that skips the filtration steps, but then the filtration steps would go untested.
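
One way to spot the empty-table case before it reaches CalculateContamination is to count the data rows in the pileup summary table. The sketch below is only a diagnostic, not a fix. It assumes the table consists of optional metadata lines starting with `#`, then one column header line, then one row per site. That layout, the function name and the sample rows are all assumptions made for illustration.

```python
def count_pileup_rows(table_text):
    """Count data rows in a pileup summary table.

    Assumed layout: lines starting with '#' are metadata, the first
    remaining line is the column header, everything after it is data.
    """
    rows = 0
    seen_header = False
    for line in table_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and metadata
        if not seen_header:
            seen_header = True  # first non-comment line is the header
            continue
        rows += 1
    return rows


# Made-up example tables (values are illustrative, not real output).
EMPTY_TABLE = (
    "#<METADATA>SAMPLE=tumor\n"
    "contig\tposition\tref_count\talt_count\tother_alt_count\tallele_frequency\n"
)
FILLED_TABLE = EMPTY_TABLE + "chr1\t12345\t30\t2\t0\t0.05\n"

for name, text in [("empty", EMPTY_TABLE), ("filled", FILLED_TABLE)]:
    print(name, count_pileup_rows(text))
```

A test could use a check like this to report clearly that the germline resource and the test reads share no usable sites, instead of surfacing as an I/O error further downstream.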

@maxulysse maxulysse mentioned this issue Mar 19, 2020
7 tasks
@maxulysse
Member

Done by: nf-core/test-datasets#340


4 participants