Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring windows and padding for assembly and genotyping #6358

Merged
merged 19 commits into from
Feb 5, 2020

Conversation

davidbenjamin
Copy link
Contributor

@jamesemery Here it is. It will make sense to review this one commit at a time. I have done my best to produce a clean commit history.

@davidbenjamin
Copy link
Contributor Author

Closes #5676.

@davidbenjamin
Copy link
Contributor Author

Assigning to @takutosato because it's pertinent to Mutect2 and because I have dumped enough on the engine team lately.

@davidbenjamin davidbenjamin requested review from takutosato and removed request for jamesemery January 30, 2020 16:31
Copy link
Contributor

@takutosato takutosato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is a big win for the entire GATK community

*
* A genotyping region is an interval surrounding an active region in which we perform pair-HMM and Smith-Waterman alignment. Even though
* we ultimately only genotype and call variants within the active region, we call entire haplotypes as an intermediate step, hence the
* needed for an expanded genotyping region. The parameters {@code snpPaddingForGenotyping} and {@code indelPaddingForGenotyping} determine
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mentioning the defaults here (snp genotyping padding = 25, indel genotyping padding = 75, assembly region padding = 100) would be helpful for the reader (i.e. me)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm nervous about hard-coding numbers -- what if they change? -- but I replaced the {@code }s with {@link }s so the reader can quickly navigate to the defaults in the IDE.

@davidbenjamin
Copy link
Contributor Author

@takutosato Back to you. How does the @link compromise suit you?

* Importantly active regions are excusively a matter of delegating responsibility for calling and have nothing to do with the size
* of the span over which we perform local assembly and pair-HMM.
* in the same active region. The size of active regions are determined by the variants present, {@link AssemblyRegionArgumentCollection.activeProbThreshold},
* and {@link AssemblyRegionArgumentCollection.maxProbPropagationDistance}, and are bounded by the parameters {@link AssemblyRegionArgumentCollection.minAssemblyRegionSize}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks these links work great

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants