-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring windows and padding for assembly and genotyping #6358
Conversation
Closes #5676. |
Assigning to @takutosato because it's pertinent to Mutect2 and because I have dumped enough on the engine team lately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is a big win for the entire GATK community
src/main/java/org/broadinstitute/hellbender/engine/spark/AssemblyRegionArgumentCollection.java
Show resolved
Hide resolved
* | ||
* A genotyping region is an interval surrounding an active region in which we perform pair-HMM and Smith-Waterman alignment. Even though | ||
* we ultimately only genotype and call variants within the active region, we call entire haplotypes as an intermediate step, hence the | ||
* needed for an expanded genotyping region. The parameters {@code snpPaddingForGenotyping} and {@code indelPaddingForGenotyping} determine |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think mentioning the defaults here (snp genotyping padding = 25, indel genotyping padding = 75, assembly region padding = 100) would be helpful for the reader (i.e. me)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm nervous about hard-coding numbers -- what if they change? -- but I replaced the {@code }
s with {@link }
s so the reader can quickly navigate to the defaults in the IDE.
@takutosato Back to you. How does the |
* Importantly active regions are excusively a matter of delegating responsibility for calling and have nothing to do with the size | ||
* of the span over which we perform local assembly and pair-HMM. | ||
* in the same active region. The size of active regions are determined by the variants present, {@link AssemblyRegionArgumentCollection.activeProbThreshold}, | ||
* and {@link AssemblyRegionArgumentCollection.maxProbPropagationDistance}, and are bounded by the parameters {@link AssemblyRegionArgumentCollection.minAssemblyRegionSize} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks these links work great
47a90c4
to
0cade93
Compare
@jamesemery Here it is. It will make sense to review this one commit at a time. I have done my best to produce a clean commit history.