-
Notifications
You must be signed in to change notification settings - Fork 596
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
properly merge AD values when no PLs #7836
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a unit test as well? Likely you can add/modify an existing.
} else if (GenotypeGVCFsEngine.excludeFromAnnotations(g)) { | ||
genotypeBuilder.alleles(Collections.nCopies(ploidy, Allele.NO_CALL)); | ||
} | ||
if (g.hasAD()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this up instead so the code is less branched/duplicative… and you've also lost the else-if case when we have ADs. like
if (g.hasPL() || g.hasAD()) {
# do common stuff (like perSampleIndexesOfRelevantAlleles)
if (g.hasPL() {
# do PL only stuff
}
if (g.hasAD()) {
# do AD stuff
}
} else if (GenotypeGVCFsEngine.excludeFromAnnotations(g)) {
# as is
}
...ava/org/broadinstitute/hellbender/tools/walkers/ReferenceConfidenceVariantContextMerger.java
Outdated
Show resolved
Hide resolved
} else if (GenotypeGVCFsEngine.excludeFromAnnotations(g)) { | ||
genotypeBuilder.alleles(Collections.nCopies(ploidy, Allele.NO_CALL)); | ||
} | ||
if (g.hasAD()) { | ||
int[] perSampleIndexesOfRelevantAlleles = AlleleSubsettingUtils.getIndexesOfRelevantAllelesForGVCF(remappedAlleles, targetAlleles, vc.getStart(), g, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea how expensive this operation is; might it be worth popping this a scoping level to be right under line 540 with a if (g.hasPL() || g.hasAD()) {...
so it never gets called twice (lines 543 and 554)?
Codecov Report
@@ Coverage Diff @@
## master #7836 +/- ##
===============================================
+ Coverage 86.666% 87.075% +0.409%
- Complexity 36781 37000 +219
===============================================
Files 2214 2221 +7
Lines 173551 173955 +404
Branches 18737 18795 +58
===============================================
+ Hits 150409 151471 +1062
+ Misses 16565 15847 -718
- Partials 6577 6637 +60
|
90d68a8
to
2152ad8
Compare
2152ad8
to
b777040
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the test infrastructure correctly, but it seems right and I'm counting on a GATK reviewer to look at that more deeply
@@ -538,21 +538,26 @@ private GenotypesContext mergeRefConfidenceGenotypes(final VariantContext vc, | |||
final int ploidy = g.getPloidy(); | |||
final GenotypeBuilder genotypeBuilder = new GenotypeBuilder(g); | |||
if (!doSomaticMerge) { | |||
if (g.hasPL()) { | |||
// lazy initialization of the genotype index map by ploidy. | |||
if (g.hasPL() | g.hasAD()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although technically the bitwise or operator |
does what you want here, convention is to use conditional-or ||
unless you really want the non-conditional behavior.
tests.add(new Object[]{"test12",Arrays.asList(vcA_C_G_noPLs, vcA_C_G_ALT_noPLs), loc, false, true, | ||
new VariantContextBuilder(VCbase).alleles(A_C_G).genotypes( | ||
new GenotypeBuilder("A_C_G.test2").AD(new int[]{60,9,0}).alleles(noCalls).make(), | ||
new GenotypeBuilder("A_C_G.test").AD(new int[]{60,9,0}).alleles(noCalls).make()).make()}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variant context has Arrays.asList(Aref, C, G, Allele.NON_REF_ALLELE) so the ADs should be of length 4. Do you mean LAD? If this is how you want it for GVS, then I'd like to make that more obvious some how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the test (on line 51) we hardcode the filtering out of the Allele.NON_REF_ALLELE
( @param removeNonRefSymbolicAllele if true, remove the <NON_REF> allele from the merged VC )
during the merge.
So we dont expect to have a corresponding AD value for it (local or otherwise---but this shouldn't be for local AD)
merging w blessing from Jonn |
* properly merge AD values when no PLs * dont check for AD twice * AD test * conditional or * update unit tests * no reason to uniquify sample names * got a lil test happy * better comments
No description provided.