Adds AlleleDepthPseudoCounts genotype annotation. #7303

Merged on Jun 15, 2021 (1 commit)

Conversation

@vruano vruano commented Jun 10, 2021

Similar to AD, the new annotation (DD) captures the depth of evidence (reads) supporting each allele.
However, it does so with a variational Bayes approach that looks at the
likelihoods rather than applying a fixed threshold.

This turns out to be more robust in some instances.

To get the new non-standard annotation in HC, you need to add -A AllelePseudoDepth.
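
To make the idea of likelihood-based "soft" depths concrete, here is a minimal, self-contained sketch of a variational Bayes scheme of this general kind. It is not the code from this PR or from GATK: the class name `PseudoDepthSketch`, the method names, the convergence threshold and the prior are all hypothetical choices made for illustration. Each read is softly assigned to alleles in proportion to its likelihood times the current expected allele fraction, and the per-allele sums of those assignments play the role of fractional depths.

```java
import java.util.Arrays;

// Illustrative sketch only; names and constants are assumptions, not the PR's code.
final class PseudoDepthSketch {

    /** Digamma for x > 0: shift x up to at least 6, then use an asymptotic series. */
    static double digamma(double x) {
        double shift = 0.0;
        while (x < 6.0) {
            shift -= 1.0 / x;
            x += 1.0;
        }
        final double inv2 = 1.0 / (x * x);
        return shift + Math.log(x) - 0.5 / x
                - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252));
    }

    /**
     * logLk[a][r] is the log-likelihood of read r given allele a.
     * prior[a] is a strictly positive Dirichlet pseudocount (hypothetical choice).
     * Returns one fractional depth per allele; the depths sum to the read count.
     */
    static double[] pseudoDepths(final double[][] logLk, final double[] prior,
                                 final int maxIterations) {
        final int alleles = logLk.length;
        final int reads = logLk[0].length;
        final double[] posterior = prior.clone();
        double[] counts = new double[alleles];
        for (int iter = 0; iter < maxIterations; iter++) {
            // Expected log allele fractions under the current Dirichlet posterior.
            final double digammaOfSum = digamma(Arrays.stream(posterior).sum());
            final double[] logWeights = new double[alleles];
            for (int a = 0; a < alleles; a++) {
                logWeights[a] = digamma(posterior[a]) - digammaOfSum;
            }
            // Softly assign each read to the alleles and accumulate the responsibilities.
            final double[] newCounts = new double[alleles];
            for (int r = 0; r < reads; r++) {
                final double[] w = new double[alleles];
                double max = Double.NEGATIVE_INFINITY;
                for (int a = 0; a < alleles; a++) {
                    w[a] = logLk[a][r] + logWeights[a];
                    max = Math.max(max, w[a]);
                }
                double norm = 0.0;
                for (int a = 0; a < alleles; a++) {
                    w[a] = Math.exp(w[a] - max); // subtract max for numerical stability
                    norm += w[a];
                }
                for (int a = 0; a < alleles; a++) {
                    newCounts[a] += w[a] / norm;
                }
            }
            // Update the posterior and stop once the counts no longer move.
            double change = 0.0;
            for (int a = 0; a < alleles; a++) {
                change = Math.max(change, Math.abs(newCounts[a] - counts[a]));
                posterior[a] = prior[a] + newCounts[a];
            }
            counts = newCounts;
            if (change < 1e-6) {
                break;
            }
        }
        return counts;
    }
}
```

In this sketch a read whose likelihoods clearly favor one allele contributes almost a full count to it, while an ambiguous read is split across alleles instead of being kept or dropped by a fixed cutoff.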

@vruano vruano requested review from davidbenjamin and fleharty June 10, 2021 10:08
gatk-bot commented Jun 10, 2021

Travis reported job failures from build 34475
Failures in the following jobs:

| Test Type | JDK | Job ID | Logs |
|---|---|---|---|
| integration | openjdk11 | 34475.12 | logs |
| integration | openjdk8 | 34475.2 | logs |

@fleharty
Contributor

@vruano
Tests are failing; could you resolve this?
The failure appears to be in:
org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotatorIntegrationTest > testAgainstMutect2 FAILED
    java.lang.IllegalArgumentException: Dirichlet parameters may not be negative
        at org.broadinstitute.hellbender.utils.Utils.validateArg(Utils.java:798)
        at org.broadinstitute.hellbender.utils.Dirichlet.<init>(Dirichlet.java:26)
        at org.broadinstitute.hellbender.tools.walkers.mutect.SomaticLikelihoodsEngine.getEffectiveCounts(SomaticLikelihoodsEngine.java:56)

@fleharty fleharty left a comment

@vruano I have a few minor nits. I think we should aim to get this in tomorrow so that it is part of the release. Rob and Richard have asked that we get this in this release if possible.

@@ -0,0 +1,170 @@
package org.broadinstitute.hellbender.tools.walkers.annotator;

import htsjdk.variant.variantcontext.*;
Contributor

Should these be explicit imports? I'm not sure.

}

@Override
public void annotate(final ReferenceContext ref, final VariantContext vc,
Contributor

Could you add some javadoc for this method?

Contributor Author

I would say that the doc is inherited from the parent class. Is that sufficient?

Contributor

Yep, the @inheritdoc is good.

for (int i = 0; i < result.length; i++) {
    double best = lkMatrix.getEntry(0, i);
    double secondBest = Double.NEGATIVE_INFINITY;
    for (int j = 1; j < lkMatrix.getRowDimension(); j++) {
Contributor

This makes me really nervous. Why does i start at 0 and j at 1?

Contributor Author

i is standard.
As for j: if you look at the two preceding lines in the outer for loop (double best = ... and double secondBest = ...), the work for j == 0 is already done there, which is why we jump straight to j = 1.

Contributor

I see, ty.
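
The seeding pattern discussed in this thread can be sketched in isolation as follows. This is a hypothetical, standalone version that uses a plain `double[][]` in place of `lkMatrix`; the class name `BestTwoPerColumn` and the body of the inner loop are assumptions, since the excerpt above only shows the loop headers and the two initializations.

```java
// Illustrative sketch only; the inner-loop update is assumed, not taken from the PR.
final class BestTwoPerColumn {

    /**
     * For each column i, returns {best, secondBest} over all rows.
     * Row 0 seeds 'best' before the inner loop, so the loop can start at row 1.
     */
    static double[][] bestAndSecondBest(final double[][] lk) {
        final int rows = lk.length;
        final int cols = lk[0].length;
        final double[][] result = new double[cols][2];
        for (int i = 0; i < cols; i++) {
            double best = lk[0][i];                        // the j == 0 work happens here
            double secondBest = Double.NEGATIVE_INFINITY;  // no runner-up seen yet
            for (int j = 1; j < rows; j++) {
                final double value = lk[j][i];
                if (value > best) {
                    secondBest = best;   // old best becomes the runner-up
                    best = value;
                } else if (value > secondBest) {
                    secondBest = value;
                }
            }
            result[i][0] = best;
            result[i][1] = secondBest;
        }
        return result;
    }
}
```

With a single row, `secondBest` stays at `Double.NEGATIVE_INFINITY`, so callers of a routine like this would need to handle that case explicitly.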

@vruano vruano force-pushed the vrr_pseudo_counts branch from 4a9395e to 8d735de Compare June 14, 2021 09:10
@fleharty fleharty left a comment

This looks good, though I'm uncomfortable with
INITIAL_PRIOR_CACHE_MAX_ALLELE = 10.
The reason is that in high-depth somatic samples we sometimes see more than 10 alleles show up due to homopolymer runs. At high depth you'll sometimes see insertions of A, AA, AAA, and so on, out to something like 14 As.

This is rare, but it does happen.

Are there any solutions where we can easily deal with a general number of alleles? Sorry, I know it's a pain.
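
One possible way to support an arbitrary number of alleles is to treat the constant as an initial size only and build larger priors on demand. The sketch below is an assumption-laden illustration, not what the PR does: the class name `FlatPriorCache`, the flat per-allele pseudocount, and the use of a `ConcurrentHashMap` are all hypothetical.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only; a flat prior and this cache layout are assumptions.
final class FlatPriorCache {
    private final double pseudoCount;
    private final Map<Integer, double[]> cache = new ConcurrentHashMap<>();

    FlatPriorCache(final double pseudoCount, final int initialMaxAlleles) {
        if (pseudoCount <= 0.0) {
            throw new IllegalArgumentException("pseudocount must be positive");
        }
        this.pseudoCount = pseudoCount;
        // Pre-populate the common cases; larger allele counts are built lazily.
        for (int n = 1; n <= initialMaxAlleles; n++) {
            cache.put(n, build(n));
        }
    }

    /** Returns a defensive copy of the prior for the given number of alleles. */
    double[] priorFor(final int alleleCount) {
        if (alleleCount < 1) {
            throw new IllegalArgumentException("allele count must be at least 1");
        }
        return cache.computeIfAbsent(alleleCount, this::build).clone();
    }

    private double[] build(final int alleleCount) {
        final double[] prior = new double[alleleCount];
        Arrays.fill(prior, pseudoCount);
        return prior;
    }
}
```

With this layout, a site with 14 alleles simply triggers a one-time build of a 14-element prior, and sites within the initial range pay no extra cost.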

@vruano vruano force-pushed the vrr_pseudo_counts branch from caace67 to 90872de Compare June 14, 2021 23:46
vruano commented Jun 15, 2021

Tests pass. @fleharty, want to merge?

@fleharty fleharty left a comment

Thanks Valentin!

@vruano vruano merged commit ca33160 into master Jun 15, 2021
@vruano vruano deleted the vrr_pseudo_counts branch June 15, 2021 01:29