Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAM-1695] Check for illegal genotype index after splitting multi-allelic variants. #1725

Merged
merged 2 commits into from
Sep 27, 2017

Conversation

heuermh
Copy link
Member

@heuermh heuermh commented Sep 14, 2017

Fixes #1695

@fnothaft
Copy link
Member

OOC, do you know what the license on the GIAB data is? We should ensure that it is OK to check in an excerpt of the GIAB truth file.

Copy link
Member

@fnothaft fnothaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just dropped a question on the code, and a question WRT importing the GIAB file. Thanks @heuermh!

@@ -787,7 +787,7 @@ class VariantContextConverter(
gIndices: Array[Int]): Genotype.Builder = {

// AD is an array type field
if (g.hasAD) {
if (g.hasAD && gIdx < g.getAD.size) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this only affect the AD field? I would've expected it to broadly impact fields that have the same array count as AD? Do we already check those?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that file also has an ADALL VCF FORMAT Number=R Type=Integer genotype field that causes ArrayIndexOutOfBoundsExceptions, thus the other change at line 1289.

I'll add additional assertions to the test case to demonstrate this fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, hah! I'd missed that change earlier. Thanks for clarifying!

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2373/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1725/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 1b8213b463b23656995b7d4aa0d80b853419944a # timeout=10Checking out Revision 1b8213b463b23656995b7d4aa0d80b853419944a (origin/pr/1725/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 1b8213b463b23656995b7d4aa0d80b853419944aFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.11,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.10,2.2.0,centosTriggering ADAM-prb ? 2.6.2,2.11,2.2.0,centosADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result SUCCESSADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result SUCCESSADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result SUCCESSADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result SUCCESSADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result SUCCESSNotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@heuermh
Copy link
Member Author

heuermh commented Sep 14, 2017

OOC, do you know what the license on the GIAB data is? We should ensure that it is OK to check in an excerpt of the GIAB truth file.

"License: There are no restrictions on the use of this data."

https://aws.amazon.com/public-datasets/giab

"[P]lease cite http://www.nature.com/nbt/journal/v32/n3/full/nbt.2835.html (doi:10.1038/nbt.2835) and http://www.nature.com/articles/sdata201625 (doi:10.1038/sdata.2016.25) when using these calls."

ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878_HG001/latest/README_NISTv3.3.2.txt

I can add an accompanying HG...vcf.README to the adam-core/src/test/resources directory.

@fnothaft
Copy link
Member

I can add an accompanying HG...vcf.README to the adam-core/src/test/resources directory.

+1

@@ -787,7 +787,7 @@ class VariantContextConverter(
gIndices: Array[Int]): Genotype.Builder = {

// AD is an array type field
if (g.hasAD) {
if (g.hasAD && gIdx < g.getAD.size) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, hah! I'd missed that change earlier. Thanks for clarifying!

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2376/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1725/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 8d26c62 # timeout=10Checking out Revision 8d26c62 (origin/pr/1725/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 8d26c627d44a9a0e692ff6767e6759902dd84523First time build. Skipping changelog.Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.11,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.10,2.2.0,centosTriggering ADAM-prb ? 2.6.2,2.11,2.2.0,centosADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result SUCCESSADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result SUCCESSADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result SUCCESSADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result SUCCESSADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result SUCCESSNotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2377/
Test PASSed.

@heuermh
Copy link
Member Author

heuermh commented Sep 27, 2017

Rebased, ready for review @devin-petersohn

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2403/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1725/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains c00a9decf806b3cfe01e243f49ebd8a5f0872594 # timeout=10Checking out Revision c00a9decf806b3cfe01e243f49ebd8a5f0872594 (origin/pr/1725/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f c00a9decf806b3cfe01e243f49ebd8a5f0872594First time build. Skipping changelog.Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.11,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.10,2.2.0,centosTriggering ADAM-prb ? 2.6.2,2.11,2.2.0,centosADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2404/
Test PASSed.

@devin-petersohn devin-petersohn merged commit 1302e07 into bigdatagenomics:master Sep 27, 2017
@devin-petersohn
Copy link
Member

Merged, thanks @heuermh!

@heuermh heuermh deleted the issue-1695 branch September 27, 2017 21:19
@heuermh heuermh added this to the 0.23.0 milestone Oct 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants