Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adam2vcf doesn't have info fields #939

Merged
merged 1 commit into from
May 21, 2016

Conversation

andrewmchen
Copy link
Member

I added info fields only if the vcf is from a single sample.

What's the desired behavior if the vcf has multiple samples per site?

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1080/

Build result: FAILURE

GitHub pull request #939 of commit f3b09ed automatically merged.Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'[EnvInject] - Loading node environment variables.Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb > git rev-parse --is-inside-work-tree # timeout=10Fetching changes from the remote Git repository > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10Fetching upstream changes from https://github.com/bigdatagenomics/adam.git > git --version # timeout=10 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ > git rev-parse origin/pr/939/merge^{commit} # timeout=10 > git branch -a --contains 68e0faef576be5033a341ce95b2e47757c630a07 # timeout=10 > git rev-parse remotes/origin/pr/939/merge^{commit} # timeout=10Checking out Revision 68e0faef576be5033a341ce95b2e47757c630a07 (origin/pr/939/merge) > git config core.sparsecheckout # timeout=10 > git checkout -f 68e0faef576be5033a341ce95b2e47757c630a07First time build. Skipping changelog.Triggering ADAM-prb ? 2.6.0,2.10,1.4.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.4.1,centosTouchstone configurations resulted in FAILURE, so aborting...Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1081/
Test PASSed.

@fnothaft
Copy link
Member

Whoops! You should have an info field per site, as well as sample specific annotations (per sample, per site), see https://samtools.github.io/hts-specs/VCFv4.2.pdf. Essentially, the info field behavior should be the same in the single and multi sample cases.

@andrewmchen
Copy link
Member Author

Hm I read through the VCF spec and I don't think I completely understand still.

First, my understanding was that before this PR we wouldn't have any per site info field. Second, when convert a VCF to a RDD of adam Genotypes, are the VariantCallingAnnotations statistics not per sample but instead statistics aggregated across samples from that site?

@heuermh heuermh added this to the 0.20.0 milestone Feb 22, 2016
@heuermh
Copy link
Member

heuermh commented Feb 22, 2016

@andrewmchen after we get 0.19.0 out, I plan to focus on VCF INFO handling, for supporting the VCF ANN spec and other related issues. Please let me know how I might help in a week or so.

@andrewmchen
Copy link
Member Author

@heuermh Sure! Would love to work with you on some of the VCF features. Right now I'm just playing around with it for the purposes of avocado.

Maybe this isn't the desired behavior, but I think it's kind of strange that adam2vcf returns a much different vcf than the original that was passed into vcf2adam. Are the related issues you mentioned related to this?

@fnothaft
Copy link
Member

I will open a follow on PR that addresses some issues with this (specifically, stores sample info in genotype, aggregate in INFO).

@fnothaft
Copy link
Member

@heuermh is this good to go on your side? I just looked in more detail and think it is fine (doesn't need any more patches). If it is good from your side, then I will merge.

@heuermh
Copy link
Member

heuermh commented May 20, 2016

LGTM

@fnothaft fnothaft merged commit 22259c1 into bigdatagenomics:master May 21, 2016
@fnothaft
Copy link
Member

Merged! Thanks @andrewmchen!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants