fixes to load properly VCF samples dict from _samples.avro #9

Open: wants to merge 4 commits into base: genotypes-rdd

@jpdna jpdna commented May 25, 2016

This PR fixes a problem in the current code where the following sequence of actions:

  1. Run 'vcf2adam' to produce ADAM Parquet from a genotype-containing VCF
  2. Load that ADAM genotype Parquet data using ADAMContext.loadGenotypes()

fails for two reasons:

This PR fixes these issues so that writing and loading genotype data work correctly.

fnothaft and others added 4 commits May 20, 2016 09:28
Resolves bigdatagenomics#909:

* Refactors `org.bdgenomics.adam.rdd.variation` to add `GenomicRDD`s for
  `Genotype`, `Variant`, and `VariantContext`. These classes write
  sequence and sample metadata to disk.
* Refactors `ADAMRDDFunctions` to an abstract class in preparation for
  further refactoring in bigdatagenomics#1011.
* Adds `AvroGenomicRDD` trait, which consolidates Parquet + Avro metadata
  writing code across all Avro data models.