Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrote the getType method to handle all ploidy levels #904

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -26,16 +26,56 @@ object RichGenotype {
}

class RichGenotype(val genotype: Genotype) {

def ploidy: Int = genotype.getAlleles.size

def getType: GenotypeType = {
assert(ploidy <= 2, "getType only meaningful for genotypes with ploidy <= 2")
genotype.getAlleles.toList.distinct match {
case List(GenotypeAllele.Ref) => GenotypeType.HOM_REF
case List(GenotypeAllele.Alt) => GenotypeType.HOM_ALT
case List(GenotypeAllele.Ref, GenotypeAllele.Alt) |
List(GenotypeAllele.Alt, GenotypeAllele.Ref) => GenotypeType.HET
case _ => GenotypeType.NO_CALL

// Get the list with the distinct alleles
val distinctListOfAlleles = genotype.getAlleles.toList.distinct

// In the case that there is only one distinct allele
// This should be applicable to any genome ploidy.
if (distinctListOfAlleles.size == 1) {
distinctListOfAlleles match {

// If all alleles are the reference allele, the genotype is Homozygous Reference (HOM_REF)
case List(GenotypeAllele.Ref) => GenotypeType.HOM_REF

// If all alleles are the primary alternative allele, the genotype is Homozygous Alternative (HOM_ALT)
case List(GenotypeAllele.Alt) => GenotypeType.HOM_ALT

// If all alleles are not called, the genotype is not called (NO_CALL)
case List(GenotypeAllele.NoCall) => GenotypeType.NO_CALL

// If all alleles are OtherAlt.
// If genotype.getAlleles returns a single OtherAlt, the genotype is Homozygous Alternative (HOM_ALT)
// If genotype.getAlleles returns a multiple OtherAlt, the genotype is
// A) The OtherAlt alleles are the same OtherAlt alleles: Homozygous Alternative (HOM_ALT)
// B) The OtherAlt allales are different OtherAlt alleles: Heterozygous
// For now return NO_CALL as the genotypes, as was done in the previous getType function
// See also issue https://github.com/bigdatagenomics/adam/issues/897
case List(GenotypeAllele.OtherAlt) => GenotypeType.NO_CALL

// only the four above alleles are possible
// https://github.com/bigdatagenomics/bdg-formats/blob/master/src/main/resources/avro/bdg.avdl#L464
case _ => throw new IllegalStateException("Found single distinct allele other than the four possible alleles: Ref, Alt, NoCall and OtherAlt")
}
} // In the case that there are multiple distinct alleles
// This should be applicable to any genome ploidy.
else {
// If there is a not called allele in this distinct list of alleles
// The genotype is NO_CALL
// IN HTS-JDK this would be GenotypeType.MIXED , this type is not available in BDG / ADAM
// https://github.com/bigdatagenomics/bdg-formats/blob/master/src/main/resources/avro/bdg.avdl#L483
// https://github.com/samtools/htsjdk/blob/master/src/java/htsjdk/variant/variantcontext/Genotype.java#L218
if (distinctListOfAlleles contains GenotypeAllele.NoCall) {
GenotypeType.NO_CALL
} // Otherwise the distinct alleles are a combination of 2 or 3 alleles from the list (GenotypeAllele.Ref, GenotypeAllele.Alt, GenotypeAllele.OtherAlt)
// Therefore the genotype is Heterozygous HET
else {
GenotypeType.HET
}
}
}

Expand Down