-
Notifications
You must be signed in to change notification settings - Fork 308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ADAM-1497] Add union to GenomicRDD. #1526
Conversation
Test FAILed. Build result: FAILURE[...truncated 16 lines...] > /home/jenkins/git2/bin/git rev-parse origin/pr/1526/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a --contains ce29373ec3eff23b4c14bc2a99f1a7a7a3efc59c # timeout=10 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/1526/merge^{commit} # timeout=10Checking out Revision ce29373ec3eff23b4c14bc2a99f1a7a7a3efc59c (origin/pr/1526/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f ce29373ec3eff23b4c14bc2a99f1a7a7a3efc59cFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,2.0.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.0.0,centosTriggering ADAM-prb ? 2.6.0,2.11,2.0.0,centosTriggering ADAM-prb ? 2.6.0,2.10,2.0.0,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.0.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.0.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.0.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.0.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'Test FAILed. |
Resolves bigdatagenomics#1497. Also adds `size` methods to `SequenceDictionary` and `RecordGroupDictionary`. Requires the addition of a `ClassTag` to the `GenericGenomicRDD` case class.
55cb7f9
to
7013184
Compare
Test PASSed. |
val union = features1.union(features2) | ||
assert(union.rdd.count === (features1.rdd.count + features2.rdd.count)) | ||
// only a single contig between the two | ||
assert(union.sequences.size === 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implies there is a distinct
here . . .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a filter by contig name when merging two SequenceDictionary
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification
val genotype2 = sc.loadGenotypes(testFile("small.vcf")) | ||
val union = genotype1.union(genotype2) | ||
assert(union.rdd.count === (genotype1.rdd.count + genotype2.rdd.count)) | ||
assert(union.sequences.size === (genotype1.sequences.size + genotype2.sequences.size)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... but not here. Does (_ ++ _)
collapse elements that are equals
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is everyone's jolly favorite bioinformatics boondoggle: chr
prefixes vs no prefixes. Thus, the contigs are distinct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, I know. However, works for our test purposes.
Thank you, @fnothaft |
Resolves #1497. Also adds
size
methods toSequenceDictionary
andRecordGroupDictionary
. Requires the addition of aClassTag
to theGenericGenomicRDD
case class.