Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAM-1683] Pull in Spark-BAM as a secondary loading path. #1686

Closed

Conversation

fnothaft
Copy link
Member

Resolves #1683. Not expected to compile for Spark 1.x/Scala 2.10.

</transformers>
<relocations>
<relocation>
<pattern>org.seqdoop</pattern>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a heads up to @ryan-williams, if I was to go this route, I would also want to relocate the org.hammerlab classes in org.hammerlab:spark-bam into org.bdgenomics so that we're not publishing a JAR with org.hammerlab classes that don't match what your published org.hammerlab:spark-bam artifact contains. However, I can't relocate the org.hammerlab classes without breaking my compile. I'm not sure why, but Scala companion objects (or possibly implicits?) seem to get broken by the relocation process.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2333/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1686/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 6dcb712 # timeout=10Checking out Revision 6dcb712 (origin/pr/1686/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 6dcb712abcf1ac551eccbc4a94528cd1bede99b0First time build. Skipping changelog.Triggering ADAM-prb ? 2.6.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.6.0,2.11,2.1.0,centosTriggering ADAM-prb ? 2.3.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.6.0,2.10,2.1.0,centosTriggering ADAM-prb ? 2.6.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.10,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,1.6.1,centosTriggering ADAM-prb ? 2.3.0,2.11,2.1.0,centosADAM-prb ? 2.6.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,2.1.0,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.10,2.1.0,centos completed with result FAILUREADAM-prb ? 2.6.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.10,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,1.6.1,centos completed with result FAILUREADAM-prb ? 2.3.0,2.11,2.1.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

val readerFn = if (sc.hadoopConfiguration.getBoolean(ADAMContext.USE_SPARK_BAM, false)) {
def sparkBamRead(path: String): RDD[SAMRecord] = {
if (path.endsWith(".bam")) {
sc.loadBam(HLPath(path))
Copy link
Member

@ryan-williams ryan-williams Aug 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably didn't make this clear but sc.loadReads is the intended main entry-point to spark-bam, and will handle .sam, .bam, and .cram (the latter by delegating straight to hadoop-bam; for .sam, it's a new implementation that is probably equivalens to hadoop-bam's… i.e. it's not designed to be faster/better for any reason, like the .bam support is).

Using spark-bam just for .bams is also reasonable, this is just FTR

pom.xml Outdated
<exclusions>
<exclusion>
<groupId>org.hammerlab</groupId>
<artifactId>iterator_2.10</artifactId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why this was needed, if you recall?

pom.xml Outdated
@@ -535,6 +548,11 @@
</exclusions>
</dependency>
<dependency>
<groupId>org.hammerlab</groupId>
<artifactId>spark-bam_2.10</artifactId>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spark-bam_2.10, you say 🤔 😁

is this here in anticipation of a world where that exists? j/w / trying to follow

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2521/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1686/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 462ec82 # timeout=10Checking out Revision 462ec82 (origin/pr/1686/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 462ec825227d43af99edb0c7d5a3edd13d91ed28First time build. Skipping changelog.Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,2.2.0,centosTriggering ADAM-prb ? 2.6.2,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.11,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,2.2.0,centosADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2524/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1686/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains faed42e # timeout=10Checking out Revision faed42e (origin/pr/1686/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f faed42e549fc7fb1d537c09e39bc45d00b29a0ebFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,2.2.0,centosTriggering ADAM-prb ? 2.6.2,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.11,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,2.2.0,centosADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@fnothaft
Copy link
Member Author

Jenkins, retest this please.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2525/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1686/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains faed42e # timeout=10Checking out Revision faed42e (origin/pr/1686/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f faed42e549fc7fb1d537c09e39bc45d00b29a0ebFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,2.2.0,centosTriggering ADAM-prb ? 2.6.2,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.11,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,2.2.0,centosADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/2526/

Build result: FAILURE

[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1686/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 084f071 # timeout=10Checking out Revision 084f071 (origin/pr/1686/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 084f0718d12a68fed77c2ae64c750eafbcc70d3eFirst time build. Skipping changelog.Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,2.2.0,centosTriggering ADAM-prb ? 2.6.2,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.11,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,2.2.0,centosADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
Test FAILed.

@heuermh
Copy link
Member

heuermh commented Jan 6, 2020

Closing as WontFix. Feel free to create a new PR after rebasing against git head.

@heuermh heuermh closed this Jan 6, 2020
@heuermh heuermh added this to the 0.31.0 milestone Jan 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support using Spark-BAM to load BAM files
4 participants