Use hadoop-bam BAMInputFormat to do loadIndexedBam #953

andrewmchen
Member

I changed the loadIndexedBam function to use the new InputFormat released in hadoop-bam 7.4.0, which filters using an index file. We used to use an InputFormat that @erictu wrote.

In order to make this change, I had to upgrade our htsjdk library to version 2.1.0. Hope this doesn't break anything.

I haven't tested this on the cluster but I wrote a small test here. When I get a chance I'll try testing this on the cluster as well.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1091/

Build result: FAILURE

GitHub pull request #953 of commit 13f8b6a automatically merged.
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/
 > git rev-parse origin/pr/953/merge^{commit} # timeout=10
 > git branch -a --contains 8311854 # timeout=10
 > git rev-parse remotes/origin/pr/953/merge^{commit} # timeout=10
Checking out Revision 8311854 (origin/pr/953/merge)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 83118549c4f1557462c5d8a811327cc74fd8cb8d
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.10,1.4.1,centos
Triggering ADAM-prb ? 2.6.0,2.11,1.4.1,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'

@fnothaft
Member

Thanks for opening this, @andrewmchen! I will review soon. Moving to the new Hadoop-BAM/HTSJDK releases sounds good, but I would like to defer merging this until after the 0.19.0 release, as the version bump requires a move to Java 8. Since 0.19.0 has a lot of fixes in it, I'd like to keep that release as widely available as possible.

@fnothaft fnothaft added this to the 0.20.0 milestone Feb 21, 2016
@andrewmchen
Member Author

OK great thanks! I'm guessing then that the compilation problems on Jenkins are because of the mismatch between Java 7 and 8?

@fnothaft
Member

The unit test failure is showing a Java version mismatch, but I'd have to look into it more closely to say something more intelligent.

@heuermh
Member

heuermh commented Feb 22, 2016

+1 to waiting until 0.20 or later

@@ -41,4 +42,19 @@ trait Interval {
*/
def width: Long = end - start

/**
* Need to implement getStart function from Locatable. (1-based start position. closed)
Member

Mixing 0- and 1-based coordinate systems in the same class gives me the hives. Can we hide this as an implementation detail in an adapter somewhere in the I/O code?
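
To make the adapter suggestion concrete, here is a minimal sketch of what it might look like. It is not code from this PR: `OneBasedLocatable` is a local stand-in for htsjdk's `Locatable` so the example compiles without dependencies, and `ZeroBasedRegion` and `LocatableAdapter` are hypothetical names. It also assumes the interval is 0-based and half-open, which is what `width = end - start` suggests.

```scala
// Hypothetical stand-in for htsjdk's Locatable (1-based, closed coordinates).
trait OneBasedLocatable {
  def getContig: String
  def getStart: Int
  def getEnd: Int
}

// Hypothetical 0-based, half-open region; assumed to mirror Interval's start/end/width.
final case class ZeroBasedRegion(contig: String, start: Long, end: Long) {
  require(start >= 0 && end >= start, "expected 0 <= start <= end")
  def width: Long = end - start
}

// The coordinate conversion lives only here, so the core interval type
// never has to expose 1-based getters.
object LocatableAdapter {
  def toLocatable(region: ZeroBasedRegion): OneBasedLocatable =
    new OneBasedLocatable {
      def getContig: String = region.contig
      // 0-based inclusive start -> 1-based inclusive start.
      def getStart: Int = Math.toIntExact(region.start + 1)
      // A 0-based exclusive end has the same value as the 1-based inclusive end.
      def getEnd: Int = Math.toIntExact(region.end)
    }
}
```

With this shape, the I/O code would call `LocatableAdapter.toLocatable(region)` just before handing intervals to the input format. Note that an empty region (`start == end`) maps to a start one past its end, so callers may want to filter those out first.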

@fnothaft
Member

Jenkins, test this please.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1235/

Build result: FAILURE

GitHub pull request #953 of commit 13f8b6a.
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'
[EnvInject] - Loading node environment variables.
Building remotely on amp-jenkins-worker-05 (centos spark-test) in workspace /home/jenkins/workspace/ADAM-prb
 > /home/jenkins/git2/bin/git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > /home/jenkins/git2/bin/git config remote.origin.url https://github.com/bigdatagenomics/adam.git # timeout=10
Fetching upstream changes from https://github.com/bigdatagenomics/adam.git
 > /home/jenkins/git2/bin/git --version # timeout=10
 > /home/jenkins/git2/bin/git -c core.askpass=true fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15
 > /home/jenkins/git2/bin/git rev-parse 13f8b6a^{commit} # timeout=10
 > /home/jenkins/git2/bin/git branch -a --contains 13f8b6a # timeout=10
 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/953/head^{commit} # timeout=10
 > /home/jenkins/git2/bin/git rev-parse remotes/origin/pr/953/merge^{commit} # timeout=10
Checking out Revision 13f8b6a (origin/pr/953/merge, origin/pr/953/head)
 > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10
 > /home/jenkins/git2/bin/git checkout -f 13f8b6ab35faeb4b16d990451d7166f35c23c63a
First time build. Skipping changelog.
Triggering ADAM-prb ? 2.6.0,2.10,1.5.2,centos
Triggering ADAM-prb ? 2.6.0,2.11,1.5.2,centos
Touchstone configurations resulted in FAILURE, so aborting...
Notifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'

@fnothaft
Member

fnothaft commented May 20, 2016

Superseded by #1036.

@fnothaft fnothaft closed this May 20, 2016