-
Notifications
You must be signed in to change notification settings - Fork 308
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added check for bam index files in loadIndexedBam #1831
Conversation
Test FAILed. Build result: FAILURE[...truncated 15 lines...] > /home/jenkins/git2/bin/git fetch --tags --progress https://github.com/bigdatagenomics/adam.git +refs/pull/:refs/remotes/origin/pr/ # timeout=15 > /home/jenkins/git2/bin/git rev-parse origin/pr/1831/merge^{commit} # timeout=10 > /home/jenkins/git2/bin/git branch -a -v --no-abbrev --contains 665e131 # timeout=10Checking out Revision 665e131 (origin/pr/1831/merge) > /home/jenkins/git2/bin/git config core.sparsecheckout # timeout=10 > /home/jenkins/git2/bin/git checkout -f 665e131385df3b695eb244735b2242cd385a4948First time build. Skipping changelog.Triggering ADAM-prb ? 2.6.2,2.11,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,1.6.3,centosTriggering ADAM-prb ? 2.7.3,2.10,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.10,2.2.0,centosTriggering ADAM-prb ? 2.6.2,2.10,1.6.3,centosTriggering ADAM-prb ? 2.6.2,2.11,2.2.0,centosTriggering ADAM-prb ? 2.7.3,2.11,2.2.0,centosADAM-prb ? 2.6.2,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.7.3,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,2.2.0,centos completed with result FAILUREADAM-prb ? 2.6.2,2.10,1.6.3,centos completed with result FAILUREADAM-prb ? 2.6.2,2.11,2.2.0,centos completed with result FAILUREADAM-prb ? 2.7.3,2.11,2.2.0,centos completed with result FAILURENotifying endpoint 'HTTP:https://webhooks.gitter.im/e/ac8bb6e9f53357bc8aa8'Test FAILed. |
Test PASSed. |
786643d
to
6a4859f
Compare
Fixes #1830 |
We hadn't checked in binary files as test resources before, as then diffs in git aren't meaningful. If we're going to break that convention, and this is as good a time as any, then we should add an index for Do you need this for the 0.23.0 release? |
Test PASSed. |
6a4859f
to
3380265
Compare
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! A few small nits.
// todo: can this method handle SAM and CRAM, or just BAM? | ||
val bamFiles = getFsAndFiles(path).filter(p => p.toString.endsWith(".bam")) | ||
|
||
val indexFiles = getFsAndFiles(indexPath).filter(p => p.toString.endsWith(".bai")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use getFsAndFilesWithFilter
.
new Path(pathName + ".bai") | ||
else | ||
path | ||
|
||
// todo: can this method handle SAM and CRAM, or just BAM? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update this comment to // currently only supports BAM
and cross reference #1833.
|
||
// If pathName is a directory of bam files, scan all files to | ||
// select index files. | ||
val indexPath = if (pathName.endsWith(".bam")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use {}
's.
Also, its not clear that this code and the comment match up. Can you add more to the comment to clarify how this bit of code is necessary for directory support?
require(bamFiles.nonEmpty, | ||
"Did not find any BAM files at %s.".format(path)) | ||
|
||
// Make sure each bam file has an index | ||
bamFiles.foreach(file => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: if multiple files are missing an index, this will just throw for the first one. Instead, I'd suggest:
val missingIndices = bamFiles.filter(f => indexFiles.contains(f.toString + ".bai"))
require(missingIndices.isEmpty,
"Missing indices for BAMs:\n%s".format(missingIndices.mkString("\n"))
@@ -435,6 +435,14 @@ class ADAMContextSuite extends ADAMFunSuite { | |||
assert(reads.rdd.count == 4) | |||
} | |||
|
|||
sparkTest("loadIndexedBam should throw exception without an index file") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test case for the directory/glob case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping! Also, add a test case for the myFile.bai
case.
val bamFiles = getFsAndFiles(path).filter(p => p.toString.endsWith(".bam")) | ||
|
||
val indexFiles = getFsAndFiles(indexPath).filter(p => p.toString.endsWith(".bai")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fnothaft getFsAndFilesWith filter wasn't able to find .bai files unless just the directory was specified
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah makes sense.
Test PASSed. |
|
||
// If pathName is a single file or *.bam, append .bai to find all bam indices. | ||
// Otherwise, pathName is a directory and the entire path must be searched | ||
// for indices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you note:
// TODO: support BAMs where the index is located at s/.bam/.bai/g
These aren't common (myFile.bam
+ myFile.bai
vs myFile.bam.bai
) but you do see them in the wild occasionally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe create an issue for that after this merges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, if you do new Path(pathName.toString.replace(".bam", "*.bai"))
here and make a change below (will drop a comment), I think you can handle both cases.
require(bamFiles.nonEmpty, | ||
"Did not find any BAM files at %s.".format(path)) | ||
|
||
val missingIndices = bamFiles.filterNot(f => { | ||
indexFiles.contains(f.toString + ".bai") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you do indexFiles.contains(f.toString + ".bai") || indexFiles.contains(f.toString.dropRight(3) + "bai"))
you should handle both the myFile.bam.bai
and myFile.bai
cases.
@@ -435,6 +435,14 @@ class ADAMContextSuite extends ADAMFunSuite { | |||
assert(reads.rdd.count == 4) | |||
} | |||
|
|||
sparkTest("loadIndexedBam should throw exception without an index file") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ping! Also, add a test case for the myFile.bai
case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @akmorrow13! LGTM.
Test PASSed. |
5cfe378
to
ee06eda
Compare
Test PASSed. |
I'll give it one last review check after the 0.23.0 release is cut |
Yeah, this is good to go from my side but I have a strong preference for it being in 0.24.0 instead of 0.23.0. |
Thank you, @akmorrow13! |
No description provided.