PartitionAndJoin should throw an exception if it sees an unmapped read #297

kozanitis · 2014-07-09T20:16:39Z

PartitionAndJoin crashes with a null pointer exception when I call it to join a set of 4 mouse-chrM coordinates with a small mouse file.

You can find my mouse.bam, mouse.adam, the coordinate test.txt file and my source code here:

https://github.com/kozanitis/misc

This is the exception stack:
2014-07-09 09:39:57 WARN TaskSetManager:70 - Loss was due to java.lang.NullPointerException
java.lang.NullPointerException
at org.bdgenomics.adam.rdd.NonoverlappingRegions.hasRegionsFor(RegionJoin.scala:311)
at org.bdgenomics.adam.rdd.MultiContigNonoverlappingRegions.filter(RegionJoin.scala:365)
at org.bdgenomics.adam.rdd.RegionJoin$$anonfun$5.apply(RegionJoin.scala:125)
at org.bdgenomics.adam.rdd.RegionJoin$$anonfun$5.apply(RegionJoin.scala:125)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

kozanitis · 2014-07-09T22:04:20Z

I resolved the issue by filtering out the unmapped ADAM records before calling PartitionAndJoin. Perhaps a more descriptive exception would help here?
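For anyone hitting the same crash, the workaround can be sketched roughly as below. This is only an illustration on plain Scala collections: `ReadRecord`, its fields, and `FilterUnmapped.mappedOnly` are hypothetical stand-ins, not the actual ADAM record schema or API. On an RDD, the same predicate would presumably be passed to `filter` before the join.

```scala
// Hypothetical stand-in for an ADAM read record; NOT the real ADAM schema.
final case class ReadRecord(
  name: String,
  mapped: Boolean,
  contig: Option[String], // None for an unmapped read
  start: Option[Long]     // None for an unmapped read
)

object FilterUnmapped {
  // Keep only reads that are flagged as mapped and carry a contig name,
  // so that a region join never sees a record without coordinates.
  def mappedOnly(reads: Seq[ReadRecord]): Seq[ReadRecord] =
    reads.filter(r => r.mapped && r.contig.isDefined)
}
```

Calling `FilterUnmapped.mappedOnly` on a mix of mapped and unmapped records returns only the mapped ones, in their original order.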

carlyeks · 2014-07-10T13:32:13Z

Good point, we should be catching and throwing an exception in the case that the reads are unmapped. I'm going to leave this open until we fix that.

tdanford · 2014-10-15T13:19:32Z

So the question here is: "should it throw an exception?" (which, technically, it already does: it throws an uninterpretable NPE), or "should it simply filter out the unmapped reads before attempting to do the join?"

I'm inclined to think that the latter is a better option.

kozanitis · 2014-10-15T22:23:18Z

@tdanford your second option definitely sounds more user-friendly, but my vote goes to the first approach, as it looks cleaner from a design point of view. @fnothaft, @massie, what do you think?

fnothaft · 2014-10-15T22:51:43Z

I prefer option 1 (throw an exception); the user should check the validity of their data before doing the join.

massie · 2014-10-15T22:55:29Z

I like the fast-fail exception, but we should ensure that it's easy for a user to understand the error, e.g. "PartitionAndJoin on RDD with unmapped reads is not supported", using a NotSupportedException or something similar.

-Matt
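A minimal sketch of what such a fast-fail check might look like, on plain Scala collections. `AlignedRead` and `RequireMapped.requireAllMapped` are hypothetical names for illustration, not ADAM code. The standard `java.lang.UnsupportedOperationException` is used here as one possible stand-in for the "NotSupportedException or something similar" suggested above.

```scala
// Hypothetical stand-in for a read record; NOT the real ADAM schema.
final case class AlignedRead(name: String, contig: Option[String])

object RequireMapped {
  // Message text taken from the suggestion in this thread.
  val Message: String =
    "PartitionAndJoin on RDD with unmapped reads is not supported"

  // Returns the reads unchanged if every read has a contig; otherwise
  // throws on the first unmapped read, with a message the user can act on.
  def requireAllMapped(reads: Seq[AlignedRead]): Seq[AlignedRead] = {
    reads.find(_.contig.isEmpty).foreach { r =>
      throw new UnsupportedOperationException(
        s"$Message (first offending read: ${r.name})"
      )
    }
    reads
  }
}
```

The point of the design is that the failure names the actual problem (an unmapped read) instead of surfacing as a bare NullPointerException from deep inside the join.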


fnothaft · 2014-10-15T22:58:47Z

@massie that's what @tdanford adds in #421

massie · 2014-10-15T23:02:15Z

+1

-Matt


fnothaft · 2016-07-06T16:00:27Z

Closed as won't fix.

kozanitis closed this as completed Jul 9, 2014

carlyeks reopened this Jul 10, 2014

carlyeks self-assigned this Jul 10, 2014

tdanford added the enhancement label Jul 28, 2014

tdanford changed the title ~~PartitionAndJoin exceptions~~ PartitionAndJoin should throw an exception if it sees an unmapped read Oct 14, 2014

tdanford assigned tdanford and unassigned carlyeks Oct 14, 2014

This was referenced Oct 15, 2014

Create standardized, interpretable exceptions for error reporting #420

Closed

CLI & mappability filter for region join, custom exceptions #421

Closed

fnothaft mentioned this issue Feb 8, 2015

Remove wrapping by Option where throwing an exception is preferable #574

Closed

fnothaft added the wontfix label Jul 6, 2016

fnothaft closed this as completed Jul 6, 2016
