PartitionAndJoin should throw an exception if it sees an unmapped read #297
Comments
I resolved the issue by filtering out the unmapped ADAM records before calling PartitionAndJoin. Perhaps a more relevant exception would help?
Good point, we should be catching and throwing an exception in the case that the reads are unmapped. I'm going to leave this open until we fix that.
So the question here is, "should it throw an exception?" (which, technically, it already does: it throws an un-interpretable NPE) or "should it simply filter out the unmapped reads before attempting to do the join?" I'm inclined to think that the latter is a better option.
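The two options under discussion can be sketched roughly as follows. This is an illustrative Scala/Spark fragment, not the actual ADAM API: `reads`, the `getReadMapped` accessor, and the surrounding names are placeholders for whatever the real code paths are.

```scala
// Sketch only: `reads` is assumed to be an RDD of alignment records with a
// boolean `getReadMapped` accessor; treat all names here as placeholders.

// Option 1: fail fast with an interpretable exception instead of an NPE.
val unmapped = reads.filter(r => !r.getReadMapped).count()
if (unmapped > 0) {
  throw new IllegalArgumentException(
    s"PartitionAndJoin requires mapped reads, but found $unmapped unmapped records.")
}

// Option 2: silently drop unmapped reads before joining.
val mappedOnly = reads.filter(_.getReadMapped)
// ... then run the join on `mappedOnly` instead of `reads`.
```

Option 1 surfaces the data problem to the user; option 2 keeps the join total but hides records, which is the trade-off debated below.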
I prefer option 1 (throw an exception); the user should check the validity of their data before doing the join.
I like the fast-fail exception but we should ensure that it's easy for a … -Matt
+1 -Matt
Closed as won't fix.
PartitionAndJoin crashes with a null pointer exception when I call it to join a set of 4 mouse-chrM coordinates with a small mouse file.
You can find my mouse.bam, mouse.adam, the coordinate test.txt file and my source code here:
https://github.com/kozanitis/misc
This is the exception stack:
2014-07-09 09:39:57 WARN TaskSetManager:70 - Loss was due to java.lang.NullPointerException
java.lang.NullPointerException
at org.bdgenomics.adam.rdd.NonoverlappingRegions.hasRegionsFor(RegionJoin.scala:311)
at org.bdgenomics.adam.rdd.MultiContigNonoverlappingRegions.filter(RegionJoin.scala:365)
at org.bdgenomics.adam.rdd.RegionJoin$$anonfun$5.apply(RegionJoin.scala:125)
at org.bdgenomics.adam.rdd.RegionJoin$$anonfun$5.apply(RegionJoin.scala:125)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)