PartitionAndJoin should throw an exception if it sees an unmapped read #297

Closed
kozanitis opened this issue Jul 9, 2014 · 9 comments

Comments

@kozanitis

PartitionAndJoin crashes with a null pointer exception when I call it to join a set of 4 mouse-chrM coordinates with a small mouse file.

You can find my mouse.bam, mouse.adam, the coordinate test.txt file and my source code here:

https://github.com/kozanitis/misc

This is the exception stack:
2014-07-09 09:39:57 WARN TaskSetManager:70 - Loss was due to java.lang.NullPointerException
java.lang.NullPointerException
at org.bdgenomics.adam.rdd.NonoverlappingRegions.hasRegionsFor(RegionJoin.scala:311)
at org.bdgenomics.adam.rdd.MultiContigNonoverlappingRegions.filter(RegionJoin.scala:365)
at org.bdgenomics.adam.rdd.RegionJoin$$anonfun$5.apply(RegionJoin.scala:125)
at org.bdgenomics.adam.rdd.RegionJoin$$anonfun$5.apply(RegionJoin.scala:125)
at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:390)
at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
at scala.collection.Iterator$class.foreach(Iterator.scala:727)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)

@kozanitis
Author

I resolved the issue by filtering out the unmapped ADAM records before calling PartitionAndJoin. Perhaps a more descriptive exception would help?
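
A minimal sketch of the workaround described above, for illustration only. The `Read` case class and the `FilterUnmapped.mappedOnly` helper are hypothetical stand-ins (they are not ADAM's record type or API), and a plain `Seq` is used in place of an RDD so that the example is self-contained:

```scala
// Hypothetical stand-in for a read record; the real ADAM record has many more fields.
final case class Read(
  name: String,
  contig: Option[String], // None when the read has no reference contig
  start: Option[Long],    // None when the read has no alignment start
  mapped: Boolean
)

object FilterUnmapped {
  // Keep only reads that are flagged as mapped and actually carry coordinates,
  // so that a later region join never sees a read without a contig.
  def mappedOnly(reads: Seq[Read]): Seq[Read] =
    reads.filter(r => r.mapped && r.contig.isDefined && r.start.isDefined)
}
```

Calling `FilterUnmapped.mappedOnly(reads)` before the join drops the unmapped records. On a Spark RDD the same predicate could presumably be passed to `filter` in the same way.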

@carlyeks
Member

Good point: we should detect unmapped reads and throw a descriptive exception in that case. I'm going to leave this open until we fix that.

@carlyeks carlyeks reopened this Jul 10, 2014
@carlyeks carlyeks self-assigned this Jul 10, 2014
@tdanford tdanford changed the title from "PartitionAndJoin exceptions" to "PartitionAndJoin should throw an exception if it sees an unmapped read" Oct 14, 2014
@tdanford tdanford assigned tdanford and unassigned carlyeks Oct 14, 2014
@tdanford
Contributor

So the question here is: should it throw an exception (which, technically, it already does: it throws an uninterpretable NPE), or should it simply filter out the unmapped reads before attempting the join?

I'm inclined to think that the latter is a better option.

@kozanitis
Author

@tdanford your second option definitely sounds more user-friendly, but my vote goes to the first approach, as it looks cleaner from a design point of view. @fnothaft, @massie, what do you think?

@fnothaft
Member

I prefer the first option (throwing an exception); the user should check the validity of their data before doing the join.

@massie
Member

massie commented Oct 15, 2014

I like the fail-fast exception, but we should ensure that it's easy for a user to understand the error, e.g. "PartitionAndJoin on RDD with unmapped reads is not supported", using a NotSupportedException or something similar.

-Matt
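
A rough sketch of the fail-fast check suggested above, not the change that was actually proposed for ADAM. The `AlignedRead` case class and the `JoinPreconditions.requireAllMapped` helper are hypothetical names, and the JVM's standard `UnsupportedOperationException` is assumed here as the "something similar" exception type:

```scala
// Hypothetical stand-in for a read record, reduced to the fields the check needs.
final case class AlignedRead(name: String, contig: Option[String], mapped: Boolean)

object JoinPreconditions {
  val Message = "PartitionAndJoin on RDD with unmapped reads is not supported"

  // Returns the input unchanged when every read is mapped; otherwise throws
  // with a readable message instead of failing later with an opaque NPE.
  def requireAllMapped(reads: Seq[AlignedRead]): Seq[AlignedRead] =
    reads.find(r => !r.mapped || r.contig.isEmpty) match {
      case Some(bad) =>
        throw new UnsupportedOperationException(
          s"$Message (first offending read: ${bad.name})")
      case None => reads
    }
}
```

Running such a check before the join surfaces the problem at the call site, with a message that names the cause, rather than deep inside the join as a NullPointerException.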


@fnothaft
Member

@massie that's what @tdanford adds in #421

@massie
Member

massie commented Oct 15, 2014

+1

-Matt


@fnothaft
Member

fnothaft commented Jul 6, 2016

Closed as won't fix.

@fnothaft fnothaft closed this as completed Jul 6, 2016