Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ADAM-1308] Fix stack overflow in join with custom iterator impl. #1315

Conversation

fnothaft
Copy link
Member

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/ADAM-prb/1679/
Test PASSed.

@@ -263,6 +263,30 @@ private[rdd] case class ManualRegionPartitioner(partitions: Int) extends Partiti
}
}

private class AppendableIterator[T] extends Iterator[T] {
var iterators: ListBuffer[Iterator[T]] = ListBuffer.empty
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to worry about thread safety? Would immutable copy-on-write help here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nah, this is only accessed inside of a single thread.


def append(iter: Iterator[T]) {
if (iter.hasNext) {
iterators += iter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

safe copy? iter could be modified by the caller

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, Scala has a pretty strict "advisory" (not quite a contract?) on using Iterators:

It is of particular importance to note that, unless stated otherwise, one should never use an iterator after calling a method on it.

This is a private class used only in this file, so I think it is OK to make the assumption that iter is not modified after being passed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Np! I should make a pass and document this assumption better.

}

def hasNext: Boolean = {
iterators.nonEmpty
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this query all of the iterators in iterators?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid having to loop over all the iterators, one of the optimizations I made is that we remove empty iterators from the list, and we don't add empty iterators to the list.

Copy link
Member

@devin-petersohn devin-petersohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@akmorrow13
Copy link
Contributor

I was going to try running this on the problem before it's merged unless anyone deems that unnecessary.

@fnothaft
Copy link
Member Author

I was going to try running this on the problem before it's merged unless anyone deems that unnecessary.

I would really like that so we can confirm that this is a fix before we merge.

@heuermh heuermh modified the milestone: 0.21.0 Dec 15, 2016
@devin-petersohn
Copy link
Member

I computed on the same code as before, and did not run into stack overflow errors. I think we can go forward with the merge.

@akmorrow13
Copy link
Contributor

I reran my pipeline and it looks good. Thanks @fnothaft !

@heuermh
Copy link
Member

heuermh commented Dec 16, 2016

Thank you for confirming, @devin-petersohn @akmorrow13

@heuermh heuermh merged commit 3bb9736 into bigdatagenomics:master Dec 16, 2016
@heuermh
Copy link
Member

heuermh commented Dec 16, 2016

Thank you, @fnothaft

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants