[SPARK-4433] fix a racing condition in zipWithIndex #3291

mengxr · 2014-11-16T04:43:56Z

Spark hangs with the following code:

sc.parallelize(1 to 10).zipWithIndex.repartition(10).count()

This happens because ZippedWithIndexRDD triggers a job in getPartitions, which causes a deadlock in DAGScheduler.getPreferredLocs (a synchronized method). The fix is to compute startIndices during construction.

This should be applied to branch-1.0, branch-1.1, and branch-1.2.
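
A minimal sketch of what "compute startIndices during construction" amounts to, using plain Scala collections instead of Spark. The object name `StartIndicesSketch`, both method names, and the `Seq[Seq[T]]` model of partitions are assumptions made for illustration; this is not the code in the patch. The start index of each partition is the running total of the sizes of all earlier partitions, so the last partition's size is never needed.

```scala
object StartIndicesSketch {
  // Hypothetical stand-in for an RDD: each inner Seq models one partition.
  def startIndices[T](partitions: Seq[Seq[T]]): Array[Long] = {
    if (partitions.isEmpty) Array.empty[Long]
    else {
      // Sizes of all partitions except the last, accumulated into offsets.
      partitions.init.map(_.size.toLong).scanLeft(0L)(_ + _).toArray
    }
  }

  // Pair every element with its global index, partition by partition.
  def zipWithIndex[T](partitions: Seq[Seq[T]]): Seq[Seq[(T, Long)]] = {
    // Computed once, up front, before any partition is processed.
    val starts = startIndices(partitions)
    partitions.zipWithIndex.map { case (part, split) =>
      part.zipWithIndex.map { case (item, localIndex) =>
        (item, starts(split) + localIndex)
      }
    }
  }
}
```

In Spark the sizes have to be obtained by running a job, so doing this up front moves that job from getPartitions to the point where the RDD is created.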

@pwendell

SparkQA · 2014-11-16T04:49:59Z

Test build #23433 has started for PR 3291 at commit c284d9f.

This patch merges cleanly.

SparkQA · 2014-11-16T06:15:02Z

Test build #23433 has finished for PR 3291 at commit c284d9f.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2014-11-16T06:15:06Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23433/

rxin · 2014-11-16T06:28:14Z

This makes the whole thing eager, doesn't it?

mengxr · 2014-11-17T07:59:34Z

One way to do this lazily is via shuffle:

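      // p is the number of partitions of this RDD (defined outside this snippet).
      // Keys are target partition ids, so each key maps to the partition of the same id.
      val identityPartitioner = new Partitioner {
        override def numPartitions: Int = p
        override def getPartition(key: Any): Int = key.asInstanceOf[Int]
      }
      val startIndices = PartitionPruningRDD.create(this, _ < p - 1) // skip the last partition
        .mapPartitionsWithIndex { (split, iter) =>
          val size = Utils.getIteratorSize(iter)
          // Send this partition's size to every later partition.
          Iterator.range(split + 1, p).map { i =>
            (i, size)
          }
        }.reduceByKey(identityPartitioner, _ + _) // sum of sizes of all earlier partitions
        .values
      this.zipPartitions(startIndices) { (iter, startIndexIter) =>
        // Partition 0 receives no sizes, so its start index defaults to 0.
        val startIndex = if (startIndexIter.hasNext) startIndexIter.next() else 0L
        iter.zipWithIndex.map { case (item, localIndex) =>
          (item, startIndex + localIndex)
        }
      }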

But I think this is more expensive.
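
To make the cost concrete, here is a rough plain-Scala simulation of the shuffle step, assuming the partition sizes are already known. `ShuffleOffsetsSketch`, `emitted`, and `offsets` are hypothetical names, and `groupBy` stands in for `reduceByKey`. Each partition except the last emits its size once per later partition, so p partitions emit p(p-1)/2 records before the reduce.

```scala
object ShuffleOffsetsSketch {
  // Records each partition would send: (target partition id, size of the sender).
  def emitted(sizes: Seq[Long]): Seq[(Int, Long)] = {
    val p = sizes.length
    sizes.zipWithIndex
      .filter { case (_, split) => split < p - 1 } // skip the last partition
      .flatMap { case (size, split) => (split + 1 until p).map(i => (i, size)) }
  }

  // Sum the received sizes per target partition; partition 0 receives nothing.
  def offsets(sizes: Seq[Long]): Map[Int, Long] =
    emitted(sizes)
      .groupBy { case (target, _) => target }
      .map { case (target, pairs) => target -> pairs.map(_._2).sum }
}
```

A partition that has no entry in the result (only partition 0) starts at index 0, matching the `else 0L` branch in the snippet above.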

mengxr · 2014-11-18T18:38:14Z

@rxin Does it look good to you? I hope that this fix can get into 1.1.1.

rxin · 2014-11-18T21:49:36Z

lgtm

mengxr · 2014-11-19T00:28:38Z

Thanks! Merged into master, branch-1.2, 1.1, and 1.0.

Spark hangs with the following code:

~~~
sc.parallelize(1 to 10).zipWithIndex.repartition(10).count()
~~~

This is because ZippedWithIndexRDD triggers a job in getPartitions and it causes a deadlock in DAGScheduler.getPreferredLocs (synced). The fix is to compute `startIndices` during construction.

This should be applied to branch-1.0, branch-1.1, and branch-1.2.

pwendell

Author: Xiangrui Meng <meng@databricks.com>

Closes apache#3291 from mengxr/SPARK-4433 and squashes the following commits:

c284d9f [Xiangrui Meng] fix a racing condition in zipWithIndex

Two further commits reference this pull request with the same message as above, plus the trailer:

(cherry picked from commit bb46046)
Signed-off-by: Xiangrui Meng <meng@databricks.com>

andrewor14 · 2014-11-19T20:37:16Z

Hey @mengxr can you close this

fix a racing condition in zipWithIndex

c284d9f

mengxr closed this Nov 20, 2014
