
Could not compute split [from JIRA] #1

@tsface

Description


SPARK-12306

https://issues.apache.org/jira/browse/SPARK-12306

Currently in Spark Streaming, a Receiver stores the received data in a BlockManager, and the data is later consumed by a BlockRDD. If that BlockManager is lost because of some failure, the BlockRDD throws a SparkException saying "Could not compute split, block not found".
In most cases this is the right thing to do. However, in a streaming scenario that can tolerate small amounts of data loss, silently moving on instead of throwing an exception may be preferable.
This issue proposes adding a spark.streaming.ignoreBlockNotFound option, defaulting to false, that controls whether to throw an exception or simply move on when a block is not found.
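A minimal sketch of the lookup such a flag would change, written as a standalone helper. The conf key comes from the proposal above, but this helper, its name, and the flag parameter are hypothetical; the real BlockRDD.compute always throws:

```scala
import scala.reflect.ClassTag
import org.apache.spark.{SparkEnv, SparkException}
import org.apache.spark.storage.BlockId

// Hypothetical helper mirroring what BlockRDD.compute does, extended with the
// proposed ignoreBlockNotFound behavior.
def getBlockOrSkip[T: ClassTag](blockId: BlockId, ignoreBlockNotFound: Boolean): Iterator[T] =
  SparkEnv.get.blockManager.get[T](blockId) match {
    case Some(result) =>
      result.data.asInstanceOf[Iterator[T]]
    case None if ignoreBlockNotFound =>
      // Proposed behavior: tolerate small data loss by returning an empty
      // partition instead of failing the whole job.
      Iterator.empty
    case None =>
      throw new SparkException(s"Could not compute split, block $blockId not found")
  }
```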

SPARK-5001

https://issues.apache.org/jira/browse/SPARK-5001

I counted messages using the Kafka input stream of Spark 1.1.1. The test app failed when a later batch job completed sooner than the previous one. In the source code, BlockRDDs older than (time - rememberDuration) are removed in clearMetadata after a job completes, so the still-running previous job aborts with "block not found". The relevant log is as follows:
2014-12-25 14:07:12(Logging.scala:59)[sparkDriver-akka.actor.default-dispatcher-14] INFO :Starting job streaming job 1419487632000 ms.0 from job set of time 1419487632000 ms
2014-12-25 14:07:15(Logging.scala:59)[sparkDriver-akka.actor.default-dispatcher-14] INFO :Starting job streaming job 1419487635000 ms.0 from job set of time 1419487635000 ms
2014-12-25 14:07:15(Logging.scala:59)[sparkDriver-akka.actor.default-dispatcher-15] INFO :Finished job streaming job 1419487635000 ms.0 from job set of time 1419487635000 ms
2014-12-25 14:07:15(Logging.scala:59)[sparkDriver-akka.actor.default-dispatcher-16] INFO :Removing blocks of RDD BlockRDD[3028] at createStream at TestKafka.java:144 of time 1419487635000 ms from DStream clearMetadata
java.lang.Exception: Could not compute split, block input-0-1419487631400 not found for 3028
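A common workaround is to keep generated RDDs and their blocks around longer than the default, so a slow batch can still read its blocks before clearMetadata drops them. A minimal sketch, assuming a typical StreamingContext setup (the app name and intervals are placeholders):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, Seconds, StreamingContext}

val conf = new SparkConf().setAppName("KafkaCounter")
val ssc = new StreamingContext(conf, Seconds(3))

// Retain each batch's RDDs (and thus their blocks) for 5 minutes instead of
// the default, so an out-of-order batch completion cannot clean up blocks a
// still-running job needs.
ssc.remember(Minutes(5))
```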

SPARK-10898

https://issues.apache.org/jira/browse/SPARK-10898

Setting spark.streaming.concurrentJobs causes blocks to be deleted before they are read.
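For context, spark.streaming.concurrentJobs is the (undocumented) setting that lets multiple streaming jobs run at once, which opens the cleanup race described under SPARK-5001 above; a sketch of how it is typically enabled:

```scala
import org.apache.spark.SparkConf

// With more than one concurrent job, a later batch can finish first, and its
// metadata cleanup can delete blocks that an earlier, still-running batch
// has not yet read. The app name here is a placeholder.
val conf = new SparkConf()
  .setAppName("ConcurrentStreamingApp")
  .set("spark.streaming.concurrentJobs", "2")  // default is "1"
```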

SPARK-10210

https://issues.apache.org/jira/browse/SPARK-10210

When the write ahead log is not enabled, a recovered streaming driver still tries to run jobs using pre-failure block ids, and fails because those blocks no longer exist in memory (and cannot be recovered, since the receiver WAL is not enabled).
This occurs because the driver-side WAL of ReceivedBlockTracker recovers the past block information, and ReceiverInputDStream creates BlockRDDs even for blocks that no longer exist.
The solution is to filter out block ids that do not exist before creating the BlockRDD, as sketched below.
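A sketch of that filtering step, loosely modeled on ReceiverInputDStream.createBlockRDD. It is written as it would sit inside Spark's streaming internals (BlockRDD is private[spark], so this is not callable from user code), and the helper name and the BlockManagerMaster.contains call are assumptions rather than a verbatim copy of the fix:

```scala
import scala.reflect.ClassTag
import org.apache.spark.{SparkContext, SparkEnv}
import org.apache.spark.rdd.BlockRDD
import org.apache.spark.storage.BlockId

// Illustrative: drop block ids the BlockManager master no longer knows about
// before building the BlockRDD, so recovery does not create partitions that
// can never be computed.
def createFilteredBlockRDD[T: ClassTag](sc: SparkContext, blockIds: Array[BlockId]): BlockRDD[T] = {
  val master = SparkEnv.get.blockManager.master
  val validBlockIds = blockIds.filter(id => master.contains(id))
  if (validBlockIds.length != blockIds.length) {
    // Some pre-failure blocks are gone for good; without the receiver WAL
    // (spark.streaming.receiver.writeAheadLog.enable) that data is lost.
    System.err.println(s"WARN: ${blockIds.length - validBlockIds.length} block(s) not recovered")
  }
  new BlockRDD[T](sc, validBlockIds)
}
```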
