[SPARK-20643][core] Add listener implementation to collect app state. #19383
Conversation
The initial listener is based on the existing JobProgressListener (and others), and tries to mimic their behavior as much as possible. The change also includes some minor code movement so that some types and methods from the initial history server code can be reused.

The code introduces a few mutable versions of public API types, used internally, to make it easier to update information without ugly copy methods, and also to make certain updates cheaper.

Note the code here is not 100% correct. This is meant as a building ground for the UI integration in the next milestones. As different parts of the UI are ported, fixes will be made to the different parts of this code to account for the needed behavior.

I also added annotations to API types so that Jackson is able to correctly deserialize options, sequences and maps that store primitive types.
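For illustration, here is a minimal, self-contained sketch of the "mutable live entity" pattern described above; the LiveEntity/doUpdate names echo the ones discussed in the review below, but the types and fields are stand-ins rather than the actual Spark classes:

trait KVStore {
  def write(value: Any): Unit
}

// Immutable, REST-API-style snapshot type (stand-in for the real v1 types).
case class JobData(jobId: Int, completedTasks: Int)

abstract class LiveEntity {
  // Persist an immutable snapshot of the current mutable state.
  def write(store: KVStore): Unit = store.write(doUpdate())

  // Implementations build a fresh copy of their public API type rather than
  // mutating a previously written value.
  protected def doUpdate(): Any
}

class LiveJob(val jobId: Int) extends LiveEntity {
  // Cheap in-place updates as scheduler events arrive; no copying per event.
  var completedTasks: Int = 0

  override protected def doUpdate(): Any = JobData(jobId, completedTasks)
}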
For context:
Test build #82284 has finished for PR 19383 at commit
squito left a comment
not done with the review yet but will checkpoint my concerns here. Overall looks fine.
Options options = new Options();
options.createIfMissing(!path.exists());
options.createIfMissing(true);
Just curious, did you encounter a problem w/ the previous version? Though I guess it makes sense to tell leveldb to always create if missing.
Tests generally use a temp dir for the db, using Utils.createTempDir or something like that which creates the directory for you. That would cause this to fail unless you deleted the directory first (which LevelDBSuite does), which I found a little bit annoying after a while.
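For anyone following along, a small sketch of the failure mode (assuming only the JDK; Utils.createTempDir behaves like Files.createTempDirectory in that the directory already exists when the test gets it):

import java.nio.file.Files

// The temp dir already exists when the test gets it...
val path = Files.createTempDirectory("spark-kvstore-test").toFile

// ...so the old logic amounted to createIfMissing(false),
val createIfMissing = !path.exists()   // false here

// and LevelDB then refuses to open an empty directory that contains no existing
// database, unless the test deletes the directory first. createIfMissing(true)
// sidesteps that.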
import java.util.Date

import scala.collection.JavaConverters._
unused
import org.apache.spark.status.api.v1
import org.apache.spark.storage._
import org.apache.spark.ui.SparkUI
import org.apache.spark.ui.scope._
unused
import org.apache.spark.internal.Logging
import org.apache.spark.scheduler._
import org.apache.spark.scheduler.ReplayListenerBus._
import org.apache.spark.status.KVUtils._
only need to import KVIndexParam?
I start using more stuff from that object in later changes, so might as well avoid these changes later.
In fact I'm not sure why I'm not just using open() in this class now, since it was added to KVUtils...
}

override def onJobStart(event: SparkListenerJobStart): Unit = {
  // Compute (a potential underestimate of) the number of tasks that will be run by this job.
I realize you are copying this comment, but it seems wrong. It's a potential under-estimate of the job progress, but a potential over-estimate of the number of tasks that will be run (a stage that ends up skipped still contributes its tasks to this count, even though they never execute). I looked at the referenced PR, and I think it agrees with that understanding -- the PR description says "If a job contains stages that aren't run, then its overall job progress bar may be an underestimate of the total job progress".
  case JobFailed(_) => JobExecutionStatus.FAILED
}

job.completionTime = if (event.time != -1) Some(new Date(event.time)) else None
same here on the time filters
val skipped = !event.stageInfo.submissionTime.isDefined
stage.status = event.stageInfo.failureReason match {
  case Some(_) => v1.StageStatus.FAILED
  case None => if (skipped) v1.StageStatus.SKIPPED else v1.StageStatus.COMPLETE
It's slightly confusing that skipped actually doesn't indicate skipped. Maybe rename it to hasSubmissionTime (with a corresponding change in logic)? Or even include it directly in the match, something like
stage.status = event.stageInfo.failureReason match {
case Some(_) => v1.StageStatus.FAILED
case None if event.stageInfo.submissionTime.isDefined => v1.StageStatus.COMPLETE
case _ => v1.StageStatus.SKIPPED
}

override def onBlockUpdated(event: SparkListenerBlockUpdated): Unit = {
  event.blockUpdatedInfo.blockId match {
    case block: RDDBlockId => updateRDDBlock(event, block)
    case _ => // TODO: API only covers RDD storage. UI might need shuffle storage too.
I actually don't think shuffle blocks ever get reported via SparkListenerBlockUpdated (might be wrong about this).
There will be updates for Broadcast blocks, though I think those are also ignored in the UI.
So in that PR you handle StreamBlocks, but my point is that the comment about shuffle storage is wrong.
new StageInfo(1, 0, "stage1", 4, Nil, Nil, "details1"),
new StageInfo(2, 0, "stage2", 4, Nil, Seq(1), "details2"))

val stageProps = new Properties()
These properties should also get passed to onJobStart, and there should be an assert on job.info.jobGroup.
(It should probably also have the scheduler pool, but that's missing in the current code, so that can be kept separate...)
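A rough sketch of the suggested addition, assuming the suite's existing listener, time, stages, and check fixtures, the "spark.jobGroup.id" property key, and the JobDataWrapper type from this PR (values are illustrative):

val jobProps = new Properties()
jobProps.setProperty("spark.jobGroup.id", "jobGroup1")

listener.onJobStart(SparkListenerJobStart(1, time, stages, jobProps))

check[JobDataWrapper](1) { job =>
  assert(job.info.jobGroup === Some("jobGroup1"))
}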
check[StageDataWrapper](key(stages.head)) { stage =>
  assert(stage.info.status === v1.StageStatus.ACTIVE)
  assert(stage.info.submissionTime === Some(new Date(stages.head.submissionTime.get)))
assert on stage.info.schedulingPool.
(should probably also have jobGroup, but again, current code doesn't have it, so can be separate.)
Test build #82333 has finished for PR 19383 at commit
Test build #82390 has finished for PR 19383 at commit
private var coresPerTask: Int = 1

// Keep track of live entities, so that task metrics can be efficiently updated (without
// causing too many writes to the underlying store, and other expensive operations).
When do we update metrics to disk?
When the SHS starts writing UI data to disk (starting with vanzin#43). But even writing to an in-memory store can have non-trivial overhead (e.g. resizing a large hash table).
  store.write(doUpdate())
}

protected def doUpdate(): Any
Can you add documentation? It seems all implementations do a copy instead of an update.
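One possible doc comment, sketching the intended semantics rather than prescribing final wording:

/**
 * Returns an immutable snapshot of this entity, suitable for writing to the store.
 * Implementations are expected to build a fresh copy of the corresponding public
 * API type rather than updating a previously written value in place.
 */
protected def doUpdate(): Any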
 * A Spark listener that writes application information to a data store. The types written to the
 * store are defined in the `storeTypes.scala` file and are based on the public REST API.
 */
private class AppStatusListener(kvstore: KVStore) extends SparkListener with Logging {
where do we use it?
Test build #82890 has finished for PR 19383 at commit
val dbPath = new File(path, "listing.ldb")
val metadata = new FsHistoryProviderMetadata(CURRENT_LISTING_VERSION, logDir.toString())

def openDB(): LevelDB = new LevelDB(dbPath, new KVStoreScalaSerializer())
unused now
val schedulingPool = Option(event.properties).flatMap { p =>
  Option(p.getProperty("spark.scheduler.pool"))
}.getOrElse(SparkUI.DEFAULT_POOL_NAME)
You actually need to set the scheduling pool in onStageSubmitted. If the stage is shared by multiple jobs with different pools, this will just use the scheduling pool of the job that was submitted last, rather than the one actually used when the stage is submitted. onStageSubmitted has a handle on the properties of the submitting job, so that should be easy.
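A rough sketch of that change, reusing the names from the quoted snippet above (the rest of the handler body is omitted):

override def onStageSubmitted(event: SparkListenerStageSubmitted): Unit = {
  // The submitting job's properties travel with the stage-submitted event,
  // so the pool can be read here instead of in onJobStart.
  val pool = Option(event.properties)
    .flatMap { p => Option(p.getProperty("spark.scheduler.pool")) }
    .getOrElse(SparkUI.DEFAULT_POOL_NAME)
  // ... record `pool` on the live stage here ...
}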
override def onTaskGettingResult(event: SparkListenerTaskGettingResult): Unit = {
  liveTasks.get(event.taskInfo.taskId).foreach { task =>
    update(task)
What's the point of doing this? Won't you already have this update written?
Hmm, I thought I needed to handle this event later on, but looks like I don't, so it can go away.
Actually, this records the gettingResultTime value in the underlying store (that value is part of the mutable TaskInfo that the LiveTask instance references).
// completion event is for. Let's just drop it here. This means we might have some speculation
// tasks on the web ui that are never marked as complete.
if (event.taskInfo == null || event.stageAttemptId == -1) {
  return
Should you still do liveTasks.remove(event.taskInfo.taskId) even if event.stageAttemptId == -1?
I'm a little skeptical that this can really happen at all, but this is the exact behavior of JobProgressListener.
But I guess we can let the rest of the code run, at worst it won't do anything bad because it can't find a matching stage.
ok that makes sense -- I also couldn't see how this would happen but figured maybe that case was there for a reason.
stage.jobs.foreach { job =>
  stage.status match {
    case v1.StageStatus.COMPLETE =>
      job.completedStages = job.completedStages + event.stageInfo.stageId
nit: job.completedStages += event.stageInfo.stageId
maybeExec.foreach { exec =>
  if (exec.rddBlocks + rddBlocksDelta > 0) {
    val dist = rdd.distribution(exec)
    dist.memoryRemaining = newValue(dist.memoryRemaining, -memoryDelta)
Memory remaining (and the on- and off-heap breakdown) looks wrong to me.
LiveRDDDistribution is local to one LiveRDD, but you will have partitions from multiple RDDs stored on one executor, so the memory remaining needs to take into account all of the RDDs on that executor. I don't think you can keep this value precomputed -- an update to a totally different RDD would change it (unless on every block update, you update the value for all RDDs stored on that executor).
The current UI handles this by storing it in the StorageStatusListener by executor, and populating this info in every request for the RDD info.
There's a bunch of fixes to this part in vanzin#46. As mentioned in the description, this is not expected to be 100% correct, and fixes will be made as individual pages are changed to use this data.
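To make the concern above concrete, a sketch with simplified stand-in types (not the PR's classes): the remaining memory is a per-executor quantity, so it could be derived from a single per-executor record at read time instead of being cached inside each RDD's distribution.

// Simplified stand-ins; the actual fix is expected in vanzin#46.
case class ExecutorStorage(maxMemory: Long, memoryUsedByAllRdds: Long)

// Computed per executor when RDD info is requested, rather than cached
// inside each LiveRDDDistribution.
def memoryRemaining(exec: ExecutorStorage): Long =
  exec.maxMemory - exec.memoryUsedByAllRdds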
assert(execs.size > 0)
execs.foreach { exec =>
  assert(exec.info.memoryBytesSpilled === s1Tasks.size / 2)
}
nit -- this section on execs doesn't belong inside check[StageDataWrapper](key(stages.head))
assert(exec.info.memoryUsed === 3L)
assert(exec.info.diskUsed === 3L)
}
Add a test for memoryRemaining with multiple RDDs on the same executor.
We can add those in vanzin#46 when this part of the listener is fixed.
Test build #83024 has finished for PR 19383 at commit
lgtm
btw, for any other potential reviewers: I'm already reviewing the rest of marcelo's commits in this project (the PRs against his own repo here: https://github.com/vanzin/spark/pulls). In general I'm just finding small things and have enough paged in that I expect to be able to review the rest of these changes quickly.
merged to master. thanks @vanzin
}

private def update(entity: LiveEntity): Unit = {
  entity.write(kvstore)
It seems update is called very frequently, almost once for each event. Does that mean we flush data to disk very frequently too?
This will be tweaked in following PRs.
val memoryUsed: Long,
val memoryRemaining: Long,
val diskUsed: Long,
@JsonDeserialize(contentAs = classOf[JLong])
what does this mean?
See the Jackson documentation. It tells Jackson to deserialize the contents of a container as a specific type.
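A small, self-contained illustration of what the annotation solves (not Spark code): the element type of an Option or Seq is erased at runtime, so without the hint Jackson's Scala module may materialize a numeric element as a boxed Integer, which then fails when read back as a Long.

import java.lang.{Long => JLong}
import com.fasterxml.jackson.databind.annotation.JsonDeserialize

class ExampleMetrics(
    // Tell Jackson the value inside the Option is a java.lang.Long, rather than
    // letting it guess for the erased type parameter.
    @JsonDeserialize(contentAs = classOf[JLong])
    val firstLaunchTime: Option[Long],
    // Same idea for the elements of a sequence.
    @JsonDeserialize(contentAs = classOf[JLong])
    val bytesPerStage: Seq[Long])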
// Exclude rules for 2.3.x
lazy val v23excludes = v22excludes ++ Seq(
  // SPARK-18085: Better History Server scalability for many / large applications
  ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.status.api.v1.ExecutorSummary.executorLogs"),
What's going on here? I don't see this PR touching the code of ExecutorSummary.
I removed an import and that changed the type of one of the fields.
Ah, I see: -import scala.collection.Map. But is it really necessary to break compatibility?
Well, it doesn't break compatibility because this class is not to be used outside Spark (it's only public so mima can detect breakages in the JSON format, and this doesn't change that).
Leaving the old import makes using this class really awkward from other places that don't have that import.