-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-6766][Streaming] Fix issue about StreamingListenerBatchSubmitted and StreamingListenerBatchStarted #5414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…reamingListenerBatchStarted.batchInfo.processingStartTime; fix a typo
|
cc @tdas |
|
Test build #29849 has started for PR 5414 at commit |
|
Good catch, LGTM :). |
|
Test build #29849 has finished for PR 5414 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. OMG!
|
This looks good to me at first glance, but I am concerned that such a thing went unnoticed for so long despite unit tests in StreamingListenerSuite. Let me think whether we can improve the tests. If so, we definitely should add tests as part of this PR so that such a thing cannot happen again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can be refactored to dedup lines 139 and 144.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is hard to dedup cause of ordering call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, handleJobStart will set jobSet.hasStarted to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about
val isFirstJobOfJobSet = !jobSet.hasStarted
jobSet.handleJobStart(job)
if (isFirstJobOfJobSet) {
listenerBus.post(StreamingListenerBatchStarted(jobSet.toBatchInfo))
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Updated.
|
Overall, I realized that there is not tests for StreamingJobProgressListener, which is my bad. :( We really need to add some tests for the StreamingJobProgressListener which tests some basic aspects like this. Could you add a |
OK. I will add some basic tests for StreamingJobProgressListener |
|
Added some tests to StreamingListenerSuite for this issue, and added StreamingJobProgressListenerSuite for some basic tests of StreamingJobProgressListener. |
|
Test build #29935 has started for PR 5414 at commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How long does these two units tests take? I wonder whether it is best (in the interest of minimizing test run times) to collapse all these two unit tests into the earlier "batch info reporting".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[info] StreamingListenerSuite:
[info] - batch info reporting (4 seconds, 976 milliseconds)
[info] - receiver info reporting (847 milliseconds)
[info] - SPARK-6766: batch info should be submitted (667 milliseconds)
[info] - SPARK-6766: processingStartTime of batch info should not be None when starting (691 milliseconds)
[info] ScalaTest
[info] Run completed in 8 seconds, 996 milliseconds.
[info] Total number of tests run: 4
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 4, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was my last test result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, each of the two new unit tests is taking .7 seconds. Not good. Best to merge them into the first unit test.
The reason its taking time is that its starting a new SparkContext every time, and running batches after that.
|
Test build #29935 timed out for PR 5414 at commit |
|
Test FAILed. |
|
retest this please |
|
Test build #29937 has started for PR 5414 at commit |
|
Test build #29937 has finished for PR 5414 at commit
|
|
Test PASSed. |
|
Test build #29985 has started for PR 5414 at commit |
|
Collapsed the new unit tests together. |
|
Test build #29985 has finished for PR 5414 at commit
|
|
Test PASSed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this line? Run streams should have already stopped the ssc, isnt it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that I misread runStreams. This line is not necessary.
|
Just a few minor comments, otherwise LGTM. Can merge as soon as you update these. |
|
Test build #30019 has started for PR 5414 at commit |
|
The concerned testsuites have passed, I am merging this to master and 1.3 branch. Thanks! |
|
I merged it in master, but could not merge it to 1.3 because of conflicts. Could you open another PR to branch 1.3? Would be good to have these bug fixes in 1.3.x |
|
Test build #30019 has finished for PR 5414 at commit
|
|
Test PASSed. |
…ed and StreamingListenerBatchStarted (backport to branch 1.3) Backport SPARK-6766 #5414 to branch 1.3 Conflicts: streaming/src/main/scala/org/apache/spark/streaming/ui/StreamingJobProgressListener.scala Author: zsxwing <zsxwing@gmail.com> Closes #5452 from zsxwing/SPARK-6766-branch-1.3 and squashes the following commits: cb87e44 [zsxwing] [SPARK-6766][Streaming] Fix issue about StreamingListenerBatchSubmitted and StreamingListenerBatchStarted (backport to branch 1.3)
…tches" lists to StreamingPage This PR adds two lists, `Active Batches` and `Completed Batches`. Here is the screenshot:  Due to [SPARK-6766](https://issues.apache.org/jira/browse/SPARK-6766), I need to merge #5414 in my local machine to get the above screenshot. Author: zsxwing <zsxwing@gmail.com> Closes #5434 from zsxwing/SPARK-6796 and squashes the following commits: be50fc6 [zsxwing] Fix the code style 51b792e [zsxwing] Fix the unit test 6f3078e [zsxwing] Make 'startTime' readable f40e0a9 [zsxwing] Merge branch 'master' into SPARK-6796 2525336 [zsxwing] Rename 'Processed batches' and 'Waiting batches' and also add links a69c091 [zsxwing] Show the number of total completed batches too a12ad7b [zsxwing] Change 'records' to 'events' in the UI 86b5e7f [zsxwing] Make BatchTableBase abstract b248787 [zsxwing] Add tests to verify the new tables d18ab7d [zsxwing] Fix the code style 6ceffb3 [zsxwing] Add "Active Batches" and "Completed Batches" lists to StreamingPage
This PR includes:
StreamingListenerBatchSubmittedwhenJobSetis submittedStreamingListenerBatchStarted.batchInfo.processingStartTimecompletedaBatchInfos->completedBatchInfos