-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-9924] [WEB UI] Don't schedule checkForLogs while some of them are already running. #8153
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…are already running.
|
ok to test @vanzin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can avoid all mutability with
val tasks: Seq[Future[_]] = logInfos.grouped(20).map{ batch =>
replayExecutor.submit(new Runnable {
override def run(): Unit = mergeApplicationListing(batch)
})
}(I know the change to grouped is unrelated, but everytime I look at this code it confuses me for a second why we have overlapping sliding windows -- might as well as clean it up while you are messing around here)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I am a Scala noob - I did it the Java way. :-)
Your suggestion looks much cleaner - I have updated the PR.
|
minor style comment, otherwise makes sense to me |
|
Test build #40716 has finished for PR 8153 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're talking about cleaning this up, you could do this also:
logInfos.grouped(20)
.map { batch =>
replayExecutor.submit(new Runnable {
override def run(): Unit = mergeApplicationListing(batch)
})
}
.foreach { task =>
// Wait for all tasks to finish. This makes sure that checkForLogs is
// not scheduled again while some tasks are already running in the
// replayExecutor.
try {
task.get()
} catch {
case e: InterruptedException =>
throw e
case e: Exception =>
logWarning("Error replaying logs.", e)
}
}
Note I added some missing exception handling, which would cause you to revert to the old behavior of piling up executions if an error happened.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I have updated the PR to add exception handling. I have modified the log message because errors while replaying event logs are already caught in the mergeApplicationListing method.
|
In a way I think the underlying issue is more a problem with the aggressive default polling interval (10 seconds?). But this is a way to make it better. I think in the (not so distant) future we should investigate using the recently added inotify-like API in HDFS, to see whether it helps us avoid polling altogether. |
|
Test build #40782 has finished for PR 8153 at commit
|
|
Jenkins, retest this please |
|
Test build #40805 timed out for PR 8153 at commit |
…behavior of piling up executions.
|
Test build #40857 has finished for PR 8153 at commit
|
|
Jenkins, retest this please |
|
LGTM. |
|
Test build #40900 timed out for PR 8153 at commit |
|
retest this please |
|
Test build #40933 has finished for PR 8153 at commit
|
|
Can we retest this please? |
|
Test build #1625 timed out for PR 8153 at commit |
|
retest this please |
|
Test build #40975 timed out for PR 8153 at commit |
|
As usual, flaky tests in other unrelated modules. I'll just give up on jenkins and merge this Monday morning. |
|
Merged to master, thanks! |
…are already running. Author: Rohit Agarwal <rohita@qubole.com> Closes apache#8153 from mindprince/SPARK-9924.
No description provided.