[SPARK-6870][Yarn] Catch InterruptedException when yarn application state monitor thread been interrupted #5479
Test build #30112 has finished for PR 5479 at commit
Duplicate of #5451 so it would have been better to collaborate on that rather than open a new PR. However, this is closer to the right fix, so maybe we can converge on this PR.
Why is Thread.currentThread().interrupt() called here? I thought that would only be done to preserve the interrupt state, but then it should only happen in the catch block, right? Nothing else waits on this thread, and it terminates anyway in the non-interrupted code path.
Also, is it correct not to stop the SparkContext in this case?
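To illustrate the point about preserving interrupt state, here is a minimal standalone sketch, not the code in this PR. On the JVM, Thread.sleep clears the interrupt flag when it throws InterruptedException, so re-asserting the flag only matters inside the catch block, and only if some later code inspects it. The class and method names (InterruptFlagSketch, sleepAndReportFlag, runInterrupted) are hypothetical.

```java
// Hypothetical sketch: shows what re-asserting the interrupt flag in a catch block does.
final class InterruptFlagSketch {

    // Sleeps briefly; returns whether the current thread's interrupt flag is set afterwards.
    // When restore is true, the flag is re-asserted inside the catch block.
    static boolean sleepAndReportFlag(boolean restore) {
        try {
            Thread.sleep(1_000);
        } catch (InterruptedException e) {
            // Throwing InterruptedException cleared the flag; restore it only on request.
            if (restore) {
                Thread.currentThread().interrupt();
            }
        }
        return Thread.currentThread().isInterrupted();
    }

    // Runs sleepAndReportFlag on a fresh thread, interrupts it, and returns the observed flag.
    static boolean runInterrupted(boolean restore) throws InterruptedException {
        boolean[] result = new boolean[1];
        Thread worker = new Thread(() -> result[0] = sleepAndReportFlag(restore));
        worker.start();
        worker.interrupt();
        worker.join(); // join makes the worker's write to result visible here
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("flag without restore: " + runInterrupted(false)); // false
        System.out.println("flag with restore:    " + runInterrupted(true));  // true
    }
}
```

If the thread exits right after the catch block and nothing joins on it to check its interrupt status, restoring the flag has no observable effect, which is the concern raised above.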
Yes, we don't need to call Thread.currentThread().interrupt() here, but I think we do need to stop the SparkContext. If the user kills the app on YARN, then we need to stop the SparkContext, right?
Yes, it seems reasonable to stop the context. Should that happen in a finally block so that it happens even if interrupted, or is the interrupted case one where we assume it's already shutting down?
We interrupt the monitor thread when we call stop(), so we don't need to call sc.stop() again. We added sc.stop() after client.monitorApplication returns only to make sure the SparkContext is stopped when the app has finished, failed, or been killed before we stop the SparkContext ourselves.
OK so the net change here is to
- log the interruption at info, instead of throwing the exception. Either way, the thread finishes immediately after.
- remove the spurious interrupt() call here
LGTM
Test build #30129 has finished for PR 5479 at commit
In PR #5305 we interrupt the monitor thread but forgot to catch the InterruptedException, so its stack trace is printed in the log. We need to catch it.
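The shape of the fix can be sketched roughly as follows. This is a standalone approximation, not the Spark code itself: BlockingMonitor is a hypothetical stand-in for the blocking client.monitorApplication call, stopContext stands in for sc.stop(), and the log message is invented for illustration. The monitor thread stops the context only when the blocking call returns on its own, and treats an interrupt as an expected shutdown signal that is logged at info level.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

// Hypothetical sketch of a monitor thread that tolerates being interrupted.
final class MonitorThreadSketch {
    private static final Logger LOG = Logger.getLogger("MonitorThreadSketch");

    // Assumed stand-in for a blocking call such as client.monitorApplication.
    interface BlockingMonitor {
        void awaitFinish() throws InterruptedException;
    }

    static Thread newMonitorThread(BlockingMonitor monitor, Runnable stopContext) {
        Thread t = new Thread(() -> {
            try {
                monitor.awaitFinish();
                // Reached only when the application finished, failed, or was killed by itself.
                stopContext.run();
            } catch (InterruptedException e) {
                // Expected when stop() interrupts this thread: log it, do not rethrow.
                LOG.info("Monitor thread interrupted, exiting");
            }
        }, "monitor-sketch");
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicBoolean stopped = new AtomicBoolean(false);
        Thread t = newMonitorThread(() -> Thread.sleep(60_000), () -> stopped.set(true));
        t.start();
        t.interrupt(); // simulates stop() interrupting the monitor thread
        t.join();
        System.out.println("stopContext called: " + stopped.get()); // false
    }
}
```

Because the interrupt comes from the stop path itself, the catch block does not call stopContext again and does not re-assert the interrupt flag; the thread simply ends.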