-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-16339] [CORE] ScriptTransform does not print stderr when outstream is lost #13834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #61002 has finished for PR 13834 at commit
|
|
SGTM |
|
Jenkins retest this please |
|
Test build #61011 has finished for PR 13834 at commit
|
|
Jenkins retest this please |
|
Test build #61023 has finished for PR 13834 at commit
|
|
Jenkins retest this please |
|
Test build #61036 has started for PR 13834 at commit |
|
Looks like jenkins is having issues. Thanks to @srowen for triggering retests. I will try once more. |
|
Jenkins retest this please |
|
Test build #61068 has finished for PR 13834 at commit
|
|
Jenkins retest this please |
|
Test build #61103 has finished for PR 13834 at commit
|
|
I think the failure is legit. It's stuck at: So is the problem that the process never returns from waitFor until that stream is closed? |
|
@tejasapatil are you able to follow up on this one? I could do it if you're busy. |
04c8637 to
4b6adf2
Compare
|
ok to test |
|
Hm, but now the stream is not closed in the case of an exception. I think we have a Utils method for closing and logging any exception that occurs? that would let it then proceed anyway |
|
Test build #61602 has finished for PR 13834 at commit
|
|
@srowen : In case of exception, we |
|
OK, I guess that's a safe enough assumption. I guess I'm worried just because we've seen that if this doesn't happen then waitFor will hang. But I suppose there's no reason to expect it would block after destroy(). If it needs a force-kill then lots of stuff is probably going wrong. |
|
Hm, wait, don't we have a problem where a fatal error is thrown from this block? now it still waits for the process in the finally block, but will hang because the stream is not closed. If that's right, just safer to always close the stream and catch the exception? I think we have a Utils method for it. |
…ore closing the outstream
4b6adf2 to
e2277ff
Compare
| throw e | ||
| } finally { | ||
| try { | ||
| outputStream.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I mean, before this was safe in that we'd always close the stream, or try to. The problem was just that closing could throw an exception and skip printing the stderr buffer. I don't see the value in moving close() just to have to further change the method behavior to work around the move. Why not just leave it here but prevent an exception from being thrown from close?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
Test build #61755 has finished for PR 13834 at commit
|
|
Test build #61756 has finished for PR 13834 at commit
|
|
Test build #61757 has finished for PR 13834 at commit
|
|
Test build #61758 has finished for PR 13834 at commit
|
|
Jenkins re-test please |
|
Not sure what's up with that, but to be clear, I mean, isn't this just a one-line change? |
|
@srowen : I also changed the |
|
I had thought that closing the stream was sufficient (at least, that's the current assumption in the code). But OK I can see making that change for a slightly different reason, to make sure any fatal exception also terminates the process (and closes its streams, still) |
|
Jenkins retest this please (last failure was a hard JVM crash, obviously not related) |
|
Test build #61792 has finished for PR 13834 at commit
|
|
Merged to master/2.0 |
…eam is lost ## What changes were proposed in this pull request? Currently, if due to some failure, the outstream gets destroyed or closed and later `outstream.close()` leads to IOException in such case. Due to this, the `stderrBuffer` does not get logged and there is no way for users to see why the job failed. The change is to first display the stderr buffer and then try closing the outstream. ## How was this patch tested? The correct way to test this fix would be to grep the log to see if the `stderrBuffer` gets logged but I dont think having test cases which do that is a good idea. (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) … Author: Tejas Patil <tejasp@fb.com> Closes #13834 from tejasapatil/script_transform. (cherry picked from commit 5f34204) Signed-off-by: Sean Owen <sowen@cloudera.com>
What changes were proposed in this pull request?
Currently, if due to some failure, the outstream gets destroyed or closed and later
outstream.close()leads to IOException in such case. Due to this, thestderrBufferdoes not get logged and there is no way for users to see why the job failed.The change is to first display the stderr buffer and then try closing the outstream.
How was this patch tested?
The correct way to test this fix would be to grep the log to see if the
stderrBuffergets logged but I dont think having test cases which do that is a good idea.(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
…