-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-19330. S3A: Add LeakReporter; use in S3AInputStream #7151
HADOOP-19330. S3A: Add LeakReporter; use in S3AInputStream #7151
Conversation
S3AInputStream.finalizer to - abort any active HTTP connection. - warn if there was one - warning to include: path, and stack+thread ID of creation. The warning message is logged at WARN to log org.apache.hadoop.fs.s3a.connection.leaks set to a lower log level to hide these messages This is a best-effort recovery from stream leaks. Cleanup will *only* take place during GC; it is easy to run out of connections without running out of memory. Change-Id: I162c347ce292a5e7e053a03342bf69b7bca17b11
what the log looks like S3AInputStream
new warning log
|
💔 -1 overall
This message was automatically generated. |
Change-Id: Id7808e1e178c50cfa6e201d6b28a8827573696a7
This is ready for review but a tricky one to test. The test does work but it took me awhile to realise that we are finalizers our run in a separate thread and you need to put a sleep in so the captured log does not to come back empty. Anyone who understands GC and finalizers is very welcome to look at this and provide feedback. @mukund-thakur @saikatroy038 @HarshitGupta11 @ahmarsuhail @shameersss1 |
🎊 +1 overall
This message was automatically generated. |
I did just write a marginal optimisation to only build the stack trace if the log was <= warn but haven't committed it. It just wasn't testable and I did not want to create the risk of an NPE for marginal speedups. These objects are talking over HTTPS to object storage; the overhead of stack treewalk is insignifcant. |
Pull out leak reporting into hadoop-common class org.apache.hadoop.fs.impl.LeakReporter implements Closeable Which takes -message to print -predicate for being open -RunnableRaisingIOE (new class) for the close action. + to string, re-entrancy guard, catch and swallow of failures... Has a hadoop-common unit test. Change-Id: Iaee47fc26e674d29163b6ffc7b598187e44a7675
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a good understanding of finalizers or GC, but the code change itself looks good to me :)
@ahmarsuhail thanks. my last commit actually pulls out everything into a hadoop-common |
💔 -1 overall
This message was automatically generated. |
Add a Stream statistic for this with * Capability of FS (if !prefetching) and stream itself (tested) * statistic collected in stream and merged with FS (tested) * test tuning (with failing test fixed) * doc review Change-Id: I747a9477e0571d729c81e6f17dac127ffb6ade60
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1 pending yetus
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3AInputStreamLeakage.java
Show resolved
Hide resolved
...p-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/impl/TestLeakReporter.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/connecting.md
Show resolved
Hide resolved
Style Change-Id: I5ecc157500bfb84d94a40d712df45994b7dddb3b
🎊 +1 overall
This message was automatically generated. |
If a file is opened for reading through the S3A connector is not closed, then when garbage collection takes place * An error message is reported at WARN, including the file name. * A stack trace of where the stream was created is reported at INFO. * A best-effort attempt is made to release any active HTTPS connection. * The filesystem IOStatistic stream_leaks is incremented. The intent is to make it easier to identify where streams are being opened and not closed -as these consume resources including often HTTPS connections from the connection pool of limited size. It MUST NOT be relied on as a way to clean up open files/streams automatically; some of the normal actions of the close() method are omitted. Instead: view the warning messages and IOStatistics as a sign of a problem, the stack trace as a way of identifying what application code/library needs to be investigated. Contributed by Steve Loughran Change-Id: I58f1d8774913ad890189125ee2510ea0e4bf6edf
…7160) If a file is opened for reading through the S3A connector is not closed, then when garbage collection takes place * An error message is reported at WARN, including the file name. * A stack trace of where the stream was created is reported at INFO. * A best-effort attempt is made to release any active HTTPS connection. * The filesystem IOStatistic stream_leaks is incremented. The intent is to make it easier to identify where streams are being opened and not closed -as these consume resources including often HTTPS connections from the connection pool of limited size. It MUST NOT be relied on as a way to clean up open files/streams automatically; some of the normal actions of the close() method are omitted. Instead: view the warning messages and IOStatistics as a sign of a problem, the stack trace as a way of identifying what application code/library needs to be investigated. Contributed by Steve Loughran
S3AInputStream.finalizer() to
The warning message is logged at WARN to log
org.apache.hadoop.fs.s3a.connection.leaks
set to a lower log level to hide these messages
This is a best-effort recovery from stream leaks.
Cleanup will only take place during GC; it is easy to run out of connections without running out of memory.
How was this patch tested?
New ITest.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?