-
Notifications
You must be signed in to change notification settings - Fork 9.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recover gracefully from Thread.interrupt() #1902
Comments
Can fix. |
@dave-r12 this looks tricky. Interested? |
Hah, nice. @snodnipper do you recall/know if at least one exception is being thrown on any of your HTTP calls? I think this is a connection leak within OkHttp itself so your excessive close calls may not be necessary. I noted 2 possible leak sites, one relating to hitting the max follow up threshold (which has since been addressed) and the other around HTTP authentication. |
Excellent, thanks @dave-r12 for taking a look. From memory, I was downloading several million files. This demo project has a crude outline. It takes the visible area of a map and calculates the URLs to fetch. I probably then created an ExecutorService to download several concurrently and over the few million tiles I probably got a few of those errors. re: "at least one exception is being thrown" - possibly some of the HTTP calls failed given the number of tiles. I note that in my demo project I am not using SSL (I am using OSM) as the source...whereas the code that generated the problem was connecting to an https endpoint in reference to your HTTP authentication thoughts. Just for good measure, I note that the demo project gets the following exception on a screen rotation. I assume it is a separate issue.
|
Thanks, and cool demo app BTW. I'm also seeing that StrictMode message. It's been quite awhile since I've done development on Android, so please correct me if I'm wrong here. I think part of the problem is that your are creating an OkHttpClient and Cache each time the MainActivity is created. From the wiki:
I don't think it's surprising this message is getting logged. I'm also wondering if a similar scenario is happening regarding your StrictMode message for the socket. |
So I have updated the demo to only create the OkHttpClient once - you are quite correct. We certainly do not seem to get the issue so frequently now...but I am still getting it in my logcat. For example:
With regard to the original code that generated the errors, it was almost certainly using a Dagger Singleton instance. I will try and find some time find / execute the code again with OkHttp. |
Alright, thanks. I'll see if I can take a look |
I believe I've been able to reproduce a variant of the original exception, but I think it's separate from what you are doing. Here is the repo, https://github.com/dave-r12/okhttp-racing-leak |
It looks like there was a fair bit of refactoring work around ConnectionPool and StreamAllocation since OkHttp 2.5. @snodnipper is it possible for you to upgrade to OkHttp3? The leaks I mentioned have already been identified and fixed. The other thing I found was in reference to your issue #1903 about using Thread.interrupt() & InterruptedIOException not playing nice with OkHttp. For whatever reason, I can't trigger this on the desktop (I think we'd need to dive into the JDK source to understand why.) But it does seem to trigger on Android reliably. If you interrupt a thread while it is in the middle of a request, it will leak the connection (it also requires a domain with 1 DNS entry.) I've updated the demo app I created which replicates the issue reliably. Can anyone else confirm this? |
For what it’s worth, we haven’t paid much attention or interest to |
I updated the map app to OkHttp 3.0.1 and received the following output:
I'll check out your okhttp-racing-leak project in a moment. |
And you are certain that the problem is not in your code? Also that first On Sun, Jan 17, 2016, 3:53 PM Ollie Snowden notifications@github.com
|
great @dave-r12 - your demo seems to reproduce the original issue. I received the following:
Given that you said "[t]he leaks I mentioned have already been identified and fixed" does this mean this issue can be closed? @JakeWharton - I will have a go at simplifying the Map demo code, which is essentially a quick 'n dirty adapter for the closed source ESRI SDK (executing my OkHttp code with ThreadPoolExecutor / Futures that are cancelled when out of screen). |
I believe Okhttp doesn't handle very well with Interrupted. I think you should remove resp.body() != null checks |
@snodnipper from my perspective, the takeaways are:
Given Jesse's comment above, I'm reluctant to research this one further. It feels like mixing Thread.interrupt() with OkHttp is asking for trouble. It's a bit unfortunate that the map library forced your hand, but maybe they provide some hooks to implement your own strategy? |
Thanks folks. I can raise another issue if the map code still generates an exception under intended use.
|
@swankjesse Interruption is the standard way to interrupt long-running tasks in Java. In our WebDriver tests (WD recently switched to OkHttp), we had to switch back to ApacheHttp for this very reason. It's unfeasible to somehow pierce through multiple layers to get to the |
@dagguh yeah, but |
|
Interrupting a thread in the middle of a blocking I/O call is likely to leave things in an inconsistent state, particularly when considering the variety of platforms (Android, JVM) and socket types (TCP, JVM SSL, native SSL). So I don't think it's safe to assume that we can handle client interrupts predictably in this way. I think you have a valid point about HTTP/1.1, it should be handled relatively cleanly if interruptions do occur. Probably the same for HTTP/2, all streams using the same connection are eventually errored, but the client left in a decent state. |
It's also problematic if thread interruption works differently for HTTP/1 and HTTP/2; we've worked hard to encapsulate those differences. |
In a related discussion from Apache folks, it's concluded that:
|
Closing, any framework usage of OkHttp should hook into clean cancel calls, and connection pool etc. |
I think as has been pointed out, ExcutorService uses thread interrupts to encourage threads to stop what they are doing. ExecutorService is pretty core to Java. It would be unclean to not support thread interrupts, and it would be wrong to run into errors because of thread interrupts. I think, perhaps unfortunately, you have to deal with interrupts because it is such a core part of the language. Ideally waits/reads/writes would all stop when interrupted (perhaps similar to a read/write timeout, could you just treat them the same?). If not that ideal solution then at the very least you pretty much have to not run into errors when threads are interrupted as that is core to any logic being executed in java. Finally I will mention that many devs would probably be super happy if okhttp supported interrupts, the ability to ask a thread to stop without needing to restart the JVM is a very nice very useful property. @yschimke updated since you re-opened for discussion. |
For discussion. Will close if we decide not to support. |
OTOH the caller's thread is clearly defined. If the lib does multithreading underneath, it could still respect the interruptability of the caller thread and propagate it further down the layers. |
indeed, it can not be assumed that interrupt wont be called. |
Keep your chin up, this can be done :) I think that any existing IO/concurrency/locking call are all things that can already fail (e.g. lock timed out, some IO error). I think also the network failures is stuff you already have to deal with. Interrupts really give you an opportunity to see what happens when those fail, it is easy to cause a thread to be interrupted making it easier to see what happens when an exception is thrown at some location or silent failure happens (sigh java). Since interrupts are easy to cause it should be easy to test which results in making it easier to move toward better support of interrupts. I don't think the final two sentences are useful in this situation, that isn't the case. I suspect it would be closer to: |
As far as I know the only code that isn't interrupt-friendly is the response cache. |
It might make sense to improve okhttp with interrupts bit by bit rather than try to make it all work in a single PR. |
Deleting my objections then. |
Assuming someone submits a PR, which components need to remain?
out of:
FWIW this still surprises me because of an expectation that we need to be aware of things like InterruptedIOException.bytesTransferred, if we want something like a Connection to allow a consumer of a single stream to be interrupted and recover. |
My preference is that interrupting an OkHttp call is as similar as possible to canceling the call. The call itself is over, but it shouldn’t leak resources or leave shared features like the connection pool or cache in an inconsistent state. We don’t do anything about |
Basic behaviour confirmed in test #5937 We should add further tests in response to specific bugs. I don't think we currently cover all cases. |
Relates to square/okhttp#3146. This was from #71. There is a draft PR https://github.com/square/okhttp/pull/7185/files which documents OkHttp's ideal handling of cancellation including interrupts. But a few key points 1) This is a target state, and OkHttp does not currently handle interrupts correctly. In the past this has been identified, and the advice is to avoid interrupts on Http threads, see discussion on square/okhttp#1902. Also an attempt at a fix here square/okhttp#7023 which wasn't in a form to land. 2) Even with this fixed, it is likely to never be optimal, because of OkHttp sharing a socket connection for multiple inflight requests. From square/okhttp#7185 ``` Thread.interrupt() is Clumsy ---------------------------- `Thread.interrupt()` is Java's built-in mechanism to cancel an in-flight `Thread`, regardless of what work it's currently performing. We recommend against using `Thread.interrupt()` with OkHttp because it may disrupt shared resources including HTTP/2 connections and cache files. In particular, calling `Thread.interrupt()` may cause unrelated threads' call to fail with an `IOException`. ``` This PR leaves the Loader/DataSource thread parked on a countdown latch, while this may seem wasteful and an additional context switch. However in practice the response isn't returned until the Http2Connection and Http2Reader have a response from the server and these means effectively parking in a `wait()` statement here https://github.com/square/okhttp/blob/9e039e94123defbfd5f11dc64ae146c46b7230eb/okhttp/src/jvmMain/kotlin/okhttp3/internal/http2/Http2Stream.kt#L140 PiperOrigin-RevId: 446652468
Relates to square/okhttp#3146. This was from androidx/media#71. There is a draft PR https://github.com/square/okhttp/pull/7185/files which documents OkHttp's ideal handling of cancellation including interrupts. But a few key points 1) This is a target state, and OkHttp does not currently handle interrupts correctly. In the past this has been identified, and the advice is to avoid interrupts on Http threads, see discussion on square/okhttp#1902. Also an attempt at a fix here square/okhttp#7023 which wasn't in a form to land. 2) Even with this fixed, it is likely to never be optimal, because of OkHttp sharing a socket connection for multiple inflight requests. From square/okhttp#7185 ``` Thread.interrupt() is Clumsy ---------------------------- `Thread.interrupt()` is Java's built-in mechanism to cancel an in-flight `Thread`, regardless of what work it's currently performing. We recommend against using `Thread.interrupt()` with OkHttp because it may disrupt shared resources including HTTP/2 connections and cache files. In particular, calling `Thread.interrupt()` may cause unrelated threads' call to fail with an `IOException`. ``` This PR leaves the Loader/DataSource thread parked on a countdown latch, while this may seem wasteful and an additional context switch. However in practice the response isn't returned until the Http2Connection and Http2Reader have a response from the server and these means effectively parking in a `wait()` statement here https://github.com/square/okhttp/blob/9e039e94123defbfd5f11dc64ae146c46b7230eb/okhttp/src/jvmMain/kotlin/okhttp3/internal/http2/Http2Stream.kt#L140 PiperOrigin-RevId: 446652468
Using 2.5.0 and excessive / unnecessary close() calls I still get a StrictMode error. I notice the glide folk had an issue logged that looked similar. Both are using an ExecutorService. Perhaps this just boils down to just 'Android / StrictMode' and can be simply ignored - rfc.
The text was updated successfully, but these errors were encountered: