-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Seeking ExoPlayer Rapidly with OkHttp with http2 connection causes sockettimeoutexception #4078
Comments
If the issue is unique to the OkHttp extension then do you think it's likely that the issue is in OkHttp rather than ExoPlayer? Would it be possible for you to validate whether we're forgetting to close connections in the ExoPlayer layer? If we're not, that points to an issue in OkHttp. By the way, we've just updated the OkHttp extension to point to the latest release of OkHttp (3.10.0, whereas previously we were depending on 3.9.0). The change is c861b50. It might be worth seeing if the issue still reproduces when using the newer release. |
We are using 3.10 locally and are still seeing the issue. It appears to correlate with the underlying Http2 stream getting InterruptedIoException, it may be an okhttp issue with how it handles Interrupts (square/okhttp#1903). I noticed that ExoPlayer is using ExecutorService to do the downloads and calling shutdown on them interrupts the thread doing the OKHttp operations (java.util.concurrent.ThreadPoolExecutor#interruptIdleWorkers(boolean)). I think this is the underlying cause and is a grey area, but feels like an ExoPlayer issue with its use of OkHttp? Let me know what you decide |
It would be good if you could spend some more time root causing the issue. ExoPlayer calls into OkHttp are isolated to a single class, so it should be pretty easy for you to work out whether we're misusing the OkHttp API. If we're not misusing the API, which I suspect is the case, then it should be possible for you to create a simple reproduction case that calls OkHttp directly and does not depend on ExoPlayer. That issue should then be filed on the OkHttp project. |
I have attached a VERY rough test project that illustrates the issue. I was able to reproduce 100% by using a genymotion emulator and turning the network connection to GPRS so the timing of the The second loadable will error every time with a sockettimeoutexception because the initial cancel call left the socket in a bad state. I know this doesn't use the exact ExtractorLoadables or anything, but it shows that if we call cancelload() on a loader class that is reading from an OKHTTP socket, it will cause the situation. Do you think this should be fixed on ExoPlayer's end, or should I file this over in the OKHttp project? |
This looks a lot like square/okhttp#3146 to me. Which is apparently fixed in 3.10.0, but given this issue, I'm not so sure. I'd suggest you try and further cut down and clean up your test project, then file an issue on OkHttp. A few notes:
|
I think this is actually caused by: square/okhttp#3915 |
Hi @natez0r, I encountered the same issue as what's been described here. I'd like to know if the issue has been resolved for you? I traced this thread to issue square/okhttp#3915 and square/okhttp#4127, used okhttp-3.11.0 + exoplayer 2.7.3 but still experience the same issue after some frequent seek. Bumping up the readTimeOut threshold does remedy the endless buffering, but it still take a long wait before it come back. So I'm wondering what's the current status on your end. How long does your playback back after some frequent seeking? Also did you do any specific configuration on your okHttpClient config? Thanks! |
we have code in place to mitigate the ability to seek rapidly right now (we don't perform a seek until the user has lifted their finger on a scrub). One of the main underlying issues causing it was fixed in okhttp 3.11, which @ojw28 has told me has been bumped in the current dev-v2 branch. I'd imagine this will be a lot better after that. |
@natez0r - Any idea what the status of this issue is at this point? |
@ojw28 We are currently upgrading to exo 2.9 and okhttp 3.11, plan to have something testable by end of week where we can determine if this is fixed/mitigated |
@natez0r Any update on this? |
upgraded to 2.9.1, I tested locally with our throttling code removed and i was still getting socket timeouts on rapidly cancelled requests:
|
What's the current status of this issue? The underlying issue in okhttp seems to be resolved (yet @natez0r says a variant of it still happens). Is there another okhttp issue that tracks this? |
It's unlikely to be an ExoPlayer issue. In which case this is really a question about whether OkHttp still has a problem, and if so, whether there's a corresponding tracking bug for it. |
@noamtamim I have not filed a new OKHttp issue for the underlying cause (I have not been able to spend any time on this recently). It definitely still occurs, but it does appear to be an OKHttp issue vs ExoPlayer. In our codebase, we have a throttling mechanism that works around the issue right now. |
@natez0r is there a way to prove to okhttp that this issue still occurs (and that it's caused by their library and not ExoPlayer)? As a workaround, is it possible to avoid the http/2 "upgrade", so that the connection remains http/1.1 even if the server supports http/2? |
Just to be clear: right now, it's not safe to use ExoPlayer with OkHttp if there's a chance you'll end up talking to a http/2 server - at least not without throttling seek() calls. |
If it's really easy to reproduce then file another issue on OkHttp with really easy reproduction steps. I don't think there's a plausible explanation for it being anything other than an OkHttp issue. |
The internet suggests so: |
As I wrote, it's easy to reproduce using ExoPlayer. But that doesn't prove the bug is not in how ExoPlayer uses okhttp. For example, they claim that if you're not reading the entire response (the part that was buffered so far) you're doing it wrong. |
I think it was concluded in that thread (possibly in the hidden items part) that calling As above, if it's really easy to reproduce then the correct thing to do is file an issue on OkHttp with really easy reproduction steps. Even if it is an issue with ExoPlayer's usage, they are the best people to explain what we're doing wrong. Trying to prove"something uses an API correctly isn't really a feasible thing to do. Unless you can find something it's doing incorrectly then you pretty much have to assume it's correct, after you've stared at it for long enough...!
Are these statement of facts (i.e. you see this in your own experimentation) or are they questions? |
@ojw28 All that's required to reproduce the issue is:
When the player loads, let it start playing. Then tap the fast-forward button quickly 5-10 times. Playback gets stuck. When http/2 is disabled (using the builder and This is what I call "really easy to reproduce" -- it fails when using the default parameters and basic setup. I tested with the basic Widevine sample. This is also why I argue it's not safe to use ExoPlayer with okhttp (unless you explicitly disable http/2). |
Ok. So turn that into a small demo project, and file an issue on OkHttp. The fact it works fine with |
I believe this issue is already being tracked in the OkHttp repo |
I was just looking at whether we could disable http/2 in the ExoPlayer extension, but it looks like application code has to do this when building the
I agree the issue here is what's being tracked by square/okhttp#3146. |
The DataSourceFactory could derive a new OkHttpClient from the one provided (using It might be a good idea to add a warning somewhere, telling developers about this issue -- so that users of the extension will disable http/2 (unless they use a fixed version, etc).
That's a valid point, although in most use cases you don't repeatedly and quickly cancel and restart requests. For playback (with seeking), that's very common unless you do what @natez0r did (add some artificial latency in the UI). |
in OKHTTP 4.2 (and backported to 3.14.3), it appears that they may've found the underlying issue that I is causing this: https://square.github.io/okhttp/changelog/#version-420 (https://square.github.io/okhttp/changelog_3x/#version-3143) Is there any plan to upgrade to above 3.12 for okhttp extension? I know it deprecates old versions of android, but could there be some configuration we could use to force the newer versions of OKHTTP? |
unfortunately, it's API 21. |
It looks like they're still doing 3.12.X releases, and that the fix was back-ported to 3.12.5 (release notes, details of 3.12.X releases. So I think we can just upgrade to 3.12.5 and mark this as fixed. Do you agree? |
Issue: #4078 PiperOrigin-RevId: 268887744
Sounds good to me
…On Fri, Sep 13, 2019, 5:04 AM Oliver Woodman ***@***.***> wrote:
It looks like they're still doing 3.12.X releases, and that the fix was
back-ported to 3.12.5 (release notes
<https://github.com/square/okhttp/blob/master/docs/changelog_3x.md#version-3125>,
details of 3.12.X releases
<https://cashapp.github.io/2019-02-05/okhttp-3-13-requires-android-5>.
So I think we can just upgrade to 3.12.5 and mark this as fixed. Do you
agree?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#4078?email_source=notifications&email_token=AAEUIMY4YJYUNVU557WKZQTQJN6VFA5CNFSM4EYWESKKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6U2B7A#issuecomment-531210492>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAEUIM7KKL2GL7VM7PKETADQJN6VFANCNFSM4EYWESKA>
.
|
Issue: #4078 PiperOrigin-RevId: 268887744
@ojw28 Sadly, this error still happens (with the current tip of dev-v2-r2.10.5 branch, commit 2fca01b). I'll attach a test app soon.
|
Unzip and open in Android Studio: The current setup uses method 2 above, edit build.gradle to change to method 1. Reproduction steps:
In logcat, filter on "okhttp". After every seek you'll see Now, to verify that it's an http/2 issue, uncomment line 53: Repeat the steps above -- no timeouts (you'll see many |
As for the audio/video sync issue I mentioned before - sometimes (about 1 in 4 attempts), when I start tapping the FF button, the music continues to play even when the video is buffering (due to seeks). When that happens, even when I stop seeking and let it play, I hear the original audio playing from the beginning and the "current" audio. Tapping pause pauses the video and audio, but the "original" audio continues to play until the app is killed. If this is not a known issue you're currently working on, I'll file a separate bug -- please let me know. |
An alternative way to observe the http/2 timeout issue -- using the ExoPlayer demo app. public HttpDataSource.Factory buildHttpDataSourceFactory() {
final OkHttpClient client = new OkHttpClient.Builder()
.eventListener(new EventListener() {
@Override
public void connectEnd(Call call, InetSocketAddress inetSocketAddress, Proxy proxy, Protocol protocol) {
Log.d(TAG, "okhttp connected protocol=" + protocol);
}
@Override
public void callFailed(Call call, IOException ioe) {
Log.d(TAG, "okhttp callFailed " + ioe);
}
})
// Uncomment the "protocols" below to forbid http/2
// .protocols(Collections.singletonList(Protocol.HTTP_1_1))
.build();
return new OkHttpDataSourceFactory(client, null);
} From the HLS section, play "Apple 4x3 basic stream". The issue reproduces when repeatedly and quickly tapping the FF button, as explained above. |
Thanks for the updates. You should probably post your sample and reproduction steps on the OkHttp issue, since this remains an OkHttp problem (although you may want to fix the bug below first :)). It's not something we're likely to be able to fix on our side.
I think it's because your demo app isn't hooking the player into the activity lifecycle properly. It creates and starts a player in |
@ojw28 thanks, you're right about the bug. It was late at night (past 11pm in Israel :-)), I was too lazy to handle Android methods in a test app, and I missed the obvious. I wanted to make sure you're not missing anything so I jumped the gun on this report. |
Issue: #4078 PiperOrigin-RevId: 288651166
Fingers crossed this is finally fixed! |
I'll verify as soon as I get time for it.
Where's the fix?
|
Issue: #4078 PiperOrigin-RevId: 288651166
I had a similar issue in my app. When I tapped my skip button quickly 5-10 times, playback got stuck with a SocketTimeoutException. Updating exoplayer to 2.11.2 did not solve this issue for me. Disabling http/2 with So I'm not sure this issue is fixed now. |
Sadly, I didn't get the chance to do it. For now we have disabled http/2. |
I suggest you (once again) report this to the OkHttp project along with reproduction steps. There's not much we can do about it on the ExoPlayer side, unfortunately. You could alternatively use a network stack that doesn't have this issue (e.g., Cronet). |
I can confirm that the issue still occurs in the test app attached to #4078 (comment). I've updated it to use ExoPlayer 2.11.3 and a new stream. It seems to be harder to reproduce now (you have to play with the seek bar more) - but it still happens. I'll try to create a test app that does not include ExoPlayer. |
@ojw28 I couldn't reproduce it with OkHttp alone. As I mentioned above, it's much harder to reproduce it now. So the question is, how to recover? These are the final logcat lines when it happens:
The
It took 33 seconds for ExoPlayer to report this error, too much for a recovery (the user won't wait). There are two questions here:
|
I think (1) is a question for the OkHttp team, and (2) is a question for the app developer. Ultimately, we're providing an extension that allows you to use the OkHttp network stack. No one has ever diagnosed an issue with our integration. If the underlying network stack doesn't work properly, it's the responsibility of the maintainers of that network stack to resolve that. We also provide an extension that allows you to use Cronet, which as far as I'm aware does not suffer from this kind of issue. |
@ojw28 I agree with you on 2. By 1, I'm asking if you think any recovery (or quick failure, allowing the app to recover faster) code can be added to OkHttpDataSource. Kaltura's backend and CDN support http/2 and it's a shame we cannot utilize it on Android, which accounts for most of the plays. |
From the stack trace above, it looks like the issue is surfaced to If the initial Let us know if you find anything useful!
Why is it not an option for you to switch to a network stack that supports http/2 reliably? |
The initial I'll try to experiment with the client.
AFAIK my options are (a) HttpURLConnection - does not support http/2 (b) OkHttp and (c) Cronet. I'm sure that since Cronet is Chrome's native http stack, it more reliable than any other stack; however, last I checked it increases app size significantly. I can't find the size now, but I remember it was more than our entire SDK. |
Cronet is bundled inside Google Play Services these days, which means you no longer have to bundle the native stack directly in your apk. You can probably just rip out the I'm actually not sure why we haven't done that by default. It looks to me like we should do. Applications that still want to bundle the native stack can still do so, and When using the Google Play Services stack, So I'd suggest you take another look. |
This is great news, I will sure give it a go. |
Note: We will be merging #7324 to default the Cronet extension to using Google Play Services rather than the embedded library. |
Issue description
When using the OkHttp extension and playing media over http/2, if you seek the player rapidly the OkHttp connection pool eventually will contain a connection that is hung and will never work. This is much easier to get to reproduce in our own application, but it's definitely possible to get to happen in a demo app.
I suspect that a connection isn't being closed properly and the following h/2 connections which are multiplexed over the same socket fail.
Reproduction steps
I have emailed the sample app and the stream to the alias. You need to serve the file over http/2. I used https://www.npmjs.com/package/@http2/ to serve the stream from my machine and used genymotion/emulator.
Setup @http2 to have host.js with the IP of your machine and start it pointing at the directory containing your HLS files.
replace "YOUR_IP" in test_app.nroy.twitter.com.exoplayerplayground.MainActivity#M3U8_URL with the IP of your machine
Run the provided app and play the video. Scrub back and forth using the red scrubber at the top quickly until the video no longer continues playing (it may take 30+ seconds). The sample app includes logging to ensure that you're using http/2 (this issue does not manifest unless using http/2). If you connect stetho, you'll see the player attempting to re-request the .ts segment, receiving the 200 response and headers, but never receiving any bytes.
Link to test content
emailed.
Version of ExoPlayer being used
2.7.0
Device(s) and version(s) of Android being used
geny motion emulator 8.0
nexus 5x 8.0
It takes awhile to reproduce in the sample, it happens almost instantly in our app.
A full bug report captured from the device
emailed
The text was updated successfully, but these errors were encountered: