Idle connection breaks when turning data off and on. #4079

Closed
roberthoenig opened this issue Jun 20, 2018 · 16 comments
Labels: android, bug, needs info

@roberthoenig

roberthoenig commented Jun 20, 2018

I first thought this to be an issue with React Native and filed an issue there: facebook/react-native#19709

However, further debugging makes me believe that the issue might lie within okhttp.

I created a tiny GitHub repo that reproduces this bug with okhttp in an Android app:

https://github.com/roberthoenig/react-native-fetch-bug

(Don't get confused by the repo name react-native-fetch-bug: The example doesn't use React Native at all, it's just where I encountered this bug first.)

The repo's README.md also contains steps to reproduce the issue.

Here is my theory of what's going on:

  1. When retrieving a resource with okhttp, the connection used for the retrieval gets stored in the client's connectionPool.
  2. When turning off the network (e.g. by disabling wifi and data) the idle connections in the connectionPool somehow get corrupted.
  3. After turning the network back on and making a new request to the same resource, okhttp will try to reuse the existing idle connection in the connectionPool. That connection is corrupted, however, and will silently fail. The symptom is that no request gets dispatched, and no errors are thrown. After a couple minutes, however, things seem to recover and the request gets dispatched as if nothing happened.
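
The theory above can be sketched as a toy model. This is not OkHttp's implementation: `ToyPool`, `ToyConnection`, and `simulateNetworkToggle` are hypothetical names, and the model simply assumes that a network toggle breaks idle connections without the pool noticing, which is the unverified hypothesis of this report.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy stand-in for a pooled connection; not an OkHttp type. */
class ToyConnection {
    boolean alive = true; // set to false to mimic a connection broken by a network toggle
}

/** Toy pool that reuses idle connections without checking whether they still work. */
class ToyPool {
    private final Deque<ToyConnection> idle = new ArrayDeque<>();
    int opened = 0; // number of fresh connections created so far

    /** Assumption of this model: toggling the network silently breaks idle connections. */
    void simulateNetworkToggle() {
        for (ToyConnection c : idle) {
            c.alive = false;
        }
    }

    /** Counterpart of the EVICT CONNECTION POOL button: drop every idle connection. */
    void evictAll() {
        idle.clear();
    }

    /** Returns true if the simulated request succeeded. */
    boolean fetch() {
        ToyConnection c = idle.poll(); // prefer an idle connection (step 1 of the theory)
        if (c == null) {
            c = new ToyConnection();   // nothing pooled, so open a fresh connection
            opened++;
        }
        boolean ok = c.alive;          // a broken connection fails without any error (step 3)
        idle.push(c);                  // in this toy, the connection goes back to the pool either way
        return ok;
    }
}
```

In this model, a `fetch()` after `simulateNetworkToggle()` keeps failing until `evictAll()` forces a fresh connection, which matches the observed behaviour of the two buttons but says nothing about why the real connections break.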
@roberthoenig
Author

Clarification:

Here are the reproduction steps. I didn't include them in the original post because they are already in the repo I linked, but it's probably useful to have them visible here as well:

Reproduction

  1. Run the Android app https://github.com/roberthoenig/react-native-fetch-bug/blob/master/README.md
    It contains two buttons: FETCH, which fetches http://publicobject.com/helloworld.txt and EVICT CONNECTION POOL, which calls evictAll().
  2. Quickly press FETCH repeatedly. The requested resources come in, headers are logged.
  3. Quickly turn off data.
  4. Turn on data.
  5. Press FETCH again. Wait for some time. No response will come in.
  6. Press EVICT CONNECTION POOL.
  7. Press FETCH again. The requested resource comes in, headers are logged.

@yschimke
Collaborator

@roberthoenig Thanks for the thorough reproduction. I'll take a look in the next couple of days if no one beats me to it. If you get any more insight, please post it here, as a number of the issues we see are some form of Android network disconnect going unobserved after airplane mode.

@roberthoenig
Author

Just a quick note: I don't actually know that disconnecting corrupts the idle connections. All I observed is that they get corrupted somewhere between turning data off and on.

roberthoenig added a commit to roberthoenig/react-native that referenced this issue Jun 21, 2018
…ebook#19709).

This bug is probably actually a bug in OkHttp: square/okhttp#4079
Both issues linked above contain extensive details about the issue, its likely origins and
how to reproduce it. A short summary of the issue and the fix in this commit:

On Android, disconnecting from the network somehow corrupts the idle connections in okhttp
clients. New requests made over these clients fail. This commit works around that bug by
clearing the idle connection pool when Android disconnects from the network.
roberthoenig added a commit to roberthoenig/react-native that referenced this issue Jun 22, 2018
…book#19709.

This bug is probably actually a bug in OkHttp: square/okhttp#4079
Both issues linked above contain extensive details about the issue, its likely origins and
how to reproduce it. A short summary of the issue and the fix in this commit:

On Android, disconnecting from the network somehow corrupts the idle connections in okhttp
clients. New requests made over these clients fail. This commit works around that bug by
clearing the idle connection pool of each client when Android disconnects from the network.
@yschimke
Collaborator

@roberthoenig Thanks for the app, unfortunately I couldn't really reproduce what you are seeing. Is it some particular android version you are testing with? Is it only on devices or emulators?

I do see a delay in the request after reconnect, but it succeeds and then following requests work fine.

I'm sure I'm just doing something differently to you, and I've seen similar problems myself.

I've got a similar test app https://github.com/yschimke/okhttp-testapp

@yschimke yschimke added the bug label Jun 24, 2018
@roberthoenig
Author

> Is it some particular android version you are testing with?

I'm testing it on an emulator with the following details:


Name: small_zulip
CPU/ABI: Google APIs Intel Atom (x86_64)
Path: /home/robert/.android/avd/small_zulip.avd
Target: google_apis [Google APIs] (API level 23)
Skin: 480x800
SD Card: 100M
hw.dPad: no
hw.lcd.height: 800
runtime.network.speed: full
hw.accelerometer: yes
hw.device.name: 4in WVGA (Nexus S)
vm.heapSize: 48
skin.dynamic: yes
hw.device.manufacturer: Generic
hw.lcd.width: 480
hw.gps: yes
hw.initialOrientation: Portrait
skin.path.backup: _no_skin
image.androidVersion.api: 23
hw.audioInput: yes
image.sysdir.1: system-images/android-23/google_apis/x86_64/
tag.id: google_apis
showDeviceFrame: no
hw.camera.back: virtualscene
hw.mainKeys: yes
AvdId: small_zulip
hw.camera.front: emulated
hw.lcd.density: 240
avd.ini.displayname: small_zulip
hw.arc: false
hw.gpu.mode: auto
hw.device.hash2: MD5:0c867f9f3b5d06e6018bb60c4f64aed3
hw.ramSize: 512
hw.trackBall: no
PlayStore.enabled: false
fastboot.forceColdBoot: no
hw.battery: yes
hw.cpu.ncore: 4
hw.sdCard: yes
tag.display: Google APIs
runtime.network.latency: none
hw.keyboard: yes
hw.sensors.proximity: yes
disk.dataPartition.size: 800M
hw.sensors.orientation: yes
avd.ini.encoding: UTF-8
hw.gpu.enabled: yes

It should be a standard configuration of a Nexus S with API level 23. Nothing special about the choice of device. The API level is the target API of React Native. My build.gradle has the following okhttp dependency:

compile 'com.squareup.okhttp3:okhttp:3.6.0'

> Is it only on devices or emulators?

The original reports of this issue (zulip/zulip-mobile#2287, zulip/zulip-mobile#2310, zulip/zulip-mobile#2315) were on real devices. I have only reproduced the issue on an emulator; I haven't tried it on a real device myself yet.

> I do see a delay in the request after reconnect, but it succeeds and then following requests work fine.

Hmm, how long is the delay?

One thing noteworthy about my repro steps is that I turn the network off and on with

$ adb shell svc data disable
$ adb shell svc data enable

Also, I just tried to reproduce it myself again with the steps outlined above:

  1. Quickly press FETCH repeatedly. The requested resources come in, headers are logged.
  2. Quickly turn off data.
  3. Turn on data.
  4. Press FETCH again. Wait for some time. No response will come in.
  5. Press EVICT CONNECTION POOL.
  6. Press FETCH again. The requested resource comes in, headers are logged.

Step 1 wasn't actually necessary for things to break. Might be for others, though.

roberthoenig added a commit to roberthoenig/react-native that referenced this issue Jun 25, 2018
…book#19709.

This bug is probably actually a bug in OkHttp: square/okhttp#4079
Both issues linked above contain extensive details about the issue, its likely origins and
how to reproduce it. A short summary of the issue and the fix in this commit:

On Android, disconnecting from the network somehow corrupts the idle connections in okhttp
clients. New requests made over these clients fail. This commit works around that bug by
clearing the idle connection pool of each client when Android disconnects from the network.
roberthoenig added a commit to roberthoenig/react-native that referenced this issue Jun 27, 2018
…book#19709.

This bug is probably actually a bug in OkHttp: square/okhttp#4079
Both issues linked above contain extensive details about the issue, its likely origins and
how to reproduce it. A short summary of the issue and the fix in this commit:

On Android, disconnecting from the network somehow corrupts the idle connections and ongoing
calls in okhttp clients. New requests made over these clients fail. This commit works around
that bug by evicting the idle connection pool and cancelling all ongoing calls of each client
when we receive a DISCONNECTED or CONNECTING event (we don't know yet if only one or both of
them cause the issue).
Cancelling all calls is aggressive, but when a device disconnects any ongoing calls can fail
anyway, so an app has to expect this scenario.
roberthoenig added a commit to roberthoenig/react-native that referenced this issue Jun 28, 2018
…book#19709.

This bug is probably actually a bug in OkHttp: square/okhttp#4079
Both issues linked above contain extensive details about the issue,
its likely origins and how to reproduce it. A short summary of the
issue and the fix in this commit:

On Android, disconnecting from the network somehow corrupts the idle
connections and ongoing calls in okhttp clients. New requests made over
these clients fail. This commit works around that bug by evicting the idle
connection pool when we receive a DISCONNECTED or CONNECTING event (we
don't know yet if only one or both of them cause the issue). Technically,
to fully fix this issue, we would also need to cancel all ongoing calls.
However, cancelling all ongoing calls is aggressive, and not always desired
(when the app disconnects only for a short time, ongoing calls might still
succeed). In practice, just evicting idle connections results in this issue
occurring less often, so let's go with that for now.
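
The workaround described in this commit message can be sketched roughly as follows. This is not the code from the referenced commit: `IdleEvictable`, `NetState`, and `EvictOnNetworkChange` are hypothetical names, and the sketch uses only the Java standard library so that it stays independent of Android and OkHttp.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for anything that can drop its idle connections. */
interface IdleEvictable {
    void evictIdle();
}

/** Hypothetical mirror of the connectivity events named in the commit message. */
enum NetState { CONNECTED, CONNECTING, DISCONNECTED }

/** Evicts idle connections of every registered client on DISCONNECTED or CONNECTING. */
class EvictOnNetworkChange {
    private final List<IdleEvictable> clients = new ArrayList<>();

    void register(IdleEvictable client) {
        clients.add(client);
    }

    /** Call this from whatever component receives connectivity events. */
    void onNetworkState(NetState state) {
        // Both events are handled because it is not yet known which one triggers the issue.
        if (state == NetState.DISCONNECTED || state == NetState.CONNECTING) {
            for (IdleEvictable client : clients) {
                client.evictIdle(); // ongoing calls are deliberately left alone
            }
        }
    }
}
```

With OkHttp, a registered client could plausibly be a lambda such as `() -> okHttpClient.connectionPool().evictAll()`, where `okHttpClient` is an assumed variable name. As the commit message notes, leaving ongoing calls untouched means the issue may still occur, only less often.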
@swankjesse
Collaborator

Does it time out? Seems like a bug in Android.
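
Whether the stalled call times out is worth checking, because a timeout would surface as an exception instead of a silent hang. Below is a minimal standard-library sketch (no OkHttp involved; `SilentPeerProbe` is a hypothetical name) of a read against a peer that accepts the connection but never responds:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class SilentPeerProbe {
    /** Returns true if reading from a peer that never answers ends in a read timeout. */
    static boolean readTimesOut(int timeoutMillis) {
        // The local server socket lets the OS complete the TCP handshake but never writes a byte.
        try (ServerSocket silent = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(silent.getInetAddress(), silent.getLocalPort())) {
            client.setSoTimeout(timeoutMillis); // with a timeout of 0, read() would block indefinitely
            try {
                client.getInputStream().read();
                return false; // data or end-of-stream arrived, so no timeout
            } catch (SocketTimeoutException expected) {
                return true;  // the read gave up after timeoutMillis
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If the stalled request in the reproduction never raises a comparable exception, that might suggest it is not reaching a socket read with a timeout applied, though this sketch cannot confirm that.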

@swankjesse
Collaborator

Unclear what action to take here.

@swankjesse swankjesse added this to the 3.12 milestone Jul 5, 2018
@steelbytes

I've been getting this as well. There are other bug reports here that discuss this.

@musicode

I have the same issue. When will it be fixed?

@swankjesse
Collaborator

Presumably the nicest fix would be a thing that observes Android’s network stack and applies changes to our connection pool. That could be a code sample or a gist!

@yschimke
Collaborator

yschimke commented Nov 4, 2018

@swankjesse forking off #4366, because I had some thoughts I was experimenting with previously...

@yschimke
Collaborator

Probably worth retesting after #5920, which is in 4.5. I'm hopeful that it fixes a ton of connectivity issues with dodgy Android networking. Is anyone still seeing this with 4.5?

@yschimke yschimke added the android and needs info labels Apr 11, 2020
@v-ladynev

v-ladynev commented May 25, 2020

Just for information:
We have similar issues (for HTTPS GET requests, not HTTP/2) on a server (not an Android client) using the same 3.6.0 version of okhttp with Java 7. We are using a standard connection pool configuration, but it looks like the connection pool was not the cause of the issue, because a lot of new connections were created and hit the same java.net.SocketTimeoutException.

@lyind

lyind commented Mar 17, 2021

Is this a duplicate of #3278 ?

@gnprice

gnprice commented Feb 21, 2022

@lyind It may well be! (Speaking as a colleague of this issue's reporter.) Thanks for spotting that connection.

@yschimke
Collaborator

Dupe of #3278
