Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stale data read in subsequent requests after an HTTP GET that contains a Response Body #3093

Closed
jpearl opened this issue Jan 8, 2017 · 6 comments

Comments

@jpearl
Copy link

jpearl commented Jan 8, 2017

I have code that first attempts to issue an HTTP HEAD request to a url. If the HEAD request fails (either connection refused or non 2xx status code, I retry the request as a normal HTTP GET.

However, I have one URL for which the HTTP HEAD request returns an HTTP/1.1 400 that contains a body. (I realize this is against the http RFC which states "the server MUST NOT return a message-body in the response" https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html - but I think we can handle this case).

Using OkHttp3 (same exact repro with OkHttp2), the HEAD request correctly indicates the return code of 400 on the HEAD request. However, after closing that connection and opening a /new/ connection as an HTTP GET, that Response fails to parse the StatusLine. Instead, the String that is parsed as the status line of request #2 (the GET request), is actually the body of request #1, the HEAD request.

I've ensured that the HEAD request connection is closed before issuing the GET request. I've also disabled caching on the OkHttp client. But nothing is resolving this.

To help debug, here's some same curl requests for the url in question
curl of the url with the HEAD request:

$ curl -X HEAD -i http://wtmd-ice.streamguys1.com/wtmd.m3u
HTTP/1.0 400 Bad Request
Content-Type: text/html

<b>unknown request</b>

And a curl of the GET request:

# curl -X GET -i http://wtmd-ice.streamguys1.com/wtmd.m3u
HTTP/1.0 200 OK
Content-Type: audio/x-mpegurl

http://wtmd-ice.streamguys1.com:8000/wtmd

The exact exception thrown trying to parse the StatusLine for the GET (second) request is:

java.net.ProtocolException: Unexpected status line: <b>unknown request</b>

In essence, the simplified code looks like:

Request request = createRequestBuilder(url).head().build();
Response response = okhttpclient.newCall(request).execute();
if (response.code() < 200 || response.code() > 299) {
    response.close();
    request = createRequestBuilder(url).build();
    response = okhttpclient.newCall(request).execute(); // <- it's this line that's throwing the ProtocolException
}
@jpearl
Copy link
Author

jpearl commented Jan 8, 2017

And the OkHttp logging output:

HEAD request:

01-08 13:42:48.576 27221 27300 I okhttp3.OkHttpClient: --> HEAD http://wtmd-ice.streamguys1.com/wtmd.m3u http/1.1
01-08 13:42:48.576 27221 27300 I okhttp3.OkHttpClient: Cache-Control: no-cache
01-08 13:42:48.576 27221 27300 I okhttp3.OkHttpClient: User-Agent: <my user agent>
01-08 13:42:48.576 27221 27300 I okhttp3.OkHttpClient: --> END HEAD
01-08 13:42:48.940 27221 27300 I okhttp3.OkHttpClient: <-- 400 Bad Request http://wtmd-ice.streamguys1.com/wtmd.m3u (363ms)
01-08 13:42:48.940 27221 27300 I okhttp3.OkHttpClient: Content-Type: text/html
01-08 13:42:48.941 27221 27300 I okhttp3.OkHttpClient: <-- END HTTP

And the subsequent GET request:

01-08 13:42:48.941 27221 27300 I okhttp3.OkHttpClient: --> GET http://wtmd-ice.streamguys1.com/wtmd.m3u http/1.1
01-08 13:42:48.941 27221 27300 I okhttp3.OkHttpClient: Cache-Control: no-cache
01-08 13:42:48.941 27221 27300 I okhttp3.OkHttpClient: User-Agent: <my user agent>
01-08 13:42:48.941 27221 27300 I okhttp3.OkHttpClient: --> END GET
01-08 13:42:48.947 27221 27300 I okhttp3.OkHttpClient: <-- HTTP FAILED: java.net.ProtocolException: Unexpected status line: <b>unknown request</b>

@swankjesse
Copy link
Collaborator

Have you reported the problem against the server’s maintainers? You can work around this in OkHttp by clearing the connection pool.

@jpearl
Copy link
Author

jpearl commented Jan 9, 2017

I haven't yet, but I will and will report back when I hear back.

Thanks for the suggesting - clearing the connection pool works:

okhttpclient.connectionPool().evictAll();

I also found another workaround, via source inspection(!)

requestBuilder.addHeader("Connection", "close");

Personally, I think I'm going to go with the second workaround, so that these bad HEAD requests can be isolated from a latency standpoint as well, and not dump the entire cached connection pool, which might be reused on a subsequent (GET) request.

So I have a workaround, but what else does this bug uncover? Is it concerning that, even in this "error" case on connection 1, that that still has a downstream impact on subsequent connections. In other words, it seems like state and/or data is not properly being cleared and resources recycled across connections.

@swankjesse
Copy link
Collaborator

Yeah, that's sort of the problem with HTTP connection pooling. It's rarely a problem in practice because webservers usually fix their bugs.

@jpearl
Copy link
Author

jpearl commented Jan 9, 2017

But even after the connection is explicitly closed, why does (what i'm assuming is) some part of the connection's socket's buffer remain viable? Shouldn't an explicit close on a connection clear all of the underlying resources?

Even if it's a bug on the webserver, can't we do better than that and proactively catch this case in okhttp? I'm assuming this particular webserver is backed by a whole streaming content platform provided as a web offering to many customers (https://streamguys.com), so I would assume this is actually a systemic problem that all consumers of at least this one web platform will have.

It's just really concerning to me that resources like this are not managed cleanly, which makes any given request actually non-deterministic: a HEAD followed by a GET will fail while a single GET or a GET followed by a GET will succeed. (not to mention the amount of time each developer who runs into this problem is going to spend debugging this non-deterministic behavior).

What does an actual fix look like in okhttp?

@swankjesse
Copy link
Collaborator

I don’t think there’s any satisfying fix for this in OkHttp, or in any HTTP client to be completely honest.

We could use available() to preview if the stream has unexpected bytes before issuing a request. This would add an extra system call to all HTTP requests, most of which aren’t going to be fighting broken webservers. And it wouldn’t necessarily work reliably.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants