Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Heartbeat using closed connections for body checks with TLS #8588

Closed
andrewvc opened this issue Oct 5, 2018 · 4 comments · Fixed by #8660
Closed

Heartbeat using closed connections for body checks with TLS #8588

andrewvc opened this issue Oct 5, 2018 · 4 comments · Fixed by #8660

Comments

@andrewvc
Copy link
Contributor

andrewvc commented Oct 5, 2018

Looking at this thread https://discuss.elastic.co/t/heartbeat-http-use-of-closed-network-connection/147876 it appears we may have a bug with our socket lifecycle with TLS connections.

Seems like we possibly close the conn too early.

@andrewvc
Copy link
Contributor Author

This is a tricky one to track down. Turns out this is at least as old as 6.2.0.

One avenue to investigate jasonwbarnett/fileserver@0404949

will update when I have more...

@andrewvc
Copy link
Contributor Author

I should mention this isn't TLS specific, this is simply to do with bodies over a certain size. It appears we terminate the connection early for larger bodies.

@andrewvc
Copy link
Contributor Author

Well, it's definitely somewhere in our custom dialer, I'm going to tear it apart tomorrow and isolate the failure.

@urso if you have an idea where the failure might be off the top of your head, LMK. If not, I'm glad to dive in and get my hands dirty with the innards of the dialer chain.

One note, I had thought it might be timeouts being incorrectly set to some tiny value that triggered after a few hundred bytes, but that doesn't seem to be it.

@andrewvc
Copy link
Contributor Author

OK, well, it took me longer to find than I'd like to admit, but the root cause is here: https://github.com/elastic/beats/blob/master/heartbeat/monitors/active/http/simple_transp.go#L88

Essentially we close the conn before we return from the RoundTripper. This is bad because the body is actually read later.

The two options are to:

  1. Link the closing of the connection with the closing of the body
  2. Immediately read the body within RoundTrip

andrewvc added a commit to andrewvc/beats that referenced this issue Oct 19, 2018
This fixes an issue where the connection would be closed after only a partial body read. The RoundTripper would close the conn before returning the response which included a partially buffered body. This would actually work for short responses, since the backing bufio would do a partial read, but would fail on all but the shortest responses.

Normally connection lifecycle is handled outside the realm of the `RoundTripper`, but for our purposes we don't want to re-use connections. Since it is a requirement that all response bodies be closed, we can piggy-back on top of that to ensure the connection is closed.

Fixes elastic#8588
andrewvc added a commit to andrewvc/beats that referenced this issue Oct 26, 2018
This fixes an issue where the connection would be closed after only a partial body read. The RoundTripper would close the conn before returning the response which included a partially buffered body. This would actually work for short responses, since the backing bufio would do a partial read, but would fail on all but the shortest responses.

Normally connection lifecycle is handled outside the realm of the `RoundTripper`, but for our purposes we don't want to re-use connections. Since it is a requirement that all response bodies be closed, we can piggy-back on top of that to ensure the connection is closed.

Fixes elastic#8588
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant