prevent corrupted spdy stream after hijacking connection #43922
Hi @cezarsa. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@cezarsa thanks for writing this. It's been on my list but I hadn't gotten around to it yet. This was also something that confused me - are we supposed to read from the buffered reader until it's exhausted, and then switch to reading from the connection, or can we do what you did, and just always read from the buffered reader? |
@ncdc I'm glad I could help. It was an interesting one to track down. :) From what I can see by looking at the hijack implementation it's safe to keep reading from the bufio.Reader, after the buffer has been exhausted it will read from the underlying connection (actually it's backed by a I guess there's a slight overhead of reading from the bufio.Reader because it goes through the |
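The behavior described above can be checked with a small sketch that uses only the standard library. It is not code from this PR: `newSource`, `readViaBufio`, and `readViaSource` are hypothetical names, and a two-chunk `io.MultiReader` stands in for the network connection. A `Peek` pulls the first chunk into the `bufio.Reader`, as the HTTP server may do before `Hijack`. Reading through the `bufio.Reader` afterwards returns every byte, while reading from the underlying source directly misses the buffered chunk.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// newSource returns a reader that hands out "abc" and "def" in separate
// Read calls, standing in for a connection that delivers two chunks.
// (Hypothetical helper, not part of the PR.)
func newSource() io.Reader {
	return io.MultiReader(strings.NewReader("abc"), strings.NewReader("def"))
}

// readViaBufio peeks one byte, which pulls the first chunk into the
// bufio buffer, and then reads everything through the bufio.Reader.
func readViaBufio() (string, error) {
	br := bufio.NewReader(newSource())
	if _, err := br.Peek(1); err != nil {
		return "", err
	}
	data, err := io.ReadAll(br)
	return string(data), err
}

// readViaSource performs the same peek but then reads from the
// underlying source directly, bypassing the buffered chunk.
func readViaSource() (string, error) {
	src := newSource()
	br := bufio.NewReader(src)
	if _, err := br.Peek(1); err != nil {
		return "", err
	}
	data, err := io.ReadAll(src)
	return string(data), err
}

func main() {
	viaBufio, _ := readViaBufio()
	viaSource, _ := readViaSource()
	fmt.Printf("via bufio: %q, via source: %q\n", viaBufio, viaSource)
}
```

The second function mirrors the bug this PR addresses: bytes already sitting in the buffer are never seen by a reader attached to the raw connection.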
@k8s-bot ok to test |
/approve |
I'm not sure about what went wrong in the tests. It looks like the apiserver didn't even start on |
I'm a bit hesitant to include this test. How frequently does it fail without your fix? |
Sorry for the late response. The added test always fails with Go 1.8, however I couldn't trigger a failure with Go 1.7. I'm fine removing it as it was most useful when debugging this. |
@@ -32,6 +34,19 @@ const HeaderSpdy31 = "SPDY/3.1"
type responseUpgrader struct {
}

// connWrapper is used to wrap a hijacked connection its bufio.Reader. All
Could you please put an "and" in between "connection" and "its"?
OK, thanks for the info. @liggitt do you think we should keep or remove this new test? |
I'd rather have a deterministic non-stress test for the unit test. If that's not possible, I'd probably remove this one.
This is slightly concerning... is the behavior of the returned buffer documented to work that way, or are we depending on an implementation detail that can change? |
AFAIK, this is not possible.
Here is what is in godoc: // The returned bufio.Reader may contain unprocessed buffered
// data from the client.
Hijack() (net.Conn, *bufio.ReadWriter, error) I wonder if it would be safer to read from the Or we could ask the go team what we should do. |
BTW this was guidance from the go team: #38228 (comment) |
@dsnet could you please review this PR and let us know if it's acceptable re the use of the buffered reader returned by the Hijack() call, or if we need to do something differently? Thanks! |
I'll remove the test. Regarding reading from the bufio: if we were to peek from the buffer, something like this could be used to maintain the wrapper idea:
func (w *connWrapper) Read(b []byte) (n int, err error) {
	if w.bufReader != nil {
		w.bufData, err = w.bufReader.Peek(w.bufReader.Buffered())
		if err != nil {
			return
		}
		w.bufReader = nil
	}
	if len(w.bufData) > 0 {
		n = copy(b, w.bufData)
		w.bufData = w.bufData[n:]
		return
	}
	return w.Conn.Read(b)
}
It's a bit more convoluted than reading straight from the bufio, but it works. |
can we use https://golang.org/pkg/io/#MultiReader ? |
That certainly sounds better |
Okay, but using an io.MultiReader with the bufio and the net.Conn directly would not work, because the bufio would never EOF. We have to create a bytes.Reader from the peeked []byte to use in the MultiReader:
func (w *connWrapper) Read(b []byte) (n int, err error) {
	if w.reader == nil {
		var bufData []byte
		bufData, err = w.bufReader.Peek(w.bufReader.Buffered())
		if err != nil {
			return
		}
		w.reader = io.MultiReader(bytes.NewReader(bufData), w.Conn)
	}
	return w.reader.Read(b)
} |
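To try the MultiReader variant outside the apiserver, here is a self-contained sketch under some assumptions: the `connWrapper` fields are inferred from the snippet above, `roundTrip` is a hypothetical helper, and `net.Pipe` stands in for the hijacked connection. The initial `Peek(1)` simulates the HTTP server having buffered part of the stream before handing the connection over.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"net"
)

// connWrapper fields are assumed from the snippet in the thread.
type connWrapper struct {
	net.Conn
	bufReader *bufio.Reader
	reader    io.Reader
}

// Read serves the bytes already buffered at wrap time first, then
// reads from the underlying connection.
func (w *connWrapper) Read(b []byte) (int, error) {
	if w.reader == nil {
		bufData, err := w.bufReader.Peek(w.bufReader.Buffered())
		if err != nil {
			return 0, err
		}
		w.reader = io.MultiReader(bytes.NewReader(bufData), w.Conn)
	}
	return w.reader.Read(b)
}

// roundTrip sends two chunks over an in-memory pipe, lets a bufio.Reader
// swallow the first one, and reads the whole stream through the wrapper.
func roundTrip() (string, error) {
	client, server := net.Pipe()
	go func() {
		client.Write([]byte("hello "))
		client.Write([]byte("world"))
		client.Close()
	}()

	br := bufio.NewReader(server)
	if _, err := br.Peek(1); err != nil { // buffers the first chunk
		return "", err
	}
	w := &connWrapper{Conn: server, bufReader: br}
	data, err := io.ReadAll(w)
	return string(data), err
}

func main() {
	got, err := roundTrip()
	fmt.Printf("%q %v\n", got, err)
}
```

The slice returned by `Peek` is only valid until the next read on the `bufio.Reader`, so this sketch relies on nothing else reading from `bufReader` once the wrapper is in use.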
	bufReader *bufio.Reader
}

func (w *connWrapper) Read(b []byte) (n int, err error) {
This solution is okay, but has one edge case. When Close is called, it is entirely possible for Read to still return with data instead of EOF. That may or may not be surprising behavior.
To fix this, you could discard the buffer upon Close.
func (w *connWrapper) Close() error {
	err := w.Conn.Close()
	w.bufReader.Discard(w.bufReader.Buffered())
	return err
}
Also be careful that w.bufReader is not safe for concurrent use. I don't know if the usage of connWrapper allows calling Close and Read concurrently.
Updated with the atomic approach for now. |
@cezarsa: The following test(s) failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Is this easier to do now that we have go1.8? |
We do need to get this in soon, as the upgrade to 1.8 and/or my #44451 have made the spdy-related tests flakey. |
	if atomic.LoadInt32(&w.closed) == 1 {
		return 0, io.EOF
	}
	return w.bufReader.Read(b)
@dsnet if we make it past the closed check on line 51, and before this line is executed, Close() executes and closes the net.Conn, what will happen when we next execute w.bufReader.Read(b)? Will it EOF?
/release-note-none |
/assign |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cezarsa, ncdc, smarterclayton
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
Automatic merge from submit-queue |
This PR fixes a corner case in the spdy stream code where some bytes would never arrive at the server.
Reading directly from a hijacked connection isn't safe because some data may have already been read by the server before Hijack was called. To ensure all data is received, it's safer to read from the returned bufio.Reader. This problem seems to happen more frequently when using Go 1.8.
This is described in https://golang.org/pkg/net/http/#Hijacker:
I came across this while debugging a flaky test that used code from the k8s.io/apimachinery/pkg/util/httpstream/spdy package. After filling the code with debug logs and spending long hours running the tests in a loop in the hope of catching the error, I finally caught something weird.
The first word of the first spdy frame read by the server here had the value 0x03000100. The first frame to arrive at the server was supposed to be a control frame indicating the creation of a new stream, and all control frames need the high-order bit set to 1. That was not the case here, so the server mistakenly assumed this was a data frame and the stream would never be created. The correct value for the first word of a SYN_STREAM frame is 0x80030001. This led me to look for whatever had consumed the first byte before the frame reader was called, and finally to the problem with the Hijack call.
I added a new test to try stressing this condition and ensuring that this bug doesn't happen anymore. However, it's quite ugly, as it loops 1000 times creating streams on servers to increase the chances of this bug happening. So I'm not sure whether it's worth keeping this test or if I should remove it from the PR. Please let me know what you guys think and I'll be happy to update this.
Fixes #45093 #45089 #45078 #45075 #45072 #45066 #45023
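The frame misclassification described in the PR text can be reproduced with a few lines of Go. This is an illustrative sketch, not the spdy library's parser: `isControlFrame` and `firstWord` are hypothetical helpers, and the fifth byte (0x00) is assumed so that the shifted word matches the value observed in the description.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// isControlFrame reports whether the high-order bit of the first
// 32-bit word of a frame is set, which marks a control frame.
func isControlFrame(word uint32) bool {
	return word&0x80000000 != 0
}

// firstWord decodes the first four bytes of b as a big-endian word.
func firstWord(b []byte) uint32 {
	return binary.BigEndian.Uint32(b)
}

func main() {
	// Start of a SYN_STREAM frame as sent on the wire: 0x80030001
	// followed by one more byte (assumed 0x00 here).
	wire := []byte{0x80, 0x03, 0x00, 0x01, 0x00}

	intact := firstWord(wire)      // frame reader sees every byte
	shifted := firstWord(wire[1:]) // first byte was consumed before Hijack

	fmt.Printf("intact  0x%08x control=%v\n", intact, isControlFrame(intact))
	fmt.Printf("shifted 0x%08x control=%v\n", shifted, isControlFrame(shifted))
}
```

Losing a single leading byte is enough to clear the control bit, which is why the server treated the frame as data and never created the stream.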