Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed libcurl connection pool to use Connection response header values #5473

Merged
merged 16 commits into from
Apr 10, 2024

Conversation

antkmsft
Copy link
Member

@antkmsft antkmsft commented Mar 30, 2024

Fixes #5450.

I propose for us to revert #5308 first (which is done by a sibling PR #5472), and then immediately release a GA patch/minor version, and then to merge this change, and release a beta.

@antkmsft antkmsft marked this pull request as ready for review March 30, 2024 09:07
@antkmsft antkmsft requested a review from Jinming-Hu March 30, 2024 09:31
@Azure Azure deleted a comment from azure-pipelines bot Mar 30, 2024
@Azure Azure deleted a comment from azure-pipelines bot Mar 30, 2024
@Jinming-Hu
Copy link
Member

I don't understand how this fixed #5450.

Can we add another check? If status code is not 2xx, and request header contains Expect: 100-continue and request body length is not zero, we don't reuse the connection.

@antkmsft
Copy link
Member Author

antkmsft commented Apr 1, 2024

@Jinming-Hu, do you know of some spec that defines the behavior that you're referring to, so we can implement it the best?
The one I put in here, comes from the HTTP specs and RFCs (I put the links in comments above the lines I modified), but there are many addendums, I could have missed something related to 100-continue.
In the specs, I also did not see anything special about 2xx codes on whether the connection can be reused, all the references I've seen were about the value of the Connect header, regardless of the code.
Do you know if WinHTTP implements the behavior you're describing?

It seems to me, what I've done here, does adhere to HTTP standard, i.e. if a response has "Connection: keep-alive" hader, we are keeping it alive. If it has "Connection: close", we are closing it. If it has "Connection: keep-alive, close", we are keeping it alive. If it has no "Connection" header, then we are keeping it alive if the response is HTTP/1.1+, we and closing if it is below HTTP/1.1.
We are not relying on the last HTTP response code, so even if it was a 404 with keep-alive, we would keep it alive, as this is what the server is asking us to do. I ran the customer sample, and I saw 404 with "Connection: close", in which case we would have closed it, and this way the issue would've been fixed.
I am not touching any other parts, just removing the "2xx => keep-alive, regardless of headers; !=2xx => close, regardless of headers" behavior, which does not seem to be specified anywhere.
If curl connection pool had any special handling for 100-continue, the current implementation will keep the behavior, I just changed the condition as I described above.

@Jinming-Hu
Copy link
Member

@antkmsft I agree your implementation is compliant to the standard. But Azure services are not compliant, and this is not the only issue (I've seen others). As Azure SDK, we better be compatible with Azure services.

@antkmsft
Copy link
Member Author

antkmsft commented Apr 1, 2024

@Jinming-Hu, I 100% agree with you, and I do not want for the SDK to be unreliable when talking to Azure, not at all.
I just don't understand if the interaction with Azure is not compliant, how come we don't see a problem when using WinHTTP, which does not have this special logic, and must only adhere to standard.
And the second question remains - do you have any references - documentation, or code in other language SDKs, which has this extra logic, which we could use as a reference. At this moment, I only have spoken word and this very implementation that contains this knowledge, plus no tests. Please understand, by no means I am implying that if these do not exist, your point is not valid, and I would not want to check in something that introduces more problems. I am just trying to figure this out, for the implementation become better, not worse. I want to have a code that you'll genuinely agree on, after we sort out all these questions, if necessary.
Plus, a side question - you see this PR, I just replaced the logic of "httpCode < 200 && httpCode >= 300" by looking into "Connection" header instead. I did not remove any existing handling of "100-Continue". Do you think it will break the current handling of "100-Continue"?

If you want, we can take this offline, if that will be faster and easier to discuss. Once again, I am not trying to force this on Storage, you have nothing to worry about, you don't need to be prepared to defend it if the things really can't work the other way. I am just trying to bring clarity, for both of us. If it is Storage-specific, we can look into the transport to have "Storage Mode", where it would work one way for the Storage, and maybe the other way for everything else, if it is only Storage who works this way. Other good thing that may be an outcome of this, is if we put more comments/references/tests on how things should work, so the next time someone modifies transport, they are aware on how to not break Storage, and don't accidentally break you.
But as I said, if it really works this way, I don't understand how it works successfully on WinHTTP and in other language SDKs.

@Jinming-Hu
Copy link
Member

Jinming-Hu commented Apr 1, 2024

Hi @antkmsft , winhttp transport or any other language SDKs do not use 100-continue at all, that's why we don't see the issue anywhere else.

Plus, a side question - you see this PR, I just replaced the logic of "httpCode < 200 && httpCode >= 300" by looking into "Connection" header instead. I did not remove any existing handling of "100-Continue". Do you think it will break the current handling of "100-Continue"?

Your code change looks fine to me. But I cannot answer this question before it's fully tested. As I said, Azure services are not 100% compliant to standard. I don't know if server side will always send Connection: close header when it wants to terminate current session, but it looks to me you've made this assumption in this PR.

Anyway, since you mentioned you desired to fix issue #5450 in this PR. Let me add some details of the situation and explain why it doesn't fix as expected.

  1. Storage SDK sends a PUT request with a non-empty request body (which means Content-Length request header is not 0, let's say it's 6) and Expect: 100-continue request header, but it doesn't send the header unless server returns 100 Continue status code.
  2. Storage service returns 4xx status code and response headers, but it doesn't want to close this connection, so there's no Connection: close in response headers.
  3. Now both client and server agree to continue using this connection. But they do not agree in the current status of this connection.
    1. Client side thinks the previous request is finished because it has received a status code and response headers. It should send a new HTTP request if there's any.
    2. Server side thinks the previous request is not finished because it hasn't received the request body. I tend to think this is a bug of server-side.
  4. Client side sends a new request, for example,
    HEAD /whatever/path HTTP/1.1
    host: foo.bar.com
    ...
    
  5. Server side takes the first 6 bytes (HEAD /) of the send request and thinks this is the request body of the first request and discard it.
  6. Server side keeps reading the remaining data on the wire and thinks the first part (whatever/path) is an HTTP verb. It fails the request with 400 invalid verb.

This issue is not triggered before PR #5308 because we don't re-use connections failed with 4xx.

@antkmsft
Copy link
Member Author

antkmsft commented Apr 1, 2024

@Jinming-Hu and @LarryOsterman, please check out the latest iteration (commit 8d81706).

@antkmsft antkmsft requested a review from LarryOsterman April 1, 2024 21:34
@antkmsft
Copy link
Member Author

antkmsft commented Apr 2, 2024

FYI @mchelnokov

@Azure Azure deleted a comment from azure-pipelines bot Apr 8, 2024
@Azure Azure deleted a comment from azure-pipelines bot Apr 8, 2024
@Azure Azure deleted a comment from azure-pipelines bot Apr 8, 2024
@Azure Azure deleted a comment from azure-pipelines bot Apr 8, 2024
@Azure Azure deleted a comment from azure-pipelines bot Apr 8, 2024
@Azure Azure deleted a comment from azure-pipelines bot Apr 8, 2024
@Azure Azure deleted a comment from azure-pipelines bot Apr 8, 2024
@Azure Azure deleted a comment from azure-pipelines bot Apr 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HTTP Error 400. The request verb is invalid
4 participants